CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital
- PMID: 29941004
- PMCID: PMC6020175
- DOI: 10.1186/s12911-018-0623-9
Abstract
Background: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time) to address deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low cost structured and unstructured information retrieval and extraction architecture within King's College Hospital, the management of governance concerns and the associated use cases and cost saving opportunities that such components present.
Results: To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King's College Hospital. On generated data designed to simulate real world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall.
Conclusion: We describe a toolkit which we feel is of huge value to the UK (and beyond) healthcare community. It is the only open source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.
Keywords: Clinical informatics; Elasticsearch; Electronic health records; Information extraction; Natural language processing.
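The precision and recall figures reported for the de-identification algorithm can be read through a small worked example. The sketch below is illustrative only: the abstract does not state the unit of evaluation, so exact matching of character spans is an assumption, and the function name `precision_recall` and the sample spans are hypothetical rather than taken from CogStack.

```python
def precision_recall(gold, predicted):
    """Return (precision, recall) for two collections of (start, end) spans.

    Assumption: a predicted span counts as correct only if it matches a
    gold span exactly. The paper may use a different matching criterion.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Precision: share of predicted identifiers that are real identifiers.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real identifiers that the algorithm found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical spans of patient identifiers in one synthetic note.
gold_spans = [(0, 10), (25, 33), (40, 52), (60, 70)]
predicted_spans = [(0, 10), (25, 33), (40, 52), (80, 85)]

precision, recall = precision_recall(gold_spans, predicted_spans)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For de-identification, recall is usually the more safety-critical of the two, since a missed identifier leaks patient information, whereas a false positive only masks some non-identifying text.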
Conflict of interest statement
Ethics approval and consent to participate
The creation of the CogStack software was an internal service development project for King’s College Hospital NHS Foundation Trust, and thus did not require ethical approval. As no patient identifiable data was required for the development of the software, no approval was sought from the Health Research Authority according to Confidentiality Advisory Group guidelines.
Consent for publication
Not applicable: no individual person's data is presented in this manuscript.
Competing interests
RJ and RS have received research funding from Roche, Pfizer, J&J and Lundbeck.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.