Deriving comorbidities from medical records using natural language processing
- PMID: 24177145
- PMCID: PMC3861932
- DOI: 10.1136/amiajnl-2013-001889
Abstract
Extracting comorbidity information is crucial for phenotypic studies because of the confounding effect of comorbidities. We developed an automated method that accurately determines comorbidities from electronic medical records. Using a modified version of the Charlson comorbidity index (CCI), two physicians created a reference standard of comorbidities by manual review of 100 admission notes. We processed the notes using the MedLEE natural language processing system and wrote queries to extract comorbidities automatically from its structured output. Interrater agreement for the reference set was very high (97.7%). Our method yielded an F1 score of 0.761, and the summed CCI score did not differ significantly from the reference standard (p=0.329, power 80.4%). In comparison, obtaining comorbidities from claims data yielded an F1 score of 0.741, owing to lower sensitivity (66.1%). Because the CCI has previously been validated as a predictor of mortality and readmission, our method could allow automated prediction of these outcomes.
Keywords: Comorbidity; Confounding Factors; Natural Language Processing.
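The evaluation described in the abstract (an F1 score against a physician reference standard, plus a summed CCI score per note) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the comorbidity labels, the toy notes, and the choice of micro-averaging are hypothetical, and the weights are a subset of the original Charlson index rather than the modified version used in the study, which the abstract does not specify.

```python
from typing import Dict, List, Set, Tuple

# Subset of the original Charlson weights, used here only as an illustrative
# assumption; the study's modified CCI may assign categories differently.
CCI_WEIGHTS: Dict[str, int] = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "renal_disease": 2,
    "metastatic_solid_tumor": 6,
}


def cci_score(comorbidities: Set[str]) -> int:
    """Sum the weights of the comorbidities found for one note."""
    return sum(CCI_WEIGHTS[c] for c in comorbidities)


def micro_prf(
    reference: List[Set[str]], predicted: List[Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall (sensitivity), and F1 over all notes."""
    tp = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        tp += len(ref & pred)   # comorbidities found by both
        fp += len(pred - ref)   # extracted but absent from the reference
        fn += len(ref - pred)   # in the reference but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: three notes, reference standard vs. automated output.
reference = [{"diabetes", "renal_disease"}, {"myocardial_infarction"}, set()]
predicted = [
    {"diabetes"},
    {"myocardial_infarction", "congestive_heart_failure"},
    set(),
]

precision, recall, f1 = micro_prf(reference, predicted)
reference_scores = [cci_score(note) for note in reference]
predicted_scores = [cci_score(note) for note in predicted]
print(precision, recall, f1)
print(reference_scores, predicted_scores)
```

The per-note score lists are the kind of paired values on which a significance test of the summed CCI score could be run; the abstract does not name the test the authors used, so none is shown here.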