Deriving comorbidities from medical records using natural language processing

Hojjat Salmasian¹, Daniel E Freedberg, Carol Friedman

PMID: 24177145
PMCID: PMC3861932
DOI: 10.1136/amiajnl-2013-001889

Hojjat Salmasian et al. J Am Med Inform Assoc. 2013 Dec;20(e2):e239-42. doi: 10.1136/amiajnl-2013-001889. Epub 2013 Oct 31.

Affiliation

¹ Department of Biomedical Informatics, Columbia University, New York, USA.

Abstract

Extracting comorbidity information is crucial for phenotypic studies because of the confounding effect of comorbidities. We developed an automated method that accurately determines comorbidities from electronic medical records. Using a modified version of the Charlson comorbidity index (CCI), two physicians created a reference standard of comorbidities by manual review of 100 admission notes. We processed the notes using the MedLEE natural language processing system, and wrote queries to extract comorbidities automatically from its structured output. Interrater agreement for the reference set was very high (97.7%). Our method yielded an F1 score of 0.761 and the summed CCI score was not different from the reference standard (p=0.329, power 80.4%). In comparison, obtaining comorbidities from claims data yielded an F1 score of 0.741, due to lower sensitivity (66.1%). Because CCI has previously been validated as a predictor of mortality and readmission, our method could allow automated prediction of these outcomes.
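The evaluation described in the abstract can be illustrated with a minimal Python sketch. It compares automatically extracted comorbidities with a reference standard and sums a Charlson score per note. Several things here are assumptions rather than details from the study: the weights are those of the original Charlson index (the study used a modified version that is not specified in this record), the category names are hypothetical labels, and micro-averaging over all note and comorbidity pairs is only one plausible way to arrive at a single F1 score.

```python
from typing import Dict, Set, Tuple

# Weights of the original Charlson comorbidity index (Charlson et al., 1987).
# ASSUMPTION: the study used a modified CCI; this table is illustrative only,
# and the category names are hypothetical labels chosen for this sketch.
CHARLSON_WEIGHTS: Dict[str, int] = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "connective_tissue_disease": 1,
    "ulcer_disease": 1,
    "mild_liver_disease": 1,
    "diabetes": 1,
    "hemiplegia": 2,
    "moderate_or_severe_renal_disease": 2,
    "diabetes_with_end_organ_damage": 2,
    "any_tumor": 2,
    "leukemia": 2,
    "lymphoma": 2,
    "moderate_or_severe_liver_disease": 3,
    "metastatic_solid_tumor": 6,
    "aids": 6,
}


def summed_cci(comorbidities: Set[str]) -> int:
    """Sum the weights of the comorbidities found for one note."""
    return sum(CHARLSON_WEIGHTS[c] for c in comorbidities)


def micro_prf(
    predicted: Dict[str, Set[str]],
    reference: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall (sensitivity) and F1 over all notes."""
    tp = fp = fn = 0
    for note_id, ref in reference.items():
        pred = predicted.get(note_id, set())
        tp += len(pred & ref)   # comorbidities found by both
        fp += len(pred - ref)   # found automatically but not in the reference
        fn += len(ref - pred)   # in the reference but missed automatically
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data, invented for illustration; not taken from the study.
reference = {
    "note1": {"diabetes", "congestive_heart_failure"},
    "note2": {"metastatic_solid_tumor"},
}
predicted = {
    "note1": {"diabetes"},
    "note2": {"metastatic_solid_tumor", "dementia"},
}

precision, recall, f1 = micro_prf(predicted, reference)
scores = {n: (summed_cci(predicted[n]), summed_cci(reference[n])) for n in reference}
```

In this toy example one comorbidity is missed and one is spurious, so precision, recall and F1 all equal 2/3, and the summed scores differ by one point on each note. A comparison of summed scores across a whole test set, as reported in the abstract, would need a paired statistical test on top of this.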

Keywords: Comorbidity; Confounding Factors; Natural Language Processing.

Figures

**Figure 1**
Scatter plot of summed Charlson comorbidity index scores calculated for the test set (N=70), using the automated method versus those calculated by manual chart review. The size of circles is in proportion to the number of patients for whom the respective scores were calculated by each method.
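The circle sizes described in the caption correspond to how many patients share each pair of scores. A minimal sketch of that tally follows, assuming two equal-length lists of summed scores; the function name and the sample values are hypothetical and not taken from the study.

```python
from collections import Counter
from typing import Dict, List, Tuple


def bubble_counts(automated: List[int], manual: List[int]) -> Dict[Tuple[int, int], int]:
    """Count patients per (automated score, manual score) pair.

    Each count could be used as the marker size for that point in a scatter plot.
    """
    if len(automated) != len(manual):
        raise ValueError("both score lists must cover the same patients")
    return dict(Counter(zip(automated, manual)))


# Invented scores for four patients, for illustration only.
counts = bubble_counts([0, 2, 2, 3], [0, 2, 2, 4])
```

Here the pair (2, 2) occurs twice, so it would be drawn as a larger circle than (0, 0) or (3, 4).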

Cited by

Antibiotic-Specific Risk for Community-Acquired Clostridioides difficile Infection in the United States from 2008 to 2020.
Zhang J, Chen L, Gomez-Simmonds A, Yin MT, Freedberg DE. Antimicrob Agents Chemother. 2022 Dec 20;66(12):e0112922. doi: 10.1128/aac.01129-22. Epub 2022 Nov 15. PMID: 36377887.

Natural Language Processing-Enabled and Conventional Data Capture Methods for Input to Electronic Health Records: A Comparative Usability Study.
Kaufman DR, Sheehan B, Stetson P, Bhatt AR, Field AI, Patel C, Maisel JM. JMIR Med Inform. 2016 Oct 28;4(4):e35. doi: 10.2196/medinform.5544. PMID: 27793791.

Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Total Hip Arthroplasty.
Wyles CC, Tibbo ME, Fu S, Wang Y, Sohn S, Kremers WK, Berry DJ, Lewallen DG, Maradit-Kremers H. J Bone Joint Surg Am. 2019 Nov 6;101(21):1931-1938. doi: 10.2106/JBJS.19.00071. PMID: 31567670.

Natural language processing for the assessment of cardiovascular disease comorbidities: The cardio-Canary comorbidity project.
Berman AN, Biery DW, Ginder C, Hulme OL, Marcusa D, Leiva O, Wu WY, Cardin N, Hainer J, Bhatt DL, Di Carli MF, Turchin A, Blankstein R. Clin Cardiol. 2021 Sep;44(9):1296-1304. doi: 10.1002/clc.23687. Epub 2021 Aug 4. PMID: 34347314.

A Prediction Model Incorporating Peripheral Eosinopenia as a Novel Risk Factor for Death After Hospitalization for Clostridioides difficile Infection.
Wang Y, Salmasian H, Schluger A, Gomez-Simmonds A, Choy A, Li J, Axelrad JE, Freedberg DE. Gastro Hep Adv. 2022;1(1):38-44. doi: 10.1016/j.gastha.2021.10.002. Epub 2022 Feb 7. PMID: 35974881.
