J Am Med Inform Assoc. 2013 Dec;20(e2):e239-42.
doi: 10.1136/amiajnl-2013-001889. Epub 2013 Oct 31.

Deriving comorbidities from medical records using natural language processing

Hojjat Salmasian et al. J Am Med Inform Assoc. 2013 Dec.

Abstract

Extracting comorbidity information is crucial for phenotypic studies because of the confounding effect of comorbidities. We developed an automated method that accurately determines comorbidities from electronic medical records. Using a modified version of the Charlson comorbidity index (CCI), two physicians created a reference standard of comorbidities by manual review of 100 admission notes. We processed the notes using the MedLEE natural language processing system, and wrote queries to extract comorbidities automatically from its structured output. Interrater agreement for the reference set was very high (97.7%). Our method yielded an F1 score of 0.761, and the summed CCI score did not differ significantly from the reference standard (p=0.329, power 80.4%). In comparison, obtaining comorbidities from claims data yielded an F1 score of 0.741, due to lower sensitivity (66.1%). Because CCI has previously been validated as a predictor of mortality and readmission, our method could allow automated prediction of these outcomes.
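
The evaluation described in the abstract compares extracted comorbidities against a reference standard and reports sensitivity, an F1 score, and a summed index score. The following Python sketch illustrates how such quantities could be computed; it is not the authors' code. The function names, the toy notes, and the weight table are all hypothetical: the weights are a small illustrative subset in the spirit of the original Charlson index, not the modified index used in the study, and the pooled (micro-averaged) counting is an assumption, since the abstract does not state how counts were aggregated.

```python
from typing import Dict, List, Set, Tuple

# Illustrative weights only (assumption): a few conditions weighted as in the
# original Charlson index. The study used a modified index not reproduced here.
EXAMPLE_WEIGHTS: Dict[str, int] = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "diabetes": 1,
    "renal_disease": 2,
    "metastatic_solid_tumor": 6,
}


def summed_score(comorbidities: Set[str], weights: Dict[str, int]) -> int:
    """Sum the weights of the comorbidities found for one patient."""
    return sum(weights[c] for c in comorbidities)


def precision_recall_f1(
    predicted: List[Set[str]], reference: List[Set[str]]
) -> Tuple[float, float, float]:
    """Pool true/false positives and false negatives over all notes
    (micro-averaging, an assumption) and return precision,
    recall (sensitivity), and F1."""
    tp = fp = fn = 0
    for pred, ref in zip(predicted, reference):
        tp += len(pred & ref)   # found and in the reference
        fp += len(pred - ref)   # found but not in the reference
        fn += len(ref - pred)   # in the reference but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy data (hypothetical): three notes, reference standard vs. automated output.
reference = [{"diabetes", "renal_disease"}, {"myocardial_infarction"}, set()]
predicted = [
    {"diabetes"},
    {"myocardial_infarction", "congestive_heart_failure"},
    set(),
]

precision, recall, f1 = precision_recall_f1(predicted, reference)
ref_scores = [summed_score(r, EXAMPLE_WEIGHTS) for r in reference]
pred_scores = [summed_score(p, EXAMPLE_WEIGHTS) for p in predicted]
```

In this toy data, one comorbidity is missed and one is spuriously added, so the pooled precision, recall, and F1 all come out to 2/3. The per-patient summed scores (`ref_scores` and `pred_scores`) are the kind of paired values a test of difference would then compare.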

Keywords: Comorbidity; Confounding Factors; Natural Language Processing.

Figures

Figure 1
Scatter plot of summed Charlson comorbidity index scores calculated for the test set (N=70), using the automated method versus those calculated by manual chart review. The size of circles is in proportion to the number of patients for whom the respective scores were calculated by each method.
