J Am Med Inform Assoc. 2016 Apr;23(e1):e20-7.
doi: 10.1093/jamia/ocv130. Epub 2015 Sep 2.

Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance

Wei-Qi Wei et al. J Am Med Inform Assoc. 2016 Apr.

Abstract

Objective: To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Diseases (ICD) diagnosis codes, primary notes, and specific medications.

Materials and methods: We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer's disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson's disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (175 patients for each disease, 1750 patients across all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination.
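As an illustration of the sampling design, the sketch below assumes that the seven categories are the non-empty combinations of the three EHR components (2^3 - 1 = 7); the abstract does not spell this out, so this is one natural reading rather than the authors' stated definition. The component labels and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical labels for the three EHR components named in the abstract.
COMPONENTS = ("icd", "notes", "meds")


def evidence_categories(components=COMPONENTS):
    """Return every non-empty combination of components (2**n - 1 of them)."""
    return [
        frozenset(combo)
        for size in range(1, len(components) + 1)
        for combo in combinations(components, size)
    ]


def categorize(has_icd, has_notes, has_meds):
    """Return the evidence category for one patient, or None if no evidence."""
    flags = (has_icd, has_notes, has_meds)
    present = frozenset(name for name, flag in zip(COMPONENTS, flags) if flag)
    return present or None


categories = evidence_categories()

# Sample sizes as reported in the abstract.
PATIENTS_PER_CATEGORY = 25
N_DISEASES = 10
per_disease = PATIENTS_PER_CATEGORY * len(categories)
total = per_disease * N_DISEASES
```

Under this assumption, the category count reproduces the reported sample sizes: 25 patients in each of 7 categories gives 175 per disease and 1750 overall.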

Results: The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06-0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of ICD codes alone (F-score: 0.67 ± 0.14) was only slightly lower than that of two or more components, their PPV (0.71 ± 0.13) was substantially worse than that of two or more components (0.91 ± 0.08).
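For readers who want to relate the reported PPV and sensitivity to the F-score, the sketch below assumes the conventional balanced F-score (the harmonic mean of PPV and sensitivity); the abstract does not state which variant was used. The function name is hypothetical, and the inputs are the mean values quoted above.

```python
def f_score(ppv, sensitivity):
    """Balanced F-score: harmonic mean of PPV and sensitivity (assumed variant)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Mean PPV and sensitivity quoted in the Results paragraph.
icd_only = f_score(0.71, 0.67)   # ICD codes alone
two_plus = f_score(0.91, 0.59)   # two or more components

print(f"ICD codes alone:        {icd_only:.2f}")
print(f"Two or more components: {two_plus:.2f}")
```

These values need not match the reported F-scores (0.67 and 0.70) exactly. One plausible reason, stated here as an assumption, is that the reported figures are means of per-disease F-scores, which generally differ from the F-score computed from mean PPV and mean sensitivity.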

Conclusion: Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.

Keywords: International Classification of Diseases; clinical notes; diagnosis codes; electronic health records; medications; phenotype; problem lists.

Figures

Figure 1: Weighted Venn diagrams of the distributions of patients with ICD-9, primary notes, and specific medications. Each color represents a resource. Different area colors represent the number of patients that were found within intersecting resources.

Figure 2: Receiver operating characteristic (ROC) curve for ICD-9, primary notes, and specific medications. ROC was performed using data of 1750 reviewed cases across 10 diseases. AUC: Area under the curve.
