Sci Rep. 2021 Oct 7;11(1):19959.
doi: 10.1038/s41598-021-98719-w.

An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records

Su H Chu et al. Sci Rep.

Abstract

Electronic health records (EHR) provide an unprecedented opportunity to conduct large, cost-efficient, population-based studies. However, the studies of heterogeneous diseases, such as chronic obstructive pulmonary disease (COPD), often require labor-intensive clinical review and testing, limiting widespread use of these important resources. To develop a generalizable and efficient method for accurate identification of large COPD cohorts in EHRs, a COPD datamart was developed from 3420 participants meeting inclusion criteria in the Mass General Brigham Biobank. Training and test sets were selected and labeled with gold-standard COPD classifications obtained from chart review by pulmonologists. Multiple classes of algorithms were built utilizing both structured (e.g. ICD codes) and unstructured (e.g. medical notes) data via elastic net regression. Models explicitly including and excluding spirometry features were compared. External validation of the final algorithm was conducted in an independent biobank with a different EHR system. The final COPD classification model demonstrated excellent positive predictive value (PPV; 91.7%), sensitivity (71.7%), and specificity (94.4%). This algorithm performed well not only within the MGBB, but also demonstrated similar or improved classification performance in an independent biobank (PPV 93.5%, sensitivity 61.4%, specificity 90%). Ancillary comparisons showed that the classification model built including a binary feature for FEV1/FVC produced substantially higher sensitivity than those excluding. This study fills a gap in COPD research involving population-based EHRs, providing an important resource for the rapid, automated classification of COPD cases that is both cost-efficient and requires minimal information from unstructured medical records.
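The three performance measures quoted in the abstract (PPV, sensitivity, specificity) all derive from a 2x2 comparison of algorithm labels against the chart-review gold standard. The following is a minimal sketch of that arithmetic, not the authors' code: the names `ConfusionCounts`, `counts_from_labels`, and `classification_metrics` are hypothetical, and the counts in the usage note are illustrative values, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing algorithm labels with gold-standard labels."""
    tp: int  # algorithm says COPD, chart review says COPD
    fp: int  # algorithm says COPD, chart review says no COPD
    fn: int  # algorithm says no COPD, chart review says COPD
    tn: int  # algorithm says no COPD, chart review says no COPD


def counts_from_labels(gold, predicted):
    """Tally the 2x2 table from two equal-length sequences of booleans."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = tn = 0
    for g, p in zip(gold, predicted):
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif not p and g:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def classification_metrics(c):
    """Return PPV, sensitivity, and specificity as fractions in [0, 1]."""
    if c.tp + c.fp == 0 or c.tp + c.fn == 0 or c.tn + c.fp == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "ppv": c.tp / (c.tp + c.fp),          # predicted cases that are true cases
        "sensitivity": c.tp / (c.tp + c.fn),  # true cases the algorithm finds
        "specificity": c.tn / (c.tn + c.fp),  # non-cases correctly excluded
    }
```

For example, with illustrative counts `ConfusionCounts(tp=30, fp=10, fn=10, tn=50)`, `classification_metrics` returns a PPV of 0.75, a sensitivity of 0.75, and a specificity of 50/60. A high PPV paired with a lower sensitivity, the pattern reported in the abstract, means that few flagged patients are false positives while some true cases go unflagged.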

Conflict of interest statement

STW is an author for UpToDate. MHC has received grant support from GSK and Bayer, speaking fees from Illumina, and consulting fees from AstraZeneca. The remaining authors have no conflicts of interest to declare.

Figures

Figure 1. Overview of COPD datamart selection and developed algorithms.
Figure 2. Broad overview of steps in phenotyping algorithm development.
Figure 3. Receiver-operator characteristic curves to assess classification performance of model-based algorithms.
