J Med Internet Res. 2021 Nov 9;23(11):e28946.
doi: 10.2196/28946.

Using Artificial Intelligence With Natural Language Processing to Combine Electronic Health Record's Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study

Peter L Elkin et al. J Med Internet Res.

Abstract

Background: Nonvalvular atrial fibrillation (NVAF) affects almost 6 million Americans and is a major contributor to stroke but is significantly undiagnosed and undertreated despite explicit guidelines for oral anticoagulation.

Objective: This study investigates whether semisupervised natural language processing (NLP) of free-text information in the electronic health record (EHR), combined with structured EHR data, improves NVAF discovery and treatment, and whether it may offer a method to prevent thousands of deaths and save billions of dollars.

Methods: We abstracted 96,681 participants from the University of Buffalo faculty practice's EHR. NLP was used to index the notes, and we compared the ability to identify NVAF and to compute two risk scores using structured data alone (International Classification of Diseases codes) versus structured data plus unstructured data from clinical notes. The two scores were CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes mellitus, stroke or transient ischemic attack, vascular disease, age 65 to 74 years, sex category) and HAS-BLED (Hypertension, Abnormal liver/renal function, Stroke history, Bleeding history or predisposition, Labile INR, Elderly, Drug/alcohol usage). In addition, we analyzed data from 63,296,120 participants in the Optum and Truven databases to determine the NVAF frequency, the rates of CHA2DS2-VASc ≥2 with no contraindications to oral anticoagulants, the rates of stroke and death in the untreated population, and the first-year costs after stroke.
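As a rough illustration of how adding note-derived findings can change a risk score, the sketch below computes CHA2DS2-VASc from the components named in the Methods, using the standard point weights (2 points each for age ≥75 years and for prior stroke or transient ischemic attack, 1 point for each other component). The function name, the condition labels, and the example patient are hypothetical and are not taken from the study; merging the two sources with a simple set union is an assumption, not the authors' pipeline.

```python
# Standard CHA2DS2-VASc weights for the condition-based components.
# The string labels are hypothetical names chosen for this sketch.
CONDITION_WEIGHTS = {
    "chf": 1,               # congestive heart failure
    "hypertension": 1,
    "diabetes": 1,
    "stroke_or_tia": 2,     # prior stroke or transient ischemic attack
    "vascular_disease": 1,
}


def cha2ds2_vasc(conditions, age, female):
    """Return the CHA2DS2-VASc score for one patient."""
    score = sum(w for name, w in CONDITION_WEIGHTS.items() if name in conditions)
    if age >= 75:
        score += 2          # age >= 75 years
    elif age >= 65:
        score += 1          # age 65 to 74 years
    if female:
        score += 1          # sex category
    return score


# Hypothetical patient: conditions found in coded data vs. in clinical notes.
from_codes = {"hypertension"}
from_notes = {"diabetes", "stroke_or_tia"}

score_structured = cha2ds2_vasc(from_codes, age=70, female=False)
score_combined = cha2ds2_vasc(from_codes | from_notes, age=70, female=False)
print(score_structured, score_combined)
```

In this toy case the coded data alone gives a score at the treatment threshold of 2 mentioned in the Methods, while the combined evidence gives a higher score; real records would need negation handling and validation that this sketch omits.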

Results: The structured-plus-unstructured method would have identified 3,976,056 additional true NVAF cases (P<.001) and improved sensitivity for CHA2DS2-VASc and HAS-BLED scores compared with the structured data alone (P=.002 and P<.001, respectively), a 32.1% improvement. For the United States, this method would prevent an estimated 176,537 strokes, save 10,575 lives, and save >US $13.5 billion.
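For readers unfamiliar with the metric, sensitivity is the fraction of true cases that a method detects. The sketch below compares a structured-only detector with a structured-plus-notes detector on made-up patient identifiers; the sets, names, and resulting values are purely illustrative and do not reproduce the study's data or its statistical tests.

```python
def sensitivity(true_cases, detected):
    """Fraction of true cases that appear in the detected set."""
    if not true_cases:
        raise ValueError("sensitivity is undefined without true cases")
    return len(true_cases & detected) / len(true_cases)


# Hypothetical toy cohort (not study data).
true_nvaf = {"p1", "p2", "p3", "p4", "p5"}
found_by_codes = {"p1", "p2", "p6"}   # p6 is a false positive
found_by_notes = {"p2", "p3", "p4"}

sens_structured = sensitivity(true_nvaf, found_by_codes)
sens_combined = sensitivity(true_nvaf, found_by_codes | found_by_notes)
print(sens_structured, sens_combined)
```

Taking the union of detections can only keep or raise sensitivity; it may also add false positives, which is why specificity or receiver operating characteristic curves are usually reported alongside it.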

Conclusions: Artificial intelligence-informed bio-surveillance combining NLP of free-text information with structured EHR data improves data completeness, prevents thousands of strokes, and saves lives and funds. This method is applicable to many disorders with profound public health consequences.

Keywords: CHA2DS2-VASc; HAS-BLED; NVAF; afib; artificial intelligence; atrial fibrillation; bio-surveillance; bleed risk; natural language processing; stroke risk.

Conflict of interest statement

Conflicts of Interest: GB, MW, JM, JT, and KM are employed at Pfizer.

Figures

Figure 1
Four receiver operating characteristic curves for the cumulative CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes mellitus, stroke or transient ischemic attack, vascular disease, age 65 to 74 years, sex category) and HAS-BLED (Hypertension, Abnormal liver/renal function, Stroke history, Bleeding history or predisposition, Labile INR, Elderly, Drug/alcohol usage) risk scores. NLP: natural language processing.
