Circulation. 2022 Jan 11;145(2):122-133.
doi: 10.1161/CIRCULATIONAHA.121.057480. Epub 2021 Nov 8.

ECG-Based Deep Learning and Clinical Risk Factors to Predict Atrial Fibrillation

Shaan Khurshid et al. Circulation.

Abstract

Background: Artificial intelligence (AI)-enabled analysis of 12-lead ECGs may facilitate efficient estimation of incident atrial fibrillation (AF) risk. However, it remains unclear whether AI provides meaningful and generalizable improvement in predictive accuracy beyond clinical risk factors for AF.

Methods: We trained a convolutional neural network (ECG-AI) to infer 5-year incident AF risk using 12-lead ECGs in patients receiving longitudinal primary care at Massachusetts General Hospital (MGH). We then fit 3 Cox proportional hazards models, composed of ECG-AI 5-year AF probability, CHARGE-AF clinical risk score (Cohorts for Heart and Aging in Genomic Epidemiology-Atrial Fibrillation), and terms for both ECG-AI and CHARGE-AF (CH-AI), respectively. We assessed model performance by calculating discrimination (area under the receiver operating characteristic curve) and calibration in an internal test set and 2 external test sets (Brigham and Women's Hospital [BWH] and UK Biobank). Models were recalibrated to estimate 2-year AF risk in the UK Biobank given limited available follow-up. We used saliency mapping to identify ECG features most influential on ECG-AI risk predictions and assessed correlation between ECG-AI and CHARGE-AF linear predictors.
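As a rough illustration of how a Cox proportional hazards model turns a linear predictor into an absolute risk, the sketch below uses the standard relationship risk(t) = 1 - S0(t)^exp(lp - mean lp). The coefficients, baseline survival values, and function names are hypothetical placeholders, not values or code from the study. Under this form, recalibrating to a different horizon (such as 2 years) amounts to substituting the baseline survival for that horizon.

```python
import math


def cox_risk(linear_predictor, mean_linear_predictor, baseline_survival):
    """Absolute event risk at the horizon for which baseline_survival was estimated."""
    relative_hazard = math.exp(linear_predictor - mean_linear_predictor)
    return 1.0 - baseline_survival ** relative_hazard


def combined_linear_predictor(ecg_ai_term, charge_af_score, beta_ecg, beta_charge):
    """Linear predictor with one term per component (coefficients are hypothetical)."""
    return beta_ecg * ecg_ai_term + beta_charge * charge_af_score


# Hypothetical inputs, for illustration only.
lp = combined_linear_predictor(
    ecg_ai_term=0.8, charge_af_score=12.5, beta_ecg=1.0, beta_charge=0.5
)
risk_5y = cox_risk(lp, mean_linear_predictor=6.5, baseline_survival=0.97)
# Same linear predictor, different (hypothetical) baseline survival for a 2-year horizon.
risk_2y = cox_risk(lp, mean_linear_predictor=6.5, baseline_survival=0.99)
```

An individual whose linear predictor equals the population mean receives a risk of exactly 1 minus the baseline survival, which is a convenient sanity check.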

Results: The training set comprised 45 770 individuals (age 55±17 years, 53% women, 2171 AF events) and the test sets comprised 83 162 individuals (age 59±13 years, 56% women, 2424 AF events). Area under the receiver operating characteristic curve was comparable using CHARGE-AF (MGH, 0.802 [95% CI, 0.767-0.836]; BWH, 0.752 [95% CI, 0.741-0.763]; UK Biobank, 0.732 [95% CI, 0.704-0.759]) and ECG-AI (MGH, 0.823 [95% CI, 0.790-0.856]; BWH, 0.747 [95% CI, 0.736-0.759]; UK Biobank, 0.705 [95% CI, 0.673-0.737]). Area under the receiver operating characteristic curve was highest using CH-AI (MGH, 0.838 [95% CI, 0.807-0.869]; BWH, 0.777 [95% CI, 0.766-0.788]; UK Biobank, 0.746 [95% CI, 0.716-0.776]). Calibration error was low using ECG-AI (MGH, 0.0212; BWH, 0.0129; UK Biobank, 0.0035) and CH-AI (MGH, 0.012; BWH, 0.0108; UK Biobank, 0.0001). In saliency analyses, the ECG P-wave had the greatest influence on AI model predictions. ECG-AI and CHARGE-AF linear predictors were correlated (Pearson r: MGH, 0.61; BWH, 0.66; UK Biobank, 0.41).
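The Pearson correlation reported between the ECG-AI and CHARGE-AF linear predictors can be computed as in the minimal sketch below. The function name and the toy inputs are assumptions for illustration; no study data are reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return covariance / math.sqrt(variance_x * variance_y)


# Toy linear predictors (hypothetical values, not study data).
ecg_ai_lp = [0.2, 0.5, 0.9, 1.4, 1.1]
charge_af_lp = [10.1, 11.0, 11.2, 12.8, 12.0]
r = pearson_r(ecg_ai_lp, charge_af_lp)
```

A moderate correlation, as opposed to one near 1, is consistent with the two predictors carrying partly distinct information.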

Conclusions: AI-based analysis of 12-lead ECGs has similar predictive usefulness to a clinical risk factor model for incident AF and the approaches are complementary. ECG-AI may enable efficient quantification of future AF risk.

Keywords: atrial fibrillation; deep learning; electronic health records.

Figures

Figure 1. Study overview
Depicted is an overview of the study. We trained a deep learning model to predict incident AF (ECG-AI) in Massachusetts General Hospital (MGH). We then developed a model combining ECG-AI and the CHARGE-AF clinical risk score (CH-AI) in the same training population. We then validated ECG-AI and CH-AI in three test sets: MGH, individuals from a separate hospital (Brigham and Women’s Hospital, BWH), and the UK Biobank prospective cohort study.
Figure 2. Discrimination of incident AF
Depicted is discrimination of age and sex (gray), CHARGE-AF (orange), ECG-AI (green), and CH-AI (purple), in the MGH test set (left panels), BWH test set (middle panels), and UK Biobank test set (right panels). Top panels plot the average precision and bottom panels plot the area under the receiver operating characteristic curve (AUROC) across increasing length of the prediction window (x-axis). In the top panels, the black triangles represent the cumulative event rate (i.e., the precision of a randomly guessing model).
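The two discrimination metrics in this figure, and the reason the cumulative event rate serves as the reference line for average precision, can be sketched as follows. This is a simplified binary-outcome version that ignores censoring and follow-up time, so it is not the time-dependent estimator a survival analysis would use; the function names are assumptions.

```python
def auroc(labels, scores):
    """Probability that a random event case outscores a random non-event case (ties count 0.5)."""
    event_scores = [s for label, s in zip(labels, scores) if label == 1]
    non_event_scores = [s for label, s in zip(labels, scores) if label == 0]
    if not event_scores or not non_event_scores:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in event_scores:
        for n in non_event_scores:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(event_scores) * len(non_event_scores))


def average_precision(labels, scores):
    """Mean of the precision values at each rank where an event is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one event")
    return precision_sum / hits


def event_rate(labels):
    """Expected precision of a model that ranks individuals at random."""
    return sum(labels) / len(labels)


# Toy example (hypothetical values, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
```

Because a random ranking retrieves events at the underlying event rate, average precision is best read relative to that rate, whereas AUROC has a fixed chance level of 0.5.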
Figure 3. Calibration for incident AF
Depicted are fitted calibration curves demonstrating the relationship between predicted event risk (x-axis) and observed cumulative event incidence (y-axis) for age and sex (gray), CHARGE-AF (orange), ECG-AI (green), and CH-AI (purple). Perfect calibration is indicated by the hashed diagonal line, denoting perfect correspondence between predicted and observed risk. Curves were obtained using adaptive hazard regression relating predicted risk and observed event risk.
Figure 4. Cumulative risk of AF stratified by predicted AF risk
Depicted is the cumulative risk of AF stratified by high predicted risk of AF as determined using both ECG-AI and CHARGE-AF (dark red), ECG-AI only (red), CHARGE-AF only (orange), or neither model (yellow). High AF risk was defined as 5-year AF risk ≥5% in MGH and BWH (as performed in the original CHARGE-AF derivation study), and 2-year AF risk ≥1% in the UK Biobank (approximating the top tertile of risk). The number at risk across each stratum over time is depicted below each plot.
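The four strata in this figure follow from applying one risk threshold to each model's prediction. A minimal sketch is shown below; the thresholds are the ones stated in the caption, while the function name, constant names, and stratum labels are assumptions for illustration.

```python
# Thresholds stated in the caption: 5-year risk >= 5% (MGH, BWH), 2-year risk >= 1% (UK Biobank).
HIGH_RISK_5Y = 0.05
HIGH_RISK_2Y = 0.01


def risk_stratum(ecg_ai_risk, charge_af_risk, threshold):
    """Assign an individual to one of four strata by which models flag high risk."""
    high_by_ecg_ai = ecg_ai_risk >= threshold
    high_by_charge_af = charge_af_risk >= threshold
    if high_by_ecg_ai and high_by_charge_af:
        return "both"
    if high_by_ecg_ai:
        return "ECG-AI only"
    if high_by_charge_af:
        return "CHARGE-AF only"
    return "neither"


# Hypothetical predicted risks, for illustration only.
example_strata = [
    risk_stratum(0.08, 0.06, HIGH_RISK_5Y),
    risk_stratum(0.07, 0.02, HIGH_RISK_5Y),
    risk_stratum(0.004, 0.015, HIGH_RISK_2Y),
    risk_stratum(0.003, 0.002, HIGH_RISK_2Y),
]
```

Cumulative incidence would then be estimated separately within each stratum, which this sketch does not attempt.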
Figure 5. Representations of ECG-AI behavior
Depicted are two forms of visualizing the behavior of the ECG-AI deep learning model. Panel A is a saliency map of ECG-AI demarcating regions of the ECG waveform having the greatest influence on AF risk predictions. Blue shades depict the magnitude of the gradient of predicted AF risk with respect to the ECG waveform amplitude, where darker shades illustrate regions of the waveform exerting greater salience, or influence on AF risk predictions. Saliency was averaged over a random sample of 4,096 individuals in the BWH test set, and the red waveform depicts the median waveform in each lead among the 4,096 individuals. Panel B displays the median waveform of a random sample of 1,000 individuals in the BWH test set with low predicted AF risk (i.e., 5-year AF risk < 2.5%, green) versus the median waveform of a random sample of 1,000 individuals in the BWH test set with high predicted AF risk (i.e., 5-year AF risk > 5%, red).
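The saliency described for Panel A is the magnitude of the gradient of predicted risk with respect to waveform amplitude, averaged over individuals. The sketch below illustrates that idea on a toy logistic model using central finite differences. The toy model, its weights, and all function names are assumptions; a real network would typically obtain the gradient by automatic differentiation, and nothing here reproduces the ECG-AI architecture.

```python
import math

# Hypothetical weights for a three-sample "waveform"; the first sample has no influence.
TOY_WEIGHTS = [0.0, 2.0, -1.0]


def toy_risk_model(waveform):
    """Stand-in for a risk model: logistic function of a weighted sum of amplitudes."""
    z = sum(w * x for w, x in zip(TOY_WEIGHTS, waveform))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(model, waveform, eps=1e-5):
    """Absolute gradient of predicted risk at each sample, by central differences."""
    magnitudes = []
    for i in range(len(waveform)):
        up = list(waveform)
        down = list(waveform)
        up[i] += eps
        down[i] -= eps
        magnitudes.append(abs(model(up) - model(down)) / (2.0 * eps))
    return magnitudes


def mean_saliency(model, waveforms):
    """Average the per-sample saliency over a collection of equal-length waveforms."""
    per_waveform = [saliency(model, w) for w in waveforms]
    return [sum(column) / len(column) for column in zip(*per_waveform)]


single = saliency(toy_risk_model, [0.1, 0.2, 0.3])
averaged = mean_saliency(toy_risk_model, [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]])
```

In this toy case the second sample has the largest saliency because it carries the largest absolute weight, which mirrors how darker regions of the map mark the parts of the waveform that most influence the prediction.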

