Performance of a Convolutional Neural Network and Explainability Technique for 12-Lead Electrocardiogram Interpretation
- PMID: 34347007
- PMCID: PMC8340011
- DOI: 10.1001/jamacardio.2021.2746
Abstract
Importance: Millions of clinicians rely daily on automated preliminary electrocardiogram (ECG) interpretation. Critical comparisons of machine learning-based automated analysis against clinically accepted standards of care are lacking.
Objective: To use readily available 12-lead ECG data to train and apply an explainability technique to a convolutional neural network (CNN) that achieves high performance against clinical standards of care.
Design, setting, and participants: This cross-sectional study was conducted using data from January 1, 2003, to December 31, 2018. Data were obtained in a commonly available 12-lead ECG format from a single-center tertiary care institution. All patients aged 18 years or older who received ECGs at the University of California, San Francisco, were included, yielding a total of 365 009 patients. Data were analyzed from January 1, 2019, to March 2, 2021.
Exposures: A CNN was trained to predict the presence of 38 diagnostic classes in 5 categories from 12-lead ECG data. A CNN explainability technique called LIME (Local Interpretable Model-Agnostic Explanations) was used to visualize ECG segments contributing to CNN diagnoses.
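The general idea behind a LIME-style explanation can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the signal is a single 1-D trace, masked segments are replaced with a zero baseline, the proximity kernel and ridge penalty are arbitrary choices, and `toy_predict` is a hypothetical stand-in for a trained CNN. The sketch perturbs segments at random, queries the model, and fits a proximity-weighted linear surrogate whose coefficients indicate how much each segment contributes to the prediction.

```python
import math
import random
from typing import Callable, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lime_segment_weights(
    signal: Sequence[float],
    predict: Callable[[List[float]], float],
    n_segments: int = 8,
    n_samples: int = 500,
    kernel_width: float = 0.75,  # assumed value, not from the study
    ridge: float = 1e-3,         # assumed value, not from the study
    seed: int = 0,
) -> List[float]:
    """Return one surrogate coefficient per segment (larger = more influential)."""
    rng = random.Random(seed)
    seg_len = len(signal) // n_segments
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in range(n_segments)]  # 1 = keep segment
        perturbed = list(signal)
        for s, keep in enumerate(mask):
            if not keep:
                start = s * seg_len
                end = len(signal) if s == n_segments - 1 else start + seg_len
                for i in range(start, end):
                    perturbed[i] = 0.0  # assumed baseline for a masked segment
        distance = 1.0 - sum(mask) / n_segments  # fraction of segments removed
        rows.append([1.0] + [float(v) for v in mask])  # leading 1.0 = intercept
        targets.append(predict(perturbed))
        weights.append(math.exp(-(distance ** 2) / (kernel_width ** 2)))
    d = n_segments + 1
    # Weighted ridge regression via the normal equations.
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(d)]
            for i in range(d)]
    for i in range(1, d):
        xtwx[i][i] += ridge  # do not penalize the intercept
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(d)]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[1:]  # drop the intercept


def toy_predict(x: List[float]) -> float:
    """Hypothetical black box: responds only to samples 16..23 (segment 2)."""
    return sum(x[16:24]) / 8.0


segment_weights = lime_segment_weights([1.0] * 64, toy_predict)
most_influential = segment_weights.index(max(segment_weights))
```

In this toy setup the surrogate should assign nearly all of its weight to segment 2, the only region the hypothetical model looks at; with a real classifier the coefficients would instead be overlaid on the ECG trace to highlight the contributing segments.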
Main outcomes and measures: Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were calculated for the CNN in the holdout test data set against cardiologist clinical diagnoses. For a second validation, 3 electrophysiologists provided consensus committee diagnoses against which the CNN, cardiologist clinical diagnosis, and MUSE (GE Healthcare) automated analysis performance was compared using the F1 score; AUC, sensitivity, and specificity were also calculated for the CNN against the consensus committee.
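For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, the F1 score, and AUC can be computed for a single diagnostic class from binary reference labels. It is an illustration only: the function names and the example labels and scores are hypothetical, and the AUC uses the standard rank-based (Mann-Whitney) formulation rather than any procedure confirmed by the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 for one class (1 = diagnosis present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for 8 ECGs (not study data).
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

metrics = binary_metrics(reference, predicted)
area = auc(reference, scores)
```

Sensitivity, specificity, and F1 depend on the threshold used to turn scores into predictions, whereas AUC summarizes ranking quality across all thresholds.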
Results: A total of 992 748 ECGs from 365 009 adult patients (mean [SD] age, 56.2 [17.6] years; 183 600 women [50.3%]; and 175 277 White patients [48.0%]) were included in the analysis. In 91 440 test data set ECGs, the CNN demonstrated an AUC of at least 0.960 for 32 of 38 classes (84.2%). Against the consensus committee diagnoses, the CNN had higher frequency-weighted mean F1 scores than both cardiologists and MUSE in all 5 categories (CNN frequency-weighted F1 score for rhythm, 0.812; conduction, 0.729; chamber diagnosis, 0.598; infarct, 0.674; and other diagnosis, 0.875). For 32 of 38 classes (84.2%), the CNN had AUCs of at least 0.910 and demonstrated comparable F1 scores and higher sensitivity than cardiologists, except for atrial fibrillation (CNN F1 score, 0.847 vs cardiologist F1 score, 0.881), junctional rhythm (0.526 vs 0.727), premature ventricular complex (0.786 vs 0.800), and Wolff-Parkinson-White (0.800 vs 0.842). Compared with MUSE, the CNN had higher F1 scores for all classes except supraventricular tachycardia (CNN F1 score, 0.696 vs MUSE F1 score, 0.714). The LIME technique highlighted physiologically relevant ECG segments.
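The abstract does not spell out how the frequency-weighted mean F1 score is formed. One common reading, assumed here, is that each class's F1 score is weighted by the number of reference-positive ECGs for that class within a category. The class names and counts below are hypothetical and are not taken from the study.

```python
from typing import Dict


def frequency_weighted_f1(f1_by_class: Dict[str, float],
                          support_by_class: Dict[str, int]) -> float:
    """Mean of per-class F1 scores, weighted by each class's positive-case count."""
    total = sum(support_by_class[c] for c in f1_by_class)
    if total == 0:
        raise ValueError("at least one class must have positive cases")
    return sum(f1_by_class[c] * support_by_class[c] for c in f1_by_class) / total


# Hypothetical two-class category: a common class and a rare class.
example_f1 = {"common_class": 0.9, "rare_class": 0.5}
example_support = {"common_class": 30, "rare_class": 10}

weighted = frequency_weighted_f1(example_f1, example_support)
unweighted = sum(example_f1.values()) / len(example_f1)
```

Under this weighting, frequent diagnoses dominate the category score, so a low F1 on a rare class lowers the weighted mean less than it would lower an unweighted mean.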
Conclusions and relevance: The results of this cross-sectional study suggest that readily available ECG data can be used to train a CNN algorithm to achieve comparable performance to clinical cardiologists and exceed the performance of MUSE automated analysis for most diagnoses, with some exceptions. The LIME explainability technique applied to CNNs highlights physiologically relevant ECG segments that contribute to the CNN's diagnoses.
Comment in
- Leveraging Large Clinical Data Sets for Artificial Intelligence in Medicine. JAMA Cardiol. 2021 Nov 1;6(11):1296-1297. doi: 10.1001/jamacardio.2021.2878. PMID: 34347003.