Circulation. 2022 Jul 5;146(1):36-47.
doi: 10.1161/CIRCULATIONAHA.121.057869. Epub 2022 May 9.

rECHOmmend: An ECG-Based Machine Learning Approach for Identifying Patients at Increased Risk of Undiagnosed Structural Heart Disease Detectable by Echocardiography

Alvaro E Ulloa-Cerna et al. Circulation.

Abstract

Background: Timely diagnosis of structural heart disease improves patient outcomes, yet many cases remain undiagnosed. While population screening with echocardiography is impractical, ECG-based prediction models can help target high-risk patients. We developed a novel ECG-based machine learning approach to predict multiple structural heart conditions, hypothesizing that a composite model would yield higher prevalence and positive predictive values to facilitate meaningful recommendations for echocardiography.

Methods: Using 2 232 130 ECGs linked to electronic health records and echocardiography reports from 484 765 adults between 1984 and 2021, we trained machine learning models to predict the presence or absence of any of 7 echocardiography-confirmed diseases within 1 year. This composite label included the following: moderate or severe valvular disease (aortic/mitral stenosis or regurgitation, tricuspid regurgitation), reduced ejection fraction <50%, or interventricular septal thickness >15 mm. We tested various combinations of input features (demographics, laboratory values, structured ECG data, ECG traces) and evaluated model performance using 5-fold cross-validation, multisite validation trained on 1 site and tested on 10 independent sites, and simulated retrospective deployment trained on pre-2010 data and deployed in 2010.
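The composite label described in the Methods can be illustrated with a minimal sketch. The class `EchoReport`, its field names, and the severity vocabulary are hypothetical and do not come from the study; only the thresholds (moderate or severe valvular disease, ejection fraction <50%, septal thickness >15 mm) are taken from the abstract. The sketch also assumes that a missing measurement does not count as disease, and it leaves out the 1-year window between ECG and echocardiogram.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical severity vocabulary; the study's actual report coding is not given here.
SEVERITY_ORDER = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}

# The five valvular conditions named in the abstract (hypothetical field names).
VALVE_FIELDS = (
    "aortic_stenosis",
    "aortic_regurgitation",
    "mitral_stenosis",
    "mitral_regurgitation",
    "tricuspid_regurgitation",
)


@dataclass
class EchoReport:
    aortic_stenosis: str = "none"
    aortic_regurgitation: str = "none"
    mitral_stenosis: str = "none"
    mitral_regurgitation: str = "none"
    tricuspid_regurgitation: str = "none"
    ejection_fraction_pct: Optional[float] = None  # None means not measured
    ivs_thickness_mm: Optional[float] = None       # None means not measured


def composite_label(echo: EchoReport) -> bool:
    """Return True if any of the 7 component conditions is present."""
    valvular = any(
        SEVERITY_ORDER[getattr(echo, name)] >= SEVERITY_ORDER["moderate"]
        for name in VALVE_FIELDS
    )
    # Assumption: a missing measurement is treated as "criterion not met".
    reduced_ef = (
        echo.ejection_fraction_pct is not None and echo.ejection_fraction_pct < 50
    )
    thick_septum = echo.ivs_thickness_mm is not None and echo.ivs_thickness_mm > 15
    return valvular or reduced_ef or thick_septum
```

Because the label is a logical OR over its components, its prevalence is at least as high as that of any single component, which is the property the composite approach relies on to raise positive predictive value.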

Results: Our composite rECHOmmend model used age, sex, and ECG traces and had a 0.91 area under the receiver operating characteristic curve and a 42% positive predictive value at 90% sensitivity, with a composite label prevalence of 17.9%. Individual disease models had areas under the receiver operating characteristic curve from 0.86 to 0.93 and lower positive predictive values from 1% to 31%. Areas under the receiver operating characteristic curve for models using different input features ranged from 0.80 to 0.93, increasing with additional features. Multisite validation showed similar results to cross-validation, with an aggregate area under the receiver operating characteristic curve of 0.91 across our independent test set of 10 clinical sites after training on a separate site. Our simulated retrospective deployment showed that for ECGs acquired in patients without preexisting structural heart disease in the year 2010, 11% were classified as high risk, and 41% of those (4.5% of total patients) developed true echocardiography-confirmed disease within 1 year.

Conclusions: An ECG-based machine learning model using a composite end point can identify a high-risk population for having undiagnosed, clinically significant structural heart disease while outperforming single-disease models and improving practical utility with higher positive predictive values. This approach can facilitate targeted screening with echocardiography to reduce underdiagnosis of structural heart disease.

Keywords: cardiomyopathies; echocardiography; electrocardiography; heart valve diseases; machine learning; ventricular dysfunction.

Figures

Figure 1.
Flow diagram from source data to the study datasets. We processed data from research repositories created using EHR data from Epic, ECG data from MUSE, and echocardiography data from Xcelera. The clinical MUSE database was processed to include 12-lead ECGs sampled at either 250 Hz or 500 Hz, acquired after 1984 from patients >18 years. ECG indicates electrocardiogram; EHR, electronic health record; and Echo, echocardiography.
Figure 2.
rECHOmmend model diagram showing the classification pipeline for ECG traces and other electronic health record data. The output (gray triangle) of each CNN applied to ECG trace data is concatenated with labs, vitals, and demographics to form a feature vector. This vector is the input to the classification pipeline (min–max scaling, mean imputation, XGBoost classifier, and calibration), which outputs a composite prediction for the patient. AR indicates aortic regurgitation; AS, aortic stenosis; CNN, convolutional neural network; ECG, electrocardiogram; EF, ejection fraction; EHR, electronic health record; IVS, interventricular septum; MR, mitral regurgitation; MS, mitral stenosis; and TR, tricuspid regurgitation.
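The first stages named in the Figure 2 caption (concatenation into a feature vector, min–max scaling, mean imputation) can be sketched in plain Python as follows. This is an illustration under assumptions, not the authors' code: the function names and the toy numbers are hypothetical, missing values are represented as `None`, and the XGBoost classifier and calibration steps are left out. In practice the scaling range and imputation means would be fitted on training data only and then reused on new patients.

```python
from typing import List, Optional, Sequence

Row = List[Optional[float]]  # None marks a missing lab or vital sign


def build_feature_vector(cnn_outputs: Sequence[float],
                         tabular: Sequence[Optional[float]]) -> Row:
    """Concatenate per-disease CNN outputs with labs, vitals, and demographics."""
    return list(cnn_outputs) + list(tabular)


def min_max_scale(rows: List[Row]) -> List[Row]:
    """Scale each column to [0, 1] using observed values; missing entries stay None."""
    scaled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            continue
        lo, span = min(observed), max(observed) - min(observed)
        for r in scaled:
            if r[j] is not None:
                r[j] = (r[j] - lo) / span if span else 0.0
    return scaled


def mean_impute(rows: List[Row]) -> List[List[float]]:
    """Replace missing entries with the column mean of the observed values."""
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        for r in filled:
            if r[j] is None:
                r[j] = mean
    return filled


# Toy data: two CNN outputs, then age (years) and one lab value (missing for patient 0).
rows = [
    build_feature_vector([0.2, 0.9], [60.0, None]),
    build_feature_vector([0.4, 0.1], [80.0, 4.0]),
    build_feature_vector([0.6, 0.5], [70.0, 6.0]),
]
features = mean_impute(min_max_scale(rows))
```

The resulting `features` rows are fully numeric and lie in [0, 1], which is the form a downstream classifier would receive.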
Figure 3.
Performance of the rECHOmmend model in cross-validation experiments across various inputs. The plot on the left shows the AUROC while the plot on the right shows the AUPRC. AUPRC indicates area under the precision-recall curve; and AUROC, area under the receiver operating characteristic curve.
Figure 4.
Results of retrospective deployment scenario from 2010. A, Results for all patients. B, Relative results per 100 at-risk patients. These results are based on a threshold yielding 50% sensitivity from the pre-2010 cross-validation experiment, resulting in 41.1% positive predictive value, 96.2% negative predictive value, 95.7% specificity, 44.1% sensitivity, and 6.4% prevalence in 2010. For 100 patients without known history of disease obtaining an ECG, the rECHOmmend model will identify 11 patients at high risk of disease, of which 5 are expected to have true disease within 1 year. The model will identify 89 patients not at high risk of disease, of which 86 are not expected to have true disease within 1 year. ECG indicates electrocardiogram; FN, false negative; FP, false positive; TN, true negative; and TP, true positive.
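The predictive values quoted in the Figure 4 caption follow from sensitivity, specificity, and prevalence through Bayes' rule. The short sketch below shows that arithmetic; the function name `predictive_values` is hypothetical, and because the inputs are the rounded percentages from the caption, the results only approximate the reported 41.1% positive and 96.2% negative predictive values.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    # Express each confusion-matrix cell as a fraction of the whole population.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    ppv = tp / (tp + fp)  # share of flagged patients who truly have disease
    npv = tn / (tn + fn)  # share of unflagged patients who are truly disease-free
    return ppv, npv


# Rounded values from the Figure 4 caption (2010 retrospective deployment).
ppv, npv = predictive_values(sensitivity=0.441, specificity=0.957, prevalence=0.064)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

The same function makes the prevalence effect visible: holding sensitivity and specificity fixed, a higher prevalence raises the positive predictive value, which is the rationale the abstract gives for a composite end point.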
