2025 Jan 12;8(1):23.
doi: 10.1038/s41746-024-01418-9.

Unsupervised deep learning of electrocardiograms enables scalable human disease profiling

Sam F Friedman et al. NPJ Digit Med.

Abstract

The 12-lead electrocardiogram (ECG) is inexpensive and widely available. Whether conditions across the human disease landscape can be detected using the ECG is unclear. We developed a deep learning denoising autoencoder and systematically evaluated associations between ECG encodings and ~1,600 Phecode-based diseases in three datasets separate from model development, and meta-analyzed the results. The latent space ECG model identified associations with 645 prevalent and 606 incident Phecodes. Associations were most enriched in the circulatory (n = 140, 82% of category-specific Phecodes), respiratory (n = 53, 62%) and endocrine/metabolic (n = 73, 45%) categories, with additional associations across the phenome. The strongest ECG association was with hypertension (p < 2.2 × 10^-308). The ECG latent space model demonstrated more associations than models using standard ECG intervals, and offered favorable discrimination of prevalent disease compared to models comprising age, sex, and race. We further demonstrate how latent space models can be used to generate disease-specific ECG waveforms and facilitate individual disease profiling.

Conflict of interest statement

Competing interests: Dr. Lubitz is a full-time employee of Novartis Institutes for Biomedical Research as of July 18, 2022. Dr. Lubitz has received sponsored research support from Bristol Myers Squibb, Pfizer, Boehringer Ingelheim, Fitbit, Medtronic, Premier, and IBM, and has consulted for Bristol Myers Squibb, Pfizer, Blackstone Life Sciences, and Invitae. Dr. Anderson receives sponsored research support from Bayer AG and Massachusetts General Hospital and has consulted for ApoPharma. Dr. Weng receives sponsored research support from IBM to the Broad Institute. Dr. Ellinor has received sponsored research support from Bayer AG and IBM Health, and he has consulted for Bayer AG, Novartis and MyoKardia. Dr. Batra, Dr. Reeder and Dr. Friedman have received sponsored research support from Bayer AG and IBM Health. Dr. Ho and Dr. Khurshid have received sponsored research support from Bayer AG. The remaining authors declare no competing interests.

Figures

Fig. 1. Study overview.
Flow diagram of autoencoder and phenotype vector derivation for latent space phenome-wide association studies (PheWAS), conducted in parallel for both 12-lead and single-lead electrocardiogram (ECG) models. We trained an autoencoder to encode and reconstruct 12- and single-lead ECGs using the Massachusetts General Hospital (MGH) subset of the Community Care Cohort Project (C3PO) dataset (MGH-C3PO). We tested the autoencoder in three test sets without modification: a) an MGH-C3PO holdout set, b) the independent Brigham and Women’s Hospital (BWH) subset of C3PO (BWH-C3PO), and c) the UK Biobank prospective cohort study. To assess for associations with disease, we derived phenotype vectors using labeled ECGs from 50% of the MGH-C3PO dataset, and projected those vectors onto each test set without modification. For every individual in each test set, we calculated the projected component, or the position along each phenotype vector (hereafter termed “vector component score”), and tested associations between vector component scores and corresponding Phecodes. We performed sample-level PheWAS in each of the three datasets and then meta-analyzed the results.
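The projection step in this caption can be illustrated with a minimal sketch. The caption does not say how phenotype vectors are derived from labeled ECGs, so the sketch assumes one simple possibility: the difference between the mean latent encoding of individuals with and without the disease. The function names (`centroid`, `phenotype_vector`, `vector_component_scores`) and the two-dimensional toy encodings are hypothetical and are not taken from the paper.

```python
import math


def centroid(rows):
    """Coordinate-wise mean of a list of equal-length latent encodings."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def phenotype_vector(latents, has_disease):
    """ASSUMPTION: vector from the control centroid to the case centroid.

    The source only says vectors are derived from labeled ECGs; this
    difference-of-means rule is an illustrative stand-in.
    """
    cases = [z for z, y in zip(latents, has_disease) if y]
    controls = [z for z, y in zip(latents, has_disease) if not y]
    return [c - k for c, k in zip(centroid(cases), centroid(controls))]


def vector_component_scores(latents, vector):
    """Scalar projection of each encoding onto the unit-length phenotype vector."""
    norm = math.sqrt(sum(v * v for v in vector))
    unit = [v / norm for v in vector]
    return [sum(z_i * u_i for z_i, u_i in zip(z, unit)) for z in latents]


# Toy derivation set: four encodings, two without and two with the disease.
derivation_latents = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
derivation_labels = [False, False, True, True]

# Toy test set: the vector is applied without modification.
test_latents = [[1.0, 5.0], [3.0, -2.0]]

vector = phenotype_vector(derivation_latents, derivation_labels)
scores = vector_component_scores(test_latents, vector)
print(vector)  # [4.0, 0.0]
print(scores)  # [1.0, 3.0]
```

In this toy case the phenotype vector points along the first latent axis, so each score is simply the first coordinate of the test encoding. In a PheWAS of the kind described, such scores would then enter a per-Phecode association model.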
Fig. 2. Latent space phenome-wide association study results for the 12-lead electrocardiogram autoencoder model.
Panels depict phenome-wide association study results for the 12-lead electrocardiogram autoencoder. The top panel depicts prevalent (existing) disease associations, and the bottom panel depicts incident disease associations. Each Phecode tested for association is represented as a single point on the plot. The x-axis represents the phenotype category and the y-axis represents the -log10(p value) for the association test.
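The y-axis transform can be sketched in a few lines. The bound reported in the abstract for hypertension, 2.2 × 10^-308, matches the smallest positive normal double-precision number, so very small p values may underflow to zero in ordinary floating-point arithmetic. The source does not say how its software handled this. The clamp below is an assumption made for illustration, and `neg_log10_p` is a hypothetical helper name.

```python
import math
import sys


def neg_log10_p(p):
    """Return -log10(p) for plotting.

    ASSUMPTION: p values that underflow to 0 are clamped to the smallest
    positive normal double (about 2.2e-308) so the result stays finite.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    floor = sys.float_info.min
    return -math.log10(max(p, floor))


print(round(neg_log10_p(1e-5), 2))  # 5.0
print(round(neg_log10_p(0.0), 2))   # 307.65
```

With this clamp, no point on such a plot can sit higher than about 307.65 on the y-axis.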
Fig. 3. Significant associations in the latent space and electrocardiogram intervals phenome-wide association studies.
Panel a displays the test statistic distribution (absolute z-score) for the ECG term in the meta-analyzed phenome-wide association study (PheWAS), stratified by modeling approach. Results are displayed for the 12-lead and 1-lead electrocardiogram (ECG) latent space models, as well as the ECG intervals model. Panel b shows the number of significantly associated Phecodes, defined as those with a two-sided p value below the Bonferroni-corrected threshold of 3.1 × 10^-5 (0.05 divided by 1584, the number of unique Phecodes included across all meta-analyses). For the intervals model, a result was considered significant if the meta-analyzed p value for any of the tested ECG intervals (PR, QRS, QT) fell below the significance threshold. Compared with the ECG intervals model, the latent space models yield a greater number of significant associations, both overall and across disease categories.
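The threshold arithmetic and the any-interval rule from this caption can be written out as a short sketch. The function names and the example p values for the PR, QRS and QT intervals are hypothetical. Only the 0.05 significance level and the count of 1584 Phecodes come from the caption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def any_significant(p_values, threshold):
    """True if at least one p value falls below the threshold."""
    return any(p < threshold for p in p_values)


threshold = bonferroni_threshold(0.05, 1584)
print(f"{threshold:.2e}")  # 3.16e-05 (the caption gives this as 3.1 x 10^-5)

# Hypothetical meta-analyzed p values for one Phecode in the intervals model.
interval_p_values = {"PR": 0.2, "QRS": 1e-6, "QT": 0.01}
print(any_significant(interval_p_values.values(), threshold))  # True
```

Counting a Phecode as significant when any of three intervals passes gives the intervals model three chances per Phecode, so the comparison in panel b is, if anything, generous to that model.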
Fig. 4. Model-based, disease-specific ECG reconstructions.
Median waveform reconstructions for centroids reflecting individuals without (blue) and with (red) left bundle branch block in panel a, hypokalemia (hypopotassemia) in panel b, hypertrophic cardiomyopathy in panel c, and rheumatoid arthritis in panel d.
