Nat Commun. 2023 Apr 28;14(1):2436. doi: 10.1038/s41467-023-38125-0

Cross-modal autoencoder framework learns holistic representations of cardiovascular state


Adityanarayanan Radhakrishnan et al.

Abstract

A fundamental challenge in diagnostics is integrating multiple modalities to develop a joint characterization of physiological state. Using the heart as a model system, we develop a cross-modal autoencoder framework for integrating distinct data modalities and constructing a holistic representation of cardiovascular state. In particular, we use our framework to construct such cross-modal representations from cardiac magnetic resonance images (MRIs), containing structural information, and electrocardiograms (ECGs), containing myoelectric information. We leverage the learned cross-modal representation to (1) improve phenotype prediction from a single, accessible modality such as ECGs; (2) enable imputation of hard-to-acquire cardiac MRIs from easy-to-acquire ECGs; and (3) develop a framework for performing genome-wide association studies in an unsupervised manner. Our results systematically integrate distinct diagnostic modalities into a common representation that better characterizes physiologic state.


Conflict of interest statement

S.A.L. receives sponsored research support from Bristol Myers Squibb, Pfizer, Boehringer Ingelheim, Fitbit/Google, Medtronic, Premier, and IBM, and has consulted for Bristol Myers Squibb, Pfizer, Blackstone Life Sciences, and Invitae. A.A.P. is a Venture Partner at GV. He has received funding from IBM, Bayer, Pfizer, Microsoft, Verily, and Intel. C.U. serves on the Scientific Advisory Board of Immunai and Relation Therapeutics and has received sponsored research support from Janssen Pharmaceuticals. The remaining authors declare no competing interests.

Figures

Fig. 1. An overview of our cross-modal autoencoder framework for integrating cardiovascular data modalities.
Our model is trained on ECG and cardiac MRI pairs from the UK Biobank. a A visualization of our training pipeline. Modality-specific encoders map data modalities into a shared latent space, in which a contrastive loss enforces the constraint that paired samples are embedded near each other and far from other samples. Modality-specific decoders then reconstruct each modality from points in the latent space. b Learned cross-modal representations are used for downstream phenotype prediction tasks by training a supervised learning model (e.g., a kernel machine) on the latent representations. c Our framework enables translation between modalities: ECGs can be translated to corresponding MRIs and vice versa. d The learned cross-modal representations can be used to understand genotype–phenotype maps in the absence of labelled phenotypes by performing a GWAS in the cross-modal latent space and clustering SNPs via their signatures (i.e., the vector in latent space oriented from homozygous reference to the mean of heterozygous and homozygous alternate); SNPs 1 and 4 have similar signatures in the latent space and thus similar phenotypic effects.
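The contrastive pairing described in panel a can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: the toy embeddings, temperature, and function names are all hypothetical, and the real encoders are deep networks trained jointly with the decoders.

```python
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def contrastive_loss(ecg_z, mri_z, temperature=0.1):
    # Cosine-similarity matrix between every ECG and every MRI embedding.
    sim = normalize(ecg_z) @ normalize(mri_z).T / temperature
    # Row i's positive is column i (its paired MRI); all other MRIs are negatives.
    n = sim.shape[0]
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_softmax[np.arange(n), np.arange(n)].mean()

rng = np.random.default_rng(0)
# Toy latent vectors: paired ECG/MRI samples are deliberately close.
base = rng.normal(size=(8, 16))
ecg_z = base + 0.05 * rng.normal(size=(8, 16))
mri_z = base + 0.05 * rng.normal(size=(8, 16))

aligned = contrastive_loss(ecg_z, mri_z)
mismatched = contrastive_loss(ecg_z, np.roll(mri_z, 1, axis=0))
# Correctly paired samples give a lower loss than mismatched ones.
```

Minimizing this loss pulls each ECG toward its own MRI in the shared latent space and pushes it away from all others, which is what makes the two modalities interchangeable downstream.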
Fig. 2. Improvement of phenotype prediction from cross-modal representations over unimodal representations or supervised learning from the original modalities.
a A t-SNE visualization of the cross-modal embeddings for the ECG and MRI samples demonstrates that the embeddings of the two modalities are well mixed, unlike the modality-specific embeddings obtained from unimodal autoencoders. b Ranking each MRI by its cosine similarity with a given ECG in the latent space, we visualize the frequency with which the ground-truth MRI appears among the top k neighbors for 4752 test ECG–MRI pairs from the UK Biobank. c Kernel regression on cross-modal representations outperforms kernel regression on unimodal representations and supervised deep learning methods on 4 different tasks: (1) prediction of ECG-derived phenotypes from MRIs only (n = 4120); (2) prediction of MRI-derived phenotypes from ECGs only (n = 4218); (3) prediction of general categorical physiological phenotypes from either ECG or MRI (n = 4218); and (4) prediction of general continuous physiological phenotypes from either ECG or MRI (n = 4212). In all four tasks, mean values are reported with error bars indicating one standard deviation, computed using 5-fold cross-validation. All MRI phenotype abbreviations are defined in the “Methods” subsection “Models, data, and scaling law for phenotype prediction tasks”. d Analysis of the scaling law when using our framework to predict MRI-derived phenotypes from ECGs only. Increasing the number of unlabelled ECG–MRI pairs for pre-training boosts the mean R2 of predicting 9 MRI-derived phenotypes by twice as much as increasing the number of labelled MRI samples, highlighting the benefit of collecting more unlabelled ECG–MRI pairs rather than more paired labelled examples for this task.
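The “kernel machine on latent representations” step in panel c admits a compact sketch. The code below is an illustrative NumPy implementation of closed-form kernel ridge regression on stand-in embeddings; the RBF kernel choice, hyperparameters, and synthetic phenotype are assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Gaussian (RBF) kernel matrix built from pairwise squared distances.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge(Z_train, y_train, Z_test, ridge=1e-2):
    # Closed-form kernel ridge regression ("kernel machine") on embeddings.
    K = rbf_kernel(Z_train, Z_train)
    alpha = np.linalg.solve(K + ridge * np.eye(len(K)), y_train)
    return rbf_kernel(Z_test, Z_train) @ alpha

rng = np.random.default_rng(1)
Z = rng.normal(size=(300, 2))                  # stand-in latent embeddings
y = Z[:, 0] + 0.1 * rng.normal(size=300)       # stand-in continuous phenotype
pred = kernel_ridge(Z[:250], y[:250], Z[250:])

# Held-out R2 of the kernel machine's predictions.
ss_res = ((y[250:] - pred) ** 2).sum()
ss_tot = ((y[250:] - y[250:].mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
```

Because the representation is learned once, swapping in a different phenotype only requires refitting this inexpensive closed-form regression, which is what makes the scaling analysis in panel d cheap to run.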
Fig. 3. Cross-modal autoencoders enable imputing cardiac MRIs from ECGs while capturing MRI-specific features such as left ventricular mass (LVM) and right ventricular end-diastolic volume (RVEDV) on test MRI–ECG pairs.
a Examples showing qualitatively that MRIs imputed from test ECG samples capture LVM for those individuals with LVM in the highest and lowest quartile. The LVM in the original, translated, and reconstructed MRI is shown in red. b Examples showing qualitatively that MRIs imputed from test ECGs capture RVEDV for those individuals with RVEDV in the highest and lowest quartile. The RVEDV in the original, translated, and reconstructed MRI is shown in red. c The predictions of LVM and RVEDV on MRIs imputed from test ECGs correlate with the predictions of these phenotypes performed on the original MRIs.
Fig. 4. Cross-modal autoencoders capture genotype–phenotype associations for cardiovascular data.
a Manhattan plots for GWAS of BMI and RVEF derived from cross-modal embeddings identify lead SNPs associated with these traits. For BMI, such GWAS identifies SNPs associated with FTO, which is known to have an effect on BMI and obesity risk. For RVEF, such GWAS identifies SNPs associated with BAG3, HMGA2, and MLF1, which have been previously associated with RVEF. b To more generally capture genetic associations with the heart, a GWAS can be performed in the cross-modal ECG and MRI latent space even in the absence of labelled data. The Manhattan plots of such unsupervised GWAS identify lead SNPs including those associated with NOS1AP, TTN, SCN10A, SLC35F1, KCNQ1, which have been previously associated with cardiovascular phenotypes. c The corresponding QQ plots and λGC factors indicate that there is minimal inflation in the unsupervised GWAS of cross-modal ECG and cardiac MRI embeddings. d Clustering SNPs by the vector from the mean embedding of homozygous reference samples to the mean embedding of heterozygous and homozygous alternate samples in order to group SNPs by similar phenotypic effect results in clusters of SNPs corresponding to those associated with the QT interval (NOS1AP and KCNQ1), those related to the P-wave (SCN10A and ALPK3), as well as SNPs that affect multiple cardiac traits (e.g., BAG3, SLC35F1, and KCND3).
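The SNP “signature” construction in panel d (the latent-space vector from homozygous-reference carriers to alternate-allele carriers) can be sketched as follows. All data here are synthetic and the function names hypothetical; the point is only to show how signatures with a shared latent direction, like SNPs 1 and 4 in Fig. 1d, would cluster together.

```python
import numpy as np

def snp_signature(embeddings, genotypes):
    # Vector from the mean embedding of homozygous-reference carriers to the
    # mean embedding of heterozygous and homozygous-alternate carriers.
    ref = embeddings[genotypes == 0].mean(axis=0)
    alt = embeddings[genotypes >= 1].mean(axis=0)
    return alt - ref

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(2)
Z = rng.normal(size=(500, 4))              # stand-in cross-modal embeddings
shared = np.array([1.0, 0.0, 0.0, 0.0])    # shared latent effect direction
other = np.array([0.0, 1.0, 0.0, 0.0])     # unrelated effect direction

# Two SNPs shifting carriers along the same direction, plus one unrelated SNP.
sigs = []
for shift in (shared, shared, other):
    g = rng.integers(0, 3, size=500)       # genotypes coded 0/1/2
    Zg = Z + np.outer((g >= 1).astype(float), 2.0 * shift)
    sigs.append(snp_signature(Zg, g))

sim_12 = cosine(sigs[0], sigs[1])          # SNPs with a shared effect
sim_13 = cosine(sigs[0], sigs[2])          # SNPs with unrelated effects
```

Clustering these signature vectors by cosine similarity groups SNPs with similar phenotypic effects without ever requiring a labelled phenotype, which is the basis of the unsupervised GWAS clustering described above.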

