A foundational vision transformer improves diagnostic performance for electrocardiograms

Akhil Vaid et al. NPJ Digit Med. 2023 Jun 6;6(1):108. doi: 10.1038/s41746-023-00840-9.
Abstract

The electrocardiogram (ECG) is a ubiquitous diagnostic modality. Convolutional neural networks (CNNs) applied to ECG analysis require large sample sizes, and transfer learning approaches for biomedical problems may result in suboptimal performance when pre-training is done on natural images. We leveraged masked image modeling to create a vision-based transformer model, HeartBEiT, for electrocardiogram waveform analysis. We pre-trained this model on 8.5 million ECGs and then compared its performance against standard CNN architectures for diagnosis of hypertrophic cardiomyopathy, low left ventricular ejection fraction, and ST elevation myocardial infarction, using differing training sample sizes and independent validation datasets. We find that HeartBEiT has significantly higher performance at lower sample sizes compared to other models. We also find that HeartBEiT improves explainability of diagnosis by highlighting biologically relevant regions of the ECG, whereas standard CNNs produce more diffuse attributions. Domain-specific pre-trained transformer models may exceed the classification performance of models trained on natural images, especially in very low data regimes. The combination of the architecture and such pre-training allows for more accurate, granular explainability of model predictions.
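The fine-tuning setup the abstract describes, a pre-trained encoder with a classification head attached for each diagnostic task, can be sketched as follows. The `encoder` stub, the 768-dimensional embedding (the width of a ViT-Base backbone), and the head sizes are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(ecg_image):
    """Stand-in for the pre-trained HeartBEiT encoder (hypothetical):
    maps an ECG image to a fixed-size embedding. A real fine-tuning run
    would use the pre-trained transformer weights instead."""
    return ecg_image.reshape(-1)[:768] * 0.01  # 768-d embedding (ViT-Base width)

def mlp_head(embedding, w1, b1, w2, b2):
    """Two-layer perceptron classification head on top of the encoder."""
    hidden = np.maximum(0.0, embedding @ w1 + b1)   # ReLU
    logit = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logit))             # probability of the diagnosis

# Randomly initialized head weights; these are what fine-tuning would learn.
w1 = rng.normal(scale=0.02, size=(768, 64))
b1 = np.zeros(64)
w2 = rng.normal(scale=0.02, size=64)
b2 = 0.0

prob = mlp_head(encoder(np.ones((224, 224))), w1, b1, w2, b2)
```

Because only the head is task-specific, the same pre-trained encoder can be reused for each diagnosis (low ejection fraction, hypertrophic cardiomyopathy, STEMI), which is what makes the low-sample-size regime tractable.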


Conflict of interest statement

Dr. Nadkarni reports consultancy agreements with AstraZeneca, BioVie, GLG Consulting, Pensieve Health, Reata, Renalytix, Siemens Healthineers, and Variant Bio; research funding from Goldfinch Bio and Renalytix; honoraria from AstraZeneca, BioVie, Lexicon, Daiichi Sankyo, Menarini Health and Reata; patents or royalties with Renalytix; owns equity and stock options in Pensieve Health and Renalytix as a scientific cofounder; owns equity in Verici Dx; has received financial compensation as a scientific board member and advisor to Renalytix; serves on the advisory board of Neurona Health; and serves in an advisory or leadership role for Pensieve Health and Renalytix. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Figures

Fig. 1
Fig. 1. Modeling workflow.
Pre-training of the HeartBEiT model. Each original ECG (1) is partitioned into a 14 × 14 grid of patches (2), each 16 × 16 pixels. These patches are tokenized, and some of them are masked (3). The DALL-E model (4) acts as the tokenizer, converting the image into discrete tokens (5) that are then used in the masked image modeling process (6). This pre-trains the HeartBEiT model’s attention modules (7); the model may then be used for downstream fine-tuning and inference (8, 9) after addition of a multi-layer perceptron classification head (10).
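The patch-partitioning and masking steps of this workflow can be sketched in a few lines. The 224 × 224 input size (which yields the 14 × 14 grid of 16 × 16 patches) and the 40% mask ratio are assumptions for illustration; the DALL-E tokenization and reconstruction steps are omitted:

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split a square grayscale image into non-overlapping patches.

    For a 224 x 224 ECG rendering this yields the 14 x 14 grid of
    16 x 16 pixel patches described in the Fig. 1 workflow.
    """
    h, w = image.shape
    grid = h // patch_size
    patches = image.reshape(grid, patch_size, grid, patch_size)
    # Reorder axes so each row is one complete patch.
    return patches.transpose(0, 2, 1, 3).reshape(-1, patch_size, patch_size)

def random_mask(num_patches, mask_ratio=0.4, seed=0):
    """Pick the patch indices to hide; the model must predict the
    visual tokens of these masked patches during pre-training."""
    rng = np.random.default_rng(seed)
    num_masked = int(num_patches * mask_ratio)
    return rng.choice(num_patches, size=num_masked, replace=False)

ecg_image = np.zeros((224, 224), dtype=np.float32)  # placeholder ECG rendering
patches = patchify(ecg_image)           # shape (196, 16, 16)
masked_idx = random_mask(len(patches))  # indices to reconstruct
```

The masked-patch prediction objective is what lets the model learn ECG-specific structure without any diagnostic labels, which is why the labeled fine-tuning sets can be small.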
Fig. 2
Fig. 2. Left ventricular ejection fraction ≤ 40% classification on ECGs.
a Internal testing performance (4 Mount Sinai facilities). b Internal testing performance difference. c External validation performance (Morningside patients). d External validation performance difference. Red dashed line in (b) and (d) indicates HeartBEiT performance.
Fig. 3
Fig. 3. Hypertrophic cardiomyopathy classification on ECGs.
a Internal testing performance (4 Mount Sinai facilities). b Internal testing performance difference. c External validation performance (Morningside patients). d External validation performance difference. Red dashed line in (b) and (d) indicates HeartBEiT performance.
Fig. 4
Fig. 4. STEMI detection on ECGs (PTB-XL database).
a Internal testing performance. b Internal testing performance difference. Dashed red line in (b) indicates HeartBEiT performance.
Fig. 5
Fig. 5. Saliency mapping for STEMI detection at 1% training data.
a ViT-B/16. b EfficientNet-B4. c ResNet-152. d HeartBEiT. HeartBEiT localizes to the ST segments. Other models are more diffuse in highlighting features of importance and may be less useful clinically.
