A Pretraining Approach for Small-sample Training Employing Radiographs (PASTER): a Multimodal Transformer Trained by Chest Radiography and Free-text Reports

Kai-Chieh Chen et al. J Med Syst. 2025 Sep 30;49(1):120. doi: 10.1007/s10916-025-02263-3.

Abstract

While deep convolutional neural networks (DCNNs) have achieved remarkable performance in chest X-ray (CXR) interpretation, their success typically depends on access to large-scale, expertly annotated datasets. However, collecting such data in real-world clinical settings can be difficult because of limited labeling resources, privacy concerns, and patient variability. In this study, we applied a multimodal Transformer pretrained on free-text reports and their paired CXRs to evaluate the effectiveness of this method in settings with limited labeled data. Our dataset consisted of more than 1 million CXRs, each accompanied by a report from a board-certified radiologist and 31 structured labels. The results indicated that a linear model trained on embeddings from the pretrained model achieved AUCs of 0.907 and 0.903 on the internal and external test sets, respectively, using only 128 cases and 384 controls; these results were comparable to those of a DenseNet trained on the entire dataset, whose AUCs were 0.908 and 0.903, respectively. Additionally, we demonstrated similar results by extending this approach to a subset annotated with structured echocardiographic reports. Furthermore, the multimodal model exhibited excellent small-sample learning capabilities when tested on external validation sets such as CheXpert and ChestX-ray14. This research significantly reduces the sample size necessary for future artificial intelligence advancements in CXR interpretation.
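
The evaluation described above is a linear probe: a simple linear classifier fit on frozen embeddings from the pretrained encoder, with no fine-tuning of the encoder itself. Below is a minimal sketch of this idea in Python; encode_cxr and the synthetic data are hypothetical stand-ins, since the PASTER encoder and dataset are not reproduced here.

    # Minimal linear-probe sketch: fit a logistic regression on frozen
    # embeddings from a pretrained encoder, using a small labeled subset
    # (128 cases + 384 controls), and score with AUC as in the abstract.
    # `encode_cxr` is a hypothetical stand-in for the pretrained encoder;
    # the data below are synthetic so the example runs on its own.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    _PROJ = rng.standard_normal((1024, 512))  # fixed ("frozen") projection

    def encode_cxr(pixels: np.ndarray) -> np.ndarray:
        # Stand-in for the frozen multimodal encoder: maps each image
        # (here, a flattened 1024-dim vector) to a 512-dim embedding.
        return pixels @ _PROJ

    # Synthetic stand-ins for 128 positive cases and 384 controls.
    X_train = rng.standard_normal((512, 1024))
    y_train = np.concatenate([np.ones(128), np.zeros(384)])
    X_test = rng.standard_normal((200, 1024))
    y_test = rng.integers(0, 2, size=200)

    # The "linear model trained on embeddings": no encoder fine-tuning.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(encode_cxr(X_train), y_train)

    # AUC on the held-out set, the metric reported in the abstract.
    scores = probe.predict_proba(encode_cxr(X_test))[:, 1]
    print(f"AUC: {roc_auc_score(y_test, scores):.3f}")

With real data, the key property is that the encoder is queried only for embeddings, so the trainable parameter count is tiny and a few hundred labeled examples can suffice.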

Keywords: Chest radiograph; Deep learning; Few-shot prediction; Foundation model; Multimodal learning; Small sample training; Transformer.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Institutional Review Board: This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Tri-Service General Hospital (IRB No. C20230519). The IRB approved the study protocol and waived the requirement for individual informed consent due to the use of fully anonymized, retrospective data.

Informed consent: All data were obtained from the hospital's quality control center, fully anonymized prior to analysis, and exempt from informed consent as approved by the Institutional Review Board.

