A Pretraining Approach for Small-sample Training Employing Radiographs (PASTER): a Multimodal Transformer Trained by Chest Radiography and Free-text Reports

Kai-Chieh Chen et al. J Med Syst. 2025 Sep 30;49(1):120. doi: 10.1007/s10916-025-02263-3.

Abstract

While deep convolutional neural networks (DCNNs) have achieved remarkable performance in chest X-ray (CXR) interpretation, their success typically depends on access to large-scale, expertly annotated datasets. However, collecting such data in real-world clinical settings can be difficult because of limited labeling resources, privacy concerns, and patient variability. In this study, we applied a multimodal Transformer pretrained on free-text reports and their paired CXRs to evaluate the effectiveness of this method in settings with limited labeled data. Our dataset consisted of more than 1 million CXRs, each accompanied by a report from a board-certified radiologist and 31 structured labels. The results indicated that a linear model trained on embeddings from the pretrained model achieved AUCs of 0.907 and 0.903 on the internal and external test sets, respectively, using only 128 cases and 384 controls; these results were comparable to those of a DenseNet trained on the entire dataset, whose AUCs were 0.908 and 0.903, respectively. Additionally, we demonstrated similar results by extending this approach to a subset annotated with structured echocardiographic reports. Furthermore, the multimodal model exhibited excellent small-sample learning capabilities when tested on external validation sets such as CheXpert and ChestX-ray14. This research significantly reduces the sample size necessary for future artificial intelligence advancements in CXR interpretation.
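
The evaluation described above is a linear probe: a simple linear classifier fit on frozen embeddings from the pretrained encoder, with no fine-tuning of the encoder itself. Below is a minimal sketch of this idea in Python; encode_cxr and the synthetic data are hypothetical stand-ins, since the PASTER encoder and dataset are not reproduced here.

    # Minimal linear-probe sketch: fit a logistic regression on frozen
    # embeddings from a pretrained encoder, using a small labeled subset
    # (128 cases + 384 controls), and score with AUC as in the abstract.
    # `encode_cxr` is a hypothetical stand-in for the pretrained encoder;
    # the data below are synthetic so the example runs on its own.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    _PROJ = rng.standard_normal((1024, 512))  # fixed ("frozen") projection

    def encode_cxr(pixels: np.ndarray) -> np.ndarray:
        # Stand-in for the frozen multimodal encoder: maps each image
        # (here, a flattened 1024-dim vector) to a 512-dim embedding.
        return pixels @ _PROJ

    # Synthetic stand-ins for 128 positive cases and 384 controls.
    X_train = rng.standard_normal((512, 1024))
    y_train = np.concatenate([np.ones(128), np.zeros(384)])
    X_test = rng.standard_normal((200, 1024))
    y_test = rng.integers(0, 2, size=200)

    # The "linear model trained on embeddings": no encoder fine-tuning.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(encode_cxr(X_train), y_train)

    # AUC on the held-out set, the metric reported in the abstract.
    scores = probe.predict_proba(encode_cxr(X_test))[:, 1]
    print(f"AUC: {roc_auc_score(y_test, scores):.3f}")

With real data, the key property is that the encoder is queried only for embeddings, so the trainable parameter count is tiny and a few hundred labeled examples can suffice.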

Keywords: Chest radiograph; Deep learning; Few-shot prediction; Foundation model; Multimodal learning; Small sample training; Transformer.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Institutional Review Board: This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Tri-Service General Hospital (IRB No. C20230519). The IRB approved the study protocol and waived the requirement for individual informed consent due to the use of fully anonymized, retrospective data.

Informed consent: All data were obtained from the hospital's quality control center, fully anonymized prior to analysis, and exempt from informed consent as approved by the Institutional Review Board.

