BMC Med Inform Decis Mak. 2024 Jul 8;24(1):191.
doi: 10.1186/s12911-024-02591-3.

Optimization of vision transformer-based detection of lung diseases from chest X-ray images

Jinsol Ko et al. BMC Med Inform Decis Mak.

Abstract

Background: Recent advances in Vision Transformer (ViT)-based deep learning have significantly improved the accuracy of lung disease prediction from chest X-ray images. However, few studies have compared the effectiveness of different optimizers for lung disease prediction with ViT models. This study systematically evaluates and compares the performance of various optimization methods for ViT-based models in predicting lung diseases from chest X-ray images.

Methods: This study used a chest X-ray dataset of 19,003 images covering normal cases and six lung diseases: COVID-19, Viral Pneumonia, Bacterial Pneumonia, Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and Tuberculosis. Each of three ViT-based models (ViT, FastViT, and CrossViT) was trained separately with each of six optimization methods (Adam, AdamW, NAdam, RAdam, SGDW, and Momentum) to assess its performance in lung disease prediction.
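The model-by-optimizer design described in the Methods can be sketched as a simple experiment grid. This is only an illustration, not the authors' code: the names `MODELS`, `OPTIMIZERS`, `LEARNING_RATES`, and `build_runs` are hypothetical, and the learning rates are taken from the figure captions (10⁻⁴, 10⁻⁵, and 10⁻⁶) under the assumption that every model and optimizer pair was run at each of them.

```python
from itertools import product

# Model and optimizer names as listed in the Methods.
MODELS = ["ViT", "FastViT", "CrossViT"]
OPTIMIZERS = ["Adam", "AdamW", "NAdam", "RAdam", "SGDW", "Momentum"]
# Learning rates as listed in the figure captions (assumed to apply to every pair).
LEARNING_RATES = [1e-4, 1e-5, 1e-6]


def build_runs(models=MODELS, optimizers=OPTIMIZERS, learning_rates=LEARNING_RATES):
    """Return one configuration dict per (model, optimizer, learning rate) combination."""
    return [
        {"model": m, "optimizer": o, "lr": lr}
        for m, o, lr in product(models, optimizers, learning_rates)
    ]


runs = build_runs()
print(len(runs))  # 3 models x 6 optimizers x 3 learning rates = 54
```

Each dictionary in `runs` would then be handed to whatever training routine is in use; keeping the grid explicit makes it easy to confirm that no combination is skipped.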

Results: With ViT on the dataset with balanced class sizes, RAdam achieved the highest accuracy among the optimizers, 95.87%. On the dataset with imbalanced class sizes, FastViT with NAdam performed best, with an accuracy of 97.63%.

Conclusions: We provide comprehensive optimization strategies for developing ViT-based model architectures, which can enhance the performance of these models for lung disease prediction from chest X-ray images.

Keywords: Lung disease; Optimizer; Vision transformer; X-ray.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Schematic overview of the analysis workflow
Fig. 2
Classification of the overall classes with various models and optimizers. A, B The 4-class (A) and 7-class (B) datasets were classified using the ViT model with various optimizers (Adam, AdamW, NAdam, RAdam, SGDW, and Momentum), respectively. C, D The 7-class dataset was classified using the FastViT (C) or CrossViT (D) models with various optimizers, respectively. The evaluation metrics included accuracy, F1-score, precision, and recall, calculated at learning rates of 10⁻⁴, 10⁻⁵, and 10⁻⁶.
Fig. 3
Classification of each disease class with various models and optimizers. A, B Each class in the 4-class (A) and 7-class (B) datasets was classified using the ViT model with various optimizers (Adam, AdamW, NAdam, RAdam, SGDW, and Momentum), respectively. C, D Each class in the 7-class dataset was classified using the FastViT (C) or CrossViT (D) models with various optimizers, respectively. The evaluation metrics included accuracy, F1-score, precision, and recall, calculated at learning rates of 10⁻⁴, 10⁻⁵, and 10⁻⁶.
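The four metrics named in the captions can be illustrated with a small, dependency-free sketch. It computes overall accuracy plus per-class precision, recall, and F1-score from paired label lists. The function names and the toy labels are hypothetical, and the record does not say how the authors aggregated per-class values, so no averaging scheme is assumed here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but wrong
            fn[t] += 1   # class t was missed
    results = {}
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[c] = (precision, recall, f1)
    return results


# Toy labels for illustration only; not data from the study.
y_true = ["Normal", "Normal", "COVID-19", "Tuberculosis"]
y_pred = ["Normal", "COVID-19", "COVID-19", "Tuberculosis"]
metrics = per_class_metrics(y_true, y_pred, ["Normal", "COVID-19", "Tuberculosis"])
print(accuracy(y_true, y_pred))
print(metrics["Normal"])
```

In the toy example, one Normal image is misread as COVID-19, which lowers recall for Normal and precision for COVID-19 while leaving Tuberculosis unaffected. Per-class views such as Fig. 3 expose this kind of asymmetry.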
