J Imaging Inform Med. 2024 Dec;37(6):3174-3192.
doi: 10.1007/s10278-024-01140-8. Epub 2024 Jun 5.

Enhancing Skin Cancer Diagnosis Using Swin Transformer with Hybrid Shifted Window-Based Multi-head Self-attention and SwiGLU-Based MLP

Ishak Pacal et al. J Imaging Inform Med. 2024 Dec.

Abstract

Skin cancer is one of the most frequently occurring cancers worldwide, and early detection is crucial for effective treatment. Dermatologists often face challenges such as heavy data demands, potential human errors, and strict time limits, which can negatively affect diagnostic outcomes. Deep learning-based diagnostic systems offer quick, accurate testing and enhanced research capabilities, providing significant support to dermatologists. In this study, we enhanced the Swin Transformer architecture by implementing hybrid shifted window-based multi-head self-attention (HSW-MSA) in place of the conventional shifted window-based multi-head self-attention (SW-MSA). This adjustment enables the model to process areas of skin cancer overlap more efficiently, capture finer details, and manage long-range dependencies, while preserving memory and computational efficiency during training. Additionally, the study replaces the standard multi-layer perceptron (MLP) in the Swin Transformer with a SwiGLU-based MLP, an upgraded version of the gated linear unit (GLU) module, to achieve higher accuracy, faster training, and better parameter efficiency. The modified Swin-Base model was evaluated on the publicly accessible ISIC 2019 skin dataset, which has eight classes, and was compared against popular convolutional neural networks (CNNs) and cutting-edge vision transformer (ViT) models. In an exhaustive assessment on the unseen test dataset, the proposed Swin-Base model achieved an accuracy of 89.36%, a recall of 85.13%, a precision of 88.22%, and an F1-score of 86.65%, surpassing all previously reported studies and deep learning models documented in the literature.
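
To make the SwiGLU-based MLP mentioned in the abstract more concrete, the following is a minimal sketch in plain Python of one common SwiGLU formulation, out = W_out(SiLU(W_gate x) * (W_value x)), applied to a single token vector. The abstract does not give the authors' layer sizes, bias usage, or initialization, so the names `swiglu_mlp`, `init_params`, `w_gate`, `w_value`, `w_out` and the dimensions below are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def silu(z):
    """SiLU (Swish) activation z * sigmoid(z), written to avoid overflow."""
    if z >= 0:
        return z / (1.0 + math.exp(-z))
    e = math.exp(z)
    return z * e / (1.0 + e)


def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def init_params(dim, hidden_dim, seed=0):
    """Hypothetical small random weights with zero biases (assumption)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "w_gate": mat(hidden_dim, dim), "b_gate": [0.0] * hidden_dim,
        "w_value": mat(hidden_dim, dim), "b_value": [0.0] * hidden_dim,
        "w_out": mat(dim, hidden_dim), "b_out": [0.0] * dim,
    }


def swiglu_mlp(x, params):
    """Gate one projection of x with SiLU of another, then project back."""
    gate = [silu(g) for g in linear(x, params["w_gate"], params["b_gate"])]
    value = linear(x, params["w_value"], params["b_value"])
    hidden = [g * v for g, v in zip(gate, value)]  # element-wise gating
    return linear(hidden, params["w_out"], params["b_out"])


params = init_params(dim=4, hidden_dim=8)
token = [0.5, -1.0, 2.0, 0.0]
output = swiglu_mlp(token, params)
print(len(output))  # the output keeps the input dimension
```

Compared with a standard two-layer MLP, this variant uses two input projections instead of one, so implementations often shrink the hidden width to keep the parameter count comparable; whether the authors did so is not stated in the abstract.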

Keywords: Medical image analysis; Skin cancer detection; SwiGLU-based MLP; Swin Transformer; Vision transformer.

Conflict of interest statement

Declarations. Ethics Approval: No ethics approval was required for this work as it did not involve human subjects, animals, or sensitive data that would necessitate ethical review. Consent to Participate: No formal consent to participate was required for this work as it did not involve interactions with human subjects or the collection of sensitive personal information. Consent for Publication: This study did not use individual person’s data. Competing Interests: The authors declare no competing interests.

Figures

Fig. 1. Sample images from ISIC 2019 skin lesion dataset
Fig. 2. Proposed method architecture
Fig. 3. Proposed SwiGLU-based MLP module and default MLP module in Swin transformers
Fig. 4. Hybrid transformer blocks
Fig. 5. Confusion matrix of the proposed model
Fig. 6. Experimental results of the proposed model alongside the top 10 deep learning models with the highest accuracy
Fig. 7. Experimental results of the proposed model alongside the top 10 deep learning models with the highest F1-score
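
As a rough illustration of how accuracy, precision, recall, and F1-score of the kind reported in the abstract can be derived from a confusion matrix such as the one in Fig. 5, here is a small Python sketch. The 3-class matrix is made-up toy data, macro averaging is an assumption (the abstract does not say which averaging was used), and the function names are hypothetical. The last lines check that the harmonic mean of the reported precision (88.22%) and recall (85.13%) is consistent with the reported F1-score.

```python
def harmonic_f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls = [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    precision = sum(precisions) / n
    recall = sum(recalls) / n
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": harmonic_f1(precision, recall),
    }


# Toy 3-class confusion matrix (hypothetical values, not from the paper).
toy = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
print(macro_metrics(toy)["accuracy"])

# Consistency check on the figures reported in the abstract.
print(round(harmonic_f1(88.22, 85.13), 2))  # 86.65
```

Note that macro F1 is sometimes computed as the mean of per-class F1 values instead, which generally gives a slightly different number than the harmonic mean of the averaged precision and recall used here.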
