Review
. 2023 Dec 1;13(12):8747-8767.
doi: 10.21037/qims-23-542. Epub 2023 Oct 7.

Transformers in medical image segmentation: a narrative review


Rabeea Fatma Khan et al. Quant Imaging Med Surg. 2023.

Abstract

Background and objective: Transformers, widely recognized as state-of-the-art tools in natural language processing (NLP), have also proven valuable in computer vision tasks. With this growing popularity, they have been extensively researched in the more complex medical imaging domain. These developments have brought transformers on par with the long-dominant convolutional neural networks (CNNs), particularly for medical image segmentation. Methods combining both types of networks have been especially successful at capturing local and global contexts, significantly boosting performance across a range of segmentation problems. Motivated by this success, we survey the most consequential research on innovative transformer networks, specifically those designed for efficient medical image segmentation.

Methods: Databases including Google Scholar, arXiv, ResearchGate, Microsoft Academic, and Semantic Scholar were used to find recent developments in this field. Specifically, English-language research published from 2021 to 2023 was considered.

Key content and findings: In this survey, we examine the different types of architectures and attention mechanisms that distinctly improve performance, as well as the structures in place to handle complex medical data. We summarize both popular and unconventional transformer-based research from several key angles and quantitatively analyze the strategies that have proven most effective.

Conclusions: We have also attempted to discern existing gaps and challenges within current research, notably highlighting the deficiency of annotated medical data for precise deep learning model training. Furthermore, potential future directions for enhancing transformers' utility in healthcare are outlined, encompassing strategies such as transfer learning and exploiting foundation models for specialized medical image segmentation.

Keywords: Transformers; artificial intelligence (AI); deep learning; image segmentation; medical imaging.


Conflict of interest statement

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-23-542/coif). The authors have no conflicts of interest to declare.

Figures

Figure 1. Structure of a transformer. MHSA, multi-head self-attention.
Figure 2. Attention mechanism of a transformer. (A) Scaled dot product. (B) MHSA layer of a transformer. MHSA, multi-head self-attention.
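The scaled dot-product operation in Figure 2A can be sketched in a few lines. This is an illustrative NumPy implementation of the standard formulation, softmax(QK^T / sqrt(d_k))V, not code from any of the surveyed works; the function name and shapes are our own choices:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q: (batch, n_queries, d_k), K: (batch, n_keys, d_k), V: (batch, n_keys, d_v).
    Returns: (batch, n_queries, d_v).
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled to stabilize gradients
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)   # (batch, n_q, n_k)
    # Numerically stable softmax over the key dimension
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output token is a weighted average of the value vectors
    return weights @ V
```

The MHSA layer in Figure 2B runs several such attention operations in parallel on learned projections of Q, K, and V, then concatenates and projects the results.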
Figure 3. Main categories for transformer-network classification. CNN, convolutional neural network.
Figure 4. Popular feature sub-space reduction techniques for 3D and 2D medical data. (A) Patch partitioning. (B) Convolution layers to spatially reduce the feature dimensions. (C) Convolution layers and patch partition to significantly minimize the feature subspace. 3D, three-dimensional; 2D, two-dimensional.
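The patch partitioning of Figure 4A can be illustrated with a short sketch: a 2D image is split into non-overlapping patches, each flattened into one token vector for the transformer. The helper name `patch_partition` and the non-overlapping square-patch assumption are ours, for illustration only:

```python
import numpy as np

def patch_partition(img, patch):
    """Split a 2D image of shape (H, W, C) into non-overlapping patch tokens.

    Returns an array of shape (H//patch * W//patch, patch*patch*C):
    one flattened vector per patch, as commonly fed to a transformer encoder.
    """
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0, "image must tile evenly"
    # Break each spatial axis into (num_patches, patch_size)
    x = img.reshape(H // patch, patch, W // patch, patch, C)
    # Group the two patch-grid axes together: (H/p, W/p, p, p, C)
    x = x.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into a single token vector
    return x.reshape(-1, patch * patch * C)
```

Variants (B) and (C) in the figure achieve a similar token-count reduction with strided convolution layers instead of, or in addition to, this reshaping.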
Figure 5. Popular hierarchical encoder-decoder techniques involving transformers. (A) Transformer encoder-decoder. (B) Sequential network. (C) CNN encoder-decoder with transformer in bottleneck. (D) Interleaved CNN and transformer blocks within the encoder and decoder. (E) Transformer encoder with CNN decoder. (F) Parallel branches of CNN encoder and transformer encoder followed by a fusion module before the CNN decoder. CNN, convolutional neural network.
