PeerJ. 2024 Feb 29:12:e17005.

doi: 10.7717/peerj.17005. eCollection 2024.

Enhancing medical image segmentation with a multi-transformer U-Net

Yongping Dan¹, Weishou Jin¹, Xuebin Yue², Zhida Wang¹

Affiliations

¹ School of Electronic and Information, Zhongyuan University Of Technology, Zhengzhou, Henan, China.
² Research Organization of Science and Technology, Ritsumeikan University, Kusatsu, Japan.

PMID: 38435997
PMCID: PMC10909362
DOI: 10.7717/peerj.17005

Yongping Dan et al. PeerJ. 2024.

Abstract

Segmentation networks based on the Swin Transformer have shown promise in medical image segmentation, but they still suffer from limited accuracy and slow training convergence. To address these issues, we propose a model that combines the Swin Transformer with the Deformable Transformer. The Swin Transformer's window attention mechanism captures local feature information, while the Deformable Transformer adjusts sampling positions dynamically, which accelerates convergence and lets the model adapt more closely to object shapes and sizes. By combining both Transformer modules and adding extra skip connections to reduce information loss, the proposed model segments CT or X-ray lung images rapidly and accurately. In our experiments, it outperforms Swin Unet, which relies on the Swin Transformer alone, and converges faster under identical conditions. Accuracy improves by 0.7% (to 88.18%) on the COVID-19 CT scan lesion segmentation dataset and by 2.7% (to 98.01%) on the Chest X-ray Masks and Labels dataset. This advance has the potential to aid medical practitioners in early diagnosis and treatment decision-making.

Keywords: CT or X-ray lung images; Medical image segmentation; Multi-transformer; Unet.

Conflict of interest statement

The authors declare there are no competing interests.

Figures

**Figure 1. Overall structure of the model.**
The Swin Transformer and Deformable Transformer serve as the backbone network. Patch merging and patch expanding technologies are employed in the Encoder and Decoder, respectively, to modify the size of feature maps. Furthermore, the model incorporates additional skip connections to enhance multi-scale information fusion, ensuring the retention of crucial information.
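The resizing role of patch merging and patch expanding can be illustrated with a minimal sketch. It assumes the usual 2x2 rearrangement described for Swin-style networks and omits the learned linear projections that normally follow or precede it. The function names `patch_merge` and `patch_expand` and the channel ordering are illustrative choices, not taken from the paper's code.

```python
def patch_merge(fmap):
    """Group each 2x2 neighbourhood into one position, concatenating channels.

    fmap is a nested list shaped [H][W][C]; H and W must be even.
    Returns a nested list shaped [H/2][W/2][4*C].
    """
    h, w = len(fmap), len(fmap[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    merged = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            # Assumed order: top-left, bottom-left, top-right, bottom-right.
            row.append(fmap[i][j] + fmap[i + 1][j]
                       + fmap[i][j + 1] + fmap[i + 1][j + 1])
        merged.append(row)
    return merged


def patch_expand(fmap):
    """Inverse rearrangement: [H][W][4*C] -> [2H][2W][C]."""
    h, w = len(fmap), len(fmap[0])
    c4 = len(fmap[0][0])
    if c4 % 4:
        raise ValueError("channel count must be divisible by 4")
    c = c4 // 4
    out = [[None] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            v = fmap[i][j]
            out[2 * i][2 * j] = v[0:c]
            out[2 * i + 1][2 * j] = v[c:2 * c]
            out[2 * i][2 * j + 1] = v[2 * c:3 * c]
            out[2 * i + 1][2 * j + 1] = v[3 * c:4 * c]
    return out


# A 4x4 map with one channel per position.
fmap = [[[i * 4 + j] for j in range(4)] for i in range(4)]
merged = patch_merge(fmap)       # shape [2][2][4]
restored = patch_expand(merged)  # shape [4][4][1]
```

Merging halves the spatial resolution while widening the channel dimension, and expanding reverses that trade. In a full encoder or decoder a linear layer would then adjust the channel count.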

**Figure 2. Swin Transformer block.**
The shifted-window attention blocks are composed of window multi-head self-attention (W-MSA) and shifted-window multi-head self-attention (SW-MSA) modules.
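As a rough illustration of how the two module types see different groupings of positions, the sketch below partitions a small grid into windows, then applies a cyclic shift before partitioning again. It assumes a window size of 2 and a shift of 1 on a 4x4 grid. The helper names are hypothetical, and the attention computation and the masking of wrapped-around positions are left out.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D grid up and left by `shift` positions, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(i + shift) % h][(j + shift) % w] for j in range(w)]
            for i in range(h)]


def window_partition(grid, m):
    """Split an H x W grid into non-overlapping m x m windows (row-major)."""
    h, w = len(grid), len(grid[0])
    if h % m or w % m:
        raise ValueError("grid size must be a multiple of the window size")
    windows = []
    for top in range(0, h, m):
        for left in range(0, w, m):
            windows.append([grid[i][left:left + m]
                            for i in range(top, top + m)])
    return windows


# Positions are labelled 0..15 so the grouping is easy to read.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
regular = window_partition(grid, 2)                   # W-MSA style grouping
shifted = window_partition(cyclic_shift(grid, 1), 2)  # SW-MSA style grouping
```

Here `regular[0]` holds positions 0, 1, 4, 5, while `shifted[0]` holds 5, 6, 9, 10, one from each of the four regular windows. Alternating the two groupings is what lets information cross window borders.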

**Figure 3. Deformable Attention block.**
(A) The block follows a standard attention network architecture. (B) Deformable Attention introduces relative position offsets through an offset network to enhance the multi-head attention output. (C) Detailed structure of the offset network.
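The idea of shifting sampling positions by an offset can be sketched as follows. This is a simplified illustration, not the paper's implementation: the offsets are hard-coded hypothetical values rather than outputs of an offset network, the feature map has a single channel, and `bilinear_sample` and `deformable_sample` are names introduced here.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2-D map at fractional (y, x), clamping to the border."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def deformable_sample(fmap, ref_points, offsets):
    """Sample the map at each reference point moved by its offset."""
    return [bilinear_sample(fmap, ry + oy, rx + ox)
            for (ry, rx), (oy, ox) in zip(ref_points, offsets)]


fmap = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0],
        [6.0, 7.0, 8.0]]
ref_points = [(0, 0), (1, 1)]        # regular grid positions
offsets = [(0.5, 0.5), (0.0, 0.5)]   # hypothetical learned offsets
values = deformable_sample(fmap, ref_points, offsets)
```

Because offsets are fractional, the sampled value is interpolated from the four nearest grid positions. In a deformable attention layer the sampled features would then serve as keys and values.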

**Figure 4. Comparison of model segmentation accuracy.**
The red section represents the prediction accuracy of our model, while the blue section represents the prediction accuracy of SwinUnet.

**Figure 5. Automatic segmentation result.**
The lung image is segmented automatically through the network.

Cited by

IDCC-SAM: A Zero-Shot Approach for Cell Counting in Immunocytochemistry Dataset Using the Segment Anything Model.
Fanijo S, Jannesari A, Dickerson J. Bioengineering (Basel). 2025 Feb 14;12(2):184. doi: 10.3390/bioengineering12020184. PMID: 40001703. Free PMC article.

Joint segmentation of sternocleidomastoid and skeletal muscles in computed tomography images using a multiclass learning approach.
Ashino K, Kamiya N, Zhou X, Kato H, Hara T, Fujita H. Radiol Phys Technol. 2024 Dec;17(4):854-861. doi: 10.1007/s12194-024-00839-1. Epub 2024 Sep 6. PMID: 39242477. Free PMC article.

Flood change detection model based on an improved U-net network and multi-head attention mechanism.
Wang F, Feng X. Sci Rep. 2025 Jan 26;15(1):3295. doi: 10.1038/s41598-025-87851-6. PMID: 39865097. Free PMC article.
