SCANeXt: Enhancing 3D medical image segmentation with dual attention network and depth-wise convolution
- PMID: 38439873
- PMCID: PMC10909707
- DOI: 10.1016/j.heliyon.2024.e26775
Abstract
Existing approaches to 3D medical image segmentation can be generally categorized into convolution-based or transformer-based methods. While convolutional neural networks (CNNs) demonstrate proficiency in extracting local features, they encounter challenges in capturing global representations. In contrast, the consecutive self-attention modules present in vision transformers excel at capturing long-range dependencies and achieving an expanded receptive field. In this paper, we propose a novel approach, termed SCANeXt, for 3D medical image segmentation. Our method combines the strengths of dual attention (Spatial and Channel Attention) and ConvNeXt to enhance representation learning for 3D medical images. In particular, we propose a novel self-attention mechanism crafted to encompass spatial and channel relationships throughout the entire feature dimension. To further extract multiscale features, we introduce a depth-wise convolution block inspired by ConvNeXt after the dual attention block. Extensive evaluations on three benchmark datasets, namely Synapse, BraTS, and ACDC, demonstrate the effectiveness of our proposed method in terms of accuracy. Our SCANeXt model achieves a state-of-the-art result with a Dice Similarity Score of 95.18% on the ACDC dataset, significantly outperforming current methods.
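The Dice Similarity Score quoted above measures the overlap between a predicted segmentation mask and the ground-truth mask, defined as 2|P ∩ T| / (|P| + |T|). The following is a minimal sketch of that standard definition for a single class label, not the authors' evaluation code. The function name `dice_score`, the toy labels, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int = 1) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for one class label over flattened voxels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")
    # Binary masks: True where the voxel carries the class of interest.
    p = [v == label for v in pred]
    t = [v == label for v in target]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


# Toy flattened 2x2x2 volume (hypothetical labels, not data from the paper).
pred = [0, 1, 1, 1, 0, 0, 1, 0]
target = [0, 1, 1, 0, 0, 1, 1, 0]
score = dice_score(pred, target)
print(f"Dice: {score:.2%}")  # 3 shared voxels, 4 + 4 labeled voxels -> 75.00%
```

For multi-class benchmarks such as those named in the abstract, a common practice is to compute this score per class and average the results; whether the paper does exactly that is not stated here.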
Keywords: 3D medical image segmentation; Depth-wise convolution; Dual attention; InceptionNeXt; Swin transformer.
© 2024 The Author(s).
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.