Heliyon. 2024 Feb 28;10(5):e26775.

doi: 10.1016/j.heliyon.2024.e26775. eCollection 2024 Mar 15.

SCANeXt: Enhancing 3D medical image segmentation with dual attention network and depth-wise convolution

Yajun Liu¹, Zenghui Zhang¹, Jiang Yue², Weiwei Guo³

Affiliations

¹ Shanghai Key Laboratory of Intelligent Sensing and Recognition, Shanghai Jiao Tong University, China.
² Department of Endocrinology and Metabolism, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, China.
³ Center for Digital Innovation, Tongji University, China.

PMID: 38439873
PMCID: PMC10909707
DOI: 10.1016/j.heliyon.2024.e26775

Yajun Liu et al. Heliyon. 2024.

Abstract

Existing approaches to 3D medical image segmentation can be generally categorized into convolution-based or transformer-based methods. While convolutional neural networks (CNNs) demonstrate proficiency in extracting local features, they encounter challenges in capturing global representations. In contrast, the consecutive self-attention modules present in vision transformers excel at capturing long-range dependencies and achieving an expanded receptive field. In this paper, we propose a novel approach, termed SCANeXt, for 3D medical image segmentation. Our method combines the strengths of dual attention (Spatial and Channel Attention) and ConvNeXt to enhance representation learning for 3D medical images. In particular, we propose a novel self-attention mechanism crafted to encompass spatial and channel relationships throughout the entire feature dimension. To further extract multiscale features, we introduce a depth-wise convolution block inspired by ConvNeXt after the dual attention block. Extensive evaluations on three benchmark datasets, namely Synapse, BraTS, and ACDC, demonstrate the effectiveness of our proposed method in terms of accuracy. Our SCANeXt model achieves a state-of-the-art result with a Dice Similarity Score of 95.18% on the ACDC dataset, significantly outperforming current methods.
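To make the distinction between the two attention branches concrete, here is a minimal sketch, not the authors' implementation. It assumes the 3D feature volume has been flattened to N tokens with C channels each, uses identity projections (Q = K = V = x) instead of learned weights, and fuses the branches by a residual sum. All function names are hypothetical and the fusion rule is an illustrative assumption; the abstract does not specify these details.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def spatial_attention(x):
    """Token-to-token attention: an N x N map mixes information across positions."""
    scale = math.sqrt(len(x[0]))
    scores = matmul(x, transpose(x))                      # N x N
    weights = softmax_rows([[s / scale for s in row] for row in scores])
    return matmul(weights, x)                             # N x C

def channel_attention(x):
    """Channel-to-channel attention: a C x C map mixes information across channels."""
    scale = math.sqrt(len(x))
    xt = transpose(x)                                     # C x N
    scores = matmul(xt, x)                                # C x C
    weights = softmax_rows([[s / scale for s in row] for row in scores])
    return transpose(matmul(weights, xt))                 # back to N x C

def dual_attention(x):
    """Residual sum of both branches (assumed fusion rule, for illustration only)."""
    s = spatial_attention(x)
    c = channel_attention(x)
    return [[xi + si + ci for xi, si, ci in zip(rx, rs, rc)]
            for rx, rs, rc in zip(x, s, c)]

# Toy input: 4 flattened voxels (tokens), 3 feature channels each.
features = [
    [0.1, 0.2, 0.3],
    [0.0, 0.5, 0.1],
    [0.4, 0.1, 0.2],
    [0.3, 0.3, 0.0],
]
fused = dual_attention(features)
print(len(fused), len(fused[0]))  # 4 3
```

The spatial branch builds an attention map whose size grows with the number of tokens, while the channel branch builds one whose size depends only on the channel count. This is one reason the two are often treated as complementary.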

Keywords: 3D medical image segmentation; Depth-wise convolution; Dual attention; InceptionNeXt; Swin transformer.
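The Dice Similarity Score reported in the abstract can be illustrated with a small worked example. This sketch assumes the standard definition, Dice = 2|P ∩ T| / (|P| + |T|), computed per label over flattened voxel labels. The function name, the toy label arrays, and the empty-mask convention are assumptions made for illustration; they are not taken from the paper.

```python
def dice_score(pred, target, label):
    """Dice overlap for one label, given flat lists of integer voxel labels."""
    p = [v == label for v in pred]
    t = [v == label for v in target]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    denom = sum(p) + sum(t)
    if denom == 0:
        # Assumed convention: a label absent from both masks counts as perfect agreement.
        return 1.0
    return 2.0 * intersection / denom

# Toy flattened segmentation with background (0) and two foreground labels (1, 2).
pred   = [0, 1, 1, 2, 2, 2, 0, 1]
target = [0, 1, 1, 2, 2, 0, 0, 2]

per_label = [dice_score(pred, target, label) for label in (1, 2)]
mean_dice = sum(per_label) / len(per_label)
print(f"{per_label[0]:.2f}")  # 0.80
```

Benchmark results are usually reported as the mean over foreground labels and over cases, expressed as a percentage.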

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

**Figure 1**
Overview of our SCANeXt structure.

**Figure 2**
Components of the spatial-wise transformer.

**Figure 3**
Components of the channel-wise transformer.

**Figure 4**
Components of the depthwise convolution module.

**Figure 5**
Qualitative comparison of the segmentation performance for the Synapse dataset.

**Figure 6**
Qualitative comparison of the segmentation performance for the BraTS dataset.

**Figure 7**
Qualitative comparison of the segmentation performance for the ACDC dataset.

**Figure 8**
The model size vs. DSC is shown in this plot. Circle size indicates computational complexity by FLOPs.

Cited by

LKDA-Net: Hierarchical transformer with large Kernel depthwise convolution attention for 3D medical image segmentation.
Li M, Ma J, Zhao J. PLoS One. 2025 Aug 8;20(8):e0329806. doi: 10.1371/journal.pone.0329806. eCollection 2025. PMID: 40779579.
A novel LVPA-UNet network for target volume automatic delineation: An MRI case study of nasopharyngeal carcinoma.
Zhang Y, Xu HR, Wen JH, Hu YJ, Diao YL, Chen JL, Xia YF. Heliyon. 2024 May 4;10(10):e30763. doi: 10.1016/j.heliyon.2024.e30763. eCollection 2024 May 30. PMID: 38770315.
