Biomed Opt Express. 2023 Oct 23;14(11):5904-5920.
doi: 10.1364/BOE.499640. eCollection 2023 Nov 1.

TCI-UNet: transformer-CNN interactive module for medical image segmentation

Xuan Bian et al. Biomed Opt Express.

Abstract

Medical image segmentation is a crucial step in developing medical systems, especially those that assist doctors in diagnosing and treating diseases. UNet has become the preferred network for most medical image segmentation tasks and has achieved tremendous success. However, because convolution operates on local neighborhoods, UNet's ability to model long-range dependencies between features is limited. Following the success of transformers in computer vision (CV), many strong models that combine transformers with UNet have emerged, but most of them have fixed receptive fields and rely on a single feature extraction method. To address this issue, we propose a transformer-CNN interactive (TCI) feature extraction module and use it to construct TCI-UNet. Specifically, we improve the self-attention mechanism in transformers so that attention maps better guide the allocation of computational resources, which strengthens the network's ability to capture global contextual information from feature maps. Additionally, we introduce local multi-scale information to supplement the features, allowing the network to focus on important local information while modeling global context. This improves the network's capability to extract information from feature maps and facilitates effective interaction between global and local information within the transformer, enhancing its representational power. We conducted extensive experiments on the LiTS-2017 and ISIC-2018 datasets to verify the effectiveness of the proposed method, obtaining Dice scores of 93.81% and 88.22%, respectively. Ablation experiments confirmed the effectiveness of the TCI module, and comparisons with other state-of-the-art (SOTA) networks demonstrated the superiority of TCI-UNet in accuracy and generalization.
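The Dice score reported above, and the IoU metric named in the Fig. 11 caption, are standard overlap measures for segmentation masks. The following is a minimal sketch of how they are commonly computed for binary masks, not the authors' evaluation code. The function name `dice_and_iou`, the smoothing term `eps`, and the toy masks are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (Dice, IoU) for two binary masks of the same shape.

    Hypothetical helper for illustration; `eps` is an assumed smoothing
    term that avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    # Number of pixels labeled foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    pred_sum = pred.sum()
    target_sum = target.sum()
    # Number of pixels labeled foreground in at least one mask.
    union = pred_sum + target_sum - intersection

    dice = (2.0 * intersection + eps) / (pred_sum + target_sum + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


# Toy masks (assumed data): 3 predicted pixels, 3 true pixels, 2 shared.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For these toy masks the overlap is 2 pixels out of 3 predicted and 3 true, so Dice is about 2/3 and IoU is 1/2. Dice is never lower than IoU for the same pair of masks, which is worth remembering when comparing results reported with different metrics.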

Conflict of interest statement

The authors declare there is no conflict of interest.

Figures

Fig. 1. Structure of the Transformer and the TCI module. (a) Structure of the Transformer; (b) structure of the TCI module proposed in this paper. The TCI module redesigns and improves the MSA position, realizing interaction between local and global information inside the Transformer.
Fig. 2. Overall framework diagram. (a) TCI-UNet structure; (b) TCI module schematic diagram.
Fig. 3. Overall structure of the TCI module.
Fig. 4. Window Self-Attention Enhance block.
Fig. 5. MGC block structure diagram.
Fig. 6. Comparison chart of network parameter size and computation time.
Fig. 7. Comparison of network heat maps after gradually adding different blocks.
Fig. 8. Comparison chart of network parameter size and computation time.
Fig. 9. Comparison of segmentation results on the ISIC-2018 dataset.
Fig. 10. Comparison of segmentation results on the MICCAI 2017 LiTS dataset.
Fig. 11. Boxplots of the Dice and IoU metrics on the LiTS and ISIC datasets.
