Sci Rep. 2024 Jul 1;14(1):15013.
doi: 10.1038/s41598-024-64982-w.

Multi-branch CNN and grouping cascade attention for medical image classification

Shiwei Liu et al. Sci Rep.

Abstract

Vision Transformers (ViTs) have achieved remarkable results in medical image analysis. However, ViT-based methods classify poorly on some small-scale medical image datasets. Moreover, many ViT-based models achieve superior performance only at a high computational cost, which is a major obstacle in practical clinical applications. In this paper, we propose Eff-CTNet, an efficient medical image classification network that alternates CNN and Transformer blocks in tandem. Existing ViT-based methods still rely mainly on multi-head self-attention (MHSA), and the attention maps of different MHSA heads are highly similar, which leads to computational redundancy. We therefore propose a group cascade attention (GCA) module that splits the feature maps and feeds the splits to different attention heads, which further improves the diversity of attention and reduces the computational cost. In addition, we propose an efficient CNN (EC) module that strengthens the model's ability to extract local detail information from medical images. Finally, we connect these modules to form an efficient hybrid medical image classification network, namely Eff-CTNet. Extensive experiments on three public medical image classification datasets show that Eff-CTNet achieves advanced classification performance at a lower computational cost.
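
The splitting idea behind the GCA module can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the projections or how the heads are cascaded. The sketch assumes that each head uses its channel group directly as query, key, and value (no learned weights), and that "cascade" means adding the previous head's output to the next head's input. The function names `self_attention` and `group_cascade_attention` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Assumption: no learned projections, to keep the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(dim)])
    return out


def group_cascade_attention(tokens, num_heads):
    """Split channels into groups, one group per head, and cascade the heads.

    tokens: list of N tokens, each a list of C floats (C divisible by num_heads).
    Returns N tokens with C channels (head outputs concatenated).
    """
    channels = len(tokens[0])
    if channels % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    width = channels // num_heads

    head_outputs = []
    previous = None
    for h in range(num_heads):
        # Each head sees only its own slice of the feature channels.
        chunk = [tok[h * width:(h + 1) * width] for tok in tokens]
        if previous is not None:
            # Cascade (assumption): refine this head's input with the
            # previous head's output.
            chunk = [[c + p for c, p in zip(ct, pt)]
                     for ct, pt in zip(chunk, previous)]
        previous = self_attention(chunk)
        head_outputs.append(previous)

    # Concatenate the head outputs along the channel axis.
    return [[x for h in range(num_heads) for x in head_outputs[h][i]]
            for i in range(len(tokens))]
```

Calling `group_cascade_attention(tokens, num_heads=2)` on a list of four-channel tokens gives each head two channels. Because every head attends over a different slice, and later heads also receive the output of earlier ones, the heads are pushed toward producing different attention maps, which is the diversity effect the abstract describes.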

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Accuracy versus parameter-count trade-offs of Eff-CTNet and comparison methods on three datasets.
Figure 2
Overview of Eff-CTNet. Eff-CTNet consists of EC and ET blocks.
Figure 3
Example of the EC block structure. (a) EC block with downsampling; (b) EC block without downsampling.
Figure 4
Structure of the ET module in Eff-CTNet.
Figure 5
Structure of the GCA module in the ET module.
Figure 6
Grad-CAM visualization results for different comparison models on the BUSI, COVID19-CT, and Chaoyang datasets.
Figure 7
Training curves of Eff-CTNet on the BUSI, COVID19-CT, and Chaoyang datasets.
Figure 8
ROC curves for different comparison methods on the BUSI, COVID19-CT, and Chaoyang datasets.
Figure 9
Confusion matrices of Eff-CTNet on the BUSI, COVID19-CT, and Chaoyang datasets.
