BMC Med Imaging. 2025 Jul 15;25(1):283.

doi: 10.1186/s12880-025-01826-7.

Colorectal cancer unmasked: A synergistic AI framework for Hyper-granular image dissection, precision segmentation, and automated diagnosis

Affiliations

¹ Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamilnadu, 603203, India. akella.raju@gmail.com.
² Department of Networking and Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamilnadu, 603203, India.
³ Department of Information Technology (IT), Aditya University, Surampalem, Andhra Pradesh, 533437, India.
⁴ Department of Computer Science and Engineering (Data Science), Institute of Aeronautical Engineering, Dundigul, Hyderabad, Telangana, 500043, India. ranjith.gatla@gmail.com.
⁵ Department of Computer Science and Engineering (AI & ML), Institute of Aeronautical Engineering, Hyderabad, Telangana, 500043, India.
⁶ Department of IT, Pragati Engineering College (A), Surampalem, Andhra Pradesh, India.
⁷ Department of CS, Pragati Engineering College (A), Surampalem, Andhra Pradesh, India.
⁸ Department of Mechanical Engineering, Graphic Era (Deemed to be University), Dehradun, 248002, India.
⁹ Department of Mechanical Engineering, Rayat Bahra Institute of Engineering and Nano Technology, Hoshiarpur, Punjab, India.
¹⁰ Al-Kitab University, Kirkuk, 36015, Iraq.
¹¹ Department of Buildings and Construction Techniques Engineering, College of Engineering, Al-Mustaqbal University, Hillah, Babylon, 51001, Iraq.
¹² Al-Safwa University College, Kerbala, Iraq.
¹³ Institute of Technology, Dire-Dawa University, Dire Dawa, 1487, Ethiopia. wkhan9450@gmail.com.

PMID: 40665235
PMCID: PMC12265324
DOI: 10.1186/s12880-025-01826-7

Akella S Narasimha Raju et al. BMC Med Imaging. 2025.

Abstract

Colorectal cancer (CRC) is the second most common cause of cancer-related mortality worldwide, underscoring the need for computer-aided diagnosis (CADx) systems that are interpretable, accurate, and robust. This study presents a practical CADx system that combines Vision Transformers (ViTs) and DeepLabV3+ to identify and segment colorectal lesions in colonoscopy images. The system addresses class imbalance and real-world complexity through PCA-based dimensionality reduction, data augmentation, and strategic preprocessing, using the recently curated CKHK-22 dataset, which comprises more than 14,000 annotated images drawn from CVC-ClinicDB, Kvasir-2, and Hyper-Kvasir. Classification performance was compared across ViT, ResNet-50, DenseNet-201, and VGG-16. ViT achieved the best results on the test data, with an accuracy of 97%, an F1-score of 0.95, and an AUC of 92%. DeepLabV3+ achieved state-of-the-art segmentation for localisation tasks, with a Dice coefficient of 0.88 and an Intersection over Union (IoU) of 0.71, ensuring sharp delineation of malignant areas. The CADx system supports real-time inference and is served through Google Cloud to enable scalable clinical implementation. Image-level segmentation quality is demonstrated by comparing visual overlays against masks manually delineated by experts, and classification performance is quantified by precision, recall, F1-score, and AUC. The hybrid strategy not only outperforms traditional CNN approaches but also addresses important clinical needs such as early detection, handling of highly imbalanced classes, and clear explanation. By using self-attention and multi-scale context learning, the proposed ViT-DeepLabV3+ system establishes a basis for advanced AI support in colorectal diagnosis.
The system offers a high-capacity, reproducible solution for computerised colorectal cancer screening and monitoring, is well suited to resource-limited settings, and is a strong candidate for clinical deployment.
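
The two segmentation metrics quoted in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code; the function name `dice_and_iou`, the toy masks, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration. Masks are plain nested lists of 0/1 values, so no third-party library is needed.

```python
from itertools import chain


def flatten(mask):
    """Flatten a 2-D binary mask (list of rows) into a list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def dice_and_iou(pred, target):
    """Return (Dice coefficient, IoU) for two binary masks of equal shape.

    Dice = 2|P ∩ T| / (|P| + |T|)
    IoU  = |P ∩ T| / |P ∪ T|
    """
    p, t = flatten(pred), flatten(target)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(a and b for a, b in zip(p, t))
    pred_area, target_area = sum(p), sum(t)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 3x4 masks: 1 marks a lesion pixel, 0 marks background.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.2f}, IoU = {iou:.2f}")
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over a whole dataset need not satisfy that identity exactly.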

Keywords: Automatic segmentation; DeepLabV3+; Diagnosis of colorectal cancer; Hyper-granular image analysis; Vision transformers.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. This study used publicly available colonoscopy image datasets (CVC-ClinicDB, Kvasir-SEG, and Hyper-Kvasir), which were merged to form the CKHK-22 dataset. These datasets are anonymized and involve no direct interaction with human subjects or use of live human data. Therefore, ethical approval and participant consent were not required. Conflicts of interest: The authors state that they have no conflict of interest. Competing interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

**Fig. 1**
The CADx architecture of CNNs and transformers with DeepLabV3+

**Fig. 2**
Sample images from the 10 classes of the CKHK-22 mixed dataset

**Fig. 3**
CKHK-22 mixed dataset pre-processed sample images

**Fig. 4**
The train and test split of the CKHK-22 dataset

**Fig. 5**
PCA applied on CKHK-22 colonoscopy images

**Fig. 6**
Train vs test data for haemorrhoids and polyps classes

**Fig. 7**
The CNN architecture of the ResNet-50

**Fig. 8**
The map of features extracted by CNN ResNet-50

**Fig. 9**
The CNN architecture of the DenseNet-201

**Fig. 10**
A map of the features extracted by CNN DenseNet-201

**Fig. 11**
The CNN architecture of the VGG16

**Fig. 12**
A map of the features extracted by CNN VGG16

**Fig. 13**
The network architecture of the vision transformer

**Fig. 14**
The attention map extracted by the vision transformer network

**Fig. 15**
Illustrative architecture of DeepLabV3+ for colorectal cancer recognition

**Fig. 16**
The CADx flow diagram for the detection of colorectal cancer

**Fig. 17**
Hyperparameters for DeepLabV3+, ViT, and CNN

**Fig. 18**
Comparison of the training and testing accuracy and AUC of CNNs and ViT

**Fig. 19**
The performance metrics of the models in CADx

**Fig. 20**
Accuracy of vision transformers and training loss

**Fig. 21**
The confusion matrix for the classification of two classes

**Fig. 22**
The Receiver Operating Characteristic (ROC) curve for classification

**Fig. 23**
The graphs depicting the accuracy and loss of the DeepLabV3+ model

**Fig. 24**
The detection of malignant polyps with DeepLabV3+

**Fig. 25**
The CADx system’s overarching architecture
