Developing multimodal cervical cancer risk assessment and prediction model based on LMIC hospital patient card sheets and histopathological images
- PMID: 40890689
- PMCID: PMC12400732
- DOI: 10.1186/s12911-025-03174-6
Abstract
Cervical cancer remains a significant global health burden, particularly in low- and middle-income countries (LMICs), where access to early diagnostic tools is limited. In Ethiopia, cervical cancer diagnosis often relies on manual interpretation of biopsies, which can be time-consuming and subjective. This study aims to develop a multimodal machine learning model that integrates histopathological images with associated patient clinical records to improve cervical cancer risk prediction and biopsy detection. The dataset comprises 404 biopsy images and corresponding clinical records from 499 patients, collected at Jimma Medical Center. Preprocessing of the histopathological images and clinical records involved image enhancement, data augmentation, imputation of missing values, and class balancing. Subsequently, (I) a pre-trained convolutional neural network (VGG16) was applied to the histopathological dataset, (II) a Random Forest classifier was trained on the patient clinical records, and (III) a late fusion strategy was employed to integrate the outputs of both classifiers for multimodal analysis. Recursive Feature Elimination was used to identify key predictive factors from the patient data, and the model's performance was validated using accuracy, AUC-ROC curves, and confusion matrices to ensure reliability across all classes. The CNN and Random Forest classifiers achieved accuracies of 91% and 96%, respectively, while the integrated multimodal model achieved 92% accuracy, demonstrating enhanced robustness and clinical relevance by combining complementary data sources. These findings suggest that multimodal approaches hold promise for improving cervical cancer diagnostics in resource-limited settings. Future work will focus on validating the model with diverse datasets and integrating it into clinical workflows to support healthcare providers in LMICs.
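The late fusion step described above can be sketched as a weighted combination of the two classifiers' predicted class probabilities. This is a minimal illustrative sketch, not the authors' implementation: the abstract does not specify the fusion rule, so the equal weights and the two-class example below are assumptions.

```python
import numpy as np

def late_fusion(cnn_probs, rf_probs, w_cnn=0.5, w_rf=0.5):
    """Weighted-average late fusion of two classifiers' class probabilities.

    cnn_probs, rf_probs: arrays of shape (n_samples, n_classes), e.g. the
    softmax output of a VGG16 image model and predict_proba() of a Random
    Forest on clinical records. The weights here are hypothetical.
    """
    fused = w_cnn * np.asarray(cnn_probs) + w_rf * np.asarray(rf_probs)
    fused /= fused.sum(axis=1, keepdims=True)  # renormalize to valid probabilities
    return fused.argmax(axis=1), fused

# Toy example: the two modalities disagree on the second sample,
# and the fused probabilities resolve the disagreement.
cnn = np.array([[0.9, 0.1], [0.3, 0.7]])
rf  = np.array([[0.8, 0.2], [0.6, 0.4]])
labels, probs = late_fusion(cnn, rf)  # labels -> [0, 1]
```

In practice the weights can be tuned on a validation split, or replaced by a meta-classifier trained on the stacked probability vectors; the averaging rule shown here is the simplest late-fusion variant.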
Keywords: Cervical cancer; Deep learning; LMICs; Machine learning; Multimodal model; Predictive model; Risk analysis.
Conflict of interest statement
Declarations. Ethics approval and consent to participate: This study complies with the principles of the Declaration of Helsinki and was approved by the Jimma Institute of Technology Research and Ethical Review Board (IRB Reference No. RPD/JiT/183/15). The data used in this research were collected retrospectively from hospital patient card sheets and histopathological image archives. The need for informed consent was waived by the Jimma Institute of Technology Research and Ethical Review Board, in accordance with national research ethics regulations. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.