2025 Sep 1;25(1):322.
doi: 10.1186/s12911-025-03174-6.

Developing multimodal cervical cancer risk assessment and prediction model based on LMIC hospital patient card sheets and histopathological images

Kelebet Chane Jemane et al. BMC Med Inform Decis Mak.

Abstract

Cervical cancer remains a significant global health burden, particularly in low- and middle-income countries (LMICs), where access to early diagnostic tools is limited. In Ethiopia, cervical cancer diagnosis often relies on manual interpretation of biopsies, which can be time-consuming and subjective. This study aims to develop a multimodal machine learning model that integrates histopathological images and associated patient clinical records to improve cervical cancer risk prediction and biopsy detection. The dataset comprises 404 biopsy images and corresponding clinical records from 499 patients, collected at Jimma Medical Center. Preprocessing of the histopathological images and clinical records involved image enhancement, data augmentation, imputation of missing values, and class balancing. Subsequently, (I) a pre-trained convolutional neural network (VGG16) was applied to the histopathological dataset, (II) a Random Forest classifier was trained on the patient clinical records, and (III) a late fusion strategy was employed to integrate the outputs of both classifiers for multimodal analysis. Recursive Feature Elimination was used to identify key predictive factors in the patient data, and model performance was validated across all classes using accuracy, ROC curves with AUC, and confusion matrices. The convolutional neural network and Random Forest classifiers achieved accuracies of 91% and 96%, respectively. The integrated multimodal model achieved 92% accuracy, demonstrating enhanced robustness and clinical relevance by combining complementary data sources. These findings suggest that multimodal approaches hold promise for improving cervical cancer diagnostics in resource-limited settings. Future work will focus on validating the model with diverse datasets and integrating it into clinical workflows to support healthcare providers in LMICs.
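
The late fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the weighted averaging of class probabilities shown here is an assumption, as are the function names `late_fusion` and `predict_class`, the `image_weight` parameter, and the three-class example values.

```python
from typing import List, Sequence


def late_fusion(
    image_probs: Sequence[float],
    clinical_probs: Sequence[float],
    image_weight: float = 0.5,
) -> List[float]:
    """Fuse two class-probability vectors by weighted averaging.

    Assumption: both classifiers output probabilities over the same
    ordered set of classes. The weighting rule is illustrative only;
    the abstract does not specify the fusion rule.
    """
    if len(image_probs) != len(clinical_probs):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")

    clinical_weight = 1.0 - image_weight
    fused = [
        image_weight * p + clinical_weight * q
        for p, q in zip(image_probs, clinical_probs)
    ]

    # Renormalize so the fused scores sum to 1 even if the inputs
    # were slightly off due to rounding.
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    return [score / total for score in fused]


def predict_class(probs: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical outputs for one patient: an image model (e.g. a CNN)
# and a clinical-record model (e.g. a Random Forest) over three classes.
image_scores = [0.7, 0.2, 0.1]
clinical_scores = [0.1, 0.8, 0.1]
fused_scores = late_fusion(image_scores, clinical_scores, image_weight=0.5)
fused_label = predict_class(fused_scores)
```

With equal weights, the two models disagree on the top class and the fused prediction follows whichever model is more confident. In practice the weight would be tuned on held-out data rather than fixed at 0.5.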

Keywords: Cervical cancer; Deep learning; LMICs; Machine learning; Multimodal model; Predictive model; Risk analysis.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: This study complies with the principles of the Declaration of Helsinki and was approved by the Jimma Institute of Technology, Research, and Ethical Review Board (IRB Reference No. RPD/JiT/183/15). The data used in this research were collected retrospectively from hospital patient card sheets and histopathological image archives. The need for informed consent was waived by the Jimma Institute of Technology, Research, and Ethical Review Board, in accordance with national research ethics regulations. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Proposed methodology
Fig. 2. Pipeline for multimodal processing model
Fig. 3. Histopathological processing pipeline
Fig. 4. Patient card-sheet classification architecture
Fig. 5. Sample histopathology images (subtle pathological features)
Fig. 6. Class visualization before and after data augmentation
Fig. 7. Patient card-sheet class distribution after resampling
Fig. 8. Mean age of the women facing the risk of cervical cancer
Fig. 9. ROC curve for multi-class classification of training and testing sets
Fig. 10. Training and validation accuracy
Fig. 11. Training and validation loss
Fig. 12. ROC curves for VGG16
Fig. 13. VGG16 confusion matrix
Fig. 14. ResNet50 model training and validation accuracy
Fig. 15. ResNet50 model training and validation loss
Fig. 16. ResNet50 ROC curve
Fig. 17. ResNet50 confusion matrix
Fig. 18. Confusion matrix for RF model
Fig. 19. ROC curve for RF model
Fig. 20. Confusion matrix curve for the decision tree model
Fig. 21. ROC curve for the decision tree model
