BMC Med Res Methodol. 2023 Nov 24;23(1):276.

doi: 10.1186/s12874-023-02103-3.

Multiclass risk models for ovarian malignancy: an illustration of prediction uncertainty due to the choice of algorithm

Ashleigh Ledger¹, Jolien Ceusters^{1,2}, Lil Valentin^{3,4}, Antonia Testa^{5,6}, Caroline Van Holsbeke⁷, Dorella Franchi⁸, Tom Bourne^{1,9,10}, Wouter Froyman^{1,9}, Dirk Timmerman^{1,9}, Ben Van Calster^{11,12,13}

Affiliations

¹ Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium.
² Department of Oncology, Leuven Cancer Institute, Laboratory of Tumor Immunology and Immunotherapy, KU Leuven, Leuven, Belgium.
³ Department of Obstetrics and Gynecology, Skåne University Hospital, Malmö, Sweden.
⁴ Department of Clinical Sciences Malmö, Lund University, Malmö, Sweden.
⁵ Department of Woman, Child and Public Health, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy.
⁶ Dipartimento Universitario Scienze della Vita e Sanità Pubblica, Università Cattolica del Sacro Cuore, Rome, Italy.
⁷ Department of Obstetrics and Gynecology, Ziekenhuis Oost-Limburg, Genk, Belgium.
⁸ Preventive Gynecology Unit, Division of Gynecology, European Institute of Oncology IRCCS, Milan, Italy.
⁹ Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium.
¹⁰ Queen Charlotte's and Chelsea Hospital, Imperial College, London, UK.
¹¹ Department of Development and Regeneration, KU Leuven, Herestraat 49 box 805, Leuven, 3000, Belgium. ben.vancalster@kuleuven.be.
¹² Department of Biomedical Data Sciences, Leiden University Medical Centre (LUMC), Leiden, Netherlands. ben.vancalster@kuleuven.be.
¹³ Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium. ben.vancalster@kuleuven.be.

PMID: 38001421
PMCID: PMC10668424
DOI: 10.1186/s12874-023-02103-3

Ashleigh Ledger et al. BMC Med Res Methodol. 2023.

Abstract

Background: Assessing malignancy risk is important to choose appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic.

Methods: This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with or without CA125.

Results: Most tumors were benign (3980 in development and 1688 in validation data); secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, and from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20 percentage points) depending on the model. Net Benefit for diagnosing malignancy was similar across algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. In pairwise model comparisons, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold.

Conclusion: Although several models had similarly good performance, individual probability estimates varied substantially.
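
The two decision-oriented quantities in the Results, Net Benefit at a risk threshold and the share of patients that two models place on opposite sides of that threshold, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the standard decision-curve definition of Net Benefit, TP/n − (FP/n) × t/(1 − t), and that the risk of malignancy is one minus the estimated probability of a benign tumor. The function names, the class ordering, and all probabilities below are hypothetical toy values, not study data.

```python
from typing import Sequence


def malignancy_risk(class_probs: Sequence[float]) -> float:
    """Risk of any malignancy = 1 - P(benign); benign is assumed to be index 0."""
    return 1.0 - class_probs[0]


def net_benefit(risks: Sequence[float], is_malignant: Sequence[bool], threshold: float) -> float:
    """Net Benefit = TP/n - (FP/n) * t / (1 - t) at risk threshold t."""
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, is_malignant) if r >= threshold and y)
    fp = sum(1 for r, y in zip(risks, is_malignant) if r >= threshold and not y)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def discordance(risks_a: Sequence[float], risks_b: Sequence[float], threshold: float) -> float:
    """Share of patients that two models place on opposite sides of the threshold."""
    flips = sum((a >= threshold) != (b >= threshold) for a, b in zip(risks_a, risks_b))
    return flips / len(risks_a)


# Made-up probabilities per patient, in the assumed order:
# benign, borderline, stage I, stage II-IV, secondary metastatic.
model_a = [
    [0.95, 0.02, 0.01, 0.01, 0.01],
    [0.85, 0.05, 0.04, 0.04, 0.02],
    [0.40, 0.10, 0.10, 0.30, 0.10],
    [0.92, 0.03, 0.02, 0.02, 0.01],
]
model_b = [
    [0.97, 0.01, 0.01, 0.005, 0.005],
    [0.93, 0.03, 0.02, 0.01, 0.01],
    [0.20, 0.10, 0.10, 0.50, 0.10],
    [0.88, 0.05, 0.03, 0.03, 0.01],
]
outcome = [False, False, True, False]  # True = malignant (toy labels)

risks_a = [malignancy_risk(p) for p in model_a]
risks_b = [malignancy_risk(p) for p in model_b]

nb_a = net_benefit(risks_a, outcome, threshold=0.10)
nb_b = net_benefit(risks_b, outcome, threshold=0.10)
share_discordant = discordance(risks_a, risks_b, threshold=0.10)

print(f"Net Benefit A: {nb_a:.3f}, Net Benefit B: {nb_b:.3f}")
print(f"Discordant at 10% threshold: {share_discordant:.0%}")
```

In this toy data both models reach the same Net Benefit at the 10% threshold, yet they disagree on which side of the threshold half of the patients fall. That is the pattern the abstract describes: similar performance at the population level alongside substantially different estimates for individual patients.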

Keywords: Calibration; Machine learning; Multiclass models; Ovarian Neoplasms; Prediction models.

Conflict of interest statement

LV reported receiving grants from the Swedish Research Council, Malmö University Hospital and Skåne University Hospital, Allmänna Sjukhusets i Malmö Stiftelse för bekämpande av cancer (the Malmö General Hospital Foundation for Fighting Against Cancer), Avtal om läkarutbildning och forskning (ALF)–medel, and Landstingsfinansierad Regional Forskning during the conduct of the study; and teaching fees from Samsung outside the submitted work. DT and BVC reported receiving grants from the Research Foundation–Flanders (FWO) and Internal Funds KU Leuven during the conduct of the study. TB reported receiving grants from NIHR Biomedical Research Centre, speaking honoraria and departmental funding from Samsung Healthcare and grants from Roche Diagnostics, Illumina, and Abbott. No other disclosures were reported. All other authors declare no competing interests.

Figures

**Fig. 1**
Flexible calibration curves for models with CA125 on external validation data. Abbreviations: MLR, multinomial logistic regression; XGBoost, extreme gradient boosting

**Fig. 2**
Scatter plots of the estimated risk of a benign tumor on validation data, for each pair of models with CA125. Abbreviations: MLR, multinomial logistic regression; RF, random forest; XGBoost, extreme gradient boosting; NN, neural network; SVM, support vector machine

**Fig. 3**
Scatter plots of the estimated risk of a borderline tumor on validation data, for each pair of models with CA125. Abbreviations: MLR, multinomial logistic regression; RF, random forest; XGBoost, extreme gradient boosting; NN, neural network; SVM, support vector machine

**Fig. 4**
Scatter plots of the estimated risk of a stage I tumor on validation data, for each pair of models with CA125. Abbreviations: MLR, multinomial logistic regression; RF, random forest; XGBoost, extreme gradient boosting; NN, neural network; SVM, support vector machine

**Fig. 5**
Scatter plots of the estimated risk of a stage II-IV tumor on validation data, for each pair of models with CA125. Abbreviations: MLR, multinomial logistic regression; RF, random forest; XGBoost, extreme gradient boosting; NN, neural network; SVM, support vector machine

**Fig. 6**
Scatter plots of the estimated risk of a secondary metastatic tumor on validation data, for each pair of models with CA125. Abbreviations: MLR, multinomial logistic regression; RF, random forest; XGBoost, extreme gradient boosting; NN, neural network; SVM, support vector machine

Cited by

A comparison of modeling approaches for static and dynamic prediction of central line-associated bloodstream infections using electronic health records (part 2): random forest models.
Albu E, Gao S, Stijnen P, Rademakers FE, Janssens C, Cossey V, Debaveye Y, Wynants L, Van Calster B. Diagn Progn Res. 2025 Jul 21;9(1):21. doi: 10.1186/s41512-025-00194-8. PMID: 40691852. Free PMC article.
Validation of machine learning-based models to predict and explain the risk of ovarian cancer: a multicentric study on BRCA-mutated patients undergoing risk-reducing salpingo-oophorectomy.
Loizzi V, Comes MC, Arezzo F, Apostol AI, Bove S, Fanizzi A, Fruscio R, Gregorc V, Legge F, Mancari R, Marchetti C, Negri S, Russo G, Vertechy L, Scambia G, Massafra R, Cormio G. Front Oncol. 2025 Apr 15;15:1574037. doi: 10.3389/fonc.2025.1574037. eCollection 2025. PMID: 40303993. Free PMC article.
A Review on Biomarker-Enhanced Machine Learning for Early Diagnosis and Outcome Prediction in Ovarian Cancer Management.
Hormaty S, Seiwan AN, Rasheed BH, Parvaz H, Gharahzadeh A, Ghaznavi H. Cancer Med. 2025 Sep;14(17):e71224. doi: 10.1002/cam4.71224. PMID: 40927964. Free PMC article. Review.
Understanding overfitting in random forest for probability estimation: a visualization and simulation study.
Barreñada L, Dhiman P, Timmerman D, Boulesteix AL, Van Calster B. Diagn Progn Res. 2024 Sep 27;8(1):14. doi: 10.1186/s41512-024-00177-1. PMID: 39334348. Free PMC article.
