2023 Apr 3;32(4):561-571. doi: 10.1158/1055-9965.EPI-22-0677.

Performance of Statistical and Machine Learning Risk Prediction Models for Surveillance Benefits and Failures in Breast Cancer Survivors


Yu-Ru Su et al. Cancer Epidemiol Biomarkers Prev.

Abstract

Background: Machine learning (ML) approaches facilitate risk prediction model development using high-dimensional predictors and higher-order interactions at the cost of model interpretability and transparency. We compared the relative predictive performance of statistical and ML models to guide modeling strategy selection for surveillance mammography outcomes in women with a personal history of breast cancer (PHBC).

Methods: We cross-validated seven risk prediction models for two surveillance outcomes, failure (breast cancer within 12 months of a negative surveillance mammogram) and benefit (surveillance-detected breast cancer). We included 9,447 mammograms (495 failures, 1,414 benefits, and 7,538 nonevents) from 1996 to 2017, drawn as a 1:4 matched case-control sample of women with PHBC in the Breast Cancer Surveillance Consortium. We assessed model performance of conventional regression, regularized regressions (LASSO and elastic-net), and ML methods (random forests and gradient boosting machines) by evaluating their calibration and, among well-calibrated models, comparing the area under the receiver operating characteristic curve (AUC) and 95% confidence intervals (CI).
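The modeling comparison described above can be sketched with scikit-learn. The sketch below uses synthetic data as a stand-in (the BCSC surveillance data are not publicly available), and the hyperparameters are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for the 9,447-mammogram sample: a rare outcome (~5%
# positives) with low-dimensional features, as described in the Methods.
X, y = make_classification(n_samples=9447, n_features=20, weights=[0.95],
                           random_state=0)

models = {
    # LASSO = L1-penalized logistic regression; elastic-net mixes L1 and L2.
    "lasso": LogisticRegressionCV(penalty="l1", solver="saga", Cs=5, cv=3,
                                  max_iter=5000),
    "elastic_net": LogisticRegressionCV(penalty="elasticnet", solver="saga",
                                        l1_ratios=[0.5], Cs=5, cv=3,
                                        max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gbm": GradientBoostingClassifier(random_state=0),
}

aucs = {}
for name, model in models.items():
    # Out-of-fold predicted risks from 5-fold cross-validation.
    risk = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    aucs[name] = roc_auc_score(y, risk)
    print(f"{name}: AUC = {aucs[name]:.3f}")
```

On real data, calibration would be assessed first, and AUCs compared only among the well-calibrated models, as the Methods describe.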

Results: LASSO and elastic-net consistently provided well-calibrated predicted risks for surveillance failure and benefit. The AUCs of LASSO and elastic-net were both 0.63 (95% CI, 0.60-0.66) for surveillance failure and 0.66 (95% CI, 0.64-0.68) for surveillance benefit, the highest among well-calibrated models.

Conclusions: For predicting breast cancer surveillance mammography outcomes, regularized regression outperformed other modeling approaches and balanced the trade-off between model flexibility and interpretability.

Impact: Regularized regression may be preferred for developing risk prediction models in other contexts with rare outcomes, similar training sample sizes, and low-dimensional features.


Conflict of interest statement

The following authors have potential conflicts of interest: Dr. Diana Buist: Athena WISDOM Study Data Safety and Monitoring Board (2015-present); Dr. Janie M Lee: research grant from GE Healthcare (11/15/2016-12/31/2020), consulting agreement with GE Healthcare (2017 only); Dr. Diana Miglioretti: honorarium from the Society for Breast Imaging for a keynote lecture in April 2019, royalties from Elsevier; Dr. Karla Kerlikowske: non-paid consultant for Grail on the STRIVE study (2017-present). No other disclosures were reported.

Figures

Figure 1. Assessment of overall model calibration.
We show overall model calibration measured by three metrics: the ratio of expected to observed events (E/O ratio), the calibration intercept, and the calibration slope, for surveillance failure (interval cancer; top panel) and surveillance benefit (surveillance-detected cancer; bottom panel) for each prediction modeling approach. Error bars show the 95% CI for each calibration measure. Vertical lines mark the ideal value for each metric (1 for E/O ratio, 0 for calibration intercept, and 1 for calibration slope).
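The three overall calibration metrics in Figure 1 can be computed from predicted risks and binary outcomes as sketched below. The definitions used here (E/O ratio; slope from a logistic refit on logit-transformed risks; intercept from a refit with the slope fixed at 1) follow standard calibration methodology and are an assumption about, not a copy of, the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def _nll(beta, X, y):
    # Negative log-likelihood of a logistic regression model.
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z) - y * z)

def calibration_metrics(y, p):
    y = np.asarray(y, float)
    p = np.asarray(p, float)
    logit_p = np.log(p / (1 - p))
    # E/O ratio: expected (mean predicted) over observed event rate; ideal = 1.
    eo_ratio = p.mean() / y.mean()
    X = np.column_stack([np.ones_like(logit_p), logit_p])
    # Calibration slope: coefficient on logit(p) in a logistic refit; ideal = 1.
    slope = minimize(_nll, x0=[0.0, 1.0], args=(X, y)).x[1]
    # Calibration intercept: refit with the slope fixed at 1, i.e. logit(p)
    # acts as an offset; ideal = 0.
    intercept = minimize(lambda a: _nll(np.r_[a, 1.0], X, y), x0=[0.0]).x[0]
    return eo_ratio, intercept, slope

# Perfectly calibrated simulated risks should give roughly (1, 0, 1).
rng = np.random.default_rng(0)
p = rng.uniform(0.01, 0.3, size=20_000)
y = rng.binomial(1, p)
eo, intercept, slope = calibration_metrics(y, p)
print(f"E/O = {eo:.3f}, intercept = {intercept:.3f}, slope = {slope:.3f}")
```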
Figure 2. Weak calibration for 7 risk prediction models for surveillance failure (interval second breast cancer).
Each subfigure shows the weak calibration of an individual modeling approach by comparing the mean predicted risk (x-axis) to the observed risk of surveillance failure (y-axis) within 10 deciles of predicted risk. Vertical error bars show the 95% confidence interval of the observed risk of surveillance failure in each decile. A p-value from the Hosmer-Lemeshow (HL) test is also shown for each modeling approach.
Figure 3. Weak calibration for 7 risk prediction models for surveillance benefit (surveillance-detected cancer).
Each subfigure shows the weak calibration of an individual modeling approach by comparing the mean predicted risk (x-axis) to the observed risk of surveillance benefit (y-axis) within 10 deciles of predicted risk. Vertical error bars show the 95% confidence interval of the observed risk of surveillance benefit in each decile. A p-value from the Hosmer-Lemeshow (HL) test is also shown for each modeling approach.
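The decile-based Hosmer-Lemeshow check described in the legends of Figures 2 and 3 can be sketched as follows; this is a standard formulation of the test, not the authors' code.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p, groups=10):
    y = np.asarray(y, float)
    p = np.asarray(p, float)
    # Split observations into deciles of predicted risk.
    order = np.argsort(p)
    stat = 0.0
    for idx in np.array_split(order, groups):
        obs = y[idx].sum()   # observed events in the decile
        exp = p[idx].sum()   # expected events in the decile
        n = len(idx)
        stat += (obs - exp) ** 2 / (exp * (1 - exp / n))
    # Compared against a chi-square with groups - 2 degrees of freedom.
    return stat, chi2.sf(stat, groups - 2)

# Well-calibrated simulated risks: the HL statistic should be modest.
rng = np.random.default_rng(1)
p = rng.uniform(0.02, 0.4, size=5_000)
y = rng.binomial(1, p)
stat, pval = hosmer_lemeshow(y, p)
print(f"HL statistic = {stat:.2f}, p = {pval:.3f}")
```

A small HL p-value flags a decile where predicted and observed risks diverge, which is how the figures identify poorly calibrated modeling approaches.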
Figure 4. Receiver operating characteristic curves for surveillance failure (panel A) and surveillance benefit (panel B) for well-calibrated risk modeling approaches.
We show receiver operating characteristic curves for the 4 well-calibrated risk prediction models (the expert model, LASSO, elastic-net (EN), and gradient boosting machines (GBM)), along with their AUCs and corresponding 95% confidence intervals.
Figure 5. Race- and ethnicity-stratified model calibration for surveillance failures (top panel) and benefits (bottom panel) in non-Hispanic racial and ethnic groups.
We show race- and ethnicity-stratified model calibration assessed by three metrics: the ratio of expected to observed events (E/O ratio), the calibration intercept, and the calibration slope, within three non-Hispanic racial and ethnic groups: Asian and Pacific Islander, Black, and White. Error bars show the 95% CI for each calibration measure. Vertical lines mark the ideal value for each metric (1 for E/O ratio, 0 for calibration intercept, and 1 for calibration slope).
