Cancers (Basel). 2023 Feb 8;15(4):1090.

doi: 10.3390/cancers15041090.

Combining Breast Cancer Risk Prediction Models

Zoe Guan¹, Theodore Huang², Anne Marie McCarthy³, Kevin Hughes⁴, Alan Semine⁵, Hajime Uno⁶,⁷, Lorenzo Trippa⁶,⁸, Giovanni Parmigiani⁶,⁸, Danielle Braun⁶,⁸

Affiliations

¹ Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 10017, USA.
² Vertex Pharmaceuticals, Boston, MA 02210, USA.
³ Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
⁴ Department of Surgery, Medical University of South Carolina, Charleston, SC 29425, USA.
⁵ Advanced Image Enhancement, Fall River, MA 02720, USA.
⁶ Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02115, USA.
⁷ Department of Medicine, Harvard Medical School, Boston, MA 02115, USA.
⁸ Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.

PMID: 36831433
PMCID: PMC9953824
DOI: 10.3390/cancers15041090

Zoe Guan et al. Cancers (Basel). 2023.

Abstract

Accurate risk stratification is key to reducing cancer morbidity through targeted screening and preventative interventions. Multiple breast cancer risk prediction models are used in clinical practice, and often provide a range of different predictions for the same patient. Integrating information from different models may improve the accuracy of predictions, which would be valuable for both clinicians and patients. BRCAPRO is a widely used model that predicts breast cancer risk based on detailed family history information. A major limitation of this model is that it does not consider non-genetic risk factors. To address this limitation, we expand BRCAPRO by combining it with another popular existing model, BCRAT (i.e., Gail), which uses a largely complementary set of risk factors, most of them non-genetic. We consider two approaches for combining BRCAPRO and BCRAT: (1) modifying the penetrance (age-specific probability of developing cancer given genotype) functions in BRCAPRO using relative hazard estimates from BCRAT, and (2) training an ensemble model that takes BRCAPRO and BCRAT predictions as input. Using both simulated data and data from Newton-Wellesley Hospital and the Cancer Genetics Network, we show that the combination models are able to achieve performance gains over both BRCAPRO and BCRAT. In the Cancer Genetics Network cohort, we show that the proposed BRCAPRO + BCRAT penetrance modification model performs comparably to IBIS, an existing model that combines detailed family history with non-genetic risk factors.
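
The first approach, modifying penetrance functions with relative hazard estimates, can be illustrated with a minimal sketch. It assumes annual cause-specific hazards and a single multiplicative relative hazard, and it ignores competing mortality; the function name `cumulative_risk` and all numeric values are hypothetical and are not taken from the paper or from the BRCAPRO or BCRAT software.

```python
import math

def cumulative_risk(hazards, relative_hazard=1.0):
    """Probability of developing cancer over the years covered by `hazards`.

    hazards: annual cause-specific hazards, one per year of age
             (illustrative values, not estimates from any model).
    relative_hazard: multiplier applied to every annual hazard, standing in
             for a relative hazard derived from non-genetic risk factors.
    Competing mortality is ignored in this simplified sketch.
    """
    cumulative_hazard = sum(relative_hazard * h for h in hazards)
    return 1.0 - math.exp(-cumulative_hazard)

# Hypothetical five-year hazard schedule for one genotype.
baseline = [0.002, 0.002, 0.003, 0.003, 0.003]

risk_baseline = cumulative_risk(baseline)
risk_modified = cumulative_risk(baseline, relative_hazard=1.5)

print(f"five-year risk, unmodified: {risk_baseline:.4f}")
print(f"five-year risk, relative hazard 1.5: {risk_modified:.4f}")
```

A relative hazard above 1 raises the predicted risk and a value below 1 lowers it. A full implementation would apply the modification separately for each genotype and account for death from other causes.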

Keywords: BCRAT; BRCAPRO; ensemble learning; model aggregation; stacking.

Conflict of interest statement

Giovanni Parmigiani is a member of the Scientific Advisory Board of Ambry Genetics, a commercial provider of cancer susceptibility tests. Commercial use of the BRCAPRO model is licensed by the Dana-Farber Cancer Institute as part of the BayesMendel software package. Licensing revenues are used for model and software upgrades. None of the authors derives personal income from BRCAPRO licensing revenues. Kevin Hughes receives honoraria from Hologic (surgical implant for radiation planning with breast conservation, and wire-free breast biopsy) and from Myriad Genetics. He is a founder of, and has a financial interest in, CRA Health (formerly Hughes RiskApps), which was recently sold to Volpara. CRA Health develops risk assessment models and software with a particular focus on breast cancer and colorectal cancer. He is the co-creator of Ask2Me.Org, which is freely available for clinical use and is licensed for commercial use by the Dana-Farber Cancer Institute, Harvard University, and the Massachusetts General Hospital (MGH). His interests in CRA Health and Ask2Me.Org were reviewed and are managed by MGH and Partners Health Care in accordance with their conflict of interest policies.

Figures

**Figure A1**
BCRAT cause-specific hazard of breast cancer for White women in the general population ($\tilde{\lambda}_{B}(t) = \tilde{\lambda}_{B,0}(t) / (1 - \mathrm{AR}(t))$) and BRCAPRO cause-specific hazard of breast cancer for White female non-carriers ($\lambda_{B}^{0}(t)$).

**Figure 1**
Inputs to BRCAPRO and BCRAT and their overlap.

**Figure 2**
Calibration plots by decile of risk for five-year predictions in a simulated dataset with 45,557 counselees (724 cases). For each model, we grouped individuals by decile of risk and plotted the observed proportion of women who developed cancer (with 95% Wilson CI) versus the predicted probability (sum of risk predictions) within each decile.
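
The decile-based calibration summary described in this caption can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, outcomes are assumed to be fully observed (no censoring), and the mean prediction per group is used as the predicted probability.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

def calibration_by_decile(predictions, outcomes, n_groups=10):
    """Group individuals by predicted risk and compare predicted to observed.

    Returns one tuple per non-empty group:
    (mean predicted risk, observed proportion, Wilson lower, Wilson upper).
    """
    pairs = sorted(zip(predictions, outcomes))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        cases = sum(y for _, y in group)
        mean_pred = sum(p for p, _ in group) / len(group)
        lower, upper = wilson_interval(cases, len(group))
        rows.append((mean_pred, cases / len(group), lower, upper))
    return rows

# Toy data: 20 individuals, with the two highest-risk individuals as cases.
toy_predictions = [i / 100 for i in range(20)]
toy_outcomes = [0] * 18 + [1] * 2
toy_rows = calibration_by_decile(toy_predictions, toy_outcomes)
for mean_pred, observed, lower, upper in toy_rows:
    print(f"{mean_pred:.3f}  {observed:.2f}  [{lower:.2f}, {upper:.2f}]")
```

A well-calibrated model has mean predicted risk inside or close to the Wilson interval of the observed proportion in each group.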

**Figure 3**
Scatter plots, density plots, and pairwise correlations of five-year predictions in the CGN data. Red corresponds to data from cases, blue corresponds to data from controls (individuals who did not develop breast cancer within five years), and beige corresponds to data from counselees censored before five years. Corr: Pearson correlation, cens: counselees censored before five years, ctrl: controls. The lower diagonal panels show scatter plots of the predictions from each pair of models. For example, the panel in the second row, first column has predictions from BRCAPRO + BCRAT (E) on the x-axis and predictions from BRCAPRO + BCRAT (M) on the y-axis. The diagonal panels show density plots of the predictions from each model, stratified by case-control status. For example, the panel in the first row, first column shows the distribution of predictions from BRCAPRO + BCRAT (M). The upper diagonal panels show the Pearson correlations between predictions from each pair of models. For example, the first row, second column shows the overall correlation, correlation among cases, correlation among censored counselees, and correlation among controls for predictions from BRCAPRO + BCRAT (M) and BRCAPRO + BCRAT (E). *** = p < 0.001.

**Figure 4**
Calibration plots by decile of risk for five-year predictions in CGN. For each model, we grouped individuals by decile of risk and plotted the observed proportion of women who developed cancer (with 95% Wilson CI) versus the predicted probability (sum of risk predictions) within each decile. In computing the observed proportions, inverse probability of censoring weights were used to account for censoring.

Cited by

Deep learning of longitudinal mammogram examinations for breast cancer risk prediction.
Dadsetan S, Arefan D, Berg WA, Zuley ML, Sumkin JH, Wu S. Pattern Recognit. 2022 Dec;132:108919. doi: 10.1016/j.patcog.2022.108919. Epub 2022 Jul 22. PMID: 37089470. Free PMC article.
Validation of Breast Cancer Risk Models by Race/Ethnicity, Family History and Molecular Subtypes.
McCarthy AM, Liu Y, Ehsan S, Guan Z, Liang J, Huang T, Hughes K, Semine A, Kontos D, Conant E, Lehman C, Armstrong K, Braun D, Parmigiani G, Chen J. Cancers (Basel). 2021 Dec 23;14(1):45. doi: 10.3390/cancers14010045. PMID: 35008209. Free PMC article.
Association Between Risk Factors and Major Cancers: Explainable Machine Learning Approach.
Huang X, Ren S, Mao X, Chen S, Chen E, He Y, Jiang Y. JMIR Cancer. 2025 May 2;11:e62833. doi: 10.2196/62833. PMID: 40315870. Free PMC article.
Critical Risk Assessment, Diagnosis, and Survival Analysis of Breast Cancer.
Manir SB, Deshpande P. Diagnostics (Basel). 2024 May 8;14(10):984. doi: 10.3390/diagnostics14100984. PMID: 38786282. Free PMC article.
Multiplex Digital Spatial Profiling in Breast Cancer Research: State-of-the-Art Technologies and Applications across the Translational Science Spectrum.
Rossi M, Radisky DC. Cancers (Basel). 2024 Apr 23;16(9):1615. doi: 10.3390/cancers16091615. PMID: 38730568. Free PMC article. Review.

