Breast Cancer Res. 2025 May 1;27(1):67.
doi: 10.1186/s13058-025-02031-8.

Development and validation of a risk prediction model for premenopausal breast cancer in 19 cohorts

Kristen D Brantley et al. Breast Cancer Res.

Abstract

Background: Incidence of premenopausal breast cancer (BC) has risen in recent years, though most existing BC prediction models are not generalizable to young women due to underrepresentation of this age group in model development.

Methods: Using questionnaire-based data from 19 prospective studies harmonized within the Premenopausal Breast Cancer Collaborative Group (PBCCG), representing 783,830 women, we developed a premenopausal BC risk prediction model. The data were split into training (2/3) and validation (1/3) datasets with equal distribution of cohorts in each. In the training dataset, variables were chosen from known and hypothesized risk factors: age, age at menarche, age at first birth, parity, breastfeeding, height, BMI, young adulthood BMI, recent weight change, alcohol consumption, first-degree family history of BC, and personal history of benign breast disease (BBD). Hazard ratios (HR) and 95% confidence intervals (CI) were estimated by Cox proportional hazards regression using age as the time scale, stratified by cohort. Because complete information on all risk factors was not available in all cohorts, coefficients were estimated separately in groups of cohorts with the same available covariate information, adjusted to account for the correlation between missing and non-missing variables, and meta-analyzed. Absolute risk of BC (in situ or invasive) within 5 years was determined using country-, age-, and birth cohort-specific incidence rates. Discrimination (area under the curve, AUC) and calibration (Expected/Observed, E/O) were evaluated in the validation dataset. We compared our model with a literature-based model for women < 50 years (iCARE-Lit).
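For readers unfamiliar with the two validation metrics named above, the following is a minimal Python sketch of how an E/O ratio and an AUC can be computed from predicted risks and observed outcomes. It is not the authors' code. The function names and the toy data are hypothetical, the AUC is the simple pairwise (concordance) form that ignores censoring, and the grouped E/O forms groups within the evaluated data rather than using cut points from a training dataset.

```python
from typing import List, Optional, Sequence


def expected_over_observed(risk: Sequence[float], event: Sequence[int]) -> float:
    """E/O ratio: sum of predicted risks divided by the number of observed events.
    Values above 1 indicate overestimation, values below 1 underestimation."""
    observed = sum(event)
    if observed == 0:
        raise ValueError("E/O is undefined when no events are observed")
    return sum(risk) / observed


def auc(risk: Sequence[float], event: Sequence[int]) -> float:
    """Probability that a randomly chosen case has a higher predicted risk
    than a randomly chosen non-case; ties count as one half."""
    cases = [r for r, e in zip(risk, event) if e == 1]
    non_cases = [r for r, e in zip(risk, event) if e == 0]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


def eo_by_group(
    risk: Sequence[float], event: Sequence[int], n_groups: int = 10
) -> List[Optional[float]]:
    """E/O within equal-sized groups ordered by predicted risk.
    Simplification: groups are formed in the evaluated data itself.
    A group with no observed events yields None."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    total = len(order)
    ratios: List[Optional[float]] = []
    for g in range(n_groups):
        members = order[g * total // n_groups:(g + 1) * total // n_groups]
        observed = sum(event[i] for i in members)
        expected = sum(risk[i] for i in members)
        ratios.append(expected / observed if observed else None)
    return ratios


# Hypothetical toy data, invented for illustration only (not study data).
toy_risk = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_event = [0, 0, 1, 0, 1, 0, 1, 1]

print("E/O overall:", expected_over_observed(toy_risk, toy_event))
print("AUC:", auc(toy_risk, toy_event))
print("E/O by half:", eo_by_group(toy_risk, toy_event, n_groups=2))
```

The pairwise AUC is quadratic in sample size, so it suits small illustrations only; a rank-based implementation would be needed at the scale of the cohorts described here.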

Results: Selected model risk factors were age at menarche, parity, height, current and young adulthood BMI, family history of BC, and personal BBD history. Predicted absolute 5-year risk ranged from 0% to 5.7%. The model overestimated risk on average [E/O risk = 1.18 (1.14-1.23)], with underestimation of risk in lower absolute risk deciles and overestimation in upper absolute risk deciles [E/O 1st decile = 0.59 (0.58-0.60); E/O 10th decile = 1.48 (1.48-1.49)]. The AUC was 59.1% (58.1-60.1%). Performance was similar to the iCARE-Lit model.

Conclusion: In this prediction model for premenopausal BC, the relative contribution of risk factors to absolute risk was similar to existing models for overall BC. The discriminatory ability was nearly identical (< 1% difference in AUC) to the existing iCARE-Lit model developed in women under 50 years. The inability to improve discrimination highlights the need to investigate additional predictors to better understand premenopausal BC risk.

Keywords: Premenopausal breast cancer; Risk prediction model; Young-onset breast cancer.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Approval from institutional review boards and individual consent for all cohorts in the PBCCG conformed to each study’s ethics review requirements. Consent for publication: The authors give their consent for publication. Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Calibration for the risk model among 19 cohorts within the PBCCG. E/O within the testing dataset (N = 261,130; breast cancer cases = 3,226) for (A) absolute risk of BC, by decile of absolute risk in the training dataset, and (B) relative risk of BC in the testing dataset, by decile of relative risk in the training dataset (log scale).
