Nature. 2019 Mar;567(7748):399-404.
doi: 10.1038/s41586-019-1007-8. Epub 2019 Mar 13.

Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups

Oscar M Rueda et al. Nature. 2019 Mar.

Abstract

The rates and routes of lethal systemic spread in breast cancer are poorly understood owing to a lack of molecularly characterized patient cohorts with long-term, detailed follow-up data. Long-term follow-up is especially important for those with oestrogen-receptor (ER)-positive breast cancers, which can recur up to two decades after initial diagnosis [1-6]. It is therefore essential to identify patients who have a high risk of late relapse [7-9]. Here we present a statistical framework that models distinct disease stages (locoregional recurrence, distant recurrence, breast-cancer-related death and death from other causes) and competing risks of mortality from breast cancer, while yielding individual risk-of-recurrence predictions. We apply this model to 3,240 patients with breast cancer, including 1,980 for whom molecular data are available, and delineate spatiotemporal patterns of relapse across different categories of molecular information (namely immunohistochemical subtypes; PAM50 subtypes, which are based on gene-expression patterns [10,11]; and integrative or IntClust subtypes, which are based on patterns of genomic copy-number alterations and gene expression [12,13]). We identify four late-recurring integrative subtypes, comprising about one quarter (26%) of tumours that are both positive for ER and negative for human epidermal growth factor receptor 2, each with characteristic tumour-driving alterations in genomic copy number and a high risk of recurrence (mean 47-62%) up to 20 years after diagnosis. We also define a subgroup of triple-negative breast cancers in which cancer rarely recurs after five years, and a separate subgroup in which patients remain at risk. Use of the integrative subtypes improves the prediction of late, distant relapse beyond what is possible with clinical covariates (nodal status, tumour size, tumour grade and immunohistochemical subtype). These findings highlight opportunities for improved patient stratification and biomarker-driven clinical trials.

Figures

Extended Data Fig. 1 | Description of the cohorts used in this study.
a. Description of the METABRIC discovery cohort, clinical characteristics and flow chart of sample inclusion for analysis. b. Description of the validation cohort, clinical characteristics and flow chart of sample inclusion for analysis.
Extended Data Fig. 2 | Effect of censoring non-malignant deaths in the estimation of disease-specific survival and prognostic value of clinical covariates at different disease states.
a. Cumulative incidence computed as 1-Kaplan-Meier estimator using only disease-specific death as endpoint and censoring other types of death. b. Cumulative incidence computed using a competing-risk model that takes into account different causes of death. The bias of the 1-Kaplan-Meier estimator is visible. c. Distribution of age at the time of diagnosis for ER− and ER+ patients. The number of patients in each group is indicated in all Panels. This analysis was done with the FD. d. Log Hazard Ratios (HR) calculated using the multistate model stratified by ER status (n=3147) for different covariates, namely grade, lymph node (LN) status, tumor size (size), time from local relapse, time from surgery. Log HR are shown from different states, including post surgery (PS; HR of progressing to relapse or DSD), loco-regional recurrence (LR; HR of progressing to DR or DSD) and distant recurrence (DR; HR of cancer-specific death). 95% confidence intervals are shown. This analysis was done with the FD.
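The contrast between panels a and b can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the cause coding (0 = censored, 1 = disease-specific death, 2 = death from other causes) and the toy data are assumptions made for illustration, and the competing-risk estimate uses the standard Aalen-Johansen form.

```python
def one_minus_km(times, causes, cause=1, horizon=float("inf")):
    """1 - Kaplan-Meier for `cause`, treating every other outcome as censoring."""
    surv = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        d = sum(1 for ti, ci in zip(times, causes) if ti == t and ci == cause)
        leaving = sum(1 for ti in times if ti == t)
        surv *= 1.0 - d / at_risk
        at_risk -= leaving
    return 1.0 - surv


def cumulative_incidence(times, causes, cause=1, horizon=float("inf")):
    """Aalen-Johansen cumulative incidence of `cause` with competing risks."""
    overall_surv = 1.0  # all-cause survival just before the current time
    cif = 0.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        d_cause = sum(1 for ti, ci in zip(times, causes) if ti == t and ci == cause)
        d_any = sum(1 for ti, ci in zip(times, causes) if ti == t and ci != 0)
        leaving = sum(1 for ti in times if ti == t)
        cif += overall_surv * d_cause / at_risk
        overall_surv *= 1.0 - d_any / at_risk
        at_risk -= leaving
    return cif


# Hypothetical toy cohort: follow-up time (years) and outcome code per patient.
toy_times = [1, 2, 3, 4, 5, 6]
toy_causes = [1, 2, 1, 2, 0, 1]

km_estimate = one_minus_km(toy_times, toy_causes)
cif_estimate = cumulative_incidence(toy_times, toy_causes)
```

On this toy cohort the 1-Kaplan-Meier value is never smaller than the competing-risk estimate, because patients who die of other causes are treated as if they could still die of the disease later.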
Extended Data Fig. 3 | Model calibration and validation in an external dataset.
a. Internal validation of the global predictions of the models on all transitions using bootstrap (n=200). Boxplots are computed using the median of the observations, the first and third quartiles as hinges and the +/−1.58 Interquartile range divided by the square root of the sample size as notches. The optimism (the difference between the training predictive ability and the test predictive ability) of several discriminant measures is shown (see Methods). b. Internal calibration of the global predictions of the models on all transitions using bootstrap (n=200). The distribution of the mean absolute error between observed and predicted is plotted. Boxplot defined as above (see Methods). c. External calibration of disease-specific death (DSD) risk and non-malignant death risk using PREDICT 2.1 (n=1841). The distribution of the mean absolute error between the predictions of PREDICT and our model based on ER status only is plotted. Boxplots defined as above. d. Scatterplot of the predictions of DSD risk computed by PREDICT and our model based on the IntClust subtypes only at 10 years (n=1841) (see Methods). Pearson correlation is shown. e. Concordance index (c-index) of prediction of risk of distant relapse (distant relapse free survival, DRFS), disease-specific death (disease specific survival, DSS), death (overall survival, OS) and relapse (relapse free survival, RFS) in the 178 withheld METABRIC samples and in a metacohort composed of 8 published studies amongst ER−/HER2− patients in the high-risk IntClust subtypes, where results are shown for individual cohorts and the combined metacohort (see Methods, Supplementary Information). Error bars correspond to 95% confidence intervals for the c-index. The number of patients in each group is indicated.
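For readers unfamiliar with the c-index reported in panel e, the following Python sketch shows one common definition, Harrell's concordance index for right-censored data. This is an assumption: the study's Methods may use a different variant, and the function name and toy values are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when patient i also has the higher predicted risk; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event indicator, predicted risk.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking; here four of the five comparable pairs are ordered correctly.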
Extended Data Fig. 4 | Different subtypes have distinct probabilities of recurrence.
a. Average probability of experiencing a distant relapse (DR, defined as the probability of having a distant relapse at any point followed by any other transition) for the high risk ER+ IntClust (IC) subtypes (IC1; n=134, IC6; n=81, IC9; n=134, IC2; n=69) relative to IC3 (n=269), the best prognosis ER+ subgroup. This analysis was restricted to ER+/HER2− cases, which represent the vast majority for each of these subtypes. Error bars represent 95% confidence intervals for the mean. b. As in Panel (a), but showing the average probability of experiencing DR or cancer related death after a LR (IC1; n=21, IC6; n=10, IC9; n=21, IC2; n=13, IC3; n=30). c. Average probability of recurrence (distant relapse or cancer-specific death) after loco-regional relapse for all patients in each of the 11 IntClust subtypes. d. Median time until an additional relapse (DR or cancer specific death) after LR for all patients in each of the 11 IntClust subtypes (n=270). This has been computed using a Kaplan-Meier approach with competing risks of progression and non-malignant death. Error bars represent 95% confidence intervals for the median time. Asterisks denote situations where the median time cannot be computed because fewer than 50% of the patients relapsed. This analysis was done with the MD. e. Average probability of cancer related death after DR for all patients by subtype. f. As in Panel (d), except that the median time until cancer specific death after DR is shown (n=596). g. Mean probabilities of having relapse after surgery and after being 5 and 10 years disease-free (see Methods and Supplementary Table 3) for the patients in each of the four clinical subtypes. Error bars represent 95% confidence intervals. The number of patients in each group is indicated. h, i, j, k. Same as Panels (b, c, d, e) for the IHC subtypes (same sample sizes). l. As in Panel (g) but for the PAM50 subtypes. The number of patients in each group is indicated. m, n, o, p. Same as Panels (b, c, d, e) for the PAM50 subtypes (same sample sizes except for Panel (p); n=593).
Extended Data Fig. 5 | The ER−/HER2− integrative subtypes exhibit distinct risks of relapse.
Probabilities of distant relapse (DR) or cancer related death (C/D) amongst ER−/HER2− patients who were disease free at 5 years post diagnosis reveal dramatic differences in the risk of relapse for TNBC IntClust (IC) subtypes IC4ER− versus the IC10 (Basal-like enriched) subtype. Here the base clinical model with IHC subtypes is compared with the base clinical model plus IntClust subtype information. Error bars represent 95% confidence intervals. The number of patients in each group is indicated.
Extended Data Fig. 6 | Subtype specific risks of relapse after loco-regional relapse.
Transition probabilities from LR to other states (LR=Loco-regional relapse, DR=Distant relapse, D/C=Cancer/disease specific death, D/O=Death by other causes) for individual average patients stratified based on ER status, IHC, PAM50, or IntClust subtypes. 95% confidence bands were computed using bootstrap. This analysis was done with the FD for ER+/ER− comparisons and the MD for the remainder.
Extended Data Fig. 7 | Associations between probabilities of distant relapse 10 years after loco-regional relapse with clinico-pathological and molecular features of the primary tumor.
For each patient that had a loco-regional recurrence (LR), the 10-year probability of having distant relapse (DR) or cancer-related death (D/C) is plotted against different variables. A loess fit is overlaid in order to highlight the relationship between the probability and tumor size or time of relapse. Boxplots are computed using the median of the observations, the first and third quartiles as hinges and the +/−1.58 interquartile range divided by the square root of the sample size as notches. This analysis was done with the MD and the model was stratified by IntClust subtype (n=257).
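The notch rule quoted in this caption (median +/− 1.58 times the interquartile range divided by the square root of the sample size) can be written out as a short Python sketch. The function name and sample values are hypothetical, and the quartile convention (the "inclusive" method of the standard library) is an assumption; plotting software may compute hinges slightly differently.

```python
import math
import statistics


def boxplot_notches(values):
    """Return (lower_notch, median, upper_notch) for a notched boxplot.

    Notches are median +/- 1.58 * IQR / sqrt(n), with the first and third
    quartiles taken as the hinges.
    """
    n = len(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median, median + half_width


# Hypothetical sample of nine observations.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower_notch, sample_median, upper_notch = boxplot_notches(sample)
```

Notches of two boxes that do not overlap are conventionally read as rough evidence that the two medians differ.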
Extended Data Fig. 8 | Subtype specific risks of relapse after a distant relapse.
Transition probabilities from DR to other states (LR=Loco-regional relapse, DR=Distant relapse, D/C=Cancer related death, D/O=Death by other causes) for individual average patients stratified based on ER status, IHC, PAM50 or IntClust subtypes. 95% confidence bands were computed using bootstrap. This analysis was done with the FD for ER+/ER− comparisons and the MD for the remainder.
Extended Data Fig. 9 | Distribution of the number of relapses by molecular subtype.
a. Times of distant recurrence (DR) for ER− and ER+ patients (n=605). Each dot represents a distant recurrence, coded by color for different sites. b. Distribution of the number of distant relapses for different subtypes (n=611), based on ER/HER2 status (ER+/HER2+ n=36, ER+/HER2− n=263, ER−/HER2+ n=41, ER−/HER2− n=82), PAM50 (Basal n=79, Her2 n=69, Luminal A n=101, Luminal B n=138, Normal n=33) and IntClust subtypes (IC1 n=40, IC2 n=25, IC3 n=32, IC4ER+ n=46, IC4ER− n=16, IC5 n=72, IC6 n=23, IC7 n=24, IC8 n=54, IC9 n=38, IC10 n=52). ER status was imputed based on expression in 6 samples. These analyses were done with the RD cohort.
Extended Data Fig. 10 | Site specific patterns of relapse in the IHC, PAM50 and IntClust subtypes.
a. Left Panel: Percentages of patients with a given site of metastasis in the IHC subtypes (barplots, total numbers also indicated). Upright triangles indicate significant positive differences in that group with respect to the overall mean and inverted triangles indicate significant negative differences in that group with respect to the overall mean using simultaneous testing of all sites (see Methods). Location of metastatic sites is not anatomically accurate. Right Panel: Cumulative incidence functions (as 1-Kaplan-Meier estimates) for each site of metastasis in the IHC subtypes. The same patient can have multiple sites of metastasis. b. Same as in Panel (a) but for the PAM50 subtypes. c. Same as in Panel (a) but for the IntClust subtypes. These analyses were done with the RD cohort.
Figure 1. A multistate model of breast cancer relapse enables individual risk of relapse predictions throughout disease progression.
a. Graphical representation of the model. Nodes represent possible states and arcs possible transitions between states, where parameters that have an effect on the hazard are indicated. b. Subtype-specific risk of relapse at diagnosis. Transition probabilities from surgery to other states (DF=Disease-free, LR=Loco-regional relapse, DR=Distant relapse, D/C=Cancer specific death, D/O=Death by other causes) are shown for individual average patients across the breast cancer subtypes. Subtypes were defined based on ER status using the FD and for IHC, PAM50 and integrative (IntClust) subtypes using the MD. 95% confidence bands (shaded areas) were computed using the bootstrap (see Methods).
Figure 2. The integrative breast cancer subtypes exhibit distinct patterns of relapse.
a. Mean probabilities of having a relapse after surgery and after being 5 and 10 years disease-free for the patients in each of the 11 integrative (IntClust/IC) subtypes, ordered by increasing risk of relapse. IC3, IC7, IC8 and IC4ER+ represent lower risk ER+ subtypes; IC10 and IC4ER− TNBC subtypes with variable relapse patterns; IC1, IC6, IC9 and IC2 late relapsing ER+ subtypes; and IC5 HER2+ tumors prior to trastuzumab. Error bars represent 95% confidence intervals. The lower colored bar shows the prevalence of each integrative subtype in the breast cancer population. b. Frequencies of copy number amplifications in specific IntClust subtypes (IC1, IC6, IC9 and IC2). Putative driver genes indicated by an asterisk. c. Proportion of ER+ tumors that belong to the four late-relapsing IntClust subtypes. This analysis was done with the MD.
Figure 3. The integrative subtypes improve prediction of late distant recurrence in ER+/HER2− breast cancer beyond clinical covariates.
a. Probabilities of distant relapse (DR) or disease-specific death (DSD) amongst ER+/HER2− patients who were disease free at 5 years post diagnosis reveal significant risk for IntClust (IC) 1,2,6,9 relative to IC3, which varies over time and is not captured by the standard clinical model. Dots represent average probabilities and error bars 95% confidence intervals. b. Average probabilities of DR or DSD for ER+/HER2− patients in the four late-relapsing subgroups relative to IC3 for patients who were relapse free five years post diagnosis. c. Evaluation of the utility of the IHC model relative to the IntClust model for predicting late DR in ER+/HER2− patients who were relapse-free at 5 years. C-indices are shown for both models at different time intervals in the METABRIC cohort (n=1337, ER+/HER2− n=1013) and the external validation cohort (n=1080, ER+/HER2− n=739). Error bars represent 95% confidence intervals. This analysis was done with the MD.
Figure 4. Organ-specific patterns and timing of distant relapse in ER+ and ER− patients.
a. Percentages of patients and cumulative incidence (1-Kaplan-Meier estimates) for each site of metastasis in ER+ and ER− cases. Upright triangles indicate significant positive differences and inverted triangles indicate significant negative differences in that group with respect to the overall mean (see Methods). b. Relapse-free survival curves for sequential recurrences in ER− (n=186) and ER+ (n=419) patients computed using a conditional PWP model. Each curve shows the probability of not having any other relapse for individuals that had a previous relapse. The top bar shows the median time until the n-th relapse. c. Log Hazard ratios of disease-specific death (DSD) with 95% confidence intervals of the time-dependent Cox model for distant relapse (DR) in ER− (n=179) and ER+ (n=410) patients. This analysis was done with the RD.

References

    1. Blows FM et al. Subtyping of breast cancer by immunohistochemistry to investigate a relationship between subtype and short and long term survival: A collaborative analysis of data for 10,159 cases from 12 studies. PLoS Med. 7, (2010).
    2. Davies C et al. Long-term effects of continuing adjuvant tamoxifen to 10 years versus stopping at 5 years after diagnosis of oestrogen receptor-positive breast cancer: ATLAS, a randomised trial. Lancet 381, 805–816 (2013).
    3. Sestak I et al. Factors predicting late recurrence for estrogen receptor-positive breast cancer. J. Natl. Cancer Inst. 105, 1504–1511 (2013).
    4. Sgroi DC et al. Prediction of late distant recurrence in patients with oestrogen-receptor-positive breast cancer: A prospective comparison of the breast-cancer index (BCI) assay, 21-gene recurrence score, and IHC4 in the TransATAC study population. Lancet Oncol. 14, 1067–1076 (2013).
    5. Pan H et al. 20-Year Risks of Breast-Cancer Recurrence after Stopping Endocrine Therapy at 5 Years. N. Engl. J. Med. 377, 1836–1846 (2017).
