Genome Biol. 2014 Aug 28;15(8):431.
doi: 10.1186/s13059-014-0431-1.

Genome-driven integrated classification of breast cancer validated in over 7,500 samples

H Raza Ali et al. Genome Biol. 2014.

Abstract

Background: IntClust is a classification of breast cancer comprising 10 subtypes based on molecular drivers identified through the integration of genomic and transcriptomic data from 1,000 breast tumors and validated in a further 1,000. We present a reliable method for subtyping breast tumors into the IntClust subtypes based on gene expression and demonstrate the clinical and biological validity of the IntClust classification.

Results: We developed a gene expression-based approach for classifying breast tumors into the ten IntClust subtypes by using the ensemble profile of the index discovery dataset. We evaluate this approach in 983 independent samples for which the combined copy-number and gene expression IntClust classification was available. Only 24 samples are discordantly classified. Next, we compile a consolidated external dataset composed of a further 7,544 breast tumors and use our approach to classify all samples into the IntClust subtypes. All ten subtypes are observable in most studies at comparable frequencies. The IntClust subtypes are significantly associated with relapse-free survival and recapitulate patterns of survival observed previously. In studies of neo-adjuvant chemotherapy, IntClust reveals distinct patterns of chemosensitivity. Finally, patterns of expression of genomic drivers reported by TCGA (The Cancer Genome Atlas) are better explained by IntClust than by the PAM50 classifier.
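The agreement implied by the validation figures above (24 discordant samples out of 983) can be worked out directly. The short sketch below only restates that arithmetic; the function name `concordance` is a hypothetical helper introduced for illustration and is not part of the published method.

```python
def concordance(n_total: int, n_discordant: int) -> float:
    """Fraction of samples given the same subtype by both classifiers."""
    if n_total <= 0 or not 0 <= n_discordant <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_discordant <= n_total")
    return (n_total - n_discordant) / n_total


# Counts taken from the abstract: 983 validation samples, 24 discordant.
rate = concordance(983, 24)
print(f"Concordant samples: {983 - 24} of 983 ({rate:.1%})")
```

With these counts, 959 of the 983 samples receive the same subtype from the expression-based classifier and the combined classifier, which is roughly 97.6% agreement.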

Conclusions: IntClust subtypes are reproducible in a large meta-analysis, show clinical validity and best capture variation in genomic drivers. IntClust is a driver-based breast cancer classification and is likely to become increasingly relevant as more targeted biological therapies become available.

Figures

Figure 1
Reproducible IntClust gene expression profiles enable accurate classification. (A) Cross-tabulation of IntClust subtypes classified according to the combined (copy number and gene expression) classifier and the expression-based classifier in the METABRIC validation dataset (N = 983). Intensity of box colors is proportional to the depicted value. (B) Comparison of average gene-expression profiles for all 10 IntClust groups in the METABRIC discovery set (left) and TCGA samples (right). The x-axis is genomic position and the y-axis is z-score log2-normalised gene expression level. (C) Scatter plot of the goodness of fit, number of samples and number of available features for expression-based IntClust classification by each study. GOF, goodness of fit.
Figure 2
Distribution of IntClust and transcriptome-based subtypes by study. (A) Bar charts depicting the proportion of samples that belong to each subtype for IntClust (bottom panel), PAM50-based (middle panel), and SCMGENE-based (top panel) classification by study. The total number of samples in each study (N) is depicted at the top of the bars. (B) Bar charts depicting the relative proportions of PAM50 and SCMGENE subtypes within IntClust subtypes, separately for the METABRIC and external studies.
Figure 3
Association between subtype and clinical outcome. (A) Survival plots by subtype for external studies with available time-to-event data. The METABRIC study is excluded. (B) Comparison of univariable hazard ratios (boxes) and 95% confidence intervals (vertical lines) for IntClust subtypes, taking IntClust 3 as referent, for each of three follow-up brackets (0 to 4 years, 4 to 8 years, 8 to 15 years) separately for cases in the METABRIC study and those in external studies. Box sizes are weighted according to sample size. (C) Bar-charts depicting the proportion of tumors that underwent pathological complete response (pCR) by molecular subtype in all external studies of neo-adjuvant chemotherapy.
Figure 4
Explained variation in gene expression levels of genes contained within TCGA-defined regions of recurrent copy number alteration in breast cancer. Forest plots of the average differences in adjusted R-squared statistics between classifiers (IntClust and PAM50) by study according to genes within loci recurrently amplified (red) or deleted (blue) in breast cancer. Boxes represent point estimates where box size is weighted according to study sample size and horizontal lines depict 95% CIs. Point estimates and confidence intervals are based on bootstrap resampling of 1,000 replicates. Diamonds depict the weighted average difference.
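The comparison described in this caption, a bootstrap estimate of the difference in adjusted R-squared between two classifiers, can be illustrated with a minimal sketch. This is not the authors' code: it assumes a simple one-way model (expression of a single gene explained by subtype label), a percentile confidence interval, and toy data. The function names, the labels and the expression values are all hypothetical.

```python
import random
from collections import defaultdict


def adjusted_r2(values, labels):
    """Adjusted R-squared of a one-way model: values explained by group labels."""
    n = len(values)
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    k = len(groups)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0 or n - k <= 0:
        return float("nan")  # undefined for degenerate resamples
    ss_within = sum(
        sum((v - sum(vs) / len(vs)) ** 2 for v in vs) for vs in groups.values()
    )
    r2 = 1 - ss_within / ss_total
    # k groups use k - 1 predictors, so the residual df is n - k.
    return 1 - (1 - r2) * (n - 1) / (n - k)


def bootstrap_diff(values, labels_a, labels_b, n_rep=1000, seed=0):
    """Mean and 95% percentile CI of adjusted R2 (A) minus adjusted R2 (B)."""
    rng = random.Random(seed)
    n = len(values)
    diffs = []
    for _ in range(n_rep):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        v = [values[i] for i in idx]
        a = adjusted_r2(v, [labels_a[i] for i in idx])
        b = adjusted_r2(v, [labels_b[i] for i in idx])
        if a == a and b == b:  # skip replicates where either value is NaN
            diffs.append(a - b)
    diffs.sort()
    lo = diffs[int(0.025 * len(diffs))]
    hi = diffs[min(len(diffs) - 1, int(0.975 * len(diffs)))]
    return sum(diffs) / len(diffs), lo, hi


# Toy data (hypothetical): classifier A tracks expression, classifier B does not.
expr = [0.1, 0.2, 0.0, 0.15, 1.0, 1.1, 0.9, 1.05, 2.0, 2.1, 1.9, 2.05]
subtype_a = ["x"] * 4 + ["y"] * 4 + ["z"] * 4
subtype_b = ["p", "q"] * 6
mean_diff, ci_lo, ci_hi = bootstrap_diff(expr, subtype_a, subtype_b)
print(f"mean difference {mean_diff:.3f}, 95% CI [{ci_lo:.3f}, {ci_hi:.3f}]")
```

A positive difference means classifier A explains more of the variation in expression than classifier B. In the figure, per-study estimates of this kind are additionally combined into a sample-size-weighted average, which this sketch does not attempt.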
