Breast Cancer Res. 2012 Jul 27;14(4):R113.
doi: 10.1186/bcr3236.

The gene expression landscape of breast cancer is shaped by tumor protein p53 status and epithelial-mesenchymal transition

Erik Fredlund et al. Breast Cancer Res.

Abstract

Introduction: Gene expression data derived from clinical cancer specimens provide an opportunity to characterize cancer-specific transcriptional programs. Here, we present an analysis delineating a correlation-based gene expression landscape of breast cancer that identifies modules with strong associations to breast cancer-specific and general tumor biology.

Methods: Modules of highly connected genes were extracted from a gene co-expression network that was constructed based on Pearson correlation, and module activities were then calculated using a pathway activity score. Functional annotations of modules were experimentally validated with an siRNA cell spot microarray system using the KPL-4 breast cancer cell line, and by using gene expression data from functional studies. Modules were derived using gene expression data representing 1,608 breast cancer samples and validated in data sets representing 971 independent breast cancer samples as well as 1,231 samples from other cancer forms.
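The network construction described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input layout (`expr`, a mapping from gene name to expression values), the function names, the single-pass degree filter, and the use of connected components as "modules" are all hypothetical choices made for the example, and the toy values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coexpression_modules(expr, cutoff=0.6, min_degree=1):
    """Return connected components of a correlation-thresholded gene network.

    expr maps gene name -> expression values across samples (assumed layout).
    """
    # Connect every pair of genes whose correlation exceeds the cutoff.
    neighbors = {g: set() for g in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) > cutoff:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)

    # Keep genes with enough connecting edges (one pass, no iterative pruning).
    kept = {g for g, nb in neighbors.items() if len(nb) >= min_degree}

    # Collect connected components among the kept genes (depth-first search).
    modules, seen = [], set()
    for start in kept:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            g = stack.pop()
            if g in component:
                continue
            component.add(g)
            stack.extend((neighbors[g] & kept) - component)
        seen |= component
        modules.append(component)
    return modules


# Toy data: two groups of co-varying genes (made-up values).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [5.2, 0.9, 4.1, 2.2, 2.9],
}
modules = coexpression_modules(toy)
```

On the toy data, the two pairs of co-varying genes end up in separate components. The pathway activity score mentioned in the Methods is not specified in the abstract and is therefore not sketched here.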

Results: The initial co-expression network analysis resulted in the characterization of eight tightly regulated gene modules. Cell cycle genes were divided into two transcriptional programs, and experimental validation using an siRNA screen showed different functional roles for these programs during proliferation. The division of the two programs was found to act as a marker for tumor protein p53 (TP53) gene status in luminal breast cancer, with the two programs being separated only in luminal tumors with functional p53 (encoded by TP53). Moreover, a module containing fibroblast and stroma-related genes was highly expressed in fibroblasts, but was also up-regulated by overexpression of epithelial-mesenchymal transition factors such as transforming growth factor beta 1 (TGF-beta1) and Snail in immortalized human mammary epithelial cells. Strikingly, the stroma transcriptional program related to less malignant tumors for luminal disease and aggressive lymph node positive disease among basal-like tumors.

Conclusions: We have derived a robust gene expression landscape of breast cancer that reflects known subtypes as well as heterogeneity within these subtypes. By applying the modules to TP53-mutated samples, we shed light on the biological consequences of non-functional p53 in otherwise low-proliferating luminal breast cancer. Furthermore, as in the case of the stroma module, we show that the biological and clinical interpretation of a set of co-regulated genes is subtype-dependent.

Figures

Figure 1
A breast cancer gene expression network. (a) Genes (represented as blue squares) with pair-wise gene expression correlations above 0.3 in a dataset representing 1,608 breast cancer samples were connected by edges and visualized using network graphics. Genes with fewer than five connecting edges were removed to extract a highly interconnected network. The network is complex and hard to interpret, even though all connections are statistically significant. (b) Although the network is dominated by regions of lower correlations (blue), there are regions in which genes are connected by higher correlations (red). (c) By restricting the analysis to genes with correlations above 0.6, a network of eight visually distinct modules reflecting the high correlation areas in (b) was extracted. In this way, the complex network in (a) could be reduced to a network with gene modules related to tumor biological themes. (d) Correlation-based modules were verified by assaying co-expression in independent breast cancer gene expression datasets. All pair-wise Pearson correlations between genes within modules were calculated across all samples for two additional breast cancer datasets representing 676 and 295 samples, respectively. The mean correlation for each module, as depicted by colored boxes, was used as a measure of module co-expression reproducibility. M-Pr, mitotic progression; M-Cp, mitotic checkpoint. (e, f) Module expression acts as a surrogate marker for breast cancer molecular characteristics. (e) SR activity is high in ER-positive, but also in some ER-negative tumors. (f) Basal module activity is high in basal-like and normal-like tumors.
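The reproducibility measure in panel (d), the mean of all pair-wise Pearson correlations between genes within a module, could be computed along these lines. This is a minimal sketch: the data layout and the function names `pearson` and `mean_module_correlation` are assumptions introduced for illustration, and the values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mean_module_correlation(expr, module_genes):
    """Average Pearson correlation over all gene pairs within one module.

    expr maps gene name -> expression values across samples (assumed layout).
    """
    pairs = list(combinations(module_genes, 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


# Made-up validation data: three genes that rise together across samples.
validation = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [2.0, 4.0, 6.0],
    "g3": [3.0, 6.0, 9.0],
}
score = mean_module_correlation(validation, ["g1", "g2", "g3"])
```

A score near 1 indicates that the module's genes remain tightly co-expressed in the independent dataset, while a score near 0 indicates that the module does not reproduce.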
Figure 2
Separation of cell cycle genes into two modules is dependent on TP53 status. (a) Module genes were assayed for effects on proliferation in the KPL-4 breast cancer cell line using an RNAi-based cell spot microarray system. Knockdown of genes in the mitotic progression module significantly inhibited cell proliferation as assayed using Ki-67 staining intensity (P = 0.003, left panel), whereas knockdown of genes in the mitotic checkpoint module did not show any significant effects (P = 0.85, center panel). A group of non-specific control siRNAs showed that the majority of genes in the assayed siRNA library abrogate cellular proliferation (right panel). Module effects on KPL-4 proliferation were estimated by comparing the observed mean Ki-67 intensity for the module genes (black arrows) with background Ki-67 distributions (density curves) based on 10,000 random groups of the same size as the assayed module. P-values are one-sided. (b) The mitotic progression and checkpoint modules are separated in ER-positive breast cancer, but interconnected as a single module in ER-negative breast cancer. (c, d) The separation of the mitotic progression and checkpoint modules relates to sample TP53 mutation status. Interconnection between the mitotic progression and checkpoint modules was assayed using the NACC at increasing cut-off correlation levels in luminal A and luminal B samples. NACC was calculated in luminal samples with known TP53 mutation status from the (c) GSE3494 and (d) GSE22358 breast cancer datasets. TP53 wildtype (WT) samples showed a clear separation between the mitotic progression and checkpoint modules at increasing correlation cut-off levels (green lines). However, in TP53-mutated samples the modules remained interconnected at higher levels of correlation (red lines). The NACC for TP53-mutated samples was compared to 10,000 random selections of the same number of TP53 WT samples (black dashed lines), and stars denote permutation-based P-values below 0.05. Error bars represent standard deviations. (e) Luminal samples from the U133A set were divided into quartile groups based on TP53 expression, and the NACC between the mitotic progression and checkpoint modules was calculated within these groups. Decreasing TP53 expression correlated with a higher level of interconnection between the mitotic progression and checkpoint modules, with the lowest TP53 expression quartile samples showing distinctly higher module interconnection than the highest quartile samples. As a reference, the NACC for all luminal samples is shown (black dotted line). (f) Dichotomizing breast cancer patients of either luminal A (LumA) or luminal B (LumB) subtype on mitotic progression module activity did not add prognostic information (P = 0.6 and P = 0.09, log-rank tests) using DMFS as endpoint, (g) while an above-mean activity of the mitotic checkpoint module identified groups within both luminal A and luminal B tumors with worse prognosis (luminal A P = 3 × 10^-5, luminal B P = 0.01, log-rank tests).
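The random-group comparison described in panel (a) can be illustrated with a short sketch. It assumes that lower Ki-67 intensity means reduced proliferation, so the one-sided test counts random groups whose mean is at or below the observed module mean. The function name, the seed handling, and the add-one correction are assumptions made for this example and are not taken from the paper.

```python
import random


def one_sided_permutation_p(module_values, background_values, n_draws=10_000, seed=0):
    """One-sided P-value for a low module mean against random same-size groups.

    module_values: intensities for the module genes (hypothetical input).
    background_values: intensities for all assayed genes (hypothetical input).
    """
    rng = random.Random(seed)
    k = len(module_values)
    observed = sum(module_values) / k

    # Count random groups whose mean is at least as low as the observed mean.
    hits = 0
    for _ in range(n_draws):
        group = rng.sample(background_values, k)
        if sum(group) / k <= observed:
            hits += 1

    # Add-one correction (an assumed convention) avoids reporting P = 0.
    return (hits + 1) / (n_draws + 1)


# Made-up background of 100 intensities; the "module" holds the three lowest.
background = [float(v) for v in range(100)]
p_low = one_sided_permutation_p([0.0, 1.0, 2.0], background)
p_high = one_sided_permutation_p([97.0, 98.0, 99.0], background)
```

With this setup, a module made of the lowest background values yields a very small P-value, while a module made of the highest values yields a P-value of 1 because every random group has a mean at or below it.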
Figure 3
The stroma module represents mesenchymal cell characteristics. (a) Hierarchical clustering of module activity scores, calculated in data representing 51 breast cancer cell lines, showed separation into the main cell line types: luminal, basal A and basal B [50]. Black arrows denote cell lines characterized as representing a claudin-low phenotype [51]. M-Pr, mitotic progression; M-Cp, mitotic checkpoint. (b) Expression of EMT-inducing factors increases expression of genes from the stroma module. Data for the 187 module genes from a dataset representing overexpression of TGF-beta1, Twist, Gsc or Snail in immortalized human mammary epithelial cells were visualized using heatmaps. Data are shown as fold changes in relation to mock transfection control. (c) A high stroma module activity score correlates with a more favorable prognosis in patients of the luminal A subtype (P = 0.04, log-rank test), whereas (d) an opposite trend was observed for patients with tumors of the basal-like subtype (P = 0.07, log-rank test). Patients were dichotomized based on a stroma module activity score above or below the mean within each subtype. (e) Among patients classified as basal-like, a high stroma module activity score correlated with node-positive disease (P = 0.007, t-test). (f) Among patients classified as luminal A, a higher stroma module activity score, quantized into four groups, correlated with a smaller tumor size (P = 8 × 10^-4, ANOVA). (g) Hierarchical clustering of primary breast fibroblasts, fibroblast-like (claudin-low) breast cancer cell lines, and breast cancer cell lines, based on expression of genes in the stroma module. Data from GSE13915 [54].
Figure 4
The breast cancer-derived gene expression modules are preserved across several cancer forms. (a) The breast cancer gene expression modules were assayed for co-expression in data representing seven other cancer forms by calculating the average pair-wise Pearson correlation for genes within each module separately. All observed correlations were significant as compared to random average pair-wise correlations based on 1,000 permutations (data not shown). M-Pr, mitotic progression; M-Cp, mitotic checkpoint. (b) A high activity score of the mitotic progression module correlated with increasing grade in an ovarian carcinoma dataset (n = 285, P = 2 × 10^-14, ANOVA). (c) An above-mean expression of genes in the stroma module correlates with decreased disease-specific survival (DSS) in a colon carcinoma dataset (n = 177, P = 0.003, log-rank test). (d) A high immune response (IR) module activity correlated with favorable overall survival (OS) in a dataset representing 57 stage IV melanomas (P = 0.02, log-rank test). (e) Calculation of pair-wise Pearson correlations in an NSCLC dataset for genes in the breast cancer basal module (blue network) revealed that only a subset of these genes were correlated in NSCLC (red network). A core basal gene expression module (n = 5) was derived from genes with conserved correlations in both breast and lung cancer data (red network). (f) A high expression sum for the core basal module acted as a marker for squamous cell lung carcinoma (SCC) compared to the other NSCLC morphological types adenocarcinoma (ADC) and large cell carcinoma (LCC) (P = 5 × 10^-24, ANOVA).

References

    1. Bogaerts J, Cardoso F, Buyse M, Braga S, Loi S, Harrison JA, Bines J, Mook S, Decker N, Ravdin P, Therasse P, Rutgers E, van 't Veer LJ, Piccart M. Gene signature evaluation as a prognostic tool: challenges in the design of the MINDACT trial. Nat Clin Pract Oncol. 2006;3:540–551.
    2. Paik S. Development and clinical utility of a 21-gene recurrence score prognostic assay in patients with early breast cancer treated with tamoxifen. Oncologist. 2007;12:631–635. doi: 10.1634/theoncologist.12-6-631.
    3. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
    4. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
    5. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, Thorsen T, Quist H, Matese JC, Brown PO, Botstein D, Eystein Lonning P, Borresen-Dale AL. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA. 2001;98:10869–10874. doi: 10.1073/pnas.191367098.
