Clin Cancer Res. 2018 May 1;24(9):2182-2193.
doi: 10.1158/1078-0432.CCR-17-3378. Epub 2018 Feb 9.

Pan-Cancer Molecular Classes Transcending Tumor Lineage Across 32 Cancer Types, Multiple Data Platforms, and over 10,000 Cases

Fengju Chen et al. Clin Cancer Res.

Abstract

Purpose: The Cancer Genome Atlas data resources represent an opportunity to explore commonalities across cancer types involving multiple molecular levels, but tumor lineage and histology can represent a barrier in moving beyond differences related to cancer type.

Experimental Design: On the basis of gene expression data, we classified 10,224 cancers, representing 32 major types, into 10 molecular-based "classes." Molecular patterns representing tissue or histologic dominant effects were first removed computationally, with the resulting classes representing emergent themes across tumor lineages.

Results: Key differences involving mRNAs, miRNAs, proteins, and DNA methylation underscored the pan-cancer classes. One class expressing neuroendocrine and cancer-testis antigen markers represented ∼4% of cancers surveyed. Basal-like breast cancers segregated into an exclusive class, distinct from all other cancers. Immune checkpoint pathway markers and molecular signatures of immune infiltrates were most strongly manifested within a class representing ∼13% of cancers. Pathway-level differences involving hypoxia, NRF2-ARE, Wnt, and Notch were manifested in two additional classes enriched for mesenchymal markers and miR200 silencing.

Conclusions: All pan-cancer molecular classes uncovered here, with the important exception of the basal-like breast cancer class, involve a wide range of cancer types and would facilitate understanding the molecular underpinnings of cancers beyond tissue-oriented domains. Numerous biological processes associated with cancer in the laboratory setting were found here to be coordinately manifested across large subsets of human cancers. The number of cancers manifesting features of neuroendocrine tumors, a disease known to occur in many different tissues, may be much higher than previously thought.

Clin Cancer Res; 24(9); 2182-93. ©2018 AACR.
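The computational removal of tissue-dominant effects described under Experimental Design can be illustrated with a minimal sketch. It assumes the simplest form such a step could take, centering one gene's values on the median within each cancer type; the abstract does not specify the exact transform, and the function name and toy values below are hypothetical.

```python
from collections import defaultdict
from statistics import median


def center_within_cancer_type(values, cancer_types):
    """Subtract each cancer type's median from that type's values (one gene).

    Assumption: median centering stands in for the paper's unspecified
    within-cancer-type normalization.
    """
    by_type = defaultdict(list)
    for value, cancer_type in zip(values, cancer_types):
        by_type[cancer_type].append(value)
    medians = {t: median(vs) for t, vs in by_type.items()}
    return [v - medians[t] for v, t in zip(values, cancer_types)]


# Toy log-expression values for one gene in two TCGA projects (made-up numbers).
expr = [2.0, 4.0, 6.0, 10.0, 12.0, 14.0]
types = ["BRCA", "BRCA", "BRCA", "LUAD", "LUAD", "LUAD"]
centered = center_within_cancer_type(expr, types)
```

After centering, the large offset between the two cancer types is gone and only the within-type variation remains, which is the kind of signal a lineage-independent classification would then operate on.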

Conflict of interest statement

Disclosure of potential conflicts of interest: The authors have no conflicts of interest.

Figures

Figure 1. Molecular classes of TCGA cancers that transcend tumor lineage or tissue-of-origin
(A) Using an alternative molecular classification approach, whereby differences between cancer types were first removed computationally prior to classification on the basis of mRNA expression data, ten major pan-cancer “classes” were identified. The first heat map shows differential mRNA expression patterns (values normalized within each main cancer type) for a set of 854 genes found to best distinguish between the ten subtypes (see Methods). The second shows differential expression patterns for a select set of genes representing pathways of particular interest. Numbers of cases (n=10224) denote representation on RNA-seq data platform. (B) Molecular features from other data platforms associating with pan-cancer molecular class. Top heat map shows differential expression patterns (values normalized within each cancer type), representing a top set of 50 miRNA features that distinguish between the ten molecular classes from part A. The second heat map shows differential protein expression patterns (by RPPA platform, values normalized within each cancer type), representing a top set of 25 features that distinguish between the ten subtypes. The third heat map shows differential DNA methylation patterns (values centered within each cancer type) for a top set of features that distinguish a class associated with basal-like breast cancer. Additional sample-level data tracks denote levels of genome-wide copy number alteration, cancer type (according to TCGA project, color coding in part C), BRCA Pam50 subtype, and estimated tumor sample purity (47) (white, ~100% purity). (C) The percent representations of each pan-cancer class by cancer type (according to TCGA project) are represented using a colorgram. (D) Differences in patient overall survival among the pan-cancer molecular classes. P values by stratified log-rank test incorporating cancer type as a confounder. Overall p value evaluates for significant differences among the groups as defined by pan-cancer class. (E) Significance of overlap between the pan-cancer class assignments made in the present study (columns), with molecular-based subtype assignments (rows) made previously for a subset of cases. P-values by one-sided Fisher’s exact test; only p-values with False Discovery Rate (FDR) < 0.1 (48) are represented. See Methods for TCGA project abbreviations. See also Supplementary Figures 1 to 7 and Supplementary Data 1 and 2.
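The overlap test in part E can be sketched as a one-sided Fisher's exact test on a 2x2 table, followed by an FDR adjustment. This is only an illustration: the Benjamini-Hochberg procedure is an assumption here, since the FDR method cited as (48) is not described in the caption, and the function names and counts are hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided (enrichment) Fisher's exact p-value for the table [[a, b], [c, d]].

    a = cases in both the pan-cancer class and the prior subtype.
    Returns P(overlap >= a) under the hypergeometric null.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (an assumed stand-in for the cited FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: 8 of 10 class members fall in a prior subtype held by 9 of 20 cases.
p_overlap = fisher_one_sided(8, 2, 1, 9)
fdr = benjamini_hochberg([p_overlap, 0.04, 0.03, 0.5])
```

In a figure like part E, only class-by-subtype pairs whose adjusted value falls below the chosen FDR cutoff (0.1 in the caption) would be displayed.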
Figure 2. Pathway-associated gene signatures across pan-cancer molecular classes
(A) By pan-cancer molecular class, pathway-associated mRNA signatures (using values normalized within each cancer type). See Figure 1C and part D for cancer type color legend. Numbers of cases (n=10224) denote representation on RNA-seq data platform. (B) Corresponding to cases from part A, heat maps showing DNA methylation and expression levels for miR-200 family members (using values normalized or centered within each cancer type). Representative DNA methylation probes (49) that map to the promoter of each miRNA cluster are shown (miR-141/200c = cg24702147, miR-200a/200b/429 = cg15822328). Normalized expression levels for a set of canonical epithelial or mesenchymal markers (11) are also shown. (C) Scatter plot of differential methylation vs differential expression (using values normalized or centered within each cancer type), for cg24702147 versus miR-141/200c (normalized values for the two miRNAs being averaged). Numbers of cases denote representation on all three data platforms for mRNA-seq, miRNA-seq, and 450K DNA methylation. Data points are colored according to pan-cancer class, as represented in parts A and B. (D) For c7 and c8 pan-cancer classes (associated with mesenchymal cells, along with hypoxia, NRF2/KEAP1, Wnt, and Notch signatures), distributions by cancer type. See also Supplementary Figures 8 to 12 and Supplementary Data 3 and 4.
Figure 3. Normal tissue and cell type associations with the pan-cancer molecular classes
(A) Inter-profile correlations were computed between TCGA expression profiles (with values normalized within each cancer type) and profiles from the FANTOM consortium expression dataset of various cell types or tissues from human specimens (n=850 profiles) (10). Membership of the FANTOM profiles in general categories of “cancer”, “cell line”, “immune” (immune cell types or blood or related tissues), “CNS” (related to central nervous system including brain), “squamous” (including bronchial, trachea, oral regions, throat and esophagus regions, nasal regions, urothelial, cervix, sebocyte, keratin/skin/epidermis), “fibroblast”, or “adipocyte/heart” is indicated. Cancer type color coding in part D. (B) Heat maps of differential expression (values normalized within each cancer type), for genes encoding immunotherapeutic targets (top), for LCK and SYK proteins (middle, representing markers for T-cells and B-cells, respectively), and for genes encoding cancer-testis (CT) antigens (from the CT Gene Database, http://cancerimmunity.org/resources/ct-gene-database/). (C) Heat maps of differential expression (values normalized within each cancer type), for genes encoding canonical markers of neuroendocrine tumors (top), and for a set of 51 genes in a panel of neuroendocrine tumor (NET) markers (34), as uncovered previously using gene expression profiling (bottom). (D) For c3 (immune-associated), c10 (immune-associated), and c4 (CNS- and neuroendocrine-associated) pan-cancer classes, distributions by cancer type. See also Supplementary Figure 13 and Supplementary Data 5.
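The inter-profile correlations in part A can be pictured with a small sketch that scores one within-type-normalized tumor profile against a few reference profiles. The caption does not state which correlation coefficient was used, so Pearson correlation is an assumption; the function names, category labels as dictionary keys, and all numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def score_against_references(tumor_profile, reference_profiles):
    """Correlate one tumor profile with each reference; return the best label and all scores."""
    scores = {name: pearson(tumor_profile, ref) for name, ref in reference_profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy four-gene profiles (made-up values), already centered within cancer type.
tumor = [1.0, -1.0, 2.0, -2.0]
references = {
    "immune": [2.0, -2.0, 4.0, -4.0],
    "fibroblast": [-1.0, 1.0, -2.0, 2.0],
}
best_label, all_scores = score_against_references(tumor, references)
```

Repeating this for every tumor and every reference profile would yield the tumor-by-reference correlation matrix that a heat map such as part A summarizes.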
Figure 4. Differentially active pathways across pan-cancer molecular classes
(A) Pathway diagram representing core metabolic pathways, with differential expression patterns represented (using values normalized within cancer type), comparing tumors in pan-cancer classes c1, c6, or c8 with tumors in the other seven classes (red, significantly higher in c1/c6/c8). (B) Diagram of tumor-associated macrophage roles in the tumor microenvironment (35), and of Notch, NRF2-ARE, and Wnt/beta-catenin pathways, with differential expression patterns represented (using values normalized within cancer type), comparing tumors in pan-cancer classes c3, c7, or c8 with tumors in the other seven classes (red, significantly higher in c3/c7/c8). (C) Diagram of immune checkpoint pathway (featuring interactions between T cells and antigen-presenting cells, including tumor cells), with differential expression patterns represented (using values normalized within cancer type), comparing tumors in pan-cancer classes c3, c5, or c10 with tumors in the other seven classes (red, significantly higher in c3/c5/c10). P-values in parts A-C by Mann-Whitney U-test. FDR, false discovery rate.
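The class-versus-rest comparisons in parts A to C rely on the Mann-Whitney U-test. A minimal sketch for one gene follows, using midranks for ties and a two-sided normal approximation without tie or continuity correction; that simplification is an assumption for illustration, not the authors' implementation, and the function name and toy values are hypothetical.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for group x, two-sided p-value by normal approximation).

    Simplification: no tie or continuity correction is applied to the variance.
    """
    # Pool both groups, remembering membership (0 = x, 1 = y), and sort by value.
    pooled = sorted((value, group) for group, values in ((0, x), (1, y)) for value in values)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, erfc(abs(z) / sqrt(2))


# Toy normalized expression of one gene: tumors in the classes of interest vs. the rest.
in_classes = [5.0, 6.0, 7.0, 8.0]
other_classes = [1.0, 2.0, 3.0, 4.0]
u_stat, p_value = mann_whitney_u(in_classes, other_classes)
```

With thousands of tumors per comparison the normal approximation is reasonable; for very small groups an exact test would be preferable.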
Figure 5. Observation of patterns associated with TCGA pan-cancer molecular classes in an external multi-cancer expression profiling dataset
(A) Gene expression profiles of 2041 cancer cases of various pathologically defined cancer types, represented in the Expression Project for Oncology (expO) (GSE2109) dataset (profiles being normalized within their respective cancer type), were classified according to TCGA pan-cancer molecular class. Expression patterns for the top set of 854 mRNAs distinguishing between the ten TCGA molecular classes (from Figure 1A) are shown for both TCGA and GSE2109 datasets. Genes in the GSE2109 sample profiles sharing similarity with TCGA class-specific signature pattern are highlighted. (B) In the same manner as carried out for TCGA datasets, expO expression profiles were scored for pathway-associated gene signatures (from Figure 2A), surveyed for immune checkpoint markers and for CT antigen genes (from Figure 3B, using the same gene ordering), surveyed for canonical neuroendocrine tumor (NET) markers (from Figure 3C), and scored for similarity to normal cell type categories represented in the FANTOM dataset (from Figure 3A). Pan-cancer class associations of particular interest (which tend to follow the patterns first observed in TCGA cohort) are highlighted. The purple-cyan heat maps off to the right denote t-statistics for comparing the given class versus the rest of the tumors; dark purple or cyan corresponds approximately to p<0.01. Parts A and B have the same ordering of expO expression profiles. See also Supplementary Figures 14 to 16 and Supplementary Data 6.

References

    1. Perou C, Sørlie T, Eisen M, van de Rijn M, Jeffrey S, Rees C, et al. Molecular portraits of human breast tumours. Nature. 2000;406(6797):747–52.
    2. Cancer Genome Atlas Research Network, Weinstein J, Collisson E, Mills G, Shaw K, Ozenberger B, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nature Genetics. 2013;45(10):1113–20.
    3. Creighton C. The molecular profile of luminal B breast cancer. Biologics. 2012;6:289–97.
    4. Zhang Y, Kwok-Shing Ng P, Kucherlapati M, Chen F, Liu Y, Tsang Y, et al. A Pan-Cancer Proteogenomic Atlas of PI3K/AKT/mTOR Pathway Alterations. Cancer Cell. 2017 E-pub May 8.
    5. Hoadley K, Yau C, Wolf D, Cherniack A, Tamborero D, Ng S, et al. Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification within and across Tissues of Origin. Cell. 2014;158(4):929–44.
