EBioMedicine. 2019 Mar;41:120-133.
doi: 10.1016/j.ebiom.2019.01.064. Epub 2019 Feb 22.

The prognostic landscape of interactive biological processes presents treatment responses in cancer

Bin He et al. EBioMedicine. 2019 Mar.

Abstract

Background: Differential gene expression patterns are commonly used as biomarkers to predict treatment responses among heterogeneous tumors. However, the link between response biomarkers and treatment-targeting biological processes remains poorly understood. Here, we develop a prognosis-guided approach to establish the determinants of treatment response.

Methods: The prognoses of biological processes were evaluated by integrating the transcriptomes and clinical outcomes of ~26,000 cases across 39 malignancies. Gene-prognosis scores of 39 malignancies (GEO datasets) were used for examining the prognoses, and TCGA datasets were selected for validation. The Oncomine and GEO datasets were used to establish and validate transcriptional signatures for treatment responses.

Findings: The prognostic landscape of biological processes was established across 39 malignancies. Notably, the prognoses of biological processes varied among cancer types, and the transcriptional features underlying these prognostic patterns distinguished responses to treatments targeting specific biological processes. Applying this metric, we found that low tumor proliferation rates predicted favorable prognosis, whereas elevated cellular stress response signatures signified resistance to anti-proliferation treatment. Moreover, while high immune activities were associated with favorable prognosis, enhanced lipid metabolism signatures distinguished immunotherapy-resistant patients.

Interpretation: These findings linking prognosis and treatment response provide further insights into patient stratification for precision treatments and offer opportunities for further experimental and clinical validation.

Funding: National Natural Science Foundation, Innovative Research Team in University of Ministry of Education of China, National Key Research and Development Program, Natural Science Foundation of Guangdong, Science and Technology Planning Project of Guangzhou, MRC, CRUK, Breast Cancer Now, Imperial ECMC, NIHR Imperial BRC and NIH.

Keywords: Biological processes; Cell-proliferation; Immune processes; Prognosis; Treatment response.

Figures

Fig. 1
GSEA defines the prognoses of biological processes (a) Overview of the working model. Response to a treatment in heterogeneous patients depends on two factors according to the treatment-response model: 1) inactivation of the target and 2) prognostic contribution of the target. An alternative target-inactivating treatment rescues type II, but not type I, resistance, and the mechanisms of type I resistance are key to stratifying patients and to effective treatment. The prognoses of treatment-targeting biological processes in distinct cancer types were evaluated by integrating transcriptomic and clinical outcomes in datasets. Gene signatures and interacting biological processes that determine prognostic variation and treatment response were established to distinguish treatment responses. PrognosisTarget, the prognostic score of treatment-targeting processes. InactivationTarget, the ratio of target inactivated by a specific treatment. (b) A detailed analysis workflow. Black boxes indicate input or output data and blue boxes indicate analysis methods. Based on the gene sets defined by biological processes and treatment interventions available in the MSigDB database, a method was developed that relied on genome-wide gene-prognosis metadata (PRECOG (GEO microarray datasets) or TCGA (in-house generated RNA-seq datasets) gene-prognosis data) and the GSEA algorithm. This prognosis-ranked GSEA provided an estimate of the prognostic values of biological gene sets. Metagene signatures were developed according to cancer type dependent prognostic and transcriptional features, and validated by a variety of bioinformatic resources. (c) Representative enrichment plots for chemical and genetic perturbations (CGP) gene set pairs (_UP: upregulated by perturbation, _DN: downregulated by perturbation) in prognosis-ranked GSEA (pan-cancer prognostic z-score). NES, normalized enrichment score. AdvP, Adverse Prognostic Phenotype; FavP, Favorable Prognostic Phenotype. FDR, false discovery rate. FDR and nominal P values were defined by GSEA software. (d) Prognostic NES of 57 paired CGP gene sets in PRECOG prognosis-ranked GSEA (pan-cancer prognostic z-score). Gene sets of known oncogenes or treatment perturbations are labeled in red. Orange bars indicate “UP” (upregulated by perturbation) gene sets and blue bars indicate “DN” (downregulated by perturbation) gene sets. (e) Plots indicating –Log10HR (Hazard Ratio) and –Log10P of GSK3iDNLead and RapaDNLead genes in the TCGA and GEO datasets. Leading edge genes were defined by GSEA as the genes in the gene set that appear in the ranked list at or before the point where the running sum reaches its maximum deviation from zero; they are interpreted as the core of a gene set that accounts for the enrichment signal. P values (log-rank test) and HR were determined between the two patient groups stratified by median expression level of leading edge genes using Kaplan Meier survival analysis. Each point stands for an independent dataset. Red and blue dots indicate adverse and favorable prognosis, respectively. (f) Representative curves for Kaplan Meier analyses of patients with low (green) or high (red) GSK3iDNLead (left) or RapaDNLead (right) expression (stratified by median expression, n = 249/249 for GSE62564-NEUB, n = 43/42 for TCGA-MESO and n = 253/253 for TCGA-KIRC) in the cancer datasets. P values were determined using the log-rank test.
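The leading-edge definition in panel (e) can be illustrated with a minimal Python sketch of the GSEA running-sum statistic. This is not the GSEA software used in the study: the function name `enrichment_score`, the `weight` parameter and the toy gene names are assumptions made for illustration, and the function returns the raw enrichment score (ES) rather than the permutation-normalized NES reported in the figure.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return (ES, leading_edge) for one gene set over a ranked gene list.

    ranked_genes: genes sorted by decreasing score (for example a
                  prognostic z-score); scores: the matching score list.
    Illustrative sketch only; names and defaults are assumptions.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    # Hits step the running sum up in proportion to |score|**weight;
    # misses step it down by a constant amount.
    hit_w = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, in_set)]
    total = sum(hit_w)
    if total == 0:
        raise ValueError("all gene-set members have a zero score")

    running, best, best_idx = 0.0, 0.0, -1
    for i, (is_hit, w) in enumerate(zip(in_set, hit_w)):
        running += w / total if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):  # maximum deviation from zero
            best, best_idx = running, i

    if best >= 0:
        # Positive ES: set members at or before the peak.
        span = range(0, best_idx + 1)
    else:
        # Negative ES: set members after the trough.
        span = range(best_idx + 1, len(ranked_genes))
    leading_edge = [ranked_genes[i] for i in span if in_set[i]]
    return best, leading_edge
```

With a list ranked by decreasing prognostic z-score, a positive ES corresponds to enrichment among adverse-prognosis genes and a negative ES to enrichment among favorable-prognosis genes, matching the sign convention the captions use for NES.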
Fig. 2
The prognostic landscape of biological processes in cancer (a) The NES of hallmark biological processes in prognosis-ranked GSEA (pan-cancer prognostic z-score). NES values were generated for all 50 GSEA hallmark processes using the pan-cancer prognostic z-scores ranked in decreasing order. Gene sets enriched in adverse (NES > 0, left) and favorable (NES < 0, right) prognoses were ranked by NES. Red: cell-proliferation related gene sets. Blue: immune gene sets. FDR and nominal P values were defined by GSEA software. (b) and (c) Plots indicating –Log10HR and –Log10P of the cell-proliferation programs (b) and immune processes (c) in the TCGA and GEO datasets. CycleLead here includes the 49 leading edge genes from the E2F_TARGETS and MYC_TARGET_V1 gene sets, whereas ImmuneLead here includes the 46 leading edge genes from the INTERFERON_GAMMA and INTERFERON_ALPHA gene sets. P values and hazard ratios were determined by Kaplan Meier survival analyses of leading edge genes (patients were stratified by median gene expression). Each point stands for an independent dataset. Red and blue dots indicate adverse and favorable prognoses, respectively. (d) Representative curves for Kaplan Meier analyses of patients with low (green) or high (red) CycleLead (upper) or ImmuneLead (lower) expression in the GEO datasets (patients were stratified by median gene expression, n = 249/249 for GSE62564-NEUB, n = 148/147 for NKI-BRC, n = 253/253 for TCGA-KIRC, n = 82/81 for TCGA-SKCM and n = 40/40 for GSE10141-LIHC). P values were determined using the log-rank test. (e) Upper panel: Hierarchical clustering of prognostic NES for 50 hallmark gene sets in 39 malignancies. Abbreviations for cancer types are listed in Table S1. NES for the 50 hallmark gene sets were calculated in the GSEA software according to the prognostic z-scores of each cancer type, which were downloaded from the PRECOG website. Cancers with cell-proliferation programs conferring a favorable prognosis are marked in blue, whereas those with immune processes indicating poor prognosis are marked in red. Lower panel: Hierarchical clustering of NES for 6 representative gene sets in 19 TCGA datasets whose prognostic z-scores were generated in this study (see Methods). Sidebar, NES index for adverse (red) and favorable (green) prognoses. (f) and (g) Plots depicting prognostic NES of immune (f) and cell-proliferation (g) related GO processes assessed in 39 individual cancer types. Significant enrichments (P < 0.05, FDR q < 0.25) for each gene set are labeled in red (AdvP) and blue (FavP), respectively.
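The survival comparisons in these captions (patients stratified by median expression, P values from the log-rank test) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `median_split` and `logrank_test` are hypothetical, samples exactly at the median are assigned to the low group (the captions do not say how ties were handled), and the hazard ratio is not computed.

```python
import math
from statistics import median


def median_split(values):
    """Label each sample 'high' if above the cohort median, else 'low'.

    Assumption: samples equal to the median fall in the 'low' group.
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, P value).

    times:  follow-up time per patient
    events: 1 (or True) if the event was observed, 0 if censored
    groups: group label per patient (exactly two distinct labels)
    """
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly two groups are required")
    ref = labels[0]

    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        # Patients still at risk just before time t.
        at_risk = [(tt, e, g) for tt, e, g in zip(times, events, groups)
                   if tt >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == ref)
        d = sum(1 for tt, e, _ in at_risk if tt == t and e)
        d1 = sum(1 for tt, e, g in at_risk if tt == t and e and g == ref)
        # Observed minus expected events in the reference group.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance term
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no informative event times")
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A typical use would be `groups = median_split(signature_expression)` followed by `logrank_test(times, events, groups)`, with –Log10P then plotted per dataset as in panels (b) and (c).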
Fig. 3
Cell-proliferation PVS links prognosis to treatment response (a) Clustering of prognostic values for individual CycleLead genes (Fig. 2b) determined by the Human Pathology Atlas. –Log10PLogRank values were calculated (negative values for favorable prognosis, and positive values for unfavorable prognosis) and clustered by average linkage clustering. Cancer types with favorable prognoses in cell-proliferation associated leading edge genes are labeled in red. Sidebar, prognostic score index. (b) Kaplan Meier analysis of the “GO cell cycle” gene set in patients classified by their CycleR signature expression in the TCGA lung adenocarcinoma cohort. Expression levels (low, L; high, H) were defined according to the median value of the gene set expression of the cohort (n = 129/129/129/128 for four groups in TCGA-LUAD). P values were determined using the log-rank test. (c) Hierarchical clustering of ssGSEA enrichment scores of GOBP gene sets (MSigDB) that were differentially enriched between sensitive and resistant cell lines. Sensitivity to 22 compounds targeting the cell-proliferation program was defined by the AUC values in the Cancer Therapeutics Response Portal database. Transcriptomic profiles for cell lines were downloaded from the CCLE database. The enrichment of GOBP gene sets was analyzed by ssGSEAProjection in GenePattern. (d) Independent shRNAs targeting PLAT, PLAU or SERPINE1, or a control shRNA (shGFP), were delivered to MDA-MB-231 cells by lentivirus (pLKOTeton). Cells were selected for six days with puromycin (2 mg/ml) two days after transfection. After four days of doxycycline (Dox, 200 ng/ml) treatment, cells were subjected to a cell proliferation assay. For the proliferation assay, cells were maintained in Dox and treated with DMSO, MLN8237 (AURKA inhibitor, 200 nM) or Palbociclib (CDK4/6 inhibitor, 500 nM). Cell viability was determined with a CCK-8 kit and quantified relative to control cells (shGFP, Dox + DMSO). *, P < 0.01 (two-tailed Student’s t-test, n = 4). (e) The NES of CycleR and top enriched Hallmark gene sets in eight treatment response GEO datasets. GSEA was performed in pre-treated transcriptomes of patients classified as responders and non-responders according to their response to treatments (including glucocorticoids, platinum, taxanes and doxorubicin). (f) Enrichment plots for CycleR in six GEO datasets. GSEA was performed in transcriptomes of responders and non-responders in each dataset. (g) and (h) Hierarchical clustering of CycleC (g) and CycleR (h) genes in TCGA glioblastoma (upper) and pancreatic cancer (lower) cohorts. Yellow boxes indicate transcriptomic subsets with high gene expression. Sidebars in (g) and (h) show the gene expression index. (i) Expression of CycleC (left) and CycleR (right) in TCGA glioblastoma subtypes. Glioblastomas were pre-classified according to their transcriptional features. Signature expression is defined as the mean value of all genes in the signature. *, P < 0.05; **, P < 0.01; ****, P < 0.0001 (Dunnett multiple comparisons following one-way ANOVA). (j) Expression of CycleC (left) and CycleR (right) in TCGA glioblastoma. Glioblastomas were pre-classified according to their IDH mutation status. Signature expression is defined as the mean value of all genes in the signature. ****, P < 0.0001 (two-tailed Student’s t-test).
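Two conventions in this caption, signature expression as the mean of the signature's genes and low/high calls relative to the cohort median, can be sketched as below. The function names, the dictionary-per-sample data layout and the gene names in the usage note are hypothetical. The sketch applies two independent cohort-wide median splits to form four groups, which is an assumption: the caption does not state whether the second split was nested within the first, and the near-equal group sizes it reports may reflect a different procedure.

```python
from statistics import mean, median


def signature_score(sample, signature):
    """Mean expression of the signature genes measured in one sample.

    sample: dict mapping gene name to expression value (assumed layout).
    """
    values = [sample[g] for g in signature if g in sample]
    if not values:
        raise ValueError("no signature gene measured in this sample")
    return mean(values)


def low_high(scores):
    """Call each score 'H' if above the cohort median, else 'L'."""
    cut = median(scores)
    return ["H" if s > cut else "L" for s in scores]


def four_groups(samples, gene_set, signature):
    """Combine two median splits into labels such as 'LH' or 'HL'.

    First letter: gene set expression; second letter: signature
    expression. Both cut-offs are cohort-wide medians (assumption).
    """
    set_calls = low_high([signature_score(s, gene_set) for s in samples])
    sig_calls = low_high([signature_score(s, signature) for s in samples])
    return [a + b for a, b in zip(set_calls, sig_calls)]
```

For example, `four_groups(samples, ["g1", "g2"], ["r1"])` labels each sample by its gene set call followed by its signature call, and the resulting four groups can then be compared by Kaplan Meier analysis as in panel (b).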
Fig. 4
Immune interactive processes indicate immunotherapy response (a) The prognostic NES of the favorable LSF gene set (32 genes that are specifically expressed in lymphocytes and that exhibited favorable prognostic z-scores in > 60% (24/39) of PRECOG cancer datasets) in 19 matched TCGA (blue bars) and PRECOG (black bars) cancer types. (b) and (c) Kaplan Meier analysis of the “GO immune response” gene set in patient subsets classified by ImmuR (b) or ImmuC (c) signature expression in the TCGA melanoma (SKCM, b) and lung squamous cell carcinoma (LUSC, c) datasets. Expression level (low, L; high, H) was defined according to the median value of gene set expression of the cohort (n = 118/117/117/117 for four groups in SKCM, and n = 125/125/125/125 for four groups in LUSC). P values were determined using the log-rank test. (d) The combined ssGSEA enrichment scores of ImmuC and ImmuR gene sets (score ImmuC − score ImmuR) in individual patients from four independent datasets (GSE67501, GSE78220, GSE91061 and Aa5951). Patients were classified as responders (Res, black dots) and non-responders (NoR, red dots) to immunotherapy according to their clinical outcomes in each dataset. P values were calculated using a two-tailed Student’s t-test. (e) Expression of MGLL in individual patients from four independent datasets (GSE67501, GSE78220, GSE91061 and Aa5951). Patients were classified as responders (Res, labeled in black) or non-responders (NoR, labeled in red) to immunotherapy according to their clinical outcomes. P values were calculated using a two-tailed Student’s t-test. (f) The NES of ImmuR (upper panel) and ImmuC (lower panel) in three independent GEO datasets. GSEA was performed in transcriptomes of patients classified as responders and non-responders according to their response to immunotherapy. (g) The NES of ImmuR, ImmuC and Hallmark gene sets in three GEO datasets (black, GSE67501; blue, GSE91061; red, GSE78220; grey, Aa5951). GSEA was performed in transcriptomes of patients classified as responders and non-responders according to their response to treatment. (h) and (i) Hierarchical clustering of ImmuC (h) and ImmuR (i) genes in three TCGA datasets, including glioblastoma (upper panel), lung adenocarcinoma (middle panel) and lung squamous cell carcinoma (lower panel). Yellow boxes indicate distinct clustered subsets with high gene expression. Sidebars, gene expression index. (j) A summary of the strategy to identify interactive biological processes for treatment responses and to stratify treatment responders from non-responders. The prognoses of BPGSs were systematically evaluated among cancer datasets. Differentially expressed genes between BPGS-favorable and BPGS-adverse patients were defined to identify the underlying biological processes and to distinguish responder patients. The concept has now been proven in treatments targeting both tumor (anti-proliferation treatment) and non-tumor cells (immunotherapy).
