Cell Syst. 2018 Nov 28;7(5):526-536.e6.
doi: 10.1016/j.cels.2018.10.001. Epub 2018 Nov 7.

Integration of Tumor Genomic Data with Cell Lines Using Multi-dimensional Network Modules Improves Cancer Pharmacogenomics

James T Webber et al. Cell Syst.

Abstract

Leveraging insights from genomic studies of patient tumors is limited by the discordance between these tumors and the cell line models used for functional studies. We integrate omics datasets using functional networks to identify gene modules reflecting variation between tumors and show that the structure of these modules can be evaluated in cell lines to discover clinically relevant biomarkers of therapeutic responses. Applied to breast cancer, we identify 219 gene modules that capture recurrent alterations, subtype patients, and quantitate various cell types within the tumor microenvironment. Comparison of modules between tumors and cell lines reveals that many modules composed primarily of gene expression and methylation are poorly preserved. In contrast, preserved modules are highly predictive of drug responses in a manner that is robust and clinically relevant. This work addresses a fundamental challenge in pharmacogenomics that can only be overcome by the joint analysis of patient and cell line data.

Keywords: biomarkers; breast cancer; data integration; networks; pharmacogenomics; therapeutics.

Figures

Figure 1: Data integration and module discovery using MAGNETIC.
(a) Interaction network of ubiquitin specific peptidases USP6 and USP32 from the STRING database and Pearson correlation between molecular features of USP6 and USP32 across TCGA breast cancers. P-values of association after Bonferroni correction for multiple testing are in parentheses. (b) Scatter plot of normalized USP32 copy-number and USP6 expression across TCGA. (c) The interaction network of the kinase LCK and its substrate LAT and relationships between their molecular profiles across platforms. (d) Scatter plot of LCK expression and LAT methylation. (e) MAGNETIC uses as input the normalized DNA copy-number, methylation, somatic mutation, mRNA expression and protein abundance data from a collection of tumor samples. We compute a multi-layer pairwise gene similarity network by computing the correlation between all pairs of gene features both within and between profiling platforms. Each linkage in this correlation network is normalized through comparison against a benchmark of pathways reflected in protein-protein interaction databases. Scored edges are then merged into a multigraph in which nodes represent genes and the edges between nodes represent co-incidence of different types of linkages. Clustering of this network using a random walk algorithm reveals gene modules whose components are closely related in multiple data types. (f) Circos plot representation of the module network containing HER2. Colors represent different data sources selected in the final integrated network for each gene, and edge thickness is proportional to edge score. Top central genes are labeled. (g) TCGA samples sorted by HER2 module score. PAM50 subtype and molecular receptor status as determined by IHC are shown. (h) The module network containing the estrogen receptor, ESR1. Direct transcriptional targets of ER as assessed through ChIP analysis are marked with a star. (i) TCGA samples sorted by ESR1 module score. See also Figures S1-S3.
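The cross-platform correlation step described in panel (e) can be illustrated with a minimal sketch. This is not the MAGNETIC implementation: the function names, the platform labels, the toy values, and the hard `min_abs_r` cutoff are all assumptions made for illustration. In particular, the caption says edges are scored by normalization against a pathway benchmark, which this sketch replaces with a plain threshold on |r|.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def build_multigraph(features, min_abs_r=0.8):
    """Correlate every pair of (gene, platform) features across samples.

    `features` maps (gene, platform) -> values measured on the same samples.
    Returns {(gene_a, gene_b): {(platform_a, platform_b): r}}, so one gene
    pair can carry several typed edges. `min_abs_r` is a hypothetical cutoff
    standing in for the benchmark-based edge scoring named in the caption.
    """
    graph = {}
    for (ka, xa), (kb, xb) in combinations(sorted(features.items()), 2):
        (gene_a, plat_a), (gene_b, plat_b) = ka, kb
        if gene_a == gene_b:
            continue  # only link different genes
        r = pearson(xa, xb)
        if abs(r) >= min_abs_r:
            graph.setdefault((gene_a, gene_b), {})[(plat_a, plat_b)] = r
    return graph


# Invented toy values for five samples; not data from the paper.
toy = {
    ("USP32", "cnv"): [0.1, 0.4, 0.9, 1.3, 2.0],
    ("USP6", "expr"): [1.0, 1.5, 2.1, 2.4, 3.2],
    ("LCK", "expr"): [2.0, 1.0, 3.0, 0.5, 2.5],
    ("LAT", "meth"): [0.3, 0.6, 0.2, 0.8, 0.25],
}
graph = build_multigraph(toy)
for gene_pair, typed_edges in graph.items():
    print(gene_pair, typed_edges)
```

Keying the inner dictionary by platform pair is what makes this a multigraph: a copy-number/expression edge and an expression/methylation edge between the same two genes are stored side by side rather than collapsed into one weight.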
Figure 2: Many patient derived modules are not preserved in cell lines and are associated with specific data types.
(a) Overview of the approach to score module preservation in cell lines. MAGNETIC takes molecular correlations present across tumor samples and determines if they remain significantly correlated across a cell line panel. Solid edges are above random background; dotted edges are below random background (see STAR Methods). Different colors represent edges derived from comparison between different molecular profiling platforms. (b) Histogram and kernel density estimation of the distribution of module preservation scores. The vertical dotted line shows the cutoff of 5 chosen for further evaluation. (c) Correlation of module scores with pathologic assessments of necrosis and normal cell infiltration for lowly (L) and highly (H) preserved modules. (d) Comparison of module types with computational assessment of tumor purity. (e) Sorted preservation scores for 219 breast cancer modules evaluated in cell lines. Lowly preserved modules have a score less than 5 (dotted line). (f) For each module in (e), the percent of the LLR>1 network that corresponds to each edge type is shown. (g) Percent of each edge type for lowly and highly preserved modules in the LLR>1 network. P-values based on Mann-Whitney U-test are in parentheses. See also Figure S4.
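The idea in panel (a), checking whether a tumor-derived edge stays above a random background in cell lines, can be sketched as follows. The actual scoring is defined in the paper's STAR Methods and is not reproduced here. This sketch assumes a permutation background and reports the fraction of preserved edges, which is on a different scale from the preservation score and its cutoff of 5. All names and values are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def edge_is_preserved(x, y, n_perm=200, quantile=0.95, seed=0):
    """True if |r| across cell lines exceeds a permutation background.

    The background is built by shuffling the sample order of `y`, an assumed
    stand-in for the random background mentioned in the caption.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    observed = abs(pearson(x, y))
    shuffled = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(abs(pearson(x, shuffled)))
    null.sort()
    cutoff = null[int(quantile * (n_perm - 1))]
    return observed > cutoff


def preserved_fraction(edges, cell_lines):
    """Fraction of tumor-derived edges that remain correlated in cell lines."""
    kept = sum(edge_is_preserved(cell_lines[a], cell_lines[b]) for a, b in edges)
    return kept / len(edges)


# Invented feature values across eight hypothetical cell lines.
cell_lines = {
    "g1_expr": [1, 2, 3, 4, 5, 6, 7, 8],
    "g2_cnv": [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.9, 8.3],
    "g3_meth": [5, 1, 8, 2, 7, 3, 6, 4],
}
# Edges assumed to have been found in tumors for one module.
module_edges = [("g1_expr", "g2_cnv"), ("g1_expr", "g3_meth")]
fraction = preserved_fraction(module_edges, cell_lines)
print(fraction)
```

In this toy module one edge survives in the cell line panel and one does not, which mirrors the solid and dotted edges in the panel.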
Figure 3: Modules reflect specific aspects of the tumor microenvironment.
(a) Heatmap of molecular features associated with the overall activity of the immune module (r² > 0.1). For clarity, the CNV of one gene is not shown. (b) Enrichment for high expression of module genes from normalized RNA-seq data in 227 purified immune cell type datasets. Cell types are categorized into 15 groups and enrichment is based on a t-test. (c) Comparison of module scores with annotated lymphocytic infiltration values in TCGA and METABRIC datasets. (d) Heatmap of molecular features associated with module 12, associated with stromal cells. (e) Comparison of module scores with pathologic assessment of stromal cells in TCGA samples. P-values based on t-test. (f) Examples of samples from TCGA with low and high scores for module 12, showing the difference in stromal content. (g) Heatmap of molecular features associated with module 16, associated with endothelial cells. (h) Comparison of module scores with annotations of necrosis. (i) Examples of samples with low and high scores for module 16.
Figure 4: A module-drug network identifies high performance biomarkers that are preserved between patients and cell lines.
(a) Network of 97 module-drug associations based on breast cancer cell line modeling. Modules significantly associated with drug response are shown (FDR≤5%). Drugs are limited to those that are not associated with PAM50 subtype based on an FDR threshold of 5%. The size of each module is proportional to the number of genes within it, and the thickness of the border depicts the strength of a module or drug’s association with PAM50 subtype. Edges are colored red when a module correlated with sensitivity to a drug, and blue when it correlated with resistance. Thicker edges have a lower FDR. As an example, gain of chr1q is associated with resistance to etoposide at an FDR of <0.1%. (b) Scatter plot of cell line association of lapatinib response with module #92 (HER2) and (c) oxaliplatin with module #139 (chr11q14#1). Cell lines are colored by PAM50 subtype. (d) Comparison of median absolute error of cross-validated predictions of drug sensitivity using single gene features or modules as input to elastic net, random forest or SVM-based predictors. P-values based on Mann-Whitney U-test. (e) Cross-correlation for all pairs of molecular features that are the most predictive of response to imatinib in cell lines at an FDR of 1% and cross-correlation of the same features in TCGA. (f) The average cross-correlation (r²) of features selected by various statistical methods (FDR, elastic net) using single genes or modules in cell lines and evaluation of cross-correlation of the same features in TCGA. Each point represents a model for a single drug. P-values based on Mann-Whitney U-test. See also Figure S4.
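The metric in panel (d), median absolute error of cross-validated predictions, can be sketched in a few lines. As a loudly stated simplification, a one-feature least-squares line stands in for the elastic net, random forest and SVM predictors named in the caption, and folds are assigned by index rather than at random. The function names and all numbers below are invented to exercise the code and do not reflect results from the paper.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # constant feature: predict the mean response
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cv_median_abs_error(xs, ys, k=5):
    """Median absolute error of held-out predictions over k folds.

    Sample i is held out in fold i % k (a deterministic, assumed split).
    """
    errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held_out = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.extend(abs(ys[i] - (slope * xs[i] + intercept)) for i in held_out)
    return median(errors)


# Invented values for ten hypothetical cell lines.
module_score = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
single_gene = [3, 9, 1, 7, 0, 8, 2, 6, 5, 4]
drug_response = [2 * s + 1 for s in module_score]  # contrived: linear in the module score

module_error = cv_median_abs_error(module_score, drug_response)
gene_error = cv_median_abs_error(single_gene, drug_response)
print(module_error, gene_error)
```

Because every prediction is made on samples excluded from fitting, the error reflects how well a feature generalizes rather than how well it fits, which is the property the panel compares between single genes and modules.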
