BMC Genomics. 2013 Jul 3;14:440.
doi: 10.1186/1471-2164-14-440.

Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers

Tae Hyun Hwang et al. BMC Genomics. 2013.

Abstract

Background: Many large-scale studies have analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has provided a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method that integrates copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer.

Results: We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancer, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in the majority of the cancers. We also identified pathways that are disrupted only in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cell cycle, and differentiation in tumorigenesis.

Conclusions: In this work, we demonstrated that network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancer, including pathways that are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/.

Figures

Figure 1
Conceptual models for disrupted pathways. This figure describes two conceptual models for inferring the activity of disrupted pathways. (A) Three out of six member genes in the pathway are significantly altered by copy number changes. In this case, overrepresentation-based gene set enrichment analysis and pathway-based analysis could identify the pathway as enriched for altered genes, since many member genes in the pathway are altered by copy number changes. (B) No member gene in the pathway is altered by copy number changes, but member genes in the pathway interact with many other altered genes in the protein-protein interaction network. Existing gene set enrichment analysis and pathway-based analysis would fail to identify the pathway as a disrupted pathway, because the altered genes do not overlap with the member genes of the pathway. However, by applying a machine learning method that propagates the activity score of genes to other genes by exploring cluster structures in the protein-protein interaction network, our approach could identify the pathway as a disrupted pathway, since many member genes in the pathway interact with other altered genes (i.e. genes significantly altered by copy number changes could alter the activity or function of member genes through interactions).
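The contrast between models (A) and (B) can be illustrated with a conventional overrepresentation test. The sketch below assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis but is not stated in the caption; the gene-universe size and the number of altered genes are hypothetical values chosen only for illustration.

```python
from math import comb


def hypergeom_tail(universe, altered, pathway_size, overlap):
    """P(X >= overlap) when pathway_size genes are drawn from a universe
    that contains `altered` altered genes (one-sided hypergeometric test)."""
    total = comb(universe, pathway_size)
    upper = min(altered, pathway_size)
    favourable = sum(
        comb(altered, i) * comb(universe - altered, pathway_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 100 genes in the universe, 10 of them altered.
# Model (A): 3 of the 6 pathway members are altered.
p_model_a = hypergeom_tail(100, 10, 6, 3)
# Model (B): none of the 6 pathway members is altered.
p_model_b = hypergeom_tail(100, 10, 6, 0)

print(f"model A p-value: {p_model_a:.4f}")
print(f"model B p-value: {p_model_b:.4f}")
```

Under these assumed numbers the pathway in model (A) is flagged as enriched, while the pathway in model (B) gets a p-value of 1 because a test based only on overlap has no signal to work with.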
Figure 2
Overview of NetPathID. This figure describes the steps used to discover disrupted pathways across cancers. The aim of the approach is to integrate copy number data with protein-protein interaction networks to quantify pathway activity and thereby discover disrupted pathways across cancers. (A) A list of significantly altered genes residing in copy number regions is generated using GISTIC. (B) We initialize the activity scores of these genes using their average log2 ratios of amplification or deletion in the copy number data, and overlay the initial gene activity scores on the protein-protein interaction networks. To fully utilize network topological information, we apply a label propagation algorithm to assign gene activity scores to all the genes in the protein-protein interaction networks (see “Methods” section). (C) Finally, pathway activity scores are computed by averaging the activity scores of the member genes in each pathway predefined from prior knowledge (e.g. a pathway database or conserved subnetworks in protein-protein interaction networks across species). We repeat steps (A) and (B) to generate a matrix containing pathway activity scores from multiple cancer types.
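Steps (B) and (C) can be sketched as follows. The caption does not give the propagation formula, so this sketch assumes a widely used form of label propagation, f <- alpha * S f + (1 - alpha) * y, with S the symmetrically normalized adjacency matrix; the function names, the alpha value, the gene names G1 to G4, and the toy network are all hypothetical. How signed log2 ratios (amplification versus deletion) are combined is also left open here.

```python
from math import sqrt


def propagate_scores(edges, initial, alpha=0.5, iters=200, tol=1e-9):
    """Spread initial gene scores over an undirected network by iterating
    f <- alpha * S f + (1 - alpha) * y until the scores stop changing."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    for gene in initial:
        neighbors.setdefault(gene, set())
    degree = {g: len(ns) for g, ns in neighbors.items()}
    y = {g: float(initial.get(g, 0.0)) for g in neighbors}
    f = dict(y)
    for _ in range(iters):
        new_f = {}
        for g, ns in neighbors.items():
            # Entry of S between g and n is 1 / sqrt(deg(g) * deg(n)).
            spread = sum(f[n] / sqrt(degree[g] * degree[n]) for n in ns)
            new_f[g] = alpha * spread + (1 - alpha) * y[g]
        delta = max(abs(new_f[g] - f[g]) for g in neighbors)
        f = new_f
        if delta < tol:
            break
    return f


def pathway_activity(scores, pathways):
    """Average the propagated scores of each pathway's member genes."""
    return {
        name: sum(scores.get(g, 0.0) for g in genes) / len(genes)
        for name, genes in pathways.items()
        if genes
    }


# Toy chain network G1 - G2 - G3 - G4; only G1 is altered.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4")]
scores = propagate_scores(edges, {"G1": 1.0})
# Hypothetical pathway P contains no altered gene, as in model (B) of Figure 1.
activity = pathway_activity(scores, {"P": ["G3", "G4"]})
print(scores)
print(activity)
```

In this toy case the pathway receives a non-zero activity score even though none of its members is altered, because score reaches G3 and G4 through their network neighbors.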
Figure 3
Pathway activity view of cancers. (A) Heat map describing the two-way hierarchical clustering of the inferred activity of 217 Biocarta pathways across 16 types of cancer. Each row is a different type of cancer, and each column is a pathway. The color bar represents the Z-score transformation of the pathway activity score. Red indicates significantly disrupted pathways, and green indicates pathways that are not disrupted by copy number alterations. (B) Heat map describing the correlation coefficient of pathway co-disruption (red: positive correlation; green: negative correlation). The top 30 ranked disrupted pathways across cancers are included in the heat map. (C) Zoom-in plots including cancer-type-specific and commonly disrupted pathways. For example, the Cytokine, DC (“Dendritic cells in regulating TH1 and TH2 Development”), and INFLAM (“Cytokines and Inflammatory Response”) pathways are disrupted only in acute lymphoblastic leukemia and myelodysplasia. Cytokines and the inflammatory response, as well as dendritic cells acting as modulators of immune responses in the DC pathway, are known to be involved in the development of acute lymphoblastic leukemia and myelodysplasia. In contrast to cancer-type-specific disrupted pathways, there is a set of commonly disrupted pathways across cancers. For example, the TGFB (“TGF beta signaling”) pathway is one of the pathways commonly disrupted across more than 10 types of cancer. Other commonly disrupted pathways include the TEL (“Telomeres, Telomerase, Cellular Aging, and Immortality”), TRKA (or NTRK1) (“Trka Receptor Signaling Pathway”), CTCF (“First Multivalent Nuclear Factor”), and SPRY (“Sprouty regulation of tyrosine kinase signals”) pathways.
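The Z-score transformation in panel (A) and the co-disruption correlation in panel (B) can be sketched in a few lines. This is a minimal illustration that assumes Z-scores are taken per pathway across cancer types and that the correlation is Pearson's; the caption specifies neither, and the activity values and pathway labels below are made up.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of activity scores to mean 0 and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def pearson(x, y):
    """Pearson correlation, computed as the mean product of Z-scores."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical activity scores of three pathways across four cancer types.
pathway_a = [0.10, 0.40, 0.35, 0.90]
pathway_b = [0.20, 0.80, 0.70, 1.80]   # rises and falls with pathway_a
pathway_c = [0.90, 0.60, 0.65, 0.10]   # moves opposite to pathway_a

z_a = zscores(pathway_a)
r_ab = pearson(pathway_a, pathway_b)
r_ac = pearson(pathway_a, pathway_c)
print(z_a)
print(r_ab, r_ac)
```

Pairs such as `pathway_a` and `pathway_b` would appear red in a co-disruption heat map, and pairs such as `pathway_a` and `pathway_c` would appear green.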
Figure 4
Commonly disrupted pathways across cancers correlate with clinical outcomes. (A) Two-way hierarchical clustering of lung cancer patients using member genes in commonly disrupted pathways. (B) Kaplan-Meier survival plots for the patient subgroups (clusters) from a lung cancer microarray gene expression dataset. Colors (red, black, and blue) indicate the patient subgroups used for Kaplan-Meier analysis.
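For readers unfamiliar with the curves in panel (B), the Kaplan-Meier estimator can be sketched as below. The follow-up times and event flags are invented for illustration and are not taken from the study; a significance test between subgroups (for example a log-rank test) would be a separate step and is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical subgroup: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Running the estimator once per patient subgroup gives one step curve per color in a plot like panel (B).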
Figure 5
Cancer-related genes are enriched in commonly disrupted pathways. Fraction of known cancer genes in top k% ranked disrupted pathways based on pathway activity score using pathway information from (A) Biocarta, (B) Reactome, (C) KEGG, and (D) conserved protein-protein interaction subnetworks.
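The quantity plotted in this figure can be sketched as follows, assuming the fraction is taken over the union of member genes in the top k% of pathways; the caption does not say how genes shared between pathways are counted. Pathway names, gene names, scores, and the cancer gene list are placeholders.

```python
from math import ceil


def cancer_gene_fraction(pathway_scores, pathways, cancer_genes, top_percent):
    """Fraction of known cancer genes among the member genes of the
    top `top_percent` percent of pathways ranked by activity score."""
    ranked = sorted(pathway_scores, key=pathway_scores.get, reverse=True)
    n_top = max(1, ceil(len(ranked) * top_percent / 100))
    genes = set()
    for name in ranked[:n_top]:
        genes |= set(pathways[name])
    if not genes:
        return 0.0
    return len(genes & set(cancer_genes)) / len(genes)


# Placeholder data: four pathways and three known cancer genes.
pathway_scores = {"P1": 0.9, "P2": 0.5, "P3": 0.2, "P4": 0.1}
pathways = {"P1": {"G1", "G2"}, "P2": {"G2", "G3"}, "P3": {"G4"}, "P4": {"G5"}}
cancer_genes = {"G1", "G3", "G5"}

top_half = cancer_gene_fraction(pathway_scores, pathways, cancer_genes, 50)
all_pathways = cancer_gene_fraction(pathway_scores, pathways, cancer_genes, 100)
print(top_half, all_pathways)
```

Evaluating the function over a range of `top_percent` values yields one curve per pathway source, as in panels (A) to (D).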
Figure 6
Network view of TGF-beta signaling pathway alterations in colorectal and ovarian cancer. (A) Network view of genes altered by copy number changes in colorectal cancer in the TGF-beta signaling pathway (diamond nodes) or genes directly interacting with TGF-beta signaling genes (circular nodes) based on the protein-protein interaction database. (B) The same network view for ovarian cancer. Size of node represents frequency of amplification or deletion in patient population. Color of node indicates whether gene is amplified (red), deleted (green), or unchanged (gray). Lines indicate interactions. Blue dotted line separates genes within the pathway from genes that interact with the pathway based on the protein-protein interaction database.
