Genome Res. 2012 Feb;22(2):375-85.
doi: 10.1101/gr.120477.111. Epub 2011 Jun 7.

De novo discovery of mutated driver pathways in cancer

Fabio Vandin et al. Genome Res. 2012 Feb.

Abstract

Next-generation DNA sequencing technologies are enabling genome-wide measurements of somatic mutations in large numbers of cancer patients. A major challenge in the interpretation of these data is to distinguish functional "driver mutations" important for cancer development from random "passenger mutations." A common approach for identifying driver mutations is to find genes that are mutated at significant frequency in a large cohort of cancer genomes. This approach is confounded by the observation that driver mutations target multiple cellular signaling and regulatory pathways. Thus, each cancer patient may exhibit a different combination of mutations that are sufficient to perturb these pathways. This mutational heterogeneity presents a problem for predicting driver mutations solely from their frequency of occurrence. We introduce two combinatorial properties, coverage and exclusivity, that distinguish driver pathways, or groups of genes containing driver mutations, from groups of genes with passenger mutations. We derive two algorithms, called Dendrix, to find driver pathways de novo from somatic mutation data. We apply Dendrix to analyze somatic mutation data from 623 genes in 188 lung adenocarcinoma patients, 601 genes in 84 glioblastoma patients, and 238 known mutations in 1000 patients with various cancers. In all data sets, we find groups of genes that are mutated in large subsets of patients and whose mutations are approximately exclusive. Our Dendrix algorithms scale to whole-genome analysis of thousands of patients and thus will prove useful for larger data sets to come from The Cancer Genome Atlas (TCGA) and other large-scale cancer genome sequencing projects.
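The two properties named in the abstract, coverage and exclusivity, can be illustrated on a toy data set. The sketch below is not the Dendrix implementation. It assumes the weight used in the full paper, W(M) = |Γ(M)| − ω(M), where Γ(M) is the set of patients with a mutation in at least one gene of M and ω(M) is the coverage overlap (the number of surplus mutations beyond one per covered patient). The gene names, patient IDs, and function names are hypothetical, and the exhaustive search stands in for the paper's algorithms only because the example is tiny.

```python
from itertools import combinations

# Hypothetical toy data: gene -> set of patients carrying a mutation in that gene.
mutations = {
    "A": {"p1", "p2", "p3"},
    "B": {"p4", "p5"},
    "C": {"p1", "p4", "p6"},
    "D": {"p6"},
}


def coverage(gene_set, mutations):
    """Return the patients with a mutation in at least one gene of gene_set."""
    covered = set()
    for gene in gene_set:
        covered |= mutations[gene]
    return covered


def coverage_overlap(gene_set, mutations):
    """Count surplus mutations: total mutations minus number of covered patients.

    The value is 0 exactly when the genes are mutually exclusive.
    """
    total = sum(len(mutations[gene]) for gene in gene_set)
    return total - len(coverage(gene_set, mutations))


def weight(gene_set, mutations):
    """Assumed weight W(M) = |coverage| - overlap: rewards coverage, penalizes co-occurrence."""
    return len(coverage(gene_set, mutations)) - coverage_overlap(gene_set, mutations)


def best_set(mutations, k):
    """Brute-force search over all gene sets of size k (feasible only for toy inputs)."""
    return max(combinations(sorted(mutations), k), key=lambda m: weight(m, mutations))


if __name__ == "__main__":
    for k in (2, 3):
        m = best_set(mutations, k)
        print(k, m, weight(m, mutations))
```

In this toy example the pair (A, C) covers five patients but shares patient p1, so its weight is lower than that of the fully exclusive pair (A, B). Exhaustive enumeration grows combinatorially with the number of genes, which is why it is used here only for illustration.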

Figures

Figure 1.
Somatic mutations in multiple patients are represented in a mutation matrix. Gene sets are identified as exclusive submatrices or high weight submatrices.
Figure 2.
Ratio between the sampled frequency π(M) of the maximum weight set and the maximum frequency π(max_other) of any other set in the sample, for different values of W(M).
Figure 3.
(A) High weight submatrix of eight genes in the somatic mutations data from multiple cancer types (Thomas et al. 2007). (Black bars) Exclusive mutations; (gray bars) co-occurring mutations. (B) Location of identified genes in known pathway. Interactions in the pathway are as reported in Ding et al. (2008).
Figure 4.
(A) High weight submatrices of two and three genes in the lung adenocarcinoma data. (Black bars) Exclusive mutations; (gray bars) co-occurring mutations. Rows (patients) are ordered differently for each submatrix, to illustrate exclusivity and co-occurrence. (B) The location of gene sets in known pathways reveals that the triplet of genes codes for proteins in the mTOR signaling pathway (light gray nodes), and the pair (ATM, TP53) corresponds to interacting proteins in the cell cycle pathway (dark gray nodes). Interactions in the pathway are as reported in Ding et al. (2008).
Figure 5.
(A) High weight submatrices of two and three genes in the glioblastoma data. (Black bars) Exclusive mutations; (gray bars) co-occurring mutations. Rows (patients) are ordered differently for each submatrix, to illustrate exclusivity and co-occurrence. (B) Location of identified genes in known pathways. Interactions in pathways are as reported in The Cancer Genome Atlas Research Network (2008).

References

    1. Backlund LM, Nilsson BR, Goike HM, Schmidt EE, Liu L, Ichimura K, Collins VP 2003. Short postoperative survival for glioblastoma patients with a dysfunctional Rb1 pathway in combination with no wild-type PTEN. Clin Cancer Res 9: 4151–4158
    2. Bansal V, Halpern AL, Axelrod N, Bafna V 2008. An MCMC algorithm for haplotype assembly from whole-genome sequence data. Genome Res 18: 1336–1346
    3. Ben-Dor A, Chor B, Karp RM, Yakhini Z 2003. Discovering local structure in gene expression data: The order-preserving submatrix problem. J Comput Biol 10: 373–384
    4. Boca SM, Kinzler KW, Velculescu VE, Vogelstein B, Parmigiani G 2010. Patient-oriented gene set analysis for cancer mutation data. Genome Biol 11: R112 doi: 10.1186/gb-2010-11-11-r112
    5. Bradley JR, Farnsworth DL 2009. Testing for mutual exclusivity. J Appl Stat 36: 1307–1314
