Nat Methods. 2015 Feb;12(2):154-9.
doi: 10.1038/nmeth.3215. Epub 2014 Dec 22.

Selecting causal genes from genome-wide association studies via functionally coherent subnetworks

Murat Taşan et al. Nat Methods. 2015 Feb.

Abstract

Genome-wide association (GWA) studies have linked thousands of loci to human diseases, but the causal genes and variants at these loci generally remain unknown. Although investigators typically focus on genes closest to the associated polymorphisms, the causal gene is often more distal. Reliance on published work to prioritize candidates is biased toward well-characterized genes. We describe a 'prix fixe' strategy and software that uses genome-scale shared-function networks to identify sets of mutually functionally related genes spanning multiple GWA loci. Using associations from ∼100 GWA studies covering ten cancer types, our approach outperformed the common alternative strategy in ranking known cancer genes. As more GWA loci are discovered, the strategy will have increased power to elucidate the causes of human disease.

Figures

Figure 1
Overview of prix fixe strategy. TagSNPs associated with disease are used to define linkage-disequilibrium “windows”. A co-function network (CFN) is then used to identify dense “prix fixe” (PF) subnetworks. Dense prix fixe subnetworks are aggregated and genes are scored to reflect their importance in the subnetworks. High-scoring genes are then used to find causal pathways, processes, and additional candidate genes.
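The selection step in this overview can be illustrated with a minimal sketch. It assumes a toy co-function network and brute-force enumeration of every one-gene-per-locus choice, which is only feasible for a handful of loci; the authors' software is not reproduced here, and the locus names, gene names, edges, and the scoring rule (fraction of densest subnetworks containing a gene) are all hypothetical simplifications.

```python
from itertools import combinations, product


def edge_count(genes, edges):
    """Count co-function edges among the chosen genes."""
    return sum(1 for a, b in combinations(genes, 2) if frozenset((a, b)) in edges)


def densest_prix_fixe(loci, edges):
    """Enumerate every one-gene-per-locus choice and keep the densest ones."""
    best, best_sets = -1, []
    for choice in product(*loci.values()):
        n = edge_count(choice, edges)
        if n > best:
            best, best_sets = n, [choice]
        elif n == best:
            best_sets.append(choice)
    return best, best_sets


def gene_scores(best_sets):
    """Score each gene by the fraction of densest subnetworks that contain it
    (an assumed, simplified stand-in for the prix fixe score)."""
    counts = {}
    for choice in best_sets:
        for gene in choice:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: c / len(best_sets) for gene, c in counts.items()}


# Hypothetical toy input: three loci with two candidate genes each.
loci = {
    "locus1": ["A1", "A2"],
    "locus2": ["B1", "B2"],
    "locus3": ["C1", "C2"],
}
# Hypothetical co-function network edges (undirected).
edges = {
    frozenset(e)
    for e in [("A1", "B1"), ("B1", "C1"), ("A1", "C1"), ("A2", "B2")]
}

best, best_sets = densest_prix_fixe(loci, edges)
scores = gene_scores(best_sets)
print(best, best_sets, scores)
```

In this toy network the only fully connected choice is A1, B1, C1, so those three genes receive the top score. With realistic numbers of loci the search space grows multiplicatively, so exhaustive enumeration would have to be replaced by a heuristic search.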
Figure 2
Functional connectivity patterns in prostate cancer. Candidate genes are organized by locus in genomic order. Genes highlighted in yellow are members of the Sanger Cancer Gene Census. Red gene intensity indicates the LD (r²) value between the gene and tagSNP for that locus. Blue gene intensity reflects the prix fixe score. Edges represent presence in the final collection of dense subnetworks, with blue edge intensity reflecting the proportion of final dense subnetworks containing that edge.
Figure 3
Rank-based analysis of Sanger Cancer Gene Census (SCGC) prioritization. Genes are ranked within each cancer-associated locus, and normalized ranks of SCGC genes are shown as dots for prix fixe-based (“PF”, left) and LD-based (“r²”, right) rankings (100% is highest ranked, 0% is lowest). The average relative rank of SCGC genes (for both methods) within each locus is indicated by horizontal bars; the number of multigenic loci is shown above as “n”. The right-most plot (“Union”) shows pooled results across all cancer-associated loci. PF SCGC ranks significantly outperform LD-based SCGC ranks (P = 0.015, one-sided paired Wilcoxon signed-rank test). CLL contained no SCGC-harboring loci in our primary analysis, and is thus not displayed here.
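The within-locus rank normalization described in this caption can be sketched as follows. The caption gives only the endpoints (100% for the highest-ranked gene, 0% for the lowest); the linear scaling in between and the mid-rank handling of ties are assumptions, and the gene names and scores are hypothetical.

```python
def normalized_ranks(scores):
    """Map each gene in one locus to a rank in [0, 100].

    100 is the highest-scoring gene and 0 the lowest. Tied genes share
    the mid-rank (an assumed convention, not stated in the source).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("normalized ranks need a multigenic locus")
    values = list(scores.values())
    out = {}
    for gene, s in scores.items():
        lower = sum(1 for t in values if t < s)
        tied = sum(1 for t in values if t == s) - 1  # exclude the gene itself
        out[gene] = 100.0 * (lower + 0.5 * tied) / (n - 1)
    return out


# Hypothetical scores for the candidate genes of a single locus.
ranks = normalized_ranks({"G1": 0.9, "G2": 0.4, "G3": 0.1})
print(ranks)
```

Applying the same function to prix fixe scores and to r² values for each locus would yield paired normalized ranks of the kind compared in the figure.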
Figure 4
(a) Prix fixe scores are uncorrelated with LD (r²) values. Each scatter plot point is a candidate prostate cancer gene. Correlation is computed using Kendall’s τ rank coefficient. Blue genes indicate significantly differentially-expressed mRNA levels in matched case-control TCGA prostate adenocarcinoma (PRAD) samples, while red genes indicate no evidence of cancer-dependent differential expression. Flanking boxplots indicate score distributions of differentially- and not-differentially-expressed genes. Boxplot whiskers extend to 1.5×IQR; outliers not shown. Boxplots compared by one-sided Wilcoxon rank sum tests. (b) Prix fixe rankings identify disease-relevant Gene Ontology (GO) terms for prostate cancer, with no a priori knowledge of disease etiology. Top-15 (by odds-ratio (OR)) GO terms shown using “ordered” functional enrichment analysis with significance (P*) corrected for multiple testing. Three GO terms expanded to show constituent genes with (if available) “PF” score, “SCGC” (Sanger Cancer Gene Census) status, and “SMG” (significantly-mutated gene) status. Full functional enrichment analysis for all traits provided in Supplementary File 3.
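For readers unfamiliar with the rank coefficient named in panel (a), the following is a minimal sketch of Kendall's τ in its simplest form (τ-a, which makes no tie correction). The caption does not say which variant was used, so the choice of τ-a is an assumption, and the input values are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # pairs tied in x or y count toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-gene values: LD (r²) versus a prix fixe-style score.
ld_values = [1, 2, 3, 4]
pf_scores = [1, 3, 2, 4]
tau = kendall_tau_a(ld_values, pf_scores)
print(tau)
```

A value near 0 corresponds to the "uncorrelated" pattern the caption reports, while values near 1 or -1 indicate strong agreement or disagreement between the two orderings.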
