PLoS One. 2011;6(10):e26277.
doi: 10.1371/journal.pone.0026277. Epub 2011 Oct 25.

A new methodology to associate SNPs with human diseases according to their pathway related context

Burcu Bakir-Gungor et al. PLoS One. 2011.

Abstract

Genome-wide association studies (GWAS) with hundreds of thousands of single nucleotide polymorphisms (SNPs) are popular strategies to reveal the genetic basis of human complex diseases. Despite the many successes of GWAS, it is well recognized that new analytical approaches have to be integrated to achieve their full potential. Starting with a list of SNPs found to be associated with disease in GWAS, here we propose a novel methodology to identify functionally important KEGG pathways through the genes within these pathways, where these genes are obtained from SNP analysis. Our methodology is based on functionalization of important SNPs to identify affected genes and disease-related pathways. We tested our methodology on the WTCCC Rheumatoid Arthritis (RA) dataset and identified: i) previously known RA-related KEGG pathways (e.g., Toll-like receptor signaling, Jak-STAT signaling, Antigen processing, Leukocyte transendothelial migration and MAPK signaling pathways); and ii) additional KEGG pathways (e.g., Pathways in cancer, Neurotrophin signaling, Chemokine signaling pathways) associated with RA. Furthermore, these newly found pathways included genes that are targets of RA-specific drugs. Whereas GWAS analysis alone identifies 14 of the 83 drug target genes, the newly found functionally important KEGG pathways led to the discovery of 25 of the 83 genes known to be used as drug targets for the treatment of RA. Within previously known pathways (e.g., Antigen processing and presentation, Tight junction), we identified additional genes associated with RA. Importantly, within these pathways, the associations between RA and some of these additionally found genes, such as HLA-C, HLA-G, PRKCQ, PRKCZ, TAP1 and TAP2, were verified either by the OMIM database or by literature retrieved from the NCBI PubMed module.
With whole-genome sequencing on the horizon, we show that the full potential of GWAS can be achieved by integrating pathway and network-oriented analysis with prior knowledge of the functional properties of a SNP.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Outline of our assessment process.
In Step 1, a gene-wise Pw-value for association with disease was computed by integrating functional information. In Step 2, significant Pw-values were loaded as two separate attributes of the genes in a PPI network and visualized using Cytoscape. At this step, active sub-networks of interacting gene products that were also associated with the disease were identified using the jActive Modules plugin. In Step 3, genes in an identified active sub-network were tested for membership in functionally important KEGG pathways.
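
Step 3 asks whether the genes of an active sub-network fall into a KEGG pathway more often than chance would suggest. The caption does not name the statistical test, so the sketch below assumes a one-sided hypergeometric over-representation test, a common choice for this kind of question. The function name and its parameters are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, subnet_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    ASSUMPTION: the paper's Step 3 test is not specified in this text;
    this is a generic over-representation test, not the authors' method.

    universe_size: number of genes in the background set
    pathway_size:  number of background genes annotated to the pathway
    subnet_size:   number of genes in the active sub-network
    overlap:       number of sub-network genes annotated to the pathway
    """
    upper = min(pathway_size, subnet_size)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap must lie between 0 and min(pathway_size, subnet_size)")
    if pathway_size > universe_size or subnet_size > universe_size:
        raise ValueError("pathway and sub-network cannot exceed the universe")

    # Number of ways to draw the sub-network from the universe.
    total = comb(universe_size, subnet_size)
    # Sum the probability mass for overlaps at least as large as observed.
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, subnet_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small p-value for a pathway means the sub-network contains more of that pathway's genes than expected under random sampling; when many pathways are tested, the resulting p-values still need a multiple-testing correction.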
Figure 2. The highest scoring sub-network.
a. This sub-network is composed of 275 nodes and 778 edges (as found in Step 2 of PANOGA). Node size is proportional to the degree of the node. b. Zoomed-in view of the highest scoring sub-network. The 20 genes known in the literature to be associated with RA are shown in green. Blue denotes the genes in our highest scoring sub-network that could not be associated with RA in the literature.
Figure 3. Node degree distributions of our highest scoring sub-network vs. random network.
a. Our sub-network follows a power law ($P(k) = a k^{-\gamma}$, $a = 120.03$, $\gamma = 1.353$, $R^2 = 0.773$, correlation = 0.891 in log-log scale), showing that our network displays scale-free properties, as expected from a biological network. b. The random network is obtained via randomization of our highest scoring sub-network using the Erdős-Rényi algorithm.
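
A power law appears as a straight line in log-log scale, so the exponent can be estimated with an ordinary least-squares line through the points (log k, log P(k)). The sketch below illustrates that idea under the assumption that a plain log-log regression is used; the caption does not describe the fitting procedure, the function name `fit_power_law` is hypothetical, and the code is not expected to reproduce the reported values.

```python
import math


def fit_power_law(degree_counts):
    """Fit P(k) = a * k**(-gamma) by linear regression in log-log space.

    ASSUMPTION: ordinary least squares on (log k, log P(k)); the fitting
    method behind the figure is not stated in this text.

    degree_counts: dict mapping degree k (> 0) to the number (or fraction)
                   of nodes with that degree.
    Returns (a, gamma, r_squared), where r_squared refers to the log-log fit.
    """
    # Zero degrees and zero counts have no logarithm, so they are skipped.
    points = [
        (math.log(k), math.log(c))
        for k, c in degree_counts.items()
        if k > 0 and c > 0
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive (degree, count) pairs")

    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all degrees are identical; slope is undefined")

    slope = sxy / sxx                      # equals -gamma
    intercept = mean_y - slope * mean_x    # equals log(a)
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return math.exp(intercept), -slope, r_squared
```

For a degree histogram built from an edge list, passing the `{degree: node_count}` mapping to `fit_power_law` returns the estimated prefactor, the exponent, and the goodness of fit of the log-log line.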
Figure 4. Functionally grouped annotation network of our highest scoring sub-network.
The relationships between the KEGG terms (nodes) are based on the similarity of their associated genes. The size of a node reflects the statistical significance of the term (term p-values corrected with Bonferroni). Edges represent the existence of shared genes. The thickness of an edge is proportional to the number of genes shared and is calculated using kappa statistics, in a similar way as described in . Terms grouped according to their kappa scores are shown in the same color.
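
The two statistics named in this caption can be sketched in a few lines. The example below assumes that the kappa score is Cohen's kappa computed over gene membership in two terms, and that the Bonferroni correction multiplies each p-value by the number of tests. Both function names and the gene-universe argument are hypothetical; the exact variant used for the figure is not specified here.

```python
def kappa_score(genes_a, genes_b, universe):
    """Cohen's kappa for the agreement of two terms over a gene universe.

    ASSUMPTION: each gene in `universe` is labelled in/out for term A and
    in/out for term B; kappa compares observed with chance agreement.
    """
    universe = set(universe)
    a = set(genes_a) & universe
    b = set(genes_b) & universe
    n = len(universe)
    if n == 0:
        raise ValueError("universe must not be empty")

    both = len(a & b)               # genes annotated to both terms
    neither = n - len(a | b)        # genes annotated to neither term
    observed = (both + neither) / n
    expected = (len(a) * len(b) + (n - len(a)) * (n - len(b))) / (n * n)
    if expected == 1.0:
        # Degenerate case: both terms cover all genes or no genes.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def bonferroni(p_values):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

A kappa near 1 indicates that two terms annotate nearly the same genes, which is the kind of similarity that would place them in the same group; a kappa near 0 indicates overlap no better than chance.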
Figure 5. Zoomed-in view of the entire functional annotation network.
The most significant pathway term of each group, that is, the term with the lowest p-value (the group leading term), is shown in bold using the group-specific color.
Figure 6. Comparison of KEGG pathway terms for literature-verified RA genes (green) and our gene set (red).
Nodes represent the pathway terms identified from either of the two sets. The color gradient shows the proportion of genes from each set associated with the term. White represents equal proportions from the two comparison sets. The size of a node reflects the statistical significance of the term (term p-values corrected with Bonferroni). Following the convention in Figure 4, edges represent the existence of shared genes between the pathway terms, and node border colors map to the group colors. Panel b shows a zoomed-in view of panel a.
