PLoS Genet. 2011 Jun;7(6):e1002101.
doi: 10.1371/journal.pgen.1002101. Epub 2011 Jun 9.

Pathways of distinction analysis: a new technique for multi-SNP analysis of GWAS data

Rosemary Braun et al. PLoS Genet. 2011 Jun.

Abstract

Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences between cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems level.
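The hypothesis above (cases resemble other cases more than they resemble controls on a pathway's SNPs) can be illustrated with a minimal Python sketch. This is not the PoDA statistic itself, which the abstract does not define. The function names `genotype_distance` and `relative_similarity_scores`, the choice of distance (sum of absolute differences in minor-allele counts), and the toy genotypes are all assumptions made for illustration. Each sample is compared against the remaining samples in a leave-one-out fashion, so that a sample never contributes to its own reference group.

```python
from statistics import mean

def genotype_distance(a, b):
    """Sum of absolute differences in minor-allele counts (0, 1, or 2) across SNPs.

    Assumed distance for illustration only; not taken from the paper.
    """
    return sum(abs(x - y) for x, y in zip(a, b))

def relative_similarity_scores(genotypes, is_case, pathway_snps):
    """Return one score per sample; higher means closer to cases than to controls.

    genotypes:    list of per-sample lists of minor-allele counts
    is_case:      list of bools, one per sample
    pathway_snps: column indices of the SNPs mapped to the pathway
    """
    # Restrict every sample to the SNPs belonging to the pathway.
    sub = [[g[j] for j in pathway_snps] for g in genotypes]
    scores = []
    for i, gi in enumerate(sub):
        # Leave-one-out: sample i is excluded from both reference groups.
        d_case = [genotype_distance(gi, gk)
                  for k, gk in enumerate(sub) if k != i and is_case[k]]
        d_ctrl = [genotype_distance(gi, gk)
                  for k, gk in enumerate(sub) if k != i and not is_case[k]]
        scores.append(mean(d_ctrl) - mean(d_case))
    return scores

# Toy data (hypothetical): three cases, three controls, three SNPs.
genotypes = [[2, 2, 0], [2, 1, 0], [2, 2, 1],
             [0, 0, 1], [0, 1, 0], [0, 0, 2]]
is_case = [True, True, True, False, False, False]

# Score each sample on a two-SNP "pathway" made of the first two columns.
scores = relative_similarity_scores(genotypes, is_case, pathway_snps=[0, 1])
```

In this toy example the cases receive positive scores and the controls negative ones, which is the pattern the hypothesis predicts for a disease-related pathway. For a pathway unrelated to disease, the two score distributions would be expected to overlap.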

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. PoDA applied to simulated data.
Alleles at 50 loci for 250 cases and 250 controls were simulated such that each SNP was in HWE and not associated with case status, but a homozygous minor genotype (red) at both loci 1 and 2, or at both loci 1 and 3, yielded a three-fold relative risk (a). A 12-SNP pathway comprising SNPs 1–12 shows differential formula image distributions (b); a random 12-SNP pathway does not (c). Boxplots are overlaid on the scatterplots of formula image for clarity.
Figure 2. PoDA applied to four highly-significant SNPs.
Shown is the distribution of formula image values in CGEMS cases (red) and controls (black) for a SNP-set comprising four highly-significant SNPs located in the formula image gene. As expected, there is a substantial difference between case and control formula image values, with the cases having higher formula image (i.e., closer to other cases) than controls. The discreteness of the distributions is due to the fact that, with four SNPs, only a finite number of formula image values are possible.
Figure 3. Four significant pathways in breast cancer data.
Scatter plots of formula image for each pathway, overlaid with boxplots, are given in the left panel; higher values of formula image indicate that the sample is closer to other cases than it is to other controls. Distributions of formula image for cases (red) and controls (black) are given to the right. A significant shift toward higher formula image values is seen in the cases. Odds ratios and FDR-adjusted OR formula image values are given.
Figure 4. Four significant pathways in liver cancer data.
Scatter plots of formula image for each pathway, overlaid with boxplots, are given in the left panel; higher values of formula image indicate that the sample is closer to other cases than it is to other controls. Distributions of formula image for cases (red) and controls (black) are given to the right. A significant shift toward higher formula image values is seen in the cases. Odds ratios and FDR-adjusted OR formula image values are given.
Figure 5. Union of top three pathways.
SNPs from the top three pathways are combined to compute formula image for the breast cancer data (a) and the liver cancer data (b). Distributions of formula image for cases (red) and controls (black) are given to the right. A significant shift toward higher formula image values is seen in the cases.
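
The simulation setup described in the Figure 1 caption can be sketched roughly as follows. This is a simple rejection-sampling illustration, not the authors' procedure. The minor allele frequency (`maf`), the `baseline_risk`, the `seed`, and all function names are assumptions that do not appear in the caption. Only the Hardy-Weinberg genotype draws and the three-fold relative risk rule are reproduced; in particular, this sketch does not enforce the caption's condition that each SNP be individually unassociated with case status.

```python
import random

def simulate_genotype(n_loci, maf, rng):
    """Draw minor-allele counts (0, 1, or 2) at independent loci under HWE."""
    # Two independent allele draws per locus give Hardy-Weinberg proportions.
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_loci)]

def is_high_risk(g):
    """Homozygous minor at loci 1 and 2, or at loci 1 and 3 (0-based 0, 1, 2)."""
    return g[0] == 2 and (g[1] == 2 or g[2] == 2)

def simulate_study(n_cases=250, n_controls=250, n_loci=50,
                   maf=0.4, baseline_risk=0.1, relative_risk=3.0, seed=0):
    """Sample individuals until both groups are full; maf, baseline_risk, and
    seed are assumed values chosen only for illustration."""
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        g = simulate_genotype(n_loci, maf, rng)
        # The interacting genotype multiplies the baseline disease risk.
        risk = baseline_risk * (relative_risk if is_high_risk(g) else 1.0)
        if rng.random() < risk:
            if len(cases) < n_cases:
                cases.append(g)
        elif len(controls) < n_controls:
            controls.append(g)
    return cases, controls

cases, controls = simulate_study()
```

The resulting `cases` and `controls` lists each hold 250 genotype vectors of length 50, which could then be scored on a pathway containing SNPs 1–12 and on a random 12-SNP set for comparison.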
