PLoS Genet. 2010 Sep 23;6(9):e1001131.
doi: 10.1371/journal.pgen.1001131.

A novel statistic for genome-wide interaction analysis

Xuesen Wu et al. PLoS Genet.

Abstract

Although genome-wide association studies (GWAS) have made great progress, the significant SNP associations they have identified account for only a few percent of the genetic variance, leading many to question where and how the missing heritability can be found. There is increasing interest in genome-wide interaction analysis as a possible way to find heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet the challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The analysis identified 44 and 211 pairs of SNPs showing significant evidence of interaction with FDR<0.001 and 0.001<FDR<0.003, respectively, which were seen in two independent studies of psoriasis. These included five interacting pairs of SNPs in genes LST1/NCR3, CXCR5/BCL9L, and GLS2, some of which were located in the target sites of miR-324-3p, miR-433, and miR-382, as well as 15 pairs of interacting SNPs that had nonsynonymous substitutions. Our results demonstrate that genome-wide interaction analysis is a valuable tool for finding the missing heritability unexplained by current GWAS, and that the developed statistic is able to search for significant interactions between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.
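The FDR thresholds quoted above can be illustrated with a small sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step-up adjustment below is an assumption chosen for illustration, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used only as an illustrative FDR procedure; the
    abstract does not name the procedure behind its FDR thresholds.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A SNP pair would then be reported at a given FDR level when its adjusted value falls below that level, for example `q < 0.001`.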

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Quantile-quantile plots for the test statistic.
(A) Quantile-quantile plots for the test statistic [formula] in dataset 1. The P-values (< [formula]) for the test are plotted (as −log10 values) as a function of the expected P-values.
(B) Quantile-quantile plots for the test statistic [formula] in dataset 2. The P-values (< [formula]) for the test are plotted (as −log10 values) as a function of the expected P-values.
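As a rough illustration of how the coordinates of such a quantile-quantile plot are usually obtained, the sketch below pairs each sorted observed P-value with the expected value of the corresponding order statistic under a uniform null. The function name `qq_points` is hypothetical, and the P-value cutoff mentioned in the caption is not applied because its value is not recoverable from the text.

```python
import math

def qq_points(pvalues):
    """Return (expected, observed) pairs on the -log10 scale.

    Under the null hypothesis P-values are uniform on (0, 1), so the
    k-th smallest of n P-values has expected value k / (n + 1).
    All P-values must be strictly positive.
    """
    n = len(pvalues)
    points = []
    for k, p in enumerate(sorted(pvalues), start=1):
        expected = k / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points
```

Points lying along the diagonal indicate agreement with the null distribution; systematic departure above the diagonal suggests inflation or true signal.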
Figure 2. Power of the statistics for testing interaction between two linked loci under a recessive disease model.
(A) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus recessive × recessive disease model, where the number of individuals in both the case and control groups is 2,000, the significance level is 0.05, and the odds ratios at the two loci were [formula].
(B) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus recessive × recessive disease model, where the number of individuals in both the case and control groups is 2,000, the significance level is 0.01, and the odds ratios at the two loci were [formula].
(C) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus recessive × recessive disease model, where the number of individuals in both the case and control groups is 2,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
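For orientation, one common way to test for interaction with odds ratios is to compare the log odds ratio between two loci in cases with that in controls. The sketch below shows that contrast on two 2×2 count tables. It is neither the statistic developed in this paper nor PLINK's actual implementation; the function names and the table layout are assumptions made for illustration.

```python
import math

def log_odds_ratio(table):
    """Return the log odds ratio of a 2x2 table and its approximate variance.

    `table` is ((a, b), (c, d)) with all counts positive (assumed layout).
    """
    (a, b), (c, d) = table
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # standard large-sample variance
    return log_or, variance

def odds_ratio_contrast_z(case_table, control_table):
    """Z statistic for the difference in log odds ratios, cases vs controls."""
    log_or_case, var_case = log_odds_ratio(case_table)
    log_or_ctrl, var_ctrl = log_odds_ratio(control_table)
    return (log_or_case - log_or_ctrl) / math.sqrt(var_case + var_ctrl)

def two_sided_p(z):
    """Two-sided P-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

A large absolute Z means the association between the two loci differs between cases and controls, which is the signature of interaction this kind of test looks for.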
Figure 3. Power of the statistics for testing interaction between two linked loci under a dominant disease model.
(A) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus dominant × dominant disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.05, and the odds ratios at the two loci were [formula].
(B) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus dominant × dominant disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.01, and the odds ratios at the two loci were [formula].
(C) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus dominant × dominant disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
Figure 4. Power of the statistics for testing interaction between two linked loci under an additive disease model.
(A) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus additive × additive disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.05, and the odds ratios at the two loci were [formula].
(B) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus additive × additive disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.01, and the odds ratios at the two loci were [formula].
(C) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two linked loci as a function of the traditional odds ratio [formula] under a two-locus additive × additive disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
Figure 5. Power of the statistics for testing interaction between two unlinked loci.
(A) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two unlinked loci as a function of the traditional odds ratio [formula] under a two-locus recessive × recessive disease model, where the number of individuals in both the case and control groups is 2,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
(B) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two unlinked loci as a function of the traditional odds ratio [formula] under a two-locus dominant × dominant disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
(C) The power of the test statistic [formula], the “fast-epistasis” test in PLINK, and logistic regression analysis for testing interaction between two unlinked loci as a function of the traditional odds ratio [formula] under a two-locus additive × additive disease model, where the number of individuals in both the case and control groups is 1,000, the significance level is 0.001, and the odds ratios at the two loci were [formula].
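Power curves and type I error rates of this kind are typically estimated by Monte Carlo simulation: generate many datasets under a chosen model, apply the test to each, and record the fraction rejected at the stated significance level. The sketch below shows only that counting step. The callable `simulate_p_value` is a hypothetical stand-in for "simulate one dataset and return the test's P-value"; the paper's actual simulation design is not described in this excerpt.

```python
import random

def estimate_power(simulate_p_value, alpha, n_replicates, seed=0):
    """Estimate the rejection rate of a test by Monte Carlo simulation.

    simulate_p_value: hypothetical callable taking a random.Random
        instance and returning one P-value from one simulated dataset.
    alpha: significance level, e.g. 0.05, 0.01, or 0.001.
    Under an alternative model the result estimates power; under the
    null model it estimates the type I error rate.
    """
    rng = random.Random(seed)
    rejections = sum(simulate_p_value(rng) < alpha for _ in range(n_replicates))
    return rejections / n_replicates
```

As a sanity check, feeding in uniformly distributed P-values (a well-calibrated test under the null) should give a rejection rate close to `alpha`.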
Figure 6. Interacting SNPs that were located in 19 pathways formed a network.
Each pathway is represented by a numbered ellipse. SNPs are represented by nodes placed inside the pathways in which they are located. Each SNP is labeled with its rs number and the name of the gene in which it is located. A pathway and the SNPs it harbors are shown in the same color. Interacting SNPs are connected by solid light green lines.
