Front Genet. 2013 Nov 12:4:233.
doi: 10.3389/fgene.2013.00233. eCollection 2013.

Gene-based multiple regression association testing for combined examination of common and low frequency variants in quantitative trait analysis

Yun Joo Yoo et al. Front Genet. 2013.

Abstract

Multi-marker methods for genetic association analysis can be applied to common and low frequency SNPs to improve power. Regression models are an intuitive way to formulate multi-marker tests. In previous studies we evaluated regression-based multi-marker tests for common SNPs and, by identifying bins of correlated SNPs, developed a multi-bin linear combination (MLC) test that is a compromise between a 1 df linear combination test and a multi-df global test. Bins of SNPs in high linkage disequilibrium (LD) are identified, and a linear combination of individual SNP statistics is constructed within each bin. Association with the phenotype is then represented by an overall statistic whose df equals the number of bins. In this report we evaluate multi-marker tests for SNPs that occur at low frequencies. Many linear and quadratic multi-marker tests are suitable for common or low frequency variant analysis. We compared the performance of the MLC tests with various linear and quadratic statistics in joint or marginal regressions. For these comparisons, we performed a simulation study of genotypes and quantitative traits for 85 genes with many low frequency SNPs, based on HapMap Phase III. We compared the tests using (1) the set of all SNPs in a gene, (2) the set of common SNPs in a gene (MAF ≥ 5%), and (3) the set of low frequency SNPs in a gene (1% ≤ MAF < 5%). For different trait models based on low frequency causal SNPs, we found that combined analysis using all SNPs, both common and low frequency, is a good and robust choice, whereas using common SNPs alone or low frequency SNPs alone can lose power. MLC tests performed well in combined analysis except where two low frequency causal SNPs with opposing effects are positively correlated. Overall, across the different analysis sets, the joint regression Wald test showed consistently good performance, whereas other statistics, including those based on marginal regression, had lower power in some situations.
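
As an illustration of the three analysis sets and the joint regression Wald test named above, the following is a minimal sketch, not the authors' implementation. It assumes genotypes coded as 0/1/2 allele counts in a NumPy array, treats "all SNPs" as those with MAF ≥ 1%, and uses hypothetical function names (`maf`, `analysis_sets`, `joint_wald`).

```python
import numpy as np


def maf(genotypes):
    """Minor allele frequency per SNP from an (n, m) matrix of 0/1/2 allele counts."""
    freq = np.asarray(genotypes, dtype=float).mean(axis=0) / 2.0
    return np.minimum(freq, 1.0 - freq)


def analysis_sets(genotypes, low=0.01, common=0.05):
    """Column indices for the three SNP sets described in the abstract.

    Assumption: "all" means common plus low frequency SNPs (MAF >= low).
    """
    f = maf(genotypes)
    return {
        "all": np.where(f >= low)[0],
        "common": np.where(f >= common)[0],
        "low_frequency": np.where((f >= low) & (f < common))[0],
    }


def joint_wald(genotypes, trait):
    """Generalized Wald statistic for H0: every SNP coefficient is zero.

    Fits one linear regression of the trait on all SNPs jointly and returns
    (statistic, df). Pseudo-inverses keep the computation defined when SNPs
    are collinear; df is the rank of the centered genotype matrix.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(trait, dtype=float)
    n = g.shape[0]
    X = np.column_stack([np.ones(n), g])          # intercept + SNPs
    beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - rank)           # residual variance
    cov = sigma2 * np.linalg.pinv(X.T @ X)        # covariance of estimates
    b = beta[1:]                                  # SNP coefficients only
    V = cov[1:, 1:]
    stat = float(b @ np.linalg.pinv(V) @ b)
    df = int(np.linalg.matrix_rank(g - g.mean(axis=0)))
    return stat, df
```

Calling `joint_wald(genotypes[:, sets["all"]], trait)` with `sets = analysis_sets(genotypes)` gives the statistic for the combined analysis; a p-value could then be taken from the upper tail of a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(stat, df)` if SciPy is available.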

Keywords: common variant analysis; generalized Wald test; genetic association analysis; indirect association; minimum p-value test; multi-bin multi-marker tests; multi-marker association analysis; rare variant analysis.


Figures

Figure 1
The distribution of common and low frequency SNPs for each of the 85 genes used for the simulation study.
Figure 2
Averaged empirical power of gene-based tests for three analysis sets obtained under five different trait models.
Figure 3
Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 1. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.
Figure 4
Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 2. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.
Figure 5
Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 3. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.
Figure 6
Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 4. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.
Figure 7
Power of gene-based tests using three analysis sets of SNPs for 85 genes under trait Model 5. Genes are ordered along the horizontal axis according to the empirical power of the Wald test using only low frequency SNPs.
Figure 8
The range of the linkage disequilibrium measure r (correlation coefficient) for a given MAF of rare SNP A over a range of MAFs of SNP B. pA is the MAF of SNP A, pB is the MAF of SNP B, and pAB is the frequency of the haplotype carrying the rare alleles of both SNP A and SNP B.
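
The quantities in the Figure 8 caption can be related with the standard definition r = (pAB − pA·pB) / sqrt(pA(1 − pA)·pB(1 − pB)), together with the constraint max(0, pA + pB − 1) ≤ pAB ≤ min(pA, pB) on the haplotype frequency. The sketch below computes the attainable range of r under those textbook definitions; the function name `r_range` is hypothetical, and the code is not taken from the paper.

```python
import math


def r_range(p_a, p_b):
    """Smallest and largest attainable LD correlation r for allele frequencies p_a, p_b.

    Uses r = (p_ab - p_a * p_b) / sqrt(p_a (1 - p_a) p_b (1 - p_b)), where the
    haplotype frequency p_ab must lie in [max(0, p_a + p_b - 1), min(p_a, p_b)].
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must be strictly between 0 and 1")
    denom = math.sqrt(p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    lowest_p_ab = max(0.0, p_a + p_b - 1.0)
    highest_p_ab = min(p_a, p_b)
    return ((lowest_p_ab - p_a * p_b) / denom,
            (highest_p_ab - p_a * p_b) / denom)
```

For example, `r_range(0.02, 0.02)` has an upper bound of 1, while `r_range(0.02, 0.3)` is confined to a much narrower interval: a rare SNP can be only weakly correlated with a SNP whose MAF is very different.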
