Adv Appl Bioinform Chem. 2012;5:1-9.
doi: 10.2147/AABC.S33049. Epub 2012 Jul 24.

A rapid method for combined analysis of common and rare variants at the level of a region, gene, or pathway

David Curtis. Adv Appl Bioinform Chem. 2012.

Abstract

Previously described methods for the combined analysis of common and rare variants have disadvantages, such as requiring an arbitrary classification of variants or permutation testing to assess statistical significance. Here we propose a novel method that implements a weighting scheme based on allele frequencies observed in both cases and controls. Because the test is unbiased, scores can be analyzed with a standard t-test. To test its validity, we applied it to data for common, rare, and very rare variants simulated under the null hypothesis. To test its power, we applied it to simulated data in which association was present, including data using the observed allele frequencies of common and rare variants in NOD2 previously reported in cases of Crohn's disease and controls. The method produced results that conformed well to those expected under the null hypothesis. It demonstrated more power to detect association when rare and common variants were analyzed jointly, and the power increased further when rare variants were assigned higher weights. A total of 20,000 analyses of a gene containing 62 variants could be performed in 80 minutes on a laptop. This approach shows promise for the analysis of data currently emerging from genome-wide sequencing studies.
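
The procedure outlined in the abstract (weight each variant by its allele frequency pooled over cases and controls, sum the weighted allele counts into one score per subject, then compare cases and controls with a t-test) can be sketched as follows. This is a minimal illustration, not the author's implementation. The parabolic weight function is an assumption, chosen only because it equals 1 at q = 0.5 (the property stated in the Figure 2 caption) and equals f as q approaches 0. The function names and the simulated genotypes are hypothetical.

```python
import numpy as np
from scipy import stats


def parabolic_weight(q, f):
    """Assumed weight function: f at q = 0 or q = 1, and 1 at q = 0.5."""
    return 1.0 + 4.0 * (f - 1.0) * (q - 0.5) ** 2


def weighted_score_test(genotypes, is_case, f=10.0):
    """Compare weighted variant scores between cases and controls.

    genotypes: (subjects, variants) array of allele counts (0, 1, or 2).
    is_case:   boolean array with one entry per subject.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    # Allele frequency estimated from cases and controls together.
    q = genotypes.sum(axis=0) / (2.0 * genotypes.shape[0])
    weights = parabolic_weight(q, f)
    # One score per subject: weighted sum of that subject's allele counts.
    scores = genotypes @ weights
    t_stat, p_value = stats.ttest_ind(scores[is_case], scores[~is_case])
    return t_stat, p_value


# Hypothetical null data: one common variant, one rare variant, and
# twenty very rare variants, with no difference between the groups.
rng = np.random.default_rng(0)
n_cases = n_controls = 1000
freqs = np.array([0.45, 0.01] + [0.0005] * 20)
genotypes = rng.binomial(2, freqs, size=(n_cases + n_controls, freqs.size))
is_case = np.arange(n_cases + n_controls) < n_cases

t_stat, p_value = weighted_score_test(genotypes, is_case, f=10.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

Because the weights here depend only on the pooled frequency, case and control labels do not enter the weighting step, which is the property that lets the scores go straight into a t-test without permutation.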

Keywords: common; exome; genome; rare; sequence; variant.

Figures

Figure 1
Plot of the relative weight, r, using the originally proposed weighting scheme accorded to each allele of a variant with frequency q for sample size n = 2000. Note: For the smallest value of q = 0.0005, r is 22.4.
Figure 2
Plot of the weight, W, using the novel weighting scheme accorded to each allele of a variant with frequency q and weighting factor f = 20. Note: The value of W at q = 0.5 is 1.
Figure 3
Q-Q plot of −log(α) against −log(β). (A) Common variant (MAF = 0.45). (B) Rare variant (MAF = 0.01). (C) 20 very rare variants analyzed together (each with MAF = 0.0005). Abbreviations: MAF, minor allele frequency; α, target P-value; β, the proportion of simulations achieving α.
Figure 4
Q-Q plot of −log(α) against −log(β). (A) Weighting factor f = 1. (B) Weighting factor f = 10. (C) Weighting factor f = 100. (D) Weighting factor f = 1000. Abbreviations: α, target P-value; β, the proportion of simulations achieving α for combined analyses including all variants using a range of values for the weighting factor, f.
Figure 5
Plot of the mean −log(p) value obtained for different values of the weighting factor, f, when applied to combined analysis of all variants in simulated datasets of 1000 cases and 1000 controls.
Figure 6
Plot of the mean −log(p) value obtained for different values of the weighting factor, f, when applied to combined analysis using variant counts generated from those observed in NOD2. Note: Sample size consists of 453 cases and 103 controls.
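
The quantities plotted in Figures 3 and 4 can be computed from a set of simulated P-values as sketched below: for each target P-value α, β is the proportion of simulations whose P-value is at or below α, and under the null hypothesis the points should lie near the diagonal. This is an illustrative sketch only. The captions do not state the base of the logarithm, so base 10 is assumed here, and the function name and the uniform "null" P-values are hypothetical stand-ins for real simulation output.

```python
import numpy as np


def qq_points(p_values, alphas):
    """Return (-log10(alpha), -log10(beta)) for each target alpha.

    beta is the fraction of simulated P-values that are <= alpha.
    """
    p_values = np.asarray(p_values, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Compare every P-value with every alpha, then average over simulations.
    betas = (p_values[:, None] <= alphas[None, :]).mean(axis=0)
    # beta can be 0 for very small alpha; let that map to infinity quietly.
    with np.errstate(divide="ignore"):
        return -np.log10(alphas), -np.log10(betas)


# Hypothetical null simulations: P-values uniform on [0, 1].
rng = np.random.default_rng(1)
null_p = rng.uniform(size=20000)
alphas = np.array([0.5, 0.1, 0.05, 0.01])

x, y = qq_points(null_p, alphas)
for xi, yi in zip(x, y):
    print(f"-log10(alpha) = {xi:.2f}, -log10(beta) = {yi:.2f}")
```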
