Genet Epidemiol. 2012 Nov;36(7):675-85.
doi: 10.1002/gepi.21662. Epub 2012 Aug 3.

A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders

Yee Him Cheung et al. Genet Epidemiol. 2012 Nov.

Abstract

Next-generation sequencing technology has enabled a paradigm shift in genetic association studies from the common disease/common variant hypothesis to the common disease/rare-variant hypothesis. Analyzing individual rare variants is known to be underpowered; therefore, association methods have been developed that aggregate variants across a genetic region, which for exome sequencing is usually a gene. The foreseeable widespread use of whole genome sequencing poses new challenges in statistical analysis. It calls for new rare-variant association methods that are statistically powerful, robust against high levels of noise due to the inclusion of noncausal variants, and yet computationally efficient. We propose a simple and powerful statistic that combines the disease-associated P-values of individual variants using a weight that is the inverse of the expected standard deviation of the allele frequencies under the null. This approach, dubbed the Sigma-P method, is extremely robust to the inclusion of a high proportion of noncausal variants and is also powerful when both detrimental and protective variants are present within a genetic region. The performance of the Sigma-P method was tested using simulated data based on realistic population demographic and disease models, and its power was compared to that of several previously published methods. The results demonstrate that this method generally outperforms other rare-variant association methods over a wide range of models. Additionally, sequence data on the ANGPTL family of genes from the Dallas Heart Study were tested for associations with nine metabolic traits, and both known and novel putative associations were uncovered using the Sigma-P method.
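
The abstract does not give the formula for the statistic, so the following is only an illustrative sketch of the general idea it describes: per-variant P-values combined with weights equal to the inverse of the null standard deviation of the allele frequency. Everything specific here is an assumption and not the authors' definition: the two-proportion z-test used for each variant, the binomial form of the standard deviation with a pseudo-count, the use of -ln(P) as the quantity being summed, the permutation test for region-level significance, and all function names.

```python
import math
import random


def variant_p_value(case_alt, case_total, ctrl_alt, ctrl_total):
    """Two-sided two-proportion z-test on allele counts (assumed per-variant test)."""
    pooled = (case_alt + ctrl_alt) / (case_total + ctrl_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / case_total + 1 / ctrl_total))
    if se == 0:  # monomorphic variant carries no information
        return 1.0
    z = (case_alt / case_total - ctrl_alt / ctrl_total) / se
    return math.erfc(abs(z) / math.sqrt(2))


def null_sd_weight(alt_count, total_alleles):
    """Inverse of an assumed binomial SD of the pooled allele frequency."""
    q = (alt_count + 1) / (total_alleles + 2)  # pseudo-count keeps q off 0 and 1
    return 1.0 / math.sqrt(q * (1 - q) / total_alleles)


def combined_statistic(genotypes, is_case):
    """Weighted sum of -ln(P) over the variants in one region.

    genotypes: one list per individual of alternate-allele counts (0, 1 or 2).
    is_case:   one boolean per individual.
    """
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    stat = 0.0
    for j in range(len(genotypes[0])):
        case_alt = sum(g[j] for g, c in zip(genotypes, is_case) if c)
        ctrl_alt = sum(g[j] for g, c in zip(genotypes, is_case) if not c)
        p = variant_p_value(case_alt, 2 * n_case, ctrl_alt, 2 * n_ctrl)
        w = null_sd_weight(case_alt + ctrl_alt, 2 * (n_case + n_ctrl))
        stat += w * -math.log(max(p, 1e-300))  # floor avoids log(0)
    return stat


def permutation_p_value(genotypes, is_case, n_perm=200, seed=0):
    """Region-level P-value from shuffling case/control labels (assumed procedure)."""
    rng = random.Random(seed)
    observed = combined_statistic(genotypes, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if combined_statistic(genotypes, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because each variant contributes through the size of its P-value rather than the sign of its effect, a sum of this kind does not cancel when detrimental and protective variants sit in the same region, which is the property the abstract highlights. Neutral variants tend to have P-values near 1 and so add little to the total.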

Figures

Fig. 1.
Power comparisons under the additive model with a decreasing ratio of detrimental to neutral variants in genomic regions with different lengths: (panel A) 1:1 (~25 detrimental variants: ~25 neutral variants) in 3 kb, (panel B) 1:2 (~25 detrimental variants: ~50 neutral variants) in 4.5 kb, (panel C) 1:4 (~25 detrimental variants: ~100 neutral variants) in 7.5 kb, (panel D) 1:1 (~50 detrimental variants: ~50 neutral variants) in 6 kb, (panel E) 1:2 (~50 detrimental variants: ~100 neutral variants) in 9 kb, and (panel F) 1:4 (~50 detrimental variants: ~200 neutral variants) in 15 kb regions. Sample size is 500 cases and 500 controls.
Fig. 2.
Power comparisons under the dominant model with the ratio of detrimental to protective variants fixed at 1:1, and an increasing number of neutral variants: (panel A) 1:1:0 (~25 detrimental variants: ~25 protective variants: 0 neutral variants) in 3 kb, (panel B) 1:1:2 (~25 detrimental variants: ~25 protective variants: ~50 neutral variants) in 6 kb, (panel C) 1:1:4 (~25 detrimental variants: ~25 protective variants: ~100 neutral variants) in 9 kb, and (panel D) 1:1:8 (~25 detrimental variants: ~25 protective variants: ~200 neutral variants) in 15 kb regions. Sample size is 500 cases and 500 controls.
