PLoS Genet. 2010 Oct 14;6(10):e1001156.
doi: 10.1371/journal.pgen.1001156.

A novel adaptive method for the analysis of next-generation sequencing data to detect complex trait associations with rare variants due to gene main effects and interactions

Dajiang J Liu et al. PLoS Genet.

Abstract

There is solid evidence that rare variants contribute to complex disease etiology. Next-generation sequencing technologies make it possible to uncover rare variants within candidate genes, exomes, and genomes. Working in a novel framework, the kernel-based adaptive cluster (KBAC) method was developed to perform powerful gene/locus-based rare variant association testing. The KBAC combines variant classification and association testing in a coherent framework. Covariates can also be incorporated in the analysis to control for potential confounders, including age, sex, and population substructure. To evaluate the power of the KBAC: 1) variant data were simulated using rigorous population genetic models for both Europeans and Africans, with parameters estimated from sequence data, and 2) phenotypes were generated using models motivated by complex diseases, including breast cancer and Hirschsprung's disease. The KBAC is shown to have superior power compared to other rare variant analysis methods, such as the combined multivariate and collapsing (CMC) method and the weighted sum statistic (WSS). In the presence of variant misclassification and gene interaction, association testing using the KBAC is particularly advantageous. The KBAC method was also applied to sequence data from the Dallas Heart Study to test for associations between energy metabolism traits and rare variants in the ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes. A number of novel associations were identified, including the associations of high-density lipoprotein and very-low-density lipoprotein with ANGPTL4. The KBAC method is implemented in a user-friendly R package.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Quantile-Quantile (QQ) plot of p-values obtained from Monte Carlo approximation (left panel), permutation (right panel), and theoretical expectations.
P-values were estimated using 10,000 iterations for the Monte Carlo approximation and 10,000 permutations for the permutation procedure. Four sample sizes were investigated: 200 cases/200 controls, 300 cases/300 controls, 400 cases/400 controls, and 500 cases/500 controls. A total of 3,000 replicates were used to generate the QQ plot for each sample size.
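The QQ comparison described in this caption can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function name `qq_points` is hypothetical, and the plotting position i / (n + 1) for the expected quantiles is an assumption, since the caption does not say which convention was used.

```python
import math


def qq_points(pvalues):
    """Pair observed p-values with their expected quantiles under a uniform null.

    Returns (expected, observed) pairs on the -log10 scale, ordered from the
    smallest p-value to the largest. All p-values must be strictly positive.
    """
    n = len(pvalues)
    observed = sorted(pvalues)
    # Assumption: the i-th smallest of n uniform p-values has expectation i / (n + 1).
    expected = [i / (n + 1) for i in range(1, n + 1)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]
```

If the p-values from the null replicates are well calibrated, the resulting points should lie close to the diagonal.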
Figure 2. Impact of misclassifications under main effects model with fixed genetic effects using simulated SFS for AA.
Each causal rare variant has an OR = 3.0. Power comparisons were made for the KBAC, WSS, CMC, and RVE when 0% to 60% of causal rare variants are excluded from the analysis (left panel) and when 0% to 100% of non-causal rare variants are included (right panel). A sample size of 1,000 cases and 1,000 controls was used for each scenario. P-values were empirically estimated using 5,000 permutations, and power was evaluated at significance level α (value given only as a formula image in the original) using 2,000 replicates for each scenario.
Figure 3. Impact of misclassifications under main effects model with variable genetic effects using simulated SFS for AA.
The disease odds for causal variants are inversely correlated with their MAFs and lie within the range of 2 to 20. Power comparisons were made for the KBAC, WSS, CMC, and RVE when 0% to 60% of causal rare variants are excluded from the analysis (left panel) and when 0% to 100% of non-causal rare variants are included (right panel). A sample size of 1,000 cases and 1,000 controls was used for each scenario. P-values were empirically estimated using 5,000 permutations, and power was evaluated at significance level α (value given only as a formula image in the original) using 2,000 replicates for each scenario.
Figure 4. Power comparisons for within gene (left panel) and between gene interaction model (right panel) with simulated SFS for AA.
Power was evaluated for the KBAC, WSS, CMC, and RVE. A sample size of 1,000 cases and 1,000 controls was used for the within-gene interaction model, and a sample size of 300 cases and 300 controls was used for the between-gene interaction model. Scenarios with different proportions of causal variants were considered. P-values were empirically estimated using 5,000 permutations, and power was evaluated at significance level α (value given only as a formula image in the original) using 2,000 replicates.
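The captions for Figures 2 to 4 describe empirical p-values obtained by permutation and power estimated as the fraction of replicates that reach significance. A minimal sketch of that general procedure follows. It is not the KBAC statistic: the stand-in statistic is a simple one-sided difference in rare-variant carrier proportions, and the function names, the add-one correction, and the default number of permutations are all assumptions made for illustration.

```python
import random


def carrier_diff(carrier, is_case):
    """Stand-in statistic: carrier proportion in cases minus that in controls."""
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_carriers = sum(c for c, y in zip(carrier, is_case) if y)
    ctrl_carriers = sum(c for c, y in zip(carrier, is_case) if not y)
    return case_carriers / n_case - ctrl_carriers / n_ctrl


def permutation_pvalue(carrier, is_case, n_perm=1000, rng=None):
    """One-sided empirical p-value from shuffling case/control labels."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    observed = carrier_diff(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if carrier_diff(carrier, labels) >= observed:
            hits += 1
    # Assumption: add-one correction, so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def estimate_power(pvalues, alpha):
    """Fraction of simulated replicates whose p-value is at or below alpha."""
    return sum(p <= alpha for p in pvalues) / len(pvalues)
```

In a power study of this kind, `permutation_pvalue` would be called once per simulated replicate, and the collected p-values passed to `estimate_power` with the chosen significance level.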
