BMC Genomics. 2012 Nov 24:13:667.
doi: 10.1186/1471-2164-13-667.

Weighted pedigree-based statistics for testing the association of rare variants

Yin Yao Shugart et al. BMC Genomics.

Abstract

Background: With the advent of next-generation sequencing (NGS) technologies, researchers are now generating a deluge of data on high-dimensional genomic variation, the analysis of which is likely to reveal rare variants involved in the complex etiology of disease. Standing in the way of such discoveries, however, is the fact that statistics for rare variants are currently designed for use with population-based data. In this paper, we introduce a pedigree-based statistic specifically designed to test for rare variants in family-based data. The additional power of pedigree-based statistics stems from the fact that, while rare variants related to diseases or traits of interest occur only infrequently in populations, such variants are enriched in families with multiple affected individuals. Although the proposed statistic can be applied with or without statistical weighting, our simulations show that its power increases when weighting schemes (WSS and VT) are applied.
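
To illustrate what weighting by variant frequency means in practice, the sketch below computes the population-based weights of Madsen and Browning (2009), in which each variant is weighted by the estimated standard deviation of its allele count among unaffected individuals, so that rarer variants contribute more to an individual's score. This is not the pedigree-corrected statistic introduced in this paper: it assumes unrelated individuals, and the function names and the input layout (one list of 0/1/2 minor-allele counts per individual) are hypothetical choices made for the example.

```python
import math

def madsen_browning_weights(genotypes, affected):
    """Per-variant weights for unrelated individuals (assumption: no pedigree correction).

    genotypes: one list per individual, holding minor-allele counts (0, 1, 2) per variant.
    affected:  one bool per individual.
    """
    n_total = len(genotypes)
    n_variants = len(genotypes[0])
    unaffected = [g for g, a in zip(genotypes, affected) if not a]
    n_unaffected = len(unaffected)
    weights = []
    for i in range(n_variants):
        # Minor-allele count of variant i among unaffected individuals.
        m_u = sum(g[i] for g in unaffected)
        # Smoothed frequency estimate; the +1 / +2 keeps it strictly between 0 and 1.
        q = (m_u + 1) / (2 * n_unaffected + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights

def genetic_scores(genotypes, weights):
    """Weighted burden score per individual: sum of allele counts divided by weights."""
    return [sum(count / w for count, w in zip(g, weights)) for g in genotypes]
```

In the original population-based method, the scores are then ranked, the ranks of affected individuals are summed, and significance is assessed by permuting affection status. Those steps, and the adjustment for relatedness among pedigree members, are omitted here.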

Results: Our working hypothesis was that, since rare variants are concentrated in families with multiple affected individuals, pedigree-based statistics should detect rare variants more powerfully than population-based statistics. To evaluate how well the new pedigree-based statistics perform in association studies, we developed a general framework for sequence-based association studies that handles data from pedigrees of various types as well as from unrelated individuals. In short, we developed a procedure for transforming population-based statistics into tests for family-based association. We also modified two existing tests, the weighted sum-square (WSS) test and the variable-threshold (VT) test, and applied both to our family-based collapsing methods. We demonstrate that the new family-based tests are more powerful than the corresponding population-based tests and that they have a reasonable type I error rate. To demonstrate feasibility, we applied the newly developed tests to a pedigree-based GWAS data set from the Framingham Heart Study (FHS). The FHS GWAS data contain approximately 5000 uncommon variants with frequencies less than 0.05. Potential association findings in these data demonstrate the feasibility of the software PB-STAR, which is now freely available to the public.
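
As a point of reference for the collapsing methods mentioned above, the following is a minimal sketch of a population-based collapsing test: variants at or below a frequency threshold are collapsed into a single carrier indicator, which is then compared between affected and unaffected individuals with a Pearson χ² statistic on the 2×2 table. It assumes unrelated individuals and does not include the correction for relatedness that the family-based tests in this paper apply. The function names, the input layout, and the default threshold (taken from the ≤0.005 cutoff quoted in the figure captions) are assumptions made for illustration.

```python
def collapse_rare(genotypes, maf_threshold=0.005):
    """Return 1 for each individual carrying at least one rare variant, else 0.

    genotypes: one list per individual, holding minor-allele counts (0, 1, 2) per variant.
    """
    n = len(genotypes)
    n_variants = len(genotypes[0])
    # Sample minor-allele frequency of each variant (2 alleles per individual).
    maf = [sum(g[i] for g in genotypes) / (2 * n) for i in range(n_variants)]
    # Keep variants that are observed at least once and fall at or below the threshold.
    rare = [i for i, f in enumerate(maf) if 0 < f <= maf_threshold]
    return [int(any(g[i] > 0 for i in rare)) for g in genotypes]

def chi_square_2x2(carrier, affected):
    """Pearson chi-square statistic (1 df, no continuity correction) for carrier vs. status."""
    a = sum(1 for c, d in zip(carrier, affected) if c and d)          # carrier, affected
    b = sum(1 for c, d in zip(carrier, affected) if c and not d)      # carrier, unaffected
    c2 = sum(1 for c, d in zip(carrier, affected) if not c and d)     # non-carrier, affected
    d2 = sum(1 for c, d in zip(carrier, affected) if not c and not d) # non-carrier, unaffected
    n = a + b + c2 + d2
    denom = (a + b) * (c2 + d2) * (a + c2) * (b + d2)
    if denom == 0:
        return 0.0  # a margin is empty, so the table carries no information
    return n * (a * d2 - b * c2) ** 2 / denom
```

At α = 0.05 the statistic would be compared with the 1-df χ² critical value of about 3.84. Applying this uncorrected version to related individuals would ignore the correlation among family members, which is the problem the family-based versions are designed to address.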

Conclusion: Our tests show that, for detecting rare variants, a pedigree-based design is more powerful than a population-based case-control design. We further demonstrate that the power of a pedigree-based statistic to detect rare variants increases with the proportion of affected individuals within the pedigree.

Figures

Figure 1
Power curves of the family-based corrected single-marker χ² test statistic as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 2
Power curves of the family-based collapsing test statistic (variants with frequencies ≤0.005 were collapsed) as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 3
Power curves of the family-based VT test statistic as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 4
Power curves of the family-based WSS test statistic as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 5
Power curves of the family-based corrected single-marker χ² test statistic as a function of the proportion of risk variants, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, a total of 1,800 sampled individuals, and a baseline penetrance of 0.01.
Figure 6
Power curves of the family-based collapsing test statistic (variants with frequencies ≤0.005 were collapsed) as a function of the proportion of risk variants, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, a total of 1,800 sampled individuals, and a baseline penetrance of 0.01.
Figure 7
Power curves of the family-based VT test statistic as a function of the proportion of risk variants, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, a total of 1,800 sampled individuals, and a baseline penetrance of 0.01.
Figure 8
Power curves of the family-based WSS test statistic as a function of the proportion of risk variants, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, a total of 1,800 sampled individuals, and a baseline penetrance of 0.01.
Figure 9
Power curves of the family-based corrected single-marker χ² test statistic under opposite directions of association as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 10
Power curves of the family-based collapsing test statistic (variants with frequencies ≤0.005 were collapsed) under opposite directions of association as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 11
Power curves of the family-based VT test statistic under opposite directions of association as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
Figure 12
Power curves of the family-based WSS test statistic under opposite directions of association as a function of the total number of individuals, at significance level α = 0.05, under seven settings: unrelated individuals in a case-control study, nuclear family groups 1 and 2, sib-pair groups 1 and 2, and three-generation family groups 1 and 2, assuming a dominant model, 20% risk variants, and a baseline penetrance of 0.01.
