Genet Epidemiol. 2011 Jul;35(5):398-409.
doi: 10.1002/gepi.20588. Epub 2011 May 18.

Detecting rare and common variants for complex traits: sibpair and odds ratio weighted sum statistics (SPWSS, ORWSS)


Tao Feng et al. Genet Epidemiol. 2011 Jul.

Abstract

It is generally known that risk variants segregate together with a disease within families, but existing statistical methods for detecting rare variants have not used this information. Here we introduce two weighted sum statistics that can be applied to either genome-wide association data or resequencing data to identify rare disease variants: one with weights calculated from sibpairs and one with weights calculated from odds ratios. We evaluated the two methods via extensive simulations under different disease models. We compared the proposed methods with the weighted sum statistic (WSS) proposed by Madsen and Browning, keeping the same genotyping or resequencing cost. Our methods clearly demonstrate more statistical power than the WSS. In addition, we found that using sibpair information can increase power by more than 40% over using only unrelated samples. We applied our methods to the Framingham Heart Study (FHS) and Wellcome Trust Case Control Consortium (WTCCC) hypertension datasets. Although we did not identify any genes reaching a genome-wide significance level, we found variants in the candidate gene angiotensinogen significantly associated with hypertension at P = 6.9 × 10⁻⁴, whereas the most significant single-SNP association evidence is P = 0.063. We further applied the odds ratio weighted method to the IFIH1 gene for type 1 diabetes in the WTCCC data. Our method yielded a P-value of 4.82 × 10⁻⁴, much more significant than that obtained by haplotype-based methods. We demonstrated that family data are extremely informative in searching for rare variants underlying complex traits, and that the odds ratio weighted sum statistic is more efficient than currently existing methods.
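The abstract does not give the formula for the odds ratio weighted sum statistic, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes per-variant weights equal to log odds ratios of allele counts (with a 0.5 continuity correction), a per-individual score equal to the weighted sum of minor-allele counts, and a permutation test that re-estimates the weights for every shuffled set of labels. The function names, the test statistic (difference in mean score between cases and controls), and the permutation scheme are all illustrative assumptions.

```python
import numpy as np

def log_odds_ratio_weights(geno, is_case):
    """Per-variant log odds ratios from allele counts (assumed weighting scheme).

    geno: (n_individuals, n_variants) array of minor-allele counts 0/1/2.
    is_case: boolean array of length n_individuals.
    """
    geno = np.asarray(geno)
    is_case = np.asarray(is_case, dtype=bool)
    case, ctrl = geno[is_case], geno[~is_case]
    # 2x2 allele table per variant; 0.5 is added to every cell so that
    # variants absent from one group still yield a finite weight.
    a = case.sum(axis=0) + 0.5                       # minor alleles, cases
    b = 2 * case.shape[0] - case.sum(axis=0) + 0.5   # other alleles, cases
    c = ctrl.sum(axis=0) + 0.5                       # minor alleles, controls
    d = 2 * ctrl.shape[0] - ctrl.sum(axis=0) + 0.5   # other alleles, controls
    return np.log((a * d) / (b * c))

def weighted_sum_statistic(geno, is_case):
    """Difference in mean weighted score between cases and controls."""
    geno = np.asarray(geno)
    is_case = np.asarray(is_case, dtype=bool)
    scores = geno @ log_odds_ratio_weights(geno, is_case)
    return scores[is_case].mean() - scores[~is_case].mean()

def permutation_p_value(geno, is_case, n_perm=1000, seed=0):
    """One-sided permutation p-value; weights are re-estimated per permutation."""
    rng = np.random.default_rng(seed)
    is_case = np.asarray(is_case, dtype=bool)
    observed = weighted_sum_statistic(geno, is_case)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(is_case)
        if weighted_sum_statistic(geno, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Re-estimating the weights inside each permutation matters in this sketch: the weights are learned from the same case-control labels being tested, so comparing against permutations that reuse the observed weights would overstate significance.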

Figures

Figure 1
Comparison of power for different relative risks. All the risk allele frequencies are less than 0.02, with a cumulative risk allele frequency of 10%. The power was calculated at significance level α = 10⁻⁶ based on 1,000 replications. Three disease models have been assumed: Dominant, Additive and Recessive. All the risk alleles were treated the same. Top panel: for SPWSSAA, we simulated 200 affected sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,400 cases and 2,000 controls for the association test. Bottom panel: for SPWSSAU, we simulated 200 discordant sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,200 cases and 2,200 controls for the association test. For WSS, we used the threshold 0.02 to define rare variants; that is, all the risk variants belong to the rare variant group.
Figure 2
Comparison of power for different relative risks. For each replication, we assumed there is one common risk variant (MAF between 0.05 and 0.08) and the remaining risk variants are rare (MAF < 0.02), with a cumulative risk allele frequency of 10%. The power was calculated at significance level α = 10⁻⁶ based on 1,000 replications. Three disease models have been assumed: Dominant, Additive and Recessive. All the risk alleles were treated the same. Top panel: for SPWSSAA, we simulated 200 affected sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,400 cases and 2,000 controls for the association test. Bottom panel: for SPWSSAU, we simulated 200 discordant sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,200 cases and 2,200 controls for the association test. For WSS, we used three thresholds (0.02, 0.05 and 0.08) to define the rare variants.
Figure 3
Comparison of power for different relative risks. For each replication, we assumed there is one common risk variant (MAF between 0.05 and 0.08) and the remaining risk variants are rare (MAF < 0.02), with a cumulative risk allele frequency of 10%. The power was calculated at significance level α = 10⁻⁶ based on 1,000 replications. Three disease models have been assumed: Dominant, Additive and Recessive. All the risk alleles were treated the same. Top panel: for SPWSSAA, we simulated 400 affected sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,800 cases and 2,000 controls for the association test. Bottom panel: for SPWSSAU, we simulated 400 discordant sibpairs for calculating the weights, and 2,000 cases and 2,000 controls for the association test. For ORWSS and WSS, we simulated 2,400 cases and 2,400 controls for the association test. For WSS, we used three thresholds (0.02, 0.05 and 0.08) to define the rare variants.
Figure 4
Comparison of power for SPWSS when the total sample size is fixed. The power was calculated at significance level α = 10⁻⁶ based on 1,000 replications. Three disease models have been assumed: Dominant, Additive and Recessive. All the risk alleles were treated the same. We compared 200 sibpairs, 2,000 cases and 2,000 controls with 400 sibpairs, 1,800 cases and 1,800 controls. Top panel: SPWSSAA; Bottom panel: SPWSSAU.
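The sample sizes in the captions appear to be chosen so that every method in a comparison genotypes the same number of individuals, in line with the abstract's "same genotyping or resequencing cost". On our reading, which the captions do not state explicitly, each sibpair counts as two genotyped people: affected pairs are added to the cases for ORWSS and WSS, and discordant pairs are split between cases and controls. The short check below verifies that arithmetic and adds an empirical power helper in the sense the captions describe (fraction of replications significant at α). The helper names are hypothetical.

```python
def individuals(sibpairs=0, cases=0, controls=0):
    """Number of genotyped individuals; each sibpair contributes two people."""
    return 2 * sibpairs + cases + controls

# (SPWSS design, comparison design) for each panel, taken from the captions.
designs = {
    "Figures 1-2, top":    (individuals(200, 2000, 2000), individuals(0, 2400, 2000)),
    "Figures 1-2, bottom": (individuals(200, 2000, 2000), individuals(0, 2200, 2200)),
    "Figure 3, top":       (individuals(400, 2000, 2000), individuals(0, 2800, 2000)),
    "Figure 3, bottom":    (individuals(400, 2000, 2000), individuals(0, 2400, 2400)),
    "Figure 4":            (individuals(200, 2000, 2000), individuals(400, 1800, 1800)),
}

def empirical_power(p_values, alpha=1e-6):
    """Fraction of simulation replications with a p-value below alpha."""
    p_values = list(p_values)
    return sum(p < alpha for p in p_values) / len(p_values)

for name, (first, second) in designs.items():
    print(f"{name}: {first} vs {second} individuals")
```

Every pair matches: 4,400 individuals for Figures 1, 2 and 4, and 4,800 for Figure 3.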

References

    1. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007;447(7145):661–78.
    2. Agresti A. Categorical data analysis. New York: Wiley-Interscience; 2002.
    3. Bansal V, Libiger O, Torkamani A, Schork NJ. Statistical analysis strategies for association studies involving rare variants. Nat Rev Genet. 2010;11(11):773–85.
    4. Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447(7146):799–816.
    5. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81(5):1084–97.