PLoS Genet. 2011 Mar;7(3):e1001322.
doi: 10.1371/journal.pgen.1001322. Epub 2011 Mar 3.

Testing for an unusual distribution of rare variants

Benjamin M Neale et al. PLoS Genet. 2011 Mar.

Abstract

Technological advances make it possible to use high-throughput sequencing as a primary discovery tool of medical genetics, specifically for assaying rare variation. Still this approach faces the analytic challenge that the influence of very rare variants can only be evaluated effectively as a group. A further complication is that any given rare variant could have no effect, could increase risk, or could be protective. We propose here the C-alpha test statistic as a novel approach for testing for the presence of this mixture of effects across a set of rare variants. Unlike existing burden tests, C-alpha, by testing the variance rather than the mean, maintains consistent power when the target set contains both risk and protective variants. Through simulations and analysis of case/control data, we demonstrate good power relative to existing methods that assess the burden of rare variants in individuals.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Mixtures of biased coins in a set of largely neutral coins generate substantially increased variances compared to uniform coins with the same bias.
(A) shows the distribution of outcomes of coin tosses generated using an 80∶20 mixture of neutral coins and biased coins (probability of a head = .9), compared with the outcomes of a series of biased coin tosses (probability of a head = .58); the mixed coin toss (blue) has the same mean bias (p = .58) as the biased coin toss (black). (B) shows the distribution for a 10∶80∶10 mixture of a biased coin (probability of a head = .1), a neutral coin, and a biased coin (probability of a head = .9), compared with the outcomes of a series of neutral coin tosses. In both simulations, coins are selected and flipped 10 times, and the resulting number of heads, ranging from 0 through 10, is shown. The increased variance of the outcomes in the mixture setting carries information about the presence of some non-neutral coins in the experiment.
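The variance inflation described in this caption can be checked exactly, without simulation. The following is a minimal sketch, assuming each selected coin is flipped 10 times so that the number of heads follows a weighted mixture of binomial distributions; the function names are illustrative and do not come from the paper.

```python
from math import comb

def binom_pmf(n, p):
    """Probability of k heads in n flips of a coin with P(head) = p, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def mixture_pmf(n, components):
    """Distribution of heads when a coin is drawn from `components`
    (a list of (weight, p_head) pairs) and then flipped n times."""
    pmf = [0.0] * (n + 1)
    for weight, p in components:
        for k, mass in enumerate(binom_pmf(n, p)):
            pmf[k] += weight * mass
    return pmf

def mean_and_variance(pmf):
    """Mean and variance of the number of heads under the given distribution."""
    mean = sum(k * m for k, m in enumerate(pmf))
    var = sum((k - mean) ** 2 * m for k, m in enumerate(pmf))
    return mean, var

N_FLIPS = 10

# Panel A: 80:20 mixture of neutral (.5) and biased (.9) coins vs. a uniform .58 coin.
mean_a_mix, var_a_mix = mean_and_variance(mixture_pmf(N_FLIPS, [(0.8, 0.5), (0.2, 0.9)]))
mean_a_uni, var_a_uni = mean_and_variance(binom_pmf(N_FLIPS, 0.58))

# Panel B: 10:80:10 mixture of .1, .5, and .9 coins vs. a neutral coin.
mean_b_mix, var_b_mix = mean_and_variance(
    mixture_pmf(N_FLIPS, [(0.1, 0.1), (0.8, 0.5), (0.1, 0.9)])
)
mean_b_neu, var_b_neu = mean_and_variance(binom_pmf(N_FLIPS, 0.5))

if __name__ == "__main__":
    print(f"A: mixture mean={mean_a_mix:.2f} var={var_a_mix:.3f}; "
          f"uniform mean={mean_a_uni:.2f} var={var_a_uni:.3f}")
    print(f"B: mixture mean={mean_b_mix:.2f} var={var_b_mix:.3f}; "
          f"neutral mean={mean_b_neu:.2f} var={var_b_neu:.3f}")
```

Under these assumptions the means match within each panel (5.8 heads in A, 5 heads in B), while the mixture variances are roughly double those of the single-coin comparisons. In panel B the mean is unchanged from the neutral coin, so only the variance reveals the non-neutral coins.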
Figure 2. The distribution of recurrent, low frequency non-synonymous variants.
(A) APOB in 100 high and 100 low extremes of triglyceride levels drawn from the Malmo Diet and Cancer Study – Cardiovascular Arm, and (B) NOD2 in 350 cases of Crohn's disease and 350 controls collected by the NIDDK IBD Genetics Consortium; variants were identified from pooled data and then individually genotyped. The background (gray) represents the binomial probability distribution, while the foreground (red points) shows the observed data from APOB and NOD2 sequencing. For example, in APOB (A), the n = 3 row indicates three observed variants: one seen in 3 cases and 0 controls, one seen in 2 cases and 1 control, and one seen in 0 cases and 3 controls.
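To make the idea of testing the variance concrete, the sketch below applies a C-alpha-style statistic to the three n = 3 variants described in this caption. The formula is not given on this page: the code assumes the commonly described form, in which each variant contributes (y - n*p0)^2 - n*p0*(1 - p0), with y the copies seen in cases, n the total copies, and p0 the proportion of cases in the sample. It treats the 100 high extremes as "cases", so p0 = 0.5. The function name c_alpha and the normal approximation for the p-value are assumptions of this sketch and may differ in detail from the authors' implementation.

```python
from math import comb, erfc, sqrt

def c_alpha(variants, p0):
    """Sketch of a C-alpha-style variance test (assumed form, see lead-in).

    variants: list of (case_copies, control_copies), one pair per variant.
    p0: proportion of cases in the sample, the null probability that a copy
        of any variant falls in a case.
    Returns (t, c, z, p_value), with a one-sided normal-approximation p-value.
    """
    t = 0.0  # observed excess variance summed over variants
    c = 0.0  # variance of t under the binomial null
    for case_copies, control_copies in variants:
        n = case_copies + control_copies
        null_var = n * p0 * (1 - p0)
        t += (case_copies - n * p0) ** 2 - null_var
        # Each term has mean zero under Binomial(n, p0), so its variance
        # is the expected square of the term.
        c += sum(
            ((u - n * p0) ** 2 - null_var) ** 2
            * comb(n, u) * p0**u * (1 - p0) ** (n - u)
            for u in range(n + 1)
        )
    # Note: c is zero if every variant is a singleton and p0 = 0.5; this
    # sketch does not handle that case (no pooling of singletons).
    z = t / sqrt(c)
    p_value = 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal
    return t, c, z, p_value

# The n = 3 row from the APOB panel: (cases, controls) for each variant.
apob_n3_row = [(3, 0), (2, 1), (0, 3)]
t, c, z, p_value = c_alpha(apob_n3_row, p0=0.5)

if __name__ == "__main__":
    print(f"T={t:.2f} c={c:.2f} Z={z:.3f} one-sided p={p_value:.4f}")
```

The two variants seen only in cases or only in controls push the statistic in the same direction, because the squared deviation ignores the sign of the effect. A burden-style comparison of case and control counts would see these same three variants as nearly balanced (5 copies in cases versus 4 in controls). With only three variants the normal approximation is crude, so the p-value here is illustrative rather than a result from the paper.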
Figure 3. Power comparisons and variants.
(A) shows power comparisons for the population genetics model simulations. Power comparisons are for C-alpha, Madsen-Browning (MB), Variable threshold (VT), and Li-Leal's approach (presence/absence, Li-Leal_p, and count of rare variants, Li-Leal_c). These simulations reflect the presence of selection on the variation that predisposes to phenotype. As the mixing proportion between risk and protective variants increases (moving from mixtures 1 to 6, which reflect a 0, 10, 20, 30, 40, and 50% chance that any of the phenotypically relevant variants is protective rather than risk-conferring), C-alpha maintains power, while the other tests lose power. In (B), each of 6 variants explains 0.1% of the variance of the phenotype. All three approaches have high power when all the effects are detrimental. For burden tests, the power drops markedly when 3 variants are protective and 3 are detrimental. “Selected” controls are chosen from the lower 1% of the liability distribution. The solid (dashed) lines represent power for selected (unselected) controls.
