Hum Hered. 2012;73(2):84-94.
doi: 10.1159/000336982. Epub 2012 Mar 22.

ARIEL and AMELIA: testing for an accumulation of rare variants using next-generation sequencing data

Jennifer L Asimit et al. Hum Hered. 2012.

Abstract

Objectives: There is increasing evidence that rare variants play a role in some complex traits, but their analysis is not straightforward. Locus-based tests become necessary due to low power in rare variant single-point association analyses. In addition, variant quality scores are available for sequencing data, but are rarely taken into account. Here, we propose two locus-based methods that incorporate variant quality scores: a regression-based collapsing approach and an allele-matching method.

Methods: Using simulated sequencing data, we compare four locus-based tests of trait association under different scenarios of data quality. We test two collapsing-based approaches and two allele-matching-based approaches; within each pair, one takes variant quality scores into account and the other ignores them. We implement the collapsing and allele-matching approaches accounting for variant quality in the freely available ARIEL and AMELIA software.

Results: The incorporation of variant quality scores in locus-based association tests has power advantages over weighting each variant equally. The allele-matching methods are robust to the presence of both protective and risk variants in a locus, while collapsing methods exhibit a dramatic loss of power in this scenario.

Conclusions: The incorporation of variant quality scores should be a standard protocol when performing locus-based association analysis on sequencing data. The ARIEL and AMELIA software implement collapsing and allele-matching locus association analysis methods, respectively, that allow the incorporation of variant quality scores.
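To make the idea of a quality-aware collapsing score concrete, the following is a minimal sketch in Python. It assumes each rare variant carries a probability that its call is correct and that an individual's collapsed score is the sum of minor-allele counts weighted by those probabilities. The function name `weighted_burden` and this particular weighting are illustrative assumptions; the abstract does not specify the exact form used in ARIEL.

```python
from typing import Sequence


def weighted_burden(genotypes: Sequence[int], call_probs: Sequence[float]) -> float:
    """Collapse one individual's rare-variant genotypes into a single score.

    genotypes  : minor-allele counts (0, 1 or 2) at each rare variant in the locus
    call_probs : probability that each variant call is correct (hypothetical weights)

    Assumption: each variant contributes its allele count multiplied by its
    call probability. Setting every probability to 1.0 gives the unweighted
    collapsed count, i.e. the quality scores are ignored.
    """
    if len(genotypes) != len(call_probs):
        raise ValueError("genotypes and call_probs must have the same length")
    return sum(g * p for g, p in zip(genotypes, call_probs))


# Example: three rare variants, the first called with high confidence.
score_weighted = weighted_burden([1, 0, 2], [0.9, 0.5, 0.5])
score_unweighted = weighted_burden([1, 0, 2], [1.0, 1.0, 1.0])
```

In a regression-based collapsing test, a per-individual score of this kind would serve as the predictor of case-control status; down-weighting low-confidence calls reduces the influence of likely false-positive variants.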

Figures

Fig. 1
Distributions of converted quality scores used for simulating data. The distribution of positive converted quality scores (probabilities of correct variant call) for a 50-kb region (left panel). The middle and right panels show the (converted) quality scores filtered at Q1 and Q10, respectively.
Fig. 2
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA with high-quality causal variants. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have non-consensus variant quality scores, and the causal variants are set to have a phred-scaled quality score of 10 (probability of correct call is 0.90). The total sample size (cases + controls) is 1,000 (upper) and 2,000 (lower).
Fig. 3
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA for Q10 and Q1 filtering of variants. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have non-consensus variant quality scores, where (upper) all variants have a phred-scaled quality score ≥ 10 (probability of correct call is ≥ 0.90) and (lower) all variants have a phred-scaled quality score ≥ 1 (probability of correct call is ≥ 0.205). The total sample size (cases + controls) is 2,000.
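The probabilities quoted in the captions follow from the standard phred scale, in which a score Q corresponds to an error probability of 10^(-Q/10). A minimal Python sketch of that conversion is given below; the function name `phred_to_prob_correct` is an illustrative choice and is not taken from the ARIEL or AMELIA software.

```python
def phred_to_prob_correct(q: float) -> float:
    """Convert a phred-scaled quality score to the probability of a correct call.

    Standard phred definition: error probability = 10 ** (-q / 10),
    so the probability that the call is correct is one minus that.
    """
    return 1.0 - 10.0 ** (-q / 10.0)


# Thresholds used in the figure captions.
p_q10 = phred_to_prob_correct(10)  # about 0.90
p_q1 = phred_to_prob_correct(1)    # about 0.2057, quoted as 0.205 in the caption
```

Filtering at Q10 therefore keeps only variants whose calls are at least 90% likely to be correct, while filtering at Q1 is far more permissive.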
Fig. 4
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA for consensus quality score data. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have consensus variant quality scores, where the causal variants are set to have a phred-scaled quality score of 10 (probability of correct call is 0.90). The total sample size (cases + controls) is 2,000.
