ARIEL and AMELIA: testing for an accumulation of rare variants using next-generation sequencing data

Jennifer L Asimit¹, Aaron G Day-Williams, Andrew P Morris, Eleftheria Zeggini

PMID: 22441326
PMCID: PMC3477640
DOI: 10.1159/000336982

Jennifer L Asimit et al. Hum Hered. 2012;73(2):84-94. doi: 10.1159/000336982. Epub 2012 Mar 22.

Affiliation

¹ Wellcome Trust Sanger Institute, Hinxton, UK.

Abstract

Objectives: There is increasing evidence that rare variants play a role in some complex traits, but their analysis is not straightforward. Locus-based tests become necessary due to low power in rare variant single-point association analyses. In addition, variant quality scores are available for sequencing data, but are rarely taken into account. Here, we propose two locus-based methods that incorporate variant quality scores: a regression-based collapsing approach and an allele-matching method.

Methods: Using simulated sequencing data, we compare four locus-based tests of trait association under different scenarios of data quality. We test two collapsing-based approaches and two allele-matching-based approaches; within each pair, one approach takes variant quality scores into account and the other ignores them. We implement the collapsing and allele-matching approaches that account for variant quality in the freely available ARIEL and AMELIA software.

Results: The incorporation of variant quality scores in locus-based association tests has power advantages over weighting each variant equally. The allele-matching methods are robust to the presence of both protective and risk variants in a locus, while collapsing methods exhibit a dramatic loss of power in this scenario.

Conclusions: The incorporation of variant quality scores should be a standard protocol when performing locus-based association analysis on sequencing data. The ARIEL and AMELIA software implement collapsing and allele-matching locus association analysis methods, respectively, that allow the incorporation of variant quality scores.
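The contrast between weighting each variant equally and weighting by variant quality can be illustrated with a toy collapsed score. The sketch below is only an assumption about what such a score might look like (a weighted sum of rare-allele counts per individual, which could then serve as a predictor in a regression); it is not the ARIEL implementation, and the function name, genotypes, and weights are hypothetical.

```python
from typing import Sequence


def weighted_burden(genotypes: Sequence[int], weights: Sequence[float]) -> float:
    """Collapse one individual's rare-allele counts into a single score.

    genotypes: rare-allele count (0, 1, or 2) at each variant in the locus.
    weights:   one weight per variant, e.g. the probability that the
               variant call is correct (hypothetical choice of weight).
    """
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return sum(g * w for g, w in zip(genotypes, weights))


# Hypothetical toy locus with three rare variants.
individual = [1, 0, 1]                 # carries variants 1 and 3
quality_weights = [0.99, 0.90, 0.21]   # made-up probabilities of a correct call
equal_weights = [1.0, 1.0, 1.0]        # every variant counted equally

unweighted_score = weighted_burden(individual, equal_weights)
quality_score = weighted_burden(individual, quality_weights)

print(f"equal weights:   {unweighted_score:.2f}")
print(f"quality weights: {quality_score:.2f}")
```

With equal weights the low-quality third variant contributes as much as the high-quality first one; with quality weights its contribution shrinks in proportion to how unlikely the call is to be real.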

Figures

**Fig. 1**
Distributions of converted quality scores used for simulating data. The left panel shows the distribution of positive converted quality scores (probabilities of a correct variant call) for a 50-kb region. The middle and right panels show the converted quality scores after filtering at Q1 and Q10, respectively.

**Fig. 2**
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA with high-quality causal variants. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have non-consensus variant quality scores, and the causal variants are assigned a phred-scaled quality score of 10 (probability of a correct call is 0.90). The total sample size (cases + controls) is 1,000 (upper) and 2,000 (lower).

**Fig. 3**
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA for Q10 and Q1 filtering of variants. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have non-consensus variant quality scores, where (upper) all variants have a phred-scaled quality score ≥ 10 (probability of a correct call is ≥ 0.90) and (lower) all variants have a phred-scaled quality score ≥ 1 (probability of a correct call is ≥ 0.205). The total sample size (cases + controls) is 2,000.

**Fig. 4**
Power comparisons for the collapsing method, ARIEL, KBAT, and AMELIA for consensus quality score data. Power comparisons of the four methods at different OR magnitudes for a common direction of effect (left panel) and for the presence of both risk and protective variants (right panel). The simulated data have consensus variant quality scores, where the causal variants are assigned a phred-scaled quality score of 10 (probability of a correct call is 0.90). The total sample size (cases + controls) is 2,000.
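The figure captions quote phred-scaled quality scores alongside probabilities of a correct call (Q10 and 0.90, Q1 and 0.205). A minimal sketch of that conversion, using the standard phred definition Q = -10 log10(error probability), is shown here; the function names are illustrative and not taken from the ARIEL or AMELIA software.

```python
import math


def phred_to_prob_correct(q: float) -> float:
    """Probability that a call with phred-scaled quality q is correct."""
    return 1.0 - 10.0 ** (-q / 10.0)


def prob_correct_to_phred(p: float) -> float:
    """Inverse conversion; p must satisfy 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -10.0 * math.log10(1.0 - p)


# Print the probability of a correct call at the two filtering
# thresholds named in the captions, plus two higher scores.
for q in (1, 10, 20, 30):
    print(f"Q{q}: {phred_to_prob_correct(q):.4f}")
```

Q10 evaluates to 0.9000 and Q1 to about 0.2057, in line with the 0.90 and 0.205 quoted in the captions.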
