Mol Biol Evol. 2012 Sep;29(9):2177-86.
doi: 10.1093/molbev/mss090. Epub 2012 Mar 12.

Detecting selective sweeps from pooled next-generation sequencing samples

Simon Boitard et al. Mol Biol Evol. 2012 Sep.

Abstract

Due to its cost effectiveness, next-generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for characterizing variation in population samples. Because Pool-Seq provides genome-wide SNP frequency data, it is possible to use them for demographic inference and/or the identification of selective sweeps. Here, we introduce a statistical method that is designed to detect selective sweeps from pooled data by accounting for statistical challenges associated with Pool-Seq, namely sequencing errors and random sampling among chromosomes. This allows for an efficient use of the information: all base calls are included in the analysis, but the higher credibility of regions with higher coverage and base calls with better quality scores is accounted for. Computer simulations show that our method efficiently detects sweeps even at very low coverage (0.5× per chromosome). Indeed, the power of detecting sweeps is similar to what we could expect from sequences of individual chromosomes. Since the inference of selective sweeps is based on the allele frequency spectrum (AFS), we also provide a method to accurately estimate the AFS provided that the quality scores for the sequence reads are reliable. Applying our approach to Pool-Seq data from Drosophila melanogaster, we identify several selective sweep signatures on chromosome X that include some previously well-characterized sweeps like the wapl region.
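
The two sources of noise named in the abstract, random sampling of chromosomes from the pool and sequencing errors, can be illustrated with a small per-site likelihood. The sketch below is not the authors' implementation. It assumes a biallelic site, that each read comes from a chromosome drawn uniformly from the pool, and that a sequencing error simply flips the allele; the function names are hypothetical. The only standard fact it relies on is the PHRED convention, where a score Q corresponds to an error probability of 10^(-Q/10).

```python
import math

def phred_to_error(q):
    """Convert a PHRED quality score Q to an error probability 10^(-Q/10)."""
    if q <= 0:
        raise ValueError("PHRED score must be positive")
    return 10 ** (-q / 10)

def site_log_likelihoods(reads, n):
    """Log-likelihood of the base calls at one site for each allele count k = 0..n.

    reads: list of (is_derived, phred) pairs, one per base call (assumed format).
    n: number of chromosomes in the pool.
    """
    result = []
    for k in range(n + 1):
        f = k / n  # frequency of the derived allele in the pool
        total = 0.0
        for is_derived, q in reads:
            e = phred_to_error(q)
            # A read shows the derived allele if it samples a derived chromosome
            # and is read correctly, or samples an ancestral one and is misread.
            p_derived = f * (1 - e) + (1 - f) * e
            total += math.log(p_derived if is_derived else 1 - p_derived)
        result.append(total)
    return result
```

Under these assumptions, low-quality base calls contribute flatter likelihoods and sites with more reads contribute sharper ones, which is the sense in which every base call can be used while better-covered, higher-quality data carries more weight.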

Figures

FIG. 1.
AFS estimation. Pools of n = 25 (a) and n = 50 (b) chromosomes of length L = 100 kb were simulated under a constant population size coalescent model with θ = 0.003 and ρ = 0.003. Solid lines show the AFS extracted from the complete sequence information and averaged over 100 simulated samples (it closely fits the AFS expected from coalescent theory). Diamonds and error bars represent the average estimated AFS and the average absolute deviation, respectively, using the same 100 samples. The estimates were obtained from pooled NGS data with 100× expected coverage using the EM algorithm.
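
The caption says only that the AFS was estimated with the EM algorithm. As a rough illustration of how such an estimate can work, the sketch below runs the textbook EM update for mixture proportions: the AFS is treated as the unknown weights of the allele-count classes, and each site supplies a likelihood for every class. The function name, the input layout and the fixed iteration count are assumptions made for this example, not details taken from the paper.

```python
def em_afs(site_likelihoods, n_iter=100):
    """Estimate allele-count class probabilities by EM.

    site_likelihoods[s][k] = P(data at site s | allele count class k).
    Returns a list of class probabilities that sums to 1.
    """
    n_classes = len(site_likelihoods[0])
    afs = [1.0 / n_classes] * n_classes  # start from a uniform spectrum
    for _ in range(n_iter):
        counts = [0.0] * n_classes
        for lik in site_likelihoods:
            # E-step: posterior probability of each class at this site.
            weights = [a * l for a, l in zip(afs, lik)]
            total = sum(weights)
            if total == 0.0:
                continue  # site is incompatible with the current estimate
            for k, w in enumerate(weights):
                counts[k] += w / total
        norm = sum(counts)
        if norm == 0.0:
            raise ValueError("no site is compatible with the current estimate")
        # M-step: new spectrum is the average posterior over sites.
        afs = [c / norm for c in counts]
    return afs
```

When every site identifies its class with certainty, this reduces to counting sites per class; with noisy pooled reads, the posterior weights spread each site over several neighbouring classes.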
FIG. 2.
AFS in Drosophila melanogaster. Estimated from all base calls (a) or only those with PHRED score greater than 35 (b). As we consider the folded AFS, the probabilities for allele frequencies 98/194 to 193/194 (not shown) can be deduced by symmetry from those for allele frequencies 1/194 to 96/194.
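
Folding pairs each allele count i with its complement n − i, which is why only the lower half of the frequencies needs to be shown. A minimal sketch of one common folding convention follows, in which the probabilities of the two paired classes are added together; the function name and the list layout are assumptions for illustration, and the paper may normalise its folded spectrum differently.

```python
def fold_afs(unfolded):
    """Fold an allele frequency spectrum.

    unfolded[i - 1] is the probability of allele count i, for i = 1..n-1.
    Returns the folded probabilities for minor-allele counts 1..n//2.
    """
    n = len(unfolded) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i  # complementary allele count
        p = unfolded[i - 1]
        if j != i:  # the middle class (i = n/2, even n) pairs with itself
            p += unfolded[j - 1]
        folded.append(p)
    return folded
```

For a pool of n = 194 chromosomes, as in the caption, this yields 97 folded classes.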
FIG. 3.
Selective sweeps detected on the X chromosome of D. melanogaster. We used either all base calls or only base calls with PHRED score greater than 35. The x-axis labels give the physical position of each sweep window (in kilobases).
