Front Genet. 2014 Aug 26:5:293.
doi: 10.3389/fgene.2014.00293. eCollection 2014.

A bioinformatics workflow for detecting signatures of selection in genomic data

Murray Cadzow et al. Front Genet. 2014.

Abstract

The detection of "signatures of selection" is now possible on a genome-wide scale in many plant and animal species, and can be performed in a population-specific manner due to the wealth of per-population genome-wide genotype data that is available. With genomic regions that exhibit evidence of having been under selection shown to also be enriched for genes associated with biologically important traits, detection of evidence of selective pressure is emerging as an additional approach for identifying novel gene-trait associations. While high-density genotype data is now relatively easy to obtain, for many researchers it is not immediately obvious how to go about identifying signatures of selection in these data sets. Here we describe a basic workflow, constructed from open source tools, for detecting and examining evidence of selection in genomic data. Code to install and implement the pipeline components, and instructions to run a basic analysis using the workflow described here, can be downloaded from our public GitHub repository: http://www.github.com/smilefreak/selectionTools/

Keywords: analysis pipeline; genome-wide; genomics; signatures of selection.

Figures

Figure 1
Plots of Rsb (top row) and iHS (middle and bottom rows) values across chromosome 2 (whole chromosome in the left column, and the region around the LCT gene in the right column) based on 1000 Genomes Project data for the CEU and YRI populations. Blue vertical lines/boxes on the plots indicate the location of the LCT gene, and the red horizontal lines mark the threshold above which an Rsb value has a p-value of less than 5%. The marked deviation of iHS away from zero in the CEU population provides evidence for the region around the LCT gene having been under selective pressure in the past. In contrast, there is no such evidence in the YRI population. The Rsb statistic, which examines the relative evidence for selection in the two populations, reflects this difference: it indicates stronger evidence for this region having been under selective pressure in the CEU cohort.
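As a rough illustration of how a between-population statistic such as Rsb can be turned into a per-SNP flag like the red threshold line in the figure, the sketch below takes the log-ratio of per-SNP integrated haplotype homozygosity (iES) values for two populations, centres it on the median, scales it by the standard deviation, and compares it with a one-sided 5% cut-off under a standard normal approximation. This is a minimal sketch under stated assumptions, not the workflow's own code: the function names and input values are hypothetical, and the one-sided normal cut-off is an assumed reading of the caption rather than something the paper record confirms.

```python
import math
import statistics


def standardized_rsb(ies_pop1, ies_pop2):
    """Standardized log-ratio of per-SNP iES values (pop1 relative to pop2).

    Assumption: centring on the median and scaling by the sample standard
    deviation, as in the commonly cited definition of Rsb.
    """
    log_ratios = [math.log(a / b) for a, b in zip(ies_pop1, ies_pop2)]
    center = statistics.median(log_ratios)
    spread = statistics.stdev(log_ratios)
    return [(x - center) / spread for x in log_ratios]


def one_sided_p(z):
    """Upper-tail probability of a standard normal at z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def flag_snps(ies_pop1, ies_pop2, alpha=0.05):
    """Return (score, flagged) pairs; flagged means one-sided p < alpha."""
    scores = standardized_rsb(ies_pop1, ies_pop2)
    return [(s, one_sided_p(s) < alpha) for s in scores]


# Hypothetical iES values for five SNPs (invented for illustration only).
ies_ceu = [1.0, 1.0, 1.0, 1.0, 20.0]
ies_yri = [1.0, 1.0, 1.0, 1.0, 1.0]
results = flag_snps(ies_ceu, ies_yri)
flags = [flagged for _, flagged in results]
print(flags)
```

With these invented inputs only the last SNP, where the first population's iES is much larger than the second's, falls above the assumed threshold. Real analyses operate on many thousands of SNPs, where the median and standard deviation are far more stable than in this toy example.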

