PLoS Comput Biol. 2016 Feb 11;12(2):e1004739.
doi: 10.1371/journal.pcbi.1004739. eCollection 2016 Feb.

Practical Approaches for Detecting Selection in Microbial Genomes

Jessica Hedge et al. PLoS Comput Biol.

Abstract

Microbial genome evolution is shaped by a variety of selective pressures. Understanding how these processes occur can help to address important problems in microbiology by explaining observed differences in phenotypes, including virulence and resistance to antibiotics. Greater access to whole-genome sequencing provides microbiologists with the opportunity to perform large-scale analyses of selection in novel settings, such as within individual hosts. This tutorial aims to guide researchers through the fundamentals underpinning popular methods for measuring selection in pathogens. These methods are transferable to a wide variety of organisms, and the exercises provided are designed for researchers with any level of programming experience.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Phylogenetic tree reconstruction and evolutionary rate estimation.
A phylogenetic tree comprises a collection of branches that connect sampled sequences at the tips (called taxa) with the most recent common ancestor of the sample. The point where each pair of branches join together is called a node. The lengths of these branches represent the evolutionary distance between sequences at either end, usually measured in numbers of substitutions per site, which can be calculated using the scale bar. The length of the vertical branches and rotation of branches around each node are arbitrary. The tree can be rooted using a divergent sequence (called an outgroup) (a), in which case the direction of substitutions can be inferred and each node represents the common ancestor of all descendent nodes and taxa. The node furthest from the tips is called the root. The tree can also be left unrooted and displayed radially (b) (tip labels have been omitted for visual clarity). Assuming the phylogeny has been rooted correctly, linear regression analysis can be used to test for a signal of a molecular clock by plotting the sampling time of each sequence against its evolutionary distance from the root of the tree. If the test is significant (c), the slope of the regression line (red) can provide an estimate of the evolutionary rate. The lack of any temporal signal (d) may occur if insufficient time has passed for substitutions to accumulate or if the molecular clock has been violated (for example, due to selection, recombination, or hypermutation).
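The root-to-tip regression described in panels (c) and (d) can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own exercise code: the function name `root_to_tip_regression` and the toy sampling dates and distances are assumptions made for the example. It fits root-to-tip distance against sampling time by ordinary least squares, so the slope is the rate estimate (substitutions per site per unit time), and the time at which the fitted line reaches zero distance gives a rough date for the root.

```python
def root_to_tip_regression(times, distances):
    """Least-squares fit of root-to-tip distance (y) on sampling time (x).

    Returns (slope, intercept, r_squared, root_date). root_date is the
    x-intercept of the fitted line, or None when the slope is not positive
    (no usable temporal signal).
    """
    n = len(times)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    if sxx == 0:
        raise ValueError("all sampling times are identical")
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    root_date = -intercept / slope if slope > 0 else None
    return slope, intercept, r_squared, root_date


# Hypothetical data: sampling year and root-to-tip distance for five tips.
years = [2001.0, 2003.5, 2006.0, 2009.0, 2012.5]
dists = [0.0021, 0.0043, 0.0068, 0.0101, 0.0129]
rate, _, r2, root = root_to_tip_regression(years, dists)
print(f"rate = {rate:.2e} subs/site/year, R^2 = {r2:.3f}, root date = {root}")
```

Because the tips share branches, the points are not independent observations, so a standard regression p-value should be read as a rough guide to temporal signal rather than a formal test.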
Fig 2. Detecting selection from microbial sequence data.
The phylogeny shows the evolutionary history of 20 sequences sampled evenly from four divergent populations. dN/dS methods test for selection by comparing the rates of non-synonymous and synonymous substitution occurring between divergent lineages (i.e., only substitutions that have occurred on the black branches) with those expected under neutrality. In contrast, the McDonald-Kreitman test for selection compares the ratio of non-synonymous and synonymous polymorphisms that are present within populations (due to substitutions occurring on red branches) with the ratio of non-synonymous and synonymous fixed differences that are present between populations (due to substitutions occurring on black branches). The phylogeny can also be used to detect selection by identifying parallel evolution, whereby recurrent mutations occur at a site or across a gene during the evolutionary history of a sample (for example, substitution X on the phylogeny).
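The McDonald-Kreitman comparison described above reduces to a 2x2 table of counts, which can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names (`fisher_exact_two_sided`, `mk_test`) and the counts are made up for the example and are not taken from the article. The neutrality index is the polymorphism ratio divided by the divergence ratio, and a two-sided Fisher's exact test (implemented here with the standard library only) asks whether the two ratios differ.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def mk_test(dn, ds, pn, ps):
    """McDonald-Kreitman test from fixed (dn, ds) and polymorphic (pn, ps) counts.

    Returns (neutrality_index, p_value). The index is None when a zero count
    makes the ratio undefined.
    """
    ni = (pn * ds) / (ps * dn) if ps > 0 and dn > 0 else None
    return ni, fisher_exact_two_sided(dn, ds, pn, ps)


# Illustrative counts only (not from the article).
ni, p = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"neutrality index = {ni:.3f}, Fisher p = {p:.4f}")
```

A neutrality index below 1 means an excess of non-synonymous fixed differences between populations, which is the pattern expected under positive selection. An index above 1 means an excess of non-synonymous polymorphism within populations, as expected when weakly deleterious variants segregate but rarely fix.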
