Review
Philos Trans R Soc Lond B Biol Sci. 2019 Jul 22;374(1777):20180234.
doi: 10.1098/rstb.2018.0234. Epub 2019 Jun 3.

Detecting adaptive convergent amino acid evolution

Carine Rey et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

In evolutionary genomics, researchers have taken an interest in identifying the substitutions that underlie convergent phenotypic adaptations. This is a difficult problem that requires distinguishing foreground convergent substitutions, which are involved in the convergent phenotype, from background convergent substitutions. The latter may be linked to other adaptations, may be neutral or may be the consequence of mutational biases. Furthermore, there is no generally accepted definition of convergent substitutions. Various methods that use different definitions have been proposed in the literature, resulting in different sets of candidate foreground convergent substitutions. In this article, we first describe the processes that can generate foreground convergent substitutions in coding sequences, separating adaptive from non-adaptive processes. Second, we review methods that have been proposed to detect foreground convergent substitutions in coding sequences and expose the assumptions that underlie them. Finally, we examine their power on simulations of convergent changes (including in the presence of a change in the efficacy of selection) and on empirical alignments. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.

Keywords: C3/C4; convergent evolution; genomics; molecular evolution; phylogenetics; probabilistic models.

Conflict of interest statement

We have no competing interests.

Figures

Figure 1.
Categories of adaptive and non-adaptive convergent amino acid evolution. (a) At a particular position in a protein, some amino acids provide better fitness than others. This is represented by coloured bars for six amino acids: the taller the bar, the higher the fitness. In the ancestral environment A, amino acids blue and green provide the highest fitness, whereas in the convergent environment C, amino acids orange and purple provide the highest fitness. Increasing the selection efficacy makes the profiles more pointed, while decreasing it makes them flatter, but the relative rank of the amino acids does not change. Decreases in selection efficacy are not adaptive, while the two other types of changes are. (b) Species with the convergent phenotype are named C* and species with the ancestral phenotype are named A*. Substitutions are represented by small boxes on the branches. We distinguish two types of adaptive convergent substitutions. Type 1 are substitutions that occur systematically on the branch where the phenotype changes, at the transition between Ancestral and Convergent environments (A–C). Type 2 are substitutions that occur on later branches (e.g. in the branch leading to C3).
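The effect of selection efficacy on a profile, as described in panel (a), can be illustrated with a minimal sketch. It assumes, purely for illustration, that the frequency of each amino acid is proportional to exp(efficacy × fitness); this mapping, the function names and the fitness values are hypothetical and are not taken from the article. Any monotone mapping of this kind sharpens or flattens the profile without changing the rank of the amino acids.

```python
import math

def profile_frequencies(fitness, efficacy):
    """Map per-amino-acid fitness values to frequencies.

    Assumption for illustration only: frequency is proportional to
    exp(efficacy * fitness), then normalised to sum to 1.
    """
    weights = {aa: math.exp(efficacy * f) for aa, f in fitness.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}

def ranking(freqs):
    """Amino acids ordered from most to least frequent."""
    return sorted(freqs, key=freqs.get, reverse=True)

# Hypothetical fitness values for six amino acids at one site.
fitness_ancestral = {"A": 1.0, "G": 0.9, "S": 0.4, "T": 0.3, "Q": 0.1, "Y": 0.0}

flat = profile_frequencies(fitness_ancestral, efficacy=1.0)     # weak selection
pointed = profile_frequencies(fitness_ancestral, efficacy=8.0)  # strong selection

print(ranking(flat) == ranking(pointed))             # the rank order is preserved
print(max(pointed.values()) > max(flat.values()))    # the profile is more pointed
```

A profile change, by contrast, would correspond to a different set of fitness values in the convergent environment, not only a different efficacy.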
Figure 2.
Cartoon examples of the types of sites targeted by each type of method. The tree topologies and species are the same in all examples. Species with the convergent phenotype are named C*, those with the ancestral phenotype A*; the transitions between ancestral and convergent phenotype occur where the subtrees become shaded in yellow. Coloured squares on the branches of the phylogeny indicate substitution events, with the colour corresponding to the new amino acid. In Example A, every time the phenotype changes, a substitution occurs towards amino acid Q (type 1 substitutions to a single amino acid). This is an ideal case for the methods based on identical substitutions and should be detectable by all methods. Example B shows a site that has undergone a profile change, whereby two different amino acids, Q and Y, have good fitness in the convergent case. All methods except those based on identical substitutions may detect such changes, although this depends on how different the ancestral and the convergent profiles are [18]. Example C is similar to Example B except that some substitutions occurred after the phenotype changed (type 2 substitutions), not simultaneously with the phenotype change. Example D is similar to Example C except that the amino acid change occurred in only three of the four transitions, which makes it more controversial and harder to detect. However, if the change in profile is strong enough, profile methods should be able to detect it. Example E shows a case where the evolution of the site does not seem to correlate with the convergent/ancestral state of the species. We do not expect the methods to detect such a site, but some such sites will nevertheless come out as false positives.
Figure 3.
Detection of sites undergoing convergent profile change by different methods. Simulations are performed with constant selection efficacy (NeS = 4). Each panel corresponds to one empirical phylogeny, with convergent transitions placed as in electronic supplementary material, figure S2. The trade-off between sensitivity and precision is presented for each method, assuming that 2% of the sites in the sequences are convergent (colour code indicated at the top of the figure). The dashed lines highlight 90% precision. Areas under the curve (AUC), ranked from best to worst, are presented on the right-hand side of each panel, with the same colour code as the precision-recall curves.
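Precision at a given score threshold depends on the assumed fraction of convergent sites. The sketch below shows one way a precision-sensitivity curve and its AUC could be computed from scores of simulated convergent and non-convergent sites under an assumed 2% prevalence. The function names, the example scores and the trapezoidal integration are illustrative assumptions, not the authors' pipeline.

```python
def precision_recall_curve(pos_scores, neg_scores, prevalence=0.02):
    """Return (sensitivity, precision) points, one per score threshold.

    pos_scores: scores of simulated convergent sites
    neg_scores: scores of simulated non-convergent sites
    prevalence: assumed fraction of convergent sites in real sequences
    Points are returned from the strictest to the most lenient threshold,
    so sensitivity is non-decreasing along the list.
    """
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = []
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)  # sensitivity
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        # Expected fraction of all sites called convergent at this threshold.
        called = prevalence * tpr + (1 - prevalence) * fpr
        if called > 0:
            points.append((tpr, prevalence * tpr / called))
    return points

def area_under_curve(points):
    """Trapezoidal area under precision as a function of sensitivity.

    Expects points ordered by non-decreasing sensitivity, as returned above;
    the area starts at the lowest observed sensitivity.
    """
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical scores from one detection method.
pos_scores = [0.9, 0.8, 0.7, 0.4]
neg_scores = [0.6, 0.3, 0.2, 0.1]

points = precision_recall_curve(pos_scores, neg_scores, prevalence=0.02)
auc = area_under_curve(points)
```

With a low prevalence such as 2%, even a small false positive rate sharply reduces precision, which is why the 90% precision line is a demanding target.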
Figure 4.
Overview of the simulations and AUC values for the Cyperaceae tree. The trees, convergent clades and symbols are as in figure 1. Three kinds of adaptive convergent cases have been simulated: (a) a convergent profile change, (b) a convergent scaling of selection efficacy and (c) a convergent profile change combined with a selection efficacy scaling. The genome-wide selection efficacy (NeSA) remains the same in (a) and is changed to a convergent selection efficacy (NeSC) in Ha (b) and Ha (c). The black arrows (b(ii) and c(ii)) indicate whether selection efficacy increases or decreases in convergent clades. AUC values are calculated based on precision-recall curves such as those presented in figure 3 ((a) AUC values for NeSA = 4 in case 1 correspond to figure 3a(i)).
Figure 5.
Ability of the different methods to recover published convergent sites in two empirical alignments. Those alignments had been used to study convergent transitions from C3 to C4 metabolism in plants ((a) Besnard et al. [20] found 16 convergent sites in Cyperaceae, (b) Parto & Lartillot [24] found 15 convergent sites in Amaranthaceae). For each method, a score was obtained for each site of the alignment. The sites were then ranked according to their scores, and only the ranks of previously published convergent sites are shown in the figure.
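The ranking step described in this caption can be sketched as follows. The function name, the site positions and the scores are hypothetical, and the sketch assumes that a higher score means stronger evidence of convergence, which may not hold for every method.

```python
def ranks_of_known_sites(scores, known_sites):
    """Rank all sites by score and return the ranks of the known sites.

    scores: dict mapping alignment position -> score
            (assumption: higher score = stronger evidence of convergence)
    known_sites: positions previously published as convergent
    Rank 1 is the best-scoring site; ties keep the dict's insertion order.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    rank = {site: i for i, site in enumerate(ordered, start=1)}
    return {site: rank[site] for site in known_sites if site in rank}

# Hypothetical per-site scores and published sites.
scores = {10: 0.2, 42: 0.95, 77: 0.6, 101: 0.8}
published = [42, 77]

print(ranks_of_known_sites(scores, published))  # {42: 1, 77: 3}
```

A method that recovers the published sites well would place them near the top of its ranking.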

References

    1. Parker J, Tsagkogeorga G, Cotton JA, Liu Y, Provero P, Stupka E, Rossiter SJ. 2013. Genome-wide signatures of convergent evolution in echolocating mammals. Nature 502, 228–231. (10.1038/nature12511)
    2. Zou Z, Zhang J. 2015. No genome-wide protein sequence convergence for echolocation. Mol. Biol. Evol. 32, 1237–1241. (10.1093/molbev/msv014)
    3. Thomas GWC, Hahn MW. 2015. Determining the null model for detecting adaptive convergence from genomic data: a case study using echolocating mammals. Mol. Biol. Evol. 32, 1232–1236. (10.1093/molbev/msv013)
    4. Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF. 2011. Negative epistasis between beneficial mutations in an evolving bacterial population. Science 332, 1193–1196. (10.1126/science.1203801)
    5. Goldman N, Yang Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11, 725–736. (10.1093/oxfordjournals.molbev.a040153)
