Proc Natl Acad Sci U S A. 2019 Oct 15;116(42):21094-21103.
doi: 10.1073/pnas.1818532116. Epub 2019 Sep 30.

A functional enrichment test for molecular convergent evolution finds a clear protein-coding signal in echolocating bats and whales

Amir Marcovitz et al. Proc Natl Acad Sci U S A.

Abstract

Distantly related species entering similar biological niches often adapt by evolving similar morphological and physiological characters. How much genomic molecular convergence (particularly of highly constrained coding sequence) contributes to convergent phenotypic evolution, such as echolocation in bats and whales, is a long-standing fundamental question. Like others, we find that convergent amino acid substitutions are not more abundant in echolocating mammals compared to their outgroups. However, we also ask a more informative question about the genomic distribution of convergent substitutions by devising a test to determine which, if any, of more than 4,000 tissue-affecting gene sets is most statistically enriched with convergent substitutions. We find that the gene set most overrepresented (q-value = 2.2e-3) with convergent substitutions in echolocators, affecting 18 genes, regulates development of the cochlear ganglion, a structure with empirically supported relevance to echolocation. Conversely, when comparing to nonecholocating outgroups, no significant gene set enrichment exists. For aquatic and high-altitude mammals, our analysis highlights 15 and 16 genes from the gene sets most affected by molecular convergence which regulate skin and lung physiology, respectively. Importantly, our test requires that the most convergence-enriched set cannot also be enriched for divergent substitutions, such as in the pattern produced by inactivated vision genes in subterranean mammals. Showing a clear role for adaptive protein-coding molecular convergence, we discover nearly 2,600 convergent positions, highlight 77 of them in 3 organs, and provide code to investigate other clades across the tree of life.

Keywords: aquatic; coding; convergent evolution; echolocation; genome-wide functional enrichment tests.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Screening for molecular convergence and divergence in the mammalian lineage. (A, Left) Simplified phylogenetic tree of placental mammals. See SI Appendix, Fig. S1 for all 57 species used in the study. Colored rectangles highlight branches with independent phenotypic evolution of echolocation, aquatic, high-altitude, and subterranean lifestyles. (A, Right) Filled and empty rectangles represent target and outgroup species, respectively. We screened for (parallel or strictly) convergent and divergent amino acid substitutions along the branches leading from the last common ancestor of each target group and its outgroup to the target group itself. For example: (B) An N7T (asparagine to threonine at position 7) parallel convergent substitution in the hearing gene Prestin (SLC26A5) in echolocating mammals. (C) An E3840 divergent substitution in the vision/hearing gene Usherin (USH2A) in a pair of distantly related subterranean mammals.
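The three substitution classes named in this caption can be illustrated with a minimal sketch. It assumes the commonly used definitions: a parallel substitution reaches the same derived amino acid from the same ancestral state, a strictly convergent one reaches the same derived amino acid from different ancestral states, and a divergent one changes both lineages to different amino acids. The function name `classify_site` and its arguments are hypothetical and are not taken from the authors' code.

```python
from enum import Enum


class Substitution(Enum):
    NONE = "none"
    PARALLEL = "parallel convergent"
    STRICT = "strictly convergent"
    DIVERGENT = "divergent"


def classify_site(anc1, der1, anc2, der2):
    """Classify one aligned amino acid position across two target lineages.

    anc1, anc2: inferred residue at the last common ancestor of each target
                group and its outgroup (assumed to be given).
    der1, der2: residue observed in each target group.
    """
    # Both lineages must have changed for the site to count at all.
    if anc1 == der1 or anc2 == der2:
        return Substitution.NONE
    # Different derived residues: the lineages moved apart.
    if der1 != der2:
        return Substitution.DIVERGENT
    # Same derived residue: parallel if the starting points matched,
    # strictly convergent otherwise.
    if anc1 == anc2:
        return Substitution.PARALLEL
    return Substitution.STRICT


# The Prestin N7T example from the caption: both lineages go from N to T.
print(classify_site("N", "T", "N", "T").value)
```

Under these assumed definitions, the N7T change in Prestin is classified as parallel convergent, matching the caption's label.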
Fig. 2.
A molecular convergence test. Example application to echolocating mammals: (A, Top Left) We picked 2 target groups (TG) of species with a phenotypic convergence (echolocation here), and 2 outgroups (OG). (Top Middle) Our algorithm identified all cross-species conserved (and thus functionally important) amino acid positions in genes annotated for any of 4,300 specific MGI phenotype functions. (Top/Bottom Right) We then identified the subset of positions showing (parallel or strictly) convergent substitutions between our target groups and performed a hypergeometric test over positions to find the single most statistically enriched MGI function (if any) for amino acid convergence. We also tested for divergent substitutions between our target groups (red arrow). (Bottom Middle/Left) If the most-enriched convergent function is not also enriched for divergent substitutions, we declared an adaptive molecular convergence event and linked it back to the convergent species phenotypes. In this example, we discovered 25 convergence events in 18 genes regulating cochlear ganglion function in bats and whales. The cochlear ganglion prediction is particularly striking considering that 4,300 different functions were evaluated to arrive at a poster-child organ for phenotypic convergence between the 2 echolocating groups. (A and B) A comparable total number of convergent substitutions was observed in echolocating mammals (TG.I and TG.II in A, Top Left) as in 3 control sets formed by shuffling either or both target groups with their respective outgroups. (C) However, only the echolocating set of B showed a statistically significant enrichment for molecular convergence in the ontology term “cochlear ganglion degeneration.” In fact, none of the control experiments yielded any statistical enrichment across all 4,300 tested terms.
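The test described in this caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' released code: it assumes per-term counts of conserved, convergent, and divergent positions are already available, uses a Benjamini-Hochberg correction as one plausible way to obtain q-values (the caption does not say which correction was used), and runs on invented toy counts. All function names, the `alpha` threshold, and the numbers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n of N positions are drawn and K of the N belong to the term."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def most_enriched_term(term_sizes, conv_hits, div_hits,
                       n_positions, n_conv, n_div, alpha=0.05):
    """Return (term, q) for the top convergent term, or None.

    A term is reported only if it is the most convergence-enriched term,
    passes the threshold, and is not also enriched for divergent positions.
    """
    terms = list(term_sizes)
    conv_q = benjamini_hochberg([
        hypergeom_sf(conv_hits.get(t, 0), n_positions, term_sizes[t], n_conv)
        for t in terms])
    div_q = benjamini_hochberg([
        hypergeom_sf(div_hits.get(t, 0), n_positions, term_sizes[t], n_div)
        for t in terms])
    best = min(range(len(terms)), key=lambda i: conv_q[i])
    if conv_q[best] <= alpha and div_q[best] > alpha:
        return terms[best], conv_q[best]
    return None


# Toy counts (invented for illustration): 1,000 conserved positions,
# 20 convergent and 20 divergent substitutions overall.
sizes = {"cochlear ganglion degeneration": 50, "term_b": 100, "term_c": 200}
target = most_enriched_term(
    sizes,
    conv_hits={"cochlear ganglion degeneration": 10, "term_b": 2, "term_c": 4},
    div_hits={"cochlear ganglion degeneration": 1, "term_b": 2, "term_c": 4},
    n_positions=1000, n_conv=20, n_div=20)
control = most_enriched_term(
    sizes,
    conv_hits={"cochlear ganglion degeneration": 1, "term_b": 2, "term_c": 4},
    div_hits={"cochlear ganglion degeneration": 1, "term_b": 2, "term_c": 4},
    n_positions=1000, n_conv=20, n_div=20)
print(target, control)
```

In the toy target set, one term holds far more convergent positions than its size predicts and is returned. In the toy control set, every term sits at its expected count and nothing is reported, mirroring the shuffled-outgroup controls in panels B and C. The exact `comb`-based tail is convenient for small examples; a genome-scale run would more likely use a numerical hypergeometric routine.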
Fig. 3.
Example convergent and divergent substitutions identified. (A) The skin development gene ABCA12 exhibits 8 convergent substitutions and only a single divergent substitution in aquatic mammals. (B) The estrogen receptor ERα encoded by ESR1 contains the convergent substitution T483I in high-altitude mammals. An ERα structural model (Protein Data Bank ID code 3OS8) highlights the convergent substitution at the homodimeric interface (between monomers A and B, green and blue, respectively), suggesting an important functional role in regulating dimer stability. Thr483 interacts with the surface polar residues D480 (monomer A) and Q506 (monomer B) through a network of polar contacts, antagonizing the adjacent hydrophobic interaction that also occurs at the homodimeric interface. Thus, substituting residue 483 with the hydrophobic isoleucine would likely strengthen the adjacent hydrophobic interaction (involving the nonpolar residues I451, L479, L486, L504, and L508 of monomer A and A505 and L509 of monomer B), increasing dimer stability and promoting binding to estrogen-responsive elements. (C) A convergent substitution F115Y in GJB2 observed in echolocating mammals is central to 3 codons containing human hearing loss disease mutations, suggesting the residue’s importance in modulating hearing. (D) The vision gene USH2A contains multiple divergent and convergent substitutions in 3 tested pairs of moles. These changes likely accumulated as a result of relaxed purifying selection.
