Mol Biol Evol. 2020 Aug 1;37(8):2332-2340.
doi: 10.1093/molbev/msaa089.

A New Analysis of Archaea-Bacteria Domain Separation: Variable Phylogenetic Distance and the Tempo of Early Evolution

Sarah J Berkemer et al. Mol Biol Evol. 2020.

Abstract

Comparative genomics and molecular phylogenetics are foundational for understanding biological evolution. Although many studies have been made with the aim of understanding the genomic contents of early life, uncertainty remains. A study by Weiss et al. (Weiss MC, Sousa FL, Mrnjavac N, Neukirchen S, Roettger M, Nelson-Sathi S, Martin WF. 2016. The physiology and habitat of the last universal common ancestor. Nat Microbiol. 1(9):16116.) identified a number of protein families in the last universal common ancestor of archaea and bacteria (LUCA) which were not found in previous works. Here, we report new research that suggests the clustering approaches used in this previous study undersampled protein families, resulting in incomplete phylogenetic trees which do not reflect protein family evolution. Phylogenetic analysis of protein families which include more sequence homologs rejects a simple LUCA hypothesis based on phylogenetic separation of the bacterial and archaeal domains for a majority of the previously identified LUCA proteins (∼82%). To supplement limitations of phylogenetic inference derived from incompletely populated orthologous groups and to test the hypothesis of a period of rapid evolution preceding the separation of the domains, we compared phylogenetic distances both within and between domains, for thousands of orthologous groups. We find a substantial diversity of interdomain versus intradomain branch lengths, even among protein families which exhibit a single domain separating branch and are thought to be associated with the LUCA. Additionally, phylogenetic trees with long interdomain branches relative to intradomain branches are enriched in information categories of protein families in comparison to those associated with metabolic functions. These results provide a new view of protein family evolution and temper claims about the phenotype and habitat of the LUCA.
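The comparison of phylogenetic distances within and between domains can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the exact definition of their distance statistic, so the sketch assumes a simple ratio of the mean intradomain pairwise distance to the mean interdomain pairwise distance. The function name `intra_inter_ratio`, the taxon labels, and the distance values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def intra_inter_ratio(dist, domain):
    """Ratio of mean intradomain to mean interdomain pairwise distance.

    dist   -- maps frozenset({taxon_a, taxon_b}) to a tree (patristic) distance
    domain -- maps each taxon to "archaea" or "bacteria"

    Assumption: this ratio is only one plausible way to summarise the
    comparison; it is not taken from the paper.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(domain), 2):
        d = dist[frozenset((a, b))]
        if domain[a] == domain[b]:
            intra.append(d)
        else:
            inter.append(d)
    if not intra or not inter:
        raise ValueError("need at least one intradomain and one interdomain pair")
    return mean(intra) / mean(inter)


# Hypothetical toy family: two archaeal and two bacterial sequences.
domain = {"arc1": "archaea", "arc2": "archaea",
          "bac1": "bacteria", "bac2": "bacteria"}
dist = {
    frozenset(("arc1", "arc2")): 0.25,
    frozenset(("bac1", "bac2")): 0.75,
    frozenset(("arc1", "bac1")): 2.0,
    frozenset(("arc1", "bac2")): 2.0,
    frozenset(("arc2", "bac1")): 2.0,
    frozenset(("arc2", "bac2")): 2.0,
}

ratio = intra_inter_ratio(dist, domain)
print(ratio)
```

Under this assumed definition, a small ratio corresponds to a tree whose interdomain branch is long relative to its intradomain branches, and a ratio near 1 corresponds to a tree in which the two domains are not clearly separated by distance.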

Keywords: LUCA; conserved orthologous groups of proteins; microbial physiology; orthology; progenote.

Figures

Fig. 1.
Comparison of tree topologies for two trees corresponding to the same protein family, but which contain different collections of sequences (SSC1665 on the left and COG1646 on the right). Blue colors are bacterial sequences and red colors show archaeal sequences. Sequences with darker color shades appear in both trees; lighter color-shaded labels indicate genes that only appear in a single tree. Leaf labels are gene identifiers.
Fig. 2.
Left: Violin plot depicting the number of sequences per group discussed in the text and table 1. The black bar in the yellow area indicates interquartile ranges. Right: The number of sequences per orthologous group plotted against the number of interdomain branches (splits) found when the sequences are subjected to phylogenetic analysis (log10 scales). Expanding SSCs (red squares) with the complete set of sequences of the corresponding COGs results in SSCCOG (blue triangles).
Fig. 3.
Relationship between the number of archaea:bacteria interdomain branches (splits) and D̄ observed in phylogenetic trees drawn from the COGs. Top: Reconstructed trees for COG0048 (ribosomal protein S12), COG1110 (reverse gyrase), and COG1846 (DNA-binding transcriptional regulator, MarR) with corresponding interdomain archaea:bacteria branches (splits) (s) and D̄ values. The position of these trees is indicated in part B of the figure. The trees are drawn shading archaea in red and bacteria in blue, and the branch lengths are contained within the shaded region. Bottom: Interdomain split values for each COG plotted against D̄, where lower D̄ values represent phylogenetic trees with smaller average intra- to inter-domain phylogenetic distances. The inset shows the distribution on normal scale, and the log (split) version is shown below. Symbols are slightly shifted to avoid overlays, and the differently shaped and colored symbols indicate subgroups as defined by Harris et al. (2003), Catchpole and Forterre, oxygen related COGs (Liu et al. 2018), CODH/ACS COGs, and further examples as indicated in the legend. Brackets on top of the log-plot summarize regions in the plot that correspond to 1, 2, … splits. Labeled symbols refer to corresponding reconstructed phylogenetic trees shown in (top), in supplementary figure 5 and additional table 2, Supplementary Material online. COG0013 is the alanyl-tRNA synthetase, and COG1679 is a predicted Fe-S cluster binding aconitase.

