Review
PLoS Biol. 2024 Nov 25;22(11):e3002917.
doi: 10.1371/journal.pbio.3002917. eCollection 2024 Nov.

Reconstructing the last common ancestor of all eukaryotes

Thomas A Richards et al. PLoS Biol.

Abstract

Understanding the origin of eukaryotic cells is one of the most difficult problems in all of biology. A key challenge relevant to the question of eukaryogenesis is reconstructing the gene repertoire of the last eukaryotic common ancestor (LECA). As data sets grow, sketching an accurate genomics-informed picture of early eukaryotic cellular complexity requires provision of analytical resources and a commitment to data sharing. Here, we summarise progress towards understanding the biology of LECA and outline a community approach to inferring its wider gene repertoire. Once assembled, a robust LECA gene set will be a useful tool for evaluating alternative hypotheses about the origin of eukaryotes and understanding the evolution of traits in all descendant lineages, with relevance in diverse fields such as cell biology, microbial ecology, biotechnology, agriculture, and medicine. In this Consensus View, we put forth the status quo and an agreed path forward to reconstruct LECA's gene content.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Genetic contributions to LECA.
LECA’s gene repertoire was chimeric, containing genes derived from the Asgardarchaeota-derived host cell, the mitochondrial endosymbiont, and potentially other prokaryotic sources, along with a set of eukaryote-specific genes that evolved after the divergence of eukaryotes from prokaryotes. The number of sources, and the proportions and identities of genes from each source, remain uncertain but can be investigated using the approach articulated in the main text of this paper. Here, we illustrate 2 possible LECA reconstructions that are broadly compatible with what is currently known about eukaryotic gene origins.
(A) A larger LECA gene repertoire reconstruction, as indicated by the large pie chart. Such an inference may be the result of relatively few gene innovations post LECA, as indicated by the modest expansion after LECA leading to extant eukaryotic diversity. This hypothetical model also shows strong Asgardarchaeota and alphaproteobacterial signals and a strong additional signal from a “third party” contributor. This “third signal” could be used to argue for the role of 3 contributing lineages to eukaryogenesis beyond the two+ model. Here, the fraction of genes that arose by de novo evolution (i.e., bona fide ESPs) is relatively small. Gene families of prokaryotic ancestry with poor phylogenetic resolution do not form a dominant ancestral signal.
(B) A smaller LECA gene repertoire reconstruction, as indicated by the smaller pie chart. Such an inference may indicate larger-scale gene innovation post LECA, as indicated by the wider expansion after the LECA lineage leading to extant eukaryotic diversity. In this hypothetical model, the LECA repertoire with identifiable prokaryotic origin is dominated by genes of undefined ancestry. In this model, the set of LECA gene families of de novo origin (ESPs) is also extensive. Only a tiny proportion of gene families present in LECA can be accurately attributed to either the Asgardarchaeota or the Alphaproteobacteria.
The question marks inside the ovals in both models A and B indicate an unknown order of contribution and/or unknown contributing lineages. Dashed double arrow-headed lines indicate possible HGT contributions throughout eukaryogenesis and the subsequent diversification of eukaryotes. Not all aspects of these models are mutually exclusive; for example, a large LECA repertoire (as shown in A) could be combined with a two+ model for ancestry (as shown in B). ESP, eukaryote signature protein; HGT, horizontal gene transfer; LECA, last eukaryotic common ancestor.
Fig 2. Cellular features inferred to be present in LECA.
This schematic follows on from [17] and summarises the cellular features discussed in the section titled “What do we know about LECA?” (and references therein). Note that meiosis, mitosis, cell division, and their associated machinery and processes, all inferred to have been present in LECA, are not shown here. Created in BioRender. Eme, L. (2024) https://BioRender.com/w64x492. LECA, last eukaryotic common ancestor.
Fig 3. Proposed LECA gene repertoire analysis pipeline.
(A) Eukaryotic gene complements are divided into candidate ortholog groups using phylogenetic trees. Black arrows indicate how phylogenetic analyses can be used to move from gene family phylogenies to distinct ortholog groups. Black blocks indicate genes that are specific to eukaryotes (i.e., ESPs). Orange blocks indicate eukaryotic genes of prokaryotic ancestry (the phylogenetic donor relationship is identified by red branches in the trees; red discs on the tree indicate information for inferring the provenance of prokaryotic ancestry, e.g., taxonomy and node support statistics). Note that numerous genes are likely to be classified as “genes that cannot be assigned to cluster groups” (marked as box X). This pool is a repository that would allow for further revision, addition of unclassified genes to new cluster groups as they arise, or subsequent inclusion within established cluster groups as more genome data are included and the HMMs are revised. The broader process would allow cross-referencing of specific orthologs to larger gene clusters, thereby allowing the ultimate ancestry of ortholog families to be inferred.
(B) Overview of the analytical process that would allow community-based revision of the ortholog cluster groupings necessary for LECA gene repertoire estimations. This process is based on HMM generation and several levels of revision, allowing cluster groupings to be updated with input from numerous additional sources of data (as shown).
(C) LECA gene repertoire estimation based on ancestral state estimation and allowing for alternative eukaryotic species tree topologies.
Sources of analytical challenge and error are marked with asterisks.
*Resolving gene clusters and ortholog groups will be highly challenging owing to lack of phylogenetic resolution and hidden paralogy, likely leading to a high proportion of genes that cannot be resolved to cluster or ortholog groups. For this reason, we advocate iterative chains of analysis that allow such gene sets to be identified appropriately and, where possible, revised.
**HMMs generated for ortholog groups will likely cross-sample paralogs and/or xenologs. New tools are needed to allow ortholog sampling that excludes paralogs (e.g., [174]).
***Pipelines to cluster orphan genes will be subject to high error, with false clustering of unrelated genes.
****Manual correction will involve subjective error; this is unavoidable, but community access to these processes is critical to allow for downstream improvement.
*****The flow of new genomic data, with different assembly and annotation standards and varying sources of contamination, will be difficult to integrate while also maintaining standards for comparative analyses.
Legend is shown in a box. ESP, eukaryote signature protein; HMM, hidden Markov model; LECA, last eukaryotic common ancestor.
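As a rough illustration of step (C), the sketch below applies Fitch parsimony to the presence or absence of a single ortholog group and compares the inferred root state under two alternative species tree topologies. The caption does not name a particular ancestral state estimation method, so Fitch parsimony is an assumption chosen for brevity; the taxa "A" to "D", both topologies, and the presence profile are hypothetical toy data, not values from the paper. The approach also ignores HGT and unequal gain and loss rates, which a real analysis would need to model.

```python
# Toy sketch: Fitch parsimony for one ortholog group, coded as
# present (1) or absent (0) in each extant taxon.
# Taxa "A".."D", both topologies, and the profile are hypothetical.


def fitch_root_states(tree, presence):
    """Return (possible states at the root, minimum number of changes).

    `tree` is a leaf name (str) or a pair (left_subtree, right_subtree).
    `presence` maps each leaf name to 0 or 1.
    """
    if isinstance(tree, str):
        return {presence[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch_root_states(left, presence)
    right_states, right_changes = fitch_root_states(right, presence)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: keep both states and count one change.
    return left_states | right_states, left_changes + right_changes + 1


def call_at_root(tree, presence):
    """Classify the root (here standing in for LECA) for one topology."""
    states, _ = fitch_root_states(tree, presence)
    if states == {1}:
        return "present"
    if states == {0}:
        return "absent"
    return "ambiguous"


def consensus_call(topologies, presence):
    """Report a call only if every candidate topology agrees on it."""
    calls = {call_at_root(tree, presence) for tree in topologies}
    return calls.pop() if len(calls) == 1 else "topology-dependent"


# Two hypothetical rootings of the same four taxa.
TOPOLOGY_1 = (("A", "B"), ("C", "D"))
TOPOLOGY_2 = ((("A", "B"), "C"), "D")

# Hypothetical profile: the ortholog group is missing only from taxon D.
ORTHOLOG_GROUP = {"A": 1, "B": 1, "C": 1, "D": 0}

if __name__ == "__main__":
    for name, tree in (("topology 1", TOPOLOGY_1), ("topology 2", TOPOLOGY_2)):
        print(name, call_at_root(tree, ORTHOLOG_GROUP))
    print("consensus:", consensus_call([TOPOLOGY_1, TOPOLOGY_2], ORTHOLOG_GROUP))
```

In this toy case the same presence profile is inferred as present at the root under the first topology but is ambiguous under the second, which is the kind of sensitivity to the species tree that panel (C) is designed to accommodate.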

References

    1. Stanier RY, Doudoroff M, Adelberg EA. The microbial world: Prentice-Hall; 1957.
    2. Lane N, Martin W. The energetics of genome complexity. Nature. 2010;467(7318):929–934. doi: 10.1038/nature09486
    3. Booth A, Doolittle WF. Eukaryogenesis, how special really? Proc Natl Acad Sci U S A. 2015;112(33):10278–10285. doi: 10.1073/pnas.1421376112
    4. Booth A, Doolittle WF. Reply to Lane and Martin: Being and becoming eukaryotes. Proc Natl Acad Sci U S A. 2015;112(35):E4824–E. doi: 10.1073/pnas.1513285112
    5. Lane N, Martin WF. Eukaryotes really are special, and mitochondria are why. Proc Natl Acad Sci U S A. 2015;112(35):E4823–E. doi: 10.1073/pnas.1509237112
