Mol Biol Evol. 2021 May 4;38(5):2014-2029.
doi: 10.1093/molbev/msab003.

Coevolutionary and Phylogenetic Analysis of Mimiviral Replication Machinery Suggest the Cellular Origin of Mimiviruses


Supriya Patil et al. Mol Biol Evol. 2021.

Abstract

Mimivirus is one of the largest and most complex viruses known. The origin and evolution of Mimivirus and other giant viruses have been the subject of intense study over the last two decades. The two prevailing hypotheses on the origin of Mimivirus and other viruses are the reduction hypothesis, which posits that viruses emerged from modern unicellular organisms, and the virus-first hypothesis, which proposes that viruses are relics of precellular forms of life. In this study, to gain insight into the origin of Mimivirus, we carried out extensive phylogenetic, correlation, and multidimensional scaling analyses of the putative proteins involved in the replication of its 1.2-Mb genome. The correlation analysis and multidimensional scaling methods were validated on bacteriophage, bacterial, archaeal, and eukaryotic replication proteins before being applied to Mimivirus. We show that a large fraction of mimiviral replication proteins, including polymerase B, the clamp, and the clamp loaders, are of eukaryotic origin and are coevolving. Although phylogenetic analysis places some components along the lineages of phage and bacteria, we show that all the replication-related genes have been homogenized and are under purifying selection. Collectively, our analysis supports the idea that Mimivirus originated from a complex cellular ancestor. We hypothesize that Mimivirus has largely retained complex replication machinery reminiscent of its progenitor while losing most of the other genes related to processes such as metabolism and translation.
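
The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes a mirror-tree-style comparison, in which each protein family is summarized by a pairwise distance matrix over the same ordered set of taxa and two families are scored by the Pearson correlation of those distances. The abstract does not spell out the exact procedure, so the function names and toy matrices below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def coevolution_score(dist_a, dist_b):
    """Correlate two protein families' distance matrices (same taxon order)."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Hypothetical toy data: pairwise distances among four taxa for two proteins.
polymerase = [
    [0.0, 0.1, 0.4, 0.8],
    [0.1, 0.0, 0.5, 0.7],
    [0.4, 0.5, 0.0, 0.3],
    [0.8, 0.7, 0.3, 0.0],
]
# A second family evolving twice as fast along the same tree.
clamp = [[2 * d for d in row] for row in polymerase]

score = coevolution_score(polymerase, clamp)
```

A score near 1 means the two families' distances rise and fall together across taxa, which is the kind of signal read as coevolution; computing the score for every pair of proteins would give a correlation matrix of the sort displayed as a heatmap.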

Keywords: DNA replication; HGT; LUCA; LUCELLA; MDS; Mimivirus; NCLDVs; coevolution; correlation analysis; evolution; evolutionary selection; giant viruses; phylogenetic; phylogenetic trees; purifying selection.


Figures

Fig. 1.
Phylogenetic trees of proteins from the cellular domains (eukaryotes, bacteria, archaea), phage T4, and Mimivirus. Multiple sequence alignment was performed with MUSCLE in MEGA6.0, and trees were built with FastTree v.2.1. The phylogenetic trees of (A) DNA polymerase, (B) sliding clamp, (C) clamp loader 1, (D) clamp loader 2, (E) topoisomerase IA, (F) topoisomerase II, (G) helicase, (H) ligase, and (I) SSBP were constructed using four sequences from each cellular domain and viral family; accession numbers of the protein sequences used to construct the trees are listed in supplementary table S2, Supplementary Material online. Red, Mimiviridae (M); maroon, eukaryotes (E); green, bacteria (B); blue, T4-like phages (T); purple, archaea (A).
Fig. 2.
Components of the phage T4 and bacterial replication machinery show evidence of coevolution. Coevolution of protein complexes was analyzed by Pearson correlation coefficient and multidimensional scaling (MDS) analysis. (A) A matrix of 13 replication proteins of phage T4, with correlation coefficients shown as a heatmap, and (B) the MDS analysis of the same set of proteins, showing the clustering of proteins in 2D space. The phage T4 proteins are DNA polymerase (gp43), helicase (gp41), helicase loader (gp59), primase (gp61), sliding clamp (gp45), clamp loaders (gp44/gp62), SSBP (gp32), ligase (gp30), topoisomerase (gp39, gp52, and gp60), and RNase H. (C) A correlation matrix of 22 bacterial core replication proteins, with correlation coefficients displayed as a heatmap, and (D) the MDS analysis of the same set of proteins, showing the distances between proteins in 2D space. The goodness of fit of the MDS was assessed by Shepard diagram (supplementary fig. S3, Supplementary Material online); a smaller Kruskal's stress (1) indicates a better MDS representation (supplementary table S10, Supplementary Material online).
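
The goodness-of-fit measure named in this caption can be made concrete with a short sketch of Kruskal's stress-1 for a given 2D embedding. As a simplifying assumption, it uses the input dissimilarities directly as the disparities (the metric case); the function name and toy coordinates are hypothetical and are not taken from the paper.

```python
from math import dist, sqrt  # math.dist requires Python 3.8+


def kruskal_stress1(dissimilarities, coords):
    """Kruskal's stress-1 of an embedding.

    dissimilarities: square symmetric matrix of input dissimilarities.
    coords: one coordinate tuple per item, in the same order.
    Assumption: disparities equal the input dissimilarities (metric MDS).
    """
    n = len(coords)
    if len(dissimilarities) != n:
        raise ValueError("matrix size and number of points differ")
    residual = 0.0  # squared mismatch between input and embedded distances
    scale = 0.0     # sum of squared embedded distances (normalizer)
    for i in range(n):
        for j in range(i + 1, n):
            embedded = dist(coords[i], coords[j])
            residual += (dissimilarities[i][j] - embedded) ** 2
            scale += embedded ** 2
    if scale == 0:
        raise ValueError("all embedded points coincide")
    return sqrt(residual / scale)


# Hypothetical example: a 3-4-5 right triangle embedded exactly.
exact = [[0, 3, 5], [3, 0, 4], [5, 4, 0]]
triangle = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
perfect_fit = kruskal_stress1(exact, triangle)

# Three mutually equidistant items forced onto a line cannot fit perfectly.
equidistant = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
poor_fit = kruskal_stress1(equidistant, line)
```

A stress of 0 means the 2D distances reproduce the input dissimilarities exactly, and larger values mean more distortion, which is why a smaller stress is read as a better representation.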
Fig. 3.
Coevolutionary analysis of archaeal and eukaryotic replication machinery. The proteins of the A. pernix archaeal replication machinery show diversity among replication components. (A) A correlation matrix of 18 replication proteins of the archaeon, displayed as a heatmap, and (B) the MDS analysis of the same set of proteins, showing the distances between proteins in 2D space; the scattered pattern suggests diversity. The eukaryotic replication machinery components show coevolution, as analyzed by (C) a correlation matrix of replication proteins, displayed as a heatmap, and (D) the MDS analysis of the same set of proteins, showing the distances between proteins in 2D space. The goodness of fit of the MDS was assessed by Shepard diagram (supplementary fig. S3, Supplementary Material online); a smaller Kruskal's stress (1) indicates a better MDS representation (supplementary table S10, Supplementary Material online).
Fig. 4.
Components of the Mimiviridae replication machinery show evidence of coevolution. (A) A matrix of 21 replication proteins of Mimivirus, with correlation coefficients shown as a heatmap, and (B) the MDS analysis of the same set of proteins, showing the clustering of proteins in 2D space. The goodness of fit of the MDS was assessed by Shepard diagram (supplementary fig. S3, Supplementary Material online); a smaller Kruskal's stress (1) indicates a better MDS representation (supplementary table S10, Supplementary Material online). (C) Maximum likelihood phylogenetic trees of the mimiviral proteins gp351, gp229, gp515, gp243, gp532, gp441, gp549, gp331, and gp544 show a cophylogenetic mirror pattern; gp1 is the exception. Accession numbers of the respective proteins are given in supplementary table S10, Supplementary Material online. Red, Mimiviridae I lineage A; blue, Mimiviridae I lineage B; yellow, Mimiviridae I lineage C; maroon, Mimiviridae III; green, Klosneuvirinae. The proteins are putative helicase (gp229), putative replication origin-binding protein (gp1), putative ATP-dependent DNA helicase (gp612), putative helicase (gp635), putative helicase (gp8), putative helicase (gp132), DNA topoisomerase 1 (gp243), DNA topoisomerase 2 (gp515), DNA ligase (gp331), DNA polymerase (gp351), putative proliferating cell nuclear antigen (gp886), probable DNA polymerase sliding clamp (gp532), putative replication factor C small subunit (gp549), putative replication factor C small subunit (gp513), putative replication factor C small subunit (gp425), putative replication factor C large subunit (gp441), putative replication factor C small subunit (gp538), probable ribonuclease H protein (gp326), probable ribonuclease 3 (gp371), hypothetical protein-SSBPs (gp544), putative endonuclease of the XPG family (gp417), and putative nuclease (gp456).
Fig. 5.
Representative proteins of Mimiviridae family viruses involved in different processes, such as DNA replication, transcription, translation, DNA repair, and genome packaging, show evidence of coevolution. (A) A matrix of correlation coefficients of the proteins, shown as a heatmap, and (B) the MDS analysis of the same set of proteins, showing the clustering of proteins in 2D space. The proteins are DNA polymerase (gp351), putative helicase (gp229), DNA topoisomerase 2 (gp515), putative serine/threonine-protein kinase (gp254), probable uracil-DNA glycosylase (gp271), putative DNA mismatch repair protein MutS-like protein (gp389), DNA-directed RNA polymerase subunit 2 (gp266), ribonucleoside-diphosphate reductase large subunit (gp342), putative translation initiation factor 4a (gp492), DNA-directed RNA polymerase subunit 1 (gp540), uncharacterized hydrolase (gp597), and A32 virion packaging ATPase (gp468).
Fig. 6.
Mimiviral DNA replication genes are homogeneous and are under purifying selection. (A) The replication genes, analyzed for their normalized expression level, %GC content with the effective codon number (Nc), and dN/dS, show no evidence of HGT, and (B) most of the mimiviral replication genes are located toward the center of the genome.
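
One of the compositional measures named in this caption, %GC content, can be sketched as follows. The idea that a gene whose %GC departs strongly from the genome-wide value is a candidate for horizontal transfer is a common heuristic, not a description of the authors' pipeline; the gene names, sequences, and genome-wide value below are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def gc_deviation(gene_seqs, genome_gc):
    """Per-gene difference between gene %GC and the genome-wide %GC."""
    return {name: gc_percent(seq) - genome_gc for name, seq in gene_seqs.items()}


# Hypothetical sequences and a hypothetical genome-wide %GC of 30.
genes = {
    "geneA": "ATATTAGCAT",  # 2 of 10 bases are G or C
    "geneB": "GGCCGGCCAT",  # 8 of 10 bases are G or C
}
deviations = gc_deviation(genes, genome_gc=30.0)
```

Genes with deviations close to zero are compositionally homogeneous with the rest of the genome; a large deviation would only flag a gene for closer inspection and would not by itself demonstrate HGT.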
Fig. 7.
A speculative hypothesis, based on the ancestral nature of the Mimivirus replication machinery, that giant viruses originated and evolved from a descendant of LUCELLA: the last common ancestor of giant viruses (LCAGV), with a double-stranded DNA genome carrying complex DNA replication machinery. LUCA, last universal common ancestor; LUCELLA, last universal cellular ancestor; LAECA, last archaeo-eukaryotic common ancestor; LECA, last eukaryotic common ancestor.

