Nature. 2023 Apr;616(7958):783-789.
doi: 10.1038/s41586-023-05962-4. Epub 2023 Apr 19.

Mirusviruses link herpesviruses to giant viruses


Morgan Gaïa et al. Nature. 2023 Apr.

Abstract

DNA viruses have a major influence on the ecology and evolution of cellular organisms [1-4], but their overall diversity and evolutionary trajectories remain elusive [5]. Here we carried out a phylogeny-guided genome-resolved metagenomic survey of the sunlit oceans and discovered plankton-infecting relatives of herpesviruses that form a putative new phylum dubbed Mirusviricota. The virion morphogenesis module of this large monophyletic clade is typical of viruses from the realm Duplodnaviria [6], with multiple components strongly indicating a common ancestry with animal-infecting Herpesvirales. Yet, a substantial fraction of mirusvirus genes, including hallmark transcription machinery genes missing in herpesviruses, are closely related homologues of giant eukaryotic DNA viruses from another viral realm, Varidnaviria. These remarkable chimaeric attributes connecting Mirusviricota to herpesviruses and giant eukaryotic viruses are supported by more than 100 environmental mirusvirus genomes, including a near-complete contiguous genome of 432 kilobases. Moreover, mirusviruses are among the most abundant and active eukaryotic viruses characterized in the sunlit oceans, encoding a diverse array of functions used during the infection of microbial eukaryotes from pole to pole. The prevalence, functional activity, diversification and atypical chimaeric attributes of mirusviruses point to a lasting role of Mirusviricota in the ecology of marine ecosystems and in the evolution of eukaryotic DNA viruses.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Evolutionary relationships between Nucleocytoviricota, Herpesvirales and mirusviruses.
Left: a maximum-likelihood phylogenetic tree built from the GOEV database (1,722 genomes) on the basis of a concatenation of manually curated RNApolA, RNApolB, DNApolB and TFIIS genes (3,715 amino acid positions) using the posterior mean site frequency mixture model (LG + C30 + F + R10) and rooted between mirusviruses and the rest. Highlighted phylogenetic supports (dots in the tree) were considered high (approximate likelihood ratio test (aLRT) ≥ 80 and ultrafast bootstrap approximation (UFBoot) ≥ 95, in black) or medium (aLRT ≥ 80 or UFBoot ≥ 95, in yellow; see Methods). The tree was decorated with rings of complementary information and visualized with anvi’o. Right: predicted 3D structures for the HK97 MCP of Caudoviricetes, mirusvirus and herpesvirus representatives obtained using AlphaFold2. Proteins are coloured on the basis of secondary structure properties. The panel also shows predicted 3D structures for the triplex capsid proteins of mirusvirus and herpesvirus representatives using the same methodology.
Fig. 2. Genomic statistics and evolution of mirusviruses.
a, Genomic and environmental statistics for the seven Mirusviricota subclades. Av., average; aa, amino acids; KEGG, Kyoto Encyclopedia of Genes and Genomes; N50, the shortest contig length needed to capture 50% of the total assembly size; Med., Mediterranean. b, A maximum-likelihood phylogenetic tree built from the Mirusviricota MAGs on the basis of a concatenation of four hallmark informational genes (those encoding RNApolA, RNApolB, DNApolB and TFIIS; 3,715 amino acid positions) using the LG + F + R7 model. c, A maximum-likelihood phylogenetic tree built from the Mirusviricota MAGs on the basis of the MCP (701 amino acid positions) using the LG + R6 model. Both trees were rooted between clade M6 and other clades. Values at nodes represent branch supports (out of 100) calculated by the Shimodaira–Hasegawa-like aLRT (1,000 replicates; left score) and UFBoot (1,000 replicates; right score).
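The N50 definition given in this caption can be illustrated with a minimal sketch. The function name `n50` and the contig lengths are hypothetical and are not taken from the paper; the sketch only follows the caption's wording (the shortest contig in the set of longest contigs that together capture at least 50% of the assembly size).

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the total
    assembly size.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_contigs))  # 80 + 70 = 150 is half of 300, so this prints 70
```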
Fig. 3. In situ expression profile of mirusviruses during infection.
Left: summary of the overall metatranscriptomic signal of different gene categories for the mirusvirus MAGs among the Tara Oceans metatranscriptomes. DCM, deep chlorophyll maximum layer; Meso, mesopelagic (top dark ocean layer below 200 m). Right: summary of the occurrence of 35 Mirusviricota core gene clusters as a ratio for the mirusvirus MAGs (mirus) and Nucleocytoviricota (Nucleocyto.). The panel also shows box plots corresponding to the overall metatranscriptomic signal for genes corresponding to the 35 core gene clusters and occurring in the 10 most abundant mirusviruses among the Tara Oceans metagenomes. Percentage values are genome-centric and correspond to the percentage of mean coverage (sum across all the metatranscriptomes) of one gene when considering the cumulated mean coverage of all genes (sum across all the metatranscriptomes) found in the corresponding genome. Centre lines in box plots show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles; outliers are represented by dots (n = 10 points). Red., reductase.
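The genome-centric percentage described in this caption can be read as a simple normalization. The sketch below is one interpretation of that wording, not the authors' pipeline; the function name, the input layout (one list of mean coverage values per gene, one value per metatranscriptome) and the numbers are all assumptions made for illustration.

```python
def genome_centric_percentages(coverage):
    """Convert per-gene mean coverage into genome-centric percentages.

    coverage maps a gene name to its mean coverage values, one value per
    metatranscriptome. Each gene's values are summed across all
    metatranscriptomes, then divided by the cumulated sum over all genes of
    the same genome.
    """
    gene_totals = {gene: sum(values) for gene, values in coverage.items()}
    genome_total = sum(gene_totals.values())
    if genome_total == 0:
        # No signal anywhere in this genome: report 0% for every gene.
        return {gene: 0.0 for gene in gene_totals}
    return {
        gene: 100.0 * total / genome_total
        for gene, total in gene_totals.items()
    }


# Hypothetical coverage values for three genes across two metatranscriptomes.
example = {
    "gene_a": [1.0, 3.0],
    "gene_b": [2.0, 2.0],
    "gene_c": [0.0, 8.0],
}
print(genome_centric_percentages(example))
# {'gene_a': 25.0, 'gene_b': 25.0, 'gene_c': 50.0}
```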
Fig. 4. Evolutionary trajectories of the eukaryotic informational module.
a, Summary of the occurrence of hallmark genes for the informational and virion modules in Nucleocytoviricota, mirusviruses, herpesviruses and Caudoviricetes. Informational module genes with a strong evolutionary relationship are connected with a line. Genes containing information pointing to a common eukaryotic viral ancestry between mirusviruses and herpesviruses are framed. VLTF3, viral late transcription factor 3. b,c, Descriptions of two evolutionary scenarios in which the informational module of eukaryote-infecting viruses within the realms Duplodnaviria and Varidnaviria first emerged in the ancestor of either Nucleocytoviricota (giant virus hypothesis) or mirusviruses (mirusvirus hypothesis).
Extended Data Fig. 1. Identification of novel DNA-dependent RNA polymerase B (RNApolB) clades in the sunlit ocean.
The maximum-likelihood phylogenetic tree (LG+F+R10 model, 906 sites) is based on 2,728 RNApolB sequences more than 800 amino acids in length with similarity <90% (gray color in the inner ring) identified from 11 large marine metagenomic co-assemblies. This analysis also includes 262 reference RNApolB sequences (red color in the inner ring) corresponding to known archaeal, bacterial, eukaryotic and giant virus lineages for perspective. The middle ring shows the number of RNApolB sequences from the 11 metagenomic co-assemblies that match the selected amino acid sequence with identity >90% (log10). The outer ring displays selections made for the different clades. Finally, new RNApolB lineages are labelled with a red dot for mirusviruses (subclades were characterized in subsequent analyses) and with a blue dot for Proculviricetes.
Extended Data Fig. 2. Single-protein and concatenated phylogenies of the four informational hallmark genes in the GOEV database.
Maximum-likelihood phylogenetic trees of the RNApolA, RNApolB, DNApolB and TFIIS genes were built from the GOEV database using the LG+F+R10 model (selected by ModelFinder Plus) and rooted between Pokkesviricetes and the rest. Phylogenetic supports were considered high (aLRT ≥ 80 and UFBoot ≥ 95, in black), medium (aLRT ≥ 80 or UFBoot ≥ 95, in yellow) or low (aLRT < 80 and UFBoot < 95, in red) (see Methods). Finally, the concatenated tree described in Fig. 1 is also presented at the bottom for perspective.
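The three support categories used in this caption amount to a small decision rule on the two branch-support values. A minimal sketch follows; the function name `classify_support` is hypothetical, and only the thresholds (aLRT 80, UFBoot 95) and the colour mapping come from the caption.

```python
ALRT_THRESHOLD = 80    # threshold stated in the caption
UFBOOT_THRESHOLD = 95  # threshold stated in the caption


def classify_support(alrt, ufboot):
    """Classify a branch as 'high', 'medium' or 'low' support.

    high:   both aLRT and UFBoot reach their thresholds (black in the figure)
    medium: exactly one of the two reaches its threshold (yellow)
    low:    neither reaches its threshold (red)
    """
    alrt_ok = alrt >= ALRT_THRESHOLD
    ufboot_ok = ufboot >= UFBOOT_THRESHOLD
    if alrt_ok and ufboot_ok:
        return "high"
    if alrt_ok or ufboot_ok:
        return "medium"
    return "low"


# Hypothetical (aLRT, UFBoot) pairs, for illustration only.
for pair in [(92, 99), (85, 90), (60, 97), (70, 80)]:
    print(pair, classify_support(*pair))
```

Because the "high" case is tested first, the "medium" branch is reached only when exactly one threshold is met, which matches how the caption separates black from yellow supports.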
Extended Data Fig. 3. 3D structure of the major capsid protein (MCP).
The figure displays MCP 3D structures for Escherichia phage HK97 (Caudoviricetes), a representative genome for the mirusviruses (estimated using AlphaFold), and the human cytomegalovirus (Herpesvirales). PDB accession numbers for the HK97 and cytomegalovirus MCPs are indicated in parentheses.
Extended Data Fig. 4. Protein sequence and predicted 3D structure comparisons.
Panel A displays protein sequence and 3D structure comparisons (BLASTp and Foldseek) for the HK97 MCP of representatives covering various families from the three main Duplodnaviria clades. Center lines in boxplots show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles; outliers are represented by dots (from top to bottom, n = 22, 50, 38, 25, 35, 16, 40, 23 and 117 independent comparisons). The alignment values range from a minimum of 9 amino acids to a maximum of 1,437 amino acids. The bitscore values range from a minimum of 19.6 to a maximum of 2,577. The Foldseek TMscore values range from a minimum of 0.09 to a maximum of 0.997. The dendrogram was generated using Euclidean distance and Ward linkage within anvi’o and is based on the Foldseek TMscore values. Panel B describes a selection of predicted 3D structures for the HK97 MCP and triplex proteins of representatives from the three main Duplodnaviria clades (Caudoviricetes viruses lack the triplex capsid proteins). Proteins are colored based on secondary structure properties.
Extended Data Fig. 5. Phylogeny of the DNApolB hallmark gene.
The figure displays a maximum-likelihood phylogenetic tree (847 sites, 1,475 sequences) of DNA-polymerase B-family sequences using the LG+F+R10 model (selected by ModelFinder Plus) from the database described herein, Duplodnaviria and Baculoviridae sequences from the NCBI viral genomic database, and eukaryotic and viral sequences from Kazlauskas et al. (see Methods). Eukaryotic Epsilon-type and related clades were used as outgroup. Phylogenetic supports were considered high (aLRT ≥ 80 and UFBoot ≥ 95, in black), medium (aLRT ≥ 80 or UFBoot ≥ 95, in yellow) or low (aLRT < 80 and UFBoot < 95, in red) (see Methods). Baculo: Baculoviridae; Caudo: Caudoviricetes; Nudi: Nudiviridae.
Extended Data Fig. 6. Functional clustering of mirusviruses and reference viral genomes from culture.
The inner tree is a clustering of ‘Mirusviricota’ and other genomes based on the occurrence of all gene clusters (OrthoFinder method, Bray-Curtis distance).
Extended Data Fig. 7. Functional clustering of abundant and widespread marine viruses within mirusviruses and Nucleocytoviricota.
In panel A, the inner tree is a clustering of ‘Mirusviricota’ and Nucleocytoviricota genomes >100 kbp in length based on the occurrence of all the non-singleton gene clusters (Euclidean distance), rooted with the Chordopoxvirinae subfamily of Poxviridae genomes. Rings of information display the main taxonomy of Nucleocytoviricota as well as the occurrence of 60 gene clusters detected in at least 50% of ‘Mirusviricota’ or Nucleocytoviricota. The 60 gene clusters are clustered based on their occurrence (absence/presence) across the genomes. Panel B displays the occurrence of gene clusters of known Pfam functions detected in at least 50% of ‘Mirusviricota’ or Nucleocytoviricota genomes.
Extended Data Fig. 8. Mirusviruses contain new phylogenetic clades of histones and heliorhodopsins.
The figure displays two panels. The left panel displays a maximum-likelihood phylogenetic tree of histones occurring in the GOEV database and in eukaryotic MAGs, rooted with H4 (distant eukaryotic clade) (266 sequences; 180 sites) and based on the LG+R8 model. The various eukaryotic clades distant from H2-H3-H4 were excluded to focus on the more restrained viral signal. A ring provides additional taxonomic information. The bottom panel summarizes the proportion of genomes from different viral clades containing histones. Phylogenetic supports were considered high (aLRT ≥ 80 and UFBoot ≥ 95, in black), medium (aLRT ≥ 80 or UFBoot ≥ 95, in yellow) or low (aLRT < 80 and UFBoot < 95, in red) (see Methods). The right panel displays a maximum-likelihood phylogenetic tree of heliorhodopsins occurring in the GOEV database and in eukaryotic MAGs (280 sequences; 313 sites), rooted with a large clade enriched in eukaryotes and based on the VT+F+R8 model. A ring provides additional taxonomic information. The bottom panel summarizes the proportion of genomes from different viral clades containing heliorhodopsins. Phylogenetic supports were considered high (aLRT ≥ 80 and UFBoot ≥ 95, in black), medium (aLRT ≥ 80 or UFBoot ≥ 95, in yellow) or low (aLRT < 80 and UFBoot < 95, in red) (see Methods).
Extended Data Fig. 9. Environmental signal of eukaryotic virus clades in the sunlit oceans.
For each marine eukaryotic virus clade, the box plots display the cumulative mean coverage of GOEV genomes among 937 Tara Oceans metagenomes. Only genomes detected in at least one metagenome were considered. Center lines in boxplots show the medians; box limits indicate the 25th and 75th percentiles; whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles; outliers are represented by dots. The mean coverage values range from a minimum of 0.35x to a maximum of 6,273.1x. The number of considered genomes per clade and their cumulative coverage median are also described.
Extended Data Fig. 10. A near-complete genome for ‘Mirusviricota’.
Synteny of 355 genes in the mirusvirus near-complete contiguous genome highlighting the occurrence of hallmark genes for the informational and virion modules, as well as heliorhodopsins and histone. Genes with a hit to HMMs from either Duplodnaviria or Varidnaviria are labelled in green and red, respectively (inner tree).

References

    1. Vincent F, Sheyn U, Porat Z, Schatz D, Vardi A. Visualizing active viral infection reveals diverse cell fates in synchronized algal bloom demise. Proc. Natl Acad. Sci. USA. 2021;118:e2021586118. doi: 10.1073/pnas.2021586118.
    2. Suttle CA. Marine viruses — major players in the global ecosystem. Nat. Rev. Microbiol. 2007. doi: 10.1038/nrmicro1750.
    3. Irwin NAT, Pittis AA, Richards TA, Keeling PJ. Systematic evaluation of horizontal gene transfer between eukaryotes and viruses. Nat. Microbiol. 2022;7:327–336. doi: 10.1038/s41564-021-01026-3.
    4. Moniruzzaman M, Weinheimer AR, Martinez-Gutierrez CA, Aylward FO. Widespread endogenization of giant viruses shapes genomes of green algae. Nature. 2020. doi: 10.1038/s41586-020-2924-2.
    5. Koonin EV, Dolja VV, Krupovic M. Origins and evolution of viruses of eukaryotes: the ultimate modularity. Virology. 2015;479–480:2–25. doi: 10.1016/j.virol.2015.02.039.
