Review
Front Microbiol. 2023 Aug 3:14:1240993.
doi: 10.3389/fmicb.2023.1240993. eCollection 2023.

A critical analysis of the current state of virus taxonomy

Gustavo Caetano-Anollés et al. Front Microbiol. 2023.

Abstract

Taxonomical classification has preceded evolutionary understanding. For that reason, taxonomy has become a battleground fueled by knowledge gaps, technical limitations, and a priorism. Here we assess the current state of the challenging field, focusing on fallacies that are common in viral classification. We emphasize that viruses are crucial contributors to the genomic and functional makeup of holobionts, organismal communities that behave as units of biological organization. Consequently, viruses cannot be considered taxonomic units because they challenge crucial concepts of organismality and individuality. Instead, they should be considered processes that integrate virions and their hosts into life cycles. Viruses harbor phylogenetic signatures of genetic transfer that compromise monophyly and the validity of deep taxonomic ranks. A focus on building phylogenetic networks using alignment-free methodologies and molecular structure can help mitigate the impasse, at least in part. Finally, structural phylogenomic analysis challenges the polyphyletic scenario of multiple viral origins adopted by virus taxonomy, defeating a polyphyletic origin and supporting instead an ancient cellular origin of viruses. We therefore prompt abandoning deep ranks and urgently reevaluating the validity of taxonomic units and principles of virus classification.

Keywords: ICTV; classification; evolution; holobiont; horizontal genetic transfer; reticulation; virus origin.

Conflict of interest statement

AN is a shareholder and employee at Moderna, Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Matching taxonomies to evolution. The endeavor (A) may prove difficult in the presence of taxonomic terminal units that are holobionts (B), phylogenies with reticulations (dashed lines) caused by horizontal gene transfer (line connecting taxa c and j) or recruitment (line connecting taxa c and i) (C), or the existence of independent origins that break up monophyletic relationships (D). Note that reticulations at higher rank levels enhance the chances of multiple origins in evolution.
Figure 2
The current virus taxonomy is a 15-ranked system that can be visualized as a taxonomic pyramid when phylogenetic relationships are mapped onto the ranked classification system. The example pyramid shows a classification of the phylum ‘Peploviricota’, which hosts the herpesviruses. Note that only one species per genus illustrates the 133 that currently map to the different genera.
Figure 3
Comparing the Baltimore classification of viruses and the ranking of realms uncovers evolutionarily entangled systems. (A) The seven Baltimore classes describe processes of information transfer that lead to mRNA molecules necessary for translation into viral proteins. (B) A bimodal network mapping realms to Baltimore classes shows the entangled relationships between the two classification schemes. (C) A phylogenetic reconstruction of a tree of Baltimore classes and a tree of realms from viral traits related to replication, transcription and translation reveals comparable evolutionary histories. CI, consistency index; HI, homoplasy index; RI, retention index; RC, rescaled consistency index.
Figure 4
Well-known limitations make building a virus taxonomy a challenging proposition.
Figure 5
The basal paraphyletic grouping of viruses in a uToL describing the evolution of proteomes from viruses and cellular organisms challenges the monophyletic classification of viruses. The phylogeny (phylogenetic tree length = 45,935; retention index = 0.83; g1 = −0.31) rendered in ‘fan’ format describes the evolution of 368 proteomes (taxa) randomly sampled from cells and viruses (Nasir and Caetano-Anollés, 2015). This tree of proteomes was reconstructed from 442 parsimony-informative phylogenetic characters representing genomic abundance of 442 domain structures that were universally present in the 3 domains of cellular life and viruses and were defined at SCOP fold superfamily level of protein classification. Differently colored branches represent bootstrap support (BS) values. Viral taxa are labeled with family names and are indexed with realms-kingdoms and Baltimore classes. While many viral families do form largely unified monophyletic groups, viruses as a collective group are paraphyletic, and so are most realms or Baltimore classes. Inset: Virus paraphyly in deep branches leading to virus families is traced in orange.
Figure 6
Three main scenarios of viral origins suggest viruses originated during either a pre-cellular world, a primordial cellular world, or a diversified cellular world. The pre-cellular ‘Virus-first’ hypothesis is problematic because all viruses depend on cells to propagate. The ‘Escape’ hypothesis, in which viruses originate as ‘escapees’ from already diversified cells belonging to Archaea, Bacteria or Eukarya, is incompatible with viruses carrying conserved protein fold structures that are common to all domains of life, which suggests they arose prior to the ‘last universal cellular ancestor’ (LUCellA). The more likely ‘Reduction’ hypothesis suggests viruses appeared prior to LUCellA in an emergent cellular world.
Figure 7
A census of SCOP structural domains challenges the ‘virus-first’ and ‘escape’ hypotheses. (A) Venn diagrams describe the distribution of 1,995 fold superfamilies and 3,892 fold families identified with HMMs of structural recognition in Archaea, Bacteria, Eukarya, and viruses following a survey of 5,080 and 8,127 proteomes, respectively. The red circle highlights the number of superfamilies and families that are shared by all three organismal domains and viruses. (B) Venn diagrams describe the distribution of the 715 superfamilies and 1,526 families that were present in archaeoviruses, bacterioviruses and eukaryoviruses. Note that the existence of structures present in the three viral groups (the abe Venn group in the red circle) does not imply they belong to viruses capable of infecting organisms in the three domains of cellular life (an impossibility). Instead, it shows the groups of structural domains shared by viruses infecting the different hosts. Data from Nasir and Caetano-Anollés (2015) and Mughal et al. (2020).
Figure 8
The evolutionary history of structural domains defined at SCOP family level reveals gradual evolutionary accumulation of domains in the proteomes of cells and viruses. A rooted phylogenomic tree describing the evolution of the 3,892 families that are present in 8,127 proteomes allowed calculation of times of origin for families unique or shared among Archaea (A), Bacteria (B) and Eukarya (E) and viruses (V). Horizontal bar plots show ranges of ‘times of origin’ in a geological time scale defined by a molecular clock of folds that ranges from the origin of domains 3.8 billion years ago (Gya) to the present (0 Gya). Numbers in bars indicate families appearing in each evolutionary phase of the timeline. A most likely chronology of cellular evolution inferred from Venn group distributions is shown on top of bar plots as a series of phylogenetic networks reconstructed with the Neighbor-Net algorithm in SplitsTree. The chronology confirms an evolutionary progression in which ancestral cells (A) coalesce into a last universal common ancestor (LUCA), which then diversifies into a last universal cellular ancestor (LUCellA) and ancestors of viruses (AV), the rise of Archaea and a stem line leading to ancestors of Bacteria and Eukarya (ABE) and then Eukarya (AE), and finally to modern diversified lineages of Archaea, Bacteria, Eukarya and viruses. A similar progression was obtained when analyzing domains defined at superfamily level. Data from Mughal et al. (2020).
Figure 9
Proteomic composition of viruses infecting the three domains of cellular life. Numbers in parentheses indicate the number of viruses that were surveyed. Data from Nasir and Caetano-Anollés (2015) and Mughal et al. (2020).

References

    1. Agol V. I. (1974). Towards the system of viruses. Biosystems 6, 113–132. doi: 10.1016/0303-2647(74)90003-3
    2. Alves M. R. P. (2020). The natural fallacy in the post-truth era. EMBO Rep. 21:e49859. doi: 10.15252/embr.201949859
    3. Amicone M., Borges V., Alves M. J., Isidro J., Zé-Zé L., Duarte S., et al. (2022). Mutation rate of SARS-CoV-2 and emergence of mutators during experimental evolution. Evol. Med. Public Health 10, 142–155. doi: 10.1093/emph/eoac010
    4. Baltimore D. (1971). Expression of animal virus genomes. Bacteriol. Rev. 35, 235–241. doi: 10.1128/br.35.3.235-241.1971
    5. Bandea C. I. (1983). A new theory on the origin and the nature of viruses. J. Theor. Biol. 105, 591–602. doi: 10.1016/0022-5193(83)90221-7