J Gen Virol. 2018 Sep;99(9):1331-1343.
doi: 10.1099/jgv.0.001110. Epub 2018 Jul 17.

Evaluation of the genomic diversity of viruses infecting bacteria, archaea and eukaryotes using a common bioinformatic platform: steps towards a unified taxonomy

Pakorn Aiewsakun et al. J Gen Virol. 2018 Sep.

Abstract

Genome Relationship Applied to Virus Taxonomy (GRAViTy) is a genetics-based tool that computes sequence relatedness between viruses. Composite generalized Jaccard (CGJ) distances combine measures of homology between encoded viral genes and similarities in genome organizational features (gene orders and orientations). This scoring framework effectively recapitulates the current, largely morphology- and phenotype-based, family-level classification of eukaryotic viruses. Eukaryotic virus families typically formed monophyletic groups, with a consistent CGJ distance cut-off separating the between-family and within-family divergence ranges. In the current study, a parallel analysis of prokaryotic virus families revealed quite different sequence relationships, particularly among the tailed phage families (Siphoviridae, Myoviridae and Podoviridae), where members of the same family were generally far more divergent and often not detectably homologous to each other. Analysis of the 20 currently classified prokaryotic virus families indeed split them into 70 separate clusters of tailed phages genetically equivalent to family-level assignments of eukaryotic viruses. The analysis further divided several bacterial (Sphaerolipoviridae, Tectiviridae) and archaeal (Lipothrixviridae) families. We also found that the subfamily-level groupings of tailed phages were generally more consistent with the family assignments of eukaryotic viruses, which supports ongoing reclassifications, including the designation of Spounavirinae and Vi1virus taxa as new virus families. The current study applied a common benchmark with which to compare taxonomies of eukaryotic and prokaryotic viruses. The findings support the planned shift away from traditional morphology-based classifications of prokaryotic viruses towards a genome-based taxonomy. They demonstrate the feasibility of a unified taxonomy of viruses into which the vast body of metagenomic viral sequences may be consistently assigned.
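
As a rough illustration of the distance described above, the following Python sketch computes a generalized Jaccard similarity between two non-negative feature vectors and combines a gene-homology score with a genome-organization score into one distance. It is not GRAViTy's code. The abstract does not state how the two components are combined, so the geometric mean used here is an assumption, as are the function names, the input vectors and the convention that two all-zero vectors have similarity 0.

```python
from math import sqrt


def generalized_jaccard(x, y):
    """Generalized Jaccard similarity of two non-negative vectors:
    sum of element-wise minima divided by sum of element-wise maxima."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("vectors must be non-negative")
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    # Assumed convention: two all-zero vectors share no detectable features.
    return numerator / denominator if denominator > 0 else 0.0


def composite_distance(homology_a, homology_b, organization_a, organization_b):
    """Hypothetical composite distance: 1 minus the geometric mean of the
    homology and organization similarities (the combination rule is assumed)."""
    j_homology = generalized_jaccard(homology_a, homology_b)
    j_organization = generalized_jaccard(organization_a, organization_b)
    return 1.0 - sqrt(j_homology * j_organization)


# Made-up feature vectors for two genomes, for illustration only.
d = composite_distance([1, 2, 0], [1, 1, 1], [2, 2], [2, 2])
print(round(d, 4))
```

Under this sketch, identical genomes have distance 0, and genomes that share no features in either component have distance 1.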

Keywords: Baltimore classification; bacteriophage; eukaryote; hidden Markov model; metagenomic; prokaryote; taxon; taxonomy; virus; virus classification.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Fig. 1.
(a) Heat map and dendrogram of dsDNA viruses using pairwise composite generalized Jaccard (CGJ) distances. Branches and labels were colour coded by their hosts – Siphoviridae: orange; Myoviridae: blue; Podoviridae: green; other bacterial: purple; archaeal: red; dual host (archaea and bacteria): yellow; eukaryotic: black. The order of taxa in the heat map followed the phylogeny of the dendrogram and was not therefore constrained by ICTV family assignments. (b) Expanded view of the lower right-hand quadrant of the heat map and associated section of the dendrogram showing the genome relationships of archaeal viruses and other related viruses. Bootstrap support is shown above branches in the dendrogram (values of ≥70 % are shown). Taxa were colour coded as in Fig. 1(a).
Fig. 2.
Genome relationships of bacterial and archaeal viruses in Baltimore group II. The heat map and dendrogram and associated bootstrap values are represented as in Fig. 1(b).
Fig. 3.
Dendrogram of members of Caudovirales that have subfamily assignments. Taxa were colour coded for family as in Fig. 1(a) (see the key). Minor discrepancies between genus assignments and phylogeny are shown in red circles. Bootstrap support is shown above branches in the dendrogram (values of ≥70 % are shown).
Fig. 4.
(a) Sets of pairwise composite generalized Jaccard (CGJ) distances between viral sequences of classified members of eukaryotic, bacterial and archaeal virus families. Blue bars represent the totals of pairwise comparisons (left-hand y-axis) over 0.02 distance intervals. For eukaryotic viruses, distances between genera (red bars) and within genera (green bars) were shown using the right-hand y-axis scale. (b) A separate analysis of sequences of members of the Siphoviridae, Podoviridae and Myoviridae families in Caudovirales, showing sets of pairwise distances between different genera (red) and within genera (green).
Fig. 5.
Dendrogram of classified and unclassified dsDNA viruses (Baltimore group I) based on composite generalized Jaccard (CGJ) distances, divided into six separate lines to represent the 139 clades present in the dataset. Tips are labelled with genus for members of Caudovirales (abbreviated as S: Siphoviridae; M: Myoviridae; P: Podoviridae), with family/genus for other bacterial, eukaryotic and archaeal viruses or with accession number codes for unclassified viruses. The scale bar for CGJ distance is shown at the left of each line and the 0.8 threshold that corresponds to eukaryotic family groupings is shown as a grey dotted line. Bootstrap re-sampling was performed with pruned signature tables as previously described [30]. Clades are coloured based on host origin according to the key; those containing both classified and unclassified sequences are shown in a lighter shade. The 39 new candidate unassigned taxonomic units (UTUs) arising from the inclusion of currently unclassified viruses are shaded in light blue. Bootstrap support is shown above branches in the dendrogram (values of ≥70 % are shown).
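
To illustrate how a distance threshold such as the 0.8 line in Fig. 5 partitions taxa into family-level clades, the sketch below performs average-linkage agglomerative clustering on a pairwise distance matrix and stops merging once the closest pair of clusters is at or above the threshold. The caption does not state the linkage method, so average linkage is an assumption. The function name, the taxon labels and the distances are hypothetical, not data from the paper.

```python
def cluster_at_threshold(names, dist, threshold=0.8):
    """Group taxa by average-linkage clustering, merging only clusters whose
    average pairwise distance is below `threshold` (linkage is an assumption)."""
    clusters = [[i] for i in range(len(names))]

    def average_distance(a, b):
        # Mean distance over all member pairs drawn from the two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = average_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d >= threshold:
            break  # every remaining merge would join clusters across the cut-off
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(names[i] for i in c) for c in clusters]


# Hypothetical taxa and distances, for illustration only.
names = ["A", "B", "C", "D"]
dist = [
    [0.00, 0.30, 0.95, 0.90],
    [0.30, 0.00, 0.90, 0.95],
    [0.95, 0.90, 0.00, 0.50],
    [0.90, 0.95, 0.50, 0.00],
]
print(cluster_at_threshold(names, dist))
```

With these made-up distances, A and B merge at 0.30 and C and D at 0.50, while the two resulting groups stay separate because their average distance exceeds 0.8.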

References

    1. Adams MJ, Lefkowitz EJ, King AMQ, Harrach B, Harrison RL, et al. Changes to taxonomy and the International Code of Virus Classification and Nomenclature ratified by the International Committee on Taxonomy of Viruses (2017). Arch Virol. 2017;162:2505–2538. doi: 10.1007/s00705-017-3358-5.
    2. Philippe N, Legendre M, Doutre G, Couté Y, Poirot O, et al. Pandoraviruses: amoeba viruses with genomes up to 2.5 Mb reaching that of parasitic eukaryotes. Science. 2013;341:281–286. doi: 10.1126/science.1239181.
    3. Koonin EV, Senkevich TG, Dolja VV. The ancient virus world and evolution of cells. Biol Direct. 2006;1:29. doi: 10.1186/1745-6150-1-29.
    4. Holmes EC. What does virus evolution tell us about virus origins? J Virol. 2011;85:5247–5251. doi: 10.1128/JVI.02203-10.
    5. Rohwer F, Edwards R. The phage proteomic tree: a genome-based taxonomy for phage. J Bacteriol. 2002;184:4529–4535. doi: 10.1128/JB.184.16.4529-4535.2002.
