Proc Natl Acad Sci U S A. 2009 Aug 4;106(31):12826-31.
doi: 10.1073/pnas.0905115106. Epub 2009 Jun 24.

Whole-proteome phylogeny of large dsDNA virus families by an alignment-free method

Guohong Albert Wu et al. Proc Natl Acad Sci U S A. 2009.

Abstract

The vast sequence divergence among virus groups has made alignment-based sequence comparison across virus families very difficult. Using an alignment-free comparison method, we construct the whole-proteome phylogeny for a population of 142 large dsDNA eukaryote viruses from 11 viral families. The method is based on feature frequency profiles (FFP), where the length of the feature (l-mer) is selected to be optimal for phylogenomic inference. We observe that (i) the FFP phylogeny segregates the population into clades whose membership agrees remarkably well with the current classification by the International Committee on Taxonomy of Viruses (ICTV), with the one exception that the mimivirus joins the phycodnavirus family; (ii) the FFP tree detects potential evolutionary relationships among some viral families; (iii) the relative position of the 3 herpesvirus subfamilies in the FFP tree differs from that in gene alignment-based analyses; (iv) the FFP tree suggests the taxonomic positions of certain "unclassified" viruses; and (v) the FFP method identifies candidates for horizontal gene transfer (HGT) between virus families.
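
As a rough illustration of the feature-frequency-profile idea described in the abstract, the sketch below counts overlapping l-mers across all proteins of a proteome, normalizes the counts to relative frequencies, and compares two profiles. The function names `ffp` and `js_divergence` are hypothetical, the toy sequences are invented, and the choice of Jensen-Shannon divergence as the distance is an assumption: the abstract does not say which measure is used.

```python
from collections import Counter
from math import log2


def ffp(proteins, l=8):
    """Relative frequency of every overlapping l-mer across a list of protein sequences."""
    counts = Counter()
    for seq in proteins:
        # Sequences shorter than l contribute no features (empty range).
        for i in range(len(seq) - l + 1):
            counts[seq[i:i + l]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles.

    Assumed distance measure: 0 for identical profiles, 1 for profiles
    that share no features.
    """
    div = 0.0
    for kmer in p.keys() | q.keys():
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


# Toy usage with invented sequences and a short feature length.
virus_a = ["MKTAYIAKQR", "GSHMKTAYIA"]
virus_b = ["MKTAYLAKQR"]
distance = js_divergence(ffp(virus_a, l=3), ffp(virus_b, l=3))
print(f"{distance:.3f}")
```

A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method; which method the authors use is not stated in the abstract.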

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Optimal feature (l-mer) length. (A) Cumulative relative entropy (CRE) curves for 142 large dsDNA virus proteomes. (B) Relative sequence divergence (RSD) values for 4 representative viral proteomes, the smallest (NeleNPV), the intermediate (SHFV and CNPV), and the largest (APMV). The optimal feature length for whole-proteome comparison and phylogeny inference is 8 and approximately corresponds to when both CRE and RSD fall to <10% of their maximum values.
Fig. 2.
Common 8-mers and HGT. The number of interviral-family protein pairs vs. the number of common 8-mers in a protein pair for large dsDNA viruses (LDVs). (A) The ascovirus HvAV3e proteome against the baculovirus HzSNPV proteome, suggesting that several protein pairs arise from interfamily HGT events. (B) Interviral-family protein pairs from all LDV proteomes. (C) Interviral-family DNA polymerase pairs. (D) Same as in B but with each protein sequence subjected to random permutation of its amino acids. Interfamily HGT candidates are identified when a protein pair shares an unusually high number of common 8-mers relative to DNA polymerase, the most conserved LDV protein, whose pairs share a maximum of eight 8-mers, as shown in C. Randomized protein sequences share far fewer common 8-mers, with a maximum of four 8-mers, as shown in D.
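
The screening idea in this caption can be sketched as follows: count the 8-mers two proteins have in common, flag pairs that exceed the DNA polymerase maximum, and shuffle residues to build a randomized control. This is a minimal sketch under stated assumptions, not the authors' pipeline. The helper names are hypothetical, counting distinct shared 8-mers (rather than occurrences) is an assumption, and so is the strict `> 8` cutoff, which the caption only motivates by the DNA polymerase maximum of eight.

```python
import random


def kmers(seq, l=8):
    """Set of distinct overlapping l-mers in a protein sequence."""
    return {seq[i:i + l] for i in range(len(seq) - l + 1)}


def common_kmer_count(a, b, l=8):
    """Number of distinct l-mers shared by two sequences (assumed definition)."""
    return len(kmers(a, l) & kmers(b, l))


def shuffled(seq, rng):
    """Random permutation of residues, preserving amino acid composition."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def hgt_candidates(proteome_a, proteome_b, l=8, threshold=8):
    """Protein pairs from two proteomes sharing more than `threshold` l-mers.

    Proteomes are dicts mapping a protein name to its sequence. The default
    threshold of 8 is an assumed cutoff taken from the DNA polymerase maximum.
    """
    hits = []
    for name_a, seq_a in proteome_a.items():
        kmers_a = kmers(seq_a, l)
        for name_b, seq_b in proteome_b.items():
            shared = len(kmers_a & kmers(seq_b, l))
            if shared > threshold:
                hits.append((name_a, name_b, shared))
    return hits


# Toy usage with invented sequences: one identical pair, one unrelated pair.
family_1 = {"p1": "ACDEFGHIKLMNPQRSTVWY"}
family_2 = {"q1": "ACDEFGHIKLMNPQRSTVWY", "q2": "GGGGSGGGGSGGGGSGGGGS"}
print(hgt_candidates(family_1, family_2))

# Randomized control: shuffle each sequence, then recount shared 8-mers.
rng = random.Random(0)
print(common_kmer_count(shuffled(family_1["p1"], rng),
                        shuffled(family_2["q1"], rng)))
```

Using sets keeps the comparison cheap for all-against-all protein pairs, at the cost of ignoring how often a shared 8-mer repeats within a protein.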
Fig. 3.
The LDV whole-proteome tree. The FFP tree of large dsDNA viruses at feature length 8 after deleting genes horizontally transferred between viral families and filtering out low-complexity features. Modified bootstrap percentages <80% are shown and are based on 200 replicates. The tree is drawn using iTOL (48) and is not to scale. The outer circle color-codes the 11 viral families as per ICTV and 2 groups of viruses not assigned to any family: nudivirus and salivary gland hypertrophy virus (SGHV) (see key at the bottom left). The middle layer color-codes the viral subfamilies of the Poxviridae and Herpesviridae. The viral genera are color-coded by both the inner ring and the tree leaves.

References

    1. Herniou EA, Jehle JA. Baculovirus phylogeny and evolution. Curr Drug Targets. 2007;8:1043–1050.
    2. Montague MG, Hutchison CA 3rd. Gene content phylogeny of herpesviruses. Proc Natl Acad Sci USA. 2000;97:5334–5339.
    3. McLysaght A, Baldi PF, Gaut BS. Extensive gene gain associated with adaptive evolution of poxviruses. Proc Natl Acad Sci USA. 2003;100:15655–15660.
    4. de Andrade Zanotto PM, Krakauer DC. Complete genome viral phylogenies suggests the concerted evolution of regulatory cores and accessory satellites. PLoS ONE. 2008;3:e3500.
    5. Iyer LM, Aravind L, Koonin EV. Common origin of four diverse families of large eukaryotic DNA viruses. J Virol. 2001;75:11720–11734.