Mol Phylogenet Evol. 2005 Aug;36(2):224-32.
doi: 10.1016/j.ympev.2005.03.030.

Coronavirus phylogeny based on a geometric approach

Wen-Xin Zheng et al. Mol Phylogenet Evol. 2005 Aug.

Abstract

A novel coronavirus has been identified as the cause of the outbreak of severe acute respiratory syndrome (SARS). Previous phylogenetic analyses based on sequence alignments show that SARS-CoVs form a new group distantly related to the other three groups of previously characterized coronaviruses. In this paper, a geometric approach based on the Z-curve representation of the whole genome sequence is proposed to analyze the phylogenetic relationships of coronaviruses. The evolutionary distances are obtained by measuring the differences among the three-dimensional Z-curves. Each Z-curve is approximately described by its geometric center and three associated eigenvectors, which indicate the center position and the trend of the Z-curve, respectively. Although some information is lost in this approximate description, the phylogenetic tree constructed from these parameters is consistent with those of previous analyses. The present method has the merits of simplicity and intuitiveness, but it is still at an early stage of development. Because the phylogenetic relationships are inferred from the whole genome rather than from individual genes, the present method represents a new direction for phylogenetic studies in the post-genome era.
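The description step in the abstract can be illustrated with a short sketch. It uses the standard Z-curve coordinates (x separates purines from pyrimidines, y separates amino from keto bases, z separates weak from strong hydrogen bonding). Several points are assumptions, since the abstract does not give them: the eigenvectors are taken from the covariance matrix of the curve points, ambiguous bases are skipped, the plain z component is used instead of the z′ component shown in the figures, and all function names are hypothetical.

```python
import math

# Cumulative steps per base for the standard Z-curve:
#   x_n = (A_n + G_n) - (C_n + T_n), y_n = (A_n + C_n) - (G_n + T_n),
#   z_n = (A_n + T_n) - (G_n + C_n)
STEPS = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "T": (-1, -1, 1)}


def z_curve(seq):
    """Return the Z-curve points (x_n, y_n, z_n) for n = 1..len(seq)."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        if base not in STEPS:
            continue  # assumption: ambiguous bases (N, R, Y, ...) are skipped
        dx, dy, dz = STEPS[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def geometric_center(points):
    """Mean position of the curve points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def covariance(points, center):
    """3x3 covariance matrix of the points about the given center."""
    n = len(points)
    return [[sum((p[i] - center[i]) * (p[j] - center[j]) for p in points) / n
             for j in range(3)] for i in range(3)]


def jacobi_eigen(matrix, sweeps=50, tol=1e-24):
    """Eigenpairs of a symmetric 3x3 matrix via cyclic Jacobi rotations.

    Returns [(eigenvalue, unit eigenvector), ...] sorted by decreasing eigenvalue.
    """
    a = [[float(v) for v in row] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        total = sum(a[i][j] ** 2 for i in range(3) for j in range(3))
        off = sum(a[i][j] ** 2 for i in range(3) for j in range(3) if i != j)
        if off <= tol * total:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if a[p][q] == 0.0:
                    continue
                # Angle that zeroes a[p][q], folded into [-pi/4, pi/4].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(3):  # columns p and q of A, then of V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(3):  # rows p and q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    pairs = [(a[i][i], tuple(v[k][i] for k in range(3))) for i in range(3)]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs


def describe(seq):
    """Approximate a genome's Z-curve by its center and three principal axes."""
    points = z_curve(seq)
    center = geometric_center(points)
    return center, jacobi_eigen(covariance(points, center))


# z_curve("ACGT") -> [(1, 1, 1), (0, 2, 0), (1, 1, -1), (0, 0, 0)]
```

Calling `describe(genome)` on each complete genome would give one center and three axes per virus. How the paper turns differences between these descriptors into an evolutionary distance is not stated in the abstract, so that step is left out here.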

Figures

Fig. 1
The three-dimensional Z-curves (xyz′) for three complete coronavirus genomes. (A–C) The Z-curves of BJ01, TOR2, and BCoV, respectively. It can be clearly seen that the Z-curves of BJ01 and TOR2 are very similar, while the Z-curve of BCoV is significantly different from the former two. This forms the basis of the present method. (D) A sketch of the three eigenvectors for a certain genome (TOR2), which illustrates the relationship between the three eigenvectors and the Z-curve.
Fig. 2
The three groups of eigenvectors (denoted with different arrows). The vectors in the X, Y, and Z′ groups are denoted by dark, dotted, and grey arrows, respectively. (A and B) The eigenvectors of the 24 genomes observed from different directions. It can be seen from (A) that the three groups can be separated according to their distribution in three-dimensional space. (B) The vectors in the Y group of the 24 genomes are coplanar and lie almost in the xy plane.
Fig. 3
The phylogenetic tree constructed with the current method. The result shows that four groups exist in the genus Coronavirus. Group I (HCoV-229E, TGEV, and PEDV) and group II (BCoVM, BCoVL, BCoVQ, BCoV, MHVM, MHV2, MHVP, and MHV) first cluster together to form a larger group. This group then joins group III (IBV) to form a still larger group. Finally, the SARS-CoVs join them, giving the phylogenetic tree shown here. The resulting monophyletic clusters agree perfectly with the established taxonomic groups.
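
The stepwise joining order described for Fig. 3 is the kind of result a distance-based agglomerative method produces. The abstract does not say which tree-building algorithm the authors used, so the sketch below uses average-linkage clustering (UPGMA) purely as a stand-in. The function name, the labels `a` to `d`, and the distances are made-up placeholders, not values from the paper.

```python
from itertools import combinations


def upgma(labels, dist):
    """Average-linkage clustering; returns the tree as nested 2-tuples.

    `dist` maps a pair of labels, e.g. ("a", "b"), to their distance.
    """
    sizes = {label: 1 for label in labels}          # cluster -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Placeholder distances: a and b are closest, c is next, d is the most distant.
toy = {
    ("a", "b"): 2.0,
    ("a", "c"): 6.0, ("b", "c"): 6.0,
    ("a", "d"): 10.0, ("b", "d"): 10.0, ("c", "d"): 10.0,
}
tree = upgma(["a", "b", "c", "d"], toy)
```

With these placeholder distances, `a` and `b` join first, `c` joins that pair next, and `d` joins last, which mirrors the nesting pattern the caption describes for groups I and II, group III, and the SARS-CoVs.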

