A stochastic analysis of three viral sequences
- PMID: 1321321
- DOI: 10.1093/oxfordjournals.molbev.a040741
Abstract
This paper analyzes the nucleotide sequences of three viruses: Kunjin, West Nile, and yellow fever. Each virus has one long open reading frame of more than 10,200 nucleotides that codes for four structural and seven nonstructural genes. The Kunjin and West Nile viruses are the most closely related pair when assessed on the basis of matches between their nucleotide sequences. As would be expected, the matching is least for bases at third-position codon sites and is greatest for second-position sites. Statistics are presented for the numbers of mismatches that are transitions or transversions. Nucleotide base usage is also reported. To each of the 33 virus-gene segments, nonhomogeneous Markov chain models have been fitted to describe the sequences of nucleotide bases. The models allow for different transition probabilities ("transition" is used in the mathematical sense here) and for different degrees of dependency at the three sites in the codons. Reasonably satisfactory fits can be obtained for many of the genes by using models that are first order for both first- and second-position sites in the codon but that are second order for third-position sites. One consequence of such a model is that the correlation between one amino acid and the next is limited to the correlation of the last base of the former with the first base of the latter. Other consequences are that the model can (and does) prohibit the occurrence of stop codons within a gene and that subsequences of only first-position bases, or only third-position bases, are also first-order Markov chains. In theory, second-position subsequences may not be Markov chains at all. In practice, the data suggest that each of these subsequences is effectively a zero-order Markov chain, i.e., bases spaced three apart are statistically independent. Stationarity of nucleotide base distributions can be interpreted in either of two ways: (1) spatially along the sites or (2) temporally at each site. These interpretations must often be inconsistent, when the former allows for Markov dependence between adjacent sites whereas the latter assumes independence between sites. The inconsistency can be overcome, for these viruses, if subsequences at different codon positions are analyzed separately.
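The two computations the abstract describes, mismatch counts by codon position and a codon-position-dependent Markov chain, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names `mismatch_stats` and `fit_codon_position_chain` are hypothetical, sequences are assumed to be in-frame, pre-aligned, and written in the A/C/G/T alphabet, and the transition probabilities are plain count-based (maximum-likelihood) estimates. The default `orders=(1, 1, 2)` mirrors the model the abstract reports as fitting many genes: first order at first- and second-position sites, second order at third-position sites.

```python
from collections import Counter, defaultdict

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def mismatch_stats(seq_a, seq_b):
    """Count matches, transitions and transversions at each codon position.

    Assumes two aligned, in-frame sequences of equal length (hypothetical
    input format). Returns a list of three Counters, one per codon position.
    Here "transition" has its mutational meaning (purine<->purine or
    pyrimidine<->pyrimidine), not the Markov-chain meaning.
    """
    stats = [Counter() for _ in range(3)]
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        pos = i % 3
        if a == b:
            stats[pos]["match"] += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            stats[pos]["transition"] += 1
        else:
            stats[pos]["transversion"] += 1
    return stats


def fit_codon_position_chain(seq, orders=(1, 1, 2)):
    """Estimate transition probabilities separately for each codon position.

    orders[p] is the number of immediately preceding bases that a base at
    codon position p is conditioned on. Returns a list of three dicts that
    map context -> {base: estimated probability}.
    """
    counts = [defaultdict(Counter) for _ in range(3)]
    for i, base in enumerate(seq):
        pos = i % 3
        k = orders[pos]
        if i < k:
            continue  # not enough preceding bases to form a context
        counts[pos][seq[i - k:i]][base] += 1

    probs = []
    for pos in range(3):
        table = {}
        for context, ctr in counts[pos].items():
            total = sum(ctr.values())
            table[context] = {b: n / total for b, n in ctr.items()}
        probs.append(table)
    return probs


# Toy sequences for illustration only; they are not data from the paper.
stats = mismatch_stats("ATGGCC", "ATGGCT")
model = fit_codon_position_chain("ATGTATTAC")
```

Because the third-position table is conditioned on the first two bases of the same codon, a context such as `"TA"` simply never receives probability for `"A"` or `"G"` when the training sequence has no internal stop codons, which is how a model of this shape can exclude stop codons within a gene.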