[Statistical characteristics in primary structures of functional regions of Escherichia coli genome. II. Non-stationary Markov chains]
- PMID: 3531811
Abstract
We introduced non-stationary Markov chains for the statistical description of the structural domains of E. coli DNA. The values of all required parameters for these chains were determined by preliminary statistical processing of a large set of E. coli coding regions. Non-stationary models were shown to predict the frequencies of occurrence of various nucleotide combinations within coding fragments of DNA better than stationary ones. In particular, non-stationary models give a good approximation of both short- and long-distance arrangements of nucleotides in coding regions. The correlation parameters for neighbouring codons and for neighbouring amino acid residues in the primary structure of E. coli proteins were determined from the second-order non-stationary model. Statistical criteria showed that neighbouring residues in polypeptide chains cannot be considered independent. The new model of the DNA structural domain may be used in computer algorithms for the recognition and classification of DNA functional regions.
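The contrast between stationary and non-stationary chains can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "non-stationary" means the transition probabilities depend on the codon position of the nucleotide, which the abstract does not state, and it uses a first-order chain for brevity although the abstract refers to a second-order model. The names `fit_chain` and `log_likelihood`, the pseudocount, and the toy sequence are all hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def fit_chain(seqs, periodic, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | phase, prev).

    Assumption: when periodic is True, phase is the codon position
    (0, 1 or 2) of the *next* nucleotide, so each position gets its own
    transition table. When periodic is False, a single table is shared
    by all positions (the stationary model).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(1, len(seq)):
            phase = i % 3 if periodic else 0
            counts[(phase, seq[i - 1])][seq[i]] += 1.0

    probs = {}
    for phase in (range(3) if periodic else range(1)):
        for prev in ALPHABET:
            row = counts[(phase, prev)]
            total = sum(row[b] for b in ALPHABET) + pseudocount * len(ALPHABET)
            probs[(phase, prev)] = {
                b: (row[b] + pseudocount) / total for b in ALPHABET
            }
    return probs

def log_likelihood(seq, probs, periodic):
    """Log-probability of seq given its first nucleotide."""
    ll = 0.0
    for i in range(1, len(seq)):
        phase = i % 3 if periodic else 0
        ll += math.log(probs[(phase, seq[i - 1])][seq[i]])
    return ll

# Toy in-frame "coding" sequence (made up, not E. coli data).
toy = ["AAC" * 20]
stationary = fit_chain(toy, periodic=False)
nonstationary = fit_chain(toy, periodic=True)
ll_stationary = log_likelihood(toy[0], stationary, periodic=False)
ll_nonstationary = log_likelihood(toy[0], nonstationary, periodic=True)
print(ll_stationary, ll_nonstationary)
```

In this toy sequence the nucleotide that follows A depends on the codon position, so the position-dependent tables assign the sequence a higher log-likelihood than the single shared table. This mirrors, in a very simplified form, the kind of comparison the abstract reports for real coding regions.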
Similar articles
- [Statistical characteristics of primary structures of the functional regions of the Escherichia coli genome. III. Computer recognition of coding regions]. Mol Biol (Mosk). 1986 Sep-Oct;20(5):1390-8. PMID: 3534549. Russian.
- [Statistical characteristics in primary structures of functional regions of Escherichia coli genome. I. Frequency characteristics]. Mol Biol (Mosk). 1986 Jul-Aug;20(4):1014-23. PMID: 3531810. Russian.
- A DNA structural atlas for Escherichia coli. J Mol Biol. 2000 Jun 16;299(4):907-30. doi: 10.1006/jmbi.2000.3787. PMID: 10843847.
- Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model. Gene. 2006 May 10;372:171-81. doi: 10.1016/j.gene.2005.12.034. Epub 2006 Mar 24. PMID: 16564143.
- Significant dispersed recurrent DNA sequences in the Escherichia coli genome. Several new groups. J Mol Biol. 1993 Feb 20;229(4):833-48. doi: 10.1006/jmbi.1993.1090. PMID: 8445651.
Cited by
- Self-identification of protein-coding regions in microbial genomes. Proc Natl Acad Sci U S A. 1998 Aug 18;95(17):10026-31. doi: 10.1073/pnas.95.17.10026. PMID: 9707594. Free PMC article.
- Assessment of protein coding measures. Nucleic Acids Res. 1992 Dec 25;20(24):6441-50. doi: 10.1093/nar/20.24.6441. PMID: 1480466. Free PMC article. Review.