[Statistical characteristics in primary structures of functional regions of Escherichia coli genome. II. Non-stationary Markov chains]
- PMID: 3531811
Abstract
We introduced non-stationary Markov chains for the statistical description of structural domains of E. coli DNA. The values of all required parameters of these chains were determined by preliminary statistical processing of a large set of E. coli coding regions. Non-stationary models were shown to predict the frequencies of occurrence of various nucleotide combinations within coding DNA fragments better than stationary ones. In particular, non-stationary models give a good approximation of both short- and long-distance arrangements of nucleotides in coding regions. The correlation parameters for neighbouring codons and for neighbouring amino acid residues in the primary structure of E. coli proteins were determined from the second-order non-stationary model. Statistical tests showed that neighbouring residues in polypeptide chains cannot be considered independent. The new model of the DNA structural domain may be used in computer algorithms for the recognition and classification of DNA functional regions.
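
The abstract does not give the parameterization of the chains, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that "non-stationary" means the transition probabilities depend on the position within the codon (phase 0, 1 or 2), and it uses a first-order chain for brevity, whereas the abstract mentions a second-order model. The names `fit_phased_chain` and `log_likelihood`, the pseudocount smoothing, and the toy sequence are hypothetical and are not taken from the paper.

```python
import math

ALPHABET = "ACGT"


def fit_phased_chain(seqs, period=3, pseudocount=1.0):
    """Estimate first-order transition probabilities separately for each phase.

    The phase of a transition is (index of the second nucleotide) % period.
    period=3 gives a codon-position-dependent (non-stationary) chain;
    period=1 reduces to an ordinary stationary chain.
    Assumption: sequences are uppercase and contain only A, C, G, T.
    """
    # counts[phase][prev][nxt], initialised with a pseudocount (an assumption,
    # used here only to avoid zero probabilities on small samples).
    counts = [
        {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for _ in range(period)
    ]
    for s in seqs:
        for i in range(1, len(s)):
            counts[i % period][s[i - 1]][s[i]] += 1

    # Normalise each row of counts into conditional probabilities P(nxt | prev, phase).
    probs = []
    for table in counts:
        probs.append(
            {a: {b: row[b] / sum(row.values()) for b in ALPHABET}
             for a, row in table.items()}
        )
    return probs


def log_likelihood(seq, probs):
    """Log-probability of seq's transitions under a fitted phased chain."""
    period = len(probs)
    return sum(
        math.log(probs[i % period][seq[i - 1]][seq[i]])
        for i in range(1, len(seq))
    )


# Toy, made-up input; real use would take a large set of aligned coding regions.
toy = ["ATGATGATG"]
phased = fit_phased_chain(toy, period=3)
stationary = fit_phased_chain(toy, period=1)
print(log_likelihood(toy[0], phased), log_likelihood(toy[0], stationary))
```

Comparing the two log-likelihoods on held-out coding sequences is one simple way to check whether the phase-dependent model describes the data better than the stationary one, which is the kind of comparison the abstract reports.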