"Word" preference in the genomic text and genome evolution: different modes of n-tuplet usage in coding and noncoding sequences
- PMID: 16059753
- DOI: 10.1007/s00239-004-0209-2
Abstract
Extensive work on n-tuplet occurrence in genomic sequences has revealed that n-tuplet usage correlates with sequence origin. In parallel, coding and noncoding sequences are subject to different restrictions on nucleotide composition, which may result in distinct modes of n-tuplet usage. The relatively simple approaches described herein focus on such differences. They are based on simple summation measures of n-tuplet frequencies, computed after filtering out the background nucleotide composition. One of the main aims of this work is to draw conclusions about qualitative differences in the composition of genomic sequences that depend on their functionality. Moreover, an evolutionary model is formulated that includes simple forms of ubiquitous events of genome dynamics: genomic fusions, genome shuffling due to transpositions, replication slippage, and point mutations. This model is shown to reproduce all the statistical features of genomic sequences discussed herein.
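The abstract does not give the exact form of the summation measures, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that the background is the sequence's own mononucleotide composition and that the measure is the summed absolute difference between observed and expected n-tuplet frequencies. The function names `ntuplet_counts` and `background_filtered_deviation` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

ALPHABET = "ACGT"


def ntuplet_counts(seq, n):
    """Count overlapping n-tuplets that contain only A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - n + 1):
        word = seq[i:i + n]
        # Skip windows with ambiguous symbols such as N.
        if set(word) <= set(ALPHABET):
            counts[word] += 1
    return counts


def background_filtered_deviation(seq, n):
    """Sum |observed - expected| over all 4**n n-tuplets.

    ASSUMPTION: 'expected' is the product of single-nucleotide
    frequencies measured on the same sequence. The abstract does not
    state which background model or summation the paper uses.
    """
    seq = seq.upper()
    counts = ntuplet_counts(seq, n)
    total_words = sum(counts.values())
    if total_words == 0:
        raise ValueError("sequence contains no valid n-tuplets")

    # Background nucleotide composition of the sequence itself.
    bases = [b for b in seq if b in ALPHABET]
    base_freq = {b: bases.count(b) / len(bases) for b in ALPHABET}

    deviation = 0.0
    for letters in product(ALPHABET, repeat=n):
        word = "".join(letters)
        observed = counts[word] / total_words
        expected = prod(base_freq[b] for b in word)
        deviation += abs(observed - expected)
    return deviation
```

Under these assumptions, a sequence whose n-tuplet usage is fully explained by its base composition scores near zero, while one with strong word preferences scores higher. Comparing the score between annotated coding and noncoding regions would be one way to probe the differences the abstract refers to.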
