"Word" preference in the genomic text and genome evolution: different modes of n-tuplet usage in coding and noncoding sequences
- PMID: 16059753
- DOI: 10.1007/s00239-004-0209-2
Abstract
Extensive work on n-tuplet occurrence in genomic sequences has revealed that n-tuplet usage correlates with sequence origin. In parallel, coding and noncoding sequences are subject to different restrictions on nucleotide composition, which may result in distinct modes of n-tuplet usage. The relatively simple approaches described here focus on such differences. They are based on simple summation measures of n-tuplet frequencies, computed after filtering out the background nucleotide composition. One of the main aims of this work is to draw conclusions about qualitative differences in the composition of genomic sequences depending on their functionality. In addition, an evolutionary model is formulated that includes simple forms of ubiquitous events in genome dynamics: genomic fusions, genome shuffling due to transpositions, replication slippage, and point mutations. This model is shown to reproduce all the statistical features of genomic sequences discussed here.
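The abstract does not give the exact form of its summation measures, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that "filtering the background nucleotide composition" can be approximated by dividing each observed n-tuplet frequency by the frequency expected from independent single-nucleotide frequencies, and that a summation measure can be approximated by summing the absolute deviations of those ratios from 1. The function names `ntuplet_ratios` and `summed_deviation` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

ALPHABET = "ACGT"


def ntuplet_ratios(seq, n=3):
    """Observed/expected frequency ratio for each n-tuplet over ACGT.

    Assumption (not from the source): the expected frequency of an
    n-tuplet is the product of the single-nucleotide frequencies
    measured in the same sequence.
    """
    seq = seq.upper()
    # Background composition: single-nucleotide frequencies.
    base_counts = Counter(b for b in seq if b in ALPHABET)
    total_bases = sum(base_counts.values())
    # Overlapping windows of length n; skip windows with ambiguous symbols.
    windows = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    windows = [w for w in windows if set(w) <= set(ALPHABET)]
    if total_bases == 0 or not windows:
        raise ValueError("sequence too short or contains no A/C/G/T")
    base_freq = {b: base_counts[b] / total_bases for b in ALPHABET}
    counts = Counter(windows)

    ratios = {}
    for tup in map("".join, product(ALPHABET, repeat=n)):
        expected = prod(base_freq[b] for b in tup)
        if expected == 0:
            # A base absent from the sequence: the ratio is undefined, skip.
            continue
        observed = counts[tup] / len(windows)
        ratios[tup] = observed / expected
    return ratios


def summed_deviation(ratios):
    """Illustrative summation measure: total absolute deviation from 1."""
    return sum(abs(r - 1.0) for r in ratios.values())
```

Under these assumptions, a sequence whose n-tuplet usage is fully explained by its nucleotide composition gives ratios near 1 and a small summed deviation, while strongly preferred or avoided "words" push the sum up. Comparing that sum between annotated coding and noncoding regions would be one simple way to probe the kind of difference the abstract describes.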
Similar articles
- Measuring the coding potential of genomic sequences through a combination of triplet occurrence patterns and RNY preference. J Mol Evol. 2004 Sep;59(3):309-16. doi: 10.1007/s00239-004-2626-7. PMID: 15553086
- The role of context-dependent mutations in generating compositional and codon usage bias in grass chloroplast DNA. J Mol Evol. 2003 May;56(5):616-29. doi: 10.1007/s00239-002-2430-1. PMID: 12698298
- How mitochondria redefine the code. J Mol Evol. 2001 Oct-Nov;53(4-5):299-313. doi: 10.1007/s002390010220. PMID: 11675590
- Phase-dependent nucleotide substitution in protein-coding sequences. Biochem Biophys Res Commun. 2007 Apr 13;355(3):599-602. doi: 10.1016/j.bbrc.2007.01.006. Epub 2007 Jan 10. PMID: 17300744. Review.
- Driving change: the evolution of alternative genetic codes. Trends Genet. 2004 Feb;20(2):95-102. doi: 10.1016/j.tig.2003.12.009. PMID: 14746991. Review.
Cited by
- The evolution of word composition in metazoan promoter sequence. PLoS Comput Biol. 2006 Nov 3;2(11):e150. doi: 10.1371/journal.pcbi.0020150. Epub 2006 Sep 28. PMID: 17083273. Free PMC article.
- Information theory applications for biological sequence analysis. Brief Bioinform. 2014 May;15(3):376-89. doi: 10.1093/bib/bbt068. Epub 2013 Sep 20. PMID: 24058049. Free PMC article. Review.
- Diminishing return for increased Mappability with longer sequencing reads: implications of the k-mer distributions in the human genome. BMC Bioinformatics. 2014 Jan 3;15:2. doi: 10.1186/1471-2105-15-2. PMID: 24386976. Free PMC article.