Linguistics of nucleotide sequences: morphology and comparison of vocabularies
- PMID: 3078230
- DOI: 10.1080/07391102.1986.10507643
Abstract
The concept of "words" in continuous languages devoid of blanks is introduced, and an operational definition of words is given. With this novel concept, nucleotide sequences become objects of linguistic analysis. The typical word size of the nucleotide language is found to be 3 to 5 nucleotides (tri- to pentamers). Different genomes have distinct vocabularies. Comparison of these vocabularies can serve as a basis for revealing functional and evolutionary relatedness of sequences.
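As a rough illustration of comparing vocabularies, the sketch below counts overlapping k-mers for a chosen word size and measures the overlap between two sequences. The abstract does not state the operational definition of a word, so the "vocabulary" here (the most frequent k-mers) is only an assumed stand-in, not the authors' method. The function names `kmer_counts`, `vocabulary`, and `jaccard`, the `top_n` cutoff, and the use of Jaccard similarity are all hypothetical choices made for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    return counts


def vocabulary(seq, k, top_n=20):
    """Assumed stand-in for a vocabulary: the top_n most frequent k-mers.

    This is NOT the operational word definition from the paper,
    which the abstract does not spell out.
    """
    return {kmer for kmer, _ in kmer_counts(seq, k).most_common(top_n)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    seq_a = "ATGCGATATATGCGCGATATAT"
    seq_b = "GGCCGGCCTTAAGGCCGGCCAA"
    for k in (3, 4, 5):  # the tri- to pentamer range named in the abstract
        similarity = jaccard(vocabulary(seq_a, k), vocabulary(seq_b, k))
        print(k, round(similarity, 3))
```

A frequency cutoff is the simplest possible choice; a comparison against the counts expected from base composition alone would be a natural refinement, but that too would be an assumption beyond what the abstract states.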
Similar articles
- Linguistic measure of taxonomic and functional relatedness of nucleotide sequences. J Biomol Struct Dyn. 1990 Jun;7(6):1251-68. doi: 10.1080/07391102.1990.10508563. PMID: 2363847
- [Constraints on base sequences in a polynucleotide: I. Significance of the degeneration of the code]. C R Acad Sci III. 1985;301(6):277-82. PMID: 3928104. French.
- [Comparative hierarchic structure of the genetic language]. Genetika. 1993 May;29(5):720-39. PMID: 8335232. Review. Russian.
- Linguistics of nucleotide sequences. II: Stationary words in genetic texts and the zonal structure of DNA. J Biomol Struct Dyn. 1989 Apr;6(5):1027-38. doi: 10.1080/07391102.1989.10506529. PMID: 2531597
- DNA sequence analysis linguistic tools: contrast vocabularies, compositional spectra and linguistic complexity. Appl Bioinformatics. 2003;2(2):103-12. PMID: 15130826. Review.
Cited by
- Google matrix analysis of DNA sequences. PLoS One. 2013 May 9;8(5):e61519. doi: 10.1371/journal.pone.0061519. Print 2013. PMID: 23671568. Free PMC article.
- Evaluating the number of different genomes in a metagenome by means of the compositional spectra approach. PLoS One. 2020 Nov 6;15(11):e0237205. doi: 10.1371/journal.pone.0237205. eCollection 2020. PMID: 33156862. Free PMC article.
- Distinguishing microbial genome fragments based on their composition: evolutionary and comparative genomic perspectives. Genome Biol Evol. 2010 Jan 25;2:117-31. doi: 10.1093/gbe/evq004. PMID: 20333228. Free PMC article.
- A Puzzling Anomaly in the 4-Mer Composition of the Giant Pandoravirus Genomes Reveals a Stringent New Evolutionary Selection Process. J Virol. 2019 Nov 13;93(23):e01206-19. doi: 10.1128/JVI.01206-19. Print 2019 Dec 1. PMID: 31534042. Free PMC article.
- Conservation of k-mer Composition and Correlation Contribution between Introns and Intergenic Regions of Animalia Genomes. Genes (Basel). 2018 Oct 4;9(10):482. doi: 10.3390/genes9100482. PMID: 30287792. Free PMC article.