Lack of biological significance in the 'linguistic features' of noncoding DNA--a quantitative analysis
- PMID: 8649985
- PMCID: PMC145855
- DOI: 10.1093/nar/24.9.1676
Abstract
Recently, the application of two statistical methods (related to Zipf's distribution and Shannon's redundancy), called 'linguistic' tests, to the primary structure of DNA sequences of living organisms has excited considerable interest. Of particular importance is the claim that noncoding DNA sequences in eukaryotes display specific 'linguistic' features reminiscent of natural languages. Such a claim would imply that noncoding regions of DNA may carry some new, thus far unknown, biological information which is revealed by these tests. In this paper these claims are tested quantitatively. With the aid of computer simulations of natural DNA sequences, and by applying the same 'linguistic' tests to both natural and artificial sequences, we investigate in detail the reasons for the appearance of the claimed 'linguistic' features and the associated differences between coding and noncoding DNAs. The presented results show quantitatively that the 'linguistic' tests failed to reveal any new biological information in (noncoding or coding) DNA.
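As a rough illustration of the two kinds of statistic named in the abstract, the sketch below counts overlapping n-tuples ("words") in a sequence, fits a Zipf-type exponent to the rank-frequency curve, and computes a Shannon redundancy. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names, the least-squares fit over all ranks, the redundancy definition R = 1 - H_n / (n log2 4), and the independent-base random sequence generator are all illustrative choices that do not come from the source.

```python
import math
import random
from collections import Counter


def ntuple_counts(seq, n):
    """Count overlapping n-tuples ('words') of length n in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_exponent(counts):
    """Negated least-squares slope of log(frequency) against log(rank).

    Assumption: a plain fit over all ranks; needs at least two distinct words.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def shannon_redundancy(counts, n, alphabet_size=4):
    """Redundancy R = 1 - H_n / (n * log2(alphabet_size)).

    H_n is the Shannon entropy (in bits) of the observed n-tuple distribution.
    Assumption: this is one common definition, chosen for illustration.
    """
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / (n * math.log2(alphabet_size))


def random_dna(length, weights, seed=0):
    """Artificial sequence with independent bases drawn with the given A, C, G, T weights."""
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", weights=weights, k=length))


if __name__ == "__main__":
    n = 3
    uniform = random_dna(20000, [0.25, 0.25, 0.25, 0.25], seed=1)
    biased = random_dna(20000, [0.4, 0.1, 0.1, 0.4], seed=1)
    for name, seq in (("uniform", uniform), ("biased", biased)):
        counts = ntuple_counts(seq, n)
        print(name, zipf_exponent(counts), shannon_redundancy(counts, n))
```

Running both statistics on artificial sequences with different base compositions, as in the `__main__` section, shows how a skewed composition alone changes the rank-frequency curve and the redundancy. A comparison with a natural sequence of matching composition would be a control in the spirit of the simulations the abstract describes.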