Information theory applications for biological sequence analysis
- PMID: 24058049
- PMCID: PMC7109941
- DOI: 10.1093/bib/bbt068
Abstract
Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison have benefited greatly from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from global genome analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has provided models based on communication systems theory to describe information transmission channels at the cell level and during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation to broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.
Keywords: Rényi entropy; alignment-free; chaos game representation; genomic signature; information theory; sequence analysis.
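As a rough illustration of two quantities named in the abstract, the sketch below estimates the block entropy of a sequence (Shannon entropy of its overlapping k-mer frequencies) and the mutual information between symbols a fixed distance apart. It is a minimal plug-in estimator written for this summary, not the method of the review; the function names `block_entropy` and `mutual_information` and the parameters `k` and `d` are assumptions chosen for the example, and no correction for finite-sample bias is applied.

```python
from collections import Counter
from math import log2


def block_entropy(seq: str, k: int) -> float:
    """Shannon entropy (bits) of the empirical overlapping k-mer distribution."""
    if k < 1 or len(seq) < k:
        raise ValueError("need 1 <= k <= len(seq)")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(seq: str, d: int) -> float:
    """Mutual information (bits) between symbols separated by d positions."""
    if d < 1 or len(seq) <= d:
        raise ValueError("need 1 <= d < len(seq)")
    # Joint counts of (symbol at i, symbol at i + d).
    pairs = Counter(zip(seq, seq[d:]))
    total = sum(pairs.values())
    # Marginal counts derived from the same set of pairs.
    left, right = Counter(), Counter()
    for (a, b), c in pairs.items():
        left[a] += c
        right[b] += c
    # sum over pairs of p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
    return sum(
        (c / total) * log2(c * total / (left[a] * right[b]))
        for (a, b), c in pairs.items()
    )


if __name__ == "__main__":
    toy = "ACGT" * 8  # hypothetical toy sequence, strongly periodic
    print(block_entropy(toy, 1))       # four equally frequent symbols
    print(mutual_information(toy, 1))  # neighbours are highly dependent
```

On short sequences these plug-in estimates are biased, especially for larger k, so the values are best read as relative indicators rather than exact entropies.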
Similar articles
- Pattern recognition and probabilistic measures in alignment-free sequence analysis. Brief Bioinform. 2014 May;15(3):354-68. doi: 10.1093/bib/bbt070. Epub 2013 Oct 3. PMID: 24096012. Review.
- New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. Brief Bioinform. 2014 May;15(3):343-53. doi: 10.1093/bib/bbt067. Epub 2013 Sep 23. PMID: 24064230. Free PMC article. Review.
- Sequence analysis by iterated maps, a review. Brief Bioinform. 2014 May;15(3):369-75. doi: 10.1093/bib/bbt072. Epub 2013 Oct 25. PMID: 24162172. Free PMC article. Review.
- Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies. PLoS One. 2011 Jan 4;6(1):e14373. doi: 10.1371/journal.pone.0014373. PMID: 21245917. Free PMC article.
- Local Renyi entropic profiles of DNA sequences. BMC Bioinformatics. 2007 Oct 16;8:393. doi: 10.1186/1471-2105-8-393. PMID: 17939871. Free PMC article.
Cited by
- Unification and extensive diversification of M/Orf3-related ion channel proteins in coronaviruses and other nidoviruses. Virus Evol. 2021 Feb 16;7(1):veab014. doi: 10.1093/ve/veab014. eCollection 2021 Jan. PMID: 33692906. Free PMC article.
- Benchmarking of alignment-free sequence comparison methods. Genome Biol. 2019 Jul 25;20(1):144. doi: 10.1186/s13059-019-1755-7. PMID: 31345254. Free PMC article.
- Subjective Information and Survival in a Simulated Biological System. Entropy (Basel). 2022 May 2;24(5):639. doi: 10.3390/e24050639. PMID: 35626524. Free PMC article.
- Disentangling single-cell omics representation with a power spectral density-based feature extraction. Nucleic Acids Res. 2022 Jun 10;50(10):5482-5492. doi: 10.1093/nar/gkac436. PMID: 35639509. Free PMC article.
- Information Theory, Living Systems, and Communication Engineering. Entropy (Basel). 2024 May 18;26(5):430. doi: 10.3390/e26050430. PMID: 38785679. Free PMC article. Review.