Review. Brief Bioinform. 2014 May;15(3):376-89.
doi: 10.1093/bib/bbt068. Epub 2013 Sep 20.

Information theory applications for biological sequence analysis

Susana Vinga. Brief Bioinform. 2014 May.

Abstract

Information theory (IT) addresses the analysis of communication systems and has been widely applied in molecular biology. In particular, alignment-free sequence analysis and comparison have greatly benefited from concepts derived from IT, such as entropy and mutual information. This review covers several aspects of IT applications, ranging from global genome analysis and comparison, including block-entropy estimation and resolution-free metrics based on iterative maps, to local analysis, comprising the classification of motifs, prediction of transcription factor binding sites and sequence characterization based on linguistic complexity and entropic profiles. IT has also been applied to high-level correlations that combine DNA, RNA or protein features with sequence-independent properties, such as gene mapping and phenotype analysis, and has provided models based on communication systems theory to describe information transmission channels at the cell level and during evolutionary processes. While not exhaustive, this review attempts to categorize existing methods and to indicate their relation to broader transversal topics such as genomic signatures, data compression and complexity, time series analysis and phylogenetic classification, providing a resource for future developments in this promising area.

Keywords: Rényi entropy; alignment-free; chaos game representation; genomic signature; information theory; sequence analysis.
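
As a rough illustration of the block-entropy idea mentioned in the abstract, the sketch below computes the Shannon entropy of overlapping length-k blocks (k-mers) of a sequence using simple frequency counts. The function name `block_entropy` and the choice of overlapping blocks are assumptions made for this example, not the estimators discussed in the review; a plug-in estimate of this kind is known to be biased for short sequences or large k.

```python
from collections import Counter
from math import log2


def block_entropy(seq, k):
    """Plug-in Shannon entropy, in bits, of the overlapping length-k blocks of seq.

    Hypothetical helper for illustration only: block probabilities are
    estimated by relative frequencies, with no bias correction.
    """
    # Collect every overlapping block of length k.
    blocks = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not blocks:
        raise ValueError("sequence is shorter than the block length k")
    counts = Counter(blocks)
    total = len(blocks)
    # H = -sum_b p(b) * log2 p(b), with p(b) the relative frequency of block b.
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(block_entropy("ACGT", 1))   # four equally frequent symbols
    print(block_entropy("AAAA", 2))   # a single repeated block
```

For a DNA sequence the value at k = 1 is bounded above by 2 bits, reached when the four nucleotides are equally frequent, while a sequence made of a single repeated block has entropy 0.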
