Amino acid substitution matrices from an information theoretic perspective
- PMID: 2051488
- PMCID: PMC7130686
- DOI: 10.1016/0022-2836(91)90193-a
Abstract
Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a "substitution score matrix" that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a "log-odds" matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human alpha 1 B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins.
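The log-odds reading of a substitution matrix can be illustrated with a minimal sketch. It assumes the standard form s(a,b) = log2(q(a,b) / (p(a) p(b))), where q is the target distribution for aligned pairs and p is the background residue frequency, so that scores come out in bits. The function names and the two-letter toy alphabet are hypothetical and chosen only for illustration; the numbers are not real amino acid frequencies.

```python
import math


def log_odds_bits(q, p):
    """Scores s[a][b] = log2(q[a][b] / (p[a] * p[b])), expressed in bits."""
    return {
        a: {b: math.log2(q[a][b] / (p[a] * p[b])) for b in q[a]}
        for a in q
    }


def expected_score(weights, scores):
    """Average score per aligned pair under the given pair distribution."""
    return sum(weights[a][b] * scores[a][b] for a in weights for b in weights[a])


# Toy two-letter alphabet (illustrative values only, not amino acid data).
p = {"A": 0.5, "B": 0.5}                      # background frequencies
q = {"A": {"A": 0.4, "B": 0.1},               # target pair frequencies;
     "B": {"A": 0.1, "B": 0.4}}               # marginals match p

scores = log_odds_bits(q, p)

# Relative entropy: information per aligned position under the target distribution.
H = expected_score(q, scores)

# Expected score for unrelated sequences, where pairs occur with frequency p[a] * p[b].
background = {a: {b: p[a] * p[b] for b in p} for a in p}
chance = expected_score(background, scores)

print(scores)
print(H, chance)
```

In this sketch the average score is positive under the target distribution and negative under the background, which is the property a local alignment scoring system needs. A matrix whose target distribution is closer to the background carries fewer bits per position, which is one way to see why different matrices suit different degrees of divergence.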
