Comparative Study
J Mol Biol. 1991 Jun 5;219(3):555-65.
doi: 10.1016/0022-2836(91)90193-a.

Amino acid substitution matrices from an information theoretic perspective

S F Altschul. J Mol Biol.

Abstract

Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a "substitution score matrix" that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a "log-odds" matrix, with a specific target distribution for aligned pairs of amino acid residues. In the light of information theory, it is possible to express the scores of a substitution matrix in bits and to see that different matrices are better adapted to different purposes. The most widely used matrix for protein sequence comparison has been the PAM-250 matrix. It is argued that for database searches the PAM-120 matrix generally is more appropriate, while for comparing two specific proteins with suspected homology the PAM-200 matrix is indicated. Examples discussed include the lipocalins, human alpha 1 B-glycoprotein, the cystic fibrosis transmembrane conductance regulator and the globins.
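The log-odds reading of a substitution matrix can be illustrated with a minimal sketch. It assumes the standard log-odds form, in which the score for a pair of residues is the base-2 logarithm of the pair's target frequency divided by the product of the two background frequencies (up to a scaling factor). The two-letter alphabet, the frequencies, and the function names below are hypothetical and chosen only for illustration; they are not taken from the paper or from any PAM matrix.

```python
import math

def log_odds_bits(target, background):
    """Return s_ij = log2(q_ij / (p_i * p_j)) for every residue pair.

    target:     dict mapping (i, j) -> q_ij, the target pair frequency
    background: dict mapping i -> p_i, the background residue frequency
    """
    return {
        (a, b): math.log2(q / (background[a] * background[b]))
        for (a, b), q in target.items()
    }

def relative_entropy_bits(target, scores):
    """Average score per aligned pair under the target distribution, in bits."""
    return sum(q * scores[pair] for pair, q in target.items())

# Hypothetical toy data: a two-letter alphabet with uniform background
# frequencies and a target distribution that favours identical pairs.
background = {"A": 0.5, "B": 0.5}
target = {
    ("A", "A"): 0.4, ("A", "B"): 0.1,
    ("B", "A"): 0.1, ("B", "B"): 0.4,
}

scores = log_odds_bits(target, background)
H = relative_entropy_bits(target, scores)

for pair, s in sorted(scores.items()):
    print(pair, round(s, 3))
print("relative entropy (bits per pair):", round(H, 3))
```

In this toy case identical pairs receive positive scores and mismatched pairs negative ones, and the relative entropy measures how much information, on average, each aligned position contributes toward distinguishing the target distribution from chance.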
