BMC Bioinformatics. 2006 Jul 3;7:329.
doi: 10.1186/1471-2105-7-329.

An interactive visualization tool to explore the biophysical properties of amino acids and their contribution to substitution matrices

Blazej Bulka et al. BMC Bioinformatics.

Abstract

Background: Quantitative descriptions of amino acid similarity, expressed as probabilistic models of evolutionary interchangeability, are central to many mainstream bioinformatic procedures such as sequence alignment, homology searching, and protein structural prediction. Here we present a web-based, user-friendly analysis tool that allows any researcher to quickly and easily visualize relationships between these bioinformatic metrics and to explore their relationships to underlying indices of amino acid molecular descriptors.

Results: We demonstrate the three fundamental types of question that our software can address by taking as a specific example the connections between 49 measures of amino acid biophysical properties (e.g., size, charge and hydrophobicity), a generalized model of amino acid substitution (as represented by the PAM74-100 matrix), and the mutational distance that separates amino acids within the standard genetic code (i.e., the number of point mutations required for interconversion during protein evolution). We show that our software allows a user to recapture the insights from several key publications on these topics in just a few minutes.
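The mutational distance mentioned above can be illustrated with a minimal sketch. It assumes the distance between two amino acids is the smallest number of single-nucleotide changes separating any of their codons in the standard genetic code; the matrix used in the article may be defined differently (for example, as an average over codon pairs). The function names are hypothetical and are not taken from the software.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def min_point_mutations(aa1, aa2):
    """Smallest number of nucleotide differences between any codon of aa1
    and any codon of aa2 (an assumed definition of mutational distance)."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa1)
        for c2 in codons_for(aa2)
    )


print(min_point_mutations("M", "W"))  # ATG vs TGG differ at two positions: 2
```

Under this definition, amino acids that share a codon family (such as aspartate and glutamate) are one mutation apart, while pairs such as lysine and phenylalanine require three.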

Conclusion: Our software facilitates rapid, interactive exploration of three interconnected topics: (i) the multidimensional molecular descriptors of the twenty proteinaceous amino acids, (ii) the correlation of these biophysical measurements with observed patterns of amino acid substitution, and (iii) the causal basis for differences between any two observed patterns of amino acid substitution. This software acts as an intuitive bioinformatic exploration tool that can guide more comprehensive statistical analyses relating to a diverse array of specific research questions.

Figures

Figure 1
Overview of Amino Acid Explorer Architecture.
Figure 2
A minimum spanning tree of size, charge and hydrophobicity for the 20 amino acids of the standard genetic code. Specifically, this tree is built from the 67 amino acid indices that contain the words "hydrop" and/or "polar," "size," "volume," "charge," and "electr" as part of their description. This includes most of the indices that relate to the general concepts of amino acid size, charge, and hydrophobicity. Boxes A and B represent "natural" clusters formed by the minimum spanning tree of charge and size, respectively.
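The construction in the Figure 2 caption can be sketched roughly as follows: select indices whose description contains one of the listed keywords, then link them with a minimum spanning tree. This is only an illustration under stated assumptions. The distance between two indices is taken here to be 1 - |Pearson r|, which the caption does not specify; Prim's algorithm is used as one standard way to build the tree; and the index names, descriptions, and values are invented toy data, not real amino acid measurements.

```python
from math import sqrt

KEYWORDS = ("hydrop", "polar", "size", "volume", "charge", "electr")

# Toy data (made up): description and per-residue values for each index.
DESCRIPTIONS = {
    "idx_hyd_a": "hydropathy scale A",
    "idx_hyd_b": "hydrophobicity scale B",
    "idx_vol_a": "side-chain volume A",
    "idx_vol_b": "residue volume B",
    "idx_flex": "backbone flexibility",
}
VALUES = {
    "idx_hyd_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "idx_hyd_b": [1.1, 2.1, 2.9, 4.2, 5.0],
    "idx_vol_a": [5.0, 1.0, 4.0, 2.0, 3.0],
    "idx_vol_b": [4.8, 1.2, 4.1, 2.2, 2.9],
    "idx_flex": [0.3, 0.9, 0.1, 0.7, 0.5],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


def index_distance(a, b):
    """Assumed dissimilarity: strongly (anti)correlated indices are close."""
    return 1.0 - abs(pearson(VALUES[a], VALUES[b]))


def minimum_spanning_tree(names, dist):
    """Prim's algorithm; returns a list of (node_in_tree, new_node, weight)."""
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        weight, a, b = min(
            (dist(a, b), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        edges.append((a, b, weight))
        in_tree.add(b)
    return edges


# Keep only indices whose description mentions one of the keywords.
selected = sorted(
    name for name, text in DESCRIPTIONS.items()
    if any(keyword in text.lower() for keyword in KEYWORDS)
)
tree = minimum_spanning_tree(selected, index_distance)
for a, b, weight in tree:
    print(f"{a} -- {b}  (distance {weight:.3f})")
```

In this toy example the two hydrophobicity-like indices and the two volume-like indices end up directly connected, which is the kind of "natural" cluster the caption describes for boxes A and B.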
Figure 3
The minimum spanning tree recolored to reflect distance to a PAM matrix. Specifically, the minimum spanning tree of size, charge, and hydrophobicity (Figure 2) is recolored to indicate the similarity of each amino acid index to the PAM74-100 amino acid substitution matrix [5].
Figure 4
The minimum spanning tree recolored to show each index's similarity to one of two substitution matrices. Specifically, the spanning tree of size, charge, and hydrophobicity (Figure 2) is recolored to indicate whether each amino acid index is more highly correlated with the PAM74-100 amino acid substitution matrix (green) or a matrix of amino acids' proximity within the standard genetic code [8] (brown).
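One way to picture the comparison behind Figures 3 and 4 is sketched below. It assumes that an amino acid index is turned into pairwise differences |x_i - x_j| and that its similarity to a matrix is the absolute Pearson correlation between those differences and the matrix entries for the same residue pairs. The captions do not state the measure actually used, so this is an assumption, and all names and numbers are invented toy data rather than real PAM74-100 or genetic-code values.

```python
from itertools import combinations
from math import sqrt

RESIDUES = ["A", "D", "K", "L"]  # small toy subset of the 20 amino acids

# Toy index (made-up values), one number per residue.
toy_index = {"A": 1.0, "D": 0.0, "K": 0.2, "L": 2.0}

# Toy symmetric matrices (made-up values), keyed by unordered residue pair.
toy_matrices = {
    "substitution_toy": {
        frozenset("AD"): -1, frozenset("AK"): -1, frozenset("AL"): -1,
        frozenset("DK"): 2, frozenset("DL"): -4, frozenset("KL"): -3,
    },
    "genetic_code_toy": {
        frozenset("AD"): 1, frozenset("AK"): 2, frozenset("AL"): 2,
        frozenset("DK"): 2, frozenset("DL"): 2, frozenset("KL"): 2,
    },
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


def correlation_strengths(index, matrices, residues):
    """Absolute correlation between index differences and each matrix."""
    pairs = list(combinations(residues, 2))
    differences = [abs(index[a] - index[b]) for a, b in pairs]
    return {
        name: abs(pearson(differences, [m[frozenset((a, b))] for a, b in pairs]))
        for name, m in matrices.items()
    }


strengths = correlation_strengths(toy_index, toy_matrices, RESIDUES)
closest = max(strengths, key=strengths.get)
print(closest, strengths)
```

The absolute value is used because a substitution matrix scores similarity while the index differences measure dissimilarity, so a strong relationship appears as a negative correlation. In a recoloring like Figure 4, `closest` would decide which of the two colors an index receives.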

References

    1. Henikoff S, Henikoff JG. Performance evaluation of amino acid substitution matrices. Proteins. 1993;17:49–61.
    2. Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. Multiple sequence alignment with Clustal X. Trends Biochem Sci. 1998;23:403–405.
    3. Tress M, Ezkurdia I, Grana O, Lopez G, Valencia A. Assessment of predictions submitted for the CASP6 comparative modelling category. Proteins. 2005.
    4. Vilim RB, Cunningham RM, Lu B, Kheradpour P, Stevens FJ. Fold-specific substitution matrices for protein classification. Bioinformatics. 2004;20:847–853.
    5. Teodorescu O, Galor T, Pillardy J, Elber R. Enriching the sequence substitution matrix by structural information. Proteins. 2004;54:41–48.
