Bioinformatics. 2008 Aug 15;24(16):i207-12.
doi: 10.1093/bioinformatics/btn268.

Comprehensive in silico mutagenesis highlights functionally important residues in proteins

Yana Bromberg et al. Bioinformatics. 2008.

Abstract

Motivation: Mutating residues into alanine (alanine scanning) is one of the fastest experimental means of probing hypotheses about protein function. Alanine scans can reveal functional hot spots, i.e. residues that alter function upon mutation. In vitro mutagenesis is cumbersome and costly: probing all residues in a protein is typically as impossible as substituting by all non-native amino acids. In contrast, such exhaustive mutagenesis is feasible in silico.

Results: Previously, we developed SNAP to predict functional changes due to non-synonymous single nucleotide polymorphisms. Here, we applied SNAP to all experimental mutations in the ASEdb database of alanine scans; we identified 70% of the hot spots (≥1 kcal/mol change in binding energy); more severe changes were predicted more accurately. Encouraged, we carried out a complete all-against-all in silico mutagenesis for human glucokinase. Many of the residues predicted as functionally important have indeed been confirmed in the literature, others await experimental verification, and our method is ready to aid in the design of in vitro mutagenesis.

Availability: ASEdb and glucokinase scores are available at http://www.rostlab.org/services/SNAP. For submissions of large/whole proteins for processing please contact the author.
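
The exhaustive mutagenesis described in the abstract amounts to enumerating every non-native substitution at every sequence position. The following is a minimal sketch of that enumeration only; the function names and the toy sequence are illustrative assumptions, and no SNAP scoring is performed here.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_point_mutants(sequence):
    """Yield (position, native, mutant) for every non-native substitution.

    Positions are 1-based. Each residue contributes 19 mutants.
    """
    for position, native in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != native:
                yield position, native, mutant


def alanine_scan(sequence):
    """Return the alanine-scan subset: every non-alanine residue mutated to A."""
    return [
        (position, native, "A")
        for position, native in enumerate(sequence, start=1)
        if native != "A"
    ]


if __name__ == "__main__":
    toy_sequence = "MKA"  # hypothetical 3-residue sequence, not glucokinase
    mutants = list(all_point_mutants(toy_sequence))
    print(len(mutants), "point mutants;", len(alanine_scan(toy_sequence)), "alanine mutants")
```

A protein of length N yields 19 × N candidate mutants, which illustrates why the full scan is practical in silico but rarely attempted in vitro.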


Figures

Fig. 1.
Variation of SNAP cutoff influences performance. By varying the threshold in the SNAP output (−100 to +100) for considering a mutation as affecting function, we can dial through the ROC curve for interaction hot spots. At one end, choosing a very low threshold, we find all hot spots at very low accuracy (−50 on the lower right); conversely, at high positive thresholds we find few hot spots, but those we find at high accuracy (50 at top left). Performance is slightly worse for the reduced data set where all mutants overlapping with PMD are removed; it is unclear which data set is better for estimating the method's performance (Results). For the full ASEdb data set at thresholds >30, we find ∼25% of the observed hot spots, and ∼45% of the sites predicted at that threshold are hot spots. To compile accuracy we assumed that proteins have only one binding site and that was the one probed in ASEdb; the degree to which this statement is wrong describes the degree to which our method underestimated accuracy.
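
The trade-off in this caption can be sketched as follows, assuming "accuracy" means the fraction of predicted sites that are observed hot spots and that the fraction of observed hot spots found is the complementary measure (called coverage here). The function name and the toy scores are invented for illustration; they are not ASEdb data.

```python
def accuracy_and_coverage(scores, is_hot_spot, threshold):
    """Compute (accuracy, coverage) for predictions with score > threshold.

    accuracy: fraction of predicted sites that are observed hot spots.
    coverage: fraction of observed hot spots that are predicted.
    Returns None for a measure whose denominator is zero.
    """
    predicted = [score > threshold for score in scores]
    true_positives = sum(p and h for p, h in zip(predicted, is_hot_spot))
    n_predicted = sum(predicted)
    n_hot_spots = sum(is_hot_spot)
    accuracy = true_positives / n_predicted if n_predicted else None
    coverage = true_positives / n_hot_spots if n_hot_spots else None
    return accuracy, coverage


if __name__ == "__main__":
    # Hypothetical scores on the -100..+100 scale with made-up labels.
    toy_scores = [60, 40, 35, 10, -20, -70]
    toy_labels = [True, False, True, True, False, False]
    for threshold in (-50, 30, 50):
        print(threshold, accuracy_and_coverage(toy_scores, toy_labels, threshold))
```

On this toy data, lowering the threshold recovers every hot spot at lower accuracy, while raising it keeps only the most confident predictions, mirroring the two ends of the curve.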
Fig. 2.
Comprehensive mutagenesis for human glucokinase (HXK4). The crystal structure of HXK4 was taken from Kamata et al. (PDB: 1v4s; 2004); visualization by GRASP2 (Nichols et al., 1991). The two ligands in the picture are glucose (yellow spheres) and a synthetic activator (green spheres). The scale of predictions ranges from blue (neutral; SNAP score <−50) to red (strong effect; SNAP score >50). Blue indeed largely highlights regions that have not been implicated in functional changes, red highlights important residues, and white regions are unknown. Measurements shown reflect SNAP scores of mutation to alanine (A), to glycine (B), to cysteine (C) and to all 19 non-native amino acids [average score] (D).
Fig. 3.
Average substitution effect correlated with single amino acid substitutions. Among all single amino acid substitutions (at ASEdb mutant sequence positions), the distribution of predictions that best estimated the average was that of alanine, followed by cysteine, and glycine. These are also the amino acids that are often used in experimental mutagenesis studies to define functional sites.
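
One way to ask which single substitution best tracks the per-position average is sketched below. The mean absolute deviation used here is an assumption chosen for simplicity, not necessarily the measure used in the paper, and the score table is invented (three substitutions per position rather than all 19).

```python
from statistics import mean


def position_average(scores_at_position):
    """Average score over all substitutions listed for one position."""
    return mean(scores_at_position.values())


def mean_abs_deviation_from_average(score_table, amino_acid):
    """Mean |score(amino_acid) - position average| over positions that list it."""
    deviations = [
        abs(scores[amino_acid] - position_average(scores))
        for scores in score_table.values()
        if amino_acid in scores
    ]
    return mean(deviations)


if __name__ == "__main__":
    # Hypothetical table: position -> {substituted amino acid: predicted score}.
    toy_table = {
        1: {"A": 20, "G": 40, "C": 0},
        2: {"A": -30, "G": -60, "C": 0},
    }
    ranking = sorted("AGC", key=lambda aa: mean_abs_deviation_from_average(toy_table, aa))
    print(ranking)
```

A smaller deviation means the substitution is a closer stand-in for the average effect at a position.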

References

    1. Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000;28:45–48.
    2. Bairoch A, et al. The universal protein resource (UniProt). Nucleic Acids Res. 2005;33:D154–D159.
    3. Bogan AA, Thorn KS. Anatomy of hot spots in protein interfaces. J. Mol. Biol. 1998;280:1–9.
    4. Bromberg Y, Rost B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007;35:3823–3835.
    5. Christesen HB, et al. The second activating glucokinase mutation (A456V): implications for glucose homeostasis and diabetes therapy. Diabetes. 2002;51:1240–1246.
