Review
Adv Protein Chem Struct Biol. 2008:75:107-41.
doi: 10.1016/S0065-3233(07)75004-0. Epub 2009 Feb 26.

Chapter 4. Predicting and characterizing protein functions through matching geometric and evolutionary patterns of binding surfaces

Jie Liang et al. Adv Protein Chem Struct Biol. 2008.

Abstract

Predicting protein functions from structures is an important and challenging task. Although proteins are often thought to be packed as tightly as solids, closer examination based on geometric computation reveals that they contain numerous voids and pockets. Most of these are random in nature, but some are binding sites that provide surfaces for interacting with other molecules. A promising approach to function prediction is to infer function by discovering similarity among local binding pockets, because proteins that bind similar substrates or ligands and carry out similar functions are subject to similar physical constraints on binding and reaction. In this chapter, we describe computational methods for distinguishing the surface pockets that are likely to be involved in important biological functions, and methods for identifying key residues in these pockets. We further describe how to predict protein functions at large scale from structures by detecting binding surfaces that are similar in residue makeup, shape, and orientation. We also describe a Bayesian Monte Carlo method that can separate selection pressure due to biological function from pressure due to protein folding. We show how this method can be used to reconstruct the evolutionary history of binding surfaces in order to detect similar binding surfaces. In addition, we briefly discuss how the negative image of a binding pocket can be cast, and how such information can be used to facilitate drug discovery.

Keywords: Bayesian Monte Carlo; CASTp; Local binding surface; alpha shape; pocket; protein function; pvSOAR; void.

Figures

Figure 1
Pockets and voids in proteins. There are three types of unfilled space on protein surfaces. Voids are fully enclosed and have no outlet, pockets are accessible from the outside but with constriction at mouths, and shallow depressions have wide openings. We use the general term surface pockets to include both pockets and voids (Adapted from [13]).
Figure 2
Voids and pockets in protein structures. (a) The numbers of voids and pockets scale roughly linearly with protein length for a representative set of 636 proteins. Here circles and solid triangles represent the numbers of voids and pockets, respectively. (b) The volume of a protein, as calculated using the van der Waals model, scales linearly with the van der Waals area of the protein (Adapted from [13]).
Figure 3
The binding pockets on HIV-1 protease and phosphatidylinositol transfer protein (PITP). (Left): Binding pocket (yellow) on HIV-1 protease shown in van der Waals space-filling model. The ligand is colored red. (Middle): The alpha shape of the HIV-1 protease binding site. Its mouth opening is colored gold. (Right): Binding pocket (green) on PITP for phospholipid (red) and a regulatory site on a different region (yellow) of the same protein.
Figure 4
The length distribution and residue composition of functional surfaces for 3,275 enzyme proteins containing known functional key residues. (a) Functional surfaces usually consist of 8–200 residues, with the mean at 35 residues. (b) The amino acid residue composition of functional surfaces is different from the composition of sequences used to construct the Jones-Taylor-Thornton (JTT) model (Adapted from [14]).
Figure 5
The binding surface (green) and key residues predicted from a structure of alpha amylase. Here the predicted four key residues are colored yellow (D176), cyan (H180), pink (N208) and blue (D269). They contain several high propensity atomic patterns from our library of 1,031 functional atomic patterns. Their classes of secondary structural environment (sheet s, helix h, and coil c) are also listed. The substrate molecule is colored red (adapted from [14]).
Figure 6
Functional surfaces on the catalytic domains of cAMP-dependent protein kinase (1cdk) and tyrosine protein kinase (2src). (a) In both cases, the active sites are computed as surface pockets. (b) Residues defining the pockets are well dispersed throughout the primary sequences (full sequence identity = 16%). (c) The identity of their surface sequence patterns is much higher (51%).
Figure 7
The binding pockets from two different stromelysin catalytic domains (pocket 29 from pdb 1hv5.A and pocket 19 from 1qic.D). They are aligned in a sequence-order-independent fashion with a cRMSD of 0.76 Å for 29 atoms from 10 residues. (Top) The binding pockets on the two protein structures, with pocket atoms shown in space-filling form. The aligned atoms are colored in red. (Middle) The alignment of residues of these two surface pockets. Atomic details of the alignment are not shown. Sequence numbers are listed above and below the residue names for 1hv5 and 1qic, respectively. Residues in 1hv5 are arranged in order, but it is clear that the aligned residues in 1qic are not in sequence order. This residue alignment is derived from detailed alignment of atoms from surface pockets. (Bottom) Aligned atoms from these two surface pockets, with N atoms in blue, O in red, and C in green.
Figure 8
Substitution rates of residues in the functional binding surface and the remaining surface of alpha-amylase (pdb 1bag). (a) Substitution rates of residues on functional binding surface (values represented by bubble sizes). (b) Substitution rates of residues on the remaining surface on 1bag. The values and overall pattern of substitutions that appear in both surface regions are very different (adapted from [11]).
Figure 9
Function prediction of alpha amylases. (a) The phylogenetic tree for PDB structure 1bag from B. subtilis. (b) The functional binding pocket of alpha amylase on 1bag. (c) A matched binding surface on a different protein structure (1b2y from human, full sequence identity 22%) obtained by querying with the binding surface of 1bag (adapted from [11]).
Figure 10
Structures containing the CBS domain: (A) CBS domain protein mt1622 from M. thermoautotrophicum (PDB ID=1pbj), (C) inosine-5′-monophosphate dehydrogenase (IMPDH) from S. pyogenes (PDB ID=1zfj), and (E) conserved hypothetical protein Ta549 from T. acidophilum (PDB ID=1pvm). The proposed nucleotide-binding surface of mt1622 (CASTp ID=9, cyan, A) is shown superimposed on a flavoprotein (PDB ID=1efp, white) with a bound AMP molecule (B). The IMPDH binding surface (CASTp ID=31, yellow) is shown superimposed on ATP-bound cyclin-dependent kinase 2 (PDB ID=1b38, white) (D). Ta549 contains an additional C-terminal CBS domain (C, orange) opposite the tandem domain interface surface (CASTp ID=27, C, green). The domain insert creates a novel surface (CASTp ID=30, orange) that shares similarity with an ATP-binding surface from saicar-synthase (PDB ID=1obd, white) (F).
Figure 11
Amino acid substitution rates in the putative retinal-binding pockets of proteorhodopsins. a) Alignment of putative pocket sequences. The 20 pocket residue positions are mapped from the retinal-binding pocket in bacteriorhodopsin structure 1KGB. Residues that are identical to the residues in the first sequence are replaced with “.”. b) Phylogenetic tree of the full-length proteorhodopsin sequences. c) The plot of amino acid substitution rates for residues in the putative retinal-binding pocket. The area of the circles is proportional to the substitution rate. The exchange pairs with the fastest rates are found at positions 93 and 137 in PR (following BR numbering). These are: A/L, A/V, A/E, E/Q, E/L, L/Q, L/V, and M/T (Adapted from [60]).
Figure 12
The generation of a negative image of a binding pocket. (a) The surface pocket in apoferritin that binds isoflurane, (b) the atoms forming the binding pocket and its computed negative image, and (c) negative image of the binding pocket.
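
Two of the quantities quoted in the captions above, the identity of surface sequence patterns (Figure 6) and the cRMSD of aligned pocket atoms (Figure 7), can be illustrated with a minimal Python sketch. This is not the chapter's implementation. It assumes that identity means the fraction of aligned positions carrying the same residue, and that cRMSD is the root-mean-square distance over paired atoms whose coordinates have already been superimposed; the function names and the toy inputs are hypothetical, and no optimal superposition or sequence-order-independent matching is performed.

```python
import math


def pattern_identity(a: str, b: str) -> float:
    """Fraction of aligned positions with identical residues.

    Assumption: both patterns are already aligned to equal length,
    and a gap character '-' never counts as a match.
    """
    if len(a) != len(b) or not a:
        raise ValueError("patterns must be aligned, equal length, and non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def crmsd(p, q) -> float:
    """Coordinate RMSD over paired atoms.

    Assumption: p[i] and q[i] are matched atoms and the two coordinate
    sets are already superimposed; this function does not fit them.
    """
    if len(p) != len(q) or not p:
        raise ValueError("coordinate lists must be paired and non-empty")
    total = sum(math.dist(u, v) ** 2 for u, v in zip(p, q))
    return math.sqrt(total / len(p))


# Toy data (hypothetical, not taken from 1cdk, 2src, 1hv5, or 1qic).
identity = pattern_identity("GKVA", "GRVA")          # 3 of 4 positions match
rmsd = crmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
             [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])     # each pair is 1 Å apart
```

In practice the atom pairing and the superposition would come from the pocket alignment itself; the sketch only shows how the two summary numbers are computed once that pairing is known.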

References

    1. Chandonia JM, Brenner SE. The impact of structural genomics: expectations and outcomes. Science. 2006;311(5759):347–51.
    2. Rost B. Enzyme function less conserved than anticipated. J Mol Biol. 2002;318:595–608.
    3. Tian W, Skolnick J. How well is enzyme function conserved as a function of pairwise sequence identity? J Mol Biol. 2003;333:863–882.
    4. Russell RB. Detection of protein three-dimensional side-chain patterns: new examples of convergent evolution. J Mol Biol. 1998;279:1211–1227.
    5. Binkowski TA, Adamian L, Liang J. Inferring functional relationships of proteins from local sequence and spatial surface patterns. J Mol Biol. 2003;332:505–526.