BMC Bioinformatics. 2012 Nov 24:13:314.
doi: 10.1186/1471-2105-13-314.

Local functional descriptors for surface comparison based binding prediction

Gregory M Cipriano et al. BMC Bioinformatics. 2012.

Abstract

Background: Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used to identify similar regions and to predict interactions without regard to the underlying structural scaffold that creates the surface.

Results: We present a new descriptor for protein functional surfaces, together with algorithms that use these descriptors to compare protein surface regions and identify ligand binding interfaces. Our approach describes local regions of the surface and assembles collections of matches to compare larger regions. It uses a variety of physical, chemical, and geometric properties, adaptively weighting them as appropriate for different regions of the interface. It builds a classifier from a training corpus of example binding sites of the target ligand. The constructed classifiers can be applied to a query protein, providing for each position on the protein a probability that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, showing performance comparable to the state of the art with an approach that is more general than prior methods.

Conclusions: Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications.

Figures

Figure 1
Results for three of the calcium ion binding predictions discussed in the results section. The three examples depict a successful result, a moderate success, and a failure case. Red marks areas that the classifier rated highly likely (>95% estimated probability) to bind calcium. Lighter orange marks areas with between 40% and 95% probability of binding, with the shade of orange indicating approximately where in that range the estimate fell. White marks areas deemed unlikely to bind calcium. The binding locations in the crystal structures are shown as blue spheres with a green point at the center.
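The color coding in this caption amounts to a small threshold rule. The following is a minimal sketch of that rule only; the function name `binding_colour` and the linear shade scale are assumptions made for illustration and are not taken from the paper.

```python
def binding_colour(probability):
    """Map an estimated binding probability to (colour, shade).

    Thresholds follow the caption: >95% is red, 40% to 95% is orange,
    anything lower is white. The linear shade in [0, 1] for the orange
    band is an assumption; the caption only says the shade indicates
    roughly where in the range the estimate fell.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability > 0.95:
        return "red", 1.0
    if probability >= 0.40:
        return "orange", (probability - 0.40) / 0.55
    return "white", 0.0
```

For example, `binding_colour(0.97)` falls in the red band and `binding_colour(0.2)` in the white band.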
Figure 2
Representative examples from the multiple-ligand binding test discussed in the results section. The best and worst examples for each test ligand in the experiment are shown. Binding prediction probability is shown by color on the protein surface: red marks areas that the classifier rated highly likely (>95% estimated probability) to bind, and orange marks areas with between 40% and 95% probability of binding.
Figure 3
Shown here, for one sample point, are the disc-shaped patches of each radius used in the functional surface descriptor: 1.6 Å, 3.2 Å, 4.8 Å, 6.4 Å, and 8 Å.
Figure 4
A visual depiction of Algorithm 2, for training a classifier to recognize the environment surrounding a specific atom, given a corpus of examples of that atom’s binding.
Figure 5
Shown here are two possible conformations for Adenosine Triphosphate (ATP). Note that the distances between atoms within the rigid adenine moiety do not change. Distances between non-rigid components, such as those between the ‘C8’ and ‘O3A’ atoms, may change dramatically. As these will be used later to combine atomic predictions, the observed minimum and maximum distances between each pair of atoms are stored during the training phase.
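The bookkeeping described in this caption, storing the observed minimum and maximum distance for each atom pair across training conformations, can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names, the dictionary-of-coordinates input format, and the `tolerance` parameter are all assumptions.

```python
import math
from itertools import combinations


def pairwise_distance_ranges(conformations):
    """Return {(atom_a, atom_b): (min_dist, max_dist)} over all conformations.

    `conformations` is a list of dicts mapping atom name -> (x, y, z).
    Atom names in each key are in sorted order (hypothetical convention).
    """
    ranges = {}
    for conformation in conformations:
        for a, b in combinations(sorted(conformation), 2):
            d = math.dist(conformation[a], conformation[b])
            lo, hi = ranges.get((a, b), (d, d))
            ranges[(a, b)] = (min(lo, d), max(hi, d))
    return ranges


def is_compatible(ranges, a, b, distance, tolerance=0.0):
    """Check whether a candidate distance lies inside the observed range."""
    key = (a, b) if a < b else (b, a)
    lo, hi = ranges[key]
    return lo - tolerance <= distance <= hi + tolerance
```

Pairs inside a rigid moiety end up with nearly equal minimum and maximum, while flexible pairs such as 'C8' and 'O3A' get a wide range, which is what allows the range to act as a constraint when atomic predictions are later combined.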
Figure 6
Prediction Phase: combining atom surface functions to predict a ligand.
Figure 7
An illustration (in 2D) of how our method for grouping samples on the 3D surface works. In this illustration, each circle represents a sample; samples having similar values in feature space are given the same color. The algorithm proceeds as follows: starting with a radius R, identify discs of radius R that have minimum average distance (in feature space) between elements in the disc. Replace the best non-overlapping discs with the sample in the center of each disc. Repeat, each time reducing the size of the disc. When complete, there will still be samples not contained in a disc. Merge those into neighboring discs if their distance from the center sample is less than a threshold T. The resulting center samples are used for surface prediction. In all results, R = 4 Å and T = 0.25.
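The grouping procedure in this caption can be sketched as a greedy loop. The code below is a simplified illustration under several assumptions that the caption does not state: distances are plain Euclidean rather than measured along the surface, the disc radius is halved on each round for a fixed number of rounds, a disc is accepted only if its mean pairwise feature distance is at most a hypothetical cutoff `max_spread`, "neighboring" means the spatially nearest center, and T is compared against feature-space distance.

```python
import math
from itertools import combinations


def mean_pairwise(features):
    """Mean feature-space distance over all pairs (needs >= 2 items)."""
    pairs = list(combinations(features, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def group_samples(positions, features, radius=4.0, threshold=0.25,
                  max_spread=0.25, shrink=0.5, rounds=3):
    """Return {center_index: [member indices]} from greedy disc grouping."""
    free = set(range(len(positions)))
    groups = {}
    for _ in range(rounds):
        candidates = []
        for c in sorted(free):
            members = [i for i in sorted(free)
                       if math.dist(positions[c], positions[i]) <= radius]
            if len(members) > 1:
                spread = mean_pairwise([features[i] for i in members])
                if spread <= max_spread:
                    candidates.append((spread, c, members))
        # Most homogeneous discs first; skip discs overlapping an accepted one.
        for _, c, members in sorted(candidates):
            if all(i in free for i in members):
                groups[c] = members
                free.difference_update(members)
        radius *= shrink
    # Merge leftovers into the nearest center if the features are close enough.
    for i in sorted(free):
        if not groups:
            break
        c = min(groups, key=lambda g: math.dist(positions[g], positions[i]))
        if math.dist(features[c], features[i]) < threshold:
            groups[c].append(i)
    return groups
```

The keys of the returned dictionary play the role of the center samples that the caption says are used for surface prediction; samples that fit no disc and fail the threshold test are simply left out of the result.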
Figure 8
Shown here is the performance over all test cases of calcium binding (Table 2) as a function of the size of the training corpus. Performance is measured by the area under the ROC curve produced by each example. On the top, each test case is shown as a separate line, each with a different color. Note that while the correct pocket is found in most tests with only a few training examples, a few harder test cases require more training before they can be reliably predicted. On the bottom is the same data, but averaged, with error bars indicating 95% confidence intervals.
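For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The sketch below computes it that way; the function name and the list-based inputs are assumptions for illustration and say nothing about how the authors computed their values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` are truthy for binding-site samples and falsy otherwise;
    `scores` are the predicted probabilities. Ties count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

A value of 1.0 means every binding-site sample outranks every non-binding sample, and 0.5 corresponds to chance.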
Figure 9
Shown here are three confusion matrices: the top (a) tested using the full feature vector description (listed in Table 1), the middle (b) using only the most local features, and the bottom (c) using only geometric features. Each row represents the tests for a ligand classifier run on all test cases. Each column represents an individual testing example, grouped by the ligand the protein is known to bind to. The value in the cell is the area underneath the precision/recall curve produced from that test. A higher value indicates a better match. Green cells indicate true positive results: the predictor found the ligand it was trained for. Purple cells indicate false negatives: the predictor failed to find the ligand it was trained for. Red cells indicate false positives: the predictor found the site of a different ligand. See Figure 2 for illustrations to help interpret these numbers.
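One common way to summarize a precision/recall curve as a single number is average precision, sketched below. The caption does not say how the area was integrated, so this is an assumed stand-in rather than the authors' procedure; the function name and inputs are likewise hypothetical.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Samples are ranked by descending score; tied scores keep their
    input order, which is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(1 for _, label in ranked if label)
    if n_positive == 0:
        raise ValueError("need at least one positive sample")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_positive
```

Under this summary, a cell value near 1 means the true binding-site samples are concentrated at the top of the classifier's ranking.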
Figure 10
This image shows the charge of protein 1AYP as computed by APBS, indicated by color ranging from dark red (very negative) to dark blue (very positive). This charge pattern is quite different from any of the others seen in the training set.
