BMC Bioinformatics. 2012 Nov 24:13:314.

doi: 10.1186/1471-2105-13-314.

Local functional descriptors for surface comparison based binding prediction

Gregory M Cipriano¹, George N Phillips Jr, Michael Gleicher

PMID: 23176080
PMCID: PMC3585919
DOI: 10.1186/1471-2105-13-314

Gregory M Cipriano et al. BMC Bioinformatics. 2012.

Affiliation

¹ Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA.

Abstract

Background: Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to predict interactions without regard to the underlying structural scaffold that creates the surface.

Results: We present a new descriptor for protein functional surfaces and algorithms for using these descriptors to compare protein surface regions to identify ligand binding interfaces. Our approach uses descriptors of local regions of the surface, and assembles collections of matches to compare larger regions. Our approach uses a variety of physical, chemical, and geometric properties, adaptively weighting these properties as appropriate for different regions of the interface. Our approach builds a classifier based on a training corpus of examples of binding sites of the target ligand. The constructed classifiers can be applied to a query protein providing a probability for each position on the protein that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, demonstrating performance that is comparable to the state-of-the-art, with an approach with more generality than these prior methods.

Conclusions: Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications.

Figures

**Figure 1**
**Shown are the results for three of the calcium ion binding predictions discussed in the results section.** The three examples depict a successful result, a moderate success, and a failure case. In red are areas that the classifier chose as highly likely (>95% estimated probability) to bind to calcium. In lighter orange are areas that have between 40% and 95% probability of binding, with the shade of orange indicating approximately where in that range the estimate fell. In white are areas that were deemed unlikely to bind to calcium. The binding locations of the crystal structures are shown as blue spheres with a green point at the center.
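The color scheme described in this caption can be sketched as a small mapping from probability to color class. This is only an illustration: the function names are hypothetical, and the handling of the exact boundary values (0.40 and 0.95) is an assumption, since the caption does not say which side they fall on.

```python
def binding_color(p: float) -> str:
    """Map an estimated binding probability to the caption's color class."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p > 0.95:          # highly likely to bind
        return "red"
    if p >= 0.40:         # assumption: 0.40 itself counts as orange
        return "orange"
    return "white"        # deemed unlikely to bind


def orange_intensity(p: float) -> float:
    """Position of p within the 40%-95% band, from 0.0 (light) to 1.0 (dark)."""
    if binding_color(p) != "orange":
        raise ValueError("intensity is only defined inside the orange band")
    return (p - 0.40) / (0.95 - 0.40)
```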

**Figure 2**
**This image shows representative examples from the multiple-ligand binding test discussed in the results section.** The best and worst examples for each test ligand in the experiment are shown visually. Binding prediction probability is shown by color on the protein surface: red marks areas that the classifier chose as highly likely (>95% estimated probability), and orange marks areas with between 40% and 95% probability of binding.

**Figure 3**
Shown here are, for one sample point, the disc-shaped patches of each radius used in the functional surface descriptor: 1.6Å, 3.2Å, 4.8Å, 6.4Å and 8Å.

**Figure 4**
A visual depiction of Algorithm 2, for training a classifier to recognize the environment surrounding a specific atom, given a corpus of examples of that atom’s binding.

**Figure 5**
**Shown here are two possible conformations for Adenosine Triphosphate (ATP).** Note that the distances between atoms within the rigid adenine moiety do not change. Distances between non-rigid components, such as those between the ‘C8’ and ‘O3A’ atoms, may change dramatically. As these will be used later to combine atomic predictions, the observed minimum and maximum distances between each pair of atoms are stored during the training phase.
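The bookkeeping described in this caption, storing the observed minimum and maximum distance for each atom pair across training conformations, might look like the following sketch. The function name, the input layout (one dictionary of atom name to coordinates per conformation), and the toy coordinates in the usage note are assumptions made for illustration; they are not taken from the paper.

```python
import math
from itertools import combinations


def distance_ranges(conformations):
    """Return {(atom_a, atom_b): (min_dist, max_dist)} over all conformations.

    `conformations` is a list of dicts mapping an atom name to an (x, y, z)
    tuple. Pairs are keyed in sorted name order.
    """
    ranges = {}
    for conf in conformations:
        for a, b in combinations(sorted(conf), 2):
            d = math.dist(conf[a], conf[b])
            lo, hi = ranges.get((a, b), (d, d))
            ranges[(a, b)] = (min(lo, d), max(hi, d))
    return ranges


# Toy coordinates (made up): C8-N9 stays rigid, C8-O3A stretches.
toy = [
    {"C8": (0, 0, 0), "N9": (1, 0, 0), "O3A": (4, 0, 0)},
    {"C8": (0, 0, 0), "N9": (0, 1, 0), "O3A": (0, 7, 0)},
]
ranges = distance_ranges(toy)
```

In this toy input the rigid pair ends up with identical minimum and maximum, while the flexible pair records a wide interval, which is the distinction the caption draws.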

**Figure 6**
Prediction Phase: combining atom surface functions to predict a ligand.

**Figure 7**
**An illustration (in 2D) of how our method for grouping samples on the 3D surface works.** In this illustration, each circle represents a sample; samples having similar values in feature space are given the same color. The algorithm proceeds as follows: starting with a radius R, identify discs of radius R that have minimum average distance (in feature space) between elements in the disc. Replace the best non-overlapping discs with the sample in the center of each disc. Repeat, each time reducing the size of the disc. When complete, there will still be samples not contained in a disc. Merge those into neighboring discs if their distance from the center sample is less than a threshold T. The resulting center samples are used for surface prediction. In all results, R = 4Å and T = 0.25.
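The grouping procedure in this caption can be approximated by the sketch below. Several details are assumptions rather than the authors' choices: straight-line distance stands in for distance along the surface, the radius is halved each round (`shrink`), a disc is accepted only if its mean pairwise feature distance is at most a hypothetical cutoff `max_spread`, T is read as a feature-space threshold, and "neighboring" is taken to mean within R of the disc center.

```python
import math
from itertools import combinations


def mean_pairwise(features, members):
    """Average feature-space distance over all pairs in a disc."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(features[i], features[j]) for i, j in pairs) / len(pairs)


def group_samples(positions, features, R=4.0, T=0.25,
                  shrink=0.5, min_radius=1.0, max_spread=0.25):
    """Return {center_index: member_indices} for a set of surface samples."""
    unassigned = set(range(len(positions)))
    groups = {}
    radius = R
    while radius >= min_radius and unassigned:
        # Score one candidate disc per unassigned sample.
        candidates = []
        for c in unassigned:
            members = sorted(i for i in unassigned
                             if math.dist(positions[c], positions[i]) <= radius)
            if len(members) > 1:
                spread = mean_pairwise(features, members)
                if spread <= max_spread:
                    candidates.append((spread, -len(members), c, members))
        # Greedily keep the most coherent discs that do not overlap.
        for _, _, c, members in sorted(candidates):
            if all(i in unassigned for i in members):
                groups[c] = members
                unassigned.difference_update(members)
        radius *= shrink
    # Merge leftovers into a nearby disc whose center has similar features.
    centers = list(groups)
    for i in sorted(unassigned):
        best = None
        for c in centers:
            if math.dist(positions[i], positions[c]) <= R:
                d = math.dist(features[i], features[c])
                if d < T and (best is None or d < best[0]):
                    best = (d, c)
        if best is not None:
            groups[best[1]].append(i)
        else:
            groups[i] = [i]  # stays on its own
    return groups


positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 0, 0), (11, 0, 0),
             (2.5, 0, 0), (2.2, 0, 0)]
features = [(0.0,), (0.05,), (0.1,), (1.0,), (1.05,), (0.9,), (0.2,)]
groups = group_samples(positions, features)
```

In this toy input, samples 0 to 2 and 3 to 4 form discs, sample 6 is merged into the first disc because its features are within T of the center, and sample 5 remains alone because its features differ too much.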

**Figure 8**
**Shown here is the performance over all test cases of calcium binding (Table 2) as a function of the size of the training corpus.** Performance is measured by the area under the ROC graph produced by each example. On the top, each test case is shown as a separate line, each with a different color. Note that while with only a few training examples, the correct pocket is found in most tests, a few harder test cases require more training before they can be reliably predicted. On the bottom is the same data, but averaged, with error bars indicating 95% confidence intervals.
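For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below computes it that way; it is a generic illustration with a hypothetical function name and toy inputs, not the evaluation code used for the figure.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = surface point in the binding site, 0 = elsewhere.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 1.0 means every binding-site point outranks every non-binding point, and 0.5 corresponds to random ordering.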

**Figure 9**
**Shown here are three confusion matrices: the top (a) tested using the full feature vector description (listed in Table 1), the middle (b) using only the most local features, and the bottom (c) using only geometric features.** Each row represents the tests for a ligand classifier run on all test cases. Each column represents an individual testing example, grouped by the ligand the protein is known to bind to. The value in the cell is the area underneath the precision/recall curve produced from that test. A higher value indicates a better match. Green cells indicate true positive results: the predictor found the ligand it was trained for. Purple cells indicate false negatives: the predictor failed to find the ligand it was trained for. Red cells indicate false positives: the predictor found the site of a different ligand. See Figure 2 for illustrations to help interpret these numbers.

**Figure 10**
**This image shows the charge (indicated by color, ranging from dark red (very negative) to dark blue (very positive)) of protein 1AYP as computed by APBS.** This charge pattern is quite different from any of the others seen in the training set.

Cited by

Visualizing Validation of Protein Surface Classifiers.
Sarikaya A, Albers D, Mitchell J, Gleicher M. Comput Graph Forum. 2014 Jun;33(3):171-180. doi: 10.1111/cgf.12373. PMID: 25342867. Free PMC article.

Grants and funding

5T15LM007359/LM/NLM NIH HHS/United States
