Protein Eng Des Sel. 2009 Sep;22(9):561-7.
doi: 10.1093/protein/gzp035. Epub 2009 Jul 2.

Discovering rules for protein-ligand specificity using support vector inductive logic programming

Lawrence A Kelley et al. Protein Eng Des Sel. 2009 Sep.

Abstract

Structural genomics initiatives are rapidly generating vast numbers of protein structures. Comparative modelling is also capable of producing accurate structural models for many protein sequences. However, for many of the known structures, functions are not yet determined, and in many modelling tasks, an accurate structural model does not necessarily tell us about function. Thus, there is a pressing need for high-throughput methods for determining function from structure. The spatial arrangement of key amino acids in a folded protein, on the surface or buried in clefts, is often the determinant of its biological function. A central aim of molecular biology is to understand the relationship between such substructures or surfaces and biological function, leading both to function prediction and to function design. We present a new general method for discovering the features of binding pockets that confer specificity for particular ligands. Using a recently developed machine-learning technique that couples the rule-discovery approach of inductive logic programming with the statistical learning power of support vector machines, we are able to discriminate, with high precision (90%) and recall (86%), between pockets that bind FAD and those that bind NAD on a large benchmark set, given only the geometry and composition of the backbone of the binding pocket and without the use of docking. In addition, we learn rules governing this specificity, which can feed into protein functional design protocols. An analysis of the rules found suggests that key features of the binding pocket may be tied to conformational freedom in the ligand. The representation is sufficiently general to be applicable to any discriminatory binding problem. All programs and data sets are freely available to non-commercial users at http://www.sbg.bio.ic.ac.uk/svilp_ligand/.
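
The general idea of turning discovered rules into an attribute vector for a kernel method can be illustrated with a minimal sketch. It assumes each rule is a boolean test on a pocket description and uses a plain count of shared rules as the similarity measure. The rule definitions, the pocket fields (`n_glycine`, `pocket_depth`, `n_charged`), and this simple kernel are hypothetical, chosen for illustration only; they are not the rules, representation, or kernel used in the paper.

```python
from typing import Callable, Dict, List

Pocket = Dict[str, float]        # hypothetical pocket description
Rule = Callable[[Pocket], bool]  # a rule either covers a pocket or it does not

# Hypothetical rules, invented for illustration only.
RULES: List[Rule] = [
    lambda p: p["n_glycine"] >= 3,
    lambda p: p["pocket_depth"] > 8.0,
    lambda p: p["n_charged"] == 0,
]

def attribute_vector(pocket: Pocket, rules: List[Rule]) -> List[int]:
    """Return a 0/1 vector: entry i is 1 if rule i covers the pocket."""
    return [1 if rule(pocket) else 0 for rule in rules]

def shared_rule_kernel(a: List[int], b: List[int]) -> int:
    """Count the rules covering both pockets (dot product of 0/1 vectors)."""
    return sum(x * y for x, y in zip(a, b))

# Two made-up pockets; the values carry no biological meaning.
pocket_a = {"n_glycine": 4, "pocket_depth": 9.5, "n_charged": 1}
pocket_b = {"n_glycine": 3, "pocket_depth": 6.0, "n_charged": 0}

vec_a = attribute_vector(pocket_a, RULES)
vec_b = attribute_vector(pocket_b, RULES)
similarity = shared_rule_kernel(vec_a, vec_b)
```

In a setup like this, the similarity values between all pairs of pockets would be passed to a support vector machine as a kernel matrix, so pockets covered by many of the same rules are treated as close to one another.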

Figures

Figure 1. Graph depicting how performance of the SVILP system was affected by the number of rules used to form the attribute vector.
Data shown is performance on the independent 5-fold optimization set.
Figure 2. Graph depicting the performance of each of the methods on the FAD/NAD discrimination problem.
Precision and recall have been calculated over the 20-fold cross-validation. Precision is defined as tp/(tp+fp) and recall is defined as tp/(tp+fn), where tp are true positives, fp are false positives, and fn are false negatives. Full data is presented in Table 1.
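
The two definitions in the Figure 2 caption can be written out as a short sketch. The function names and the example counts are assumptions made for illustration; the counts are invented and are not taken from the paper's results.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = tp / (tp + fp); returns 0.0 when nothing was predicted positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0

def recall(tp: int, fn: int) -> float:
    """Recall = tp / (tp + fn); returns 0.0 when there are no actual positives."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0

# Invented counts: 8 true positives, 2 false positives, 4 false negatives.
example_precision = precision(tp=8, fp=2)
example_recall = recall(tp=8, fn=4)
```

Returning 0.0 for an empty denominator is one common convention; other choices, such as raising an error, are equally defensible.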
Figure 3. Cartoon representation of FAD (blue and red sticks) inside the binding pocket of protein 1JU2.
Red sticks indicate regions of the FAD that exhibit higher conformational flexibility across 200 computational docking simulations. Pink spheres indicate atoms within amino acid residues that account for 50% of the instantiations (proofs) of the top 10 ILP rules.
