Proc Natl Acad Sci U S A. 2017 Dec 26;114(52):13685-13690.
doi: 10.1073/pnas.1705381114. Epub 2017 Dec 11.

Structure-based prediction of ligand-protein interactions on a genome-wide scale

Howook Hwang et al. Proc Natl Acad Sci U S A.

Abstract

We report a template-based method, LT-scanner, which scans the human proteome using protein structural alignment to identify proteins that are likely to bind ligands that are present in experimentally determined complexes. A scoring function that rapidly accounts for binding site similarities between the template and the proteins being scanned is a crucial feature of the method. The overall approach is first tested based on its ability to predict the residues on the surface of a protein that are likely to bind small-molecule ligands. The algorithm that we present, LBias, is shown to compare very favorably to existing algorithms for binding site residue prediction. LT-scanner's performance is evaluated based on its ability to identify known targets of Food and Drug Administration (FDA)-approved drugs and it too proves to be highly effective. The specificity of the scoring function that we use is demonstrated by the ability of LT-scanner to identify the known targets of FDA-approved kinase inhibitors based on templates involving other kinases. Combining sequence with structural information further improves LT-scanner performance. The approach we describe is extendable to the more general problem of identifying binding partners of known ligands even if they do not appear in a structurally determined complex, although this will require the integration of methods that combine protein structure and chemical compound databases.

Keywords: drug off-targets; machine learning; protein–ligand interactions; structure-based prediction.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Overview of LBias and LT-scanner methods. (A) For a given query protein (shown in green), LBias collects and superposes structure neighbors A, B, and C (shown in yellow) that are cocrystallized with their bound ligands (shown in blue). LBias then predicts the most likely ligand-binding residues (shown in red and yellow) on the query protein based on the collective contacts that the superposed ligands make. (B) For a given template cocrystal structure of a drug (shown in blue) and a template protein (shown in green), LT-scanner scans through proteins A, B, and C (shown in orange) by superposing the template structure onto each protein so as to create interaction models. LT-scanner then calculates the SimLT-scanner interaction similarity score (shown as SimLT) between the interaction models of the query–drug complex and the interactions in the binding site of the template.
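The contact-collection step in panel A can be illustrated with a minimal Python sketch. It is an unweighted tally, not the LBias scoring itself: the function name, the input layout, and the 4.0 Å cutoff are all assumptions made for illustration, and the ligands are assumed to have been superposed onto the query frame already.

```python
import math
from collections import Counter


def count_ligand_contacts(query_residues, superposed_ligands, cutoff=4.0):
    """Count, for each query residue, how many superposed ligands touch it.

    query_residues: dict mapping a residue id to a list of (x, y, z) atoms.
    superposed_ligands: list of ligands, each a list of (x, y, z) atoms
        already transformed into the query protein's coordinate frame.
    cutoff: contact distance in angstroms (hypothetical value).
    """
    counts = Counter()
    for ligand in superposed_ligands:
        for res_id, atoms in query_residues.items():
            # A residue is "in contact" if any of its atoms lies within
            # the cutoff of any atom of this ligand.
            if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand):
                counts[res_id] += 1
    return counts


# Toy example: two residues, three superposed ligands.
residues = {"A10": [(0.0, 0.0, 0.0)], "A50": [(20.0, 0.0, 0.0)]}
ligands = [[(1.0, 0.0, 0.0)], [(0.0, 3.0, 0.0)], [(19.0, 0.0, 0.0)]]
contacts = count_ligand_contacts(residues, ligands)
```

Residues contacted by more superposed ligands rank higher in this tally; a real predictor would presumably weight each neighbor rather than count all of them equally.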
Fig. 2.
Precision–recall curves for ligand-binding residue prediction. Precision–recall curves are shown for LBias (black line), “simple count” (gray line), ConCavity (blue line), and random prediction (yellow line) in precision–recall curve space. Precision–recall points (PR-point; Results) are shown for LBias, ConCavity, simple count, COACH, and FTsite.
Fig. 3.
ROC curves for drug target protein predictions. (A) ROC curves for LT-scanner (black line), LT-scanner/seq (red line), and Sequence (green line) are shown to evaluate performance for prediction of drug targets in the full set of ∼15,000 human proteins in HSSP. (B) ROC curves for LT-scanner/seq (red line), FINDSITE_comb(30) (blue line), and FINDSITE_comb(95) (gray line) for drugs and proteins used in both the FINDSITE study and this study. (C) ROC curves for LT-scanner/seq (red line), LT-scanner (black line), Sequence (green line), and random (purple line). Curves are calculated for 21 FDA-approved kinase inhibitors and 600 human kinases.
Fig. 4.
Calculating LBias SIM score. A query protein Q is shown at left with three atoms, q1, q2, and q3, identified specifically. The second panel shows a ligand-containing protein NL, structurally similar to Q, with the ligand shown as a gray line connecting two gray atoms. Ligand-binding residues, n1 and n2, are identified. The NL complex is superposed onto Q. Residues from Q that interact with the superposed ligand are identified (q1 and q2 in the above). A protein–ligand similarity score between QL and NL ($S_{QL:NL}$) is then calculated, where $S_{QL:NL} = m_{11}e^{\gamma r_{11}^{2}} + m_{12}e^{\gamma r_{12}^{2}} + m_{21}e^{\gamma r_{21}^{2}} + m_{22}e^{\gamma r_{22}^{2}}$. $S_{QL:NL}$ is a function of all of the pairwise distances (e.g., $r_{11}$, $r_{12}$) between the atoms from Q interacting with the superposed ligand and the atoms from N interacting with the native ligand if the two atoms in question make chemically similar contacts with the ligands. For example, if n1 makes a hydrogen bond with the ligand, but q1 is hydrophobic, $m_{11}$ would be zero and there would be no contribution to $S_{QL:NL}$ from this pair of atoms (Materials and Methods).
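The sum in this caption can be written out as a short Python sketch. Several details are assumptions rather than facts from the caption: the match term m is treated as binary (1 when two atoms carry the same contact label, 0 otherwise), the contact labels themselves are hypothetical, and γ is given a negative placeholder value so that contributions decay with distance, since the caption states neither its value nor its sign.

```python
import math


def similarity_score(query_atoms, template_atoms, gamma=-0.5):
    """Sum m_ij * exp(gamma * r_ij**2) over all query/template atom pairs.

    query_atoms, template_atoms: lists of ((x, y, z), contact_type) tuples,
        where contact_type is a hypothetical label such as "hbond".
    gamma: placeholder constant (assumed negative here, not from the source).
    """
    score = 0.0
    for q_xyz, q_type in query_atoms:
        for n_xyz, n_type in template_atoms:
            # Assumed binary match term: chemically dissimilar contacts
            # contribute nothing to the score.
            m = 1.0 if q_type == n_type else 0.0
            r_squared = sum((a - b) ** 2 for a, b in zip(q_xyz, n_xyz))
            score += m * math.exp(gamma * r_squared)
    return score


# Toy example mirroring the caption: two query atoms and two template atoms.
query = [((0.0, 0.0, 0.0), "hbond"), ((1.0, 0.0, 0.0), "hydrophobic")]
template = [((0.0, 0.0, 0.0), "hbond"), ((0.0, 0.0, 2.0), "hydrophobic")]
score = similarity_score(query, template)
```

In the toy example the two mismatched pairs drop out, so only the coincident hydrogen-bonding pair and the more distant hydrophobic pair contribute.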
