Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors

Affiliations

¹ Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
² Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland.
³ Department of Information Technology, University of Turku, Turku, Finland.
⁴ Department of Mathematics and Statistics, University of Turku, Turku, Finland.

PMID: 28787438
PMCID: PMC5560747
DOI: 10.1371/journal.pcbi.1005678

Anna Cichonska et al. PLoS Comput Biol. 2017 Aug 7;13(8):e1005678. doi: 10.1371/journal.pcbi.1005678. eCollection 2017 Aug.

Abstract

Due to the relatively high costs and labor required for experimental profiling of the full target space of chemical compounds, various machine learning models have been proposed as cost-effective means to advance this process by predicting the most potent compound-target interactions for subsequent verification. However, most of the model predictions lack direct experimental validation in the laboratory, making their practical benefits for drug discovery or repurposing applications largely unknown. Here, we therefore introduce and carefully test a systematic computational-experimental framework for the prediction and pre-clinical verification of drug-target interactions using a well-established kernel-based regression algorithm as the prediction model. To evaluate its performance, we first predicted unmeasured binding affinities in a large-scale kinase inhibitor profiling study, and then experimentally tested 100 compound-kinase pairs. The relatively high correlation of 0.77 (p < 0.0001) between the predicted and measured bioactivities supports the potential of the model for filling the experimental gaps in existing compound-target interaction maps. Further, we subjected the model to the more challenging task of predicting target interactions for a new candidate drug compound that lacks prior binding profile information. As a specific case study, we used tivozanib, an investigational VEGF receptor inhibitor with a currently unknown off-target profile. Among 7 kinases with high predicted affinity, we experimentally validated 4 new off-targets of tivozanib, namely the Src-family kinases FRK and FYN A, the non-receptor tyrosine kinase ABL1, and the serine/threonine kinase SLK. Our subsequent experimental validation protocol effectively avoids any possible information leakage between the training and validation data, and therefore enables rigorous model validation for practical applications. These results demonstrate that the kernel-based modeling approach offers practical benefits for probing novel insights into the mode of action of investigational compounds, and for the identification of new target selectivities for drug repurposing applications.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. An overview of our computational-experimental framework for prediction and pre-clinical testing of compound-protein bioactivity profiles.**
Two separate prediction problems are considered: (1) filling the gaps in existing compound-target interaction maps and (2) prediction of target interactions for a new or investigational compound. Molecular descriptors of drug compounds and protein targets are encoded as kernels, and used for binding affinity prediction with the regularized least-squares regression model KronRLS. Finally, a subset of predicted compound-protein bioactivities is experimentally tested (see Materials and Methods for details). Since the experimental validations do not exist at the time of making the predictions, this approach effectively exposes any overfitting of the model to the training data. We chose to use kernel-based models as these are well-suited for representing structured objects, such as molecules, that cannot be accurately described by a standard feature vector. Different types of drug and protein kernels can be calculated using readily available chemical structures and amino acid sequences. The resulting matrices associate all pairs of input objects, and therefore a kernel function can be considered as a similarity measure.
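
The Kronecker-product structure behind KronRLS can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a complete drug-by-protein label matrix, uses a Gaussian kernel on made-up feature vectors as a stand-in for the chemical and sequence kernels, and the function names (`gaussian_kernel`, `kronrls_fit`, `kronrls_predict`) and the regularization value are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, Z, gamma=0.1):
    """Similarity between rows of X and rows of Z (stand-in descriptor kernel)."""
    d2 = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kronrls_fit(Kd, Kt, Y, lam=1.0):
    """Solve (Kd kron Kt + lam*I) vec(A) = vec(Y) via two small eigendecompositions.

    Kd: (n_drugs, n_drugs) drug kernel, Kt: (n_prots, n_prots) protein kernel,
    Y: (n_drugs, n_prots) complete affinity matrix (assumption of this sketch).
    """
    ld, Qd = np.linalg.eigh(Kd)
    lt, Qt = np.linalg.eigh(Kt)
    C = (Qd.T @ Y @ Qt) / (np.outer(ld, lt) + lam)
    return Qd @ C @ Qt.T

def kronrls_predict(A, Kd_query, Kt_query):
    """Kd_query: (m, n_drugs) and Kt_query: (k, n_prots) kernels against the training objects."""
    return Kd_query @ A @ Kt_query.T

# Toy data only: random descriptors and affinities, not values from the study.
rng = np.random.default_rng(0)
drug_feats = rng.normal(size=(6, 8))
prot_feats = rng.normal(size=(5, 10))
Y_train = rng.normal(loc=6.0, scale=1.0, size=(6, 5))

Kd = gaussian_kernel(drug_feats, drug_feats)
Kt = gaussian_kernel(prot_feats, prot_feats)
A = kronrls_fit(Kd, Kt, Y_train, lam=0.5)

# Predict affinities of one unseen drug against all training proteins.
new_drug = rng.normal(size=(1, 8))
pred = kronrls_predict(A, gaussian_kernel(new_drug, drug_feats), Kt)
print(pred.shape)
```

The eigendecomposition route avoids forming the full Kronecker product, which would have (n_drugs × n_prots)² entries.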

**Fig 2. Drug-protein interaction prediction scenarios.**
(d_x, p_x) denotes a query drug-protein pair, the binding affinity of which is to be predicted. (a) The *Bioactivity Imputation* scenario: both the drug d_x and protein p_x are present in the training set, i.e., there exist known bioactivity values for the drug d_x and protein p_x, but not for their interaction (d_x, p_x). (b) The *New Drug* scenario: the protein p_x is present in the training set, whereas the drug d_x is not, i.e., there exist known bioactivity values for the protein p_x but not for the drug d_x. (c) The *New Target* scenario: the drug d_x is present in the training set, whereas the protein p_x is not, i.e., there exist known bioactivity values for the drug d_x, but not for the protein p_x. (d) The *New Drug-Target Pair* scenario: neither the drug d_x nor protein p_x is present in the training set, i.e., there exist no bioactivity values for either the drug d_x or the protein p_x. In this work, we focused primarily on the two most common and practical prediction scenarios, (a) and (b), which correspond to filling the gaps in existing experimentally-measured drug-target interaction maps and prediction of target interactions for an investigational drug compound, respectively.
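
The four scenarios reduce to two membership checks on the query pair. The sketch below is illustrative only: the function name `prediction_scenario`, the "Already Measured" label for pairs that are themselves in the training set, and the placeholder identifiers `d1`, `p1`, and so on are assumptions, not part of the study.

```python
def prediction_scenario(drug, protein, train_pairs):
    """Classify a query (drug, protein) pair relative to a set of measured training pairs."""
    train_pairs = set(train_pairs)
    if (drug, protein) in train_pairs:
        return "Already Measured"  # not one of the four scenarios; added for completeness
    drug_seen = any(d == drug for d, _ in train_pairs)
    protein_seen = any(p == protein for _, p in train_pairs)
    if drug_seen and protein_seen:
        return "Bioactivity Imputation"   # scenario (a)
    if protein_seen:
        return "New Drug"                 # scenario (b)
    if drug_seen:
        return "New Target"               # scenario (c)
    return "New Drug-Target Pair"         # scenario (d)

# Placeholder identifiers, not real compounds or kinases.
measured = [("d1", "p1"), ("d1", "p2"), ("d2", "p1")]
for query in [("d2", "p2"), ("d9", "p1"), ("d1", "p9"), ("d9", "p9")]:
    print(query, "->", prediction_scenario(*query, measured))
```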

**Fig 3. Computational evaluation of the model predictions.**
(a) Leave-one-out and (b) leave-drug-out cross-validation results. The prediction accuracy was evaluated with the Pearson correlation (r) between binding affinities (pK_i) from the study by Metz *et al*. [3] and those predicted using the KronRLS algorithm with different pairs of compound (rows) and protein (columns) molecular descriptors encoded as kernel matrices (c). The corresponding root mean squared error (RMSE) values are shown in S1 Fig. Of note, the Gaussian interaction profile drug kernel (KD-GIP), which resulted in the highest predictive performance under the *Bioactivity Imputation* scenario (a), was not evaluated under the *New Drug* scenario (b), because it is constructed from the bioactivity profile of the drug to be predicted, that is, from information that in practice is unavailable when predicting target interactions for a new investigational drug compound.
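
The difference between the two cross-validation schemes lies in what is held out: a single measured pair, or every pair belonging to one drug. A minimal sketch of both splits follows; the helper names and the placeholder pairs are hypothetical, and the fold construction used in the study may differ in detail.

```python
def leave_one_out_folds(pairs):
    """Hold out one measured (drug, protein) pair at a time."""
    for i in range(len(pairs)):
        train = [j for j in range(len(pairs)) if j != i]
        yield train, [i]

def leave_drug_out_folds(pairs):
    """Hold out all pairs of one drug at a time, mimicking the New Drug scenario."""
    for held_out in sorted({d for d, _ in pairs}):
        test = [i for i, (d, _) in enumerate(pairs) if d == held_out]
        train = [i for i, (d, _) in enumerate(pairs) if d != held_out]
        yield held_out, train, test

# Placeholder identifiers, not real compounds or kinases.
pairs = [("d1", "p1"), ("d1", "p2"), ("d2", "p1"), ("d3", "p2"), ("d3", "p3")]
for drug, train, test in leave_drug_out_folds(pairs):
    print(drug, "train:", train, "test:", test)
```

Under leave-drug-out, no training pair shares a drug with the test fold, which is why a kernel built from the held-out drug's own bioactivity profile cannot be used there.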

**Fig 4. Comparison between computationally-predicted and experimentally-measured bioactivities.**
(a) Scatter plot between bioactivity values of 100 compound-kinase pairs (detailed in S2 Table). r indicates Pearson correlation. The orange cross points correspond to compound-kinase pairs tested in the study of Metz *et al*. but randomly blinded by us in the training of the model, forming an additional validation set. When no clear interaction between compound and kinase was observed in our experimental assay, the pIC₅₀ value was set to 4.9 M, corresponding to the highest drug concentration used in our screen (12,500 nM). The higher the pK_i/pIC₅₀ value, the stronger the affinity between the two molecules. Red lines mark a relatively stringent interaction threshold (7 M), distinguishing the top left corner as the region containing false positive interaction predictions, and the bottom right corner as false negative predictions. (b) A set of receiver operating characteristic (ROC) curves to investigate the model performance as a function of varying activity threshold. We applied 11 different interaction threshold values from the pIC₅₀ interval [6 M, 8 M] to binarize the experimentally-measured bioactivities into true class labels, and then determined how accurately the model can discriminate between the interacting and non-interacting compound-kinase pairs. The average area under the ROC curves (AUC) equals 0.970.
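
Two calculations in this caption can be sketched in a few lines: the conversion of the top screening concentration to the pIC₅₀ floor, and the averaging of ROC AUC over a range of activity thresholds. The code is a hedged illustration, not the authors' analysis script: evenly spaced thresholds, the `>=` binarization rule, the function names, and the toy affinity values are all assumptions.

```python
import numpy as np

def pic50_from_nM(ic50_nM):
    """pIC50 = -log10(IC50 in mol/L), with the IC50 given in nM."""
    return 9.0 - np.log10(ic50_nM)

def roc_auc(labels, scores):
    """Rank-based AUC: probability that a positive scores above a negative (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")  # AUC is undefined when one class is empty
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg)))

def mean_auc_over_thresholds(measured, predicted, thresholds):
    """Binarize measured affinities at each threshold and average the resulting AUCs."""
    measured = np.asarray(measured, dtype=float)
    aucs = [roc_auc(measured >= t, predicted) for t in thresholds]
    return float(np.nanmean(aucs))

print(round(float(pic50_from_nM(12500)), 2))  # top concentration of the screen

# Toy values only, not data from the study.
measured = [4.9, 5.5, 6.5, 7.3, 8.5]
predicted = [5.0, 5.2, 6.8, 7.0, 8.1]
thresholds = np.linspace(6.0, 8.0, 11)  # 11 evenly spaced thresholds (assumed spacing)
print(mean_auc_over_thresholds(measured, predicted, thresholds))
```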

**Fig 5. Prediction of target interactions for an investigational kinase inhibitor tivozanib.**
**(a)** Predicted and measured bioactivity profiles of tivozanib against its 3 established on-targets (*FLT1*, *FLT4*, *KDR*; average bioactivity from ChEMBL; S3 Table) and 7 predicted off-target kinases tested in our experimental assay. Pearson correlation r = 0.668 (p = 0.035). When no clear compound-kinase interaction was observed in our assay, the pIC₅₀ value was set to 4.9 M, corresponding to the highest drug concentration used (12,500 nM). The predicted values fall within an approximately constant range because we focused on experimental validation of the model-predicted off-target interactions. Three of them turned out to be false positives, and therefore the experimental results span a wider range than the predicted values. **(b)** Evaluation of negative interaction predictions from the model. Among 82 kinases with low predicted binding affinities (pK_i < 6 M), 64 were screened by Gao *et al*., and 59 of these are not likely targets of tivozanib (as they have at least 50% of the activity remaining at the high compound concentration of 1 μM).

References

1. Knight ZA, Lin H, Shokat KM. Targeting the cancer kinome through polypharmacology. Nat Rev Cancer. 2010; 10:130–7. doi: 10.1038/nrc2787
2. Hu Y, Furtmann N, Bajorath J. Current compound coverage of the kinome: miniperspective. J Med Chem. 2014; 58:30–40. doi: 10.1021/jm5008159
3. Metz JT, Johnson EF, Soni NB, Merta PJ, Kifle L, Hajduk PJ. Navigating the kinome. Nat Chem Biol. 2011; 7:200–2. doi: 10.1038/nchembio.530
4. Savitski MM, Reinhard FB, Franken H, Werner T, Savitski MF, Eberhard D, et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science. 2014; 346:1255784. doi: 10.1126/science.1255784
5. Reymond JL, Awale M. Exploring chemical space for drug discovery using the chemical universe database. ACS Chem Neurosci. 2012; 3:649–57. doi: 10.1021/cn3000422
