QSAR models for predicting the similarity in binding profiles for pairs of protein kinases and the variation of models between experimental data sets
- PMID: 19639957
- DOI: 10.1021/ci900176y
Abstract
We propose a direct QSAR methodology to predict how similar the inhibitor-binding profiles of two protein kinases are likely to be, based on the properties of the residues surrounding the ATP-binding site. We produce a random forest model for each of five data sets (one in-house, four from the literature) in which multiple compounds are tested on many kinases. Each model is self-consistent by cross-validation, and all models indicate that only a few residues in the active site control the binding profiles. While all models include the "gatekeeper" as one of the important residues, consistent with previous literature, some models suggest that other residues are more important. We apply each model to predict the similarity in binding profile for all pairs in a set of 411 kinases from the human genome and obtain very different predictions from each model. This turns out to stem not from the model-building but from the fact that the experimental data sets disagree about which kinases are similar to which others. It is possible to build a model combining all the data from the five data sets that is reasonably self-consistent but, not surprisingly given the disagreement between data sets, less self-consistent than the individual models.
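To make the idea of a pairwise binding-profile similarity, and of disagreement between data sets, more concrete, here is a minimal sketch in Python. The abstract does not say which similarity measure the authors used, so the Pearson correlation below is an assumption chosen for illustration. The function names (`profile_similarity`, `dataset_agreement`), the kinase labels, and all activity values are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # flat profile: correlation is undefined, treat as 0
    return cov / sqrt(vx * vy)


def profile_similarity(activities):
    """Map each kinase pair to the correlation of their compound profiles.

    `activities` maps kinase name -> list of activities (for example pIC50),
    one entry per compound, in the same compound order for every kinase.
    ASSUMPTION: Pearson correlation stands in for the paper's measure.
    """
    return {
        (a, b): pearson(activities[a], activities[b])
        for a, b in combinations(sorted(activities), 2)
    }


def dataset_agreement(sim_a, sim_b):
    """Correlate two data sets' similarities over their shared kinase pairs."""
    shared = sorted(set(sim_a) & set(sim_b))
    return pearson([sim_a[p] for p in shared], [sim_b[p] for p in shared])


# Made-up illustrative values, NOT data from the paper.
set_1 = {
    "K1": [8.0, 6.0, 5.0, 7.0],
    "K2": [7.9, 6.1, 5.2, 7.1],
    "K3": [5.0, 7.0, 8.0, 6.0],
}
set_2 = {
    "K1": [6.0, 7.0, 5.0, 8.0],
    "K2": [5.0, 8.0, 6.0, 6.5],
    "K3": [6.1, 7.2, 5.1, 7.9],
}

sim_1 = profile_similarity(set_1)
sim_2 = profile_similarity(set_2)
agreement = dataset_agreement(sim_1, sim_2)

for pair in sorted(sim_1):
    print(pair, round(sim_1[pair], 3), round(sim_2[pair], 3))
print("agreement between data sets:", round(agreement, 3))
```

In this toy example the first data set pairs K1 with K2 while the second pairs K1 with K3, so the agreement score comes out negative. That is the kind of disagreement between experimental data sets the abstract describes. In the approach described above, similarities like these would be the quantity a random forest is trained to predict from residue properties; that modelling step is not shown here.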
