Bioinformatics. 2019 Sep 15;35(18):3329-3338.
doi: 10.1093/bioinformatics/btz111.

DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks

Mostafa Karimi et al. Bioinformatics.

Abstract

Motivation: Drug discovery demands rapid quantification of compound-protein interaction (CPI). However, there is a lack of methods that can predict compound-protein affinity from sequences alone with high applicability, accuracy and interpretability.

Results: We present a seamless integration of domain knowledge and learning-based approaches. Under novel representations of structurally annotated protein sequences, we propose a semi-supervised deep learning model that unifies recurrent and convolutional neural networks to exploit both unlabeled and labeled data, jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options, achieving relative error in IC50 within 5-fold for test cases and within 20-fold for protein classes not included in training. Performance for new protein classes with few labeled data is further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded in our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug-target interactions. Lastly, alternative representations using protein sequences or compound graphs, and a unified RNN/GCNN-CNN model using graph CNN (GCNN), are also explored to reveal the algorithmic challenges ahead.
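To make the "within 5-fold" and "within 20-fold" figures concrete, the following minimal sketch converts between a multiplicative (fold) error in IC50 and an absolute error on a log10 scale. It is only an illustration of the arithmetic: the function names are hypothetical, and it assumes that affinities are compared on a log10 scale, which the abstract does not state explicitly.

```python
import math


def fold_to_log10(fold: float) -> float:
    """Absolute log10 error equivalent to a multiplicative (fold) error."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return abs(math.log10(fold))


def log10_to_fold(log_error: float) -> float:
    """Multiplicative (fold) error equivalent to an absolute log10 error."""
    return 10 ** abs(log_error)


def fold_error(ic50_pred: float, ic50_true: float) -> float:
    """Symmetric fold error between two IC50 values in the same units.

    Over- and under-prediction by the same factor give the same result.
    """
    if ic50_pred <= 0 or ic50_true <= 0:
        raise ValueError("IC50 values must be positive")
    ratio = ic50_pred / ic50_true
    return max(ratio, 1.0 / ratio)


if __name__ == "__main__":
    for fold in (5.0, 20.0):
        print(f"{fold:g}-fold error = {fold_to_log10(fold):.2f} log10 units")
```

Under this assumption, a 5-fold error corresponds to about 0.70 log10 units and a 20-fold error to about 1.30 log10 units.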

Availability and implementation: Data and source code are available at https://github.com/Shen-Lab/DeepAffinity.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Our unified RNN-CNN pipeline to predict and interpret compound–protein affinity
Fig. 2. Comparing strategies to generalize predictions for four sets of new protein classes: original random forest (RF), original param.+NN ensemble of unified RNN-CNN models (DL for deep learning with the default attention), and re-trained RF or transfer DL using incremental amounts of labeled data in each set
Fig. 3. Interpreting deep learning models for predicting factor Xa (A) binding site and (B) selectivity origin based on joint attention. (A) 3D structure of factor Xa (colored cartoon representation) in complex with DX-9065a (black sticks) (PDB ID: 1FAX), where protein SSEs are color-coded by attention scores (βi), with warmer colors indicating higher attention. (B) Segments of factor Xa are scored by one minus the average of the βi rank ratios for the two compound–protein interactions; the ground truth of the selectivity origin is shown in red. (Color version of this figure is available at Bioinformatics online.)
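The segment score described for panel (B) can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes that a "rank ratio" is a segment's rank divided by the number of segments, that rank 1 goes to the highest attention score, and that both interactions supply attention scores over the same list of segments. The function names and the toy inputs are hypothetical.

```python
from typing import List, Sequence


def rank_ratios(scores: Sequence[float]) -> List[float]:
    """Rank each score (1 = highest) and divide by the number of scores.

    Assumption: ties are broken by original position (stable sort).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    ratios = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        ratios[idx] = rank / n
    return ratios


def segment_scores(beta_a: Sequence[float], beta_b: Sequence[float]) -> List[float]:
    """Score each segment as one minus the mean of its two rank ratios.

    Segments that rank highly in both interactions score close to 1.
    """
    if len(beta_a) != len(beta_b):
        raise ValueError("both attention vectors must cover the same segments")
    ratios_a = rank_ratios(beta_a)
    ratios_b = rank_ratios(beta_b)
    return [1.0 - (a + b) / 2.0 for a, b in zip(ratios_a, ratios_b)]


if __name__ == "__main__":
    # Toy attention scores over three segments for two interactions.
    print(segment_scores([0.1, 0.6, 0.3], [0.2, 0.5, 0.3]))
```

With this convention, a segment that receives the highest attention in both interactions gets the largest score, and the lowest-ranked segment scores 0.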
