Bioinformatics. 2018 Sep 1;34(17):i821-i829.
doi: 10.1093/bioinformatics/bty593.

DeepDTA: deep drug-target binding affinity prediction

Hakime Öztürk et al. Bioinformatics.

Abstract

Motivation: The identification of novel drug-target (DT) interactions is a substantial part of the drug discovery process. Most computational methods proposed to predict DT interactions have focused on binary classification, where the goal is to determine whether a DT pair interacts or not. However, protein-ligand interactions assume a continuum of binding strength values, also called binding affinity, and predicting this value remains a challenge. The increase in the affinity data available in DT knowledge-bases allows the use of advanced learning techniques, such as deep learning architectures, in the prediction of binding affinities. In this study, we propose a deep-learning-based model that uses only sequence information of both targets and drugs to predict DT interaction binding affinities. The few studies that focus on DT binding affinity prediction use either 3D structures of protein-ligand complexes or 2D features of compounds. One novel approach used in this work is the modeling of protein sequences and compound 1D representations with convolutional neural networks (CNNs).
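As a rough illustration of what a "1D representation" can look like before it reaches a CNN, the sketch below label-encodes character strings (SMILES or amino-acid sequences) into fixed-length integer vectors. This is a minimal sketch under stated assumptions, not the authors' preprocessing: the function names `build_vocab` and `encode`, the choice to build the vocabulary from the inputs, the use of 0 for padding, and the length of 10 are all hypothetical. It also treats every character as its own symbol, so two-letter SMILES atoms such as "Cl" are split.

```python
from typing import Dict, List


def build_vocab(sequences: List[str]) -> Dict[str, int]:
    """Map each distinct character to a positive integer; 0 is reserved for padding."""
    chars = sorted({ch for seq in sequences for ch in seq})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(seq: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Label-encode a string, truncating or zero-padding it to max_len.

    Characters missing from the vocabulary also map to 0 (an assumption
    made here for simplicity).
    """
    ids = [vocab.get(ch, 0) for ch in seq[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Toy SMILES strings (ethanol and benzene), used only for illustration.
smiles = ["CCO", "c1ccccc1"]
vocab = build_vocab(smiles)
encoded = [encode(s, vocab, max_len=10) for s in smiles]
```

The same two functions could be applied to protein sequences with a larger `max_len`; the resulting integer vectors are the kind of input an embedding layer followed by 1D convolutions would consume.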

Results: The results show that the proposed deep-learning-based model, which uses the 1D representations of targets and drugs, is an effective approach for drug-target binding affinity prediction. The model in which high-level representations of a drug and a target are constructed via CNNs achieved the best Concordance Index (CI) performance in one of our larger benchmark datasets, outperforming the KronRLS algorithm and SimBoost, a state-of-the-art method for DT binding affinity prediction.
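For readers unfamiliar with the Concordance Index, the sketch below computes it in its commonly used pairwise form: over all pairs whose measured affinities differ, it counts how often the predictions are ordered the same way, giving half credit for tied predictions. This is a straightforward O(n^2) illustration, not the evaluation code used in the study; the function name `concordance_index` and the toy values are assumptions.

```python
from typing import Sequence


def concordance_index(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predicted order matches the measured order.

    A pair (i, j) is comparable when y_true[i] > y_true[j]. It scores 1 if
    y_pred[i] > y_pred[j], 0.5 if the predictions are tied, and 0 otherwise.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    score = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:
                comparable += 1
                diff = y_pred[i] - y_pred[j]
                if diff > 0:
                    score += 1.0
                elif diff == 0:
                    score += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs: all measured values are equal")
    return score / comparable


# Toy example: one of the three comparable pairs is ranked in the wrong order.
ci = concordance_index([1.0, 2.0, 3.0], [0.1, 0.3, 0.2])
```

A CI of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 to a fully reversed ranking.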

Availability and implementation: https://github.com/hkmztrk/DeepDTA.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Summary of the Davis (left panel) and KIBA (right panel) datasets. (A) Distribution of binding affinity values. (B) Distribution of the lengths of the SMILES strings. (C) Distribution of the lengths of the protein sequences.
Fig. 2. DeepDTA model with two CNN blocks to learn from compound SMILES and protein sequences.
Fig. 3. Experiment setup.
Fig. 4. Predictions from DeepDTA model with two CNN blocks against measured (real) binding affinity values for Davis (pKd) and KIBA (KIBA score) datasets.
