Transfer learning for drug-target interaction prediction

Alperen Dalkıran et al. Bioinformatics. 2023 Jun 30;39(Suppl 1):i103-i110. doi: 10.1093/bioinformatics/btad234.

Abstract

Motivation: Utilizing AI-driven approaches for drug-target interaction (DTI) prediction requires large volumes of training data, which are not available for the majority of target proteins. In this study, we investigate the use of deep transfer learning for predicting interactions between drug candidate compounds and understudied target proteins with scarce training data. The idea is to first train a deep neural network classifier on a large, generalized source training dataset and then reuse this pre-trained network as the initial configuration for re-training/fine-tuning on a small, specialized target training dataset. To explore this idea, we selected six protein families of critical importance in biomedicine: kinases, G-protein-coupled receptors (GPCRs), ion channels, nuclear receptors, proteases, and transporters. In two independent experiments, the transporter and nuclear receptor families were individually set as the target datasets, while the remaining five families were used as the source datasets. Target-family training datasets of several sizes were formed in a controlled manner to assess the benefit provided by the transfer learning approach.
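As a concrete illustration, here is a minimal PyTorch sketch of the two-stage recipe described above. The framework choice, layer sizes, 300-dimensional Chemprop-style inputs, and toy data loaders are assumptions for illustration, not the authors' exact configuration:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def make_fnn(in_dim: int = 300, hidden: int = 512, n_classes: int = 2) -> nn.Sequential:
    # Feed-forward classifier; the 300-d input assumes Chemprop-style
    # compound features (layer sizes here are illustrative, not the paper's).
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, n_classes),
    )

def toy_loader(n: int, in_dim: int = 300) -> DataLoader:
    # Random stand-in data; replace with featurized compounds and
    # binary active/inactive labels for a real protein family.
    x, y = torch.randn(n, in_dim), torch.randint(0, 2, (n,))
    return DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

def train(model: nn.Module, loader: DataLoader, epochs: int, lr: float = 1e-3) -> nn.Module:
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Stage I: pre-train on a large source-family dataset (e.g. kinases).
source_model = train(make_fnn(), toy_loader(10_000), epochs=5)

# Stage II: reuse the pre-trained weights as the starting point and
# fine-tune on a small target-family dataset (e.g. transporters).
target_model = train(source_model, toy_loader(100), epochs=20, lr=1e-4)
```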

Results: Here, we present a systematic evaluation of our approach by pre-training a feed-forward neural network with source training datasets and applying different modes of transfer learning from the pre-trained source network to a target dataset. The performance of deep transfer learning is evaluated and compared with that of training the same deep neural network from scratch. We found that when the training dataset contains fewer than 100 compounds, transfer learning outperforms the conventional strategy of training the system from scratch, suggesting that transfer learning is advantageous for predicting binders to understudied targets.
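Prediction quality in the figures below is reported as the Matthews correlation coefficient (MCC). A hedged evaluation sketch, building on the helpers above; the test loader is a random placeholder standing in for a held-out target-family test set:

```python
import torch
from sklearn.metrics import matthews_corrcoef

@torch.no_grad()
def test_mcc(model: torch.nn.Module, loader) -> float:
    # Collect hard predictions over the test set and score them with MCC.
    model.eval()
    preds, labels = [], []
    for x, y in loader:
        preds.append(model(x).argmax(dim=1))
        labels.append(y)
    return matthews_corrcoef(torch.cat(labels).numpy(), torch.cat(preds).numpy())

test_loader = toy_loader(500)  # stand-in for a held-out target-family test set
print("transfer:", test_mcc(target_model, test_loader))
print("scratch :", test_mcc(train(make_fnn(), toy_loader(100), epochs=20), test_loader))
```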

Availability and implementation: The source code and datasets are available at https://github.com/cansyl/TransferLearning4DTI. Our web-based service containing the ready-to-use pre-trained models is accessible at https://tl4dti.kansil.org.


Conflict of interest statement

None declared.

Figures

Figure 1.
The percentage of target proteins with certain numbers of bioactive compounds in the ChEMBL_29 database (data filters: targets are single human proteins, bioactivities are associated with a pChEMBL value, all data points come from binding assays, and multiple bioactivity data points for the same compound-target pair are counted as one). For example, 6.2% of the target proteins in ChEMBL have between 501 and 1000 bioactive compounds.
Figure 2.
Sketch of the training phase. We first trained a source neural network model with the training dataset of a source family (Stage I). This pre-trained source model was then retrained via transfer learning with a small-sized target training dataset (Stage II). Using the same target training dataset, we also trained from scratch an FNN with exactly the same configuration (reference model), as well as a shallow classifier (base model).
Figure 3.
Visual representations of the three modes of transfer learning (described in Section 2) on FNN-2-Chemprop: (a) Mode 1: full fine-tuning; (b) Mode 2: feature transformer; and (c) Mode 3: shallow classifier.
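A plausible realization of the three modes in PyTorch, building on the Stage-I sketch above. Which layers are frozen or replaced in each mode is an assumption drawn from the caption, not a statement of the paper's exact setup:

```python
import copy
import torch.nn as nn

pretrained = source_model  # Stage-I network from the sketch above

# Mode 1 - full fine-tuning: copy the pre-trained weights and leave
# every layer trainable on the target family.
mode1 = copy.deepcopy(pretrained)

# Mode 2 - feature transformer: freeze the pre-trained hidden layers
# and retrain only the final classification layer.
mode2 = copy.deepcopy(pretrained)
for p in mode2.parameters():
    p.requires_grad = False
for p in mode2[-1].parameters():  # last nn.Linear stays trainable
    p.requires_grad = True

# Mode 3 - shallow classifier: use the frozen network minus its output
# layer as a fixed feature extractor, then fit a shallow model
# (e.g. an SVM) on the extracted features.
extractor = nn.Sequential(*list(copy.deepcopy(pretrained))[:-1]).eval()
for p in extractor.parameters():
    p.requires_grad = False
```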
Figure 4.
Prediction performance results of the models where transporter is the target family and one of the other five families is the source family in each panel: The average test MCC values of the reference model (FNN-2-Chemprop trained from scratch), the base model (an SVM trained from scratch) and the three modes of transfer learning. The results are given for four different cases (i.e., target training dataset sizes).
Figure 5.
Prediction performance results of the models where nuclear receptor is the target family and one of the other five families is used as the source family in each panel: The average test MCC values of the reference model (FNN-2-Chemprop trained from scratch), the base model (an SVM trained from scratch) and the three modes of transfer learning. The results are given for four different cases (i.e., target training dataset sizes).
Figure 6.
Prediction performance results of the models where transporter is the target family and kinase is the source family: The average test MCC values of the reference model (FNN-2-Chemprop trained from scratch), the base model (an SVM trained from scratch) and the three modes of transfer learning. The results are given for eight different cases (i.e., target training dataset sizes).
Figure 7.
Prediction performance results of the models where nuclear receptor is the target family and kinase is the source family: The average test MCC values of the reference model (FNN-2-Chemprop trained from scratch), the base model (an SVM trained from scratch) and the three modes of transfer learning. The results are given for eight different cases (i.e., target training dataset sizes).
Figure 8.
When transfer learning is used, the target (fine-tuned) models start training from lower loss values than the reference model (trained from scratch) and converge after a significantly smaller number of epochs.
Figure 9.
Prediction performance results of the models where the source models are used directly for inference on the target data, i.e. without transfer learning (no further training is applied to the pre-trained source model): the average test MCC values of the reference model (FNN-2-Chemprop trained from scratch) when transporter and nuclear receptor are the target families, respectively.

