RSC Adv. 2020 Jun 1;10(35):20701-20712.
doi: 10.1039/d0ra02297g. eCollection 2020 May 27.

Drug-target affinity prediction using graph neural network and contact maps

Mingjian Jiang et al. RSC Adv.

Abstract

Computer-aided drug design, which uses high-performance computers to simulate the tasks of drug design, is a promising research area. Drug-target affinity (DTA) prediction is the most important step of computer-aided drug design and can speed up drug development and reduce resource consumption. With the development of deep learning, applying it to DTA prediction and improving prediction accuracy have become a focus of research. In this paper, two graphs are built from the structural information of drug molecules and proteins, respectively. Graph neural networks are introduced to obtain their representations, and a method called DGraphDTA is proposed for DTA prediction. Specifically, the protein graph is constructed from the contact map produced by a contact prediction method, which predicts the structural characteristics of the protein from its sequence. Tests with various metrics on benchmark datasets show that the proposed method has strong robustness and generalizability.

Conflict of interest statement

There are no conflicts to declare.

Figures

Fig. 1. The architecture of DGraphDTA. The SMILES string of the drug is used to construct the molecule, from which the molecular graph is built. For the protein, a contact map is constructed from the protein sequence, and the protein graph is built from it. The two graphs are then fed into two GNNs to extract their representations. Finally, the representations are concatenated for affinity prediction.
Fig. 2. Construction of the molecular graph. The SMILES string of the drug molecule is taken as input, and the molecular graph is constructed with atoms as nodes and bonds as edges; the corresponding adjacency matrix is then generated. So that the convolution also covers each atom itself, self-loops are added, that is, the diagonal of the adjacency matrix is set to 1.
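
The adjacency construction described in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes the atoms and bonds have already been extracted from the SMILES string (the parsing step is not shown), and the function name `adjacency_with_self_loops` is hypothetical, not taken from the paper.

```python
def adjacency_with_self_loops(num_atoms, bonds):
    """Build a symmetric 0/1 adjacency matrix and set its diagonal to 1.

    num_atoms: number of nodes (atoms) in the molecular graph.
    bonds: iterable of (i, j) index pairs, one per bond (edge).
    Assumption: atoms and bonds were already parsed from the SMILES string.
    """
    adj = [[0] * num_atoms for _ in range(num_atoms)]
    for i, j in bonds:
        # Bonds are undirected, so the matrix is filled symmetrically.
        adj[i][j] = 1
        adj[j][i] = 1
    for i in range(num_atoms):
        # Self-loop: each atom is treated as its own neighbour.
        adj[i][i] = 1
    return adj


# Ethanol, SMILES "CCO": heavy atoms C0, C1, O2 with bonds C0-C1 and C1-O2.
ethanol_adj = adjacency_with_self_loops(3, [(0, 1), (1, 2)])
for row in ethanol_adj:
    print(row)
```

Hydrogens are left implicit in this toy example; whether they appear as nodes depends on how the molecule is parsed.
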
Fig. 3. Construction of the protein graph. The protein sequence is preprocessed first, the contact map is then predicted by Pconsc4, and the adjacency matrix of the protein graph is obtained by filtering the contact map with a threshold of 0.5.
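
The thresholding step in the Fig. 3 caption can be sketched as follows. This is an illustration only: the contact probabilities below are made-up values rather than Pconsc4 output, the function name is hypothetical, and treating a value exactly equal to 0.5 as a contact is an assumption, since the caption gives only the threshold.

```python
def contact_map_to_adjacency(contact_probs, threshold=0.5):
    """Turn a residue-residue contact probability map into a 0/1 adjacency matrix.

    contact_probs: square list of lists with values in [0, 1].
    threshold: pairs with probability >= threshold become edges
               (the >= comparison is an assumption of this sketch).
    """
    return [[1 if p >= threshold else 0 for p in row] for row in contact_probs]


# Toy 3-residue contact map with invented probabilities.
toy_probs = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
protein_adj = contact_map_to_adjacency(toy_probs)
for row in protein_adj:
    print(row)
```
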
Fig. 4. Protein processing, including sequence pre-processing, graph construction and feature generation. The results of protein sequence alignment and filtering are fed into Pconsc4 for contact map prediction. After further format conversion, the filtered results are used for PSSM (position-specific scoring matrix) calculation.
Fig. 5. The network of DGraphDTA. The graphs of molecule and protein pass through two GNNs to get their representations. Then the affinity can be predicted after multiple fully connected layers.
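
The data flow in the Fig. 5 caption (two graph encoders, then a prediction head) might be sketched as below. This is a deliberately simplified illustration, not the authors' network: the mean-aggregation layer, the mean pooling, the single linear output layer (the caption mentions multiple fully connected layers) and all weights are assumptions chosen to keep the example small and dependency-free.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_conv(adj, feats, weights):
    """One mean-aggregation graph convolution: ReLU(D^-1 A H W).

    adj must contain self-loops so that every node has degree >= 1.
    """
    aggregated = []
    for row in adj:
        degree = sum(row)
        summed = [sum(a * f for a, f in zip(row, col)) for col in zip(*feats)]
        aggregated.append([s / degree for s in summed])
    return [[max(0.0, v) for v in r] for r in matmul(aggregated, weights)]


def mean_pool(node_feats):
    """Average node features into a single graph-level vector."""
    n = len(node_feats)
    return [sum(col) / n for col in zip(*node_feats)]


def predict_affinity(mol_graph, prot_graph, w_mol, w_prot, w_out):
    """Encode both graphs, concatenate the vectors, apply a linear head."""
    mol_vec = mean_pool(graph_conv(mol_graph[0], mol_graph[1], w_mol))
    prot_vec = mean_pool(graph_conv(prot_graph[0], prot_graph[1], w_prot))
    joint = mol_vec + prot_vec  # list concatenation of the two representations
    return sum(x * w for x, w in zip(joint, w_out))


# Toy inputs: (adjacency with self-loops, node feature matrix), invented values.
toy_mol = ([[1, 1], [1, 1]], [[1.0], [3.0]])
toy_prot = ([[1]], [[4.0]])
score = predict_affinity(toy_mol, toy_prot,
                         w_mol=[[1.0, -1.0]], w_prot=[[0.5]],
                         w_out=[1.0, 1.0, 1.0])
print(score)
```

In a real model the weights would be learned and the layers stacked; the sketch only shows how the two graph representations meet before the affinity is predicted.
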
Fig. 6. Performance of the GNN used to describe the protein under various dropout probabilities. (a) The CI (concordance index) scores of the 5-fold validation results. (b) The MSE scores of the 5-fold validation results. (c) The Pearson correlation coefficient of the 5-fold validation results.
Fig. 7. Performance of the GNN used to describe the protein with various pooling methods. (a) The CI scores of the 5-fold validation results. (b) The MSE scores of the 5-fold validation results. (c) The Pearson correlation coefficient of the 5-fold validation results.
Fig. 8. Performance of the GNN used to describe the protein with and without PSSM. (a) The CI scores of the 5-fold validation results. (b) The MSE scores of the 5-fold validation results. (c) The Pearson correlation coefficient of the 5-fold validation results.
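
The three metrics reported in Figs. 6 to 8 can be computed as in the sketch below. It uses the usual textbook definitions; in particular, the concordance index here is the common pairwise form that skips pairs with equal true affinity and counts prediction ties as half, which is an assumption because this page does not give the paper's exact formula. The affinity values are invented.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(y_true, y_pred):
    """Pearson correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are not comparable
            comparable += 1
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Invented affinities for four drug-target pairs.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 6.5, 6.0, 8.0]
print(concordance_index(y_true, y_pred), mse(y_true, y_pred), pearson(y_true, y_pred))
```

Higher CI and Pearson values and lower MSE indicate better predictions.
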
