IEEE/ACM Trans Comput Biol Bioinform. 2021 Jul-Aug;18(4):1290-1298.
doi: 10.1109/TCBB.2021.3085972. Epub 2021 Aug 6.

LUNAR: Drug Screening for Novel Coronavirus Based on Representation Learning Graph Convolutional Network

Deshan Zhou et al. IEEE/ACM Trans Comput Biol Bioinform. 2021 Jul-Aug.

Abstract

An outbreak of COVID-19 that began in late 2019 was caused by a novel coronavirus (SARS-CoV-2) and has become a global pandemic. As of June 9, 2020, it has infected nearly 7 million people and killed more than 400,000, yet there is no specific drug for it. There is therefore an urgent need to find or develop drugs that suppress the virus. Here, we propose a new nonlinear end-to-end model called LUNAR. It uses graph convolutional neural networks to automatically learn the neighborhood information of complex heterogeneous relational networks, and it adds an attention mechanism that weights the summed neighborhood information of each type by its importance, yielding a representation of each node. Then, through a topology reconstruction process, the feature representations of drugs and targets are constrained to match the observed network as closely as possible. From this reconstruction we obtain the strength of the relationship between nodes and, based on the known targets of COVID-19, predict drug candidates that may be useful in treating COVID-19. These candidates can serve as a reference for experimental scientists and speed up drug development. LUNAR integrates the various kinds of topological information in heterogeneous networks, and its attention mechanism reflects the importance of neighborhood information from different types of nodes, which improves the interpretability of the model. Under 10-fold cross-validation, the model achieves an area under the receiver operating characteristic curve (AUC) of 0.949 and an area under the precision-recall curve (AUPR) of 0.866, indicating strong predictive performance. In addition, some of the drugs screened out by our model have appeared in clinical studies, which further supports the effectiveness of the model.
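The topology reconstruction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the bilinear form `drug_emb @ W @ target_emb.T`, the squared-error loss, and all names below (`reconstruct_scores`, `rank_candidates`, the toy matrices) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def reconstruct_scores(drug_emb, target_emb, W):
    """Assumed bilinear reconstruction: score[i, j] = drug_emb[i] @ W @ target_emb[j]."""
    return drug_emb @ W @ target_emb.T

def reconstruction_loss(scores, observed):
    """Squared error between reconstructed scores and the observed drug-target matrix."""
    return float(np.sum((scores - observed) ** 2))

def rank_candidates(scores, observed, target_idx):
    """Rank drugs with no known link to the given target, highest score first."""
    unknown = np.flatnonzero(observed[:, target_idx] == 0)
    order = np.argsort(-scores[unknown, target_idx])
    return unknown[order].tolist()

# Toy data (hypothetical): 5 drugs, 3 targets, 4-dimensional embeddings.
rng = np.random.default_rng(0)
drug_emb = rng.normal(size=(5, 4))
target_emb = rng.normal(size=(3, 4))
W = np.eye(4)                      # placeholder for a learned projection
observed = np.zeros((5, 3))
observed[0, 1] = 1.0               # one known drug-target interaction

scores = reconstruct_scores(drug_emb, target_emb, W)
loss = reconstruction_loss(scores, observed)
candidates = rank_candidates(scores, observed, target_idx=1)
```

In training, the embeddings and `W` would be adjusted to reduce `loss`; after training, `candidates` lists the drugs not yet linked to the chosen target, ordered by predicted relationship strength.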


Figures

Fig. 1.
The LUNAR model. A. The first three layers, illustrated here for learning a drug embedding, use a graph convolutional neural network to aggregate the information of a node and of the other nodes in its neighborhood. Different edge types are shown in different colors. The embedding learned in one layer serves as the node representation for the next layer. In the third layer, which produces the final node representation, the model combines the graph convolutional neural network with an attention mechanism: attention is applied when the neighborhood information of different edge types is aggregated, so that each edge type influences the node embedding to a different degree, which makes the embedding more interpretable. Once the embeddings of all nodes are obtained, network topology reconstruction yields the repositioning network, in which solid lines are connections present in the original network, dashed lines are new connections, and line thickness represents relationship strength. B. Schematic of a graph convolutional neural network layer. First, information from edges of the same type is aggregated to obtain the sum of neighborhood information for each edge type. These sums are then added to the mapped embedding of the node itself. C. Schematic of the combination of graph convolutional neural network and attention mechanism. Attention is applied to the summed neighborhood information of each edge type to obtain attention coefficients, and the product of the attention coefficient and the original summed neighborhood information is aggregated as the new neighborhood information for that edge type.
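Panels B and C can be sketched in a few lines of NumPy. This is a rough illustration under stated assumptions: the caption does not specify how the attention coefficients are computed, so the sigmoid gate, the ReLU activation, the weight matrices, and every name below are hypothetical. The only properties taken from the captions are that neighborhood information is summed per edge type, added to the mapped self-embedding, and (in panel C) multiplied by a per-dimension attention coefficient first.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neighborhood_sums(adjacency, neighbor_feats, weights):
    """Per edge type r: sum the mapped features of each node's neighbors (A_r @ X_r @ W_r)."""
    return {r: adjacency[r] @ neighbor_feats[r] @ weights[r] for r in adjacency}

def gcn_layer(self_feats, self_weight, sums):
    """Panel B: add every edge-type sum to the mapped self-embedding."""
    out = self_feats @ self_weight
    for s in sums.values():
        out = out + s
    return np.maximum(out, 0.0)  # ReLU is an assumed activation

def attention_gcn_layer(self_feats, self_weight, sums, att_weights):
    """Panel C: scale each edge-type sum by per-dimension attention coefficients."""
    out = self_feats @ self_weight
    coeffs = {}
    for r, s in sums.items():
        coeffs[r] = sigmoid(s @ att_weights[r])  # assumed gating form, one value per dimension
        out = out + coeffs[r] * s
    return np.maximum(out, 0.0), coeffs

# Toy heterogeneous network (hypothetical): 3 drugs, 2 targets, 4-dimensional features.
rng = np.random.default_rng(0)
k = 4
drug_feats = rng.normal(size=(3, k))
target_feats = rng.normal(size=(2, k))
adjacency = {
    "drug-drug": np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float),
    "drug-target": np.array([[1, 0], [0, 1], [0, 0]], dtype=float),
}
neighbor_feats = {"drug-drug": drug_feats, "drug-target": target_feats}
weights = {r: rng.normal(size=(k, k)) for r in adjacency}
att_weights = {r: rng.normal(size=(k, k)) for r in adjacency}
self_weight = rng.normal(size=(k, k))

sums = neighborhood_sums(adjacency, neighbor_feats, weights)
hidden = gcn_layer(drug_feats, self_weight, sums)
final, coeffs = attention_gcn_layer(drug_feats, self_weight, sums, att_weights)
```

Each entry of `coeffs` has one row per drug and one column per embedding dimension, which is the kind of per-edge-type, per-dimension coefficient that an attention heat map would display.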
Fig. 2.
In the left subgraph, where all unknown pairs are treated as negative examples, LUNAR outperforms the other methods, with a 1.8 percent improvement in AUPR over the second-best method. In the right subgraph, negative examples are randomly sampled from the unknown pairs of the drug-target interaction network so that the number of negative examples is ten times the number of positive examples. Here the AUPR of our model is 27.3 percent higher than in the left subgraph, reaching 86.6 percent, which is 8 percent higher than DTINet and close to that of the NeoDTI model.
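The sampled-negative setting and the AUPR metric can be sketched as follows. This is a stdlib-only illustration, not the authors' evaluation code: the function names, the toy network size, and the use of average precision as the AUPR estimate (with ties in scores ignored) are assumptions.

```python
import random

def sample_negatives(positives, n_drugs, n_targets, ratio=10, seed=0):
    """Randomly pick unknown drug-target pairs as negatives, `ratio` per positive."""
    rng = random.Random(seed)
    known = set(positives)
    unknown = [(d, t) for d in range(n_drugs) for t in range(n_targets)
               if (d, t) not in known]
    n_samples = min(len(unknown), ratio * len(known))
    return rng.sample(unknown, n_samples)

def average_precision(labels, scores):
    """Estimate AUPR as the mean precision at the rank of each positive example."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Toy usage (hypothetical): 2 known interactions in a 5 x 5 drug-target grid.
positives = [(0, 0), (1, 1)]
negatives = sample_negatives(positives, n_drugs=5, n_targets=5, ratio=10)
example_ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

With fewer negatives in the evaluation set, a given ranking generally yields a higher precision at each positive, which is consistent with the AUPR being higher in the sampled setting than when all unknown pairs count as negatives.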
Fig. 3.
Attention coefficients for the summed neighborhood information of each edge type when learning the embedding of one drug. The embedding has 1024 dimensions, and each dimension has its own attention coefficient. The darker the color, the greater the influence. Reading from left to right, the attention coefficients of a given edge type vary little across dimensions. Reading from top to bottom, the attention coefficients of different edge types differ clearly in color and form distinct levels. This shows that the summed neighborhood information of different edge types influences the node embedding to different degrees overall. For this drug node, the summed neighborhood information from drug-drug interactions has the greatest influence on its embedding.
Fig. 4.
Changes in the AUC and AUPR of the model when the attention mechanism is removed from LUNAR, evaluated with 10-fold cross-validation at a negative-to-positive ratio of 10:1.
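For readers unfamiliar with the protocol, a 10-fold split can be sketched as below. This is a generic illustration with hypothetical names (`k_fold_splits`, a toy set of 25 samples); the source does not state how the folds were constructed, for example whether they were stratified by label.

```python
import random

def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train, test) index lists; each item lands in exactly one test fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Toy usage: 25 labeled drug-target pairs split into 10 folds.
splits = list(k_fold_splits(25, k=10))
```

A model variant (with or without attention) would be trained on each `train` list and scored on the matching `test` list, and the reported AUC and AUPR would be summarized across the ten folds.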
Fig. 5.
Comparison of model prediction performance for different training-set sizes.
