Mol Inform. 2025 Jan;44(1):e202400227.
doi: 10.1002/minf.202400227.

Improving Molecular Design with Direct Inverse Analysis of QSAR/QSPR Model

Yuto Shino et al. Mol Inform. 2025 Jan.

Abstract

Recent advances in machine learning have significantly impacted molecular design, notably through a molecular generation method that combines the chemical variational autoencoder (VAE) with Gaussian mixture regression (GMR). In this method, a mathematical model is constructed with X as the latent variables of the molecule and Y as the target properties and activities. Direct inverse analysis of this model makes it possible to generate molecules with the desired target properties. However, this approach outputs many strings that do not follow the simplified molecular input line entry system (SMILES) grammar, and it generates unrealistic chemical structures whose properties and activities do not satisfy the target values. In this study, we focus on a hierarchical VAE that uses molecular graphs to address these issues. We confirm that the combination of hierarchical VAE and GMR does not generate invalid outputs and returns molecules that simultaneously satisfy multiple target values. Moreover, we use this method to identify several molecules that are predicted to exhibit activity against drug targets.
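The direct inverse analysis described in the abstract can be illustrated with a minimal sketch of Gaussian mixture regression conditioning. This is not the authors' implementation. It assumes a joint Gaussian mixture over [X, Y] has already been fitted, and it computes the expected latent vector X given a target Y. The function names (`gaussian_pdf`, `gmr_inverse`) and the toy mixture parameters are hypothetical, chosen only for illustration. In the method described above, the resulting latent vector would then be passed to the VAE decoder, which is omitted here.

```python
import numpy as np


def gaussian_pdf(y, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at y."""
    d = y.shape[0]
    diff = y - mean
    solved = np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ solved) / norm)


def gmr_inverse(weights, means, covs, y_target, x_dim):
    """Estimate E[x | y = y_target] from a joint Gaussian mixture over [x, y].

    weights: (K,), means: (K, x_dim + y_dim), covs: (K, D, D) with D = x_dim + y_dim.
    Returns the conditional mean of x and the per-component responsibilities.
    """
    y_target = np.asarray(y_target, dtype=float)
    cond_means = []
    resp = []
    for w, mu, cov in zip(weights, means, covs):
        mu_x, mu_y = mu[:x_dim], mu[x_dim:]
        s_xy = cov[:x_dim, x_dim:]
        s_yy = cov[x_dim:, x_dim:]
        # Conditional mean of x given y for this component.
        cond_means.append(mu_x + s_xy @ np.linalg.solve(s_yy, y_target - mu_y))
        # Unnormalized responsibility: prior weight times marginal density of y.
        resp.append(w * gaussian_pdf(y_target, mu_y, s_yy))
    resp = np.array(resp)
    resp = resp / resp.sum()
    x_hat = resp @ np.array(cond_means)
    return x_hat, resp


# Toy example (hypothetical parameters): one latent dimension, one property.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.array([[[1.0, 0.8], [0.8, 1.0]], [[1.0, 0.8], [0.8, 1.0]]])
x_hat, resp = gmr_inverse(weights, means, covs, y_target=[5.0], x_dim=1)
```

Here the target value sits at the mean of the second component, so that component receives almost all of the responsibility and the estimated latent value lies close to its mean. Returning the responsibilities alongside the estimate makes it possible to sample from individual components instead of using only the mixture mean.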

Keywords: autoencoder; cheminformatics; drug design; machine learning; virtual screening.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Basic concept of direct inverse analysis through the integration of VAE and GMR.
Figure 2
Diagram of VAE training with joint property prediction.
Figure 3
(a) Actual vs. predicted logP for the test data. (b) Actual vs. predicted QED for the test data. (c) Actual vs. predicted SAS for the test data.
Figure 4
A molecule generated by the proposed method that achieves the target values.
Figure 5
(a) Crystal structure of target protein (DRD2). (b) Docking pose of interaction between DRD2 and risperidone. (c) Docking pose of interaction between DRD2 and the novel molecule with the highest binding affinity.
Figure 6
Binding affinity energies of generated molecules; black points indicate initial samples, blue points indicate generated samples, red points indicate samples meeting the target, grey area indicates target range (target value: −12.2).
