Improving Molecular Design with Direct Inverse Analysis of QSAR/QSPR Model
- PMID: 39797757
- PMCID: PMC11724648
- DOI: 10.1002/minf.202400227
Abstract
Recent advances in machine learning have significantly impacted molecular design, notably through a molecular generation method that combines the chemical variational autoencoder (VAE) with Gaussian mixture regression (GMR). In this method, a mathematical model is constructed with the latent variables of the molecule as X and the target properties and activities as Y. Through direct inverse analysis of this model, it is possible to generate molecules with the desired target properties. However, this approach outputs many strings that do not follow the simplified molecular-input line-entry system (SMILES) grammar, and it generates unrealistic chemical structures whose properties and activities do not satisfy the target values. In this study, we focus on a hierarchical VAE that uses molecular graphs to address these issues. We confirm that the combination of the hierarchical VAE and GMR does not generate invalid outputs and returns molecules that simultaneously satisfy multiple target values. Moreover, we use this method to identify several molecules that are predicted to exhibit activity against drug targets.
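The direct inverse analysis described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian mixture over the joint variable (x, y) and conditions it on a target y to obtain p(x | y), which is the standard GMR conditioning step. Everything concrete here is hypothetical: the scalar x and y, the hand-set mixture parameters in `toy_gmm`, and the names `Component`, `inverse_gmr`, and `expected_x` are illustration only. In the study, X is a multi-dimensional latent vector from the hierarchical VAE and the mixture would be fitted to data, not written by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """One Gaussian component of a joint mixture over (x, y); all values are toy assumptions."""
    weight: float   # mixing coefficient
    mean_x: float   # mean of the latent variable x
    mean_y: float   # mean of the target property y
    var_x: float    # variance of x
    var_y: float    # variance of y
    cov_xy: float   # covariance between x and y


def normal_pdf(value, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (value - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def inverse_gmr(components, y_target):
    """Condition the joint mixture on y = y_target.

    Returns the responsibilities, conditional means, and conditional
    variances that define the mixture p(x | y = y_target).
    """
    likelihoods = [c.weight * normal_pdf(y_target, c.mean_y, c.var_y) for c in components]
    total = sum(likelihoods)
    responsibilities = [value / total for value in likelihoods]
    cond_means = [c.mean_x + c.cov_xy / c.var_y * (y_target - c.mean_y) for c in components]
    cond_vars = [c.var_x - c.cov_xy ** 2 / c.var_y for c in components]
    return responsibilities, cond_means, cond_vars


def expected_x(components, y_target):
    """Responsibility-weighted mean of x given the target y."""
    responsibilities, cond_means, _ = inverse_gmr(components, y_target)
    return sum(r * m for r, m in zip(responsibilities, cond_means))


# Hand-set two-component mixture, used only for illustration.
toy_gmm = [
    Component(weight=0.5, mean_x=-2.0, mean_y=1.0, var_x=1.0, var_y=1.0, cov_xy=0.8),
    Component(weight=0.5, mean_x=2.0, mean_y=5.0, var_x=1.0, var_y=1.0, cov_xy=0.5),
]

if __name__ == "__main__":
    print(expected_x(toy_gmm, 5.0))
```

In a full pipeline, the latent value obtained this way (or samples drawn from the conditional mixture) would be passed to the VAE decoder to produce candidate structures; that decoding step is outside this sketch.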
Keywords: autoencoder; cheminformatics; drug design; machine learning; virtual screening.
© 2025 The Author(s). Molecular Informatics published by Wiley-VCH GmbH.
Conflict of interest statement
The author declares no conflicts of interest.