Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity
- PMID: 37973971
- PMCID: PMC10654724
- DOI: 10.1038/s42004-023-01054-6
Abstract
The structural diversity of chemical libraries, systematic collections of compounds with the potential to bind to biomolecules, can be represented by a chemical latent space. A chemical latent space is a projection of compound structures into a mathematical space based on several molecular features; it expresses the structural diversity within a compound library, which makes it possible to explore a broader chemical space and to generate novel compound structures as drug candidates. In this study, we developed a deep-learning method called NP-VAE (Natural Product-oriented Variational Autoencoder), based on the variational autoencoder, for handling hard-to-analyze datasets from DrugBank and large molecular structures such as natural compounds with chirality, an essential factor in the 3D complexity of compounds. NP-VAE succeeded in constructing a chemical latent space from large compounds that existing methods could not handle, achieved higher reconstruction accuracy, and performed stably as a generative model across various indices. Furthermore, by exploring the acquired latent space, we comprehensively analyzed a compound library containing natural compounds and generated novel compound structures with optimized functions.
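To make the idea of a latent space concrete, the following is a minimal sketch of the generic variational autoencoder machinery that methods of this kind build on. It is not NP-VAE's implementation: the abstract does not describe the encoder or decoder, so the function names (`reparameterize`, `kl_to_standard_normal`, `interpolate`) and the toy vectors are illustrative assumptions. It shows how a latent point is sampled from an encoder's mean and log-variance, how the closed-form KL regularizer is computed, and how one might step between two latent points when exploring the space.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per dimension.

    mu and log_var stand in for the outputs of a (hypothetical) encoder.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent points, t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed, for repeatability only
    mu, log_var = [0.5, -1.0], [0.0, -2.0]   # toy encoder outputs (assumed)
    z = reparameterize(mu, log_var, rng)
    print("sampled latent point:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
    print("midpoint:", interpolate([0.0, 2.0], [2.0, 4.0], 0.5))
```

In a full model, each sampled or interpolated latent point would be passed to a decoder that produces a compound structure; that decoding step is specific to the method and is deliberately left out here.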
© 2023. The Author(s).
Conflict of interest statement
The authors declare no competing interests.
Similar articles
- Predicting drug polypharmacology from cell morphology readouts using variational autoencoder latent space arithmetic. PLoS Comput Biol. 2022 Feb 25;18(2):e1009888. doi: 10.1371/journal.pcbi.1009888. eCollection 2022 Feb. PMID: 35213530. Free PMC article.
- Interpretable Machine Learning Models for Molecular Design of Tyrosine Kinase Inhibitors Using Variational Autoencoders and Perturbation-Based Approach of Chemical Space Exploration. Int J Mol Sci. 2022 Sep 24;23(19):11262. doi: 10.3390/ijms231911262. PMID: 36232566. Free PMC article.
- Deep Clustering Analysis via Dual Variational Autoencoder With Spherical Latent Embeddings. IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):6303-6312. doi: 10.1109/TNNLS.2021.3135460. Epub 2023 Sep 1. PMID: 34941534.
- Deep Generative Models for Molecular Science. Mol Inform. 2018 Jan;37(1-2). doi: 10.1002/minf.201700133. Epub 2018 Feb 6. PMID: 29405647. Review.
- Exploring Low-Toxicity Chemical Space with Deep Learning for Molecular Generation. J Chem Inf Model. 2022 Jul 11;62(13):3191-3199. doi: 10.1021/acs.jcim.2c00671. Epub 2022 Jun 17. PMID: 35713712. Review.
Cited by
- Improving Molecular Design with Direct Inverse Analysis of QSAR/QSPR Model. Mol Inform. 2025 Jan;44(1):e202400227. doi: 10.1002/minf.202400227. PMID: 39797757. Free PMC article.
- Leveraging tree-transformer VAE with fragment tokenization for high-performance large chemical model generation. Commun Chem. 2025 Aug 5;8(1):228. doi: 10.1038/s42004-025-01640-w. PMID: 40764746. Free PMC article.
- Multi-Objective Design of DNA-Stabilized Nanoclusters Using Variational Autoencoders With Automatic Feature Extraction. ACS Nano. 2024 Oct 1;18(39):26997-27008. doi: 10.1021/acsnano.4c09640. Epub 2024 Sep 17. PMID: 39288200. Free PMC article.
- Integrate & balance aspects for safe and sustainable innovation: Needs analysis on SSbD categories and product development stage requirements to cover the entire life cycle. Comput Struct Biotechnol J. 2025 Jul 17;29:201-221. doi: 10.1016/j.csbj.2025.07.030. eCollection 2025. PMID: 40740296. Free PMC article. Review.
- Chemical language modeling with structured state space sequence models. Nat Commun. 2024 Jul 22;15(1):6176. doi: 10.1038/s41467-024-50469-9. PMID: 39039051. Free PMC article.
Grants and funding
- 22H04901/Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 17H06410/Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04885/Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04880/Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04881/Ministry of Education, Culture, Sports, Science and Technology (MEXT)