Implicitly perturbed Hamiltonian as a class of versatile and general-purpose molecular representations for machine learning
- PMID: 35273170
- PMCID: PMC8913769
- DOI: 10.1038/s41467-022-28912-6
Abstract
Machine learning has recently become a popular approach to challenging problems in many scientific disciplines. In developing rigorous machine-learning models for the molecular sciences, translating molecular structures into quantitative representations suitable as machine-learning inputs plays a central role. Many existing molecular representations, including state-of-the-art ones, are efficient for studying numerous molecular features but remain suboptimal in many challenging cases, as discussed in the present research. The main aim of this study is to introduce the Implicitly Perturbed Hamiltonian (ImPerHam) as a class of versatile representations for more efficient machine learning of challenging problems in molecular sciences. ImPerHam representations are defined as energy attributes of the molecular Hamiltonian, implicitly perturbed by a number of hypothetical or real arbitrary solvents based on continuum solvation models. We demonstrate the outstanding performance of machine-learning models based on ImPerHam representations in three diverse and challenging cases: predicting inhibition of the CYP450 enzyme, high-precision and transferable evaluation of the non-covalent interaction energy of molecular systems, and accurate reproduction of solvation free energies for large benchmark sets.
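The idea of a representation built from energies under several implicit solvents can be illustrated with a minimal sketch. It is not the authors' implementation. As assumptions made only for illustration, each solvent is characterized solely by a dielectric constant, and the classical Born model of a charged sphere stands in for a real continuum-solvation calculation on a molecular Hamiltonian. The names `born_solvation_energy` and `solvent_perturbed_features` are hypothetical.

```python
from typing import Callable, List, Sequence

# Coulomb constant in kcal*Angstrom/(mol*e^2), as commonly used in chemistry codes.
COULOMB_KCAL = 332.06371


def born_solvation_energy(charge: float, radius: float, epsilon: float) -> float:
    """Born-model solvation energy (kcal/mol) of a charged sphere.

    Toy stand-in (an assumption) for a continuum-solvation energy;
    charge in units of e, radius in Angstrom, epsilon = dielectric constant.
    """
    return -COULOMB_KCAL * charge ** 2 / (2.0 * radius) * (1.0 - 1.0 / epsilon)


def solvent_perturbed_features(
    energy_fn: Callable[[float], float],
    dielectrics: Sequence[float],
) -> List[float]:
    """Evaluate one energy attribute under each solvent and collect the
    values into a fixed-length feature vector for a machine-learning model."""
    return [energy_fn(eps) for eps in dielectrics]


# Hypothetical grid of solvents, from vacuum-like (1.0) to water-like (78.4).
dielectrics = [1.0, 2.0, 4.0, 10.0, 78.4]

# Feature vector for a unit charge in a sphere of radius 2 Angstrom.
features = solvent_perturbed_features(
    lambda eps: born_solvation_energy(1.0, 2.0, eps), dielectrics
)
```

Each entry of `features` is the same energy attribute evaluated in a different solvent, so the vector records how the system responds to a range of implicit perturbations. In a realistic setting the toy energy function would be replaced by a quantum-chemical calculation with a continuum solvation model.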
© 2022. The Author(s).
Conflict of interest statement
The authors declare no competing interests.