Nat Commun. 2022 Mar 10;13(1):1245.
doi: 10.1038/s41467-022-28912-6.

Implicitly perturbed Hamiltonian as a class of versatile and general-purpose molecular representations for machine learning

Amin Alibakhshi et al. Nat Commun. 2022.

Abstract

Solving challenging problems with machine learning has recently become a hot topic in many scientific disciplines. In developing rigorous machine-learning models for problems in the molecular sciences, translating molecular structures into quantitative representations suitable as machine-learning inputs plays a central role. Many existing molecular representations, including state-of-the-art ones, are efficient for studying numerous molecular features but remain suboptimal in many challenging cases, as discussed in the present research. The main aim of the present study is to introduce the Implicitly Perturbed Hamiltonian (ImPerHam) as a class of versatile representations for more efficient machine learning of challenging problems in molecular sciences. ImPerHam representations are defined as energy attributes of the molecular Hamiltonian, implicitly perturbed by a number of hypothetical or real arbitrary solvents based on continuum solvation models. We demonstrate the outstanding performance of machine-learning models based on ImPerHam representations for three diverse and challenging cases: predicting inhibition of the CYP450 enzyme, high-precision and transferable evaluation of the non-covalent interaction energy of molecular systems, and accurate reproduction of solvation free energies for large benchmark sets.
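
As a rough illustration of describing one solute by its energies in several implicit solvents, the sketch below uses the textbook Born model for a spherical ion as a stand-in for a continuum solvation calculation. It is not the ImPerHam procedure from the paper: the function names, the solvent labels, and the dielectric constants are all assumptions made only for this example.

```python
from typing import List, Tuple

def born_solvation_energy(charge: float, radius_bohr: float, epsilon: float) -> float:
    """Born-model solvation free energy in hartree (atomic units).

    dG = -(q^2 / (2 a)) * (1 - 1/epsilon), for a spherical ion of charge q
    and radius a embedded in a dielectric continuum.
    """
    if radius_bohr <= 0.0:
        raise ValueError("radius must be positive")
    if epsilon < 1.0:
        raise ValueError("dielectric constant must be >= 1")
    return -(charge ** 2) / (2.0 * radius_bohr) * (1.0 - 1.0 / epsilon)

# Hypothetical solvents: arbitrary labels and dielectric constants (assumed values).
SOLVENTS: List[Tuple[str, float]] = [
    ("vacuum_like", 1.0),
    ("low_polarity", 2.0),
    ("mid_polarity", 10.0),
    ("high_polarity", 80.0),
]

def solvent_feature_vector(charge: float, radius_bohr: float,
                           solvents: List[Tuple[str, float]]) -> List[float]:
    """One energy per solvent, in a fixed order, usable as a model input."""
    return [born_solvation_energy(charge, radius_bohr, eps) for _, eps in solvents]

# Toy solute: unit charge, radius of 2 bohr (illustrative values only).
features = solvent_feature_vector(1.0, 2.0, SOLVENTS)
```

Each entry of `features` is the response of the same solute to a different medium, so the vector has a fixed length regardless of the solute's size. A real workflow would replace the Born formula with energies from an actual continuum solvation calculation.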

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Structure of human microsomal CYP450 1A2 enzyme.
Evaluating the possibility of inhibiting this enzyme by drug candidates is one of the early steps in drug design.
Fig. 2. Comparison of reference CCSD(T) and evaluated energies.
A comparison of the reference energies with energies evaluated by machine learning based on ImPerHam representations (a) and computed by the GFN2-xTB method (b) shows that the machine-learning-based method agrees remarkably better with the reference data.
Fig. 3. Comparison of reference CCSD(T) energies and predicted energies in different conformers.
For the three dimers (a), (b), and (c) that showed the highest variability in energy range among conformers, employing machine learning with ImPerHam representations remarkably improves the agreement between predicted and reference data.
Fig. 4. Comparison of predicted and reference solvation free energies.
The solvation free energies predicted via ML (a) are in better agreement with the reference data than those from the SMD method (b).
Fig. 5. Percentage of selected models in which each studied medium is present, for the different applications considered.
Perturbing the Hamiltonian with different solvents might affect predictability differently depending on the property of interest.
