Review
Chem Rev. 2025 Jun 25;125(12):5776-5829.
doi: 10.1021/acs.chemrev.4c00893. Epub 2025 Jun 9.

Studying Noncovalent Interactions in Molecular Systems with Machine Learning

Serhii Tretiakov et al. Chem Rev. 2025.

Abstract

Noncovalent interactions (NCIs) is an umbrella term for a multitude of typically weak interactions within and between molecules. Despite the low individual energy contributions, their collective effect significantly influences molecular behavior. Accordingly, understanding these interactions is crucial across fields like catalysis, drug design, materials science, and environmental chemistry. However, predicting NCIs is challenging, requiring at least molecular mechanics-level pairwise energy contributions or efficient quantum mechanical electron correlation treatment. In this review, we investigate the application of machine learning (ML) to study NCIs in molecular systems, an emerging research field. ML excels at modeling complex nonlinear relationships, and is capable of integrating vast data sets from experimental and theoretical sources. It offers a powerful approach for analyzing interactions across scales, from small molecules to large biomolecular assemblies. Specifically, we examine data sets characterizing NCIs, compare molecular featurization techniques, assess ML models predicting NCIs explicitly, and explore inverse design approaches. ML enhances predictive accuracy, reduces computational costs, and reveals overlooked interaction patterns. By identifying current challenges and future opportunities, we highlight how ML-driven insights could revolutionize this field. Overall, we believe that recent proof-of-concept studies foreshadow exciting developments for the study of NCIs in the years to come.

Figures

1
A) Components of the van der Waals force. B) Examples of hydrogen bonding. C, D) Molecular electrostatic potential surfaces, featuring σ- and π-holes, respectively. Red and blue areas represent negative and positive regions, respectively. σ-Holes are shown for halogen, chalcogen, pnictogen and tetrel fluorides, calculated at the MP2/aug-cc-pVTZ level of theory; potential values of the σ-holes are given in kJ mol⁻¹. π-Holes (labeled with an asterisk) are given for carbonyl fluoride (M06-2X/aug-cc-pVDZ level of theory) and hexafluorobenzene. For the latter, the highest occupied molecular orbital (HOMO) is also shown. Subfigure C was adapted with permission from the literature. Copyright 2015 Wiley. Left-hand structure in Subfigure D was adapted with permission from the literature. Copyright 2021 American Chemical Society.
2
A, B) Various π-interactions, including different modes of π-π stacking; C) X-ray crystal structure of [V15A]crambin (PDB accession code: 2FD7) and the chemical structure of the salt bridge between the δ-guanidinium group of Arg10 and the α-carboxylate of the C-terminal Asn46. The protein structure is displayed in a ribbon representation, with the salt bridge highlighted in a stick representation.
3
Model optimization workflow and model application in supervised learning.
4
Neural network architectures used for molecular property prediction. A) Feedforward neural network: Molecular features (input) are processed through fully connected layers to predict molecular properties (output). B) Convolutional neural network: Molecular images are processed through convolution and pooling layers before reaching the fully connected layer providing the output. C) Graph neural network: Encodes molecules as graphs with atom and bond features. Graph layers update feature representations, and the pooled graph representation is fed to the fully connected layer, providing the output.
5
Schematic graph representation within InteractionNet (a) and its architecture (b) for predicting ligand dissociation constants. Both covalent and noncovalent interactions are encoded into the corresponding adjacency matrices, which are then learned by InteractionNet to yield the dissociation constants.
6
Graph Convolutional Neural Network (GCNN) framework for drug-target interaction prediction by Torng and Altman. The framework represents protein pockets and small molecules as graphs to predict binding interactions. In Step I, an unsupervised deep graph autoencoder learns fixed-size latent embeddings from protein binding sites, capturing general pocket features without geometric constraints. In Step II, two supervised GCNNs are trained separately on the protein pocket and ligand graphs, with binding classification guiding the training. The protein pocket graph represents residues as nodes with edges indicating spatial proximity, while the ligand is represented as a molecular graph. The figure was adapted with permission from the literature. Copyright 2019 American Chemical Society.
7
A) Example of a Michael addition. B) Example of a Diels–Alder reaction. Experimentally observed products are shown in blue. C) A workflow for deriving atom-contact vectors, which starts from (I) a pair of transition state-like complexes, followed by (II) distance-binning of contacts between specific atom types and (III) condensing each histogram into a vector, two per reaction. Subfigure C was adapted with permission from the literature. Copyright 2021 Wiley.
8
Intramolecular chalcogen interaction metric S.
9
Condensed π-interaction metrics (A) and example use cases (B–C).
10
Common steric parameters.
11
Parameters considered in the Sterimol featurization.
12
A: Comparison of the trends in experimental A-values and the corrected Sterimol parameter B1’. B: Correlation between the London dispersion component of the A-value and polarizability. Group 1 displays a linear relationship, whereas Groups 2 and 3 have A-values slightly smaller and larger compared to B1’, respectively, and show no dependence on polarizability. The figure was adapted with permission from the literature. Copyright 2021 American Chemical Society.
13
Correlation between the steric part of the computed A-values and the Sterimol parameters B1 and B5. The figure was adapted with permission from the literature. Copyright 2021 American Chemical Society.
14
Case study reaction of 1,1-diarylation of benzyl acrylates.
15
Wiberg bond orders for NCIs in selected biomacromolecules calculated at the ωB97X-D/def2-SVP level of theory and predicted by DelFTa (denoted as ML). Only interactions with Wiberg bond orders within 0.05–0.8 are shown. H atoms are shown in white, C atoms in gray, O atoms in red, and N atoms in blue. The figure was adapted with permission from the literature. Copyright 2022 Royal Society of Chemistry.
16
Feature vector in the VmaxPred model. + and – : the metrics for positive and negative charges; G: group electronegativity; Li: lone-pair electrostatic interaction measure; Dx, Dc: electrotopological state indices of the halogen (X) and its bound carbon atom (C); C/N/O/S: total Pauling electronegativity of atoms within the respective sphere; Le: lone-pair electron index. The figure was adapted with permission from the literature. Copyright 2019 American Chemical Society.
17
Structure of cucurbit[7]uril.
18
Feature visualization of TFRegNCI-3D compared to the NCIPLOT method and relevant MO isosurfaces. The figure was adapted with permission from the literature. Copyright 2023 American Chemical Society.
19
Two-stage workflow for the explicit design of small organic host molecules via a VAE for generating 3D electron densities.
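
The atom-contact workflow summarized in the caption of Figure 7C (distance-binning of contacts between specific atom types, then condensing the histograms into a vector) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the cited work: the function names `contact_histogram` and `atom_contact_vector`, the bin edges, the toy coordinates, and the choice to condense by simple concatenation. One such vector would be built per transition state-like complex, giving two per reaction.

```python
import numpy as np


def contact_histogram(coords_a, coords_b, bin_edges):
    """Count pairwise distances between two atom sets in each distance bin."""
    n_bins = len(bin_edges) - 1
    if len(coords_a) == 0 or len(coords_b) == 0:
        # No atoms of this type on one side: no contacts in any bin.
        return np.zeros(n_bins, dtype=int)
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    # Broadcast to all (i, j) pairs, then take Euclidean norms.
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).ravel()
    counts, _ = np.histogram(dists, bins=bin_edges)
    return counts


def atom_contact_vector(frag1, frag2, type_pairs, bin_edges):
    """Concatenate per-type-pair contact histograms into one feature vector.

    frag1, frag2: lists of (element, (x, y, z)) tuples for the two partners.
    type_pairs:   list of (element_in_frag1, element_in_frag2) tuples.
    Concatenation is an illustrative choice of "condensing" step.
    """
    def select(frag, element):
        return [xyz for el, xyz in frag if el == element]

    parts = [
        contact_histogram(select(frag1, a), select(frag2, b), bin_edges)
        for a, b in type_pairs
    ]
    return np.concatenate(parts)


# Toy complex (hypothetical coordinates in angstroms).
frag1 = [("O", (0.0, 0.0, 0.0)), ("C", (1.2, 0.0, 0.0))]
frag2 = [("H", (0.0, 0.0, 2.0)), ("H", (0.0, 0.0, 3.5))]
bin_edges = [0.0, 2.5, 4.0, 6.0]
type_pairs = [("O", "H"), ("C", "H")]

vec = atom_contact_vector(frag1, frag2, type_pairs, bin_edges)
print(vec)  # one block of three bin counts per atom-type pair
```

The resulting fixed-length vector does not depend on atom ordering within each fragment, which is one reason histogram-style contact features are convenient inputs for conventional ML models.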
