J Phys Chem A. 2021 Oct 14;125(40):8978-8986.
doi: 10.1021/acs.jpca.1c04462. Epub 2021 Oct 5.

Deep Learning Coordinate-Free Quantum Chemistry

Matthew K Matlock et al. J Phys Chem A.

Abstract

Computing quantum chemical properties of small molecules and polymers can provide valuable insights to physicists, chemists, and biologists when designing new materials, catalysts, biological probes, and drugs. Deep learning can compute quantum chemical properties accurately in a fraction of the time required by commonly used methods such as density functional theory. Most current approaches to deep learning in quantum chemistry begin with geometric information from experimentally derived molecular structures or pre-calculated atom coordinates. These approaches have many useful applications, but they can be costly in time and computational resources. In this study, we demonstrate that accurate quantum chemical computations can be performed without geometric information by operating in the coordinate-free domain using deep learning on graph encodings. Coordinate-free methods rely only on molecular graphs, that is, the connectivity of atoms and bonds, without atom coordinates or bond distances. We also find that the choice of graph-encoding architecture substantially affects the performance of these methods. The structures of these graph-encoding architectures provide an opportunity to probe an important, outstanding question in quantum mechanics: what types of quantum chemical properties can be represented by local variable models? We find that Wave, a local variable model, accurately calculates quantum chemical properties, while graph convolutional architectures require global variables. Furthermore, local variable Wave models outperform global variable graph convolution models on complex molecules with large, correlated systems.
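To make the idea of a coordinate-free input concrete, the following is a minimal sketch of a molecular graph that stores only atoms and bonds. The molecule (the heavy atoms of ethanol), the tuple layout, and the helper name `build_adjacency` are assumptions chosen for illustration; they are not the encoding used in the study.

```python
# Hypothetical toy encoding of ethanol (CH3-CH2-OH), heavy atoms only.
# Only element symbols and bond orders are stored: no coordinates, no distances.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)


def build_adjacency(n_atoms, bonds):
    """Return {atom index: [(neighbor index, bond order), ...]}."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b, order in bonds:
        # Bonds are undirected, so record each one from both ends.
        adjacency[a].append((b, order))
        adjacency[b].append((a, order))
    return adjacency


adjacency = build_adjacency(len(atoms), bonds)
for i, neighbors in adjacency.items():
    print(atoms[i], [(atoms[j], order) for j, order in neighbors])
```

Everything a coordinate-free model sees is contained in `atoms` and `bonds`; a coordinate-based model would additionally need a 3D position for every atom.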

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1.
Coordinate-free methods leverage deep learning to compute QC properties. (A) Many deep-learning-based QC calculation methods require coordinates from experiment or computational optimization, which can be time-consuming to obtain. Coordinate-free methods operate directly on structural formulas without the need for coordinates. (B) Many graph-based deep learning methods describe atoms by aggregating features from other atoms in their local environment. Atoms may also exchange information with global variables, as in message passing neural networks (MPNN-G). (C) The Wave deep learning architecture describes atoms based only on their ancestors as defined by a breadth-first search. Information is propagated in waves, forward and backward across a molecule. (D) Wave achieves better than chemical accuracy when predicting total energy on QM9, a standard benchmark dataset. This result is comparable to published coordinate-based methods. CF: coordinate-free; 3D: 3D coordinates used as input features; *: value from published results.
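Panel (C) describes a breadth-first ordering with forward and backward propagation. The sketch below illustrates only that ordering on a toy graph: atoms are grouped into breadth-first levels, then each atom sums states from the previous level (forward) or the next level (backward). The function names, the integer "states", and the plain summation are assumptions for illustration and do not reproduce the learned updates of the Wave architecture. A connected graph is assumed.

```python
from collections import deque


def bfs_levels(adjacency, root):
    """Group atom indices by breadth-first distance from the root atom."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in depth:
                depth[neighbor] = depth[atom] + 1
                queue.append(neighbor)
    levels = [[] for _ in range(max(depth.values()) + 1)]
    for atom, d in depth.items():
        levels[d].append(atom)
    return levels, depth


def wave_pass(adjacency, features, levels, depth, forward=True):
    """One toy pass: each atom adds the states of neighbors in the preceding wave."""
    state = dict(features)
    ordered = levels if forward else levels[::-1]
    step = -1 if forward else 1  # forward reads parents, backward reads children
    for level in ordered:
        for atom in level:
            sources = [n for n in adjacency[atom] if depth[n] == depth[atom] + step]
            state[atom] = features[atom] + sum(state[n] for n in sources)
    return state


# Hypothetical branched four-atom skeleton: atom 1 is bonded to atoms 0, 2, and 3.
graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
levels, depth = bfs_levels(graph, root=0)
forward_state = wave_pass(graph, {a: 1 for a in graph}, levels, depth, forward=True)
backward_state = wave_pass(graph, forward_state, levels, depth, forward=False)
print(levels, forward_state, backward_state)
```

After the forward pass each atom has accumulated information from its ancestors only; the backward pass then carries information from the leaves back toward the root, so every atom has been reached from both directions using purely atom-to-atom steps.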
Figure 2.
Wave enables more accurate coordinate-free calculation of QC properties across diverse molecules. (A) The PubChemQC and Oligomer datasets used in this study cover a substantially larger number of atom types and (B) include substantially larger molecules than QM9. (C) Wave was slightly more accurate than MPNN-G on the six QC properties included in this study. (D) Wave exhibited substantially lower error than MPNN-G when calculating total energy for molecules with large conjugated systems and also outperformed MPNN-G on large, flexible molecules. (E) Example molecules on which Wave achieves lower absolute error on total energy (kcal/mol) compared to MPNN-G with (e1) many atoms in conjugated systems and (e2) many rotatable bonds. (F) Matched pairs were selected by choosing topologically similar molecules from a large external validation set. (G) Coordinate-free methods exhibit a small increase in error when computing the difference in total energy between matched pairs of molecules. Wave slightly outperformed MPNN-G on this task. Statistical tests were performed by paired t-test. +: p < 0.1, *: p < 0.05, **: p < 0.01, and ***: p < 0.001.
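The caption reports paired t-tests on per-molecule errors. As a reminder of what that statistic is, here is a small sketch that computes it from two matched error lists. The numbers are made up purely for illustration and are not results from the study; the function name is also an assumption.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees of freedom) for matched samples of equal length."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd_diff = statistics.stdev(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Made-up absolute errors for two hypothetical models on the same four molecules.
errors_model_a = [1.0, 2.0, 3.0, 4.0]
errors_model_b = [0.5, 1.0, 2.5, 3.0]
t_value, dof = paired_t_statistic(errors_model_a, errors_model_b)
print(round(t_value, 3), dof)
```

Converting the statistic to a p-value requires the Student t distribution with `dof` degrees of freedom, which is omitted here to keep the sketch within the standard library.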
Figure 3.
Wave represents QC systems accurately with a local variable model, while convolution requires global variables. (A) Graph-based deep learning methods can be used to learn an atom-local variable model of quantum chemistry. This intermediate representation can then be decoded to a quantum measurement. Wave is an atom-local variable model, while MPNN-G, which includes a global variable, is a mixed local variable model. (B) Removing the global variable from MPNN-G (MPNN) results in substantially higher error on total energy and (C) polarizability. Both methods exhibit higher error than Wave. (D) Increase in total energy error of MPNN vs MPNN-G. MPNN exhibits a statistically significant increase in error for all molecules, but substantially larger error increases for large molecules. Statistical tests were performed by paired t-test. *: p < 0.05, ***: p < 0.001.
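The role of the global variable in panels (A) and (B) can be illustrated with a toy information-flow example. In the sketch below, one round of neighbor aggregation moves a signal only one bond away, whereas a shared global sum makes it visible to every atom in the same round. The function name, the integer states, and the plain summation are assumptions for illustration; this is not the MPNN or MPNN-G architecture from the study.

```python
def message_passing_round(adjacency, state, use_global=False):
    """One toy round: each atom adds its neighbors' states, plus an optional global sum."""
    global_state = sum(state.values()) if use_global else 0
    return {
        atom: state[atom] + sum(state[n] for n in adjacency[atom]) + global_state
        for atom in adjacency
    }


# Hypothetical five-atom chain; only atom 0 carries a nonzero signal.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
initial = {0: 1, 1: 0, 2: 0, 3: 0, 4: 0}

local_only = message_passing_round(chain, initial)
with_global = message_passing_round(chain, initial, use_global=True)
print(local_only[4], with_global[4])
```

With neighbor aggregation alone, the atom at the far end of the chain would need four rounds to receive the signal; with the global term it receives it after one. This is one intuition for why removing the global variable could hurt most on large molecules.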
