Nat Commun. 2020 Oct 16;11(1):5223.
doi: 10.1038/s41467-020-19093-1.

Quantum chemical accuracy from density functional approximations via machine learning

Mihail Bogojeski et al. Nat Commun. 2020.

Abstract

Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal·mol⁻¹ with presently available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal·mol⁻¹) on test data. Moreover, density-based Δ-learning (learning only the correction to a standard DFT calculation, termed Δ-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of Δ-DFT is highlighted by correcting "on the fly" DFT-based molecular dynamics (MD) simulations of resorcinol (C₆H₄(OH)₂) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that Δ-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.
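
The Δ-learning idea in the abstract (fit only the correction to a cheap calculation, then add it back) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the regressor is a plain kernel ridge regression with a Gaussian kernel, the input is a single hypothetical coordinate x standing in for a density-based descriptor, and e_low and e_high are invented toy functions standing in for DFT and coupled-cluster energies.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between two 1D arrays of descriptors."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))

def krr_fit(x_train, y_train, sigma=0.3, lam=1e-6):
    """Solve (K + lam*I) w = y for the kernel ridge regression weights."""
    K = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_train, weights, x_new, sigma=0.3):
    """Predict targets at x_new from the fitted weights."""
    return gaussian_kernel(x_new, x_train, sigma) @ weights

# Hypothetical toy energies (NOT real DFT or CC data).
def e_low(x):
    """Stand-in for the cheap method: a harmonic well around x = 1."""
    return 50.0 * (x - 1.0) ** 2

def e_high(x):
    """Stand-in for the accurate method: cheap energy plus a smooth correction."""
    return e_low(x) + 0.5 * np.sin(3.0 * x)

x_train = np.linspace(0.6, 1.4, 8)     # a small training set
x_test = np.linspace(0.62, 1.38, 50)   # out-of-sample geometries

# Direct learning: fit the accurate energy itself.
w_direct = krr_fit(x_train, e_high(x_train))
pred_direct = krr_predict(x_train, w_direct, x_test)

# Delta-learning: fit only the difference, then add the cheap energy back.
w_delta = krr_fit(x_train, e_high(x_train) - e_low(x_train))
pred_delta = e_low(x_test) + krr_predict(x_train, w_delta, x_test)

mae_low = np.mean(np.abs(e_low(x_test) - e_high(x_test)))
mae_direct = np.mean(np.abs(pred_direct - e_high(x_test)))
mae_delta = np.mean(np.abs(pred_delta - e_high(x_test)))
print(f"uncorrected: {mae_low:.4f}  direct: {mae_direct:.4f}  delta: {mae_delta:.4f}")
```

In this toy setting the corrected prediction still requires one cheap calculation per geometry (the call to e_low on the test points), which mirrors the trade-off described in the abstract: the model only has to represent the difference between the two methods.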


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Illustration of density-based machine learning for water conformer energies.
For all panels, DFT energies (orange) are shown alongside CC energies (blue) for the same molecular conformers, with optimized geometries indicated by open diamonds. (a) The nuclear potential, represented by an approximate Gaussians potential, is the input to a set of ML models that return the electron density. This learned density is the input for independent ML predictions of molecular energies based on DFT or CC electronic structure calculations, or the difference between these energies, in order to correct the DFT energy (final term in Eq. (3)). (b) Calculated energies for CC (dark blue) and DFT (dark orange) for 102 sample geometries relative to the lowest training energy (top), along with the relative energy errors for DFT compared to CC for each conformer (bottom). Note that the DFT energy errors are not a simple function of the energy relative to the minimum energy geometry (see Supplementary Fig. 2), as short O–H bond lengths tend to be too high in energy and stretched bonds are overstabilized. (c) Average out-of-sample prediction errors for the different ML functionals compared to the reference E^CC energies. The MAE of the E^DFT energies with respect to E^CC is also shown as a dashed line. (d) The energy surface (in kcal mol⁻¹) of symmetric water geometries for E_ML^DFT (orange) and E_Δ-DFT^CC (blue) after applying the Δ-DFT correction (bottom). For this figure, DFT calculations use the PBE functional, and CC calculations use CCSD(T) (see “Methods” for more details).
Fig. 2. Molecular geometries of ethanol from the ML training set and optimizations.
(a) 1000 unique configurations used for training (light orange circles), along with the anti and gauche minima optimized using conventional electronic structure methods (open diamonds). The distribution of anti and gauche conformers is shown in Supplementary Fig. 6. (b) The configurational space near the minima. Starting from MP2 geometries (E^MP2, grey diamonds), the E_ML-based optimizations reproduce the subtle differences in DFT- and CC-optimized geometries (dark orange and dark blue diamonds, respectively). For this figure, DFT calculations use the PBE+TS functional and CC calculations use CCSD(T) (see refs. , for more details).
Fig. 3. Resorcinol dynamics from an initial condition near a conformational change.
(a) The atomic positions explored during 100 fs NVE MD trajectories run with standard DFT (dark orange), E_sML^CC[n_sML^DFT] with RESPA-corrected forces (light blue), and E_sΔ-DFT^CC[n_sML^DFT] (blue). (b) The conformer energy along each trajectory (solid lines), with the error relative to CC shown as a shaded line width. (c) The evolution of the C–C–O–H dihedral angle for each trajectory with dashed grey lines indicating the barrier between conformers. For this figure, all DFT calculations use PBE and all CC energies are from CCSD(T).
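
The C–C–O–H dihedral angle tracked in panel (c) of Fig. 3 can be computed from four atomic positions. The following is a minimal sketch using the standard projection formula for a torsion angle; the function name dihedral_deg and the example coordinates are hypothetical and are not taken from the paper, and the sign of the result depends on the handedness convention chosen here.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees for four points, e.g. the C, C, O, H positions.

    The angle is measured about the central p1-p2 bond and lies in (-180, 180].
    """
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)         # unit vector along the central bond
    v = b0 - np.dot(b0, b1) * b1         # b0 with its component along b1 removed
    w = b2 - np.dot(b2, b1) * b1         # b2 with its component along b1 removed
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))

# Hypothetical planar test geometries (arbitrary units).
c1 = np.array([1.0, 0.0, 0.0])
c2 = np.array([0.0, 0.0, 0.0])
o = np.array([0.0, 1.0, 0.0])
h_cis = np.array([1.0, 1.0, 0.0])       # H on the same side as c1
h_trans = np.array([-1.0, 1.0, 0.0])    # H on the opposite side
h_perp = np.array([0.0, 1.0, 1.0])      # H rotated a quarter turn out of plane

angle_cis = dihedral_deg(c1, c2, o, h_cis)
angle_trans = dihedral_deg(c1, c2, o, h_trans)
angle_perp = dihedral_deg(c1, c2, o, h_perp)
print(angle_cis, angle_trans, angle_perp)
```

Applying such a function to every frame of a trajectory gives the kind of angle-versus-time curve the caption describes.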

