Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Oct 14;153(14):144101.
doi: 10.1063/5.0014282.

Extrapolation and interpolation strategies for efficiently estimating structural observables as a function of temperature and density

Affiliations

Extrapolation and interpolation strategies for efficiently estimating structural observables as a function of temperature and density

Jacob I Monroe et al. J Chem Phys. .

Abstract

Thermodynamic extrapolation has previously been used to predict arbitrary structural observables in molecular simulations at temperatures (or relative chemical potentials in open-system mixtures) different from those at which the simulation was performed. This greatly reduces the computational cost in mapping out phase and structural transitions. In this work, we explore the limitations and accuracy of thermodynamic extrapolation applied to water, where qualitative shifts from anomalous to simple-fluid-like behavior are manifested through shifts in the liquid structure that occur as a function of both temperature and density. We present formulas for extrapolating in volume for canonical ensembles and demonstrate that linear extrapolations of water's structural properties are only accurate over a limited density range. On the other hand, linear extrapolation in temperature can be accurate across the entire liquid state. We contrast these extrapolations with classical perturbation theory techniques, which are more conservative and slowly converging. Indeed, we show that such behavior is expected by demonstrating exact relationships between extrapolation of free energies and well-known techniques to predict free energy differences. An ideal gas in an external field is also studied to more clearly explain these results for a toy system with fully analytical solutions. We also present a recursive interpolation strategy for predicting arbitrary structural properties of molecular fluids over a predefined range of state conditions, demonstrating its success in mapping qualitative shifts in water structure with density.

PubMed Disclaimer

Figures

FIG. 1.
FIG. 1.
Schematic representations of scenarios in which extrapolation/interpolation or perturbation (re-weighting) are expected to be efficient. When the behavior of X with respect to α is well-approximated by a low-order polynomial (a and b), only derivatives of low order (here second for extrapolation, first for polynomial interpolation) are needed to accurately capture a wide range of α values and extrapolation/interpolation is appropriate. Perturbation is suitable when fluctuations in X are large compared to its variations with α (a and c). Neither method is efficient for systems with many extrema and relatively small fluctuations (d). The adjustable parameter α may be a thermodynamic variable (e.g., temperature or density) or model parameter (e.g., appearing in a potential energy function).
FIG. 2.
FIG. 2.
Perturbation theory and extrapolation at various orders are compared for predicting the average position 〈x〉 of an ideal gas particle in a linear external field. Analytical behavior at infinite order and sampling is shown as a black dashed line, the analytical result with infinite sampling for each order of extrapolation is shown as a red solid line, and perturbation or extrapolation results with varying numbers of randomly sampled configurations from the reference state of β = 5.6 are represented by circles.
FIG. 3.
FIG. 3.
Uncertainty (standard deviations over 1000 independent data draws at the reference condition) in the average position 〈x〉 relative to the infinite sampling extrapolation result 〈xanalytic (top row) and in the derivative estimates (bottom) are shown as functions of extrapolation distance δβ (a), extrapolation order (b), and number of samples Ns (c). In (b) and (c) the extrapolation distance is fixed at δβ = 1.0. Uncertainty increases with greater distance from the reference point, higher order, and fewer numbers of samples. In the bottom panel of (a), uncertainties in derivatives do not grow with δβ as these are locally estimated at the reference state.
FIG. 4.
FIG. 4.
Interpolation procedures are shown for weighted extrapolation and interpolating polynomials (both using only first-order derivatives), as well as BAR or MBAR depending on the number of data points (using the pymbar package). Interpolation from two points is shown in (a), while interpolation using all of the points selected during a recursive interpolation run using polynomial interpolation and an error tolerance of 2% are shown in (b). Eq. 18 is shown as a black dashed line. The same 10000 samples at each temperature are used for each method, with error bars representing one standard deviation in 100 bootstrap resamples. (c) shows polynomial fits on sliding sets of three points selected during recursive interpolation, which are indicated by black squares and vertical lines. For each set, a polynomial in the same color is plotted based on each pair of points (solid lines indicate the outermost points of the set, dotted the left two points, and dash-dot the right two points). In many cases, polynomials diverge significantly outside the interval over which they were fit. In the shared interval, however, good overlap indicates that the local curvature of the polynomial fits is consistent, meaning additional points are not necessary.
FIG. 5.
FIG. 5.
In all panels, lines represent direct simulation results at each temperature for RDFs (a) or three-body angle distributions (b) while points are the estimates via extrapolation from 300 K (constant density of 1.00 g/cm3) with 5000 snapshots obtained at regular intervals over a 5 ns simulation. Error bars represent one standard deviation and are determined through bootstrap resampling. The neighbor cutoff was set to 0.34 nm in computing three-body angles.
FIG. 6.
FIG. 6.
Direct simulation results (solid lines), perturbation theory estimates (stars), and first- (circles) and second-order (diamonds) extrapolations are shown versus temperature for (a) translational order parameters, (b) tetrahedral order parameters, (c)/(d) excess chemical potentials of 0.33 nm radii hard-spheres. Translational order parameters are computed from RDFs extrapolated in temperature from 5 ns simulations at 300 K and 1.00 g/cm3, while all other quantities are extrapolated directly from the same reference state. Due to the inherently high ratio of noise to temperature variation for excess chemical potential estimates at this density, 50 ns of simulation are used for extrapolations in (c) and (d). For clarity, error bars are not shown for perturbation or second-order extrapolation in (c) or (d), as they are on the same scale as the entire ordinate axis.
FIG. 7.
FIG. 7.
Lines represent direct simulation results at each density for RDFs (a) or three-body angle distributions (b) while points are the estimates via extrapolation from a density of 1.00 g/cm3 (top) or interpolation from the two most extreme densities of 0.87 and 1.40 g/cm3 (bottom). Temperature is constant at 300 K. Error bars represent one standard deviation and are determined through bootstrap resampling.
FIG. 8.
FIG. 8.
Direct simulation results (solid lines), first-order extrapolations from 1.0 g/cm3 (circles), and interpolations from the most extreme densities (squares) are shown versus density (at constant temperature of 300 K) for (a) translational order parameters, (b) tetrahedral order parameters, (c) excess chemical potentials of 0.33 nm radii hard-spheres. Translational order parameters are computed from RDFs extrapolated in density from 5 ns simulations at 300 K and 1.00 g/cm3, while all other quantities are extrapolated directly from the same reference state. Interpolations utilize two 5 ns simulations, one from each of the most extreme densities. Hard-sphere chemical potentials could not be computed at densities above 1.20 g/cm3.
FIG. 9.
FIG. 9.
Polynomial fits are shown on sliding sets of three points, which are indicated by solid black vertical lines. For each set, a polynomial in the same color is plotted based on each pair of points (solid lines indicate the outermost points of the set, dotted the two higher-density points, and dash-dot the two lower-density points). Piecewise interpolations using three (a) and four (b) data points are compared to demonstrate improvement of local curvature consistency with the number of points. Results from direct simulation at each density are shown as a black dashed line.

References

    1. Palmer JC and Debenedetti PG, “Recent advances in molecular simulation: A chemical engineering perspective,” AIChE Journal 61, 370–383 (2015).
    1. Hollingsworth SA and Dror RO, “Molecular Dynamics Simulation for All,” Neuron 99, 1129–1143 (2018). - PMC - PubMed
    1. Spiwok V, Sucur Z, and Hosek P, “Enhanced sampling techniques in biomolecular simulations,” Biotechnology Advances 33, 1130–1140 (2015). - PubMed
    1. Yang YI, Shao Q, Zhang J, Yang L, and Gao YQ, “Enhanced sampling in molecular dynamics,” The Journal of Chemical Physics 151, 070902 (2019). - PubMed
    1. Brown WM, Wang P, Plimpton SJ, and Tharrington AN, “Implementing molecular dynamics on hybrid high performance computers – short range forces,” Computer Physics Communications 182, 898–911 (2011).