Environmetrics. 2016 Dec;27(8):507-523.
doi: 10.1002/env.2405. Epub 2016 Sep 12.

A comparison of statistical emulation methodologies for multi-wave calibration of environmental models

James M Salter et al. Environmetrics. 2016 Dec.

Abstract

Expensive computer codes, particularly those used for simulating environmental or geological processes, such as climate models, require calibration (sometimes called tuning). When calibrating expensive simulators using uncertainty quantification methods, it is usually necessary to use a statistical model called an emulator in place of the computer code when running the calibration algorithm. Though emulators based on Gaussian processes are typically many orders of magnitude faster to evaluate than the simulator they mimic, many applications have sought to speed up the computations by using regression-only emulators within the calculations instead, arguing that the extra sophistication brought by the Gaussian process is not worth the extra computational power. This was the case for the analysis that produced the UK climate projections in 2009. In this paper, we compare the effectiveness of both emulation approaches within a multi-wave calibration framework that is becoming popular in the climate modeling community, called "history matching." We find that Gaussian processes offer significant benefits to the reduction of parametric uncertainty over regression-only approaches. We find that in a multi-wave experiment, a combination of regression-only emulators initially, followed by Gaussian process emulators for refocussing experiments, can be nearly as effective as using Gaussian processes throughout for a fraction of the computational cost. We also discover a number of design and emulator-dependent features of the multi-wave history matching approach that can cause apparent, yet premature, convergence of our estimates of parametric uncertainty. We compare these approaches to calibration in idealized examples and apply them to a well-known geological reservoir model.

Keywords: Gaussian processes; emulator diagnostics; ensemble design; history matching; tuning; uncertainty quantification.
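
The multi-wave history matching procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard implausibility measure, I(x) = |z − E[f(x)]| / sqrt(emulator variance + observation error variance), and a ruling-out threshold of 3 as in the Figure 2 caption. The toy `simulator`, the one-dimensional input space, the grid-based NROY representation, and the use of a constant residual variance as the regression emulator's uncertainty are all assumptions made for illustration.

```python
import math
import random


def simulator(x):
    # Hypothetical cheap stand-in for an expensive simulator (assumption).
    return math.sin(3.0 * x) + x


def implausibility(z, em_mean, em_var, obs_var):
    # Standardized distance between observation z and the emulator prediction.
    return abs(z - em_mean) / math.sqrt(em_var + obs_var)


def fit_linear(xs, ys):
    # Ordinary least squares for y = b0 + b1 * x, with residual variance
    # used as a constant emulator variance (a simplifying assumption).
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    resid_var = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    return b0, b1, resid_var


def history_match(z, obs_var, n_waves=4, n_design=10, n_grid=1001,
                  threshold=3.0, seed=0):
    # Returns the fraction of the original space still in NROY after each wave.
    rng = random.Random(seed)
    nroy = [i / (n_grid - 1) for i in range(n_grid)]  # candidate x in [0, 1]
    sizes = []
    for _ in range(n_waves):
        if len(nroy) < n_design:
            break  # too few candidate points left to build a new design
        design = rng.sample(nroy, n_design)  # new runs inside current NROY
        b0, b1, var = fit_linear(design, [simulator(x) for x in design])
        # Keep only points the new emulator cannot rule out.
        nroy = [x for x in nroy
                if implausibility(z, b0 + b1 * x, var, obs_var) <= threshold]
        sizes.append(len(nroy) / n_grid)
    return sizes


if __name__ == "__main__":
    print(history_match(z=1.0, obs_var=0.01))
```

Because each wave only removes points, the NROY fractions returned are non-increasing. The role of emulator variance is visible directly in `implausibility`: with z = 0, a predicted mean of 2 and observation variance 0.01, an emulator variance of 1 gives I ≈ 1.99 (not ruled out at threshold 3), while an emulator variance of 0.04 gives I ≈ 8.94 (ruled out).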

Figures

Figure 1
How the prediction and 99% uncertainty bounds change for a regression‐only emulator (green) and a Gaussian process emulator (red) along a line between two design points x_1, x_2 in 10‐dimensional space, where λ describes how far along this line we are. The actual function (blue) is a toy model. The Gaussian process is a better approximation of the original function and has less uncertainty on its predictions here. The observation is taken to be 0, observed with an observation error given by the dotted black lines.
Figure 2
The implausibility I(x) for the above two emulators. With 3 chosen as the threshold for ruling out points, the regression emulator cannot rule out anything in this part of space, while the Gaussian process emulator can for λ > 0.52
Figure 3
Flow chart showing the emulators built for a comparison between the regression‐only case and the Gaussian process case. GP1 denotes that we started to use a Gaussian process from Wave 1 in that history match
Figure 4
Top left: Function 1; top right: Function 2; bottom left: Function 3; bottom right: borehole function. The size of not ruled out yet (NROY) space at each wave when history matching each function with regression‐only emulators throughout, and when switching to a Gaussian process emulator at different waves.
Figure 5
The weighted densities for the function output at points in NROY space after Wave 1 (left) and Wave 4 (right) for each of the four functions, for the Gaussian process (blue) and the regression‐only emulator (green). The observation for each function is given by the red line
Figure 6
The observations for the IC fault model
Figure 7
The progression of the sizes of NROY space when history matching the IC fault model with regression‐only emulators and when we start to use a Gaussian process emulator at different waves
Figure 8
A parameter plot showing the true NROY space (green) and those points classified as being in NROY space after four waves when we use regressions at each of the four waves
Figure 9
A parameter plot showing the true NROY space (green) and those points classified as being in NROY space after four waves when we use Gaussian process emulators at each of the four waves
Figure C1
Leave‐one‐out cross validation plots (left) and prediction for the validation set (right), for the Gaussian process emulators for Function 1, after Wave 1 (top) and Wave 4 (bottom). The black points indicate the prediction given by the emulator, with 95% error bars. The colored points are the actual function values: green if they lie within the 95% error bars around the prediction, red otherwise. Emulators are deemed to validate well if there are not too many or too few of the true values outside of these error bars. These checks ensure that the parameter estimation for the Gaussian process is reasonable and that our emulator has predictive power.
Figure D1
The progression of the sizes of NROY space for regression and the Gaussian processes for the borehole function. The dotted lines indicate the original NROY spaces found, as in Figure 4, with the solid lines showing improvements we have been able to achieve through either fitting a new mean function (in the case of GP2 [red line]), or by taking a new sample in the existing NROY space
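
The validation check described in the Figure C1 caption (counting how many true values fall inside the emulator's 95% error bars) can be sketched as follows. This is an illustrative assumption rather than the authors' code: it assumes Gaussian predictive intervals of ±1.96 standard deviations, and the function name `interval_coverage` and the example numbers are hypothetical.

```python
def interval_coverage(predictions, std_devs, truths, z_crit=1.96):
    # Flag each true value as inside (True) or outside (False) the
    # prediction +/- z_crit * standard deviation interval.
    inside = [abs(t - m) <= z_crit * s
              for m, s, t in zip(predictions, std_devs, truths)]
    # Fraction of true values covered by their intervals.
    return sum(inside) / len(inside), inside


if __name__ == "__main__":
    # Hypothetical emulator predictions, standard deviations, and true values.
    preds = [0.0, 1.0, 2.0, 3.0]
    sds = [0.5, 0.5, 0.5, 0.5]
    truths = [0.2, 2.5, 2.1, 3.9]
    coverage, flags = interval_coverage(preds, sds, truths)
    print(coverage, flags)
```

For a well-calibrated emulator the coverage should be close to 0.95 over a reasonably large validation set. Coverage far below that suggests overconfident variances, and coverage of essentially 1 with very wide bars suggests the emulator is too uncertain to rule much out.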
