PLoS Comput Biol. 2015 Jan 8;11(1):e1003968.

doi: 10.1371/journal.pcbi.1003968. eCollection 2015 Jan.

Bayesian history matching of complex infectious disease models using emulation: a tutorial and a case study on HIV in Uganda

Ioannis Andrianakis¹, Ian R Vernon², Nicky McCreesh³, Trevelyan J McKinley⁴, Jeremy E Oakley⁵, Rebecca N Nsubuga⁶, Michael Goldstein², Richard G White¹

Affiliations

¹ Dept. of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom.
² Dept. of Mathematical Sciences, Durham University, Durham, United Kingdom.
³ School of Medicine, Pharmacy and Health, Durham University, Durham, United Kingdom.
⁴ Dept. of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom.
⁵ School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom.
⁶ Medical Research Council/Uganda Virus Research Institute, Uganda Research Unit on AIDS, Entebbe, Uganda.

PMID: 25569850
PMCID: PMC4288726
DOI: 10.1371/journal.pcbi.1003968

Ioannis Andrianakis et al. PLoS Comput Biol. 2015.

Abstract

Advances in scientific computing have allowed the development of complex models that are being routinely applied to problems in disease epidemiology, public health and decision making. The utility of these models depends in part on how well they can reproduce empirical data. However, fitting such models to real-world data is greatly hindered both by large numbers of input and output parameters, and by long run times, such that many modelling studies lack a formal calibration methodology. We present a novel method that has the potential to improve the calibration of complex infectious disease models (hereafter called simulators). We present this in the form of a tutorial and a case study where we history match a dynamic, event-driven, individual-based stochastic HIV simulator, using extensive demographic, behavioural and epidemiological data available from Uganda. The tutorial describes history matching and emulation. History matching is an iterative procedure that reduces the simulator's input space by identifying and discarding areas that are unlikely to provide a good match to the empirical data. History matching relies on the computational efficiency of a Bayesian representation of the simulator, known as an emulator. Emulators mimic the simulator's behaviour, but are often several orders of magnitude faster to evaluate. In the case study, we use a 22-input simulator, fitting its 18 outputs simultaneously. After 9 iterations of history matching, a non-implausible region of the simulator input space was identified that was 10¹¹ times smaller than the original input space. Simulator evaluations made within this region were found to have a 65% probability of fitting all 18 outputs. History matching and emulation are useful additions to the toolbox of infectious disease modellers. Further research is required to explicitly address the stochastic nature of the simulator as well as to account for correlations between outputs.
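To make the iterative procedure concrete, the following is a minimal one-dimensional sketch of history matching with an emulator. It is an illustration under stated assumptions, not the authors' implementation: `toy_simulator`, the squared-exponential Gaussian process with fixed hyperparameters, the stratified choice of design points, the cut-off of 3 and the target value are all hypothetical choices made for this example.

```python
import numpy as np

def toy_simulator(x):
    # Hypothetical stand-in for an expensive simulator (not the paper's model).
    return np.sin(x)

def sq_exp_kernel(a, b, variance, length):
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def fit_emulator(x_train, y_train, variance=1.0, length=1.0, nugget=1e-6):
    """Gaussian process emulator with fixed (assumed) hyperparameters."""
    mean = y_train.mean()
    K = sq_exp_kernel(x_train, x_train, variance, length)
    K += nugget * np.eye(x_train.size)  # small nugget for numerical stability
    alpha = np.linalg.solve(K, y_train - mean)

    def predict(x_new):
        k_star = sq_exp_kernel(x_new, x_train, variance, length)
        mu = mean + k_star @ alpha
        var = variance - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
        return mu, np.clip(var, 0.0, None)

    return predict

def history_match(simulator, lo, hi, z, obs_var,
                  waves=3, n_design=10, cutoff=3.0, seed=0):
    """Return a candidate grid and a mask of its non-implausible points."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(lo, hi, 2001)
    keep = np.ones(grid.size, dtype=bool)   # current non-implausible set
    used = np.zeros(grid.size, dtype=bool)  # grid points already simulated
    runs = np.full(grid.size, np.nan)       # stored simulator outputs
    for _ in range(waves):
        pool = np.flatnonzero(keep & ~used)
        if pool.size == 0:
            break
        # One random design point per stratum of the non-implausible set.
        strata = [s for s in np.array_split(pool, n_design) if s.size > 0]
        picks = np.array([rng.choice(s) for s in strata])
        runs[picks] = simulator(grid[picks])
        used[picks] = True
        # Train the emulator on all runs so far, then score every candidate.
        predict = fit_emulator(grid[used], runs[used])
        mu, var = predict(grid)
        imp = np.abs(z - mu) / np.sqrt(var + obs_var)
        keep &= imp <= cutoff  # discard implausible inputs
    return grid, keep

grid, keep = history_match(toy_simulator, 0.0, 10.0, z=0.5, obs_var=0.05 ** 2)
print(f"non-implausible fraction of input space: {keep.mean():.3f}")
```

In this sketch the implausibility is the distance between the observation and the emulator mean, scaled by the combined emulator and observation uncertainty. Inputs are discarded only when the emulator is confident they cannot match the data, so regions the emulator has not yet learned remain in play for later waves.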

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Figure 1. History matching.**
The physical process is observed through empirical data and described by the simulator output. The simulator is substituted by the emulator for computational efficiency. The question marks indicate the various sources of uncertainty present in the system.

**Figure 2. History matching workflow.**
The simulator is evaluated at carefully selected design points. Its output is used to train the emulator, which, with the help of the implausibility measure, determines the parts of the input space that are non-implausible (NI). The simulator is then evaluated at a set of design points from the non-implausible space, and the procedure is repeated until one or more stopping criteria are met.

**Figure 3. Example emulator and implausibility for a toy simulator.**
Panel (a) shows an emulator of the toy simulator (black dashed line). The simulator output is treated as unknown except at the six points where the simulator is run, which are shown as black dots. The blue line is the emulator's posterior mean, and the red lines represent its posterior uncertainty (95% CI). The three horizontal lines represent the empirical data (mean value and 95% CI) that we use to history match the simulator. Panel (b) shows the implausibility for the emulator and the empirical data shown in panel (a). The implausibility is large when the emulator's posterior mean is far from the empirical data, relative to the uncertainties present in the system (observation and code uncertainty in this case). The horizontal green line is an implausibility cut-off, which determines whether an input is implausible or not.

**Figure 4. Second history matching wave for the toy simulator.**
Panel (a) shows an emulator of the toy simulator (black dashed line). The simulator output is treated as unknown except at the seven points where the simulator is run, which are shown as black dots. The blue line is the emulator's posterior mean, and the red lines represent its posterior uncertainty (95% CI). The three horizontal lines represent the empirical data (mean value and 95% CI) that we use to history match the simulator. Panel (b) shows the implausibility for the emulator and the empirical data shown in panel (a). The implausibility is large when the emulator's posterior mean is far from the empirical data, relative to the uncertainties present in the system (observation and code uncertainty in this case). The horizontal green line is an implausibility cut-off, which determines whether an input is implausible or not.

**Figure 5. Minimum implausibility (a) and optical depth (b) plots for inputs 1 and 4 in wave 1.**
Minimum implausibility plots show an estimate of the minimum implausibility for different values of pairs of inputs. Optical depth plots show an estimate of the probability of encountering a non-implausible point for different values of pairs of inputs.

**Figure 6. Minimum implausibility (below and left of diagonal) and optical depth plots (above and right of diagonal) for 10 key inputs for waves 1, 4, 7 and 9.**
Minimum implausibility plots show an estimate of the minimum implausibility for different values of pairs of inputs. Optical depth plots show an estimate of the probability of encountering a non-implausible point for different values of pairs of inputs.

**Figure 7. Cumulative distribution function of simulator run implausibility by waves.**
Each line represents the percentage of each wave's simulator runs with an implausibility less than the value indicated on the x-axis.

**Figure 8. Simulator output (male and female HIV prevalence) in waves 1, 4, 7 and 10.**
The black lines show the average observed HIV prevalence with 95% credible ranges.

**Figure 9. Convergence of the simulator's output to the empirical data with successive waves of history matching.**
Each of the 18 panels shows the range of the target data (horizontal region) and the simulator's output in waves 1 (red), 4 (yellow), 7 (blue) and 10 (green) (left to right along the x-axis).

**Figure 10. Posterior samples drawn with the importance sampling method described in section ‘Posterior Sampling’.**
Each panel shows the samples drawn for one of the 22 simulator inputs. Their full names and descriptions can be found in Table 1.
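The minimum implausibility and optical depth summaries described in the captions of Figures 5 and 6 can be estimated from a sample of inputs and their implausibilities. The sketch below is one possible way to compute them and is not taken from the paper: the function name `pairwise_summaries`, the scaling of inputs to [0, 1], the 10 by 10 binning, the cut-off of 3 and the synthetic implausibility values are all assumptions made for illustration.

```python
import numpy as np

def pairwise_summaries(inputs, implausibility, i, j, bins=10, cutoff=3.0):
    """Bin samples over input dimensions (i, j). For each cell return the
    minimum implausibility and the fraction of samples that are
    non-implausible (an optical depth estimate). Inputs are assumed to be
    scaled to [0, 1]; empty cells are reported as NaN."""
    xi = np.minimum((inputs[:, i] * bins).astype(int), bins - 1)
    xj = np.minimum((inputs[:, j] * bins).astype(int), bins - 1)
    min_imp = np.full((bins, bins), np.inf)
    count = np.zeros((bins, bins))
    passed = np.zeros((bins, bins))
    # Unbuffered updates so repeated cell indices accumulate correctly.
    np.minimum.at(min_imp, (xi, xj), implausibility)
    np.add.at(count, (xi, xj), 1.0)
    np.add.at(passed, (xi, xj), (implausibility <= cutoff).astype(float))
    with np.errstate(invalid="ignore", divide="ignore"):
        depth = np.where(count > 0, passed / count, np.nan)
    min_imp[count == 0] = np.nan
    return min_imp, depth

# Synthetic example: 4 inputs and a made-up implausibility surface that is
# lowest near input 1 = 0.3, input 2 = 0.5 and input 4 = 0.7.
rng = np.random.default_rng(1)
x = rng.random((20000, 4))
imp = 8.0 * np.sqrt((x[:, 0] - 0.3) ** 2 + (x[:, 1] - 0.5) ** 2 + (x[:, 3] - 0.7) ** 2)
min_imp, depth = pairwise_summaries(x, imp, i=0, j=3)
```

The two summaries answer different questions. The minimum implausibility shows whether any acceptable input exists for a given pair of values, while the optical depth shows how much of the remaining input space is acceptable once the other inputs are allowed to vary.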
