PLoS Comput Biol. 2020 Mar 16;16(3):e1006869.

doi: 10.1371/journal.pcbi.1006869. eCollection 2020 Mar.

The use of mixture density networks in the emulation of complex epidemiological individual-based models

Christopher N Davis¹ ², T Deirdre Hollingsworth³, Quentin Caudron⁴, Michael A Irvine⁴ ⁵

Affiliations

¹ MathSys CDT, Mathematics Institute, University of Warwick, Coventry, United Kingdom.
² Zeeman Institute (SBIDER), University of Warwick, Coventry, United Kingdom.
³ Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom.
⁴ Scai Analytics Ltd., Vancouver, Canada.
⁵ Institute of Applied Mathematics, University of British Columbia, Vancouver, Canada.

PMID: 32176687
PMCID: PMC7098654
DOI: 10.1371/journal.pcbi.1006869

Christopher N Davis et al. PLoS Comput Biol. 2020.
Abstract

Complex, computationally intensive individual-based models are abundant in epidemiology. For epidemics such as macro-parasitic diseases, detailed modelling of human behaviour and the pathogen life cycle is required to produce accurate results. This often leads to models that are computationally expensive to analyse and fit, and that require many simulation runs to build up sufficient statistics. Emulation can provide a more computationally efficient output of the individual-based model by approximating it with a statistical model. Previous work has used Gaussian processes (GPs) to achieve this, but these cannot deal with multi-modal, heavy-tailed, or discrete distributions. Here, we introduce the mixture density network (MDN) and its application to the emulation of epidemiological models. MDNs incorporate both a mixture model and a neural network to provide a flexible tool for emulating a variety of models and outputs. We develop an MDN emulation methodology and demonstrate its use on a number of simple models with normal, gamma and beta distribution outputs. We then explore its use on the stochastic SIR model to predict the final size distribution and infection dynamics. MDNs have the potential to faithfully reproduce multiple outputs of an individual-based model and allow for rapid analysis by a range of users. As such, an open-access library of the method has been released alongside this manuscript.

Conflict of interest statement

The authors MI and QC declare that they are members of the data science consultancy Scai Analytics Ltd. All other authors declare that they have no competing interests.

Figures

**Fig 1. MDN that emulates a model with three inputs and a one-dimensional output with two mixtures.**
The inputs are passed through two hidden layers and then on to the normalised neurons, which represent the parameters of a distribution and its weights, e.g. the mean (shown in blue) and variance (shown in green) of a normal distribution. These parameters are used to construct a mixture of distributions (represented as a dashed line).
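
The architecture in Fig 1 can be illustrated with a minimal NumPy sketch of the forward pass. It is not the released library: the hidden width, tanh activations, random untrained weights and the helper names (`init_params`, `mdn_forward`, `mixture_pdf`) are assumptions made for illustration. Three inputs pass through two hidden layers, and the output neurons are normalised with a softmax for the mixture weights and a softplus for the variances, giving a two-component normal mixture.

```python
import numpy as np

def softmax(z):
    # Normalise raw outputs into mixture weights that sum to one.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softplus(z):
    # Smooth map onto positive values, used here for variances.
    return np.logaddexp(0.0, z)

def init_params(rng, n_in=3, n_hidden=16, n_mix=2):
    # Hypothetical layer sizes: two hidden layers, then 3 outputs per component
    # (weight logit, mean, raw variance).
    sizes = [(n_in, n_hidden), (n_hidden, n_hidden), (n_hidden, 3 * n_mix)]
    return [(rng.normal(0.0, 0.5, s), np.zeros(s[1])) for s in sizes]

def mdn_forward(params, x):
    (w1, b1), (w2, b2), (w3, b3) = params
    h = np.tanh(x @ w1 + b1)           # first hidden layer
    h = np.tanh(h @ w2 + b2)           # second hidden layer
    out = h @ w3 + b3                  # raw output neurons
    logits, mu, raw_var = np.split(out, 3, axis=-1)
    return softmax(logits), mu, softplus(raw_var) + 1e-6

def mixture_pdf(y, weights, mu, var):
    # Density of the normal mixture evaluated at each y.
    y = np.asarray(y, dtype=float)[..., None]
    comp = np.exp(-0.5 * (y - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return (weights * comp).sum(axis=-1)

rng = np.random.default_rng(0)
params = init_params(rng)              # untrained weights, for shape checking only
x = np.array([[0.1, 0.5, -0.3]])       # one example with three inputs
weights, mu, var = mdn_forward(params, x)
density = mixture_pdf(np.array([0.0]), weights, mu, var)
```

In practice the weights would be fitted by minimising the negative log of `mixture_pdf` over simulated input and output pairs; the sketch only shows how the network outputs are turned into a valid mixture.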

**Fig 2. Gamma-MDN output emulating a negative binomial model.**
(A) For fixed shape parameter k = 2.5, the distribution of output from the MDN is shown in blue (mean = solid line, variance = shaded region); the theoretical values are shown as black dashed lines (mean = bold line, variance = normal line). (B) For fixed mean parameter m = 50, the distribution of output from the MDN over a range of k values is shown in blue (mean = solid line, variance = shaded region); the theoretical values are shown as black dashed lines (mean = bold line, variance = normal line). (C) Corresponding two-sample K–S statistic, where samples of 100 points are drawn from a negative binomial and the MDN over a range of m values. 100 replicates are used to estimate a mean K–S statistic and a 95% range. The dashed line represents significance at α = 0.05, with values less than this indicating that the two samples do not differ significantly. (D) Example empirical CDFs drawn from 100 samples of the MDN with inputs m = 50 and k = 2.5. 1,000 empirical CDFs are shown as black transparent lines and the true CDF is shown as a blue solid line.
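
The comparison in panel C can be sketched as follows, under stated assumptions. No trained MDN is available here, so a moment-matched gamma distribution stands in for the emulator; the function names and the large-sample approximation to the critical value are likewise illustrative choices rather than the authors' code. The negative binomial is parameterised by mean m and shape k, so its variance is m + m²/k.

```python
import numpy as np

def ks_two_sample(a, b):
    # Two-sample K-S statistic: largest gap between the two empirical CDFs.
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.abs(cdf_a - cdf_b).max()

def ks_critical(n_a, n_b, alpha=0.05):
    # Large-sample approximation to the critical value at level alpha.
    c = np.sqrt(-0.5 * np.log(alpha / 2.0))
    return c * np.sqrt((n_a + n_b) / (n_a * n_b))

def sample_negative_binomial(rng, mean, shape, size):
    # NumPy uses (n, p); n = shape and p = shape / (shape + mean) gives the target mean.
    return rng.negative_binomial(shape, shape / (shape + mean), size=size)

def sample_standin_emulator(rng, mean, shape, size):
    # Assumption: a gamma with matching mean and variance replaces the trained MDN.
    var = mean + mean ** 2 / shape
    return rng.gamma(mean ** 2 / var, var / mean, size=size)

rng = np.random.default_rng(0)
m, k, n_points, n_reps = 50.0, 2.5, 100, 100
stats = np.array([
    ks_two_sample(sample_negative_binomial(rng, m, k, n_points),
                  sample_standin_emulator(rng, m, k, n_points))
    for _ in range(n_reps)
])
mean_d = stats.mean()
low, high = np.percentile(stats, [2.5, 97.5])
threshold = ks_critical(n_points, n_points)
print(f"mean D = {mean_d:.3f}, 95% range = ({low:.3f}, {high:.3f}), "
      f"critical value = {threshold:.3f}")
```

A statistic below `threshold` means the test does not reject the hypothesis that both samples come from the same distribution at α = 0.05.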

**Fig 3. Binomial-MDN output emulating the final size distribution of a stochastic SIR model.**
(A) For random uniform sampling over β and γ, a sample of the output from the MDN across values of the basic reproductive number R₀ = β/γ is shown in blue and the directly simulated values are shown in red. (B) Corresponding two-sample K–S statistic, where samples of 100 points are drawn from a negative binomial and the MDN over a range of R₀ values. 100 replicates are used to estimate a mean K–S statistic and a 95% range. The dashed line represents significance at α = 0.05, with values less than this indicating that the two samples do not differ significantly. (C) The percentage of 1,000 realisations of the stochastic SIR model with final size greater than 100 is shown in black, with dashed lines showing a 95% range. Emulated results are shown by the blue line with a 95% range. (D) Example empirical CDFs drawn from 100 samples of the MDN with inputs β = 0.4 and γ = 0.2. 1,000 empirical CDFs are shown as black transparent lines and the true CDF is shown as a blue solid line.
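
The directly simulated values in this figure come from a stochastic SIR model. A minimal sketch of how a final size distribution could be simulated is given below. It assumes frequency-dependent transmission at rate βSI/N, a single initial infection and N = 1,000, none of which is stated in the caption, and it counts the initial case in the final size. Because the final size does not depend on event times, only the sequence of infection and recovery events is simulated.

```python
import numpy as np

def sir_final_size(beta, gamma, n_pop=1000, i0=1, rng=None):
    # Total number ever infected in one realisation of the stochastic SIR model.
    rng = np.random.default_rng() if rng is None else rng
    s, i = n_pop - i0, i0
    while i > 0:
        # Infection rate beta*s*i/n_pop versus recovery rate gamma*i;
        # the common factor i cancels in the event probability.
        p_infect = beta * s / (beta * s + gamma * n_pop)
        if rng.random() < p_infect:
            s, i = s - 1, i + 1        # infection event
        else:
            i -= 1                     # recovery event
    return n_pop - s

rng = np.random.default_rng(1)
sizes = np.array([sir_final_size(0.4, 0.2, rng=rng) for _ in range(1000)])
frac_large = (sizes > 100).mean()      # share of realisations with final size > 100
print(f"fraction with final size > 100: {frac_large:.3f}")
```

With β = 0.4 and γ = 0.2 (R₀ = 2), roughly half of the realisations should produce a large outbreak, in line with the branching-process approximation 1 − 1/R₀ for the probability of a major epidemic from one initial case.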

**Fig 4. Beta-MDN output emulating the infection dynamics with time for a stochastic SIR model.**
(A–D) A comparison of simulation results with sampled MDN output for fixed γ = 0.2 and N = 1,000 and different β values that give the following R₀ values: (A) R₀ = 0.5, (B) R₀ = 1.0, (C) R₀ = 2.0, and (D) R₀ = 5.0. (E–F) Two-sample K–S statistic, where samples of 100 points are drawn from a negative binomial and the MDN over a range of time t values. 100 replicates are used to estimate a mean K–S statistic and a 95% range. The dashed line represents significance at α = 0.05, with values less than this indicating that the two samples do not differ significantly. Tests are for (E) the number of susceptible people and (F) the number of infected people.

**Fig 5. Beta-MDN output emulating the infection dynamics with time for a stochastic SIR model.**
(A–D) A comparison of simulation results with sampled MDN output for fixed γ = 0.2 and different β, δ and N values such that (A) R₀ = 2.0, δ = 0.01 and N = 1,000, (B) R₀ = 1.0, δ = 0.01 and N = 1,000, (C) R₀ = 2.0, δ = 0.001 and N = 1,000, (D) R₀ = 2.0, δ = 0.01 and N = 100. (E–F) Two-sample K–S statistic, where samples of 100 points are drawn from a negative binomial and the MDN over a range of time t values. 100 replicates are used to estimate a mean K–S statistic and a 95% range. The dashed line represents significance at α = 0.05, with values less than this indicating that the two samples do not differ significantly. Tests are for (E) the number of susceptible people and (F) the number of infected people.

Cited by

Using mixture density networks to emulate a stochastic within-host model of Francisella tularensis infection.
Carruthers J, Finnie T. PLoS Comput Biol. 2023 Dec 20;19(12):e1011266. doi: 10.1371/journal.pcbi.1011266. eCollection 2023 Dec. PMID: 38117811.
Parametric seasonal-trend autoregressive neural network for long-term crop price forecasting.
Hong W, Choi SC, Oh S. PLoS One. 2024 Sep 26;19(9):e0311199. doi: 10.1371/journal.pone.0311199. eCollection 2024. PMID: 39325794.
Optimizing COVID-19 vaccine distribution across the United States using deterministic and stochastic recurrent neural networks.
Davahli MR, Karwowski W, Fiok K. PLoS One. 2021 Jul 6;16(7):e0253925. doi: 10.1371/journal.pone.0253925. eCollection 2021. PMID: 34228740.
Approximating solutions of the Chemical Master equation using neural networks.
Sukys A, Öcal K, Grima R. iScience. 2022 Aug 27;25(9):105010. doi: 10.1016/j.isci.2022.105010. eCollection 2022 Sep 16. PMID: 36117994.
Distilling dynamical knowledge from stochastic reaction networks.
Liu C, Wang J. Proc Natl Acad Sci U S A. 2024 Apr 2;121(14):e2317422121. doi: 10.1073/pnas.2317422121. Epub 2024 Mar 26. PMID: 38530895.

Grants and funding

Medical Research Council (MRC), United Kingdom
