J Appl Stat. 2020 Sep 26;49(3):621-637.
doi: 10.1080/02664763.2020.1825646. eCollection 2022.

A non-parametric Hawkes model of the spread of Ebola in west Africa

Junhyung Park et al. J Appl Stat.

Abstract

Recently developed methods for the non-parametric estimation of Hawkes point process models facilitate their application for describing and forecasting the spread of epidemic diseases. We use data from the 2014 Ebola outbreak in West Africa to evaluate how well a simple Hawkes point process model can forecast the spread of Ebola virus in Guinea, Sierra Leone, and Liberia. For comparison, SEIR models that were previously fit to the same data are evaluated using identical metrics. To test the predictive power of each model, we simulate the ability to make near real-time predictions during an actual outbreak by using the first 75% of the data for estimation and the subsequent 25% of the data for evaluation. Forecasts generated from Hawkes models more accurately describe the spread of Ebola in each of the three countries investigated and result in a 38% reduction in RMSE for weekly case estimation across all countries when compared to SEIR models (total RMSE of 59.8 cases/week using SEIR compared to 37.1 for Hawkes). We demonstrate that the improved fit from Hawkes modeling cannot be attributed to overfitting and evaluate the advantages and disadvantages of Hawkes models in general for forecasting the spread of epidemic diseases.
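
The evaluation scheme described in the abstract (fit on the first 75% of the data, score on the remaining 25% by RMSE of weekly case counts) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `split_75_25` and `rmse` are hypothetical, and the only figures taken from the abstract are the two total RMSE values.

```python
import math

def split_75_25(series):
    """Split a time-ordered series into the first 75% (fit) and last 25% (evaluate)."""
    cut = int(len(series) * 0.75)
    return series[:cut], series[cut:]

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences of weekly counts."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Total RMSE values (cases/week) as reported in the abstract.
seir_rmse, hawkes_rmse = 59.8, 37.1

# Relative reduction in RMSE when moving from SEIR to Hawkes.
reduction = (seir_rmse - hawkes_rmse) / seir_rmse
print(f"Relative RMSE reduction: {reduction:.1%}")
```

The computed reduction, (59.8 - 37.1) / 59.8, is about 0.38, which agrees with the 38% figure quoted in the abstract.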

Keywords: Compartmental models; SEIR models; disease epidemics; non-parametric estimation; point processes; self-exciting.

Conflict of interest statement

No potential conflict of interest was reported by the authors.

Figures

Figure 1.
Point process vs. WHO cumulative case counts (Left to right): Guinea SE, Sierra Leone East, Liberia NW. Solid = cumulative number of cases reported by WHO, dashed = cumulative number of cases reported by WHO with times uniformly spread within WHO report dates. The start dates of outbreak from left to right are 23 March 2014, 27 May 2014 and 5 April 2014, respectively.
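
The dashed curves in Figure 1 rely on spreading case times uniformly within WHO report dates. One way this imputation might look is sketched below. The function name `spread_uniformly`, the day-0 outbreak start, and the toy report schedule are assumptions for illustration only; the paper's actual procedure may differ in detail.

```python
import random

def spread_uniformly(report_days, cumulative_counts, seed=0):
    """Impute individual case times from cumulative counts at report dates.

    Each new case between two consecutive reports is assigned a time drawn
    uniformly from the interval between those reports. Assumes the outbreak
    starts at day 0 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    times = []
    prev_day, prev_count = 0.0, 0
    for day, count in zip(report_days, cumulative_counts):
        # Guard against downward revisions in the reported cumulative totals.
        new_cases = max(count - prev_count, 0)
        times.extend(rng.uniform(prev_day, day) for _ in range(new_cases))
        prev_day, prev_count = day, max(count, prev_count)
    return sorted(times)

# Hypothetical reports on days 7, 14 and 21 with cumulative totals 3, 5 and 10.
case_times = spread_uniformly([7, 14, 21], [3, 5, 10])
print(len(case_times))
```

Because the imputed times are random, any estimate built on them inherits some variability, which is what the sensitivity analysis over randomly imputed data in the figure captions addresses.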
Figure 2.
Estimated Hawkes triggering density. (Left to Right): Guinea SE, Sierra Leone East, Liberia NW. Whiskers represent ± 2 standard errors computed as in Fox et al. [12]. Sensitivity of triggering histogram to randomly imputed data. (Left to Right): Guinea SE, Sierra Leone East, Liberia NW. Bar heights represent the median. Whiskers represent the 95th percentile interval.
Figure 4.
Weekly forecasts of new infections from SEIR and Hawkes models. (Top to bottom): Guinea SE, Sierra Leone East, Liberia NW. Solid curve = observed new case incidence per week as reported in WHO data, dashed curve = SEIR forecast, dotted curve = Hawkes forecast. As in Figure 3, the start dates of outbreak from top to bottom are 23 March 2014, 27 May 2014 and 5 April 2014, respectively. Each weekly forecast is the mean of 1000 simulations. For each week, simulations of new cases were conducted using model parameters fitted over each country's entire data set. Each week's simulations began with the same number of initial infected cases based on the history of reported infections preceding each week's simulation start date.
Figure 3.
Weekly estimates of R0 over time, denoted R(t). (Top to bottom): Guinea SE, Sierra Leone East, Liberia NW. Point estimates of the SEIR reproductive number, R0, by week. As in Figure 4, the start dates of outbreak from top to bottom are 23 March 2014, 27 May 2014 and 5 April 2014, respectively. Each weekly point estimate is based on the actual case counts observed up until that point in time. Note that for Sierra Leone East, estimates are off the charts at weeks 2 and 4 with 12.75 and 40.46, respectively.
Figure 5.
SEIR and Hawkes projections using first 75% of data for fitting. (Left to right): Guinea SE, Sierra Leone East, Liberia NW. Start dates of simulations from left to right are 28 July 2014, 13 August 2014 and 19 August 2014, respectively. Thin curves in top panels show 1000 simulations of Hawkes model (2) with parameters fit using first 75% of the data for the corresponding country and simulated forward for the last 25% of the observed time period. Thin curves in bottom panels show 1000 simulations of SEIR model with parameters fit using first 75% of the data for the corresponding country and simulated forward on the last 25%. Dashed curve = mean of simulations. Solid curve = actual cumulative total number of observed cases as reported by WHO.
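
Forward simulation of a Hawkes model from an observed history, as used for the projections in Figure 5, might be sketched as below. The sketch assumes the common form of the conditional intensity, a background rate `mu` plus productivity `K` times a triggering density summed over past events, with the triggering density given as a histogram. Model (2) itself is not reproduced on this page, so that form, the function names, and all parameter values here are assumptions rather than the authors' fitted model.

```python
import random

def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam)."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def sample_delay(rng, bin_edges, bin_probs):
    """Draw a delay from a histogram density: pick a bin, then a uniform point in it."""
    i = rng.choices(range(len(bin_probs)), weights=bin_probs)[0]
    return rng.uniform(bin_edges[i], bin_edges[i + 1])

def simulate_forward(history, t_start, t_end, mu, K, bin_edges, bin_probs, seed=0):
    """Simulate new event times in [t_start, t_end) given observed history."""
    rng = random.Random(seed)
    # Background (immigrant) events arrive at constant rate mu in the window.
    n_background = poisson(rng, mu * (t_end - t_start))
    new_events = [rng.uniform(t_start, t_end) for _ in range(n_background)]
    # Every observed or simulated event can trigger Poisson(K) offspring.
    queue = list(history) + new_events
    while queue:
        parent = queue.pop()
        for _ in range(poisson(rng, K)):
            child = parent + sample_delay(rng, bin_edges, bin_probs)
            # Offspring outside the forecast window are discarded: earlier
            # times belong to the fixed, observed history.
            if t_start <= child < t_end:
                new_events.append(child)
                queue.append(child)
    return sorted(new_events)

# Hypothetical history (days) and parameters, for illustration only.
events = simulate_forward([0.0, 1.0, 2.0], t_start=3.0, t_end=10.0,
                          mu=0.5, K=0.8,
                          bin_edges=[0, 1, 2, 3], bin_probs=[0.2, 0.5, 0.3])
print(len(events))
```

Repeating the call with different seeds and averaging the cumulative counts gives a mean projection comparable in spirit to the dashed curves.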
Figure 6.
Superthinning using Hawkes and SEIR infection rate parameters. (Left to right): Guinea SE, Sierra Leone East, Liberia NW. Top = Hawkes, bottom = SEIR. Thinned original points are marked with plus signs, and superposed points are marked with circles. X-axis indicates days from 23 March 2014, the beginning of the West African Ebola outbreak. The y-coordinates are uniform(0,1) random coordinates.
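
Superthinning, the residual diagnostic shown in Figure 6, combines thinning and superposition. With a tuning rate b, each observed point at time t is kept with probability min(1, b / λ(t)), and new points are superposed at rate max(b - λ(t), 0). If the fitted intensity λ is correct, the result should resemble a homogeneous Poisson process with rate b. The sketch below is illustrative only: `intensity` is a hypothetical callable standing in for a fitted model's estimated rate, which must be strictly positive.

```python
import random

def superthin(events, intensity, b, t_end, seed=0):
    """Return (kept, added): thinned original points and superposed points on [0, t_end)."""
    rng = random.Random(seed)
    # Thinning: where the modeled rate exceeds b, keep points with prob b / rate.
    kept = [t for t in events if rng.random() < min(1.0, b / intensity(t))]
    # Superposition: propose candidates at rate b, accept with prob
    # max(b - rate, 0) / b, which yields added points at rate max(b - rate, 0).
    added = []
    t = rng.expovariate(b)
    while t < t_end:
        if rng.random() < max(b - intensity(t), 0.0) / b:
            added.append(t)
        t += rng.expovariate(b)
    return kept, added

# Hypothetical constant intensity below b: every original point is kept and
# extra points are superposed at rate b - 0.5.
kept, added = superthin([1.0, 2.0, 3.0], lambda t: 0.5, b=1.0, t_end=10.0)
print(len(kept), len(added))
```

In the figure, the kept original points correspond to the plus signs and the superposed points to the circles; clustering or gaps in the combined pattern point to times where the model under- or over-predicts.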
Figure 7.
Pooled Hawkes model projections using first 75% of data for fitting. (Left to right): Guinea SE, Sierra Leone East, Liberia NW. Start dates of simulations from left to right are 28 July 2014, 13 August 2014 and 19 August 2014, respectively. Thin curves in top panels show 1000 simulations of the pooled Hawkes model with parameters fit using first 75% of the data and simulated forward for the last 25% of the observed time period. Dashed curve = mean of simulations. Solid curve = actual cumulative total number of observed cases as reported by WHO.

References

    1. Althaus A.C., Estimating the reproduction number of Ebola virus (EBOV) during the 2014 outbreak in West Africa, PLoS Curr. 6 (2014), pp. 1–12.
    2. Balderama E., Schoenberg F.P., Murray E., and Rundel P.W., Application of branching point process models to the study of invasive red banana plants in Costa Rica, JASA 107 (2012), pp. 467–476. doi: 10.1080/01621459.2011.641402
    3. Becker N., Estimation for discrete time branching processes with application to epidemics, Biometrics 33 (1977), pp. 515–522. doi: 10.2307/2529366
    4. Britton T., Stochastic epidemic models: A survey, Math. Biosci. 225 (2010), pp. 24–35. doi: 10.1016/j.mbs.2010.01.006
    5. Cao Y., Gillespie D.T., and Petzold L.R., Adaptive explicit-implicit tau-leaping method with automatic tau selection, J. Chem. Phys. 126 (2010), pp. 224101.