PLoS One. 2025 Jul 31;20(7):e0328770.

doi: 10.1371/journal.pone.0328770. eCollection 2025.

Detecting outbreaks using a spatial latent field

Cosmin Safta¹, Jaideep Ray¹, Wyatt Bridgman¹

PMID: 40743263
PMCID: PMC12312950
DOI: 10.1371/journal.pone.0328770

Cosmin Safta et al. PLoS One. 2025.

Affiliation

¹ Data Sciences and Computing, Sandia National Laboratories, Livermore, California, United States of America.

Abstract

In this paper, we present a method for estimating the infection-rate of a disease as a spatial-temporal field. Our data comprises time-series case-counts of symptomatic patients in various areal units of a region. We extend an epidemiological model, originally designed for a single areal unit, to accommodate multiple units. The field estimation is framed within a Bayesian context, using a parameterized Gaussian random field as a spatial prior. We apply an adaptive Markov chain Monte Carlo method to sample the posterior distribution of the model parameters conditioned on COVID-19 case-count data from three adjacent counties in New Mexico, USA. Our results suggest that the correlation between epidemiological dynamics in neighboring regions helps regularize estimations in areas with high-variance (i.e., poor-quality) data. Using the calibrated epidemic model, we forecast the infection-rate over each areal unit and develop a simple anomaly detector to signal new epidemic waves. Our findings show that an anomaly detector based on estimated infection-rates outperforms a conventional algorithm that relies solely on case-counts.

Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Left: The waves of COVID-19 infection in New Mexico in 2020.**
The data from the “Summer wave” (June 1, 2020 to September 15, 2020) will be used to estimate the infection-rate field. The Fall 2020 wave started around September 15, and is marked with a solid vertical line. The dashed line is August 15. Right: Case-count data during the “Summer wave” for 3 counties. Note the erroneous data spikes in the middle of the summer for Cibola. Data quality for the various areal units can vary significantly.

**Fig 2. Top left: Evolution of coefficients $w_{k,t}$ over time as the risk-factor model is fitted to cumulative case-counts $y_{t,r}$ normalized by county populations.**
Results are plotted for the intercept and four principal components (PC). Only the intercept survives and is far larger than the weights associated with the principal components. Top right: Plot of the prediction error from a 7-fold cross-validation performed with the risk-factor model and LASSO, on case-count data accumulated over the entire two-and-a-half-year duration (and normalized by county populations). The figures on the upper horizontal axis denote the number of principal components retained in the fitted model. $λ_{min}$ and $λ_{1se}$ are clearly marked. Bottom left: Distribution of coefficients corresponding to penalties $λ_{min}$ and $λ_{1se}$; the intercept dominates. Bottom right: The residuals from the risk-factor model, i.e., the component not explained by the risk-factor model. The spatial correlations are clear.
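The two penalties named in the Fig 2 caption can be illustrated with a short sketch. It assumes the caption follows the common convention in which $λ_{min}$ is the penalty with the lowest mean cross-validated error and $λ_{1se}$ is the largest penalty whose error lies within one standard error of that minimum; the source does not state which software or rule was used. The function name `select_lambdas` and the error curve are hypothetical and made up for illustration only.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation error curve.

    lambdas : penalty values that were tried
    cv_mean : mean cross-validated error for each penalty
    cv_se   : standard error of that mean across the folds
    """
    # lambda_min: the penalty with the lowest mean cross-validated error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # lambda_1se: the largest penalty whose error stays within one
    # standard error of the minimum (a sparser, more regularized model).
    lam_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)
    return lambdas[i_min], lam_1se


# Made-up error curve, not data from the paper.
lambdas = [0.001, 0.01, 0.1, 1.0, 10.0]
cv_mean = [0.52, 0.50, 0.51, 0.53, 0.90]
cv_se = [0.04, 0.04, 0.04, 0.05, 0.08]

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
print("lambda_min:", lam_min, "lambda_1se:", lam_1se)
```

Under this convention the larger penalty trades a small loss in cross-validated accuracy for fewer retained coefficients, which fits the caption's observation that the intercept dominates under both penalties.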

**Fig 3. The geographical extent of three adjacent New Mexico counties considered in this paper: Bernalillo (in green), Santa Fe (in red) and Valencia (in blue).**

**Fig 4. 1-D marginal posterior distributions for Bernalillo (left column), Santa Fe (middle column), and Valencia (right column).**
Top row: PDFs for $t_{0,r}$. The $t_{0,r}$ values are negative because they are measured from June $10^{th}$, 2020, and the PDFs imply that infections for the Summer wave started in late May. Second row: PDFs for $N_{r}$. Third row: PDFs for $k$. Bottom row: PDFs for $θ_{r}$.

**Fig 5. Marginal posterior distributions for GMRF parameters $(τ_{ϕ}^{2}, λ_{ϕ})$ (top row) and noise parameters $(σ_{a}, σ_{m})$ (bottom row), estimated via 2r and 3r joint estimations with data for Santa Fe.**

**Fig 6. Comparison of posterior predictive distributions obtained via joint inference (using the GMRF model) with equivalent results from independent inferences for each county.**
Top row: joint inference for Bernalillo (left), Santa Fe (middle), and Valencia (right). Bottom row: independent inferences for each county separately. Data up to September $15^{th}$, 2020 is used, and the case-count data (shown with black circles) was smoothed with a 7-day running average. The red line is the median prediction, the shaded teal region is the inter-quartile range, the dashed lines are the $5^{th}$ and $95^{th}$ percentiles, and the white circles are actual counts in the forecast regime.
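The 7-day running average mentioned in the caption can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the caption does not say whether the window is trailing or centered, so a trailing window is assumed here, and the function name `running_average` and the counts are hypothetical.

```python
def running_average(counts, window=7):
    """Trailing moving average of daily counts.

    Assumption: for the first `window - 1` days, the average is taken
    over however many days are available so far.
    """
    smoothed = []
    for i in range(len(counts)):
        start = max(0, i - window + 1)
        chunk = counts[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up daily counts with one reporting spike on day 7.
daily = [0, 7, 14, 21, 28, 35, 42, 700, 49]
smooth = running_average(daily)
print(smooth)
```

A window of seven days also averages over the weekly reporting cycle, although a single large spike still leaks into the seven smoothed values that include it.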

**Fig 7. Comparison of reconstructed infection-rate profiles that underlie the predictions in Fig 6.**
The top row contains results obtained via joint inference (using the GMRF model) for Bernalillo (left), Santa Fe (middle), and Valencia (right). Results from independent inferences for each county separately are shown in the bottom row. The calibration data spans up to September $15^{th}$, 2020 and the case-count data was smoothed with a 7-day running average. The red line is the median prediction, the shaded teal region is the inter-quartile range and the dashed lines are the $5^{th}$ and $95^{th}$ percentiles.

[…] difference is *not* the consequence of over-smoothing in time by our “mechanistic” treatment of time-evolution of the outbreak vis-à-vis a pCAR-ST model, which would likely employ an auto-regressive or moving-average approach. In Fig 7, we plot the corresponding infection-rates for all three counties. Differences in the estimated infection-rates, 3r joint estimation (top row) versus independent (bottom row), are difficult to discern. This is because the infection-rate is only affected by $(t_{0, r}, k_{r}, N_{r}, θ_{r})$ and, as is clear from Fig 4, there is not much difference in their posterior PDFs. Instead, it is the noise and spatial parameters whose estimates differ as we add more regions to the joint estimation (see Fig 5).

**Fig 8. Infection-rate detector results (top row) compared with the GLR-Poisson detector (bottom row), using data from 2020-06-01 to 2020-09-15.**
The symbols are the observed case-counts for Bernalillo (left), Santa Fe (middle), and Valencia (right). The Fall 2020 wave is believed to have started around September 15. The red line beyond September 15 is the outlier boundary; a day with a case-count above the dashed line is an “outlier” and is circled. A data point with a square box around it denotes the last of a sequence of three consecutive alarmed days. In all cases we see that the GLR-Poisson detector misses the Fall 2020 wave.
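The marking rule described in the caption (circle any day above the outlier boundary, box the day that completes three consecutive outliers) can be sketched as below. This is only an illustration of that rule under stated assumptions: the boundary values are taken as given, whereas in the paper they come from the calibrated model's forecast; the function name `flag_alarms` and all numbers are hypothetical; and a run longer than three days is assumed to keep raising an alarm on each additional day, which the caption does not specify.

```python
def flag_alarms(counts, boundary, run_length=3):
    """Return (outlier_days, alarm_days) as lists of day indices.

    outlier_days : days whose count exceeds that day's boundary
    alarm_days   : days that complete `run_length` consecutive outliers
    """
    outlier_days, alarm_days = [], []
    streak = 0
    for day, (count, limit) in enumerate(zip(counts, boundary)):
        if count > limit:
            outlier_days.append(day)
            streak += 1
            if streak >= run_length:
                alarm_days.append(day)
        else:
            # Any day at or below the boundary breaks the run.
            streak = 0
    return outlier_days, alarm_days


# Made-up counts and a flat boundary, for illustration only.
counts = [10, 12, 30, 11, 31, 33, 35, 12]
boundary = [25] * len(counts)

outliers, alarms = flag_alarms(counts, boundary)
print("outlier days:", outliers)
print("alarm days:", alarms)
```

Requiring consecutive outliers before alarming means the isolated exceedance on day 2 of the example is circled but does not trigger an alarm.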

**Fig 9. Variation explained by the principal components obtained via sparse PCA of the 79 risk factors used to model population-normalized case-counts in the counties of New Mexico.**
We see that 12 principal components can cover 95% of the variations observed in the risk-factors.
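Counting how many components are needed to reach a variance threshold, as in Fig 9, can be sketched as below. Note the substitution: the paper uses sparse PCA, while this sketch uses ordinary PCA computed from a singular value decomposition of the column-centered data, so it illustrates only the cumulative-variance count. The function name `components_needed`, the synthetic matrix, and the choice to center without rescaling are all assumptions.

```python
import numpy as np


def components_needed(X, target=0.95):
    """Smallest number of principal components whose cumulative share of
    the total variance reaches `target`, plus the cumulative curve."""
    Xc = X - X.mean(axis=0)                   # center each column
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    cum = np.cumsum(s**2) / np.sum(s**2)      # cumulative variance fraction
    k = int(np.searchsorted(cum, target)) + 1
    return k, cum


# Synthetic stand-in: 33 rows and 79 columns driven by 3 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(33, 3))
loadings = rng.normal(size=(3, 79))
X = factors @ loadings + 0.01 * rng.normal(size=(33, 79))

k, cum = components_needed(X)
print("components needed:", k)
print("cumulative variance fractions:", cum[:k])
```

Because the synthetic data has only three hidden factors plus small noise, at most three components are needed here; real risk-factor data would typically require more, as the caption reports.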

**Fig 10. Infection-rate detector results (top row) compared with the GLR-Poisson detector (bottom row), using data from 2020-06-01 to 2020-08-15.**
August 15 is a month before the arrival of the Fall 2020 wave. Line and symbol settings are the same as in Fig 8. We see that for Bernalillo and Santa Fe, both methods suffer from false positives, raising alarms before the arrival of the Fall 2020 wave.

**Fig 11. One- and two-dimensional marginal posterior distributions for the Bernalillo county model parameters; left: 3-region joint inversion, right: Bernalillo county only.**

**Fig 12. One- and two-dimensional marginal posterior distributions for the Santa Fe county model parameters; left: 3-region joint inversion, right: Santa Fe county only.**

**Fig 13. One- and two-dimensional marginal posterior distributions for the Valencia county model parameters; left: 3-region joint inversion, right: Valencia county only.**
