PLoS Comput Biol. 2025 Jun 23;21(6):e1013203.
doi: 10.1371/journal.pcbi.1013203. eCollection 2025 Jun.

Synthetic method of analogues for emerging infectious disease forecasting

Alexander C Murph et al. PLoS Comput Biol. 2025.

Abstract

The Method of Analogues (MOA) has gained popularity over the past decade for infectious disease forecasting because of its non-parametric nature. In MOA, the local behavior observed in a time series is matched to the local behaviors of several historical time series. The known values that directly follow the best-matching historical time series are then used to calculate a forecast. This non-parametric approach leverages historical trends to produce forecasts without extensive parameterization, making it highly adaptable. However, MOA is limited in scenarios where historical data are sparse. This limitation was particularly evident during the early stages of the COVID-19 pandemic, when the emerging global epidemic had little to no historical data. In this work, we propose a new method inspired by MOA, called the Synthetic Method of Analogues (sMOA). sMOA replaces historical disease data with a library of synthetic data that describe a broad range of possible disease trends. It circumvents the need to estimate explicit parameter values by instead matching segments of ongoing time series data to a comprehensive library of synthetically generated time series segments. We demonstrate that sMOA performs competitively with state-of-the-art infectious disease forecasting models, outperforming 78% of models from the COVID-19 Forecasting Hub in terms of averaged Mean Absolute Error (MAE) and 76% in terms of averaged Weighted Interval Score (WIS). Additionally, we introduce a novel uncertainty quantification methodology designed for the onset of emerging epidemics. Developing versatile approaches that do not rely on historical data and can maintain high accuracy in the face of novel pandemics is critical for enhancing public health decision-making and strengthening preparedness for future outbreaks.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Diagram of sMOA.
Recall $k$ is the time series segment length and $h$ is the largest forecast horizon. (a) Three fully observed synthetic time series $\tilde{y}_i$ in the library. (b) Synthetic time series segments $y_i$ of length $k+h$. The first $k$ time points in black; the last $h$ time points in red. (c) Fully observed time series $\tilde{y}_{\mathcal{O}}$. (d) Time series segment $y_{\mathcal{O}}$ of length $k$ (i.e., the last $k$ observations from the time series in (c)). (e) Compute the distance $d_i = d(y_{\mathcal{O}}, y_{i,1:k})$ between the observed time series segment $y_{\mathcal{O}}$ and the first $k$ observations of each synthetic time series segment $y_i$ in the library (i.e., the black points). (f) The point forecast is an aggregation (e.g., average) of the last $h$ observations of the synthetic time series (i.e., the red points) with the smallest distances $d_i$.
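The matching steps in panels (d) through (f) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the plain mean as the aggregation, the `n_neighbors` parameter, the function names, and the toy library values are all assumptions made for the example.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length segments (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smoa_point_forecast(
    observed: Sequence[float],
    library: Sequence[Sequence[float]],
    k: int,
    h: int,
    n_neighbors: int = 2,
) -> List[float]:
    """Average the last h values of the library segments whose first k
    values are closest to the last k observed values."""
    if len(observed) < k:
        raise ValueError("need at least k observations")
    target = observed[-k:]  # panel (d): last k observations
    # Panel (e): distance from the target to the first k points of each
    # library segment; segments of the wrong length are skipped.
    scored = sorted(
        (euclidean(target, seg[:k]), idx)
        for idx, seg in enumerate(library)
        if len(seg) == k + h
    )
    nearest = [library[idx] for _, idx in scored[:n_neighbors]]
    if not nearest:
        raise ValueError("library has no segments of length k + h")
    # Panel (f): aggregate (here, a plain mean) the h-step continuations.
    return [
        sum(seg[k + step] for seg in nearest) / len(nearest)
        for step in range(h)
    ]


# Toy library of segments of length k + h = 3 + 2 (made-up numbers).
library = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, 2.0, 4.0, 8.0, 16.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]
observed = [0.5, 1.0, 2.0, 3.0]
forecast = smoa_point_forecast(observed, library, k=3, h=2, n_neighbors=1)
print(forecast)
```

With `n_neighbors=1` the forecast simply copies the continuation of the single closest segment; larger values average over several analogues, which smooths the forecast at the cost of mixing in less similar trajectories.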
Fig 2. A demonstration of sMOA forecasting during the early weeks of the COVID-19 epidemic.
Black lines correspond to point forecasts; orange lines correspond to the true observed values. Forecasts from the basic ensemble model of the ForecastHub (‘COVIDhub-4_week_ensemble’) and the basic persistence model (‘COVIDhub-baseline’) are shown for reference on the dates for which those models provided forecasts. The third model used for later comparisons, the ‘COVIDHub-trained_ensemble’, does not provide forecasts this early in the COVID-19 epidemic.
Fig 3. Nominal vs. empirical coverage for sMOA over every state in the US and over four forecast horizons (1w, 2w, 3w, 4w), plotted using a black line.
The dotted line indicates a perfect match between nominal and empirical coverages for reference. Over every forecast made for the data application to COVID-19, nominal and empirical coverages approximately match.
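A nominal-versus-empirical coverage curve of this kind can be computed as sketched below. The sketch assumes central prediction intervals are available at each nominal level; the function name, the two levels, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def empirical_coverage(
    intervals: Sequence[Tuple[float, float]], truths: Sequence[float]
) -> float:
    """Fraction of observed values that fall inside their prediction interval."""
    if len(intervals) != len(truths) or not truths:
        raise ValueError("need one interval per observed value")
    hits = sum(
        1 for (lower, upper), y in zip(intervals, truths) if lower <= y <= upper
    )
    return hits / len(truths)


# Made-up intervals at two nominal levels for four forecasts.
truths = [10.0, 12.0, 9.0, 20.0]
intervals_by_level = {
    0.5: [(9.0, 11.0), (12.5, 14.0), (8.0, 10.0), (14.0, 18.0)],
    0.9: [(7.0, 13.0), (10.0, 16.0), (6.0, 12.0), (11.0, 19.0)],
}
# One point on the coverage curve per nominal level.
coverage_curve = {
    level: empirical_coverage(ivs, truths)
    for level, ivs in intervals_by_level.items()
}
print(coverage_curve)
```

A well-calibrated forecaster gives empirical values close to each nominal level, which corresponds to points lying near the dotted reference line.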
Fig 4. Direct comparisons between models from the ForecastHub and sMOA, using mean MAE (left) and mean WIS (right).
The error comparison between sMOA and a given model from the ForecastHub is only calculated for the dates for which forecasts from the given model were reported. That is, a given point represents the mean error metric for a model from the ForecastHub calculated over every date, state, and forecast horizon available for that model, plotted against the same mean metric calculated using sMOA on these same dates, states, and forecast horizons. Models beneath the diagonal black line were outperformed by sMOA. Four outlier models were removed for ease of visualization.
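For readers unfamiliar with the two metrics, the sketch below computes MAE and the Weighted Interval Score in its standard form for a median plus a set of central (1 - alpha) prediction intervals. The interval levels and numbers are made up for illustration; the quantile levels actually scored in the paper are not stated in this text.

```python
from typing import Dict, Sequence, Tuple


def mean_absolute_error(preds: Sequence[float], truths: Sequence[float]) -> float:
    """Average absolute difference between point forecasts and observations."""
    return sum(abs(p - y) for p, y in zip(preds, truths)) / len(truths)


def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Interval score of a central (1 - alpha) interval: its width plus a
    penalty of 2/alpha times the distance by which y falls outside it."""
    score = upper - lower
    if y < lower:
        score += (2.0 / alpha) * (lower - y)
    elif y > upper:
        score += (2.0 / alpha) * (y - upper)
    return score


def weighted_interval_score(
    median: float, intervals: Dict[float, Tuple[float, float]], y: float
) -> float:
    """WIS for a median and K central intervals keyed by alpha:
    (0.5 * |y - median| + sum_k (alpha_k / 2) * IS_k) / (K + 0.5)."""
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 10 with 50% and 90% central intervals.
wis = weighted_interval_score(10.0, {0.5: (8.0, 12.0), 0.1: (5.0, 15.0)}, y=13.0)
mae = mean_absolute_error([10.0, 12.0], [13.0, 11.0])
print(wis, mae)
```

Lower is better for both metrics. Averaging each metric over the same dates, states, and horizons for sMOA and for a Hub model gives one point in the scatter plot.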
Fig 5. The proportion of all models (black) and best-in-class models (red) sMOA outperforms in MAE (top) and WIS (bottom) if the validation window ranged from August 2020 through the x-axis date.
sMOA outperforms the majority of all models and of best-in-class models when the validation cutoff date falls between October 2020 and March 2023. Directly before October 2020, a dip in incident case counts that sMOA failed to forecast accurately caused the initially lower performance.
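The expanding-window comparison behind this figure can be illustrated as follows. This is a simplified sketch under stated assumptions: errors are indexed by date only (the paper also averages over states and horizons), and the function name, model names, and error values are hypothetical.

```python
from typing import Dict


def proportion_outperformed(
    smoa_errors: Dict[str, float],
    model_errors: Dict[str, Dict[str, float]],
    cutoff: str,
) -> float:
    """Share of models whose mean error up to `cutoff` exceeds sMOA's mean
    error on the same dates. Dates are ISO strings, so string comparison
    orders them chronologically."""
    wins = 0
    compared = 0
    for errors in model_errors.values():
        # Compare only on dates the model actually reported.
        dates = [d for d in errors if d <= cutoff and d in smoa_errors]
        if not dates:
            continue
        model_mean = sum(errors[d] for d in dates) / len(dates)
        smoa_mean = sum(smoa_errors[d] for d in dates) / len(dates)
        compared += 1
        if smoa_mean < model_mean:
            wins += 1
    if compared == 0:
        raise ValueError("no model has forecasts on or before the cutoff")
    return wins / compared


# Made-up per-date errors (e.g., MAE) for sMOA and two other models.
smoa = {"2020-08-01": 5.0, "2020-09-01": 9.0, "2020-10-01": 2.0}
models = {
    "model_a": {"2020-08-01": 4.0, "2020-09-01": 6.0, "2020-10-01": 8.0},
    "model_b": {"2020-09-01": 10.0, "2020-10-01": 3.0},
}
# One value per candidate cutoff date, as on the figure's x-axis.
curve = {c: proportion_outperformed(smoa, models, c) for c in sorted(smoa)}
print(curve)
```

Because each cutoff re-averages all errors from the start of the window, a single poorly forecast period early on can depress the curve until enough later dates accumulate.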


