BMC Infect Dis. 2022 Nov 10;22(1):833.

doi: 10.1186/s12879-022-07794-5.

Chimeric forecasting: combining probabilistic predictions from computational models and human judgment

Thomas McAndrew¹, Allison Codi², Juan Cambeiro³,⁴, Tamay Besiroglu³,⁵, David Braun⁶, Eva Chen⁷, Luis Enrique Urtubey De Cèsaris⁷, Damon Luk²

Affiliations

¹ College of Health, Lehigh University, Bethlehem, PA, USA. mcandrew@lehigh.edu.
² College of Health, Lehigh University, Bethlehem, PA, USA.
³ Metaculus, Santa Cruz, CA, USA.
⁴ Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, USA.
⁵ Massachusetts Institute of Technology, Cambridge, MA, USA.
⁶ Department of Psychology, Lehigh University, Bethlehem, PA, USA.
⁷ Good Judgment Inc., New York, NY, USA.

PMID: 36357829
PMCID: PMC9648897
DOI: 10.1186/s12879-022-07794-5

Thomas McAndrew et al. BMC Infect Dis. 2022.

Abstract

Forecasts of the trajectory of an infectious agent can help guide public health decision making. A traditional approach to forecasting fits a computational model to structured data and generates a predictive distribution. However, human judgment has access to the same data as computational models plus experience, intuition, and subjective data. We propose a chimeric ensemble, a combination of computational and human judgment forecasts, as a novel approach to predicting the trajectory of an infectious agent. Each month from January 2021 to June 2021, we asked two generalist crowds, using the same criteria as the COVID-19 Forecast Hub, to submit a predictive distribution over incident cases and deaths at the US national level either two or three weeks into the future. We then combined these human judgment forecasts with forecasts from computational models submitted to the COVID-19 Forecast Hub into a chimeric ensemble. We find that a chimeric ensemble, compared to an ensemble including only computational models, improves predictions of incident cases and shows similar performance for predictions of incident deaths. A chimeric ensemble is a flexible, supportive public health tool and shows promising results for predictions of the spread of an infectious agent.
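As a rough illustration of the combination step, the sketch below averages quantile forecasts from computational models and human judgment into one equally weighted ensemble. The abstract does not state the combination rule, so the quantile-wise mean, the seven quantile levels, the function name `equal_weight_ensemble`, and all numbers are assumptions made for illustration, not the authors' data or method.

```python
import numpy as np

# Assumed quantile levels for a seven-quantile forecast (illustrative only).
QUANTILE_LEVELS = np.array([0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975])


def equal_weight_ensemble(forecasts):
    """Average a list of quantile forecasts level by level.

    Each forecast holds one value per entry in QUANTILE_LEVELS.
    """
    stacked = np.vstack(forecasts)  # shape: (n_models, n_quantiles)
    return stacked.mean(axis=0)


# Made-up forecasts of weekly incident cases (in thousands).
computational = [
    np.array([300, 340, 370, 400, 430, 470, 520]),
    np.array([280, 320, 360, 390, 420, 450, 500]),
]
human_judgment = [
    np.array([250, 300, 350, 410, 480, 550, 650]),
]

# An ensemble of computational models only, and a chimeric ensemble
# that pools computational and human judgment forecasts.
computational_only = equal_weight_ensemble(computational)
chimeric = equal_weight_ensemble(computational + human_judgment)
```

Averaging quantile by quantile keeps the ensemble in the same quantile format as its inputs, so it can be scored in the same way as any individual forecast.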

Conflict of interest statement

Authors declare no competing interests.

Figures

**Fig. 1**
(A) A timeline of the six surveys that collected human judgment predictions from January to June of 2021, showing when surveys were open and closed (blue dashed lines), when computational predictions submitted to the COVID-19 Forecast Hub were due (black dashed line), human judgment predictions excluded from the formal analysis (dark blue), the week for which each forecast was made (red dashed line), and the reported number of weekly incident COVID-19 cases at the US national level (black solid line). (B) Forecasts of weekly incident cases submitted to the COVID-19 Forecast Hub (orange) were formatted as seven quantiles, and we similarly formatted human judgment predictions from Metaculus (blue) and Good Judgment Open (red). (C) Forecasts of weekly incident deaths submitted to the COVID-19 Forecast Hub were formatted as twenty-three quantiles, and we formatted human judgment predictions the same way. We collected more than 3000 original and revised human judgment predictions of incident cases and deaths, covering the spread of SARS-CoV-2 and the burden of COVID-19 in the US.

**Fig. 2**
(A) Forecasts of weekly incident cases at the national level by an ensemble of computational models (blue) and an ensemble of human judgment (red). The dot represents the median forecast and the shaded bars represent the 25th to 75th and the 2.5th to 97.5th percentile prediction intervals. (B) Mean and 95% confidence interval of the weighted interval score (WIS) for forecasts of incident cases made by individual computational and human judgment models. (C) Forecasts of weekly incident deaths from computational models and human judgment. (D) Mean and 95% confidence intervals of the WIS for individual predictions of incident deaths. Though individual human judgment forecasts tend to perform worse than computational models, a human judgment ensemble performed similarly to an ensemble of computational models for predictions of both cases and deaths over a 6-month period.
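For readers unfamiliar with the weighted interval score, the sketch below computes it for a single quantile forecast using the standard definition: a weighted sum of the absolute error of the median and the interval scores of the central prediction intervals. It assumes an odd number of quantile levels that are symmetric around the median; the function names and the example numbers are illustrative and are not taken from the paper.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval."""
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - y, 0.0)
    penalty_above = (2.0 / alpha) * max(y - upper, 0.0)
    return width + penalty_below + penalty_above


def weighted_interval_score(levels, values, y):
    """WIS for one forecast given as sorted quantile levels and values.

    Assumes an odd number of levels, symmetric around 0.5, so that
    levels[k] and levels[-1 - k] bound a central prediction interval.
    """
    n = len(levels)
    mid = n // 2                       # index of the median
    total = 0.5 * abs(y - values[mid])
    for k in range(mid):
        alpha = 2.0 * levels[k]        # e.g. level 0.025 -> 95% interval
        total += (alpha / 2.0) * interval_score(
            values[k], values[n - 1 - k], y, alpha
        )
    return total / (mid + 0.5)


# Illustrative seven-quantile forecast (made-up numbers).
levels = [0.025, 0.1, 0.25, 0.5, 0.75, 0.9, 0.975]
values = [300, 340, 370, 400, 430, 470, 520]

wis_on_target = weighted_interval_score(levels, values, y=400)
wis_off_target = weighted_interval_score(levels, values, y=600)
```

A smaller WIS is better: a forecast is rewarded for narrow intervals and penalized when the observed value falls outside them.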

**Fig. 3**
Submitted and missing forecasts for (A) computational models, (B) human judgment forecasts submitted before the COVID-19 Forecast Hub deadline, and (C) human judgment forecasts submitted by the survey deadline. Forecasts that were submitted are shown in blue and forecasts not submitted (missing) are shown in yellow. Each row represents a single model. Columns are grouped into six pairs, one pair per survey from January 2021 to June 2021: the left column of each pair (with the tick mark) corresponds to submissions of incident cases and the right column corresponds to submissions of incident deaths. The high proportion of missing forecasts made by human judgment models presents a methodological challenge when building a chimeric ensemble.

**Fig. 4**
Mean difference in WIS for incident cases (A) and deaths (B) at the US national level between a chimeric ensemble and a computational ensemble, paired across six surveys from January 2021 to June 2021, for two strategies to impute missing values (“spotty memory” and “defer to the crowd”) and, within each strategy, five techniques to impute missing forecasts. A chimeric ensemble (a combination of computational and human judgment models) improves WIS scores when the target is cases but weakens or maintains similar WIS scores when the target is deaths. For predictions of cases, differences in mean WIS between a “defer to the crowd” and a “spotty memory” imputation strategy are negligible; for predictions of incident deaths, a “defer to the crowd” approach appears to improve predictions compared to a “spotty memory” approach. Bayesian Ridge Regression (BR) and Median imputation (MI) are promising techniques to impute missing forecasts for incident cases.
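The caption names median imputation but does not define how it is applied, nor what distinguishes the two imputation strategies. The sketch below shows only one plausible reading of median imputation: scores are arranged as a models-by-surveys matrix, missing entries are marked `NaN`, and each gap is filled with the median of the models that did submit for that survey. The matrix layout, the function name `median_impute`, and the numbers are assumptions for illustration.

```python
import numpy as np


def median_impute(scores):
    """Fill NaN entries with the per-survey (column) median.

    `scores` has one row per model and one column per survey.
    A column with no submissions at all stays NaN.
    """
    scores = np.asarray(scores, dtype=float)
    column_median = np.nanmedian(scores, axis=0)
    return np.where(np.isnan(scores), column_median, scores)


# Made-up score matrix: three models, two surveys, two gaps.
observed = np.array([
    [1.0, np.nan],
    [3.0, 4.0],
    [np.nan, 6.0],
])

filled = median_impute(observed)
```

A complete matrix of this kind is what a weighting scheme based on past performance would need as input, which is why missing human judgment forecasts matter for ensemble building.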

**Fig. 5**
Median, 25th and 75th percentiles, and interquartile range of the difference in WIS between a performance-based ensemble (PB) and an equally weighted ensemble (EW), paired by survey, for three ensembles: an ensemble that includes only computational models (blue), an ensemble that includes only human judgment (red), and a chimeric ensemble that includes both computational and human judgment models (gold). A “spotty memory” strategy was used along with five imputation techniques for training. Ensemble predictions are stratified by (A) incident cases and (B) deaths. For the majority of imputation techniques used for predictions of incident cases, training a performance-based ensemble shows similar results for a chimeric, computational, and human judgment ensemble. For deaths, performance-based training improves predictions of a computational ensemble, shows little improvement to a chimeric ensemble, and weakens predictions of a human judgment ensemble.

**Fig. 6**
Median, 25th and 75th percentiles, and interquartile range of the difference in WIS between a performance-based ensemble (PB) and an equally weighted ensemble (EW), paired by survey, for three ensembles: an ensemble that includes only computational models (blue), an ensemble that includes only human judgment (red), and a chimeric ensemble that includes both computational and human judgment models (gold). A “defer to the crowd” strategy was used along with five imputation techniques for training. Ensemble predictions are stratified by (A) incident cases and (B) deaths. For the majority of imputation techniques used for predictions of incident cases, training a performance-based ensemble improves the WIS score of a human judgment ensemble and weakens the performance of a computational and a chimeric ensemble. For deaths, performance-based training improves predictions of a chimeric and a human judgment ensemble, but for some imputation techniques weakens predictions of a computational ensemble. An algorithm that assigns different weights based on past performance, coupled with a “defer to the crowd” imputation strategy, may improve predictive performance of a chimeric ensemble.
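To make the contrast between equal and performance-based weighting concrete, the sketch below weights each model by the inverse of its mean past WIS and then averages quantile forecasts with those weights. The caption does not specify the weighting algorithm used in the paper, so inverse-score weighting, the function names, and the numbers are all assumptions chosen for simplicity.

```python
import numpy as np


def inverse_score_weights(mean_past_wis):
    """Weights proportional to 1 / mean past WIS, normalized to sum to 1.

    Lower (better) past WIS gives a larger weight. Scores must be positive.
    """
    scores = np.asarray(mean_past_wis, dtype=float)
    inverse = 1.0 / scores
    return inverse / inverse.sum()


def weighted_quantile_ensemble(forecasts, weights):
    """Weighted average of quantile forecasts, level by level."""
    stacked = np.vstack(forecasts)  # shape: (n_models, n_quantiles)
    return np.average(stacked, axis=0, weights=weights)


# Made-up mean past WIS for two computational models and one human crowd.
mean_past_wis = [10.0, 20.0, 20.0]
weights = inverse_score_weights(mean_past_wis)

# Made-up seven-quantile forecasts, in the same model order as above.
forecasts = [
    np.array([300, 340, 370, 400, 430, 470, 520]),
    np.array([280, 320, 360, 390, 420, 450, 500]),
    np.array([250, 300, 350, 410, 480, 550, 650]),
]

performance_based = weighted_quantile_ensemble(forecasts, weights)
equally_weighted = weighted_quantile_ensemble(forecasts, [1 / 3, 1 / 3, 1 / 3])
```

With only six surveys, weights estimated from past scores rest on very few observations, which may help explain why performance-based training does not uniformly beat equal weighting in the comparisons described here.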

**Fig. 7**
WIS scores for predictions of (A) incident cases and (B) incident deaths for a performance-weighted computational ensemble (blue circle), human judgment ensemble (red square), and chimeric ensemble (yellow triangle) over all imputation techniques for a “defer to the crowd” imputation strategy. The mean WIS and 95% confidence interval over all imputation techniques are plotted. For incident cases, the predictive performance of a chimeric ensemble is similar to or better than that of a computational ensemble, despite poorer performance from human judgment alone. For incident deaths, though a computational ensemble performs better overall, a chimeric ensemble outperforms it on two surveys and is again able to leverage human judgment to make improved forecasts.

Grants and funding

MIDASNI2020-1/GM/NIGMS NIH HHS/United States
