This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2024 Sep 13:2024.09.11.24313489.

doi: 10.1101/2024.09.11.24313489.

Timely vaccine strain selection and genomic surveillance improves evolutionary forecast accuracy of seasonal influenza A/H3N2

John Huddleston¹, Trevor Bedford¹,²

Affiliations

¹ Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
² Howard Hughes Medical Institute, Seattle, WA, USA.

PMID: 39314963
PMCID: PMC11419249
DOI: 10.1101/2024.09.11.24313489

John Huddleston et al. medRxiv. 2024.

Abstract

For the last decade, evolutionary forecasting models have influenced seasonal influenza vaccine design. These models attempt to predict which genetic variants circulating at the time of vaccine strain selection will be dominant 12 months later in the influenza season targeted by the vaccination campaign. Forecasting models depend on hemagglutinin (HA) sequences from the WHO's Global Influenza Surveillance and Response System to identify currently circulating groups of related strains (clades) and estimate clade fitness for forecasts. However, the average lag between collection of a clinical sample and the submission of its sequence to the Global Initiative on Sharing All Influenza Data (GISAID) EpiFlu database is ~3 months. Submission lags complicate the already difficult 12-month forecasting problem by reducing understanding of current clade frequencies at the time of forecasting. These constraints of a 12-month forecast horizon and 3-month average submission lags create an upper bound on the accuracy of any long-term forecasting model. The global response to the SARS-CoV-2 pandemic revealed that modern vaccine technology like mRNA vaccines can reduce how far we need to forecast into the future to 6 months or less and that expanded support for sequencing can reduce submission lags to GISAID to 1 month on average. To determine whether these recent advances could also improve long-term forecasts for seasonal influenza, we quantified the effects of reducing forecast horizons and submission lags on the accuracy of forecasts for A/H3N2 populations. We found that reducing forecast horizons from 12 months to 6 or 3 months reduced average absolute forecasting errors to 25% and 50% of the 12-month average, respectively. Reducing submission lags provided little improvement to forecasting accuracy but decreased the uncertainty in current clade frequencies by 50%. These results show the potential to substantially improve the accuracy of existing influenza forecasting models by modernizing influenza vaccine development and increasing global sequencing capacity.

Conflict of interest statement

Competing interests: The authors declare that no competing interests exist.

Figures

**Figure 1.**
Model of forecast horizons and submission lags. A) Long-term forecasting models historically predicted 12 months into the future from April and October because of the time required to develop and distribute a new vaccine (Łuksza and Lässig, 2014). We tested three additional shorter forecast horizons in three-month intervals of 9, 6, and 3 months prior to the same time in the future season. For each forecast horizon, we calculated the accuracy of forecasts under each of the three submission lags reflected above: no lag (blue), realistic lag (green), and ideal lag (orange). B) Observed lags in days between collection of viral samples and submission of corresponding HA sequences to GISAID (purple) for samples collected in 2019 have a mean of 98 days (approximately 3 months). A gamma distribution fit to the observed lag distribution with a similar mean and shape (green) represents a realistic submission lag that we sampled from to assign “submission dates” to simulated and natural A/H3N2 populations. A gamma distribution with a mean that is one third of the realistic distribution (orange) represents an ideal submission lag analogous to the 1-month average observed lags for SARS-CoV-2 genomes. Retrospective analyses, including fitting of forecasting models, typically filter HA sequences by collection date instead of submission date, in which case there is no lag (blue).
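
The lag model in panel B can be illustrated with a minimal sketch using only Python's standard library. The caption gives the 98-day mean but not the fitted gamma parameters, so the shape value of 2.0 below is a placeholder assumption, and all function and variable names are hypothetical rather than taken from the authors' code.

```python
import random
from datetime import date, timedelta

REALISTIC_MEAN_DAYS = 98.0                 # mean observed lag stated in the caption
IDEAL_MEAN_DAYS = REALISTIC_MEAN_DAYS / 3  # ideal lag: one third of the realistic mean
ASSUMED_SHAPE = 2.0                        # assumption: the fitted shape is not given


def sample_lag_days(mean_days, shape=ASSUMED_SHAPE, rng=random):
    """Draw one submission lag in days from a gamma distribution with the given mean."""
    scale = mean_days / shape  # for a gamma distribution, mean = shape * scale
    return rng.gammavariate(shape, scale)


def assign_submission_date(collection_date, mean_days, rng=random):
    """Return a simulated submission date for a sample collected on collection_date."""
    lag = round(sample_lag_days(mean_days, rng=rng))
    return collection_date + timedelta(days=lag)


# Check that the sampled lags have roughly the intended means.
rng = random.Random(0)
realistic = [sample_lag_days(REALISTIC_MEAN_DAYS, rng=rng) for _ in range(20000)]
ideal = [sample_lag_days(IDEAL_MEAN_DAYS, rng=rng) for _ in range(20000)]
mean_realistic = sum(realistic) / len(realistic)
mean_ideal = sum(ideal) / len(ideal)
```

Holding the shape fixed and scaling only the mean is one simple way to get an "ideal" distribution that is one third of the realistic one; the study may have parameterized it differently.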

**Figure 2.**
Distance to the future per timepoint (AAs) for natural A/H3N2 populations by forecast horizon and submission lag type based on forecasts from the local branching index (LBI) and mutational load model. Each point represents a future timepoint whose population was predicted from the number of months earlier corresponding to the forecast horizon. Points are colored by submission lag type including forecasts made with no lag (blue), an ideal lag (orange), and a realistic lag (green).

**Figure 3.**
Clade frequency errors for natural A/H3N2 clades at the same timepoint calculated as the difference between clade frequencies without submission lag and corresponding frequencies with either A) ideal or B) realistic submission lags. Frequency errors appear normally distributed in both lag scenarios for both C) small clades (>0% and <10% frequency) and D) large clades (≥10%). Dashed lines indicate the median error from the distribution of the lag type with the same color.
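
As a rough illustration of the error defined in this caption, the sketch below computes clade frequencies at one timepoint twice: once filtering by collection date (no lag) and once by submission date (with lag). Raw proportions stand in for whatever frequency estimator the study uses, the sign convention (no lag minus lagged) is one reading of the caption, and the records and field names are hypothetical.

```python
from collections import Counter
from datetime import date


def clade_frequencies(samples, timepoint, date_field):
    """Proportion of each clade among samples whose date_field is on or before timepoint."""
    visible = [s["clade"] for s in samples if s[date_field] <= timepoint]
    counts = Counter(visible)
    total = sum(counts.values())
    return {clade: n / total for clade, n in counts.items()} if total else {}


def lag_frequency_errors(samples, timepoint):
    """Per-clade frequency without lag minus frequency with lag."""
    no_lag = clade_frequencies(samples, timepoint, "collected")
    lagged = clade_frequencies(samples, timepoint, "submitted")
    clades = set(no_lag) | set(lagged)
    return {c: no_lag.get(c, 0.0) - lagged.get(c, 0.0) for c in clades}


# Hypothetical toy data: clade B is newer, so one of its sequences is not yet submitted.
samples = [
    {"clade": "A", "collected": date(2019, 1, 10), "submitted": date(2019, 2, 15)},
    {"clade": "A", "collected": date(2019, 1, 20), "submitted": date(2019, 3, 1)},
    {"clade": "B", "collected": date(2019, 3, 5), "submitted": date(2019, 3, 20)},
    {"clade": "B", "collected": date(2019, 3, 15), "submitted": date(2019, 6, 1)},
]
errors = lag_frequency_errors(samples, date(2019, 4, 1))
```

In this toy case the lagged view underestimates the newer clade B (1/3 instead of 1/2), which is the kind of uncertainty in current clade frequencies that submission lags introduce.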

**Figure 4.**
Absolute forecast clade frequency errors for natural A/H3N2 populations by forecast horizon in months and submission lag type (none, ideal, or observed) for A) small clades (<10% initial frequency) and B) large clades (≥10% initial frequency).

**Figure 5.**
Improvement of clade frequency errors for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions of improved vaccine development (reducing 12-month to 6-month forecast horizon), improved surveillance (reducing submission lags from 3 months on average to 1 month), or a combination of both interventions. We measured improvements from the status quo as the difference in total absolute clade frequency error per future timepoint. Positive values indicate increased forecast accuracy, while negative values indicate decreased accuracy. Each point represents the improvement of forecasts for a specific future timepoint under the given intervention. Horizontal dashed lines indicate median improvements. Horizontal dotted lines indicate upper and lower quartiles of improvements.
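
The improvement measure described in this caption can be sketched as follows, assuming forecasts and observations are available as per-clade frequency dictionaries for a single future timepoint. The frequencies and names below are invented for illustration and are not results from the study.

```python
def total_absolute_error(predicted, observed):
    """Sum of |predicted - observed| clade frequencies at one future timepoint."""
    clades = set(predicted) | set(observed)
    return sum(abs(predicted.get(c, 0.0) - observed.get(c, 0.0)) for c in clades)


def improvement(status_quo_forecast, intervention_forecast, observed):
    """Status quo error minus intervention error; positive means higher accuracy."""
    return (total_absolute_error(status_quo_forecast, observed)
            - total_absolute_error(intervention_forecast, observed))


# Hypothetical frequencies for one future timepoint.
observed = {"A": 0.2, "B": 0.8}
status_quo = {"A": 0.6, "B": 0.4}       # 12-month horizon, realistic lag
shorter_horizon = {"A": 0.3, "B": 0.7}  # 6-month horizon, realistic lag
gain = improvement(status_quo, shorter_horizon, observed)
```

Here the status quo forecast has a total absolute error of 0.8 and the shorter-horizon forecast has 0.2, so the improvement is positive. Repeating this per future timepoint would give the points summarized by the median and quartile lines in the figure.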

**Figure 6.**
Improvement of optimal distances to the future (AAs) for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions of improved vaccine development (reducing 12-month to 6-month forecast horizon), improved surveillance (reducing submission lags from 3 months on average to 1 month), or a combination of both interventions. We measured improvements from the status quo as the difference in optimal distances to the future per future timepoint. Positive values indicate increased forecast accuracy, while negative values indicate decreased accuracy. Each point represents the improvement of forecasts for a specific future timepoint under the given intervention. Horizontal dashed lines indicate median improvements. Horizontal dotted lines indicate upper and lower quartiles of improvements.


Cited by

Forecasting framework for dominant SARS-CoV-2 strains before clade replacement using phylogeny-informed genetic distances.
Lee K, Demirev AV, Lee S, Cho S, Kim H, Cho J, Yang JS, Kim KC, Lee JY, Shin W, Lee S, Park S, Lemey P, Park MS, Kim JI. Front Microbiol. 2025 Jun 20;16:1619546. doi: 10.3389/fmicb.2025.1619546. eCollection 2025. PMID: 40620492.

References

1. Abousamra E, Figgins M, Bedford T. Fitness models provide accurate short-term forecasts of SARS-CoV-2 variant frequency. PLOS Computational Biology. 2024 Sep; 20(9):1–20. doi: 10.1371/journal.pcbi.1012443.
2. Baden LR, El Sahly HM, Essink B, Kotloff K, Frey S, Novak R, Diemert D, Spector SA, Rouphael N, Creech CB, McGettigan J, Khetan S, Segall N, Solis J, Brosz A, Fierro C, Schwartz H, Neuzil K, Corey L, Gilbert P, et al. Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine. N Engl J Med. 2021 Feb; 384(5):403–416.
3. Black A, MacCannell DR, Sibley TR, Bedford T. Ten recommendations for supporting open pathogen genomic analysis in public health. Nat Med. 2020 Jun; 26(6):832–841.
4. Brazzoli M, Magini D, Bonci A, Buccato S, Giovani C, Kratzer R, Zurli V, Mangiavacchi S, Casini D, Brito LM, De Gregorio E, Mason PW, Ulmer JB, Geall AJ, Bertholet S. Induction of Broad-Based Immunity and Protective Efficacy by Self-amplifying mRNA Vaccines Encoding Influenza Virus Hemagglutinin. J Virol. 2016 Jan; 90(1):332–344.
5. Brito AF, Semenova E, Dudas G, Hassler GW, Kalinich CC, Kraemer MUG, Ho J, Tegally H, Githinji G, Agoti CN, Matkin LE, Whittaker C, Howden BP, Sintchenko V, Zuckerman NS, Mor O, Blankenship HM, de Oliveira T, Lin RTP, Siqueira MM, et al. Global disparities in SARS-CoV-2 genomic surveillance. Nat Commun. 2022 Nov; 13(1):7003.

Grants and funding

R01 AI165821/AI/NIAID NIH HHS/United States
