Euro Surveill. 2020 May;25(21):1900221.
doi: 10.2807/1560-7917.ES.2020.25.21.1900221.

Using web search queries to monitor influenza-like illness: an exploratory retrospective analysis, Netherlands, 2017/18 influenza season

Paul P Schneider et al. Euro Surveill. 2020 May.

Abstract

Background: Despite the early development of Google Flu Trends in 2009, standards for digital epidemiology methods have not been established and research from European countries is scarce.

Aim: In this article, we study the use of web search queries to monitor influenza-like illness (ILI) rates in the Netherlands in real time.

Methods: In this retrospective analysis, we simulated the weekly use of a prediction model for estimating the then-current ILI incidence across the 2017/18 influenza season solely based on Google search query data. We used weekly ILI data as reported to The European Surveillance System (TESSY) each week, and we removed the then-last 4 weeks from our dataset. We then fitted a prediction model based on the then-most-recent search query data from Google Trends to fill the 4-week gap ('Nowcasting'). Lasso regression, in combination with cross-validation, was applied to select predictors and to fit the 52 models, one for each week of the season.

Results: The models provided accurate predictions with a mean and maximum absolute error of 1.40 (95% confidence interval: 1.09-1.75) and 6.36 per 10,000 population. The onset, peak and end of the epidemic were predicted with an error of 1, 3 and 2 weeks, respectively. The number of search terms retained as predictors ranged from three to five, with one keyword, 'griep' ('flu'), having the most weight in all models.

Discussion: This study demonstrates the feasibility of accurate, real-time ILI incidence predictions in the Netherlands using Google search query data.

Keywords: digital epidemiology; infectious diseases; influenza-like illness; machine learning; surveillance.
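
The nowcasting step described in the Methods (fit a lasso model on the weeks with reported ILI incidence, choose the penalty by cross-validation, then estimate the 4 most recent weeks from search data alone) can be sketched roughly as follows. This is an illustrative sketch on synthetic data, not the authors' code: the coordinate-descent solver, the contiguous-fold cross-validation, the penalty grid, the simulated search series and every function name are assumptions made for the example.

```python
import math
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the lasso shrinkage step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, penalty, n_iter=200):
    """Coordinate descent for (1/2n)*sum(residual^2) + penalty*sum(|coef|)."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coefs = [0.0] * p
    residual = [value - intercept for value in y]
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            scale = sum(c * c for c in col) / n
            if scale == 0.0:
                continue
            # Correlate column j with the residual that excludes its own effect.
            rho = sum(c * (r + coefs[j] * c) for c, r in zip(col, residual)) / n
            new_coef = soft_threshold(rho, penalty) / scale
            delta = new_coef - coefs[j]
            residual = [r - delta * c for r, c in zip(residual, col)]
            coefs[j] = new_coef
        # The intercept is unpenalised: re-centre the residual on zero.
        shift = sum(residual) / n
        intercept += shift
        residual = [r - shift for r in residual]
    return intercept, coefs


def predict(intercept, coefs, X):
    return [intercept + sum(c * v for c, v in zip(coefs, row)) for row in X]


def mean_abs_error(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def choose_penalty(X, y, penalties, n_folds=5):
    """Return the penalty with the lowest mean absolute error over contiguous folds."""
    n = len(X)
    best_penalty, best_error = None, float("inf")
    for penalty in penalties:
        errors = []
        for k in range(n_folds):
            lo, hi = k * n // n_folds, (k + 1) * n // n_folds
            b0, b = fit_lasso(X[:lo] + X[hi:], y[:lo] + y[hi:], penalty)
            errors.append(mean_abs_error(y[lo:hi], predict(b0, b, X[lo:hi])))
        cv_error = sum(errors) / n_folds
        if cv_error < best_error:
            best_penalty, best_error = penalty, cv_error
    return best_penalty


def nowcast(X, y_known, penalties, gap=4):
    """Fit on weeks with reported ILI, then estimate the final `gap` weeks."""
    n_known = len(X) - gap
    penalty = choose_penalty(X[:n_known], y_known, penalties)
    b0, b = fit_lasso(X[:n_known], y_known, penalty)
    return predict(b0, b, X[n_known:]), b


# Synthetic stand-in data: one informative search term and two noise terms.
random.seed(0)
signal = [5 + 4 * math.sin(2 * math.pi * week / 52) for week in range(60)]
X = [[s + random.gauss(0, 0.2), random.gauss(0, 1), random.gauss(0, 1)]
     for s in signal]
ili = [2 * s + random.gauss(0, 0.3) for s in signal]

# Pretend the last 4 weeks of ILI reports are not yet available.
estimates, coefs = nowcast(X, ili[:-4], penalties=[0.01, 0.1, 1.0])
print("nowcast MAE:", mean_abs_error(ili[-4:], estimates))
print("coefficients:", coefs)
```

Repeating the `nowcast` call once per week, each time with one more week of data, would mimic the 52 weekly models mentioned in the abstract. In practice a tested library implementation of the lasso would normally be preferred over a hand-written solver.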

Conflict of interest statement

Conflict of interest: None declared.

Figures

Figure 1
Bivariate associations between Google search terms and influenza-like illness incidence in the training dataset, Netherlands, weeks 33/2013–30/2017 (n = 20 search terms)
Figure 2
Time series plot showing observed influenza-like illness incidence against predictions of 52 final lasso regression models, weeks 31/2017–31/2018 (A) and overview of training and validation, weeks 33/2013–31/2018 (B), Netherlands
Figure 3
Observed vs predicted influenza-like illness incidence at five time points (A–E), Netherlands, influenza season 2017/18
Figure 4
Predictors retained in the final lasso regression models throughout the 52 iterations, Netherlands, weeks 31/2017–31/2018
