PLoS One. 2011;6(8):e23610.
doi: 10.1371/journal.pone.0023610. Epub 2011 Aug 19.

Assessing Google flu trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic

Samantha Cook et al. PLoS One. 2011.

Abstract

Background: Google Flu Trends (GFT) uses anonymized, aggregated internet search activity to provide near-real-time estimates of influenza activity. GFT estimates have shown a strong correlation with official influenza surveillance data. The 2009 influenza virus A (H1N1) pandemic [pH1N1] provided the first opportunity to evaluate GFT during a non-seasonal influenza outbreak. In September 2009, an updated United States GFT model was developed using data from the beginning of pH1N1.

Methodology/principal findings: We evaluated the accuracy of each U.S. GFT model by comparing its weekly estimates of influenza-like illness (ILI) activity with data from the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet). For each GFT model we calculated the correlation and root mean square error (RMSE) between model estimates and ILINet for four time periods: pre-H1N1, Summer H1N1, Winter H1N1, and H1N1 overall (March to December 2009). We also compared the number of queries, query volume, and types of queries (e.g., influenza symptoms, influenza complications) in each model. Both models' estimates were highly correlated with ILINet pre-H1N1 and over the entire surveillance period, although the original model underestimated the magnitude of ILI activity during pH1N1. The updated model was more strongly correlated with ILINet than the original model during Summer H1N1 (r = 0.95 and 0.29, respectively). The updated model included more search query terms than the original model, with more queries directly related to influenza infection, whereas the original model contained more queries related to influenza complications.
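
The per-period comparison described above can be illustrated with a minimal Python sketch. It assumes that "correlation" means the Pearson coefficient, which the abstract does not state explicitly. The function names (`pearson_r`, `rmse`, `evaluate_periods`), the period labels, and the numbers in `toy_periods` are hypothetical illustrations, not the study's code or data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def rmse(estimates, observed):
    """Root mean square error between model estimates and observed values."""
    if len(estimates) != len(observed) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e - o) ** 2 for e, o in zip(estimates, observed)]
    return math.sqrt(sum(squared) / len(squared))


def evaluate_periods(periods):
    """Map each period label to its (correlation, RMSE) pair.

    `periods` maps a label to (model_estimates, surveillance_values),
    both given as weekly ILI percentages.
    """
    return {
        label: (pearson_r(est, obs), rmse(est, obs))
        for label, (est, obs) in periods.items()
    }


# Made-up weekly values for illustration only; NOT data from the study.
toy_periods = {
    "period A": ([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]),
    "period B": ([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]),
}

if __name__ == "__main__":
    for label, (r, err) in evaluate_periods(toy_periods).items():
        print(f"{label}: r = {r:.2f}, RMSE = {err:.2f}")
```

Reporting both statistics is useful because they capture different failures: in the toy "period A" the estimates track the reference series perfectly in shape yet sit a constant offset below it, so the correlation is high while the RMSE is not zero. That is the kind of pattern the abstract describes for the original model, which correlated well overall but underestimated the magnitude of ILI activity.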

Conclusions: Internet search behavior changed during pH1N1, particularly in the categories "influenza complications" and "term for influenza." The complications associated with pH1N1, the fact that pH1N1 began in the summer rather than winter, and changes in health-seeking behavior each may have played a part. Both GFT models performed well prior to and during pH1N1, although the updated model performed better during pH1N1, especially during the summer months.

Conflict of interest statement

Competing Interests: Yes, the authors have the following competing interest. This study was supported by funding from Google Inc., and three of the authors (SC, CC, MM) are employees of Google Inc. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials, as detailed online in the guide for authors.

Figures

Figure 1. Time series plots of ILINet data and original and updated GFT estimates.
A) ILINet data and GFT estimates from 2009. B) ILINet data and GFT estimates for the entire time period where GFT estimates are available: 2003–2009.

Figure 2. Time series plots of ILINet data and category-level GFT estimates.
Category-level estimates are created by applying the GFT methodology to a subset of the queries in a given model. A) ILINet data and GFT estimates based on original model queries related to influenza complications. B) ILINet data and GFT estimates based on updated model queries related to specific influenza symptoms.

Figure 3. Time series plots of ILINet data and query-level GFT estimates.
Query-level estimates are created by applying the GFT methodology to the search activity for a single query. A) ILINet data and GFT estimates based on the query [symptoms of flu]. B) ILINet data and GFT estimates based on the query [symptoms of bronchitis]. C) ILINet data and GFT estimates based on the query [symptoms of pneumonia].

