J Glob Health. 2020 Dec;10(2):020511.
doi: 10.7189/jogh.10.020511.

Retrospective analysis of the accuracy of predicting the alert level of COVID-19 in 202 countries using Google Trends and machine learning

Yuanyuan Peng et al. J Glob Health. 2020 Dec.

Abstract

Background: Internet search engine data, such as Google Trends, have been shown to correlate with the incidence of COVID-19, but only in several countries. We aimed to develop a model from a small number of countries to predict the epidemic alert level in all countries worldwide.

Methods: We retrieved Google Trends "interest over time" and "interest by region" data for Coronavirus, pneumonia, and six COVID symptom-related terms. The daily incidence of COVID-19 in 202 countries from 10 January to 23 April 2020 was retrieved from the World Health Organization. Three alert levels were defined. Ten weeks of data from 20 countries were used to train machine learning algorithms. Features were selected according to their correlation and importance. The model was then tested on 2830 samples from 202 countries.
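The correlation-based part of the feature selection described above can be illustrated with a small sketch. The abstract does not state which correlation measure or cut-off was used, so the Pearson coefficient, the function names `pearson` and `rank_features`, and the toy numbers below are all assumptions made for illustration, not the authors' procedure or data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Rank candidate features by absolute correlation with the target.

    `features` maps a (hypothetical) search-term name to its series;
    `target` is the matching series of daily new cases.
    """
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy search-volume series (invented for illustration only).
toy_features = {
    "coronavirus": [10, 20, 40, 80, 100],
    "pneumonia": [5, 5, 6, 5, 5],
    "fever": [30, 25, 35, 20, 40],
}
toy_cases = [1, 2, 4, 8, 10]

ranking = rank_features(toy_features, toy_cases)
for name, score in ranking:
    print(f"{name}: |r| = {score:.3f}")
```

A ranking like this would only be a first filter. The abstract indicates that feature importance was also taken into account, which this sketch does not cover.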

Results: Our model performed well in 154 (76.2%) countries, each of which had no more than four misclassified samples. In these 154 countries, the accuracy was 0.8133 and the kappa coefficient was 0.6828. Across all 202 countries, the accuracy was 0.7527 and the kappa coefficient was 0.5841. The proposed algorithm, based on Random Forest Classification and nine features, outperformed the other machine learning methods and models with different numbers of features.
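For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and Cohen's kappa are computed from a three-class confusion matrix. It assumes the reported kappa is the standard unweighted Cohen's kappa, which the abstract does not specify. The matrix values are invented for illustration and are not the paper's results, and `accuracy_and_kappa` is a hypothetical helper name.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] counts samples whose true class is i and predicted class is j.
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("confusion matrix must be square")
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: the fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Toy confusion matrix for three alert levels (illustrative counts only).
toy_matrix = [
    [40, 5, 5],
    [5, 30, 5],
    [0, 5, 5],
]
toy_accuracy, toy_kappa = accuracy_and_kappa(toy_matrix)
print(f"accuracy = {toy_accuracy:.4f}, kappa = {toy_kappa:.4f}")
```

Kappa discounts the agreement a classifier would reach by chance given the class frequencies, which is why it is lower than accuracy in both sets of figures above.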

Conclusions: Our results suggest that a model developed from 20 countries with Google Trends data and Random Forest Classification can be applied to predict the epidemic alert levels of most countries worldwide.

Conflict of interest statement

Competing interests: The authors completed the ICMJE Unified Competing Interest form (available upon request from the corresponding author) and declare that they have no competing interests.

Figures

Figure 1. The predicted alert level (red), normalized Google Trends search volume of the topic “Coronavirus” (green), and normalized daily new confirmed cases. Panel A: Italy. Panel B: United States Virgin Islands.
Figure 2. The importance of included features.
Figure 3. Classification confusion matrix.
