JMIR Public Health Surveill. 2020 Apr 14;6(2):e18828. doi: 10.2196/18828.

Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study

Seyed Mohammad Ayyoubzadeh et al. JMIR Public Health Surveill.

Abstract

Background: The recent global outbreak of coronavirus disease (COVID-19) is affecting many countries worldwide. Iran is one of the top 10 most affected countries. Search engines capture useful population-level data that might help in analyzing epidemics. Applying data mining methods to data from electronic resources might provide better insight into the COVID-19 outbreak and help manage the health crisis in each country and worldwide.

Objective: This study aimed to predict the incidence of COVID-19 in Iran.

Methods: Data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, and root mean square error (RMSE) was used as the performance metric.
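
The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes an ordinary least squares fit, shuffled folds (the abstract does not say how folds were formed), and synthetic stand-in data. The function name `kfold_rmse` and all numbers in the example are hypothetical.

```python
import numpy as np

def kfold_rmse(X, y, k=10, seed=0):
    """Return the per-fold RMSE of an ordinary least squares fit.

    Assumption: folds are formed by shuffling rows; the source does not
    state how its 10 folds were constructed.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Prepend a column of ones so the model has an intercept.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        resid = A_test @ coef - y[test]
        scores.append(np.sqrt(np.mean(resid ** 2)))
    return np.array(scores)

# Synthetic stand-in data (NOT the study's data): 60 days, 3 predictors.
data_rng = np.random.default_rng(42)
X = data_rng.uniform(0, 100, size=(60, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(0, 5, size=60)

scores = kfold_rmse(X, y, k=10)
# Report the mean RMSE across folds with its standard deviation.
print(f"RMSE {scores.mean():.3f} (SD {scores.std(ddof=1):.3f})")
```

Reporting the mean and SD of the per-fold RMSE gives a summary in the same form as "RMSE x (SD y)". For daily time series, shuffled folds can leak information between neighboring days, so a time-ordered split may be a more conservative choice.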

Results: The linear regression model predicted the incidence with an RMSE of 7.562 (SD 6.492). The most effective factors besides previous day incidence included the search frequency of handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187 (SD 20.705).
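
Using previous day incidence as a predictor amounts to adding a lag-1 column to the design matrix. The sketch below shows one way to build it, assuming that day t's cases are predicted from day t-1's cases together with day t's search frequencies; the abstract does not specify this alignment. The helper name `build_lagged_design` and the sample values are hypothetical.

```python
import numpy as np

def build_lagged_design(cases, trends):
    """Pair each day's case count with the previous day's count and
    the same day's search-frequency values.

    Assumption: day t is predicted from cases on day t-1 plus search
    frequencies on day t; the source does not state the alignment.
    """
    cases = np.asarray(cases, dtype=float)
    trends = np.asarray(trends, dtype=float)
    if trends.ndim == 1:
        trends = trends[:, None]  # treat a single series as one column
    if len(cases) != len(trends):
        raise ValueError("cases and trends must cover the same days")
    X = np.column_stack([cases[:-1], trends[1:]])  # lag-1 cases, same-day trends
    y = cases[1:]                                  # target: cases on day t
    return X, y

# Hypothetical values for illustration only.
X_lag, y_lag = build_lagged_design([1, 2, 3, 4], [10, 20, 30, 40])
print(X_lag)
print(y_lag)
```

The first day is dropped from the targets because it has no previous day to draw the lagged value from.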

Conclusions: Data mining algorithms can be employed to predict trends of outbreaks. Such predictions might help policymakers and health care managers plan and allocate health care resources accordingly.

Keywords: COVID-19; Google Trends; LSTM; coronavirus; incidence; linear regression; outbreak; pandemic; prediction; public health.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Proposed LSTM network architecture. LSTM: long short-term memory.
Figure 2. Training and validation loss of the long short-term memory model. MSE: mean squared error.
Figure 3. Actual and predicted new cases of COVID-19. LSTM: long short-term memory; COVID-19: coronavirus disease.
