Twitter conversations predict the daily confirmed COVID-19 cases
- PMID: 36092470
- PMCID: PMC9444159
- DOI: 10.1016/j.asoc.2022.109603
Abstract
At the time of writing, COVID-19 (coronavirus disease 2019) has spread to more than 220 countries and territories. Since the outbreak, the seriousness of the pandemic has made people more active on social media, especially on microblogging platforms such as Twitter and Weibo. Pandemic-specific discourse has remained a trending topic on these platforms for months. Previous studies have confirmed that such socially generated conversations contribute to situational awareness of crisis events. Early case forecasts are essential for authorities to estimate the resources needed to cope with the spread of the virus. This study therefore attempts to incorporate public discourse into the design of forecasting models targeted at the steep-hill region of an ongoing wave. We propose a sentiment-involved, topic-based latent variables search methodology for designing forecasting models from publicly available Twitter conversations. As a use case, we apply the proposed methodology to Australian COVID-19 daily cases and Twitter conversations generated within the country. Experimental results (i) show the presence of latent social media variables that Granger-cause the daily confirmed COVID-19 cases, and (ii) confirm that those variables offer additional prediction capability to forecasting models. Further, the results show that including social media variables improves RMSE by 48.83%-51.38% over the baseline models. We also release MegaGeoCOV, a large-scale COVID-19 specific geotagged global tweets dataset, to the public, anticipating that geotagged data of this scale will aid in understanding the conversational dynamics of the pandemic through other spatial and temporal contexts.
Keywords: ARIMAX models; Granger causality; Pandemic forecast; Social media analytics; Time series analysis; Twitter analytics; VAR models.
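The Granger-causality idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses synthetic data, a single lag, and hypothetical names (`tweet_signal`, `daily_cases`, `granger_f_stat`), and it assumes a plain F-test that compares an autoregression of the cases with and without the lagged social media variable. A large F statistic suggests the lagged variable adds predictive information.

```python
import random

def ols_rss(rows, targets):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )

def granger_f_stat(y, x, lag=1):
    """F statistic for 'lags of x help predict y beyond lags of y alone'."""
    n = len(y)
    targets = y[lag:]
    # Restricted model: intercept plus lags of y only.
    restricted = [
        [1.0] + [y[t - k] for k in range(1, lag + 1)] for t in range(lag, n)
    ]
    # Unrestricted model: the same regressors plus lags of x.
    unrestricted = [
        row + [x[t - k] for k in range(1, lag + 1)]
        for row, t in zip(restricted, range(lag, n))
    ]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)

# Synthetic illustration (assumed data, not from the study): cases depend on
# yesterday's cases and yesterday's tweet-derived signal.
random.seed(0)
tweet_signal = [random.gauss(0.0, 1.0) for _ in range(200)]
daily_cases = [0.0]
for t in range(1, 200):
    daily_cases.append(
        0.5 * daily_cases[t - 1] + 0.8 * tweet_signal[t - 1] + random.gauss(0.0, 0.5)
    )

f_tweets_to_cases = granger_f_stat(daily_cases, tweet_signal)
f_cases_to_tweets = granger_f_stat(tweet_signal, daily_cases)
print("tweets -> cases F:", round(f_tweets_to_cases, 2))
print("cases -> tweets F:", round(f_cases_to_tweets, 2))
```

In practice the F statistic would be compared against a critical value (or converted to a p-value) and the lag order chosen by an information criterion; a statistics library would normally handle both steps.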
© 2022 Elsevier B.V. All rights reserved.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.