CoWIN twitter dataset: A comprehensive collection of public discourse on India's COVID-19 vaccination platform
- PMID: 39877807
- PMCID: PMC11772133
- DOI: 10.1016/j.dib.2024.111252
Abstract
The CoWIN Twitter Dataset offers a wide-ranging collection of public opinions on CoWIN, India's COVID-19 vaccination platform. The raw dataset contains 635,000 tweets that mention "cowin," collected from January to December 2021 using the Twitter Academic API. In addition to the raw data, the dataset includes a cleaned and processed set of 419,409 English tweets and a labelled subset with sentiment annotations. The raw data file holds tweet details such as ID, text, timestamp, user ID, and language. The processed dataset has URLs, hashtags, and other noise removed, and adds month and category groupings. Finally, the labelled dataset classifies the relevant tweets as positive or negative. This dataset enables researchers to analyse themes and sentiments related to India's vaccination administration. It can help policymakers gain insights into issues related to large-scale health initiatives and digital health systems. The mix of languages in the data also makes it useful for language processing research.
Keywords: COVID-19; CoWIN; Digital health; Health informatics; India; Sentiment analysis; Social media analytics; Twitter data.
© 2024 Published by Elsevier Inc.
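The processing described in the abstract (keeping English tweets, removing URLs and hashtags, and adding a month grouping) might look roughly like the sketch below. This is only an illustration and not the authors' pipeline: the field names `id`, `text`, `created_at`, and `lang`, the ISO 8601 timestamp format, and the regular expressions are all assumptions, and the "other noise" and category steps mentioned in the abstract are not covered.

```python
import re
from datetime import datetime

# Assumed patterns; the dataset's actual cleaning rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")


def clean_tweet(text: str) -> str:
    """Remove URLs and hashtags, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    return " ".join(text.split())


def month_group(timestamp: str) -> str:
    """Return a 'YYYY-MM' label; assumes timestamps start with an ISO date."""
    parsed = datetime.strptime(timestamp[:10], "%Y-%m-%d")
    return parsed.strftime("%Y-%m")


def process(rows):
    """Keep English tweets and attach cleaned text plus a month label.

    Each row is a dict with hypothetical keys: id, text, created_at, lang.
    """
    processed = []
    for row in rows:
        if row["lang"] != "en":
            continue
        processed.append(
            {
                "id": row["id"],
                "text": clean_tweet(row["text"]),
                "month": month_group(row["created_at"]),
            }
        )
    return processed


if __name__ == "__main__":
    # Made-up sample rows for illustration only; not taken from the dataset.
    sample = [
        {
            "id": "1",
            "text": "Booked my slot on #CoWIN today https://t.co/abc123 finally!",
            "created_at": "2021-05-01T09:30:00.000Z",
            "lang": "en",
        },
        {
            "id": "2",
            "text": "cowin par slot nahi mil raha",
            "created_at": "2021-05-02T10:00:00.000Z",
            "lang": "hi",
        },
    ]
    for record in process(sample):
        print(record)
```

Grouping by a `YYYY-MM` string keeps the month label sortable and independent of locale settings; a dataset that stores month names instead would need a different formatter.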