What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, and Prevention
- PMID: 28630032
- PMCID: PMC5495967
- DOI: 10.2196/publichealth.7157
Abstract
Background: To harness what people are tweeting about Zika, a computational framework is needed that uses machine learning to identify relevant Zika tweets and then sort them into disease-specific categories, so that specific societal concerns about the prevention, transmission, symptoms, and treatment of Zika virus can be addressed.
Objective: The purpose of this study was to determine the relevancy of the tweets and what people were tweeting about the 4 disease characteristics of Zika: symptoms, transmission, prevention, and treatment.
Methods: A combination of natural language processing and machine learning techniques was used to determine what people were tweeting about Zika. Specifically, a two-stage classifier system was built to find relevant tweets about Zika, and then the tweets were categorized into 4 disease categories. Tweets in each disease category were then examined using latent Dirichlet allocation (LDA) to determine the 5 main tweet topics for each disease characteristic.
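The two-stage design described in the Methods can be illustrated with a minimal sketch. The abstract does not name the learning algorithm or features used, so everything below is an assumption made for illustration: a tiny multinomial naive Bayes model over whitespace tokens, hypothetical toy tweets, and the hypothetical names `NaiveBayes`, `stage1`, `stage2`, and `classify`. The LDA topic step is not shown.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of tweets
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                # Laplace (add-one) smoothing so unseen tokens do not zero out a class
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, not taken from the study.
relevance_data = [
    ("zika virus causes fever and rash", "relevant"),
    ("mosquito bites spread zika virus", "relevant"),
    ("use repellent to prevent zika", "relevant"),
    ("no vaccine or treatment for zika yet", "relevant"),
    ("my band zika plays tonight", "irrelevant"),
    ("great pizza at the party tonight", "irrelevant"),
]
category_data = [
    ("zika virus causes fever and rash", "symptoms"),
    ("mosquito bites spread zika virus", "transmission"),
    ("use repellent to prevent zika", "prevention"),
    ("no vaccine or treatment for zika yet", "treatment"),
]

stage1 = NaiveBayes().fit(*zip(*relevance_data))  # relevant vs. irrelevant
stage2 = NaiveBayes().fit(*zip(*category_data))   # one of the 4 disease categories


def classify(tweet):
    """Return a disease category for relevant tweets, or None if filtered out."""
    if stage1.predict(tweet) != "relevant":
        return None
    return stage2.predict(tweet)


print(classify("fever and rash from zika"))
print(classify("great pizza at the party"))
```

Only tweets that pass the first stage reach the second, so the category model never has to account for off-topic text.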
Results: Over 4 months, 1,234,605 tweets were collected. The number of tweets by males and females was similar (28.47% [351,453/1,234,605] and 23.02% [284,207/1,234,605], respectively). The classifier performed well on the training and test data for relevancy (F1 score=0.87 and 0.99, respectively) and disease characteristics (F1 score=0.79 and 0.90, respectively). Five topics for each category were found and discussed, with a focus on the symptoms category.
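For readers unfamiliar with the F1 score reported above, it is the harmonic mean of precision and recall. The sketch below computes it for a single positive class on hypothetical labels; the abstract does not state how scores were averaged across the 4 categories, so no averaging scheme is assumed here, and the function name `f1_score` is illustrative.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = relevant, 0 = irrelevant.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred, positive=1)
print(round(score, 3))
```

Here there are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3 and the F1 score is also 2/3.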
Conclusions: We demonstrate how categories of discussion on Twitter about an epidemic can be discovered so that public health officials can understand specific societal concerns within the disease-specific categories. Our two-stage classifier was able to identify relevant tweets to enable more specific analysis, including the specific aspects of Zika that were being discussed as well as misinformation being expressed. Future studies can capture sentiments and opinions on epidemic outbreaks like Zika virus in real time, which will likely inform efforts to educate the public at large.
Keywords: epidemiology; machine learning; social media; viruses.
©Michele Miller, Tanvi Banerjee, Roopteja Muppalla, William Romine, Amit Sheth. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 19.06.2017.
Conflict of interest statement
Conflicts of Interest: None declared.
