JMIR Public Health Surveill. 2017 Oct 18;3(4):e74.
doi: 10.2196/publichealth.8133.

Identifying Sentiment of Hookah-Related Posts on Twitter

Jon-Patrick Allem et al. JMIR Public Health Surveill.

Abstract

Background: The increasing popularity of hookah (or waterpipe) use in the United States and elsewhere has consequences for public health because hookah carries health risks similar to those of combustible cigarettes. As hookah use rapidly increases in popularity, social media data (Twitter, Instagram) can be used to capture and describe the social and environmental contexts in which individuals use, perceive, discuss, and are marketed this tobacco product. These data may allow people to organically report their sentiment toward tobacco products like hookah, unprimed by a researcher, without instrument bias, and at low cost.

Objective: This study describes the sentiment of hookah-related posts on Twitter and the importance of debiasing Twitter data when attempting to understand attitudes.

Methods: Hookah-related posts on Twitter (N=986,320) were collected from March 24, 2015, to December 2, 2016. Machine learning models were used to describe sentiment across 20 different emotions and to debias the data so that the posts reflected the sentiment of legitimate human users rather than that of social bots or marketing-oriented accounts, which could convey overly positive or overly negative sentiment toward hookah.

Results: From the analytical sample, 352,116 tweets (59.50%) were classified as positive while 177,537 (30.00%) were classified as negative, and 62,139 (10.50%) neutral. Among all positive tweets, 218,312 (62.00%) were classified as highly positive emotions (eg, active, alert, excited, elated, happy, and pleasant), while 133,804 (38.00%) positive tweets were classified as passive positive emotions (eg, contented, serene, calm, relaxed, and subdued). Among all negative tweets, 95,870 (54.00%) were classified as subdued negative emotions (eg, sad, unhappy, depressed, and bored) while the remaining 81,667 (46.00%) negative tweets were classified as highly negative emotions (eg, tense, nervous, stressed, upset, and unpleasant). Sentiment changed drastically when comparing a corpus of tweets with social bots to one without. For example, the probability of any one tweet reflecting joy was 61.30% from the debiased (or bot free) corpus of tweets. In contrast, the probability of any one tweet reflecting joy was 16.40% from the biased corpus.
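
The percentages reported above can be checked against the raw counts. The following is a minimal sketch, not part of the study's analysis. It assumes the analytic sample size equals the sum of the positive, negative, and neutral counts (the abstract does not state that total directly), and the variable and function names are illustrative only.

```python
# Counts as reported in the Results paragraph.
counts = {"positive": 352_116, "negative": 177_537, "neutral": 62_139}

# Assumption: the analytic sample is the sum of the three sentiment classes.
analytic_total = sum(counts.values())


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of each sentiment class within the analytic sample.
shares = {label: pct(n, analytic_total) for label, n in counts.items()}

# Breakdown within positive tweets (highly positive vs passive positive).
positive_split = {
    "highly_positive": pct(218_312, counts["positive"]),
    "passive_positive": pct(133_804, counts["positive"]),
}

# Breakdown within negative tweets (subdued negative vs highly negative).
negative_split = {
    "subdued_negative": pct(95_870, counts["negative"]),
    "highly_negative": pct(81_667, counts["negative"]),
}

if __name__ == "__main__":
    print(analytic_total)
    print(shares)
    print(positive_split)
    print(negative_split)
```

Under that assumption, the recomputed shares match the reported 59.50%, 30.00%, and 10.50%, and the two subcategory counts in each split sum to their parent totals.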

Conclusions: Social media data give researchers the ability to understand public sentiment and attitudes by listening to what people are saying in their own words. Tobacco control programmers in charge of risk communication may consider targeting individuals posting positive messages about hookah on Twitter or designing messages that amplify the negative sentiments. Posts on Twitter communicating positive sentiment toward hookah could add to the normalization of hookah use and are an area for future research. Findings from this study demonstrate the importance of debiasing data when attempting to understand attitudes from Twitter data.

Keywords: Twitter; big data; bots; hookah; sentiment; social media; waterpipe.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. Flowchart of how the analytic sample was derived.
Figure 2. Tagged tweets showing range of emotions from unpleasant to pleasant.
Figure 3. Probability of one tweet’s specific sentiment from the debiased (or social bot free) corpus of tweets.
Figure 4. Probability of one tweet’s specific sentiment from the biased corpus.
