J Med Toxicol. 2017 Dec;13(4):278-286.
doi: 10.1007/s13181-017-0625-5. Epub 2017 Aug 22.

Epidemiology from Tweets: Estimating Misuse of Prescription Opioids in the USA from Social Media


Michael Chary et al. J Med Toxicol. 2017 Dec.

Abstract

Background: The misuse of prescription opioids (MUPO) is a leading public health concern. Social media are playing an expanding role in public health research, but there are few methods for estimating established epidemiological metrics from social media. The purpose of this study was to demonstrate that the geographic variation of social media posts mentioning prescription opioid misuse strongly correlates with government estimates of past-month MUPO.

Methods: We wrote software to acquire publicly available tweets from Twitter from 2012 to 2014 that contained at least one keyword related to prescription opioid use (n = 3,611,528). A medical toxicologist and emergency physician curated the list of keywords. We used the semantic distance (SemD) to automatically quantify the similarity of meaning between tweets and identify tweets that mentioned MUPO. We defined the SemD between two words as the shortest distance between the two corresponding word-centroids. Each word-centroid represented all recognized meanings of a word. We validated this automatic identification with manual curation. We used Twitter metadata to estimate the location of each tweet. We compared our estimated geographic distribution with the 2013-2015 National Surveys on Drug Use and Health (NSDUH).
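
The word-centroid definition of SemD above can be illustrated with a minimal sketch. It assumes each recognized meaning of a word is already available as a numeric vector and that "distance" means Euclidean distance; the abstract specifies neither, and the vectors and the names `SENSES`, `centroid`, and `semd` are hypothetical placeholders rather than the authors' implementation.

```python
import math

# Hypothetical sense vectors: one vector per recognized meaning of a word.
# These numbers are made up for illustration only.
SENSES = {
    "oxy": [(0.9, 0.1), (0.7, 0.3)],
    "percocet": [(0.8, 0.2)],
    "pizza": [(0.1, 0.9)],
}


def centroid(vectors):
    """Average a word's sense vectors into a single word-centroid."""
    n = len(vectors)
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / n for i in range(dims))


def semd(word_a, word_b, senses=SENSES):
    """Assumed SemD: Euclidean distance between two word-centroids."""
    return math.dist(centroid(senses[word_a]), centroid(senses[word_b]))


if __name__ == "__main__":
    print(semd("oxy", "percocet"))
    print(semd("oxy", "pizza"))
```

Under these assumptions, words with related meanings end up with a smaller SemD than unrelated words, which is the property the clustering step relies on.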

Results: Tweets that mentioned MUPO formed a distinct cluster far away from semantically unrelated tweets. The state-by-state correlation between Twitter and NSDUH was highly significant across all NSDUH survey years. The correlation was strongest between Twitter and NSDUH data from those aged 18-25 (r = 0.94, p < 0.01 for 2012; r = 0.94, p < 0.01 for 2013; r = 0.71, p = 0.02 for 2014). The correlation was driven by discussions of opioid use, even after controlling for geographic variation in Twitter usage.
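
A state-by-state comparison of this kind can be sketched as a Pearson correlation between each state's fraction of MUPO tweets and its survey estimate. The sketch below is only an illustration: dividing MUPO tweet counts by total tweet counts is one assumed way to control for geographic variation in Twitter usage, and the helper names `mupo_fraction` and `pearson_r` are hypothetical.

```python
import math


def mupo_fraction(mupo_counts, total_counts):
    """Per-state fraction of tweets mentioning MUPO (assumed normalization)."""
    return {state: mupo_counts[state] / total_counts[state] for state in mupo_counts}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def state_correlation(twitter_fraction, survey_estimate):
    """Correlate the two measures over the states present in both inputs."""
    states = sorted(set(twitter_fraction) & set(survey_estimate))
    return pearson_r(
        [twitter_fraction[s] for s in states],
        [survey_estimate[s] for s in states],
    )
```

Calling `state_correlation(mupo_fraction(mupo_counts, total_counts), survey_estimate)` with per-state dictionaries would return a single coefficient for one survey year and age group.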

Conclusions: Mentions of MUPO on Twitter correlate strongly with state-by-state NSDUH estimates of MUPO. We have also demonstrated that natural language processing can be used to analyze social media to provide insights for syndromic toxicosurveillance.

Keywords: Computational linguistics; Epidemiology; Misuse; Natural language processing; Opioids; Social media.

Conflict of interest statement

Conflict of Interest

The authors declare that they have no conflicts of interest.

Sources of Funding

None

Figures

Fig. 1
Study design. Data are collected from Twitter via Twitter’s Streaming API. Tweets with fewer than one character are excluded. Tweets are filtered into "signal" tweets (tweets of interest) if they contain keywords; otherwise, they are assigned to "basal activity". MUPO tweets are identified by clustering on SemD and validated by expert curation. A scaled version of the fraction of MUPO tweets in each state is compared with the NSDUH estimate for that same state.
Fig. 2
Separation of tweets into semantic clusters. Each panel shows the same 0.01% random sample of tweets projected onto the two principal components indicated by the panel’s axes. PC1, PC2, and PC3 refer to principal components 1, 2, and 3. The diagonal shows the distribution of values projected onto each principal component. Data are from 2012.
Fig. 3
Twenty most common words in the MUPO and not-MUPO clusters in the signal stream. The x-axis shows the frequency of words in each category on a logarithmic scale. Both panels use the same logarithmic scale. Twitter data are from 2012.
Fig. 4
Twenty most common words from the basal activity stream. The x-axis shows the frequency of words in each category on a logarithmic scale.
Fig. 5
Scatter plot of estimates of MUPO from NSDUH and Twitter for 2012. The title of each panel indicates the NSDUH age range. Open circles are estimates for each state, scaled as indicated in the “Methods” section. The solid line shows the linear regression line.
Fig. 6
Correlation between NSDUH and Twitter across age groups. The legend indicates NSDUH age groups. All correlation coefficients are significantly greater than 0.
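
The filtering step described in the Fig. 1 caption can be sketched as follows. This is a rough illustration under stated assumptions: the curated keyword list is not given here, so `KEYWORDS` holds two hypothetical example terms, and matching on whitespace-separated lowercase tokens is an assumed simplification.

```python
# Hypothetical keyword set; the curated list used in the study is not shown here.
KEYWORDS = {"oxycodone", "percocet"}


def split_streams(tweets, keywords=KEYWORDS):
    """Split tweet texts into a keyword-bearing signal stream and basal activity."""
    signal, basal = [], []
    for text in tweets:
        if len(text) < 1:
            # Tweets with fewer than one character are excluded.
            continue
        tokens = set(text.lower().split())
        if tokens & keywords:
            signal.append(text)
        else:
            basal.append(text)
    return signal, basal


if __name__ == "__main__":
    sample = ["took a percocet today", "", "nice weather"]
    print(split_streams(sample))
```

Identifying which signal tweets actually mention MUPO is a separate step (clustering on SemD plus expert curation) and is not covered by this sketch.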


