J Am Med Inform Assoc. 2021 Jun 12;28(6):1125-1134.
doi: 10.1093/jamia/ocaa298.

Developing a standardized protocol for computational sentiment analysis research using health-related social media data

Lu He et al. J Am Med Inform Assoc.

Abstract

Objective: Sentiment analysis is a popular tool for analyzing health-related social media content. However, existing studies exhibit numerous methodological issues and inconsistencies with respect to research design and results reporting, which could lead to biased data, imprecise or incorrect conclusions, or incomparable results across studies. This article reports a systematic analysis of the literature with respect to such issues. The objective was to develop a standardized protocol for improving the research validity and comparability of results in future relevant studies.

Materials and methods: We developed the Protocol of Analysis of senTiment in Health (PATH) based on a systematic review that analyzed common research design choices and how such choices were made, or reported, among eligible studies published 2010-2019.

Results: Of 409 articles screened, 89 met the inclusion criteria. A total of 16 distinctive research design choices were identified, 9 of which have significant methodological or reporting inconsistencies among the articles reviewed, ranging from how relevance of study data was determined to how the sentiment analysis tool selected was validated. Based on this result, we developed the PATH protocol that encompasses all these distinctive design choices and highlights the ones for which careful consideration and detailed reporting are particularly warranted.

Conclusions: A substantial degree of methodological and reporting inconsistency exists in the extant literature that applies sentiment analysis to health-related social media data. The PATH protocol developed through this research may help mitigate such issues in future relevant studies.

Keywords: Facebook; Instagram; Twitter; Web 2.0; computing methodologies [L01.224]; machine learning [G17.035.250.500]; natural language processing [L01.224.050.375.580]; reference standard [E05.978.808]; sentiment analysis; social media [L01.178.75]; user-generated content.

Figures

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram.
Figure 2. Protocol of Analysis of senTiment in Health (PATH).
