AMIA Jt Summits Transl Sci Proc. 2023 Jun 16;2023:261-270. eCollection 2023.

Generalizable Natural Language Processing Framework for Migraine Reporting from Social Media

Yuting Guo et al. AMIA Jt Summits Transl Sci Proc.

Abstract

Migraine is a highly prevalent and disabling neurological disorder. However, information about migraine management in real-world settings is limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter on social media (Twitter and Reddit), self-reported by people with migraine; (ii) develop a platform-independent text classification system for automatically detecting self-reported migraine-related posts; and (iii) analyze the self-reported posts to assess the utility of social media for studying this problem. We manually annotated 5750 Twitter posts and 302 Reddit posts and used them to train and evaluate supervised machine learning methods. Our best system achieved an F1 score of 0.90 on Twitter and 0.93 on Reddit. Analysis of the information posted by our 'migraine cohort' revealed abundant relevant information about migraine therapies and the sentiments associated with them. Our study forms a foundation for in-depth analysis of migraine-related information using social media data.
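
For readers unfamiliar with the metric reported in the abstract, the F1 score is the harmonic mean of precision and recall for the positive class (here, self-reported migraine posts). The following is a minimal sketch in Python. The function name `f1_score_binary` and the toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Compute F1 for the positive class from gold and predicted labels.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: gold positive, predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: gold negative, predicted positive.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: gold positive, predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up): 1 = self-reported migraine post, 0 = other.
toy_gold = [1, 0, 1, 1, 0, 1]
toy_pred = [1, 0, 0, 1, 1, 1]
print(round(f1_score_binary(toy_gold, toy_pred), 2))
```

Because F1 ignores true negatives, it is a common choice when the positive class is the one of interest, as with self-reports in a stream of mostly unrelated posts.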


Figures

Figure 1. The framework of our generalizable NLP system: model development and validation on Twitter data, followed by additional evaluation on Reddit posts.
Figure 2. Examples from the bias analysis. Green indicates positive attention to a word; red indicates negative attention.
Figure 3. The normalized sentiment distributions of the medications: (a) Twitter and (b) Reddit.

