J Med Internet Res. 2024 Dec 9:26:e59425.
doi: 10.2196/59425.

Long COVID Discourse in Canada, the United States, and Europe: Topic Modeling and Sentiment Analysis of Twitter Data

Ahmed Ghassan Tawfiq AbuRaed et al. J Med Internet Res.

Abstract

Background: Social media serves as a vast repository of data, offering insights into public perceptions and emotions surrounding significant societal issues. Amid the COVID-19 pandemic, long COVID (formally known as post-COVID-19 condition) has emerged as a chronic health condition, profoundly impacting numerous lives and livelihoods. Given the dynamic nature of long COVID and our evolving understanding of it, effectively capturing people's sentiments and perceptions through social media becomes increasingly crucial. By harnessing the wealth of data available on social platforms, we can better track the evolving narrative surrounding long COVID and the collective efforts to address this pressing issue.

Objective: This study aimed to investigate people's perceptions and sentiments around long COVID in Canada, the United States, and Europe, by analyzing English-language tweets from these regions using advanced topic modeling and sentiment analysis techniques. Understanding regional differences in public discourse can inform tailored public health strategies.

Methods: We analyzed long COVID-related tweets from 2021. Contextualized topic modeling was used to capture word meanings in context, providing coherent and semantically meaningful topics. Sentiment analysis was conducted in a zero-shot manner using Llama 2, a large language model, to classify tweets into positive, negative, or neutral sentiments. The results were interpreted in collaboration with public health experts, comparing the timelines of topics discussed across the 3 regions. This dual approach enabled a comprehensive understanding of the public discourse surrounding long COVID. We used metrics such as normalized pointwise mutual information for coherence and topic diversity for diversity to ensure robust topic modeling results.
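The two evaluation metrics named in the Methods can be illustrated with a minimal sketch. It assumes document-level co-occurrence counts for normalized pointwise mutual information (NPMI) and defines topic diversity as the share of unique words among the top words of all topics. The function names, the toy documents, and the handling of the degenerate cases are assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Mean NPMI over all word pairs in one topic (needs at least two words)."""
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            scores.append(-1.0)  # words never co-occur: NPMI lower bound
        elif p12 == 1.0:
            scores.append(0.0)  # both words in every document: 0/0, set to 0 here
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


def topic_diversity(topics, top_k=10):
    """Share of unique words among the top_k words of every topic (0 to 1)."""
    top_words = [w for topic in topics for w in topic[:top_k]]
    return len(set(top_words)) / len(top_words)


# Hypothetical, already tokenized tweets (not data from the study).
toy_docs = [
    ["long", "covid", "fatigue", "symptoms"],
    ["long", "covid", "research", "treatment"],
    ["fatigue", "symptoms", "months"],
    ["vaccine", "children", "covid"],
]

print(npmi_coherence(["fatigue", "symptoms"], toy_docs))  # always co-occur: close to 1.0
print(npmi_coherence(["fatigue", "vaccine"], toy_docs))   # never co-occur: -1.0
print(topic_diversity([["fatigue", "symptoms"], ["symptoms", "research"]], top_k=2))  # 0.75
```

Higher NPMI indicates that a topic's top words tend to appear in the same tweets, while a diversity close to 1 indicates that topics share few top words.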

Results: Topic modeling identified five main topics: (1) long COVID in people including children in the context of vaccination, (2) duration and suffering associated with long COVID, (3) persistent symptoms of long COVID, (4) the need for research on long COVID treatment, and (5) measuring long COVID symptoms. Significant concern was noted across all regions about the duration and suffering associated with long COVID, along with consistent discussions on persistent symptoms and calls for more research and better treatments. In particular, the topic of persistent symptoms was highly prevalent, reflecting ongoing challenges faced by individuals with long COVID. Sentiment analysis showed a mix of positive and negative sentiments, fluctuating with significant events and news related to long COVID.

Conclusions: Our study combines natural language processing techniques, including contextualized topic modeling and sentiment analysis, along with domain expert input, to provide detailed insights into public health monitoring and intervention. These findings highlight the importance of tracking public discourse on long COVID to inform public health strategies, address misinformation, and provide support to affected individuals. The use of social media analysis in understanding public health issues is underscored, emphasizing the role of emerging technologies in enhancing public health responses.

Keywords: Twitter; long COVID; public health; public perception; sentiment analysis; social media analysis; topic modeling.

Conflict of interest statement

Conflicts of Interest: NZJ participated in advisory boards and has spoken for AbbVie and Gilead, not related to this work.

Figures

Figure 1
Intertopic distance map via multidimensional scaling. This figure visualizes the distance between the 5 topics identified from long COVID–related tweets in 2021, showing intertopic overlap among topics T1-T4 but not for T5. The study includes tweets from Canada, the United States, and Europe. T1: long COVID in people including children in the context of vaccination; T2: duration and suffering associated with long COVID; T3: persistent symptoms of long COVID; T4: need for research on long COVID treatment; and T5: measuring long COVID symptoms. In the figure, T1 is selected.
Figure 2
Analyzing topic trends: visualizing the prominence of different topics over time through a bar chart for the Canada region. This figure shows the weekly distribution of the 5 identified topics in long COVID–related tweets in Canada during 2021.
Figure 3
Analyzing topic trends: visualizing the prominence of different topics over time through a bar chart for the US region. This figure displays the weekly distribution of the 5 identified topics in long COVID–related tweets in the United States during 2021.
Figure 4
Analyzing topic trends: visualizing the prominence of different topics over time through a bar chart for the Europe region. This figure depicts the weekly distribution of the 5 identified topics in long COVID–related tweets in Europe during 2021.
Figure 5
Analyzing topic trends: visualizing the prominence of different topics over time through a line chart for the Canada region. This figure shows the trends of the 5 identified topics in long COVID–related tweets in Canada over the year 2021.
Figure 6
Analyzing topic trends: visualizing the prominence of different topics over time through a line chart for the US region. This figure illustrates the trends of the 5 identified topics in long COVID–related tweets in the United States over the year 2021.
Figure 7
Analyzing topic trends: visualizing the prominence of different topics over time through a line chart for the Europe region. This figure shows the trends of the 5 identified topics in long COVID–related tweets in Europe over the year 2021.
Figure 8
Sentiment analysis: visualizing the monthly sentiment counts produced by Llama 2 through a bar chart for the Canada region. This figure displays the count distribution of positive, negative, and neutral sentiments in long COVID–related tweets in Canada during each month of 2021.
Figure 9
Sentiment analysis: visualizing the monthly sentiment counts produced by Llama 2 through a bar chart for the US region. This figure shows the count distribution of positive, negative, and neutral sentiments in long COVID–related tweets in the United States during each month of 2021.
Figure 10
Sentiment analysis: visualizing the monthly sentiment counts produced by Llama 2 through a bar chart for the Europe region. This figure illustrates the count distribution of positive, negative, and neutral sentiments in long COVID–related tweets in Europe during each month of 2021.
Figure 11
Sentiment analysis: visualizing the monthly sentiment percentages produced by Llama 2 through a bar chart for the Canada region. This figure displays the percentage distribution of positive, negative, and neutral sentiments in long COVID–related tweets in Canada during each month of 2021.
Figure 12
Sentiment analysis: visualizing the monthly sentiment percentages produced by Llama 2 through a bar chart for the US region. This figure shows the percentage distribution of positive, negative, and neutral sentiments in long COVID–related tweets in the United States during each month of 2021.
Figure 13
Sentiment analysis: visualizing the monthly sentiment percentages produced by Llama 2 through a bar chart for the Europe region. This figure illustrates the percentage distribution of positive, negative, and neutral sentiments in long COVID–related tweets in Europe during each month of 2021.
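The monthly counts and percentages behind Figures 8 to 13 could be derived roughly as sketched here. The sketch assumes each tweet comes with a date and a free-text model response, and that the response is mapped to whichever of the three labels it mentions first. The function names `parse_label` and `monthly_sentiment`, the parsing rule, and the sample responses are hypothetical, since the abstract does not describe how model outputs were post-processed.

```python
from collections import Counter, defaultdict
from datetime import date

LABELS = ("positive", "negative", "neutral")


def parse_label(raw_output):
    """Return the label mentioned first in the model response, or None if none is."""
    text = raw_output.lower()
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else None


def monthly_sentiment(records):
    """Map (year, month) to {label: (count, percent)} for parseable responses."""
    counts = defaultdict(Counter)
    for day, raw_output in records:
        label = parse_label(raw_output)
        if label is not None:  # responses without a recognizable label are skipped
            counts[(day.year, day.month)][label] += 1

    summary = {}
    for month, counter in sorted(counts.items()):
        total = sum(counter.values())
        summary[month] = {
            label: (counter[label], 100.0 * counter[label] / total)
            for label in LABELS
        }
    return summary


# Hypothetical model responses (not data from the study).
sample = [
    (date(2021, 1, 5), "Sentiment: Negative."),
    (date(2021, 1, 9), "negative"),
    (date(2021, 1, 20), "Positive"),
    (date(2021, 1, 25), "The tweet is neutral."),
    (date(2021, 2, 1), "Neutral"),
    (date(2021, 2, 3), "I cannot tell."),
]

for month, stats in monthly_sentiment(sample).items():
    print(month, stats)
```

Reporting percentages alongside raw counts keeps months with very different tweet volumes comparable.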
