PLoS One. 2023 Feb 24;18(2):e0277878.
doi: 10.1371/journal.pone.0277878. eCollection 2023.

Sentiment analysis and causal learning of COVID-19 tweets prior to the rollout of vaccines

Qihuang Zhang et al. PLoS One. 2023.

Abstract

While the impact of the COVID-19 pandemic has been widely studied, relatively few discussions of the public's sentiment are available. In this article, we scrape COVID-19 related tweets from the microblogging platform Twitter and examine tweets posted from February 24, 2020 to October 14, 2020 in four Canadian cities (Toronto, Montreal, Vancouver, and Calgary) and four U.S. cities (New York, Los Angeles, Chicago, and Seattle). Applying the RoBERTa, VADER, and NRC approaches, we evaluate sentiment intensity scores and visualize the results over different periods of the pandemic. Sentiment scores for tweets concerning three anti-epidemic measures, "masks", "vaccine", and "lockdown", are computed for comparison. We explore possible causal relationships among variables describing tweet activity and the sentiment scores of COVID-19 related tweets by integrating the echo state network method with convergent cross-mapping. Our analyses show that public sentiment about COVID-19 varies over time and from place to place, and differs across the anti-epidemic measures of "masks", "vaccine", and "lockdown". Evidence of causal relationships is revealed for the examined variables, assuming the suggested model is feasible.
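The echo state network mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`esn_fit_predict`, `readout_skill`), the reservoir size, spectral radius, ridge penalty, washout length, and the toy sine-wave data are all assumptions chosen for illustration. The sketch shows only the general idea of a fixed random reservoir with a linear readout trained by ridge regression, followed by a Pearson correlation between fitted and observed values.

```python
import numpy as np


def esn_fit_predict(u, y, n_reservoir=50, spectral_radius=0.9,
                    ridge=1e-4, washout=20, seed=0):
    """Fit a minimal echo state network that maps series u to series y.

    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns the in-sample readout predictions, one per time step.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)

    # Fixed random input and recurrent weights; only the readout is trained.
    w_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))

    # Drive the reservoir with the input and record its states.
    states = np.zeros((len(u), n_reservoir))
    x = np.zeros(n_reservoir)
    for t, u_t in enumerate(u):
        x = np.tanh(w_in * u_t + w @ x)
        states[t] = x

    # Ridge-regression readout, ignoring the initial washout period.
    s = states[washout:]
    target = y[washout:]
    w_out = np.linalg.solve(s.T @ s + ridge * np.eye(n_reservoir), s.T @ target)
    return states @ w_out


def readout_skill(pred, y, washout=20):
    """Pearson correlation between predictions and targets after the washout."""
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(pred[washout:], y[washout:])[0, 1])


# Toy usage: recover a delayed copy of a sine wave from the sine wave itself.
t = np.arange(300) * 0.1
u = np.sin(t)
y = np.sin(t - 0.3)
pred = esn_fit_predict(u, y)
skill = readout_skill(pred, y)
```

In a cross-mapping setting, a skill value of this kind would be computed for each direction and each candidate lag; how the authors combine the two methods in detail is not stated in the abstract.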

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Timeline of the measures, indicated by three periods, taken by four provincial governments in Canada (Alberta, Quebec, British Columbia, and Ontario) and four state governments in the U.S. (Washington, Illinois, California, and New York).
Fig 2. The diagram of an echo state network.
Fig 3. Canadian data: The trajectory of provincial daily infected cases, the daily number of tweets, and the total counts of likes, replies, and retweets of the COVID-19 related tweets for Toronto, Vancouver, Montreal, and Calgary. The y-axis is presented on a logarithmic scale.
Fig 4. U.S. data: The trajectory of state daily infected cases, the daily number of tweets, and the total counts of likes, replies, and retweets of the COVID-19 related tweets for New York, Los Angeles, Seattle, and Chicago. The y-axis is presented on a logarithmic scale.
Fig 5. Line plots of the sentiment scores of “COVID19” related tweets over time for the eight cities in North America, calculated using the VADER lexicon and RoBERTa, respectively.
Fig 6. Line plots of the sentiment scores of “mask”, “vaccine”, and “lockdown” related tweets over time for the eight cities in North America, calculated using RoBERTa.
Fig 7. Density plot of the distribution of mood frequencies for different cities over 234 days.
Fig 8. Pearson correlation coefficient versus different choices of lag for examining the possible causal relationship between X and Y, where X represents the daily like counts of a city and Y stands for the daily average sentiment scores in that city, calculated using the RoBERTa method. Here Direction 1 examines whether Y is the cause of X, and Direction 2 assesses whether X is the cause of Y. The dashed vertical line marks the lag at which the correlation peaks.
Fig 9. Pearson correlation coefficient versus different values of τ for examining the possible causal relationship between X and Y, taken from a pair of variables among the sentiment scores calculated using RoBERTa, the daily number of tweets, the daily number of likes, the daily number of replies, and the daily number of retweets. Here Direction 1 examines whether Y is the cause of X, and Direction 2 assesses whether X is the cause of Y. The dashed vertical line marks the lag at which the correlation peaks.
Fig 10. Summary of causal relationships for the variables associated with the tweet activities and sentiment scores.
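The captions for Figs 8 and 9 describe scanning a range of lags and marking the one at which the Pearson correlation peaks. A minimal sketch of that scan follows. It computes a plain lagged correlation between two raw series, which is a simplification and not the echo state network based cross-mapping used in the paper; the function names (`lagged_correlations`, `peak_lag`) and the synthetic data are hypothetical.

```python
import numpy as np


def lagged_correlations(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    corrs = {}
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # earlier values of x
        b = y[lag:]             # values of y observed `lag` steps later
        corrs[lag] = float(np.corrcoef(a, b)[0, 1])
    return corrs


def peak_lag(corrs):
    """Lag with the largest correlation (where the dashed line would be drawn)."""
    return max(corrs, key=corrs.get)


# Synthetic example: y is a copy of x delayed by three steps.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = np.concatenate([np.zeros(3), x[:-3]])

corrs = lagged_correlations(x, y, max_lag=10)
best = peak_lag(corrs)
```

Running the scan in both directions (x leading y, then y leading x) and comparing the peaks mirrors the "Direction 1" and "Direction 2" comparison in the captions, although a correlation peak alone does not establish causation.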
