Int J Environ Res Public Health. 2021 Apr 12;18(8):4069.
doi: 10.3390/ijerph18084069.

Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic

Quyen G To et al. Int J Environ Res Public Health.

Abstract

Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy because anti-vaccination content is widely available on social media, including Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets that were published during the COVID-19 pandemic. We compared the performance of the bidirectional encoder representations from transformers (BERT) and the bidirectional long short-term memory networks with pre-trained GloVe embeddings (Bi-LSTM) with classic machine learning methods including support vector machine (SVM) and naïve Bayes (NB). The results show that performance on the test set of the BERT model was: accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. Bi-LSTM model performance showed: accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with linear kernel performed at: accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB demonstrated: accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
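The F1 scores quoted above can be related to the precision and recall figures through the standard definition of F1 as their harmonic mean. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the names `reported`, `f1_score`, and `f1_by_model` are illustrative, and the inputs are the rounded percentages from the abstract.

```python
# Precision and recall (in percent) as reported in the abstract, keyed by model.
# These are rounded published values, not raw confusion-matrix counts.
reported = {
    "BERT": (93.4, 97.6),
    "Bi-LSTM": (44.0, 47.2),
    "SVM (linear)": (19.5, 78.6),
    "Complement NB": (23.0, 32.8),
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for each model from its reported precision and recall.
f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.1f}%")
```

The recomputed values agree with the reported F1 scores to within about 0.1 percentage points. Small differences are expected because the inputs here are already rounded to one decimal place.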

Keywords: BERT; LSTM; deep learning; neural network; stance analysis; transformer; vaccine.

Conflict of interest statement

The authors declare no conflict of interest.

