Using Machine Learning to Compare Provaccine and Antivaccine Discourse Among the Public on Social Media: Algorithm Development Study
- PMID: 34185004
- PMCID: PMC8277307
- DOI: 10.2196/23105
Abstract
Background: Despite numerous counteracting efforts, antivaccine content linked to delays and refusals to vaccinate has grown persistently on social media, while only a few provaccine campaigns have succeeded in engaging with or persuading the public to accept immunization. Many prior studies have associated the diversity of topics discussed by antivaccine advocates with the public's higher engagement with such content. Nonetheless, a comprehensive comparison of discursive topics in pro- and antivaccine content in the engagement-persuasion spectrum remains unexplored.
Objective: We aimed to compare discursive topics chosen by pro- and antivaccine advocates in their attempts to influence the public to accept or reject immunization in the engagement-persuasion spectrum. Our overall objective was pursued through three specific aims as follows: (1) we classified vaccine-related tweets into provaccine, antivaccine, and neutral categories; (2) we extracted and visualized discursive topics from these tweets to explain disparities in engagement between pro- and antivaccine content; and (3) we identified how those topics frame vaccines using Entman's four framing dimensions.
Methods: We adopted a multimethod approach to analyze discursive topics in the vaccine debate on public social media sites. Our approach combined (1) large-scale balanced data collection from a public social media site (ie, 39,962 tweets from Twitter); (2) the development of a supervised classification algorithm for categorizing tweets into provaccine, antivaccine, and neutral groups; (3) the application of an unsupervised clustering algorithm for identifying prominent topics discussed on both sides; and (4) a multistep qualitative content analysis for identifying the prominent discursive topics and how vaccines are framed in these topics. In so doing, we alleviated methodological challenges that have hindered previous analyses of pro- and antivaccine discursive topics.
Results: Our results indicated that antivaccine topics have greater intertopic distinctiveness (ie, the degree to which discursive topics are distinct from one another) than their provaccine counterparts (t₁₂₂=2.30, P=.02). In addition, while antivaccine advocates use all four message frames known to make narratives persuasive and influential, provaccine advocates have neglected to provide a clear problem statement.
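The distinctiveness comparison can be illustrated with a minimal sketch. The abstract does not state how intertopic distinctiveness was computed, so this example assumes one plausible operationalization: the cosine distance between every pair of topic-term weight vectors on each side, compared with a pooled two-sample t statistic. The topic vectors, the function names, and the choice of cosine distance are all hypothetical, and because pairwise distances share topics they are not independent, so the statistic here is illustrative only.

```python
import math
from itertools import combinations
from statistics import mean, variance


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(topics):
    """Cosine distance for every unordered pair of topic vectors."""
    return [cosine_distance(u, v) for u, v in combinations(topics, 2)]


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance, plus degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical topic-term weights (rows = topics, columns = terms).
# These are made-up toy values, not data from the study.
anti_topics = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0.1, 0, 0, 1]]
pro_topics = [[1, 1, 0, 0], [1, 0.8, 0.1, 0], [0.9, 1, 0, 0.2], [1, 1, 0.3, 0]]

anti_dist = pairwise_distances(anti_topics)
pro_dist = pairwise_distances(pro_topics)
t_stat, df = pooled_t(anti_dist, pro_dist)
print(f"mean anti distance={mean(anti_dist):.3f}, "
      f"mean pro distance={mean(pro_dist):.3f}, t({df})={t_stat:.2f}")
```

In this toy setup, a positive t statistic means the first group's topics are, on average, farther apart from one another than the second group's, which is the direction of the difference the Results report for antivaccine versus provaccine topics.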
Conclusions: Based on our results, we attribute higher engagement among antivaccine advocates to the distinctiveness of the topics they discuss, and we ascribe the influence of the vaccine debate on uptake rates to the comprehensiveness of the message frames. These results show the urgency of developing clear problem statements for provaccine content to counteract the negative impact of antivaccine content on uptake rates.
Keywords: Twitter messaging; antivaccination movement; data visualization; health misinformation; infodemic; infodemiology; infoveillance; public health informatics; qualitative content analysis; social listening; supervised machine learning algorithm; unsupervised machine learning algorithm.
©Young Anna Argyris, Kafui Monu, Pang-Ning Tan, Colton Aarts, Fan Jiang, Kaleigh Anne Wiseley. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 24.06.2021.
Conflict of interest statement
Conflicts of Interest: None declared.