Using Machine Learning to Compare Provaccine and Antivaccine Discourse Among the Public on Social Media: Algorithm Development Study

Young Anna Argyris; Kafui Monu; Pang-Ning Tan; Colton Aarts; Fan Jiang; Kaleigh Anne Wiseley

doi:10.2196/23105

JMIR Public Health and Surveillance (Jun 2021)



DOI: https://doi.org/10.2196/23105
Journal volume & issue: Vol. 7, no. 6, p. e23105

Abstract


Background: Despite numerous counteracting efforts, antivaccine content linked to delays and refusals to vaccinate has grown persistently on social media, while only a few provaccine campaigns have succeeded in engaging with or persuading the public to accept immunization. Many prior studies have associated the diversity of topics discussed by antivaccine advocates with the public’s higher engagement with such content. Nonetheless, a comprehensive comparison of discursive topics in pro- and antivaccine content in the engagement-persuasion spectrum remains unexplored.

Objective: We aimed to compare discursive topics chosen by pro- and antivaccine advocates in their attempts to influence the public to accept or reject immunization in the engagement-persuasion spectrum. Our overall objective was pursued through three specific aims as follows: (1) we classified vaccine-related tweets into provaccine, antivaccine, and neutral categories; (2) we extracted and visualized discursive topics from these tweets to explain disparities in engagement between pro- and antivaccine content; and (3) we identified how those topics frame vaccines using Entman’s four framing dimensions.

Methods: We adopted a multimethod approach to analyze discursive topics in the vaccine debate on public social media sites. Our approach combined (1) large-scale balanced data collection from a public social media site (ie, 39,962 tweets from Twitter); (2) the development of a supervised classification algorithm for categorizing tweets into provaccine, antivaccine, and neutral groups; (3) the application of an unsupervised clustering algorithm for identifying prominent topics discussed on both sides; and (4) a multistep qualitative content analysis for identifying the prominent discursive topics and how vaccines are framed in these topics. In so doing, we alleviated methodological challenges that have hindered previous analyses of pro- and antivaccine discursive topics.

Results: Our results indicated that antivaccine topics have greater intertopic distinctiveness (ie, the degree to which discursive topics are distinct from one another) than their provaccine counterparts (t(122)=2.30, P=.02). In addition, while antivaccine advocates use all four message frames known to make narratives persuasive and influential, provaccine advocates have neglected having a clear problem statement.

Conclusions: Based on our results, we attribute higher engagement among antivaccine advocates to the distinctiveness of the topics they discuss, and we ascribe the influence of the vaccine debate on uptake rates to the comprehensiveness of the message frames. These results show the urgency of developing clear problem statements for provaccine content to counteract the negative impact of antivaccine content on uptake rates.
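To make the notion of intertopic distinctiveness concrete, the sketch below shows one way such a comparison could be set up. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how distinctiveness was measured, so the cosine distance between topic term-weight vectors, the pooled-variance two-sample t statistic, and the toy vectors are all hypothetical choices. Pairwise distances also share topics and so are not strictly independent, which a real analysis would need to address.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(topics):
    """Distance for every unordered pair of topics (assumed distinctiveness proxy)."""
    return [cosine_distance(u, v) for u, v in combinations(topics, 2)]


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df


# Hypothetical toy topics: each row is a topic, each column a term weight.
anti_topics = [[5, 1, 0, 0], [0, 5, 1, 0], [0, 0, 5, 1], [1, 0, 0, 5]]
pro_topics = [[3, 3, 1, 0], [3, 3, 0, 1], [3, 2, 1, 1], [2, 3, 1, 1]]

anti_d = pairwise_distances(anti_topics)
pro_d = pairwise_distances(pro_topics)
t_stat, df = pooled_t(anti_d, pro_d)
print(f"t({df}) = {t_stat:.2f}")
```

In this toy setup, a positive t statistic means the first group of topics is, on average, farther apart than the second, which is the direction of the difference reported above for antivaccine versus provaccine topics.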

Published in JMIR Public Health and Surveillance

ISSN: 2369-2960 (Online)
Publisher: JMIR Publications
Country of publisher: Canada
LCC subjects: Medicine: Public aspects of medicine
Website: https://publichealth.jmir.org
