PLoS ONE (Jan 2020)

Comparing covariation among vaccine hesitancy and broader beliefs within Twitter and survey data.

  • Sarah A Nowak,
  • Christine Chen,
  • Andrew M Parker,
  • Courtney A Gidengil,
  • Luke J Matthews

DOI
https://doi.org/10.1371/journal.pone.0239826
Journal volume & issue
Vol. 15, no. 10
p. e0239826

Abstract

Over the past decade, the percentage of adults in the United States who use some form of social media has roughly doubled, increasing from 36 percent in early 2009 to 72 percent in 2019. There has been a corresponding increase in research aimed at understanding opinions and beliefs that are expressed online. However, the generalizability of findings from social media research is a subject of ongoing debate. Social media platforms are conduits of both information and misinformation about vaccines and vaccine hesitancy. Our research objective was to examine whether we can draw similar conclusions from Twitter and national survey data about the relationship between vaccine hesitancy and a broader set of beliefs. In 2018, we conducted a nationally representative survey of parents in the United States, informed by a literature review, asking their views on a range of topics, including vaccine side effects, conspiracy theories, and understanding of science. We developed a set of keyword-based queries corresponding to each of the belief items from the survey and pulled matching tweets from 2017, the most recent full year of data available when we performed the data pull in 2018. Our primary measures of belief covariation were the loadings and scores of the first principal components obtained using principal component analysis (PCA) from the two sources. We found that, after using manually coded weblinks in tweets to infer stance, there was good qualitative agreement between the first principal component loadings and scores using survey and Twitter data. This held true after we took the additional processing step of resampling the Twitter data based on the number of topics that an individual tweeted about, as a means of correcting for differential representation of elicited (survey) vs. volunteered (Twitter) beliefs. Overall, the results show that analyses using Twitter data may be generalizable in certain contexts, such as assessing belief covariation.
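
As an illustration of the comparison described in the abstract, the following is a minimal sketch of extracting first principal component loadings and scores from two data sources and checking how well the loadings agree. It is not the authors' code. The data are synthetic, and the names `first_pc`, `align_sign`, `simulate`, and `true_direction`, the number of belief items, and the use of a correlation between loading vectors as the agreement measure are all assumptions made for this example.

```python
import numpy as np


def first_pc(X):
    """Return (loadings, scores) of the first principal component.

    X has one row per respondent or user and one column per belief item.
    """
    Xc = X - X.mean(axis=0)                      # center each belief item
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[0]                             # unit-length direction of largest variance
    scores = Xc @ loadings                       # projection of each row onto that direction
    return loadings, scores


def align_sign(reference, loadings):
    """The sign of a principal component is arbitrary; flip it to match the reference."""
    return loadings if np.dot(reference, loadings) >= 0 else -loadings


# Synthetic stand-ins for the two sources (hypothetical values, not real data).
rng = np.random.default_rng(0)
true_direction = np.array([0.6, 0.5, 0.4, -0.3, -0.4])  # assumed shared belief structure


def simulate(n_rows, noise):
    """Generate rows driven by one latent factor plus independent noise."""
    latent = rng.normal(size=(n_rows, 1))
    return latent @ true_direction[None, :] + noise * rng.normal(size=(n_rows, true_direction.size))


survey = simulate(500, noise=0.5)    # stands in for elicited survey responses
twitter = simulate(2000, noise=0.8)  # stands in for stance-coded Twitter data

survey_load, survey_scores = first_pc(survey)
twitter_load, twitter_scores = first_pc(twitter)
twitter_load = align_sign(survey_load, twitter_load)

# One simple (assumed) agreement measure: correlation between the two loading vectors.
agreement = np.corrcoef(survey_load, twitter_load)[0, 1]
print("survey loadings: ", np.round(survey_load, 2))
print("twitter loadings:", np.round(twitter_load, 2))
print("loading agreement (correlation):", round(float(agreement), 3))
```

Because both synthetic sources are generated from the same latent direction, the loadings agree closely here by construction. With real data, the stance-inference and resampling steps described in the abstract would need to be applied before any such comparison.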