Healthcare (Nov 2022)

Deciphering Latent Health Information in Social Media Using a Mixed-Methods Design

  • George Shaw,
  • Margaret Zimmerman,
  • Ligia Vasquez-Huot,
  • Amir Karami

DOI
https://doi.org/10.3390/healthcare10112320
Journal volume & issue
Vol. 10, no. 11
p. 2320

Abstract

Read online

Natural language processing techniques have increased the volume and variety of text data that can be analyzed. The aim of this study was to identify the positive and negative topical sentiments among diet, diabetes, exercise, and obesity tweets. Using a sequential explanatory mixed-method design for our analytical framework, we analyzed a data corpus of 1.7 million diet, diabetes, exercise, and obesity (DDEO)-related tweets collected over 12 months. Sentiment analysis and topic modeling were used to analyze the data. The results show that overall, 29% of the tweets were positive, and 17% were negative. Using sentiment analysis and latent Dirichlet allocation (LDA) topic modeling, we analyzed 800 positive and negative DDEO topics. From the 800 LDA topics—after the qualitative and computational removal of incoherent topics—473 topics were characterized as coherent. Obesity was the only query health topic with a higher percentage of negative tweets. The use of social media by public health practitioners should focus not only on the dissemination of health information based on the topics discovered but also consider what they can do for the health consumer as a result of the interaction in digital spaces such as social media. Future studies will benefit from using multiclass sentiment analysis methods associated with other novel topic modeling approaches.

Keywords