JMIR Public Health and Surveillance (Jul 2022)

Using Social Media to Predict Food Deserts in the United States: Infodemiology Study of Tweets

  • Nekabari Sigalo,
  • Beth St Jean,
  • Vanessa Frias-Martinez

DOI
https://doi.org/10.2196/34285
Journal volume & issue
Vol. 8, no. 7
p. e34285

Abstract

Background: The issue of food insecurity is becoming increasingly important to public health practitioners because of the adverse health outcomes and underlying racial disparities associated with insufficient access to healthy foods. Prior research has used data sources such as surveys, geographic information systems, and food store assessments to identify regions classified as food deserts, but perhaps the individuals in these regions unknowingly provide their own accounts of food consumption and food insecurity through social media. Social media data have proved useful in answering questions related to public health; therefore, these data are a rich source for identifying food deserts in the United States.
Objective: The aim of this study was to develop, from geotagged Twitter data, a predictive model for the identification of food deserts in the United States using the linguistic constructs found in food-related tweets.
Methods: Twitter’s streaming application programming interface was used to collect a random 1% sample of public geolocated tweets across 25 major cities from March 2020 to December 2020. A total of 60,174 geolocated food-related tweets were collected across the 25 cities. Each geolocated tweet was mapped to its respective census tract using point-to-polygon mapping, which allowed us to develop census tract–level features derived from the linguistic constructs found in food-related tweets, such as tweet sentiment and average nutritional value of foods mentioned in the tweets. These features were then used to examine the associations between food desert status and the food ingestion language and sentiment of tweets in a census tract and to determine whether food-related tweets can be used to infer census tract–level food desert status.
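The point-to-polygon mapping mentioned in the Methods can be illustrated with a minimal sketch. It assumes each census tract is a simple polygon given as a list of (longitude, latitude) vertices and uses the standard ray-casting test; the abstract does not say which tool or algorithm the authors used, and the names `point_in_polygon`, `assign_tract`, and `TRACTS`, along with the toy coordinates, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a horizontal ray
    starting at the point crosses; an odd count means the point is inside."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tract(point: Point, tracts: Dict[str, List[Point]]) -> Optional[str]:
    """Return the ID of the first tract whose polygon contains the point,
    or None if the point falls outside every tract."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(point, polygon):
            return tract_id
    return None


# Hypothetical toy tracts (unit squares), not real census geometry.
TRACTS = {
    "tract_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "tract_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

tweet_location = (1.5, 0.5)
print(assign_tract(tweet_location, TRACTS))
```

Real tract boundaries can contain holes and multiple parts, so a production pipeline would more likely rely on a geospatial library with a spatial index than on a linear scan like this one.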
Results: We found associations between a census tract being classified as a food desert and an increase in the number of tweets in a census tract that mentioned unhealthy foods (P=.03), including foods high in cholesterol (P=.02) or low in key nutrients such as potassium (P=.01). We also found an association between a census tract being classified as a food desert and an increase in the proportion of tweets that mentioned healthy foods (P=.03) and fast-food restaurants (P=.01) with positive sentiment. In addition, we found that including food ingestion language derived from tweets in classification models that predict food desert status improves model performance compared with baseline models that only include socioeconomic characteristics.
Conclusions: Social media data have been increasingly used to answer questions related to health and well-being. Using Twitter data, we found that food-related tweets can be used to develop models that predict census tract food desert status with high accuracy and that improve on baseline models. Food ingestion language found in tweets, such as census tract–level measures of food sentiment and healthiness, is associated with census tract–level food desert status.
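As a rough illustration of the census tract–level measures referred to above, the following sketch computes, for each tract, the proportion of tweets that mention a healthy food with positive sentiment. The `Tweet` fields, the sentiment scale, and the "greater than 0 means positive" threshold are assumptions made for this example; the abstract does not specify how the study defined or scored these features.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Tweet(NamedTuple):
    tract_id: str                # census tract the tweet was mapped to
    mentions_healthy_food: bool  # hypothetical flag from a food lexicon
    sentiment: float             # assumed score in [-1, 1]


def tract_features(tweets: Iterable[Tweet]) -> Dict[str, Dict[str, float]]:
    """Aggregate tweet-level flags into per-tract counts and proportions."""
    counts = defaultdict(lambda: {"n": 0, "healthy_pos": 0})
    for t in tweets:
        c = counts[t.tract_id]
        c["n"] += 1
        # Assumption: a sentiment score above 0 counts as positive.
        if t.mentions_healthy_food and t.sentiment > 0:
            c["healthy_pos"] += 1
    return {
        tract_id: {
            "n_tweets": c["n"],
            "prop_healthy_positive": c["healthy_pos"] / c["n"],
        }
        for tract_id, c in counts.items()
    }


# Hypothetical sample data, not drawn from the study.
SAMPLE = [
    Tweet("tract_A", True, 0.6),
    Tweet("tract_A", True, -0.2),
    Tweet("tract_A", False, 0.9),
    Tweet("tract_A", False, -0.5),
    Tweet("tract_B", True, 0.4),
    Tweet("tract_B", False, 0.1),
]

FEATURES = tract_features(SAMPLE)
print(FEATURES)
```

Features of this kind could then be placed alongside socioeconomic variables in a classifier, which is the comparison the Results describe.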