Digital Health (May 2022)

A chronological and geographical analysis of personal reports of COVID-19 on Twitter from the UK

  • Su Golder,
  • Ari Z Klein,
  • Arjun Magge,
  • Karen O’Connor,
  • Haitao Cai,
  • Davy Weissenbacher,
  • Graciela Gonzalez-Hernandez

DOI
https://doi.org/10.1177/20552076221097508
Journal volume & issue
Vol. 8

Abstract

Objective: Given the uncertainty about the trends and extent of the rapidly evolving COVID-19 outbreak, and the lack of extensive testing in the United Kingdom, our understanding of COVID-19 transmission is limited. We proposed to use Twitter to identify personal reports of COVID-19 and to assess whether these data can help us understand and model the transmission and trajectory of COVID-19.

Methods: We used a natural language processing and machine learning framework. We collected tweets (excluding retweets) from the Twitter Streaming API that indicated that the user or a member of the user's household had been exposed to COVID-19. The tweets were required to be geo-tagged or to have profile location metadata in the UK.

Results: We identified a high level of agreement between personal reports from Twitter and lab-confirmed cases by geographical region in the UK. Temporal analysis indicated that personal reports from Twitter appear up to 2 weeks before UK government lab-confirmed cases are recorded.

Conclusions: Analysis of tweets may indicate trends in COVID-19 in the UK and provide signals of geographical locations where resources may need to be targeted or where regional policies may need to be put in place to further limit the spread of COVID-19. It may also help inform policy makers about which lockdown restrictions are most or least effective.
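
One way to picture the temporal finding is a lagged correlation between two daily series: counts of personal reports and counts of lab-confirmed cases. The sketch below is illustrative only and is not the authors' method, which the abstract does not specify. The function names (`pearson`, `best_lead`), the 14-day search window, and both series are assumptions, and the data are synthetic, constructed so that cases trail reports by exactly 10 days.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def best_lead(reports, cases, max_lag=14):
    """Return (lag, r): the number of days by which `reports` best leads `cases`."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair reports on day t with cases on day t + lag.
        x = reports[: len(reports) - lag]
        y = cases[lag:]
        r = pearson(x, y)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

# Synthetic data (hypothetical): a triangular wave of daily tweet reports,
# and a case series that repeats the same wave 10 days later.
reports = [max(0, 20 - abs(t - 20)) for t in range(60)]
cases = [0] * 10 + reports[:50]

lag, r = best_lead(reports, cases)
print(f"reports lead cases by {lag} days (r = {r:.3f})")
```

With real data, the two series would first need to be aligned by date and region, and a peak in correlation at a positive lag would only suggest, not establish, that tweets precede recorded cases.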