IEEE Access (Jan 2022)
Identifying COVID-19 Personal Health Mentions From Tweets Using Masked Attention Model
Abstract
Twitter has been an important platform for people to discuss and share health-related information. It provides a massive amount of data for real-time monitoring of infectious diseases (such as COVID-19), freeing disease-prevention organizations from the tedious labor involved in public health surveillance. Personal health mention (PHM) detection is one of the critical methods for keeping up to date on an epidemic’s condition; it attempts to identify a person’s health condition from online text. This paper explores PHM identification for COVID-19 on Twitter. We built a COVID-19 PHM data set containing tweets annotated with four types of COVID-19-related health conditions. A masked attention model was devised to classify the tweets as self-mention, other-mention, awareness, or non-health. We obtained promising results on the PHM identification task. The classification results facilitate timely health monitoring and surveillance for digital epidemiology. We also evaluated how the attention mechanism and training method affect the model’s predictive performance.
Keywords