International Journal of Digital Earth (Dec 2024)
Optimizing county-level infectious respiratory disease forecasts: a pandemic case study integrating social media-based physical and social connectivity networks
Abstract
Forecasting infectious respiratory diseases is crucial for effective prevention and intervention strategies. However, existing time series forecasting models that incorporate human mobility data have struggled to make localized predictions at national scale because of data costs and constraints. Using the COVID-19 pandemic as a case study, this research explores whether integrating social media-based place and social connectivity networks can improve county-level predictions of disease transmission across various regions. Place connectivity networks, derived from Twitter users and tweets, and social connectivity networks, based on Facebook interactions, were used to map spatial and social linkages between locations. These networks were combined with weekly COVID-19 incidence data for 2,927 U.S. counties in Long Short-Term Memory (LSTM) models. The combined connectivity-weighted model significantly enhanced prediction accuracy, reducing Mean Absolute Percentage Error (MAPE) by 49.38% across 96.62% of the counties, with the greatest improvements observed in urban and Northeastern counties. The results indicate that combining connectivity networks built from publicly accessible social media data offers a scalable and sustainable approach to localized disease forecasting across diverse geographic areas.
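The abstract does not spell out how the connectivity networks enter the model, so the following is only a minimal sketch of one plausible reading: each county's input is augmented with the incidence of other counties, weighted by row-normalized connectivity, and forecasts are scored with the standard MAPE definition. The function names, the row normalization, and the handling of zero-incidence weeks are assumptions made for illustration, not the authors' stated method.

```python
import numpy as np


def connectivity_weighted_incidence(incidence, connectivity):
    """Hypothetical feature: connectivity-weighted incidence of other counties.

    incidence:    array of shape (weeks, counties), weekly incidence.
    connectivity: array of shape (counties, counties); entry [i, j] is the
                  strength of the link from county i to county j.
    Returns an array of shape (weeks, counties).
    """
    w = np.asarray(connectivity, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # assumption: ignore a county's link to itself
    row_sums = w.sum(axis=1, keepdims=True)
    # Row-normalize so each county's weights sum to 1; isolated counties stay 0.
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    # result[t, i] = sum over j of w[i, j] * incidence[t, j]
    return np.asarray(incidence, dtype=float) @ w.T


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent.

    Assumption: observations with zero actual incidence are skipped,
    because the percentage error is undefined there.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mask = actual != 0
    return 100.0 * np.mean(np.abs((actual[mask] - predicted[mask]) / actual[mask]))
```

In a setup like this, the weighted series would be stacked alongside each county's own incidence history as an extra input channel to the LSTM, and `mape` would be computed per county to compare the weighted model against a baseline without connectivity features.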
Keywords