EBioMedicine (Mar 2023)

Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach

  • Hongru Du,
  • Ensheng Dong,
  • Hamada S. Badr,
  • Mary E. Petrone,
  • Nathan D. Grubaugh,
  • Lauren M. Gardner

Journal volume & issue
Vol. 89
p. 104482

Abstract

Summary: Background: Since the US reported its first COVID-19 case on January 21, 2020, the scientific community has been applying various techniques to forecast incident cases and deaths. To date, providing an accurate and robust forecast at a high spatial resolution has proved challenging, even in the short term. Method: Here we present a novel multi-stage deep learning model to forecast the number of COVID-19 cases and deaths for each US state at a weekly level for a forecast horizon of 1–4 weeks. The model is heavily data driven, and relies on epidemiological, mobility, survey, climate, demographic, and SARS-CoV-2 variant frequencies data. We implement a rigorous and robust evaluation of our model: specifically, we report on weekly performance over a one-year period based on multiple error metrics, and explicitly assess how our model performance varies over space, chronological time, and different outbreak phases. Findings: The proposed model is shown to consistently outperform the CDC ensemble model for all evaluation metrics in multiple spatiotemporal settings, especially for the longer-term (3 and 4 weeks ahead) forecast horizon. Our case study also highlights the potential value of variant frequencies data for use in short-term forecasting to identify forthcoming surges driven by new variants. Interpretation: Based on our findings, the proposed forecasting framework improves upon the available state-of-the-art forecasting tools currently used to support public health decision making with respect to COVID-19 risk. Funding: This work was funded by the NSF Rapid Response Research (RAPID) grant Award ID 2108526 and the CDC Contract #75D30120C09570.

Keywords