AIMS Public Health (Feb 2021)

Regional forecasting of COVID-19 caseload by non-parametric regression: a VAR epidemiological model

  • Aaron C Shang,
  • Kristen E Galow,
  • Gary G Galow

DOI
https://doi.org/10.3934/publichealth.2021010
Journal volume & issue
Vol. 8, no. 1
pp. 124 – 136

Abstract

Objectives: The COVID-19 pandemic (caused by SARS-CoV-2) has introduced significant challenges for accurate prediction of population morbidity and mortality by traditional variable-based methods of estimation. Challenges to modelling include an inadequate understanding of viral physiology and definitions of positivity that fluctuate between national and international data. This paper proposes that accurate forecasting of COVID-19 caseload may be best performed non-parametrically, by vector autoregression (VAR) of verifiable regional data.

Methods: A non-linear VAR model across 7 major, demographically representative New York City (NYC) metropolitan region counties was constructed using verifiable daily COVID-19 caseload data from March 12 to July 23, 2020. By associating observed case trends with a series of county-specific, data-driven dynamic interdependencies (lagged values), a systematically non-assumptive VAR approximation of COVID-19 patterns to date and of prospective upcoming trends was produced.

Results: Modified VAR regression of NYC-area COVID-19 caseload trends shows highly significant capacity to model observed patterns in longitudinal disease incidence (county R² range: 0.9221–0.9751, all p < 0.001). Predictively, VAR regression of daily caseload at the county level demonstrates considerable short-term forecasting fidelity (p < 0.001 at one step ahead), with concurrent capacity for longer-term (tested 11-week period) inference of consistent, reasonable upcoming patterns from the latest (model data update) disease epidemiology.

Conclusions: In contrast to macroscopic variable-assumption projections, regionally founded VAR modelling may substantially improve projection of short-term community disease burden, reduce the potential for biostatistical error, and better model the epidemiological effects of intervention. Predictive VAR extrapolation of existing public health data at an interdependent regional scale may improve the accuracy of current pandemic burden prognoses.
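
To make the idea of regressing each region's caseload on lagged values of all regions concrete, the following is a minimal sketch of a plain linear VAR(p) fitted by ordinary least squares. It is not the authors' modified, non-linear model: the function names `fit_var` and `forecast`, the three synthetic series, and the coefficient matrix `A_true` are all assumptions made for illustration, and no real caseload data are used.

```python
import numpy as np

def fit_var(y, p):
    """Fit a linear VAR(p) by ordinary least squares.

    y : array of shape (T, k), one column per region (hypothetical layout).
    Returns (intercept of shape (k,), coefs of shape (p, k, k)), where
    coefs[i] multiplies the observation at lag i + 1.
    """
    T, k = y.shape
    # Design-matrix row for time t holds [1, y[t-1], ..., y[t-p]].
    lagged = [y[p - lag:T - lag] for lag in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lagged)
    B, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    intercept = B[0]
    # lstsq solves y[t] = x[t] @ B; transpose each lag block so that
    # y[t] = intercept + sum_i coefs[i] @ y[t - 1 - i].
    coefs = B[1:].reshape(p, k, k).transpose(0, 2, 1)
    return intercept, coefs

def forecast(y, intercept, coefs, steps):
    """Iterate the fitted VAR forward, feeding each forecast back in."""
    p = coefs.shape[0]
    history = list(y[-p:])
    out = []
    for _ in range(steps):
        nxt = intercept + sum(coefs[i] @ history[-1 - i] for i in range(p))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

# Synthetic stand-in for three interdependent regional series (assumed values).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.2, 0.0],
                   [0.1, 0.4, 0.1],
                   [0.0, 0.2, 0.5]])
c_true = np.array([1.0, 0.5, 0.8])
T = 2000
series = np.zeros((T, 3))
for t in range(1, T):
    series[t] = c_true + A_true @ series[t - 1] + rng.normal(scale=0.1, size=3)

c_hat, A_hat = fit_var(series, p=1)
one_week = forecast(series, c_hat, A_hat, steps=7)
```

The off-diagonal entries of `A_hat[0]` are where cross-region interdependence appears in this sketch: a non-zero entry in row i, column j means yesterday's value in region j helps predict today's value in region i. Multi-step forecasts reuse earlier forecasts as inputs, so their uncertainty grows with the horizon.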

Keywords