PLoS Neglected Tropical Diseases (Oct 2020)

Weekly dengue forecasts in Iquitos, Peru; San Juan, Puerto Rico; and Singapore.

  • Corey M Benedum,
  • Kimberly M Shea,
  • Helen E Jenkins,
  • Louis Y Kim,
  • Natasha Markuzon

DOI
https://doi.org/10.1371/journal.pntd.0008710
Journal volume & issue
Vol. 14, no. 10
p. e0008710

Abstract

Background
Predictive models can serve as early warning systems and can be used to forecast the future risk of various infectious diseases. Conventionally, regression and time series models are used to forecast dengue incidence from dengue surveillance data (e.g., case counts) and weather data. However, these models may be limited by their assumptions and by the number of predictors they can include. Machine learning (ML) methods are designed to work with a large number of predictors and thus offer an appealing alternative. Here, we compared the performance of ML algorithms with that of regression models in predicting dengue cases and outbreaks 4 to 12 weeks in advance. Because many countries lack sufficient health surveillance infrastructure, we also evaluated the contribution of dengue surveillance and weather data to the predictive power of these models.

Methods
We developed ML, regression, and time series models to forecast weekly dengue case counts and outbreaks in Iquitos, Peru; San Juan, Puerto Rico; and Singapore from 1990 to 2016. Forecasts were generated using available weekly dengue surveillance and weather data. We evaluated the agreement between model forecasts and actual dengue observations using mean absolute error (MAE) and the Matthews correlation coefficient (MCC).

Results
For near-term predictions of weekly case counts using surveillance data, ML models had 21% and 33% less error than regression and time series models, respectively. However, when using weather data only, ML models did not demonstrate a practical advantage. When forecasting weekly dengue outbreaks 12 weeks in advance, ML models achieved a maximum MCC of 0.61.

Conclusions
Our results identified two scenarios in which ML models are advantageous over regression models: 1) predicting weekly dengue case counts 4 weeks ahead when dengue surveillance data are available, and 2) predicting weekly dengue outbreaks 12 weeks ahead when dengue surveillance data are unavailable. Given the advantages of ML models, dengue early warning systems may be improved by the inclusion of these models.
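
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the example numbers, and the fixed outbreak threshold are all hypothetical, and the abstract does not state how an outbreak week was defined. Mean absolute error scores the case-count forecasts, and the Matthews correlation coefficient scores the binary outbreak forecasts.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and forecast weekly counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def matthews_corrcoef(observed, predicted):
    """MCC for binary outbreak labels (truthy = outbreak week).

    Ranges from -1 (total disagreement) to +1 (perfect agreement).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    tn = sum(1 for o, p in zip(observed, predicted) if not o and not p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def to_outbreak_flags(counts, threshold):
    """Label a week as an outbreak when its count exceeds the threshold.

    ASSUMPTION: a fixed threshold is used purely for illustration; it is
    not the outbreak definition used in the paper.
    """
    return [c > threshold for c in counts]


# Made-up weekly case counts, for illustration only.
observed_counts = [12, 30, 45, 80, 60, 20]
forecast_counts = [15, 25, 50, 70, 65, 35]

mae = mean_absolute_error(observed_counts, forecast_counts)
mcc = matthews_corrcoef(
    to_outbreak_flags(observed_counts, threshold=40),
    to_outbreak_flags(forecast_counts, threshold=40),
)
print(f"MAE: {mae:.2f} cases/week, MCC: {mcc:.2f}")
```

MCC uses all four cells of the confusion matrix, so it remains informative when outbreak weeks are much rarer than non-outbreak weeks, a setting in which plain accuracy can look misleadingly high.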