Heliyon (Nov 2021)

Forecasting the future number of pertussis cases using data from Google Trends

  • Dominik Nann,
  • Mark Walker,
  • Leonie Frauenfeld,
  • Tamás Ferenci,
  • Mihály Sulyok

Journal volume & issue
Vol. 7, no. 11
p. e08386

Abstract

Background: Alternative methods could be used to enhance the monitoring and forecasting of re-emerging conditions such as pertussis. This study assessed whether data on the volume of Internet searches for pertussis could complement traditional modeling based solely on reported case numbers. Methods: Seasonal autoregressive integrated moving average (SARIMA) models were fitted to reported weekly pertussis case numbers over a four-year period in Germany. Pertussis-related Google Trends data (GTD) were added as an external regressor. Predictions were made by the models, both with and without GTD, and compared with the values in the validation dataset over a one-year and a two-week period. Results: Predictions of the traditional model, which used only reported case numbers, yielded a root mean squared error (RMSE) of 192.65 and 207.8, a mean absolute percentage error (MAPE) of 58.59 and 72.1, and a mean absolute error (MAE) of 169.53 and 190.53 for the one-year and the two-week period, respectively. The GTD-expanded model achieved better forecasting accuracy, with an RMSE of 144.22 and 201.78, a MAPE of 43.86 and 68.54, and an MAE of 124.46 and 178.96. The corrected Akaike Information Criterion, for which lower values indicate a better model, also favored the GTD-expanded model (1750.98 for the traditional model vs. 1746.73 for the GTD-expanded model). The difference in predictive performance over the one-year period was significant in a two-sided Diebold-Mariano test (DM value: 6.86, p < 0.001). Conclusion: Internet-based surveillance data enhanced the predictive ability of a traditional model and should be considered as a way to improve future disease modeling.
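The three accuracy measures reported in the abstract (RMSE, MAE, and MAPE) can be illustrated with a minimal sketch. The function names and the sample numbers below are hypothetical and are not taken from the study; the sketch shows only the standard definitions of the metrics, not the authors' actual analysis code.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero, since each error is divided
    by the corresponding observed value.
    """
    relative = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical weekly case counts and forecasts (illustration only,
# not data from the study).
observed = [100, 200, 400]
forecast = [110, 180, 440]

print("RMSE:", rmse(observed, forecast))
print("MAE: ", mae(observed, forecast))
print("MAPE:", mape(observed, forecast))
```

For all three measures, lower values indicate forecasts closer to the observed case numbers, which is the sense in which the GTD-expanded model is described as more accurate.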

Keywords