Scientific Reports (Jun 2021)

A novel data-driven methodology for influenza outbreak detection and prediction

  • Lin Du,
  • Yan Pang

DOI
https://doi.org/10.1038/s41598-021-92484-6
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 16

Abstract

Influenza is an infectious disease that leads to an estimated 5 million cases of severe illness and 650,000 respiratory deaths worldwide each year. The early detection and prediction of influenza outbreaks are crucial for efficient resource planning, both to save patients' lives and to reduce healthcare costs. We propose a new data-driven methodology for influenza outbreak detection and prediction at very local levels. A dataset of doctors' diagnoses of influenza-like illness from more than 3000 clinics in Malaysia is used in this study because these diagnostic data are reliable and can be captured promptly. A new region index (RI) of the influenza outbreak is proposed based on the diagnostic dataset. By analysing the anomalies in the weekly RI value, potential outbreaks are identified using statistical methods. An ensemble learning method is developed to predict potential influenza outbreaks. Cross-validation is conducted to optimize the hyperparameters of the ensemble model. A testing data set is used to provide an unbiased evaluation of the model. The proposed methodology is shown to be sensitive and accurate at influenza outbreak prediction, with an average of 75% recall, 74% precision, and 83% accuracy across five regions in Malaysia. The results are also validated against Google Flu Trends data, news reports, and surveillance data released by the World Health Organization.
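
For readers less familiar with the three scores reported above, the following minimal sketch shows how recall, precision, and accuracy are conventionally computed from binary weekly labels. It is an illustration only, not the authors' evaluation code: the function name `outbreak_metrics` and the eight-week label lists are hypothetical and are not taken from the study.

```python
def outbreak_metrics(actual, predicted):
    """Return (recall, precision, accuracy) for binary weekly outbreak labels.

    Both arguments are equal-length sequences where 1 marks an outbreak week
    and 0 marks a non-outbreak week.
    """
    if len(actual) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # outbreak, predicted
    fp = sum(1 for a, p in pairs if not a and p)      # no outbreak, predicted
    fn = sum(1 for a, p in pairs if a and not p)      # outbreak, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # no outbreak, not predicted

    # Guard against division by zero when a class is absent.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return recall, precision, accuracy


# Hypothetical labels for eight weeks; NOT data from the study.
actual = [0, 1, 1, 0, 0, 1, 0, 0]
predicted = [0, 1, 0, 0, 1, 1, 0, 0]
recall, precision, accuracy = outbreak_metrics(actual, predicted)
```

Recall measures how many true outbreak weeks were caught, precision measures how many predicted outbreak weeks were real, and accuracy counts all correctly labelled weeks. In outbreak surveillance, recall is often the score watched most closely, since a missed outbreak is usually costlier than a false alarm.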