BMC Public Health (Jan 2024)

Using meta-learning to recommend an appropriate time-series forecasting model

  • Nasrin Talkhi,
  • Narges Akhavan Fatemi,
  • Mehdi Jabbari Nooghabi,
  • Ehsan Soltani,
  • Azadeh Jabbari Nooghabi

DOI
https://doi.org/10.1186/s12889-023-17627-y
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 9

Abstract

Background There are various forecasting algorithms available for univariate time series, ranging from simple to sophisticated and computationally intensive. In practice, selecting the most appropriate algorithm can be difficult because so many are available. Although expert knowledge is required to make an informed decision, obtaining it is sometimes not feasible because resources such as time, money, and manpower are lacking.
Methods In this study, we used coronavirus disease 2019 (COVID-19) data, comprising the absolute daily numbers of confirmed cases, deaths, and recovered cases in 187 countries from February 20, 2020, to May 25, 2021. Two popular forecasting models, the Auto-Regressive Integrated Moving Average (ARIMA) model and the exponential smoothing state-space model with Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components (TBATS), were used to forecast the data. The forecasts were evaluated by the root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (SMAPE) criteria in order to label each time series. Various characteristics of each time series, based on the univariate time series structure, were extracted as meta-features. After that, four machine-learning classification algorithms, namely support vector machine (SVM), decision tree (DT), random forest (RF), and artificial neural network (ANN), were used as meta-learners to recommend an appropriate forecasting model.
Results The findings showed that the DT model performed better than the other meta-learners in classifying the time series. The accuracy of DT in the training and testing phases was 87.50% and 82.50%, respectively. In the training phase, the sensitivity of the DT algorithm was 86.58% and its specificity was 88.46%. In the testing phase, its sensitivity and specificity were 73.33% and 88%, respectively.
Conclusion In general, the meta-learning approach was able to predict the appropriate forecasting model (ARIMA or TBATS) from a set of time series features. Given some characteristics of a COVID-19 time series, the DT model might be used to recommend either ARIMA or TBATS for forecasting trends in confirmed cases, deaths, and recoveries.
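
The labeling step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy numbers, and the rule "pick the model with the lowest error on a single metric" are assumptions made for illustration, since the abstract does not state how the four criteria were combined. SMAPE also has several definitions in the literature; one common form is used here.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero; MAPE is undefined otherwise.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def smape(actual, forecast):
    """Symmetric MAPE, in percent (one common definition, assumed here).

    Assumes |actual| + |forecast| is nonzero at every time point.
    """
    n = len(actual)
    return 100.0 * sum(
        2.0 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)
    ) / n


def label_series(actual, forecasts, metric=rmse):
    """Return the name of the forecast with the lowest error.

    `forecasts` maps a model name to its forecast values. Choosing the
    label from a single metric is an assumption of this sketch.
    """
    return min(forecasts, key=lambda name: metric(actual, forecasts[name]))


# Toy hold-out values and forecasts (hypothetical, not from the study).
actual = [100, 110, 120, 130]
forecasts = {
    "ARIMA": [102, 108, 123, 129],
    "TBATS": [90, 115, 110, 140],
}
label = label_series(actual, forecasts)
```

In a meta-learning setup of this kind, the resulting label would serve as the class that a meta-learner such as a decision tree is trained to predict from the meta-features of each series.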

Keywords