Scientific Reports (Jan 2022)

Hybrid systems using residual modeling for sea surface temperature forecasting

  • Paulo S. G. de Mattos Neto,
  • George D. C. Cavalcanti,
  • Domingos S. de O. Santos Júnior,
  • Eraylson G. Silva

DOI
https://doi.org/10.1038/s41598-021-04238-z
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 16

Abstract

The sea surface temperature (SST) is an environmental indicator closely related to climate, weather, and atmospheric events worldwide. Forecasting it is essential for supporting decision-making by governments and environmental organizations. The literature has shown that single machine learning (ML) models are generally more accurate than traditional statistical models for SST time series modeling. However, tuning the parameters of these ML models is challenging, particularly for complex phenomena such as SST. Misspecification, overfitting, or underfitting of the ML models can lead to underperforming forecasts. This work proposes hybrid systems (HS) that combine ML models using residual forecasting as an alternative way to enhance SST forecasting performance. In this context, two types of combinations are evaluated using two ML models: support vector regression (SVR) and long short-term memory (LSTM). The experimental evaluation was performed on three datasets from different regions of the Atlantic Ocean using three well-known measures: mean square error (MSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). The best HS based on SVR improved the MSE for the three analyzed series by 82.26%, 98.93%, and 65.03%, respectively, compared to the corresponding single model. The HS employing the LSTM achieved improvements of 92.15%, 98.69%, and 32.41% relative to the single LSTM model. Compared to approaches from the literature, at least one version of the HS attained higher accuracy than statistical and ML models in all cases studied. In particular, the nonlinear combination of the ML models obtained the best performance among the proposed HS versions.
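The residual-modeling idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the series is synthetic, plain least-squares autoregressions stand in for SVR and LSTM, the helper names (`lagged`, `fit_linear`, `predict_linear`, `improvement`) are hypothetical, only a simple additive combination is shown (the abstract does not specify the two combination types), and the percentage improvement is assumed to be the relative reduction in error with respect to the single model.

```python
import numpy as np

def lagged(series, n_lags):
    """Return (X, y): each row of X holds n_lags consecutive values, y is the next one."""
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    return X, series[n_lags:]

def fit_linear(X, y):
    """Least-squares autoregression with intercept (stand-in for SVR or LSTM)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def mse(y, yhat):
    return float(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    return float(100.0 * np.mean(np.abs((y - yhat) / y)))

def improvement(single_err, hybrid_err):
    """Assumed definition: relative error reduction, in percent."""
    return 100.0 * (single_err - hybrid_err) / single_err

# Synthetic monthly "SST-like" series (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(240)
sst = 26.0 + 2.0 * np.sin(2 * np.pi * t / 12) + 0.2 * rng.standard_normal(t.size)

# Step 1: a deliberately weak single model using one lag.
X1, y1 = lagged(sst, 1)
base_pred = predict_linear(fit_linear(X1, y1), X1)

# Step 2: residual series left over by the single model.
resid = y1 - base_pred

# Step 3: a second model forecasts the residuals from their own past values.
X2, y2 = lagged(resid, 12)
resid_pred = predict_linear(fit_linear(X2, y2), X2)

# Step 4: additive combination, aligned to the points where both forecasts exist.
y_true = y1[12:]
base_aligned = base_pred[12:]
hybrid_pred = base_aligned + resid_pred

# In-sample comparison only; a real study would score a held-out test set.
base_mse = mse(y_true, base_aligned)
hybrid_mse = mse(y_true, hybrid_pred)
gain = improvement(base_mse, hybrid_mse)
print(f"single MSE={base_mse:.4f}  hybrid MSE={hybrid_mse:.4f}  improvement={gain:.2f}%")
print(f"hybrid MAE={mae(y_true, hybrid_pred):.4f}  hybrid MAPE={mape(y_true, hybrid_pred):.2f}%")
```

A nonlinear combination, which the abstract reports as the best-performing variant, would replace the sum in Step 4 with a further model that takes both forecasts as inputs; how the authors construct it is not stated in the abstract, so it is left out of this sketch.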