PLoS ONE (Jan 2022)

Data-driven models for atmospheric air temperature forecasting at a continental climate region

  • Mohamed Khalid Alomar,
  • Faidhalrahman Khaleel,
  • Mustafa M. Aljumaily,
  • Adil Masood,
  • Siti Fatin Mohd Razali,
  • Mohammed Abdulhakim AlSaadi,
  • Nadhir Al-Ansari,
  • Mohammed Majeed Hameed

Journal volume & issue
Vol. 17, no. 11

Abstract

Read online

Atmospheric air temperature is the most crucial metrological parameter. Despite its influence on multiple fields such as hydrology, the environment, irrigation, and agriculture, this parameter describes climate change and global warming quite well. Thus, accurate and timely air temperature forecasting is essential because it provides more important information that can be relied on for future planning. In this study, four Data-Driven Approaches, Support Vector Regression (SVR), Regression Tree (RT), Quantile Regression Tree (QRT), ARIMA, Random Forest (RF), and Gradient Boosting Regression (GBR), have been applied to forecast short-, and mid-term air temperature (daily, and weekly) over North America under continental climatic conditions. The time-series data is relatively long (2000 to 2021), 70% of the data are used for model calibration (2000 to 2015), and the rest are used for validation. The autocorrelation and partial autocorrelation functions have been used to select the best input combination for the forecasting models. The quality of predicting models is evaluated using several statistical measures and graphical comparisons. For daily scale, the SVR has generated more accurate estimates than other models, Root Mean Square Error (RMSE = 3.592°C), Correlation Coefficient (R = 0.964), Mean Absolute Error (MAE = 2.745°C), and Thiels’ U-statistics (U = 0.127). Besides, the study found that both RT and SVR performed very well in predicting weekly temperature. This study discovered that the duration of the employed data and its dispersion and volatility from month to month substantially influence the predictive models’ efficacy. Furthermore, the second scenario is conducted using the randomization method to divide the data into training and testing phases. The study found the performance of the models in the second scenario to be much better than the first one, indicating that climate change affects the temperature pattern of the studied station. The findings offered technical support for generating high-resolution daily and weekly temperature forecasts using Data-Driven Methodologies.