Heliyon (Apr 2024)

Evaluate effect of 126 pre-processing methods on various artificial intelligence models accuracy versus normal mode to predict groundwater level (case study: Hamedan-Bahar Plain, Iran)

  • Mohsen Saroughi,
  • Ehsan Mirzania,
  • Mohammed Achite,
  • Okan Mert Katipoğlu,
  • Nadhir Al-Ansari,
  • Dinesh Kumar Vishwakarma,
  • Il-Moon Chung,
  • Maha Awjan Alreshidi,
  • Krishna Kumar Yadav

Journal volume & issue
Vol. 10, no. 7
p. e29006

Abstract

Estimating groundwater levels is a crucial step in ensuring the sustainable management of water resources. In this paper, selected piezometers of the Hamedan-Bahar plain, located in the west of Iran, are studied. The main objective is to compare the effect of various pre-processing methods applied to the input data of different artificial intelligence (AI) models used to predict groundwater levels (GWLs). The observed GWL, evaporation, precipitation, and temperature were used as input variables in the AI algorithms. First, 126 data pre-processing methods were implemented in Python and grouped into three classes: (1) statistical methods, (2) wavelet transform methods, and (3) decomposition methods. The pre-processed data were then fed to four widely used AI models with different kernels: Support Vector Regression (SVR), Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), and Pelican Optimization Algorithm - Artificial Neural Network (POA-ANN). These models fall into three classes: (1) machine learning (ML: SVR and ANN), (2) deep learning (DL: LSTM), and (3) hybrid-ML (POA-ANN). The Akaike Information Criterion (AIC) was used to evaluate and validate the predictive accuracy of the algorithms. Based on the summed (train and test phase) AIC values of 1778 models, the average AIC values for the ML, DL, and hybrid-ML classes changed by −25.3%, −29.6%, and −57.8%, respectively. The results therefore show that not all data pre-processing methods improve prediction accuracy, and that they should be selected carefully by trial and error. In conclusion, the wavelet-ANN model with the Daubechies 13 wavelet and 25 neurons (db13_ANN_25) is the best model for predicting GWL: its AIC of −204.9 is a 5.23% improvement over the AIC of −194.7 obtained without any pre-processing (ANN_Relu_25).
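
As a rough illustration of how an AIC-based comparison like the one in the abstract can be carried out, the sketch below computes a least-squares form of AIC and the relative difference between two AIC values. The formula n·ln(SSE/n) + 2k is a common Gaussian-error form and is only an assumption here, since the abstract does not state which AIC expression the authors used. The function names and the toy series are hypothetical; the only values taken from the abstract are −204.9 and −194.7.

```python
import math


def aic_least_squares(observed, predicted, n_params):
    """AIC under a Gaussian-error assumption: n * ln(SSE / n) + 2k.

    This form is an assumption; the abstract does not give the exact
    expression used in the study.
    """
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(sse / n) + 2 * n_params


def relative_change_percent(new, baseline):
    """Percent change of `new` relative to the magnitude of `baseline`."""
    return (new - baseline) / abs(baseline) * 100.0


# Toy groundwater-level series (made-up numbers, for illustration only).
observed = [1.0, 2.0, 3.0]
coarse_fit = [1.0, 2.0, 4.0]
closer_fit = [1.0, 2.0, 3.5]

aic_coarse = aic_least_squares(observed, coarse_fit, n_params=1)
aic_closer = aic_least_squares(observed, closer_fit, n_params=1)
# With the same number of parameters, the closer fit has the lower AIC.

# AIC values reported in the abstract.
aic_db13_ann_25 = -204.9  # wavelet-ANN, Daubechies 13, 25 neurons
aic_ann_relu_25 = -194.7  # ANN without pre-processing
change = relative_change_percent(aic_db13_ann_25, aic_ann_relu_25)
print(f"Relative AIC change: {change:.2f}%")
```

Because a lower AIC indicates a better trade-off between fit and complexity, a negative relative change means the pre-processed model is preferred. The change computed from the two rounded AIC values is about −5.2%, which agrees with the reported 5.23% up to rounding.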

Keywords