Water (Oct 2023)

Applying Multivariate Analysis and Machine Learning Approaches to Evaluating Groundwater Quality on the Kairouan Plain, Tunisia

  • Sarra Bel Haj Salem,
  • Aissam Gaagai,
  • Imed Ben Slimene,
  • Amor Ben Moussa,
  • Kamel Zouari,
  • Krishna Kumar Yadav,
  • Mohamed Hamdy Eid,
  • Mostafa R. Abukhadra,
  • Ahmed M. El-Sherbeeny,
  • Mohamed Gad,
  • Mohamed Farouk,
  • Osama Elsherbiny,
  • Salah Elsayed,
  • Stefano Bellucci,
  • Hekmat Ibrahim

DOI
https://doi.org/10.3390/w15193495
Journal volume & issue
Vol. 15, no. 19
p. 3495

Abstract

In the Zeroud basin, several methodologies were employed to assess, simulate, and predict the quality of groundwater intended for irrigation. These included irrigation water quality indices (IWQIs); multivariate statistical analysis supported by GIS techniques; an artificial neural network (ANN) model; and an XGBoost regression model. Extensive physicochemical analyses were performed on groundwater samples to characterize their composition. The results showed that the abundance order of ions was Na⁺ > Ca²⁺ > Mg²⁺ > K⁺ and SO₄²⁻ > HCO₃⁻ > Cl⁻. The groundwater facies reflected Ca-Mg-SO₄, Na-Cl, and mixed Ca-Mg-Cl/SO₄ water types. Cluster analysis (CA) and principal component analysis (PCA), along with ionic ratios, detected three different water characteristics. The mechanisms controlling water chemistry were identified as water–rock interaction, dolomite dissolution, evaporation, and ion exchange. The assessment of groundwater quality for agriculture with respect to IWQIs, such as the irrigation water quality index (IWQI), sodium adsorption ratio (SAR), sodium percentage (Na%), soluble sodium percentage (SSP), potential salinity (PS), and residual sodium carbonate (RSC), revealed that the majority of the water samples were suitable for agriculture. However, the IWQI and PS values fell in the high-to-severe restriction and injurious-to-unsatisfactory classes, respectively. The ANN and XGBoost regression models showed robust results for predicting IWQIs. For example, ANN-HyC-9 emerged as the most precise ANN forecasting model, as it showed the strongest relationship between its key input attributes and IWQI. The nine attributes of this model are highly significant for IWQI prediction. Its R² values for the training and testing data were 0.999 (RMSE = 0.375) and 0.823 (RMSE = 3.168), respectively. The findings also indicate that XGB-HyC-3 was the most accurate forecasting model, showing a stronger relationship between IWQI and its key input attributes. When predicting IWQI, three of the model's attributes played a pivotal role. Notably, the model yielded R² values of 0.999 (RMSE = 0.001) and 0.913 (RMSE = 2.217) for the training and testing datasets, respectively. Overall, these results offer useful information for decision-makers managing water quality and can support the long-term use of water resources.
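
The abstract does not give the formulas behind the sodium and salinity indices it lists. As a rough illustration only, the sketch below computes SAR, Na%, SSP, RSC, and PS from ion concentrations in meq/L using commonly cited textbook definitions. These definitions are an assumption and may differ from the ones the authors applied (SSP in particular has several variants). The function name `irrigation_indices` and the sample concentrations are hypothetical, not data from the study.

```python
import math


def irrigation_indices(ca, mg, na, k, hco3, co3, cl, so4):
    """Return common irrigation indices; all inputs are assumed to be in meq/L.

    The formulas are widely used textbook forms, not necessarily the
    exact ones used in the paper.
    """
    divalent = ca + mg  # Ca2+ + Mg2+
    return {
        # Sodium adsorption ratio
        "SAR": na / math.sqrt(divalent / 2.0),
        # Sodium percentage (counts K+ together with Na+)
        "Na%": (na + k) / (divalent + na + k) * 100.0,
        # Soluble sodium percentage (one common variant, without K+)
        "SSP": na / (divalent + na) * 100.0,
        # Residual sodium carbonate
        "RSC": (hco3 + co3) - divalent,
        # Potential salinity
        "PS": cl + 0.5 * so4,
    }


# Hypothetical sample, for illustration only (meq/L)
sample = irrigation_indices(
    ca=8.0, mg=4.0, na=10.0, k=0.5, hco3=4.0, co3=0.0, cl=9.0, so4=9.5
)

for name, value in sample.items():
    print(f"{name}: {value:.2f}")
```

Each index is then compared against its own classification thresholds, which is how a sample can be acceptable on SAR or RSC while still falling into a restrictive PS class.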

Keywords