Scientific Reports (Dec 2024)

Enhancement of groundwater resources quality prediction by machine learning models on the basis of an improved DRASTIC method

  • Ali Bakhtiarizadeh,
  • Mohammad Najafzadeh,
  • Sedigheh Mohamadi

DOI
https://doi.org/10.1038/s41598-024-78812-6
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 24

Abstract

Determining groundwater vulnerability plays a crucial role in groundwater resource management. Reliable groundwater vulnerability maps support targeted, practical, and scientifically grounded measures for protecting and managing groundwater resources. In this study, two indicators derived from the DRASTIC index, the composite DRASTIC index (CD) and the nitrate vulnerability index (NVI), were used to evaluate the groundwater vulnerability of the Kerman–Baghin plain aquifer. Soft computing methods, namely Gene Expression Programming (GEP), Evolutionary Polynomial Regression (EPR), Multivariate Adaptive Regression Splines (MARS), and the M5 Model Tree (M5MT), were used to derive formulations for predicting the NVI. The models were fed nine input parameters: depth to water level, net recharge, aquifer media, soil media, topography, impact of the unsaturated (vadose) zone, hydraulic conductivity, land use, and potential risk associated with land use. In the testing stage, the EPR model performed best, with a correlation coefficient (R) of 0.9999 and a root mean square error (RMSE) of 0.2105, compared with MARS (R = 0.9966, RMSE = 2.408), M5MT (R = 0.9956, RMSE = 2.988), and GEP (R = 0.9920, RMSE = 3.491). The EPR and GEP models involve more complex mathematical computations than the other soft computing models, so the MARS and M5MT models, which have quadratic polynomial and multivariate linear structures, respectively, can be considered the best alternatives. According to the MARS model, the region falls into two vulnerability categories: very low vulnerability (73.06%) and low vulnerability (26.94%). Overall, the statistical results indicate that the soft computing techniques yield effective formulations for evaluating the DRASTIC index.
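The abstract ranks the models by the correlation coefficient (R) and the root mean square error (RMSE). As a minimal sketch of how these two scores are conventionally computed, assuming R denotes the Pearson correlation between reference and predicted NVI values (the abstract does not state the exact definition used), the following may help. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


# Hypothetical NVI values, invented for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]

r_value = pearson_r(observed, predicted)
rmse_value = rmse(observed, predicted)
print(f"R = {r_value:.4f}, RMSE = {rmse_value:.4f}")
```

R is dimensionless and insensitive to a constant offset or scaling of the predictions, whereas RMSE is expressed in the units of the index itself, which is why the two are usually reported together when comparing models.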

Keywords