Heliyon (Sep 2024)
Comprehensive evaluation and prediction of groundwater quality and risk indices using quantitative approaches, multivariate analysis, and machine learning models: An exploratory study
Abstract
Assessing and predicting groundwater quality is crucial for managing groundwater availability effectively. In the current study, groundwater quality was thoroughly appraised using various indexing methods, including the drinking water quality index (DWQI), heavy metal pollution index (HPI), pollution index (PI), metal index (MI), and degree of contamination (Cd), together with health risk indicators such as the hazard quotient (HQ) and total hazard index (HI). The assessments were augmented with multivariate analytical techniques, models based on recurrent neural networks (RNNs), and geographic information system (GIS) technology. Physicochemical parameters were measured in 48 groundwater wells in the El-Menoufia region, revealing distinct water types influenced by ion exchange, rock-water interactions, and silicate weathering. Notably, the groundwater showed elevated levels of certain metals, particularly manganese (Mn) and lead (Pb), which exceeded drinking water limits. The DWQI classified the bulk of the tested samples as suitable for consumption, assigning them to the "good" category, whereas a small number were considered of inferior quality. The HPI, MI, and Cd indices indicated significant pollution in the central part of the study region. The PI revealed that Pb, Mn, and Fe were significant contributors to water pollution, falling between classes IV (strongly affected) and V (seriously affected). The HQ and HI analyses identified the central area of the study region as particularly prone to metal contamination, signifying a high non-carcinogenic risk to children via both oral and dermal routes and to adults via oral exposure alone. Adults faced no health risk from dermal contact. Finally, the RNN simulation model effectively predicted the health and water quality indices in both the training and testing sets. For instance, the RNN model excelled in predicting the DWQI, for which three key parameters were crucial.
The model demonstrated an excellent fit on the training set, achieving an R² of 1.00 with a very low root mean squared error (RMSE) of 0.01. On the testing set, the model's performance decreased slightly, with an R² of 0.96 and an RMSE of 2.73. For HPI, the RNN model also performed exceptionally well as the primary predictor, with R² values of 1.00 (RMSE = 0.01) and 0.93 (RMSE = 27.35) for the training and testing sets, respectively. This study provides a unique perspective on integrating various techniques to gain a more comprehensive understanding of groundwater quality and its associated health risks, with a strong focus on feature selection strategies that enhance model accuracy and interpretability.
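For readers less familiar with the two fit metrics quoted above, the following is a minimal sketch of how R² and RMSE are commonly computed from observed and predicted index values. It assumes the usual definitions (R² as one minus the ratio of residual to total sum of squares, RMSE as the square root of the mean squared residual); the abstract does not state which R² variant the authors used. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted index values (illustration only,
# not data from the study).
observed = [40.0, 55.0, 62.0, 71.0, 90.0]
predicted = [42.0, 53.0, 65.0, 70.0, 88.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R2   = {r2_score(observed, predicted):.2f}")
```

Because RMSE is expressed in the units of the index being predicted, its magnitude depends on the numeric range of that index, so RMSE values for different indices are not directly comparable, whereas R² is unitless.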