Water (Feb 2024)

Identifying the Most Discriminative Parameter for Water Quality Prediction Using Machine Learning Algorithms

  • Tapan Chatterjee,
  • Usha Rani Gogoi,
  • Animesh Samanta,
  • Ayan Chatterjee,
  • Mritunjay Kumar Singh,
  • Srinivas Pasupuleti

DOI
https://doi.org/10.3390/w16030481
Journal volume & issue
Vol. 16, no. 3
p. 481

Abstract

Groundwater quality is a major concern because it directly affects human health and the growth of plants and vegetables. Given the severe impacts of inadequate water quality, it is imperative to find a swift and economical means of assessing it, and water quality prediction can support proper management of water resources. The present study considers thirty-seven water samples from the Pindrawan tank command area of Raipur district, Chhattisgarh, India. A total of nineteen physicochemical parameters were measured, of which seventeen were used to compute the weight-based groundwater quality index (WQI). The primary goal of this work is to identify the most effective parameters for WQI prediction. Of the seventeen parameters tested, the Mann-Whitney-Wilcoxon (MWW) statistical test revealed that five (Fe, Cr, Na, Ca, and Mg) are strongly statistically significant in distinguishing between drinkable and non-drinkable water. Of these five, Cr is the only parameter whose range of values differs between drinkable and non-drinkable water. To validate the efficiency of these statistically significant parameters, machine learning techniques such as Artificial Neural Networks (ANN) and Logistic Regression (LR) were used. The experimental results demonstrate that, of all seventeen parameters tested, Cr alone yields remarkably high classification accuracy: 91.67% using an ANN, much higher than the 66.67% obtained using all seventeen parameters together. The proposed methodology thus achieved good accuracy when classifying water samples as drinkable or non-drinkable using only one parameter, Cr.
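The parameter-screening step described in the abstract can be illustrated with a minimal sketch. It assumes a two-sided Mann-Whitney-Wilcoxon test computed with the normal approximation (with a tie correction) and a significance level of 0.05. The abstract specifies neither choice, and the measurements below are made-up illustrative values, not data from the study. The names `average_ranks`, `mann_whitney_u`, `samples`, and `ALPHA` are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided MWW test via the normal approximation (assumption: no
    continuity correction). Returns (U, p), where U is the smaller statistic."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    # Tie correction for the variance of U.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)  # z <= 0 because u is the minimum
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical measurements: (drinkable, non-drinkable) samples per parameter.
samples = {
    "Cr": ([0.01, 0.02, 0.03, 0.02, 0.04, 0.03],
           [0.08, 0.09, 0.11, 0.10, 0.12, 0.09]),
    "pH": ([7.1, 7.4, 6.9, 7.3, 7.0, 7.2],
           [7.2, 7.0, 7.5, 6.8, 7.3, 7.1]),
}

ALPHA = 0.05  # assumed significance level
results = {name: mann_whitney_u(a, b) for name, (a, b) in samples.items()}
significant = [name for name, (_, p) in results.items() if p < ALPHA]
print(significant)
```

With only a handful of samples per class, as in a thirty-seven-sample study, an exact test is usually preferable to the normal approximation. `scipy.stats.mannwhitneyu` offers one, and the parameters it flags could then be passed one at a time to a classifier such as logistic regression.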

Keywords