Water Practice and Technology (Dec 2023)
Drinking water potability prediction using machine learning approaches: a case study of Indian rivers
Abstract
Drinking water is the most precious resource on Earth. Over the past few decades, its quality has degraded significantly due to pollution. Water quality assessment is paramount for public well-being, since pollutants in drinking water can cause serious health problems. In developing countries such as India in particular, water quality is not properly assessed. This work uses machine learning techniques to predict the water quality of Indian rivers. It focuses on determining the potability of a water sample from the key parameters used to calculate its water quality index: water temperature, pH, electrical conductivity, dissolved oxygen, fecal coliform count, total coliform count, and biochemical oxygen demand. The approaches explored are K-nearest neighbor (KNN), Random Forest, and XGBoost, each with and without hyperparameter tuning, together with a sequential artificial neural network, in order to determine which model gives the most accurate potability predictions. XGBoost performed best, with an accuracy of 98.93%.
HIGHLIGHTS
The extreme gradient boosting classifier (XGBoost) is used to assess the water potability of Indian rivers.
A comparison with existing works shows that the proposed approach achieves better results.
Case studies are conducted on Indian rivers.
Four different models are used: K-nearest neighbor, XGBoost, Random Forest, and artificial neural networks.
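As a rough illustration of the classification task described in the abstract, the following minimal sketch implements a plain K-nearest neighbor classifier in Python over the seven parameters named above. It is not the authors' implementation: the sample values are invented for illustration and are not taken from the study, and the min-max scaling step, the choice of k = 3, and the function names (`fit_min_max`, `scale`, `knn_predict`, `accuracy`) are all assumptions.

```python
import math
from collections import Counter

# Hypothetical toy samples, invented for illustration only (not data from the study).
# Feature order: temperature (deg C), pH, electrical conductivity (uS/cm),
# dissolved oxygen (mg/L), BOD (mg/L), fecal coliform, total coliform (MPN/100 mL).
# Label: 1 = potable, 0 = not potable.
TRAIN = [
    ([24.0, 7.2, 180.0, 7.5, 1.0, 10.0, 40.0], 1),
    ([26.0, 7.5, 220.0, 6.8, 1.5, 20.0, 60.0], 1),
    ([25.0, 7.0, 200.0, 7.0, 1.2, 15.0, 50.0], 1),
    ([29.0, 8.6, 900.0, 3.5, 8.0, 2500.0, 6000.0], 0),
    ([30.0, 6.2, 1200.0, 2.8, 12.0, 4000.0, 9000.0], 0),
    ([28.0, 8.9, 1000.0, 3.0, 10.0, 3000.0, 7000.0], 0),
]


def fit_min_max(rows):
    """Return the (min, max) of every feature column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, bounds):
    """Min-max scale one sample so no single feature dominates the distance."""
    return [
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(row, bounds)
    ]


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    bounds = fit_min_max([features for features, _ in train])
    scaled_query = scale(query, bounds)
    # Euclidean distance from the query to every scaled training sample.
    distances = sorted(
        (math.dist(scale(features, bounds), scaled_query), label)
        for features, label in train
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    sample = [25.0, 7.1, 190.0, 7.2, 1.1, 12.0, 45.0]
    print("potable" if knn_predict(TRAIN, sample) == 1 else "not potable")
```

In practice, a model of this kind would be fitted on labeled river measurements and scored with `accuracy` on a held-out set; the same evaluation would then be repeated for each candidate model to compare them.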
Keywords