IEEE Access (Jan 2024)

Optimized Ensemble Methods for Classifying Imbalanced Water Quality Index Data

  • Zaharaddeen Karami Lawal,
  • Ali Aldrees,
  • Hayati Yassin,
  • Salisu Dan'azumi,
  • Sujay Raghavendra Naganna,
  • Sani I. Abba,
  • Saad Sh. Sammen

DOI
https://doi.org/10.1109/ACCESS.2024.3502361
Journal volume & issue
Vol. 12
pp. 178536 – 178551

Abstract

River water pollution has increased due to human activities. River water quality was initially classified with numerical and analytical methods, but machine learning now enables faster and more accurate water quality index (WQI) classification. This study aimed to develop an effective ensemble model for classifying river water as drinkable or polluted using advanced machine learning. The objective was to apply a classification method to predict WQI using data from the Kinta River in Malaysia and to improve on the 70–95% accuracy range of existing models. The dataset comprises 301 records collected from eight monitoring stations along the Kinta River and covers 31 pollution indicators, including hydrological, chemical, physical, and microbiological parameters. Six algorithms were used: decision tree, logistic regression, random forest, support vector machine (SVM), AdaBoost, and XGBoost. Three experiments were conducted, with and without hyperparameter tuning. The dataset was normalized and oversampled to address class imbalance. In all experiments, XGBoost was the best-performing individual model and SVM the worst. The ensemble models outperformed the individual models, with the GridSearchCV-tuned ensemble achieving 97.3% accuracy, 2.3 percentage points above the best accuracy reported for existing models in the literature. The study had limitations, such as the absence of advanced optimization and dimensionality reduction. In conclusion, it demonstrated that an ensemble model with optimized hyperparameters can classify river water quality more effectively than individual models, contributing to the advancement of the Sustainable Development Goals (SDGs) related to water access.
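
The preprocessing and ensembling steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not state which normalization, oversampling technique, or ensemble combination rule was used, so min-max scaling, random oversampling, and hard (majority) voting are assumed here purely for illustration. The function names, the toy feature values, and the three hypothetical base-model prediction lists are invented and are not taken from the Kinta River dataset.

```python
import random
from collections import Counter


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def majority_vote(predictions):
    """Hard voting: per sample, return the label most base models predicted."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy, invented data: two features, imbalanced binary labels.
X = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0], [5.0, 1000.0]]
y = ["polluted", "polluted", "polluted", "drinkable"]

X_norm = min_max_normalize(X)
X_bal, y_bal = random_oversample(X_norm, y)

# Predictions of three hypothetical base models on three test samples.
votes = majority_vote([
    ["polluted", "drinkable", "polluted"],
    ["polluted", "drinkable", "drinkable"],
    ["drinkable", "drinkable", "polluted"],
])
```

In practice the oversampling step would be applied to the training split only, so that duplicated rows do not leak into the evaluation data.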

Keywords