Water Science and Technology (Dec 2023)

Alleviating sample imbalance in water quality assessment using the VAE–WGAN–GP model

  • Jingbin Xu,
  • Degang Xu,
  • Kun Wan,
  • Ying Zhang

DOI
https://doi.org/10.2166/wst.2023.373
Journal volume & issue
Vol. 88, no. 11
pp. 2762 – 2778

Abstract

Water resources are essential for sustaining human life and promoting sustainable development. However, rapid urbanization and industrialization have reduced freshwater availability. Effective prevention and control of water pollution are essential for ecological balance and human well-being, and water quality assessment is crucial for monitoring and managing water resources. Because class sample distributions are commonly imbalanced in practical scenarios, existing machine learning-based assessment methods tend to assign results to the majority class, which makes their outcomes inaccurate. To tackle this issue, we propose a novel approach that utilizes the VAE–WGAN–GP model, which combines the encoding and decoding mechanisms of the VAE with the adversarial learning of the GAN. It generates synthetic samples that closely resemble real samples, effectively compensating for the scarce categories in water quality evaluation. Our contributions include (1) introducing a deep generative model to alleviate the issue of imbalanced category samples in water quality assessment, (2) demonstrating the faster convergence speed and improved potential distribution learning ability of the proposed VAE–WGAN–GP model, and (3) introducing the compensation degree concept and conducting comprehensive compensation experiments, resulting in a 9.7% increase in the accuracy of water quality assessment for multi-classification imbalance samples.

HIGHLIGHTS

  • Novel method: the VAE–WGAN–GP model is introduced to alleviate the problem of imbalanced category distribution in water quality evaluation and to improve the accuracy of assessment.
  • Water resource management: our research bridges the gap in the distribution of categories in water management by providing deep generative models to compensate for data scarcity in water quality assessments.
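
For orientation, the two ingredients named in the abstract can be written in their standard textbook forms. This is a sketch of the generic VAE and WGAN-GP objectives, not the authors' exact loss: the symbols (encoder q_φ, decoder p_θ, critic D, penalty weight λ) are assumptions introduced here for illustration, and the abstract does not state how the two terms are weighted or combined.

```latex
% Standard VAE objective (negative ELBO): reconstruction term plus KL regularizer.
% q_\phi is the encoder, p_\theta the decoder, p(z) the latent prior (assumed notation).
\mathcal{L}_{\mathrm{VAE}}
  = -\,\mathbb{E}_{z \sim q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  + \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)

% Standard WGAN-GP critic loss: Wasserstein estimate plus gradient penalty.
% P_r is the real-data distribution, P_g the distribution of generated samples.
\mathcal{L}_{D}
  = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
  - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
  + \lambda\,\mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
    \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]

% The penalty is evaluated at random interpolates between real and generated samples.
\hat{x} = \epsilon\,x + (1 - \epsilon)\,\tilde{x},
\qquad \epsilon \sim U[0, 1]
```

In a VAE-GAN style hybrid, the VAE decoder typically doubles as the generator, so the generated samples in the critic loss would come from decoding latent codes; whether the paper follows exactly this arrangement should be checked against the full text.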

Keywords