PeerJ (Dec 2019)

Quantitative estimation of wastewater quality parameters by hyperspectral band screening using GC, VIP and SPA

  • Zheng Xing,
  • Junying Chen,
  • Xiao Zhao,
  • Yu Li,
  • Xianwen Li,
  • Zhitao Zhang,
  • Congcong Lao,
  • Haifeng Wang

DOI
https://doi.org/10.7717/peerj.8255
Journal volume & issue
Vol. 7
p. e8255

Abstract

Water pollution hinders sustainable development worldwide. Accurate inversion of water quality parameters in sewage by visible-near infrared spectroscopy can support more effective and rational use and management of water resources. However, the accuracy of spectral models of water quality parameters is often degraded by noise and by the high dimensionality of spectral data. This study aimed to improve model accuracy by building the spectral models on the sensitive spectral intervals of each water quality parameter. To this end, six kinds of sewage water taken from a biological sewage treatment plant underwent laboratory physical and chemical tests. In total, 87 sewage water samples were obtained by adding different amounts of pure water to them. The raw reflectance (Rraw) of the samples was collected with Analytical Spectral Devices (ASD) instrumentation. Rraw-SNV was obtained by applying the standard normal variate (SNV) transformation to Rraw. Then, the sensitive spectral intervals of each of the six water quality parameters, namely chemical oxygen demand (COD), biological oxygen demand (BOD), NH3-N, total dissolved substances (TDS), total hardness (TH) and total alkalinity (TA), were selected using three methods: gray correlation (GC), variable importance in projection (VIP) and set pair analysis (SPA). Finally, the performance of extreme learning machine (ELM) and partial least squares regression (PLSR) models built on the sensitive spectral intervals was investigated. The results showed that model accuracy differed with the method used to screen the sensitive spectral ranges. The GC method performed better at reducing redundancy, and the VIP method was better at preserving information. The SPA method achieved the best trade-off between information preservation and redundancy reduction, and it retained the most spectral band intervals that responded well to the inversion parameters. Ranked by the accuracy of the resulting models, GC was highest, SPA came next and VIP was lowest. On the whole, PLSR and ELM both achieved satisfactory model accuracy, but the prediction accuracy of ELM was higher than that of PLSR. The optimal inversion accuracy differed greatly among water quality parameters: it was very high for COD, BOD and TN; relatively high for TA; and relatively low for TDS and TH. These findings offer a new way to optimize spectral models of wastewater biochemical parameters and thus improve their prediction precision.
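As an illustration of the SNV preprocessing step named in the abstract, the following is a minimal sketch, assuming the common per-spectrum definition of SNV (each spectrum is centered on its own mean and scaled by its own sample standard deviation). The function name `snv` and the reflectance values are hypothetical and are not taken from the study, which may have used a different implementation.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Return the standard normal variate of one reflectance spectrum.

    Assumption: SNV is applied per spectrum, using the sample
    standard deviation (n - 1 in the denominator).
    """
    mu = mean(spectrum)
    sigma = stdev(spectrum)
    if sigma == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(r - mu) / sigma for r in spectrum]


# Hypothetical raw reflectance values at five wavelengths (not study data).
raw = [0.10, 0.12, 0.15, 0.11, 0.09]
corrected = snv(raw)
```

After the transformation each spectrum has zero mean and unit standard deviation, which reduces additive and multiplicative differences between samples before band screening and regression.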

Keywords