Ingeniería (Jan 2017)

Detection of Outliers and Imputing of Missing Values for Water Quality UV-VIS Absorbance Time Series

  • Leonardo Plazas-Nossa,
  • Miguel Antonio Ávila Angulo,
  • Andres Torres

DOI
https://doi.org/10.14483/udistrital.jour.reving.2017.1.a01
Journal volume & issue
Vol. 22, no. 1
pp. 09 – 22

Abstract

Context: The collection of UV-Vis absorbance data with online optical sensors for water quality monitoring may yield outliers and/or missing values. Data pre-processing is therefore a necessary prerequisite to processing monitoring data. The aim of this study is to propose a method that detects and removes outliers and fills gaps in time series.

Method: Outliers are detected using a Winsorising procedure, and the Discrete Fourier Transform (DFT) and the Inverse Fast Fourier Transform (IFFT) are applied to complete the time series. Together, these tools were used to analyse a case study comprising three sites in Colombia, monitored via UV-Vis (ultraviolet and visible) spectra: (i) Bogotá D.C. Salitre-WWTP (Waste Water Treatment Plant), influent; (ii) Bogotá D.C. Gibraltar Pumping Station (GPS); and (iii) Itagüí, San Fernando-WWTP, influent (Medellín metropolitan area).

Results: Outlier detection with the proposed method gave promising results when the window parameter values are small and self-similar, even though the three time series differ in size and behaviour. The DFT made it possible to fill gaps of missing values of different lengths. To assess the validity of the proposed method, continuous subsets (sections) of the absorbance time series free of outliers and missing values were removed from the original time series and reconstructed, giving an average error rate of 12% across the three test time series.

Conclusions: Applying the DFT and the IFFT with the 10% most important harmonics of the useful values produces time series that can later be used in different applications, specifically for time series of water quality and quantity in urban sewer systems. One potential application is the analysis of dry weather in relation to rain events, achieved by detecting values that correspond to unusual behaviour in a time series. Additionally, the results hint at the potential of the method for correcting other hydrologic time series.
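
To make the two steps named in the abstract more concrete, the following Python sketch shows one possible reading of them; it is not the authors' implementation. The function names (`flag_outliers_winsor`, `fill_gaps_dft`), the global 1st/99th percentile limits, and the linear pre-fill of gaps before the transform are all assumptions made for illustration. The abstract mentions a window parameter for the Winsorising step, which this simplified global version does not reproduce. Only the idea of keeping the 10% most important harmonics comes from the abstract.

```python
import numpy as np


def flag_outliers_winsor(x, lower_pct=1.0, upper_pct=99.0):
    """Mark values outside the percentile limits as missing (NaN).

    Assumption: limits are computed once over the whole series;
    the percentile values are illustrative, not from the paper.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lower_pct, upper_pct])
    cleaned = x.copy()
    # Comparisons with NaN evaluate to False, so existing gaps are untouched.
    cleaned[(x < lo) | (x > hi)] = np.nan
    return cleaned


def fill_gaps_dft(x, keep_fraction=0.10):
    """Fill NaN gaps using a reconstruction from the strongest harmonics.

    Assumption: gaps are first bridged by linear interpolation so that
    the DFT can be computed on a complete series.
    """
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    if not missing.any():
        return x.copy()

    idx = np.arange(x.size)
    provisional = x.copy()
    provisional[missing] = np.interp(idx[missing], idx[~missing], x[~missing])

    # DFT of the real-valued series (computed with the FFT algorithm).
    spectrum = np.fft.rfft(provisional)

    # Keep only the fraction of harmonics with the largest magnitude.
    n_keep = max(1, int(np.ceil(keep_fraction * spectrum.size)))
    strongest = np.argsort(np.abs(spectrum))[-n_keep:]
    filtered = np.zeros_like(spectrum)
    filtered[strongest] = spectrum[strongest]

    # Inverse transform back to the time domain.
    reconstruction = np.fft.irfft(filtered, n=x.size)

    # Observed values are kept; only the gaps take reconstructed values.
    filled = x.copy()
    filled[missing] = reconstruction[missing]
    return filled


if __name__ == "__main__":
    # Synthetic stand-in for one absorbance wavelength (not real data).
    t = np.arange(200)
    series = 1.0 + 0.5 * np.sin(2 * np.pi * t / 50)
    series[20] = 10.0          # artificial spike
    series[80:85] = np.nan     # artificial gap
    completed = fill_gaps_dft(flag_outliers_winsor(series))
    print("remaining NaNs:", int(np.isnan(completed).sum()))
```

A validation in the spirit of the one described in the Results could remove a known, clean section of a series, fill it with `fill_gaps_dft`, and compare the filled values with the withheld ones.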

Keywords