Atmosphere (Mar 2022)

Learning Calibration Functions on the Fly: Hybrid Batch Online Stacking Ensembles for the Calibration of Low-Cost Air Quality Sensor Networks in the Presence of Concept Drift

  • Evangelos Bagkis,
  • Theodosios Kassandros,
  • Kostas Karatzas

DOI
https://doi.org/10.3390/atmos13030416
Journal volume & issue
Vol. 13, no. 3
p. 416

Abstract

Read online

Deployment of an air quality low-cost sensor network (AQLCSN), with proper calibration of low-cost sensors (LCS), offers the potential to substantially increase the ability to monitor air pollution. However, to leverage this potential, several drawbacks must be ameliorated, thus the calibration of such sensors is becoming an essential component in their use. Commonly, calibration takes place in a laboratory environment using gasses of known composition to measure the response and a linear calibration is often reached. On site calibration is a promising complementary technique where an LCS and a reference instrument are collocated with the former being calibrated to match the measurements of the latter. In a scenario where an AQLCSN is already operational, both calibration approaches are resource and time demanding procedures to be implemented as frequently repeated actions. Furthermore, sensors are sensitive to the local meteorology and adaptation is a slow process making relocation a complex and expensive option. We concentrate our efforts in keeping the LCS positions fixed and propose to blend a genetic algorithm (GA) with a hybrid stacking (HS) ensemble into the GAHS framework. GAHS employs a combination of batch machine learning algorithms and regularly updated online machine learning calibration function(s) for the whole network when a small number of reference instruments are present. Furthermore, we introduce the concept of spatial online learning to achieve better spatial generalization. The frameworks are tested for the case of Thessaloniki where a total of 33 devices are installed. The AQLCSN is calibrated on the basis of on-site matching with high quality observations from three reference station measurements. The O3 LCS are successfully calibrated for 8–10 months and the PM10 LCS calibration is evaluated for 13–24 months showing a strong seasonal dependence on their ability to correctly capture the pollution levels.

Keywords