Sensors (Jul 2023)

Calibration Assessment of Low-Cost Carbon Dioxide Sensors Using the Extremely Randomized Trees Algorithm

  • Tiago Araújo,
  • Lígia Silva,
  • Ana Aguiar,
  • Adriano Moreira

DOI
https://doi.org/10.3390/s23136153
Journal volume & issue
Vol. 23, no. 13
p. 6153

Abstract

Because carbon dioxide monitoring is an important proxy for estimating air quality in indoor and outdoor environments, it is essential to obtain trustworthy data from CO2 sensors. However, widely available low-cost sensors may deliver lower data quality, especially in terms of accuracy. This paper proposes a new approach for enhancing the accuracy of low-cost CO2 sensors using the extremely randomized trees algorithm. It also reports results obtained from experimental data collected from sensors exposed to both indoor and outdoor environments. The indoor experimental setup comprised two metal oxide semiconductor (MOS) sensors and two non-dispersive infrared (NDIR) sensors placed next to a reference sensor for carbon dioxide and independent sensors for air temperature and relative humidity. The outdoor exposure analysis used a third-party dataset that suited our goals: it comprised fourteen stations with low-cost NDIR sensors geographically spread around reference stations. One calibration model was trained for each sensor unit separately. In the indoor experiment, the calibration reduced the mean absolute error (MAE) of the NDIR sensors by up to 90% and reached very good linearity with the MOS sensors (r² value of 0.994); in the outdoor dataset, it reduced the MAE by up to 98%. In the outdoor dataset analysis, we found that the exposure time of the sensor itself may be considered by the algorithm to achieve better accuracy. We also observed that even a relatively small amount of data may provide enough information for a useful calibration if it contains enough variety. We conclude that the proper use of machine learning algorithms on sensor readings can be very effective for obtaining higher data quality from low-cost gas sensors, indoors or outdoors, regardless of the sensor technology.
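
To make the per-sensor calibration idea concrete, the following is a minimal sketch using scikit-learn's `ExtraTreesRegressor`, not the authors' implementation. Everything specific in it is an assumption made for illustration: the synthetic data, the simulated sensor response, the feature set (raw reading, air temperature, relative humidity), the hyperparameters, and the random train/test split.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for co-located measurements (NOT data from the paper).
reference_ppm = rng.uniform(400, 2000, n)   # reference CO2 instrument
temperature_c = rng.uniform(15, 30, n)      # independent temperature sensor
humidity_pct = rng.uniform(30, 70, n)       # independent humidity sensor

# Hypothetical low-cost sensor response: gain and offset error,
# temperature and humidity cross-sensitivity, and random noise.
raw_ppm = (
    0.8 * reference_ppm
    + 150
    + 5 * (temperature_c - 22)
    - 2 * (humidity_pct - 50)
    + rng.normal(0, 15, n)
)

# One model per sensor unit: features are that unit's raw reading plus
# the environmental covariates; the target is the reference reading.
X = np.column_stack([raw_ppm, temperature_c, humidity_pct])
X_train, X_test, y_train, y_test = train_test_split(
    X, reference_ppm, test_size=0.3, random_state=0
)

model = ExtraTreesRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Compare MAE against the reference before and after calibration.
mae_raw = mean_absolute_error(y_test, X_test[:, 0])
mae_calibrated = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE raw: {mae_raw:.1f} ppm, MAE calibrated: {mae_calibrated:.1f} ppm")
```

With real deployments the split would normally respect time order rather than being random, and a feature such as elapsed exposure time could be appended as an extra column of `X`, in line with the outdoor finding reported above.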

Keywords