Applying machine learning for large scale field calibration of low‐cost PM2.5 and PM10 air pollution sensors

Priscilla Adong; Engineer Bainomugisha; Deo Okure; Richard Sserunjogi

doi:10.1002/ail2.76

Applied AI Letters (Sep 2022)

Affiliations

Priscilla Adong: AirQo, Department of Computer Science, College of Computing and Information Sciences, Makerere University, Kampala, Uganda
Engineer Bainomugisha: AirQo, Department of Computer Science, College of Computing and Information Sciences, Makerere University, Kampala, Uganda
Deo Okure: AirQo, Department of Computer Science, College of Computing and Information Sciences, Makerere University, Kampala, Uganda
Richard Sserunjogi: AirQo, Department of Computer Science, College of Computing and Information Sciences, Makerere University, Kampala, Uganda

DOI: https://doi.org/10.1002/ail2.76
Journal volume & issue: Vol. 3, no. 3

Abstract

Low‐cost air quality monitoring networks can increase the availability of high‐resolution monitoring data and so support analytic, evidence‐informed approaches to air quality management. This is particularly relevant in low‐ and middle‐income settings, where access to traditional reference‐grade monitoring networks remains a challenge. However, low‐cost air quality sensors are affected by ambient conditions, which can lead to over‐ or underestimation of pollutant concentrations, so they require field calibration to improve their accuracy and reliability. In this paper, we demonstrate the feasibility of using machine learning methods for large‐scale calibration of AirQo sensors, low‐cost PM sensors custom‐designed for and deployed in Sub‐Saharan urban settings. We assess ten machine learning methods (k‐nearest neighbours, support vector regression, multivariate linear regression, ridge regression, lasso regression, elastic net regression, XGBoost, multilayer perceptron, random forest and gradient boosting) by comparing the model‐corrected PM with collocated reference PM concentrations from a Beta Attenuation Monitor (BAM). Random forest and lasso regression performed best for PM2.5 and PM10 calibration, respectively. The random forest model reduced the RMSE of the raw PM2.5 data from 18.6 μg/m3 to 7.2 μg/m3 (average BAM PM2.5 concentration: 37.8 μg/m3), while the lasso regression model reduced the RMSE of the raw PM10 data from 13.4 μg/m3 to 7.9 μg/m3 (average BAM PM10 concentration: 51.1 μg/m3). We validate our models through cross‐unit and cross‐site validation, which also allows us to analyse the consistency of AirQo devices. The resulting calibration models were deployed to the entire large‐scale air quality monitoring network of over 120 AirQo devices, demonstrating the use of machine learning systems to address practical challenges in a developing world setting.
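
The evaluation idea in the abstract (fit a calibration model on collocated data, then compare the RMSE of raw and corrected readings against the reference monitor) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the bias and noise values are invented, and a single‐variable least‐squares fit stands in for the models the authors actually compared. It is not the paper's pipeline and does not reproduce its reported figures.

```python
import math
import random


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def fit_linear(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic stand-in for collocated data (hypothetical values, in ug/m3):
# the "reference" plays the role of the BAM, and the "raw" low-cost reading
# is given an invented multiplicative bias, offset and Gaussian noise.
rng = random.Random(0)
reference = [rng.uniform(10.0, 80.0) for _ in range(400)]
raw = [1.4 * r + 5.0 + rng.gauss(0.0, 4.0) for r in reference]

# Fit on the first 300 samples, evaluate on the remaining 100.
train_raw, test_raw = raw[:300], raw[300:]
train_ref, test_ref = reference[:300], reference[300:]

# Calibration maps a raw reading to an estimate of the reference value.
slope, intercept = fit_linear(train_raw, train_ref)
calibrated = [slope * x + intercept for x in test_raw]

rmse_raw = rmse(test_raw, test_ref)
rmse_calibrated = rmse(calibrated, test_ref)
print(f"RMSE raw: {rmse_raw:.1f}  RMSE calibrated: {rmse_calibrated:.1f}")
```

Holding out the last 100 samples is only a simple train/test split. The cross‐unit and cross‐site validation mentioned in the abstract would instead hold out whole devices or whole sites.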

Published in Applied AI Letters

ISSN: 2689-5595 (Online)
Publisher: Wiley
Country of publisher: United Kingdom
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://onlinelibrary.wiley.com/journal/26895595
