Constructing transferable and interpretable machine learning models for black carbon concentrations

Pak Lun Fung; Marjan Savadkoohi; Martha Arbayani Zaidan; Jarkko V. Niemi; Hilkka Timonen; Marco Pandolfi; Andrés Alastuey; Xavier Querol; Tareq Hussein; Tuukka Petäjä

Environment International (Feb 2024)

Affiliations

Pak Lun Fung: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Corresponding authors at: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland (P.L. Fung); Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain (M. Savadkoohi).
Marjan Savadkoohi: Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain; Department of Mining, Industrial and ICT Engineering (EMIT), Manresa School of Engineering (EPSEM), Universitat Politècnica de Catalunya (UPC), Manresa 08242, Spain; Corresponding authors at: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland (P.L. Fung); Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain (M. Savadkoohi).
Martha Arbayani Zaidan: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Helsinki Institute of Sustainability Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Department of Computer Science, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland
Jarkko V. Niemi: Helsinki Region Environmental Services Authority (HSY), Helsinki FI-00066, Finland
Hilkka Timonen: Atmospheric Composition Research, Finnish Meteorological Institute, Helsinki FI-00560, Finland
Marco Pandolfi: Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
Andrés Alastuey: Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
Xavier Querol: Institute of Environmental Assessment and Water Research (IDAEA-CSIC), Barcelona, Spain
Tareq Hussein: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland; Environmental and Atmospheric Research Laboratory (EARL), Department of Physics, School of Science, Amman 11942, Jordan
Tuukka Petäjä: Institute for Atmospheric and Earth System Research / Physics, Faculty of Science, University of Helsinki, Helsinki FI-00560, Finland

Journal volume & issue: Vol. 184, p. 108449

Abstract

Black carbon (BC) has received increasing attention from researchers due to its adverse health effects. However, in-situ BC measurements are often not included as a regulated variable in air quality monitoring networks. Machine learning (ML) models have been studied extensively as virtual sensors to complement reference instruments. This study evaluates and compares three white-box (WB) and four black-box (BB) ML models for estimating BC concentrations, with a focus on their transferability and interpretability. We train the models with long-term air pollutant and weather measurements from an urban background site in Barcelona, and test them at other European urban and traffic sites. Despite the differences in geographical location and measurement site, BC correlates most strongly with the particle number concentration of the accumulation mode (PNacc, r = 0.73–0.85) and nitrogen dioxide (NO2, r = 0.68–0.85), and most weakly with meteorological parameters. Owing to the similarity in correlation behaviour, the ML models trained in Barcelona perform well at the traffic site in Helsinki (R2 = 0.80–0.86; mean absolute error MAE = 3.90–4.73 %) and at the urban background site in Dresden (R2 = 0.79–0.84; MAE = 4.23–4.82 %). WB models appear to explain less of the variability of BC than BB models, among which the long short-term memory (LSTM) model outperforms the rest. In terms of interpretability, we adopt several methods for the individual models to quantify and normalize the relative importance of each input feature. The overall static relative importance commonly used for WB models gives results that differ from the dynamic values used to show local contributions for BB models. PNacc and NO2 on average have the strongest absolute static contribution; however, they affect the estimation both positively and negatively at different sites. This comprehensive analysis demonstrates the possibility of transferring these interpretable air pollutant ML models across space and time.
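
The evaluation quantities named in the abstract (R2, MAE, and a normalized relative importance per input feature) can be illustrated with a minimal Python sketch. It uses the textbook definitions of R2 and MAE and a simple sum-to-one normalization of absolute importance scores. These are assumptions for illustration only: the abstract does not state the exact formulas used in the paper (for example, how MAE is expressed in per cent), and all numbers and the `temperature` feature below are hypothetical.

```python
from typing import Dict, Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def normalize_importance(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale absolute importance scores so that they sum to 1 (assumed scheme)."""
    total = sum(abs(value) for value in raw.values())
    return {name: abs(value) / total for name, value in raw.items()}


# Hypothetical BC observations at a test site and estimates from a model
# trained elsewhere; not data from the study.
observed_bc = [1.0, 2.0, 3.0, 4.0]
predicted_bc = [1.5, 2.0, 2.5, 4.0]

# Hypothetical signed importance scores, e.g. standardized coefficients.
raw_importance = {"PNacc": 0.6, "NO2": -0.3, "temperature": 0.1}

print("R2:", r_squared(observed_bc, predicted_bc))
print("MAE:", mean_absolute_error(observed_bc, predicted_bc))
print("Relative importance:", normalize_importance(raw_importance))
```

Taking absolute values before normalizing discards the sign of a contribution, which is why a static ranking of this kind can differ from local, signed contributions computed per prediction.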

Published in Environment International

ISSN: 0160-4120 (Print)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences
Website: https://www.journals.elsevier.com/environment-international
