Archives of Environmental Protection (Mar 2020)

Explicit and implicit description of the factors impact on the NO₂ concentration in the traffic corridor

  • Joanna Amelia Kamińska,
  • Tomasz Turek

DOI
https://doi.org/10.24425/aep.2020.132530
Journal volume & issue
Vol. 46, no. 1
pp. 93 – 99

Abstract

High concentrations of nitrogen dioxide in the air, particularly in heavily urbanized areas, have an adverse effect on many aspects of residents’ health. A method is proposed for modelling daily average, minimum and maximum atmospheric NO₂ concentrations in a conurbation, using two types of modelling: multiple linear regression (LR) and an advanced data mining technique, Random Forest (RF). The Random Forest technique was shown to predict daily NO₂ concentrations successfully from 2015–2017 data and to give a better fit than the linear models. The best results were obtained for predicting daily average NO₂ values, with R² = 0.69 and RMSE = 7.47 μg/m³. The cost of obtaining an explicit, interpretable function is a much worse fit (R² from 0.32 to 0.57). Verification of the models on independent data from the first half of 2018 confirmed their validity, with a mean absolute percentage error of 16.5% for RF and 28% for LR when modelling daily average concentration. The most important factors were wind conditions and traffic flow. In predicting the maximum daily concentration, air temperature and air humidity take on greater importance. The prevailing westerly and south-westerly winds in Wrocław effectively ventilate the city at the studied intersection. In summary, when modelling natural phenomena, a compromise should be sought between the accuracy of the model and its interpretability.

Keywords