Algorithms (Feb 2021)

Feature and Language Selection in Temporal Symbolic Regression for Interpretable Air Quality Modelling

  • Estrella Lucena-Sánchez,
  • Guido Sciavicco,
  • Ionel Eduard Stan

DOI
https://doi.org/10.3390/a14030076
Journal volume & issue
Vol. 14, no. 3
p. 76

Abstract

Air quality modelling that relates meteorological, car traffic, and pollution data is a fundamental problem that has been approached in several different ways in the recent literature. In particular, a set of such data sampled at a specific location over a specific period of time can be seen as a multivariate time series, and modelling the pollutant concentrations can be seen as a multivariate temporal regression problem. In this paper, we propose a new method for symbolic multivariate temporal regression and apply it to several data sets that contain real air quality data from the city of Wrocław (Poland). Our experiments show that our approach outperforms classical ones, especially symbolic ones, in both statistical performance and interpretability of the results.
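
As an illustration of the framing described in the abstract, the following minimal sketch shows one common way to turn a multivariate time series into a regression table with lagged predictors. It is an assumption made for illustration only and is not the method proposed in the paper; the helper `lagged_rows`, the variable names, and the toy values are all hypothetical.

```python
def lagged_rows(series, target, max_lag):
    """Build (features, target value) pairs from a multivariate time series.

    series:  dict mapping variable name -> list of values (all equal length).
    target:  name of the variable to predict at time t.
    max_lag: number of past steps (t-1 .. t-max_lag) used as predictors.
    """
    names = sorted(series)
    n = len(series[target])
    rows = []
    for t in range(max_lag, n):
        # One predictor per (variable, lag) pair, e.g. "traffic(t-1)".
        features = {
            f"{name}(t-{lag})": series[name][t - lag]
            for name in names
            for lag in range(1, max_lag + 1)
        }
        rows.append((features, series[target][t]))
    return rows


# Hypothetical toy data, not taken from the Wrocław data sets.
toy = {
    "no2": [40.0, 42.0, 45.0, 43.0],
    "traffic": [100, 120, 130, 110],
    "wind": [3.0, 2.5, 2.0, 2.8],
}
rows = lagged_rows(toy, target="no2", max_lag=2)
```

Each row pairs the lagged values of every variable with the pollutant value at time t, so any standard regressor, symbolic or otherwise, could in principle be fitted to the resulting table.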

Keywords