Proceedings of the International Conference on Applied Innovations in IT (Mar 2020)

Prediction of Air Pollution Concentration Using Weather Data and Regression Models

  • Aleksandar Trenchevski,
  • Marija Kalendar,
  • Hristijan Gjoreski,
  • Danijela Efnusheva

DOI
https://doi.org/10.25673/32749
Journal volume & issue
Vol. 8, no. 1
pp. 55 – 61

Abstract

Air pollution is becoming a global environmental problem in both developed and developing countries. It has greatly affected the health and lives of millions of people, increasing mortality rates and reports of pollution-induced diseases. This paper proposes machine learning methods for predicting potentially elevated air pollution levels in several areas by processing data gathered from multiple weather and air quality measurement stations. The data were gathered over a period of several years and comprise air quality and pollution measurements together with weather data, including temperature, humidity and wind characteristics. The development process included feature extraction, feature selection for removing redundancy, and finally training multiple regression models with hyperparameter optimization. Pollutants and the air quality index (AQI) were used as target variables, and appropriate regression models were trained. The experiments show that XGBoost is the most accurate, achieving an MAE of 8.9 for Center, 8.9 for Karpos and 7.3 for Kumanovo municipality for the PM10 pollutant. The improvements over the baseline, a Dummy regressor, are significant, reducing the MAE by 12 on average.
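
The evaluation described above, comparing a trained regressor against a constant baseline by mean absolute error (MAE), can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration: the wind and PM10 values are synthetic, the function names are hypothetical, and a one-feature least-squares line stands in for the XGBoost models used in the paper. The baseline predicts the training mean, which is how a mean-strategy dummy regressor behaves.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_dummy(y_train):
    """Baseline: always predict the mean of the training targets."""
    constant = mean(y_train)
    return lambda x: constant


def fit_linear(x_train, y_train):
    """One-feature ordinary least squares (a stand-in, not XGBoost)."""
    x_bar, y_bar = mean(x_train), mean(y_train)
    sxx = sum((x - x_bar) ** 2 for x in x_train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_train, y_train))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


# Synthetic, made-up readings: wind speed (m/s) and PM10 concentration.
wind_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pm10_train = [82.0, 69.0, 61.0, 48.0, 41.0, 29.0]
wind_test = [1.5, 3.5, 5.5]
pm10_test = [75.0, 56.0, 34.0]

dummy = fit_dummy(pm10_train)
linear = fit_linear(wind_train, pm10_train)

mae_dummy = mean_absolute_error(pm10_test, [dummy(x) for x in wind_test])
mae_linear = mean_absolute_error(pm10_test, [linear(x) for x in wind_test])

# The reported "improvement over the baseline" is a difference of MAEs.
improvement = mae_dummy - mae_linear
print(f"baseline MAE={mae_dummy:.2f}, model MAE={mae_linear:.2f}, "
      f"improvement={improvement:.2f}")
```

The same comparison applies per target variable and per location: fit both models on the training period, score both on held-out data, and report the reduction in MAE relative to the baseline.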

Keywords