Atmosphere (Nov 2024)

Quantifying the Impact of Multiple Factors on Air Quality Model Simulation Biases Using Machine Learning

  • Chunying Fan,
  • Ruilin Wang,
  • Ge Song,
  • Mengfan Teng,
  • Maolin Zhang,
  • Huangchuan Liu,
  • Zhujun Li,
  • Siwei Li,
  • Jia Xing

DOI
https://doi.org/10.3390/atmos15111337
Journal volume & issue
Vol. 15, no. 11
p. 1337

Abstract

Accurate air pollutant prediction is essential for addressing environmental and public health concerns. Air quality models such as WRF-CMAQ simulate pollutant concentrations, but their results often deviate significantly from observed concentrations. To identify the sources of these model biases, we applied the XGBoost machine learning algorithm to assess the performance of WRF-CMAQ in predicting air pollutants across two regions in China. XGBoost models trained with observations achieved high accuracy (R > 0.95), indicating that the selected features effectively capture pollutant variations. When trained on WRF-CMAQ inputs, XGBoost still improved performance but revealed biases linked to both model inputs (10–60%) and model mechanisms (1–30%). The analysis identified previous-hour pollutant levels as the largest contributor to bias, followed by meteorological variables. The study highlights the need to improve both model inputs and mechanisms in order to enhance future air quality predictions and support pollution control strategies.

Keywords