Atmospheric Chemistry and Physics (Aug 2024)

Estimation of ground-level NO<sub>2</sub> and its spatiotemporal variations in China using GEMS measurements and a nested machine learning model

  • N. Ahmad,
  • C. Lin,
  • A. K. H. Lau,
  • J. Kim,
  • T. Zhang,
  • F. Yu,
  • C. Li,
  • Y. Li,
  • J. C. H. Fung,
  • X. Q. Lao

DOI
https://doi.org/10.5194/acp-24-9645-2024
Journal volume & issue
Vol. 24
pp. 9645 – 9665

Abstract

The major link between satellite-derived vertical column densities (VCDs) of nitrogen dioxide (NO<sub>2</sub>) and ground-level concentrations is theoretically the NO<sub>2</sub> mixing height (NMH). Existing studies have used various meteorological parameters as a proxy for NMH. This study developed a nested XGBoost machine learning model to convert VCDs of NO<sub>2</sub> into ground-level NO<sub>2</sub> concentrations across China using Geostationary Environmental Monitoring Spectrometer (GEMS) measurements. The nested model was designed to incorporate NMH directly into the methodological framework for estimating satellite-derived ground-level NO<sub>2</sub> concentrations. The inner machine learning model predicted the NMH from meteorological parameters, and the predicted NMH was then input into the main XGBoost machine learning model to predict ground-level NO<sub>2</sub> concentrations from the NO<sub>2</sub> VCDs. The inclusion of NMH significantly enhanced the accuracy of the ground-level NO<sub>2</sub> concentration estimates: the R<sup>2</sup> values improved from 0.73 to 0.93 in 10-fold cross-validation and from 0.88 to 0.99 in the fully trained model. Furthermore, NMH was identified as the second most important predictor variable, following the VCDs of NO<sub>2</sub>. Subsequently, the satellite-derived ground-level NO<sub>2</sub> data were analyzed across subregions with varying geographic locations and urbanization levels. Highly populated areas typically experienced peak NO<sub>2</sub> concentrations during the early morning rush hour, whereas lightly populated areas experienced a slight increase in NO<sub>2</sub> levels 1 or 2 h later, likely due to regional pollutant dispersion from urban sources. This study underscores the importance of incorporating NMH when estimating ground-level NO<sub>2</sub> from satellite column measurements and highlights the significant advantages of geostationary satellites in providing detailed air pollution information at an hourly resolution.
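
The theoretical link between column density and ground-level concentration can be illustrated with a minimal sketch. It is not the study's nested XGBoost model: it assumes, purely for illustration, that NO2 is uniformly mixed from the surface up to the NMH, so the near-surface number density is simply the column divided by the mixing height. The function name `surface_no2_ug_m3` and the input values are hypothetical and are not taken from the paper.

```python
# Idealized conversion from an NO2 vertical column density (VCD) to a
# ground-level concentration, ASSUMING uniform mixing below the NO2
# mixing height (NMH). This is an illustration of the physical link only,
# not the machine learning method described in the abstract.

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
M_NO2 = 46.0055           # molar mass of NO2 in g/mol


def surface_no2_ug_m3(vcd_molec_cm2: float, nmh_m: float) -> float:
    """Return the NO2 concentration in ug/m^3 for a uniformly mixed layer.

    vcd_molec_cm2: vertical column density in molecules per cm^2
    nmh_m:         mixing height in metres (hypothetical input)
    """
    if nmh_m <= 0:
        raise ValueError("mixing height must be positive")
    molec_per_m2 = vcd_molec_cm2 * 1e4      # cm^-2 -> m^-2
    molec_per_m3 = molec_per_m2 / nmh_m     # spread the column over the layer depth
    grams_per_m3 = molec_per_m3 / AVOGADRO * M_NO2
    return grams_per_m3 * 1e6               # g -> ug


if __name__ == "__main__":
    vcd = 1e16  # illustrative column, molecules per cm^2
    for nmh in (500.0, 1000.0, 2000.0):
        conc = surface_no2_ug_m3(vcd, nmh)
        print(f"NMH = {nmh:6.0f} m -> {conc:.2f} ug/m^3")
```

Under this assumption the same column yields half the surface concentration when the mixing height doubles, which is one way to see why a model given only the VCD cannot separate a shallow, polluted layer from a deep, diluted one.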