Geoscientific Model Development (Jul 2023)

An optimized semi-empirical physical approach for satellite-based PM2.5 retrieval: embedding machine learning to simulate complex physical parameters

  • C. Jin,
  • Q. Yuan,
  • T. Li,
  • Y. Wang,
  • L. Zhang

DOI
https://doi.org/10.5194/gmd-16-4137-2023
Journal volume & issue
Vol. 16
pp. 4137 – 4154

Abstract

Satellite remote sensing of PM2.5 (fine particulate matter) mass concentration has become one of the most active topics in atmospheric research, and a variety of models have been developed. Among them, the semi-empirical physical approach derives the relationship between aerosol optical depth (AOD) and PM2.5 from the optical properties of particles, which gives it strong physical significance. It also retrieves PM2.5 independently of ground stations. However, because the underlying physical relationships are complex, the physical parameters in the semi-empirical approach are difficult to calculate accurately, which limits its accuracy. To optimize the approach, this study embeds machine learning into a semi-empirical physical model (RF-PMRS). Specifically, based on the theory of the physical PM2.5 remote sensing (PMRS) approach, the complex parameter VEf (the columnar volume-to-extinction ratio of fine particles) is simulated by a random forest (RF) model. In addition, a higher-quality fine-mode fraction product is applied to compensate for the insufficient coverage of satellite products. Experiments in North China (35°–45° N, 110°–120° E) show that the surface PM2.5 concentration derived by RF-PMRS has an annual average of 57.92 µg m⁻³, compared with the ground value of 60.23 µg m⁻³. Compared with the original method, the RMSE decreases by 39.95 µg m⁻³ and the relative deviation is reduced by 44.87 %. Moreover, validation at two Aerosol Robotic Network (AERONET) sites shows time series that track the true values more closely, with an R of about 0.80. This study is also a preliminary attempt to combine model-driven and data-driven models, laying a foundation for further atmospheric research on optimization methods.
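
The abstract evaluates the retrieval with RMSE, relative deviation, and the correlation coefficient R. As a minimal sketch of how such scores could be computed for paired retrieved and ground PM2.5 values, consider the Python below. The function names and the sample values are hypothetical and are not taken from the paper. The definition of relative deviation used here (mean of |retrieved − ground| / ground, in percent) is an assumption, since the abstract does not state the exact formula.

```python
import math


def rmse(retrieved, ground):
    """Root-mean-square error, in the same unit as the inputs."""
    n = len(ground)
    return math.sqrt(sum((r - g) ** 2 for r, g in zip(retrieved, ground)) / n)


def mean_relative_deviation(retrieved, ground):
    """Mean of |retrieved - ground| / ground, in percent.

    Assumed definition; the abstract does not spell out its formula.
    """
    n = len(ground)
    return 100.0 * sum(abs(r - g) / g for r, g in zip(retrieved, ground)) / n


def pearson_r(retrieved, ground):
    """Pearson correlation coefficient between the two series."""
    n = len(ground)
    mean_r = sum(retrieved) / n
    mean_g = sum(ground) / n
    cov = sum((r - mean_r) * (g - mean_g) for r, g in zip(retrieved, ground))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_g = sum((g - mean_g) ** 2 for g in ground)
    return cov / math.sqrt(var_r * var_g)


# Hypothetical toy values in µg m^-3, for illustration only.
ground_pm25 = [40.0, 60.0, 80.0, 100.0]
retrieved_pm25 = [44.0, 57.0, 84.0, 95.0]

print("RMSE:", rmse(retrieved_pm25, ground_pm25))
print("Relative deviation (%):", mean_relative_deviation(retrieved_pm25, ground_pm25))
print("R:", pearson_r(retrieved_pm25, ground_pm25))
```

Computing the same three scores for both the original PMRS output and the RF-PMRS output against one set of ground observations would give differences of the kind the abstract reports.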