Diagnosing drivers of PM<sub>2.5</sub> simulation biases in China from meteorology, chemical composition, and emission sources using an efficient machine learning method

S. Wang; M. Zhang; Y. Gao; P. Wang; P. Wang; Q. Fu; H. Zhang; H. Zhang; H. Zhang

doi:10.5194/gmd-17-3617-2024

Geoscientific Model Development (May 2024)

Affiliations

S. Wang: Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
M. Zhang: Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
Y. Gao: Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
P. Wang: Department of Atmospheric and Oceanic Sciences and Institute of Atmospheric Sciences, Fudan University, Shanghai 200438, China
P. Wang: IRDR ICoE on Risk Interconnectivity and Governance on Weather/Climate Extremes Impact and Public Health, Fudan University, Shanghai, China
Q. Fu: Shanghai Environmental Monitoring Center, Shanghai 200235, China
H. Zhang: Department of Environmental Science and Engineering, Fudan University, Shanghai 200438, China
H. Zhang: IRDR ICoE on Risk Interconnectivity and Governance on Weather/Climate Extremes Impact and Public Health, Fudan University, Shanghai, China
H. Zhang: Institute of Eco-Chongming (IEC), Shanghai 200062, China

DOI: https://doi.org/10.5194/gmd-17-3617-2024
Journal volume & issue: Vol. 17, pp. 3617–3629

Abstract

Chemical transport models (CTMs) are widely used for air pollution modeling but suffer from significant biases due to uncertainties in simplified parameterizations, meteorological fields, and emission inventories. Accurate diagnosis of simulation biases is critical for improving models, interpreting results, and managing air quality, especially for simulations of fine particulate matter (PM<sub>2.5</sub>). In this study, a fast method with low computational resource requirements, based on the tree-based machine learning (ML) method light gradient boosting machine (LightGBM), was designed to diagnose CTM simulation biases. The drivers of biases in PM<sub>2.5</sub> concentrations simulated by the Community Multiscale Air Quality (CMAQ) model, relative to observations, were diagnosed from the perspectives of meteorology, chemical composition, and emission sources. The source-oriented CMAQ was used to diagnose the influences of different emission sources on PM<sub>2.5</sub> biases. The ML model captures the complex relationship between input variables and simulation bias well; meteorology, PM<sub>2.5</sub> components, and source sectors can partially explain the simulation bias. The CMAQ model underestimates PM<sub>2.5</sub> in 2019, with biases of −19.25 to −2.66 µg m<sup>−3</sup>, especially in winter and spring and during high-PM<sub>2.5</sub> events. Of all components, secondary organic components made the largest contribution to the PM<sub>2.5</sub> simulation bias across regions and seasons (13.8 %–22.6 %). Relative humidity, cloud cover, and soil surface moisture were the main meteorological factors contributing to PM<sub>2.5</sub> bias in the North China Plain, Pearl River Delta, and northwestern China, respectively. Primary and secondary inorganic components from residential sources made the two largest contributions to this bias (12.05 % and 12.78 %), implying large uncertainties in this sector. ML-based methods provide a valuable complement to traditional mechanism-based methods for model improvement, with high efficiency and low reliance on prior information.
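
The general idea of attributing a simulation bias to input variables with a tree-based model can be illustrated with a minimal sketch. This is not the authors' code and does not use LightGBM: it is a from-scratch gradient boosting of single-split trees (stumps) in pure Python, standing in for the much more capable LightGBM. The feature names, the synthetic data, and the helper functions `fit_stump` and `boost` are all hypothetical and chosen only for illustration. The bias is taken as simulated minus observed, and each feature's share of the total split gain is reported as a percentage contribution.

```python
import random

def fit_stump(X, residuals):
    """Find the one-feature threshold split that most reduces squared error."""
    n = len(residuals)
    total = sum(residuals)
    best = None  # (gain, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def boost(X, y, rounds=50, learning_rate=0.1):
    """Boost stumps on the residuals and accumulate split gain per feature."""
    pred = [sum(y) / len(y)] * len(y)
    gains = [0.0] * len(X[0])
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, left, right = fit_stump(X, residuals)
        gains[j] += gain
        pred = [p + learning_rate * (left if row[j] <= thr else right)
                for p, row in zip(pred, X)]
    return gains

# Synthetic, made-up data: the bias is driven mostly by relative humidity.
rng = random.Random(0)
names = ["relative_humidity", "wind_speed", "temperature"]
X = [[rng.random() for _ in names] for _ in range(200)]
observed = [40 + 10 * rng.random() for _ in X]
simulated = [obs - 10 * row[0] + row[1] + rng.gauss(0, 0.5)
             for obs, row in zip(observed, X)]

# Target of the diagnosis: simulation bias = simulated minus observed.
bias = [s - o for s, o in zip(simulated, observed)]

gains = boost(X, bias)
contributions = {name: 100 * g / sum(gains) for name, g in zip(names, gains)}
for name, share in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.1f} %")
```

In this toy setup the feature that was used to construct most of the bias receives the largest share of the gain. The percentages are a property of the synthetic data only and say nothing about the values reported in the abstract.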

Published in Geoscientific Model Development

ISSN: 1991-959X (Print); 1991-9603 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Science: Geology
Website: https://www.geoscientific-model-development.net/