Environment International (Jan 2021)
The ChinaHighPM10 dataset: generation, validation, and spatiotemporal variations from 2015 to 2019 across China
Abstract
Respirable particles with aerodynamic diameters ≤ 10 µm (PM10) have important impacts on the atmospheric environment and human health. Available PM10 datasets have coarse spatial resolutions, which limits their application, especially at the city level. A tree-based ensemble learning model that accounts for spatiotemporal information (space-time extremely randomized trees, denoted as the STET model) is designed to estimate near-surface PM10 concentrations. The 1-km resolution Multi-Angle Implementation of Atmospheric Correction (MAIAC) aerosol product and auxiliary factors, including meteorology, land-use cover, surface elevation, population distribution, and pollutant emissions, are used in the STET model to generate a high-resolution (1 km), high-quality PM10 dataset for China (ChinaHighPM10) from 2015 to 2019. The product has an out-of-sample (out-of-station) cross-validation coefficient of determination (CV-R²) of 0.86 (0.82) and a root-mean-square error (RMSE) of 24.28 (27.07) µg/m³, outperforming most widely used models from previous related studies. High PM10 concentrations occurred in northwest China (e.g., the Tarim Basin) and the North China Plain. Overall, PM10 concentrations in China declined significantly from 2015 to 2019, at a rate of 5.81 µg/m³ per year (p < 0.001), especially in three key urban agglomerations. With its higher spatial resolution and overall accuracy, the ChinaHighPM10 dataset is potentially useful for future small- and medium-scale air pollution studies.
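The abstract reports two validation scores, out-of-sample and out-of-station. The minimal Python sketch below illustrates how the two kinds of split differ and how R² and RMSE are computed from held-out predictions. It is an illustration under stated assumptions, not the authors' code: the function names, the fold count `k`, the random seed, and the toy station labels are all hypothetical, and the STET model itself is not reproduced here.

```python
import math
import random


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root-mean-square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def sample_folds(n_records, k, seed=0):
    """Out-of-sample split: individual records are shuffled into k folds,
    so a station may contribute records to both training and test sets."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def station_folds(station_ids, k, seed=0):
    """Out-of-station split: all records of a station share one fold,
    so test stations are never seen during training."""
    stations = sorted(set(station_ids))
    random.Random(seed).shuffle(stations)
    fold_of = {s: i % k for i, s in enumerate(stations)}
    folds = [[] for _ in range(k)]
    for i, s in enumerate(station_ids):
        folds[fold_of[s]].append(i)
    return folds


# Hypothetical toy input: eight records from four monitoring stations.
station_ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
by_sample = sample_folds(len(station_ids), k=2)
by_station = station_folds(station_ids, k=2)
```

Because the out-of-station split withholds entire stations, it tests how well a model predicts at locations without monitors, which is consistent with the lower CV-R² (0.82 versus 0.86) and higher RMSE reported for that setting.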