Environment International (Sep 2021)

A robust approach to deriving long-term daily surface NO2 levels across China: Correction to substantial estimation bias in back-extrapolation

  • Yangyang Wu,
  • Baofeng Di,
  • Yuzhou Luo,
  • Michael L. Grieneisen,
  • Wen Zeng,
  • Shifu Zhang,
  • Xunfei Deng,
  • Yulei Tang,
  • Guangming Shi,
  • Fumo Yang,
  • Yu Zhan

Journal volume & issue
Vol. 154
p. 106576

Abstract

Background: Long-term surface NO2 data are essential for retrospective policy evaluation and chronic human exposure assessment. In the absence of NO2 observations for Mainland China before 2013, training a model with 2013–2018 data to make predictions for 2005–2012 (back-extrapolation) could cause substantial estimation bias due to concept drift.
Objective: This study aims to correct the estimation bias in order to reconstruct the spatiotemporal distribution of daily surface NO2 levels across China during 2005–2018.
Methods: On the basis of ground- and satellite-based data, we proposed robust back-extrapolation with a random forest (RBE-RF) to simulate surface NO2 through intermediate modeling of the scaling factors. For comparison, we also employed a random forest (Base-RF), as a representative of the commonly used approach, to model the surface NO2 levels directly.
Results: Validation against Taiwan’s NO2 observations during 2005–2012 showed that RBE-RF adequately corrected the substantial underestimation by Base-RF. The RMSE decreased from 10.1 to 8.2 µg/m3, 7.1 to 4.3 µg/m3, and 6.1 to 2.9 µg/m3 in predicting daily, monthly, and annual levels, respectively. For North China, which has the most severe pollution, the population-weighted NO2 ([NO2]pw) during 2005–2012 was estimated as 40.2 and 50.9 µg/m3 by Base-RF and RBE-RF, respectively, i.e., a 21.0% difference. While both models predicted that the national annual [NO2]pw increased during 2005–2011 and then decreased, the interannual trends were underestimated by >50.2% by Base-RF relative to RBE-RF. During 2005–2018, the nationwide population living in areas with NO2 > 40 µg/m3 was estimated as 259 and 460 million by Base-RF and RBE-RF, respectively.
Conclusion: With RBE-RF, we corrected the estimation bias in back-extrapolation and obtained a full-coverage dataset of daily surface NO2 across China during 2005–2018, which is valuable for environmental management and epidemiological research.
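
The summary metrics quoted in the abstract (RMSE and population-weighted NO2) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy grid values are hypothetical, and the 21.0% figure is assumed here to be the Base-RF shortfall expressed relative to the RBE-RF estimate, which is the reading that reproduces the stated number.

```python
import math

def population_weighted_mean(conc, pop):
    """Population-weighted mean concentration over grid cells.

    conc: concentrations per cell (e.g. µg/m3); pop: population per cell.
    Both are plain sequences of equal length (hypothetical inputs).
    """
    total_pop = sum(pop)
    return sum(c * p for c, p in zip(conc, pop)) / total_pop

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    squared_errors = [(p - o) ** 2 for p, o in zip(pred, obs)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def relative_difference_pct(lower, higher):
    """Difference between two estimates as a percentage of the higher one."""
    return (higher - lower) / higher * 100.0

# Toy grid (made-up values): three cells, the middle one most populated.
toy_conc = [30.0, 50.0, 20.0]
toy_pop = [1.0, 3.0, 1.0]
toy_pw = population_weighted_mean(toy_conc, toy_pop)

# Estimates quoted in the abstract for North China, 2005-2012.
north_china_diff = relative_difference_pct(40.2, 50.9)
print(round(north_china_diff, 1))
```

Weighting by population rather than by area shifts the average toward densely populated cells, which is why [NO2]pw is the more relevant quantity for exposure assessment.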

Keywords