Environmental Research Letters (Jan 2024)

First estimation of hourly full-coverage ground-level ozone from Fengyun-4A satellite using machine learning

  • Ling Gao,
  • Han Zhang,
  • Fukun Yang,
  • Wangshu Tan,
  • Ronghua Wu,
  • Yi Song

DOI
https://doi.org/10.1088/1748-9326/ad2022
Journal volume & issue
Vol. 19, no. 2
p. 024040

Abstract

Ground-level ozone (O₃), renowned for its adverse impacts on human health and crop production, has garnered significant attention from governmental and public sectors. To address the limitations posed by sparse and uneven ground-level O₃ observations, this study proposes an innovative method for hourly full-coverage ground-level O₃ estimation using machine learning. Meteorological data from National Centers for Environmental Prediction global forecasting system, satellite data from Fengyun-4A (FY-4A) and Ozone Monitoring Instrument, emission inventory from Multi-resolution Emission Inventory for China, and other auxiliary data are utilized as input variables, while ground-based O₃ observations serve as the response variable. The method is applied on a monthly basis across China for the year 2022, resulting in the generation of an hourly full-coverage high-resolution (4 km) ground-level O₃ estimation, termed ML-derived-O₃. Cross-validation results demonstrate the robustness of ML-derived-O₃ yielding a coefficient of determination (R²) of 0.96 (0.91) for sample-based (site-based) evaluations and a root-mean-square error (RMSE) of 9.22 (13.65) µg m⁻³. However, the date-based evaluation is less satisfactory due to the imbalanced training data, resulting from the pronounced daily variations in ground-level O₃ concentrations. Nevertheless, the seasonal and hourly ML-derived-O₃ exhibits high prediction accuracy, with R² values surpassing 0.95 and RMSE remaining below 7.5 µg m⁻³. This study marks a significant milestone as the first successful attempt to obtain hourly full-coverage ground-level O₃ data across China. The diurnal variation of ML-derived-O₃ demonstrates high consistency with ground-based observations, irrespective of clear or cloudy days, effectively capturing ground-level O₃ pollution exposure events. This novel estimation method will be employed to establish a long-term high spatial-temporal resolution ground-level O₃ dataset, which holds valuable applications for air pollution monitoring and environmental health research in future endeavors.
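
The abstract reports sample-based, site-based and date-based cross-validation, scored with R² and RMSE. The sketch below is one plausible reading of how those three schemes differ, not the authors' implementation: it assumes that folds are formed by holding out individual samples, whole monitoring sites, or whole dates. The helper names (grouped_folds, r2_score, rmse) and the toy records are hypothetical and chosen only for illustration.

```python
import math
import random


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def grouped_folds(keys, n_folds, seed=0):
    """Yield (train_idx, test_idx) so rows sharing a key land in the same fold."""
    unique = sorted(set(keys))
    random.Random(seed).shuffle(unique)
    fold_of = {key: i % n_folds for i, key in enumerate(unique)}
    for fold in range(n_folds):
        test = [i for i, key in enumerate(keys) if fold_of[key] == fold]
        train = [i for i, key in enumerate(keys) if fold_of[key] != fold]
        yield train, test


# Hypothetical records (site id, date, O3 in µg m^-3); values are made up.
records = [
    ("site_a", "2022-07-01", 80.0), ("site_a", "2022-07-02", 120.0),
    ("site_b", "2022-07-01", 60.0), ("site_b", "2022-07-02", 95.0),
    ("site_c", "2022-07-01", 70.0), ("site_c", "2022-07-02", 110.0),
]

# The grouping key is the only thing that changes between the three schemes.
sample_keys = list(range(len(records)))   # sample-based: each row is its own group
site_keys = [r[0] for r in records]       # site-based: hold out entire sites
date_keys = [r[1] for r in records]       # date-based: hold out entire dates

for name, keys in [("sample", sample_keys), ("site", site_keys), ("date", date_keys)]:
    for train_idx, test_idx in grouped_folds(keys, n_folds=2):
        held_out = sorted({str(keys[i]) for i in test_idx})
        print(name, "held out:", held_out)
```

Under this reading, the site-based split tests spatial generalization to unmonitored locations, while the date-based split tests generalization to unseen days, which is where the abstract reports weaker performance.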

Keywords