Representing chemical history in ozone time-series predictions – a model experiment study building on the MLAir (v1.5) deep learning framework

F. Kleinert; L. H. Leufen; A. Lupascu; T. Butler; M. G. Schultz

doi:10.5194/gmd-15-8913-2022

Geoscientific Model Development (Dec 2022)

Affiliations

F. Kleinert: Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre (JSC), Jülich, Germany
F. Kleinert: Institute of Geosciences, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
L. H. Leufen: Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre (JSC), Jülich, Germany
L. H. Leufen: Institute of Geosciences, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
A. Lupascu: Institute for Advanced Sustainability Studies, Potsdam, Germany
T. Butler: Institute for Advanced Sustainability Studies, Potsdam, Germany
M. G. Schultz: Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre (JSC), Jülich, Germany

DOI: https://doi.org/10.5194/gmd-15-8913-2022
Journal volume & issue: Vol. 15, pp. 8913–8930

Abstract

Tropospheric ozone is a secondary air pollutant that is harmful to living beings and crops. Predicting ozone concentrations at specific locations is thus important for initiating protection measures such as emission reductions or warnings to the population. Ozone levels at a given location result from emission and sink processes, mixing, and chemical transformation along an air parcel's trajectory. Current ozone forecasting systems generally rely on computationally expensive chemistry transport models (CTMs). Recently, however, several studies have demonstrated the potential of deep learning for this task. While a few of these studies trained their models on gridded model data, most efforts focus on forecasting time series from individual measurement locations. In this study, we present a hybrid approach which is based on time-series forecasting (up to 4 d) but uses spatially aggregated meteorological and chemical data from upstream wind sectors to represent some aspects of the chemical history of air parcels arriving at the measurement location. To demonstrate the value of this additional information, we extracted pseudo-observation data for Germany from a CTM to avoid extra complications with irregularly spaced and missing data. However, our method can be extended so that it can be applied to observational time series. Using one upstream sector alone improves the forecasts by 10 % during all 4 d, while using three sectors improves the mean squared error (MSE) skill score by 14 % during the first 2 d of the prediction, although this improvement depends on the upstream wind direction. Our method performs best in the northern half of Germany for the first 2 prediction days. Based on the data's seasonality and simulation period, we shed some light on our models' open challenges with (i) spatial structures in terms of decreasing skill scores from the northern German plain to the mountainous south and (ii) concept drifts related to an unusually cold winter season. Here we expect that the inclusion of explainable artificial intelligence methods could reveal additional insights in future versions of our model.
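
The abstract reports improvements in terms of an MSE skill score without giving its formula. As a minimal sketch, assuming the commonly used definition (one minus the ratio of the forecast MSE to the MSE of a reference forecast), the score could be computed as follows. The function names and all numbers are hypothetical illustrations; they are not taken from the study or from MLAir.

```python
def mse(forecast, observed):
    """Mean squared error between two equally long sequences."""
    if len(forecast) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)


def mse_skill_score(forecast, reference, observed):
    """Assumed definition: 1 - MSE(forecast) / MSE(reference).

    Positive values mean the forecast beats the reference,
    0 means no improvement, negative values mean it is worse.
    """
    mse_ref = mse(reference, observed)
    if mse_ref == 0.0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - mse(forecast, observed) / mse_ref


# Made-up ozone values, for illustration only.
observed = [40.0, 50.0, 60.0, 55.0]
reference = [44.0, 46.0, 64.0, 51.0]  # e.g. a model without upstream sector data
forecast = [42.0, 48.0, 62.0, 53.0]   # e.g. a model with upstream sector data

skill = mse_skill_score(forecast, reference, observed)
print(f"MSE skill score: {skill:.2f}")
```

Under this definition, a score of 0 means the forecast is no better than the reference, and a score of 1 would correspond to a perfect forecast.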

Published in Geoscientific Model Development

ISSN: 1991-959X (Print); 1991-9603 (Online)
Publisher: Copernicus Publications
Country of publisher: Germany
LCC subjects: Science: Geology
Website: https://www.geoscientific-model-development.net/

About the journal