Geoscientific Model Development (Mar 2022)
Towards physics-inspired data-driven weather forecasting: integrating data assimilation with a deep spatial-transformer-based U-NET in a case study with ERA5
Abstract
There is growing interest in data-driven weather prediction (DDWP), e.g., using convolutional neural networks such as U-NET that are trained on data from models or reanalysis. Here, we propose three components, inspired by physics, to integrate with commonly used DDWP models in order to improve their forecast accuracy. These components are (1) a deep spatial transformer added to the latent space of U-NET to capture rotation and scaling transformation in the latent space for spatiotemporal data, (2) a data-assimilation (DA) algorithm to ingest noisy observations and improve the initial conditions for next forecasts, and (3) a multi-time-step algorithm, which combines forecasts from DDWP models with different time steps through DA, improving the accuracy of forecasts at short intervals. To show the benefit and feasibility of each component, we use geopotential height at 500 hPa (Z500) from ERA5 reanalysis and examine the short-term forecast accuracy of specific setups of the DDWP framework. Results show that the spatial-transformer-based U-NET (U-STN) clearly outperforms the U-NET, e.g., improving the forecast skill by 45 %. Using a sigma-point ensemble Kalman (SPEnKF) algorithm for DA and U-STN as the forward model, we show that stable, accurate DA cycles are achieved even with high observation noise. This DDWP+DA framework substantially benefits from large (O(1000)) ensembles that are inexpensively generated with the data-driven forward model in each DA cycle. The multi-time-step DDWP+DA framework also shows promise; for example, it reduces the average error by factors of 2–3. These results show the benefits and feasibility of these three components, which are flexible and can be used in a variety of DDWP setups. Furthermore, while here we focus on weather forecasting, the three components can be readily adopted for other parts of the Earth system, such as ocean and land, for which there is a rapid growth of data and need for forecast and assimilation.