AtmoDist: Self-supervised representation learning for atmospheric dynamics

Sebastian Hoffmann; Christian Lessig

doi:10.1017/eds.2023.1

Environmental Data Science (Jan 2023)

Affiliations

Sebastian Hoffmann: Institut für Simulation und Graphik, Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Christian Lessig: Institut für Simulation und Graphik, Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany

DOI: https://doi.org/10.1017/eds.2023.1
Journal volume & issue: Vol. 2

Abstract

Representation learning has proven to be a powerful methodology in a wide variety of machine-learning applications. For atmospheric dynamics, however, it has so far not been considered, arguably due to the lack of large-scale, labeled datasets that could be used for training. In this work, we show how to sidestep this difficulty and introduce a self-supervised learning task that is applicable to a wide variety of unlabeled atmospheric datasets. Specifically, we train a neural network on the simple yet intricate task of predicting the temporal distance between atmospheric fields from distinct but nearby times. We demonstrate that training with this task on the ERA5 reanalysis dataset leads to internal representations that capture intrinsic aspects of atmospheric dynamics. For example, when employed as a loss function in other machine-learning applications, the derived AtmoDist distance leads to improved results compared to the $\ell_2$-loss. For downscaling, one obtains higher-resolution fields that match the true statistics more closely than previous approaches, and for the interpolation of missing or occluded data, the AtmoDist distance leads to results that contain more realistic fine-scale features. Since it is obtained from observational data, AtmoDist also provides a novel perspective on atmospheric predictability.
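
The self-supervised task described in the abstract needs no manual labels, because the training target is simply the time gap between two fields. The following is a minimal sketch of how such training pairs could be sampled from a time-ordered sequence of fields. It is an illustration, not the authors' implementation: the function names `sample_pair` and `make_batch`, the `max_gap` parameter, and the choice to encode the gap as a 0-based class index are all assumptions made here.

```python
import random


def sample_pair(num_steps, max_gap, rng):
    """Pick two time indices at most max_gap steps apart.

    Returns (t1, t2, label), where label = (t2 - t1) - 1 is a 0-based
    class index for the temporal distance (an assumption of this sketch).
    """
    if num_steps <= max_gap:
        raise ValueError("need more time steps than max_gap")
    gap = rng.randint(1, max_gap)               # temporal distance in steps
    t1 = rng.randint(0, num_steps - 1 - gap)    # keeps t1 + gap in range
    return t1, t1 + gap, gap - 1


def make_batch(fields, max_gap, batch_size, seed=0):
    """Build (field_a, field_b, label) triples from a time-ordered sequence."""
    rng = random.Random(seed)
    batch = []
    for _ in range(batch_size):
        t1, t2, label = sample_pair(len(fields), max_gap, rng)
        batch.append((fields[t1], fields[t2], label))
    return batch


if __name__ == "__main__":
    # Stand-in "fields": integers that equal their own time index.
    fields = list(range(100))
    for a, b, label in make_batch(fields, max_gap=8, batch_size=4):
        print(a, b, label)
```

In a real setting each entry of `fields` would be a gridded atmospheric snapshot (for example, one ERA5 time step), and a network would be trained to predict `label` from the pair.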

Published in Environmental Data Science

ISSN: 2634-4602 (Online)
Publisher: Cambridge University Press
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.cambridge.org/core/journals/environmental-data-science
