Informing synthetic passive microwave predictions through Bayesian deep learning with uncertainty decomposition

Pedro Ortiz; Eleanor Casas; Marko Orescanin; Scott W. Powell; Veljko Petkovic

doi:10.1017/eds.2024.17

Environmental Data Science (Jan 2024)

Affiliations

Pedro Ortiz: Department of Computer Science, Naval Postgraduate School, Monterey, CA, USA
Eleanor Casas: Department of Earth Sciences, Millersville University, Millersville, PA, USA
Marko Orescanin: Department of Computer Science, Naval Postgraduate School, Monterey, CA, USA
Scott W. Powell: Department of Meteorology, Naval Postgraduate School, Monterey, CA, USA
Veljko Petkovic: Cooperative Institute for Satellite Earth System Studies/Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA

DOI: https://doi.org/10.1017/eds.2024.17
Journal volume & issue: Vol. 3

Abstract

Space-borne passive microwave (PMW) data provide rich information on atmospheric state, including cloud structure and underlying surface properties. However, PMW data are sparse and limited due to low Earth orbit collection, resulting in coarse Earth system sampling. This study demonstrates that Bayesian deep learning (BDL) is a promising technique for predicting synthetic microwave (MW) data and their uncertainties from more ubiquitously available geostationary infrared observations. Our BDL models decompose predicted uncertainty into aleatoric (irreducible) and epistemic (reducible) components, providing insights into uncertainty origin and guiding model improvement. Low and high aleatoric uncertainty values are characteristic of clear sky and cloudy regions, respectively, suggesting that expanding the input feature vector to allow richer information content could improve model performance. The initially high average epistemic uncertainty metrics quantified by most models indicate that the training process would benefit from a greater data volume, leading to improved performance at most studied MW frequencies. Using quantified epistemic uncertainty to select the most useful additional training data (a training dataset size increase of 3.6%), the study reduced the mean absolute error and root mean squared error by 1.74% and 1.38%, respectively. The broader impact of this study is the demonstration of how predicted epistemic uncertainty can be used to select targeted training data. This allows for the curation of smaller, more optimized training datasets and for future active learning studies.
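
The abstract does not spell out how the uncertainty is decomposed. As an illustration only, the sketch below uses the common law-of-total-variance split for a model that outputs a Gaussian mean and variance on each stochastic forward pass. The function names, the sample values, and the top-k selection rule are assumptions made for this example and are not taken from the study.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance for one sample (assumed formulation).

    means[i] and variances[i] are the Gaussian mean and variance
    predicted on the i-th stochastic forward pass of a Bayesian model.
    """
    aleatoric = fmean(variances)  # E[sigma^2]: average predicted data noise
    epistemic = pvariance(means)  # Var[mu]: disagreement between passes
    return aleatoric, epistemic, aleatoric + epistemic


def select_most_uncertain(candidates, k):
    """Return the ids of the k candidates with the largest epistemic variance.

    candidates is a list of (sample_id, epistemic_variance) pairs.
    This ranking rule is a hypothetical stand-in for targeted data selection.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [sample_id for sample_id, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up brightness-temperature predictions (K) from three passes.
    alea, epi, total = decompose_uncertainty([250.0, 252.0, 248.0],
                                             [4.0, 5.0, 3.0])
    print(f"aleatoric={alea:.3f} epistemic={epi:.3f} total={total:.3f}")
    print(select_most_uncertain([("a", 0.5), ("b", 2.0), ("c", 1.2)], k=2))
```

Under this split, adding training data of the kind the model disagrees on should shrink the epistemic term, while the aleatoric term can only fall if the inputs carry more information, which matches the two remedies the abstract distinguishes.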

Published in Environmental Data Science

ISSN: 2634-4602 (Online)
Publisher: Cambridge University Press
Country of publisher: United Kingdom
LCC subjects: Geography. Anthropology. Recreation: Environmental sciences; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.cambridge.org/core/journals/environmental-data-science
