Geoscientific Model Development (Jun 2024)

Multivariate adjustment of drizzle bias using machine learning in European climate projections

  • G. Lazoglou,
  • T. Economou,
  • C. Anagnostopoulou,
  • G. Zittis,
  • A. Tzyrkalli,
  • P. Georgiades,
  • J. Lelieveld

DOI
https://doi.org/10.5194/gmd-17-4689-2024
Journal volume & issue
Vol. 17
pp. 4689 – 4703

Abstract

Precipitation is a key climate parameter in many applications, including studies of climate change impacts. However, it is simulated and projected with low accuracy, primarily because of its high stochasticity. In particular, climate models often overestimate the frequency of light-rain days while underestimating the totals of extreme observed precipitation. The tendency to overestimate the occurrence of light precipitation events is known as “drizzle bias”. Consequently, even when overall precipitation totals are well represented, the number of rainy days is often significantly biased. The present study aims to minimize the drizzle bias in model output by developing and applying two statistical approaches. In the first approach (thresholding), the number of rainy days is adjusted under the assumption that the relationship between observed and simulated rainy days remains constant over time. In the second, a machine learning method (random forest, RF) is used to build a statistical model that describes the relationship between several modelled climate variables and the observed number of wet days. The results show that the multivariate approach performs comparably to the conventional thresholding approach when correcting sub-periods with similar climate characteristics. However, the value of RF becomes evident for periods with extreme events, characterized by a substantially different frequency of rainy days. These differences are particularly pronounced at higher temporal resolutions. Both methods are illustrated on data from three EURO-CORDEX climate models. The two approaches are trained over a calibration period and applied to a selected evaluation period.
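
The thresholding idea can be illustrated with a minimal sketch. It assumes the common wet-day frequency matching form: a cut-off on simulated precipitation is fitted over the calibration period so that the simulated number of wet days matches the observed one, and the same cut-off is then applied to the evaluation period. The 1 mm/day wet-day definition, the function names, and the toy values are illustrative assumptions and are not taken from the study.

```python
from typing import List

WET_DAY_MM = 1.0  # assumed wet-day definition for observations (mm/day)


def fit_threshold(obs_cal: List[float], sim_cal: List[float]) -> float:
    """Fit a cut-off on simulated precipitation over the calibration period.

    The cut-off is the n-th largest simulated value, where n is the observed
    number of wet days (rescaled if the two series differ in length).
    """
    n_obs_wet = sum(1 for p in obs_cal if p >= WET_DAY_MM)
    n_target = round(n_obs_wet * len(sim_cal) / len(obs_cal))
    if n_target == 0:
        return float("inf")  # no observed wet days: treat every day as dry
    ranked = sorted(sim_cal, reverse=True)
    return ranked[min(n_target, len(ranked)) - 1]


def apply_threshold(sim: List[float], threshold: float) -> List[float]:
    """Set simulated days below the cut-off to zero (dry)."""
    return [p if (p >= threshold and p > 0.0) else 0.0 for p in sim]


# Toy calibration data (hypothetical daily precipitation, mm/day).
obs_cal = [0.0, 0.0, 5.2, 0.0, 1.4, 0.0, 0.0, 12.0, 0.0, 0.0]
sim_cal = [0.2, 0.4, 4.0, 0.1, 2.0, 0.3, 0.0, 9.0, 0.6, 0.2]

# Toy evaluation data: the cut-off is assumed to stay valid here.
sim_eval = [0.3, 3.1, 0.0, 0.5, 7.5, 0.2, 2.0]

threshold = fit_threshold(obs_cal, sim_cal)
adjusted = apply_threshold(sim_eval, threshold)
n_wet_adjusted = sum(1 for p in adjusted if p > 0.0)
```

The fixed cut-off carries the stationarity assumption directly: if the evaluation period has a very different wet-day frequency, the fitted value has no way to adapt. The multivariate RF approach is intended for that case.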