Weather and Climate Extremes (Sep 2022)

Seasonal prediction of summer extreme precipitation over the Yangtze River based on random forest

  • Wenguang Wei,
  • Zhongwei Yan,
  • Xuan Tong,
  • Zuoqiang Han,
  • Miaomiao Ma,
  • Shuang Yu,
  • Jiangjiang Xia

Journal volume & issue
Vol. 37
p. 100477

Abstract

The 2020 summer extreme precipitation event over the Mid-Lower Reaches of the Yangtze River in China caused widespread socioeconomic impacts, with a death toll in the hundreds and direct economic losses of half a billion CNY. Seasonal prediction of summer extreme precipitation events over this region, however, has long been a challenge because of the underlying interactions among various atmospheric and oceanic factors. A series of predictive models based on random forest (RF), a classical machine learning method, are trained and tested on samples from 1951–2019, with 14 preceding atmospheric and oceanic indices as potential predictors. The model based on 3 indices performs best at distinguishing extreme from non-extreme events. For the 2020 summer extreme event, this model predicts a probability far above the climatological mean, indicating that an extreme event was very likely. Interpretation of the decision trees in the RF model reveals 3 main decision paths leading to an extreme precipitation event over this region. The first is driven by a strong eastern tropical Pacific (EP) El Niño that starts to decay in spring but does not fully disappear in summer. The second results from the combined effect of an EP La Niña and normal sea surface temperature over the North Indian Ocean (NIO). The third is also associated with a decaying EP El Niño, but one much weaker than that in the first path; here, both the Pacific-El Niño-independent warming of the NIO and cooling in the eastern tropical western Pacific in spring play important roles. The RF integrates different nonlinear physical mechanisms in a single model and detects weak signals that traditional linear methods easily miss. The trained model can improve itself as the number of samples grows and can serve as a reference for operational prediction of extreme precipitation events in the region.
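The idea of comparing an RF-predicted probability with the climatological mean can be illustrated with a minimal sketch. It is not the authors' model. It uses synthetic data, three hypothetical standardized indices, an invented labeling rule, and depth-1 trees (stumps) grown on bootstrap samples with random feature subsets as a much-simplified stand-in for a full random forest. All function names and parameter values are assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(X, y, features):
    """Fit a depth-1 tree: the threshold split, among the candidate features,
    with the lowest weighted Gini impurity.
    Returns (feature, threshold, P(extreme | left), P(extreme | right))."""
    n = len(y)
    best_score, best = None, None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best_score is None or score < best_score:
                best_score = score
                best = (f, t, sum(left) / len(left), sum(right) / len(right))
    if best is None:  # no valid split: predict the sample frequency everywhere
        p = sum(y) / n
        best = (features[0], float("inf"), p, p)
    return best


def fit_forest(X, y, n_trees=200, n_sub=2, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample of the rows and a
    random subset of n_sub candidate features."""
    rng = random.Random(seed)
    n, d = len(y), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(d), n_sub)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_proba(forest, x):
    """Forest probability of an extreme event: the mean of the leaf frequencies."""
    return sum(pl if x[f] <= t else pr for f, t, pl, pr in forest) / len(forest)


# Synthetic stand-in for 69 yearly samples (the length of a 1951-2019 record).
# ASSUMPTION: three hypothetical standardized indices; a year is labeled
# "extreme" when index 0 exceeds 0.8. This rule is invented for illustration.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(69)]
y = [1 if row[0] > 0.8 else 0 for row in X]

base_rate = sum(y) / len(y)  # climatological frequency of extreme years
forest = fit_forest(X, y)
p_strong = predict_proba(forest, [2.0, 0.0, 0.0])   # strong positive index 0
p_weak = predict_proba(forest, [-2.0, 0.0, 0.0])    # strong negative index 0

print(f"climatological rate: {base_rate:.2f}")
print(f"P(extreme | strong signal): {p_strong:.2f}")
print(f"P(extreme | opposite signal): {p_weak:.2f}")
```

In this toy setting, a year with a strong value of the informative index receives a probability well above the climatological rate, which mirrors how a forecast probability "far above the climatological mean" is read. An operational implementation would more likely rely on an established library and real predictor indices.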

Keywords