The Cryosphere (Jun 2020)

Snow depth estimation and historical data reconstruction over China based on a random forest machine learning approach

  • J. Yang,
  • L. Jiang,
  • K. Luojus,
  • J. Pan,
  • J. Lemmetyinen,
  • M. Takala,
  • S. Wu

DOI
https://doi.org/10.5194/tc-14-1763-2020
Journal volume & issue
Vol. 14
pp. 1763 – 1778

Abstract


In this work, we investigated the potential of the random forest (RF) machine learning (ML) model to estimate snow depth. Four combinations of critical predictor variables were used to train the RF model. We then used three validation datasets, drawn from out-of-bag (OOB) samples, a temporal subset, and a spatiotemporal subset, to verify the fitted RF algorithms. The results indicated the following: (1) the accuracy of the RF model is greatly influenced by geographic location, elevation, and land cover fractions; (2) redundant predictor variables (if highly correlated), however, affect the RF model only slightly; and (3) the fitted RF algorithms perform better on temporal than on spatial scales, with unbiased root-mean-square errors (RMSEs) of ∼4.4 and ∼7.3 cm, respectively. Finally, we used the fitted RF2 algorithm to retrieve a consistent 32-year daily snow depth dataset from 1987 to 2018. This product was evaluated against independent station observations for the period 1987–2018. The mean unbiased RMSE and bias were 7.1 and −0.05 cm, respectively, indicating better performance than that of the former snow depth dataset (8.4 and −1.20 cm) from the Environmental and Ecological Science Data Center for West China (WESTDC). Although the RF product was superior to the WESTDC dataset, it still underestimated deep snow cover (>20 cm), with biases of −10.4, −8.9, and −34.1 cm for northeast China (NEC), northern Xinjiang (XJ), and the Qinghai–Tibetan Plateau (QTP), respectively. Additionally, the long-term snow depth datasets (station observations, RF estimates, and the WESTDC product) were analyzed in terms of temporal and spatial variations over China. On a temporal scale, the ground-truth snow depth presented a significant increasing trend from 1987 to 2018, especially in NEC. However, the RF and WESTDC products displayed no significant trends except on the QTP.
The WESTDC product presented a significant decreasing trend on the QTP, with a correlation coefficient of −0.55, whereas neither the ground-truth observations nor the RF product showed a significant trend there. Spatially, the RF and WESTDC products showed similar trend patterns over China: significant decreasing trends in most areas and a significant increasing trend in central NEC.
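
The abstract reports bias and unbiased RMSE without defining them. The sketch below shows one common way to compute both from paired estimates and station observations. It assumes that the unbiased RMSE is the RMSE after the mean bias is removed, i.e. sqrt(RMSE² − bias²), which may differ from the exact definition used in the paper. The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def bias(estimates, observations):
    """Mean difference (estimate minus observation), in the input units."""
    diffs = [e - o for e, o in zip(estimates, observations)]
    return sum(diffs) / len(diffs)


def rmse(estimates, observations):
    """Root-mean-square error between estimates and observations."""
    sq = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(sq) / len(sq))


def unbiased_rmse(estimates, observations):
    """RMSE with the mean bias removed (assumed definition: sqrt(RMSE^2 - bias^2))."""
    b = bias(estimates, observations)
    r = rmse(estimates, observations)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(r * r - b * b, 0.0))


# Hypothetical snow depths in cm, for illustration only.
observed = [5.0, 12.0, 20.0, 30.0]
estimated = [6.0, 10.0, 17.0, 22.0]

sample_bias = bias(estimated, observed)
sample_ubrmse = unbiased_rmse(estimated, observed)
print(f"bias = {sample_bias:.2f} cm, unbiased RMSE = {sample_ubrmse:.2f} cm")
```

A negative bias, as in this made-up sample, means the estimates sit below the observations on average, which is the sense in which the abstract describes deep snow as underestimated.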