Geoscientific Model Development (Mar 2021)

Using Shapley additive explanations to interpret extreme gradient boosting predictions of grassland degradation in Xilingol, China

  • Batunacun,
  • R. Wieland,
  • T. Lakes,
  • C. Nendel

DOI
https://doi.org/10.5194/gmd-14-1493-2021
Journal volume & issue
Vol. 14
pp. 1493 – 1510

Abstract

Machine learning (ML) and data-driven approaches are increasingly used in many research areas. Extreme gradient boosting (XGBoost) is a tree boosting method that has evolved into a state-of-the-art approach for many ML challenges. However, it has rarely been used in simulations of land-use change so far. Xilingol, a typical region for research on serious grassland degradation and its drivers, was selected as a case study to test whether XGBoost can provide alternative insights that conventional land-use models are unable to generate. A set of 20 drivers was analysed using XGBoost, involving four alternative sampling strategies, and SHAP (Shapley additive explanations) to interpret the results of the purely data-driven approach. The results indicated that, with three of the sampling strategies (over-balanced, balanced, and imbalanced), XGBoost achieved similar and robust simulation results. SHAP values were useful for analysing the complex relationship between the different drivers of grassland degradation. Four drivers accounted for 99 % of the grassland degradation dynamics in Xilingol. These four drivers were spatially allocated, and a risk map of further degradation was produced. The limitations of using XGBoost to predict future land-use change are discussed.
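
The idea behind Shapley additive explanations can be illustrated with a minimal sketch. It is not the authors' workflow and does not use XGBoost or the SHAP library: it computes exact Shapley values by brute-force enumeration for a hypothetical three-driver toy function. The driver names (grazing, precipitation, distance), the coefficients, and the all-zero baseline are assumptions made purely for illustration. The sketch shows the additivity property: the per-driver contributions sum to the difference between the prediction and the baseline prediction.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Drivers outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of drivers, so this
    is only practical for very small toy problems.
    """
    n = len(x)

    def value(coalition):
        # Drivers in the coalition take their value from x, the rest from baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude driver i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical degradation score with one interaction term (illustrative only)."""
    grazing, precipitation, distance = z
    return 0.5 * grazing - 0.3 * precipitation + 0.2 * grazing * distance


x = [1.0, 0.5, 2.0]          # hypothetical driver values for one location
baseline = [0.0, 0.0, 0.0]   # assumed reference point
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["grazing", "precipitation", "distance"], phi):
    print(f"{name}: {contribution:+.3f}")
print("sum of contributions:", round(sum(phi), 3))
```

In this toy case the interaction term is split equally between the two drivers involved in it, which is how Shapley values attribute joint effects. For tree ensembles with many drivers, as in the study, the SHAP approach computes these attributions far more efficiently than this enumeration.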