Geoscientific Model Development (Mar 2021)
Using Shapley additive explanations to interpret extreme gradient boosting predictions of grassland degradation in Xilingol, China
Abstract
Machine learning (ML) and data-driven approaches are increasingly used in many research areas. Extreme gradient boosting (XGBoost) is a tree boosting method that has evolved into a state-of-the-art approach for many ML challenges. However, it has so far rarely been used in simulations of land-use change. Xilingol, a typical region for research on serious grassland degradation and its drivers, was selected as a case study to test whether XGBoost can provide alternative insights that conventional land-use models are unable to generate. A set of 20 drivers was analysed using XGBoost with four alternative sampling strategies, and SHAP (Shapley additive explanations) was used to interpret the results of this purely data-driven approach. With three of the sampling strategies (over-balanced, balanced, and imbalanced), XGBoost achieved similar and robust simulation results. SHAP values were useful for analysing the complex relationships between the different drivers of grassland degradation. Four drivers accounted for 99 % of the grassland degradation dynamics in Xilingol. These four drivers were spatially allocated, and a risk map of further degradation was produced. The limitations of using XGBoost to predict future land-use change are discussed.
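To illustrate what a Shapley additive explanation assigns to each driver, the following is a minimal sketch that computes exact Shapley values by brute force for a single sample. It is not the tree-specific algorithm that SHAP applies to XGBoost models, and the toy model, its coefficients, and the driver names (grazing, precipitation, distance) are hypothetical placeholders rather than drivers or results from the study. Features left out of a coalition are assumed to take the value of a reference (baseline) sample.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch, not necessarily the study's setup).
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take x's values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical degradation score with one interaction term.
    grazing, precipitation, distance = z
    return 0.5 * grazing - 0.3 * precipitation + 0.2 * grazing * distance


x = [2.0, 1.0, 3.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference sample
phi = shapley_values(toy_model, x, baseline)

# Additivity: the contributions sum to f(x) - f(baseline).
total = sum(phi)
```

In this toy case the interaction between grazing and distance is split equally between the two features, and the per-feature contributions add up to the difference between the prediction for the sample and the prediction for the baseline. That additivity is what allows driver contributions to be ranked and summed. The cost of the brute-force enumeration grows exponentially with the number of features, so it is only practical for very small examples.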