Applied Computing and Geosciences (Sep 2024)
Interpretation techniques to explain the output of a spatial land subsidence hazard model in an area with a diverted tributary
Abstract
Because the machine learning (ML) models used for spatial modelling of environmental and natural hazards are typically black boxes, their predictive outputs need to be interpreted. For this purpose, we applied four interpretation techniques, namely the interaction plot, the permutation feature importance (PFI) measure, the Shapley additive explanations (SHAP) decision plot, and the accumulated local effects (ALE) plot, to explain the output of an ML model used to map land subsidence (LS) in the Nazdasht plain, Hormozgan province, southern Iran. A stepwise regression (SR) algorithm was used to select important features, and five ML models (Cforest, a conditional random forest; generalized linear model (GLM); multivariate linear regression (MLR); partial least squares (PLS); and extreme gradient boosting (XGBoost)) were used to map the LS hazard. The four interpretation techniques were then applied to explain the output of the spatial hazard model. Our findings revealed that the GLM was the most accurate model for mapping LS in our study area, and that 24.3% of the study area had a very high susceptibility to the LS hazard. According to the interpretation techniques, land use, elevation, groundwater level and vegetation were the most important variables controlling the LS hazard and contributed most to the model’s output. Overall, human activities, especially the diversion of the route of one of the main tributaries feeding the plain and the recharging of groundwater five decades ago, intensified the current LS occurrence. Therefore, management activities such as water-spreading projects upstream of the plain could help mitigate LS occurrence.
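As an illustration of one of the interpretation techniques named above, the following is a minimal sketch of the PFI measure: each feature is shuffled in turn, and the resulting increase in prediction error is taken as its importance. This is not the study's implementation. The data are synthetic, the three features are unnamed stand-ins rather than the LS conditioning factors, and the helper name `permutation_importance` and the ordinary least-squares model are assumptions made only for this example.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each column of X is shuffled (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    def rmse(X_eval):
        return float(np.sqrt(np.mean((predict(X_eval) - y) ** 2)))

    baseline = rmse(X)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link with y but keeps its distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(X_perm) - baseline)
        importances[j] = np.mean(increases)
    return importances


# Synthetic stand-in data (assumption): feature 0 matters most, feature 2 not at all.
rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Fit a plain linear model by least squares as a stand-in for the hazard model.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(X_eval):
    return coef[0] + X_eval @ coef[1:]


pfi = permutation_importance(predict, X, y)
ranking = np.argsort(pfi)[::-1]  # feature indices, most important first
print("PFI per feature:", np.round(pfi, 3))
print("Ranking:", ranking.tolist())
```

Because PFI only needs a prediction function, the same helper could in principle be pointed at any of the fitted models; features whose shuffling barely changes the error are those the model does not rely on.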