Dizhi lixue xuebao (Apr 2023)
Evaluation of landslide susceptibility in the gentle hill-valley areas based on the interpretable random forest-recursive feature elimination model
Abstract
This study evaluates landslide susceptibility in gentle hill-valley areas and explains the internal mechanisms of landslide occurrence there, using SHAP local interpretation and partial dependence plots (PDP) on top of a random forest-recursive feature elimination model, to provide a reference for geological disaster prevention and control. An optimized random forest algorithm was used to analyze landslide susceptibility in the hill-valley study area and to establish a landslide susceptibility evaluation model, and the recursive feature elimination algorithm was used to eliminate noise factors. Sixteen factors of four types (terrain, geology, environmental conditions, and human activities) were selected to build a landslide hazard factor database for the Hechuan district. We then combined 754 historical landslide sites in the Hechuan district with the factor database to derive a landslide susceptibility zoning map for the study area, and ranked factor importance using the random forest algorithm. Finally, partial dependence plots were applied to explain the factors that most strongly influence landslide occurrence in the Hechuan district, and the SHAP algorithm was used for local explanation of individual landslides. The results show the following. Compared with the original model, the test-set AUC of the random forest-recursive feature elimination model increased by 0.019, demonstrating the effectiveness of the recursive feature elimination algorithm. The AUC values of the random forest model on the training set and the test set are 0.769 and 0.755, respectively, indicating high prediction accuracy. Landslide density is high in areas with large undulations, and historical landslides are concentrated in high-susceptibility areas. The spatial distribution of landslides is uneven and complex, and the influence of each hazard factor on landslide occurrence shows prominent regional characteristics and spatial heterogeneity.
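The AUC comparison reported above can be illustrated with a minimal sketch. AUC is the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen non-landslide sample, with ties counted as one half. The labels and scores below are invented for illustration and are not the study's data, and the function name `roc_auc` is an assumption of this sketch.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = landslide, 0 = non-landslide).
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical susceptibility scores from a model using all factors ...
scores_all_factors = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
# ... and from a model refit after removing noise factors.
scores_after_rfe = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc_before = roc_auc(labels, scores_all_factors)
auc_after = roc_auc(labels, scores_after_rfe)
auc_gain = auc_after - auc_before
print(f"AUC before: {auc_before:.3f}, after: {auc_after:.3f}, gain: {auc_gain:.3f}")
```

Comparing the two values on the same held-out test set, as here, is how a gain such as the reported 0.019 would be measured. The pairwise loop is quadratic in the sample count, so a rank-based formulation is preferable for large inventories.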
In hill-valley areas, average annual rainfall, elevation, and lithology are the most critical factors affecting landslide occurrence. The SHAP local explanation plot was used to explain the landslide on the uphill road of Baitaping: lithology and elevation restrained the landslide, whereas undulation, slope, NDVI, and POI kernel density promoted it. In summary, the random forest-recursive feature elimination model achieves high accuracy in landslide susceptibility evaluation in hill-valley areas. Interpreting the internal mechanisms of regional and individual landslides with the PDP and SHAP algorithms helps to construct and improve the evaluation factor system for landslide susceptibility under different geomorphic environments. By exploring the model's internal decision-making mechanism for landslides, the study provides a reference for regional landslide susceptibility assessment and geological disaster prevention.
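The idea behind a SHAP local explanation, where some factors promote a landslide and others restrain it, can be sketched with exact Shapley values on a toy model. This is not the study's model or data: the three factors, the linear scoring function `toy_model`, its weights, and the background samples are all hypothetical, and the brute-force enumeration is only practical for a handful of features (SHAP libraries use faster approximations).

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values for one sample x against background samples.

    The value of a feature subset is the mean prediction when those features
    are fixed to x and the remaining ones are taken from each background row.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for row in background:
            mixed = [x[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)
    return phi, value(set())


def toy_model(f):
    """Hypothetical susceptibility score from [slope, NDVI, elevation]."""
    slope, ndvi, elevation = f
    return 0.5 + 0.02 * slope - 0.4 * ndvi - 0.001 * elevation


names = ["slope", "NDVI", "elevation"]
background = [[10, 0.6, 300], [20, 0.4, 250], [30, 0.5, 350]]  # invented
site = [35, 0.3, 400]  # invented landslide site

phi, base = shapley_values(toy_model, site, background)
for name, p in zip(names, phi):
    effect = "promotes" if p > 0 else "restrains"
    print(f"{name}: {p:+.3f} ({effect})")
print(f"base {base:.3f} + contributions = {base + sum(phi):.3f}")
```

A positive contribution pushes the prediction for this site above the background average and a negative one pulls it below. The contributions plus the base value add up to the model's prediction for the site, which is the property that makes a per-landslide reading of this kind possible.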
Keywords