Natural Hazards Research (Mar 2024)

Comparative study on landslide susceptibility mapping based on different ratios of training samples and testing samples by using RF and FR-RF models

  • Ke Xu,
  • Zhou Zhao,
  • Wei Chen,
  • Jianquan Ma,
  • Fei Liu,
  • Yihao Zhang,
  • Zijun Ren

Journal volume & issue
Vol. 4, no. 1
pp. 62 – 74

Abstract

Evaluating landslide susceptibility is essential to land-use and spatial planning. To this end, this paper presents a case study from Fugu County, Shaanxi Province, China. First, the geological environment and the current state of landslides in Fugu County were investigated. Then, slope, aspect, terrain relief, curvature, lithology, land type, and normalized difference vegetation index (NDVI) were selected as the landslide conditioning factors, and the correlation among them was examined using multicollinearity analysis. Next, the landslide and non-landslide samples were divided into training and testing samples at ratios of 8/2, 7/3, 6/4, and 5/5. Landslide susceptibility maps were then produced with the Random Forest (RF) model and with the Frequency Ratio coupled with Random Forest (FR-RF) model. Finally, the landslide density (LD), the landslide frequency ratio (LFR), the area under the receiver operating characteristic (ROC) curve (AUC), and other indicators were used to validate the rationality, accuracy, and performance of the susceptibility maps produced by the different models and ratios. The results indicate that all maps are reasonable except the one produced with the 5/5 ratio.
For each map, regardless of ratio, LD and LFR are greatest in the zone classed as very high susceptibility, followed by the high, moderate, low, and very low classes. In the RF model, the LD and LFR values in the very high susceptibility zone are reported across the training/testing ratios as, respectively, 7/3 (201.026) > 8/2 (154.440) > 6/4 (93.696) > 5/5 (136.364) and 7/3 (4.806) > 8/2 (3.692) > 6/4 (3.260) > 5/5 (2.240); in the FR-RF model, the LD and LFR values across all training/testing ratios are, respectively, 7/3 (145.693) > 6/4 (127.151) > 5/5 (122.857) > 8/2 (113.263) and 7/3 (3.334) > 6/4 (3.073) > 5/5 (2.811) > 8/2 (2.592). In addition, the ROC curves show that both models are more accurate at the 7/3 ratio than at the other ratios. Moreover, the results of the ensemble model (a combination of two models with different learning abilities) are not more reasonable than those of the single model, which indicates that combining a weaker learner (here, the Frequency Ratio model) with a stronger learner (here, the Random Forest model) can diminish the performance of the stronger model.
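
The abstract does not spell out how LD and LFR are computed. The sketch below illustrates one common formulation, given here as an assumption and not necessarily the authors' exact definition: LD is the number of landslides in a susceptibility class divided by the area of that class, and LFR is the share of landslides in a class divided by the share of area in that class. The function name, the landslide counts, and the class areas are all hypothetical and are not taken from the paper.

```python
def density_and_frequency_ratio(landslide_counts, class_areas):
    """Return (LD, LFR) lists with one entry per susceptibility class.

    Assumed definitions (not confirmed by the source):
      LD  = landslides in class / area of class
      LFR = (landslides in class / all landslides) / (class area / total area)
    """
    total_landslides = sum(landslide_counts)
    total_area = sum(class_areas)
    ld = [n / a for n, a in zip(landslide_counts, class_areas)]
    lfr = [
        (n / total_landslides) / (a / total_area)
        for n, a in zip(landslide_counts, class_areas)
    ]
    return ld, lfr


# Hypothetical example data, ordered from very low to very high susceptibility.
classes = ["very low", "low", "moderate", "high", "very high"]
counts = [2, 5, 10, 23, 60]                # landslides per class (made up)
areas = [400.0, 300.0, 200.0, 70.0, 30.0]  # class areas in km^2 (made up)

ld, lfr = density_and_frequency_ratio(counts, areas)
for name, d, f in zip(classes, ld, lfr):
    print(f"{name:>9}: LD={d:.3f}  LFR={f:.3f}")
```

Under these definitions, a map is usually read as reasonable when both indicators increase monotonically from the very low to the very high class, and an LFR above 1 means a class holds more landslides than its share of area would suggest.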

Keywords