Geomatics, Natural Hazards & Risk (Dec 2024)

Landslide susceptibility analysis using random forest model with SMOTE-ENN resampling algorithm

  • Mingxi Lu,
  • Lea Tien Tay,
  • Junita Mohamad-Saleh

DOI
https://doi.org/10.1080/19475705.2024.2314565
Journal volume & issue
Vol. 15, no. 1

Abstract

Landslides are natural disasters that cause property damage and human injury. Landslide hazard prediction is a crucial measure for reducing such damage and loss, and landslide susceptibility analysis (LSA) is one of the effective approaches to it. In this article, LSA is carried out for the study area, Penang Island. The main issue addressed is the imbalanced landslide dataset: four resampling methods were compared on the training set, using random forest (RF) as the base model. To enhance the credibility of the results, the experiments were replicated 10 times, and McNemar’s test was applied to analyse the statistical significance of differences in classifier performance for the LSA. The results indicated that the differences between the methods were statistically significant; RF combined with the synthetic minority oversampling technique-edited nearest neighbour (SMOTE-ENN) resampling method proposed in this paper had a positive effect on LSA compared with the other resampling methods. With min–max normalization, the combined RF and SMOTE-ENN model achieved a recall of 0.844 and an F2-score of 0.756. The SMOTE-ENN method had a significant impact on the LSA of the imbalanced data in the study area.
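The two evaluation tools named in the abstract, the F2-score and McNemar's test, can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' code: the function names `f_beta` and `mcnemar_exact` and all counts in the usage lines are hypothetical. The abstract also does not say which variant of McNemar's test was used, so the exact binomial form shown here is an assumption.

```python
from math import comb


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from confusion-matrix counts.

    With beta=2 (the F2-score), recall is weighted more heavily than
    precision, which suits tasks where missed landslides are costly.
    """
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value for two paired classifiers.

    b: samples classifier A got right and classifier B got wrong
    c: samples classifier A got wrong and classifier B got right
    Samples where both agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # the classifiers never disagree
    k = min(b, c)
    # Probability of k or fewer successes under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, for illustration only (not taken from the paper)
f2 = f_beta(tp=80, fp=60, fn=20)
p_value = mcnemar_exact(b=5, c=15)
print(f"F2 = {f2:.3f}, McNemar p = {p_value:.4f}")
```

A small p-value from `mcnemar_exact` would suggest that two resampling strategies lead to classifiers that disagree systematically on the same test samples, rather than by chance.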

Keywords