Applied Sciences (Oct 2024)

Comparison of Different Negative-Sample Acquisition Strategies Considering Sample Representation Forms for Debris Flow Susceptibility Mapping

  • Ruiyuan Gao
  • Di Wu
  • Hailiang Liu
  • Xiaoyang Liu

DOI
https://doi.org/10.3390/app14209240
Journal volume & issue
Vol. 14, no. 20
p. 9240

Abstract

The lack of reliable negative samples is an important factor limiting the quality of machine learning-based debris flow susceptibility mapping (DFSM). This paper proposes multiple negative-sample acquisition strategies for DFSM that take different sample representation forms into account. The sample representation forms are mainly the single grid, the multi-grid, and the watershed unit, and the negative-sample acquisition strategies are based on the support vector machine (SVM), the spy technique, and the isolation forest (IF). Each of these three strategies assigns a value to every sample under its own assumptions, and reliable negative samples are then drawn from the samples whose values fall below a predefined threshold. Combining the sample representation forms with the negative-sample acquisition strategies yielded nine datasets, which were then used for random forest (RF) modeling. The receiver operating characteristic (ROC) curves and related statistical results were used to evaluate the models. The results show that the strategy based on the spy technique is suitable for multiple datasets, while the IF-based strategy is well adapted to the watershed unit datasets. This study provides more options for improving the quality of datasets in DFSM, which can in turn improve the performance of machine learning models.
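
The score-then-threshold idea described in the abstract can be illustrated with a minimal sketch of the spy technique. This is not the authors' implementation: the function names, the parameters `spy_ratio` and `spy_quantile`, and the soft nearest-centroid scorer that stands in for a trained classifier are all assumptions made for illustration. A fraction of the known debris-flow samples is hidden as "spies" among the unlabeled samples, every sample receives a positive-likeness value, and unlabeled samples scoring below a threshold derived from the spies are kept as reliable negatives.

```python
import random
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def positive_likeness(x, pos_centroid, mix_centroid):
    """Hypothetical stand-in for a classifier score, in [0, 1].

    Higher values mean x lies closer to the centroid of the known
    positives than to the centroid of the unlabeled-plus-spies pool.
    """
    dp = sq_dist(x, pos_centroid)
    dm = sq_dist(x, mix_centroid)
    return dm / (dp + dm) if (dp + dm) > 0 else 0.5


def spy_reliable_negatives(positives, unlabeled, spy_ratio=0.2,
                           spy_quantile=0.1, seed=0):
    """Return (indices of reliable negatives in `unlabeled`, threshold).

    Assumes at least two positives so that some remain after the
    spies are removed. spy_ratio and spy_quantile are illustrative.
    """
    rng = random.Random(seed)
    n_spies = max(1, int(len(positives) * spy_ratio))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    # "Train" the stand-in scorer: remaining positives vs. unlabeled + spies.
    pos_c = centroid(kept)
    mix_c = centroid(list(unlabeled) + spies)

    # The threshold is a low quantile of the spy scores, so most spies
    # (which are true positives) score at or above it.
    spy_scores = sorted(positive_likeness(s, pos_c, mix_c) for s in spies)
    threshold = spy_scores[int(spy_quantile * len(spy_scores))]

    reliable = [i for i, u in enumerate(unlabeled)
                if positive_likeness(u, pos_c, mix_c) < threshold]
    return reliable, threshold
```

In practice the scorer would be replaced by whichever classifier is used for the study, and the returned indices would select the negative samples that are combined with the known positives before RF modeling.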

Keywords