Geomatics, Natural Hazards & Risk (Jan 2021)

Robustness analysis of machine learning classifiers in predicting spatial gully erosion susceptibility with altered training samples

  • Tusar Kanti Hembram,
  • Sunil Saha,
  • Biswajeet Pradhan,
  • Khairul Nizam Abdul Maulud,
  • Abdullah M. Alamri

DOI
https://doi.org/10.1080/19475705.2021.1890644
Journal volume & issue
Vol. 12, no. 1
pp. 794 – 828

Abstract

This study assesses the robustness of three popular machine learning models, random forest (RF), boosted regression tree (BRT) and naïve Bayes (NB), for spatial gully erosion susceptibility modelling in the Jainti River basin, India. A gully inventory map of 208 gullies was prepared from field surveys and Google Earth imagery. Following a 70/30 ratio, three randomly sampled groups of altered training and validation gully sets (G1, G2 and G3) were prepared for modelling gully erosion susceptibility. Fourteen gully conditioning factors (GCFs) were selected using information gain ratio and multi-collinearity analysis. The discrimination ability and reliability of the models were measured using the Kappa coefficient, efficiency, the receiver operating characteristic (ROC) curve, root-mean-square error (RMSE) and mean absolute error (MAE). Model stability was estimated by comparing the accuracy statistics and the departure in areal outcomes both within each model (intra-model) and between models (inter-model). With the highest mean area under the ROC curve (AUC, 0.903), efficiency (91.17) and Kappa coefficient (0.835), and the lowest RMSE (0.192) and MAE (0.081), RF was the most consistent model when the training and validation data sets were altered. The effectiveness of each input GCF was determined using the map removal sensitivity analysis technique. This study could support model selection for mapping gully erosion and managing land resources.
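The resampling and accuracy statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the seeds, the rounding of the 70/30 cut, the integer gully IDs, and the reading of efficiency as percent correctly classified are all assumptions made for illustration.

```python
import math
import random


def split_70_30(ids, seed):
    """Shuffle ids with a fixed seed and return (training, validation) at 70/30."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))  # assumption: round to the nearest sample
    return shuffled[:cut], shuffled[cut:]


def efficiency(y_true, y_pred):
    """Share of correctly classified samples, in percent (assumed definition)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def kappa(y_true, y_pred):
    """Cohen's kappa for binary labels (1 = gully, 0 = non-gully)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    expected = 0.0
    for label in (0, 1):
        share_true = sum(1 for t in y_true if t == label) / n
        share_pred = sum(1 for p in y_pred if p == label) / n
        expected += share_true * share_pred
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def rmse(y_true, prob):
    """Root-mean-square error between labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, prob)) / len(y_true))


def mae(y_true, prob):
    """Mean absolute error between labels and predicted probabilities."""
    return sum(abs(t - p) for t, p in zip(y_true, prob)) / len(y_true)


# Three altered training/validation groups drawn from 208 hypothetical gully IDs.
gully_ids = list(range(208))
groups = {name: split_70_30(gully_ids, seed)
          for name, seed in (("G1", 1), ("G2", 2), ("G3", 3))}
```

Fitting the same classifier on each group and comparing these statistics across G1, G2 and G3 gives one simple view of how stable a model is when the samples are altered.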

Keywords