International Journal of Digital Earth (Dec 2024)
Enhancing flood-prone area mapping: fine-tuning the K-nearest neighbors (KNN) algorithm for spatial modelling
Abstract
This study focuses on determining the optimal distance metric in the K-Nearest Neighbors (KNN) algorithm for spatial modelling of floods. Four KNN variants, each using a different distance metric (KNN-Manhattan, KNN-Minkowski, KNN-Euclidean, and KNN-Chebyshev), were used for flood susceptibility mapping (FSM) in Estahban, Iran. A spatial database comprising 509 flood occurrence points extracted from satellite images and 12 flood-influencing factors was created for analysis. The particle swarm optimization (PSO) algorithm was employed for hyperparameter optimization and feature selection, with eight influential factors retained as modelling inputs. The modelling results revealed that the KNN-Manhattan algorithm was the most accurate at identifying flood-prone areas (root mean squared error (RMSE) = 0.169, mean absolute error (MAE) = 0.051, coefficient of determination (R²) = 0.884, and area under the curve (AUC) = 0.94). The KNN-Minkowski algorithm followed closely, with an RMSE of 0.175, MAE of 0.056, R² of 0.876, and AUC of 0.939. The KNN-Euclidean algorithm achieved an RMSE of 0.183, MAE of 0.061, R² of 0.842, and AUC of 0.929, whereas the KNN-Chebyshev algorithm achieved an RMSE of 0.198, MAE of 0.075, R² of 0.842, and AUC of 0.924.
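To make the comparison concrete, the four distance metrics and two of the error measures named above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the Minkowski order p = 3 are all assumptions (the abstract does not report the order used), and the PSO step is omitted.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k, dist):
    """Mean label of the k nearest training points under the metric `dist`."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    return sum(label for _, label in nearest) / k

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy data: two normalized factors per point, label 1 = flood, 0 = non-flood.
train_X = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.9, 0.9)]
train_y = [0, 0, 1, 1]

metrics = {
    "manhattan": manhattan,
    "minkowski_p3": lambda a, b: minkowski(a, b, 3),  # p = 3 is an assumed value
    "euclidean": euclidean,
    "chebyshev": chebyshev,
}
for name, dist in metrics.items():
    score = knn_predict(train_X, train_y, (0.95, 0.95), k=2, dist=dist)
    print(name, score)
```

On real data the metrics can rank neighbors differently, which is why the choice of distance changes the reported RMSE, MAE, R², and AUC.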
Keywords