Machine Learning-Driven Landslide Susceptibility Mapping in the Himalayan China–Pakistan Economic Corridor Region

Mohib Ullah; Bingzhe Tang; Wenchao Huangfu; Dongdong Yang; Yingdong Wei; Haijun Qiu

doi:10.3390/land13071011

Land (Jul 2024)

Affiliations

Mohib Ullah: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Bingzhe Tang: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Wenchao Huangfu: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Dongdong Yang: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Yingdong Wei: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Haijun Qiu: Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China

DOI: https://doi.org/10.3390/land13071011
Journal volume & issue: Vol. 13, no. 7, p. 1011

Abstract

The reliability of data-driven landslide susceptibility maps depends on data quality, the choice of analytical method, and the sampling technique. Selecting optimal datasets and determining the most effective analytical methods pose significant challenges. This study assesses the performance of seven machine learning classifiers in the Himalayan region of the China–Pakistan Economic Corridor, using statistical techniques and validation metrics. Thirteen geo-environmental variables were analyzed, comprising topographic (8), land cover (1), hydrological (1), geological (2), and meteorological (1) factors. These variables were evaluated for multicollinearity, feature importance, and their influence on landslide occurrence. Our findings indicate that Support Vector Machines and Logistic Regression were highly effective, particularly near fault zones and roads, owing to their ability to handle complex, non-linear terrain interactions. Conversely, Random Forest and Logistic Regression demonstrated variability in their results. Each model distinctly identified landslide susceptibility zones ranging from very low to very high risk. Elevation, rainfall, lithology, slope, and land use were identified as significant conditioning variables, reflecting the unique geomorphological conditions of the Himalayas. Further analysis using the Variance Inflation Factor and the Pearson correlation coefficient showed minimal multicollinearity among the variables. Moreover, the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values confirmed the strong predictive capabilities of the models, with the Random Forest Classifier performing exceptionally well, achieving an AUC of 0.96 and an F-Score of 0.86. This study shows the importance of model selection based on dataset characteristics to enhance decision-making and strategy effectiveness.
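
The abstract reports model skill as an AUC-ROC value and an F-score. As a minimal sketch of how these two metrics are commonly computed (this is not the authors' code), the Python below uses a pairwise-ranking definition of AUC and the harmonic mean of precision and recall for the F-score. The function names `roc_auc` and `f1_at_threshold`, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (landslide, non-landslide) pairs in which the
    landslide sample receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(labels, scores, threshold=0.5):
    """F-score (F1) after converting scores to classes at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

auc = roc_auc(labels, scores)
f1 = f1_at_threshold(labels, scores)
print(f"AUC = {auc:.4f}, F1 = {f1:.4f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration; for real susceptibility datasets a library routine based on sorted ranks would normally be preferred.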

Published in Land

ISSN: 2073-445X (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Agriculture
Website: http://www.mdpi.com/journal/land
