A novel framework for feature simplification and selection in flood susceptibility assessment based on machine learning

Kaili Zhu; Chengguang Lai; Zhaoli Wang; Zhaoyang Zeng; Zhonghao Mao; Xiaohong Chen

Journal of Hydrology: Regional Studies (Apr 2024)

Affiliations

Kaili Zhu: School of Civil Engineering and Transportation, State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
Chengguang Lai: School of Civil Engineering and Transportation, State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China; Pazhou Lab, Guangzhou 510335, China
Zhaoli Wang: School of Civil Engineering and Transportation, State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China; Pazhou Lab, Guangzhou 510335, China
Zhaoyang Zeng: School of Civil Engineering and Transportation, State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China; Correspondence to: School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510641, China.
Zhonghao Mao: School of Civil Engineering and Transportation, State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
Xiaohong Chen: Center for Water Resources and Environment, Sun Yat-sen University, Guangzhou 510275, China

Journal volume & issue: Vol. 52, p. 101739

Abstract

Study region: Yangtze River Delta core urban agglomeration, China.

Study focus: Traditional research on flood susceptibility assessment using machine learning often seeks to enhance model performance by increasing the number of input variables, which is impractical in regions with limited data availability. In this study, we constructed a variable system comprising 13 features for flood susceptibility assessment through machine learning techniques. A flexible framework, primarily incorporating methods for importance value calculation and repeated random sampling, was established to identify a minimal set of features that yields high-performance classifiers. Finally, the feasibility of the proposed framework was verified by comparing the classifier performances and flood susceptibility maps.

New hydrological insights for the region: Results underscored the significance of five features in model development: Land Use / Land Cover, Impervious Area, Normalized Difference Vegetation Index, Distance to Lake, and Built-up Probability. These five features proved sufficient to produce a classifier with Area Under the Curve (AUC) values exceeding 0.9 for both training and testing data. Susceptibility maps generated using varying feature counts revealed that regions with limited vegetation cover and regions near lakes face higher flood susceptibility. The framework's feasibility was confirmed by the excellent classifier performance (mean AUC > 0.9) with reduced features and by the consistent outcomes of the generated maps, offering theoretical and technical support for flooding research in data-constrained regions.
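The general idea of combining importance ranking with repeated random sampling to find a small, sufficient feature set can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the classifier or the importance measure, so the sketch assumes a toy nearest-centroid scorer and uses single-feature AUC as a stand-in importance value. The data are synthetic, and all function names (`rank_features`, `smallest_sufficient_subset`, and so on) are hypothetical.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split(n, test_frac, rng):
    """One random train/test split of the sample indices 0..n-1."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * (1 - test_frac))
    return idx[:cut], idx[cut:]


def centroid_scores(X, y, train, test, feats):
    """Toy classifier (assumption): higher score = closer to the positive centroid."""
    def centroid(label):
        rows = [X[i] for i in train if y[i] == label]
        return [mean(r[j] for r in rows) for j in feats]

    c1, c0 = centroid(1), centroid(0)
    return [
        sum((X[i][j] - a) ** 2 - (X[i][j] - b) ** 2 for j, a, b in zip(feats, c0, c1))
        for i in test
    ]


def rank_features(X, y, n_repeats, rng):
    """Accumulate a stand-in importance (|single-feature AUC - 0.5|) over repeated splits."""
    n_feat = len(X[0])
    totals = [0.0] * n_feat
    for _ in range(n_repeats):
        train, _unused = split(len(X), 0.3, rng)
        labels = [y[i] for i in train]
        for j in range(n_feat):
            totals[j] += abs(auc([X[i][j] for i in train], labels) - 0.5)
    return sorted(range(n_feat), key=lambda j: totals[j], reverse=True)


def smallest_sufficient_subset(X, y, ranking, threshold, n_repeats, rng):
    """Add features in ranked order until the mean test AUC reaches the threshold."""
    for k in range(1, len(ranking) + 1):
        feats = ranking[:k]
        aucs = []
        for _ in range(n_repeats):
            train, test = split(len(X), 0.3, rng)
            scores = centroid_scores(X, y, train, test, feats)
            aucs.append(auc(scores, [y[i] for i in test]))
        if mean(aucs) >= threshold:
            return feats, mean(aucs)
    return ranking, mean(aucs)


# Synthetic data (assumption): feature 0 is strongly informative,
# feature 1 weakly informative, features 2-4 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
shift = [3.0, 1.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(s * label, 1.0) for s in shift] for label in y]

ranking = rank_features(X, y, 20, rng)
selected, mean_auc = smallest_sufficient_subset(X, y, ranking, 0.9, 20, rng)
print("ranking:", ranking, "selected:", selected, "mean AUC:", round(mean_auc, 3))
```

Averaging over many random splits, rather than relying on a single split, is what makes both the ranking and the stopping point less sensitive to how the samples happen to be partitioned. In practice the toy scorer and importance measure would be replaced by whatever classifier and importance method the study actually uses.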

Published in Journal of Hydrology: Regional Studies

ISSN: 2214-5818 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Science: Geology
Website: http://www.journals.elsevier.com/journal-of-hydrology-regional-studies/
