Water (Jul 2024)

Integrating Feature Selection with Machine Learning for Accurate Reservoir Landslide Displacement Prediction

  • Qi Ge,
  • Jingyong Wang,
  • Cheng Liu,
  • Xiaohong Wang,
  • Yiyan Deng,
  • Jin Li

DOI
https://doi.org/10.3390/w16152152
Journal volume & issue
Vol. 16, no. 15
p. 2152

Abstract

Accurate prediction of reservoir landslide displacements is crucial for early warning and hazard prevention. Current machine learning (ML) paradigms for predicting landslide displacement demonstrate superior performance, but they often rely on various feature engineering techniques, such as decomposition into different temporal lags and feature selection. This study investigates the impact of various feature selection techniques on the performance of ML algorithms for landslide displacement prediction. The Shuping and Baishuihe landslides in China’s Three Gorges Reservoir Area are used to comprehensively benchmark four prevalent ML algorithms. These include both static models, namely the backpropagation neural network (BPNN) and support vector machine (SVM), and dynamic models, namely long short-term memory (LSTM) and the gated recurrent unit (GRU). Each ML model is evaluated under three feature engineering settings: raw multivariate time series without feature selection, feature selection using the maximal information coefficient with the partial autocorrelation function (MIC-PACF), and feature selection using grey relational analysis with PACF (GRA-PACF). The results demonstrate that appropriate feature selection methods can significantly improve the performance of static ML models. In contrast, dynamic models draw on their inherent ability to capture temporal dynamics within raw multivariate time series and gain only marginally from extensive feature engineering compared with no feature selection. The optimal feature selection approach varies with the ML model and the specific landslide, highlighting the importance of case-specific assessments. The findings of this study offer guidance on integrating feature selection techniques with different machine learning models to maximize the robustness and generalizability of data-driven landslide displacement prediction frameworks.
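To make the GRA-PACF idea concrete, the following is a minimal sketch of its two ingredients: a grey relational grade that scores how closely a candidate factor (for example, rainfall or reservoir level) tracks displacement, and a partial autocorrelation estimate used to pick informative lags. The abstract does not give the authors' exact procedure, so everything here is an assumption for illustration: the function names are hypothetical, the grade is computed per pair of series rather than with minima and maxima pooled over all candidates, the distinguishing coefficient defaults to 0.5, PACF is estimated by ordinary least squares, and lags are kept when they exceed an approximate 95% bound of 1.96/sqrt(n).

```python
import numpy as np


def grey_relational_grade(reference, candidate, rho=0.5):
    """Grey relational grade between two non-constant series (pairwise variant)."""
    ref = np.asarray(reference, dtype=float)
    cand = np.asarray(candidate, dtype=float)
    # Min-max normalise so that series in different units are comparable.
    ref = (ref - ref.min()) / (ref.max() - ref.min())
    cand = (cand - cand.min()) / (cand.max() - cand.min())
    delta = np.abs(ref - cand)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:
        # The normalised series coincide, so the grade is maximal.
        return 1.0
    # Grey relational coefficient at each time step, averaged into a grade.
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return float(coeff.mean())


def pacf_ols(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag via least squares."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    values = []
    for k in range(1, max_lag + 1):
        # Columns hold x[t-1], ..., x[t-k]; the target is x[t].
        lags = np.column_stack([x[k - j: len(x) - j] for j in range(1, k + 1)])
        target = x[k:]
        coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
        # The coefficient on the longest lag is the PACF at lag k.
        values.append(coef[-1])
    return np.array(values)


def select_lags(series, max_lag=12):
    """Keep lags whose PACF exceeds an approximate 95% significance bound."""
    bound = 1.96 / np.sqrt(len(series))
    pacf = pacf_ols(series, max_lag)
    return [k + 1 for k, value in enumerate(pacf) if abs(value) > bound]
```

In a workflow of this kind, one would typically rank candidate factors by `grey_relational_grade` against the displacement series, keep the highest-scoring ones, and then call `select_lags` on each retained series to decide which lagged copies to feed to a static model such as an SVM or BPNN.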

Keywords