Journal of Rock Mechanics and Geotechnical Engineering (Oct 2024)
Data-augmented landslide displacement prediction using generative adversarial network
Abstract
Landslides are destructive natural disasters that cause catastrophic damage and loss of life worldwide. Accurately predicting landslide displacement enables effective early warning and risk management. However, the limited availability of on-site measurement data has been a substantial obstacle to developing data-driven models, such as state-of-the-art machine learning (ML) models. To address this challenge, this study proposes a data augmentation framework that uses generative adversarial networks (GANs), a recent advance in generative artificial intelligence (AI), to improve the accuracy of landslide displacement prediction by enlarging limited datasets. A recurrent GAN model, RGAN-LS, is proposed, specifically designed to generate realistic synthetic multivariate time series that mimic the characteristics of real on-site landslide measurement data. During the training of RGAN-LS, a customized moment-matching loss is added to the adversarial loss to capture the temporal dynamics and correlations in real time series data. The synthetic data generated by RGAN-LS are then used to enhance the training of long short-term memory (LSTM) networks and particle swarm optimization-support vector machine (PSO-SVM) models for landslide displacement prediction. Results on two landslides in the Three Gorges Reservoir (TGR) region show a significant improvement in LSTM prediction performance when the model is trained on augmented data. For instance, for the Baishuihe landslide, the average root mean square error (RMSE) decreases by 16.11% and the mean absolute error (MAE) by 17.59%. More importantly, the model's responsiveness during mutational stages is enhanced, which benefits early warning. In contrast, the static PSO-SVM model sees only marginal gains compared with recurrent models such as LSTM.
Further analysis indicates that an optimal synthetic-to-real data ratio (50% in the illustrative cases) maximizes the improvements. This also demonstrates the robustness and effectiveness of supplementing training data for dynamic models. By leveraging generative AI, RGAN-LS can generate high-fidelity synthetic landslide data, which is critical for improving the performance of advanced ML models in predicting landslide displacement, particularly when training data are limited. This approach also has the potential to expand the use of generative AI in geohazard risk management and other research areas.
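
As an illustration of the quantities mentioned in the abstract, the following minimal Python sketch shows one way to mix synthetic samples into a real training set at a given synthetic-to-real ratio, and how the relative change in RMSE or MAE between a baseline model and an augmented model could be computed. It is not the authors' implementation. The function names (`augment`, `error_reduction_pct`) are hypothetical, and the sketch assumes that a ratio of 50% means adding half as many synthetic samples as there are real samples, which the abstract does not spell out.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def error_reduction_pct(baseline_error, augmented_error):
    """Percentage by which the error drops relative to the baseline.

    Positive values mean the augmented model has the lower error.
    """
    return 100.0 * (baseline_error - augmented_error) / baseline_error


def augment(real_samples, synthetic_samples, ratio=0.5, seed=0):
    """Return the real samples followed by randomly chosen synthetic samples.

    Assumption: `ratio` is the number of synthetic samples divided by the
    number of real samples (0.5 adds half as many synthetic as real).
    """
    n_extra = round(ratio * len(real_samples))
    if n_extra > len(synthetic_samples):
        raise ValueError("not enough synthetic samples for the requested ratio")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return list(real_samples) + rng.sample(list(synthetic_samples), n_extra)
```

For example, `augment(real, synthetic, ratio=0.5)` with 10 real samples returns 15 training samples, and `error_reduction_pct(2.0, 1.5)` evaluates to 25.0.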