Scientific Reports (Nov 2024)
“Ensembled transfer learning approach for error reduction in landslide susceptibility mapping of the data scare region”
Abstract
Landslide susceptibility maps (LSMs) play an important role in identifying slopes prone to future landslides. However, their applicability is often hindered by the high cost of data collection, especially in mountainous regions such as the Himalayas. This study therefore proposes a transfer learning (TL) approach that improves LSM performance by transferring information from a data-rich region (source) to a data-scarce region (target). Two machine-learning models are trained in the source area and transferred for prediction in the target area; in addition, a source-trained model combined with knowledge of the target is used for prediction in the target area. The approach is tested with Mandi district as the source area and Kullu and Rudraprayag districts as the target areas. The independent variables comprised 11 landslide-conducive factors from the source and target areas; their independence was analysed using a multicollinearity test, and the geomorphic similarity of the areas was assessed using the Kullback-Leibler (KL) divergence. Random forest (RF) and multi-layer perceptron (MLP) models were trained in both areas, and AUC-ROC, precision, recall, F-score, and accuracy were used to evaluate the LSMs. For target area 1, the AUC increased from 0.908 (target trained on itself) to 0.942 (target transfer-learned, TL) and 0.959 (target combined) for RF, and from 0.896 to 0.907 and 0.946, respectively, for MLP. Additionally, precision increased for RF, while for MLP the other measures increased by up to 0.023 for precision, 0.022 for recall, 0.023 for F-score, and 0.066 for accuracy.
For target area 2, the RF AUC was 0.95 (target trained on itself), 0.80 (target TL), and 0.98 (target combined), while the MLP AUC was 0.82 for the source-trained model and 0.84 for the LSM obtained using both target and source data. These results indicate that the proposed TL approach can improve LSM performance in data-scarce parts of the Himalayan region, offering a promising way to overcome data limitations.
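
The abstract states that geomorphic similarity between source and target areas was assessed using KL divergence, but does not describe the procedure. The following is a minimal sketch of one way such a comparison could be set up, not the authors' implementation. It assumes that a single factor (slope angle) is binned on shared edges, that add-one smoothing is applied to avoid empty bins, and that the divergence is taken in the direction source to target. The helper names (`bin_probabilities`, `kl_divergence`) and all data are hypothetical and synthetic.

```python
import math
import random


def bin_probabilities(values, lo, hi, n_bins):
    """Bin values in [lo, hi] into n_bins equal-width bins and return
    add-one-smoothed probabilities (every bin stays non-zero)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Values equal to hi fall into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values) + n_bins  # counts plus one pseudo-count per bin
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * ln(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def clamp(v, lo=0.0, hi=90.0):
    """Keep synthetic slope angles inside the physical 0-90 degree range."""
    return min(hi, max(lo, v))


# Synthetic slope angles in degrees (assumed for illustration only;
# these are NOT values from the study areas).
rng = random.Random(0)
source_slope = [clamp(rng.gauss(30, 8)) for _ in range(2000)]
similar_target = [clamp(rng.gauss(31, 8)) for _ in range(2000)]
dissimilar_target = [clamp(rng.gauss(50, 5)) for _ in range(2000)]

# Shared binning: 18 bins of 5 degrees each.
p_source = bin_probabilities(source_slope, 0.0, 90.0, 18)
q_similar = bin_probabilities(similar_target, 0.0, 90.0, 18)
q_dissimilar = bin_probabilities(dissimilar_target, 0.0, 90.0, 18)

kl_similar = kl_divergence(p_source, q_similar)
kl_dissimilar = kl_divergence(p_source, q_dissimilar)

print(f"KL(source || similar target):    {kl_similar:.4f}")
print(f"KL(source || dissimilar target): {kl_dissimilar:.4f}")
```

A lower value indicates that the factor is distributed more alikely in the two areas, which is the sense in which a target could be judged a reasonable candidate for transfer. KL divergence is asymmetric, so the choice of direction is itself an assumption here.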