1 Introduction

Landslides are gravitational movements of mass down a slope, and they have caused heavy casualties and severe financial losses (Gong et al. 2021; Guo et al. 2021). In recent decades, mountainous areas have faced increasing landslide risk as a consequence of industrialization, urbanization and population growth (Kang et al. 2009). Frequent landslides with complicated mechanisms are extremely hard to identify, and they seriously endanger national economic development and the safety of people's lives and property. To mitigate the adverse effects of landslide geohazards worldwide, various strategies have been implemented (Cui et al. 2021; Juang 2021). Landslide displacement prediction based on monitoring data has proven to be an effective way to predict and prevent landslide disasters (Wang et al. 2022). To better understand recent research trends in “landslide displacement prediction”, a visualization analysis of high-frequency terminology in titles and abstracts was carried out based on Web of Science and VOSviewer. The result (Fig. 1) shows that most research has concentrated on the China Three Gorges Reservoir (CTGR) area, where slide-prone strata are most developed (Criss et al. 2020). Many attempts have been made to predict landslide displacement with Machine Learning (ML) and Deep Learning (DL) methods, such as Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF) and Extreme Learning Machine (ELM) models, owing to their inherently nonlinear nature and excellent prediction ability (Yu and Ma 2021; Huang et al. 2020). Generally speaking, during model construction a regression or classification model is trained as a complex nonlinear mapping with adjustable parameters, based on external inducing factors and displacement datasets.
More recently, to overcome the drawbacks of single methods, various ensemble models that integrate two or three methods have been put forward. Considering the time-series characteristics of displacement, the Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) have been applied for prediction (Yang et al. 2019). In particular, for typical landslides with step-like deformation characteristics, some researchers have decomposed the accumulated displacement into a trend term and a periodic term through time-series filtering algorithms, such as the Complementary Ensemble Empirical Mode Decomposition (CEEMD) algorithm (Yeh et al. 2011).

$$ {A}_t={T}_t+{P}_t,t=1,2,3,\dots, N $$
(1)

where \( A_t \) is the cumulative landslide displacement at time t, \( T_t \) is the trend component, which reflects the long-term trend over time and is determined by the geological conditions of the slope, and \( P_t \) is the periodic component, which is affected by periodic events in the natural environment. The landslide displacement is predicted by adding the fitted trend component to the periodic component predicted by an artificial intelligence (AI) model.
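The additive structure of Eq. 1 can be illustrated with a minimal Python sketch. It does not implement CEEMD: a centered moving average is swapped in as a crude stand-in for the trend extraction, and the function names, window length and displacement values are hypothetical.

```python
def moving_average_trend(series, window=12):
    """Centered moving average used as a crude stand-in for the trend T_t.

    This is NOT CEEMD; it only illustrates the additive split of Eq. 1.
    The window is truncated at both ends of the series.
    """
    n = len(series)
    half = window // 2
    trend = []
    for t in range(n):
        lo = max(0, t - half)
        hi = min(n, t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    return trend


def decompose(series, window=12):
    """Split A_t into a trend T_t and a residual periodic term P_t = A_t - T_t."""
    trend = moving_average_trend(series, window)
    periodic = [a - t for a, t in zip(series, trend)]
    return trend, periodic


# Synthetic step-like cumulative displacement (illustrative values only).
cumulative = [0, 1, 2, 10, 11, 12, 13, 22, 23, 24, 25, 35]
trend, periodic = decompose(cumulative, window=4)

# Adding the two components back together recovers A_t, as in Eq. 1.
recombined = [t + p for t, p in zip(trend, periodic)]
```

Whatever decomposition is used, the recombination step stays the same: the predicted trend and the predicted periodic term are simply added.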

Fig. 1 The visualization analysis of high-frequency terminology in the titles and abstracts of the “landslide displacement prediction” literature. The cluster network plot includes TGR, approach and application, ELM and PSO, in order of declining frequency.

Traditionally, the mean square error (MSE), root mean square error (RMSE) and mean absolute error (MAE) are used to evaluate the overall prediction accuracy of a model. These indexes can be expressed as follows:

$$ RMSE=\sqrt{\frac{1}{n}\sum \limits_{t=1}^n{\left({\hat{Y}}_t-{Y}_t\right)}^2} $$
(2)
$$ MSE=\frac{1}{n}\sum \limits_{t=1}^n{\left({\hat{Y}}_t-{Y}_t\right)}^2 $$
(3)
$$ MAE=\frac{1}{n}\sum \limits_{t=1}^n\left|{\hat{Y}}_t-{Y}_t\right| $$
(4)

where \( {Y}_t \) refers to the observed value, \( {\hat{Y}}_t \) refers to the predicted value and n is the number of samples. The MSE and RMSE measure the deviation between observed and predicted values. The MAE measures the average overall prediction error.
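As a small worked example, the three indexes can be computed in a few lines of Python. The function names and sample values below are illustrative only and are not taken from the monitoring data.

```python
import math


def mse(observed, predicted):
    """Mean square error (Eq. 3)."""
    n = len(observed)
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Root mean square error (Eq. 2): the square root of the MSE."""
    return math.sqrt(mse(observed, predicted))


def mae(observed, predicted):
    """Mean absolute error (Eq. 4)."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


# Illustrative values: the errors are 1, 0, -2 and 4.
observed = [10.0, 12.0, 15.0, 20.0]
predicted = [11.0, 12.0, 13.0, 24.0]

print(mse(observed, predicted))   # 5.25
print(rmse(observed, predicted))  # square root of 5.25
print(mae(observed, predicted))   # 1.75
```

Because the RMSE is the square root of the MSE, the two indexes always rank models identically; the MAE is less sensitive to a few large errors.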

2 Case Study

Intensive and severe hydrodynamic pressure-driven landslides with obvious step-like deformation characteristics, such as the Baijiabao, Bazimen and Baishuihe landslides, have developed in the Zigui basin of the CTGR area under the influence of precipitation and reservoir water level fluctuation (Tang et al. 2019). According to the field investigation (Figs. 2a, 3a, 4a), the Baijiabao and Bazimen landslides are anti-dip slopes located on the right bank of the Xiangxi River, whereas the Baishuihe landslide is a dip slope located on the right bank of the Yangtze River.

Fig. 2 (a) The plan map, (b) the profile map and (c) the monthly cumulative displacement, monthly precipitation and reservoir water level of the Baijiabao landslide (after Long et al. 2022). Panel (a) shows monitoring points ZG323, ZG324, ZG325 and ZG326; panel (c) is a combined bar and line plot in which the cumulative displacement of the four points shows rising peaks.

Fig. 3 (a) The plan map, (b) the profile map and (c) the monthly cumulative displacement, monthly precipitation and reservoir water level of the Bazimen landslide (after Long et al. 2022). Panel (a) shows monitoring points ZG110 and ZG111; panel (b) shows gravel, gravel breccia, argillaceous siltstone and quartz sandstone along the profile; panel (c) is a combined bar and line plot.

Fig. 4 (a) The plan map, (b) the profile map and (c) the monthly cumulative displacement, monthly precipitation and reservoir water level of the Baishuihe landslide (after Long et al. 2022). Panel (a) shows monitoring points ZG93 and ZG118; panel (b) shows silty clay and sandstone along the profile; panel (c) is a combined bar and line plot in which the cumulative displacement of ZG93 and ZG118 shows rising peaks.

A comparison of the geological information shows that the stratigraphic units of the three landslides all consist of sandstone interbedded with mudstone of the Early Jurassic Xiangxi Group (\( J_1x \)) (Figs. 2b, 3b, 4b). The sliding zones of the three landslides are Jurassic slide-prone strata composed of sedimentary rocks with poor weathering resistance and high permeability (Long et al. 2020). However, the three landslides vary in size. The Baijiabao landslide covers an area of \( 2.4 \times 10^{5}\ \mathrm{m}^2 \) and has a volume of \( 1.0 \times 10^{7}\ \mathrm{m}^3 \). The Bazimen landslide covers a total area of \( 1.18 \times 10^{5}\ \mathrm{m}^2 \) and has a volume of \( 2.35 \times 10^{6}\ \mathrm{m}^3 \). The Baishuihe landslide covers an area of \( 2.15 \times 10^{5}\ \mathrm{m}^2 \), with an estimated volume of \( 1.26 \times 10^{7}\ \mathrm{m}^3 \).

Several GPS sites were set up on the three landslides to monitor their deformation. However, some sensors have been damaged for various reasons, such as reservoir impoundment. Hence, the data from four GPS monitoring points on the Baijiabao landslide (ZG323, ZG324, ZG325 and ZG326) (Fig. 2c), two GPS monitoring points on the Bazimen landslide (ZG110 and ZG111) (Fig. 3c) and two GPS monitoring points on the Baishuihe landslide (ZG93 and ZG118) (Fig. 4c), together with the monthly precipitation and reservoir water level, are collected for this research.

3 Landslide Prediction Model Based on Ensemble Learning

3.1 Model Construction and Application

RF is an advanced ensemble algorithm based on Decision Trees (DT) with robust and accurate performance (Breiman 2001). The SVM algorithm (Ruping and Morik 2003), with high robustness and outstanding generalization capacity, is regarded as one of the best-known ML methods for landslide displacement forecasting. In this paper, we construct two ensemble learning methods, Greedy Algorithm (GA)-CEEMD-RF and GA-CEEMD-SVM, for landslide displacement decomposition and prediction. The procedure includes four key steps, and the flowchart is shown in Fig. 5.

Fig. 5 The flowchart of the GA-CEEMD-RF and GA-CEEMD-SVM models for displacement prediction of landslides with step-like characteristics (after Li et al. 2021). The flowchart includes calculating the predicted trend displacement via polynomial fitting from the original displacement and the periodic displacement from a combination of induced factors.

Considering data integrity, the monitoring displacement dataset of ZG323 on the Baijiabao landslide from April 2010 to April 2017 is used for model construction as an example. First, the CEEMD algorithm is used to decompose the cumulative landslide displacement into a trend component and a periodic component.

Secondly, rainfall and variation of the reservoir water level are regarded as the main landslide-inducing factors in the CTGR area. Hence, eight direct inducing factors are considered: the monthly accumulative rainfall (\( R_1 \)), the accumulative rainfall of the previous 2 months (\( R_2 \)) and the average rainfall of the previous 3 months (\( R_3 \)); the monthly average water level (\( W_1 \)), the monthly maximum variation of water level (\( W_2 \)) and the maximum variation of water level of the previous 2 months (\( W_3 \)); and the high-frequency terms \( R_4 \) and \( W_4 \), obtained by decomposing \( R_1 \) and \( W_1 \), respectively, with the CEEMD algorithm. Since inducing factors carrying similar information should not be used simultaneously, the GA, which makes the optimal decision at each step while trying to find the overall optimal solution for the whole problem, is applied to determine the optimal combination of inducing factors (Gribonval and Nielsen 2001). Each of the eight direct inducing factors is first used on its own to construct a displacement prediction model, and the goodness of fit (\( R^2 \)) of each model is calculated. The inducing factor with the maximum \( R^2 \) is then lagged by one and by two months to account for the lag effect, and the lagged series serve as inputs to construct further models, whose \( R^2 \) values are compared with those obtained above. The inducing factor with the maximum \( R^2 \) is regarded as the Optimal Combination of Induced Factors (OCIF) at the present stage. Next, each of the remaining seven direct inducing factors is combined with the OCIF as model inputs, and the \( R^2 \) of these models is obtained. The combination with the maximum \( R^2 \) is regarded as the global OCIF at the present stage. These steps are repeated until the global OCIF is determined. The optimal combination determined by the GA for this period comprises the 2-month-lagged \( R_2 \), \( R_3 \), the 1-month-lagged \( R_4 \) and the 2-month-lagged \( W_2 \). The results also indicate that the Gauss function shows good capacity for trend component prediction.
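The greedy selection described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the lagging step is omitted, and the `score` callable stands in for training a model and computing its goodness of fit. The toy gains and the redundancy penalty are hypothetical integers chosen only to make the behaviour easy to follow.

```python
def greedy_select(candidates, score):
    """Forward greedy selection of a factor combination.

    At each step the factor whose addition gives the highest score is
    added; the search stops when no remaining factor improves the score.
    """
    selected = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-factor extension of the current combination.
        trials = [(score(selected + [c]), c) for c in remaining]
        trial_best, choice = max(trials, key=lambda pair: pair[0])
        if trial_best <= best:
            break  # no extension improves the fit
        selected.append(choice)
        remaining.remove(choice)
        best = trial_best
    return selected, best


# Hypothetical per-factor gains (arbitrary integer units).
GAIN = {"R1": 30, "R2": 35, "R3": 20, "W1": 10, "W2": 25}
# Hypothetical penalty for using two factors that carry similar information.
REDUNDANT = {frozenset(("R1", "R2")): 30}


def toy_score(combo):
    """Placeholder for the goodness of fit of a model trained on `combo`."""
    total = sum(GAIN[c] for c in combo)
    for pair, penalty in REDUNDANT.items():
        if pair <= set(combo):
            total -= penalty
    return total


selected, best = greedy_select(["R1", "R2", "R3", "W1", "W2"], toy_score)
print(selected, best)  # ['R2', 'W2', 'R3', 'W1'] 90
```

In this toy run the factor R1 is never added, because its gain is cancelled by its redundancy with R2, which mirrors the rule that factors with similar information should not be used together.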

Finally, the Gauss function is selected to fit the trend displacement, and the OCIF and the periodic component serve as the inputs and outputs, respectively, of the RF and SVM algorithms for constructing the displacement prediction model. In this study, 80% of the data (from April 2010 to February 2016) are used as the training set for the periodic displacement prediction model, while the remaining 20% (from March 2016 to April 2017) are used as test data to check the effectiveness of the model. In the RF algorithm, the number of DTs is set to 1000 and the size of the feature set is set to 2. The optimal error penalty factor (C) and kernel function parameter (g) of the SVM periodic component prediction model are determined by the grid search method as 0.5 and \( 9.76 \times 10^{-4} \), respectively. The trend term and the periodic term are added together to obtain the cumulative displacement.
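The chronological split and the final recombination can be sketched as below. The sketch assumes one record per calendar month with no gaps, which may not match the actual record counts; the helper names are hypothetical and the periodic values are placeholders.

```python
def month_range(start, end):
    """Inclusive list of (year, month) tuples from start to end."""
    months = []
    year, month = start
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def chronological_split(months, values, last_train_month):
    """Split a monthly series by date so that the test set follows the training set."""
    pairs = list(zip(months, values))
    train = [(m, v) for m, v in pairs if m <= last_train_month]
    test = [(m, v) for m, v in pairs if m > last_train_month]
    return train, test


def recombine(trend, periodic):
    """Cumulative displacement as the sum of the trend and periodic terms."""
    return [t + p for t, p in zip(trend, periodic)]


months = month_range((2010, 4), (2017, 4))
periodic = [0.0] * len(months)  # placeholder values, not monitoring data
train, test = chronological_split(months, periodic, (2016, 2))

print(len(train), len(test))  # 71 14
```

Splitting by date rather than at random keeps every test month later than every training month, so the model is evaluated on data it could not have seen.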

3.2 Result Evaluation and Verification

As shown in Fig. 6, the \( R^2 \) of the fit of the trend displacement with the Gauss equation is 0.9956, with an RMSE of 15.66, an MSE of 254.44 and an MAE of 13.11. The RMSE, MSE and MAE of the periodic displacement prediction based on the RF algorithm are 5.44, 29.58 and 3.01, respectively, and those based on the SVM algorithm are 8.76, 76.82 and 3.59, respectively. The RMSE, MSE and MAE of the accumulative displacement prediction are 17.47, 305.28 and 13.98 for the GA-CEEMD-RF method and 19.67, 387.10 and 14.57 for the GA-CEEMD-SVM model.

Fig. 6 The fitting result of the trend displacement with the Gauss equation; the periodic component prediction results based on the RF and SVM models; and the accumulative displacement prediction results of the GA-CEEMD-RF and GA-CEEMD-SVM models (after Li et al. 2021). The three panels plot observed and predicted displacement values from January 2010 to January 2017.

The smaller these error values, the better the model performance. Hence, the comparison shows that both the SVM and RF models perform well in forecasting the periodic component of step-like landslides, but the GA-CEEMD-RF model has better overall prediction capacity than the GA-CEEMD-SVM model.

4 A Landslide Prediction Model Based on Transfer Learning

4.1 MFTL Model Construction

The monitoring system established in the Zigui basin can cover only the most destructive landslides because of manpower and equipment limitations. In addition, interruptions of monitoring work cause missing data. Thus, a landslide of interest may lack training data, while data from other landslides located in neighboring areas or showing the same deformation characteristics may be available. According to the engineering geological analogy (EGA) method, landslides occurring under the same geological conditions generally have similar deformation characteristics but vary in size. In such cases, the transfer learning (TL) approach, inspired by the EGA method, provides a solution, as the knowledge gained from a landslide with sufficient monitoring data can be applied to other landslides that lack adequate training data (Pan and Yang 2010; Yi and Gianfranco 2010; Weiss et al. 2016). The multi-feature fusion transfer learning (MFTL) method is newly proposed to transfer the knowledge learned from the Baijiabao landslide to the Bazimen and Baishuihe landslides for further forecasting.

The evolution process of hydrodynamic pressure-driven landslides is affected by complex factors that cannot be simply expressed by a single variable (Liu et al. 2020). Hence, the landslide evolution state is proposed to represent the relationship between deformation and external factors. The process of multi-feature space projection and landslide evolution state construction is as follows (Fig. 7) (Long et al. 2022):

  1. The landslide evolution state (\( S_t \)) at time t is defined as a continuous response deviation:

$$ {S}_t=\left[\begin{array}{ccc}{R}_t& {R}_{t-1}& {R}_{t-2}\\ {}{K}_t& {K}_{t-1}& {K}_{t-2}\\ {}\varDelta {W}_t& \varDelta {W}_{t-1}& \varDelta {W}_{t-2}\end{array}\right] $$
(5)

It is composed of the consecutive incentive conditional space (\( X=\left[\begin{array}{cccc}{R}_t& {R}_{t-1}& \cdots & {R}_0\\ {}{K}_t& {K}_{t-1}& \cdots & {K}_0\end{array}\right] \)) and the sensitivity states (\( \varDelta {W}_t={W}_t^{\prime }-{W}_t \)), where R refers to the monthly accumulative precipitation and K refers to the variation velocity of the reservoir water level. The landslide incentive conditional space (X) and displacement (\( {\omega}_t \)) are projected into the Dynamic-Kernel Hilbert Space (DKHS) through different projection functions \( f_1(x) \) and \( f_2(w) \). \( W_t \) is the projection of displacement in the conditional space at time t, and \( {W}_t^{\prime } \) is the projection of displacement at time t. The difference in post-projection displacement is defined as the sensitivity state (\( \varDelta {W}_t \)) at time t under the current excitation conditions.

  2. The distance \( \left(R{D}_{S_{t_1}{S}_{t_2}}\right) \) is used to measure the similarity of the landslide evolution states \( {S}_{t_1} \) and \( {S}_{t_2} \) at times \( t_1 \) and \( t_2 \), and it is defined as follows:

$$ R{D}_{S_{t_1}{S}_{t_2}}=\sqrt{\sum \limits_{k=0}^2\left({\left|{R}_{t_1-k}-{R}_{t_2-k}\right|}^2+{\left|{K}_{t_1-k}-{K}_{t_2-k}\right|}^2+{\left|\varDelta {W}_{t_1-k}-\varDelta {W}_{t_2-k}\right|}^2\right)} $$
(6)

where \( {R}_{t_1-k} \) and \( {K}_{t_1-k} \) denote the incentive condition factors R and K at time \( t_1-k \), and \( {R}_{t_2-k} \) and \( {K}_{t_2-k} \) denote them at time \( t_2-k \). Similarly, \( \varDelta {W}_{t_1-k} \) and \( \varDelta {W}_{t_2-k} \) denote the sensitivity states at times \( t_1-k \) and \( t_2-k \), respectively.

  3. The class centroid (CS) is defined as the average of the landslide evolution states in a class:

$$ C{S}_{N_i}=\frac{1}{M_i}\sum \limits_{j=1}^{M_i}\left({S}_{t_j}\right) $$
(7)

where N refers to the number of kinds of landslide deformation states. In this paper, the mutation state and the creep state are taken into consideration, so the default value of N is 2. Each landslide deformation state \( N_i \) includes \( M_i \) landslide evolution states.

Fig. 7 The flowchart of the MFTL model for landslide displacement prediction (after Long et al. 2022). The flowchart includes the calculation of the baseline response and displacement projection, the MFTL strategy with projection onto a classification hyperplane, and the backward transfer strategy.

  4. The total distance within the same class \( \left(D{C}_{N_i}\right) \) is calculated as follows:

$$ D{C}_{N_i}=\sum \limits_{j=1}^{M_i}R{D}_{t_j,C{S}_{N_i}} $$
(8)

where \( R{D}_{t_j, CS} \) refers to the distance between the landslide evolution state \( \left({S}_{t_j}\right) \) and a class centroid (CS).

  5. Considering the characteristics of landslide displacement data, the expected mean distribution discrepancy (EMDD) method is defined to maintain consistency and, at the same time, to obtain the minimum risk distribution for the structural features of landslide evolution states in the DKHS. The projection functions \( f_1(x) \) and \( f_2(w) \) can be adjusted with the EMDD method so that the classification hyperplane separates the samples into different classes as far as possible:

$$ EMDD=\min \mu \sum \limits_{t=0}^nV\left({x}_t,{\omega}_t,{f}_1,{f}_2\right)+\min \sum \limits_{i=0}^ND{C}_{N_i} $$
(9)

where μ is the penalty coefficient and \( V\left({x}_t,{\omega}_t,{f}_1,{f}_2\right) \) is the penalty function.
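Steps 1 to 4 above (Eqs. 5 to 8) can be sketched in plain Python. The sketch assumes that the sensitivity states ΔW have already been obtained from the projection step, which is not reproduced here; the function names and the sample series are hypothetical.

```python
import math


def evolution_state(R, K, dW, t):
    """3 x 3 state S_t of Eq. 5: rows R, K, dW; columns t, t-1, t-2 (needs t >= 2)."""
    return [[series[t - k] for k in range(3)] for series in (R, K, dW)]


def state_distance(s1, s2):
    """Eq. 6: Euclidean distance over all nine entries of two states."""
    return math.sqrt(sum((a - b) ** 2
                         for row1, row2 in zip(s1, s2)
                         for a, b in zip(row1, row2)))


def class_centroid(states):
    """Eq. 7: element-wise mean of the states belonging to one class."""
    m = len(states)
    return [[sum(s[i][j] for s in states) / m for j in range(3)]
            for i in range(3)]


def total_class_distance(states):
    """Eq. 8: sum of the distances from each state to its class centroid."""
    centroid = class_centroid(states)
    return sum(state_distance(s, centroid) for s in states)


# Illustrative series only (not monitoring data).
R = [0.0, 0.0, 0.0, 3.0]    # monthly accumulative precipitation
K = [0.0, 0.0, 0.0, 4.0]    # variation velocity of reservoir water level
dW = [0.0, 0.0, 0.0, 0.0]   # sensitivity states, assumed precomputed

s2 = evolution_state(R, K, dW, 2)
s3 = evolution_state(R, K, dW, 3)
print(state_distance(s2, s3))          # 5.0
print(total_class_distance([s2, s3]))  # 5.0
```

The second term of Eq. 9 is the sum of these total class distances over all classes, so a projection that tightens each class lowers the EMDD.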

Considering data integrity, the monthly cumulative displacement acquired from four GPS monitoring sites (ZG323, ZG324, ZG325 and ZG326) on the Baijiabao landslide, together with the monthly precipitation and the variation velocity of the reservoir water level from March 2009 to December 2012, is used to form the source domain. The data of ZG111 on the Bazimen landslide and ZG93 on the Baishuihe landslide, together with the external incentive conditional space from December 2008 to December 2012, are taken as the target domains. The steps mentioned above are repeated, and separate landslide evolution states are obtained for the transfer of landslide states.

To transfer knowledge from the source domain (\( D_S \)) to the target domain (\( D_T \)), the similarity of the landslide evolution states is determined through feature transformation. To avoid negative transfer in landslide displacement prediction, a hybrid optimization method is proposed to measure the distribution similarity between the datasets in the source and target domains. The decision optimization function of the MFTL algorithm is as follows:

$$ j=\min \eta \upsilon \parallel EMD{D}_S- EMD{D}_T{\parallel}^2+\lambda {d}_f\left({D}_S,{D}_T\right) $$
(10)

where the regularization parameter η represents the degree of punishment for wrongly identified samples, and \( \upsilon \parallel EMD{D}_S- EMD{D}_T{\parallel}^2 \) is the structural risk function. \( EMD{D}_S \) and \( EMD{D}_T \) represent the data distributions in the source domain and target domain, respectively. By minimizing the EMDD according to the Structural Risk Minimization (SRM) criterion, the spatial distributions of data in the source and target domains can be made as similar as possible. \( {d}_f\left({D}_S,{D}_T\right) \) measures the distance between the common features in the source and target domains. The equilibrium parameter λ is related to the risk structure and the distribution distance between domains.

The MFTL strategy aims to align the centroids of the two domains by moving the target domain samples towards the source domain; the movement direction is based on the centroid difference between the two domains, and centroid alignment and isometric scaling methods are applied (Zhu and Ma 2016). After the centroids are moved, the data distributions become similar, and the classifier trained in the source domain can be used to predict landslide displacement for the target landslide. Hence, through the centroid alignment and isometric scaling processes, the data of similar landslides with different sizes are transferred into roughly the same space, which largely eliminates or weakens the scale effect. The objective function of the MFTL method is as follows (Long et al. 2022):

$$ g=\underset{D_S,{D}_T,{\varepsilon}_a^S,{\varepsilon}_b^T,\alpha, \delta }{\min}\frac{1}{2}\parallel {w}_S-{w}_T{\parallel}^2+{\mu}_1\sum \limits_{a=1}^{M_1}{\varepsilon}_a+{\mu}_2\sum \limits_{b=1}^{M_2}{\varepsilon}_b+\sum \limits_{t^{\prime }=1}^{S^{\prime }}R{D}_{\alpha \left({S}_{t^{\prime }}^{\prime }-\delta \right),C{S}_{D_S}} $$
(11)

where \( \frac{1}{2}\parallel {w}_S-{w}_T{\parallel}^2 \) represents the feature difference between the normal vectors in the source domain (\( w_S \)) and target domain (\( w_T \)) in the DKHS. \( {\mu}_1\sum \limits_{a=1}^{M_1}{\varepsilon}_a \) and \( {\mu}_2\sum \limits_{b=1}^{M_2}{\varepsilon}_b \) are second-order soft interval classifiers, \( {\varepsilon}_a \) and \( {\varepsilon}_b \) are the slack variables for fault tolerance, and \( {\mu}_1 \) and \( {\mu}_2 \) are the cost parameters. \( M_1 \) and \( M_2 \) are the numbers of landslide evolution states corresponding to mutation and creep, respectively. \( \alpha \left({S}_{t^{\prime }}^{\prime }-\delta \right) \) refers to the samples transferred from the target domain (\( D_T \)) to the source domain (\( D_S \)). In this process, after the centroid alignment method is applied, the landslide evolution state in the target domain (\( {S}_{t^{\prime }}^{\prime } \)) is changed to \( {S}_{t^{\prime }}^{\prime }-\delta \), where δ denotes the alignment vector. After the isometric scaling method is applied, the state is changed to \( \alpha \left({S}_{t^{\prime }}^{\prime }-\delta \right) \), where α denotes the scaling vector. \( C{S}_{D_S} \) is a class centroid in the source domain, and \( R{D}_{\alpha \left({S}_{t^{\prime }}^{\prime }-\delta \right),C{S}_{D_S}} \) represents the distance between the transferred samples and the class centroids in the source domain.
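The effect of the transform α(S′ − δ) can be illustrated with a small Python sketch. The way α and δ are estimated here is an assumption made for illustration and is not taken from the source: α is a single scalar equal to the ratio of the mean spreads about the centroids, and δ is chosen so that the transferred target centroid coincides with the source centroid. States are flattened to plain coordinate lists, and all names and values are hypothetical.

```python
import math


def centroid(points):
    """Element-wise mean of a list of equally long vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def mean_spread(points, center):
    """Average Euclidean distance of the points from a centre."""
    return sum(math.dist(p, center) for p in points) / len(points)


def fit_alignment(source, target):
    """Estimate a scalar alpha and a vector delta (illustrative assumption).

    alpha matches the spread of the target to that of the source, and delta
    is set so that alpha * (centroid_T - delta) equals centroid_S.
    The target spread must be non-zero.
    """
    c_s, c_t = centroid(source), centroid(target)
    alpha = mean_spread(source, c_s) / mean_spread(target, c_t)
    delta = [t - s / alpha for s, t in zip(c_s, c_t)]
    return alpha, delta


def transfer(points, alpha, delta):
    """Apply alpha * (S' - delta) to every target sample."""
    return [[alpha * (x - d) for x, d in zip(p, delta)] for p in points]


# Two toy domains with the same shape but different location and size.
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
target = [[10.0, 10.0], [14.0, 10.0], [10.0, 14.0], [14.0, 14.0]]

alpha, delta = fit_alignment(source, target)
moved = transfer(target, alpha, delta)
```

In this toy case the target is a shifted copy of the source at twice the size, so the transform maps it back onto the source samples, which is the sense in which the scale effect is removed.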

Subsequently, the transferred landslide evolution states are substituted into the decision optimization function (Eq. 10), and f(x) is further adjusted through the EMDD method (Eq. 9). After the optimal function f(x) is calculated, the projections of the source domain and of the target domain after transfer should be recalculated, as should the function \( {d}_f\left({D}_S,{D}_T\right) \). The updated \( {d}_f\left({D}_S,{D}_T\right) \) is then substituted into Eq. 10, and the process is repeated until the optimal function f(x) is obtained.

However, the spatial distance between the Baishuihe and Baijiabao landslides leads to some differences in characteristics. To reduce errors and enhance the transfer efficiency given the dissimilarity between the two domains, landslide evolution state clustering is performed before the transfer strategy is applied.

After determining the classification of landslide deformation state for each sample in the target domain, the most similar landslide evolution state in source domain should be calculated by similarity measurement. Using the corresponding displacement under the most similar excitation condition, the displacement value in the target domain is obtained. The backward transfer strategy is shown as follows (Long et al. 2022):

  1. For each landslide evolution state in the target domain (\( {S}_{t^{\prime }}^{\prime } \)), there is a closest corresponding landslide evolution state in the source domain (\( S_t \)). The distance is used to measure the similarity between the landslide evolution states in the target domain at the current time \( t^{\prime } \) and in the source domain at time t:

$$ R{D}_{S_t{S}_{t^{\prime }}^{\prime }}=\sqrt{\sum \limits_{k=0}^2\left({\left|{R}_{t-k}-{R}_{t^{\prime }-k}\right|}^2+{\left|{K}_{t-k}-{K}_{t^{\prime }-k}\right|}^2+{\left|\varDelta {W}_{t-k}-\varDelta {W}_{t^{\prime }-k}\right|}^2\right)} $$
(12)

  2. The distance between the external incentive conditional space in the source domain at time \( t+1 \) and in the target domain at time \( t^{\prime }+1 \) is used to measure the similarity:

$$ {D}_{RK}=\sqrt{{\left|{R}_{t+1}-{R}_{t^{\prime }+1}\right|}^2+{\left|{K}_{t+1}-{K}_{t^{\prime }+1}\right|}^2} $$
(13)

  3. The displacement (\( {\omega}_{t+1} \)) corresponding to the external incentive conditional space at time \( t+1 \) in the source domain can be obtained as follows:

$$ {D}_{\mathrm{min}}=\min R{D}_{S_t{S}_{t^{\prime }}^{\prime }}+\min {D}_{RK} $$
(14)

  4. The backward strategy of the isometric scaling and centroid alignment methods is used to determine the unknown displacement value (\( {\omega}_{t^{\prime }+1} \)) in the target domain at the next time \( t^{\prime }+1 \):

$$ {\omega}_{t^{\prime }+1}=\frac{\omega_{t+1}}{\alpha^{\prime }}-{\delta}^{\prime } $$
(15)

where α' refers to the backward scaling vector and δ' refers to the backward alignment vector.
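The backward transfer steps can be sketched in Python under two simplifying assumptions that are not stated in the source: Eq. 14 is read as choosing the single source time that minimizes the sum of the two distances, and the backward scaling and alignment terms are treated as scalars. States are flattened to nine-element lists, and all names and values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long sequences (Eqs. 12 and 13)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def backward_transfer(target_state, target_next_rk, source_records,
                      alpha_b, delta_b):
    """Estimate the target displacement at time t'+1.

    source_records holds tuples (state at t, (R, K) at t+1, displacement at t+1).
    The record minimising RD + D_RK is selected (one reading of Eq. 14),
    and its displacement is mapped back with Eq. 15.
    """
    best = min(
        source_records,
        key=lambda rec: euclidean(rec[0], target_state)
        + euclidean(rec[1], target_next_rk),
    )
    omega_source = best[2]
    return omega_source / alpha_b - delta_b


# Illustrative source records (not monitoring data).
source_records = [
    ([0.0] * 9, (10.0, 0.1), 20.0),
    ([1.0] * 9, (80.0, -0.5), 50.0),
]
target_state = [0.9] * 9
target_next_rk = (75.0, -0.4)

omega_next = backward_transfer(target_state, target_next_rk,
                               source_records, alpha_b=2.0, delta_b=3.0)
print(omega_next)  # 22.0
```

Here the second source record is the closest match in both state and next-step conditions, so its displacement of 50.0 is scaled and shifted back to the target domain.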

4.2 Result Evaluation and Verification

To verify the accuracy and reliability of the MFTL model, its prediction results are compared with those of the Back Propagation Neural Network (BPNN) and Least Squares (LS) methods, which are commonly used in landslide displacement forecasting. The monitoring data and conditional space of the Baijiabao landslide are selected as the inputs, and the displacements of the Bazimen and Baishuihe landslides are output after training with the BPNN and LS methods. Fig. 8 shows that the MFTL method provides the best prediction results and is more accurate than the other two methods in both the creep and mutation states. The error evaluation results for the Bazimen and Baishuihe landslides are shown in Table 1. The MSE, RMSE and MAE values of the MFTL method are lower than those of the other methods, which indicates that the novel model can not only fill data gaps associated with disrupted monitoring, but also provide real-time, whole-process and effective landslide displacement prediction based on accurate forecasts of rainfall and periodic variations in reservoir water levels.

Fig. 8 The prediction results of (a) the Bazimen landslide and (b) the Baishuihe landslide based on the MFTL, BPNN and LS methods, showing observed and predicted values from March 2009 to December 2012 (after Long et al. 2022)

Table 1 The MSE, RMSE and MAE results of the MFTL, BPNN and LS methods (after Long et al. 2022)

5 Conclusions

In the digital era of explosive data growth, it is increasingly urgent to realize intelligent and automatic processing of landslide monitoring data. The literature review shows that ML models, DL models and improved ensemble learning methods are widely applied in landslide displacement prediction owing to their outstanding data processing abilities. In this paper, the GA-CEEMD-RF and GA-CEEMD-SVM algorithms are proposed to decompose the landslide displacement and predict the displacement of hydrodynamic pressure-driven landslides in the CTGR area. They address the difficulty of determining the optimal combination of induced factors and the weak stability of prediction results obtained with a single displacement prediction model.

Considering the limitations of monitoring equipment and missing data, the MFTL approach transfers the knowledge learned from the Baijiabao landslide to two other landslides with incomplete monitoring datasets. The successful transfer between the Baijiabao and Bazimen landslides, which share nearly the same internal and external conditions, shows that the MFTL model is feasible for landslides with high similarity. With the addition of clustering, the model also works well for displacement prediction between more different landslides, such as the Baishuihe and Baijiabao landslides, which shows that the MFTL model exhibits good generalization ability. The predicted results and the traditional error evaluation indexes verify that the MFTL method performs better in landslide displacement prediction than the traditional BPNN and LS methods.

However, data-based mathematical prediction models are “black boxes”: their internal variables and parameters cannot be obtained, and they lack interpretability. Data mining methods also have inherent disadvantages in generality if they fail to consider the physical mechanism of landslides. Hence, there is a growing tendency to construct multi-model coupled methods that combine triggering and instability mechanisms to realize dynamic and reliable landslide deformation prediction.