1 Introduction

Slope debris flows are a common natural disaster in mountainous and hilly areas. They typically form when large amounts of sediment and rock move down steep slopes, and they are characterized by high velocity, large volume, high solid concentration, and destructive power. Their generation process is complex and involves the interaction of many variables, including rainfall, terrain, soil properties, vegetation cover, and human activity [1]. Water is usually the main triggering factor, with rainfall being the most significant source. Currently, researchers both domestically and internationally mainly predict debris flow occurrence from rainfall forecasts, often by comparing the forecast with a critical rainfall threshold for a specific area.

By analyzing historical data from the Huanren Reservoir watershed, Chusheng Xing et al. proposed a multi-model integrated rainfall forecast for the next 1–3 days that combines artificial neural network (ANN), extreme learning machine (ELM), and support vector machine (SVM) prediction models [2]. Their study demonstrates that integrating several machine learning models is feasible and can improve the accuracy of short-term rainfall prediction. Tang et al. gathered 254 debris flow records and daily cumulative rainfall data for their study area and used the long short-term memory (LSTM) method to forecast short-term rainfall [3]. They defined the rainfall warning threshold for debris flows with a statistical classification approach. Comparing the predicted rainfall with this threshold gives the warning level and the likelihood of debris flow occurrence, which yields an integrated warning method. Pradeep [4] proposed a lightweight weather prediction model based on temporal convolutional networks (TCN) and LSTM that can forecast the weather at a selected fine-grained geographical location up to 9 h ahead. Hirschberg et al. [5] used 17 years of rainfall records from the Swiss Alps and 67 mudslide events to determine the critical rainfall threshold. They employed a random forest (RF) model for prediction, which extracted more information from the data and improved warning performance.

Traditional methods of studying slope debris flows rely heavily on the collection and analysis of geological, hydrological, and meteorological data, as well as the construction of empirical and statistical models [6]. For instance, empirical and statistical methods typically use factors such as rainfall and terrain as predictive indicators, and use empirical formulas or statistical models to make predictions and issue warnings.

Integrating multiple sensors with a suitable prediction model is an effective strategy for more accurate slope debris flow prediction and overcomes the drawbacks of conventional techniques. This study proposes a new approach to slope debris flow prediction with three main contributions. First, experiments simulating slope debris flows were carried out, five types of sensor data were collected to examine their precursory warning characteristics, and the TOPSIS entropy method was used to determine the danger level of slope debris flows. Second, a new forecasting model, DA-TCN-BiGRU, is presented. It integrates a dual attention mechanism, a temporal convolutional network, and a bidirectional gated recurrent unit. Third, the model is tested on slope debris flow data, and the experimental results show that it outperforms comparable models in the accuracy of warning detection and prediction.

2 Slope Debris Flow Simulation Platform

Slope debris flows are mixed-phase fluids of water and solids, and their formation process is highly intricate. Heavy rainfall is the external trigger for the onset of slope debris flows, while steep topography and the availability of solid material are inherent conditions that also contribute to their occurrence [7].

The simulation platform was built to reproduce actual slope debris flows. Figure 1 shows the physical platform, which consists of a rainfall simulation device, a sensor measurement device, and a soil loading test box. The rainfall simulation device generates the heavy rainfall needed to trigger slope debris flows. The soil loading test box replicates mountainous terrain, and its hydraulic support and lifting rods can be set to different angles to represent the range of mountain slopes found in nature.

Fig. 1. Slope debris flow simulation experimental platform

Six sensors are mounted on the experimental platform: a tipping-bucket rain gauge, a ground displacement sensor, a soil pressure sensor, a shear wave velocity sensor, and two soil moisture sensors. Figure 2 shows their mounting locations.

Fig. 2. Schematic of sensor installation on slope debris flow simulation platform

3 Methodology

3.1 TOPSIS-Entropy Method

Sensors are used to continuously track a number of factors, such as rainfall, shallow soil moisture content, deep soil moisture content, surface displacement, and soil shear wave velocity, during the slope debris flow simulation process.

Many methods exist for assessing slope debris flow risk from sensor data. In this study, the TOPSIS entropy method [8] was adopted to obtain the risk degree of slope debris flow.

Fig. 3. TOPSIS-entropy calculation flowchart

Figure 3 shows how the sensor data are processed with the TOPSIS entropy method. First, the slope debris flow sensor data undergo simple preprocessing. Then, with \(x_{ij}\) denoting the normalized value of indicator j in sample i, the proportion (weight) \(p_{ij}\) and the entropy value \(e_j\) are calculated using the following formulas:

$$ p_{ij} = x_{ij} /\sum_{i = 1}^N {x_{ij} } $$
(1)
$$ e_j = - \frac{1}{\ln N}\sum_{i = 1}^N {p_{ij} \ln p_{ij} } ,e_j \in [0,1] $$
(2)

From the information entropy of each indicator, compute its information utility value, as shown in Eq. (3).

$$ d_j = 1 - e_j $$
(3)

The weight of each sensor indicator is then obtained by normalizing the utility values over all M indicators:

$$ \omega_j = d_j /\sum_{j = 1}^M {d_j } $$
(4)
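
Eqs. (1) to (4) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than the implementation used in the experiments: the function name `entropy_weights` and the list-of-rows input layout are hypothetical, the data are assumed to be non-negative with a non-zero sum in every column, and the usual convention 0 · ln 0 = 0 is applied.

```python
import math

def entropy_weights(x):
    """Entropy weights for a table x[i][j] (rows = samples, columns = indicators).

    Assumes non-negative values, a non-zero column sum, at least two rows,
    and at least one column whose values are not all equal.
    """
    n_rows, n_cols = len(x), len(x[0])
    utilities = []
    for j in range(n_cols):
        column = [row[j] for row in x]
        total = sum(column)
        # Eq. (1): proportion of each sample within indicator j
        p = [v / total for v in column]
        # Eq. (2): entropy of indicator j, skipping zero proportions (0 * ln 0 = 0)
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n_rows)
        # Eq. (3): information utility value
        utilities.append(1.0 - e)
    # Eq. (4): normalise the utility values into weights
    d_sum = sum(utilities)
    return [d / d_sum for d in utilities]
```

An indicator whose values are identical in every sample has entropy 1 and therefore receives a weight of zero, which is why indicators that vary more strongly during the experiment dominate the risk degree.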

The data are then vector-normalized and multiplied by the weights to construct the weighted matrix:

$$ z_{ij} = x_{ij} /\sqrt {{\sum_{i = 1}^N {x_{ij}^2 } }} $$
(5)
$$ z_{ij}^* = z_{ij} \cdot \omega_j $$
(6)

For each indicator j, find the optimal solution \(z_j^{*+}\) and the worst solution \(z_j^{*-}\), and determine the distance of each sample to the optimal solution, \(D_i^+\), and to the worst solution, \(D_i^-\). The similarity \(C_i\) is then constructed as follows:

$$ \left\{ {\begin{array}{*{20}c} {z_j^{* + } = \max (z_{1j}^* ,z_{2j}^* ,z_{3j}^* , \cdots ,z_{Nj}^* )} \\ {z_j^{* - } = \min (z_{1j}^* ,z_{2j}^* ,z_{3j}^* , \cdots ,z_{Nj}^* )} \\ \end{array} } \right. $$
(7)
$$ \left\{ {\begin{array}{*{20}c} {D_i^+ = \sqrt {{\sum_j {(z_{ij}^* - z_j^{* + } )^2 } }} } \\ {D_i^- = \sqrt {{\sum_j {(z_{ij}^* - z_j^{* - } )^2 } }} } \\ \end{array} } \right. $$
(8)
$$ C_i = D_i^- /(D_i^+ + D_i^- ) $$
(9)
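
The TOPSIS steps of Eqs. (5) to (9) can be sketched in Python as follows. This is a hedged illustration, not the code used in the study: the function name `topsis_closeness` is hypothetical, every indicator is assumed to be of the "larger value means higher risk" type, and the worst solution is taken as the column-wise minimum, as in standard TOPSIS.

```python
import math

def topsis_closeness(x, weights):
    """Relative closeness C_i of each row of x[i][j] to the ideal solution.

    Assumes every column has a non-zero norm and that the rows are not all
    identical (otherwise D+ + D- would be zero).
    """
    n_cols = len(x[0])
    # Eq. (5): vector-normalise each column
    norms = [math.sqrt(sum(row[j] ** 2 for row in x)) for j in range(n_cols)]
    # Eq. (6): apply the indicator weights
    z = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in x]
    # Eq. (7): best and worst value of each indicator
    best = [max(r[j] for r in z) for j in range(n_cols)]
    worst = [min(r[j] for r in z) for j in range(n_cols)]
    closeness = []
    for r in z:
        # Eq. (8): Euclidean distances to the best and worst solutions
        d_plus = math.sqrt(sum((r[j] - best[j]) ** 2 for j in range(n_cols)))
        d_minus = math.sqrt(sum((r[j] - worst[j]) ** 2 for j in range(n_cols)))
        # Eq. (9): relative closeness in [0, 1]
        closeness.append(d_minus / (d_plus + d_minus))
    return closeness
```

With these conventions, a sample that attains the maximum of every weighted indicator gets a closeness of 1, and a sample that attains the minimum of every indicator gets 0.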

3.2 Models

Temporal Convolutional Network. Convolutional Neural Networks (CNNs) are commonly used in image processing, while CNNs adapted to time-series prediction are called Temporal Convolutional Networks (TCNs). Because of the limited size of their convolutional kernels, traditional CNNs cannot effectively capture long-range temporal dependencies in sequential input. In 2018, Shaojie Bai et al. [9] applied CNNs with dilated convolutions to sequence modeling under a causal constraint: the output at a time step depends only on current and earlier inputs, and the dilation enlarges the receptive field so that longer dependencies can be captured. TCNs also have a simple and efficient model structure, and numerous researchers have extended them to multivariate time-series prediction.
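
To make the idea of a dilated causal convolution concrete, the following Python sketch computes one such layer for a single-channel sequence. The function name `dilated_causal_conv` and the zero padding on the left are illustrative assumptions. A real TCN layer has many channels and learned kernels.

```python
def dilated_causal_conv(x, kernel, dilation):
    """Compute y[t] = sum_i kernel[i] * x[t - i * dilation].

    Values of x before the start of the sequence are treated as zero,
    so y[t] never depends on inputs later than time t.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - i * dilation
            if idx >= 0:  # indices before the sequence start act as zero padding
                acc += w * x[idx]
        y.append(acc)
    return y
```

For example, with `kernel = [1, 1]` and `dilation = 2`, each output is the sum of the current input and the input two steps earlier, so the layer spans three time steps while using only two weights.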

The TCN is composed of multiple residual blocks [9]. In each residual block, the output of the convolutional layers is added to the input of the block, and the sum is fed into the next residual block. Where the widths differ, a 1 × 1 convolution is applied to adjust the width of the residual tensor. Because each residual block contains two dilated causal convolutional layers, a block widens the receptive field twice as much as a single causal layer does. The size of the receptive field r can therefore be obtained from Eq. (10), and the required number of residual blocks n from Eq. (11).

$$ r = 1 + \sum_{i = 0}^{n - 1} {2\left( {k - 1} \right)} b^i = 1 + 2\left( {k - 1} \right)\frac{b^n - 1}{{b - 1}} $$
(10)
$$ n = \left\lceil {\log_b \left( {\frac{{\left( {l - 1} \right)\left( {b - 1} \right)}}{{2\left( {k - 1} \right)}} + 1} \right)} \right\rceil $$
(11)

In Eqs. (10) and (11), the variables k and b stand for the size of the convolutional kernel and the dilation base, respectively, and both satisfy the constraint k, b. The variable n denotes the number of residual blocks, and l denotes the length of the input tensor that the receptive field must cover. The 1 × 1 convolution keeps the input and output of the residual block the same size so that they can be added, and the dilated causal convolution prevents the output from using future information.
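
As a worked check of Eqs. (10) and (11), the sketch below evaluates the receptive field and the number of residual blocks needed to cover an input. The function names are hypothetical. The block count is found with an integer loop, which gives the same result as rounding the logarithm in Eq. (11) up to the next integer while avoiding floating-point rounding at exact powers of b.

```python
def receptive_field(k, b, n):
    """Eq. (10): receptive field of n residual blocks (kernel size k, dilation base b)."""
    if k < 2 or b < 2:
        raise ValueError("this sketch assumes k >= 2 and b >= 2")
    # (b**n - 1) is always divisible by (b - 1), so integer division is exact
    return 1 + 2 * (k - 1) * (b ** n - 1) // (b - 1)

def blocks_needed(l, k, b):
    """Smallest n whose receptive field covers an input of length l (Eq. (11))."""
    n = 0
    while receptive_field(k, b, n) < l:
        n += 1
    return n
```

With the kernel size of 8 and the input length of 100 reported in Sect. 4.2, and assuming a dilation base of 2 (the dilation base is not stated in the text), three residual blocks give a receptive field of 99, which is one step short, so `blocks_needed(100, 8, 2)` returns 4, for a receptive field of 211.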

Bidirectional Gated Recurrent Unit. The Gated Recurrent Unit (GRU) is a simplified variant of the LSTM neural network for processing and predicting sequential data [10]. Figure 4 illustrates the structure of the GRU.

$$ r_t = sigmoid\left( {W_r \left[ {h_{t - 1} ,x_t } \right]} \right) $$
(12)
$$ z_t = sigmoid\left( {W_z \left[ {h_{t - 1} ,x_t } \right]} \right) $$
(13)
$$ \hat{h}_t = \tanh \left( {W\left[ {r_t \odot h_{t - 1} ,x_t } \right]} \right) $$
(14)
$$ h_t = \left( {1 - z_t } \right) \odot h_{t - 1} + z_t \odot \hat{h}_t $$
(15)

The specific equations are given in Eqs. (12)–(15). The GRU's update gate is represented by zt, and its reset gate by rt. The gates use the sigmoid activation function, and tanh denotes the hyperbolic tangent activation function. The corresponding weight matrices are Wr, Wz, and W.
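
A single GRU update following Eqs. (12) to (15) can be sketched in plain Python as below. This is an illustrative sketch, not the Keras layer used in the experiments: the helper names `sigmoid`, `matvec`, and `gru_step` are hypothetical, vectors are Python lists, and bias terms are omitted because the equations above do not include them.

```python
import math

def sigmoid(v):
    # Plain logistic function, not hardened against very large negative inputs
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def gru_step(h_prev, x_t, W_r, W_z, W):
    """Return h_t from h_{t-1} and x_t.

    Each weight matrix has len(h_prev) rows and len(h_prev) + len(x_t) columns.
    """
    concat = h_prev + x_t                                  # [h_{t-1}, x_t]
    r = [sigmoid(v) for v in matvec(W_r, concat)]          # Eq. (12): reset gate
    z = [sigmoid(v) for v in matvec(W_z, concat)]          # Eq. (13): update gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t   # [r_t * h_{t-1}, x_t]
    h_hat = [math.tanh(v) for v in matvec(W, gated)]       # Eq. (14): candidate state
    # Eq. (15): blend the previous state and the candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_hat)]
```

With all weights set to zero, both gates equal 0.5 and the candidate state is 0, so the new hidden state is half of the previous one. This makes the role of the update gate in Eq. (15) easy to verify by hand.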

Fig. 4. The structure of GRU

In a conventional GRU, the output at each time step depends only on the current input and the previous hidden state. The Bidirectional Gated Recurrent Unit (BiGRU), in contrast, considers both the earlier and the later inputs and hidden states of each time step in the input sequence. In a BiGRU, the forward GRU processes the time steps from the start to the end of the sequence, while the backward GRU processes them from the end to the start. The output for the complete sequence is created by concatenating the outputs from the two directions [11].

Fig. 5. The structure of BiGRU

Figure 5 depicts the structure of the BiGRU. Its formulas are similar to those of the GRU, but the forward and backward GRU outputs are calculated separately. The BiGRU equations are as follows:

$$ \vec{h}_t = GRU\left( {x_t ,\vec{h}_{t - 1} } \right) $$
(16)
$$ \overleftarrow{h}_t = GRU\left( {x_t ,\overleftarrow{h}_{t + 1} } \right) $$
(17)
$$ h_t = [\vec{h}_t ,\overleftarrow{h}_t ] $$
(18)
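
The bidirectional scheme of Eqs. (16) to (18) does not depend on the internals of the recurrent cell, so it can be sketched with a generic step function. In the sketch below, the name `bidirectional` and the `step(x, h)` signature are illustrative assumptions. In the actual model each step function would be a GRU cell with its own weights.

```python
def bidirectional(xs, step_fwd, step_bwd, h0_fwd, h0_bwd):
    """Run one cell forwards and one backwards over xs, then concatenate.

    xs is a list of input vectors; step(x, h) returns the next hidden
    state as a list.
    """
    fwd, h = [], h0_fwd
    for x in xs:                    # Eq. (16): from the first to the last step
        h = step_fwd(x, h)
        fwd.append(h)
    bwd, h = [], h0_bwd
    for x in reversed(xs):          # Eq. (17): from the last to the first step
        h = step_bwd(x, h)
        bwd.append(h)
    bwd.reverse()                   # re-align backward states with time order
    # Eq. (18): concatenate the two hidden states at every time step
    return [f + b for f, b in zip(fwd, bwd)]
```

Using a running sum as a stand-in cell shows the effect: at each time step the forward half summarises the inputs seen so far, and the backward half summarises the inputs still to come.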

Attention mechanism. The attention mechanism was initially used in image tasks to weight the important features of an image [12]. Computing attention weights can be viewed as querying a set of key-value pairs, and the procedure essentially produces a weight matrix. The first step is to measure the similarity between the query (Q) and the keys (K), which can be done by taking the dot product of the vectors. The second step normalizes these similarity scores so that they can be used directly as weights. The third step computes the weighted sum of the values with these weights to obtain the attention value.

$$ \alpha_t = {\text{softmax}}(Q^T K_t) = \frac{\exp (Q^T K_t)}{{\sum\limits_j {\exp (Q^T K_j)} }} $$
(19)
$$ a = \sum_t {\alpha_t V_t } $$
(20)
$$ Q = W^{q_i } X_t $$
(21)
$$ K = W^{k_i } X_t $$
(22)
$$ V = W^{v_i } X_t $$
(23)

In Eqs. (19) and (20), αt represents the attention weight at time t, softmax is the activation function, a is the weighted sum of the values, and Q, K, and V respectively represent the query, key, and value of the attention mechanism. In Eqs. (21)–(23), Wqi, Wki, and Wvi are the corresponding weight matrices.
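
The three steps of the attention computation (similarity, normalization, weighted sum) can be sketched for a single query as follows. The function name `dot_product_attention` is hypothetical, the linear projections of Eqs. (21) to (23) are assumed to have been applied already, and subtracting the maximum score before exponentiation is a common numerical safeguard that does not change the softmax result.

```python
import math

def dot_product_attention(query, keys, values):
    """Return (weights, context) for one query over a sequence of keys and values."""
    # Step 1: similarity between the query and each key (dot product)
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Step 2: softmax normalisation, Eq. (19)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Step 3: weighted sum of the values, Eq. (20)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context
```

When all keys are equally similar to the query, the weights are uniform and the attention value reduces to the plain average of the values.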

DA-TCN-BiGRU Model. Figure 6 shows the model architecture. The model's input is a time series of slope debris flow sensor data. The input-stage attention mechanism (I-Attn) takes the hidden-layer state at time t-1 and the data from the n sensors at time t as inputs and outputs the attention weights at time t. The output of I-Attn is then passed through the TCN, which is built from residual blocks. The attention mechanism that follows the TCN (T-Attn) produces a weight vector, which is multiplied by the output of the TCN. The weighted result is finally passed through a BiGRU layer, which outputs the prediction value.

Fig. 6. The structure of DA-TCN-BiGRU

Because the sensor data in the slope debris flow experiments are returned to the host computer as a continuous array, the model uses a sliding window to make predictions dynamically, as illustrated in Fig. 7. The input is 6-dimensional sensor data of length Ti, and the output is the slope debris flow danger degree for the upcoming To time steps. The sliding window advances with each time step and produces a new predicted value.

Fig. 7. Sliding window of slope debris flow sensor data
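
The sliding-window scheme can be sketched as a function that turns the continuous sensor array into input/target pairs. The function name `sliding_windows`, the stride of one time step, and the separate `risk` series holding the danger degree are assumptions made for illustration.

```python
def sliding_windows(features, risk, t_in, t_out):
    """Pair each window of t_in feature rows with the next t_out risk values.

    features is a list of per-time-step sensor vectors and risk is the
    danger degree at the same time steps.
    """
    samples = []
    last_start = len(features) - t_in - t_out
    for start in range(last_start + 1):   # the window advances one step at a time
        x = features[start:start + t_in]
        y = risk[start + t_in:start + t_in + t_out]
        samples.append((x, y))
    return samples
```

A series of length L yields L - Ti - To + 1 samples, so a series shorter than Ti + To produces none.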

4 Simulation Experimental Data and Model Validation of Slope Debris Flow

4.1 Simulation Experimental Data and Analysis

Sensors monitor the entire slope debris flow simulation process. For the experiment, the debris flow simulation platform applies rainfall in two phases: early rainfall and heavy rainfall.

Early rainfall: The rainfall intensity is set at 10 mm/h. The phase is divided into 2 stages: rain for 60 min, followed by a pause of 60 min.

Heavy rainfall: The rainfall intensity is set at 100 mm/h. The phase is divided into 2 stages: rain for 30 min, followed by a pause of 60 min.

During the simulated rainfall, the sediment box was held at a 30° angle by a hydraulic lifting rod. The monitoring system collected data every 1 s, resulting in approximately 20,000 data points. These data are used for modeling slope debris flows.

Fig. 8. Experimental data and risk degree of slope debris flow

Figure 8 plots the curves of the acquired slope debris flow dataset. Analyzing the debris flow process and its relationship to the sensor data leads to the following conclusions:

  • Rainfall is a triggering factor for the occurrence of slope debris flows. Its variation directly affects the change in soil moisture content.

  • During the first stage of rainfall, the initial infiltration capacity of the soil is greater than or equal to the rainfall intensity. The rainwater therefore gradually penetrates into the ground and increases the moisture content of the shallow soil layer. Eventually, the shallow soil layer becomes partially saturated, and its moisture content stops changing. In the deeper soil layer, however, the initial moisture content is higher than in the shallow layer because of groundwater. As the rainfall continues and its intensity increases, the infiltration capacity of the surface soil gradually approaches or falls below the rainfall intensity. Water then ponds on the surface, which produces surface runoff and triggers soil erosion and movement.

  • As the moisture content of the entire soil layer changes, the shear strength of the soil also changes. The shear wave velocity represents the magnitude of soil shear strength. When the soil moisture content does not reach the critical moisture content, there is a positive correlation between soil shear strength and shear wave velocity. In contrast, there is a negative correlation between soil shear strength and shear wave velocity when the soil moisture content reaches the critical value. Before surface displacement occurs, the shear wave velocity shows a clear upward trend and the soil pressure also increases, both of which serve as precursory warning features for slope debris flow.

  • Before the slope enters the sliding-flow stage, the soil moisture content reaches saturation and no longer shows a significant increasing trend. When heavy rainfall occurs, the soil starts to slide and enters a flowing state. As the surface soil is eroded and washed away by surface runoff on the slope, the soil loosens and channels form. These channels accelerate the water flow and promote the rapid movement and transport of soil until the soil stabilizes and forms a deposition area at the bottom.

  • The risk level of slope debris flow is determined from the slope debris flow sensor data with the TOPSIS entropy weight method. This method combines surface displacement, soil pressure, shear wave velocity, and rainfall intensity into a comprehensive assessment of the hazard level. It highlights early warning signs and can give advance indication of the slope debris flow risk.

4.2 Model Test

The DA-TCN-BiGRU model is described in Sect. 3 of this paper. To assess its performance, it was compared against the LSTM, GRU, TCN, BiGRU, and Bidirectional Long Short-Term Memory (BiLSTM) models. The performance was evaluated using Root Mean Square Error (RMSE) [13], Mean Absolute Error (MAE) [13], and Mean Absolute Percentage Error (MAPE) [13], which are defined in Eqs. (24)–(26).

$$ MAE = \frac{1}{N}\sum_{i = 1}^N {\left| {\widehat{y}_i - y_i } \right|} $$
(24)
$$ RMSE = \sqrt {{\frac{1}{N}\sum_{i = 1}^N {(\widehat{y}_i - y_i )^2 } }} $$
(25)
$$ MAPE = \frac{100\% }{N}\sum_{i = 1}^N {\left| {\frac{{\widehat{y}_i - y_i }}{y_i }} \right|} $$
(26)
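
The three metrics of Eqs. (24) to (26) translate directly into Python. The sketch below is illustrative, with hypothetical function names. Note that MAPE is undefined when a true value is zero, which can matter for a risk degree that takes values in [0, 1].

```python
import math

def mae(y_true, y_pred):
    """Eq. (24): mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Eq. (25): root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Eq. (26): mean absolute percentage error in percent.

    Raises ZeroDivisionError if any true value is zero.
    """
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

For true values [1, 2, 4] and predictions [2, 2, 2], the absolute errors are 1, 0, and 2, so the MAE is 1, the RMSE is the square root of 5/3, and the MAPE is 50%.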

Model running environment: R5-5600G CPU, Windows 11, NVIDIA GeForce GTX 1070 GPU, 16 GB RAM, Python 3.6, Keras 2.6.0, TensorFlow 2.6.0. The models are tested with two sliding-window settings. The first, “100–10”, predicts a data length of 10 from an input data length of 100. The second, “100–50”, predicts a data length of 50 from an input data length of 100. The two settings therefore share the same input length and differ in prediction length. The DA-TCN-BiGRU model uses the following settings: filters = 32, batch size = 128, kernel size = 8, and gru_units = 16, with Softmax as the activation function of the attention mechanism. The TCN model's parameters are filters = 32, batch size = 128, and kernel size = 8. For the LSTM, GRU, BiLSTM, and BiGRU models, the depth is set to 32 layers and the number of units to 16. Adam is the optimizer, ReLU is the activation function, and the initial learning rate is 0.001. The learning rate is adjusted according to the loss. To prevent overfitting, a dropout of 0.2 is applied. Each model is tested 20 times, and the average performance metrics of each model are summarized in Table 1.

Table 1. Study of different prediction lengths for deep learning models

Table 1 shows that the DA-TCN-BiGRU model outperforms the other prediction models for both the “100–10” and “100–50” sliding windows and exhibits stable performance.

5 Conclusion and Discussion

Predicting the risk of debris flow on slopes is an interdisciplinary research field involving geotechnical engineering, computer science, and other disciplines. In this study, we conducted simulation experiments on slope debris flows and obtained four types of sensor data. With the TOPSIS entropy method, we obtained an objective measure of the risk of slope debris flow that combines factors from both the atmosphere and the ground. To address the nonlinearity and high complexity of data modeling in debris flow prediction, we proposed the DA-TCN-BiGRU prediction model, which accounts for the influence of important information on the prediction and effectively extracts features from the sensor data. The comparative experiments show that the DA-TCN-BiGRU model is effective and feasible for predicting debris flow risk on slopes and has practical engineering significance.