1 Introduction

Estimating the runout distance of debris flows is critical for hazard mitigation because it supports the delineation of dangerous areas and the protection of elements at risk (Hürlimann et al. 2008; Gao and Sang 2017). Predicting runout distance is thus of deep social and scientific significance (Vegliante et al. 2024). Furthermore, debris flows after strong earthquakes can be particularly destructive due to their increased frequency and magnitude (Shieh et al. 2009). This is attributed to extensive sediment deposition resulting from co-seismic landslides settling and accumulating on slopes or along slope toes (Marc et al. 2016). As a result, these loose materials are easily mobilized when heavy rainfall events arrive during the rainy seasons of the following 5–10 years (Iverson et al. 1997; Tang et al. 2009). Additionally, slopes weakened by ground shaking are prone to landslides, which generate further debris and feed into debris flows (Tanyaş et al. 2021). As a consequence, secondary supply and entrainment during the flow process can lead to a depositional volume greater than the initiation volume, causing particularly damaging events (Zhang et al. 2013; Dahlquist and West 2019; Horton et al. 2019), such as river blocking and severe damage to infrastructure and houses on the depositional fan (Hürlimann et al. 2006; Qiu et al. 2022). Therefore, understanding potential post-seismic debris flows and estimating their runout distance on the accumulation fans is significant for hazard reduction.

Many studies have explored the empirical relationships between topographic factors and runout distance. One notable example is a widely recognized equation proposed by Rickenmann (1999), which linked the final debris flow volume and runout distance but performed poorly when applied to predicting runout distance. Similarly, other empirical equations proposed by Legros (2002), Guo et al. (2016), and Falconi et al. (2023) considered the sediment volume (V) to estimate the potential runout distance but presented a relatively high prediction error. Altitude difference (Δhb) between the trigger point of failure mass on slopes and the mobilization point at the slope foot was also considered in equation development (Puglisi et al. 2015). This approach overcame the limitation of application for potential events but failed to describe the complex relationship between sediment volume and altitude difference and runout due to the use of a simple regression method. Similarly, a post-Wenchuan Earthquake multivariate regression equation was formulated, incorporating catchment internal relief (H) and sediment volume (VD) to estimate runout distance. However, this equation fell short in considering the feasibility of potential events (Zhou et al. 2019). Additionally, the removable sediment (VL), catchment area (A), and catchment internal relief (H) were involved in equation development and achieved a high prediction accuracy (Tang, Zhu, et al. 2012). However, this equation ignored the significant role of rainfall in governing runout distance, as rainfall has a close correlation with debris flow initiation volume and subsequently impacts the runout distance (Wilford et al. 2004; Iverson 2014; Fan et al. 2017). Therefore, this study addressed this gap by incorporating triggering rainfall into runout distance prediction. 
However, sparse rain gauge stations and the coarse resolution of Global Precipitation Measurement (GPM) satellite products cannot satisfy the needs for determining the triggering rainfall of a debris flow event. To overcome this, we proposed a hybrid deep learning model to downscale GPM to a very fine spatial and temporal resolution. The downscaled data were then used to develop a prediction model of runout distance. Deep learning, a subdiscipline of machine learning, offers distinct advantages over physics-based models by automatically extracting high-level information from big data (Bergen et al. 2019; Reichstein et al. 2019). Moreover, accurate prediction of runout distance requires a reliable method, because the intricate nature of the initiation, flow, and deposition phases of a debris flow event makes it challenging to describe the processes adequately with empirical approaches. Therefore, deep learning was employed again to develop the prediction model, since it can analyze geo-hazards effectively (Ma and Mei 2021).

In this study, a deep learning model was proposed to predict the potential runout distance of debris flows for a specific catchment after the 2022 Luding Earthquake by considering the topographical and meteorological factors. The historical debris flows after the Wenchuan Earthquake were used to train the prediction model, and then this model was applied to predict the runout distance of potential debris flows in the Luding Earthquake-affected areas.

2 Methodology

In this study, an AI-based method was proposed for GPM downscaling. Then, the downscaled GPM data were calibrated using rain gauge observations, and the refined GPM data were used to develop a prediction model with the involvement of topographical factors, drawing insights from debris flow data after the 2008 Wenchuan Earthquake. Furthermore, we calculated the rainfall thresholds using the calibrated GPM data, offering a tool for issuing warnings regarding potential debris flows subsequent to the Luding Earthquake. Finally, this prediction model was adopted to predict the potential runout distance for a specific catchment when the debris flow events occur after the 2022 Luding Earthquake.

2.1 Study Area and Data Preparation

The Luding Earthquake-affected area is situated at the junction of the Longmenshan and Xianshuihe fault lines, both renowned for triggering numerous devastating quakes, including the 1786 Moxi Earthquake (Ms 7.75), the 2008 Wenchuan Earthquake (Ms 8.0), the 2013 Lushan Earthquake (Ms 7.0), the 2017 Jiuzhaigou Earthquake (Ms 7.0), the 2022 Lushan Earthquake (Ms 6.1), and the recent Luding Earthquake (Ms 6.8) (see Fig. 1). These active fault lines primarily arise from the continuous northward movement of the Indian Plate, which has led to the uplift of the Tibetan Plateau (TP). Consequently, the TP stands as the most extensive elevated terrain on Earth, characterized by a widespread exposure of fractured and eroded rocks. The instability of these rock masses creates favorable geological conditions for potential hazards. Regarding weather patterns, this region experiences a subtropical monsoon climate, with an average annual rainfall of 664.4 mm. However, this value rises to 897.8 mm at altitudes of 1600 m, and to 1941.5 mm at elevations of 3000 m (Xiong et al. 2023). Therefore, during the rainy season, sediments loosened by ground shaking are highly susceptible to mobilization. In terms of lithology, the rock types in the earthquake-affected area are mainly intrusive rocks and sedimentary rocks. The intrusive rocks include granite, monzogranite, and granodiorite, and the sedimentary rocks mainly consist of the Triassic Zagunao Formation, Zhagasha Formation, and Baiguowan Formation. Overall, this is a high seismic hazard area that is also exposed to secondary geo-hydrological hazards, such as landslides and debris flows. A number of debris flows were triggered after the 2008 Wenchuan Earthquake (Fig. 1). Predicting the runout distance of sediments mobilized from co-seismic landslide deposits is therefore important for identifying the dangerous areas.

Fig. 1 Location of the Luding Earthquake-affected areas and historical debris flows after the 2008 Wenchuan Earthquake

2.2 Precipitation Downscaling

The GPM product has a spatial resolution of 0.1°, which is too coarse to determine the triggering rainfall of debris flows given the high heterogeneity of rainfall in mountainous areas (Mahmud et al. 2017). Therefore, to downscale this product to a spatial resolution of 0.01° at a daily temporal resolution, seasonal precipitation was used as an intermediate step, based on the following equations:

$${\text{RGPM}}_{i}^{{0.1^{ \circ } }} \left( {c,s} \right) = {\text{DailyGPM}}_{i}^{{0.1^{ \circ } }} \left( {c,s} \right)/{\text{SeasonalGPM}}_{i}^{{0.1^{ \circ } }} \left( {c,s} \right)$$
(1)
$${\text{DailyGPM}}_{i}^{{0.01^{ \circ } }} \left( {c,s} \right) = {\text{RGPM}}_{i}^{{0.1^{ \circ } }} \left( {c,s} \right)\,{\text{SeasonalGPM}}_{i}^{{0.01^{ \circ } }} \left( {c,s} \right)$$
(2)

where \({\text{DailyGPM}}_{i}^{{0.01^{ \circ } }} \left( {c,s} \right)\) represents the daily precipitation at location c in season s (spring, summer, autumn, or winter). To downscale the precipitation, four regional environmental factors were selected to describe the spatial distribution of precipitation: normalized difference vegetation index (NDVI), elevation, land surface temperature (LST), and geolocation (longitude and latitude). NDVI was selected because of the positive relationship between vegetation density and water supply: most precipitation seeps into the subsoil and raises the water content, and a higher NDVI value is usually associated with denser and healthier vegetation, so NDVI can serve as an indicator of the precipitation distribution. Elevation, derived from a digital elevation model (DEM), was introduced because precipitation varies with elevation (Yao et al. 2016). The DEM data are available at https://search.asf.alaska.edu/#/ and have a spatial resolution of 12.5 m. Geolocation (longitude and latitude) indicates variations in precipitation across flat space (Sachindra et al. 2013; Wang et al. 2022). This factor was downloaded from the NASA website (footnote 1) and has a spatial resolution of 1 km and a temporal resolution of 5 minutes. The LST has a close relationship with precipitation (Trenberth and Shea 2005). A rise in LST can enhance evaporation, increasing the amount of moisture in the atmosphere. This surplus moisture plays a crucial role in cloud formation, ultimately contributing to heightened rainfall. Therefore, LST was also included in precipitation estimation on a regional scale. The NDVI and LST data were both downloaded from the NASA website (footnote 2), with NDVI being a monthly product featuring a spatial resolution of 0.01° (1 km).
The LST data have a spatial resolution of 1 km with a repeat time of 8 days. Considering the significant correlation between these factors and precipitation at the seasonal scale (Chen and Li 2020), NDVI was aggregated to a seasonal scale and then resampled to a spatial resolution of 0.1° (10 km). Similarly, the LST products were aggregated to a seasonal scale and resampled to 0.1°. Elevation data were downloaded from the Alaska Satellite Facility (ASF) website at a spatial resolution of 12.5 m and were likewise resampled to 0.1°.
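The ratio-based disaggregation in Eqs. 1 and 2 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `downscale_daily`, the integer refinement `factor`, the nearest-neighbour replication of the coarse ratio onto the fine grid, and the handling of zero seasonal totals are all hypothetical choices that the text does not specify.

```python
import numpy as np


def downscale_daily(daily_coarse, seasonal_coarse, seasonal_fine, factor=10):
    """Apply Eqs. 1 and 2 to one day of precipitation.

    daily_coarse, seasonal_coarse: 2D arrays on the 0.1 degree grid.
    seasonal_fine: 2D array on the 0.01 degree grid (factor times finer).
    """
    daily_coarse = np.asarray(daily_coarse, dtype=float)
    seasonal_coarse = np.asarray(seasonal_coarse, dtype=float)
    seasonal_fine = np.asarray(seasonal_fine, dtype=float)

    # Eq. 1: share of the seasonal total that fell on this day.
    # Assumption: cells with a zero seasonal total get a ratio of 0.
    ratio = np.divide(
        daily_coarse,
        seasonal_coarse,
        out=np.zeros_like(daily_coarse),
        where=seasonal_coarse > 0,
    )

    # Assumption: every fine cell inherits the ratio of its parent coarse cell.
    ratio_fine = np.kron(ratio, np.ones((factor, factor)))

    # Eq. 2: daily fine-scale precipitation.
    return ratio_fine * seasonal_fine
```

Replicating the coarse ratio keeps the temporal pattern of the original product, while all fine-scale spatial detail comes from the downscaled seasonal field.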

After data collection and preparation, the relationship between the regional environment factors at 0.1° (\({\text{REVs}}^{{0.1^{ \circ } }}\)) and precipitation at 0.1° at the seasonal scale was established using a deep learning method named Dual Attention Network (DAN). The DAN model is composed of a joint attention module and an attention free transformer (AFT), as illustrated in Fig. S1 of the Supplementary Materials (footnote 3), which outlines the mechanism of this model. The attention mechanism is commonly applied in sequential models or tasks involving sequential data, allowing the model to focus on the most informative or relevant parts of the input data rather than treating all input elements equally. This mechanism improves on traditional machine learning models by addressing long-range dependency issues and improving accuracy, and it has delivered cutting-edge results in various natural language processing (NLP) tasks. The AFT serves as an efficient variant of transformers, eliminating the need for dot-product self-attention. This transformer does not require the computation and storage of an attention matrix, while still maintaining global interaction between the query and the values. Therefore, to effectively explore the input-output mapping, we proposed a new model (DAN) combining joint attention and AFT. The process involves reshaping the input data as a sequence, using linear layers to reconstruct the sequence to the target size, and extracting features using attentional blocks and AFT. The input, attention output, and AFT output were then concatenated and fused with a linear layer, followed by reshaping operations to obtain the final outputs. Finally, seasonal precipitation with a spatial resolution of 0.01° was obtained by feeding the seasonal REVs at 0.01° (\({\text{REVs}}^{{0.01^{ \circ } }}\)) into this established relationship. This allowed us to calculate daily precipitation with a spatial resolution of 0.01° based on Eq. 2.
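To make the "no attention matrix" property of an attention free transformer concrete, the sketch below implements the simplest published variant (often called AFT-simple) in NumPy. It is not the authors' DAN implementation: the text does not say which AFT variant was used, and the function name `aft_simple` and the weight matrices `wq`, `wk`, `wv` are hypothetical.

```python
import numpy as np


def aft_simple(x, wq, wk, wv):
    """AFT-simple layer for a sequence x of shape (T, d).

    Each output row is sigmoid(Q_t) * sum_t'(softmax(K)_t' * V_t'),
    where the softmax runs over the sequence axis for every feature.
    No (T, T) attention matrix is ever formed.
    """
    q, k, v = x @ wq, x @ wk, x @ wv

    # Numerically stable softmax over the sequence axis (axis 0).
    k = k - k.max(axis=0, keepdims=True)
    weights = np.exp(k)
    weights /= weights.sum(axis=0, keepdims=True)

    # Global context vector shared by all positions, shape (d,).
    context = (weights * v).sum(axis=0)

    # The query acts as a per-position, per-feature gate.
    gate = 1.0 / (1.0 + np.exp(-q))
    return gate * context
```

Memory grows linearly with the sequence length T, which is the efficiency argument made above. The output could then be concatenated with the input and the attention output before a fusing linear layer, as the text describes.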

2.3 Calibration of Downscaled Precipitation

After the GPM downscaling, we employed geographical differential analysis (GDA) to calibrate the downscaled results (Cheema and Bastiaanssen 2012):

$$\Delta R_{{\left( {x,y} \right)}} = R_{{GPM\left( {x,y} \right)}} - R_{{Mea\left( {x,y} \right)}}$$
(3)
$$\Delta R_{{\left( {x,y} \right)ip}} = \Delta R_{{\left( {x,y} \right)}}$$
(4)
$$R_{Cal} = R_{GPM} - \Delta R_{{\left( {x,y} \right)ip}}$$
(5)

where \(\Delta R_{{\left( {x,y} \right)}}\) represents the difference between the GPM data and the measured values at a given point (x,y). \(R_{{GPM\left( {x,y} \right)}}\) is the GPM data, and \(R_{{Mea\left( {x,y} \right)}}\) represents the measured values at rain gauge stations. \(\Delta R_{{\left( {x,y} \right)ip}}\) is the difference map produced by interpolation. \(R_{Cal}\) is the calibrated value. The inverse distance weighting (IDW) method was employed to produce the difference map due to its simplicity and robustness (Brouder et al. 2005).
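Eqs. 3 to 5 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `idw` and `gda_calibrate`, the IDW power of 2, and the use of plain Euclidean distance on the coordinates are assumptions.

```python
import numpy as np


def idw(xy_known, values, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighting of `values` from known points to query points."""
    xy_known = np.asarray(xy_known, dtype=float)
    xy_query = np.asarray(xy_query, dtype=float)
    values = np.asarray(values, dtype=float)

    # Pairwise distances, shape (n_query, n_known).
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)

    # Clamp tiny distances so a query on top of a gauge takes that gauge's value.
    w = 1.0 / np.maximum(d, eps) ** power
    return (w @ values) / w.sum(axis=1)


def gda_calibrate(gpm_at_gauges, gauge_obs, gauge_xy, grid_xy, gpm_grid):
    """Geographical differential analysis on a set of grid points."""
    diff = np.asarray(gpm_at_gauges, float) - np.asarray(gauge_obs, float)  # Eq. 3
    diff_map = idw(gauge_xy, diff, grid_xy)                                 # Eq. 4
    return np.asarray(gpm_grid, float) - diff_map                           # Eq. 5
```

At a gauge location the calibrated value reproduces the gauge observation, and between gauges the correction is a distance-weighted blend of the nearby differences.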

2.4 Assessment of Downscaling and Calibration Performances

After the calibration of the downscaled data, three assessment indices were used to evaluate the downscaling and calibration performances, including mean absolute error (MAE) (Chai and Draxler 2014), mean square error (MSE) (Hodson et al. 2021), and root mean square error (RMSE) (Chai and Draxler 2014):

$$MAE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left| {y_{iMea} - y_{iCal} } \right|}$$
(6)
$$MSE = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {y_{iMea} - y_{iCal} } \right)}^{2}$$
(7)
$$RMSE = \sqrt {\frac{{\sum\limits_{i = 1}^{n} {\left( {y_{iMea} - y_{iCal} } \right)^{2} } }}{n}}$$
(8)

where \(y_{iMea}\) is the measured value, \(y_{iCal}\) represents the calibrated/downscaled value, and n is the number of prediction values.
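The three indices in Eqs. 6 to 8 translate directly into NumPy. The function name `error_metrics` is illustrative only.

```python
import numpy as np


def error_metrics(measured, estimated):
    """Return (MAE, MSE, RMSE) following Eqs. 6, 7, and 8."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    err = measured - estimated

    mae = np.mean(np.abs(err))  # Eq. 6
    mse = np.mean(err ** 2)     # Eq. 7
    rmse = np.sqrt(mse)         # Eq. 8
    return mae, mse, rmse
```

Because RMSE is the square root of MSE, the two always rank methods identically; MAE is less sensitive to a few large errors.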

2.5 Rainfall Threshold Calculation of Debris Flows After an Earthquake

The rainfall thresholds for post-seismic debris flows undergo a significant drop, primarily stemming from the settlement and accumulation of loose materials across vast slopes and channels. As a result, calculating these thresholds becomes imperative to grasp the hydrological conditions that precipitate post-seismic debris flows. This comprehension is critical for forecasting the runout distance of debris flows in areas impacted by the Luding Earthquake. Leveraging the calibrated precipitation data at 0.01° resolution, we established a relationship between effective antecedent rainfall and intraday rainfall (Pe − Po), which enabled us to estimate the triggering rainfall for post-seismic debris flows. The effective antecedent rainfall can be derived by:

$$P_{e} = \sum\limits_{j = 1}^{m} {P_{j} K_{j} }$$
(9)

where \(P_{e}\) is the effective antecedent rainfall, \(P_{j}\) represents the daily precipitation on the jth day before the occurrence of debris flows, and \(K_{j}\) is a reduction coefficient due to evaporation, for which a suggested value of 0.84 was used in this study (Guo et al. 2013).
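A small sketch of Eq. 9 follows. The text gives only the value 0.84 for the reduction coefficient and does not say how it varies with j, so the sketch exposes both readings: a power decay, where the weight on day j is 0.84 raised to the power j (a common form for antecedent precipitation indices, assumed here as the default), and a constant weight of 0.84 on every day. The function name and the `power_decay` flag are hypothetical.

```python
def effective_antecedent_rainfall(daily_rain, k=0.84, power_decay=True):
    """Eq. 9: weighted sum of daily rainfall before the event.

    daily_rain[0] is the rainfall one day before the event,
    daily_rain[1] two days before, and so on (mm).
    """
    total = 0.0
    for j, p in enumerate(daily_rain, start=1):
        # Assumption: K_j = k**j under power decay, otherwise K_j = k.
        weight = k ** j if power_decay else k
        total += p * weight
    return total
```

With a 10-day antecedent window, `daily_rain` would hold ten values taken from the calibrated 0.01° precipitation at the catchment.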

2.6 Potential Runout Distance Prediction

To effectively anticipate potential debris flows following the Luding Earthquake, it is crucial to develop a prediction model. This model serves as a vital tool for estimating the potential runout distance of debris flows when future precipitation surpasses the calculated rainfall thresholds. The potential runout distance, denoted as D, is of utmost importance in identifying areas vulnerable to debris flows. Specifically, D represents the distance between the initial deposition point (fan apex) and the lowest point of the depositional fan. Given the geometric factors involved, it is reasonable to assume that D, to a certain extent, depends on the sediment supply of the debris flow.

2.6.1 Characteristics of Debris Flow Activity

To achieve runout prediction, five factors were selected for model development: catchment area (A), average channel gradient (J), main channel length (L), intraday rainfall (Po), and sediment volume (VL). A dataset comprising 84 debris flows in the three areas (see Fig. 1) after the Wenchuan Earthquake was compiled from past studies for all factors except intraday rainfall (Tang et al. 2011; Tang, Van Asch, et al. 2012; Tang, Zhu, et al. 2012). Of these, 72 debris flows were used for model training, with the remaining 12 reserved for model testing to assess the prediction accuracy and reliability of the developed model. Finally, the DAN model was employed to develop the prediction model, with the collected data and calibrated precipitation as inputs. These historical debris flows are channelized debris flows with channel gradients ranging from 4.75° to 43.6°. Over 90% of the channel gradients lie between 10° and 40°. The channel length in the training dataset ranges from 0.46 to 10.23 km. The catchment area ranges from 0.12 to 24.42 km², with approximately 45% of catchments smaller than 1 km². Intraday rainfall of these events ranged from 33.18 to 96.47 mm. Approximately 73% of the catchments experienced a daily rainfall greater than 50 mm.

2.6.2 Runout Distance Prediction

Following the model development and assessment, data preparation was required before this prediction model can be employed for runout distance prediction. Therefore, we estimated the sediment volume based on the interpreted co-seismic landslides (Xiao et al. 2023; Zhang et al. 2023). Subsequently, the prediction area was delineated, and this space was partitioned into multiple debris flow catchments. This allowed the determination of sediment volume (VL) within each catchment. After that, the catchment area, channel length, and average channel gradient were calculated using GIS. As for the intraday rainfall, the Pe − Po relationship was applied here to estimate the possible runout distance when the sediments in the debris flow catchments are all mobilized.

3 Results and Analysis

This section introduces the results of precipitation downscaling and calibration, and further presents the results of applying the precipitation data to runout model development and prediction of debris flows after the 2022 Luding Earthquake.

3.1 Downscaling and Calibration Results

We selected summer 2010 to present the process of downscaling GPM data. To increase the robustness of the established model, a wide area of 5.25 × 10⁵ km² was considered so that sufficient data were involved in model development. Before model training, correlation analysis between each selected factor and precipitation was conducted to ensure that the factors contribute positively to model development. The results show that all the factors have a strong relationship with precipitation, since all the Pearson's coefficients are larger than 0.7. After that, the input data were divided into a training dataset and a testing dataset with a ratio of 7:3. The activation function was the Gaussian error linear unit (GELU), which is commonly used in transformers and performs better than traditional activation functions such as the rectified linear unit (ReLU) (Lee 2023). To avoid overfitting, leave-one-out cross validation was conducted. Training was run for 300 iterations, since no obvious decrease in loss was observed beyond that point. The model training achieved a train loss of 0.016, and the MSE of model validation is 1 × 10⁻⁵. This is because the combination of linear and one-dimensional convolution blocks can increase the ability of feature extraction and the rapid processing of large-scale data, and this method can effectively avoid overfitting. Therefore, this model was used to predict the seasonal precipitation at 0.01° from the seasonal REVs at 0.01°.

To test the superiority of this deep learning model in downscaling GPM data, a machine learning method (XGBoost) and geographically weighted regression (GWR) were introduced to conduct comparison analysis. We randomly selected the rain gauge stations in this area to compare the downscaling performance of the three methods. The assessment results using MAE, MSE, and RMSE are presented in Table 1.

Table 1 Performance assessment of the three methods at a seasonal scale

As illustrated in Table 1, the DAN model exhibited superior performance in estimating the seasonal GPM at 0.1°. The downscaled results were then used to compute daily GPM at 0.01° based on Eqs. 1 and 2. Subsequently, the calculated daily GPM at 0.01° was calibrated using the monitoring values from rain gauge stations. The calibration process used 80% of the rain gauge data, while the remaining 20% was reserved for validation. The precipitation distribution on 14 August 2010 was selected to present the calibration process (see Fig. S2 in the Supplementary Materials, footnote 3).

As depicted in Fig. S2b, remarkable changes in precipitation distribution are evident in the calibrated data. The maximum precipitation increases to 111.45 mm after calibration. The discrepancy between downscaled and calibrated precipitation can be attributed to the intermittent measuring mode of GPM, in which the 0.5-hour interval may miss high-intensity rainfall events. The incorporation of rain gauge data proved to be an effective supplement for refining the downscaled precipitation. Apart from the measuring mode, high-intensity rainfall may interfere with the transmission of radio-frequency radiation, disrupting GPM satellite reception. In this case, the GDA method can effectively mitigate the problem because it spatially accounts for the values of nearby samples and thereby reduces the errors.

3.2 Performance Assessment

After the precipitation downscaling and calibration, performance assessment is essential to evaluate the application feasibility of the calibrated results to runout predictions of debris flows. The errors of original GPM (OriGPM), downscaled GPM (DownGPM), and calibrated GPM (CaliGPM) are presented in Fig. 2 using the three evaluation indices.

Fig. 2 Mean absolute errors (MAEs), mean square errors (MSEs), and root mean square errors (RMSEs) of 15 days before the occurrence of debris flows on 14 August 2010

Figure 2 demonstrates a comparable trend in the variation of MAEs, MSEs, and RMSEs. When compared to OriGPM and DownGPM, CaliGPM emerges as the superior performer, with all error metrics approaching zero. Notably, CaliGPM exhibits minimal fluctuations in its curves, except for the 11th day, hinting at potential instability in GPM’s capability to capture intense rainfall events. However, the application of downscaling and calibration techniques significantly mitigates this error. Consequently, the combination of the DAN model and GDA method effectively complements the process of downscaling GPM data to an exceptionally fine spatial and temporal resolution in mountainous regions.

To further scrutinize the effectiveness of the method across various timeframes, we analyzed the MAEs, MSEs, and RMSEs for daily, monthly, and seasonal periods (see Fig. S3 in the Supplementary Materials, footnote 3). The results reveal that the DownGPM curves follow a similar trend to OriGPM, yet DownGPM exhibits remarkable improvements, characterized by reduced MAE, MSE, and RMSE values across all three temporal scales. Moreover, when DownGPM was calibrated using rain gauge data, even lower error values were observed.

Comparing the three temporal scales, only minor differences among the three datasets are discernible at the daily level. However, as the time period moves to the monthly scale, a significant rise in error is observed for OriGPM, DownGPM, and CaliGPM. This is attributed to the accumulation of daily errors, resulting in higher monthly rainfall errors (Chen et al. 2020). Nevertheless, as the period expands to a seasonal timescale, the average rate of error increase declines. For instance, the MAEs for OriGPM and DownGPM decrease by 50% and 41%, respectively, while the rate of error increase for CaliGPM falls by 92%. This reduction in errors is attributed to the deep learning method's ability to establish a strong relationship between precipitation and REVs at a seasonal timescale.

Moreover, the temporal analysis revealed enhanced stability and accuracy in CaliGPM data. Although the assessment values of CaliGPM exhibit an increase as daily data accumulates to monthly and seasonal timescales, notable fluctuations are absent across these three temporal perspectives. Therefore, the calibration process not only mitigates the potential escalation of deviations but also underscores the suitability of a seasonal timescale for spatially downscaling GPM data. In this case, we used the downscaled data to calculate the rainfall thresholds to develop a runout distance prediction model.

3.3 Rainfall Threshold Calculation

The impact of antecedent rainfall in triggering debris flows has been addressed by past studies (Crozier and Eyles 1980; Glade et al. 2000), with suggestions for an antecedent rainfall period extending up to 10 days (Guo et al. 2013). Rainfall within this timeframe preceding debris flow incidents can elevate soil water content, inducing slope instability. However, considering the evaporation during this period, the effective antecedent rainfall (Pe) was employed to establish an empirical relationship with intraday rainfall. The resulting regression equation yields R-squared and adjusted R-squared values of 0.63 and 0.62, respectively. Significance testing confirms the statistical significance of the estimated coefficients, as the P value is less than 0.05. The regression relationship is as follows:

$$P_{o} = 179.76P_{{\text{e}}}^{ - 0.29}$$
(10)

where \(P_{o}\) is the intraday rainfall (mm), and \(P_{e}\) represents the effective antecedent rainfall (mm) before the debris flow occurrence. This equation can be used to estimate the triggering precipitation of debris flows after the Wenchuan Earthquake.

For instance, with an effective antecedent rainfall of approximately 60 mm over a 10-day span, the loose materials within debris flow catchments are anticipated to become mobilized when intraday rainfall reaches 54.83 mm. Likewise, if the effective antecedent rainfall reaches 30 mm during the same duration, debris flows are likely to occur when intraday precipitation surpasses 67 mm. Hence, debris flows are more probable when the combination of effective antecedent rainfall and intraday rainfall closely aligns with the regression line.
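The two worked values above follow directly from Eq. 10, as the short sketch below shows. The function names are illustrative, and treating the regression line as a simple exceedance test is an assumption made for the example, not a rule stated in the text.

```python
def intraday_threshold(pe_mm):
    """Eq. 10: intraday rainfall (mm) on the regression line for a given
    effective antecedent rainfall pe_mm (mm, must be positive)."""
    return 179.76 * pe_mm ** -0.29


def exceeds_threshold(pe_mm, po_mm):
    """Assumption: flag a day when intraday rainfall reaches the regression line."""
    return po_mm >= intraday_threshold(pe_mm)


if __name__ == "__main__":
    for pe in (30.0, 60.0):
        print(f"Pe = {pe:.0f} mm -> Po = {intraday_threshold(pe):.2f} mm")
```

For an effective antecedent rainfall of 60 mm the function returns about 54.83 mm, and for 30 mm about 67 mm, matching the values quoted in the text. The negative exponent expresses that wetter antecedent conditions lower the intraday rainfall needed to trigger a debris flow.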

3.4 Runout Distance Prediction

Similar topographical conditions of the Wenchuan Earthquake-affected and Luding Earthquake-affected areas can further ensure the reliability and accuracy of runout distance estimation.

3.4.1 Prediction Model Development

To forecast the runout distance of debris flows following the Luding Earthquake, we developed a prediction model using historical debris flow datasets from three regions (Yingxiu, Qingping, and Beichuan, see Fig. 1) subsequent to the Wenchuan Earthquake. In addition to the factor selection, the DAN model was employed again to develop a prediction model to achieve a high prediction accuracy. Following the model development, 12 debris flows were used to validate the prediction model, as shown in Table 2.

Table 2 Dataset for model testing

The historical debris flows were divided into a training set and validation set with a ratio of 7:3. To visualize the training process, we selected the Smooth-L1-Loss to reflect the difference between predicted and actual values:

$${\text{Smooth}}_{{{\text{L1}}}} \left( x \right) = \left\{ {\begin{array}{*{20}c} {0.5x^{2} } & {if\left| x \right| < 1} \\ {\left| x \right| - 0.5} & {{\text{otherwise}}} \\ \end{array} } \right.$$
(11)

The derivative of Eq. 11 is:

$$\frac{{d{\text{smooth}}_{{{\text{L1}}}} }}{dx} = \left\{ {\begin{array}{*{20}c} x & {if\left| x \right| < 1} \\ { \pm 1} & {{\text{otherwise}}} \\ \end{array} } \right.$$
(12)

This loss function is widely used for evaluating the performance of deep learning (DL) models and combines the advantages of L1 loss and L2 loss. L1 loss is seldom used for evaluating the difference between actual and predicted values due to its poor performance in dealing with complex DL problems. L2 loss is sensitive to outliers, which may cause instability during model development. Smooth-L1-Loss limits the impact of outliers on model convergence, and we finally achieved a Smooth-L1-Loss of 2.8 × 10⁻⁷. The testing results using the dataset in Table 2 are plotted in Fig. 3.
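Eqs. 11 and 12 can be written element-wise in NumPy as below. This is a sketch for illustration; deep learning frameworks ship their own versions of this loss, and the function names here are not taken from the paper.

```python
import numpy as np


def smooth_l1(x):
    """Eq. 11: quadratic for |x| < 1, linear otherwise (x = predicted - actual)."""
    x = np.asarray(x, dtype=float)
    a = np.abs(x)
    return np.where(a < 1.0, 0.5 * x ** 2, a - 0.5)


def smooth_l1_grad(x):
    """Eq. 12: gradient equals x inside the quadratic zone and +/-1 outside."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < 1.0, x, np.sign(x))
```

The gradient is capped at magnitude 1 for large residuals, which is why outliers disturb training less than under L2 loss, while near zero the gradient shrinks smoothly, unlike L1 loss.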

Fig. 3 Scatter diagram of the predicted results

Before employing our developed model to estimate the potential runout distance of future debris flows following the Luding Earthquake, we conducted a comparative analysis. We introduced an empirical equation proposed by Tang, Zhu, et al. (2012) to assess the debris flow runout distance in areas affected by the Wenchuan Earthquake. The predicted outcomes demonstrate that our model outperforms this empirical equation, as the predicted values align more closely with the measured values. Moreover, our model achieved an MAE value of 0.036, while the empirical equation had an MAE of 0.076. This MAE difference signifies a 52.6% increase in accuracy, enabling us to reduce the error in runout distance prediction by approximately 40 m. The values of the other two assessment indices for both methods are presented in Table 3.
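The reported improvement can be checked from the two MAE values, assuming (as the 40 m figure implies) that both are expressed in kilometers:

$$\frac{0.076 - 0.036}{0.076} \approx 0.526 = 52.6\% ,\qquad 0.076\;{\text{km}} - 0.036\;{\text{km}} = 0.040\;{\text{km}} = 40\;{\text{m}}$$

Strictly, the 52.6% figure is the relative reduction in MAE with respect to the empirical equation.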

Table 3 Prediction assessment of the two methods

The test dataset, as depicted in Table 3, suggests a tendency for overestimation in the predicted values of runout distance. This is likely attributed to the predominance of granitic rock underlying the debris flows in the test area, coupled with shorter torrent channels, resulting in smaller runout zones with coarser deposits on the fans. Despite this, the error percentage was deemed acceptable for comparison with other empirical models. For example, Legros (2002) used historical debris flows with a long runout to develop an equation and achieved an average mean absolute percentage error \(\left( {{\text{MAPE}} = \left( {100\% /m} \right) \times \sum\limits_{i = 1}^{m} {\left| {\left( {y_{{i,{\text{pre}}}} - y_{{i,{\text{true}}}} } \right)/y_{{i,{\text{true}}}} } \right|} } \right)\) of 48%. However, this value in our study was calculated as 18%. Furthermore, a much higher MAPE is found in the equation of Guo et al. (2016), which is 65%. Overall, the developed model in this study can perform well in estimating the runout distance of debris flows following the strong earthquake.
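The MAPE used for this comparison can be computed as in the sketch below; the function name `mape` is illustrative, and the inputs are assumed to be strictly positive measured runout distances.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent, as defined in the text."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Each error is scaled by the measured value, so short and long
    # runouts contribute on an equal relative footing.
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))
```

Because MAPE is relative, it allows equations calibrated on long-runout events to be compared with a model trained on shorter post-seismic runouts.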

3.4.2 Volume of Sediment Supply

The Luding Earthquake triggered a significant number of co-seismic landslides, at least 10,869 in total (as depicted in Figs. 1 and 4a), leaving loose materials scattered across slopes and their base areas. Estimating the volume of individual landslides within vast landslide-prone regions poses a formidable challenge. However, this can be tackled effectively by utilizing empirical correlations that link the volume of individual landslides to their geometrical measurements observed in the field. In our study, we determined the landslide volume (VL) using an empirical equation put forth by Parker et al. (2011). Consequently, the sediments generated by the landslides within a given catchment can be represented as follows:

$$V_{L} = \sum\limits_{i = 1}^{n} {\alpha A_{i}^{\gamma } }$$
(13)

where \(A_{i}\) is the area of the ith landslide, n is the number of landslides in the catchment, and α and γ are empirical coefficients, suggested as 0.106 and 1.338, respectively. To identify landslides within each catchment, we collected a comprehensive series of images, spanning from high-resolution unmanned aerial vehicle (UAV)-based imagery (resolution < 1 m) to multi-satellite data from Sentinel 2 (~ 10 m resolution), Beijing-3 (< 2 m resolution), Gaofen-2 (2–3 m resolution), Gaofen-6 (6 m resolution), and Google Earth (< 2 m resolution). This multi-source approach ensured thorough coverage of our study area. Within these images, loose landslide debris zones are clearly distinguishable by their shapes, colors, and textural features, which set them apart from the surrounding terrain. By analyzing these observable characteristics, we interpreted and delineated the shapes, dimensions, and types of seismic landslide debris on the images.
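As a minimal sketch of how Eq. 13 can be evaluated for one catchment, the Python snippet below sums the area-volume relation over a list of mapped landslide areas. The function names and the example areas are hypothetical; the coefficients are the values quoted above, and areas are assumed to be in m² with volumes in m³.

```python
from typing import Iterable

ALPHA = 0.106  # empirical coefficient quoted in the text
GAMMA = 1.338  # empirical exponent quoted in the text


def landslide_volume(area_m2: float, alpha: float = ALPHA, gamma: float = GAMMA) -> float:
    """Volume of a single landslide from its mapped area: alpha * A**gamma."""
    if area_m2 < 0:
        raise ValueError("landslide area must be non-negative")
    return alpha * area_m2 ** gamma


def catchment_sediment_volume(
    areas_m2: Iterable[float], alpha: float = ALPHA, gamma: float = GAMMA
) -> float:
    """Eq. 13: sum of single-landslide volumes over all landslides in a catchment."""
    return sum(landslide_volume(a, alpha, gamma) for a in areas_m2)


# Hypothetical landslide areas (m^2) for one catchment, not data from the study.
example_areas = [800.0, 5224.0, 12000.0]
print(f"{catchment_sediment_volume(example_areas):.3e} m^3")
```

Because the exponent is larger than 1, a few large landslides contribute disproportionately to the catchment total.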

Fig. 4 The study area after the 2022 Luding Earthquake

These landslides are distributed across six counties; as depicted in Fig. 4a, 53% of the events are found in Shimian County and approximately 47% in Luding County. Satellite images show that 2,206 landslides have an area larger than 5224 m², which indicates that medium-scale (10⁴ < VL < 10⁵ m³) and large-scale (VL > 10⁵ m³) landslides account for 20.3% of the total, based on the landslide area-volume equation proposed by Parker et al. (2011). The largest landslide area reaches 3.6 × 10⁵ m². Furthermore, 24%, 40%, 54%, 69%, and 79% of the landslides lie within 1, 2, 3, 4, and 5 km of the Moxi Fault, respectively.
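The 5224 m² area threshold can be recovered by inverting the area-volume relation of Eq. 13 for a single landslide, using α = 0.106 and γ = 1.338 and assuming areas in m² and volumes in m³:

$$A = \left( {\frac{V}{\alpha }} \right)^{1/\gamma } ,\qquad A\left( {10^{4} \;{\text{m}}^{3} } \right) = \left( {\frac{{10^{4} }}{0.106}} \right)^{1/1.338} \approx 5224\;{\text{m}}^{2}$$

By the same inversion, the lower bound of the large-scale class (10⁵ m³) corresponds to a landslide area of roughly 2.9 × 10⁴ m².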

3.4.3 Runout Distance Prediction after the Luding Earthquake

The loose materials can be mobilized by rainfall and evolve into debris flows, ultimately causing severe damage to property. Therefore, a reliable prediction of runout distance is necessary. We employed the developed model to predict the possible runout distance for each catchment should a debris flow occur. As indicated in Fig. 4b, a total of 42 catchments were delineated (C1–C42). The largest catchment (C1) covers 226.35 km² and contains 146 landslides, whereas the smallest (C13) covers only 0.33 km². Regarding the volume of removable sediments resulting from co-seismic landslides, C4 has the largest volume, 73.69 × 10⁶ m³, attributed to the occurrence of 1834 landslides. In contrast, despite its large area of 74 km², C2 contains only 79 landslides. C3 contains 607 landslides, or 5.6% of all recorded landslides, and C5 and C6 together contain 1021 landslides. The minimum sediment volume is observed in C30, where VL is only 0.03 × 10⁶ m³.

After the preparation of topography-related factors, the intraday rainfall was determined based on Eq. 10. We assumed an intraday rainfall of 100 mm and then predicted the possible runout distance for each catchment. The prediction results indicate that C4 exhibits the longest potential runout distance, reaching 0.77 km. This long distance is primarily attributed to the large amount of removable sediments, which possess high potential energy and enhanced mobility (Dahlquist and West 2019). Similarly, C3 and C6 contain 25.60 × 10⁶ m³ and 16.25 × 10⁶ m³ of materials, resulting in runout distances of 0.54 and 0.48 km, respectively. In general, sediment volume is significant in controlling the final runout distance. However, C1 has a sediment volume of only 3.07 × 10⁶ m³ yet achieves a runout of 0.48 km. This extended runout can be ascribed to its long channel (23.86 km), which allows a debris flow to entrain the existing sediments within the catchment and thereby achieve a long runout distance (de Haas and Densmore 2019). Although the entrainment of sediments is not explicitly considered during the development of the prediction model, the measured runout distance values used for model training are influenced by various factors. Therefore, the entrainment effect during mass flow can be implicitly incorporated into model training through the mapping relationship between measured runout values and influencing factors. However, due to the difficulty in obtaining some influencing factors, not all factors impacting the final runout can be included in model development. In this case, the application of a deep learning model proves beneficial, as it can discover intricate structures in limited input data (LeCun et al. 2015). The developed prediction model goes beyond statistical analysis by discovering the mapping relationship between input variables and output results, thereby learning complex functions for further analysis.

To further reveal the impacts of the selected factors on runout performance, we plotted the normalized runout distance against each factor in Fig. 5. As illustrated in Fig. 5a, the runout distance per unit area decreases as catchment area increases, which may indicate the complexity of material movement within a large catchment before reaching the start point of the accumulation fan. Because the movement route within a unit area is short, a greater number of movement pathways is needed for materials to reach the outlet of the watershed. The predicted runout distance in each catchment for Po ranging from 40 to 100 mm is presented in Table S1 of the Supplementary Materials.

Fig. 5 Relationships between normalization of runout distance and catchment area (a), channel gradient (b), channel length (c), and sediment volume (d)

4 Discussion

The deep learning approach implemented in this study, specifically the Deep Autoencoder Network (DAN), performed well in downscaling the Global Precipitation Measurement (GPM) data. While numerous studies have relied on the geographically weighted regression (GWR) method for precipitation downscaling, its accuracy differs from that of deep learning or even traditional machine learning models. This divergence stems from GWR's limitation in managing vast datasets. As datasets grow, the computational demands of GWR may increase exponentially, forcing the method to resort to resampling from high-resolution to coarser units to sustain processing speed (Harris et al. 2010). This processing approach leads to fluctuating accuracy in downscaled precipitation data.

To address this challenge, machine learning methods like XGBoost are considered for their ability to discern latent relationships between input and output variables at high processing speeds. However, traditional machine learning techniques may not fully satisfy the requirements of handling spatial or temporal complexities (Reichstein et al. 2019). Therefore, the deep learning method introduced in this study was adopted to enhance seasonal prediction ability. This method can learn highly intricate functions by transforming the representation at one level into a representation at a higher level using simple yet non-linear modules. Consequently, deep learning outperforms other machine learning techniques across numerous domains (LeCun et al. 2015). Our research further supported the advantage of the deep learning approach in using remote sensing data to describe the spatial distribution of precipitation at a seasonal scale.

As for the developed prediction model for debris flow runout distance, we leveraged the debris flow data after the 2008 Wenchuan Earthquake to predict the runout distance of potential debris flows following the 2022 Luding Earthquake. To further validate the reliability and accuracy of our model in predicting the runout distance of debris flows in this area, we again employed the equation proposed by Tang, Zhu, et al. (2012) to calculate the potential runout distance should a debris flow event occur. A common requirement for applying an empirical equation or a machine learning model is that the target area should have topographic conditions similar to those of the areas where the equation or model was developed. Approximately 64.2% of the results calculated with the equation of Tang, Zhu, et al. (2012) fall within the range between the minimum and maximum values in Table S1 of the Supplementary Materials, where these results are also listed. As indicated by Tang, Zhu, et al. (2012), this equation can achieve a reliable estimation of runout distance for debris flows. Therefore, our model can be expected to provide a reliable and relatively accurate prediction of future debris flows.
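The consistency check described above amounts to counting how many empirical estimates fall inside the model's predicted range for the same catchment. The Python sketch below illustrates that count; it assumes the minimum and maximum are taken per catchment, and the function name and all numerical values are hypothetical rather than taken from Table S1.

```python
from typing import Sequence, Tuple


def share_within_range(
    estimates: Sequence[float], ranges: Sequence[Tuple[float, float]]
) -> float:
    """Percentage of estimates lying inside their [lower, upper] range."""
    if len(estimates) != len(ranges) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for value, (lower, upper) in zip(estimates, ranges) if lower <= value <= upper
    )
    return 100.0 * hits / len(estimates)


# Hypothetical runout distances (km): one empirical estimate and one
# predicted (min, max) range per catchment. Not values from the study.
empirical = [0.21, 0.40, 0.12, 0.75]
predicted_ranges = [(0.15, 0.30), (0.20, 0.35), (0.10, 0.18), (0.50, 0.80)]
print(share_within_range(empirical, predicted_ranges))  # prints 75.0
```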

This approach rests on the assumption that all the debris flow events used for model development exhibit comparable topographic and weather conditions, which supports both prediction accuracy and model stability and thereby enhances the reliability of this model for runout predictions. However, we are also aware that the current training data may limit the widespread applicability of this model. Accurately predicting the runout distance of a debris flow is intricate, encompassing a thorough analysis of sediment initiation, flow, and deposition. These processes require a comprehensive understanding of topographic features, meteorological conditions, environmental factors, and geological settings. The deep learning model offers a viable alternative: it simplifies the flow process without relying on fluid and mechanical analyses while still enabling reliable predictions of runout distance. Although errors may not be avoided when debris flows occur in the future, our prediction results can help improve hazard mitigation in the Luding Earthquake-affected areas. Additionally, we assumed that the materials generated by the co-seismic landslides within each catchment are all initiated in one rainfall event, overlooking entrainment and loss during the flow process. This assumption may introduce prediction errors relative to actual runout distances, but further improvement is anticipated as additional data from diverse areas and regions are included in model development. Overall, these uncertainties do not alter the finding that our prediction model can achieve a reliable prediction of debris flow runout distance after a strong earthquake. As a result, an effective mitigation plan can be formed for hazard reduction.

5 Conclusion

Drawing on the historical debris flow data of the 2008 Wenchuan Earthquake, our study introduced a deep learning model to forecast the potential runout distance of debris flows in the 2022 Luding Earthquake-affected areas. This model integrates five critical topographic and meteorological factors, namely catchment area (A), channel length (L), channel gradient (J), sediment volume (VL), and intraday rainfall (Po). The key findings are:

(1) The deep learning-based model, named DAN, excels in downscaling GPM data, resulting in the calibrated GPM data with an average MAE of 1.56 mm. It outperforms both OriGPM (MAE 4.25 mm) and DownGPM (MAE 3.83 mm). Furthermore, the calibrated GPM data achieves the MSE and RMSE values of 4.45 mm² and 2.10 mm, respectively. These refined and reliable precipitation datasets hold immense potential for debris flow warnings in mountainous regions.

(2) The established Pe − Po relationship based on the debris flow data after the 2008 Wenchuan Earthquake can be used to decide the 24 h triggering rainfall for debris flows and therefore develop a prediction model.

(3) The prediction model we have developed, leveraging data from debris flows following the 2008 Wenchuan Earthquake, achieves a remarkable Smooth-L1-loss of 2.8 × 10⁻⁷, demonstrating its effectiveness and reliability in estimating debris flow runout distances. Moreover, testing results indicate that our model outperforms existing empirical equations in this region, reducing the prediction error by 40 m, and attaining a MAPE value that is smaller than those of other empirical methods.

(4) The prediction results after the Luding Earthquake show that C4 may have the longest runout distance, reaching 0.77 km when Po is 100 mm. Approximately 90% of the prediction results range from 0.1 to 0.35 km. Additionally, the prediction results were validated using the equation of Tang, Zhu, et al. (2012), and over 60% of the calculation results are distributed in the prediction range by our model.
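For readers unfamiliar with the Smooth-L1 loss reported in finding (3), the Python sketch below shows its common definition: quadratic for small residuals and linear for large ones. The threshold `beta = 1.0` is an assumption, since the value used in the study is not stated in this section, and the function names are hypothetical.

```python
from typing import Sequence


def smooth_l1(pred: float, target: float, beta: float = 1.0) -> float:
    """Smooth-L1 loss for one sample: quadratic below beta, linear above it."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff ** 2 / beta
    return diff - 0.5 * beta


def mean_smooth_l1(
    preds: Sequence[float], targets: Sequence[float], beta: float = 1.0
) -> float:
    """Average Smooth-L1 loss over a set of predictions."""
    if len(preds) != len(targets) or len(preds) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(smooth_l1(p, t, beta) for p, t in zip(preds, targets)) / len(preds)
```

For residuals well below the threshold, as is typical for normalized targets, the loss behaves like half the squared error, so very small loss values correspond to very small residuals on the normalized scale.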

Runout distance estimation of debris flows is significant, especially for future debris flows. This is because reasonable allocation of resources could be conducted for mitigating the possible damage to properties. In some areas that may suffer debris flows with a very short runout distance, there is no immediate need for mitigation or prevention measures. For catchments where a long runout distance may occur when a heavy rainfall arrives, mitigation strategies might be needed, such as the evacuation of residential areas and construction of drainage channels. Overall, our work could form a part of the warning system for debris flows to benefit hazard mitigation in mountainous areas.