1 Introduction

Debris flows possess significant destructive potential, leading to substantial damage to both infrastructure and residential areas (Goto et al. 2021). Around 100 countries worldwide are vulnerable to severe debris flows, resulting in an annual toll of approximately 1200 fatalities (Papathoma-Köhle et al. 2017). The statistical data reveal a total of 213 debris-flow events with 77,779 fatalities across 38 countries from 1950 to 2011 (Dowling and Santi 2014). Therefore, it is imperative to issue timely warnings for debris flows to decrease the possible losses of human lives and properties. Susceptibility maps can offer valuable insights into the likelihood of debris-flow occurrences within specific terrain zones (Fell et al. 2008). These maps can provide a detailed quantitative scenario, but it is worth noting that their utility is constrained by their static nature, rendering them unable to predict specific locations and dates, as highlighted by Segoni et al. (2016). Debris-flow initiation primarily relies on rainfall patterns (Crosta and Frattini 2001). Consequently, the rainfall threshold has been embraced as an effective predictive measure for debris flows (Hardwick Jones et al. 2010; Nikolopoulos et al. 2014; Staley et al. 2017). The synergy of susceptibility maps and rainfall thresholds can contribute to the development of a dynamic temporal forecasting model capable of delivering daily temporal resolution alongside a very fine spatial resolution (Segoni et al. 2018).

Therefore, calculating rainfall thresholds is critical to improving debris-flow warning systems. In this context, acquiring precise and dependable precipitation data with high spatial and temporal resolution is of paramount importance for hydrological forecasting and hazard mitigation (Yilmaz et al. 2005; Sun et al. 2018). In most cases, precipitation rates are determined from gauge records (Kidd 2001; Calvello et al. 2015), which is very difficult in rural and mountainous areas with few gauge stations. Satellite-based products, such as the Tropical Rainfall Measuring Mission (TRMM), offer an alternative source of precipitation data in data-scarce regions (Ashouri et al. 2015). However, the low spatial resolution of TRMM (0.25°) does not satisfy the need for reliable precipitation estimation. Therefore, an effective spatial downscaling method is required before the TRMM data can be used for rainfall threshold calculation.

Many methods have been proposed to downscale TRMM data by establishing the relationship between regional environmental variables (REVs) and precipitation (Chen and Brissette 2014). For instance, vegetation growth quantified through the normalized difference vegetation index (NDVI) has been used as a proxy for spatial precipitation variations (Immerzeel et al. 2005), and an empirical function was developed to characterize the NDVI–precipitation relationship (Immerzeel et al. 2009). However, a single factor cannot support effective downscaling of TRMM data. Further studies therefore integrated both NDVI and a digital elevation model (DEM) into the regression relationship to downscale TRMM data (Jia et al. 2011; Park 2013). Furthermore, geolocations (longitude and latitude) were added to bring geographic information into the spatial relationships using regression models, such as multiple linear regression, geographically weighted regression (GWR), and geographically weighted regression Kriging (GWRK) (Sachindra et al. 2013; Chen et al. 2018; Wang et al. 2022). For better downscaling quality, land surface temperature (LST) is selected here as a factor due to its significant relationship with precipitation (Trenberth and Shea 2005). Moreover, the impact of the slope aspect on precipitation intensity has been addressed in mountainous areas, recognizing that reduction in the slope aspect can induce a topographic lifting effect on airflow, subsequently increasing rainfall intensity (Fang et al. 2013). However, traditional regression models and exponential and power equations prove inadequate in capturing the intricate relationships between REVs and precipitation (Jing et al. 2016). Therefore, a more effective method is needed to comprehensively delineate the spatial distribution of precipitation and thereby benefit hazard mitigation in mountainous areas.
Considering the advantages of rapid processing speed, large-scale data processing ability and support for customized loss functions, an ensemble machine learning model (extreme gradient boosting (XGBoost)) is introduced to achieve spatial downscaling of TRMM to 1 km. The machine learning methods have demonstrated their proficiency in establishing complicated and hidden relationships between a group of input variables and output results without considering the distribution characteristics of input variables (Khosravi et al. 2021; Segoni et al. 2015).

With the aim of providing timely warnings for the occurrence of debris flows in mountainous areas with scarce data, we propose a machine learning (ML)-based approach to downscale TRMM data to a daily temporal scale with a spatial resolution of 1 km. Then, the downscaled precipitation data are calibrated with PERSIANN-CCS-CDR data, and the calibrated precipitation data are used to derive a rainfall threshold equation based on the relationship between effective antecedent rainfall (Pe) and intraday rainfall (Po). After that, we integrate the rainfall thresholds with susceptibilities to establish a debris-flow warning model in the Gyirong Zangbo Basin, Tibet Autonomous Region, China. Finally, we propose an integrated matrix designed to deliver dynamic debris-flow warnings, with updated effective antecedent rainfall serving as a critical input parameter. The findings of this study offer guidance and scientific support for debris-flow warning and mitigation at a regional scale.

2 Methodology

The unstable accuracy of TRMM data in different areas has been addressed (Cheema and Bastiaanssen 2012). In this paper, an ML-based method is proposed for TRMM downscaling. After that, the downscaled TRMM data are calibrated using either rain gauge observations or, where rain gauge monitoring stations are lacking as in mountainous areas, the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Cloud Classification System-Climate Data Record (PERSIANN-CCS-CDR). This dataset was developed by merging PERSIANN-CCS and Global Precipitation Climatology Project (GPCP) monthly precipitation observations. We are aware that the uncertainty and deviation of the PERSIANN-CCS-CDR product cannot be avoided. However, this dataset provides an alternative for calibration where rain gauge stations are lacking, because evaluations against ground and other satellite observations have shown that it performs well in capturing extreme events, which is significant for debris-flow analysis (Hsu et al. 2020). Finally, the calibrated TRMM data are used to calculate rainfall thresholds, which are integrated with a susceptibility map to establish a warning model for dynamic debris-flow forecasting at a regional scale. The flowchart of precipitation downscaling and warning model development is shown in Fig. 1.

Fig. 1 Flowchart of the development of a regional warning model for debris flows

2.1 Downscaling of 3B42 TRMM precipitation data

In this paper, NDVI, DEM, slope aspect, longitude, latitude, and LST are selected to explain the spatial variation of precipitation (Immerzeel et al. 2005; Trenberth and Shea 2005). The NDVI, LST, and geolocations are all MODIS products, and the DEM is downloaded from the ‘Geospatial Data Cloud’ with a spatial resolution of 90 m. The slope aspect is derived from the DEM using GIS. The downloaded monthly NDVI0.01° is resampled to a spatial resolution of 0.25°. Subsequently, the monthly NDVI0.25° is accumulated to seasonal NDVI0.25°, since the significant relationship between the REVs0.25° and precipitation0.25° exists on a seasonal timescale (Chen and Li 2020). Additionally, the LST0.01° and DEM0.01° are resampled to a spatial resolution of 0.25° to ensure consistency across variables. The aspect0.25° is then generated from DEM0.25° using the GIS tool ‘Surface’.

Before downscaling precipitation with these factors, a correlation analysis is performed to assess how well they capture the spatial variations in precipitation. Following this initial evaluation, we employ the XGBoost algorithm to predict the seasonal precipitation0.01° from the REVs0.01°. To enhance the stability of the prediction model, the input data are normalized to fall between 0.01 and 0.99:

$$x_{{{\text{nor}}}} = \frac{x - \min (x)}{{\max (x) - \min (x)}}(U - L) + L$$
(1)

where xnor represents the normalized data, and U and L are the upper and lower normalization bounds, respectively. Furthermore, to generate daily precipitation0.01°, we assume that the ratio of daily precipitation0.25° to seasonal precipitation0.25° equals the ratio of spatially downscaled daily precipitation0.01° to seasonal precipitation0.01°. In other words, the ratio of the TRMM precipitation on the i-th day to the seasonal accumulation at 0.25° is equal to the ratio of the predicted precipitation on the i-th day to the seasonal TRMM0.01°. This assumption forms the basis of our approach to deriving high-resolution daily precipitation data.

$${\text{RTRMM}}_{i}^{{0.25^{ \circ } }} \left( {u,s} \right) = \frac{{{\text{Daily}}\;{\text{TRMM}}_{i}^{{0.25^{ \circ } }} \left( {u,s} \right)}}{{{\text{Seasonal}}\;{\text{TRMM}}^{{0.25^{ \circ } }} \left( {u,s} \right)}} = \frac{{{\text{Daily}}\;{\text{TRMM}}_{i}^{{0.01^{ \circ } }} \left( {u,s} \right)}}{{{\text{Seasonal}}\;{\text{TRMM}}^{{0.01^{ \circ } }} \left( {u,s} \right)}} = {\text{RTRMM}}_{i}^{{0.01^{ \circ } }} \left( {u,s} \right)$$
(2)

where \({\text{Daily}}\;{\text{TRMM}}_{i}^{{0.01^{ \circ } }} \left( {u,s} \right)\) represents the downscaled precipitation on the i-th day of season s (spring, summer, autumn, or winter) at location u.
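As a minimal sketch of Eqs. (1) and (2), the snippet below scales a list of values into the range 0.01 to 0.99 and then disaggregates a fine-resolution seasonal total into a daily value using the coarse-resolution ratio. The function names and the numbers are illustrative assumptions, not values from this study, and returning zero for a season with no coarse-resolution rainfall is our own assumption.

```python
def normalize(values, lower=0.01, upper=0.99):
    """Min-max scaling of Eq. (1): maps values into [lower, upper]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot normalize a constant series")
    scale = (upper - lower) / (v_max - v_min)
    return [(v - v_min) * scale + lower for v in values]


def daily_fine(daily_coarse, seasonal_coarse, seasonal_fine):
    """Eq. (2): apply the coarse daily/seasonal ratio to the fine seasonal total."""
    if seasonal_coarse == 0:
        return 0.0  # assumption: a dry season yields no daily rainfall
    return daily_coarse / seasonal_coarse * seasonal_fine


# Illustrative numbers only (not from the study).
elevation_m = [2800.0, 3500.0, 4200.0, 5600.0]
print(normalize(elevation_m))

# 12 mm on day i out of a 400 mm coarse seasonal total, 520 mm fine seasonal total.
print(daily_fine(daily_coarse=12.0, seasonal_coarse=400.0, seasonal_fine=520.0))
```

Because the ratio is taken from the 0.25° cell that contains the 0.01° pixel, all fine pixels inside one coarse cell share the same temporal pattern and differ only through their predicted seasonal totals.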

2.2 Calibration of downscaled precipitation data

In this paper, the geographical differential analysis (GDA) is employed to calibrate the TRMM data (Cheema and Bastiaanssen 2012). The equations of the GDA method are as follows:

$$\Delta R_{{\left( {x,y} \right)}} = R_{{{\text{TRMM}}\left( {x,y} \right)}} - R_{{{\text{GAU}}\left( {x,y} \right)}}$$
(3)
$$\Delta R_{{\left( {x,y} \right)ip}} = \Delta R_{{\left( {x,y} \right)}}$$
(4)
$$R_{{{\text{Cal}}}} = R_{{{\text{TRMM}}}} - \Delta R_{{\left( {x,y} \right)ip}}$$
(5)

where \(\Delta R_{{\left( {x,y} \right)}}\) is the rainfall difference between the satellite and rain gauge data at a given point, \(R_{{{\text{TRMM}}\left( {x,y} \right)}}\) represents the TRMM data, and \(R_{{{\text{GAU}}\left( {x,y} \right)}}\) is the rain gauge observation or PERSIANN-CCS-CDR value at a specific location (x, y). \(\Delta R_{{\left( {x,y} \right)ip}}\) is the difference map after spatial interpolation, and \(R_{{{\text{Cal}}}}\) is the calibrated value. First, the difference between the TRMM data and the gauge observations is quantified using Eq. (3). Subsequently, the difference map (\(\Delta R_{{\left( {x,y} \right)ip}}\)) is generated with the inverse distance weighting (IDW) method, which is adopted for its simplicity and robustness (Brouder et al. 2005; Ahrens 2006; Babak and Deutsch 2009). The optimal number of neighboring stations considered in the IDW interpolation is determined following the guidelines of Babak and Deutsch (2009). Finally, the calibrated results are derived with Eq. (5). This systematic approach aligns the TRMM precipitation data with the PERSIANN-CCS-CDR dataset or gauge observations, enhancing the reliability and accuracy of the results.
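The three GDA steps can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the IDW power of 2, the optional neighbor count `k`, the planar Euclidean distance, and all function and variable names are hypothetical choices.

```python
import math


def idw(points, values, target, power=2.0, k=None):
    """Inverse distance weighting of `values` known at `points` onto `target`."""
    nearest = sorted((math.dist(p, target), v) for p, v in zip(points, values))
    if k is not None:
        nearest = nearest[:k]  # keep only the k nearest reference points
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # target coincides with a reference point
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


def gda_calibrate(ref_points, trmm_at_refs, ref_values, grid_points, trmm_grid, k=None):
    """Eqs. (3)-(5): difference at reference points, interpolate, subtract."""
    diff = [t - g for t, g in zip(trmm_at_refs, ref_values)]          # Eq. (3)
    diff_map = [idw(ref_points, diff, p, k=k) for p in grid_points]   # Eq. (4)
    return [t - d for t, d in zip(trmm_grid, diff_map)]               # Eq. (5)


# Two reference locations and three grid cells on a line (illustrative values, mm).
ref_points = [(0.0, 0.0), (2.0, 0.0)]
calibrated = gda_calibrate(
    ref_points,
    trmm_at_refs=[10.0, 6.0],
    ref_values=[12.0, 5.0],
    grid_points=[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    trmm_grid=[10.0, 8.0, 6.0],
)
print(calibrated)
```

At the reference locations the calibrated value reproduces the reference value exactly, while cells in between receive a distance-weighted blend of the nearby differences.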

2.3 Evaluation of downscaling performance

Nash–Sutcliffe efficiency (NSE) and standard error of estimates (SEE) are used to evaluate the calibration performance of the GDA (Nash and Sutcliffe 1970):

$${\text{NSE}} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {\left( {R_{{{\text{GAU}}\left( {x,y} \right)}} - R_{{{\text{Cal}}\left( {x,y} \right)}} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{n} {\left( {R_{{{\text{GAU}}\left( {x,y} \right)}} - \overline{R}_{{{\text{GAU}}\left( {x,y} \right)}} } \right)^{2} } }}$$
(6)

where \(\overline{R}_{{{\text{GAU}}\left( {x,y} \right)}}\) is the mean value of gauge observations, and n is the total number of observations. NSE ranges from −∞ to 1, with 1 indicating a perfect match. SEE measures the deviation of the calibrated values from the gauge observations.

$${\text{SEE}} = \sqrt {\frac{{\sum\limits_{i = 1}^{n} {\left( {R_{{{\text{Cal}}\left( {x,y} \right)}} - R_{{{\text{GAU}}(x,y)}} } \right)^{2} } }}{n - 2}}$$
(7)
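A direct transcription of Eqs. (6) and (7) is sketched below. The function names and the sample values are illustrative assumptions; `reference` stands for the gauge or PERSIANN-CCS-CDR series and `estimate` for the calibrated series.

```python
import math


def nse(reference, estimate):
    """Nash-Sutcliffe efficiency, Eq. (6)."""
    mean_ref = sum(reference) / len(reference)
    num = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    den = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - num / den


def see(reference, estimate):
    """Standard error of estimates, Eq. (7), with n - 2 in the denominator."""
    n = len(reference)
    squared = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / (n - 2))


# Illustrative daily values in mm (not from the study).
reference = [0.0, 2.0, 4.0, 6.0]
estimate = [1.0, 2.0, 3.0, 6.0]
print(nse(reference, estimate), see(reference, estimate))
```

Note that NSE is undefined when the reference series is constant, and SEE requires more than two observations.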

Apart from the NSE and SEE, two other metrics are used to evaluate the downscaling and calibration performance (Valderrama and Alvarez 2005):

$$\% {\text{Averd}} = \frac{100}{N}\sum\limits_{i = 1}^{N} {\left( {\frac{{y_{{i{\text{TRMM}}}} - y_{{i{\text{GAU}}}} }}{{y_{{i{\text{GAU}}}} }}} \right)}$$
(8)
$${\text{Abs}}.\;\% {\text{Averd}} = \frac{100}{N}\sum\limits_{i = 1}^{N} {\left( {\frac{{\left| {y_{{i{\text{TRMM}}}} - y_{{i{\text{GAU}}}} } \right|}}{{y_{{i{\text{GAU}}}} }}} \right)}$$
(9)

where yTRMM represents the TRMM data, and yGAU is the rain gauge station value. Both %Averd and Abs. %Averd are employed here. The %Averd reveals whether the deviations are positive or negative, but gives no indication of how large they are. The Abs. %Averd serves as a complementary metric, offering insight into the magnitude of the deviations. Furthermore, although %Averd and Abs. %Averd reflect percentage-based deviations, they do not describe the deviations in absolute units. To address this, the root-mean-square error (RMSE) and mean absolute error (MAE) are introduced here.

$${\text{RMSE}} = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {\left( {y_{{i{\text{TRMM}}}} - y_{{i{\text{GAU}}}} } \right)^{2} } }}{n}}$$
(10)
$${\text{MAE}} = \frac{{\sum\nolimits_{i = 1}^{n} {\left| {y_{{i{\text{TRMM}}}} - y_{{i{\text{GAU}}}} } \right|} }}{n}$$
(11)
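The four deviation metrics of Eqs. (8)-(11) can be sketched as below. The function names and sample values are illustrative assumptions. This literal reading of Eqs. (8) and (9) divides by the reference value, so it assumes strictly positive reference values; a zero reference raises `ZeroDivisionError`, and how such days are treated is not specified by the equations.

```python
import math


def pct_averd(estimate, reference):
    """Signed percentage average deviation, Eq. (8)."""
    n = len(reference)
    return 100.0 / n * sum((e - r) / r for e, r in zip(estimate, reference))


def abs_pct_averd(estimate, reference):
    """Absolute percentage average deviation, Eq. (9)."""
    n = len(reference)
    return 100.0 / n * sum(abs(e - r) / r for e, r in zip(estimate, reference))


def rmse(estimate, reference):
    """Root-mean-square error, Eq. (10)."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / n)


def mae(estimate, reference):
    """Mean absolute error, Eq. (11)."""
    n = len(reference)
    return sum(abs(e - r) for e, r in zip(estimate, reference)) / n


# One underestimate and one overestimate of equal size (illustrative, mm).
estimate, reference = [8.0, 12.0], [10.0, 10.0]
print(pct_averd(estimate, reference), abs_pct_averd(estimate, reference))
print(rmse(estimate, reference), mae(estimate, reference))
```

In this example the signed metric is zero because the two deviations cancel, while the absolute metric, RMSE, and MAE all report the 2 mm (20%) error, which is why the signed and absolute forms are reported together.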

2.4 Integration of rainfall thresholds and a susceptibility map

After the performance assessment of the proposed method, the effective antecedent rainfall–intraday rainfall (Pe − Po) relationship is established to calculate the rainfall thresholds. The method for calculating the effective antecedent daily precipitation is (Guo et al. 2013):

$$P_{{\text{e}}} = \sum\limits_{1}^{n} {P_{i} \cdot K_{i} }$$
(12)

where Pe is the effective antecedent rainfall, Pi is the daily precipitation on the i-th day preceding the debris-flow event (1 ≤ i ≤ n), and Ki is a decay coefficient due to evaporation. Ki = 0.84 has been suggested by many studies (Ni et al. 2010; Guo et al. 2013).
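A short sketch of Eq. (12) follows. It rests on an assumption that should be checked against the cited sources: the decay coefficient is taken to act geometrically, so that rainfall i days before the event is weighted by 0.84 raised to the power i. The text above only states Ki = 0.84, and a constant weight would be obtained by replacing `k ** i` with `k`. The function name and rainfall values are illustrative.

```python
def effective_antecedent_rainfall(daily_rain, k=0.84):
    """Eq. (12) with an assumed geometric decay, weight k**i for day i.

    daily_rain[0] is the day before the event, daily_rain[1] two days before, etc.
    """
    return sum(p * k ** i for i, p in enumerate(daily_rain, start=1))


# 10 mm one day before, a dry day, then 5 mm three days before (illustrative).
print(effective_antecedent_rainfall([10.0, 0.0, 5.0]))
```

Under this assumption, older rainfall contributes progressively less, so Pe can be updated day by day as new precipitation estimates arrive.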

To further improve the usefulness of the debris-flow susceptibility assessment, we establish an optimized warning model that integrates a susceptibility map with rainfall thresholds. A hybrid machine learning model (certainty factor–genetic algorithm–support vector classification (CF-GA-SVC)) is employed to produce the susceptibility map (Qiu et al. 2022). Our approach is grounded in the premise that debris flows may be triggered at a lower susceptibility level as rainfall severity escalates (Segoni et al. 2015). Accordingly, if an area exhibits a low rainfall threshold level (R1), debris flows are expected only if the area is classified as having very high susceptibility (S4). Conversely, regions characterized by low rainfall threshold levels (R1 and R2) and low susceptibility (S1 and S2), as detailed in Table 1, are deemed less prone to debris flows. This integrated scheme underscores the importance of considering both susceptibility and rainfall thresholds in the evaluation of debris flows.

Table 1 Integrated matrix which combines the rainfall thresholds and susceptibility levels

3 Study area and data

Gyirong Zangbo Basin, serving as the only channel of land trade between Nepal and China, is selected as the study area (Fig. 2). It is located in the southwestern part of Tibet and contains two towns, with a total area of 2.12 × 10³ km². The northern part of this area belongs to the semiarid plateau climate zone with seasonal wind and has an annual average temperature of 2 °C. The dry climate in this region limits the precipitation to 300–600 mm each year, and large areas of weathered rocks with low compressive and shear strength have developed. However, the annual precipitation in the southern part can reach almost 1000 mm due to the subtropical monsoon climate. The Himalayan Mountains block the northward warm air from the Indian Ocean, making the southern part one of the rainfall centers in Tibet and benefiting vegetation growth. In terms of geological structure, the collision and squeezing between the Indian Plate and the Eurasian Plate gave birth to the Himalayas and formed a series of large-scale faults. These fault zones not only affect the trend and flow of rivers, but also control the development and distribution of natural hazards. Gyirong Zangbo Basin spans the southern part of the Karakoram Range and the higher Himalayas, belonging to the North Himalayan tectonic belt. The unique Quaternary geological environment provides an appropriate hazard-inducing environment due to large areas of unstable slopes caused by deep-cutting river erosion and human activities. Therefore, studies of debris flows in this basin can benefit the recognition and mitigation of debris flows along the North Himalayan tectonic belt.

Fig. 2 Geographic location of Gyirong Zangbo Basin and landforms of this area

Due to the complicated geological conditions and hazard-inducing environment in this area, debris flows occur frequently. However, the absence of rain gauge stations presents a significant obstacle to the formulation of effective hazard mitigation measures, because reliable precipitation data for timely warnings are unavailable. Therefore, generating a reliable warning map of debris flows is an urgent need. To address this challenge, we employ PERSIANN-CCS-CDR data for the calibration of downscaled TRMM data. Furthermore, we utilize historical debris-flow data from the summer of 2007 (SGESI 2018) to facilitate the downscaling and calibration processes. This selection is motivated by our objective to improve the existing susceptibility map through the integration of rainfall threshold information. For our case study, we focus on the summer of 2007, as it represents a period when the highest levels of precipitation occurred during the rainy season (Fig. 3). Consequently, precipitation downscaling and validation for the summer of 2007 are required to obtain the effective antecedent rainfall and intraday rainfall of the historical debris flows. Furthermore, the debris flows in the summer of 2006 are used to validate the merged results of the susceptibility map and rainfall thresholds. This comprehensive approach helps ensure the effectiveness and accuracy of our analysis.

Fig. 3 Change of annual precipitation and monsoon rainfall in this study area

4 Results

4.1 Downscaling of TRMM0.25° data

The spatial distribution of the REVs0.25° and seasonal precipitation0.25° in the summer of 2007 is presented in Fig. 4. These coarse-resolution data are used to train the prediction model. To increase the robustness of this model, the input data cover a wide area of 1.8 × 10⁵ km² so that enough samples are available for training.

Fig. 4 Spatial distribution of the seasonal REVs0.25° and downloaded TRMM0.25° data in 2007 summer

Furthermore, to ascertain the spatial relationship between each REV and precipitation at the coarse resolution, a correlation analysis is conducted to determine their suitability as indicators of spatial precipitation variations. Our findings reveal that the DEM exhibits the strongest correlation with precipitation, with the highest r (Pearson’s coefficient) value of 0.857. Similarly, three other REVs, namely NDVI, LST, and Lat (latitude), all present robust correlations with precipitation, with r values exceeding 0.7. In contrast, only a moderate correlation is found between Lon (longitude) and precipitation (r = 0.4), while no discernible correlation is identified between the aspect and precipitation in this geographic context. Therefore, the slope aspect is excluded from model training. This result is consistent with the conclusion of Chen et al. (2020), who emphasized that the impact of the aspect has been overemphasized in TRMM downscaling. As for Lon, a sensitivity analysis is needed to assess its contribution to the prediction accuracy of seasonal precipitation0.01°. The seasonal REVs0.01° are prepared to predict seasonal precipitation0.01° based on the relationship established at a spatial resolution of 0.25° (Fig. 5). The prepared data are divided into a training set and a validation set, with 70% allocated for model training and the remaining 30% for validation.
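The screening and splitting steps described above can be sketched as follows. The function names, the sample values, and the use of a seeded random shuffle for the 70/30 split are assumptions made for illustration; the text does not state how samples were assigned to the two sets.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def split_70_30(samples, seed=0):
    """Shuffle the samples and return (training, validation) at a 70/30 ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Illustrative values only: a variable that rises linearly with precipitation.
variable = [1.0, 2.0, 3.0, 4.0]
precipitation = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(variable, precipitation))

train, valid = split_70_30(range(10))
print(len(train), len(valid))
```

A variable would be screened out, as the slope aspect is here, when its coefficient against the 0.25° precipitation is close to zero.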

Fig. 5 Input REVs0.01° of 2007 summer for precipitation prediction

The importance of Lon can be evaluated by quantifying the increase in prediction error that results from permuting the out-of-bag (OOB) data for that variable alone while keeping all others unchanged (Liaw and Wiener 2002). The prediction results are shown in Fig. 6, clearly demonstrating that the absence of Lon leads to an increase in prediction error. Therefore, despite the absence of a strong correlation between longitude and precipitation, the contribution of this variable should not be disregarded. Specifically, approximately 81.5% of the prediction errors fall within the range of −10 to 10 mm, but this proportion decreases to 70.1% (as denoted by the red dashed lines in Fig. 7) when Lon is excluded from precipitation prediction. Consequently, longitude is retained as a training feature for the prediction model. However, the introduction of additional variables does not invariably reduce the prediction error. For example, another weather-related factor, evapotranspiration (ET), can effectively reflect precipitation variations at seasonal, century, and even longer time scales (Council 1998). However, its impact may vary across regions, and increased precipitation leads to a small increase in ET (Garbrecht et al. 2004). In our preliminary studies, the inclusion of ET in model development failed to reduce prediction errors. In fact, with ET incorporated, the model yielded an R-squared value of 0.91, notably lower than the R-squared of model training (0.98) without ET. As a result, only 75.2% of prediction errors fell within the range of −10 to 10 mm. The increase in error might be due to noise associated with ET, which adversely affected model training. Therefore, ET is not selected as one of the indicators for prediction purposes.
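The permutation idea can be illustrated with the sketch below. It is not the procedure used in this study: a fixed toy function stands in for the trained regressor, the rows play the role of held-out (OOB-like) samples, and all names and values are hypothetical. The importance of a feature is measured as the increase in mean squared error after shuffling that feature alone.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of `model` over the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, column, seed=0):
    """Increase in mean squared error after shuffling one feature column."""
    baseline = mse(model, rows, targets)
    shuffled = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled)
    permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mse(model, permuted, targets) - baseline


def toy_model(row):
    """Hypothetical stand-in for a trained model: strong on feature 0, weak on 1."""
    return 3.0 * row[0] + 0.1 * row[1]


# Held-out rows as tuples of two features (illustrative values only).
rows = [(float(i), float((i * 7) % 10)) for i in range(10)]
targets = [toy_model(r) for r in rows]

strong = permutation_importance(toy_model, rows, targets, column=0)
weak = permutation_importance(toy_model, rows, targets, column=1)
print(strong, weak)
```

A feature with a modest linear correlation can still show a clear error increase under permutation, which is the situation reported for longitude above.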

Fig. 6 Results of the sensitivity analysis after removing the longitude

Fig. 7 Predicted seasonal precipitation0.01° using two methods

Furthermore, in order to demonstrate the superiority of integrating machine learning into the downscaling of the TRMM precipitation, the geographically weighted regression (GWR) method is employed to conduct a comparative analysis with XGBoost (Fig. 7).

Figure 8 illustrates that the ML-based method performs better than the GWR method: 92.4% of prediction errors fall between −20 and 20 mm with the ML-based method, while the GWR method achieves 83.5% within the same range. Furthermore, 81.5% of the ML-based prediction errors are concentrated in the −10 to 10 mm range, in contrast to 61.5% for GWR. This difference is further underscored by the R-squared values of the two methods: the GWR method yields an R-squared value of 0.71, whereas that of the ML-based method exceeds 0.9. Additionally, six assessment indexes are employed to evaluate the performance of the ML-based and GWR methods (Table 2). Overall, the ML-based method shows its superiority in prediction accuracy over the GWR method.
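The error-band percentages quoted above are shares of prediction errors that fall inside a symmetric interval. A minimal sketch of that calculation is given below; the function name and the error values are illustrative assumptions, and treating the interval bounds as inclusive is our own choice.

```python
def share_within(errors, bound):
    """Percentage of errors e with -bound <= e <= bound."""
    inside = sum(1 for e in errors if -bound <= e <= bound)
    return 100.0 * inside / len(errors)


# Prediction errors in mm (illustrative values, not from the study).
errors = [-25.0, -8.0, -3.0, 0.0, 4.0, 9.0, 12.0, 18.0, 22.0, 30.0]
print(share_within(errors, 10.0))
print(share_within(errors, 20.0))
```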

Fig. 8 Comparison analysis results between the ML-based and GWR methods

Table 2 Performance evaluations between the two methods

The downscaling processes for daily precipitation0.01°, applied to six selected debris-flow events, are shown in Fig. 9. To assess the effectiveness of this ML-based method in downscaling the TRMM data, we extend the analysis to include debris-flow events not only within Gyirong Zangbo Basin, but also in other regions of Gyirong County, denoted as locations a, b, c, d, e, and f. The catchments corresponding to these six debris-flow events are delineated to represent the predicted seasonal precipitation0.01°. On the basis of the prediction results in Fig. 9a, the daily precipitation0.01° is calculated using Eq. (2), as shown in Fig. 9b. Then, the downscaled results are calibrated using the PERSIANN-CCS-CDR data to refine the precipitation estimates (Fig. 9c). Compared to the downscaled daily precipitation0.01°, the calibrated results change remarkably. The most substantial change occurs at location d, where the maximum precipitation increases from 1 to 33 mm. The downscaled precipitation is probably far lower than the calibrated result because the satellite observations underestimate rainfall: the intermittent measuring mode of TRMM may miss a high-intensity rainfall event during the 3-h interval. This deduction aligns with previous findings that the 3B42 TRMM data underestimate precipitation in Himalayan regions at altitudes higher than 3100 m (Bharti and Singh 2015; Ma et al. 2017). The incorporation of PERSIANN-CCS-CDR data can fill this gap, as this dataset excels in capturing extreme rainfall events (Sadeghi et al. 2021). Furthermore, aside from measurement issues, high-intensity rainfall could also disturb the transmission of radio-frequency radiation, potentially affecting TRMM satellite reception (Cheema and Bastiaanssen 2012).
However, the GDA method can help alleviate these issues because it accounts for spatial averages of nearby samples, thereby reducing errors.

Fig. 9 Spatial downscaling and calibration processes of 3B42 TRMM data

4.2 Site-specific validation of data calibration

To evaluate the downscaling and calibration performance of TRMM data, the NSE and SEE indexes are employed. The six debris-flow events in Section 4.1 reflect the downscaling efficiency at different locations and dates.

Figure 10a illustrates that the calibration performances for the six debris-flow events are all acceptable because the NSE values are between 0 and 1. These NSE values across various locations serve as indicators of the effectiveness of the machine learning-based approach for downscaling TRMM data. Furthermore, the accuracies of the three sets of precipitation data are evaluated using SEE values (Fig. 10b). The original TRMM data exhibit the maximum deviation from the PERSIANN-CCS-CDR data, and their accuracy is unstable across locations, as shown by fluctuating SEE values. This variability can be attributed to mountainous terrain, which introduces variations in radar beam reflectivity (Porcù et al. 2003) and yields non-uniform rainfall in these regions (Bookhagen and Burbank 2006). Applying machine learning partially decreases the SEE values, and significant further reductions are observed after the downscaled TRMM data are calibrated with PERSIANN-CCS-CDR data. The minimum SEE is 0.08 mm at location d, while the maximum is 4.18 mm at location a. Overall, the machine learning method achieves the downscaling of seasonal TRMM precipitation0.25° to seasonal precipitation0.01° with decreased SEE values. Concurrently, the GDA calibration method is reliable and effective in improving the TRMM estimation accuracy. In addition to the site-specific analysis, calibration accuracy must also be assessed across different temporal scales.

Fig. 10 Evaluation of ML-based method for downscaling the TRMM data in six different locations

4.3 Temporal deviation analysis

To assess the performance of the TRMM downscaling and calibration processes, the deviations of the original TRMM (OriTRMM), downscaled TRMM (DownTRMM), and calibrated TRMM (CaliTRMM) from the PERSIANN-CCS-CDR data are analyzed. Taking the debris-flow event at location b as an illustrative example (see Fig. 9 for its location), we analyze the deviations over the 30 days preceding the event. As shown in Fig. 11, CaliTRMM exhibits the smallest percentage average deviations, outperforming both DownTRMM and OriTRMM. Notably, the CaliTRMM underestimates precipitation compared to PERSIANN-CCS-CDR data. This underestimation is due to the application of the IDW method when generating the difference map (see Eq. (3)). The IDW method assumes that each given point exerts a localized influence on its neighboring region, with the influence diminishing as the distance from the given points to unknown areas increases. Therefore, the IDW method tends to underestimate the precipitation values in regions farther from the given points. Additionally, in terms of Abs. %Averd, the deviations of CaliTRMM decrease by 55% and 57% when compared with the DownTRMM and OriTRMM, respectively.

Fig. 11 Percentage average errors over the 30 days before the occurrence of the debris-flow event at location b

However, limitations persist when precise deviation values are expected, particularly on rainless days. For example, the OriTRMM might report 3 mm and 5 mm of 24-h rainfall at two distinct locations, respectively, but the PERSIANN-CCS-CDR data may indicate no observed rainfall at either location. As a result, both locations would yield an Abs.%Averd of 100%, failing to capture the exact deviations. To address this limitation and provide a more nuanced assessment, we employ RMSE and MAE metrics to quantify the exact deviation values (Fig. 12).

Fig. 12 RMSEs and MAEs over the 30 days before the occurrence of the historical debris-flow event at location b

Figure 12 illustrates a consistent variation trend in RMSEs and MAEs. The CaliTRMM performs best when compared with the OriTRMM and DownTRMM, as evidenced by the proximity of its RMSEs and MAEs to 0. Notably, CaliTRMM exhibits stable curves with only sporadic deviations in RMSE and MAE values on days 12, 22, and 30 (highlighted in yellow boxes in Fig. 12). These outliers signify the inherent instability of TRMM data when the 24-h precipitation exceeds 15 mm. In contrast, OriTRMM and DownTRMM display notable fluctuations in RMSE and MAE on days 12, 22, 25, and 30, as shown in Fig. 12. Importantly, the use of PERSIANN-CCS-CDR proves effective in mitigating errors in mountainous regions, given its superior ability to capture extreme rainfall events.

To further assess the efficiency of this method in estimating precipitation across various time periods, the RMSEs and MAEs for the daily, monthly, and seasonal periods are analyzed (Fig. 13). This figure shows that DownTRMM exhibits a variation trend similar to that of OriTRMM. However, DownTRMM shows a clear improvement, with lower RMSEs and MAEs in all three temporal periods. Overall, CaliTRMM consistently outperforms the other two datasets. Only slight differences in RMSE and MAE are discernible among the three datasets at the daily scale. However, the RMSEs and MAEs of OriTRMM, DownTRMM, and CaliTRMM all increase by approximately 95% when the temporal period extends to the monthly scale. This significant increase can be attributed to the cumulative effect of aggregating daily measurements into monthly rainfall (Chen et al. 2020). When the period is further extended to a seasonal scale, the rate of increase in RMSEs and MAEs diminishes significantly, dropping to 58% for OriTRMM and 57% for DownTRMM, while CaliTRMM exhibits a much lower increase rate of only 29%. This disparity in the rates of increase can be attributed to the machine learning method’s ability to establish a robust relationship between precipitation and REVs at a seasonal time scale.

Fig. 13 RMSEs and MAEs between the PERSIANN-CCS-CDR and TRMM rainfall data (OriginalTRMM, DownscaledTRMM, and CalibratedTRMM) for different periods

Moreover, the calibration of DownTRMM with PERSIANN-CCS-CDR results in enhanced stability and accuracy of the TRMM data. Although the RMSE and MAE values for CaliTRMM increase when the daily data are accumulated to the monthly timescale, there are no significant fluctuations among the three timescales. Furthermore, the rate of increase in RMSE and MAE decreases by 68.27% and 65.36%, respectively, when the monthly data are accumulated to a seasonal timescale. Therefore, the accuracy of the 3B42 TRMM data can be improved through data calibration using the GDA method. Consequently, the combined application of the ML-based method and the GDA calibration method mitigates the exacerbation of deviations to a certain extent. Furthermore, a seasonal timescale emerges as more suitable for spatial downscaling, as it aligns with our approach of employing downscaled data to calculate the rainfall thresholds and improve the debris-flow warning system.

4.4 Rainfall thresholds calculation

A period of antecedent rainfall as long as 10 days can affect the water content of soil and, therefore, cause slope instability (Crozier 1999). However, 9 days of effective antecedent rainfall shows a stronger correlation with intraday rainfall than 10 days, with Pearson’s coefficient reaching 0.913. Therefore, n = 9 days is used in this paper to establish the empirical relationship. The R² and adjusted R² of this regression equation are 0.90 and 0.88, respectively, and significance testing shows that the estimated coefficients are statistically significant since the P value is less than 0.05. The regression relationship can be represented by:

$$P_{o} = 466.2P_{{\text{e}}}^{ - 0.69}$$
(13)

where Po is the intraday rainfall (mm), and Pe represents the accumulated effective antecedent rainfall (mm). As shown in Fig. 14, the rainfall conditions for triggering debris flows in the Gyirong area can be analyzed using the regression line. For example, if a debris-flow event is expected to be triggered after heavy rainfall with the intraday rainfall reaching 50 mm (Administration 2013), then the maximum allowable effective antecedent rainfall over the preceding 9 days would be 21.3 mm. Similarly, a 9-day antecedent rainfall of 59.9 mm becomes a potential trigger for debris flows when the intraday precipitation reaches 25 mm. Therefore, the likelihood of debris-flow occurrence escalates as the combination of effective antecedent rainfall and intraday rainfall approaches the regression line.
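A power-law relationship of the form of Eq. (13) is commonly obtained by ordinary least squares on log-transformed variables, since ln(Po) = ln(a) + b ln(Pe) is linear. The sketch below illustrates that approach under stated assumptions: the text does not say which fitting procedure was used, and the sample points are synthetic values generated from Eq. (13) itself rather than observed debris-flow events. The coefficients 466.2 and -0.69 are the rounded values printed in Eq. (13), so numbers computed from them need not coincide exactly with figures quoted elsewhere.

```python
import math

A_EQ13 = 466.2   # coefficient printed in Eq. (13)
B_EQ13 = -0.69   # exponent printed in Eq. (13)


def intraday_threshold(pe_mm, a=A_EQ13, b=B_EQ13):
    """Intraday rainfall Po (mm) on the regression line for a given Pe (mm)."""
    return a * pe_mm ** b


def fit_power_law(pe_values, po_values):
    """Least-squares fit of ln(Po) = ln(a) + b * ln(Pe); returns (a, b)."""
    xs = [math.log(v) for v in pe_values]
    ys = [math.log(v) for v in po_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept transformed back
    return a, b


# Synthetic (Pe, Po) pairs lying exactly on Eq. (13); not observed data.
pe_samples = [5.0, 10.0, 20.0, 40.0, 80.0]
po_samples = [intraday_threshold(pe) for pe in pe_samples]

a_fit, b_fit = fit_power_law(pe_samples, po_samples)
```

With real event data the points scatter around the line, and the fit returns the coefficients that minimize the squared residuals in log space rather than reproducing any single event.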

Fig. 14 Relationship between the effective antecedent rainfall and intraday rainfall

To further evaluate the performance of this empirical equation in providing warnings for the occurrence of debris flows in this area, the empirical relationships in previous studies (Table 3) are used to conduct a comparison analysis.

Table 3 Effective antecedent rainfall—intraday rainfall thresholds

The results calculated using the four equations are compared against the true values (Fig. 15). This figure illustrates that our study and Ni et al. (2010) (green line) perform better than Ni et al. (2010) (brown line) and Zhuang and Peng (2014) (yellow line), since the results of our study and Ni et al. (2010) (green line) are closer to the true values. Furthermore, our study achieves a 9% reduction in average deviation percentage compared with the results of Ni et al. (2010) (green line). Therefore, the equation developed in this study can perform well in mountainous areas that lack monitoring stations.

Fig. 15 Comparison analysis of the rainfall thresholds

4.5 Integration of rainfall thresholds and susceptibility map

A total of eleven causative factors are selected to generate a susceptibility map using the CF-GA-SVC model (Fig. 16). They are related to topographic (slope, aspect, height difference, and average channel gradient), ecological (NDVI), geological (lithology, distance to faults, distance to river networks, and seismic intensity), and meteorological conditions (annual precipitation and average annual temperature). The probability in this area ranges from 0.05 to 0.88. The high susceptibility level, with probabilities from 0.46 to 0.74, encompasses 20 catchments, while the very high susceptibility level, with probabilities from 0.74 to 0.88, includes 44 catchments. In general, both the southern and northern parts of this area face elevated debris-flow risks. To address this, the generated rainfall thresholds are integrated with this susceptibility map to develop a debris-flow warning model. According to the Hazards Investigation Report (SGESI 2018), seven debris flows were recorded in this area from July 26 to 30, 2006. To estimate the rainfall thresholds of these seven debris flows, the 9-day effective antecedent rainfall of each debris-flow event is calculated (Fig. 17a).

Fig. 16 Susceptibility map of Gyirong Zangbo Basin

Fig. 17 Rainfall thresholds and reclassified warning levels

As shown in Fig. 17a, the intraday rainfall of all the catchments, which ranges from 21.0 to 97.5 mm, is classified into three levels. The catchments at the R1 level require a rainfall event of at least 67.8 mm to trigger debris flows, while debris flows may occur within the catchments at the R3 level when the rainfall values range from 21.0 to 38.6 mm. The rainfall data and generated susceptibility map are both resampled to a spatial resolution of 100 m since the spatial resolution of downscaled rainfall data is 1 km (Segoni et al. 2016). The empirical equations provide warnings for four debris-flow events and miss three events. However, integrating the susceptibility map and rainfall thresholds further improves the warning accuracy because the reclassified warning map forecasts five debris-flow events (Fig. 17b). Notably, a catchment in the southern part, initially classified as ‘R1 (Low hazard level)’ in Fig. 17a, is corrected to ‘High hazard (H3)’ in Fig. 17b. This correction underscores the efficacy of the integrated map in improving debris-flow forecasting precision. Meanwhile, unlike the susceptibility map alone, this map can also locate the debris-flow events and indicate their occurrence date.
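The three rainfall-threshold levels described above can be expressed as a simple classification. The sketch below is illustrative only: the R1 and R3 bounds are taken from the text, whereas the R2 interval (above 38.6 mm and below 67.8 mm) is inferred as the remaining range, and the treatment of values falling exactly on a boundary is an assumption. The function names are hypothetical, and the combination with susceptibility levels into H1 to H4 is not reproduced here.

```python
def rainfall_level(threshold_mm):
    """Classify a catchment's triggering intraday rainfall (mm) as R1, R2, or R3.

    R1: at least 67.8 mm (from the text)
    R2: above 38.6 mm and below 67.8 mm (inferred; not stated explicitly)
    R3: 21.0 to 38.6 mm (from the text)
    """
    if not 21.0 <= threshold_mm <= 97.5:
        raise ValueError("threshold outside the 21.0-97.5 mm range reported in the text")
    if threshold_mm >= 67.8:
        return "R1"
    if threshold_mm > 38.6:
        return "R2"
    return "R3"


def rainfall_exceeds_threshold(intraday_mm, threshold_mm):
    """Return True when the intraday rainfall reaches the catchment threshold."""
    return intraday_mm >= threshold_mm


# Hypothetical catchment thresholds (mm) used only to exercise the function.
example_levels = [rainfall_level(t) for t in (25.0, 50.0, 80.0)]
```

A catchment at the R3 level is the most easily triggered of the three, since it needs the least intraday rainfall, which is why R1 corresponds to the lowest rainfall-based hazard.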

However, limitations still exist since this reclassified map misses one event. This error may be due to the lack of historical debris-flow data, which limits the ability of the susceptibility map to forecast this debris-flow event. Overall, the validation results are positive, but the historical data used for validation are limited. Therefore, further studies are essential before this integration method is applied more widely.

5 Discussion

The selection of explanatory variables and of the method for establishing a precipitation downscaling relationship is pivotal in regional analysis. NDVI has been widely used for the downscaling of precipitation because of its consistent response to the spatial variation of precipitation at regional and global scales (Onema and Taigbenu 2009). However, climatic variables such as precipitation and temperature exhibit complex, nonlinear variations in three dimensions, namely elevation (DEM), longitude, and latitude (Bryan and Adams 2002). Therefore, NDVI, DEM, longitude, and latitude are considered by many studies (Jing et al. 2016; Elnashar et al. 2020). Nevertheless, the potential for a weak correlation between precipitation and location has been suggested, aligning with our correlation analysis findings (Chen et al. 2020). In this case, we conduct a sensitivity analysis to reveal the positive contribution of longitude to the accuracy improvement of precipitation downscaling. However, in the context of climate change, anthropogenic activities remarkably amplify the uncertainty in determining extreme rainfall events (Hardwick Jones et al. 2010). This is because the intensity of heavy precipitation events increases with rising temperature (Meehl et al. 2007). As a result, relying solely on NDVI, DEM, longitude, and latitude may prove insufficient for capturing extreme climate events, resulting in unstable precipitation downscaling accuracy and complicating the computation of debris-flow rainfall thresholds. To address this, we introduce land surface temperature (LST) as an additional factor alongside NDVI, DEM, Lon, and Lat, as a significant relationship exists between LST and precipitation (Wan et al. 2004; Brunsell 2006; Chai et al. 2021). LST serves as a critical indicator in describing Earth’s surface energy balance, encompassing latent heat flux, net radiant energy, sensible heat flux, soil heat flux, and regional land surface processes (Wan et al. 2004; Kalma et al. 2008).

In this study, the XGBoost machine learning method proves highly effective in downscaling TRMM data, outperforming the GWR method. GWR, while acknowledged in prior studies for its superiority over multiple linear regression and exponential regression owing to its incorporation of geographical location information, falls short in handling large datasets. This limitation arises because the dataset’s size can exponentially increase the computational demands of GWR. Consequently, to maintain processing speed, GWR aggregates high-resolution data into coarser geographical units, thereby overlooking geographical details (Harris et al. 2010). This processing mode can cause unstable accuracy of the downscaled precipitation data, and distributed computing capability is required to solve the problem (Harris et al. 2010). XGBoost avoids this problem because of its parallel and distributed computing abilities (Chen and Guestrin 2016). More importantly, its capacity to process large datasets enables XGBoost to establish the complex relationship between REVs and precipitation at a spatial resolution of 0.25° and to use this relationship to predict precipitation at a spatial resolution of 1 km.

In contrast to the grid units extensively used in previous debris-flow studies, our newly developed warning system employs individual catchments as the analysis units (Stavropoulou et al. 2010). Although grid units allow more rapid matrix subdivision than catchments, they cannot effectively encompass the topography and other environment-related information in the debris-flow analysis (Catani et al. 2005; Zou et al. 2019). This is because the initiation, mobility, and accumulation of debris flows are complex processes influenced by geological, hydrological, and geomorphological factors within an independent catchment. Therefore, we adopt catchment units in this study to construct a dynamic warning map, integrating susceptibility maps and rainfall thresholds for real-time debris-flow warnings. In mountainous areas, the warning levels can be updated with new inputs of downscaled precipitation data calibrated by the PERSIANN-CCS-CDR dataset. The warning map can be refreshed, and can clearly indicate the warning level of each catchment in a specific area, only if precipitation data for that region are available. However, limitations still exist. For example, only daily precipitation with a spatial resolution of 1 km, rather than hourly precipitation, is offered. Daily precipitation cannot indicate the occurrence time of debris flows, causing uncertainty in providing warnings for local people. Nevertheless, these uncertainties do not alter the fact that our study improves the accuracy of debris-flow warnings and provides scientific support for debris-flow mitigation in mountainous areas with scarce data.

6 Conclusion

Our study introduces a hybrid ML-GDA method to generate daily precipitation with a spatial resolution of 1 km at a regional scale based on NDVI, DEM, longitude, latitude, and LST. On the basis of the historical debris flows in the summer of 2007, we apply this method to calculate the rainfall thresholds in the Gyirong Zangbo Basin. Then, the rainfall thresholds are integrated with a susceptibility map to generate a debris-flow warning model defining four hazard levels (H1, H2, H3, and H4). The main conclusions are as follows:

(1) The machine learning model, XGBoost, effectively predicts seasonal precipitation with a spatial resolution of 0.01° based on the relationship established between seasonal precipitation and seasonal REVs at a spatial resolution of 0.25°.

(2) The ML-based method performs better than the GWR-based downscaling method in this area, and the calibrated TRMM data achieve a 57% decrease in absolute deviation compared with the original TRMM data. Additionally, the RMSEs and MAEs of the calibrated TRMM data are closer to 0.

(3) The integrated map successfully provides warnings for five debris-flow events in the summer of 2006 in the Gyirong Zangbo Basin, where seven debris-flow events were recorded from 26 to 30 July. Meanwhile, the warning level of one debris-flow event increases from R1 (rainfall threshold level) to H3 (warning level).

Although the method proposed in this study can effectively downscale TRMM data and support the analysis of rainfall thresholds in data-scarce mountainous areas, its wider application still requires further studies to increase the robustness of the machine learning model. With such improvements, further gains in the accuracy of the downscaled TRMM data can be expected.