Introduction

Rainfall affects the force balance of hillslopes and is thus critical for the initiation or acceleration of landslides. Intense rainfall may increase fluvial erosion and initiate debris flows (Gariano and Guzzetti 2016; Turkington et al. 2016), while rainfall totals decrease slope stability by raising water tables and pore pressure (Longoni et al. 2014; Ozturk et al. 2016), thus potentially reducing shear strength, suction, and cohesion, while increasing the soil weight (Tacher and Bonnard 2007). Hence, rainfall properties, such as intensity and duration, are frequently used as predictor variables in models which assess and predict rainfall-induced landslides (Saito et al. 2010; Segoni et al. 2015). The increasing availability of satellite-based rainfall estimates and regional climate models provides this information now in unprecedented spatial and temporal resolutions. Yet, whether these technical developments potentially lead to a better prediction of the temporal and spatial patterns of rainfall-induced landslides remains largely unexplored.

Satellite rainfall missions, such as the Global Precipitation Measurement (GPM) mission with its Integrated Multi-satellitE Retrievals for GPM (IMERG), provide half-hourly rainfall products with global coverage and a spatial resolution of 0.1° (≈10 km), thus offering detailed information to support landslide hazard assessment in regions with scarce ground-based rainfall measurements (Guimarães et al. 2017). Alternatively, the recently released hourly ERA5 climate reanalysis has the same spatial resolution as IMERG and features several meteorological variables, such as precipitation intensity (Maussion et al. 2014; Turkington et al. 2014). Rainfall products with a spatial resolution of ≈10 km potentially improve our ability to link landslide activity to rainfall and aid in early warning (Nikolopoulos et al. 2017), which otherwise requires either a dense gauge network (<10 km resolution) or a ground-based rainfall radar network. Although new conversion methods and sensors have improved the precision of satellite estimates and rainfall reanalyses (Duan et al. 2015; Crisologo and Heistermann 2019), assessments of the accuracy of satellite rainfall estimates and reanalysis products show weak performance, particularly in mountainous regions (Martin and Scherer 1973; Hong et al. 2006; Andermann et al. 2011), which is a shortcoming for landslide nowcasts (Xu et al. 2017; Kirschbaum and Stanley 2018; Brunetti et al. 2018). Another important deficit is the inaccurate estimation of high-intensity and convective rainfall (Kirschbaum et al. 2009; Kidd et al. 2013; Chikalamo et al. 2020). For example, satellite-based rainfall products underestimate accumulated rainfall amounts when compared with accumulations obtained from ground-based radars (Kubota et al. 2014; Speirs et al. 2017), which are operationally used for landslide early warning by only a handful of countries such as Japan and Taiwan (Chiang and Chang 2009; Osanai et al. 2010).

Landslide hazard assessment often relies on the determination of critical thresholds for landslide initiation based on measures of rainfall intensity and duration (Leonarduzzi et al. 2017; Tang et al. 2019). Because other factors also determine how rainfall translates into increased landslide activity, these thresholds are usually determined empirically using historical data and commonly apply to individual geographic regions only (Tang et al. 2019). An adequate model should primarily hindcast landslide activity temporally (Liao et al. 2010; Chikalamo et al. 2020). Spatially distributed rainfall estimates may further increase the predictive power of landslide models as long as they capture the spatial patterns of rainfall and duration (Rossi et al. 2017b). For example, the grid cells of satellite rainfall estimates in which landslides occurred are commonly used as the sole input to train nowcasting models. Zero accumulated event rainfall limits this application, because some landslide events are not matched by rainfall at their location (Jia et al. 2020) even when the landslide location and time of occurrence are accurate (Froude and Petley 2018). Another bias may arise from missing landslide data (Chleborad et al. 2008). We refer to the ability of a gridded rainfall product to accurately capture the storm center and extent as spatial consistency.

Thresholds based on intensity and duration of rainfall are successfully used to issue early warnings to mitigate landslide-related losses (Osanai et al. 2010; Capparelli and Versace 2011). Besides the triggering rainfall event, landslide distribution and orientation are further controlled by the main landslide causes (predictors): the surrounding morphology, geology, and land cover (Chigira et al. 2013; Reichenbach et al. 2018). For instance, primary controls on the location of rainfall-induced landslides are topographic indices such as hillslope inclination modulated by geology (Kojima et al. 2015; Ozturk et al. 2018). These topographic and geologic slope instability factors may become trivial compared to the main trigger, i.e., rainfall intensity and accumulation, once the existing landslide data sufficiently reflect the diversity of those factors.

Here, we aim to test the ability of satellite rainfall estimates (i.e., IMERG) and rainfall reanalyses (i.e., ERA5) to support the characterization of the spatial distribution of landslides. The temporal accuracy of gridded rainfall products has frequently been tested (Nikolopoulos et al. 2017; Chikalamo et al. 2020). We test whether these rainfall products are sufficiently spatially consistent to improve a landslide hindcast model. Contrary to the convenient practice of assessing the efficiency of satellite rainfall at the resolution of the rainfall product (e.g., Jia et al. 2020), we work at the resolution of the digital elevation model (DEM). First, we train a logistic regression model with geomorphometric and geologic variables to establish a base model for landslide prediction (Braun et al. 2015). In a second step, we complement the model with predictor variables derived from different rainfall products and test whether these variables increase the performance of the models.

Study area and data

We assessed three different rainfall datasets. First, we used 1-hourly rainfall totals derived from radar/rain gauge–analyzed precipitation (R/A) by the Japan Meteorological Agency (JMA). JMA has been operating R/A since 1988 to measure the nationwide rainfall distribution and to prevent rainfall-related disasters across Japan (Makihara et al. 1996; Shimpo 2001). R/A is well known as an accurate rainfall product based on original radar data at 5-min intervals with a spatial resolution of ≈1 km, calibrated by a dense network of rain gauges via the Automated Meteorological Data Acquisition System (AMeDAS) (Makihara et al. 1996; Kamiguchi et al. 2010; Urita et al. 2011; Ishizaki and Matsuyama 2018; Hirockawa et al. 2020). Although the hourly rainfall accumulation of weather radar underestimates hourly rain gauge readings by <10%, there is a high agreement with daily rain gauge measurements (Suzuki et al. 2017). Thus, we considered the radar data as the benchmark against which to compare the satellite- and reanalysis-based rainfall estimates. Second, we applied the final product of the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (IMERG, version 6) mission (Huffman et al. 2019). IMERG rainfall estimates are provided at half-hourly intervals at 0.1° (≈10 km) resolution. Third, we analyzed the ERA5 reanalysis rainfall estimates, which are provided at hourly intervals at 0.1° (≈10 km) resolution (C3S 2017).

We chose two events with contrasting spatial and temporal patterns of rainfall. In July 2017, a torrential storm hit southwestern Japan in Fukuoka Prefecture (Hazarika et al. 2020). More than 300 mm of rain fell within 12 h (July 7, between 00:00 and 12:00 UTC), triggering ≈2000 shallow landslides, the majority (>80%) of which detached from hillslopes underlain by Mesozoic schist and Cretaceous granodiorite, whereas the parts of the area underlain by volcanic rocks show a lower landslide density (Fig. 1a). One year later, another northeasterly frontal storm led to persistent rainfall for about a week over the whole of southwestern Japan. Particularly intense rainfall was recorded for about 75 h, between July 5, 00:00 and July 8, 03:00, 2018. This second event triggered around 8500 landslides, mostly shallow soil slides with debris flows, in Hiroshima Prefecture over an area of 30×100 km (Miura 2019). The distribution of these landslides overlaps with the area of >250 mm cumulative rainfall in Hiroshima (Goto et al. 2019). These debris flows occurred predominantly over two geological formations: Cretaceous volcanic rocks (mostly rhyolite and felsic pyroclastic rocks) and Cretaceous granite, which occupy most of this region (Fig. 1b). The landscape in both the Fukuoka and Hiroshima areas exhibits similar topographic features characterized by low relief, with steep soil-mantled forested hillslopes. Lithological conditions cause only minor variations in the rugged topography. Elevation ranges from 4 (0) to ≈1200 (≈920) m in Fukuoka (Hiroshima) and tends to be higher in volcanic rocks than in other lithologies in both areas, but without drastic differences within the areas. Landslides generally occurred in hollows in headwaters, with source areas on the order of 10²–10³ m². Both events occurred in southwestern Japan, where heavy rainfall is common due to frequent frontal storms and tropical cyclones (Ozturk et al. 2019; Hirockawa et al. 2020). The rainfall events in 2017 (Fukuoka) and 2018 (Hiroshima) were also caused mainly by frontal storms after the passage of a tropical cyclone, with short-term intense rainfall over a small area in Fukuoka and long-lasting heavy rainfall over a large area in Hiroshima (Tsuji et al. 2020), and with maximum hourly rainfall exceeding 60 mm (30 mm) in Fukuoka (Hiroshima). The total rainfall accumulations were about 20% of the annual rainfall in the respective regions (Ozturk et al. 2019). For both the Fukuoka and Hiroshima events, the landslide data are provided by the Geospatial Information Authority of Japan (GSI, Table 1). Landslides were mapped manually from aerial imagery with sub-meter resolution. We chose study areas that cover 99% of all the landslides associated with each event (Fig. 1): 910 (3550) km² for Fukuoka (Hiroshima).

Fig. 1

Overview of the study areas. a The Fukuoka area (910 km²) on the major geological formations, together with rainfall contours; b the same for the Hiroshima area (3550 km²). A simplified version of the Seamless Digital Geological Map of Japan is shown in the figure

Table 1 List of landslide predictors used in the model

For the Fukuoka event, we extracted the 12 h cumulative rainfall and the maximum hourly rainfall, which was recorded at 7:00 UTC. To study the Hiroshima event, we analyzed the 75 h cumulative rainfall, as well as the maximum hourly rainfall recorded at 11:00 on July 6. In addition, we derived the maximum rain accumulation within 12 h, from 9:00 to 20:00 on July 6, 2018, for direct comparison with the Fukuoka event. All the rainfall products are shown in Figs. 2, 3, 4, and 5. The exact timing of the maximum rainfall, i.e., the maximum hourly rainfall reading across the whole study area, differs between GPM and ERA5 (Fig. 2).

Fig. 2

Maximum hourly rainfall based on a, b ERA5, c, d GPM, and e, f ground radar rainfall products. Left and right panels show the Fukuoka and Hiroshima sites, respectively. The ground radar observed the maximum rainfall on 05/Jul/2017 7:00 in Fukuoka and on 06/Jul/2018 11:00 in Hiroshima. GPM observations are almost identical (−30 min), but the ERA5-based maximum rainfall is on 05/Jul/2017 11:00 (05/Jul/2018 13:00) in Fukuoka (Hiroshima). Black points are the landslide crowns, and the red solid line highlights the study area

Fig. 3

Twelve-hour cumulative rainfall based on a, b ERA5, c, d GPM, and e, f ground radar rainfall products. Left and right panels show the Fukuoka (05/Jul/2017 00:00–12:00) and Hiroshima (06/Jul/2018 09:00–20:00) sites, respectively. Black points are the landslide crowns, and the red solid line highlights the study area

Fig. 4

Seventy-five-hour cumulative rainfall based on a, b ERA5, c, d GPM, and e, f ground radar rainfall products. Left and right panels show the Fukuoka and Hiroshima (05/Jul/2018 00:00–08/Jul/2018 03:00 UTC) sites, respectively. The left panels (a, c, e) are left blank because the Fukuoka event lasted only about 12 h. Black points are the landslide crowns, and the red solid line highlights the study area

Fig. 5

Soil water index (SWI) computed using the a, b ERA5, c, d GPM, and e, f ground radar rainfall products in 5 km resolution. Left and right panels show the Fukuoka and Hiroshima sites, respectively. The computation of the SWI in Fukuoka (Hiroshima) starts 48 h prior to the start of the event spanning 5/July/2017 00:00 (3/July/2018 00:00)–7/July/2017 12:00 (8/July/2018 03:00) UTC. Black points are the landslide crowns, and red solid line highlights the study area

Antecedent conditions constitute an important control on the response of hillslopes to precipitation. These conditions include both pre-event rainfall and soil conditions, which affect soil suction and pore pressure (Glade et al. 2000). We estimate the antecedent water content of the surface layer by integrating the rainfall time series via the soil water index (SWI), which is operationally used for early warning purposes at 5 km resolution (Osanai et al. 2010). The SWI is computed based on a linear three-bucket model (Chen et al. 2017) for each of our rainfall products (reprojected to 5 km resolution using nearest neighbor interpolation); the rainfall amount creates an initial water budget, which decreases linearly with time unless new rainfall arrives. We started computing the SWI two days prior to the events, thereby also covering the antecedent period of relatively little rainfall. In addition to the direct rainfall data, we also considered the maximum SWI during the events (Fig. 5).
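The linear depletion of the water budget can be illustrated with a single store. This is a deliberately reduced sketch and not the three-bucket model used for the operational SWI: the function name `water_budget`, the loss rate, and the hourly rainfall series are hypothetical values chosen only for illustration.

```python
def water_budget(rain_series, loss_per_step):
    """Track one water store that gains rainfall and loses a fixed amount per step.

    Simplifying assumption: a single bucket with a constant (linear) loss,
    floored at zero. The operational SWI uses three stores instead.
    """
    budget, series = 0.0, []
    for rain in rain_series:
        # Deplete the stored water linearly, then add the new rainfall.
        budget = max(budget - loss_per_step, 0.0) + rain
        series.append(budget)
    return series


# Hypothetical hourly rainfall (mm) and loss rate (mm per hour).
hourly_rain = [10.0, 0.0, 0.0, 20.0, 0.0]
budget_series = water_budget(hourly_rain, loss_per_step=3.0)
max_budget = max(budget_series)  # analogous to taking the maximum SWI of an event
print(budget_series, max_budget)
```

Taking the maximum of the series mirrors the use of the maximum SWI during an event as a single predictor value per grid cell.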

Our analysis relied on the ALOS World 3D digital elevation model (DEM), which is provided by the Japan Aerospace Exploration Agency (JAXA) and its advanced land observing satellite (ALOS) project with a horizontal resolution of 1″ (≈30 m) (Tadono et al. 2015). The DEM forms the basis for computing hillslope inclination and visualization using TopoToolbox (Schwanghart and Scherler 2014), as well as the total curvature according to von Specht et al. (2019). Data on major geological units are obtained from the Seamless Digital Geological Map of Japan (scale of 1:200,000) by the Geological Survey of Japan (Yamada et al. 1986; Kubo et al. 1993; Fig. 1). All the rainfall data are reprojected to the DEM extent and resolution using nearest neighbor interpolation, which keeps the original shape of the data (Table 1; Akima 1970).
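Nearest neighbor resampling of a coarse rainfall grid onto a finer grid can be sketched as follows. The sketch assumes that both grids share the same upper-left corner and have square cells; the function name `resample_nearest`, the cell sizes, and the rainfall values are hypothetical and only illustrate the principle.

```python
def resample_nearest(coarse, coarse_res, fine_res, fine_shape):
    """Sample a coarse grid at the cell centers of a finer grid.

    Assumption: both grids share the same upper-left corner, and
    resolutions are given in the same map units (e.g., meters).
    """
    n_rows, n_cols = fine_shape
    last_row, last_col = len(coarse) - 1, len(coarse[0]) - 1
    fine = []
    for i in range(n_rows):
        # Coarse row that contains the center of fine row i.
        src_i = min(int((i + 0.5) * fine_res / coarse_res), last_row)
        row = []
        for j in range(n_cols):
            src_j = min(int((j + 0.5) * fine_res / coarse_res), last_col)
            row.append(coarse[src_i][src_j])
        fine.append(row)
    return fine


# Hypothetical 2x2 rainfall grid (mm) with 10 km cells, resampled to 2.5 km cells.
rain = [[10.0, 20.0], [30.0, 40.0]]
fine = resample_nearest(rain, coarse_res=10_000, fine_res=2_500, fine_shape=(8, 8))
print(fine[0][0], fine[3][4], fine[4][3], fine[7][7])
```

Because every fine cell copies the value of one coarse cell, no new rainfall values are created and the blocky shape of the original grid is preserved.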

Methods

We estimated the probability of a given DEM cell being classified as a landslide using multivariate logistic regression. The ease of model comparison is why we preferred logistic regression to more advanced machine learning techniques such as random forests, or support vector machines (Jones et al. 2021), which often outperform logistic regression in classification performance (e.g., Braun et al. 2015; Martinovic et al. 2016; Samia et al. 2018). If predictor variables are standardized to have zero mean and unit variance, predictor coefficients can be interpreted as weights that enable model comparison.
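The role of standardization can be illustrated with a small sketch, which is not the implementation used in this study. It fits a logistic regression by plain batch gradient descent on synthetic data; the predictor values, the coefficients used to generate the labels, the learning rate, and the function names `standardize` and `fit_logistic` are all assumptions made for illustration.

```python
import math
import random


def standardize(column):
    """Rescale one predictor to zero mean and unit variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def fit_logistic(rows, labels, rate=0.5, epochs=500):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, k = len(rows), len(rows[0])
    weights, intercept = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            z = intercept + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - rate * g / n for w, g in zip(weights, grad_w)]
        intercept -= rate * grad_b / n
    return weights, intercept


# Synthetic cells (assumption): slope in degrees, curvature in 1/m.
rng = random.Random(0)
slope = [rng.uniform(5, 45) for _ in range(300)]
curvature = [rng.gauss(0, 0.01) for _ in range(300)]
labels = []
for s, c in zip(slope, curvature):
    p = 1.0 / (1.0 + math.exp(-(0.15 * (s - 25) + 20 * c)))
    labels.append(1 if rng.random() < p else 0)

rows = list(zip(standardize(slope), standardize(curvature)))
weights, intercept = fit_logistic(rows, labels)
print(dict(zip(["slope", "curvature"], weights)))
```

On the raw scale, the generating coefficient of curvature (20) is far larger than that of slope (0.15), yet after standardization slope carries the larger weight, because one standard deviation of slope changes the log odds more than one standard deviation of curvature. This is the sense in which standardized coefficients can be compared as weights.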

We investigated three models based on different predictor combinations (Table 2). The base model includes only elevation, hillslope inclination, and total curvature along with the geology as landslide predictors (Table 1). These topographic features and geology are commonly ranked very high in landslide susceptibility studies (e.g., Schicker and Moon 2012; Althuwaynee et al. 2014; Meyer et al. 2014; Ozturk et al. 2020). Additional topographic, land cover, or land use related covariates potentially increase the performance of the base model, yet may also lead to overfitting. As we are primarily interested in how far the incorporation of rainfall products increases predictive performance, we avoided including any other metrics.

Table 2 Extra covariates that are used in model 2 and model 3 in addition to the covariates of the base model. For example, model 2.7 uses the 12 h cumulative rainfall obtained from GPM IMERG rainfall estimates, whereas model 3.3 uses the SWI computed from ERA5 rainfall estimates in addition to the base model. Models with bold numbers are shown in Figs. 6 and 7

A second model included one additional covariate obtained from the available rainfall products, i.e., the maximum hourly intensity or the 12 h cumulative rainfall. For the Hiroshima event, we also included the 75 h cumulative rainfall. This model allowed us to analyze the improvements in predictive performance relative to the base model. We expected that radar-based rainfall records would provide the largest improvement to the model. We additionally coarsened the radar-based rainfall (RainLow, Table 2) to the IMERG/ERA5 resolution for a fair comparison. The influence of rainfall on landsliding might be indirect; for example, rain water might accumulate in certain locations due to flow diversions, or the accumulated water budget might increase over time at some hillslopes. In the third model, we therefore included a rainfall derivative, the maximum soil water index (SWI), on top of our base model.

All models involve equal numbers of landslide and non-landslide cells, where non-landslide cells do not belong to any of the landslide polygons. Non-landslide cells are bootstrapped 100 times for each model, and a 10-fold cross-validation framework determines the training and testing sets. Model performance is assessed via the receiver operating characteristic (ROC) curve (Costache 2019). The ROC curve is a graphical illustration of the diagnostic ability of a binary model. In addition, we report the area under the ROC curve (ROC-AUC, μ), which provides a measure of overall model performance. ROC-AUC values close to 0.5 (50%) equate to the performance of a random classifier, whereas 1 (100%) indicates a perfect classification.
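The evaluation scheme can be sketched in a few lines. The example below computes the ROC-AUC as the probability that a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell, and averages it over 100 random draws of non-landslide cells. It is a minimal illustration only: the scores are hypothetical, the draws are made without replacement (an assumption), and the 10-fold cross-validation is omitted.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as the share of (landslide, non-landslide) pairs ranked correctly.

    Ties between a landslide and a non-landslide score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for landslide cells and a larger pool of other cells.
landslide_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
pool_scores = [0.7, 0.5, 0.45, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

rng = random.Random(42)
aucs = []
for _ in range(100):
    # Draw as many non-landslide cells as there are landslide cells.
    drawn = rng.sample(pool_scores, len(landslide_scores))
    labels = [1] * len(landslide_scores) + [0] * len(drawn)
    aucs.append(roc_auc(labels, landslide_scores + drawn))

mean_auc = sum(aucs) / len(aucs)
print(round(mean_auc, 3))
```

Averaging over repeated draws reduces the dependence of the reported performance on any particular selection of non-landslide cells.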

As the number of predictors in our models varies, the log likelihood values of the models may also differ. As we want to penalize overly complex models and favor parsimony, we tested the goodness-of-fit of the models via the Akaike information criterion (AIC) (Akaike 1974; Samia et al. 2020). We normalized the AIC of the test models by the AIC of the base model; hence, AIC values <1 indicate a better fit compared to the base model in our analyses.
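The normalization can be made explicit with the standard definition AIC = 2k − 2 ln L, where k is the number of fitted parameters and ln L the maximized log likelihood. The following sketch uses hypothetical log likelihoods and parameter counts; the function names are chosen for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def relative_aic(ll_test, k_test, ll_base, k_base):
    """AIC of a test model divided by the AIC of the base model."""
    return aic(ll_test, k_test) / aic(ll_base, k_base)


# Hypothetical values: the base model has 5 parameters, the test model adds one.
ratio_better = relative_aic(ll_test=-300.0, k_test=6, ll_base=-600.0, k_base=5)
ratio_same_fit = relative_aic(ll_test=-600.0, k_test=6, ll_base=-600.0, k_base=5)
print(round(ratio_better, 3), round(ratio_same_fit, 3))
```

A ratio below 1 means that the gain in likelihood outweighs the penalty for the extra predictor, whereas an extra predictor that leaves the likelihood unchanged yields a ratio slightly above 1.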

We additionally compared the spatial patterns of the different rainfall products using the root-mean-square error (RMSE) and the normalized 2-D cross-correlation (Lewis 2001). The RMSE shows the mean difference between the rainfall products in observation units. The normalized 2-D cross-correlation reveals the similarity of different products on a scale from −1 to 1, while the location of the maximum correlation indicates potential spatial offsets.
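Both measures can be sketched on small grids. The example below is a brute-force illustration and not the fast algorithm of Lewis (2001): it computes the RMSE between two grids and searches integer cell shifts for the highest Pearson correlation over the overlapping cells. The grids, the search radius, and the function names are hypothetical.

```python
import math


def rmse(a, b):
    """Root-mean-square difference between two equally sized grids."""
    sq = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return math.sqrt(sum(sq) / len(sq))


def pearson(u, v):
    """Pearson correlation of two equally long sequences (0 if one is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def best_shift(reference, test, max_shift):
    """Find the (row, col) shift of `test` that correlates best with `reference`."""
    rows, cols = len(reference), len(reference[0])
    best_corr, best_offset = -2.0, (0, 0)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            u, v = [], []
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:  # overlapping cells only
                        u.append(reference[r][c])
                        v.append(test[rr][cc])
            corr = pearson(u, v)
            if corr > best_corr:
                best_corr, best_offset = corr, (dr, dc)
    return best_corr, best_offset


# Hypothetical storm: same shape, but centered 1 row and 2 columns away.
reference = [[100 * math.exp(-((r - 3) ** 2 + (c - 3) ** 2) / 2) for c in range(8)]
             for r in range(8)]
shifted = [[100 * math.exp(-((r - 4) ** 2 + (c - 5) ** 2) / 2) for c in range(8)]
           for r in range(8)]

corr, offset = best_shift(reference, shifted, max_shift=3)
print(round(rmse(reference, shifted), 2), round(corr, 3), offset)
```

Multiplying the offset in cells by the grid resolution gives the spatial displacement of the storm center between two products.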

Results

We combined four different rainfall products with our base model, each in the form of maximum hourly rainfall, cumulative rainfall, and soil water index (SWI), which results in a total of 12 evaluated models for Fukuoka and 16 for Hiroshima. Here, we only show the best performing rainfall products: the 12 h cumulative rainfall of ground radar and ERA5, and the hourly maximum rainfall of IMERG. ROC curves of the other models with their predictor weights are provided in the appendix (Appendix Figs. 8, 9, 10, and 11).

The base model has an average performance of 67% (71%) in Fukuoka (Hiroshima, Fig. 6a, d) as measured by ROC-AUC. Hillslope inclination ranks as the most important predictor at both sites, with a weight more than five times that of the second most important predictor, total curvature (Fig. 7). Including rainfall data improves the model performance and considerably alters the distribution of predictor weights.

Fig. 6

Receiver operating characteristic (ROC) curves of the best performing models. a, b, and c show the results using the 12 h cumulative rainfall of ground radar, the maximum hourly rainfall of IMERG, and the 12 h cumulative rainfall of ERA5, respectively, for the Fukuoka region, whereas d, e, and f show the same (with 75 h cumulative rainfall) for the Hiroshima region. RainLow indicates the rainfall radar with a coarser resolution equal to that of GPM IMERG and ERA5. The soil water index (SWI) is always the maximum during the entire event, computed starting 2 days before. Mean ROC-AUC values (μ) of the models are shown in parentheses. All possible combinations are shown in Appendix Figs. 8 and 9

Fig. 7

Parameter weights of the models that are listed in Fig. 6. Models that use rainfall as a predictor on top of the base model are shown in a for the Fukuoka and b for the Hiroshima region. c and d list the parameter weights of the models that use the soil water index (SWI) on top of the base model. All the weights are normalized to allow cross-model comparison. Vertical lines on top of the bars show the mean standard error of the parameter weight in the bootstrap domain. All possible combinations are shown in Appendix Figs. 10 and 11

Models that use ground radar rainfall reach a ROC-AUC of 96% (80%), improving on the base model by nearly 30% (10%) in Fukuoka (Hiroshima, Fig. 6a, d). SWI models based on radar rainfall are comparable (≈1% difference). Moreover, the radar rainfall products (i.e., cumulative rainfall, SWI) become the most important predictor in all the models, indicating rainfall control over the general landslide distribution (Fig. 6). Considering the improvements that result from including radar-based rainfall data, we expect to observe ROC-AUC values between 67 and 96% for the Fukuoka event and between 71 and 80% for the Hiroshima event from the models that include IMERG or ERA5 data.

Yet, including satellite- or reanalysis-derived rainfall products only marginally improves model performance. Including IMERG products leads to a 2–3% improvement over the base model at both sites (Fig. 6b, e). The model that uses SWI performs best, with a ROC-AUC value of 71% (73%) in Fukuoka (Hiroshima). However, the AIC returns nearly identical values in both study areas, which suggests that including IMERG rainfall adds no relative information to the model. Nevertheless, IMERG rainfall ranks higher than hillslope inclination in the model of the Fukuoka site (Fig. 7a), and it ranks comparably to the other four predictors (±5%) in Hiroshima (Fig. 7b). Similarly, the SWI of IMERG is the most heavily weighted predictor in the Fukuoka case, but among the least weighted in Hiroshima.

ERA5 rainfall improves the base model (4% ROC-AUC) only when using SWI in the case of Fukuoka, with nearly no change in AIC (≈6% in relative terms), whereas there is no improvement in Hiroshima. ERA5 rainfall data (SWI) replaces the most important predictor at the Hiroshima (Fukuoka) site (Fig. 7b, c). Rainfall data are neutral in Fukuoka, while SWI is neutral in Hiroshima (Fig. 7a, d).

The root-mean-square error (RMSE) between radar rainfall and both the IMERG and ERA5 estimates is <200 mm/h in Fukuoka, with a correlation of <0.6. The location of this maximum correlation is offset by ≈16 km (≈8 km) from the ground-radar data for IMERG (ERA5). In Hiroshima, the RMSE is similar to that in Fukuoka, but the correlation is higher: >0.7 for IMERG and >0.8 for ERA5. ERA5 shows nearly no shift, but the center of correlation is offset by 34 ± 20 km in the case of IMERG.

Discussion

Our study shows that hindcasting the locations of rainfall-induced landslides improves when spatiotemporal patterns of rainfall are taken into account. Ground-based rainfall radar attains the highest accuracy in characterizing these patterns, which is reflected by a boost in predictive performance of the landslide model (Fig. 6a). However, this advantage largely disappears if rainfall products are prone to large uncertainties and unable to capture the spatiotemporal rainfall patterns that determine the spatial distribution of landslides. Based on the results from our case studies, satellite- or reanalysis-derived rainfall products are not yet capable of capturing rainfall patterns in detail. Our results nonetheless emphasize the potential for these products to become valuable predictors as soon as they attain sufficient spatial accuracy.

We used two sites with contrasting rainfall patterns. The spatially confined range of the Fukuoka event, with a single rain burst within 12 h, puts the spatial accuracy of the rainfall data to the test. Here, gains in predictive performance are particularly high when using high-resolution radar data, whereas the reanalysis- and satellite-retrieved rainfall data fail to capture the spatial extent and temporal dynamics of the event (e.g., Figs. 2 and 3). IMERG-based cumulative rainfall has a patchy pattern when compared to the rainfall radar (e.g., Figs. 3d and 4d), and IMERG only partially captures the spatial distribution of maximum rainfall (Fig. 2d). ERA5, in contrast, shows a rather homogeneous rainfall pattern across both study areas, partly missing the spatial detail of rainfall (e.g., Figs. 2a, b and 3a, b). Accordingly, the models that use the rainfall data (incl. SWI) based on ground radar show a superior performance of >90%, with a 2–3 times better fit than the base model in terms of AIC. The importance of rainfall in those models is also emphasized by the predictor weights (Fig. 7). This is due to the spatial consistency between the spatially confined rainfall and the landslide concentration (Fig. 1a). The majority of the landslides (>50%) were restricted to a rather small region (10×20 km², Fig. 1a), which is comparable to the resolution of both IMERG and ERA5. The models using ground radar rainfall at low and high resolution show comparable performance (Fig. 6a, d), so the weak performance of the IMERG- and ERA5-based models cannot be attributed to their low resolution. On the contrary, the high performance of the coarser-resolution radar outputs emphasizes that the spatial resolution of IMERG and ERA5 is sufficient for landslide hindcasting (e.g., Wang et al. 2021).

At the Hiroshima site, intense rainfall lasted more than 3 days and the landslides were distributed over a range of 100 km with various geologies (Fig. 1b). Even the high-resolution rainfall product of the ground radar only marginally improved the base model in the extended event of Hiroshima (9%, Fig. 6d). This suggests that rainfall may not be among the main controls here, reflecting uncertainties that arise from the long-term accumulation of moisture, spatial variations of soil properties, and failure mechanisms. For example, local geological disparities could alter the susceptibility to landsliding. Granite and rhyolite are the major bedrocks where landslides occurred in Hiroshima (Fig. 1b). The granite hillslopes are known to be susceptible to short-term intense rainfall, whereas rhyolite hillslopes tend to be insensitive to such rainfall input because of their thicker soil cover on relatively gentle hillslopes (Watakabe and Matsushi 2019). The 9% improvement is nearly matched by the ERA5 data (7%, Fig. 6f), which correlate well (2D, >0.8) with the ground radar data. The SWI based on ERA5 is nearly homogeneously distributed across the Hiroshima area (Fig. 5b); accordingly, it has the lowest weight in the model (Fig. 7d) and hence makes no improvement to the base model (Fig. 6f). Although IMERG is spatially more consistent in Hiroshima than in Fukuoka, with a correlation of >0.8 with the radar data, the logistic model ranks the IMERG data very low (Fig. 7b, d). Apparently, IMERG misses the location of the storm center by about 34 ± 20 km across the different rainfall products (i.e., 75 h, 12 h, and maximum hourly), based on our 2D cross-correlation metric.

Considering the extreme rainfall amounts (return periods of more than 100 years) for such humid regions, estimating landslide locations is challenging without accurate rainfall information. A shift of the storm location by a few kilometers can strongly bias landslide hindcasting when landslides are spatially confined, as in the Fukuoka event. We observed a mismatch between IMERG estimates and ground radar rainfall when comparing them via normalized 2D cross-correlation. IMERG misses the location of the storm at both sites by >15 km (e.g., Appendix Fig. 8). ERA5 was able to locate the rainfall event in Hiroshima, which improved the model performance considerably (Appendix Fig. 9). Hence, locating the storm is the primary determinant for achieving an accurate landslide hindcast model. Our findings demonstrate that the low resolution of IMERG and ERA5 is not the main problem; the problem is rather their spatial inaccuracy. This is consistent with previous studies that compared ground radars with satellite rainfall data and found that, even after scaling the ground radar up to match the satellite data resolution, ground radars still captured more spatial variability than the satellite data (Speirs et al. 2017; Ramsauer et al. 2018). Beyond the under- or overestimation of the rainfall data, IMERG misses the level of spatial detail for both short- and long-duration events (Ramsauer et al. 2018; Cui et al. 2020). Another interesting aspect is that accumulated rainfall based on IMERG decreased the model performance. Although the 2–3% difference in model performance could be a random coincidence (Figs. 8 and 9), it could also indicate that residual differences accumulating over time decrease the adequacy of the IMERG product (O et al. 2017). Hence, we suspect that the elevation contrast in our study sites alters the quality of both the satellite (IMERG) and the reanalysis product (ERA5, Rossi et al. 2017a).

Our analysis relies on two case studies, both of which feature highly localized rainfall patterns. Generalizing our findings to severe rainfall conditions may not be possible if these are related to larger-scale frontal events that have sufficient extent to be reliably captured by satellite- and reanalysis-derived rainfall products. Notwithstanding, our results emphasize that adequate rainfall information could profoundly (≈30%) improve landslide hazard models for rainfall-induced cases (Osanai et al. 2010), and that the spatial accuracy of these estimates plays a key role (Rossi et al. 2017b). Our analysis also reveals that both the satellite- and reanalysis-based rainfall estimates only marginally improve the prediction of landslides at rather spatially confined sites, i.e., within the landslide-affected area (Bumke 2016; Chikalamo et al. 2020). To go beyond models that aim to increase situational awareness at the global scale (Kirschbaum and Stanley 2018), these estimates need improvement in their spatial accuracy rather than further refinement of their representation of rainfall amounts, for which they are often criticized (e.g., Thomas et al. 2019).

Conclusion

We used an approach based on logistic regression to test whether the satellite-derived rainfall estimates by IMERG and reanalysis data from ERA5 can improve landslide hindcasting. In addition, we used rainfall products based on ground radars as benchmarks. Our analyses covered two test events, the spatially confined Fukuoka event (10×20 km²) and the larger Hiroshima event (50×100 km², Fig. 1). Both sites exhibit a rugged and low-relief topography (≈0–1200 m) and are frequently exposed to frontal storms and tropical cyclones. While ground-based radar-derived rainfall estimates significantly increased model performance, exceeding 90% in Fukuoka, other rainfall products were unable to achieve similar improvements. In the Hiroshima case, ERA5 nearly matched the model that uses the radar rainfall estimates by increasing the performance of the base model by about 7%. This improvement suggests that gridded rainfall estimates would contribute considerably to landslide hindcasting once they accurately detect the spatial dimensions of the rainfall event. Our findings indicate that rainfall information could be the main control reflecting the spatial distribution of landslides in the case of a localized event, as in Fukuoka. In contrast, critical rainfall conditions might be attained over a larger area in the case of a widespread event, as in Hiroshima, which increases the weight of geology and morphometrics in the models. Our results further suggest that the shortcoming of IMERG and ERA5 is neither their coarse resolution nor potential consistent over- or underestimations, but rather their lack of spatial consistency and their limited ability to locate storm centers. This implies that global rainfall products are potentially beneficial for landslide hindcasting, but that this potential lies mainly in improvements in capturing spatial rainfall patterns rather than rainfall amounts.