Introduction

In recent years, flash floods have become a significant and disastrous type of natural calamity, posing considerable dangers to human life, infrastructure, and ecological systems (Elsadek et al. 2023; Wahba et al. 2024b). Global vulnerability to flooding has increased by more than 40% over the past two decades and is projected to keep rising; this trend is primarily attributable to urbanization and global climate change. Significant increases in flood susceptibility have been noted in the Asia-Pacific, Europe, and North America regions (Alfieri et al. 2017; Vaghefi et al. 2019). The increase in mean surface temperature, widely known as “global warming”, results from greenhouse gas emissions produced by human activities, together with subsequent physical feedback mechanisms operating within the Earth’s complex systems. In contrast, climate change refers to the process through which alterations in the typical patterns of climate variables occur at the local level (Janizadeh et al. 2021).

Flash floods develop swiftly and with considerable intensity, and they can cause extensive destruction in a matter of minutes. Although they have been documented across different historical periods, accumulating evidence indicates that both their frequency and severity are increasing, which raises concerns that climate change may exacerbate their detrimental effects. Therefore, accurate prediction of flood hazards is crucial for promoting effective flood management and preventative actions (Wahba et al. 2022).

Moreover, floods have become a profoundly concerning issue in numerous regions across the globe, with over a million people having lost their lives to storms and flooding events (Malik and Pal 2021). Climate change has notably escalated flood hazards through various mechanisms, including intensified precipitation that leads to heavier rainfall and more frequent storm surges. Flood hazard emerges from the interaction of several elements: adverse hydrological conditions, altered meteorological patterns, changes in landforms (geomorphology), and inadequacies in flood protection systems and infrastructure (Pal et al. 2022). The compounding effects of these elements heighten vulnerability to flooding, necessitating comprehensive and adaptive measures to mitigate the risks effectively.

Between June 28 and July 8, 2018, western Japan experienced a series of torrential rains that resulted in numerous floods and landslides. The devastation caused by this disaster ranked as the second most significant of its kind in the 21st century, after the Great East Japan Earthquake of 2011. As reported in April 2019, the death toll stood at 263, with 8 persons still missing and a further 484 injured. The impact on infrastructure was profound, with 6783 houses completely destroyed and 11,346 structures partially damaged (Okazaki et al. 2022).

Generating a flood susceptibility map (FSM) is essential, as it enables the identification of areas most vulnerable to flash floods, thereby facilitating the implementation of appropriate mitigation strategies. In this study, the FSM was produced using a variety of machine learning (ML) techniques, namely the artificial neural network multilayer perceptron regressor (ANN-MLP reg), support vector regression (SVR), the gradient boosting regressor (GBR), and the least absolute shrinkage and selection operator (LASSO). Moreover, the study examines how excluding the curvature factor affects the accuracy of the ML techniques; scrutinizing the impact of curvature on the FSM is one of the key novelties of this research. By exploring the predictive capabilities of these diverse ML models in the absence of the curvature factor, the research aims to shed light on their respective strengths and limitations in accurately predicting flood-prone areas. The omission of the curvature factor serves as a test of the robustness and reliability of the chosen models in producing flood susceptibility maps. The outcomes of this research provide insights into the performance of each ML model and contribute to the advancement of flood susceptibility mapping methodologies. Ultimately, the findings hold implications for flood risk management and decision-making processes, facilitating more effective and targeted mitigation strategies to safeguard vulnerable communities and infrastructure from the adverse impacts of flooding events.

Related work

FSMs serve as a vital instrument for decision-makers in formulating effective mitigation plans for regions impacted by the adverse consequences of flash floods. Consequently, the precision of these maps holds immense significance in determining the precise level of hazard. In this context, various methodologies have been employed to generate these maps, encompassing bivariate statistical approaches, multi-criteria decision-making analyses, and more recently, machine learning techniques. The present study employs machine learning techniques, as they exhibit considerable potential in achieving high performance and accuracy in forecasting flood hazard maps. Moreover, artificial neural networks (ANN), decision trees, logistic regression, random forest (RF), regression trees, and support vector machines (SVM) are among the extensively employed machine learning models for conducting flood risk assessments (Roozbeh Hasanzadeh and Tuan 2018; Kourgialas and Karatzas 2017; Gotham et al. 2018; Mojaddadi et al. 2017). Although numerous studies have focused on evaluating flood damage prediction, none of them have specifically investigated the impact of incorporating the curvature factor into machine learning models on the accuracy of the resulting FSMs.

Prior investigations concerning flood vulnerability mapping have emphasized the need for more precise and reliable models. Consequently, a number of researchers have expressed a preference for deep learning neural networks or hybrid ensemble models, as these approaches demonstrate superior capabilities in evaluating the susceptibility to flooding events (Abu Reza et al. 2021; Talukdar et al. 2020). Concurrently, hydrodynamic models can be employed to simulate runoff within a small-scale catchment area. For instance, Wahba et al. (2024a) simulated the runoff generated by the maximum predicted storm with a 100-year return period to assess the impact of flood hazards on buildings.

A wide array of methodologies has been employed to create spatial flood hazard maps. These methods encompass statistical indices, frequency ratios, Shannon’s entropy, generalized linear models, logistic regression, weights-of-evidence, multivariate discriminant analysis, weighting factors, flexible discriminant analysis, generalized additive models, multivariate logistic regression, and other multivariate statistical approaches, as documented in prior studies. Additionally, multi-criteria decision-making analysis has been applied by Giovannettone et al. (2018) and Youssef et al. (2016). Furthermore, machine learning techniques, including support vector machines (SVM), artificial neural networks (ANNs), least squares SVM (LSSVM), backpropagation ANNs, classification and regression trees (CART), and random forest (RF), have been leveraged by numerous researchers to develop FSMs, as evidenced in studies by Haoyuan et al. (2018) and Darabi et al. (2019).

Regarding SVR, Panahi et al. (2021) conducted an investigation into the performance of a standalone SVR model in generating Flood Susceptibility Maps (FSM). The study involved training the model using 9 environmental factors. The findings indicated that the accuracy of the standalone SVR model, as measured by the area under the ROC curve, reached 87%. However, this relatively lower accuracy might be attributed to the utilization of a smaller set of environmental factors. Nonetheless, the research also demonstrated that employing an ensemble model of SVR, which incorporated the Grasshopper Optimization Algorithm (GOA) and Particle Swarm Optimization (PSO), resulted in a significant improvement in the accuracy of the generated FSM. The use of these optimization techniques in conjunction with SVR contributed to enhancing the predictive capability of the model and, consequently, the accuracy of the produced Flood Susceptibility Maps.

On the other hand, GBR is an ensemble learning model founded on the principles of boosting, which minimizes prediction loss by iteratively fitting the residuals. Wu et al. (2022) successfully applied GBR and highlighted its efficacy as an efficient downscaling approach for generating high spatial resolution precipitation data.

Additionally, Gaagai et al. (2023) conducted an evaluation of groundwater quality employing a comprehensive approach, which encompassed two machine learning models, namely Artificial Neural Network (ANN) and Gradient Boosting Regressor (GBR), in conjunction with multivariate statistical analysis and Geographic Information System (GIS). The findings of their study revealed that the ANN model exhibited superior predictive capabilities over the GBR in terms of groundwater quality assessment. Moreover, they underscored the significance of leveraging physicochemical parameters and water quality indices, supported by GIS techniques, machine learning, and multivariate modeling, as a valuable and pragmatic strategy for both assessing the quality and facilitating sustainable development of groundwater resources.

In parallel, Linh et al. (2022) generated the FSM using the K-nearest neighbor (KNN) and Extreme Gradient Boosting (XGB) machine learning models, along with a hybrid of a genetic algorithm (GA) and the XGB model (GA-XGB); the GA-XGB model demonstrated the highest accuracy among these approaches. Furthermore, Pandey et al. (2021) integrated the frequency ratio (FR) and the evidential belief function (EBF) with classification and regression tree (CART) models, yielding the CART-FR and CART-EBF models, respectively. Their comparative analyses revealed that the CART-EBF model slightly outperformed the CART-FR model. Additionally, a novel Flash-Flood Propagation Susceptibility Index (FFPSI) was calculated using a combination of Weights of Evidence (WOE), Analytical Hierarchy Process (AHP), Logistic Regression (LR), Classification and Regression Trees (CART), and Radial Basis Function Neural Network-Weights of Evidence (RBFN-WOE). The study was conducted in the Zabala river basin, located in the mountainous central-southeastern part of Romania. The LR-WOE and AHP-WOE models showed the highest performance among the evaluated models (Costache et al. 2022).

Furthermore, the selection of causative factors is crucial in estimating the FSM and can significantly influence the overall accuracy of the resulting map. Consequently, this study aims to evaluate the impact of the chosen causative factors on the accuracy of FSM estimation, both prior to and following the incorporation of the curvature factor.

Study area

Ibaraki Prefecture is located in the Kanto region of Japan, on the eastern coast of Honshu, the main island of Japan. The prefecture borders Fukushima Prefecture to the north, Tochigi Prefecture to the northwest, Saitama Prefecture to the southwest, Chiba Prefecture to the south, and the Pacific Ocean to the east. Ibaraki Prefecture covers approximately 6100 square kilometers and has an estimated population of around 2.87 million. Figure 1 shows the geographical position of Ibaraki Prefecture. In 2019, the prefecture experienced a notable flood triggered by Typhoon Hagibis, which led to substantial destruction of property and infrastructure, as well as the loss of at least one life.

Fig. 1 The location of the study area

Material and data

In this study, a Digital Elevation Model (DEM) with a 30-m resolution was employed. The DEM is an essential element for flood susceptibility mapping and plays a significant role in hydrological and hydraulic simulations (Kepeng et al. 2021). The DEM data for Ibaraki Prefecture were sourced from the Yamazaki Lab website and underwent a correction process. The boundaries of Ibaraki were delineated using a polygon shapefile obtained from the DIVA-GIS website.

The land cover data used in the study were extracted from the ESRI website. Additionally, shapefiles for roads and rivers were sourced from the Geofabrik website. These layers are needed to calculate the distances to roads and rivers within ArcMap. Moreover, the identification of flooded and non-flooded locations was based on a survey of the hazard map portal of the Disaster Prevention Division, Ministry of Land, Infrastructure, Transport and Tourism of Japan. These spatial data are critical for generating the flood inventory map. Table 1 summarizes the data used.

Table 1 Description of the utilized data and materials

Methodology

This study comprises four primary stages: preparatory processing, estimation of the environmental factors, training of the machine learning models, and model validation. The initial step, preparatory processing, uses ArcMap software to process the Digital Elevation Model (DEM) in order to determine the flow direction, which is crucial for delineating potential streamlines and basins. Subsequently, the environmental factors are estimated and visualized. These factors are elevation, slope, distance to stream, distance to river, distance to road, Topographic Wetness Index (TWI), Stream Power Index (SPI), aspect, plan curvature, profile curvature, and land cover.

Moreover, the flooded and non-flooded points, along with the environmental factors, are merged and then split: 70% of the samples are used for training the machine learning models, while the remaining 30% are reserved for testing their performance. All the selected machine learning models, namely the Artificial Neural Network-Multilayer Perceptron (ANN-MLP), Support Vector Regression (SVR), the Gradient Boosting Regressor (GBR), and LASSO regression, are employed specifically for regression tasks and implemented in Python.
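
The 70/30 partition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ script: the feature matrix is a random placeholder for the 11 factors sampled at the inventory points, and the use of scikit-learn’s `train_test_split` with stratification and a fixed seed is assumed rather than reported in the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder for the merged inventory: 224 points x 11 environmental factors.
# Real values would be sampled from the GIS layers; these are random numbers.
X = rng.normal(size=(224, 11))
y = np.repeat([1, 0], 112)  # 112 flooded (1) and 112 non-flooded (0) points

# 70% training / 30% testing. Stratifying on y keeps the flooded/non-flooded
# balance in both subsets (an assumption; the study does not state it).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

print("train:", X_train.shape, "test:", X_test.shape)
```

With 224 points, scikit-learn rounds the test share up, so the split is 156 training and 68 testing samples.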

Following the training of the models, each model generated a Flood Susceptibility Map (FSM). The FSM was produced twice for each model: the first version included all 11 environmental factors, while the second version excluded the plan and profile curvature of the flood conditioning factors.

To evaluate the accuracy of the models, the area under the Receiver Operating Characteristic (ROC) curve was calculated. Additionally, a residual analysis was conducted, and performance measures such as R-squared, mean absolute error (MAE), and mean square error (MSE) were estimated. These measures were utilized to assess and compare the performance of the models. Figure 2 illustrates the methodological framework.
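
As a hedged illustration of the validation step, the sketch below computes the area under the ROC curve together with R-squared, MAE, MSE, and RMSE using scikit-learn. The label and prediction arrays are hypothetical values chosen for the example and are not results from this study.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    r2_score,
    roc_auc_score,
)

# Hypothetical test labels (1 = flooded, 0 = non-flooded) and model outputs.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])

metrics = {
    # AUC treats the regression output as a ranking score for the flooded class.
    "AUC": roc_auc_score(y_true, y_pred),
    "R2": r2_score(y_true, y_pred),
    "MAE": mean_absolute_error(y_true, y_pred),
    "MSE": mean_squared_error(y_true, y_pred),
}
metrics["RMSE"] = float(np.sqrt(metrics["MSE"]))

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every flooded point in this toy example receives a higher score than every non-flooded point, the AUC is 1 even though the error-based metrics are nonzero, which shows why the two kinds of measure are reported together.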

Fig. 2 The methodological framework

Artificial neural network-multi layer perceptron

Artificial Neural Networks (ANN) offer a straightforward approach to emulating the neural architecture of the human brain. By utilizing training samples, these networks enable the recognition of previously unseen data and facilitate decision-making and problem-solving related to the spatial correlation between input variables and the presence or absence of a specific phenomenon (Gomez and Kavzoglu 2005; Taravat et al. 2016). In addition, Multi-Layer Perceptron (MLP) is widely recognized as one of the most prominent types of Artificial Neural Networks (ANNs). Functioning as a robust modeling tool, the MLP employs a supervised training procedure that relies on data examples containing known outputs (Bishop 1995). The multilayer perceptron (MLP) represents a variant of a feed-forward neural network distinguished by the presence of a solitary output, numerous inputs, and the inclusion of one or more hidden layers (Murtagh 1991).

In this study, an ANN-MLP regressor model is used to predict the degree of hazard in a previously identified flooded area. The model was tuned over several hyperparameters, including the sizes of the hidden layers, the maximum number of iterations, the activation function, and the solver. Through optimization, the following values were selected: three hidden layers with 100, 50, and 30 nodes, respectively, a maximum of 300 iterations, the “tanh” activation function, and the “adam” solver. Figure 3 illustrates a conceptual depiction of the employed ANN-MLP regressor model. The first layer is the input layer, comprising a collection of neurons that encode the 11 input flood conditioning factors. The neurons within the hidden layers transform the weighted sum of the input values \((w_{1} x_{1} + w_{2} x_{2} + \cdots + w_{n} x_{n})\) by means of an activation function.
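
The reported configuration maps directly onto scikit-learn’s `MLPRegressor`, as sketched below. The training data are synthetic placeholders, and the feature standardization step and the random seed are assumptions added for the example; the study does not describe them.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for 224 inventory points with 11 conditioning factors.
X = rng.normal(size=(224, 11))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # 1 = flooded, 0 = non-flooded

mlp = make_pipeline(
    StandardScaler(),  # assumption: scaling is not described in the study
    MLPRegressor(
        hidden_layer_sizes=(100, 50, 30),  # three hidden layers, as reported
        max_iter=300,
        activation="tanh",
        solver="adam",
        random_state=0,  # assumption: fixed seed for reproducibility
    ),
)
mlp.fit(X, y)

susceptibility = mlp.predict(X)  # continuous susceptibility score per point
regressor = mlp[-1]
print("layers (input + hidden + output):", regressor.n_layers_)
print("first weight matrix shape:", regressor.coefs_[0].shape)
```

The first weight matrix has shape (11, 100), one row per conditioning factor and one column per neuron in the first hidden layer.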

Fig. 3 A comprehensive framework for the proposed regression model based on ANN-MLP architecture

The loss curve was determined by computing the Mean Squared Error (MSE) at each iteration. By evaluating the average squared discrepancy between predicted and actual values, the MSE served as a metric to gauge the model’s performance. Figure 4 illustrates the estimated loss curve.

Fig. 4 The estimated loss curve

This process allowed for the tracking of the loss curve, providing valuable insights into the convergence behavior and the quality of predictions throughout the training iterations. The MSE formula is described in Eq. (1).

$$\begin{aligned} MSE = \frac{1}{n} \sum _{i=1}^{n} (y_i - {\hat{y}}_i)^2 \end{aligned}$$
(1)

where \(n\) is the total number of samples, \(y_i\) represents the true values, and \({\hat{y}}_i\) denotes the predicted values.
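
One way to trace such a loss curve is sketched below: the network is trained one pass at a time with `partial_fit`, and Eq. (1) is evaluated after every pass. This is an illustrative assumption about the procedure, run on synthetic data for 50 passes; note that the `loss_curve_` attribute that scikit-learn records internally is based on half the squared error, so it is not numerically identical to Eq. (1).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor


def mse(y_true, y_pred):
    """Mean squared error, exactly as written in Eq. (1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


rng = np.random.default_rng(0)
X = rng.normal(size=(224, 11))                    # synthetic factors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # synthetic flood labels

model = MLPRegressor(
    hidden_layer_sizes=(100, 50, 30),
    activation="tanh",
    solver="adam",
    random_state=0,
)

loss_curve = []
for _ in range(50):          # each partial_fit call is one pass over the data
    model.partial_fit(X, y)
    loss_curve.append(mse(y, model.predict(X)))

print(f"first pass MSE: {loss_curve[0]:.4f}, last pass MSE: {loss_curve[-1]:.4f}")
```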

Support vector regression

Support vector regression (SVR) is a supervised machine learning algorithm that can serve both as a prediction model and as an effective tool for pattern recognition problems (Vapnik 1999). The training dataset is a collection of paired input and target instances \(\{(x_1, y_1), \ldots , (x_n, y_n)\}\), and the corresponding predicted values (\(y_{i} \in R^{n}\)) can be determined by means of a linear approximation function (\(f(x)\)), as given in Eq. (2). The precision of the fit is governed by the \(\varepsilon\)-deviation, which measures the discrepancies between the predicted and actual outputs for each pair of training samples.

$$\begin{aligned} f(x) = w\varphi (x) + b \end{aligned}$$
(2)

where \(f(x)\) is the predicted output, \(w\) represents the weight vector, \(\varphi (x)\) denotes the feature transformation function applied to the input \(x\), and \(b\) is the bias term. According to the theory of structural risk minimization, the values of \(w\) and \(b\) can be determined using the following formula:

$$\begin{aligned} & {\text {Minimize}}: \left[ \frac{1}{2}\left\| w \right\| ^{2} + C\sum \limits _{i=1}^{n} (\xi _{i} + \xi _{i}^{*}) \right] \\ & {\text {Subject to}}: \left\{ \begin{aligned} y_{i} - (w\varphi (x_{i}) + b) \le \varepsilon + \xi _{i} \\ (w\varphi (x_{i}) + b) - y_{i} \le \varepsilon + \xi _{i}^{*} \\ \xi _{i},\xi _{i}^{*} \ge 0 \\ \end{aligned} \right. \end{aligned}$$

where \(\xi _{i}\) and \(\xi _{i}^{*}\) are slack variables that contribute to the minimization of the objective function (Rezaie et al. 2022). In addition, some essential parameters of the SVR model are illustrated in Fig. 5.

Fig. 5 Schematic sketch for the linear SVR

The constant “C” in support vector regression (SVR) serves as a regularization parameter that governs the balance between minimizing training error and allowing model flexibility. By controlling the penalty for margin violations and deviation from the desired regression fit, “C” influences the complexity of the SVR model and its tolerance towards outliers. Higher “C” values lead to tighter margins and potentially improved training fit, prioritizing accuracy over generalization. Conversely, lower “C” values result in wider margins, allowing for greater tolerance of violations and potential enhanced generalization. Optimal “C” selection involves considering dataset characteristics, problem complexity, and the desired trade-off between training accuracy and generalization. The mathematical formulation of support vector regression can be expressed in Eq. (3) according to Huaizhi et al. (2018).

$$\begin{aligned} f(x) = \sum \limits _{i=1}^{n} (\alpha _{i} - \alpha _{i}^{*}) k(x,x_{i}) + b \end{aligned}$$
(3)

where \(\alpha _{i}\) and \(\alpha _{i}^{*}\) are the Lagrange multipliers and \(k(x,x_{i}) = \langle \varphi (x),\varphi (x_{i})\rangle\) is the kernel function. In this study, the Radial Basis Function (RBF) was chosen as the kernel function due to its favorable characteristics, as suggested by Huang et al. (2020). With this kernel, Eq. (3) can be written as Eq. (4):

$$\begin{aligned} f(x) = \sum \limits _{i=1}^{n} (\alpha _{i} - \alpha _{i}^{*}) \exp \left( - \frac{\left\| x - x_{i} \right\| ^{2}}{2\sigma ^{2}} \right) + b \end{aligned}$$
(4)

where \(\sigma\) is the width parameter of the RBF kernel.
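
To make Eqs. (3) and (4) concrete, the sketch below fits an RBF-kernel SVR with scikit-learn and rebuilds its prediction from the fitted dual coefficients, which hold the differences \(\alpha _{i} - \alpha _{i}^{*}\) for the support vectors. The data, the value of \(\sigma\), and the settings of C and \(\varepsilon\) are illustrative assumptions; scikit-learn parameterizes the kernel by `gamma`, which corresponds to \(1/(2\sigma ^{2})\).

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR


def rbf(x, xi, sigma):
    """RBF kernel from Eq. (4): exp(-||x - xi||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - xi) ** 2) / (2.0 * sigma**2))


sigma = 1.5                        # illustrative kernel width
gamma = 1.0 / (2.0 * sigma**2)     # scikit-learn's equivalent parameter

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))                        # synthetic inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)     # synthetic targets

# The hand-written kernel agrees with scikit-learn's implementation.
k_manual = rbf(X[0], X[1], sigma)
k_sklearn = rbf_kernel(X[:1], X[1:2], gamma=gamma)[0, 0]

svr = SVR(kernel="rbf", gamma=gamma, C=1.0, epsilon=0.1).fit(X, y)

# Eq. (4): sum over support vectors of (alpha_i - alpha_i*) k(x, x_i), plus b.
K = rbf_kernel(X, svr.support_vectors_, gamma=gamma)
f_manual = K @ svr.dual_coef_.ravel() + svr.intercept_[0]

print("kernel values match:", np.isclose(k_manual, k_sklearn))
print("manual Eq. (4) matches predict():", np.allclose(f_manual, svr.predict(X)))
```

Only the support vectors contribute to the sum, because the multipliers of all other training points are zero.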

Gradient boosting regressor

This machine learning model employs the technique of "boosting" to generate predictions by combining an ensemble of weak prediction models, commonly decision trees as implemented in this study, in order to construct a more resilient and reliable model (Rao et al. 2019). The Gradient Boosting Regression (GBR) model with M number of trees can be represented mathematically as described in Eq. (5) (Otchere et al. 2022):

$$\begin{aligned} {\hat{y}} = \sum _{m=1}^{M} \beta _m h_m(x) \end{aligned}$$
(5)

where \({\hat{y}}\) is the predicted value, \(\beta _m\) is the weight or contribution of the mth tree, \(h_m(x)\) is the prediction made by the mth tree for input feature vector \(x\).
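
The additive structure of Eq. (5) can be checked against scikit-learn’s `GradientBoostingRegressor`, as in the sketch below. Two caveats apply: in that implementation every tree shares the same weight (the learning rate plays the role of \(\beta _m\)), and the sum is offset by an initial constant prediction, the training mean under the default squared-error loss, which Eq. (5) does not show. The data and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(224, 11))                  # synthetic factors
y = (X[:, 0] - X[:, 2] > 0).astype(float)       # synthetic flood labels

gbr = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
).fit(X, y)

# Initial constant prediction (mean of y for the default squared-error loss).
f0 = gbr.init_.predict(X)

# Sum of the individual tree outputs h_m(x), m = 1..M.
tree_sum = sum(tree.predict(X) for tree in gbr.estimators_[:, 0])

# Eq. (5) with beta_m equal to the learning rate, plus the initial constant.
y_manual = f0 + gbr.learning_rate * tree_sum

print("manual sum matches predict():", np.allclose(y_manual, gbr.predict(X)))
```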

Least absolute shrinkage and selection operator (LASSO)

This technique is a linear regression model that serves as a variable selection method which effectively reduces the number of factors included in the final model (Hastie et al. 2009). The formula for the LASSO model can be represented as Eqs. (6) and (7):

$$\begin{aligned} {\hat{y}} = \beta _0 + \sum _{j=1}^{p} \beta _j x_j \end{aligned}$$
(6)

subject to the constraint:

$$\begin{aligned} \sum _{j=1}^{p} |\beta _j| \le t \end{aligned}$$
(7)

where \(\beta _0\) is the intercept (bias term), \(\beta _j\) is the coefficient of the j-th input feature \(x_j\), \(p\) is the number of input features, and \(t\) is the maximum allowed sum of the absolute values of the coefficients. In practice, LASSO requires a regularization parameter \(\alpha\), which governs the strength of the penalty. To explore the impact of varying penalty strengths, multiple \(\alpha\) values were tested in this study: 0, 0.1, 0.5, 1, and 10. Based on the evaluation metrics of Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2), an \(\alpha\) value of 0.1 was chosen, as it yielded the highest accuracy. The LASSO model was implemented using the scikit-learn library in the Python programming language.
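
The effect of the penalty strength on variable selection can be sketched with scikit-learn’s `Lasso`, as below. The data are synthetic and the loop is an assumed reconstruction of the sweep, not the authors’ code. The value \(\alpha = 0\) is left out of the loop because it reduces to ordinary least squares, for which scikit-learn recommends `LinearRegression` instead.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(224, 11))                    # synthetic factors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # synthetic flood labels

alphas = [0.1, 0.5, 1.0, 10.0]
nonzero = {}
for alpha in alphas:
    model = Lasso(alpha=alpha).fit(X, y)
    # Count the factors that keep a nonzero coefficient at this penalty.
    nonzero[alpha] = int(np.count_nonzero(model.coef_))

for alpha, count in nonzero.items():
    print(f"alpha={alpha}: {count} of {X.shape[1]} coefficients are nonzero")
```

A stronger penalty removes more factors from the model; with a very large \(\alpha\) every coefficient is driven to zero and the model predicts only the intercept.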

Flood inventory map

The flood inventory map is an essential tool in the assessment of flood hazards, enabling the identification of regions that are at risk of experiencing flood events (Rahmati et al. 2016; Tehrany et al. 2014). In addition, increasing the number of accurately marked flooded areas enhances the precision of the flood risk map and its efficacy in delineating the regions prone to flooding hazards (Tien Bui and Nhat-Duc 2017).

In this study, a comprehensive sample of 224 locations was selected within the designated study region. Among these locations, 112 were classified as flooded points, while the remaining sites were identified as non-flooded areas. The spatial distribution of these flooded and non-flooded points is visualized in Fig. 6.

Moreover, the utilization of both flooded and non-flooded points in machine learning approaches holds promise for training models that possess the capability to accurately forecast the incidence of floods or their impacts on diverse systems. For instance, a machine learning model could be trained to anticipate the timing and spatial extent of flood events based on input data that encompasses information from both flooded and non-flooded points. By incorporating such data, these models can enhance their predictive accuracy and contribute to more effective flood risk management strategies.

Fig. 6 Flood inventory map

Variables selection

The flood environmental factors play a crucial role in the formulation of the flood hazard map. The selection of these factors was conducted through comprehensive investigations into the correlation between past flood occurrences and the localized geo-environmental characteristics (Costache 2019; Tehrany et al. 2015; Luu et al. 2018). In this research, 11 flood environmental variables were adopted, as shown in Figs. 8 and 9. The first chosen factor is elevation. An inverse relationship exists between elevation and floods, which makes elevation a fundamental factor in assessing flood vulnerability (Bui et al. 2016). Additionally, it has been observed that higher elevations experience relatively less severe impacts from flash floods (Khosravi et al. 2016). The second factor, slope, has a direct impact on the flow velocity of floodwaters, as it is intimately linked to the geomorphological process of flooding (Rahmati et al. 2016). Moreover, in the determination of flood risk zones, the proximity to river networks plays a crucial role (Osman and Das 2023).

Plan curvature refers to the curvature perpendicular to the direction of the steepest slope. It characterizes the convergence and divergence of flow across a surface. A positive value indicates that the surface is laterally convex at the respective cell, while a negative value suggests lateral concavity. A value of zero indicates a flat surface. Profile curvature, in contrast, refers to the curvature parallel to the direction of the steepest slope and governs the acceleration and deceleration of flow across the ground surface. A negative value signifies that the surface of the cell is convex upwards, resulting in a delayed flow. Conversely, a positive value indicates that the cell’s surface is concave upwards, promoting increased flow. A value of zero suggests a linear terrain (see Fig. 7).

Fig. 7 Visual representation of curvature: (a) plan curvature, (b) profile curvature

It has been observed that areas in close proximity to river networks experience the highest levels of flood hazard (Fernández and Lutz 2010). In terms of the distance to streams, streams serve as the primary channels for floodwaters, and areas in close proximity to streams are more vulnerable to flooding (Opperman et al. 2009). Regarding the distance to roads, the presence of man-made roads significantly influences flooding as they can impede the natural flow of water. Highways and urban development, in particular, have been observed to decrease the infiltration rate of a region, consequently increasing its susceptibility to flooding (Tehrany et al. 2019).

Land cover, in turn, plays a significant role in influencing the rates of runoff, infiltration, interception, and evaporation (Yalcin et al. 2011). Therefore, the land cover map serves as a critical factor in determining flood hazard (Komolafe et al. 2018). Furthermore, aspect contributes to climatic characteristics such as the direction of rainfall and the intensity of sunshine, both of which have an impact on natural phenomena occurring on the Earth’s surface (Mohamed Wahba et al. 2023). Additionally, the Topographic Wetness Index (TWI) quantifies the relative wetness of a landscape based on topographic characteristics. The TWI reflects the tendency of gravity to transport water downstream within a watershed, indicating areas of higher moisture. It takes into account the relationship between slope and contributing area, providing valuable information about water flow and potential water accumulation in a given landscape (Mudashiru et al. 2022). Equation (8) gives the TWI.

$$\begin{aligned} {TWI}=\textrm{ln}\left( \frac{{\varphi }_{s}}{\textrm{tan}\alpha }\right) \end{aligned}$$
(8)

where \({\varphi }_{s}\) is the flow accumulation in a specific watershed region, and \(\alpha\) is the slope angle in degrees.

Likewise, the Stream Power Index (SPI) is a measure that assesses the erosive potential of surface runoff. It is commonly employed to identify effective soil conservation strategies aimed at mitigating the damage caused by excessive runoff. The SPI can be calculated using Eq. (9), as described by Burrough et al. (2015). Figures 8 and 9 show the 11 environmental factors.

$$\begin{aligned} \textrm{SPI}=({\varphi }_{s})(\textrm{tan}\alpha ) \end{aligned}$$
(9)
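
Equations (8) and (9) can be evaluated cell by cell on the flow accumulation and slope rasters, as in the NumPy sketch below. The input arrays are hypothetical cell values, and the small floor `eps` is an assumption added to keep the logarithm and the division finite on flat cells or cells with zero accumulation; GIS tools handle these cases in their own ways.

```python
import numpy as np


def twi(flow_acc, slope_deg, eps=1e-6):
    """Topographic Wetness Index, Eq. (8): ln(flow_acc / tan(slope))."""
    tan_slope = np.maximum(np.tan(np.radians(slope_deg)), eps)  # guard flat cells
    return np.log(np.maximum(flow_acc, eps) / tan_slope)


def spi(flow_acc, slope_deg):
    """Stream Power Index, Eq. (9): flow_acc * tan(slope)."""
    return flow_acc * np.tan(np.radians(slope_deg))


# Hypothetical cells: flow accumulation and slope in degrees.
flow_acc = np.array([1.0, 50.0, 100.0, 10.0])
slope_deg = np.array([30.0, 10.0, 45.0, 0.0])

twi_values = twi(flow_acc, slope_deg)
spi_values = spi(flow_acc, slope_deg)

for a, s, t, p in zip(flow_acc, slope_deg, twi_values, spi_values):
    print(f"flow_acc={a:6.1f} slope={s:5.1f} deg  TWI={t:7.3f}  SPI={p:8.3f}")
```

For the cell with a flow accumulation of 100 and a 45 degree slope, the tangent is 1, so the TWI equals ln(100) and the SPI equals 100. The flat cell shows the opposite behavior of the two indices: a high TWI (water tends to accumulate) and a zero SPI (no erosive power).
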
Fig. 8 Environmental factors (part 1)

Fig. 9 Environmental factors (part 2)

Fig. 10 The generated FSMs using the four adopted ML techniques

Results and discussion

Flood susceptibility map (FSM)

The FSM was generated employing four machine learning techniques. Subsequently, the output generated by each method was systematically categorized into five distinct classes, utilizing the equal interval tool available within the ArcMap software. Figure 10 illustrates the generated FSMs using the four adopted ML techniques.
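
The equal interval classification can be reproduced outside ArcMap with a few lines of NumPy, as sketched below. The function name and the sample scores are hypothetical; the sketch only assumes that the range between the minimum and maximum model output is divided into five bins of equal width, which is what an equal interval scheme does.

```python
import numpy as np


def equal_interval_classes(values, n_classes=5):
    """Assign each value to one of n_classes equal-width bins (labels 1..n)."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_classes + 1)
    # Only the interior edges are used, so the maximum lands in the top class.
    classes = np.digitize(values, edges[1:-1]) + 1
    return classes, edges


# Hypothetical susceptibility scores predicted for seven raster cells.
scores = np.array([0.0, 0.1, 0.25, 0.5, 0.75, 0.95, 1.0])
classes, edges = equal_interval_classes(scores)

print("class edges:", np.round(edges, 2))
print("classes:", classes.tolist())
```

Here class 1 is the lowest susceptibility level and class 5 the highest. Because the bins have equal width rather than equal counts, the share of cells per class can differ strongly between models.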

In relation to the FSMs generated through ANN-MLP, SVR, and GBR methods, the susceptibility classification indicates moderate and low levels of flood susceptibility in the northern region of Ibaraki Prefecture. This area is characterized by high altitude and extensive tree coverage. Furthermore, the vegetation in the north plays a significant role in reducing flood susceptibility, as it contributes to high infiltration rates, which in turn diminish surface runoff volumes. Conversely, the southern urban area exhibits a significant vulnerability to flooding. This observation highlights the critical influence of urbanization on flood susceptibility, suggesting that the transformation of substantial expanses of permeable land into impervious surfaces is a key factor contributing to the region’s heightened flood risk. Furthermore, the FSM generated by the LASSO model indicates that the majority of Ibaraki Prefecture is categorized as having high to very high flood susceptibility, with the exception of a small northern area that demonstrates moderate to low levels of flood susceptibility. Figure 11 illustrates the relative distribution of each hazard class within the employed machine learning approaches.

Fig. 11 Proportions of hazard classes by the four selected ML methods

Statistical analysis

The performance of an ML model can be assessed with several statistical metrics that quantify how closely its predictions match the observations. Table 2 reports four such metrics, namely Mean Square Error (MSE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared, for the employed machine learning approaches. The ANN-MLP reg method achieves the best values for MSE, RMSE, and R-squared, while the GBR method yields the best MAE.
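
For reference, the four metrics can be computed as in the following sketch. It uses the standard textbook definitions; the function name and the toy observations and predictions are hypothetical and do not reproduce the values in Table 2.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, and R-squared for paired observations.

    Assumes y_true is not constant, so the total sum of squares is
    non-zero and R-squared is defined.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "MSE": mse,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy data: 1 = flooded spot, 0 = non-flooded spot (illustrative only).
observed = [0, 1, 1, 0]
predicted = [0.1, 0.9, 0.8, 0.2]
print(regression_metrics(observed, predicted))
```

Lower MSE, MAE, and RMSE indicate a better fit, whereas a higher R-squared does, which is why a single model need not rank first on every metric.
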

In addition, a residual analysis was conducted on the four adopted models. Figure 12 illustrates the residual distribution of each ML model, represented as a Gaussian distribution. The Gaussian density is given by Eq. (10).

$$\begin{aligned} f(x) = \frac{1}{{\sigma \sqrt{2\pi }}} \cdot e^{-\frac{{(x - \mu )^2}}{{2\sigma ^2}}} \end{aligned}$$
(10)

where \(\mu\) represents the mean of the distribution, and \(\sigma\) represents the standard deviation. The mean specifies the center of the distribution, while the standard deviation controls the spread or dispersion of the distribution. The square of the standard deviation, \(\sigma ^2\), is known as the variance.
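
A minimal sketch of how Eq. (10) can be applied to a set of residuals is given below. The residual values are hypothetical, and estimating \(\mu\) and \(\sigma\) from the sample mean and the population standard deviation is an assumption about the fitting step rather than a documented detail of the study.

```python
import math
import statistics


def gaussian_pdf(x, mu, sigma):
    """Evaluate the Gaussian density of Eq. (10) at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Hypothetical residuals (observed minus predicted susceptibility).
residuals = [-0.2, -0.1, 0.0, 0.1, 0.2]

mu = statistics.fmean(residuals)      # centre of the distribution
sigma = statistics.pstdev(residuals)  # spread of the distribution

# Density at the centre: a smaller sigma gives a taller, narrower peak.
print(mu, sigma, gaussian_pdf(mu, mu, sigma))
```

A model whose residuals have a smaller \(\sigma\) produces a narrower curve centred on \(\mu\), which is the property compared across the models in Fig. 12.
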

Fig. 12 Residual distribution of the adopted ML models

The residual distributions show that the ANN-MLP reg, SVR, and GBR models are approximately symmetric around their mean values and predominantly normal. Among these, the ANN-MLP reg model has the narrowest distribution, as reflected in its low standard deviation, which supports its superior performance relative to the other models in the study. Conversely, the residuals of the LASSO model are irregularly distributed and depart from the normal shape, indicating lower predictive accuracy.

Table 2 Evaluation metric for the used ML techniques

Variable importance

Variable importance measures how much each input factor contributes to the flood susceptibility predicted in the FSM. Analyzing it clarifies the relative influence of the input variables and identifies those that most strongly affect the predicted hazard levels. In this study, the importance of each environmental factor was assessed for each of the employed machine learning techniques. Figure 13 shows the relative importance, as a percentage, of each environmental factor in the generation of the FSMs.
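
The way the importance percentages were computed is not detailed here. As one common, model-agnostic option, the sketch below uses permutation importance: each factor is shuffled in turn, and the resulting increase in prediction error is expressed as a share of the total. The toy model, its weights, and the data are hypothetical; only the factor names are taken from this study.

```python
import random


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Return the importance of each column of X as a percentage.

    Importance is the mean increase in MSE after shuffling a column,
    normalized so that all columns sum to 100.
    """
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    raw = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between factor j and y
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(shuffled) - baseline
        raw.append(max(increase / n_repeats, 0.0))
    total = sum(raw) or 1.0
    return [100.0 * r / total for r in raw]


def toy_model(row):
    """Hypothetical stand-in for a trained model; ignores the third factor."""
    return 0.8 * row[0] + 0.2 * row[1]


names = ["elevation", "slope", "distance to river"]
data_rng = random.Random(1)
X = [[data_rng.random() for _ in names] for _ in range(50)]
y = [toy_model(row) for row in X]

shares = permutation_importance(toy_model, X, y)
for name, share in zip(names, shares):
    print(name, round(share, 1))
```

A factor that the model ignores receives a share of zero, because shuffling it leaves the predictions unchanged.
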

Fig. 13 The variable importance

The analysis reveals the prominent role of elevation in generating the FSMs across all ML techniques used. Elevation accounts for approximately 60% of the overall importance in the ML approaches, and the GBR method assigns it even greater weight, slightly over 80%. Slope is generally the second most influential factor, contributing approximately 10%. In the LASSO and SVR approaches, however, distance to the river plays a larger role, accounting for around 30% and approximately 15% of the importance, respectively.

By identifying the most important variables, planners and decision-makers can prioritize interventions and mitigation measures in flood-prone areas. This knowledge helps allocate resources effectively and develop targeted strategies to reduce the impact of floods on vulnerable communities.

Models validation

The employed models have undergone validation using the Receiver Operating Characteristic (ROC) curve method. Figure 14 illustrates the ROC curves for all the adopted methods. The area under the curve (AUC) has been estimated and the results are presented in Table 3. It is noteworthy that the ANN-MLP reg and SVR models exhibit higher AUC values, indicating their superior accuracy in predicting the FSMs. On the other hand, the LASSO model exhibits the lowest AUC, suggesting comparatively lower accuracy.
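
The AUC can be computed without plotting the curve, because it equals the probability that a randomly chosen flooded spot receives a higher predicted score than a randomly chosen non-flooded spot, with ties counted as one half. The sketch below uses this pairwise formulation; the function name, labels, and scores are hypothetical and do not reproduce Table 3.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve via pairwise comparison.

    labels -- 1 for flooded spots, 0 for non-flooded spots
    scores -- predicted susceptibility for each spot
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # flooded spot ranked above non-flooded
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical validation spots: three flooded, three non-flooded.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(roc_auc(labels, scores))
```

An AUC of 1 corresponds to perfect separation of flooded and non-flooded spots, while 0.5 corresponds to a ranking no better than chance.
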

Fig. 14 The estimated ROC curves for the adopted four ML approaches

Table 3 Values of estimated area under curve for the adopted four ML approaches

After excluding the plan and profile curvatures, the ANN-MLP reg model achieves a higher AUC of 96.7%. Conversely, the SVR, GBR, and LASSO models become less accurate, with AUC values of 95.74%, 91.05%, and 92.53%, respectively. These results suggest that including the plan and profile curvatures has a negative effect on the accuracy of the FSM generated by ANN-MLP reg, whereas incorporating these factors into the other three models can improve the accuracy of the FSM. Figure 15 shows the ROC curve for the ANN-MLP reg model after excluding the plan and profile curvature factors.

Fig. 15 The estimated ROC curves for the ANN-MLP reg model after excluding plan and profile curvature factors

Excluding the curvature factor

An alternative version of the FSM was developed using the ANN-MLP reg model after excluding the plan and profile curvatures from the set of environmental factors. This additional step aimed to investigate how including or excluding these factors affects the accuracy of the FSM. Figure 16 shows the FSM produced by the ANN-MLP reg algorithm after excluding the plan and profile curvatures.

Fig. 16 The generated FSM after excluding plan and profile curvatures using ANN-MLP-reg approach

The results indicate that, when ANN-MLP reg is employed, the FSM can be developed without estimating the plan and profile curvatures, and that omitting them also improves the accuracy of the predicted map.

Mitigation measures

Mitigation measures for flash floods comprise strategies and actions aimed at reducing both the likelihood and the impact of flash floods. They can be implemented at the individual, community, and governmental levels. Effective measures include the following:

(1) Enhance early warning systems to deliver prompt alerts to residents of flood-vulnerable zones. In Ibaraki Prefecture, implementing and refining these systems is particularly important in the southern regions and in areas close to rivers. Such systems can combine weather forecasting, rainfall monitoring, and river level gauges to identify potential flash floods and disseminate timely warnings, reducing risks and protecting affected communities.

(2) Formulate and rigorously enforce zoning regulations that restrict construction in flood-prone regions, avoiding building in vulnerable areas such as low-lying terrain, riverbanks, or steep slopes that could exacerbate the risks associated with flash floods.

(3) Build retention and detention basins to capture excess rainfall and release it slowly, reducing the intensity of flash floods downstream.

(4) Construct flood control structures, such as levees, embankments, and flood walls, to protect communities from flash floods and redirect water away from vulnerable areas.

(5) Preserve and promote the growth of forests and other natural vegetation, as they help absorb rainfall, reduce surface runoff, and stabilize soil, thus mitigating flash flood impacts.

(6) Implement sustainable stormwater management practices, such as green roofs, rain gardens, and permeable pavements, to reduce surface runoff and allow water to infiltrate into the ground.

(7) Elevate critical infrastructure and buildings in flood-prone areas above potential flood levels to minimize damage during flash floods.

Conclusion

This research evaluates the performance of four machine learning (ML) approaches in generating flood susceptibility maps (FSMs) and investigates the impact of excluding the plan and profile curvature factors on the accuracy of the generated maps in Ibaraki Prefecture, Japan. The four selected models, namely ANN-MLP reg, SVR, GBR, and LASSO, were trained using 70% of the dataset, which consisted of 112 flooded and 112 non-flooded spots, along with 11 environmental factors. The predictive capability of the models was assessed using the remaining 30% of the input data and validated using the ROC curve method. The results revealed that both the ANN-MLP reg and SVR models achieved high accuracy, with area under the curve (AUC) values of 95.23% and 95.83%, respectively.

Interestingly, upon excluding the plan and profile curvature factors, the AUC of the ANN-MLP reg model improved from 95.23% to 96.7%. Additionally, the generated FSMs were categorized into five hazard levels. The northern region of the maps exhibited predominantly very low and low hazard levels. In contrast, areas closer to the main streams and in the southern region showed considerably higher hazard levels, categorized as high and very high. The generated FSMs can therefore indicate which regions are more susceptible to flash floods, and these insights can be presented to decision-makers to help them deliberate on and adopt appropriate protective measures.

Moreover, it is highly recommended to implement mitigation measures within the high and very high hazard classifications to align with the social, environmental, and economic goals set forth by the sustainable development pillars.