1 Introduction

Natural hazards represent extreme and abrupt climatic phenomena that have far-reaching implications, including economic disruption, agricultural loss, and large-scale displacement of populations (WHO 2003; Rahmati et al. 2016). These events are responsible for significant infrastructural damage, human casualties, and long-term psycho-social consequences, impeding not only everyday life but also sustainable development (Debnath et al. 2017; Ahmed et al. 2018; Bhowmik et al. 2018). Among various types of natural disasters, floods are the most recurrent, constituting a hydrological imbalance that leads to a series of detrimental outcomes, such as death, degradation, and disease (Bubeck et al. 2012; Chapi et al. 2017; Das and Gupta 2021). Floods are inherent features of river systems and are not inherently disastrous, but they become hazardous when they pose threats to human life and property. Consequently, the crux of disaster mitigation strategy rests on the accurate identification of flood-prone zones (Degiorgis et al. 2012; Kazakis et al. 2015; Das and Gupta 2021; Ahmed et al. 2022; Debnath et al. 2022a, b).

Although floods cannot be prevented entirely, appropriate management and mitigation strategies can minimize the devastation (Chapi et al. 2017; Mitra et al. 2022). Geospatial tools and algorithms provide a range of models that can process and manipulate data to assess flood vulnerability (Pradhan and Lee 2010). Various geospatial models are now widely used to analyze flood susceptibility (Pradhan and Youssef 2010; Altaf et al. 2013; Khosravi et al. 2018, 2019; Sarkar and Mondal 2020). These include logistic regression (Pradhan et al. 2010), Random Forest (RF) (Fayaz et al. 2022), Naive Bayes (NB), Dempster-Shafer (DS) evidence theory, decision trees (Pham et al. 2021), index-of-entropy (Islam et al. 2022), Weights of Evidence (WoE) (Rahmati et al. 2016), artificial neural networks (Rahman et al. 2019; Jahangir et al. 2019), and Analytic Network Process (ANP) (Solaimani et al. 2023). Over the past few decades, conventional optimization approaches such as linear, nonlinear, and dynamic programming have also been used to assess flood hazards (Rather et al. 2022; Meraj et al. 2022a). Moreover, frequency ratio (FR) (Rahman et al. 2019; Sarkar and Mondal 2020; Ghosh and Dey 2021), Multi-Criteria Decision Making (MCDM) (Das 2020; Das and Gupta 2021; Mitra et al. 2022; Gupta and Dixit 2022), and fuzzy logic (FL) (Ghosh and Dey 2021; Akay 2021; Balogun et al. 2022) models have been applied in flood hazard assessment. These studies demonstrate that flood risks are linked to multidimensional attributes and incorporate spatial elements. They also face inherent difficulties when identifying and quantifying flood hazard and vulnerability indicators such as topography, drainage, hydrology, and meteorology; when dealing with uncertainties; when allocating appropriate weights to indicators; and when validating the results (Gupta and Dixit 2022). Additionally, most prediction models have shortcomings when used alone, whereas combining two or more predictive techniques can yield more precise predictions and more effective outputs (Chau et al. 2005). Because every model has its strengths and weaknesses, several models should first be applied to a specific area, and one should be chosen only after their predictive power has been evaluated (Khosravi et al. 2018). The current study sets out to develop and assess the efficacy of several novel models for the spatial prediction of floods that utilize Multi-Criteria Decision Making (MCDM) and Machine Learning (ML) techniques.

MCDM is widely used for flood hazard zonation, flood susceptibility mapping, flood vulnerability mapping, flood risk mapping, flash flood analysis, and flood forecasting. It comprises a set of methods for organizing and assessing alternatives based on criteria and objectives. The technique is beneficial for flood risk management: it supports the decision process of the participants, provides a platform for them to express their preferences, and helps in implementing long-lasting and effective flood management programs. MCDM techniques such as the analytical hierarchy process (AHP) (Meyer et al. 2009; Gupta and Dixit 2022), fuzzy analytical hierarchy process (FAHP), TOPSIS, discrete choice analysis, VIKOR, EDAS, WASPAS, Simple Additive Weighting (SAW) (Khosravi et al. 2019), Decision-Making Trial and Evaluation Laboratory (DEMATEL), and Multi-Objective Optimization by Ratio Analysis (MOORA) play a dominant role in flood hazard mapping. In the present analysis, flood susceptibility maps were developed using the TOPSIS, VIKOR, EDAS, and WASPAS models. The AHP technique has some limitations in the selection of conditioning variables when many alternatives are analyzed, whereas TOPSIS, VIKOR, EDAS, and WASPAS do not have these limitations because they retain order (Mitra et al. 2022). These MCDM approaches were chosen for their effectiveness in making decisions based on multiple criteria (Bera et al. 2022). Moreover, the present study used two machine learning algorithms, Random Forest and Support Vector Machine, to assess the performance of the MCDM models. In recent times, ML techniques have been widely used for all kinds of susceptibility mapping.

The Brahmaputra plain in India and Bangladesh frequently experiences flooding during the rainy season. The river and its tributaries provide societal, ecological, cultural, and economic services to the people living in Bangladesh, Northeast India, Bhutan, Tibet, and China. Although the Brahmaputra River provides significant benefits, it frequently causes flooding in the Brahmaputra plains of Bangladesh and Northeast India (Shivaprasad Sharma et al. 2018). The monsoon season, which lasts from July to September, is the peak time for the long-lasting (> 10-day) floods that cause significant damage in the region. Upper basin precipitation, coupled with smaller contributions from glacier melt, snowmelt, and base flow, is the primary cause of monsoon season discharge in the Brahmaputra during July, August, and September (JAS). While flowing through India and Bangladesh, the river has formed the Brahmaputra Valley, which is important for its fertile alluvial soil. The economy of this region is based primarily on crop cultivation (Hazarika et al. 2018), and the entire Northeast region of India depends to some degree on the crops of this valley. Farmers are marginal and alternative income sources are few (Economic Survey of Assam, 2012–2013). Every year, valuable standing crops are destroyed by floodwaters, and cultivable fields are lost or damaged by bank erosion and the deposition of coarser sediments (Debnath et al. 2023a, b). People lose their livelihoods, and floods create a food deficit throughout the region (Rabindranath et al. 2011). Throughout the Brahmaputra Valley, government agencies have invested significant amounts of money in structural measures, primarily earthen embankments (Planning Commission, 2001–2013). These measures are intended to safeguard the inhabitants of the floodplain from erosion and flooding. However, because of the frequent flooding in this area, the embankments are often breached during the monsoon season (Meraj et al. 2022b; Hazarika et al. 2018; Borah et al. 2022).

Since the mid-twentieth century, the lower Brahmaputra plain has experienced significant flooding in 1954, 1962, 1972, 1977, 1984, 1988, 1998, 2002, 2004, 2012, 2013, 2015, 2016, 2017, 2018, 2019, 2020, and 2022. Flood frequency has increased markedly in the past two decades compared with the second half of the twentieth century. Shawki et al. (2018) and Liu et al. (2019) noted that the South Asian Summer Monsoon (SASM) was less active in the second half of the twentieth century as a result of anthropogenic sulfate aerosol emissions. Through the twenty-first century, however, higher carbon dioxide emissions and decreased aerosol loading are predicted to intensify the SASM (IPCC 2018; Rao et al. 2020). Faster glacial melt brought on by global warming, together with a more intense monsoon, is projected to increase the flow of the Brahmaputra River and the likelihood of flooding in the area (Nepal and Shrestha 2015). According to regional climate models (RCMs), the discharge of the Brahmaputra at Bahadurabad station in Bangladesh will increase during the monsoon (Islam et al. 2018). Moreover, Kamal et al. (2013) reported an expected change in the seasonal distribution of monsoon flows under the 2020s, 2050s, and 2080s scenarios. Deforestation and shifting cultivation add to the causes of floods (Debnath et al. 2017, 2022a, b, 2023b). Furthermore, because of unplanned urban growth (Cetin 2013, 2016a, b; Meraj 2021; Sahin et al. 2022) and a rising population, flood vulnerability and risks are estimated to grow significantly (Rakib et al. 2017).

Although the catastrophic nature of the Brahmaputra is well known to researchers, none of the existing studies, including Bhattacharyya and Bora (1997), Rao et al. (2020), Lopez et al. (2020), and Kumar et al. (2022), has identified flood susceptibility or risk zones for the entire Brahmaputra basin using flood-influencing factors. Hence, a better understanding of flood susceptibility and risk is a necessary stepping stone for creating and implementing damage control strategies. Flood susceptibility zones have been prepared for the Brahmaputra floodplain, but the researchers used only the AHP technique to identify the zones (Hazarika et al. 2018; Gupta and Dixit 2022). According to Varmazyar et al. (2016), the best outcome is not achievable with a single prioritization approach, and a conclusion drawn from one would not be reliable. Several investigators have therefore suggested combining various MCDM techniques and ML algorithms to increase the accuracy of the final judgment (Khosravi et al. 2019). Making trustworthy selections requires an effective aggregation mechanism when the differences among the alternatives are innately small or when the number of alternatives rises. To that end, this study introduces a novel methodology incorporating four MCDM techniques, namely Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), Vise Kriterijumska Optimizacija i Kompromisno Resenje (VIKOR), Evaluation Based on Distance from Average Solution (EDAS), and Weighted Aggregated Sum Product Assessment (WASPAS), as well as two machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM). The objective is to produce a reliable flood susceptibility map for the Brahmaputra River basin, which can be instrumental in informing and fortifying disaster management policies, particularly in Bangladesh and Northeast India.

2 Study Area

The Brahmaputra River Basin is situated between latitudes 23.9 °N and 31.5 °N and longitudes 82.1 °E and 97.7 °E, primarily in the northeastern region of India (Fig. 1). The river originates from the Chemayungdung Glacier in southern Tibet, part of the Himalayan Mountain Range. After traversing 1625 km through China, 918 km through India, and 337 km through Bangladesh, the river ultimately reaches the Bay of Bengal (Devrani et al. 2021). Geopolitically, the basin's territory is distributed among Bhutan (7.8%), Bangladesh (8.1%), India (33.6%), and China (50.5%) (Devrani et al. 2021). The Brahmaputra River has the fourth-highest flood discharge in the world. Nearly 50% of the 40,000 m³/s mean annual discharge of the Ganga–Brahmaputra–Meghna River system comes from the Brahmaputra River (Rao et al. 2020). It is currently tied with the Río Orinoco, Venezuela, for third place globally in terms of mean annual discharge, after the Amazon and the Congo (Best 2019).

Fig. 1 Location map of the Brahmaputra River basin

The alluvial floodplains of Assam in India and Bangladesh constitute the middle and lower portions of the basin, formed through sediment deposition from Himalayan rivers. The Indian state of Assam encompasses approximately one-third of the basin within Indian territory. The Brahmaputra Valley in Assam covers approximately 56,480 square kilometers, with an elevation ranging from 34 to 130 m. According to a report by the World Bank (2007), the Brahmaputra River accounts for over 17% of India's GDP, serving as a pivotal resource for the nation's agricultural sector by supplying water for food production and irrigation.

Hydrologically, the primary source of water in the basin is precipitation, supplemented by glacial meltwater and snowmelt. The basin experiences an average annual precipitation of 2650 mm, with the peak period of rainfall occurring from June to September. Even the mountainous regions of the basin are subjected to substantial precipitation levels, receiving up to 6400 mm during peak periods. This intense rainfall in the upper catchment areas results in high-flood conditions in the downstream portions of the basin (Bhattacharyya and Bora 1997).

The Brahmaputra River is fed by various tributaries, including the Subansiri, Jiadhal, Jia Bharali, Beki, Manas, and Aie rivers, which converge in the low-lying Assam valley. Further downstream, the Teesta, Rangeet, and Rangpo rivers, originating in Sikkim, join the Brahmaputra in Bangladesh. In recent decades, both anthropogenic activities and climate change have raised concerns for the Brahmaputra Basin. Water frequently reaches hazardous levels in the tributaries of the Brahmaputra River, leading to recurrent severe flooding and bank erosion in the Indian states of West Bengal and Assam, as well as in Bangladesh.

3 Materials and Methods

In the current study, flood hazard zones were delineated through the application of multicriteria decision analysis (MCDA) and machine learning techniques, utilizing spatial data within a Geographic Information Systems (GIS) framework. Selected thematic layers, representing variables that influence flood susceptibility, were meticulously prepared and subsequently clustered within the GIS environment. For the quantitative variables, clustering was executed using Jenks' Natural Breaks method (Jenks 1967), except for qualitative variables such as Land Use and Land Cover (LULC), soil texture, and other non-quantitative factors. The Natural Breaks classification technique is commonly employed in both MCDA and machine learning-based flood risk assessments. This is due to its capability to partition data ranges into distinct clusters based on inherent, natural groupings within the data (Pathan et al. 2022). The utilization of the Natural Breaks method in this research was justified by its efficacy in enhancing interclass variability while concurrently minimizing intraclass variance. Specifically, this technique optimizes the differentiation between means of distinct class groups while also reducing the deviation of each class from its respective mean. Thus, the method is particularly effective in creating statistically homogeneous groups, making it a robust strategy for the categorization of variables affecting flood susceptibility. The step-by-step methods used in this study are presented below.

3.1 Generation of the Flood Inventory Map (FIM)

The Flood Inventory Map (FIM) serves as an essential instrument for predicting future flood events, typically generated through the collation of historical flood data and analyzed using various computational models (Sarkar and Mondal 2020; Ahmed et al. 2022; El-Magd 2022). The accuracy and reliability of any future flood vulnerability assessments are contingent upon the veracity of previously documented flood events. In this research, the FIM for the Brahmaputra river basin was constructed within a GIS framework (Fig. 2). A comprehensive dataset encompassing 500 flood sites was collated, drawing upon data from the severe flooding event that occurred in 2022 (Source: https://geoapps.icimod.org).

Fig. 2 Flood inventory map of the Brahmaputra River basin

To minimize selection bias, recommendations from prior research advocate for the inclusion of an equivalent number of non-flood locations to serve as counterpoints to flood-affected areas (Islam et al. 2022). Consequently, this study incorporated 500 non-flood sites, randomly selected based on previous literature published by Kumar et al. (2022) using SAR data, topographic maps, historical flood data, field surveys, and Normalized Difference Water Index (NDWI) maps spanning a decade.

Following the collection of non-flood and flood data points, these were arbitrarily partitioned into two datasets: training and testing. Although there is an absence of standardized guidelines governing the ratio of training to testing data, prevailing research frequently allocates approximately 70% of the data for training purposes and the remaining 30% for validation (Pradhan and Lee 2010; Ahmed et al. 2022). Consistent with this approach, the present study selected 700 samples (70%) for training and 300 samples (30%) for validation in constructing the Flood Susceptibility Index (FSI) map (Sarkar and Mondal 2020). Binary values were employed to distinguish between flood and non-flood sites, with '1' representing flood points and '0' denoting non-flood points. This binary classification was integral to the creation of the training flood inventory, subsequently serving as the dependent variable in predictive models.
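The 70/30 partition described above can be sketched in a few lines. This is only an illustration: the text does not state whether the split was stratified by class, so the per-class shuffling, the function name `stratified_split`, and the seed are assumptions made here.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=42):
    """Return (train_idx, test_idx), splitting each class separately.

    Stratification is an assumption of this sketch, not something the
    study reports.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return train_idx, test_idx

# 500 flood points coded 1 and 500 non-flood points coded 0
labels = [1] * 500 + [0] * 500
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the flood and non-flood points balanced in both subsets, which matters when the binary label is later used as the dependent variable.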

3.2 Methods for the Preparation of the Flood Influencing Criteria

The creation of a Flood Susceptibility Index (FSI) map for a specific geographical area is a complex and comprehensive task that entails the integration of numerous hydrological and terrain variables. In the context of this analysis, a set of 18 flood-influencing criteria was selected, drawing upon empirical findings and methodological frameworks outlined in prior research and flood susceptibility models (Das 2020; Gupta & Dixit 2022; Mitra and Das 2023). These criteria include Elevation (El), Drainage Density (DD), Slope (Sl), Distance from the River (DR), Aspect (As), Curvature (Cu), Lithology (Lg), Land Use and Land Cover (LULC), Rainfall (Rf), Normalized Difference Vegetation Index (NDVI), Soil Texture (ST), Topographic Wetness Index (TWI), Flow Accumulation (FA), Stream Power Index (SPI), Sediment Transport Index (STI), Topographic Position Index (TPI), Roughness (Rn), and Topographic Ruggedness Index (TRI) (Fig. 3). The data sources for each selected criterion are elaborated in Table 1.

Fig. 3 Flood influencing factors of the Brahmaputra River basin. a elevation; b slope; c drainage density (DD); d distance from the river (DR); e aspect; f curvature; g flow accumulation; h SPI; i STI; j TPI; k roughness; l TRI; m lithology; n LULC; o NDVI; p rainfall; q soil texture; r TWI

Table 1 Different applied factors (data) used in the flood susceptibility modeling of this study

The selected flood-influencing criteria were categorized into four overarching groups for analytical convenience: Topographic Factors, Climatic Factors, LULC Factors, and Soil Factors. The topographic factors encompass Elevation (El), Slope (Sl), Drainage Density (DD), Distance from River (DR), Aspect (As), Curvature (Cu), Topographic Wetness Index (TWI), Flow Accumulation (FA), Stream Power Index (SPI), Sediment Transport Index (STI), Topographic Position Index (TPI), Roughness (Rn), and Topographic Ruggedness Index (TRI). The climatic group consists solely of Rainfall (Rf). LULC and NDVI were categorized under the LULC Factors. Lithology and soil texture were classified as Soil Factors. A comprehensive exposition detailing the analytical procedures, significance, and methods employed for the preparation of these flood-influencing criteria is available in the supplementary material. A flowchart delineating the methodological approach used in this analysis is presented in Fig. 4.

Fig. 4 Flowchart of the methodology followed in the present study

3.3 Multicollinearity Tests of the Flood Influencing Factors

In susceptibility models, multicollinearity tests typically precede regression analyses to ensure the integrity and validity of the model. These tests often employ measures such as tolerance, condition indices, Variance Inflation Factors (VIF), and Pearson's correlation coefficients (Ahmed et al. 2022). In the current investigation, both tolerance and VIF were used to evaluate the extent of multicollinearity among the 18 selected flood-influencing criteria (Mitra and Das 2023). VIF is a diagnostic for the degree of multicollinearity in an Ordinary Least Squares (OLS) regression model: it quantifies how much multicollinearity inflates the variance of an estimated regression coefficient. Tolerance is the reciprocal of VIF. Previous studies by Duque and Aquino (2019) and Khosravi et al. (2018) posited that multicollinearity is a concern if any flood-influencing factor has a VIF value exceeding 10; equivalently, a tolerance value lower than 0.1 indicates significant multicollinearity. Therefore, variables with a VIF greater than 10 and a tolerance less than 0.1 should be excluded from the modeling equation. The mathematical formulations employed to compute these multicollinearity indices are elaborated upon in the supplementary material.
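As a small worked illustration of the two indices, the sketch below uses the standard definitions: tolerance is 1 − R², where R² comes from regressing one factor on all the others, and VIF is the reciprocal of tolerance. The function names and the example R² value are hypothetical, and the cut-offs are passed in as arguments rather than fixed.

```python
def tolerance_and_vif(r_squared):
    """Tolerance = 1 - R^2 and VIF = 1 / tolerance for one predictor.

    r_squared is the coefficient of determination obtained when this
    predictor is regressed on all the other predictors.
    """
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance

def is_collinear(r_squared, vif_limit, tolerance_limit):
    """Flag a predictor whose VIF is too high or tolerance too low."""
    tolerance, vif = tolerance_and_vif(r_squared)
    return vif > vif_limit or tolerance < tolerance_limit

# Hypothetical factor whose R^2 against the other factors is 0.85
tol, vif = tolerance_and_vif(0.85)
print(round(tol, 2), round(vif, 2))
```

Because the two indices are reciprocals, a tolerance cut-off of 0.1 and a VIF cut-off of 10 flag exactly the same predictors.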

3.4 Methods for the FSI Mapping Using MCDM Technique

In this study, the delineation of flood susceptibility zones within the Brahmaputra Basin was executed by employing an ensemble of MCDM-based techniques—TOPSIS, VIKOR, EDAS, and WASPAS—integrated with a Geographic Information System (GIS) framework. Each of the flood-influencing factors, along with their sub-categories, were prioritized according to Saaty’s (1980) analytical hierarchy scale, as presented in Table 2. These prioritizations facilitated the construction of FSI maps using the chosen MCDM models.

Table 2 Classification of criteria from 1 to 9 categorical scales

3.4.1 Technique for Order Preference by Similarity to Ideal Solution (TOPSIS)

Developed by Hwang and Yoon (1981), TOPSIS is a multi-criteria decision-making methodology that uses the Euclidean distance metric to identify the optimal choice among a set of alternatives (Mitra and Das 2023). Generally, the alternative that exhibits the shortest Euclidean distance to the positive ideal solution (A+) and the longest distance to the negative ideal solution (A−) is favored. This method offers precise values for both performance ratings and factor weights and has been employed in recent flood hazard analyses with noteworthy accuracy (Khosravi et al. 2019; Pathan et al. 2022; Mitra and Das 2023).
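The distance logic described above can be sketched as follows. This is a minimal textbook-style version, assuming vector normalization and a closeness coefficient of S−/(S+ + S−); the study's exact steps are in its supplementary material, and the three-alternative matrix, weights, and benefit flags are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return the closeness coefficient of each alternative (row)."""
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Positive ideal takes the best value per criterion, negative ideal the worst.
    cols = list(zip(*v))
    a_pos = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    a_neg = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        s_pos = math.dist(row, a_pos)  # Euclidean distance to A+
        s_neg = math.dist(row, a_neg)  # Euclidean distance to A-
        scores.append(s_neg / (s_pos + s_neg))
    return scores

# Hypothetical data: column 0 is a benefit criterion, column 1 a cost criterion
matrix = [[9, 1], [5, 5], [1, 9]]
topsis_scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
print(topsis_scores)
```

A coefficient near 1 means the alternative lies close to A+ and far from A−.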

3.4.2 Vise Kriterijumska Optimizacija i Kompromisno Resenje (VIKOR)

VIKOR, a compensatory variant of TOPSIS, was initially proposed by Duckstein and Opricovic (1980) and later refined by Opricovic and Tzeng (2004). Utilizing a linear normalization approach, this method minimizes the distance to the ideal solution. It enables decision-makers to pinpoint compromise solutions, thereby facilitating more nuanced decision-making. The alternative closest to the ideal solution is deemed the compromise option. This technique has been effectively applied in flood hazard zone identification (Mitra and Das 2023).
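A compact sketch of the usual VIKOR indices may help here: the group utility S, the individual regret R, and the compromise index Q. It assumes linear normalization against the best and worst value of each criterion and a strategy weight v = 0.5; both the value of v and the example data are hypothetical, not taken from the study.

```python
def vikor(matrix, weights, benefit, v=0.5):
    """Return (S, R, Q) lists; a lower Q marks a better compromise."""
    cols = list(zip(*matrix))
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    s, r = [], []
    for row in matrix:
        # Weighted, linearly normalised distance to the ideal, per criterion.
        terms = [w * (fb - x) / (fb - fw)
                 for x, w, fb, fw in zip(row, weights, best, worst)]
        s.append(sum(terms))   # group utility
        r.append(max(terms))   # individual regret
    s_min, s_max, r_min, r_max = min(s), max(s), min(r), max(r)
    q = [v * (si - s_min) / (s_max - s_min)
         + (1 - v) * (ri - r_min) / (r_max - r_min)
         for si, ri in zip(s, r)]
    return s, r, q

# Hypothetical data: column 0 is a benefit criterion, column 1 a cost criterion
matrix = [[9, 1], [5, 5], [1, 9]]
s_vals, r_vals, q_vals = vikor(matrix, weights=[0.5, 0.5], benefit=[True, False])
print(q_vals)
```

The sketch assumes the alternatives differ on every criterion; identical columns would need a guard against division by zero.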

3.4.3 Evaluation Based on Distance from Average Solution (EDAS)

EDAS, a more recent addition to MCDM techniques, has been increasingly employed for alternative ranking (Mitra et al. 2022). Formulated by Ghorabaee et al. (2015), this methodology assesses alternatives based on their positive and negative distances from the mean solution. The average solution is straightforwardly calculated as the arithmetic mean of the performance values across various alternatives for each criterion. Given the pivotal role of the arithmetic mean in stochastic processes, the EDAS approach demonstrates efficacy particularly in stochastic MCDM scenarios (Ghorabaee et al. 2017).
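The positive and negative distances from the average solution can be illustrated with a short sketch. It follows the commonly published EDAS steps (PDA/NDA, weighted sums, normalization, appraisal score); the function name and the example data are hypothetical, and no guard is included for the degenerate case in which all alternatives are identical.

```python
def edas(matrix, weights, benefit):
    """Return (NSP, NSN, AS) lists for the alternatives (rows)."""
    n_alt = len(matrix)
    cols = list(zip(*matrix))
    avg = [sum(c) / n_alt for c in cols]  # average solution per criterion
    sp, sn = [], []
    for row in matrix:
        pos = neg = 0.0
        for x, w, av, b in zip(row, weights, avg, benefit):
            # For cost criteria, being below the average counts as positive.
            diff = (x - av) if b else (av - x)
            pos += w * max(0.0, diff) / av
            neg += w * max(0.0, -diff) / av
        sp.append(pos)
        sn.append(neg)
    nsp = [s / max(sp) for s in sp]
    nsn = [1 - s / max(sn) for s in sn]
    appraisal = [(p + q) / 2 for p, q in zip(nsp, nsn)]
    return nsp, nsn, appraisal

# Hypothetical data: column 0 is a benefit criterion, column 1 a cost criterion
matrix = [[9, 1], [5, 5], [1, 9]]
nsp, nsn, appraisal = edas(matrix, weights=[0.5, 0.5], benefit=[True, False])
print(appraisal)
```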

3.4.4 Weighted Aggregated Sum Product Assessment (WASPAS)

WASPAS is a well-established technique that combines the strengths of the Weighted Sum Model (WSM) and the Weighted Product Model (WPM), thereby enhancing the accuracy of alternative rankings (Zavadskas et al. 2016). Introduced by Zavadskas (2012), this technique computes an optimal combination parameter for a more accurate alternative assessment. Detailed steps for implementing these selected MCDM models are elucidated in the supplementary material. By employing these advanced MCDM techniques, this study aims to provide a comprehensive and robust framework for flood susceptibility mapping within the Brahmaputra Basin.
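To show how the WSM and WPM components are blended, here is a minimal WASPAS sketch. It fixes the combination parameter at 0.5 as a simplifying assumption rather than computing an optimal value, and the parameter name `lam` and the example data are hypothetical.

```python
import math

def waspas(matrix, weights, benefit, lam=0.5):
    """Return the joint WASPAS score of each alternative (row)."""
    cols = list(zip(*matrix))
    scores = []
    for row in matrix:
        # Linear normalisation: x / max for benefit, min / x for cost criteria.
        norm = [x / max(c) if b else min(c) / x
                for x, c, b in zip(row, cols, benefit)]
        wsm = sum(w * v for w, v in zip(weights, norm))        # weighted sum
        wpm = math.prod(v ** w for w, v in zip(weights, norm))  # weighted product
        scores.append(lam * wsm + (1 - lam) * wpm)
    return scores

# Hypothetical data: column 0 is a benefit criterion, column 1 a cost criterion
matrix = [[9, 1], [5, 5], [1, 9]]
waspas_scores = waspas(matrix, weights=[0.5, 0.5], benefit=[True, False])
print(waspas_scores)
```

Setting `lam` to 1 reduces the score to the pure WSM and setting it to 0 gives the pure WPM.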

3.5 Methods for the FSI Mapping Using ML Algorithms

In this research, two advanced machine learning algorithms—Random Forest (RF) and Support Vector Machine (SVM)—were employed for generating the FSI map of the Brahmaputra Basin.

3.5.1 Random Forest (RF) Algorithm

The RF algorithm is an ensemble learning method that uses a multitude of simple decision trees for both classification and regression tasks (Saikh and Mondal 2023). It was originally developed based on the random-subspace approach for random-decision forests (Ho 1995). Distinctive attributes of the RF model include its resilience to multicollinearity and its robustness in handling missing or noisy data. The algorithm operates in three stages: first, bootstrap resampling generates subsets of the primary dataset; next, an individual decision tree is constructed from each subset; finally, the classification or prediction outcomes of these trees are aggregated (Breiman 2001). During model training, an Out-of-Bag (OOB) error, indicative of the misclassification rate in out-of-bag samples, is computed. A trial-and-error approach was adopted to minimize the OOB error by iteratively adjusting the number of trees and the number of variables tried at each node. In the present investigation, the model was configured with 500 trees and 6 variables at each node. The RF model also yields metrics such as mean decrease in accuracy and Gini index, which are valuable for understanding the relative influence of various factors on flood susceptibility within the basin. The 'randomForest' package was employed to fit the RF model for this study.
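The bootstrap and out-of-bag idea can be illustrated without fitting any trees. The sketch below only draws the resamples, assuming 700 training points and 500 trees as in the text; it is not the package used in the study, and the function name and seed are hypothetical.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    oob = [i for i in range(n) if i not in chosen]
    return in_bag, oob

rng = random.Random(0)
n_samples, n_trees = 700, 500

oob_fractions = []
for _ in range(n_trees):
    in_bag, oob = bootstrap_indices(n_samples, rng)
    oob_fractions.append(len(oob) / n_samples)

# Share of training points left out of a tree's bootstrap sample, on average
mean_oob = sum(oob_fractions) / n_trees
print(round(mean_oob, 3))
```

Each tree leaves out roughly (1 − 1/n)^n of the points, about 37% for large n, and those left-out points are what the OOB error is computed on.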

3.5.2 Support Vector Machine (SVM) Algorithm

SVM constitutes another prevalent machine-learning technique that employs a set of linear discriminant functions for predictive modeling (Pourghasemi et al. 2013). This algorithm was initially conceived by Vladimir Vapnik and his associates in 1992 and is applicable for both classification and regression tasks (Vapnik 1995). The performance of SVM is contingent upon the choice of kernel functions, of which four are predominantly used: linear, polynomial, sigmoid, and Radial Basis Function (RBF). The algorithm tends to underperform when target classes are non-linearly separable or when the dataset is exceptionally large and noisy. In the current study, the RBF kernel was selected for the SVM model due to its proven efficacy in multiple research settings (Pourghasemi et al. 2013; Saikh and Mondal 2023).
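For reference, the RBF kernel has the standard form K(x, z) = exp(−γ‖x − z‖²). The sketch below evaluates it for two feature vectors; the value of γ and the vectors are hypothetical, since the study's tuned parameters are not given in the text.

```python
import math

def rbf_kernel(x, z, gamma):
    """Radial Basis Function kernel between two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

# Hypothetical two-factor feature vectors and kernel width
k_same = rbf_kernel([0.2, 0.7], [0.2, 0.7], gamma=0.5)
k_far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
print(k_same, k_far)
```

The kernel equals 1 for identical points and decays toward 0 as points move apart, with γ controlling how quickly.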

By integrating these machine learning algorithms, this study aims to offer an empirically substantiated, technologically robust framework for flood susceptibility mapping in the Brahmaputra Basin.

3.6 Development of the Ensemble Method

An ensemble model for flood prediction was devised by executing each of the previously described models within the ArcGIS framework. In recent times, numerous models have been formulated to generate susceptibility maps, each employing varying training datasets, algorithms, and additional variables specific to a particular domain. The ensemble approach amalgamates the results of these foundational models to produce a singular, more definitive forecast. The raster calculator tool within the ArcGIS application was utilized to merge all output raster layers, thereby generating the composite output of the ensemble model (Saikh and Mondal 2023). Relative to individual models, ensemble methods generally exhibit enhanced predictive accuracy and superior performance.

3.7 Methodology for Model Validation

Model validation serves as a critical component in evaluating the reliability of any developed model and is considered an indispensable aspect of this research (Mitra and Das 2023). Several methodologies exist for the validation of hazard susceptibility models. In this study, the validity of the Flood Susceptibility Index (FSI) map was assessed through multiple statistical metrics: the “Area Under the Curve” of the “Receiver Operating Characteristics” (AUC-ROC), the Mean Absolute Error (MAE), the Mean Squared Error (MSE), and the Root Mean Square Error (RMSE). The AUC-ROC offers an insightful depiction of the trade-off between specificity and sensitivity (Mitra & Das 2023; Gupta & Dixit 2022; Ahmed et al. 2022). Within the bi-dimensional ROC graph, the y-axis represents sensitivity (true positive rate), whereas the x-axis represents 1-specificity (false positive rate). Detailed equations employed for accuracy estimation are provided in the supplementary file.
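The four metrics can be sketched as follows. AUC is computed here through its rank interpretation (the share of flood/non-flood pairs in which the flood point receives the higher score, ties counting one half), which equals the area under the ROC curve; the labels and scores are hypothetical.

```python
import math

def auc_roc(y_true, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def error_metrics(y_true, scores):
    """Return (MAE, MSE, RMSE) between binary labels and predicted scores."""
    errors = [s - y for y, s in zip(y_true, scores)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)

# Hypothetical validation points: 1 = flood, 0 = non-flood
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
auc = auc_roc(y_true, scores)
mae, mse, rmse = error_metrics(y_true, scores)
print(auc, mae, mse, rmse)
```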

To scrutinize the relationships among the TOPSIS, VIKOR, EDAS, and WASPAS models, the non-parametric Spearman's rank test (rs) was applied. The test statistic rs was computed using Eq. 1:

$${r}_{s}=1-\frac{6{\sum }_{i=1}^{n}{d}_{i}^{2}}{{n}^{3}-n}$$
(1)

In this equation, \({d}_{i}\) signifies the difference in rankings for each data pair, and n represents the total number of data pairs.
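A short sketch of the rank correlation, using the standard no-ties form r_s = 1 − 6Σd²/(n³ − n). The scores are hypothetical, and the simple ranking helper assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rs(x, y):
    """Spearman's rank correlation for paired samples without ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)

# Hypothetical susceptibility scores from two models at four sample points
model_a = [0.2, 0.5, 0.9, 0.4]
model_b = [0.3, 0.6, 0.8, 0.7]
rs = spearman_rs(model_a, model_b)
print(rs)
```

With tied scores, average ranks and the Pearson correlation of the ranks would be the safer route.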

4 Results

4.1 Multicollinearity Results of the Flood Influencing Factors

The Variance Inflation Factor (VIF) and tolerance values were calculated to assess the degree of multicollinearity among the variables used in the flood susceptibility and vulnerability models. These metrics indicate whether each chosen flood-controlling factor contributes independent information. A VIF value exceeding 10 and a tolerance value below 0.1 are generally considered indicative of multicollinearity issues. In the present study, the VIF values of all conditioning factors were below 10 and the tolerance values were above 0.1, as presented in Table 3. This outcome suggests that none of the 18 flood conditioning factors was compromised by multicollinearity. Flow accumulation exhibited the lowest tolerance value (0.15) and the highest VIF value (6.67).

Table 3 Multicollinearity test (Tolerance and VIF values) of the flood conditioning factors

4.2 Decision Matrix for the TOPSIS, VIKOR, EDAS, and WASPAS

The study area's flood susceptibility maps were developed using 18 flood-influencing factors based on criteria pertinent to the physical environment of the Brahmaputra basin (Mitra and Das 2023). Given that none of the chosen Multi-Criteria Decision Making (MCDM) methods are pixel-based, this research employed sample points. The evaluation matrices for TOPSIS, VIKOR, EDAS, and WASPAS were independently constructed in a spreadsheet, incorporating the values of 15,000 sample points derived from the flood-influencing factors. Table 4 presents the evaluation matrix for the four MCDM methods, comprising 15,000 rows for the sample points and 18 columns representing flood-influencing factors.

Table 4 Decision matrix of the used MCDA methods (TOPSIS, VIKOR, EDAS, and WASPAS)

The categorization of the influencing factors was determined based on an extensive review of existing literature (Das and Gupta 2021; Pathan et al. 2022; Gupta and Dixit 2022; Mitra and Das 2023), as well as expert consultation. The selected factors were bifurcated into two principal categories: Beneficial Criteria (BC) and Non-Beneficial Criteria (NBC). Of the 18 factors chosen, 11 were categorized as Beneficial Criteria, while the remaining 7 were designated as Non-Beneficial Criteria.

The equitable distribution of weights among the flood-influencing factors represents a significant challenge in MCDM-based research. Consequently, the weights for the four selected MCDM methodologies were assigned based on expert opinions culled from diverse research studies. Table 5 outlines various research contributions where weights for the chosen flood-influencing factors have been utilized. In the current investigation, the highest weight was allocated to elevation (13%), followed by flow accumulation (12%), slope (11%), drainage density (10%), distance from the river (10%), rainfall (9%), Topographic Wetness Index (TWI) (7%), lithology (4%), Normalized Difference Vegetation Index (NDVI) (4%), Land Use and Land Cover (LULC) (3%), soil texture (3%), aspect (2%), curvature (2%), Stream Power Index (SPI) (2%), Topographic Position Index (TPI) (2%), ruggedness (2%), Sediment Transport Index (STI) (2%), and Terrain Ruggedness Index (TRI) (2%).

Table 5 Literature reporting the weights assigned to each parameter in various flood susceptibility mapping studies

4.3 Computation and Interpretation of Indices in TOPSIS, VIKOR, and EDAS Methodologies

In the application of the TOPSIS technique, the indices \({S}^{+}\), \({S}^{-}\), and \({C}_{i}\) were computed for the 15,000 extracted sample points. Within this framework, the values for \({S}^{+}\) and \({S}^{-}\) ranged from 0.02 to 68.54 and from 0.006 to 0.07, respectively (Fig. 5a and b). The composite index \({C}_{i}\) was then employed to create the final Flood Susceptibility Index (FSI) map of the Brahmaputra basin using the TOPSIS methodology. The \({C}_{i}\) values exhibited a maximum and minimum of 0.260 and 0.019, respectively. In the VIKOR technique, the indices \({S}_{j}\) (positive), \({R}_{j}\) (negative), and \({Q}_{j}\) were examined to delineate flood susceptibility zones. The \({S}_{j}\) values ranged between 0.559 and 0.708, while the \({R}_{j}\) values ranged from 0.100 to 0.130 (Fig. 5c and d). Subsequently, the \({Q}_{j}\) indices were employed to construct the FSI map using the VIKOR methodology. The \({Q}_{j}\) values spanned from a minimum of 0.477 to a maximum of 0.843. In the case of the EDAS technique, the indices \({NSP}_{i}\) (positive), \({NSN}_{i}\) (negative), and \({AS}_{i}\) (appraisal score) were calculated to assess flood susceptibility. Specifically, \({NSP}_{i}\) varied from 0.225 to 0.484, and \({NSN}_{i}\) ranged from 0.993 to 0.997 (Fig. 6a and b). The calculated \({AS}_{i}\) values were utilized to generate the FSI map, with the highest and lowest values being 0.738 and 0.610, respectively. The computed values for \({C}_{i}\), \({Q}_{j}\), and \({AS}_{i}\) are presented in Table 6.
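To make the role of the separation measures and the closeness index concrete, the sketch below implements the textbook form of TOPSIS (vector normalisation, weighting, Euclidean distances to the ideal and anti-ideal solutions). It is an assumption that this matches the variant used in the study. The function name, the two criteria, their weights, and the three sample points are all hypothetical.

```python
import math


def topsis(matrix, weights, beneficial):
    """Return the closeness index C_i = S- / (S+ + S-) for each row.

    matrix: rows are sample points, columns are criteria.
    beneficial[j] is True when a larger value of criterion j is better.
    """
    n_cols = len(matrix[0])
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal (best) and anti-ideal (worst) value per criterion.
    best = [max(c) if beneficial[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if beneficial[j] else max(c) for j, c in enumerate(columns)]
    closeness = []
    for row in weighted:
        s_plus = math.sqrt(sum((x - b) ** 2 for x, b in zip(row, best)))
        s_minus = math.sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        closeness.append(s_minus / (s_plus + s_minus))
    return closeness


# Hypothetical data: column 0 treated as beneficial, column 1 as non-beneficial.
points = [[200.0, 50.0], [100.0, 300.0], [150.0, 150.0]]
c_values = topsis(points, weights=[0.6, 0.4], beneficial=[True, False])
print([round(c, 3) for c in c_values])
```

In this toy case the first point coincides with the ideal solution on both criteria and the second with the anti-ideal solution, so their closeness indices sit at the two ends of the 0 to 1 range, with the third point in between.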

Fig. 5 The positive and negative impact criteria for a Positive TOPSIS; b Negative TOPSIS; c Positive VIKOR; and d Negative VIKOR

Fig. 6 The positive and negative impact criteria for a Positive EDAS; b Negative EDAS; and c WPM and d WSM for the WASPAS model

Table 6 TOPSIS, VIKOR, EDAS and WASPAS solution and positions

4.4 Integration of WSM and WPM in WASPAS Methodology

The Weighted Aggregated Sum Product Assessment (WASPAS) technique amalgamates two distinct methods: the Weighted Sum Model (\({P}_{i}^{\left(1\right)}\)) and the Weighted Product Model (\({P}_{i}^{\left(2\right)}\)). This hybrid methodology calculates performance values by employing criteria weights as a mechanism to resolve specific problems. In the current study, this technique was executed using a dataset comprising 15,000 sample points, and it was implemented within the ArcGIS platform. Within the WASPAS framework, the \({P}_{i}^{\left(1\right)}\) values ranged between 0.242 and 0.317, while the \({P}_{i}^{\left(2\right)}\) values varied from 0.334 to 0.472 (Fig. 6c and d). Subsequently, \({P}_{i}\) values ranging from 0.288 to 0.388 were utilized for the generation of the final Flood Susceptibility Index (FSI) map.
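The combination of the two component models can be sketched as follows. This is a minimal illustration of the commonly cited WASPAS formulation with linear normalisation, not necessarily the exact procedure run in ArcGIS for this study. The blending parameter `lam = 0.5`, the function name, and the sample data are assumptions.

```python
import math


def waspas(matrix, weights, beneficial, lam=0.5):
    """Return (P1, P2, P) per row: weighted sum, weighted product, and blend.

    Normalisation: x / max for beneficial criteria, min / x otherwise,
    so all values must be strictly positive.
    lam weights the sum model; (1 - lam) weights the product model.
    """
    columns = list(zip(*matrix))
    results = []
    for row in matrix:
        norm = [x / max(columns[j]) if beneficial[j] else min(columns[j]) / x
                for j, x in enumerate(row)]
        p1 = sum(w * x for w, x in zip(weights, norm))        # Weighted Sum Model
        p2 = math.prod(x ** w for w, x in zip(weights, norm))  # Weighted Product Model
        results.append((p1, p2, lam * p1 + (1 - lam) * p2))
    return results


# Hypothetical data: column 0 treated as beneficial, column 1 as non-beneficial.
points = [[200.0, 50.0], [100.0, 300.0], [150.0, 150.0]]
waspas_scores = waspas(points, weights=[0.6, 0.4], beneficial=[True, False])
for p1, p2, p in waspas_scores:
    print(round(p1, 3), round(p2, 3), round(p, 3))
```

With `lam = 0.5` the final score is the simple average of the two component scores; other values of `lam` shift the emphasis toward one model or the other.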

4.5 Spatial Distribution and Vulnerability Assessment of Flood-Prone Zones in the Brahmaputra Basin

The generated FSI maps were systematically categorized into five distinct susceptibility levels: very low, low, moderate, high, and very high. This classification was executed using the Jenks Natural Breaks method within the ArcGIS software platform. According to the TOPSIS, VIKOR (Fig. 7), EDAS, and WASPAS (Fig. 8) models, the aggregate areas classified under the 'high to very high' susceptibility zones within the Brahmaputra River Basin are 77,441 km², 114,654 km², 124,676 km², and 150,619 km², respectively. In the Random Forest (RF) and Support Vector Machine (SVM) models (Fig. 9), the total areas delineated as 'high' and 'very high' susceptibility zones account for 88,786 km² and 105,239 km², respectively. In the ensemble model (Fig. 10), an aggregate area of 97,669 km² is classified within the 'high to very high' susceptibility zone (Table 7).

Fig. 7 Flood susceptibility map of the Brahmaputra River basin using TOPSIS (a) and VIKOR (b)

Fig. 8 Flood susceptibility map of the Brahmaputra River basin using EDAS (a) and WASPAS (b)

Fig. 9 Flood susceptibility map of the Brahmaputra River basin using SVM (a) and RF (b)

Fig. 10 Ensemble-based flood susceptibility map of the Brahmaputra River basin

Table 7 Susceptible areas of the Brahmaputra River basin based on the different models

Figure 11 reveals that approximately one-third of the Brahmaputra Basin is categorized as falling within 'moderate to very high' flood-prone areas, whereas over 50 percent of the basin is classified between 'low and very low' flood-prone categories. The spatial representation of the FSI maps indicates a remarkable consistency in the geographical locations designated as 'high to very high' flood-prone zones across all susceptibility models. Specifically, the maps highlight a particular vulnerability in the lower catchment areas of the basin.

Fig. 11 Area under flood susceptibility according to the different selected models

Regions including the Brahmaputra Valley in Assam (India), the northern territories of West Bengal (India), and the Rangpur and Rajshahi divisions in Bangladesh fall disproportionately within the 'high to very high' flood-prone zones according to all FSI models. Further, districts such as Dhemaji, Dibrugarh, Lakhimpur, Majuli, Darrang, Nalbari, Barpeta, Bongaigaon, and Dhubri in Assam, and Coochbihar and Jalpaiguri in West Bengal, show elevated susceptibility to flooding. In Bangladesh, the districts of Kurigram, Gaibandha, Bogra, Sirajganj, Pabna, Jamalpur, and Manikganj also fall within the 'high to very high' flood-susceptible zones.

Though the upper catchment of the basin does not regularly experience flooding, there are sporadic incidents attributed to excessive monsoonal rainfall and glacial outbursts, as documented by Nie et al. (2020), particularly affecting regions in Bhutan (Tempa 2022) and the Arunachal Pradesh state of India (Goswami et al. 2023). The present study also identifies river valleys in the upper catchment as potentially flood-prone areas, particularly in the susceptibility maps generated through the VIKOR, EDAS, and WASPAS methodologies.

4.6 Model Validation and Evaluation: Assessing the Efficacy of Flood Susceptibility Index Models

The Flood Susceptibility Index (FSI) maps, generated through multiple modeling approaches, underwent a comparative analysis using designated validation points. The Area Under the Curve—Receiver Operating Characteristic (AUC-ROC) analysis was employed for this purpose, utilizing "The ArcSDM" toolbox in ArcMap 10.7. Figure 12 delineates the AUC-ROC curves for TOPSIS, VIKOR, WASPAS, EDAS, SVM, RF, and Ensemble models.

Fig. 12 ROC-AUC curves for all the selected models

The AUC-ROC serves as a quantitative measure for evaluating the performance of FSI models and classifies them into distinct categories: Excellent (> 0.9), Accepted (0.8–0.9), Good (0.7–0.8), Considerable (0.6–0.7), and Poor (0.5–0.6). In the present analyses, the AUC values for the TOPSIS, VIKOR, WASPAS, EDAS, SVM, RF, and Ensemble models were 0.920, 0.888, 0.746, 0.860, 0.928, 0.935, and 0.947, respectively. These values substantiate the reliability and suitability of the chosen models for discerning flood susceptibility patterns. The AUC-ROC analysis revealed that the Ensemble model, RF, SVM, and TOPSIS yielded the highest levels of accuracy, followed by VIKOR, EDAS, and WASPAS. According to this grading metric, Ensemble, SVM, RF, and TOPSIS performed in the 'Excellent' range, VIKOR and EDAS were classified as 'Accepted,' and WASPAS fell into the 'Good' category.
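The sketch below illustrates, under stated assumptions, what the AUC measures and how the grading scale applies to the values reported above. It uses the rank-based (Mann-Whitney) definition of AUC rather than the ArcSDM toolbox, and it pairs the reported values with the models in the order they are listed for Fig. 12. The function names, the treatment of class boundaries, the "Ungraded" label, and the six validation points are hypothetical.

```python
def auc_roc(scores, labels):
    """AUC as the share of (flood, non-flood) pairs ranked correctly.

    labels: 1 for a flood point, 0 for a non-flood point. Ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def grade(auc):
    """Map an AUC value to the categories quoted in the text.

    Lower bounds are treated as inclusive; the text does not say which
    class owns a boundary value, so this is an assumption.
    """
    if auc > 0.9:
        return "Excellent"
    for lower, name in [(0.8, "Accepted"), (0.7, "Good"),
                        (0.6, "Considerable"), (0.5, "Poor")]:
        if auc >= lower:
            return name
    return "Ungraded"  # below 0.5 is not covered by the scale in the text


# Hypothetical validation sample: three flood and three non-flood points.
toy_auc = auc_roc([0.9, 0.8, 0.35, 0.4, 0.3, 0.2], [1, 1, 1, 0, 0, 0])
print(round(toy_auc, 3))

# AUC values as reported in the text, in the model order given for Fig. 12.
reported = {"TOPSIS": 0.920, "VIKOR": 0.888, "WASPAS": 0.746, "EDAS": 0.860,
            "SVM": 0.928, "RF": 0.935, "Ensemble": 0.947}
grades = {model: grade(value) for model, value in reported.items()}
print(grades)
```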

Furthermore, key statistical metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) were also examined to augment the model evaluation process (Table 8). Notably, the Ensemble model exhibited the lowest MAE value of 0.1447, the smallest MSE value of 0.4096, and the lowest RMSE of 0.1678, in comparison to the other selected ML algorithms and MCDM models (Table 9). These results imply a close agreement between the observed and predicted flood points when utilizing the Ensemble model. Additionally, the MAE, MSE, and RMSE metrics for the SVM, RF, and TOPSIS models demonstrated comparable performance to that of the Ensemble model. Hence, the validation findings affirm that, among all the techniques applied, the Ensemble, RF, SVM, and TOPSIS models exhibit an elevated level of predictive accuracy with minimal errors in the flood susceptibility index (FSI) modeling. Moreover, the results indicate that the TOPSIS model had the highest predictive accuracy compared to the other MCDM techniques.
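For reference, the three error measures can be computed as in the following minimal sketch, which uses their standard definitions. The function name and the observed and predicted values are hypothetical and serve only to show how the metrics relate to one another.

```python
import math


def error_metrics(observed, predicted):
    """Return (MAE, MSE, RMSE) using the standard definitions."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE by definition
    return mae, mse, rmse


# Hypothetical data: 1 = flood point, 0 = non-flood point, with a
# susceptibility score between 0 and 1 as the prediction.
observed = [1, 1, 0, 0]
predicted = [0.9, 0.6, 0.2, 0.3]
mae, mse, rmse = error_metrics(observed, predicted)
print(round(mae, 4), round(mse, 4), round(rmse, 4))
```

Lower values of all three measures indicate predictions that lie closer to the observed flood inventory.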

Table 8 Accuracy assessment of the four susceptible models for training data using different error measures
Table 9 Correlation studies between TOPSIS, VIKOR, EDAS and WASPAS

4.7 Correlation Analysis of Multi-Criteria Decision-Making Models

To assess the interrelatedness among the Multi-Criteria Decision Making (MCDM) models of TOPSIS, EDAS, VIKOR, and WASPAS, Spearman's Rank Correlation Coefficient (Spearman's ρ) was employed. This analysis was conducted on a dataset comprising 15,000 samples (Fig. 13). Given the non-normal distribution of data in the current study, Spearman's Rank Correlation Coefficient was deemed an appropriate statistical measure for examining the pairwise correlation among the employed MCDM models. When assigning ranks, the lowest value receives the highest rank in the VIKOR model, whereas in TOPSIS, EDAS, and WASPAS the highest value receives the highest rank (Mitra and Das 2023). Therefore, the VIKOR model shows a negative correlation with the other models. The analytical outcomes reveal a notably high correlation coefficient (ρ = 0.859) between EDAS and WASPAS, suggesting a close alignment in their normalization and aggregation results. Conversely, the model pairs TOPSIS-VIKOR, TOPSIS-EDAS, TOPSIS-WASPAS, VIKOR-EDAS, and VIKOR-WASPAS manifested weak to moderate levels of correlation, with ρ values of 0.393, 0.111, 0.117, 0.215, and 0.428, respectively. Moreover, the significance of these correlations was evaluated at the 0.01 level (two-tailed), affirming that the relationships between the FSI maps generated by these models are statistically significant. Consequently, the observed correlation coefficients not only underline the efficacy of the employed MCDM models in capturing flood susceptibility patterns but also point towards the close similarity between EDAS and WASPAS in particular.

Fig. 13 Correlation studies between a TOPSIS-VIKOR; b TOPSIS-EDAS; c TOPSIS-WASPAS; d VIKOR-EDAS; e VIKOR-WASPAS; f EDAS-WASPAS

5 Discussion

The present study adopts an innovative, integrative approach to delineate flood-prone zones within the complex geographical landscape of the Brahmaputra River Basin. Utilizing four hybrid Multi-Criteria Decision Making (MCDM) techniques—TOPSIS, VIKOR, EDAS, and WASPAS—alongside two Machine Learning (ML) algorithms—Support Vector Machine (SVM) and Random Forest (RF)—the study exploits Geographic Information Systems (GIS) to provide an empirical basis for flood resilience strategies. Previous work has often relied on either parametric (index-based) or numerical methods for flood hazard mapping, each with distinct advantages and drawbacks concerning modeling framework, predictive reliability, and data sensitivity (Dash and Sar 2020). Our work, by contrast, synergistically merges MCDA-based and ML algorithms to enhance predictive accuracy while maintaining computational efficiency.

The study incorporates a wide array of topographical, hydrological, climatic, and edaphic factors, as identified in previous research (Das 2020; Mitra and Das 2023), to generate a Flood Susceptibility Index (FSI) map for the basin. Significantly, higher weightages were allocated to variables such as elevation, flow accumulation, slope, drainage density, distance from the river, rainfall, and Topographic Wetness Index (TWI), consistent with extant literature (Mitra and Das 2023; Das and Gupta 2021; Gupta and Dixit 2022). The allocation of a 12% weightage to flow accumulation, second only to elevation, echoes the findings of Negese et al. (2022) and Dash and Sar (2020), emphasizing its critical role in flood dynamics, particularly in downstream areas.

The study’s geographical focus adds a layer of nuance to existing research. The basin’s topographical heterogeneity, comprising both mountainous and plain terrains, is considered, with particular attention paid to the flood vulnerabilities of the Brahmaputra Valley in India and Bangladesh during monsoon seasons. Pham et al. (2021) and Saikh and Mondal (2023) determined that, despite the models' differing performance in dividing the research area into various levels of flood susceptibility, all the models agreed that the low-lying parts near the rivers are the most flood-prone portions of the study area. The present study arrived at the same conclusions. Highly correlated performance metrics from the EDAS and WASPAS models reinforce the methodological soundness of our approach.

The model validation, based on AUC-ROC and various error metrics like MAE, MSE, and RMSE, underscored the effectiveness of the Ensemble, RF, SVM, and TOPSIS models in flood hazard zonation. These models were categorized as highly reliable, fulfilling the research objectives and extending the applicability of these methodologies to other flood-prone areas globally. The Ensemble model proved marginally better than the other models applied in this study, consistent with the findings of Pham et al. (2021), Bui et al. (2019), and Zhang and Ma (2012). These studies asserted that the ensemble methods provided better accuracy due to their ability to be integrated, which allows for a more thorough evaluation of the flooding and conditioning elements, improving the accuracy of the final map. Bui et al. (2019) and Zhang and Ma (2012) conducted their research on flash flood vulnerability using ensemble methods and mentioned that ensemble learning algorithms generally improve the generalization and forecasting capabilities of the basic models. Moreover, Saikh and Mondal (2023) also opined that ensemble models performed better than the other models. On the other hand, among the selected MCDM models, TOPSIS performed very well in identifying the flood susceptibility zones of the Brahmaputra basin. Additionally, the accuracy results of TOPSIS are comparable to those of the ML algorithms. Mitra et al. (2022) claimed that the TOPSIS model is highly recommended for flood susceptibility analysis in the sub-Himalayan region based on MAE, MSE, and RMSE. Also, Pathan et al. (2022) and Hadian et al. (2022) revealed that the TOPSIS approach estimates more precise flood risk coverage than the other MCDM approaches. According to Mousavi et al. (2022), TOPSIS is more accurate because it can apply discrete alternative challenges, which are the most crucial methods for resolving problems in the real world, and recognize viable alternatives right away. Additionally, the requirements are reduced by the paired comparison, and the process cannot be properly controlled by the capacity constraint. Thus, the results of this study suggest that ML algorithms and the MCDM-based TOPSIS model are highly preferable for identifying the flood susceptibility zones of the Brahmaputra River basin.

The study makes several novel contributions. It presents a holistic flood-risk analysis across the Brahmaputra basin, utilizing an unprecedented 18 flood-influencing factors. It also scrutinizes spatial variations in the basin’s physical settings, highlighting flood risk concentrations, particularly in Assam (Hazarika et al. 2018; Pareta 2021; Gupta and Dixit 2022) and Bangladesh (Haque and Nicholls 2018; Haque et al. 2023). Additionally, the study advances flood susceptibility literature by integrating recent MCDM methods like EDAS and WASPAS (Ghorabaee et al. 2015; Zavadskas et al. 2012) within a GIS framework. However, limitations persist in terms of spatial resolution disparities among the influencing factors and the inherent inability of any model to fully replicate complex real-world phenomena (Mitra and Das 2023). By addressing these multi-dimensional aspects comprehensively, the study not only advances the academic discourse on flood susceptibility but also provides a robust empirical framework for informed policy interventions as discussed below.

The flood susceptibility maps generated in this study can guide governments and policymakers in making informed decisions for targeted resource allocation for flood management, especially in the vulnerable areas identified in Assam and Bangladesh. The study's nuanced approach, integrating both MCDM and ML algorithms, offers a reliable basis for enhancing disaster preparedness strategies, such as early warning systems and emergency response protocols. The findings can inform land-use planning and zoning regulations, potentially discouraging the development of infrastructure in areas identified as highly susceptible to flooding. Accurate flood susceptibility assessment aligns with the broader goals of sustainable development, enabling long-term resilience and adaptation strategies for the region. By relying on an array of factors, including climatic and topographical data, the study can help facilitate a multi-disciplinary and stakeholder-inclusive dialogue on flood management. Given the study's methodological robustness, the findings may serve as a blueprint for flood susceptibility assessment in other similarly affected regions globally, thereby broadening its impact beyond the immediate geographical scope. Future research could focus on resolving the issue of disparate spatial resolutions among influencing factors, perhaps through advanced geostatistical methods. While the employed models proved reliable, their performance could potentially be further optimized through hybrid or ensemble approaches. Given the dynamic nature of flood susceptibility influenced by climate change, a longitudinal analysis employing the current model framework could provide insights into temporal trends. Additional variables encompassing socio-economic factors could be integrated into the current models to produce a more holistic susceptibility index. 
Ground-based observations could be utilized in future studies to validate the models' predictive efficacy, thereby enhancing their empirical robustness. Conducting similar studies in other flood-prone regions would not only validate the universal applicability of the proposed methodologies but also contribute to a global database on flood susceptibility. By addressing these implications, the current study not only augments the academic discourse surrounding flood susceptibility and management but also offers pragmatic solutions that are highly germane to policy formulation and implementation. Thus, it significantly advances both the theoretical and practical aspects of flood risk assessment and management.

6 Conclusion and Recommendations

Flooding remains a significant environmental hazard in the Brahmaputra River basin, necessitating an intricate analysis for effective management strategies. The current study employs an innovative approach by utilizing four hybrid Multi-Criteria Decision Making (MCDM) models (TOPSIS, VIKOR, EDAS, and WASPAS) in conjunction with two Machine Learning algorithms, Random Forest (RF) and Support Vector Machines (SVM), to develop a Flood Susceptibility Index (FSI) map of the basin. Utilizing geospatial technology, multiple thematic layers indicative of flood triggers were integrated to produce this FSI map, which was subsequently validated through the Area Under the Receiver Operating Characteristic curve (AUC-ROC) and other statistical metrics. Notably, the Ensemble, RF, SVM, and TOPSIS models emerged as particularly efficacious in delineating flood-prone zones, corroborated by robust validation scores exceeding 90% in AUC-ROC and registering below 30% in MAE, MSE, and RMSE. The geospatial findings explicitly identify Assam in India, the Coochbihar and Jalpaiguri districts of West Bengal in India, and the Rangpur and Rajshahi divisions of Bangladesh as regions with high susceptibility to flooding. These maps can serve as invaluable tools for a variety of stakeholders, ranging from policymakers and civil engineers to environmentalists and local administration officials, in both India and Bangladesh. Specifically, they provide empirical support for strategic decision-making aimed at enhancing flood mitigation and management efforts in the region, colloquially known as the Jamuna flood plain in Bangladesh. It is imperative to acknowledge the dynamic nature of flood risks, especially in the context of escalating climate change impacts. Projections based on the Representative Concentration Pathway 8.5 (RCP8.5) scenario suggest a heightened level of wetness and river discharge in the Brahmaputra basin by the end of the century (Rao et al. 2020).
Given these evolving threats, adaptation strategies are of paramount importance (Jongman 2018). The regions delineated as high-risk areas in this study are significantly populated, further complicated by informal settlements in mid-channel bars, locally termed "Char land," and near river banks. Both of these areas are intrinsically vulnerable to both flooding and bank erosion. Navigating these complexities to formulate effective management and mitigation measures presents an intricate challenge. However, a community-based resilience framework, often referred to as the "People-centered model," offers a promising avenue for disaster risk reduction (Haque et al. 2023; Huq and Bracken 2015; Haque 2019). Such a comprehensive and integrated approach not only enhances community preparedness and adaptability in the face of uncertain flood risks but also offers a sustainable model for safeguarding both human and environmental resources against flood hazards. The current study significantly advances our understanding of flood susceptibility modeling and provides a robust empirical foundation for future research, policy formulation, and on-the-ground interventions.