Introduction

Research background

The movement of suspended sediments is essential in various fields, such as the design of water structures, water management, and dam and river engineering1,2. Modelling the quantity of sediment load (SL) in rivers is vital for designing flow control and water storage facilities, for example, canals and dams3,4. In addition, suspended sediments affect the quality of drinking water supplied to residential localities and the water requirements of industry and agriculture5,6. Suspended sediment load (SSL) is the outcome of various physical processes, comprising detachment, transport, and settlement of particles, that depend upon the intensity and magnitude of rainfall, discharge in the river network, land use, physical features of the soil, and topography. Several studies have reported rainfall and discharge as the key governing factors for SSL7,8,9. Human activities like deforestation and land-use change can increase the input of fine sediment to streams10,11. In this context, SSL in river networks can be defined as the collective outcome of catchment management practices12. This suggests that suspended sediments result from complex and non-linear flow processes in rivers13. Hence, modelling the non-linear relationship between river flow and suspended sediments using different non-linear approaches has become a fundamental challenge for various scientific communities, such as engineering and water resources management14,15,16,17.

Because of the stochastic aspect of sediment particle transport in the flow and the non-linear behavior of suspended sediment problems, conventional computational techniques may be unsuccessful for precise SSL predictions18,19. In this context, AI methodologies have commonly been employed for predicting SSL in rivers20,21. Compared to mathematical and physical approaches, machine learning (ML) techniques are more prevalent because of lower cost, few parameters, and high accuracy22. ML algorithms have been effectively employed in the last two decades to model various hydrological and water resources problems. Employed ML algorithms for SSL prediction can be categorized into standalone and hybrid techniques. Taking examples of the implementation of standalone methods, various researchers investigated the efficacy of the ANN model for SL prediction23,24,25. In another study, SVM model was applied for predicting SL in three rivers in Malaysia. Outcomes revealed that predictive performance of SVM was higher compared to conventional techniques26. Again, regression model, GEP, and ANFIS (adaptive neuro-fuzzy inference system) methods were developed for SL prediction in three Malaysian rivers. They found that performance of GEP model was best compared to other models27. Potential of ANNs, ANFIS, WANN, and customary sediment rating curve (SRC) models were studied for daily SSL estimation in two gauging sites in USA28. Estimation accuracy assessment of applied models presented that WANN provided more precise SSL estimations than other models. Long short-term memory (LSTM) was considered for estimating sediment concentration on a daily basis in Schuylkill River, United States29. Also, Linear Regression (LR), MLP, Extreme Gradient Boosting, and LSTM were applied for sediment load prediction with different time intervals30. Both studies revealed that LSTM performed better than other applied models. 
In another study, ANN, ANFIS, least square-SVM (LS-SVM), and group method of data handling (GMDH) were employed to estimate SSL using monthly sediment and average river flow data31. Outcomes revealed that LS-SVM model generated higher accuracies compared to other models. Kumar and Tripathi32 predicted SSC applying ANN, SVM, and MLR models using runoff and sediment data from Musiri gauge station, located on bank of River Cauvery, India. Their findings showed that ANN model with a solitary hidden layer is most appropriate for SSC predictions.

Problem statement and literature

Given the importance of sediment load movement in sculpting the Earth's surface, the challenges associated with sediment transport prediction have attracted a great deal of interest33. It is important to use a well-founded tool to assess the suspended sediment load. Even with the advancement of modern numerical models, the movement of river sediment load is still difficult to understand. For example, a direct technique that necessitates the installation of a hydrometric station for sample collection and monitoring can be expensive and time-consuming, particularly in remote locations34. Although indirect approaches are less costly, the susceptibility of sediment particles to various environmental variables makes it difficult to reconcile theoretical results with observations35. Furthermore, the majority of experimentally validated equations cover only a limited range of environmental conditions, which limits their applicability36. Alternatively, the emergence of artificial intelligence algorithms, including ANN and SVM, has revolutionized many time series forecasting tasks, including sediment transport prediction, by avoiding the computation of the complex sediment transport rate. One of the key benefits of this strategy is that it does not require knowledge of the underlying physical mechanism of the complex sediment transport process37.

Regardless of their broad application, ML algorithms still have various flaws38,39,40,41. For predicting hydrological variables, it is necessary to train neural network models. The training stage finds the best values for connection weights, bias values, number of hidden layers, and number of neurons. Conventional training algorithms, such as backpropagation, have a propensity to get stuck in local minima. In addition, low convergence speed can also hamper the effectiveness of such techniques42,43. Even though AI techniques are widely used to predict different hydrological variables, such techniques necessitate fine-tuning using training algorithms. The complete AI model must be tuned to complete the final network training. Recently, optimization algorithms like GA, bat algorithm (BA), Grey Wolf Optimization (GWO), shark algorithm (SA), particle swarm optimization (PSO), and firefly algorithm (FA) have been utilized for training soft computing models and determining their best parameter values44,45,46,47,48,49,50,51.

Rajaee et al.52 applied SRC, multilinear regression (MLR), ANN, and Wavelet-ANN (W-ANN) models for daily SSL modelling at the Iowa River gauge station (US). They concluded that the W-ANN model showed better agreement with the observed SSL values and outperformed the other models considered. Kisi et al.53 investigated the accuracy of ANFIS-GA, ANFIS-PSO, ANFIS-ACO, and ANFIS-BOA models in drought forecasting using monthly precipitation data from the Biarjmand, Ebrahim-Abad, and Abbasabad stations in Iran. Adnan et al.54 proposed an alternative tool named dynamic evolving neural fuzzy inference system (DENFIS) for estimating SSL based on historical sediment and streamflow values recorded at the Guangyuan and Beibei stations in China. The results were compared with those of two other models, and the DENFIS model was found to generate improved SSL predictions. Hassanpour et al.55 applied an SVR-FCM hybrid model and compared it with SRC, ANN, ANFIS, and SVR models for predicting daily SSL in the River Sistan, Iran. They found that the SVR-FCM model predicted SSL more accurately in the study region. Banadkooki et al.42 proposed hybrid ANN-BA, ANN-PSO, and ANN-ALO models for SSL prediction in the Goorganrood basin, Iran. Based on a comparison of results, they observed that the ANN-ALO model generated the most accurate predictions in the study region. In another study, Ehteram et al.56 applied ANN-WA, ANN-PSO, and ANN-BA to optimize the performance of ANN in predicting SSL in the Goorganrood basin, Iran. They found that ANN-WA performed best. Nhu et al.57 used random subspace (RS), SVM with a radial basis function kernel (SVM-RBF), random forest (RF), and SVM with a normalized polynomial kernel (SVM-NPK) for SSL prediction in the Haraz catchment, situated in the mountainous Mazandaran Region (Iran). The results revealed that the RS model showed great potential for SL prediction in data-poor catchments.
PSO, ACO, BA, DE, BOA, and GOA algorithms have been successfully applied recently to improve the prediction accuracy of SVM models; for this reason, the BA, GOA, and BOA algorithms are chosen as benchmark optimization methods in this work. Even though these algorithms have produced adequate results in the current literature, there is still room to increase convergence speed, find optimal solutions quickly, and prevent trapping in local minima58. Therefore, in order to create a reliable predictive model for SSL, a new optimization algorithm, the sparrow search algorithm (SSA), is investigated in this work.

Objective of this study

As discussed in the related literature, optimization algorithms enhance the convergence speed of conventional ML models and increase their accuracy48,49,59. The SSA is a population-based optimization algorithm that was proposed based on the foraging and anti-predatory behaviors of sparrow populations and builds upon existing swarm intelligence algorithms such as GWO, ALO, and PSO. It presents certain advantages in terms of stability, convergence accuracy, and speed. To the best of the authors' knowledge, no previous effort has been made to apply an SSA-based SVM model for SSL prediction in the Brahmani River basin, which is the objective of the present study. By dividing the sparrow population into three groups (discoverers, entrants, and guards), the thresholds and input weights of the SVM are optimized. Moreover, various scenarios have been adopted to model the input–output architecture for achieving the best prediction accuracy for SSL. Lastly, a complete comparative assessment has been conducted to evaluate the performance of the novel SVM-SSA model against other AI algorithms. The outcomes show that the proposed SSA algorithm increases convergence speed and efficiently prevents the optimization procedure from falling into a local optimum.

Study area

River Brahmani flows in the eastern portion of India between 20° 30′ 10″ and 23° 36′ 42″ N latitudes and 83° 52′ 55″ and 87° 00′ 38″ E longitudes (Fig. 1, generated using the ArcGIS software environment). The Mahanadi basin lies on the right of the basin and the Baitarani basin on the left; the total catchment area is 39,313.50 km². Climatic conditions of the Brahmani basin are tropical, with moderately cold winters and fairly hot summers. The average annual rainfall in the basin is 1305 mm, and most of the rain occurs under the influence of the southwest monsoon season, i.e., between June and October. In summer, the maximum temperature goes as high as 47 °C, and in winter the minimum temperature drops to as low as 4 °C. Brahmani River is the major source of water supply for various industries and townships and for irrigation in the state of Odisha (India).

Figure 1. Illustration of the Brahmani River basin.

Materials and methods

Support vector machine (SVM)

SVM is a supervised binary learning algorithm first presented by Vapnik60. It is based on statistical learning theory and structural risk minimization. The objective in developing an SVM model is to minimize both errors and model complexity61. It maps the input space to a high-dimensional feature space to find the best separating hyperplane from a training data set. The general architecture of SVM is illustrated in Fig. 2.

Figure 2. Schematic structure of SVM-based model.

Between the points of two distinct classes, within a definite error boundary, an optimum separating hyperplane is sought in the real space of \(n\) coordinates (the \({x}_{i}\) components of the vector \(x\)). Let \(x\) and \(y\) represent the input and output variables. If \({x}_{i}\in {R}^{n}\), \({y}_{i}\in \{-1, 1\}\) and \(i=1,\dots ,n,\) then the optimum separating hyperplane is computed using a classification decision function:

$$g\left(x\right)=sgn(\sum_{i=1}^{n}{y}_{i}{\alpha }_{i}K\left({x}_{i},{x}_{j}\right)+b)$$

where \(n\) is the number of input variables; \({\alpha }_{i}\) are the Lagrange multipliers; \(K\left({x}_{i},{x}_{j}\right)\) is the kernel function; and \(b\) is the offset of the hyperplane from the origin. Common kernel functions are linear, RBF, sigmoidal, and polynomial. RBF is most often used for its robust tolerance to input noise, simple design, online learning capability, and good generalisation. The RBF kernel function is described using the following expression59,62: \(K\left({x}_{i},{x}_{j}\right)=\text{exp}\left(-\gamma {\Vert {x}_{i}-{x}_{j}\Vert }^{2}\right)\), where \(\gamma\) controls the degree of nonlinearity of the SVM model. Large and small \(\gamma\) values cause over- and under-fitting of the training data, respectively.
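As a minimal sketch of the decision function and the RBF kernel described above, the following Python snippet evaluates both for a toy problem. The function names, the support vectors, and the multiplier values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel: exp(-gamma * ||x_i - x_j||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, labels, alphas, b, gamma):
    """Sign of sum_i y_i * alpha_i * K(x_i, x) + b, returned as +1 or -1."""
    score = b
    for x_i, y_i, alpha_i in zip(support_vectors, labels, alphas):
        score += y_i * alpha_i * rbf_kernel(x_i, x, gamma)
    return 1 if score >= 0 else -1


# Hypothetical toy values, not taken from the study.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
print(svm_decision([1.9, 2.1], support_vectors, labels, alphas, b=0.0, gamma=1.0))
```

A larger `gamma` makes the kernel decay faster with distance, so each support vector influences a smaller neighbourhood, which is consistent with the over-fitting tendency noted above.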

Bat algorithm (BA)

Yang 63 introduced BA emulating echolocation behaviour of a bat. In nature, there are several types of bats. When bats navigate and hunt, they all have similar behaviour; but are different in weight and size. Microbats broadly use echolocation characteristic that helps them to seek prey and avoid hurdles in complete darkness64. Artificial bats have a velocity vector, frequency vector, and position vector in BA, updated in the period of repetitions. BA can discover search space using velocity and position vectors (Fig. 3).

Figure 3. Flowchart of BA algorithm.

Every bat has a frequency \({F}_{i}\), velocity \({V}_{i}\), and position \({X}_{i}\) in a \(d\)-dimension search space. Position, frequency, and velocity vectors are updated using following equations.

$${V}_{i}\left(t+1\right)={V}_{i}\left(t\right)+({X}_{i}\left(t\right)-Gbest)\times {F}_{i}$$
(1)
$${X}_{i}\left(t+1\right)={X}_{i}\left(t\right)+{V}_{i}\left(t+1\right)$$
(2)

where \(Gbest\) is the optimal solution obtained thus far and \({F}_{i}\) is the \(i\)th bat’s frequency, which is updated during each iteration as expressed below:

$${F}_{i}={F}_{min}+({F}_{max}-{F}_{min})\times \beta$$
(3)

where \(\beta\) is a random number drawn from a uniform distribution between 0 and 1. As given below, a random walk is employed in BA for improving its exploitation capability:

$${x}_{new}={x}_{old}+\varepsilon {A}_{t}$$
(4)

where \(\varepsilon\) is a random number between −1 and 1 and \({A}_{t}\) is the intensity (loudness) of the produced sound at iteration \(t\). The loudness and the pulse emission rate \((r)\) are updated at each iteration as expressed below:

$${A}_{i}\left(t+1\right)=\alpha {A}_{i}(t)$$
(5)
$${r}_{i}\left(t+1\right)={r}_{i}(0)(1-{e}^{(-\gamma \times t)})$$
(6)

where \(\alpha\) and \(\gamma\) are constants between 0 and 1 used for updating the loudness \({A}_{i}\) and the pulse rate \({r}_{i}\), respectively. Pseudocode of BA is given below.

Algorithm 1. Algorithm BA.
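The update rules in Eqs. (1) to (6) can be sketched in Python as follows. This is an illustrative minimal version and not the authors' implementation: the population size, frequency range, initial loudness and pulse rate, the acceptance rule, and the choice of the current best solution as the centre of the random walk are all assumptions, and the sphere function is a stand-in objective.

```python
import math
import random


def bat_algorithm(objective, dim, bounds, n_bats=20, n_iter=100,
                  f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Minimise `objective` with a basic bat algorithm (illustrative settings)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(value):
        return max(lo, min(hi, value))

    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_bats)]
    V = [[0.0] * dim for _ in range(n_bats)]
    A = [1.0] * n_bats    # loudness A_i (assumed initial value)
    r0 = [0.5] * n_bats   # initial pulse rate r_i(0) (assumed value)
    r = [0.0] * n_bats    # current pulse rate r_i
    fit = [objective(x) for x in X]
    best = min(range(n_bats), key=lambda k: fit[k])
    gbest, gbest_fit = X[best][:], fit[best]

    for t in range(1, n_iter + 1):
        mean_A = sum(A) / n_bats
        for i in range(n_bats):
            F = f_min + (f_max - f_min) * rng.random()                      # Eq. (3)
            V[i] = [v + (x - g) * F for v, x, g in zip(V[i], X[i], gbest)]  # Eq. (1)
            cand = [clip(x + v) for x, v in zip(X[i], V[i])]                # Eq. (2)
            if rng.random() > r[i]:
                # Random walk around the current best solution, Eq. (4)
                cand = [clip(g + rng.uniform(-1.0, 1.0) * mean_A) for g in gbest]
            f_new = objective(cand)
            if f_new <= fit[i] and rng.random() < A[i]:
                X[i], fit[i] = cand, f_new
                A[i] = alpha * A[i]                                         # Eq. (5)
                r[i] = r0[i] * (1.0 - math.exp(-gamma * t))                 # Eq. (6)
            if f_new < gbest_fit:
                gbest, gbest_fit = cand[:], f_new
    return gbest, gbest_fit


def sphere(x):
    """Stand-in objective: sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


best_x, best_f = bat_algorithm(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_f)
```

In an SVM tuning setting, `objective` would instead return a validation error for a candidate set of SVM parameters; that coupling is not shown here.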

Grasshopper optimization algorithm (GOA)

Saremi et al.65 proposed a robust metaheuristic optimisation algorithm called GOA that mimics the swarming behaviour of grasshoppers, which comprise adults (having wings) and nymphs (not having wings). Adults are utilised for globally searching the entire search space (exploration) and finding better food source areas66. In contrast, nymphs are utilised for exploiting a specific neighbourhood of a given location (exploitation). GOA efficiently balances exploitation and exploration, and its mathematical formulation is comparatively simple (Fig. 4).

Figure 4. Flowchart of GOA algorithm.

The behaviour of grasshopper swarms seeking food sources is expressed using the following equation:

$${X}_{i}(t+1)=c\left[\sum_{\begin{array}{c}j=1\\ j\ne i\end{array}}^{N}c\frac{u{b}_{d}-l{b}_{d}}{2}s\left(\left|{X}_{j}(t)-{X}_{i}(t)\right|\right)\frac{{X}_{j}(t)-{X}_{i}(t)}{{d}_{ij}}\right]+\widehat{{T}_{d}}$$
(7)

where \({X}_{i}(t+1)\) is the position of the \(i\)th grasshopper at iteration \(t + 1\) and \(c\) is a decreasing coefficient that balances the exploitation and exploration phases. \(c\) is given by:

$$c(t)={c}_{max}-t\frac{{c}_{max}-{c}_{min}}{{t}_{max}}$$
(8)

where \({c}_{min}\) and \({c}_{max}\) are the minimum and maximum values of \(c(t)\), respectively, and \({t}_{max}\) and \(t\) are the maximum and current number of iterations. In Eq. (7), \(l{b}_{d}\) and \(u{b}_{d}\) are the lower and upper bounds of the \(d\)th dimension of the search space, \({d}_{ij}\) is the distance between the \(i\)th and \(j\)th grasshoppers, and \(\widehat{{T}_{d}}\) is the location of the solution having the best fitness. Lastly, \(s\left(d\right)\) signifies the social forces, which can be computed using:

$$s\left(d\right)=f{e}^{-\frac{d}{{l}_{s}}}-{e}^{-d}$$
(9)

where \(f\) is the attraction intensity and \({l}_{s}\) is the attractive length scale. Further details can be found in67,68.

Algorithm 2. Algorithm GOA.
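A minimal Python sketch of Eqs. (7) to (9) is given below. The default values `f=0.5`, `l_s=1.5`, `c_min=1e-5`, and `c_max=1.0` are assumptions chosen for illustration, the social force is evaluated per dimension, and the clipping to the bounds is an added safeguard; none of these details are stated in the text above.

```python
import math


def social_force(d, f=0.5, l_s=1.5):
    """Eq. (9): s(d) = f * exp(-d / l_s) - exp(-d)."""
    return f * math.exp(-d / l_s) - math.exp(-d)


def reduction_coefficient(t, t_max, c_min=1e-5, c_max=1.0):
    """Eq. (8): linear decrease of c from c_max to c_min."""
    return c_max - t * (c_max - c_min) / t_max


def goa_step(X, target, c, lb, ub, f=0.5, l_s=1.5):
    """Eq. (7): one position update for every grasshopper in X."""
    n, dim = len(X), len(X[0])
    new_X = []
    for i in range(n):
        position = []
        for d in range(dim):
            total = 0.0
            for j in range(n):
                if j == i:
                    continue
                d_ij = math.dist(X[i], X[j])  # Euclidean distance (Python 3.8+)
                if d_ij == 0.0:
                    continue  # coincident grasshoppers exert no force here
                total += (c * (ub[d] - lb[d]) / 2.0
                          * social_force(abs(X[j][d] - X[i][d]), f, l_s)
                          * (X[j][d] - X[i][d]) / d_ij)
            value = c * total + target[d]
            position.append(min(ub[d], max(lb[d], value)))
        new_X.append(position)
    return new_X


# Hypothetical swarm of three grasshoppers in two dimensions.
swarm = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
c = reduction_coefficient(t=10, t_max=100)
print(goa_step(swarm, target=[0.5, 0.5], c=c, lb=[-5.0, -5.0], ub=[5.0, 5.0]))
```

Because `c` multiplies both the inner and outer terms, the swarm contracts around the target as iterations progress, which is how the coefficient shifts the search from exploration to exploitation.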

Butterfly optimization algorithm (BOA)

Arora and Singh69 proposed a new bionic optimization algorithm by simulating butterflies' mating and foraging behaviour, namely, BOA. The underlying operational process of BOA is based on the observation that during food search, butterflies produce specific fragrances related to their fitness. Also, the fitness of a butterfly changes accordingly as it travels from one search area to another. Fragrance is transmitted in search procedure, and meanwhile, a butterfly can recognize variations in the fragrance of other butterflies70. Butterflies travel in the direction of the one butterfly with a more potent fragrance in global search. At the same time, butterflies move arbitrarily in local search for searching food when they cannot sense fragrance from other butterflies. Mainly, fragrance having an exclusive aroma in every butterfly is the distinctive feature of BOA that can be expressed as in Eq. (10):

$$f={cl}^{a}$$
(10)

where \(f\) is the perceived magnitude of the fragrance, that is, how strongly other butterflies perceive it; \(c\) is the sensory modality; \(l\) is the stimulus intensity; and \(a\) is the power exponent, which depends on the modality and accounts for the varying degree of absorption. In BOA, there are two key steps: global and local search phases. In the global search, a butterfly takes a step in the direction of the fittest solution/butterfly \({g}^{*}\), as expressed in Eq. (11).

$${x}_{i}^{t+1}={x}_{i}^{t}+({r}^{2}\times {g}^{*}-{x}_{i}^{t})\times {f}_{i}$$
(11)

where \({x}_{i}^{t}\) is the solution vector \({x}_{i}\) of the \(i\)th butterfly in iteration \(t\); \({g}^{*}\) is the best solution found among all solutions in the present iteration; \(r\) is a random number between 0 and 1; and \({f}_{i}\) is the fragrance of the \(i\)th butterfly.

Local search is formulated using following equation:

$${x}_{i}^{t+1}={x}_{i}^{t}+({r}^{2}\times {x}_{j}^{t}-{x}_{k}^{t})\times {f}_{i}$$
(12)

where \({x}_{j}^{t}\) and \({x}_{k}^{t}\) are the \(j\)th and \(k\)th butterflies from the solution space. Figure 5 provides a basic flow diagram of the algorithm.

Figure 5. Flowchart of BOA algorithm.

Algorithm 3. Algorithm BOA.
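The fragrance and the two movement rules in Eqs. (10) to (12) can be sketched as below. The default values `c=0.01` and `a=0.1` are assumed illustrative settings, and the stimulus intensity is taken to be non-negative so that the fractional power is defined; how a butterfly chooses between the global and local move is not shown.

```python
def fragrance(stimulus, c=0.01, a=0.1):
    """Eq. (10): f = c * l^a, with l the (non-negative) stimulus intensity."""
    return c * stimulus ** a


def global_move(x_i, g_best, f_i, r):
    """Eq. (11): step towards the best solution g*."""
    return [x + (r ** 2 * g - x) * f_i for x, g in zip(x_i, g_best)]


def local_move(x_i, x_j, x_k, f_i, r):
    """Eq. (12): step based on two other butterflies x_j and x_k."""
    return [x + (r ** 2 * xj - xk) * f_i for x, xj, xk in zip(x_i, x_j, x_k)]


# Hypothetical values for illustration only.
f_i = fragrance(4.0)
print(global_move([1.0, 1.0], [2.0, 0.0], f_i, r=0.5))
print(local_move([1.0, 1.0], [3.0, 3.0], [0.0, 2.0], f_i, r=0.5))
```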

Sparrow search algorithm (SSA)

Xue and Shen71 proposed SSA based on theory of anti-predation and foraging behavior of sparrows. SSA is novel and has advantages like fast convergence speed and strong optimisation capability. Mainly the procedure of sparrow foraging is simulated by SSA. The procedure is a producer-joiner model, and it overlays the early warning and detection mechanism72. Producers are those individual sparrows who find food without difficulty, and other entities are joiners. A specific number of sparrows in the population are chosen for early warning and investigation at the same time. However, food is abandoned if danger is found since safety is the priority.

Mathematical Model of Sparrow Search Algorithm.

Individuals can be categorized as alerters, participants, or discoverers in SSA. The discoverer is in charge of organizing the population's hunt and locating food. In order to grab food, the participants follow the discoverer. When environmental dangers arise, the alerter notifies the sparrow population to flee to a safe location.

It is necessary to create the following rules to simplify the behavior of the sparrow in order to represent the eating process of the bird using a mathematical model:

i. The objective function's fitness evaluation determines the environment's fitness in the sparrow population, and the finder's fitness is greater than the participants'.

ii. The discoverer and the participant have an internal competitive relationship. In an attempt to boost their own energy, some participants watch how the discoverer behaves in order to compete for food.

iii. Less energetic sparrows may relocate in search of more energetic ones.

iv. Sparrows possess adaptable individual behavioral methods that enable them to alternate between participants and discoverers, rendering them highly fit discoverers; yet, the population's proportion of participants and discoverers does not change.

v. When a sparrow population's alarm value exceeds the security threshold, the finder flees from its current location and guides the population to a secure spot. Warners in the population warn when they perceive an external environmental threat.

vi. In order to minimize the risk of their own predation, the alert will take the initiative in escaping when it detects external environmental threats or natural enemies. The alert at the population center will randomly transition from a feeding state to an active state, while the alert at the population edge will move closer to the population center.

Step 1: Initialize the solution. At this stage, the population size, the maximum number of iterations, the producer ratio (PD), and the PV ratio (the proportion of vigilant sparrows) are set. Equation (13) gives the initial positions of the sparrow population, which are generated at random.

$$X=\left[\begin{array}{ccccc}{x}_{\text{1,1}}& {x}_{\text{1,2}}& \cdots & \cdots & {x}_{1,d}\\ {x}_{\text{2,1}}& {x}_{\text{2,2}}& \cdots & \cdots & {x}_{2,d}\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ {x}_{n,1}& {x}_{n,2}& \cdots & \cdots & {x}_{n,d}\end{array}\right]$$
(13)

In Eq. (13), \(n\) is the number of sparrows and \(d\) is the dimension of the decision variables. Equation (14) is used to evaluate the fitness of each individual for the subsequent operations. Each row in \({F}_{X}\) represents the fitness of one individual, and \(n\) in Eq. (14) is the number of sparrows:

$${F}_{X}=\left[\begin{array}{c}f\left(\left[\begin{array}{ccccc}{x}_{1,1}& {x}_{1,2}& \cdots & \cdots & {x}_{1,d}\end{array}\right]\right)\\ f\left(\left[\begin{array}{ccccc}{x}_{2,1}& {x}_{2,2}& \cdots & \cdots & {x}_{2,d}\end{array}\right]\right)\\ \vdots \\ f\left(\left[\begin{array}{ccccc}{x}_{n,1}& {x}_{n,2}& \cdots & \cdots & {x}_{n,d}\end{array}\right]\right)\end{array}\right]$$
(14)

Step 2: In SSA, producers with better fitness values are given priority in obtaining food during the search. Because producers are in charge of locating food and guiding the movement of the entire population, they are able to search a wider area than the joiners. In SSA, the discoverer’s (producer's) location update formula is expressed using

$${X}_{i}^{t+1}=\left\{\begin{array}{ll}{X}_{i}^{t}\,\text{exp}\left(-\frac{i}{a\,{t}_{max}}\right),& \text{if } R<S\\ {X}_{i}^{t}+QL,& \text{if } R\ge S\end{array}\right.$$
(15)

where \(t\) is the current iteration number and \({X}_{i}\) is the position of the \(i\)th sparrow. \(a\) is a random number in [0, 1]. \(S\) \((S\in [0.5, 1])\) and \(R\) \((R\in [0, 1])\) denote the safety and warning parameters, respectively; \(R\) is a random number and \(S\) is a specified constant. When \(R<S\), the search environment is safe, there is no danger to the population, and the discoverer can search over a broad range. When \(R \ge S\), scouts have found a threat, so the search strategy is adjusted and the sparrows rapidly move to a safe region. \(Q\) is a random number following a normal distribution, and \(L\) is an all-ones matrix of dimension \(1 \times d\).

Joiner’s position update formula is formulated by

$${X}_{i}^{t+1}=\left\{\begin{array}{ll}Q\,\text{exp}\left(\frac{{X}_{r}-{X}_{i}^{t}}{{i}^{2}}\right),& \text{if } i>\frac{n}{2}\\ {X}_{b}^{t+1}+\left|{X}_{i}^{t}-{X}_{b}^{t+1}\right|{A}^{+}L,& \text{otherwise}\end{array}\right.$$
(16)

where \({X}_{b}\) is the best position currently occupied by a producer and \({X}_{r}\) is the current global worst position. \(A\) is a matrix of dimension \(1 \times d\) in which every element is 1 or −1, and \({A}^{+}={A}^{T}{(A{A}^{T})}^{-1}\).

The scout’s position update formula is given by

$${X}_{i}^{t+1}=\left\{\begin{array}{ll}{X}_{B}^{t}+\beta \left|{X}_{i}^{t}-{X}_{B}^{t}\right|,& \text{if } {f}_{i}\ne {f}_{B}\\ {X}_{i}^{t}+K\frac{\left|{X}_{i}^{t}-{X}_{r}^{t}\right|}{\left({f}_{i}-{f}_{R}\right)+\varepsilon },& \text{if } {f}_{i}={f}_{B}\end{array}\right.$$
(17)

where \({X}_{B}\) is the current global optimal position; \(\beta\) is the step-length control parameter, a random number drawn from a normal distribution with mean 0 and variance 1; \(K\) is a random number in [−1, 1]; \({f}_{i}\) is the fitness value of the current sparrow; \({f}_{R}\) and \({f}_{B}\) are the current global worst and best fitness values, respectively; and \(\varepsilon\) is a tiny constant that avoids division by zero. At the end of the iterations, the optimisation result is output. The flowchart of SSA is given in Fig. 6.

Figure 6. Flowchart of SSA algorithm.

Algorithm 4. Algorithm SSA.
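The three position updates in Eqs. (15) to (17) can be sketched for a single sparrow as follows. This is an illustrative reading of the formulas rather than the authors' code: the function names are hypothetical, `a` is sampled from (0, 1] to avoid division by zero, and the product \(\left|{X}_{i}-{X}_{b}\right|{A}^{+}L\) is evaluated using the fact that \(A{A}^{T}=d\) for a \(1 \times d\) matrix of ±1 entries.

```python
import math
import random


def update_discoverer(x, i, t_max, S, rng):
    """Eq. (15): shrink the position when safe (R < S), else shift by Q * L."""
    R = rng.random()
    if R < S:
        a = 1.0 - rng.random()  # in (0, 1], avoids division by zero
        scale = math.exp(-i / (a * t_max))
        return [v * scale for v in x]
    Q = rng.gauss(0.0, 1.0)
    return [v + Q for v in x]  # L is all ones, so Q is added to every dimension


def update_joiner(x, i, n, x_best, x_worst, rng):
    """Eq. (16): the worse half moves away; the rest move near the best producer."""
    dim = len(x)
    if i > n / 2:
        Q = rng.gauss(0.0, 1.0)
        return [Q * math.exp((w - v) / i ** 2) for v, w in zip(x, x_worst)]
    A = [rng.choice((-1, 1)) for _ in range(dim)]
    # A+ = A^T (A A^T)^-1 = A^T / dim, so |x - x_best| A+ is a scalar step.
    step = sum(abs(v - b) * s for v, b, s in zip(x, x_best, A)) / dim
    return [b + step for b in x_best]


def update_scout(x, f_i, x_best, f_best, x_worst, f_worst, rng, eps=1e-50):
    """Eq. (17): move towards the global best, or away from the worst position."""
    if f_i != f_best:
        beta = rng.gauss(0.0, 1.0)
        return [b + beta * abs(v - b) for v, b in zip(x, x_best)]
    K = rng.uniform(-1.0, 1.0)
    return [v + K * abs(v - w) / ((f_i - f_worst) + eps)
            for v, w in zip(x, x_worst)]


rng = random.Random(0)
print(update_discoverer([2.0, -4.0], i=1, t_max=100, S=0.8, rng=rng))
```

Here the index `i` is assumed to be the rank of the sparrow after sorting the population by fitness, so that `i > n / 2` selects the worse half of the joiners.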

During the iteration procedure, if the new position of a sparrow is better than its preceding position, the position is updated, and the global optimal fitness and global optimal position are determined. Also, the identity of each sparrow is continuously updated and alternated during this phase. Any sparrow that is sufficiently well adapted can become a finder; however, the proportions of joiners and finders in the population remain constant.

Performance criteria

Four performance indices, namely R2, RMSE, MAE, and ENS, are considered in this study for measuring the accuracy of the applied models73,74. The mathematical expressions of these statistical measures are:

$${\text{R}}^{2}={(\frac{\sum_{i=1}^{N}({O}_{i} -\overline{{O }_{i}})({F}_{i} -\overline{{F }_{i}})}{\sqrt{\sum_{i=1}^{N}{({O}_{i} -\overline{{O }_{i}})}^{2}\sum_{i=1}^{N}{({F}_{i} -\overline{{F }_{i}})}^{2}}})}^{2}$$
(18)
$$\text{MAE }=\frac{1}{N}\sum_{i=1}^{N}\left|{F}_{i}-{O}_{i}\right|$$
(19)
$$\text{RMSE }=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left[{O}_{i}-{F}_{i}\right]}^{2}}$$
(20)
$$\text{ENS }=1-[\frac{\sum_{\text{i}=1}^{\text{N}}{\left({O}_{i}-{F}_{i}\right)}^{2}}{{\sum }_{\text{i}=1}^{\text{N}}{(\left|{O}_{i} -\overline{{O }_{i}}\right|)}^{2}}]$$
(21)

where \({O}_{i}\), \({F}_{i}\), \(\overline{{O }_{i}}\), and \(\overline{{F }_{i}}\) denote the observed, predicted, average observed, and average predicted values, respectively, and \(N\) is the number of observations.
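The four indices in Eqs. (18) to (21) can be computed directly from paired observed and predicted series, as in the following sketch. The function name and the sample values are hypothetical and serve only to illustrate the formulas.

```python
import math


def performance_metrics(observed, predicted):
    """Return R2, MAE, RMSE and ENS following Eqs. (18) to (21)."""
    n = len(observed)
    o_mean = sum(observed) / n
    f_mean = sum(predicted) / n
    ss_obs = sum((o - o_mean) ** 2 for o in observed)
    ss_pred = sum((f - f_mean) ** 2 for f in predicted)
    cross = sum((o - o_mean) * (f - f_mean) for o, f in zip(observed, predicted))
    sq_err = sum((o - f) ** 2 for o, f in zip(observed, predicted))
    return {
        "R2": (cross / math.sqrt(ss_obs * ss_pred)) ** 2,
        "MAE": sum(abs(f - o) for o, f in zip(observed, predicted)) / n,
        "RMSE": math.sqrt(sq_err / n),
        "ENS": 1.0 - sq_err / ss_obs,
    }


# Hypothetical values: every prediction is biased upward by one unit.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
print(performance_metrics(obs, pred))
```

In this example the predictions are perfectly correlated with the observations, so R2 equals 1 even though every prediction is off by one unit, whereas ENS drops well below 1; this is one reason to report the indices together.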

R2 is a statistical index in regression that shows how well the predictive models fit the observed data. When the value of R2 is 1, the predicted values perfectly fit the observed values, whereas a value of 0 indicates no linear relationship. RMSE is a quadratic scoring rule that measures the average error magnitude: it is the square root of the average of the squared differences between predicted and observed values. MAE, in contrast, measures the average absolute difference between predicted and observed values, without considering the direction of the error. Low values of MAE and RMSE indicate high confidence in the model predictions69. RMSE has the advantage of penalizing large errors more heavily and can thus be more suitable in certain circumstances. On the other hand, MAE is easier to interpret. In addition, ENS is a standardized measure of model accuracy whose value ranges from −∞ to 1. An ENS value of 1 indicates perfect agreement, while a value of 0 indicates that the model predicts no better than the mean of the observations. ENS is highly sensitive to extreme values because it is based on squared differences.

For each model scenario, Table 1 gives the input parameter combinations, where the discharge and suspended sediment load parameters are given at the current time step (Qt and SSLt) along with previous monthly lags. As Table 1 shows, five different scenarios were considered for estimating SSL using different input combinations of the SSLt−1, SSLt−2, SSLt−3, Qt, Qt−1, Qt−2, and Qt−3 parameters. It is worth noting that the scenarios were selected based on the correlation between the Q and SSL variables.

Table 1 Implemented models and their input parameters.
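As a rough illustration of how lagged input combinations such as SSLt−1, SSLt−2, SSLt−3, Qt, Qt−1, Qt−2, Qt−3 can be assembled from monthly series, consider the sketch below. The helper name `lagged_inputs` and the sample series are hypothetical, and how each row is paired with a target value is not specified here and is left to the caller.

```python
def lagged_inputs(q, ssl, q_lags, ssl_lags):
    """Return (t, features) pairs with Q and SSL taken at the requested lags.

    A lag of 0 means the current time step t; a lag of k means t - k.
    At least one lag must be given.
    """
    if len(q) != len(ssl):
        raise ValueError("q and ssl must have the same length")
    start = max(list(q_lags) + list(ssl_lags))  # first t with all lags available
    rows = []
    for t in range(start, len(q)):
        features = [q[t - k] for k in q_lags] + [ssl[t - k] for k in ssl_lags]
        rows.append((t, features))
    return rows


# Hypothetical monthly series, not data from the study.
q_series = [10.0, 20.0, 30.0, 40.0, 50.0]
ssl_series = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = lagged_inputs(q_series, ssl_series, q_lags=(0, 1, 2, 3), ssl_lags=(1, 2, 3))
for t, features in rows:
    print(t, features)
```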

Modeling results and analysis

Four hybrid SVM models employed in this study in integration with four different MAs namely BA, BOA, GOA, and SSA are compared for modeling SSL utilizing data of River Brahmani basin. Next, performance of the applied models was assessed against each other and conventional SVM model using statistical measures and graphical interpretations. This section provides the outcomes of the comparisons and effectiveness of all mentioned models at four proposed gauge stations. Two major influencing data, i.e., Q, and SSL, were applied to predict SSL. The outcomes of training and testing periods are provided in Tables 2, 3, 4, 5 and 6, showing prediction performance of all applied models on basis of R2, RMSE, ENS, and MAE criteria. The results obtained indicated that these proposed parameters effectively estimated the SSL.

Table 2 Performance of SVM models for SSL estimation for all data.
Table 3 Performance of SVM-BA models for SSL estimation for all data.
Table 4 Performance of SVM-GOA models for SSL estimation for all data.
Table 5 Performance of SVM-BOA models for SSL estimation for all data.
Table 6 Performance of SVM-SSA models for SSL estimation for all data.

The performance statistics of SVM1 during the testing phase for Jaraikela station, when SSL and discharge of the current month are considered, are as follows: R2 = 0.8996, ENS = 0.8941, MAE = 40.6325, RMSE = 40.768. Next, we consider a combination of SSLt−1, Qt−1 (one-month lag). Here the SVM2 model (R2 = 0.9.32, ENS = 0.8975, MAE = 39.341, RMSE = 39.4767) performs better than the SVM1 model. Similarly, when a combination of SSLt−1, SSLt−2, SSLt−3, Qt, Qt−1, Qt−2, Qt−3 (one-month, two-month, and three-month lags and the current month) is considered, the performance is: R2 = 0.90819, ENS = 0.90449, MAE = 35.9001, RMSE = 36.0338. It can be observed that, as the lag time is increased, there is a gradual performance improvement for SVM, as shown in Table 2; i.e., the prediction accuracy of the SVM5 model is better than that of SVM4, SVM3, and SVM2, which in turn are better than the SVM1 model.

The performance statistics of SVM-BA1, SVM-GOA1, SVM-BOA1 when SSL and discharge of current month is considered are as follows: R2 = 0.9324, ENS = 0.9286, MAE = 30.951, RMSE = 31.0848; R2 = 0.9441, ENS = 0.9383, MAE = 26.6632, RMSE = 26.799; R2 = 0.9514, ENS = 0.9458, MAE = 22.3218, RMSE = 22.4579 for Jaraikela station during the testing phase. When a combination of SSLt−1, Qt−1 is considered, the SVM-BA2, SVM-GOA2, SVM-BOA2 generates R2 = 0.9344, ENS = 0.9305, MAE = 30.194, RMSE = 30.3279; R2 = 0.9454, ENS = 0.9389, MAE = 25.7062, RMSE = 25.8427; R2 = 0.9541, ENS = 0.9505, MAE = 21.5987, RMSE = 21.7323 as compared to first model scenario. Further, the SVM-BA5, SVM-GOA5, SVM-BOA5 models gives R2 (0.9341), ENS (0.93782), MAE (27.812), RMSE (27.948); R2 (0.9508), ENS (0.94662), MAE (23.26), RMSE (23.3941); R2 (0.9596), ENS (0.95341), MAE (19.3658), RMSE (19.5018) when we consider a combination of SSLt−1, SSLt−2, SSLt−3, Qt, Qt−1, Qt−2, Qt−3. Thus in case of SVM-BA, SVM-GOA, SVM-BOA too, the last scenario performs better than the other four scenarios. The detailed results are shown in Tables 3, 4 and 5.

As Table 6 shows, considering SSLt−1, SSLt−2, SSLt−3, Qt, Qt−1, Qt−2, Qt−3 provides the best results, i.e., R2 = 0.97014, ENS = 0.96481, MAE = 15.3926, RMSE = 15.5287, followed by SSLt−1, SSLt−3, SSLt−2, SSLt−3, Qt−3 (R2 = 0.9691, ENS = 965, MAE = 16.0031, RMSE = 16.1372), and the worst performance was observed when SSLt, Qt is considered (R2 = 0.9636, ENS = 0.9578, MAE = 18.36, RMSE = 18.4958). The performance of the SVM-SSA method is the best among all hybrid models for all the stations during both training and testing phases. Among the selected stations, all the models performed best at Jaraikela, followed by Tilga, Jenapur, and Gomlai.

Tables 2, 3, 4, 5 and 6 provide the statistical assessment measures on the training and testing datasets for the applied techniques at Tilga, Jenapur, Jaraikela and Gomlai stations under five different input scenarios. They show a general trend in which performance (R2, ENS, MAE, RMSE) tends to improve as more input variables are included; this holds for all proposed models at all stations used for SSL prediction. Larger values of ENS or R2 indicate better results, whereas smaller values of RMSE or MAE indicate better results. The results in Table 6 indicate that the proposed ML models were adequately trained and verified. The predicted outcomes during the testing phase reflect the performance of the predictive models more reliably. As stated before, four hybrid SVM models and the conventional SVM model were employed for SSL prediction; every model had an equal number of MFs. The results make clear that the considered algorithms BA, GOA, BOA, and SSA enhanced the performance of conventional SVM during both training and testing stages. During the training period, SVM-SSA showed the best performance, with R2, RMSE, MAE, and ENS values of 0.99616, 0.14994, 0.01578, and 0.99195, respectively. Similarly, during the testing period, the R2, RMSE, MAE, and ENS values for SVM-SSA are 0.97014, 15.5287, 15.3926, and 0.96481, respectively.
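As a reading aid for the four measures, the sketch below computes them from paired observed and predicted values. It relies on two assumptions not stated in this section: R2 is taken as the squared Pearson correlation coefficient, and ENS as the Nash–Sutcliffe efficiency in its standard form. The function name `evaluate` and the sample values are hypothetical.

```python
import math
from typing import Dict, Sequence


def evaluate(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return R2, ENS, MAE and RMSE for paired observed/predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    # Error-based measures: smaller is better.
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Nash-Sutcliffe efficiency (assumed definition): 1 is a perfect fit.
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ens = 1.0 - ss_res / ss_tot

    # R2 as squared Pearson correlation (assumed definition).
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov * cov / (ss_tot * var_pred)

    return {"R2": r2, "ENS": ens, "MAE": mae, "RMSE": rmse}


# Toy example: predictions carry a constant bias of +1.
obs = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]
scores = evaluate(obs, biased)
```

In this toy case the correlation-based R2 stays at 1 while ENS drops to about 0.2, which is one reason for reporting both measures together with MAE and RMSE rather than relying on R2 alone.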

The scatter plots of data predicted by the SVM, SVM-BA, SVM-GOA, SVM-BOA, and SVM-SSA models against actual data during the training and testing phases are reported in Fig. 7. Generally, a model performs better when the points lie closer to the 45° line. The regression values between actual and predicted data differed for each of the approaches. The points of the SVM-SSA model are more tightly concentrated around the 45° line than those of the other four models. According to the graphical comparison between observed and predicted sediment load values, the SVM-SSA model achieved the highest correlation for all input combinations, followed by SVM-BOA, SVM-GOA, SVM-BA, and the ordinary SVM model. This reflects the ability of the SVM-SSA model to capture the varied sediment load observations at all four gauge stations. The SSA optimizer thus enhanced the performance of SVM-SSA relative to the other hybrid models and the conventional SVM model in all input scenarios.

Figure 7

Scatter diagram of actual vs predicted SSL for best combination using five proposed models at (a) Tilga, (b) Jenapur, (c) Jaraikela, and (d) Gomlai.

To visually compare the SSL predicted by the applied models (for the best input combination) with the available observed data, a time-series plot of predicted against observed SSL is presented in Fig. 8 for all stations. The plot illustrates that the hybridized ML algorithms have superior prediction capability, particularly in capturing peak SSL values, which is a significant improvement over the conventional model. The predicted values of the SVM model follow the fluctuation trend of the actual values only to a certain degree. The prediction trends of the SVM-BA and SVM-GOA models do not differ much from each other, and their overall deviation from the real values is quite high. The SVM-BOA model generated predictions with slightly smaller deviations from the real values. The SVM-SSA model shows the least overall fluctuation: apart from some differences between the predicted and actual trends in the validation phase, its predicted values are extremely close to the real values. Compared with the other predictive models, the SVM-SSA predictions therefore matched the actual SSL values more accurately. Based on the time series of modeled and observed SSL in Fig. 8, peak SSL values are well predicted by the SVM-SSA algorithm, with only a minor difference between the modeled and observed series.

Figure 8

Observed vs model-predicted SSL values based on each algorithm for (a) Tilga, (b) Jenapur, (c) Jaraikela and (d) Gomlai.

To visually evaluate the performance of the models in replicating the probability distribution of actual SSL data, violin plots were prepared. Violin plots of actual and model-predicted SSL data are shown in Fig. 9. A closer resemblance in the shape of the violins indicates more similar distributions of observed and simulated SSL data. Figure 9 illustrates a better similarity between actual and SVM-SSA simulated SSL at all four sites. The violin shapes of the hybridized models were more similar to the shape of the actual violin for all locations. The maximum disparity in the violin was observed for standalone SVM, followed by SVM-BOA, SVM-GOA, and SVM-BA. An assessment of outcomes at the four locations showed improved performance of all hybrid models in simulating observed SSL distributions. The reason is that, for a long-term forecasting task, it becomes much more difficult for a forecasting technique to capture dynamic changes in SSL because of the many uncertain factors involved in the complex hydrological process. The proposed method, which uses SSA to optimize the parameters of the SVM model, can nevertheless generate satisfactory forecasting outcomes. Based on the different statistical indicators, the performance of the hybrid models was found to fluctuate slightly between sites. The consistency of the outcomes indicates that the SVM-SSA model is the strongest at replicating SSL in the selected study region.

Figure 9

Violin plots of models for (a) Tilga, (b) Jenapur, (c) Jaraikela, and (d) Gomlai stations.

For a better understanding of the estimation accuracy of the five employed models (SVM, SVM-BA, SVM-GOA, SVM-BOA, and SVM-SSA), predicted and observed SSL values are compared across different ranges. Histogram plots of predicted and actual SSL values are shown in Fig. 10. Prediction of SSL at Jenapur station illustrates that for the highest (100000–200000 µg/l) and lowest (300000–400000 µg/l) ranges of SSL, the frequency (number of events) predicted by the SVM-SSA5 model agrees more closely with the frequency of actual values than the frequencies predicted by the other models. The performance of the SVM-SSA, SVM-BOA, and SVM-GOA models is fairly similar for middle-range values (200000–800000 µg/l); SVM-SSA shows a slight improvement over SVM-BOA and SVM-GOA and a larger improvement over SVM-BA and SVM. Overall, for all stations, the histogram plots show that during the training and testing periods, the probability distribution of predicted values is closer to that of the observed values for the hybrid models than for the simple SVM model.

Figure 10

Histogram plots of best scenario in SVM-based models for SSL prediction analysis at (a) Tilga, (b) Jenapur, (c) Jaraikela and (d) Gomlai.

For a better representation of the results obtained from the conventional and hybrid SVM models, bar charts of MAE (mg/l) and ENS are shown in Figs. 11 and 12, respectively. An MAE value closer to 0 and an ENS value closer to 1 indicate a more effective model. The SVM-SSA model exhibited lower MAE values and higher ENS values in the monthly forecasting scenario than the SVM-GOA, SVM-BOA, SVM-BA, and standalone SVM models. A comparison of the outcomes shown in Figs. 11 and 12 also reveals that the hybrid SVM models produced better results than the conventional model. Based on both ENS and MAE, SVM-SSA outperformed all other models.

Figure 11

Comparison plot of MAE for (a) training and (b) testing phase.

Figure 12

Comparison plot of ENS for (a) training and (b) testing phase.

The ability of the different models to predict peak values was also compared because, in river engineering problems, the peak values are the most important part of the prediction. The temporal variations of observed vs. predicted SSL obtained with the SVM-SSA, SVM-GOA, SVM-BOA, SVM-BA, and SVM algorithms are presented as time-series plots in Fig. 8 so that they can be assessed and compared with one another. The hybrid SVM approaches showed a better match with observed SSL values at all gauge stations; among the hybrids, SVM-BA shows the largest difference between observed and estimated SSL. In contrast, the simple SVM model overestimated some peak values and underestimated others. Overall, SVM-SSA produced better SSL predictions and more precise estimates of the peak values than the other models.
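The peak comparison above is made graphically. One simple way to quantify it, offered here only as an illustrative sketch and not as the procedure used in the study, is to compute the relative error of each model at the largest observed values. The function name `peak_errors`, the choice of `top_k`, and the sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def peak_errors(
    observed: Sequence[float],
    predicted: Sequence[float],
    top_k: int = 3,
) -> List[Tuple[int, float, float, float]]:
    """Return (index, observed, predicted, percent error) at the top_k observed peaks.

    Percent error is 100 * (predicted - observed) / observed, so positive
    values mean overestimation. Observed peaks are assumed to be non-zero.
    """
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    # Indices of the largest observed values, largest first.
    peak_idx = sorted(range(len(observed)), key=lambda i: observed[i], reverse=True)[:top_k]
    return [
        (i, observed[i], predicted[i], 100.0 * (predicted[i] - observed[i]) / observed[i])
        for i in peak_idx
    ]


# Made-up values for illustration only (not data from the study).
obs_ssl = [10.0, 50.0, 20.0, 80.0, 30.0]
pred_ssl = [12.0, 45.0, 20.0, 88.0, 30.0]
peaks = peak_errors(obs_ssl, pred_ssl, top_k=2)
```

Here the largest observed value is overestimated by 10% and the second largest is underestimated by 10%; applying the same function to each model's predictions would give a table of peak errors to set beside the time-series plots.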

Radar charts of the performance indicators were also used to evaluate the effectiveness of the applied models (Fig. 13). Radar charts have been used in various hydrological studies to provide a better diagnostic examination of the efficiency of all models75,76,77,78. Each of the four statistics is displayed simultaneously on the graph. The charts show that the standalone SVM yields the worst results, while SVM-SSA has the lowest RMSE and MAE and the highest ENS and R2. The primary benefits of a radar chart are its ability to display the highest and lowest values of the variables in the dataset and to enable multivariable quantitative comparison.

Figure 13

Radar chart showing RMSE, MAE, R2 and ENS in training and testing phases of (a) Tilga, (b) Jenapur, (c) Jaraikela (d) Gomlai.

The coefficient of determination (R2) values of the applied models are represented in Fig. 10. R2 values lie between 0 and 1, and values close to 1 indicate better agreement between observed and predicted values. The figures show that the hybrid SVM models had higher R2 values than the conventional SVM model, and among the hybrid models, SVM-SSA provided the value closest to 1 (R2 = 0.9261). It can be concluded from the R2 values that the SVM-SSA model outperformed the other models. Based on these values, the applied models rank as SVM-SSA, SVM-BOA, SVM-GOA, SVM-BA, and SVM.

Discussion

Modelling river sedimentation is one of the most complicated hydrological modelling problems. Transport of suspended sediment is a dynamic non-linear system that introduces significant uncertainty into river hydrological modelling, including changes in inflow and sediment load. Robust methods must therefore be employed for modelling SSL in rivers. Based on the assessment provided in the previous sections, the developed SVM-SSA model effectively modeled SSL in this study by using an optimization algorithm to find optimum parameter values for the conventional SVM. The hybrid SVM-BOA, SVM-GOA, and SVM-BA models fail to estimate extreme SSL values accurately, whereas the SVM-SSA model predicts the maximum and minimum values with smaller errors. The forecasting results of SVM-BA, SVM-GOA, SVM-BOA, and SVM-SSA differ slightly in the four statistical metrics, which indicates the importance of selecting an appropriate optimization algorithm for model parameter calibration. Standard SVM, which is based on the structural risk-minimization principle, can achieve good generalization performance. However, the performance of SVM generally depends on the optimization algorithm used to calibrate its parameters. Even though BA, GOA, and BOA have been used successfully to solve optimization problems, all of them are prone to premature convergence. As a newly proposed optimization algorithm, SSA has strong global search ability and can efficiently avoid local optima. Hence, compared to BA, GOA, and BOA, SSA affords better optimization potential.
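The calibration loop that each hybrid model relies on can be sketched generically. The sketch below deliberately uses plain random search in log space as a stand-in; it is not SSA, BA, GOA, or BOA. The parameter names `C`, `gamma`, and `epsilon`, their bounds, and the placeholder objective `validation_rmse` are assumptions for illustration; in practice the objective would train an SVM with the candidate parameters and return its validation error.

```python
import math
import random
from typing import Callable, Dict, Tuple

Params = Dict[str, float]


def random_search(
    objective: Callable[[Params], float],
    bounds: Dict[str, Tuple[float, float]],
    n_iter: int = 200,
    seed: int = 0,
) -> Tuple[Params, float]:
    """Minimise objective by sampling each parameter log-uniformly within its bounds."""
    rng = random.Random(seed)
    best_params: Params = {}
    best_score = math.inf
    for _ in range(n_iter):
        candidate = {
            name: 10 ** rng.uniform(math.log10(lo), math.log10(hi))
            for name, (lo, hi) in bounds.items()
        }
        score = objective(candidate)
        if score < best_score:  # keep the best candidate seen so far
            best_params, best_score = candidate, score
    return best_params, best_score


def validation_rmse(params: Params) -> float:
    """Hypothetical placeholder objective: a smooth bowl in log space.

    Stands in for "train an SVM with these parameters and return the
    validation RMSE"; its minimum location is arbitrary.
    """
    target = {"C": 10.0, "gamma": 0.1, "epsilon": 0.01}
    return math.sqrt(
        sum((math.log10(params[k]) - math.log10(target[k])) ** 2 for k in target)
    )


# Assumed search ranges, for illustration only.
bounds = {"C": (1e-2, 1e3), "gamma": (1e-4, 1e1), "epsilon": (1e-4, 1e0)}
best_params, best_score = random_search(validation_rmse, bounds, n_iter=500, seed=42)
```

A metaheuristic such as SSA replaces the independent sampling step with a population whose members move according to the best positions found so far, which is the part that determines how well premature convergence is avoided.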

In addition, SVM-SSA utilizes a high race optimum procedure that can learn the stochastic phenomena of SSL. It must be emphasized that field engineers prefer less complicated tools for practical use. Because the SVM-SSA model requires few input variables in its architecture, it can be regarded as an economical model for SSL prediction. The importance of this study lies in the use of the sparrow search algorithm and its application to sediment load prediction. SSA has robust optimization capability, fast convergence speed, and broader applicability than conventional heuristic search techniques. These advantages make SSA attractive for major problems like sediment load estimation, which is essential for monitoring and damage mitigation. Moreover, high loads of suspended sediment in streams are known to have unfavorable impacts on river water quality, potable water sources, reservoir or dam operations, and irrigation activities.

Even though this research has made several contributions, certain limitations remain. One limitation is that it considers a single case study (the Brahmani River basin). Further investigation will include testing the applied approach on other streams. The authors also plan to evaluate input variables derived from weather station data through numerical rainfall–runoff modeling as a substitute for input variables measured in situ. This would make the applied prediction models easier to adapt (specifically where onsite data are limited) and would probably further enhance the performance of SSL estimation. Another limitation is that, because the database is incomplete, certain possible influencing factors might not have been identified. To further improve the stability and accuracy of the predictive models, integrating more robust optimization algorithms is also worth considering. In future work, the model could also be applied to prediction problems in other fields of study.

Conclusions

Prediction of river SSL is significant in planning, operating, and preserving water structures. Sediment transport in a river exhibits random behaviour, which complicates the estimation of SL. This study predicted sediment loads with hybrid SVM-SSA, SVM-BOA, SVM-GOA, and SVM-BA approaches and with the conventional SVM approach. Monthly discharge and SSL measured at Tilga, Jenapur, Jaraikela, and Gomlai stations were used to set up the prediction models. The prediction accuracy of the applied models was assessed using statistical performance measures (RMSE, MAE, R2, and ENS) for different arrangements of input parameters. Graphical comparisons in the form of scatter plots, time-series plots, violin plots, and histogram plots were also used to identify the best prediction model. Results showed that SSA is the leading algorithm with fast convergence and higher accuracy. The results indicated that the best SVM-based model has the lowest MAE and RMSE values and the highest ENS values. This was achieved with 5 inputs after hybridization with the SSA algorithm. The SVM-SSA hybrid model can precisely capture extreme SSL values, signifying its robustness for hydrological and water resource problems. The SVM-SSA model generated the best SSL predictions, as confirmed by RMSE values of 15.6992, 15.9143, 15.5287, and 16.01885 on the testing data set at Tilga, Jenapur, Jaraikela, and Gomlai stations of the Brahmani River. In contrast, the SVM model performed worst, as shown by RMSE values of 36.457 (Tilga), 36.6975 (Jenapur), 36.0338 (Jaraikela), and 37.1004 (Gomlai) during the testing phase. The present investigation could be improved by modelling the SSL process with additional variables (e.g., precipitation intensity, temperature, runoff volume). The current study relies primarily on black-box models within the hybrid and ensemble SSL modeling framework, which is a notable limitation. Therefore, to increase the robustness of SSL modeling in future research, process-based models and optimization algorithms should be included within the hybrid and ensemble framework. This integration of machine learning and process-based modeling approaches can potentially provide more interpretable and reliable modeling. Moreover, the authors recommend that the proposed SVM-SSA model be applied in other catchments to reconfirm its efficiency. It is essential to employ new and improved decomposition algorithms to improve the quality of subsequences. Furthermore, more machine learning techniques should be tested to enhance single-model forecast accuracy. As a future direction, the SVM-SSA model can be tested at higher time resolutions, such as hourly, daily, and weekly, and combined with physically based hydrological models.