Introduction

Blasting has long been regarded as the most effective technique for rock excavation in open pit and underground mines (Bhandari 1997; Armaghani et al., 2016; Wang et al., 2018a, b; Huo et al., 2020; Du et al., 2022). Nevertheless, only a small fraction (20–30%) of the explosive energy is used to break rock with current blasting techniques, and most (70–80%) of the explosive energy is lost with varying degrees of adverse consequences (Fig. 1) such as flyrock, backbreak and air blast (Agrawal and Mishra 2018; Uyar and Aksoy 2019; Fang et al., 2020; Fattahi and Hasanipanah 2021; Ramesh et al., 2021; Ye et al., 2021; Zhou et al., 2021a, b, c; Dai et al., 2022). Among these unwanted consequences, backbreak (BB) is a persistent concern of blasting engineers and scholars (Khandelwal and Monjezi 2013; Shirani et al., 2016). BB is defined as rock damaged beyond the limits of the last row of blastholes (Konya and Walter 1991; Jimeno et al., 1995). Various studies have investigated the parameters associated with BB, which comprise controllable and uncontrollable blasting parameters (Konya and Walter 1991; Bhandari 1997; Konya 2003; Monjezi and Dehghani 2008; Monjezi et al., 2010a, b). Controllable parameters include blast design parameters and explosive properties such as burden (B), spacing (S), stemming (ST), subdrilling (SU), blasthole length (BL), blasthole diameter (BD), stiffness ratio (SR), explosive type, explosive density, explosive strength, powder factor (PF) and coupling ratio (Sari et al., 2014; Ghasemi 2017). Konya (2003) found that BB is positively correlated with ST and B. Monjezi and Dehghani (2008) found that ST/B, PF, charge per delay and other parameters have the greatest influence on BB. Monjezi et al. (2010a) reported the different influences of ST, B, S and depth of the hole (DH) on BB. Sari et al. (2014) showed that reducing explosive strength and PF can effectively reduce BB. Several researchers reported the effects of different explosive materials and coupling ratios on BB (Wilson and Moxon 1988; Firouzadj et al., 2006; Iverson et al., 2009; Enayatollahi and Aghajani Bazzazi 2010). Uncontrollable blasting parameters refer to the physical and mechanical properties of rock masses, such as rock density, rock porosity, rock strength, discontinuity orientation, and discontinuity strength (Ghasemi 2017). Bhandari and Badal (1990) considered the relationship between the orientation of discontinuities and BB. Bhandari (1997) showed the effects of rock mass characteristics on BB. Jia et al. (1998) concluded from numerical simulation results that dipping joints are one of the main causes of BB.

Figure 1. Adverse consequences of blasting in an open pit

Because it is difficult to evaluate and predict BB quickly and correctly from its many influencing parameters, several scholars proposed empirical or regression models to predict BB by considering different input variables from the controllable or uncontrollable blasting parameters (Lundborg 1974; Roth 1979; Monjezi et al., 2010a, b; Esmaeili et al., 2014). Nevertheless, empirical formulas consider only a subset of the relevant parameters and cannot easily be updated with new data (Saghatforoush et al., 2016; Kumar et al., 2021). In recent years, artificial intelligence (AI) technologies have been widely used in civil and mining engineering to solve forecasting problems (Zhou et al., 2012, 2016, 2019, 2022a, b; Khandelwal and Singh, 2011; Khandelwal et al., 2017, 2018; Nguyen et al., 2020; Armaghani et al., 2020, 2021; Wang et al., 2021; Li et al., 2021a, b). Several researchers have adopted different AI technologies to predict BB, including, among others, artificial neural network (ANN) (Monjezi et al., 2013, 2014; Esmaeili et al., 2014), back propagation neural network (BPNN) (Sayadi et al., 2013), support vector machine (SVM) (Khandelwal and Monjezi 2013; Mohammadnejad et al., 2013; Yu et al., 2021), adaptive neuro-fuzzy inference system (ANFIS) (Esmaeili et al., 2014; Ghasemi et al., 2016), and random forest (RF) (Sharma et al., 2021; Zhou et al., 2021c; Dai et al., 2022). Nonetheless, most single AI algorithms, particularly ANN, SVM, and ANFIS, are prone to falling into local minima and suffer from low learning rates (Wang et al., 2004; Moayedi and Jahed Armaghani 2018; Ghaleini et al., 2019). The extreme learning machine (ELM) proposed by Huang et al. (2006) has been shown to be superior to ANN and SVM in solving prediction problems (Shariati et al., 2020). Meanwhile, swarm intelligence optimization (SIO) algorithms, which are based on the biological behavior of natural populations, have been widely used to optimize single AI algorithms and improve model performance for BB prediction (Ebrahimi et al., 2016; Saghatforoush et al., 2016; Ghasemi 2017; Hasanipanah et al., 2017; Eskandar et al., 2018; Zhou et al., 2021c; Bhatawdekar et al., 2021; Dai et al., 2022).

Therefore, this study aimed to develop novel hybrid ELM models using six SIO algorithms to predict BB in an open pit, i.e., ELM-based particle swarm optimization (ELM–PSO), ELM-based fruit fly optimization (ELM–FOA), ELM-based whale optimization algorithm (ELM–WOA), ELM-based lion swarm optimization (ELM–LSO), ELM-based seagull optimization algorithm (ELM–SOA) and ELM-based sparrow search algorithm (ELM–SSA). The rest of this study is organized as follows. The section "Methodology" introduces the ELM model and the six SIO algorithms. The section "Dataset and Preparation" describes the data sources and provides a detailed data analysis. The section "Performance Indicators" introduces six indicators used to evaluate the performance of the models. The section "Results and Discussion" describes the development of all models and presents their results for BB prediction. The section "Conclusions and Summary" gives the main concluding remarks of this study and suggestions for future work.

Methodology

Extreme Learning Machine

Huang et al. (2006) proposed a special neural network model, called the ELM, as one of the single-layer feed-forward neural network (SLFN) architectures. This model has one hidden layer, and it can easily handle optimization problems by simply adjusting the number of neurons in the hidden layer (Zhang et al., 2021a, b). Assuming a training set D that contains K-dimensional input vectors \(x_{i} = \left[ {x_{i1} ,x_{i2} , \ldots ,x_{iK} } \right]^{T}\) and L-dimensional output vectors \(t_{i} = \left[ {t_{i1} ,t_{i2} , \ldots ,t_{iL} } \right]^{T}\), an effective ELM model is built to simulate the internal connection between the input and output vectors according to the following three steps.

  • Step 1: Building an SLFN. The purpose of this step is to establish preliminarily the input weights Wq and bias Bq between the input layer and the hidden layer, and the output weights \(\beta_{i}\) between the hidden layer and the output layer. Therefore, an SLFN with M neurons in a hidden layer can be written as:

$$\sum\limits_{q = 1}^{M} {\beta_{q} \,g\left( {W_{q} \cdot x_{i} + B_{q} } \right)} = t_{i} ,\quad i = 1,2, \ldots ,D$$
(1)

where g(x) represents the activation function, \(W_{q}\) belongs to the set \(W = \left[ {W_{q1} ,W_{q2} , \ldots ,W_{qn} } \right]^{T}\), and \(B_{q}\) belongs to \(B = \left[ {B_{q1} ,B_{q2} , \ldots ,B_{qn} } \right]^{T}\).

  • Step 2: Selecting weights and biases. These have an important effect on the output for a given number of neurons in the hidden layer. To make the output error vanish, i.e., \(\sum\nolimits_{i = 1}^{D} {\left\| {t_{i} - u_{i} } \right\|} = 0\), the SLFN in step 1 can be transformed to:

    $$\sum\limits_{q = 1}^{M} {\beta_{q} \,g\left( {W_{q} \cdot x_{i} + B_{q} } \right)} = u_{i} ,\quad i = 1,2, \ldots ,D$$
    (2)

    where \(u_{i} = \left[ {u_{i1} ,u_{i2} , \ldots ,u_{iL} } \right]^{T}\) represents the target vector. Then, the output of hidden layer neurons H and the weights β can be expressed as:

    $$H(W_{1} , \ldots ,W_{M} ,B_{1} , \ldots ,B_{M} ,x_{1} , \ldots ,x_{D} ) = \left[ {\begin{array}{*{20}c} {g(W_{1} \cdot x_{1} + B_{1} )} & \cdots & {g(W_{M} \cdot x_{1} + B_{M} )} \\ \vdots & \ddots & \vdots \\ {g(W_{1} \cdot x_{D} + B_{1} )} & \cdots & {g(W_{M} \cdot x_{D} + B_{M} )} \\ \end{array} } \right]_{D \times M} ,\quad \beta = \left[ {\begin{array}{*{20}c} {\beta_{1}^{T} } \\ \vdots \\ {\beta_{M}^{T} } \\ \end{array} } \right]_{M \times L}$$
    (3)
  • Step 3: Estimating the weights between the hidden and output layers. The optimal output weights can be solved using the generalized inverse of the hidden layer output matrix (Shariati et al., 2019), so that the target vector Tv is as close as possible to the real output vector. Therefore, the target vector Tv and the corresponding output weight vector \(\hat{\beta }\) can be expressed as:

    $$T_{v} = H \cdot \beta = \left[ {\begin{array}{*{20}c} {u_{1}^{T} } \\ \vdots \\ {u_{D}^{T} } \\ \end{array} } \right]_{D \times L} \, \hat{\beta } = H^{\dag } T_{v}$$
    (4)

    where \(H^{\dag }\) is the Moore–Penrose generalized inverse of H.
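To make these three steps concrete, the following is a minimal NumPy sketch of ELM training and prediction. The sigmoid activation and the uniform initialization range [-1, 1] are assumptions of this sketch, since the text does not specify them; the pseudo-inverse step corresponds to Eq. 4.

```python
import numpy as np

def train_elm(X, T, n_hidden, rng=None):
    """Train a basic ELM: random input weights/biases, analytic output weights.

    X: (n_samples, n_features) inputs; T: (n_samples, n_outputs) targets.
    Returns (W, b, beta) so that predictions are g(X @ W + b) @ beta.
    """
    rng = np.random.default_rng(rng)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights W_q (assumed range)
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # biases B_q (assumed range)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # hidden layer output, sigmoid g (assumed)
    beta = np.linalg.pinv(H) @ T                             # Moore-Penrose solution of Eq. 4
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

With the 60-neuron structure adopted later in this study, training would reduce to `train_elm(X_train, y_train.reshape(-1, 1), n_hidden=60)`.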

Swarm Intelligence Optimization

Particle Swarm Optimization

Kennedy and Eberhart (1995) proposed the particle swarm optimization (PSO) algorithm, inspired by the predation behavior of birds, to solve optimization problems. The core of PSO comprises massless particles with velocity and position: velocity indicates how fast a bird searches for food, and position determines its direction. Each bird (particle) searches independently but shares the positions of food sources with the swarm. Throughout the search space, the individual extremum is the best food position found by each bird, and birds move toward the best overall food location by comparing the shared positions. The velocity and position of each bird in the (n + 1)th iteration can be expressed by two formulas, thus:

$$V_{i}^{n + 1} = uV_{i}^{n} + c_{1} r_{1} (P_{{{\text{individual}},i}}^{n} - P_{i}^{n} ) + c_{2} r_{2} (P_{{{\text{group}},i}}^{n} - P_{i}^{n} ) \, i = 1,2, \ldots ,N$$
(5)
$$P_{i}^{n + 1} = V_{i}^{n + 1} + P_{i}^{n}$$
(6)

where N is the number of particles, u is the inertia weight (a non-negative factor), c1 and c2 are individual and social learning factors, respectively, with c1 = c2 = 2 in this study, r1 and r2 are random numbers between 0 and 1, and \(P_{{{\text{individual}},i}}^{n}\) and \(P_{{{\text{group}},i}}^{n}\) are the optimal positions of the individual and the group, respectively.
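As an illustration, the following is a minimal PSO loop implementing Eqs. 5 and 6 with c1 = c2 = 2 as in this study; the fixed inertia weight u = 0.7, the search bounds and the zero initial velocities are assumptions of this sketch.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=60, iters=400, u=0.7, c1=2.0, c2=2.0,
                 bounds=(-1.0, 1.0), rng=None):
    """Minimize cost function f over a dim-dimensional space with PSO (Eqs. 5-6)."""
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    P = rng.uniform(lo, hi, (n_particles, dim))          # particle positions
    V = np.zeros((n_particles, dim))                     # particle velocities
    P_ind = P.copy()                                     # individual best positions
    f_ind = np.array([f(p) for p in P])
    P_grp = P_ind[f_ind.argmin()].copy()                 # group best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        V = u * V + c1 * r1 * (P_ind - P) + c2 * r2 * (P_grp - P)   # Eq. 5
        P = P + V                                                   # Eq. 6
        f_now = np.array([f(p) for p in P])
        better = f_now < f_ind
        P_ind[better], f_ind[better] = P[better], f_now[better]
        P_grp = P_ind[f_ind.argmin()].copy()
    return P_grp, f_ind.min()
```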

Fruit Fly Optimization Algorithm

Pan (2012) proposed a new algorithm, named the fruit fly optimization algorithm (FOA), based on the foraging behavior of fruit flies to solve global optimization problems. The fruit fly is considered one of the best hunters in nature because of its excellent senses of smell and vision. The fruit fly's appearance and foraging process are depicted in Figure 2. Within a certain search space, the fruit fly first activates its olfactory function to search for food; after approaching, it uses its keen vision to locate the food precisely and finally determines the position. Therefore, there are two main steps in the FOA.

  • Step 1: Osphresis search. Assume the position of one fruit fly is (xi, yi); it randomly searches for food in a certain space based on olfactory feedback. However, the position of the food is not known in advance. The smell concentration (Smelli) is assumed to be inversely proportional to the distance (Disti) of the ith fruit fly from the origin (0, 0). Then, the osphresis foraging can be expressed as:

$$\left\{ \begin{gathered} x_{i} = x\_{\text{axis + random value}} \hfill \\ y_{i} = y\_{\text{axis + random}}\,{\text{value}} \hfill \\ \end{gathered} \right.$$
(7)
$${\text{Dist}}_{i} = (x_{i}^{2} + y_{i}^{2} )^{{0.5}}$$
(8)
$${\text{Smell}}_{i} = 1/{\text{Dist}}_{i}$$
(9)
Figure 2. Foraging process of fruit flies

  • Step 2: Vision search. This step determines the fly with the best smell concentration (Smellbest), and the swarm then moves toward that position (x_axis, y_axis) (a code sketch of both steps follows this list), which can be expressed as:

    $${\text{Smellbest}} = \max {\text{Smell}}_{i} \, \left\{ \begin{gathered} x\_{\text{axis}} = x({\text{Smellbest}}) \hfill \\ y\_{\text{axis}} = y({\text{Smellbest}}) \hfill \\ \end{gathered} \right.$$
    (10)
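The two steps can be sketched as follows, with the smell concentration of Eq. 9 fed to a user-supplied cost function as in the canonical FOA; the symmetric unit step size of the random search direction is an assumption of this sketch.

```python
import numpy as np

def foa_minimize(f, n_flies=50, iters=500, step=1.0, rng=None):
    """Minimal FOA loop over Eqs. 7-10; f maps a candidate value to a cost."""
    rng = np.random.default_rng(rng)
    x_axis, y_axis = rng.random(2)                 # initial swarm location
    best_cost = np.inf
    for _ in range(iters):
        # Osphresis search: random moves around the swarm location (Eq. 7)
        x = x_axis + step * (2 * rng.random(n_flies) - 1)
        y = y_axis + step * (2 * rng.random(n_flies) - 1)
        dist = np.sqrt(x**2 + y**2)                # distance from origin (Eq. 8)
        smell = 1.0 / dist                         # smell concentration (Eq. 9)
        cost = np.array([f(s) for s in smell])
        i = cost.argmin()
        # Vision search: move the swarm to the best-smelling position (Eq. 10)
        if cost[i] < best_cost:
            best_cost, x_axis, y_axis = cost[i], x[i], y[i]
    return best_cost, (x_axis, y_axis)
```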

Whale Optimization Algorithm

Mirjalili and Lewis (2016) developed the whale optimization algorithm (WOA) by mimicking the predatory behavior of humpback whales in the ocean. Whales are among the more intelligent creatures in the ocean, having more than twice as many spindle cells as humans, and humpback whales have even developed their own dialect (Hof et al., 2007). The most interesting aspect of humpback whales is their foraging behavior, called bubble-net hunting, as shown in Figure 3a. In this strategy, humpback whales dive 12–15 m below a shoal of fish and then attack by creating bubbles along a circular or '9'-shaped path (Mirjalili and Lewis 2016; Fan et al., 2020; Zhou et al., 2022c). Before hunting, humpback whales are very good at locating and encircling prey, and this behavior can be expressed mathematically as:

$$E = \left| {C_{1} \cdot X_{w}^{*} (n) - X(n)} \right| \, C_{1} = 2f$$
(11)
$$X(n + 1) = X_{w}^{*} (n) - C_{2} \cdot E \, C_{2} = 2b \cdot f - b$$
(12)

where C1 and C2 are coefficient vectors, \(X_{w}^{*}\) and X represent the best and current positions of the whales, respectively, at the nth iteration, E is the absolute value of the distance between whale and prey, b decreases linearly from 2 to 0 over the course of the iterations, and f is a random number in [0, 1].

Figure 3. The encircling and bubble-net hunting behavior of humpback whales

After encircling their prey, humpback whales shrink the encircling circle and constantly reposition themselves to complete the bubble-net feeding behavior. As shown in Figure 3b, the shrinking encircling behavior is achieved by decreasing b in Eq. 12. Meanwhile, the positions of the humpback whales (X, Y) also change along a spiral, as shown in Figure 3c. The new position of a whale is expressed as:

$$X(n + 1) = \left\{ {\begin{array}{*{20}l} {X_{w}^{*} (n) - C_{2} \cdot E,} & {{\text{if}}\;x < 0.5} \\ {E \cdot e^{sl} \cdot \cos \left( {2\pi l} \right) + X_{w}^{*} (n),} & {{\text{if}}\;x \ge 0.5} \\ \end{array} } \right.$$
(13)

where x and l are changed randomly in [0, 1], and s is a constant to define the spiral shape.
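A sketch of one WOA position update combining Eqs. 11–13 is given below; drawing an independent random vector f per whale and the spiral constant s = 1 are assumptions of this sketch.

```python
import numpy as np

def woa_step(X, X_star, n, n_max, s=1.0, rng=None):
    """One WOA position update per Eqs. 11-13 for a swarm X (rows = whales)."""
    rng = np.random.default_rng(rng)
    b = 2.0 * (1.0 - n / n_max)                    # decreases linearly from 2 to 0
    X_new = np.empty_like(X)
    for i, Xi in enumerate(X):
        f = rng.random(Xi.shape)                   # random vector in [0, 1]
        C1 = 2.0 * f
        C2 = 2.0 * b * f - b
        E = np.abs(C1 * X_star - Xi)               # distance to prey (Eq. 11)
        if rng.random() < 0.5:                     # shrinking encircling (Eq. 12)
            X_new[i] = X_star - C2 * E
        else:                                      # spiral update (Eq. 13)
            l = rng.random()
            X_new[i] = E * np.exp(s * l) * np.cos(2 * np.pi * l) + X_star
    return X_new
```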

Lion Swarm Optimization

Liu et al. (2018) proposed the lion swarm optimization (LSO) based on the hunting behavior of a lion swarm in nature. There is a strict social hierarchy within a lion swarm. The first echelon is the king lion, called the leader, which is responsible for assigning tasks to the other lions, distributing food, and accepting status challenges. The second echelon comprises the lionesses, called predators, which are responsible for hunting, including searching for, tracking, trapping, and attacking prey. In addition, a predator is a direct communication link between the leader and the other lions and is responsible for relaying instructions and feedback. The lion cubs, called followers, are at the bottom of the swarm, and their main job is to learn how to hunt from a predator. Once they reach adulthood, followers are driven out of the group and may challenge the leader. Assume a lion swarm has N lions, where the proportion of leaders is less than 0.5; the positions of the lions in the different echelons are expressed as follows (a code sketch follows this list).

  (a) The position of the leader is near the food:

$$p_{i}^{k + 1} = g^{k} \left( {1 + \gamma \left\| {l_{i}^{k} - g^{k} } \right\|} \right)$$
(14)
  (b) The predator often needs the help of another lioness to move:

    $$p_{i}^{k + 1} = \frac{{l_{i}^{k} + l_{c}^{k} }}{2}(1 + \alpha_{f} \gamma )$$
    (15)
  (c) The positions of the cubs are determined by the leader and the predator:

    $$p_{i}^{k + 1} = \left\{ {\begin{array}{*{20}c} {\frac{{l_{i}^{k} + g^{k} }}{2}(1 + \alpha_{c} \gamma ),} & {q \le 1/3} \\ {\frac{{l_{m}^{k} + l_{i}^{k} }}{2}(1 + \alpha_{c} \gamma ),} & {1/3 < q < 2/3} \\ {\frac{{l_{i}^{k} + \overline{g}^{k} }}{2}(1 + \alpha_{c} \gamma ),} & {2/3 \le q \le 1} \\ \end{array} } \right.$$
    (16)

    where \(p_{i}^{k + 1}\) represents the position of the ith lion at the (k + 1)th iteration, \(g^{k}\) and \(l_{i}^{k}\) represent the swarm's optimal position and the historical best position of the ith lion at the kth iteration, respectively, \(\gamma\) and q are random numbers in [0, 1], \(l_{c}^{k}\) and \(l_{m}^{k}\) represent the historical best positions of a cooperating predator and a follower at the kth iteration, respectively, and \(\overline{g}^{k}\) represents the current position of the followers. The disturbance factors \(\alpha_{f}\) and \(\alpha_{c}\) define the movement ranges of the predators and followers, respectively.
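The three updates of Eqs. 14–16 can be sketched for a single lion as follows; the role assignment, the choice of the cooperating lioness l_c and cub l_m, and the default disturbance factors are assumptions left to the caller.

```python
import numpy as np

def lso_update(l_i, g, l_c, l_m, g_bar, role, alpha_f=0.1, alpha_c=0.1, rng=None):
    """Position update for one lion per Eqs. 14-16.

    l_i: this lion's historical best position; g: swarm-best position;
    l_c: historical best of a cooperating lioness; l_m: historical best of
    another cub; g_bar: current follower position; role: 'leader', 'predator'
    or 'cub'. The default disturbance factors are illustrative only.
    """
    rng = np.random.default_rng(rng)
    gamma = rng.random()
    if role == "leader":                                     # Eq. 14
        return g * (1.0 + gamma * np.linalg.norm(l_i - g))
    if role == "predator":                                   # Eq. 15
        return (l_i + l_c) / 2.0 * (1.0 + alpha_f * gamma)
    q = rng.random()                                         # cub: Eq. 16
    if q <= 1.0 / 3.0:
        return (l_i + g) / 2.0 * (1.0 + alpha_c * gamma)
    if q < 2.0 / 3.0:
        return (l_m + l_i) / 2.0 * (1.0 + alpha_c * gamma)
    return (l_i + g_bar) / 2.0 * (1.0 + alpha_c * gamma)
```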

Seagull Optimization Algorithm

Dhiman and Kumar (2019) proposed a new algorithm, called the seagull optimization algorithm (SOA), based on the migration and attacking behaviors of seagulls (Fig. 4). Seagulls rely on unique intelligence to catch prey, such as imitating the sound of rain to lure fish to the surface (Dhiman and Kumar 2019). As seasonal migrants, seagulls need to obtain food along the way to replenish energy and reach their destination (Avise 2017). However, migrating seagull populations are very large, so it is important that individuals avoid colliding with each other. Introducing a movement factor U, this collision-avoidance behavior can be expressed as:

$$\vec{x}_{s} = U \times \vec{N}_{s} \left( t \right)$$
(17)
$$U = C - t\left( {\frac{C}{{m\_{\text{iter}}}}} \right)$$
(18)

where \(\vec{x}_{s}\) and \(\vec{N}_{s}\) represent the updated and current positions of a seagull, respectively, t represents the iteration number, m_iter indicates the maximum number of iterations, and C represents a control factor of U that decreases linearly from 2 to 0. To obtain enough food, seagulls constantly adjust their positions to keep moving toward the best food source. This behavior can be described as:

$$\vec{D}_{s} = F \times \left( {\vec{N}_{bs} \left( t \right) - \vec{N}_{s} \left( t \right)} \right)$$
(19)
$$\vec{A}_{s} = \left| {\vec{x}_{s} + \vec{D}_{s} } \right|$$
(20)

where \(\vec{N}_{bs} \left( t \right)\) represents the current best position among the seagulls, \(\vec{D}_{s}\) indicates the direction from the current seagull toward the best seagull, \(\vec{A}_{s}\) represents the distance between the updated seagull and the best seagull, and F is a balance factor that can be estimated as:

$$F = 2U^{2} \cdot h$$
(21)

where h is a random number in [0, 1]. During an attack, seagulls maintain a spiral motion, changing their angle and speed with their wings and weight. This attack pattern can be written in the x, y, z planes, thus:

$$\hat{x} = r\cos K,\quad \hat{y} = r\sin K,\quad \hat{z} = rK$$
(22)
$$r = c_{1} \times e^{{Kc_{2} }}$$
(23)

where r indicates the radius of each turn of the spiral, K represents a random number in the range 0–\(2\pi\), c1 and c2 are constants that define the shape of the spiral, and e is the base of the natural logarithm. Therefore, the updated position of a seagull, moving toward the best seagull, is calculated as:

$$\vec{N}_{s} (t) = (\vec{A}_{s} \times \hat{x} \times \hat{y} \times \hat{z}) + \vec{N}_{bs} \left( t \right)$$
(24)
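The migration and attack behaviors of Eqs. 17–24 combine into a single vectorized position update, sketched below; the control factor C = 2 follows the text, whereas the spiral constants c1 = c2 = 1 are assumptions.

```python
import numpy as np

def soa_step(N_s, N_bs, t, m_iter, C=2.0, c1=1.0, c2=1.0, rng=None):
    """One SOA position update per Eqs. 17-24 for a swarm N_s (rows = seagulls)."""
    rng = np.random.default_rng(rng)
    U = C - t * (C / m_iter)                       # Eq. 18, decreases from C to 0
    x_s = U * N_s                                  # Eq. 17: collision avoidance
    h = rng.random(len(N_s))
    F = 2.0 * U**2 * h                             # Eq. 21: balance factor
    D_s = F[:, None] * (N_bs - N_s)                # Eq. 19: move toward the best
    A_s = np.abs(x_s + D_s)                        # Eq. 20: distance to the best
    K = rng.uniform(0.0, 2.0 * np.pi, len(N_s))    # random spiral angle
    r = c1 * np.exp(K * c2)                        # Eq. 23: spiral radius
    spiral = (r * np.cos(K)) * (r * np.sin(K)) * (r * K)   # Eq. 22 product
    return A_s * spiral[:, None] + N_bs            # Eq. 24: updated positions
```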
Figure 4. Migration and attacking behavior of seagulls

Sparrow Search Algorithm

Xue et al. (2020) developed a new SIO algorithm inspired by the foraging behavior of sparrows, called the sparrow search algorithm (SSA). Sparrows are common small social birds that do not migrate seasonally, and they have powerful memories that help them find food. In the SSA, sparrows are divided into two main types: producers, which are responsible for searching for high-energy food, and scroungers, whose food comes from the producers. Interestingly, the identities of producers and scroungers are flexibly interchangeable, but the ratio of producers to scroungers is fixed within the swarm (Barta et al., 2004; Xue et al., 2020). This strategy helps both producers and scroungers find higher-energy food (Liker and Barta 2002). The natural curiosity of sparrows helps producers and scroungers evade attackers: when one or more individuals spot attackers and sing, the entire swarm flies away (Pulliam 1973).

Assume that there are n sparrows searching in a J-dimensional space, and that a warning (safety) threshold \(W_{s}\) is set. In the SSA, the producers are not only responsible for finding food but also for feeding the scroungers. Therefore, the producers search a wider area for energy-dense food, and the position of a producer can be written as:

$$S_{i,j}^{k + 1} = \left\{ {\begin{array}{*{20}c} {S_{i,j}^{k} { + }Q \cdot L{, }R_{2} \ge W_{s} } \\ {S_{i,j}^{k} \cdot \exp \left( {\frac{ - i}{{\alpha \cdot {\text{iter}}_{\max } }}} \right), \, R_{2} < W_{s} } \\ \end{array} } \right.$$
(25)

where \(S_{i,j}^{k + 1}\) represents the position of the ith producer in the jth dimension at the (k + 1)th iteration, Q is a random number that follows a normal distribution, L is a \(1 \times d\) matrix in which each element is 1, where the number of columns d is the dimension of the search space J, \(\alpha\) and R2 are random numbers in (0, 1), and itermax indicates the maximum number of iterations. As shown in Eq. 25, \(R_{2} \ge W_{s}\) means that individuals have detected attackers, so the producers and the scroungers quickly fly to safe places; otherwise, the producers continue to search for food. The positions of the scroungers are tied to the producers: the scroungers compete with the producers for higher-energy food and update their positions according to:

$$S_{i,j}^{k + 1} = \left\{ {\begin{array}{*{20}c} {Q \cdot \exp (\frac{{S_{worst}^{k} - S_{i,j}^{k} }}{{i^{2} }}){, }i > \frac{n}{2}} \\ {S_{b}^{k + 1} + \left| {S_{i,j}^{k} - S_{b}^{k + 1} } \right| \cdot A^{ + } \cdot L, \, i \le \frac{n}{2}} \\ \end{array} } \right.$$
(26)

where \(S_{b}^{k + 1}\) and \(S_{worst}^{k}\) represent the current best position of the producers and the global worst position, respectively, A is a \(1 \times d\) matrix in which each element is randomly 1 or −1, with d again the dimension of the search space J, and \(A^{ + } = A^{T} (AA^{T} )^{ - 1}\). If i > n/2, the ith scrounger is most likely starving. As soon as one or more individuals spot attackers, the sparrows move to safety. The mathematical expression for this behavior is:

$$S_{i,j}^{k + 1} = \left\{ {\begin{array}{*{20}c} {S_{best}^{k} + \beta \cdot \left| {S_{i,j}^{k} - S_{best}^{k} } \right|, \, f_{i} > f_{b} } \\ {S_{i,j}^{k} + \kappa \cdot \left( {\frac{{\left| {S_{i,j}^{k} - S_{worst}^{k} } \right|}}{{(f_{i} - f_{w} ) + \varepsilon }}} \right), \, f_{i} = f_{b} } \\ \end{array} } \right.$$
(27)

where \(S_{best}^{k}\) represents the current global best position at iteration k, \(\beta\) is a step-size control factor drawn from a normal distribution with mean 0 and variance 1, and \(\kappa\) is a random number in [−1, 1]. Here, fi, fb and fw are the fitness values of the present sparrow, the current global best and the current global worst, respectively, and \(\varepsilon\) is a small constant that avoids division by zero when fi = fw. Put simply, fi > fb indicates that a sparrow at this position is highly vulnerable to attack, while fi = fb indicates that a sparrow in the center of the group is also aware of the attacker and approaches the others to reduce its risk.
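One full SSA iteration over Eqs. 25–27 can be sketched as follows; the producer fraction (20%) and the fraction of danger-aware sparrows (10%) are common defaults rather than values stated in the text.

```python
import numpy as np

def ssa_step(S, fit, iter_max, Ws=0.8, n_producers=None, rng=None):
    """One SSA iteration per Eqs. 25-27. S: (n, d) positions; fit: costs."""
    rng = np.random.default_rng(rng)
    n, d = S.shape
    n_p = n_producers if n_producers else max(1, n // 5)   # producer fraction (assumed)
    order = np.argsort(fit)                       # ascending cost: best sparrows first
    S_best, S_worst = S[order[0]].copy(), S[order[-1]].copy()
    f_b, f_w = fit[order[0]], fit[order[-1]]
    R2 = rng.random()                             # alarm value
    # Producers (Eq. 25): wide search if safe, flee if attackers are detected.
    for rank, i in enumerate(order[:n_p], start=1):
        if R2 < Ws:
            S[i] = S[i] * np.exp(-rank / (rng.random() * iter_max))
        else:
            S[i] = S[i] + rng.normal() * np.ones(d)        # Q . L
    S_b = S[order[0]].copy()                      # best producer position
    # Scroungers (Eq. 26): follow the best producer, or starve and fly off.
    for rank, i in enumerate(order[n_p:], start=n_p + 1):
        if rank > n / 2:
            S[i] = rng.normal() * np.exp((S_worst - S[i]) / rank**2)
        else:
            A = rng.choice([-1.0, 1.0], size=d)
            S[i] = S_b + (np.abs(S[i] - S_b) @ A) / d      # |S - S_b| . A+ . L
    # Danger-aware sparrows (Eq. 27, assumed 10% of the swarm) react to attackers.
    for i in rng.choice(n, size=max(1, n // 10), replace=False):
        if fit[i] > f_b:
            S[i] = S_best + rng.normal() * np.abs(S[i] - S_best)
        else:
            kappa = rng.uniform(-1.0, 1.0)
            S[i] = S[i] + kappa * np.abs(S[i] - S_worst) / (fit[i] - f_w + 1e-12)
    return S
```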

Dataset and Preparation

The Sungun mine is one of the large open-pit mines in Iran (Fig. 5). Investigations have shown that the maximum BB in this mine was 10 m. In total, 234 blasting operations recorded by Khandelwal and Monjezi (2013) in the Sungun mine were used as the dataset in this study. Before a blasting operation, controllable blasting parameters can be set and recorded by blasting engineers. Therefore, burden (B), hole length (HL), stemming (ST), spacing (S), powder factor (PF) and specific drilling (SD) were used as input parameters to predict BB. Figure 6 shows boxplots of the six input parameters, and the correlations between the different parameters and BB are shown in Figure 7. The dataset was divided randomly into two groups: the training set (70%) was used to construct the prediction models, and the testing set (30%) was used to evaluate their prediction performance.
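A reproducible version of this random 70/30 split might look as follows; the file name, column order and random seed are hypothetical.

```python
import numpy as np

# Hypothetical column order for the 234 Sungun records: B, HL, S, ST, PF, SD, BB.
data = np.loadtxt("sungun_blasts.csv", delimiter=",", skiprows=1)  # file name assumed
rng = np.random.default_rng(42)              # fixed seed for a reproducible split
idx = rng.permutation(len(data))
n_train = int(0.7 * len(data))               # 70% training / 30% testing
X_train, y_train = data[idx[:n_train], :6], data[idx[:n_train], 6]
X_test, y_test = data[idx[n_train:], :6], data[idx[n_train:], 6]
```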

Figure 5. Location of the Sungun mine in Iran, used as the case study for forecasting BB (modified from Khandelwal and Monjezi (2013))

Figure 6. Boxplots of input parameters

Figure 7. Correlations between input and output parameters

Performance Indicators

To accurately assess the prediction performance of the ELM and the six novel ELM–SIO models, the root mean square error (RMSE), Pearson correlation coefficient (R), determination coefficient (R2), mean absolute error (MAE), variance accounted for (VAF) and sum of squared errors (SSE) were used to evaluate these models in the training and testing phases. This was done not only to verify the optimization effect of the swarm intelligence algorithms but also to select the best model for application in practical engineering. The six performance indicators are defined as follows (Hasanipanah et al., 2016; Zhou et al., 2020a, b, 2021a, b; Jahed et al., 2021; Li et al., 2021a, b, c; Xie et al., 2021a, b):

$${\text{R}} = \frac{{N \cdot \sum\limits_{i = 1}^{N} {(y_{i} \cdot t_{i} )} - \sum\limits_{i = 1}^{N} {y_{i} } \cdot \sum\limits_{i = 1}^{N} {t_{i} } }}{{\sqrt {\left[ {N \cdot \sum\limits_{i = 1}^{N} {y_{i}^{2} } - \left( {\sum\limits_{i = 1}^{N} {y_{i} } } \right)^{2} } \right] \cdot \left[ {N \cdot \sum\limits_{i = 1}^{N} {t_{i}^{2} } - \left( {\sum\limits_{i = 1}^{N} {t_{i} } } \right)^{2} } \right]} }}$$
(28)
$${\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {y_{i} - t_{i} } \right)^{2} } }$$
(29)
$${\text{R}}^{2} = \frac{{\left[ {\sum\limits_{i = 1}^{N} {(y_{i} - \overline{y})(t_{i} - \overline{t})} } \right]^{2} }}{{\sum\limits_{i = 1}^{N} {(y_{i} - \overline{y})^{2} } \cdot \sum\limits_{i = 1}^{N} {(t_{i} - \overline{t})^{2} } }}$$
(30)
$${\text{MAE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {y_{i} - t_{i} } \right|}$$
(31)
$${\text{VAF}} = \left[ {1 - \frac{{{\text{var}} (y_{i} - t_{i} )}}{{{\text{var}} (y_{i} )}}} \right] \times 100\%$$
(32)
$${\text{SSE}} = \sum\limits_{i = 1}^{N} {\left( {y_{i} - t_{i} } \right)^{2} }$$
(33)

where N represents the number of samples, yi and \(\overline{y}\) indicate the observed and mean observed BB, respectively, and ti and \(\overline{t}\) indicate the predicted and mean predicted BB, respectively.
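The six indicators of Eqs. 28–33 translate directly into NumPy, as in the following minimal sketch.

```python
import numpy as np

def performance_indicators(y, t):
    """Six indicators of Eqs. 28-33; y: observed BB values, t: predicted values."""
    y, t = np.asarray(y, float), np.asarray(t, float)
    r = np.corrcoef(y, t)[0, 1]                          # Eq. 28: Pearson R
    rmse = np.sqrt(np.mean((y - t) ** 2))                # Eq. 29
    r2 = r ** 2                                          # Eq. 30: square of R
    mae = np.mean(np.abs(y - t))                         # Eq. 31
    vaf = (1.0 - np.var(y - t) / np.var(y)) * 100.0      # Eq. 32, in percent
    sse = np.sum((y - t) ** 2)                           # Eq. 33
    return {"R": r, "RMSE": rmse, "R2": r2, "MAE": mae, "VAF": vaf, "SSE": sse}
```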

Results and Discussion

In total, seven models were considered for BB prediction. Figure 8 depicts the entire prediction process, including input data, model development, and performance evaluation. In addition to stability analysis, model development and performance evaluation are emphasized below.

Figure 8. Flowchart of predicting BB

Models Development

ELM Model

Six swarm intelligence models were developed based on the ELM, so it was first necessary to determine the optimal ELM structure. ELM is a special neural network with a single hidden layer, and the number of neurons in the hidden layer determines the prediction performance. In this study, the RMSE was used to evaluate the performance of the ELM in the training and testing phases. The number of neurons started at 10 and was increased in increments of 10 up to 150. Table 1 shows the prediction performance and corresponding RMSE values of the model for the different numbers of neurons across the 15 experiments. As shown in this table, the lowest RMSE in both the training and testing phases occurred with 60 neurons in the hidden layer. Therefore, an ELM model with 60 neurons in the SLFN architecture was adopted (Fig. 9).

Table 1 Results of determining the number of neurons in the ELM model
Figure 9. Structure of ELM with 60 neurons in the hidden layer

ELM–SIO Models

Six ELM–SIO models were developed with the same architecture as the ELM (i.e., a hidden layer with 60 neurons in the SLFN). Thus, the optimization problem was to obtain the best input weight and bias values. SIO is a relatively new idea, proposed by imitating the behavior of insects and animals (Saghatforoush et al., 2016). Unlike other methods, SIO only needs an appropriately sized population to solve an optimization problem. For example, Shariati et al. (2020) considered 75 wolves in an ELM–GWO model to predict the compressive strength of concrete with partial replacements for cement. For a similar purpose, the six ELM–SIO models (ELM–PSO, ELM–FOA, ELM–WOA, ELM–LSO, ELM–SOA, ELM–SSA) were tuned here to obtain the optimal population size over a given number of iterations. The RMSE was again used to evaluate model performance, and the computation time of each iteration was recorded. The fitness values for different population sizes are shown in Figure 10. The fitness value of the ELM–PSO model no longer changed beyond 400 iterations regardless of population size; the corresponding numbers were 500 for FOA (Fig. 10b), 300 for WOA (Fig. 10c), 500 for LSO (Fig. 10d), 400 for SOA (Fig. 10e) and 500 for SSA (Fig. 10f). Figure 11 shows the lowest RMSE and the corresponding iteration time for each ELM–SIO model at different population sizes. The best population sizes were 60 birds for ELM–PSO (Fig. 11a), 50 fruit flies for FOA (Fig. 11b), 50 whales for WOA (Fig. 11c), 50 lions for LSO (Fig. 11d), 70 seagulls for SOA (Fig. 11e) and 40 sparrows for SSA (Fig. 11f).

Figure 10. Impact of population size on the fitness value in the development of the ELM–SIO models: (a) ELM–PSO; (b) ELM–FOA; (c) ELM–WOA; (d) ELM–LSO; (e) ELM–SOA; (f) ELM–SSA

Figure 11. Iteration time and best fitness value in the development of the ELM–SIO models: (a) ELM–PSO; (b) ELM–FOA; (c) ELM–WOA; (d) ELM–LSO; (e) ELM–SOA; (f) ELM–SSA

Comparison of Results

As discussed earlier, the numbers of neurons and the population sizes of the six ELM–SIO models were tuned, and these hybrid models were used to predict BB. The prediction performances of the six models were compared by plotting predicted against observed values in regression diagrams and by computing the six evaluation indices (RMSE, R, R2, VAF, MAE, SSE). The regression diagrams of the six ELM–SIO models in the training phase are shown in Figure 12, where the vertical and horizontal axes represent the predicted and observed values of BB, respectively. The diagonal (solid line) is the perfect-fit line: a point falls above it when the predicted value exceeds the observed value, below it in the opposite case, and on it only when the two are equal, so the model with more points on or near the line has the higher predictive performance. A boundary of 10% on either side of the perfect-fit line was also drawn to aid the comparison. As can be seen, the ELM–FOA model had the fewest points on the perfect-fit line, and some of its predicted values even exceeded the 10% range. The prediction performances of the other five models were clearly better than that of ELM–FOA, but the optimal model was difficult to distinguish by eye. Table 2 records the performance evaluation indicators of each model. As shown in this table, the ELM–FOA model had the worst performance indicators of all the SIO models (RMSE: 0.4055, R: 0.9879, R2: 0.9760, VAF: 97.5956%, MAE: 0.3023 and SSE: 26.9645), consistent with the regression diagrams. Among the other five models, the ELM–LSO model had the best performance indicators (RMSE: 0.1129, R: 0.9991, R2: 0.9981, VAF: 99.8135%, MAE: 0.0706 and SSE: 2.0917), even though the differences among them were not large.

Figure 12. Regression diagrams of the models in the training phase: (a) ELM–PSO; (b) ELM–FOA; (c) ELM–WOA; (d) ELM–LSO; (e) ELM–SOA; (f) ELM–SSA

Table 2 Performance evaluation indicators of the ELM–SIO models in the training phase

The training results do not represent the final predictive ability of a model (Shariati et al., 2020); a model that performs well in training may perform poorly in testing. To avoid misjudging the models, the boundary was widened from 10% to 30% on either side of the perfect-fit line. Figure 13 illustrates the regression diagrams of the six ELM–SIO models in the testing phase. Compared with the training phase, the predictive performance of each model decreased. As shown in this figure, every model except ELM–LSO had prediction points outside the 30% boundary. Table 3 records the performance evaluation indicators of each model in the testing phase. The RMSE (0.2441), MAE (0.1669) and SSE (4.1710) of the ELM–LSO model were higher than in the training phase but still the lowest among the six models, while its R (0.9949), R2 (0.9891) and VAF (98.9806%) were the highest. The ELM–SOA model performed second only to ELM–LSO, while the ELM–FOA model again had the worst predictive performance in the testing phase.

Figure 13. Regression diagrams of the models in the testing phase: (a) ELM–PSO; (b) ELM–FOA; (c) ELM–WOA; (d) ELM–LSO; (e) ELM–SOA; (f) ELM–SSA

Table 3 Performance evaluation indicators of the ELM–SIO models in the testing phase

Furthermore, ranking scores of the six ELM–SIO models were computed to give a more intuitive view of their performance indicators, as shown in Figure 14. In terms of ranking scores, the ELM–LSO model was the best in both the training (36) and testing (36) phases. To further verify model performance, Figure 15 depicts the predicted versus observed curves of each model in the testing phase. Overall, the predicted curve of each model was consistent with the observed curve, but local details matter when evaluating predictive performance. As shown in this figure, the ELM–PSO model had obvious prediction errors at sample numbers 6, 37 and 63; the ELM–FOA model at sample numbers 33–35, 42, 46–49 and 65–66; the ELM–WOA model at sample numbers 18, 37 and 63; the ELM–SOA model at sample numbers 9 and 49; and the ELM–SSA model at sample numbers 9, 15, 42 and 63. The ELM–LSO model, however, showed no obvious errors in these details. Therefore, the results indicate that ELM–LSO was the best model for BB prediction.

Figure 14. Intuitive comprehensive ranking of the six ELM–SIO models: (a) training phase; (b) testing phase

Figure 15. BB prediction in the testing phase by the different ELM–SIO models

Figure 16 shows the difference between the ELM model and the ELM–LSO model. In this figure, the abscissa is the sample number and the ordinate is the ratio of the predicted value to the observed value; a ratio of 1 indicates that the predicted value equals the observed value, and the closer the ratio is to 1, the better the prediction. Figure 16a and c shows the prediction results in the training phase: the ELM–LSO model significantly improved on the prediction performance of the ELM model, bringing the predicted values closer to the observed values. In the testing phase, the ELM–LSO model likewise had more points close to 1, and although this model is not perfect, its outliers did not deviate far.

Figure 16. Comparison of the ELM model with the ELM–LSO model: (a) and (b) training and testing phases of ELM; (c) and (d) training and testing phases of ELM–LSO

The results show not only that the performance of the ELM model can be improved by using SIO algorithms for BB prediction, but also that ELM–LSO was the best model. Table 4 compares the six hybrid models proposed in this study with previous studies. As can be seen in this table, the performance of the ELM–LSO model was the best among all models, in particular on the same dataset used by Khandelwal and Monjezi (2013), Zhou et al. (2021c) and Dai et al. (2022).

Table 4 Comparison among the current and previous works for BB prediction

Relative Importance of Influence Variables

Sensitivity analysis helps judge the influence of the input variables on BB prediction. Based on the comparison of the different SIO models, variable importances were extracted from the ELM–LSO model in this study. The variable importance test mechanism is reflected in the amount of impurity reduction after randomly changing the values of the variables (Qi et al., 2018). Here, a global sensitivity analysis method called PAWN, introduced by Pianosi and Wagener (2015, 2018), was used. Unlike variance-based local sensitivity analysis methods, PAWN computes the importance score (S) as:

$$S = \mathop {{\text{stat}}}\limits_{{\tilde{u} = 1, \ldots ,n}} \,\mathop {\max }\limits_{y} \left| {\hat{F}_{y} \left( y \right) - \hat{F}_{{y|x_{i} }} (y|x_{i} \in I_{{\tilde{u}}} )} \right|$$
(34)

where stat is a statistic (e.g., maximum, mean or median); the maximum was selected in this study. \(\hat{F}_{y} \left( y \right)\) and \(\hat{F}_{{y|x_{i} }}\) are the unconditional and conditional cumulative distribution functions (CDFs) of the output variable y, respectively, \(I_{{\tilde{u}}}\) denotes equally spaced subintervals of the input variable xi, and \(\tilde{u}\) is usually set to the default value of 10 (Xue et al., 2021).
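A minimal sketch of the PAWN score of Eq. 34 for a single input variable, using stat = max and empirical CDFs, is given below; the equal-width binning over the input range is an assumption of this sketch.

```python
import numpy as np

def pawn_score(x, y, n_intervals=10):
    """PAWN importance score per Eq. 34 with stat = max (as in this study).

    x: samples of one input variable; y: corresponding model outputs. The score
    is the largest Kolmogorov-Smirnov distance between the unconditional CDF of
    y and the CDFs of y conditioned on x lying in each of n_intervals bins.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    grid = np.sort(y)                                  # points where CDFs are compared
    F_y = np.searchsorted(grid, grid, side="right") / len(y)       # unconditional CDF
    edges = np.linspace(x.min(), x.max(), n_intervals + 1)         # equal-width bins (assumed)
    score = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        y_c = np.sort(y[(x >= lo) & (x <= hi)])        # conditional sample
        if y_c.size == 0:
            continue
        F_c = np.searchsorted(y_c, grid, side="right") / y_c.size  # conditional CDF
        score = max(score, float(np.max(np.abs(F_y - F_c))))       # KS distance
    return score
```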

Figure 17 shows the variable importance scores. B and ST were the most sensitive variables for BB, with the highest importance score of 0.8717, while the variable with the lowest score was SD (0.4817). However, no definitive studies show that SD is necessarily the least important variable, especially given the amount of data covered here. By contrast, Faradonbeh et al. (2016) reported that PF and B are the most influential variables on BB, and other studies suggested that HL and ST have the greater effects (Zhou et al., 2021a, b, c; Dai et al., 2022). Based on the data in this study, the order of importance of the influence variables on BB is B and ST → HL and S → PF → SD.

Figure 17. Sensitivity analysis of the different variables obtained by the ELM–LSO model

Conclusions and Summary

Predicting BB accurately is a worthwhile exercise. This study used 234 BB cases with six input variables (hole length, spacing, burden, stemming, powder factor and specific drilling) and a single output (BB). Exploiting the advantages of the ELM algorithm combined with SIO, six hybrid models were developed to predict BB. The ELM–LSO model performed best in predicting BB compared with the other five ELM–SIO models (ELM–PSO, ELM–FOA, ELM–WOA, ELM–SOA and ELM–SSA): its RMSE was 0.1129 in the training phase (R: 0.9991, R2: 0.9981, VAF: 99.8135%, MAE: 0.0706 and SSE: 2.0917) and 0.2441 in the testing phase (R: 0.9949, R2: 0.9891, VAF: 98.9806%, MAE: 0.1669 and SSE: 4.1710). Therefore, SIO can be a very effective means of improving the performance of the ELM model. It is worth noting that burden and stemming (0.8717) were the most influential input variables, followed by hole length and spacing (0.8205), powder factor (0.5575) and specific drilling (0.4871). This conclusion is supported only by the data in this study and should be used as a reference rather than taken as immutable. Moreover, additional meaningful parameters, such as rock quality designation (RQD), geological strength index (GSI) and weathering index (WI), could be taken into account in future BB prediction tasks, even though these parameters are difficult to obtain in actual blasting investigations. Meanwhile, advanced and effective optimization algorithms still need to be developed to improve BB prediction accuracy as much as possible, which remains the main difficulty of this kind of work.