1 Introduction

Metaheuristic optimization algorithms are commonly used to solve problems in the real world, such as feature selection (Hancer 2020; Hussien et al. 2017, 2020; Hussien and Amin 2022; Mostafa et al. 2022), text clustering (Abualigah et al. 2020, 2021; Assiri et al. 2020) and image segmentation (Elaziz et al. 2021; Hussien et al. 2022). In addition, due to their high effectiveness and minimal computational complexity, they are also employed to solve engineering problems (Onay and Aydemır 2022; Chhabra et al. 2023; Zheng et al. 2023). The optimization method determines the optimal or near-optimal solution for a given situation by minimizing or maximizing the problem’s objective function. Furthermore, metaheuristic algorithms explore the search space by considering random solutions for a global search and exploiting the feasible local solutions in the search space (Abu Khurma et al. 2020; Hussien et al. 2023; Hussien 2022). These two phases are the main steps in the metaheuristic algorithms, known as the exploration and exploitation phases (Khurmaa et al. 2021; Abu Khurma and Aljarah 2021; Fathi et al. 2021).

For metaheuristic algorithms (MAs), the balance between exploitation and exploration is essential. If an MA relies too heavily on exploration, its random operators visit many regions of the search space but refine none of them, so the search may still end in a local optimum. If it instead over-emphasizes exploitation, it searches intensively around the current solutions and can likewise become trapped in a local optimum. Converging to the global optimum therefore requires a well-tuned balance between the two phases. Many MAs have been proposed in recent years to balance these phases (Chen et al. 2022; Alzaqebah et al. 2022). However, the no-free-lunch (NFL) theorem (Wolpert and Macready 1997; Singh et al. 2022; Hussien et al. 2022) implies that no single algorithm performs best on every optimization problem. Developing and tailoring metaheuristic algorithms for specific problems therefore remains worthwhile for real-world applications.

MAs can be categorized into four main categories: physics–mathematics-based algorithms (PMA), human-based algorithms, evolutionary algorithms (EA), and swarm intelligence (SI) (Onay and Aydemır 2022; Chen et al. 2022). In the PMA, MA draws inspiration from mathematical models and physical phenomena such as sine cosine algorithm (SCA) (Mirjalili 2016), Fick’s law algorithm (FLA) (Hashim et al. 2023), water cycle algorithm (WCA) (Hussien et al. 2022), and multi-verse optimizer (MVO) (Mirjalili et al. 2016). In comparison, the human-based algorithms take cues from human interaction and collaboration, such as the teaching–learning-based optimization algorithm (TLBOA) (Venkata Rao et al. 2011).

In contrast, the biological evolution that occurs in nature through reproduction, mutation, crossover, and selection serves as inspiration for the evolutionary algorithm (EA). Furthermore, individuals compete and may team up throughout evolution to identify the best candidate solution among them, such as genetic algorithms (GA) (Holland 1992) and evolutionary programming (EP) (Koza 1994). Last, the SI algorithms mimic animal behavior in nature in encircling and attacking the prey. The essential characteristic of the SI is the self-organization nature by self-transforming the components into valuable forms to deal with, such as particle swarm optimization (PSO) (Kennedy and Eberhart 1995), crow search algorithm (CSA) (Hussien et al. 2020), gray wolf optimizer (GWO) (Mirjalili et al. 2014), wild horse optimizer (WHO) (Zheng et al. 2022), Remora optimization algorithm (ROA) (Wang et al. 2022), Harris Hawk optimization (HHO) (Heidari et al. 2019), snake optimizer (SO) (Hashim and Hussien 2022), moth flame optimization (MFO) (Alazab et al. 2022), Aquila optimizer (AO) (Huangjing et al. 2022), jellyfish search (JS) (Gang et al. 2023), and whale optimization algorithm (WOA) (Mirjalili and Lewis 2016).

Beluga whale optimization (BWO) is a swarm-inspired, population-based optimizer that simulates the swimming, hunting, and whale-fall behaviors of beluga whales. Despite its good performance, BWO has drawbacks and still needs improvement, notably slow convergence and an imbalance between exploitation and exploration. In this paper, a novel modified BWO (mBWO) optimizer is proposed that includes an elite evolution strategy, a randomization control factor, and a transition factor between exploitation and exploration.

The contribution of this paper can be summarized as follows:

  • An improved BWO (mBWO) is suggested which integrates three techniques with BWO. These techniques are elite evolution strategy, randomization control factor, and transition factor between exploitation and exploration.

  • mBWO is compared with original BWO and ten various optimizers using twenty-nine CEC2017 functions.

  • mBWO is employed to solve eight engineering problems, namely welded beam design problem, three-bar truss design problem, tension/compression spring design problem, speed reducer design problem, optimal design of industrial refrigeration system, pressure vessel design problem, cantilever beam design, and multi-product batch plant.

  • mBWO outperformed the other algorithms on both constrained and unconstrained problems.

The rest of this paper is organized as follows: Section 2 reviews enhancements that have been made to several algorithms, whereas Sect. 3 presents the inspiration and mathematical formulation of beluga whale optimization. Sections 4 and 5 present the novel approach and its results compared with many well-known algorithms on constrained and unconstrained problems, and Sect. 6 concludes the paper.

2 Related work

Metaheuristic algorithms provide considerable advantages in solving challenging optimization and multi-objective problems. Numerous swarm intelligence techniques have been introduced and used to solve engineering and global optimization problems. However, despite the considerable number of swarm approaches, balancing local and global search so that the algorithm does not settle in a local optimum remains a limitation of metaheuristic algorithms. This limitation directly affects the convergence, solution accuracy, and optimization efficiency of these algorithms (Wang et al. 2022; Singh et al. 2019).

In order to overcome these limitations, different metaheuristic algorithms were enhanced and modified using different techniques. Yueting et al. (2019) published a modified moth flame optimization technique (MFO) that incorporates chaotic local search and Gaussian mutation. Using these two techniques, the improved CLSGMFO outperforms the majority of metaheuristics. To overcome its shortcomings, MFO is further enhanced in Yueting et al. (2019) by merging it with Gaussian mutation (GM), Cauchy mutation (CM), and Lévy mutation (LM). Saunhita and Mini combined opposite-based learning (OBL) with Cauchy mutation and evolution boundary constraint management to improve the MFO’s functionality. Compared to other algorithms, this approach achieved the best results for 13 of 18 functions tested on a set from CEC2005 (Sapre and Mini 2019).

Similarly, Ibrahim et al. (2018) enhanced the performance of GWO and applied it to global optimization problems. The improvement draws on chaos, opposition-based learning, and the DE algorithm. Saremi et al. (2015) improved the efficiency of GWO with evolutionary population dynamics (EPD), an operator that removes weak individuals and thereby improves the exploration of GWO. Wen et al. (2018) proposed EEGWO, a revised GWO with strengthened exploration. It introduces a new position-update equation in which an individual chosen at random from the population guides the search for new candidate solutions.

Recently, and in the same manner, Onay and Aydemır (2022) proposed an improved hunger games optimizer (HGS) for global optimization and engineering problems. The authors embedded chaotic maps in HGS: ten different chaotic maps were used in three different scenarios to control two random values in the exploitation and exploration phases of the algorithm. The proposed improvement shows superior results on the CEC2017 real-world engineering problems compared with the standard HGS and promising convergence capability compared with other state-of-the-art optimization algorithms. Furthermore, for solving multi-dimensional engineering global optimization problems, Lin et al. (2022) suggested a whale optimization algorithm with a niching strategy (NHWOA) to speed up the algorithm’s convergence and cover more positions in the search space. The niching technique encourages population diversity and prevents premature convergence while searching for a globally superior solution. At the same time, a heuristic change is made to the WOA parameters to strengthen the search agents’ capacity for exploration throughout evolution. Multiple niches are created from the initialized global population, and each niche is updated separately. After a fixed number of iterations, the search agents in the niches are redistributed to increase population diversity. NHWOA was tested on the CEC2014 benchmark and showed promising results compared with other modified versions of WOA.

Table 1 Previous state-of-the-art techniques

A novel slime mould algorithm (SMA) for global optimization in practical engineering issues was introduced by Örnek et al. (2022). However, there is a good chance that the SMA will enter the local optimum rather than converge effectively. To get around this restriction, the authors made use of the sine cosine algorithm’s strength. Additionally, they added an improved sigmoidal function that is based on the Schwarz lemma for transformation to replace the arctanh function. On various engineering problems, including cantilever beam design, pressure vessel design, three-bar truss, and speed reducer real-world problems, their proposed technique exhibited a good orientation to avoid trapping into the local optima and faster convergence than the standard sine cosine slime mould algorithm.

Hashim et al. (2021) designed and tested a new metaheuristic algorithm for engineering problems, called AOA. The algorithm is inspired by the physics concept known as Archimedes' principle: the upward force exerted on an object that is wholly or partially submerged in a fluid is proportional to the weight of the displaced fluid. It is important to note that the suggested strategy maintains a balance between exploitation and exploration. This characteristic makes it suitable for complex optimization problems with several local optima, since it maintains a population of solutions and searches a large area to locate the best overall solution. Furthermore, when evaluated against other optimizers on the CEC2017 benchmark and four engineering design problems, the proposed algorithm proved to be a powerful optimization tool that balances exploration and exploitation with regard to convergence speed.

Consistent with the studies reviewed above, solving real-world engineering problems continues to attract attention in the research field. Based on the no-free-lunch theorem, it is unrealistic to expect any single metaheuristic method to solve every optimization problem. This challenge motivates us to develop and enhance a metaheuristic algorithm that overcomes the general drawbacks of the techniques used for this type of problem. Table 1 summarizes the previous state-of-the-art techniques used in the literature.

3 Beluga whale optimization (BWO)

The beluga whale (Delphinapterus leucas) is a species of whale that lives in the sea. It is renowned for the snow-white color of its adults and has earned the nickname “canary of the sea” due to the variety of sounds it makes. A beluga whale has a medium-sized, rounded, stocky body that ranges in length from 3.5 to 5.5 m and weighs roughly 1500 kg; belugas form groups of 2 to 25 individuals, with an average of ten members. Their acute hearing and vision allow belugas to maneuver and hunt by sound, and because of their blunt teeth, beluga whales typically suck their prey into their mouths. Beluga whales are primarily found in the Arctic and subarctic oceans, including Alaska, northwest Canada, and the waters near Ellesmere Island. Some beluga whales live in aquariums, where they are known for their graceful movements and friendly demeanor (Zhong et al. 2022).

The beluga whale optimization (BWO) algorithm was proposed in 2022 (Zhong et al. 2022); it mimics beluga whales’ swimming, hunting, and whale-fall behaviors. BWO has exploration and exploitation phases, just like other metaheuristics. By selecting beluga whales randomly, the exploration phase ensures that the design space can be searched globally, while the exploitation phase manages local search. BWO uses a balance factor (\(B_{\textrm{f}}\)) to move from the exploration phase to the exploitation phase: the algorithm is in the exploration phase when \(B_{\textrm{f}}>0.5\) and in the exploitation phase when \(B_{\textrm{f}}\le 0.5\). The balance factor is calculated as:

$$\begin{aligned} B_{\textrm{f}} = B_0 \left( 1-\frac{t}{2T}\right) \end{aligned}$$
(1)

where \(B_0\) is a random variable in the range [0, 1] and changes in each iteration. The current iteration and the maximum number of iterations are denoted as t and T, respectively. Figure 1 illustrates the main phases in the BWO algorithm.
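As a minimal sketch of Eq. (1), the following Python functions draw \(B_0\), compute the balance factor, and report which phase it selects. The function names and the use of Python's `random` module are illustrative assumptions and are not taken from the original BWO implementation.

```python
import random

def balance_factor(t, T, rng=random):
    """Eq. (1): B_f = B_0 * (1 - t / (2T)), with B_0 drawn uniformly from [0, 1)."""
    b0 = rng.random()
    return b0 * (1.0 - t / (2.0 * T))

def select_phase(bf):
    """Exploration when B_f > 0.5, exploitation otherwise."""
    return "exploration" if bf > 0.5 else "exploitation"
```

Because the factor \(1 - t/2T\) shrinks from 1 to 0.5, the chance of drawing \(B_{\textrm{f}}>0.5\) falls from about one half at t = 0 to zero at t = T, so the search drifts toward exploitation as the iterations progress.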

Fig. 1 BWO phases

The exploration phase represents the swimming behavior of beluga whales with various social-sexual postures and actions, for example, when a pair of beluga whales swim closely in a synchronized or mirrored form. The positions of the search agents are therefore determined by the paired swimming of beluga whales and are updated as follows (Zhong et al. 2022):

$$\begin{aligned} X^{t+1}_{i,j} = {\left\{ \begin{array}{ll} X^{t}_{i,p_j} + (X^{t}_{r,p_i} - X^{t}_{i,p_j})(1+r_1) \sin {(2\pi r_2)}, &{}\quad j = Even\\ X^{t}_{i,p_j} + (X^{t}_{r,p_i} - X^{t}_{i,p_j})(1+r_1) \cos {(2\pi r_2)}, &{} \quad j = Odd\\ \end{array}\right. } \end{aligned}$$
(2)

where \(X^{t+1}_{i,j}\) is the new position of the \(i^{th}\) search agent in the \(j^{th}\) dimension and \(p_j\) is a dimension index chosen at random from the d dimensions. \(X^{t}_{i,p_j}\) and \(X^{t}_{r,p_i}\) are the current positions of the \(i^{th}\) and \(r^{th}\) search agents, and \(r_1\) and \(r_2\) are random numbers in the range [0,1] that enhance the randomness of the exploration phase. Furthermore, the terms \(\sin {(2\pi r_2)}\) and \(\cos {(2\pi r_2)}\) indicate that the fins of the mirrored beluga whales point toward the surface. Depending on whether the dimension index is even or odd, the updated position reflects the synchronous or mirror behavior of beluga whales when swimming or diving.
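The exploration update in Eq. (2) can be sketched as below. Several details are assumptions made only for illustration: the partner whale r is drawn from the other whales, the partner's dimension index is drawn uniformly at random, \(r_1\) and \(r_2\) are drawn once per whale, and the even/odd test uses 1-based dimension indices. The function name `explore` is hypothetical.

```python
import math
import random

def explore(X, i, rng=random):
    """Return a new position for whale i following Eq. (2).

    X is a list of positions (one list of floats per whale, at least two whales).
    """
    n, d = len(X), len(X[0])
    r = rng.choice([k for k in range(n) if k != i])  # assumed: partner differs from i
    p_r = rng.randrange(d)                           # assumed: random dimension of the partner
    r1, r2 = rng.random(), rng.random()
    new = []
    for j in range(d):
        p_j = rng.randrange(d)                       # random dimension index p_j
        # 1-based parity (assumption): sine for even j, cosine for odd j
        trig = math.sin if (j + 1) % 2 == 0 else math.cos
        base = X[i][p_j]
        new.append(base + (X[r][p_r] - base) * (1 + r1) * trig(2 * math.pi * r2))
    return new
```

When all whales share the same coordinates, the difference term vanishes and the position is unchanged, which is a quick sanity check on the update.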

The exploitation phase is modeled mathematically on the beluga whales’ hunting habits. Beluga whales share information about the search agents’ positions and the best position for the candidate prey, and they can forage and move together when other beluga whales are nearby. To enhance the convergence of BWO in the exploitation phase, the Lévy flight (LF) is adopted and modeled as follows (Zhong et al. 2022):

$$\begin{aligned} L_{\textrm{f}}&= 0.5 \times \frac{u \times \sigma }{|v|^{\frac{1}{\beta }}} \end{aligned}$$
(3)
$$\begin{aligned} \sigma&= \left( \frac{\Gamma (1+\beta ) \times \sin {(\pi \beta /2)}}{\Gamma ((\beta +1)/2) \times \beta \times 2^{(\beta -1)/2}} \right) ^{\frac{1}{\beta }} \end{aligned}$$
(4)

where u and v are normally distributed random numbers, \(\Gamma \) is the gamma function, and \(\beta \) is a constant with default value 1.5.
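A short sketch of Eqs. (3) and (4) follows, taking the function that appears in Eq. (4) to be the gamma function (`math.gamma` in Python). The function names are hypothetical, and the scale 0.5 is copied from Eq. (3) as printed.

```python
import math
import random

def levy_sigma(beta=1.5):
    """Eq. (4): scale parameter of the Levy step."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_flight(beta=1.5, rng=random):
    """Eq. (3): one Levy flight value L_f from two standard normal draws."""
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    # v == 0 would divide by zero; the probability of that draw is negligible
    return 0.5 * u * levy_sigma(beta) / abs(v) ** (1 / beta)
```

For \(\beta = 1.5\), `levy_sigma` evaluates to roughly 0.697. Most draws of `levy_flight` are small, while occasional draws are very large because of the heavy tail produced by dividing by \(|v|^{1/\beta }\).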

Finally, the exploitation phase can be expressed as follows:

$$\begin{aligned} X^{t+1}_{i} = r_3 X^{t}_{\textrm{best}} - r_4 X^{t}_{i} + C_1 L_{\textrm{f}} (X^{t}_{r} - X^{t}_{i}) \end{aligned}$$
(5)

where \(X^{t+1}_{i}\) is the new position of the \(i^{th}\) search agent, and \(X^{t}_{i}\) and \(X^{t}_{r}\) are the current positions of the \(i^{th}\) search agent and a random search agent, respectively. \(X^{t}_{\textrm{best}}\) is the position of the best search agent, and \(r_3\) and \(r_4\) are random variables in the range [0,1]. \(C_1\) determines the strength of the random jump in the LF (\(C_1 = 2r_4(1-t/T)\)).
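The exploitation update in Eq. (5) can be sketched as follows. The Lévy flight value \(L_{\textrm{f}}\) is passed in as the argument `levy` so that the example stays self-contained; the function name and argument order are illustrative assumptions, as is drawing \(r_3\) and \(r_4\) once per whale.

```python
import random

def exploit(x_i, x_r, x_best, t, T, levy, rng=random):
    """Eq. (5): new position of whale i in the exploitation phase."""
    r3, r4 = rng.random(), rng.random()
    c1 = 2.0 * r4 * (1.0 - t / T)   # jump strength C_1, which decays to 0 at t = T
    return [r3 * b - r4 * xi + c1 * levy * (xr - xi)
            for xi, xr, b in zip(x_i, x_r, x_best)]
```

At the last iteration \(C_1\) is zero, so the Lévy term no longer contributes and the new position depends only on the best and current positions.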

The whale fall is a natural phenomenon on the seabed. Beluga whales cannot always avoid the dangers around them: they may be attacked by killer whales or hunted by humans and end up sinking into the ocean depths, where the carcass becomes food for other sea creatures. To model this behavior, a whale-fall probability (\(W_{\textrm{f}}\)) is introduced and expressed as follows (Zhong et al. 2022):

$$\begin{aligned} W_{\textrm{f}} = 0.1 - 0.05 t/T \end{aligned}$$
(6)

The probability of a whale fall decreases from 0.1 in the first iteration to 0.05 in the last iteration, indicating that the danger to beluga whales lessens as they get closer to their food source during the optimization process.

It is crucial to note that a whale fall would reduce the number of search agents in BWO, which must remain constant. Therefore, to keep the population size fixed, BWO replaces a fallen whale with an updated position computed from the beluga whale positions and the step size of the whale fall:

$$\begin{aligned} X^{t+1}_{i}&= r_5 X^{t}_{i} - r_6 X^{t}_{r} + r_7 X_{\textrm{step}} \end{aligned}$$
(7)
$$\begin{aligned} X_{\textrm{step}}&= (u_b - l_b)\exp (-C_2 t/T) \end{aligned}$$
(8)

where \(r_5\), \(r_6\), and \(r_7\) are random numbers between 0 and 1, and \(X_{\textrm{step}}\) is the step size of the whale fall. \(C_2\) is the step factor, which depends on the whale-fall probability and the population size n (\(C_2=2W_{\textrm{f}} \times n\)). The lower and upper boundaries are denoted as \(l_b\) and \(u_b\), respectively.
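Equations (6) to (8) can be put together in a small sketch. The function names, the per-dimension bounds passed as lists, and drawing \(r_5\), \(r_6\), and \(r_7\) once per whale are assumptions made for illustration.

```python
import math
import random

def whale_fall_probability(t, T):
    """Eq. (6): W_f decreases linearly from 0.1 to 0.05."""
    return 0.1 - 0.05 * t / T

def whale_fall(x_i, x_r, lb, ub, t, T, n, rng=random):
    """Eqs. (7)-(8): replacement position for a fallen whale in a population of size n."""
    c2 = 2.0 * whale_fall_probability(t, T) * n      # step factor C_2
    r5, r6, r7 = rng.random(), rng.random(), rng.random()
    new = []
    for xi, xr, lo, hi in zip(x_i, x_r, lb, ub):
        step = (hi - lo) * math.exp(-c2 * t / T)     # X_step shrinks as t grows
        new.append(r5 * xi - r6 * xr + r7 * step)
    return new
```

With n = 30, the exponent \(-C_2 t/T\) reaches \(-3\) at the final iteration, so the step size has shrunk to about 5% of the search range by then.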

4 Proposed BWO method

The main drawback of the standard BWO algorithm is that it falls into local optima because of weaknesses in its search capability. In this paper, we propose an enhanced version of BWO, called mBWO, that strengthens the search capability of the standard BWO to avoid becoming trapped in local optima and to achieve a good balance between the exploration and exploitation phases. The BWO algorithm is enhanced by borrowing global and local search strategies from evolution-based and swarm-based algorithms. To this end, we adopt several techniques. First, the elite evolution strategy generates efficient solutions with meaningful variation between them to avoid falling into local optima. Second, a control factor governs the randomization inside BWO to drive the algorithm away from local optima regions. Third, because the standard BWO lacks a transfer parameter for moving from the exploration phase to the exploitation phase, which directly affects the algorithm’s stability, we add a transition factor to control the transition process.

  • Elite evolution strategy

    Population-based MAs start from randomly selected solutions. During the optimization process, they update the positions by generating new solutions and keep the best solution until a better one is found. This best solution may be only a local best, which means the search has fallen into a local optimum. In the worst-case scenario, a single individual ends up driving the entire search, so the algorithm behaves like an individual search mechanism and does not use fitness landscape information. The elite individual is the solution with the highest fitness in the population. The elite approach, which is used by many MAs, preserves the top candidates for the following generation and conducts memetic searches around them. The elite individual suggests a greater likelihood that the global optimum lies in its region of the search space (Pei 2020). For that reason, we adopt the elite evolution strategy in the proposed improvements.

    1. Elite random mutation: The elite random mutation enhances the search strategy of the algorithm and provides a stronger exploration ability that avoids stagnation in a local optimum. A new solution is generated that is unexpected yet related to the existing solutions, which gives the benefit of exploration without excessive randomness. The solution \(X_{\textrm{new}}\) generated by elite random mutation is given by:

      $$\begin{aligned} X_{\textrm{new}}&= X_{\textrm{center}}+ R_G \times |X_{\textrm{center}} - X_{\textrm{best}}| \end{aligned}$$
      (9)
      $$\begin{aligned} X_{\textrm{center}}&= \frac{U - L}{2} \end{aligned}$$
      (10)

      where \(X_{\textrm{center}}\) is the center position defined by the upper bound U and the lower bound L, \(X_{\textrm{best}}\) is the best position in the population, and \(R_G\) is a random number drawn from a Gaussian distribution (\(\mu =0, \sigma =1\)).

    2. Gaussian local mutation: Gaussian mutation (GM) (Luo et al. 2018) acts on the original position vector to produce a new location by using a random number that follows the normal distribution. As a result, most mutated individuals are distributed close to the original location, which is analogous to conducting a local search in a small neighborhood. This mutation improves the optimizer’s accuracy and enables the original algorithm to leave a local optimum. A few individuals land far from the current location, which increases population diversity and enables a faster and more accurate search for promising locations, accelerating convergence (Song et al. 2021). The Gaussian mutation thus provides both local and wider-ranging mutations for new solutions. It is given by:

      $$\begin{aligned} X_{\textrm{new}}&= X_N + R_G \times |X_N - X_t| \end{aligned}$$
      (11)
      $$\begin{aligned} X_N&= {\left\{ \begin{array}{ll} X_{\textrm{best}2}, &{} \quad if\ r1< 0.5 \ and \ r2<0.5\\ X_{\textrm{best}3}, &{} \quad if\ r1 <0.5 \ and \ r2 >0.5 \\ X_{\textrm{best}}, &{}\quad otherwise \end{array}\right. } \end{aligned}$$
      (12)

      where \(R_G\) is a random number drawn from a Gaussian distribution (\(\mu =0\), \(\sigma =0.333\)), \(X_t\) is the location at iteration t, and r1 and r2 are random numbers in the range [0,1].

  • Control randomization

    Because the randomization process explores the search space using random positions, the generated population may concentrate in one particular area of the search space, which could cause the search to fall into a local optimum. Controlling the randomization therefore plays a vital role in avoiding stagnation in a local optimum. Here, we control the randomization in a simple way using:

    $$\begin{aligned} \hbox {ran }= 2 \times \hbox {rand} -1 \end{aligned}$$
    (13)

    where rand is a random number between 0 and 1. As a result, ran takes both negative and positive values in the interval \([-1, 1]\), so the search can move in either direction and cover the given search space.

  • Transition Factor (TF) phase

    The BWO algorithm lacks a transfer parameter during the exploration phase, which affects the stability of the search strategy and can waste computation time. To address this issue, the search in both phases includes a transfer factor (TF) that gradually moves the agents from exploration to exploitation as the iterations progress; TF is given by:

    $$\begin{aligned} \hbox {TF} = \exp \left( -\frac{t}{T}\right) \end{aligned}$$
    (14)

    where t and T are the current iteration and the maximum number of iterations, respectively.
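The three building blocks above, Eqs. (9) to (14), can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, \(R_G\) is drawn independently for each dimension, and \(X_{\textrm{center}}\) is computed exactly as printed in Eq. (10), that is \((U-L)/2\), which differs from the geometric midpoint \((U+L)/2\) of the bounds.

```python
import math
import random

def elite_random_mutation(x_best, lb, ub, rng=random):
    """Eqs. (9)-(10): mutate around X_center using a N(0, 1) factor."""
    new = []
    for b, lo, hi in zip(x_best, lb, ub):
        center = (hi - lo) / 2.0                     # Eq. (10) as printed
        new.append(center + rng.gauss(0.0, 1.0) * abs(center - b))
    return new

def gaussian_local_mutation(x_t, x_best, x_best2, x_best3, rng=random):
    """Eqs. (11)-(12): mutate around one of the three best solutions."""
    r1, r2 = rng.random(), rng.random()
    if r1 < 0.5 and r2 < 0.5:
        x_n = x_best2
    elif r1 < 0.5 and r2 > 0.5:
        x_n = x_best3
    else:
        x_n = x_best
    return [n + rng.gauss(0.0, 0.333) * abs(n - x) for n, x in zip(x_n, x_t)]

def ran(rng=random):
    """Eq. (13): control randomization value in [-1, 1)."""
    return 2.0 * rng.random() - 1.0

def transition_factor(t, T):
    """Eq. (14): TF decays from 1 at t = 0 to exp(-1) at t = T."""
    return math.exp(-t / T)
```

Note that TF never reaches zero: at the final iteration it equals \(e^{-1} \approx 0.368\), so the modified updates keep a reduced but nonzero step size to the end of the run.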

Finally, the techniques mentioned above are used to determine the new positions in the exploration, exploitation, and whale fall phases of BWO by taking the absolute value of the position difference and adding the control randomization (ran) and the TF parameter. The new equations for the exploration, exploitation, and whale fall phases are as follows:

  • Exploration phase

$$\begin{aligned}&X_{i,j}^{t+1} = {\left\{ \begin{array}{ll} X_{i,pj}^t + D \times TF \times ran \times (1+r_1) \times \sin {(2\pi r_2)}\times | X_{i,p1}^t- X_{i,pj}^t|, &{}\quad if \ j = even\\ X_{i,pj}^t + D \times TF \times ran \times (1+r_1) \times \cos {(2\pi r_2)}\times | X_{i,p1}^t- X_{i,pj}^t|, &{} \quad if \ j = odd \end{array}\right. } \end{aligned}$$
(15)

where D is a diversity operator in the interval [-1,1].

  • Exploitation phase

    $$\begin{aligned} X_i^{t+1}&= r_3 \times X_{\textrm{best}} - r_4 \times X_i^t+D \times TF \times ran \\&\quad \times C_1 \times L_{\textrm{f}} \times |X_r^t - X_i^t| \end{aligned}$$
    (16)
  • Whale fall phase

    In this phase, high randomization without guidance from the best solution leads to high diversity but an unstable search mechanism, which can cause the search to fall into a local optimum and waste time. Here, we simplify this phase so that it achieves sufficient diversity with controlled randomization, given by:

    $$\begin{aligned} X_i^{t+1} =X_r^t +D \times TF \times ran \times |X_r^t - X_i^t| \end{aligned}$$
    (17)
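The modified updates in Eqs. (15) to (17) can be sketched as below. How the diversity operator D is computed is not specified in this section, so it is passed in as a plain number; the function names, drawing a fresh ran value for each dimension, and the 1-based even/odd test are likewise assumptions made only for illustration.

```python
import math
import random

def ran(rng=random):
    """Eq. (13): control randomization value in [-1, 1)."""
    return 2.0 * rng.random() - 1.0

def mbwo_explore(x_i, D, TF, rng=random):
    """Eq. (15): exploration update built from two random dimensions of whale i."""
    d = len(x_i)
    p1 = rng.randrange(d)
    r1, r2 = rng.random(), rng.random()
    new = []
    for j in range(d):
        pj = rng.randrange(d)
        trig = math.sin if (j + 1) % 2 == 0 else math.cos   # assumed 1-based parity
        step = D * TF * ran(rng) * (1 + r1) * trig(2 * math.pi * r2)
        new.append(x_i[pj] + step * abs(x_i[p1] - x_i[pj]))
    return new

def mbwo_exploit(x_i, x_r, x_best, D, TF, c1, levy, rng=random):
    """Eq. (16): exploitation update with the Levy term scaled by D, TF, and ran."""
    r3, r4 = rng.random(), rng.random()
    return [r3 * b - r4 * xi + D * TF * ran(rng) * c1 * levy * abs(xr - xi)
            for xi, xr, b in zip(x_i, x_r, x_best)]

def mbwo_whale_fall(x_i, x_r, D, TF, rng=random):
    """Eq. (17): simplified whale fall around a random whale x_r."""
    return [xr + D * TF * ran(rng) * abs(xr - xi) for xi, xr in zip(x_i, x_r)]
```

In all three updates the absolute difference fixes the step magnitude, while the sign of the step comes from the product of D, ran, and, in Eqs. (15) and (16), the trigonometric or Lévy term.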

Algorithm 1 presents the pseudo-code of the modified beluga whale optimization (mBWO) algorithm.

Table 2 Metaheuristic algorithm parameter settings
Table 3 Benchmark functions

5 Experimental results and discussion

This study uses the CEC’17 test suite to evaluate the proposed mBWO. CEC’17 contains 30 functions that represent 30 minimization problems and is widely used to evaluate metaheuristic algorithms (Mirjalili et al. 2014; Hashim et al. 2021; Mirjalili and Lewis 2016). Furthermore, eight engineering design problems are used for evaluation. The main objective of using these functions is to evaluate the search capability of the proposed method and its convergence behavior. Because these algorithms are stochastic, each experiment is run 30 times to account for the randomness of the algorithms and the variation in results between runs. The parameter setup is given in Table 2. Several state-of-the-art and recently developed algorithms are compared with the proposed mBWO: BOA (Arora and Singh 2019), HHO (Heidari et al. 2019), WOA (Mirjalili and Lewis 2016), SCA (Mirjalili 2016), AEO (Zhao et al. 2020), BSA (Civicioglu 2013), SCSO (Seyyedabbasi and Kiani 2022), COA (Boveiri and Elhoseny 2020), SAO (Salawudeen et al. 2021), CHIO (Al-Betar et al. 2021), WSO (Braik et al. 2022), and SMA (Li et al. 2020). To ensure a fair comparison, all algorithms were run on the same hardware to solve the CEC’17 test suite. All algorithms were implemented in MATLAB2021 on a 64-bit Windows 8.1 operating system. Table 3 lists the name of each function, its type, and its optimal value. The parameters common to all algorithms are the population size (30), the maximum number of iterations (1000), and the number of runs (30).

5.1 Experimental series 1: CEC2017

The CEC2017 (Wu et al. 2017) consists of a set of functions that represent a set of optimization problems. These functions are characterized by diversity, complexity, and dynamism. They are commonly used for the evaluation of the proposed optimization algorithms. In this study, they are used to evaluate the proposed mBWO. Furthermore, they can describe the exploration and exploitation behavior of the optimization algorithm.

Table 3 shows the 30 functions, which are distributed into four sets as follows: F1 to F3 is the unimodal set. F4 to F10 is the multimodal set. F11 to F20 is the hybrid set, and finally, F21 to F30 is the composition set. F2 is not used in the evaluation process. Thus, 29 functions are used to evaluate the proposed mBWO and other algorithms. Also, Table 3 shows that the search range for all of the test functions is from -100 to 100 and the dimension equals 30 (Figs. 2, 3, and 4).

As given in Table 4, the results indicate the efficiency of the proposed mBWO in optimizing the functions of the CEC2017 test suite. According to the rank results, mBWO outperforms the other compared algorithms on 25 functions: F1, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F15, F16, F17, F19, F20, F21, F22, F23, F24, F25, F27, F28, F29, and F30. The HHO algorithm outperformed mBWO on F3, and the COA algorithm achieved the first rank on F14, F18, and F26. The algorithms with the weakest performance on these functions are BWO, AEO, SAO, WOA, and BOA. Figures 5, 6, and 7 show the boxplots of mBWO and the other algorithms, which graphically represent the values recorded in Table 4. For each algorithm, the boxplot represents the minimum, maximum, and mean values.

Fig. 2 Convergence curve of some functions from F1–F10 for all algorithms CEC2017
Fig. 3 Convergence curve of some functions from F11–F20 for all algorithms CEC2017
Fig. 4 Convergence curve of some functions from F21–F30 for all algorithms CEC2017

To validate the mBWO, Table 5 shows the statistical results of the nonparametric Wilcoxon rank sum test. It gives an indication of the significance of the difference between the mBWO and other algorithms. According to the results in Table 5, the mBWO is significantly better than BWO, AEO, BOA, BSA, SAO, SCA, SMA, and WOA across all the test functions. On the other hand, mBWO is significantly better than COA, HHO, and SCSO for the majority of the functions.
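For readers who want to reproduce this kind of comparison, the sketch below implements a two-sided Wilcoxon rank-sum test with the normal approximation, using only the Python standard library. It is an illustration, not the routine used to produce Table 5: the function name `rank_sum_test` is hypothetical, tied values receive average ranks without a tie correction of the variance, and the 0.05 significance level in the usage note is an assumed convention.

```python
import math
from statistics import NormalDist

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    a, b: lists of final objective values from independent runs of two algorithms.
    Returns (z, p), where z < 0 means sample a tends to have smaller values.
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0          # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p
```

In use, `a` and `b` would hold the 30 final objective values of two algorithms on one function; a p-value below the chosen level (for example 0.05) indicates a statistically significant difference between them.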

Figures 2, 3, and 4 show the convergence behavior of the proposed mBWO and the other compared algorithms on the CEC2017 functions (F1–F10), (F11–F20), and (F21–F30), respectively. For functions F6, F9, F11, F13, F15, F16, F17, F19, F20, F21, F22, F23, F24, F29, and F30, mBWO shows superior convergence capability because it is able to balance exploration and exploitation. This is seen not only in the faster convergence achieved by mBWO but also in the optimal solutions found by the search process. COA achieves the best convergence behavior for F14, F18, and F26.

Table 4 Evaluation results of F1–F30
Fig. 5 Box plot of some functions from F1–F10 for all algorithms CEC2017
Fig. 6 Box plot of some functions from F11–F20 for all algorithms CEC2017
Fig. 7 Box plot of some functions from F21–F30 for all algorithms CEC2017

Table 5 Wilcoxon results of mBWO versus other metaheuristics on CEC2017

For further validation of the mBWO results on the functions of the CEC2017 test suite, an extensive statistical analysis is performed during the search process by recording the ratios of exploration and exploitation. Figures 8, 9, and 10 show the exploration and exploitation ratios obtained by mBWO while optimizing the CEC2017 functions. mBWO achieves a high rate of exploration at the beginning of the search process and adaptively shifts to exploitation in the later iterations. However, on F1, F6, F7, F8, F11, F14, F15, F16, F17, F18, F19, F20, F21, F23, and F24, mBWO performed more exploration than exploitation to reach the best position in the search space that contains the global optimal solutions. The mBWO follows the same behavior on F1, F4, F6, F8, F12, F13, F22, F24, F26, and F27. Overall, the proposed mBWO exhibits dynamic behavior during the search process.

Fig. 8 Exploration–exploitation of some functions from F1–F10 for all algorithms CEC2017

Fig. 9 Exploration–exploitation of some functions from F11–F20 for all algorithms CEC2017

Fig. 10 Exploration–exploitation of some functions from F21–F30 for all algorithms CEC2017

Table 6 Statistical results of mBWO versus other metaheuristics on welded beam design problem
Table 7 Statistical results of mBWO versus other metaheuristics on welded beam design problem
Fig. 11 Design of welded beam design problem

5.2 Experimental series 2: engineering problems

In this study, the proposed optimizer is tested on eight different constrained problems, namely:

  1. Welded beam design problem

  2. Three-bar truss design problem

  3. Tension/compression spring design problem

  4. Speed reducer design problem

  5. Optimal design of industrial refrigeration system

  6. Pressure vessel design problem

  7. Cantilever beam design

  8. Multi-product batch plant

Table 8 Statistical results of mBWO versus other metaheuristics on three-bar truss design problem

5.2.1 Welded beam design problem

This is an engineering design problem that is commonly used to evaluate new or modified optimizers. Its main objective is to minimize the cost of the welded beam design (Coello Coello 2000). The problem relies on four design variables: the weld thickness \(m(x_1)\), the length of the welded joint \(n(x_2)\), the height of the bar \(s(x_3)\), and the thickness of the bar \(t(x_4)\). The constraints bound four quantities: the shear stress \(\lambda \), the bending stress in the beam \(\theta \), the end deflection of the beam \(\Phi \), and the buckling load on the bar \(k_d\).

$$\begin{aligned} \begin{array}{ll} \text { Let } &{} x = [x_1, x_2, x_3, x_4] = [m, n, s, t]\\ \text { Minimize } &{} f(x)=1.10471 x_1^2x_2+0.04811x_3x_4 (14.0+x_2)\\ \text { Subject to: } &{} \\ &{}R_1(x) = \lambda (x) - 13600 \le 0\\ &{}R_2(x) = \theta (x) - 30000 \le 0\\ &{}R_3(x) = x_1 - x_4 \le 0\\ &{}R_4(x) = 0.10471 x_1^2 +0.04811x_3x_4(14+x_2)-5.0 \le 0\\ &{}R_6(x) = \Phi (x) - 0.25 \le 0\\ &{}R_7(x) = 6000 - k_d (x) \le 0\\ \text { where } &{} \lambda (x)=\sqrt{(\lambda ^{\prime })^2+2\lambda ^{\prime }\lambda ^{\prime \prime }\frac{x_2}{2G}+(\lambda ^{\prime \prime })^2}\\ &{}\lambda ^{\prime }=\frac{6000}{\sqrt{2}x_1x_2}\\ &{}\lambda ^{\prime \prime }=\frac{WG}{F}\\ &{}W=6000\left( 14 +\frac{x_2}{2}\right) \\ &{}G=\sqrt{\frac{x_2^2}{4}+\left( \frac{x_1+x_3}{2}\right) ^2}\\ &{}F=2\left\{ \sqrt{2}x_1x_2 \left[ \frac{x_2^2}{12}+\left( \frac{x_1+x_3}{2}\right) ^2\right] \right\} \\ &{}\theta (x)=\frac{504000}{x_4x_3^2}\\ &{}\Phi (x)=\frac{65856000}{(30 \times 10^6)x_4x_3^3}\\ &{}k_d(x)=\frac{4.013(30 \times 10^6) \sqrt{\frac{x_3^2x_4^6}{36}}}{196}\left( 1- \frac{x_3\sqrt{\frac{30\times 10^6}{4(12\times 10^6)}}}{28}\right) \\ \text { with } &{} 0.1 \le x_1, x_4 \le 2.0 \quad \text {and} \quad 0.1 \le x_2, x_3 \le 10.0 \end{array} \end{aligned}$$
(18)
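The welded beam model can be evaluated with a short function. This is a minimal sketch that assumes the commonly used form of the auxiliary quantities (squared terms inside the shear stress, \(x_1 + x_3\) in both \(G\) and \(F\), and \(x_4^6\) in the buckling load); the function name `welded_beam` and the variable names are hypothetical. It returns the six constraints in the order \(R_1\)–\(R_4\), \(R_6\), \(R_7\).

```python
import math

def welded_beam(x):
    """Return (cost, constraints); a design is feasible when every constraint is <= 0."""
    x1, x2, x3, x4 = x
    load, beam_length = 6000.0, 14.0
    young, shear_modulus = 30e6, 12e6

    cost = 1.10471 * x1**2 * x2 + 0.04811 * x3 * x4 * (14.0 + x2)

    # Shear stress: primary and secondary components combined.
    lam_p = load / (math.sqrt(2.0) * x1 * x2)
    w = load * (beam_length + x2 / 2.0)
    g = math.sqrt(x2**2 / 4.0 + ((x1 + x3) / 2.0) ** 2)
    f = 2.0 * (math.sqrt(2.0) * x1 * x2 * (x2**2 / 12.0 + ((x1 + x3) / 2.0) ** 2))
    lam_pp = w * g / f
    lam = math.sqrt(lam_p**2 + 2.0 * lam_p * lam_pp * x2 / (2.0 * g) + lam_pp**2)

    theta = 504000.0 / (x4 * x3**2)                 # bending stress
    phi = 65856000.0 / (young * x4 * x3**3)         # end deflection
    k_d = (4.013 * young * math.sqrt(x3**2 * x4**6 / 36.0) / 196.0) * (
        1.0 - x3 * math.sqrt(young / (4.0 * shear_modulus)) / 28.0
    )                                               # buckling load

    constraints = [
        lam - 13600.0,
        theta - 30000.0,
        x1 - x4,
        0.10471 * x1**2 + 0.04811 * x3 * x4 * (14.0 + x2) - 5.0,
        phi - 0.25,
        6000.0 - k_d,
    ]
    return cost, constraints
```

An optimizer would typically combine the returned cost with a penalty on the positive constraint values; the penalty scheme is a separate design choice and is not shown here.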

Table 6 shows that the mBWO ranks second after the AEO. It also has a very small standard deviation value, which indicates the robustness of the algorithm. Figure 11 shows that most of the algorithms have similar convergence behavior. However, the mBWO can reach positions in the search space that contain the minimum values. The figure shows that the AEO suffers from premature convergence and entrapment in local minima. Table 7 shows that the mBWO achieves competitive results regarding the four variables of this problem in comparison with other algorithms. Table 22 shows that the mBWO has a significant difference compared with other algorithms except for the AEO algorithm.

Table 9 Statistical results of mBWO versus other metaheuristics on three-bar truss design problem
Fig. 12 Three-bar truss design problem

Table 10 Statistical results of mBWO versus other metaheuristics on tension/compression spring design problem

5.2.2 Three-bar truss design problem

This engineering design problem has two design variables: the cross-sectional area of bars 1 and 3 and the cross-sectional area of bar 2. Also, there are some design conditions that influence the manufacturing process, such as stress, deflection, and buckling. As an optimization problem, its main objective is to minimize the total weight of the structure. More details of the three-bar truss design problem are provided in Eskandar et al. (2012). Mathematically, it is formulated as follows:

$$\begin{aligned} \begin{array}{ll} \text { Let} &{} x = [x_1, x_2] = [S_1, S_2],\\ \text { Min } &{} f(x) = (2\sqrt{2}x_1 + x_2)\times l,\\ \text { Subject to: } &{} R_1(x) =\frac{\sqrt{2}x_1 + x_2}{\sqrt{2}x_1^2+ 2x_1x_2} P - \phi \le 0,\\ &{}R_2(x) = \frac{x_2}{\sqrt{2}x_1^2+ 2x_1x_2}P - \phi \le 0,\\ &{}R_3(x) = \frac{1}{\sqrt{2}x_2 + x_1}P - \phi \le 0,\\ \text { where } &{} 0 \le x_1, x_2 \le 1,\\ &{} l = 100\,\text {cm}, P = 2\,\text {kN/cm}^2, \phi = 2\,\text {kN/cm}^2 \end{array} \end{aligned}$$
(19)
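A minimal sketch of the three-bar truss model follows. It assumes the commonly used form of the problem, in which the weight is scaled by the bar length \(l = 100\) cm and the third constraint has numerator 1; the function name `three_bar_truss` and its keyword arguments are hypothetical.

```python
import math

def three_bar_truss(x, l=100.0, p=2.0, phi=2.0):
    """Return (weight, constraints); a design is feasible when every constraint is <= 0."""
    x1, x2 = x
    root2 = math.sqrt(2.0)
    weight = (2.0 * root2 * x1 + x2) * l
    denom = root2 * x1**2 + 2.0 * x1 * x2
    constraints = [
        (root2 * x1 + x2) / denom * p - phi,   # stress in bar 1
        x2 / denom * p - phi,                  # stress in bar 2
        1.0 / (root2 * x2 + x1) * p - phi,     # stress in bar 3
    ]
    return weight, constraints
```

For example, `three_bar_truss([1.0, 1.0])` describes a feasible but heavy design, since all three stress constraints are negative at the upper bounds of the variables.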

Table 8 shows that the proposed mBWO achieves competitive results compared with other algorithms. It shares the second rank with the AEO algorithm, after BSA in the first rank. Also, in Table 9, the mBWO achieves promising values regarding the first and second variables of the problem. Table 22 shows that mBWO has meaningful differences compared with other algorithms except the AEO algorithm. Figure 12 shows that mBWO has a promising convergence behavior where the curves reach very small values in the final iterations. As opposed to the mBWO, the SAO suffers from premature convergence, which indicates that it is entrapped in local minima. Also, the WOA and AEO converge to points that are not the optimal solutions of the problem.

Table 11 Statistical results of mBWO versus other metaheuristics on tension/compression spring design problem
Fig. 13 Tension/compression spring design problem

5.2.3 Tension/compression spring design problem

The tension/compression spring design problem seeks the minimum weight of the spring. It is based on three independent variables: the wire diameter w (\(x_1\)), the coil diameter c (\(x_2\)), and the number of active coils a (\(x_3\)). The mathematical model of this problem is described in Sadollah et al. (2013). The following set of equations describes the mathematical model of the tension/compression spring problem.

$$\begin{aligned} \begin{array}{ll} \text { Let} &{} x = [x_1, x_2, x_3] = [w, c, a]\\ \text { Min } &{} f(x) = (x_3 + 2) x_2x_1^2\\ \text { Subject to: } &{} R_1(x) = 1 -\frac{x_2^3x_3}{71785x_1^4}\le 0\\ &{}R_2(x) =\frac{4x_2^2 - x_1x_2}{12566( x_2x_1^3-x_1^4)} +\frac{1}{5108x_1^2}- 1\le 0\\ &{}R_3 (x) = 1 -\frac{140.45x_1}{x_2^2x_3}\le 0\\ &{}R_4 (x) =\frac{x_1 + x_2}{1.5}- 1 \le 0\\ \text { where } &{} 0.05\le x_1 \le 2.0, 0.25\le x_2 \le 1.3,\\ &{} \text {and} \quad 2.0\le x_3 \le 15.0 \end{array} \end{aligned}$$
(20)
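The spring model can be sketched as a single evaluation function. The sketch assumes the commonly used form of the constraints, in particular a deflection constraint \(1 - x_2^3x_3/(71785x_1^4) \le 0\) and the coefficient 12566 in the shear stress constraint; the function name `spring_design` is hypothetical.

```python
def spring_design(x):
    """Return (weight, constraints); a design is feasible when every constraint is <= 0."""
    x1, x2, x3 = x  # wire diameter, coil diameter, number of active coils
    weight = (x3 + 2.0) * x2 * x1**2
    constraints = [
        1.0 - (x2**3 * x3) / (71785.0 * x1**4),                  # minimum deflection
        (4.0 * x2**2 - x1 * x2) / (12566.0 * (x2 * x1**3 - x1**4))
        + 1.0 / (5108.0 * x1**2) - 1.0,                          # shear stress
        1.0 - 140.45 * x1 / (x2**2 * x3),                        # surge frequency
        (x1 + x2) / 1.5 - 1.0,                                   # outer diameter limit
    ]
    return weight, constraints
```

Note that the shear stress constraint divides by \(x_2x_1^3 - x_1^4\), so the function is undefined when \(x_1 = x_2\); the variable bounds given above exclude that case.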

Table 10 shows that the mBWO outperforms other algorithms and comes in the first rank. It also has a very small standard deviation, which means that it is a robust algorithm. In Table 11, the mBWO has achieved competitive results for its variables. Table 22 shows that mBWO has a meaningful difference compared with other algorithms. Figure 13 shows that the mBWO and other algorithms behave similarly, reaching very small values in the final iterations of the optimization process. However, the BOA and WOA show premature convergence behavior that indicates entrapment in local minima.

Table 12 Statistical results of mBWO versus other metaheuristics on speed reducer design
Table 13 Statistical results of mBWO versus other metaheuristics on speed reducer design
Fig. 14 Speed reducer design

Table 14 Statistical results of mBWO versus other metaheuristics on design of industrial refrigeration system
Fig. 15 Optimal design of industrial refrigeration system

5.2.4 Speed reducer design problem

This is an optimization problem that minimizes the weight of a speed reducer. Its constraints cover the bending stress of the gear teeth, the surface stress, the transverse deflections of the shafts, and the stresses in the shafts (Mezura-Montes and Coello Coello 2005). The speed reducer is based on seven design variables. These are \(x_1\), \(x_2\), \(x_3\), \(x_4\), \(x_5\), \(x_6\), and \(x_7\), which stand for the face width, the teeth module, the pinion teeth number, the length of the first shaft between the bearings, the length of the second shaft between the bearings, the diameter of the first shaft, and the diameter of the second shaft, respectively.

$$\begin{aligned} \begin{array}{ll} \text { Let} &{} x = [x_1, x_2, x_3, x_4, x_5, x_6, x_7]\\ \text { Min } &{} f(x)=0.7854x_1x_2^2(3.3333x_3^2 +14.9334x_3-43.0934)\\ &{}-1.508x_1(x_6^2 + x_7^2) + 7.4777 (x_6^3 + x_7^3) \\ &{}+ 0.7854 (x_4x_6^2 + x_5x_7^2)\\ \text { Subject to: } &{} R_1 (x) =\frac{27}{x_1x_2^2x_3}- 1 \le 0\\ &{}R_2(x) = \frac{397.5}{x_1x_2^2x_3^2} -1 \le 0\\ &{}R_3 (x) =\frac{1.93x_4^3}{x_2x_3x_6^4}-1 \le 0\\ &{}R_4 (x) = \frac{1.93x_5^3}{x_2x_3x_7^4}- 1 \le 0\\ &{}R_5 (x) =\frac{1}{110x_6^3}\sqrt{\left( \frac{745x_4}{x_2x_3}\right) ^2+ 16.9 \times 10^6} -1 \le 0\\ &{}R_6 (x) =\frac{1}{85x_7^3}\sqrt{\left( \frac{745x_5}{x_2x_3}\right) ^2+ 157.5 \times 10^6} -1 \le 0\\ &{}R_7 (x) =\frac{x_2x_3}{40}-1 \le 0\\ &{}R_8 (x) =\frac{5x_2}{x_1}-1 \le 0\\ &{}R_9 (x) =\frac{x_1}{12x_2}-1 \le 0\\ &{}R_{10} (x) =\frac{1.5x_6 + 1.9}{x_4}-1 \le 0\\ &{}R_{11} (x) =\frac{1.1x_7 + 1.9}{x_5}-1 \le 0\\ \text { where } &{} 2.6 \le x_1 \le 3.6, 0.7 \le x_2 \le 0.8, 17 \le x_3 \le 28, \\ &{}7.3 \le x_4 \le 8.3, 7.8 \le x_5 \le 8.3, 2.9 \le x_6 \le 3.9, \\ &{} \text {and} \quad 5 \le x_7 \le 5.5 \end{array} \end{aligned}$$
(21)
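The speed reducer model, with its eleven constraints, can be sketched as follows. The sketch assumes the commonly used form of the constraints (for instance \(x_3^2\) in the denominator of \(R_2\) and a fraction in \(R_4\)); the function name `speed_reducer` is hypothetical.

```python
import math

def speed_reducer(x):
    """Return (weight, constraints); a design is feasible when every constraint is <= 0."""
    x1, x2, x3, x4, x5, x6, x7 = x
    weight = (
        0.7854 * x1 * x2**2 * (3.3333 * x3**2 + 14.9334 * x3 - 43.0934)
        - 1.508 * x1 * (x6**2 + x7**2)
        + 7.4777 * (x6**3 + x7**3)
        + 0.7854 * (x4 * x6**2 + x5 * x7**2)
    )
    constraints = [
        27.0 / (x1 * x2**2 * x3) - 1.0,
        397.5 / (x1 * x2**2 * x3**2) - 1.0,
        1.93 * x4**3 / (x2 * x3 * x6**4) - 1.0,
        1.93 * x5**3 / (x2 * x3 * x7**4) - 1.0,
        math.sqrt((745.0 * x4 / (x2 * x3)) ** 2 + 16.9e6) / (110.0 * x6**3) - 1.0,
        math.sqrt((745.0 * x5 / (x2 * x3)) ** 2 + 157.5e6) / (85.0 * x7**3) - 1.0,
        x2 * x3 / 40.0 - 1.0,
        5.0 * x2 / x1 - 1.0,
        x1 / (12.0 * x2) - 1.0,
        (1.5 * x6 + 1.9) / x4 - 1.0,
        (1.1 * x7 + 1.9) / x5 - 1.0,
    ]
    return weight, constraints
```

Returning the constraints as a list keeps the function independent of any particular constraint-handling scheme, so the same evaluation can be reused with different optimizers.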

Table 12 shows that the mBWO comes in the first rank with a very small standard deviation value. It also obtains competitive values regarding the seven variables, as shown in Table 13. Table 22 shows that mBWO has a significant difference compared with other algorithms. Figure 14 shows a great difference in the convergence behavior among the algorithms. However, it appears that the mBWO achieves the best convergence by reaching very small values in the final iterations of the search process.

5.2.5 Optimal design of industrial refrigeration system

The cooling system uses coolant to cool a hot stream. This process is performed in three steps. Each step has a heat exchanger on one side and a boiling cooler on the other.

The amount of current being pumped depends on the surface area of the heat exchange. Also, at the beginning of each step, the temperature required for boiling the refrigerant is determined. Stream flow rate and fluid temperature are key factors in designing a cooling system. The optimum cooling design needs to determine the area of three surfaces of the liquid cooling heat exchanger. The fluid has a specific heat of 4186.8 J/kg\(^{\circ }\)C and is pumped at a rate of 10,800 kg per hour while it is cooled from 10\(^{\circ }\)C to -55\(^{\circ }\)C. The unit operates for a minimum of 300 days in a year. The main parameters for the refrigerating system are the latent heat of the refrigerant (\(\lambda \)), 232,600 J/kg, and the overall heat transfer coefficient in all stages, 1130 J/(s m\(^2\) \(^{\circ }\)C). The main objective in the design of the refrigeration system is to minimize the cost of the three steps as in Eq. (22).

$$\begin{aligned} Cost=\sum _{i=1}^{3}[c_i(C_i)^{0.5}+d_iM_i] \end{aligned}$$
(22)

The first term in Eq. (22), \(c_i(C_i)^{0.5}\), is the capital cost of the heat exchange surface area \(C_i\). The second term, \(d_iM_i\), is the refrigerant operating cost. The optimization process aims to achieve competitive values of the design variables, such as the fluid temperatures. The design variables determine the area of the heat exchange and the liquid refrigerant addition rate in each step. The temperature of the liquid refrigerant in each step is:

$$\begin{aligned} {[}Temp_{R_1}=-18^{\circ }C, Temp_{R_2}=-40^{\circ }C, Temp_{R_3}=-62^{\circ }C] \end{aligned}$$

The temperature of the input fluid to the system (\(Temp_0\)) is \(10^{\circ }C\) and the temperature of the output fluid from the system (\(Temp_3\)) is \(-55^{\circ }C\). The output temperature at a given step must be greater than the refrigerant temperature. Therefore, the conditions on the design variables are:

$$\begin{aligned}&Temp_0=10^{\circ }C \ge Temp_1\ge -18^{\circ }C \end{aligned}$$
(23)
$$\begin{aligned}&Temp_1\ge Temp_2 \ge -40^{\circ }C \end{aligned}$$
(24)
$$\begin{aligned}&F_i=H_i\times C_i\times (\Delta Temp_i)_{\ln } \end{aligned}$$
(25)
Table 15 Statistical results of mBWO versus other metaheuristics on design of industrial refrigeration system

The log mean temperature at stage i:

$$\begin{aligned} (\Delta Temp_i)_{\ln }=\frac{Temp_{i-1}-Temp_{i}}{\ln \frac{Temp_{i-1}-Temp_{R_i}}{Temp_{i}-Temp_{R_i}}} \end{aligned}$$
(26)

The energy balance over refrigerant is:

$$\begin{aligned} F_i= \lambda _i \times M_i \end{aligned}$$
(27)

where \(F_i\) is the rate of heat flow, J/s, \(\lambda _i\) is the latent heat of the refrigerant, and \(M_i\) is the refrigerant addition rate.

The energy balance over the fluid is:

$$\begin{aligned} F_i=V \times Kl \times (Temp_{i-1}-Temp_i) \end{aligned}$$
(28)

where \(Kl\) is the specific heat of the fluid, J/kg\(^{\circ }\)C, and \(V\) is the hot fluid flow rate, kg/h.
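To show how the balances in Eqs. (25)–(28) fit together for one stage, the following minimal sketch computes the heat flow, the refrigerant rate, and the heat exchange area from the parameter values quoted above. It assumes the log mean temperature is taken as a positive quantity (larger temperature difference in the numerator of the logarithm) and that the flow rate is converted from kg/h to kg/s so that \(F_i\) is in J/s. The function names and the stage outlet temperature of 0 \(^{\circ }\)C are hypothetical.

```python
import math

def log_mean_temp(t_in, t_out, t_ref):
    """Log mean temperature difference for one stage, for t_in > t_out > t_ref."""
    return (t_in - t_out) / math.log((t_in - t_ref) / (t_out - t_ref))

def stage_balance(t_in, t_out, t_ref, flow_kg_per_h=10800.0,
                  specific_heat=4186.8, latent_heat=232600.0, heat_coeff=1130.0):
    """Return (heat flow J/s, refrigerant rate kg/s, exchange area m^2) for one stage."""
    # Energy balance over the fluid, Eq. (28), with kg/h converted to kg/s.
    heat_flow = flow_kg_per_h / 3600.0 * specific_heat * (t_in - t_out)
    # Energy balance over the refrigerant, Eq. (27).
    refrigerant_rate = heat_flow / latent_heat
    # Heat transfer relation, Eq. (25), solved for the area.
    area = heat_flow / (heat_coeff * log_mean_temp(t_in, t_out, t_ref))
    return heat_flow, refrigerant_rate, area

# Hypothetical first stage: fluid cooled from 10 to 0 degrees C, refrigerant at -18 degrees C.
heat_flow, refrigerant_rate, area = stage_balance(10.0, 0.0, -18.0)
```

The resulting area and refrigerant rate are the quantities that enter the cost in Eq. (22), so an optimizer only needs to choose the intermediate temperatures.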

Table 14 shows that the proposed mBWO obtains the first rank compared with other algorithms. Also, as shown in Table 15, the mBWO obtains the smallest values for most of the problem variables compared with other algorithms. Table 22 shows that mBWO has meaningful results in comparison with all other algorithms. Figure 15 shows that the algorithms have different behavior in convergence toward the best solution. The mBWO shows a promising convergence by reaching the minimum values at the latest iterations of the search process.

Table 16 Statistical results of mBWO versus other metaheuristics on pressure vessel design problem
Fig. 16 Pressure vessel design problem

5.2.6 Pressure vessel design problem

The pressure vessel design problem (Kannan and Kramer 1994) is the sixth engineering design problem used in this study to evaluate the proposed mBWO against other algorithms. It aims to achieve the minimum cost of the pressure vessel design. The cost depends on four design variables, which are \(x_1\), \(x_2\), \(x_3\), and \(x_4\). These variables refer to shell thickness, head thickness, inner radius, and cylinder length, respectively. The mathematical model of this problem is described extensively in Hashim et al. (2021). The following set of equations describes the mathematical model of the pressure vessel design problem.

$$\begin{aligned} \begin{array}{ll} \text { Let} &{} x = [x_1, x_2, x_3, x_4]\\ \text { Min } &{} f(x) = 0.6224x_1x_3x_4 + 1.7781x_2x_3^2 \\ &{}+ 3.1661x_1^2x_4+19.84x_1^2x_3 \\ \text { Subject to: } &{} R_1(x) = -x_1 + 0.0193x_3 \le 0\\ &{}R_2(x) = -x_2 + 0.00954x_3 \le 0\\ &{}R_3(x) = -\pi x_3^2x_4 - (4/3)\pi x_3^3 + 1{,}296{,}000 \le 0\\ &{}R_4(x) = x_4 - 240 \le 0\\ \text { where } &{} 0 \le x_i \le 100, i = 1, 2\\ &{}10 \le x_i \le 200, i = 3, 4 \end{array} \end{aligned}$$
(29)
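A minimal sketch of the pressure vessel model follows, assuming the first constraint reads \(-x_1 + 0.0193x_3 \le 0\) as in the commonly used form of this problem; the function name `pressure_vessel` is hypothetical.

```python
import math

def pressure_vessel(x):
    """Return (cost, constraints); a design is feasible when every constraint is <= 0."""
    x1, x2, x3, x4 = x  # shell thickness, head thickness, inner radius, cylinder length
    cost = (
        0.6224 * x1 * x3 * x4
        + 1.7781 * x2 * x3**2
        + 3.1661 * x1**2 * x4
        + 19.84 * x1**2 * x3
    )
    constraints = [
        -x1 + 0.0193 * x3,
        -x2 + 0.00954 * x3,
        -math.pi * x3**2 * x4 - (4.0 / 3.0) * math.pi * x3**3 + 1296000.0,
        x4 - 240.0,
    ]
    return cost, constraints
```

The third constraint enforces a minimum enclosed volume, so a design with adequate wall thickness can still be infeasible if the radius and length are too small.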

Table 16 shows that mBWO ranks first with the minimum mean value, which equals 179.1527. Figure 16 shows that all algorithms have nearly similar behavior. However, the mBWO reaches the minimum values in the final iterations. Table 17 shows the values of all the variables of this problem obtained by all the algorithms. It is seen that mBWO achieves competitive values for \(x_1\), \(x_3\), and \(x_4\). Also, it has the second minimum value for \(x_2\). Table 22 shows that the mBWO achieves a significant difference against all algorithms except the SMA algorithm.

Table 17 Statistical results of mBWO versus other metaheuristics on pressure vessel design problem
Table 18 Statistical results of mBWO versus other metaheuristics on cantilever beam design
Table 19 Statistical results of mBWO versus other metaheuristics on cantilever beam design
Fig. 17 Cantilever beam design problem

Table 20 Statistical results of mBWO versus other metaheuristics on multi-product batch plant problem
Fig. 18 Multi-product batch plant problem

5.2.7 Cantilever beam design

The proposed mBWO is applied to solve the cantilever beam design (CBD) problem, which has five parameters that need to be determined during the optimization process. The mathematical representation of this problem can be formulated as:

$$\begin{aligned}&\text {Minimize}\ f(x) = 0.6224(x_1 + x_2 + x_3 + x_4 + x_5) \nonumber \\&\text {Subject to:}\ g(x) = \frac{60}{x_1^3} + \frac{27}{x_2^3} + \frac{19}{x_3^3} + \frac{7}{x_4^3} + \frac{1}{x_5^3} -1 \le 0 \nonumber \\&\text {where}\ 0.01 \le x_i \le 100,\ i=1,2,3,4,5 \end{aligned}$$
(30)

In Table 18, the performance results of the mBWO for the CBD engineering problem are given. As per Table 18, the mBWO obtains the first rank with the smallest STD compared to other methods. In Table 19, the mBWO achieves competitive results regarding the problem’s variables. Figure 17 shows the convergence behavior of the mBWO and other methods over 1000 iterations for the CBD. Figure 17 shows that the mBWO reaches the minimum values in the final iterations.

Table 21 Statistical results of mBWO versus other metaheuristics on multi-product batch plant problem
Table 22 Wilcoxon results of mBWO versus other metaheuristics

5.2.8 Multi-product batch plant problem

In the multi-product batch plant problem, each customer places an order before production starts. Each customer’s order represents one product. The batch size of each order remains constant during production. A due date and a release date are assigned to each order. Each stage has its own processing units that are only operated at this stage. The objective function of this problem is to minimize the make-span. Also, other constraints must be taken into consideration, such as forbidden unit assignments, order due and release dates, and storage issues. The mathematical model of this problem is described extensively in Gupta and Karimi (2003). Equations (31)–(42) show the formulation of this problem. Equation (31) shows the constraint related to order assignment, which requires that each order i can only be processed on a single unit j in step s.

$$\begin{aligned} \sum _{j\in J_{is}} Z_{ij}=1,i\in I_j \end{aligned}$$
(31)

Equations (32)–(34) describe the order sequence on each unit. Equations (32) and (34) show that only one order can be the first order on each unit j. Equation (33) shows the sequence constraint for different orders i and \(i^{\prime }\) on the same unit j.

$$\begin{aligned}&\sum _{i\in I_j} ZF_{ij}\le 1 \end{aligned}$$
(32)
$$\begin{aligned}&\sum _{i^{\prime }\in I_s} X_{i^{\prime }is} +\sum _{j\in J_{is}} Z_{ij}=1, \quad i\in I_s \end{aligned}$$
(33)
$$\begin{aligned}&Z_{ij} \ge ZF_{ij}, \quad i\in I_j \end{aligned}$$
(34)

Equations (35) and (36) are unit assignment constraints. If the integer variable \(X_{ii^{\prime }s}\) or \(X_{i^{\prime }is}\) is activated, then order \(i^{\prime }\) and order i must be processed on the same unit j.

$$\begin{aligned}&2( X_{ii^{\prime }s}+X_{i^{\prime }is}) +\sum _{j\in J_{is}-J_{i^{\prime }s}} Z_{ij}\nonumber \\&\quad +\sum _{j\in J_{i^{\prime }s}-J_{is}} Z_{i^{\prime }j} \le 2, \quad i^{\prime }> i,\ (i^{\prime }, i)\in I_s \end{aligned}$$
(35)
$$\begin{aligned}&Z_{ij}\le Z_{i^{\prime }j}, \quad j\in J_{is}\cap J_{i^{\prime }s},\ (i^{\prime }, i)\in I_s \end{aligned}$$
(36)

Equations (37)–(41) show the order timing constraints. Equation (37) shows the timing constraint for one order in different steps. Equation (38) is the timing constraint for different orders on the same unit. If the unit release time \(UR_j\) or the order release time \(OR_i\) is considered, Eqs. (39) and (40) are invoked. Equation (41) represents the constraint for cases with due date \(DD_i\).

$$\begin{aligned}&T_{is^{\prime }} \ge T_{is}+\sum _{j \in J_{is}} Z_{ij}PT_{ij}, \quad s^{\prime }=ns_{is},\ s\in S_i \end{aligned}$$
(37)
$$\begin{aligned}&M(1-X_{ii^{\prime }s})+T_{i^{\prime }s}\ge T_{is} +\sum _{j \in J_{is}} Z_{ij}PT_{ij}, \quad s^{\prime }=ns_{is},\ s\in S_i \end{aligned}$$
(38)
$$\begin{aligned}&T_{is}\ge \sum _{j \in J_{is}} ZF_{ij}UR_j, \quad i\in I_s \end{aligned}$$
(39)
$$\begin{aligned}&T_{is}\ge OR_i, \quad s=fs_i \end{aligned}$$
(40)
$$\begin{aligned}&T_{is}+\sum _{j \in J_{is}} Z_{ij}t_{ij}\le DD_i, \quad i \in I,\ s=Is_i \end{aligned}$$
(41)

The objective function is to minimize the make-span and is formulated as follows:

$$\begin{aligned} \text {Make-span}=\min \left( \max \left( T_{is}+\sum _{j \in J_{is}} Z_{ij}PT_{ij}\right) \right) , \quad s=Is_i \end{aligned}$$
(42)

When the make-span is minimized with metaheuristic algorithms, all the constraints can be easily satisfied except Eq. (41). To handle Eq. (41), a penalty function is used, and Eqs. (43) and (44) are used to compute the objective function in this paper.

$$\begin{aligned}&d_i \ge \max \left( T_{is}+\sum _{j \in J_{is}} Z_{ij}PT_{ij}-DD_i,\ 0\right) , \quad i\in I,\ s=Is_i \end{aligned}$$
(43)
$$\begin{aligned}&\text {Objective function}=\min \left( \max \left( T_{is}+\sum _{j \in J_{is}} Z_{ij}PT_{ij}\right) \right) \nonumber \\&\quad + M\cdot \sum _{i \in {I}} d_i, \quad s=Is_i \end{aligned}$$
(44)

Equation (43) defines the penalization function and is applied to penalize violations of the due date constraint in Eq. (41). If the end time of an order is beyond the associated due date, Eq. (43) is activated, and the penalty term worsens the objective value in Eq. (44).
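The penalty mechanism of Eqs. (43) and (44) can be sketched for a schedule whose order completion times are already known. The completion time of an order stands for \(T_{is}+\sum Z_{ij}PT_{ij}\) at its last stage; the function names and the value of the large constant `big_m` are assumptions for illustration.

```python
def tardiness(completion_times, due_dates):
    """Per-order lateness d_i as in Eq. (43): zero when the order meets its due date."""
    return [max(c - d, 0.0) for c, d in zip(completion_times, due_dates)]

def penalized_makespan(completion_times, due_dates, big_m=1e6):
    """Make-span plus a large penalty on total lateness, following Eq. (44)."""
    makespan = max(completion_times)
    return makespan + big_m * sum(tardiness(completion_times, due_dates))
```

For a schedule that meets every due date the penalized value equals the plain make-span, so feasible schedules are always ranked ahead of late ones when `big_m` is large enough.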

Table 20 shows that the mBWO ranks second after the AEO. In Table 21, the mBWO has obtained competitive results compared with other algorithms regarding most of the problem variables. Table 22 shows that mBWO has a meaningful difference compared with other algorithms except the AEO. Figure 18 shows that the mBWO and some other algorithms have similar convergence behavior and reach the minimum values. However, the SAO and BWO show premature convergence behavior that indicates entrapment in local minima.

6 Conclusions and future work

This paper presents a modified beluga whale optimization (mBWO) which overcomes the limitations of the classical BWO. These limitations include slow convergence, the imbalance between exploration and exploitation, and falling into local optimal regions. The mBWO optimizer integrates three different strategies with the standard BWO: a transition factor for exploration–exploitation, a novel random control factor, and an elite evolution strategy. To test the optimizer fairly, mBWO is compared with 13 optimizers using 29 CEC’17 functions and eight other constrained problems. The results indicate the significance and effectiveness of the suggested optimizer.

However, like other optimizers, mBWO can still become trapped in local optimal areas. Moreover, as stated by the NFL theorem, mBWO is not able to solve all optimization problems.

In the future, a binary or multi-objective version of mBWO can be proposed to solve discrete and multi-objective problems. Also, mBWO can be applied to problems such as feature selection, scheduling, and knapsack.