1 Introduction

Due to the rapid development of various industries, people face increasingly complex optimization problems in real life. Conventional optimization techniques have limitations in resolving complex, large-scale optimization problems and cannot meet the requirements of convergence speed and calculation accuracy [1]. Compared with conventional optimization techniques, meta-heuristic algorithms (MAs) are flexible, simple, derivation-free, and able to avoid local optima [2]. Therefore, MAs have been widely used to resolve complex optimization issues in recent years. The research inspiration for MAs mainly comes from biological behavior or natural physical phenomena. According to the natural behaviors they simulate, MAs can be divided into three main classes: evolution-based algorithms (EAs), physics-based algorithms (PAs), and swarm-based algorithms (SAs) [3]. EAs are inspired by Charles Darwin’s theory of natural selection, in which the best individuals combine to produce better offspring [4]; their main representatives are genetic algorithms (GA) and differential evolution (DE). PAs imitate physical rules and chemical reactions of the universe, such as simulated annealing (SA) and the sine cosine algorithm (SCA) [5]. SAs simulate swarm behavior to solve optimization problems; examples include particle swarm optimization (PSO), the firefly algorithm (FA) [6], moth flame optimization (MFO) [7], the flame optimization algorithm (FOA) [8], the Harris hawks optimizer (HHO) [9], and the slime mould algorithm (SMA) [10]. SAs have become an important class of methods for solving optimization problems because of their excellent self-organization, self-adaptation, and self-learning characteristics. They have been adopted in various domains [11, 12], such as image segmentation [13], wireless networks [14], unmanned aerial vehicles [15], target tracking [16], neural networks [17], MRI classification [18], feature selection [19], engineering problems [20], and vehicle design [21, 22].

Mirjalili and Lewis proposed a novel meta-heuristic algorithm in 2016 [23], named the whale optimization algorithm (WOA), inspired by the foraging behavior of humpback whales. In recent years, WOA has attracted much attention and has been utilized to find optimal solutions in many fields. For example, based on chaotic and multi-swarm strategies, Wang et al. [24] developed an improved WOA and used it in two optimization scenarios. Chen et al. [25] introduced a chaotic local search strategy and Levy flight (LF) into WOA (BWOA) and applied it to solve three well-known problems in mathematical modeling studies. Abdel-Basset et al. [26] proposed two new variants of WOA based on a ranking method and a cyclic exploration–exploitation operator, named RWOA and HWOA, respectively, and applied them to identify the parameters of the three-diode photovoltaic model. Peng et al. [27] introduced a hybrid WOA based on Levy and migration strategies to improve the performance of cloud load forecasting. Chakraborty et al. [28] proposed a modified WOA variant and used it to solve problems in the engineering domain. Ye et al. [29] devised a novel modified WOA using LF and pattern search and applied it to the field of energy optimization. Mostafa et al. [30] studied a WOA-based liver segmentation method for magnetic resonance images. Chao et al. [31] embedded orthogonal crossover into WOA to improve its exploration ability and applied it to estimate the surface duct. Hassib et al. [32] combined WOA with bidirectional recurrent neural networks to train a deep learning approach. Darwish et al. [33] developed a novel WOA based on chaotic maps, which was used to select feature sets with high classification performance and a small number of features. Tripathi et al. [34] proposed a WOA variant for recommendation over large-scale datasets.

Although the original WOA has few parameters to adjust and strong convergence ability, it still suffers from slow convergence speed and low convergence accuracy. Therefore, many optimization schemes have been proposed to overcome these shortcomings. Hussien et al. [35] introduced a binary whale optimization algorithm based on two transfer functions. Chakraborty et al. [36] improved WOA in three aspects: the original algorithm’s parameters, the prey’s search range, and the inertia weights, effectively improving the performance of the algorithm while reducing its complexity. Luo et al. [37] improved WOA based on the regularity of chaos and the mutational character of Gaussian mutation. Saha et al. [38] proposed a cosine-adapted modified whale optimization by incorporating cosine parameters into the selection of control parameters. In the study [1], LF and ranking-based mutation operators were embedded into WOA, which prevents the algorithm from falling into local optima and helps it find the optimal solution quickly. Tu et al. [39] enhanced WOA with a communication mechanism and strategies from the biogeography-based optimization algorithm, overcoming the shortcomings of WOA in slow convergence speed and easy trapping into local optima. Heidari et al. [40] integrated a learning mechanism and hill-climbing local search into WOA to enhance the exploitation process of the original WOA, called BMWOA. Chen et al. [41] combined two strategies, random replacement and double adaptive weight, with WOA to improve its performance. In the study [42], the exploration and exploitation capabilities of WOA were enhanced by embedding the modified mutualism phase of symbiotic organisms search and the DE mutation operator. Wang et al. [43] integrated an elite strategy and the spiral motion from MFO into WOA. Elhosseini et al. [44] introduced an inertia weight strategy to tune the parameters of WOA nonlinearly and thereby strengthen its ability to find optimal solutions. Chakraborty et al. [45] improved the exploration ability of WOA using the DE mutation strategy and balanced the exploration and exploitation capabilities by introducing a new parameter. Wang et al. [46] used opposition-based learning and a global grid ranking mechanism to enhance WOA’s performance.

Although the above meta-heuristic algorithms, including WOA and its improved variants, have demonstrated their effectiveness in many optimization problems, there are still shortcomings in convergence speed and accuracy [11, 47]. The original WOA can be further improved when solving complex practical problems, especially in feature selection [48] and image segmentation [49]. In the late iterative stage, the exploitation operators of WOA produce little change in the locations of the search agents [44, 50], so when the algorithm is trapped in a local optimum, it cannot escape easily. As a result, WOA-based multi-threshold image segmentation and feature selection methods may lack the power to jump out of local optima and thus fail to achieve satisfactory optimization performance. In addition, the diversity of the population decreases with the number of evaluations [29, 41, 51]. The global exploration capability of WOA is insufficient, so some regions containing better solutions may never be found. Consequently, when WOA is applied to multi-threshold image segmentation and feature selection problems, it may miss the thresholds and feature subsets with better effects.

To this end, we designed an improved WOA, called QGBWOA, by combining a quasi-opposition-based learning (QOBL) strategy and the Gaussian barebone (GB) mechanism. To address the difficulty of WOA in jumping out of local optima in the later evaluation stage, the QOBL strategy is introduced. QOBL considers the opposite position of the current solution in the solution space; if the current solution falls into a local optimum, its opposite solution may help it break out and find a better solution, thus enhancing the local exploitation capability of the algorithm. The GB mechanism is introduced to address the decrease in population diversity in the late evaluations. It generates individual positions in the search region from a Gaussian distribution, which enhances population diversity and improves the algorithm’s global exploration ability, giving the algorithm a greater chance of moving to regions with better solutions.

In summary, the main contributions of this paper are as follows:

  1. A new enhanced WOA algorithm combining QOBL and GB, called QGBWOA, is proposed, with the experimental results showing that QGBWOA has higher accuracy and a faster convergence rate in obtaining global solutions.

  2. A QGBWOA-based wrapper feature selection method is proposed for tackling feature selection tasks.

  3. A QGBWOA-based image segmentation method combining 2D histograms with Kapur’s entropy is proposed and applied to real COVID-19 pathology images.

  4. QGBWOA achieves higher classification accuracy with a smaller number of features in the feature selection task and shows excellent performance in multi-threshold image segmentation on all three evaluation metrics: Peak Signal-to-Noise Ratio (PSNR) [52], Structural Similarity (SSIM) [53], and Feature Similarity (FSIM) [54].

The rest of this article is organized as follows. The original WOA is presented in Sect. 2. Section 3 describes the proposed QGBWOA algorithm. Section 4 analyzes the experimental results of QGBWOA in the benchmark function test. Section 5 provides the application of QGBWOA to the feature selection and image segmentation problems. The conclusion and future work are summarized in Sect. 6.

2 Whale Optimization Algorithm

WOA [23] simulates the hunting actions of whales: encircling prey, bubble net attack, and searching for prey.

2.1 Encircling Prey Phase (Exploitation)

In the exploitation phase, whales use bubble nets to attack their prey, including two models: shrinking encircling and spiral updating. In the encircling prey stage, the best search agent position obtained so far is selected as the optimal position, and other individuals gradually approach the best agent. Its mathematical model is given in Eqs. (1) and (2):

$$\begin{array}{c}{D}_{\mathrm{dist}}=\left|C\bullet {X}_{\mathrm{best}}(t)-X(t)\right|,\end{array}$$
(1)
$$\begin{array}{c}X\left(t+1\right)={X}_{\mathrm{best}}\left(t\right)-A\bullet {D}_{\mathrm{dist}},\end{array}$$
(2)

where \(X(t+1)\) is the position of the search agent in the next evaluation, \(X(t)\) is its position in the current evaluation, and \({X}_{\mathrm{best}}(t)\) is the best agent found so far. Let \(\mathrm{FEs}\) denote the evaluation counter; \(A\) and \(C\) are two control parameters defined as follows:

$$\begin{array}{c}A=2a\bullet {r}_{1}-a,\end{array}$$
(3)
$$\begin{array}{c}a=2-2\times \frac{\mathrm{FEs}}{\mathrm{MaxFEs}},\end{array}$$
(4)
$$\begin{array}{c}C=2{r}_{2},\end{array}$$
(5)

where \({r}_{1}\) and \({r}_{2}\) are two random numbers in [0, 1], \(\mathrm{MaxFEs}\) denotes the maximum number of evaluations of the algorithm, and \(a\) linearly decreases from 2 to 0 over the evaluations.

2.2 Spiral Updating Phase (Exploitation)

The spiral updating phase is realized by Eq. (6). \({O}_{\mathrm{dist}}=\left|{X}_{\mathrm{best}}\left(t\right)-X(t)\right|\) denotes the distance between the search agent in the current evaluation and the best agent obtained so far.

$$\begin{array}{c}X\left(t+1\right)={O}_{\mathrm{dist}}\bullet {\mathrm{e}}^{fl}\bullet \mathrm{cos}(2\uppi l)+{X}_{\mathrm{best}}\left(t\right),\end{array}$$
(6)

where \(f\) is a constant that controls the logarithmic spiral’s shape and is set to 1 according to the original text. The parameter \(l\) is a random number in [− 1, 1].

The spiral updating and shrinking encircling phases are each selected with 50% probability:

$$\begin{array}{c}X\left(t+1\right)=\left\{\begin{array}{cc}{X}_{\mathrm{best}}\left(t\right)-A\bullet {D}_{\mathrm{dist}}& \mathrm{pro}<0.5;\\ {O}_{\mathrm{dist}}\bullet {\mathrm{e}}^{fl}\bullet \mathrm{cos}(2\uppi l)+{X}_{\mathrm{best}}\left(t\right)& \mathrm{pro}\ge 0.5,\end{array}\right.\end{array}$$
(7)

where \(\mathrm{pro}\) is a random number between 0 and 1.

2.3 Search for Prey Phase (Exploration)

When \(A\) is less than − 1 or more than 1, whales use a random walk mechanism to search for prey based on the locations of other individuals. The mathematical model for the exploration phase is as follows:

$$\begin{array}{c}{D}_{\mathrm{dist}}=\left|C\bullet {X}_{\mathrm{rand}}(t)-X(t)\right|,\end{array}$$
(8)
$$\begin{array}{c}X\left(t+1\right)={X}_{\mathrm{rand}}(t)-A\bullet {D}_{\mathrm{dist}},\end{array}$$
(9)

where \({X}_{\mathrm{rand}}\left(t\right)\) represents a random search agent selected from the current population.
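To make the three update rules concrete, here is a minimal NumPy sketch of one WOA position update assembled from Eqs. (1)–(9); the function name and the omission of boundary handling are our simplifications, not part of the original algorithm description.

```python
import numpy as np

def woa_step(X, X_best, fes, max_fes, f=1.0):
    """One WOA position update over the whole population (Eqs. (1)-(9))."""
    N, dim = X.shape
    a = 2 - 2 * fes / max_fes                 # Eq. (4): a decreases from 2 to 0
    X_new = np.empty_like(X)
    for i in range(N):
        r1, r2 = np.random.rand(2)
        A = 2 * a * r1 - a                    # Eq. (3)
        C = 2 * r2                            # Eq. (5)
        if np.random.rand() < 0.5:            # Eq. (7), first branch
            if abs(A) >= 1:                   # search for prey, Eqs. (8)-(9)
                X_rand = X[np.random.randint(N)]
                D = np.abs(C * X_rand - X[i])
                X_new[i] = X_rand - A * D
            else:                             # shrinking encircling, Eqs. (1)-(2)
                D = np.abs(C * X_best - X[i])
                X_new[i] = X_best - A * D
        else:                                 # spiral updating, Eq. (6)
            l = np.random.uniform(-1, 1)
            O = np.abs(X_best - X[i])
            X_new[i] = O * np.exp(f * l) * np.cos(2 * np.pi * l) + X_best
    return X_new
```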

The flow chart of WOA is shown in Fig. 1.

Fig. 1 The flow chart of WOA

3 The proposed QGBWOA

In this section, the proposed QGBWOA will first be described by flowchart and pseudo-code. Then, the two strategies, QOBL and GB, will be described in detail. Finally, the time complexity of QGBWOA is analyzed.

3.1 Algorithm Overview

The flow chart of QGBWOA is shown in Fig. 2. We present the pseudo-code of the proposed QGBWOA in Algorithm 1.

Fig. 2 Flow chart of QGBWOA

The pipeline of QGBWOA is as follows. First, the population is randomly initialized. Then, the optimal individual in the current evaluation is identified based on fitness values, and the individual positions are updated. In the position-updating phase, incorporating the QOBL mechanism boosts the algorithm’s ability to find superior solutions, improving the convergence rate and the quality of the solutions. After the individual positions are updated, the GB strategy is used to update the population positions again. The increased population diversity enhances the exploration ability, so the frequency with which the algorithm falls into local optima is significantly reduced. The details of the QOBL and GB mechanisms are presented in the following subsections.

Algorithm 1 Pseudo-code of QGBWOA
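As a complementary, hedged sketch of Algorithm 1, the loop below strings the WOA update of Sect. 2 together with the QOBL and GB helpers sketched in Sects. 3.2 and 3.3. The greedy acceptance of the QOBL and GB candidates and the evaluation bookkeeping are our reading of the flowchart, not a verbatim reproduction of the authors’ implementation.

```python
import numpy as np

def qgbwoa(obj, lb, ub, dim, n=30, max_fes=300_000):
    """Sketch of the QGBWOA pipeline: WOA update -> QOBL -> GB."""
    X = lb + np.random.rand(n, dim) * (ub - lb)        # random initialization
    fit = np.array([obj(x) for x in X])
    fes = n
    while fes < max_fes:
        best = X[np.argmin(fit)].copy()
        # 1) original WOA position update (woa_step sketched in Sect. 2)
        X = np.clip(woa_step(X, best, fes, max_fes), lb, ub)
        fit = np.array([obj(x) for x in X])
        fes += n
        # 2) QOBL: keep quasi-opposite points that improve their agents
        Xq = np.clip(quasi_opposite(X, lb, ub), lb, ub)
        fq = np.array([obj(x) for x in Xq])
        fes += n
        better = fq < fit
        X[better], fit[better] = Xq[better], fq[better]
        # 3) GB: update the population again around the current leader
        best = X[np.argmin(fit)].copy()
        Xg = np.clip(gaussian_barebone(X, best), lb, ub)
        fg = np.array([obj(x) for x in Xg])
        fes += n
        better = fg < fit
        X[better], fit[better] = Xg[better], fg[better]
    i = np.argmin(fit)
    return X[i], fit[i]
```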

3.2 Quasi-Opposition-Based Learning

As shown in Eqs. (3) and (4), the magnitude of parameter \(A\) falls below 1 in the late stage of evaluation in the original WOA, and the encircling prey phase is executed. The position of a new individual depends only on the positions of the optimal individual and the current individual, so the new individual’s position changes little in the late evaluations, which may cause the search to fall into a local optimum. Therefore, QOBL [55] is adopted to enhance the local search ability of the original WOA in the late evaluation process and to reduce the frequency with which WOA falls into local optima. QOBL is an improved version of opposition-based learning (OBL) [56], which considers that the individual at the position opposite to the current individual may be closer to the optimum than the current individual. In recent years, the QOBL strategy has been used in MAs [57,58,59,60] to improve convergence speed and accuracy.

The mathematical model is depicted as follows:

$$\begin{array}{c}{x}_{j}^{\mathrm{qo}}=\mathrm{rand}\left[\left(\frac{{\mathrm{lb}}_{j}+{\mathrm{ub}}_{j}}{2}\right),{x}_{j}^{o}\right],\end{array}$$
(10)

where \({x}_{j}^{\mathrm{qo}}\) represents the quasi-opposite individual of the current search agent in the \(j\)-th dimension, \(\frac{{\mathrm{lb}}_{j}+{\mathrm{ub}}_{j}}{2}\) is the center of \([{\mathrm{lb}}_{j},{\mathrm{ub}}_{j}]\), \(\mathrm{rand}\left[\left(\frac{{\mathrm{lb}}_{j}+{\mathrm{ub}}_{j}}{2}\right),{x}_{j}^{o}\right]\) is a uniformly distributed random number between \(\frac{{\mathrm{lb}}_{j}+{\mathrm{ub}}_{j}}{2}\) and \({x}_{j}^{o}\), and \({x}_{j}^{o}\) denotes the opposite individual of the current search agent in the \(j\)-th dimension:

$$\begin{array}{c}{x}_{j}^{o}={\mathrm{lb}}_{j}+{\mathrm{ub}}_{j}-{x}_{j},\end{array}$$
(11)

where \({\mathrm{lb}}_{j}\) denotes the lower bound of the search space, \({\mathrm{ub}}_{j}\) denotes the upper bound of the search space, and \({x}_{j}\) denotes the position of the current individual in the \(j\)-th dimension.
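A compact NumPy rendering of Eqs. (10) and (11) for a whole population follows; the min/max pair simply handles the fact that the opposite point can lie on either side of the interval center.

```python
import numpy as np

def quasi_opposite(X, lb, ub):
    """Quasi-opposite population per Eqs. (10)-(11); lb/ub scalar or (Dim,)."""
    center = (lb + ub) / 2.0
    Xo = lb + ub - X                           # Eq. (11): opposite point
    lo = np.minimum(center, Xo)                # uniform draw between the
    hi = np.maximum(center, Xo)                # center and the opposite point
    return lo + np.random.rand(*X.shape) * (hi - lo)   # Eq. (10)
```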

3.3 Gaussian Barebone Mechanism

As mentioned earlier, in the later evaluation phase, the population diversity of WOA decreases, which can cause insufficient convergence speed and accuracy. The GB mechanism can help individuals choose the most suitable direction and continuously approach the optimal solution, avoiding premature convergence to a local optimum. Therefore, after the positions of all search agents have been updated, the randomness of GB is incorporated into WOA to enhance population diversity. This balances the algorithm’s local exploitation and global exploration capabilities and further improves the convergence speed.

The GB [61] strategy is based on bare-bones PSO (BBPSO) [62], and the parameter CR is employed in the GB strategy to guide each individual. If a randomly generated number is less than CR, a Gaussian distribution is used to update the individual’s position in the next evaluation; otherwise, the differential evolution idea is used to update the individual’s position. The GB strategy is as follows:

$$\begin{array}{c}{V}_{i,j}=\left\{\begin{array}{cc}\mathrm{G}\left(\frac{{P}_{\mathrm{Leader}}+{X}_{i,j}}{2},\left|{P}_{\mathrm{Leader}}-{X}_{i,j}\right|\right)& {r}_{3}<\mathrm{CR};\\ {X}_{t1,j}+{r}_{4}\times \left({X}_{t2,j}-{X}_{t3,j}\right)& \mathrm{otherwise},\end{array}\right.\end{array}$$
(12)

where \({V}_{i,j}\) denotes the position of the \(i\)-th individual in the \(j\)-th dimension, \({P}_{\mathrm{Leader}}\) denotes the global optimal position in the population, \({X}_{i,j}\) is the current individual in the \(j\)-th dimension, \(\mathrm{G}(\mu ,\sigma )\) represents a Gaussian distribution with the given mean and standard deviation, \({r}_{3}\) and \({r}_{4}\) are random numbers within [0, 1], and \({X}_{t1,j}\), \({X}_{t2,j}\), \({X}_{t3,j}\) are three arbitrarily selected individuals distinct from the current individual.
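The sketch below implements Eq. (12). The default CR = 0.5 and the per-dimension resampling of the three donor indices are our assumptions; the paper does not fix either choice at this point.

```python
import numpy as np

def gaussian_barebone(X, leader, cr=0.5):
    """Gaussian barebone update per Eq. (12)."""
    n, dim = X.shape
    V = np.empty_like(X)
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for j in range(dim):
            if np.random.rand() < cr:
                mu = (leader[j] + X[i, j]) / 2.0      # Gaussian mean
                sigma = abs(leader[j] - X[i, j])      # Gaussian std
                V[i, j] = np.random.normal(mu, sigma)
            else:
                # DE-style difference of three distinct random individuals
                t1, t2, t3 = np.random.choice(others, 3, replace=False)
                V[i, j] = X[t1, j] + np.random.rand() * (X[t2, j] - X[t3, j])
    return V
```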

3.4 Time Complexity of QGBWOA

The time complexity of QGBWOA is subject to the population size \((N)\), the number of dimensions \((\mathrm{Dim})\), and the maximum number of algorithm evaluations \((\mathrm{MaxFEs})\). Then, the overall time complexity is as follows:

  • The population size is \(N\). The time complexity of initializing all individual whales is \(O(N)\).

  • The time complexity of computing the population fitness and updating the position and fitness of the current optimal solution is \(\mathrm{MaxFEs}\times O(2N)\).

  • The primitive WOA search mechanism also changes the position of each search agent during the search process of QGBWOA. The time complexity of updating the positions of the search agents is \(\mathrm{MaxFEs}\times (O(N\times \mathrm{Dim})+5\times N)\).

  • Implementing the QOBL mechanism costs \(\mathrm{MaxFEs}\times O(N\times \mathrm{Dim})\).

  • Performing the GB strategy costs \(\mathrm{MaxFEs}\times O(N\times (\mathrm{Dim}+5))\).

Therefore, the total time complexity is O(QGBWOA) = O(initialization) + O(initial fitness and selection) + O(WOA) + O(QOBL strategy) + O(GB mechanism) = \(O(N)+\mathrm{MaxFEs}\times (O(2N)+O(N\times \mathrm{Dim})+5\times N+O(N\times \mathrm{Dim})+O(N\times (\mathrm{Dim}+5)))\), which simplifies to \(O(\mathrm{MaxFEs}\times N\times \mathrm{Dim})\).

4 Experimental Results and Discussion

In this section, the algorithm stability and strategy combination are analyzed, and experimental simulation results on the IEEE CEC 2014 and CEC 2020 benchmark functions are shown to verify the performance of QGBWOA comprehensively.

4.1 Benchmark Functions

CEC 2014 benchmark functions and CEC 2020 benchmark functions are used to verify the efficacy of QGBWOA. The details of the functions are shown in Appendix A.

For fairness of the experimental results, all tested algorithms were run in the same environment: the population size was 30, the maximum number of evaluations was set to 300,000, and each algorithm was run independently 30 times on each benchmark function. We used the Friedman test and the Wilcoxon signed-rank (WS) test to evaluate the experimental results.

4.2 Balance and Diversity Analysis on QGBWOA and WOA

In this section, QGBWOA is qualitatively analyzed on CEC 2014 in five aspects: search history, search trajectory, average fitness, population diversity, and the balance between exploration and exploitation.

The search history, search trajectory, and average fitness results are reported in Appendix B. In Fig. B.1, the first, second, third, and fourth columns show the three-dimensional landscapes of the corresponding functions, the historical search positions in two dimensions (2D), the trajectories of the search agents, and the average fitness of individuals, respectively.

The red dots in Fig. B.1(b) represent the global optimal solution, while the black dots represent the positions of the search agents. The figure clearly shows that as the number of evaluations increases, the black dots gradually approach the red dots to find the optimal solution. In Fig. B.1(c), the individual trajectory fluctuation is small on F2, while in the early evaluations of F4, F10, and F16 the trajectory fluctuations are relatively strong, which shows that QGBWOA can reach most of the search space. Figure B.1(d) shows that the average fitness decreases quickly on F2, while it fluctuates strongly during the early evaluations on F4 and F10. This indicates that QGBWOA can quickly determine the approximate range of the optimal solution in the early evaluations and further refine it in later evaluations to achieve accurate convergence.

Figure B.2 shows the results of the balance analysis of QGBWOA and WOA. The red, blue, and green curves represent exploration, exploitation, and the incremental–decremental curve, respectively. When the exploration effect is weaker than the exploitation effect, the green curve decreases, and vice versa. This is because the algorithm usually performs global exploration first, determines the approximate locations of high-quality solutions, and then performs local exploitation to find better solutions. As shown in Fig. B.2, at the beginning of the evaluation, the exploration curve always starts at a high value and the algorithm mainly performs global search; local exploitation soon dominates. It can also be observed that the exploration phase of QGBWOA ends earlier than that of WOA, indicating that QGBWOA spends more time on local exploitation in the target area.

Figure B.3 shows the results of the diversity analysis of QGBWOA and WOA. The population diversity is high at the beginning because the algorithm initializes the population randomly. As the evaluation progresses, the search range of the algorithm shrinks, and the population diversity decreases accordingly. It can be seen from Fig. B.3 that the average diversity of QGBWOA falls faster than that of WOA, which indicates that QGBWOA converges more quickly.

In summary, QGBWOA demonstrates remarkable advantages in terms of convergence speed and global search capability compared with WOA.

4.3 Ablation Study on QGBWOA

To demonstrate the influence of the QOBL and GB mechanisms, ablation experiments on QGBWOA were conducted. In Table 1, “Q” and “GB” denote the quasi-opposition-based learning mechanism and the Gaussian barebone mechanism, respectively; “1” means the corresponding mechanism is employed, and “0” means it is not. For example, QWOA indicates that WOA uses only the quasi-opposition-based learning mechanism and not the Gaussian barebone mechanism. Figure B.4 shows the convergence curves of QGBWOA, the two single-mechanism variants, and the original WOA on the CEC 2014 benchmark functions with Dim set to 30. The figure shows that QGBWOA is far superior to WOA in both convergence speed and accuracy, demonstrating the effectiveness of the GB mechanism in enhancing the global exploration capability of WOA. Overall, the experimental results show that QGBWOA, with both mechanisms combined, solves these functions best.

Table 1 The ablation experiment of QGBWOA

The WS and Friedman tests were used to calculate the statistical differences between the algorithms. The results are shown in Appendix A. Table A.3 records the average (Avg), standard deviation (Std), and average ranking value (ARV) for each algorithm. “+”, “−”, and “=” in Table A.3 indicate that QGBWOA is better than, inferior to, and equal to the compared algorithm, respectively. Table A.3 shows that QGBWOA ranks first among the two single-mechanism variants and the original WOA. On the benchmark functions F23–F25 and F27–F30, the Std values of QGBWOA are 0, which indicates that QGBWOA has good robustness. This is because the QOBL strategy improves the local exploitation ability of QGBWOA, while the GB strategy comprehensively improves its global exploration ability and population diversity, helping the algorithm find the global optimal solution. QGBWOA’s final average ranking is first, which indicates that the algorithm performs best when these two mechanisms work together.

Table A.4 shows the p values of the WS test. When a value is less than 0.05, QGBWOA performs remarkably better than its peer; such values are bolded in the table. From Table A.3, QGBWOA outperforms QWOA, GBWOA, and WOA on 20, 23, and 27 of the 30 benchmark functions, respectively. It can be seen that as the two proposed mechanisms are integrated, the performance improves step by step.

4.4 Comparison with Other Metaheuristic Algorithms on CEC 2014 Test

In this section, QGBWOA is compared with 7 MAs: WOA, DE [63], FA [6], FOA [64], PSO [65], SCA [5], and MFO [7]. Table A.5 shows the results on F1–F30 with Dim values of 10, 30, 50, and 100.

It can be found that QGBWOA performs best on F1 for Dim values of 10, 30, 50, and 100, while DE performs better on F2 and F3. Furthermore, compared with FOA, SCA, MFO, and WOA, QGBWOA is superior on all hybrid and composition functions for Dim values of 10, 30, 50, and 100. DE obtains the optimum on F15 in all four dimensions. QGBWOA ranks first overall when Dim is 30, 50, and 100; its average ranking value is 1.8, which is 26% better than the second-ranked DE and 63% better than the original WOA. The p values for Dim = 30 are shown in Table A.6, where the values of FA, FOA, and SCA on all functions are less than 0.05, indicating that QGBWOA significantly outperforms these algorithms.

Figure 3 reports the convergence curves and boxplots of each algorithm on 10 functions. In Fig. 3, F1 and F2 are unimodal functions; F4, F5, and F11 are multimodal functions; F17 and F18 are hybrid functions; and F23, F24, and F29 are composition functions. The first and third columns of the figure show the convergence curves, and the second and fourth columns show the corresponding box plots. On the unimodal, multimodal, and some hybrid functions, although QGBWOA did not find the best solution in the beginning phase, it converged to the optimal solution in later evaluations, indicating that QGBWOA has a good ability to avoid falling into local optima. On the composition functions, the convergence rate and accuracy of QGBWOA are significantly better than those of the other algorithms. In the box plots, the center marker of each box indicates the median, the lower and upper margins of each box are the 25th and 75th percentiles, and red “+” marks denote outliers. The box plots in Fig. 3 show that QGBWOA produces more stable optimization results and fewer outliers than the compared algorithms in most cases. These results confirm the greatly improved performance of QGBWOA compared to WOA and the other peers. The proposed QGBWOA performs better because of the QOBL strategy and the GB mechanism, whose impact is visible in the convergence plots of the benchmark functions. Observing the convergence curves of F1, F2, F11, and F17, the proposed algorithm does not stall in a local optimum at around 50,000 evaluations but continues to converge toward regions of higher solution quality, which illustrates that the QOBL strategy effectively improves the local exploitation ability and the accuracy of the solution. In the convergence plots of F18, F23, F24, and F29, the GB strategy improves the exploration ability of the population so that the global optimal solution can be located quickly, enabling faster convergence.

Fig. 3 Convergence curves and boxplots of QGBWOA and other meta-heuristics (Dim = 30)

4.5 Comparison with The State-of-The-Art WOA Variants on CEC 2014 Test

To confirm the efficacy of QGBWOA, we compared it with 6 WOA variants on the CEC 2014 test: ACWOA [44], CCMWOA [37], OBWOA [46], RDWOA [41], BMWOA [40], and BWOA [25].

Table A.7 shows the statistical results in different dimensions. QGBWOA obtains the smallest optimization results on the 30 test functions and ranks first for Dim values of 10, 30, 50, and 100. Compared with the second-ranked RDWOA, QGBWOA outperforms it on 8, 19, 19, and 21 functions in these four dimensions, respectively. Moreover, QGBWOA performs best in all four dimensions on functions F1, F2, F4–F7, F15, F17, F19, F21–F23, and F27–F30.

Table A.8 reports the WS test comparison between QGBWOA and the compared WOA variants when Dim is 30. The p values of BMWOA on all test functions except F14 are less than 0.05, which indicates that QGBWOA’s performance is superior to that of BMWOA. The p values of RDWOA, CCMWOA, and BWOA on F23, F25, F27, and F28 are 1, indicating that these variants can obtain the same optimization results as QGBWOA on those functions.

In Fig. 4, the convergence rate of QGBWOA is faster than that of the compared state-of-the-art algorithms on most of the benchmark functions. Its convergence accuracy is also the best, whereas the other WOA variants are trapped in local optima to varying degrees. Figure 4 also depicts the box plots of the fitness of the best individuals found in the final generation. These comparison results show that QGBWOA handles complex optimization problems better than these state-of-the-art algorithms and affirm its ability to solve benchmark problems in different dimensions.

Fig. 4 Convergence curves and boxplots of QGBWOA and other variant algorithms

4.6 Comparison of CEC 2020 Benchmark Functions

In this subsection, QGBWOA is compared with 7 swarm intelligence algorithms, namely WOA, HHO [9], SMA [10], HGS [66], SCA [5], RDWOA [41], and ACWOA [44], on the CEC 2020 benchmark functions with Dim equal to 30. Table A.9 reports the statistical results of all optimizers in terms of Avg and Std. QGBWOA outperforms the compared optimizers on 9, 8, 4, 3, 10, 6, and 8 of the 10 CEC 2020 benchmark functions, respectively, and ranks first overall with an ARV of 1.4.

The p value results of QGBWOA against the compared optimizers are shown in Table A.10, where p values greater than 0.05 are shown in bold. The results show that QGBWOA differs significantly from the compared optimizers on most of the tested functions except F9 and F10.

4.7 Statistical Analysis of QGBWOA

Because the Friedman test can only indicate whether there is a performance difference among the algorithms, a post hoc test is needed to locate the statistical differences between them. Commonly used follow-up tests include the Nemenyi and Bonferroni–Dunn tests [67]. The Nemenyi test compares the performance of all algorithms with each other, while the Bonferroni–Dunn test compares one algorithm against the rest. Bonferroni–Dunn post hoc statistical analysis is used in this article to verify the performance difference between QGBWOA and the compared algorithms: two algorithms differ significantly if the difference in their average ranks exceeds the critical difference (CD). The CD is defined in Eq. (13).

$$\begin{array}{c}CD={q}_{\alpha }\sqrt{\frac{k\left(k+1\right)}{6\mathrm{Num}}},\end{array}$$
(13)

where \(\alpha\) denotes the significant level, \({q}_{\alpha }\) is the critical value, \(k\) is the number of algorithms, and \(\mathrm{Num}\) represents the number of test functions.

In the experiment in Sect. 4.4, eight algorithms were chosen, so \(k\) is 8, and thirty benchmark functions were used, so \(\mathrm{Num}=30\). The significance levels \(\alpha\) were set to 0.05 and 0.1. According to Eq. (13), the corresponding CD values are 1.70 and 1.55, respectively. The average rank of QGBWOA is \({\mathrm{ARV}}_{\mathrm{QGBWOA}}=1.70\) when Dim is 30. When the average rank of a compared algorithm is greater than \(\mathrm{CD}+{\mathrm{ARV}}_{\mathrm{QGBWOA}}=3.40/3.25\) \((\alpha =0.05/0.1)\), there is a significant difference between QGBWOA and that algorithm. In Fig. 5, the solid line represents the threshold at significance level 0.1, and the dotted line represents the threshold at significance level 0.05. As shown in Fig. 5, QGBWOA outperforms FA, FOA, PSO, SCA, MFO, and WOA at both significance levels, while QGBWOA and DE show no remarkable difference in performance in the Bonferroni–Dunn test.
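As a quick check, substituting \(k=8\) and \(\mathrm{Num}=30\) into Eq. (13), with the standard Bonferroni–Dunn critical values \({q}_{0.05}=2.690\) and \({q}_{0.10}=2.450\) for eight algorithms, reproduces the thresholds quoted above:

$$\mathrm{CD}_{0.05}=2.690\sqrt{\frac{8\times 9}{6\times 30}}=2.690\sqrt{0.4}\approx 1.70,\quad \mathrm{CD}_{0.10}=2.450\sqrt{0.4}\approx 1.55.$$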

Fig. 5 Bonferroni–Dunn test results of experiments in Sect. 4.4 (Dim = 30)

The post hoc test analysis is also performed for the experiment in Sect. 4.5, with the results shown in Fig. 6. Because eight algorithms were tested on 30 functions, \(k\) is 8 and \(\mathrm{Num}\) is 30; the CD values are again 1.70 and 1.55 for \(\alpha\) values of 0.05 and 0.1, respectively. The figure shows that QGBWOA is significantly better than ACWOA, BMWOA, CCMWOA, BWOA, OBWOA, and WOA at both significance levels and exhibits no obvious difference from RDWOA in the Bonferroni–Dunn test.

Fig. 6 Bonferroni–Dunn test results of experiments in Sect. 4.5 (Dim = 30)

5 Applications

In this section, we will show the applications of QGBWOA in feature selection and image segmentation.

5.1 Feature Selection Based on Proposed QGBWOA

5.1.1 Binary QGBWOA

Data mining technology has been widely used in medicine in recent years. Most medical data are high-dimensional, and extracting a subset of useful features from high-dimensional data is difficult, so the dimensionality of the dataset needs to be reduced before data mining. Feature selection is one such dimensionality reduction method. However, it is often infeasible to exhaustively enumerate the combinations of feature subsets, especially for high-dimensional data, and heuristic algorithms can solve this problem well [68]. As heuristic methods, SAs have the advantages of intelligent selection and random search, which allow them to find an ideal solution set. In recent years, the combination of SAs and feature selection methods has received more and more attention from researchers [69,70,71,72]. In this subsection, a QGBWOA-based wrapper feature selection method is proposed. Twenty-four datasets from the UCI machine learning repository are utilized to measure the effectiveness of the proposed method.

Since the search space of the feature selection problem is binary, the real value of each solution found by QGBWOA must be converted into a Boolean value. The conversion function is defined as follows:

$$\begin{array}{c}{X}_{i,j}^{t+1}=\left\{\begin{array}{cc}1,& \text{if }{r}_{5}\ge T\left({X}_{i,j}^{t}\right);\\ 0,& \text{otherwise,}\end{array}\right.\end{array}$$
(14)
$$\begin{array}{c}T\left(x\right)=\frac{1}{1+{\mathrm{e}}^{-2x}},\end{array}$$
(15)

where \({r}_{5}\) is a random number between 0 and 1. If \({X}_{i,j}=1\), the \(j\)-th dimension of solution \({X}_{i}\) is regarded as a relevant feature; otherwise, if \({X}_{i,j}=0\), it is regarded as an irrelevant feature. Since feature selection aims to obtain better classification accuracy with fewer features, the classifier error rate and the number of selected features together form the fitness function, defined as follows:

$$\begin{array}{c}\mathrm{Fitness}=\theta \cdot \mathrm{Err}+\upsilon \cdot \frac{S}{T},\end{array}$$
(16)

where \(\mathrm{Err}=1-\mathrm{Acc}\) denotes the classification error rate of the K-nearest neighbors (KNN) classifier, with \(\mathrm{Acc}\) its classification accuracy; \(\theta\) and \(\upsilon\) represent the weight coefficients of the classifier error rate and the number of selected features, respectively; \(S\) is the number of selected features; and \(T\) is the total number of features in the dataset.
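A minimal sketch of Eqs. (14)–(16) for one candidate solution is given below, using scikit-learn’s KNN with \(K=1\) and ten-fold cross-validation as in the experiments; the weight values \(\theta =0.99\) and \(\upsilon =0.01\) and the empty-subset guard are our assumptions, not values taken from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def binarize(x_real):
    """S-shaped transfer of a real-valued position to bits, Eqs. (14)-(15)."""
    T = 1.0 / (1.0 + np.exp(-2.0 * x_real))
    return (np.random.rand(*x_real.shape) >= T).astype(int)

def fitness(bits, data, labels, theta=0.99, upsilon=0.01):
    """Wrapper fitness of Eq. (16): weighted error rate plus feature ratio."""
    if bits.sum() == 0:                 # guard: an empty subset is worst case
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=1)
    acc = cross_val_score(knn, data[:, bits == 1], labels, cv=10).mean()
    return theta * (1.0 - acc) + upsilon * bits.sum() / bits.size
```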

The overall pipeline is shown in Fig. 7. First, the dataset is preprocessed. Second, ten-fold cross-validation divides the dataset into ten parts, nine of which are used as training data and the remaining one as test data. Before QGBWOA searches for the best feature combination, the number of attributes of the dataset is set as the dimension of the population in QGBWOA. The KNN classifier evaluates the accuracy of the selected features, and the fitness of the population is calculated. Then, QGBWOA updates the positions of the population in the discrete search space. After the specified number of evaluations is reached, an optimal feature subset is obtained, and the KNN classifier evaluates its classification accuracy. Finally, the best subset of features is returned.

Fig. 7 The architecture of feature selection using the binary QGBWOA

5.1.2 Experiment on Feature Selection

Twelve low-dimensional and twelve high-dimensional datasets are used to examine the efficacy of the proposed approach. Both categories of datasets are chosen from the University of California, Irvine (UCI) Machine Learning Repository [73]. Table 2 describes the datasets in terms of the number of samples, features, and classes. The low-dimensional datasets contain fewer than 350 features, and most of the high-dimensional datasets have more than 5000 features. QGBWOA was compared with bMFO [74], BSMA [75], BSO [76], bWOA [77], and BDE [78]; Table 3 reports their parameter settings. The datasets are divided by ten-fold cross-validation [79]. The wrapper feature selection method is based on the KNN (\(K=1\)) classifier [80]. The maximum number of iterations is set to 50.

Table 2 Datasets of the feature selection experiment
Table 3 Parameter setting of compared algorithms

Appendix A reports the results of QGBWOA and the other methods in terms of average classification error, number of selected features, fitness values, and time cost. Tables A.11–A.13 and Tables A.15–A.17 show the fitness, number of selected features, and average classification error on the low-dimensional and high-dimensional datasets, respectively. It can be observed that QGBWOA significantly outperforms the compared algorithms in fitness and selects a smaller number of features, ranking first overall. Tables A.14 and A.18 show the timing results on each dataset. They indicate that the time cost of the proposed QGBWOA is higher than that of compared algorithms such as BSMA and BSO; to a certain extent, embedding the QOBL and GB strategies increases the time overhead of the proposed algorithm.

Figure B.5 shows the convergence curves of QGBWOA and the other five algorithms on the 12 low-dimensional datasets. QGBWOA performs well on all 12 datasets: its converged fitness reaches the minimum, indicating that QGBWOA obtains higher classification accuracy than the compared algorithms. Furthermore, Fig. B.6 shows that the converged fitness of QGBWOA on the 12 high-dimensional datasets is smaller than that of the compared methods on all datasets except Brain_tumor2 and Lung_cancer.

5.2 Image Segmentation Based on Proposed QGBWOA

5.2.1 Proposed Image Segmentation Method

Image segmentation is a fundamental technique in a variety of image processing applications. It divides an image into multiple discrete regions according to the image’s characteristics, such that pixels are continuous or similar within the same region and there is obvious contrast between different regions [81]. The multi-threshold method is a commonly used segmentation approach that uses multiple thresholds to mark target regions of interest in an image, and the choice of thresholds has a direct impact on the segmentation effect. One often-considered approach is histogram based. The histogram describes the frequency of each gray value in the image: a one-dimensional histogram only reflects the distribution of pixel gray levels, while a 2D histogram also reflects the spatial correlation between a pixel and its neighborhood. Abutaleb et al. [82] proposed an image segmentation method combining the local average gray level with the original gray histogram. Kapur’s entropy can be used to evaluate the optimal thresholds: it divides the image into different classes and measures class homogeneity by entropy, finding the optimal thresholds by maximizing the fitness value [83]. Kapur’s entropy is chosen in this article to determine the \(n\) best thresholds. Let \([{T}_{1},{T}_{2},{T}_{3},\cdots ,{T}_{n}]\) be the set of thresholds dividing the image into \(n+1\) classes; the formulation is:

$$\begin{array}{c}F\left({T}_{1},{T}_{2},{T}_{3},\cdots ,{T}_{n}\right)={H}_{0}+{H}_{1}+\cdots +{H}_{n},\end{array}$$
(17)
$$\begin{array}{c}{H}_{0}=-{\sum }_{s=0}^{{T}_{1}-1}\frac{{p}_{s}}{{\omega }_{0}}\mathrm{ln}\frac{{p}_{s}}{{\omega }_{0}},\quad {\omega }_{0}=\sum_{s=0}^{{T}_{1}-1}{p}_{s},\end{array}$$
(18)
$$\begin{array}{c}{H}_{1}=-{\sum }_{s={T}_{1}}^{{T}_{2}-1}\frac{{p}_{s}}{{\omega }_{1}}\mathrm{ln}\frac{{p}_{s}}{{\omega }_{1}},\quad {\omega }_{1}=\sum_{s={T}_{1}}^{{T}_{2}-1}{p}_{s},\end{array}$$
(19)
$$\vdots$$
$$\begin{array}{c}{H}_{n}=-{\sum }_{s={T}_{n}}^{L-1}\frac{{p}_{s}}{{\omega }_{n}}\mathrm{ln}\frac{{p}_{s}}{{\omega }_{n}},\quad {\omega }_{n}=\sum_{s={T}_{n}}^{L-1}{p}_{s},\end{array}$$
(20)
$$\begin{array}{c}{p}_{s}=\frac{h(s)}{P},\quad s=0,1,\cdots ,L-1,\end{array}$$
(21)

where \(P\) is the total number of pixels in the image, \(L\) is the total number of gray levels, \(s\) is a gray level, \(h(s)\) is the number of pixels with gray level \(s\), \({p}_{s}\) is the probability of the \(s\)-th gray level, \({H}_{0},{H}_{1},\cdots ,{H}_{n}\) denote the Kapur’s entropies of the corresponding classes, and \({\omega }_{0},{\omega }_{1},\cdots ,{\omega }_{n}\) denote the probabilities of the corresponding classes.
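A short sketch of the objective in Eqs. (17)–(21) follows; for readability it operates on the 1D gray histogram, whereas the proposed method actually maximizes this entropy over a 2D histogram built from pixels and their neighborhood means.

```python
import numpy as np

def kapur_entropy(gray_img, thresholds, L=256):
    """Kapur's entropy, Eqs. (17)-(21), for a threshold set on a gray image."""
    hist, _ = np.histogram(gray_img, bins=L, range=(0, L))
    p = hist / hist.sum()                                  # Eq. (21)
    bounds = [0] + sorted(int(t) for t in thresholds) + [L]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):            # one class per band
        w = p[lo:hi].sum()                                 # class probability
        if w > 0:
            q = p[lo:hi][p[lo:hi] > 0] / w
            total -= (q * np.log(q)).sum()                 # class entropy H_k
    return total                                           # to be maximized
```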

Enumerating all combinations of thresholds and selecting the optimal one is quite difficult, and the time complexity grows exponentially with the number of thresholds [84]. Using MAs to find the optimal thresholds has therefore attracted attention in recent years [85,86,87,88]. An improved butterfly optimization algorithm was proposed by Sharma et al. [89] for image segmentation problems. Chakraborty et al. [90] introduced an enhanced version of WOA to tackle image segmentation problems. A new pulse-coupled neural network model based on the grey wolf optimizer was proposed by Wang et al. [88] for medical image segmentation. Zhao et al. [91] improved the salp swarm algorithm (SSA) and used it to find optimal segmentation thresholds for images.

In this subsection, we put forward a QGBWOA-based multi-threshold image segmentation method by integrating the 2D histogram with Kapur’s entropy. QGBWOA is utilized to find the optimal set of thresholds, with Kapur’s entropy as the objective function, and the image is segmented according to the resulting thresholds. The detailed flowchart of the method is given in Fig. 8.

Fig. 8 Flow chart of multi-threshold image segmentation using QGBWOA

5.2.2 Simulation Experiment

We selected six COVID-19 patients’ images collected by Cohen et al. [92] as the segmentation images. These six images are named A, B, C, D, E, and F in this experiment. Computed tomography (CT) of the lungs of COVID-19 patients often shows high-gray diffuse ground-glass and pulmonary nodular shadows. This is because the COVID-19 virus enters the pulmonary bronchi and further invades the alveolar epithelial cells, where it replicates itself. The rapid replication of the virus leads to significant swelling of the epithelial cells, which will appear as high-gray shadows on CT images with significant gray differences from the normal lung parenchyma.

Figure 9 shows the original images and their 2D histograms. The population size of the algorithm in this experiment was set to 20, the number of iterations to 100, and the image size to 512 × 400. The performance of the QGBWOA-based multi-threshold image segmentation method was evaluated at various threshold levels, including low levels (4 and 6) and high levels (10 and 15). We compared it with multi-threshold segmentation methods based on other optimizers, including WOA, the cuckoo search (CS) algorithm, MFO, SSA, and biogeography-based learning PSO (BLPSO). The experimental results are evaluated by three indicators: PSNR, SSIM, and FSIM.

Fig. 9 The original images and their 2D histograms
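For reference, PSNR and SSIM between an original image and its threshold-quantized result can be computed with scikit-image as sketched below; FSIM has no scikit-image implementation, so it is omitted here, and the 8-bit data range is an assumption.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_segmentation(original, segmented):
    """PSNR and SSIM between the original and the segmented 8-bit image."""
    psnr = peak_signal_noise_ratio(original, segmented, data_range=255)
    ssim = structural_similarity(original, segmented, data_range=255)
    return psnr, ssim
```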

The sets of ideal segmentation thresholds found by the proposed algorithm for these images are shown in Fig. 10. The solid line in Fig. 10 represents the grayscale histogram, and the dotted lines represent the ideal threshold set produced by QGBWOA; because the threshold level is 10, there are ten such lines in each picture.

Fig. 10 The optimal thresholds of the images at threshold level 10

The Avg and Std values of PSNR, SSIM, and FSIM at each threshold level are shown in Appendix A, where the optimal results are bolded. The results in Tables A.19–A.21 show that QGBWOA achieves optimal results at most threshold levels. The mean values of the overall rankings on these evaluation metrics are shown in Tables 4, 5 and 6; the mean overall ranking of QGBWOA is the smallest, which confirms the strong competitiveness of the proposed method. The best Kapur’s entropy results obtained by the proposed method are given in Appendix A: Table A.22 shows that QGBWOA has a noticeable superiority in searching for the optimal value of Kapur’s entropy compared with the other algorithms. Figure 11 illustrates the final segmentation results at threshold level 10. Based on the thresholds found by QGBWOA, each image is segmented into blocks of pixels with different gray values and sharper borders, which is useful for evaluating suspected cases of COVID-19.

Table 4 PSNR average results rankings
Table 5 SSIM average results ranking
Table 6 FSIM average results ranking
Fig. 11 The segmentation results of all images obtained by QGBWOA at threshold level 10

6 Conclusion and Future Work

In this paper, a new WOA variant, QGBWOA, based on the QOBL and GB strategies has been proposed to remedy the inadequacies of the original WOA. The QOBL strategy is introduced to strengthen the local exploitation ability and to assist the proposed method in jumping out of local optima. The GB strategy is employed to balance the algorithm’s exploitation and exploration capabilities and to help the algorithm find regions with better solutions. QGBWOA was tested on the CEC 2014 and CEC 2020 benchmark functions in different dimensions, and its performance was compared with basic methods and state-of-the-art WOA variants. The experimental results show that QGBWOA can provide optimal solutions and effectively avoid premature convergence. Finally, the ability of QGBWOA to solve real-world problems was validated by the feature selection and multi-threshold image segmentation applications.

The increase in time complexity is an inherent result of embedding additional mechanisms, and reducing it is part of our next work. In the future, parallel computing techniques [93] could be adopted to reduce the time complexity while preserving the performance of QGBWOA. Additionally, we would like to apply QGBWOA to other fields, such as multi-objective optimization [94], fuzzy optimization [95], and dynamic optimization [96].

7 Data Availability Statement

The data involved in this study are all public data, which can be downloaded through public channels.