1 Introduction

1.1 Introduction to meta-heuristics

Optimization through meta-heuristics has emerged as a prominent trend for problem-solving and systematic resource management in multi-disciplinary research and real-world scenarios with numerous applications. It has been adopted and encouraged by researchers and practitioners who appreciate its simplicity and efficacy in solving complex problems [1, 10, 11, 13, 36, 45]. The development of the Genetic Algorithm (GA) [38] and Particle Swarm Optimization (PSO) [74] marked a watershed in the history of optimization, with a myriad of other techniques soon to follow. These techniques iterate to determine the best solutions with respect to the objective function as they explore a complex labyrinth of peaks and valleys known as the “search landscape” or “search space”. Because they require only a fraction of the problem knowledge otherwise needed to determine solutions, even when the state of the problem is uncertain, meta-heuristics remain at the forefront of the many research avenues that strive to improve existing and forthcoming systems [45].

1.2 A review of meta-heuristics in multimedia applications and artificial intelligence

Recent literature [11, 23, 71] provides an outlook on the widespread applications of meta-heuristic-based stochastic optimizers in multimedia tools and artificial intelligence (AI). The adoption of meta-heuristic solvers coupled with IoT [58] in big-data analytics [52], block-chain [11], video games [10], artificial intelligence [24], feature selection [15], machine learning [14, 53], and deep learning has gained immense popularity on account of their simplicity, accuracy in training and testing, and robustness to local entrapment [7, 58]. Figure 1 provides a classification of various areas within the realm of multimedia and AI that adopt meta-heuristic algorithms.

Fig. 1

Application of meta-heuristic optimization to various domains in multimedia and AI

Moreover, multiple other domains, including medicine and the health sector, increasingly rely on AI and multimedia tools to improve accuracy and efficiency when working with large datasets. Recent examples from the literature include a multi-feature fusion convolutional neural network (CNN) framework to represent complex morphology and gene expression patterns [68], and fall prediction based on key points of human bones, using a bone map dataset and a CNN, to prevent injury to the elderly [70]. Computing systems for image and vision integrate AI to enhance detection capabilities; these include a neural network-based edge-oriented framework for saliency detection enhancement in complex images [69] and Deep Convolutional Generative Adversarial Networks (DCGAN) with the TensorFlow deep learning framework for virtual face generation [31]. AI-based prediction in financial engineering has attracted increasing interest, as it helps to track and predict the trends of various markets and to design strategic products and services that maximize profits. For example, the work in [67] developed a novel mobile personalized recommendation method based on the money flow model for the stock exchange to provide investors with reliable, practical investment guidance and higher returns. Machine learning strategies based on two-dimensional numerical models in financial engineering [66], which reduce the prediction error and improve forecasting precision for major U.S. stock market indices, fall under the same category.

The efficacy of meta-heuristics in multimedia tools and AI has been well-researched and documented in the literature. Compared to the traditional solvers such as the gradient-descent method or back-propagation method where the initial solutions play a crucial role in the outcome of optimization, the stochasticity of meta-heuristics forms its major strength in exploring the various possibilities of solution combinations with very little to no dependence on the initial guess. Examples include (i) a continuation approach for training Artificial Neural Networks (ANNs) with meta-heuristics by J. R-Delado et al. in [55] including Particle Swarm Optimization (PSO), Firefly Algorithm (FA) and Cuckoo Search. The execution times were lowered by about 5–30% without statistically significant loss of accuracy for the public benchmark datasets and this was achieved by the accelerated convergence of the meta-heuristics. (ii) Automated fine-tuning compiler heuristics through meta-optimization and machine learning to reduce compiler design complexity and tedium of heuristic tuning is implemented in [61]. The resulting framework improved the average speedup of the heuristic compilation by 23% with an average performance improvement of 25% on the training set and 9% on the test set. (iii) In other works, ANN training integrating a hybrid meta-heuristic combining the exploration and exploitation capabilities of Invasive Weed Optimization and Differential Evolution (DE) was realized in [46]. Benchmarked against the 5 and 10-layered multi-layer perceptron training, the hybrid algorithm lowered the training and testing errors by about 5 to 10% for the three different datasets with a faster rate of convergence. (iv) In a similar development, S. Benabderrahmane in 2017 [5], combined machine learning and swarm intelligence for real-time object detection and tracking to accelerate time processing and enhance the extraction efficiency of the classifier. 
Experiments with genetic algorithms (GA), particle swarm optimization (PSO), random walk and a novel hybrid combination of these methods showed significant improvements in computation time, efficiency and accuracy. (v) Interactive software design incorporating meta-heuristic algorithms for search engines with user-provided evaluation and rating systems to develop Interactive Evolutionary Algorithms (IEAs) is implemented in [59]. A comparative analysis between greedy local search, an evolutionary algorithm and ant colony optimization (ACO) showed that ACO-based interactive search outperformed the other two for the software design problem.

Examples of the integration of meta-heuristics for the optimization of multimedia tools include (i) an optimized network flow wavelet-based image coding for multipath selection to maximize the received multiple description codings (MDCs) in a lossy network model [28]. The multi-objective optimization problem was tackled using GA- and PSO-based simulations for various random network topologies, and PSO delivered the best routings with reduced packet loss and increased throughput. (ii) In other works, a video watermarking scheme for anti-piracy protection incorporates the Squirrel Search Algorithm (SSA) with constraints on video quality and other thresholds [6]. The embedded watermarking scheme applied the frame selection method to five different videos, and the proposed SSA-based framework recorded a Peak Signal-to-Noise Ratio (PSNR) of 71.06 dB, outperforming eight methods from the literature. (iii) In similar developments, phishing website detection integrating a support vector machine (SVM) and an Improved Spotted Hyena Optimization (ISHO) algorithm is proposed to select proper features for classifying phishing websites [56]. The ISHO-based SVM achieved higher classification accuracy than the PSO-, FA- and bat algorithm-based SVM classifiers.

1.3 Challenges associated with meta-heuristic algorithms

Although the efficacy of swarm-based nature-inspired optimization algorithms is attributed to their varied search mechanisms, the same mechanisms are prone to a number of problems that must be addressed for these algorithms to reach their potential. The search mechanisms devised to mimic processes in nature, which form the core of any nature-inspired meta-heuristic, may not be competent at optimizing every class of optimization problem. Even though they are meticulously crafted, the conflicting requirements of exploration and exploitation have eluded researchers and motivated them towards the realization of a near-perfect optimization strategy. On this basis, there has lately been a rush to develop improved variants of nature-inspired algorithms suited to the search landscape of the problem at hand. Simultaneously, there has been an upsurge in publications on improved variants that tackle the following aspects.

  • The deterioration in the performance (stagnation of fitness or sluggish convergence characteristics) of an optimization technique with an increasing number of problem dimensions is more often than not ascribed to “the curse of dimensionality”, a term coined by Richard E. Bellman. The underlying reason is that the number of possible combinations of decision-variable values grows rapidly with the number of dimensions, yet the fitness of the candidate solutions must be computed within a preset number of function evaluations, which can result in solutions very far from the global optimum.

  • Swarm intelligence optimization algorithms inherently face the conflicting requirements of exploration (global search) and exploitation (local search). An ideal trade-off between exploration (diversification) and exploitation (intensification) is needed such that the algorithm can recognize when to explore further and when to improve the existing solutions [2, 23].

  • Another issue addressed through the improvement of meta-heuristics is the coordination of the tuning criteria (otherwise called “algorithm-specific parameters”). Tuning several such parameters to extract the best possible performance is often tedious and time-consuming, and improper tuning has often been the major reason for the algorithms’ failure.

  • Algorithms excelling at unconstrained cases may not perform equally well on a constrained problem and, similarly, algorithms with quick convergence characteristics may not deliver the best optimality compared to others. Furthermore, algorithms designed to explore efficiently over complex landscapes may not be efficient at local search and vice versa. This is commonly referred to as the “No free lunch theorem” [64], which states that a perfect optimization algorithm is not practically realizable and that no meta-heuristic can deliver the best performance for every optimization task.

To address the aforementioned issues and extract the best performance for the chosen optimization problem, several improved/upgraded meta-heuristics have been proposed and have gained significance owing to their superior performance in terms of optimality, consistency and robustness [8]. The improvement of a meta-heuristic algorithm for a specific application is predominantly done through the introduction and empirical establishment of special techniques/operators that advance the exploration to newer areas within the search space while at the same time balancing the exploitation/local search. This procedure of altering a meta-heuristic to upgrade its existing abilities for better and more robust performance is known as “improving”, “enhancing” or “modifying” [9]. Researchers frequently turn to improving existing meta-heuristics to achieve a near-perfect compromise between exploration and exploitation. Adaptive techniques that adjust dynamically to the search landscape reduce the need for the tedious tuning process, and restructuring the algorithms helps eliminate their weaknesses in dealing with complex optimization tasks. This is represented in Fig. 2.

Fig. 2

Infographic depicting the classification, acclimatization and improvisation in meta-heuristic optimization algorithms

1.4 Contributions of the proposed work

In this work, the Grey Wolf Optimizer (GWO) has been studied extensively, its merits and demerits have been analysed, and an improved version is realised to overcome the various shortcomings associated with it. The proposed algorithm has been named Competitive-learning GWO (Clb-GWO) and it employs four major modifications.

  1. Dual search mechanisms with new techniques for global and local search have been devised and arranged in a selective complementary fashion.

  2. Population sub-grouping into major and minor groups is considered to dedicate sections of the population to learn and adapt with respect to the problem landscape.

  3. Novel difference vectors are designed to promote population diversity and prevent population stagnation.

  4. Non-linear hunting and competitive learning strategies are formulated and integrated systemically with adaptive mechanisms to achieve equilibrium between exploration and exploitation.

The underlying reasons for the choice of GWO over the other optimization paradigms are as follows. (1) GWO is one of the most successful state-of-the-art optimization techniques, with strong performance in multi-disciplinary applications where it has outperformed other paradigms, as outlined in the literature survey. (2) The simple structure of GWO is easy to implement in any programming language of choice and can be deployed to various optimization problems in accordance with the researchers’ interests. (3) There exists a plethora of publications wherein the performance of GWO has been greatly improved through either application-specific enhancements or hybridization, indicating considerable scope for a potentially robust variant of GWO across many research avenues. (4) The tuning of GWO has been experimented with quite often to improve the accuracy and population diversity, lower its susceptibility to “the curse of dimensionality”, overcome local entrapment, etc. There is always room for improvement considering the applicability of the variant aimed at; e.g., complex constrained optimization problems with a higher dimensional count could require additional modifications to the algorithmic structure and dynamic tuning, resulting in a greater search gradation. (5) The selection and population updating strategies, and their bearing on the performance of the algorithms and the outcome of the optimization, have been reviewed and analysed, leading to a multitude of variants that exploit various population selection and updating techniques to leverage the algorithm’s full potential.

1.4.1 Highlights of the current work

The highlights of the proposed work are outlined as follows.

  • A novel version of GWO immune to the curse of dimensionality and premature convergence is designed through multi-population and adaptive learning mechanisms.

  • Extensive testing through the latest benchmarking (2020 and 2019) suites is carried out to determine the suitability and effectiveness of the proposed algorithm while providing an updated overview of GWO’s performance for the latest benchmarking standards.

  • Comprehensive comparisons are made with the recent and advanced variants of GWO, which have been evaluated only on older benchmarking (2005) suites so far. This study compares multiple aspects of the variants of GWO to establish their performance standards on the latest benchmarks.

  • The validation of the proposed method’s performance for complex real-world problems in multimedia tools and artificial intelligence is established through MLP training (5 classification datasets and 3 function approximation datasets).

The remainder of this article is organized as follows. Section 2 focuses on the working of GWO followed by a discussion of its merits and demerits. Section 3 discusses the formulation of the competitive learning-based GWO technique with a detailed description of its various attributes. The performance of Clb-GWO is compared with ten different meta-heuristics (including five variants of GWO, two modern meta-heuristics and two state-of-the-art advanced meta-heuristics) in Section 4 through the CEC2020 and CEC2019 benchmarking suites. Additionally, Section 4 analyses the sensitivity of the outcome of optimization to the various tuning parameters and the effect of population size and number of iterations on exploration and exploitation. Section 5 analyses the performance of the proposed method on real-world complex optimization tasks (MLP training for five classification datasets and three function approximation datasets). The conclusion, the merits and demerits of Clb-GWO, its potential applications and the future scope of the current work are given in Section 5.

2 Grey Wolf Optimizer

Grey Wolf Optimizer, referred to as GWO, is a swarm-based, nature-inspired meta-heuristic optimization algorithm based on the leadership hierarchy and hunting mechanism of grey wolves (Canis lupus). Developed in 2014 by Seyedali Mirjalili, Seyed Mohammad Mirjalili and Andrew Lewis, GWO has risen to become one of the prominent state-of-the-art optimizers [42]. GWO is distinguished by its social hierarchical system, which groups the grey wolves into alpha, beta, delta, and omega as they explore and exploit the search space. The tuning requisites of GWO comprise the basic specification of the population size and iteration count and an optional control vector. The balance between exploration and exploitation is achieved through the linearly decreasing control vector, which is set to decrement from 2 to 0 over the course of iterations. The simplicity of its algorithmic structure and its strong performance on both unconstrained and constrained optimization problems, with good convergence properties, have attracted researchers and practitioners from various fields. Computer Science, Machine Learning and Artificial Intelligence, Engineering, Mathematics, Energy, Materials Science, Physics and Astronomy, etc., are some of the disciplines in which GWO has been applied.

2.1 Working of GWO

To understand the working of GWO, it is essential to gain insight into how the social hierarchy of wolves is captured in the mathematical model. GWO considers the alpha wolves (male or female) as the leaders (the dominant wolves), as they dictate the functioning of the group and are predominantly responsible for decision-making and managing the group. The second level consists of the beta wolves, which are the subordinates and advisors of the alphas and also command the lower orders of wolves. The lowest-ranking group is formed by the omega wolves, which often assume the role of a scapegoat or a babysitter. Additionally, the delta wolves, which are neither alpha, beta nor omega, are the scouts, sentinels, elders, hunters and caretakers of the group. The delta wolves dominate the omegas but obey the betas and alphas, forming an intermediate level between the beta wolves and the omega wolves. The collective foraging activity based on this social hierarchy forms the core of GWO. Figure 3 depicts the social dominance-based hierarchical system of the grey wolves.

Fig. 3

Social dominant hierarchy of the grey wolves

In GWO, the best solution is considered the alpha, the second-best solution the beta and the third-best solution the delta. The rest of the population is considered the omegas. A comprehensive description of the various aspects of the mathematical modelling of GWO is as follows.

Encircling the prey

The first phase of GWO is aimed at determining the position of the prey. Since this position is initially unknown, the algorithm explores the search space under the assumption that the prey is located near the optimal solution. Once the location of the prey is found, the wolves encircle it as a part of the hunting process. To locate a better solution, the grey wolves explore the area around the location of the prey.

Eq. (1) and Eq. (2) constitute the mathematical model for the encircling of the prey in GWO.

$$ \overrightarrow{P_{gw}\ }\left(t+1\right)=\overrightarrow{P_p(t)}-\overrightarrow{A}.\overrightarrow{d} $$
(1)
$$ \overrightarrow{d}=\left|\ \overrightarrow{E\ }.\overrightarrow{P_p(t)}-\overrightarrow{P_{gw}(t)\ }\right| $$
(2)

where \( \overrightarrow{P_{gw}} \) is the position of the grey wolf, \( \overrightarrow{A\ } \) and \( \overrightarrow{E\ } \) are coefficient vectors, t is the current iteration, \( \overrightarrow{P_p(t)} \) is the position of the prey, | | denotes the element-wise absolute value and ‘.’ represents element-wise multiplication.

Eq. (3) and Eq. (4) describe the mathematical formulation of the co-efficient vectors \( \overrightarrow{A\ } \) and \( \overrightarrow{E} \).

$$ \overrightarrow{A} = 2\overrightarrow{a}.\overrightarrow{\ {\mathit{\operatorname{rand}}}_1} - \overrightarrow{a} $$
(3)
$$ \overrightarrow{E} = 2.\overrightarrow{\ {\mathit{\operatorname{rand}}}_2} $$
(4)

where \( \overrightarrow{\ a} \) is the control vector whose components decrease linearly from an initial value of 2 to a final value of 0 over the course of iterations, and \( \overrightarrow{\ {\mathit{\operatorname{rand}}}_1} \) and \( \overrightarrow{\ {\mathit{\operatorname{rand}}}_2} \) denote random vectors in [0, 1].
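The encircling step of Eqs. (1)–(4) can be illustrated with a minimal Python sketch. The function names (`control_value`, `coefficient_vectors`, `encircle`), the use of plain lists and the seeded generator are assumptions made for illustration and are not part of the original formulation; the control vector is taken to have identical components, so it is represented by a single scalar `a`.

```python
import random

def control_value(t, max_iter, a_start=2.0):
    """Linearly decrease the control value from a_start (t = 0) to 0 (t = max_iter)."""
    return a_start * (1.0 - t / max_iter)

def coefficient_vectors(a, dim, rng):
    """Draw A (Eq. 3) and E (Eq. 4); every component uses a fresh random number in [0, 1)."""
    A = [2.0 * a * rng.random() - a for _ in range(dim)]
    E = [2.0 * rng.random() for _ in range(dim)]
    return A, E

def encircle(p_wolf, p_prey, a, rng):
    """Apply Eqs. (1) and (2) element-wise and return the new wolf position."""
    A, E = coefficient_vectors(a, len(p_wolf), rng)
    d = [abs(e * pp - pw) for e, pp, pw in zip(E, p_prey, p_wolf)]  # Eq. (2)
    return [pp - a_j * d_j for pp, a_j, d_j in zip(p_prey, A, d)]   # Eq. (1)

if __name__ == "__main__":
    rng = random.Random(0)             # seeded only to make the demo repeatable
    wolf, prey = [4.0, -3.0], [1.0, 1.0]
    for t in (0, 50, 100):
        a = control_value(t, 100)
        print(t, a, encircle(wolf, prey, a, rng))
```

Since every component of A lies in [−a, a], the step around the prey shrinks as the control value decreases; at a = 0 the updated position coincides with the prey position.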

Hunting

As soon as the location of the prey has been recognized, the hunting process commences, guided by the alpha. The alpha is supported by the beta and delta and, on rare occasions, by the omegas; the positions of the omegas are updated in conjunction with the mean position of the alpha, beta and delta. The best three solutions obtained are saved, as described in the hierarchical dominance of the wolves, to further estimate the location of the prey and guide the omegas to update their positions around it in the subsequent iterations.

The distances between the current grey wolf and the three dominant wolves are given in Eq. (5) and the positions formulated based on the distances are given in Eq. (6).

$$ {\displaystyle \begin{array}{c}\overrightarrow{d_{\alpha }}=\left|\ \overrightarrow{E_1\ }.\overrightarrow{P_{\alpha }}-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\beta }}=\left|\ \overrightarrow{E_2\ }.\overrightarrow{P_{\beta }}-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\delta }}=\left|\ \overrightarrow{E_3\ }.\overrightarrow{P_{\delta }}-\overrightarrow{P_{gw}\ }\right|\end{array}} $$
(5)
$$ {\displaystyle \begin{array}{c}\overrightarrow{P_1} = \overrightarrow{P_{\alpha }}-\overrightarrow{A_1}.\left(\overrightarrow{d_{\alpha }}\right)\\ {}\overrightarrow{P_2} = \overrightarrow{P_{\beta }}-\overrightarrow{A_2}.\left(\overrightarrow{d_{\beta }}\right)\\ {}\overrightarrow{P_3} = \overrightarrow{P_{\delta }}-\overrightarrow{A_3}.\left(\overrightarrow{d_{\delta }}\right)\end{array}} $$
(6)

Finally, the updated position of the grey wolf is given by Eq. (7).

$$ \overrightarrow{P_{gw}\ }\left(t+1\right)=\left[\frac{\overrightarrow{P_1} + \overrightarrow{P_2} + \overrightarrow{P_3\ }}{3}\right] $$
(7)

where, \( \overrightarrow{P_{gw}} \) is the position of the grey wolf, \( \overrightarrow{P_{\alpha }} \), \( \overrightarrow{P_{\beta }} \) and \( \overrightarrow{P_{\delta }} \) represent the positions of the alpha, beta and delta wolves, \( \overrightarrow{A\ } \) and \( \overrightarrow{E} \) are the co-efficient vectors.
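The hunting step of Eqs. (5)–(7) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the names `hunt_update`, `gwo` and `sphere`, the clamping of positions to the bounds, the best-so-far bookkeeping and the sphere test function are all choices made here and are not prescribed by the text above.

```python
import random

def hunt_update(wolf, leaders, a, rng):
    """New position of one wolf from Eqs. (5)-(7).

    leaders is the list [alpha, beta, delta] holding the three best positions.
    """
    dim = len(wolf)
    candidates = []
    for leader in leaders:
        cand = []
        for j in range(dim):
            A = 2.0 * a * rng.random() - a      # Eq. (3)
            E = 2.0 * rng.random()              # Eq. (4)
            d = abs(E * leader[j] - wolf[j])    # Eq. (5)
            cand.append(leader[j] - A * d)      # Eq. (6)
        candidates.append(cand)
    # Eq. (7): average the three leader-guided positions
    return [sum(c[j] for c in candidates) / len(candidates) for j in range(dim)]

def sphere(x):
    """Simple test objective (assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)

def gwo(fitness, dim, bounds, pop_size=20, max_iter=100, seed=0):
    """Minimise `fitness` with the canonical update; returns (best, best_fit, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pack = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best, best_fit, history = None, float("inf"), []
    for t in range(max_iter):
        leaders = [list(p) for p in sorted(pack, key=fitness)[:3]]  # alpha, beta, delta
        if fitness(leaders[0]) < best_fit:
            best, best_fit = list(leaders[0]), fitness(leaders[0])
        history.append(best_fit)                 # best-so-far fitness per iteration
        a = 2.0 * (1.0 - t / max_iter)           # linear control value, 2 -> 0
        # Every wolf is replaced by its new position; clamping is an assumption.
        pack = [[min(max(v, lo), hi) for v in hunt_update(w, leaders, a, rng)]
                for w in pack]
    final = min(pack, key=fitness)
    if fitness(final) < best_fit:
        best, best_fit = list(final), fitness(final)
    return best, best_fit, history

if __name__ == "__main__":
    best, best_fit, history = gwo(sphere, dim=5, bounds=(-10.0, 10.0))
    print(history[0], best_fit)
```

In this sketch every wolf is replaced by its new position regardless of fitness; only the separately stored best-so-far solution is protected from deterioration.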

2.2 Demerits of the canonical GWO

Although efficient in several applications, GWO has shortcomings that include a lack of population diversity, local entrapment, premature convergence and the lack of a stronger exploitation system. Several review articles [14, 21, 22, 26, 44, 49] have outlined the limitations of GWO, and there has been a greater focus on the improvement of GWO to achieve a reliable and robust variant.

The critical limitations of GWO reported in various review and research articles are summarized below.

  • GWO has been susceptible to the curse of dimensionality in several benchmark and real-world applications. The performance deteriorates in problems with multiple constraints and a higher number of independent decision variables owing to the selection and population updating strategy. The algorithm’s inability to manage multiple dimensions, as it may not reposition all its search agents appropriately, has been researched extensively to realise a better variant free of such drawbacks.

  • The convergence speeds are slower compared to other algorithms depending on sorting techniques for benchmarking and real-world scenarios.

  • The system of splitting iterations aimed at an explorative search for the first half and exploitative intensification for the next half does not necessarily guarantee that the majority of the search space has been covered and the conflicting aspects of exploration versus exploitations have not been perfectly balanced despite its good performance compared to the classical paradigms.

  • Complex and multi-modal search landscapes have been a challenging aspect as the algorithm is more likely to fall prey to local entrapment leading to premature convergence.

  • The exploration system, although robust initially, narrows down to the location of the three dominant wolves with the progression of iterations, and the wolves may not move far away from each other beyond the initial stages, resulting in premature convergence.

  • The higher dependence on the three dominant wolves in the wolf pack localizes the population towards the end of iterations, making local entrapment inevitable. If entrapment occurs at an earlier stage, there are no adaptive techniques to escape it.

3 Proposed method: Competitive learning-based Grey Wolf Optimizer

In this work, a competitive learning GWO is proposed after a comprehensive analysis of GWO, its other state-of-the-art variants, review articles, and publications related to GWO and its applications. The improved algorithm, named Competitive-learning based GWO is devised to address the various limitations of GWO as mentioned previously to improve its immunity to the curse of dimensionality and local entrapment with an accelerated convergence towards the global optimum and to achieve the desired equilibrium between the global (exploration) and local search (exploitation) with an enhanced population diversity.

3.1 Analysis and deductions from the previous publications aimed at improving GWO

Although very successful at multi-disciplinary optimization, GWO has attracted its fair share of criticism for its algorithmic structure, in particular for a one-sided search system known to favour the geometric centre of the search landscape. Several publications have pointed out its weaknesses, including the lack of a strong exploration system for multi-modal and complex landscapes. The population progression system in GWO favours diversity in the initial stages and then quickly converges to the surroundings of the dominant wolves, leading to stagnation and loss of population diversity. The analysis in [50] demonstrated this tendency of GWO to slide towards the geometric centre (functions with 0 as the location of the global optimum in that publication) and proposed a verification method through a set of nine modified test functions with varying degrees of shift in the global optimum position. The study concluded that the performance deterioration of GWO was proportional to the degree of shift of the global optimum from 0 and identified the linearly decreasing control variable as one of the possible reasons. Multiple works have focussed on improving the population diversity with the intent of lowering the algorithm’s dependence on the three dominant wolves in the wolf pack [4, 34, 73]. This dependence has also been cited as the reason for the algorithm’s premature convergence, which happens as a result of local entrapment [19, 76]. The algorithm’s lack of immunity to the curse of dimensionality has also been a focus, with studies indicating the excessive dependence on the dominant wolves and the lack of elitism among the population as the reasons for a poor exploratory system [21, 22, 26, 44]. Slower convergence has been reported in several cases as the algorithm accepts all new population members to replace the older population ((μ, λ) selection) despite their inferior fitness [63].
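The idea of shifting the global optimum away from 0 can be illustrated with a short Python sketch. It is only a generic illustration of a shifted test function; the nine modified functions used in [50] are not reproduced here, and the names `sphere` and `make_shifted` as well as the chosen shift values are assumptions.

```python
def sphere(x):
    """Sphere function; its global optimum is the origin with value 0."""
    return sum(v * v for v in x)

def make_shifted(func, shift):
    """Return g(x) = func(x - shift), moving the optimum of func from the origin to `shift`."""
    def shifted(x):
        return func([v - s for v, s in zip(x, shift)])
    return shifted

if __name__ == "__main__":
    origin = [0.0, 0.0, 0.0]
    for degree in (0.0, 10.0, 50.0):            # hypothetical degrees of shift
        g = make_shifted(sphere, [degree] * 3)
        # value at the origin versus value at the shifted optimum
        print(degree, g(origin), g([degree] * 3))
```

An optimizer that is biased towards the origin would score well on the unshifted function while returning progressively worse values as the degree of shift grows, which is the kind of behaviour the verification method is designed to expose.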

On the other hand, several publications have credited the linear control strategy with providing a basic foundation that can be further improved to ensure robust performance of GWO on other complex landscapes. Reference [47] adopted the standard GWO foraging techniques with an additional dimensional learning strategy based on Euclidean distance and greedy selection, yielding an improved GWO (IGWO) for complex landscapes. The performance of IGWO was verified against the CEC2017 benchmarking suite, where none of the benchmarking functions had their global optimum at ‘0’. In [16], one of the more popular variants of GWO, a random walk GWO with greedy selection, was proposed and tested against the CEC2014 suite (also with no function optima located at ‘0’), where it demonstrated its robustness on complex landscapes. Additionally, a selective opposition-based GWO was proposed in [12]; it incorporates the Spearman coefficient in an opposition-based learning scheme to improve the fitness of the omegas with respect to the difference between the alpha and the omegas. In [75], hybridization of GWO with the Biogeography-Based Optimization (BBO) algorithm was proposed to enhance the population diversity of GWO and accelerate its convergence. The hybrid GWO was compared with EPSDE, SHADE and SinDE on the thirty functions of the CEC2014 benchmarking suite, where it outperformed them by a large margin.

In other developments, non-linear control strategies have been very popular for establishing a solid balance between exploration and exploitation in several multi-disciplinary applications. W. Long et al. in [32] proposed an exploration-enhanced GWO by experimenting with multiple non-linear modulation indices for the control parameter ‘a’ and deduced that an initial value of 1 or higher, nearer to 1.5, is promising for multi-modal landscapes. The article in [73] proposed an improved GWO with exponential control vectors based on the current and final iterations to enhance the exploration quality for truss optimization. Research has also been directed at balancing the exploration and exploitation systems of GWO through the introduction of chaotic strategies to mitigate local entrapment [3, 18, 27, 33, 51, 57]. Hybridization/combinatorial variants of GWO with existing swarm and evolutionary algorithms have been an ongoing trend since the publication of GWO. The combinatorial variants operate in synergy, combining the best aspects of both their parent algorithms, and have been reported to deliver robust and consistent performance with considerable improvement in handling the conflicting requirements of exploration and exploitation [60, 63, 77].
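How a non-linear control strategy differs from the linear one can be sketched in Python. The two non-linear schedules below are hypothetical illustrations; they are not the formulas proposed in [32] or [73], and the function names and the parameters `k` and `rate` are assumptions.

```python
import math

def linear_a(t, max_iter, a0=2.0):
    """Canonical linear decrease of the control value from a0 to 0."""
    return a0 * (1.0 - t / max_iter)

def power_a(t, max_iter, a0=2.0, k=2.0):
    """Hypothetical polynomial schedule: for k > 1 the value stays large for longer."""
    return a0 * (1.0 - (t / max_iter) ** k)

def exponential_a(t, max_iter, a0=2.0, rate=4.0):
    """Hypothetical exponential decay, rescaled so that it reaches exactly 0 at max_iter."""
    floor = math.exp(-rate)
    return a0 * (math.exp(-rate * t / max_iter) - floor) / (1.0 - floor)

if __name__ == "__main__":
    for t in (0, 25, 50, 75, 100):
        print(t, linear_a(t, 100), power_a(t, 100), exponential_a(t, 100))
```

All three schedules share the same end points and differ only in how long the control value stays large, which is the quantity a non-linear strategy tunes to shift the balance between exploration and exploitation.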

3.2 Motivation

In addition to the aforementioned aspects and a myriad of publications of GWO, the motivation for the current work is as follows:

  1. Although GWO is a relatively old meta-heuristic, its search process can be efficiently improved to enhance its robustness and consistency for multi-modal and complex landscapes.

  2. To combat the demerits associated with the linear exploration method, other non-linear schemes can be experimented with and incorporated into the search mechanism to promote the diversity of the population.

  3. A balance between exploration and exploitation can be promoted through the design of suitable difference vectors which can be incorporated strategically and systematically at different stages in the search process.

  4. The segregation of the population into two groups, which dedicates smaller sections of the population to a specific purpose, has not been experimented with in GWO so far. The previous multi-strategy ensemble variants have aimed at modifying the structure of GWO completely and have not experimented with population sub-division lately.

  5. The inclusion of a second position-update strategy and greedy selection has proven beneficial for most complex search spaces, and on this basis the competitive learning phase with distinct strategies is laid out.

  6. Success- and failure-based strategy adaptation, most popular with the state-of-the-art variants of DE, has been experimented with to allow the algorithm to learn and adapt to any possible scenario.

Hence, based on the above developments, the current study proposes an improved GWO with an ensemble of strategies to improve population diversity and exploration quality. The following aspects have been considered for the development of the proposed method:

  1. The proposed method is built on the merits of the standard GWO linear hunting scheme, followed by a second learning phase with greedy selection, a combination that has been the most successful for composite and complex landscapes as seen with the various improved variants.

  2. The proposed method follows the population sub-division and is implemented in two phases, with each phase complementing the other in terms of the search strategy and selection mechanisms.

  3. The enhancement of population diversity is the core of the current method.

  4. A non-linear control strategy and a competitive learning strategy have been added to the standard GWO to improve its robustness and immunity against the curse of dimensionality.

  5. Benchmarking is done through the recent CEC2020 and CEC2019 suites, neither of which has been considered for benchmarking so far, and none of the benchmark functions in them has its global optimum located at ‘0’.

  6. Comparisons are made not only against the advanced variants of GWO but also against state-of-the-art advanced meta-heuristics from the literature, and these algorithms are kept consistent throughout the entire benchmarking and real-world testing.

3.3 Implementation

Clb-GWO is implemented in two phases, and in both phases the population is divided into two subgroups. The algorithm’s phases and population groups are structured in a selective, complementary fashion with the aim of promoting population diversity over convergence. The population is divided into a majority group with 90 % of the wolves and a minority group with the remaining 10 % of the wolves. The majority group is mainly responsible for the large-scale exploration and exploitation, while the minority groups are reserved to promote divergence and convergence as per their formulation. The advantages of population sub-division have been highlighted in several state-of-the-art advanced meta-heuristics such as EPSO [35], EPSDE [37] and MPEDE [65].

3.3.1 Modified GWO phase

i) Majority Group 1 / Hunting group:

As discussed earlier, the linear hunting strategy from the standard GWO has its own set of merits and demerits. Building on this, the first phase considers a linear hunting scheme complemented by a non-linear hunting scheme. The hunting scheme, either linear or non-linear, is selected with a random probability such that both schemes contribute to the generation of a new population, as represented by Eq. (8).

$$ \overrightarrow{P_{hunt}\ }\left(t+1\right)=\left\{\begin{array}{cc} Linear\ GWO\ hunting\ \left(\overrightarrow{{P_{gw}}^{lin}\ }\right)& p{r}_1>0.5\\ {} Non- linear\ GWO\ hunting\ \left(\overrightarrow{{P_{gw}}^{nl}\ }\right)& otherwise\end{array}\right. $$
(8)

where, \( \overrightarrow{P_{hunt}\ }\left(t+1\right) \) is the updated position of the grey wolf through the various hunting schemes and \( p{r}_1 \) is a random number in [0, 1] drawn from a uniform distribution.

Linear hunt

The linear hunting scheme is the same as the encircling and hunting of prey technique from the standard GWO algorithm. The distance and position vectors are described in Eq. (9) and Eq. (10) respectively.

$$ {\displaystyle \begin{array}{c}\overrightarrow{d_{\alpha }}=\left|\ \overrightarrow{E_1\ }.\overrightarrow{P_{\alpha }}-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\beta }}=\left|\ \overrightarrow{E_2\ }.\overrightarrow{P_{\beta }}-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\delta }}=\left|\ \overrightarrow{E_3\ }.\overrightarrow{P_{\delta }}-\overrightarrow{P_{gw}\ }\right|\end{array}} $$
(9)
$$ {\displaystyle \begin{array}{c}\overrightarrow{P_1} = \overrightarrow{P_{\alpha }}-\overrightarrow{A_1}.\left(\overrightarrow{d_{\alpha }}\right)\\ {}\overrightarrow{P_2} = \overrightarrow{P_{\beta }}-\overrightarrow{A_2}.\left(\overrightarrow{d_{\beta }}\right)\\ {}\overrightarrow{P_3} = \overrightarrow{P_{\delta }}-\overrightarrow{A_3}.\left(\overrightarrow{d_{\delta }}\right)\end{array}} $$
(10)

The coefficient vectors are described by Eq. (11) and Eq. (12) respectively.

$$ \overrightarrow{A} = 2\overrightarrow{a}.\overrightarrow{\ {r}_1} - \overrightarrow{a} $$
(11)
$$ \overrightarrow{E} = 2.\overrightarrow{\ {r}_2} $$
(12)

The positions of the grey wolves are updated as per Eq. (13).

$$ \overrightarrow{{P_{gw}}^{lin}\ }\left(t+1\right)=\left[\frac{\overrightarrow{P_1} + \overrightarrow{P_2} + \overrightarrow{P_3\ }}{3}\right] $$
(13)

where, \( \overrightarrow{{P_{gw}}^{lin}} \) is the updated position of the grey wolf through the linear hunting scheme, \( \overrightarrow{P_{gw}} \) is the position of the grey wolf, \( \overrightarrow{P_{\alpha }} \), \( \overrightarrow{P_{\beta }} \) and \( \overrightarrow{P_{\delta }} \) represent the positions of the alpha, beta and delta wolves, \( \overrightarrow{A\ } \) and \( \overrightarrow{E} \) are the co-efficient vectors, \( \overrightarrow{\ a} \) is the control vector whose value tends to linearly decrease from an initial value of 2 to a final value of 0 over the course of iterations ‘t’ and \( \overrightarrow{\ r} \) denotes a random vector in [0, 1].
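The linear hunt of Eqs. (9) to (13) can be sketched for a single wolf as follows. This is a minimal illustration in Python rather than the authors' MATLAB code; the function names `linear_hunt` and `linear_a` are hypothetical, and it is assumed that fresh random numbers are drawn for every dimension and every leader.

```python
import random


def linear_a(t, t_max):
    """Control parameter decreasing linearly from 2 to 0 over t_max iterations."""
    return 2.0 * (1.0 - t / t_max)


def linear_hunt(p_gw, p_alpha, p_beta, p_delta, a, rng=random):
    """One linear GWO position update for a single wolf (Eqs. 9-13).

    Assumption: r1 and r2 are redrawn per dimension and per leader.
    """
    candidates = []
    for leader in (p_alpha, p_beta, p_delta):
        candidate = []
        for x, x_lead in zip(p_gw, leader):
            A = 2.0 * a * rng.random() - a    # Eq. (11)
            E = 2.0 * rng.random()            # Eq. (12)
            d = abs(E * x_lead - x)           # Eq. (9)
            candidate.append(x_lead - A * d)  # Eq. (10)
        candidates.append(candidate)
    # Eq. (13): average of the three leader-guided positions
    return [sum(values) / 3.0 for values in zip(*candidates)]
```

With a = 0 the coefficient A vanishes, so the update collapses to the mean of the alpha, beta and delta positions, which illustrates why the late iterations of the linear scheme are purely exploitative.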

Non-linear hunt

The non-linear hunting scheme is considered to improve the population diversity based on an exponentially decreasing vector through the course of iterations. This strategy includes the worst solution to form the differential vector while also considering the selection of randomized omega wolves to prevent local stagnation which is often the result of the dominant wolves converging quickly to a single point in the search space. The non-linearity associated with the control vector prevents the solutions from sliding towards the geometric centre of the search space which has been known to severely impact its performance for shifted and rotated benchmark functions. The distance and position vectors are described in Eq. (14) and Eq. (15) respectively.

$$ {\displaystyle \begin{array}{c}\overrightarrow{d_{\alpha }}=\left|\ \left[\overrightarrow{P_{\alpha }} - \overrightarrow{P_W\ }\right]-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\beta }}=\left|\ \left[\overrightarrow{P_{\beta }} - \overrightarrow{P_W\ }\right]-\overrightarrow{P_{gw}\ }\right|\\ {}\overrightarrow{d_{\delta }}=\left|\ \left[\overrightarrow{P_{\delta }} - \overrightarrow{P_W\ }\right]-\overrightarrow{P_{gw}\ }\right|\end{array}} $$
(14)
$$ {\displaystyle \begin{array}{c}\overrightarrow{P_1} = \overrightarrow{\phi .}\overrightarrow{P_{\omega (r1)}}-\overrightarrow{r_3}.\left(\overrightarrow{d_{\alpha }}\right)\\ {}\overrightarrow{P_2} = \overrightarrow{\phi .}\overrightarrow{P_{\omega (r2)}}-\overrightarrow{r_4}.\left(\overrightarrow{d_{\beta }}\right)\\ {}\overrightarrow{P_3} = \overrightarrow{\phi .}\overrightarrow{P_{\omega (r3)}}-\overrightarrow{r_5}.\left(\overrightarrow{d_{\delta }}\right)\end{array}} $$
(15)

where, \( \overrightarrow{P_W\ } \) is the position of the grey wolf with the worst fitness value and \( \overrightarrow{\phi\ } \) is the control vector whose value tends to exponentially decrease from an initial value of 1 to a final value of 0 over the course of iterations ‘t’.

The exponential control vector decreases from 1 to 0 over the course of iterations as described in Eq. (16).

$$ \overrightarrow{\phi} = {e}^{\left(-0.05\times t\right)} $$
(16)

The final position of the grey wolf is the average of the three positions described by Eq. (17).

$$ \overrightarrow{{P_{gw}}^{nl}\ }\left(t+1\right)=\left[\frac{\overrightarrow{P_1} + \overrightarrow{P_2} + \overrightarrow{P_3\ }}{3}\right] $$
(17)

where, \( \overrightarrow{{P_{gw}}^{nl}} \) is the updated position of the grey wolf through the non-linear hunting scheme.
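A corresponding sketch of the non-linear hunt in Eqs. (14) to (17) is given below. The name `nonlinear_hunt` is hypothetical, and two points are assumptions rather than statements from the text: the random vectors r3 to r5 are taken as uniform in [0, 1] per dimension, and the iteration counter t is assumed to start at 0 so that the control value starts at exactly 1.

```python
import math
import random


def nonlinear_hunt(p_gw, leaders, p_worst, omegas, t, rng=random):
    """One non-linear hunting update for a single wolf (Eqs. 14-17).

    leaders: positions of the alpha, beta and delta wolves (in that order).
    omegas:  pool of omega-wolf positions to sample from.
    """
    phi = math.exp(-0.05 * t)                      # Eq. (16)
    candidates = []
    for leader in leaders:
        omega = rng.choice(omegas)                 # randomly chosen omega wolf
        candidate = []
        for j, x in enumerate(p_gw):
            d = abs((leader[j] - p_worst[j]) - x)  # Eq. (14)
            # Eq. (15); the multiplier on d is assumed uniform in [0, 1]
            candidate.append(phi * omega[j] - rng.random() * d)
        candidates.append(candidate)
    # Eq. (17): average of the three candidate positions
    return [sum(values) / 3.0 for values in zip(*candidates)]
```

Because exp(-0.05 t) only approaches zero asymptotically, the control value after 100 iterations is about 0.0067 rather than exactly 0.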

ii) Minority Group 1 / Diverging group:

The first minority group, comprising the remaining 10 % of the wolves, is retained for re-initialization and random repositioning to diverge the wolves and prevent local entrapment. The divergence is achieved through one of the two rules in Eq. (18), chosen at random. The divergence vector \( \overrightarrow{P_{\Omega}\ } \) is formulated to push the wolves far away from each other.

$$ \overrightarrow{P_{hunt}\ }\left(t+1\right)=\left\{\begin{array}{cc}\overrightarrow{P_{gw}} + \Delta .\left[\overrightarrow{P_{\Omega}\ }\right]& p{r}_2>0.5\\ {} lb+\overrightarrow{r_6}.\left[ ub- lb\right]& otherwise\end{array}\right. $$
(18)

where

$$ \overrightarrow{P_{\Omega}} = \overrightarrow{P_{\omega \left({r}_a\right)}}-\overrightarrow{P_{\omega \left({r}_b\right)}} $$

where, Δ is a random vector in [1, 13], \( \overrightarrow{P_{\Omega}\ } \) denotes the difference vector of any two randomly chosen omega wolves \( \overrightarrow{P_{\omega \left({r}_a\right)}} \) and \( \overrightarrow{P_{\omega \left({r}_b\right)}} \), and lb and ub denote the lower and upper bounds for the decision variables.
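The two divergence rules of Eq. (18) can be sketched as below. The function name `diverge` is hypothetical. The scale factor Δ of Eq. (18) is sampled per dimension from the range quoted in the text, exposed here as the parameters `delta_low` and `delta_high` so that it can be changed; the per-dimension sampling and the absence of any boundary repair are assumptions of this sketch.

```python
import random


def diverge(p_gw, omegas, lb, ub, delta_low=1.0, delta_high=13.0, rng=random):
    """Diverging-group update for one wolf (Eq. 18).

    lb, ub: per-dimension lower and upper bounds.
    Note: the first rule may leave the bounds; boundary handling is not shown.
    """
    if rng.random() > 0.5:                     # pr_2 > 0.5
        w_a, w_b = rng.sample(omegas, 2)       # two distinct random omega wolves
        return [
            x + rng.uniform(delta_low, delta_high) * (a - b)
            for x, a, b in zip(p_gw, w_a, w_b)
        ]
    # Otherwise: uniform random re-initialization inside the bounds
    return [l + rng.random() * (u - l) for l, u in zip(lb, ub)]
```

The first rule moves the wolf along the difference of two omega wolves, so its step size shrinks as the omegas cluster; the second rule ignores the current population entirely, which keeps some diversity available even after the pack has converged.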

The diversity-preserving Mu, Lambda (μ, λ) selection follows the modified GWO phase for the population update, wherein every new solution replaces its parent solution regardless of whether its fitness has improved or deteriorated. However, the memory of the three dominant wolves is updated only when a wolf with better fitness than their respective fitness is found.

3.3.2 Competitive learning phase

The competitive learning process follows the standard GWO procedure to further improve the quality of solutions, expand the solution space and ensure a better balance of exploration and exploitation. The search processes are synchronized to allow for the exploration of the search space in both the GWO and the competitive learning phases with a higher emphasis on exploration through the majority competitive learning group followed by a greedy selection process to ensure those fitter solutions replace the older ones.

Similar to the first phase, the competitive learning phase also divides the population into two sub-groups i.e., the Minority group 2 and majority group 2.

i) Minority Group 2 /Converging group:

The second minority group comprises the first 10 % of the wolves and is used to improve the local search and accelerate the convergence in a controlled manner. Here, fitness-based repositioning is implemented to guide the wolves to the promising areas of the search space. A single-dimensional update strategy is followed, in which one random number is generated for all the problem dimensions, as it ensures accelerated convergence for multi-modal and separable functions.

Every wolf in the second minority group is compared with a random wolf other than the three dominant wolves and repositioned closer to the alpha wolf with respect to its fitness. Fitter wolves are allowed to migrate slowly while the non-fitter wolves are given a higher degree of freedom to reposition themselves much closer to the alpha. Local search around the current position and exploitation of the best solutions is facilitated through the \( \overrightarrow{{P_{\Omega}}^{gw}\ } \) and \( \overrightarrow{{P_{\Omega}}^{\alpha }\ } \) vectors respectively as described below in Eq. (19).

$$ {\displaystyle \begin{array}{c}\overrightarrow{P_{learn}\ }\left(t+1\right)=\left\{\begin{array}{cc}\overrightarrow{P_{\omega {(r)}_{hunt}}} + \overrightarrow{r_7}.\left[\overrightarrow{{P_{\Omega}}^{gw}\ }\ \right]+\overrightarrow{r_8}.\left[\overrightarrow{{P_{\Omega}}^{\alpha }\ }\ \right]&\ Fit(i)< Fit(r)\\ {}\overrightarrow{P_{gw}} - \overrightarrow{r_9}.\left[\overrightarrow{{P_{\Omega}}^{\alpha }\ }\ \right]& Fit(i)> Fit(r)\end{array}\right.\\ {}\overrightarrow{{P_{\Omega}}^{gw}} = \overrightarrow{P_{gw}} - \overrightarrow{P_{\omega (r)}}\\ {}\overrightarrow{{P_{\Omega}}^{\alpha }} = \overrightarrow{P_{\alpha }} - \overrightarrow{P_{\omega (r)}}\end{array}} $$
(19)

where, \( \overrightarrow{P_{\mathrm{learn}}\ }\left(t+1\right) \) is the updated position of the grey wolf through the various learning schemes, \( \overrightarrow{{P_{\Omega}}^{\alpha }\ } \)denotes the difference vector of the alpha wolf \( \overrightarrow{P_{\alpha }\ } \)and a randomly chosen omega wolf \( \overrightarrow{P_{\omega (r)}} \) and \( \overrightarrow{{P_{\Omega}}^{gw}\ } \) denotes the difference vector of the current wolf \( \overrightarrow{P_{gw}\ } \)and a randomly chosen omega wolf \( \overrightarrow{P_{\omega (r)}} \).

At least one omega wolf whose position has not been modified since the previous hunting phase is always included to prevent the loss of diversity during the repositioning process.
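The fitness-based repositioning of Eq. (19) can be sketched as follows for a minimization problem. The name `converge` is hypothetical; in line with the single-dimensional update described above, each of r7, r8 and r9 is one scalar shared by all dimensions. Eq. (19) does not state what happens when the two fitness values are equal, so this sketch assumes that ties take the second branch.

```python
import random


def converge(p_gw, fit_i, p_omega, fit_r, p_omega_hunt, p_alpha, rng=random):
    """Converging-group update for one wolf (Eq. 19).

    p_omega:      randomly chosen omega wolf, with fitness fit_r.
    p_omega_hunt: randomly chosen omega wolf from the hunting phase.
    """
    diff_gw = [x - w for x, w in zip(p_gw, p_omega)]        # P_Omega^gw
    diff_alpha = [a - w for a, w in zip(p_alpha, p_omega)]  # P_Omega^alpha
    if fit_i < fit_r:
        r7, r8 = rng.random(), rng.random()  # scalars shared by all dimensions
        return [
            h + r7 * dg + r8 * da
            for h, dg, da in zip(p_omega_hunt, diff_gw, diff_alpha)
        ]
    r9 = rng.random()                        # ties fall here by assumption
    return [x - r9 * da for x, da in zip(p_gw, diff_alpha)]
```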

ii) Majority group 2 / Learning group:

This learning phase is selected based on the success and failure rates that serve as the moderators to switch between the linear learning and adaptive learning techniques described below. A multi-dimensional update strategy, which has proven effective for non-separable functions, is applied to the learning group: the random numbers are unique for each dimension, which expands the search space around the wolves for a stronger global search emphasis. Initially, the selection is made probabilistically, and a learning parameter named ‘competitive rate’ controls the selection of the scheme best suited to keep the exploration smooth and undisturbed, as per Eq. (20). The competitive rate is the sum of the number of consecutive failures (fr) and successes (sr) corresponding to each of the strategies. A detailed description of the competitive rate and its impact on the learning outcomes is given in the upcoming sub-sections.

$$ {\displaystyle \begin{array}{c}\overrightarrow{P_{learn}\ }\left(t+1\right)=\left\{\begin{array}{cc} Linear\ GWO\ learning\left(\overrightarrow{{P_{clb}}^{lin}\ }\right)& If\ fr>10\ or\kern0.5em If\ sr>5\\ {} Adaptive\ GWO\ learning\left(\overrightarrow{{P_{clb}}^{adapt}\ }\right)& otherwise\end{array}\right.\\ {} Comp= fr+ sr\end{array}} $$
(20)

where, Comp denotes the competitive rate, fr stands for the failure rate and sr stands for the success rate.
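The switching rule of Eq. (20) reduces to a simple threshold test, sketched below. The function names are hypothetical, and how the counters fr and sr are incremented and reset is deliberately left out, since that bookkeeping is described separately.

```python
def choose_learning_scheme(fr, sr):
    """Select the learning scheme from the failure and success counters (Eq. 20)."""
    if fr > 10 or sr > 5:
        return "linear"
    return "adaptive"


def competitive_rate(fr, sr):
    """Competitive rate Comp = fr + sr (Eq. 20)."""
    return fr + sr
```

Under this rule the adaptive scheme is the default, and the linear scheme takes over once either counter crosses its threshold.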

Linear learning

The linear GWO learning scheme adopts the linearly decreasing control vector from the standard GWO phase to search for new solutions around the most promising areas in the search space and has a good global search ability. The second technique, comprising random omega wolves from the current and hunting populations, is added to prevent the one-sided search progression associated with the linear control strategy and hence has been given a lower priority. The linear search process is prone to driving the population to the geometric centre, and the difference vectors \( \overrightarrow{P_{\Omega}\ } \) and \( \overrightarrow{{P_{\Omega}}^{hunt}\ } \) are designed to avoid this. Linear learning is described by Eq. (21).

$$ \overrightarrow{{P_{clb}}^{lin}\ }\left(t+1\right)=\left\{\begin{array}{cc}\overrightarrow{P_{\alpha }} + \overrightarrow{a}.\left[\overrightarrow{P_{\Omega}\ }\ \right]& p{r}_3<0.75\\ {}\overrightarrow{P_{\alpha }} + \overrightarrow{r_{10}}.\left[\overrightarrow{P_{\Omega}\ }\ \right]+\overrightarrow{r_{11}}.\left[\overrightarrow{{P_{\Omega}}^{hunt}\ }\ \right]& otherwise\end{array}\right. $$
(21)

where

$$ \overrightarrow{{P_{\Omega}}^{hunt}} = \overrightarrow{P_{\omega {(r)}_{hunt}}} - \overrightarrow{P_{\omega (r)}} $$

where, \( \overrightarrow{{P_{clb}}^{lin}\ } \) is the updated position of the grey wolf through the linear learning scheme, \( \overrightarrow{\ a} \) is the control vector whose value tends to linearly decrease from an initial value of 2 to a final value of 0 over the course of iterations ‘t’, \( \overrightarrow{P_{\omega {(r)}_{hunt}}\ } \) denotes a randomly chosen omega wolf from the previous hunting schemes and \( \overrightarrow{{P_{\Omega}}^{hunt}\ } \) denotes the difference vector of a randomized omega wolf \( \overrightarrow{P_{\omega {(r)}_{hunt}}\ } \) from the hunting schemes and a randomly chosen omega wolf \( \overrightarrow{P_{\omega (r)}} \).
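Eq. (21) can be sketched as below. The name `linear_learning` is hypothetical. The difference vector of two random omega wolves is formed as in Eq. (18), and, following the multi-dimensional update described for the learning group, r10 and r11 are assumed to be redrawn for every dimension.

```python
import random


def linear_learning(p_alpha, omegas, omegas_hunt, a, rng=random):
    """Linear learning update for one wolf (Eq. 21).

    omegas:      omega wolves of the current population.
    omegas_hunt: omega wolves produced by the hunting phase.
    a:           linearly decreasing control value (2 -> 0).
    """
    w_a, w_b = rng.sample(omegas, 2)
    p_diff = [x - y for x, y in zip(w_a, w_b)]            # P_Omega
    if rng.random() < 0.75:                               # pr_3 < 0.75
        return [al + a * d for al, d in zip(p_alpha, p_diff)]
    w_hunt = rng.choice(omegas_hunt)
    w_cur = rng.choice(omegas)
    hunt_diff = [h - w for h, w in zip(w_hunt, w_cur)]    # P_Omega^hunt
    return [
        al + rng.random() * d + rng.random() * h
        for al, d, h in zip(p_alpha, p_diff, hunt_diff)
    ]
```

Both branches are anchored at the alpha wolf, so the omega-based difference vectors only perturb the search around the current best position rather than replacing it.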

Adaptive learning

The adaptive learning scheme comprises two techniques. The first is an adaptive cooperative learning technique that uses the alpha, beta and delta wolves to form the solution vector, while the second involves only randomised omega wolves from the current and the previous hunting populations. The first technique is prioritized over the second, as the knowledge of the alpha, beta and delta can be exploited efficiently in guiding the omegas to more promising areas. The second technique serves the purpose of diversity enhancement and prevents excessive dependence on the dominant wolves at all times in the search process through the divergence vectors \( \overrightarrow{P_{\Omega_1}} \) and \( \overrightarrow{P_{\Omega_2}} \), and hence its selection priority is set lower. Adaptive learning is achieved through the vectors \( \overrightarrow{{P_{\Omega}}^{\alpha \prime }\ } \) and \( \overrightarrow{{P_{\mathrm{gw}}}^{\beta, \delta }\ } \), wherein the information from the three dominant wolves is used to reposition the wolves from the previous phases. Adaptive learning is described by Eq. (22).

$$ \overrightarrow{{P_{clb}}^{adapt}\ }\left(t+1\right)=\left\{\begin{array}{cc}\overrightarrow{P_{gw}} + \overrightarrow{R_1}.\left[\overrightarrow{{P_{\Omega}}^{\alpha \prime }\ }\ \right]-\overrightarrow{R_2}.\left[\overrightarrow{{P_{\mathrm{gw}}}^{\beta, \delta }\ }\ \right]& p{r}_4<0.75\\ {}\overrightarrow{P_{\omega {(r)}_{hunt}}} + \overrightarrow{r_{12}}.\left[\ \overrightarrow{P_{\Omega_1}}\right]+\overrightarrow{r_{13}}.\left[\overrightarrow{P_{\Omega_2}}\right]& otherwise\end{array}\right. $$
(22)
$$ \overrightarrow{R}=\mathit{\operatorname{rand}}\left(1,D\right) $$
$$ \overrightarrow{{P_{\Omega}}^{\alpha \prime }} = \overrightarrow{P_{\omega {(r)}_{hunt}}} - \overrightarrow{P_{\alpha }\ } $$
$$ \overrightarrow{{P_{\mathrm{gw}}}^{\beta, \delta }} = \overrightarrow{P_{gw}} + \left(\overrightarrow{P_{\beta }} + \overrightarrow{P_{\delta }\ }\right) $$

where, \( \overrightarrow{{P_{clb}}^{adapt}\ } \) is the updated position of the grey wolf through the adaptive learning scheme, \( \overrightarrow{R} \) is a random vector comprising random numbers in [0,1] of the size of 1 by D, with D representing the problem dimensions.
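Eq. (22) can be sketched as below. The name `adaptive_learning` is hypothetical. The two divergence vectors of the second branch are not defined in this sub-section, so the sketch takes them as inputs (`div1`, `div2`) instead of guessing their form. The random multipliers are assumed uniform in [0, 1] and redrawn for each dimension.

```python
import random


def adaptive_learning(p_gw, p_alpha, p_beta, p_delta, p_omega_hunt,
                      div1, div2, rng=random):
    """Adaptive learning update for one wolf (Eq. 22).

    div1, div2: divergence vectors P_Omega_1 and P_Omega_2, supplied by the
                caller because their definitions are not reproduced here.
    """
    dims = len(p_gw)
    if rng.random() < 0.75:                                  # pr_4 < 0.75
        new_position = []
        for j in range(dims):
            alpha_diff = p_omega_hunt[j] - p_alpha[j]        # P_Omega^alpha'
            beta_delta = p_gw[j] + (p_beta[j] + p_delta[j])  # P_gw^(beta,delta)
            new_position.append(
                p_gw[j] + rng.random() * alpha_diff - rng.random() * beta_delta
            )
        return new_position
    return [
        p_omega_hunt[j] + rng.random() * div1[j] + rng.random() * div2[j]
        for j in range(dims)
    ]
```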

The final step is the fitness evaluation of all the new population members. The greedy selection technique follows the competitive learning phase to update the population pool with superior solutions from the competitive learning phase. The greedy selection admits population members from the competitive learning strategies whose fitness is better than that of their counterparts from the modified GWO process. The survival-of-the-fittest strategy is followed to select the fitter population members and discard the rest. In the case of inferior solutions, the positions from the modified GWO procedure are retained, as given by Eq. (23).

$$ \overrightarrow{P_{gw}\ }\left(t+1\right)=\left\{\begin{array}{cc}\overrightarrow{P_{hunt}\ }\left(t+1\right)& if\ f\left(\overrightarrow{P_{hunt}\ }\right)<f\left(\overrightarrow{P_{learn}\ }\right)\\ {}\overrightarrow{P_{learn}\ }\left(t+1\right)& otherwise\end{array}\right. $$
(23)

where, \( f\left(\overrightarrow{P_{learn}\ }\right) \) is the fitness score of the decision variables obtained by the competitive learning strategy and \( f\left(\overrightarrow{P_{hunt}\ }\right) \) is the fitness score of the decision variables obtained by the modified GWO procedure.
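The greedy selection of Eq. (23) is a pairwise comparison between the two candidate positions of each wolf, sketched below for a minimization problem. The function names are hypothetical; as written in Eq. (23), a tie resolves in favour of the competitive learning position.

```python
def greedy_select(p_hunt, f_hunt, p_learn, f_learn):
    """Keep the fitter of the hunting and learning positions (Eq. 23)."""
    if f_hunt < f_learn:
        return p_hunt, f_hunt
    return p_learn, f_learn


def greedy_select_population(hunt, f_hunt, learn, f_learn):
    """Apply Eq. (23) wolf by wolf; returns the new positions and fitness values."""
    selected = [
        greedy_select(ph, fh, pl, fl)
        for ph, fh, pl, fl in zip(hunt, f_hunt, learn, f_learn)
    ]
    positions = [position for position, _ in selected]
    fitness = [value for _, value in selected]
    return positions, fitness
```

In contrast to the (μ, λ) replacement used after the modified GWO phase, this step never lets the fitness of a wolf deteriorate within the second phase.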

The overall algorithmic structure of Clb-GWO is presented in Fig. 4.

Fig. 4 Flowchart of Clb-GWO

3.3.3 Pseudocode of Clb-GWO

Algorithm 2 Clb-GWO

3.3.4 Analysis of the difference vectors

The difference vectors lie at the core of the proposed method and have been designed after a meticulous study and analysis of the various possible combinations used in previous advanced meta-heuristics. The primary function of the various vectors is to lower the dependence of the algorithm on the three dominant wolves at all times and thereby increase diversity in the population. Most of the difference vectors comprise a random omega wolf from the population pool, which has been deliberately planned to eliminate the clustering of the wolves at any given time and extend the course of exploration over a greater interval of time. Although this can result in slower convergence, implementing the search mechanism with them eliminates the one-sided search system in GWO that has received a lot of criticism. The evolution of the wolfpack can thus be steered to explore and exploit systematically without being susceptible to entrapment. The various difference vectors designed for the proposed method are listed in Table 1.

Table 1 Tabulation of the various difference vectors implemented in Clb-GWO

3.3.5 Exploration and exploitation

Exploration in Clb-GWO is contributed largely by the two majority search groups across the two phases: in the first phase, either the encircling mechanism of the linear GWO process, where exploration is favoured for \( \overrightarrow{a} \) > 1, or the diversity-promoting non-linear exponential scheme; and in the second phase, the competitive learning strategies, wherein a larger distance between the three dominant wolves (alpha, beta and delta) drives the wolves to explore around them.

The exploration mechanism in Clb-GWO aims at enhancing the distribution of omega wolves around the potentially promising areas obtained from the standard GWO process by means of difference vectors based on randomised omega wolves. This results in two exploration processes that run in synergy, one following the other, and target various areas of the search space: all the new positions of the wolves enter the population pool in the standard GWO process, and this is followed by a re-exploration in the Clb-GWO process that prioritises selecting only the fitter positions while involving other random positions to avoid stagnation of the population members. The existing population is positioned around the alpha, beta and delta from the social hierarchy-based hunt, and the competitive learning system, through the inclusion of random members from the population pool, allows the wolves to be re-distributed far away from each other and to converge steadily at the best position towards the end of the iterations, which improves the evasion of local entrapment and the convergence speed. A mechanism like this provides ample time for the search processes to explore a majority of the search space and dedicates the remaining iterations to exploiting it to improve the accuracy of the solutions. This technique is different from the conventional approach of dedicating two separate phases to either exploration or exploitation only. In Clb-GWO, the linear and adaptive competitive learning strategies ensure a smooth transition from exploration to exploitation, and the elitism-promoting population selection prevents inferior solutions from influencing the other omega wolves and enables the system to gain better insight into the current state of the population members and their knowledge of the search space.

Several recent meta-heuristics (WOA [41], SSA [43], ChoA [25], HHO [20], SMA [29], ALO [39] etc.) have adopted a similar exploration-exploitation split, allowing the population members to diverge from each other for the first half of the iterations and converge for the latter half. These meta-heuristics have shown good performance with better population diversity and decent convergence behaviour. In a theoretical sense, there could be several other advantages to such an exploration-exploitation split, but in a practical sense there are many limitations as well, and research towards a perfect exploration and exploitation system still leaves a lot of room for development.

The minority population groups complement each other, with the top 10 % of the population being placed closer to the dominant wolves and the bottom 10 % spreading around the entire search space, promoting better diversity in the population. The interesting aspect is that the population from one group can help the other to find new potential areas to explore or to avoid local entrapment around a single point at various stages of the search.

Hence, to encourage better exploration and good population diversity, the two processes handle the population members differently with different selection mechanisms, resulting in a robust ensemble of strategies that aid each other and allow the population of omegas to learn and adapt to complex search spaces and to explore and exploit them optimally.

3.3.6 Time complexity and computational complexity

The position update in Clb-GWO occurs twice per iteration. First, the standard GWO procedure assigns positions to all the wolves in the pack after evaluating the fitness of the wolves from the previous iteration to determine the alpha, beta and delta wolves. This is followed by a second position update through one of the competitive learning strategies. The greedy selection then assesses the new positions of all the wolves produced by the competitive learning strategy to decide whether to preserve the fitter solutions or discard the inferior ones. The fitness evaluation and the position updates are thus performed twice per iteration for all the members of the population pool, so Clb-GWO performs double fitness evaluations (DFEs) per iteration. For T iterations with a population size of N, each wolf having D decision variables/dimensions, the computational complexities of the individual phases are as follows. In addition to the total computational complexity of the standard GWO process, which is O(N×(D + T+(T × D))), the competitive learning phase has an additional computational complexity of O(T×(N × D)), followed by the fitness evaluation of all the new positions for the greedy selection with O(N × T). Summing up, the total computational complexity is O(N×(D + 2 × (T+(T × D)))).
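As a check on the total quoted above, the three contributions can be added term by term. This only restates the quantities given in the text; the constant factors are kept solely to match the stated form.

$$ \underbrace{N\left(D+T+ TD\right)}_{standard\ GWO}+\underbrace{TND}_{competitive\ learning}+\underbrace{NT}_{greedy\ selection}=N\left(D+2T+2 TD\right)=N\left(D+2\left(T+ TD\right)\right) $$

In asymptotic terms the dominant term remains N × T × D, so the second phase changes the constant factor rather than the order of growth.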

In the same manner, the time complexity of Clb-GWO is measured considering its total run time, \( {t}_{total} \), for one independent run, as shown in Eq. (24).

$$ {t}_{total}={t}_1\times {O}_1+{t}_2\times {O}_2+\dots +{t}_N\times {O}_N $$
(24)

where, \( {t}_1,{t}_2,\dots, {t}_N \) are the computational times needed by GWO to complete the various operations \( {O}_1,{O}_2,\dots, {O}_N \) for N number of wolves. The various operations and the time requirements are presented in Table 2.

Table 2 The time complexity of Clb-GWO algorithm

Therefore, based on analysis from Table 2, the time complexity of Clb-GWO is O(N).

4 Results and discussion

Extensive testing and analysis employing standard benchmark functions, standard engineering problems and real-world complex optimization problems are carried out to evaluate the performance potential of the proposed algorithm. The benchmarking tests include the CEC2020 benchmark functions (dimensions set to 5, 10, 15 and 20) to assess the algorithm’s susceptibility to the curse of dimensionality and the performance improvement with respect to a higher number of function evaluations, and the CEC2019 benchmark functions to evaluate the algorithm’s capability to avoid local entrapment and premature convergence. This is followed by the application of Clb-GWO and the other competitor algorithms to multi-layer perceptron training for five classification datasets and three function approximation datasets.

All the experiments considered for the current work are performed on an Ultrabook running Microsoft Windows 10® Pro (Version 20H2 - OS Build 19042.867) with 16 Gigabytes of DDR3 RAM, powered by an Intel(R) Core(TM) i7-4700MQ quad-core CPU @ 2.40GHz. MATLAB R2020a is used to code all the algorithms for all the experiments in the comparative analysis.

4.1 Description of benchmark functions and performance evaluation criteria

The performance evaluation criteria for all the fifteen algorithms including Clb-GWO for the different benchmarking scenarios (CEC2020, CEC2019 benchmarking suites) are as follows. The purpose of the two benchmarking suites and their importance in the validation of meta-heuristics is specified in Table 3.

  a. The average (mean) and the standard deviation values are obtained based on 30 independent runs for all the algorithms in comparison. In addition, the best and the worst fitness values are provided in certain cases (CEC2019 benchmarking suite, standard engineering problems, the power flow optimization problems).

  b. The NFEs are modified for each problem type and the population size is set based on the NFEs. These details are provided prior to the results of every test case. The computational times for CEC2020 are computed as per the documentation in [72] and the average computational times are provided for the other tests.

  c. No additional tuning modifications to the algorithm-specific parameters have been made for the entire benchmarking and real-world complex optimization problems.

  d. The first statistical test, i.e., Wilcoxon’s rank-sum test at a 0.05 significance level, is performed for Clb-GWO against the other algorithms. The symbol “+” denotes better performance of the other algorithm with respect to Clb-GWO, “≈” denotes similar performance and “-” denotes inferior performance.

  e. The second statistical test, i.e., a ranking test through a non-parametric Friedman’s test, is performed to rank the best-performing algorithms.
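The ranking step behind criterion (e) can be illustrated with a short sketch that computes the average rank of each algorithm over a set of problems, with tied results sharing the mean of the ranks they occupy. This covers only the mean ranks used to order the algorithms, not the Friedman test statistic or its p-value; the function name `average_ranks` and the example numbers are hypothetical.

```python
def average_ranks(scores):
    """Average rank of each algorithm across problems (lower score is better).

    scores[p][k] is the mean error of algorithm k on problem p.
    """
    n_alg = len(scores[0])
    totals = [0.0] * n_alg
    for row in scores:
        order = sorted(range(n_alg), key=lambda k: row[k])
        i = 0
        while i < n_alg:
            j = i
            # Extend j over the block of algorithms tied with position i
            while j + 1 < n_alg and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2.0 + 1.0  # tied entries share the mean rank
            for k in order[i:j + 1]:
                totals[k] += shared
            i = j + 1
    return [total / len(scores) for total in totals]


# Hypothetical mean errors of three algorithms on two problems
example = [[0.1, 0.4, 0.9],
           [0.2, 0.2, 0.7]]
ranks = average_ranks(example)
```

Here the first algorithm receives ranks 1 and 1.5, the second 2 and 1.5, and the third 3 and 3, so the first algorithm is ranked best on average.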

Table 3 Significance of the CEC2020 and CEC2019 test suites in the validation of meta-heuristics

4.1.1 Algorithms in the benchmarking framework

  • The performance of Clb-GWO is compared and validated against the standard GWO algorithm from 2014 and five of its latest state-of-the-art variants whose description is provided in Table 4.

  • Additionally, two state-of-the-art advanced meta-heuristics within swarm intelligence, namely CLPSO and GABC, have been employed to assess the performance of the proposed method. A brief description of these two meta-heuristics is provided in Table 4.

  • In addition to the aforementioned variants of GWO, two modern meta-heuristics are selected for the testing and validation process. A brief description of these two modern meta-heuristics is provided in Table 4.

  • To assess the performance of the proposed methods and rank them for each benchmarking suite, the statistical results of winners/top-performing algorithms are also added in their sub-sections to provide a comprehensive analysis of the current standings of the proposed method.

Table 4 Description of the state-of-the-art meta-heuristics used in the comparative analysis

4.1.2 Tuning settings of the algorithms

To ensure that a fair comparison is achieved, it is required to set/tune the algorithm-specific parameters (tuning parameters) appropriately to extract the best performance. Hence, after a meticulous review of the various algorithms’ performances, the following tuning settings have been finalized to ensure that the chosen algorithms deliver their best performance to the fullest of their potential. Please note that the values of the tuning parameters provided in Table 30 (Appendix) remain the same for the entire benchmarking process and real-world problems tackled in the remainder of the manuscript.

The basic parametric tuning for all the algorithms is shown in Table 5.

Table 5 Description of the basic tuning parameters for all the algorithms used in the comparative analysis

4.2 CEC2020 benchmarking suite

The first set of benchmarking tests is performed using the CEC2020 benchmarking suite with 5, 10, 15 and 20 dimensions as per the competition rules for the functions described in Table 6. The benchmarking allows for the exponential growth of the computational resources (number of function evaluations, NFEs) with the increase in the dimensionality of the test functions and their complexity. It comprises 10 scalable benchmark problems within the search range \( {\left[-100,100\right]}^D \), with the global optimum shifted and the landscape rotated using a rotation matrix generated from standard normally distributed entries by Gram-Schmidt ortho-normalization.
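The shift-and-rotate construction described above can be sketched as follows. This is a minimal illustration and not the official CEC2020 code: the helper names, the sphere base function and the shift vector are assumptions chosen only to show the mechanism.

```python
import random


def random_rotation(dim, rng):
    """Orthonormal matrix built from Gaussian entries by Gram-Schmidt."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            # Remove the component of v along each accepted basis vector.
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > 1e-12:  # skip (practically impossible) degenerate draws
            basis.append([vi / norm for vi in v])
    return basis


def shifted_rotated(base_function, shift, rotation):
    """Wrap base_function so that it is evaluated at R (x - shift)."""
    def wrapped(x):
        z = [xi - si for xi, si in zip(x, shift)]
        z = [sum(r * zi for r, zi in zip(row, z)) for row in rotation]
        return base_function(z)
    return wrapped


def sphere(z):
    """Simple base function used only for this illustration."""
    return sum(v * v for v in z)
```

Because the rotation is orthonormal, it changes the orientation of the landscape but not the location of the shifted optimum.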

Table 6 Description of the test functions from the CEC2020 benchmarking suite

The termination criteria for the CEC2020 test functions as defined by the documentation are described in Table 7.

Table 7 Termination criteria for the CEC2020 benchmarking suite

4.2.1 Results of benchmarking (CEC2020 test suite)

In this sub-section, Clb-GWO is compared against the standard GWO, five of its latest and advanced variants and two of the modern meta-heuristics. The benchmarking results (mean and standard deviation) are shown in Table 8, the p-values of Wilcoxon’s rank-sum test and the results are shown in Table 9 and the results of Friedman’s non-parametrical test are shown in Table 10.

Table 8 The values of mean and standard deviation (std) of the CEC2020 benchmarking suite comparing Clb-GWO with the ten competitor algorithms
Table 9 Results of the Wilcoxon’s rank sum test comparing Clb-GWO with the ten competitor meta-heuristics for the CEC2020 benchmarking suite
Table 10 Ranking the algorithms based on Friedman’s rank for the CEC2020 benchmarking suite (Clb-GWO vs ten competitor meta-heuristics)
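The Friedman ranking used for Table 10 can be sketched as follows. This is a minimal illustration, assuming one mean error per algorithm and test function; the function names are hypothetical and the statistic is computed without a tie correction.

```python
def friedman_mean_ranks(errors):
    """errors[name] is a list of mean errors, one entry per test function."""
    names = list(errors)
    n_problems = len(errors[names[0]])
    totals = {name: 0.0 for name in names}
    for p in range(n_problems):
        ordered = sorted(names, key=lambda nm: errors[nm][p])
        i = 0
        while i < len(ordered):
            # Algorithms with equal error share the average rank.
            j = i
            while (j + 1 < len(ordered)
                   and errors[ordered[j + 1]][p] == errors[ordered[i]][p]):
                j += 1
            for k in range(i, j + 1):
                totals[ordered[k]] += (i + j) / 2 + 1
            i = j + 1
    return {name: totals[name] / n_problems for name in names}


def friedman_statistic(mean_ranks, n_problems):
    """Chi-square form of the Friedman statistic from the mean ranks."""
    k = len(mean_ranks)
    sum_sq = sum(r * r for r in mean_ranks.values())
    return 12 * n_problems / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)
```

The algorithm with the lowest mean rank is ranked first.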

Analysis of results

  • Clb-GWO outperformed the standard GWO and its three variants by a significant margin and ranked first in Friedman’s test.

  • However, the performance was poor for five out of the ten functions with 5 dimensions, as the algorithm was slower to exploit within the given NFEs. This is on account of the algorithm’s tendency to promote exploration and population diversity over exploitation, leading to slower convergence. The converging group/Minority group 2 of Clb-GWO is solely responsible for exploitation for most of the search, and the minority group’s tendency to exploit very late in the search process is also responsible for the slow convergence with limited computational resources.

  • The performance improved greatly for the functions with a higher number of dimensions as the algorithm was able to benefit from the increased number of function evaluations enabling deeper exploitation towards the end of the search. This is evident with F1, wherein Clb-GWO reached the minimum error value of 1E-08 for 10D, 15D and 20D while the others had their solutions too far away from the global optimum.

  • Apart from F2, the performance of Clb-GWO has been good for the others indicative of the influence of the difference vectors in maintaining population diversity and promoting exploration in complex landscapes.

  • The performance of the other meta-heuristics was not on par with Clb-GWO. The increase in the problem dimensions led to entrapment of the population members, which failed to escape, leading to poor diversity in the population. The sorting techniques adopted by them increased the computational times and failed to produce a notable improvement in the performance. Another reason could be traced back to the presence of a large number of tuning parameters with no adaptive control strategy. Out of the fourteen chosen algorithms, WOA, GWO and IGWO required no additional tuning parameter settings, while MEGWO, SOGWO and ChOA required the tuning of special algorithm-specific parameters (2 to 4 parameters) whose values have been set based on their corresponding publications. Although the empirical setting favoured performance for problems with a lower number of dimensions as seen in [12, 18, 27, 48, 63], the same performance was not reflected for the larger dimensional problems. One particular reason for this lies in the formulation of the solution set, wherein not every dimension/decision variable has achieved its global best value, leading to an imbalance in the optimization and thereby producing highly non-optimal solutions.

  • Clb-GWO’s performance is consistent through the testing with the increase in the number of problem dimensions having very little effect on the efficiency of the algorithm as seen in Table 8. The other algorithms’ performance dwindled over the increase in the dimensions with IGWO being the most hard-hit followed by SOGWO and MEGWO. Although IGWO features the greedy selection strategy, the neighbourhood construction strategy complicates the nearest neighbour search in high dimensional space. It is not possible to quickly reject candidates by using the difference in one coordinate as a lower bound for a distance based on all the dimensions. This system can lead to various phenomena that arise when analysing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings. This is evident by the performance of IGWO being better at handling lower-dimensional problems as seen from the benchmarking.

  • Furthermore, comparing Clb-GWO’s performance with IGWO and MEGWO for the 5D test case, the results are in favour of the latter as their exploitation capabilities at lower dimensionalities aid their convergence, accelerating intensification. One more reason for the slow convergence of Clb-GWO for the 5D test case is the computational budget limitation to 50,000 NFEs which limits Clb-GWO from adequately transitioning from exploration to exploitation. Clb-GWO’s structure is majorly inclined towards exploration over exploitation as premature exploitation tends to be a major contributor to stagnation. This is well established by the fact that Clb-GWO attained good convergence for the rest of the test cases i.e., 10, 15 and 20D cases as it outperformed all the competitor algorithms avoiding entrapment to a greater degree. Considering that most real-world scenarios have higher dimensionality (D > 10 up to 1000), diversification through enhanced exploration is the key to achieving the best optimality.
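The remark above on nearest-neighbour search in high-dimensional spaces can be illustrated with a small experiment. This is a sketch under stated assumptions: points are drawn uniformly from the CEC search range, and `relative_contrast` is a hypothetical helper that is not part of IGWO or Clb-GWO.

```python
import math
import random


def relative_contrast(dim, n_points, rng):
    """(farthest - nearest) / nearest distance from one random query point."""
    query = [rng.uniform(-100.0, 100.0) for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.uniform(-100.0, 100.0) for _ in range(dim)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


rng = random.Random(1)
low_dim = relative_contrast(2, 200, rng)     # neighbours are well separated
high_dim = relative_contrast(200, 200, rng)  # distances concentrate
print(low_dim, high_dim)
```

When the contrast is small, the nearest and farthest members are almost equally distant, so a neighbourhood built on Euclidean distance carries little information about which members are truly close.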

The time complexity of all algorithms in the comparative analysis has been calculated as per the CEC2020 documentation and is given in Table 11.

Table 11 The time complexity (in ms) of the various algorithms in the comparative analysis for the CEC2020 benchmarking suite

The computational times for Clb-GWO are on the higher side as it employs multiple strategies and adaptive measures that work in coordination with each other. Clb-GWO’s computational times are slightly lower than those of SOGWO and IGWO, which require individual distance evaluations of all population members with respect to each other. The higher computational times can be overcome using the parallel computational capabilities of modern computing systems, as the multi-population-based algorithmic structure of Clb-GWO can be modified to allow both population pools to operate independently of each other. The same modification cannot be made in the canonical GWO or its variants as they follow a linear structure that requires the computations to take place in a top-to-bottom order. The parallel compute capabilities of Clb-GWO can be extended to CPUs with virtualization technologies to further boost the computational speeds and achieve lower computational times, which is a constraint in the standard GWO and its variants.
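The idea of letting the two population pools work independently can be sketched with Python's standard `concurrent.futures` module. This is only an illustration of the structure: the pool contents, the sphere objective and the helper names are assumptions. In standard CPython, threads do not speed up CPU-bound evaluations, so a process pool would be the natural choice for real gains.

```python
from concurrent.futures import ThreadPoolExecutor


def sphere(x):
    """Placeholder objective used only for this illustration."""
    return sum(v * v for v in x)


def evaluate_pool(pool, objective):
    """Evaluate every member of one population pool."""
    return [objective(member) for member in pool]


def evaluate_pools_concurrently(pools, objective):
    """Submit each pool as an independent task and collect the fitness lists."""
    with ThreadPoolExecutor(max_workers=len(pools)) as executor:
        futures = [executor.submit(evaluate_pool, pool, objective)
                   for pool in pools]
        return [future.result() for future in futures]
```

The results are returned in the same order as the pools, so each fitness list can be written back to its own sub-population.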

4.3 CEC2019 benchmarking suite

The CEC2019 test suite from the Special Session and Competition on Single Objective Numerical Optimization in 2019 introduced 10 special functions to be minimized with limited control parameter “tuning” for each function [54]. The test functions were meticulously crafted with multiple local optima and one unique global optimal solution to ensure that the exploratory prowess and local minima avoidance characteristics are put to the test. Similar to composition functions from the previous CEC sessions, the CEC2019 benchmark suite presents challenging exploratory conditions with their landscape shifted and rotated to further complicate the search process of an algorithm. It is to be noted that these functions are extremely challenging for any global optimization algorithm as they are formulated to trap the algorithms at local best positions, especially algorithms designed with a tendency to converge to the central point of the search landscape. Additionally, these problems have a large number of dimensions, making the search process even harder and more complex, and only the algorithms with a higher exploratory tendency over the entire search space can determine the global optimal solution or generate solutions in close proximity to the global best.

The description of the CEC2019 benchmarking suite is shown in Table 12.

Table 12 Description of the 10 CEC2019 benchmark functions (composition functions) used to determine the algorithms’ ability to avoid local entrapment

The benchmarking results (mean and standard deviation) are shown in Table 13, the p-values of Wilcoxon’s rank-sum test and the results are shown in Table 14, the results of Friedman’s non-parametrical test are shown in Table 15 followed by the average computational times in Table 16. A maximum of 1,000,000 function evaluations were allowed for all the algorithms with 30 independent runs.

Table 13 The values of best, worst, mean and the standard deviation of the eleven algorithms for the CEC2019 benchmark functions
Table 14 The p-values obtained from the Wilcoxon’s rank sum test comparing Clb-GWO with the ten algorithms for the CEC2019 benchmark functions
Table 15 Ranking the eleven algorithms based on Friedman’s test for the CEC2019 benchmark functions
Table 16 Comparison of the computational times (ms) of the eleven algorithms for the CEC2019 benchmark functions

Analysis of results

  • Clb-GWO’s entrapment evasion capabilities helped it attain higher accuracy and precision for the CEC2019 test suite. Compared to its competitors, it recorded fewer instances of stagnation and reached the global optima for most test functions. While its competitors could only exploit the unimodal cases, i.e., function C1, Clb-GWO was effective for both unimodal and multi-modal functions.

  • No tuning modifications except the population size and iterations were made to obtain the results.

  • It is interesting to note that the performance of Clb-GWO has been poor for C7 (Shifted and Rotated Schwefel’s Function) which is the same function as F2 in the CEC2020 benchmarking suite indicating its weakness in exploring its complex landscape.

  • Compared to the variants of GWO, Clb-GWO had the best optimal fitness for C1, C4, C5, C7, C8, C9 and C10. It is indicative of the algorithm’s exploratory and local minima avoidance capabilities. The proposed method generated solutions closer to the global optimal solutions for C4, C5, C6, C7, C8 and C10. For the functions C2, C4 and C5, the proposed method outperformed the other variants of GWO and the modern meta-heuristics.

  • The learning and adaptive learning schemes are key to ensuring that a thorough exploration of the search space is achieved, with the divergence and convergence groups helping Clb-GWO to prevent clustering at local optimal points and to extend the duration of exploration for as long as possible in the search process.

  • The non-linear hunting aids in sampling vast areas of the complex landscapes promoting diversity as the exploration switches to exploitation towards the end of the search process. The advantage of the non-linear hunting is evident for functions C2, C6 and C10 wherein the variants of GWO had fallen prey to local entrapment.

  • C1 and C6 were the easiest followed by C10 and C2. Functions C4, C7, C8 and C9 were the most challenging at achieving the perfect score with little to no improvement in the accuracy of the digits within the maximum NFEs.

  • As explained earlier for the CEC2020 test suite, the higher computational times of Clb-GWO on account of its multi-population approach and adaptive measures can be minimized through the adoption of parallel computational methods and virtualization technologies that allow multiple CPU cores to share the computational burden. The standard GWO, on account of its linear algorithmic approach, requires less overall computational time but cannot be extended to parallel systems as it has a single population pool whose members are updated one after the other without any sub-divisions amongst them.

4.4 Sensitivity to tuning parameters

Clb-GWO incorporates two tuning parameters, namely the non-linear control vector (\( \overrightarrow{\phi} \)), which balances the exploration and exploitation phases, and the Competition rate (comp), which is used to adapt the competitive learning strategies over the problem landscapes. These values must be determined through empirical analysis for the best exploration and exploitation trade-off and to prevent local entrapment. Additionally, the ratio of the population count (N) to the number of iterations (T) has to be assessed to deduce the optimal population count and the number of iterations required by both algorithms to perform effectively. Hence an experimental setup through the CEC2020 benchmarking suite with 10 dimensions is chosen with 1,000,000 NFEs. Three experiments are conducted to determine the setting of the tuning parameters (non-linear control vector (\( \overrightarrow{\phi} \)), Competition rate (comp) and the N:T ratio) that gives the best trade-off between exploration and exploitation, and the mean and standard deviations of Clb-GWO are recorded for 30 independent runs.

4.4.1 Influence of the non-linear control vector (\( \overrightarrow{\phi} \))

The non-linear control vector (\( \overrightarrow{\phi} \)) balances the exploration and exploitation phases in the modified GWO procedure, ensures population diversity and prevents the algorithm from drifting towards the geometric centre of the landscape. It is set to decrease exponentially from 1 to 0 over the course of iterations after a comparative analysis of other linear, non-linear and chaotic vectors as described below.

Firstly, its value is tested using sinusoidal, exponentially decreasing and oscillating vectors in the range 0 to 1. Additionally, two chaotic maps, the Logistic map and the Tent map (which have been the most frequently adopted chaotic maps in several chaotic variants of GWO), are chosen to generate values between 0 and 1. The mean and standard deviations of the selected test functions for the two experimentations are provided in Table 17 and it can be observed that the exponentially decreasing vector in the range 1 to 0 provided a better balance of population diversity and intensification.
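The candidate schedules compared above can be sketched as follows. The exact expressions used in the study are not given in this section, so everything here is an assumption: the normalized exponential form with decay constant `k`, the logistic map with parameter 4, and a skewed tent map with break point 0.7 (one commonly used form).

```python
import math


def exp_decay(t, total, k=5.0):
    """Exponentially decreasing value, exactly 1 at t = 0 and 0 at t = total."""
    floor = math.exp(-k)
    return (math.exp(-k * t / total) - floor) / (1.0 - floor)


def logistic_map(x0, n):
    """n successive values of the logistic map x -> 4 x (1 - x)."""
    values, x = [], x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        values.append(x)
    return values


def tent_map(x0, n, breakpoint=0.7):
    """n successive values of a skewed tent map on [0, 1]."""
    values, x = [], x0
    for _ in range(n):
        if x < breakpoint:
            x = x / breakpoint
        else:
            x = (1.0 - x) / (1.0 - breakpoint)
        values.append(x)
    return values
```

The exponential schedule is monotone, so it moves the search steadily from exploration to exploitation, whereas the chaotic maps keep jumping across the whole interval for the entire run.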

Table 17 Comparison of mean and the standard deviation of Clb-GWO for the variations in the value of the non-linear control vector (\( \overrightarrow{\phi} \)) for the CEC2020 benchmarking suite

4.4.2 Influence of competition rate (comp)

The Competition rate (comp) is one of the key parameters to be tuned appropriately to ensure that the algorithm switches between linear and adaptive learning based on the complexity of the landscapes. The Competition rate is based on the failure rate or the success rate of the given learning strategy. The failure rate is a measure of the number of consecutive failures associated with the learning strategy and the success rate is the number of consecutive successes. The given learning strategy must fail a higher number of times before switching to the other learning strategy as the inferior solutions are ruled out from entering the population pool. At the same time, setting the success rate to a higher value than the failure rate deprives the population of diversity and leads to stagnation associated with the repetition of the same learning strategy. This setting is crucial because the linear learning strategy is capable of balancing the exploration and exploitation on its own based on the linear descent rate and is more likely to drive the population to the geometric centre. On the other hand, adaptive learning is dependent on the three dominant wolves and leads to entrapment if not adjusted. Hence the failure rate and success rate follow the ratio fr:sr = 2:1 and their sum gives the competition rate, which is always a multiple of 15. To assess the variation of the optimization outcomes, testing is performed with the competition rate set to 15, 30 and 45 respectively and the results are shown in Table 18. It can be observed that setting comp to 30 with ‘fr’ at 20 and ‘sr’ at 10 yields the best balance between the learning strategies.
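One way to read the switching rule above is sketched below. The class name, the counter resets and the choice to switch on either limit are assumptions made for illustration; only the split fr:sr = 2:1 with comp = fr + sr comes from the text.

```python
class StrategySwitch:
    """Toggle between two learning strategies from consecutive outcomes."""

    def __init__(self, comp=30):
        self.fail_limit = 2 * comp // 3   # fr, e.g. 20 when comp = 30
        self.success_limit = comp // 3    # sr, e.g. 10 when comp = 30
        self.strategy = "linear"
        self.fails = 0
        self.successes = 0

    def record(self, improved):
        """Register one trial and return the strategy to use next."""
        if improved:
            self.successes += 1
            self.fails = 0
        else:
            self.fails += 1
            self.successes = 0
        # Assumed rule: switch after fr consecutive failures or
        # sr consecutive successes of the current strategy.
        if (self.fails >= self.fail_limit
                or self.successes >= self.success_limit):
            self.strategy = ("adaptive" if self.strategy == "linear"
                             else "linear")
            self.fails = 0
            self.successes = 0
        return self.strategy
```

With comp = 30, a strategy is abandoned only after 20 consecutive failures, while 10 consecutive successes are enough to hand over to the other strategy and avoid repeating the same one indefinitely.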

Table 18 Comparison of mean and the standard deviation of Clb-GWO for the variations in the value of Competition rate (comp) for the CEC2020 benchmarking suite

4.4.3 Influence of the N:T ratio

The optimal ratio of the population size to the number of iterations for a given NFE budget can play an important role in determining how effectively these numbers translate to optimality. Clb-GWO is formulated with a dual search strategy and relies on two function evaluations in every iteration. Four ratios have been experimented with to determine the ratio that makes the best use of the computational resources. From Table 19, it has been observed that the ratios 1:4 and 1:8 have been the most successful at delivering a balance between the global and local search within the set NFEs.
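The relation between the budget and the N:T ratio can be worked out with a short helper. This is a sketch that assumes two function evaluations per population member per iteration, so that NFEs ≈ 2·N·T; `split_budget` is a hypothetical name.

```python
import math


def split_budget(nfes, iterations_per_member, evals_per_member=2):
    """Largest N (and T = ratio * N) whose total evaluations fit in nfes."""
    n = math.isqrt(nfes // (evals_per_member * iterations_per_member))
    t = iterations_per_member * n
    return n, t


n, t = split_budget(1_000_000, 4)   # N:T = 1:4
print(n, t, 2 * n * t)              # 353 1412 996872
```

Under this assumption, a 1:4 ratio with 1,000,000 NFEs corresponds to roughly 353 members and 1412 iterations.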

Table 19 Comparison of mean and the standard deviation of Clb-GWO for the variations in the N:T ratio for the CEC2020 benchmarking suite

5 Multi-layer perceptron training

In order to demonstrate the effectiveness of the proposed algorithm in handling complex real-world problems with higher problem dimensions, five classification datasets and three function approximation datasets for the MLP training from the recent literature have been considered. The same algorithms are chosen with the previously set configurations for the algorithm tuning settings and a comprehensive comparative analysis is provided below.

5.1 Problem description

The MLP training is accomplished through Feedforward neural networks (FNNs) with input, hidden and output layers. The optimization algorithms are then integrated to determine the optimal combination of weights and biases within the given upper and lower bounds to achieve the highest classification/prediction accuracy. The optimal solution vector to be determined by the optimization algorithm is an array of weights and biases represented by Eq. (25).

$$ \overrightarrow{S}=\left\{\overrightarrow{W},\overrightarrow{B}\right\}=\left\{{W}_{1,1},{W}_{1,2},\dots, {W}_{n,n},{B}_1,{B}_2,\dots, {B}_n\ \right\} $$
(25)

where, \( \overrightarrow{S} \) denotes the solution vector containing the weights and biases, \( \overrightarrow{W} \) is the sub-vector containing all the weights and \( \overrightarrow{B} \) is the sub-vector with all the biases, \( {W}_{i,j} \) denotes the connection weight between the ith input node and the jth hidden node, n is the total number of input nodes, \( {B}_j \) denotes the bias (threshold) of the jth hidden node, j = 1, 2, …, h indexes the hidden nodes and h denotes the total number of hidden nodes.

Following it, the weighted sums of inputs are calculated as per Eq. (26).

$$ {\omega}_j=\sum \limits_{i=1}^n\left({W}_{i,j}.{I}_i\right)-{B}_j $$
(26)

where, \( {\omega}_j \) is the weighted sum of inputs for the jth hidden node and \( {I}_i \) is the ith input.

The output of every individual hidden node is then computed as shown in Eq. (27).

$$ {H}_j= Sigmoid\left({\omega}_j\right)=\frac{1}{\left(1+\mathit{\exp}\left(-{\omega}_j\right)\right)} $$
(27)

where, \( {H}_j \) is the output of the jth hidden node.

The weighted sum outputs of the hidden nodes are then calculated as per Eq. (28).

$$ {o}_k=\sum \limits_{j=1}^h\left({w}_{j,k}.{H}_j\right)-{B_k}^{\prime } $$
(28)

where, \( {o}_k \) is the weighted sum of inputs for the kth output node, \( {w}_{j,k} \) denotes the connection weight between the jth hidden node and the kth output node, and \( {B_k}^{\prime } \) denotes the bias (threshold) of the kth output node.

The output of every output node is then computed as per Eq. (29).

$$ {O}_k= Sigmoid\left({o}_k\right)=\frac{1}{\left(1+\mathit{\exp}\left(-{o}_k\right)\right)} $$
(29)

The objective function is formulated as the Mean Square Error (MSE): a given set of training samples is applied to the MLP and the difference between the desired output and the value obtained from the MLP is measured. Finally, the performance of an MLP is evaluated based on the average of MSE over all the training samples as denoted by Eq. (30).

$$ \mathit{\operatorname{Minimize}}:F\left(\overrightarrow{S}\right)=\overline{MSE} $$
(30)

where,

$$ \overline{MSE}=\sum \limits_{t=1}^T\frac{\sum \limits_{y=1}^Y{\left({a}_y^t-{d}_y^t\right)}^2}{T} $$

where, t = 1, 2, …, T denotes the current training sample and T denotes the total number of training samples, y = 1, 2, …, Y denotes the current output and Y denotes the total number of outputs, \( {a}_y^t \) is the actual output of the yth output unit when the tth training sample appears in the input, and \( {d}_y^t \) is the desired output of the yth output unit when the tth training sample appears in the input.
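Equations (25) to (30) can be put together in a short sketch of the objective function that an optimizer would minimize. The ordering of the weights and biases inside the flat solution vector, and all function names, are assumptions made for illustration; the thresholds are subtracted as in Eqs. (26) and (28).

```python
import math


def sigmoid(v):
    """Numerically safe logistic function, Eqs. (27) and (29)."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def decode(solution, n_in, n_hidden, n_out):
    """Split the flat vector of Eq. (25) into weights and biases."""
    it = iter(solution)
    w_in = [[next(it) for _ in range(n_hidden)] for _ in range(n_in)]    # W[i][j]
    b_hidden = [next(it) for _ in range(n_hidden)]                       # B[j]
    w_out = [[next(it) for _ in range(n_out)] for _ in range(n_hidden)]  # w[j][k]
    b_out = [next(it) for _ in range(n_out)]                             # B'[k]
    return w_in, b_hidden, w_out, b_out


def forward(inputs, params):
    """Outputs of the network for one input pattern, Eqs. (26) to (29)."""
    w_in, b_hidden, w_out, b_out = params
    hidden = [
        sigmoid(sum(w_in[i][j] * inputs[i] for i in range(len(inputs)))
                - b_hidden[j])
        for j in range(len(b_hidden))
    ]
    return [
        sigmoid(sum(w_out[j][k] * hidden[j] for j in range(len(hidden)))
                - b_out[k])
        for k in range(len(b_out))
    ]


def mean_mse(solution, samples, n_in, n_hidden, n_out):
    """Average squared error over all training samples, Eq. (30)."""
    params = decode(solution, n_in, n_hidden, n_out)
    total = 0.0
    for inputs, desired in samples:
        actual = forward(inputs, params)
        total += sum((a - d) ** 2 for a, d in zip(actual, desired))
    return total / len(samples)
```

For a network with n inputs, h hidden nodes and m outputs, this layout gives a solution vector of length n·h + h + h·m + m, which is the problem dimension seen by the optimizer.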

5.2 Experimental setup

A detailed description of the datasets is presented in Table 20 and the optimization is terminated upon reaching the maximum NFEs for 10 independent runs. In order to have a fair comparison, all the algorithms are given 25,000 NFEs with an initial population of 50. The statistical results of the optimization are given in Table 21 and Table 22 for the classification datasets and the function approximation datasets respectively. The computational times are given in Table 23.

Table 20 Description of the five classification datasets and three function approximation datasets used for MLP training
Table 21 Statistical results and the classification rates (CR) of the 11 algorithms for the MLP training with the five classification datasets
Table 22 Statistical results and the test errors (TE) of the 11 algorithms for the MLP training with the three function approximation datasets
Table 23 Computational times (seconds) of the 11 algorithms for the MLP training for all the datasets

It is evident from Table 21 that the classification training and testing efficiencies of Clb-GWO are excellent with similar computational times to its competitors. The performance of Clb-GWO was best for the XOR and balloon datasets and it managed to achieve the least MSE for both with 100% classification rates. Furthermore, the MSE obtained by Clb-GWO has been the least for all five of the classification datasets indicating its superiority in training neural networks. One important outcome of the optimization has been the effective handling of problem dimensions by Clb-GWO and this is clearly demonstrated in the Heart dataset where the highest classification rate of 87.5% was achieved by Clb-GWO with the least training MSE. Compared to the canonical GWO, the classification rate has more than doubled while the other variants of GWO excluding SOGWO and MEGWO fail to achieve higher classification rates. A major reason for their poor performance is the lack of strong exploration capabilities and of the adaptive frameworks needed to sustain the grey wolves’ hunt during the search process. Furthermore, the Heart dataset helps evaluate the proposed algorithm’s immunity to the curse of dimensionality and highlights the excellent solution diversification system in Clb-GWO. In addition to these, the lower standard deviation rates for Clb-GWO help affirm its consistency in dealing with multi-modal large dimensional problems with enhanced exploration and exploitation.

The performance of Clb-GWO for the function approximation datasets (from Table 22) was more or less similar to its competitors and it can be observed that the proposed method presents a slight advantage in training and testing with the least possible error rates. However, Clb-GWO remained the most consistent of them with the least possible deviation rates for all three cases. Given that the search landscapes for the function approximation datasets are composed of multiple peaks and valleys, most algorithms often converge to local optimal solutions as seen with the current testing. The convergence to the global optimal solution in these cases requires higher computational budgets well above the current considerations, as well as additional training samples to improve the accuracy of testing. Nevertheless, the function-approximation datasets provide a good platform to validate the consistency of the meta-heuristic and it can be concluded that Clb-GWO is the most consistent of the testing group.

The computational times recorded by Clb-GWO are quite similar to the standard GWO despite its adaptive frameworks and multi-strategy approach. A closer inspection reveals that in all the cases except the balloon dataset, the computational times of Clb-GWO are lower than GWO by about 2 to 5%. This is on account of the greedy selection integrated into the competitive learning strategies, which avoids population updates in case of an inferior solution. The standard GWO on the other hand updates all of its population members irrespective of their fitness levels, requiring higher computational times for larger dimensional problems.
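The greedy selection referred to above can be sketched in a few lines. This is a generic illustration for a minimization problem; the function and argument names are hypothetical and the sketch is not taken from the Clb-GWO source.

```python
def greedy_update(population, fitness, index, candidate, objective):
    """Replace member `index` only if the candidate has a lower objective value."""
    candidate_fitness = objective(candidate)
    if candidate_fitness < fitness[index]:
        population[index] = candidate
        fitness[index] = candidate_fitness
        return True   # the member was updated
    return False      # the inferior candidate is discarded, nothing is written
```

An inferior candidate costs one function evaluation but no population update, which is the saving the paragraph above attributes to Clb-GWO.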

Tables 24, 25, 26, 27 and 28 compare the statistical results of Clb-GWO with several other algorithms including GWO, PSO, GA, Ant Colony Optimization (ACO), Evolution Strategy (ES) and Population-based Incremental Learning (PBIL) from [40].

Table 24 Comparison of the Statistical results and the classification rates (CR) of the algorithms from the literature and Clb-GWO for the XOR classification dataset
Table 25 Comparison of the Statistical results and the classification rates (CR) of the algorithms from the literature and Clb-GWO for the Balloon classification dataset
Table 26 Comparison of the Statistical results and the classification rates (CR) of the algorithms from the literature and Clb-GWO for the Iris classification dataset
Table 27 Comparison of the Statistical results and the classification rates (CR) of the algorithms from the literature and Clb-GWO for the Cancer classification dataset
Table 28 Comparison of the Statistical results and the classification rates (CR) of the algorithms from the literature and Clb-GWO for the Heart classification dataset

On comparing Clb-GWO’s performance with the other algorithms from the literature for the five classification datasets, it can be found that the overall efficiency of Clb-GWO is higher as it ranks at the top for all the datasets. Excluding the balloon dataset, Clb-GWO ranked first in terms of training and classification efficiency. Despite the fact that the competitor algorithms for XOR and Balloon datasets utilized half the computational budget, the performance-to-budget ratio of Clb-GWO is better as it recorded better mean and lower deviation rates accounting for its enhanced precision. For the latter, the MSEs and classification rates obtained by Clb-GWO are better than the rest as it utilized half the computational resources. The classification rate of Clb-GWO has never fallen below 85% indicating its superiority in training neural networks with larger datasets. The classification rate achieved by Clb-GWO at 87.5% for the Heart dataset has been the highest so far and it manages to reach the same with half the number of function evaluations compared to the methods from the literature.

A comparison of the performance of other methods such as Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Artificial Neural Network (ANN) from [17] is provided in Table 29.

Table 29 Comparison of performance of various mechanisms for ANN training from the literature and Clb-GWO

From Table 29, the performance of Clb-GWO for MLP-FNN training is on par with the other contemporary mechanisms as it achieved higher classification rates of 99% or above for three datasets. Furthermore, the lower standard deviation rates achieved for the MLP-FNN training indicate the consistency and reliability of Clb-GWO for MLP training.

The convergence curves for the various datasets comparing all the algorithms are provided in Fig. 5.

Fig. 5 Convergence characteristics of the 11 meta-heuristics for the MLP training for (a) XOR dataset (b) Balloon dataset (c) Iris dataset (d) Cancer dataset (e) Heart dataset (f) Sigmoid dataset (g) Cosine dataset (h) Sine dataset (i) Legend

Observing the nature of convergence, the proposed method has quicker convergence capabilities compared to its competitors. The convergence speeds for the five classification datasets have been dominated by Clb-GWO as it surpassed its competitors demonstrating its enhanced exploitation capabilities. It can also be observed that at times of stagnation, the adaptive triggers in Clb-GWO enable it to diversify the population such that stagnation is avoided. This is clearly seen in Fig. 5 (a), (b), (c) and (e) where the MSE drops down in sequences as the feedback system tunes the search direction to prevent entrapment as seen with ChOA and GWO. Similar convergence characteristics were obtained for all the algorithms with the function approximation models. However, Clb-GWO remained the fastest as it managed to converge within the least possible time span.

6 Conclusion

This article realizes an improved meta-heuristic optimization technique known as Clb-GWO with dual search strategies and population sub-division structured in a selective complementary arrangement to improve population diversity through the development of difference vectors and learning strategies. The proposed method has a better performance in handling constrained and unconstrained problems and has been effective at avoiding local entrapment in complex search landscapes. A better balance of exploration and exploitation with accelerated convergence has been witnessed across the various test cases with statistically significant performance.

Furthermore, the benchmarking analysis of the CEC2020 test functions proved that Clb-GWO was quite capable of effectively utilizing the increased computational resources and delivering optimal solutions for the increments in the problem dimensions. The results obtained were statistically significant (Wilcoxon’s rank-sum test) and Clb-GWO ranked first in most test cases in Friedman’s test. Clb-GWO stood immune to the curse of dimensionality and had little to no performance deterioration for the increased number of problem dimensions. Ten special benchmark functions from the CEC2019 benchmark suite were used to validate the exploratory skills and avoidance of local optima with challenging search landscapes comprising several calculatedly placed local optimal solutions. The performance of Clb-GWO in comparison to GWO, its variants and the modern meta-heuristics was better, having mean values closer to the best values and also having the least standard deviation. The proposed method avoided local entrapment with good exploratory capabilities in several cases compared to its competitors, most of which fell prey to the local entrapment, while recording lower computational times in most cases. The MLP training for different cases was dominated by Clb-GWO in terms of optimality, computational times and lower deviation. In terms of solution optimality and accuracy, Clb-GWO’s performance was superior compared to the meta-heuristics from the literature with a higher performance-to-computational-cost ratio. This indicates that the proposed method can effectively handle problems with multiple constraints and a large number of dimensions.

6.1 Merits and demerits

Improvement and enhancement techniques have advanced meta-heuristics and helped researchers push the potential of optimization to new heights. Although each advancement is typically developed for one specific meta-heuristic, it can be tried with other algorithms, which opens up greater opportunities in the pursuit of the perfect optimization paradigm. Although quite successful and efficient in many cases, the improvement techniques have their fair share of criticism and complications. The “No free lunch” theorem states that no single optimization algorithm can be the best for every optimization task: a technique that excels on one class of optimization problems may not perform adequately on other classes. Hence, for a fair and unprejudiced view, the merits and demerits of the proposed method are discussed below.

6.1.1 Merits

  • The increase in problem complexity with the problem dimensions did not affect the performance of Clb-GWO on the CEC2020 functions. The same behaviour was not observed in the standard GWO algorithm, its variants or the two modern meta-heuristics chosen for the comparison, which indicates that Clb-GWO is immune to the curse of dimensionality. This is attributed to the competitive learning phase and the greedy selection strategy, which promote elitism and avoid local entrapment while effectively repositioning the wolves from the modified GWO process. The balance of exploration and exploitation achieved by the various competitive learning strategies is the other reason for its dominant performance in most benchmarking scenarios.

  • The sub-division of the population into majority and minority groups, each dedicated to a specific purpose under the dual search strategy, is a key strength of the proposed method. This structure balances exploration and population diversity against the algorithm’s dependency on the three dominant wolves.

  • A good balance of exploration and exploitation was achieved through the empirically set values of the non-linear control vector (\( \overrightarrow{\phi} \)) and the competition rate (comp). With these empirically tuned parameter settings, the proposed method delivered solutions at or close to the global optimum in most complex benchmarking cases, avoided local entrapment and showed good convergence characteristics.

  • The system of double fitness evaluations (DFEs) has been a positive reinforcement: it allows the modified GWO procedure and the competitive learning procedure to work in synergy and helps reposition the omega wolves with increased population diversity, encouraging better coverage of the search landscape.

  • The linear and adaptive competitive learning strategies, coupled with the greedy selection strategy, promoted elitism by selecting the best wolves. The best solution is assigned as the alpha wolf and passed on to the standard GWO procedure in the next iteration, which further refines and explores the solutions and helps the algorithm gain better knowledge of the search landscape. Examples can be seen in the CEC2019 benchmarking test and the standard engineering problems, where Clb-GWO avoided local entrapment and delivered better solutions while maintaining a minimal standard deviation in its results.

  • The dependence of GWO on the alpha, beta and delta wolves to reposition every omega wolf is lowered in Clb-GWO. The formulation of unique difference vectors that include randomised omega wolves allows better information exchange between the wolves and prevents the local entrapment associated with a poorly diversified population.

  • The greedy selection strategy adopted in the competitive learning phase allows only the solutions with superior fitness to enter the population pool for the next generation. This promotes elitism, allowing the algorithm to concentrate its search on the most promising areas of the search landscape. Global search is prioritised through the greedy selection and local search is prioritised through the initial Mu, Lambda selection of the standard GWO procedure.
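The greedy selection idea described in the merits above can be illustrated with a minimal Python sketch. It assumes a minimization problem and a one-to-one comparison between each wolf and the candidate produced for it; the function names, the sphere objective and the sample positions are hypothetical and are not taken from the Clb-GWO implementation.

```python
def greedy_select(population, candidates, fitness):
    """Keep a candidate only if it is strictly fitter (lower) than the wolf it would replace."""
    survivors = []
    for current, candidate in zip(population, candidates):
        # Assumption: minimization, so a lower fitness value is better.
        if fitness(candidate) < fitness(current):
            survivors.append(candidate)
        else:
            survivors.append(current)
    return survivors


def sphere(x):
    """Hypothetical objective used only for this illustration."""
    return sum(v * v for v in x)


# Hypothetical wolf positions and the candidates proposed for them.
population = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]
candidates = [[0.5, 1.0], [1.0, 1.0], [2.0, 0.0]]

survivors = greedy_select(population, candidates, sphere)
alpha = min(survivors, key=sphere)  # the best survivor leads the next iteration
```

Because a wolf is replaced only by a fitter candidate, the best fitness in the population can never get worse from one generation to the next, which is the elitism property referred to above. A practical implementation would cache fitness values rather than re-evaluating them, since every evaluation counts against the function evaluation budget.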

6.1.2 Limitations

  • The incorporation of double fitness evaluations (DFEs), although beneficial to the performance of Clb-GWO, requires lowering either the population size or the number of iterations for a predefined number of function evaluations (NFEs). This was the case in all the benchmarking and other testing scenarios, where the population size had to be halved to match the required NFEs while the iteration count was kept the same. A lower population size may reduce population diversity and, in the worst case, lead to local entrapment. Although Clb-GWO showed a greater capability of dodging local entrapment, the practitioner must plan carefully whether to reduce the population size or the iteration count to match other algorithms that use single fitness evaluations (SFEs).

  • Although the empirically set values for the non-linear control vector (\( \overrightarrow{\phi} \)) and the competition rate (comp) resulted in good optimization outcomes, scope for additional tuning and modification remains. On CEC2019, Clb-GWO performed better on only seven of the ten functions; for the others, its solutions were in close proximity to the global optimal solution. This indicates that, through additional tuning, a better performance suited to the problem’s search landscape is possible.

  • Although Clb-GWO benefits from a higher iteration count, its performance was still comparatively better for the MLP training with only 250 iterations. The CEC2020 benchmarking with the 5D test case was the only limiting condition, as the lower number of function evaluations prevented the algorithm from fully converging.

  • The dependence of Clb-GWO on the random omegas (at least 7 different omega wolves) mandates the population size to be set above seven at all times. The algorithm may fail to run with a population size lower than seven.
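The budget trade-off and the population floor described in the limitations above can be made concrete with a short Python sketch. The budget figures are hypothetical, and the floor of eight wolves is a conservative reading of "above seven"; neither is taken from the experimental settings of this study.

```python
# Assumption: "above seven" is read conservatively as a minimum of eight wolves.
CLB_GWO_MIN_POPULATION = 8


def population_for_budget(max_nfes, iterations, evals_per_wolf, min_population=1):
    """Largest population that fits an NFE budget at a fixed iteration count."""
    population = max_nfes // (iterations * evals_per_wolf)
    if population < min_population:
        raise ValueError(
            f"budget allows only {population} wolves; at least {min_population} are required"
        )
    return population


# Hypothetical budget: 30,000 function evaluations over 500 iterations.
sfe_population = population_for_budget(30_000, 500, evals_per_wolf=1)
dfe_population = population_for_budget(
    30_000, 500, evals_per_wolf=2, min_population=CLB_GWO_MIN_POPULATION
)
```

With the same iteration count, the double-evaluation scheme fits half as many wolves as a single-evaluation algorithm, so a tight budget can push the population below the minimum. In that case the sketch raises an error, and the practitioner would have to reduce the iteration count instead.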

6.2 Future scope

Clb-GWO can be deployed to a wide spectrum of problems in artificial intelligence, power systems, machine learning etc. Practitioners are free to modify the proposed method as per their requirements; to encourage such extensibility, simplicity has been embraced in the design of Clb-GWO. The proposed method can be applied to various other optimization areas in power systems such as electric vehicle (EV) optimization, power electronics, smart grid integration, distribution systems, power dispatch problems, control systems, power quality enhancement etc. In computer science, the proposed method can be deployed for neural network (NN) training, including convolutional NNs, and image classification, data classification, pattern recognition etc. can be optimized through it. A plan to deploy the current method for detecting COVID-19 infection from X-ray images via a support vector classifier is at an early stage. Feature selection is a potential area of application through the formulation of a binary version of Clb-GWO. A multi-objective variant is a possibility for tackling problems requiring a Pareto-optimal front, and the development of such a variant for the optimization of energy management in EVs has been planned as a project. In electrical engineering, Clb-GWO can be adopted and modified for parameter estimation in photovoltaic cells and battery management systems. While a penalty function has been used for most of the benchmarking with the current method, other constraint handling techniques can be integrated and experimented with for several of the existing problems. The proposed competitive learning strategies can also be extended to other meta-heuristics for experimental analysis towards their improvement.
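For readers unfamiliar with penalty-based constraint handling, a minimal Python sketch of a static quadratic penalty is given below. The penalty weight, the wrapper name and the toy problem are assumptions made for illustration; the exact penalty formulation used in the benchmarking is not restated here.

```python
def penalized(objective, inequality_constraints, penalty=1e6):
    """Wrap an objective so that violating any g(x) <= 0 constraint raises its value."""

    def wrapped(x):
        # Only positive g(x) values count as violations; feasible points add nothing.
        violation = sum(max(0.0, g(x)) ** 2 for g in inequality_constraints)
        return objective(x) + penalty * violation

    return wrapped


# Hypothetical toy problem: minimize x0 + x1 subject to x0 * x1 >= 1,
# written in the standard form g(x) = 1 - x0 * x1 <= 0.
def total(x):
    return x[0] + x[1]


def product_constraint(x):
    return 1.0 - x[0] * x[1]


cost = penalized(total, [product_constraint])
feasible_cost = cost([1.0, 1.0])    # constraint satisfied, no penalty added
infeasible_cost = cost([0.5, 0.5])  # constraint violated, heavily penalized
```

The wrapped function can be passed to an unconstrained optimizer unchanged, which is what makes penalty functions a simple default. Alternative constraint handling techniques would replace only this wrapper.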

Furthermore, Clb-GWO can be deployed to tackle problems at strategic, tactical and operational levels for ride-sharing systems like Uber, Ola etc. Optimal carpooling is another possibility that can be realized through Clb-GWO as it can effectively handle problems with higher dimensionality. Real-time optimization in several domains including automated sensing, sensor fusion, and electric vehicles can benefit from the exploratory capabilities of Clb-GWO and artificial intelligence techniques.