1 Introduction

Collective intelligence is an approach commonly found useful for problem-solving in modern times. It is motivated by the observation that, in real life, collective effort pays off better than individual effort, an idea that computer science researchers have adopted and implemented in various problem-solving approaches. Swarm intelligence is a family of collective problem-solving frameworks, including ant colony optimisation, particle swarm optimisation and artificial bee colonies, all of which operate on a population of solutions, henceforth called a swarm of individuals. The main benefit of population-based metaheuristics, particularly swarm intelligence algorithms, is that they harmonise local search activities around various neighbourhoods without guaranteeing to cover the whole search space. Therein, local search is devised, to a certain extent, to intensify the search, while enhancement activities are facilitated to diversify it by managing movement among neighbourhoods.

Diversification plays a crucial role in visiting unseen regions of the search space as efficiently as possible, so that the search for the optimum solution does not become trapped in a locality and retains enough energy for further exploration. On the other hand, intensification is required to make the search as focused as possible, so that no particular local region remains under-examined. A balanced, well-featured search algorithm harmonises the actions required for both diversification and intensification, which is essential for effective and efficient search. In fact, single-solution search algorithms conduct a more intensified search, while population-based algorithms are more diversifying by nature. Hence, swarm intelligence algorithms require intensification of the search in local regions, as they deliver a very diverse search by default. This applies to the algorithms inspired by the collective behaviour of honeybees, where a number of bees algorithm (BA) [25] and artificial bee colony (ABC) [12] variants have been redesigned to manage such a harmony among the various search actions. Indeed, variants of both BA and ABC are devised mainly for this purpose, so that a variety of difficult problems can be solved with a more generalised search that adequately combines diverse and focused search activities [4, 22, 34]. However, it is observed that the existing mechanics of the BA and ABC algorithms do not sufficiently support intensification, which motivates the investigations in this paper.

The main aim of this paper is to propose a hybridising framework that merges the strong search capabilities of both BA and ABC to achieve higher efficiency in problem-solving. Both algorithms are first reviewed to identify their strengths and weaknesses with respect to intensification. Next, revisions for a finer granularity level in search steps are proposed to remove the weaknesses identified in BA and ABC; the revised versions demonstrate that finer-grained search steps facilitate further intensification. Finally, a hybridisation approach is introduced to merge the strengths of both into a new algorithm offering further intensification and improved diversification. In this respect, the contributions of this paper can be listed as follows:

  • Reviewing the properties of both standard BA and ABC algorithms with respect to diversification and intensification of the search;

  • Improving the intensification properties of both algorithms with appropriate revisions;

  • Devising a hybridisation framework in which the strengths of the algorithms are aligned to improve diversification of the search.

The rest of this paper is organised as follows: Sect. 2 introduces the swarm intelligence algorithms inspired by natural honeybee colonies and reviews related works, identifying how this work differs from them; Sect. 3 introduces the proposed approach, including the revisions envisaged for the parent algorithms (BA and ABC) and the proposed new hybrid algorithm; Sect. 4 presents a comprehensive experimental study testing the performance of all algorithms; and Sect. 5 provides the conclusions.

2 Swarm intelligence and honeybees-inspired algorithms

Swarm intelligence is one of the cutting-edge soft computing technologies used for solving various optimisation problems more efficiently. This is because the proposed approaches and frameworks are adaptive, flexible and robust in the way the algorithms handle problems using various techniques of collectivism. The collective effort of the individuals within a swarm is managed by sharing information about search activities towards common targets, which naturally diversifies the search.

Figure 1 sketches the search idea metaphorically delivered by honeybees, where a typical function with multiple optimum points is explored for the optima; the search conducted within a neighbourhood by a team of bees is also highlighted to reflect the idea and its implementation. This metaphor has been borrowed from honeybees and their collective way of searching mainly by two algorithms, as explained above. Both algorithms are detailed in the following subsections.

Fig. 1 Search metaphorically delivered by honeybees

2.1 Bees algorithm (BA)

The bees algorithm is one of the mainstream swarm intelligence algorithms inspired by natural honeybee colonies, introduced by Pham and his associates [25, 33]. It is a typical population-based optimisation algorithm in which solutions are considered as individual bees and are evaluated with fitness-function-like evaluation rules, usually simple objective functions. The algorithm imposes a search procedure inspired by the food/nectar exploration process of honeybees in nature. An elitist approach is followed to search through the most fruitful regions of the search space, so that the optimum or a useful near-optimum can be found as quickly as possible without causing further complexity. The algorithm has been used not only for solving numerical optimisation problems, e.g. benchmark functions and neural network training, but also for a variety of combinatorial optimisation problems [19, 34].

Let \({\mathbf{\mathcal{X}}}\) be a population of solutions, considered to be the bee colony, and let \({\mathbf{x}}_{i} = \left\{ {x_{i,j} |i = 1, \ldots ,N; j = 1, \ldots ,D} \right\}\) represent solution i within this population, also called an individual bee as a member of the colony/swarm, where N denotes the size of the bee colony, \(N = \left| {\mathbf{\mathcal{X}}} \right|\), and D is the dimension of the input. Suppose also that \(F({\mathbf{x}}_{i})\) is a function (\(F: {\mathbf{x}}_{i} \to {\mathbb{R}}\)) defined to measure the quality/fitness of solution \({\mathbf{x}}_{i}\). The initial population/swarm of bees is generated using \(x_{i, j} = x_{i, \hbox{min} } + \rho * \left( {x_{i, \hbox{max} } - x_{i, \hbox{min} } } \right)\), where \(x_{i,j}\), the jth input of solution \({\mathbf{x}}_{i}\), is initialised to a random value within the range \(\left[ {x_{i,\hbox{min} } , x_{i,\hbox{max} } } \right]\) and \(\rho\) is a random number drawn uniformly from [0, 1].
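
As a minimal illustration, the initialisation rule can be sketched as follows (a Python sketch with illustrative names; uniform bounds shared across components are an assumption):

```python
import numpy as np

def init_swarm(n_bees, dim, x_min, x_max, rng=np.random.default_rng()):
    """Initialise the colony: x_ij = x_min + rho * (x_max - x_min),
    with rho drawn uniformly from [0, 1] per component."""
    rho = rng.random((n_bees, dim))
    return x_min + rho * (x_max - x_min)

swarm = init_swarm(n_bees=50, dim=30, x_min=-5.12, x_max=5.12)
```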

After generating the initial swarm, each individual bee is evaluated using the fitness function built from the main objective of the problem tackled. The bees are then classified based on their performance/fitness: a set of elite bees, \({\mathbf{\mathcal{E}}}\), where \({\mathbf{y}}_{e} \in {\mathbf{\mathcal{E}}}\) and \({\mathbf{y}}_{e} = \left\{ {y_{e,j} |e = 1, \ldots ,\left| {\mathbf{\mathcal{E}}} \right|; j \in D} \right\}\); a set of moderate search bees, \({\mathbf{\mathcal{M}}}\), where \({\mathbf{z}}_{m} \in {\mathbf{\mathcal{M}}}\) and \({\mathbf{z}}_{m} = \left\{ {z_{m,j} |m = 1, \ldots ,\left| {\mathbf{\mathcal{M}}} \right|; j \in D} \right\}\); and a set of employee bees, \({\mathbf{\mathcal{I}}}\), where \({\mathbf{x}}_{k} \in {\mathbf{\mathcal{I}}}\) and \({\mathbf{x}}_{k} = \left\{ {x_{k,j} |k = 1, \ldots ,N - \left( {\left| {\mathbf{\mathcal{E}}} \right| + \left| {\mathbf{\mathcal{M}}} \right|} \right); j \in D} \right\}\). Therefore, \({\mathbf{\mathcal{X}}} = {\mathbf{\mathcal{E}}}{\bigcup }{\mathbf{\mathcal{M}}}{\bigcup }{\mathbf{\mathcal{I}}}\), where \(N_{e} = \left| {\mathbf{\mathcal{E}}} \right|\), \(N_{m} = \left| {\mathbf{\mathcal{M}}} \right|\) and \(\left| {\mathbf{\mathcal{I}}} \right| = N - \left( {\left| {\mathbf{\mathcal{E}}} \right| + \left| {\mathbf{\mathcal{M}}} \right|} \right)\). In moving to the next generation, \({\mathbf{\mathcal{E}}} \subset {\mathbf{\mathcal{X}}}\) and \({\mathbf{\mathcal{M}}} \subset {\mathbf{\mathcal{X}}}\) are preserved, while the rest of the population, the employee bees, are discarded.

The next step in producing the next generation is to deploy supporting bees, which are not created initially but while breeding the new generation, in order to support each elite bee, \({\mathbf{y}}_{e}\), and each moderate bee, \({\mathbf{z}}_{m}\), within its neighbourhood. Each individual elite bee is supported with a team of bees to further explore its neighbourhood, extending the size of the elite set from \(N_{e}\) to \(N_{e} \times \beta\); the moderate search bees are supported in the same way but with different predefined team sizes, increasing the moderate set to \(N_{m} \times \gamma\), where \(\beta\) and \(\gamma\) are predetermined fixed numbers identifying how many bees are recruited in the neighbourhood of each elite and moderate bee, respectively. The supporting bees deployed in the search regions of elite and moderate bees are created with the rule \(x_{i,j} = x_{i,j} + \rho *\delta\), where \(\rho\) is a random number generated within the range (−1, 1) and \(\delta\) is another predetermined fixed value acting as the step size of change in any input of a solution/bee. The rule is specialised per bee type: \(y_{i,j} = y_{i,j} + \rho *\delta\) for elite bees and \(z_{i,j} = z_{i,j} + \rho *\delta\) for moderate search bees. Once the support teams are deployed within the corresponding search regions, the majority of the next-generation swarm is complete. The remaining small portion of the new colony (around 20%) is randomly generated in the same way as the initial random population.
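
A hedged sketch of the recruitment step follows (illustrative names; keeping only the best bee per patch is an assumption consistent with the algorithm's elitist practice, not a detail stated here):

```python
import numpy as np

def support_team(bee, team_size, delta, f, rng=np.random.default_rng()):
    """Recruit `team_size` supporting bees around `bee` using
    x_ij = x_ij + rho * delta with rho ~ U(-1, 1), and return the best
    of the original bee and its supporters (minimisation assumed)."""
    team = [bee + rng.uniform(-1.0, 1.0, bee.size) * delta
            for _ in range(team_size)]
    return min(team + [bee], key=f)
```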

Once the elite bees, moderate search bees and the others are identified, the predefined numbers of supporting bees are sent to the neighbourhoods of both types of bees. This procedure is repeated until a predetermined stopping criterion is met.

2.2 Artificial bee colony algorithm (ABC)

Artificial bee colony (ABC) is another very popular swarm intelligence algorithm, developed by drawing on the collective behaviours of honeybee colonies. Karaboga [12] first introduced the algorithm to solve numerical optimisation problems [14] and then extended its applications to various combinatorial optimisation problems [16, 24]. ABC considers individual solutions as sources of food (nectar) for honeybees, and the search around each solution is framed as the collective activity of various types of bees. Mainly two bee types are envisaged, employed and unemployed, where unemployed bees come in two kinds: onlooker and scout bees. A set of search activities is organised around the nectar sources by recruiting the various types of bees in various configurations.

Let \(\overrightarrow {{x_{i} }}\) be a solution, defined as an input vector of size \({\mathcal{D}}\) and considered as a source of nectar. A population of \({\mathcal{N}}\) different sources is initially generated using \(x_{ij} = x_{i, \hbox{min} } + \rho * \left( {x_{i, \hbox{max} } - x_{i, \hbox{min} } } \right)\), where \(i = 1, \ldots ,{\mathcal{N}}\); \(j = 1, \ldots ,{\mathcal{D}}\); and \(x_{i, \hbox{min} }\) and \(x_{i, \hbox{max} }\) are the minimum and maximum values of the inputs of source \(\overrightarrow {{x_{i} }}\). Once the whole population of sources is generated, the nectar level of each source is determined to identify its quality, which becomes the fitness value of the solution. Following this step, the employed bees start operating on each source to search for sources of better quality using \(\upsilon_{ij} = x_{ij} + \phi_{ij} \left( {x_{ij} - x_{kj} } \right)\), where \(\overrightarrow {{\upsilon_{i} }}\) is the new source found following an interaction between \(\overrightarrow {{x_{i} }}\) and \(\overrightarrow {{x_{k} }}\), a randomly selected known source within the colony of the current generation. The difference calculated between the two sources is scaled by a randomly generated \(\phi_{ij} \in \left( { - a,a} \right)\). After the new source is identified, a decision is made on whether to adopt it in place of the original one. The ultimate fitness of a typical source is calculated using:

$$F\left( {\overrightarrow {{x_{i} }} } \right) = \left\{ {\begin{array}{*{20}c} {\frac{1}{{1 + f\left( {\overrightarrow {{x_{i} }} } \right)}},} & {f\left( {\overrightarrow {{x_{i} }} } \right) \ge 0} \\ {1 + \left| {f\left( {\overrightarrow {{x_{i} }} } \right)} \right|,} & { {\text{otherwise}}} \\ \end{array} } \right.$$
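
For clarity, the employed-bee move and the fitness mapping can be sketched together as follows (a Python sketch; the single-dimension perturbation and greedy replacement follow the canonical ABC description, and all names are illustrative):

```python
import numpy as np

def abc_fitness(f_val):
    """F(x) = 1/(1+f(x)) if f(x) >= 0, else 1 + |f(x)|."""
    return 1.0 / (1.0 + f_val) if f_val >= 0 else 1.0 + abs(f_val)

def employed_move(X, i, f, a=1.0, rng=np.random.default_rng()):
    """v_ij = x_ij + phi_ij * (x_ij - x_kj) on one random dimension j,
    against a random partner k != i; keep v only if fitness improves."""
    k = rng.choice([t for t in range(len(X)) if t != i])
    j = rng.integers(X.shape[1])
    v = X[i].copy()
    v[j] += rng.uniform(-a, a) * (X[i, j] - X[k, j])
    return v if abc_fitness(f(v)) > abc_fitness(f(X[i])) else X[i]
```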

Onlooker bees start operating once the employed bees complete their phase. The main role of onlooker bees is to monitor the employed bees and take the search further through a probabilistic process: a probability \(p_{i} = \frac{{F\left( {\overrightarrow {{x_{i} }} } \right)}}{{\mathop \sum \nolimits_{i = 1}^{{\mathcal{N}}} F\left( {\overrightarrow {{x_{i} }} } \right)}}\) is calculated for each individual candidate source, and a roulette-wheel selection rule is used to choose a solution for further exploration. The neighbourhood search of a chosen source is conducted with \(\upsilon_{ij} = x_{ij} + \phi_{ij} \left( {x_{ij} - x_{kj} } \right)\), as for the employed bees. A small memory is associated with each investigated source, recording whether any progress has been achieved: a counter per source runs up to a predefined threshold, and if no progress is accomplished, the source is removed from the colony.
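
The onlooker selection step is, in essence, a roulette wheel over the fitness values (a minimal Python sketch with illustrative names):

```python
import numpy as np

def onlooker_pick(fitnesses, rng=np.random.default_rng()):
    """Roulette-wheel selection with p_i = F(x_i) / sum_n F(x_n);
    returns the index of the chosen source."""
    probs = np.asarray(fitnesses, dtype=float)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)
```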

Scout bees then follow the onlookers to diversify the colony, randomly inserting new sources using the initial source-generation rule: \(x_{ij} = x_{i, \hbox{min} } + \rho * \left( {x_{i, \hbox{max} } - x_{i, \hbox{min} } } \right)\). This generational process is repeated until a certain level of satisfaction is reached. As part of the above-mentioned process, each individual solution/source can be included in the next generation via any of the following cases: (1) a source remains without any change; (2) an employed bee generates a new solution; (3) an onlooker bee brings a new solution; (4) a source is improved by both employed and onlooker bees; or (5) an exhausted source is replaced with a new one as a result of a non-improvement decision. In effect, each solution is attempted for improvement at least once, and receives further attempts as long as its fitness remains high.

2.3 Relevant works

A variety of works have considered honeybee-inspired algorithms for problem-solving, attempting a variety of problems including numerical optimisation benchmarks. The benchmark functions provided in Table 1 are commonly used for performance evaluation. Kong et al. [22] use numerical optimisation functions to test the success of their hybrid algorithm, which uses an orthogonal initialisation; however, they consider problems of at most 60 dimensions, and their algorithm mostly underperforms in comparison with our results. Hacibeyoglu et al. [9] produced results for the same set of benchmarks, but did not tabulate them in a form comparable with ours, and the dimensionality they address is clearly lower than in our case. Kiran and Gunduz [21] borrowed a crossover operator from genetic algorithms and embedded it to solve the numerical benchmark functions; they considered at most 50 dimensions, and their results appear rather unstable, as the standard deviations exceed the means in some cases. Karaboga and Basturk [14] first published their ABC algorithm on the same set of benchmark numerical functions, solving them in relatively lower dimensions. Karaboga and Akay [13] later presented the success of the same algorithm with extensive details of their comparative study, solving around 50 benchmarks including the problems used here; their implementations are very successful for dimensions up to 30, a level many other ABC implementations could not reach. Kiran and Findik [20] present a directed/adaptive ABC algorithm solving the benchmarks with 10, 30 and 50 dimensions; our results are competitive with theirs at that level, while we also solve the problems for much higher dimensions.

Table 1 Benchmark functions commonly used for performance evaluation of algorithms

On the other hand, Pham et al. [25] introduced their BA solving the same set of benchmark numerical functions with rather low dimensions, e.g. up to D = 10. Likewise, Yuce et al. [33] attempted a number of benchmark functions, including those considered in this study, with at most 10 dimensions. Hussein et al. [11] improved the BA with a particular initialisation pre-processing step and obtained better results than both [25] and [33] in solving the same set of benchmarks with up to 60 dimensions; our results apparently outperform theirs for all functions except Schwefel.

A number of other metaheuristic and/or swarm intelligence algorithms have also recently been applied to the benchmarks considered here. Given that the same functions have been attempted, these studies are included in the review to help grasp the difficulty of the problems addressed. Gong et al. [7], Liu et al. [23], Zhao and Tang [35] and Xin et al. [32] have published results for the benchmark problems with up to 30 dimensions, using different variants of particle swarm optimisation, differential evolution and the so-called monkey algorithm; their results are either not better than ours or remain merely competitive. Likewise, Han et al. [10], Rahmani and Yusof [28] and Alam et al. [1, 3] have introduced approaches for 30 and 50 dimensions, where our approach usually outperforms them or remains competitive. None of the following references attempt dimensions larger than 50, and the majority only consider up to 30, while our approach outperforms them in most cases [2, 5, 8, 17, 26]. These studies mostly compare their results with those produced by Suganthan et al. [31], a comprehensive and extensively reported study on a number of numerical optimisation benchmarks.

The works reviewed and cited above propose various approaches to improve the performance of bee-inspired algorithms, either (1) by extending the algorithms through pre-processing, (2) by introducing additional computational activities into some stage of the algorithms or (3) by embedding local search procedures into them. To the best of the authors' knowledge, none of the related works reviews the capabilities of the two mainstream honeybees-inspired algorithms (BA and ABC), and none proposes hybridising the two to improve the fundamental properties of the search. This paper introduces an approach/framework to hybridise the BA and ABC algorithms, improving performance via further intensification and diversification in the search process, in a way that is transferable to any variant of algorithms of this kind.

3 Proposed approach

The above-mentioned honeybees-inspired algorithms have been examined with respect to the balance between diversification and intensification of the search, and a few ideas have been put together to improve their performance in solving numerical optimisation problems.

Following structural and experimental analysis, both algorithms introduced above, BA and ABC, have been found to have strengths and weaknesses with respect to diversification and intensification of the search process. Both include freshly generated random solutions in the new generations to a certain level, which is how diversification of the search is achieved. In addition, the BA intensifies the search on fruitful sources, organising further search attempts around highly fit sources/solutions, while ABC uses a memory-like mechanism (a trial counter) to keep intensifying the search around certain sources for a number of attempts until the source is understood to be dried out. Once a source is dried out, it is deleted from the population.

On the other hand, both algorithms carry a few shortcomings, which are considered in this study as the grounds for improving the capabilities of the above-mentioned bee-inspired algorithms. The BA uses a parameter to scale the step size, the so-called environmental/neighbourhood factor, denoted \(\delta\) in the previous sections. It is set to a fixed value at the initialisation stage and kept as it is until the end of the search. This makes the granularity of the step size coarse-grained in approximating the optimum value, which drifts intensification away and, most of the time, prevents the search from reaching the optimum. Another weakness of the BA is the diminishing probability of having random solutions within the population, especially during the late stages of the search, which can escalate to disabling diversification at later stages. In the case of ABC, the weaknesses arise at two points: (1) the sources taken out of the population are evaluated not on fitness but on improvability, which can cause useful solutions to be disregarded, and (2) some useful, very well-improved solutions can be decommissioned from the population because their improvability is reduced to zero according to the adopted criterion. Both of these weaknesses can drive the algorithm towards very infertile regions of the search space.

In the following two subsections, ideas to enhance the capabilities of both bee-inspired algorithms are discussed, following the above-mentioned structural assessments. These will be used as bee operators in the hybrid algorithm.

3.1 Intensification in bees algorithm (rBA)

The main revision envisaged for BA to improve intensification, based on the shortcomings discussed above, is to make the step sizes more fine-grained. The granularity of the step size can be adjusted through the fixed-valued (constant) \(\delta\) in the update rule \(z_{i,j} = z_{i,j} + \rho *\delta\), where \(z_{i,j}\) is a single dimension of a complete solution and \(\rho\) is a random number within the range [−1, 1]. It is important to note that in the original BA the parameter \(\delta\) is selected at the configuration stage and kept constant throughout the entire search process. The parameter \(\delta\) corresponds to the shrinking constant (sc) in [33] and to the parameter h, the so-called incremental size, in [34], as part of a BA variant embedded with a hill-climbing local search procedure in which the step size is calculated with a slope-angle approach. However, that embedded hill-climbing procedure is not explained in sufficient detail in [34] to reveal how the step size is calculated from the gradient and the incremental size parameter.

This constant-valued parameter \(\delta\), as declared in the original BA, governs the granularity level and leaves the approach rather coarse-grained, so the step size cannot easily be adjusted to finer precision and approximation can take much longer. To avoid this shortcoming, the update rule is revised as follows: \(z_{i,j} = z_{i,j} + \rho *\delta *z_{i,j}\), where \(\delta\) is made a rate within the range [0, 1] and can also be adaptive. The new step size, calculated as \(\delta *z_{i,j}\), is thus more adjustable and proportional to the range (\(z_{\hbox{min} } ,z_{\hbox{max} }\)), allowing the algorithm to approximate much faster than before, and more precisely. The revised update rule is applied to all types of bees recruited as part of the algorithm, while the rest of the algorithm remains as in the original.
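
The difference between the two update rules can be sketched as follows (a minimal Python sketch mirroring the two equations; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng()

def ba_step(z, delta):
    """Original BA: fixed absolute step, z_ij = z_ij + rho * delta."""
    return z + rng.uniform(-1.0, 1.0, z.size) * delta

def rba_step(z, delta):
    """Revised BA (rBA): proportional step, z_ij = z_ij + rho * delta * z_ij,
    with delta a rate in [0, 1]; steps shrink as coordinates get smaller."""
    return z + rng.uniform(-1.0, 1.0, z.size) * delta * z
```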

3.2 Intensification in ABC algorithm (rABC)

To ease the difficulties in approximation using standard ABC, following the shortcomings identified and discussed above, two revisions have been envisaged: (1) collecting the results from all employed and onlooker bees and then applying roulette-wheel selection over the whole pool, instead of the original practice, and (2) adopting a rank-based selection rule for the next generation, in which the top-ranked 25% of the entire existing solution set of size \({\mathcal{N}} + {\mathcal{E}}\) is carried forward, where \({\mathcal{N}}\) denotes the size of the original bee colony and \({\mathcal{E}}\) the number of newly generated solutions. The first revision widens the candidate set for the roulette-wheel operator so that a more fruitful solution can be selected to enhance the search, while the second devises a greedier approach towards the top best solutions.
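
A hedged sketch of the second revision, the rank-based survivor selection, follows (how the remaining 75% of the next generation is filled is not detailed here, so the sketch covers only the elitist quartile; names are illustrative and minimisation is assumed):

```python
def rabc_survivors(pool, f, keep_frac=0.25):
    """Rank the pooled N + E solutions by objective value and keep
    the top-ranked fraction for the next generation."""
    ranked = sorted(pool, key=f)
    return ranked[:max(1, int(keep_frac * len(ranked)))]
```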

3.3 Proposed hybrid algorithm (Hybrid)

This algorithm attempts to merge the strengths of both of the above-mentioned mainstream bee algorithms for better search performance. It is a hybrid algorithm in which the hybridisation process is managed within the framework of the BA, implementing not only the bee operators of the BA but also those of all the other above-mentioned algorithms. The details of the proposed Hybrid algorithm are provided in Fig. 2. As can be seen, the main structure of the algorithm is inherited from the BA and operates with the set of rules given below as Eqs. (1)–(5). Each rule is a bee operator used by one of the above-mentioned algorithms. The Hybrid algorithm implies a systematic harmony/reuse of Eqs. (1)–(5), adopted as bee operators for generating new solutions/bees as well as neighbours of the existing elite and fit bees. Note that Eq. (1) is used for generating the initial swarm and the independent bees exploring for better nectar sources, while Eqs. (2)–(5) are used to send supporting bees around each elite bee.

Fig. 2 Pseudocode for Hybrid bee algorithm

$$\overrightarrow {{x_{i} }} = \overrightarrow {x }_{\hbox{min} } + \overrightarrow {\rho } *\left( {\overrightarrow {x }_{\hbox{max} } - \overrightarrow {x }_{\hbox{min} } } \right)\quad {\text{for}} \,\,\forall i \in N$$
(1)
$$\overrightarrow {{x_{i} }} = \overrightarrow {{x_{i} }} + \overrightarrow {\rho } *\delta \quad {\text{for}}\,\, \forall i \in N\quad {\text{and}}\quad \delta \in {\mathbb{R}}$$
(2)
$$\overrightarrow {{\upsilon_{i} }} = \overrightarrow {{x_{i} }} + \overrightarrow {{\phi_{i} }} \left( {\overrightarrow {{x_{i} }} - \overrightarrow {{x_{k} }} } \right)\quad k \in N\quad {\text{and}}\quad {\text{for}}\,\, \forall i \in N$$
(3)
$$\overrightarrow {{x_{i} }} = \overrightarrow {{x_{i} }} + \overrightarrow {\rho } *\delta *\overrightarrow {{x_{i} }} \quad {\text{for}}\,\, \forall i \in N\quad {\text{and}}\quad \delta \in \left[ {0,1} \right]$$
(4)
$$\overrightarrow {{\upsilon_{i} }} = \overrightarrow {{x_{i} }} + \overrightarrow {{\phi_{i} }} \left( {\overrightarrow {{x_{i} }} - \overrightarrow {{x_{k} }} } \right)\quad k \in \{ Q_{1} of N\} \quad {\text{and}}\quad {\text{for}}\,\,\forall i \in N$$
(5)

Equations (2), (3), (4) and (5) are the neighbourhood rules used, respectively, by the ordinary BA, the ABC, the revised BA (rBA) and the revised ABC (rABC) algorithms to explore around a local nectar source, i.e. a local region of the search space in an optimisation context. The hybrid algorithm randomly selects one of these rules each time it generates a neighbouring solution of a particular elite solution, completing the \(\beta\) supporting bees for each elite so that \(N_{e} \times \beta\) bees are placed in the new generation. The moderate search bees likewise use a randomly chosen rule among Eqs. (2)–(5) for generating their neighbouring solutions, completing \(\gamma\) supporting bees so as to place \(N_{m} \times \gamma\) solutions in the next swarm, while the independent bees explore with Eq. (1) to generate further randomly searched nectar sources. The rest of the algorithmic mechanics of this hybrid works in the same way as the ordinary bees algorithm until a certain satisfaction level is achieved, as indicated in the pseudocode provided in Fig. 2.
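
The core of the operator reuse, the random choice among the four neighbourhood rules, can be sketched as follows (a minimal Python sketch; `swarm` is assumed sorted best-first so that its first quarter is the top quartile for Eq. (5), and all names are illustrative):

```python
import numpy as np

def hybrid_neighbour(bee, swarm, delta, a=1.0, rng=np.random.default_rng()):
    """Generate one supporting bee with a rule picked uniformly at
    random from Eqs. (2)-(5); `swarm` is sorted best-first."""
    rule = rng.integers(2, 6)              # pick Eq. (2), (3), (4) or (5)
    if rule == 2:                          # Eq. (2): BA, fixed step
        return bee + rng.uniform(-1, 1, bee.size) * delta
    if rule == 4:                          # Eq. (4): rBA, proportional step
        return bee + rng.uniform(-1, 1, bee.size) * delta * bee
    pool = len(swarm) if rule == 3 else max(1, len(swarm) // 4)
    k = rng.integers(pool)                 # Eq. (3): any mate; Eq. (5): top quartile
    return bee + rng.uniform(-a, a, bee.size) * (bee - swarm[k])
```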

It is important to note that this framework is flexible: additional and different types of bee operators can be employed as part of Step 4 for populating the new generation of the swarm. In the following experimental study, it is demonstrated that random selection of bee operators from the set of rules given in Eqs. (1)–(5) pays off better than the original algorithms taking part in Hybrid. A bespoke selection policy could also be adopted instead of randomly selecting the operators for generating new bees and neighbouring solutions.

4 Experimental evaluations

This section presents a major experimental study to demonstrate the performance of the above-mentioned well-known bee algorithms and of the revisions envisaged to enhance their capabilities.

4.1 Functional optimisation

The first part of this experimental evaluation uses functional optimisation benchmarks. The functions considered for testing are multidimensional functions, and the tests have been conducted over 5, 30, 60, 100 and 150 dimensions. These dimensions were chosen because the literature [1, 14, 22, 33] reports solving these problems with similar dimensions, while 100 and 150 dimensions are exercised for the first time in this study. Two of the functions are unimodal (labelled (6) and (7) in Table 1), meaning they have a single optimum point, while the other four are multimodal, meaning they can have multiple optimum points. These are all well-known and challenging benchmark functions used to test optimisation algorithms across the literature of this field. An extensive study of a number of numerical optimisation benchmarks, including those considered below, is reported in [31].
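
Since Table 1 is not reproduced here, two of the benchmarks are shown below in their textbook forms for illustration (a Python sketch; both have their global optimum f = 0 at the origin):

```python
import numpy as np

def sphere(x):
    """Unimodal: f(x) = sum_j x_j^2."""
    return np.sum(x ** 2)

def rastrigin(x):
    """Multimodal: f(x) = 10*D + sum_j (x_j^2 - 10*cos(2*pi*x_j))."""
    return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))
```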

The parametric design details of the algorithms are provided in Table 2, where the parameters of the three main algorithms are tabulated. Note that the revised versions of BA and ABC, rBA and rABC, have the same parameter values as the originals, since their changes are procedural rather than parametric. The neighbourhood structures of the algorithms, also a procedural difference, are as follows: all algorithms use a fixed-size local neighbourhood, while BA has rank-based random selection, ABC uses roulette-wheel selection and Hybrid systematically adopts both. In addition, the Hybrid algorithm selects mate bees from the top quartile when operating with the revised ABC rule.

Table 2 Parametric details of the algorithmic configurations

The initial populations/swarms have been randomly generated with different seeds, and the experiments for each function with each algorithm variant are repeated 30 times. The experimentation started with lower dimensions and increased them gradually. The first set of experiments was carried out with dimensions of 5 and 30 for all benchmarks, in line with the existing literature. The results are tabulated in Table 3, recorded as two statistics, mean and standard deviation, for the five algorithms against each benchmark function. To investigate the number of iterations, preliminary experiments with 200 and 1000 iterations have been conducted, as plotted in Fig. 3a, b, showing the averaged differences between the optimum and the results found for each algorithm. Further performance details are provided in Table 3 with 5000 iterations for both the 5-D and 30-D benchmark problems.

Table 3 Experimental results by all five bee algorithms with 5000 iterations for 5-D and 30-D benchmark functions
Fig. 3 Differences between the optimum and the results found by the algorithms for a D = 5, b D = 30

Figure 3a, b presents the differences between the known optimum values and the achieved results, averaged over all benchmark problems and categorised by dimension and number of iterations taken, including the results for 200 iterations. The figures clearly suggest that the Hybrid algorithm outperforms all the others, with its approximation going closest to 0. The revised algorithms perform better than the originals from the same overall point of view, with rBA remaining the runner-up after Hybrid. It is also observed that ABC performs much better when the dimension is lower, but not as well as its rivals as the dimension grows. However, rABC, the revised ABC, remains one of the competitors of Hybrid regardless of the growing dimensions.

As indicated above, the experimental results reported in Table 3 show the performance of the five algorithms on each of the benchmark problems.

The Sphere function is easily solved by almost all algorithms with five dimensions over 5000 iterations, while the 30-dimensional version becomes somewhat challenging: all algorithms except ABC find the optimum. Hybrid hits the optimum with a negligible difference after 200 iterations, while the others remain significantly distant. After 1000 iterations, only BA and ABC still struggle, while the other three solve the problem exactly; after 5000 iterations, only ABC remains slightly distant.

The Rosenbrock function is one of the two functions found challenging in this research. None of the algorithms found the optimum; the best result with five dimensions is obtained by BA and with 30 dimensions by rABC. The algorithms' performances improve as the number of iterations increases to 5000; however, the optimum is still not achieved, with BA and rABC performing best for 5 and 30 dimensions, respectively, and Hybrid always following as the second best.

The best approximation for the Ackley function is made by Hybrid, while BA and rABC remain in competition with Hybrid to reach the exact optimum, both staying within a negligible distance. Hybrid performs best after 200 iterations for the 5-dimensional cases, while BA and rABC compete with Hybrid in the 30-dimensional cases.

The Griewank function is best approximated by the rABC and Hybrid algorithms, as early as 200 iterations for both 5 and 30 dimensions. The other algorithms approximate the optimum after 5000 iterations, with rBA and ABC remaining slightly distant. rABC solves the Rastrigin function to optimality in both dimensions (5 and 30) after 200 iterations, while the other algorithms struggle to approximate it even after 5000 iterations; notably, Kong et al. [22] address this function with 5 and 10 dimensions only. The Schwefel function remains the most challenging benchmark: BA and ABC solve it with 5 dimensions after 200 iterations, but none of the algorithms managed to solve it to optimality in higher dimensions even after 5000 iterations, with different initial swarms/populations escalating to very different results each time.

Table 4 presents the performances of all five algorithms on the 60-dimensional benchmarks after 5000 iterations, where it is clear that both the ordinary BA and ABC algorithms underperform considerably in comparison with the revised versions and the hybrid algorithm, although their performance improves with more iterations, as indicated in the bottom (5000-iteration) section of the table. On the other hand, rBA, rABC and Hybrid approximate the optimum on four functions after 1000 iterations, while struggling with the Rosenbrock and Schwefel functions, despite improving on Rosenbrock. These results indicate that the Schwefel function clearly requires far more attention to approximate well. It is also notable that the results after 5000 iterations deviate from the optimum by a negligible amount.

Table 4 Experimental results by all five bee algorithms with 5000 iterations for 60-D benchmarks

The results in Tables 5 and 6 include the performance of the BA, ABC and Hybrid algorithms only, since beyond D = 60 the revised algorithms appear to underperform, following their original versions. Hybrid, however, remains competitive, solving four of the six benchmark functions to optimality. Tables 5 and 6 present the results of Hybrid in comparison with the original BA and ABC for D = 100 and D = 150, running the algorithms for 5000 iterations. Hybrid solves the Sphere, Ackley, Griewank and Rastrigin functions to optimality for all dimensions, including 60-D, 100-D and 150-D, while BA, ABC and their revised versions fall behind this level of achievement as the dimension grows. The algorithms other than Hybrid appear to fall into a local optimum around 20 when solving the Ackley function for dimensions of 100 and 150. The Rosenbrock function is the second most challenging benchmark; the approximation of Hybrid remains just below 100 for the 100-D and below 150 for the 150-D cases. Clearly, the Schwefel function is the most challenging, since the approximations of all algorithms stay far from the expected optimum, hinting that it requires particular attention. Both BA and ABC improve their performances as the iteration count increases, but the level of improvement remains rather weak, meaning that both algorithms are approaching their ultimate level of achievement and no significant improvement is expected beyond this number of iterations.

Table 5 Experimental results for 100-D and 150-D cases with iterations of 5000
Table 6 Experimental results for 100-D and 150-D cases with iterations of 10000

Both Tables 5 and 6 include the results obtained after 10,000 iterations for the large-dimensional cases, i.e. 100-D and/or 150-D. The results hold no surprise: beyond a certain number of iterations the achievement does not improve significantly, and Hybrid evidently outperforms both of its competitors. In fact, both BA and ABC approximate very roughly, while Hybrid approaches the optimum values except in the cases of the Rosenbrock and Schwefel functions.

To visualise the overall performances, Fig. 4a–f presents the overall performance of the BA, ABC and Hybrid algorithms for the dimensions of 100 and 150. As in the previous cases depicted in Fig. 3a, b, the results of all algorithms on all functions have been further processed to calculate the differences between the optima and the results produced, and then averaged accordingly. Figure 4a, c, e plots the averaged differences for 100-D, while Fig. 4b, d, f plots those for 150-D. As suggested by Fig. 4a–f, BA significantly underperforms in comparison with both ABC and Hybrid, ABC is the runner-up, and Hybrid significantly outperforms both rivals. The performance of both ABC and Hybrid improves as the number of iterations increases, while BA does not improve significantly with the growing number of iterations in either the D = 100 or the D = 150 cases. Figure 4c, d presents the performance excluding the Schwefel function, the most challenging one, while Fig. 4e, f shows the performance excluding both challenging functions, Rosenbrock and Schwefel. It is observed that Hybrid solves all problems to optimality except Schwefel, while ABC approximates better once Rosenbrock and Schwefel are excluded.

Fig. 4 Averaged overall achievements by BA, ABC and Hybrid: a all comparative results in D = 100 cases, b all comparative results in D = 150 cases, c comparative results for benchmarks except Schwefel in D = 100 cases, d comparative results for benchmarks except Schwefel in D = 150 cases, e comparative results for benchmarks except Rosenbrock and Schwefel in D = 100 cases and f comparative results for benchmarks except Rosenbrock and Schwefel in D = 150 cases

4.2 Conclusions for functional optimisation

The functional optimisation study in the previous subsection demonstrates how to apply the proposed Hybrid algorithm to functional optimisation benchmarks and shows that the proposed algorithm outperforms the two mainstream honeybees-inspired algorithms. Tables 3, 4, 5 and 6 present the comparative results as mean and standard deviation statistics, where the outperforming achievement of Hybrid can be observed. The standard deviations show the steadiness of the mean results: those of Hybrid and ABC do not fluctuate much, while the BA results fluctuate, meaning that BA is not fit enough to tackle these problems. The overall performance of all three algorithms is plotted in Fig. 4a–f to visualise the outperforming success of Hybrid. The marginal achievement is also plotted excluding the challenging benchmark problems, where BA does not improve, but both Hybrid and ABC improve with the exclusion of the challenging benchmarks as well as with the growing number of iterations.

4.3 Neural network training

In this section, the Hybrid algorithm is tested on another numerical optimisation case: optimising the weights of feed-forward neural network models used in classification problems.

Training NN models with search algorithms is not a very new concept, but it remains under consideration for further improvement with more powerful algorithms. Both ABC and BA are, as mentioned above, relatively new swarm intelligence algorithms that attract further research effort as they prove successful in various areas of application, especially numerical optimisation problems. Dugenci and Aydin [6] have recently demonstrated that such algorithms are capable of solving high-dimensional complex numerical functions. On this basis, BA-based training is devised for optimising the set of connection weights of feed-forward NN models so that they can predict and classify more precisely and successfully [18, 29, 30]. To achieve this, an ANN model is first configured, and then the set of weights can be retrieved. A typical feed-forward NN model with two inputs, a few hidden nodes and one output is displayed in Fig. 5, where the connection weights are labelled with honeybees and the nodes are signified with honeycombs. The total number of connections is the size of a solution state, which is subjected to the optimisation process.

Fig. 5 Feed-forward NN model configured and trained with Hybrid BA

Given these circumstances, a feed-forward NN model with one hidden layer has I input nodes, H hidden nodes and O output nodes. The number of connections between the input and hidden layers is \(\left( {I + 1} \right)H\), while the number of connections between the hidden and output layers is \(\left( {H + 1} \right)O\), where at each level one bias node is also considered as part of the feed-forward ANN to facilitate smoother learning. The set of weights between I and H is \(w_{i}^{I - H} = \left\{ {w_{i,j}^{I - H} |j = 1, \ldots ,\left( {I + 1} \right)H} \right\}\), while the weight set for connections between H and O is \(w_{i}^{H - O} = \left\{ {w_{i,j}^{H - O} |j = 1, \ldots ,\left( {H + 1} \right)O} \right\}\). The ultimate set of weights is \(w_{i} = \left\{ {w_{i}^{I - H} \cup w_{i}^{H - O} } \right\}\) with size \(\left| {w_{i} } \right| = \left( {I + 1} \right)H + \left( {H + 1} \right)O\). For example, given the model in Fig. 5, there are two input neurons, three hidden-layer neurons and one output neuron, with a bias node at each of the hidden and output levels; hence, the total number of connection weights is (2 + 1) * 3 + (3 + 1) * 1 = 13. A typical bee therefore represents the whole NN with a vector of 13 weights, comprising nine \(w_{i}^{I - H}\) and four \(w_{i}^{H - O}\) values. The total number of weights changes accordingly if any of I, H or O changes.
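
This sizing and encoding can be sketched as follows (a minimal Python sketch with illustrative names):

```python
def n_weights(I, H, O):
    """Weight-vector length of a 1-hidden-layer net with one bias
    node per layer: (I+1)*H + (H+1)*O."""
    return (I + 1) * H + (H + 1) * O

def decode_bee(bee, I, H, O):
    """Split a flat bee vector into input-hidden and hidden-output weights."""
    cut = (I + 1) * H
    return bee[:cut], bee[cut:]

assert n_weights(2, 3, 1) == 13   # the Fig. 5 example: 9 + 4 weights
```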

The benchmark problems have been taken from the UCI data collection, a well-known repository of medical data sets used for research purposes [27]. Table 7 tabulates the results for all data types with the sizes of both the training and test sets alongside the configurations of the neural nets, where the models have multiple outputs, each corresponding to one particular class to be identified. The NN model is expected to produce one output much higher than the others, so that the highest output is taken as the identified class for the given input data. Table 7 introduces the data sets with the overall data size in the second column, which is divided into the training and test sets given in the third and fourth columns. The fifth column provides the configuration of the feed-forward NN model set up to classify the data set indicated in each row, given in the form I–H–O, where I, H and O stand for the numbers of input, hidden and output nodes, respectively. The last two columns show the comparative results as CEP means with standard deviations in parentheses, where the column labelled "Hybrid" presents the results of the Hybrid algorithm while the one labelled "ABC" contains the results of the ABC algorithm taken from Karaboga and Ozturk [15].

Table 7 Experimental results by Hybrid and ABC algorithms in mean of CEP and standard deviation in parenthesis for various medical data sets used for training and testing neural network models

As mentioned above, the base ABC algorithm considered in this comparison is that of Karaboga and Ozturk [15]. The performances are measured with an index called the classification error percentage (CEP), the percentage of misclassified patterns over the total number of test patterns. The configuration of each NN model developed per data set is the same as in Karaboga and Ozturk [15]. As Table 7 suggests, the majority of the results by Hybrid are significantly better than those produced by ABC with respect to both mean and standard deviation: the top four problem cases are slightly better with Hybrid, while the last five cases are significantly better, as indicated by much lower CEP means and rather lower standard deviations. This demonstrates that the diversification and intensification operations handled as a result of hybridisation pay off and prove the superiority of the Hybrid algorithm.
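
The CEP measure itself is straightforward (a minimal Python sketch):

```python
def cep(predicted, actual):
    """Classification error percentage: misclassified test patterns
    over the total number of test patterns, times 100."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * wrong / len(actual)
```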

5 Conclusions

In this paper, a hybrid bee-inspired algorithm is proposed and comprehensively investigated on two numerical optimisation problem types: (1) functional optimisation benchmarks and (2) ANN training through optimising the connection weights of feed-forward NN models. Although a number of variants of both mainstream bee-inspired algorithms, BA and ABC, have been developed and used for various problem-solving purposes, there had been no attempt to merge the strengths of the two mainstream algorithms so as to achieve more efficient and robust problem-solving. This paper presents a novel bee algorithm that hybridises the BA and ABC algorithms for better performance.

The properties of both algorithms are first reviewed to identify their strengths and weaknesses; then remedies are identified to cure the weaknesses, yielding revised versions of both algorithms, rBA and rABC. Afterwards, a framework is devised to harmonise and reuse the bee operators of all the original and revised algorithms within the search process. It is demonstrated that the existing and improved capabilities of the BA and ABC algorithms with respect to diversification and intensification are pulled into the Hybrid, so that the strengths of all participating algorithms are merged in the Hybrid framework. The framework has been comparatively tested by (1) solving very high-dimensional numerical optimisation benchmarks and (2) optimising the weights of feed-forward NN models developed for classification purposes. The experimental results clearly suggest that the revised versions of BA and ABC (rBA and rABC) improve performance by a large margin and, more importantly, that the proposed Hybrid algorithm performs significantly better than both the original and revised versions of the two algorithms.

This achievement is attained through the better harmony induced in the hybrid algorithm, where rBA and rABC provide better intensification, and the random yet systematic use of operators helps achieve improved diversification. This neither limits the Hybrid framework to the set of equations proposed and used here, nor rules out selection policies other than random selection. Further studies are therefore needed to test the Hybrid framework with a variety of bee operators and selection policies to identify the best configurations bespoke to the problems under investigation. In addition, the hybrid framework needs to be investigated further on combinatorial optimisation problems as the next step of this research.