Introduction

In human society, brainstorming is an effective method when a group of people faces a difficult problem. People with different backgrounds gather to propose new ideas, cooperating and inspiring one another until the problem is finally solved. Brain storm optimization (BSO) [29] is a novel swarm intelligence (SI) algorithm for global optimization inspired by this process, and it has received increasing attention in recent years [21, 25, 26, 34]. The primary idea of BSO is to divide the population into different clusters and to generate new solutions within a cluster or between two clusters. Therefore, the key steps affecting the performance of BSO are the solution clustering strategy and the solution mutation strategy.

k-means is the clustering strategy used in the classic BSO. However, its suitability for BSO is worth discussing. First, k-means is suited to spherical clusters and cannot deal with clusters of arbitrary shapes. Second, the high time complexity of k-means affects the efficiency of BSO; in the early stage of evolution especially, clustering with k-means is of little use because the individuals are still scattered. Third, for a problem with a D-dimensional solution space and a single objective space, k-means clusters the population only in the solution space and ignores the information in the objective space.

There has been a lot of work trying to improve the clustering strategy of BSO in different ways. First, researchers have tried to improve the time efficiency of BSO from two perspectives. On the one hand, different grouping algorithms were proposed to replace k-means, such as the simple grouping method (SGM) [40] and the random grouping strategy (RGS) [4]. Besides, a class of objective space-based grouping algorithms [11, 19, 30] can greatly reduce the time cost; this approach was first used in BSO-OS (BSO in objective space), proposed by Shi [30]. On the other hand, some work reduced the number of calls to k-means [3, 5] or controlled the number of iterations of k-means [43]. Second, some researchers considered the fitness and the position of solutions simultaneously and proposed new clustering strategies, such as the fitness-guided clustering strategy [41] and the role-playing strategy [6]. Third, some well-known clustering algorithms were also used in BSO, such as affinity propagation [8] and agglomerative hierarchical clustering [7].

The mutation strategy is also a focus of BSO research. In the classic BSO, Gaussian mutation is performed on the selected or combined individual, and a transfer function updates the step size of the mutation. An obvious defect of this mutation strategy is that the step size follows a fixed schedule that depends on neither the problem nor the current population distribution.

The improvements of the mutation strategy fall into three main categories. The first builds on the mutation strategy of the classic BSO, such as improvements of the step-size update [4, 10, 24, 42] and adaptive selection of the mutation strategy [38]. The second contains several new mutation strategies, such as modified BSO (MBSO) [40], advanced discussion mechanism-based BSO (ADMBSO) [35], BSO with learning strategy (BSOLS) [32], adaptive BSO with multiple strategies (AMBSO) [9], and active learning BSO (ALBSO) [5]. The third consists of hybrids of BSO with existing algorithms, such as BSO with a chaotic operation (BSO-CO) [36], BSO with differential evolution [2], hybrid BSO and simulated annealing [16], and a hybrid of the covariance matrix adaptation evolution strategy and global-best BSO [13].

In this paper, we propose a novel and efficient BSO, termed BSO20 (proposed in 2020). BSO20 improves the classic BSO in two aspects. First, regarding the clustering strategy, an efficient hybrid clustering strategy is designed. Second, regarding the mutation strategy, a modified mutation strategy is proposed to improve exploration efficiency. In summary, the main contributions of this paper are listed as follows.

  1. We propose a hybrid clustering strategy, which combines nearest-better clustering (NBC) [27, 28] and RGS. NBC is a competitive clustering strategy based on both the objective space and the solution space, which can better reveal information about the landscape; in particular, NBC is suitable for clusters with arbitrary shapes. Moreover, RGS is used to cluster the scattered population in the early stage of the optimization, as well as poor individuals during the evolution process. Through RGS, the hybrid clustering strategy reduces the time cost of the algorithm and increases the diversity within a cluster.

  2. We propose a modified mutation strategy for intra-cluster and inter-cluster mutation. The primary idea is to enhance the ability of individuals to share information and thereby improve the exploration efficiency of the population. In intra-cluster mutation, each selected individual shares information with one better individual; in inter-cluster mutation, each selected individual shares information with two individuals from different clusters.

BSO20 is tested on the problems of the 2017 IEEE Congress on Evolutionary Computation competition on real-parameter numerical optimization (CEC’17 RPNO) and compared with several up-to-date BSO variants as well as two variants of particle swarm optimization (PSO). The experimental results show that the performance of BSO20 is competitive.

The organization of this paper is as follows. The next section introduces the related work. The section after that describes the details of the proposed BSO20 algorithm. The subsequent section presents the experimental results and the parameter analysis, followed by a section discussing the CPU time and the convergence of BSO20. Finally, the last section concludes this paper and outlines future work.

Related work

In this section, BSO and some of its clustering strategies are reviewed. Then, a competitive clustering strategy, NBC, is introduced.

Brain storm optimization

BSO was proposed by Shi [29] and is inspired by the human brainstorming process. The core operations of one iteration of BSO include the following three parts.

  1. Solution clustering: The clustering strategy divides the population, grouping the NP individuals into k clusters. To avoid premature convergence of BSO, a replacing operator replaces a randomly chosen cluster center with a randomly generated solution with probability \(p_{init}\).

  2. Solution generation: In the classic BSO, a new solution \(x_{new}\) is generated according to Formula (1), where y is a base individual, d is the dimension index, \({\mathcal {N}}^d(0,1)\) is a random number drawn from the standard normal distribution, and \(\xi (t)\) is the step size at iteration t, updated according to Formula (2).

    $$x^d_{\mathrm{new}} = y^d + \xi ^d (t) \cdot {\mathcal {N}}^d(0, 1),$$
    (1)
    $$\xi ^d (t) = \text {logsig}\left( \frac{0.5\times T-t}{20} \right) \cdot \mathrm{rand}^d.$$
    (2)

    In Formula (2), T is the maximum number of iterations, t is the current iteration, rand\(^d\) is a random number between 0 and 1 following the uniform distribution, and the transfer function “logsig(\(\cdot \))” is described by Formula (3).

    $$\begin{aligned} \text {logsig}(a) = \frac{1}{1+\exp (-a)}. \end{aligned}$$
    (3)

    There are two mutation modes in BSO, namely intra-cluster mutation and inter-cluster mutation, performed with probabilities \(p_{\mathrm{one}\_\mathrm{cluster}}\) and (\(1-p_{\mathrm{one}\_\mathrm{cluster}}\)), respectively. In intra-cluster mutation, an individual is selected from a random cluster as the base individual y, where y is the center of the selected cluster with probability \(p_{\mathrm{one}\_\mathrm{best}}\). In inter-cluster mutation, the base individual y is generated according to Formula (4). A minimal code sketch of the whole generation step is given after this list.

    $$\begin{aligned} y = r \cdot x_{i_1} + (1-r) \cdot x_{i_2}. \end{aligned}$$
    (4)

    where r is a random number between 0 and 1, \(x_{i_1}\) and \(x_{i_2}\) are two different individuals coming from two randomly selected clusters, and \(x_{i_1}\) and \(x_{i_2}\) are the centers of their respective clusters with probability \(p_{\mathrm{two}\_\mathrm{best}}\).

  3. Solution selection: The new solution is compared with the old solution with the same index, and the better one is retained in the population.
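As a concrete illustration of the generation step, the following minimal Python sketch implements Formulas (1)–(3). All names are ours, not from the original code, and boundary handling is omitted.

```python
import numpy as np

def logsig(a):
    """Transfer function of Formula (3)."""
    return 1.0 / (1.0 + np.exp(-a))

def generate_solution(y, t, T):
    """Classic BSO generation step, Formulas (1)-(2).

    y: 1-D numpy array (the base individual);
    t: current iteration; T: maximum number of iterations.
    """
    d = y.size
    xi = logsig((0.5 * T - t) / 20.0) * np.random.rand(d)  # step size, Formula (2)
    return y + xi * np.random.randn(d)                     # Gaussian mutation, Formula (1)

# Example: mutate a 10-dimensional base individual at iteration 100 of 2500.
x_new = generate_solution(np.zeros(10), t=100, T=2500)
```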

Clustering strategies in BSO

The solution clustering strategy is one of the primary steps of BSO and can be categorized in several ways. In this section, the existing clustering strategies are divided into four categories according to the clustering space: solution space-based clustering strategies represented by k-means [29], objective space-based clustering strategies represented by BSO-OS [30], clustering strategies based on both the solution space and the objective space, such as fitness-guided clustering [41], and other clustering strategies.

Solution space-based clustering strategies

The solution space-based clustering strategies are the mainstream clustering strategies in BSO; they group similar individuals into one cluster according to their distances.

k-means is a popular clustering strategy in BSO, and its idea is as follows. First, k cluster centers are randomly generated in the solution space, and each individual is assigned to the nearest cluster center. Then, the cluster centers are recalculated, and the population is regrouped around the new centers. These steps are repeated until the clusters become stable. However, k-means requires multiple iterations to update the clusters, and the number of iterations depends on the choice of the initial centers. To reduce the time cost of k-means in the classic BSO, efforts have been made in two directions. On the one hand, the number of calls to k-means is reduced; for example, k-means is called with probability \(p_c\) in [3] and [5], where \(p_c\) is a constant in [5] and is dynamically adjusted in [3]. On the other hand, the number of iterations of k-means is reduced; for example, in [43], the median is used in place of the mean, which stabilizes the cluster centers and reduces the number of iterations.
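For reference, a minimal illustrative k-means sketch (our own code, not the classic BSO implementation):

```python
import numpy as np

def kmeans(P, k, max_iter=100):
    """Minimal k-means over the solution space; rows of P are individuals."""
    centers = P[np.random.choice(len(P), k, replace=False)]
    for _ in range(max_iter):
        # Assign every individual to its nearest center.
        dist = np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute the centers; keep the old center if a cluster is empty.
        new_centers = np.array([P[labels == j].mean(axis=0) if (labels == j).any()
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):   # clusters are stable
            break
        centers = new_centers
    return labels, centers

# Example: cluster a random 100 x 2 population into k = 5 clusters.
labels, centers = kmeans(np.random.rand(100, 2), k=5)
```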

In addition to k-means, affinity propagation clustering [8] and agglomerative hierarchical clustering [7] have also been adopted in BSO. Compared with k-means, however, their time complexity of \(O(n^2\log \,n)\) is high.

Moreover, some new clustering algorithms have been designed for BSO. For example, Zhan et al. proposed SGM in [40]: SGM first randomly selects M different individuals from the population as cluster centers, and then assigns each individual to the nearest center.

Objective space-based clustering strategies

For single-objective optimization problems, the objective space-based clustering strategies group individuals according to fitness. Shi [30] first proposed such a strategy in BSO-OS, where the solutions are divided into two clusters: the individuals with better fitness are called elite solutions, and the others are called normal solutions. There is no need to compute distances between individuals in BSO-OS, which greatly reduces the computational burden. Similarly, in [19], Li et al. proposed a BSO based on a competition mechanism, which builds two competing groups in the same way.
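As the description suggests, this grouping needs only a sort by fitness. A minimal sketch in the spirit of BSO-OS follows; `elite_ratio` is our hypothetical name for the elite proportion, which is a tunable parameter of BSO-OS.

```python
import numpy as np

def objective_space_grouping(fit, elite_ratio=0.1):
    """Split the population by fitness alone (no distance computation);
    smaller fitness is better (minimization). elite_ratio is a
    hypothetical parameter name, not taken from [30]."""
    order = np.argsort(fit)                        # best individuals first
    n_elite = max(1, int(elite_ratio * len(fit)))
    return order[:n_elite], order[n_elite:]        # elite solutions, normal solutions
```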

In [11], Mohammed proposed another objective space-based clustering strategy called fitness-based grouping, which ensures that the good and the poor individuals in the population are evenly distributed across the clusters.

Clustering strategy based on solution and objective space

There has been little work that considers clustering individuals in both the solution space and the objective space. In [41], Zhang et al. proposed the fitness-guided clustering strategy, in which clusters are formed as follows. First, all individuals in the population are labeled as unprocessed. Then, the best individual \(x_b\) among the unprocessed individuals is selected as a cluster center. Next, the n unprocessed individuals nearest to \(x_b\) are grouped into a new cluster with \(x_b\), and all of them are labeled as processed. These steps are repeated until every individual has been processed.
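This procedure translates directly into code. A minimal sketch under the description above, with n denoting the neighborhood size from [41]:

```python
import numpy as np

def fitness_guided_clustering(P, fit, n):
    """Sketch of the fitness-guided clustering of [41] as described above;
    smaller fitness is better. Returns clusters as lists of indices."""
    unprocessed = set(range(len(P)))
    clusters = []
    while unprocessed:
        idx = sorted(unprocessed, key=lambda i: fit[i])
        x_b = idx[0]                                   # best unprocessed individual
        rest = sorted(idx[1:], key=lambda i: np.linalg.norm(P[i] - P[x_b]))
        members = [x_b] + rest[:n]                     # x_b plus its n nearest neighbors
        clusters.append(members)
        unprocessed -= set(members)
    return clusters

# Example: cluster a random population with neighborhood size n = 9.
P = np.random.rand(50, 3)
clusters = fitness_guided_clustering(P, np.random.rand(50), n=9)
```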

In [6], Chen et al. proposed an enhanced BSO with a role-playing strategy (PRBSO). The solutions are grouped into three clusters according to the fitness difference and the distance between each solution and the best solution. Among them, good solutions close to the best solution are called innovative ideas, good solutions far from the best solution are called conservative ideas, and the others are called ordinary ideas.

Other clustering strategies

In [4], Cao et al. proposed a random grouping BSO (RGBSO), in which RGS is adopted as the clustering strategy. RGS randomly divides the population into k clusters of the same size, and the best individual in each cluster serves as the cluster center. Compared with other clustering strategies, RGS has the lowest time complexity, because no distance computation or population sorting is needed; moreover, each cluster retains greater diversity under RGS. In [33], Wang et al. proposed to use orthogonal experimental design to improve the distribution of individuals in clusters.
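A minimal sketch of RGS as just described (our own illustrative code):

```python
import numpy as np

def rgs(fit, k):
    """Random grouping strategy: shuffle indices and split them into k
    equal-size clusters; the best member of each cluster is its center."""
    idx = np.random.permutation(len(fit))
    clusters = np.array_split(idx, k)
    centers = [c[np.argmin(fit[c])] for c in clusters]  # best individual per cluster
    return [list(c) for c in clusters], centers

# Example: divide 100 individuals into k = 5 random clusters.
clusters, centers = rgs(np.random.rand(100), k=5)
```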

Nearest-better clustering

Nearest-better clustering (NBC) [20, 22, 27, 28] is an excellent clustering strategy for multimodal optimization that exploits both the solution space and the objective space. NBC relies on the assumption that the distance between optimal solutions is greater than the weighted average distance of all individuals to their nearest better neighbors. The details of NBC are as follows.

First, the population P is sorted by fitness from best to worst, and the distance between each pair of individuals is calculated. Then, an empty tree is created, and the best individual becomes the root node. Each remaining individual is connected to its nearest better neighbor, and the length of the edge is the distance between the two individuals; in [20], the parent node is called the leader and the child node a follower. Next, every edge in the tree is traversed, and all edges longer than a threshold \(\varepsilon \) are cut off, where \(\varepsilon \) is generally set to the weighted average of the lengths of all edges. Finally, each connected component forms a cluster.

Fig. 1 The schematic of NBC

Figure 1 illustrates an example of clustering the population \(P=\{x_1, x_2, x_3, x_4, x_5\}\) with NBC. Each individual connects to its nearest better neighbor to form a tree. Then, the edge \(e\langle x_3, x_4\rangle \) is cut off according to the threshold \(\varepsilon \), and the population P is divided into two clusters.
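For concreteness, the following Python sketch is our reconstruction of the NBC procedure, not the authors' code; the threshold is taken as phi times the mean edge length, with the weight phi as a parameter.

```python
import numpy as np

def nbc(P, fit, phi=2.0):
    """Nearest-better clustering sketch: connect each individual to its
    nearest better neighbor, cut edges longer than eps = phi * mean edge
    length, and return the connected components as clusters."""
    n = len(P)
    order = np.argsort(fit)                      # best first (minimization)
    rank = np.empty(n, dtype=int); rank[order] = np.arange(n)
    dist = np.linalg.norm(P[:, None] - P[None, :], axis=2)

    parent = np.full(n, -1)                      # nearest-better tree (root: best)
    edge = np.full(n, np.inf)
    for i in range(n):
        better = np.where(rank < rank[i])[0]
        if better.size:
            parent[i] = better[dist[i, better].argmin()]
            edge[i] = dist[i, parent[i]]

    eps = phi * edge[np.isfinite(edge)].mean()   # threshold on edge length
    parent[edge > eps] = -1                      # cut the long edges

    label = np.full(n, -1)                       # component label = index of its root
    def root(i):
        if label[i] == -1:
            label[i] = root(parent[i]) if parent[i] != -1 else i
        return label[i]
    for i in range(n):
        root(i)
    return [list(np.where(label == r)[0]) for r in np.unique(label)]

# Example: cluster a random 2-D population.
clusters = nbc(np.random.rand(40, 2), np.random.rand(40))
```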

The proposed algorithm: BSO20

Motivation

The core parts of the classic BSO are the clustering strategy and the mutation strategy. However, both the clustering performance and the exploration performance leave room for improvement.

First, in the classic BSO, the population is clustered by k-means before each iteration, and the high time complexity of k-means reduces the efficiency of the algorithm. Moreover, k-means uses only the information of the solution space and ignores the objective space. Although the clustering strategy has already received some study, its performance can still be improved. Consequently, a novel and efficient clustering strategy is needed for BSO, and NBC is considered in this paper.

The steps of NBC comprise three parts: sorting with time complexity \(O(n\log n)\), distance calculation with time complexity \(O(n^2)\), and cluster partition with complexity O(n); therefore, the overall time complexity of NBC is \(O(n^2)\), where n is the population size. NBC uses the distance information of the solution space and the fitness information of the objective space, which is promising for obtaining more knowledge about the landscape. However, Luo et al. pointed out that a “long-tail phenomenon” exists in NBC [23], meaning that NBC may handle poor individuals unreliably in some cases. Therefore, in this paper, NBC is used to cluster the better individuals, while RGS is used to cluster the poor ones.

Second, the primary defect of the mutation strategy of the classic BSO is that the step-size update is independent of the problem and of the state of the current population, which limits the exploration performance of BSO. For this reason, an improved mutation strategy is proposed, in which the generation of the base individual is modified to improve exploration efficiency.

In summary, a novel and efficient variant of BSO, named BSO20, is proposed in this paper. In the rest of this section, the clustering strategy and the mutation strategy of BSO20 are detailed, and the algorithm flow of BSO20 is described at the end of this section.

Clustering strategy of BSO20

A new hybrid clustering strategy combining NBC and RGS is used in BSO20. The primary idea is that NBC is used to cluster better individuals to exploit the potential of promising areas, while RGS is used to cluster poorer individuals to increase the diversity within the cluster, and to further reduce the time cost of BSO20. In the early stage of evolution, individuals are scattered randomly in the search space, and RGS is the main clustering strategy at this time. Then, NBC is used to cluster better individuals when the promising areas could be found, and the other individuals are clustered by RGS to explore the search space. As the population evolves, the number of individuals using NBC for clustering is gradually increased, and finally, all individuals are clustered by NBC.

The number and the size of the clusters need to be considered when designing a clustering strategy, because too few clusters weaken the inter-cluster mutation, and too-small clusters weaken the intra-cluster mutation. In this paper, the size of the clusters generated by RGS is set to \(S_r\), and the total number of clusters in the hybrid clustering strategy is set to k (Formula (5)).

$$\begin{aligned} k = \frac{\mathrm{NP}}{S_r} \end{aligned}$$
(5)

where NP is the population size.

During the population evolution, the number of clusters generated by RGS decreases linearly with the number of iterations (Formula (6)).

$$\begin{aligned} k_r = \left\lceil k \cdot \left( 1 - \frac{t}{T} \right) \right\rceil \end{aligned}$$
(6)

where T is the maximum number of iterations and t is the current iteration number.

Therefore, the number of individuals clustered by NBC can be calculated as Formula (7).

$$\begin{aligned} N_\mathrm{nbc} = \mathrm{NP} - k_r \cdot S_r \end{aligned}$$
(7)

To ensure that the total number of clusters is k, the number of clusters formed by NBC, i.e., \(k_n\), is set to \(k - k_r\). Therefore, the partition condition of NBC is modified: the \(k_n-1\) longest edges are cut off to generate \(k_n\) clusters.
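As a worked example of Formulas (5)–(7), consider the settings used in the experiments for a 30D problem (population size NP = 4D = 120, \(S_r\) = 20, and roughly \(10^4\times D/\mathrm{NP} = 2500\) iterations):

```python
import math

NP, S_r, T = 120, 20, 2500
k = NP // S_r                        # Formula (5): k = 6 clusters in total

t = 1250                             # halfway through the run
k_r = math.ceil(k * (1 - t / T))     # Formula (6): k_r = 3 clusters formed by RGS
N_nbc = NP - k_r * S_r               # Formula (7): 60 individuals clustered by NBC
k_n = k - k_r                        # k_n = 3 clusters formed by NBC
print(k, k_r, N_nbc, k_n)            # -> 6 3 60 3
```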

Overall, the pseudocode of the clustering strategy of BSO20 is given as Algorithm 1, where lines 3–7 describe the clustering process of NBC and lines 9–12 describe that of RGS.

[Algorithm 1 The clustering strategy of BSO20]
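Since the pseudocode image is not reproduced here, the following Python sketch is our reconstruction of the hybrid clustering strategy from the description above; the `nbc_cut` helper uses the modified partition condition (cut the k_n - 1 longest edges) rather than the threshold rule, and minimization is assumed.

```python
import numpy as np

def nbc_cut(P, fit, k_n):
    """NBC with BSO20's partition condition: build the nearest-better tree,
    then cut the k_n - 1 longest edges to obtain exactly k_n clusters."""
    n = len(P)
    order = np.argsort(fit)
    rank = np.empty(n, dtype=int); rank[order] = np.arange(n)
    dist = np.linalg.norm(P[:, None] - P[None, :], axis=2)
    parent = np.full(n, -1)
    length = np.full(n, -np.inf)                 # the root keeps -inf, so it is never cut
    for i in range(n):
        better = np.where(rank < rank[i])[0]
        if better.size:
            parent[i] = better[dist[i, better].argmin()]
            length[i] = dist[i, parent[i]]
    if k_n > 1:
        parent[np.argsort(length)[-(k_n - 1):]] = -1   # cut the k_n - 1 longest edges
    label = np.full(n, -1)
    def root(i):
        if label[i] == -1:
            label[i] = root(parent[i]) if parent[i] != -1 else i
        return label[i]
    for i in range(n):
        root(i)
    return [np.where(label == r)[0] for r in np.unique(label)]

def hybrid_clustering(P, fit, t, T, S_r):
    """Sketch of Algorithm 1: NBC on the better individuals (lines 3-7 of
    Algorithm 1), RGS on the remaining ones (lines 9-12)."""
    NP = len(P)
    k = NP // S_r                                # Formula (5)
    k_r = int(np.ceil(k * (1 - t / T)))          # Formula (6)
    n_nbc = NP - k_r * S_r                       # Formula (7)
    order = np.argsort(fit)                      # best individuals first
    clusters = []
    if n_nbc > 0:
        best = order[:n_nbc]
        clusters += [best[c] for c in nbc_cut(P[best], fit[best], k - k_r)]
    rest = np.random.permutation(order[n_nbc:])  # RGS on the poorer individuals
    clusters += [rest[j * S_r:(j + 1) * S_r] for j in range(k_r)]
    return clusters
```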

Mutation strategy of BSO20

The mutation strategy of BSO20 is an improved version of that of the classic BSO. Its main idea is to promote efficient information exchange between individuals within a cluster or among multiple clusters, thereby improving the exploration performance of BSO.

Before describing the modified mutation strategy, the concept of the leader set needs to be introduced briefly. In the clusters formed by NBC, the leader set of an individual x consists of its ancestor nodes in the nearest-better tree. In the clusters formed by RGS, the leader set of x consists of the individuals in the cluster that are better than x. In particular, if the leader set of x is empty, then the leader of x is x itself.

In the intra-cluster mutation, each selected individual shares information with a randomly chosen leader to generate a base individual y (Formula (8)). In the inter-cluster mutation, each selected individual shares information with two individuals from two different clusters (Formula (9)).

$$y^d = (1 - r) \cdot x_{s}^d + r \cdot \mathrm{leader}_s^d,$$
(8)
$$y^d = (1 - r_1 - r_2) \cdot x_{s}^d + r_1 \cdot x_{i_1}^d + r_2 \cdot x_{i_2}^d,$$
(9)

where r, \(r_1\), and \(r_2\) are random numbers uniformly distributed between 0 and 1, \(x_s\) is an individual randomly selected from the population, leader\(_s\) is an individual randomly selected from the leader set of \(x_s\), \(x_{i_1}\) and \(x_{i_2}\) are randomly selected from two randomly chosen clusters, and d denotes the dimension index.

The mutation of the base individual y is the same as in the classic BSO, except that \({\mathcal {N}}_b^d(0,1)\) is used instead of \({\mathcal {N}}^d(0, 1)\) in BSO20 (Formula (10)): the random numbers drawn from the Gaussian distribution are limited to the range between lb and ub.

$$x^d_{\mathrm{new}} = y^d + \xi ^d (t) \cdot {\mathcal {N}}_b^d(0, 1), \qquad {\mathcal {N}}_b^d(0, 1) = \begin{cases} \mathrm{ub}, &{} \text{ if } {\mathcal {N}}^d(0,1) > \mathrm{ub}, \\ \mathrm{lb}, &{} \text{ if } {\mathcal {N}}^d(0,1) < \mathrm{lb}, \\ {\mathcal {N}}^d(0,1), &{} \text{ otherwise}, \end{cases}$$
(10)

where ub and lb are the upper bound and lower bound of the search space, respectively, and \(\xi ^d(t)\) is updated as Formulas (2) and (3).
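A minimal Python sketch of the modified mutation, Formulas (8)–(10), follows; we treat r, \(r_1\), and \(r_2\) as scalars, since, as in Formula (4), they are not indexed by dimension.

```python
import numpy as np

def base_intra(x_s, leader_set):
    """Formula (8): move x_s toward one randomly chosen leader;
    if the leader set is empty, x_s is its own leader."""
    leader = leader_set[np.random.randint(len(leader_set))] if len(leader_set) else x_s
    r = np.random.rand()
    return (1 - r) * x_s + r * leader

def base_inter(x_s, x_i1, x_i2):
    """Formula (9): combine x_s with individuals from two different clusters."""
    r1, r2 = np.random.rand(), np.random.rand()
    return (1 - r1 - r2) * x_s + r1 * x_i1 + r2 * x_i2

def gaussian_mutation(y, t, T, lb, ub):
    """Formula (10): Gaussian mutation whose N(0,1) samples are clipped
    to [lb, ub]; the step size follows Formulas (2)-(3)."""
    xi = 1.0 / (1.0 + np.exp(-(0.5 * T - t) / 20.0)) * np.random.rand(y.size)
    return y + xi * np.clip(np.random.randn(y.size), lb, ub)

# Example: intra-cluster mutation of a 5-D individual with two leaders.
x_s = np.zeros(5)
y = base_intra(x_s, [np.ones(5), 2 * np.ones(5)])
x_new = gaussian_mutation(y, t=100, T=2500, lb=-100, ub=100)
```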

[Algorithm 2 The pseudocode of BSO20]

BSO20

In the preceding parts, the clustering strategy and the mutation strategy of BSO20 were introduced. Algorithm 2 shows the pseudocode of BSO20, and the details are described as follows.

First, after the population and parameters are initialized (i.e., lines 1–2), the population P is clustered by the hybrid clustering strategy, and the leader set of each individual is obtained (i.e., lines 4–5).

Then, from line 7 to line 16, an individual \(x_s\) is randomly selected from the population to generate a base individual y. Intra-cluster mutation is performed with probability \(p_{\mathrm{one}\_\mathrm{cluster}}\), and inter-cluster mutation with probability \(1 - p_{\mathrm{one}\_\mathrm{cluster}}\).

Finally, from line 17 to line 22, a new individual \(x_\mathrm{new}\) is obtained from the base individual y by Gaussian mutation, and the population is updated.

In BSO20, the replacing operator is not used, because it contributes little to the algorithm, as verified by the experiment in [39]. Moreover, the cluster-center selection probabilities \(p_{\mathrm{one}\_\mathrm{best}}\) and \(p_{\mathrm{two}\_\mathrm{best}}\) are dropped: in intra-cluster mutation, the selected individual moves toward its leader, so there is still a chance to explore near the cluster center, and in inter-cluster mutation, the selected individual shares information with two clusters. BSO20 thus retains only one parameter of the classic BSO, the intra-cluster mutation probability \(p_{\mathrm{one}\_\mathrm{cluster}}\), and adds one parameter, the RGS cluster size \(S_r\), which eases parameter tuning. The source code of BSO20 is available at https://github.com/squrriler-xu/BSO20.
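To tie the steps together, the following self-contained sketch mirrors the structure of Algorithm 2 on a toy problem. For brevity, RGS-only clustering with each cluster's better members as the leader set stands in for the full hybrid strategy, and clamping new solutions to the search bounds is our assumption; this is an illustration, not the released code.

```python
import numpy as np

def bso20_style_loop(f, D, lb, ub, T=1000, S_r=20, p_one=0.1, seed=0):
    """Simplified BSO20-style main loop (cf. Algorithm 2); RGS-only
    clustering stands in for the hybrid NBC+RGS strategy described above."""
    rng = np.random.default_rng(seed)
    NP = 4 * D                                    # population size, as in the experiments
    k = NP // S_r                                 # Formula (5)
    P = rng.uniform(lb, ub, (NP, D))
    fit = np.array([f(x) for x in P])
    for t in range(T):
        # Clustering: random equal-size clusters; better members act as leaders.
        idx = rng.permutation(NP)
        clusters = [idx[j * S_r:(j + 1) * S_r] for j in range(k)]
        cid = np.empty(NP, dtype=int)
        for c, members in enumerate(clusters):
            cid[members] = c
        for _ in range(NP):                       # generate NP new solutions
            s = rng.integers(NP)                  # randomly selected individual x_s
            if rng.random() < p_one:              # intra-cluster mutation, Formula (8)
                members = clusters[cid[s]]
                leaders = members[fit[members] < fit[s]]
                leader = P[rng.choice(leaders)] if leaders.size else P[s]
                r = rng.random()
                y = (1 - r) * P[s] + r * leader
            else:                                 # inter-cluster mutation, Formula (9)
                c1, c2 = rng.choice(k, size=2, replace=False)
                r1, r2 = rng.random(), rng.random()
                y = ((1 - r1 - r2) * P[s] + r1 * P[rng.choice(clusters[c1])]
                     + r2 * P[rng.choice(clusters[c2])])
            # Gaussian mutation with bounded N(0,1), Formulas (2), (3), (10).
            xi = 1 / (1 + np.exp(-(0.5 * T - t) / 20)) * rng.random(D)
            x_new = np.clip(y + xi * np.clip(rng.standard_normal(D), lb, ub), lb, ub)
            f_new = f(x_new)
            if f_new < fit[s]:                    # solution selection (same index)
                P[s], fit[s] = x_new, f_new
    return P[fit.argmin()], fit.min()

# Usage: minimize the sphere function in 10 dimensions.
best_x, best_f = bso20_style_loop(lambda x: float(np.sum(x * x)),
                                  D=10, lb=-100, ub=100, T=200)
```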

Experiments

In this section, BSO20 is tested on the CEC’17 RPNO, and the parameters of BSO20 are analyzed at the end of the section.

Experimental setting

In this paper, 29 global optimization problems from CEC’17 RPNO [1] are used as the benchmark, which includes four types of functions: unimodal functions (\(F_1\), \(F_3\)), simple multimodal functions (\(F_4\)–\(F_{10}\)), hybrid functions (\(F_{11}\)–\(F_{20}\)), and composition functions (\(F_{21}\)–\(F_{30}\)). The problem \(F_2\) is excluded following the recommendation of [1].

The benchmark is tested in dimensions D = 30, 50, and 100 (denoted 30D, 50D, and 100D). For each function, the benchmark suggests running the algorithm 51 times independently and reporting the mean and standard deviation of the 51 results [1]. The maximum number of evaluations for one run is \(10^4\times D\).

To verify the performance of BSO20, the classic BSO [29] and several variants are used as compared algorithms, namely BSO-OS [30], RGBSO [4], BSO with chaotic local search (CBSO) [37], active learning BSO (ALBSO) [5], and bare-bones global-best BSO (BBGBSO) [12]. BSO-NBC is also included; it is identical to the classic BSO except that NBC replaces k-means as the clustering strategy. The clustering strategies of the classic BSO and BSO-OS are based on the solution space and the objective space, respectively, and the clustering strategy of RGBSO is RGS, so these four algorithms serve to compare the effect of the clustering strategy on performance. BSO20 is also compared with the latest variants of BSO, i.e., CBSO, ALBSO, and BBGBSO, to demonstrate its competitiveness. Moreover, two variants of particle swarm optimization (PSO), scout particle swarm optimization (ScPSO) [17] and phasor particle swarm optimization (PPSO) [15], are included in the comparison.

BSO20 has only two parameters, which are set as follows: the size of the clusters formed by RGS, \(S_r\), is set to 20, and the intra-cluster mutation probability \(p_{\mathrm{one}\_\mathrm{cluster}}\) is set to 0.1. In contrast, the compared algorithms have more parameters than BSO20. For example, the classic BSO has five main parameters: the number of clusters k, the replacing-operator probability \(p_\mathrm{init}\), the intra-cluster mutation probability \(p_{\mathrm{one}\_\mathrm{cluster}}\), and the cluster-center selection probabilities \(p_{\mathrm{one}\_\mathrm{best}}\) and \(p_{\mathrm{two}\_\mathrm{best}}\). The numbers of main parameters of BSO-OS, RGBSO, CBSO, ALBSO, and BBGBSO are 3, 4, 5, 5, and 8, respectively. All compared algorithms adopt the parameter settings recommended in the original literature [4, 5, 12, 29, 30, 37]. Moreover, the population size of all algorithms is set to 4D.

For each algorithm, the mean and standard deviation over the 51 runs of each function are reported as the experimental results. Moreover, the Wilcoxon signed-rank test is used to verify significant differences between the results of BSO20 and those of the compared algorithms.
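As an illustration of this testing step, the sketch below applies a paired Wilcoxon signed-rank test with SciPy to placeholder data; the paper itself obtains these statistics with KEEL 3.0, and the arrays here merely stand in for the 29 per-function means of two algorithms.

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder arrays standing in for the 29 per-function mean errors of two
# algorithms; in the paper these values come from Tables 1-6.
rng = np.random.default_rng(1)
bso20_means = rng.random(29)
other_means = bso20_means + rng.normal(0.1, 0.05, 29)

stat, p = wilcoxon(bso20_means, other_means)  # paired, two-sided test
print(f"W = {stat}, p = {p:.4f}")             # reject H0 at the 0.05 level if p < 0.05
```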

Experimental results

Table 1 Experimental results on 30-D functions \(F_1\), \(F_3\)\(F_{16}\)
Table 2 Experimental results on 30-D functions \(F_{17}\)\(F_{30}\)
Table 3 Experimental results on 50-D functions \(F_1\), \(F_3\)\(F_{16}\)
Table 4 Experimental results on 50-D functions \(F_{17}\)\(F_{30}\)
Table 5 Experimental results on 100-D functions \(F_1\), \(F_3\)\(F_{16}\)
Table 6 Experimental results on 100-D functions \(F_{17}\)\(F_{30}\)

The experimental results on the 30D, 50D, and 100D functions are listed in Tables 1, 2, 3, 4, 5, and 6. In each row, the bolded entry is the best mean or standard deviation of the fitness values found by the algorithms. The last row of Tables 2, 4, and 6 summarizes the comparison between BSO20 and each compared algorithm on the corresponding dimension, where “\(+/-\)” indicates the number of functions on which BSO20 is better/worse than the compared algorithm.

The statistical results show that the performance of BSO20 remains stable across dimensions. BSO20 outperforms the classic BSO on 23, 24, and 25 functions on the 30D, 50D, and 100D problems, respectively, as shown at the bottom of the second column of Tables 2, 4, and 6. These results indicate that the performance of the classic BSO degrades as the dimension increases, while BSO20 maintains good performance. Similarly, the “\(+/-\)” comparison results between BSO20 and BSO-OS are “25/4,” “25/4,” and “26/3,” as shown at the bottom of the third column of Tables 2, 4, and 6.

Then, the results of RGBSO and BSO-NBC show that RGS is competitive on higher-dimensional problems and that NBC is more valuable than k-means for BSO. Furthermore, as shown at the bottom of the fourth and fifth columns of Tables 2, 4, and 6, BSO20 outperforms RGBSO and BSO-NBC on more than 20 functions in every dimension, which demonstrates the superiority of the hybrid clustering strategy.

From the sixth to the eighth columns of Tables 1-6, the comparisons of BSO20 against the state-of-the-art variants of BSO show that BSO20 is the best performer overall. BSO20 has a clear advantage over CBSO and ALBSO; in particular, it outperforms ALBSO on all functions. Compared with BBGBSO, BSO20 also performs well on the 30D problems, but as the dimension increases, its competitiveness gradually decreases, especially on the composition functions.

Finally, the last two columns of Tables 1, 2, 3, 4, 5, and 6 list the comparison between BSO20 and the two variants of PSO. The results indicate that BSO20 is inferior to ScPSO on the 30D problems but better on the 100D problems. Compared with PPSO, BSO20 performs better on all tested dimensions, with “\(+/-\)” results of “19/10,” “18/11,” and “17/12.”

BSO20 performs better on the multimodal problems than on the unimodal problems, which is determined by the nature of BSO; nevertheless, it also performs well on some unimodal problems, such as \(F_1\). On the multimodal problems, the modified mutation strategy improves the exploration ability of BSO20 and increases its chance of finding better solutions. Moreover, BSO20 achieves good results on most of the hybrid functions, while its performance on the composition functions is only average.

Wilcoxon signed-rank test

In this section, the Wilcoxon signed-rank test is carried out to further analyze the experimental results. The results of the test between BSO20 and all compared algorithms on the 30D, 50D, and 100D functions are shown in Table 7; they were obtained with the open-source software KEEL 3.0 [31], and the details of the related statistical indicators can be found in [14]. In each pairwise comparison in Table 7, a bolded \(R^{+}\) value indicates that BSO20 outperforms the compared algorithm at a significance level of 0.05.

Table 7 Results obtained by the Wilcoxon signed-rank test for BSO20

As can be seen from Table 7, BSO20 significantly outperforms the classic BSO, BSO-OS, BSO-NBC, CBSO, and ALBSO at a significance level of 0.05 on all dimensions. Compared with RGBSO and BBGBSO, BSO20 has significant superiority on the 30D problems, but the differences are not significant on 50D and 100D; nevertheless, the \(R^{+}\) values show that BSO20 is still better than RGBSO and BBGBSO. Therefore, the statistical results demonstrate the competitive performance of BSO20. From the last two rows of Table 7, the signed ranks \(R^{+}\) and \(R^{-}\) show that BSO20 is better than ScPSO on the 50D and 100D problems and better than PPSO on the 30D and 50D problems, although the differences between BSO20 and ScPSO as well as PPSO are not significant at the 0.05 level.

Parameter analysis

BSO20 contains only two parameters: the intra-cluster mutation probability \(p_{\mathrm{one}\_\mathrm{cluster}}\) and the cluster size \(S_r\). In this section, the impact of these two parameters on the performance of BSO20 is analyzed. For convenience of presentation, the logarithm of the mean fitness, i.e., \(\log _{10}(\mathrm {fit})\), is used as the ordinate.

Parameter analysis of \(p_{\mathrm{one}\_\mathrm{cluster}}\)

Figures 2 and 3 show the mean results on functions \(F_1\), \(F_3\)–\(F_{30}\) for different intra-cluster mutation probabilities, where \(p_{\mathrm{one}\_\mathrm{cluster}}\) is set to \(0, 0.1, \ldots , 0.9, 1\) and \(S_r\) is set to 20.

Fig. 2 The effect of parameter \(p_{\mathrm{one}\_\mathrm{cluster}}\) on 30-D functions (\(F_1\), \(F_3\)–\(F_{30}\))

From Figs. 2 and 3, it can be seen that, for the same function in different dimensions (i.e., 30D and 50D), the trend of the mean with respect to \(p_{\mathrm{one}\_\mathrm{cluster}}\) is similar. The results also indicate that, for most functions, a greater probability of inter-cluster mutation achieves better results, especially on functions \(F_{19}\), \(F_{22}\), and \(F_{30}\). However, some functions are not sensitive to \(p_{\mathrm{one}\_\mathrm{cluster}}\), such as \(F_{11}\)–\(F_{13}\) and \(F_{26}\)–\(F_{27}\). Moreover, a greater probability of intra-cluster mutation is more effective on some functions, such as \(F_{3}\), \(F_{14}\), and \(F_{18}\), where \(F_{3}\) is a unimodal function, and \(F_{14}\) and \(F_{18}\) are hybrid functions with unimodal components.

Fig. 3 The effect of parameter \(p_{\mathrm{one}\_\mathrm{cluster}}\) on 50-D functions (\(F_1\), \(F_3\)–\(F_{30}\))

From these results, the following conclusion can be drawn: for most multimodal functions, the inter-cluster mutation has a greater effect on the performance of BSO20, while for some unimodal functions its effect may not be significant. One reason is that, for global optimization problems, the inter-cluster mutation helps share information among multiple clusters, so the clusters gather slowly, which benefits the further exploitation of promising areas.

Parameter analysis of \(S_r\)

Fig. 4 The effect of parameter \(S_r\) on 30-D functions (\(F_1\), \(F_3\)–\(F_{30}\))

The parameter \(S_r\) determines the total number of clusters k in BSO20, and k determines whether BSO20 prefers exploitation or exploration: if k is small, the clusters gather faster and BSO20 tends to exploit; conversely, with many clusters, BSO20 tends to explore. To ensure that k is an integer, \(S_r\) is set to \(\{5, 10, 20, 24, 30, 40\}\) for the 30D problems and to \(\{5, 10, 20, 25, 40, 50\}\) for the 50D problems, and \(p_{\mathrm{one}\_\mathrm{cluster}}\) is set to 0.1.

Figures 4 and 5 show the mean results on functions \(F_1\), \(F_3\)–\(F_{30}\) for different cluster sizes \(S_r\). The trends in Figs. 4 and 5 suggest that \(S_r\) has little impact on the performance of the algorithm on most functions, except the 50D function \(F_{22}\); in other words, BSO20 is not sensitive to \(S_r\) on most functions. The results also reveal that the effect of \(S_r\) is related to the landscape of the function: for example, BSO20 prefers a larger \(S_r\) on functions \(F_{20}\) and \(F_{22}\), and a smaller \(S_r\) on functions \(F_3\) and \(F_{30}\).

Fig. 5 The effect of parameter \(S_r\) on 50-D functions (\(F_1\), \(F_3\)–\(F_{30}\))

Discussion

BSO20 includes two improvements, the clustering strategy and the mutation strategy. In this section, their impact on BSO20 is discussed in terms of CPU time consumption and convergence.

Fig. 6 The mean CPU time on 29 functions

Fig. 7 The changing trends of the population radius of BSO20 on the \(F_1\), \(F_5\), \(F_{15}\), and \(F_{30}\) functions

CPU time consumption

The clustering strategy is one of the crucial parts affecting the efficiency and performance of BSO. In this paper, a hybrid clustering strategy combining NBC and RGS was proposed, which improves both the efficiency and the performance relative to the classic BSO. In this section, CPU time consumption is used as an indicator to compare the efficiency of BSO20 and the classic BSO.

The experiments were run on a platform with an Intel(R) Xeon(R) CPU E5-2640 v4 2.4 GHz processor, 32 GB RAM, MATLAB R2017b, and Windows Server 2012 R2. A parallel pool processes the 29 functions in parallel, and the algorithm runs independently 51 times on each function.

Figure 6 shows the results on the 30D and 50D functions. The CPU time consumption of BSO20 on each function is less than that of the classic BSO, so the clustering strategy adopted by BSO20 is more efficient than the k-means used by the classic BSO. Therefore, the experimental results on both algorithm performance and CPU time support that the hybrid clustering strategy is superior to k-means.

Convergence analysis

BSO20 is a variant of BSO and inherits BSO's ability to explore multiple promising regions in parallel. The modified mutation strategy increases the step size of each mutation and shares information among multiple clusters to improve exploration efficiency.

In this section, the functions \(F_1\), \(F_5\), \(F_{15}\), and \(F_{30}\) are selected as representatives of the unimodal, simple multimodal, hybrid, and composition functions, respectively. The population radius is used to analyze the convergence of the algorithm on these functions [18]; in this paper, it is calculated as Formula (11).

$$\begin{aligned} r = \max _{1\le i\le n} \{\Vert x_i-x_\mathrm{mean}\Vert \}, \end{aligned}$$
(11)

where \(x_i\) is the ith individual in the population, \(x_\mathrm{mean}\) is the center of the population, defined as the mean of all individuals, \(\Vert \cdot \Vert \) denotes the Euclidean distance, and n is the population size. One difference from [18] is that we adopt the maximum distance between the individuals and the population center as the radius, instead of the average distance.
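Formula (11) translates directly into a few lines of Python; a minimal sketch assuming the population is stored as an (n × D) matrix:

```python
import numpy as np

def population_radius(P):
    """Population radius of Formula (11): the largest Euclidean distance
    from any individual to the population mean."""
    x_mean = P.mean(axis=0)                        # center of the population
    return np.linalg.norm(P - x_mean, axis=1).max()

# Example: radius of a random 120 x 30 population in [-100, 100]^30.
P = np.random.uniform(-100, 100, (120, 30))
print(population_radius(P))
```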

Figure 7 shows the changing trends of the population radius on the 30D and 50D functions during evolution, where the ordinate is the population radius and the abscissa is the number of iterations. The population radius decreases significantly as the iterations proceed, and the trends in Fig. 7 can be divided into three stages. In the early stage, multiple clusters are distributed in different regions and the population radius is large. Subsequently, at about 1200–1500 iterations, the clusters gather, and BSO20 focuses on exploiting the local regions. Finally, the population converges. However, convergence cannot always be guaranteed; for example, the population radius of the 50D function \(F_{30}\) eventually levels off around \(10^{-1}\), because multiple clusters have settled on different local peaks.

Conclusions

In this paper, we have proposed BSO20 as an improvement of the classic BSO for real-parameter numerical optimization problems. The major changes in BSO20 concern the clustering strategy and the mutation strategy. First, a hybrid clustering strategy combining NBC and RGS was proposed, in which the sizes of the subpopulations clustered by the two strategies are adjusted dynamically as the population evolves; the new strategy is not only superior in time efficiency but also achieves good experimental results. Second, a modified mutation strategy was proposed to enhance the exploration ability. Overall, BSO20 improves both the time efficiency and the search efficiency of BSO, and it reduces the number of parameters of the classic BSO, easing parameter tuning. BSO20 has been tested on CEC’17 RPNO and compared with several other BSO variants as well as two PSO variants, and the experimental results have demonstrated the competitiveness of the proposed algorithm.

In future work, we will further improve BSO20, especially with an adaptive strategy that adjusts the parameters dynamically to balance the convergence and diversity of the population. Additionally, we will draw on the successful experience of differential evolution (DE) and particle swarm optimization (PSO) and strive to make the performance of BSO20 even more competitive.