Introduction

Since the introduction of the genetic algorithm [1], computational intelligence has attracted widespread attention and become a powerful tool for handling complex theoretical and real-life optimization problems. Nowadays, many population-based stochastic optimization algorithms are proposed every year. Compared with traditional optimization methods, population-based algorithms are easy to implement and can provide near-optimal solutions on most problems. Furthermore, they are applicable to non-continuous and non-differentiable problems, and remain usable even when no mathematical model of the problem is available. Population-based algorithms thus provide an effective approach to NP-hard problems.

As one of the widely used population-based algorithms, particle swarm optimization (PSO) [2] is simple in concept and converges fast. Owing to its effectiveness on complex optimization problems, many theoretical studies and real-life applications of PSO have been reported since its inception. However, the original PSO suffers from diversity loss and premature convergence; on complex multimodal problems, it is apt to fall into local optima. To mitigate premature convergence, a series of improved PSO algorithms have been proposed over the past two decades. These improved versions of PSO can be roughly divided into four classes, namely, parameter adjustment, neighborhood topology, hybridization with other evolutionary algorithms (EAs), and learning strategies. Representative literature for each class is as follows:

  1.

    Parameter adjustment: Adjusting the control parameters (inertia weight and acceleration coefficients) can control an algorithm’s exploration and exploitation capability. Linear [3], non-linear [4], fuzzy-system [5, 6], chaotic [7] and adaptive [8] adjustment algorithms have been proposed to balance exploration and exploitation. Parameter adjustment can improve performance to some extent, but its impact on complex multimodal problems is limited. Nowadays, dynamic parameter adjustment methods are employed to enhance the performance of novel learning strategies [9].

  2.

    Topology: The neighborhood topology determines the information exchange among the particles. Common topologies include ring, gbest, wheel, random, star and von Neumann [10]. Besides static topologies, a series of dynamic topologies have been proposed. For example, Liang et al. [11] proposed a dynamic multi-swarm topology, where the whole swarm is divided into many sub-swarms that are regrouped periodically. Li et al. [12] introduced a dynamic pyramid topology based on the particles’ fitness ranking. Lim et al. [13] developed an increasing topology connectivity to achieve better control of exploration/exploitation. Lin et al. [14] proposed a tournament topology particle swarm optimization. The literature indicates that high topology connectivity converges quickly, while low topology connectivity performs well on multimodal problems. Nowadays, dynamic topology with fitness-based selection is the primary development trend.

  3.

    Hybridization: PSO is widely hybridized with other evolutionary algorithms to obtain better performance. There are three major hybrid methods: (1) employing sub-swarms running different algorithms to perform different functions [15]; (2) adopting another EA’s operators to overcome premature convergence [16]; and (3) utilizing another EA’s operators to construct learning exemplars [17]. In recent years, many hybrid PSO algorithms have been proposed. For example, Abdulhameed et al. [18] proposed a hybrid algorithm based on PSO and the crow search algorithm (CSA) for feature selection, where the CSA is adopted to enhance global search. Yang [19] presented a hybrid algorithm based on PSO and cuckoo search (CS) for tuning PID controller parameters, in which CS with random walk is employed to enhance the swarm diversity of PSO. Abdülkadir et al. [20] proposed a hybrid firefly and PSO algorithm with chaotic local search. Sama et al. [21] used PSO to explore the global search area and utilized fast simulated annealing to refine the visited search area. Zhen et al. [22] developed a hybrid wolf pack algorithm and PSO for parameter estimation. To develop a high-performance hybrid PSO, the component algorithms should be excellent and complementary.

  4.

    Learning strategy: Novel learning strategies can improve the performance of PSO by constructing promising learning exemplars [23] or introducing effective competition and cooperation mechanisms [24]. Some representative exemplar-based PSO algorithms are as follows: Gong et al. [17] proposed genetic learning particle swarm optimization, employing the crossover, mutation and selection operators of genetic algorithms to breed exemplars. Cheng et al. [25] introduced a “learning from any better particles” mechanism and proposed social learning particle swarm optimization. Xu et al. [26] proposed a dimensional learning strategy to discover and employ the promising particles’ information. Wang et al. [27] proposed an adaptive learning strategy to adjust the self-learning component and the competitive-learning component. To deal with large-scale optimization, Kaucic et al. [28] proposed a level-based learning swarm optimizer with a hybrid constraint-handling technique for large-scale portfolio selection problems. To obtain a good balance between exploration and exploitation on large-scale problems, Li et al. [29] proposed a learning structure decoupling exploration and exploitation for large-scale optimization. Sheng et al. [30] employed a dynamic p-learning mechanism and a multi-level population to improve the performance of PSO on large-scale optimization.

To help the particle swarm converge to optima rapidly, local search strategies are employed. For example, Liang et al. [31] employed a quasi-Newton method to regularly refine a portion of the top local best positions. Wu et al. [32] conducted comparison experiments and pointed out that the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method performs better than Nelder–Mead simplex search, Davidon–Fletcher–Powell (DFP) and pattern search (PS) on complex optimization problems. Chen et al. [33] used BFGS to regularly enhance a portion of the local best positions in a dynamic multi-swarm differential learning particle swarm optimizer. Hu et al. [34] merged sub-gradient local search into the PSO iteration. Cao et al. [35] adopted a quasi-entropy index to trigger local search. The aforementioned literature indicates that local search is an effective auxiliary approach to enhance the exploitation of PSO.

Novel learning strategies are an active research direction for PSO. To enhance the adaptability of exemplar-based algorithms, scholars have employed two or more types of exemplars with different characteristics to guide the motion of the particle swarm. For example, Fu et al. [36] developed an adjustable driving-force-based particle swarm optimization employing two types of exemplars to update particle velocities. Chen et al. [37] employed two DE mutations to generate learning exemplars for PSO, so the hybrid algorithm enhances exploration at the start and gradually shifts toward exploitation. Lynn et al. [38] adopted an ensemble mechanism to adjust the sizes of sub-swarms with different learning exemplars. Wang et al. [39] proposed a multiple-strategy learning particle swarm optimization for large-scale optimization problems, utilizing different learning strategies in different stages. Ning et al. [40] employed three particle swarms and three velocity update methods for high-dimensional problems. Lynn et al. [41] proposed heterogeneous comprehensive learning particle swarm optimization (HCLPSO), employing two sub-swarms to enhance exploration and exploitation, respectively. The test results show that HCLPSO yields high performance on different problems and outperforms seven state-of-the-art PSO variants on the CEC2005 test suite.

Due to the robust performance of HCLPSO, this study further extends the heterogeneous approach by employing two DE mutants to construct diversified exploration and exploitation learning exemplars, respectively. The major contributions of this study are:

  1.

    Adopting two DE mutants to construct exploration and exploitation learning exemplars, respectively.

  2.

    Introducing an accompanying operation to strengthen the impact of excellent particles.

  3.

    Employing BFGS local search to further improve search accuracy.

  4.

    The proposed heterogeneous differential evolution particle swarm optimization with local search (HeDE-PSO) is tested on the CEC2017 test suite and an industrial refrigeration system design problem.

The rest of this paper is organized as follows: the next section reviews related works, the subsequent section presents the methodology, the following section reports the experimental results, and the last section concludes the paper.

Related works

Particle swarm optimization

PSO imitates the foraging behavior of bird flocks to find the global optimum. In PSO, each solution is regarded as a particle. The particles are guided by their own best position (Pbest) and the global best position (Gbest) according to Eqs. (1) and (2) [42]. Veli,d denotes the particle’s velocity, where the subscripts i and d stand for the particle index and the dimension, respectively. Posi,d and Pi,d stand for the position and Pbest of the ith particle, respectively, and Gd denotes the global best position found by the swarm. w, c1 and c2 are the inertia weight and two acceleration coefficients, respectively. r1,d and r2,d are two uniformly distributed random numbers in [0, 1]. In the learning process, the particles oscillate in the neighborhood of Pbest and Gbest. Greedy selection is employed to update Pbest and Gbest.

$${\mathrm{Vel}}_{i,d}=w\cdot {\mathrm{Vel}}_{i,d}+{c}_{1}\cdot {r}_{1,d}\cdot \left({P}_{i,d}-{\mathrm{Pos}}_{i,d}\right)+{c}_{2}\cdot {r}_{2,d}\cdot \left({G}_{d}-{\mathrm{Pos}}_{i,d}\right),$$
(1)
$${\mathrm{Pos}}_{i,d}={\mathrm{Pos}}_{i,d}+{\mathrm{Vel}}_{i,d}.$$
(2)
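For concreteness, one iteration of Eqs. (1) and (2) can be sketched with NumPy as follows; the parameter values w, c1 and c2 are illustrative defaults, not the settings used in this paper.

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One velocity/position update per Eqs. (1) and (2).

    pos, vel, pbest: arrays of shape (n_particles, dim);
    gbest: array of shape (dim,). Parameter values are illustrative.
    """
    if rng is None:
        rng = np.random.default_rng()
    n, d = pos.shape
    r1 = rng.random((n, d))  # r1,d ~ U(0, 1), drawn fresh per dimension
    r2 = rng.random((n, d))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    return pos, vel
```

After each step, greedy selection would compare the new positions with Pbest and Gbest and keep the better ones.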

PSO converges fast while suffering from premature convergence. To acquire better exploration, some exemplar-based learning strategies are proposed. For example, Wu et al. [32] proposed a superior solution guided PSO (SSG-PSO) framework to fully utilize the valuable information of superior solutions found in the optimization process. References [9, 43] employed crossover, mutation and selection operators to construct high quality learning exemplars for PSO. Chen et al. [44] presented a biogeography-based learning strategy for PSO. Lu et al. [45] employed valid history information to guide the behavior of particles through a reinforcement learning strategy. Zhang et al. [46] merged Bayesian iteration method into comprehensive learning particle swarm optimization. Zhang et al. [47] employed self-organizing topological structure and self-adaptive adjustable parameters to improve the performance of PSO. Cheng et al. [48] have reviewed the developments of particle swarm optimization in the past quarter century.

Differential evolution

Differential evolution (DE) [49] is a versatile evolutionary optimization technique. Various DE mutation strategies have been proposed to suit different optimization problems. Representative DE mutations are as follows:

$${\mathrm{DE}/\mathrm{rand}/1: V}_{i}={X}_{r1}+F\left({X}_{r2}-{X}_{r3}\right),$$
(3)
$${\mathrm{DE}/\mathrm{best}/1: V}_{i}={X}_{\mathrm{Best}}+F\left({X}_{r2}-{X}_{r3}\right),$$
(4)
$${\mathrm{DE}/\mathrm{rand}/2: V}_{i}={X}_{r5}+F\left({X}_{r1}-{X}_{r2}\right)+F\left({X}_{r3}-{X}_{r4}\right),$$
(5)
$${\mathrm{DE}/\mathrm{best}/2: V}_{i}={X}_{\mathrm{Best}}+F\left({X}_{r1}-{X}_{r2}\right)+F\left({X}_{r3}-{X}_{r4}\right),$$
(6)

Vi = [vi,1, vi,2,…,vi,D] and Xi = [xi,1, xi,2,…,xi,D] are the mutant vector and the target vector, respectively. F is the scale factor. r1, r2, r3, r4 and r5 are mutually exclusive random integers in the range [1, NP], and NP denotes the population size of DE.

DE generates new candidate solutions in three steps:

  1.

    Mutation: Generate a mutant vector according to one of the DE mutation strategies in Eqs. (3)–(6).

  2.

    Crossover: For each dimension, generate a random number and compare it with the crossover probability according to Eq. (7) to determine whether the corresponding dimension of the trial vector Ui = [ui,1, ui,2,…,ui,D] is taken from the mutant vector. cri is the crossover probability, and k is a randomly chosen dimension that guarantees at least one dimension is inherited from the mutant vector Vi.

    $${u}_{i,j}=\left\{\begin{array}{ll}{v}_{i,j} &\quad\mathrm{rand}<{\mathrm{cr}}_{i}\ \mathrm{or}\ j=k\\ {x}_{i,j} &\quad \mathrm{else}\end{array}.\right.$$
    (7)
  3.

    Selection: The fitness of the trial vector Ui is evaluated and compared with that of the target vector Xi; if the trial vector is better, it replaces the target vector according to Eq. (8).

    $${X}_{i}=\left\{\begin{array}{ll}{U}_{i}&\quad f({U}_{i})\le f({X}_{i}) \\ {X}_{i} &\quad \mathrm{else}\end{array}.\right.$$
    (8)

DE/rand/1 and DE/rand/2 bear strong exploration properties, while DE/best/1 and DE/best/2 possess high exploitation properties. To balance exploration and exploitation, Zhang adopted DE/current-to-pbest/1 to generate the mutant vector according to Eq. (9) and proposed adaptive differential evolution with optional external archive (JADE) [50]. In Eq. (9), Xrp denotes a target vector randomly selected from the top p% individuals. DE/current-to-pbest/1 is widely used nowadays in DE variants for its excellent performance [51]. For more research on DE variants, please refer to reference [52].

$$\text{DE/current-to-pbest/1}: {V}_{i}={X}_{i}+F\left({X}_{rp}-{X}_{i}\right)+F\left({X}_{r1}-{X}_{r2}\right).$$
(9)
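The three steps can be sketched together for the DE/current-to-pbest/1 strategy of Eq. (9) combined with the binomial crossover of Eq. (7). This is an illustrative sketch assuming minimization; the optional external archive of JADE is omitted, and the default values for F, cr and p are assumptions, not the paper's settings.

```python
import numpy as np

def de_generate(X, fitness, F=0.5, cr=0.9, p=0.2, rng=None):
    """Generate one trial vector per target using DE/current-to-pbest/1
    (Eq. 9) followed by binomial crossover (Eq. 7)."""
    if rng is None:
        rng = np.random.default_rng()
    NP, D = X.shape
    top = np.argsort(fitness)[:max(1, int(p * NP))]  # indices of top p% (minimization)
    U = X.copy()
    for i in range(NP):
        # pick mutually exclusive indices r1, r2 distinct from i
        r1, r2 = rng.choice([j for j in range(NP) if j != i], 2, replace=False)
        xp = X[rng.choice(top)]                      # random top-p% vector
        V = X[i] + F * (xp - X[i]) + F * (X[r1] - X[r2])   # mutation, Eq. (9)
        k = rng.integers(D)                          # forced dimension, Eq. (7)
        mask = rng.random(D) < cr
        mask[k] = True
        U[i, mask] = V[mask]                         # binomial crossover
    return U
```

Greedy selection (Eq. 8) would then replace Xi by Ui whenever f(Ui) ≤ f(Xi).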

Hybrid DE-PSO

PSO and DE are widely used stochastic algorithms. To utilize the advantages of both, a series of hybrid DE and PSO algorithms (referred to as DE-PSO in the following) have been proposed. For example, Melton et al. [53] proposed a hybrid DE-PSO for time-optimal slew-maneuver problems with path constraints: once stagnation is detected, a set of DE iterations is employed, and experiments show that this hybrid DE-PSO handles stagnation successfully and reduces CPU time by 40%. Song et al. [54] presented Gaussian particle swarm optimization with differential evolution to optimize three-dimensional wind turbines. Wang et al. [55] proposed a self-adaptive mutation differential evolution algorithm based on particle swarm optimization, in which DE/rand/1 is adopted to maintain swarm diversity in the early stage and PSO is employed to achieve high convergence speed. Mohammadi et al. [56] developed a hybrid algorithm for suspended sediment load estimation by merging a multi-layer perceptron and DE into PSO. To tune a PID controller, Amal et al. [57] presented a hybrid algorithm based on differential evolution and particle swarm optimization with an aging leader and challengers. Li et al. [58] combined differential evolution with particle swarm optimization to calibrate camera parameters. Wang et al. [59] presented a self-adaptive particle swarm optimization differential evolution algorithm for parameter identification of lithium-ion batteries.

Methodologies

Heterogeneous exemplars

Heterogeneity is an effective approach to improve an algorithm’s adaptability. Both DE/rand/1 and DE/current-to-pbest/1 are widely used DE mutants. Compared with DE/rand/1, DE/current-to-pbest/1 has stronger exploitation capability and converges relatively faster; compared with DE/best/1, it is less liable to fall into local optima. In this study, DE/rand/1 is adopted to construct exploration exemplars, while DE/current-to-pbest/1 is utilized to construct exploitation exemplars. The two DE mutants work in parallel, so valuable information can be exchanged sufficiently. The DE mutants employ the Pbests of the current swarm and the archive to generate exemplars that challenge the relevant particle’s Pbest. DE/rand/1 and DE/current-to-pbest/1 generate exemplars according to Eqs. (10) and (11), respectively.

$$ V_{i} = {\text{Pbest}}_{r1} + F\left( {{\text{Pbest}}_{r2} - E{\text{Pbest}}_{r3} } \right) \quad\left( {i < Ps_{1} } \right), $$
(10)
$$ V_{i} = {\text{Pbest}}_{i} + F\left( {{\text{Pbest}}_{{i{\text{best}}}} - {\text{Pbest}}_{i} } \right) + F\left( {{\text{Pbest}}_{r4} - E{\text{Pbest}}_{r5} } \right). $$
(11)

EPbestr3 and EPbestr5 denote Pbests randomly selected from the union of the current swarm and the archive. Ps1 stands for the sub-swarm size of DE/rand/1, and the remaining particles generate learning exemplars according to DE/current-to-pbest/1.

After mutation, the crossover operation and out-of-bounds handling are carried out according to Eq. (7) and Eqs. (12) and (13), respectively. Then the fitness of the exemplar is evaluated and compared: if it is better than the corresponding particle’s Pbest, the exemplar replaces the previous Pbest, and the previous Pbest challenges the worst solution in the archive. Otherwise, the generated exemplar is abandoned.

$${u}_{i,j}=\left({\mathrm{Pbest}}_{i,j}+{\mathrm{Up}}_{j}\right)/2 \quad\mathrm{if}\quad {u}_{i,j}>{\mathrm{Up}}_{j},$$
(12)
$${u}_{i,j}=\left({\mathrm{Pbest}}_{i,j}+{\mathrm{Low}}_{j}\right)/2 \quad\mathrm{if}\quad {u}_{i,j}<{\mathrm{Low}}_{j}.$$
(13)
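The midpoint repair of Eqs. (12) and (13) can be written compactly with NumPy; this is a sketch of the rule as stated, where low and up are the per-dimension bounds.

```python
import numpy as np

def repair_bounds(u, pbest, low, up):
    """Midpoint repair of Eqs. (12) and (13): an out-of-bounds dimension
    is reset to the midpoint between the particle's Pbest and the
    violated bound."""
    over = u > up
    under = u < low
    u = np.where(over, (pbest + up) / 2.0, u)    # Eq. (12)
    u = np.where(under, (pbest + low) / 2.0, u)  # Eq. (13)
    return u
```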

Accompanying learning

To avoid wasting computing resources, an accompanying learning strategy is introduced. In the PSO learning stage, each particle randomly selects an accompanying particle. If the learning particle’s fitness is better than the accompanying particle’s, the PSO learning process is executed according to Eq. (14); otherwise, PSO learning is skipped in this iteration. With accompanying learning, the better particles win more computing resources, improving the algorithm’s performance.

$${\mathrm{Vel}}_{i,d}=w\cdot {\mathrm{Vel}}_{i,d}+c\cdot {r}_{d}\left({\mathrm{Pbest}}_{i,d}-{\mathrm{Pos}}_{i,d}\right).$$
(14)
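A minimal sketch of this gate (assuming minimization, so lower fitness values are better):

```python
import numpy as np

def accompany_gate(i, fitness, rng=None):
    """Accompanying learning gate: particle i performs the PSO update of
    Eq. (14) only if its fitness is no worse than that of a randomly
    chosen accompanying particle (minimization assumed)."""
    if rng is None:
        rng = np.random.default_rng()
    j = rng.choice([k for k in range(len(fitness)) if k != i])
    return fitness[i] <= fitness[j]
```

Because better particles pass the gate more often, they receive more velocity updates and hence more of the evaluation budget.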

BFGS

In the late stage of PSO, the swarm diversity is insufficient to maintain high convergence speed. Hence employing local search in the late stage can improve search accuracy. BFGS is a popular quasi-Newton method whose core is to approximate the inverse Hessian matrix. The major steps of BFGS [32] are given in Algorithm 1.

Algorithm 1
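To make the method concrete, here is a minimal, self-contained BFGS sketch in NumPy with a backtracking (Armijo) line search; it is an illustrative implementation of the standard method, not the authors' code, demonstrated on a simple sphere function.

```python
import numpy as np

def bfgs(f, grad, x0, iters=100, tol=1e-8):
    """Minimal BFGS sketch: maintain an approximation H of the inverse
    Hessian, step along -H g with a backtracking line search, then apply
    the standard rank-two BFGS update."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                        # initial inverse-Hessian estimate
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g                       # quasi-Newton search direction
        t = 1.0
        while t > 1e-12 and f(x + t * d) > f(x) + 1e-4 * t * (g @ d):
            t *= 0.5                     # backtrack until Armijo condition holds
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:                   # curvature condition for the update
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Example: refine a point on the sphere function f(x) = sum(x_i^2)
x_star = bfgs(lambda v: float(np.sum(v ** 2)), lambda v: 2.0 * v,
              np.array([1.5, -2.0, 0.5]))
```

In HeDE-PSO, the starting point would be a promising solution found by the swarm, and the iteration count would be capped by the FEs reserved for the late stage.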

The complete HeDE-PSO algorithm

The major steps of HeDE-PSO are illustrated in Fig. 1. In each iteration, DE/rand/1 and DE/current-to-pbest/1 work in parallel to generate complementary exemplars for PSO. Then the fitness value of a particle is compared with that of the accompanying particle to determine whether to execute the PSO learning stage. In the late stage of optimization, the promising area is located and BFGS local search is employed to improve search accuracy. Details for HeDE-PSO are as follows:

Fig. 1 The flowchart of HeDE-PSO

Step 1: Initialize the population and control parameters, and evaluate the fitness of the initial population.

Step 2: Generate the trial vector through two DE mutants with different characteristics. DE/rand/1 is employed for exploration, and it updates individuals according to Eq. (10). DE/current-to-pbest/1 is employed for enhancing exploitation, and it generates new individuals according to Eq. (11).

Step 3: Conduct crossover according to Eq. (7). Handle out of bounds trial vectors according to Eqs. (12, 13).

Step 4: Evaluate and compare the fitness value of a trial vector with the relevant particle’s Pbest. If the fitness of the trial vector is better than its Pbest, replace the Pbest with its trial vector. Update FEs.

Step 5: Randomly select an accompanying particle. If the learning particle’s fitness value is no worse than the accompanying particle’s, execute PSO according to Eq. (14). Otherwise, skip the PSO learning part (update the particle index and FEs, and jump to Step 2).

Step 6: Compare the fitness of new position with its Pbest. If the former is better, replace the Pbest with its new position.

Step 7: Update the particle index and FEs (function evaluations); if the index is bigger than the population size, reset the index to 1.

Step 8: Compare FEs with the PSO learning budget (MFEs − k·D, where MFEs is the maximum allowable number of FEs and k is an integer). If FEs does not exceed the PSO learning budget, return to Step 2; otherwise, execute the following step.

Step 9: Conduct BFGS local search with the remaining FEs to improve accuracy.
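The nine steps above can be condensed into the following pseudocode; all helper names are placeholders for the operations described in Steps 1–9, not the authors' code.

```text
initialize swarm of NP particles, evaluate fitness         # Step 1
FEs ← NP
while FEs ≤ MFEs − k·D:                                    # Step 8
    for each particle i:
        build exemplar U_i by DE/rand/1 or                 # Step 2, Eqs. (10), (11)
          DE/current-to-pbest/1, then crossover and        # Step 3, Eqs. (7), (12), (13)
          bound repair
        if f(U_i) better than Pbest_i: replace Pbest_i     # Step 4
        pick a random accompanying particle j              # Step 5
        if f(particle i) no worse than f(particle j):
            update velocity and position by Eq. (14)
            update Pbest_i if the new position is better   # Step 6
        update the particle index and FEs                  # Step 7
run BFGS from the best solution for the remaining FEs      # Step 9
```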

The characteristics of HeDE-PSO are analyzed in “Ablation experiments”.

Experimental works

To test the performance of HeDE-PSO, 29 CEC2017 functions [66] are adopted (f2 is excluded from the test). Six recent PSO variants, four meta-heuristics and JADE are employed for comparison. Among the comparison algorithms, biogeography-based learning particle swarm optimization (BLPSO) [31] updates each particle using the combination of its own personal best position and the personal best positions of all other particles through biogeography-based optimization (BBO) migration. Particle swarm optimization using dynamic tournament topology (DTT-PSO) [14] employs several better solutions chosen from the entire swarm to guide the evolution of each particle. Expanded particle swarm optimization based on multiple exemplars and forgetting ability (XPSO) [9] utilizes the locally best and globally best solutions to construct social learning exemplars and assigns different forgetting abilities to different particles. Fitness-based multi-role PSO (FMPSO) [60] divides the swarm into leaders, ramblers and followers based on fitness in each generation, and employs different learning strategies for the different roles. Phasor particle swarm optimization (PPSO) [61] adjusts the control parameters of PSO based on phasor angle theory. The dynamic multiswarm differential learning particle swarm optimizer (DMSDL-PSO) [33] employs DE mutation to construct exploration-enhanced exemplars and adopts BFGS local search in the late stage to improve accuracy. The adaptive particle swarm optimizer with decoupled exploration and exploitation (APSO-DEE) [29] employs two learning components to enhance exploration and exploitation, respectively, and adopts a multi-swarm strategy with adaptive sub-swarm size regulation to achieve better performance. The honey badger algorithm (HBA) [62] models honey badgers’ digging and honey-finding behaviors to construct the exploration and exploitation phases, respectively.
The salp swarm algorithm (SSA) [63] divides the swarm into leaders and followers according to the swarming behavior of salp chains: the leader guides the swarm, and the followers follow each other during the optimization process. The Archimedes optimization algorithm (AOA) [64] imitates the buoyant force exerted upward on an object partially or fully immersed in fluid; AOA updates the density, volume and acceleration of every object to determine new positions in each generation. The dwarf mongoose optimization algorithm (DMO) [65] divides the swarm into the alpha group, babysitters and the scout group; each group contributes to compensatory behavioral adaptation, which leads to a semi-nomadic way of life in a territory large enough to support the entire group. Adaptive differential evolution with optional external archive (JADE) [42] improves the performance of DE by employing the mutation strategy DE/current-to-pbest/1 with an optional external archive and adaptively updated control parameters. All the comparison algorithms adopt the authors’ recommended parameter configurations.

10-dimensional and 30-dimensional function experiments are conducted with the same parameter settings to evaluate the algorithms’ scalability. Each function is run for 30 independent runs, and the Wilcoxon signed-rank test [67] with a significance level of 0.05 is adopted to compare the test results. The parameter configurations of the involved algorithms are given in Table 1.
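As an illustration of this protocol, a per-function comparison can be reproduced with SciPy's paired Wilcoxon signed-rank test; the error arrays below are synthetic stand-ins for two algorithms' 30 run errors, not data from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Synthetic per-run errors of two hypothetical algorithms on one function
# (30 independent runs each); illustrative numbers only.
err_a = rng.normal(1.0, 0.2, 30)
err_b = rng.normal(1.3, 0.2, 30)
stat, p = wilcoxon(err_a, err_b)       # paired signed-rank test
significant = p < 0.05                 # significance level used in the paper
```

The resulting p-value decides whether a cell in the comparison tables is marked “>”, “≈” or “<”.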

Table 1 Parameter configuration of algorithms

Comparing with PSO algorithms

Table 2 shows the statistical results of HeDE-PSO and the other PSO algorithms. The symbols “>”, “≈” and “<” denote that the performance of HeDE-PSO is significantly better than, tied with, or significantly worse than the compared algorithm according to the Wilcoxon signed-rank test, respectively. For example, in the first cell “> (7.62 ± 7.70)E + 02”, “>” means the performance of HeDE-PSO is significantly better than BLPSO on f1, while “7.62” and “7.70” are the mean error and standard deviation achieved by BLPSO on f1, respectively, and “E + 02” is the order of magnitude in scientific notation. In the last two rows, “Best” denotes the number of times the best mean performance is achieved by the relevant algorithm, and “w/t/l” stands for the number of functions on which HeDE-PSO performs significantly better than, tied with, or significantly worse than the compared algorithm, respectively. In the 10D test, HeDE-PSO achieves the best performance on 16 functions, while DMSDL-PSO, BLPSO, DTT-PSO and FMPSO yield the best performance on 9, 5, 4 and 1 functions, respectively; XPSO, PPSO and APSO-DEE produce no best performances. HeDE-PSO performs significantly better than BLPSO, DTT-PSO, XPSO, FMPSO, PPSO, DMSDL-PSO and APSO-DEE on 16, 21, 23, 20, 27, 14 and 26 functions, respectively, and worse than DMSDL-PSO on 7 functions. The overall performance of HeDE-PSO is better than DMSDL-PSO and the other peer PSO algorithms. The average ranking of mean performance is given in Fig. 2; the lower the average rank, the better. Figure 2 indicates that the average rank of HeDE-PSO is slightly lower than that of DMSDL-PSO and much better than those of the other PSO algorithms. HeDE-PSO exhibits significant advantages except on the ten composition functions (f21–f30); with DE-constructed learning exemplars and BFGS local search, DMSDL-PSO yields high performance on the composition functions.

Table 2 Comparison with PSO algorithms on 10D CEC2017 functions
Fig. 2 Average ranking among PSO algorithms

On 30D functions, Table 3 indicates that HeDE-PSO achieves the best performance on 17 functions, while DTT-PSO, BLPSO and DMSDL-PSO achieve the best performance on 6, 4 and 4 functions, respectively; XPSO, FMPSO, PPSO and APSO-DEE yield no best performances. HeDE-PSO generates the best performance on all ten hybrid functions (f11–f20), while among the seven simple multimodal functions (f4–f10), it achieves the best performance on only one. HeDE-PSO outperforms BLPSO, DTT-PSO, XPSO, FMPSO, PPSO, DMSDL-PSO and APSO-DEE on 20, 22, 27, 28, 29, 21 and 29 functions, respectively. Figure 2 shows that the average rank of HeDE-PSO is lower than those of the other PSO algorithms, and the advantages of HeDE-PSO on 30D functions are more significant than on 10D functions. DMSDL-PSO and BLPSO perform better than the remaining PSO algorithms. With two DE mutations to construct exploration and exploitation learning exemplars, HeDE-PSO relieves premature convergence successfully and exhibits better adaptability than the other PSO algorithms. HeDE-PSO outperforms APSO-DEE on both the 10D and 30D CEC2017 test suites, which indicates that employing two sub-swarms to enhance exploration and exploitation, respectively, performs better in this study than utilizing two learning components for the same purpose.

Table 3 Comparison with PSO algorithms on 30D CEC2017 functions

Comparison with other meta-heuristics

The comparison test results with other meta-heuristics on 30D functions in Table 4 show that HeDE-PSO has the best performance on 19 functions. JADE, DMOA and SSA yield the best performance on 7, 3 and 1 functions, respectively. HeDE-PSO outperforms HBA, SSA, AOA, DMOA and JADE on 29, 28, 29, 25 and 16 functions, respectively. The average rank in Fig. 3 shows that HeDE-PSO ranks first, while JADE and HBA occupy second and third place, respectively. HeDE-PSO thus outperforms four recent meta-heuristics and JADE in this test.

Table 4 Comparison with other meta-heuristics on 30D CEC2017 functions
Fig. 3 Average ranking among meta-heuristics

Convergence analysis

To analyze the convergence speed of HeDE-PSO, the convergence curves of HeDE-PSO and the other PSO algorithms on four different types of functions are given in Fig. 4. Figure 4a indicates that on the unimodal function f1, HeDE-PSO achieves the highest accuracy, DMSDL-PSO ranks second, and the remaining PSO algorithms yield almost the same mean errors. Figure 4b shows that on the simple multimodal function f6, DMSDL-PSO generates the lowest mean error, HeDE-PSO and BLPSO rank second and third, respectively, and the remaining PSO algorithms yield relatively larger mean errors. On the hybrid function f16, Fig. 4c shows that HeDE-PSO yields the lowest mean error, BLPSO ranks second, and the other algorithms yield relatively larger mean errors. Figure 4d reveals that DTT-PSO converges faster and yields the lowest mean error, DMSDL-PSO ranks second, and the mean error of HeDE-PSO is relatively larger. HeDE-PSO has significant advantages on unimodal and hybrid functions, while on simple multimodal and composition functions it yields moderate performance.

Fig. 4 The curves of mean Log(Error)

Ablation experiments

In this section, ablation experiments are carried out to show the effectiveness of each strategy employed by HeDE-PSO. Five contrast algorithms, denoted NoLS-PSO, Noapy-PSO, NoDE-PSO, DEr1-PSO and DEpbest-PSO, are modified from HeDE-PSO, each developed by removing one strategy described in “Methodologies”. Details of the five contrast algorithms are given in Table 5.

Table 5 Modification of contrast algorithm

The test results are given in Table 6. The Wilcoxon signed-rank test shows that HeDE-PSO outperforms each contrast algorithm on at least twenty functions. HeDE-PSO outperforms NoLS-PSO on 23 functions, which indicates that the BFGS local search can improve search accuracy in the late stage of HeDE-PSO. HeDE-PSO performs better than Noapy-PSO on 20 functions, showing that the accompanying operation can allocate more computing resources to high-quality particles to achieve high performance. HeDE-PSO defeats NoDE-PSO, DEr1-PSO and DEpbest-PSO on 25, 24 and 21 functions, respectively, indicating that employing both DE/rand/1 and DE/current-to-pbest/1 to generate learning exemplars (heterogeneous DE exemplars) achieves better performance than employing a single DE mutant or no DE-based exemplars at all. HeDE-PSO performs better than all the contrast algorithms and yields the best performance on 21 functions, indicating that the BFGS local search, the accompanying operation and the heterogeneous DE exemplars are all indispensable for HeDE-PSO.

Table 6 Test results of contrast algorithms

To analyze how the proposed strategies balance exploitation and diversity, the convergence curves and diversity curves on the unimodal f1 and the multimodal f17 are given in Fig. 5. Diversity is evaluated as the mean Euclidean distance between the particles and their centroid.
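This diversity measure can be computed directly; a sketch assuming particle positions are stored row-wise in a matrix:

```python
import numpy as np

def swarm_diversity(pos):
    """Diversity as the mean Euclidean distance between each particle
    and the swarm centroid. pos has shape (n_particles, dim)."""
    centroid = pos.mean(axis=0)
    return float(np.linalg.norm(pos - centroid, axis=1).mean())
```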

Fig. 5 Convergence and diversity curves of the proposed strategies

Figure 5a shows that with both DE/rand/1 and DE/current-to-pbest/1 constructing learning exemplars, HeDE-PSO outperforms NoDE-PSO, DEr1-PSO and DEpbest-PSO. DEpbest-PSO converges fast in the early stage and performs better than NoDE-PSO and DEr1-PSO, indicating that employing DE/current-to-pbest/1 to generate learning exemplars (DE/current-to-pbest/1 exemplars) can achieve high convergence speed on the unimodal f1. Figure 5b shows that HeDE-PSO keeps moderate diversity in the early stage and converges to the high-quality area in the later stage; its diversity is larger than that of DEpbest-PSO but smaller than that of DEr1-PSO. Without DE-constructed learning exemplars, NoDE-PSO cannot converge to a high-quality solution. Figure 5c indicates that on the multimodal function f17, HeDE-PSO ranks first, and DEr1-PSO, DEpbest-PSO and NoDE-PSO rank second, third and fourth, respectively. HeDE-PSO, employing heterogeneous DE exemplars, performs better than employing only the DE/rand/1 exemplar or only the DE/current-to-pbest/1 exemplar, and adopting the DE/rand/1 exemplar performs better than the DE/current-to-pbest/1 exemplar. Figure 5d shows that on the multimodal f17, only HeDE-PSO converges to a high-quality solution, while the other contrast algorithms do not converge sufficiently to achieve high accuracy. With heterogeneous DE exemplars, HeDE-PSO keeps proper swarm diversity to avoid local optima. HeDE-PSO generates exemplars by an exploration-enhanced DE mutant and an exploration/exploitation-balanced DE mutant, which helps balance convergence and diversity and makes HeDE-PSO more adaptive to different kinds of optimization problems than a single DE-mutant learning exemplar.

Parameter analysis

In this section, three parameters of HeDE-PSO are analyzed, namely, Ps1, p% and FEBFGS (the function-evaluation budget of the BFGS local search). In each test, one parameter is varied while the others are set according to Table 1. f1, f7, f11, f17, f21 and f27 are employed as representative functions.

Ps1 is the size of the subswarm using DE/current-to-pbest/1; the remaining particles employ DE/rand/1 to construct learning exemplars. Increasing Ps1 enlarges the exploitation-enhanced subswarm and shrinks the exploration-enhanced one. Table 7 indicates that Ps1 = 15 obtains the best performance on f1, f7, f11 and f21; hence, Ps1 = 15 is adopted in HeDE-PSO.

p% denotes that the top p% of particles are selected for DE/current-to-pbest/1 mutation in Eq. (11). Reducing p% selects a smaller portion of high-quality particles for the mutation, which enhances exploitation, while increasing p% admits more elite particles and enhances exploration. Table 8 shows that p% = 20% yields relatively better performance than other values; therefore, p% = 20% is adopted by HeDE-PSO.

Table 7 The effects of Ps1
Table 8 The effects of p%
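The role of p% can be seen in a sketch of the mutation step. Since Eq. (11) is not reproduced here, the sketch follows the standard JADE-style DE/current-to-pbest/1 form, and F and the function name are illustrative:

```python
import numpy as np

def de_current_to_pbest_1(pop, fitness, i, F=0.5, p=0.2, rng=None):
    """DE/current-to-pbest/1 mutant for particle i (minimization).

    x_pbest is drawn at random from the top p% of the population, so a
    smaller p concentrates the mutation around the current elites
    (exploitation), while a larger p admits more particles (exploration)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(pop)
    top = np.argsort(fitness)[: max(1, int(np.ceil(p * n)))]
    pbest = pop[rng.choice(top)]
    # two distinct random indices different from i for the difference vector
    r1, r2 = rng.choice([j for j in range(n) if j != i], size=2, replace=False)
    return pop[i] + F * (pbest - pop[i]) + F * (pop[r1] - pop[r2])
```

With p = 20% of a 40-particle swarm, x_pbest is sampled from the 8 best particles of the current generation.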

FEBFGS denotes the total number of function evaluations consumed by the BFGS local search in the later stage of HeDE-PSO. BFGS converges faster than population-based algorithms but is apt to fall into local optima, so the initial point is very important. Table 9 indicates that FEBFGS = 100 × D performs better than other parameter settings.

Table 9 The effects of FEBFGS
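One way to wire up such a budget-limited refinement stage is sketched below, assuming SciPy's BFGS with finite-difference gradients (each iteration then costs roughly D + 1 evaluations, so the budget is converted into an iteration cap); the function name and the conversion are illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def bfgs_refine(f, x0, fe_budget):
    """Polish a candidate solution with BFGS under an approximate
    function-evaluation budget (e.g. fe_budget = 100 * D as in Table 9).

    With finite-difference gradients each BFGS iteration costs about
    D + 1 evaluations of f, so the budget is mapped to maxiter."""
    d = len(x0)
    res = minimize(f, np.asarray(x0, dtype=float), method="BFGS",
                   options={"maxiter": max(1, fe_budget // (d + 1))})
    return res.x, res.fun
```

Seeding x0 with the best particle found by the swarm addresses the sensitivity to the initial point noted above.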

Application to industrial refrigeration system design

To test the performance of HeDE-PSO on real-life applications, the industrial refrigeration system design problem is used. This problem, provided in the CEC2020 real-world single-objective constrained optimization competition [68], is modeled by Eqs. (19)–(34). It has fourteen variables, one objective and fifteen inequality constraints, and its global optimum is 3.2213E−2 with zero constraint violation. For a fair comparison, the ε-constraint method [69, 70] is adopted as the constraint handling technique, and the BFGS local search [32] of DMSDL-PSO and HeDE-PSO is removed. The violation function Φ(x) is expressed as

$$\phi (x)=\sum_{j}{\left\Vert \mathrm{max}\{0,{g}_{j}(x)\}\right\Vert }^{p}+\sum_{j}{\left\Vert {h}_{j}(x)\right\Vert }^{p},$$
(15)

where gj(x) ≤ 0 and hj(x) = 0 denote the inequality and equality constraints, respectively. Let fA, ΦA and fB, ΦB denote the fitness values and constraint violation values at points A and B, respectively. The ε-level comparison is defined as follows:

$${(f}_{A},{\phi }_{A}){<}_{\varepsilon }{(f}_{B},{\phi }_{B})=\left\{\begin{array}{ll}{f}_{A}<{f}_{B}, &\quad \mathrm{if}\;{\phi }_{A},{\phi }_{B}\le \varepsilon \\ {f}_{A}<{f}_{B},&\quad \mathrm{if}\;{\phi }_{A}={\phi }_{B}\\ {\phi }_{A}<{\phi }_{B},&\quad \mathrm{otherwise}\end{array}\right.$$
(16)
$${\varepsilon }_{0}=\phi \left({{\varvec{x}}}_{\theta }\right),$$
(17)
$$\varepsilon (t)=\left\{\begin{array}{ll}\varepsilon (0)(1-{T}_{\lambda }/{T}_{c}{)}^{cp}, &\quad 0<t\le {T}_{c}\\ 0&\quad t>{T}_{c}\end{array}.\right.$$
(18)

In this study, θ = 0.5, Tλ = 0.95 × Tc, cp = 3 and Tc = 0.2 × MFEs, where MFEs is the maximum allowable number of function evaluations.
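Eqs. (15)–(18) translate into a few lines of code; the sketch below uses p = 1 in the violation norm as an illustrative choice and implements the ε schedule of Eq. (18) as printed, with the fixed Tλ/Tc ratio holding the ε level until t exceeds Tc:

```python
def violation(x, ineq, eq, p=1):
    """Eq. (15): summed p-th powers of the inequality overshoots
    max{0, g_j(x)} and the equality residuals h_j(x) (p = 1 here)."""
    return (sum(max(0.0, g(x)) ** p for g in ineq)
            + sum(abs(h(x)) ** p for h in eq))

def eps_less(a, b, eps):
    """Eq. (16): epsilon-level comparison of (fitness, violation) pairs."""
    fa, pa = a
    fb, pb = b
    if (pa <= eps and pb <= eps) or pa == pb:
        return fa < fb   # both within the eps level (or tied): compare fitness
    return pa < pb       # otherwise the less-violating point wins

def eps_level(t, eps0, Tc, Tlam, cp=3):
    """Eqs. (17)-(18) as printed: epsilon stays at eps0*(1 - Tlam/Tc)**cp
    until t reaches Tc, then drops to 0 (feasible-only comparisons)."""
    return eps0 * (1.0 - Tlam / Tc) ** cp if 0 < t <= Tc else 0.0
```

With the paper's settings (Tλ = 0.95 Tc, cp = 3), the active ε level is ε0 × 0.05³, i.e. a small fraction of the initial violation φ(x_θ).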

The test results of 30 independent runs are given in Table 10, where MF, MV, FR, and SR denote the mean fitness, mean constraint violation, feasibility rate and success rate, respectively. The success rate is the proportion of runs in which an algorithm obtains a feasible solution whose error is no larger than 1e−8 (\(f(\overline{x })-f({\overline{x} }^{*})\le {10}^{-8}\)). Table 10 indicates that HeDE-PSO achieves 100% FR and 100% SR. BLPSO achieves 100% FR but 0% SR. APSO-DEE, XPSO, FMPSO and DMSDL-PSO yield high FR, but their SRs are low: the SRs of DTT-PSO and XPSO are 3.33% and 16.7%, respectively, while those of BLPSO, FMPSO and PPSO are 0%. HeDE-PSO thus has a clear advantage over the other PSO algorithms.

Table 10 Test results on industrial refrigeration system design

Minimize:

$$\begin{aligned} f(\overline{x })&=63098.88{x}_{2}{x}_{4}{x}_{12}+5441.5{x}_{2}^{2}{x}_{12}+115055.5{x}_{2}^{1.664}{x}_{6}+6172.27{x}_{2}^{2}{x}_{6}\\ &\quad +63098.88{x}_{1}{x}_{3}{x}_{11}+5441.5{x}_{1}^{2}{x}_{11}+115055.5{x}_{1}^{1.664}{x}_{5}+6172.27{x}_{1}^{2}{x}_{5}\\ &\quad +140.53{x}_{1}{x}_{11}+281.29{x}_{3}{x}_{11}+70.26{x}_{1}^{2}+281.29{x}_{1}{x}_{3}+281.29{x}_{3}^{2}\\ &\quad +14437{x}_{8}^{1.8812}{x}_{12}^{0.3424}{x}_{10}{x}_{14}^{-1}{x}_{1}^{2}{x}_{7}{x}_{9}^{-1}, \end{aligned}$$
(19)

subject to

$${g}_{1}(\overline{x })=1.524{x}_{7}^{-1}\le 1,$$
(20)
$${g}_{2}(\overline{x })=1.524{x}_{8}^{-1}\le 1,$$
(21)
$${g}_{3}(\overline{x })=0.07789{x}_{1}-2{x}_{7}^{-1}{x}_{9}-1\le 0,$$
(22)
$${g}_{4}(\overline{x })=7.05305{x}_{9}^{-1}{x}_{1}^{2}{{x}_{10}x}_{8}^{-1}{x}_{2}^{-1}{x}_{14}^{-1}-1\le 0,$$
(23)
$${g}_{5}(\overline{x })=0.0833{x}_{13}^{-1}{x}_{14}-1\le 0,$$
(24)
$${g}_{6}(\overline{x })=47.136{x}_{2}^{0.333}{x}_{10}^{-1}{x}_{12}-1.333{x}_{8}{x}_{13}^{2.1195}+62.08{x}_{13}^{2.1195}{x}_{12}^{-1}{x}_{8}^{0.2}{x}_{10}^{-1}-1\le 0,$$
(25)
$${g}_{7}(\overline{x })=0.04471{x}_{10}{x}_{8}^{1.8812}{x}_{12}^{0.3424}-1\le 0,$$
(26)
$${g}_{8}(\overline{x })=0.0488{x}_{9}{x}_{7}^{1.893}{x}_{11}^{0.316}-1\le 0,$$
(27)
$${g}_{9}(\overline{x })=0.0099{{x}_{1}x}_{3}^{-1}-1\le 0,$$
(28)
$${g}_{10}(\overline{x })=0.0193{x}_{2}{x}_{4}^{-1}-1\le 0, $$
(29)
$${g}_{11}(\overline{x })=0.0298{x}_{1}{x}_{5}^{-1}-1\le 0,$$
(30)
$${g}_{12}(\overline{x })=0.056{x}_{2}{x}_{6}^{-1}-1\le 0,$$
(31)
$${g}_{13}(\overline{x })=2{x}_{9}^{-1}-1\le 0,$$
(32)
$${g}_{14}(\overline{x })=2{x}_{10}^{-1}-1\le 0,$$
(33)
$${g}_{15}(\overline{x })={x}_{12}{x}_{11}^{-1}-1\le 0.$$
(34)

With bounds

$$ 0.001\le {x}_{i} \le 5,\quad i=1, 2, 3, \ldots, 14. $$
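For reference, Eqs. (19)–(34) transcribe directly into code (0-based indexing, so x[0] is x1 and x[13] is x14; g1 and g2 are rewritten from the ≤ 1 form into the g(x) ≤ 0 form used by the other constraints):

```python
def refrigeration_objective(x):
    """Objective of Eq. (19)."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14 = x
    return (63098.88 * x2 * x4 * x12 + 5441.5 * x2**2 * x12
            + 115055.5 * x2**1.664 * x6 + 6172.27 * x2**2 * x6
            + 63098.88 * x1 * x3 * x11 + 5441.5 * x1**2 * x11
            + 115055.5 * x1**1.664 * x5 + 6172.27 * x1**2 * x5
            + 140.53 * x1 * x11 + 281.29 * x3 * x11 + 70.26 * x1**2
            + 281.29 * x1 * x3 + 281.29 * x3**2
            + 14437 * x8**1.8812 * x12**0.3424 * x10 / x14
            * x1**2 * x7 / x9)

def refrigeration_constraints(x):
    """Inequality constraints g1..g15 of Eqs. (20)-(34), each as g(x) <= 0."""
    x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14 = x
    return [
        1.524 / x7 - 1,                                    # g1
        1.524 / x8 - 1,                                    # g2
        0.07789 * x1 - 2 * x9 / x7 - 1,                    # g3
        7.05305 * x1**2 * x10 / (x9 * x8 * x2 * x14) - 1,  # g4
        0.0833 * x14 / x13 - 1,                            # g5
        (47.136 * x2**0.333 * x12 / x10
         - 1.333 * x8 * x13**2.1195
         + 62.08 * x13**2.1195 * x8**0.2 / (x12 * x10) - 1),  # g6
        0.04471 * x10 * x8**1.8812 * x12**0.3424 - 1,      # g7
        0.0488 * x9 * x7**1.893 * x11**0.316 - 1,          # g8
        0.0099 * x1 / x3 - 1,                              # g9
        0.0193 * x2 / x4 - 1,                              # g10
        0.0298 * x1 / x5 - 1,                              # g11
        0.056 * x2 / x6 - 1,                               # g12
        2 / x9 - 1,                                        # g13
        2 / x10 - 1,                                       # g14
        x12 / x11 - 1,                                     # g15
    ]
```

These evaluators can be plugged directly into the violation function of Eq. (15) for ε-constrained comparison.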

Conclusions

This study proposes a heterogeneous differential evolution particle swarm optimization (HeDE-PSO) algorithm. HeDE-PSO adopts two DE mutants to construct learning exemplars for PSO, which improves adaptability, and employs BFGS local search in the late stage to increase search accuracy. DE/rand/1 is employed to enhance exploration and DE/current-to-pbest/1 to enhance exploitation. Test results on 10-dimensional and 30-dimensional functions show that HeDE-PSO outperforms the compared algorithms on most of the tested functions; on the 30-dimensional functions, it surpasses the other PSO algorithms on at least 20 of the 29 functions. HeDE-PSO achieves high performance on the 10-dimensional and 30-dimensional CEC2017 functions without parameter tuning. On the industrial refrigeration system design problem, HeDE-PSO is the only algorithm achieving a 100% feasibility rate and a 100% success rate. These results indicate that adopting heterogeneous DE learning exemplars improves the performance and adaptability of HeDE-PSO, and that the algorithm is applicable to both benchmark functions and real-life application problems.