Introduction

The Differential Evolution (DE) algorithm was first proposed by Storn and Price in 1995 [1] and is widely used to solve real-valued optimization problems. DE is a population-based adaptive global optimization algorithm. It solves optimization problems through the iteration of evolutionary operators, whose steps include mutation, crossover and selection [2, 3]. Because of its simple structure and easy implementation, it is widely used in data mining, artificial neural networks, digital filter design, electromagnetics and other fields, for example, solar cell and module parameter estimation [4], electromagnetic frameworks [5], environmental and economic power dispatch [6], and protein structure prediction [7].

Unlike other algorithms [8, 9], DE mainly balances the trade-off between exploration and exploitation through its mutation strategy, population size \(\text {NP}\), scaling factor F, crossover rate \(\text {CR}\) and other parameters. Early DE practice tuned the parameters and mutation strategy by trial and error in order to obtain the best result. However, this approach requires extensive testing, so many researchers began to study adaptive techniques to increase effectiveness and conserve computing resources. Adaptive techniques dynamically update the control parameters during optimization. Moreover, if the parameter adaptation is well designed, the algorithm's convergence performance can be enhanced.

According to the classification scheme introduced by Angeline [10] and Eiben et al. [11, 12], parameter control mechanisms can be divided into the following three categories:

  1. Deterministic parameter setting. The parameters are controlled by deterministic rules and receive no feedback information during the evolutionary iteration. For instance, the mutation rate can be changed according to a time-dependent schedule [13].

  2. Adaptive parameter setting. During the search process, the parameters are dynamically changed using feedback information. Adaptive parameter setting offers a great advantage in improving the convergence rate and the probability of finding the global optimum. This category includes many recently proposed DE algorithms, such as jDE [14], SaDE [15], JADE [16], SHADE [17], and others.

  3. Self-adaptive parameter setting. The parameters themselves evolve along with the population. Better parameter values tend to be passed on to more offspring because they are more likely to generate better individuals. SPDE [18] and DESAP [19] both fall into this category.

In addition to adaptive techniques, the prevalence of distributed computing offers a fresh way to enhance the performance and accessibility of algorithms. Researchers place the population on a distributed system to enhance the performance of the DE method. Owing to the population's distributed structure, each sub-population can search for an optimal solution independently during evolution. An adaptive DDE (ADDE) is proposed by Zhan et al. [20]. In ADDE, the total population is divided into four sub-populations, one master population and three slave populations, using a master-slave multi-population distribution framework. The master population collects individuals from the three slave populations and redistributes them to the three slave populations after classifying them, while each slave population adaptively chooses its own mutation strategy according to individual feedback. Wu et al. propose a multi-population based framework (MPF) in EDEV [21] to integrate multiple DE variants. In EDEV, each DE variant evolves independently, and computing resources are dynamically allocated to the DE variants based on their performance. Li et al. introduce an information-sharing mechanism into the sub-populations of MPMSDE [22] to avoid falling into local optima. In Cloudde [23], the authors divide the population into four parallel sub-populations and use different mutation strategies in the concurrent sub-populations to reduce the computational cost of practical problems and meet different search requirements.

MPEDE [24], presented by Wu et al., uses a multi-population based technique that achieves the dynamic integration of multiple mutation strategies. MPEDE creates a reward sub-population and three other sub-populations, with the reward sub-population being assigned to one of the others. The three sub-populations adopt different mutation and updating techniques, namely the “DE/current-to-pbest/1” with archive, “DE/current-to-rand/1” and “DE/rand/1” strategies. As the algorithm evolves, the strategy with the best recent performance automatically receives the additional computing resources of the reward sub-population, while the remaining computing resources are distributed equally between the other two strategies.

However, there are still some problems with these works. In MPEDE, the greatest computational resources are made available to the best mutation strategy throughout evolution, whereas the medium and inferior strategies receive equal resources. This allocation does not make full use of computing resources: the medium strategy should receive more computational resources than the inferior strategy in order to utilize them more effectively. The “DE/rand/1” mutation strategy has an advantage in increasing population diversity, but it may reduce the rate of convergence [25]. The “DE/current-to-pbest/1” strategy in MPEDE employs the arithmetic mean to control its parameters, which could result in premature convergence of the algorithm [26].

To address these issues, an integrated differential evolution of multi-population based on contribution degree (MDE-ctd) is proposed. It employs a new grouping technique, the dynamic regrouping method (DRM), to balance the distribution of computing resources. DRM calculates the contribution degree of the three sub-populations and regroups their sizes according to the rank of contribution degree. During evolution, the integrated sub-population expands the search space and lessens the impact of strategy choice on convergence speed by using one mutation strategy pool (“DE/best/2”, “DE/rand/1” and “DE/current-to-rand/1”) and two parameter value pools (a \(\text {CR}\) parameter pool and an F parameter pool). Additionally, a historical successful weight (HSW) parameter adaptation approach is used for the “DE/current-to-pbest/1” parameters. This adaptive technique avoids premature convergence, speeds up convergence and helps escape local optima. MDE-ctd was tested in several dimensions on the CEC 2005 and CEC 2014 benchmark function suites, and an exhaustive comparison with a number of state-of-the-art DE variants fully illustrates its competitive performance. MDE-ctd performed especially well on highly complex optimization problems.

Related work

This section discusses computing resource allocation methods, the traditional DE, differential evolution of multi-population based on the ensemble of mutation strategies (MPEDE) [24], and Ensemble of differential evolution variants (EDEV) [21].

Computational resource allocation methods

During evolution, a single mutation strategy performs very differently on multi-modal, constrained, large-scale, dynamic, and uncertain optimization problems, and it cannot ensure that the algorithm reaches the global optimum. Many researchers have therefore begun to use multiple mutation strategies that co-evolve. In addition, effective computing resource allocation methods are also valued by many researchers. Reasonable resource allocation can not only avoid the waste of computing resources but also maintain the good search performance of the algorithm. The use of these two methods is not limited to the DE algorithms mentioned in the first section: in other studies, such as particle swarm optimization and coevolution, contribution-based computing resource allocation methods and multi-population strategies also appear.

In MPCPSO [27], Li et al. divided the whole population into an “elite group (EP)” and a “general group (GP)”, with each group adopting a different learning strategy. By using a dynamic segment-based mean learning strategy (DSMLS) and a multi-dimensional comprehensive learning strategy (MDCLS) simultaneously, information sharing and simultaneous evolution among populations can be realized. In DCCA [28], a two-layer distributed coevolutionary structure is proposed. On the one hand, the scalability of dimension division is obtained and high-dimensional problems are effectively handled. On the other hand, the sub-components can quickly adapt to changes in resource allocation and achieve an effective allocation of computing resources. In DCCC [29], a cooperative coevolutionary framework based on difficulty and contribution is proposed, which allocates more computing resources to sub-problems with greater contributions and higher difficulty, thus comprehensively addressing the imbalance between problem difficulty and computing resources. In the CBCCO [30] proposed by Jia et al., the contribution-based overlapping problem decomposition (CBD) method is used to allocate computing resources to the sub-components with larger contributions, effectively and efficiently decomposing and optimizing non-separable large-scale problems with overlapping sub-components.

DE algorithm

DE is a population-based optimization method and a kind of evolutionary algorithm. The main difference between DE and other evolutionary algorithms (EAs) is that DE is driven by individual differences [31]. Its evolution consists of four basic steps: initialization, mutation, crossover and selection [32]. The DE implementation process is as follows.

Initialization

The population is randomly created during the initialization phase in accordance with the uniform distribution in the search space as:

$$\begin{aligned} x _ { i , j , 0 } = L _ { j } + \text {rand} \times ( U _ { j } - L _ { j } ) \end{aligned}$$
(1)

where i is the individual index, j is the dimension index, \(U _ { j }\) and \(L _ { j }\) are the upper and lower bounds of the jth dimension, and rand is a random number in the range [0, 1]. Once the population has been initialized, DE optimizes it via a sequence of evolutionary operations, including mutation, crossover, and selection, until the termination condition is met.
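As a concrete illustration, the following is a minimal NumPy sketch of Eq. (1); the function and variable names (`initialize_population`, `lower`, `upper`) are illustrative rather than taken from the paper.

```python
import numpy as np

def initialize_population(NP, D, lower, upper, rng=None):
    """Uniform random initialization of Eq. (1): x_{i,j,0} = L_j + rand * (U_j - L_j)."""
    rng = np.random.default_rng() if rng is None else rng
    # rand is drawn independently for every individual i and dimension j
    return lower + rng.random((NP, D)) * (upper - lower)

# example: 210 individuals in the 30-dimensional box [-100, 100]^30
pop = initialize_population(210, 30, np.full(30, -100.0), np.full(30, 100.0))
```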

Mutation

At generation g, for each individual \(x_{i,g}\) in the current population, a mutant vector is generated by the mutation strategy as \(\{ v _ { i , g } = ( v _ { i , 1 ,g } , v _ { i , 2, g }, \ldots , v _ { i , D, g } )\mid i = 1,2,3, \ldots , \text {NP} \}\), where NP is the population size and D is the dimension of the problem. The following are some common mutation strategies in the literature:

DE/rand/1:

$$\begin{aligned} v _ { i ,g} = x _ { r 1 ,g} + F \cdot ( x _ { r 2,g } - x _ { r 3 ,g} ) \end{aligned}$$
(2)

DE/rand/2:

$$\begin{aligned} v _ { i ,g} = x _ { r 5 ,g} + F \cdot ( x _ { r 1,g } - x _ { r 2,g }) + F \cdot ( x _ { r 3 , g } - x _ { r 4,g } ) \end{aligned}$$
(3)

DE/current-to-best/1:

$$\begin{aligned} v _ { i ,g} = x _ { i ,g } + F \cdot ( x _ { { \text {best} ,g} } - x _ { i ,g } ) + F \cdot ( x _ { r 1 ,g } - x _ { r 2 ,g} ) \end{aligned}$$
(4)

DE/current-to-rand/1:

$$\begin{aligned} v _ { i ,g} = x _ { i ,g} + F \cdot ( x _ { r 3,g } - x _ { i ,g} ) + F \cdot ( x _ { r 1,g } - x _ { r 2,g } )\end{aligned}$$
(5)

DE/current-to-pbest/1:

$$\begin{aligned} v _ { i , g } = x _ { i , g } + F \cdot ( x _ { { \text {best} } , g } ^ { p } - x _ { i , g } ) + F \cdot ( x _ { r 1 , g } - x _ { r 2 , g } )\end{aligned}$$
(6)

where \(r _ {1}, r _ {2}, r _ {3}, r _ {4}\) and \(r _ {5}\) are mutually distinct random indices drawn from the range \([1, \text {NP}]\). The best individual in generation g of the population is \(x _ {\text {best},g}\), and \(x _ {\text {best},g}^{p}\) is chosen at random from the best p% of the current population. F scales the difference vector in order to control the search step.
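For illustration, the sketch below implements two of these strategies, Eqs. (2) and (6), in NumPy; the function names and the omission of the external archive are simplifying assumptions, not the paper's reference implementation.

```python
import numpy as np

def mutate_rand_1(pop, i, F, rng):
    """DE/rand/1, Eq. (2): v_i = x_r1 + F * (x_r2 - x_r3) with distinct r1, r2, r3 != i."""
    r1, r2, r3 = rng.choice([k for k in range(len(pop)) if k != i], size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def mutate_current_to_pbest_1(pop, fitness, i, F, p, rng):
    """DE/current-to-pbest/1, Eq. (6); x^p_best is drawn from the best p portion of the population."""
    top = np.argsort(fitness)[: max(1, int(round(p * len(pop))))]   # minimization: lowest fitness first
    pbest = pop[rng.choice(top)]
    r1, r2 = rng.choice([k for k in range(len(pop)) if k != i], size=2, replace=False)
    return pop[i] + F * (pbest - pop[i]) + F * (pop[r1] - pop[r2])
```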

Crossover

After the mutation operator, the target vector is crossed with the mutation vector to generate the trial vector \(u _ { i , g } = ( u _ { i , 1 , g } , u _ { i , 2 , g } , \ldots , u _ { i , D , g } )\). In DE, the crossover operation can be implemented with one of three methods: binomial crossover, exponential crossover [33], and arithmetic crossover. DE usually uses binomial crossover, defined as:

$$\begin{aligned} u _ { i , j , g } = \left\{ \begin{array} { l l } { v _ { i , j , g } ,} &{} { { {\text {if}} \ (\text {rand} } \le \text {CR} \ { or }\ j = j _ { \text { rand } }) } \\ { x _ { i , j , g }, } &{} { { \text {otherwise} } } \end{array} \right. \end{aligned}$$
(7)

where \(j _ {\text {rand}} = \text {RandInt}(1, D)\) is a random integer from 1 to D; it ensures that at least one variable of the trial vector \(u _ {i,g}\) comes from the mutation vector \(v _ {i,g}\). CR is the crossover rate and determines the proportion of \(u _ {i,g}\) inherited from \(v _ {i,g}\); if CR = 1, the trial vector is identical to the mutation vector.
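A minimal sketch of the binomial crossover in Eq. (7) could look as follows; the helper name is hypothetical.

```python
import numpy as np

def binomial_crossover(x_i, v_i, CR, rng):
    """Binomial crossover of Eq. (7); j_rand guarantees at least one component is taken from v_i."""
    D = len(x_i)
    j_rand = rng.integers(D)
    mask = rng.random(D) <= CR
    mask[j_rand] = True
    return np.where(mask, v_i, x_i)
```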

Selection

If the value of a newly generated decision variable exceeds the upper bound or falls below the lower bound, it is reset to the corresponding bound or re-initialized within the predetermined range. Then, the selection operation is performed after evaluating the objective function values of all trial vectors.

In order to keep the better vector in the population, the fitness of the trial vector is evaluated after the crossover operator, and the selection operator is then executed. The fitness values of \(x _ {i,g}\) and \(u _ {i,g}\) determine which of them survives into the next generation. For a minimization problem, the vector with the lower fitness value always advances to the next generation, as illustrated by

$$\begin{aligned} x _ { i , g + 1 } = \left\{ \begin{array} { l l } { u _ { i , g } , } &{} { \text { if }\ f ( u _ { i , g } ) \le f ( x _ { i , g } ) } \\ { x _ { i , g } , } &{} { \text {otherwise} } \end{array} \right. \end{aligned}$$
(8)

where f(x) is the fitness evaluation function.
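The greedy selection of Eq. (8) for a minimization problem can be sketched as below, applied to the whole population at once; the vectorized form is an assumption made for brevity.

```python
import numpy as np

def select(pop, fitness, trials, trial_fitness):
    """Greedy selection of Eq. (8): keep the trial vector when it is no worse than its parent."""
    better = trial_fitness <= fitness
    new_pop = np.where(better[:, None], trials, pop)
    new_fitness = np.where(better, trial_fitness, fitness)
    return new_pop, new_fitness
```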

MPEDE

MPEDE [24] is a multi-population-based differential evolution algorithm that uses three different mutation strategies. The entire population is dynamically divided into four populations: three smaller sub-populations and a larger reward population. The three mutation strategies chosen by MPEDE are “DE/current-to-pbest/1” with archive, “DE/current-to-rand/1” and “DE/rand/1”. During evolution, the reward population is dynamically allocated to the dominant strategy according to the proportional fitness improvement achieved by the different strategies.

It is crucial to choose the right control parameters for each mutation strategy because the performance of the DE algorithm and the mutation strategy is correlated. As a result, different mutation strategies may require different parameter settings.

MPEDE uses adaptive parameter control. The scaling factor \(F _ {i,j}\), used by individual \(x _ {i}\) with the jth mutation strategy to generate its trial solution, is drawn from a Cauchy distribution with location parameter \(\mu F _ { j }\) and scale parameter 0.1. The crossover probability \(\text {CR} _ {i,j}\) of the trial solution generated by the jth mutation strategy for individual \(x _ {i}\) is drawn from a normal distribution with mean \(\mu \text {CR} _ { j }\) and standard deviation 0.1. In each generation g, the parameters are set as follows:

$$\begin{aligned}{} & {} \left\{ \begin{array} { l } { F _ { i , j } = { \text {rand}c } _ { i , j } ( \mu F _ { j } , 0.1 ) } \\ { \text {CR} _ { i , j } = { \text {rand}n } _ { i , j } ( \mu \text {CR} _ { j } , 0.1 ) } \end{array} \right. \end{aligned}$$
(9)
$$\begin{aligned}{} & {} \quad \mu F _ { j } = ( 1 - c ) \cdot \mu F _ { j } + c \cdot { \text {mean} } _ { A } ( S _ { F , j } ) \end{aligned}$$
(10)
$$\begin{aligned}{} & {} \quad \mu \text {CR} _ { j } = ( 1 - c ) \cdot \mu \text {CR} _ { j } + c \cdot { \text {mean} } _ { L } ( S _ { \text {CR} , j } ) \end{aligned}$$
(11)

where c is a positive constant between 0 and 1, \( { \text {mean} } _ { A } ( S _ { F , j } )\) is the arithmetic mean of \(S _ {F,j}\) [26], and \( { \text {mean} } _ { L } ( S _ { \text {CR} , j } )\) is the Lehmer mean of \(S _ {\text {CR},j}\) [17], which are defined as

$$\begin{aligned}{} & {} { \text {mean} } _ { A } ( S _ { F , j } ) = \frac{ 1 }{ m } \sum _ { k = 1 } ^ { m } S _ { F , k } \end{aligned}$$
(12)
$$\begin{aligned}{} & {} {\text {mean}} _ {L} ( S _ { \text {CR} , j } ) = \frac{ {\textstyle \sum _{k=1}^{\mid S _ { \text {CR} } \mid }}S _ { \text {CR} , k } ^ { 2 }}{{\textstyle \sum _{k=1}^{\mid S _ { \text {CR} } \mid }}S _ { \text {CR} , k }} \end{aligned}$$
(13)
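The sketch below illustrates the parameter sampling and mean updates of Eqs. (9)–(13) as written in this section (Cauchy-distributed F, normally distributed CR, arithmetic mean over the successful F values and Lehmer mean over the successful CR values); the truncation and regeneration rules follow common JADE-style conventions and are assumptions where the text is silent.

```python
import numpy as np

def sample_F_CR(mu_F, mu_CR, NP, rng):
    """Eq. (9): F ~ Cauchy(mu_F, 0.1), regenerated if non-positive and truncated at 1;
    CR ~ N(mu_CR, 0.1), clipped to [0, 1]."""
    F = mu_F + 0.1 * rng.standard_cauchy(NP)
    while np.any(F <= 0):
        bad = F <= 0
        F[bad] = mu_F + 0.1 * rng.standard_cauchy(bad.sum())
    F = np.minimum(F, 1.0)
    CR = np.clip(rng.normal(mu_CR, 0.1, NP), 0.0, 1.0)
    return F, CR

def update_mu(mu_F, mu_CR, S_F, S_CR, c=0.1):
    """Eqs. (10)-(13): blend the old means with the arithmetic mean of S_F
    and the Lehmer mean of S_CR, using learning rate c."""
    if len(S_F) > 0:
        mu_F = (1 - c) * mu_F + c * float(np.mean(S_F))
    if len(S_CR) > 0:
        S_CR = np.asarray(S_CR, dtype=float)
        mu_CR = (1 - c) * mu_CR + c * float(np.sum(S_CR**2) / np.sum(S_CR))
    return mu_F, mu_CR
```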

EDEV

Wu et al. proposed EDEV in 2017, a practical multi-population based framework (MPF) [21] for integrating multiple DE variants. EDEV is composed of three classical DE variants: JADE [16], CoDE [34], and EPSDE [35]. During evolution, each constituent DE variant in EDEV is assigned an indicator sub-population using the MPEDE allocation method. After a certain number of iterations, the reward sub-population is assigned to the DE variant with the best recent performance. The three DE variants used in EDEV are briefly described below.

JADE

The “DE/current-to-pbest/1” mutation strategy is employed in JADE, in two versions: one with an archive and one without. The archive version uses the historically successful parameters, generated from a normal distribution and a Cauchy distribution, to realize the self-adaptation of the parameters F and \(\text {CR}\).

CoDE

Three alternative mutation strategies are utilized in CoDE: “DE/rand/1”, “DE/rand/2”, and “DE/current-to-rand/1”, together with three control parameter combinations: \([F = 1.0, \text {CR} = 0.1]\), \([F = 1.0, \text {CR} = 0.9]\), and \([F = 0.8, \text {CR} = 0.2]\). In each generation, every parent vector uses the three mutation strategies to produce child vectors, and if a child vector is better than the parent vector, the child vector survives.

EPSDE

In EPSDE, a mutation strategy pool containing “DE/best/2”, “DE/rand/1”, and “DE/current-to-rand/1” is used. The \(\text {CR}\) parameter pool is [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], while the F parameter pool is [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]. Each member of the initial population is randomly assigned a mutation strategy and associated parameter values from the respective pools. The assigned mutation strategy and parameter values survive when the child vector produced during evolution outperforms the parent vector; otherwise, they are randomly reinitialized.

Fig. 1
figure 1

The main framework of MDE-ctd

An integrated differential evolution of multi-population

The traditional multi-population based DE allocates computing resources of the same size to each sub-population, which is not conducive to the full and rational use of computing resources. MDE-ctd divides the whole population into three sub-populations (archival sub-population, exploratory sub-population and integrated sub-population) based on mutation strategy. In the iterative evolution process, the sub-populations are ranked by their contribution degree, and the computing resources are dynamically adjusted according to the ranking results. The main framework of MDE-ctd is shown in Fig. 1.

In MDE-ctd, the mutation strategy of the archival sub-population is “DE/current-to-pbest/1”, the mutation strategy of the exploratory sub-population is “DE/current-to-rand/1”, and the integrated sub-population uses one mutation strategy pool (“DE/best/2”, “DE/rand/1” and “DE/current-to-rand/1”) and two parameter value pools (a \(\text {CR}\) parameter pool [0.1–0.9] and an F parameter pool [0.4–0.9]).

Contribution degree

The contribution degree is the degree to which each sub-population participates in finding the global optimal solution; it counts the proportion of the optimal fitness of each sub-population within the whole population.

Each mutation-strategy-based sub-population is ranked according to the ratio between its fitness improvement over the previous ng generations and the function evaluations it consumed. The indices of the highest-ranked sub-population h and the lowest-ranked sub-population l are defined as follows:

$$\begin{aligned} h= & {} \arg \left( \max _ { 1 \le j \le 3 } \left( \frac{ \Delta f _ { j } }{ ng \cdot \text {NP} _ {j} } \right) \right) \end{aligned}$$
(14)
$$\begin{aligned} l= & {} \arg \left( \min _ { 1 \le j \le 3 } \left( \frac{ \Delta f _ { j } }{ ng \cdot \text {NP} _ {j} } \right) \right) \end{aligned}$$
(15)

where \(\text {NP}_ {j}\) is the size of the jth sub-population, \(\Delta f_ {j}\) is the accumulated fitness improvement brought by the jth mutation strategy over the previous ng generations, and \( ng \cdot \text {NP}_ {j}\) is the number of function evaluations consumed by the jth mutation strategy over the previous ng generations. The medium mutation strategy, i.e., the one other than the best mutation strategy h and the worst mutation strategy l, is denoted m.

After ranking the sub-populations, the highest-ranked sub-population h receives 3 points, the medium-ranked sub-population m receives 2 points, and the lowest-ranked sub-population l receives 1 point. The size of each sub-population is then determined as follows.

$$\begin{aligned}{} & {} \lambda _ j = \frac{\text {score} _ j}{\sum _ { j = 1 , \ldots 3 }\ {\text {score} _ j}} \end{aligned}$$
(16)
$$\begin{aligned}{} & {} \text {NP}_{ j }^{new} = \lambda _ { j } \cdot \text {NP} , \quad j = 1 , \ldots , 3 \end{aligned}$$
(17)

where \(\text {score}_{j}\) is the score of the jth strategy, \(\lambda _{j}\) is the proportion of the jth strategy's contribution, and \( \text {NP} \) is the total population size.

For different types of optimization problems, MDE-ctd can dynamically allocate computing resources according to the contribution degree. More computing resources are made available to mutation strategies that suit the current type of optimization problem, and fewer computing resources are allocated to the poorer strategies, thus making full use of computing resources.
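A minimal sketch of this dynamic regrouping (Eqs. (14)–(17)) is given below; rounding the new sizes to integers and keeping each sub-population non-empty are practical assumptions not spelled out in the text.

```python
import numpy as np

def regroup_by_contribution(delta_f, sub_sizes, ng, NP):
    """Eqs. (14)-(17): rank the three sub-populations by delta_f_j / (ng * NP_j),
    score them 3 / 2 / 1 from best (h) to worst (l), and resize them proportionally."""
    contribution = np.asarray(delta_f, dtype=float) / (ng * np.asarray(sub_sizes, dtype=float))
    order = np.argsort(-contribution)          # sub-population indices from highest to lowest contribution
    scores = np.empty(3)
    scores[order] = [3.0, 2.0, 1.0]            # h -> 3, m -> 2, l -> 1
    lam = scores / scores.sum()                # Eq. (16)
    new_sizes = np.maximum(1, np.round(lam * NP).astype(int))   # Eq. (17), kept integral and non-empty
    return new_sizes

# example: three equal sub-populations of 70, different fitness improvements over ng = 5 generations
print(regroup_by_contribution([120.0, 40.0, 5.0], [70, 70, 70], ng=5, NP=210))   # -> [105  70  35]
```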

Multi-population and multi-strategy integration method

Throughout the iterative process of DE, the mutation strategy plays different roles in different periods. In the early stage, individuals are scattered; in the middle and late stages, individuals gather in locally or globally optimal regions.

In MDE-ctd, the archival sub-population uses the “DE/current-to-pbest/1” (with archive) mutation strategy, which preserves good search performance, accelerates convergence, and improves local search ability. The “DE/current-to-rand/1” strategy is used in the exploratory sub-population to allow individuals to explore more territory and prevent the population from clustering too early in the evolutionary process. Note that the “DE/current-to-rand/1” strategy does not use crossover.

The integrated sub-population is updated using a mutation strategy pool and parameter value pools: it holds a pool of values for each relevant parameter as well as a pool of mutation strategies. When tackling various real-valued optimization problems, these mutation strategies compete to produce offspring that exhibit different performance traits at different phases of evolution. The strategy pool contains “DE/rand/1”, “DE/best/2”, and “DE/current-to-rand/1”, and the parameter pools are \(F \in \left[ 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 \right] \) and \(\text {CR} \in \left[ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 \right] \). Each member of the integrated sub-population is assigned a mutation strategy and parameter values randomly selected from the respective pools.
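The random assignment from the strategy and parameter pools might be sketched as follows; the EPSDE-style re-draw when an individual fails to improve is indicated only in the comments, and the function name is illustrative.

```python
import numpy as np

STRATEGY_POOL = ["DE/rand/1", "DE/best/2", "DE/current-to-rand/1"]
F_POOL = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
CR_POOL = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def assign_from_pools(n, rng):
    """Give each of the n members of the integrated sub-population a random strategy and (F, CR) pair.
    When a member fails to produce a better trial vector, its assignment is re-drawn the same way."""
    strategies = rng.choice(STRATEGY_POOL, size=n)
    F = rng.choice(F_POOL, size=n)
    CR = rng.choice(CR_POOL, size=n)
    return strategies, F, CR
```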

Adaptive parameter settings based on historical successful weight

Effective parameter combinations (F and CR) can improve the performance of DE. It is desirable to record and reuse these individual characteristics in later generations, giving excellent individuals a higher chance of survival. However, different mutation strategies require different combinations of parameters.

In order to use different parameter combinations rationally, an improved parameter adaptation method based on historical successful weights (HSW) is designed to dynamically update F and \(\text {CR}\). JADE uses all successful F and \(\text {CR}\) values to guide parameter adjustment, whereas the core idea of HSW is to judge the merit of the current weights by whether they successfully generate better child solutions, and to calculate the new F and \(\text {CR}\) through a weighted adaptive method. The parameters are set as follows:

$$\begin{aligned}{} & {} F _ { i, j } = { \text {rand}c } _ { i , j } ( \mu F _ { j } , 0.1 ) \end{aligned}$$
(18)
$$\begin{aligned}{} & {} \text {CR} _ { i , j } = { \text {rand}n } _ { i , j } ( \mu \text {CR} _ { j } , 0.1 ) \end{aligned}$$
(19)

where \(F _ { i, j }\) is the scaling factor of individual \(x_{i}\) using the jth mutation strategy, and \(\text {rand}c _ {i,j}\) denotes the Cauchy distribution. If \(F_{i,j} > 1\), it is truncated to 1, and if \(F_{i,j}\le 0\), it is regenerated. \(\text {CR} _ { i , j }\) is the crossover probability of individual \(x_{i}\) using the jth mutation strategy, and \(\text {rand}n _ {i,j}\) denotes the normal distribution with standard deviation 0.1. At the end of each generation, \(\mu F _ { j }\) and \(\mu \text {CR} _ { j } \) are updated according to the following formulas:

$$\begin{aligned}{} & {} \mu F _ { j } = ( 1 - c ) \cdot \mu F _ { j } + c \cdot { \text {mean} } _ { \text {WL} } ( S _ { F , j } ) \end{aligned}$$
(20)
$$\begin{aligned}{} & {} \mu \text {CR} _ { j } = ( 1 - c ) \cdot \mu \text {CR} _ { j } + c \cdot { \text {mean} } _ { \text {WL} } ( S _ { \text {CR} , j } ) \end{aligned}$$
(21)

where \({ \text {mean} } _ { \text {WL} } ( S _ { \text {CR} , j })\) and \({ \text {mean} } _ { \text {WL} } ( S _ { F , j })\) are the weighted Lehmer means of \(S _ {\text {CR},j}\) and \(S _ {F,j}\), respectively, and c is a positive constant between 0 and 1. They can be calculated as follows:

$$\begin{aligned}{} & {} {\text {mean}} _ {\text {WL}} ( S _ { \text {CR} , j } ) = \frac{ {\textstyle \sum _{k=1}^{\mid S _ { \text {CR} } \mid }} w _ { k }^{ g } \cdot S _ { \text {CR} , k } ^ { 2 }}{{\textstyle \sum _{k=1}^{\mid S _ { \text {CR} } \mid }} w _ { k }^{ g } \cdot S _ { \text {CR} , k }} \end{aligned}$$
(22)
$$\begin{aligned}{} & {} {\text {mean}} _ {\text {WL}} ( S _ { F , j } ) = \frac{ {\textstyle \sum _{k=1}^{\mid S _ { F } \mid }} w _ { k }^{ g } \cdot S _ { F , k } ^ { 2 }}{{\textstyle \sum _{k=1}^{\mid S _ { F } \mid }} w _ { k }^{ g } \cdot S _ { F , k }} \end{aligned}$$
(23)

Each successful F and CR value enters the weighted means with its own weight. The weight \(w_{k}^{g}\) is updated as follows:

$$\begin{aligned}{} & {} w _ { k } = \frac{\Delta d_{k,g}}{ {\textstyle \sum _{k = 1}^{\mid S _ { \text {CR} } \mid \ \text {or}\ \mid S _ { F } \mid }} \Delta d _ { k , g }} \end{aligned}$$
(24)
$$\begin{aligned}{} & {} \quad \Delta d _ { k , g } = |f(x_{k,g}) - f(u_{k,g})| \end{aligned}$$
(25)
$$\begin{aligned}{} & {} \quad w _ { k } ^{ g+1 } = \left\{ \begin{array} { l l } { w _ { k }^{ g } ,} &{} { \text {if}\ f(x_{k,g})< f(u_{k,g}) } \\ { w _ { k }, } &{} { \text {if}\ f(x_{k,g})> f(u_{k,g})\ \text {and}\ \text {rand}<0.5 } \\ { \delta _{1} \cdot w _ { k } + \delta _{2} \cdot w_{k}^{g},} &{} { \text {otherwise} }\end{array} \right. \nonumber \\ \end{aligned}$$
(26)

where \(\Delta d _ { k , g }\) is the fitness improvement and is used to influence parameter adaptation, and \(\delta _{1}\) and \(\delta _{2}\) are the weight combination coefficients.

First, if the current generation succeeds in generating a better solution, the weight survives into the next generation, i.e., \(w_{k}^{g+1} = w_{k}^{g}\). Second, if the generation does not succeed in generating a better solution and the random number \(\text {rand}\) is less than 0.5, the weight is discarded and a new weight is generated through \(\Delta d _ { k , g }\). Third, if the current generation does not succeed in generating a better solution and the random number \(\text {rand}\) is greater than 0.5, the current weight is combined with the regenerated weight to compute the weight for the next generation.
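A sketch of this weighted adaptation, following Eqs. (22)–(26) exactly as printed (including the inequality convention of Eq. (26)), is given below; the vectorization and function names are assumptions.

```python
import numpy as np

def weighted_lehmer_mean(S, w):
    """Eqs. (22)-(23): weighted Lehmer mean of successful F or CR values."""
    S, w = np.asarray(S, dtype=float), np.asarray(w, dtype=float)
    return np.sum(w * S**2) / np.sum(w * S)

def update_weights(w_prev, f_parent, f_trial, delta1=0.8, delta2=0.2, rng=None):
    """Eqs. (24)-(26): compute fresh weights from the fitness improvements, then keep,
    replace, or blend them according to the cases of Eq. (26)."""
    rng = np.random.default_rng() if rng is None else rng
    f_parent, f_trial = np.asarray(f_parent, dtype=float), np.asarray(f_trial, dtype=float)
    delta = np.abs(f_parent - f_trial)                  # Eq. (25)
    w_new = delta / delta.sum()                         # Eq. (24)
    w_next = np.empty_like(w_new)
    for k in range(len(w_new)):
        if f_parent[k] < f_trial[k]:                    # first case of Eq. (26)
            w_next[k] = w_prev[k]
        elif f_parent[k] > f_trial[k] and rng.random() < 0.5:   # second case of Eq. (26)
            w_next[k] = w_new[k]
        else:                                           # otherwise: blend fresh and previous weights
            w_next[k] = delta1 * w_new[k] + delta2 * w_prev[k]
    return w_next
```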

In particular, MDE-ctd applies this parameter control method to the “DE/current-to-pbest/1” mutation strategy and continues to use the parameter adaptive mechanism in JADE on other mutation strategies.

Algorithm 1
figure b

Pseudo-code of MDE-ctd.

Complexity analysis

According to the pseudo-code given in Algorithm 1, the time complexity of MDE-ctd is analyzed as follows. \(\text {NP}\) is the population size, D is the problem dimension, and \(G_{\max }\) is the maximum number of generations. The time complexity of classical DE can be expressed as \(O(G_{\max }\times \text {NP}\times D)\). As stated in [36], MPEDE only extends the JADE mutation strategy and parameter adaptation, so the total complexity of both MPEDE and JADE is \(O( G _ { \max } \times \text {NP}\times [D + \log (\text {NP})])\). The main differences between MDE-ctd and MPEDE are the grouping method, the integrated mutation strategy and the adaptive mechanism. The contribution-degree-based grouping method and the HSW-based adaptive parameter settings (F and \(\text {CR}\)) only call the previous data in the archive and do not increase the time complexity of the whole algorithm. The multi-population and multi-strategy integration method maintains the same complexity as in MPEDE. Therefore, the overall running complexity of MDE-ctd is \(O( G _ { \max } \times \text {NP}\times [D + \log (\text {NP}) ])\).

Table 1 Parameter configuration of each tested algorithm
Table 2 Comparison results of solution accuracy on CEC2005 benchmark (30D)

Experimental studies

In order to fully evaluate the performance of MDE-ctd, several experiments were conducted on the CEC 2005 global optimization benchmark function suite (30D, 50D) [37] and the CEC 2014 global optimization benchmark function suite (30D) [38]. Several state-of-the-art algorithms (CoDE [34], JADE [16], SAKPDE [39], MPMSDE [22], EPSDE [35], SHADE [17], EDEV [21] and MPEDE [24]) are compared with MDE-ctd. In addition, contribution degree analysis, strategy effectiveness tests and parameter sensitivity analysis are conducted to examine the robustness of MDE-ctd.

Benchmark functions and experimental settings

The 25 benchmark functions of CEC2005 can be divided into four categories: unimodal functions (F1–F5), basic multimodal functions (F6–F12), expanded multimodal functions (F13–F14), and hybrid composition multimodal functions (F15–F25). The 30 benchmark functions of CEC2014 can be divided into four categories: unimodal functions (F1–F3), simple multimodal functions (F4–F16), hybrid multimodal functions (F17–F22), and composition multimodal functions (F23–F30). For detailed information on the benchmark functions, see [37, 38].

The parameters of MDE-ctd are set to NP = 210, ng = 5, and c = 0.1. The initial F and \(\text {CR}\) values are set to 0.5 for the archival sub-population, and the initial F value of the exploratory sub-population is set to 0.5. In the integrated sub-population, \(\text {CR}\) ranges over [0.1, 0.9] and F ranges over [0.4, 0.9]. The values of \(\delta _ {1}\) and \(\delta _ {2}\) in the HSW-based parameter adaptation are 0.8 and 0.2, respectively. \(MaxFes = 10{,}000\times D\) is chosen as the maximum number of fitness evaluations, where D is the dimension of the benchmark function. The parameter configurations of all comparison algorithms (including NP, F and CR values and various additional parameters) were set according to the guidelines in their original publications; these parameters are shown in Table 1.

Due to the algorithms' randomness, each algorithm was independently run 25 times for statistical comparison. The mean value and standard deviation are calculated to assess each algorithm's performance. For each problem, the best result is shown in bold. The results of MDE-ctd are compared with those of CoDE, JADE, SAKPDE, MPMSDE, EPSDE, SHADE, EDEV and MPEDE by the Wilcoxon rank-sum test at the 0.05 significance level. The marker “−” indicates a result worse than that of MDE-ctd, “\(+\)” indicates a better result, and “\(\approx \)” indicates an equivalent result.

Table 3 Comparison results of solution accuracy on CEC2005 benchmark (50D)

Comparison with advanced DE variants on CEC2005 for 30D/50D problems

In Tables 2 and 3, MDE-ctd is compared with eight DE variants on the CEC2005 benchmark function suite (30D and 50D), respectively. Several observations and conclusions can be drawn from the analysis of the experimental results. First, for the unimodal functions (F1–F5) at 30D, MDE-ctd was superior to almost all other algorithms and only slightly worse than SHADE on F5. In the case of 50D, all algorithms except CoDE and EPSDE can find the global optimal solution of F1. MDE-ctd was superior to CoDE, MPMSDE, EPSDE and MPEDE on F2; on F3 it was inferior only to MPMSDE, SHADE and EDEV and superior to the other five algorithms; and it was better than all other algorithms on F4–F5. For unimodal functions, MDE-ctd converges quickly and can find the optimal solution rapidly. This is because the “DE/current-to-pbest/1” strategy in the archival sub-population converges quickly on unimodal problems, and more computing resources are allocated to the archival sub-population through the contribution degree, which improves the overall convergence ability of MDE-ctd.

Secondly, for the basic multimodal functions F6–F12 at 30D, MDE-ctd is inferior to SAKPDE and SHADE on F6 and superior to all other algorithms on F7. JADE, MPMSDE, EPSDE, SHADE, EDEV, MPEDE, and MDE-ctd have the same performance on F8, and all nine algorithms can find the global optimal solution on F9. MDE-ctd was superior to all algorithms except SHADE on F10, superior to EPSDE and SHADE on F11, and outperformed all other algorithms on F12. In the case of 50D, MDE-ctd was superior to all other algorithms on F7, F10 and F12, and second only to EDEV on F6. On F8, all algorithms except CoDE, SAKPDE and SHADE maintained the same performance. MDE-ctd, JADE, MPMSDE, SHADE and EDEV can find the global optimal solution on F9, and MDE-ctd was superior to JADE, EPSDE and SHADE on F11.

Fig. 2
figure 2

Convergence curves of 9 algorithms on F3, F4, F7, F10, F12, F16 representative test functions of 30D

Fig. 3
figure 3

Convergence curves of 9 algorithms on F17, F18, F19, F20, F22, F23 representative test functions of 30D

Thirdly, for the expanded multimodal functions F13–F14, SHADE performed best on F13. On F14 in the 30D case, CoDE, JADE, SAKPDE, MPMSDE, EDEV, MPEDE, and MDE-ctd maintained the same performance, and in the 50D case, JADE, MPMSDE, SHADE, EDEV, MPEDE and MDE-ctd had the same performance. For multimodal functions, MDE-ctd can balance exploration and exploitation and maintain population diversity through the multi-population and multi-strategy integration method.

Finally, for the hybrid composition multimodal functions F15–F25 at 30D, MDE-ctd was second only to EPSDE and MPEDE on F15 and F17, respectively, and maintained an advantage on F16, F18–F23 and F25. At 50D, MDE-ctd was second only to SAKPDE and EPSDE on F15, second only to EPSDE on F18–F20, superior to all other algorithms on F16, and second only to MPMSDE and MPEDE on F17. SAKPDE and MPEDE were superior to all other algorithms on F21, MDE-ctd and EPSDE were superior to the other seven algorithms on F22, and MDE-ctd, SAKPDE, MPMSDE, EDEV and MPEDE were superior to the other four algorithms on F25. For hybrid composition multimodal functions, MDE-ctd can not only ensure that the best mutation strategy obtains enough computing resources, but also use the multi-population and multi-strategy integration method to search more of the space and improve diversity, so as to escape local optima.

In addition, the convergence rate is also a key indicator for these algorithms; the convergence curves of the comparison algorithms on some selected 30D functions are plotted in Figs. 2 and 3 to observe their evolution. The performance index is the mean function error \((f(x) - f(x^{*}))\), where x is the current best solution found by evolution and \(x^{*}\) is the global optimum of the function. As seen in Figs. 2 and 3, MDE-ctd converges quickly and is superior to the other DE algorithms on the unimodal functions F3 and F4. On the basic multimodal functions F7 and F12, MDE-ctd obtains better solutions than the other DE algorithms. On F10, MDE-ctd has a faster convergence rate in the early stage of evolution. On the hybrid composition functions F16–F20 and F22–F23, the convergence rate of MDE-ctd on F16, F18, F19 and F22 is a little slower in the early stage of evolution, but it obtains better solutions than the other DE algorithms in the late stage. On F17, MPEDE and MDE-ctd obtain the best solutions. On F23, EPSDE converges slowly, while the other DE algorithms achieve a similar convergence speed.

In conclusion, MDE-ctd has obvious advantages over several other state-of-the-art DE algorithms. The reason is that MDE-ctd evolves multiple mutation strategies simultaneously while letting them influence one another. These mutation strategies can also work independently, enabling MDE-ctd to solve different types of optimization problems. In addition, the contribution degree grouping method used in MDE-ctd allocates computing resources to the different sub-populations more reasonably, so MDE-ctd obtains feedback during evolution and locates the optimal solution faster than other multi-population or multi-strategy algorithms.

Comparison with advanced DE variants on CEC2014 for 30D problems

MDE-ctd was compared with eight state-of-the-art DE variants on CEC 2014 (30 benchmark functions) to test its performance. The comparison of MDE-ctd and the eight DE variants is summarized in Table 4. The following conclusions were drawn from the experiments.

Table 4 Comparison results of solution accuracy on CEC2014 benchmark (30D)

On the unimodal functions F1–F3, MDE-ctd performs better than the other algorithms. On F2, MDE-ctd, CoDE, SAKPDE, MPMSDE, EDEV, and MPEDE can all locate the global optimal solution. On F3, only SAKPDE, EDEV and MDE-ctd can do so.

On the simple multimodal functions F4–F16, MDE-ctd outperformed the other algorithms on F4 and F13. On F5, seven algorithms have the same performance. On F6 and F14, MDE-ctd was second only to SHADE and EDEV, respectively. On F7, MDE-ctd, SAKPDE, MPMSDE, CoDE, EDEV and MPEDE can find the global optimal solution. On F9, it was surpassed only by JADE and SHADE, while on F12, MDE-ctd outperformed SAKPDE, MPMSDE, EPSDE, EDEV, and MPEDE. On F15, MDE-ctd has the same performance as JADE and outperforms MPMSDE, EPSDE, EDEV, and MPEDE.

On the hybrid multimodal functions F17–F22, MDE-ctd was second only to MPEDE on F17 and F18, superior to JADE, EPSDE, EDEV and MPEDE on F19, and outperformed all other algorithms on F20–F22. On the composition multimodal functions F23–F30, all nine algorithms have the same performance on F23. On F24, F25 and F26, MDE-ctd achieved a comparable level of performance with six algorithms, respectively. On F27–F30, only EPSDE on F29 has a performance similar to that of MDE-ctd; in the other cases, MDE-ctd is superior to the other algorithms.

These results show that MDE-ctd performs well on the CEC2014 benchmark functions (unimodal functions F1–F3, hybrid functions F17–F22 and composition functions F23–F30). Indeed, for unimodal functions, the contribution-based grouping method used by MDE-ctd can ensure that the best mutation strategy obtains more computing resources. For hybrid and composition functions, the multi-population and multi-strategy integration method can maintain good population diversity and increase the ability of the algorithm to jump out of local optima.

Contribution degree analysis

In order to study the contribution degree of the three sub-populations (archival sub-population, exploratory sub-population, integrated sub-population) of MDE-ctd in the whole evolutionary process, comparative experiments were conducted on the CEC2005 (30D and 50D) benchmark function suites. The experimental results are summarized in Table 5 and Fig. 4. Due to the randomness of the algorithm, MDE-ctd was run independently 25 times. The contribution degree of each sub-population in the evolution process was evaluated by calculating the proportion of its average contribution degree to the total. For clarity, the sub-populations with the highest contribution are marked in boldface.

In different benchmark functions and different stages of evolution, the contribution degrees of the three sub-populations are completely different. At 30D, the contribution of the archival sub-population on F1–F9, F11–F15, and F25 was larger than those of the exploratory sub-population and the integrated sub-population. On F10 and F21, the exploratory sub-population contributed more than the other two sub-populations. The contributions of the archival sub-population and the integrated sub-population on F16–F17 are similar, and the contribution of the integrated sub-population is always superior to those of the archival and exploratory sub-populations on F18–F20. In the case of 50D, the contribution of the archival sub-population was higher than those of the exploratory and integrated sub-populations on F1–F7, F9–F17, F19–F20 and F25. Among the remaining functions, the contribution degree of the integrated sub-population was always better than those of the archival and exploratory sub-populations. Therefore, during evolution, the contribution degree can reflect the interaction between the constituent strategies to a certain extent, and allocating computing resources to the different sub-populations according to their contributions is conducive to the full use of computing resources.

Table 5 Sub-population contribution on CEC2005 benchmark (30D)

Strategy effectiveness

In order to verify the effectiveness of the methods proposed in this manuscript, MDE-ctd(c1), MDE-ctd(c2) and standard MDE-ctd were compared on the CEC2005 functions. MDE-ctd(c1) is MDE-ctd without the grouping method based on contribution degree, while MDE-ctd(c2) is MDE-ctd without the multi-population and multi-strategy integration method. Each algorithm was run independently 25 times, and the mean value and standard deviation were calculated to evaluate performance. The experimental results are shown in Table 6, Figs. 5 and 6.

The experimental results show that MDE-ctd(c1) performs poorly on the unimodal functions (F1–F5), basic multimodal functions (F6–F12), and expanded multimodal functions (F13–F14). Not only is its rate of convergence slow, but it also fails to locate the optimal solution.

This is because MDE-ctd(c1) uses the equal grouping method to allocate the same computing resources to each sub-population. For different optimization problems, the equal grouping method cannot effectively use the feedback information to adjust the population sizes in time, and it cannot drive the entire population toward the optimal solution, thus reducing the overall convergence speed of the algorithm and its ability to find the optimal solution. Standard MDE-ctd uses dynamic regrouping based on the contribution degree, and the experiments show that both its convergence rate and its ability to find the global optimal solution are better than those of MDE-ctd(c1).

Fig. 4
figure 4

Contribution degree of sub-population to different test functions

Table 6 Effectiveness of strategies on the CEC2005 benchmark (30D)
Fig. 5
figure 5

Convergence curves of MDE-ctd(c1), MDE-ctd(c2), MDE-ctd on CEC2005 benchmark functions F2–F13 with 30D

Fig. 6
figure 6

Convergence curves of MDE-ctd(c1), MDE-ctd(c2), MDE-ctd on CEC2005 benchmark functions F14–F25 with 30D

On the hybrid composition functions (F15–F25), the performance of MDE-ctd(c1), MDE-ctd(c2) and standard MDE-ctd is similar. This is because there are many local optima in highly complex hybrid composition functions. Reallocating computing resources alone cannot make MDE-ctd jump out of local optima; multiple strategies are needed to solve highly complex optimization problems. Therefore, this paper also proposes a multi-population and multi-strategy integration method for MDE-ctd, which can reduce the impact of using multiple mutation strategies on the convergence speed of the algorithm while maintaining population diversity. Table 6 and Figs. 5 and 6 show that the multi-population and multi-strategy integration method used alone has a good effect on the hybrid composition functions (F15–F25). Using this method, MDE-ctd has powerful search capability in the evolutionary process.

Parameter sensitivity

The influence of the parameters ng and NP on the performance of MDE-ctd was analyzed. Different combinations of ng and NP values were compared on the CEC2005 benchmark function suite, and the mean value was used to judge the results. The sensitivity analysis results of the parameters ng and NP are shown in Table 7 and Fig. 7.

In the parameter sensitivity analysis, when one parameter is analyzed, the other parameter is set to its standard value (i.e. \(ng = 5\) or \( NP = 210\)). From the analysis results of the parameters ng and NP in Table 7 and Fig. 7, it can be found that MDE-ctd is sensitive to the parameters ng and NP on many benchmark functions. If the parameter ng is too large, regrouping cannot reflect the search situation in time; if the parameter ng is too small, the frequent regrouping makes the algorithm unable to effectively evaluate the evolution. Compared with the ng parameter, the NP parameter has a greater impact on the performance of MDE-ctd. Traditionally, it is believed that a larger population is conducive to solving multimodal optimization problems, while a smaller population is conducive to solving unimodal optimization problems. However, our experiments show that this is not always the case. For example, when dealing with the unimodal function F3, the performance of MDE-ctd increases as the population grows, whereas when dealing with the multimodal function F11, the performance of MDE-ctd increases as the population shrinks.

According to the analysis of the experimental data, when \(ng = 5\), it can not only reflect the search situation in time, but also reduce the population grouping frequency and effectively evaluate the evolution process. When \(\text {NP}=210\), it can well reduce the impact of population size on the algorithm in the evolution process. Therefore, when \(ng=5\) and \(\text {NP}=210\), the comprehensive performance of MDE-ctd is the best.

Conclusion and future work

This work proposed an integrated differential evolution of multiple populations based on contribution degree, known as MDE-ctd, in order to address the deficiencies of the original differential evolution approach in computing resource allocation and in handling highly complex optimization problems. In MDE-ctd, three sub-populations (archival sub-population, exploratory sub-population, and integrated sub-population) co-evolve at the same time and use a variety of mutation strategies with different advantages to balance exploration and exploitation. During evolution, the contribution of each sub-population to the optimization problem is tracked by the contribution degree, and a dynamic grouping mechanism is adopted to maximize the advantages of the different sub-populations and improve the global search ability of MDE-ctd. This effectively realizes the dynamic allocation of computing resources among the different sub-populations, allocating more computing resources to the best sub-population in time and thus reducing the waste of computing resources.

Table 7 Computational results of MDE-ctd with different ng and NP settings over benchmark functions with 30 variables
Fig. 7
figure 7

Bar comparison results of different parameters (ng and NP)

In future research, the computing resource regrouping method based on contribution degree can provide a new scheme for other multi-population DE algorithms. Meanwhile, MDE-ctd can be applied to real application problems, such as cloud computing resource scheduling, to further test its performance.