1 Introduction

Differential evolution (DE), proposed by Price et al. (2006), is a well-known population-based evolutionary algorithm for solving global optimisation problems over continuous spaces. The literature indicates that DE exhibits very good performance over a wide variety of optimisation problems (Das and Suganthan 2011). However, although it is a very efficient optimiser, its local search ability has long been questioned, and work has been done to improve its local convergence by combining DE with local optimisation strategies (Qing 2010).

In previous works by the authors, Locatelli and Vasile (2015) and Vasile et al. (2011), it was demonstrated that DE can converge to a fixed point, a level set or a hyperplane that does not contain the global minimum. The collapse of the population to a fixed point, or to a neighbourhood of a fixed point from which DE cannot escape, was one of the motivations for the development of the Inflationary Differential Evolution Algorithm (IDEA) (Vasile et al. 2011).

IDEA is based on the hybridisation of DE with the restarting procedure of monotonic basin hopping (MBH) (Wales and Doye 1997); it implements both a local restart in the neighbourhood of a local minimum and a global restart in the whole search space. IDEA was shown to give better results than a simple DE, but its performance depends on the parameters controlling both the DE and MBH heuristics (Vasile et al. 2011). These parameters are the crossover probability \(\textit{CR}\), the differential weight F, the radius of the local restart bubble \(\delta _{\mathrm{local}}\) and the number of local restarts \(n_{\mathrm{LR}}\), whose best settings are problem dependent. Different adaptive mechanisms for adjusting \(\textit{CR}\) and F during the search process can be found in the literature (Brest et al. 2006, 2013; Liu and Lampinen 2005; Omran et al. 2005); a parameter-less adaptive evolutionary algorithm has been presented in Papa (2013). However, no approach has been proposed so far to adapt \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\). In this paper, we present a simple mechanism to adapt \(\textit{CR}\) and F within a single population and a multi-population strategy to adapt \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\). The multi-population version of IDEA is hereafter called MP-AIDEA (Multi-Population Adaptive Inflationary Differential Evolution Algorithm).

The resulting algorithm was extensively tested over 51 test problems from the single-objective global optimisation competitions of the Congress on Evolutionary Computation (CEC) 2005, 2011 and 2014. Tests to assess the performance of the algorithm include rankings, the Wilcoxon test and the success rate. It will be shown that the adaptive version of IDEA always ranks among the three best algorithms in every competition and for every number of dimensions, except for the CEC 2014 test set in 30 dimensions. Furthermore, it will be shown that the simple adaptation of \(\textit{CR}\) and F within a single population can outperform the multi-population version with adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) if these two parameters are properly chosen.

This paper extends the work presented in Di Carlo et al. (2015), where the basic mechanisms that constitute MP-AIDEA were introduced and the performance of MP-AIDEA was measured only by a relative ranking against other algorithms. This paper provides a more detailed explanation of all the mechanisms and heuristics inside MP-AIDEA; moreover, it presents an extensive empirical assessment of its performance, using several metrics in addition to the relative ranking. As part of this performance evaluation, we compare MP-AIDEA against a number of other algorithms and against a single-population version of MP-AIDEA with no adaptive local restart. Detailed results obtained for each test function are also presented, so that the paper can be used as a reference for comparison against other algorithms.

The paper starts by stating the problem we are trying to solve in Sect. 2 and by briefly introducing the basic principles and fundamental theoretical developments underneath inflationary differential evolution in Sect. 3. The adaptation mechanisms are presented in Sect. 4, and the resulting adaptive multi-population version of IDEA, MP-AIDEA, in Sect. 5. The test cases are presented in Sect. 6, and the corresponding results in Sect. 6.1. Finally, the paper presents the results of all the comparative tests in Sects. 6.2, 6.3 and 6.4. Section 7 concludes the paper.

2 Problem statement

This paper is concerned with the following class of global minimisation problems with box constraints:

$$\begin{aligned} \min _{{\mathbf {x}}\in B} f({\mathbf {x}}) \end{aligned}$$
(1)

with \(f:B \subseteq {\mathbb {R}}^{n_\mathrm{D}}\rightarrow {\mathbb {R}}\), \(n_\mathrm{D}\) the number of dimensions and the box B defined by the upper and lower boundaries \({\mathbf {x}}_{\mathrm{lower}} \le {\mathbf {x}} \le {\mathbf {x}}_{\mathrm{upper}}\). In the following, we will use a gradient-based local search algorithm; therefore, we further require that \(f\in C^2(B)\). Note, however, that this is not a strict requirement, as we can show that the algorithm also works when a finite number of non-differentiable points exists.

3 Inflationary differential evolution

This section briefly recalls the working principles of inflationary differential evolution and presents the parameters that the algorithm proposed in this paper adapts. Following the notation introduced in Vasile et al. (2011), we express the general DE process as a discrete dynamical system. The governing equation, for the i-th individual at generation k, is expressed as:

$$\begin{aligned} {\mathbf {x}}_{i,k+1}={\mathbf {x}}_{i,k}+S({\mathbf {x}}_{i,k}+{\mathbf {u}}_{i,k},{\mathbf {x}}_{i,k}) {\mathbf {u}}_{i,k} \end{aligned}$$
(2)

with

$$\begin{aligned} \begin{aligned} {\mathbf {u}}_{i,k} ={\mathbf {e}}&\left[ G {\mathbf {x}}_{r_1,k} + (1 - G) {\mathbf {x}}_{i,k} +F({\mathbf {x}}_{r_2,k}-{\mathbf {x}}_{r_3,k})\right. \\&+\,\left. (1-G) F ({\mathbf {x}}_{\mathrm{best},k}-{\mathbf {x}}_{i,k}) - {\mathbf {x}}_{i,k}\right] \end{aligned} \end{aligned}$$
(3)

where G can be either 0 or 1 [with \(G = 1\) corresponding to the DE strategy DE/rand and \(G = 0\) corresponding to the DE strategy DE/current-to-best (Price et al. 2006)]. In Eq. (3), \(r_1\), \(r_2\) and \(r_3\) are the indices of individuals randomly chosen in the population, and \({\mathbf {e}}\) is a mask of randomly generated 0s and 1s defined according to:

$$\begin{aligned} e_t= {\left\{ \begin{array}{ll} 1, &{} \text { if } U\le \textit{CR} \\ 0, &{} \text { if } U> \textit{CR} \end{array}\right. } \qquad t = 1, \dots , n_\mathrm{D} \end{aligned}$$
(4)

U is a random number drawn from a uniform distribution on [0, 1]. The product between \({\mathbf {e}}\) and the term in square brackets in Eq. (3) is to be understood component-wise. In this work, given \(u_{t,i,k}\), the t-th component of the trial vector \({\mathbf {u}}_{i,k}\), the following correction is applied to satisfy the box constraints (Zhang and Sanderson 2009):

$$\begin{aligned} u_{t,i,k} = {\left\{ \begin{array}{ll} \left( x_{t,i,k} + x_{t,\mathrm{lower}}\right) /2, &{} \text { if } u_{t,i,k} < x_{t,\mathrm{lower}}\\ \left( x_{t,i,k} + x_{t,\mathrm{upper}}\right) /2, &{} \text { if } u_{t,i,k} > x_{t,\mathrm{upper}} \end{array}\right. } \end{aligned}$$
(5)

The selection function S is defined as:

$$\begin{aligned} S({\mathbf {x}}_{i,k}+{\mathbf {u}}_{i,k},{\mathbf {x}}_{i,k})=\Big \{ \begin{array}{l} 1 \;\; \text {if} \;\; f({\mathbf {x}}_{i,k}+{\mathbf {u}}_{i,k})<f({\mathbf {x}}_{i,k})\\ 0 \;\; \text {otherwise} \end{array} \end{aligned}$$
(6)
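For illustration, one generation of the map in Eqs. (2)–(6) can be sketched as follows. This is a minimal Python sketch, not the authors' MATLAB implementation: the function name, scalar box bounds and default parameter values are assumptions, and the candidate increment is read as relative to \({\mathbf {x}}_{i,k}\), so that the mask \({\mathbf {e}}\) reproduces binomial crossover on the mutated components.

```python
import numpy as np

def de_step(pop, f, F=0.8, CR=0.9, G=1, lower=-5.0, upper=5.0, seed=0):
    """One generation of the DE map of Eqs. (2)-(6) for a single population.

    pop is an (N_pop, n_D) array; G=1 gives DE/rand, G=0 DE/current-to-best.
    """
    rng = np.random.default_rng(seed)
    n_pop, n_d = pop.shape
    fit = np.array([f(x) for x in pop])
    best = pop[np.argmin(fit)]
    out = pop.copy()
    for i in range(n_pop):
        # r1, r2, r3: indices of individuals other than i (Eq. (3))
        r1, r2, r3 = rng.choice(np.delete(np.arange(n_pop), i), size=3, replace=False)
        e = (rng.random(n_d) <= CR).astype(float)          # crossover mask, Eq. (4)
        u = e * (G * pop[r1] + (1 - G) * pop[i]
                 + F * (pop[r2] - pop[r3])
                 + (1 - G) * F * (best - pop[i])
                 - pop[i])                                 # increment, Eq. (3)
        trial = pop[i] + u
        # Box-constraint correction of Eq. (5), applied to the trial vector
        below, above = trial < lower, trial > upper
        trial[below] = 0.5 * (pop[i][below] + lower)
        trial[above] = 0.5 * (pop[i][above] + upper)
        if f(trial) < fit[i]:                              # selection S, Eq. (6)
            out[i] = trial
    return out
```

Note that the selection operator only ever accepts strict improvements, which is the property exploited by Proposition 1 below.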

In the general case in which the indices \(r_{1}\), \(r_{2}\) and \(r_{3}\) can assume any value, in Vasile et al. (2011) it was demonstrated that the population can converge to a fixed point different from a local minimum or to a level set. Furthermore, in Locatelli and Vasile (2015) it was demonstrated that DE can converge to a hyperplane that does not contain the global minimum. Finally, consider the following proposition.

Proposition 1

Consider the subset \(\varPsi =\{{\mathbf {x}}\in B: f({\mathbf {x}})\le {\bar{f}}\}\) and the superset \(\phi \) such that:

1. \(\varPsi \subset \phi \)

2. \({\mathbf {x}}_{i,k+1}\in \phi , \forall i\)

3. \(\forall {\mathbf {y}}\in \phi \setminus \varPsi , f({\mathbf {y}})>{\bar{f}}\)

Then, if the population at iteration k is entirely contained in \(\varPsi \), it cannot escape from \(\varPsi \) at any future iteration.

Proof

The proof descends from the definition of S. Suppose that a candidate individual \({\mathbf {x}}_{i,k}+{\mathbf {u}}_{i,k}\) generated by map (2) fell in \(\phi \setminus \varPsi \); because of point 3 of the proposition, its objective value would be greater than \({\bar{f}}\), and hence greater than that of its parent in \(\varPsi \), so the candidate would be rejected by the selection operator. \(\square \)

Therefore, when the population contracts within a ball \(B_\mathrm{c}\subseteq \varPsi \) of radius \(\rho _\mathrm{l}\), DE can only converge to a point or a subset within \(B_\mathrm{c}\). In the following, we call \(\rho _\mathrm{l}\) the contraction limit.

In inflationary differential evolution, the DE heuristic is iterated until the population reaches the contraction limit. A local search is then started from the best individual in the population \({\mathbf {x}}_{\mathrm{best}}\), the corresponding local minimum \({\mathbf {x}}_{\mathrm{LM}}\) is saved in an archive of local minima A, and the population is restarted in a bubble \(B_\mathrm{R}\) of radius \(\delta _{\mathrm{local}}\) around the local minimum \({\mathbf {x}}_{\mathrm{LM}}\). This mechanism is borrowed from the basic logic underneath monotonic basin hopping (Wales and Doye 1997). To assess whether the contraction condition is satisfied, the maximum distance between all possible pairs of individuals of the population at generation k, \(\rho ^{(k)}\), is computed:

$$\begin{aligned} \rho ^{(k)} = \max \left( || {\mathbf {x}}_{i,k} - {\mathbf {x}}_{l,k}||\right) ,\quad i,l = 1, \dots , N_{\mathrm{pop}} \end{aligned}$$
(7)

where \(N_{\mathrm{pop}}\) is the number of individuals in the population. Contraction is verified when \(\rho ^{(k)}\le {\bar{\rho }} \rho _{\mathrm{max}}\), where \(\rho _{\mathrm{max}}=\max _k \rho ^{(k)}\) is the maximum value of \(\rho ^{(k)}\) recorded up to generation k and \({\bar{\rho }}\), the contraction threshold, is one of the parameters of the algorithm. This contraction criterion is consistent with Proposition 1 under the assumption that \(\rho _\mathrm{l}={\bar{\rho }} \rho _{\mathrm{max}}\).
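The contraction test can be sketched as follows; this is an illustrative Python sketch, with an assumed function name and a hypothetical default value for the threshold \({\bar{\rho }}\).

```python
import numpy as np

def check_contraction(pop, rho_max, rho_bar=0.2):
    """Evaluate the contraction condition of Sect. 3.

    rho_k is the maximum pairwise distance in the population (Eq. (7));
    contraction holds when rho_k <= rho_bar * rho_max, with rho_max the
    largest rho recorded so far and rho_bar the contraction threshold
    (0.2 is a hypothetical value). Returns (contracted, updated rho_max).
    """
    diff = pop[:, None, :] - pop[None, :, :]
    rho_k = float(np.sqrt((diff ** 2).sum(axis=-1)).max())
    rho_max = max(rho_max, rho_k)
    return rho_k <= rho_bar * rho_max, rho_max
```

Since \(\rho _{\mathrm{max}}\) grows monotonically, a population that has spread out at any point in the past must shrink well below its largest recorded extent before being declared contracted.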

After a number \(n_{\mathrm{LR}}\) of such local restarts without any improvement of the current best solution, the population is restarted globally in the search space, so that every individual is initially at a distance of at least \(\sqrt{n_\mathrm{D}} \delta _{\mathrm{global}}\) from the centres of the clusters of the local minima in A, which by then collects all the local minima found so far. During local restarts, the most important information is preserved in the local minimum: the assumption is that the basin of attraction of that local minimum has already been explored and that this exploration led to the convergence of the population to \(B_\mathrm{c}\). When the population is restarted globally, the essential information, i.e. all the local minima, is stored in the archive A: here the assumption is that IDEA has completely explored a funnel structure, resulting in a cluster of minima.

These restart procedures were proven to be very effective on a series of difficult real-world problems in which the landscape presents multiple funnels (see Vasile et al. 2011 for additional details).

The complete inflationary differential evolution process with trial vector (3) is governed by the following key parameters: \(N_{\mathrm{pop}}\), \(\textit{CR}\), F, G, \({\bar{\rho }}\), \(\delta _{\mathrm{local}}\), \(n_{\mathrm{LR}}\) and \(\delta _{\mathrm{global}}\). From experience, we know that \(\delta _{\mathrm{global}}\) is not a critical parameter in most cases, while \(\textit{CR}\), F, \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) play a significant role and are not always easy to define. The parameters \(\textit{CR}\) and F are applied to update each individual in a population, while \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) are applied to restart the whole population. Therefore, in this paper we propose two adaptation mechanisms, one for \(\textit{CR}\) and F and one for \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\). The adaptation mechanisms for \(\textit{CR}\), F and \(\delta _{\mathrm{local}}\) produce the numerical values of these parameters to be used by the algorithm. On the contrary, \(n_{\mathrm{LR}}\) is replaced by a mechanism that allows the algorithm to decide when to perform a local or a global restart, so that the definition of a numerical value for \(n_{\mathrm{LR}}\) is no longer required.

4 Adaptation mechanisms

Because of the very nature of \(\textit{CR}\), F, \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\), the automatic adaptation of \(\textit{CR}\) and F requires only the evaluation of the success of each candidate increment \({\mathbf {u}}_{i,k}\), whereas the adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) requires the evaluation of the success of the restart of an entire population. Therefore, in this paper it is proposed to extend the working principle of inflationary differential evolution by evolving \(n_{\mathrm{pop}}\) populations in parallel, where \(n_{\mathrm{pop}}\) is defined a priori.

Each population adapts its own values of \(\textit{CR}\) and F. We use a stigmergic approach in which the \(\textit{CR}\) and F of each individual are drawn from a joint probability distribution, over a set of possible values of \(\textit{CR}\) and F, that evolves with the population.

All populations then concurrently adapt \(\delta _{\mathrm{local}}\) and the number of local restarts. More specifically, the adaptation mechanism of the local restart bubble evolves a probability distribution function over a range of possible values of \(\delta _{\mathrm{local}}\). Each population draws values from that probability distribution and, at each local restart, increases the probability associated with the value of \(\delta _{\mathrm{local}}\) that led to a transition from one local minimum to another. The range of \(\delta _{\mathrm{local}}\) is also adapted, by taking the mean and the minimum distance among the local minima in A.

The number of local restarts, instead, is dictated by the contraction of a population within the basin of attraction of an already identified local minimum. Given a local minimum \({\mathbf {x}}_{\mathrm{LM}}\in A\) and a list of \(n_{\mathrm{best},LM}\) best individuals from which a local search converged to \({\mathbf {x}}_{\mathrm{LM}}\), the size of the basin of attraction of \({\mathbf {x}}_{\mathrm{LM}}\) is defined as

$$\begin{aligned} d_{\mathrm{basin},LM} = \min _j || {\mathbf {x}}_{\mathrm{best},j} - {\mathbf {x}}_{\mathrm{LM}}||,\;\; j\in \{1,\dots ,n_{\mathrm{best},LM}\} \end{aligned}$$
(8)

Each local minimum \({\mathbf {x}}_{\mathrm{LM}}\) in A is therefore associated with a particular \(d_{\mathrm{basin},LM}\). Figure 1 illustrates this mechanism. Once \(d_{\mathrm{basin},LM}\) is estimated, every time the condition \(\rho _m^{(k)}\le {\bar{\rho }} \rho _{m,\mathrm{max}}\) is satisfied for population m, if the best individual \({\mathbf {x}}_{\mathrm{best},m}\) is at a distance lower than \(d_{\mathrm{basin},LM}\) from \({\mathbf {x}}_{\mathrm{LM}}\), no local restart is performed and the population is restarted globally in the search space. The number \(n_{\mathrm{best},LM}\) is set to 4 in this implementation.
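The decision rule just described can be sketched as follows; the helper names and the archive layout (a list of minimum/radius pairs) are illustrative assumptions.

```python
import numpy as np

def basin_radius(x_lm, best_points):
    """Estimate d_basin_LM (Eq. (8)): the minimum distance from x_LM of the
    best individuals whose local searches converged to x_LM."""
    return min(float(np.linalg.norm(np.asarray(b) - x_lm)) for b in best_points)

def restart_type(x_best, archive):
    """Decide the restart type for a contracted population (Sect. 4).

    `archive` is a list of (x_LM, d_basin_LM) pairs. If the best individual
    falls inside the estimated basin of an archived minimum, that basin is
    assumed already explored and the population is restarted globally;
    otherwise a local restart (preceded by a local search) is performed.
    """
    for x_lm, d_basin in archive:
        if float(np.linalg.norm(x_best - x_lm)) < d_basin:
            return "global"
    return "local"
```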

The overall algorithm, called Multi-Population Adaptive Inflationary Differential Evolution Algorithm (MP-AIDEA), is described in more detail in the following section.

Fig. 1 Identification of the basin of attraction of local minimum \({\mathbf {x}}_{\mathrm{LM}}\)

5 Multi-population adaptive inflationary differential evolution

(Algorithm 1)

MP-AIDEA is described in Algorithm 1. Let \(n_{\mathrm{pop}}\) be the number of populations and m the index identifying each population. With reference to Algorithm 1, after the initialisation of the main parameters and functionalities (Algorithm 1, line 1), MP-AIDEA starts by running \(n_{\mathrm{pop}}\) DE processes in parallel, one per population (Algorithm 1, line 3). During each evolution process, the parameters F and \(\textit{CR}\) are automatically adapted following the approach presented in Sect. 5.2. When a population m contracts within a ball \(B_\mathrm{c}\) of radius \({\bar{\rho }}\; \rho _{m,\mathrm{max}}\), the evolution of that population is stopped. Once all the populations have contracted, the position of the best individual of each population, \({\mathbf {x}}_{\mathrm{best},m}\), relative to the local minima in A, \({\mathbf {x}}_{\mathrm{LM}}\), is assessed (Algorithm 1, line 7). This step makes use of all the minima found by all populations and, therefore, has to be regarded as an information-sharing mechanism among populations. If the best individual of population m is not within the basin of attraction of any previously detected local minimum (that is, \(\forall LM \; : \; \Vert {\mathbf {x}}_{\mathrm{best},m} - {\mathbf {x}}_{\mathrm{LM}} \Vert > d_{\mathrm{basin},LM}\)), then a local search is run (Algorithm 1, line 8) and the resulting local minimum is stored in the archive A (Algorithm 1, line 16). The flag for the local restart, \(LR_m\), is set to 1. On the contrary, if the best individual of population m is inside the basin of attraction of a previously detected local minimum, the local search is not performed and \(LR_m\) is set to 0 (Algorithm 1, line 20).

Before running a local or a global restart (Algorithm 1, line 24), the probability distribution associated with \(\delta _{\mathrm{local}}\) and its range are updated (Algorithm 1, line 23). After restarting the population, if the maximum number of function evaluations, \(n_{\mathrm{feval,max}}\), is not exceeded, the process restarts from line 2 of Algorithm 1. Each part of Algorithm 1 is explained in more detail hereafter.

5.1 Initialisation

The steps for the initialisation of MP-AIDEA are presented in Algorithm 2. MP-AIDEA starts with the initialisation of \(n_{\mathrm{pop}}\) populations, with \(N_{\mathrm{pop}}\) individuals each, in the search space B. The number of function evaluations for each population is set to zero, \(n_{\mathrm{feval},m} = 0\), and \({\bar{\rho }}\) and \(\delta _{\mathrm{global}}\) are initialised to the values specified by the user. The counter of the number of local searches per population, \(s_m\), is set to 0.

(Algorithm 2)

5.2 Differential evolution and the adaptation of \(\textit{CR}\) and F

For each population m, a DE process is run (Algorithm 3, line 6), using Eqs. (2), (3), (4) and (6). The parameter G in Eq. (3) takes the value 0 or 1, each with probability 0.5. During the advancement from parents to offspring, each individual of the population is associated with a different value of \(\textit{CR}\) and F, drawn from a distribution \(\mathbf {CRF}_m^{(k_m)}\) (Algorithm 3, lines 1–3). \(\mathbf {CRF}_m^{(k_m=1)}\) is initialised as a uniform distribution with \((n_\mathrm{D}+1)^2\) points in the space \(\textit{CR} \in [0.1, 0.99]\) and \(F \in [-0.5, 1]\) (Algorithm 3, line 1). A Gaussian kernel is then allocated to each node, and a probability density function is built with the Parzen approach (Minisci and Vasile 2014). The values of \(\textit{CR}\) and F to be associated with the individuals of the population are drawn from this distribution (Algorithm 3, line 4). A change value dd linked to each kernel is initialised to zero (Algorithm 3, line 3) and is used during the advancement of the population from parents to children to adapt \(\textit{CR}\) and F (Algorithm 3, line 8). The adaptation of \(\textit{CR}\) and F is summarised in Algorithm 4 and described in the following.
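A possible sketch of this initialisation and sampling step follows; the kernel bandwidths and function names are assumptions, not the settings of Minisci and Vasile (2014).

```python
import numpy as np

def init_crf(n_d, cr_range=(0.1, 0.99), f_range=(-0.5, 1.0)):
    """Initialise the CRF distribution as a uniform grid of (n_d+1)^2 nodes
    over CR in [0.1, 0.99] and F in [-0.5, 1], with the change value dd of
    every kernel set to zero (Algorithm 3, lines 1 and 3)."""
    cr_nodes = np.linspace(cr_range[0], cr_range[1], n_d + 1)
    f_nodes = np.linspace(f_range[0], f_range[1], n_d + 1)
    crf = np.array([[c, w] for c in cr_nodes for w in f_nodes])
    dd = np.zeros(len(crf))
    return crf, dd

def sample_crf(crf, n_samples, bandwidth=(0.05, 0.1), seed=0):
    """Draw (CR, F) pairs from a Parzen-style density: choose a node at
    random and perturb it with a Gaussian kernel. The bandwidths here are
    illustrative assumptions."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(crf), size=n_samples)
    return crf[idx] + rng.normal(0.0, bandwidth, size=(n_samples, 2))
```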

(Algorithms 3 and 4)

For each individual i of each population m, the adaptation mechanism for \(\textit{CR}\) and F is started only if the child has an objective function value lower than the parent's, that is \(f({\mathbf {x}}_{m,i}^{(k_m+1)}) < f({\mathbf {x}}_{m,i}^{(k_m)}) \) (Algorithm 4, line 1). If this condition is verified, the difference in objective function between parent and child at subsequent generations, \(df_{m,i}^{(k_m+1)} = |f ( {\mathbf {x}}_{m,i}^{(k_m+1)} ) - f ({\mathbf {x}}_{m,i}^{(k_m)} ) |\), is computed (Algorithm 4, line 2). Then the sorted elements of \(\mathbf {CRF}_{m}^{(k_m)}\) are sequentially evaluated; the q-th value of \(\textit{CR}\) in \(\mathbf {CRF}_m^{(k_m)}\) is identified as \(\mathbf {CRF}_{m,q,1}^{(k_m)}\) and the q-th value of F is identified as \(\mathbf {CRF}_{m,q,2}^{(k_m)}\). The first time that \(dd_{m,q}^{(k_m)}\) (the dd value associated with the q-th row of \(\mathbf {CRF}_{m}^{(k_m)}\)) is lower than \(df_{m,i}^{(k_m+1)}\) (Algorithm 4, line 4), the differential weight \(F_{m,i}^{(k_m)}\) used for the individual \({\mathbf {x}}_{m,i}^{(k_m)}\) substitutes \(\mathbf {CRF}_{m,q,2}^{(k_m)}\) and \(df_{m,i}^{(k_m+1)}\) substitutes \(dd_{m,q}^{(k_m)}\) (Algorithm 4, lines 5 and 6). This is because \(F_{m,i}^{(k_m)}\) produced a larger decrease in the objective function than \(\mathbf {CRF}_{m,q,2}^{(k_m)}\) (as shown by \(df_{m,i}^{(k_m+1)} > dd_{m,q}^{(k_m)}\)). For \(\textit{CR}\), the value associated with \({\mathbf {x}}_{m,i}^{(k_m)}\) substitutes \(\mathbf {CRF}_{m,q,1}^{(k_m)}\) (Algorithm 4, line 8) only if \(df_{m,i}^{(k_m+1)}\) is greater than a given value CRC (Algorithm 4, line 7) (Minisci and Vasile 2014).
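The update just described can be sketched as follows; this is an illustrative reading of Algorithm 4, and the threshold CRC is left as a parameter since its value is not reported here.

```python
import numpy as np

def update_crf(crf, dd, cr_used, f_used, df, crc):
    """Sketch of the CR/F adaptation of Algorithm 4.

    Scans the nodes in order; at the first node q whose recorded improvement
    dd[q] is smaller than the improvement df achieved with (cr_used, f_used),
    f_used replaces the node's F value and df replaces dd[q]. CR is replaced
    only if df also exceeds the threshold crc.
    """
    for q in range(len(crf)):
        if dd[q] < df:
            crf[q, 1] = f_used
            if df > crc:
                crf[q, 0] = cr_used
            dd[q] = df
            break
    return crf, dd
```

In this way, nodes that produced small improvements are progressively overwritten by the \((\textit{CR}, F)\) pairs that proved more successful, which biases future sampling towards them.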

The DE stops according to the contraction condition presented in Sect. 3. In order to prevent an excessive use of resources when the population partitions, a fail-safe criterion was introduced that stops the DE after \(10\, n_\mathrm{D}\) generations (Algorithm 3, line 13).

5.3 Local search and restart mechanisms

After the evolution of all populations has stopped, MP-AIDEA checks whether the best individual of each population is inside the basin of attraction of any previously detected local minimum (see Algorithm 1, line 7). If that is not the case, a local search is performed from the best individual and the population is locally restarted within a hypercube with edge equal to \(2\delta _{\mathrm{local}}\) around the detected local minimum; otherwise, no local search is performed and the population is restarted globally in the whole search space (Algorithm 1, line 24). Prior to the implementation of the restart mechanisms, MP-AIDEA updates the estimation of the size of the basin of attraction of each minimum, the archive A (see Algorithm 1, lines 5 to 22) and the distribution over the possible values of \(\delta _{\mathrm{local}}\) (see Algorithm 1, line 23). In the following, the identification of the basin of attraction, the estimation of \(\delta _{\mathrm{local}}\) and the two restart mechanisms are described in more detail.

5.3.1 Identification of the basin of attraction

In order to mitigate the possibility of running multiple local searches that converge to already discovered local minima, MP-AIDEA estimates, for each local minimum in A, the radius of the basin of attraction of that local minimum. The radius of the basin of attraction is here defined as the distance \(d_{\mathrm{basin},LM}\) from a given local minimum \({\mathbf {x}}_{\mathrm{LM}}\) such that, if the best individual in population m, \({\mathbf {x}}_{\mathrm{best},m}\), is at a distance from \({\mathbf {x}}_{\mathrm{LM}}\) lower than \(d_{\mathrm{basin},LM}\), a local search starting from \({\mathbf {x}}_{\mathrm{best},m}\) would converge to \({\mathbf {x}}_{\mathrm{LM}}\).

The radius \(d_{\mathrm{basin},LM}\) is estimated with the simple procedure in Algorithm 1, lines 7 to 19. Once the evolution of all populations has stopped, the distance \(\Vert {\mathbf {x}}_{\mathrm{best},m}-{\mathbf {x}}_{\mathrm{LM}}\Vert \) of the best individual of each population with respect to all the minima in A is calculated and compared to the \(d_{\mathrm{basin},LM}\) associated with each local minimum in A; initially, all \(d_{\mathrm{basin},LM}\) are set to 0. If the distance \(\Vert {\mathbf {x}}_{\mathrm{best},m}-{\mathbf {x}}_{\mathrm{LM}}\Vert \) is greater than \(d_{\mathrm{basin},LM}\), a local search is started from \({\mathbf {x}}_{\mathrm{best},m}\). If the resulting local minimum \({\mathbf {x}}_{\mathrm{min},m}^{(s_m)}\) already belongs to A, the counter \(i_{\mathrm{LM}}\) is updated and the new estimate of the basin of attraction of \({\mathbf {x}}_{\mathrm{LM}}\) becomes \(d_{\mathrm{basin},LM}=\min [d_{\mathrm{basin},LM},\Vert {\mathbf {x}}_{\mathrm{best},m}-{\mathbf {x}}_{\mathrm{LM}}\Vert ]\). \({\mathbf {x}}_{\mathrm{min},m}^{(s_m)}\) belongs to A if \( \exists \; LM \; : \; \Vert {\mathbf {x}}_{\mathrm{min},m}^{(s_m)} - {\mathbf {x}}_{\mathrm{LM}} \Vert \le \varepsilon \varDelta \), where \(\varepsilon \) is set to \(10^{-3}\). If \(i_{\mathrm{LM}}\) exceeds a given maximum value and \(\Vert {\mathbf {x}}_{\mathrm{best},m}-{\mathbf {x}}_{\mathrm{LM}}\Vert < d_{\mathrm{basin},LM} \; \forall \; LM\), no local search and no local restart are performed. The counter \(i_{\mathrm{LM}}\) is initialised to 1 for every new local minimum and keeps track of the number of times a local minimum is discovered.

5.3.2 Adaptation of \(\delta _{\mathrm{local}}\)

When a population m is locally restarted, individuals are generated by taking a random sample, with Latin Hypercube sampling, within a hypercube with edge equal to \(2\delta _{local,m}\). The dimension \(\delta _{local,m}\) is drawn from a probability distribution that is progressively updated at every restart. We use a kernel approach with kernels centred in the elements of a vector \({\mathbf {B}}\) (see Algorithm 6) containing a range of possible values of \(\delta _{local,m}\). The vector \({\mathbf {B}}\) is initialised, with the procedure presented in Algorithm 5, once all populations have performed a local search for the first time and at every global restart. During initialisation, the distance between all the local minima in the archive A is computed (Algorithm 5, line 1) and \({\mathbf {B}}\) is initialised with values spanning the interval between the minimum and the mean distance among minima (Algorithm 5, lines 2–3). The mean value, instead of the maximum, is used to limit the size of the restart bubble and speed up convergence, under the assumption that a local restart needs to lead to the local exploration of the search space. In the experimental tests, it will be shown that this working assumption is generally verified and that \(\delta _{local,m}\) tends to converge to small values. Then, a second vector \({{\mathbf {d}}}{{\mathbf {d}}}_{b}\), with the same number of components as \({\mathbf {B}}\), is initialised to zero (Algorithm 5, line 4).

During the update phase of \(\delta _{local,m}\), MP-AIDEA uses the index \(s_m\) to keep track of the number of times population m performed a local search and calculates the difference \(p_m\) between two subsequent local minima (see Algorithm 6, line 5). The value \(p_m\) is then compared to the elements in \({{\mathbf {d}}}{{\mathbf {d}}}_{b}\): if \(dd_{b,q} < p_m\), then \(\delta _{local,m}\) replaces \(B_q\) and \(p_m\) replaces \(dd_{b,q}\) (Algorithm 6, lines 7–10). In other words, if the \(\delta _{local,m}\) used to restart population m led to a local minimum \({\mathbf {x}}_{\mathrm{min},m}^{(s_m)}\) different from \({\mathbf {x}}_{\mathrm{min},m}^{(s_m-1)}\), the local minimum previously identified by the same population, the probability of sampling \(\delta _{local,m}\) is increased.
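A sketch of the initialisation of \({\mathbf {B}}\) (Algorithm 5) and of its update (Algorithm 6) follows; the number of nodes and the function names are assumptions.

```python
import numpy as np

def init_b(minima, n_nodes=10):
    """Initialise the vector B (Algorithm 5): nodes spanning the interval
    between the minimum and the mean pairwise distance of the local minima
    in the archive A; dd_b starts at zero. n_nodes is an assumption."""
    minima = np.asarray(minima)
    dists = [float(np.linalg.norm(a - b))
             for i, a in enumerate(minima) for b in minima[i + 1:]]
    b = np.linspace(min(dists), float(np.mean(dists)), n_nodes)
    dd_b = np.zeros(n_nodes)
    return b, dd_b

def update_b(b, dd_b, delta_used, p_m):
    """Sketch of Algorithm 6: reward the delta_local that led from one local
    minimum to a different one. p_m, the difference between two subsequent
    local minima, overwrites the first node with dd_b[q] < p_m, and the
    delta actually used replaces the node value B_q."""
    for q in range(len(b)):
        if dd_b[q] < p_m:
            b[q] = delta_used
            dd_b[q] = p_m
            break
    return b, dd_b
```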

(Algorithms 5, 6 and 7)

5.3.3 Local and global restart

After the identification of the basin of attraction and the update of the value of \(\delta _{\mathrm{local}}\), populations undergo a restart process in which a new population is generated either by sampling a neighbourhood of a local minimum (local restart) or by sampling the whole search space (global restart). The two restart procedures are described in Algorithm 7.

The local restart procedure takes the latest identified local minimum \({\mathbf {x}}_{\mathrm{min},m}^{(s_m)}\) of population m and restarts the population with Latin Hypercube sampling in a box centred in \({\mathbf {x}}_{\mathrm{min},m}^{(s_m)}\) with edge length \(2\delta _{local,m}\).

The global restart procedure identifies clusters of local minima with a Fuzzy C-Mean algorithm (Bezdek 1981), computes the centre of each cluster and initialises population m so that each individual is at distance at least \(\sqrt{n_\mathrm{D}}\delta _{\mathrm{global}}\) from each of the centres of the clusters (Algorithm 7, lines 6 and 7).
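The two restart procedures can be sketched as follows; the Latin Hypercube construction and the rejection sampling for the global restart are implementation assumptions, and clustering (performed with Fuzzy C-Means in the paper) is abstracted into a list of precomputed cluster centres.

```python
import numpy as np

def local_restart(x_lm, delta, n_pop, seed=0):
    """Local restart (Algorithm 7): Latin Hypercube sample of n_pop
    individuals inside the hypercube of edge 2*delta centred at x_lm."""
    rng = np.random.default_rng(seed)
    n_d = len(x_lm)
    # one stratum per individual in each dimension, independently permuted
    perm = np.argsort(rng.random((n_pop, n_d)), axis=0)
    unit = (perm + rng.random((n_pop, n_d))) / n_pop      # LHS points in [0, 1)
    return np.asarray(x_lm) - delta + 2.0 * delta * unit

def global_restart(lower, upper, centres, delta_global, n_pop, seed=0):
    """Global restart (Algorithm 7): sample the whole box, rejecting points
    closer than sqrt(n_D)*delta_global to any cluster centre of the archived
    minima. Simple rejection sampling is an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    d_min = np.sqrt(len(lower)) * delta_global
    pop = []
    while len(pop) < n_pop:
        x = rng.uniform(lower, upper)
        if all(float(np.linalg.norm(x - c)) >= d_min for c in centres):
            pop.append(x)
    return np.array(pop)
```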

At each local and global restart, the \(\mathbf {CRF}\) matrix is re-initialised, while the vector \({\mathbf {B}}\) is re-initialised only after every global restart. The motivation for re-initialising \(\mathbf {CRF}\) at every restart is twofold: on the one hand, different values of \(\textit{CR}\) and F might be optimal in different parts of the search space; on the other hand, convergence to the optimal values of \(\textit{CR}\) and F is not always guaranteed. In search spaces with uniform and homogeneous structures, re-initialising \(\mathbf {CRF}\) and \({\mathbf {B}}\) might introduce a computational overhead; therefore, in future implementations we will test the possibility of retaining \(\mathbf {CRF}\) and \({\mathbf {B}}\) across the restart process.

5.4 Computational complexity

The computational complexity of MP-AIDEA is determined by three main sets of operations:

  • Local search. The local search uses the Matlab fmincon function, which implements an SQP scheme with a BFGS estimation of the Hessian matrix. Since the matrix is generally dense, its decomposition is \({\mathcal {O}}(n_{D}^3)\).

  • Adaptation of \(\textit{CR}\) and F. The adaptation of \(\textit{CR}\) and F for each individual in each population is the other expensive part of the algorithm and is \({\mathcal {O}}(n_{\mathrm{pop}}N_{\mathrm{pop}}n_{D}^2)\) (see line 2 in Algorithm 1, line 8 in Algorithm 3 and line 3 in Algorithm 4). As a comparison, the computational complexity of the standard DE is \({\mathcal {O}}\left( N_{\mathrm{pop}} \right) \).

  • Restart mechanisms. The cost of the local restart procedure is limited to the generation of \(n_{\mathrm{pop}} N_{\mathrm{pop}}\) individuals, while the global restart has a cost associated with clustering, which is \({\mathcal {O}}(n_{\mathrm{LM}}^2 n_\mathrm{D} n_{\mathrm{iter}})\) (Bezdek 1981), where \(n_{\mathrm{iter}}\) is the number of iterations of the clustering, and a cost associated with the verification that the new population is far from the clusters, which is \({\mathcal {O}}(N_{\mathrm{pop}}n_{\mathrm{LM}})\) (see line 7 of Algorithm 7).

Overall, when \(n_{\mathrm{pop}}N_{\mathrm{pop}}<n_{D}\), the dominant algorithmic cost is the local search, while the adaptation of \(\textit{CR}\) and F becomes more expensive for large and numerous populations. Since in the experimental test cases we will use \(N_{\mathrm{pop}}=n_\mathrm{D}\) and \(n_{\mathrm{pop}}=4\), the overall algorithmic complexity remains \({\mathcal {O}}(n_{D}^3)\).

Table 1 Functions of the CEC 2005 test set
Table 2 Functions of the CEC 2011 test set
Table 3 Functions of the CEC 2014 test set

6 Experimental performance analysis

The effectiveness of MP-AIDEA is tested on a benchmark composed of three test sets, made of functions taken from three past competitions of the Congress on Evolutionary Computation (CEC). We took 20 functions from CEC 2005 (Suganthan et al. 2005), 9 real-world problems from CEC 2011 (Das and Suganthan 2010) and 22 functions from CEC 2014 (Liang et al. 2013), for a total of 51 different problems. The list of functions used in each test set is reported in Tables 1, 2 and 3. They include both academic test functions and real-world optimisation problems. Since we are interested in solving problem (1), all functions selected for this benchmark are continuous and differentiable.

We used four different metrics to evaluate MP-AIDEA against the algorithms that participated in the three CEC competitions:

  • Metric 1: Best, worst, median, mean and standard deviation of the best result over a given number of independent runs of the algorithm.

  • Metric 2: Ranking against the other algorithms using the same ranking approach proposed in the CEC 2011 competition.

  • Metric 3: Wilcoxon test. This is used to compare MP-AIDEA to the algorithms participating in the CEC 2011 and CEC 2014 competitions for which the source code is available online.

  • Metric 4: Success rate. This is used to compare MP-AIDEA to the algorithms participating in the CEC 2011 and CEC 2014 competitions for which the source code is available online.

Table 4 Settings for the CEC 2005, CEC 2011 and CEC 2014 test functions
Table 5 Objective functions error of the CEC 2005 test set in dimension 10D and 30D

The settings of MP-AIDEA were maintained constant for all problems within a particular test set and were changed going from one test set to another. This is in line with the way all the other algorithms competed. Table 4 summarises the parameters and settings used for the CEC 2005, CEC 2011 and CEC 2014 test functions. More details about the chosen parameters will be given in Sect. 6.1.

The ranking of the algorithms participating in every competition was adjusted to account only for their performance on the selected subset of differentiable functions.

It will be shown that all metrics lead to similar conclusions: MP-AIDEA ranks among the first four algorithms, if not first, in all three test sets and for all dimensions. We will also show that MP-AIDEA can detect previously undiscovered minima on some particularly difficult functions.

The current implementation of MP-AIDEA can be found open source at https://github.com/strath-ace/smart-o2c together with the benchmark of test cases.

6.1 Test sets

This section briefly describes each test set, the settings of MP-AIDEA and metric 1 for all test sets.

Table 6 Objective functions error of the CEC 2005 test set in dimension 50D
Table 7 CEC 2005 best objective function error values for functions 13 and 16, \(n_\mathrm{D}=10\)

6.1.1 CEC 2005 test set

Following the rules of the CEC 2005 competition, MP-AIDEA was applied to the solution of the problems in the CEC 2005 test set in dimension \(n_\mathrm{D} = 10\), 30 and 50, with a maximum number of function evaluations equal to \(n_{\mathrm{feval,max}} = 10000 n_\mathrm{D}\). The experiments were repeated for a total of \(n_{\mathrm{runs}} = 25\) independent runs for each function (Suganthan et al. 2005). Functions 4, 17, 24 and 25 of the CEC 2005 competition were not included in the test set because they are non-differentiable.

The number of populations in MP-AIDEA was set to \(n_{\mathrm{pop}} = 4\) and the number of individuals in each population was set to \(N_{\mathrm{pop}} = n_\mathrm{D}\). The number of populations to be deployed on a particular problem depends on the type and complexity of that problem, and on the available number of function evaluations. We tested MP-AIDEA with different numbers of populations from 1 to 4 (results using MP-AIDEA with one population are presented in Sect. 6.2). Results showed that MP-AIDEA with 4 populations performs consistently well on all benchmarks, and, thus, we decided to present our findings for \(n_{\mathrm{pop}}=4\). The contraction limit was set to \({\bar{\rho }} = 0.2\) and the global restart distance was set to \(\delta _{\mathrm{global}} = 0.1\) (Table 4). In line with the metrics presented at the CEC 2005 competition, Tables 5 and 6 report the difference, in the objective value, between the result obtained with MP-AIDEA and the known global minimum.

Table 8 Objective functions of the CEC 2011 test set

Table 7 reports the best objective function error values obtained by all the algorithms participating in the CEC 2005 competition and MP-AIDEA for functions 13 and 16 and \(n_\mathrm{D} = 10\). According to the CEC 2005 specifications, the accuracy level for the detection of the global minimum is \(10^{-2}\) for these functions. MP-AIDEA is able to identify the global minimum of both functions 13 and 16. Previously only EvLib (Becker 2005) succeeded in identifying the global minimum of function 13 and no other algorithm managed to find the global minimum of function 16.

6.1.2 CEC 2011 test set

Following the rules of the CEC 2011 competition (Das and Suganthan 2010), MP-AIDEA was run for \(n_{\mathrm{feval,max}} = 150000\) function evaluations on the CEC2011 test set. The experiments were repeated for \(n_{\mathrm{runs}} = 25\) independent runs. Test functions with equality and inequality constraints were not included in the tests. The number of populations \(n_{\mathrm{pop}}\) was set to 4 and the number of individuals in each population was set to \(N_{\mathrm{pop}}=30\) regardless of the dimensionality of the problem. The contraction limit and the global restart distance were set, respectively, to \({\bar{\rho }} = 0.2\) and \(\delta _{\mathrm{global}} = 0.1\) (Table 4). Table 8 reports the best, worst, median, mean objective function found by MP-AIDEA and the associated standard deviation.

6.1.3 CEC 2014 test set

In line with the rules of the CEC 2014 competition (Liang et al. 2013), MP-AIDEA was applied to the solution of the functions in the CEC 2014 test set in dimension \(n_\mathrm{D} = 10\), 30, 50 and 100, with maximum number of function evaluations \(n_{\mathrm{feval,max}} = 10000 n_\mathrm{D}\). The experiments were repeated for \(n_{\mathrm{runs}} = 51\) independent runs. Non-differentiable functions 6, 12, 19, 22, 26, 27, 29 and 30 were not included in the test set (see Table 3). The number of populations was set to \(n_{\mathrm{pop}} = 4\) and the number of individuals in each population was set to \(N_{\mathrm{pop}} = n_\mathrm{D}\). The contraction limit and the global restart distance were set, respectively, to \({\bar{\rho }} = 0.2\) and \(\delta _{\mathrm{global}} = 0.1\) (Table 4).

Tables 9 and 10 report the difference between the objective value found by MP-AIDEA and the known global minimum. In agreement with the guidelines of the competition, error values smaller than \(10^{-8}\) are reported as zero (Liang et al. 2013). Table 11 reports the best objective function values obtained by all the algorithms participating in the competition and MP-AIDEA for functions 9, 10, 11 and 15 in 10 dimensions. MP-AIDEA finds the global minimum of function 11, unlike all the other competing algorithms, and gives good results for the other functions.

Table 9 Objective functions error of the CEC 2014 test set in dimension 10D and 30D
Table 10 Objective functions error of the CEC 2005 test set in dimension 50D and 100D
Table 11 CEC 2014 best objective function error values for functions 9, 10, 11 and 15, \(n_\mathrm{D}=10\)

6.2 Ranking

In this section, MP-AIDEA is ranked against a group of algorithms participating in each CEC competition. The rankings include those algorithms that reported their results in a paper and MP-AIDEA with two different settings:

  • \(n_{\mathrm{pop}} = 4\) and \(N_{\mathrm{pop}} = n_\mathrm{D}\). These settings will be indicated as “MP-AIDEA” in the following and correspond to the settings used to generate the results in Sect. 6.1.

  • \(n_{\mathrm{pop}}=1\), \(N_{\mathrm{pop}} = 4 n_\mathrm{D}\); MP-AIDEA adapts \(\textit{CR}\) and F but uses fixed values for \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\). In particular, \(n_{\mathrm{LR}} = 10\) and \(\delta _{\mathrm{local}} = 0.1\), unless otherwise specified. These settings will be indicated as “MP-AIDEA, \(n_{\mathrm{pop}}=1\)” in the following.

The ranking method follows the rules of the CEC 2011 competition (Suganthan 2011). All algorithms are ranked on the basis of the best and mean values of the objective function obtained over a certain number of runs. The following procedure is used to obtain the ranking:

  • for each function, algorithms are ranked according to the best objective value;

  • for each function, algorithms are ranked according to the mean objective value;

  • the rankings for the best and mean objective values of a particular algorithm are added up over all the problems to obtain the absolute ranking.
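
For illustration, the procedure above can be sketched as follows. This is a minimal sketch with hypothetical data and function names of our choosing; the actual comparison uses the published CEC results.

```python
def absolute_ranking(best, mean):
    """CEC 2011-style ranking: rank the algorithms per function on the best
    and on the mean objective value, then sum the ranks over all problems.

    `best` and `mean` map algorithm name -> list of values, one value per
    test function (lower is better)."""
    algos = list(best)
    n_fun = len(next(iter(best.values())))
    total = {a: 0 for a in algos}
    for table in (best, mean):
        for f in range(n_fun):
            # rank 1 = best objective value on function f
            ordered = sorted(algos, key=lambda a: table[a][f])
            for rank, a in enumerate(ordered, start=1):
                total[a] += rank
    # absolute ranking: lowest total rank first
    return sorted(algos, key=lambda a: total[a])

# Hypothetical results on two functions for two algorithms
best = {"ALG-A": [0.1, 0.3], "ALG-B": [0.2, 0.5]}
mean = {"ALG-A": [0.2, 0.4], "ALG-B": [0.4, 0.6]}
print(absolute_ranking(best, mean))  # ['ALG-A', 'ALG-B']
```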

In the following, the rankings obtained for the CEC 2005, CEC 2011 and CEC 2014 test sets are presented.

6.2.1 CEC 2005 test set

The rankings obtained for \(n_\mathrm{D} = 10\), \(n_\mathrm{D}=30\) and \(n_\mathrm{D} = 50\) are reported in Table 12. Only the competing algorithms whose papers also report results for the hybrid functions of the CEC 2005 competition (Table 1) are considered. Results show that, for \(n_\mathrm{D} = 10\) and \(n_\mathrm{D} = 30\), MP-AIDEA with adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) is ranked first, while for \(n_\mathrm{D} = 50\) results are better when using MP-AIDEA with non-adapted \(\delta _{\mathrm{local}}=0.1\) and \(n_{\mathrm{LR}} = 10\). In any case, both settings outperform the winning algorithm of the CEC 2005 competition.

Table 12 CEC 2005 algorithms ranking
Table 13 CEC 2011 algorithms ranking

6.2.2 CEC 2011 test set

The results obtained on the CEC 2011 test set are reported in Table 13. MP-AIDEA ranks first if problem 13 (the Cassini 2 Spacecraft Trajectory Optimisation Problem) is excluded from the test set and second if it is included.

The reason can be found in Fig. 2, which shows the convergence profile of the best solutions found by MP-AIDEA and by GA-MPC, the best algorithm of the competition, on function 13 for an increasing number of function evaluations (greater than the limit prescribed by the CEC 2011 competition). The results for GA-MPC are obtained using the code available online (http://www3.ntu.edu.sg/home/epnsugan/index_files/CEC11-RWP/CEC11-RWP.htm).

On this test problem, GA-MPC converges very rapidly to a local minimum but then stagnates. On the contrary, MP-AIDEA has a slower convergence for the first 200,000 function evaluations but then progressively finds better and better minima as the number of function evaluations increases. This demonstrates that in a realistic scenario in which function evaluations are not arbitrarily limited, MP-AIDEA would provide better results than the algorithm that won the competition.

Results in Table 13 show that MP-AIDEA with adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) performs better than MP-AIDEA with fixed values of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\). The adaptation history of \(\delta _{\mathrm{local}}\) is shown in Fig. 3 for each of the four populations on test functions 12 and 13 and for 600,000 function evaluations.

Fig. 2 Best values of MP-AIDEA and GA-MPC for Function 13, CEC 2011

Fig. 3 \(\delta _{\mathrm{local}}\) for the four populations of MP-AIDEA for functions 12 (top) and 13 (bottom), CEC 2011

Table 14 CEC 2014 algorithms ranking

6.2.3 CEC 2014 test set

The ranking results for the CEC 2014 test set are reported in Table 14. MP-AIDEA with one population is tested in this case with \(\delta _{\mathrm{local}}=0.1\) and \(\delta _{\mathrm{local}} = 0.3\). For \(n_\mathrm{D} = 10\), the results of MP-AIDEA with adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) are better than those of MP-AIDEA with fixed values of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\), for both \(\delta _{\mathrm{local}}=0.1\) and \(\delta _{\mathrm{local}}=0.3\). In the other cases, MP-AIDEA with fixed values of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) outperforms MP-AIDEA with adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\) when \(\delta _{\mathrm{local}} = 0.1\) but not when \(\delta _{\mathrm{local}} = 0.3\). These results show the strong influence of this parameter on the results obtained by MP-AIDEA. The adaptation history of \(\delta _{\mathrm{local}}\) for test functions 9, 17 and 25 at \(n_\mathrm{D}=30\) and 300,000 function evaluations is shown in Fig. 4.

These figures show that the adaptation of \(\delta _{\mathrm{local}}\) is effective when a sufficient number of adaptation steps can be performed within the limit on the maximum number of function evaluations (300,000 in this case). For function 25, for example, only 7 adaptation steps are performed, compared to 11 for function 17 and 18 for function 9. In these last two cases \(\delta _{\mathrm{local}}\) converges to 0.1 and 0.04, respectively.

Fig. 4 \(\delta _{\mathrm{local}}\) for the four populations of MP-AIDEA for functions 9 (top), 17 (middle) and 25 (bottom), \(n_\mathrm{D}=30\), CEC 2014

The performance of MP-AIDEA on the 30D functions of the CEC 2014 test set is further investigated to test the dependence of the results on the two non-adapted parameters, \({\bar{\rho }}\) and \(\delta _{\mathrm{global}}\). Table 15 shows the ranking obtained when varying \({\bar{\rho }}\) and \(\delta _{\mathrm{global}}\).

Case B of Table 15 shows the ranking obtained when using \({\bar{\rho }} = 0.3\) instead of \({\bar{\rho }} = 0.2\). Comparing the results in Table 15 with those in Table 14, it is possible to see that MP-AIDEA performs better using \({\bar{\rho }} = 0.3\) than \({\bar{\rho }} = 0.2\), moving from the fourth to the third position in the ranking. At the same time, there is no significant dependence on the value of \(\delta _{\mathrm{global}}\), as shown by Cases C and D in Table 15, where \(\delta _{\mathrm{global}}\) is changed from its nominal value of 0.1 to 0.2 and 0.3.

Table 15 CEC 2014 algorithms ranking, 30D, \({\bar{\rho }}=0.1\) and \({\bar{\rho }}=0.3\)

6.3 Wilcoxon test

The Wilcoxon rank sum test is a nonparametric test for two populations when the samples are independent. In this case, the two populations of samples are, for each problem, the \(n_{\mathrm{runs}}\) values of the objective function obtained by MP-AIDEA and by another algorithm participating in the CEC 2011 and CEC 2014 competitions. No test is performed for the CEC 2005 test set, since the code of none of the algorithms participating in the CEC 2005 competition is available online.

The Wilcoxon test is realised using the Matlab\(^{\textregistered }\) function ranksum. ranksum tests the null hypothesis that data from two entries x and y are samples from continuous distributions with equal medians. Results from ranksum are presented in the following as values of p and h. p, ranging from 0 to 1, is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. h is a logical value: \(h = 1\) indicates rejection of the null hypothesis at the \(100 \alpha \)\(\%\) significance level, while \(h = 0\) indicates a failure to reject the null hypothesis at the same level, where \(\alpha = 0.05\). When \(h=1\), the null hypothesis that distributions x and y have equal medians is rejected, and additional tests are conducted to assess which of the two distributions has the lower median. To do so, three types of tests are realised using ranksum for the two distributions x and y:

  • Two-sided hypothesis test: the alternative hypothesis states that x and y have different medians. Two distributions with equal medians will give \(p_B=1\) and \(h_B=0\) (failure to reject the null hypothesis that x and y have equal medians), while two distributions with different medians will give \(p_B = 0\) and \(h_B=1\) (rejection of the null hypothesis that x and y have equal medians). If the two-sided hypothesis test finds that the two distributions have equal medians (\(p_B=1\) and \(h_B=0\)), no further test is conducted. Otherwise, the left-tailed and right-tailed hypothesis tests are conducted.

  • Left-tailed hypothesis test: the alternative hypothesis states that the median of x is lower than the median of y. If the median of x is greater than that of y, the results will be \(p_L=1\) and \(h_L=0\) (failure to reject the hypothesis that x has median greater than y), while if the median of x is lower than that of y the results will be \(p_L=0\) and \(h_L=1\) (rejection of the hypothesis that x has median greater than y).

  • Right-tailed hypothesis test: the alternative hypothesis states that the median of x is greater than the median of y. If the median of x is lower than that of y, the results will be \(p_\mathrm{R}=1\) and \(h_\mathrm{R}=0\) (failure to reject the hypothesis that x has median lower than y), while if the median of x is greater than that of y the results will be \(p_\mathrm{R}=0\) and \(h_\mathrm{R}=1\) (rejection of the hypothesis that x has median lower than y).
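
The three-test procedure can be sketched in Python with a self-contained rank-sum routine. This is a minimal normal-approximation sketch of what MATLAB's ranksum computes, run on hypothetical data of our choosing; it is not a substitute for the validated routine.

```python
import math

def _midranks(values):
    """1-based ranks of `values`, averaging the ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def rank_sum_test(x, y, alternative="two-sided", alpha=0.05):
    """Wilcoxon rank-sum test via the normal approximation (no continuity
    correction). Returns (p, h) with h = 1 if the null hypothesis of equal
    medians is rejected at the 100*alpha % significance level."""
    n1, n2 = len(x), len(y)
    rx = sum(_midranks(list(x) + list(y))[:n1])      # rank sum of sample x
    mu = n1 * (n1 + n2 + 1) / 2                      # mean of rx under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # std of rx under H0
    z = (rx - mu) / sigma
    cdf = 0.5 * math.erfc(-z / math.sqrt(2))         # standard normal CDF at z
    if alternative == "less":        # H1: median(x) < median(y)
        p = cdf
    elif alternative == "greater":   # H1: median(x) > median(y)
        p = 1 - cdf
    else:                            # H1: medians differ
        p = 2 * min(cdf, 1 - cdf)
    return p, int(p < alpha)

# Hypothetical objective-value samples from 5 runs of two algorithms
x = [0.11, 0.12, 0.13, 0.10, 0.14]   # stands in for MP-AIDEA
y = [0.52, 0.55, 0.49, 0.60, 0.51]   # stands in for a competitor
p_B, h_B = rank_sum_test(x, y)                 # two-sided test
if h_B:                                        # medians differ: find direction
    p_L, h_L = rank_sum_test(x, y, "less")     # here h_L = 1 (Case 3)
    p_R, h_R = rank_sum_test(x, y, "greater")  # here h_R = 0
```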

If x is the distribution of results of MP-AIDEA and y the distribution of results given by another algorithm, the possible results obtained from the ranksum tests are summarised in Table 16.

Table 16 Wilcoxon test: possible outcomes

Case 1 in Table 16 (\(h_B=0\)) represents a situation in which the distributions of results from MP-AIDEA and a competing algorithm have equal medians (failure to reject the null hypothesis that x and y have equal medians). Case 2 (\(h_B\)=1, \(h_L\)=0 and \(h_\mathrm{R}\)=1) represents a situation in which the median of MP-AIDEA is greater than the median of the other algorithm (rejection of the null hypothesis that x and y have equal medians, failure to reject the hypothesis that x has median greater than y, rejection of the hypothesis that x has median lower than y). Case 3 (\(h_B\)=1, \(h_L\)=1 and \(h_\mathrm{R}\)=0) represents instead a situation in which the median of MP-AIDEA is lower than the median of the other algorithm (rejection of the null hypothesis that x and y have equal medians, rejection of the hypothesis that x has median greater than y, failure to reject the hypothesis that x has median lower than y). In the following, test functions with results corresponding to Cases 1 and 3 are shown in bold (MP-AIDEA has median equal to or lower than the competing algorithm). For Case 3, results with \(p_B < 5 \cdot 10^{-2}\), \(p_L < 5\cdot 10^{-2}\) and \(p_\mathrm{R} > 9.5\cdot 10^{-1}\) are considered significant. Analogously, the competing algorithm has median lower than MP-AIDEA if \(p_B < 5 \cdot 10^{-2}\), \(p_L > 9.5\cdot 10^{-1}\) and \(p_\mathrm{R} < 5 \cdot 10^{-2}\).

6.3.1 CEC 2011 test set

For the CEC 2011 test set, we limited the comparison to the two top algorithms, GA-MPC and DE-\(\varLambda \), for which the code is available online (http://www3.ntu.edu.sg/home/epnsugan/index_files/CEC11-RWP/CEC11-RWP.htm; http://uk.mathworks.com/matlabcentral/fileexchange/39217-hybrid-differential-evolution-algorithm-with-adaptive-crossover-mechanism/content/DE_TCRparam.m). The outcome of the Wilcoxon test comparing MP-AIDEA against GA-MPC, the winning algorithm of the CEC 2011 competition, can be found in Table 17 for all the functions of the test set in Table 2.

Table 17 Outcome of the Wilcoxon test on the CEC 2011 test set: MP-AIDEA versus GA-MPC
Table 18 Outcome of the Wilcoxon test on the CEC 2011 test set: MP-AIDEA versus DE-\(\varLambda \)

The comparison of MP-AIDEA with GA-MPC shows that the median of MP-AIDEA is lower than the median of GA-MPC (Case 3) for functions 2, 5, 6 and 7, while it is higher (Case 2) for functions 1, 3 and 13. Results for functions 10 and 12 are not significant enough to obtain a clear indication.

The outcome of the Wilcoxon test for the comparison of MP-AIDEA with DE-\(\varLambda \) is reported in Table 18.

The comparison of MP-AIDEA with DE-\(\varLambda \) (Table 18) shows that the median of MP-AIDEA is lower than the median of DE-\(\varLambda \) for functions 3, 5, 6, 10, 12 and 13. Results for the remaining functions 1, 2 and 7 are not significant enough to obtain a clear indication.

Table 19 summarises the outcome of the Wilcoxon tests for the CEC 2011 test set. The table reports the number of functions for which the median of MP-AIDEA is lower, equal or higher than the median of the competing algorithm. The results in Table 19 show that MP-AIDEA clearly outperforms DE-\(\varLambda \) and has median lower than GA-MPC for 4 test functions.

Table 19 Summary of Wilcoxon test results, CEC 2011 test set: MP-AIDEA versus GA-MPC and DE-\(\varLambda \). The table reports the number of functions for which the median of MP-AIDEA is equal (Case 1), higher (Case 2) or lower (Case 3) than the median of the competing algorithm

6.3.2 CEC 2014 test set

The codes of the algorithms UMOEAs, CMLSP, L-SHADE and MVMO are available online (http://web.mysites.ntu.edu.sg/epnsugan/PublicSite/Shared%20Documents/Forms/AllItems.aspx). Wilcoxon test results for the comparison of MP-AIDEA with these algorithms at 10, 30, 50 and 100 dimensions are reported in Appendix A (Tables 24, 25, 26, 27, 28, 29, 30, 31).

A summary of the obtained results is given in Table 20, which reports the number of functions for which Cases 1, 2 or 3 in Table 16 occur, and the number of functions for which the results are not significant enough to judge, for \(n_\mathrm{D}\) equal to 10, 30, 50 and 100.

Table 20 Summary of Wilcoxon test results, CEC 2014. The table reports the number of functions for which the median of MP-AIDEA is equal (Case 1), higher (Case 2) or lower (Case 3) than the median of the competing algorithm

For \(n_\mathrm{D} = 10\), the median of MP-AIDEA is lower than that of UMOEAs in 11 cases, while in 3 cases the medians are equal and in 4 cases the median of UMOEAs is lower than that of MP-AIDEA. In 4 cases (functions 10, 17, 20 and 21), the results are not significant enough. For \(n_\mathrm{D}=30\) and \(n_\mathrm{D}=100\), the median of MP-AIDEA is lower than the median of UMOEAs in 9 cases and the median of UMOEAs is lower than that of MP-AIDEA for the other 9 functions; for 4 functions, the results are not significant enough to obtain a clear indication. The median of MP-AIDEA is lower than that of UMOEAs in 11 cases for \(n_\mathrm{D}=50\).

As regards the comparison with L-SHADE, MP-AIDEA has a lower median than L-SHADE on more functions only for \(n_\mathrm{D} = 10\) (9 functions).

Table 21 Success rate: CEC2011 test set. Highest success rates for each function are shown in bold, and their total is reported at the bottom of the table. MP-AIDEA* represents MP-AIDEA with settings \(n_{\mathrm{pop}}=1\), \(\delta _{\mathrm{local}}=0.1\) and \(n_{\mathrm{LR}}=10\)

For all dimensions except \(n_\mathrm{D} = 50\), the number of functions for which the median of MP-AIDEA is lower than the median of MVMO is greater than the number of functions for which the opposite holds.

Table 22 Success rate: CEC 2014, 10D and 30D

In all cases, MP-AIDEA has a lower median than CMLSP for the majority of the tested functions.

Summarising, the results of the Wilcoxon test show that MP-AIDEA clearly outperforms CMLSP for all values of \(n_\mathrm{D}\), gives similar or slightly better results than UMOEAs and MVMO, while it is outperformed by L-SHADE for \(n_\mathrm{D} = 30\), \(n_\mathrm{D} = 50\) and \(n_\mathrm{D}=100\).

Table 23 Success rate: CEC 2014, 50D and 100D

6.4 Success rate

In this section, we present the success rate of MP-AIDEA and of the top performing algorithms on the CEC 2011 and CEC 2014 test sets. As for the Wilcoxon test, no algorithm participating in the CEC 2005 competition was included in the comparison due to the lack of availability of the source code.

The computation of the success rate SR is reported in Algorithm 5 for a generic algorithm AG and a generic minimisation problem, where n is the number of runs (Vasile et al. 2011). In Algorithm 5, \(\bar{{\mathbf {x}}}\left( AG,i \right) \) denotes the lowest minimum observed during the i-th run of algorithm AG. The quantity \(f_{\mathrm{global}}\) is the known global minimum of the function and \(\hbox {tol}_f\) is a prescribed tolerance with respect to \(f_{\mathrm{global}}\). The index \(j_{sr}\) counts the number of times algorithm AG generates values lower than or equal to \(f_{\mathrm{global}}+\hbox {tol}_f\). For each test set, we also report the total number of problems in which each of the tested algorithms has the best success rate.

Algorithm 5: Computation of the success rate SR
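
A minimal Python sketch of the computation described above (the function and variable names are ours, not taken from the paper's code):

```python
def success_rate(best_per_run, f_global, tol_f):
    """Fraction of runs that reach the global minimum within tolerance,
    following the procedure described for Algorithm 5: j_sr counts the runs
    whose best objective value is at most f_global + tol_f."""
    n = len(best_per_run)                    # number of independent runs
    j_sr = sum(1 for f_best in best_per_run if f_best <= f_global + tol_f)
    return j_sr / n

# Hypothetical best objective values over n = 4 independent runs
print(success_rate([0.0, 0.5, 2.0, 0.9], f_global=0.0, tol_f=1.0))  # 0.75
```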

6.4.1 CEC 2011 test set

For the calculation of the success rate on the CEC 2011 test set, we consider the following algorithms: MP-AIDEA with 4 populations, adaptive \(\delta _{\mathrm{local}}\) and adaptive local restart (MP-AIDEA); MP-AIDEA with one population, \(n_{\mathrm{LR}}=10\) and \(\delta _{\mathrm{local}}=0.1\) (MP-AIDEA*); GA-MPC; and DE-\(\varLambda \). Table 21 shows the obtained values of SR and the value of \(\hbox {tol}_f\) used for each function; MP-AIDEA outperforms all the other algorithms on most of the functions. The result against GA-MPC would be even better if a higher number of function evaluations were considered, as explained in Sect. 6.2.2.

6.4.2 CEC 2014 test set

For the comparison on the CEC 2014 test set, we considered the following algorithms: MP-AIDEA, MP-AIDEA*, UMOEAs, CMLSP, L-SHADE and MVMO (http://web.mysites.ntu.edu.sg/epnsugan/PublicSite/Shared%20Documents/Forms/AllItems.aspx). The values of the success rates for all tested algorithms are shown in Tables 22 and 23, together with the associated values of \(\hbox {tol}_f\). The total number of problems for which an algorithm yields the best success rate is also reported.

For all dimensions, MP-AIDEA compares very well against the other algorithms. In low dimension, the fully adaptive setting is the most competitive, while as the number of dimensions increases the single-population version with \(\delta _{\mathrm{local}}=0.1\) proves to be the most successful. These results are in line with those in Sect. 6.2.3 and confirm the position of MP-AIDEA in the ranking.

7 Conclusions

This paper presented MP-AIDEA, an adaptive version of inflationary differential evolution which automatically adapts the two key parameters of differential evolution, \(\textit{CR}\) and F, together with the size of the restart bubble \(\delta _{\mathrm{local}}\) and the number of local restarts \(n_{\mathrm{LR}}\). The adaptation of the number of local restarts is implemented through a mechanism that mitigates the possibility of detecting the same local minimum multiple times. This mechanism allows MP-AIDEA to automatically identify when to switch from a local to a global restart of the population.

MP-AIDEA was tested on a total of 51 problems, taken from three CEC competitions, grouped in three test sets (named CEC 2005, CEC 2011 and CEC 2014) and compared against 53 algorithms that participated in those three competitions. Four different metrics were presented to assess the performance of MP-AIDEA. Results demonstrated that MP-AIDEA ranks first in the CEC 2005 test set, outperforming all the other algorithms for all problem dimensionalities. On the CEC 2011 test set, MP-AIDEA ranks second, after GA-MPC, if we restrict the number of function evaluations to that prescribed by the competition. However, it was demonstrated that, in problem 13, an increase in the number of function evaluations does not provide any improvement of the objective value returned by GA-MPC but greatly improves the result of MP-AIDEA. It was noted, in fact, that GA-MPC has a fast convergence but then tends to stagnate. On the contrary, the convergence profile of MP-AIDEA is slower but, thanks to the restart mechanism, achieves better objective values. In this test set, in particular, the adaptation of the local restart neighbourhood was shown to be effective, providing competitive results compared to the settings of MP-AIDEA with a single population and predefined values of \(\delta _{\mathrm{local}}\) and number of restarts. This is confirmed by the Wilcoxon test and the success rate.

On the CEC 2014 test set, results are not equally satisfactory for all dimensions. MP-AIDEA is among the top three algorithms except in dimension 30. When the number of populations is reduced to one and \(\delta _{\mathrm{local}}=0.1\), MP-AIDEA outperforms all other algorithms in dimension 50 and 100.

One part of the problem is the extra effort required by the multi-population adaptive algorithm to identify the correct value of \(\delta _{\mathrm{local}}\). Another part of the problem was found in the contraction limit. This is in line with the theoretical findings by the authors, who demonstrated that DE can converge to a level set in the general case. Furthermore, it was noted that the populations can naturally partition and form clusters that independently converge to separate points. This slow rate of convergence affects the restart and local search mechanisms and the associated adaptation machinery. Since the current implementation uses a synchronous restart and adaptation of \(\delta _{\mathrm{local}}\) and \(n_{\mathrm{LR}}\), the number of restarts might be limited by the fact that the evolution of all populations has to come to a stop before any of them can be restarted. Future work will be dedicated to improving these aspects of the algorithm.