Introduction

The differential evolution (DE) algorithm [50] has been shown to be very powerful in solving discontinuous, non-differentiable, multi-modal, and ill-defined problems. As a population-based search metaheuristic, however, DE also faces challenges when solving computationally expensive problems [5, 34, 36, 69, 71]. Thus, computationally cheap surrogate models, including polynomial regression (PR) [34, 64], radial basis functions (RBF) [22, 32, 53, 59], artificial neural networks (ANN) [14, 23, 27], support vector machines (SVM) [63, 68], and Kriging or Gaussian processes (GP) [3, 6, 16, 41, 42], are often used as substitutes for the original computationally expensive objective function to reduce the computational cost of evolutionary optimization [1, 8, 29].

While early research on surrogate-assisted evolutionary optimization typically used a single surrogate model, either in the evolutionary search [26, 27] or in the local search [46], multi-model assisted evolutionary algorithms have become increasingly popular for different reasons. A straightforward motivation for using multiple surrogates is to enhance the approximation accuracy and to predict the reliability of the approximated fitness using a neural network ensemble [28], or to promote the diversity of the search in both single- and multi-objective optimization [35, 39, 73]. Tang et al. [61] presented a hybrid surrogate-assisted particle swarm optimization algorithm, in which an RBF model constructed by interpolating the residual errors was added to a low-order polynomial regression model to form the final hybrid surrogate for approximating the original fitness landscape. A selective ensemble was proposed for offline optimization [66] to adapt the surrogate when the online update of the models is impossible.

A more sophisticated idea is to use multiple surrogates to approximate the global and local features of the fitness function, respectively. For example, Zhou et al. [74] suggested combining global and local surrogate models to accelerate evolutionary optimization, in which a prescreening procedure was presented based on a global Gaussian process model and an RBF-based trust-region search strategy was proposed in the spirit of Lamarckian learning. Tenne et al. [62] developed an improved version of a hierarchical surrogate-assisted memetic algorithm by adaptively switching between global and local RBF models based on the leave-one-out cross-validation errors. Sun et al. [59] suggested using a global surrogate model to smooth out the local optima, while a local surrogate model was trained for each individual to approximate the fitness as accurately as possible. The authors further proposed a cooperative swarm algorithm for high-dimensional computationally expensive problems [58], in which an RBF network was utilized to smooth out the local optima and assist the social learning particle swarm optimization to quickly locate the region where the global optimum lies. The global RBF is combined with a local fitness estimation strategy [60] to improve the accuracy of the approximated fitness. In the context of offline data-driven optimization, where no fitness evaluations are allowed, a low-order polynomial has been used to generate training data for the online update of a Gaussian process [9]. A slightly different approach was reported by Isaacs et al. [24], who employed multiple RBF models for multiple database subsets obtained by k-means clustering, resulting in a spatially distributed surrogate-assisted evolutionary algorithm for multi-objective optimization.

The third line of research on multi-model assisted evolutionary optimization aims to exploit the uncertainty information that multiple surrogates can offer, similar to the infill criteria in Bayesian optimization [57]. Wang et al. [65] proposed an ensemble-based model management method for surrogate-assisted particle swarm optimization, which searches for promising and highly uncertain candidate solutions to be evaluated using the expensive simulation model. In [20], a heterogeneous ensemble is employed to replace the Gaussian process for estimating the mean as well as the standard deviation of the fitness, so that infill-criterion-driven model management strategies can be applied to surrogate-assisted optimization of high-dimensional problems. Note that the computational complexity of the Gaussian process increases cubically with the number of training data, making it impractical for high-dimensional problems.

Surrogate-assisted DE algorithms for solving computationally expensive problems have also been widely studied. Mallipeddi et al. [44] proposed to keep generating candidate offspring for each parent solution until the fitness of a candidate offspring approximated by the Kriging model is better than that of its parent, or the number of generated candidate offspring reaches a predefined maximum. The candidate offspring with the best approximated fitness is then evaluated using the expensive objective function and compared with its parent, and the winner is passed to the next generation. Similarly, Gong et al. [19] proposed to generate multiple offspring candidates for each parent by applying different reproduction operators; the one with the maximum density value estimated by the Parzen window method is chosen as the offspring to be evaluated using the expensive objective function, after which each offspring competes with its parent to survive. To further reduce the number of required real fitness evaluations, Vincenzi et al. [64] proposed an infill sampling strategy, a weighted sum of the estimated fitness and the distance to the historical sample points, to assist a DE algorithm. Liu et al. [41] adopted the lower confidence bound for a Gaussian process assisted DE algorithm for solving medium-scale expensive optimization problems; note, however, that the Sammon mapping technique was used to reduce the original decision space to a low-dimensional space. Multi-model assisted DE algorithms have also been proposed for solving expensive optimization problems [43], where a soft-margin support vector classifier was employed to divide a population combining the parent and offspring individuals into two groups according to their estimated fitness, and the group of individuals with better fitness was re-evaluated using a regression model to determine which ones would eventually be evaluated using the real objective function. In [15], Elsayed et al. proposed to use a global Kriging model to approximate the fitness of the individuals of a DE algorithm, and the expected improvement (EI) criterion was then used to select solutions to be evaluated using the real objective function. Liu et al. [40] proposed a surrogate-assisted DE for multi-fidelity evolutionary optimization of computationally expensive problems.

While many surrogate-assisted evolutionary algorithms incorporate the uncertainty of the estimated fitness in model management, the propagation of this uncertainty into environmental selection has received little consideration. As a result, the individuals selected as the parents of the next generation may be located far away from the optimum and from the recently evaluated individuals, making the evolutionary algorithm more likely to get stuck in a minimum of the surrogate model. To avoid premature convergence and promote the diversity of the population, this work proposes a selection strategy that considers both the approximated fitness values and the estimated degree of uncertainty. To this end, a GP surrogate model is adopted to estimate the fitness value and the uncertainty of the estimated fitness, which are used as two criteria in selecting parents. Different from most existing GP-assisted evolutionary algorithms, this work employs a second surrogate, an RBF model, for determining which offspring individuals are to be evaluated using the real fitness function. The main reason is that the computational complexity of the GP is cubic in the number of training data, which makes the GP unsuited to accurately modeling the local fitness landscape in a high-dimensional space. By contrast, RBF models have been shown to be computationally efficient for local modeling of both low- and high-dimensional problems [25]. For these reasons, we introduce an RBF model to approximate the local details of the fitness landscape and employ it to refine the approximation quality of the new parent population obtained from the environmental selection, in addition to the GP used as a global surrogate for selecting parents. To promote exploitation, a local search is applied to find the optimum of the RBF model in the local area covered by the selected parents and the top-ranking evaluated samples, and this optimum is then evaluated using the expensive real objective function. We term the overall algorithm the multi-model assisted differential evolution algorithm, MADE for short.

The main contributions of this paper can be summarized as follows:

  1. A coarse-to-fine evaluation scheme based on two surrogate models, i.e., a coarse Gaussian process and a fine radial basis function model, is proposed to assist the differential evolution algorithm in solving computationally expensive optimization problems.

  2. A surrogate-assisted environmental selection strategy is developed to select promising parents for the next generation according to the non-dominance relationship between the approximated fitness and the estimated uncertainty.

  3. In each generation, a fine surrogate model is used to refine the approximation quality of the parent individuals selected in the environmental selection and to screen out promising individuals for real evaluation. In addition, a surrogate-based local optimization is performed, depending on the improvement of the current best, to strike a balance between exploration and exploitation.

  4. Systematic experiments have been conducted to validate the effectiveness and efficiency of MADE and to compare it with several state-of-the-art algorithms on symmetric fitness landscapes whose optimum lies at the origin and on non-symmetric fitness landscapes with a non-origin optimum.

The remainder of this paper is organized as follows. Section 2 describes the proposed multi-model assisted differential evolution algorithm in detail. Section 3 empirically assesses the performance of the MADE algorithm on two sets of benchmark problems with different types of fitness landscapes. Finally, Sect. 4 concludes the paper with a summary of this work and suggestions for future research.

Multi-model assisted DE

In a canonical surrogate-assisted differential evolution algorithm, parent individuals for the next generation are normally selected according only to the approximated fitness, while the estimated uncertainty of the approximated fitness is overlooked. Consequently, surrogate-assisted evolutionary algorithms are very likely to converge prematurely [3] when the surrogate fails to correctly capture the landscape of the fitness function and has a large uncertainty in the fitness approximation. Therefore, we propose to also select individuals whose approximated fitness has a large degree of uncertainty. In addition, a fine surrogate model is constructed to refine the approximation quality of these individuals and to select the best one, as well as the local optimum in the neighborhood of these individuals, for fitness evaluation using the real computationally expensive function.

Figure 1 gives a diagram of the proposed algorithm. As shown in Fig. 1, two surrogate models, a coarse GP model and a fine RBF model, are introduced to assist the optimization. The coarse GP model [54] is mainly used for selecting parent individuals for the next generation in the environmental selection, while the fine RBF model, which has been shown to be effective for high-dimensional problems [72], is adopted for two purposes. One is to refine the approximation quality of the parent individuals selected in the GP-assisted environmental selection and choose the best one for fitness evaluation using the real computationally expensive objective function; the other is to search for a local optimum, to be real-evaluated, in the region covered by the new parent population and the best evaluated samples.

Before the optimization starts, a number of solutions are sampled using the symmetric Latin hypercube design (SLHD) method [33] and evaluated using the computationally expensive objective function. These offline sampled solutions, together with their fitness values, are stored in an archive. The best solutions in the archive are used to seed the initial population of the DE, and the crossover and mutation operators of DE are then applied to generate the trial/offspring candidate solutions. After that, a coarse GP model is first constructed using the archived solutions located in the vicinity of the parent and trial/offspring populations. Parents for the next generation are then selected according to the dominance relationship based on the approximated fitness and the estimated uncertainty. Once all parents for the next generation are selected, a fine RBF model is built using the real-evaluated solutions adjacent to the selected parents to refine their approximation quality, and the best of these selected parents is chosen for real evaluation if it is superior to the current global best. Otherwise, an RBF-assisted local optimization is performed to find a local optimum for fitness evaluation using the real computationally expensive objective function. Afterwards, the archive and the global best are updated with the newly evaluated solutions, and the above process is repeated until the termination condition is reached.
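To make the initialization step concrete, the following minimal Python sketch illustrates the offline sampling, archiving, and population seeding described above. It is not the authors' MATLAB implementation: a standard Latin hypercube design from SciPy stands in for the SLHD of [33], and the function name, sample count and the toy objective are illustrative assumptions.

```python
import numpy as np
from scipy.stats import qmc

def initialize(expensive_f, lb, ub, n_init, pop_size, seed=None):
    # Offline sampling: a standard Latin hypercube design stands in for the SLHD of [33].
    sampler = qmc.LatinHypercube(d=len(lb), seed=seed)
    archive_x = qmc.scale(sampler.random(n_init), lb, ub)
    # Evaluate the offline samples with the expensive objective and store them in the archive.
    archive_f = np.array([expensive_f(x) for x in archive_x])
    # Seed the initial DE population with the best archived solutions.
    order = np.argsort(archive_f)
    population = archive_x[order[:pop_size]].copy()
    return population, archive_x, archive_f

# Example with a cheap stand-in objective (10-dimensional sphere).
if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x ** 2))
    lb, ub = np.full(10, -5.0), np.full(10, 5.0)
    pop, arc_x, arc_f = initialize(sphere, lb, ub, n_init=50, pop_size=25, seed=1)
```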

Fig. 1 A diagram of the multi-surrogate assisted differential evolution algorithm

Algorithm 1 gives the pseudo-code of the MADE algorithm. The proposed algorithm follows the main steps of the basic differential evolution algorithm, including offline solution sampling, population initialization, generation of offspring using mutation and crossover, and selection of parents. The main differences between MADE and the canonical differential evolution algorithm are that a coarse GP model is used for fitness estimation and selection, and that at most two solutions are evaluated using the real objective function at each generation, chosen based on a second surrogate, a fine RBF model. Note that an archive is used to store all solutions that have been evaluated using the real fitness function. In the following, we describe in detail the training of the coarse GP model and the fine RBF model, the GP-assisted environmental selection, the RBF-assisted individual refinement, and the RBF-assisted local optimization.

Algorithm 1 Pseudo-code of the proposed MADE algorithm

Construction of the coarse Gaussian process

As the Gaussian process model is used for the selection of parents for the next generation, the archive samples distributed in the neighborhood of the current parent and trial/offspring populations are used for constructing the coarse GP model. To minimize the impact of unevenly distributed training samples, which results in poor approximation quality of the GP model in certain regions of the design space, a training sample selection strategy based on the maximin distance criterion, with the current global best as the initially selected sample, is employed to determine the final training set. It should be noted that the coarse GP model aims to capture the global fitness landscape in the region where the current parent and trial/offspring populations are located. The procedure to train the coarse GP model is given in Algorithm 2.

Algorithm 2 Construction of the coarse GP model

In Algorithm 2, pop(t) and \(pop_\mathrm{{trial}}(t)\) are the parent population and the trial/offspring population, respectively. MP(t) represents the population obtained by merging pop(t) and \(pop_\mathrm{{trial}}(t)\). CNs denotes a set that includes all neighbors of the solutions in the merged population MP(t). Generally, a minimum of \(D+1\) training samples is required to train the GP model for a D-dimensional optimization problem [51] so that the resulting model is able to properly learn the global profile of the fitness landscape. As suggested in [3, 16], the number of training samples should not be less than \(2\times {D}\), so we set the size of the training set to \(2\times (D+1)\) in this work.

In Algorithm 2, the final training samples are obtained in two steps. First, a number of closest neighbors are found for each solution in the merged population. Then, \(2\times (D+1)\) samples are sequentially selected according to their distance to the best solution in the neighborhood. Figure 2 gives an illustrative example of the sample selection procedure on a two-dimensional Ackley function. In this work, \(2\times (2+1)=6\) samples would be required for training the coarse GP model; in this example, 12 samples are chosen to better illustrate the proposed selection strategy. In the figure, the black stars represent all individuals in the merged population, i.e., the parent and trial/offspring populations. The red circles are the union of all neighbors found for each individual in the merged population, and the blue triangles are the samples selected for training the GP. The numbers in Fig. 2 indicate the order in which individuals in the merged neighbor set are saved in the training set \(T_s\) for training the GP model. The solution labeled 1 has the best fitness value among all merged neighbors, so it is saved in \(T_s\) first, denoted as \(T_s = \{{\mathbf {x}}_1\}\). Then the minimum distance of each individual in the merged neighbor set to the current set \(T_s\) is calculated. Solution 2 has the maximum value among all these minimum distances, so it is added to \(T_s\), giving \(T_s = \{{\mathbf {x}}_1, {\mathbf {x}}_2\}\). The same step is repeated for the remaining individuals in the merged neighbor set; this time solution 3 has the maximum distance to \(T_s\) and is added, giving \(T_s = \{{\mathbf {x}}_1, {\mathbf {x}}_2, {\mathbf {x}}_3\}\). This procedure continues until the required number of training samples has been selected from the merged neighbor set. From Fig. 2, we can see that the training samples are widely distributed in the decision subspace where the merged population is located. We can also note that different markers overlap at some positions in Fig. 2. This is because an individual in the parent population may already have been evaluated using the real objective function and saved in the archive, so it is naturally one of the merged neighbors owing to its zero distance to the merged population.

Fig. 2 An illustrative example of data selection for constructing the coarse GP model
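The two-step training-sample selection described above and sketched in Fig. 2 can be written compactly as follows. This is a minimal sketch of the maximin-distance selection in Algorithm 2 under the assumption that the neighbor set CNs has already been collected; the array and function names are illustrative.

```python
import numpy as np

def select_gp_training_set(neighbors_x, neighbors_f, n_train):
    """Greedy maximin-distance selection: start from the best-fitness neighbor,
    then repeatedly add the neighbor farthest (in minimum distance) from the
    already selected set, until n_train = 2*(D+1) samples are chosen."""
    chosen = [int(np.argmin(neighbors_f))]                    # best neighbor first
    while len(chosen) < min(n_train, len(neighbors_x)):
        d = np.linalg.norm(neighbors_x[:, None, :] - neighbors_x[chosen][None, :, :], axis=2)
        min_d = d.min(axis=1)                                 # distance of every neighbor to the chosen set
        min_d[chosen] = -np.inf                               # never re-select a chosen sample
        chosen.append(int(np.argmax(min_d)))
    return neighbors_x[chosen], neighbors_f[chosen]
```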

GP-assisted environmental selection

The main benefit of using the GP model is that it provides not only the approximated fitness value but also an estimate of the uncertainty of that approximation. Different from most existing DE algorithms, which use only the fitness value to select parents for the next generation, the proposed MADE algorithm selects parents based on both the approximated fitness and the estimated uncertainty. More specifically, solutions with a better fitness value will be passed to the next generation. In addition, we expect some individuals with large estimated uncertainty to be included in the next population, so that potentially good solutions are not discarded because of inaccurate predictions and a good trade-off between exploration and exploitation is achieved. Accordingly, a bi-objective selection is performed as described in Eq. (1):

$$\begin{aligned} \min {\mathbf {g}}({{\mathbf {x}}})=({\hat{f}}({\mathbf {x}}),-{\hat{s}}({\mathbf {x}})) \end{aligned}$$
(1)

In Eq. (1), \({\hat{f}}({\mathbf {x}})\) and \({\hat{s}}({\mathbf {x}})\) are the approximated fitness and the estimated uncertainty, respectively. Algorithm 3 lists the pseudo-code for selecting parents for the next generation. The merged population is first sorted into a number of non-dominated fronts according to the two objectives, i.e., the approximated fitness and the estimated uncertainty, based on the non-dominated sorting in [12]. We then select parent solutions starting from the first non-dominated front until the next parent population is filled. If the number of solutions on the last front to be selected exceeds the remaining capacity of the population, only the ones with larger crowding distances [12] are selected.

Algorithm 3 GP-assisted environmental selection
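The bi-objective selection of Eq. (1) can be sketched as follows, using a simple O(N²) implementation of the non-dominated sorting and crowding distance of [12]; this is an illustrative sketch, not the authors' code, and a faster sorting routine could be substituted.

```python
import numpy as np

def dominates(a, b):
    # a dominates b (minimization): no worse in all objectives, better in at least one.
    return np.all(a <= b) and np.any(a < b)

def nondominated_fronts(F):
    # Group point indices into successive non-dominated fronts (simple O(M N^2) version).
    remaining = set(range(len(F)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(F[j], F[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

def crowding_distance(F):
    # Crowding distance of points belonging to one front.
    n, m = F.shape
    dist = np.zeros(n)
    for k in range(m):
        order = np.argsort(F[:, k])
        dist[order[0]] = dist[order[-1]] = np.inf
        span = F[order[-1], k] - F[order[0], k]
        if span > 0 and n > 2:
            dist[order[1:-1]] += (F[order[2:], k] - F[order[:-2], k]) / span
    return dist

def gp_environmental_selection(candidates, f_hat, s_hat, pop_size):
    # Select the next parent population from the merged parents and trials using
    # the two objectives of Eq. (1): minimize f_hat and minimize -s_hat.
    F = np.column_stack([f_hat, -s_hat])
    selected = []
    for front in nondominated_fronts(F):
        if len(selected) + len(front) <= pop_size:
            selected.extend(front)
        else:
            cd = crowding_distance(F[front])
            order = np.argsort(-cd)                       # larger crowding distance first
            selected.extend(np.array(front)[order[:pop_size - len(selected)]])
            break
    return candidates[selected]
```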

Fine RBF-assisted individual refinement

After the parents for the next generation are selected, a fine surrogate model, an RBF model, is built to re-approximate the fitness of these parents so that their approximation quality can be refined. The parents whose RBF predictions are superior to the current global best then undergo evaluation with the real objective function. The strategy for building the fine RBF model is described in Algorithm 4. As depicted in Algorithm 4, all the nearest neighbors surrounding the selected parent population \(pop(t+1)\), without duplicates, are taken to train the fine RBF model.

Algorithm 4 Construction of the fine RBF model
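A minimal sketch of the fine RBF-assisted refinement is given below. It assumes SciPy's RBFInterpolator with a cubic kernel as the RBF implementation and a per-parent neighbor count k of D+2; both choices are assumptions made for illustration rather than the exact settings used in the paper.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def rbf_refine(parents, archive_x, archive_f, f_best, k=None):
    # Build a fine cubic-kernel RBF on the archive samples nearest to the
    # selected parents and flag the parents whose prediction beats the global best.
    dim = parents.shape[1]
    k = dim + 2 if k is None else k                              # assumed per-parent neighbor count
    dist = np.linalg.norm(archive_x[None, :, :] - parents[:, None, :], axis=2)
    neigh = np.unique(np.argsort(dist, axis=1)[:, :k].ravel())   # union of nearest neighbors, no duplicates
    rbf = RBFInterpolator(archive_x[neigh], archive_f[neigh], kernel='cubic')
    pred = rbf(parents)
    candidates = parents[pred < f_best]                          # parents predicted to improve on the global best
    return candidates, pred, rbf
```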

Fine RBF-assisted local optimization

When no parent solution is superior to the current global best after refining the approximation quality of the selected parent population with the fine RBF model, an optimization based on this fine RBF model is performed in the local region formed by the intersection of the hypercube areas covered by a subset of the top-ranking real-evaluated samples and by the selected parent population. For the fine RBF-assisted local optimization, a multi-point swarm intelligence optimizer, SL-PSO [7], is used as the underlying local search engine instead of a traditional single-point gradient-based optimization algorithm such as sequential quadratic programming (SQP) [46], to ensure sufficient exploitation of the local fitness landscape and to avoid premature convergence caused by sensitivity to the initial point. Other swarm intelligence optimizers can, however, also be used for the fine RBF-assisted local optimization.
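The sketch below illustrates how the local search region and the RBF-based local optimization could be realized. The axis-aligned box intersection, the fallback to the parents' box when the intersection is empty, and the plain random search standing in for SL-PSO are all simplifying assumptions made here for illustration.

```python
import numpy as np

def local_search_region(parents, top_samples, lb, ub):
    # Intersect the bounding boxes of the selected parents and the top-ranking evaluated samples.
    lo1, hi1 = parents.min(axis=0), parents.max(axis=0)
    lo2, hi2 = top_samples.min(axis=0), top_samples.max(axis=0)
    lo, hi = np.maximum(lo1, lo2), np.minimum(hi1, hi2)
    if np.any(lo > hi):                                   # empty intersection: fall back to the parents' box
        lo, hi = lo1, hi1
    return np.clip(lo, lb, ub), np.clip(hi, lb, ub)

def rbf_local_search(rbf, lo, hi, n_samples=10000, seed=None):
    # Search for the optimum of the fine RBF model inside [lo, hi]; a plain random
    # search stands in here for the SL-PSO optimizer used in the paper.
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, size=(n_samples, len(lo)))
    y = rbf(x)
    best = np.argmin(y)
    return x[best], y[best]
```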

Some practical issues

In MADE, the coarse GP-assisted environmental selection, the fine RBF-assisted individual refinement and the fine RBF-assisted local optimization are used consecutively to balance the trade-off between global exploration and local exploitation. Attention should be paid to the roles of the different surrogate models. The coarse GP model provides additional guidance, namely the uncertainty of the trial/offspring population, for identifying the new parent population in the environmental selection, replacing the one-to-one competitive selection operator of the canonical DE, whereas the fine RBF model is employed to identify better individuals among the selected parents and to find optimal solutions in the neighborhood covered by the selected parents and the best evaluated samples. In the GP-assisted environmental selection, promising parent individuals for the subsequent evolution are sifted out of the merged parent and trial/offspring population according to their approximated fitness and estimated uncertainty. These individuals, driven by the DE behavior learning operators (DE/current-to-best/1 in this paper), tend to gradually explore the unknown landscape of the whole search space, especially those with large estimated uncertainty. On the other hand, in the RBF-assisted local optimization, the optimum of the RBF model is used to update the current global best and accelerate the convergence of the algorithm, thereby facilitating local exploitation. Meanwhile, these optima can also serve as additional training samples for the GP model and for improving the quality of the RBF model in the optimal region.
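For reference, the DE/current-to-best/1 mutation with binomial crossover referred to above can be sketched as follows; F and Cr match the settings reported later in the experiments, while the clipping-based boundary handling is an assumption.

```python
import numpy as np

def de_current_to_best_1(pop, x_best, F=0.5, Cr=0.75, lb=None, ub=None, seed=None):
    # DE/current-to-best/1 mutation followed by binomial crossover (requires pop size >= 3).
    rng = np.random.default_rng(seed)
    n, d = pop.shape
    trial = np.empty_like(pop)
    for i in range(n):
        r1, r2 = rng.choice([j for j in range(n) if j != i], size=2, replace=False)
        v = pop[i] + F * (x_best - pop[i]) + F * (pop[r1] - pop[r2])   # mutation
        jrand = rng.integers(d)                                        # dimension guaranteed to cross over
        mask = rng.random(d) < Cr
        mask[jrand] = True
        trial[i] = np.where(mask, v, pop[i])                           # binomial crossover
    if lb is not None and ub is not None:
        trial = np.clip(trial, lb, ub)                                 # assumed boundary handling
    return trial
```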

Additionally, it is conceivable that more and more sample points will aggregate in the optimal region of the design space as the optimization proceeds, owing to the sequential infilling of the real-evaluated optimal solutions determined in the RBF-assisted individual refinement and the RBF-assisted local optimization in each generation, especially those found in the fine RBF-assisted local optimization. Thus, in this paper, a distance threshold

$$\begin{aligned} \varepsilon =\min \left( \sqrt{10^{-6} D}, 5.0 \times 10^{-5} D \times \min (UB-LB)\right) \end{aligned}$$
(2)

adopted from [37] is used as an additional selection criterion for determining whether a candidate solution will be evaluated using the real objective function, so as to avoid evaluated solutions lying too close to each other. Here, D is the problem dimension, and UB and LB denote the upper and lower bounds of the search space, respectively. In both the RBF-assisted individual refinement and the RBF-assisted local optimization, only the optimal solutions whose distances to the real-evaluated samples are larger than the threshold undergo evaluation with the real objective function.
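A direct translation of Eq. (2) and of the re-evaluation check is shown below; the function names are illustrative.

```python
import numpy as np

def distance_threshold(dim, lb, ub):
    # Distance threshold of Eq. (2).
    return min(np.sqrt(1e-6 * dim), 5.0e-5 * dim * np.min(np.asarray(ub) - np.asarray(lb)))

def passes_distance_check(candidate, archive_x, lb, ub):
    # A candidate is evaluated with the real objective only if it is farther than the
    # threshold from every already evaluated sample in the archive.
    eps = distance_threshold(len(candidate), lb, ub)
    return np.min(np.linalg.norm(archive_x - candidate, axis=1)) > eps
```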

Table 1 Characteristics of the benchmark problems

Experimental results and discussions

To evaluate the effectiveness and scalability of MADE, two sets of benchmark problems are adopted in the comparative studies: five commonly used basic test functions [58, 65], featured by symmetric fitness landscapes with the optimum located at the origin, and eight benchmark problems from the CEC’14 expensive optimization test suite [38], characterized by non-symmetric fitness landscapes and a shifted (non-origin) optimum. Table 1 lists the 13 test functions and their main characteristics. We conduct experiments on these test problems in 10, 20 and 30 dimensions; therefore, a total of 39 benchmark problem instances are involved.

In the following subsections, we first examine the effectiveness of the GP and RBF models in MADE, followed by a verification of the validity of the local search. We then investigate the behavioral characteristics of MADE and finally compare its performance against several state-of-the-art algorithms. In the experiments, the population size of all compared DE algorithms is set to \(5\times {D}\) for problems of 10, 20, and 30 dimensions, and the stopping criterion is that the number of real fitness evaluations reaches \(11\times {D}\). The scaling factor F and crossover rate Cr are set to \(F=0.50\) and \(Cr=0.75\), respectively [4]. The SL-PSO used in the fine RBF-assisted local optimization follows the parameter settings suggested in the corresponding literature, and it terminates when the maximum number of iterations reaches \(50\times {D}\) or the local optimum of the fine RBF model remains unchanged for 20 consecutive generations. Thirty independent runs are carried out for each test instance, and all compared algorithms are implemented in MATLAB\(^{\circledR }\) R2019b and run on an Intel(R) Core(TM) i7-9700U CPU @3.00 GHz desktop.

The roles of the GP and RBF models

In this work, we use a GP model for selecting parents for the next generation and an RBF model for searching for the best solutions to be evaluated using the real objective function. To show the effectiveness of the GP and RBF models, we compare the proposed MADE algorithm, which uses both models, with two MADE variants, i.e., MADE-GPOnly, which uses the GP model alone, and MADE-RBFOnly, which uses the RBF model alone, on the five basic benchmark problems. The characteristics of these two variants and MADE are summarized in Table 2. It should be noted that the main differences between the three algorithms lie in three aspects: the surrogate-assisted environmental selection, the surrogate-assisted individual refinement and the surrogate-assisted local optimization. Other settings of the three algorithms, such as the initialization, training sample selection, and the update of the global best, follow Algorithm 1.

Table 2 Characteristics of MADE, MADE-GPOnly and MADE-RBFOnly

Table 3 lists the statistical results of the three algorithms on the five basic benchmark problems over 30 independent runs. In Table 3, the symbols “\(+\)”, “−” and “\(\approx \)” indicate that MADE-GPOnly and MADE-RBFOnly perform significantly worse than, significantly better than and comparably with MADE, respectively, according to the pairwise Wilcoxon rank-sum test at the 0.05 significance level. The “Win/Loss/Tie” row records the number of benchmark instances on which MADE wins, loses and ties with MADE-GPOnly and MADE-RBFOnly. In general, MADE obtains better or comparable results relative to the other two contestants on the selected benchmark problems. Compared with MADE-GPOnly, MADE obtains significantly better results on all selected benchmark problems across 10, 20 and 30 dimensions, indicating the superiority of the fine RBF model over the fine GP model in improving the performance of MADE. Compared with MADE-RBFOnly, MADE wins on six out of 15 benchmark instances and ties on nine. Specifically, MADE obtains better results on the majority of benchmark problems except for the Griewank function, on which the results obtained by MADE-RBFOnly are slightly better than those of MADE. We speculate that this may be due to the smoothing effect of the cubic kernel function used in the fine RBF model, which smooths out the local optima and thus facilitates fast convergence of the parent population. However, from Table 3, we can see that MADE obtains significantly better results than MADE-RBFOnly on the Rosenbrock and Ackley functions across 10, 20 and 30 dimensions, indicating the effectiveness of the coarse GP-assisted environmental selection in MADE.

Overall, the results in Table 3 confirm the effectiveness and efficiency of the coarse GP model and the fine RBF model in MADE, and also demonstrate the superiority of the coarse-to-fine evaluation scheme based on these two surrogate models.

The effectiveness of local search

In MADE, a local optimization is performed to search for the optimum of the fine RBF model for real evaluation when the current global best is not improved. To verify the effectiveness of the local search, we count the number of local searches performed by MADE on the five basic benchmark problems over 30 runs. Figure 3 gives boxplots of the number of local search executions by MADE on the 10-, 20- and 30-dimensional cases. In general, from Fig. 3, we can see that the number of local searches stays within half of the total computational budget of 11D real fitness evaluations for the 10- to 30-dimensional problems, and that it increases with the number of dimensions. In addition, Fig. 3 shows that the deviations of the number of local searches over 30 runs are large on the Ellipsoid and Griewank functions of 10–30 dimensions, which can be attributed to the single-funnel landscapes of these two problems. Overall, the results in Fig. 3 verify the effectiveness of the local search.

Fig. 3 The number of local searches performed by MADE on the selected benchmark problems for 30 runs

Table 3 Statistical results of MADE, MADE-GPOnly and MADE-RBFOnly on the selected benchmark problems

Behavioral analysis of MADE

In this subsection, we compare the proposed MADE with several variants, namely RBF-DE, RBF-DE-LS, GP\(_\mathrm{{EI}}\)-DE, GP\(_\mathrm{{MinF}}\)-DE, MADE-NoLS, as well as the standard DE, to gain further insight into the impact of each algorithmic component on the performance of MADE, i.e., the coarse GP-assisted environmental selection, the fine RBF-assisted individual refinement, and the RBF-assisted local optimization.

Table 4 Characteristics of MADE variants

Description of MADE variants

Table 4 shows the main characteristics of the MADE variants in the environmental selection and infill sampling. More specifically, RBF-DE is an RBF-assisted DE in which an RBF model is built to predict the fitness of the trial/offspring population obtained after performing the mutation and crossover operations on the parent population. The training samples for the RBF model are drawn from the evaluated samples adjacent to the trial/offspring population. In the model management, the infill strategy follows a greedy pairwise comparison sampling rule that identifies the individuals in the trial/offspring population with predictions superior to their associated parents as candidate solutions for real evaluation. These selected individuals are real-evaluated only when their distances to the evaluated samples meet the predefined distance criterion, which is the same as the one adopted in MADE. Note that, in the environmental selection of RBF-DE, the new parent population for the next generation is determined through a pairwise competition rule that selects the better individual from each pair of target/parent and trial/offspring individuals.

RBF-DE-LS is an RBF-DE variant that embeds an RBF-assisted SL-PSO local search after the environmental selection. The RBF model adopted in the local search is the same as the one used in the environmental selection. As in MADE, the region for the local search is the intersection of the hypercube neighborhoods covered by the trial/offspring population and by the best sample subset in the archive, and the optimum found in the local search is real-evaluated only if its distance from the evaluated samples meets the predefined distance criterion.

GP\(_\mathrm{{EI}}\)-DE and GP\(_\mathrm{{MinF}}\)-DE are two variants of MADE stripped of the RBF-assisted SL-PSO local search; both employ only a GP model to assist the environmental selection and infill sampling. They adopt the same GP modeling method, which collects the nearest neighbors surrounding the parent and trial/offspring populations as training samples for constructing the GP model, and both use the environmental selection strategy proposed in MADE. The key difference between the two algorithms lies in the infill criterion: GP\(_\mathrm{{MinF}}\)-DE relies on performance-based sampling and chooses the individual with the minimum GP prediction in the parent population for real evaluation, while GP\(_\mathrm{{EI}}\)-DE chooses the individual maximizing the EI criterion.

MADE-NoLS is an MADE variant that excludes the RBF-assisted local optimization. As in MADE, MADE-NoLS adopts the GP-assisted environmental selection operator to determine the parent population for the next generation and builds a fine RBF model to screen out the individual with the best prediction among the new parent population for subsequent real evaluation.

Comparison results of MADE and MADE variants

The statistical results averaged over 30 independent runs are listed in Table 5, in which the best value on each benchmark instance is highlighted in boldface. The pairwise Wilcoxon rank-sum test [17, 18] with a \(95\%\) confidence level is conducted to investigate the significance of the differences between MADE and its variants, and the symbols “\(+\)”, “−” and “\(\approx \)” indicate that MADE performs significantly better than, significantly worse than, and comparably with the compared variant, respectively. The average rank of each algorithm is calculated according to the Friedman test [13], and the “Win/Loss/Tie” row records the number of benchmark instances on which MADE wins, loses, and ties with the contenders.
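For reproducibility, the statistical comparison described above can be carried out with SciPy as sketched below; the result arrays are purely hypothetical placeholders, not data from the paper.

```python
import numpy as np
from scipy.stats import ranksums, friedmanchisquare, rankdata

rng = np.random.default_rng(0)
# Hypothetical best fitness values of two algorithms over 30 independent runs on one instance.
made_runs, variant_runs = rng.normal(1.0, 0.1, 30), rng.normal(1.2, 0.1, 30)

# Pairwise Wilcoxon rank-sum test at the 0.05 significance level.
_, p = ranksums(made_runs, variant_runs)
significant = p < 0.05

# Friedman test and average ranks: rows = benchmark instances, columns = algorithms
# (hypothetical mean results of three algorithms on 15 instances).
results = rng.normal(1.0, 0.2, size=(15, 3))
_, p_friedman = friedmanchisquare(*results.T)
avg_rank = np.mean([rankdata(row) for row in results], axis=0)   # lower rank = better
```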

Table 5 Comparison results of MADE and MADE variants on the basic benchmark problems of 10, 20, and 30 dimensions

From Table 5, one can observe that all surrogate-assisted DE algorithms obtain better results than the canonical DE within the limited computational budget. Comparing MADE with RBF-DE, MADE obtains the best solutions on all benchmark problems across the different dimensionalities and fitness landscapes, indicating the positive effect of combining multiple surrogates and local search on the performance of MADE. Comparing MADE with RBF-DE-LS, MADE obtains the best results on the majority of test instances except for the Griewank function, on which RBF-DE-LS performs best. This may be attributed to the limited precision of the coarse GP model, which fails to capture the local details of the multi-peak topology of the Griewank fitness landscape covered by the merged population, resulting in the selection of a less suitable offspring population. On the other instances, however, the results demonstrate the attractive performance of the GP-assisted environmental selection in guiding the convergence of MADE. Comparing with the other two MADE variants, i.e., GP\(_\mathrm{{MinF}}\)-DE and GP\(_\mathrm{{EI}}\)-DE, we find from Table 5 that MADE-NoLS achieves significantly better results on most of the benchmark instances, revealing the good performance of the fine RBF-assisted individual refinement in prescreening promising individuals for real evaluation. Compared with MADE, however, MADE-NoLS obtains significantly worse results on all benchmark instances, verifying the strong reinforcement effect of the RBF-assisted local optimization in substantially improving the performance of MADE. In addition, in terms of the average ranking according to the Friedman test over all involved benchmark functions, MADE ranks first, followed by MADE-NoLS, RBF-DE-LS, GP\(_\mathrm{{MinF}}\)-DE, GP\(_\mathrm{{EI}}\)-DE, RBF-DE, and the canonical DE. In conclusion, the results in Table 5 demonstrate the good synergy between the components of MADE in steering its convergence.

Fig. 4 Convergence profiles of compared algorithms on the selected benchmark problems of 10, 20, and 30 dimensions

The convergence profiles of the compared algorithms on the selected benchmark problems of 10, 20, and 30 dimensions, averaged over 30 independent runs, are summarized in Fig. 4. From Fig. 4, we can see that MADE performs best on the Ellipsoid, Rosenbrock, Ackley, and Rastrigin functions across 10–30 dimensions, whereas RBF-DE-LS outperforms the other alternatives on the Griewank function. For the Griewank function, characterized by a multi-peaked single-funnel topology, MADE adopts a coarse GP model to select a promising parent population for guiding the evolutionary dynamics of DE at each iteration, which is conducive to smoothing the search space and accelerating convergence in the early stage of optimization, as shown in Fig. 4j–l. Nonetheless, the coarse GP model fails to capture the multi-peak fitness landscape around the global optimum, resulting in an incorrect selection of the parent population and premature convergence in the later search stage. In contrast, RBF-DE-LS constructs a fine RBF model to learn a priori knowledge about the local features of the globally optimal region, thus delivering high accuracy in the selection of the new parent population. Note that in MADE the fine RBF model is utilized to refine the new parent population and sift out the best-predicted individual for real evaluation during the RBF-assisted individual refinement, while in RBF-DE-LS the fine RBF model is used to screen the trial/offspring individuals predicted to be better than their associated target/parent individuals so as to identify the parent population for the next generation.

In terms of GP\(_\mathrm{{MinF}}\)-DE and GP\(_\mathrm{{EI}}\)-DE, GP\(_\mathrm{{MinF}}\)-DE outperforms GP\(_\mathrm{{EI}}\)-DE on most of the benchmark instances, especially on the Rosenbrock and Rastrigin functions. GP\(_\mathrm{{EI}}\)-DE performs slightly better than GP\(_\mathrm{{MinF}}\)-DE on the 10-dimensional Ackley function, while both show comparable performance on the 20- and 30-dimensional Ackley function. From Fig. 4, we can also see that both GP\(_\mathrm{{MinF}}\)-DE and GP\(_\mathrm{{EI}}\)-DE perform better than RBF-DE on all the selected benchmark instances across 10–30 dimensions and exhibit a good convergence tendency, demonstrating the effectiveness of the GP-assisted environmental selection operator in selecting the parent population.

From Fig. 4, one can further note that among the three MADE variants without the RBF-assisted local optimization, MADE-NoLS outperforms GP\(_\mathrm{{MinF}}\)-DE and GP\(_\mathrm{{EI}}\)-DE on all benchmark instances, corroborating the importance of using the fine RBF model to refine the approximation quality of the selected parent population. Note that the correct screening of the parent population also contributes more valuable training samples for the subsequent update of the GP model.

For the Rosenbrock and Rastrigin functions, Fig. 4d–f and m–o show that RBF-DE-LS is outperformed by MADE-NoLS on the 10- to 30-dimensional instances, and that RBF-DE-LS stagnates and converges prematurely in the later stage of optimization. We surmise that the excessive consumption of real evaluations causes the performance degradation of RBF-DE-LS, since RBF-DE-LS spends the majority of its real evaluations on selecting the new parent population, whereas no real evaluation is performed in MADE-NoLS during the GP-assisted environmental selection and at most one individual is real-evaluated in the RBF-assisted individual refinement at each iteration.

From Fig. 4, we can also see that MADE-NoLS achieves a dramatic performance improvement when combined with the fine RBF-assisted local optimization (i.e., MADE) on all benchmark instances, and that MADE maintains a good downward trend compared with the other alternatives, indicating the effectiveness of the RBF-assisted local optimization in promoting the exploitation capability of MADE. To summarize, the results shown above demonstrate the effectiveness and feasibility of the collaboration between the coarse GP-assisted environmental selection, the fine RBF-assisted individual refinement, and the RBF-assisted local optimization in MADE.

Performance comparison between isomorphic algorithms

To examine the efficacy of MADE relative to other isomorphic algorithms, in this subsection we conduct two sets of comparative studies of MADE with four representative state-of-the-art algorithms, i.e., CAL-SAPSO [65], GORS-SSLPSO [70], FSAPSO [37], and S-JADE [4], on two different types of benchmark test suites. Here, the CEC’14 expensive optimization test suite, featured by non-symmetric fitness landscapes with shifted global optima, is used as a second test suite in addition to the above five basic test functions. The main purpose of these comparisons is to gain insight into the performance differences of these methods in handling different types of problems within a small number of real fitness function evaluations.

In the following subsections, we first compare MADE with CAL-SAPSO, GORS-SSLPSO, and FSAPSO on the five popular basic test functions, followed by a comparison of MADE with GORS-SSLPSO, FSAPSO, and S-JADE on the CEC’14 expensive optimization test suite. To make a fair comparison, all compared algorithms adopt the same population size of 5D, and each of them terminates after exhausting the computational budget of 11D real fitness function evaluations, thus ensuring that each algorithm consumes the same number of real evaluations. The other parameters involved are configured with the settings recommended in the corresponding literature.

Algorithm architectures of the five contestants

CAL-SAPSO [65] sequentially employed one multi-surrogate ensemble to search for the most uncertain solution and another to find the best approximated solution for real evaluation in the whole design space, while adopting a local multi-surrogate ensemble to exploit the optimal region covered by the top-ranking archive samples, seeking promising optima to be real-evaluated depending on whether the optimal solution has improved. Analogously to CAL-SAPSO, MADE takes full advantage of multiple surrogates and leverages the uncertainty information provided by the coarse GP model to sift out a promising parent population for the next generation, whilst refining the best approximated individual for real evaluation with a fine RBF model. An RBF-assisted local search is also integrated into MADE to exploit the promising region covered by the top-ranking archive samples. The key distinction between CAL-SAPSO and MADE lies in the fact that the former concentrates on sampling promising solutions for real evaluation by a multi-surrogate ensemble-based PSO search in the global exploration part, while the latter puts the emphasis on selecting a promising parent population by means of a coarse-to-fine model evaluation to guide the global exploration of DE.

GORS-SSLPSO [70] built a global RBF model to approximate the design space and employed SL-PSO [7] to locate the optimal solution, where the search population of SL-PSO was periodically adjusted to consecutively approach and exploit the optimal region. GORS-SSLPSO dynamically restarted the underlying SL-PSO with a top-ranking archive sample subset as the new initial population, and the optimum found in each restart loop was selected for real evaluation. In contrast to GORS-SSLPSO, MADE determines the parent population for the next generation through non-dominated sorting in light of the uncertainty information of the candidate solutions derived from a coarse GP model. On the other hand, as in MADE, GORS-SSLPSO also performs a local search centered around the neighborhood of the top-ranking archive sample subset in each restart loop. It is therefore interesting to compare the performance of these two approaches to gain insight into the efficacy of the different environmental selection strategies.

FSAPSO [37] searched for the optimum of a global RBF model in the territory covered by the iterative swarm and, in tandem, selected the best predicted particle in the swarm for real evaluation in each generation. In case the global optimum remained unimproved, the most uncertain particle in the iterative swarm was additionally selected as a candidate solution for real evaluation, where the uncertainties of the particles were calculated by weighting the distances and fitness of their adjacent evaluated samples. FSAPSO has been shown to perform well on small- and medium-scale problems. Similar to MADE, FSAPSO also incorporates an RBF-assisted local optimization, where a global RBF model was constructed using all the evaluated samples and the search region was restricted to a local region covered by the current iterative swarm. MADE differs from FSAPSO mainly in that MADE draws on the uncertainty information provided by the coarse GP model in the environmental selection to extract a promising parent population for the next generation and performs local optimization by resorting to a fine local RBF model and the multi-point swarm intelligence optimizer SL-PSO, emphasizing the generation of an instructive parent population, whereas FSAPSO focuses on infill sampling from the candidate solutions.

S-JADE [4] combined global and local RBF models to guide the mutation and selection operations in the optimization process. The optima of the global and local RBF models were separately utilized as competitive alternatives to the randomly chosen top-ranking best and dissimilar population individuals in the mutation operator to guide the mutation operation. In the environmental selection, a portion of the trial/offspring individuals with smaller predictions by the global RBF model underwent real evaluation and then competed with the associated parent individuals for entry into the next generation, while the remaining parent individuals were directly inherited from the previous generation. Thus, S-JADE employs a one-to-one competitive selection to update part of the parent population, whereas MADE updates the parent population by the coarse GP-assisted environmental selection. Moreover, unlike S-JADE, which constructs a local RBF model for each individual and uses the optima of both the global and local RBF models to guide the mutation direction, MADE only uses the optimum of a local RBF model as an alternative to the current global best to guide the mutation of the target/parent population.

Comparison results of algorithms on the five basic test problems

Table 6 lists the optimization results of the four algorithms under comparison on the five basic test problems, including the mean and standard deviation (shown as Mean ± Std.) obtained by each algorithm over 30 independent runs, the pairwise Wilcoxon rank-sum test at a significance level of 0.05, and the average rank of each algorithm calculated by the Friedman test, where the symbols “+”, “−” and “\(\approx \)” indicate that MADE performs significantly better than, significantly worse than, and comparably with the contestants, respectively. As seen from Table 6, MADE obtains significantly better results than CAL-SAPSO and FSAPSO on the 10- to 30-dimensional benchmark problems within the limited evaluation budget. Compared with GORS-SSLPSO, MADE achieves significantly better results on the Ellipsoid, Rosenbrock, and Ackley functions. For the Griewank and Rastrigin functions, Table 6 shows that there is no significant difference between the results of GORS-SSLPSO and MADE except for the 30-dimensional Griewank function, on which MADE obtains a significantly better result. We speculate that this may be due to the smoothing effect of the coarse GP model in MADE, which induces the parent population to quickly converge to the globally optimal region and therefore provides a higher probability of locating the global optimum. Furthermore, according to the average rank, MADE ranks first, followed by GORS-SSLPSO, FSAPSO, and CAL-SAPSO. Thus, it can be concluded that for the basic test functions, characterized by symmetric landscapes and optima located at the origin, MADE is a promising optimizer in contrast to GORS-SSLPSO, FSAPSO, and CAL-SAPSO when only a small number of real fitness evaluations is available, e.g., an 11D real-evaluation budget.

Table 6 Statistics of CAL-SAPSO, GORS-SSLPSO, FSAPSO, and MADE on the basic test functions of 10–30 dimensions

Figure 5 presents the convergence profiles of the four approaches on the selected benchmark functions. As seen from Fig. 5a–c and g–i, MADE remarkably outperforms the other three contestants on the unimodal Ellipsoid and multimodal Ackley functions of 10–30 dimensions under the 11D real-evaluation budget. For the Ellipsoid function, as shown in Fig. 5a–c, GORS-SSLPSO performs comparably to FSAPSO, whereas it outperforms FSAPSO on the multimodal Ackley, Griewank, and Rastrigin functions. Nonetheless, both are outperformed by MADE, as shown in Fig. 5g–o.

For the Rosenbrock function, featured by a multimodal and narrow-valley fitness landscape, MADE converges slowly in the early stage of optimization, where CAL-SAPSO and GORS-SSLPSO converge faster, as shown in Fig. 5d–f. It is conceivable that in the early search stage the surrogate ensemble in CAL-SAPSO approximates the fitness landscape with high precision given the limited training samples available, and that the top-ranking solutions in GORS-SSLPSO provide a promising leading population. However, MADE and FSAPSO exhibit good downward trends in the later search stage, where MADE outperforms the other three compared algorithms in the 10- and 20-dimensional cases and performs comparably with FSAPSO in the 30-dimensional case, as shown in Fig. 5d–f.

For the single-funnel Griewank function, characterized by a symmetric and regular multimodal fitness landscape, MADE converges slightly faster than the other three compared approaches, especially in the 30-dimensional case, as shown in Fig. 5j–l, yet there is no significant difference between MADE and GORS-SSLPSO in the 10- and 20-dimensional cases according to Table 6. Likewise, for the multimodal Rastrigin function with multiple funnels, MADE performs comparably to GORS-SSLPSO and both perform better than CAL-SAPSO and FSAPSO across the 10- to 30-dimensional cases within 11D real fitness function evaluations. These results confirm that selecting suitable evolutionary populations is conducive to improving the convergence performance of the algorithm while enhancing its adaptability to different fitness landscapes.

Fig. 5 Convergence profiles of compared algorithms on the basic test problems of 10, 20, and 30 dimensions

Table 7 Statistics of GORS-SSLPSO, FSAPSO, S-JADE and MADE on the CEC’14 expensive optimization test suite

Comparison results of algorithms on the CEC’14 expensive optimization test suite

Table 7 lists the comparison results of MADE, S-JADE, FSAPSO and GORS-SSLPSO on the CEC’14 expensive optimization test suite within 11D real fitness function evaluations. In general, MADE significantly outperforms S-JADE, FSAPSO and GORS-SSLPSO on 21, 16 and 11 out of the 24 benchmark instances, respectively, and MADE ranks first, followed by GORS-SSLPSO, FSAPSO and S-JADE, according to the Friedman test.

For the two surrogate-assisted DE variants, MADE and S-JADE, the results in Table 7 show that MADE obtains significantly better results than S-JADE on nine instances (F1–F9) of the 12 unimodal benchmark instances (F1–F12), while MADE is comparable to S-JADE on F10 and F11 and is outperformed by S-JADE on F12. Although there is no significant difference between the results obtained by MADE and S-JADE on F10 and F11, the mean values of MADE averaged over 30 independent runs are worse than those of S-JADE. However, from Table 7, MADE significantly outperforms S-JADE on the 12 multimodal benchmark instances (F13–F24), indicating the superiority of MADE over S-JADE in dealing with problems with non-symmetric fitness landscapes and shifted global optima under a small number of real fitness function evaluations.

In terms of the two surrogate-assisted PSO variants, FSAPSO and GORS-SSLPSO, Table 7 shows that MADE obtains significantly better results than GORS-SSLPSO on six of the 12 unimodal benchmark instances (including F3, F6, F10, F11 and F12) and five of the 12 multimodal benchmark instances (i.e., F14, F15, F19, F20, and F21), especially on the shifted Step function (F10–F12) and the shifted rotated Rosenbrock’s function across the 10–30 dimensionalities. On the other hand, MADE ties with GORS-SSLPSO on 11 of the 24 benchmark instances and is inferior to GORS-SSLPSO on the 20-D F17 and the 10-D F22. Comparing MADE to FSAPSO, one can also see from Table 7 that MADE obtains significantly better results than FSAPSO on all 12 multimodal benchmark instances, while the results obtained by MADE are as good as those of FSAPSO on six of the 12 unimodal instances and are worse than those of FSAPSO on the shifted Sphere function of 20 and 30 dimensions (F2–F3). As a whole, the results shown in Table 7 further demonstrate the remarkable performance of MADE against GORS-SSLPSO and FSAPSO on problems with different fitness landscapes and non-origin optima when only a limited real-evaluation budget is available.

Figures 6 and 7 provide the convergence profiles of the four contestants on the CEC’14 expensive optimization test suite. From Fig. 6, it can be observed that S-JADE is significantly outperformed by the other three contenders on the unimodal benchmark problems of 10 to 30 dimensions including the shifted Sphere function (F1–F3), the shifted Ellipsoid function (F4–F6), and the shifted and rotated Ellipsoid function (F7–F9). This may be due to the fact that S-JADE consumes more real evaluations in the environmental selection than the other three contenders in each generation. In contrast, the performance of GORS-SSLPSO degrades on the shifted Step function of 10, 20 and 30 dimensions, where it converges prematurely within the 11D real evaluations, whereas S-JADE performs slightly better than MADE and FSAPSO, and MADE outperforms FSAPSO as the dimensionality increases. We surmise that GORS-SSLPSO stagnates on the discontinuous and flat basins owing to the greedy search of a population composed of the optimal subset, which prevents it from escaping local traps, while S-JADE invokes more real evaluations to evaluate promising offspring individuals and therefore enriches the a priori knowledge of the search space, helping it jump out of the local traps. From Fig. 6, we can also notice that MADE performs comparably with GORS-SSLPSO and FSAPSO on most of the unimodal benchmark instances except for F3 and F8, on which MADE and FSAPSO outperform GORS-SSLPSO.

Fig. 6 Convergence profiles of GORS-SSLPSO, FSAPSO, S-JADE and MADE on the unimodal CEC’14 benchmark problems

In terms of the multimodal benchmark problems, Fig. 7 shows that MADE is significantly superior to S-JADE, FSAPSO and GORS-SSLPSO on the shifted Ackley’s function (F13–F15) and the shifted rotated Rosenbrock’s function (F19–F20) of 10, 20 and 30 dimensions within 11D real fitness function evaluations, while S-JADE performs comparably with FSAPSO on 30-D F15, 10-D F19 and 20-D F18. For the shifted Griewank function, MADE, GORS-SSLPSO and FSAPSO show comparable performance on the 10-dimensional instance, yet MADE and GORS-SSLPSO outperform FSAPSO as the dimension increases, with FSAPSO performing slightly better than MADE in the later search stage. Notice that MADE performs consistently better than S-JADE on this benchmark problem across 10 to 30 dimensions, indicating the benefit of the coarse GP-assisted environmental selection in economizing the invocation of the real fitness function. Moreover, MADE and GORS-SSLPSO perform comparably on the 20- and 30-dimensional shifted Rastrigin function (F23–F24), while MADE is outperformed by GORS-SSLPSO on the 10-D instance (F22).

In summary, the results shown in Table 7 and Figs. 6 and 7 demonstrate the good adaptability and scalability of MADE on different types of benchmark problems featuring non-symmetric fitness landscapes and non-origin optima, and further corroborate the comparable or superior performance of MADE against four state-of-the-art algorithms when only a small real-evaluation budget is available.

Computational complexity analysis

The computational complexity of MADE consists of the following main parts according to Algorithm 1: fitness evaluation in the initialization phase, coarse GP modeling and non-dominated sorting on the merged population in the GP-assisted environmental selection, fine RBF modeling and promising individual selection in the RBF-assisted individual refinement, and the SL-PSO local search in the RBF-assisted local optimization. In this subsection, we give only an approximate upper bound on the computational complexity of one generation of MADE.

Fig. 7 Convergence profiles of GORS-SSLPSO, FSAPSO, S-JADE and MADE on the multimodal CEC’14 benchmark problems

At the initialization phase, the fitness evaluation of the initial (parent) population takes \(O\left( N_\mathrm{{p}} D\right) \) computations, where \(N_\mathrm{{p}}\) denotes the population size and D is the problem dimension. The parent population then undergoes mutation and crossover to generate the trial/offspring population, which also requires \(O\left( N_\mathrm{{p}} D\right) \) computations. After merging the parent and the trial/offspring populations, a coarse GP model is constructed to compute the approximated fitness and the estimated uncertainty of the merged population, where the training set for the coarse GP model is determined by a two-step sample selection. First, a subset of \(N_\mathrm{{n s}}\) archive samples closest to the merged population is chosen, which demands \(O\left( 2 N_\mathrm{{p}} N_\mathrm{{arc}} D\right) \) computations for finding the closest neighbors in the archive by calculating the Euclidean distances of sample pairs, where \(N_\mathrm{{arc}}\) denotes the archive size. Then \(N_\mathrm{{s g p}}\) sample points are selected from the \(N_\mathrm{{n s}}\) closest samples as the final training samples for the coarse GP model, which requires \(O\left( N_\mathrm{{s g p}} N_\mathrm{{n s}} D\right) \) computations to calculate the Euclidean distances of sample pairs and \(O\left( N_\mathrm{{n s}} \log N_\mathrm{{n s}}\right) \) computations to sort the obtained distances. The computational complexity of building a GP model is \(O\left( K N_\mathrm{{s g p}}^{3} D\right) \) [41], where K denotes the number of iterations required for hyper-parameter optimization. Afterwards, non-dominated sorting is carried out on the merged population to determine the new parent population for the next generation, which requires \(O\left( M\left( 2 N_\mathrm{{p}}\right) ^{2}\right) +O\left( M\left( 2 N_\mathrm{{p}}\right) \log \left( 2 N_\mathrm{{p}}\right) \right) \) computations, where M denotes the number of objectives (\(M=2\) and \(M \ll N_\mathrm{{p}}\) in this paper) and the size of the merged population is \(2N_\mathrm{{p}}\). Thus, since \(N_\mathrm{{arc}}\), \(N_\mathrm{{s g p}}\) and \(N_\mathrm{{n s}}\) are usually larger than \(N_\mathrm{{p}}\), the total computational complexity of the GP-assisted environmental selection is \(O\left( N_\mathrm{{p}} N_\mathrm{{arc}} D+N_\mathrm{{s g p}} N_\mathrm{{n s}} D+N_\mathrm{{n s}} \log N_\mathrm{{n s}}+K N_\mathrm{{s g p}}^{3} D\right) \).
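For illustration, the following Python sketch outlines the two-step selection of the coarse GP training set discussed above: the \(N_\mathrm{{n s}}\) archive samples closest to the merged population are identified by Euclidean distance, and \(N_\mathrm{{s g p}}\) of them are then retained as training data. The function name is hypothetical, and the second-step criterion (a max–min, farthest-point rule) is an illustrative assumption rather than the exact rule used in MADE.

```python
import numpy as np

def select_gp_training_set(archive_X, merged_pop, n_ns, n_sgp, rng=None):
    """Hedged sketch of a two-step GP training-data selection.

    Step 1: pick the n_ns archive samples closest (Euclidean distance) to the
            merged population, the O(N_p * N_arc * D) part of the analysis.
    Step 2: from those, keep n_sgp mutually distant points as the final GP
            training set; this max-min rule is an assumption for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Step 1: distance of every archive sample to its nearest merged individual.
    d = np.linalg.norm(archive_X[:, None, :] - merged_pop[None, :, :], axis=2)
    nearest = d.min(axis=1)                      # shape (N_arc,)
    closest_idx = np.argsort(nearest)[:n_ns]     # n_ns closest archive samples
    candidates = archive_X[closest_idx]

    # Step 2: max-min (farthest-point) selection among the candidates.
    chosen = [int(rng.integers(len(candidates)))]
    while len(chosen) < min(n_sgp, len(candidates)):
        dist_to_chosen = np.linalg.norm(
            candidates[:, None, :] - candidates[chosen][None, :, :], axis=2
        ).min(axis=1)
        dist_to_chosen[chosen] = -np.inf         # exclude already-chosen points
        chosen.append(int(dist_to_chosen.argmax()))
    return closest_idx[chosen]                   # indices into the archive
```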

In the RBF-assisted individual refinement, a fine RBF model is built to re-approximate the parents selected by the GP-assisted environmental selection. Here, a set of archive samples closest to the new parent population is selected as the training samples, which requires \(O\left( N_\mathrm{{p}} N_\mathrm{{a r c}} D\right) \) computations. Constructing an interpolating RBF model, as used in our method, takes \(O\left( N_\mathrm{{s r b f}}^{3} D\right) \) computations [67], where \(N_\mathrm{{s r b f}}\) denotes the size of the training set. Thereafter, to select the best individual for real evaluation, the parents are sorted according to the fitness approximated by the fine RBF model, which requires \(O\left( N_\mathrm{{p}} \log N_\mathrm{{p}}\right) \) computations. Thus, the total computational complexity of the RBF-assisted individual refinement is \(O\left( N_\mathrm{{p}} N_\mathrm{{a r c}} D+N_\mathrm{{s r b f}}^{3} D+N_\mathrm{{p}} \log N_\mathrm{{p}}\right) \).
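The cubic term of the fine model stems from solving a dense linear system when fitting an interpolating RBF. The minimal sketch below, in which the cubic kernel and linear polynomial tail are assumptions rather than the exact basis used in MADE, shows where the \(O\left( N_\mathrm{{s r b f}}^{3}\right) \) solve arises.

```python
import numpy as np

def fit_cubic_rbf(X, y):
    """Minimal sketch of an interpolating RBF surrogate (cubic kernel with a
    linear polynomial tail). Solving the dense (N_srbf + D + 1) linear system
    is what yields the O(N_srbf^3) term quoted above."""
    n, d = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Phi = r ** 3                                  # cubic kernel phi(r) = r^3
    P = np.hstack([np.ones((n, 1)), X])           # linear polynomial tail
    A = np.block([[Phi, P], [P.T, np.zeros((d + 1, d + 1))]])
    rhs = np.concatenate([y, np.zeros(d + 1)])
    coef = np.linalg.solve(A, rhs)                # O((n + d + 1)^3) direct solve
    w, c = coef[:n], coef[n:]

    def predict(Xq):
        # Interpolated fitness at query points Xq, shape (m, D).
        rq = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=2)
        return (rq ** 3) @ w + np.hstack([np.ones((len(Xq), 1)), Xq]) @ c

    return predict
```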

In the worst case, i.e., when the current global best is not improved, a local search based on SL-PSO is carried out to find the optimum of the fine RBF model, which involves \(O\left( N_\mathrm{{p s}}^{2}+N_\mathrm{{p s}} D\right) \) computations [7], where \(N_\mathrm{{p s}}\) denotes the population size of SL-PSO. Therefore, the overall computational complexity of MADE is as follows:

$$\begin{aligned} T ={}& O\left( N_\mathrm{{p}} D+N_\mathrm{{p}} N_\mathrm{{a r c}} D+N_\mathrm{{s g p}} N_\mathrm{{n s}} D+N_\mathrm{{n s}} \log N_\mathrm{{n s}}\right) \\ &+ O\left( K N_\mathrm{{s g p}}^{3} D+N_\mathrm{{s r b f}}^{3} D+N_\mathrm{{p s}}^{2}+N_\mathrm{{p s}} D\right) \end{aligned}$$
(3)

It should be noted that, for most real-world computationally expensive problems, evaluating the fitness even once is itself non-deterministic polynomial-time hard (NP-hard) [47], so the time complexity of MADE is acceptable in comparison.
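As an illustration of the surrogate-driven local refinement invoked when the global best stagnates, the sketch below runs an inexpensive optimizer on the fine RBF model and returns the predicted optimum for a single real evaluation. A plain global-best PSO is used purely as a stand-in for SL-PSO, and the bounds, population size and coefficients are illustrative assumptions, not the settings used in MADE.

```python
import numpy as np

def surrogate_local_search(rbf_predict, lb, ub, n_ps=30, iters=50, seed=0):
    """Sketch: optimize the fine RBF surrogate with a generic global-best PSO
    (a stand-in for SL-PSO) and return the predicted optimum, which would then
    be evaluated once with the real, expensive objective."""
    rng = np.random.default_rng(seed)
    d = len(lb)
    x = rng.uniform(lb, ub, size=(n_ps, d))
    v = np.zeros((n_ps, d))
    pbest, pbest_f = x.copy(), rbf_predict(x)
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random((n_ps, d)), rng.random((n_ps, d))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lb, ub)
        f = rbf_predict(x)                      # cheap surrogate evaluations only
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()

    return gbest                                # candidate for one real evaluation
```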

Conclusion

A novel multi-model assisted differential evolution algorithm, called MADE, is proposed in this paper for solving computationally expensive problems. Two surrogate models, a GP model and an RBF model, are used separately to explore and exploit the design space. The GP model serves as a coarse surrogate that learns the global features of the fitness landscape and assists the selection operator in choosing an instructive parent population for the next generation. A coarse GP-assisted environmental selection strategy is then developed by simultaneously considering the approximated fitness and the estimated uncertainty. In addition, an RBF model is introduced as a fine model to refine the approximation quality of the selected parent population, while identifying the best individual in the population and the local optimum in its vicinity for evaluation with the real objective function. Together, these two strategies strike a good balance between exploration and exploitation in searching for an optimum within a limited computational budget. The proposed algorithm was evaluated on two sets of widely used benchmark problems characterized by different types of fitness landscapes. The numerical results showed that the proposed algorithm performs better than several state-of-the-art surrogate-assisted DE and PSO algorithms.

Model management remains a central research topic in surrogate-assisted evolutionary algorithms. As our experimental results show that the fidelity of the surrogate model may influence the quality of the obtained optima, in future work we will study criteria for self-determining the model fidelity based on the contribution of each model to the algorithm, so as to further improve the performance of our method. Moreover, extending our method to multi-objective optimization and applying it to real-world computationally expensive problems are also interesting directions for future work.