Enhancing hierarchical surrogate-assisted evolutionary algorithm for high-dimensional expensive optimization via random projection

By remarkably reducing the number of real fitness evaluations, surrogate-assisted evolutionary algorithms (SAEAs), especially hierarchical SAEAs, have been shown to be effective in solving computationally expensive optimization problems. The success of hierarchical SAEAs mainly profits from the potential benefit of their global surrogate models, known as the "blessing of uncertainty", and from the high accuracy of their local models. However, their performance leaves room for improvement on high-dimensional problems, since it is still challenging to build sufficiently accurate local models over the huge solution space. To address this issue, this study proposes a new hierarchical SAEA that trains local surrogate models with the help of the random projection technique. Instead of training in the original high-dimensional solution space, the new algorithm first randomly projects training samples onto a set of low-dimensional subspaces, then trains a surrogate model in each subspace, and finally evaluates candidate solutions by averaging the resulting models. Experimental results on six benchmark functions of 100 and 200 dimensions demonstrate that random projection can significantly improve the accuracy of local surrogate models and that the newly proposed hierarchical SAEA possesses an obvious edge over state-of-the-art SAEAs.


Introduction
Evolutionary algorithms (EAs), such as differential evolution (DE) [1], genetic algorithm (GA) [2], and particle swarm optimization (PSO) [3], have been widely employed to solve real-world engineering optimization problems [4][5][6]. These EAs generally require a large number of fitness evaluations (FEs) to find a satisfying solution, which makes them unsuitable for computationally expensive optimization problems [7][8][9][10]. The reason is that a single FE of such expensive problems often consumes considerable time or material resources.
Surrogate models provide an effective tool to reduce computational cost by partly replacing computationally expensive FEs during the evolution process. Over the past decades, several types of surrogate models, including polynomial regression [11], radial basis function (RBF) [12][13][14], Gaussian process (GP) [15,16], and support vector machine [17], have been developed and deeply analyzed, yielding various surrogate-assisted evolutionary algorithms (SAEAs). According to the role of the employed surrogate model, existing SAEAs can be roughly divided into three categories: 1) global SAEAs, 2) local SAEAs, and 3) hierarchical SAEAs. Global surrogate models aim to approximate the expensive fitness function over the whole solution space [18][19][20][21][22]. By contrast, local surrogate models focus on a small solution region for the purpose of ensuring approximation accuracy [23][24][25][26]. Fusing the exploration capability of the global surrogate and the exploitation capability of the local surrogate, hierarchical SAEAs have attracted much research attention in recent years and have shown certain superiority over single-surrogate SAEAs on most expensive problems [27][28][29][30].
So far, SAEAs have achieved great success in tackling low- and medium-dimensional expensive problems, but they generally lose efficiency on high-dimensional problems due to the "curse of dimensionality" [31]. On the one hand, EAs cannot fully explore the huge solution space of a high-dimensional problem with acceptable computational resources. On the other hand, it is also impracticable to directly build a sufficiently accurate surrogate model with a limited number of available training samples. To address these two challenges, several beneficial attempts have been made. Tian et al. [32] developed a multiobjective sample infill criterion based on non-dominated sorting and enhanced the adaptability of a GP-assisted PSO algorithm. Li et al. [33] proposed a surrogate-assisted multiswarm optimization algorithm, where one swarm is specially evolved to enhance the exploration capability of the whole algorithm.
Besides, it has been verified that hierarchical SAEAs hold great potential in solving high-dimensional expensive problems [34][35][36][37]. Profiting from the "blessing of uncertainty" [38], the global surrogate model generally helps to smooth out some local optima and thus to reduce the search space, whereas the local surrogate model helps to identify better solutions in the located promising local regions. Taking a recently proposed hierarchical SAEA called evolutionary sampling assisted optimization (ESAO) [34] as an example, it builds a global RBF model to assist DE in conducting global search by prescreening promising solutions. On the other side, it performs local search by taking another RBF model, trained in the neighborhood of the current best solution, as the objective function. ESAO alternates between the two types of search whenever one of them fails to find a better solution. By this means, it not only enhances the possibility of finding the global optimum but also speeds up the optimization process.
Compared with the global surrogate model, the local model in a hierarchical SAEA is expected to be of much higher accuracy. Despite the reduced solution region, it is still a nontrivial task to build a sufficiently accurate local surrogate model in a high-dimensional solution space.
To alleviate this issue, this study proposes a random projection-enhanced hierarchical SAEA (RPHSA). RPHSA inherits the framework of ESAO, but adapts the local surrogate model therein, i.e., RBF, to high-dimensional problems with the random projection (RP) technique. As a commonly-used dimension reduction technique, RP can expediently project high-dimensional data onto subspaces of much lower dimension while largely maintaining the geometric structure among the data. To date, RP has been successfully applied in many fields such as signal processing [39], machine learning [40], and high-dimensional optimization [41], but has seldom been employed to train surrogate models for high-dimensional problems.
With the introduction of RP, RPHSA builds its local RBF model as follows: first randomly project original training samples onto several low-dimensional subspaces, then train low-dimensional RBF models in respective subspaces so as to capture the characteristics of the original problem from different perspectives, and finally construct the final RP-based RBF (RP-RBF) by averaging all low-dimensional RBF models. In this way, the final RP-RBF is expected to achieve ideal approximation capability with a small number of training samples and thus to enhance the performance of RPHSA.
The remainder of this paper is organized as follows. Section 2 briefly reviews the related work of SAEAs. Section 3 describes the proposed RP-RBF model and the resulting RPHSA algorithm in detail. Section 4 reports experimental settings and results along with some analyses. The conclusion is finally given in Section 5.

Related Work
SAEAs have been receiving more and more research attention in recent years for their effectiveness in solving computationally expensive problems. They replace most of the expensive real FEs with surrogate estimations during the optimization process, so that substantial computational resources can be saved and the optimization performance can be greatly improved. Early SAEAs tend to employ global surrogate models to fit the whole optimization problem. Their excellent performance on simple and low-dimensional expensive optimization problems has been verified. Jin et al. [18] adopted an artificial neural network as a global surrogate to assist the covariance matrix adaptation evolution strategy, and proposed an empirical criterion to switch between expensive real FEs and cheap fitness estimations during the evolutionary search process. Ratle [19] suggested using GP to construct a global surrogate so as to guide the search process. Regis et al. [20] developed a SAEA based on PSO and a global RBF model, where the former generates multiple trial positions for each particle in each iteration, while the latter prescreens the most promising trial positions to generate new particles. Liu et al. [21] employed a GP with a lower confidence bound to select promising solutions in the evolution process of a DE algorithm. They also utilized Sammon mapping, a dimension reduction technique, to enhance the surrogate accuracy on medium-scale expensive optimization. Dong et al. [22] proposed a multi-start approach with the goal of finding all the local optima of a pre-trained global GP model, and the search was then performed within the located locally optimal solution regions.
Global surrogate models take effect on expensive problems with simple fitness landscapes, but they suffer from the "curse of dimensionality" and cannot adapt well to complicated problems. To alleviate this issue, researchers developed local surrogate models to improve the approximation accuracy in local solution regions. Ong et al. [23] employed a trust-region method for the interleaved use of exact models for the objective and constraint functions with computationally cheap RBF models in the local search process. The idea of fitness inheritance suggested by Smith et al. [24] can also be seen as a local surrogate, where the fitness value of an individual is estimated based on its neighbors and parents. Similarly, Sun et al. [25] proposed a fitness estimation strategy to approximate the fitness of particles in PSO based on their positional relationship with other particles. Lu and Tang [26] integrated a local surrogate model into DE, where the model was used not only for regression but also for classification.
Compared with global surrogate models, local models are more likely to improve the solution quality, but they tend to lack the capability of jumping out of local optima. To achieve the complementary advantages of these two types of models, many studies in recent years have focused on developing hierarchical SAEAs by integrating global and local surrogates. Zhou et al. [27] proposed a hierarchical SAEA where a global GP and a local RBF network were jointly employed to assist a GA: the former was used to identify promising individuals at the global search level, while the latter was adopted to accelerate the convergence of a trust-region-enabled search strategy at the local search level. Tenne and Armfield [28] proposed a memetic optimization framework consisting of variable global and local surrogates, and employed RBF in a trust-region approach for expensive optimization. Sun et al. [29] introduced a two-layer surrogate-assisted PSO, where a global surrogate model was intended to smooth out some local optima of the original multimodal fitness functions and a local surrogate model was employed for fitness estimation. Inspired by committee-based active learning, Wang et al. [30] proposed an ensemble of several global surrogates to search for the best and most uncertain solutions to be evaluated by the real fitness function, and employed a local surrogate to model the neighborhood of the current best solution with the goal of further improving it by optimizing the model.
High-dimensional expensive problems have become a research hotspot in the field of SAEAs. In addition to the ESAO algorithm introduced in Section 1, some other significant research efforts have been made to push the boundary of SAEAs in solving this kind of problem. Sun et al. [35] proposed a surrogate-assisted cooperative swarm optimization algorithm, where an RBF-assisted social learning PSO focuses on exploration and a fitness estimation strategy-assisted PSO concentrates on local search. Yu et al. [36] embedded the social learning PSO into an RBF-assisted PSO framework. The former aims to find the optimum of the RBF model, which is constructed with a certain number of the best solutions found so far, and thereby refines the local approximation of the fitness landscape around the optimum; the latter conducts search in a wider solution region, enabling the RBF model to capture the global landscape of the fitness function. Yang et al. [37] developed a two-layer surrogate-assisted DE algorithm, which measures its evolutionary status according to the number of improvements of the best solution. According to the feedback status, three different DE mutation operators are employed to generate new offspring, which are further prescreened by a global or local GP model. Tian et al. [32] revealed that the approximation uncertainty of GP becomes less reliable on high-dimensional problems and that the commonly-used scalar sample infill criterion, which combines the approximated fitness and the approximation uncertainty in a scalar function, tends to lose efficacy. To overcome this defect, they developed a multiobjective sample infill criterion by treating the above two factors as separate objectives and selecting promising solutions according to non-dominated sorting, thereby achieving a good balance between exploitation and exploration. Li et al. [33] proposed a SAEA involving two swarms, where the first one uses the learner phase of teaching-learning-based optimization to enhance exploration and the second one uses PSO for faster convergence. Moreover, they also designed a novel sample infill criterion that selects particles predicted to achieve self-improvement for real FEs.
These recent studies enhance the adaptability of SAEAs on high-dimensional expensive problems. However, it is still an open problem to build accurate enough surrogate models, especially local models in hierarchical SAEAs, for high-dimensional problems. This study attempts to tackle this issue with the RP technique.

Proposed RPHSA
This section first introduces how to scale up RBF to high-dimensional problems with the RP technique, then discusses the integration of the resulting RP-RBF model and the basic ESAO algorithm, and finally presents the implementation of the proposed RPHSA algorithm.

RP-RBF model
As a commonly-used surrogate model, RBF has been shown to fit nonlinear functions well and to be capable of both local and global modeling [20,23,34-36]. The RBF model used in this paper has an interpolation form and can be formulated as follows:

$$\hat{f}(\mathbf{x}) = \sum_{i=1}^{n} \omega_i \varphi(\|\mathbf{x} - \mathbf{x}_i\|),$$

where $\|\cdot\|$, $\varphi(\cdot)$, and $\omega_i$ denote the Euclidean norm, the basis function, and the weight coefficient to be learnt, respectively, and $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are the training samples. There are several types of basis functions, including the multiquadric function, the Gaussian function, and splines.

Although the performance of RBF is relatively insensitive to the problem dimension, its accuracy on high-dimensional problems is still difficult to guarantee due to the huge modeling space and the very limited number of available training samples. To cope with this dilemma, a promising way is to reduce the modeling space with a dimension reduction technique. This study selects RP for this task owing to its simplicity and its capability of preserving the geometric structure among training samples [42].
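For concreteness, the following is a minimal Python sketch of such an interpolating RBF surrogate. The Gaussian basis function, the fixed width sigma, and the toy data are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal interpolating RBF surrogate matching the formula above:
# f_hat(x) = sum_i w_i * phi(||x - x_i||), with a Gaussian basis phi.
import numpy as np
from scipy.spatial.distance import cdist

def rbf_fit(X, f, sigma=1.0):
    # Solve Phi w = f, where Phi[i, j] = phi(||x_i - x_j||).
    Phi = np.exp(-(cdist(X, X) / sigma) ** 2)
    return np.linalg.solve(Phi, f)

def rbf_predict(X_train, w, X_query, sigma=1.0):
    Phi = np.exp(-(cdist(X_query, X_train) / sigma) ** 2)
    return Phi @ w

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 5))     # 30 truly evaluated samples in 5-D
f = np.sum(X ** 2, axis=1)               # stand-in for an expensive fitness
w = rbf_fit(X, f)
print(rbf_predict(X, w, X[:3]) - f[:3])  # ~0: the model interpolates
```

Because the model interpolates, the weights are fully determined by solving one dense linear system whose size equals the number of training samples; this is why the number of available samples, rather than computation, is the binding constraint in expensive optimization.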
To perform projection with RP, a random projection matrix independent of training data is required. According to [43], the Gaussian random projection matrix, each of whose elements obeys the standard normal distribution, meets this requirement.
To ensure the orthogonality required by RP, we further perform column orthogonalization on the initially generated Gaussian matrix. Let $P \in \mathbb{R}^{k \times d}$ denote the final projection matrix, with $d$ and $k$ being the space dimensions before and after projection, respectively. A set of low-dimensional training samples can then be obtained through the following projection operation:

$$\mathbf{y}_i = P\mathbf{x}_i, \quad i = 1, 2, \ldots, n,$$

where $(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$ are the original training samples. The theoretical foundation of RP is the well-known Johnson-Lindenstrauss lemma: for any $0 < \varepsilon < 1$ and any $n$ points in $\mathbb{R}^d$, there exists a mapping $f: \mathbb{R}^d \to \mathbb{R}^k$ with $k = O(\ln n / \varepsilon^2)$ such that

$$(1-\varepsilon)\|\mathbf{u}-\mathbf{v}\|^2 \le \|f(\mathbf{u})-f(\mathbf{v})\|^2 \le (1+\varepsilon)\|\mathbf{u}-\mathbf{v}\|^2$$

holds for any two of these points $\mathbf{u}$ and $\mathbf{v}$. This lemma suggests that the distance between any two points in a high-dimensional Euclidean space, i.e., the geometric structure of the original training data, can be preserved after projection within a certain error range, which depends on the dimension of the new space. The lower the new dimension, the more the difficulty of training an RBF is reduced, but the broader the error range tends to be, which is harmful to the accuracy of the trained RBF model. To balance this contradiction, the developed RP-RBF model first projects the original training samples onto a group of low-dimensional random subspaces instead of a single one, and then trains an RBF in each subspace. In this way, the original training samples can be shared across different subspaces, and a low-dimensional RBF capturing part of the characteristics of the original problem can easily be trained in each subspace. By averaging all the low-dimensional RBFs, the final RP-RBF is expected to learn more characteristics of the original problem and to achieve higher accuracy. Fig. 1 illustrates this construction process.
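The sketch below illustrates one plausible implementation of this construction, with scipy's RBFInterpolator standing in for the low-dimensional RBF models; the cubic kernel, the subspace count, and all names are our assumptions rather than the paper's exact choices.

```python
# RP-RBF sketch: train one RBF per random low-dimensional subspace and
# average the resulting models at prediction time.
import numpy as np
from scipy.interpolate import RBFInterpolator

def make_projection(d, k, rng):
    # Gaussian random matrix, then orthogonalization via QR, so that the
    # rows of the k x d projection matrix P are orthonormal.
    G = rng.standard_normal((d, k))
    Q, _ = np.linalg.qr(G)
    return Q.T

class RPRBF:
    def __init__(self, X, f, k=50, n_subspaces=10, seed=0):
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        self.projections = [make_projection(d, k, rng)
                            for _ in range(n_subspaces)]
        # One low-dimensional RBF per subspace; y_i = P x_i is the projection.
        self.models = [RBFInterpolator(X @ P.T, f, kernel="cubic")
                       for P in self.projections]

    def __call__(self, X_query):
        preds = [m(X_query @ P.T)
                 for m, P in zip(self.models, self.projections)]
        return np.mean(preds, axis=0)   # average all low-dimensional RBFs

# Toy usage on a 100-D problem with 100 training samples.
rng = np.random.default_rng(1)
X = rng.uniform(-5, 5, size=(100, 100))
f = np.sum(X ** 2, axis=1)
model = RPRBF(X, f, k=50, n_subspaces=10)
print(model(X[:3]), f[:3])
```

Note that every subspace model sees all of the original training samples, only in projected coordinates; this is precisely how the samples are shared across subspaces.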

Integration of RP-RBF and ESAO
As a recently developed hierarchical SAEA, ESAO maintains a global and a local RBF model to assist the global and the local search conducted by DE, respectively [34]. The global RBF is trained in each generation of DE with all truly evaluated solutions, and is employed to predict the fitness values of all offspring generated by DE through the mutation and crossover operators.
The offspring with the lowest prediction is further evaluated with the real fitness function and replaces its parent if it is better. ESAO also employs this offspring to update the current best solution and continues the global search process if it has better fitness; otherwise, ESAO switches to the local search process. The local RBF is trained with a certain number of the best solutions that have been truly evaluated. ESAO takes it as an approximate fitness function and employs DE to find its optimum, which then undergoes a real evaluation. If it is better than the current best solution, ESAO starts a new local search process after updating the current best solution and adding the new solution to the population of the global process; otherwise, ESAO returns to the global search process.
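To make the prescreening step concrete, here is a small sketch of the global-search logic just described; de_variation, global_model, and real_fe are hypothetical callables standing in for ESAO's actual components.

```python
import numpy as np

def global_search_step(population, fitness, global_model, de_variation, real_fe):
    # DE mutation + crossover produce one offspring per parent.
    offspring = de_variation(population)
    # Cheap surrogate predictions for all offspring; only the most
    # promising one consumes a real fitness evaluation.
    g = int(np.argmin(global_model(offspring)))
    f_g = real_fe(offspring[g])
    if f_g < fitness[g]:
        population[g], fitness[g] = offspring[g], f_g
        return True    # improvement found: stay in global search
    return False       # no improvement: switch to local search
```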
A straightforward way to integrate RP-RBF and ESAO is to build both the global and the local surrogate model of ESAO with RP-RBF. However, the proposed RPHSA replaces only the local model with RP-RBF, for three reasons. First, the local RBF model aims to capture the fitness landscape details of the current promising local solution region. It is directly used as the fitness function for local search and thus places high demands on estimation accuracy. On the contrary, the global RBF model is designed to describe all explored solution regions and is allowed to have a certain estimation error, such that some local optima can be smoothed out. Second, the structural characteristics of a local high-dimensional solution region are relatively simple and are more likely to be preserved and learned in low-dimensional subspaces. By contrast, the structural characteristics of the solution regions covered by the global RBF model are much more complicated and can hardly be modeled with high enough accuracy. Finally, experimental results in [34] reveal that more real FEs are conducted in the local search process than in the global one, which means that ESAO depends more on the local search process in seeking high-quality solutions. Therefore, it is promising to achieve better optimization performance by adopting RP-RBF as the local surrogate model in ESAO.
Following the basic framework of ESAO, Algorithm 2 presents the pseudocode of RPHSA. As suggested by ESAO, step 1 initializes the population for global search by optimal Latin hypercube sampling (OLHS) [44]. Steps 3-13 execute the global search, and steps 14-22 conduct the local search. The two types of search alternate whenever the current one fails to find a better solution. RPHSA terminates its optimization process when a stopping criterion is met, which is generally set as a maximum number of real FEs.
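The following condensed sketch mirrors the loop structure just outlined; olhs_init, global_step, build_rprbf, and de_optimize are hypothetical placeholders for the corresponding steps of Algorithm 2, not the paper's exact implementation.

```python
import numpy as np

def rphsa(olhs_init, real_fe, global_step, build_rprbf, de_optimize,
          max_fes, n_local=100):
    X = olhs_init()                                  # step 1: OLHS sampling
    f = np.array([real_fe(x) for x in X])
    fes, mode = len(X), "global"
    while fes < max_fes:                             # budget of real FEs
        if mode == "global":                         # steps 3-13
            improved, X, f, fes = global_step(X, f, real_fe, fes)
        else:                                        # steps 14-22
            idx = np.argsort(f)[:n_local]            # n best evaluated samples
            local_model = build_rprbf(X[idx], f[idx])
            x_new = de_optimize(local_model)         # DE on the RP-RBF model
            f_new, fes = real_fe(x_new), fes + 1
            improved = f_new < f.min()
            if improved:                             # feed back to the archive
                X, f = np.vstack([X, x_new]), np.append(f, f_new)
        if not improved:                             # alternate on failure
            mode = "local" if mode == "global" else "global"
    return X[np.argmin(f)], f.min()
```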

Experimental Studies
To investigate the effectiveness and efficiency of RPHSA, we tested it on six widely-used benchmark functions, which are summarized in Table 1, and empirically compared it with ESAO and some other state-of-the-art SAEAs for high-dimensional expensive problems.

Influence of parameters
We first examined the influence of the number of local training samples n and the subspace dimension k on the performance of RPHSA. Notably, the RP-RBF model in RPHSA requires much fewer training samples than traditional RBF models, which generally need at least 2d samples [34]. This merit mainly profits from the dimension reduction capability of RP. The unappealing performance of RPHSA with a large value of n may be attributed to two reasons. On the one hand, too many training samples either involve more than one local solution region or cause overfitting in a single local solution region. On the other hand, it is almost impractical for RP to simultaneously preserve the geometric structure among a large number of training samples. As for k, when it is set to a value from {40, 50, 60}, RPHSA demonstrates similar and acceptable performance.
However, the performance of RPHSA significantly deteriorates when k becomes smaller. This is understandable because a too small value of k makes the original high-dimensional training samples lose too much structural information during projection, which further reduces the approximation accuracy of RP-RBF. We also tested RPHSA with different combinations of k and n on the 200-D functions. It was found that the combination of k = 50 and n = 100 also enables RPHSA to obtain superior results. Therefore, we take this combination as the default setting of RPHSA and employ it in the following experiments.

Effectiveness of RP-RBF
To verify the effectiveness of the developed RP-RBF model, we empirically compared it with the traditional RBF model in terms of approximation accuracy and capability of enhancing optimization performance.
1) Approximation accuracy. In this experiment, we additionally built a traditional local RBF model in RPHSA besides RP-RBF. This model was built strictly according to the method described in ESAO; it did not participate in any algorithmic operation and was only used for comparison. In the middle and late evolution stages of RPHSA, we picked out a population of DE conducting local search and evaluated each individual therein with the real fitness function, the traditional RBF, and RP-RBF. The comparison of the resulting estimations indicates that RP-RBF has higher approximation accuracy than the traditional RBF and thus is more likely to help the optimizer find high-quality solutions. The superiority of RP-RBF also verifies the effectiveness of RP in reducing the burden of high-dimensional modeling.
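To illustrate how such an accuracy comparison can be carried out, the following self-contained sketch contrasts a plain high-dimensional RBF with an RP-RBF-style ensemble on a toy Sphere function; the test function, sample sizes, and scipy models are our stand-ins for the paper's actual setup, so the printed numbers are not the paper's results.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(42)
d, k, n_train, n_sub = 100, 50, 150, 10
sphere = lambda Z: np.sum(Z ** 2, axis=1)       # toy "expensive" fitness

X = rng.uniform(-5, 5, (n_train, d))
X_test = rng.uniform(-5, 5, (200, d))

# Plain RBF trained directly in the 100-D space.
plain = RBFInterpolator(X, sphere(X), kernel="cubic")

# RP-RBF-style ensemble: average RBFs trained in random 50-D subspaces.
preds = np.zeros(len(X_test))
for _ in range(n_sub):
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthogonalized Gaussian
    preds += RBFInterpolator(X @ Q, sphere(X), kernel="cubic")(X_test @ Q)
preds /= n_sub

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print("plain RBF RMSE:", rmse(plain(X_test), sphere(X_test)))
print("RP-RBF   RMSE:", rmse(preds, sphere(X_test)))
```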
2) Optimization performance. We verified the superiority of RP-RBF in enhancing optimization performance by directly comparing the optimization results of RPHSA and ESAO, since the only difference between the two algorithms lies in the local surrogate model. Table 2 reports the results on the six 100-D functions, including the mean, the standard deviation, and the result of the Wilcoxon rank sum test at a significance level of 0.05. The results in bold are the better ones, and the symbols "+", "-", and "=" indicate that the result of RPHSA is better than, worse than, and similar to the corresponding one of ESAO, respectively. It can be observed from Table 2 that RPHSA significantly outperforms ESAO on all benchmark functions except F5, and improves the results of ESAO on F1, F3, and F4 by at least an order of magnitude in terms of the mean. These results reveal that the local search of RPHSA is more efficient than that of ESAO, which benefits from the higher approximation accuracy of RP-RBF.
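For reference, the Wilcoxon rank sum test used above is available in SciPy; the sketch below uses randomly generated placeholder results, not the paper's data.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
rphsa_runs = rng.normal(1.0, 0.1, 20)   # placeholder: 20 independent runs
esao_runs = rng.normal(1.5, 0.2, 20)

_, p = ranksums(rphsa_runs, esao_runs)  # two-sided rank sum test
if p >= 0.05:
    symbol = "="                        # no significant difference
else:
    symbol = "+" if rphsa_runs.mean() < esao_runs.mean() else "-"
print(f"p = {p:.4f}, RPHSA vs ESAO: {symbol}")
```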

Comparison with state-of-the-art algorithms
To comprehensively verify the efficiency of RPHSA in solving high-dimensional expensive problems, we compared it with three state-of-the-art algorithms, namely ESAO [34], SA-COSO [35], and SHPSO [36]. From Tables 4 and 5 and Figs. 4 and 5, the following observations can be made:
1) RPHSA achieves the best overall solution quality on both the 100-D and 200-D problems. In conclusion, RPHSA can be seen as the champion on this set of functions.
2) RPHSA has good scalability. When the function dimension increases from 100 to 200, the performance of all four algorithms degenerates to a certain extent. Despite this, RPHSA suffers the smallest performance degeneration and shows a more obvious superiority over its competitors on most functions. The good scalability of RPHSA should be mainly attributed to the dimension reduction capability of RP.
3) RPHSA converges more stably and rapidly. It can be seen from Figs. 4 and 5 that, regardless of whether the function dimension is 100 or 200, RPHSA always keeps a more stable improvement tendency and continuously finds new and better solutions during the whole evolution process. Its convergence rate is also faster than those of the other algorithms on most functions, leading to better final optimization results.

Conclusion
In this paper, an RP-enhanced hierarchical SAEA, named RPHSA, is proposed to solve high-dimensional computationally expensive optimization problems. RPHSA inherits the framework of ESAO but builds the local surrogate model therein, i.e., RBF, with the help of RP, yielding a new local model named RP-RBF. Different from ESAO, which directly trains its local model in the original high-dimensional space, RPHSA trains a group of RBFs in respective subspaces generated by RP and constructs the final RP-RBF by averaging all the low-dimensional RBFs. With the introduction of RP, not only can the main characteristics of the original high-dimensional problem be preserved to a large extent, but the number of samples required for modeling can also be significantly reduced.
Experimental results on six 100-D and 200-D functions indicate that RP-RBF has higher approximation accuracy and thus greatly enhances the optimization capability of RPHSA. Compared with three state-of-the-art SAEAs, RPHSA presents obvious superiority in solution quality, scalability, and convergence performance.
In future work, we will attempt to apply RPHSA to solve some real-world high-dimensional expensive problems. It is also interesting to scale up other kinds of surrogate models such as GP with RP under the framework of RPHSA.