Comparison of synchronous and asynchronous parallelization of extreme surrogate-assisted multi-objective evolutionary algorithm

This paper investigates the integration of a surrogate-assisted multi-objective evolutionary algorithm (MOEA) and a parallel computation scheme to reduce the computing time until obtaining the optimal solutions in evolutionary algorithms (EAs). A surrogate-assisted MOEA solves multi-objective optimization problems while estimating the evaluation of solutions with a surrogate function. A surrogate function is produced by a machine learning model. This paper uses an extreme learning surrogate-assisted MOEA/D (ELMOEA/D), which utilizes one of the well-known MOEA algorithms, MOEA/D, and a machine learning technique, extreme learning machine (ELM). A parallelization of MOEA, on the other hand, evaluates solutions in parallel on multiple computing nodes to accelerate the optimization process. We consider a synchronous and an asynchronous parallel MOEA as a master-slave parallelization scheme for ELMOEA/D. We carry out an experiment with multi-objective optimization problems to compare the synchronous parallel ELMOEA/D with the asynchronous parallel ELMOEA/D. In the experiment, we simulate two settings of the evaluation time of solutions. One determines the evaluation time of solutions by the normal distribution with different variances. On the other hand, another evaluation time correlates to the objective function value. We compare the quality of solutions obtained by the parallel ELMOEA/D variants within a particular computing time. The experimental results show that the parallelization of ELMOEA/D significantly reduces the computational time. In addition, the integration of ELMOEA/D with the asynchronous parallelization scheme obtains higher quality of solutions quicker than the synchronous parallel ELMOEA/D.


Introduction
Evolutionary algorithms (EAs) have been applied to many real-world applications like engineering (Obayashi et al. 2010;Oyama et al. 2017), data mining (Devos et al. 2014;Soufan et al. 2015), electronics (Roberge et al. 2014), and nanoscience Davis et al. 2015) because of their high search capability without any preliminary knowledge of a target problem. Since many realworld applications include two or more objectives with conflict, i.e., a trade-off, multi-objective evolutionary algorithms (MOEAs) have attracted much attention for This work was supported by Japan Society for the Promotion of Science Grant-in-Aid for Young Scientists Grant Number JP19K20362. dealing with multiple objectives simultaneously. One problem when applying MOEAs to real-world applications is a necessity of enormous computing time to obtain optimal solutions. This is because MOEAs need to evaluate many tentative solutions through the optimization process, and the evaluation of tentative solutions is computationally expensive.
Previous works have proposed surrogate-assisted EAs (SAEAs) (Jin 2011;Haftka et al. 2016) to reduce the computational time required for the EA process. SAEAs construct a surrogate model that estimates the evaluation values of solutions from already evaluated solutions using machine learning techniques. A proper optimizer, like an MOEA, is executed using the evaluation estimated by a constructed surrogate model. Finally, the surrogate-assisted optimization obtains promising solutions that can exhibit superior actual evaluation value, and a time-consuming process actually evaluates them. The computational time of the estimation with a surrogate model is extremely shorter than the actual evaluation, and the actual evaluation is applied merely to promising solutions. Thus, an SAEA can reduce the computational time to obtain optimal solutions. Previous works have proposed some surrogate-assisted MOEAs, for example, MOEA/D-RBF (Zapotecas Martínez and Coello Coello 2013) or ParEGO (Knowles 2006), which use a Gaussian process or a radial based function (RBF) as a surrogate model. ELMOEA/D (Pavelski et al. 2014(Pavelski et al. , 2016, extreme learning assisted MOEA/D, is one of the state-of-the-art as a novel SAEA. This method adopts the surrogate model based on an extreme learning machine (ELM) (Huang et al. 2004), and MOEA/D Zhang and Li (2007) generates promising solutions with the ELM surrogate evaluation. Their work demonstrated that ELMOEA/D outperforms ParEGO and MOEA/D-RBF on multi-objective optimization benchmarks especially when the dimension of the search space is large.
To further reduce computational time, the parallelization of SAEAs can be considered as a potential attempt. Parallel EAs (PEAs) (Alba and Tomassini 2002;Alba et al. 2013) also have been studied in the last decades. In particular, a master-slave parallelization is a straightforward approach to implement PEAs. A master-slave PEA performs the main procedure of EAs, i.e., initialization, selection, variation, and replacement, on one master node. Many slave nodes, on the other hand, execute evaluations of newly generated solutions in parallel. Since the solution evaluation is the most time-consuming part in EAs, PEAs can reduce the computational time compared to sequentially evaluating solutions. In recent years, several works have studied the integration of SAEA with parallelization for reducing the computational time of the EA process. For example, a parallel efficient global optimization has been proposed (Wang et al. 2016) and applied to the optimization of microwave antennas . The work of Akinsolu et al. (2019) proposed parallel surrogateassisted MOEA with the Gaussian process and applied it to electromagnetic design optimization.
Master-slave PEAs can be classified into two schemes, synchronous PEAs (SPEAs) and asynchronous PEAs (APEAs). SPEAs wait for all evaluations of solutions done by slave nodes and generate a new population by utilizing all newly evaluated solutions. APEAs, on the other hand, continuously generate a new solution without waiting for evaluations of other solutions. Since APEAs do not wait for the slowest solution evaluation, it is possible to efficiently utilize the computing resource of slave nodes in a situation where the evaluation of solutions differs. The previous works that integrate SAEA and the parallelization have considered only the synchronous PEA. However, it is generally assumed that the evaluation time of solutions is enormous and generally differs. For this fact, we consider that the integration of an SAEA with the asynchronous parallelization scheme is effective. The purpose of this study is to clarify which scheme of the synchronous and the asynchronous scheme is suitable for SAEA. Specifically, considering the parallelization of ELMOEA/D, we compare a synchronous parallel ELMOEA/D (SP-ELMOEA/D) and an asynchronous parallel ELMOEA/D (AP-ELMOEA/D).
This paper conducts experiments that compare SP-ELMOEA/D with AP-ELMOEA/D to investigate which parallelization scheme is appropriate for ELMOEA/D. We test these methods on well-known multi-objective optimization benchmarks, ZDT series (Zitzler et al. 2000), WFG series (Huband et al. 2005), and DTLZ series (Deb et al. 2002). We also use a few benchmarks from realworld problems provided in the black-box optimization competition (Loshchilov and Glasmachers 2015). This paper employs a simulated master-slave parallel computing environment proposed in the work of Zȃvoianu et al. (2015) to measure the computational time of the competitive methods. We simulate two settings of the evaluation time in the experiment to investigate the influence of the different evaluation time characteristics. One evaluation time is determined by the normal distribution with the different variances, while another evaluation time correlates to the objective function value. We employ a hypervolume (HV) indicator (Zitzler and Thiele 1998) to assess the quality of obtained solutions.
The remaining of this paper is organized as follows. Sect. 2 firstly presents the description of a multi-objective optimization problem and explains the detail of MOEA/D. Section 3 shows the detail of ELM and ELMOEA/D. Section 4 briefly explains PEAs and proposes integrated algorithms of SP-ELMOEA/D and AP-ELMOEA/D. Section 5 describes the experimental setup to compare the synchronous and asynchronous ELMOEA/D on multi-objective optimization problem (MOP) benchmarks and real-world problems. Section 6 shows the results on MOP benchmarks with the normally-distributed evaluation time, while Sect. 7 shows the one with the fitness-correlated evaluation time. Section 8 shows the results on real-world problems. Finally, Sect. 9 concludes this research and shows future work.

Multi-objective evolutionary algorithms
This section firstly explains what multi-objective optimization problem is, and then, we show a detailed algorithm of MOEA/D (Zhang and Li 2007), which is an underlying algorithm of ELMOEA/D.

Multi-objective optimization problem
where x ¼ ðx 1 ; . . .; x D Þ is a D dimensional decision variable. x L;d and x U;d are the lower and upper bounds of the dth variable, respectively. f : X ! R m is a vector having m objective functions and maps the decision variable space X to the objective space R m .
It is difficult to obtain a single optimal solution in MOPs because each objective function is in a trade-off relationship. Therefore, the goal of MOPs is to acquire the Pareto optimal solution set. The Pareto optimal solution set is defined with the dominance relationship of solutions in MOPs denoted as follows: x 0 y , f j ðxÞ f j ðyÞ 8j 2 f1; . . .; mg f j ðxÞ\f j ðyÞ 9j 2 f1; . .
where x 0 y means x dominates y. The Pareto optimal solution set consists of solutions that are not dominated by any other solutions, and it is denoted as follows: 2.2 MOEA/D MOEA/D (Zhang and Li 2007) is a multi-objective evolutionary algorithm based on decomposition. It decomposes multiple objectives into many single objective subproblems based on an aggregation function associated with the weight vector uniformly distributed on the objective space. MOEA/D simultaneously performs many singleobjective searches with genetic operations within a local region determined by the neighborhood of the weight vectors. This paper adopts MOEA/D-DE (Li and Zhang 2009) that introduces a differential evolution (DE) operator (Storn and Price 1997;Das and Suganthan 2011) into MOEA/D as a crossover operator.
Algorithm 1 describes the detailed algorithm of MOEA/ D-DE. Firstly, the weight vector sets W are generated uniformly on the objective space. W consists of N (the population size) weight vectors k i ði ¼ 1; . . .; mÞ. The weight vector is used to decompose multiple objectives to a single objective with an aggregation function (detailed later). Then, the neighborhood set BðiÞ is generated, consisting of T weight vectors closest to the weight vector k i . Next, the initial population is generated and evaluated. The ideal point z Ã is calculated from the objective function values of the initial population. The parent solutions for the DE operator are selected from the neighborhood set BðiÞ with the probability of d, while selected from the entire population with the probability of ð1 À dÞ. After applying the DE operator and the polynomial mutation (PM) (Deb and Goyal 1996), a newly generated solution is evaluated, and the ideal point z Ã is updated. Finally, the population is updated according to the aggregation function value of a new solution. These procedures are repeated until the stopping criteria are satisfied.
Although previous works have proposed several aggregation functions, this research uses the Penalty Boundary Intersection (PBI) aggregation function. PBI is defined as Eqs. (6, 7 and 8): where h is the penalty factor.
3 Extreme learning surrogate-assisted MOEA/D This section introduces the original extreme learning assisted MOEA/D, ELMOEA/D (Pavelski et al. 2016). ELMOEA/D uses an extreme learning machine (ELM) (Huang et al. 2004) as a surrogate model and generates promising solutions by using MOEA/D-DE (Li and Zhang 2009) according to objective function values estimated by a constructed ELM model. In the following subsections, we first explain the overview of ELM, and then we provide the detail of ELMOEA/D.

Extreme learning machine (ELM)
ELM is a kind of machine learning technique (Huang et al. 2004). ELM is constructed as a single layer feed-forward neural network that consists of L hidden layer neurons weighted by a j ðj ¼ 1; . . .; LÞ, output weights b, and an activation function Gðx; a; bÞ. a indicates weights to a hidden neuron, while b indicates a bias value. The output of ELM is calculated as Y ¼ Hb, where b is a learned parameter, while H is a matrix of the activation function as The most notable feature of ELM is that the hidden layer weights a and biases b are randomly assigned and are not learned. The output weights b are only parameters to be learned from N distinct training samples of n-dimensional inputs x i with m dimensional outputs t i . In particular, b is is the matrix of desired outputs. ELM can use any piece-wise nonlinear continuous activation functions in the hidden layer, i.e., Gðx; a; bÞ. The original work of ELMOEA/D used the following three activation functions: -Sigmoid (SIG): -Gaussian (GAU): -Multiquadric (MQ): The advantages of ELM to neural networks and support vector machines (SVMs) are summarized as follows (Huang et al. 2004;Huang 2015): -The learning speed of ELM is extremely fast. It can train single-layer feed-forward networks much faster than classical learning algorithms. -Unlike traditional classic gradient-based learning algorithms, which intend to reach minimum training error but do not consider the magnitude of weights, ELM tends to reach not only the smallest training error but also the smallest norm of weights. Thus, ELM tends to have better generalization performance for feed-forward neural networks. -Unlike the traditional classic gradient-based learning algorithms that only work for differentiable activation functions, the ELM learning algorithm can be used to train single-layer feed-forward networks with nondifferentiable activation functions. -Unlike traditional classic gradient-based learning algorithms facing several issues like local minimum, improper learning rate, and overfitting, etc, ELM tends to reach the solutions straightforward without such trivial issues. The ELM learning algorithm looks much simpler than most learning algorithms for feed-forward neural networks. -Unlike SVM, which does not consider the feature representation and functioning roles of each inner hidden layer, ELM studies feature representations in each layer.

Algorithm of ELMOEA/D
Algorithm 2 briefly describes the algorithm of ELMOEA/D (see detail in the work of (Pavelski et al. 2016)).
ELMOEA/D uses Latin hypercube sampling (LHS) (McKay et al. 1979) as the sampling method in the initialization step. ELMOEA/D uses two weight vector sets, one is used for the evolution of MOEA/D, while another is used to select promising solutions. A weight vector set W used for the evolution of MOEA/D-DE is generated that uniformly spread on the objective space. The weight vector set consists of N weight vectors (W ¼ fk 1 ; . . .; k N g). Subsequently, a selector weight vector set W s used for the selection of N s promising solutions is generated, which consists of N s selector weight vectors Finally, a neighborhood vector set for the promising solution selection B s ðiÞ ¼ fk i;1 ; . . .; k i;K s g ðk i;j 2 WÞ is associated with each selector weight vector k s i 2 W s . Each neighborhood vector set B s ðiÞ consists of K s ð¼ N=N s Þ weight vectors in W that are closest to each selector weight vector k s i . After the initialization, ELMOEA/D constructs an ELM surrogate model by using a training set P t . ELMs are trained with different activation functions and different normalization parameters C in this step. The best ELM model that performs the smallest mean square error (MSE) for the training set is selected and used as a surrogate model. The constructed ELM model is used for evaluating solutions during the MOEA/D-DE procedure. The initial population is randomly selected from the non-dominated solutions in the archive population P a if its size exceeds the population size N. Otherwise, LHS generates the remaining solutions in the bound of ½l À r; l þ r, where l and r are the mean and the standard deviation calculated from the non-dominated solutions in P a .
The promising solutions for the actual evaluation are selected from the final population after the optimization of MOEA/D-DE. The aggregation function values of solutions in the final population are calculated for each k i;j 2 B s ðiÞ based on the estimated objective values by the surrogate model. Then, the best solution is selected from each neighborhood vector set B s ðiÞ, and totally, N s promising solutions are selected for the actual evaluation.
The previous work (Pavelski et al. 2016) reported that ELMOEA/D has the following advantages: -Since ELMOEA/D uses a simpler surrogate model with less user-specific parameters, it consumes less computational resources and is faster than the conventional surrogate-assisted EA. -ELMOEA/D showed promising results on the MOP benchmarks with high dimensional design variables compared with ParEGO and MOEA/D-RBF.
For these advantages, our research employs ELMOEA/D.

Parallelization of ELMOEA/D
This paper considers two schemes of ELMOEA/D parallelization, a synchronous parallel ELMOEA/D (SP-ELMOEA/D), and an asynchronous parallel ELMOEA/D (AP-ELMOEA/D). This section firstly introduces parallel EAs briefly, and then, we explain the detailed description of SP-ELMOEA/D and AP-ELMOEA/D. We finally introduce two selection mechanisms for AP-ELMOEA/D.

Parallel evolutionary algorithm
When applying EAs to real-world optimization problems, solution evaluations may take much computational time because of, for example, a requirement of physical simulation. Previous works have studied parallel EAs (PEAs) (Alba and Tomassini 2002;Alba et al. 2013) to speed up the optimization process of EAs by using multiple computing resources. In recent years, parallel MOEAs have been studied to reduce the computing time to approximate the Pareto front (Talbi 2019). A master-slave parallelization is one of the most typical and straightforward approaches of PEAs. A master-slave PEA performs the main procedure of EAs like initialization, selection, variation, and replacement on a single master computing node. At the same time, many slave computing nodes execute fitness evaluations in parallel. Such a masterslave parallelization has been applied to solving real-world problems. For example, the work of Barbera et al. (2018) applied the massively parallel EA to the phylogenetic placement problem. The work of Strofylas et al. (2018) proposed a parallel differential evolution method for calibrating a second-order macroscopic traffic flow.
Master-slave PEAs can be classified into two schemes, a synchronous evolution and an asynchronous one. Synchronous PEAs (SPEAs) is similar to generation based EAs, where the new population is generated after waiting for all solution evaluations. On the other hand, asynchronous PEAs (APEAs) is similar to steady-state EAs. A new solution is immediately generated whenever one solution evaluation completes without waiting for other solution evaluations. In general, an SPEA has a higher search capability than an APEA because an SPEA can utilize many evaluation information simultaneously when generating a new population. On the other hand, when the evaluation time of solutions differs from each other, i.e., the variance of the evaluation time is large, an APEA shows a higher search efficiency than in terms of the computational time Takadama 2013, 2014;Scott and De Jong 2015a, b). This is because APEAs can reduce the idling time of slave nodes that quickly complete the evaluation tasks, unlike SPEAs need to wait for the slowest solution evaluations.

Synchronous parallel ELMOEA/D
Algorithm 3 shows a master-slave synchronous parallel ELMOEA/D (SP-ELMOEA/D). SP-ELMOEA/D is an extension of ELMOEA/D, and the underlined texts denote the difference from Algorithm 2. At the initialization procedure, all generated solutions are sent to slave nodes in Step 1-2, and their evaluations are firstly waited for in Step 1-3. Then, the remaining initialization procedure is executed. After evolving solutions on MOEA/D-DE with an ELM surrogate, N s promising solutions are selected, and they are sent to slave nodes in Step 5-1. SP-ELMOEA/D waits for all evaluations from slave nodes in Step 5-2 and updates the archive population and the training set. SP-ELMOEA/D repeats these procedures until the terminal condition is satisfied.

Asynchronous parallel ELMOEA/D
AP-ELMOEA/D evaluates solutions in parallel as same as SP-ELMOEA/D but waits for only one solution evaluation every step. When one solution evaluation completes, it is appended to the archive population, and MOEA/D-DE with the ELM surrogate model generates a new promising solution. The differences between AP-ELMOEA/D and SP-ELMOEA/D are in Steps 4 and 5 in Algorithm 3. Concretely, these procedures are modified as Algorithm 4.
First, AP-ELMOEA/D waits for only one evaluation from a slave node in Step 5'-2, unlike SP-ELMOEA/D waits for all N s solution evaluations. In response to this, Steps 4' and 5' of AP-ELMOEA/D are modified to address one solution. In Step 4-1 in Algorithm 3, SP-ELMOEA/D selects N s promising solutions in each generation and actually evaluates them in parallel. On the other hand, AP-ELMOEA/D selects only one promising solution in Step 4'-3. For this selection, AP-ELMOEA/D needs to select an index k s of the selector weight vector in Step 4'-2 instead of considering all N s selector vectors. We take this into account in the next subsection. In addition to this, AP-ELMOEA/D addresses only one solution in the update of the archive population P gþ1 a and the training set P gþ1 t in Steps 5'-3 and 5'-4. Besides, because only one solution is selected for an actual evaluation every MOEA/D-DE optimization step, the surrogate model is constructed every N s actual evaluations in AP-ELMOEA/D not to increase the number of the ELM training.

Promising solution selection of AP-ELMOEA/ D
When applying the asynchronous evolution scheme to ELMOEA/D, we need to determine how to select one promising solution from the population optimized by MOEA/D-DE on the surrogate evaluation model. In particular, an index k s should be chosen Step 4' in Algorithm 4 whenever one solution evaluation completes. This paper explores simple two selection mechanisms of an index k s : the index order based selection and the random order selection.

Index order based selection:
A straightforward way to select an index k s is to use the order of the index of the neighborhood vector set. This paper denotes AP-ELMOEA/D with the promising selection using the order of the index of the neighborhood vector set as AP-ELMOEA/D-IO. AP-ELMOEA/D-IO selects a promising solution depending on the order of the index of the neighborhood vector set. AP-ELMOEA/D-IO initially set an index k s to 1, and whenever a promising solution is selected, k s is incremented as k s k s þ 1. If k s exceeds K s , which is the number of the neighborhood vector set, k s is reset to 1.

Random order selection
The second way is to choose an index k s every selection procedure randomly. This paper denotes this as AP-ELMOEA/D-RO. Whenever an index k s is selected, it is uniformly chosen from 1 to N s without considering the previous index, i.e., consecutive selection of the same index is allowed. Since an actual implementation of MOEA/D often uses a random permutation to select the aggregation vector during the optimization Fan et al. (2019); Nebro et al. (2015), the random order selection can be a valid strategy in AP-ELMOEA/D.

Experiment
We conduct an experiment to compare three different parallel ELMOEA/D variants, i.e., SP-ELMOEA/D, AP-ELMOEA/D-IO, and AP-ELMOEA/D-RO, to clarify which parallelization scheme is suitable for ELMOEA/D. Additionally, we compare the sequential ELMOEA/D (i.e., run on a single CPU) to confirm the effectiveness of the parallelization. The following subsections explain the benchmark problems used in the experiment, the experimental setting, and the evaluation criteria.
Additionally, we use four real-world problems provided by the black box optimization competition (Loshchilov and Glasmachers 2015). In particular, three are from the engineering design, a heat exchanger problem, a hydro-dynamics problem, and a vibrating platform problem, while one is from the operational research, a facility placement problem. All of them are bi-objective minimization, and D ¼ 16 for the heat exchanger problem, D ¼ 6 for the hydro-dynamics problem, D ¼ 5 for the vibrating platform problem, and D ¼ 20 for the facility placement problem.

Experimental setting
We conduct the experiment on the pseudo (simulated) master-slave parallel computing environment that measures the computational time according to the model proposed in the work of Zȃvoianu et al. (2015). In the experiment, we define the unit of the computing time as ''simulation time''. In particular, we define one unit simulation time as the time to complete the sequential tasks on the master node, i.e., the ELM learning and the MOEA/ D optimization with the surrogate evaluation.
We test two settings of evaluation time of solutions. One determines the evaluation time of any solutions by the normal distribution of N ðt p ; c v Â t p Þ simulation time on the distributed slave nodes. t p () 1) is the mean evaluation time, while the random variance is added according to the normal distribution with the standard deviation c v Â t p , where c v ðc v ! 0Þ is a parameter to determine the variance of the evaluation time. In this setting, the evaluation time does not depend on the feature of evaluated solutions. We call this evaluation time normally-distributed evaluation time.
Another setting defines the evaluation time of solutions correlates to the objective function value. In particular, the evaluation time is calculated as: where f 1 ðxÞ indicates the first objective function value of a solution x, t p () 1) is the base evaluation time when f 1 ðxÞ ¼ 1, while c g denotes the parameter to decides the gradient of the evaluation time. If c g [ 0, the evaluation time positively correlates to the first objective function value, i.e., the greater the objective function value, the longer the evaluation time. If c g \0, on the other hand, the evaluation time negatively correlates to the first objective function value, i.e., the greater the objective function value, the shorter the evaluation time. Since the evaluation time depends on the objective function value, the search direction of the AP-ELMOEA/D variants has the possibility to biased toward the search space where solutions quickly complete their evaluations (Scott and De Jong 2015a). This experiment uses the parameters shown in Table 1, most of which are the same as the original work (Pavelski et al. 2016). Notably, we use the MOEA/D-DE parameter settings that the original MOEA/D-DE work recommended (Li and Zhang 2009). Ten slave nodes are used for solution evaluations, which means N s ¼ 10 selected promising solutions can be evaluated simultaneously. To clarify how the variance of the evaluation time on the parallel computing environment affects the search performance, we consider different magnitudes of the variance c v ¼ f0:02; 0:05; 0:10; 0:20g in the normally-distributed evaluation time. On the other hand, in the fitness-correlated evaluation time, we use the different gradient parameters c g ¼ fAE0:2; AE0:5g to clarify the influence of the evaluation time gradient.
This paper uses the hypervolume (HV) indicator (Zitzler and Thiele 1998) to assess the quality of the achieved solution set by the methods. The HV indicator can evaluate both the convergence and the diversity of obtained solutions. For the benchmark problems, the scale of HV value differs in each benchmark problem. To clearly display the results in the table, we normalize the HV value by the ideal HV value calculated from the true Pareto front for each benchmark problem. The normalized HV value f HV is defined as follows: where HV indicates the HV value obtained by each method, while HV ideal indicates the ideal HV value. On the other hand, for the real-world problems, since the ideal HV value is unknown, we use raw HV values. The reference point, which is the extreme point of the area to be covered by acquired solutions, is required to calculate the HV value. We use the following reference points, which all are the same as the original work: -ZDT1-3, 6, WFG1-9, DTLZ2, 4-6: f5:0; 5:0g -ZDT4, DTLZ1, 3, 7: f100:0; 100:0g For the real-world problems, on the other hand, the reference point is set to f1:0; 1:0g, which is used in (Loshchilov and Glasmachers 2015).
We assess three methods from the viewpoint of the number of actual evaluations and the elapsed simulation time. The HV values of 30 independent trials are calculated for the maximum number of actual evaluations and the maximum elapsed simulation time. We set the maximum number of the actual evaluations as 2000 for all ZDT instances, 1000 for all WFG instances, and 200 for all DTLZ instances. For the real-world problems, we set the maximum number of the actual evaluations as 500 for the heat exchanger problem, 200 for the vibrating platform problem, and 1000 for the hydro-dynamics and the facility placement problems. The maximum elapsed simulation time, on the other hand, is set to 1:5 Â 10 5 simulation time for the ZDT instances, 7:0 Â 10 4 simulation time for the WFG instances, and 1:0 Â 10 4 simulation time for the DTLZ instances. For the real-world problems, we set the maximum elapsed simulation time to 3:0 Â 10 4 simulation time for the heat exchanger problem, 1:2 Â 10 4 simulation time for the vibrating platform problem, and 7:0 Â 10 4 simulation time for the hydro-dynamics and the facility placement problems. These maximum simulation times are chosen because all asynchronous schemes complete their executions before these computational times, though the synchronous one needs more computational time to complete its execution.

Results on benchmark problems with the normally-distributed evaluation time
This section shows the experimental results on the benchmark problems with the use of the normally-distributed evaluation time. First, we compare the HV values of the parallel ELMOEA/D variants at the maximum number of the actual evaluations. Then we compare them from the viewpoint of the elapsed simulation time.    The result that is significantly better than another is signed with a number of the method (1: SP-ELMOEA/D, 2: AP-ELMOEA/D-IO, 3: AP-ELMOEA/D-RO). The highest mean values are shown with in the bold style

Hypervolume at the maximum number of the actual evaluations
Comparison of synchronous and asynchronous parallelization of extreme surrogate-assisted multi… 199 bold style. We conduct a statistic test with the Mann-Whitney U test with a 5% significance level. If a method is significantly better than another, it is specified as the superscript character in the ''Mean'' column. ''1'' indicates that a corresponding method is significantly better than SP-ELMOEA/D, and similarly ''2'' indicates AP-ELMOEA/ D-IO, and ''3'' indicates AP-ELMOEA/D-RO. These results indicate that AP-ELMOEA/D-IO has the highest mean HV values in most cases, although the difference is not significant for many benchmarks. On the other hand, SP-ELMOEA/D significantly outperforms the AP-ELMOEA/D variants in ZDT3 and WFG9. There is no trend in the variance of the evaluation time. Table 3 shows an aggregation of the statistic test of the pairwise comparison. Each group of three columns indicates the compared two methods, while each column shows the count of the results of the statistic test. The ''1'' column indicates the first method is significantly better than the second one, while the ''2'' column indicates the second method is significantly better than the first one. If there is no significant difference, ''%'' is counted. These results reveal that the integration of the surrogateassisted EA with the asynchronous evolution scheme improves the search capability, even taking no account of the computational time.
6.2 Hypervolume at the maximum elapsed simulation time The mean normalized HV values at the maximum elapsed simulation time are shown in Table 4. The maximum mean values are shown in the bold style, and the results of the statistic test are specified as the superscript character. Both of them are the same as Table 2. Table 5 aggregates the statistic test of the pairwise comparison as same as Table 3. First, let us compare the results of the parallel and the sequential ELMOEA/Ds. It is confirmed that the parallel ELMOEA/Ds significantly outperforms the sequential ELMOEA/D for all benchmarks and all c v values. From this, the parallelization of ELMOEA/D contributes to reducing the computational time for the optimization.
Then, let us compare the parallelization schemes, i.e., the synchronous and the asynchronous parallel ELMOEA/ Ds. We can confirm a similar tendency to Table 2 from these results. The HV value of AP-ELMOEA/D-IO is significantly better than or equal to SP-ELMOEA/D in all benchmark problems. On the other hand, AP-ELMOEA/D-RO is also significantly better than or equal to SP-ELMOEA/D especially when the variance of the evaluation time is large. In particular, AP-ELMOEA/D-IO performs the highest mean HV values in most experimental cases. The results in Table 5 indicate that AP-ELMOEA/ D-IO is significantly better than SP-ELMOEA/D in 30 experimental cases of total of 84 cases. In comparison, AP-ELMOEA/D-RO is significantly better than SP-ELMOEA/ D in 15 experimental cases. Comparing AP-ELMOEA/D-IO and AP-ELMOEA/D-RO, AP-ELMOEA/D-IO is significantly better in 15 experimental cases.
Next, we compare the difference in performance as the variance of the evaluation time changes. When the variance of the evaluation time is small, the difference between SP-ELMOEA/D and the AP-ELMOEA/D variants is small. In particular, SP-ELMOEA/D outperforms AP-ELMOEA/D-RO in a few benchmarks. In contrast, when the variance of the evaluation time is large, the cases of AP-ELMOEA/Ds outperforming SP-ELMOEA/D increase. This result shows that as the variance of the evaluation time increases, the AP-ELMOEA/D variants shows higher performance.
From these results, it is revealed that the asynchronous evolution scheme also improves the search efficiency in terms of computational time. However, in some benchmark problems, there is no significant difference between SP- If the first algorithm wins against the second one, ''1'' is counted,  Comparison of synchronous and asynchronous parallelization of extreme surrogate-assisted multi… 201 The result that is significantly better than another is signed with a number of the method (1: SP-ELMOEA/D, 2: AP-ELMOEA/D-IO, 3: AP-ELMOEA/D-RO, 4: Sequential ELMOEA/D). The highest mean values are shown in the bold style ELMOEA/D and the AP-ELMOEA/D variants even though the evaluation time variance is large. The next subsection discusses the details of the transition of the HV values over the elapsed simulation time. In particular, we focus on ZDT4, ZDT6, WFG8, and DTLZ4, because there is no significant difference among SP-ELMOEA/D and the AP-ELMOEA/D variants in these problems. There is no significant difference over time between any pairs of the parallel ELMOEA/D variants in WFG8 and DTLZ4. On the other hand, in ZDT4, the significant difference between AP-ELMOEA/D-IO and SP-ELMOEA/D, and that between AP-ELMOEA/D-RO and SP-ELMOEA/ D can be found in the early and the middle term of the evolution when the variance of the evaluation time is low, though there is no significant difference in the final part of the evolution. This tendency can be found in the high variance in ZDT6, where AP-ELMOEA/D-RO significantly outperforms SP-ELMOEA/D in the middle term of the evolution. This result indicates that AP-ELMOEA/D-IO and AP-ELMOEA/D-RO reach to the final HV value quicker than SP-ELMOEA/D, even though the final HV value is almost the same.

Transition of HV over the elapsed simulation time
These results reveal that the convergence speed of the AP-ELMOEA/D variants is faster than SP-ELMOEA/D even though they finally achieve a similar quality of solutions.

Results on benchmark problems with the fitness-correlated evaluation time
This section shows the experimental results on the benchmark problems with the fitness-correlated evaluation time.
As same as the previous section, we first compare the HV values of the parallel ELMOEA/D variants at the maximum number of the actual evaluations, then we compare them from the viewpoint of the elapsed simulation time. Note that in this experiment, we exclude the results on the benchmarks of DTLZ1 and DTLZ3. This is because the range of the first objective function value of these benchmarks is vast (the maximum value is about 100.0), and the evaluation time calculated by Eq. (12) is not appropriate. Tables 6 and 7 show the mean normalized HV values and their standard deviations of the competitive methods at the maximum number of the actual evaluations with the fitness-correlated evaluation time. Table 6 shows the result of the positively correlated evaluation time. In contrast,   Table 7 shows the result of the negatively correlated evaluation time. These tables do not include the sequential ELMOEA/D for the same reason as Section 6.1. Tables 8  and 9 show aggregations of the statistic test of the pairwise comparison of the results in Tables 6 and 7. The notations in these tables are the same as in Sect. 6.1.

Hypervolume at the maximum number of the actual evaluations
The experimental results at the maximum number of the actual evaluations show that SP-ELMOEA/D and the AP-ELMOEA/D variants are less significantly different. In particular, compared with the previous section, the number of cases in which the AP-ELMOEA/D variants are significantly better than SP-ELMOEA/D decreases. At the same time, SP-ELMOEA/D significantly outperforms the AP-ELMOEA/D variants in some benchmarks. For example, SP-ELMOEA/D outperforms in the benchmarks of ZDT6, wFG9, and DTLZ3 when the evaluation time positively correlates to the first objective function value, while in ZDT3 when the evaluation time negatively correlates. However, AP-ELMOEA/D-IO comprehensively shows the highest average HV values in many of the benchmarks, and it is statistically equivalent to or better than the other two methods. These results indicate that although AP-ELMOEA/D-IO is affected by the evaluation time bias, it still performs high search capability.

Hypervolume at the maximum elapsed simulation time
Tables 10 and 11 show the mean normalized HV values and their standard deviations of the competitive methods at the maximum elapsed simulation time with the fitnesscorrelated evaluation time. Table 10 shows the result of the positively correlated evaluation time. In contrast, Table 11 shows the result of the negatively correlated evaluation time. Tables 12 and 13 show aggregations of the statistic test of the pairwise comparison of the results in Tables 10 and 11. The notations in these tables are the same as in Sect. 6.2. First, comparing the sequential ELMOEA/D and the parallel ELMOEA/Ds, it is shown that the parallel variants have significantly better performance than the sequential method in all benchmarks and all evaluation time setting. This result shows that the parallelization of ELMOEA/D contributes to shortening the computational time of the optimization even when the evaluation time is biased.
Next, comparing the three parallel ELMOEA/Ds, the AP-ELMOEA/D variants are significantly better than SP-ELMOEA/D in several benchmarks. However, the number of them is reduced compared to the case of the normallydistributed evaluation time. This is because the search direction of AP-ELMOEA/D is biased to the solution region with a short evaluation time if the evaluation time correlates to the objective function value. This causes to decrease the search capability of AP-ELMOEA/D. However, in terms of the computation time, the waiting time of SP-ELMOEA/D increases because the fitness-correlated evaluation time induces the evaluation time variance depending on the objective function value, thus its computational efficiency decreases. Therefore, although SP-ELMOEA/D is not affected by the evaluation time bias in search, it cannot outperform the AP-ELMOEA/D variants at the maximum elapsed evaluation time.
In this paper, we use ten slave nodes for the evaluation. However, APEAs are less affected by the evaluation time bias when the number of the slave nodes is small. Therefore, the performance deterioration of AP-ELMOEA/Ds was suppressed in our experiment. However, if we use more slave nodes, it may deteriorate the performance of the AP-ELMOEA/D variants.

Results on real-world problems
This section shows the experimental results on real-world problems. As same as the previous section, we first compare the HV values of the parallel ELMOEA/D variants at the maximum number of the actual evaluations, then we compare them from the viewpoint of the elapsed simulation time.  Table 14 shows the mean HV values and their standard deviation of the competitive methods at the maximum number of the actual evaluations with the normally-distributed evaluation time. We represent each problem as an abbreviation. ''HX'' denotes the heat exchanger problem, ''HD'' denotes the hydro-dynamics problem, ''VP'' denotes the vibrating platform problem, and ''FP'' denotes the facility placement problem. The best results are shown in the bold style. As same as the results on the benchmark problems, we conduct a statistic test with the Mann-Whitney U test with a 5% significance level, and its result is shown in this table as the same style in Table 2.
These results show that, in the heat exchanger problem and the facility placement problem, the AP-ELMOEA/D variants show better performance than SP-ELMOEA/D. Especially in some cases, the asynchronous variants significantly outperform the synchronous one. On the other hand, in the hydro-dynamics problem and the vibrating platform problem, AP-ELMOEA/D-RO shows worse performance than the other two methods. In particular, SP-ELMOEA/D shows the best mean HV values in the hydrodynamics problem for all evaluation time variances. Table 15 shows the mean HV values and their standard deviation of the competitive methods at the maximum number of the actual evaluations with the fitness-correlated evaluation time. The notation in this table is same as  Table 14. Even when the evaluation time correlates to the objective function value, the tendency is almost the same as the results with the normally-distributed evaluation time. In particular, in the heat exchanger problem and the facility placement problem, the AP-ELMOEA/D variants show better performance than SP-ELMOEA/D In the hydro-dynamics problem and the vibrating platform problem, on the other hand, AP-ELMOEA/D-RO is worse than the other two methods.
These results indicate that no clear difference is found in the real-world problems at the maximum number of the actual evaluations. However, AP-ELMOEA/D-IO shows averagely better performance than the other two methods, i.e., AP-ELMOEA/D-IO is never significantly worse. Table 16 shows the mean HV values and their standard deviation of the competitive methods at the maximum elapsed simulation time with the normally-distributed evaluation time. On the other hand, Table 17 shows the result with the fitness-correlated evaluation time. The notations of these tables are the same as Tables 14 and 15. These results include the sequential ELMOEA/D. First, from the result of the sequential ELMOEA/D, it is shown that the parallel variants all significantly outperform the sequential ELMOEA/D in all problems and both of the evaluation time settings. This result indicates that the parallelization of ELMOEA/D is even useful in real-world problems.

Hypervolume at the maximum elapsed simulation time
From the comparison of the parallel variants of ELMOEA/D in the normally-distributed evaluation time, no significant difference between three parallel ELMOEA/ Ds is found in the heat exchanger and the vibrating platform problems. Note that only in c v ¼ 0:2 in the vibrating platform problem, AP-ELMOEA/D-IO significantly outperforms SP-ELMOEA/D, where the variance of the evaluation time is enormous. On the other hand, in the facility placement problem, the asynchronous variants significantly outperform SP-ELMOEA/D regardless of the variance of the evaluation time. In the hydro-dynamics problem, SP-ELMOEA/D significantly outperforms the AP-ELMOEA/D variants when the variance of the evaluation time is low, i.e., c v ¼ f0:02; 0:05g. Meanwhile, SP-ELMOEA/D decreases its performance as the variance of the evaluation time increases even in the hydro-dynamics problem. On the other hand, the AP-ELMOEA/D variants keep their performance as the variance increases. Especially, AP-ELMOEA/D-IO shows the highest mean HV Comparison of synchronous and asynchronous parallelization of extreme surrogate-assisted multi… 207 The evaluation time of solutions increases as the 1st objective function value increases. The result that is significantly better than another is signed with a number of the method (1: SP-ELMOEA/D, 2: AP-ELMOEA/D-IO, 3: AP-ELMOEA/D-RO). The highest mean values are shown in the bold style  values in most cases and significantly outperforms the other two methods in some of the cases. Finally, we compare the parallel variants of ELMOEA/ D in the fitness-correlated evaluation. First, there is no significant difference between SP-ELMOEA/D and AP-ELMOEA/D-IO in all cases except for the facility placement problem with c g ¼ 0:5. On the other hand, although AP-ELMOEA/D-RO shows better performance only in the heat exchanger problem, it is inferior to AP-ELMOEA/D-IO in the other problems. In total, AP-ELMOEA/D-IO shows the highest mean HV values in most cases and significantly outperforms the other two methods in some of the cases.
These results reveal that AP-ELMEOA/D-IO shows high search performance in all real-world problems and all evaluation time settings. Although SP-ELMOEA/D shows the high performance when the variance of the evaluation time is low, its performance decreases as the variance increases.

Conclusion
This paper introduced a parallel evaluation scheme into a surrogate-assisted multi-objective evolutionary algorithm (MOEA) for further reducing the optimization time. As a surrogate-assisted MOEA, this paper used extreme learning assisted MOEA/D (ELMOEA/D) that uses an extreme learning machine (ELM) as a surrogate model and MOEA/ D-DE as an optimizer. On the other hand, we considered two different schemes of the parallelization, i.e., a synchronous evolution scheme and an asynchronous evolution If the first algorithm wins against the second one, ''1'' is counted, while ''2'' is counted if the second one wins. If two are a tie, ''%'' is counted. ''Sync.'' indicates SP-ELMOEA/D, ''IO'' indicates AP-ELMOEA/D-IO, and ''RO'' indicates AP-ELMOEA/D-RO If the first algorithm wins against the second one, ''1'' is counted,   The evaluation time of solutions increases as the 1st objective function value increases. The result that is significantly better than another is signed with a number of the method (1: SP-ELMOEA/D, 2: AP-ELMOEA/D-IO, 3: AP-ELMOEA/D-RO, 4: Sequential ELMOEA/D). The highest mean values are shown in the bold style Comparison of synchronous and asynchronous parallelization of extreme surrogate-assisted multi… 211 The evaluation time of solutions increases as the 1st objective function value decreases. The result that is significantly better than another is signed with a number of the method ( One is the index order selection that selects a promising solution depending on the order of the index of the neighborhood vector set. Another is the random order selection that randomly select the index of the neighborhood vector set. We conducted the experiment on multi-objective optimization problems to compare the sequential ELMOEA/D, SP-ELMOEA/D, and the AP-ELMOEA/D variants. The experiment used multi-objective benchmarks and realworld problems. The experiment was conducted on the pseudo (simulated) parallel computing environment and considered two settings of evaluation time of solutions; the Table 12 An aggregation of the pairwise statistical test by using Mann-Whitney U test with 5% of significance level for the result at the maximum elapsed simulation time with the fitness-correlated evaluation time (c g [ 0) The result that is significantly better than another is signed with a number of the method (  The result that is significantly better than another is signed with a number of the method (  The result that is significantly better than another is signed with a number of the method (  The result that is significantly better than another is signed with a number of the method ( normally-distributed evaluation time and the fitness-correlated evaluation time.
The experimental results first revealed that the parallelization of ELMOEA/D significantly reduces the computational time compared with the sequential one. From the comparison of the synchronous and the asynchronous parallelization schemes, the asynchronous evolution scheme outperforms the synchronous one, in particular, the asynchronous one converges faster even if the variance of the evaluation time is enormous. Even though the evaluation time is biased, the asynchronous evolution scheme has an advantage to the synchronous one. This is because the synchronous evolution scheme still decreases its computational efficiency due to the difference of the evaluation time. Moreover, although two promising solution selection strategies are not significantly different, the promising solution selection based on the index order showed the best performance in most test problems.
In the future, we will conduct experiments using not only ELMOEA/D but also other surrogate-assisted EAs. For example, the method proposed in (Habib et al. 2019) proposes a novel surrogate-assisted MOEA/D that uses multiple surrogate models. We can apply the parallelization scheme we have proposed to such MOEA/D-based approaches. For further improvement of the proposed method, we will introduce a preference-based EA like RVEA (Cheng et al. 2016) to AP-ELMOEA/D instead of MOEA/D. Since AP-ELMOEA/D selects only a single promising solution, we can improve the quality of the promising solution by concentrating the search only around the target weight vector.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.