Introduction

Multi-objective optimization problems (MOPs) arise widely in real-world applications, for example, industrial scheduling [21], software engineering [19], and control system design [10]. A multi-objective optimization problem can be mathematically stated as follows:

$$\begin{aligned} \min&\mathbf {F}(\mathbf {x})=(f_1(\mathbf {x}), f_2(\mathbf {x}), \ldots , f_m(\mathbf {x})) \nonumber \\ s.t.&\mathbf {x}_{\min } \le \mathbf {x} \le \mathbf {x}_{\max }, \end{aligned}$$
(1)

where \(\mathbf {x} = (x_1, x_2, \ldots , x_D) \in \mathfrak {R}^D\) is a solution in the D-dimensional decision space, \(F: \mathfrak {R}^D \rightarrow \mathfrak {R}^m\) consists of m objective functions \(f_i(\mathbf {x}), i=1,2,\ldots ,m\), and \(\mathfrak {R}^m\) denotes the m-dimensional objective space. In general, because the objectives conflict with each other, no single solution can optimize all of them simultaneously; instead, a set of trade-off solutions, called Pareto-optimal (or non-dominated) solutions [7], is sought. The set of all Pareto-optimal solutions is called the Pareto set (PS), and its image in the objective space is the Pareto front (PF).
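The dominance relation underlying Pareto optimality can be written in a few lines. The following is a minimal sketch assuming minimization, as in Eq. (1); the function name is our own:

```python
import numpy as np

def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb under minimization:
    fa is no worse in every objective and strictly better in at least one."""
    fa, fb = np.asarray(fa), np.asarray(fb)
    return bool(np.all(fa <= fb) and np.any(fa < fb))
```

Two solutions where neither dominates the other (e.g., (1, 3) and (2, 2)) are mutually non-dominated trade-offs, which is why a whole Pareto set, rather than a single optimum, is sought.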

Various evolutionary multi-objective optimization (EMO) methods have been proposed for solving multi-objective optimization problems [1, 2, 9, 31]. In recent years, methods for problems with more than three objectives, known as many-objective optimization problems (MaOPs), have received increasing attention, because the performance of canonical multi-objective algorithms degrades quickly as the number of objectives increases. The approaches proposed for solving MaOPs can be roughly classified into three categories.

The first category consists of multi-/many-objective optimization algorithms based on the dominance relationship. The most representative one for multi-objective problems is NSGA-II [9], proposed by Deb et al. in 2002. However, the performance of NSGA-II deteriorates as the number of objectives increases because of the loss of selection pressure. Therefore, researchers have focused on more efficient dominance-based strategies for many-objective optimization, such as \(\varepsilon \)-dominance [12], \(\theta \)-dominance [30], and fuzzy Pareto dominance [23]. Yang et al. [27] proposed a grid-based many-objective evolutionary algorithm (GrEA), in which grid dominance and grid difference are used to strengthen the selection pressure. Zhang et al. [32] proposed a knee-point-driven many-objective evolutionary algorithm (KnEA), in which the distance between a hyperplane and a knee point is used to select better non-dominated solutions, greatly improving the selection pressure.

The second category consists of multi-/many-objective optimization algorithms based on a decomposition strategy, which can be further divided into two types: in the first, the multi-/many-objective optimization problem is transformed into a set of single-objective optimization problems [14, 15, 25, 29, 31]; in the second, the complex multi-objective problem is transformed into a set of simpler multi-objective optimization problems [8, 18]. In [26], Xiang et al. proposed a vector-angle-based many-objective evolutionary algorithm (VaEA), which uses a maximum-vector-angle-first principle and a worse-elimination principle to maintain the diversity and convergence of the population. Cheng et al. [4] proposed a many-objective evolutionary algorithm guided by a set of reference vectors (RVEA), and Jiang et al. [11] proposed a many-objective evolutionary algorithm based on reference directions (SPEA/R).

Indicator-based evolutionary algorithms fall into the third category, in which a performance indicator, rather than raw fitness, is used to select individuals. Zitzler and Künzli proposed the indicator-based evolutionary algorithm (IBEA) [33], in which a binary performance measure is used in the selection process. Bader et al. [1] proposed a hypervolume estimation algorithm for many-objective optimization, called HypE, in which Monte Carlo simulation is used to approximate the exact hypervolume values. Tian et al. [22] proposed an indicator-based multi-objective evolutionary algorithm with reference point adaptation (AR-MOEA), which adjusts the positions of the reference points based on their indicator contributions to improve performance on problems with irregular Pareto fronts.

In recent years, algorithms combining the above three strategies have also been proposed. Li et al. [13] proposed MOEA/DD, which utilizes both decomposition and dominance. Based on a performance indicator and the dominance relationship, Wang et al. [24] proposed the Two_Arch2 algorithm. Deb and Jain [8] extended the well-known NSGA-II and proposed NSGA-III for many-objective optimization problems, in which a set of reference points is combined with the non-dominated sorting mechanism to maintain the diversity of the population during the search.

A literature review shows that only a small number of algorithms within the PSO framework have been proposed for solving many-objective optimization problems. We believe the reason is the fast convergence of PSO, which may fail to provide enough diversity for finding the whole optimal Pareto front. In this paper, a modified particle swarm optimization with a decomposition strategy and different ideal points, called MPSO/DD, is proposed, in which the decomposition strategy ensures the uniformity of the final solution set and multiple ideal points drive the population to converge quickly to the optimal front. The learning strategy proposed by Cheng and Jin [3] is adopted to update the position of each individual, in which the demonstrators are the individuals with smaller distances to the ideal point along the reference vector.

The paper is organized as follows: Section 2 describes the proposed method in detail. Experimental results and discussions are given in Section 3. Finally, Section 4 concludes the paper and outlines future work.

The proposed MPSO/DD

Overall framework

Algorithm 1 gives the pseudocode of the proposed MPSO/DD algorithm. First, a set of reference vectors \({\lambda }_i=(\lambda _{i1},\lambda _{i2},\ldots ,\lambda _{im}),i=1,2,\ldots ,N\) is generated in the objective space. Then a population, in which each individual i has its own position \(\mathbf {x}_i = (x_{i1}, x_{i2}, \ldots , x_{iD})\) and velocity \(\mathbf {v}_i = (v_{i1}, v_{i2}, \ldots , v_{iD}), i=1, 2, \ldots , N\), is initialized randomly within the lower and upper bounds and evaluated on the objective functions. All non-dominated solutions in the population are saved to the archive Arc. While the stopping criterion is not met, the following steps are repeated. First, the ideal point of each reference vector is determined using all non-dominated solutions. Next, for each individual, its Tchebycheff values on all reference vectors are sorted in ascending order, the first reference vector in this order that has not yet been associated with any individual is found, and that reference vector is assigned to the current individual; hence, each individual is associated with one and only one reference vector. After that, the neighbors of each individual are used to update its position, generating a new offspring population. Then, a new parent population is selected from the parent and offspring populations according to the environmental selection strategy proposed in [20]. Finally, the non-dominated solutions stored in the external archive are updated using the population obtained by environmental selection, and are output when the termination condition is satisfied (Step 13 of Algorithm 1).

In the following, we give a detailed description of the main parts of Algorithm 1:

figure a
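The one-to-one association between individuals and reference vectors described above can be sketched as follows. This is a minimal illustration with our own variable names, assuming the standard Tchebycheff scalarizing function \(g(\mathbf {x}|{\lambda },\mathbf {z}) = \max _k \lambda _k |f_k(\mathbf {x})-z_k|\) and a single shared reference point z for the ranking; the greedy first-free choice follows the textual description:

```python
import numpy as np

def assign_reference_vectors(F, lambdas, z):
    """Greedy one-to-one association of N individuals with N reference vectors.
    F: (N, m) objective matrix, lambdas: (N, m) reference vectors, z: (m,) ideal point.
    Each individual ranks all reference vectors by its Tchebycheff value (ascending)
    and takes the best one that is still free, so every vector gets exactly one individual."""
    N = len(F)
    # tch[i, j]: Tchebycheff value of individual i on reference vector j
    tch = np.max(lambdas[None, :, :] * np.abs(F[:, None, :] - z[None, None, :]), axis=2)
    taken = np.zeros(N, dtype=bool)
    assign = np.full(N, -1)
    for i in range(N):
        for j in np.argsort(tch[i]):
            if not taken[j]:
                assign[i] = j
                taken[j] = True
                break
    return assign
```

Note that the result depends on the processing order of the individuals; the sketch processes them in index order, which matches the sequential description in Algorithm 1.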

Ideal point generation

Different from previously proposed decomposition-based methods, in which only one ideal point is used during the whole evolution, in our method each reference vector has its own ideal point, determined by the objective values of the individuals in the non-dominated archive, to speed up convergence along that reference vector. Figure 1 gives a simple example of our strategy for generating the ideal point of each reference vector. In Fig. 1, given an arbitrary reference vector \({\lambda }_i\), the red circles represent the non-dominated individuals in Arc, and the yellow circle is the ideal point, whose projection onto \({\lambda }_i\) has the minimum distance to the origin among the five non-dominated individuals. Equation (2) gives the distance of each individual in Arc to the origin along the reference vector.

$$\begin{aligned} d_{ij}= \frac{||\mathbf {F}_{norm}(\mathbf {x}_j)^T {\lambda }_{i}||}{||{\lambda }_{i}||},j=1,2,\ldots ,K, \end{aligned}$$
(2)

where

$$\begin{aligned} F_{norm, k}(\mathbf {x}_j)=\frac{f_k(\mathbf {x}_j)-f_{\min ,k}}{f_{\max ,k}-f_{\min ,k}}. \end{aligned}$$
(3)

In Eqs. (2) and (3), \(\mathbf {F}_{norm}=(F_{norm, 1}, F_{norm, 2}, \ldots , F_{norm,m})\) is the normalized objective vector, and \({\lambda }_i\) is the current reference vector. \(f_{\max ,k}\) and \(f_{\min ,k}\) are the maximum and minimum values of the kth objective in the non-dominated solution set Arc, respectively, and K is the size of the current non-dominated archive Arc.

Fig. 1
figure 1

An example to show the ideal point setting

Algorithm 2 gives the pseudocode for determining the ideal points. In Algorithm 2, |Arc| and \(|{\lambda }|\) denote the number of non-dominated solutions in the archive Arc and the number of reference vectors, respectively. For each non-dominated solution in Arc, the distance between its projection onto the reference vector \({\lambda }_i\) and the origin is calculated; the projected point with the minimal distance to the origin along the reference vector becomes the ideal point of that reference vector.

figure b
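The ideal-point computation of Eqs. (2)–(3) and Algorithm 2 can be sketched compactly. This is an illustrative implementation with our own variable names; it returns, for each reference vector, the projection onto that vector of the archive member with the smallest projected distance to the origin:

```python
import numpy as np

def ideal_points(arc_F, lambdas):
    """Per-reference-vector ideal points.
    arc_F: (K, m) objectives of the non-dominated archive Arc.
    lambdas: (N, m) reference vectors.
    Returns an (N, m) array: row i is the minimal-distance projection point on lambda_i."""
    f_min, f_max = arc_F.min(axis=0), arc_F.max(axis=0)
    # Eq. (3): min-max normalization (guard against a degenerate objective range)
    F_norm = (arc_F - f_min) / np.where(f_max > f_min, f_max - f_min, 1.0)
    units = lambdas / np.linalg.norm(lambdas, axis=1, keepdims=True)
    # Eq. (2): d[j, i] is the projected distance of archive member j along lambda_i
    d = F_norm @ units.T
    j_best = d.argmin(axis=0)  # closest archive member per reference vector
    return d[j_best, np.arange(len(lambdas))][:, None] * units
```

Because the projection lengths are computed on normalized objectives, each ideal point adapts as the archive, and hence \(f_{\min ,k}\) and \(f_{\max ,k}\), evolves.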

The offspring generation

In the original social learning particle swarm optimization proposed by Cheng and Jin [3], the velocity and position of each individual are updated as follows:

$$\begin{aligned} v_{ij}(t{+}1)= & {} r_1 v_{ij}(t) {+} r_2 (x_{wj}(t){-}x_{ij}(t)) {+} r_3 (\bar{x}_{j}(t){-}x_{ij}(t))\nonumber \\ \end{aligned}$$
(4)
$$\begin{aligned} x_{ij}(t{+}1)= & {} x_{ij}(t){+}v_{ij}(t{+}1), \end{aligned}$$
(5)

where \(v_{ij}\) and \(x_{ij}\) are the jth velocity and position components of individual i, respectively, \(x_{wj}\) is the jth position component of an individual w whose fitness is better than that of individual i, and \(\bar{x}_{j}\) is the mean position of the current population on the jth dimension. \(r_1\), \(r_2\) and \(r_3\) are random numbers generated uniformly between 0 and 1. The original social learning particle swarm optimization was proposed for single-objective problems and has shown good performance in finding better optima, especially on large-scale problems, because of its good diversity. However, in multi-/many-objective optimization, individuals normally do not dominate each other, and it is difficult to tell which individual is better based only on the objective values, especially as the number of objectives increases. Therefore, in our method, for an individual i we first calculate the distances between the individuals in its neighborhood and the ideal point of the reference vector that individual i is associated with (Eq. (6)). In Eq. (6), \(\mathbf {F}(\mathbf {x}_j), j = 1, 2 ,\ldots ,|NI|\) is the objective vector of individual j in the neighborhood of individual i, |NI| is the number of neighbors of individual i, \(\mathbf {IP}_i\) is the ideal point of reference vector \({\lambda }_i\), and \(d1_{i,j}\) is the distance between individual j and \(\mathbf {IP}_i\). All distances are sorted in descending order, and accordingly individual i can learn from those neighbors that have converged closer to the Pareto front. Eqs. (7) and (8) are both used, with certain probabilities, to update the velocity of an individual, in order to prevent the population from falling into local optima. The convergence speed of the social learning particle swarm optimization algorithm is limited by its good diversity; therefore, the constriction coefficient proposed in [5], i.e., 0.729, is utilized in Eq. (7) to speed up convergence. In Eq. (7), \(x_{wj}\) represents the jth dimension of an individual w whose distance to the ideal point along the current reference vector is smaller than that of individual i. \(r_1\) and \(r_2\) are random numbers generated uniformly between 0 and 1. Equation (8) randomly re-initializes the velocity so as to jump out of a local optimum.

$$\begin{aligned}&d1_{i,j}= ||\mathbf {F}(\mathbf {x}_j)-\mathbf {IP}_i|| \end{aligned}$$
(6)
$$\begin{aligned}&v_{ij}=0.729*(r_1* v_{ij}+r_2* (x_{wj}-x_{ij})) \end{aligned}$$
(7)
$$\begin{aligned}&v_{ij}=r_1* (v_{j,\max }-v_{j,\min })+v_{j,\min }. \end{aligned}$$
(8)

Algorithm 3 gives the pseudocode of offspring generation. For each reference vector, the distances from its neighboring individuals to its ideal point are first calculated and sorted in descending order. Figure 2 gives a simple example of how the demonstrator is selected according to the distance to the ideal point. The best position, on the right-hand side, is the individual with the minimal distance to the ideal point along the reference vector. A threshold, 0.99, is given empirically in line 7 of Algorithm 3 for determining which equation is used for the velocity update. To examine the effect of these parameter settings, we conducted three empirical experiments on DTLZ3 with different numbers of objectives:

Fig. 2
figure 2

Demonstration selection

Case1: Without the coefficient 0.729 in Eq. (7).

Case2: Only Eq. (7) is utilized in the proposed method.

Case3: The threshold for selecting the velocity-update equation is set to half of 0.99, i.e., 0.495.

Table 1 gives the results of the three cases as well as our proposed setting. From Table 1, we can see that the results obtained in Case2 and Case3 are all worse than those obtained by MPSO/DD, which shows that 0.99 is the best threshold for selecting the velocity-update equation. Compared with Case1, the proposed MPSO/DD obtains better or competitive results on the DTLZ3 problem with more than 10 objectives, which shows that the coefficient 0.729 is significant for the convergence of the algorithm on problems with high-dimensional objective spaces.

figure c
Table 1 The statistical results (mean and standard deviation) of the IGD values obtained by three cases and MPSO/DD on DTLZ3

The environmental selection

After the objective evaluation of each offspring, the parent and offspring individuals are combined, and the CAD value proposed in [20] is calculated for each individual on each reference vector, where

$$\begin{aligned} CAD(i,j) = \frac{\cos \theta _{i,j}}{d1_{i,j}}. \end{aligned}$$
(9)

In Eq. (9), \(\cos \theta _{i,j}\) is the cosine of the angle between the ith reference vector and the jth individual in the combined parent and offspring population. The larger \(\cos \theta \) is (i.e., the smaller the angle is), the closer the individual lies to the reference vector, which promotes an even distribution of individuals in the objective space. \(d1_{i,j}\) is calculated using Eq. (6). Clearly, the larger the CAD value is, the better the individual balances diversity and convergence.

Algorithm 4 gives the pseudocode of the environmental selection. For each individual in the parent and offspring populations, the CAD value with respect to each reference vector is calculated, and the individual with the maximum CAD value for each reference vector is kept for the next generation.

figure d
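The CAD-based selection of Eq. (9) and Algorithm 4 can be sketched as follows. This is an illustrative implementation with our own variable names; the small epsilon guarding the divisions is our own numerical safeguard:

```python
import numpy as np

def environmental_selection(F, lambdas, ideal_pts, eps=1e-12):
    """Select one survivor per reference vector by maximum CAD (Eq. (9)).
    F: (2N, m) objectives of the combined parent+offspring population.
    lambdas: (N, m) reference vectors; ideal_pts: (N, m) per-vector ideal points.
    Returns the (N,) indices into F of the selected individuals."""
    norm_F = np.linalg.norm(F, axis=1, keepdims=True) + eps
    norm_L = np.linalg.norm(lambdas, axis=1, keepdims=True)
    cos = (F / norm_F) @ (lambdas / norm_L).T  # cos(theta_{i,j}), shape (2N, N)
    # Eq. (6): distance from each individual to each reference vector's ideal point
    d1 = np.linalg.norm(F[:, None, :] - ideal_pts[None, :, :], axis=2) + eps
    cad = cos / d1                             # Eq. (9)
    return cad.argmax(axis=0)                  # one survivor per reference vector
```

Dividing the angle term by the convergence term means an individual must both lie close to the reference vector and be close to its ideal point to survive, which is exactly the diversity/convergence balance described above.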

The archive updating

All non-dominated solutions are saved in the archive Arc. When a new population has been generated and evaluated on the objective functions, it is used to update the archive Arc. Algorithm 5 gives the pseudocode of the archive update. In Algorithm 5, \(Arc(t-1)\) denotes the archive at generation \(t-1\) and P is the offspring population. Note that the size of the archive is fixed to the population size N.

figure e
Table 2 The parameter setting in the experiments
Table 3 The number of reference vectors related to the number of objectives
Table 4 The statistical results(mean and standard deviation) of the IGD values obtained by NSGA-III, KnEA, RVEA, MOEA/DD, SPEAR, GrEA, BiGE and MPSO/DD on DTLZ1 TO DTLZ7

In Algorithm 5, the offspring population P is first combined with the non-dominated individuals in the archive \(Arc(t-1)\). If the size of Arc(t) exceeds the population size, only the individual with the maximum CAD value on each reference vector is kept in Arc(t). Otherwise, all individuals in \(Arc(t-1)\) are kept in Arc(t).
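The non-dominated filtering applied to the combined set in Algorithm 5 can be sketched as follows. This is a simple quadratic-time illustration (sufficient for archive sizes on the order of the population size N); the function name is our own:

```python
import numpy as np

def nondominated(F):
    """Indices of the non-dominated rows of F under minimization:
    a row is kept unless some other row is no worse in every objective
    and strictly better in at least one."""
    n = len(F)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                keep[i] = False  # row i is dominated by row j
                break
    return np.flatnonzero(keep)
```

The surviving indices form the candidate Arc(t); when there are more than N of them, the CAD-based truncation described above reduces the archive back to size N.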

Experimental results and discussion

Parameter setting

To verify the effectiveness of the proposed MPSO/DD algorithm on many-objective optimization problems, seven DTLZ test functions are tested with 3, 5, 8, 10, 15 and 20 objectives, respectively. The obtained results are compared with those of NSGA-III, KnEA, RVEA, MOEA/DD, SPEAR, GrEA and BiGE [16], which are state-of-the-art algorithms for many-objective problems, and also with NMPSO [17], a PSO-based many-objective optimizer. The experiments with these comparison algorithms are run on PlatEMO, proposed by Tian et al. [28]. The parameters of MPSO/DD are given in Table 2, together with the relationship between the number of objectives and the dimension of the decision variables.

Table 3 gives the other parameters used in the experiments that depend on the number of objectives, including the number of objective evaluations and the corresponding size of the reference vector set. The reference vectors are generated uniformly according to the strategy proposed in MOEA/D [31], and the population size equals the number of reference vectors. All other parameters of our method are analyzed and given in Sect. 2.3. The parameters of the comparison algorithms are set as in their original publications.

Performance metrics

To compare the performance of the different algorithms, the inverted generational distance (IGD) [6] is used as the performance indicator. Let \(P^*\) be a set of points uniformly distributed on the true Pareto front in the objective space and let P be the obtained set of non-dominated solutions; then the IGD value is defined as follows:

$$\begin{aligned} IGD(P,P^*)=\frac{\sum _{\mathbf {x}\in {P^*}}\min _{\mathbf {y}\in {P}}dist(\mathbf {x},\mathbf {y})}{|P^*|}, \end{aligned}$$
(10)

where \(dist(\mathbf {x},\mathbf {y})\) is the Euclidean distance between two points \(\mathbf {x}\) and \(\mathbf {y}\). Thus, IGD is the average of the minimum distances from each point in \(P^*\) to P, which measures both the convergence and the diversity of the obtained non-dominated solution set P. In our experiments, 10,000 points are sampled from the true Pareto front to form \(P^*\). The smaller the IGD value is, the better P is.
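Eq. (10) translates directly into a few lines of code. The following is a minimal sketch with our own function name, vectorized over the reference set:

```python
import numpy as np

def igd(P, P_star):
    """Inverted generational distance (Eq. (10)): the average, over the
    reference set P_star sampled from the true Pareto front, of the
    Euclidean distance to the nearest member of the obtained set P.
    P: (n, m) array, P_star: (s, m) array."""
    dists = np.linalg.norm(P_star[:, None, :] - P[None, :, :], axis=2)  # (s, n)
    return dists.min(axis=1).mean()
```

Because every reference point contributes its nearest-neighbor distance, a set P that is well converged but clustered in one region of the front still incurs a large IGD, which is why the indicator captures diversity as well as convergence.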

Table 5 Performance of IGD result on DTLZ1–DTLZ7 test problems, where MPOS/DD is better than \((+)\), worst than \((-)\) and approximate to (\(\approx \)) each of the seven compared algorithms according to the Wilcoxon rank sum test
Fig. 3
figure 3

Parallel coordinates of nondominated fronts obtained by eight algorithms on the five-objective DTLZ1 problem

Fig. 4
figure 4

Parallel coordinates of nondominated fronts obtained by 8 algorithms on the 20-objective DTLZ1 problem

Table 6 The statistical results(mean and standard deviation) of the IGD values obtained by NMPSO and MPSO/DD on DTLZ1 TO DTLZ7

Experimental result

Table 4 gives the statistical IGD results of the proposed MPSO/DD algorithm and the other seven algorithms on the seven DTLZ test functions. The results of the Wilcoxon rank sum test are also given: '\(+\)', '−' and '\(\approx \)' indicate that the results of MPSO/DD are superior, inferior and similar to those of the compared algorithm, respectively. The boldface entries in Table 4 are the best results among all algorithms. All results are obtained over 20 independent runs. From Table 4, we can clearly see that the proposed MPSO/DD obtains better results on the DTLZ2, DTLZ3, DTLZ5 and DTLZ6 problems with high-dimensional objective spaces. Except for DTLZ2 with 20 objectives, MPSO/DD obtains better results on these problems with 10, 15 and 20 objectives, which shows the competitiveness of the proposed method on problems with many objectives. Table 5 summarizes the results given in Table 4; from it we can clearly see that, overall, MPSO/DD obtains better results than NSGA-III, KnEA, RVEA, SPEAR, GrEA and BiGE, and results competitive with MOEA/DD.

To further show the effectiveness of the proposed algorithm, Figs. 3 and 4 plot the parallel coordinates of the non-dominated solution sets obtained by the different algorithms on the 5-objective and 20-objective DTLZ1 test problems, respectively. From Figs. 3 and 4, we can see that the objective values of MOEA/DD and MPSO/DD decline much faster than those of the other algorithms, and that their solutions are better distributed within a limited number of evaluations. Compared with MOEA/DD, the performance of MPSO/DD is comparable on the 5-objective DTLZ1 but not better on the 20-objective DTLZ1; we believe the reason is that the diversity of MPSO/DD is still inferior to that of MOEA/DD on DTLZ1. Therefore, to see whether the proposed MPSO/DD is competitive with many-objective optimization algorithms based on PSO, we compare the results on DTLZ with NMPSO [17], in which a balanceable fitness estimation (BFE) method and a novel velocity update equation were presented to effectively solve many-objective optimization problems. Table 6 shows the mean IGD results of MPSO/DD compared with NMPSO, with the best results highlighted. From Table 6, we can see that MPSO/DD obtains better results than NMPSO on DTLZ1–DTLZ6 with high-dimensional objective spaces, which further shows that the proposed MPSO/DD is competitive on problems with many objectives. However, the results obtained by MPSO/DD on DTLZ7 are not better than those of NMPSO; we believe the reason may be the BFE method in NMPSO, which strongly prefers well-converged and less crowded solutions.

Table 5 also summarizes the results given in Table 6. From Table 5, we can clearly see that the proposed MPSO/DD obtains better results than NMPSO in 26 of 42 cases, which demonstrates its better overall performance.

Conclusion

This paper proposed a modified particle swarm optimization algorithm for many-objective problems, in which a decomposition strategy and different ideal points are utilized. The experimental results show that the proposed algorithm has advantages in solving problems with high-dimensional objective spaces, but much work remains. In the future, we will try to add strategies that prevent the algorithm from falling into local optima, to achieve better results on all problems.