A benchmark test suite for evolutionary many-objective optimization

In the real world, it is not uncommon to face an optimization problem with more than three objectives. Such problems, called many-objective optimization problems (MaOPs), pose great challenges to the area of evolutionary computation. The failure of conventional Pareto-based multi-objective evolutionary algorithms in dealing with MaOPs motivates various new approaches. However, in contrast to the rapid development of algorithm design, performance investigation and comparison of algorithms have received little attention. Several test problem suites originally designed for multi-objective optimization are still dominantly used in many-objective optimization. In this paper, we carefully select (or modify) 15 test problems with diverse properties to construct a benchmark test suite, aiming to promote the research of evolutionary many-objective optimization (EMaO) by suggesting a set of test problems that well represent various real-world scenarios. Also, an open-source software platform with a user-friendly GUI is provided to facilitate experimental execution and data observation.


Introduction
The field of evolutionary multi-objective optimization has developed rapidly over the last two decades, but the design of effective algorithms for addressing problems with more than three objectives (called many-objective optimization problems, MaOPs) remains a great challenge. First, the ineffectiveness of the Pareto dominance relation, which is the most important criterion in multi-objective optimization, results in the underperformance of traditional Pareto-based algorithms. Also, the aggravation of the conflict between convergence and diversity, along with increasing time and space requirements as well as parameter sensitivity, has become a key barrier to the design of effective many-objective optimization algorithms. Furthermore, the infeasibility of directly observing solutions in a high-dimensional objective space can lead to serious difficulties in the performance investigation and comparison of algorithms. All of these suggest a pressing need for new methodologies designed for dealing with MaOPs, as well as new performance metrics and benchmark functions tailored for experimental and comparative studies of evolutionary many-objective optimization (EMaO) algorithms.
In recent years, a number of new algorithms have been proposed for dealing with MaOPs [1], including convergence enhancement based algorithms such as the grid-dominance-based evolutionary algorithm (GrEA) [2], the knee point-driven evolutionary algorithm (KnEA) [3], and the two-archive algorithm (Two_Arch2) [4]; decomposition-based algorithms such as NSGA-III [5], the evolutionary algorithm based on both dominance and decomposition (MOEA/DD) [6], and the reference vector-guided evolutionary algorithm (RVEA) [7]; and performance indicator-based algorithms such as the fast hypervolume-based evolutionary algorithm (HypE) [8]. In spite of the various algorithms proposed for dealing with MaOPs, the literature still lacks a benchmark test suite for evolutionary many-objective optimization.
Benchmark functions play an important role in understanding the strengths and weaknesses of evolutionary algorithms. In many-objective optimization, several scalable continuous benchmark function suites, such as DTLZ [9] and WFG [10], have been commonly used. Recently, researchers have also designed some problem suites specially for many-objective optimization [11][12][13][14][15][16]. However, each of these problem suites only represents one or several aspects of real-world scenarios. A set of benchmark functions with diverse properties for a systematic study of EMaO algorithms is not available in the area. On the other hand, existing benchmark functions typically have a "regular" Pareto front, overemphasize one specific property within a problem suite, or have some properties that appear rarely in real-world problems [17]. For example, the Pareto front of most of the DTLZ and WFG functions is similar to a simplex. This may be preferred by decomposition-based algorithms, which often use a set of uniformly distributed weight vectors in a simplex to guide the search [7,18]. This simplex-like shape of the Pareto front also causes an unusual property: any subset of all objectives of the problem can reach optimality [17,19]. This property can be very problematic in the context of objective reduction, since the Pareto front degenerates into only one point when omitting one objective [19]. Also, among the DTLZ and WFG functions, there is no function having a convex Pareto front; however, a convex Pareto front may bring more difficulty (than a concave one) for decomposition-based algorithms in terms of maintaining solutions' uniformity [20]. In addition, the DTLZ and WFG functions which are used as MaOPs with a degenerate Pareto front (i.e., DTLZ5, DTLZ6 and WFG3) have a non-degenerate part of the Pareto front when the number of objectives is larger than four [10,21,22].
This naturally affects the performance investigation of evolutionary algorithms on degenerate MaOPs.
This paper carefully selects/designs 15 test problems to construct a benchmark test suite for evolutionary many-objective optimization. The 15 benchmark problems have diverse properties covering a good representation of various real-world scenarios, such as being multimodal, disconnected, degenerate, and/or nonseparable, and having an irregular Pareto front shape, a complex Pareto set, or a large number of decision variables (as summarized in Table 1). Our aim is to promote the research of evolutionary many-objective optimization by suggesting a set of benchmark functions that well represent various real-world scenarios. Also, an open-source software platform with a user-friendly GUI is provided to facilitate experimental execution and data observation. In the following, Sect. "Function definitions" details the definitions of the 15 benchmark functions, and Sect. "Experimental setup" presents the experimental setup for benchmark studies, including general settings, performance indicators, and the software platform.

Fig. 2 The Pareto front of MaF2 with three and ten objectives shown by Cartesian coordinates and parallel coordinates, respectively

Function definitions
• D: number of decision variables
• M: number of objectives
. . .
MaF1 (inverted DTLZ1)

where the number of decision variables is D = M + K − 1, and K denotes the size of x_M, namely K = |x_M|, with x_M = (x_M, . . . , x_D). As shown in Fig. 1, this test problem has an inverted PF, while the PS is relatively simple. This test problem is used to assess whether EMaO algorithms are capable of dealing with inverted PFs.
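For readers prototyping the suite outside MATLAB, the structure of this problem can be sketched as follows. This is a minimal Python sketch assuming the standard inverted-DTLZ1 formulation with g(x_M) = Σ (x_i − 0.5)^2; the function name `maf1` and this exact form are our assumptions, and the PlatEMO MATLAB code remains the reference implementation.

```python
def maf1(x, M):
    # Hedged sketch: assumes the standard inverted-DTLZ1 form of MaF1.
    # x has D = M + K - 1 variables; the last K form the distance group x_M.
    xp, xm = x[:M - 1], x[M - 1:]
    g = sum((xi - 0.5) ** 2 for xi in xm)          # distance function g(x_M)
    f = []
    p = 1.0
    for xi in xp:
        p *= xi
    f.append((1.0 - p) * (1.0 + g))                # f_1
    for i in range(2, M):                          # f_2 .. f_{M-1}
        p = 1.0
        for xi in xp[:M - i]:
            p *= xi
        f.append((1.0 - p * (1.0 - xp[M - i])) * (1.0 + g))
    f.append(xp[0] * (1.0 + g))                    # f_M
    return f
```

Under this formulation, the objectives of any Pareto optimal solution (g = 0) sum to M − 1, reflecting the inverted front shape.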

MaF2 (DTLZ2BZ [19])
where the number of decision variables is D = M + K − 1, and K denotes the size of x_M, namely K = |x_M|. This test problem is modified from DTLZ2 to increase the difficulty of convergence. In the original DTLZ2, convergence is very likely to be achieved once g(x_M) = 0 is satisfied; by contrast, for this modified version, all the objectives have to be optimized simultaneously to reach the true PF. Therefore, this test problem is used to assess whether an MOEA is able to perform concurrent convergence on different objectives. Parameter settings are: x ∈ [0, 1]^D and K = 10.

MaF3 (convex DTLZ3 [5])
where the number of decision variables is D = M + K − 1, and K denotes the size of x_M, namely K = |x_M|, with x_M = (x_M, . . . , x_D). As shown in Fig. 3, this test problem has a convex PF, and there are a large number of local fronts. This test problem is mainly used to assess whether EMaO algorithms are capable of dealing with convex PFs. Parameter settings of this test problem are: x ∈ [0, 1]^D and K = 10 (Fig. 4).

MaF4 (inverted badly scaled DTLZ3)
The fitness landscape of this test problem is highly multimodal, containing (3^K − 1) local Pareto-optimal fronts. This test problem is used to assess whether EMaO algorithms are capable of dealing with badly scaled PFs, especially when the fitness landscape is highly multimodal. Parameter settings of this test problem are: x ∈ [0, 1]^D, K = 10 and a = 2.
MaF5 (badly scaled DTLZ4)

where the number of decision variables is D = M + K − 1, and K denotes the size of x_M, namely K = |x_M|, with x_M = (x_M, . . . , x_D). As shown in Fig. 5, this test problem has a badly scaled PF, where each objective function is scaled to a substantially different range. Besides, the PS of this test problem has a highly biased distribution, where the majority of Pareto optimal solutions are crowded in a small subregion. This test problem is used to assess whether EMaO algorithms are capable of dealing with badly scaled PFs/PSs. Parameter settings of this test problem are: x ∈ [0, 1]^D, α = 100 and a = 2.

MaF6 (DTLZ5(I,M) [24])
where the number of decision variables is D = M + K − 1, and K denotes the size of x_M. As shown in Fig. 6, this test problem has a degenerate PF whose dimensionality is defined by the parameter I. In other words, the PF of this test problem is always an I-dimensional manifold regardless of the specific number of decision variables. This test problem is used to assess whether EMaO algorithms are capable of dealing with degenerate PFs. Parameter settings are: x ∈ [0, 1]^D, I = 2 and K = 10.

MaF7 (DTLZ7 [9])
where the number of decision variables is D = M + K − 1, and K denotes the size of x_M, namely K = |x_M|, with x_M = (x_M, . . . , x_D). As shown in Fig. 7, this test problem has a disconnected PF where the number of disconnected segments is 2^(M−1). This test problem is used to assess whether EMaO algorithms are capable of dealing with disconnected PFs.

MaF8 (multi-point distance minimization problem [11,12])
This function considers a two-dimensional decision space. As its name suggests, for any point x = (x_1, x_2), MaF8 calculates the Euclidean distance from x to each of a set of M target points (A_1, A_2, . . . , A_M) of a given polygon. The goal of the problem is to optimize these M distance values simultaneously. It can be formulated as

min f(x) = (d(x, A_1), d(x, A_2), . . . , d(x, A_M)),

where d(x, A_i) denotes the Euclidean distance from point x to point A_i.
One important characteristic of MaF8 is that its Pareto optimal region in the decision space is typically a 2D manifold (regardless of the dimensionality of its objective vectors). This naturally allows a direct observation of the search behavior of EMaO algorithms, e.g., the convergence of their population to the Pareto optimal solutions and the coverage of the population over the optimal region.
In this test suite, a regular polygon is used (to unify with MaF9). The center coordinates of the regular polygon (i.e., the Pareto optimal region) are (0, 0) and the radius of the polygon (i.e., the distance of the vertexes to the center) is 1.0. Parameter settings are: x ∈ [−10,000, 10,000]^2. Figure 8 shows the Pareto optimal regions of the three-objective and ten-objective MaF8.
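The evaluation just described is straightforward to express in code. The snippet below is an illustrative Python sketch, not the PlatEMO implementation; in particular, the angular orientation of the vertexes is our assumption. It builds the unit-radius regular polygon centered at (0, 0) and evaluates the M distance objectives.

```python
import math

def regular_polygon(M, radius=1.0, center=(0.0, 0.0)):
    # Vertexes A_1..A_M of a regular polygon; the starting angle is arbitrary.
    cx, cy = center
    return [(cx + radius * math.cos(2.0 * math.pi * i / M),
             cy + radius * math.sin(2.0 * math.pi * i / M))
            for i in range(M)]

def maf8(x, vertexes):
    # Objective vector: Euclidean distance from x to every target point A_i.
    return [math.hypot(x[0] - ax, x[1] - ay) for ax, ay in vertexes]
```

For example, the polygon center (0, 0) is a Pareto optimal solution whose objective vector is (1, . . . , 1) under these settings.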

MaF9 (multi-line distance minimization problem [25])
This function considers a two-dimensional decision space. For any point x = (x_1, x_2), MaF9 calculates the Euclidean distance from x to each of M target straight lines, each of which passes through an edge of a given regular polygon with M vertexes (A_1, A_2, . . . , A_M), where M ≥ 3. The goal of MaF9 is to optimize these M distance values simultaneously. It can be formulated as

min f(x) = (d(x, A_1A_2), d(x, A_2A_3), . . . , d(x, A_MA_1)),

where A_iA_j is the target line passing through vertexes A_i and A_j of the regular polygon, and d(x, A_iA_j) denotes the Euclidean distance from x to that line.

One key characteristic of MaF9 is that the points in the regular polygon (including the boundaries) and their objective images are similar in the sense of Euclidean geometry [25]. In other words, the ratio of the distance between any two points in the polygon to the distance between their corresponding objective vectors is a constant. This allows a straightforward understanding of the distribution of the objective vector set (e.g., its uniformity and coverage over the Pareto front) via observing the solution set in the two-dimensional decision space. In addition, for MaF9 with an even number of objectives (M = 2k, where k ≥ 2), there exist k pairs of parallel target lines. Any point (outside the regular polygon) residing between a pair of parallel target lines is dominated only by points on a line segment parallel to these two lines. This property can pose a great challenge, in terms of convergence, for EMaO algorithms which use Pareto dominance as the sole selection criterion, typically leading to their populations being trapped between these parallel lines [14]. For MaF9, all points inside the polygon are Pareto optimal solutions. However, these points may not be the sole Pareto optimal solutions of the problem.

Fig. 9 The Pareto front of MaF9 with three and ten objectives shown by Cartesian coordinates and parallel coordinates, respectively
If two target lines intersect outside the regular polygon, there exist some areas whose points are nondominated with respect to the interior points of the polygon. Such areas clearly exist in problems with five or more objectives, in view of the convexity of the considered polygon. However, the geometric similarity holds only for the points inside the regular polygon; Pareto optimal solutions located outside the polygon would break this similarity property. We therefore set some regions of the search space infeasible. Formally, consider an M-objective MaF9 with a regular polygon of vertexes (A_1, A_2, . . . , A_M). For any two target lines A_{i−1}A_i and A_nA_{n+1} (without loss of generality, assuming i < n) that intersect at a point O outside the considered regular polygon, we can construct a polygon (denoted as A_{i−1}A_iA_nA_{n+1}) bounded by a set of 2(n − i) + 2 line segments. We constrain the search space of the problem to lie outside such polygons (not including their boundaries). Now the points inside the regular polygon are the sole Pareto optimal solutions of the problem. In the implementation of the test problem, newly produced individuals located in the constrained areas are simply reproduced within the given search space until they are feasible.
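The repair rule above (re-generate infeasible individuals until they fall in the feasible region) amounts to rejection sampling. A generic Python sketch follows, with a hypothetical `is_feasible` predicate standing in for the polygon-exclusion test; note that in the actual problem offspring would be re-generated by the variation operators rather than drawn uniformly, so uniform re-drawing here is a simplification.

```python
import random

def resample_until_feasible(is_feasible, lo=-10000.0, hi=10000.0, rng=None):
    # Re-draw a 2-D point uniformly in the box search space until it lies
    # outside the constrained areas (is_feasible is supplied by the caller).
    rng = rng or random.Random()
    while True:
        x = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        if is_feasible(x):
            return x
```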
In this test suite, the center coordinates of the regular polygon (i.e., the Pareto optimal region) are (0, 0) and the radius of the polygon (i.e., the distance of the vertexes to the center) is 1.0. Parameter settings are: x ∈ [−10,000, 10,000]^2. Figure 9 shows the Pareto optimal regions of the three-objective and ten-objective MaF9.
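As with MaF8, the MaF9 objectives follow directly from the geometry. The following illustrative Python fragment (our sketch; the edge ordering A_iA_{i+1} is an assumption) computes the distance from a candidate point to each target line using the standard cross-product formula.

```python
import math

def point_line_distance(x, a, b):
    # Distance from point x to the infinite line through points a and b:
    # |cross(b - a, x - a)| / |b - a|.
    num = abs((b[0] - a[0]) * (x[1] - a[1]) - (b[1] - a[1]) * (x[0] - a[0]))
    return num / math.hypot(b[0] - a[0], b[1] - a[1])

def maf9(x, vertexes):
    # One objective per polygon edge: distance to the line through A_i A_{i+1}.
    M = len(vertexes)
    return [point_line_distance(x, vertexes[i], vertexes[(i + 1) % M])
            for i in range(M)]
```

At the polygon center, all M distances equal the apothem of the polygon, consistent with the center being Pareto optimal.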

MaF10 (WFG1 [10])
where the number of decision variables is D = K + L, with K denoting the number of position variables and L denoting the number of distance variables. As shown in Fig. 10, this test problem has a scaled PF containing both convex and concave segments. Besides, a series of transformation functions correlate the decision variables with the objective functions. This test problem is used to assess whether EMaO algorithms are capable of dealing with PFs of complicated mixed geometries. Parameter settings are: x_i ∈ [0, 2i] for i = 1, . . . , D, K = M − 1, and L = 10.

MaF11 (WFG2 [10])
where the number of decision variables is D = K + L, with K denoting the number of position variables and L denoting the number of distance variables. As shown in Fig. 11, this test problem has a scaled disconnected PF. This test problem is used to assess whether EMaO algorithms are capable of dealing with scaled disconnected PFs. Parameter settings are:

MaF12 (WFG9 [10])
Fig. 11 The Pareto front of MaF11 with three and ten objectives shown by Cartesian coordinates and parallel coordinates, respectively

where the number of decision variables is D = K + L, with K denoting the number of position variables and L denoting the number of distance variables. As shown in Fig. 12, this test problem has a scaled concave PF. Although the PF of this test problem is simple, its decision variables are nonseparably reduced, and its fitness landscape is highly multimodal. This test problem is used to assess whether EMaO algorithms are capable of dealing with scaled concave PFs together with complicated fitness landscapes. Parameter settings are:

MaF13 (PF7 [13])
where the number of decision variables is D = 5. As shown in Fig. 13, this test problem has a concave PF; in fact, the PF of this problem is always a unit sphere regardless of the number of objectives. Although this test problem has a simple PF, its decision variables are nonlinearly linked to the first and second decision variables, thus leading to difficulty in convergence. This test problem is used to assess whether EMaO algorithms are capable of dealing with degenerate PFs and complicated variable linkages. Parameter setting is:

MaF14 (LSMOP3 [16])
Fig. 15 The Pareto front of MaF15 with three and ten objectives shown by Cartesian coordinates and parallel coordinates, respectively

where N_k denotes the number of variable subcomponents in each variable group x_i^s with i = 1, . . . , M, and u_i and l_i are the upper and lower boundaries of the ith decision variable in x^s. Although this test problem has a simple linear PF, its fitness landscape is complicated. First, the decision variables are non-uniformly correlated with different objectives; second, the decision variables have mixed separability, i.e., some of them are separable while others are not. This test problem is mainly used to assess whether EMaO algorithms are capable of dealing with complicated fitness landscapes with mixed variable separability, especially in large-scale cases. Parameter settings are: N_k = 2 and D = 20 × M.

MaF15 (inverted LSMOP8 [16])

where N_k denotes the number of variable subcomponents in each variable group x_i^s with i = 1, . . . , M, and u_i and l_i are the upper and lower boundaries of the ith decision variable in x^s. Although this test problem has a simple convex PF, its fitness landscape is complicated. First, the decision variables are non-uniformly correlated with different objectives; second, the decision variables have mixed separability, i.e., some of them are separable while others are not. Different from MaF14, this test problem has nonlinear (instead of linear) variable linkages on the PS, which further increases the difficulty. This test problem is mainly used to assess whether EMaO algorithms are capable of dealing with complicated fitness landscapes with mixed variable separability, especially in large-scale cases. Parameter settings are: N_k = 2 and D = 20 × M (Figs. 14 and 15).

Experimental setup
To conduct benchmark experiments using the proposed test suite, users may follow the experimental setup as given below.

Performance metrics
• Inverted generational distance (IGD): Let P* be a set of uniformly distributed points on the Pareto front, and let P be an approximation to the Pareto front. The inverted generational distance between P* and P is defined as

IGD(P*, P) = (1/|P*|) Σ_{v ∈ P*} d(v, P),

where d(v, P) is the minimum Euclidean distance from point v to set P. The IGD metric is able to measure both the diversity and the convergence of P if |P*| is large enough, and a smaller IGD value indicates better performance. In this test suite, we suggest 10,000 uniformly distributed reference points sampled on the true Pareto front for each test instance.

• Hypervolume (HV): Let y* = (y*_1, . . . , y*_m) be a reference point in the objective space that is dominated by all Pareto optimal solutions, and let P be the approximation to the Pareto front. The HV value of P (with regard to y*) is the volume of the region which is dominated by P and dominates y*; a larger HV value indicates better performance. We use y* = (1, . . . , 1) as the reference point for the normalized objective vectors in the HV calculation.
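As a sanity check on the IGD definition, a direct brute-force Python implementation might look like the following; for 10,000 reference points a spatial index would be faster, but this form mirrors the formula term by term.

```python
import math

def igd(ref_points, approx_set):
    # IGD(P*, P): average, over reference points v in P*, of the minimum
    # Euclidean distance d(v, P) from v to the approximation set P.
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return sum(min(dist(v, p) for p in approx_set)
               for v in ref_points) / len(ref_points)
```

An approximation set that exactly covers P* yields IGD = 0; any shortfall in convergence or coverage increases the value.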

Software platform
All the benchmark functions have been implemented in MATLAB and embedded in a recently developed software platform, PlatEMO. PlatEMO is an open-source MATLAB-based platform for evolutionary multi- and many-objective optimization, which currently includes more than 50 representative algorithms and more than 100 benchmark functions, along with a variety of widely used performance indicators. Moreover, PlatEMO provides a user-friendly graphical user interface (GUI), which enables users to easily perform experimental settings and algorithmic configurations, and to obtain statistical experimental results with one-click operations.
In particular, as shown in Fig. 16, we have tailored a new GUI in PlatEMO for this test suite, such that participants are able to directly obtain tables and figures comprising the statistical experimental results for the test suite. To conduct the experiments, the only thing participants need to do is write the candidate algorithms in MATLAB and embed them into PlatEMO. A detailed introduction to embedding new algorithms can be found in the user manual attached to the source code of PlatEMO [26]. Once a new algorithm is embedded in PlatEMO, the user will be able to select and execute it from the GUI shown in Fig. 16. The statistical results will then be displayed in the figures and tables on the GUI, and the corresponding experimental result (i.e., the final population and its performance indicator values) of each run will be saved to a .mat file.