Machine learning-based framework to cover optimal Pareto-front in many-objective optimization

One of the crucial challenges of solving many-objective optimization problems is uniformly well covering of the Pareto-front (PF). However, many the state-of-the-art optimization algorithms are capable of approximating the shape of many-objective PF by generating a limited number of non-dominated solutions. The exponential increase of the population size is an inefficient strategy that increases the computational complexity of the algorithm dramatically—especially when solving many-objective problems. In this paper, we introduce a machine learning-based framework to cover sparse PF surface which is initially generated by many-objective optimization algorithms; either by classical or meta-heuristic methods. The proposed method, called many-objective reverse mapping (MORM), is based on constructing a learning model on the initial PF set as the training data to reversely map the objective values to corresponding decision variables. Using the trained model, a set of candidate solutions can be generated by a variety of inexpensive generative techniques such as Opposition-based Learning and Latin Hypercube Sampling in both objective and decision spaces. Iteratively generated non-dominated candidate solutions cover the initial PF efficiently with no further need to utilize any optimization algorithm. We validate the proposed framework using a set of well-known many-objective optimization benchmarks and two well-known real-world problems. The coverage of PF is illustrated and numerically compared with the state-of-the-art many-objective algorithms. The statistical tests conducted on comparison measures such as HV, IGD, and the contribution ratio on the built PF reveal that the proposed collaborative framework surpasses the competitors on most of the problems. In addition, MORM covers the PF effectively compared to other methods even with the aid of large population size.


Introduction
Evolutionary algorithms (EAs) have been very powerful and well established approaches to solve many-objective optimization problems (MOOPs). However, when the number of objectives are high, they struggle to find well-covered and well-distributed solutions. Moreover, it has been shown that more than 90% of the randomly generated initial population are non-dominated when the number of objectives are many [19]. Furthermore, the small size of a population is not able to cover the large-scale hyper PF surface and as a result, this causes a sparse PF set. One way to reduce the impact of sparsity in many-objective optimization is to increase the initial population size. However, when the population size is increased, the optimization process required to obtain converged and well-distributed PF is computationally expensive [31]. Moreover, the theoretical results presented by Chen et al. have revealed that large population, depending on a problem characteristics, may not always be useful and even it can degrade the performance of EAs [5]. Furthermore, it would be challenging to tackle the sparsity issue as the dimension of the problem gets large since large sparse regions occur in the PF with the increasing number of dimensions. In addition, since optimization algorithm has difficulties when the PF has sparse regions and consequently decision-maker cannot reach all the points in the region of the interest (ROI) of PF.
Recently, Deb and Srinivasan [13] introduced the idea of innovization in multi-objective optimization to discover the knowledge and patterns hidden in the PF. Although their study focused on the discovery of important design principles related to decision variables and objectives after the optimization process is done. Innovization can be used as part of the optimization process to improve the quality of the Pareto-optimal solutions [12,30]. Subsequent to the introduction of innovization, there were several studies focusing on the idea of incorporating of learning algorithms during the optimization process to gain information that leads to a set of improved solutions [4,8,27].
This study proposes a novel framework to fill sparse regions in many-objective optimization by employing machine learning (ML) after the optimization process is complete. The proposed framework uses a machine learning algorithm to approximate the mapping between the PF solutions and decision variables. This information is then used to generate new candidate solutions without the need for rerunning optimization algorithm. As a result, the sparsity issue of PF can be addressed by the proposed collaborative ML and MOO framework.
There is a growing interest in using the machine learning techniques to enhance the performance of optimization algorithms. Gaspar et al. presented a hybrid multi-objective EA to accelerate the search using artificial neural networks (ANN) [17]. The aim of their study was to reduce the number of fitness evaluation by approximating global fitness functions; hence machine learning is used to make a local search to find better individuals. Adra et al. used the same framework for optimizing the aircraft control system design [1]. Another similar study proposed an algorithm to speed-up the multiobjective genetic algorithm with constraints using ANN [18]. The authors introduced an approximated function of the original objective function using ANN, and used this function during the optimization to reduce the computational time of the evaluations. Kobayashi et al. presents a new mechanism which helps to maintain the diversity of solutions using ANN [23]. A set of NSGA-III solutions are used to train the ANN to find an approximation function as a surrogate model. The study used this approximated function to relocate the population uniformly. The use of machine learning to construct a proxy model for many-objective optimization problems is presented in several studies [3,26,33,47]. A proxy model can be created on simulation results of an experimental problem to be then solved by a multi-objective optimization algorithm and consequently to reduce the number of lab-based expensive experiments. Alternatively, for an expensive fitness function, a proxy model is built to approximate the value of objectives rather than evaluate them directly. Generally speaking, most of the studies in the literature have used the machine learning to improve the evolution process by surrogate model and they are not targeting adding new non-dominated solutions to the PF. Surrogate models construct a function mapping from the decision space to objective space. They estimate the fitness value of candidate solutions. Therefore, they are helpful when no explicit fitness function exists or when the original fitness function is computationally very expensive. On the contrary, the reverse mapping algorithm constructs a mapping model from the objective space to decision space. In [6], a reverse model algorithm, IM-MOEA, is introduced based on the Gaussian process to estimate the objective values of some random samples from objective space. They create Gaussian model for each objective and each cluster of population. Therefore, the computational complexity increases exponentially. In addition, they use descriptive models to estimate the distribution of the points and for accomplishment, they have been forced to consider some assumptions such as the independency among the objectives.
One of the major difficulties of large-scale many-objective optimization is to obtain acceptable solutions with a reasonable size of the population. Therefore, it is necessary to reduce the number of evaluations to reach this goal [16]. This difficulty might be alleviated using a surrogate model with machine learning methods. The idea of this study is highly influenced by the innovization idea and reverse models because of seeking possible ways to gain information from the PF solutions. In this paper, we proposed a reverse mapping from objective space to decision space. To this end, a learning model is trained to map the solutions on the PF, resulted by an optimization algorithm, to decision variables. Using this model, the decision variables can be estimated for the desired values of objectives. Therefore, a cloud of non-dominated solutions can be generated using a variety of techniques while their decision values can be obtained by the trained model. Two categories of generative techniques applied on the decision or objective spaces can create new points with the help of trained reverse mapping model and problem's objectives. As a result, decision-maker can select an acceptable solution from a comprehensive set of PF solutions by considering trade-off among own desired objective values.
The main contributions of the proposed framework to tackle the mentioned problems in the population-based algorithms are 1. A novel reverse model is proposed to map the objective space to decision space.
• Capability of comprehensively covering the Optimal PF with no need to rerun the optimizer. • with low cost, there is the capability of increasing PF solutions to the desired number (i.e., as many as possible). • Enhancement of the uniform distribution of PF solutions which provides the decision-makers with a higher resolution of trade-off solutions with a low sparsity over PF regions. • It can be utilized with any MOO algorithm (i.e., classical or metaheuristic one) which can offer an approximate PF set. • There is a capability of getting a preferred solution(s) (i.e., desired objective values) from the user to find corresponding solution(s). Thus, it can help decisionmakers to selectively fill a sparse Region-of-Interest (RoI) interactively by conducting a well-defined controlled procedure with no need to rerun the optimizer.
2. Instead of random selection of points in objective space, several sampling techniques are proposed for inverse sampling which leads to generating more non-dominated solutions to reduce the level of sparsity on surface of hyper curve of PF. 3. A reference-line based scheme is designed to select a set of well-distributed solutions after finishing of optimization. 4. Strongly supporting scalability on the number of objectives; the proposed model performs very well with many objective cases, because objective values are the inputs to the introduced reverse mapping model; dislike other models which are faced with the difficulty when the number of objectives increases. 5. Systematic experiments conducted to compare the proposed algorithm with the-state-of-the-art MOEAs on ten MOPs are described.
Finally, we want to remark that in most cases, real-life problems have many objectives and even more decision variables. This leads to the higher dimensionality problem in decision and objective spaces. Most of the similar studies try to avoid complexity, however, we take advantage of two types of complexity in our approach; (i) higher dimensions in objective space, (ii) approximation errors in reverse mapping. First, we use the objective function values as inputs of the machine-learning model, therefore having more objectives leads to accurate learning (i.e., in opposite preference of other methods). Second, approximation error acts in our benefit since it helps as a local search in finding new feasible candidate decision variables and solutions.
The remainder of the paper is organized as follows. The next section provides the background review of the study. The technical description of the proposed framework is presented in the subsequent section followed by which the experimental setup and experimental results of the proposed framework on well-known multi-objective benchmark test problems with 3 and 5 objectives are presented. In addition to the benchmarks, the performance of proposed method on two well-known real-word problems is investigated in the penultimate section. Finally, the concluding remarks are provided.

Literature review
Generally speaking, evolutionary-based multi-objective optimization algorithms can be divided into three categories: dominance-based, decomposition-based, and indicatorbased methods [41]. Dominance-based methods attempt to find the solutions that optimize the objective functions using the dominance concept. One of the stat-of-the-art algorithms from this category is NSGA-IIII [10]. According to this method, the candidate solutions in population are selected based on the reference points distributed in the objective space. On the contrary, the decomposition-based methods decompose the whole search space into smaller subproblems and solve all of them simultaneously. Therefore, the convergence rate of the algorithm is significantly improved, which enhances the diversity of the obtained solutions. MOEA/D [48] is one of the well-known algorithms in this category. The indicator-based methods evaluate the fitness of each solution by assessing an indicator (such as hypervolume) to improve the convergence and diversity criteria simultaneously. All traditional multi-objective algorithms belong to aforementioned categories focus on the proposing a selection strategy when attempting to adopt single-objective EAs to solving MOEAs. Whereas designing effective reproduction strategies that explicitly focus on covering PF by generating welldistributed solutions has not been paid attention properly.
Apart from this categorization, the traditional multiobjective optimization algorithms use the reproduction strategies to generate new individuals in original decision space. To this effect, there are a variety of methods which can significantly improve the performance of existing algorithms by designing new generative operators [20,40,46]. In contrary, some MOEAs generate offsprings using probabilistic models to characterize the promising solutions and approximate the Pareto-front instead of utilizing generative operators. In fact, these methods use the knowledge from the approximated Pareto-front to guide the MOEAs. Reverse models belong to these category which construct a model to map the objective space to decision space. This idea provides capability for MOEAs to generate desired candidate solutions in objective space explicitly.
IM-MOEA [6] constructs a model using Gaussian process to map the objective space to the decision space. Consequently, the authors generate candidate solutions in the objective space. However, they require K × M modes where K is the number of population partitions and M is the number of objective. This is thus an obstacle against scalability in term of number of objectives. This algorithm suffers low accuracy and difficulty in dealing with MOPs with irregular PFs because it is not able to generate distribution in sparse regions. In [44], a method is proposed based on the idea of IM-MOEA but the reference vectors are generated adaptively and additionally nonrandom grouping strategy are employed. In [25], a model based on an incremental Gaussian mixture model guides the search procedure. They fed the learning model with all new solutions generated during the evolution to adaptively discover the structure of the Pareto-optimal set. In addition, to conduct the Sensor Ontology Matching (SOM) process to find the mappings among diverse sensor data, an Improved IM-MOEA (I-IM-MOEA)-based matching technique is proposed [44]. An adjusted selection mechanism is employed to tackle the problems on irregular PFs such as reduction in PF solutions. Moreover, a dynamic Reference Vectors is proposed to decrease the computational resources and improve the efficiency of the algorithm. The potential of reverse models for solving dynamic multi-objective optimization problems is investigated in [49]. An inverse-based Gaussian process maps the historical optimal solutions from the objective space to the decision space.
A decomposition-based reverse model algorithm which uses k-means for grouping in the objective space is introduced in [15]. Furthermore, a selection criterion based on decomposition that selects the most appropriate reference vectors is proposed. All reviewed studies attempt to find a distribution on PF which have to make some assumptions such as independency among the objective functions while our proposed method alleviates these drawbacks as it generates a model using the learning process. Moreover, these method are dependent on the shape of PF and are not usually able to generate distribution model on sparse are of PF. Whereas our proposed method can train a model using a limited number of points.

Background review
In this section, the concepts related to the study have been explained. In addition to the definition of many-objective optimization and ANN, the explanation of three techniques which are utilized to generate new points in both decision and objective spaces is provided.

Many-objective optimization
Many-objective optimization targets handling more than three conflicting objectives. Multi-objective optimization algorithms have been greatly expanded to tackle manyobjective problems. The use of EAs has been very promising for solving such problems. The population-based nature of these algorithms results in generating a set of candidate solutions at each run of the algorithm. Collaboration of individuals to make an optimal Pareto-front is the core reason of success of population-based algorithms compared to singlesolution algorithms.

Definition 1 Many-objective optimization [2]
Subject to the following equality and/or inequality constraints.
where M is the number of objectives, d is the number of decision variables (i.e., dimension), and the value of each variable, x x x i , is in interval [L i , U i ] (i.e., box-constraints). f i represents the objective function, which should be minimized or maximized.
Due to the conflicting of objective functions in a multi-or many-objective optimization problems, the definition of the optimality is not as simple as the single-objective case. Therefore, it is required to make a trade-off decision among objective functions. One of the commonly used concepts for comparing candidate solutions in such problems is dominance.
This concept defines the optimality of a solution in a multiobjective space. Candidate solution x x x is better thanx x x if it is not worse thanx x x in any of the objectives and at least it has a better value in one of the objectives. All solutions that are not dominated using any other solution called non-dominated solutions; they create the PF set. Many-objective algorithms attempt to find these solutions by utilizing generating strategies/operators and selection schemes. The non-dominated sorting (NDS) algorithm [11] is one of the popular selection strategies which works based on the dominance concept. It ranks the solutions of the population in different levels of optimality, called Pareto. The algorithm starts with determining all non-dominated solutions in the first rank. To identify the second rank of individuals, the non-dominated vectors are removed from the set to process the remaining candidate solutions in the same way. Non-dominated solutions of this step make the second level of individuals (second Pareto). Thereafter, the second ranked individuals will be removed to identify the third Pareto. This process will continue until the whole individuals are grouped into different levels of Pareto.

Artificial neural network
Artificial Neural Network (ANN) is one of the most powerful tools for pattern recognition and data mining that is inspired by human brain [45]. Nowadays, there has been growing interest in using ANN in a variety of applications for developing a learning model [32,37,50]. The network structures are made up of two components: neurons and weighted connections among neurons. The feed-forward network is a popular category of ANNs with an input layer for feeding input data, an output layer for specifying the output of classification or regression, and one or more hidden layers between input and output layers for the learning process. The learning process in the network is based on finding the best connection weights and thresholds of neurons for hidden and output layer with the goal of achieving minimum error for predicting the output of test data. To train a network, training data is fed to the ANN. The error can be computed using a desired metric such as Mean Square Error (MSE) defined as follows.
where n is the number of samples, y i is the desired Neural Network output, andŷ i is the neural network output. The weight updates are performed using an optimization method such as Gradient Descent. The best values for the weight connections and threshold are calculated to minimize the corresponding error, called loss value. This process continues until the algorithm meets a predefined criterion such as reaching to a specific number of iterations or desired error value.

Opposition-based learning (OBL)
Opposition-based learning (OBL) is a technique to solve the optimization/learning problems effectively and efficiently [34]. Literature shows that OBL augments the diversity of generated solutions for an optimization problem by exploring more regions of the search space in both current and opposite directions. In this study, OBL is used as a parallel approach to produce more and diverse points on the PF cloud. In the following, three types of the opposition-based operators are briefly explained.

Definition 4 Type II opposition For the point
, the min-max opposition computation for Type-II opposition is defined as [36]: Definition 5 Partial opposition In a multi-dimensional space, the partial opposite point y y y po = (y 1 ,y 2 , y 3 , . . . ,y D ) is generated by opposing only a portion of dimensions [29] which can be selected randomly. So, we can define the partial opposite set of points y y y po as follows: y y y po = y y y

Latin hypercube sampling
Latin Hypercube Sampling (LHS) is a stratified sampling technique to generate near-random samples from a probability distribution [38]. To generate N samples from D variables, the range of each variable is divided into N equally probable intervals. From each interval, a random sample is selected to cover a large space of space. Since each sample of each variable can be paired in a random manner with other variables, for given values of N and D, there exist (N !) D−1 possible interval combinations. Two examples of different combinations are illustrated in Fig. 1 representing a LHS in two dimensions with five intervals.

Proposed algorithm: MORM
The main goal of the proposed method is to comprehensively cover the PF using a trained model on a resulted small set of solutions from an optimization process. In manyobjective optimization problems, obtaining well-covered PF is a challenging task especially when the number of objectives increases and PF is a hyper-surface as a result. The main key idea is to train a model on optimal PF set obtained from an optimization process to map the decision variables of the resulted objective values. In this way, generating numerous points in objective space leads to the capability of trained model to estimate the corresponding decision variables. Therefore, a cloud of non-dominated solutions can be obtained with no need to rerun an optimization process. For this purpose, a sparse PF is initially achieved by an arbitrary many-objective optimization algorithm, either by a classical or meta-heuristics method. Considering a learning algorithm, i.e., regression technique, the inputs of the model are the objective values on provided PF and its outputs are corresponding decision variables. The learning process aims to find a mapping model between objectives and variables values to predict the decision vector for a given objective vector. Accordingly, by providing more points around the resulted PF, we are able to estimate the decision values for these objective values. In addition, the constructed model gets improved over iterations, because the training points increase gradually. This approach is called many-objective reverse mapping (MORM) as constructed model conducts a mapping from objective space to decision space in reverse direction of an objective function which is a mapping from the decision to objective space. The trained model is utilized to predict the decision values for those points generated close to resulted PF from optimization task. Afterward, the true objective values for predicted decision variables can be achieved by objective function in a direct mapping. Newly generated pairs of decisions and their corresponding objective values are utilized to improve the trained model. This procedure is the fundamental part of the proposed framework. The process is repeated to generate as much as the required non-dominated solutions. To generate the approximate points in both decision and objective spaces, several techniques including OBL and LHS are employed. The details of main steps of the proposed framework are provided as follows.

Optimization.
At the first step of the framework, a manyobjective optimization algorithm is utilized to generate the initial PF for the given problem. The resulted non-dominated solutions are exploited for two purposes. Firstly, a machine learning model is trained on objective values to predict the decision values of the following generated points. In addition, these points can be input for other auxiliary techniques such as OBL and LHS to generate more points in objective and decision space. Hence, regardless of the type of the optimization algorithm, any method which is able to obtain an initial optimal PF of the problem can be employed. Therefore, the output of this step is a PF (i.e., a set of decision variables (X ) and objective values (Y ) resulted from a conventional many-objective optimization algorithm. where is a set of objective functions.
2. Reverse mapping. In this phase, a machine learning algorithm is used for the surrogate model to conduct the reverse mapping from objective space to decision space. Obviously, the fitness function maps the variables of decision space to their objective values. Conversely, in the reverse direction, a learning model, i.e., regression model, can be trained to map the objectives into decision variables. To this end, the generated PF using an optimization algorithm in the previous step is provided as training data for the utilized learning algorithm. The objective values of non-dominated solutions are inputs of the model and the decision variables form the outputs. After being given a sufficient number of samples, the model becomes capable of predicting decision variables from associated objective values. The estimated decision values of the trained model are evaluated by the fitness function to get the true value of objectives. This step leads to generating a set of points different from the training data because similar to every machine learning algorithm, an approximation model yields the approximation error; this situation results in a cloud of non-dominated solutions. After completing the training process, the generated objective values are entered into the constructed model as input data to get approximate decision variables. The process of mapping between decision and objective spaces is presented in Fig. 2. Therefore, the output of this step is a trained model (M) on the resultant PF from the previous step which maps the objective values (y y y)into decision variables (x x x) as follows.
where β is the a set of model parameters. It is obvious that if we feed the model using the current objective values, Y , a set of new decision variables can be obtained so that β). Correspondingly, the true objective values of X M can be obtained using fitness evaluation, Y M = F(X M ).

Generative techniques.
To produce a cloud of nondominated solutions for an optimization problem, other auxiliary methods can be utilized along with the optimization process. These techniques generate new points close to discovered solutions on PF. Moreover, new points can also be created in the decision space. Once the new points are generated in the objective space, the trained model is employed to map the objective values to decision variables reversely. For this purpose, two techniques are employed including OBL and LHS. The details of operating these techniques are provided in the following.
OBL As mentioned in the background review, there are two types of OBL, namely Type I and Type II which are utilized in the following ways.
Type I OBL can be applied in decision space. In this case, the opposite of decision variables of non-dominated solutions, X opI , are computed to discover more candidate solutions with higher diversity. In a direct mapping, for the newly discovered decision vectors, the objective values are computed using fitness functions so that Y opI = F(X opI ). As a result, a set of candidate solutions are produced using this technique. OBL leads to exploring the search space in both current and opposite directions to enhance the diversity and coverage of the PF. On the other hand, type II OBL and its variant, partial type II OBL, produces a set of new points in objective space, Y opII and Y popII , respectively. These points are in the opposite direction of objective vectors of the resulted PF.
Since these points are generated in objective space, the corresponding decision variables, x opII and x opII , should be estimated as follows: To compute the true objective values of approximate decision vectors, fitness evaluation is utilized and this is in fact the only time we need to call the fitness function so that Y opII = F(X opII ) and Y popII = F(X popII ). By a well-trained mapping model, the new objective vectors are expected to be close to the opposite of the original non-dominated solutions. Consequently, a set of decision and objective vectors are resulted using these techniques. LHS It is another sampling technique to generate a pool of new samples in objective space. Using the aforementioned procedure, a set of objective vectors are sampled from sub-regions of PF intervals, Y LHS . The sample spaces are resulted from splitting the interval of each objective of nondominated solutions. This technique attempts to distribute samples evenly over the sample space. The set of random numbers generated by the LHS method are the appropriate representatives of the real variability rather than traditional random sampling. Hence, it leads to better exploration of PF and generation of new non-dominated points with a higher probability. Similar to other generative methods applied in objective space, the reverse mapping mode takes the resulted objective vector as input to estimate the corresponding decision vectors, X LHS , as follows: Then, the valid fitness values of predicted decision vector are computed using the fitness function so that Y LHS = F(X LHS ). As a result, a set of decision and objective vectors pairs are generated using this technique as well.
4. Pareto-front selection. In this step, non-dominated solutions are selected among a set of all generated points using different methods in previous steps. The set is the union of three subsets including: 1. points on PF from the previous generation, (X , Y ). 2. the estimated points by the trained model, (X M , Y M ); as previously mentioned, after training the model, the approximate decision values of training data are evaluated by the fitness function to compute the true value of objectives. Therefore, in addition to PF from the previous iteration, the approximated points around the PF are appropriate set of candidates to cover the PF similar to a local search which covers the region around the current candidate solutions. 3. the resulted points from the generative techniques such as OBL and LHS as follows.
The non-dominated solutions extracted from this step are considered as a new training set for the machine leaning algorithm to re-train it. Accordingly, the process of training the model and generating a well-distributed PF continues iteratively. At each iteration of the framework, by increasing the non-dominated points, more training data are provided to improve the quality of reverse mapping model. The overall structure of the MORM is illustrated in Fig. 3. As it is presented, the resulted candidate solution from an optimization process is fed into the ANN as training data. Simultaneously, other generative techniques including OBL and LHS are utilized to produce more objective and decision vectors. All generated points in objective space are mapped to decision space using the trained NN to get the approximate corresponding decision vectors. The predicted decision vectors are evaluated by the fitness function to compute the true objective values. Non-dominated solutions of a union set of new points are selected to re-train the NN to get a more accurate model. In fact, re-training ANN with new solutions is for improving the model but we can still stay with the same trained ANN if the training cost is not affordable. In other words, the proposed model follows the active learning however, it could be a passive one. Furthermore, most of the real-world optimization problems are many-objective with expensive objective functions which require a huge population size to generate as many as possible PF candidate solutions. However, the cost of fitness call of such problems is a serious barrier for evolving a large population. Hence, the cost of training ANN even with all non-dominated solutions is less than running a traditional many-objective optimization evolutionary algorithm with a huge population size. In fact, re-training of an ANN with a small set of samples is not time consuming.
By this iterative procedure, at the end of each iteration, a set of significant number of non-dominated solutions are added to PF. Finally, a well-distributed and dense PF can be resulted. The pseudocode of the algorithm is presented in Algorithm 1.

Experimental results and analysis
In this section, we explain the conducted experiments to assess the proposed method in terms of various performance metrics. In addition to comparison with the state-of-theart many-objective optimization algorithms, the PF covering process of the proposed framework is investigated comprehensively.

Benchmark functions and implementation details
The practical application of the MORM is assessed in terms of various evaluation measures on a number of wellknown many-objective optimization benchmarks [7] listed  Table 1. The conducted experiments are for three and five-objectives optimization test sets. The number of decision variables is set according to the setting used in [7]. We excluded Ma F10 because the optimization algorithm could not generate an initial PF close the true PF and consequently the reverse mapping procedure fails to construct a model on the resulted PF.
At the first step, NSGA-III [10] algorithm is applied to create the initial PF as the input of the proposed framework. The size of the population to run the NSGA-III is set to 100; consequently according to reference-line strategy in the NSGA-III, the number of solutions on the resulted PF is 91 for 3-objective problems, and 81 for 5-objective problems. An ANN model with the hidden layer size of 10 is trained using the fitness values of initial PF as the input and the decision variables values as the output of the network. The ratio of size of training, validation, and test sets for training of the ANN is set to 70%, 15%, and 15% of the entire data, respectively. The MSE is used as the performance function of the network. To handle the box-constraint for the generated candidate solutions, the infeasible values are replaced with average values of the corresponding dimension's boundaries.
Two categories of results are presented in this section. Firstly, the resulted solutions of MORM are represented by the visualization and numerical values. Moreover, the MORM is compared with three state-of-the-art many-objective optimization algorithms, namely NSGA-III, MOEA/D [48], and IM-MOEA [6] in terms of three wellknown many-objective evaluation metrics including ratio of union PF, Hypervolume indicator (HV) [43], and inverted generational distance (IGD) [35]. Ratio of union PF computes the amount of contribution of each algorithm on resulted PF by conducting the NDS algorithm on them and then measuring the ratio of each. This measure compares the algorithms in terms of dominance. HV is a very popular indicator which evaluates many-objective algorithms in terms of the distribution of the PF and the closeness to true PF. HV indicator evaluates the diversity and convergence of a many-objective algorithm. It calculates the volume of Mdimensional space that is surrounded by a set of solution points (A) and a reference point r = (r 1 , r 2 , . . . , r M ) where M is the number of objectives of the problem. Therefore, the volume of the two-dimensional space which is surrounded by points on the obtained PF and r , is calculated as HV indicator. A reference point is a point with the worse values than nadir point. The measure is defined in Eq. 13.
where a ∈ A is a point which all candidate solutions are weakly dominated by it. Larger values of HV indicates that  type I X opI = T ypeI _Opposition(X ); Y opI = Y _Evaluation(X opI ); // Generating points using Opposition type II Y opI I = T ypeI I _Opposition(Y ); X opI = T rained_AN N (Y opI I ); Y opI I = Y _Evaluation(X opI I ); // Generating points using Partial Opposition type II Y popI I = Partial_T ypeI I _Opposition(Y ); X popI I = T rained_ AN N (Y popI I ); Y popI I = Y _Evaluation(X popI I ); // Generating points using LHS PF surrounds a wider space and it results in more diverse solutions and also more closer to optimal PF. IGD measures the distance between the resulted PF and true P F * which is calculated as follows: where d q i is the Euclidean distance between a solution from PF* to its nearest individual in the resulted PF and q = 2. A smaller IGD value indicates a lower distance to true PF and consequently better performance. To calculate IGD, roughly 10,000 reference points on the PF of each benchmark function are sampled by Das and Dennis's approach [9]. Finally, because of the stochasticity of the algorithm, the experiments are conducted as 31 independent runs. The Wilcoxon statistical test [14] is applied to investigate the significance of the acquired results. By this way, the winner method is highlighted in each table of numerical results. All simulations are implemented using Matlab R2019a with Microsoft Windows 10 Enterprise 64-bit as the operating system on a PC with a AMD Ryzen Threadripper 1950X 16-Core Processor, 3750 Mhz. Figure 4 illustrates the resulted non-dominated solutions on PF for 3-objective Ma F2 at each iteration of the process. The generated points using each component are presented using different colors. As it can be seen, the exploitation of each technique intensively increases the number of solutions on PF. Each technique discovers a portion of the PF and together they progressively cover the entire PF. Although, the number of solutions that each component can find is different, the effect of generated points in terms of filling the sparse region The PF is gradually covered during a number of iterations. According to results, the generated points by feeding the PF to ANN covers more region of PF rather than other techniques and/or diversity improvement can be independent of their quantity. For MaF2, LHS and ANN can markedly generate more points to fill the surface of the PF. By contrast, opposition type I and type II produce a limited portion of the PF. Note that the presented points are non-dominated candidate solutions which are remained in each iteration. For instance, the initial points have been reduced from the first iteration to the second iteration because other techniques could generate solutions that dominate the initial points. However, the number of non-dominated solutions increases gradually. It is worth mentioning that each generative technique finds a specific portion of the PF according to the way it operates. For instance, ANN generates the points based on what is learned from the initial PF resulted using a state-of-the-art manyobjective algorithm, thus the produced points are mainly around the initial PF, especially in the first iteration. How-ever, generating more points in sparse regions using other techniques in the next iterations provides the ANN with more training data which leads to a well-trained network and welldistributed points. Figure 5 represents the initial and final PF for some sample functions with three objectives resulted by the MORM. The initial PF is the non-dominated solutions obtained by the NSGA-III algorithm which is the input data for ANN to train the network for further process. After employing different generative techniques in four consecutive iterations, the final illustrated PF is achieved. As it is shown, the resulted nondominated solutions could fill the sparse regions of the PF. However, the performance of the method is mainly affected by the initial PF because well-trained ANN yields more accurate candidate solutions for the optimization problem. Therefore, as it is presented, for some cases that the initial For each sample, the initial solutions generated by NSGA-III algorithm (top plots) and the final PF produced by the proposed method (bottom plots) are presented. The generated points by each generative technique is presented in a different color. The PF is accumulatively covered by different techniques during a number of iterations optimization algorithm is not able to find solutions that cover the overall shape of the PF, the ANN cannot be trained efficiently and consequently, MORM may not fill the surface of the PF by the generated candidate solutions. Furthermore, Fig. 5 shows the distribution of non-dominated solutions resulted from different techniques. Each technique is able to find the solutions of a specific region of the PF. However, the ratio of the generated solutions is not identical for different techniques. Figure 6 also illustrates the 3D-Radvis visualization [21] of the initial and final resulted PFs on some sample benchmark functions with five objectives. Similar to three objectives, the final PF is crucially dependent on the initial PF which provides the training data for ANN. During the process, all techniques performing in objective space require Consequently, the performance of ANN has a crucial role on the performance of the method. For instance, on MaF3, since the NSGA-III could not discover non-dominated points in a specific portion of the PF, i.e., the region between f 1 and f 2 , the proposed method was not also able to search the corresponding region effectively and thus as it is shown, the number of discovered solutions is limited on the corresponding region.

Results and discussion
In addition, Table 2 represents the numerical results of the proposed framework after four iterations for both 3-and 5-objective problems. The first column indicates the number of points on initial PF resulted from the NSGA-III, which the MORM starts the reverse mapping process from. Correspondingly, the number of final candidate solutions is given in the second column. Depending on the nature of the optimization problem and the shape of the PF, the number of generated candidate solutions is different. The maximum number of non-dominated solutions for 3-objective problems is 80,188 which is obtained for MaF8. This number for 5-objective benchmarks is 100,924 on MaF2. The MORM could increase the average number of candidate solutions on 3-objective PF from 91 to 19,773 whereas on 5-objective problems, the MORM reached 37,209 candidate solutions.
Each successive column shows the ratio of contribution of each component on the final PF. For instance, ANN has produced 39.48% and 31.69% of points on the resulted 3objective and 5-objective PFs, respectively during the four iterations. As mentioned previously, ANN is trained at the beginning of each iteration using all non-dominated solutions which increase in every step; the ANN will be trained by more training data points and consequently it will be able to predict the decision variables with higher accuracy. On the other hand, the approximated decision values are able to reach new objective values on the PF close to previous points. This is the main reason why the ANN is in the first rank for all functions. In fact, the techniques such as LHS or OBL are able to explore the area around the PF while the ANN is a kind of local search to exploit the region close to the current PF. In other words, it is revealed that the ANN performs well on generating more points whereas other techniques cover more sparse regions. Results supports that OBL-based points lead to diversity and survive after non-dominated sorting to cover the sparse region of the PF. Along with the ratio of each components' contribution, the ratio of the remained points of the initial PF is also reported in the last column which constructs the lowest portion of the final PF. In overall, ANN has the higher contribution on finding the non-dominated solutions. Conversely, Type I Opposition could generate the lowest number of solutions among the employed techniques. The possible cause is that all other techniques explore the objective space to find more non-dominated solutions; therefore, they will achieve high probability of remarkable accomplish-ment. Conversely, type I opposition operates on decision space and correspondingly it may decrease the chance of selecting a non-dominated solution.

Well-distributed PF solutions using reference points
To ensure the diversity in resultant solutions, a predefined set of reference points can be used similar to that is proposed in NSGA-III. The reference points lead to widely distributed solutions because the reference points are widely distributed on the entire normalized hyperplane. NSGA-III uses Das and Dennis's [22] systematic approach to generate reference points on a normalized hyper-plane. The number of reference points (H ) in an M-objective problem is defined by the following equation.
where P is the number of divisions along each objective.
In our experiments, we define a predefined number of reference points to select a set of well-distributed candidate solutions among all generated non-dominated solutions. This technique produced a diverse set of non-dominated solution on the PF a diverse PF which makes the decision making more efficient. Figure 7 illustrates two samples of well-distributed PF with different number of reference points. Each reference point attracts a candidate solution having the minimum perpendicular distance with the corresponding reference line.
As it is presented, increasing reference points cover more sparse region of the PF with adequate diversity. Moreover, among the selected points, there are generated points from different components of proposed methods indicating their contribution in producing diverse solutions. The number of total reference points are considered 500 and 1500 approximately; however, according to Eq. 15, the accurate values are the closest numbers which are 496 and 1485 for M = 3 and 495 and 1365 for M = 5, respectively.

Comparative results
To evaluate the MORM, we also compared it with the state-of-the-art algorithms such as NSGA-III, MOEA/D, and IM-MOEA. It is obvious that the maximum number of nondominated solutions on a PF generated by a population-based many-objective algorithm is equal to the population size. Accordingly, to guarantee a fair comparison, for each benchmark function the competitors run with population size equal to the number of candidate solutions on the resulted PF from MORM. In addition, to run these algorithms, the total number of fitness calls set to that one required for four iterations of MORM plus the needed budget for running the optimization method to obtain the initial PF. Table 3 represents the

3.65
Two first columns represent the initial and final number of resulted non-dominated solutions. Successive columns indicate the portion of the final PF which is generated by the corresponding component. In addition, the portion of initial PF which is remained non-dominated after generating new points, shown in the last column As it is presented, the number of these parameters are different for 3and 5-objective optimization problems. Considering these parameters for many-objective optimizations, the maximum number of iterations for the algorithm to evolve the population is four which is a very small value and consequently insufficient for the algorithm to reach a comparable PF. For instance, the population size for Ma F2 is 9471 while the algorithm can call the fitness function 67,905 times. Therefore, at each iteration the competitor algorithms can iterate only seven times to evolve the population which is a very low value for the number of iterations for such a large population to get evolved sufficiently. According to the aforementioned parameters, NSGA-III, MOEA/D, and IM-MOEA as competitors are applied on the optimization benchmark functions. For all algorithms, we use the PlatEMO framework [39]. The parameter setting in platform is based on the reference papers which are given in Table 4.
The results of all algorithms are presented in Tables 5 and  6. As it can be seen, the algorithms are compared with the MORM in terms of ratio of union PF, HV, and IGD. Union PF is a set of non-dominated solutions from the union of PFs resulted from three methods. The remaining non-dominated solutions from each method in the union PF is reported as a measure. From the Table 5, MORM has a higher contribution in the union PF in all optimization problems except 3-objective MaF1 in which MOEA/D performs better. Since The population size is equal to the resultant number of non-dominated solutions by MORM. NFC represents the number of fitness calls the solutions generated by the NSGA-III and IM-MOEA algorithm were dominated by the solutions form other algorithm for most of the test problems, its contribution is zero. Similarly, the difference between the ratio of contribution of MORM and MOEA/D is remarkably significant. From Table  6, HV and IGD also reveal the superiority of the MORM compared to other algorithms. On average, in terms of HV, the MORM surpasses the NSGA-III, MOEA/D, and on 10, 6, Table 4 Parameter settings of the three algorithms in comparison

NSGA-III
Population size According to Table 3 Number of fitness calls According to Table 3 Number of reference points According to [10] Generative operators Genetic operators

MOEA/D
Population size According to Table 3 Number of fitness calls According to Table 3 Decomposition method Tchebychef

Number of neighbors 10
Generative operators DE operators Number of reference points According to [48] IM-MOEA Population size According to Table 3 Number of fitness calls According to Table 3 Number of reference vectors 10 Model group size 3 The population size and number of fitness calls are set based on what are obtained from proposed method according to Table 3 and 9 out of 12 of 3-objective functions, respectively. Similar HV results are acquired for 5 objectives on which the average value of HV achieved by MORM is 0.86 whereas NSGA-III, MOEA/D, and IM-MOEA could obtain a PF with HV value of 0.7, 0.65, and 0.8, respectively. Considering IGD values, MORM has obtained a PF with less distance to the true PF which yields lower values of IGD compared to competitors. Accordingly, except on 3-objective MaF9 and MaF13, and 5-objective Ma F6, MORM outperforms other algorithms. For some functions, the large values of IGD indicates that resultant PF by competitors has a high distance of true PF and consequently they fail solve the problems. From our results, we can conclude that the MORM is able to generate many non-dominated solutions while state-of-the-art multiobjective algorithms fail to produce a well-distributed PF over a limited number of generations. In addition to the quantity, the generated solutions have the efficiency to increase the HV and to get closer to true PF according to IGD measure. As it is mentioned before, this is due to drawback of the multiobjective EAs, which require a large population size with numerous fitness calls to evolve such population. To fill the Pareto-front with a huge number of points, evolutionary algorithms require a large population size and correspondingly a huge number of fitness calls. The smaller number of fitness calls (in average) is needed compared to required budget for filling Pareto-front (i.e., generating non-dominated solutions with high accuracy) using an evolutionary algorithm. Therefore, all points are certainly required to be evaluated but as table shows this amount of fitness calls is not sufficient for a

Effectiveness on real-world problems
In this section, MORM has been employed on two cases of many-objective real-world problems to investigate its effectiveness.

Vehicle crashworthiness design problem
In automotive industry, there are a variety of important requirement of crashworthiness design to develop highquality and low-cost products. To address these criteria, a multi-objective optimization problem can be defined. In [24], three objectives including the mass of the vehicle, an integration of collision acceleration, and the toe board intrusion are considered as design objectives. These objectives are formulated using the thickness of five reinforced members around the frontal structure as design variables (Eq. 16). The mass of the vehicle ( f 1 ) is minimized for the consideration of lightweight. In addition, the minimum integration of collision acceleration ( f 2 ) reflects the worst scenario of acceleration-induced biomechanical damage of occupants. Finally, to minimize the mechanical injury, the toe board intrusion in the "offset-frontal crash" ( f 3 ) should be minimized: where x i ∈ [1, 3] for i ∈ {1, 2, 3, 4, 5} are the thickness of five reinforced members around the frontal structure which could significantly affect the crash safety. More details can be find in [24]. An initial PF is obtained using NSGA-III algorithm to apply the MORM. After four iterations, the initial PF with 91 points is filled with 6,325 non-dominated solutions. Figure 8 represents the initial and final PFs resulted from applying MORM. As it can be seen, the generative techniques collaborated to result a well-distributed PF.

Rocket injector design problem
Design of injectors is an important element to develop technologies to make the next generation launch systems safer, more affordable, and more reliable. In [42], the design considerations are guided by three design objectives, namely, the minimum temperature on the injector face ( f 1 ), the length of the combustion zone ( f 2 ), and the temperature on the oxidizer post tip ( f 3 ). Shorter combustion lengths account for better performing designs while lower temperatures would indicate a design that had longer life due to decreased thermal strain. To satisfy theses goals, three design variables are selected, namely the angle at which the hydrogen is directed toward the oxidizer (x 1 ), the change in hydrogen flow area from the baseline (x 2 ), the change in oxygen flow area from the baseline (x 3 ), and the oxidizer post tip thickness (x 4 ). Eq. 17 indicates the formulation of three objectives based on the four variables.
where x i ∈ [0, 1] for each i ∈ {1, 2, 3, 4}. An initial PF is obtained using NSGA-III algorithm to apply the MORM. After four iterations, the initial PF with 91 points is filled with 2664 non-dominated solutions. Figure 9 represents the initial and final PFs resulted from applying MORM. As it can be seen, the generative techniques collaborated to result a well-distributed PF.
Furthermore, the comparison between the MORM and other algorithms for both problems is given in Table 7. As it is showed, the HV value of resultant PF using MORM is 0.85 and 0.73 for Vehicle Crashworthiness and Rocket Injector, respectively, while other algorithms could reach to a lower HV value. Thus, the effectiveness of proposed method on real-world problems is significant.

Concluding remarks
Every many-objective optimization algorithm delivers a restricted amount of solutions as the output of the optimization process. However, the well-covered PF is generally desired for a variety of applications. In this paper, we have introduced a kind of collaboration between machine learning and many-objective optimization algorithms to increase  the non-dominated solutions by an inexpensive approach. A many-objective optimization algorithm generates an initial PF which is the input of a learning model. The trained model maps the objective space points to decision space. This framework allows us to generate the many solutions by generative techniques in objective space and then find their decision variables using the trained model. Each component covers some regions of the PF and it consequently leads to a well covered PF. To ensure the diversity of the generated solutions, we used a set of reference points to select a predefined number of candidate solutions among all generated points. Compared to the initial PF solutions obtained by an optimization algorithm, the proposed method, MORM, can produce the resultant covered PF by only a few iterations. The conducted experiments on well-known manyobjective benchmarks have demonstrated the effectiveness of the MORM in terms of several many-objective optimization assessment measures. The comparative results between the proposed scheme and the state-of-the-art many-objective optimization algorithms including NSGA-III, MOEA/D, and IM-MOEA have revealed that the surface of the generated PF by a collaboration of machine learning is covered by a cloud of non-dominated solutions. It has been presented that each generative technique covers a portion of the PF resulting in accumulatively a well-covered PF. However, generating a similar PF using many-objective optimization algorithms require a significantly high budget which is not applicable especially for expensive optimization problems. Generating a low cost PF solutions without overloading of the optimizer seems a very promising direction. This novelty is achieved as a fruitful result of collaboration between machine learning and optimization algorithms, especially when the problem is many-objective and the proposed scheme is fully indepen-dent from the family of optimizer, classical or meta-heuristic. However, since the population-based optimizers obtain a PF using the individuals' collaboration, they probably are to offer a much better input set (i.e., initial PF) to our proposed method whereas this property is lacking in single-solution based methods, where they build PF by hitting it point-bypoint. In addition to benchmarks, MORM is evaluated on two well-known real-world problems . As future works, using the proposed approach in an interactive the optimization framework can improve the efficiency of the method. In addition, using the reference points can be embedded in optimization process to generate non-dominated solutions on desired regions of the PF.

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.