Interactive Evolutionary Multiple Objective Optimization for Group Decision Incorporating Value-based Preference Disaggregation Methods

We present a set of interactive evolutionary multiple objective optimization (MOO) methods, called NEMO-GROUP. All proposed approaches incorporate pairwise comparisons of several decision makers (DMs) into the evolutionary search, though evaluating the suitability of solutions for inclusion in the next population in different ways. The performance of algorithms is quantified with various convergence factors derived from the extensive computational tests on a set of benchmark problems. The best individuals and complete populations of solutions constructed by the proposed approaches are evaluated in terms of both utilitarian and egalitarian group value functions for different numbers of DMs. Our results indicate that more promising directions for optimization can be discovered when exploiting the set of value functions compatible with the DMs’ preferences rather than selecting a single representative value function for each DM or all DMs considered jointly. We demonstrate that NEMO-GROUP is flexible enough to account for the weights assigned to the DMs. We also show that by appropriately adjusting the elicitation interval and starting generation of the elicitation, one could significantly decrease the number of pairwise comparisons the DMs need to perform to construct a satisfactory solution.


Introduction
In Multiple Objective Optimization (MOO), several objectives are optimized simultaneously. As goals to be attained usually represent conflicting viewpoints, it is impossible to find a solution for which all objectives reach their individual optima (Branke et al. 2008). Instead, one can identify a number of Pareto-optimal (non-dominated) solutions which are considered equally desirable in case no additional information is available. A solution is called Pareto-optimal if none of the objective functions can be improved in value without deteriorating some of the other objective values. A possibly infinite set of such solutions forms a Pareto front in the objective space.
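The notion of Pareto optimality can be illustrated with a minimal sketch. It assumes that all objectives are minimized and that solutions are given as tuples of objective values; the function names `dominates` and `pareto_optimal` are hypothetical and used for illustration only.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_optimal(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that are not dominated by any other point in the list."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    sample = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 5)]
    print(pareto_optimal(sample))
```

In this sample, (3, 3) is dominated by (2, 2) and (2, 5) is dominated by (1, 4), so only three points remain.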
The need for efficient and reliable MOO methods is increasing in practically every field of science, engineering, and business. These approaches need to involve two important steps: finding Pareto-optimal solutions and incorporating the preference information of the Decision Maker (DM) to identify the most satisfactory solution(s). Depending on the paradigm used, such information may be introduced before, during, or after the optimization process (Branke et al. 2008). In this perspective, for decades, in MOO one has developed two separate methodological streams: interactive and evolutionary ones.
On the one hand, Interactive Multiple Objective Optimization (IMO) deals with the identification of the most preferred solutions by means of alternating stages of optimization and preference elicitation. IMO methods provide the DM with a sample of candidate solutions, and (s)he is expected to return some crucial evaluation of these solutions. This makes it possible to construct or find a new sample that better fits the DM's preferences. On the other hand, Evolutionary Multiple Objective Optimization (EMO) is focused on efficient generation of the whole Pareto-optimal set or its good approximation by adopting the principles of natural evolution. Working with a population of solutions, EMO algorithms can search for several Pareto-optimal solutions in a single run. The DM is involved only in the post-optimization phase, to identify the most preferred solution(s) among them.
The recent trend in MOO consists in merging the interactive and evolutionary approaches (for a review see, e.g., Branke et al. 2008, 2015). This is achieved by interlacing the preference handling and the evolutionary search. The underlying motivation for constructing such hybrid methods consists in biasing the search towards a sample of all Pareto-optimal solutions which is most relevant from the DM's point of view. In this way, the parts of the Pareto front which are clearly irrelevant for the DM are neglected. Moreover, incorporating the DM's preference information in the search process speeds up the convergence to the more suitable sample of Pareto-optimal solutions and introduces the necessary selection pressure when more objectives are optimized simultaneously (Branke et al. 2015).
When it comes to preference information asked from the DM, it may be related to the objectives, constraints, or solutions (see Coello 2000; Rachmawati and Srinivasan 2006). Indeed, many MOO methods ask the DM to specify weights of the objectives, admissible trade-offs, reference or aspiration points (see Fonseca and Fleming 1993; Said et al. 2010), or desirability functions (Wagner and Trautmann 2010). Such direct elicitation of preference information can impose an excessive cognitive burden on the DM. Hence, indirect preference questions have been proposed for lowering the elicitation effort (Kadziński and Tervonen 2013). In this regard, in some existing MOO studies, the DM is expected to provide a partial ranking for a limited subset of solutions (Deb et al. 2010), to pick the best and the worst solutions from such a subset (Korhonen et al. 1984), or, more generally, to supply some pairwise comparisons (see Branke et al. 2015; Phelps and Köksalan 2003). Such indirect preference information is claimed to be understandable, natural, and advantageous in terms of allowing the DM to see the connection between the provided preferences and the resulting recommendation.
Whenever handling the DM's preference information, the MOO algorithms need to assume a particular preference model. When properly exploited, such a model can be used for guiding the method to the relevant part of the Pareto front. For this purpose, the vast majority of existing MOO approaches employ value functions (Keeney and Raiffa 1993). The scores they provide indicate the quality of solutions from all relevant points of view considered jointly. Nonetheless, value functions used in various methods differ in terms of their complexity and role.

Review of Existing Value-Based Multiple Objective Optimization Algorithms Eliciting Indirect Preference Information
In this subsection, we review some representative MOO algorithms that learn value functions from the user's indirect preference information. Some of them select a single value function compatible with the DM's preference information. In particular, in Phelps and Köksalan (2003) the DM is asked to provide at regular intervals some pairwise comparisons of solutions. These are used to derive the most discriminant additive linear value function, which serves to rank individuals in the evolutionary algorithm. Differently, in Deb et al. (2010) one constructs a polynomial value function based on the DM's (partial) ranking of a small subset of solutions. Such a function is employed to identify the most desirable solutions in the population and to reject a large subset of those which are not relevant for the DM. The same type of preference information is used in Battiti and Passerini (2010), though with a highly non-linear complex function which is used to rank individuals in the same Pareto front. This algorithm has been shown to work well with few interactions with the DM, but it requires her/him to rank larger subsets of solutions than in Deb et al. (2010). Finally, in the NEMO-0 (Necessary preference enhanced Evolutionary Multi-objective Optimization-0) method (Branke et al. 2015), one generalized the approach of Phelps and Köksalan (2003) for inferring an arbitrarily monotonic value function compatible with the DM's pairwise comparisons, and used it in the same way as in Battiti and Passerini (2010). The method has been illustrated to perform well even if the DM's preference information is not representable with a linear model. Instead of using a single value function, some other approaches exploit a set of functions compatible with the DM's indirect preferences. In this regard, Greenwood et al. (1997) consider a set of linear value functions compatible with the DM's ranking of a few solutions. This set is exploited to derive the necessary preference relation confirmed by all compatible functions. Such a relation, reflecting robust consequences of the DM's exemplary statements, is used to replace the non-dominance sorting step of the evolutionary algorithms. NEMO-I (Branke et al. 2015) implements the same idea, though with more general additive value functions. Recently, to reduce the computational complexity of NEMO-I, in NEMO-II (see Branke et al. 2015, 2016) the necessary relation has been replaced with the preference fronts obtained by iterative identification of potentially optimal solutions, i.e., individuals which are ranked first by at least one compatible value function. Moreover, NEMO-II adjusts the complexity of an assumed preference model to the pairwise comparisons provided by the DM, thus starting the iterative process with a simple linear value function and switching to the Choquet integral (Choquet 1954) when the linear model is not expressive enough.
In this paper, we focus on the NEMO methods. They can be perceived as hybrids combining the evolutionary method, called Non-dominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al. 2002), with some interactive ordinal regression approaches (see Jacquet-Lagrèze and Siskos 2001; Corrente et al. 2013). Our interest in NEMO comes from its favorable characteristics in terms of both the preference information and the preference model it employs. Indeed, NEMO requires the DM to compare some pairs of solutions from the current population, which is less demanding than specifying precise values for some preference model parameters, but also less restrictive and more general than ranking a subset of individuals or picking the best or the worst one among them. Moreover, it uses an additive value function as the preference model. Such a function is often an appropriate simplification of the DM's preference structure, it has high explainability due to its small number of inter-criteria preference parameters, and it is more transparent than some non-parametric approaches used in MOO, such as support vector machines (Battiti and Passerini 2010) or artificial neural networks (Todd and Sen 1999).

Motivation and Contribution
NEMO, like other existing interactive EMO methods, has been designed to deal with preferences expressed by a single DM. However, one often faces situations when individuals collectively make a choice. Each of the group members can have her/his own priorities, perception of the decision problem to be tackled, and unique contribution to the outcomes (Matsatsinis et al. 2005). In such a case, the decision is no longer attributed to a single group member, but instead the aim is to find a group consensus. When dealing with multiple DMs in the context of MOO, the main challenge consists in designing the algorithms so that they are able to focus the search on the consensus solutions. The promising initial results in this regard (Kadziński and Tomczyk 2015) encourage us to extend this line of research in terms of both further methodological advancements and experimental analysis.
In this paper, when compared to Kadziński and Tomczyk (2015), we propose a few additional variants of NEMO-GROUP that incorporate indirect preference information of several DMs. All these variants modify NSGA-II so that it promotes the solutions preferred by the group members in the optimization run. They differ, however, in how they evaluate the quality of constructed solutions. In particular, solutions may be ranked either with respect to a single compatible value function or the whole set of such functions. A representative function can be derived individually for each DM or constructed for all group members considered jointly. When working with a set of compatible models, the elitism of solutions can be preserved by referring either to the fronts of potential optimality the solution belongs to, or to the most advantageous result attained by each solution. Finally, the population of solutions can be evolved separately for each DM or in common for the entire group.
Such diversity of approaches allows us to discover the paradigms which are most beneficial in terms of consensus seeking in the interactive evolutionary MOO. The performance of the algorithms is judged in terms of the quality of solutions they construct. We employ both utilitarian and egalitarian value functions, thus, verifying the level of satisfaction of an average group member or the least satisfied DM, respectively. In this regard, we evaluate both the best solution and the most advantageous population obtained throughout the optimization run, as well as the algorithms' convergence toward the part of Pareto front that is satisfactory for the whole group.
The proposed methods are thoroughly tested on a set of benchmark MOO problems with different numbers of objectives. In this way, we illustrate the impact that including additional relevant viewpoints may have on consensus reaching. Moreover, we demonstrate the potential of some variants of NEMO-GROUP for incorporating the DMs' weights, thus, differentiating the importance of group members. Finally, we assess how the performance of proposed methods is influenced by the frequency of eliciting pairwise comparisons from the DMs and the generation in which the preference elicitation starts.
The paper is organized as follows. The next section provides a brief reminder on ordinal regression, basic evolutionary methods, and their interactive counterparts. Section 3 presents different variants of our method, NEMO-GROUP. The results of an extensive experimental study are discussed in Sect. 4. The last section concludes.

Reminder on Ordinal Regression, Evolutionary Optimization Algorithms, and Their Interactive Counterparts
In this section, we recall the basic concepts and methods that are referred to throughout the paper. These include the formulation of the multiple objective optimization problem, ordinal regression, NSGA-II, and the NEMO methods. Some parts of this reminder are derived from our previous work (Kadziński and Tomczyk 2015).

Multiple objective optimization problem
We consider a MOO problem in which a set of solutions $A = \{a, b, \ldots\}$ is evaluated in terms of $m$ conflicting objectives, $G = \{f_1, \ldots, f_m\}$. These objectives are to be minimized while assuming that the considered solutions belong to some non-empty feasible region $S$. The general formulation of such a problem is:
$$\text{minimize } \{f_1(a), f_2(a), \ldots, f_m(a)\} \quad \text{subject to } a \in S.$$
Let $F_j$ denote the value set of objective $f_j$. Following Branke et al. (2015), we assume that $F_j \subseteq \mathbb{R}$, and that the value space on each objective $f_j$ is bounded, such that $F_j = [\alpha_j, \beta_j]$, where $\alpha_j$ and $\beta_j$ are, respectively, the best and the worst objective values. Consequently, $F = F_1 \times F_2 \times \ldots \times F_m$ represents the objective space, and each solution $a \in A$ is associated with an evaluation vector denoted by $f(a) = (f_1(a), f_2(a), \ldots, f_m(a)) \in F$.

Group decision making
We consider a set of DMs, denoted by $D = \{DM_1, \ldots, DM_k, \ldots, DM_s\}$, where $s$ is the number of DMs, aiming to find a subset of the best consensus solutions. We assume two realistic scenarios:
- the relative importance of the DMs is defined with a set of weights $\{W_{DM_1}, \ldots, W_{DM_k}, \ldots, W_{DM_s}\}$ such that $\sum_{k=1}^{s} W_{DM_k} = 1$;
- all DMs play the same role in the committee (thus, $W_{DM_k} = 1/s$ for $k = 1, \ldots, s$).
Group preference model. Each $DM_k \in D$ evaluates solutions with her/his individual "true" value function $U^{TRUE}_k$. The collective preference model combines these evaluations into a comprehensive value that solution $a \in A$ represents to the whole group. We consider two specific group value functions: utilitarian ($U^{UT}_D$) and egalitarian ($U^{EG}_D$). The former reflects the level of satisfaction of an average group member, accounting for the DMs' weights, whereas the latter reflects the level of satisfaction of the least satisfied DM.

Ordinal Regression
Preference information. Each $DM_k \in D$ offers individual preference information, which is a set $B_k$ of pairwise comparisons of some reference solutions. Each comparison provided by $DM_k$ states the strict preference, weak preference, or indifference. These relations are denoted by $a^* \succ_k b^*$, $a^* \succeq_k b^*$, and $a^* \sim_k b^*$, respectively. Let each pairwise comparison from $B_k$ be denoted by $B^t_k$, $t = 1, \ldots, p_k$, where $p_k$ is the number of comparisons contained in $B_k$.
Preference model. To model the preferences provided by the DM and evaluate a set of solutions, we use an additive value function. It is defined on $A$ as follows (Jacquet-Lagrèze and Siskos 2001):
$$U(a) = \sum_{j=1}^{m} u_j(f_j(a)),$$
where the marginal value functions $u_j : F_j \to \mathbb{R}$, $j = 1, \ldots, m$, are subject to monotonicity and normalization constraints. The marginal value functions may also be assumed to be linear for each $a \in A$ and $f_j \in G$.
Ordinal regression. The pairwise comparisons provided by each $DM_k \in D$ form the input data for the ordinal regression (Jacquet-Lagrèze and Siskos 2001) that finds the whole set of value functions $U_k$ able to reconstruct these judgments. The set of linear constraints $E_k$ translates a reference pre-order provided by $DM_k$ to a value function:
$$U(a^*) \geq U(b^*) + \varepsilon \ \text{ if } a^* \succ_k b^*, \qquad U(a^*) \geq U(b^*) \ \text{ if } a^* \succeq_k b^*, \qquad U(a^*) = U(b^*) \ \text{ if } a^* \sim_k b^*,$$
where $\varepsilon$ is an arbitrarily small positive value. Thus, $U_k$ is defined by a set of constraints $E^{U_k} = E^{U} \cup E_k$. The set of value functions $U_D$ compatible with the pairwise comparisons of all DMs is defined with the constraint sets $E^{U_k}$ for all $k = 1, \ldots, s$ considered jointly (let us denote the resulting set by $E^{U_D}$). Note that $U_D$ corresponds to the intersection of sets of compatible value functions for all DMs in $D$.
If $E^{U_k}$ ($E^{U_D}$) is feasible, the set of compatible value functions $U_k$ ($U_D$) is non-empty. Otherwise, the provided preference information is inconsistent with the assumed preference model, which means that there is no value function that would reproduce the pairwise comparisons provided by $DM_k$ (if $U_k = \emptyset$) or all DMs (if $U_D = \emptyset$).
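The compatibility condition can be illustrated for a single candidate value function. The sketch below is not the ordinal regression itself (which requires a linear programming solver); it only checks whether one given linear additive value function reproduces a list of pairwise comparisons. The linear marginal form $w_j (\beta_j - f_j(a)) / (\beta_j - \alpha_j)$, the relation labels, and all function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# (f(a*), f(b*), relation), where relation is "strict", "weak" or "indifferent"
Comparison = Tuple[Sequence[float], Sequence[float], str]


def linear_value(f: Sequence[float], weights: Sequence[float],
                 best: Sequence[float], worst: Sequence[float]) -> float:
    """Additive value with linear marginals for minimized objectives (assumed form)."""
    return sum(w * (wo - x) / (wo - be)
               for x, w, be, wo in zip(f, weights, best, worst))


def is_compatible(weights: Sequence[float], comparisons: List[Comparison],
                  best: Sequence[float], worst: Sequence[float],
                  eps: float = 1e-6) -> bool:
    """Check whether the value function reproduces every pairwise comparison."""
    for fa, fb, relation in comparisons:
        ua = linear_value(fa, weights, best, worst)
        ub = linear_value(fb, weights, best, worst)
        if relation == "strict" and ua < ub + eps:
            return False
        if relation == "weak" and ua < ub:
            return False
        if relation == "indifferent" and abs(ua - ub) > 1e-12:
            return False
    return True


if __name__ == "__main__":
    best, worst = (0.0, 0.0), (1.0, 1.0)
    judgments = [((0.2, 0.8), (0.8, 0.2), "strict")]
    print(is_compatible((0.8, 0.2), judgments, best, worst))
    print(is_compatible((0.2, 0.8), judgments, best, worst))
```

Here the DM strictly prefers a solution that is good on the first objective, so only the weight vector emphasizing that objective is compatible.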

Representative value function for a single DM
The issue of selecting a representative function for a single DM has been discussed in detail in the literature. In this paper, we will use the most discriminant value function $U^{REP}_k$, which is obtained by maximizing $\varepsilon$, subject to $E^{U_k}$. It discriminates comprehensive values of reference solutions related by the preference in the DM's partial ranking. When pairwise comparisons provided by the DM are inconsistent with an assumed preference model, they are removed, starting from the oldest one, until feasibility is restored. This approach has been employed in Branke et al. (2015) and Phelps and Köksalan (2003) to avoid an excessive use of Mixed Integer Linear Programming (MILP) techniques. Then, $U^{REP}_k$ is selected analogously.
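The removal of the oldest pairwise comparisons can be sketched as follows. The feasibility test is passed in as a hypothetical callable `is_feasible`; in the actual method it would correspond to checking feasibility of the constraint set with a linear programming solver, which is not shown here. The toy oracle used in the example is an assumption introduced purely for demonstration.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (preferred solution, non-preferred solution)


def restore_consistency(comparisons: List[Pair],
                        is_feasible: Callable[[List[Pair]], bool]) -> List[Pair]:
    """Drop comparisons, oldest first, until the remaining ones are feasible."""
    kept = list(comparisons)  # comparisons are assumed to be ordered oldest first
    while kept and not is_feasible(kept):
        kept.pop(0)
    return kept


def no_direct_contradiction(pairs: List[Pair]) -> bool:
    """Toy oracle: infeasible if both (a, b) and (b, a) are present."""
    present = set(pairs)
    return not any((b, a) in present for a, b in pairs)


if __name__ == "__main__":
    history = [("a", "b"), ("c", "d"), ("b", "a")]
    print(restore_consistency(history, no_direct_contradiction))
```

Note that this strategy may discard more comparisons than strictly necessary, which is the price paid for avoiding MILP.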

Representative value function for a group of DMs
To select a representative value function $U^{REP-EPS}_D$ for a group of DMs, we discriminate the comprehensive values of reference solutions compared by all DMs, i.e., we maximize $\varepsilon$ subject to the constraints of all DMs considered jointly. When there is no value function compatible with the preference information provided by all DMs, $\varepsilon$ is interpreted as a non-statistical misranking error which indicates the distance between the DMs' preferences and the recommendation which can be obtained for the assumed model (Corrente et al. 2013). Obviously, in case of incompatibility, some pairwise comparisons are not reproduced by $U^{REP-EPS}_D$. Alternatively, to deal with incompatibility of preference information at the group level, we may identify a set of consistent pairwise comparisons of all DMs. For this purpose, we maximize the minimal number of pairwise comparisons of any DM which are consistent, i.e., representable by a single additive value function. It can be achieved by solving a MILP problem formulated for $k = 1, \ldots, s$ and $t = 1, \ldots, |B_k|$. Apart from providing the minimal number of non-contradictory pairwise comparisons of all DMs ($v^*$), the solution of this problem indicates which pairwise comparisons can be reproduced together by an additive value function (they are distinguished with $v^{k,*}_t = 1$). If the numbers thereof are imbalanced across the DMs, we arbitrarily choose the last $v^*$ non-contradictory pairwise comparisons provided by each DM, so that none of them is favored. Then, we determine a representative (most discriminant) value function $U^{REP-MAX}_D$ compatible with the thus selected subset of holistic judgments.
Potential optimality. To verify if solution $a \in A$ is potentially optimal for $DM_k \in D$, we need to check if it is at least as good as all remaining solutions for at least one value function compatible with her/his preferences (Lee et al. 2002). This can be achieved by maximizing $\varepsilon$ such that $U(a) \geq U(b) + \varepsilon$ for $b \in A \setminus \{a\}$, subject to a constraint set which differs from $E^{U_k}$ in that it involves only consistent pairwise comparisons (the feasibility is restored by removing the oldest pairwise comparisons) and assumes the $\varepsilon$ used for representing strict preferences to be equal to an arbitrarily selected small positive value.
If the optimal value $\varepsilon^{ADV}_{1,k}(a)$ is greater than or equal to zero, $a$ is potentially optimal. The set of all such potentially optimal solutions for $DM_k$ is denoted by $PO_{1,k}$. Moreover, $\varepsilon^{ADV}_{1,k}(a)$ indicates the advantage that $a$ has over the remaining solutions for the most advantageous value function. If $\varepsilon^{ADV}_{1,k}(a) < 0$, it reveals the minimal loss of $a$ to the best solution in $A$ for some compatible value function. Different values of $\varepsilon^{ADV}_{1,k}(a)$ are comparable because, being defined on the conjoint interval scale of the marginal value functions (Wakker 1989), they have the meaning of intensity.
The above procedure can be repeated to assign all solutions to their respective levels of potential optimality. That is, the solutions which become potentially optimal once solutions in $PO_{1,k}$ are removed from the explored set are denoted by $PO_{2,k}$ and their maximal advantage over the solutions in $A \setminus PO_{1,k}$ by $\varepsilon^{ADV}_{2,k}(a)$, etc. The level of potential optimality for $a \in A$ and $DM_k \in D$ is denoted by $PO_k(a)$ and the respective advantage by $\varepsilon^{ADV}_{PO_k(a),k}(a)$. Obviously, the lower $PO_k(a)$, the better. Let us call a procedure which sorts the solutions into a set of potential optimality levels $PO_k = \{PO_{1,k}, PO_{2,k}, \ldots\}$ by pot-opt-sort$_k(A)$. From the group decision perspective, this procedure can be generalized to:
- UT-pot-opt-sort($A$) employing the utilitarian level of potential optimality, i.e., $PO_{UT}(a) = \sum_{k=1}^{s} W_{DM_k} \cdot PO_k(a)$, to sort individuals; this procedure ranks the solutions into fronts by iteratively identifying all solutions for which the average level of potential optimality among all DMs is the most advantageous;
- EG-pot-opt-sort($A$) using the egalitarian level of potential optimality, i.e., $PO_{EG}(a) = \max_{k=1,\ldots,s} PO_k(a)$, for ranking individuals; for each $a \in A$, this level is attained for the DMs for whom the level of potential optimality of $a$ is the worst.
Finally, we denote a procedure which imposes a complete order on set $A' \subseteq A$ according to some measure $M$ (e.g., comprehensive value $U^{REP}_k(a)$ of solution $a \in A'$ for the representative value function or the maximal advantage $\varepsilon^{ADV}_{1,k}(a)$ of $a \in A'$ over the remaining solutions) by sort($A'$, $M(a)$). Clearly, the greater $M(a)$ for $a \in A'$, the better its rank.
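The two group-level aggregations can be sketched as follows, assuming that the individual levels $PO_k(a)$ have already been computed for each DM (the linear programs that produce them are not shown). The function names and the dictionary-based data layout are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def po_utilitarian(levels: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted average level of potential optimality across the DMs."""
    return sum(w * level for w, level in zip(weights, levels))


def po_egalitarian(levels: Sequence[int]) -> int:
    """Worst (greatest) level of potential optimality across the DMs."""
    return max(levels)


def group_fronts(po_levels: Dict[str, Sequence[int]],
                 aggregate: Callable[[Sequence[int]], float]) -> List[List[str]]:
    """Group solutions into fronts ordered from the best (lowest) aggregated level."""
    buckets = defaultdict(list)
    for name, levels in po_levels.items():
        # Rounding guards against spurious differences caused by float arithmetic.
        buckets[round(aggregate(levels), 9)].append(name)
    return [sorted(buckets[key]) for key in sorted(buckets)]


if __name__ == "__main__":
    levels = {"a": [1, 1], "b": [1, 3], "c": [2, 2], "d": [3, 1]}
    equal = [0.5, 0.5]
    print(group_fronts(levels, lambda l: po_utilitarian(l, equal)))
    print(group_fronts(levels, po_egalitarian))
```

In this example the utilitarian variant places b, c, and d in the same front, whereas the egalitarian variant ranks c ahead of b and d because c is never worse than level 2 for any DM.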

NSGA-II
The role of genetic algorithms is to estimate meta-heuristically the Pareto fronts in MOO problems. In particular, NSGA-II (Deb et al. 2002) incorporates a fast non-dominated sorting algorithm to identify Pareto-optimal solutions, and a diversity preservation mechanism for maintaining a well-spread Pareto front. It starts with the initialization of a random parent population $P_0$ of size $N$. Then, the offspring $Q_0$ of the same size is created using the usual selection, recombination, and mutation operators. Further, the parents and their offspring are combined to obtain a population $R_t = P_t \cup Q_t$ of size $2N$. This population is sorted into a set of Pareto fronts $F$ using the non-dominated-sort($R_t$) procedure. Thus, $F_1$ is composed of non-dominated solutions, $F_2$ contains solutions dominated only by some solutions from $F_1$, etc.
The new population ($P_{t+1}$) is filled with the best Pareto fronts from $R_t$ (first $F_1$, then $F_2$, etc.), until the size of the next front ($F_l$) is larger than the number of free slots in $P_{t+1}$. To have exactly $N$ members in the new population and to maintain diversity, the front $F_l$ is ordered using the crowding distance (CD) measure. The total crowding distance of a solution is the sum of its distances on the individual objectives, each computed as the absolute normalized difference between the values attained by the two closest neighbors of the solution on that objective. Then, the solutions with the greatest crowding distance are added to $P_{t+1}$. This promotes a uniform spread of the front throughout the various stages of the algorithm. Overall, for a solution to be preferred to another one, it has to belong to a better non-domination front or to have a larger crowding distance in case the two belong to the same front. Algorithm 1 describes a single NSGA-II iteration for the $t$-th generation. Such a process is iterated until a stopping criterion is met.
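The selection step described above can be sketched as follows. This is a minimal illustration assuming minimized objectives; it uses a simple quadratic non-dominated sort rather than the fast bookkeeping variant of NSGA-II, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance for minimized objectives."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive fronts of mutually non-dominated solution indices."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(objs[j], objs[i]) for j in remaining))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(objs: List[Sequence[float]], front: List[int]) -> Dict[int, float]:
    """Sum over objectives of the normalized gap between the two nearest neighbors."""
    dist = {i: 0.0 for i in front}
    for j in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][j])
        low, high = objs[ordered[0]][j], objs[ordered[-1]][j]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")  # keep boundary points
        if high == low:
            continue
        for k in range(1, len(ordered) - 1):
            gap = objs[ordered[k + 1]][j] - objs[ordered[k - 1]][j]
            dist[ordered[k]] += gap / (high - low)
    return dist


def select_next_population(objs: List[Sequence[float]], n: int) -> List[int]:
    """Fill the next population front by front; truncate the last front by CD."""
    chosen: List[int] = []
    for front in non_dominated_sort(objs):
        if len(chosen) == n:
            break
        if len(chosen) + len(front) <= n:
            chosen.extend(front)
        else:
            dist = crowding_distance(objs, front)
            ranked = sorted(front, key=lambda i: dist[i], reverse=True)
            chosen.extend(ranked[: n - len(chosen)])
            break
    return chosen


if __name__ == "__main__":
    objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5), (2.5, 2.5)]
    print(non_dominated_sort(objs))
    print(select_next_population(objs, 3))
```

When only three slots are available, the two boundary points of the first front are always retained, and the remaining slot goes to the interior point with the larger crowding distance.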

NEMO
NEMO (Branke et al. 2015) is an interactive evolutionary hybrid which combines NSGA-II with IMO approaches based on the principle of ordinal regression. The major innovation of NEMO when compared to NSGA-II consists in asking a (single) DM at regular intervals to compare a pair of solutions (note that set A is composed of solutions from the current population). The accumulated preference information is used to constrain the space of compatible value functions.
In what follows, we focus on NEMO-0 and NEMO-II (see Branke et al. 2015, 2016) that will be extended in our proposal. We neglect NEMO-I (Branke et al. 2015), because it needs to solve a prohibitively large number of Linear Programming (LP) problems, which makes it infeasible for dealing with real-world optimization problems.
Like NSGA-II, NEMO-0 uses the Pareto fronts as a primary criterion to rank individuals. Then, the algorithm selects a representative general additive value function $U^{REP}$, and the solutions within each Pareto front $F_i$ are ranked using sort($F_i$, $U^{REP}(a)$) (i.e., the greater $U^{REP}(a)$, the better the solution $a$). Differently, NEMO-II uses the levels of potential optimality and crowding distance as the two criteria to judge individuals in terms of their suitability for inclusion in the next population.

NEMO-GROUP: Interactive Evolutionary Multiple Objective Optimization for Group Decision
In this section, we present a few approaches for interactive evolutionary multiple objective optimization incorporating preference information of several DMs. Each of these approaches extends NEMO, originally designed for dealing with preferences of just a single DM. The scheme of this extension is common for all proposed variants (see Algorithm 2). The procedure begins with selecting individuals from the current population $P_t$ of size $N$ for mating. Then, it generates offspring $Q_t$ using mutation and crossover. The offspring is added to the population, thus forming $R_t = P_t \cup Q_t$ of size $2N$. At regular intervals (defined with an "elicitation interval" (EI)), each $DM_k \in D$ is asked to compare a pair of randomly drawn non-dominated solutions. In a basic scenario, the preference elicitation starts with the evolutionary search. However, the "starting generation for preference elicitation" (SGE) may be delayed to SGE > 0. In any case, the accumulated pairwise comparisons are used to construct a preference model of the DM/DMs.
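The effect of SGE and EI on the elicitation effort can be illustrated with a short sketch. It assumes that generations are indexed from 0, that elicitation takes place in generation SGE and every EI generations afterwards, and that each DM compares exactly one pair per elicitation; the function names are hypothetical.

```python
from typing import List


def elicitation_generations(total_generations: int, sge: int = 0, ei: int = 10) -> List[int]:
    """Generations in which each DM is asked for a pairwise comparison (assumed schedule)."""
    return list(range(sge, total_generations, ei))


def comparisons_per_dm(total_generations: int, sge: int = 0, ei: int = 10) -> int:
    """Number of pairwise comparisons requested from a single DM over the whole run."""
    return len(elicitation_generations(total_generations, sge, ei))


if __name__ == "__main__":
    print(comparisons_per_dm(500, sge=0, ei=10))
    print(comparisons_per_dm(500, sge=100, ei=20))
```

Under these assumptions, delaying the start and widening the interval reduces the number of questions posed to each DM.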

Algorithm 2 A single NEMO-GROUP iteration for constructing the t-th generation
The individuals in $R_t$ are sorted according to the primary-sort($R_t$) procedure into a set $S$ of fronts. Within each front $S_i \in S$, the solutions are ranked with secondary-sort($S_i$, $M(a)$) using a quality measure $M$. These two sorting procedures refer to the performances of solutions in $R_t$ on all considered objectives and/or to the results derived from exploitation of the DMs' preference models. The new population $P_{t+1}$ is constructed analogously as in NSGA-II, i.e., by filling it with the best fronts in $S$ as long as the number of free slots in $P_{t+1}$ is not exceeded, and then adding the best solutions (according to $M$) from the first front which cannot be entirely contained in $P_{t+1}$.
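The construction of the new population can be sketched generically, assuming that the primary sort has already produced an ordered list of fronts and that the quality measure is available as a callable (the greater, the better). The function name `build_next_population` and the string identifiers are hypothetical.

```python
from typing import Callable, List


def build_next_population(fronts: List[List[str]],
                          measure: Callable[[str], float],
                          n: int) -> List[str]:
    """Fill the population with whole fronts; truncate the first overflowing front by M."""
    population: List[str] = []
    for front in fronts:
        free = n - len(population)
        if free <= 0:
            break
        if len(front) <= free:
            population.extend(front)
        else:
            # Secondary sort: keep only the best solutions according to the measure.
            population.extend(sorted(front, key=measure, reverse=True)[:free])
    return population


if __name__ == "__main__":
    fronts = [["a", "b"], ["c", "d", "e"], ["f"]]
    quality = {"a": 0.1, "b": 0.3, "c": 0.2, "d": 0.9, "e": 0.5, "f": 1.0}
    print(build_next_population(fronts, quality.get, 4))
```

Note that solution f is not selected despite its high measure, because the primary sort takes precedence over the secondary one.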
The main variants of NEMO-GROUP evolve a joint population for the whole group of DMs. They differ in terms of the sorting procedures they use for judging the suitability of individuals for inclusion in the next population. These procedures are presented in Table 1 [the ones marked with a star (*) have been initially proposed in Kadziński and Tomczyk (2015)]. In what follows, we explain all approaches briefly.
The first group of approaches learns a single value function representing the preferences of the DM(s). In particular, the representative value functions derived individually for each DM may be used to construct a group preference model as follows:
- NEMO-GROUP-REP-UT determines a representative value function $U^{REP}_k$ for each $DM_k \in D$ based on her/his pairwise comparisons only, and then ranks solutions within Pareto fronts according to their representative utilitarian comprehensive values $U^{REP}_{UT}(a) = \sum_{k=1}^{s} W_{DM_k} \cdot U^{REP}_k(a)$;
- NEMO-GROUP-REP-EG determines $U^{REP}_k$ for each $DM_k \in D$, and ranks subsets of Pareto fronts using representative egalitarian comprehensive values $U^{REP}_{EG}(a) = \min_{k=1,\ldots,s} U^{REP}_k(a)$.
Alternatively, a single representative value function can be constructed for all group members considered jointly. The two approaches implementing this idea differ in how they deal with potential incompatibility of preference information provided by different DMs:
- NEMO-GROUP-REP-EPS determines the representative value function $U^{REP-EPS}_D$ by minimizing the misranking error when dealing with pairwise comparisons of all DMs, and uses the inferred function to rank individuals within the same Pareto fronts;
- NEMO-GROUP-REP-MAX maximizes the minimal number of pairwise comparisons of any $DM_k \in D$ that can be represented together by an additive value function, determines the representative value function $U^{REP-MAX}_D$ compatible with the consistent pairwise comparisons of all DMs (equal number of comparisons provided by each DM), and uses it to rank subsets of Pareto fronts.
Table 1 Sorting procedures used to judge the suitability of solutions from the population R t for inclusion in R t+1 by different variants of NEMO-GROUP
The other group of approaches learns a set of value functions representing the preferences of the DM(s). This set is exploited to compare each solution with all remaining ones in the current population in terms of either the greatest advantage it may attain over the remaining solutions or the fronts of potential optimality they belong to:
- NEMO-GROUP-ADV-UT determines for each $DM_k \in D$ and each solution $a \in R_t$ the advantage $\varepsilon^{ADV}_{1,k}(a)$ this solution has over the remaining solutions for the most advantageous value function compatible with the pairwise comparisons provided by $DM_k$, and then ranks solutions within Pareto fronts according to their utilitarian most beneficial advantages: $\varepsilon^{ADV}_{UT}(a) = \sum_{k=1}^{s} W_{DM_k} \cdot \varepsilon^{ADV}_{1,k}(a)$;
- NEMO-GROUP-ADV-EG determines $\varepsilon^{ADV}_{1,k}(a)$ for each $DM_k \in D$ and each solution $a \in R_t$, and then ranks solutions within the same Pareto fronts according to their egalitarian most beneficial advantages: $\varepsilon^{ADV}_{EG}(a) = \min_{k=1,\ldots,s} \varepsilon^{ADV}_{1,k}(a)$;
- NEMO-GROUP-PO-UT sorts the solutions according to their utilitarian levels of potential optimality $PO_{UT}(a) = \sum_{k=1}^{s} W_{DM_k} \cdot PO_k(a)$, and then orders the individuals within each level according to $\varepsilon^{ADV}_{UT}(a)$;
- NEMO-GROUP-PO-EG sorts the solutions according to their worst levels of potential optimality $PO_{EG}(a) = \max_{k=1,\ldots,s} PO_k(a)$, and then orders the solutions within each level according to their minimal advantage among the DMs for whom the level of potential optimality is the worst.
As a benchmark procedure, we consider NEMO-GROUP-IND, which divides a population into $s$ equal sub-populations, one for each $DM_k \in D$, and evolves them separately using a representative value function $U^{REP}_k(a)$ of each $DM_k \in D$ (as in standard NEMO-0 described in Sect. 2.3). A final population is obtained by putting together the sub-populations of all DMs.
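The utilitarian and egalitarian aggregations used by the REP and ADV variants can be sketched as follows. The per-DM scores (representative values or advantages) are assumed to be already computed; the function names and the data layout are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence


def utilitarian(scores: Sequence[float],
                weights: Optional[Sequence[float]] = None) -> float:
    """Weighted sum of the per-DM scores; equal weights 1/s if none are given."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    return sum(w * v for w, v in zip(weights, scores))


def egalitarian(scores: Sequence[float]) -> float:
    """Score attained for the least satisfied DM."""
    return min(scores)


def rank_front(front: List[str], per_dm_scores: Dict[str, Sequence[float]],
               aggregate: Callable[[Sequence[float]], float]) -> List[str]:
    """Order the solutions of one front from the best to the worst aggregated score."""
    return sorted(front, key=lambda a: aggregate(per_dm_scores[a]), reverse=True)


if __name__ == "__main__":
    scores = {"a": [0.9, 0.2], "b": [0.5, 0.5], "c": [0.7, 0.1]}
    front = ["a", "b", "c"]
    print(rank_front(front, scores, utilitarian))
    print(rank_front(front, scores, egalitarian))
    print(rank_front(front, scores, lambda v: utilitarian(v, [0.2, 0.8])))
```

In this example the equally weighted utilitarian ranking favors a, whereas both the egalitarian ranking and a utilitarian ranking that gives more weight to the second DM favor the balanced solution b.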

Experimental Results
To study the performance of different variants of NEMO-GROUP, we refer to a set of benchmark problems with two (2D) to seven (7D) objectives. We use artificial DMs who apply pre-defined individual value functions to compare pairs of solutions whenever preference elicitation is conducted. Precisely, for DM $k \in D$ we use either a linear or a Chebycheff function parameterized by weights $w^k_j$, $j = 1, \ldots, m$, of the $m$ cost-type objectives. Since both these functions are to be minimized, the whole group aims at minimizing utilitarian $U^{UT}_D(a)$ and/or egalitarian $U^{EG}_D(a)$ group value functions. Obviously, all these individual functions are unknown to NEMO-GROUP, which instead uses an additive value function defined in Sect. 2.1 as an internal preference model.
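A minimal sketch of such an artificial DM is given below. It assumes the textbook forms of the two functions, namely a weighted sum and a weighted maximum of the objective values; the exact definitions used in the experiments may differ (e.g., in the reference point), and the function names are hypothetical.

```python
# Sketch of an artificial DM answering a pairwise comparison.
# Assumption: standard weighted-sum and weighted-Chebycheff forms, both minimized.

def linear_value(f, w):
    """Weighted sum of cost-type objective values f under weights w."""
    return sum(wj * fj for wj, fj in zip(w, f))

def chebycheff_value(f, w):
    """Weighted maximum of cost-type objective values f under weights w."""
    return max(wj * fj for wj, fj in zip(w, f))

def prefers(f_a, f_b, w, value=linear_value):
    """True if the DM with weights w prefers solution a to b (lower is better)."""
    return value(f_a, w) < value(f_b, w)

a = (0.2, 0.9)   # objective vector of solution a (illustrative)
b = (0.8, 0.3)   # objective vector of solution b (illustrative)

dm1_prefers_a = prefers(a, b, (0.7, 0.3))   # DM stressing the first objective
dm3_prefers_a = prefers(a, b, (0.2, 0.8))   # DM stressing the second objective
print(dm1_prefers_a, dm3_prefers_a)
```

The two DMs give opposite answers for the same pair, which illustrates why the group variants need a rule for reconciling conflicting pairwise comparisons.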
In our tests, we use a real-valued representation. We generate offspring by simulated binary crossover with probability of 0.9 and distribution index η c = 5, whereas mating selection is performed by tournament selection. We also apply Gaussian mutation with probability of 1/50 and standard deviation equal to 0.1. The population size is set to 60, and all methods are run for 500 generations.
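The variation operators with the stated parameters can be sketched as below. This is a generic textbook version of simulated binary crossover and Gaussian mutation, not the authors' implementation. It assumes that the mutation probability 1/50 applies per decision variable, that variables lie in [0, 1] and are clipped after mutation, and it omits bound handling in the crossover.

```python
import random

def sbx(p1, p2, eta_c=5.0, p_cross=0.9, rng=random):
    """Simulated binary crossover on two real-valued parents (no bound handling)."""
    if rng.random() > p_cross:
        return list(p1), list(p2)          # parents are copied unchanged
    c1, c2 = [], []
    for x1, x2 in zip(p1, p2):
        u = rng.random()
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta_c + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0))
        c1.append(0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2))
        c2.append(0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2))
    return c1, c2

def gaussian_mutation(x, p_mut=1.0 / 50.0, sigma=0.1, lo=0.0, hi=1.0, rng=random):
    """Perturb each variable with probability p_mut and clip to [lo, hi]."""
    return [
        min(hi, max(lo, xi + rng.gauss(0.0, sigma))) if rng.random() < p_mut else xi
        for xi in x
    ]

rng = random.Random(1)
parent1, parent2 = [0.2, 0.8], [0.6, 0.4]
child1, child2 = sbx(parent1, parent2, rng=rng)
mutant = gaussian_mutation(child1, rng=rng)
```

A low distribution index such as 5 lets children fall relatively far from their parents, which favors exploration.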
Unless explicitly stated otherwise, we assume that all DMs play the same role in the committee and that preference elicitation is performed every 10 generations starting from the beginning of the evolutionary search (i.e., (SGE, EI) = (0, 10)); the numerical results are averaged over 50 independent runs. For clarity of presentation, all these experimental results are provided in "Appendix 1". When presenting them in tabular form, text in bold and italics indicates the best performing algorithm across all 50 optimization runs. Additionally, we indicate in bold those approaches whose difference from the best performer proved to be statistically insignificant according to a Mann-Whitney U test at the 5% significance level.
The results in this section significantly extend the experimental study presented in Kadziński and Tomczyk (2015). The latter provides a view only on the convergence of a limited number of NEMO-GROUP variants (IND, REP-UT, REP-EG, and REP-MAX) in terms of a utilitarian value of the solutions they construct for two benchmark problems, ZDT1-2D and DTLZ2-5D. The experimental evaluation presented in this paper is more extensive in terms of (i) considering additional variants of NEMO-GROUP, (ii) accounting for more diverse benchmark problems, (iii) considering greater numbers of DMs, (iv) employing an egalitarian group value function in addition to a utilitarian one, (v) investigating the impact that assigning different weights to the DMs has on the results, and (vi) quantifying the impact of different parameterizations of a preference elicitation process on the algorithms' convergence.

Illustrative Example
In this subsection, we use a convex version of DTLZ2-2D to illustrate how the proposed approaches bias the search towards the subset of Pareto-optimal solutions that is relevant for the DMs. We assume that the true linear value functions of three DMs are parameterized with the following weights $(w^k_1, w^k_2)$ for $k = 1, 2, 3$: $DM_1$: (0.7, 0.3), $DM_2$: (0.6, 0.4), and $DM_3$: (0.2, 0.8). Intuitively, the greater the weight, the more important it is to minimize the respective objective. Figure 1 shows the results for five variants of NEMO-GROUP after 500 generations. As can be seen, NSGA-II approximates the whole Pareto front, whereas all variants of NEMO-GROUP focus on the solutions preferred by the DMs. Since IND evolves a separate sub-population for each DM, the final population it delivers is composed of three clearly disjoint sets of solutions, each containing the individuals most preferred by a particular DM. For algorithms evolving a joint population for the whole group, the final set of solutions is narrowed to a single small part of the Pareto front which can be seen as the best compromise for all DMs. However, these approaches differ in terms of both the region of the Pareto front to which they converge and the dispersion of the final population. This confirms the flexibility of the interactive evolutionary hybrids in a group decision context. In particular, REP-UT, ADV-UT, and ADV-EG converged to a part of the front which is more favorable for $DM_1$ and $DM_2$, whose preferences are consistent to a great extent. In contrast, the population constructed with REP-EG is located in the central part of the objective space, because the preference information of $DM_3$ significantly affects the convergence of this algorithm. Moreover, the population obtained with REP-UT is more stretched due to the utilitarian aggregation of the DMs' individual representative value functions.
To demonstrate the convergence to the Pareto front over time, for REP-EG and ADV-UT, we additionally depict the populations obtained after 50 and 300 generations (see top-right corner of Fig. 1). These clearly demonstrate the improvement of solutions during the evolutionary search as well as the narrowing of the population caused by incorporating the DMs' preference information into the search.

The Convergence in Terms of a Utilitarian Value of the Solutions
In this subsection, we study the evolution of utilitarian values of the best-of-population and average-in-population solutions in successive generations. These convergence factors make it possible to assess the performance of different variants of NEMO-GROUP from the point of view of a whole group of DMs. On the one hand, the best solution in the returned population may be perceived as a default outcome of the method that is most likely to be accepted by the DMs. On the other hand, the average quality of the individuals contained in the population reveals if the search has been appropriately focused on the group consensus solutions (Branke et al. 2015). The algorithms are compared for the DTLZ2-3D and DTLZ4-3D problems with the number of DMs ranging between 2 and 7. Let us emphasize that DTLZ4 is considered more challenging than DTLZ2. Figures 2 and 3 present the convergence plots for, respectively, the value of the best solution and the average value of all solutions in the population for DTLZ2-3D with three DMs. These plots demonstrate when different approaches start to converge towards the Pareto front, what their convergence speed is when measured in terms of a change of a utilitarian value, and what the value of the solution(s) is at which their performance stabilizes. Nonetheless, to compare the performance of the proposed approaches for various benchmark problems and different numbers of DMs, we focus on the precise measures derived from the convergence plots. First, we refer to the minimal values obtained throughout the 500 generations (see Table 2). When it comes to the best utilitarian solution obtained during the optimization run (i.e., the minimal best-of-population value), the variants of NEMO-GROUP which evolve a joint population for all DMs significantly outperform NSGA-II and NEMO-GROUP-IND which, respectively, approximate an entire Pareto front or evolve a sub-population individually for each DM. Thus, it is beneficial to integrate user preferences and to seek the group consensus solution already during the evolutionary search. The greater the number of DMs, the more evident these conclusions are.
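The two convergence factors and the minimal values derived from them can be computed as in the following sketch. The input is assumed to be a list holding, for each generation, the group (e.g., utilitarian) values of all individuals in the population, with lower values being better; the numbers and function names are purely illustrative.

```python
# Sketch: best-of-population and average-in-population values per generation,
# and their minima over the whole run (lower group values are better).

def convergence_factors(population_values):
    """Return (best-of-population, average-in-population) for one generation."""
    best = min(population_values)
    average = sum(population_values) / len(population_values)
    return best, average

def minimal_over_run(history):
    """Minimal best-of-population and average-in-population values over a run."""
    per_generation = [convergence_factors(values) for values in history]
    min_best = min(best for best, _ in per_generation)
    min_average = min(avg for _, avg in per_generation)
    return min_best, min_average

# Illustrative group values of a 3-individual population over 3 generations.
history = [
    [0.90, 0.80, 0.70],
    [0.60, 0.50, 0.70],
    [0.45, 0.50, 0.55],
]
min_best, min_average = minimal_over_run(history)
print(min_best, min_average)
```

A small gap between the two returned values indicates a well-focused population, while a large gap points to solutions dispersed in the objective space.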
The best performing variants of NEMO-GROUP are ADV-UT, REP-UT, and PO-UT. The slight differences between these methods are statistically insignificant for most considered settings. These approaches aggregate the evaluations of each solution conducted individually for each DM with, respectively, the per-solution most advantageous value function, the representative value function approximating the DM's true preference model, or the front of potential optimality the solution belongs to. Thus, to discover a solution which is on average perceived well by the group members, when judging the individuals in terms of their suitability for inclusion in the next population, one should apply the utilitarian measures. Indeed, REP-UT, ADV-UT, and PO-UT outperform their egalitarian counterparts: REP-EG, ADV-EG, and PO-EG, respectively. Finally, the worst performing variants of NEMO-GROUP in terms of the best-of-population value are REP-EPS and REP-MAX. This indicates that constructing a single joint representative value function for all group members may bias the search towards less relevant regions of the Pareto front. Overall, our results confirm that it is more advantageous to aggregate the potentially conflicting viewpoints of different DMs within the sorting procedures of the evolutionary algorithm rather than when inferring the DMs' preference model.
When comprehensively judging the returned population of solutions being most favorable from the point of view of the whole group (i.e., the best average-in-population value), again REP-UT, ADV-UT, and PO-UT perform the best in view of the majority of considered problem settings. Obviously, for all methods a single best solution is clearly better than a complete best population. For example, the comprehensive utilitarian values of the best solution and population for REP-UT, DTLZ4-3D and 3 DMs are equal to, respectively, 0.3748 and 0.3814. When this difference is relatively small (see, e.g., ADV-EG), the constructed population is well focused, whereas higher differences (see, e.g., REP-EPS) suggest greater dispersion of the constructed solutions in the objective space.
Overall, when comprehensively evaluating the whole population rather than a single solution, the differences in the performance of various methods become more evident. The benefits of integrating the DMs' preference information into the evolutionary search are best confirmed with the average-in-population value of different NEMO-GROUP variants being significantly better than the best-of-population value for NSGA-II. For example, for DTLZ2-3D and 5 DMs the value of the best NSGA-II solution is 0.4502, while the average value of all solutions in the population of ADV-UT is 0.3941.
Intuitively, the comparison of results for different numbers of DMs indicates that the more DMs are involved in the optimization process, the worse are the utilitarian values of the best returned individual and population (i.e., the worse is their average perception by the group members). For example, when varying the number of DMs from 2 to 7 for DTLZ2-3D, the value of the best solution constructed with REP-UT is, respectively, 0.3547, 0.3703, 0.3830, 0.3917, 0.3995, and 0.4054. However, this is mainly due to the worse (numerically higher) value of the optimal utilitarian solution that can be constructed when the preferences of more DMs are accounted for. This is clearly visible in Fig. 4, which compares the values of such optimal solutions and the best individuals constructed with a few variants of NEMO-GROUP for different numbers of DMs for DTLZ2-3D. For example, when 2, 4 or 6 DMs are involved, the utilitarian value of the optimal solution is equal to, respectively, 0.3506, 0.3798, and 0.3971. Clearly, NEMO-GROUP cannot do any better. However, the analysis of Fig. 4 indicates that for the best performing approaches (REP-UT, ADV-UT, and PO-UT) the absolute value difference of the best returned individual from the optimal solution decreases with the increase in the number of DMs. For example, for REP-UT for 2, 4 or 6 DMs, it is equal to, respectively, 0.0040, 0.0035, and 0.0024. This suggests that the proposed algorithms scale up well to greater numbers of DMs.
As the other set of measures derived from the convergence plots, we consider the average group utilitarian values observed throughout 500 generations (see Table 3). They quantify the overall performance of the algorithms from the point of view of either the best solution or a complete population returned after each generation. Since the value to which the algorithms converge highly affects the overall performance, the conclusions about the best and worst performing algorithms are analogous to the case of considering only the best results. The important differences are the following:
- Among the best performers in terms of the minimal value of the returned solution and population, ADV-UT and PO-UT compare favorably with REP-UT. The former methods converge faster to the most preferred region of the Pareto front, deriving their comprehensive superiority from exploiting more favorable search directions in the first 200 generations. Exploitation of the whole set of compatible value functions in ADV-UT and PO-UT implies that they are less sensitive to the accumulation of conflicting preference information provided by the DMs than the methods constructing a single representative value function for each DM individually or all DMs jointly. Such robustness proves to be particularly advantageous when the evolutionary search is still chaotic.
- REP-EG, REP-EPS, and REP-MAX are even less advantageous than in the case of considering the best results, because (i) the selection pressure they introduce between 100 and 300 generations is too weak, and (ii) their convergence curves are more erratic than the others, deteriorating several times in the phase when the performance of other algorithms stabilizes or still slightly improves.
- IND performs poorly in terms of the best-of-population value, because it starts to converge later than other algorithms. With greater numbers of DMs its convergence is worse than that of NSGA-II. This is due to evolving fewer individuals in the sub-population allocated to each DM.

The Convergence in Terms of an Egalitarian Value of the Solutions
In this subsection, we study the performance of NEMO-GROUP in terms of egalitarian values of the best-of-population and average-in-population solutions. Thus, instead of measuring the quality of solutions as the average (or sum) of individual judgments, we compute the quality that each solution represents to the whole group with respect to the least satisfied DM. Otherwise, we use the same experimental setting as in Sect. 4.2. The best egalitarian values of a single solution and an entire population obtained by different algorithms throughout 500 generations are provided in Table 4. Obviously, they are higher (less advantageous) than the respective utilitarian values presented in Table 2.
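The relation between the two group measures can be illustrated with the sketch below. It assumes linear true value functions of the DMs (to be minimized), importance weights summing to one, and an egalitarian group value taken as the worst, i.e., maximal, individual value; these are assumptions made for the illustration, and the names are hypothetical.

```python
# Sketch: utilitarian vs. egalitarian group value of one solution.
# Assumptions: linear cost-type DM functions; egalitarian = value for the
# least satisfied DM, i.e., the maximum of the individual values.

def linear_value(f, w):
    return sum(wj * fj for wj, fj in zip(w, f))

dm_objective_weights = [(0.7, 0.3), (0.6, 0.4), (0.2, 0.8)]   # illustrative DMs
dm_importance = [1 / 3, 1 / 3, 1 / 3]                          # equal roles

def utilitarian_group_value(f):
    return sum(W * linear_value(f, w)
               for W, w in zip(dm_importance, dm_objective_weights))

def egalitarian_group_value(f):
    return max(linear_value(f, w) for w in dm_objective_weights)

a = (0.2, 0.9)                        # objective vector of a solution
ut = utilitarian_group_value(a)       # weighted average of individual values
eg = egalitarian_group_value(a)       # value for the least satisfied DM
print(ut, eg)
```

Because a maximum can never be smaller than a weighted average of the same numbers, the egalitarian value of any solution is at least its utilitarian value under these assumptions.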
For all considered problems and numbers of DMs, REP-EG significantly outperforms the remaining algorithms. Recall that when judging individuals in terms of their suitability for inclusion in the next population, REP-EG accounts for the representative value function of the least satisfied DM. Since this function aims at approximating the DM's true preference model, when constructing the population REP-EG optimizes a quality measure which is consistent with the egalitarian group value function. This characteristic is central to its competitive advantage for the considered three-dimensional benchmark problems.
Interestingly, although they account for the least satisfied DM during the evolutionary search, PO-EG and ADV-EG perform worse than their utilitarian counterparts PO-UT and ADV-UT, respectively. This suggests that evaluating solutions in terms of the worst fronts they belong to, or the least advantage they have over the remaining solutions, is not sufficient for discovering individuals with competitive egalitarian values. Overall, these measures discriminate less between different solutions than their utilitarian counterparts, and, thus, they fail to introduce a sufficiently strong selection pressure into the search. Conversely, with the focus on constructing the solutions which are acceptable for all DMs, REP-UT, ADV-UT, and PO-UT perform reasonably well also in terms of an egalitarian value.

Figure 5 demonstrates the egalitarian values of the optimal solution and the best individuals constructed with different variants of NEMO-GROUP for DTLZ2-3D for the number of DMs ranging from 2 to 7. Analogously to the case of using a utilitarian value function, the more DMs are involved, the worse are the egalitarian values of the best possible solutions as well as the ones returned by the algorithms. Interestingly, whatever the number of DMs, REP-EG is able to construct a solution which is very close to the optimal one. For the remaining approaches, the gap from the optimal solution, in general, slightly increases when more DMs are involved.

The Convergence in View of Different Numbers of Objectives and Types of True DMs' Value Functions
In this subsection, we demonstrate the impact that including additional objectives and considering different types of true DMs' value functions (i.e., linear or Chebycheff ones) has on the convergence of proposed algorithms. To save space, we compare NSGA-II with a representative subset of NEMO-GROUP variants (REP-UT, REP-EG, ADV-UT, ADV-EG, and IND) which use dominance relation as the first sorting procedure. Their performance is verified on DTLZ2 with 3, 5, and 7 objectives in view of utilitarian and egalitarian comprehensive value functions for 3 DMs. We focus on the average best-in-population solution throughout 500 generations.
First, we refer to the values obtained when simulating the DMs with linear value functions (see Table 5). For all considered algorithms, both utilitarian and egalitarian best-of-population values become worse with the increase in the number of considered objectives. Intuitively, the more dimensions are involved, the more challenging it is to construct the solutions which are relevant from the point of view of the whole group.
As the number of objectives increases, it becomes more difficult to identify the complete Pareto front. Since an increasing proportion of all feasible solutions becomes non-dominated, the dominance relation fails to sufficiently discriminate between the solutions within the evolutionary algorithms. As a result, the decrease in performance of NSGA-II is significant, and the constructed solutions are far from the true Pareto front. In contrast, integrating the DMs' preferences into the evolutionary search re-introduces the necessary selection pressure. Indeed, the advantage of the interactive evolutionary hybrids over NSGA-II becomes more evident in higher-dimensional problems.
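The loss of discriminating power of the dominance relation can be illustrated with a small experiment on uniformly random objective vectors. This is a generic demonstration under the assumption of independent, uniformly distributed cost-type objectives; it does not use the benchmark problems of this study.

```python
import random

def dominates(a, b):
    """True if a is no worse than b on all objectives and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_fraction(points):
    """Share of points that are not dominated by any other point."""
    count = sum(1 for p in points if not any(dominates(q, p) for q in points))
    return count / len(points)

def random_points(n, m, rng):
    """n random objective vectors with m objectives drawn uniformly from [0, 1)."""
    return [[rng.random() for _ in range(m)] for _ in range(n)]

rng = random.Random(0)
fractions = {m: nondominated_fraction(random_points(200, m, rng)) for m in (2, 3, 5, 7)}
for m, share in fractions.items():
    print(f"{m} objectives: {share:.2f} of the sample is non-dominated")
```

With two objectives only a few points of the sample are non-dominated, whereas with seven objectives a large share is, so sorting by dominance alone leaves most individuals tied.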
When it comes to the utilitarian best-of-population value, ADV-UT is the best performer whatever the number of objectives. Interestingly, when more objectives are considered ADV-EG is gaining a competitive advantage over the methods which employ the representative value functions. Thus, when the number of objectives increases, it is more beneficial to exploit a set of compatible value functions. This seems to ensure greater robustness when indicating more promising directions for optimization.
The latter conclusion is strengthened by the analysis of egalitarian best-of-population values. As indicated in Sect. 4.3, REP-EG was the best performer for a three-dimensional DTLZ2. While still performing better than REP-UT for all considered numbers of objectives, already with 5 dimensions REP-EG is outperformed by ADV-UT and for DTLZ2-7D its convergence is even worse than that of ADV-EG.
The status of ADV-UT as an overall best performer among the considered variants of NEMO-GROUP is confirmed by the results obtained when simulating the DMs with true Chebycheff functions (see Table 6). ADV-UT outperforms other methods for all considered settings. The relative comparison of the remaining approaches is the same as for the linear value functions. In particular, REP-UT is slightly better than REP-EG for the utilitarian function, whereas for the egalitarian one this order is inverse.
Apparently, the average best-in-population values presented in Table 6 decrease with the increase in the number of objectives. However, this is due to the internal definition of the Chebycheff function. That is, when more dimensions are involved, the maximal value this function can take decreases. In our experiments, for 3, 5, and 7 objectives, these maximal values were equal to, respectively, 0.6278, 0.4538, and 0.3447 for the utilitarian group value function, and 0.7977, 0.5585, and 0.4329 for the egalitarian one. Thus, the ability of algorithms to construct group consensus solutions should be judged while referring to these maximal values. For all considered variants of NEMO-GROUP the value of the best solution they construct becomes closer to the maximal possible value when more objectives are involved. For example, for the utilitarian value function for ADV-UT the respective differences are equal to 0.4046 (=0.6278 − 0.2232) for 3D, 0.2475 (=0.4538 − 0.2063) for 5D, and 0.1676 (=0.3447 − 0.1771) for 7D. Thus, again, our results confirm that with more objectives, it is more challenging to find a group consensus solution.
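The differences quoted above can be reproduced directly from the reported numbers. The sketch below also computes a relative gap (difference divided by the maximal value); this normalization is an illustrative addition and not a measure reported in the study.

```python
# Reported values for the utilitarian group value function (Chebycheff DMs).
max_value = {3: 0.6278, 5: 0.4538, 7: 0.3447}      # maximal possible values
adv_ut_value = {3: 0.2232, 5: 0.2063, 7: 0.1771}   # average best value for ADV-UT

# Absolute distance of ADV-UT from the maximal (worst) value.
gap = {m: round(max_value[m] - adv_ut_value[m], 4) for m in max_value}

# Illustrative normalization: the same distance relative to the maximal value.
relative_gap = {m: gap[m] / max_value[m] for m in max_value}

for m in sorted(gap):
    print(f"{m}D: gap = {gap[m]:.4f}, relative gap = {relative_gap[m]:.3f}")
```

Both the absolute and the relative distance from the maximal value shrink as the number of objectives grows, in line with the interpretation given in the text.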

The Impact of Incorporating the Weights Assigned to the DMs into the Evolutionary Search
In this subsection, we study the potential of two of the proposed approaches, REP-UT and ADV-UT, for incorporating the DMs' weights and, thus, differentiating the importance of group members. Such an investigation makes sense in the context of a utilitarian group value function. Let us first illustrate how REP-UT and ADV-UT are able to bias the search depending on the weights assigned to the DMs. We consider three DMs whose linear value functions are parameterized with the following weights $(w^k_1, w^k_2)$ for $k = 1, 2, 3$: $DM_1$: (0.7, 0.3), $DM_2$: (0.4, 0.6), and $DM_3$: (0.2, 0.8). Moreover, we account for the following three vectors of weights $(W_{DM_1}, W_{DM_2}, W_{DM_3})$ assigned to the DMs: $W^{Set\,I} = (1/3, 1/3, 1/3)$, $W^{Set\,II} = (0.1, 0.45, 0.45)$, and $W^{Set\,III} = (0.7, 0.2, 0.1)$. Thus, in the first scenario all DMs play the same role in the committee, in the second the role of $DM_1$ is largely neglected, and in the third $DM_1$ can be considered a dictator. Figure 6 shows the results of NSGA-II, NEMO-0 for the three DMs, as well as REP-UT and ADV-UT with different sets of DMs' weights on DTLZ2-2D after 500 generations. Whatever the vector of DMs' weights, REP-UT and ADV-UT are able to focus the search on a single small part of the Pareto front reflecting the relative importance of the DMs. Thus, with equal weights of the DMs, the population for $W^{Set\,I}$ is rather central in the objective space. For $W^{Set\,II}$, the returned solutions are situated between the regions that are relevant for $DM_2$ and $DM_3$ (though closer to the part which is most preferred by $DM_2$ due to the marginal impact of the preferences of $DM_1$). Finally, the population returned for $W^{Set\,III}$ is very close to the solutions constructed individually for $DM_1$, whose weight (0.7) is much greater than the weight of the remaining two DMs considered jointly. Note, however, that there are slight differences between REP-UT and ADV-UT with respect to the parts of the Pareto front to which they converge.
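For linear individual functions, the effect of the DMs' weights on the true utilitarian group function can be made explicit: a weighted sum of linear functions is again linear, with objective weights equal to the weighted average of the DMs' objective weights. The sketch below computes these aggregated weights for the three scenarios; it describes only the target group function under this linearity assumption, not the internal model of REP-UT or ADV-UT, and the names are hypothetical.

```python
# Sketch: objective weights of the utilitarian group function when every DM
# uses a linear value function (a weighted sum of linear functions is linear).

dm_objective_weights = [(0.7, 0.3), (0.4, 0.6), (0.2, 0.8)]   # DM 1, DM 2, DM 3
weight_sets = {
    "I": (1 / 3, 1 / 3, 1 / 3),
    "II": (0.1, 0.45, 0.45),
    "III": (0.7, 0.2, 0.1),
}

def aggregated_weights(importance, objective_weights):
    """Weighted average of the DMs' objective weight vectors."""
    m = len(objective_weights[0])
    return tuple(
        sum(W * w[j] for W, w in zip(importance, objective_weights))
        for j in range(m)
    )

group_weights = {name: aggregated_weights(W, dm_objective_weights)
                 for name, W in weight_sets.items()}
for name, w in group_weights.items():
    print(f"Set {name}: ({w[0]:.3f}, {w[1]:.3f})")
```

Under this reading, Set II yields group weights between those of DM 2 and DM 3 but closer to DM 2, while Set III pulls the group weights towards those of DM 1, which matches the regions described for Fig. 6.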
We evaluate the algorithms in terms of the minimal utilitarian best-of-population and average-in-population values they obtain throughout 500 generations (see Table 7). Obviously, when evaluating the returned individuals with a utilitarian group value function, we account for the weights assigned to the DMs. In this perspective, the variants of REP-UT and ADV-UT which incorporate the DMs' weights already during the evolutionary search (called W-REP-UT and W-ADV-UT) attain significantly better results than their counterparts assuming that all DMs play the same role in the committee. This demonstrates both the flexibility of the proposed algorithms and the benefits of accounting for the DMs' weights when judging the individuals in the evolutionary algorithms.
Overall, W-ADV-UT attains the best results for all considered problem settings in terms of both a single solution and a complete population. The advantage of NEMO-GROUP variants incorporating the DMs' weights is more evident for $W^{Set\,III}$, where $DM_1$ can be considered a dictator. Nonetheless, to show that the preferences of the remaining (less important) DMs do influence the search, one can refer to the best-of-population value for IND. Evolving a sub-population for each DM individually, this method is able to discover solutions which are considered reasonably good by the whole group (assuming the prevailing role of $DM_1$). Indeed, for $W^{Set\,III}$, IND performs better than REP-UT and ADV-UT. However, it is still significantly worse than W-REP-UT and W-ADV-UT, which confirms that the impact of $DM_2$ and $DM_3$ in these algorithms is non-negligible. Obviously, when taking into account a complete population, NSGA-II and IND are significantly outperformed by all considered variants of NEMO-GROUP.

The Impact of Elicitation Interval and Starting Generation for Preference Elicitation on the Convergence
In this subsection, we evaluate the performance of REP-UT and ADV-UT for different elicitation intervals (EI) in {10, 20, 30} and starting generations for preference elicitation (SGE) in {0, 100, 200, 300}. In Table 8, we provide mean utilitarian value differences of the best-in-population solutions constructed by these methods with respect to the true utilitarian solution for DTLZ2-3D discovered with LP techniques. The results indicate that NEMO-GROUP is not always able to find the true utilitarian solution, but the observed value differences are very small. As a general rule, the more often the DMs are questioned, the more advantageous is the best solution (with the exception of REP-UT for SGE = 100 or 200). Moreover, delaying the elicitation by 100 or 200 generations does not significantly influence the best-in-population value. In fact, for both algorithms, our results indicate that better solutions can be found when delaying the preference elicitation process to SGE = 100. Nevertheless, with SGE = 300 and too infrequent questioning of the DMs, the convergence of the algorithms is affected negatively.
For all EIs > 0 and SGEs > 0, the convergence of algorithms over time to the most preferred region is affected significantly. To support this claim, in Fig. 7, we provide the convergence plots for REP-UT concerning the value difference between the true utilitarian solution and the best-in-population solution, for different EIs and SGEs.
Firstly, the lower the EI, the faster the convergence, i.e., the sooner the best solution becomes competitive with the true utilitarian solution. However, the plots indicate that in the initial generations when the evolutionary search is still chaotic, it is more beneficial to question the DMs rarely (see the main part of Fig. 7). Only once the population is well distributed (for our problem, this happens after 100 generations), the preference elicitation should be performed more often. Secondly, when the preference elicitation is delayed (SGE > 0), one could observe a visible difference in the convergence before and after the SGE (see the top-right corner of Fig. 7).
To compare the convergence of algorithms for different EIs and SGEs, Table 9 indicates the number of generations after which the average best-in-population value reaches two arbitrarily selected value distances, 0.015 and 0.01, from the true utilitarian solution. For example, when the preference elicitation is performed every 10 generations with SGE = 0, the convergence indicator reaches 0.015 in fewer than 175 generations for both REP-UT and ADV-UT, whereas with EI = 30 or SGE = 200, this happens only after 300 generations. This confirms the benefit that preference information can have in speeding up the optimization.
However, when preference elicitation is delayed (SGE > 0), one may question the DMs less often without deteriorating the algorithms' convergence (see, e.g., the value difference of 0.01 for REP-UT with SGE = 200 and EI = 10 or 20 or for ADV-UT with SGE = 100 and EI = 20 or 30). These results confirm that in some scenarios, a satisfactory solution can be discovered faster even if the preference elicitation is started later. Nonetheless, with high SGEs the elicitation cannot be performed too rarely, as the algorithms then fail to discover a competitive solution.
In addition to considering the number of generations for which an average convergence indicator reaches the relative difference of 0.01 or 0.015, we analyzed the number of pairs that would have been compared by each DM until this stage. These numbers are provided in Table 9. They indicate that the cognitive effort of the DMs may be reduced when suitably adjusting EI and SGE. Most importantly, the questioning of the DMs should be delayed until the population is well distributed. For example, with (SGE, EI) = (200, 30) the number of pairs to be compared by the DMs can be reduced about 3 or 4 times for, respectively, REP-UT and ADV-UT when referring to the basic scenario with (SGE, EI) = (0, 10). Nonetheless, delaying the preference elicitation too much or performing it too rarely may not reduce the number of pairs that need to be compared by the DMs, while deteriorating the convergence (see, e.g., ADV-UT for (SGE, EI) = (200, 10) and (300, 10) or (0, 20) and (0, 30)).
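How EI and SGE translate into the DMs' effort can be sketched as follows. The sketch assumes that elicitation takes place at generations SGE, SGE + EI, SGE + 2·EI, and so on, and that each DM compares a fixed number of pairs (one by default) at every elicitation step; the generation counts in the example are hypothetical and are not taken from Table 9.

```python
# Sketch: number of pairwise comparisons a DM performs until a target
# generation, under the assumed elicitation schedule SGE, SGE + EI, ...

def elicitation_generations(sge, ei, reached_at):
    """Generations at which the DMs are questioned up to `reached_at`."""
    return list(range(sge, reached_at + 1, ei))

def comparisons_per_dm(sge, ei, reached_at, pairs_per_step=1):
    """Pairs compared by each DM until the generation `reached_at`."""
    return pairs_per_step * len(elicitation_generations(sge, ei, reached_at))

# Hypothetical generation counts at which a target distance is reached.
early_dense = comparisons_per_dm(sge=0, ei=10, reached_at=170)
late_sparse = comparisons_per_dm(sge=200, ei=30, reached_at=320)
print(early_dense, late_sparse)
```

Under these assumptions, a later start combined with a longer interval can require several times fewer comparisons even when the target is reached in a later generation.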

Conclusions and Future Research
In this paper, we presented a set of interactive evolutionary methods for multiple objective optimization with multiple DMs. The proposed approaches incorporate the DMs' pairwise comparisons of solutions into the evolutionary search and use a preference model in the form of an additive value function. The experimental results confirm that different variants of NEMO-GROUP are able to focus the search on the group-preferred solutions, thus neglecting the individuals which are clearly irrelevant for the DMs. Moreover, we showed that incorporating the preferences of multiple decision makers in the evolutionary optimization speeds up the convergence to the Pareto front and introduces the selection pressure which is necessary for dealing with higher-dimensional problems. Nevertheless, the proposed variants differ in terms of both the part of the Pareto front to which they converge and the convergence speed.
Overall, our results indicate that to find the best group-consensus solutions the interactive evolutionary hybrids should:
- evolve a joint population for all DMs rather than a sub-population of solutions for each DM individually;
- analyze the preference information provided by each DM separately rather than infer the preference model compatible with preferences of all DMs;
- account for the average rather than the worst judgment concerning the attractiveness of solutions in view of different DMs; this allows better differentiation of the solutions in terms of their suitability for inclusion in the next population;
- consider the whole set of value functions compatible with the DMs' pairwise comparisons, which ensures greater robustness when identifying the most promising directions for optimization; only when considering few objectives might it be beneficial to select a single representative value function; the latter is particularly useful when evaluating the solutions with an egalitarian group value function.
Additionally, when taking into account the computational cost, one should consider the most advantageous result attained by each solution rather than the fronts of potential optimality the solutions belong to.
While referring to a set of benchmark problems, we confirmed that when more objectives and more DMs are involved, it is more challenging to focus the search on the group-consensus solutions. We also demonstrated that the proposed algorithms are flexible enough to accommodate the weights assigned to the DMs. Doing so makes it possible to construct solutions which are more relevant for a group whose members play different roles.
Our tests showed that the interactive methods require fewer preference statements if an initial optimization phase is performed before any preferences are elicited. A good balance between fast convergence and a small number of required pairwise comparisons could be obtained by starting the preference elicitation after 100-200 generations and using an elicitation interval of 30 generations.

We envisage the following directions of future research concerning the interactive evolutionary approaches for group decision:
- considering other preference models than an additive value function (e.g., an achievement scalarizing function);
- implementing the idea of adaptive preference modeling that adjusts the complexity of the model to the provided preference information in the course of evolutionary search in the spirit of Branke et al. (2016);
- preserving elitism in the evolutionary search using measures derived from Stochastic Ordinal Regression (see Tervonen 2013 and Michalski 2016), which is based on Monte Carlo simulation rather than LP techniques;
- adapting the proposed approaches to discrete (combinatorial) MOO.

Table 2
Minimal (best) utilitarian value throughout 500 generations for DTLZ2-3D and DTLZ4-3D with different numbers of DMs
The 'W' prefix indicates the variants of REP-UT and ADV-UT which incorporate the DMs' weights already during the evolutionary search
SD standard deviation