A novel graph-theoretical clustering approach to find a reduced set with extreme solutions of Pareto optimal solutions for multi-objective optimization problems

Multi-objective optimization problems and their solution algorithms are of great importance as single-objective optimization problems are not usually a true representation of many real-world problems. In general, multi-objective optimization problems result in a large set of Pareto optimal solutions. Each solution in this set is optimal with some trade-offs. Therefore, it is difficult for the decision-maker to select a solution, especially in the absence of subjective or judgmental information. Moreover, an analysis of all the solutions is computationally expensive and, hence, not practical. Thus, researchers have proposed several techniques such as clustering and ranking of Pareto optimal solutions to reduce the number of solutions. The ranking methods are often used to obtain a single solution, which is not a good representation of the entire Pareto set. This paper deviates from the common approach and proposes a novel graph-theoretical clustering method. The quality of the clustering based on the Silhouette score is used to determine the number of clusters. The connectivity in the objective space is used to find representative solutions for clusters. One step forward, we identify ‘extreme solutions’. Hence, the reduced set contains both extreme solutions and representative solutions. We demonstrate the performance of the proposed method by using different 3D and 8D benchmark Pareto fronts as well as Pareto fronts from a case study in Royal Australian Navy. Results revealed that the reduced set obtained from the proposed method outperforms that from the K-means clustering, which is the most popular traditional clustering approach in Pareto pruning.


Introduction
Most real-world problems usually have multiple and conflicting objectives. Therefore, multi-objective optimization methods are of great importance. Multi-objective optimization methods provide a single solution, a few solutions, or Pareto optimal solutions (Pareto set) depending on the solution approach. However, Pareto optimal solutions give a complete picture to the decision-maker. Therefore, this research focuses on the methods, which produce Pareto optimal solutions.
A multi-objective decision-making process can be divided into three sub-steps, starting from the selection of a solution approach to the final decision-making. These steps include generation of the Pareto optimal solutions based on the utilized optimization algorithm, visualization of both Pareto solutions and corresponding decision variables and, lastly, analysis and decision-making as shown in Fig. 1. In the remainder of this section, we will mainly focus on the last step (i.e., analysis and decision-making) and discuss methods utilized in this step together with the advantages and disadvantages of each.
As all of the solutions in the Pareto set are optimal with some trade-offs and considered equally good [39], it is often challenging for the decision-makers to select the solution(s), especially in the absence of subjective or judgmental information [10]. Not to mention the time complexity of analyzing all the Pareto optimal solutions. Even in the presence of subjective or judgmental information, the decision depends on the experience of the decision-maker. Thus, methods to find a reduced (small) set that captures the diversity of the Pareto optimal solutions and hence, a reasonable representation for the entire Pareto set, are important [10,29,48]. Researchers have proposed several methods to group (cluster) Pareto set based on the similarity of the values of objective functions [54]. Then, one representative solution for each cluster is chosen to form the reduced set. However, the reduced set obtained from the traditional clustering methods may not cover the entire Pareto optimal solutions. This is due to the representative solutions may not find extreme solutions (refereeing to solutions with the global minimum or maximum of objective function values). Thus, a method to find a reduced set that truly represents the entire Pareto optimal solutions (including extreme solutions), is needed.
There are several approaches to find a reduced set for the Pareto set such as a-priori, interactive, and a-posteriori methods [31,32]. A-priori methods often lead to a single solution and depend on the decision-makers preference as the decision-makers impose their preference before the optimization. In the interactive methods, a reduced set is computed at each iteration to ultimately obtain a small set of Pareto optimal solutions [48,54]. In the post Pareto analysis, first, the entire Pareto set for the multi-objective problem is obtained and then the reduced set is computed [4,10,41,49]. Among these methods, a-posteriori methods give an overall picture to the decision-makers [30]. Therefore, in this paper, we also focus on the post Pareto analysis as depicted in Fig. 1. Accurate decisions are needed in every decision context and the nature of the decision determines the level of accuracy required. To make a better decision, the decision-maker needs the full picture. Thus, it is important to obtain a reduced set with small cardinality and is a better representation for the entire Pareto set (i.e., a set with a minimum number of Pareto optimal solutions such that it captures the diversity of the entire Pareto set).
Among the approaches used to find a reduced set of Pareto optimal solutions for the multi-objective optimization problems, the most common approaches that researchers used are ranking based hierarchical approaches or clustering-based approaches such as K -means clustering [5,10,41]. K -means is the most popular traditional clustering approach in Pareto Fig. 1 (Color online) A typical three steps summary of a-posteriori methods for multi-objective optimization problems from solution approach to decision-making. We give drawback(s) of each method of selecting a solution in red text while the focus and contribution of this paper are highlighted in blue pruning [34]. In the ranking based hierarchical approach, a preference is given to one objective over another objective in the pairwise comparison, and a score (weight) is assigned [5,37] (see also Step 3 in Fig. 1). Then, all the Pareto optimal solutions are ranked based on this score. However, the problem with this approach is that the preference depends on the application and the experience of the decision-maker. As all the Pareto fronts are considered equally good with some trade-off, the final solution obtained from the ranking may not represent the entire Pareto set. What is needed for multi-objective problems is a reduced set which truly represents the entire Pareto set [54]. To achieve this, several clustering algorithms have been employed. Moreover, some researchers have proposed Hypervolume indicator (HYP), which is a widelyused performance indicator for multi-objective evolutionary algorithms (MOEAs), for Pareto pruning [23,34]. HYP is also used to measure the diversity and efficiency in the solutions in the reduced set [34,38].
In clustering methods, a representative solution is chosen based on the solution at the cluster center or one closest to the hypothetical best solution (the ideal solution) in the respective cluster [10,54] (see also Step 3 in Fig. 1). On one hand, the selection of an optimal number of clusters is ambiguous. On the other hand, the reduced set may not include extreme solutions. However, it is also important to analyze extreme solutions [35,40]. The Silhouette score is proposed to overcome the ambiguity of determining the optimal number of clusters [9]. In this paper, we show how K -means clustering can be combined with the graph theory to capture a better-reduced set, even though we use K -means as a benchmark method for purely comparison purposes. That is, we also use results from K -means to show the importance of adding extreme solutions to the reduced set. This will enhance a reduced set when the reduced set is obtained via clustering. The Silhouette score, which indicates the quality of the clustering, is used to determine the optimal number of clusters within the given upper bound. Then, we use graph-theoretical properties based on the connectivity of the network (graph) that is created for the objective space to extract the representative solution for each cluster. Besides, this also allows us to find extreme solutions. Therefore, the proposed method gives a better-reduced set of Pareto optimal solutions to decision-makers to make their decision with ease. Moreover, the proposed method is independent of subjective decisions and can be used by a non-experienced decision-maker.
In the HYP-based pruning method, multiple solutions or a single solution is used to calculate the Hypervolume [16,23,34,38]. The idea behind this method is to maximize the HYP, i.e., a set of solutions or a solution, which maximizes the Hypervolume, constitutes the pruned set (reduced set). Even though boundary solutions are included in the reduced set, the HYP-based pruning method does not automatically assign non-selected solutions to each of the representative solutions and requires an additional step if the decision-maker wanted to further investigate neighboring solutions of his/her selected solution [34]. This is not the case for clustering methods and the inherent grouping of non-selected solutions is a major advantage of the clustering method over the HYP-based method. As boundary solutions are included in the pruned set from the HYP-based pruning, it is attractive for the decisionmaker who is interested in extreme solutions before finalizing the decision [34]. However, the HYP-based method can be very computationally expensive for multi-objectives problems and can be a burden as it requires determining the Hypervolume for every single solution or combination of solutions [55]. Therefore, to this end, this paper proposes a method that can effectively find a reduced set of Pareto sets, which includes extreme solutions as well.
Recently, both K -means clustering and the proposed graph-theoretical clustering, which is based on Gomory-Hu trees (GHT), have applied to predict the failure location on manmade and natural slopes as well as small scale data from DEM simulation [20,42,43,52]. The results revealed that the proposed-graph theoretical clustering outperforms K -means clustering in the sense of grouping points with similar properties. On the other hand, Kmeans clustering alone may not find the extreme solutions. The GHT algorithm is commonly used in graph partitioning and graph clustering [14,18], image segmentation [25], webpage segmentation [26], social network analysis [6], biological data analysis [33,45]. Motivates from these, we propose this graph-theoretical clustering method to find a better-reduced set of Pareto optimal solutions (see also Step 3 in Fig. 1). One step forward to the traditional clustering methods, the proposed clustering method can identify extreme solutions.
First, we apply the proposed method to the benchmark Pareto optimal fronts for different optimization problems reported by [11,12] to test the performance in general. These Pareto fronts are available at 1 . Next, we apply the proposed methods to the Pareto optimal solutions obtained from a realistic case study from the Royal Australian Navy to gain more decision-making insights into the case study in defence [47]. To the best of the authors' knowledge, there is no existing post-Pareto analysis that gives the promising reduced set to decision-makers in the context of the military application as we will discuss in Sect. 2. Our main contribution is the proposed graph-theoretical clustering method which can yield to a better-reduced set. Moreover, the proposed method of finding representative solutions can be unitized to find extreme solutions. We combine this method of finding extreme solutions with a K -means clustering to find an improved reduced set. This is to show the importance of adding extreme solutions with the results from clustering approaches. Results from all the test cases revealed that the reduced set obtained from the proposed method better represents the entire set.
The contributions of this paper are mainly threefold.
1. The paper proposes a novel graph-theoretical clustering method that uses Gomory-Hu trees, to find a reduced set of Pareto optimal solutions. This set includes both representative solutions found for each cluster and extreme solutions. Hence, it captures the diversity of the entire Pareto set. Thus, this set is a better representation of the entire Pareto set. Further, the proposed methods can be applied to any multi-objective problem. 2. In this paper, we propose a novel method to find representative solutions from each cluster based on the connectivity of the graph. This method can also identify extreme solutions. Therefore, this method can be combined with traditional K -means clustering to obtain a better-reduced set. 3. According to the best of the authors' knowledge, this is the first attempt to use Gomory-Hu trees for the post-Pareto analysis to obtain a better-reduced set in the field of military. Note also that there is no existing literature for post-Pareto analysis in the military facility location problem, which integrates workforce planning and capacity allocation.
The rest of the paper is organized as follows. In Sect. 2, we present the current state-ofthe-art knowledge of reducing Pareto optimal solutions to a small set which is then used to support decision-makers. In Sect. 3, we discuss the proposed method of finding a reduced set of the Pareto optimal solutions and other relevant background information. In Sect. 4, we show the performance of the proposed method with benchmark Pareto fronts. We present and discuss the case study, multi-objective model, solution approach and results in Sect. 5 to give some decision-making insights before the concluding remarks are given in Sect. 6.

Literature review
Researchers are continuously proposing new approaches to improve the quality (referring to being able to represent the entire Pareto set) of the reduced set. For example, SPEA2 is proposed by Zitzler et al. [53] to improve the reduced set obtained by SPEA. The reduced set obtained by the clustering method in SPEA may not keep the outer solutions even though it preserves the characteristics of nondominated set [53]. To overcome this, SPEA2 proposed in [53] uses new archive trancation method. In archiving, we commonly examine the problem of maintaining an approximation of the set of nondominated points visited during a multiobjective optimization [27]. However, this paper proposes a novel method to find a reduced set of Pareto optimal solutions (see flowchart in Fig. 3) based on Gomory-Hu trees, which always preserve the extreme solutions. Moreover, the proposed method finds the optimal number of clusters within the given upper bound as we will explain in Sect. 3. Finding a reduced set with small cardinality to represent the entire Pareto set for the multi-objective optimization problem is not new. In Table 1, we present a selective but, comprehensive literature covering the different methods of obtaining a reduced set and application domain. This section serves for the purpose of highlighting research gaps and the novelty of the proposed method.
To find the reduced set, clustering methods (hierarchical or direct) are utilized to cluster the Pareto optimal solutions with similar properties. Then, a representative solution is extracted from each cluster to form the reduced set. On the other hand, there are hierarchical based ranking methods to obtain the best solution. The main problem of obtaining the best solution is that this solution depends on the expertise of the decision-maker and his or her preference as weight is given to each objective compared to another [54].  [4] To use Pareto filters in order to identify the most promising ones

Pareto filters
The design of reverse osmosis desalination plants [5] To propose analytic hierarchy process-based approach for selecting a Pareto-optimal solution of a multi-objective, multi-site supply-chain planning problem The analytic hierarchy process Supply-chain planning problem [21] To find a set of Pareto optimal solutions and a compromise solution The branch & bound and iterative goal programming Public emergency service stations [48] To introduce a new criterion to evaluate the effectiveness of a representative subset of non-dominated points in the context of the

Bi-criteria p-Median p-Dispersion problem (BpMD)
Determined along the direction of worst objective function - † ‡ Applied to some test functions Applied to continuous and combinatorial problems * Applied to artificial data † Applicable only for bi-criterion problems The most popular techniques to obtain a reduced set is the clustering methods [1,10,41,49]. There are several clustering methods such as hierarchical clustering [1], K -means clustering [41] and unsupervised learning clustering (i.e., artificial neural network) with variable weights [28]. The main drawback of the clustering methods is the ambiguity of selecting the number of clusters. To avoid prescribing the number of clusters, a standard agglomerative hierarchical clustering technique is proposed in [49]. Moreover, the Silhouette score is also used to determine the optimal number of clusters as discussed in Sect. 1.
There exist other methods such as the analytical hierarchy process (AHP), Pareto filters, data envelopment analysis (DEA) and augmented -constraint method, which are often used to obtain the best solution (a compromise solution) [4,5,15,30,37]. However, what is needed for the multi-objective optimization problems is not a single solution but, a reduced set with small carnality, that captures the diversity of the entire Pareto set as discussed in Sect. 1. Moreover, the best solution obtained from the above-mentioned ranking methods often depends on the experience of decision-makers.
The applications of the post Pareto analysis lie in the field of engineering, including energy planning and optimization of renewable energy systems [1,4,49], redundancy allocation problem [8], system reliability design problems [41], public emergency service stations [21], supply-chain [5], etc. In a multi-objective, multi-site supply-chain planning problem, an AHP method is applied to select the best Pareto-optimal solution according to the preferences of the decision maker [5]. Most of the current literature on supply-chains have been focused only on obtaining the Pareto optimal solutions and do not attempt to select a compromise solution [5]. This is particularly true for existing literature on multi-objective optimization problems in the field of the military. To the best of the authors' knowledge, there is no post Pareto analysis conducted in the field of military application to obtain a reduced set. Moreover, the above is also true for the facility location problem except [21] where they propose a method to find a compromise solution using branch & bound and iterative goal programming. This might be due to the fact that most of the facility location problems are modeled as single-objective optimization problems. In particular, around 75% of facility location problems in the military are single-objective [22]. To this end, we contribute this branch of the existing literature by applying our graph-theoretical based approach to obtain a better-reduced set in a military facility location problem.

Method
In this section, we discuss our proposed method to find the clusters in Pareto optimal solutions and select a representative solution for each cluster. The proposed method utilizes Gomory-Hu trees. The idea here is to find the partitions of a graph, which is created for points in the objective space, to isolate points with similar properties. The number of partitions is determined by the quality of the clustering obtained from the Silhouette score.
The Silhouette score S is proposed by [36] to determine the overall quality of the clustering. This score is in S ∈ [−1, 1] and a value of 1 indicates the perfect clustering. S is defined to be the global average of s(i). s(i) measures the compactness of the clusters, i,e., it is a measure of similarity of a given node i to the other nodes j in its own cluster (cohesion) compared to the nodes in the other clusters (separation). S is given by where n is the total number of nodes, a(i) is the mean distance in the objective space between node i and all other nodes in the same cluster while b(i) is the mean distance between node i to all nodes in the other clusters. High S corresponds to a good clustering pattern, i.e., nodes in the same clusters have very similar properties and hence, compactly packed in the objective space while properties of the nodes in different clusters are vastly different and hence, well separated from each other (e.g., red and blues nodes in Fig. 2b are tightly packed and well separated from each other in the objective space). This results in small a(i) and large b(i).
As a general indicator, S < 0.2 suggests essentially no clustering pattern was found [50].
The higher values of S indicate the individual clusters are more compact and well separated from each other. To find the representative solutions from the large set of Pareto optimal solutions, first, we need to group a set of objects in such a way that objects in the same group are more similar. This is done by using the values of the objective functions in the Pareto optimal solutions (Pareto set), i.e., in the objective space. The most popular existing clustering algorithm in Pareto pruning is the K -means clustering [34] and we will be using results from it for comparison purposes only. To determine the optimal number of clusters, the Silhouette score is used [9]. Such a clustering with two clusters is shown in Fig. 2 with red and blue nodes. Once we have the clusters, we find the representative solutions using the connectivity, i.e., the degree and one close to the hypothetical best solution (ideal solution) and the solution at the cluster center in the respective cluster. The former method is the 'proposed method' in this paper while the latter methods are the traditional ways of finding the representative solutions for clusters [10]. Specifically, the former method is used to find the representative solutions for clusters from both K -means and the proposed method. Moreover, it is used to find extreme solutions, that can be added to a reduced set from other clustering methods. We use the traditional way of finding a representative solution for the performance comparisons.
In the remaining of this section, we first discuss the proposed graph-theoretical clustering method. This process is threefold: (1) construct a weighted contact network (graph) for the objective space obtained from a multi-objective optimization problem, (2) construct the Gomory-Hu tree (GHT), and (3) find clusters from GHT. In Sect. 3.1, we discuss the construction of the graph for the objective space. We present the proposed graph-theoretical clustering method in Sect. 3.2. In Sect. 3.3, we give details of the proposed methods to extract a representative solution for each cluster.

Construction of a weighted contact network (graph) G * for Pareto optimal solutions
To create a contact network, we map each point in the objective space to a node. For each node representing a point in objective space, we connect other nodes if they lie within a certain distance d t as illustrated in Fig. 2a. We select d t iteratively, i.e., we start with a small value and gradually increase it until we get a single connected component graph. The resultant graph is denoted by G = (V , E), where V is the set of nodes and E is the set of edges. An edge between node i and j is denoted by e i j . Note that G is an undirected graph for the purpose of this analysis. This means that there is an edge between node i and j if and only if there is an edge between node j and i, i.e., e i j = e ji . A depiction of a graph with 14 nodes is shown in Fig. 2b where the edges are shown via black links. We next assign every edge in G a non-negative weight called the capacity of edge c i j which is determined by the Euclidean distance between two nodes i and j representing the two distinct points in objective space. Since we are only interested in the clusters in the objective space with similar properties, we set the capacity such that the higher the Euclidean distance between two nodes i and j, the lower its corresponding capacity. The idea here is to find the cuts in the graph that passes through the set of edges with lower capacity and hence, nodes in those edges are far away. By doing so, we can cluster nodes with similar properties together. The edge capacity between two nodes i and j c i j is given by where d i j is the Euclidean distance between the nodes i and j representing the edge e i j . Therefore, the capacity function for all of E is written as Let G * = (G, c), then it is called a capacitated undirected graph. Now, we create the GHT (also called the minimum cut tree) which records all possible minimum cuts, to find clusters. The GHT utilizes the maximum flow minimum cut theorem. In the minimum cut analysis, the edge capacity function plays a major role [43,44]. Next, we give some background information for the completeness. First, we look at the so-called cut in a graph. A cut of a graph is defined as a partition of the vertices (nodes) of a graph into two disjoint subsets. Any cut induces a cut-set, the set of edges that have one endpoint in each subset of the partition. The capacity of a cut is the sum of the capacities of all edges in the cut-set.
We call the quadruple F = (G, c, q, t) is a flow network of G, where q and t are two artificial nodes called the source and the sink, respectively. Now, we can define a cut of F , as follows. contains a set of links in G which, if disconnected, separates F into two disjoint components {W , W } of V such that no connected path from source q ∈ W to sink t ∈ W , i.e., now flow can be transmitted from q to t. Thus, any cut contains all links that have one endpoint in W and the other is in W .
As we are interested in the clusters with similar properties, we find the cut with the least capacity-the so-called minimum cut. To find the minimum cut, we utilize the maximum flow minimum cut Theorem. For more details about these calculations and procedures, we refer the interested readers to [3].

Gomory-Hu tree based clustering method
To find the clusters among objective space with similar properties, we construct the GHT [17]. Now, we outline this procedure briefly. Assume G * = (G, c) be an undirected and capacitated graph constructed for the points in Objective space. The GHT contains the information on the minimum u-v cut of G * separating u and v for every possible pair of nodes (u, v). Note that there are a total of n(n − 1)/2 possible source-sink pairs for a given graph G * with n nodes. This means that there are n(n − 1)/2 possible minimum cuts. However, according to the construction of GHT, the minimum cuts for some pairs of nodes are the same. The GHT is a minimum cut tree with the information on exactly n − 1 distinct minimum cuts [17]. We define the GHT formally as follows. The key property of the GHT is that a minimum capacity cut partitioning G * can be found in linear time by simply finding the smallest capacity link, say e, in T . The nodes in the two components of T − e constitute two clusters in G * . In a similar fashion, using T one can easily find n clusters in G * for any n by simply removing n − 1 smallest capacity links in T . However, this results in unbalanced clusters [26]. Therefore, we introduce the concept of cut ratio to find more realistic clusters. The ratio r is defined to be the number of nodes in the smallest component to the largest component upon the removal of a link in T . The idea is to find the link with the minimum capacity within the ratio of αk ≤ r < α(k + 1), where k ∈ {0, 1, .., 1/α − 1}, i.e., to find minimum cuts within all the ratios with the same width of α. This width defines the upper bounds for the number of clusters. Note that α is chosen such that 1/α − 1 is an integer (if not, one can round off to the nearest integer). Then, we remove all possible combinations of these links and record the Silhouette score. The combination, which gives the maximum Silhouette score, determines the optimal number of clusters. A Pseudocode for the proposed clustering method is given in Algorithm 1. Note that unbalanced clusters can be used to find extreme solutions. The idea here is to remove the n m lowest capacity links (e.g., links with capacity below the value of 10 th percentile) in T one at a time and find the nodes in the smallest component. Recall that the removal of a link in T gives two components of the corresponding minimum cut. Typically the smallest component contains a single node or fewer nodes and separates them from others with a minimum capacity. According to the capacity function in Eq. (2), nodes in the same clusters have similar properties. Therefore, nodes in the smallest component should be the most isolated nodes. As every node corresponds to a point in the objective space, the most isolated nodes are the 'extreme' solutions. This procedure in detail is outlined in Algorithm 2.

Proposed way of selecting representative solutions
In this section, we discuss our method of selecting a representative solution for each cluster. The method utilizes the connectivity of G. We can define some properties of the graph based on the connectivity of the network such as degree. The degree for a node is defined to be the

Algorithm 1 The proposed graph-theoretical clustering algorithm
Consider a GHT T constructed for a undirected and capacitated graph G * with n nodes Set N to an (n − 1) × 3 null matrix, i.e., the matrix whose entries are all zeros for k := 1 : n − 1 do Remove a link l k in T Calculate the ratio r (l k ) upon the removal of the link l k and set the k th row of N , N [k, :] := [r (l k ) w(l k ) l k ] where w(l k ) is the weight of the link l k in T end for return The set of edges l k 's which corresponds to the minimum weight w(l k ) for each αk ≤ r < α(k + 1), for k ∈ {0, 1, .., 1/α − 1}. Determine the link combinations to remove: remove all the possible combinations and record S. The combination that maximizes S is the candidate links l k 's. Removal of these links l k 's in T at once gives the corresponding optimal clusters. number of edges incident to that node. For example, the degree of nodes 6 and 9 in Fig. 2b are D(5) = 5 and D(9) = 7, respectively while all other nodes have a degree of 3 except nodes 3 and 7 where these two nodes have a degree of four.

Algorithm 2 The proposed method to find extreme solutions
If a node in a cluster has many connections, it should be a good representative for that cluster. That means the node with a higher degree better represents the entire cluster. From the above calculations of the degree, we also see that the node with the highest degree sits in the middle of the cluster. Therefore, it is clear that the node with the highest degree in a cluster better represents the entire cluster, on average. The proposed method of finding a representative solution for each cluster is outlined in Algorithm 3. The clusters are obtained from either K -means or GHT.

Algorithm 3
The proposed method to extract representative solutions from each cluster Construct a graph G = (V , E) for the Pareto optimal solutions Compute the degree d(i) for each node i ∈ G Let n o be the optimal number of clusters obtained from either T or K -means clustering for k := 1 : n o do Find the node that corresponds to the highest D(i) in the cluster k end for return The set of nodes that corresponds to the highest D(i) in each cluster. Pareto optimal solutions that correspond to this set constitute the representative solutions.
The reduced set of Pareto optimal solutions must cover all the cases including the extreme solutions [35,40]. However, standalone utilization of K -means clustering with commonly used ways of selecting representative solutions may not find extreme solutions. Therefore, we find these extreme solutions from the connectivity of the graph G. To do that, we find the solutions with a lower global degree (i.e., this analysis is conducted in objective space, not on the reduced set or individual clusters). The idea here is to find the most isolated solutions and, hence, they have either global minimum or maximum of the objective function values. These

Performance of the proposed method on benchmark data
In this section, we compare and present the reduced set obtained from K -means clustering and the proposed graph-theoretical clustering method. We compare the performance of the proposed method with two approaches in the literature, i.e., the reduced set formed by the solution closest to the hypothetical best (ideal solution) and the solution at the cluster centroid in each cluster. We chose these as the solution closest to the ideal solution or at the cluster centroid should capture the trade-off, theoretically. We use the Hypervolume indicator (HYP) and maximum spread (MS) to quantitatively determine the performance of solutions obtained from each method. Since HYP measures the volume of the trade-off space covered by the Pareto optimal solutions, the higher the Hypervolume better the solution is [7,23]. It also measures both the convergence and the diversity [34,38,51]. MS proposed in [2] has been successfully used in many multi-objective optimization problems to measure the diversity of the Pareto optimal solutions [24]. MS compares the Pareto front to the extreme value of the Pareto front and a value close to one is desired [24]. For a better comparison, the ideal solution and the nadir (worst solution) point are chosen from the entire Pareto optimal solutions.
We have tested the method on five different 3D objective spaces and five different 8D objective spaces for different DTLZ problems, which are reported by [11,12] and available at 1 . In Sect. 4.1, we present the results from the proposed method while results from most commonly used K -means clustering are presented in Sect. 4.2. We use the results from Kmeans clustering purely for the comparison and to show the importance of adding extreme solutions, which are obtained from the proposed graph theoretical approach as discussed in  Sect. 3. We choose K -means as the benchmark as it is the most popular traditional clustering method in the Pareto pruning [34]. In Sect. 4.3, we compare the results from the proposed method and K -means clustering.

Results from the proposed method
In Fig. 4, we show a reduced set obtained from the proposed method. This reduced set contains the representative solutions from clusters, which are based on the maximum degree (Algorithm 3), and the extreme solutions (Algorithm 2). Next, we calculate HYP and MS for all the benchmark data to show the performance of the proposed method quantitatively. In particular, this is important where we cannot visualize the results (above 3D objective space). These two performance indicators are tabulated in Tables 2 and 3.

Results from the K-means clustering
With the K -means, to obtain a representative solution for each cluster, a solution close to the hypothetical best (i.e., ideal solution) and solution at the cluster center is used. These solutions together with the proposed method of obtaining a representative solution for each cluster, which is based on the maximum connectivity node in each cluster (Algorithm 3), are shown in Fig. 5 for 3D objective space. Next, we show the importance of adding extreme solutions to the reduced set from Kmeans clustering alone. As discussed in Sect. 3, we used the connectivity of the graph that is created for the objective space. We find all the nodes, which has a degree below the 1.5 percentile value in the degree distribution. We can see a better-reduced set if we add extreme solutions to the reduced set obtained from only the K -means clustering (Fig. 6).
We show the HYP and MS of reduced sets obtained from K -means clustering alone and extreme solutions added to the reduced set for all the benchmark data in Tables 4, 5, 6 and 7. When extreme solutions, which are found from the degree distribution from the contact network, are added to the results from K -means clustering, we can see a significant increase in HYP and MS for all benchmark Pareto fronts. This indicates that the reduced set, which comprises both the representative solutions for clusters and extreme solutions, better captures the diversity of the Pareto fronts. Thus, handling the Pareto fronts as a contact process and finding the extreme solutions from the connectivity has an added value.
We also compute the percentage improvement of HYP and MS of the reduced set, which is obtained from adding extreme solutions, against results from K -means. We tabulate these results of HYP and MS in Tables 8 and 9, respectively. As evident in the results, adding  extreme solutions to the reduced set obtained from the K -means alone has a significant improvement of the reduced set.

Performance comparison
A visual comparison shows the reduced set obtained from the proposed method better captures the diversity of the Pareto fronts for all 3D benchmark data (see Figs. 4 and 5). We also compute the percentage improvement of HYP and MS of the reduced set from the proposed method compared to that for traditional K -means clustering. We tabulate these results for HYP and MS in Tables 10 and 11, respectively. As can be seen from the results, we can see a significant increase in both the performance metrics for the reduced set from the proposed In summary, we observed that a representative set of Pareto front better preserves the shape of the original set and hence better capture the diversity of the original set compared to results from classical methods. Results also revealed that handling the Pareto fronts as a contact process has an added value in the reduced set.

Application with a case study
In Sect. 4, we have tested the performance of the proposed method to find the reduced set of Pareto optimal solutions against that for classical K -means clustering. In this section, we present the results from the case study and give some decision-making insights. We choose this problem as most of the existing facility location problems are single-objective [22] and the nature of the integrated approach. For the completeness, we provide a brief discussion of the case study including our multi-objective optimization problem, solution approach, and important characteristics of Pareto optimal solutions in Sect. 5.1. In Sect. 5.2, we present and compare the results from the proposed method.

The case study
The case study comes from the Royal Australian Navy's acquisition plan for a new vessel fleet F n with |F n | = 12 and three fleet transition options discussed in [19]. These fleet transition options also include decommissioning plans for the existing naval fleet F e with |F e | = 6. We summarize these three fleet transition options below and refer the interested reader to [19] for more details. We denote these fleet transition options by S l for l ∈ {1, 2, 3}. Under S 1 and S 2 , the first new vessel k ∈ F n enters the vessel fleet, F at the end of the year 2032 and the rest of 11 vessels enter the fleet with two years gap as shown in Fig. 7. As depicted in the figure, for S 3 , the first new vessel enters at the end of the year 2031. The rest

Multi-objective model
Now, we briefly discuss our multi-objective optimization model. The detailed description of the model is discussed in [47]. We model and solve a facility location problem which integrates workforce planning and capacity allocation. We apply this model in the abovediscussed realistic case study in the Royal Australian Navy. As we deal with fleet transition options, our objectives were to maximize the average fleet availability F A , minimize the total cost C T and minimize the total inter-regional transactions I T . Note that F A and C T are given in number of available vessels per week and Australian Dollars (AUD), respectively. The total cost includes construction/downsizing, the maintenance cost of facilities and salaries for both crew and maintenance workforce (industrial manpower). The inter-regional transaction is defined to be the number of times a vessel access a facility (i.e., a dock) outside its homebase location. Our decision variables are the date and amount to be hired (constructed) and downsized of industrial manpower (facilities) in each region, crew recruitment rate in each region, home-base for the new vessels and alternate maintenance region preference for the vessels in each region. In this model we consider

Solution approach
Here, we briefly discuss the solution approach for the above-mentioned multi-objective problem. Our solution approach was the simulation-based optimization technique (see Fig. 8). The detailed description of the solution approach can be found in [47]. The optimization method is based on Nondominated Sorting Genetic Algorithm II (NSGA-II) while the simulation model unitizes a hybrid method of system dynamic (SD) and discrete event simulation (DES) modelling technique discussed in [13,46]. We present the parameters used in the NSGA-II  in Table 12. Note that we use the DEAP toolbox 2 in Python and single-point crossover is applied to two randomly selected individuals (chromosomes) to form an offspring (children solution) at each generation which produces a new population. In the mutation process, the part of the chromosome (decision variable) is randomly selected from a randomly selected solution (individual) with the predefined probability. We run the simulation model between the years 2020 and 2060 with weekly time steps to accommodate the above-discussed fleet transition options.

Pareto optimal solutions
In Fig. 9,

Results
In this section, we compare and present the reduced set obtained from the proposed method and K -means clustering. As in the case with benchmark data, we compare the performance of the proposed method against two approach in the literature, i.e., representative solutions formed by the solution closest to the ideal solution in each cluster and solution at the cluster centroid with K -means clustering. We use the HYP and MS to quantify and compare the performance. In Sect. 5.2.1, we present the results from the proposed method while results from K -means are shown in Sect. 5.2.2. We compare the results from the proposed method and K -means in Sect. 5.2.3.

Results from the proposed method
We find the optimal number of clusters by applying the proposed method (Algorithm 1). The optimal number of clusters found for S 1 and S 2 are 6 while that for S 3 is 10. As pointed by [29], the reduced set should be adequately small but, it should capture the diversity of the entire Pareto set. Therefore, in line with this, we set the upper bound for the required number of clusters to 10. Then the method outlined in Algorithms 3 and 2 are applied to find a representative solution for each cluster and extreme solutions, respectively. These results are shown in Figs. 10a,b, 11a,b and 12a, b. In Figs. 10c, 11c and 12c, we show 2D projection of the representative solutions in (a) with respect to the cluster ID. This gives a complete picture to the decision-maker. Now, the decision-maker can select a solution depending on their requirement by analyzing the reduced set, which contains both representative solutions and the extreme solutions. For example, if the decision-maker prefers the average fleet availability over the total cost or the total inter-regional transactions, then they can make their decision based on the extreme solutions (e.g., Fig. 10b). If they need better-compromised (trade-off) solutions, they can go with representative solutions (e.g., the representative solution in cluster 4, Fig. 10a, c). We calculate the HYP and MS of the reduced set (i.e., combination of representative solutions for clusters and extreme solutions) of Pareto optimal solutions. These performance metrics values are tabulated in Table 13.

Results from K -means clustering
We find a reduced set from classical K -means clustering as mentioned before. In the K -means clustering, we use the quality of the clustering determined by the Silhouette score to obtain the optimal number of clusters within the given upper bound of 10. We observed 10, 7 and 5 clusters for S 1 − S 3 , respectively as it gives the maximum Silhouette score. For these clusters, we find the representative solutions from the maximum degree (Algorithm 3), solution closest hypothetical best solution, and solution at the cluster center. Next, we calculate the HYP and MS of the reduced set obtained from each methods and results are tabulated in Table 14.

Performance comparison
Unlike the traditional K -means clustering, what more the proposed clustering method offer is the extreme solutions. To show the extreme solutions are better captured in the GHT analysis, we calculated HYP and the MS of the reduced set found from the combination of representative solutions for each cluster and the extreme solutions (Table 13). We also calculated HYP and MS of the reduced set obtained from the K -means clustering ( Table 14). Comparison of these results reveal that the reduced set from the proposed method better capture the diversity of the entire Pareto set. This is also evident in the percentage improvement of HYP and MS of reduced set from the proposed method (see Table 15). In summary, results revealed that the proposed graph-theoretical clustering method performs better in finding a reduced set. Therefore, decision-makers can analyze this reduced set instead of analyzing the entire set, which is impossible or time-consuming, to make their decision with ease.

Conclusions, limitations and future research
In this paper, a novel method to find a reduced set of Pareto optimal solutions for multi-objective optimization problems is proposed. The reduced set includes representative solutions found from the clusters, which are obtained from a graph-theoretical approach and other possible extreme solutions. We applied the proposed method in different 3D and 8D benchmark Pareto fronts as well as the 3D Pareto fronts from a case study from Royal Australian Navy. In all cases, we observed a considerable improvement of the Hypervolume indicator (HYP) and maximum spread (MS) of the reduced set obtained from the proposed method compared to that for K -means clustering. This shows the reduced set from the proposed method better captures the diversity of the entire Pareto set. Therefore, decision-makers can make their decision with ease. We also demonstrated that inclusion of extreme solutions to a reduced set from K -means clustering have an added value. Thus, our results suggest combining the graph-theoretical approach and K -means clustering to obtain a better-reduced set. This will be applicable to any clustering method, where they are not capable of finding extreme solutions.
The main limitation of the proposed method is deciding the connections in the objective space when creating a proximity contact network. Currently, a node representing a point in objective space is connected to another node if they lie within a prescribed distance (a threshold). This distance is determined by a simple ad hoc method, i.e., start with a small value and gradually increase until it gives a single connected component graph. The need for a trial and error method for deciding this distance is a limitation. To decide the ratio in the GHT is another limitation since this value depends on the imposed maximum number of clusters. To find the optimal number of clusters from the GHT, one need to remove all the possible combinations of links and choose the combination that maximises the Silhouette score. This will be a constraint for the proposed approach if the imposed maximum number of clusters is too large.