Multi-objective multi-criteria evolutionary algorithm for multi-objective multi-task optimization

Evolutionary multi-objective multi-task optimization is an emerging paradigm for solving multi-objective multi-task optimization problems (MO-MTOPs) using evolutionary computation. However, most existing methods treat the multiple multi-objective tasks as different problems and optimize them with different populations, which makes it difficult to design a good knowledge transfer strategy among the tasks/populations. Different from existing methods that suffer from this difficulty, this paper proposes to treat the MO-MTOP as a multi-objective multi-criteria optimization problem (MO-MCOP), so that the knowledge of all tasks can be inherited in the same population and fully utilized for solving the MO-MTOP more efficiently. Specifically, the fitness evaluation function of each task in the MO-MTOP is treated as an evaluation criterion in the corresponding MO-MCOP; therefore, the MO-MCOP has multiple relevant evaluation criteria to assist individual selection and evolution at different evolutionary stages. Furthermore, a probability-based criterion selection strategy and an adaptive parameter learning method are proposed to better select the fitness evaluation function as the criterion. By doing so, the algorithm can use suitable evaluation criteria from different tasks at different evolutionary stages to guide individual selection and population evolution, so as to find the Pareto optimal solutions of all tasks. Integrating the above, this paper develops a multi-objective multi-criteria evolutionary algorithm framework for solving MO-MTOPs. To investigate the proposed algorithm, extensive experiments are conducted on widely used MO-MTOPs against state-of-the-art and well-performing algorithms, and the results verify the effectiveness and efficiency of the proposed algorithm. Therefore, treating the MO-MTOP as an MO-MCOP is a promising direction for solving MO-MTOPs.


Introduction
Multi-objective multi-task optimization (MO-MTO) [1][2][3] is a novel and emerging paradigm that aims to solve multiple multi-objective optimization tasks simultaneously. The core assumption of MO-MTO is that the knowledge and information gained from optimizing one task can be used to enhance the optimization of the other tasks [4][5][6][7]. For instance, when the optimal solutions of two tasks share similarities in some dimensions, the knowledge of the current best solution in one task (no matter single- or multi-objective) can guide the evolutionary search of the other task. Therefore, MO-MTO, which tackles multiple tasks together during the evolutionary process, can be more efficient than the traditional multi-objective optimization paradigm that considers only one multi-objective optimization task per run. Moreover, the above assumption usually holds in a variety of real-world optimization applications [8][9][10][11][12]. For instance, the vehicle routing problem [8,9], with various constraints such as vehicle capacity, vehicle number, and time constraints, usually has many specific problem instances that are somewhat similar in their problem characteristics, function landscapes, and optimal solutions. As a result, MO-MTO research has attracted increasing attention in recent years.
To date, some EMO-MTO algorithms have been proposed for solving MO-MTOPs, and they can be roughly classified into two categories. The first category consists of multifactorial-based approaches [3,13,36-38], while the second consists of non-multifactorial-based approaches [39-41]. In the first category, the multi-objective multifactorial evolutionary algorithm (MO-MFEA) [3] is one of the most classical and representative algorithm frameworks; it evolves a single population in a unified search space with skill factors for solving multiple tasks. Due to the efficiency of MO-MFEA, several enhanced MO-MFEA variants have been proposed for solving MO-MTOPs [13,36,37]. Differently, the second category mainly maintains multiple populations for solving multiple tasks [39-41]. For example, Lin et al. [39] proposed an incremental learning method to transfer common knowledge among populations to help solve relevant tasks. For another example, Feng et al. [41] proposed a novel autoencoder method to transfer promising individuals among populations targeted at different tasks.
However, existing EMO-MTO algorithms, whether using a single population with multiple groups (based on the MO-MFEA framework) or using multiple populations, still treat the multiple tasks in the MO-MTOP as different problems during the evolution procedure. Therefore, existing EMO-MTO algorithms have to rely on a well-designed knowledge transfer strategy among the tasks, and designing such a strategy is very difficult. Differently, a recent work [42] has shown that it is beneficial to consider the multiple tasks as different evaluation criteria during the optimization of a single-objective MTOP. In this way, the knowledge for optimizing different tasks can be preserved during the optimization process and naturally shared among all tasks. Following this, we can use the multiple relevant criteria to evolve the population accordingly, so as to search for the optimal solutions of all tasks in one run. Inspired by the idea of treating multiple tasks as multiple criteria [42], we attempt to treat the MO-MTOP as a multi-objective multi-criteria optimization problem (MO-MCOP), so that the knowledge of different tasks can be fully leveraged to solve the MO-MTOP more efficiently. That is, we treat the MO-MTOP with multiple multi-objective optimization tasks as an MO-MCOP with multiple evaluation criteria, where the criteria are used for environmental selection and population evolution. By doing so, the challenging issue in MO-MTOP, i.e., how to find useful knowledge and transfer it across different relevant multi-objective tasks, becomes a simpler one: how to utilize the multiple evaluation criteria to guide environmental selection and population evolution, so as to generate optimal solutions that satisfy the criteria of all tasks.
Therefore, this research direction has great potential both to lead to a significant approach for dealing with MO-MTOPs and to contribute to the development of related research communities.
The major novelties and contributions of this paper are as follows: First, this paper attempts to solve MO-MTOPs by treating them as MO-MCOPs, which provides a novel and promising way of handling MO-MTOPs. To the best of our knowledge, this paper is the first that tries to tackle the MO-MTOP by treating it as an MO-MCOP. Besides, this paper also mathematically discusses why treating the MO-MTOP as an MO-MCOP can be effective and efficient.
Second, a probability-based criterion selection strategy (PCSS) is proposed to select and utilize the multiple evaluation criteria based on the corresponding probability, so that different criteria can have different corresponding chances to be selected to guide the environmental selection and population evolution.
Third, an adaptive parameter learning (APL) method is further proposed to learn the probability adaptively for choosing criteria in PCSS. By adopting the APL, the algorithm can learn the suitable probability to help determine which criterion should be used at the current generation, so as to guide the population evolution with different criteria more appropriately.
Fourth, by integrating the above, a multi-objective multi-criteria evolutionary algorithm (MO-MCEA) framework is developed for solving MO-MTOPs.
To evaluate the proposed methods, extensive experiments are conducted on widely used MO-MTOP benchmarks. Moreover, some state-of-the-art and well-performing EMO-MTO algorithms have also been adopted to compare and challenge the proposed MO-MCEA.
The rest of this paper is organized as follows: the next section briefly introduces background knowledge and related work. The following section gives the motivation for and analysis of treating the MO-MTOP as an MO-MCOP. The subsequent section describes the proposed methods, after which the experimental studies are presented. Finally, the last section concludes the paper.

Background and related work
Multi-objective multi-task optimization

MO-MTO is a paradigm for solving multiple multi-objective optimization tasks together. Mathematically, the MO-MTOP can be defined as follows.
Given K multi-objective optimization tasks (assuming the objectives in every task are all minimization problems), denoted as T_1, T_2, …, T_K, the search space and the objective space of the kth task are Ω_k and Ψ^{M_k}, respectively, and they satisfy F_k: Ω_k → Ψ^{M_k}. The aim of a minimization MO-MTOP is to find the optimal solution set {x_k} for each task T_k, such that

{x_k} = argmin_{x ∈ Ω_k} F_k(x), k = 1, 2, …, K. (1)

As each F_k has multiple objectives, the following concepts from the multi-objective optimization literature [43,44] determine whether a solution of task T_k is optimal.

Definition 1 Pareto domination
Given any two objective fitness vectors u = [u_1, u_2, …, u_M] and w = [w_1, w_2, …, w_M] in the objective space Ψ^M, u dominates w, denoted as u ≺ w, if u_m ≤ w_m for all m = 1, 2, …, M and u ≠ w.

Definition 2 Pareto optimality
A solution x ∈ Ω is Pareto optimal if there is no other solution y ∈ Ω such that F(y) dominates F(x).

Definition 3 Pareto set
The Pareto set (PS) is the set of all Pareto optimal solutions, which can be represented as

PS = {x | x ∈ Ω and x is Pareto optimal}. (2)

Definition 4 Pareto front
The Pareto front (PF) is composed of the objective vectors of the solutions in the PS, i.e.,

PF = {F(x) | x ∈ PS}. (3)

Based on the above definitions, the optimal solution set {x_k} for each task T_k is actually the PS of T_k.
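For concreteness, the Pareto domination check in Definition 1 can be sketched as a small Python helper (a minimal illustration for minimization; the function name is ours):

```python
from typing import Sequence

def dominates(u: Sequence[float], w: Sequence[float]) -> bool:
    """Return True if objective vector u Pareto-dominates w (minimization):
    u_m <= w_m for every objective m, and u != w (strictly better somewhere)."""
    return all(a <= b for a, b in zip(u, w)) and any(a < b for a, b in zip(u, w))
```

A solution is then Pareto optimal (Definition 2) exactly when no other feasible solution's objective vector dominates its own.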

Related work
To date, although EMO-MTO is a newly emerged optimization paradigm, a number of EMO-MTO works have been proposed and have attracted increasing attention. Therefore, this part provides a brief review of existing works.
As mentioned briefly in the Introduction, existing works on EMO-MTO fall into two major categories. The first category consists of multifactorial-based approaches [3,13,36-38], while the second consists of non-multifactorial-based approaches [39-41].
In the first category, MO-MFEA is the most representative algorithm framework for solving MO-MTOPs [3]. Compared with other algorithms, the distinct feature of MO-MFEA is that it evolves one population with skill factors to find the optimal solutions of multiple tasks together. In MO-MFEA, each individual in the population corresponds to one task according to its skill factor, and thereby individuals for different tasks can transfer common knowledge implicitly via genetic operations, e.g., the crossover operation. By utilizing omnidirectional knowledge transfer, MO-MFEA can achieve mutual knowledge sharing among different multi-objective optimization tasks, so that the optimization of each task can benefit from the knowledge obtained from the other tasks. To make the knowledge sharing more adaptive, Bali et al. [13] studied inter-task relationship learning and proposed an online adaptive genetic transfer method to develop MO-MFEA-II, which achieves better overall performance than the original MO-MFEA. Moreover, as MO-MFEA may converge slowly if tasks are weakly relevant or even irrelevant, Zheng et al. [36] proposed to introduce additional helper tasks via the weighted sum of the original tasks to improve knowledge transfer and speed up convergence. Furthermore, Yang et al. [37] considered not only the convergence but also the diversity of MO-MFEA and proposed a two-stage assortative mating method to enhance knowledge transfer among diversity-related and convergence-related variables of related tasks. Besides, Binh et al. [38] proposed a reference-point-based approach and an enhanced random mating probability learning method to better exploit and transfer knowledge among individuals targeted at different tasks.
In the second category, algorithms maintain multiple populations and explicitly transfer information among them for solving MO-MTOPs. For example, Lin et al. [39] proposed an algorithm based on incremental learning to find suitable knowledge for transfer among different tasks. Liang et al. [40] proposed a genetic transform strategy to transfer individual genetic information from one task to relevant tasks. In addition, Feng et al. [41] proposed an explicit autoencoding method to achieve knowledge transfer among populations targeted at different tasks to enhance the optimization results.
However, these existing methods and algorithms for MO-MTOPs still treat the multiple tasks in the MO-MTOP as different problems and optimize them simultaneously, which requires a well-designed knowledge transfer strategy to share knowledge among tasks. Different from these methods, the MO-MCEA proposed in this paper treats all tasks in the MO-MTOP as multiple criteria of an entire MO-MCOP to guide the evolution appropriately, so as to fully utilize the knowledge in different tasks during the optimization process and obtain promising solutions for all tasks. Therefore, the contributions and novelties of this paper are justified.

Definition of MCOP and MO-MCOP
In this paper, the MCOP is defined as follows: given an optimization problem (assuming it is a minimization problem) with K available evaluation criteria (including objective/constraint functions) on the same search space Ω, denoted as F_1, F_2, …, F_K, the aim of a minimization MCOP is to find the optimal solution set {x} that satisfies

{x} = argmin_{x ∈ Ω} F_0(x), (4)

where F_0 can be an arbitrary one of {F_1, F_2, …, F_K} at different search stages. Note that the "argmin" should be "argmax" if the problem is a maximization problem, and {x} will have only one element if the problem is single-objective. The key characteristic of the MCOP is that, when each evaluation function can guide the optimization to find acceptable optimal solutions for the problem, the proper usage of multiple evaluation functions can obtain a satisfactory solution more efficiently. Moreover, if F_1, F_2, …, F_K have different fidelities or accuracy scales, the MCOP can be considered a multi-fidelity optimization problem [45] or a multi-scale optimization problem [46], respectively. In addition, the relationship between MCOP and MTOP is that both have multiple evaluation functions, while the main difference is that the MCOP requires the algorithm to optimize only one evaluation function at a time, whereas the MTOP requires the algorithm to optimize multiple different evaluation functions together. Based on the above, the MO-MCOP is the same as the MCOP except that all evaluation functions in {F_1, F_2, …, F_K} are multi-objective functions.

Relationship between MO-MTOP and MO-MCOP
This part discusses the relationship between MO-MTOP and MO-MCOP. To begin with, we consider a unified search space Ω for all tasks in the MO-MTOP, where the search space of the kth task is Ω_k. Considering K one-to-one mapping functions ϕ_1, ϕ_2, …, ϕ_K, where ϕ_k: Ω → Ω_k, Eq. (1) for the MO-MTOP can be rewritten as

{x_k} = argmin_{x ∈ Ω} F_k(ϕ_k(x)), k = 1, 2, …, K. (5)

Now the MO-MTOP has K tasks with the same search space (i.e., Ω), where the multi-objective evaluation function of the kth task is F_k∘ϕ_k and the search spaces of all tasks are the same. Moreover, if F_1∘ϕ_1, F_2∘ϕ_2, …, F_K∘ϕ_K are considered as different evaluation functions of an MO-MCOP, the aim of the MO-MCOP, i.e., Eq. (4), can be rewritten as

{x} = argmin_{x ∈ Ω} F_0(ϕ_0(x)), (6)

where F_0∘ϕ_0 can be an arbitrary one of {F_1∘ϕ_1, F_2∘ϕ_2, …, F_K∘ϕ_K} at different search stages. Note that {ϕ_k(x)} = {x_k} for k = 1, 2, …, K because ϕ_k: Ω → Ω_k. Therefore, comparing Eqs. (5) and (6), the main difference between MO-MTOP and MO-MCOP is that, given a set of evaluation functions, the MO-MTOP aims to optimize all evaluation functions by considering them together throughout the optimization process, whereas the MO-MCOP attempts to obtain the optimal solutions by considering one evaluation function at a time. The latter (i.e., the MO-MCOP) can be an easier problem, because the optimization algorithm can select an appropriate function at different stages adaptively and flexibly to guide the evolution toward better results. Therefore, it would be better if the MO-MTOP can be treated as an MO-MCOP.
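The one-to-one mappings ϕ_k can be illustrated with a simple linear decode from the unified space [0,1]^D to each task's box-constrained space (an illustrative sketch under the common box-constraint assumption; the bounds and helper name are ours):

```python
import numpy as np

def make_phi(lower: np.ndarray, upper: np.ndarray):
    """Build a mapping phi_k from the unified space [0,1]^D to a task's
    box-constrained space [lower, upper] via a linear decode."""
    d = len(lower)
    def phi(x: np.ndarray) -> np.ndarray:
        # Use only the first d unified variables and rescale them linearly.
        return lower + x[:d] * (upper - lower)
    return phi

# Two tasks with different bounds share one unified solution x in [0,1]^3.
phi1 = make_phi(np.array([-5.0, -5.0, -5.0]), np.array([5.0, 5.0, 5.0]))
phi2 = make_phi(np.array([0.0, 0.0]), np.array([10.0, 10.0]))
x = np.array([0.5, 0.25, 1.0])
# phi1(x) -> [0.0, -2.5, 5.0]; phi2(x) -> [5.0, 2.5]
```

With such decoders, the same unified individual can be evaluated under any task's evaluation function F_k∘ϕ_k, which is exactly what Eq. (6) requires.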

Treating MO-MTOP as MO-MCOP
Based on the above, the MO-MTOP can be treated as an MO-MCOP with multiple criteria (i.e., evaluation functions), and the algorithm can select the proper criterion at different stages to guide the evolution. This part further analyzes the rationality and the benefit of treating the MO-MTOP as an MO-MCOP.

The rationality of treating MO-MTOP as MO-MCOP
In fact, the key issue of rationality is the difference between the optimal results obtained from Eq. (5) and those from Eq. (6), i.e., the degree of equivalence between MO-MTOPs and MO-MCOPs. Without loss of generality, consider an MO-MTOP with two tasks T_i and T_j, whose evaluation functions are F_i∘ϕ_i and F_j∘ϕ_j and whose Pareto sets are PS_i and PS_j, respectively. According to Eq. (5), the optimal results of solving the MO-MTOP should include both PS_i and PS_j. Moreover, according to Eq. (6), if F_i∘ϕ_i is selected as the evaluation criterion all the time when solving the MO-MCOP, the optimal result will be PS_i. That is, the difference between the optimal results obtained by Eq. (5) (i.e., solving the MO-MTOP) and those obtained by Eq. (6) (i.e., solving the MO-MCOP) can be considered as the difference between PS_i ∪ PS_j and PS_i, i.e., the difference between PS_i and PS_j.
Based on the above, we can have three observations on the rationality of treating MO-MTOP as MO-MCOP.
First, if T_i and T_j in an MO-MTOP are highly similar and share much common knowledge in their Pareto sets PS_i and PS_j, e.g., PS_i = PS_j or PS_i ≈ PS_j, then the results obtained by Eq. (5) (i.e., by solving the MO-MTOP) and those obtained by Eq. (6) (i.e., by solving the MO-MCOP) are similar. In such a situation, treating the MO-MTOP as an MO-MCOP can find the Pareto sets of both tasks.
Second, if T_i and T_j share some similarities in their Pareto sets, i.e., PS_i ∩ PS_j ≠ Ø but PS_i ≠ PS_j, then PS_i will contain some Pareto optimal solutions of T_j. As real-world multi-objective optimization tasks usually require enough Pareto optimal solutions rather than all of them, the number of Pareto optimal solutions in PS_i can be enough to constitute an acceptable Pareto set for T_j. In this scenario, treating the MO-MTOP as an MO-MCOP can find acceptable solutions for both tasks.
Third, if T_i and T_j are very different, e.g., PS_i ∩ PS_j ≈ Ø, these two tasks are not suggested to be integrated together as an MO-MTOP. In this circumstance, the problem with such tasks is actually not a well-defined MO-MTOP, and therefore, there is no need (and it is also not recommended) to treat it as an MO-MCOP.
Based on the above, it is reasonable to treat MO-MTOPs as MO-MCOPs.

The benefit of treating MO-MTOP as MO-MCOP
When treating an MO-MTOP as an MO-MCOP, the algorithm can select an evaluation function as the criterion to evolve the population. To begin with, we consider an MO-MTOP whose tasks all share the same search space (i.e., the mapping functions ϕ are the identity functions ϕ(x) = x and can be omitted) as an MO-MCOP, and define the following notation: given the populations at the gth and the (g + z)th generations, i.e., P_g and P_{g+z}, respectively, Pr(P_{g+z} ≺ P_g | P_g, F_i) denotes the probability that no solution in P_{g+z} is dominated by any solution in P_g after P_g evolves for z generations with the multi-objective evaluation function F_i as the selection criterion to become P_{g+z}.
Note that theoretical analyses and experimental studies have shown that if an algorithm has a larger probability of producing a better population within a given number of generations, it will have a faster convergence speed and a smaller time complexity [47,48]. Therefore, a larger Pr(P_{g+z} ≺ P_g | P_g, F_i) is more desirable.
Based on the above, we can discuss the benefit of treating the MO-MTOP as an MO-MCOP as follows. Without loss of generality, we can assume that

Pr(P_{g+z} ≺ P_g | P_g, F_i) = a_g, Pr(P_{g+2z} ≺ P_{g+z} | P_{g+z}, F_i) = a_{g+z},
Pr(P_{g+z} ≺ P_g | P_g, F_j) = b_g, Pr(P_{g+2z} ≺ P_{g+z} | P_{g+z}, F_j) = b_{g+z}. (7)

In general, as different multi-objective functions have different landscapes and different numbers of local optima, a_g, a_{g+z}, b_g, and b_{g+z} may differ from each other. Then, if only one of the evaluation functions (e.g., F_i or F_j) is selected as the criterion to evolve population P_g for 2z generations, we have

Pr(P_{g+2z} ≺ P_{g+z} ≺ P_g | P_g, F_i) = a_g × a_{g+z},
Pr(P_{g+2z} ≺ P_{g+z} ≺ P_g | P_g, F_j) = b_g × b_{g+z}. (8)

However, if the problem is treated as an MO-MCOP and F_i and F_j are selected appropriately as the criteria at different stages (e.g., every z generations) to evolve P_g, so as to maximize Pr(P_{g+2z} ≺ P_{g+z} ≺ P_g | P_g, F_i or F_j), then we have

max Pr(P_{g+2z} ≺ P_{g+z} ≺ P_g | P_g, F_i or F_j) = max(a_g, b_g) × max(a_{g+z}, b_{g+z}). (9)

Note that "F_i or F_j" in the formula means that the algorithm can use F_i or F_j during the evolution from P_g to P_{g+z} and from P_{g+z} to P_{g+2z}. Then, by combining Eqs. (7), (8), and (9), we obtain the inequality

max Pr(P_{g+2z} ≺ P_{g+z} ≺ P_g | P_g, F_i or F_j) ≥ max(a_g × a_{g+z}, b_g × b_{g+z}). (10)

That is, compared with the approach that uses only one function as the criterion, treating the MO-MTOP as an MO-MCOP and using multiple fitness functions properly at different stages gives the population a larger probability of becoming a better population within the given number of generations, which results in higher evolution efficiency and faster convergence. Therefore, treating the MO-MTOP as an MO-MCOP properly can bring more sufficient knowledge sharing among multiple tasks to benefit the population evolution [49-51].
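The argument above can be checked with a small numeric example (the per-stage improvement probabilities below are hypothetical values chosen only to illustrate why mixing criteria can beat sticking to one):

```python
# Hypothetical per-stage improvement probabilities (illustrative values only).
a_g, a_gz = 0.6, 0.2   # using F_i in the first / second z generations
b_g, b_gz = 0.3, 0.5   # using F_j in the first / second z generations

only_Fi = a_g * a_gz                         # stick with F_i for 2z generations
only_Fj = b_g * b_gz                         # stick with F_j for 2z generations
best_mix = max(a_g, b_g) * max(a_gz, b_gz)   # pick the better criterion per stage

# The mixed strategy is never worse than either single-criterion strategy.
assert best_mix >= max(only_Fi, only_Fj)
```

Here the best mix (F_i first, then F_j) attains 0.6 × 0.5 = 0.30, while using only F_i or only F_j yields 0.12 or 0.15, matching the inequality derived in the text.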
Based on the above, it is rational and beneficial to treat MO-MTOPs as MO-MCOPs and choose the criterion properly at different evolution stages. Therefore, this paper proposes the PCSS and the APL to select the criterion properly at different stages, which are described in the following sections.

The overall framework of the MO-MCEA
The overall framework of the MO-MCEA is presented in Fig. 1 and mainly has three parts. The first part is a criterion set of available evaluation functions; the second part is the criterion selection based on the PCSS and APL, which selects one of the evaluation functions as the evaluation criterion to evolve the population at different evolutionary stages; and the third part is the procedure of evolutionary optimization with the selected criterion. In the following, the PCSS, the APL, and the complete algorithm are introduced one by one.

Probability-based criterion selection strategy
The PCSS is proposed to select one multi-objective function from multiple multi-objective functions to be the selection criterion for the current generation. The idea of the PCSS is straightforward: each multi-objective function has a criterion selection probability (denoted as csp), e.g., F_i has csp_i, and the multi-objective function with a larger csp has more opportunities to be selected as the selection criterion. Following this, given K multi-objective functions F_1, F_2, …, F_K and their corresponding selection probabilities csp_1, csp_2, …, csp_K, the index of the function selected as the criterion is determined by a roulette wheel over csp_1, csp_2, …, csp_K. Mathematically, the selection of the criterion can be written as

cid = roulette(csp_1, csp_2, …, csp_K), (11)

where cid is the index of the selected criterion (i.e., the selected criterion is denoted as F_cid), and the function roulette returns an index based on a roulette wheel with csp_1, csp_2, …, csp_K as the probabilities of the available indexes 1 to K.
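A minimal sketch of the roulette function in Eq. (11) (illustrative Python; any standard roulette-wheel implementation would do):

```python
import random

def roulette(csp: list[float]) -> int:
    """Roulette-wheel selection: return an index cid in 1..K with
    probability proportional to csp[cid-1]."""
    total = sum(csp)
    r = random.random() * total
    acc = 0.0
    for i, p in enumerate(csp, start=1):
        acc += p
        if r <= acc:
            return i
    return len(csp)  # numerical-safety fallback
```

Criteria with larger csp values are chosen more often, but every criterion with a nonzero csp retains a chance of selection.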

Adaptive parameter learning
The initial csp values are set evenly over the available functions, and the APL aims to learn more suitable csp values adaptively during the evolution, so that the PCSS can select the criterion more properly. In general, if the population improves after one generation under the current selection criterion, this criterion may be more suitable for the current evolutionary stage, and vice versa. Therefore, the APL updates the csp values according to the population improvement in every generation. To be specific, considering that the population at the gth generation (i.e., P_g) uses the cidth multi-objective function (i.e., F_cid) as the selection criterion, the APL updates csp_cid as

csp_cid = csp_cid + Δ, if P_{g+1} is better than P_g;
csp_cid = csp_cid − Δ, otherwise, (12)

where Δ is a fixed step size for updating csp, and "better" is determined by comparing P_{g+1} and P_g based on metrics that do not rely on the true Pareto front, such as the C metric [52] and the hypervolume [32]. Note that the proposed MO-MCEA uses the C metric to compare P_{g+1} and P_g. Besides csp_cid, the other csp values (i.e., csp_j where j ≠ cid) are also updated, as

csp_j = csp_j − Δ/(K − 1), if P_{g+1} is better than P_g;
csp_j = csp_j + Δ/(K − 1), otherwise, (13)

where K is the total number of available multi-objective tasks. Note that if any csp_i becomes smaller than 0.1, it is set to 0.1 and all csp values are then normalized, so as to guarantee that the ith function still has a chance to be selected and that the csp values sum to 1.
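The APL update, including the 0.1 clamp and renormalization, can be sketched as follows (an illustrative reading of Eqs. (12) and (13); `delta` plays the role of the fixed update step, and the function name is ours):

```python
def update_csp(csp: list[float], cid: int, improved: bool, delta: float = 0.01) -> list[float]:
    """APL sketch: raise csp[cid] by delta if the population improved under
    F_cid, otherwise lower it; spread the opposite change over the other K-1
    criteria, then clamp every csp to >= 0.1 and renormalize to sum to 1."""
    K = len(csp)
    step = delta if improved else -delta
    new = []
    for j, p in enumerate(csp, start=1):   # 1-based criterion indexes
        if j == cid:
            new.append(p + step)
        else:
            new.append(p - step / (K - 1))
    new = [max(p, 0.1) for p in new]       # keep every criterion selectable
    s = sum(new)
    return [p / s for p in new]
```

For example, with two criteria at [0.5, 0.5] and an improvement under criterion 1 with delta = 0.1, the probabilities move to [0.6, 0.4].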

The complete MO-MCEA
With the PCSS and APL, the flowchart of the complete MO-MCEA is given in Fig. 2 and the pseudocode is shown in Algorithm 1. Note that the search spaces of all tasks are normalized into the unified search space [0,1]^D, where D is the maximum variable dimension among all tasks. That is, solutions for different tasks are mapped into [0,1]^D, which is a common technique in the literature [4]. As can be seen, after the initialization, Algorithm 1 mainly repeats three procedures: criterion determination via the PCSS, population evolution, and parameter learning for the PCSS through the APL, shown in lines 10-15, lines 16-18, and lines 19-20 of Algorithm 1, respectively. Note that the population evolution procedure in Algorithm 1 can use various existing well-designed operators, including efficient crossover, mutation, and selection operators, according to the needs and preferences of users. Therefore, the MO-MCEA can be extended with powerful state-of-the-art methods and operators to develop more efficient algorithms. In this paper, the optimization algorithm for each task is NSGA-II [44]. In addition, the evaluation criterion is switched by the PCSS every G generations, and every time the criterion is switched, the current population is re-evaluated by the new evaluation criterion before the evolution continues. Therefore, NP fitness evaluations (FEs) are needed after every switch, as shown in lines 13 and 14 of Algorithm 1. Overall, the algorithm repeats the PCSS, population evolution, and APL iteratively until the stop criterion is met, e.g., all available FEs are consumed. Note that K sets (denoted as NDS_1, NDS_2, …, NDS_K) are used to record the current non-dominated solutions of the K tasks, respectively.
During the evolutionary process, after the population evolves for one generation with F_cid (i.e., line 16 of Algorithm 1), the corresponding NDS_cid is updated. That is, all solutions in the current population are merged with the solutions in NDS_cid, and only the non-dominated solutions in the merged set remain in NDS_cid. After the evolutionary process, all solutions in the final population are evaluated by each of the K multi-objective functions and merged with the corresponding NDS to update it (the update process is similar to that in line 18); see lines 23-24 of Algorithm 1. Finally, the algorithm outputs the best-found non-dominated sets of all tasks.

Algorithm 1 (excerpt, lines 13-25):
13: Re-evaluate individuals by the function F_cid;
14: FEs ← FEs + NP;
15: End If
16: Evolve population for one generation with F_cid;
17: FEs ← FEs + NP;
18: Update and record the corresponding NDS_cid;
19: // Adaptive Parameter Learning
20: Update [csp_1, csp_2, …, csp_K] according to Eq. (12) and Eq. (13);
21: gen ← gen + 1;
22: End While
23: Evaluate individuals by the K multi-objective functions F_1, F_2, …, F_K;
24: Update and record NDS_1, NDS_2, …, NDS_K, respectively;
25: End
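The overall loop can be sketched at a high level as follows (a simplified illustration: `evolve_one_generation` and `update_nds` are hypothetical stand-ins for the NSGA-II evolution step and the non-dominated-set update described above, and the csp update is inlined from Eqs. (12)-(13)):

```python
import random

def mo_mcea(tasks, evolve_one_generation, update_nds,
            NP=50, D=10, G=25, delta=0.01, max_fes=10_000):
    """Sketch of the MO-MCEA main loop. `tasks` is a list of K multi-objective
    evaluation functions on the unified space [0,1]^D; `evolve_one_generation
    (pop, F)` returns (new_pop, improved); `update_nds` merges a population
    into a task's non-dominated set."""
    K = len(tasks)
    pop = [[random.random() for _ in range(D)] for _ in range(NP)]
    csp = [1.0 / K] * K                       # even initial probabilities
    nds = [[] for _ in range(K)]
    gen, fes, cid = 0, 0, 0
    while fes < max_fes:
        if gen % G == 0:                      # PCSS: re-pick the criterion
            cid = random.choices(range(K), weights=csp)[0]
            fes += NP                         # re-evaluation under new F_cid
        pop, improved = evolve_one_generation(pop, tasks[cid])
        fes += NP
        nds[cid] = update_nds(nds[cid], pop, tasks[cid])
        step = delta if improved else -delta  # APL: reward/punish the criterion
        csp = [p + step if k == cid else p - step / (K - 1)
               for k, p in enumerate(csp)]
        csp = [max(p, 0.1) for p in csp]
        csp = [p / sum(csp) for p in csp]
        gen += 1
    for k in range(K):                        # final pass: update every NDS_k
        nds[k] = update_nds(nds[k], pop, tasks[k])
    return nds
```

The structure mirrors the three repeated procedures of Algorithm 1: criterion determination, population evolution, and adaptive parameter learning.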

Experiment setup
In the experimental studies, six commonly used MO-MTOPs [53] with different similarities are adopted to investigate the proposed methods and algorithm. The properties of these problems are briefly introduced in Table 1, where the task similarity is evaluated by the Spearman's rank correlation coefficient [53]. In Table 1, complete intersection (CI) and partial intersection (PI) indicate that the Pareto optimal solutions of the two tasks are similar in all and in some dimensions, respectively, while high similarity (HS), medium similarity (MS), and low similarity (LS) mean that the two tasks have high, medium, and low similarity, respectively, as measured by the Spearman's rank correlation of their function landscapes. Based on this, the six problems can be categorized according to their intersection degree and task similarity. For example, the CI + HS problem has complete intersection and high similarity between its two internal tasks. Moreover, as shown in Table 1, different multi-objective tasks also have different Pareto fronts. Therefore, the composition of different intersections, similarity degrees, and Pareto fronts can provide an in-depth observation of how the proposed algorithm behaves in various situations. In addition, the dimensions of all tasks are set to 10, because high-dimensional tasks can be difficult to solve (e.g., they contain many local optima) and different algorithms may obtain similar (e.g., similarly poor) results on them, which is not ideal for algorithm comparison and analysis.
To evaluate the effectiveness and efficiency of the proposed algorithm, some state-of-the-art and well-performing algorithms with various characteristics are used in the comparisons. The adopted algorithms include MO-MFEA [3], MO-MFEA-II [13], and EMT with autoencoding for MO-MTOPs (denoted as MO-EMTA for simplicity) [41]. To make the comparisons fair, all these algorithms use the same widely used and representative MOEA (i.e., NSGA-II [44]) as the optimizer. By doing so, the differences between the proposed MO-MCEA and the three compared algorithms lie only in their approaches to handling MO-MTOPs (e.g., handling MO-MTOPs as MO-MCOPs or via the multifactorial-based approach). Moreover, an NSGA-II integrated with the single-task evolutionary optimization paradigm (i.e., solving each task separately and independently) is also adopted as a baseline, denoted as MO-STEA in the following. All algorithm settings are kept consistent with the original papers. The NSGA-II operator settings are the same as those used in the existing EMO-MTO literature [13]. Moreover, the G and Δ in MO-MCEA are set to 25 and 0.01, respectively. In addition, the population size (i.e., the individual budget) for each task is set to 50. Therefore, the total population size of both MO-MCEA and MO-STEA is 50, while the total population sizes of MO-MFEA, MO-MFEA-II, and MO-EMTA are all 100.
In the experiments, the maximum number of available FEs is 1 × 10^4 in total (i.e., 5 × 10^3 for each task) for each independent run of all algorithms. To reduce statistical errors, each algorithm is run 30 times independently on each MO-MTOP and the results are collected for the comparisons, as suggested by the literature [53].

Evaluation metrics
To evaluate the performance of MO-MCEA, four evaluation metrics are adopted in the experimental studies and comparisons. The first two metrics are the mean and standard deviation of the optimization results of each algorithm for each task over 30 runs. As the tasks are multi-objective optimization tasks, the widely used inverted generational distance (IGD) indicator [43] for multi-objective optimization is adopted to evaluate the performance of an algorithm on each task. The IGD value is calculated with the set of objective vectors obtained by an algorithm A (denoted as P_A) and a set of objective vectors uniformly distributed over the true PF (refer to "Multi-objective multi-task optimization") of a multi-objective task (denoted as P*), which is mathematically defined as

IGD(P_A, P^*) = \frac{1}{|P^*|} \sum_{y \in P^*} \min_{x \in P_A} d(x, y),

where d(x, y) is the Euclidean distance between x and y in the objective space, and |P*| is the number of vectors in P*. In general, the smaller IGD(P_A, P*) is, the better algorithm A is, because a smaller IGD indicates that P_A is closer to P* or covers more of P*. In addition, if |P*| is large enough to represent the PF, IGD(P_A, P*) can measure both the convergence and diversity of the nondominated solutions obtained by A. In other words, in this paper, the first two metrics are the mean and standard deviation of the IGD values obtained by each algorithm for each task over 30 runs, respectively. Furthermore, the third metric is the Wilcoxon rank-sum test with a significance level α = 0.05 [43] for algorithm comparisons. Specifically, based on the Wilcoxon rank-sum test, the symbols "+", "≈", and "−" are used to indicate that the proposed MO-MCEA performs significantly better than, similar to, and significantly worse than the compared algorithm, respectively.
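As a concrete illustration, the IGD indicator defined above can be computed as follows. This is a minimal sketch; the function name and array-based interface are our own, not from the original implementation.

```python
import numpy as np

def igd(P_A, P_star):
    """IGD: for each reference vector y in P_star, take the Euclidean
    distance to its nearest vector in P_A, then average over P_star."""
    P_A = np.asarray(P_A, dtype=float)
    P_star = np.asarray(P_star, dtype=float)
    # pairwise distance matrix of shape (|P_star|, |P_A|)
    d = np.linalg.norm(P_star[:, None, :] - P_A[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```

An IGD of 0 means every reference point on the true PF is matched exactly by the obtained set; the value grows as the obtained front drifts away from the PF or leaves parts of it uncovered.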
The fourth metric is the mean standard score (MSS) designed for comparing algorithms on MO-MTOPs [53]. This metric is defined as follows. Assuming that there are N algorithms denoted as A_1, A_2, …, A_N for an MO-MTOP with K multi-objective minimization tasks T_1, T_2, …, T_K, and that each algorithm runs for R repetitions, the MSS of algorithm A_i is defined as

MSS_i = \frac{1}{K \times R} \sum_{k=1}^{K} \sum_{r=1}^{R} \widetilde{IGD}^{(i,k)}_r,

where \widetilde{IGD}^{(i,k)}_r is the normalized IGD value obtained by algorithm A_i for task T_k in the r-th independent run. The normalization process can be written as

\widetilde{IGD}^{(i,k)}_r = \frac{IGD^{(i,k)}_r - u_k}{\sigma_k},

where IGD^{(i,k)}_r is the original IGD value obtained by A_i on task T_k in the r-th independent run, and u_k and σ_k represent the mean and standard deviation of the IGD values for task T_k over all R repetitions of all N algorithms, i.e., the mean and standard deviation of the N × R IGD results.
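The MSS definition above can be sketched compactly with array operations. This is an illustrative implementation under the stated definition; the array layout (algorithm × task × run) is our own convention.

```python
import numpy as np

def mean_standard_score(igd_results):
    """igd_results: array of shape (N, K, R) holding the IGD of algorithm i
    on task k in run r. Returns the MSS of each algorithm (lower is better)."""
    igd_results = np.asarray(igd_results, dtype=float)
    # per-task mean u_k and std sigma_k over all N * R results
    u = igd_results.mean(axis=(0, 2), keepdims=True)
    s = igd_results.std(axis=(0, 2), keepdims=True)
    normalized = (igd_results - u) / s
    # average the normalized IGD over all K tasks and R runs
    return normalized.mean(axis=(1, 2))
```

Because each task's IGD values are standardized before averaging, no single task with large raw IGD magnitudes dominates the score, which is why the MSS is suited to cross-task comparison.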

Comparisons with state-of-the-art algorithms
To investigate the performance of the proposed MO-MCEA, this part compares it with four algorithms, including the state-of-the-art and well-performing MO-MTO algorithms MO-MFEA [3], MO-MFEA-II [13], and MO-EMTA [41], and the single-task algorithm MO-STEA as the baseline. The comparison results in Table 2 (where the bold values denote the best results) show the great efficiency of MO-MCEA. As shown in Table 2 and Fig. 3, MO-MCEA performs significantly better on the three complete intersection problems with different task similarities, i.e., MO-MTOP1 to MO-MTOP3. This is consistent with the analysis given in Sect. 3.1 that treating the MO-MTOP as an MO-MCOP is reasonable and effective when the Pareto optimal sets of different tasks share similarities. Overall, the comparison results have verified the efficiency of MO-MCEA.

Ablation studies for component analysis
To perform the component analysis of MO-MCEA, we compare it with its variants that do not use PCSS or APL, denoted as MO-MCEA-w/o-PCSS and MO-MCEA-w/o-APL, respectively. To be specific, MO-MCEA-w/o-PCSS selects the multiple objective functions as the criterion in sequential order in a round-robin fashion, while MO-MCEA-w/o-APL does not update the probability parameter for determining the criterion (i.e., each of the multiple objective functions has the same probability of being randomly selected as the criterion in the PCSS). Table 3 gives the comparison results, which indicate the contributions of both PCSS and APL.
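The difference between the two ablated variants can be illustrated with a small sketch. The function names are hypothetical, and the actual APL update rule is omitted here, as its details are not restated in this section.

```python
import random

def round_robin_criterion(generation, num_tasks):
    """MO-MCEA-w/o-PCSS: cycle through the tasks' criteria in fixed order."""
    return generation % num_tasks

def probabilistic_criterion(probs, rng=random):
    """PCSS: sample the criterion (task) index from a probability vector.
    In MO-MCEA-w/o-APL this vector stays uniform; with APL it is adapted
    from search feedback over the evolutionary stages."""
    r, acc = rng.random(), 0.0
    for k, p in enumerate(probs):
        acc += p
        if r < acc:
            return k
    return len(probs) - 1  # guard against floating-point round-off
```

For a two-task problem, MO-MCEA-w/o-PCSS alternates criteria 0, 1, 0, 1, …, whereas MO-MCEA-w/o-APL samples with the fixed uniform vector [0.5, 0.5], and the full MO-MCEA would keep reshaping that vector via APL.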

Experiments on MO-MTOP with more tasks
This part further investigates the problem-solving ability of the proposed MO-MCEA on MO-MTOPs with more than two tasks. For this, four additional problems with three tasks (i.e., MO-MTOP7 to MO-MTOP10) are developed based on MO-MTOP1 to MO-MTOP6, where the problem characteristics can be seen in Table 5. The experimental settings are the same as those in the previous experiments, with the FEs for each task set as 5 × 10^3. Experimental results of the five algorithms over 30 runs are compared in Table 6, where the bold values denote the best results.

Conclusion
In this paper, we have attempted to treat MO-MTOP as MO-MCOP and solve it efficiently. For this, we have provided an analysis of the rationality and benefit of treating MO-MTOP as MO-MCOP. Moreover, we have proposed the PCSS to select different multi-objective fitness functions as criteria in different evolutionary stages to evolve individuals. In addition, the APL method has been further proposed to learn the parameter in PCSS adaptively. Based on PCSS and APL, the complete algorithm framework called MO-MCEA has been developed for solving MO-MTOPs. To investigate the proposed methods and algorithm, experiments have been conducted on widely used MO-MTOP benchmarks. Also, four state-of-the-art and well-performing algorithms have been adopted in the comparisons to challenge the proposed MO-MCEA. The experimental results have shown the great effectiveness and efficiency of the proposed MO-MCEA, indicating that treating MO-MTOP as MO-MCOP can be a promising way to solve MO-MTOP more efficiently. For future work, the MO-MCEA, including the PCSS and APL, will be further improved and extended to solve more difficult MO-MTOPs (e.g., MO-MTOPs with different similarities and intersections) and more complex real-world MO-MTOPs, such as those with other difficult characteristics found in data-driven optimization problems [54,55], expensive optimization problems [56-58], multi-modal optimization problems [59], large-scale optimization problems [60,61], and many-objective optimization problems [62,63].

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.