Introduction

Workflows are commonly used to describe data processing applications from diverse fields, such as the Internet of Things and bioinformatics [1,2,3]. These workflows often comprise large numbers of data-dependent tasks that are both computation and data intensive, so executing them calls for powerful high-performance infrastructures. With substantial advantages, such as economies of scale, on-demand supply of resources, high elasticity, and reliability, cloud computing is attracting more and more enterprises and individuals to deploy their big data processing workflows [4, 5].

Workflow scheduling in cloud computing is a key technology for reducing both execution cost and makespan, thereby gaining more profits for cloud providers and ensuring the quality of service for cloud consumers [6]. The workflow scheduling problem involves determining the mapping from tasks to resources and the task order on each resource, and is a classic NP-complete problem [7, 8]. Moreover, the execution cost and makespan of workflow scheduling are two conflicting optimization objectives [9]. Multi-objective evolutionary algorithms have therefore become a popular means of searching for a set of compromise solutions within an acceptable time [10,11,12]. To solve the workflow scheduling problem in cloud computing, some studies design new evolution and selection operators to improve classical multi-objective evolutionary algorithms.

Over the past decade or so, designing efficient evolution operators to reproduce new solutions for the multi-objective workflow scheduling problem has attracted considerable research interest [13]. First, popular list-based workflow scheduling methods were embedded into the multi-objective evolutionary optimization framework as evolution operators [14,15,16]. Second, bio-inspired techniques, such as artificial neural networks [17,18,19], ant colony optimization [20], the firefly algorithm [21], particle swarm optimization [22,23,24], and grey wolf optimization [25], were adapted as evolution operators to solve the multi-objective workflow scheduling problem. Third, integrating heuristic rules with bio-inspired optimization techniques to reproduce offspring populations has become a popular approach. For instance, Choudhary et al. [26] combined the gravitational search method and the list-based workflow scheduling method to solve the bi-objective workflow scheduling problem in cloud computing. Hosseini et al. [27] merged simulated annealing and a task duplication strategy to optimize the makespan and execution cost of workflows. Mohammadzadeh et al. [28] integrated the antlion and grasshopper optimization algorithms to balance throughput, makespan, cost, and energy consumption of executing workflows in cloud platforms. Zhang et al. [29] enhanced the list-based workflow scheduling method with a local search mechanism to balance the makespan and energy consumption of workflow execution.

At the same time, some studies focused on designing selection operators to balance multiple conflicting objectives of workflow execution in cloud computing. For example, Zhou et al. [30] merged a fuzzy-dominance-based environmental selection with a list-based workflow scheduling method to minimize the execution cost and makespan of workflows in cloud computing. Kumar et al. [31] integrated the entropy weight mechanism into a multi-criteria decision-making framework to balance makespan, execution cost, reliability, and energy consumption. Ye et al. [32] improved a knee-point-driven evolutionary method to balance makespan, reliability, execution cost, and the mean duration of all workflow tasks. Pham et al. [33] focused on the volatility of spot cloud resources and improved the multi-objective evolutionary algorithm to trade off makespan and execution cost for workflows in cloud computing.

In the evolutionary optimization community, a multi-objective optimization problem is generally considered large-scale if it has at least one hundred decision variables [34]. Multi-objective cloud workflow scheduling involves hundreds or even thousands of decision variables and is therefore a typical large-scale multi-objective optimization problem. However, existing studies evolve all decision variables as a whole and allocate evolution opportunities to each variable equally, which results in low search efficiency.

Recent results in the evolutionary computation community demonstrate that cooperative coevolution [34, 35] has become a crucial and effective way to solve large-scale multi-objective optimization problems. In cooperative coevolution approaches, the decision variables are classified into multiple groups, and the groups are evolved in a round-robin manner [36, 37]. These static classification techniques work well when a problem's decision variables are fully or partially separable. However, this is not the case for multi-objective cloud workflow scheduling, whose decision variables are nonseparable because of the data dependencies among tasks.

Besides, multi-objective cloud workflow scheduling exhibits an imbalance among decision variables regarding their contributions to the optimization objectives. For instance, delaying the completion of a workflow task on a critical path [38] often successively delays the completion of many tasks, including its successors and other tasks executed after the delayed tasks and their successors, whereas slightly delaying a task on a non-critical path may not cause this chain reaction. This imbalance means that different decision variables should be given different evolution opportunities, which motivates us to design a decision-variable-contribution-based adaptive mechanism to dynamically adjust the variable grouping and allocate evolution opportunities during the evolution process. Our main contributions in this paper are as follows.

  • We define the contribution of a decision variable as the fitness improvement of the solution generated by perturbing this decision variable. Then, we try to dynamically measure the contribution of each decision variable and classify them according to their contributions.

  • We design an adaptive mechanism to dynamically allocate more evolution opportunities for the variable groups with more contributions to generate offspring solutions efficiently.

  • In the context of fifteen real-world workflows and the Amazon Elastic Compute Cloud, we compare the proposal with four state-of-the-art multi-objective cloud workflow scheduling algorithms. The results demonstrate the competitive performance of the proposal in simultaneously optimizing execution cost and makespan.

The remainder of this paper is organized as follows. The second section formulates the multi-objective workflow scheduling problem. The third section presents the proposed VCAES, followed by experimental verification in the fourth section. The final section concludes this paper.

Problem formulation

This section first describes the models of workflows and cloud resources, and then formulates the model for multi-objective workflow scheduling in cloud computing.

Workflow model

Without loss of generality, a workflow application is described by a Directed Acyclic Graph (DAG) whose vertices and directed edges represent workflow tasks and data dependencies, respectively. Formally, we construct the directed acyclic graph for a workflow application as \(\varPsi = \{T, D\}\), where \(T=\{t_1,t_2,\ldots ,t_n\}\) denotes the vertex set corresponding to the task set, and \(D\subseteq T\times T\) denotes the edge set corresponding to the data dependencies among tasks. The existence of an edge \(d_{i,j}\in D\) means that \(t_j\)’s start demands \(t_i\)’s output results. Task \(t_i\) is referred to as a direct predecessor of task \(t_j\), and \(t_j\) as a direct successor of \(t_i\). For a task \(t_i\), all its direct predecessors form the set \(P(t_i)\), and all its direct successors form the set \(S(t_i)\).

Figure 1 gives a visual example of a directed acyclic graph for a workflow with seven tasks, i.e., \(T=\{t_1,t_2,\ldots ,t_7\}\). The edge \(d_{1,2}\) denotes the data dependency from \(t_1\) to \(t_2\), meaning that \(t_2\)’s start has to wait for \(t_1\)’s output results. In Fig. 1, task \(t_6\) has the direct predecessor set \(P(t_6)=\{t_3,t_4\}\) and the direct successor set \(S(t_6)=\{t_7\}\).

Fig. 1 An example workflow of seven tasks
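To make the predecessor/successor notation concrete, the sketch below builds the two sets from an edge list in Python. The edge list is only a hypothetical fragment consistent with the relations stated above (\(d_{1,2}\), \(P(t_6)=\{t_3,t_4\}\), \(S(t_6)=\{t_7\}\)); it is not the complete structure of Fig. 1.

```python
from collections import defaultdict

# Hypothetical edge fragment consistent with the relations mentioned in the text;
# the full edge set of Fig. 1 is not reproduced here.
edges = [(1, 2), (3, 6), (4, 6), (6, 7)]

# Build P(t_i) (direct predecessors) and S(t_i) (direct successors).
P = defaultdict(set)
S = defaultdict(set)
for i, j in edges:          # edge d_{i,j}: t_j needs the output of t_i
    P[j].add(i)
    S[i].add(j)

print(P[6])  # {3, 4} -> P(t_6) = {t_3, t_4}
print(S[6])  # {7}    -> S(t_6) = {t_7}
```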

Cloud resource model

This paper targets the popular cloud paradigm of Infrastructure as a Service (IaaS), in which cloud providers offer multiple types of cloud resources on demand [39, 40]. Different resource types differ mainly in their charging prices and performance configurations, such as the number of CPU cores, memory size, and network bandwidth. Assuming that cloud platforms offer m resource types, we model them as \(\varGamma =\{1,2,\ldots ,m\}\), where \(\tau \in \varGamma \) denotes the \(\tau \)-th resource type. For a type \(\tau \), we employ \(\textrm{pr}(\tau )\) and \(\textrm{con}(\tau )\) to represent its price and configuration. A cloud resource of type \(\tau \) is then modeled as \(r_k^\tau =\{k, \textrm{pr}(\tau ), \textrm{con}(\tau )\}\), where k denotes the index of resource \(r_k^\tau \).

Following well-known cloud providers (e.g., Amazon EC2 and Alibaba Cloud ECS), this study adopts the pay-as-you-use charging basis. Under this rule, any consumer can rent any number of resources on demand and is charged according to the actual usage time. In general, cloud resources are charged by the number of billing periods, and a partial period is rounded up to a full one. For example, if the period length is 60.0 min, a usage of 60.01 min is charged as two billing periods.
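As a minimal illustration of this rounding rule (using the example values above), the number of billing periods can be computed with a ceiling operation:

```python
import math

def billing_periods(usage_minutes: float, period_minutes: float = 60.0) -> int:
    """Partial periods are rounded up to a full billing period."""
    return math.ceil(usage_minutes / period_minutes)

print(billing_periods(60.01))  # 2 billing periods for 60.01 min
```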

Multi-objective scheduling of cloud workflows

Since cloud resources are available on demand, we build a resource pool based on the maximum resource requirements for running a workflow. Assuming the maximum parallelism of the workflow is p, the resource pool includes p resources of each type. We describe the resource pool as \(R=\left\{ r_1^1,r_2^1,\ldots ,r_p^1,r_{p+1}^2,r_{p+2}^2,\ldots ,r_{2\cdot p}^2,\ldots ,r_{m\cdot p}^m \right\} \).

The decision vector \({\textbf {x}}=\{x_1,x_2, \ldots ,x_n\}\) represents the mapping from workflow tasks to cloud resources, where the value of decision variable \(x_i\) is decoded as the index of the cloud resource mapped to the i-th task. The value range of each decision variable is thus an integer from 1 to \(m\cdot p\).
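A small sketch of this encoding under the pool layout R defined above: the resource index \(x_i\) determines both the concrete resource and its type, since resources 1 to p are of type 1, p+1 to 2p of type 2, and so on. The values of m, p, and the vector x below are illustrative assumptions.

```python
# Decoding a decision vector under the pool layout R defined above.
m, p = 3, 4                 # assumed: 3 resource types, maximum parallelism 4
x = [5, 1, 12, 5]           # x_i = index of the resource mapped to task t_i

def resource_type(k: int, p: int) -> int:
    """Resources 1..p are type 1, p+1..2p type 2, ..., i.e. type = ceil(k / p)."""
    return (k - 1) // p + 1

for i, k in enumerate(x, start=1):
    print(f"t_{i} -> resource r_{k} of type {resource_type(k, p)}")
# t_1 and t_4 share resource r_5 (type 2), so they are serialized on it.
```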

Given a decision vector, assume that task \(t_i\) is mapped to resource \(r_k^\tau \). The start time \(\textrm{s}t_{i,k}\) of task \(t_i\) is the later of the time at which all its input data have been received and the time at which the mapped resource becomes available.

On resource \(r_k^\tau \), we denote the set of tasks executed before task \(t_i\) as follows:

$$\begin{aligned} B_i = \left\{ t_p|I(t_p) < I(t_i)\right\} , \end{aligned}$$
(1)

where \(I(t_p)\) indicates \(t_p\)’s order number on resource \(r_k^\tau \).

Then, the start time \(\textrm{s}t_{i,k}\) of task \(t_i\) on cloud resource \(r_k^\tau \) can be described as follows:

$$\begin{aligned} \textrm{s}t_{i,k} = \max \left\{ \max _{t_b\in B_i}\textrm{f}t_{b,k}, \max _{t_p\in P(t_i)}\left\{ \textrm{f}t_{p,*}+\textrm{d}t_{p,i}\right\} \right\} , \end{aligned}$$
(2)

where \(\textrm{f}t_{b,k}\) indicates \(t_b\)’s finish time on resource \(r_k^\tau \), \(\textrm{f}t_{p,*}\) indicates the finish time of task \(t_p\), and \(\textrm{d}t_{p,i}\) indicates the data transfer time from \(t_p\) to \(t_i\).

Before scheduling, task \(t_i\)’s execution time \(\textrm{e}t_{i,k}\) on cloud resource \(r_k^\tau \) can be predicted by the computation length of task \(t_i\) and the CPU frequency of the mapped resource \(r_k^\tau \). The relationship among \(\textrm{s}t_{i,k}\), \(\textrm{e}t_{i,k}\), and \(\textrm{f}t_{i,k}\) is described as follows:

$$\begin{aligned} \textrm{f}t_{i,k} = \textrm{s}t_{i,k} + \textrm{e}t_{i,k}. \end{aligned}$$
(3)

Given a decision vector, the set of all tasks mapped to cloud resource \(r_k^\tau \) can be formulated as:

$$\begin{aligned} T_k = \{t_i|x_i = k, i\in \{1,2,\ldots ,n\}\}. \end{aligned}$$
(4)

With the task set \(T_k\), the start time \(\textrm{u}t_k\) and end time \(\textrm{n}t_k\) of renting resource \(r_k^\tau \) can be computed as follows:

$$\begin{aligned} {\begin{matrix} &{} \textrm{u}t_k = \min _{t_i\in T_k}\left\{ \textrm{s}t_{i,k} - \max _{t_p\in P(t_i)}\textrm{d}t_{p,i}\right\} , \\ &{} \textrm{n}t_k = \max _{t_i\in T_k}\left\{ \textrm{f}t_{i,k} + \max _{t_s\in S(t_i)}\textrm{d}t_{i,s}\right\} . \end{matrix}} \end{aligned}$$
(5)

Based on the above analysis, the first optimization objective, i.e., minimizing the execution cost, can be formulated as follows:

$$\begin{aligned} \text{ Min } f_1(\varvec{x}) = \sum _{k:\, T_k\ne \emptyset } \textrm{pr}(\tau )\cdot \left\lceil \frac{\textrm{n}t_k - \textrm{u}t_k}{C} \right\rceil , \end{aligned}$$
(6)

where the summation ranges over all resources \(r_k^\tau \) with at least one mapped task, and C indicates the length of a billing period for cloud resources.

The second optimization objective of this paper is to minimize the workflow’s makespan, which corresponds to the maximum finish time over all tasks in the workflow. It can be formulated as follows:

$$\begin{aligned} \text{ Min } f_2(\varvec{x}) = \max _{t_i\in T} \textrm{f}t_{i,*}. \end{aligned}$$
(7)
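To make the two objectives concrete, the following sketch evaluates the execution cost and makespan for a schedule whose per-task finish times and per-resource rental intervals \((\textrm{u}t_k, \textrm{n}t_k)\) are assumed to have been computed already via Eqs. (2)–(5); all concrete values are illustrative.

```python
import math

def execution_cost(rentals, price_of, C):
    """Execution cost: sum over rented resources of price x number of billing periods.
    rentals: {k: (ut_k, nt_k)} for resources with at least one mapped task;
    price_of: {k: pr(tau)} price of resource k's type; C: billing period length."""
    return sum(price_of[k] * math.ceil((nt - ut) / C)
               for k, (ut, nt) in rentals.items())

def makespan(finish_times):
    """Eq. (7): maximum finish time over all workflow tasks."""
    return max(finish_times.values())

# Toy data (assumed values, for illustration only).
rentals = {1: (0.0, 130.0), 5: (20.0, 80.0)}
price_of = {1: 0.05, 5: 0.10}
finish_times = {1: 40.0, 2: 90.0, 3: 130.0}
print(round(execution_cost(rentals, price_of, C=60.0), 4))  # 0.05*3 + 0.10*1 = 0.25
print(makespan(finish_times))                                # 130.0
```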

Thus, the model of the multi-objective workflow scheduling problem in cloud computing can be summarized as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} \text{ Min } &{} \varvec{f}(\varvec{x})= \left[ f_1(\varvec{x}), f_2(\varvec{x})\right] ,\\ \text{ S.t. } \\ &{} \varvec{x} \in \{1,2,\ldots ,m\cdot p\}^n. \\ \end{array} \right. \end{aligned}$$
(8)

Pareto-dominance has been widely employed to compare solutions in the multi-objective optimization field.

Pareto-dominance: Assume \({\textbf {x}}_1\) and \({\textbf {x}}_2\) are two feasible solutions. \({\textbf {x}}_1\) is said to dominate \({\textbf {x}}_2\) (denoted as \({\textbf {x}}_1\prec {\textbf {x}}_2\)) if and only if \({\textbf {x}}_1\) is not inferior to \({\textbf {x}}_2\) on both objectives (i.e., \(f_j({\textbf {x}}_1)\le f_j({\textbf {x}}_2), \forall j\in \{1,2\}\)) and \({\textbf {x}}_1\) is better than \({\textbf {x}}_2\) on at least one objective (i.e., \(f_j({\textbf {x}}_1)< f_j({\textbf {x}}_2), \exists j\in \{1,2\}\)).

Pareto-optimal solution: A solution \({\textbf {x}}^*\in \{1,2,\ldots ,m\cdot p\}^n\) is Pareto-optimal if there exists no feasible solution dominating it.

Pareto Set/Front: All the Pareto-optimal solutions constitute the Pareto Set (PS) in the decision space, and their objective vectors constitute the Pareto Front (PF) in the objective space.
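For the bi-objective model in Eq. (8), Pareto-dominance reduces to a component-wise comparison of the two objective values; a minimal sketch:

```python
def dominates(f1, f2):
    """True if objective vector f1 Pareto-dominates f2 (minimization):
    no worse in every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(f1, f2)) and any(a < b for a, b in zip(f1, f2))

print(dominates((0.7, 0.8), (1.3, 1.0)))  # True: better cost and makespan
print(dominates((0.7, 1.2), (1.3, 1.0)))  # False: the two solutions are incomparable
```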

Algorithm design

Given a workflow scheduling solution, the importance of each workflow task varies greatly. For instance, adjusting the mapping of a critical task to a resource often successively affects the execution of many tasks, including its successors and other tasks executed after these tasks and their successors, whereas adjusting the mapping of a non-critical task may have no impact on the execution cost and makespan of the workflow. Moreover, the importance of each workflow task varies from solution to solution. Consequently, the decision variables corresponding to different workflow tasks contribute unevenly to the optimization objectives. To deal with the large-scale decision variables in cloud workflow scheduling, the VCAES incorporates a novel cooperative coevolution (CC) mechanism that dynamically measures the contributions of decision variables and adaptively allocates evolution opportunities to each group of decision variables based on their contributions. The proposed VCAES follows the framework of traditional multi-objective evolutionary optimization, including initialization, a reproduction operator, and a selection operator, as shown in Algorithm 1.

Algorithm 1 The overall framework of VCAES

As illustrated in Algorithm 1, the inputs of the proposed VCAES are the multi-objective cloud workflow scheduling problem, the population size, the memory length for recording the variable contributions, and the number of decision variables in one group. Once the VCAES reaches the termination condition, it outputs the up-to-date population.

In the initialization stage, a population is generated randomly (Line 1). Next, an \(l\times n\) matrix \({\textbf {M}}\) is initialized to collect the contribution of each variable over the past l iterations (Line 2); the element in row j and column i represents the contribution of the i-th variable in the previous j-th iteration. Also, a set of uniformly distributed reference vectors is initialized to assist in calculating variable contributions (Line 3). In addition, the number of iterations for cooperative coevolution during each generation is initialized (Line 4). These iterations will be allocated to each group of variables in proportion to the overall contribution of the variables in the corresponding group.

After the initialization stage, the VCAES enters the main loop. During each generation, the proposed adaptive cooperative coevolution mechanism is triggered to distribute the decision variables into groups and allocate evolution opportunities to each group according to the variable contributions (Line 6). Next, the memory matrix of variable contributions is updated (Lines 7–14). It is worth noting that the decision variables in cloud workflow scheduling are related to each other; therefore, the VCAES also generates a new population by evolving all variables in each generation (Line 15). After that, the non-dominated sorting and elitist-preserving method of NSGA-II [41] is employed to select an offspring population P from the combined population \(P\bigcup Q\) (Line 16).
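The contribution memory can be pictured as a sliding window over the last l generations. The sketch below is one plausible reading of the memory update (prepend the newest per-variable contributions and drop the oldest row), not the authors' exact implementation; the dimensions are illustrative.

```python
import numpy as np

l, n = 5, 200                      # memory length, number of decision variables
M = np.zeros((l, n))               # M[j, i]: contribution of variable i, j generations ago

def update_memory(M, latest_contrib):
    """Prepend the newest contributions as row 0 and drop the oldest row."""
    return np.vstack([latest_contrib, M[:-1]])

M = update_memory(M, np.random.rand(n))   # contributions measured in the current generation
total_contrib = M.sum(axis=0)             # H(i): total contribution over the last l generations
```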

Before introducing the proposed adaptive cooperative co-evolution mechanism, we define and illustrate the variable contributions.

Suppose Q is a population generated by evolving the decision variables in group G(i) while fixing the other decision variables; the contribution of each variable in G(i) is defined as:

$$\begin{aligned} K(j) = \sum _{\varvec{q}\in Q} \textrm{FI}(\varvec{q}), \forall j \in G(i), \end{aligned}$$
(9)

where \(\textrm{FI}(\varvec{q})\) denotes the fitness improvement of solution \(\varvec{q}\).

For a set of reference vectors V, the one associated with solution \(\varvec{q}\) is defined as \(\varvec{v}^* = \arg \min _{\varvec{v}\in V}\langle \varvec{f}(\varvec{q}),\varvec{v} \rangle \), where \(\langle \varvec{f}(\varvec{q}),\varvec{v} \rangle \) represents the acute angle between the two vectors. Suppose \(\varvec{p}\) is the solution associated with the reference vector \(\varvec{v}^*\); the fitness improvement of solution \(\varvec{q}\) is then calculated as follows:

$$\begin{aligned} \textrm{FI}(\varvec{q}) = \max \{0, \textrm{Fit}(\varvec{p},\varvec{v}^*)-\textrm{Fit}(\varvec{q},\varvec{v}^*)\}, \end{aligned}$$
(10)

where \(\textrm{Fit}(\varvec{q},\varvec{v})\) denotes the fitness value of solution \(\varvec{q}\) with respect to the reference vector \(\varvec{v}\), which can be calculated as

$$\begin{aligned} \textrm{Fit}(\varvec{q},\varvec{v}) = \Vert \varvec{f}'\Vert \cdot \left( \cos \langle \varvec{f}', \varvec{v}\rangle + \sin \langle \varvec{f}', \varvec{v}\rangle \right) , \end{aligned}$$
(11)

where \(\varvec{f}' = \varvec{f}(\varvec{q}) - \varvec{z}\) denotes the translated objective vector, and \(\varvec{z}\) denotes the ideal point.

Fig. 2 An illustrative example of calculating the contribution

An intuitive example of calculating the contribution is given in Fig. 2. Suppose the previous solution \(\varvec{p}\) and the new solution \(\varvec{q}\) are associated with the reference vector \(\varvec{v}=(0.5,0.5)\), and their objective vectors are \(\varvec{f}(\varvec{p})=(1.3,1.0)\) and \(\varvec{f}(\varvec{q})=(0.7,0.8)\), respectively. The ideal point is assumed to be \({\textbf {z}} =(0,0)\). Based on the definition in (11), the fitness of these two solutions is \(\textrm{Fit}(\varvec{p},\varvec{v})=1.8385\) and \(\textrm{Fit}(\varvec{q},\varvec{v})=1.1314\). Then, the fitness improvement of solution \(\varvec{q}\) is \(\textrm{FI}(\varvec{q})=\max \{0, 1.8385-1.1314\}=0.7071\). If solution \(\varvec{q}\) is generated by evolving variables \(\{1,3,5\}\), then the contributions of these variables are \(K(1)=K(3)=K(5)=0.7071\).
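The sketch below reproduces this numerical example, computing Eq. (11) for both solutions and the fitness improvement of Eq. (10):

```python
import numpy as np

def fitness(f, v, z):
    """Eq. (11): ||f'|| * (cos<f', v> + sin<f', v>), with f' = f - z."""
    fp = np.asarray(f, dtype=float) - np.asarray(z, dtype=float)
    cos = fp @ v / (np.linalg.norm(fp) * np.linalg.norm(v))
    sin = np.sqrt(max(0.0, 1.0 - cos ** 2))
    return np.linalg.norm(fp) * (cos + sin)

v = np.array([0.5, 0.5])
z = (0.0, 0.0)
fit_p = fitness((1.3, 1.0), v, z)              # ~1.8385
fit_q = fitness((0.7, 0.8), v, z)              # ~1.1314
FI_q = max(0.0, fit_p - fit_q)                 # ~0.7071, Eq. (10)
print(round(fit_p, 4), round(fit_q, 4), round(FI_q, 4))
```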

Algorithm 2 gives the pseudo-code of the proposed adaptive cooperative coevolution mechanism, which dynamically places the variables with higher contributions into the same groups and assigns them more evolution opportunities to accelerate population convergence.

Algorithm 2 Function AdaptiveCoEvolution()

As illustrated in Algorithm 2, the inputs of the function AdaptiveCoEvolution() are the current population, the contribution of each variable during the previous l generations, the set of reference vectors, the population size, and the number of generations for cooperative coevolution search. Then, the outputs of this function are the updated population and the contribution of each variable.

First, function AdaptiveCoEvolution() calculates the total contribution of each variable over the previous l generations (Lines 1–4), where the notation H(i) represents the total contribution of the i-th variable. Then, it sorts the variables in non-ascending order of their total contributions (Line 5) and distributes them into a series of groups (Line 6). In this way, variables with similar contributions fall into the same group, so variables with larger contributions are grouped together and can be given more evolution opportunities.
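A minimal sketch of this grouping step, assuming the total contributions H(i) have already been computed from the memory matrix: variables are sorted by contribution in non-ascending order and cut into groups of a fixed size, so high-contribution variables end up together. The group size and contribution values are illustrative.

```python
import numpy as np

def group_variables(H, group_size):
    """Sort variable indices by total contribution (descending) and split into groups."""
    order = np.argsort(-np.asarray(H))             # non-ascending contributions
    return [order[i:i + group_size].tolist()
            for i in range(0, len(order), group_size)]

H = [0.9, 0.1, 0.7, 0.05, 0.3, 0.0]                # toy contributions of 6 variables
print(group_variables(H, group_size=2))            # [[0, 2], [4, 1], [3, 5]]
```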

After that, function AdaptiveCoEvolution() successively evolves each group of variables and measures their contributions. For a group of variables, the number of evolutionary iterations is calculated based on the sum of their contributions (Line 9). During each iteration, the function performs the reproduction operator on the corresponding variables to generate a new population (Line 11) and performs the selection operator to select the offspring population (Line 12). After evolving a group of variables, the function measures their contributions (Line 14) and updates the contributions of the corresponding variables (Lines 15–17), where the notation K(i) represents the contribution of the i-th variable.
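One plausible reading of the allocation in Line 9 is a proportional split of the per-generation iteration budget, as sketched below; the exact allocation rule and the rounding are assumptions.

```python
def allocate_iterations(groups, H, total_iterations):
    """Give each variable group a share of iterations proportional to its total contribution."""
    group_contrib = [sum(H[i] for i in g) for g in groups]
    total = sum(group_contrib) or 1.0              # avoid division by zero early on
    return [max(1, round(total_iterations * c / total)) for c in group_contrib]

groups = [[0, 2], [4, 1], [3, 5]]
H = [0.9, 0.1, 0.7, 0.05, 0.3, 0.0]
print(allocate_iterations(groups, H, total_iterations=20))  # e.g. [16, 4, 1]
```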

Regarding function AdaptiveCoEvolution(), it costs \(O(n\cdot l)\) to calculate the contribution of each decision variable (Lines 1–4, Algorithm 2). The time complexity of sorting the decision variables is \(O(n\log n)\) (Line 5, Algorithm 2). The function takes \(O(N\cdot n)\) to reproduce an offspring population (Line 11, Algorithm 2). According to the analysis in [41], the time complexity of the environmental selection is \(O(N^2)\) (Line 12, Algorithm 2). It therefore takes \(O(n\cdot (N\cdot n + N^2))=O(n\cdot N\cdot (n + N))=O(n^2\cdot N)\) to adaptively evolve each group of decision variables (Lines 8–18, Algorithm 2). Thus, the time complexity of function AdaptiveCoEvolution() is \(O \left( n\cdot l + n\log n + n^2\cdot N \right) = O(n^2\cdot N)\), since l is often less than n.

Regarding the algorithm VCAES, the time complexity of updating the memory matrix is \(O(n\cdot l)\) (Lines 7–11, Algorithm 1). The time complexities of reproducing an offspring population and selecting elitist solutions are \(O(N\cdot n)\) and \(O(N^2)\) (Lines 15-16, Algorithm 1), respectively. Thus, the time complexity of the VCAES during one generation is \(O \left( n^2\cdot N + n\cdot l + N\cdot n + N^2\right) = O(n^2\cdot N)\).

Numerical experiments

To investigate the performance of the proposed VCAES, we compare it with four representative multi-objective cloud workflow scheduling algorithms: MOELS [42], EMS-C [9], WOF [43], and LSMOF [44]. MOELS and EMS-C follow the framework of NSGA-II [41] and incorporate new reproduction operators that generate offspring populations by evolving all the variables. WOF and LSMOF are representative large-scale multi-objective evolutionary algorithms based on problem transformation.

Experimental setting

Eight types of workflows from different application domains, i.e., Inspiral (gravitational physics), CyberShake (earthquake science), Epigenomics (biology), Montage (astronomy), Sipht (bioinformatics), BLAST (bioinformatics), Cycles (agroecosystems), and Seismology (seismology), have been widely used in evaluating cloud workflow scheduling algorithms. We select multiple task sizes of each workflow type in the experiments. Besides, the DAG diagrams of some workflow instances with around 30 tasks are illustrated in Fig. 3. These workflow instances cover various complicated structures, including in-tree, out-tree, fork-join, pipeline, and mixtures thereof. For more details on these workflows, please refer to the Pegasus repository.

Fig. 3 DAG diagrams of workflows with about 30 tasks

Five types of resource instances from Amazon EC2, i.e., t3.nano, t3.micro, t3.small, t3.medium, and t3.large, are employed to simulate the cloud environments. We summarize the parameters of the five resource types in Table 1. Besides, we set the length of a billing period to 60 s and the bandwidth among resource instances to 5.0 Gbps.

Table 1 Parameters for five types of cloud resource

The hypervolume [45] metric measures the quality of a population in terms of both convergence and diversity and has been frequently used to evaluate the performance of multi-objective evolutionary algorithms. Assume \(\varvec{r} =(r_1,r_2)\) is a reference point. The hypervolume value of a population P, corresponding to the volume between the reference point and the objective vectors of the solutions in P, can be calculated as follows.

$$\begin{aligned} \begin{array}{cc} HV(P) = L \left( \bigcup _{p\in P}[f_1(p), r_1]\times [f_2(p), r_2]\right) , \end{array} \end{aligned}$$
(12)

where \(L(\cdot )\) represents the Lebesgue measure.
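For the bi-objective case, the hypervolume of Eq. (12) can be computed exactly by sorting the non-dominated objective vectors along one axis and summing rectangle areas; a minimal sketch (minimization, assuming every point is dominated by the reference point):

```python
def hypervolume_2d(points, ref):
    """2-D hypervolume (minimization): area dominated by `points` and bounded by `ref`.
    Assumes every point p satisfies p[0] <= ref[0] and p[1] <= ref[1]."""
    # Keep only non-dominated points, sorted by the first objective.
    front, best_f2 = [], float("inf")
    for f1, f2 in sorted(points):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    hv, prev_f1 = 0.0, ref[0]
    for f1, f2 in reversed(front):                 # sweep from large f1 to small f1
        hv += (prev_f1 - f1) * (ref[1] - f2)
        prev_f1 = f1
    return hv

print(hypervolume_2d([(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)], ref=(4.0, 4.0)))  # 6.0
```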

Referring to the settings in MOELS and EMS-C, we set the population size of all five algorithms to 100. The stopping condition for all five algorithms is a maximum of 500,000 fitness evaluations.

Each of the five algorithms is independently repeated 31 times on each workflow instance to mitigate random effects. All the experiments are run on a PC with an Intel Core i7-6500U CPU @ 2.50 GHz (two cores), 8.00 GB RAM, and Windows 10.

Comparison results

Tables 2 and 3 report the average and standard deviation (in brackets) of the hypervolume values obtained by MOELS, EMS-C, WOF, LSMOF, and VCAES when scheduling the 27 workflow instances to cloud resources. For each workflow instance, the largest hypervolume value among the five algorithms is highlighted in bold. Besides, the Wilcoxon rank-sum test with \(\alpha =0.05\) is used to examine the significance of the differences between each comparison algorithm and the proposed VCAES in terms of the hypervolume metric. The marks −, \(+\), and \(\approx \) indicate that the comparison algorithm is significantly worse than, significantly better than, and statistically similar to VCAES, respectively.
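The significance marks can be reproduced with a standard rank-sum test; the sketch below uses scipy, with placeholder hypervolume samples standing in for the 31 runs per algorithm.

```python
from scipy.stats import ranksums

def compare(hv_baseline, hv_vcaes, alpha=0.05):
    """Return '-', '+', or '≈' for a baseline vs. VCAES on one workflow instance."""
    _, p = ranksums(hv_baseline, hv_vcaes)
    if p >= alpha:
        return "≈"                                  # no significant difference
    mean_b = sum(hv_baseline) / len(hv_baseline)
    mean_v = sum(hv_vcaes) / len(hv_vcaes)
    return "+" if mean_b > mean_v else "-"

# 31 independent runs per algorithm (placeholder values).
print(compare([0.71, 0.69, 0.70] * 10 + [0.70], [0.75, 0.74, 0.76] * 10 + [0.75]))  # '-'
```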

Table 2 Comparison results for the five algorithms on 15 workflows in terms of the hypervolume metric
Table 3 Comparison results for the five algorithms on 15 workflows in terms of the hypervolume metric

The comparison results in Tables 2 and 3 show that the proposed VCAES performs significantly better than the four baseline algorithms. Specifically, the proposal generates larger hypervolume values than MOELS, EMS-C, WOF, and LSMOF on 20, 22, 26, and 25 out of the 27 workflow instances, respectively.

An interesting phenomenon is that the more decision variables a problem has, the more pronounced the advantages of the proposed VCAES. For example, on the workflow instances with about 30 tasks, the VCAES obtains larger hypervolume values than LSMOF on only 3 out of 5 instances, whereas on the workflow instances with 100 and 1000 tasks, the proposed algorithm significantly outperforms all four comparison algorithms. Similar to the baselines MOELS and EMS-C, the proposed VCAES follows the framework of NSGA-II; different from them, the VCAES integrates a new adaptive strategy based on variable contributions. Although the two problem-transformation-based large-scale multi-objective evolutionary algorithms, WOF and LSMOF, exhibit competitive performance on continuous problems, their performance on multi-objective workflow scheduling is far inferior to that of the proposed VCAES. These comparison results demonstrate that the proposed adaptive mechanism can effectively improve the performance of evolutionary algorithms in solving multi-objective cloud workflow scheduling problems with large-scale decision variables.

Fig. 4 Population distributions of the five algorithms on solving different workflow scheduling problems

To intuitively compare the convergence and diversity of the five multi-objective workflow scheduling algorithms, Fig. 4 illustrates the distributions of their output populations on the Inspiral, Montage, Sipht, and Cycles instances with large-scale tasks.

As illustrated in Fig. 4a, on Inspiral with 100 tasks, the distribution of VCAES’s output population is better than those of three baseline algorithms, i.e., MOELS, EMS-C, and WOF; more specifically, the output solutions of VCAES dominate those of the three algorithms. Although the diversity of the solutions obtained by LSMOF is superior to that of the VCAES when the cost is less than 10, it is far inferior to the VCAES over the rest of the range. In sum, the proposed VCAES is superior to each baseline algorithm in terms of convergence and diversity. On Montage with 100 tasks, the proposed VCAES has similar advantages over the comparison algorithms, as shown in Fig. 4b.

As can be observed from Fig. 4c, on Sipht with 100 tasks, the solutions obtained by the VCAES dominate most solutions obtained by the comparison algorithms, which means that the VCAES outperforms all four baselines in both convergence and diversity. On Cycles with 657 tasks, the proposed VCAES has slight advantages over the comparison algorithms in terms of convergence and diversity, as shown in Fig. 4d.

Performance influence of different mechanisms

The VCAES mainly contains two new mechanisms: one dynamically classifies the decision variables, and the other adaptively allocates evolution opportunities to each constructed group of decision variables. To measure the respective contributions of the two mechanisms to the overall performance, we construct three variants of the VCAES for comparison. The first variant, denoted as Variant 1, replaces the decision variable classification mechanism with a random one and iterates over the groups of decision variables in a round-robin manner. The second variant, denoted as Variant 2, replaces the decision variable classification mechanism with a random one but still adaptively allocates evolution opportunities to each group. The third variant, denoted as Variant 3, removes the adaptive evolution opportunity allocation mechanism.

Fig. 5 Performance influence of the two proposed components

The main difference between the VCAES and Variant 3 is that Variant 3 lacks the adaptive evolution opportunity allocation mechanism, so the improvement in the hypervolume value of the VCAES relative to Variant 3 can be attributed to this allocation mechanism. Similarly, the improvement of the VCAES relative to Variant 2 can be attributed to the decision variable classification mechanism. The comparison results in Fig. 5 illustrate that both proposed mechanisms contribute to the overall performance of the VCAES, with the decision variable classification mechanism contributing more. The improvement of the VCAES relative to Variant 1 can be attributed to combining the two mechanisms. As shown in Fig. 5, in most workflow instances, the performance contribution of combining the two mechanisms is larger than that of either one alone. An exception is shown in Fig. 5b, where the overall contribution of combining the two mechanisms is not as large as that of the decision variable classification mechanism alone. This is because no mechanism can be efficient in every scenario; a mechanism that has advantages in some scenarios inevitably has disadvantages in others.

Conclusions

This paper focuses on two challenges in multi-objective cloud workflow scheduling: (1) large-scale decision variables and (2) the imbalance among variables regarding their contributions to the objectives. To deal with these two challenges, this paper proposes a variable-contribution-based adaptive evolutionary cloud workflow scheduling approach that dynamically classifies the variables and adaptively allocates evolution opportunities to each constructed group of variables. Finally, in the context of real-world workflows and cloud platforms, this paper conducts comparison experiments to verify the effectiveness of the proposed adaptive mechanism in helping the population approximate the Pareto fronts of multi-objective cloud workflow scheduling problems.

Cloud workflow scheduling is a representative grey-box problem, and it is interesting to mine knowledge about the workflows and cloud resources to derive efficient scheduling algorithms. Another potential direction is to design a parallel evolutionary framework that shortens the time overhead of evolutionary optimization so as to support cloud workflow scheduling in real-time and uncertain situations.