Introduction

Currently, the evolution and spread of technology multiply the quantity of data every minute; therefore, traditional data centers, which store these data locally, no longer work as efficiently as before and have become unsuitable for many companies. Thus, traditional data centers must be improved to overcome several issues, including high maintenance costs, wasted floor space, high energy consumption, security, and high human resource costs. In addition, quality of service (QoS) requires companies to deliver high-quality services to their clients, including sending tasks to resources, scheduling tasks, and producing satisfactory results in an appropriate amount of time [1]. These problems can be addressed by cloud computing servers.

Cloud computing can help the owners of datacenters overcome several issues through its flexibility, security, high scalability, QoS, and improved maintenance and support [1]. Cloud computing can host multiple types of heterogeneous servers; these servers host thousands of virtual machines (VMs), and each machine is provided with suitable resources (e.g., CPU, RAM, and storage) to run all of its processes. Some cloud computing servers hold only one VM, while most servers host multiple VMs. This structure may cause overloading problems in some servers, which can reduce server performance and consume a large amount of energy [2].

VMs also have different workload types and resource specifications, which may result in an imbalance in resource usage and high energy consumption [1]. Therefore, many issues need to be considered, such as the utilization of all resources, the reduction of time consumption, and the optimization of energy usage. Energy consumption increases operating costs and is related to resource utilization in datacenters. In this context, energy accounted for more than 40% of the total operating costs of Amazon's datacenters [3]. In addition, the increase in energy consumption causes an increase in carbon dioxide emissions and ecological problems [3]. Therefore, if energy consumption is reduced, the operating costs of datacenters will be reduced. An effective technique to achieve this goal is to apply and optimize the VM placement (VMP) process [4].

VMP relates to the allocation of VMs to a relevant physical machine (PM). This process is considered to be an important problem in managing datacenter resources. It should be done in real-time without large time consumption. If this process consumes more time, it will reduce its main benefits of being applied in a real environment. This research topic has several optimization criteria that can be effective in achieving highly positive economic and ecological results. For this reason, resource allocation in cloud computing platforms has attracted increasing attention and has been widely studied by many researchers [5,6,7,8].

The advantages of meta-heuristic algorithms in solving many machine learning optimization issues have led several studies to successfully employ them in different fields, such as image processing, feature selection, classification, and prediction [9,10,11,12]. In the same way, several meta-heuristic methods have been introduced, such as thermal exchange optimization [13], which depends on Newton's law of cooling. This law assumes a relationship between the rate of a body's heat loss and the temperature difference between the body and its surroundings. The algorithm has been used in applications such as global optimization [13] and structural damage identification [14]. The selfish herd optimizer (SHO) [15] is another meta-heuristic that emulates selfish herd behavior. The SHO has been used to solve global optimization problems [15], and another version of SHO, based on opposition-based learning, has been applied to optimization functions [16]. The tree-seed algorithm (TSA) has been applied to address continuous optimization problems [17]. The TSA simulates the relationship between trees and their seeds. This algorithm has several applications: large-scale binary optimization [18], the optimal power flow problem [19], and similarity- and logic-gate-based binary optimization [20].

Therefore, we introduce an alternative method, called MOSSASCA, to find a suitable solution to the VMP problem. MOSSASCA is a multiobjective salp swarm algorithm (SSA) that uses the sine-cosine algorithm (SCA) as a local operator.

SSA is a recently developed optimization method that simulates the natural behavior of salps, barrel-shaped plankton (family Salpidae) that are mostly water by weight and move in the same way as jellyfish. SSA was proposed to solve different kinds of optimization issues [21]. SCA is an evolutionary method proposed to provide solutions to a variety of problems. It uses the mathematical models of the sine and cosine functions and switches between them to obtain the optimal solution [22]. These methods have been used in many previous studies and have shown good results [23,24,25].

The objectives of the proposed MOSSASCA are to maximize the mean time before a host shutdown, to reduce power consumption, and to minimize service level agreement violations (SLAVs). The proposed method starts by generating an integer population that represents solutions to the VMP problem and evaluates the quality of each solution by computing the objective functions. Then, the nondominated (ND) solutions are determined and saved in the archive. The next step uses the leader selection method to choose from the archive the best solution, which is then applied to update the other solutions. Thereafter, the solutions in the current population are updated using the modified version of SSA based on SCA. In MOSSASCA, the operators of SCA serve as local operators that improve the solutions of SSA and prevent them from becoming stuck at a local optimal point. All solutions updated by MOSSASCA are added to the archive, the ND solutions are determined, and the archive is then updated based on its size. The previous steps are executed until the stop conditions are reached.

The rest of this study is organized as follows. “Related work” section lists the related works and the problem formulation. “Background” section describes the background of the methods used in this study. “Proposed MOSSASCA for VMP method” section explains the proposed method. The experiments and results are discussed in “Results and discussion” section. Finally, the conclusion and future work are presented.

Related work

The massive number of data centers consumes a substantial amount of power; this situation negatively affects the stability and expansion of these centers. Therefore, this issue should be considered when designing cloud computing platforms [3, 26, 27]. In this context, various studies have addressed VMP problems such as power consumption, network traffic minimization, resource utilization, and performance maximization [28,29,30,31,32].

The techniques that can be applied to solve the problem of VMP resource allocation can be defined as multiobjective (MO) or single-objective. The process of finding the solution for more than one objective function simultaneously with different criteria is considered a multiobjective optimization problem (MOP).

Different types of meta-heuristic (MH) techniques have been applied to solve the single-objective VMP problem. For example, Gabay et al. [33] applied first-fit decreasing, best-fit decreasing, and worst-fit techniques to locate a destination PM for placing VMs. The authors reported that their proposed method is flexible, fast, and appropriate for real-life problems.

The authors of [34, 35] introduced VM consolidation algorithms based on the simulated annealing (SA) algorithm.

However, single-objective methods are not suitable for balancing different objectives; therefore, multiobjective methods are used to find solutions that balance the objectives.

MOP methods have been applied to different applications, such as restructuring traffic networks [36], data mining methods for knowledge discovery [37], multiple sequence alignment [38], wind turbine blade geometry design [39] and others [40, 41].

Several MOP methods have been proposed. In [42], a multiobjective method based on the grasshopper optimization algorithm (GOA) was introduced; this algorithm simulates the navigation behavior of grasshoppers in nature. The authors of [43] proposed an MO version of the ant lion optimizer (ALO) and applied it to solve engineering design problems. In [44], a multiobjective flower pollination algorithm was proposed, which simulates the pollination behavior of flowering plants. To find solutions for constrained MOP problems, a modified water cycle algorithm has been proposed [45].

With all of this attention on MOP methods, minimal effort has been directed toward the optimization of multiobjective VMP problems by MH and hybrid-heuristic algorithms. For example, the authors of [2] applied biogeography-based optimization (BBO) to determine a solution to the VMP problem considering server loads, power consumption, resource wastage, and inter-VM and storage network traffic. In addition, the authors of [3] introduced a hybrid genetic algorithm (GA) to solve the VMP problem for a communication network and PMs in a data center. Gao et al. [46] proposed an MO ant colony optimization (ACO) algorithm to deal with VMP issues; it was applied to determine efficient ND solutions that minimize power consumption and resource usage. Other studies have proposed intelligent models to decrease the power consumption in a cloud environment, such as [47,48,49,50,51].

However, these models suffer from slow convergence and can become stuck in local optima. In addition, the no free lunch (NFL) theorem [52] states that no single optimization method can solve all problems with the same quality. These considerations motivated us to propose an alternative multiobjective VMP method that can overcome the limitations of other methods.

The proposed MOP VMP method depends on improving the traditional SSA using the operators of the SCA. The SSA has demonstrated its performance in several fields; for example, it has been used to determine suitable parameters of solar cells [53, 54] and of PEM fuel cells [23]. In addition, SSA has been applied to improve classification through selecting relevant features [24, 25, 55,56,57]. In [58], the authors presented an SSA-based method for determining the optimal conductor and hosting capacity of radial distribution systems under practical considerations. The SSA has also been applied to the fractional-order proportional-integral-derivative (FOPID) controller problem [59]. In [60], the authors presented an alternative method for the job shop scheduling problem, whereas [61] used SCA to train neural networks for improved pattern classification.

In general, the contributions of this study can be formulated as follows:

  • Proposing an alternative optimization approach that uses the multiobjective optimization concept to enhance the properties of SSA based on SCA as a local search.

  • Applying the MOSSASCA method to solve the VMP problem and allocate resources in cloud computing platforms.

Background

Problem formulation of VMP

Physical cloud computing structures consist of many PMs, which constitute the physical layer, or infrastructure, of cloud computing and contain many components, such as storage, RAM, CPUs, GPUs, and network connectors [8]. Therefore, cloud computing would not work without PMs. Each PM hosts multiple VMs, where each VM can run an independent operating system (OS) and applications. VMs require many resources, including CPU time, memory space, and storage; a VM obtains these resources from its PM. Figure 1 illustrates the VMP problem.

Fig. 1
figure 1

VMP problem

Figure 1 shows four servers running four VMs, which host four virtualized applications. The following points describe the process of VMP.

  1. For each PM, estimate the applications' resource requirements from statistics of the server's resource usage over a period of time.

  2. Select a proper PM with a comparable CPU type, compatible software, sufficient storage, and similar network properties.

  3. Assign \(VM_1\) to the \(PM_1\) selected in step 2, and assign \(VM_2\) to the same PM if it still has sufficient resources.

  4. Repeat this mechanism until each VM is assigned to a suitable PM, adding a new PM if required to extend the server farm (a minimal sketch of this loop is given below).
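To make the mechanism concrete, the following minimal sketch implements the loop above as a first-fit heuristic. It is an illustration only; the `PM` record, its capacity fields, and the `new_pm` factory are assumptions introduced here, not part of the formal model.

```python
from dataclasses import dataclass, field

@dataclass
class PM:
    cpu: float                       # remaining CPU capacity (ECU)
    ram: float                       # remaining memory (GB)
    hdd: float                       # remaining storage (GB)
    vms: list = field(default_factory=list)

def place_vms(vms, pms, new_pm):
    """vms: list of (cpu, ram, hdd) demands; pms: available PMs;
    new_pm: factory that provisions an extra PM when nothing fits."""
    for j, (cpu, ram, hdd) in enumerate(vms):
        host = next((p for p in pms
                     if p.cpu >= cpu and p.ram >= ram and p.hdd >= hdd), None)
        if host is None:             # step 4: add a new PM if required
            host = new_pm()
            pms.append(host)
        host.cpu -= cpu; host.ram -= ram; host.hdd -= hdd
        host.vms.append(j)           # step 3: assign VM_j to the chosen PM
    return pms

farm = place_vms([(2, 4, 50), (1, 2, 20), (4, 8, 100)],
                 [PM(4, 8, 100)], lambda: PM(4, 8, 100))
print([p.vms for p in farm])         # [[0, 1], [2]]
```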

The following subsections describe the PMs, VMs, constraints, and objective function.

Physical machine

The set of PMs consists of one or more machines, all of which belong to the physical layer, as shown in the following equation:

$$\begin{aligned} PM = [ PM_{1}, PM_{2}, PM_{3},\ldots ,PM_{N_P}], \end{aligned}$$
(1)

where \(N_P\) describes the total number of PMs. Each PM has the following:

$$\begin{aligned} PM_{i} =(PM_{i}^{CPU}, PM_{i}^{ram}, PM_{i}^{hdd}, PM_{i}^{pmax} ), \end{aligned}$$
(2)

where \(PM_{i}\ (i=1,2,\ldots ,N_P)\) denotes the i-th PM. \(PM_{i}^{pmax}\) is the maximum power consumption of the i-th PM in units of [W], and \(PM_{i}^{CPU}, PM_{i}^{ram}\), and \(PM_{i}^{hdd}\) are the processing resources (in [ECU]), the memory resources (in [GB]), and the storage resources (in [GB]) of \(PM_i\), respectively.

Virtual machine

The set of VMs consists of one or more VMs, as shown in the following equation:

$$\begin{aligned} VM = [ VM_{1}, VM_{2}, VM_{3},\ldots ,VM_{N_V}], \end{aligned}$$
(3)

where \(N_V\) describes the total number of VMs. Each VM has the following resources:

$$\begin{aligned} VM_{j} =( VM_{j}^{CPU}, VM_{j}^{ram}, VM_{j}^{hdd} ),\quad j=1,2,\ldots ,N_V, \end{aligned}$$
(4)

where j is the current VM’s number. \(VM_{j}^{CPU}, VM_{j}^{ram}\), and \(VM_{j}^{hdd}\) represent the processing, memory, and the storage requirements of the jth VM, respectively.

Objective functions

In this section, the three objective functions are defined as follows:

  • Energy consumption: Servers consume a large amount of power even when they are in an idle state; the total consumption is modeled as follows.

    $$\begin{aligned} \begin{aligned} f^{Energy} =\sum _{i=1}^{N_P}((PM_{i}^{pmax}-PM_{i}^{pmin}) \times U_{CPU_{i}}(g)+ PM_{i}^{pmin})\times Y_i(g), \end{aligned} \end{aligned}$$
    (5)

    where \(f^{Energy}\) represents the total power consumption of the PMs, and \(PM_{i}^{pmin}=PM_{i}^{pmax} \times 0.6\) [62, 63] defines the minimum power consumption of \(PM_{i}\). \(U_{CPU_{i}}(g)\) represents the CPU utilization ratio of \(PM_i\) at time g, while \(Y_i(g)\in \{0,1\}\) is equal to 1 if \(PM_i\) is turned on and 0 otherwise.

  • SLAV: The cloud computing infrastructure and its hosts try to meet QoS requirements, which are modeled in the form of SLAVs with respect to a maximum response time or a minimum throughput. The SLAV caused by a host is defined in the following equations.

    $$\begin{aligned} f^{SLAV} = SLAO \times SLAM, \end{aligned}$$
    (6)

    where SLAO represents the average fraction of time during which a host experiences a CPU utilization of 100% and is defined as follows:

    $$\begin{aligned} SLAO= \frac{1}{N_P}\sum \limits _{i=1}^{N_P}\frac{T_{si}}{T_{ai}}, \end{aligned}$$
    (7)

    where \(T_{ai}\) is the active time of the i-th host, and \(T_{si}\) represents the total time during which the i-th host experiences 100% CPU utilization. Moreover, SLAM represents the degradation in performance due to the migration of VMs and is defined as follows:

    $$\begin{aligned} SLAM= \frac{1}{N_V}\sum \limits _{j=1}^{N_V}\frac{C_{dj}}{C_{rj}}, \end{aligned}$$
    (8)

    where \(C_{rj}\) indicates the total CPU utilization requested by \(VM_j\), and \(C_{dj}\) represents the performance degradation caused by the migration of \(VM_j\).

  • Mean time before a host shutdown (MTBHS): This time is measured in seconds, and the average is calculated as follows.

    $$\begin{aligned} f^{MTBHS} = \frac{1}{N_P}{\sum _{i=1 }^{N_P}{h}_{i}}, \end{aligned}$$
    (9)

    where \(h_i\) represents the host shutdown time.
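As an illustration of the three objective functions above, the sketch below evaluates Eqs. (5)–(9) for a given placement; the array names (`u_cpu`, `p_max`, `shutdown_times`, and the SLAV inputs) are assumptions introduced here for readability, not notation from the model.

```python
import numpy as np

def energy(placement, u_cpu, p_max):
    """Eq. (5): placement[j] is the index of the PM hosting VM_j (-1 if none);
    u_cpu[i] is the CPU utilization of PM_i; p_max[i] is its maximum power."""
    on = np.zeros(len(p_max), dtype=bool)          # Y_i(g): PM_i is turned on
    on[placement[placement >= 0]] = True
    p_min = 0.6 * p_max                            # PM_i^pmin = 0.6 * PM_i^pmax
    return float(np.sum(((p_max - p_min) * u_cpu + p_min) * on))

def slav(t_s, t_a, c_d, c_r):
    """Eqs. (6)-(8): SLAO from per-host overload and active times;
    SLAM from per-VM migration degradation and requested CPU."""
    return float(np.mean(t_s / t_a) * np.mean(c_d / c_r))

def mtbhs(shutdown_times):
    """Eq. (9): mean time before a host shutdown, in seconds."""
    return float(np.mean(shutdown_times))
```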

Constraints:

$$\begin{aligned}&\sum _{i=1}^{N_P} B_{ji}(g)\le 1 \ \forall j\in \{1, \ldots , N_V\}, \end{aligned}$$
(10)
$$\begin{aligned}&\sum _{j=1}^{N_V} VM_{j}^{CPU} \times B_{ji}(g) \le PM_{i}^{CPU} \quad \forall i, \end{aligned}$$
(11)
$$\begin{aligned}&\sum _{j=1}^{N_V} VM_{j}^{ram} \times B_{ji}(g) \le PM_{i}^{ram} \quad \forall i, \end{aligned}$$
(12)
$$\begin{aligned}&\sum _{j=1}^{N_V} VM_{j}^{hdd} \times B_{ji}(g) \le PM_{i}^{hdd} \quad \forall i, \end{aligned}$$
(13)

where \(B_{ji}(g)=1\) when \(VM_j\) is allocated to \(PM_i\) at time g; otherwise, \(B_{ji}(g)=0\). Constraint (10) states that each \(VM_j\) must be executed on at most one PM; a VM whose SLA priority is the lowest (in this work, s = 2) is not allocated to any PM. Constraints (11)–(13) require that each \(PM_i\) have sufficient resources to work properly and serve all of its VMs at time g.
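A compact feasibility check for constraints (10)–(13) might look as follows; the binary matrix B and the stacked demand and capacity arrays are assumptions of this sketch.

```python
import numpy as np

def feasible(B, vm_req, pm_cap):
    """B: N_V x N_P binary allocation matrix (B[j, i] = 1 iff VM_j is on PM_i);
    vm_req: N_V x 3 array of (CPU, RAM, HDD) demands;
    pm_cap: N_P x 3 array of PM capacities."""
    if np.any(B.sum(axis=1) > 1):        # constraint (10): at most one PM per VM
        return False
    load = B.T @ vm_req                  # aggregate demand placed on each PM
    return bool(np.all(load <= pm_cap))  # constraints (11)-(13)
```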

Multiobjective optimization

MOP problems are optimization problems that involve more than one objective function [64]. This kind of problem is considered NP-hard because it contains a set of conflicting objectives. In general, an MOP is formulated as [64]:

$$\begin{aligned}&\min \,f( {x})=[{{f}_{1}}( {x}),\ldots ,{{f}_{M}}( {x})\,],\quad {x}\,=[{{x}_{1}},\ldots {{x}_{n}}]\in \varOmega , \end{aligned}$$
(14)
$$\begin{aligned}&subject\ to: {{h}_{j}}( {x})=0,\quad j=1,2,\ldots ,N_h, \end{aligned}$$
(15)
$$\begin{aligned}&{{G}_{l}}( {x})\ge 0,\quad (l=1,2,\ldots ,N_G), \end{aligned}$$
(16)

where \({{f}_{m}}( {x})\) and \(\varOmega\) are the m-th objective function and the search domain, respectively. \({{L}_{j}}\) and \({{U}_{j}}\) are the lower and upper bounds of the j-th decision variable, respectively. The inequality and equality constraints are represented by \({{G}_{l}}( {x})\) and \({{h}_{j}}( {x})\), respectively.

To find solutions that balance all conflicting objective functions, we use the MOP concept known as Pareto dominance. Based on this concept, a solution x is considered better than another solution y if and only if x is no worse than y in every objective and strictly better in at least one. This can be formulated as

$$\begin{aligned} \forall j: f_j(x)\le f_j(y)\ and\ \exists i: f_i(x)<f_i(y). \end{aligned}$$
(17)

Therefore, a solution x that satisfies Eq. (17) is said to dominate y.

From the previous definition of dominance, there exists a set of solutions, called the Pareto optimal set (also called nondominated solutions), that no other solution dominates. The set of all Pareto optimal solutions is represented as follows:

$$\begin{aligned} Pareto\ set=\{x|x\in \varOmega \ and\ x\ is\ Pareto\ optimal\}. \end{aligned}$$
(18)

In addition, the image of the Pareto set in the objective domain is called the Pareto front and is represented as follows:

$$\begin{aligned} Pareto\ front =\{f(x)|x\in Pareto\ set \ and\ f(x)\in R^M\}. \end{aligned}$$
(19)
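The dominance test of Eq. (17) and the extraction of the Pareto set in Eq. (18) translate directly into code; a minimal sketch, assuming minimization and a list of objective vectors, is given below.

```python
def dominates(fx, fy):
    """Eq. (17): fx dominates fy iff it is no worse in every objective
    and strictly better in at least one (minimization)."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def pareto_set(points):
    """Eq. (18): the nondominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

print(pareto_set([(1, 5), (2, 2), (3, 1), (4, 4)]))  # (4, 4) is dominated
```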

Salp swarm algorithm

The salp swarm algorithm (SSA) is a recent meta-heuristic method introduced in [21]. SSA emulates the behavior of salps of the family Salpidae, which move like jellyfish and whose weight is largely water [65]. The working mechanism of SSA starts by producing a random set of \(N_S\) solutions (X) with dimension D. X is split into two groups based on the location of the salps in the population: the salp at the front of the food chain is the leader, and the rest are followers. The leader position \(x^1\) is updated using the following equation [21]:

$$\begin{aligned} x_j^1(g+1)= {\left\{ \begin{array}{ll} {x^b_j(g) + \alpha _1 ((ub_j-lb_j)\times \alpha _2+lb_j)} &{} \alpha _3 < 0.5, \\ {x^b_j(g) - \alpha _1 ((ub_j-lb_j)\times \alpha _2+lb_j)} &{} \alpha _3 \ge 0.5, \end{array}\right. } \end{aligned}$$
(20)

where \(x_j^1(g)\) and \(x^b_j(g)\) define the leader's position and the food source in the j-th dimension at iteration g, respectively. \(ub_j\) and \(lb_j\) define the upper and lower bounds of dimension j, respectively. \(\alpha _2\) and \(\alpha _3\) are random values in [0, 1] that control the step size and the direction of movement within the search space [21]. To balance the exploitation and exploration phases, \(\alpha _1\) is updated at each iteration g depending on the maximum number of iterations \(g_{max}\) as follows [21]:

$$\begin{aligned} \alpha _1=2e^{-\left(\frac{4g}{g_{max}}\right)^2}. \end{aligned}$$
(21)

Then, the position of each follower solution \(x^i\) is updated using Eq. (22) [21]:

$$\begin{aligned} x_j^i(g+1)= \frac{1}{2}(x_j^i(g) + x_j^{i-1}(g)),\quad i=2,\ldots ,N_S \end{aligned}$$
(22)

Algorithm 1 shows all steps of the SSA [21].

figure a
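A minimal sketch of one SSA iteration, following Eqs. (20)–(22), is given below; the 0.5 threshold on \(\alpha_3\) and the clipping to the bounds reflect common implementations and are assumptions of this sketch.

```python
import numpy as np

def ssa_step(X, food, lb, ub, g, g_max):
    """One SSA iteration on population X (one salp per row).
    food: best position found so far; lb, ub: per-dimension bound arrays."""
    a1 = 2 * np.exp(-(4 * g / g_max) ** 2)          # Eq. (21)
    X = X.copy()
    for j in range(X.shape[1]):                     # leader update, Eq. (20)
        a2, a3 = np.random.rand(), np.random.rand()
        step = a1 * ((ub[j] - lb[j]) * a2 + lb[j])
        X[0, j] = food[j] + step if a3 < 0.5 else food[j] - step
    for i in range(1, X.shape[0]):                  # follower update, Eq. (22)
        X[i] = 0.5 * (X[i] + X[i - 1])
    return np.clip(X, lb, ub)
```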

Sine-cosine algorithm

The sine-cosine algorithm (SCA), presented in [22], uses the mathematical models of the sine and cosine functions to update its population and obtain a solution to the problem, switching between the two functions to achieve the best result.

The working mechanism of SCA begins by randomly producing an initial set of \(N_S\) solutions X with D dimensions. Afterward, it calculates the fitness value of each solution. The solution with the best fitness value (\(f_b\)) is considered the best solution (\(x_b\)). The \(x_b\) and the parameters \(\beta _i, i=1, 2, 3, 4\) are used to update the solutions as in Eq. (25). SCA repeats these steps until the stop condition is met, as shown in Algorithm 2.

SCA updates its population using the sine function as follows [22]:

$$\begin{aligned} x_i(g+1)=x_i(g)+\beta _1\times sin(\beta _2 )\times |\beta _3 x_b(g)-x_i(g) | \end{aligned}$$
(23)

and updates its population using the cosine function as follows [22]:

$$\begin{aligned} x_i(g+1)=x_i(g)+\beta _1\times cos(\beta _2)\times |\beta _3x_b(g)-x_i(g) | \end{aligned}$$
(24)

The SCA uses a parameter \(\beta _4\) to switch between these equations as shown in the following [22]:

$$\begin{aligned} x_i(g+1)=\left\{ \begin{array}{c} x_i(g)+\beta _1\times sin(\beta _2 )\times |\beta _3 x_b(g)-x_i(g) | \ \ \ \ if \; \beta _4<0.5 \\ x_i(g)+\beta _1\times cos(\beta _2 )\times |\beta _3 x_b(g)-x_i(g) | \ \ \ if\; \beta _4\ge 0.5 \end{array} \right\} \end{aligned}$$
(25)

where |.|, \(x_b(g)\), and \(x_i(g)\) denote the absolute value, the destination (best) solution, and the current solution at iteration g, respectively. The \(\beta _i, i=1, 2, 3, 4\) are random variables, with \(\beta _2\in [0, 2\pi ]\), \(\beta _3\in [0,2]\), and \(\beta _4\in [0,1]\) [22]. \(\beta _2\) defines how far the next solution moves, either toward or away from \(x_b\). \(\beta _3\) is a random weight that stochastically emphasizes (\(\beta _3 > 1\)) or deemphasizes (\(\beta _3 < 1\)) the influence of the destination when defining the next solution. \(\beta _1\) switches between exploration and exploitation and is decreased linearly over the iterations as follows [22]:

$$\begin{aligned} \beta _1=\sigma -g \frac{\sigma }{g_{max}}, \end{aligned}$$
(26)

where g, \(\sigma\), and \(g_{max}\) denote the current iteration, a constant, and the maximum number of iterations, respectively.

figure b
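For reference, one SCA iteration per Eqs. (23)–(26) can be sketched as below; the sampling ranges for \(\beta_2\) and \(\beta_3\) follow the description above, and the function name is illustrative.

```python
import numpy as np

def sca_step(X, x_best, g, g_max, sigma=2.0):
    """One SCA iteration on population X, per Eqs. (23)-(26)."""
    b1 = sigma - g * sigma / g_max                   # Eq. (26), linear decay
    X = X.copy()
    for i in range(X.shape[0]):
        b2 = 2 * np.pi * np.random.rand(X.shape[1])  # movement direction
        b3 = 2 * np.random.rand(X.shape[1])          # weight on the destination
        b4 = np.random.rand(X.shape[1])              # sine/cosine switch
        r = np.abs(b3 * x_best - X[i])
        X[i] = np.where(b4 < 0.5,
                        X[i] + b1 * np.sin(b2) * r,  # Eq. (23)
                        X[i] + b1 * np.cos(b2) * r)  # Eq. (24)
    return X
```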

Proposed MOSSASCA for VMP method

This section discusses the main steps of the proposed multiobjective VM placement method, which improves the behavior of the SSA using the SCA; the proposed method is therefore called MOSSASCA. The goal of the proposed method is to search for the set of solutions approximating the optimal Pareto front (PF), i.e., the solutions that minimize SLAV, reduce power consumption, and maximize the time before a host shutdown [as defined in Eqs. (5)–(9)].

In general, the proposed MOSSASCA begins by receiving the parameters, i.e., the number of VMs (\(N_V\)), the number of hosts (\(N_P\)), the number of solutions (\(N_S\)), and the maximum number of iterations. The next step is to generate a random population that contains a set of solutions, each of which assigns a PM index to every VM. Subsequently, the quality of each solution is measured by computing the objective functions, and the best solution is selected using dominance concepts. Then, each solution is updated by SSASCA, where SCA works as a local search technique to improve the SSA. The nondominated (ND) solutions in the archive are updated by comparing them with the ND solutions from the updated population. This sequence is executed until a maximum number of iterations is reached.

The previously mentioned steps of the proposed MOSSASCA can be classified into three phases: (1) initialization, (2) updating the population using SSASCA, and (3) updating the archive of ND solutions. These phases are discussed in detail in the following subsections.

Initialization

In this phase, MOSSASCA starts by initializing a set of \(N_S\) solutions X, where each element of a solution \(X_i, (i=1,2,\ldots , N_S)\) holds the index of the PM that hosts the corresponding VM, and the dimension equals the total number of VMs (\(N_V\)). Each solution is generated using the following equation:

$$\begin{aligned} {X}_{ij}=floor(L_P+rand\times (N_P-L_P)), \quad j=1,\ldots , N_V \end{aligned}$$
(27)

where \(N_P\) represents the number of PMs, \(L_P\) is the minimum PM index (set to one), and floor is a function that converts a real number to an integer value. To clarify this definition, consider \(N_P=5\) and \(N_V=8\) with \(X_i=[2\ 5\ 3\ 3\ 3\ 1\ 2\ 4]\). This representation means that the first and seventh VMs are allocated to the second PM, and the second VM is allocated to the fifth PM. The third to fifth VMs are allocated to the third PM, and the sixth and eighth VMs are allocated to the first and fourth PMs, respectively.
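Eq. (27) amounts to a one-line vectorized initializer; the function and argument names below are illustrative.

```python
import numpy as np

def init_population(n_s, n_v, n_p, l_p=1):
    """Eq. (27): N_S solutions, each an N_V-vector of integer PM indices."""
    return np.floor(l_p + np.random.rand(n_s, n_v) * (n_p - l_p)).astype(int)

print(init_population(n_s=4, n_v=8, n_p=5))  # four candidate placements of 8 VMs
```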

Next, the objective functions are calculated for each solution using the following equation:

$$\begin{aligned} Min\;\; F=[ f^{Energy};f^{SLAV}; f^{MTBHS}]^T, \end{aligned}$$
(28)

where \(f^{Energy}, f^{SLAV}\), and \(f^{MTBHS}\) are defined in Eqs. (5)–(9). Thereafter, the ND solutions are determined and stored in the archive AR.

figure c

Update the population using SSASCA

In this phase, the solutions X are updated using the hybrid of SSA and SCA. The aim of using the SCA algorithm is to give the SSA the capability of escaping local optima. The hybrid SSASCA algorithm starts by determining the best solution. However, this process differs from single-objective optimization, where each solution has only one objective value; in MOP, each solution has several objective values. Therefore, the leader selection mechanism (LSM) is applied to the nondominated solutions in archive AR (see [66] for more details) to select the best solution (\(X^b\)). The LSM uses the roulette-wheel method to choose \(X^b\) based on the probability (\(Prob_{Sel}\)), which is computed as follows:

$$\begin{aligned} Prob_{Sel}=\frac{C}{Nseg_i}, \end{aligned}$$
(29)

where \(Nseg_i\) represents the number of Pareto optimal solutions in the i-th (crowded) segment, and \(C>1\) is a constant; less crowded segments thus have a higher probability of providing the leader.

To switch between SSA and SCA, we compute for each solution the probability (\(Pr_i\)) based on the power objective \(f_1=f^{Energy}\) as follows:

$$\begin{aligned} Pr_i=\frac{f_1(X_i)}{\sum _{i=1}^{N_S} f_1(X_i)}. \end{aligned}$$
(30)

If the current solution has \(Pr_i\ge \eta\) (\(\eta \in [0,1]\)), then the SSA is used to update \(X_i\); otherwise, the SCA is used to update \(X_i\). The next step is to convert the updated solution into an integer solution using the floor function (i.e., \(X_i=floor(X_i)\)); then, the objective functions are calculated for each solution.

figure d
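A sketch of this switching rule, reusing the `ssa_step` and `sca_step` sketches above, might read as follows; treating \(f_1\) as a per-solution energy array and the handling of the threshold \(\eta\) are assumptions of this sketch.

```python
import numpy as np

def ssasca_step(X, f_energy, food, lb, ub, g, g_max, eta=0.5):
    """Hybrid update: solutions with Pr_i >= eta (Eq. 30) follow the SSA
    update; the rest are refined by SCA acting as a local search."""
    pr = f_energy / f_energy.sum()                  # Eq. (30)
    Xf = X.astype(float)
    X_ssa = ssa_step(Xf, food, lb, ub, g, g_max)    # SSA candidate positions
    X_sca = sca_step(Xf, food, g, g_max)            # SCA candidate positions
    X_new = np.where((pr >= eta)[:, None], X_ssa, X_sca)
    # floor and clip to return to valid integer PM indices
    return np.floor(np.clip(X_new, lb, ub)).astype(int)
```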

Update the archive

This stage begins by combining the current population with the archive AR, followed by the determination of ND solutions. This process is critical to improve convergence to the true PF and preserve the population’s diversity by using the density estimation information [67,68,69,70].

In general, the archive AR is updated by comparing its ND solutions with each solution in the population (i.e., \(X_i\in X\)) using the Pareto dominance concepts. Here, the solution \(X_i\) replaces any solution \(X_A\) in AR that \(X_i\) dominates; if a set of solutions in AR is dominated by \(X_i\), this set is removed from AR and \(X_i\) is added. However, if \(X_i\) should be added while AR is full, then the crowded-segment method is used: the segmentation of the objective space is rearranged using the grid mechanism, the most crowded segment is determined, and one solution is deleted from it. Thereafter, \(X_i\) is inserted into the least crowded segment to improve the PF's diversity. In addition, the solution \(X_i\) is not appended to AR when any solution in AR dominates it [71]. The final step in this phase is to enhance the population X by selecting the best \(N_S\) solutions from AR.

figure e
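The archive maintenance can be sketched as follows, reusing the `dominates` function given earlier; the nearest-neighbour crowding used here for truncation is a simplification of the grid mechanism described in this section.

```python
def update_archive(archive, population, max_size):
    """Merge archive and population, keep the nondominated (solution,
    objectives) pairs, and truncate the most crowded region on overflow."""
    merged = archive + population
    nd = [s for s in merged
          if not any(dominates(q[1], s[1]) for q in merged if q is not s)]
    while len(nd) > max_size:
        def crowding(s):  # squared distance to the nearest neighbour
            return min(sum((a - b) ** 2 for a, b in zip(s[1], t[1]))
                       for t in nd if t is not s)
        nd.remove(min(nd, key=crowding))  # drop the most crowded solution
    return nd
```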

Full description of MOSSASCA method

The remaining parts of the proposed MOSSASCA method for VMP are presented in Algorithm 6. Here, the current generation is represented by g, and the maximum number of iterations is given by \(g_{max}\). The proposed MOSSASCA starts by passing the input to the initialization phase (as described in Algorithm 3) and receiving its output (i.e., population X and archive AR). The next step is to update X based on the hybrid SSASCA algorithm (as described in Algorithm 4). Then, Algorithm 5 is used to find the ND solutions and update AR. The steps of these two algorithms (i.e., Algorithms 4 and 5) are repeated until the terminal conditions are satisfied.

figure f

Complexity of MOSSASCA method

The complexity of the MOSSASCA method depends on several factors: the dimension of the tested problem, the number of solutions \(N_S\), the number of objectives M, and the maximum number of iterations \(g_{max}\). Therefore, the complexity of MOSSASCA is given by

$$\begin{aligned} O(MOSSASCA)=\, & {} O(g_{max}(N_V\times (N_S\times N_{prob1}+N_S\times N_{prob2})\nonumber \\&+cof\times N_S+M\times N_S^2)), \end{aligned}$$
(31)

where \(N_{prob1}\) and \(N_{prob2}\) represent the numbers of solutions updated using the SSA and the SCA, respectively, and cof is the cost of evaluating the objective functions.

Results and discussion

A set of experiments is performed to assess the performance of MOSSASCA as a multiobjective VMP method for finding the optimal solution to the VM allocation problem. In addition, the results of MOSSASCA are compared with those of other approaches, namely, NSGA-II [72], MOEA/D [73], MOPSO [74], and MOSCA [75].

Environment description

In this section, the experimental environment is described. The CloudSim toolkit simulation framework is used to simulate the cloud computing environment [76]. The experimental setup simulates a real cloud computing infrastructure, which includes 160 IBM x3250 servers with Intel Xeon 3480 CPUs, 160 IBM x3250 servers with Intel Xeon 3470 CPUs, 160 HP ProLiant ML110 G3 servers, 160 HP ProLiant ML110 G4 servers, and 160 HP ProLiant ML110 G5 servers. Table 1 shows the set of parameters used in the experiments for the VM simulations.

Table 1 The Parameters of VM simulation

In addition, the workload data are extracted from the CoMon project [77, 78]. Statistical values of the daily utilization of the data (shown in %) are given in Table 2.

Table 2 Properties of the workload

The performance of MOSSASCA is compared with that of the other algorithms, i.e., NSGA-II, MOEAD, MOPSO, and MOSCA. To evaluate these methods, we use different numbers of hosts (25, 75, and 100) and VMs (50, 150, and 200), where 50 VMs are assigned to 25 hosts, 150 VMs to 75 hosts, and 200 VMs to 100 hosts.

All algorithms are implemented using the jMetal Java framework, which is installed on Windows 10 (64-bit).

Performance measures

The proposed method is evaluated on the basis of MOP performance measurements, including hypervolume (HV), spread (SP), generational distance (GD), epsilon (EPS), and inverted generational distance (IGD) [79].

  1. HV: It evaluates the closeness and diversity of the obtained PF by computing the size of the objective-space region dominated by its solutions, where \(a_i\) in Eq. (32) denotes the region dominated by the i-th solution. If the HV of a set X is greater than that of B, then X is considered better than B:

    $$\begin{aligned} HV=volume \left( \bigcup _{i=1}^{|PF|} a_i\right) \end{aligned}$$
    (32)
  2. SP: It evaluates the extent of spread achieved among the obtained solutions by measuring the nonuniformity of their distribution; it is defined as

    $$\begin{aligned} Spread= \frac{\sum _{i=1}^M d_i^e +\sum _{i=1}^{|Q|}|d_i-\bar{d}|}{\sum _{i=1}^M d_i^e+|Q|\bar{d}}, \end{aligned}$$
    (33)

    where \(d_i\) represents the Euclidean distance between consecutive solutions, with mean value \(\bar{d}\), and \(d_i^e\) represents the distance between the i-th extreme solution and \(PF^*\). M and |Q| are the total number of objective functions and the number of solutions in \(PF^*\), respectively. A smaller spread value indicates better performance.

  3. GD: It defines the average distance between the estimated front and the true Pareto front, as given in Eq. (34):

    $$\begin{aligned} GD(fp,fp^*)=\frac{\sqrt{\sum _{i=1}^{n}d_i^2}}{n}. \end{aligned}$$
    (34)

    In Eq. (34), \(n=|fp|\) defines the number of ND solutions, and \(d_i\) defines the Euclidean distance in the objective space between the i-th ND solution and the closest reference point (RP).

  4. IGD: It computes the closeness between the reference optimal front (RP) and the archived Pareto solutions, as given in Eq. (35):

    $$\begin{aligned} IGD=\frac{\sum _{r\in RP } d(r)}{|RP|},\quad d(r)=\min _{a\in PF}||a-r||_2. \end{aligned}$$
    (35)

    A smaller IGD value indicates better performance.

  5. EPS: It finds the minimum factor by which the approximate PF must be translated in the objective space so that it dominates the true PF:

    $$\begin{aligned} EPS=\max _{r\in PF_t} \min _{a\in A} \max _{i=1,2,\ldots ,M} (a_i-r_i), \end{aligned}$$
    (36)

    where r and a are the solutions in the true PF and the approximate PF, respectively. The algorithm with the smallest EPS is considered to be the better one.
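As a minimal illustration of how GD and IGD are computed (Eqs. 34–35), assuming the fronts are given as arrays of objective vectors:

```python
import numpy as np

def gd(front, ref):
    """Eq. (34): distance from each obtained point to the closest reference point."""
    front, ref = np.asarray(front, float), np.asarray(ref, float)
    d = np.linalg.norm(front[:, None, :] - ref[None, :, :], axis=2).min(axis=1)
    return np.sqrt((d ** 2).sum()) / len(front)

def igd(front, ref):
    """Eq. (35): distance from each reference point to the closest obtained point."""
    front, ref = np.asarray(front, float), np.asarray(ref, float)
    d = np.linalg.norm(ref[:, None, :] - front[None, :, :], axis=2).min(axis=1)
    return d.sum() / len(ref)
```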

Experimental results analysis

Comparison with other algorithms

The comparison results between MOSSASCA and the other algorithms for determining the optimal solution to the VMP problem in cloud computing are given in Table 3 and Figs. 2, 3, and 4. The results show that the proposed MOSSASCA performs better than the other methods. Specifically, the proposed method has the lowest minimum and maximum power consumption in all tested problems. The standard deviation of the proposed method is smaller than that of all other methods except MOEAD in the VM200-PM100 problem.

Fig. 2
figure 2

Average MTBHS for each method in each VMP problem

Fig. 3
figure 3

Average SLAV for each method in each VMP problem

Fig. 4
figure 4

Average power consumption for each method in each VMP problem

In addition, the minimum value of SLAV is achieved by the proposed method as well as by the MOSCA, MOPSO, and MOEAD algorithms in problems VM50-PM25 and VM150-PM75, whereas the proposed method alone achieves the best value in the VM200-PM100 problem. The proposed method also has a better maximum SLAV value than the other methods. However, the NSGA-II algorithm has the best standard deviation in problem VM50-PM25, and MOSCA has a better value in problems VM150-PM75 and VM200-PM100.

The best maximum and minimum MTBHS values are achieved by MOSSASCA in all the tested problems, but the best standard deviation is achieved by MOPSO in problems VM50-PM25 and VM150-PM75. Furthermore, the proposed method has the smallest standard deviation value in the VM200-PM100 problem.

Figures 2, 3, and 4 show the average MTBHS, SLAV, and energy, respectively. The figures show that the proposed method provides lower average power consumption and SLAV values than the other methods in each tested problem. The methods have similar performance in VM50-PM25, but MOPSO and NSGA-II have the worst values in VM200-PM100 and VM150-PM75, respectively. Power consumption increases as the numbers of PMs and VMs increase. The proposed method and MOSCA have nearly the same SLAV value in VM50-PM25, but the proposed method produces better results than all other methods in the other two problems. Among the methods, NSGA-II has the worst results in all three problems, whereas MOSCA, MOPSO, and MOEAD have nearly the same average SLAV value in the VM150-PM75 problem. However, MOSCA has better results than MOPSO and MOEAD in the VM200-PM100 problem. Moreover, the worst MTBHS value is achieved by NSGA-II in problems VM50-PM25 and VM200-PM100, and by MOEAD in VM150-PM75. MOSSASCA shows the best value among all tested algorithms in each tested problem. These results suggest that MOSSASCA provides the maximum utilization of each resource.

Table 3 The comparison of results among the algorithms in each tested VMP problem

Performance evaluation based on MOP indicators

To investigate the quality of the PFs obtained by each method, we evaluate the set of MOP indicators given in Table 4 for each tested problem. From this table, it can be noted that MOSSASCA performs better than the other methods in approximating PFs. For example, in terms of EPS, HV, and IGD, MOSSASCA ranks first in all tested problems. However, in terms of spread, MOSCA ranks first in the VM50-PM25 and VM150-PM75 problems, followed by the proposed MOSSASCA, which ranks first in the third problem (i.e., VM200-PM100). MOSSASCA has a smaller GD value only in problem VM50-PM25, while MOPSO has a smaller GD value in the two other problems. In addition, we can conclude from the table that MOEAD and NSGA-II generally have the worst performance in the tested problems.

From all the previous results, it can be concluded that the MOSSASCA outperforms the other methods based on the MOP indicators.

Table 4 Comparison results based on MOP indicators

Influence of the parameters of the proposed method

In this experiment, we investigate the influence of different parameter values on the performance of MOSSASCA. Table 5 reports the results of MOSSASCA for different values of a, \(N_S \in [50, 200]\), and \(g_{max} \in [50, 200]\). The proposed method at \(a=1.5\) performs better than at \(a=0.5\), indicating that the performance of MOSSASCA improves with increasing a. Population size also affects the performance of MOSSASCA: it performs better at \(N_S=200\) than at \(N_S=50\) or \(N_S=100\), as shown in Table 5. Similarly, the performance of MOSSASCA is affected by the number of iterations; the results show that it achieves its highest performance at \(g_{max}=200\).

Table 5 Results of MOP indicators under different parameter values in the proposed method

Influences of VMs and PMs on the proposed method

The effect of varying the numbers of VMs and PMs on the performance of the MOSSASCA method is tested in this section. Table 6 gives the results for different numbers of VMs and PMs. In terms of EPS, MOSSASCA achieves its best value in problem VM300-PM600; however, this is not largely different from the result for VM200-PM500. In VM400-PM1000, the EPS of MOSSASCA is the largest among the three problems. Based on the spread measure, the quality of MOSSASCA in the three problems is nearly the same, with the best value achieved in VM200-PM500. The same observation holds for GD, where the value of MOSSASCA at VM200-PM500 is better than its values in the other problems. In addition, the HV value of MOSSASCA at VM400-PM1000 is better than in the other problems, and the best IGD value is achieved when MOSSASCA solves VM300-PM600. From this experimental series, it can be observed that the performance of MOSSASCA, in terms of MOP indicators, does not vary notably when solving VMP problems with 200–400 VMs and 500–1000 PMs.

Table 6 Results of changing the number of VMs and PMs

From all the previous experimental results, it can be noted that the performance of MOSSASCA is better than that of the other methods in terms of the three objective functions and the performance measurements. This performance results from the high capability of the SSA to explore the search domain and from using the operators of the SCA algorithm as a local search method to improve the exploitation ability of the SSA. This combination leads to nondominated solutions that best balance the three objectives of the VMP problem in cloud computing. Moreover, the performance of the other multiobjective VMP methods was lower than that of the proposed method, and the quality of the algorithms was not consistent across problems; this outcome is because each algorithm has a different ability regarding either the exploration or the exploitation of the search domain. However, MOSSASCA has some limitations; for example, its computational time will require additional improvement to make it more suitable for real-time VMP problems.

Conclusion

This paper proposes an alternative MOP method for finding the optimal solution to the VM placement problem. The main objectives of the proposed MOSSASCA are to maximize MTBHS, to reduce power consumption, and to minimize SLAV. MOSSASCA seeks solutions that minimize the conflict between these three objectives. In MOSSASCA, the SCA is applied as a local search approach to enhance the performance of the traditional SSA by preventing it from getting stuck in a local optimal solution and by increasing convergence speed. To assess the performance of the proposed method, we performed a set of experiments using different numbers of VMs and physical machines. The results of MOSSASCA are compared with those of well-known MOP methods, including NSGA-II, MOEA/D, MOPSO, and MOSCA. The experimental results show that MOSSASCA outperforms the other methods on the basis of the MOP indicators and in achieving the three objectives. The proposed method achieves the best results in three MOP indicators, namely, EPS, HV, and IGD, whereas MOSCA and MOPSO are better according to the spread and GD values, respectively. Compared with the other methods, MOSSASCA exhibits a better ability to reduce power consumption and SLAV while increasing MTBHS.

Given the superiority of the proposed MOSSASCA method, it can be extended to handle more than three objectives in VM placement in cloud computing. It can also be applied to different fields, for example, feature selection, image segmentation, and job scheduling.