1 Introduction

Virtual Machine Placement and Migration (VMPM) involves processes that must deal with many different aspects of managing cloud resources. Factors that play an important role in decision making include resource balancing, overload, costs and the overall gains obtained after performing such processes. The workload of cloud-hosted services is usually dynamic and unpredictable. Despite the apparently infinite resources of cloud computing, they have to be properly managed to ensure that demand and supply are matched. Considering the scale of cloud infrastructures and the NP-hardness of VMPM, existing works are limited in providing a comprehensive and scalable solution to these concerns, such as those surveyed in our previous work [1] and many others [2,3,4,5,6,7,8].

In that regard, this paper proposes a new distributed approach for VM placement and migration based on the modern portfolio theory (MPT) developed by Markowitz (1959) [9]. The theory is applied to the selection and management of investment portfolios, which are investment wallets containing a list of financial assets. These assets, such as stocks, compose the portfolio as a whole, and each one accounts for a specific percentage of the overall portfolio value. Asset values are dynamically determined and difficult to predict, as is resource demand in cloud computing. That requires the share of each asset in a portfolio to be rebalanced regularly, to ensure that the percentage of each asset remains as expected by the investor, according to his/her risk preferences. This is known as the risk profile [10].

A good portfolio is based on multiple assets that are aligned with the investor's goals. Assets can be selected strategically or tactically, based respectively on: (i) past information, which gives an overview of asset performance over time; and/or (ii) forecasts, i.e., beliefs regarding future asset performance, market information or trends [9].

The main foundation of MPT is the portfolio diversification effect, which reduces risk while yielding either (i) the highest expected return for a given level of risk or (ii) the lowest risk for a given level of return. It applies mean-variance analysis to select assets, usually seeking to maximize the mean return (profit) and minimize the return variance (risk). This analysis strongly relies on asset covariance (correlation) to determine how different assets fit together, so that the goals just mentioned are met.

Asset selection takes into account the imperfect correlation between pairs of assets, computed based on historical data. When there is a positive correlation between two assets, their expected values tend to change in the same direction. A less risky portfolio can be composed of pairs of negatively correlated assets, so that they move in opposite ways; when one loses value, the other one gains, reducing risk and balancing returns. In practice, the correlation is not perfect, since assets change values at different scales (speeds).

The risk associated with assets is defined as the volatility of their prices, formally the variance of the returns. The more volatile an asset is, the higher its risk, although higher returns are usually earned. Due to the volatility and the often unpredictable nature of financial assets, it is challenging to create a portfolio that ensures a certain level of risk and return. That is why portfolios usually have to be re-evaluated and rebalanced periodically, even to add or remove some assets.

Regarding cloud computing, cloud resources are physical or logical assets (RAM, CPU, bandwidth, storage, etc.) that have different returns for cloud providers and customers, respectively, e.g., (i) profit and (ii) quality of service (QoS). Management of cloud assets by means of VM placement and migration also involves the same issues faced by financial assets, such as the volatility of returns. One impacting factor for volatility is the unpredictable demand that affects both the cloud and financial markets.

Modern portfolio theory has applications in different fields such as project portfolio management by government and industry. It has been applied to create a portfolio of imperfectly correlated projects that minimize risks and maximize returns for a company [11]. Since MPT is widely used to optimize allocation of assets in a portfolio and to manage risk and return, this paper presents an approach that applies the theory to create cloud computing portfolios, each one representing a suboptimal allocation of a set of VMs to a given cluster of hosts (PMs). Thus, the proposal seeks to maximize resource utilization (return), while minimizing underload, overload and SLA violations (risk).

The VMPM problem has a well-known NP-hard complexity [1], considering the large scale of cloud infrastructures. Accordingly, the contributions of this paper are: (i) a novel way to use the MPT for VMPM that largely reduces computational complexity and memory footprint by applying the Welford algorithm to incrementally compute statistics; (ii) the definition of a multi-dimensional resource metric that considers variance as a resource unbalancing factor; (iii) the use of the Gossip protocol to exchange data for VMPM; (iv) a distributed and decentralized proposal for VMPM decision-making that enables scalability, load balancing and fault tolerance; and (v) an extensive evaluation of the proposal using CloudSim Plus, a well-known, de facto standard simulator for cloud computing. To the best of our knowledge, this is the first proposal broadly applying the MPT that is computationally feasible for large-scale cloud providers.

The paper is organized as follows. Section 2 draws a parallel between financial markets and cloud computing resource allocation, their similarities and their differences; Sect. 3 presents the foundations of portfolio analysis using modern portfolio theory (MPT) and shows how that theory can be applied to VM placement and migration; Sect. 4 reviews related studies; Sect. 5 contains the MPT-based VM placement and migration proposal; Sect. 6 includes the proposal evaluation; and Sect. 7 finally presents the conclusion and suggestions for future work.

2 Parallelism Between Financial Markets and Cloud Computing Resource Allocation

Considering the aspects surrounding MPT, a parallel between financial and cloud computing markets can be clearly drawn. First, players and other elements involved in both markets must be identified, as proposed in Table 1.

Financial assets, such as stocks, are bought by investors expecting some return. Virtual computing resources (such as VMs, containers or storage services) are rented by customers for some time to run services that benefit both the customers and their end-users. The generic term “asset” can be used to represent a financial or computing asset. An investor can buy and hold an asset for a certain time, usually while it is meeting the goals of that investor. In the same way, a cloud customer can rent a resource for some time, while it is useful or valuable for him/her. Investors and cloud customers are both customers in their markets.

Table 1 Parallelism between financial markets and cloud computing resource allocation

Financial brokers trade assets belonging to someone else on behalf of customers. In cloud computing, providers mainly trade their own computing resources with customers, although they can also trade resources owned by other providers in a hybrid or federated cloud. With Amazon Web Services (AWS) e.g., the Spot Instances service [12] trades computing resources according to demand and supply, and defines prices dynamically.

Investment contracts and service level agreements (SLAs) are both contracts defining rights and duties for the parties involved, namely: (i) brokers and investors in financial markets and (ii) cloud providers and customers in cloud markets. These contracts also define penalties for infringing parties. Demand and supply are main drivers of asset prices in both markets. The demand for computing resources is determined by customers and the workload of their running services. This demand is highly dynamic and may be even harder to predict than in the financial market.

Buying and selling operations in the financial market usually have brokerage fees and taxes that investors must pay in addition to the asset price. Usually, the longer an investor holds an asset and waits to realize profits, the fewer fees and taxes he/she will be charged. Such charges impact investor net profit, so that the time in which these operations are performed has to be carefully decided. Similarly, VM placement and migration has to be performed in a timely manner. The interference and inherent overhead of such operations can be viewed as the fees and taxes to be paid. The collateral effect of these operations may reduce the performance of customer services and also impact other customers using the same physical infrastructure.

A cloud provider owns computing resources and may borrow them from other providers. Cloud customers rent such resources from the provider on a pay-per-use basis. In the financial market, on the other hand, investors buy assets through a broker, and those assets are owned by a third party such as a public company listed on the stock market.

There are numerous similarities between financial and cloud computing markets, as shown in Table 1. The foundations of MPT are based only on asset return and risk, where the latter is given by the variance of returns. This way, asset returns play a major role in the theory, explaining why it has been applied in different fields. Certainly, there are differences between the financial market and other markets such as cloud computing, e.g., regulatory issues. For instance, cloud providers such as Amazon Web Services (AWS) in China have a dedicated infrastructure that does not communicate with the global AWS infrastructure [13]. Despite such regulatory concerns, we were not able to identify other aspects that interfere with the application of the MPT to the VMPM problem.

3 Portfolio Analysis and VM Placement

3.1 Basic Concepts

A portfolio is a collection of assets, each one in a given proportion (also called allocation, share or weight). Correlation is one of the MPT foundations for computing portfolio risk, the latter represented as the volatility of the overall portfolio return. Correlation between two assets changes over time and must be estimated periodically to rebalance the portfolio and maintain risk at desired levels. A risk-return graph can be plotted for a set of portfolios composed of different assets, as presented in Fig. 1. A point in this graph represents a portfolio with a specific level of risk and return, according to the weights assigned to each asset. The graph for a set of different portfolios is usually a bullet-shaped curve.

Fig. 1
figure 1

Portfolio analysis: a set of randomly generated portfolios and the efficient frontier

For a given portfolio, any other one outperforming it, in terms of higher return or lower risk, lies to its north-west in the graph. The green line along the upper half of the curve is known as the efficient frontier. It is formed by optimal portfolios for which one cannot achieve either (i) a higher return without increasing risk or (ii) a lower risk without decreasing return. The efficient frontier is drawn from all points along the risk (X) axis that have the maximum return for that specific level of risk.

A portfolio p is defined as efficient if and only if there exists no other portfolio q such that:

$$\begin{aligned} \mu _{t_q} \ge \mu _{t_p} \wedge k_q < k_p \end{aligned}$$
(1)

where \(\mu _{t}\) is the mean return and k the risk for a given portfolio such as p or q [14].

The yellow horizontal line isolates the inefficient portfolios from the efficient ones, respectively at the lower and upper half of the curve. For a given level of risk there are at least two portfolios, as far as return is concerned: the least and the most profitable ones. The least profitable portfolio for a level of risk is a point at the edge of the lower half of the curve. On the other hand, for the same risk, one can realize the highest return by choosing the portfolio on the green line. As an example, in Fig. 1, for a risk of 10%, one can get a portfolio with the lowest return of about 1.7%. However, for the same risk there is a portfolio with the highest return of about 17% (10 times more).

In fact, the inefficient portfolios are not only those contained in the lower half of the curve. For a given level of risk, any portfolio not belonging to the efficient frontier has a lower return. The closer a portfolio is to that frontier, the more efficient it is [11]. Portfolios along the efficient frontier can also be viewed as in Fig. 2. Each point represents an efficient portfolio composed of different allocations of assets.

Fig. 2
figure 2

Efficient frontier: each circle is the most profitable portfolio for a given level of risk (Adapted from [15])

Consider that (i) only strategic asset allocation (which is based only on historical data) is being used to create portfolios and (ii) the goal is to minimize risk and maximize return. The only reasonable choice is a portfolio belonging to the efficient frontier. Any other one should be chosen only if a tactical allocation is being performed, based on market trends or beliefs. The efficient frontier (EF) is therefore a set of portfolios that can be formally defined as in (2):

$$\begin{aligned} EF = \{ p \in P \mid k_p = \min (k \mid \mu _t=\mu _{t_p}) \ \wedge \ \mu _{t_p} = \max (\mu _t \mid k=k_p) \} \end{aligned}$$
(2)

Let \(\mu _t\) be the mean return and k the risk of a portfolio such as p, and P the set of all generated portfolios. The efficient frontier contains every portfolio p whose risk is the minimum for that level of return (\(min(k|\mu _t=\mu _{t_p})\)) and whose return is the maximum for that level of risk (\(max(\mu _t|{k=k_p})\)). It also only contains portfolios whose risk and return are both higher than those of the least risky portfolio. This is a curve starting at the minimum risk and ending at the maximum return [16], as shown in Figs. 1 and 2. The return \(\mu _{t_p}\) of a portfolio p is the weighted sum of its assets' mean returns [16], as defined in (3):

$$\begin{aligned} \mu _{t_p} = \sum _{a=1}^N{t_a*w_a} \end{aligned}$$
(3)

where N is the number of assets, a represents a portfolio asset, \(t_a\) the mean return of that individual asset and \(w_a\) its weight. The return of a portfolio is straightforwardly computed using that equation. In contrast, computing the portfolio risk \(k_p\) is more complex: it is not simply the average of each asset's variance, since risk tends to decrease as more low-correlated assets are included.

The portfolio risk is represented by the variance of its assets' returns, computed as the sum over the elements of the so-called \(N \times N\) variance-covariance matrix [10], as in (4):

$$\begin{aligned} k_p = Var(t_p) = \sum _{a=1}^N{\sum _{b=1}^N{w_a*w_b*\sigma _{ab}}} \end{aligned}$$
(4)

where \(\sigma _{ab}\) is the covariance of returns between assets a and b. Considering that only a sample of the entire return history is being used, the sample covariance [10] is in turn computed as defined in (5):

$$\begin{aligned} Cov(a,b) = \sigma _{ab} = \frac{\sum _{i=1}^n{(t_{a_i} - \mu _{t_a}) * (t_{b_i} - \mu _{t_b})}}{n-1} \end{aligned}$$
(5)

where \(t_{a_i}\) is an individual sample of the asset return, \(\mu _{t_a}\) is the asset mean return (also denoted as the expected return E[t]), and n is the number of samples along the time period.
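To make Equations (3)–(5) concrete, the following minimal Java sketch computes the return and risk of a portfolio from a matrix of historical asset returns and a weight vector. Class and variable names are illustrative only and are not part of the proposal.

```java
/** Minimal sketch of Equations (3)-(5): portfolio return and risk
 *  computed from a history of asset returns. Names are illustrative. */
public final class PortfolioStats {

    /** Mean of one asset's return samples. */
    static double mean(double[] samples) {
        double sum = 0;
        for (double s : samples) sum += s;
        return sum / samples.length;
    }

    /** Sample covariance between two assets' returns, Eq. (5). */
    static double cov(double[] a, double[] b) {
        final double meanA = mean(a), meanB = mean(b);
        double sum = 0;
        for (int i = 0; i < a.length; i++)
            sum += (a[i] - meanA) * (b[i] - meanB);
        return sum / (a.length - 1);
    }

    /** Portfolio return, Eq. (3): weighted sum of asset mean returns. */
    static double portfolioReturn(double[][] returns, double[] w) {
        double r = 0;
        for (int a = 0; a < returns.length; a++)
            r += mean(returns[a]) * w[a];
        return r;
    }

    /** Portfolio risk, Eq. (4): sum over the variance-covariance matrix. */
    static double portfolioRisk(double[][] returns, double[] w) {
        double k = 0;
        for (int a = 0; a < returns.length; a++)
            for (int b = 0; b < returns.length; b++)
                k += w[a] * w[b] * cov(returns[a], returns[b]);
        return k;
    }
}
```

For instance, two perfectly negatively correlated assets with equal variance and equal weights yield a portfolio risk of zero, which is the diversification effect described above.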

3.2 Problem Formulation

The Markowitz optimization problem defined by the MPT [14] can be formulated as in (6):

$$\begin{aligned} \begin{aligned} \max {\sum _{a=1}^n{w_a*\mu _{t_a} - k_a}} \\ {\text {s.t. }} w_1 + ... + w_n = 1 \end{aligned} \end{aligned}$$
(6)

aiming to maximize the sum of the weighted (\(w_a\)) mean return of each asset (\(\mu _{t_a}\)), minus its risk (\(k_a\)), subject to the constraint that the weights add up to 1. Such a restriction ensures that all the wealth available for that portfolio is invested. The formulation thus maximizes returns while minimizing risk.

Portfolio analysis in MPT shows how to select asset weights so that the most profitable portfolio for a given level of risk can be found. In cloud computing, on the other hand, each customer explicitly sets the desired amount of resources, manually weighting them. Those amounts cannot be arbitrarily changed by the provider, at the risk of degrading the performance of customer services and violating SLAs. Alternatively, the provider may dynamically allocate resources to customers, up to the contracted amount, so that SLA violations and resource under/over utilization are minimized. However, this scenario is not considered here.

A cloud portfolio can therefore be defined as a group of VMs to be placed inside a given cluster of PMs, in order to maximize return and minimize risk. This set of VMs will be called a cluster portfolio, where the return is the mean percentage of resource utilization and the risk is its variance.

VMs are abstractions for the computing resources being used from a PM [17, 18]. The benefits of diversification, presented earlier, can be achieved in the cloud by selecting distinct VMs from different customers to compose a portfolio. Diversification relies on the imperfect correlation between two assets. Inverse (negative) correlation is the most impactful factor for risk reduction: as one asset's price goes down, the other's goes up. This way, losses on one asset are compensated by gains on another. There are several proposals considering correlation when placing and migrating VMs. Some of them have shown correlation between different computing resources such as CPU and RAM [3, 4, 6, 8, 19,20,21,22,23,24].

Some authors state that when there is a higher correlation between computing resources usages, there is a higher probability of the hosting PM becoming overloaded [25, 26]. The reason is that as the usage of a resource rises, the directly correlated ones also tend to rise, increasing the overall resource usage.

Regarding the VMPM problem, the selected strategy has to ensure that the resources required by all VMs to be placed into a given PM/cluster do not exceed that PM/cluster's capacity, as defined in (6). Exceeding it causes oversubscription, which usually should be avoided to meet SLAs.

While in MPT the allocation of all financial assets must add up to 100%, commonly in cloud infrastructures that percentage should be lower than a given threshold for any type of resource. That ensures there will be available resources for PMs internal operations, possible VM vertical scaling, workload bursts and even for overhead expected during VM migrations.

Before formulating the VMPM problem, weights have to be addressed. These define the fraction of each asset selected for a portfolio and are used to adjust risk. Different portfolios usually share the same assets in equal or different proportions. Reducing the weight of riskier assets tends to reduce the overall portfolio risk. Considering a VM as an indivisible cloud asset, the provider cannot assign just a fraction of it to a given cluster. This way, VM weights are always 1 and can be removed from the formulation, as presented in (7).

$$\begin{aligned} \begin{aligned}&\max { \sum _{a=1}^n{\mu _{t_a} - k_a} } \\&{\text {s.t. }} \mu _{t_1} + ... + \mu _{t_n} \le threshold \end{aligned} \end{aligned}$$
(7)

Considering \(\mu _{t_a}\) as the overall return for VM a and \(k_a\) its risk, the sum of returns for all VMs inside a portfolio is not expected to exceed a defined threshold of the total PM/cluster capacity. VM return is the utilization mean of its resources, namely CPU, RAM and network traffic for this proposal, as defined in (8):

$$\begin{aligned} \begin{aligned} \mu _{t_a} = average(\mu _{cpu}, \mu _{ram}, \mu _{inv\_net}) \end{aligned} \end{aligned}$$
(8)

where \(\mu\) variables are the VM mean utilisation for CPU, RAM and network traffic (normalized between 0 and 1). For network traffic, the utilization mean is inverted since the lower the traffic the better. That is a specific metric which will be detailed in Equation (9) of Sect. 5. There are other variables that could be included in the equation, such as storage, I/O throughput, and network QoS metrics. Storage space is commonly not an issue for current cloud providers. This is an abundant and cheap resource [27], mainly after the availability of Storage Area Networks (SANs). Therefore, the most critical resources in a cloud data center were considered, while including more variables than some related works presented in Sect. 4.

Although the resource utilization average is a unidimensional metric for a multidimensional problem (as surveyed in [1]), the formulation in (7) subtracts the risk from the maximization function. According to the MPT foundation, risk is a volatility metric based on the variance of returns. If the variance across a VM's resources is high, the utilization of its individual resources is unbalanced. Subtracting the risk therefore reduces the VM's utility according to its resource utilization volatility and, consequently, reduces unbalancing.

Although weights are removed from the formulation, they are implicitly represented as the number of VMs assigned to each cluster or host. However, to reduce the number of decisions to be made and the computational complexity of the proposal, those weights are not input variables, but final results from a given VM placement. Those weights could be adjusted after the proposed placement, but this kind of tuning is not addressed here.

4 Related Work

There is a plethora of proposals for VM placement and migration available in the literature, as surveyed in [1]. The majority use traditional methods to compute statistics, based on resource utilization history collected over long time periods. That leads to overhead on CPU and memory in order to make VMPM decisions, due to the volume of collected data. Other proposals apply heuristics such as Ant Colony Systems or Simulated Annealing, which are known to lack scalability for large scale infrastructures. This section presents some of those proposals that apply correlation between VMs and/or the modern portfolio theory.

4.1 Dynamic Correlative VM Placement

An MPT-based VM placement and migration proposal is presented in [20]. It considers correlation between multiple VMs when generating a placement scheme for a given PM, so that time-varying (anti-correlated) resource demands are multiplexed. This means that VMs with different resource peak times are placed together to maximize resource usage.

The proposed scheme is based on VM demand prediction and volatility, using the auto regressive integrated moving average (ARIMA) model to predict the k-step-ahead utilization of a VM resource. The generalized auto regressive conditional heteroscedasticity (GARCH) model is used to predict the standard deviation of resource usage, so as to reduce the risk of prediction-based VM placement.

Although the proposal uses correlation among a group of VMs, it is not clear how MPT is applied as a whole. The only MPT foundations the proposal relies on are risk and VM correlation. MPT relies on the creation of multiple portfolios with different allocations of the available assets, so that they can be plotted together to select the best one, minimizing risk and maximizing return.

The work presents a first-fit greedy algorithm that places each VM into the first PM whose expected load does not exceed its capacity. Finally, experiment results are assessed based only on computational simulation using synthetic data. Datacenter traces or testbeds are not used to validate such results, which are presented without any simulation parameters or confidence intervals.

4.2 MPT-Based Static Single Resource Management

Portfolio theory-based algorithms are proposed in [28] for QoS-aware resource assignment in cloud environments, in order to maximize resource usage and minimize energy consumption. The proposal considers the resource requirement as a random variable x, instead of using historical data. However, since the cumulative distribution function (CDF) of x is usually not known, the amount of required resources is estimated using the Cantelli inequality [28]. The paper shows that this estimation tends to allocate more resources than required; if the distribution of the random variable is known, the estimation can be more precise.

The proposal applies only a subset of the MPT to reduce the standard deviation (\(\sigma\)) of VM resource requirements, mapping a static set of VMs to PMs. The work generates and selects the VM portfolio for a given PM that minimizes \(\sigma\). However, only CPU usage is considered as a resource. A hierarchical resource management solution is presented in which (i) a cloud-level manager maps VMs to clusters of PMs and (ii) cluster-level managers map each VM to a specific PM. Cluster managers work independently, in parallel.

Although a distributed solution is proposed, it uses a central cloud manager accountable for allocating VMs across the clusters, which is not suitable for current large-scale cloud infrastructures. That manager may not be able to generate a high number of different portfolios for selecting optimized sets of VMs to distribute among clusters. Clusters may thus not receive a set of VMs that enables them to build sufficiently optimized (suboptimal) portfolios. This issue is even acknowledged in a later publication by the same authors [29]. Collaboration between clusters would give better results, as presented in [30, 31].

The proposal is assessed only by simulation and only using synthetic data. It outperforms the well-known simulated annealing (SA), first fit decreasing (FFD) and best fit decreasing (BFD) heuristics.

4.3 MPT-Based Dynamic Multiple Resource Management

The aforementioned work is extended in [29] by enabling resource allocation for dynamically arriving VMs and multiple resources. The previous work considered only the CPU requirements of VMs as random values to be estimated. This extended proposal, however, only considers VM workload intensity, a more general metric, as the random value to be estimated. The workload intensity X of a VM n is defined as the average number of jobs running inside that VM. The requirement R for a resource k of a VM n is estimated using a linear regression in terms of such workload intensity [29]. The authors state that the model is reasonable since there is a correlation between workload intensity and resource requirements.

The work presents a first fit decreasing (FFD) algorithm for mapping VMs to a cluster of PMs. It computes the cost of a VM n as a risk factor \(\sigma ^2/\mu\), where \(\sigma ^2\) is the variance of resource requirements and \(\mu\) is the mean. It then assigns VMs with the highest cost to the cluster of PMs with the highest capacity.

Since the hierarchical solution may not produce such optimized results, as discussed in the previous section, the authors propose a joint FFD algorithm which iterates over PMs from all clusters and selects VMs according to their workload intensity.

The major issue with this joint solution is the size of the problem for a large number of PMs across all clusters. This approach may provide better placements compared to the hierarchical proposal, since it has complete information about all VMs and PMs. However, trying to solve such a complex problem using all that information in a centralized way is impractical for large-scale data centers. That is one of the reasons why papers in the literature pursue suboptimal solutions based on a partial view of the information. Although the authors state that the hierarchical algorithm is highly scalable, the global manager is a single point of failure and is likely to become a bottleneck.

A local manager also performs VM migration when over or underutilized PMs are detected. It allows VMs from overloaded PMs to be migrated intra-cluster, when there is any PM able to offload the overloaded one. If no PM in the cluster has enough capacity to host VMs from an overloaded PM, inter-cluster migration is requested.

5 MPT-Based VM Placement Proposal

5.1 Proposal Overview

Correlation among assets plays a leading role in MPT, so the MPT-based VM placement proposal (henceforth, MPT VMPM) considers it when placing VMs into clusters of PMs. Portfolio creation depends on the return of each asset, which can be computed based on different metrics, such as: (i) the VM utilization level, i.e., the resource demand (the closer to an upper utilization threshold the better); (ii) the network traffic and distance (the lower the better); (iii) task completion time (the lower the better); (iv) other metrics or a composition of them.

The work of Wei et al. [20] presented in Sect. 4.1 considers the resource demand of a VM as the return and its volatility as the risk. The present work proposes an average resource utilization metric to define the return of a VM, as already presented in (7) and (8).

The rationale for relying on VM resource usage to compute the return is the principle that the more resources are used, the better. Underloaded VMs may lead to underloaded PMs, which causes wastage of resources. Furthermore, several works have shown that an idle PM can account for up to 70% of the energy consumption of a fully utilized PM [1]. Although energy efficiency is improving with new technologies, idle allocated resources are a waste, since they could be used by other customers.

According to the VMPM problem formulation in (7), the proposal aims at maximizing returns. However, for resources such as network traffic, minimizing their use provides more benefits, which also means maximizing returns. Minimized traffic reduces: (i) task wait time, which leads to reduced task completion time; and (ii) communication delay and network congestion. In order to achieve that goal, a weighted network traffic metric is defined in (9):

$$\begin{aligned} \mu _{inv\_net} = (vm_{bw} - bw)/hops \end{aligned}$$
(9)

where \(vm_{bw}\) is the VM bandwidth capacity, bw is the VM bandwidth consumption and hops is the number of network hops between inter-communicating VMs. That metric is used as a way to invert bandwidth consumption, while ensuring the resulting value lies within [0, 1]. In order to minimize traffic, inter-communicating VMs should be placed as close as possible by applying that metric. The use of bandwidth and hops in network metrics is very common, and the literature shows their utilization in different ways [32, 33].
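As an illustration, the sketch below combines Equations (8) and (9) in Java, assuming (as stated above) that all utilization and bandwidth values are already normalized to [0, 1]; parameter names are illustrative.

```java
/** Sketch of Equations (8) and (9); parameter names are illustrative. */
final class VmReturn {
    /** Eq. (8): VM return as the mean of CPU, RAM and inverted network
     *  utilization. All inputs are assumed normalized to [0, 1]. */
    static double of(double cpuUtil, double ramUtil,
                     double vmBw, double bwUsed, int hops) {
        // Eq. (9): inverted, hop-weighted network metric. With normalized
        // bandwidth values and hops >= 1, the result stays within [0, 1].
        final double invNet = (vmBw - bwUsed) / hops;
        return (cpuUtil + ramUtil + invNet) / 3.0;
    }
}
```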

5.2 Portfolios Generation

The reasoning behind MPT relies on the generation of multiple portfolios with random weights for each asset, as depicted in Fig. 1. That sample chart was drawn from 100,000 random portfolios. From these portfolios, one is supposed to be selected, representing the actual allocation of a subset of VMs (assets) into some datacenter or cluster. Here, only clusters are taken into account. Now consider 1000 candidate clusters into which VMs from a selected portfolio may be placed. That would require a total of 100 million portfolios to be generated across all clusters, just to select one for each cluster. That imposes a huge memory and processing overhead, as will be presented in Sect. 5.4. Furthermore, considering that the allocation of VMs into data centers/clusters is re-evaluated periodically, that overhead may impact regular cluster operation.

Alternatively, in order to provide a low-overhead solution that is suitable for large-scale cloud providers, the current proposal does not generate such a huge number of portfolios for a single cluster. When a new VM arrives and does not yet have a utilization history, it is placed inside some cluster by applying a well-known algorithm such as First-Fit (FF). As utilization metrics are computed over time, some clusters are periodically and randomly selected to have their portfolios re-evaluated. That is performed by recomputing portfolio risk and return upon the arrival of VMs. Arrivals occur when new VMs are submitted to the cloud or finished ones are re-started.

Since cluster controllers can receive requests to place arriving VMs and there may be multiple controllers for each cluster, numerous potential portfolios can be computed to evaluate the placement of those VMs. Arriving VMs are placed into the cluster with the highest return and/or lowest risk. Accordingly, the current proposal defines a particular utility function, presented in Algorithm 1. It computes the portfolio utility (which has to be maximized) for placing a set of VMs into a target cluster, comparing the utility of a cloned portfolio containing the new VMs against the original cluster portfolio. The function is detailed as follows (a simplified sketch is shown after the list):

  • Line 2—Checks if the risk of the other potential portfolio has been decreased at least by the defined threshold (\(MIN\_RISK\_DEC\)).

  • Line 3—Checks if the return has been increased at least by the defined threshold (\(MIN\_RETURN\_INC\)).

  • Line 4 to 7—If there is a minimum risk decrease or return increase compared to the original cluster portfolio, it computes the new portfolio utility to enable ranking that portfolio among other possible placements into different clusters; otherwise, the utility is considered zero, so that such a target cluster is ignored for that VM arrival.
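A simplified sketch of that decision logic is shown below. The threshold names mirror the ones above, while the exact utility expression is an assumption: it is approximated here by the return/risk ratio discussed in Sect. 5.5.

```java
/** Sketch of the utility function described above (Algorithm 1). The
 *  exact utility expression is an assumption, approximated here by the
 *  return/risk ratio that the proposal seeks to maximize. */
final class PlacementUtility {
    /** Minimal portfolio summary: mean return and risk (variance). */
    record Portfolio(double meanReturn, double risk) { }

    static double of(Portfolio original, Portfolio cloned,
                     double minRiskDec, double minReturnInc) {
        // Line 2: has the risk decreased at least by MIN_RISK_DEC?
        final boolean riskDecreased =
            cloned.risk() <= original.risk() - minRiskDec;
        // Line 3: has the return increased at least by MIN_RETURN_INC?
        final boolean returnIncreased =
            cloned.meanReturn() >= original.meanReturn() + minReturnInc;
        // Lines 4 to 7: rank the candidate cluster only if there is a
        // minimum improvement; otherwise ignore it for this VM arrival.
        if (riskDecreased || returnIncreased) {
            return cloned.meanReturn() / cloned.risk();  // return/risk ratio
        }
        return 0;
    }
}
```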

5.3 Mean and Variance for MPT

The MPT formulation relies on the covariance to compute risk (5), which requires a history of sample data. Considering that these samples represent VM resource utilization, the higher the collection frequency and the longer the collection period, the more accurate the VM return (mean utilization) and risk (volatility) become. However, for large-scale cloud data centers with thousands of machines and hundreds of thousands of VMs, keeping such a utilization history imposes a heavy memory footprint and largely increases computational complexity in the long run, as the number of samples grows. Proposals such as [28, 29], which rely on the regular MPT formulation, have this memory and CPU overhead, which may not be feasible for actual large-scale cloud providers. The overhead of a proposed solution should not compromise the SLAs of current customers.

Alternatively, the Welford online algorithm [34] is applied to incrementally compute portfolio return and risk. This is a single-pass approach that updates the mean and variance for each new sample, without storing the samples. The algorithm was chosen because it is well known and produced precise results in our tests, compared to the regular computation of mean and variance. It is defined in (10):

$$\begin{aligned} \mu _i&= \mu _{i-1} + \frac{\delta _{i-1}}{N} \\ \delta _i&= x_i - \mu _i \\ \sigma ^2_i&= \sigma ^2_{i-1} + (\delta _i * \delta _{i-1}) \end{aligned}$$
(10)

where N is the number of samples, \(x_i\) and \(\delta _i\) are a sample and its deviation from the current mean \(\mu\) for time i; \(\sigma ^2_i\) is the variance for time i.
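For reference, a standard single-pass (Welford) update is sketched below in Java. Its notation differs slightly from (10), but it maintains the same running statistics without storing the sample history.

```java
/** Standard single-pass (Welford) update of mean and variance. The
 *  notation differs slightly from Eq. (10), but the statistics are the
 *  same and no sample history is stored. */
final class OnlineStats {
    private long n;        // number of samples seen so far
    private double mean;   // running mean
    private double m2;     // running sum of squared deviations from the mean

    void add(double x) {
        n++;
        final double delta = x - mean;   // deviation from the previous mean
        mean += delta / n;
        m2 += delta * (x - mean);        // uses deviation from the updated mean
    }

    double getMean()     { return mean; }
    double getVariance() { return n > 1 ? m2 / (n - 1) : 0; }  // sample variance
}
```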

In order to assess the accuracy of Welford's algorithm compared to the traditional way of computing statistics, we created a simple experiment using a pseudo-random number generator (PRNG) to generate 100,000 samples and compute the mean, variance and standard deviation with both methods. Samples were uniformly distributed between 0 and 10. The experiment was executed multiple times using different seeds, showing that the Welford algorithm matches the traditional method up to the 12th decimal place for these 3 metrics. Results are presented in Table 2, using the seed provided in Table 3.

Table 2 Comparing the accuracy of statistics computed using the Welford algorithm (1st line) and the traditional method using the entire sample history (2nd line), for 100,000 samples uniformly distributed between 0 and 10
Table 3 General simulation parameters

5.4 Computational Complexity & Memory Footprint Analysis

The online algorithm (10) computes mean and variance in a single pass, updating those statistics as new samples arrive. That provides a constant computational complexity O(1) per update, instead of the linear O(n) worst case, where n is the number of samples. Similar approaches for rapidly computing such statistics were previously presented in [34].

Consider that (i) the cloud provider has 10k hosts and 100k VMs; (ii) CPU, RAM and network utilization metrics are collected every 30 min throughout 1 year; and (iii) a metric is a double value requiring 8 bytes of memory. The total number of samples collected for a single metric by a proposal that stores those values, for just one VM, will be 17520, defined as:

$$\begin{aligned} n = 2 \text { samples/hour} * \text { } 24 \text { hours} * 365 \text { days} \end{aligned}$$

The total memory footprint of a proposal that stores those samples, taking all metrics and VMs, will be:

$$\begin{aligned} n * 8 \text { bytes} * 3 \text { metrics} * 100k \text { VMs} \end{aligned}$$

That is almost 40 GB of historical data per year. The memory footprint is initially negligible, considering the entire cloud provider infrastructure, but it may become an issue in the long run if the entire sample history is stored. Although that footprint is irrelevant for an entire cloud infrastructure, the computational complexity O(n) of computing metrics over such a high volume of data is not. It may pose a huge overhead as the sample history grows. Furthermore, that complexity is infeasible for simulation environments, restricting the scale of experiments.

Applying the Welford algorithm for this same scenario and considering the 6 variables in (10), the memory footprint is reduced to: \(6 \text { variables} * 8 \text { bytes} * 3 \text { metrics} * 100k \text { VMs}\). The result is a constant value around 14 MB of memory for the given number of VMs, or a negligible 144 bytes per VM.
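For reference, the arithmetic behind the two figures just discussed (the stored-history footprint versus the constant Welford state) is:

$$\begin{aligned} 17520 \times 8 \text { bytes} \times 3 \times 100000&= 42\,048\,000\,000 \text { bytes} \approx 39 \text { GiB} \\ 6 \times 8 \text { bytes} \times 3 \times 100000&= 14\,400\,000 \text { bytes} \approx 14 \text { MB} \end{aligned}$$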

The adopted approach to computing statistics improves the scalability of the proposal for both large-scale simulations and actual cloud infrastructures. Those improvements reduced the time for running such simulations, with the parameters presented in Table 3, from hours to minutes (using a 2.8 GHz dual-core hyper-threading Intel i7-4558U PC with 8 GB of 1600 MHz DDR3 RAM).

5.5 Proposed Architecture

MPT VMPM is a distributed proposal with decentralized management, as presented in Fig. 3. Hosts inside cloud data centers are grouped into clusters. Each cluster has at least one controller node that assesses the placement of VMs inside that cluster. A controller can be either a specific PM used exclusively for this task or a regular PM hosting customer VMs. A cluster controller can be selected either (i) manually (at least for the first time), according to its processing capacity; or (ii) randomly (the approach followed here for simplicity), mainly in case of a controller failure.

Fig. 3
figure 3

Overview of the proposal architecture: physical machines inside a cluster with at least 1 controller. Datacenter machines are grouped into different clusters, where each one can have multiple controllers

PMs are grouped into clusters of equal size, as presented in Table 3. Clusters are created by splitting the set of all existing PMs into subsets. Network topology usually determines how PMs are grouped into clusters, but the proposal is agnostic on that point. Different cluster sizes and heterogeneity levels will certainly impact performance and results. However, due to the current complexity and scale of the simulations and the limited infrastructure, it was not possible to include these variables in the experiments; they are left as future work.

The proposal considers the datacenter brokers available in current cloud provider infrastructures in order to randomly select cluster controllers. Brokers are accountable for responding to customer requests to manage VMs (create, clone, destroy, etc.) and for dispatching those requests to controllers for evaluation of VM placement/migration. This way, an existing component of the cloud infrastructure is extended to provide new capabilities, instead of adding a new one. Considering the scale of global cloud infrastructures, such brokers are already distributed so that they can handle a large number of concurrent requests from multiple customers.

Since the number of hosts and VMs inside a cluster is usually large and imposes a decision-making overhead that may affect customer SLAs, multiple nodes inside a cluster can be chosen as controllers. Besides distributing that overhead across multiple controllers, this approach enables load balancing inside the cluster, while providing fault tolerance for the controller role. In such a multi-controller-per-cluster scenario, there is no defined leader and each controller has the same responsibilities. That is an important feature for distributed systems, since in case of a controller failure, any available node can replace it.

Figure 4 gives a detailed view of the architecture. The set of VMs placed into the hosts of some cluster represents the cluster portfolio, having a given risk (k) and return (\(\mu _t\)). When the broker receives a request to place VMs that have a utilization history, it sends that history and the VMs' resource requirements to randomly selected clusters. Target clusters recompute their risk and return for placing the arriving VMs and reply to the broker with the new risk and return values, according to the utility Function 1.

Fig. 4
figure 4

MPT VMPM architecture using the gossip protocol: stages of the VM placement assessment across multiple clusters in a given datacenter

The aforementioned function aims to increase the return/risk ratio, which is a simplified version of the Sharpe Ratio [10] adapted to the VMPM problem. The simplification was required since the original equation includes some financial market metrics that have no correspondence in the cloud computing market. The returnRiskRatio variable defines the amount of return per unit of risk and is one of the variables to be maximized: the higher the return for a given level of risk, the better. A higher ratio therefore provides more utility for the proposed placement.

After receiving the replies from target clusters, the broker ranks them by the return of the utility function in a decreasing order. The cluster providing the highest utility is selected to place arrived VMs. In case an arriving VM has no utilization history, a First-Fit algorithm is applied to find the first suitable Host for that VM.
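A broker-side sketch of this selection step is given below. The ClusterController abstraction and its evaluatePlacement method are assumptions for illustration only; in the proposal, these requests and replies are exchanged via the Gossip protocol described next.

```java
import java.util.List;

/** Broker-side sketch of the placement flow in Figs. 4 and 5. The
 *  ClusterController type and its evaluatePlacement method are
 *  illustrative; in the proposal, requests and replies travel between
 *  brokers and cluster controllers via the Gossip protocol. */
final class BrokerPlacement {
    interface Vm { }
    interface ClusterController {
        /** Clones the cluster portfolio with the arriving VMs, recomputes
         *  risk and return, and replies with the resulting utility. */
        double evaluatePlacement(List<Vm> arrivingVms);
    }

    /** Returns the candidate cluster with the highest utility, or null if
     *  every candidate replied with zero utility (the caller then falls
     *  back to First-Fit placement). */
    static ClusterController selectTargetCluster(
            List<Vm> arrivingVms, List<ClusterController> candidates) {
        ClusterController best = null;
        double bestUtility = 0;
        for (ClusterController cluster : candidates) {
            final double utility = cluster.evaluatePlacement(arrivingVms);
            if (utility > bestUtility) {
                bestUtility = utility;
                best = cluster;
            }
        }
        return best;
    }
}
```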

In order to enable such requests between brokers and cluster controllers, a distributed communication protocol must be introduced. The proposal uses the Gossip epidemic protocol to perform a distributed search, sending VM placement requests across neighboring clusters. The MPT VMPM algorithm that performs such a flow is presented in Fig. 5. Considering the large scale of cloud infrastructures and the low response times required by VMPM proposals to promptly select target PMs for VMs, the Gossip protocol was chosen due to its: (i) high scalability, (ii) decentralized management, (iii) peer-to-peer propagation of data across a network of nodes and (iv) fault tolerance [30, 31, 35]. Those are crucial characteristics for distributed systems such as the proposed one.

Fig. 5
figure 5

MPT VM placement general algorithm (MPT VMPM)

Consider that there are multiple brokers and controllers communicating using the Gossip protocol. Each broker that receives a request to place VMs can randomly select one of the available controllers in a target cluster to send a request for placement evaluation. This way, different controllers in the same cluster receive different requests, balancing the load. However, multiple brokers were not simulated in the experiments.

Although the presented architecture relies on network communication, Figs. 3 and 4 are not intended to represent any network topology, since the proposal is agnostic in that regard.

6 Proposal Evaluation Using CloudSim Plus

6.1 CloudSim Plus Overview

The MPT VMPM proposal is evaluated by extensive and large-scale simulations using CloudSim Plus, our open source cloud computing simulation framework [36], available at https://cloudsimplus.org. CloudSim Plus currently represents the state-of-the-art in general cloud computing simulation. Several other simulation frameworks have been built using it as the underlying simulation engine, including RECAP-DES [37], LEAF [38], PureEdgeSim [39], SatEdgeSim [40] and others.

CloudSim Plus is a completely re-engineered project that largely improves simulation scalability and fixes critical bugs that jeopardize simulation accuracy, correctness and result reliability. Furthermore, the framework has been actively maintained since 2016 and this proposal is based on its version 7.2.0. Some of the exclusive features of the framework include:

  • Accurate power consumption models; joint power- and network-aware simulations; parallel execution of simulations;

  • Host fault injection; virtual memory and reduced bandwidth allocation due to oversubscription; closest datacenter VM placement; grouped VM placement;

  • Horizontal and vertical VM scaling; etc.

6.2 Methodology

This research work follows a Design Science Research (DSR) approach [41, 42]. Starting from the VMPM problem, different proposals were studied, as presented in [1]. The Modern Portfolio Theory (MPT) was selected and adapted, considering: (i) that we could not find any work comprehensively applying the theory to the VMPM problem; (ii) the suitability of the MPT for that problem, as has been shown throughout this work; and (iii) the possibility of applying an incremental approach for computing the fundamental statistics that MPT relies on (mean, variance and covariance) to reduce computational complexity and memory footprint, which had not been applied before. The proposal was designed as presented in Sect. 5.5, and an experimental approach was followed by creating simulation scenarios using the CloudSim Plus framework, which improves simulation correctness. Finally, experiment results were collected following a scientific approach, computing confidence intervals (CIs) for data from multiple simulation runs.

The simulation experiments presented here were created from both synthetic data (using pseudo-random number generators, PRNGs) and trace files of VMs running on PlanetLab [43] (a former global research network) and Google Cluster. These three different sources were used together in all experiments, providing heterogeneous workloads, as happens with actual cloud providers. The simulation is designed as described below, and its parameters are presented in Tables 3, 4 and 5.

Table 4 AWS EC2 instance types used for creating VMs: prices and configuration for Linux instances in the US East (Ohio) region
Table 5 Gossip/MPT parameters to enable clusters to trade VMs using the MPT VMPM proposal

The experiments run in a simulated cloud infrastructure with one datacenter (the number of data centers is restricted in order to reduce simulation time). A fixed number of statically created VMs is submitted to the cloud infrastructure when the simulation starts, defining an initial workload. During simulation execution, new VMs arrive following a Gaussian distribution with a given mean. Those VMs represent the dynamic requests and workload the cloud provider receives all the time. Created VMs have a defined capacity, according to actual VM instance types from the AWS Elastic Compute Cloud (EC2) service, as presented in Table 4. Such instance configurations are based on AWS since it is a worldwide cloud provider, and other providers follow similar configurations when setting the capacity of their VM instances.

VMs are managed as black boxes, meaning the cloud provider does not know the applications running inside them. Applications (cloudlets [36]) are created merely to simulate some kind of workload inside each VM. The VM workload is based on the CPU, RAM and bandwidth (BW) utilization of those applications. Each application applies some model to define how those resources are used over time. RAM and BW utilization is based on a uniform PRNG, whereas CPU utilization is based on Gaussian and uniform PRNGs and also on PlanetLab trace files. This way, different kinds of workloads are simulated, as expected in public cloud providers.
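The sketch below illustrates, independently of the CloudSim Plus API, how the synthetic part of such a workload can be generated. The seed, mean and standard deviation shown are arbitrary assumptions rather than the values from Table 3.

```java
import java.util.Random;

/** Illustrative sketch (independent of the CloudSim Plus API) of the
 *  synthetic part of the workload: RAM and BW utilization drawn from a
 *  uniform PRNG, CPU utilization from a Gaussian PRNG, all values
 *  clamped to the [0, 1] utilization range. */
final class SyntheticUtilization {
    private final Random rng = new Random(42);    // fixed seed for repeatability

    double ramOrBwUtilization() {
        return rng.nextDouble();                   // uniform in [0, 1)
    }

    double cpuUtilization(double mean, double stdDev) {
        final double sample = mean + stdDev * rng.nextGaussian();
        return Math.max(0, Math.min(1, sample));   // clamp to [0, 1]
    }
}
```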

The evaluation of the MPT VM placement proposal is conducted by executing multiple runs of different simulation experiments to assess and compare results, as presented below:

  1. (E1)

    Random VM placement: places VMs into suitable hosts randomly selected following a uniform distribution, using a PRNG. That is a naive approach simply to show how the FF and MPT proposals outperform it.

  2. (E2)

    MPT Random VM placement: it works the same way as the experiment E1 above, using the initial random VM placement. Then it applies the MPT proposal to improve the initial placement.

  3. (E3)

    First-Fit (FF) VM placement: a low-complexity round-robin algorithm that tries to place each VM on the first suitable host found inside any available datacenter cluster. That host keeps being used to place consecutive VMs. When it is no longer suitable for a VM, the next host is selected for the current and following VMs (a sketch of this policy is shown after the list).

  4. (E4)

    MPT FF VM placement: the proposed trading MPT algorithm that periodically negotiates random exchanges of VMs between neighboring clusters, recomputing the cluster risk and return. If such metrics are improved, according to the utility Function 1, VMs are actually migrated between clusters. The proposal uses the previous FF algorithm for initial VM placement.
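The sketch below illustrates the First-Fit policy used in E3 (and as the initial placement in E4). The Host and Vm interfaces are illustrative abstractions, not CloudSim Plus classes.

```java
import java.util.List;
import java.util.Optional;

/** Sketch of the First-Fit policy described in E3: the host found last
 *  keeps receiving VMs until it can no longer fit one, then the scan
 *  moves on (wrapping around the host list). */
final class FirstFitPlacement {
    interface Vm { }
    interface Host { boolean isSuitableFor(Vm vm); }

    private final List<Host> hosts;
    private int lastIndex;                         // host currently being filled

    FirstFitPlacement(List<Host> hosts) { this.hosts = hosts; }

    Optional<Host> findHostFor(Vm vm) {
        for (int i = 0; i < hosts.size(); i++) {
            final int idx = (lastIndex + i) % hosts.size();
            if (hosts.get(idx).isSuitableFor(vm)) {
                lastIndex = idx;                   // keep filling this host
                return Optional.of(hosts.get(idx));
            }
        }
        return Optional.empty();                   // no suitable host: VM waits
    }
}
```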

6.3 Result Analysis

6.3.1 Metrics and Experiment Types

Considering the experiments just enumerated, this section presents simulation results for the following evaluated metrics:

  1. (M1)

    Active Hosts Avg (%): average percentage of active hosts during simulation execution, considering the total number of hosts on the datacenter across all clusters. The lower this metric is, the better, since the VM placement algorithm is consolidating the maximum number of VMs into the minimum number of hosts. However, that can lead to overload.

  2. (M2)

    Active Hosts Max (%): maximum percentage of active hosts during simulation execution, following the same reasoning of the previous metric. However, an increased value may indicate that the server consolidation process is not sufficiently effective, which will require the activation of a higher number of hosts at some point in time.

  3. (M3)

    Used CPU cores Avg (%): average percentage of used CPU cores in active hosts. This is used as a way to indicate how loaded a server is, according to the number of cores used. If the server is active, the highest number of cores used that does not overload the host is desirable. An increased value improves the cost/benefit relation of the host, since it reduces resource waste and balances power consumption [1].

  4. (M4)

    Used CPU cores Max (%): maximum percentage of used CPU cores for active hosts, following the same reasoning of M3.

  5. (M5)

    CPU overload mean (%): average percentage of CPU load for times hosts are overused, considering a defined threshold. This indicates how overloaded hosts are on average. Overload is well known to cause issues for companies, cloud provider infrastructure and customers, as broadly discussed in the literature surveyed in [1].

  6. (M6)

    CPU overload time (%) from uptime: percentage of the time, on average, hosts are overused (considering their uptime) above the threshold discussed in the previous metric. The longer this overload time is, the higher the negative impacts will be. An increased value may indicate that the VM placement algorithm is not sufficiently effective in distributing VMs, since overload is the major issue to address.

  7. (M7)

    CPU underload mean (%): average percentage of CPU load for times hosts are underused, considering a defined threshold. In contrast to metrics such as M5 and M6, this one indicates resources are being wasted, while fewer hosts could be activated to meet demand and reduce power consumption.

  8. (M8)

    CPU underload time (%) from uptime: percentage of the time, on average, hosts are underused (considering their uptime) below the threshold discussed in the previous metric. An increased value may indicate the VM placement algorithm is scattering VMs in more hosts than required, which may increase power consumption and resource waste.

  9. (M9)

    Hosts CPU mean load for non under/overload times (%): percentage of host CPU capacity used during regular (neither underloaded nor overloaded) times. The closer the value is to the CPU upper utilization threshold the better, since it maximizes resource utilization while trying to avoid under and overload.

  10. (M10)

    Hosts uptime (%) mean: percentage of time (according to the total simulation time) hosts are active on average. The higher this value for non under or overloaded hosts the better, following the same reasoning for the previous metric.

  11. (M11)

    Total overload hosts (%): percentage of overused hosts along the total simulation time.

  12. (M12)

    Total underload hosts (%): percentage of underused hosts along the total simulation time.

  13. (M13)

    Hosts startup/shutdown power consumption (MW): the total power consumed by all activated hosts during startups and shutdowns.

  14. (M14)

    Hosts total power consumption (GW): the total power consumed by all activated hosts during entire simulation execution.

  15. (M15)

    Hosts startups mean: mean number of times hosts are powered on (since an idle host may be shutdown). The lower this value is, the less frequently hosts are switched on and off, which may reduce the time VMs have to wait for a host to be activated.

  16. (M16)

    Delayed VM mean wait time (seconds): mean time VMs have to wait for a host being powered on (due to host boot time), excluding VMs that were promptly assigned to an already active host.

  17. (M17)

    Delayed VMs (%): percentage of VMs that had to wait for a host activation before the placement could be performed.

Experiments are classified in two categories:

  • Random: E1—places all VMs randomly; E2—applies the MPT placement for VMs having some utilisation history and the random placement otherwise.

  • First-Fit (FF): E3—places all VMs using a FF algorithm; E4—applies the MPT placement for VMs having some utilisation history and the FF placement otherwise.

Results for different VM placement strategies, considering the previous metrics, are presented and discussed in the next subsection. In order to understand the results, a brief overview of how each experiment works is presented as follows.

6.3.1.1 E1 vs E2: Random vs Random + MPT VMPM

E1 does not apply any analysis to place VMs and it is expected to degrade some important metrics. E2 improves most of them by applying the MPT algorithm after the initial random placement.

6.3.1.2 E3 vs E4: FF vs FF + MPT VMPM

E3 uses a FF algorithm to place VMs into the first suitable host available, prioritizing those that are already active. The algorithm performs server consolidation by packing the maximum number of VMs into the same host. Despite having a low computational complexity, it does not compare VM requirements or assess any metric other than available resource capacity. That may leave unallocated residual capacity for some resources due to a mismatch in resource allocation for placed VMs. That means some kinds of resources may run out while others remain plentiful.

The MPT proposal in E4 performs a more complex analysis of VM resource demand to create a portfolio for a given cluster, trying to balance risk and return. That improves resource utilization and other metrics, as discussed next.

6.3.2 Evaluating Experiment Results

Figures 6, 7, 8, 9, 10 and 11 present charts grouping some related metrics for all experiments, including the confidence interval (CI) error margin. This way, it is easier to compare them and analyze how they change from one experiment to another.

Fig. 6
figure 6

Active hosts (%)

Figure 6 shows the average (M1) and maximum (M2) percentage of hosts used throughout the entire simulation runs. The reduction is gradual, and the MPT VMPM experiments (E2 and E4) achieve the best results when each one is compared to the corresponding experiment without MPT. The differences in the average number of active hosts are not so large because, in every proposal, idle hosts are turned off after a specific time limit in order to reduce power consumption. There is a significant difference in the maximum percentage of active hosts between the experiments using initial random placement (E1 and E2) and those using initial FF placement (E3 and E4). That clearly happens because random placement spreads VMs across a larger number of hosts, starting up many more hosts that become idle over time. MPT VMPM (E4) outperforms all experiments for both metrics presented.

Figure 7 shows the hosts' mean uptime percentage (relative to total simulation time) and CPU utilization metrics. Those are metrics to be maximized. As long as resource utilization is kept high, hosts are expected to stay active longer (M10), thus reducing the number of startups/shutdowns. A mostly increasing trend can be seen from E1 to E4, with a small error margin. E4 reaches the maximum values for the different CPU utilization metrics, because the proposal performs server consolidation to maximize resource utilization. Furthermore, the average CPU utilization is kept the closest to the upper utilization threshold. The server consolidation performed by E4 also yields the highest host mean uptime among all experiments, meaning fewer hosts are kept active for longer. The figure shows that E4 maximizes the number of CPU cores used (M3 and M4), which, combined with the minimization of active hosts in Fig. 6, explains the highest CPU load (M9).

Fig. 7 General CPU usage (%)

Figure 8 shows CPU utilization for the times when hosts are either under- or overloaded. M5 is the average CPU utilization when hosts are overloaded. The difference in results for this metric between experiments is negligible: when hosts become overloaded, they tend to use almost the maximum capacity available. That is because the upper utilization threshold is defined merely to compare results; no limit on resource utilization is actually enforced, which explains why every experiment will sometimes use all the available CPU if the other resources are enough for each VM. Such metrics provide the best results when minimized, but the chart makes it clear that they are high and usually increase from one experiment to another. That is the side effect of server consolidation approaches, as already widely discussed.

Fig. 8 CPU usage for non-regular times (%)

Despite the similar results for M5, the mentioned side effect causes the CPU overload time to be much higher for E4. That is why setting a limit for resource utilization is critical. Without an actual limit, the MPT proposal performs a very aggressive server consolidation, as can be seen in M6 and M11.

Underload metrics are not significantly improved by any experiment because none of them attempts to address this issue. Random placement does not perform any decision-making; it simply selects the first random host that is suitable for a VM. FF placement is concerned with performing server consolidation, as is the MPT proposal. In financial markets, one way to perform portfolio balancing is to buy assets that have depreciated in order to increase their portfolio share. That approach could also be used to reduce underload, but it was not directly addressed here. At any rate, considering that E4 activates fewer hosts on average, the absolute number of underloaded hosts will be smaller, since the percentage of underloaded hosts is around 60% for all experiments (M12).

Figure 9 shows the mean number of times hosts are activated (M15) and the consequent delay for VMs to be placed into such selected hosts (M16). E3 and E4 show that each host is started fewer than two times on average, because hosts are kept running longer (as Fig. 7 has shown). The reduction in M15 is clear and substantial, even though the first two experiments have a large error margin. Although the reduction in M15 would suggest smaller VM wait times, it does not always work that way. Random experiments activate a larger number of hosts, as Fig. 6 has shown. In the long run, those experiments have a higher number of already active hosts, reducing the VM placement wait time (M16) at the cost of more power consumed (as discussed below). FF and MPT experiments have a higher decision-making computational complexity, which means random placement can be faster in finding a suitable host (at the cost of degrading several metrics); that explains the higher wait time for FF and MPT. Finally, since random experiments activate more hosts, more VMs have to wait (M17), despite waiting less (M16).

Fig. 9 Number of times hosts are started up and VM wait time

Figure 10 shows the power consumption metrics, which decrease from E1 to E4. This happens because host utilization is maximized and hosts are kept active longer as we move from one experiment to another (as can also be seen in Fig. 7). The number of active hosts in Fig. 6 also corroborates this result. Although the power consumed in startups and shutdowns (M13) is negligible compared to the total power consumption in GW, those results are aligned with the number of host startups presented in Fig. 9. There is a reduction of 17% in total power consumption from E3 to E4, mainly due to the reduced number of activated hosts and the maximization of CPU utilization.

Fig. 10 Power consumption (MW and GW)

Figure 11 groups some of the results presented, making them easier to evaluate. It can be seen that as the number of activated hosts (M1) decreases, the average number of CPU cores used (M3) increases. However, simply assessing the number of cores used does not provide a complete view of how the CPU is being used. Besides the maximization of the number of used cores, the load on those cores (M9) is also increased. That maximizes resource utilization, reducing power consumption (M14) and resource wastage. As can be seen, experiment E4, which applies the MPT proposal, yields the best results for all those metrics.

Fig. 11 Active hosts to CPU usage relation

The large number of metrics collected can be categorized as metrics to be either maximized or minimized, and it is not easy to grasp the overall performance of the experiments by considering each metric individually. In order to address that, Eq. 11 presents \(Pscore_e\), a normalized average representing a score used to analyze the performance of an experiment e for a given category of metrics:

$$\begin{aligned} Pscore_e = \frac{\sum \nolimits _{i=1}^{M}{m_i/\max (m_i)}}{M} \end{aligned}$$
(11)

where \(M\) is the number of metrics considered for computing the score, \(m_i\) is the value of metric i for experiment e, and \(\max (m_i)\) is the maximum value of the same metric among experiments E1 to E4.
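As a concrete reading of Eq. 11, the sketch below computes \(Pscore_e\) for each experiment from a small metric matrix; the metric values are placeholders rather than experiment data.

final class PscoreExample {
    // Computes Pscore for one experiment: the average of each metric value
    // normalized by the maximum value of that metric across all experiments (Eq. 11).
    static double pscore(double[] metricsForExperiment, double[] maxPerMetric) {
        double sum = 0.0;
        for (int i = 0; i < metricsForExperiment.length; i++) {
            sum += metricsForExperiment[i] / maxPerMetric[i];
        }
        return sum / metricsForExperiment.length;
    }

    public static void main(String[] args) {
        // Rows: experiments E1..E4; columns: metrics of one category (placeholder values).
        double[][] metrics = {
            {10, 200, 0.5},
            {12, 180, 0.6},
            {15, 150, 0.7},
            {20, 120, 0.9}
        };
        int m = metrics[0].length;
        double[] max = new double[m];
        for (double[] experiment : metrics) {
            for (int i = 0; i < m; i++) {
                max[i] = Math.max(max[i], experiment[i]);
            }
        }
        for (int e = 0; e < metrics.length; e++) {
            System.out.printf("Pscore(E%d) = %.3f%n", e + 1, pscore(metrics[e], max));
        }
    }
}

Applying the same computation separately to the metrics to be maximized and to those to be minimized yields the \(Pscore_{max}\) and \(Pscore_{min}\) values discussed next.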

Finally, there is a Pscore for every experiment and metric category. Figure 12 presents (i) \(Pscore_{max}\), considering the metrics to be maximized (M3, M4, M7, M9, M10), and (ii) \(Pscore_{min}\), considering the remaining metrics, which are to be minimized. These results clearly show that the proposed MPT+FF approach (E4) has the best performance for both categories of metrics: the metrics to be maximized have the highest normalized average, while those to be minimized have the lowest. A \(Pscore_{max}\) equal to 100% for E4 means that the experiment has the highest results for the metrics to be maximized.

Looking at the 17 computed metrics from a different perspective, the proposal: (i) improves M1..M4, M8..M10, M13..M15 and M17 (11 metrics); (ii) achieves only small differences between all experiments for M5, M7 and M12; and (iii) degrades only M6, M11 and M16. That is an improvement in 64.7% of the metrics, small differences in 17.6% of them, and degradation in the remaining 17.6%. Considering (i) and (ii), 82.3% of the metrics are improved or unchanged (14 out of 17). Finally, the MPT-based proposal leads to the best results compared to the isolated use of the Random and FF approaches. These results show that our proposal is suitable for large-scale cloud computing infrastructures.

Considering the scale of the experiments carried out, a huge amount of data was generated for the extensive set of metrics presented. In order to avoid a lengthy and overwhelming discussion, additional experiment results and insights are provided in the appendix.

6.3.3 Related Work Comparison

This section compares some results of the present proposal with the related work in Sect. 4. The related works, and some others found in the literature, provide a limited presentation and analysis of results. They tend to: (i) suppress experiment parameters, (ii) collect a reduced number of metrics, (iii) neglect confidence intervals and (iv) lack computational complexity and memory footprint analysis. Unfortunately, that makes it unfeasible to perform a wide, accurate and meaningful comparison of results. The workload is usually not clearly described in such publications, so evaluating any proposal against them may lead to an inaccurate comparison of experiments that possibly use uneven or disproportionate workloads. Therefore, the following analysis is constrained by such limitations.

The work of Wei et al. [20] presents an MPT-based VM placement proposal that does not fully apply the theory, disregarding the generation of multiple portfolios for selection. Although limited in that sense, it outperforms some traditional algorithms, minimizing the number of hosts used and reducing under/overload by maximizing resource utilization. Its results are very limited and present only a few metrics. For instance, the numbers of active hosts are presented only in absolute values. Those numbers are not representative unless the total number of available hosts is provided, enabling percentages to be computed.

Their results show that the proposal uses a maximum of 118 hosts. However, if the simulation scenario was created, for instance, with 150 hosts, that means either (i) the proposal is inefficient in minimizing the number of active hosts (although it outperforms traditional and simpler algorithms) or (ii) the workload is very high, justifying that result. It is not possible to know which is the case. Figure 6 shows that our proposal largely reduces the number of hosts used (less than 15% on average), but there is no way to compare results with [20].

The work of Wei et al. [20] shows that resource utilization follows an almost uniform distribution, trying to keep it between underload and overload thresholds. However, several works surveyed in [1] show that resource utilization may follow different distributions, according to the kind of application, such as batch processes, scientific applications, databases and web servers. That is why our proposal uses different distributions for synthetic data and distinct trace files to simulate varying workloads, as occurs in actual cloud infrastructures. Since [20] only provides the cumulative distribution function of resource utilization, such results are not comparable to the metrics we collected (such as average/max used CPU cores and average CPU load).

The work by Wei et al. tries to keep resource utilization close to the upper utilization threshold. Our proposal goes beyond that, also minimizing network traffic and the time for VMPM decision-making (due to the reduced amount of data to process, as already discussed in Sect. 5.3).

The results presented by Hwang and Pedram (2012 and 2018) [28, 29] are focused on evaluating the execution time of their proposed algorithms and the number of VM migrations. However, more important than absolute execution times is the computational complexity analysis those works fail to provide. The execution time of an algorithm may vary drastically according to the power of the computer on which the experiments are run. Presenting (i) running times as percentage values or (ii) a computational complexity analysis would be more meaningful and comparable. Furthermore, as already discussed in Sect. 5.4, such works perform a traditional and complex computation of statistics that our proposal outperforms. Thus, our results are not comparable with theirs.

Table 6 presents a summary comparing the related work to this proposal.

Table 6 Related work comparison
Fig. 12 Pscore: measuring the performance of experiments E1 to E4 by computing the normalized average for metrics to be maximized or minimized

7 Conclusion

NP-hard optimization problems such as VM placement are a challenge for large-scale scenarios like the ones built through simulation in this proposal. The literature offers a vast array of proposed solutions for that issue, and due to the complexity of the problem, it is difficult to achieve breakthrough results.

Nonetheless, our current research and previous studies [1] provide a broad analysis of the literature and propose a solution that uses the Modern Portfolio Theory (MPT) to balance risk and return of VM portfolios for clusters of hosts. Using the Welford online algorithm, it defines a simplified way to compute return/risk for portfolios, which, to the best of our knowledge, has not been used for large-scale VM placement scenarios.

The proposal combines a distributed architecture, decentralized management, a partial neighborhood view and return/risk balance. The Welford algorithm drastically reduced the computational complexity and memory footprint, moving from linear to constant time and space. Considering the 100k VMs in the scenario presented in Sect. 5.4, it moves from 40 GB of utilization history to a negligible 14 MB for one year of operation. That makes our proposal suitable for large-scale cloud computing infrastructures.
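For reference, a minimal sketch of Welford's online mean/variance update is shown below. It is not the exact implementation of the proposal, but it illustrates why only a counter, a running mean and a sum of squared differences need to be kept per monitored resource, instead of the full utilization history.

// Welford's online algorithm: updates mean and variance in O(1) time and memory
// per new utilization sample, so no per-VM history needs to be stored.
final class OnlineStats {
    private long count;
    private double mean;
    private double m2; // sum of squared differences from the current mean

    void add(double sample) {
        count++;
        double delta = sample - mean;
        mean += delta / count;
        m2 += delta * (sample - mean); // uses the already updated mean
    }

    double mean() {
        return mean;
    }

    // Sample variance: the "risk" of the asset in MPT terms.
    double variance() {
        return count > 1 ? m2 / (count - 1) : 0.0;
    }
}

Feeding each new utilization sample into such a structure yields the mean (return) and variance (risk) required by the portfolio analysis, with constant memory per VM and resource.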

Furthermore, simulation experiments used not only synthetic data, but also real traces from cloud data centers. Large-scale experiments were performed and a high number of metrics were collected and broadly discussed. Many of them were improved, such as the number of active hosts, CPU utilization, host uptime and power consumption, although overload metrics were degraded.

The work collected a total of 17 metrics, applied a 3-dimensional resource analysis (enabling other resources to be included) and, on average, improves all metrics to be maximized and all those to be minimized. For 82.3% of them (14 out of 17) the proposal either improves or provides the same results compared to the other experiments. Even for the metrics that are degraded (such as CPU overload), the difference is not excessive compared to similar approaches such as FF. Such an increase in overload is compensated by the large 17% reduction in total power consumption. Although the proposal is, in general, more complex than FF, the decision-making overhead is negligible compared to the latter: considering the VM average wait time as an overhead metric, the proposal increases it by only 0.32 s, while generally outperforming all the other experiments, as previously discussed.

Experiments followed a strict scientific approach, performing multiple simulation runs and accordingly presenting parameters and confidence intervals. They used CloudSim Plus, currently a very accurate and reliable simulation framework, recognized by academia and used in many of the most relevant projects in recent years. All of that increases the reliability of the experiment results.

Although this work has considered only VMs as virtual assets to be placed/migrated into a cloud infrastructure, the research can be applied to other assets such as containers, and deployment of:

  1. (i)

    Functions in function as a service (FaaS) or serverless architectures,

  2. (ii)

    Distributed and parallel applications.

Finally, there are some directions for future work, which can improve the proposal and result analysis in different ways, such as for evaluating:

  1. (i)

    The interference of running VMs or migration operations on co-located VMs, as an additional measure of risk,

  2. (ii)

    Combining a reactive VM migration strategy with MPT placement to reduce underload and overload,

  3. (iii)

    Setting firm resource utilization limits to minimize overload,

  4. (iv)

    Placement of correlated/anti-correlated VMs, regarding their mean execution time. If VMs with similar execution time spans are placed together, the host may be expected to become underloaded only when a large number of VMs have finished; however, such correlated VMs may lead to overload. On the other hand, if VMs with anti-correlated execution time spans are placed together, that tends to reduce host overload bursts, but it may leave the host underloaded for longer, as VMs finish at different paces,

  5. (v)

    Unallocated residual resources due to VM placement mismatch: while some kinds of resources run out, others remain plentiful, increasing resource waste,

  6. (vi)

    Performance and results for multiple cluster sizes; VM migration metrics,

  7. (vii)

    Placement considering availability requirements, when VMs from the same customer or application need to be placed in different availability zones,

  8. (viii)

    Monetary cost evaluation for providers and customers, looking for cost-effective ($) placement for both parties.