1 Introduction

As a new computing model, cloud computing provides flexible, on-demand storage and computing resources to users over the network. Virtualization technology, the enabler and key technical foundation of cloud computing, is a method of abstractly representing computer resources [1]. With its aid, resources can be simulated, aggregated, shared, and isolated. Furthermore, virtualization uses virtual machines to provide the environment needed for the reliable operation and rapid deployment of a variety of applications [2]. The core feature of cloud computing is on-demand service, which makes task allocation and resource scheduling central technical problems. Because the interests of ordinary users, infrastructure providers, and service providers conflict, current studies typically focus on only one of these parties and on how to benefit it alone; in effect, there are three communities of interest. To manage cloud computing resources well, the interests of all three must be treated as a whole: the profits of service providers and infrastructure providers should be maximized while the satisfaction of ordinary users is also improved.

In the stage in which the service provider (SP) delivers cloud services to ordinary users, both user satisfaction and revenue enhancement are taken into consideration. In the stage in which the SP purchases virtual resources from the infrastructure provider (IP), a virtual machine provisioning model is created and a dynamic double-subpopulation particle swarm optimization (PSO) is introduced, with the particle velocity and position redefined according to this provisioning model. To improve the convergence speed of the double-subpopulation PSO, the velocity update weight is adjusted dynamically according to how the particles' fitness values change during the iterations. Because PSO easily falls into local optima, an immune algorithm is introduced to enhance particle diversity, so that the algorithm can adjust the global factor dynamically. The improved PSO not only finds more candidate solutions at the beginning of the search but also converges rapidly in the later stage, thereby reaching the optimal solution. A scheduling policy fusing the ant colony optimization (ACO) algorithm and the genetic algorithm is introduced for the stage in which the SP provides cloud services to ordinary users: genetic operators are first used for a fast global search, the result of the genetic algorithm initializes the pheromone of the ant colony algorithm, and the ACO operator then refines the task schedule, making full use of the complementary advantages of the two algorithms in solving NP-hard problems. Experiments show that, in the two stages of purchasing virtual resources and providing cloud resources to ordinary users, the two algorithms improve both SP profit and customer satisfaction.

In the stage in which the IP provides cloud services to the SP, energy consumption is taken into consideration in order to maximize IP profit. This paper presents a virtual machine migration scheduling policy based on a gray forecasting model and a credible ant colony scheduling algorithm. CPU utilization is an important index for the live migration of virtual machines; when the observed CPU utilization fluctuates abruptly and no effective scheduling policy is in place, unnecessary virtual machine migrations occur and system overhead is wasted. The gray forecasting model described in Sect. 2 estimates the future CPU utilization of a node over a period of time. If the current CPU utilization of a host is greater than the upper threshold (or less than the lower threshold) and the next three consecutive load forecasts are also greater than the upper threshold (or less than the lower threshold), the virtual machine migration is performed. Experiments show that the algorithm effectively avoids the frequent migrations caused by oscillations in CPU utilization, reduces energy consumption, and improves IP profit.

2 Gray Forecasting Model

We forecast future load values with the gray forecasting model before migrating virtual machines to target nodes. A gray forecasting model is built from sparse and incomplete information and is then used to make predictions. Gray system theory studies how to analyze, model, forecast, make decisions about, and control gray systems, and gray forecasting is forecasting for such a system. Some forecasting methods commonly used at present (such as regression analysis) need a large sample; a small sample causes large errors and makes the forecast fail. The model given here needs little modeling information, offers high precision, and is convenient to operate, so it is widely applied in many forecasting fields and is an effective tool for small-sample forecasting problems.

In the following, the method of building the gray forecasting model from a time series is described through data analysis and processing.

2.1 Data Preprocessing

For example, suppose the original data sequence is

$$ x^{\left( 0 \right)} = \left\{ {x^{\left( 0 \right)} \left( 1 \right), x^{\left( 0 \right)} \left( 2 \right), \ldots ,x^{\left( 0 \right)} \left( N \right)} \right\} = \left\{ {6,3,8,10,7} \right\} $$

Accumulation of raw data:

$$ \begin{array}{*{20}l} {x^{\left( 1 \right)} \left( 1 \right) = x^{\left( 0 \right)} \left( 1 \right) = 6,} \hfill \\ {x^{\left( 1 \right)} \left( 2 \right) = x^{\left( 0 \right)} \left( 1 \right) + x^{\left( 0 \right)} \left( 2 \right) = 6 + 3 = 9,} \hfill \\ {x^{\left( 1 \right)} \left( 3 \right) = x^{\left( 0 \right)} \left( 1 \right) + x^{\left( 0 \right)} \left( 2 \right) + x^{\left( 0 \right)} \left( 3 \right) = 6 + 3 + 8 = 17,} \hfill \\ \ldots \hfill \\ \end{array} $$

The new data series obtained is \( x^{(1)} = \left\{ {6,9,17,27,34} \right\} \).

The formula above can be summarized as \( x^{\left( 1 \right)} \left( i \right) = \mathop \sum \limits_{j = 1}^{i} x^{\left( 0 \right)} \left( j \right), \; i = 1,2, \ldots ,N. \)

The data series represented by this formula is called the primary accumulation generation of the raw data sequence (the primary accumulation generation for short).

We further define \( \varDelta x^{\left( 1 \right)} \left( i \right) = x^{\left( 1 \right)} \left( i \right) - x^{\left( 1 \right)} \left( {i - 1} \right) = x^{\left( 0 \right)} \left( i \right) \), in which \( i = 1,2, \ldots , N \) and \( x^{\left( 1 \right)} \left( 0 \right) = 0 \).
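
As a quick illustration of the accumulation step, the following Python sketch (NumPy-based; the variable names are our own) reproduces the example above:

```python
import numpy as np

x0 = np.array([6, 3, 8, 10, 7], dtype=float)   # original data sequence x^(0)
x1 = np.cumsum(x0)                              # primary accumulation generation x^(1)
print(x1)                                       # [ 6.  9. 17. 27. 34.]

# Differencing x^(1) recovers x^(0), matching the definition of delta x^(1)(i) above
print(np.diff(x1, prepend=0.0))                 # [ 6.  3.  8. 10.  7.]
```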

2.2 The Principle of Modeling

Given the observation data series \( x^{(0)} = \left\{ {x^{(0)} \left( 1 \right),x^{(0)} \left( 2 \right), \ldots ,x^{(0)} \left( N \right) } \right\} \), after a primary accumulation generation we get \( x^{(1)} = \left\{ {x^{(1)} \left( 1 \right),x^{(1)} \left( 2 \right), \ldots ,x^{(1)} \left( N \right) } \right\} \). Suppose \( x^{(1)} \) satisfies the first-order ordinary differential equation \( \frac{{dx^{(1)} }}{dt} + ax^{(1)} = u \), where \( u \) is a constant and \( a \) is called the development gray number. The equation satisfies the following initial condition:

$$ {\text{When }}t = t_{0} , x^{(1)} = x^{\left( 1 \right)} (t_{0} ) $$
(1)

Then the solution of this equation is:

$$ x^{\left( 1 \right)} \left( t \right) = \left[ {x^{\left( 1 \right)} \left( {t_{0} } \right) - \frac{u}{a}} \right]e^{{ - a(t - t_{0} )}} + \frac{u}{a} $$
(2)

For discrete values with equal-interval sampling (taking \( t_{0} = 1 \)), the solution is

$$ x^{\left( 1 \right)} \left( {k + 1} \right) = \left[ {x^{\left( 1 \right)} \left( 1 \right) - \frac{u}{a}} \right]e^{ - ak} + \frac{u}{a} $$
(3)

Because \( x^{\left( 1 \right)} (1) \) is used as the initial value, \( x^{\left( 1 \right)} \left( 2 \right), x^{\left( 1 \right)} \left( 3 \right), \ldots , x^{\left( 1 \right)} \left( N \right) \) are substituted into the differential equation. Replacing the derivative with the difference quotient, and noting that equal-interval sampling gives \( \varDelta t = \left( {t + 1} \right) - t = 1 \), we get \( \frac{{\varDelta x^{\left( 1 \right)} \left( 2 \right)}}{\varDelta t} = x^{\left( 0 \right)} \left( 2 \right) \).

Similarly \( \frac{{\varDelta x^{\left( 1 \right)} \left( 3 \right)}}{\varDelta t} = x^{\left( 0 \right)} \left( 3 \right), \ldots , \frac{{\varDelta x^{\left( 1 \right)} \left( N \right)}}{\varDelta t} = x^{\left( 0 \right)} \left( N \right) \).

Substituting these differences into the differential equation, and replacing \( x^{\left( 1 \right)} \) by the mean of adjacent accumulated values \( \frac{1}{2}[x^{\left( 1 \right)} \left( k \right) + x^{\left( 1 \right)} \left( {k - 1} \right)] \), we obtain:

$$ \left[ {\begin{array}{*{20}c} {x^{\left( 0 \right)} (2)} \\ {x^{\left( 0 \right)} (3)} \\ \vdots \\ {x^{\left( 0 \right)} (N)} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} { - \frac{1}{2}[x^{\left( 1 \right)} (2) + x^{\left( 1 \right)} (1)]} & 1 \\ { - \frac{1}{2}[x^{\left( 1 \right)} (3) + x^{\left( 1 \right)} (2)]} & 1 \\ \vdots & \vdots \\ { - \frac{1}{2}[x^{\left( 1 \right)} (N) + x^{\left( 1 \right)} (N - 1)]} & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} a \\ u \\ \end{array} } \right] $$
(4)

From the matrix equation (4), estimates of \( a \) and \( u \) can be obtained by least squares; denote them \( a^{'} \) and \( u^{'} \). Substituting \( a^{'} \) and \( u^{'} \) into formula (3) yields the forecast values. By recording a sequence of load values \( k_{(1)} , k_{(2)} , \ldots , k_{(t)} \), the gray forecasting model can produce load forecasts at times \( t + 1 \), \( t + 2 \), and \( t + 3 \). If all three of these forecasts are greater than the upper threshold, or all of them are less than the lower threshold, the virtual machine migration is triggered.
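
To make the modeling steps concrete, the following Python sketch fits the GM(1,1) model described above and forecasts the next values of a load series. It assumes NumPy; the function name and structure are our illustrative choices, not the exact code used in the experiments.

```python
import numpy as np

def gm11_forecast(x0, steps=3):
    """Fit a GM(1,1) gray model to the raw series x0 and forecast `steps` future values."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                      # primary accumulation generation (Sect. 2.1)

    # Data matrix and observation vector of the matrix equation (4);
    # each row uses the mean of adjacent accumulated values as the background value.
    B = np.column_stack((-(x1[1:] + x1[:-1]) / 2.0, np.ones(n - 1)))
    Y = x0[1:]

    # Least-squares estimates a', u' of the development gray number and the constant term
    a, u = np.linalg.lstsq(B, Y, rcond=None)[0]

    # Time response of formula (3): x^(1)(k+1) = (x^(1)(1) - u/a) * exp(-a*k) + u/a
    k = np.arange(n + steps)
    x1_hat = (x0[0] - u / a) * np.exp(-a * k) + u / a

    # Inverse accumulation (differencing) recovers forecasts of the original series
    x0_hat = np.diff(x1_hat, prepend=0.0)
    return x0_hat[n:]                       # forecasts for times N+1, ..., N+steps

# Example: the series from Sect. 2.1; the three returned values are the forecasts
# that would be compared against the upper and lower CPU-utilization thresholds.
print(gm11_forecast([6, 3, 8, 10, 7]))
```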

3 The Selection and Positioning Strategy of Virtual Machine to Be Migrated

When the virtual machine to be migrated is selected, the goal is to choose a virtual machine whose departure releases the most resources while its migration cost remains small. Generally, memory resources and CPU resources are considered together first, and then it is decided which virtual machine will be moved out.

For a virtual machine, let \( w \) = CPU utilization / memory utilization. The greater \( w \) is, the lower the memory utilization and the higher the CPU utilization of the virtual machine; migrating it therefore releases more resources while the amount of data to be transmitted is small [3]. Let \( x \) = CPU utilization * memory utilization. The greater \( x \) is, the higher both the CPU and memory utilization; although the amount of data to be transmitted is larger when migrating the virtual machine with the maximum \( x \), the most resources will be released.

Suppose \( H_{t + 1} \) denotes the set of three consecutive forecast values of CPU utilization after moment \( t \), \( H_{max} \) denotes the upper threshold of CPU utilization that triggers a host to start a migration, and \( H_{min} \) denotes the lower threshold. \( M_{thre} \) denotes the threshold of memory utilization, and \( M \) is the current memory occupancy of the physical node.

The dynamic migration strategy of a virtual machine can be described by the following rules (a code sketch of these rules is given after the list):

If \( H_{t + 1} < H_{min} \), all virtual machines on the physical node are moved out and the physical node is shut down to save energy;

If \( H_{min} < H_{t + 1} < H_{max} \), no migration is performed, since the load on the physical node is balanced;

If \( H_{t + 1} > H_{max} \) and \( M < M_{thre} \), the virtual machine with the largest \( w \) is moved out;

If \( H_{t + 1} > H_{max} \) and \( M > M_{thre} \), the virtual machine with the largest \( x \) is moved out.
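
The following Python sketch expresses this decision logic; the function name and the VM tuple layout are our own illustrative assumptions, while the rules themselves are those listed above. `forecasts` holds the three gray-model predictions \( H_{t+1} \).

```python
def choose_migration_action(forecasts, mem_used, vms, h_min, h_max, m_thre):
    """Decide the migration action for one physical node.

    forecasts : the three predicted CPU utilizations H_{t+1} from the gray model
    mem_used  : current memory occupancy M of the physical node
    vms       : list of (vm_id, cpu_util, mem_util) tuples for VMs hosted on the node
    """
    # w = cpu/mem favours VMs that free CPU while carrying little memory to transfer;
    # x = cpu*mem favours VMs that free the most resources overall.
    w = lambda vm: vm[1] / vm[2]
    x = lambda vm: vm[1] * vm[2]

    if all(f < h_min for f in forecasts):
        # Under-loaded node: move every VM out and shut the node down to save energy.
        return ("migrate_all_and_shutdown", [vm[0] for vm in vms])
    if all(f > h_max for f in forecasts):
        if mem_used < m_thre:
            return ("migrate", max(vms, key=w)[0])   # evict the VM with the largest w
        return ("migrate", max(vms, key=x)[0])       # evict the VM with the largest x
    return ("none", None)                            # load is balanced, no migration
```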

When a virtual machine is moved out, the target physical node may end up with excessively high CPU utilization but spare memory, or with spare CPU but excessively high memory utilization [4]. Using the \( w \) and \( x \) defined above, when the target physical node is selected, a virtual machine with a larger \( w \) should be moved to a target node with a smaller \( w \), and a virtual machine with a smaller \( w \) should be moved to a target node with a larger \( w \). After this matching on \( w \), the \( m \) best candidate nodes are selected. Let \( p = x/\mathop \sum \nolimits x \), where \( x \) here denotes the currently available resources of one target node among these \( m \) candidates and \( \mathop \sum \nolimits x \) is the total available resources of the \( m \) candidates, so that \( \mathop \sum \limits_{i = 1}^{m} p_{i} = 1 \). For example, if there are five candidate nodes \( {\text{S}} = \left\{ {S_{1} ,S_{2} ,S_{3} ,S_{4} ,S_{5} } \right\} \) whose proportions of available resources are 0.1, 0.3, 0.2, 0.2, 0.2, then \( S_{1} , \ldots ,S_{5} \) can be mapped to the five intervals \( S_{1} :\left( {0, 0.1} \right] \), \( S_{2} :\left( {0.1, 0.4} \right] \), \( S_{3} :\left( {0.4, 0.6} \right] \), \( S_{4} :\left( {0.6, 0.8} \right] \), \( S_{5} :(0.8, 1] \).

When the migration target node is to be selected, a random number in (0, 1] is first generated, and the interval among \( S_{1} , \ldots ,S_{5} \) containing this number is chosen; the node represented by the chosen interval is the final destination of the virtual machine migration. With this probability-based positioning, target nodes with more available resources have a greater probability of receiving a virtual machine, while physical nodes with high resource utilization have a low probability of being selected as a target. Load balancing across the physical nodes of the data center is thus achieved.
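
A minimal sketch of this probability-based (roulette-wheel) positioning in Python follows; the function name `pick_target_node` and the tuple layout are illustrative assumptions of ours.

```python
import random
from bisect import bisect_left

def pick_target_node(candidates):
    """Roulette-wheel selection of the migration target among the m matched nodes.

    candidates : list of (node_id, available_resources) tuples
    """
    total = sum(res for _, res in candidates)
    bounds, acc = [], 0.0
    for _, res in candidates:                # cumulative proportions p1, p1+p2, ...
        acc += res / total
        bounds.append(acc)
    bounds[-1] = 1.0                         # guard against floating-point drift
    r = random.uniform(0.0, 1.0)             # random number in (0, 1]
    return candidates[bisect_left(bounds, r)][0]

# Example: five candidate nodes whose available-resource proportions are
# 0.1, 0.3, 0.2, 0.2, 0.2, as in the interval example above.
nodes = [("S1", 10), ("S2", 30), ("S3", 20), ("S4", 20), ("S5", 20)]
print(pick_target_node(nodes))
```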

4 Experiment and Analysis

The CloudSim simulation toolkit is used in the experiment to verify the efficiency of the algorithm. CloudSim supports the system components of a virtualized environment, such as data centers, hosts, virtual machines, and scheduling and resource allocation strategies [5]. CloudSim-3.0, Windows XP SP3, JDK 6.5, MyEclipse 8.5, and Ant 1.8.1 are used in the experiment, on a Pentium Dual-Core processor with a main frequency of 2.6 GHz. With CloudSim-3.0, we deploy 1000 identical physical nodes and provide 3000 virtual machines of the same performance. The number of virtual machine migrations and the degree of system load balance are calculated every 10 s; after 20 samples, the averages are taken as the number of migrations and the degree of system load balance under a given threshold. Different threshold values are used in the experiments so that the number of virtual machine migrations can be observed, as shown in Fig. 1.

Fig. 1. The number of migrations of the three algorithms under different thresholds

The number of migrations caused by the scheduling strategy based on the gray forecasting model is obviously smaller than that caused by the Double Threshold (DT) algorithm, which shows that the gray-forecasting-based strategy makes migration more efficient. As the threshold setting increases, the number of migrations caused by both the gray-forecasting-based strategy and the DT algorithm shows a clear upward trend, whereas the number caused by the Simple Threshold (ST) algorithm shows a clear downward trend. The increase occurs because, when the dual threshold is raised, the lower threshold rises as well, so many virtual machines with low CPU and memory utilization are moved out of their physical machines and the number of migrations grows. When the threshold is 0.7, the numbers of migrations caused by the three scheduling algorithms are closest, as shown in Fig. 2.

Fig. 2. The load balance degree of the three migration strategies when the threshold value is 0.7

Figure 2 shows the load balance curves of the three algorithms when the threshold value is 0.7. The load balance degree rises in the early stage, but as time passes the load balance degree of all three scheduling strategies gradually decreases. The load balance degree of the ST algorithm, which uses a single threshold, is greater than that of the DT algorithm with dual thresholds and of the dynamic migration strategy based on the gray forecasting model; the gray-forecasting-based strategy has the best load balancing effect and can improve the utilization of system resources.

5 Conclusions

A scheduling strategy for virtual machine migration based on the gray forecasting model is proposed in this paper. The model offers high precision and high efficiency. In addition, a dual threshold is applied in the migration strategy, which effectively eliminates the frequent migrations caused by oscillations of CPU utilization. The idea of positioning probability is introduced when the target node is selected, which avoids the clustering effect and greatly improves the migration success rate. To further the goal of saving energy [6], the scheduling strategy will be gradually improved in future research.