AutoSAC: automatic scaling and admission control of forwarding graphs
Abstract
There is a strong industrial drive to use cloud computing technologies and concepts for providing timing-sensitive services in the networking domain, since this would make it possible to share the physical resources among multiple users and thus increase elasticity and reduce costs. In this work, we develop a mathematical model for user-stateless virtual network functions forming a forwarding graph. The model captures uncertainties in the performance of these virtual resources as well as the time-overhead needed to instantiate them. The model is used to derive a service controller for horizontal scaling of the virtual resources as well as an admission controller that guarantees that packets exiting the forwarding graph meet their end-to-end deadline. The Automatic Service and Admission Controller (AutoSAC) developed in this work uses feedback and feedforward, making it robust against uncertainties of the underlying infrastructure. It also reacts quickly to changes in the input.
Keywords
Cloud computing, Network function virtualisation, End-to-end deadline, Real-time, Feedback control, Feedforward control
1 Introduction
Over the last years, cloud computing has swiftly transformed the IT infrastructure landscape, leading to large cost savings for deployment of a wide range of IT applications. Physical resources such as compute nodes, storage nodes, and network fabrics are shared among tenants through the use of virtual resources. This makes it possible to dynamically change the amount of resources allocated to a tenant, for example as a function of workload or cost. Initially, the cloud technology was mostly used for IT applications, e.g., web servers, databases, etc., but has now found its way into new domains. One of these domains is the processing of packets by a chain of network functions.
In this work, we consider a chain of network functions through which packets are flowing. Every packet must be processed by each function in the chain within some specific end-to-end deadline. The goal is to ensure that as many packets as possible meet their deadline, while at the same time using as few resources as possible.
a) Starting a new virtual resource takes time, since it has to be deployed to a physical server and requires the execution of several initialization scripts and push/pulls before it is ready to serve packets,
b) The true performance of the virtual resource differs from the expected performance, since one does not know what else is running on the physical machines [2].
In this work, we

develop a model of a service-chain of network functions and use it to derive a service-controller and admission-controller for the network functions,

derive a service-controller controlling the number of virtual resources (e.g., VMs or containers) allocated to each network function by using feedback from the true performance of the instances as well as feedforward between the network functions,

derive an admission-controller that is aware of the actions of the service-controller, which it uses in order to reject as few packets as possible,

evaluate the service and admission controller using a real-world traffic trace from the Swedish University Network (SUNET).
1.1 Related works
There are a number of works considering the problem of controlling virtual resources within data centers, and specifically for virtual network functions. However, many of them focus on orchestration, i.e., how the virtual resources should be mapped onto the physical hardware. Shen et al. [3] develop a management framework, vConductor, for realizing end-to-end virtual network services. In [4], Moens and De Turck develop a formal model for resource allocation of virtual network functions. A slightly different approach is taken by Mehraghdam et al. [5], who define a model for formalizing the chaining of forwarding graphs using a context-free language. They solve the mapping of the forwarding graphs onto the hardware by posing it as a MIQCP.
Scaling of virtual network functions is, however, studied by Mao et al. [6], who develop a mechanism for autoscaling resources in order to meet some user-specified performance goal. Recently, Wang et al. [7] developed a fast online algorithm for scaling and provisioning VNFs in a data center. However, they do not consider timing-sensitive applications with deadlines for the packets moving through the chain. This is done by Li et al. [8], who present a design and implementation of NFV-RT, which aims at controlling NFVs with soft real-time guarantees, allowing packets to have deadlines.
The enforcement of an end-to-end deadline for a sequence of jobs is, however, addressed by several works, possibly under different terminologies. Di Natale and Stankovic [9] propose to split the E2E deadline proportionally to the local computation time or to divide the slack time equally. Later, Jiang [10] used time slices to decouple the schedulability analysis of each node, reducing the complexity of the analysis. Such an approach improves the robustness of the schedule and allows each pipeline to be analyzed in isolation. Serreli et al. [11, 12] proposed to assign local deadlines to minimize a linear upper bound of the resulting local demand bound functions. More recently, Hong et al. [13] formulated the local deadline assignment problem as a MILP with the goal of maximizing the slack time.
An alternative analysis was proposed by Jayachandran and Abdelzaher [14], who developed several transformations to reduce the analysis of a distributed system to the single processor case. In [15], Henriksson et al. proposed a feedforward/feedback controller to adjust the processing speed to match a given delay target.
2 Modeling the service-chain
2.1 Admission controller
Every packet that enters the service-chain must be processed by all of the functions in the chain within a certain end-to-end (E2E) deadline, denoted \(D^{\max}\). This deadline can be split into local deadlines \(D_{i}(t)\), one for each function in the chain, such that the packet should not spend more than \(D_{i}(t)\) time-units in the \(i\)th function. Should a packet miss its E2E deadline, it is considered useless. It is thus favorable to use admission control to drop packets that have a high probability of missing their deadline, in order to make room for the packets that follow. The goal of the admission controller is to guarantee that the packets that make it through the service-chain do meet their E2E deadline. It is assumed that admission control can be performed at the entry of every function in the chain.
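As a minimal illustration of the two ideas above (splitting the E2E deadline and rejecting packets at the entry of a function), the following Python sketch uses an equal split together with a simplified admission test. The function names and the `predicted_delay_ub_ms` argument are hypothetical, and the test itself is an assumption made for illustration; it is not the admission policy derived in Section 3.

```python
from typing import List


def split_deadline_equally(d_max_ms: float, n_functions: int) -> List[float]:
    """Split an end-to-end deadline (in ms) into equal local deadlines."""
    if n_functions <= 0:
        raise ValueError("the chain must contain at least one function")
    return [d_max_ms / n_functions] * n_functions


def admit(predicted_delay_ub_ms: float, local_deadline_ms: float) -> bool:
    """Admit a packet only if an upper bound on its delay fits the local deadline.

    Assumption: the caller can supply such an upper bound for the function.
    """
    return predicted_delay_ub_ms <= local_deadline_ms


# A 30 ms E2E deadline over three functions gives 10 ms per function.
local_deadlines = split_deadline_equally(d_max_ms=30.0, n_functions=3)
decisions = [admit(8.0, local_deadlines[0]), admit(12.0, local_deadlines[0])]
```

With these numbers, a packet whose delay bound is 8 ms is admitted to the first function, whereas one with a bound of 12 ms is rejected.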
2.2 Service controller
2.3 Processing of packets
2.4 Function delay
Computing the expected function-delay \({\bar{d}_{i}}(t)\) requires information about \(m_{i}(t)\) and \({\hat{\xi}_{i}}(t)\) for the future, whereas computing the upper bound on the function delay, \({d_{i}^{\text{ub}}}(t)\), requires information about \(m_{i}(t)\) for the future. Information about \(m_{i}(t)\) up until time \(t+\Delta_{i}\) is always known, since \(m_{i}(t+\Delta_{i})=m_{i}^{\text{ref}}(t)\) and \(m_{i}^{\text{ref}}(x)\) is known for \(x \in [0, t]\). It is therefore possible to compute the expected function delay \({\bar{d}_{i}}(t)\) whenever it is shorter than the time-overhead \(\Delta_{i}\) (which will be used later in Section 3 when deriving the admission controller and the service controller).
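The relation \(m_{i}(t+\Delta_{i})=m_{i}^{\text{ref}}(t)\) says that the number of running instances follows the commanded reference with a delay equal to the time-overhead. A small Python sketch of this delayed actuation is given below. The names `instances_running` and `initial` are hypothetical, and the behaviour before the first command has taken effect (returning `initial`) is an assumption that the text does not specify.

```python
from typing import Callable


def instances_running(m_ref: Callable[[float], int], t: float,
                      overhead: float, initial: int = 0) -> int:
    """Return m_i(t) from the reference history, using m_i(t + overhead) = m_ref(t)."""
    if t < overhead:
        # No reference issued at or after time 0 has taken effect yet (assumption).
        return initial
    return m_ref(t - overhead)


def example_reference(x: float) -> int:
    """Hypothetical reference signal: 2 instances, raised to 5 at time 10 s."""
    return 2 if x < 10 else 5


# With a 30 s overhead, the step commanded at t = 10 s is visible at t = 40 s.
before_step = instances_running(example_reference, t=35.0, overhead=30.0)
after_step = instances_running(example_reference, t=45.0, overhead=30.0)
```

Because the reference is known up to the current time, the instance count is known up to `overhead` seconds into the future, which is what makes the delay computation above possible.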
Note that the (expected) function delay does not distinguish between queueing delay and processing delay. In [17], Google profiled where the latency in a data center occurred and showed that 99% of the latency (≈85 μs) occurred somewhere in the kernel, the switches, the memory, or the application. It is very difficult to say exactly how much of this 99% is due to processing and how much to queueing; hence they are considered together as the function delay.
2.5 Concatenation of functions
2.6 Problem formulation
The goal of this paper is to derive a service-controller and an admission-controller that guarantee that packets that pass through the service-chain meet their E2E deadline. This should be done using as few resources as possible while still achieving as high throughput as possible. This is captured in a simple, yet intuitive utility function \(u_{i}(t)\). Later in Section 3, the utility function is used to derive an automatic service and admission controller, denoted AutoSAC.
Utility function
3 Controller design
Timing assumptions for the end-to-end deadline, the change-of-rate of the input, and the overhead for changing the service-rate. These timing assumptions are used when deriving the automatic service and admission-controller
Parameter                                      Timing assumption
Long-term trend change of the input            1 min–1 h
Service-rate change overhead \(\Delta_{i}\)    1 s–1 min
Request end-to-end deadline \(D^{\max}\)       1 μs–100 ms
The admission controller is derived in Section 3.1 and the service controller in Section 3.3. In Section 3.4, a short discussion of the properties of AutoSAC is presented.
3.1 Admission controller
Every request that enters the service chain has an end-to-end deadline \({D^{\max}}\). It has to pass through every function in the chain within this time. Furthermore, each function can impose a local deadline \(D_{i}(t)\) for the packet entering the \(i\)th function at time \(t\). One can therefore use either the local deadline to do a decentralized admission control at the entry of each of the functions in the chain, or the global deadline for a centralized admission control. In this work, we will use a decentralized approach, but will also derive a policy for a centralized admission control in Section 3.2.1; however, only the decentralized policy will be evaluated in Section 4.
3.2 Decentralized admission control
3.2.1 Centralized admission control
3.3 Service controller
3.3.1 Alternative utility function
3.4 Properties of AutoSAC
There are several interesting properties captured by the admission controller and service controller presented in this section. First of all, the admission controller (14) ensures, by design, that every packet that is admitted into a function, and thus exits the function, meets its deadline. Therefore, no packets that exit the service-chain will miss their end-to-end deadline.
The service-controller given by Eq. 25 captures both the feedback used from the true performance of the instances (when computing \({\hat{\xi}_{i}}(t)\)) as well as feedforward information about future input coming from functions earlier in the service-chain (when computing \({\hat{r}_{i}}(t)\)). This makes it robust against machine uncertainties but also ensures that it reacts fast to sudden changes in the input. For instance, given a service-chain of six functions, function \(F_{5}\) will know that in \(\Delta_{4}\) time-units, \(F_{4}\) will have \({m^{\text{ref}}_{4}}(t)\) instances running and can thus start as many instances as needed to process this new load.
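The interplay of feedback and feedforward described above can be illustrated with a deliberately simplified Python sketch. It is not Eq. 25. It assumes that the per-instance service rate is the expected rate plus the estimated deviation, that the output of an upstream function is capped by its capacity, and that the downstream function sizes itself with a simple ceiling rule. All function names and numbers are hypothetical.

```python
import math


def effective_rate(expected_rate: float, deviation_estimate: float) -> float:
    """Feedback: correct the expected per-instance rate by the measured deviation."""
    return expected_rate + deviation_estimate


def predicted_output(input_rate: float, instances: int,
                     per_instance_rate: float) -> float:
    """Feedforward: the upstream output cannot exceed the upstream capacity."""
    return min(input_rate, instances * per_instance_rate)


def instances_needed(input_rate: float, per_instance_rate: float) -> int:
    """Smallest number of instances whose capacity covers the predicted input."""
    if per_instance_rate <= 0:
        raise ValueError("per-instance rate must be positive")
    return math.ceil(input_rate / per_instance_rate)


# F4 will run 10 instances that are measured to be 15,000 pps slower than expected.
rate_f4 = effective_rate(expected_rate=150_000, deviation_estimate=-15_000)
input_f5 = predicted_output(input_rate=2_000_000, instances=10,
                            per_instance_rate=rate_f4)

# F5 instances are measured to be 8,000 pps faster than expected.
rate_f5 = effective_rate(expected_rate=100_000, deviation_estimate=8_000)
m_ref_f5 = instances_needed(input_f5, rate_f5)
```

In this example F5 prepares for 1,350,000 pps (the capacity of F4) rather than the 2,000,000 pps entering F4, and therefore requests 13 instances.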
4 Evaluation
In this section, the automatic service and admission-controller (AutoSAC) developed in Section 3 is evaluated. First, Section 4.1 illustrates how a randomly generated service chain of three functions performs when it is given a 5-h traffic trace. Then, in Section 4.2, AutoSAC is compared with two other “state-of-the-art” methods for scaling cloud services. The comparison is done using a Monte Carlo simulation where the parameters of a five-function service chain are randomly generated and then simulated, again using a real traffic trace as input.
The real-world trace of traffic data used as input was gathered over 120 h from a port in the Swedish University NETwork (SUNET) and then normalized to have a peak of 10,000,000 packets per second, as shown in Fig. 1. The simulation was written in the open-source language Julia [20]. The code and traffic trace used for this simulation are provided on GitHub.^{1}
4.1 Example chain
For this example, a service chain with three functions was used, where the E2E deadline was set to 30 ms, which in turn was split into local deadlines of 10 ms for each function. The other parameters (i.e., \({\bar{s}_{i}}\), \(\Delta_{i}\), \({{\xi}_{i}^{\text{lb}}}\), and \({{\xi}_{i}^{\text{ub}}}\)) for every function in the service-chain are generated randomly. The expected service-rate \({\bar{s}_{i}}\) was chosen uniformly at random from the interval [100,000, 200,000] pps. The time-overhead \(\Delta_{i}\) was drawn uniformly at random from the interval [30, 120] seconds. The machine uncertainty was chosen to be in the range of ±30% of the expected service-rate \(\bar{s}_{i}\). The lower bound of the machine uncertainty was drawn from the interval \([-0.3{\bar{s}_{i}},\,0]\) pps and likewise, the upper bound was drawn from \([0,\,0.3{\bar{s}_{i}}]\) pps.
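The parameter generation described above could be sketched in Python as follows. The class and function names are hypothetical, the seed is arbitrary, and drawing the two uncertainty bounds uniformly is an assumption (the text only states the intervals). The authors' own simulation is written in Julia, so this is an illustration rather than their code.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class FunctionParams:
    expected_rate: float  # expected service-rate per instance, in pps
    overhead: float       # time-overhead for starting an instance, in seconds
    xi_lb: float          # lower bound of the machine uncertainty, in pps (<= 0)
    xi_ub: float          # upper bound of the machine uncertainty, in pps (>= 0)


def generate_chain(n_functions: int, rng: random.Random) -> List[FunctionParams]:
    """Draw random parameters for every function in a service-chain."""
    chain = []
    for _ in range(n_functions):
        rate = rng.uniform(100_000, 200_000)
        chain.append(FunctionParams(
            expected_rate=rate,
            overhead=rng.uniform(30, 120),
            xi_lb=rng.uniform(-0.3 * rate, 0),  # assumed uniform
            xi_ub=rng.uniform(0, 0.3 * rate),   # assumed uniform
        ))
    return chain


chain = generate_chain(3, random.Random(0))
```

Passing an explicit `random.Random` instance keeps each run reproducible, which is convenient when the same generator is reused for many Monte Carlo runs.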
4.2 Comparing AutoSAC with stateoftheart
In this section, we evaluate AutoSAC through a Monte Carlo simulation with \(15 \cdot 10^{4}\) runs, where it is compared against two state-of-the-art methods for autoscaling VMs in industry: dynamic autoscaling (DAS) and dynamic overprovisioning (DOP). However, since these two methods do not use any admission control, they are also augmented with the admission controller presented in Section 3.1. The two augmented methods are denoted by “DAS with AC” and “DOP with AC.” Hence, in total, the method presented in Section 3 is compared with four other methods.
Dynamic autoscaling (DAS)

add a VM if the efficiency is above 99%,

remove a VM if the efficiency is below 95%,
Dynamic overprovisioning (DOP)
A downside of DAS is that it reacts slowly to sudden changes in the input. A natural alternative would therefore be to instead do dynamic overprovisioning, where one measures the input to each function and allocates virtual resources such that there is an expected overprovisioning of 10%.
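One way to read the overprovisioning rule is as a sizing formula: allocate enough instances for 110% of the measured input, based on the expected per-instance service-rate. The Python sketch below follows that reading. The function name and the rounding up to a whole number of instances are assumptions made for illustration.

```python
import math


def dop_instances(measured_input_rate: float, expected_rate: float,
                  margin: float = 0.10) -> int:
    """Instances needed so that the expected capacity exceeds the input by `margin`."""
    if expected_rate <= 0:
        raise ValueError("expected service-rate must be positive")
    return math.ceil((1.0 + margin) * measured_input_rate / expected_rate)


# 1,000,000 pps of input with 150,000 pps per instance: 1.1e6 / 1.5e5 ≈ 7.33.
m_dop = dop_instances(measured_input_rate=1_000_000, expected_rate=150_000)
```

Here eight instances are allocated. Note that the rule relies on the expected rate only, so it does not compensate for instances that perform worse than expected.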
Monte Carlo simulation
The five methods are compared using a Monte Carlo simulation with \(15 \cdot 10^{4}\) runs. For every run, 1 h of input data was randomly selected from the total of 120 h shown in Fig. 1. Furthermore, in every run, a new service-chain with five functions was generated using the method described in Section 4.1. The end-to-end deadline was set to 50 ms, which in turn was split into local deadlines of 10 ms for each function.
The evaluation of the Monte Carlo simulation is based on the average utility \(U(t)=\frac{1}{t}\int_{0}^{t} \sum_{i=1}^{n} u_{i}(x)\,\mathrm{d}x\). Since a packet that misses its deadline (which is possible when using DAS or DOP) is considered useless, it is evaluated as a dropped packet when exiting the function. It therefore impacts the availability metric and in turn the utility. Should all packets miss their deadlines in function \(F_{i}\) for a time interval \(\tau\), then \(a_{i}(t) = 0\ \forall t \in \tau\), i.e., the availability would be evaluated as 0 during this time-interval since the output of the function is considered useless.
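For sampled simulation data, the average utility \(U(t)\) can be approximated numerically. The Python sketch below uses the trapezoidal rule. It assumes that the per-function utilities \(u_{i}\) have already been computed at each sample instant and that the first sample is taken at time 0. The function name and the sample values are hypothetical.

```python
from typing import List, Sequence


def average_utility(times: Sequence[float],
                    utilities: Sequence[Sequence[float]]) -> float:
    """Approximate U(t) = (1/t) * integral over [0, t] of sum_i u_i(x) dx.

    times[k] is the k-th sample instant (times[0] == 0, strictly increasing);
    utilities[k][i] is u_i at times[k].
    """
    if len(times) < 2 or len(times) != len(utilities) or times[0] != 0:
        raise ValueError("need matching samples starting at time 0")
    totals: List[float] = [sum(u) for u in utilities]
    integral = 0.0
    for k in range(1, len(times)):
        step = times[k] - times[k - 1]
        integral += 0.5 * (totals[k] + totals[k - 1]) * step  # trapezoid area
    return integral / times[-1]


# Two functions sampled at 0, 1 and 2 s; the second one dips to 0 at t = 1 s.
u_avg = average_utility([0.0, 1.0, 2.0], [[1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```

In this example the summed utility is 2, 1, and 2 at the three instants, so the integral is 3 and the average over 2 s is 1.5.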
Results
When augmenting DAS and DOP with the admission controller derived in Section 3.1, the performance is increased by 20–40%, purely as a result of not having these sudden drops in performance. However, AutoSAC still performs 5–10% better, due to the feedforward property of AutoSAC which gives it a faster reaction time to changes in the input as well as the feedback property leading to better prediction and robustness against the machine uncertainties.
5 Summary
In this work, we have developed a mathematical model for NFV forwarding graphs residing in a cloud environment. The model captures, among other things, the time needed to start/stop virtual resources (e.g., virtual machines or containers), and the uncertainty of the performance of the virtual resources, which can deviate from the expected performance due to other tenants running loads on the physical infrastructure. The packets that flow through the forwarding graph must be processed by each of the virtual network functions (VNFs) within some end-to-end deadline.
A utility function is defined to compare the performance of different methods for controlling NFV forwarding graphs. The utility function is also used to derive an automatic service and admission-controller (AutoSAC) in Section 3. It ensures that packets that exit the forwarding graph meet their end-to-end deadline. The service-controller uses feedback from the actual performance of the virtual resources, making it robust against uncertainties and deviations from the expected performance. Furthermore, it uses feedforward between the VNFs, making it fast to react to changes in the input load.
In Section 4, AutoSAC is evaluated and compared against four other methods in a Monte Carlo simulation with \(15 \cdot 10^{4}\) runs. The input load for the simulation is a real-world trace of traffic data gathered over 120 h. The traffic is normalized to have a peak of 10,000,000 packets per second. AutoSAC is shown to have better performance than what is offered in the cloud industry today. We also show that when the industry methods are augmented with the admission controller derived in Section 3, their performance increases significantly.
It would be interesting to extend this work by investigating how to derive a controller when the true performance is unknown or when the time-overhead needed to start virtual resources is unknown. Moreover, it would be interesting to investigate how to control a forwarding graph that has forks and joins, i.e., a graph structure rather than just a chain.
Acknowledgments
The authors would like to thank Karl-Erik Årzén and Joao Monteiro Soares for the useful comments on early versions of this paper.
References
1. ETSI (2012) Network Functions Virtualization (NFV), https://portal.etsi.org/nfv/nfv_white_paper.pdf
2. Leitner P, Cito J (2016) Patterns in the chaos: a study of performance variation and predictability in public iaas clouds. ACM Trans Internet Technol 16(3):15
3. Shen W, Yoshida M, Kawabata T, Minato K, Imajuku W (2014) vconductor: An nfv management solution for realizing end-to-end virtual network services. In: Network Operations and Management Symposium (APNOMS), 2014 16th Asia-Pacific. IEEE, pp 1–6
4. Moens H, De Turck F (2014) Vnfp: A model for efficient placement of virtualized network functions. In: 10th International Conference on Network and Service Management (CNSM) and Workshop. IEEE, pp 418–423
5. Mehraghdam S, Keller M, Karl H (2014) Specifying and placing chains of virtual network functions. In: 2014 IEEE 3rd International Conference on Cloud Networking (CloudNet). IEEE, pp 7–13
6. Mao M, Li J, Humphrey M (2010) Cloud auto-scaling with deadline and budget constraints. In: 2010 11th IEEE/ACM International Conference on Grid Computing. IEEE, pp 41–48
7. Wang X, Wu C, Le F, Liu A, Li Z, Lau F (2016) Online vnf scaling in datacenters, arXiv preprint arXiv:1604.01136
8. Li Y, Phan L, Loo BT (2016) Network functions virtualization with soft real-time guarantees. In: IEEE International Conference on Computer Communications (INFOCOM)
9. Di Natale M, Stankovic JA (1994) Dynamic end-to-end guarantees in distributed real time systems. In: Proceedings of the 15th IEEE Real-Time Systems Symposium, pp 215–227
10. Jiang S (2006) A decoupled scheduling approach for distributed real-time embedded automotive systems. In: Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium, pp 191–198
11. Serreli N, Lipari G, Bini E (2009) Deadline assignment for component-based analysis of real-time transactions. In: 2nd Workshop on Compositional Real-Time Systems, Washington, DC, USA
12. Serreli N, Lipari G, Bini E (2010) The demand bound function interface of distributed sporadic pipelines of tasks scheduled by EDF. In: Proceedings of the 22nd Euromicro Conference on Real-Time Systems, Bruxelles, Belgium
13. Hong S, Chantem T, Hu XS (2015) Local-deadline assignment for distributed real-time systems. IEEE Trans Comput 64(7):1983–1997
14. Jayachandran P, Abdelzaher T (2008) Delay composition algebra: A reduction-based schedulability algebra for distributed real-time systems. In: Proceedings of the 29th IEEE Real-Time Systems Symposium, Barcelona, Spain, pp 259–269
15. Henriksson D, Lu Y, Abdelzaher T (2004). In: Proceedings of the 16th Euromicro Conference on Real-Time Systems, pp 61–68
16. Bonafiglia R, Cerrato I, Ciaccia F, Nemirovsky M, Risso F (2015) Assessing the performance of virtualization technologies for nfv: a preliminary benchmarking. In: 2015 Fourth European Workshop on Software Defined Networks. IEEE, pp 67–72
17. Kapoor R, Porter G, Tewari M, Voelker GM, Vahdat A (2012) Chronos: Predictable low latency for data center applications. In: Proceedings of the Third ACM Symposium on Cloud Computing, ser. SoCC ’12. New York, NY, USA: ACM. [Online]. Available: http://doi.acm.org/10.1145/2391229.2391238, pp 9:1–9:14
18. Cruz RL (1991) A calculus for network delay, part I: Network elements in isolation. IEEE Trans Inf Theory 37(1):114–131
19. Le Boudec JY, Thiran P (2001) Network Calculus: a theory of deterministic queuing systems for the internet, ser. Lecture Notes in Computer Science, vol 2050. Springer
20. Bezanson J, Edelman A, Karpinski S, Shah VB (2014) Julia: A fresh approach to numerical computing, arXiv preprint arXiv:1411.1607
21. 2016, 10. [Online]. Available: https://aws.amazon.com/
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.