Optimized decision algorithm for Information Centric Networks

Information Centric Networks (ICN) enable network, server-context and user-context awareness in order to achieve an enhanced architecture for the delivery of multimedia content. The information comes from different sources and serves as input for the decision algorithms that choose the pertinent configuration, such as the best server or a suitable delivery path. Therefore, the relevance of the input information and the efficiency of the decision algorithms are both crucial for the system performance. This paper proposes exploiting multi-criteria optimization algorithms in the context of ICN. Based on the reference level decision approach, an optimized algorithm is proposed, which considers the impact of different network and server parameters and dynamically adapts the decision to the current state of the system. An additional contribution of the paper is a comprehensive simulation model of video content consumption, which represents a large-scale network. This model was designed to compare the effectiveness of decision algorithms proposed for ICN. The presented simulation results demonstrate the effectiveness of the proposed decision algorithm and suggest its deployment in future media networks.


Introduction
In recent years, research initiatives for improving multimedia streaming in the Future Internet [1] have multiplied. Since the creation of the future media networks (FMN) cluster [2], a series of research projects have developed various complex systems that optimize the content transfer and dynamically adapt the streaming parameters to prevent degradation. These systems, e.g., [3][4][5][6][7][8][9][10], are generally known as Information Centric Networks (ICN). The selection of pertinent streaming parameters requires awareness of, among others, user profile and context, terminal capabilities, network conditions and server context [11]. For example, knowledge of the location of content replicas, server and network conditions, and content transfer requirements makes optimisation feasible by taking centralised decisions [12], which improves the utilisation of network resources and leads to better quality experienced by users.
One of the challenges in designing an ICN system is the specification of the network-awareness process, which measures the network performance metrics that have a significant impact on the quality perceived by consumers and associates them with an acceptable cost model [13,14]. Despite the different existing monitoring tools, characterizing network conditions is still a rough task. In addition, the metrics are correlated, and this correlation is, in general, impossible to model.
Thanks to network awareness, the selection of the transmission parameters (e.g., content source, bitrate, etc.) can be improved. Such an improvement requires an optimized decision algorithm, which considers the possible solutions (e.g., a number of content sources, the different bitrates to download the content, etc.) and decides the best one for the current network conditions. The decision is, in general, an NP-complete problem, since it amounts to a multi-criteria decision problem. In this case, heuristics are usually used to compute a sub-optimal solution.
This paper analyses the decision algorithms for adaptive streaming in ICN and proposes a novel algorithm that optimizes the overall performance of the system. Our algorithm exploits multi-criteria optimisation based on a decision space composed of a set of metrics. Such an approach, in contrast to previous proposals that exploit a single parameter for the decision algorithm, e.g., packet delay [15] or path length [16], can select an optimised solution, especially in the case of a certain correlation between decision parameters. The effectiveness of the proposed algorithm has been evaluated by simulation experiments. In these experiments, we compare the proposed decision algorithm with other recently proposed multi-criteria algorithms as well as with the random selection strategy. These simulation experiments were performed assuming an Internet-scale video content consumption system, which models a large Video-on-Demand (VoD) service provider. This model ensures that the analysed decision algorithms are compared in quasi-realistic conditions.
This paper is organized as follows. In Sect. 2 we present an overview of decision strategies used in currently proposed ICN solutions. Sect. 3 presents the multi-criteria decision algorithms developed for content server selection in ICN. The details of the proposed multi-criteria decision algorithm are presented in Sect. 4. It extends the current reference level-based algorithms and has improved performance, as shown in the simulation studies presented in Sect. 5. The simulations are intended to compare the proposed algorithm with other solutions in order to quantify its efficiency. The conclusions are presented in Sect. 6.

Analysis of ICN systems
Recently, ICN has gained attention in various research initiatives, e.g., ALICANTE [3], COMET [4], PSIRP/PURSUIT [5,6], 4WARD/SAIL [9], DONA [17]. Some ICN-based mechanisms have also been proposed in other ICT fields (e.g., Internet of Things [18]). Each of them follows new design paradigms, which treat the content as the primary citizen of the network. The investigated approaches differ in particularities, but all of them support: (1) ubiquitous and location-independent content identifiers, (2) content-aware routing of requests towards the selected content server, (3) in-network caching and content storage, (4) a flexible data plane allowing for anycast and point-to-multipoint connections, and (5) application- and location-agnostic content access. In this way, the ICN becomes a sophisticated content access and delivery system instead of a simple host-to-host communication network.
One of the research challenges in ICN is the appropriate decision process, which selects, for example, the best content source for serving incoming content requests. The investigated approaches assume that decisions could be taken by the network infrastructure, by the client applications, or by the content provider. Among the solutions relying on the network infrastructure, we can distinguish the "route-by-name" [17,18] and DNS-like approaches [4,9]. The "route-by-name" approach assumes that every ICN node forwards the content request towards the destination server based on its local knowledge. In these ICN systems, the decision about server selection is taken in a distributed way as a concatenation of local optimizations. Therefore, the final solution may not be optimized in the global scope. On the other hand, the DNS-like approaches collect information about available content replicas, content server status and network conditions, and then use it for selecting the best content server to serve the consumer request. In principle, the DNS-like approaches are centralized and could lead to the globally optimal solution. However, the challenge for the DNS-like approaches is to design an effective and scalable information system which collects information about content localisation, server load and network status with appropriate accuracy. The investigated approaches exploit distributed information systems designed on federation principles.
The client-side decision strategy assumes that the application selects the best content source based on information collected by itself. The investigated approaches [3,19,20] exploit dynamic probing and statistical estimation of different information such as round trip delay, bandwidth and server responsiveness. The results presented in [19] confirm that even simple dynamic probing outperforms blind client-side approaches. However, the main limitation of the client-side strategy is its limited scalability in an Internet-wide ICN deployment.
In order to overcome these limitations, the server-side selection strategy has been investigated [20,21]. It allows information to be aggregated at the server side and reused for redirecting the content requests coming from different specific areas. Moreover, the information at the server side can be pro-actively collected and processed before content requests arrive. These features significantly improve the scalability of server-side selection strategies.
Although the investigated ICN approaches differ in input parameters, decision strategies and system architectures, all of them require an efficient multi-criteria decision algorithm.

Multi-criteria decision algorithms
In this section we briefly introduce multi-criteria analysis and present the reference level decision approach [22,23], which constitutes the basis for our algorithm. We believe that a brief reminder of multi-criteria decision theory allows a better understanding of the role of decision algorithms in ICN systems. Note that the main motivation for using multi-criteria decision methods in ICN systems comes from the complex set of input parameters covering content characteristics and location, server and network conditions, and content transfer requirements. Multi-criteria optimization requires the definition of the decision space of the problem.
This space covers all candidate solutions considered by the decision process. They are denoted as decision vectors $x = (x_1, x_2, \ldots, x_m)$. Each decision vector contains $m$ decision variables. Any decision variable may have a bounded number of feasible values defined by some given constraints. Multi-criteria optimization focuses on optimizing a set of $k$ objective functions $\pi_1(x), \pi_2(x), \ldots, \pi_k(x)$, which can be maximized or minimized. Note that the problem does not lose generality when we consider only minimization. The aggregate objective function composes a vector of these objective functions for each decision vector: $\Pi(x) = (\pi_1(x), \pi_2(x), \ldots, \pi_k(x))$. In multi-criteria optimization, a solution $x$ is treated as dominating the solution $x'$ if and only if $\forall k^* \in \{1, \ldots, k\}: \pi_{k^*}(x) \le \pi_{k^*}(x')$ and $\exists k^* \in \{1, \ldots, k\}: \pi_{k^*}(x) < \pi_{k^*}(x')$, and a solution $x'$ is called efficient if and only if there does not exist another solution $x$ dominating $x'$. The Pareto optimal set is composed of all efficient solutions, while the Pareto Frontier covers all outcome vectors $y$ given by $y = \Pi(x)$, where $x$ is an efficient solution. Whenever the Pareto optimal set contains more than one efficient solution, the Decision Process should choose one of them. In fact, the Decision Process could (1) provide a priori some knowledge about the problem in order to ensure that the efficient solution resulting from the model is unique or (2) consider a posteriori the whole set of efficient solutions and choose one unique solution.
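The dominance relation and the Pareto optimal set can be illustrated with a short sketch. It assumes that all objectives are minimized and that each candidate is already given as its outcome vector; the function names and the example values (RTT in ms, server load) are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence


def dominates(y_a: Sequence[float], y_b: Sequence[float]) -> bool:
    """Return True if outcome y_a dominates y_b (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(y_a, y_b))
    strictly_better = any(a < b for a, b in zip(y_a, y_b))
    return no_worse and strictly_better


def pareto_set(outcomes: List[Sequence[float]]) -> List[int]:
    """Return the indices of the efficient (non-dominated) outcome vectors."""
    return [
        i
        for i, y in enumerate(outcomes)
        if not any(dominates(other, y) for j, other in enumerate(outcomes) if j != i)
    ]


# Hypothetical candidates: (RTT in ms, server load in [0, 1]).
candidates = [(20.0, 0.9), (50.0, 0.5), (90.0, 0.1), (60.0, 0.6)]
efficient = pareto_set(candidates)  # the last candidate is dominated by the second
```

In this toy example three candidates remain efficient, so the decision process still has to choose among them, which is the role of the cost functions discussed next.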
Applying multi-objective optimization [24] to an ICN system is a challenging task because a description of the network behavior is unattainable. Therefore, the decision maker must select the most effective solution from a group of feasible and non-dominated solutions described by m decision variables (m criteria) [25,26]. Moreover, the effectiveness of the decision algorithm strongly depends on the proper selection of the considered decision variables (e.g., server load, routing path load, end-to-end packet transfer delay, available bandwidth at the server and user sides) as well as on the algorithm itself.
The commonly recognized approach to solving the multi-criteria problem is to transform it into a single-criterion problem by applying a specific cost function (e.g., [27]), which takes the decision variables as its arguments. Although any strictly monotonic and convex function could be used as a cost function, the Minkowski norm of order p, see (1), is widely exploited in many practical approaches, where $v_i$, $i = 1, \ldots, m$, are the decision variables, $w_i$ are the weights of each variable and $p$ is the shaping factor enforcing non-linear aggregation of the decision variables.
The significant limitation of the above cost function is the need for a priori setting of the decision variable weights $w_i$ and the shape factor $p$ that yields the non-linear aggregation. This feature limits the applicability of the Minkowski norm, since the ICN system usually has no a priori knowledge about how to fix the appropriate values of the weights $w_i$ and the shape factor $p$. Although the ICN system could estimate the values of some parameters, e.g., the server load and the Round Trip Time (RTT), by active probing, there is still the problem of how to balance the importance of these two variables by fixing the weights $w_i$. Moreover, the implementers have to investigate how the decision maker should tune the shape factor $p$ to calculate the cost of the candidate solutions.
It is worth mentioning that decision strategies based on a priori assumptions about the values of the weights are not the most effective ones. The main issue is that one can always find a specific example where the decision algorithm does not select the best feasible solution. Let us consider a linear combination of two random variables corresponding to RTT and server load (p = 1). In this case, a candidate with medium values of RTT and load will never be selected in preference to a solution with significantly different values of the decision variables, i.e., light load and high RTT or vice versa. A similar effect can be observed for the value of p. The decision maker must know the preferences about the decision variables in advance. However, the proper setting of the value of p is not a trivial issue because the decision variables may be correlated.
The commonly recognized approach to overcome this problem assumes independent evaluation of the decision variables. This heuristic is often the only possible solution in content network dimensioning (e.g., [28]). Let us remark that the independence of the decision variables is obtained by a decision algorithm which uses (2) as the cost function. This means that the limit of the Minkowski norm with p going to infinity prefers a feasible solution based solely on the most sensitive variable, while ignoring the other variables.
In Fig. 1, we present the Pareto optimal set for different values of p. When M(p → ∞), the decision variables are treated independently. The independent treatment of decision variables constitutes the basis for the multi-criteria decision algorithm with reference levels proposed, among others, in [22] and [23].
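The effect of the shape factor can be illustrated with a minimal sketch. It assumes one common weighted form of the Minkowski norm, $M(p) = (\sum_i (w_i v_i)^p)^{1/p}$, whose limit for $p \to \infty$ is $\max_i (w_i v_i)$; this form, the function name and the normalized example values are assumptions made for illustration and may differ from the exact expressions (1) and (2).

```python
import math
from typing import Sequence


def minkowski_cost(v: Sequence[float], w: Sequence[float], p: float) -> float:
    """Weighted Minkowski norm of order p (assumed form); p = inf gives the max term."""
    if math.isinf(p):
        return max(wi * vi for wi, vi in zip(w, v))
    return sum((wi * vi) ** p for wi, vi in zip(w, v)) ** (1.0 / p)


# Hypothetical normalized candidates: (RTT, server load), both in [0, 1].
candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]
weights = (1.0, 1.0)

# With p = 1 the three candidates cannot be told apart in this toy example.
linear_costs = [minkowski_cost(c, weights, 1) for c in candidates]

# With p -> infinity each candidate is judged by its worst variable only,
# so the balanced candidate (index 1) has the lowest cost.
best_for_large_p = min(
    range(len(candidates)),
    key=lambda s: minkowski_cost(candidates[s], weights, math.inf),
)
```

The sketch only shows that the ranking of candidates depends on p and on the weights, which is why both must be fixed before any decision is taken.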
The decision algorithms with reference levels use two reference parameters, called reservation level and aspiration level, in order to weight the importance of a particular decision variable. The reservation level defines the upper limit for the decision variable, which should not be exceeded by a feasible solution. On the other hand, the aspiration level constitutes the lower bound beyond which values of the decision variable are indistinguishable because they have the same preference level. The reference levels are fixed a priori by the decision maker to express his/her preferences. Formally, the cost function is defined by equation (3), where the reservation and aspiration levels for decision variable i are denoted by $r_i$ and $a_i$, respectively. The decision algorithm with reference levels assumes that the decision variables are independent, so there is no need for the shape parameter p. However, we still need to fix appropriate weights of the decision variables. Therefore, Kreglewski et al. (see [29]) proposed to calculate the values of the reservation and aspiration levels based on the feasible solutions. Let $s^{(m)} = [v_{1s}, \ldots, v_{ms}]$ be a solution in the space of feasible solutions of size $S \times m$. The reservation and aspiration levels of decision variable i are estimated from the maximum and minimum values of this variable in the space of feasible solutions, see formula (4).
The cost of the considered solution is calculated using equation (3) with the reference levels determined by formula (4).
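A minimal sketch of a reference level decision with levels derived from the feasible solutions is given below. It assumes that all decision variables are minimized, that the reservation and aspiration levels are the maximum and minimum of each variable, and that each solution is scored by the normalized term $(r_i - v_{is})/(r_i - a_i)$ in a max-min fashion; this concrete form, the handling of constant variables and all names are assumptions for illustration, not a transcription of equations (3) and (4).

```python
from typing import List, Optional


def reference_level_choice(solutions: List[List[float]]) -> Optional[int]:
    """Pick the solution maximizing its worst normalized term (assumed max-min form)."""
    if not solutions:
        return None
    m = len(solutions[0])
    # Reservation level r_i: worst (maximum) value; aspiration level a_i: best (minimum).
    reservation = [max(s[i] for s in solutions) for i in range(m)]
    aspiration = [min(s[i] for s in solutions) for i in range(m)]

    best_index, best_score = None, float("-inf")
    for index, s in enumerate(solutions):
        # Variables with identical values in all solutions carry no information
        # and are skipped (this also avoids a division by zero).
        terms = [
            (reservation[i] - s[i]) / (reservation[i] - aspiration[i])
            for i in range(m)
            if reservation[i] > aspiration[i]
        ]
        score = min(terms) if terms else 0.0
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Hypothetical feasible servers: [RTT in ms, server load in [0, 1]].
servers = [[20.0, 0.9], [50.0, 0.5], [90.0, 0.1], [60.0, 0.6]]
chosen = reference_level_choice(servers)
```

No weights or shape factor have to be supplied: the normalization by the observed range of each variable plays the role of the weights.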
In the proposed optimized reference level decision algorithm, described in the next section, we enhance the reference level approach by considering the impact of the current decision on the future state of the ICN system. Such a prediction allows us to keep the ICN system away from undesirable states, e.g., server or network overload. We believe that our approach is a step forward in decision algorithm analysis, which has the potential to improve the performance of ICN systems.

Optimized reference level decision algorithm
As stated above, the authors of [29] proposed an algorithm that uses the maximum and minimum values of the vector $V_i$ as the reservation and aspiration levels, respectively. So, the authors considered that the comparison terms $c_{is}$, which should be minimized (in the space of m variables) and maximized (in the space of S feasible solutions), are as indicated in (5). Formula (6) presents the decision algorithm. The comparison terms $c_{is}$ depend on the value $(\max[V_i] - \min[V_i])$ but do not consider how the values of the vector $V_i$ are distributed between $\max[V_i]$ and $\min[V_i]$, i.e., the comparison terms do not consider the variance between the elements of the vector $V_i$.
The algorithm presented below aims to reduce the decision importance (by reducing the weight) of the variables whose values across the feasible solutions have low variance. This is achieved by modifying the comparison terms $[c_{is}]$.
The motivation is the following: when the feasible solutions have similar values for one specific variable (called variable i), the selection of any solution does not change the state of the whole system (as far as variable i is concerned). So, we consider that such a variable should not be decisive during the selection, which means in practice that such a variable should have a lower weight within the decision algorithm.

Fig. 2 Pareto optimal set for the case of M(p) = 1 (w1 = w2 = 1); panel title: Solution of decision algorithms ($c_{is}$ and $c'_{is}$)
Then, the proposed comparison terms $[c'_{is}]$ are as presented in (8) and the decision algorithm is the one presented in (9).
The values $c'_{is}$ decrease for higher values of $\sigma_i$, and lower values of $c'_{is}$ are preferred in formula (9). In conclusion, decision variables with higher variance get a higher weight in the decision algorithm.
Consider a system with 4 feasible solutions (S = 4) and two variables (m = 2) with the values presented in Fig. 2A. The terms $c_{is}$ and $c'_{is}$ lead to the selection of different feasible solutions: s = 2 for $c_{is}$ and s = 3 for $c'_{is}$, as we can see in Fig. 2B. In the first case, the selection of s = 2 is based on a better value of variable i = 1. Since this variable differs very little between the two solutions (s = 2 and s = 3), the selection of either of them will not change the system much (as far as variable i = 1 is concerned). Therefore, the selection should be based on the values of variable i = 2, which is what the decision algorithm with $c'_{is}$ achieves.
Note that the ratio of $c'_{is}$, described in formula (8), to $c_{is}$, described in formula (5), is less than or equal to 1, see (10). Therefore, $c_{is} \ge c'_{is}$.
On the other hand, the zeros of $c_{is}$ and $c'_{is}$ coincide. In the paper [30] we proposed a comparison term with similar characteristics to the present one, i.e., its value depended on the variance of $V_i$. The major difference is that the comparison term of [30] was defined differently and, as a result, the algorithm could prefer a solution with a value equal to the reference level of one or more variables. Even though this does not disqualify the comparison term of [30], we think that the current solution $c'_{is}$ offers better results, and we will demonstrate this in the simulation studies presented in the next section.
The case in which all feasible solutions have the same value of variable i (i.e., $\max[V_i] = \min[V_i]$) requires separate treatment. In this case, the decision variable i is not considered in the decision algorithm, as was the case for earlier solutions based on reference level algorithms [29,30].
The proposed algorithm reassesses the importance of variables with lower values of variance. This way, the system is more efficient since, indirectly, the decisions take into account the state of the system after the selection. This means that the system reaches the saturation point more slowly than with the basic reference level decision algorithm. The simulations will show this point.
Let us remark that the proposed algorithm does not require more information than the basic reference level algorithm and, therefore, no other mechanisms are necessary. The only requirement is a few more lines of code, which means low capital and operational expenditures in deployment.

Simulation environment and results
We evaluate the proposed solution by performing simulations on an extensive model of a network dedicated to video on demand (VoD) streaming. The model takes its parameters from the largest content and service providers and includes the network topology, server characteristics such as their locations, and service details such as content duration and popularity. Moreover, users are also added to this model following their current distribution in the Internet.
The model of the network topology is taken from the Internet topology that CAIDA [31] publishes every year. The topology only considers Autonomous Systems (36,000 domains) and inter-domain links (103,000 links). We classified the Autonomous Systems into tier-1, 2 or 3 by considering the peering, providing or consuming relations with the neighboring domains. The capacity of an inter-domain link was assumed to be a value from a uniform distribution U[0.5, 1.5] Gbps for tier-3 inter-domain links, following the guidelines in [32]. We assumed a value 10 times higher for the capacity of inter-tier-2 links (U[5.0, 15.0] Gbps) and 100 times higher for inter-domain links with tier-1 (U[50.0, 150.0] Gbps). In this topology, we placed content servers in the domains following the ideas proposed in [33]. Specifically, for the top 50 largest content providers, network providers and CDNs (e.g., Level3, Global Crossing, Akamai, LimeLight, AT&T, Comcast, Google), the number of servers corresponds to the information from white papers (e.g., [34]) and illustrative information on their homepages. In other domains, we assigned a random number of servers between 50 and 150, which approximates the situation of Akamai: Akamai has 84,000 servers in 1000 domains. The total number of servers in the model is more than 200,000. Moreover, each server has up to 100 film titles and may serve up to 200 streams in parallel. These data agree with current servers on the market [35].
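The capacity and server assignment described above can be sketched as follows. The tier-to-range mapping and the 50 to 150 server range follow the text; the function names, the keying of links by a single tier number and the use of a seeded generator are assumptions made for illustration.

```python
import random

# Uniform capacity ranges in Gbps per link tier, as stated in the text.
CAPACITY_RANGE_GBPS = {
    3: (0.5, 1.5),     # tier-3 inter-domain links
    2: (5.0, 15.0),    # inter-tier-2 links (10 times higher)
    1: (50.0, 150.0),  # links with tier-1 (100 times higher)
}


def sample_link_capacity(tier: int, rng: random.Random) -> float:
    """Draw the capacity of one inter-domain link from the range of its tier."""
    low, high = CAPACITY_RANGE_GBPS[tier]
    return rng.uniform(low, high)


def sample_server_count(rng: random.Random) -> int:
    """Draw the number of servers of an ordinary domain (50 to 150, inclusive)."""
    return rng.randint(50, 150)


rng = random.Random(2010)  # fixed seed (arbitrary) for repeatable experiments
tier3_capacity = sample_link_capacity(3, rng)
servers_in_domain = sample_server_count(rng)
```

How a link between domains of different tiers is classified is not covered by this sketch and would have to follow the classification used in the model.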
The servers contain different content files, whose parameters were acquired from the 5000 most popular titles in filmweb [36] on December 1st, 2010. The duration of these films is, on average, 4100 s. To each title, we allotted a streaming bandwidth between 2.6 and 3.4 Mbps. Videos in the Netflix Canadian network [37] are streamed in this bandwidth range.
Content replication was allocated using Zipf's law, which models the video distribution in large networks [32,38,39] (more popular contents were copied more times in the network). We assumed a value of the skew parameter (Zipf's law) equal to 0.2 following the guidelines in [32]. The copies were randomly located within the servers, but no server had two or more copies of the same content.
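One possible way to realise such a replication is sketched below. It assumes that the number of replicas of the title with popularity rank k is proportional to $k^{-0.2}$ and that every title has at least one copy; the function names, the rounding rule and the total number of replicas are hypothetical, and the per-server limit of 100 titles is not enforced in this sketch.

```python
import random
from typing import Dict, List


def zipf_replica_counts(num_titles: int, total_replicas: int, skew: float = 0.2) -> List[int]:
    """Number of replicas per title, proportional to rank**(-skew), at least one each."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_titles + 1)]
    norm = sum(weights)
    return [max(1, round(total_replicas * w / norm)) for w in weights]


def place_replicas(counts: List[int], num_servers: int, rng: random.Random) -> Dict[int, List[int]]:
    """Place the replicas of each title on distinct, randomly chosen servers."""
    return {
        title: rng.sample(range(num_servers), min(count, num_servers))
        for title, count in enumerate(counts)
    }


rng = random.Random(7)  # arbitrary seed
counts = zipf_replica_counts(num_titles=10, total_replicas=100)
placement = place_replicas(counts, num_servers=50, rng=rng)
```

Sampling without replacement guarantees that no server receives two copies of the same title, which matches the constraint stated above.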
The user population was also based on CAIDA data [31]. We used values of user population which are proportional to the number of advertised prefixes in the given domains. Since this number varies slightly, we took the minimum value over a period of 5 days.
Once the topology is prepared together with the servers and content, we start the simulations by generating user requests for content. Each user request carries the requested content and the domain from which the request arrives. Then, the system receives information on server and path loads (considering the shortest path) and triggers the selection algorithm. As a result, the algorithm selects the best server to serve the arrived request. Let us remark that, for simplicity, the algorithm selects the server among 500 feasible servers previously selected at random (S = 500).
Once the server is selected, one connection is added to the number of connections served by the server at this moment, and the streaming bandwidth of the content is added to the current load of each link of the end-to-end path (between the server and the user). When the server or any link in the path exceeds a certain threshold (200 connections for servers and the assigned bandwidth for links), all the connections using the specific server/path are considered unsuccessful.
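The accounting of loads and unsuccessful connections can be sketched as follows. The class and function names are hypothetical, connection teardown at the end of a stream is omitted, and the small capacities in the usage example are chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Resource:
    """A server (load in connections) or a link (load in Mbps)."""
    capacity: float
    load: float = 0.0
    connections: Set[int] = field(default_factory=set)


def admit(conn_id: int, server: Resource, path: List[Resource],
          bitrate_mbps: float, failed: Set[int]) -> None:
    """Register a new connection and mark every connection on an overloaded resource."""
    server.load += 1  # one more stream served by the selected server
    server.connections.add(conn_id)
    for link in path:
        link.load += bitrate_mbps  # streaming bandwidth added to each link of the path
        link.connections.add(conn_id)
    for resource in [server, *path]:
        if resource.load > resource.capacity:
            # All connections using the overloaded server/link count as unsuccessful.
            failed.update(resource.connections)


# Toy usage: a server limited to 2 streams and one link of 10 Mbps.
server = Resource(capacity=2)
link = Resource(capacity=10.0)
failed: Set[int] = set()
for conn_id in range(3):
    admit(conn_id, server, [link], bitrate_mbps=3.0, failed=failed)
```

In this toy run the third stream overloads the server, so all three connections end up in the set of unsuccessful ones, while the link stays below its capacity.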
We used three different decision algorithms. The first decision algorithm is the random selection of the content server. The second one is the basic reference level algorithm following formulas (5) and (6), and the third decision algorithm is the optimized reference level algorithm following formulas (8) and (9).
As stated above, there are two reasons for considering the delivery of the content as unsuccessful: server overload or link overload. Figures 3 and 4 present the relation between successful and unsuccessful connections (called success ratio) for an increasing arrival rate (λ) of requests (whole system). The unsuccessful connections in Fig. 3 are caused by server overload, whereas, in Fig. 4, the unsuccessful connections are caused by link overload. Let us remark that the results are very dependent on the parameters of the model (e.g., the maximum capacity of servers). A threshold of 200 simultaneous connections in servers ensures that the overload caused by servers and by links appears in a similar range of λ.
The results show that the basic reference level algorithm is more effective (i.e., it serves more successful connections) than the random one. For example, at a success ratio equal to 0.9, the request arrival rate served by the basic reference level algorithm is three times higher than that served by the random one.
A success ratio equal to 0.9 is considered a satisfactory value. For a success ratio equal to 0.9, the optimized reference level algorithm sustains a request rate of 4100 requests/s (see Fig. 4), which is almost 800 requests/s more than in the case of the basic reference level algorithm.
Fig. 3 Success ratio due to server overload
Fig. 4 Success ratio due to link overload
The comparison of the three decision algorithms shows that the gain of both reference level algorithms over the random algorithm is definitely larger than the gain of the optimized reference level algorithm over the basic one. On the other hand, the cost of introducing a reference level algorithm is high, since the system has to bring network awareness into the decision process. This means that a monitoring system should be developed on the server side and in the network. In contrast, the cost of introducing the optimized algorithm is negligible in comparison with the basic reference level one. In fact, the only cost is programming a new algorithm, while the necessary information from the network and server sides is the same.
The last tests that we performed were intended to compare two optimized algorithms based on the reference level technique. The first one is the algorithm presented in [30] and the second one is the algorithm presented in this paper. As we can see in the results presented in Fig. 5, the success ratio due to server overload of the presented algorithm is slightly higher than that of the algorithm presented in [30] for all the values of λ. As pointed out in the previous section, the reason is that the algorithm presented in [30] selects solutions near the reference level with higher probability and, therefore, the system saturates slightly faster than with the present algorithm. Because of this, the present algorithm achieves a better success ratio.
Fig. 5 Success ratio due to link overload for two optimized reference level algorithms
In order to ensure that the results are trustworthy, we repeated very long simulations several times. Each simulation comprised $10^{12}$ content requests. During the simulations, we checked the state of a number (100) of randomly selected servers and links. The goal was to understand whether the servers or links entered any state loop, e.g., a permanently overloaded state. The monitoring of the servers and links showed that all of them changed state (light or heavy load) many times without any remarkable pattern. This supports the trustworthiness of the results. Finally, the stability of the results was checked by computing the success ratio at different moments of the simulations, i.e., when the simulations had processed $0.5 \times 10^{12}$, $0.75 \times 10^{12}$ and $1.0 \times 10^{12}$ content requests. In all the cases, the results were identical.

Conclusions
In this paper we discussed the multi-criteria decision problem applied to the selection of transmission parameters in Information Centric Networks. We presented the general problem and different approaches proposed in the literature. We argued that, for the case that concerns us, the reference level techniques seem to be appropriate. The paper presents a new algorithm that optimizes the basic reference level algorithm. The optimization is based on introducing into the system awareness of the state of the system after the selection. Concretely, the optimized algorithm takes the decision on the basis of the preferred variables which are crucial for the future state of the system. In contrast, the basic reference level algorithm does not consider the future state of the system and takes decisions that seek the best quality of the current content transmission.
The optimized reference level decision algorithm prefers the variables with higher variance, since a selection based on these variables may have a significant impact on the system, while the variables with small variance do not induce a big change in the system after the selection.
We performed simulations on an extended model of an Information Centric Network in order to understand the gain of the proposed algorithm in comparison with currently used decision algorithms. The results showed that the optimized algorithm is slightly better than the basic algorithm. Even if the gain is not very high, we advise using the optimized one, since there is no supplementary cost of its use and in no situation does the optimized algorithm behave worse than the basic reference level algorithm.
The results show a significant improvement of the algorithms based on reference level techniques compared to random selection. This point supports the efficiency of the ICN architectures that introduce information on the state of the system into the decision process for the transmission parameters. The simulations presented in this paper were also intended to compare two similar optimized reference level-based algorithms in the considered model. The results indicated similar behavior of the system for both algorithms (with a slight gain for the algorithm proposed in this paper). Further research in this area will be directed at understanding the influence of the different simulation parameters (e.g., content distribution within the servers) on the final results.