An Optimized Resilient Advance Bandwidth Scheduling for Media Delivery Services

In IP-based media delivery services, we often deal with predictable network load and traffic, making it beneficial to use advance reservations even when network failure occurs. In such a network, to offer reliable reservations, fault-tolerance related features should be incorporated in the advance reservation system. In this paper, we propose an optimized protection mechanism in which backup paths are selected in advance to protect the transfers when any failure happens in the network. Using shared backup path protection, the proposed approach minimizes the backup capacity of the requests while guaranteeing 100% single link failure recovery. We have evaluated the quality and complexity of our proposed solution, and the impact of different percentages of backup demands and timeslot sizes has been investigated in depth. The presented approach has been compared to our previously designed algorithm as a baseline. Our simulation results reveal a noticeable improvement in request acceptance rate, up to 9.2%. Moreover, with fine-grained timeslot sizes and under limited network capacity, the time complexity of the proposed solution is up to 14% lower.


Introduction
Currently, in the media-centric industries, the distribution of media content is generally performed either by people transporting the content on physical storage media or over dedicated point-to-point high-speed optical links. However, these are highly inefficient and costly methods. In order to support decentralized collaboration, reduce capital expenditures and increase network resource utilization, media-related environments tend to switch to cost-effective IP-based WAN approaches. Deploying a shared IP-based WAN solution enables the existing media content owners and their collaborators to work together in a cost-effective way, while new actors can more easily find new collaboration opportunities, thus fostering the whole industry's further growth.
As media-centric networks usually offer predictable network traffic, this knowledge of future transmissions can be exploited to use advance reservation (AR) services. This makes it easier to offer guarantees in advance, improves the number of admitted requests and increases network utilization. In AR techniques, users submit requests for future data transfers, generally encompassing a start time in the future, a deadline, and total data transfer size or rate. To allocate the necessary resources (more specifically network bandwidth), a scheduling algorithm is needed to ensure that all admitted requests finish before their specified deadline, while admitting as many requests as possible. Clearly, AR has advantages for next-generation media-related networks: it allows network operators to better plan resource usage, leading to greatly increased resource utilization and guaranteed Quality of Service (QoS).
Reliability of the transport is also of crucial importance in the digital-centric media transfer process, when different media actors are geographically located far apart. Therefore, strategies to deal with network dynamics such as failures should be defined to enable reliable transmission of accepted requests without any loss in QoS upon occurrence of a failure. For example, in media production networks, meeting transfers' deadlines is of crucial importance. Consider a live show or a news program which is broadcast every day at a specific time. Clearly, even slight delays in the transfer of pre-production contents are intolerable in such a setting. Media-centric networks impose requirements not supported by existing AR scheduling techniques, such as different types of video or audio transfers, flexible or unspecified start or end times, strict deadlines, interdependent requests, reliability, etc. Addressing these unexplored aspects was the main focus of our previous contributions. First, optimal and near-optimal AR scheduling algorithms, customized for media production networks, were proposed [1]. To offer reliable reservations, we further presented a resilient version of our approach, based on a protection mechanism, to improve the reliability of AR systems [2]. The proposed scheme is capable of covering single link failures using pre-reserved disjoint backup paths. Additionally, the resilient solution improves the scheme's availability compared to the non-resilient approach.
Continued research has shown us that the resilient bandwidth allocation algorithm in [2] can be further improved, and this is the main contribution of the work presented in this paper. This algorithm is optimized to reduce network reservation waste by proposing a more efficient solution to finding an optimal allocation of bandwidth for each file transfer request. We have made a tradeoff between the complexity and performance of the resilient bandwidth allocation algorithm for file-based transfers. This results in better network utilization and consequently a higher request admittance ratio.
The remainder of this paper is structured as follows. In Section 2, we discuss the related work. Section 3 provides brief information about the media delivery services and elaborates on the resilient AR approach. The proposed solution to improve the performance of the resilient AR scheduling approach is described in Section 4. The designed algorithms are explained in Section 5. Section 6 provides simulation results and, finally, Section 7 concludes the paper.

Related work
The authors in [3] survey the AR algorithms, mostly focusing on Wavelength-Division Multiplexing (WDM) networks, and provide a taxonomy for classifying these algorithms. Advance reservation requests can be classified into 4 individual categories, which are also valid for different types of requests in media-related networks. However, based on this survey, only two works offer variable-bandwidth reservation in their scheduling process [4,5]. While both approaches consider a fixed start time for the requests, all four classes for requests with specified or unspecified time and duration are supported in our work. Current research on AR scheduling mostly focuses on rescheduling [6–8], multi-domain reservations [9], and real-life deployments [10–13]. Nevertheless, reliability and fault tolerance properties have not been investigated.
Resilient AR systems can be deployed either through restoration or protection failure recovery mechanisms [14]. In protection approaches, backup resources are reserved in advance before any failure happens in the network, while in case of restoration backup resources are found upon failure detection. The former results in more resource consumption but the recovery time is quite fast. In [15], the authors propose a restoration technique to deal with link failures. In their work, the active requests and the scheduled requests for the future which are affected by a failure are restored. In [16,17], optimal ILP-based solutions were proposed to provide shared and dedicated path protection. Authors in [18,19] also provide resiliency through shared path protection. Since meeting strict deadlines and QoS requirements is of great importance in our approach, we have made use of protection mechanisms.
The work detailed in this paper presents a significant optimization to our previous works on bandwidth reservation approaches [1,2]. In [1], we proposed a theoretical Integer Linear Programming (ILP) based model and heuristic algorithms for advance bandwidth reservation with no support for failure recovery. In [2], the media production reservation system is enhanced by following a protection mechanism and provisioning backup reservations for each request. This resilient solution guarantees 100% recovery against any single link failure. While this approach strives to minimize the bandwidth needed for the backup paths and determines allocations quickly, it does not fully utilize the network capacity. This work aims at optimizing the resilient solution proposed in [2] and improving the request admittance ratio by reducing wasted network capacity and thus improving network utilization.

Background
We briefly explain the specific properties of media delivery networks and provide a summary of the resilient advance bandwidth reservation approach.

Media delivery networks
In the digital-centric media-related industries, various actors are connected to a shared wide area network to build a collaboration over IP media contents. This shared network supports the exchange of different media contents, e.g. raw and encoded video and audio files and streaming transfers. We refer to each transfer as a request. A request can have fixed or unspecified start and end times. Media delivery requests, supported in our work, are of 4 different classes: independent streaming requests, independent file transfers, dependent streams and dependent file-based transfers. Requests of the independent type can be started at the specified start time, but dependent requests have to wait until the requests upon which they depend have finished. Dependency among different transfers implies that either all or none of the interdependent requests must be admitted. We refer to a set of interdependent requests as a scenario.
We assume that the volume of file-based requests and the duration of streaming requests must be specified. The allocated bandwidth for the streams must be equal to their required bandwidth demand, from the start time to the end time, because their demand is fixed. However, for file-based requests, the volume of the file is the determining factor. The file can be transferred whenever possible from the time the file is ready to be transferred until its deadline. The residual demand of file-based transfers is updated whenever a part of the file is transferred.

Resilient AR scheduling
In order to respond quickly to sudden changes such as failures in the network, we use a protection mechanism which finds backup paths for connections in advance, before the occurrence of any failure, to ensure there is enough capacity left when failures occur. The objective is to minimize the resource usage of the protection paths while full recovery is guaranteed against any single link failure. In this scheme, first the primary paths for a given request are determined using an advance reservation algorithm which we presented in [1]. Then disjoint backup paths are found corresponding to these primary paths [2]. Note that in the proposed schemes, the user can indicate the amount of required backup for each request. This way, the higher priority requests can be fully protected while the ones with lower priority can remain partially protected or unprotected.
Using shared backup path protection (SBPP) [19] and multi-path approaches, to provide full protection against a single link failure, the backups have to provide the maximum bandwidth allocated on the links of the primary paths. In the resilient approach, when bandwidth is allocated to a request, we look for disjoint paths to be reserved as backup paths for that request. In practice, a request may not ask for full recovery of its bandwidth demand upon occurrence of a failure, and it is sufficient that a portion of the request demand is transmitted to the destination. Therefore, we compare the maximum primary allocation and the amount of requested backup and select the smaller value as the limit to be fulfilled by the backups. If this backup limit can not be satisfied, the algorithm backtracks to the initial state and retries the bandwidth allocation with half of the primary bandwidth demand. This division by 2 is repeated until both primary and backup demands are satisfied. If the file can not be accommodated by its deadline, the scenario to which this request belongs is rejected and all of its reservations are returned to the network resource pool. Full details can be found in [2].
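The backup limit and the halving retry described above can be sketched as follows. This is a minimal illustration, not the implementation from [2]: the function names, the `try_reserve` callback (standing in for the actual primary-plus-backup allocation attempt), the `floor_mbps` cut-off, and the reading of the requested backup as a fraction of the total primary allocation are all assumptions made for the example.

```python
from typing import Callable, List, Optional


def backup_limit(primary_per_path_mbps: List[float], backup_fraction: float) -> float:
    """Bandwidth the backups must provide (hypothetical helper).

    Takes the smaller of (a) the largest primary allocation on any single
    path and (b) the requested backup, assumed here to be a fraction of the
    total primary allocation.
    """
    max_primary = max(primary_per_path_mbps)
    requested_backup = backup_fraction * sum(primary_per_path_mbps)
    return min(max_primary, requested_backup)


def halve_until_feasible(
    limit_mbps: float,
    try_reserve: Callable[[float], bool],
    floor_mbps: float = 1.0,  # assumed cut-off; the text does not specify one
) -> Optional[float]:
    """Retry the allocation with half the primary demand until it fits.

    try_reserve(x) is assumed to return True when both a primary allocation
    of x Mbps and its corresponding backup can be reserved.
    """
    candidate = limit_mbps
    while candidate >= floor_mbps:
        if try_reserve(candidate):
            return candidate
        candidate /= 2  # backtrack and retry with half of the demand
    return None  # nothing fits in this timeslot
```

For instance, with a multi-path primary allocation of 200 and 50 Mbps, `backup_limit([200, 50], 1.0)` selects 200 Mbps, while a 50% backup request gives 125 Mbps.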

Resilient AR scheduling architecture
In this section, we briefly explain the heuristic-based resilient AR architecture, detailed in [2], for the reliable bandwidth scheduling problem. This approach consists of different blocks, presented in Figure 1. As can be seen, new scenarios enter the scheduler through an API and can be transformed by the input transformation block. Then the resilient scheduling algorithm, which follows a timeslot-based scheme, is invoked. A timeslot is defined as a period of time in which reservations remain invariant. The new scenario is admitted and the schedule is updated provided that this algorithm succeeds in allocating bandwidth to all the scenario's requests. Otherwise, the previous schedule remains unchanged. In the resilient scheduling component, first the requests within the scenario are prioritized and then the FixedTimeSlot algorithm is called. In the prioritization algorithm, the requests' priorities are assigned first based on the deadline, and then on the request's demand. Requests with an earlier deadline and higher demand receive higher priorities.
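The prioritization rule above (earlier deadline first, then higher demand) amounts to a two-key sort. A minimal sketch, assuming a hypothetical `Request` record whose field names are chosen only for this illustration:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    # Field names are illustrative assumptions, not taken from [2].
    name: str
    deadline_min: int    # deadline, in minutes from the start of the schedule
    demand_mbps: float   # requested bandwidth demand


def prioritize(requests: List[Request]) -> List[Request]:
    """Order requests from highest to lowest priority.

    Earlier deadlines come first; ties are broken by higher demand.
    """
    return sorted(requests, key=lambda r: (r.deadline_min, -r.demand_mbps))
```

Because the deadline is the first sort key, demand only matters among requests that share the same deadline.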

The FixedTimeSlot algorithm iterates over the timeslots with 5 sub-modules:
1) TimeSlotRequests: determines the eligible requests which can be served at the current timeslot.
2) Limit: defines a limit for each request. For the streams, this limit is equal to their demand, which is fixed. For file-based requests, the residual demand divided by the timeslot size is considered as the limit.
3) Sorting: sorts the requests based on their priorities, assigned by the prioritization algorithm.
4) BWallocationResilient: responsible for resilient bandwidth allocation to the requests depending on their types.
5) Update and check for feasibility: once the required demands are allocated to the requests, the schedule is updated if all the deadlines are met. Otherwise this schedule is infeasible.
Full details can be found in [2].

Optimized resilient AR scheduling
In this section, we elaborate on how the reliable scheduling approach has been improved to achieve a higher request admittance ratio.
In the resilient approach, if the requested backup can not be found, the algorithm retries the primary allocation with a fraction of the primary bandwidth demand (50% in our case). Although this is a fast approach, we found that halving the request demand does not always lead to an optimal solution, because we may miss the opportunity to transfer a higher volume of the file, and the network capacity may not be fully utilized if other concurrent requests can not make use of it.
As such, we propose to make better use of leftover capacities by deploying the binary search mechanism [20] for finding the maximum value which satisfies both primary allocation and the requested bandwidth demand. This way, the per-step complexity of the reservations increases while a higher amount of allocations will potentially be achieved. We elaborate more on this with two examples.
Example 1: consider Figure 2 with a file-based request of 300 Gb and a timeslot size of 5 min. The limit component sets 1 Gbps (300 Gb/300 s) as the limit of bandwidth reservations for this request. The resilient approach first checks if it can fulfill both a 1 Gbps primary allocation and the requested backup demand. The amount of backup allocation depends on the percentage of requested protection and also on the way the primary reservations are allocated. If the request's backup demand can not be fulfilled, the limit of the request is divided by 2. Then the same procedure is repeated for the lower limit of 500 Mbps. If both the 500 Mbps primary demand and its corresponding backup are available, the primary and backup allocations are reserved, the request demand is updated by subtracting the reserved capacity, and the rest of the file has to wait for the next timeslots to be processed. Otherwise, this division by 2 is continued until the request limit is fulfilled or the file is finally rejected. However, this division by 2 is not efficient if the network is able to provide e.g. 400 Mbps. In that case, the resilient reservation approach can only provide 250 Mbps. This is shown in Figure 2b following a multi-path allocation approach, i.e. 200+50 Mbps for primary and 200 Mbps (to cover single link failure) for shared backup paths. Although the remaining 150 Mbps can be occupied by other requests, this is not optimal and may result in a waste of network resources.
In order to improve the network utilization, we propose to make use of binary search to find an optimal value for the amount of reservations. This way, if the algorithm recognizes that 500 Mbps is not available but 250 Mbps can be offered, instead of returning 250 Mbps, the algorithm tries to find the maximum available value between 250 Mbps and 500 Mbps. The proposed solution first takes the middle value (375 Mbps) and checks the possibility of reservations again. As this is available, the algorithm again checks the middle value between 375 Mbps and 500 Mbps, which is 437.5 Mbps. This trend is continued with 406.25 and then 390.625, etc. until the difference between the upper and lower bounds is smaller than a given margin, which we refer to as ε. Assuming 2 Mbps as this margin, the algorithm stops at 399.4 Mbps. The reservation based on the optimized resilient AR approach is shown in Figure 2c. This margin can be altered to make a tradeoff between achieving a precise optimal value and solution complexity.
Example 2: Figure 3 shows an example of a schedule for 3 file-based requests, R1, R2 and R3. The timeslot size is 1 hour and in both figures only primary reservations are shown. In Figure 3a and Figure 3b, the reservations are made using the original and optimized versions of the resilient advance bandwidth reservation approach, respectively. Figure 3b reveals how the optimized resilient approach can improve network utilization and increase the probability of admittance for future requests. As can be seen, by allocating a higher volume of a given file, this file can potentially be transferred earlier compared to the original approach. This way, higher capacity is available for future requests and the request admittance ratio will potentially be increased.

Resilient timeslot-based AR algorithms
In this section, we elaborate on the optimized BWallocationResilient+ algorithm, shown in Algorithm 1, which first assigns a cost to each network link using a cost allocation module. We have previously designed two algorithms for resilient bandwidth allocation depending on the type of the request, which we referred to as BWallocationFBResilient and BWallocationVSResilient for file-based and streaming requests respectively. As we have optimized the resilient approach for file transfers, we do not elaborate on the BWallocationVSResilient algorithm. The common part of both algorithms is repeatedly finding the least-cost paths between the source and destination of a given request until the limit of that request is fulfilled. However, when the limit of the request is not available, each approach follows a different course. For a file-based request, the maximum available capacity is reserved, as the rest of the file can be processed during the next timeslots. For the streams, if there is not enough capacity to allocate to the request, it can not be served and thus the feasibility is set to false. The next step in the resilient algorithms is to find the backup paths. Depending on the backup demands and primary allocations, the amount of backup demand is first calculated. Both algorithms check if the backup can be fulfilled. In order to cover single failures, the backups have to be disjoint from the primary paths. As such, the links used in the primary paths are removed from the network and the bandwidth allocation algorithms are reused on the residual network to find the backup paths for that request. If the backups can be found, the primary and backup paths can be successfully allocated for the request. Otherwise, the primary paths have to be removed. Again, if the backups for a streaming request are not fulfilled, the scheduling is not successful. However, for file transfers, if the backup can not be provided, the algorithm tries with a lower primary bandwidth demand. This is repeated until both primary and backup demands of the file are satisfied. If this algorithm is being executed in the timeslot prior to the request deadline, and both primary and backup demands can not be fulfilled, the entire scenario to which the file belongs is rejected.
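The link-disjoint backup search described above (remove the links of the primary path, then rerun the path search on the residual network) can be illustrated with a small sketch. It is deliberately simplified and is not the BWallocationFBResilient algorithm itself: it finds a single least-cost primary path with Dijkstra's algorithm, ignores link capacities and multi-path allocation, and all names (`Graph`, `least_cost_path`, `without_links`, `primary_and_backup`) are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Assumed representation: node -> {neighbour: link cost}; a bidirectional
# link appears in both directions.
Graph = Dict[str, Dict[str, float]]


def least_cost_path(graph: Graph, src: str, dst: str) -> Optional[List[str]]:
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            break
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def without_links(graph: Graph, path: List[str]) -> Graph:
    """Residual network: a copy of the graph without the links used by path."""
    used = set()
    for a, b in zip(path, path[1:]):
        used.add((a, b))
        used.add((b, a))  # remove both directions of a bidirectional link
    return {u: {v: c for v, c in nbrs.items() if (u, v) not in used}
            for u, nbrs in graph.items()}


def primary_and_backup(graph: Graph, src: str, dst: str
                       ) -> Optional[Tuple[List[str], List[str]]]:
    """Primary path plus a link-disjoint backup, or None if either is missing."""
    primary = least_cost_path(graph, src, dst)
    if primary is None:
        return None
    backup = least_cost_path(without_links(graph, primary), src, dst)
    if backup is None:
        return None  # the primary would have to be released in this case
    return primary, backup
```

In a full scheduler the same search would be repeated until the request's limit is met and would take residual link capacities into account; this sketch only shows why removing the primary links guarantees that the backup shares no link with the primary.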

BWallocationFBResilient+ algorithm
The main idea behind proposing the BWallocationFBResilient+ algorithm is to improve the performance of the BWallocationFBResilient algorithm. In the BWallocationFBResilient algorithm, if the backup for a given request can not be provided, the limit of primary allocations is repeatedly halved and the possibility of the reservation is checked with this lower limit. We argue that this halving cycle can be improved by deploying a binary search algorithm. That is, given a file-based request, we seek the maximum available bandwidth which satisfies both primary and backup demands. Therefore, if the BWallocationFBResilient+ algorithm finds that a value X can satisfy both primary and backup demands, instead of returning this value, which was the case in the BWallocationFBResilient algorithm, a higher value based on the binary search approach is investigated, and this is repeated until a near-optimal value (within an ε margin) is calculated and returned. This process is shown in Algorithm 3.

This section evaluates the quality and execution time of the proposed solution, compared to our previously designed resilient timeslot-based scheduling algorithms. For this analysis, SARA (Static Advance Reservation Approach) is evaluated, in which all requests are known in advance, before the start of scheduling. The influence of the available network capacity, network load, backup demand, timeslot granularity and execution times is assessed.
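The binary search refinement used by BWallocationFBResilient+ can be sketched as below. This is a hedged illustration rather than a transcription of Algorithm 3: the function name, the `feasible` callback (standing in for the combined primary and backup allocation check) and the exact stopping rule (stop once the bracket is no wider than the margin and return the last feasible value) are assumptions.

```python
from typing import Callable


def refine_allocation(
    lo_mbps: float,                      # value known to be feasible
    hi_mbps: float,                      # value known to be infeasible
    feasible: Callable[[float], bool],   # assumed primary+backup check
    eps_mbps: float = 2.0,               # margin on the bracket width
) -> float:
    """Binary search for the largest feasible bandwidth within a margin."""
    lo, hi = lo_mbps, hi_mbps
    while hi - lo > eps_mbps:
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid   # mid fits: look for something larger
        else:
            hi = mid   # mid does not fit: look for something smaller
    return lo          # always a value that passed the feasibility check
```

With a feasibility check that accepts anything up to 400 Mbps, the bracket [250, 500] and a 2 Mbps margin, this sketch probes 375, 437.5, 406.25, 390.625, ... and returns 398.4375 Mbps. A slightly different stopping rule would yield a slightly different value within the same margin, so the result need not coincide exactly with a figure obtained from another implementation.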

Evaluation setup
The network topology used for this evaluation contains 8 nodes and 16 bidirectional links. After discussion with our industrial partners in the media production industry, 3 scenario types are defined: a soccer after-game discussion program, an infotainment show and a news broadcast program, consisting of 5, 18 and 8 interdependent file-based and video streaming requests respectively. A detailed overview of the randomized variables of the requests and of the network topology can be found in [1] and [21] respectively. In the fixed-size timeslot-based solution, timeslot granularities of 5, 15 and 30 minutes and backup demands of 50% and 100% are used. Throughout this section, SARA[XX%, YYmin] denotes that a backup demand of XX% and a timeslot size of YY minutes are considered in the fixed-size timeslot-based advance reservation algorithm. SARA+ refers to the optimized resilient bandwidth reservation approach. In this approach, the margin, which we referred to as ε, equals 2 Mbps. Each simulation run covers a 24-hour period. All results are averaged over 50 runs with different randomized inputs; error bars denote the standard error. All algorithms in this section are implemented in Java 8.
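For reference, averaging a metric over repeated runs and deriving the standard error for the error bars can be done as in the following sketch. It assumes the standard error is the sample standard deviation (with the n - 1 denominator) divided by the square root of the number of runs, which the text does not state explicitly; the function name is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_standard_error(samples: Sequence[float]) -> Tuple[float, float]:
    """Mean of the runs and the standard error of that mean."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two runs are needed for a standard error")
    mean = statistics.fmean(samples)
    # Sample standard deviation divided by sqrt(n) (assumed definition).
    sem = statistics.stdev(samples) / math.sqrt(n)
    return mean, sem
```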

Evaluation results
Evaluation of network capacity, backup demands and timeslot sizes: In Figures 4 and 5, the network infrastructure has been configured for different available bandwidths, respectively for 100% and 50% backup demands, to investigate the impact of available network capacity, backup demands and timeslot sizes on the performance of our algorithms in terms of percentage of admitted requests. In both evaluations, 7 scenarios (67 requests) are submitted to the bandwidth reservation system and the network capacities vary from 50 Mbps to 400 Mbps.
What we can observe in these evaluations is as follows. First, the finer the timeslot size, the higher the gain achieved by the SARA+ approach. As can be observed from Figure 4, the SARA+ approach is able to achieve on average up to 3.6%, 7.3% and 9.2% higher admittance ratio with 30-, 15- and 5-minute timeslot sizes, respectively. Second, with higher backup demand, the performance gain of SARA+ is more significant. In Figure 5, with 50% backup demand, SARA+ is able to outperform the SARA approach on average by up to 8.5% with 5-minute timeslots. Comparing Figures 4c and 5c, SARA+ with 100% backup demand improves the request admittance ratio on average up to 2.8 times compared to the 50% backup demand. As can be seen in Figure 6, by increasing the number of scenarios, the percentage of admitted requests decreases and the SARA+ approach performs better with fine-grained timeslot sizes. We notice that the advance bandwidth reservation system gains more by deploying the SARA+ approach and, with the 5-minute timeslot size, shows up to 7.3% higher request admittance ratio.
The time complexity of the approaches is evaluated in Figure 7 for an increasing range of scenarios. This figure reveals that the granularity of the timeslot size impacts the execution times of both approaches differently. While with the 30-minute timeslot size the execution time of SARA+ is up to 147 milliseconds higher compared to the SARA approach, with 5-minute timeslots this time is up to 4.5 seconds lower. These results indicate that the quality and complexity of the advance bandwidth reservation system can be improved by deploying the SARA+ approach with fine-grained timeslot sizes.
For further investigation of the execution time, we have assessed the impact of network capacity on the execution time when a timeslot granularity of 5 minutes is used. This is shown in Figure 8. The number of scenarios is 7 and 14 in Figure 8a and 8b respectively. This evaluation shows that when there is enough bandwidth capacity available, the SARA approach is able to perform faster, while SARA+ can better manage the time under stressed network conditions, i.e. limited network capacity.

In this paper, we have optimized the resilient scheduling algorithms previously presented for advance bandwidth reservation in media-centric networks. In the original version, for a given file transfer, if both primary and backup demands can not be fulfilled, the algorithm is repeatedly executed with 50% of the primary demand until both demands are fulfilled or the request is rejected. We proposed to make use of binary search instead of halving the primary demand and showed that this optimization improves the performance of the timeslot-based advance reservation system in terms of request admittance ratio. The impact of available capacity, network load, timeslot sizes and backup demands has been evaluated. Based on the results, we can conclude that the proposed solution performs particularly well under limited network capacity and with fine-grained timeslot sizes. The proposed approach outperforms the original one both in terms of the execution time, with the 5-minute timeslot size, and the percentage of admitted requests, by up to 9.2%.

Fig. 1: Components of the resilient advance bandwidth reservation approach.

Fig. 2: An example of primary and backup reservations based on the original and the optimized resilient AR approaches. (Black: network capacity, Blue: Primary reservation, Red and dashed: Backup reservation)

Fig. 3: Comparing the primary allocations of the original and optimized versions of resilient timeslot-based advance bandwidth reservation approach.

Fig. 4: Impact of timeslot size with 100% backup demand in the timeslot-based advance bandwidth reservation approach.

Fig. 5: Impact of timeslot size with 50% backup demand in the timeslot-based advance bandwidth reservation approach.

Fig. 6: Impact of network load in the fixed-size timeslot-based advance bandwidth reservation approach.

Fig. 8: Comparing the execution times as a function of network capacity in the fixed-size timeslot-based advance bandwidth reservation approach.