An Optimal Lower Bound for Buffer Management in Multi-Queue Switches
DOI: 10.1007/s00453-012-9677-8
 Cite this article as:
 Bienkowski, M. Algorithmica (2014) 68: 426. doi:10.1007/s00453-012-9677-8
Abstract
In the online packet buffering problem (also known as the unweighted FIFO variant of buffer management), we focus on a single network packet switching device with several input ports and one output port. This device forwards unit-size, unit-value packets from input ports to the output port. Buffers attached to input ports may accumulate incoming packets for later transmission; if they cannot accommodate all incoming packets, their excess is lost. A packet buffering algorithm has to choose from which buffers to transmit packets in order to minimize the number of lost packets and thus maximize the throughput.
We present a tight lower bound of e/(e−1)≈1.582 on the competitive ratio of the throughput maximization, which holds even for fractional or randomized algorithms. This improves the previously best known lower bound of 1.4659 and matches the performance of the algorithm Random Schedule. Our result contradicts the claimed performance of the algorithm Random Permutation; we point out a flaw in its original analysis.
Keywords
Online algorithms · Competitive analysis · Packet buffering · Buffer management
1 Introduction
We study the unweighted FIFO variant of the buffer management problem introduced by Kesselman et al. [21]. In this problem, also called the packet buffering problem [3], we focus on a single packet switching device which has m input ports and one output port. The goal of such a device is to forward packets from its input ports to the output port. The burstiness of the incoming traffic motivates the use of buffers that can accumulate incoming packets and store them for later transmission. We assume that all packets are of unit size and that each input port has an attached buffer able to store up to B packets. We consider the unweighted case, in which all packets are equally important.
Time is slotted in the following way. At any (integer) time t≥0, any number of packets may arrive at the input ports and they are appended to the appropriate buffers. If a buffer cannot accommodate all the packets, the excess is lost. Then, during the time step corresponding to time interval (t,t+1), the device can transmit a single packet from a single buffer. The key difficulty of this problem is that it is inherently online: the buffer managing algorithm does not know where packets will be injected in the (nearest) future. The goal is to minimize the number of lost packets; the only decision made by the algorithm at time t is choosing the buffer from which a packet is transmitted in step (t,t+1).
In our setting, no information about the future is available to the algorithm, i.e., the decision of an online algorithm at time t has to be made solely on the basis of the input sequence up to time t. In particular, we make no probabilistic assumptions about the input and assume it is created in a worst-possible manner by an adversary. For analyzing the efficiency of online algorithms, we use competitive analysis [12]: on any input sequence, we compare the throughput (the number of transmitted packets) of an algorithm with that of the optimal offline schedule. The supremum (taken over all inputs) of the ratios between these two values is called the competitive ratio of the algorithm and is subject to minimization. An algorithm is called R-competitive if its competitive ratio is at most R. In case of randomized online algorithms, in the definition above, we replace the throughput by its expected value.
The queuing model considered in this paper is typical for input-queued switches or routers, which are the dominant packet switching architecture of the Internet. Moreover, despite the popularity of theoretical research on Quality-of-Service solutions (cf. Sect. 1.2), most of the current networks (most notably those using the IP protocol) provide only “best effort” services, where all packets are treated as equally important and do not have to respect any deadlines. The unweighted variant considered here is therefore typical for such networks. Minimizing the packet loss is an important issue, as subsequent packet retransmissions by the standard transport-layer protocols like TCP are quite costly, and may lead to performance degradation. Finally, there is observable evidence that the network traffic exhibits so-called self-similar properties [25], which renders analyses based on stochastic queuing theory inapplicable. Hence, it is reasonable to apply the worst-case, competitive analysis rather than stochastic assumptions.
1.1 Previous Work
The contributions on this topic fall into three main categories: the results for deterministic algorithms, randomized algorithms and (deterministic) fractional algorithms. The capabilities of the last type of algorithms are extended beyond the standard model: a fractional algorithm may transmit arbitrary fractions of packets as long as the total load transmitted in a single step is at most 1 (see Sect. 2.1 for a detailed description). It is a straightforward observation (cf. Sect. 5) that any randomized algorithm can be simulated by a fractional one without increasing its competitive ratio, i.e., the fractional model is “easier” for algorithms and “harder” for the adversary. A relaxed relation in the opposite direction also exists: Azar and Litichevskey showed how to transform a c-competitive fractional solution into a deterministic c⋅(1+⌊H _{ m }+1⌋/B)-competitive one [7].
We start by describing results on randomized and fractional models. The currently best lower bound of 1.4659 (holding for arbitrary buffer sizes B and large m) on the competitive ratio of any randomized algorithm is due to Albers and Schmidt [3]. In Sect. 3.2, we highlight the details of their approach and show that in fact it also works for the fractional scenario. Azar and Richter presented the randomized algorithm Random Schedule [8], whose competitive ratio is e/(e−1)+o(1)≈1.582. Azar and Litichevskey [7] showed how to encode the fractional variant of the problem as an instance of the online fractional matching in a bipartite graph. By constructing an e/(e−1)-competitive solution (based on a natural “water level” approach) for the latter problem, they obtained a fractional e/(e−1)-competitive algorithm. In this paper, we refer to it as FracWaterlevel. Finally, this bound was falsely claimed to be improved by the algorithm Random Permutation [30] with competitive ratio 1.5 for all values of B and m. These competitive ratios were improved for the particular case of two buffers and any B, where an algorithm attaining the optimal competitive ratio of 16/13≈1.231 was given by Bienkowski and Madry [11].
As for the deterministic algorithms, the general upper bound holding for all values of B and m was given by Azar and Richter [8]. They proved that any work-conserving (i.e., transmitting if there is a nonempty queue) algorithm is 2-competitive. They also constructed the lower bound of 1.366 holding for any B and large m. Albers and Schmidt [3] improved that bound to e/(e−1) for m≫B. They also presented the algorithm SemiGreedy that is 1.889-competitive for large B. By taking the algorithm FracWaterlevel and applying the fractional-to-deterministic reduction mentioned above, Azar and Litichevskey [7] obtained a deterministic \(\frac{\mathrm{e}}{\mathrm{e}-1} \cdot(1 + \lfloor H_{m} + 1 \rfloor/ B)\)-competitive algorithm (which we call DetWaterlevel). Again, the results can be improved for particular cases: when B=2, then the SemiGreedy algorithm achieves the optimal competitive ratio of 13/7≈1.857 [3]; when m=2, the optimal ratio 16/13≈1.231 was achieved by the algorithm Segmental Greedy by Kobayashi et al. [24].
Previously best competitive ratios for the problem for general values of B and m. The lower bounds approach values from the table for large m. The starred lower bound for the fractional model was not stated in [3], but is a straightforward adaptation of the lower bound for the randomized algorithms presented therein (cf. Sect. 3.2).

| | Fractional model | Standard model, randomized alg. | Standard model, deterministic alg. |
|---|---|---|---|
| Previous upper bounds | \(\frac{\mathrm{e}}{\mathrm{e}-1}\) [7] | \(\frac{\mathrm{e}}{\mathrm{e}-1}+o(1)\) [8] | \(\frac{\mathrm{e}}{\mathrm{e}-1} \cdot(1 + \lfloor H_{m} + 1 \rfloor/ B)\) [7] |
| Previous lower bounds | 1.4659 [3]^{∗} | 1.4659 [3] | \(\frac{\mathrm{e}}{\mathrm{e}-1}\) [3] (only for B≪m) |
| Lower bounds in this paper | \(\frac{\mathrm{e}}{\mathrm{e}-1}\) | \(\frac{\mathrm{e}}{\mathrm{e}-1}\) | \(\frac{\mathrm{e}}{\mathrm{e}-1}\) |
1.2 Related Work
The throughput maximization problem considered in the unweighted version in this paper was also studied in a more general context, where packets may have different weights and the goal is to maximize the total transmitted weight. Such an extension tries to capture differences of importance between various data streams. Upon packet injection, the buffer managing algorithm makes a decision whether a packet should be accepted to the buffer or dropped immediately. Afterwards, accepted packets have to be transmitted in FIFO order. Such a problem is nontrivial even for the task of managing a single input buffer [1, 5, 6, 31]. A preemptive variant where packets may be dropped from the queue was also studied [4, 6, 16, 21–23, 28, 29, 31].
Another variant of the buffer management problem is the so-called bounded-delay scenario, where neither the buffer has a fixed capacity nor FIFO order is imposed. Instead, each packet specifies a deadline and it must be either transmitted before the deadline or dropped [6, 9, 10, 13–15, 18–21, 23, 26, 27].
For a comprehensive description of these and related models, we refer interested readers to the recent survey by Goldwasser [17].
1.3 Our Contribution
Our main result is a lower bound of e/(e−1) on the competitive ratio of any fractional algorithm, which holds for any B and large m (Theorem 1). It has the following consequences.

The result is optimal up to lower-order terms; it matches the performance of the fractional algorithm FracWaterlevel of [7]. It also gives evidence that the reduction of Azar and Litichevskey from the online fractional bipartite matching to the fractional packet buffering was essentially tight. Even though the fractional matching is more general, the competitive ratios achievable for both problems are the same.

In Sect. 5, we give a simple reduction showing that any lower bound for the fractional model implies the same lower bound for randomized algorithms in the standard model. Hence, our lower bound improves the currently best lower bound of 1.4659 for randomized algorithms (also holding for any B and large m) [3]. It is also optimal up to lower-order terms for randomized algorithms, as it matches the performance of the algorithm Random Schedule [8].

The lower bound contradicts the claimed competitive ratio of 1.5 of the algorithm Random Permutation [30]. In Sect. 6, we point out a flaw in the original analysis of this algorithm. The main source of this flaw is neglecting certain types of adversarial strategies; these strategies are actually employed in our lower bound, cf. Sect. 1.4.

The lower bound of e/(e−1) for deterministic algorithms given by Albers and Schmidt [3] can be applied only when m≫B. Our construction yields the same ratio, but requires only that m is large; B may be arbitrary, e.g., even much larger than m. Thus, in contrast to their construction, ours shows that the deterministic algorithm DetWaterlevel [7] achieves the asymptotically optimal competitive ratio when both m is large and B≫log m.
As stated above, our lower bound for randomized algorithms follows directly from the lower bound for fractional algorithms and the relation between randomized and fractional algorithms (cf. Sect. 5). While for m→∞ this result cannot be improved, determining the best competitive ratio for smaller values of m remains an open problem. In particular, it is not known whether the fractional and randomized scenarios are equivalent in terms of achievable competitive ratios. In Sect. 5, we give evidence that it might not be the case. Namely, we show that for m≥3, no online randomized rounding of a fractional solution is able to preserve the throughput. (A successful randomized rounding for m=2 was given in [11].)
1.4 Used Techniques
While the formal construction of the lower bound is given in Sects. 3 and 4, here we informally describe its three key ingredients.
First, our adversarial strategy creates a packet injection sequence which at time 0 completely fills all the buffers and can be served losslessly by the optimal algorithm. Moreover, after each injection, all buffers of the optimal algorithm are full. These assumptions are rather standard and we list them only for completeness.
Second, we observe that in some cases delaying packet injections and injecting multiple packets at once incurs a greater packet loss for the algorithm. To give a specific example for a randomized algorithm, assume that the buffer size B is 1 and, at time t, the expected number of packets at each buffer is quite low. Then, injecting a packet at times t, t+1, and t+2 is more benign for the algorithm than not injecting a packet at time t, injecting two packets (to two different buffers, b _{ i } and b _{ j }) at time t+1 and injecting a packet at time t+2. At first glance, the former approach tries to incur a loss of the algorithm more aggressively, while the latter gives the algorithm more time to prepare. However, the latter approach can be advantageous to an adversary, because both buffers b _{ i } and b _{ j } become full at time t+1. This can be exploited by injecting a packet at time t+2 to one of these two buffers. In expectation, the loss caused by this injection is then at least 1/2. Although the observation presented above may seem simple, it was not exploited by previous lower bounds. Moreover, assuming that injecting a single packet at a time constitutes the most cruel adversarial approach is the exact source of the flawed analysis of the algorithm Random Permutation [30].
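The bound of 1/2 at the end of this example can be checked with a short sketch. It assumes B=1, works with expected (equivalently, fractional) loads, and assumes that the algorithm spends its whole unit of transmission in step (t+1,t+2) on b_i and b_j, which is the most favorable case for the algorithm. The function name and the grid resolution are illustrative choices and do not come from the paper.

```python
def worst_case_loss(grid=1000):
    """Smallest expected loss the adversary can force at time t+2.

    Both buffers hold expected load 1 at time (t+1)+. During step
    (t+1, t+2) the algorithm removes a share x of one packet from b_i
    and the share 1 - x from b_j.
    """
    smallest = 1.0
    for k in range(grid + 1):
        x = k / grid
        load_i, load_j = 1.0 - x, x  # expected loads at time (t+2)-
        # The adversary injects into the fuller buffer; with B = 1 the
        # expected number of lost packets equals that buffer's load.
        smallest = min(smallest, max(load_i, load_j))
    return smallest
```

Under these assumptions the minimum is attained when the algorithm splits its transmission evenly between the two buffers; transmitting from any other buffer would only leave more load in b_i and b_j.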
Third, our construction is iterative: given an adversarial strategy S for n initially full buffers that reduces the average load of the buffers to a specific level, it constructs a strategy S′ for n ^{2} buffers, which uses S as a black box. The strategy S′ uses only a few more steps (in comparison to the number of buffers) and reduces the average load to an even smaller amount than S does. By applying the construction iteratively, in the limit we obtain a strategy for M buffers that reduces the average load of the buffers to o(1) in time M⋅(e−1)+o(M). Hence, neglecting the lower-order terms, the throughput of the algorithm on such an injection pattern is M⋅(e−1). On the other hand, right after the last injection, the buffers of the optimal solution are still full, and thus its throughput is M⋅(e−1)+M=M⋅e. This implies an asymptotic lower bound of e/(e−1).
2 The Model
Throughout the paper, m denotes the number of buffers and B the size of a single buffer. We denote the set of all buffers by \(\mathcal{B}\).
At any integer time t≥0, the adversary may inject an arbitrary number of packets to arbitrary buffers. It may also choose not to inject anything. If a buffer cannot accommodate all the packets, their excess is lost. We assume that packet injection is instant, and therefore we distinguish between the state of buffers at t ^{−} (right before injection at time t) and t ^{+} (right after the injection).
Then, during the time step corresponding to time interval (t,t+1), the algorithm may transmit a single packet from the buffer of its choice. We assume that after the last packet injection, the algorithm has sufficient time to transmit all the packets it still has in the buffers.
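The integral model described so far can be sketched as a small simulator. This is only an illustration under stated assumptions: the names `simulate` and `longest_queue`, the encoding of the input as a list of injection lists, and the longest-queue policy are hypothetical choices, not algorithms analyzed in this paper.

```python
def simulate(arrivals, m, B, choose):
    """Run the slotted model on m buffers of size B.

    arrivals[t] lists the buffer index of every packet injected at time t.
    choose(load) returns the index of a nonempty buffer to transmit from.
    Returns (transmitted, lost).
    """
    load = [0] * m
    sent = lost = 0
    for injected in arrivals:
        # Injection at time t is instant; the excess is lost.
        for b in injected:
            if load[b] < B:
                load[b] += 1
            else:
                lost += 1
        # Step (t, t+1): transmit a single packet from a single buffer.
        if any(load):
            b = choose(load)
            assert load[b] > 0, "policy must pick a nonempty buffer"
            load[b] -= 1
            sent += 1
    # After the last injection there is enough time to drain everything.
    sent += sum(load)
    return sent, lost


def longest_queue(load):
    """Hypothetical policy: transmit from a most populated buffer."""
    return max(range(len(load)), key=load.__getitem__)
```

For example, with m=2 and B=1, the input `[[0, 1], [1], [1]]` makes `longest_queue` (which breaks the initial tie in favor of buffer 0) lose one packet, whereas a schedule that always serves buffer 1 loses none.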
By the throughput of an algorithm Alg on a sequence σ, denoted T _{ Alg }(σ), we mean the total number of packets transmitted by Alg. When Alg is randomized, T _{ Alg }(σ) denotes its expected throughput. For any algorithm Alg, its competitive ratio is defined as sup_{ σ }{T _{ Opt }(σ)/T _{ Alg }(σ)}, where the supremum is taken over all possible inputs and Opt is the optimal offline algorithm. For randomized algorithms, we assume oblivious adversaries [12], i.e., the ones that do not have access to the random bits used by the algorithm.
2.1 The Fractional Model
It is usually more natural to think about lower bounds for deterministic algorithms. Therefore, in the two subsequent sections, we concentrate solely on a fractional variant of the problem, in which a deterministic algorithm may transmit fractional amounts of packets. Note that we change solely the capabilities of an online algorithm, which means that: (i) the injected packets are still integral and can be injected at integer times only, (ii) an optimal offline solution to which our solution is compared is chosen among integral solutions. We call the amount of packets in a buffer the load of this buffer. For a subset of buffers, the load of these buffers is the sum of the respective buffers’ loads, and the total load is the load of all buffers. Furthermore, in each step the load transmitted from the buffers is at most 1. We show relations between fractional and randomized algorithms in Sect. 5.
By Lemma 5.1 of [7], without loss of generality, we may assume that a fractional algorithm is work-conserving, i.e., in each step it transmits a load of 1 if it has it, and its total load otherwise. We silently make this assumption in our proofs.
3 Basic Building Blocks
In our construction, for simplicity of the description, we assume that B=1. However, the presented approach is valid with virtually no changes for other values of B: we only have to replace an injection of a single packet by injection of B packets to the same buffer and replace each step by B consecutive steps.
An elementary building block of the constructed input sequence is an action. It is parametrized by a subset of buffers A and a positive integer a called length. When we say that the adversary executes action (A,a) at time t, we mean that: (i) nothing is injected during time (t,t+a) and (ii) at time t+a, the adversary chooses a buffers with the maximal load among buffers from A and injects a packet to each of these buffers. Ties are broken arbitrarily. We call set A active in time (t,t+a] or active for this action, and we say that such action operates on set A. The considered action starts at time t and ends at time t+a.
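A single action can be sketched in code as follows, for B=1 and fractional loads. The function `execute_action` follows the two parts of the definition above; the policy `drain_proportionally` is a hypothetical work-conserving algorithm introduced only to have something to run the action against.

```python
def drain_proportionally(load):
    """Hypothetical work-conserving policy: transmit min(1, total load),
    taking from every buffer in proportion to its current load."""
    total = sum(load)
    if total == 0:
        return
    budget = min(1.0, total)
    for b in range(len(load)):
        load[b] -= load[b] * budget / total


def execute_action(load, active, a, transmit):
    """Execute action (active, a) on `load` in place, assuming B = 1.

    (i) nothing is injected for a steps while the algorithm transmits;
    (ii) then the a most loaded buffers of `active` receive one packet each.
    Returns the list of buffers that were injected into.
    """
    for _ in range(a):
        transmit(load)
    fullest = sorted(active, key=lambda b: load[b], reverse=True)[:a]
    for b in fullest:
        load[b] = 1.0  # the buffer becomes full; any excess is lost
    return fullest
```

Starting from four full buffers, the action with a=2 on all of them leaves two buffers full and two at load 1/2 under this policy, so the total load drops from 4 to 3.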
3.1 Canonical Strategies
A brief characterization of the inputs we construct is as follows. First, at time 0, the adversary injects a packet to each of m buffers; this is called the initial injection. Second, at time 0, it executes a canonical (adversarial) strategy. Such a strategy is a sequence of actions, where for each consecutive pair of actions, the latter starts exactly when the former ends. (For technical reasons, which become clear when we describe our recursive construction, the initial injection is not part of the canonical strategy.) Furthermore, we define the length of a canonical strategy as the sum of its actions’ lengths; note that the length of a strategy starting at time 0 coincides with the last time at which packets are injected. We say that something happens during the strategy of length ℓ executed at τ if it happens within time interval (τ,τ+ℓ].
In our construction, we consider canonical strategies only. Canonical strategies are neat to analyze, because the optimal algorithm can serve the corresponding inputs losslessly.
Lemma 1
For a canonical adversarial strategy of length ℓ preceded by the initial injection, the throughput of the optimal offline algorithm is m+ℓ.
Proof
Let (A _{1},a _{1}),(A _{2},a _{2}),…,(A _{ k },a _{ k }) be the canonical strategy. For any 1≤j≤k, during a _{ j } time steps of action j, Opt transmits packets from these a _{ j } buffers to which a packet is injected at time \(\sum_{i=1}^{j} a_{i}\) (i.e., at the end of action j). In effect, during the first \(\sum_{i=1}^{k} a_{i} = \ell\) steps, Opt transmits a packet in each step and does not lose any. Finally, at time ℓ ^{+}, the buffers of Opt are still full, and hence it may transmit all m packets in time (ℓ,ℓ+m). □
By the lemma above, we may use a canonical strategy to show a lower bound on any algorithm in the following way. We show that after a given strategy S of length ℓ is executed against any (work-conserving) algorithm Alg, the total load of Alg at time ℓ ^{+} is at most c. Then, during the first ℓ steps, the load transmitted by Alg is at most ℓ, and afterwards it is at most c. On the other hand, by Lemma 1, the throughput of Opt is ℓ+m. Hence, strategy S implies that the competitive ratio of any algorithm is at least (ℓ+m)/(ℓ+c). Therefore, our goal is to construct an adversarial canonical strategy that reduces the total load of any algorithm as quickly as possible.
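The bound (ℓ+m)/(ℓ+c) is easy to evaluate numerically. The helper below is a minimal sketch; its name and the argument checks are our own additions.

```python
import math


def ratio_lower_bound(length, m, residual_load):
    """Return (l + m) / (l + c) for a canonical strategy of length l on
    m buffers that leaves the algorithm with total load at most c."""
    if length < 0 or m <= 0 or not 0 <= residual_load <= m:
        raise ValueError("expected length >= 0, m > 0 and 0 <= c <= m")
    if length + residual_load == 0:
        raise ValueError("the algorithm's throughput bound must be positive")
    return (length + m) / (length + residual_load)
```

For instance, a strategy of length (e−1)⋅m that empties the buffers of the algorithm would give `ratio_lower_bound((math.e - 1) * m, m, 0.0)`, which equals e/(e−1) up to rounding. This is the target the construction approaches for large m.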
In our construction, for a given number of buffers m, we give a strategy where the lengths of the corresponding actions are fixed (and known to the algorithm). However, the choice of active sets is algorithm dependent.
3.2 Uniform Strategies
The most straightforward canonical strategies are of the form (A,1),(A,1),…,(A,1), where A is a fixed set of buffers. In other words, when executing such a strategy, the adversary injects a packet at the end of each step into the most populated buffer from set A. We call such strategies A-uniform.
We use (A,a)×k as a shorthand for the strategy consisting of k consecutive actions (A,a). The behavior of algorithms on \(\mathcal{B}\)-uniform strategies (recall that \(\mathcal{B}\) is the set of all buffers) was investigated previously by Albers and Schmidt [3]. Here, we show a slightly more general result, which becomes useful later: we consider A-uniform strategies for arbitrary sets A. Note that during an execution of such a strategy, the algorithm may transmit load also from buffers outside of A.
Definition 1
For any subset A consisting of n buffers and a real β>0, the strategy S _{0}(β,A) is the canonical uniform strategy (A,1)×⌈βn⌉.
Lemma 2
Fix a real β>0, any subset of n buffers A, and let ℓ be the length of the strategy S _{0}(β,A). Assume that at a step τ, all buffers of A are full and the adversary starts to execute the strategy S _{0}(β,A). Let H be the load transmitted during S _{0}(β,A) from buffers not in A. Then the load in A at time (τ+ℓ)^{+} is at most H+n⋅e^{−β }+1.
Proof
For simplifying the notation, we assume that τ=0. For any integer t≥0, let C _{ t } be the load of A at time t ^{+}, and for t≥1, let a _{ t }∈[0,1] be the load transmitted from A in step (t−1,t).
As a packet is injected at each time, the algorithm is able to transmit load 1 in each step, i.e., \(H = \sum_{t=1}^{\ell}(1-a_{t})\).
We establish a recursive relation for C _{ t }. Clearly, C _{0}=n. Fix any t∈{1,…,ℓ}. At time t ^{−}, the load in the most populated buffer of A is at least (C _{ t−1}−a _{ t })/n, and thus the load in the remaining buffers is at most (C _{ t−1}−a _{ t })⋅(1−1/n). Due to the injection, the load in A becomes C _{ t }≤(C _{ t−1}−a _{ t })⋅(1−1/n)+1≤(C _{ t−1}−1)⋅(1−1/n)+(1−a _{ t })+1. Hence, \(C_{\ell}\leq(n-1) \cdot( 1-1/n )^{\ell}+ \sum_{t=1}^{\ell}(1-a_{t}) + 1 < n \cdot \mathrm{e}^{-\ell/n} + H + 1\). As ℓ≥βn, the lemma follows. □
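Lemma 2 can be sanity-checked by simulation in the special case where A is the set of all buffers, so that H=0. The sketch below assumes B=1 and runs the uniform strategy against a hypothetical proportional policy; both function names are our own, and the check illustrates the bound for this one policy rather than proving anything.

```python
import math


def drain_proportionally(load):
    """Hypothetical work-conserving policy: transmit min(1, total load),
    taking from every buffer in proportion to its current load."""
    total = sum(load)
    if total == 0:
        return
    budget = min(1.0, total)
    for b in range(len(load)):
        load[b] -= load[b] * budget / total


def run_uniform(n, beta, transmit):
    """Run (A,1) x ceil(beta*n) on n initially full buffers (A = all buffers)
    and return the total load right after the last injection."""
    load = [1.0] * n
    for _ in range(math.ceil(beta * n)):
        transmit(load)
        fullest = max(range(n), key=load.__getitem__)
        load[fullest] = 1.0  # inject one packet into the most populated buffer
    return sum(load)


def lemma2_bound(n, beta, H=0.0):
    """Right-hand side of Lemma 2: H + n * e^(-beta) + 1."""
    return H + n * math.exp(-beta) + 1
```

For example, `run_uniform(50, 1.0, drain_proportionally)` stays below `lemma2_bound(50, 1.0)`, as the lemma predicts for any algorithm.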
We remark that the previously best lower bound, due to Albers and Schmidt [3], was to use the strategy \(S_{0}(\beta,\mathcal{B})\) with a numerically optimized β and large m. The best choice is β≈1.1462, which for m→∞ causes any algorithm to have competitive ratio at least 1.4659.
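The quoted values of β and of the ratio can be reproduced with a short computation. The sketch below is our own reading, not the derivation of [3]: it assumes that for m→∞ the strategy has length ℓ≈βm and leaves load c≈m⋅e^{−β} (Lemma 2 with H=0), plugs these into (ℓ+m)/(ℓ+c), and maximizes the result over β. Setting the derivative to zero gives the condition (β+2)⋅e^{−β}=1, which the code solves by bisection.

```python
import math


def uniform_ratio(beta):
    """Asymptotic bound (beta + 1) / (beta + e^(-beta)) for the uniform
    strategy, assuming length beta*m and residual load m*e^(-beta)."""
    return (beta + 1) / (beta + math.exp(-beta))


def best_beta(lo=0.0, hi=3.0, iterations=100):
    """Solve (beta + 2) * e^(-beta) = 1 by bisection.

    The left-hand side is decreasing for beta > -1, is above 1 at
    beta = 0 and below 1 at beta = 3, so the root is bracketed.
    """
    def g(beta):
        return (beta + 2) * math.exp(-beta) - 1

    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions, `best_beta()` and `uniform_ratio(best_beta())` agree with the values 1.1462 and 1.4659 stated above to the displayed precision.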
3.3 Delayed Injections
Consider a \(\mathcal{B}\)-uniform strategy. A closer examination (see the proof of Lemma 2) reveals that it decreases the total load by a multiplicative factor in each round. Hence, it is quite effective when buffers are full, but becomes less efficient when the average load drops. To alleviate this problem, consider replacing n steps of the uniform strategy (i.e., the sequence \((\mathcal{B},1) \times n\)) by a single n-step action \((\mathcal{B},n)\). Although such a replacement only further worsens the effectiveness of the adversary, the advantage of having n full buffers (denoted A _{ j }) after the execution of \((\mathcal{B},n)\) outweighs this loss. Concretely, the adversary may execute the uniform strategy (A _{ j },1)×j on the buffers of A _{ j }. As these buffers are initially full, the uniform strategy reduces the load in A _{ j } quite rapidly. The adversarial phase \((\mathcal{B},n), (A_{j},1) \times j\) is repeated for increasing values of j. This balances the influence of \((\mathcal{B},n)\) and (A _{ j },1)×j, assuring that the average load inside A _{ j } after a single phase is roughly the same as the load in \(\mathcal{B}\setminus A_{j}\).
The analysis given in the next section shows that already this simple strategy P incurs a lower bound of 1.5589 on the competitive ratio of any fractional algorithm. How can it be further improved? Let us compare P with the \(\mathcal{B}\)-uniform strategy of the same length. By Lemma 1, the throughput of Opt on both strategies is the same, and therefore the key factor determining the efficiency is the total load of an online algorithm when these strategies end. It can be computed that its total load at that time is at most 0.104m for the strategy P and at most 0.223m for the \(\mathcal{B}\)-uniform strategy.
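The numbers in this comparison fit together as the following sketch shows. It assumes m=n^2 buffers and reads P as n phases, where phase j consists of one action of length n followed by j unit-length actions; the residual loads 0.104m and 0.223m are taken from the text above, not recomputed. The function names are illustrative.

```python
import math


def length_of_P(n):
    """Length of P on m = n*n buffers: phase j = 1..n has one action of
    length n followed by j actions of length 1."""
    return sum(n + j for j in range(1, n + 1))


def ratio_per_buffer(length_per_buffer, residual_per_buffer):
    """(l + m) / (l + c) with l and c expressed as multiples of m."""
    return (length_per_buffer + 1) / (length_per_buffer + residual_per_buffer)
```

Since `length_of_P(n) / n**2` tends to 1.5, the uniform strategy of the same length leaves about e^{−1.5}⋅m≈0.223m by Lemma 2, matching the figure above. Plugging in the residuals gives `ratio_per_buffer(1.5, 0.104)`, close to the stated 1.5589 (the small gap comes from the rounding of 0.104), and `ratio_per_buffer(1.5, math.exp(-1.5))`, roughly 1.45, for the uniform strategy.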
Now, we take a closer look at a single part (A _{ j },1)×j of the strategy P. It is focused entirely on decreasing the load in A _{ j }. (The algorithm might transmit the load from other buffers as well, but as we implicitly show later, this is not beneficial.) As the strategy P is more efficient in decreasing the load than the uniform strategy, we could replace the sequence (A _{ j },1)×j by a P-type strategy of length roughly j operating on the set A _{ j }. Furthermore, we could perform such a replacement recursively! Such top-down replacements are viable, but troublesome to define. Therefore, in our construction, we apply a bottom-up approach: we fix n, start from a strategy for n buffers, and using it as a black box, we build a better strategy for n ^{2} buffers. By applying such a transformation iteratively, in the limit we obtain a desired construction, incurring a lower bound of e/(e−1).
4 Lower Bound Construction
Let n be a parameter of our construction, which will be specified later. Let n _{0}=n and \(n_{q+1} = n_{q}^{2}\) for any integer q≥0. In Definition 1, we have already specified strategies S _{0}(β,A) operating on any set A of n _{0} buffers. Now, we iteratively generalize this definition to any set of n _{ q } buffers.
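Definition 2
For any integer q≥0, any set A of n _{ q+1} buffers, and a real β∈(0,1], the strategy S _{ q+1}(β,A) consists of w=⌈βn _{ q }⌉ phases. For j∈{1,…,w}, the jth phase is the action (A,n _{ q }), which fills a set A _{ j }⊆A of n _{ q } buffers, followed by the strategy S _{ q }(j/n _{ q },A _{ j }).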
We note that when m=n ^{2}, the definitions of the strategy \(S_{1}(1,\mathcal{B})\) and the strategy P from Sect. 3.3 coincide.
We say that the strategy S _{ q }(β,A) is a routine S _{ q }(β) operating on set A. This decouples the actual structure of the adversarial strategy from the sets used. In particular, it is helpful to interpret S _{ q+1}(β) as a (program) routine, which calls subroutines S _{ q }(1/n _{ q }),S _{ q }(2/n _{ q }),…,S _{ q }(⌈βn _{ q }⌉/n _{ q }) (cf. Definition 2). We emphasize that the called subroutines do not depend on the actual choices of the algorithm. (On the other hand, the sets these routines are operating on depend heavily on the algorithm.) In particular, the length of the routine S _{ q }(β) does not depend on the algorithm.
The ultimate goal of the remaining part of this section is to analyze the strategy \(S_{q}(1,\mathcal{B})\) on m=n _{ q } buffers, showing that—for appropriately chosen values of n and q—it takes time (e−1)⋅m+o(m) and reduces the total load of any online algorithm to o(m). This would immediately imply the lower bound of e/(e−1).
4.1 Bounding the Length of S _{ q }(β)
We start with bounding the length of our recursively defined strategies. As noted above, these lengths depend neither on the algorithm nor on the used sets. For any integer q≥0 and real β∈(0,1], we denote the length of the adversarial routine S _{ q }(β) by T _{ q }(β).
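Because the lengths do not depend on the algorithm, T _{ q }(β) can be computed directly from the recursive structure of the routines. The sketch below assumes that S _{0}(β) has length ⌈βn⌉ and that S _{ q+1}(β) consists of ⌈βn _{ q }⌉ phases, the jth of which is an action of length n _{ q } followed by the subroutine S _{ q }(j/n _{ q }). The function names are ours, and exact fractions are used for β to avoid rounding errors inside the ceilings.

```python
from fractions import Fraction
from functools import lru_cache
from math import ceil


def buffers(q, n):
    """n_q, defined by n_0 = n and n_{q+1} = n_q ** 2."""
    return n ** (2 ** q)


@lru_cache(maxsize=None)
def length(q, beta, n):
    """Length T_q(beta) of the routine S_q(beta); beta is a Fraction in (0, 1]."""
    if q == 0:
        return ceil(beta * n)
    nq = buffers(q - 1, n)       # size of the sets used by the subroutines
    phases = ceil(beta * nq)
    return sum(nq + length(q - 1, Fraction(j, nq), n)
               for j in range(1, phases + 1))
```

For n=10 this gives T _{1}(1)=155 and T _{2}(1)=17700, i.e., 1.55⋅n _{1} and 1.77⋅n _{2}. These values can be compared with the target (e−1)⋅n _{ q }≈1.718⋅n _{ q }; for such a small n the rounding in the ceilings makes the agreement only rough.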
We start with the following technical lemma that is used extensively later.
Lemma 3
Lemma 4
Proof
The proof follows by induction on q. For q=0, the right hand side of the inequality is βn+1, and thus the induction basis holds by Definition 1.
4.2 Bounding the Efficiency of S _{ q }(β)
In this section, we measure the efficiency of routines S _{ q }(β) in reducing the load in a set of buffers. Fix any set A of n _{ q } full buffers. We would like to define D _{ q }(β) as the best achievable upper bound on the load in A after executing strategy S _{ q }(β,A). However, such a definition would be useless as the algorithm may decide to transmit only from \(\mathcal{B}\setminus A\), keeping the buffers of A full. Hence, we adopt a formulation that takes such behavior into account.
Definition 3
For a given integer q≥0 and real β∈(0,1], we define the remainder D _{ q }(β) of the routine S _{ q }(β) as the infimum of all values of D for which the following statement holds: “If at time τ there is a subset A of n _{ q } full buffers, and the adversary starts the strategy S _{ q }(β,A), then for any load H transmitted by an algorithm during this strategy from \(\mathcal{B}\setminus A\), either (i) the total load drops below 1 during S _{ q }(β,A), or (ii) the load in A at time (τ+T _{ q }(β))^{+} is at most H+D”.
For example, by Lemma 2, D _{0}(β)≤n⋅e^{−β }+1. Throughout the rest of this section, we denote exp(−1/n _{ q }) by γ. First, we present the recursive relation between remainders.
Lemma 5
Proof
We choose any time τ, any set A of \(n_{q+1} = n_{q}^{2}\) full buffers and any situation at the remaining buffers, and analyze the strategy S _{ q+1}(β,A) starting at time τ. We may assume that during the execution of S _{ q+1}(β,A) the total load never drops below 1; otherwise the lemma follows trivially. By Definition 2, S _{ q+1}(β,A) consists of w phases, where the jth phase is an action (A,n _{ q }) affecting set A _{ j } of n _{ q } buffers, followed by the strategy S _{ q }(j/n _{ q },A _{ j }).

H _{ j }: the load transmitted during action (A,n _{ q }) from the buffers \(\mathcal{B}\setminus A\);

\(H'_{j}\): the load transmitted during S _{ q }(j/n _{ q },A _{ j }) from the buffers \(\mathcal{B}\setminus A\);

\(H''_{j}\): the load transmitted during S _{ q }(j/n _{ q },A _{ j }) from the buffers A∖A _{ j }.
Let t _{0}=τ and let t _{ j } be the ending time of phase j, i.e., phase j lasts between t _{ j−1} and t _{ j }. For j∈{0,…,w}, let C _{ j } denote the load in A at time \(t_{j}^{+}\), i.e., C _{ j } is the load in A at the end of phase j. Clearly, C _{0}=n _{ q+1} as all the buffers of A are full at the beginning. Our goal is to upper-bound the load in buffers from A when the strategy S _{ q+1}(β,A) terminates, i.e., the value of C _{ w }. To this end, we focus on a single phase j and relate C _{ j } to C _{ j−1}.

As A _{ j } were the most loaded n _{ q } buffers from A at time (t _{ j−1}+n _{ q })^{−}, the load in A∖A _{ j } at time (t _{ j−1}+n _{ q })^{+} is at most (C _{ j−1}−n _{ q }+H _{ j })⋅(1−n _{ q }/n _{ q+1}). Recall that \(H''_{j}\) is the load transmitted from the buffers of A∖A _{ j } during the strategy S _{ q }(j/n _{ q },A _{ j }). Thus, the load in A∖A _{ j } at time \(t_{j}^{+}\) is at most \((C_{j-1} - n_{q} + H_{j}) \cdot(1 - n_{q} / n_{q+1}) - H''_{j}\).

All buffers from A _{ j } are full at time (t _{ j−1}+n _{ q })^{+}. In steps (t _{ j−1}+n _{ q },t _{ j }], the adversary executes the strategy S _{ q }(j/n _{ q },A _{ j }). By the inductive assumption, the load in A _{ j } at time \(t_{j}^{+}\) is at most \(D_{q}(j/n_{q}) + (H'_{j}+H''_{j})\).
Lemma 6
Proof
4.3 The Competitive Ratio
The remaining part of our reasoning is to verify the performance of the routine S _{ q }(1) executed on a set of n _{ q } initially full buffers, where n _{0}=n will be chosen to be sufficiently larger than q. As our bounds on the efficiency of S _{ q }(1) hold provided the total load of an algorithm does not drop below 1, we slightly adapt our strategy.
Definition 4
For any integer q≥0, the adversarial strategy \(S'_{q}(A)\) on a set A of n _{ q } buffers is to fill these buffers at the beginning and then run strategy S _{ q }(1,A) on them. However, if the total load of the algorithm drops below 1, the adversary finishes only the current action, and then ends the strategy immediately.
Lemma 7
Fix any integer q≥1 and let n=q ^{4}. Fix any online algorithm Alg. Assume that m≥n _{ q } and choose any subset A of n _{ q } buffers. If the adversary executes strategy \(S'_{q}(A)\), then the Opt-to-Alg throughput ratio is at least e/(e−1)−O(1/q).
Proof
Let c be the total load at time ℓ ^{+}. Assume first that a sequence has ended prematurely, i.e., the total load has dropped below 1 during some action of length a. By the construction of the routine S _{ q }(1), all its actions have length at most n _{ q−1}. Therefore, a≤n _{ q−1} packets are injected at time ℓ, and hence c<1+n _{ q−1}=O(1/n _{ q−1})⋅n _{ q }=O(1/q)⋅n _{ q }.
The throughput of Alg is at most ℓ+c. On the other hand, the strategy \(S'_{q}(A)\) is canonical, and hence by Lemma 1, the throughput of Opt is ℓ+n _{ q }. Therefore, by (7) and (8), the Opt-to-Alg throughput ratio is at least (ℓ+n _{ q })/(ℓ+c)=e/(e−1)−O(1/q). □
Theorem 1
The competitive ratio of any fractional algorithm for the unweighted FIFO variant of the buffer management problem is at least e/(e−1). The bound holds for any value of B and m→∞.
Proof
5 Fractional vs. Randomized Algorithms
The following straightforward reduction shows that the fractional model is easier for algorithms than the randomized one. The lemma below, together with Theorem 1, yields the lower bound of e/(e−1) on the performance of any randomized algorithm.
Lemma 8
For any randomized algorithm Rand for the integral model with competitive ratio R, there exists a deterministic algorithm Frac for the fractional model whose ratio is at most R (for the same buffer size and number of buffers).
Proof
We fix any buffer size B and number of buffers m.
Clearly, the invariant holds at the beginning, when both Rand and Frac have no packets. Then, the proof follows inductively, by considering two possible types of events.
1. A packet is injected to buffer i. If Q _{ Frac }(i)≥B−1 before the injection, then afterwards Q _{ Frac }(i)=B. Otherwise, Q _{ Frac }(i) increases by 1 and E[Q _{ Rand }(i)] increases at most by 1. In both cases, the invariant is preserved.
2. Packets are transmitted during a step. Let \(Q'_{\textsc{Rand}}(i)\) be the number of packets in buffer i after the transmission of Rand. Let \(q_{i} = \mathbf{E}[Q_{\textsc{Rand}}(i) - Q'_{\textsc{Rand}}(i)]\). Clearly, q _{ i }≥0 for all i and \(\sum_{i=1}^{m} q_{i} \leq 1\). To simulate this behavior of Rand and preserve the invariant, Frac transmits load q _{ i } from buffer i. □
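The transmission step of this simulation can be sketched in Python. This is only an illustration under assumptions not made in the paper: the state distribution of Rand is given explicitly as a dictionary, and its behavior as a function from a state to per-buffer transmission probabilities. The names `expected_transmissions` and `uniform_nonempty` are hypothetical.

```python
from fractions import Fraction


def expected_transmissions(dist, policy):
    """Return the list of q_i = E[Q_Rand(i) - Q'_Rand(i)] over all buffers i.

    dist   : dict mapping a state (tuple of buffer loads) to its probability
    policy : function mapping a state to a list of probabilities of
             transmitting from each buffer (illustrative representation)
    """
    m = len(next(iter(dist)))
    q = [Fraction(0)] * m
    for state, prob in dist.items():
        choice = policy(state)
        for i in range(m):
            # Rand can transmit only from a non-empty buffer.
            if state[i] > 0:
                q[i] += prob * choice[i]
    return q


def uniform_nonempty(state):
    """Example policy: transmit from a uniformly chosen non-empty buffer."""
    nonempty = [i for i, load in enumerate(state) if load > 0]
    return [Fraction(1, len(nonempty)) if i in nonempty else Fraction(0)
            for i in range(len(state))]


# Rand is in one of two states, each with probability 1/2.
dist = {(1, 2, 0): Fraction(1, 2), (0, 2, 2): Fraction(1, 2)}
q = expected_transmissions(dist, uniform_nonempty)
# Frac would now transmit load q[i] from buffer i; the total is at most 1.
```

In this example, Frac would transmit loads 1/4, 1/2, and 1/4 from the three buffers, which sum to exactly one unit per step.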
Corollary 1
The competitive ratio of any randomized algorithm for the unweighted FIFO variant of the buffer management problem is at least e/(e−1). The bound holds for any value of B and m→∞.
By means of Lemma 8, lower bounds for the fractional model apply to randomized algorithms. Below, we show that for the upper bounds it is not necessarily the case, i.e., it is not possible to perform online randomized rounding of a fractional solution (for m≥3) that preserves the throughput. This may be an indication that the randomized and the fractional model are not equivalent in terms of the competitive ratio.
Theorem 2
For m≥3, online randomized rounding of a fractional solution that preserves the throughput of the algorithm is not feasible.
Proof
We show how to create an input sequence and a fractional solution Frac, such that any randomized algorithm Rand that tries to simulate Frac has smaller throughput. We assume that m=3. (If m>3, nothing is ever injected to the additional buffers, and hence both the fractional and the randomized algorithm keep them empty.)
As in the proof of Lemma 8, for an algorithm A, we denote the number of packets in its ith buffer by Q _{ A }(i). We say that Rand exactly simulates Frac at time t if for any buffer i it holds that E[Q _{ Rand }(i)]=Q _{ Frac }(i). We first argue that this condition is necessary for preserving the throughput.
Precisely speaking, assume that at time t ^{+}, Rand exactly simulates Frac and their throughput has been equal so far. Assume that within time interval (t,t+k), no packet is injected and Frac transmits load k. If Rand does not exactly simulate Frac at time (t+k)^{−}, then its total throughput can be made lower than that of Frac. To show this claim, we consider two cases.
1. If the expected load transmitted by Rand in steps (t,t+k) is smaller than k, then the adversary may inject B packets to each of the buffers at time t+k and end the input sequence afterwards. Hence, the throughput of Rand in steps (t,t+k) is smaller than that of Frac and the throughput in the remaining steps is equal to m⋅B for both algorithms.
2. If the expected load transmitted by Rand in steps (t,t+k) is equal to k, then \(\sum_{i=1}^{m} \mathbf{E}[Q_{\textsc{Rand}}(i)] = \sum_{i=1}^{m} Q_{\textsc{Frac}}(i)\). As the exact simulation condition is violated, there exists a buffer i for which E[Q _{ Rand }(i)]<Q _{ Frac }(i). In this case, at time t+k, the adversary injects B packets to all buffers but the ith one and ends the input sequence. The throughput of Frac in the remaining steps is then (m−1)⋅B+Q _{ Frac }(i) while the expected throughput of Rand is strictly smaller, i.e., (m−1)⋅B+E[Q _{ Rand }(i)].
It remains to construct an input sequence and a behavior of Frac that cannot be simulated exactly by Rand. We identify the state of an algorithm with the state of its buffers. In these terms, the state of Frac at a given time t is a triplet denoted L(t). At time 0, the adversary fills all three buffers with packets, which means that L(0^{+})=(B,B,B). Clearly, the state of Rand is the same.
In step (0,1), Frac transmits a half of the packet from the first and the second buffer, i.e., L(1^{−})=(B−1/2,B−1/2,B). To assure an exact simulation, Rand has to transmit with probability 1/2 from the first buffer and with probability 1/2 from the second one, ending in state (B−1,B,B) with probability 1/2 and in state (B,B−1,B) with probability 1/2. We denote this distribution over possible states by {(B−1,B,B)^{1/2},(B,B−1,B)^{1/2}} for short. At time 1, the adversary injects a packet to the second buffer, i.e., L(1^{+})=(B−1/2,B,B). The probability distribution of Rand simply changes to {(B−1,B,B)^{1/2},(B,B,B)^{1/2}}.
In step (1,2), Frac transmits a half of the packet from the second and the third buffer, i.e., L(2^{−})=(B−1/2,B−1/2,B−1/2). How can Rand assure an exact simulation? Let p _{ i } be the probability of transmitting from buffer i conditioned on Rand being in state (B−1,B,B). Similarly, let q _{ i } be the probability of transmitting from buffer i, conditioned on Rand being in state (B,B,B). Clearly, p _{1}+p _{2}+p _{3}=1 and q _{1}+q _{2}+q _{3}=1. Furthermore, p _{1}=q _{1}=0, as the expected load in the first buffer is B−1/2 already at time 1^{+}. As the expected load in the two remaining buffers has to be also B−1/2, it follows that 1/2⋅(1−p _{ i })+1/2⋅(1−q _{ i })=1/2 for i∈{2,3}. Hence, p _{2}+q _{2}=1 and p _{3}+q _{3}=1. For any fixed p _{2}, at step 2^{−}, the probability distribution of Rand is then \(\{(B-1,B-1,B)^{p_{2}/2},(B-1,B,B-1)^{(1-p_{2})/2}, (B,B-1,B)^{(1-p_{2})/2}, (B,B,B-1)^{p_{2}/2}\}\). The adversary does not inject a packet at step 2, and thus this is also the probability distribution of Rand at step 2^{+}.
Now we consider two cases. If p _{2}≥1/2, then Frac empties its first and its second buffer in steps (2,2B+1), so that L((2B+1)^{−})=(0,0,B−1/2). For the exact simulation Rand has to transmit 2B−1 packets in these steps and is not allowed to transmit anything from the third buffer, independently of its actual state at time 2^{+}. However, with probability p _{2}/2≥1/4, Rand is at state (B−1,B−1,B) at time 2^{+}, which renders the exact simulation impossible. Similarly, for the case p _{2}<1/2, Frac empties the first and the third buffer in steps (2,2B+1), and Rand cannot simulate this exactly as at time 2^{+} it is in state (B−1,B,B−1) with probability (1−p _{2})/2>1/4. □
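The distribution of Rand at time 2^{−} used in the proof above can be checked mechanically. The sketch below is an illustration only; the function names and the dictionary representation of the distribution are assumptions, not part of the paper. It applies the transmission probabilities forced by exact simulation (p _{1}=q _{1}=0, p _{3}=1−p _{2}, q _{2}=1−p _{2}, q _{3}=p _{2}) to the distribution {(B−1,B,B)^{1/2},(B,B,B)^{1/2}}.

```python
from fractions import Fraction


def distribution_after_step_two(B, p2):
    """Distribution of Rand over states at time 2^-, for a given p2.

    Buffers are indexed from 0, so index 1 is the second buffer and
    index 2 is the third one.
    """
    half = Fraction(1, 2)
    start = {
        # state: (probability, {buffer index: transmission probability})
        (B - 1, B, B): (half, {1: p2, 2: 1 - p2}),   # p_2 and p_3 = 1 - p_2
        (B, B, B): (half, {1: 1 - p2, 2: p2}),       # q_2 = 1 - p_2, q_3 = p_2
    }
    dist = {}
    for state, (prob, transmit) in start.items():
        for buf, t in transmit.items():
            new_state = list(state)
            new_state[buf] -= 1
            key = tuple(new_state)
            dist[key] = dist.get(key, Fraction(0)) + prob * t
    return dist


def expected_loads(dist):
    """Expected number of packets in each of the three buffers."""
    return [sum(prob * state[i] for state, prob in dist.items())
            for i in range(3)]


B, p2 = 4, Fraction(2, 3)
dist = distribution_after_step_two(B, p2)
loads = expected_loads(dist)
```

For these sample values, every expected load equals B−1/2, so the simulation is exact at time 2^{−}, yet the state (B−1,B−1,B) has probability p _{2}/2=1/3. In that state only 2B−2 packets remain in the first two buffers, fewer than the 2B−1 that Rand would have to transmit from them.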
6 Flaw in the Analysis of the Random Permutation Algorithm
The lower bound presented in this paper contradicts the claimed performance of the Random Permutation algorithm. In this section, we demonstrate an error in its original analysis [30]. Random Permutation is defined for B=1 and any number m of buffers. It chooses a permutation of the buffers uniformly at random. In any time step, it transmits a packet from the populated buffer which is most to the front in the permutation.
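The description of Random Permutation above can be turned into a short simulation. This is a minimal sketch under stated assumptions, not the implementation analyzed in [30]: packets are injected at integer times, one packet is transmitted in each step (t,t+1), and a packet injected to an occupied buffer is lost. The function name, the `injections` dictionary, and the optional `order` argument (used to fix the permutation for reproducibility) are hypothetical.

```python
import random


def random_permutation_schedule(m, injections, steps, order=None, rng=None):
    """Simulate Random Permutation for B = 1 on m buffers.

    injections : dict mapping a time t to the list of buffers that
                 receive a packet at time t
    steps      : last time step to simulate (times 0..steps inclusive)
    order      : optional fixed permutation; drawn uniformly if omitted
    Returns the pair (transmitted, lost).
    """
    if order is None:
        rng = rng or random.Random()
        order = list(range(m))
        rng.shuffle(order)
    occupied = [False] * m
    transmitted = lost = 0
    for t in range(steps + 1):
        # Injections at time t: a full buffer (B = 1) drops the packet.
        for b in injections.get(t, []):
            if occupied[b]:
                lost += 1
            else:
                occupied[b] = True
        # Step (t, t+1): transmit from the populated buffer that is
        # closest to the front of the permutation.
        for b in order:
            if occupied[b]:
                occupied[b] = False
                transmitted += 1
                break
    return transmitted, lost


injections = {0: [0, 1, 2], 1: [0]}
lucky = random_permutation_schedule(3, injections, 3, order=[0, 1, 2])
unlucky = random_permutation_schedule(3, injections, 3, order=[1, 0, 2])
```

With the permutation [0, 1, 2], buffer 0 is emptied before the second injection and no packet is lost; with [1, 0, 2], buffer 0 is still full at time 1 and the injected packet is dropped.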
The proof of this lemma considers first an adversarial strategy, where all m initializing packets are injected at time 0 and then for 1≤i≤m, the ith non-initializing packet is injected at time i to buffer i. We call such a strategy systematic. For systematic strategies, a recursive formula for p _{ i } shows that indeed p _{ i }=v _{ i } [30].
However, in the remaining part of the proof of [30], it is (informally) argued that for other adversarial strategies the values of p _{ i } can be only higher, i.e., that p _{ i }≥v _{ i } for 1≤i≤m. This statement can be falsified immediately by the observation on delayed injections given in Sect. 1.4: it is possible to make a single value of p _{ i } smaller than 1/2, whereas v _{ m } tends to 1−1/e≈0.632 when m grows.
The subsequent parts of the proof of the Random Permutation competitiveness use only the weaker relation \(\sum_{i=1}^{m} p_{i} \geq \sum_{i=1}^{m} v_{i}\). However, the delayed injections can be employed to show that even such a relation is false.
Lemma 9
There exists an integer m and a sequence of adversarial injections of m initializing and m non-initializing packets leading to \(\sum_{i=1}^{m} p_{i} < \sum_{i=1}^{m} v_{i}\).
Proof
All initializing packets are injected at time 0. For 1≤i≤m−3, the ith non-initializing packet is injected to buffer i at time i. At time m−2, no packets are injected. At time m−1, the adversary chooses two buffers, denoted b _{1} and b _{2}, with the maximum expected number of packets, and injects a packet to both these buffers. At time m, the adversary injects a packet either to b _{1} or to b _{2}, choosing the buffer maximizing the expected number of packets.
Below, we show that for m=16, it holds that \(\sum_{i=1}^{m} p_{i} < \sum_{i=1}^{m} v_{i}\). Note that up to step m−3, the injection pattern is systematic, and thus \(\sum_{i=1}^{m-3} p_{i} = \sum_{i=1}^{m-3} v_{i}\). It is therefore sufficient to show that \(\sum_{i=m-2}^{m} p_{i} < \sum_{i=m-2}^{m} v_{i}\).
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.