An Optimal Lower Bound for Buffer Management in Multi-Queue Switches

In the online packet buffering problem (also known as the unweighted FIFO variant of buffer management), we focus on a single network packet switching device with several input ports and one output port. This device forwards unit-size, unit-value packets from input ports to the output port. Buffers attached to input ports may accumulate incoming packets for later transmission; if they cannot accommodate all incoming packets, their excess is lost. A packet buffering algorithm has to choose from which buffers to transmit packets in order to minimize the number of lost packets and thus maximize the throughput. We present a tight lower bound of e/(e-1) ~ 1.582 on the competitive ratio of the throughput maximization, which holds even for fractional or randomized algorithms. This improves the previously best known lower bound of 1.4659 and matches the performance of the algorithm Random Schedule. Our result contradicts the claimed performance of the algorithm Random Permutation; we point out a flaw in its original analysis.


Introduction
We study the unweighted FIFO variant of the buffer management problem introduced by Kesselman et al. [KLM+04]. In this problem, also called the packet buffering problem [AS05], we focus on a single network packet switching device which has m input ports and one output port. The goal of such a device is to forward packets from its input ports to the output port. The burstiness of the incoming traffic motivates the use of buffers which can accumulate incoming packets and store them for later transmission. We assume that all packets are of unit size and that each input port has an attached buffer able to store up to B packets. We consider the unweighted case, in which all packets are equally important. Time is slotted in the following way. At any (integer) time t ≥ 0, any number of unit-sized packets may arrive at the input ports and they are appended to the appropriate buffers. If a buffer cannot accommodate all the packets, the excess is lost. Then, during the time step corresponding to time interval (t, t + 1), the device can transmit a single packet from a single buffer.
The key difficulty of this problem is that it is inherently online: the buffer managing algorithm does not know where packets will be injected in the (nearest) future. The goal is to minimize the number of lost packets; the only decision made by the algorithm at time t is choosing the buffer from which a packet will be transmitted in step (t, t + 1).
In our setting, no information about the future is available to the algorithm, i.e., the decision of an online algorithm at time t has to be made solely on the basis of the input sequence up to time t. In particular, we make no probabilistic assumptions about the input, and assume it is created in a worst-possible manner by an adversary. For analyzing the efficiency of online algorithms, we use competitive analysis [BE98], and, on any input sequence, compare the throughput (the number of packets transmitted) of an algorithm with that of the optimal offline schedule. The supremum (taken over all inputs) of the ratios between these two values is called the competitive ratio of the algorithm and is subject to minimization. An algorithm is called R-competitive if its competitive ratio is at most R. In the case of randomized online algorithms, we replace the throughput in the definition above by its expected value.
The queuing model considered in this paper is typical for input-queued switches or routers, which are the dominant packet switching architecture of the Internet. Moreover, despite the popularity of theoretical research on Quality-of-Service solutions (cf. Section 1.2), most current networks (most notably those using the IP protocol) provide only "best effort" services, where all packets are treated as equally important and do not have to respect any deadlines. The unweighted variant considered here is therefore typical for such networks. Minimizing the packet loss is an important issue, as subsequent packet retransmissions by standard transport-layer protocols like TCP are quite costly and may lead to performance degradation. Finally, there is observable evidence that network traffic exhibits so-called self-similar properties [LTWW94], which renders analyses based on stochastic queuing theory inapplicable. Instead, we apply worst-case competitive analysis in place of stochastic assumptions.
1.1 Previous Work. The contributions on this topic fall into three main categories: results for deterministic algorithms, for randomized algorithms, and for (deterministic) algorithms in the fractional model. In the fractional model (described in detail in Section 2.2), the algorithm may transmit arbitrary fractions of packets, as long as the total load transmitted in a single step is at most 1. It is a simple observation that any randomized algorithm can be transformed into a deterministic fractional one without increasing its competitive ratio, i.e., the fractional model is "easier" for algorithms and "harder" for the adversary. A relaxed relation in the opposite direction also exists: Azar and Litichevskey showed how to transform a c-competitive fractional solution into a deterministic $c \cdot (1 + \lfloor H_m + 1 \rfloor / B)$-competitive one [AL06].
First, we describe results on randomized and fractional solutions. The currently best lower bound of 1.4659 (holding for arbitrary buffer sizes B and large m) on the competitive ratio of any randomized algorithm is due to Albers and Schmidt [AS05]. In Section 2.3, we highlight the details of their approach and show that in fact it also works for the fractional scenario. Azar and Richter presented the randomized algorithm Random Schedule [AR05], whose competitive ratio is e/(e − 1) + o(1) ≈ 1.582. Azar and Litichevskey [AL06] showed how to encode the fractional variant of the problem as an instance of online fractional matching in a bipartite graph. By constructing an e/(e − 1)-competitive solution (based on a natural "water level" approach) to the latter problem, they obtained a fractional e/(e − 1)-competitive algorithm. In our paper, we refer to it as Frac-Waterlevel. Finally, this bound was claimed to be improved by the Random Permutation algorithm [Sch05] with competitive ratio 1.5 for all values of B and m. These competitive ratios were improved for the particular case of two buffers and any B, where an algorithm attaining the optimal competitive ratio of 16/13 ≈ 1.231 was given by Bienkowski and Mądry [BM08].
As for deterministic algorithms, the general upper bound holding for all values of B and m was given by Azar and Richter [AR05]. They proved that any work-conserving algorithm (i.e., one that serves a non-empty queue whenever there is one) is 2-competitive. They also showed a lower bound of 1.366 − Θ(1/m) holding for any B. Albers and Schmidt [AS05] improved that bound to e/(e − 1) for m ≫ B. They also presented the algorithm Semi-Greedy, which is 1.889-competitive for large B. By applying the fractional-to-deterministic reduction described above to the algorithm Frac-Waterlevel, Azar and Litichevskey obtained a deterministic $e/(e-1) \cdot (1 + \lfloor H_m + 1 \rfloor / B)$-competitive algorithm (which we will call Det-Waterlevel) [AL06]. Again, the results can be improved for particular cases: when B = 2, the Semi-Greedy algorithm achieves an optimal competitiveness of 13/7 ≈ 1.857 [AS05]; when m = 2, the optimal competitiveness of 16/13 ≈ 1.231 was achieved by the Segmental Greedy algorithm of Kobayashi et al. [KMO08].
Most of the algorithms described above were evaluated experimentally in the paper of Albers and Jacobs [AJ10].

Related Work.
The throughput maximization problem considered in its unweighted version in our paper was also studied in a more general context, where packets may have different weights and the goal is to maximize the total transmitted weight. Such an extension tries to capture differences in importance between various data streams. Upon packet injection, the buffer managing algorithm decides whether a packet should be accepted to the buffer or dropped immediately. Afterwards, accepted packets have to be transmitted in FIFO order. Such a problem is nontrivial even for the task of managing a single input buffer [AMRR05, AM03, AMZ03, Zhu04]. There is also a preemptive variant where packets may be dropped from the queue [And05, AMZ03, EW09, KLM+04, KM03, KMvS05, LP03, MPL04, Zhu04].
Another variant of the buffer management problem is the so-called bounded-delay scenario, where the buffer has no fixed capacity and no FIFO order is imposed. Instead, each packet specifies a deadline and it must be either transmitted before the deadline or dropped [AMZ03, CCF+06, CF03, EW07, Haj01, Jeż10, KLM+04, KMvS05, LSS05, LSS05].
For a comprehensive description of these and related models, we refer interested readers to the recent survey by Goldwasser [Gol10].

Our Contribution.
In this paper, we present a construction which shows that no fractional algorithm may achieve a competitive ratio lower than e/(e − 1) ≈ 1.582 for any value of B and for large m. Our result has a few implications:
• The result is optimal up to lower-order terms; it matches the performance of the fractional algorithm Frac-Waterlevel of [AL06]. It also gives evidence that the reduction of Azar and Litichevskey from fractional packet buffering to online fractional bipartite matching was essentially tight. Even though fractional matching is more general, the competitive ratios achievable for both problems are the same.
• The same lower bound holds for randomized algorithms. Thus, it improves the currently best lower bound of 1.4659 (also holding for any B and large m) [AS05]. It is also optimal up to lower-order terms for randomized algorithms, as it matches the performance of the Random Schedule algorithm [AR05].
• The lower bound contradicts the claimed competitive ratio of 1.5 of the Random Permutation algorithm [Sch05]. In the appendix, we pinpoint the subtle, yet fundamental flaw in the original analysis of this algorithm. Roughly speaking, the main source of the flaw was neglecting certain types of adversarial strategies; these strategies are actually employed in our lower bound, cf. Section 1.4.
• The lower bound of e/(e − 1) for deterministic algorithms given by Albers and Schmidt [AS05] can be applied only when m ≫ B. Our construction yields the same ratio, but requires only that m is large; B may be arbitrary, e.g., even larger than m. In particular, in contrast to their lower bound construction, ours shows that the deterministic algorithm Det-Waterlevel [AL06] achieves an asymptotically optimal competitive ratio when both m is large and B ≫ log m.
1.4 Used Techniques. While the formal construction of the lower bound is given in Section 2 and Section 3, here we informally describe three key ingredients. First, our adversarial strategy creates a packet injection sequence which at time 0 completely fills all the buffers and can be served losslessly by the optimal algorithm. Moreover, after each injection, all buffers of the optimal algorithm are full. These assumptions are rather standard and we list them only for completeness.
Second, we observe that in some cases delaying packet injections and injecting multiple packets at once incurs a greater packet loss for the algorithm. To give a specific example for a randomized algorithm, assume that the buffer size B is 1 and at time t, the expected number of packets at each buffer is quite low. Then, injecting a packet at times t, t + 1, and t + 2 is more benign to the algorithm than not injecting a packet at time t, injecting two packets (to two different buffers, $b_i$ and $b_j$) at time t + 1 and injecting a packet at time t + 2. At first glance, the former approach tries to incur a loss of the algorithm more aggressively, while the latter gives the algorithm more time to prepare. However, the latter approach has a major advantage for the adversary: at time t + 1, both buffers $b_i$ and $b_j$ become full. This knowledge can be exploited by the adversary, which may inject a packet at time t + 2 to one of these two buffers. In expectation, the loss caused by this injection is then at least 1/2. Although the observation presented above may seem simple, it was not exploited by previous lower bounds. Moreover, neglecting these types of effects is the main source of the flawed analysis of [Sch05].
Third, our construction is an iterative approach which, given an adversarial strategy S for n initially full buffers reducing the average load of the buffers to a specific level, constructs a strategy S′ for $n^2$ buffers, which uses the former strategy as a black box. The strategy S′ uses only slightly more steps (in comparison to the number of buffers) and reduces the average load to an even smaller amount than S does. By applying the construction iteratively, in the limit we obtain a strategy for M buffers which reduces the average load of the buffers to o(1) in time M · (e − 1) + o(M). Hence, neglecting the lower-order terms, the throughput of the algorithm on such an injection pattern is M · (e − 1). On the other hand, right after the last injection, the buffers of the optimal solution are still full, and thus its throughput is M · (e − 1) + M = M · e. This implies an asymptotic lower bound of e/(e − 1).

The Basics
2.1 Preliminaries. Throughout the paper, m denotes the number of buffers and B the size of a single buffer. We denote the set of all buffers by B. At any integer time t ≥ 0, the adversary may inject an arbitrary number of packets to arbitrary buffers. It may also choose not to inject anything. If a buffer cannot accommodate all the packets, the excess is lost. We assume that packet injection is instant and therefore we distinguish between the state of buffers at $t^-$ (right before the packet injection at time t) and $t^+$ (right after the injection).
Then, during the time step corresponding to time interval (t, t + 1), the algorithm may transmit a single packet from the buffer of its choice. We assume that after the last packet injection, the algorithm has sufficient time to transmit all the packets it still has in its buffers.
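For illustration, the slotted model above can be sketched in a few lines of Python. This is a minimal sketch of the integral model only; the names `simulate` and `longest_queue` are hypothetical and do not appear in the cited works, and the example policy is just one possible work-conserving choice.

```python
from typing import Callable, List, Sequence, Tuple


def simulate(m: int, B: int,
             arrivals: Sequence[Sequence[int]],
             choose: Callable[[List[int]], int]) -> Tuple[int, int]:
    """Run an integral buffering policy; return (throughput, lost packets).

    arrivals[t] lists the buffers that receive one packet each at time t
    (a buffer may appear several times). `choose` maps the current queue
    lengths to the index of a non-empty buffer to serve.
    """
    queues = [0] * m
    sent = lost = 0
    for batch in arrivals:
        for b in batch:                  # injection at integer time t
            if queues[b] < B:
                queues[b] += 1
            else:
                lost += 1                # the excess is lost
        if any(queues):                  # step (t, t + 1): one transmission
            queues[choose(queues)] -= 1
            sent += 1
    sent += sum(queues)                  # leftovers are transmitted afterwards
    return sent, lost


def longest_queue(queues: List[int]) -> int:
    """Hypothetical example policy: serve a fullest buffer (lowest index on ties)."""
    return max(range(len(queues)), key=queues.__getitem__)
```

With m = 2 and B = 1, calling `simulate(2, 1, [[0, 1], [1]], longest_queue)` models an adversary that refills a buffer the policy did not serve, so one packet is lost.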
For any algorithm A and any sequence σ of packets' arrivals, we denote the throughput of A on σ by $T_A(\sigma)$.
When A is randomized, $T_A(\sigma)$ denotes its expected throughput. For any algorithm Alg, its competitive ratio is defined as $\sup_\sigma \{T_{\mathrm{Opt}}(\sigma)/T_{\mathrm{Alg}}(\sigma)\}$, where the supremum is taken over all possible inputs and Opt is the optimal offline algorithm. For randomized algorithms, we assume oblivious adversaries [BE98], i.e., the ones that do not have access to the random bits used by the algorithm.

Fractional Model.
It is usually more natural to think about lower bounds for deterministic algorithms. Therefore, in this paper, we consider a fractional variant of the problem, in which a deterministic algorithm may transmit fractional amounts of packets. Note that we change solely the capabilities of the online algorithm, which means that: (i) the injected packets are still integral and can be injected at integer times only, (ii) an optimal offline solution to which our solution is compared is chosen among integral solutions. We call the amount of packets in a buffer the load of this buffer. For a subset of buffers, the load of these buffers is the sum of the respective buffers' loads, and the total load is the load of all buffers. Further, in each step the load transmitted from the buffers is at most 1. The following reduction is an easy observation and it is shown in the appendix for completeness.
Lemma 2.1. For any randomized algorithm Alg for the integral model with competitive ratio R, there exists a deterministic algorithm Alg′ for the fractional model whose ratio is at most R (for the same buffer size and number of buffers).
By Lemma 5.1 of [AL06], without loss of generality, we may assume that an algorithm is work-conserving, i.e., in each step it transmits a load of 1 if it has it, and its total load otherwise. We silently make this assumption in our proofs.

Canonical Strategies.
By Lemma 2.1, we only need to show a lower bound for deterministic algorithms in the fractional model. In our construction, for simplicity of the description, we assume that B = 1. We note that the presented approach is valid with virtually no changes for other values of B: we only have to replace an injection of a single packet by an injection of B packets to the same buffer and replace each step by B consecutive steps.
An elementary building block of the constructed input sequence is an action. It is parameterized by a subset of buffers A and a positive integer a called length. When we say that the adversary executes action (A, a) at time t, we mean that nothing is injected during time period (t, t + a). Then, at time t + a, the adversary chooses a buffers with the maximal load among buffers from A and injects a packet to each of these buffers. Ties are broken arbitrarily. We call set A active in time period (t, t + a] or active for this action, and we say that such an action operates on set A. The considered action starts at time t and ends at time t + a.
A brief characterization of the inputs we construct is as follows. First, at time 0, the adversary injects a packet to each of m buffers; this is called the initial injection. Second, at time 0, it executes a canonical (adversarial) strategy. Such a strategy is a sequence of actions, where for each consecutive pair of actions, the latter starts exactly when the former ends. (For technical reasons, which become clear when we describe our recursive construction, the initial injection is not part of the canonical strategy.) Further, we define the length of a canonical strategy as the sum of its actions' lengths; note that the length of a strategy starting at time 0 coincides with the last time at which packets are injected. We say that something happens during the strategy of length ℓ executed at τ if it happens within time interval (τ, τ + ℓ].
In our construction, we consider canonical strategies only. Canonical strategies are neat to analyze, because the optimal algorithm can serve the corresponding inputs losslessly.
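The definition of an action can be made concrete with a small sketch for B = 1 in the fractional model. The helper names `execute_action` and `drain_fullest` are hypothetical, and the greedy policy is only an example of a work-conserving algorithm, not one analyzed in the text. Injecting a packet into a buffer with load x fills it and loses the excess x.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Loads = Dict[str, float]


def drain_fullest(loads: Loads) -> None:
    """Hypothetical work-conserving policy: transmit one unit of load,
    taking it from the fullest buffers first."""
    budget = 1.0
    for b in sorted(loads, key=loads.get, reverse=True):
        take = min(budget, loads[b])
        loads[b] -= take
        budget -= take
        if budget <= 0.0:
            break


def execute_action(loads: Loads, active: Iterable[str], length: int,
                   policy: Callable[[Loads], None]) -> Tuple[List[str], float]:
    """Execute action (active, length) for B = 1.

    The policy transmits for `length` steps with no injections; then the
    adversary refills the `length` fullest buffers of the active set.
    Returns the chosen buffers and the load lost by the algorithm.
    """
    for _ in range(length):
        policy(loads)
    targets = sorted(active, key=lambda b: loads[b], reverse=True)[:length]
    lost = 0.0
    for b in targets:
        lost += loads[b]      # a packet injected into load x loses x (B = 1)
        loads[b] = 1.0
    return targets, lost
```

For example, starting from four full buffers and executing an action of length 2 on all of them against `drain_fullest`, the adversary refills the two buffers that were not served, so both injected packets are lost.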
Lemma 2.2.For a canonical adversarial strategy of length ℓ preceded by the initial injection, the throughput of the optimal offline algorithm is m + ℓ.
Proof. Let $(A_1, a_1), (A_2, a_2), \ldots, (A_k, a_k)$ be the canonical strategy. For any $1 \le j \le k$, during the $a_j$ time steps of action $j$, Opt transmits packets from those $a_j$ buffers to which a packet is injected at time $\sum_{i=1}^{j} a_i$ (at the end of action $j$). In effect, during the first $\sum_{i=1}^{k} a_i = \ell$ steps, Opt transmits a packet in each step and does not lose any. Finally, at time $\ell^+$, the buffers of Opt are still full and it may transmit all m packets afterwards. ⊓⊔
By the lemma above, we may use a canonical strategy to show a lower bound on any algorithm in the following way. We show that after a given strategy S of length ℓ is executed against any algorithm Alg, the total load of Alg at time $\ell^+$ is at most c. Then, during the first ℓ steps, the load transmitted by Alg is at most ℓ, and afterwards it is at most c. On the other hand, by Lemma 2.2, the throughput of Opt is ℓ + m. Hence, S implies that the competitive ratio of any algorithm is at least (ℓ + m)/(ℓ + c). Therefore, our goal is to construct an adversarial canonical strategy which reduces the total load of any algorithm as quickly as possible.
In our construction, for a given number of buffers m, we give a strategy where the lengths of the corresponding actions are fixed (and known to the algorithm). The choice of active sets will be algorithm-dependent; we describe it in detail in the next sections.

Uniform Strategies.
The most straightforward canonical strategies, used previously in the literature, are ones of the form (A, 1), (A, 1), ..., (A, 1), where A is a fixed set of buffers. In other words, when executing such a strategy, the adversary injects a packet at the end of each step into the most populated buffer from set A. We call such strategies A-uniform.
For any action (A, a), we use (A, a) × k as a shorthand for the sequence consisting of k consecutive actions (A, a). The behavior of algorithms on B-uniform strategies (recall that B is the set of all buffers) was investigated previously by Albers and Schmidt [AS05]. Here, we show a slightly more general result, which becomes useful later: we consider A-uniform strategies for arbitrary sets A.
Definition 2.1. For any subset A consisting of n buffers and a real β > 0, the strategy $S^\beta_0(A)$ is the canonical uniform strategy $(A, 1) \times \lceil \beta n \rceil$.
Lemma 2.3. Fix a real β > 0, any subset of n buffers A, and let ℓ be the length of the strategy $S^\beta_0(A)$. Assume that at time τ, all buffers of A are full and the adversary starts to execute the strategy $S^\beta_0(A)$. Let E be the load transmitted during $S^\beta_0(A)$ from buffers not in A. Then the load in A at time $(\tau + \ell)^+$ is at most $E + n \cdot e^{-\beta} + 1$.
Proof. To simplify the notation, we assume that τ = 0. For any integer t ≥ 0, let $C_t$ be the load of A at time $t^+$, and for t ≥ 1, let $a_t \in [0, 1]$ be the load transmitted from A in step (t − 1, t).
Since at each time a packet is injected the total load is at least 1, the algorithm is able to transmit the load of 1 in each step. Therefore, $E = \sum_{t=1}^{\ell} (1 - a_t)$. We establish a recursive relation for $C_t$. Clearly, $C_0 = n$. Fix any t ∈ {1, ..., ℓ}. At time $t^-$, the load in the most populated buffer of A is at least $(C_{t-1} - a_t)/n$. When the adversary injects a packet to this buffer at time t, the load in A increases by at most $1 - (C_{t-1} - a_t)/n$, and hence

$$C_t \le (C_{t-1} - a_t)\,(1 - 1/n) + 1 = \bigl(C_{t-1} + (1 - a_t)\bigr)\,(1 - 1/n) + 1/n.$$

Unrolling this relation, and using $1 - 1/n \le 1$ and $\sum_{i \ge 0} (1 - 1/n)^i = n$, we obtain

$$C_\ell \le n\,(1 - 1/n)^{\ell} + \sum_{t=1}^{\ell} (1 - a_t) + 1 \le n \cdot e^{-\ell/n} + E + 1.$$

By ℓ ≥ βn, the lemma follows. ⊓⊔
We remark that the previously best lower bound, due to Albers and Schmidt [AS05], was obtained using the strategy $S^\beta_0(B)$ with a numerically optimized β and large m. The best choice is β ≈ 1.1462, which for m → ∞ causes any algorithm to have ratio at least 1.4659.
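The numerical optimization mentioned in the remark above can be reproduced with a short sketch. It assumes that lower-order terms are dropped: with ℓ = βm and residual load c = m · e^{−β} (Lemma 2.3 with E = 0), the ratio (ℓ + m)/(ℓ + c) becomes (1 + β)/(β + e^{−β}), and setting its derivative to zero gives the condition e^β = β + 2. The function names are illustrative.

```python
import math


def uniform_ratio(beta: float) -> float:
    """Limit for m -> infinity of (l + m)/(l + c) with l = beta*m, c = m*exp(-beta)."""
    return (1.0 + beta) / (beta + math.exp(-beta))


def best_beta(lo: float = 0.5, hi: float = 2.0, iters: int = 60) -> float:
    """Bisection on the stationarity condition exp(beta) = beta + 2."""
    def g(b: float) -> float:
        return math.exp(b) - b - 2.0     # increasing for b > 0, g(lo) < 0 < g(hi)

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


beta = best_beta()
print(f"beta = {beta:.4f}, ratio = {uniform_ratio(beta):.4f}")
```

Under these assumptions the computed values agree with the constants β ≈ 1.1462 and 1.4659 quoted above.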

Lower Bound Construction
As mentioned in the introduction, the crucial observation behind our construction is that sometimes it is beneficial to delay injections first and then to inject more than one packet at once. To give a specific example, assume that $m = n^2$ and consider a strategy

$$P = (B, n),\ (A_1, 1) \times 1,\ \ (B, n),\ (A_2, 1) \times 2,\ \ \ldots,\ \ (B, n),\ (A_n, 1) \times n.$$

We have to further define active sets $A_j$ of particular actions. The sequence P can be divided into n phases, the j-th one corresponding to the subsequence $(B, n), (A_j, 1) \times j$. $A_j$ is the set of buffers to which packets are added at the end of action $(B, n)$ of the j-th phase.
Observe that the length of P is roughly $1.5 n^2 = 1.5 m$. Using the bounds given later in our proofs, one can show that already this simple strategy incurs a lower bound of 1.5589 on the competitive ratio of any algorithm. We may compare P with the B-uniform strategy (B, 1) × ℓ of the same length. By Lemma 2.2, the throughput of Opt on both strategies is the same, and therefore the key factor determining the efficiency is the total load at time $\ell^+$. It can be computed that the total load at that time is at most 0.104 m for the strategy P and at most 0.223 m for the B-uniform strategy. The approach illustrated by the strategy P is the main building block of our construction.
Now, we concentrate on a single part $(A_j, 1) \times j$ of the strategy P. We observe that it is focused entirely on decreasing the load in $A_j$. (The algorithm might transmit the load from other buffers as well, but as we implicitly show later, it is rather a bad choice.) As the strategy P is more efficient in decreasing the load than the uniform strategy, why not replace the sequence $(A_j, 1) \times j$ by a P-type strategy of length roughly j operating on the set $A_j$? Furthermore, why not perform such a replacement recursively? Such top-down replacements are viable, but troublesome to define. Therefore, we apply a bottom-up approach: we fix n, start from a strategy for n buffers and, using it as a black box, we build a better strategy for $n^2$ buffers. By applying such a transformation iteratively, in the limit we obtain the desired construction, incurring a lower bound of e/(e − 1).
[Figure 1: … on 16 buffers identified with capital letters. Sets were omitted from the description of the strategy and placed on the picture instead. They denote buffers to which the adversary injects packets at steps marked with dots.]
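The shape and length of the strategy P can be checked with a small sketch. The set labels below are symbolic placeholders (the actual sets $A_j$ depend on the algorithm's state), and the function names are hypothetical. The last lines evaluate the per-buffer bound $e^{-\beta}$ of Lemma 2.3 for a uniform strategy of length 1.5 m, which corresponds to the figure 0.223 m quoted above.

```python
import math
from typing import List, Tuple

Action = Tuple[str, int]   # (symbolic label of the active set, length)


def strategy_P(n: int) -> List[Action]:
    """Actions of P on m = n*n buffers: phase j is (B, n) followed by (A_j, 1) x j."""
    actions: List[Action] = []
    for j in range(1, n + 1):
        actions.append(("B", n))
        actions.extend((f"A_{j}", 1) for _ in range(j))
    return actions


def strategy_length(actions: List[Action]) -> int:
    """Length of a canonical strategy: the sum of its actions' lengths."""
    return sum(length for _, length in actions)


n = 100
m = n * n
ell = strategy_length(strategy_P(n))       # equals n*n + n*(n + 1)/2
print(ell / m)                             # close to 1.5 for large n
print(math.exp(-1.5))                      # per-buffer residual bound, uniform strategy
```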

Building Lower Bounds Iteratively.
Let n be a parameter of our construction. For any integer q ≥ 0, let $n_q = n^{2^q}$, i.e., $n_0 = n$ and $n_{q+1} = n_q^2$. In this section, for any β ∈ (0, 1], we define strategies $S^\beta_{q+1}$ operating on $n_{q+1}$ buffers on the basis of strategies $S^\beta_q$ operating on $n_q$ buffers. The strategies $S^\beta_0$ are the uniform strategies of Definition 2.1.
Definition 3.1. Fix an integer q ≥ 0. For any subset A of $n_{q+1}$ buffers, and any β ∈ (0, 1], let

$$S^\beta_{q+1}(A) = (A, n_q),\ S_q^{1/n_q}(A_1),\ \ (A, n_q),\ S_q^{2/n_q}(A_2),\ \ \ldots,\ \ (A, n_q),\ S_q^{\lceil \beta n_q \rceil / n_q}(A_{\lceil \beta n_q \rceil}),$$

where $A_j$ is the set of $n_q$ buffers to which packets were injected during the action $(A, n_q)$ immediately preceding $S_q^{j/n_q}(A_j)$.
We say that $S^\beta_{q+1}(A)$ operates on set A. Note that by the recursive definition above, all actions of such a strategy operate on A or its subsets. Observe also that if $m = n^2$, the strategy $S^1_1(B)$ coincides with the strategy P.
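The recursive structure can be explored numerically. The sketch below computes only the length of $S^\beta_q$, reading the definition as ⌈βn_q⌉ phases, each consisting of an action of length $n_q$ followed by $S_q^{j/n_q}$; the name `strategy_length` is hypothetical, and exact rationals are used so that the ceilings are not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def strategy_length(n: int, q: int, beta: Fraction) -> int:
    """Length of the strategy S_q^beta operating on n**(2**q) buffers.

    q = 0: the uniform strategy (A, 1) x ceil(beta * n).
    q > 0: ceil(beta * n_{q-1}) phases; phase j is an action of length
           n_{q-1} followed by S_{q-1}^{j / n_{q-1}}.
    """
    if q == 0:
        return ceil(beta * n)
    n_prev = n ** (2 ** (q - 1))           # n_{q-1}
    phases = ceil(beta * n_prev)
    return sum(n_prev + strategy_length(n, q - 1, Fraction(j, n_prev))
               for j in range(1, phases + 1))


for q in range(3):
    n_q = 10 ** (2 ** q)
    print(q, strategy_length(10, q, Fraction(1)) / n_q)
```

For q = 1 and β = 1 the function returns n² + n(n + 1)/2, the length of the strategy P; comparing the printed ratios with e − 1 ≈ 1.718 gives a rough feel for the claim that the length of $S^1_q$ is about (e − 1) · $n_q$ when q is large and n is large compared to q.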
It is helpful to interpret this adversarial strategy as a (program) routine, which calls other subroutines. In the example given in Figure 1, routine S … on {C, D, F, K}. Within these subroutines, the adversary operates on a given set of buffers and ignores all the remaining ones. Note that the algorithm is not bounded in this way: it may always transmit the load from an arbitrary buffer. In the figure, we used brackets to emphasize the recursive structure of the strategy, and we omitted sets for succinctness.
Our construction implicitly builds a hierarchy of buffer sets: if a strategy (routine) operating on set A calls another strategy (subroutine) operating on set A′, then A′ ⊆ A. Moreover, at the beginning of the subroutine, all buffers from A′ are full.

Technical Relations.
We present two technical facts that we use in our bounds. They are both proven in the appendix.
Fact 3.1. For any non-negative integers n > 0, q, k and any positive real β, the following inequality holds: […]
Fact 3.2. For any non-negative integer q, the following inequality holds: […]

Bounding the Length of $S^\beta_q$.
Below, we bound the length of our strategies $S^\beta_q$.
Lemma 3.1. Fix integers n > 0, q ≥ 0, a real β ∈ (0, 1], and any subset A consisting of $n_q$ buffers. Let $T^\beta_q$ be the length of the strategy $S^\beta_q(A)$. Then, […]
Proof. The proof follows by induction on q. For q = 0, the right-hand side of the inequality is βn + 1, and thus by Definition 2.1, the basis of the induction holds. Assume that the bound holds for q; we prove it for q + 1. By Definition 3.1, it holds that $T^\beta_{q+1} = \sum_{j=1}^{\lceil \beta n_q \rceil} \bigl(n_q + T_q^{j/n_q}\bigr)$. Using the inductive assumption, Fact 3.1 and β ≤ 1, we derive […] which completes the inductive proof. ⊓⊔

Bounding the Efficiency of $S^\beta_q$.
We measure the efficiency of strategies $S^\beta_q$ in reducing the total load in a set of buffers. We assume that the total load never drops below 1, as this assures that the algorithm transmits load 1 in each step, which simplifies the analysis. Otherwise, the adversary may stop the sequence as the goal of reducing the load was achieved anyway; we postpone the details to the proof of Theorem 3.1.
Lemma 3.2. Fix a real β ∈ (0, 1], any subset A consisting of $n_q$ buffers, and let ℓ be the length of the strategy $S^\beta_q(A)$. Assume that at time τ, all buffers of A are full and the adversary starts to execute the strategy $S^\beta_q(A)$. Assume that for any time t ∈ {τ, τ + 1, ..., τ + ℓ − 1}, the total load (in all buffers) is at least 1. Let E be the load transmitted during $S^\beta_q(A)$ from buffers not in A. Let $L^\beta_q$ be the total load in A at time $(\tau + \ell)^+$. Then, […]
Proof. We use induction. As $Q_0 = 1/n$ and $F^\beta_0 = 1$, the induction basis (q = 0) follows immediately by Lemma 2.3. We assume that the bound holds for q and prove it for q + 1.
Recall that $S^\beta_{q+1}$ operates on set A consisting of $n_{q+1} = n_q^2$ buffers. We divide this strategy into $\lceil \beta n_q \rceil$ phases, where the j-th phase consists of an action $(A, n_q)$ followed by the strategy $S_q^{j/n_q}(A_j)$ (cf. Definition 3.1).
Let $t_0 = \tau$ and let $t_j$ be the ending time of phase j, i.e., phase j lasts between $t_{j-1}$ and $t_j$. For $j \in \{0, \ldots, \lceil \beta n_q \rceil\}$, let $C_j$ denote the total load in A at time $t_j^+$. Further, let $E_j$ be the load transmitted during phase j from buffers outside of A.
Let $\gamma = e^{-1/n_q}$. We show that the following relation holds for any phase j. […] (3.1)
To this end, we focus on a single phase j. We partition $E_j$ into $a_j$ and $b_j$: the loads transmitted from the buffers not in A during action $(A, n_q)$ and during sequence $S_q^{j/n_q}(A_j)$, respectively. The action $(A, n_q)$ takes place in steps $(t_{j-1}, t_{j-1} + n_q)$. Since the algorithm is work-conserving, and (by the lemma assumption) the total load in all buffers never drops below 1, the algorithm transmits the load $n_q - a_j$ from the buffers of A during these steps. The load of A at time $(t_{j-1} + n_q)^-$ is then $C_{j-1} - n_q + a_j$. At time $t_{j-1} + n_q$, the adversary chooses the set $A_j$ of the $n_q$ most populated buffers of A and a packet is injected to each buffer of $A_j$. Hence, at time $(t_{j-1} + n_q)^+$, all the buffers of $A_j$ are full and the load in the set […] At time $t_{j-1} + n_q$ the adversary starts to execute the strategy $S_q^{j/n_q}(A_j)$, ending at time $t_j$. Let $b'_j$ be the load transmitted by the algorithm during sequence $S_q^{j/n_q}(A_j)$ from buffers $A \setminus A_j$. Then, the loads at time $t_j^+$ are as follows: […] $\cdot\, n_q$ (by the inductive assumption); and thus (3.1) follows. By solving the recurrence of (3.1) with the boundary condition $C_0 = n_{q+1}$, the following bound on $C_r$ follows for any phase r. […]
As the number of phases is $\lceil \beta n_q \rceil$, $L^\beta_{q+1} = C_{\lceil \beta n_q \rceil}$. Using $\gamma^{\lceil \beta n_q \rceil} \le e^{-\beta}$, $-Q_{q+1} < q + 1$, and further reorganizing terms, we obtain […] Therefore, for proving the lemma, it is sufficient to show that the term in brackets is at most $F^\beta_{q+1}$. Indeed, applying Fact 3.1 yields […] which completes the inductive proof. ⊓⊔

The Competitive Ratio.
When we look at the strategies $S^1_q$ run on the set of $n_q$ initially full buffers, we observe that when q is large and n is large in comparison to q, then (i) the length of the strategy is approximately $(e - 1) \cdot n_q$, and (ii) after executing $S^1_q$, the total load of any algorithm is $o(n_q)$. Hence, the throughput of the algorithm is roughly $(e - 1) \cdot n_q$, whereas Opt may serve the whole sequence losslessly, ending the strategy $S^1_q$ with $n_q$ full buffers. Hence, the throughput of Opt is approximately $e \cdot n_q$, which yields the desired lower bound on the competitive ratio. These intuitions are formalized below.
Theorem 3.1. The competitive ratio of any fractional algorithm Alg for the unweighted FIFO variant of the buffer management problem is at least e/(e − 1). The bound holds for any value of B and m → ∞.
Proof. Let q > 0 be a parameter which we will fix later. Let $n = 4q^4$. The number of buffers used is $m = n_q$. At time 0, the adversary performs the initial injection, filling all buffers, and then executes the strategy $S^1_q(B)$ against the algorithm Alg (recall that B is the set of all buffers). If at any time during this strategy, the total load drops below 1, then the adversary ends the input sequence immediately.
We show that for large enough q, the competitive ratio of Alg incurred by $S^1_q(B)$ is arbitrarily close to e/(e − 1).
Let $\ell$ be the last time when packets were injected. If the sequence does not end prematurely (because of the load drop), then $\ell$ is just the length of $S^1_q(\mathcal{B})$; otherwise it is smaller. In either case, for the analysis, we may assume that the algorithm faced a canonic sequence that ended at time $\ell$. For our choice of $n$, it holds that $(1 + 2q/n)^{q+1} \le (1 + 1/(2q^3))^{2q} < e^{1/q^2} < 1 + 2/q^2$. Hence, by Lemma 3.1, […]
Let $c$ be the total load at time $\ell^+$. If the sequence ended prematurely, then since all actions of $S^1_q$ have length at most $n^{q-1}$, it holds that $c \le 1 + n^{q-1}$. Now, we consider the case when the strategy $S^1_q(\mathcal{B})$ was executed completely. As it operates on all the buffers, the term $E$ in the statement of Lemma 3.2 is zero. Hence, by this lemma, […] where the second inequality follows by Fact 3.2.
The throughput of Alg is at most $\ell + c$. On the other hand, by Lemma 2.2, the throughput of Opt is $\ell + n^q$. Hence, the competitive ratio of Alg is at least $(\ell + n^q)/(\ell + c)$. By taking sufficiently large $q$, the ratio can become arbitrarily close to $e/(e-1)$. As the choice of Alg was arbitrary, no algorithm can achieve a competitive ratio lower than $e/(e-1)$. ⊓⊔
Theorem 3.1 combined with Lemma 2.1 yields the following corollary.
Corollary 3.1. The competitive ratio of any randomized algorithm Alg for the unweighted FIFO variant of the buffer management problem is at least $e/(e-1)$. The bound holds for any value of $B$ and $m \to \infty$.
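The inequality chain $(1 + 2q/n)^{q+1} \le (1 + 1/(2q^3))^{2q} < e^{1/q^2} < 1 + 2/q^2$ used in the proof of Theorem 3.1 for $n = 4q^4$ can be sanity-checked numerically. The sketch below is not part of the proof; the function names `chain_terms` and `chain_holds` are illustrative, and the comparison is done on logarithms (an implementation choice made here to avoid rounding problems for large $q$).

```python
import math


def chain_terms(q):
    """Return the natural logarithms of the four terms of the chain for n = 4 q^4."""
    n = 4 * q**4
    x = 2 * q / n                      # equals 1 / (2 q^3)
    a = (q + 1) * math.log1p(x)        # log of (1 + 2q/n)^(q+1)
    b = 2 * q * math.log1p(x)          # log of (1 + 1/(2 q^3))^(2q)
    c = 1 / q**2                       # log of e^(1/q^2)
    d = math.log1p(2 / q**2)           # log of 1 + 2/q^2
    return a, b, c, d


def chain_holds(q):
    """Check a <= b < c < d, i.e. the chain itself, since log is increasing."""
    a, b, c, d = chain_terms(q)
    return a <= b < c < d


if __name__ == "__main__":
    for q in (1, 2, 5, 10, 100):
        print(q, chain_holds(q))
```

The first inequality only needs $q + 1 \le 2q$, the second is $(1 + x)^y < e^{xy}$, and the third is $e^x < 1 + 2x$ for $0 < x \le 1$; the script merely confirms these numerically for a range of integer $q \ge 1$.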

Conclusions
We conclude with a few remarks about the fractional model. First of all, the buffer size $B$ does not seem to be a relevant factor, at least for all known algorithms and lower bounds. Therefore, the fractional model seems to be simpler than the standard, integral one, and developing optimal algorithms might be easier here.
For the fractional model, we presented a new tight lower bound of $e/(e-1)$, which matches the performance of the algorithm Frac-Waterlevel by Azar and Litichevskey [AL06]. Although optimal when $m$ tends to infinity, the bound is not applicable for smaller values of $m$. For $m = 2$, the exact competitive ratio of 16/13 is achieved by the Perpendicular Bisector algorithm of [BM08]; it matches the lower bound generated by the uniform strategy of [AS05]. Determining the exact competitive ratios for other values of $m$ is an open problem. In particular, it is not even clear whether the best possible lower bound can be incurred using canonical strategies.

A Flaw in the Analysis of the Random Permutation Algorithm
The lower bound presented in this paper contradicts the claimed performance of the Random Permutation algorithm. In this section, we demonstrate an error in its original analysis [Sch05]. In terms of our lower bound, the main source of the flaw in [Sch05] is the assumption that $B$-uniform strategies are the worst for an online algorithm.
Random Permutation is defined for $B = 1$ and any number $m$ of buffers. It chooses a permutation of the buffers uniformly at random. In any time step, it transmits a packet from the populated buffer that is closest to the front of the permutation.
The first packet arriving at a particular buffer is called initializing and all the subsequent packets injected to this buffer are non-initializing. We number all non-initializing packets. The probability of accepting the $i$-th non-initializing packet is denoted $p_i$. Let […] Lemma 1 of [Sch05] states that $p_i \ge q_i$ for $1 \le i \le m$, provided that Opt can accommodate all non-initializing packets.
In the proof of this lemma, it is argued that the worst possible situation occurs when all $m$ initializing packets arrive at time 0 and then, for $1 \le i \le m$, the $i$-th non-initializing packet arrives at time $i$ to buffer $i$. We call such an input sequence systematic (these are inputs generated by special types of uniform strategies used in our construction). Although the lemma holds for systematic sequences (it even holds that $p_i = q_i$), it is not true that the systematic sequences are the worst input instances for the Random Permutation algorithm.
Our counterexample to Lemma 1 of [Sch05] is extremely simple. Let $m = 10$; $m$ initializing packets are injected at time 0. For $1 \le i \le 7$, the $i$-th non-initializing packet is injected at time $i$ to buffer $i$. At time 8, no packets are injected. At time 9, the 8-th and 9-th non-initializing packets are injected to buffers 8 and 9, respectively. At time 10, the 10-th non-initializing packet is injected to buffer 8.
Note that until step 7, our sequence is systematic. The probabilities of accepting the 8-th and the 9-th packet are indeed slightly higher in our sequence than in the systematic one, i.e., up to step 9, the claim that the systematic sequence is the worst possible is justified. But our construction leaves buffers 8 and 9 full at time 9. Thus, the probability of accepting the 10-th non-initializing packet is at most $1/2 < q_{10}$. Note that this counterexample to the performance of Random Permutation follows the same line of attack as our lower bound, i.e., delaying the injections, filling multiple buffers at once, and finally injecting a subsequent packet to one of them.
Subsequent parts of the analysis of Random Permutation [Sch05] use only the relation $\sum_{k=1}^{m} p_k \ge \sum_{k=1}^{m} q_k$, which can be proved to be false using our construction as well. Showing it analytically would require tedious calculations and would be in great part a mere repetition of the arguments given in the original proof of [Sch05]. Since our construction is a small finite object, it can be verified by a trivial computer program. Such a program can be found on the author's web page. Here, we just list the relevant values. Clearly, $p_i = q_i$ for $i \le 7$. Moreover, $p_8 = p_9 = 2\,374\,894/10!$ and $p_{10} = 1\,716\,050/10!$, whereas $q_8 = 2\,012\,014/10!$, $q_9 = 2\,160\,343/10!$, and $q_{10} = 2\,293\,839/10!$. This shows that $\sum_{k=1}^{m} p_k < \sum_{k=1}^{m} q_k$, proving Lemma 1 of [Sch05] false.
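A brute-force check of this kind can be sketched as follows. This is not the author's program but a minimal illustration under stated assumptions: buffers are numbered from 0, the names `acceptance_counts`, `systematic` and `counterexample` are hypothetical, and the time model is the one described above (all buffers hold one packet at time 0, one packet is transmitted during each step, and a packet injected at time $t$ is accepted only if its buffer is empty at that moment).

```python
from itertools import permutations


def acceptance_counts(m, injections):
    """For B = 1, count over all m! permutations how often each
    non-initializing packet is accepted by Random Permutation.

    injections: list of (time, buffer) pairs, 0-based buffers, times >= 1.
    Returns one count per entry of `injections`; divide by m! for p_i.
    """
    horizon = max(t for t, _ in injections)
    by_time = {}
    for idx, (t, b) in enumerate(injections):
        by_time.setdefault(t, []).append((idx, b))

    counts = [0] * len(injections)
    for perm in permutations(range(m)):
        full = [True] * m              # initializing packets injected at time 0
        for t in range(1, horizon + 1):
            # Step (t-1, t): transmit from the populated buffer
            # closest to the front of the permutation.
            for b in perm:
                if full[b]:
                    full[b] = False
                    break
            # Time t: an injected packet is accepted only by an empty buffer.
            for idx, b in by_time.get(t, []):
                if not full[b]:
                    full[b] = True
                    counts[idx] += 1
    return counts


def systematic(m):
    """The i-th non-initializing packet arrives at time i to buffer i."""
    return [(i, i - 1) for i in range(1, m + 1)]


def counterexample():
    """The m = 10 sequence from the text, with 0-based buffer indices."""
    seq = [(i, i - 1) for i in range(1, 8)]   # systematic up to time 7
    seq += [(9, 7), (9, 8)]                   # 8-th and 9-th packets at time 9
    seq += [(10, 7)]                          # 10-th packet to buffer 8
    return seq


if __name__ == "__main__":
    print(acceptance_counts(4, systematic(4)))
```

Calling `acceptance_counts(10, counterexample())` enumerates all $10! = 3\,628\,800$ permutations, which may take a while in pure Python; the last three counts it returns can then be compared with the numerators of $p_8$, $p_9$ and $p_{10}$ listed above.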

B Proofs of Technical Claims
Proof (of Lemma 2.1). We fix any buffer size $B$ and number of buffers $m$.
By $Q_{\mathrm{Alg}}(i)$ and $Q_{\mathrm{Alg}'}(i)$ we denote the amount of packets in buffer $i$ of algorithms Alg and Alg′, respectively. The proof follows by simulation, i.e., we show that on the basis of Alg, it is possible to construct an algorithm Alg′, such that, at any time of any input sequence, the following invariant holds for any buffer $i$: […] and for any step where Alg transmits $p$ packets in expectation, Alg′ transmits load $p$ as well. This implies that, on any input sequence, the throughput of Alg′ is at least that of Alg, which yields the lemma.
Clearly, the invariant holds at the beginning, when both Alg and Alg′ have no packets. Then the proof follows inductively, by considering two possible types of events.
1. A packet is injected to buffer $i$. If $Q_{\mathrm{Alg}'}(i) \ge B - 1$ before the injection, then afterwards […] Otherwise, $Q_{\mathrm{Alg}'}(i)$ increases by 1 and $E[Q_{\mathrm{Alg}}(i)]$ increases at most by 1. In both cases, the invariant is preserved.
2. Packets are transmitted during a step. Let $Q'_{\mathrm{Alg}}(i)$ be the number of packets in buffer $i$ after the transmission of Alg. Let […] Clearly, $q_i \ge 0$ for all $i$ and $\sum_{i=1}^{m} q_i \le 1$. To simulate this behavior of Alg and preserve the invariant, Alg′ transmits load $q_i$ from buffer $i$.
[…] $\le n^q \cdot ((\beta n^q + 2)/n^q + 2q/n)^{k+1} / (k+1)$
Proof (of Fact 3.2). We start with bounding a related term.