Abstract
The classical paging problem is to maintain a two-level memory system so that a sequence of requests to memory pages can be served with a small number of faults. Standard competitive analysis gives overly pessimistic results as it ignores the fact that real-world input sequences exhibit locality of reference. Initiated by a paper of Borodin et al. (J Comput Syst Sci 50:244–258, 1995) there has been considerable research interest in paging with locality of reference. In this paper we study the paging problem using an intuitive and simple locality model that records inter-request distances in the input. A characteristic vector \(\mathcal{C}\) defines a class of request sequences with certain properties on these distances. The concept was introduced by Panagiotou and Souza (In: Proceedings of 38th annual ACM symposium on theory of computing (STOC), 2006). As a main contribution we develop new and improved bounds on the performance of important paging algorithms. A strength and novelty of the results is that they express algorithm performance in terms of locality parameters. In a first step we develop a new lower bound on the number of page faults incurred by an optimal offline algorithm opt. The bound is tight up to a small additive constant. Technically, the result relies on a new approach of relating the number of page faults to the number of memory hits and amortizing suitably. Based on these expressions for opt’s cost, we obtain nearly tight upper and lower bounds on lru’s competitiveness, given any characteristic vector \(\mathcal{C}\). Furthermore, we compare lru to fifo and fwf. For the first time we show bounds that quantify the difference between lru’s performance and that of the other two strategies. The results imply that lru is strictly superior on inputs with a high degree of locality of reference.
There exist general input families for which lru achieves constant competitive ratios whereas the guarantees of fifo and fwf tend to k, the size of the fast memory. Finally, we report on an experimental study demonstrating that our theoretical bounds are very close to the experimentally observed ones. Hence our contributions again bring competitive paging closer to practice.
Introduction
Paging is a fundamental resource management problem in computer science. In algorithms research it has been studied extensively ever since Sleator and Tarjan published their seminal paper [21] on the competitive analysis of algorithms. In the paging problem we are given a two-level memory system consisting of a small fast memory and a large slow memory. At any time up to k pages, for some \(k\in \mathbb {N}\), can reside in fast memory. A paging algorithm alg is presented with a request sequence \(\sigma = \sigma (1), \ldots , \sigma (m)\), where each request \(\sigma (t)\) specifies a memory page. Consider any \(\sigma (t)\), \(1\le t\le m\). If the referenced page is in fast memory, \(\sigma (t)\) is a memory hit. Otherwise \(\sigma (t)\) is a page fault and the missing page must be loaded from slow memory into fast memory. If the fast memory is full, alg must evict a page from fast memory; in the online setting this decision must be made without knowledge of any future requests. The goal is to serve \(\sigma \) so as to minimize the total number of faults.
For an online algorithm alg and a request sequence \(\sigma \), let \(\textsc {alg}(\sigma )\) denote the number of page faults incurred. Let \(\textsc {opt}(\sigma )\) be the number of faults generated by an optimal offline algorithm opt. Strategy alg is c-competitive if, for every \(\sigma \), \(\textsc {alg}(\sigma )\) is at most c times \(\textsc {opt}(\sigma )\). The optimal competitive ratio achieved by deterministic online algorithms is equal to k [21]. Classical algorithms such as lru (Least-Recently-Used), fifo (First-In First-Out) and fwf (Flush-When-Full) are all k-competitive.
It was soon observed that the competitive ratio of k is overly pessimistic. In practice algorithms such as lru and fifo attain constant performance ratios in the range [1.5, 4], see also [22]. Furthermore, lru outperforms fifo, a difference that competitive analysis does not reveal. The deficiency of the competitive measure is that it considers arbitrary request sequences whereas input sequences generated by real programs have a special structure. They exhibit locality of reference, i.e. whenever a page is requested it is likely to be referenced again in the near future. In a cornerstone paper Borodin et al. [10] initiated the investigation of paging with locality of reference. Over the years various frameworks modeling locality of reference have been proposed. Moreover, new and alternative performance measures have been introduced. In this paper we revisit paging with locality of reference, considering again the competitive performance measure. Compared to previous studies we present for the first time strong guarantees that quantify competitiveness in terms of locality parameters of the input. We analyze individual algorithms and relate pairs of strategies.
Input model We use a model for locality of reference introduced by Panagiotou and Souza [20]. The framework is simple, yet captures the essentials of locality of reference: Whenever a page is requested, it is likely to be re-accessed soon. Hence locality can appropriately be modeled by inter-request distances. Specifically, feasible input is defined by a characteristic vector \(\mathcal{C} = (c_0,\ldots , c_{p-1})\), where p denotes the total number of distinct pages ever referenced. Again let \(\sigma \) be a request sequence and \(\sigma (t)\) be the request at time t. We refer to \(\sigma (t)\) as a distance-l request, where \(0\le l \le p-1\), if the following two properties hold. (1) The page x referenced by \(\sigma (t)\) has been requested before in \(\sigma \) and its most recent request was \(\sigma (t')\). (2) The number of distinct pages requested between \(\sigma (t')\) and \(\sigma (t)\) is equal to l, i.e. \(|\{ \sigma (t'+1), \ldots , \sigma (t-1)\}| = l\). In a request sequence \(\sigma \) characterized by \(\mathcal{C} = (c_0,\ldots , c_{p-1})\), there are exactly \(c_l\) distance-l requests, for \(l=0, \ldots , p-1\). The total number of requests in \(\sigma \) is \(p + \sum _{l=0}^{p-1} c_l\).
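The vector \(\mathcal{C}\) can be computed from a trace in a single scan: if the pages are kept in recency order, the distance of a request is exactly the number of pages referenced more recently than the requested page. A minimal Python sketch (the function name is ours, not from the paper):

```python
from collections import Counter

def characteristic_vector(requests):
    """One scan over a trace: the distance of a request equals the
    number of pages referenced more recently than the requested page."""
    stack = []            # pages in recency order, most recent first
    counts = Counter()
    for x in requests:
        if x in stack:
            counts[stack.index(x)] += 1   # distance-l request, l = position
            stack.remove(x)
        stack.insert(0, x)
    p = len(stack)
    return p, [counts[l] for l in range(p)]
```

For the sequence \(\sigma = a\,a\,b\,c\,a\,b\) this yields \(p=3\) and \(\mathcal{C}=(1,0,2)\): one distance-0 request (the repeated a) and two distance-2 requests.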
The concept of characteristic vectors allows one to easily quantify the number of page faults incurred by lru. This was already observed by Panagiotou and Souza [20]. On a fault lru evicts the page whose last reference is longest ago. Thus at any time lru’s fast memory stores the (up to) k pages that were referenced most recently. Consequently, lru never incurs a page fault on a distance-l request with \(0\le l\le k-1\) as the referenced page is still in fast memory. Moreover, lru has a fault on every distance-l request with \(k\le l \le p-1\) since the accessed page has been evicted from fast memory since its last reference. It follows that for any \(\sigma \) specified by \(\mathcal{C} = (c_0,\ldots , c_{p-1})\), there holds \(\textsc {lru}(\sigma ) = p + \sum _{l=k}^{p-1} c_l\).
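The closed form for lru can be cross-checked against a direct simulation. The sketch below is our own illustrative code: for \(\sigma = a\,a\,b\,c\,a\,b\), which has \(p=3\) and \(\mathcal{C}=(1,0,2)\), the formula predicts \(p + c_2 = 5\) faults when \(k=2\).

```python
def simulate_lru(requests, k):
    """Direct LRU simulation (illustrative code; the name is ours)."""
    stack, faults = [], 0     # pages in recency order, most recent first
    for x in requests:
        if x in stack:
            stack.remove(x)   # memory hit
        else:
            faults += 1       # page fault
            if len(stack) == k:
                stack.pop()   # evict the least recently used page
        stack.insert(0, x)
    return faults

# sigma = a a b c a b has p = 3 and C = (1, 0, 2) (computed by hand);
# for k = 2 the closed form gives p + c_2 = 3 + 2 = 5 faults.
```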
Given any characteristic vector \(\mathcal{C}\), the competitive ratio of an algorithm alg is defined as \(R_\textsc {alg}(\mathcal{C}) = \max _\sigma \textsc {alg}(\sigma )/ \textsc {opt}(\sigma )\), where the maximum ranges over all request sequences characterized by \(\mathcal{C}\). As this set of sequences is finite, the maximum is well-defined.
Previous work There exists a considerable body of literature on paging with locality of reference. Due to the wealth of results we can only present a selection. A good survey article is [12]. In their initial paper [10] Borodin et al. introduced access graphs G, representing the execution of programs, to model locality of reference. The vertices of G correspond to the memory pages. Page x may be requested after y if they are adjacent in G. Borodin et al. showed that, for any G, the competitiveness \(R_{\textsc {lru}}(G)\) of lru depends on the number of articulation nodes whose removal separates G. They also developed an algorithm that achieves the best possible competitive ratio attainable for any given G, up to a constant factor [10, 19]. Chrobak and Noga [13] proved that lru is always at least as good as fifo, i.e. for any G, \(R_{\textsc {lru}}(G)\le R_{\textsc {fifo}}(G)\).
Articles [4, 17, 18] make probabilistic assumptions about the input. A diffuse adversary [18] generates a request sequence according to a probability distribution that belongs to a known family of distributions. In Markov paging [17] the input is generated by a Markov chain. Algorithms are evaluated in terms of the page fault rate. In [1] concave functions, which model the working set sizes of programs, are used to restrict the allowed input. Again page fault rates are evaluated.
Especially in recent years various alternative performance measures, in addition to the well-known page fault rate, have been proposed. These include (a) the max/max ratio [6], (b) bijective and average analysis [2, 3], (c) the relative worst-order ratio [7, 8], (d) relative interval analysis [9, 14] and (e) parametrized analysis [11]. In a bijective analysis two algorithms \(\textsc {alg}_1\) and \(\textsc {alg}_2\) are compared on permutations of the same requests. Let \(\mathcal{I}_n\) denote the set of request sequences of length n. \(\textsc {alg}_1\) is no worse than \(\textsc {alg}_2\), in symbols \(\textsc {alg}_1 \preceq \textsc {alg}_2\), if for all \(n\ge n_0\) there is a bijection \(b : \mathcal{I}_ n \rightarrow \mathcal{I}_n\) such that \(\textsc {alg}_1(\sigma ) \le \textsc {alg}_2(b(\sigma ))\) for all \(\sigma \in \mathcal{I}_n\). In this setting lru is no worse than any other online algorithm alg assuming that locality is modeled by a concave function [3]. However, \(\textsc {lru}\preceq \textsc {fifo}\) and \(\textsc {fifo}\preceq \textsc {lru}\), see [2], so that the two algorithms are equally good under bijective analysis.
The concept of characteristic vectors was defined by Panagiotou and Souza [20]. As a main result they establish a lower bound, referred to as inequality (1) in the following, on the number of page faults incurred by opt on a request sequence \(\sigma \) characterized by \(\mathcal{C} = (c_0,\ldots , c_{p-1})\).
They mention that the bound is tight for characteristic vectors in which \(p=k+1\). Panagiotou and Souza [20] also define an \((\alpha ,\beta )\)-adversary that chooses vectors \(\mathcal{C}\) satisfying \(\sum _{l=k}^{\alpha k-1} c_l \le \beta \sum _{l=\alpha k}^{p-1} c_l\). Against this adversary lru achieves a competitive ratio of \(2(1+\beta ) \alpha /(\alpha -1)\). Panagiotou and Souza [20] also study a setting where an adversary may construct an arbitrary request sequence but the size of the fast memory is chosen uniformly at random from a given range.
Our contribution We investigate paging using classical competitive analysis and adopt the concept of characteristic vectors \(\mathcal{C} = (c_0,\ldots , c_{p-1})\) to model locality of reference. It is intuitive to represent input characteristics by a fingerprint of the inter-request distances: If a request sequence exhibits a high degree of locality, then a large majority of the requests are distance-l requests, for small l, so that the corresponding vector entries \(c_l\) take large values. Given a real-world trace, the underlying \(\mathcal{C}\) can be extracted easily by a single scan over the data.
We present new and significantly improved bounds on the performance of the most important paging strategies. A particular strength and novelty of the results is that they quantify algorithm performance in terms of locality parameters. Furthermore, the bounds very accurately predict the corresponding performance observed in practice. This finding results from an experimental study we conducted with traces from a benchmark library. These tests confirm the value of our theoretical bounds.
In Sect. 2, given any characteristic vector \(\mathcal{C}\), we develop a new lower bound on the number of page faults incurred by opt to serve any request sequence \(\sigma \) characterized by \(\mathcal{C}\). Technically, the analysis relies on a new approach that relates the number of page faults to the number of memory hits and amortizes the values appropriately. Specifically, we show that
\(\textsc {opt}(\sigma ) \;\ge \; k + \sum _{l=k}^{\lambda -1} c_l {l-k+1\over k-1} + c_{\lambda }^*{\lambda -k+1\over k-1}. \qquad (2)\)
Here \(\lambda \) and \(c^*_{\lambda }\) are solutions of an equation that matches faults and hits, assuming that page faults preferably occur on long-distance requests. We prove that our lower bound is tight, up to an additive term of at most \(2(\lambda -k+1)\), which in turn is upper bounded by \(2(p-k)\). More precisely, we construct an input sequence that opt can serve with the stated number of faults. The construction and cost analysis of the sequence are involved. A strength of our lower bound is that it is tight for every characteristic vector. The bound (1) by Panagiotou and Souza [20] is only tight for a restricted class of characteristic vectors, as indicated above. Additionally, we show that our lower bound (2) is always greater than that given in (1). In the experiments (2) considerably outperforms (1).
In Sect. 3 we evaluate the competitiveness of lru. Given the analysis of \(\textsc {opt}(\sigma )\), we derive nearly tight upper and lower bounds on \(R_{\textsc {lru}}(\mathcal{C})\), for any \(\mathcal{C}\). The resulting ratios range between 1 and k, depending on \(\mathcal{C}\). The experiments show that these refined ratios are very close to lru’s experimentally observed competitiveness. For all the traces and all values of k, the theoretical bounds are usually at most 2.5 times the experimentally observed performance. In most cases the gap is much smaller. This is the first time that theoretical performance guarantees for paging match the experimental ones up to a constant factor, independently of k. We remark that our theoretical guarantees cannot exactly match the experimental ones because \(R_{\textsc {lru}}(\mathcal{C}) = \max _\sigma \textsc {lru}(\sigma )/ \textsc {opt}(\sigma )\) is still a worstcase ratio. A realworld trace, in general, is not a worstcase input for the underlying \(\mathcal{C}\).
In Sect. 4 we show that lru is superior to other popular paging strategies. We focus on a comparison with fifo and fwf, which have received considerable attention in the memory management literature. We first prove that lru is always at least as good as the other two strategies, i.e. \(R_{\textsc {lru}}(\mathcal{C}) \le R_{\textsc {fifo}}(\mathcal{C})\) and \(R_{\textsc {lru}}(\mathcal{C}) \le R_{\textsc {fwf}}(\mathcal{C})\) for any \(\mathcal{C}\). This is not surprising; similar relations have been shown in other frameworks as well. In this paper we go one step further and quantify the performance difference between lru and fifo, respectively fwf. We make use of the fact that lru’s competitiveness can be expressed as \(R_{\textsc {lru}}(\mathcal{C}) = \textsc {lru}(\mathcal{C})/\textsc {opt}(\mathcal{C})\), where \(\textsc {opt}(\mathcal{C})\) denotes the minimum number of page faults required to serve any request sequence defined by \(\mathcal{C}\) and \(\textsc {lru}(\mathcal{C})\) is lru’s fixed cost for every input specified by \(\mathcal{C}\). We prove a lower bound on \(R_{\textsc {fifo}}(\mathcal{C})\) in terms of \(R_{\textsc {lru}}(\mathcal{C})\) and a parameter c that depends on the vector entries \(c_l\), \(1\le l \le p-1\). If the number of distance-l requests with \(l\ge k\) is not too small, then fifo’s competitiveness tends to k as the locality in the input (captured by entries \(c_l\), for small l) increases. In particular, there exist input classes \(\mathcal{C}\) for which lru’s competitiveness is constant while that of fifo is close to k. The same results hold for fwf, except that slightly “weaker” assumptions on the input are made. Finally, in Sect. 5 we report on the results of our experimental study.
Algorithms We describe the classical paging algorithms analyzed in the paper. Suppose that there is a page fault and the fast memory is full. Among the pages residing in fast memory, lru evicts the one whose last reference is longest ago. fifo drops the page that was loaded earliest. fwf deletes all pages from fast memory. An optimal offline algorithm \(\textsc {opt}\) was given by Belady [5]. On a fault, when the fast memory is full, it evicts the page whose next reference is farthest in the future.
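Belady's rule is easy to sketch. The following illustrative Python code (names are ours) uses a naive quadratic look-ahead, which suffices for small traces:

```python
def simulate_belady(requests, k):
    """Belady's offline rule: on a fault with full fast memory, evict the
    page whose next reference is farthest in the future (or never).
    Naive quadratic look-ahead, for illustration only."""
    memory, faults = set(), 0
    for t, x in enumerate(requests):
        if x in memory:
            continue                      # memory hit
        faults += 1                       # page fault
        if len(memory) == k:
            def next_use(y):
                # index of y's next request, or infinity if never requested again
                return next((s for s in range(t + 1, len(requests))
                             if requests[s] == y), float('inf'))
            memory.remove(max(memory, key=next_use))
        memory.add(x)
    return faults
```

For example, on \(\sigma = a\,a\,b\,c\,a\,b\) with \(k=2\) this rule incurs 4 faults, one fewer than lru.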
Notation and conventions Throughout this paper we assume that the initial fast memory is empty. Furthermore we assume \(p>k\) since otherwise a request sequence can be served without any faults. Moreover let \(k\ge 2\). When constructing and analyzing a request sequence, a page is called new if it has not been referenced so far.
Analysis of opt
Let \(\mathcal{C} = (c_0,\ldots , c_{p-1})\) be an arbitrary characteristic vector. First we develop a lower bound on \(\textsc {opt}(\sigma )\), for any \(\sigma \) defined by \(\mathcal{C}\). Then we prove that our bound is nearly tight.
A Lower Bound
The lower bound on \(\textsc {opt}(\sigma )\) given by Panagiotou and Souza [20], cf. (1), depends on multicycles. A multicycle repeatedly requests a sequence of, say l, distinct pages. On each repetition of the sequence \(\textsc {opt}\) incurs at least \(l-k\) page faults if \(l\ge k\). Panagiotou and Souza prove that every request sequence can be viewed as a collection of multicycles. Instead, our new lower bound on \(\textsc {opt}(\sigma )\) is based on a novel approach that relates page faults to memory hits. If \(\textsc {opt}\) has a hit on a distance-l request with \(l\ge k\), then it must have incurred at least \(l-(k-1)\) faults since the last reference to the requested page. We assign tokens to the respective faults. This allows us to lower bound the number of page faults in terms of the number of hits, see Lemma 1 below. The subsequent analysis then lower bounds the expression on the number of hits, for any request sequence. It turns out that the expression is minimized if the hits (faults) occur on the distance-l requests with the smallest (largest) possible value of l.
Formally, given any \(\sigma \), let \(f_l\) denote the total number of page faults incurred by \(\textsc {opt}\) on distance-l requests, \(0\le l \le p-1\), and let \(h_l = c_l - f_l\) be the number of hits on this type of requests. We relate the total number of faults to the number of hits.
Lemma 1
Let \(\sigma \) be any request sequence characterized by \(\mathcal{C}\). There holds (a) \(\textsc {opt}(\sigma ) = p + \sum _{l=k}^{p-1} f_l\) and (b) \(p + \sum _{l=k}^{p-1} f_l \ge k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1}\).
Proof
We first prove part (a). There holds \(\textsc {opt}(\sigma ) = p + \sum _{l=0}^{p-1} f_l\) because \(\textsc {opt}\) incurs one page fault whenever any of the p distinct pages is requested for the first time. Moreover, by the definition of \(f_l\), \(\textsc {opt}\) has exactly \(f_l\) faults on the distance-l requests, for \(l=0, \ldots , p-1\). It remains to argue that \(f_l =0\), for \(l=0, \ldots , k-1\). Obviously, \(f_0=0\). So assume \(l\ge 1\). Consider a distance-l request \(\sigma (t) =x\) and let \(\sigma (t')\), where \(t'<t\), be the most recent request when page x was referenced in \(\sigma \). Immediately after \(\textsc {opt}\) has served \(\sigma (t')\), page x is in fast memory. Whenever \(\textsc {opt}\) incurs a fault on a request \(\sigma (s)\), \(t'<s<t\), the set \(\{\sigma (s),\ldots , \sigma (t)\}\) of pages referenced until and including \(\sigma (t)\) contains at most \(l+1 \le k\) pages. This holds true because \(\sigma (t)\) is a distance-l request, where \(l\le k-1\). Thus the set \(\{\sigma (s),\ldots , \sigma (t)\}\) contains at most \(k-1\) pages different from \(y=\sigma (s)\). Hence when \(\textsc {opt}\) serves \(\sigma (s)\), its fast memory must store a page not contained in \(\{\sigma (s+1),\ldots , \sigma (t)\}\). \(\textsc {opt}\) evicts a page whose next request is farthest in the future. Therefore it drops a page that is not referenced by \(\sigma (s+1), \ldots , \sigma (t)\).
We next prove part (b). To this end we assign tokens to page faults whenever \(\textsc {opt}\) has a hit on a distance-l request, where \(l\ge k\), in \(\sigma \). Let \(\sigma (t)=x\) be such a request and let \(\sigma (t')\) be the most recent request to x. A total of l distinct pages are referenced in the subsequence \(\sigma (t'+1), \ldots , \sigma (t-1)\). Algorithm \(\textsc {opt}\) incurs at least \(l-(k-1)\) page faults in this subsequence because \(l\ge k\) and \(\sigma (t)\) is a hit. Now we select the last \(l-(k-1)\) page faults occurring before \(\sigma (t)\) and assign a token to each of these faults. By this process, exactly \(\sum _{l=k}^{p-1} h_l(l-k+1)\) tokens are placed.
In the following we upper bound the number of tokens a page fault may be assigned. Let \(\sigma (s)\) be any page fault. Suppose that \(\sigma (s)\) receives a token when there is a hit on a request \(\sigma (t)\) with \(s<t\). Reference \(\sigma (t)\) is a distance-l request where \(l\ge k\). The page \(x=\sigma (t)\) is not requested in \(\sigma (s), \ldots , \sigma (t-1)\). For, if x were requested in this subsequence, then the \(l-(k-1)\) tokens would be assigned to page faults occurring between the most recent request to x and \(\sigma (t)\). Since x is not requested by \(\sigma (s), \ldots , \sigma (t-1)\) and \(\sigma (t)\) is a hit, x must reside in fast memory when \(\textsc {opt}\) has served \(\sigma (s)\). Also x is different from the page referenced by \(\sigma (s)\). Hence when \(\sigma (s)\) receives a token due to a hit on \(\sigma (t)\), page \(x=\sigma (t)\) resides in fast memory when \(\textsc {opt}\) has served \(\sigma (s)\) and is different from the page accessed by \(\sigma (s)\). Since there exist at most \(k-1\) such pages, \(\sigma (s)\) can receive at most \(k-1\) tokens.
We next argue that the first k page faults in \(\sigma \) do not receive any token. Let \(\sigma (t_1), \ldots , \sigma (t_k)\) be the requests where these first k page faults occur. Recall that the initial fast memory is empty. Hence each \(\sigma (t_i)\), \(1\le i\le k\), requests a new page that has not been referenced before in \(\sigma \). There holds \(t_1=1\), the k pages referenced by \(\sigma (t_1), \ldots , \sigma (t_k)\) are pairwise distinct and the subsequence \(\sigma (1),\ldots ,\sigma (t_k)\) only contains requests to these pages. Furthermore, the first hit on a distance-l request with \(l\ge k\) occurs after \(\sigma (t_k)\). Let \(\sigma (t)\), \(t>t_k\), be such a hit and assume that the referenced page \(x=\sigma (t)\) was requested most recently by \(\sigma (t')\), where \(t'<t_k\), so that any of the faults \(\sigma (t_1), \ldots , \sigma (t_k)\) could potentially be assigned a token. The subsequence \(\sigma (t'+1), \ldots , \sigma (t-1)\) contains l pages, at most \(k-1\) of which can be identical to those referenced by \(\sigma (t_1), \ldots , \sigma (t_k)\) because \(\sigma (t)\) is a page from \(\sigma (t_1), \ldots , \sigma (t_k)\). Hence \(\sigma (t'+1), \ldots , \sigma (t-1)\) contains at least \(l-(k-1)\) pages that are different from those requested by \(\sigma (t_1), \ldots , \sigma (t_k)\). These pages are referenced after \(\sigma (t_k)\), and the first request to each of them is a fault since, again, the initial fast memory is empty. Our token assignment scheme places \(l-(k-1)\) tokens on the last \(l-(k-1)\) page faults prior to \(\sigma (t)\). Hence the faults \(\sigma (t_1),\ldots , \sigma (t_k)\) do not receive any token.
In summary, each fault is assigned at most \(k-1\) tokens, and the first k faults do not receive any. We conclude that the total number of tokens is upper bounded by \((p-k)(k-1) + \sum _{l=k}^{p-1} f_l(k-1)\), i.e.
\(\sum _{l=k}^{p-1} h_l (l-k+1) \;\le \; (p-k)(k-1) + \sum _{l=k}^{p-1} f_l (k-1).\)
This is equivalent to \(k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1} \le p + \sum _{l=k}^{p-1} f_l\). \(\square \)
For the further analysis, given a vector \(\mathcal{C} = (c_0,\ldots , c_{p-1})\), we define two functions f and g as well as values \(\lambda \) and \(c_{\lambda }^*\). For any integer j with \(k\le j \le p-1\) and any real number \(\gamma \) with \(0\le \gamma \le c_j\), let
\(f(j,\gamma ) \;=\; k + \sum _{l=k}^{j-1} c_l {l-k+1\over k-1} + \gamma \,{j-k+1\over k-1} \qquad \text{and}\qquad g(j,\gamma ) \;=\; p + \sum _{l=j}^{p-1} c_l - \gamma .\)
Intuitively, \(f(j,\gamma )\) is the number of page faults, as implied by Lemma 1, if memory hits occur on all the distance-l requests, for \(l=k,\ldots ,j-1\), and on \(\gamma \) distance-j requests. The corresponding \(g(j,\gamma )\) is the number of requests where these faults can occur. If \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\), then let \(\lambda =p-1\) and \(c_{\lambda }^* = c_{p-1}\). Otherwise determine the largest \(\lambda \) and corresponding \(c_{\lambda }^*\) such that \(f(\lambda ,c_{\lambda }^*) = g(\lambda ,c_{\lambda }^*)\).
Lemma 2

(a) The values \(\lambda \) and \(c_{\lambda }^*\) are well-defined.

(b) Let \(j'\) and \(\gamma '\) be a pair such that \(f(j',\gamma ') \le g(j',\gamma ')\). Then \(f(j',\gamma ') \le f(\lambda ,c_{\lambda }^*) \le g(\lambda ,c_{\lambda }^*)\le g(j',\gamma ')\). Moreover, \(j'\le \lambda \). If \(j'=\lambda \), then \(\gamma '\le c_\lambda ^*\).
Proof
For any fixed j, \(k\le j\le p-1\), and variable \(\gamma \), \(0<\gamma <c_j\), the functions \(f(j,\gamma )\) and \(g(j,\gamma )\) are continuous. For any fixed j and increasing \(\gamma \), function \(f(j,\gamma )\) is strictly increasing while \(g(j,\gamma )\) is strictly decreasing. For \(j=k,\ldots , p-2\), there holds \(f(j,c_j)=f(j+1,0)\) and \(g(j,c_j)=g(j+1,0)\). Hence f and g are continuous, when considering the transitions from \(f(j,c_j)\) to \(f(j+1,0)\) and from \(g(j,c_j)\) to \(g(j+1,0)\). Furthermore the functions are monotone, i.e. f is increasing and g is decreasing.
We first prove part (a). If \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\), there is nothing to show. Suppose that \(f(p-1,c_{p-1}) > g(p-1,c_{p-1})\). Since \(f(k,0) = k < p \le g(k,0)\), the monotonicity of f and g ensures the existence of \(j^*\) and \(\gamma ^*\) such that \(f(j^*,\gamma ^*)=g(j^*,\gamma ^*)\). The first parameter j of f and g is an integer upper bounded by \(p-1\). Hence part (a) holds.
For the proof of part (b), first suppose that \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\), in which case \(\lambda =p-1\) and \(c_{\lambda }^*=c_{p-1}\). For any pair \(j',\gamma '\) there holds \(f(j',\gamma ') \le f(p-1,c_{p-1})\) and \(g(p-1,c_{p-1})\le g(j',\gamma ')\), which establishes the desired inequality. Obviously, \(j'\le p-1\). The monotonicity of f and g implies \(\gamma '\le c^*_\lambda \). Next assume that \(f(p-1,c_{p-1}) > g(p-1,c_{p-1})\). In this case \(f(\lambda ,c_{\lambda }^*) = g(\lambda ,c_{\lambda }^*)\). Functions f and g are monotone as described in the last paragraph. Hence, for any pair \(j,\gamma \) with \(f(j,\gamma ) > f(\lambda ,c_{\lambda }^*)\) there holds \(g(\lambda ,c_{\lambda }^*)\ge g(j,\gamma )\). For any pair \(j,\gamma \) with \(g(j,\gamma )<g(\lambda ,c_{\lambda }^*)\) there holds \(f(\lambda ,c_{\lambda }^*)\le f(j,\gamma )\). Now let \(j'\) and \(\gamma '\) be a pair such that \(f(j',\gamma ') \le g(j',\gamma ')\). If \(f(j',\gamma ') > f(\lambda ,c_{\lambda }^*)\) held true, then \(g(\lambda ,c_{\lambda }^*)\ge g(j',\gamma ')\) and \(f(j',\gamma ')> f(\lambda ,c_{\lambda }^*) = g(\lambda ,c_{\lambda }^*) \ge g(j',\gamma ')\). If \(g(j',\gamma ')<g(\lambda ,c_{\lambda }^*)\) held true, then \(f(\lambda ,c_{\lambda }^*)\le f(j',\gamma ')\) and \(g(j',\gamma ')< g(\lambda ,c_{\lambda }^*) = f(\lambda ,c_{\lambda }^*) \le f(j',\gamma ')\). In both cases we obtain a contradiction to the fact that \(f(j',\gamma ') \le g(j',\gamma ')\). Therefore, the desired inequality holds. Since \(\lambda \) is the largest integer such that f and g assume an equal value, the monotonicity of the functions implies \(j' \le \lambda \). Furthermore, \(\gamma '\le c^*_\lambda \) if \(j'= \lambda \).\(\square \)
Theorem 1
Let \(\sigma \) be any request sequence characterized by \(\mathcal{C}\). There holds
\(\textsc {opt}(\sigma ) \;\ge \; k + \sum _{l=k}^{\lambda -1} c_l {l-k+1\over k-1} + c_{\lambda }^*{\lambda -k+1\over k-1}.\)
Proof
Obviously, \(\textsc {opt}(\sigma ) \ge p\). By Lemma 1
\(\textsc {opt}(\sigma ) \;=\; p + \sum _{l=k}^{p-1} f_l \;\ge \; k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1}. \qquad (3)\)
Determine the largest \(j'\), where \(k\le j'\le p-1\), and corresponding \(\gamma '\), where \(0\le \gamma '\le c_{j'}\), such that \(\sum _{l=k}^{p-1} h_l = \sum _{l=k}^{j'-1} c_l +\gamma '\). Intuitively, we express the total number of hits in terms of a prefix of the \(c_l\)-values, for increasing \(l\ge k\). Of course, the hits do not necessarily occur on all the distance-l requests, where \(l\le j'\). Then
\(\textsc {opt}(\sigma ) \;=\; p + \sum _{l=k}^{p-1} f_l \;=\; p + \sum _{l=j'}^{p-1} c_l - \gamma ' \;=\; g(j',\gamma '). \qquad (4)\)
In (3) the expression \(k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1}\) is minimized if the hits occur on distance-l requests with the smallest possible l, subject to the constraint that at most \(c_l\) distance-l requests occur in \(\sigma \). Hence
\(k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1} \;\ge \; k + \sum _{l=k}^{j'-1} c_l {l-k+1\over k-1} + \gamma '\,{j'-k+1\over k-1} \;=\; f(j',\gamma ').\)
Combining (3) and (4) together with the last inequality we obtain \(\textsc {opt}(\sigma ) = g(j',\gamma ') \ge k + \sum _{l=k}^{p-1} h_l {l-k+1\over k-1} \ge f(j',\gamma ')\). Using Lemma 2, part (b), we conclude \(\textsc {opt}(\sigma ) \ge f(\lambda ,c_{\lambda }^*) = k + \sum _{l=k}^{\lambda -1} c_l {l-k+1\over k-1} + c_{\lambda }^*{\lambda -k+1\over k-1}\).\(\square \)
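Since f and g are linear in \(\gamma \) for every fixed j, the pair \((\lambda , c^*_{\lambda })\), and with it the bound of Theorem 1, can be computed segment by segment. The following sketch (our illustrative code, assuming \(k\ge 2\) and \(p>k\)) evaluates \(\max \{p, f(\lambda ,c^*_{\lambda })\}\):

```python
def theorem1_bound(p, C, k):
    """Lower bound on opt via f(lambda, c*) = g(lambda, c*).
    For fixed j both f and g are linear in gamma, so each segment
    is solved directly.  Illustrative code; names are ours."""
    def f(j, gam):
        return k + sum(C[l] * (l - k + 1) / (k - 1) for l in range(k, j)) \
                 + gam * (j - k + 1) / (k - 1)
    def g(j, gam):
        return p + sum(C[j:p]) - gam
    if f(p - 1, C[p - 1]) <= g(p - 1, C[p - 1]):
        lam, c_star = p - 1, C[p - 1]
    else:
        for j in range(p - 1, k - 1, -1):          # largest feasible j first
            gam = (g(j, 0) - f(j, 0)) / ((j - k + 1) / (k - 1) + 1)
            if 0 <= gam <= C[j]:                   # a crossing exists by Lemma 2(a)
                lam, c_star = j, gam
                break
    return max(p, f(lam, c_star))
```

For \(\sigma = (a\,b\,c\,d)^3\) with \(k=3\) we have \(p=4\) and \(\mathcal{C}=(0,0,0,8)\), since every request after the first four is a distance-3 request; the bound evaluates to 6, and one can check by hand that Belady's algorithm serves this sequence with exactly 6 faults, so the bound is attained here.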
Proposition 1
The lower bound on \(\textsc {opt}(\sigma )\) stated in Theorem 1 is always greater than that in inequality (1).
The proof is given in the “Appendix”.
Tightness of the Lower Bound
The lower bound of Theorem 1 is essentially best possible. We present a strategy that, given an arbitrary \(\mathcal{C} = (c_0, \ldots , c_{p-1})\), constructs a request sequence that can be served with the stated number of page faults, up to an additive constant of \(2(\lambda -k+1)\). The strategy is called GenerateRequestSequence, or GRS for short. It takes the original \(\mathcal{C}\) and in a general step issues a distance-l request, \(0\le l \le p-1\), according to a specific protocol. The corresponding value \(c_l\) is reduced by 1. The process stops when all vector entries \(c_l\), \(0\le l \le p-1\), are equal to 0 and all the p distinct pages have been requested.
On a high level the constructed request sequence consists of a series of phases. In each phase, first a certain number of distance-l requests, for the largest possible \(l\ge k\), are issued. Then \(k-1\) distance-l requests, for the smallest possible \(l\ge k\), may be generated. Finally, distance-l requests with \(l<k\) are generated. The request sequence can be served so that page faults only occur in the first part of the phases, i.e. on distance-l requests issued for the largest possible l. All other requests are memory hits. This allows us to analyze the number of page faults in terms of the functions f and g as well as the values \(\lambda \) and \(c^*_{\lambda }\) defined in Sect. 2.1. Hence we derive a bound similar to that in Theorem 1.
Description of GRS Let \(\mathcal{C} = (c_0, \ldots , c_{p-1})\) be an arbitrary characteristic vector. First, starting with an empty fast memory, GRS requests k new pages. Then GRS generates a sequence of phases in which requests to new pages or distance-l requests with \(k\le l \le p-1\) are issued. The goal is to reduce the vector entries \(c_k, \ldots , c_{p-1}\) to 0 while generating subsequences of requests that can be served with low cost. Each phase, except for possibly the last one, consists of exactly \(l^*\) requests, for some properly chosen \(l^*\) that depends on the state of \(\mathcal{C}\) at the beginning of the phase. Such a phase with \(l^*\) requests is complete. The last phase may contain fewer requests. Finally, GRS issues distance-l requests with \(0\le l\le k-1\) and requests to the remaining new pages, if there are any. We remark that at any time t GRS can issue a distance-l request provided that at least \(l+1\) distinct pages have been requested so far. The algorithm just has to determine the most recent request \(\sigma (t')\), where \(t'<t\), such that \(\sigma (t'+1),\ldots , \sigma (t-1)\) reference exactly l distinct pages. It then issues a request to the page specified by \(\sigma (t')\). In Lemma 3 we show that GRS never fails, i.e. when it has to create a distance-l request, indeed at least \(l+1\) distinct pages have been referenced so far. Figure 1 gives a pseudocode description of GRS. In line 1, k new pages are requested. At any time a variable new stores the current number of new pages. The while-loop consisting of lines 2–9 generates the phases as described in the next two paragraphs. Each execution of lines 2–9 produces one phase.
The phases Each phase with \(l^*\) requests, for the calculated value \(l^*\), contains \(l^*-k+1\) so-called long-distance requests followed by \(k-1\) short-distance requests. When generating a long-distance request, GRS either requests a new page or issues a distance-\(l\) request, for the largest index \(l\ge k\) such that \(c_l>0\). In a short-distance request GRS poses a distance-\(l\) request, for the smallest possible \(l\ge k\) such that \(c_l>0\). We will prove that each phase can be served so that page faults occur only on the long-distance requests; all short-distance requests are memory hits. As we shall see, the crucial property is that each short-distance request is to a page that was requested before in the phase or during the last k requests preceding the phase. The property holds if each short-distance request is a distance-\(l\) request, for some \(l\le l^*\).
Phase lengths An important component of GRS is the choice of \(l^*\), for each phase. Loosely speaking, \(l^*\) is the smallest j such that (a) \(\sum _{l=k}^j c_l \ge k-1\) and (b) \(\sum _{l=k}^{p-1} c_l\ge j\), provided that such a value exists. Condition (a) ensures that \(k-1\) short-distance requests can be issued. Condition (b) guarantees that a complete phase can be generated. Condition (a) also implies that at the end of the phase the vector entries \(c_l\) with \(l<j\) are equal to 0. Formally, in a first step GRS sets \(l^*\) to the smallest possible l such that \(l\ge k\) and \(c_l>0\); cf. line 3 in the pseudocode of GRS. A (weak) necessary condition for the generation of a complete phase is that the total number of requests still to be generated is at least \(l^*-k+1\), i.e. \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} \ge l^*-k+1\). If this inequality does not hold, then a final phase consisting of fewer than \(l^*-k+1\) long-distance requests is created. If the inequality does hold, then the value of \(l^*\) is refined. More precisely, it is first set to the largest j such that \(\sum _{l=j}^{p-1} c_l + \mathit{new} \ge j-k+1\); see line 5 in the pseudocode. This is the right choice of \(l^*\) to ensure a proper termination of the phase generation in case no complete phase can be created anymore. Finally, GRS checks if \(\sum _{l=k}^{p-1} c_l + \mathit{new} \ge l^*\). In this case a complete phase can be created and \(l^*\) is set to the smallest j such that \(\sum _{l=k}^j c_l \ge k-1\); cf. line 7 of the pseudocode. In Lemma 4 below we show that a complete phase, consisting of \(l^*\) requests, is generated if and only if \(l^*\) is set in line 7. Otherwise an incomplete final phase is created. In order to distinguish the various settings of \(l^*\), we introduce some notation. If \(l^*\) is set in line 7, we say that Case C holds, referring to the fact that a complete phase is generated.
If \(l^*\) is set in line 3 or in line 5 and not refined further, then Case I or Case I’ holds, respectively. In the latter cases, an incomplete phase is created.
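The three settings of \(l^*\) can be summarized in a small sketch (assuming, as GRS guarantees when the while-loop is entered, that some \(c_l>0\) with \(l\ge k\) exists; the function name is ours):

```python
def choose_phase_length(C, k, new):
    """Sketch of lines 3-7 of GRS: pick l* and the case (I, I', or C).

    C is the current characteristic vector (c_0, ..., c_{p-1}),
    new the number of pages not yet requested.
    """
    p = len(C)
    # line 3: smallest l >= k with c_l > 0
    l_star = next(l for l in range(k, p) if C[l] > 0)
    if sum(C[l_star:]) + new < l_star - k + 1:
        return l_star, "I"                       # incomplete final phase
    # line 5: largest j with sum_{l=j}^{p-1} c_l + new >= j-k+1
    l_star = max(j for j in range(k, p)
                 if sum(C[j:]) + new >= j - k + 1)
    if sum(C[k:]) + new >= l_star:
        # line 7: smallest j with sum_{l=k}^{j} c_l >= k-1
        l_star = next(j for j in range(k, p) if sum(C[k:j + 1]) >= k - 1)
        return l_star, "C"                       # complete phase
    return l_star, "I'"                          # incomplete final phase
```

For example, with \(k=3\) and \(\mathcal{C} = (0,0,0,2,1,3)\) a complete phase of length 3 is generated (Case C), whereas \(\mathcal{C} = (0,0,0,0,0,1)\) immediately triggers Case I.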
Figure 2 depicts the procedures for issuing the short- and long-distance requests. In the generation of the long-distance requests, preference is given to new pages if such pages exist. When the phase generation terminates, GRS generates distance-\(l\) requests, where \(0\le l \le k-1\), and issues requests to new pages, in case there are any. These final requests are generated in lines 10 and 11 in the pseudocode of GRS.
Theorem 2
Let \(\sigma \) be the request sequence generated by GRS. There holds \(\textsc {opt}(\sigma ) \le f(\lambda , c^*_{\lambda }) + 2(\lambda -k+1)\).
Analysis of GRS In the remainder of this section we prove Theorem 2. We analyze the number of page faults needed to serve the sequence generated by GRS. Formally, a phase is a maximal subsequence of requests generated during an execution of the while-loop consisting of lines 2–9 of GRS. In such a phase the first up to \(l^*-k+1\) requests issued in line 8 are called long-distance requests. The remaining up to \(k-1\) requests issued in line 9 are short-distance requests. Here \(l^*\) is the value determined during the execution of lines 3–7. The phase is complete if it contains \(l^*\) requests. The following Lemma 3, part (b), proves that GRS constructs a request sequence characterized by \(\mathcal{C}\). The proof needs part (a) of the lemma, which identifies a property of short-distance requests. This property will also be needed in further lemmas.
Lemma 3

(a)
Consider an arbitrary execution of the while-loop consisting of lines 2–9 in GRS. Let \(l^*\) be the value determined by lines 3–7. If in a call to ShortDistanceRequest there holds \(\sum _{l=k}^{p-1} c_l >0\), then the smallest \(j\ge k\) such that \(c_j >0\) satisfies \(j\le l^*\).

(b)
GRS never fails and generates a request sequence characterized by \(\mathcal{C}\).
Proof
(a) Consider an arbitrary execution of the while-loop consisting of lines 2–9 in GRS, where \(l^*\) is the value determined in lines 3–7. A first observation is that if ShortDistanceRequest is called and \(\sum _{l=k}^{p-1} c_l >0\), then the value \(l^*\) must have been set in lines 5 or 7 (Case I’ or Case C) of the current while-loop. For, if this were not the case, then \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} = \sum _{l=k}^{p-1} c_l + \mathit{new} < l^*-k+1\) when line 3 of the loop was executed. Hence fewer than \(l^*-k+1\) long-distance requests could be generated before \(\sum _{l=k}^{p-1} c_l =0\) and no short-distance request would be issued.
We proceed with the proof of the statement of part (a). Suppose that \(l^*\) was set in line 5 but not reset in line 7 of GRS, i.e. Case I’ holds. If \(l^* = p-1\), there is nothing to show. If \(l^* <p-1\), then by the choice of \(l^*\), \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} \ge l^*-k+1\) but \(\sum _{l=l^*+1}^{p-1} c_l + \mathit{new} < (l^*+1)-k+1\). This implies \(\sum _{l=l^*+1}^{p-1} c_l + \mathit{new} \le l^*-k+1\) before line 8 is executed. Therefore, after the execution of the for-loop in line 8 we have \(c_l = 0\), for any \(l> l^*\), because LongDistanceRequest issues distance-\(l\) requests, for the largest possible \(l\ge k\). We conclude that the smallest possible \(j\ge k\) with \(c_j>0\) always satisfies \(j\le l^*\).
Finally assume that \(l^*\) was set in line 7 of GRS, i.e. Case C holds. Then \(\sum _{l=k}^{l^*} c_l \ge k-1\). Recall again that LongDistanceRequest issues distance-\(l\) requests, for the largest possible \(l\ge k\). We conclude that during any of the \(k-1\) calls of ShortDistanceRequest the smallest \(j\ge k\) with \(c_j>0\) satisfies \(j\le l^*\).
(b) We show that the request generation in lines 8, 9 and 10 of GRS is always well-defined. First observe that in line 1 of GRS k distinct pages are requested. Hence in line 10, distance-\(l\) requests with \(l\le k-1\) can always be issued. Next consider an execution of LongDistanceRequest. If line 4 of the procedure is executed, then all the p distinct pages have already been referenced and a distance-\(l\) request can be generated for any l with \(0\le l\le p-1\).
We study a call to ShortDistanceRequest in line 9 of GRS. Consider the execution of the while-loop consisting of lines 2–9 in which the call is made. Let \(l^*\) be the value determined during lines 3–7. If prior to the call of ShortDistanceRequest all the p distinct pages have been referenced, a distance-\(l\) request, for any \(0\le l \le p-1\), can be generated. So suppose that fewer than p distinct pages have been requested so far. Then in the preceding execution of the for-loop in line 8, only new pages were requested by LongDistanceRequest. Hence the phase constructed so far contains \(l^*-k+1\) pairwise distinct pages. Together with the k new pages requested in line 1 of GRS, a total of at least \(l^*+1\) pairwise distinct pages have been referenced so far. If in the call to ShortDistanceRequest there holds \(\sum _{l=k}^{p-1} c_l >0\), then by part (a) of the lemma the smallest \(j\ge k\) such that \(c_j>0\) satisfies \(j\le l^*\). Hence the request generation in line 3 of the procedure is well defined.\(\square \)
In Lemma 4 below we prove that a complete phase is generated if and only if \(l^*\) is set in line 7 (Case C). For the proof we need the following auxiliary claim.
Claim 1
Consider an arbitrary execution of the while-loop consisting of lines 2–9. If \(l^*\) is set in line 7, then the value is upper bounded by that of the previous setting in line 5.
Proof
Immediately before \(l^*\) is set in line 7 (Case C), with the prior choice of \(l^*\) in line 5, there holds \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} \ge l^*-k+1\) and \(\sum _{l=k}^{p-1} c_l + \mathit{new} \ge l^*\). We argue that \(\sum _{l=k}^{l^*} c_l \ge k-1\). If \(l^*=p-1\), there is nothing to show because \(\mathit{new} \le p-k\). Otherwise \(\sum _{l=l^*+1}^{p-1} c_l + \mathit{new} < (l^*+1)-k+1\) by the choice of \(l^*\). Again this implies \(\sum _{l=l^*+1}^{p-1} c_l + \mathit{new} \le l^*-k+1\) and the condition \(\sum _{l=k}^{p-1} c_l + \mathit{new} \ge l^*\) ensures \(\sum _{l=k}^{l^*} c_l \ge k-1\). Therefore the setting in line 7 cannot increase the value of \(l^*\).\(\square \)
Lemma 4
Consider an arbitrary execution of the while-loop consisting of lines 2–9. If \(l^*\) is set in line 7, then a complete phase is generated. Otherwise this is the last execution of the while-loop and the phase contains fewer than \(l^*\) requests.
Proof
Suppose that \(l^*\) is set in line 7. Immediately before this setting, with the prior choice of \(l^*\) in line 5, there holds \(\sum _{l=k}^{p-1} c_l + \mathit{new} \ge l^*\). By the above Claim 1, the setting in line 7 cannot increase the value of \(l^*\), so that the last inequality is maintained. Therefore, in the for-loops in lines 8 and 9 a total of \(l^*\) requests are issued. Next assume that \(l^*\) is not set in line 7. If in the execution of line 4 there holds \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} < l^*-k+1\), then fewer than \(l^*-k+1\) requests can be issued before \(\sum _{l=k}^{p-1} c_l =0\). If \(l^*\) is set in line 5 but not in line 7 (Case I’), then \(\sum _{l=k}^{p-1} c_l + \mathit{new} < l^*\). In the subsequent execution of lines 8 and 9 fewer than \(l^*\) requests are issued before \(\sum _{l=k}^{p-1} c_l =0\).\(\square \)
The next lemma identifies important properties of the requests in a phase. The long-distance requests reference distinct pages. The short-distance requests also reference distinct pages; each of these pages occurred earlier in the phase or at the end of the previous phase. This ensures that the short-distance requests can be served without page faults.
Lemma 5
Let P be an arbitrary phase and \(l^*\) be the value determined in lines 3–7 when the phase was generated.

(a)
The up to \(l^*-k+1\) long-distance requests reference pairwise distinct pages. These pages are also different from the last k distinct pages referenced before the beginning of P.

(b)
The up to \(k-1\) short-distance requests reference pairwise distinct pages that are also different from the page referenced by the last long-distance request. Each of these short-distance requests references a page that was requested during the last k requests before P or by a long-distance request in P.
Proof
(a) It suffices to show that when a distance-\(l\) request is generated by LongDistanceRequest, then \(l\ge l^*\). Suppose that \(l^*\) was set in line 3 but not reset in lines 5 or 7, i.e. Case I holds. Then there is nothing to show because \(l^*\) is the smallest index l with \(c_l>0\). If \(l^*\) was set in line 5 or 7 (Case I’ or Case C), then it satisfies \(\sum _{l=l^*}^{p-1} c_l + \mathit{new} \ge l^*-k+1\) because an adjustment in line 7 cannot increase the value determined in line 5. Since LongDistanceRequest issues distance-\(l\) requests, for the largest l with \(c_l>0\), the index l cannot drop below \(l^*\).
(b) Whenever ShortDistanceRequest generates a distance-\(l\) request, \(l\ge k\). Hence the up to \(k-1\) short-distance requests reference distinct pages that are also different from the page referenced by the last long-distance request. Observe that if short-distance requests are issued, they must be preceded by \(l^*-k+1\) long-distance requests. Recall that in line 1 of GRS, k new pages are requested. By Lemma 4 every phase except for possibly the last one is complete. It follows that before the beginning of P the last k requests reference distinct pages. By part (a) of this lemma they are also different from the long-distance requests in P. Hence the subsequence consisting of the last k requests before P and the first \(l^*-k+1\) requests in P references a total of \(l^*+1\) distinct pages. Lemma 3, part (a), ensures that whenever ShortDistanceRequest generates a distance-\(l\) request, there holds \(l\le l^*\). Therefore, any such request references a page that was requested during the last k requests before P or by a long-distance request in P.\(\square \)
Lemma 6 analyzes the service of the request sequence generated by GRS.
Lemma 6
Suppose that the request sequence \(\sigma \) produced by GRS contains at least one phase generated in lines 2–9. Then \(\sigma \) can be served such that the following two properties hold. (1) No page faults occur on short-distance requests, if there are any. (2) At the end of the last phase the last k distinct pages referenced are in fast memory.
Proof
Let alg be an algorithm that serves \(\sigma \) as follows. The first k distinct pages referenced are loaded into fast memory without any page evictions. Whenever there is a page fault in a phase P, alg evicts a page that is not referenced by any short-distance request in the phase. If there are several such pages, it evicts the page that was requested least recently. Note that alg is well-defined because there exist at most \(k-1\) short-distance requests in any phase.
At the beginning of the first phase the last k distinct pages referenced are in fast memory. We show that if at the beginning of a phase P the last k distinct pages referenced are in fast memory, then this property also holds at the end of P w.r.t. the pages requested before the end of P. Moreover, no page faults occur on short-distance requests in P. By Lemma 5 part (b) any short-distance request in P references a page that was either requested during the last k requests before P or by a long-distance request in P. In the first case, by assumption, the corresponding page is in fast memory at the beginning of P and will not be evicted by alg until the end of the phase. In the second case the corresponding page will be loaded into fast memory when the long-distance request is served and not be evicted until the end of P. Thus the statement on the page faults holds. If P is a complete phase, then at the end of P the last k distinct pages referenced are all in fast memory. If P is not a complete phase, then the desired property also holds because alg always keeps the most recently requested pages in fast memory in addition to those needed by the short-distance requests.\(\square \)
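The eviction rule of alg can be written down directly (a sketch; the names `protected` and `last_use` are our illustrative choices):

```python
def evict(fast_memory, protected, last_use):
    """Eviction rule of alg from the proof of Lemma 6 (sketch).

    fast_memory: set of resident pages; protected: pages referenced by
    short-distance requests of the current phase; last_use: page -> time
    of its most recent request.  Among the unprotected resident pages,
    evict the least recently used one.  The choice is well-defined since
    at most k-1 pages are protected while k pages are resident.
    """
    candidates = fast_memory - protected
    return min(candidates, key=lambda page: last_use[page])
```

Because protected pages are never evicted, every short-distance request of the phase is a memory hit, which is exactly property (1) of Lemma 6.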
In the next two lemmas let \(j^*\) be the value of \(l^*\) determined in lines 3–7 when the last phase is generated by GRS. The following lemma shows that the long-distance and short-distance requests are separated along \(j^*\).
Lemma 7
In the entire request sequence generated by GRS any distance-\(l\) request with \(k\le l<j^*\) is a short-distance request. Any distance-\(l\) request with \(l>j^*\) is a long-distance request.
Proof
As always, let \(l^*\) be the value determined in an execution of lines 3–7 of GRS. We first prove that over the executions of the while-loop consisting of lines 2–9, these values form a nondecreasing sequence. To this end consider an arbitrary execution of the while-loop in lines 2–9. If \(l^*\) is set in line 7 (Case C), then \(l^*\) is the smallest j such that \(\sum _{l=k}^j c_l \ge k-1\). By Lemma 4 the execution of the while-loop produces a complete phase so that exactly \(k-1\) short-distance requests are issued and \(c_l =0\), for any \(l< l^*\). In a subsequent execution of the while-loop the chosen \(l^*\) value cannot be smaller because it is lower bounded by the smallest \(l\ge k\) such that \(c_l >0\). On the other hand, if \(l^*\) is set in lines 3 or 5 (Cases I or I’), then by Lemma 4 the current execution of the while-loop is the last one. Therefore, as claimed, the \(l^*\) values form a nondecreasing sequence.
By Lemma 3 part (a) and the just proven property that the \(l^*\) values form a nondecreasing sequence, any distance-\(l\) request with \(l> j^*\) must be a long-distance request. We next show that no distance-\(l\) request with \(l<j^*\) can be a long-distance request. Consider the generation of the last phase. If \(j^*\) is determined in lines 3 or 7 (Cases I or C), then \(c_{j^*} >0\) when the value \(j^*\) is set. If LongDistanceRequest has generated a distance-\(l\) request in any previous phase, then \(l\ge j^*\) because the procedure always chooses the largest l such that \(c_l >0\). As for the last phase, if \(j^*\) is determined in line 3 (Case I), no distance-\(l\) request with \(l<j^*\) can be issued. If \(j^*\) is determined in line 7 (Case C), then by Lemma 4 the last phase is complete. Since by the choice of \(j^*\) there holds \(\sum _{l=k}^{j^*-1} c_l < k-1\), we obtain \(\sum _{l=j^*}^{p-1} c_l \ge j^*-k+1\). Therefore, in the last phase, no long-distance request can be a distance-\(l\) request with \(l<j^*\).
Finally assume that \(j^*\) is set in line 5 but not reset in line 7, i.e. Case I’ holds. If the variable \(\mathit{new}\) is still positive, then the long-distance requests issued in previous phases cannot be distance-\(l\) requests, for any \(l\ge k\). If \(\mathit{new}=0\), then \(\sum _{l=j^*}^{p-1} c_l \ge j^*-k+1\). The latter inequality and the fact that there exists an \(l\ge j^*\) such that \(c_l>0\) imply that neither in any previous phase nor in the last phase can a distance-\(l\) request with \(l<j^*\) be issued.\(\square \)
Let \(\lambda \) and \(c^*_\lambda \) be the values as defined in Sect. 2.1.
Lemma 8
The value \(j^*\) is upper bounded by \(\lambda \). Moreover, the entire request sequence \(\sigma \) produced by GRS contains at most \(c^*_\lambda \) distance-\(\lambda \) requests that are short-distance requests.
Proof
Let \(\gamma ^*\) be the number of distance-\(j^*\) requests that are issued as short-distance requests. By Lemma 7, the total number of short-distance requests is \(\sum _{l=k}^{j^*-1} c_l + \gamma ^*\). If at the end of the last phase all p distinct pages have been requested, the number of long-distance requests is \(p-k + (c_{j^*}-\gamma ^*) + \sum _{l=j^*+1}^{p-1} c_l\). Otherwise the number is upper bounded by \(p-k\). In the following we relate these expressions. Intuitively, in any phase we distribute the number of long-distance requests among the short-distance requests.
Consider any distance-\(l\) request with \(l\le j^*\) that is issued as a short-distance request. This request is contained in a phase with exactly \(l^*-k+1\) long-distance requests where, as usual, \(l^*\) is the value determined in lines 3–7 of GRS when the phase is generated. By Lemma 3 part (a) there holds \(l\le l^*\). Now we split the number \(l^*-k+1\) of long-distance requests evenly among the short-distance requests of the phase. If the phase is complete, each short-distance request is assigned a request volume of \({l^*-k+1 \over k-1}\). If the phase is not complete, the assigned value is \({l^*-k+1 \over s}\), where \(s<k-1\) is the actual number of short-distance requests in the phase. Thereby, a short-distance request that is a distance-\(l\) request will be assigned a request volume of at least \({l-k+1 \over k-1}\). This implies that the number of long-distance requests is lower bounded by
\[ \sum _{l=k}^{j^*-1} c_l {l-k+1 \over k-1} + \gamma ^* {j^*-k+1 \over k-1}. \]
If at the end of the last phase all the p distinct pages have been requested, we obtain \(p + (c_{j^*}-\gamma ^*) + \sum _{l=j^*+1}^{p-1} c_l \ge k + \sum _{l=k}^{j^*-1} c_l {l-k+1 \over k-1} + \gamma ^* {j^*-k+1 \over k-1}\). Recall the functions f and g defined in Sect. 2.1. The last inequality then reads as \(f(j^*,\gamma ^*) \le g(j^*,\gamma ^*)\). Lemma 2 part (b) implies \(j^*\le \lambda \). If \(j^*= \lambda \), then \(\gamma ^* \le c^*_{\lambda }\).
We finally study the case that at the end of the last phase some of the p pages have not yet been requested. In this case there are no distance-\(l\) requests, \(k\le l \le p-1\), that are long-distance requests. Hence \(p-k \ge \sum _{l=k}^{p-1} c_l {l-k+1 \over k-1}\). Using the functions f and g, we obtain \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\). In this case \(\lambda = p-1\) and \(c^*_\lambda = c_{p-1}\).\(\square \)
We finally prove the main result of this section.
Proof of Theorem 2
Given \(\sigma \), the service of the first k requests, which reference new pages, requires k page faults. These k pages then reside in fast memory. First assume that GRS does not generate any phases in lines 2–9, which implies \(\sum _{l=k}^{p-1}c_l =0\). Then any distance-\(l\) request with \(0\le l \le k-1\) is a memory hit. Thus \(\textsc {opt}(\sigma )= p\). We argue that the upper bound on \(\textsc {opt}(\sigma )\) given in Theorem 2 is at least p. If \(\lambda = p-1\), this is obvious because \(k + 2(\lambda -k+1)> p\). If \(\lambda < p-1\), then \(f(\lambda , c^*_{\lambda }) = g(\lambda , c^*_{\lambda })\), where \(g(\lambda , c^*_{\lambda }) \ge p\). The upper bound on \(\textsc {opt}(\sigma )\) given in Theorem 2 is equal to \(f(\lambda , c^*_{\lambda }) + 2(\lambda -k+1) = g(\lambda , c^*_{\lambda }) + 2(\lambda -k+1)\) and thus at least p.
In the following we assume that GRS generates at least one phase in lines 2–9. By Lemma 6 there exists an algorithm alg that can serve the phases such that page faults occur only on the long-distance requests and at the end of the last phase the last k distinct pages referenced are in fast memory. Hence all the distance-\(l\) requests, where \(0\le l \le k-1\), can be served without any page faults. If after the service of these requests there still exist new pages, then all the long-distance requests must have referenced new pages. In this case \(\textsc {opt}(\sigma )= p\) and, as argued in the last paragraph, the theorem holds.
In the remainder of this proof we concentrate on the case that after the service of the distance-\(l\) requests, \(0\le l\le k-1\), all the p distinct pages have been referenced. We have to upper bound the number of long-distance requests, which will give us an upper bound on the number of page faults. Suppose that r phases \(P(1), \ldots , P(r)\) have been generated by GRS in \(\sigma \). The first \(r-1\) phases are complete. Assume that phase P(i), \(1\le i \le r-1\), consists of \(l_i^*\) requests, where \(l_i^*\) is the value determined for this phase in lines 3–7 of GRS. Let \(l'_i\) be the smallest \(l\ge k\) such that \(c_l>0\) at the beginning of the phase. There holds \(l'_i \le l^*_i\). Algorithm alg can serve \(P(1), \ldots , P(r-1)\) so that at most \(l^*_i-k+1\) page faults are incurred on the long-distance requests in P(i), for \(i=1, \ldots , r-1\). For any such phase P(i) we charge a service cost of \(l'_i-k+1\), which is potentially smaller than \(l^*_i-k+1\), to the \(k-1\) short-distance requests. Each such request is assigned a cost of \((l'_i-k+1)/(k-1)\). Observe that each such request is a distance-\(l\) request with \(l'_i\le l\). Hence a short-distance request that is a distance-\(l\) request carries a cost of at most \((l-k+1)/(k-1)\).
Lemmas 7 and 8 ensure that, for any distance-\(l\) request that is issued as a short-distance request, there holds \(l\le \lambda \). Moreover, there exist at most \(c^*_\lambda \) distance-\(\lambda \) requests that are short-distance requests. We obtain
\[ \textsc {opt}(\sigma ) \le k + \sum _{i=1}^{r} (l^*_i-k+1) \le k + \sum _{l=k}^{\lambda -1} c_l {l-k+1 \over k-1} + c^*_{\lambda } {\lambda -k+1 \over k-1} + \sum _{i=1}^{r-1} (l^*_i-l'_i) + l^*_r-k+1. \tag{5} \]
The last inequality follows from our charging scheme that assigns service cost for the long-distance requests to the short-distance requests. We next evaluate \(\sum _{i=1}^{r-1} (l^*_i-l'_i)\). We argue that for any \(i=1, \ldots , r-1\) there holds \(l_i^* \le l'_{i+1}\). Phase P(i) is complete, i.e. \(l^*_i\) was set in line 7 of GRS, see Lemma 4. At the beginning of the phase, by the choice of \(l^*_i\), there holds \(\sum _{l=k}^{l_i^*} c_l \ge k-1\) but \(\sum _{l=k}^{l_i^*-1} c_l < k-1\). Hence at the end of the phase \(c_l=0\), for any l with \(k\le l <l^*_i\).
We obtain \(\sum _{i=1}^{r-1} (l^*_i-l'_i)\le l^*_{r-1}- l'_1 < \lambda -k+1\), where the last inequality follows from the facts that \(l^*_{r-1}\le \lambda \), cf. Lemma 8, and \(l'_1 \ge k\). Also, by Lemma 8, \(l^*_r\le \lambda \). Using these inequalities in (5), we obtain the desired upper bound on \(\textsc {opt}(\sigma )\).\(\square \)
The Competitiveness of lru
We present upper and lower bounds on the competitive ratio \(R_{\textsc {lru}}(\mathcal{C})\), for any \(\mathcal{C}\). While the bounds involve a number of terms, we stress that they are nearly tight, up to an additive constant of \(2(\lambda -k+1)\) in the denominator of the ratios. Of course, one could simplify the expressions at the expense of weakening the bounds. After stating the corollary we show that our expressions for \(R_{\textsc {lru}}(\mathcal{C})\) range between 1 and k.
Corollary 1
Let \(\mathcal{C} = (c_0, \ldots , c_{p-1})\) be an arbitrary characteristic vector. Then
\[ R_{\textsc {lru}}(\mathcal{C}) \le \frac{p + \sum _{l=k}^{p-1} c_l}{g(\lambda , c^*_{\lambda })} \tag{6} \]
and
\[ R_{\textsc {lru}}(\mathcal{C}) \ge \frac{p + \sum _{l=k}^{p-1} c_l}{f(\lambda , c^*_{\lambda }) + 2(\lambda -k+1)}. \tag{7} \]
Proof
For any \(\sigma \), \(\textsc {lru}(\sigma ) = p + \sum _{l=k}^{p-1} c_l\). This fact was already observed by Panagiotou and Souza [20] and is also explained in the introduction of this paper. The corollary then follows from Theorems 1 and 2.\(\square \)
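The identity \(\textsc {lru}(\sigma ) = p + \sum _{l=k}^{p-1} c_l\) can be checked mechanically: LRU faults exactly on requests to new pages and on distance-\(l\) requests with \(l\ge k\). A minimal sketch (the helper names are ours):

```python
from collections import OrderedDict

def lru_faults(sigma, k):
    """Serve sigma with LRU on a fast memory of size k; count page faults."""
    cache = OrderedDict()
    faults = 0
    for page in sigma:
        if page in cache:
            cache.move_to_end(page)        # memory hit: refresh recency
        else:
            faults += 1                    # page fault
            if len(cache) == k:
                cache.popitem(last=False)  # evict least recently used
            cache[page] = True
    return faults

def characteristic_vector(sigma):
    """c[l] = number of distance-l requests in sigma; the first request
    to each page is a request to a new page and is not counted."""
    c, last = {}, {}
    for t, page in enumerate(sigma):
        if page in last:
            l = len(set(sigma[last[page] + 1:t]))
            c[l] = c.get(l, 0) + 1
        last[page] = t
    return c
```

For example, for the sequence a, b, c, a, d, b and \(k=2\) we have \(p=4\) and \(c_2=c_3=1\), so the identity predicts \(4+2=6\) faults, which the simulation confirms.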
We argue that the upper bound in (6) can be constant, and as low as 1, in particular when given vectors \(\mathcal{C}\) modeling request sequences with a high degree of locality of reference. First consider the very simple case that \(\mathcal{C} = (c_0,\ldots , c_{k-1}, 0,\ldots , 0)\). The ratio in (6) is equal to 1. A more interesting case is the scenario in which \(\mathcal{C}\) has a small number of positive entries \(c_l\) with \(l\ge k\). In the benchmark library we used there exist traces with this property, see Fig. 4 in Sect. 5. In order to keep the calculations simple we assume that there is a single positive entry \(c_l\) with \(l\ge k\). W.l.o.g. \(c_{p-1}>0\), i.e. \(\mathcal{C} = (c_0,\ldots , c_{k-1}, 0,\ldots , 0, c_{p-1})\). The entries \(c_0,\ldots , c_{k-1}\) may take arbitrary values as they are irrelevant for lru’s and opt’s cost. If \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\), then \(c_{p-1} \le k-1\) and the ratio in (6) is upper bounded by 2. So assume \(f(p-1,c_{p-1}) > g(p-1,c_{p-1})\), in which case \(\lambda = p-1\) and \(c^*_{\lambda }\) satisfies \(k+c^*_{\lambda }(p-k)/(k-1) = p+c_{p-1}- c^*_{\lambda }\). This implies \(c^*_{\lambda }\ge c_{p-1}(k-1)/(p-1)\). We obtain that the ratio in (6) is upper bounded by \((p+c_{p-1})/(k+c_{p-1}(p-k)/(p-1))\). For increasing \(c_{p-1}\) the last ratio approaches \({p-1\over p-k}\). If \(p=k+1\), then the latter expression is equal to k, which is consistent with the fact that lru is \(k\)-competitive on sequences in which a total of \(k+1\) distinct pages are referenced. If \(p=rk\), for some constant \(r>1\), then \({p-1\over p-k}\) is smaller than \({r\over r-1}\), i.e. we obtain constant competitive ratios if r is not too close to 1.
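The single-positive-entry calculation can also be illustrated numerically (a sketch; the function name is ours). The bound \((p+c_{p-1})/(k+c_{p-1}(p-k)/(p-1))\) derived above indeed tends to \((p-1)/(p-k)\) as \(c_{p-1}\) grows:

```python
def ratio_upper_bound(p, k, c):
    """Upper bound on the ratio in (6) for a characteristic vector with
    a single positive entry c = c_{p-1} beyond index k, as derived in
    the text; approaches (p-1)/(p-k) for large c."""
    return (p + c) / (k + c * (p - k) / (p - 1))
```

For \(p=2k\) the bound approaches \((2k-1)/k < 2\), while for \(p=k+1\) it approaches k, in line with the discussion above.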
Finally, the upper bound for \(R_{\textsc {lru}}(\mathcal{C})\) is not greater than k: First assume that, for the given \(\mathcal{C}\), there holds \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\). In this case \(k + \sum _{l=k}^{p-1} c_l {l-k+1\over k-1} \le p\), which implies \(\sum _{l=k}^{p-1} c_l \le (k-1)p\). Thus the numerator in (6) is upper bounded by kp. On the other hand, if \(f(p-1,c_{p-1}) > g(p-1,c_{p-1})\), then \(f(\lambda , c^*_{\lambda }) = g(\lambda , c^*_{\lambda })\). In this case the numerator in (6) is
\[ k + \sum _{l=k}^{\lambda -1} c_l {l \over k-1} + c^*_{\lambda } {\lambda \over k-1}. \]
The last expression is at most k times the denominator in (6) because \(l/(l-k+1) \le k\), for any \(l\ge k\).
Separating lru from fifo and fwf
We compare lru to fifo and fwf and start with an analysis of fifo. First we present Lemma 9 below, which specifies request sequences that opt can serve with low cost. It is essential for all the results developed in this section. More precisely, Lemma 9 states that, for any characteristic vector, among the request sequences that opt can serve with the smallest number of page faults, there exists one in which the distance-\(l\) requests with \(l\le k-1\) occur at the end of the sequence. We then show that, on such a sequence \(\sigma ^*\), fifo incurs at least as many faults as lru. This establishes Theorem 3, stating that the competitiveness of fifo is at least as high as that of lru. Furthermore, given \(\sigma ^*\), we can construct a nemesis sequence on which fifo incurs strictly more faults than lru. The main idea is to rearrange the suffix of distance-\(l\) requests with \(l<k-1\) and some distance-\(l\) requests with \(l\ge k\) and to build a series of phases causing a high cost for fifo. This is made precise in Theorem 4, which separates the performance of fifo from that of lru. Thereafter we show similar results for fwf.
Lemma 9
Let \(\mathcal{C} = (c_0, \ldots , c_{p-1})\) be an arbitrary characteristic vector. Consider the request sequences defined by \(\mathcal{C}\) for which opt incurs the smallest number of page faults. Among these sequences there exists one in which all distance-\(l\) requests with \(0\le l \le k-1\) occur at the end of the sequence.
Proof
Given an arbitrary request sequence characterized by \(\mathcal{C}\) we perform two transformations. First, we repeatedly remove the distance-\(l\) requests with \(0\le l\le k-1\) so that the resulting sequence is characterized by \(\mathcal{C}_0 = (0, \ldots ,0, c_k,\ldots , c_{p-1})\). Then, for \(l=0,\ldots , k-1\), we append \(c_l\) distance-\(l\) requests at the end of the sequence. Neither of these transformations increases the number of page faults incurred by opt. This proves the lemma.
Transformation 1 Consider an arbitrary request sequence \(\sigma \) characterized by \(\mathcal{C} = (c_0, \ldots , c_{p-1})\). Let \(\sigma (t)\) be the first request in \(\sigma \) that is a distance-\(l'\) request, for some \(l'\) with \(0\le l' \le k-1\). We modify \(\sigma \) so that \(\sigma (t)\) is removed and, for all other distance-\(l\) requests in \(\sigma \), the value of l does not change. Hence the resulting sequence \(\sigma '\) is characterized by a vector that differs from \(\mathcal{C}\) only in that the \(l'\)th component is equal to \(c_{l'}-1\). As we will see, the optimum service cost of \(\sigma '\) is not higher than that of \(\sigma \). By repeating these modifications, for a total of \(\sum _{l=0}^{k-1} c_l\) times, we obtain a sequence characterized by \(\mathcal{C}_0 = (0, \ldots ,0, c_k,\ldots , c_{p-1})\) whose optimum service cost is not higher than that of \(\sigma \).
Again, let \(\sigma \) be the original request sequence and \(\sigma (t)\) be the first request in \(\sigma \) that forms a distance-\(l'\) request with \(0\le l'\le k-1\). Let \(x_0=\sigma (t)\) be the referenced page and \(\sigma (t')\) with \(t'<t\) be the most recent request to \(x_0\). If \(l'=0\), then \(\sigma '\) is obtained from \(\sigma \) by simply deleting request \(\sigma (t)\). This preserves the distances l in all remaining distance-\(l\) requests and \(\sigma '\) can be served in the same way as \(\sigma \). In this case we are done.
In the following we concentrate on the case \(l'>0\). Since \(\sigma (t)\) is the first distance-\(l\) request with \(0\le l\le k-1\), the pages requested by \(\sigma (t'+1), \ldots , \sigma (t-1)\) are pairwise distinct and hence \(t-1-t'=l'\). For \(i=1,\ldots , l'\), let \(x_i\) be the page requested by \(\sigma (t'+i)\). Thus the subsequence \(\sigma (t'), \ldots , \sigma (t)\) is equal to \(x_0,x_1, \ldots , x_{l'},x_0\). Also note that \(x_0\) is different from the pages \(x_1, \ldots , x_{l'}\). Now the sequence \(\sigma '\) is obtained from \(\sigma \) by deleting \(\sigma (t)\) and renaming requests \(\sigma (s)\) with \(s>t\) that are made to pages in \(\{x_0,\ldots , x_{l'}\}\) in a cyclic fashion. More specifically, requests to pages \(x_i\) are replaced by requests to \(x_{i-1}\), where \(0<i\le l'\), and requests to \(x_0\) are replaced by requests to \(x_{l'}\). Formally, the first \(t-1\) requests in \(\sigma '\) are identical to those in \(\sigma \). Request \(\sigma (t)\) does not occur in \(\sigma '\). Consider any \(s>t\). If \(\sigma (s)\) references \(x_i\), \(0\le i\le l'\), then the corresponding request \(\sigma '(s-1)\) references \(x_{(i-1)\bmod (l'+1)}\). Finally, if \(\sigma (s)\) references a page different from \(x_i\), for all \(0\le i\le l'\), then \(\sigma '(s-1)\) is identical to \(\sigma (s)\).
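The cyclic renaming can be sketched compactly (our helper; positions are 0-indexed, and `t_prime` marks the previous request to \(x_0\), so that `sigma[t_prime:t]` is \(x_0, x_1, \ldots , x_{l'}\)):

```python
def transform(sigma, t, t_prime):
    """Transformation 1 (sketch): delete the distance-l' request
    sigma[t] and cyclically rename all later requests to pages in
    {x_0, ..., x_{l'}}, where sigma[t_prime:t] = x_0, x_1, ..., x_{l'}.
    """
    x = sigma[t_prime:t]                              # x_0, ..., x_{l'}
    rename = {x[i]: x[i - 1] for i in range(len(x))}  # x_i -> x_{i-1}, x_0 -> x_{l'}
    head = sigma[:t]                                  # first t requests unchanged
    tail = [rename.get(page, page) for page in sigma[t + 1:]]
    return head + tail
```

For example, for \(\sigma = a,b,c,a,b,d,a\) the request \(\sigma (3)=a\) is a distance-2 request with \(t'=0\); removing it and renaming yields \(a,b,c,a,d,c\), and one checks that every surviving request keeps its distance.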
We prove that \(\sigma'\) is a request sequence characterized by a vector differing from \(\mathcal{C}\) only in that entry \(c_{l'}\) is replaced by \(c_{l'}-1\). Recall that \(\sigma\) and \(\sigma'\) are identical on the first \(t-1\) requests. Thus any distance-\(l\) request in this prefix of \(\sigma\) remains a distance-\(l\) request in \(\sigma'\). Therefore, it suffices to consider an arbitrary request \(\sigma(s)\) with \(s>t\). We show that if \(\sigma(s)\) is a distance-\(l\) request, \(0\le l\le p-1\), then the corresponding \(\sigma'(s-1)\) is also a distance-\(l\) request. We focus on the number of distinct pages from \(\{x_0,\ldots, x_{l'}\}\) referenced between \(\sigma(s)\) (resp. \(\sigma'(s-1)\)) and the most recent request to the page accessed by \(\sigma(s)\) (resp. \(\sigma'(s-1)\)). This is sufficient because requests to other pages do not change in the sequence modification described above.
We first study the case that the page requested by \(\sigma(s)\) was last referenced during \(\sigma(t'),\ldots, \sigma(t)\). In this case \(\sigma(s)= x_i\), for some \(0\le i \le l'\). First assume that \(i=0\), i.e. \(\sigma(s)=x_0\) and the most recent reference to \(x_0\) is \(\sigma(t)\). Moreover \(\sigma'(s-1)=x_{l'}\). If no pages from \(\{x_0,\ldots, x_{l'}\}\) are requested in the subsequence \(\sigma(t+1), \ldots, \sigma(s-1)\), then by the above argument we are done. So let \(x_{i_1}, \ldots, x_{i_m}\) with \(i_1< \cdots < i_m\) be the pages from \(\{x_0,\ldots, x_{l'}\}\) requested in \(\sigma(t+1), \ldots, \sigma(s-1)\). There holds \(i_1\ge 1\) because the most recent request to \(x_0\) is \(\sigma(t)\). In \(\sigma'\) the pages \(x_{i_1-1}, \ldots, x_{i_m-1}\) are referenced in the subsequence starting after \(\sigma'(t-1)=x_{l'}\) and ending before \(\sigma'(s-1)\). Since \(i_m-1< l'\), the number of distinct pages referenced between \(\sigma(t)\) and \(\sigma(s)\) is the same as the number of distinct pages referenced between \(\sigma'(t-1)\) and \(\sigma'(s-1)\). Next assume that \(0<i\le l'\). Here \(\sigma(s)=x_i\) and \(\sigma'(s-1)=x_{i-1}\). Between \(\sigma(t'+i)\) and \(\sigma(s)\) there are first requests to the pages \(x_{i+1}, \ldots, x_{l'}\) and \(x_0\). Furthermore, assume that the pages \(x_{i_1}, \ldots, x_{i_m}\) with \(i_1< \cdots < i_m\) from the set \(\{x_1,\ldots, x_{i-1}\}\) are referenced in this subsequence. Note that after request \(\sigma(t)\), a reference to a page from \(\{x_0, x_{i+1}, \ldots, x_{l'}\}\) turns into a reference to a page from \(\{x_i, \ldots, x_{l'}\}\) in \(\sigma'\). Thus between \(\sigma'(t'+i-1)=x_{i-1}\) and \(\sigma'(s-1)\) we find the pages \(x_i, \ldots, x_{l'}\) and \(x_{i_1-1}, \ldots, x_{i_m-1}\) when focusing on the set \(\{x_0,\ldots, x_{l'}\}\). Hence the total number of distinct pages referenced remains the same.
We next address the case that the page requested by \(\sigma(s)\) has been referenced last by some \(\sigma(s')\) with \(t<s'<s\). In this case the number of distinct pages referenced in the subsequence \(\sigma(s'+1),\ldots, \sigma(s-1)\) is the same as in \(\sigma'(s'),\ldots, \sigma'(s-2)\). This holds true because the pages from \(\{x_0,\ldots, x_{l'}\}\) were just renamed cyclically. Finally assume that the page \(y\) requested by \(\sigma(s)\) was last referenced by \(\sigma(s')\) with \(s'<t'\). In this case \(y\ne x_i\), for all \(i\) with \(0\le i \le l'\). We observe that the subsequences \(\sigma(s'+1),\ldots, \sigma(s-1)\) and \(\sigma'(s'+1),\ldots, \sigma'(s-2)\) reference all the pages from \(\{x_0,\ldots, x_{l'}\}\). Again the number of distinct pages in \(\sigma(s'+1),\ldots, \sigma(s-1)\) is the same as that in \(\sigma'(s'+1),\ldots, \sigma'(s-2)\).
It remains to analyze service cost. When serving \(\sigma\), on a page fault opt always evicts a page whose next request is farthest in the future. Hence when processing a request \(\sigma(t'+i)\), \(0\le i \le l'\), opt does not evict any of the pages \(x_{i+1}, \ldots, x_{l'}\) or \(x_0\) from fast memory because \(l'\le k-1\). In particular, \(\sigma(t)=x_0\) is not a page fault. Consider the following algorithm alg that serves \(\sigma'\) in the same way as opt serves \(\sigma\), with the following modification: After request \(\sigma'(t')=\sigma(t')\), whenever opt evicts \(x_i\), alg evicts \(x_{(i-1)\bmod (l'+1)}\), for \(i=0,\ldots, l'\). Recall that \(\sigma'\) is obtained from \(\sigma\) by replacing occurrences of \(x_i\) by \(x_{(i-1)\bmod (l'+1)}\) after request \(\sigma(t)\). This implies that after \(\sigma(t-1)=\sigma'(t-1)\) the total number of page faults incurred by opt on requests to \(x_i\) is equal to the number of page faults incurred by alg on references to \(x_{(i-1)\bmod (l'+1)}\), \(0\le i\le l'\). Hence both algorithms have identical service costs because up to request \(\sigma(t-1) = \sigma'(t-1)\) they incur the same number of faults.
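The optimal policy used throughout, evicting the page whose next request is farthest in the future, is Belady's algorithm. A minimal, unoptimized sketch (names are ours; for illustration only):

```python
def belady_faults(sigma, k):
    """Page faults of the optimal offline algorithm OPT on sequence sigma
    with a k-page fast memory, initially empty. On a fault with full memory,
    OPT evicts the page whose next request is farthest in the future
    (Belady's rule); pages never requested again are evicted first."""
    cache, faults = set(), 0
    for t, page in enumerate(sigma):
        if page in cache:
            continue                      # memory hit
        faults += 1
        if len(cache) == k:
            def next_use(q):
                # index of q's next request after t, or infinity if none
                for s in range(t + 1, len(sigma)):
                    if sigma[s] == q:
                        return s
                return float('inf')
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return faults
```

For instance, on the sequence a, b, c, a, b, c with \(k=2\), Belady's rule incurs 4 faults, whereas LRU and FIFO each incur 6.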
Transformation 2 Consider the sequence \(\sigma^*\) characterized by \(\mathcal{C}_0 = (0, \ldots, 0, c_k, \ldots, c_{p-1})\). For \(l=0,\ldots, k-1\), append \(c_l\) distance-\(l\) requests at the end of \(\sigma^*\). We observe that the last \(k\) requests in \(\sigma^*\) reference pairwise distinct pages and that each of the newly appended requests is to one of these \(k\) pages. Hence the addition of the new requests does not generate extra service cost because when opt serves the last \(k\) requests in \(\sigma^*\), it can always evict a page not referenced in the remainder of the sequence.\(\square\)
Theorem 3
For any \(\mathcal{C}\), there holds \(R_{\textsc {fifo}}(\mathcal{C})\ge R_{\textsc {lru}}(\mathcal{C})\).
Proof
Let \(\mathcal{C}=(c_0,\ldots, c_{p-1})\) be an arbitrary vector. Among the sequences characterized by \(\mathcal{C}\), consider those for which opt incurs the smallest number of page faults. By Lemma 9 there exists one in which all distance-\(l\) requests, \(0\le l \le k-1\), are issued at the end of the sequence. Fix a sequence \(\sigma^*\) with this property. There holds \(R_{\textsc{lru}}(\mathcal{C}) = \textsc{lru}(\sigma^*)/\textsc{opt}(\sigma^*)\) because on every sequence characterized by \(\mathcal{C}\), lru incurs the same number \(p+\sum_{l=k}^{p-1} c_l\) of page faults. In the following we show \(\textsc{fifo}(\sigma^*) \ge \textsc{lru}(\sigma^*)\). This establishes the theorem.
Let \(\sigma^*_1\) be the prefix of \(\sigma^*\) consisting of all distance-\(l\) requests, \(k\le l\le p-1\), and the requests to new pages. The remaining requests of \(\sigma^*\) are distance-\(l\) requests with \(0\le l \le k-1\). We will show that fifo incurs a page fault on each request of \(\sigma^*_1\), which consists of \(p+\sum_{l=k}^{p-1} c_l\) requests. Hence \(\textsc{fifo}(\sigma^*) \ge p+\sum_{l=k}^{p-1} c_l = \textsc{lru}(\sigma^*)\).
Any \(k+1\) consecutive requests in \(\sigma^*_1\) reference distinct pages because, for any distance-\(l\) request, there holds \(l\ge k\). On each of the first \(k\) requests in \(\sigma^*_1\) fifo incurs a page fault because the initial fast memory is empty. After request \(\sigma^*_1(k)\) fifo has the pages referenced by \(\sigma^*_1(1), \ldots, \sigma^*_1(k)\) in fast memory. We show inductively that, for any \(t>k\), fifo has a page fault on \(\sigma^*_1(t)\) and after the service of this request the algorithm has the pages accessed by \(\sigma^*_1(t-k+1), \ldots, \sigma^*_1(t)\) in its fast memory. So consider any \(t>k\). If \(t=k+1\), then before \(\sigma^*_1(t)\) fifo has pages \(\sigma^*_1(1), \ldots, \sigma^*_1(k) = \sigma^*_1(t-k), \ldots, \sigma^*_1(t-1)\) in fast memory. If \(t>k+1\), then by induction hypothesis fifo has the pages \(\sigma^*_1(t-k), \ldots, \sigma^*_1(t-1)\) in fast memory before reference \(\sigma^*_1(t)\). Since any \(k+1\) consecutive requests in \(\sigma^*_1\) reference distinct pages, the page requested by \(\sigma^*_1(t)\) is not in fifo's fast memory and a page fault occurs. As any prior request \(\sigma^*_1(s)\) with \(s<t\) has been a page fault, fifo will evict the page requested by \(\sigma^*_1(t-k)\) when serving \(\sigma^*_1(t)\).\(\square\)
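The inductive argument can be checked experimentally with a small FIFO simulator. On any sequence in which every \(k+1\) consecutive requests are distinct, for example simply cycling through \(k+1\) pages (a hypothetical illustration, not the construction from the proof), FIFO faults on every single request:

```python
from collections import deque

def fifo_faults(sigma, k):
    """Page faults of FIFO with a k-page fast memory, initially empty.
    On a fault with full memory, the page resident longest is evicted."""
    queue, resident, faults = deque(), set(), 0
    for page in sigma:
        if page in resident:
            continue                              # memory hit
        faults += 1
        if len(queue) == k:
            resident.discard(queue.popleft())     # evict the oldest resident page
        queue.append(page)
        resident.add(page)
    return faults
```

With \(k=3\) and the cyclic sequence \(0,1,2,3,0,1,2,3,\ldots\) of length 20, all 20 requests are faults, matching the argument above.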
The next theorem sharply separates lru from fifo. Observe that, for any \(\mathcal{C}=(c_0,\ldots, c_{p-1})\), lru's competitiveness can be expressed as \(R_{\textsc{lru}}(\mathcal{C}) = \textsc{lru}(\mathcal{C})/\textsc{opt}(\mathcal{C})\), where \(\textsc{lru}(\mathcal{C}) = p+\sum_{l=k}^{p-1} c_l\) is the number of faults incurred by lru on every input characterized by \(\mathcal{C}\) and \(\textsc{opt}(\mathcal{C})\) denotes the minimum number of page faults required to serve any request sequence defined by \(\mathcal{C}\). We use this notation in the following. Theorem 4 presents a lower bound on \(R_{\textsc{fifo}}(\mathcal{C})\), given \(R_{\textsc{lru}}(\mathcal{C}) = \textsc{lru}(\mathcal{C})/\textsc{opt}(\mathcal{C})\), for any \(\mathcal{C}\). In that lower bound \(c\) depends on the minimum \(c_l\), where \(1\le l \le k-1\), and roughly \(\sum_{l=k}^{p-1} c_l\). For increasing \(c\), the competitiveness of fifo can be made arbitrarily close to \((k-1)/(1-1/k) = k\). In Section 3 we analyzed vectors \(\mathcal{C} = (c_0,\ldots, c_{k-1}, 0,\ldots, 0, c_{p-1})\) and showed that lru's competitiveness is constant, for sufficiently large \(c_{p-1}\), provided that \(p\) is not too close to \(k\). Hence, for large \(c_1, \ldots, c_{k-1}\) and \(c_{p-1}\), the competitiveness of lru is a small constant while that of fifo is close to \(k\). We remark that, in general, \(c\) cannot be larger than \(\textsc{lru}(\mathcal{C})\) but this is sufficient to establish a lower bound of at least \(k/2\) on fifo's competitiveness.
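The quantity \(\textsc{lru}(\mathcal{C})\) can be computed from the vector alone. A sketch (assuming, as an illustration, that \(\mathcal{C}\) is given as a list of length \(p\)):

```python
def lru_cost(C, k):
    """LRU's fault count on every sequence characterized by C = (c_0, ..., c_{p-1}):
    p faults on the requests to new pages plus one fault per distance-l request
    with l >= k. Distance-l requests with l <= k-1 are always memory hits for
    LRU, since the k most recently used pages reside in fast memory."""
    p = len(C)
    return p + sum(C[l] for l in range(k, p))
```

For example, with \(k=3\) and \(\mathcal{C}=(2,1,0,4,3)\), LRU incurs \(5 + 4 + 3 = 12\) faults on every sequence characterized by this vector.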
Theorem 4
Let \(\mathcal{C}=(c_0,\ldots, c_{p-1})\) be any vector. Let \(c_{\min} = \min_{1\le l \le k-1} c_l\) and \(c= \min\{\lfloor c_{\min}/2 \rfloor, p-k + \sum_{l=k}^{p-1} c_l\}\). Then
\[
R_{\textsc{fifo}}(\mathcal{C}) \ge \frac{\textsc{lru}(\mathcal{C}) + c(k-1)}{\textsc{opt}(\mathcal{C}) + c(1-1/k) + 1}.
\]
Proof
Among the request sequences characterized by \(\mathcal{C}\), let \(\sigma^*\) be one for which opt incurs the minimum number \(\textsc{opt}(\mathcal{C})\) of page faults and in which all distance-\(l\) requests, \(0\le l \le k-1\), occur at the end of the sequence. Lemma 9 ensures the existence of such a request sequence. Given \(\sigma^*\), we construct a nemesis sequence \(\sigma\) for fifo in three steps.
(1) First remove all distance-\(l\) requests with \(0\le l\le k-1\) from \(\sigma^*\). In this truncated sequence remove the last \(c\) requests, which are requests to new pages or distance-\(l\) requests with \(k\le l \le p-1\). Let \(\sigma_1\) denote the resulting request sequence. (2) Append to \(\sigma_1\) a sequence of \(c\) phases \(P(1),\ldots, P(c)\). Any \(P(i)\) consists of two parts. In the first part, for \(l=1, \ldots, k-1\) and in this specific order, a distance-\(l\) request is issued. The second part of the phase starts with a request to a new page if fewer than \(p\) distinct pages have been referenced so far. Otherwise it starts with a distance-\(l\) request, where \(l\ge k\) is an index such that the current request sequence contains fewer than \(c_l\) distance-\(l\) requests. Then again, for increasing \(l=1, \ldots, k-1\), a distance-\(l\) request is issued. Let \(\sigma_2\) denote the request sequence obtained after this step. (3) Append the missing distance-\(l\) requests, where \(0\le l \le k-1\), at the end of \(\sigma_2\). Specifically, while there exists an \(l\) with \(0\le l \le k-1\) such that the current request sequence contains fewer than \(c_l\) distance-\(l\) requests, issue such a request. Let \(\sigma_3=\sigma\) be the resulting request sequence.
We state some properties of the above construction. Sequence \(\sigma_1\) consists of at least \(k\) requests because after the removal of the distance-\(l\) requests with \(0\le l\le k-1\) from \(\sigma^*\), exactly \(p+ \sum_{l=k}^{p-1} c_l\) requests remain and \(c\le p-k + \sum_{l=k}^{p-1} c_l\). In Step (2) each \(P(i)\) contains two distance-\(l\) requests, for any \(1\le l \le k-1\), as well as one request to a new page or a distance-\(l\) request with \(k\le l \le p-1\). Thus the choice of \(c\), where \(c\le \lfloor c_{\min}/2 \rfloor\), as well as the request removals of Step (1) ensure that the construction of \(P(1), \ldots, P(c)\) is well-defined. Sequence \(\sigma_2\) contains all the \(p + \sum_{l=k}^{p-1} c_l\) requests to new pages and distance-\(l\) requests with \(k\le l \le p-1\). Hence the final request sequence \(\sigma_3 = \sigma\) is an input characterized by \(\mathcal{C}\).
In the following we first prove that \(\textsc{fifo}(\sigma) \ge \textsc{lru}(\mathcal{C}) + c(k-1)\). Consider the prefix of \(\sigma^*\) consisting of the requests to new pages and the distance-\(l\) requests with \(k\le l\le p-1\). In the proof of Theorem 3 we showed that fifo incurs a page fault on each request of this prefix sequence. Hence fifo has a page fault on each request of \(\sigma_1\), which consists of \(p + \sum_{l=k}^{p-1} c_l - c\) requests. We next prove that fifo incurs \(k\) page faults in each \(P(i)\), \(1\le i \le c\). This implies \(\textsc{fifo}(\sigma) \ge p + \sum_{l=k}^{p-1} c_l - c + c\cdot k = \textsc{lru}(\mathcal{C}) + c(k-1)\).
As for fifo’s cost in \(P(1), \ldots, P(c)\) we show the following statement: In the second part of each \(P(i)\), \(1\le i \le c\), fifo incurs a page fault on each of the \(k\) requests; at the end of \(P(i)\) fifo has the pages referenced by these \(k\) requests in fast memory. The proof is by induction on \(i\). Consider any \(P(i)\) and let \(x_0, \ldots, x_{k-1}\) be the pages referenced by the last \(k\) requests preceding \(P(i)\). We first observe that these pages are pairwise distinct. If \(i=1\), then \(x_0, \ldots, x_{k-1}\) are the pages accessed by the last \(k\) requests of \(\sigma_1\). All these requests are references to new pages or distance-\(l\) requests with \(l\ge k\). Hence the desired property holds. If \(i>1\), then \(x_0, \ldots, x_{k-1}\) are the pages referenced by the \(k\) requests in the second part of \(P(i-1)\). The \(j\)-th of these references is a distance-\((j-1)\) request, for \(j=2, \ldots, k\), so that no page can occur twice. This implies that \(P(i)\) has the form \(x_{k-2}, \ldots, x_0, y, x_0, \ldots, x_{k-2}\); cf. the construction of the phases in Step (2). Here \(y\) is the page accessed by the first request in the second part of \(P(i)\). As this is a request to a new page or a distance-\(l\) request with \(l\ge k\), page \(y\) is different from \(x_0, \ldots, x_{k-1}\).
We next argue that at the beginning of \(P(i)\) fifo has pages \(x_0, \ldots, x_{k-1}\) in fast memory. If \(i=1\), then the \(k\) requests before \(P(1)\), which reference the pages \(x_0, \ldots, x_{k-1}\), are a suffix of \(\sigma_1\). Recall that fifo has a fault on each request in \(\sigma_1\). Hence when the fault on \(x_j\) occurs, fifo evicts a page different from \(x_0, \ldots, x_{j-1}\), for \(j=1, \ldots, k-1\). Thus \(x_0, \ldots, x_{k-1}\) are in fast memory at the beginning of \(P(i)\). If \(i>1\), then the property follows from the induction hypothesis. Given this fact about fifo’s fast memory content, the algorithm does not incur any page fault in the first part of \(P(i)\). On the request to \(y\) fifo has a page fault and evicts \(x_0\). In the sequel, for \(j=0, \ldots, k-2\), fifo has a page fault on the reference to \(x_j\) and evicts \(x_{j+1}\) so that at the end of \(P(i)\), pages \(y, x_0, \ldots, x_{k-2}\) reside in fast memory.
It remains to analyze opt’s cost on \(\sigma\). Consider an algorithm alg that serves \(\sigma\) as follows. On \(\sigma_1\) it performs the same page replacements as opt does when serving \(\sigma^*\), except for the last \(k\) requests of \(\sigma_1\). On these references, whenever there is a page fault, alg evicts a page not accessed during this suffix of \(k\) requests. Thus at the beginning of \(P(1)\) the pages referenced by the last \(k\) requests prior to \(P(1)\) reside in alg’s fast memory. The algorithm can then serve \(P(1)\) and any subsequent phase \(P(i)\) so that a page fault occurs only on the first request of the second part of the phase. More specifically, when serving this request, alg evicts the page referenced last before \(P(i)\), which is not needed in the remainder of the phase. Hence at the end of a phase \(P(i)\) alg has the \(k\) pages referenced in the second part of \(P(i)\) in fast memory. Using this fact for \(i=c\), we obtain that all the distance-\(l\) requests with \(0\le l \le k-1\) issued after \(P(c)\) are memory hits. A final observation is that on the first \(p+ \sum_{l=k}^{p-1} c_l\) requests of the initial sequence \(\sigma^*\), opt incurs at least one page fault on any \(k\) consecutive requests: After opt has served any request \(\sigma(t)\), it cannot have all the pages accessed by \(\sigma(t+1), \ldots, \sigma(t+k)\) in fast memory because \(\sigma(t), \ldots, \sigma(t+k)\) reference pairwise distinct pages. Thus on the \(c\) requests that are removed from the already truncated sequence in Step (1) opt incurs at least \(\lfloor c/k \rfloor\) page faults. We conclude that \(\textsc{opt}(\sigma) \le \textsc{opt}(\mathcal{C}) + c - \lfloor c/k \rfloor \le \textsc{opt}(\mathcal{C}) + c(1-1/k) +1\).\(\square\)
Next we address fwf and develop results corresponding to those for fifo. In the separation bound of Theorem 6 the vector entries \(c_1,\ldots, c_{k-1}\) may be smaller by a factor of 2 compared to those in Theorem 4.
Theorem 5
For any \(\mathcal{C}\), there holds \(R_{\textsc {fwf}}(\mathcal{C})\ge R_{\textsc {lru}}(\mathcal{C})\).
Proof
The proof is very similar to that of Theorem 3. Given \(\mathcal{C} = (c_0,\ldots, c_{p-1})\), let \(\sigma^*\) be a request sequence for which opt incurs the minimum number of page faults, among sequences characterized by \(\mathcal{C}\), and in which the distance-\(l\) requests with \(0\le l \le k-1\) all occur at the end of the sequence. We will show \(\textsc{fwf}(\sigma^*) \ge \textsc{lru}(\sigma^*)\). Let \(\sigma_1^*\) be the prefix of \(\sigma^*\) consisting of the requests to new pages and the distance-\(l\) requests, where \(l\ge k\). It suffices to show that fwf has a page fault on each request of \(\sigma_1^*\). The first \(k\) requests in \(\sigma_1^*\) are references to new pages. Thereafter fwf evicts all pages from fast memory on request \(\sigma_1^*(ik+1)\), for any \(i\ge 1\), because the referenced page is different from those requested by the \(k\) previous requests \(\sigma_1^*((i-1)k+1),\ldots, \sigma_1^*(ik)\). Obviously, \(\sigma_1^*(ik+1)\) as well as the \(k-1\) subsequent requests are page faults.\(\square\)
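FWF's flushing behavior, which drives the argument above, can be sketched directly (a minimal illustration, names are ours):

```python
def fwf_faults(sigma, k):
    """Page faults of FWF (Flush-When-Full) with a k-page fast memory:
    on a fault while k pages are resident, the entire fast memory is
    flushed before the requested page is brought in."""
    resident, faults = set(), 0
    for page in sigma:
        if page in resident:
            continue                 # memory hit
        if len(resident) == k:
            resident.clear()         # flush the whole fast memory
        resident.add(page)
        faults += 1
    return faults
```

For instance, with \(k=2\) the sequence a, b, c, b costs FWF four faults, since the flush on c also evicts b, whereas LRU would incur only three.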
Theorem 6
Let \(\mathcal{C}=(c_0,\ldots, c_{p-1})\) be any vector. Let \(c_{\min} = \min_{1\le l \le k-1} c_l\) and \(c= \min\{c_{\min}, p-k + \sum_{l=k}^{p-1} c_l\}\). Then
\[
R_{\textsc{fwf}}(\mathcal{C}) \ge \frac{\textsc{lru}(\mathcal{C}) + c(k-1)}{\textsc{opt}(\mathcal{C}) + c(1-1/k) + 1}.
\]
Proof of Theorem 6
The basic structure of the proof is very similar to that of Theorem 4. Considering the sequences defined by \(\mathcal{C}\), we fix an input \(\sigma^*\) for which opt incurs the minimum number of page faults and in which the distance-\(l\) requests, \(0\le l \le k-1\), occur at the end of the sequence. We transform \(\sigma^*\) into a nemesis sequence \(\sigma\) for fwf. The transformation is similar to that described in the proof of Theorem 4, except that the construction of the phases in Step (2) is different.
(1) Remove all distance-\(l\) requests, \(0\le l\le k-1\), from \(\sigma^*\). Moreover, remove the last \(c\) requests from this truncated sequence. Let \(\sigma_1\) be the resulting sequence. (2) Append phases \(P(1), \ldots, P(c)\) to \(\sigma_1\). More specifically, suppose that at the end of \(\sigma_1\) fwf has \(j\) pages in fast memory, \(1\le j\le k\). Then each \(P(i)\), \(1\le i \le c\), is of the following form. First, for \(l=j, \ldots, k-1\), a distance-\(l\) request is issued. Then a request to a new page or a distance-\(l\) request with \(l\ge k\) is placed. Finally, for increasing \(l=1, \ldots, j-1\), a distance-\(l\) request is issued. Let \(\sigma_2\) denote the request sequence obtained after this step. (3) Append the missing distance-\(l\) requests with \(0\le l \le k-1\) to \(\sigma_2\) and let \(\sigma_3=\sigma\) be the final request sequence.
We analyze fwf’s cost. The proof of Theorem 5 implies that fwf incurs a page fault on each request of \(\sigma_1\). Let \(z_1,\ldots, z_k\) be the pages referenced by the last \(k\) requests of \(\sigma_1\). These pages are pairwise distinct. By assumption fwf has \(j\) pages in fast memory at the end of \(\sigma_1\), where \(1\le j \le k\). These must be \(z_{k-j+1}, \ldots, z_k\). Thus the first phase \(P(1)\) starts with requests to pages \(z_{k-j}, \ldots, z_1\) in this order as these are distance-\(l\) requests, for \(l=j, \ldots, k-1\). Each of these requests is a page fault for fwf. In each phase \(P(i)\), \(1\le i \le c\), there exists one request that is made to a new page or forms a distance-\(l\) request with \(l\ge k\). Let \(y_i\) denote the page specified by this request in \(P(i)\). Consider the first such page \(y_1\). Before the respective request in \(P(1)\) the page sequence \(z_{k-j+1}, \ldots, z_k, z_{k-j}, \ldots, z_1\) is requested and all of these pages reside in fwf’s fast memory when the request to \(y_1\) has to be served. Again, the pages \(z_{k-j+1}, \ldots, z_k, z_{k-j}, \ldots, z_1\) are pairwise distinct and \(y_1\) differs from all of them. Thus on the request to \(y_1\) fwf flushes its fast memory. Consider a pair of requests to \(y_i\) and \(y_{i+1}\) issued in \(P(i)\) and \(P(i+1)\), respectively, where \(1\le i <c\). Let \(x_1, \ldots, x_{k-1}\) be the pages referenced by the \(k-1\) requests between \(y_i\) and \(y_{i+1}\) in \(P(i)\) and \(P(i+1)\). These pages are pairwise distinct and differ from \(y_i\) as they are distance-\(l\) requests, for \(l=1, \ldots, k-1\). Moreover, \(y_{i+1}\) is different from these pages and also different from \(y_i\). Thus, if fwf flushes its fast memory on the request to \(y_i\), all requests to \(x_1, \ldots, x_{k-1}\) are page faults and fwf again evicts all pages from fast memory on the request to \(y_{i+1}\).
As all requests of \(P(c)\) issued after \(y_c\) are also page faults, the total number of faults incurred by fwf is \(\textsc{fwf}(\sigma) \ge p + \sum_{l=k}^{p-1} c_l - c + c\cdot k = \textsc{lru}(\mathcal{C}) + c(k-1)\).
We finally evaluate opt’s cost on \(\sigma\). Let again \(z_1, \ldots, z_k\) denote the \(k\) pages referenced at the end of \(\sigma_1\). Consider the algorithm alg that serves \(\sigma_1\) as opt serves this prefix of the original sequence \(\sigma^*\), with the following exception: On the last \(k\) requests of \(\sigma_1\), whenever there is a fault on a request to a page \(z_i\), \(1\le i \le k\), evict a page different from these \(k\) pages. Hence at the end of \(\sigma_1\), alg has the pages \(z_1, \ldots, z_k\) in fast memory and the first \(k-j\) requests of \(P(1)\) can be served without any page fault. Algorithm alg can then serve the \(c\) phases so that a page fault only occurs on the requests to \(y_i\), \(1\le i\le c\). This holds true because if prior to the request to \(y_i\) in \(P(i)\) the subsequence of the last \(k-1\) references is \(x_1, \ldots, x_{k-1}\), then the \(k-1\) requests after \(y_i\) access \(x_{k-1}, \ldots, x_1\) in this order. This implies that within the phases, apart from the pages \(y_1, \ldots, y_c\), only the pages \(z_1, \ldots, z_{k-j}, z_{k-j+2},\ldots, z_k\) are referenced, which alg always keeps in fast memory. Hence on the request to \(y_1\), page \(z_{k-j+1}\) is evicted. On the fault on \(y_{i+1}\), page \(y_i\) is evicted, \(1\le i <c\). This also implies that all the final distance-\(l\) requests, \(0\le l \le k-1\), can be served without any page fault. In conclusion, as in the proof of Theorem 4, \(\textsc{opt}(\sigma) \le \textsc{opt}(\mathcal{C}) + c - \lfloor c/k \rfloor \le \textsc{opt}(\mathcal{C}) + c(1-1/k) +1\).\(\square\)
Experiments
We report on an experimental study we have performed with reference traces from the benchmark library [15]. This test suite was specifically designed to evaluate the performance of memory systems. Details can be found in the SIGMETRICS paper [16]. The trace library consists of 15 files that contain sequential logs of memory locations used by various programs. Standard applications from the Linux and the Windows NT operating systems were executed. Table 1 shows a list of the files.
In a first step, for each trace, we have extracted the underlying characteristic vector, simply by counting the number of distance-\(l\) requests, for each \(l\ge 0\), in it. Uniformly over all files, in each resulting vector, the entries basically form a nonincreasing sequence, with a large majority of the requests representing distance-\(l\) requests, for small values of \(l\). Once again this confirms the fact that real-world sequences exhibit a high degree of locality. Figures 3 and 4 depict the extracted characteristic vectors for four representative files, namely gcc, netscape, lindsay and winword. Due to space considerations we do not show the results for all the 15 files. Note that the values of the vector entries are shown on a logarithmic scale. Figure 4 gives the vector of lindsay that contains a few positive entries \(c_l\), for large \(l\), after a long preceding subsequence of zero-valued entries. This pattern was referred to in the calculations of Section 3.
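Extracting the characteristic vector from a trace is straightforward to sketch; the following naive version (quadratic in the trace length, sufficient for illustration; names are ours) counts, for each repeated request, the number of distinct pages referenced since the previous request to the same page:

```python
from collections import defaultdict

def characteristic_vector(trace):
    """Counts distance-l requests in a trace: for each request to a page
    referenced before, l is the number of distinct pages requested since
    that page's most recent request. First requests to pages (the p "new
    page" requests) are not counted."""
    counts = defaultdict(int)
    last = {}                        # page -> index of its most recent request
    for t, page in enumerate(trace):
        if page in last:
            l = len(set(trace[last[page] + 1:t]))
            counts[l] += 1
        last[page] = t
    return dict(counts)
```

For example, the trace a, b, a, a contains one distance-1 request (the second a) and one distance-0 request (the third a).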
In a second step we have compared, for each trace/request sequence \(\sigma\), the optimum number of page faults \(\textsc{opt}(\sigma)\) to our bounds given in Theorems 1 and 2. For all the traces the difference is small. Hence our lower bound on \(\textsc{opt}(\sigma)\) in Theorem 1 quite accurately predicts the optimum cost. Furthermore, the additive expression of \(2(\lambda - k+1)\) in the bound of Theorem 2 is not critical. We point out that our lower bound on \(\textsc{opt}(\sigma)\) cannot match the true service cost because the bound holds for every request sequence specified by a characteristic vector \(\mathcal{C}\). The given trace \(\sigma\), in general, is not a sequence that can be served with the minimum number of faults, among inputs characterized by the underlying \(\mathcal{C}\). Additionally, we have evaluated the lower bound on \(\textsc{opt}(\sigma)\) given by Panagiotou and Souza [20], cf. inequality (1). In the experiments our new lower bound developed in this paper is always significantly better. The gap increases as the fast memory size \(k\) increases. It turns out that, for larger values of \(k\), the lower bound by Panagiotou and Souza is quite weak. One could slightly improve it by considering the maximum of \(p\) and the expression of (1). However, this only resolves cases where there is no need for a sophisticated lower bound. Figures 5 and 6 show the plots for the four sample traces. Even for small values of \(k\), our new lower bound improves upon that of Panagiotou and Souza by 25–100%.
Finally we have compared, for each trace \(\sigma\) and underlying \(\mathcal{C}\), the upper and lower bounds on lru’s competitiveness \(R_{\textsc{lru}}(\mathcal{C})\) (see Corollary 1) to the experimentally observed competitiveness for \(\sigma\). For all the files, our bounds give small constant competitive factors that are typically in the range [1, 4]. Our upper bound on \(R_{\textsc{lru}}(\mathcal{C})\) is usually at most 2.5 times the experimentally observed competitiveness. Again, our bounds cannot exactly match the latter competitiveness since \(R_{\textsc{lru}}(\mathcal{C})\) is the maximum ratio of \(\textsc{lru}(\sigma)/\textsc{opt}(\sigma)\), over sequences \(\sigma\) characterized by \(\mathcal{C}\). A trace at hand, in general, is not such a worst-case sequence. Interestingly, for varying \(k\), our bounds exhibit the same overall behavior as the experimentally observed competitiveness. Thus they correctly describe the general qualitative behavior of \(R_{\textsc{lru}}(\mathcal{C})\), depending on \(k\). We refer the reader to Figs. 7 and 8, which depict again the results for our four selected sample traces. An exception in the trace library is the file lindsay. For a few values of \(k\), the upper and lower bounds on lru’s competitiveness are as high as 40. For these \(k\), there are only a few positive vector entries \(c_l\), with \(l\ge k\), in the characteristic vector \(\mathcal{C}\). These outliers cause high competitive ratios in the theoretical bounds. Indeed there exist sequences characterized by \(\mathcal{C}\) for which the performance factors depend linearly on \(k\).
References
Albers, S., Favrholdt, L.M., Giel, O.: On paging with locality of reference. J. Comput. Syst. Sci. 70(2), 145–175 (2005)
Angelopoulos, S., Dorrigiv, R., López-Ortiz, A.: On the separation and equivalence of paging strategies. In: Proceedings of 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 229–237 (2007)
Angelopoulos, S., Schweitzer, P.: Paging and list update under bijective analysis. J. ACM 60(2), 7 (2013)
Becchetti, L.: Modeling locality: a probabilistic analysis of LRU and FWF. In: Proceedings of 12th Annual European Symposium on Algorithms (ESA), Springer LNCS, vol. 3221, pp. 98–109 (2004)
Belady, L.A.: A study of replacement algorithms for virtual storage computers. IBM Syst. J. 5, 78–101 (1966)
Ben-David, S., Borodin, A.: A new measure for the study of online algorithms. Algorithmica 11(1), 73–91 (1994)
Boyar, J., Favrholdt, L.M., Larsen, K.S.: The relative worst-order ratio applied to paging. J. Comput. Syst. Sci. 73(5), 818–843 (2007)
Boyar, J., Gupta, S., Larsen, K.S.: Access graphs results for LRU versus FIFO under relative worst order analysis. In: Proceedings of 13th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT), Springer LNCS, vol. 7357, pp. 328–339 (2012)
Boyar, J., Gupta, S., Larsen, K.S.: Relative interval analysis of paging algorithms on access graphs. In: Proceedings of 13th International Symposium on Algorithms and Data Structures (WADS), Springer LNCS, vol. 8037, pp. 195–206 (2013)
Borodin, A., Irani, S., Raghavan, P., Schieber, B.: Competitive paging with locality of reference. J. Comput. Syst. Sci. 50, 244–258 (1995)
Dorrigiv, R., Ehmsen, M.R., López-Ortiz, A.: Parameterized analysis of paging and list update algorithms. Algorithmica 71(2), 330–353 (2015)
Dorrigiv, R., López-Ortiz, A.: On developing new models, with paging as a case study. SIGACT News 40(4), 98–123 (2009)
Chrobak, M., Noga, J.: LRU is better than FIFO. Algorithmica 23(2), 180–185 (1999)
Dorrigiv, R., López-Ortiz, A., Munro, J.I.: On the relative dominance of paging algorithms. Theor. Comput. Sci. 410(38–40), 3694–3701 (2009)
Kaplan, S.: Trace reduction for virtual memory simulation. Benchmark library. https://www3.amherst.edu/~sfkaplan/research/tracereduction/index.html
Kaplan, S.F., Smaragdakis, Y., Wilson, P.R.: Trace reduction for virtual memory simulations. In: Proceedings of International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), pp. 47–58 (1999)
Karlin, A., Phillips, S., Raghavan, P.: Markov paging. SIAM J. Comput. 30(3), 906–922 (2000)
Koutsoupias, E., Papadimitriou, C.H.: Beyond competitive analysis. SIAM J. Comput. 30(1), 300–317 (2000)
Irani, S., Karlin, A.R., Phillips, S.: Strongly competitive algorithms for paging with locality of reference. SIAM J. Comput. 25, 477–497 (1996)
Panagiotou, K., Souza, A.: On adequate performance measures for paging. In: Proceedings of 38th Annual ACM Symposium on Theory of Computing (STOC), pp. 487–496 (2006)
Sleator, D.D., Tarjan, R.E.: Amortized efficiency of list update and paging rules. Commun. ACM 28, 202–208 (1985)
Young, N.E.: The \(k\)-server dual and loose competitiveness for paging. Algorithmica 11, 525–541 (1994)
Additional information
Susanne Albers: Work supported by the European Research Council, Grant Agreement No. 691672, and by the German Research Foundation, Project Al 464/9-1.
Appendix
Proof of Proposition 1
Let \(d=1+ {k-1\over k} - {k-1 \over p-1}\), which is the denominator in the first expression on the right-hand side of (1). Since \(p>k\) there holds \(d\ge 1\). If \(f(p-1,c_{p-1}) \le g(p-1,c_{p-1})\), then \(\lambda =p-1\) and \(c_{\lambda}^* = c_{p-1}\). In this case the proof is obvious. Suppose that \(f(p-1,c_{p-1}) > g(p-1,c_{p-1})\). Our goal is to prove that \({1 \over d} \sum_{l=k}^{p-1} c_l {l-k+1\over l}\) is smaller than \(f(\lambda,c_{\lambda}^*) = g(\lambda,c_{\lambda}^*)\). There holds
where (7) follows from the fact that \(d\ge 1\) and \(k\ge 2\). Finally, (8) follows from \({1\over d}\left({l-k+1 \over l}+{k-1 \over k}\right)\le 1\), for \(l\le p-1\).\(\square\)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Cite this article
Albers, S., Frascaria, D.: Quantifying Competitiveness in Paging with Locality of Reference. Algorithmica 80, 3563–3596 (2018). https://doi.org/10.1007/s00453-018-0406-9
Keywords
 Online algorithm
 Optimal offline algorithm
 Analysis of algorithms
 Experimental study