Adaptive k-center and diameter estimation in sliding windows

In this paper we present novel streaming algorithms for the k-center and diameter estimation problems in general metric spaces under the sliding window model. The key idea behind our algorithms is to maintain a small coreset which, at any time, allows us to compute a solution for the current window whose quality can be made arbitrarily close to that of the best solution attainable by running a polynomial-time sequential algorithm on the entire window. Remarkably, the size of our coresets is independent of the window length and can be upper bounded by a function of the target number of centers (for the k-center problem), of the desired accuracy, and of the characteristics of the current window, namely its doubling dimension and aspect ratio. One of the major strengths of our algorithms is that they adapt obliviously to these two latter characteristics. We also provide experimental evidence of the practical viability of the algorithms and of their superiority over the current state of the art.


Introduction
In several modern application domains (e.g., social networks, online finance, online transaction systems), data are generated in a continuous fashion, and at such a high rate that their processing requires on-the-fly computation which can afford to maintain only a small portion of the data in memory. This computational scenario is captured by the well-known streaming model, which has received ever-increasing attention in the literature over the last two decades [26,30,31,35,38]. In some prominent applications, it is also important that older data in the stream (i.e., those outside a sliding window containing the N most recent data items) be considered "stale" and thus be disregarded in the computation. As an example, consider the problem of detecting fraudulent credit card use, where it is essential to detect a change in the recent spending patterns. In the sequential setting, the k-center problem admits a 2-approximation algorithm and, for any ε > 0, it is not (2 − ε)-approximable unless P=NP [20].
The problem has also been studied in the fully dynamic setting, where the input pointset changes dynamically through insertions of new points or deletions of existing points, and, at any time, the algorithm must be able to return an accurate solution for the current pointset in a time substantially smaller than the time required to compute the solution from scratch. In [13] the authors developed a (2 + ε)-approximation algorithm for the fully dynamic k-center problem on general metric spaces, with update time independent of the input size. For a given query point x, the algorithm can establish whether x is a center in constant time, and return the cluster of x (i.e., all points whose closest center is x) in time proportional to the cluster size. These results have been recently improved in [21] for spaces of constant doubling dimension using a navigating net data structure, and in [5] for general metric spaces using a reduction to the fully-dynamic maximal independent set problem. However, we remark that these fully dynamic algorithms store in memory a number of points linear in the size of the set of interest, as they rather target good query time/approximation tradeoffs, irrespective of the memory usage. For this reason they cannot be utilized in the sliding window model, where the size of the working memory, which is the premium resource to be optimized, must be substantially smaller than (and possibly independent of) the size of the set of interest.
In the standard streaming model, McCutchen and Khuller [33] and, independently, Guha [23], presented algorithms which maintain a (2 + ε)-approximation to the k-center problem for the entire set of points processed from the beginning of the stream, using working memory polynomial in k and 1/ε. In the more restrictive sliding window model, which is the focus of this paper, Cohen-Addad et al. [14] presented an algorithm which is able to compute a (6 + ε)-approximation to the k-center problem for the current window from only O(k ε^{-1} log α) points stored in the working memory, where α is the aspect ratio of the entire stream, that is, the ratio between the maximum and minimum distance between any two points of the stream. At any time, the algorithm requires O(k ε^{-1} log α) update time to handle a new point arriving from the stream, and O(k² ε^{-1} log α) time to return the approximate solution for the current window. One of the practical limitations of this algorithm is that it assumes prior knowledge of the aspect ratio α; thus the algorithm is inapplicable for unknown or unbounded values of α. In the same paper, the authors also show that, for general metric spaces, any algorithm for the 2-center problem that achieves an approximation ratio less than 4 requires working memory of size Ω(N^{1/3}), where N is the window length. In a recent unpublished manuscript, Kim [28] improved the result of [14] for Euclidean spaces, by presenting an algorithm which attains a (2 + 2√3 + ε)-approximation through a coreset-based approach. The algorithm makes crucial use of specific properties of Euclidean spaces, hence it is not immediately portable to general spaces. The author also claims that a (2 + ε)-approximation is achievable for constant-dimensional Euclidean spaces, and that the algorithm can be made oblivious to the aspect ratio α. However, due to the missing details, it is not immediate to fully reconstruct these stated improvements.
For what concerns diameter estimation, it is shown in [23] that, in the streaming setting, approximating the diameter within any factor strictly less than 2 for general metrics requires working memory at least proportional to the size of the stream. For Euclidean spaces of low dimension d, there is a streaming (1 + ε)-approximation algorithm requiring O((1/ε)^{(d−1)/2}) working memory [2], while for streams of higher dimensionality, [3] presents an algorithm returning (√2 + ε)-approximate solutions using O(d ε^{-3} log(1/ε)) working memory, and [27] presents an algorithm which can approximate the diameter within a factor c > √2 using O(d n^{1/(c²−1)} log n) working memory, where n is the size of the stream. A naive 2-approximation to the diameter of the entire stream is attainable in constant working memory by simply accumulating the maximum distance of all points in the stream from the first one. However, this approach cannot be effectively used in the sliding window model, because of the difficulty of maintaining the maximum distance from the first point of each window.
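The naive constant-memory estimator just described can be sketched as follows (a minimal illustration, assuming Euclidean points given as coordinate tuples; the same logic works with any metric):

```python
import math

def diameter_lower_bound(stream):
    """2-approximation of the diameter of a stream: accumulate the
    maximum distance of every point from the first one.  By the
    triangle inequality, the true diameter is at most twice this value."""
    it = iter(stream)
    first = next(it)
    best = 0.0
    for p in it:
        best = max(best, math.dist(first, p))
    return best
```

Indeed, for any two points p, q of the stream, dist(p, q) ≤ dist(p, first) + dist(first, q) ≤ 2 · best, so the returned value lies in [Δ/2, Δ], where Δ is the true diameter.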
In [14], a streaming algorithm under the sliding window model is presented which, for any constant ε > 0, is able to return a (3 + ε)-approximation to the diameter of the current window in general metric spaces using working memory of size O(log(α)/ε), where, again, α represents the aspect ratio of the entire stream and must be known in advance. In the same paper, the authors prove that, under reasonable assumptions, obtaining an approximation ratio less than 3 in general spaces requires Ω(N^{1/3}) working memory, where N is the length of the window. For Euclidean spaces of dimension d, a (1 + ε)-approximation algorithm in the sliding window model is presented in [18], which uses a working memory of O((1/ε)^{(d+1)/2} log³ N (log α + log log N + 1/ε)) bits. For constant d, the working memory requirement has been improved to O((1/ε)^{(d+1)/2} log(α/ε)) [12].
For the smallest enclosing ball problem, which is closely related to diameter estimation, [3] presents a streaming algorithm that can maintain a ((1 + √3)/2 + ε)-approximate solution using O(d ε^{-3} log(1/ε)) working memory. In the more restrictive sliding window model, only a (9.66 + ε)-approximation is known [42], which uses sub-polynomial working memory.
In the realm of large graph analytics, diameter approximation (under the shortest path metric) has been extensively addressed in the distributed setting (see [8,39] and references therein).
Another relevant variation on the diameter estimation problem is the computation of the α-effective diameter, which is defined as the α-th quantile of the distances between all pairs of elements. This concept was first introduced in [34] as a noise-robust alternative to diameter in the context of network analysis, but can be easily generalized to arbitrary metric spaces. Recently, both [37] and [17] independently presented sliding window algorithms for the effective diameter estimation problem in general metric spaces.
Finally, the sliding window model has recently been addressed in a large number of research papers, which provide algorithms for a wide variety of optimization problems, including k-means and k-medians [6], diversity maximization [7], k-center with outliers [17,37], submodular optimization [19], and heavy hitters [41].

Our contribution
In this work, we present streaming algorithms for the k-center and diameter estimation problems under the sliding window model. Our algorithms are coreset based, in the sense that they maintain a small subset of representative points embodying an accurate solution for the current window. More specifically, our algorithms rely on the data structures used in [14] to obtain an initial reasonable estimate of the optimal k-center radius, or the diameter, for the current window. These structures are paired with additional ones which leverage the initial estimate to maintain a coreset containing better representatives for the points of the current window. The working memory used by the algorithms is analyzed as a function of k, of a precision parameter ε, related to the desired approximation guarantee, and of the doubling dimension and aspect ratio of the current window. The doubling dimension, which is formally defined in Sect. 2, generalizes the notion of Euclidean dimensionality and, as our results show, captures the increasing difficulty of approximating the solution to the above problems as its value grows.
Consider a stream S of points from a metric space under the sliding window model. Let ε > 0 denote a fixed, user-defined precision parameter, and let α W and D W denote, respectively, the aspect ratio and the doubling dimension of the current window W. The two main theoretical results in this paper are the following:

It is important to remark that our algorithms are fully oblivious to both the aspect ratio α W and the doubling dimension D W, in the sense that these values are not used explicitly by the algorithms but are only employed to analyze their space and time requirements. This is a crucial feature since, in practice, estimates of α W and D W would be very difficult to obtain. Moreover, as desirable in the sliding window model, the amount of working memory used by the algorithms is independent of the window length and, for constant ε and D W, it grows asymptotically only as a function of α W and k for the k-center problem (resp., only as a function of α W for the diameter problem).
The main improvements of our algorithms with respect to the state of the art for general metric spaces [14] are:
- For the k-center problem, the approximation ratio drops from 6 + ε to 2 + ε, with a moderate increase in the working memory and update/query time requirements for windows of low dimension. Moreover, our result shows that the aforementioned lower bound on the working memory size, proved in [14], can be beaten when the doubling dimension of the stream is small. In general, the approximation ratio of our algorithm can be made arbitrarily close to 2 which, under the hypothesis P ≠ NP, is the best approximation attainable by any polynomial-time sequential algorithm run on the entire window with unbounded memory.
- For the diameter estimation problem, the approximation ratio drops from 3 + ε to 1 + ε, with a moderate increase in the working memory and update/query time requirements for windows of low dimension. (In fact, the query time can be improved by settling for a (2 + ε)-approximation.) Thus, our algorithm almost provides an exact estimation, whose computation from scratch would require time quadratic in the window length.
-Our algorithms, while oblivious to the doubling dimension of the window, afford a dimensionality-sensitive analysis which yields sharper resource-accuracy tradeoffs. They are also oblivious to the aspect ratio of the window, unlike the algorithms in [14] which require explicit knowledge of the aspect ratio of the entire stream. Note that this latter aspect ratio can be much larger than the largest aspect ratio of any window.
Finally, to gauge the practicality of our approach, we implemented our algorithms and those of [14], and compared their performance. The experiments provide clear evidence that, when endowed with similar amounts of working memory, our algorithms almost always yield significantly better approximations on real-world datasets, with comparable update and query times.

Novelty with respect to conference version
A preliminary version of this work appeared in the Proceedings of the 7th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2020) [36]. The novel contributions of this work with respect to the preliminary conference version are the following:
- a strengthened analysis of the algorithms' space and time requirements, which now depend on the doubling dimension of the current window rather than on the doubling dimension of the entire stream;
- a new technique to make the algorithms oblivious to the aspect ratio of the metric;
- the application of our clustering approach to the diameter estimation problem, improving upon the results of [14];
- a substantially richer experimental analysis.

Organization of the paper
The rest of the paper is organized as follows. Section 2 defines the problems formally, and introduces a number of technical notions which will be used throughout the paper. Section 3 contains the description and the analysis of the algorithm for the k-center problem. In particular, Sects. 3.1 and 3.2 describe and analyze a simpler version of the algorithm assuming that the aspect ratio of the entire stream is known. Section 3.3 shows how to make the algorithm oblivious to the aspect ratio, and how to weaken the dependence on the aspect ratio from that of the entire stream to that of the current window. Section 4 describes and analyzes our algorithm for diameter estimation. Section 5 presents the experimental results. Finally, Sect. 6 offers some concluding remarks.

Preliminaries
Consider a pointset W from some metric space with distance function dist(·, ·). For any point p ∈ W and any subset C ⊆ W we use the notation dist(p, C) = min q∈C dist(p, q), and define the radius of C with respect to W as

r C (W) = max p∈W dist(p, C).

For a positive integer k < |W|, the k-center problem requires to find a subset C ⊆ W of size k which minimizes r C (W). Note that any subset C ⊆ W of size k immediately induces a partition of W into k clusters by assigning each point to its closest center (with ties broken arbitrarily). We say that r C (W) is the radius of such a clustering, and define

OPT k,W = min C⊆W, |C|=k r C (W)

to denote the radius achieved by an optimal solution to the problem.
As recalled in the introduction, the well-known greedy sequential algorithm by Gonzalez [20] (dubbed gon in the rest of the paper) provides a 2-approximation to the k-center problem, running in O(|W| k) time. The following useful fact, proved in [10, Lemma 1], states that gon, when run on any subset T of the pointset W, returns a clustering whose radius cannot be much larger than the radius of an optimal clustering of the entire pointset.
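As an illustration, gon can be implemented as follows (a sketch for Euclidean points; the algorithm works verbatim with any distance function, and runs in O(|points| · k) time):

```python
import math

def gon(points, k):
    """Gonzalez's greedy 2-approximation for k-center: repeatedly
    pick the point farthest from the current set of centers."""
    centers = [points[0]]  # arbitrary first center
    dist_to_centers = [math.dist(p, centers[0]) for p in points]
    for _ in range(k - 1):
        # next center: the point currently farthest from all centers
        i = max(range(len(points)), key=dist_to_centers.__getitem__)
        centers.append(points[i])
        for j, p in enumerate(points):
            dist_to_centers[j] = min(dist_to_centers[j], math.dist(p, points[i]))
    return centers, max(dist_to_centers)  # centers and clustering radius
```

On two well-separated pairs of nearby points with k = 2, the returned radius is the small intra-pair distance, within a factor 2 of the optimum.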

Fact 1
For any subset T ⊆ W , with |T | > k, let C be the output of gon when run on T . We have r C (T ) ≤ 2 · OPT k,W .
We also define the diameter of a pointset W as Δ W = max p,q∈W dist(p, q). The diameter can be computed exactly in quadratic time and, it is easy to argue that, for an arbitrary point p ∈ W, the value max q∈W dist(p, q) falls in [Δ W /2, Δ W ], hence it provides a 2-approximation to the diameter.

In the standard streaming framework [26,31] the computation is performed by a single processor with a small working memory, and the input is provided as a continuous, possibly unbounded, stream of objects (points, in our case), arriving one at each time step, which is usually too large to fit in the working memory. Under the sliding window model, at each time t, a solution to the problem of interest should be computable for the pointset W t represented by the last N points arrived in the stream, where N, referred to as window length, is a predetermined value known to the algorithm. More formally, for each input point p, let t(p) denote its arrival time. At any time t, we have W t = {p | 0 ≤ t − t(p) < N}. Since N can still be much larger than the working memory size, the challenging goal in this setting is to guarantee the quality of the solution while storing an amount of data substantially smaller than the window length.
Consider a stream S of points from a metric space with distance function dist(·, ·), and with a sliding window of length N . We define the aspect ratio α of S as the ratio between the maximum distance and the minimum distance of any two distinct points of S. Similarly, at any time t, we define the aspect ratio α W of the current window W as the ratio between the maximum and the minimum distance of any two distinct points of W . These values will play an important role in our algorithms.
In this paper, we present streaming algorithms for the k-center problem and for diameter estimation, under the sliding window model. Our algorithms maintain information about a judiciously selected subset of points of the current window, from which, at any time t, a succinct coreset T ⊆ W can be extracted, so that a solution to the problem under consideration can be efficiently computed by running a sequential (approximation) algorithm on T. The quality of a coreset T is regulated by a user-defined accuracy parameter ε > 0, as captured by the following definition: a subset T ⊆ W is an ε-coreset for W w.r.t. the k-center problem if, for every p ∈ W, there exists q ∈ T such that dist(p, q) ≤ ε · OPT k,W. In other words, the property of an ε-coreset T of W is that each point in W is "close" enough to some point in T, where closeness is defined as a function of ε and OPT k,W. Our algorithms for both the k-center problem and diameter estimation crucially rely on ε-coresets complying with the above definition. (For diameter estimation, the ε-coresets employed will be w.r.t. the 1-center problem.) The time and space performance of our algorithms will be analyzed in terms of the parameters k, N, α (or α W), ε, and of the dimensionality of the points in the current window. Since we target the applicability of our algorithms to arbitrary metric spaces, we will make use of the following general notion of dimensionality. Let W denote a set of points from a metric space. For any x ∈ W and r > 0, let the ball of radius r centered at x, denoted as B(x, r), be the subset of points of W at distance at most r from x. The doubling dimension of W is the smallest value D such that any ball B(x, r), with x ∈ W, is contained in the union of at most 2^D balls of radius r/2 suitably centered at points of W. The following important fact, which we will use in the analysis, was proved in [24]:
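To make the ε-coreset definition concrete, the following sketch checks the property on a toy instance, computing OPT k,W by brute force (feasible only for very small pointsets; Euclidean distance is assumed for illustration):

```python
import math
from itertools import combinations

def opt_radius(W, k):
    """Brute-force OPT_{k,W}: smallest clustering radius over all
    size-k subsets of centers (exponential; for illustration only)."""
    return min(max(min(math.dist(p, c) for c in C) for p in W)
               for C in combinations(W, k))

def is_eps_coreset(T, W, k, eps):
    """True iff every point of W is within eps * OPT_{k,W} of some
    point of T, i.e., T is an eps-coreset of W w.r.t. k-center."""
    bound = eps * opt_radius(W, k)
    return all(min(math.dist(p, q) for q in T) <= bound for p in W)
```

For instance, on two well-separated pairs of nearby points, keeping one point per pair is an ε-coreset for ε = 1 but not for ε = 1/2.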

Fact 2 Let W be a set of points from a metric space and let Y ⊆ W be such that any two distinct points a, b ∈ Y are at distance dist(a, b) > r. Then, if Y is contained in a ball of radius R, we have |Y| ≤ (4R/r)^D, where D is the doubling dimension of W.
A prominent feature of our algorithms is that they adapt automatically to the doubling dimension D of the window, in the sense that they do not require explicit knowledge of D, and provide their best performance for small values of D. The characterization of datasets (or metric spaces) through their doubling dimension has been used in the literature in several contexts, including routing [29], clustering [1,10], nearest neighbor search [15], machine learning [22], and diversity maximization [9].

K-center problem
In this section, we present our (2 + ε)-approximation streaming algorithm for the k-center problem under the sliding window model. The section is organized as follows. A first version of the algorithm, which assumes knowledge of the aspect ratio α of the entire stream, is presented in Sect. 3.1 and analyzed in Sect. 3.2. Subsequently, Sect. 3.3 shows how to make the algorithm oblivious to α. Moreover, the dependence of the algorithm's space and time performance on the aspect ratio will be restricted to that of the current window, rather than that of the entire stream.

Algorithm
We consider the k-center problem for a target number k of centers, an input stream S, and a window length N. Let minDist and maxDist denote, respectively, the minimum and maximum distance between any two distinct points of S. To simplify the presentation of the algorithm, we assume that the values minDist and maxDist, hence the aspect ratio α = maxDist/minDist of S, are known to the algorithm.
For each point p we define its Time-To-Live (TTL), denoted by TTL(p), as N − (t − t(p)), where t is the current time. When p arrives (t = t(p)), its TTL is N, the window length, and, from that time on, TTL(p) decreases by one unit at every new arrival. To avoid repeated updates of the TTL of points stored in the working memory, we assume that with each point p in the working memory we store the value t(p), which allows us to immediately compute its TTL, given the current time t and N. We say that a point p expires when it leaves the current window W, that is, when TTL(p) = 0. In the analysis we will also consider points with negative TTL, that is, points that have expired at some previous time step.
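In code, this bookkeeping reduces to storing the arrival time t(p) and deriving the TTL on demand (a minimal sketch):

```python
def ttl(arrival, t, N):
    """Time-To-Live of a point that arrived at time `arrival`,
    at current time t, for window length N."""
    return N - (t - arrival)

def in_window(arrival, t, N):
    """A point belongs to the current window W_t iff 0 <= t - arrival < N,
    i.e., its TTL is still positive (and at most N)."""
    return 0 <= t - arrival < N
```

A point expires exactly when its TTL reaches 0, i.e., N arrivals after its own.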
For a user-defined constant β > 0, let

Γ = { (1 + β)^i : i ∈ Z, minDist ≤ (1 + β)^i < (1 + β) · maxDist }

and note that |Γ| = O(log(α)/log(1 + β)). As in [14], our algorithm runs several parallel instances, where each instance uses a different value γ ∈ Γ as a guess of the optimal radius of a clustering of the current window. For each guess γ, the algorithm maintains two types of points belonging to the current window W: validation points (v-points, for short), which enable to assess whether γ is a constant approximation to the optimal radius OPT k,W, and coreset points (c-points, for short), which are those from which the coreset is extracted. For each γ ∈ Γ, validation points are in turn organized into three (not necessarily disjoint) sets, namely AV γ, RV γ and OV γ. Coreset points are similarly partitioned into sets A γ, R γ and O γ. The sets of validation points serve the same purpose as those used in [14]. In broad terms, the set AV γ (attraction v-points), whose size is upper bounded by k + 2, contains centers of clusters of radius at most 2γ, which cover all points of W when γ is a valid guess for OPT k,W. We say that a point p is v-attracted by v ∈ AV γ if dist(p, v) ≤ 2γ. The set RV γ (representative v-points) contains, for each v ∈ AV γ, a representative repV γ (v), defined as the newest point (that is, the point with the largest TTL) among those v-attracted by v. When v expires, its representative repV γ (v) becomes an orphan, and it is moved to the set OV γ (orphan v-points).
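The set Γ of guesses is a geometric progression of common ratio (1 + β) spanning the range of possible radii; a minimal sketch for generating it (the exact endpoints chosen below are an assumption of this illustration):

```python
def radius_guesses(min_dist, max_dist, beta):
    """Geometric progression of radius guesses with common ratio
    (1 + beta), starting at min_dist and ending at the first value
    >= max_dist.  Its size is O(log(alpha) / log(1 + beta)),
    where alpha = max_dist / min_dist is the aspect ratio."""
    guesses = [min_dist]
    while guesses[-1] < max_dist:
        guesses.append(guesses[-1] * (1 + beta))
    return guesses
```

The last guess is at least maxDist, so some guess always upper bounds the optimal radius; one parallel instance of the algorithm is run per guess.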
Let ε > 0 be a user-defined precision parameter. The three sets of coreset points are used to refine the coverage provided by the validation points, so as to make sure that, for valid guesses of γ, they contain an ε-coreset for the current window. Let δ = ε/(1 + β). The set A γ (attraction c-points) contains centers that refine the clusters around the attraction v-points by reducing their radius by a factor O(δ). We say that a point p is c-attracted by a ∈ A γ if dist(p, a) ≤ δγ/2. The sets R γ and O γ play, for c-points, the same role played by RV γ and OV γ for v-points. Thus, the set R γ (representative c-points) contains a representative repC γ (a) for each a ∈ A γ, which is the newest point among those c-attracted by a. When a expires, its representative repC γ (a) becomes an orphan and it is moved to the set O γ (orphan c-points).
Observe that a point q can be a representative for several attraction v-points (resp., c-points). In that case, we assume that a distinct copy of q is maintained in RV γ (resp., R γ ), one for each v ∈ AV γ (resp., a ∈ A γ ) such that q = repV γ (v) (resp., q = repC γ (a)).
At every time step, a number of points, including those that expire at that step, are removed from the sets of validation and coreset points, so as to keep their sizes under control. The interplay between validation and coreset points is the following. At any time t, the validation points enable the identification of a suitable guess γ̂ which is within a constant factor of the optimal value OPT k,W. Then, the set R γ̂ ∪ O γ̂ provides a good coreset from which an accurate final solution to k-center for W can be computed, using algorithm gon.
Our approach is described in detail by the following pseudocode, which consists of three procedures: update(p) describes the processing of each point p of the stream; insertValidation(p, γ) is invoked inside update(p) when p must be added to AV γ; finally, query(), if invoked at time t, returns the coreset on which algorithm gon can be run to obtain the solution.
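The selection logic of query() can be illustrated by the following much-simplified sketch, in which the maintained sets are passed in explicitly and `valid` is a hypothetical predicate standing in for the check that at most k centers suffice for a guess (the actual procedure differs in the details):

```python
def query(guesses, AV, R, O, k, valid):
    """Scan the guesses in increasing order and return the coreset
    R[g] | O[g] for the smallest guess g whose validation structures
    certify it: at most k attraction v-points and a valid k-cover."""
    for g in sorted(guesses):
        if len(AV[g]) <= k and valid(g):
            return R[g] | O[g]
    return None
```

Guesses failing the check are provably smaller than the optimal radius, so the smallest surviving guess is within a constant factor of OPT k,W.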
We remark that values of δ ≥ 4 would be uninteresting since, in this case, the coreset points would not offer any refinement over the coverage provided by the validation points. Therefore, in what follows, we assume that ε and β are fixed so that δ < 4.

Algorithm analysis
Suppose that Procedure update(p) is applied to every point p of the input stream S, upon arrival. In this section, we show that, at any time, Procedure query (if invoked after update(p) has finished processing the latest point p arrived) returns an ε-coreset for the current window W, and that, by running gon on such a coreset, a (2 + ε)-approximate solution to the k-center problem for W is obtained. Moreover, we will analyze the amount of working memory and the time required to process each point of the stream.
The following technical lemma states the main invariants maintained by Procedure update, which will be crucial for the analysis.

Lemma 1
For every γ ∈ Γ , the following invariants hold at the end of each execution of Procedure update( p), with respect to the window W containing p as its last point.
Proof The proofs of Invariants 1(b) and 2(b) follow the lines of the argument in [14], but we include them for completeness. For convenience, we subdivide time into steps, where each step processes a point of the stream. It is easy to see that the invariants hold at the end of Step 0, which we consider as the beginning of the stream, before the first point arrives. We suppose that the invariants hold at the end of Step t − 1, for some t > 0, and show that they are maintained at the end of Step t. In the proof, we assume the following ordering of the activities of Step t: first, the point whose TTL goes to 0 expires and is thus excluded from the current window W; then, the new point p arrives and update(p) is executed; and, finally, at the end of update(p), p is included in the current window W. For each point q ∈ W we define its v-attractor (resp., c-attractor) as the oldest attraction v-point (resp., attraction c-point) which was at distance at most 2γ (resp., at most δγ/2) from q when q entered W. For simplicity, we define a number of checkpoints in the execution of Step t and show that if the invariants hold prior to each checkpoint, they also hold at the checkpoint. All the line numbers, unless explicitly specified, refer to Procedure update.

Checkpoint 1: the invariants hold after the point with TTL = 0 expires. This is immediate to see, since the point is only removed from the window, but not yet from the sets stored in memory to which it belongs, if any.
Checkpoint 2: the invariants hold after Line 10. If |AV γ | ≤ k before update(p) starts, it stays this way after Line 10, since Lines 1-10 do not add new points to AV γ; thus we only need to prove that Invariant 1 is maintained. We give the argument for 1(a), since the one for 1(b) is virtually identical. If the expired point is o ∈ O γ, its removal in Line 8 does not affect the invariant. Indeed, if a point q violated 1(a) after the expiration of o, it would imply that o and q shared the same c-attractor, but, in this case, q would have expired before o and could not belong to W. If, instead, the expired point is a ∈ A γ, then its representative repC(a) is moved to O γ. If a represents itself (i.e., a = repC(a)), then repC(a) is also removed from the orphans and the considerations made above apply; otherwise, the union R γ ∪ O γ remains unchanged. Note that no point of R γ can expire unless its c-attractor also expires but, in this case, the point is moved to the orphan set, and this corresponds to the case, considered above, when a = repC(a) expires. Consider now the case when |AV γ | > k before update(p) starts (note that it must necessarily be |AV γ | = k + 1). If a v ∈ AV γ expired, v is removed from AV γ, so |AV γ | = k after Line 10, and it suffices to prove that Invariant 1 holds. Note that all the points q such that dist(q, R γ ∪ O γ) > δγ already expired, due to the fact that 2(a) holds at the beginning of update(p). Similarly, it can be argued that all points q such that dist(q, RV γ ∪ OV γ) > 4γ already expired. Consider now the case |AV γ | = k + 1 after Line 10, and let us first show that 2(a) holds. If a point q violates 2(a), this implies that a point o ∈ O γ with the same c-attractor as q has expired, but then q must have expired prior to o, hence it cannot belong to W. A similar argument can be used to prove 2(b).
Checkpoint 3: the invariants hold after Line 15. First, consider the case E V = ∅. Then insertValidation(p, γ) is called to insert p in AV γ. If, at the end of this call, we have |AV γ | ≤ k, then it was also |AV γ | ≤ k at the start of the call. As a consequence, Invariant 1 must hold since, in this case, the call does not delete any point. Otherwise, at the end of insertValidation(p, γ), it must be |AV γ | = k + 1, and we need to prove that Invariant 2 holds. Consider first 2(a). For any point q whose distance from R γ ∪ O γ becomes > δγ, there must be an orphan o ∈ O γ with the same c-attractor as q, which has been deleted in Line 14 of insertValidation, hence TTL(q) ≤ TTL(o) < min v∈AV γ TTL(v). A symmetrical argument applies to prove 2(b). In case E V ≠ ∅, we replace each representative repV(v) of v ∈ E V with the new point p. Note that both repV(v) and p are v-attracted by the same point v, so all points with the same v-attractor as repV(v) are contained in the 4γ-ball centered at p, which suffices to prove both Invariants 1 and 2.
Checkpoint 4: the invariants hold after Line 22. If E = ∅, then no point is deleted in Lines 16-22, thus the two invariants hold. Otherwise, if E ≠ ∅, we replace each representative c-point repC(a), with a ∈ E, with the new point p. Since repC(a) and p are c-attracted by the same point a, all points with the same c-attractor as repC(a) are contained in the δγ-ball centered at p, which suffices to prove that Invariants 1 and 2 hold.
Checkpoint 5: the invariants hold after the new point p is inserted into the active window. If p has been inserted into AV γ, then p has also been inserted into RV γ, hence dist(p, RV γ ∪ OV γ) = 0; a similar observation holds for the c-point sets, and the two invariants follow.

Consider a time step t and let T be the set of points returned by Procedure query invoked after the execution of update(p), where p is the t-th point of the stream. Let also W be the current window containing p as its last point. We have:

Lemma 2 T is an ε-coreset for W w.r.t. the k-center problem.
Proof It can be easily seen that, for any guess γ such that either |AV γ | > k, or |AV γ | ≤ k and the set C computed by query contains > k points, there are at least k + 1 points of W at pairwise distance > 2γ, which immediately implies that γ < OPT k,W. Moreover, since Γ contains a guess γ ≥ maxDist ≥ OPT k,W, the procedure will always determine a minimum guess γ̂ such that both |AV γ̂ | ≤ k and |C| ≤ k. Then, since the values in Γ form a geometric progression of common ratio (1 + β), we obtain that γ̂/(1 + β) < OPT k,W. Also, since |AV γ̂ | ≤ k, Invariant 1(a) of Lemma 1 ensures that every p ∈ W satisfies dist(p, R γ̂ ∪ O γ̂) ≤ δγ̂ < δ(1 + β) · OPT k,W = ε · OPT k,W, and the lemma follows.
The next theorem establishes the approximation factor of our algorithm.

Theorem 1 By running Algorithm gon on T we obtain a (2 + ε)-approximate solution for the k-center problem on W.
Proof Let C_alg be the set of centers returned by gon when run on T. Since T is a subset of W, by Fact 1 we have that for each q ∈ T, dist(q, C_alg) ≤ 2 · OPT_{k,W}. Moreover, since T is an ε-coreset for W (Lemma 2), we have that for each p ∈ W there is q ∈ T such that dist(p, q) ≤ ε · OPT_{k,W}. By combining these two observations and applying the triangle inequality, we conclude that for each p ∈ W we have dist(p, C_alg) ≤ (2 + ε) · OPT_{k,W}.
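Fact 1 refers to the classical greedy 2-approximation for k-center, which we take to be Gonzalez's farthest-point heuristic; a minimal sketch (function name and the distance-callback interface are our own):

```python
def gon(points, k, dist):
    """Greedy farthest-point 2-approximation for k-center (Gonzalez):
    start from an arbitrary point, then repeatedly add the point
    farthest from the current set of centers."""
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(farthest)
    return centers
```

Running gon on the coreset T instead of the full window W is exactly what the theorem analyzes: the extra ε · OPT incurred by the coreset adds to the factor 2 of the greedy choice.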
The next two theorems establish the space and time requirements of our algorithm.

Theorem 2 At any time and for every guess γ ∈ Γ, it holds that |AVγ|, |RVγ|, |OVγ| ≤ k + 1 and |Aγ ∪ Rγ ∪ Oγ| ≤ 6(k + 1)(32/δ)^{D_W}, where D_W is the doubling dimension of the current window W.

Proof The proof argument is the same as the one used in [14], but we report it for completeness. The bound on |AVγ| is explicitly enforced by insertValidation, which removes a point from AVγ as soon as its size exceeds k + 1. The bound on |RVγ| follows from the fact that the algorithm makes sure that RVγ contains exactly one representative for each v ∈ AVγ. Indeed, when a point is removed from AVγ, its representative is moved to OVγ. For what concerns the bound on |OVγ|, let v_1, v_2, . . . be an enumeration of the points inserted in AVγ at any time during the algorithm, ordered by arrival time. We now show that for every i ≥ 1 we have TTL(v_{i+k+1}) > TTL(repVγ(v_i)) ≥ TTL(v_i). Consider two cases. If v_i expires before v_{i+k+1} enters the window, then TTL(v_{i+k+1}) > TTL(repVγ(v_i)) because repVγ(v_i) must have entered the window before v_i expired. Otherwise, upon insertion of v_{i+k+1} in AVγ, there are already k + 1 points in AVγ (namely v_i, . . . , v_{i+k}), so the algorithm deletes v_i as it is the oldest point in AVγ. Then, again, TTL(v_{i+k+1}) > TTL(repVγ(v_i)), because repVγ(v_i) must have entered the window before v_i was deleted. At time t, let v_j be the last point that was removed from AVγ, either because it expired or because it was deleted. By the property proved above, any point which has been a representative of v_{j−(k+1)} has a TTL smaller than TTL(v_j), thus it cannot be in memory at time t, because it either expired or has been deleted by Line 14 of insertValidation. This shows that |OVγ| ≤ k + 1.
Next, we show that |Aγ ∪ Rγ ∪ Oγ| ≤ 6(k + 1)(32/δ)^{D_W}, where D_W is the doubling dimension of the current window W. From the proof above we know that there are at most k + 1 points in each of the sets AVγ, RVγ, and OVγ. By construction, we also know that the distance between any two points of Aγ is ≥ δγ/2. We show that the points of Aγ are enclosed in at most 2(k + 1) balls of radius 4γ. Consider two cases. If |AVγ| ≤ k, by Invariant 1(b) we have max_{q∈W} dist(q, RVγ ∪ OVγ) ≤ 4γ, hence each q ∈ Aγ is within one of the at most 2(k + 1) balls of radius 4γ centered at the points of RVγ ∪ OVγ. Instead, if |AVγ| = k + 1, then by Invariant 2(b) we have that dist(q, RVγ ∪ OVγ) ≤ 4γ for each q ∈ W with TTL(q) ≥ min_{v∈AVγ} TTL(v). Recall that, after we insert a new point in AVγ, if its size exceeds k we delete from Aγ ∪ Rγ ∪ Oγ all the points with TTL smaller than the smallest TTL of a point in AVγ. Then, after each execution of the procedure update, if |AVγ| = k + 1, each point in Aγ has TTL greater than that of the oldest point in AVγ. Thus, each q ∈ Aγ is within a ball of radius 4γ centered at some point in RVγ ∪ OVγ. By Fact 2, each of these 2(k + 1) balls can contain at most (32/δ)^{D_W} points of Aγ, so |Aγ| ≤ 2(k + 1)(32/δ)^{D_W}. Moreover, at any given time |Rγ| = |Aγ|, since the algorithm makes sure that Rγ contains exactly one representative for each a ∈ Aγ.
Let k′ = 2(k + 1)(32/δ)^{D_W} be the above upper bound on the size of Aγ. We are left to show that |Oγ| ≤ k′. Let a_1, a_2, . . . be an enumeration of the points inserted in Aγ at any time during the algorithm, ordered by arrival time. We now show that for every i ≥ 1 we have TTL(a_{i+k′+1}) > TTL(repCγ(a_i)) ≥ TTL(a_i). It must hold that a_i expires or gets deleted before a_{i+k′+1} enters the window, since otherwise, upon insertion of the new point a_{i+k′+1}, there would be k′ + 1 points in Aγ (namely a_i, . . . , a_{i+k′}), which is impossible since k′ is an upper bound on |Aγ|. Then, TTL(a_{i+k′+1}) > TTL(repCγ(a_i)), because repCγ(a_i) must have entered the window before a_i expired or got deleted, which means that repCγ(a_i) must have entered the window before a_{i+k′+1} enters the window. Let a_j be the last point that was removed from Aγ, either because it expired or because it was deleted. By the property proved above, any point which has been a representative of a_{j−(k′+1)} has a TTL smaller than TTL(a_j), thus it cannot be in memory at time t, because it either expired or has been deleted by Line 16 of insertValidation. This shows that there can be at most k′ points in Oγ.

Theorem 3 Procedure update(p) runs in time O(k · (log(α)/log(1 + β)) · (32(1 + β)/ε)^{D_W}), while Procedure query() runs in time O(k² · log(α)/log(1 + β) + k(32(1 + β)/ε)^{D_W}), where D_W is the doubling dimension of the current window W.
Proof The time complexity of update(p) is dominated by the construction of the sets E_V and E for each γ (Lines 9 and 10), which requires time linear in |AVγ| + |Aγ|. The claimed bound follows by Theorem 2. For what concerns query, we observe that, as shown in the proof of Theorem 2, |AVγ| ≤ k + 1 for every guess γ, hence the identification of the minimum guess γ̂ such that both |AVγ̂| ≤ k and |C| ≤ k can be easily accomplished in O(k² log(α)/log(1 + β)) time. Finally, once γ̂ has been found, returning Rγ̂ ∪ Oγ̂ takes time proportional to their size, which is O(k(32(1 + β)/ε)^{D_W}), as argued in the proof of Theorem 2.
We remark that the first term in the running time of Procedure query() can be improved to O(k² log(log(α)/log(1 + β))) by using binary search over the values of Γ.
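The binary-search refinement can be sketched as follows, assuming (as the proof of Lemma 2 suggests) that feasibility of a guess is monotone in γ; the helper name is our own:

```python
def min_feasible_guess(guesses, feasible):
    """Smallest guess for which feasible() holds, via binary search.
    Assumes guesses is sorted increasingly and feasibility is monotone
    (false, ..., false, true, ..., true), as for the conditions
    |AV_gamma| <= k and |C| <= k checked by query()."""
    lo, hi = 0, len(guesses) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(guesses[mid]):
            hi = mid          # mid could be the minimum feasible guess
        else:
            lo = mid + 1      # the minimum lies strictly above mid
    return guesses[lo]
```

Each feasibility check costs O(k²) time, whence the O(k² log(log(α)/log(1 + β))) bound on the first term.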
The following theorem summarizes the main features of our algorithm.

Theorem 4 Consider a stream S of points from a metric space under the sliding window model, and let α and D_W denote, respectively, the aspect ratio of S and the doubling dimension of the current window W. For fixed parameters ε, β > 0, at any time t our algorithm for the k-center problem requires working memory of size M = O(k · (log(α)/log(1 + β)) · (32(1 + β)/ε)^{D_W}), processes each point p ∈ S in time O(M), and is able to return a (2 + ε)-approximation to the k-center problem for W in time O(k² · log(log(α)/log(1 + β)) + k(32(1 + β)/ε)^{D_W}).

We remark that the amount of working memory required by the algorithm to analyze the entire stream S will depend on the maximum value of D_W, which is upper bounded by the doubling dimension of the entire stream but can in fact be substantially smaller.

Obliviousness to the aspect ratio

For convenience, the algorithm presented in Sect. 3.1 assumed explicit knowledge of the aspect ratio α of the entire stream S. In this section, we show how to make the algorithm oblivious to the aspect ratio, while keeping the same approximation quality, time, and space requirements. In fact, the analysis will also show that the dependence on α of the time and space requirements of the oblivious algorithm can be weakened into a dependence on the value α_W, i.e., the aspect ratio restricted to the current window W.
We observe that the knowledge of α required by the algorithm presented in Sect. 3.1 serves solely the purpose of identifying a spectrum of feasible values for the optimal cluster radius. This is reflected in the definition of Γ , which contains a geometric progression of values spanning the interval between the minimum and maximum distance of any two points of the stream. In the oblivious version, at any time t the algorithm maintains an analogous set Γ t , spanning an interval delimited by a lower bound and an upper bound to the radius of the optimal clustering for the current window W .
Let p_1, p_2, . . . be an enumeration of all points of the stream S based on the arrival order. At every time t > k, let r_t be the minimum pairwise distance between the last k + 1 points of the stream (p_{t−k}, . . . , p_{t−1}, p_t). It is easy to see that, for the current window W, OPT_{k,W} ≥ r_t/2. We require that, together with the other structures, the algorithm stores the last k + 1 points arrived and maintains the value r_t, which can be computed with an extra O(k²) operations per step.
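The maintenance of r_t described above can be implemented as follows (a simple illustration; the deque-based buffer is our own choice):

```python
from collections import deque

def update_r(buffer, p, k, dist):
    """Maintain r_t, the minimum pairwise distance among the last k+1
    points of the stream. buffer is a deque holding at most k+1 points;
    the full rescan after each arrival costs O(k^2) distance evaluations."""
    buffer.append(p)
    if len(buffer) > k + 1:
        buffer.popleft()
    pts = list(buffer)
    if len(pts) < 2:
        return None  # r_t is undefined while fewer than two points are buffered
    return min(dist(a, b) for i, a in enumerate(pts) for b in pts[i + 1:])
```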
We also require that the algorithm maintains a tight upper bound on the diameter Δ W of the current window W . More precisely, we require that the algorithm maintains, at any time t, a value M t such that M t ≥ Δ W , with M t = Θ (Δ W ). To ease presentation, let us assume for now that M t is available. At the end of this subsection, we will show how to augment the algorithm so that M t can indeed be efficiently computed at each step.
Let the values β, ε, and δ = ε/(1 + β) be defined as in Sect. 3.1, and recall that we assumed δ ≤ 4, since larger values of δ are uninteresting for our algorithm. We define Γ_t as the set of guesses of the form (1 + β)^i covering the interval [r_t/2, 2M_t/δ] (so that min Γ_t ≤ r_t/2 and max Γ_t ≥ 2M_t/δ), and observe that the definition of Γ_t is independent of α. The structure of the α-oblivious algorithm is identical to the one presented in the previous subsections, except that at each time t, the set of guesses Γ_t substitutes the fixed set of guesses Γ. The following claim shows that at any step t, it is sufficient that the algorithm maintains structures for guesses γ belonging to Γ_t.
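A possible construction of Γ_t, under our reading that the geometric progression must bracket the interval delimited by r_t/2 and 2M_t/δ (the choice of starting exponent is an assumption):

```python
import math

def guesses(lower, upper, beta):
    """Geometric guesses (1+beta)^i bracketing the interval
    [lower, upper]: the first guess is <= lower, the last is >= upper.
    In the oblivious algorithm, lower and upper play the roles of
    r_t/2 and 2*M_t/delta."""
    i = math.floor(math.log(lower, 1 + beta))
    out = [(1 + beta) ** i]
    while out[-1] < upper:
        i += 1
        out.append((1 + beta) ** i)
    return out
```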
Claim Consider the non-oblivious algorithm presented in Sect. 3.1. At any time t, Procedure query() would be correct if the sets AV γ , RV γ , OV γ , A γ , R γ , and O γ satisfying the invariants stated in Lemma 1 were available only for γ ∈ Γ t .

Proof
We show that, even if available, the sets of validation and coreset points for values of γ outside Γ_t would never provide the final coreset returned by query(). Consider a value γ < min Γ_t ≤ r_t/2, and observe that the last k + 1 points of the stream, namely p_{t−k}, . . . , p_{t−1}, p_t, are all at pairwise distance at least r_t > 2γ. Therefore, the non-oblivious algorithm would insert all of these points in AVγ, hence |AVγ| = k + 1 and the value γ would not be considered in the outer loop of query(). Let γ_max = max Γ_t ≥ 2M_t/δ. We show that at time t, when query() considers γ_max it must find |AV_{γ_max}| ≤ k and a set C of size at most k, since otherwise there would exist k + 1 points at distance > 2γ_max ≥ 4M_t/δ ≥ M_t, which is impossible since M_t is an upper bound on the diameter Δ_W, where W is the current window up to point p_t. Thus, Procedure query() surely returns a coreset for a value γ ≤ γ_max.
We now show how to modify the algorithm so that, without the knowledge of α, at the end of each step t it is able to maintain the sets AVγ, RVγ, OVγ, Aγ, Rγ, and Oγ satisfying the invariants stated in Lemma 1, limited to the guesses γ ∈ Γ_t. Suppose that this is the case up to some time t − 1 > k and consider the arrival of p_t. First, the algorithm computes the new values r_t (as described above) and M_t (to be described later), and removes all sets relative to values of γ ∈ Γ_{t−1} \ Γ_t. It is easy to argue that, for the values γ ∈ Γ_t ∩ Γ_{t−1}, these sets satisfy the invariants of Lemma 1 at the end of step t − 1. For the new values γ ∈ Γ_t \ Γ_{t−1}, the algorithm creates the sets anew, setting AVγ = Aγ = {p_{t−1}} and RVγ = Rγ = {p_{t−1}} (with OVγ = Oγ = ∅). Again, for these values of γ, the sets satisfy the invariants of Lemma 1 at the end of step t − 1, since for every point q in the window at that time we have dist(q, RVγ) = dist(q, Rγ) = dist(q, p_{t−1}) ≤ M_{t−1} ≤ δγ/2 ≤ 2γ ≤ 4γ, where the last two inequalities use the assumption δ ≤ 4.
At this point, for every γ ∈ Γ t the algorithm has available sets AV γ , RV γ , OV γ , A γ , R γ , and O γ which satisfy the invariants of Lemma 1 at the end of step t − 1. Then, it is sufficient to run update( p t ) to complete step t.
Computation of M_t It remains to show how to compute M_t at each step. To this purpose, we require that, for every γ ∈ Γ_t, a separate process, concurrent with the main algorithm described above, maintains a set of validation points for k = 1, which we refer to as AV(1)γ, RV(1)γ, OV(1)γ. Clearly, for each step t ≤ k + 1, maintaining these three sets and an upper bound M_t on Δ_W is trivial. Suppose now that these sets are available at the end of some step t − 1 > k, for every γ ∈ Γ_{t−1}, and consider the arrival of p_t. We operate as follows. We first compute r_t. Then, for each γ = (1 + β)^i, with i ≥ ⌈log_{1+β}(r_t/2)⌉, we perform the following sequence of steps until a given value γ̂ is determined:

1. we create, if needed, the sets AV(1)γ, RV(1)γ, and OV(1)γ, as explained above for the case of arbitrary k;
2. we update AV(1)γ, RV(1)γ, and OV(1)γ to account for the arrival of p_t, performing the same operations that would be performed by update(p_t) with k = 1;
3. if |AV(1)γ| ≤ 1, we set γ̂ = γ and stop, otherwise we move to the next value of γ.
Observe that this procedure is akin to the search for a feasible value of γ performed by Procedure query, and will surely stop at a value γ̂ = O(Δ_W). Once γ̂ is computed, M_t is set equal to 12γ̂, which, together with the value r_t computed before, determines the set Γ_t. At this point, Steps 1 and 2 above are repeated for all values of γ ∈ Γ_t with γ > γ̂, so as to have the sets AV(1)γ, RV(1)γ, OV(1)γ updated at the end of time t for every γ ∈ Γ_t.
The following lemma shows that M t is indeed a tight upper bound on Δ W .

Lemma 3 We have Δ_W ≤ M_t = O(Δ_W), where W is the current window, up to and including p_t.

Proof Let AV(1)γ̂ = {a}. The claimed bounds follow from Invariant 1(b) of Lemma 1, from the properties of RV(1)γ̂, and from the above definition of γ̂.
The following theorem summarizes the main result regarding the α-oblivious version of our algorithm presented in this subsection.

Theorem 5 Consider a stream S of points from a metric space under the sliding window model. Let α_W and D_W be, respectively, the aspect ratio and the doubling dimension of the current window W. For fixed parameters ε, β > 0, at any time t the α-oblivious version of our algorithm for the k-center problem requires working memory of size M = O(k · (log(α_W)/log(1 + β)) · (32(1 + β)/ε)^{D_W}), processes each point p ∈ S in time O(M + k²), and is able to return a (2 + ε)-approximation to the k-center problem for W in time O(k² · log(log(α_W)/log(1 + β)) + k(32(1 + β)/ε)^{D_W}).

Proof First we observe that |Γ_t| = O(log(α_W)/log(1 + β)). Reasoning as in the proof of Theorem 2, it is immediate to see that, for each guess γ, the aggregate size of all sets of validation and coreset points employed by the algorithm, including those required to compute M_t, is still O(k(32(1 + β)/ε)^{D_W}), and the claimed bounds follow.

Similarly to what we remarked at the end of the previous subsection for the doubling dimension, we observe that now the amount of working memory required by the algorithm to analyze the entire stream S will depend on the maximum value of α_W, which is upper bounded by the aspect ratio α of the entire stream but can in fact be substantially smaller.

Diameter estimation
In this section, we present an algorithm for accurate diameter estimation in the sliding window model. Let S be a stream of points from a metric space, with a sliding window of length N. Fix an accuracy parameter ε > 0 and let ε′ be such that ε = ε′/(1 − ε′) (equivalently, ε′ = ε/(1 + ε)). Suppose that we run the α-oblivious algorithm for the k-center problem discussed in the previous section, with k = 1 and accuracy ε′. Also, in the algorithm, set δ = ε′/(2(1 + β)) instead of δ = ε′/(1 + β). For a time step t, let T be the set of points returned by Procedure query() after the t-th point has been processed with Procedure update. We have:

Lemma 4 Let Δ_W and Δ_T be the diameters of the current window W and of the coreset T, respectively. Then Δ_T ≤ Δ_W ≤ (1 + ε)Δ_T.
Proof Since T ⊆ W, we immediately have Δ_T ≤ Δ_W. For what concerns the upper bound, let γ̂ be the smallest estimate for which both |AVγ̂| ≤ 1 and |C| ≤ 1 in Procedure query(). By reasoning as in the proof of Lemma 2, we obtain that γ̂/(1 + β) ≤ OPT_{1,W}, hence every point of W is at distance at most δγ̂ ≤ (ε′/2) · OPT_{1,W} ≤ (ε′/2) · Δ_W from T. For each point p ∈ W, we define its proxy point π(p) as the closest point to p in T, namely, π(p) = argmin_{q∈T} dist(p, q). Let p, q be two points of W such that dist(p, q) = Δ_W. We have that Δ_W = dist(p, q) ≤ dist(p, π(p)) + dist(π(p), π(q)) + dist(q, π(q)) ≤ Δ_T + ε′ · Δ_W, whence Δ_W ≤ Δ_T/(1 − ε′) = (1 + ε)Δ_T.

Let D_W be the doubling dimension of the current window. By repeating the argument in the proof of Theorem 2, we obtain that the coreset size is |T| = O((64(1 + β)(1 + ε)/ε)^{D_W}), since δ = ε′/(2(1 + β)) and ε′ = ε/(1 + ε). Lemma 4 implies that an estimate of Δ_W within a factor (1 + ε) can be obtained by computing the exact diameter of the coreset T in time O(|T|²). If, due to the exponential dependency on D_W, the coreset size is too large to afford a quadratic computation of its diameter, one can compute instead max_{q∈T} dist(p, q), for an arbitrary point p ∈ T, which yields an estimate of Δ_T within a factor 2, whence an estimate of Δ_W within a factor 2(1 + ε), in linear time.
The following theorem summarizes the results of this section.

Theorem 6 Consider a stream S of points from a metric space under the sliding window model, and let α_W and D_W be, respectively, the aspect ratio and the doubling dimension of the current window W. For fixed parameters ε, β > 0, at any time t the above algorithm is able to return an estimate of the diameter Δ_W within a factor (1 + ε), using working memory of size O((log(α_W)/log(1 + β)) · (64(1 + β)(1 + ε)/ε)^{D_W}).

Experiments
To assess the practical viability of our approach, we designed a set of experiments to:

- compare approximation quality, execution time, and memory usage (in terms of the number of points stored in the working memory) of our k-center algorithm against the state-of-the-art algorithm in [14];
- test the ability of our k-center algorithm to adapt, at each time t, to the specificity of the window W_t, as captured by its doubling dimension D_W and its aspect ratio α_W;
- compare approximation ratio, execution time, and memory usage of our (1 + ε)-approximate diameter algorithm against the state-of-the-art algorithm in [14].

Experimental testbed and datasets
All tests were executed using a Java 13 implementation of the algorithms on a Windows machine with an AMD FX8320 processor and 12 GB of RAM, and the running times of the procedures were measured with System.nanoTime. The points of the datasets are fed to the algorithms through a file input stream. We experimented with datasets often used in the clustering literature. Specifically, we used both a low-dimensional dataset, Higgs, and a higher-dimensional dataset, Covertype, which serves as a stress test for our dimensionality-sensitive approach. The Higgs dataset contains 11 million points representing high-energy particle features generated through Monte-Carlo simulations. The points of this dataset have 28 attributes, 7 of which are a function of all the others. We considered only these seven "high-level" features, hence we regard the data as points in R^7 under the Euclidean distance. The Covertype dataset contains 500 thousand 54-dimensional points from geological observations of US forest biomes, which we interpret as points in R^54 under the Euclidean distance. In some experiments, we will also use artificial datasets consisting of points randomly extracted from (subspaces of) R^100, again under the Euclidean distance.

Comparison with the state-of-the-art k-center algorithm
We designed implementations of the non-oblivious and of the α-oblivious versions of our k-center strategy, respectively referred to as our-sliding and our-oblivious hereinafter, and an implementation of the state-of-the-art sliding window algorithm in [14], referred to as css-sliding. Due to the NP-hardness of the k-center problem, it is unfeasible to compute the optimal solution for each window so as to measure the exact approximation factor of the solutions returned by the algorithms. As a workaround, we assess the quality loss incurred by our space-restricted streaming algorithms with respect to the most accurate polynomial-time sequential approximation algorithm, gon, running on the entire window, hence using unrestricted space.
For the Higgs dataset, we tested several values of k in [10, 100] and several window sizes N in [10^3, 10^6]. For brevity, we report only the results for k = 20, since the behaviors observed for the other values exhibit a similar pattern. For our-sliding and our-oblivious, we set ε = 1 and β = 0.2. For a fair comparison, we searched the parameter space of css-sliding so as to determine a value of its accuracy parameter (the equivalent of our parameter β) enforcing that the algorithms use approximately the same working memory. As a result, for css-sliding we set this parameter to 0.01, which, in all of our tests, makes its working memory comparable to yet slightly larger than the one used by our-sliding and our-oblivious, giving css-sliding a competitive advantage in the comparison with respect to the approximation quality. Similarly, for Covertype we report experiments with k = 20 and window lengths in [10^3, 3·10^5]. Due to the higher dimensionality of the dataset, which calls for a finer control on accuracy, we set ε = 0.5 and β = 0.1 for both our-sliding and our-oblivious, which brings us to set the parameter of css-sliding to 0.01 to attain a similar, yet slightly higher, memory usage of the latter w.r.t. the former algorithms.
The results are reported in the plots of Figs. 1, 2, 3, and 4 for the dataset Higgs, and in the plots of Figs. 5, 6, 7, and 8 for Covertype. In the plots, the blue lines refer to css-sliding, the orange lines to our-sliding, the yellow lines to our-oblivious, and the purple ones to the execution of the sequential algorithm gon on the entire window. Each point in a plot represents an average over 10,000 consecutive windows.
The comparison of the algorithms' memory usage is reported in Figs. 1 and 5. As expected, the working memory required by both css-sliding and our algorithms is mostly insensitive to the window length, while the memory usage of gon clearly grows linearly with it. Note that our-oblivious consistently maintains fewer points in memory than our-sliding, as it does not maintain the data structures relative to infeasible estimates, rebuilding them on the fly when needed. Figures 2 and 6 compare the clustering radii obtained by the four algorithms. Remarkably, on the Higgs dataset our-sliding and our-oblivious, even for the relatively large value ε = 1, return a clustering whose radius essentially coincides with the one returned by running gon on the full window, and is consistently and decidedly smaller than the one returned by css-sliding. Similarly, on the Covertype dataset the radii returned by our algorithms are consistently smaller than the ones returned by css-sliding, albeit not quite matching the radius of gon. By lowering the parameter ε, one would obtain better results at the cost of an increased memory usage. The update time (Figs. 3 and 7) seems rather insensitive to N for css-sliding, our-sliding, and our-oblivious, while it is clearly negligible for gon, where it simply entails discarding the oldest point of the window and inserting the new one. Finally, as shown in Figs. 4 and 8, the query times of css-sliding, our-sliding, and our-oblivious are clearly much smaller than the one of gon, which grows linearly with the window length. The query times of our-sliding and our-oblivious are comparable to those of css-sliding, but slightly higher, especially in the case of the Covertype dataset, which is conceivably due to its higher dimensionality.
Overall, the experiments provide evidence that, with respect to the state-of-the-art algorithm in [14], our algorithms our-sliding and our-oblivious offer an approximate solution whose quality is much closer to the one guaranteed by the best sequential algorithm run on the entire window, within comparable space and time budgets. Moreover, the experiments show that the α-oblivious version of our algorithm consistently maintains fewer points in the working memory, compromising neither execution times nor the quality of the solution.

Adaptiveness to dimensionality and aspect ratio
One of the prominent features of our algorithms is their ability to adapt to the inherent complexity of the window W t , as captured by its dimensionality D W , and, in the case of our-oblivious, also by its aspect ratio α W . The experiments reported below are aimed at testing such adaptiveness.
First, we investigate the dependency of the memory usage as a function of D_W. To this purpose, we used several artificial datasets consisting of points of R^100 randomly extracted from the unit subcube [0, 1]^d, with d ranging from 1 up to 25 (by fixing 100 − d components to zero), making sure to fix the aspect ratio for all the datasets by periodically injecting "extreme" points in the stream (we remark that the doubling dimension of a subspace of [0, 1]^d is linearly related to d). For each of the datasets, we report the number of points maintained in the working memory by our-sliding (we omit for brevity the results of our-oblivious, as they exhibit the same behavior) when used on a window of length N = 10^5 points with parameters β = 0.1, ε = 1, and k = 20. All quantities are averaged over 10,000 consecutive windows and reported together with their 95% confidence intervals. As plotted in Fig. 9, for dimensions in the range [1, 15] the memory growth is clearly exponential in the dimension of the subspace, as suggested by the theory. For larger values of the subspace dimension, the growth slows down as the coreset size approaches the window length.

Fig. 9 Memory usage versus subspace dimension

In a second experiment, we assessed the capability of our algorithms to adapt to the dimensionality of the data dynamically, as the window slides over sets of points of varying dimensionality. We generated an artificial dataset (DD) of random points in R^100, where the first portion of points belongs to a subspace of dimension 1 (i.e., a line), the middle portion belongs to a subspace of dimension 10, and the last portion belongs again to a line. In order to fix other features of the dataset that may influence memory usage, the points are generated so as to make sure that the aspect ratio of all the windows is roughly the same, so that our-oblivious is not influenced by the variability of α_W. As before, we set β = 0.1, ε = 1, and k = 20 for both our-sliding and our-oblivious. The window length is N = 10^4. The memory usage of our-sliding and our-oblivious, plotted in Figs. 10 and 11, respectively, is low in the windows that cover the points in the first and third portions, and much higher in the windows that cover the points in the second portion, with a transition phase for the windows spanning the two types of subspaces. This property is very appealing, as the algorithms automatically adapt to the dimension D_W of the active window W_t to maintain as few points as possible, even without any prior knowledge of the doubling dimension. We observe that the higher level of noise in the plot for our-oblivious is due to the higher variability of the range of guesses Γ_t (which mostly depends on the variability of r_t), as opposed to the fixed range Γ used by our-sliding.
Finally, we present some experiments on an artificial dataset (alpha) of random points in R^100, where the aspect ratio varies substantially from window to window while the doubling dimension remains stable. In order to have multiple aspect ratios in the same dataset, we generated the points as follows. All the points have the first five coordinates distributed uniformly in some range and the others set to 0. In the first portion of the stream, 90% of the points have coordinates in [0, 10] and 10% of the points have coordinates in [0, 100]. In the second portion of the stream, 90% of the points have coordinates in [0, 0.01], hence the minimum distance between any two points will be around 10^3 times smaller than before, and 10% of the points have coordinates in [0, 10], hence the diameter will be around 10 times smaller than before, for an overall factor-100 increase in the aspect ratio. Finally, the last part of the stream has the same distribution as the first one.
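A point generator matching this description can be sketched as follows (a hypothetical reconstruction of the alpha dataset; function name and phase encoding are our own):

```python
import random

def alpha_point(phase):
    """One point of our reconstruction of the 'alpha' dataset: 5 active
    coordinates, the remaining 95 fixed to 0. In phase 2 the ranges
    shrink by 10^3 (90% of points) and 10 (the remaining 10%),
    increasing the aspect ratio by a factor of roughly 100."""
    if phase == 2:
        hi = 10.0 if random.random() < 0.1 else 0.01
    else:
        hi = 100.0 if random.random() < 0.1 else 10.0
    return [random.uniform(0.0, hi) for _ in range(5)] + [0.0] * 95
```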
Once again, we set β = 0.1, ε = 1, and k = 20 for both our-sliding and our-oblivious, and set the window length to N = 10^4. Figure 12a and b show the memory usage of the two algorithms as a function of the step t, while Fig. 12c plots how the extremal values r_t/2 and 2M_t/δ of the range Γ_t (used by our-oblivious to attain obliviousness to the aspect ratio) evolve with t. We first observe that the memory usage of our-oblivious is slightly lower than the one of our-sliding. This feature, as seen in previous experiments, does not arise only in tailor-made artificial datasets but also in real-world ones (see Figs. 1 and 5), giving a competitive advantage to the oblivious version of the algorithm over the non-oblivious one. Somewhat surprisingly, however, the memory usage of our-sliding also seems to adapt to the shape of Γ_t, although it uses a fixed range Γ for the guesses. This phenomenon is due to the fact that at each time t, for each guess γ ∈ Γ \ Γ_t, the number of validation and coreset points maintained by our-sliding is rather small.

Comparison with the state-of-the-art diameter algorithm
As for the k-center problem, we implemented our diameter approximation algorithm, hereinafter referred to as our-diameter, and we tested it against the sliding window algorithm in [14], which we will refer to as css-diameter.
For efficiency, we substituted the expensive quadratic-time computation of the exact diameter, wherever it was required, with the following greedy heuristic (referred to as greedy-diameter) proposed in [32]: starting from a random point p_0, select the point p_1 farthest away from p_0, then select the point p_2 farthest from p_1, and return dist(p_1, p_2). While, theoretically, dist(p_1, p_2) is only guaranteed to be within a factor 2 of the actual diameter, it has been observed that, in most cases, the difference between the two quantities is, in fact, very small [32]. In light of this fact, in the experiments we used greedy-diameter both as a baseline to compute the diameter Δ_W of the window W, using the entire window as input, and as a means of computing the diameter Δ_T of the coreset T. Thus, in the experiments we do not differentiate between the two variants (exact or approximate) of obtaining Δ_T described in Sect. 4. For what concerns the Higgs dataset, we experimented with window lengths varying in [10^3, 10^6] and parameters ε = 0.5 and β = 0.1 for our-diameter, and 0.001 for the accuracy parameter of css-diameter, so as to make sure, for fairness, that css-diameter requires approximately the same amount of working memory as our-diameter (but not less). All the quantities in the plots are averaged over 10,000 consecutive windows. Figure 13 reports the memory usage of both algorithms, together with the (linear) memory usage of the baseline. Figure 14 compares the accuracy exhibited by the algorithms. Specifically, for each value of the window length, the figure reports the ratio of the distance returned by each algorithm to the distance computed by greedy-diameter when run on the entire window. Surprisingly, the solution of css-diameter already exhibits rather good accuracy, albeit, in theory, the algorithm only guarantees a (3 + ε)-approximation.
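The greedy heuristic described above admits a direct implementation (function name ours):

```python
def greedy_diameter(points, dist):
    """Two-sweep heuristic from [32]: pick any p0, let p1 be the point
    farthest from p0 and p2 the point farthest from p1; dist(p1, p2) is
    guaranteed to be within a factor 2 of the true diameter."""
    p0 = points[0]  # an arbitrary (in [32], random) starting point
    p1 = max(points, key=lambda q: dist(p0, q))
    p2 = max(points, key=lambda q: dist(p1, q))
    return dist(p1, p2)
```

The factor-2 guarantee follows from the triangle inequality: if a, b realize the diameter, then dist(a, b) ≤ dist(a, p1) + dist(p1, b) ≤ 2 · dist(p1, p2).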
Our algorithm our-diameter turns out to be slightly more accurate than css-diameter, offering a good trade-off between the quality of the solution and the memory usage. We remark that css-diameter delivers results of comparable quality also for larger values of its parameter (that is, smaller memory usage), hence it seems that the algorithm is less effective in exploiting larger memory budgets. For what concerns the Covertype dataset, we repeated the same experiment with window lengths varying in [10^3, 2·10^5] and parameters ε = 1 and β = 1 for our-diameter, and 0.001 for the parameter of css-diameter, again to ensure, for fairness, that css-diameter requires approximately the same amount of working memory. The results concerning the memory requirements and the approximation are reported in Figs. 15 and 16. Due to the higher dimensionality of the dataset, we were forced to increase the values of ε and β for our-diameter in order to keep the memory requirements reasonable and below the window length. Nonetheless, the approximation quality featured by our-diameter is superior to the one of css-diameter and matches the one of the baseline for smaller window lengths, while it degrades for larger windows, remaining however always within 10% of the estimate provided by the baseline. This degradation is probably caused by the coarser values of ε and β. Regarding update and query times, for both the Higgs and Covertype datasets, we observed for all algorithms the same behaviors discussed for the k-center problem (plots are omitted for brevity). Namely, update and query times are independent of the window length and linear in the working memory, for both our-diameter and css-diameter. Also, the query times of our-diameter and css-diameter are comparable, with the former being slightly larger for the higher-dimensional Covertype dataset.

Conclusions
In this paper, we have shown how to attain coreset constructions yielding accurate streaming algorithms for the k-center and diameter estimation problems under the sliding window model. While the algorithms require very limited amounts of working memory for windows W of low doubling dimension D W , the approach quickly degrades as D W grows large. An interesting, yet challenging, research avenue is to investigate whether this steep dependence on D W can be ameliorated by means of alternative techniques (e.g., the use of randomization).
Funding Open access funding provided by Università degli Studi di Padova within the CRUI-CARE Agreement. This work was supported, in part, by MIUR, the Italian Ministry of Education, University and Research, under PRIN Project n. 20174LF3T8 AHeAD (Efficient Algorithms for HArnessing Networked Data), and grant L. 232 (Dipartimenti di Eccellenza), and by the University of Padova under project SID 2020 RATED-X (Resource-Allocation TradEoffs for Dynamic and eXtreme data).

Code Availability
The code for the experiments is publicly available on GitHub (https://github.com/PaoloPellizzoni/AdaptiveCoreset).

Conflict of interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.