Competitive Analysis of Aggregate Max in Windowed Streaming

  • Luca Becchetti
  • Elias Koutsoupias
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5555)

Abstract

We consider the problem of maintaining a fixed number k of items observed over a data stream, so as to optimize the maximum value over a fixed number n of recent observations. Unlike previous approaches, we use the competitive analysis framework and compare the performance of the online streaming algorithm against an optimal adversary that knows the entire sequence in advance. We consider the problem of maximizing the aggregate max, i.e., the sum of the values of the largest items in the algorithm’s memory over the entire sequence. For this problem, we prove an asymptotically tight competitive ratio, achieved by a simple heuristic, called partition-greedy, that performs stream updates efficiently and has almost optimal performance. In contrast, we prove that the problem of maximizing, for every time t, the value maintained by the online algorithm in memory, is considerably harder: in particular, we show a tight competitive ratio that depends on the maximum value of the stream. We further prove negative results for the closely related problem of maintaining the aggregate minimum and for the generalized version of the aggregate max problem in which every item comes with an individual window.
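To make the aggregate max objective concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that an item arriving at time s stays in the window while t - s < n, and that the score at time t is the largest unexpired value held in memory (0 if none). The function names `window_max_upper_bound` and `block_greedy_aggregate_max` are hypothetical. The block-based policy is only one plausible reading of the name "partition-greedy": it cuts the stream into blocks of length n and dedicates one of the k memory slots, chosen round-robin, to the largest value seen so far in the current block. The abstract does not specify the actual algorithm, so treat this as an illustration of the objective only.

```python
from typing import List, Optional, Tuple

Item = Tuple[int, float]  # (arrival time, value)


def window_max_upper_bound(stream: List[float], n: int) -> float:
    """Sum over all times t of the maximum of the last n observations.

    This is what unlimited memory would achieve, so it upper-bounds the
    aggregate max of any algorithm restricted to k memory slots.
    """
    return sum(max(stream[max(0, t - n + 1): t + 1]) for t in range(len(stream)))


def block_greedy_aggregate_max(stream: List[float], n: int, k: int) -> float:
    """Aggregate max of an assumed block-based greedy policy with k slots."""
    memory: List[Optional[Item]] = [None] * k
    total = 0.0
    for t, value in enumerate(stream):
        block = t // n          # index of the length-n block containing t
        slot = block % k        # slot dedicated to this block (round-robin)
        held = memory[slot]
        # Overwrite the slot if it is empty, holds an item from an older
        # block, or holds a smaller value from the current block.
        if held is None or held[0] // n != block or value > held[1]:
            memory[slot] = (t, value)
        # Score at time t: best value in memory that is still in the window.
        live = [item[1] for item in memory if item is not None and t - item[0] < n]
        total += max(live, default=0.0)
    return total


if __name__ == "__main__":
    stream = [5, 4, 1, 1, 1, 1]
    print(window_max_upper_bound(stream, n=3))         # 21
    print(block_greedy_aggregate_max(stream, n=3, k=2))  # 18.0
```

On the stream 5, 4, 1, 1, 1, 1 with n = 3 and k = 2, the sketch scores 18 against an upper bound of 21. The gap arises at time 3, when the value 5 has expired and the value 4, which is still inside the window, was never stored.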


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Luca Becchetti (1)
  • Elias Koutsoupias (2)
  1. “Sapienza” University of Rome, Italy
  2. University of Athens, Greece
