Adaptive Spatial Partitioning for Multidimensional Data Streams

  • John Hershberger
  • Nisheeth Shrivastava
  • Subhash Suri
  • Csaba D. Tóth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3341)

Abstract

We propose a space-efficient scheme for summarizing multidimensional data streams. Our scheme can be used for several geometric queries, including natural spatial generalizations of well-studied single-dimensional queries such as icebergs and quantiles.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • John Hershberger (1, 2)
  • Nisheeth Shrivastava (3)
  • Subhash Suri (3)
  • Csaba D. Tóth (3)

  1. Mentor Graphics Corp., Wilsonville, USA
  2. (by courtesy) Computer Science Dept., University of California, Santa Barbara
  3. Computer Science Dept., University of California, Santa Barbara, USA
