Adaptive Spatial Partitioning for Multidimensional Data Streams

  • John Hershberger
  • Nisheeth Shrivastava
  • Subhash Suri
  • Csaba D. Tóth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3341)

Abstract

We propose a space-efficient scheme for summarizing multidimensional data streams. Our scheme can be used for several geometric queries, including natural spatial generalizations of well-studied single-dimensional queries such as icebergs and quantiles.

Keywords

Data Stream, Range Query, Frequent Item, Cold Spot, Range Counting
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • John Hershberger (1, 2)
  • Nisheeth Shrivastava (3)
  • Subhash Suri (3)
  • Csaba D. Tóth (3)

  1. Mentor Graphics Corp., Wilsonville, USA
  2. (by courtesy) Computer Science Dept., University of California, Santa Barbara
  3. Computer Science Dept., University of California, Santa Barbara, USA
