Efficient Quantile Retrieval on Multi-dimensional Data

  • Conference paper
Advances in Database Technology - EDBT 2006 (EDBT 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3896)

Abstract

Given a set of N multi-dimensional points, we study the computation of φ-quantiles according to a ranking function F, which is provided by the user at runtime. Specifically, F computes a score based on the coordinates of each point; our objective is to report the object whose score is the φN-th smallest in the dataset. φ-quantiles provide a succinct summary about the F-distribution of the underlying data, which is useful for online decision support, data mining, selectivity estimation, query optimization, etc. Assuming that the dataset is indexed by a spatial access method, we propose several algorithms for retrieving a quantile efficiently. Analytical and experimental results demonstrate that a branch-and-bound method is highly effective in practice, outperforming alternative approaches by a significant factor.
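To make the problem statement concrete, the following is a minimal sketch of the naive baseline implied by the definition: score every point with F, sort, and return the point at the target rank. It is not the branch-and-bound method the abstract refers to, and it uses no spatial index. The function name `naive_quantile`, the choice of Python, and the rounding of the rank to ceil(φN) are assumptions made for illustration; the abstract only says "the φN-th smallest".

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def naive_quantile(
    points: Sequence[Point],
    F: Callable[[Point], float],
    phi: float,
) -> Point:
    """Return the point whose F-score has rank ceil(phi * N) (1-based).

    Illustrative baseline only: it scores and sorts all N points,
    so it costs O(N log N) and reads the entire dataset.
    """
    if not points:
        raise ValueError("points must be non-empty")
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in (0, 1]")

    # Assumption: a fractional rank is rounded up, and the rank is at least 1.
    k = max(1, math.ceil(phi * len(points)))

    # Sort by score; ties keep their input order because sorted() is stable.
    ranked = sorted(points, key=F)
    return ranked[k - 1]


# Hypothetical usage: F is supplied at query time, here the coordinate sum.
data = [(1, 2), (3, 2), (0, 0), (5, 5), (2, 2)]
median_point = naive_quantile(data, lambda p: sum(p), 0.5)
```

Because F is only known at runtime, the scores cannot be precomputed; a full scan of this kind is the cost that an index-based method aims to avoid.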

Work supported by grants HKU 7380/02E and CityU 1163/04E from Hong Kong RGC, and SRG grant (Project NO: 7001843) from City University of Hong Kong.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yiu, M.L., Mamoulis, N., Tao, Y. (2006). Efficient Quantile Retrieval on Multi-dimensional Data. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_13

  • DOI: https://doi.org/10.1007/11687238_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32960-2

  • Online ISBN: 978-3-540-32961-9

  • eBook Packages: Computer Science, Computer Science (R0)
