Abstract
Given a set of N multi-dimensional points, we study the computation of φ-quantiles according to a ranking function F, which is provided by the user at runtime. Specifically, F computes a score from the coordinates of each point; our objective is to report the point whose score is the φN-th smallest in the dataset. φ-quantiles provide a succinct summary of the F-distribution of the underlying data, which is useful for online decision support, data mining, selectivity estimation, and query optimization. Assuming that the dataset is indexed by a spatial access method, we propose several algorithms for retrieving a quantile efficiently. Analytical and experimental results demonstrate that a branch-and-bound method is highly effective in practice, outperforming alternative approaches by a significant factor.
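To make the problem statement concrete, the following is a minimal sketch of a naive baseline that scores every point and selects the one with the φN-th smallest score. It is not the branch-and-bound method of the paper and uses no spatial index. The function name `naive_quantile`, the rounding of φN up to the next integer, and the example ranking function are assumptions made for illustration only.

```python
import heapq
import math
from typing import Callable, Sequence

Point = Sequence[float]


def naive_quantile(points: Sequence[Point],
                   score: Callable[[Point], float],
                   phi: float) -> Point:
    """Return the point whose score is the ceil(phi * N)-th smallest.

    Hypothetical baseline: it evaluates the ranking function on every
    point, so it scans the whole dataset instead of pruning with an index.
    """
    if not points:
        raise ValueError("points must be non-empty")
    if not 0 < phi <= 1:
        raise ValueError("phi must lie in (0, 1]")
    # Assumption: a fractional rank phi * N is rounded up.
    k = max(1, math.ceil(phi * len(points)))
    # nsmallest returns the k lowest-scoring points in ascending order,
    # so the last one holds the k-th smallest score.
    return heapq.nsmallest(k, points, key=score)[-1]


# Example ranking function F supplied at query time: sum of coordinates.
def coordinate_sum(p: Point) -> float:
    return sum(p)


sample = [(1, 2), (3, 1), (0, 0), (5, 5)]
median_point = naive_quantile(sample, coordinate_sum, 0.5)
```

Because the ranking function is passed in as an argument, the same dataset can be queried with a different F on each call, which mirrors the runtime-supplied ranking function described in the abstract.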
Work supported by grants HKU 7380/02E and CityU 1163/04E from Hong Kong RGC, and SRG grant (Project No. 7001843) from City University of Hong Kong.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yiu, M.L., Mamoulis, N., Tao, Y. (2006). Efficient Quantile Retrieval on Multi-dimensional Data. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_13
DOI: https://doi.org/10.1007/11687238_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science, Computer Science (R0)