Space-Bounded Query Approximation

  • Boris Cule
  • Floris Geerts
  • Reuben Ndindi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9282)

Abstract

When dealing with large amounts of data, exact query answering is not always feasible. We propose a query approximation method that, given an upper bound on the amount of data that can be used (i.e., for which query evaluation is still feasible), identifies a part C of the data D that (i) fits in the available space budget; and (ii) provides accurate query results. That is, for a given query Q, the query result Q(C) is close to the exact answer Q(D). In this paper, we present the theoretical framework underlying our query approximation method and provide an experimental validation of the approach.
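The problem setting can be illustrated with a minimal sketch. It is not the method of the paper: the selection rule here is plain uniform sampling, used only as a stand-in, and the names `select_subset`, `relative_error`, and `budget` are hypothetical. It shows the two requirements from the abstract: C fits in the space budget, and Q(C) is compared with Q(D).

```python
import random
from statistics import mean


def select_subset(data, budget, seed=0):
    """Return a part C of `data` with at most `budget` items.

    Assumption: uniform random sampling stands in for whatever
    selection strategy an actual approximation method would use.
    """
    if budget >= len(data):
        return list(data)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(data, budget)


def relative_error(query, data, subset):
    """Compare Q(C) with the exact answer Q(D) for a numeric query Q."""
    exact = query(data)
    approx = query(subset)
    if exact == 0:
        return abs(approx)
    return abs(approx - exact) / abs(exact)


# D: the full data set; Q: an aggregate query (here, the mean).
D = [float(x) for x in range(1, 10001)]
budget = 500

C = select_subset(D, budget)
err = relative_error(mean, D, C)
print(f"|C| = {len(C)}, relative error of Q(C) vs Q(D) = {err:.4f}")
```

A different selection rule can be swapped in for `select_subset` without changing how the error is measured, which makes it easy to compare strategies under the same budget.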

Keywords

Big data query processing · Query approximation · Data reduction


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. University of Antwerp, Antwerp, Belgium
