
Building Data Synopses Within a Known Maximum Error Bound

  • Chaoyi Pang
  • Qing Zhang
  • David Hansen
  • Anthony Maeder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4505)

Abstract

The construction of Haar wavelet synopses for large data sets has proven to be a useful tool for data approximation. Recently, research on constructing wavelet synopses with a guaranteed maximum error has gained attention. Two relevant problems have been proposed. One is the size-bounded problem, which requires constructing a synopsis of a given size that minimizes the maximum error. The other is the error-bounded problem, which requires building a minimum-sized synopsis that satisfies a given error bound. The optimal algorithms for these two problems take O(N^2) time. In this paper, we provide new algorithms for building error-bounded synopses. We first provide several property-based pruning techniques, which can greatly improve the performance of optimal error-bounded synopsis construction. We then demonstrate the efficiency and effectiveness of our techniques through extensive experiments.
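To make the two problem statements concrete, the following minimal sketch shows what a Haar wavelet synopsis and its maximum error are. It is not the paper's algorithm and does no pruning. It assumes the common unnormalized Haar convention (pairwise averages and half-differences), and the function names `haar_decompose`, `haar_reconstruct`, and `synopsis_max_error` are hypothetical, chosen for illustration only.

```python
def haar_decompose(data):
    """Unnormalized Haar transform (assumed convention).

    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    coeffs = []
    current = list(data)
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        details = [(current[i] - current[i + 1]) / 2 for i in pairs]
        coeffs = details + coeffs  # finer levels end up further right
        current = averages
    return current + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose, expanding one level at a time."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        expanded = []
        for avg, det in zip(current, details):
            expanded.extend([avg + det, avg - det])
        pos += len(current)
        current = expanded
    return current


def synopsis_max_error(data, kept_indices):
    """Maximum absolute error when only the kept coefficients are stored.

    Coefficients whose index is not in kept_indices are treated as zero.
    """
    coeffs = haar_decompose(data)
    synopsis = [c if i in kept_indices else 0.0 for i, c in enumerate(coeffs)]
    approx = haar_reconstruct(synopsis)
    return max(abs(x - y) for x, y in zip(data, approx))


data = [2, 2, 0, 2, 3, 5, 4, 4]
print(haar_decompose(data))                 # [2.75, -1.25, 0.5, 0.0, 0.0, -1.0, -1.0, 0.0]
print(synopsis_max_error(data, {0, 1, 2}))  # 1.0
```

In these terms, the size-bounded problem fixes the number of kept coefficients and asks for the smallest achievable maximum error, while the error-bounded problem fixes an error bound and asks for the fewest kept coefficients whose maximum error stays within it.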

Keywords

Leaf Node; Pruning Strategy; Pruning Technique; Error Tree; Synopsis Construction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Chaoyi Pang (1)
  • Qing Zhang (1)
  • David Hansen (1)
  • Anthony Maeder (1)

  1. eHealth Research Centre, ICT CSIRO, Australia
