Optimal Workload-Based Weighted Wavelet Synopses

  • Yossi Matias
  • Daniel Urieli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3363)

Abstract

In recent years, wavelets have been shown to be effective data synopses. We are concerned with the problem of efficiently finding wavelet synopses for massive data sets in situations where information about the query workload is available. We present linear-time, I/O-optimal algorithms for building optimal workload-based wavelet synopses for point queries. The synopses are based on a novel construction of weighted inner products and use weighted wavelets that are adapted to those products. The synopses are optimal in the sense that the subset of retained coefficients is the best possible for the bases in use with respect to either the mean-squared absolute error or the mean-squared relative error. For the latter, this is the first optimal wavelet synopsis even for the regular, non-workload-based case. Experimental results demonstrate the advantage obtained by the new optimal wavelet synopses, as well as the robustness of the synopses to deviations in the actual query workload.
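The idea of a weighted inner product with a wavelet basis adapted to it can be illustrated with a small sketch. The code below is not the authors' algorithm: it is a quadratic-time illustration (the paper reports linear-time, I/O-optimal algorithms), and every function name in it is hypothetical. It assumes the workload is given as a positive weight w_i per data point, uses the inner product <u, v>_w = sum_i w_i u_i v_i, builds a Haar-like basis that is orthonormal under that product, and keeps the coefficients of largest magnitude. By Parseval's identity in this basis, the workload-weighted squared error of the synopsis equals the sum of the squared dropped coefficients, which is why keeping the largest ones is the best choice for this basis.

```python
import math


def weighted_inner(u, v, weights):
    """Weighted inner product <u, v>_w = sum_i w_i * u_i * v_i (assumed form)."""
    return sum(w * x * y for w, x, y in zip(weights, u, v))


def weighted_haar_basis(weights):
    """Build a Haar-like basis that is orthonormal under weighted_inner.

    Illustrative O(n^2) construction; requires a power-of-two length
    and strictly positive weights.
    """
    n = len(weights)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    assert all(w > 0 for w in weights), "weights must be positive"
    # Scaling function: a constant vector with unit weighted norm.
    basis = [[1.0 / math.sqrt(sum(weights))] * n]
    size = n
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            mid, end = start + half, start + size
            w_left = sum(weights[start:mid])
            w_right = sum(weights[mid:end])
            # Heights chosen so the function is weighted-orthogonal to
            # constants on its support and has unit weighted norm.
            a = math.sqrt(w_right / (w_left * (w_left + w_right)))
            b = math.sqrt(w_left / (w_right * (w_left + w_right)))
            vec = [0.0] * n
            for i in range(start, mid):
                vec[i] = a
            for i in range(mid, end):
                vec[i] = -b
            basis.append(vec)
        size = half
    return basis


def build_synopsis(data, weights, budget):
    """Keep the `budget` coefficients of largest magnitude."""
    basis = weighted_haar_basis(weights)
    coeffs = [weighted_inner(data, psi, weights) for psi in basis]
    order = sorted(range(len(coeffs)), key=lambda j: abs(coeffs[j]), reverse=True)
    return {j: coeffs[j] for j in order[:budget]}, basis


def reconstruct(synopsis, basis):
    """Approximate the data from the retained coefficients only."""
    n = len(basis[0])
    out = [0.0] * n
    for j, c in synopsis.items():
        for i in range(n):
            out[i] += c * basis[j][i]
    return out


def weighted_sse(data, approx, weights):
    """Workload-weighted sum of squared point-query errors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, data, approx))
```

Calling `build_synopsis(data, weights, budget)` with weights that reflect how often each point is queried, then `reconstruct` and `weighted_sse`, gives the weighted error of the resulting synopsis. With uniform weights the construction reduces to the ordinary normalized Haar basis, up to a constant scale factor.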

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Yossi Matias (1)
  • Daniel Urieli (1)
  1. School of Computer Science, Tel-Aviv University
