Skip to main content
Log in

Hierarchically compressed wavelet synopses

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

The wavelet decomposition is a proven tool for constructing concise synopses of large data sets that can be used to obtain fast approximate answers. Existing research studies focus on selecting an optimal set of wavelet coefficients to store so as to minimize some error metric, without however seeking to reduce the size of the wavelet coefficients themselves. In many real data sets the existence of large spikes in the data values results in many large coefficient values lying on paths of a conceptual tree structure known as the error tree. To exploit this fact, we introduce in this paper a novel compression scheme for wavelet synopses, termed hierarchically compressed wavelet synopses, that fully exploits hierarchical relationships among coefficients in order to reduce their storage. Our proposed compression scheme allows for a larger number of coefficients to be stored for a given space constraint thus resulting in increased accuracy of the produced synopsis. We propose optimal, approximate and greedy algorithms for constructing hierarchically compressed wavelet synopses that minimize the sum squared error while not exceeding a given space budget. Extensive experimental results on both synthetic and real-world data sets validate our novel compression scheme and demonstrate the effectiveness of our algorithms against existing synopsis construction algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  1. Baraniuk, R., Jones, D.: A signal-dependent time-frequency representation: fast algorithm for optimal kernel design. ISP 42(1), 134– (1994)

    Google Scholar 

  2. Chakrabarti, K., Garofalakis, M.N., Rastogi, R., Shim, K.: Approximate query processing using wavelets. In: Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 111–122 (2000)

  3. Cormode, G., Garofalakis, M., Sacharidis, D.: Fast approximate wavelet tracking on streams. In: Proceedings of the International Conference on Extending Database Technology (EDBT) (2006)

  4. Deligiannakis, A., Garofalakis, M., Roussopoulos, N.: A fast approximation scheme for probabilistic wavelet synopses. In: Proceedings of the 17th International Conference on Scientific and Statistical Database Management (SSDBM) (2005)

  5. Deligiannakis, A., Garofalakis, M., Roussopoulos, N.: Extended wavelets for multiple measures. ACM Trans. Database Systems 32(2) (2007)

  6. Deligiannakis, A., Roussopoulos, N.: Extended wavelets for multiple measures. In: Proceedings of ACM International Conference on Management of Data (SIGMOD), pp. 229–240 (2003)

  7. Deshpande, A., Guestrin, C., Madden, S., Hellerstein, J., Hong, W.: Model-driven data acquisition in sensor networks. In: VLDB (2004)

  8. Garofalakis, M., Gibbons, P.B.: Wavelet synopses with error guarantees. In: Proceedings of ACM International Conference on Management of Data (SIGMOD), pp. 476–487 (2002)

  9. Garofalakis, M., Kumar, A.: Deterministic wavelet thresholding for maximum-error metrics. In: Proceedings of the ACM Symposium on Principles of Database Systems (PODS), pp. 166–176 (2004)

  10. Garofalakis, M., Kumar, A.: Wavelet synopses for general error metrics. ACM Trans. Database Systems 30(4), 888– (2005)

    Article  Google Scholar 

  11. Gilbert, A.C., Kotidis, Y., Muthukrishnan, S., Strauss, M.J.: Surfing wavelets on streams: one-pass summaries for approximate aggregate queries. In: Proceedings of the International Conference on Very Large Data Bases (VLDB) (2001)

  12. Guha, S.: Space efficiency in synopsis construction algorithms. In: Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 409–420 (2005)

  13. Guha, S., Harb, B.: Wavelet synopsis for data streams: minimizing non-euclidean error. In: Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (KDD), pp. 88–97 (2005)

  14. Guha, S., Harb, B.: Approximation algorithms for wavelet transform coding of data streams. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA) (2006)

  15. Guha, S., Kim, C., Shim, K.: Xwave: Approximate extended wavelets for streaming data. In: Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 288–299 (2004)

  16. Jahangiri, M., Sacharidis, D., Shahabi, C.: Shift-Split: I/O efficient maintenance of wavelet-transformed multidimensional data. In: Proceedings of ACM International Conference on Management of Data (SIGMOD) (2005)

  17. Jawerth, B., Sweldens, W.: An overview of wavelet based multiresolution analyses. SIAM Rev. 36(3), 377– (1994)

    Article  MATH  MathSciNet  Google Scholar 

  18. Karras, P., Mamoulis, N.: One-pass wavelet synopses for maximum-error metrics. In: Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 421–432 (2005)

  19. Mallat S. (1999) A Wavelet Tour of Signal Processing, 2nd edn. Academic Press, Ney York

    MATH  Google Scholar 

  20. Matias, Y., Urieli, D.: Inner-product based wavelet synopses for range-sum queries. In: Proceedings of the 14th Annual European Symposium on Algorithms (ESA), pp. 504–515 (2006)

  21. Matias, Y., Vitter, J.S., Wang, M.: Wavelet-based histograms for selectivity estimation. In: Proceedings of ACM International Conference on Management of Data (SIGMOD), pp. 448–459 (1998)

  22. Matias, Y., Vitter, J.S., Wang, M.: Dynamic maintenance of wavelet-based histograms. In: Proceedings of International Conference on Very Large Data Bases (VLDB), pp. 101–110 (2000)

  23. Muthukrishnan, S.: Subquadratic algorithms for workload-aware haar wavelet synopses. In: Proceedings of the IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS) (2005)

  24. Natse, A., Rastogi, R., Shim, K.: WALRUS: A similarity retrieval algorithm for image databases. In: Proceedings of ACM International Conference on Management of Data (SIGMOD) (1999)

  25. Poosala, V., Ioannidis, Y.E.: Selectivity estimation without the attribute value independence assumption. In: VLDB (1997)

  26. Stollnitz, E.J., Derose, T.D., Salesin, D.H.: Wavelets for Computer Graphics: Theory and Applications. Morgan Kaufmann (1996)

  27. Urieli, D., Matias, Y.: Optimal workload-based weighted wavelet synopses. In: Proceedings of International Conference on Database Theory (ICDT) (2005)

  28. Vitter, J.S., Wang, M.: Approximate computation of multidimensional aggregates of sparse data using wavelets. In: Proceedings of ACM International Conference on Management of Data (SIGMOD, pp. 193–204. ACM Press (1999)

  29. Vitter, J.S., Wang, M., Iyer, B.R.: Data cube approximation and histograms via wavelets. In: Proceedings of the International Conference on Information and Knowledge Management (CIKM), pp. 96–104 (1998)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Dimitris Sacharidis.

Additional information

This work has been funded by the project PENED 2003. The project is co-financed 75% of public expenditure through EC—European Social Fund, 25% of public expenditure through Ministry of Development—General Secretariat of Research and Technology and through private sector, under measure 8.3 of OPERATIONAL PROGRAMME “COMPETITIVENESS" in the 3rd Community Support Programme.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Sacharidis, D., Deligiannakis, A. & Sellis, T. Hierarchically compressed wavelet synopses. The VLDB Journal 18, 203–231 (2009). https://doi.org/10.1007/s00778-008-0096-z

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-008-0096-z

Keywords

Navigation