
Strategies for complex data cube queries

Applied Intelligence

Abstract

This paper proposes a computation method for holistic multi-feature cube (MF-Cube) queries based on the characteristics of MF-Cubes. Three simple yet efficient strategies are designed to optimize dependent complex aggregates at multiple granularities for complex data-mining queries within data cubes. The first strategy computes holistic MF-Cube queries using the Part Distributive Aggregate Property (PDAP). The second, dynamic subset data selection (the iceberg query technique), gains further efficiency by reducing the size of the materialized data cubes. The third, chunk-based caching, extends this efficiency by reusing the output of previous queries. By combining these three strategies, we design an algorithm called PDIC (Part Distributive Iceberg Chunk). We experimentally evaluate this algorithm on synthetic and real-world datasets and demonstrate that our approach delivers up to approximately twice the performance efficiency of traditional computation methods.
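To make two of the ideas named in the abstract concrete, the following is a minimal Python sketch of an iceberg cube built on distributive aggregates. It is not the authors' PDIC algorithm, and it omits chunk-based caching and the PDAP treatment of holistic aggregates. All names (`group_by`, `roll_up`, `iceberg_cube`, `min_count`) and the sample rows are assumptions made for illustration. The sketch relies on SUM and COUNT being distributive, so every coarser group-by can be derived from the finest-grained cells instead of rescanning the raw rows, and it applies the iceberg condition (a minimum support count) only when a cuboid is emitted.

```python
from collections import defaultdict
from itertools import combinations


def group_by(rows, dims, measure):
    """Return {key: (sum, count)} for each combination of values over dims."""
    cells = defaultdict(lambda: [0, 0])
    for row in rows:
        key = tuple(row[d] for d in dims)
        cells[key][0] += row[measure]
        cells[key][1] += 1
    return {k: tuple(v) for k, v in cells.items()}


def roll_up(cells, dims, keep):
    """Derive a coarser group-by from finer cells.

    This is valid because SUM and COUNT are distributive: partial
    results can be combined without returning to the raw rows.
    """
    idx = [dims.index(d) for d in keep]
    out = defaultdict(lambda: [0, 0])
    for key, (total, count) in cells.items():
        sub = tuple(key[i] for i in idx)
        out[sub][0] += total
        out[sub][1] += count
    return {k: tuple(v) for k, v in out.items()}


def iceberg_cube(rows, dims, measure, min_count):
    """Compute every cuboid, keeping only cells with count >= min_count."""
    # Roll up from the unfiltered base cells; filtering first would
    # drop rows that still contribute to coarser cells.
    base = group_by(rows, dims, measure)
    cube = {}
    for r in range(len(dims), -1, -1):
        for keep in combinations(dims, r):
            cells = roll_up(base, dims, list(keep))
            cube[keep] = {k: v for k, v in cells.items() if v[1] >= min_count}
    return cube


# Hypothetical sample data, for illustration only.
rows = [
    {"region": "east", "item": "tv", "sales": 10},
    {"region": "east", "item": "tv", "sales": 5},
    {"region": "east", "item": "pc", "sales": 7},
    {"region": "west", "item": "tv", "sales": 3},
]
cube = iceberg_cube(rows, ["region", "item"], "sales", min_count=2)
```

With `min_count=2`, the finest cuboid keeps only the `("east", "tv")` cell, while the `("region",)` cuboid still reports all three eastern rows because the roll-up starts from the unfiltered base cells. A holistic aggregate such as MEDIAN could not be rolled up this way, which is the case the paper's PDAP strategy addresses.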



Author information

Corresponding author

Correspondence to Shichao Zhang.

About this article

Cite this article

Zhang, S., Wang, R. & Jin, Z. Strategies for complex data cube queries. Appl Intell 31, 332–346 (2009). https://doi.org/10.1007/s10489-008-0130-2

  • DOI: https://doi.org/10.1007/s10489-008-0130-2
