Strategies for complex data cube queries

Zhang, Shichao; Wang, Rifeng; Jin, Zhi

doi:10.1007/s10489-008-0130-2

Published: 20 June 2008

Applied Intelligence, volume 31, pages 332–346 (2009)

Shichao Zhang¹,²,
Rifeng Wang³ &
Zhi Jin⁴

Abstract

This paper proposes a computation method for holistic multi-feature cube (MF-Cube) queries based on the characteristics of MF-Cubes. Three simple yet efficient strategies are designed to optimize the dependent complex aggregate at multiple granularities for a complex data-mining query within data cubes. One strategy is the computation of Holistic MF-Cube queries using the PDAP (Part Distributive Aggregate Property). More efficiency is gained by another strategy, that of dynamic subset data selection (the iceberg query technique), which reduces the size of the materialized data cubes. To extend this efficiency further, the second approach may adopt the chunk-based caching technique that reuses the output of previous queries. By combining these three strategies, we design an algorithm called the PDIC (Part Distributive Iceberg Chunk). We experimentally evaluate this algorithm using synthetic and real-world datasets and demonstrate that our approach delivers up to approximately twice the performance efficiency of traditional computation methods.
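
As a rough illustration of the iceberg query idea mentioned in the abstract, the sketch below computes COUNT for every group-by of a small fact table and keeps only the cells whose count reaches a threshold. It is not the authors' PDIC algorithm: the function name `iceberg_cube`, the parameter `min_count`, and the sample rows are all hypothetical, and the code recomputes each group-by from the base rows without any pruning or chunk-based caching.

```python
from collections import Counter
from itertools import combinations


def iceberg_cube(rows, dims, min_count):
    """Naive iceberg cube over COUNT (hypothetical helper, for illustration only).

    Returns a dict mapping (group_by_dims, cell_values) to the cell's count,
    keeping only cells whose count is at least min_count.
    """
    result = {}
    # Enumerate every subset of dimensions, from the empty group-by (grand total)
    # up to the full set of dimensions.
    for k in range(len(dims) + 1):
        for group in combinations(range(len(dims)), k):
            # Count the base rows that fall into each cell of this group-by.
            counts = Counter(tuple(row[i] for i in group) for row in rows)
            for cell, n in counts.items():
                # Iceberg condition: discard cells below the threshold.
                if n >= min_count:
                    result[(tuple(dims[i] for i in group), cell)] = n
    return result


# Made-up sample data: (city, year, item)
rows = [
    ("Sydney", "2008", "book"),
    ("Sydney", "2008", "pen"),
    ("Sydney", "2009", "book"),
    ("Guilin", "2008", "book"),
]
cube = iceberg_cube(rows, ("city", "year", "item"), min_count=2)
for key in sorted(cube):
    print(key, cube[key])
```

With the threshold set to 2, sparse cells such as the single Guilin row are never materialized, which is the size reduction the abstract attributes to dynamic subset data selection. COUNT is used here because it is a simple distributive aggregate; the holistic measures the paper targets would need different handling.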

Author information

Authors and Affiliations

Faculty of Information Technology, University of Technology Sydney, P.O. Box 123, Broadway, Sydney, NSW, 2007, Australia
Shichao Zhang
College of CS and IT, Guangxi Normal University, Guilin, China
Shichao Zhang
Department of Computer Science, Guangxi University of Technology, Liuzhou, China
Rifeng Wang
School of EE and CS, Peking University, Beijing, China
Zhi Jin

Corresponding author

Correspondence to Shichao Zhang.

About this article

Cite this article

Zhang, S., Wang, R. & Jin, Z. Strategies for complex data cube queries. Appl Intell 31, 332–346 (2009). https://doi.org/10.1007/s10489-008-0130-2

Received: 05 January 2008
Accepted: 02 June 2008
Published: 20 June 2008
Issue Date: December 2009
DOI: https://doi.org/10.1007/s10489-008-0130-2
