
Lossless Reduction of Datacubes

  • Alain Casali
  • Rosine Cicchetti
  • Lotfi Lakhal
  • Noël Novelli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4080)

Abstract

Datacubes are especially useful for efficiently answering queries on data warehouses. Nevertheless, the amount of aggregated data they generate is far more voluminous than the initial data, which is itself very large. Recent research has addressed concise representations of datacubes in order to reduce their size, and the approach presented in this paper follows the same trend. We propose a concise representation, called Partition Cube, based on the concept of partition, and define an algorithm to compute it. Various experiments compare our approach with other methods following this trend. The comparison covers the efficiency of the algorithms computing the representations, their main memory requirements, and the storage space required.
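To make the size problem concrete, here is a minimal Python sketch of the two standard notions the abstract relies on: the partition of a relation by a set of dimension attributes, and the full datacube obtained by aggregating over every subset of the dimensions. It is not the paper's Partition Cube algorithm, which the abstract does not describe. The function names (`partition`, `datacube_sum`), the toy relation, and the choice of SUM as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, dims):
    """Group row indices into classes of rows that agree on every attribute in `dims`."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[d] for d in dims)].append(i)
    return dict(classes)


def datacube_sum(rows, dimensions, measure):
    """Naive full datacube: one SUM per group, for every subset of `dimensions`."""
    cube = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            for key, ids in partition(rows, dims).items():
                cube[(dims, key)] = sum(rows[i][measure] for i in ids)
    return cube


# Toy relation, invented for illustration only.
rows = [
    {"city": "Paris", "product": "A", "year": 2005, "sales": 10},
    {"city": "Paris", "product": "B", "year": 2005, "sales": 20},
    {"city": "Lyon", "product": "A", "year": 2006, "sales": 5},
]

cube = datacube_sum(rows, ("city", "product", "year"), "sales")
print(len(rows), "tuples ->", len(cube), "aggregated cells")
```

On this toy input, 3 tuples yield 18 aggregated cells, which illustrates why the cube can be much larger than the relation it summarizes. Many of those cells are backed by the same class of tuples, and that kind of redundancy is what concise representations in general try to exploit.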

Keywords

Concise Representation; Data Cube; Aggregative Function; Dimension Attribute; Original Relation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Alain Casali (1)
  • Rosine Cicchetti (1)
  • Lotfi Lakhal (1)
  • Noël Novelli (1)
  1. Laboratoire d’Informatique Fondamentale de Marseille (LIF), CNRS UMR 6166, Université de la Méditerranée, Marseille, France
