Complex aggregation at multiple granularities

  • Kenneth A. Ross
  • Divesh Srivastava
  • Damianos Chatziantoniou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1377)

Abstract

Datacube queries compute simple aggregates at multiple granularities. In this paper we examine the more general and useful problem of computing a complex subquery involving multiple dependent aggregates at multiple granularities. We call such queries “multi-feature cubes.” An example is “Broken down by all combinations of month and customer, find the fraction of the total sales in 1996 of a particular item due to suppliers supplying within 10% of the minimum price (within the group), showing all subtotals across each dimension.” We classify multi-feature cubes based on the extent to which fine granularity results can be used to compute coarse granularity results; this classification includes distributive, algebraic and holistic multi-feature cubes. We provide syntactic sufficient conditions to determine when a multi-feature cube is either distributive or algebraic. This distinction is important because, as we show, existing datacube evaluation algorithms can be used to compute multi-feature cubes that are distributive or algebraic, without any increase in I/O complexity. We evaluate the CPU performance of computing multi-feature cubes using the datacube evaluation algorithm of Ross and Srivastava. Using a variety of synthetic, benchmark and real-world data sets, we demonstrate that the CPU cost of evaluating distributive multi-feature cubes is comparable to that of evaluating simple datacubes. We also show that a variety of holistic multi-feature cubes can be evaluated with a manageable overhead compared to the distributive case.
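The example query in the abstract can be made concrete with a small sketch. The Python below is a naive illustration, not the datacube evaluation algorithm of Ross and Srivastava: it recomputes every granularity directly from the base rows. The row layout, the sample data, and the names `ROWS`, `DIMS`, `ALL` and `multi_feature_cube` are all assumptions made for illustration. "Within 10% of the minimum price" is read here as price <= 1.1 * minimum price within the group.

```python
from itertools import combinations

# Hypothetical base rows: (month, customer, supplier, price, sales).
ROWS = [
    ("Jan", "A", "s1", 10.0, 100.0),
    ("Jan", "A", "s2", 10.5, 50.0),
    ("Jan", "B", "s1", 12.0, 30.0),
    ("Feb", "A", "s3", 20.0, 40.0),
    ("Feb", "B", "s2", 9.0, 60.0),
]
DIMS = ("month", "customer")  # grouping dimensions, stored at row positions 0 and 1
ALL = "ALL"                   # marker for a dimension that has been aggregated away


def multi_feature_cube(rows, dims=DIMS):
    """For every subset of the grouping dimensions, compute per group the
    fraction of total sales due to suppliers priced within 10% of the
    group's minimum price."""
    result = {}
    for k in range(len(dims) + 1):
        for kept in combinations(range(len(dims)), k):
            # Partition the rows at this granularity.
            groups = {}
            for row in rows:
                key = tuple(row[i] if i in kept else ALL for i in range(len(dims)))
                groups.setdefault(key, []).append(row)
            for key, members in groups.items():
                # First aggregate: minimum price within the group.
                min_price = min(r[3] for r in members)
                # Dependent aggregate: sales of rows close to that minimum.
                near_min = sum(r[4] for r in members if r[3] <= 1.1 * min_price)
                total = sum(r[4] for r in members)
                result[key] = near_min / total
    return result


if __name__ == "__main__":
    for key, fraction in sorted(multi_feature_cube(ROWS).items()):
        print(key, round(fraction, 3))
```

On this toy data every (month, customer) group has a fraction of 1.0, yet the grand total is 60/280, because the minimum price changes once groups are merged. The coarse result therefore cannot be derived from the fine-grained fractions alone, which is the kind of dependency the classification in the abstract is concerned with.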

References

  1. [AAD+96]
    S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. In Proceedings of VLDB, pages 506–521, 1996.
  2. [AGS97]
    R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proceedings of IEEE ICDE, 1997.
  3. [CR96]
    D. Chatziantoniou and K. A. Ross. Querying multiple features of groups in relational databases. In Proceedings of VLDB, pages 295–306, 1996.
  4. [GBLP96]
    J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Datacube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. In Proceedings of IEEE ICDE, pages 152–159, 1996. Also available as Microsoft Technical Report MSR-TR-95-22.
  5. [HWL94]
    C. J. Hahn, S. G. Warren, and J. London. Edited synoptic cloud reports from ships and land stations over the globe, 1982–1991. Available from http://cdiac.esd.ornl.gov/cdiac/ndps/ndp026b.html, 1994.
  6. [LW96]
    C. Li and X. S. Wang. A data model for supporting on-line analytical processing. In Proceedings of CIKM, pages 81–88, 1996.
  7. [RS97]
    K. A. Ross and D. Srivastava. Fast computation of sparse datacubes. In Proceedings of VLDB, pages 116–125, 1997.
  8. [RSC97]
    K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple granularities. AT&T Technical Report, 1997.
  9. [Tra95]
    Transaction Processing Performance Council (TPC), 777 N. First Street, Suite 600, San Jose, CA 95112, USA. TPC Benchmark D (Decision Support), May 1995.
  10. [ZDN97]
    Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In Proceedings of ACM SIGMOD, pages 159–170, 1997.

Copyright information

© Springer-Verlag 1998

Authors and Affiliations

  • Kenneth A. Ross (1)
  • Divesh Srivastava (2)
  • Damianos Chatziantoniou (3)
  1. Columbia University, New York, USA
  2. AT&T Labs-Research, Florham Park, USA
  3. Stevens Institute of Technology, Hoboken, USA
