Discovery-driven exploration of OLAP data cubes

  • Sunita Sarawagi
  • Rakesh Agrawal
  • Nimrod Megiddo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1377)

Abstract

Analysts predominantly use OLAP data cubes to identify regions of anomalies that may represent problem areas or new opportunities. Current OLAP systems support hypothesis-driven exploration of data cubes through operations such as drill-down, roll-up, and selection. Using these operations, an analyst navigates unaided through a huge search space, looking at a large number of values to spot exceptions. We propose a new discovery-driven exploration paradigm that mines the data for such exceptions and summarizes the exceptions at appropriate levels in advance. It then uses these exceptions to lead the analyst to interesting regions of the cube during navigation. We present the statistical foundation underlying our approach. We then discuss the computational issue of finding exceptions in data and making the process efficient on large multidimensional databases.
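
The abstract does not spell out the statistical model, so the following is only a minimal sketch of the general idea of precomputing exceptions, under stated assumptions. It assumes a two-dimensional slice of a cube, a simple additive row-plus-column fit for the expected value of each cell, and a cutoff on the standardized residual. The function name `find_exceptions` and the `threshold` parameter are hypothetical, and this is not claimed to be the authors' method.

```python
from statistics import mean, pstdev


def find_exceptions(table, threshold=2.0):
    """Flag cells of a 2-D table that deviate from an additive fit.

    ASSUMPTION: expected value = grand mean + row effect + column effect.
    This is an illustrative model, not the one defined in the paper.
    Returns a list of (row_index, col_index, standardized_residual).
    """
    n_rows, n_cols = len(table), len(table[0])

    # Grand mean plus per-row and per-column offsets from it.
    grand = mean([v for row in table for v in row])
    row_eff = [mean(row) - grand for row in table]
    col_eff = [
        mean([table[i][j] for i in range(n_rows)]) - grand
        for j in range(n_cols)
    ]

    # Residual = observed value minus the additive expectation.
    residuals = [
        [table[i][j] - (grand + row_eff[i] + col_eff[j]) for j in range(n_cols)]
        for i in range(n_rows)
    ]

    # Scale residuals by their overall (population) standard deviation.
    sigma = pstdev([r for row in residuals for r in row])
    if sigma == 0:
        return []  # The table is perfectly additive: nothing stands out.

    return [
        (i, j, residuals[i][j] / sigma)
        for i in range(n_rows)
        for j in range(n_cols)
        if abs(residuals[i][j] / sigma) > threshold
    ]


# Usage: a perfectly additive 4x4 table with one inflated cell at (2, 1).
sales = [[100 + 10 * i + j for j in range(4)] for i in range(4)]
sales[2][1] += 50
flagged = find_exceptions(sales)
print(flagged)
```

In a discovery-driven interface, flags like these would be computed ahead of time and used to highlight cells worth drilling into, rather than leaving the analyst to scan every value by eye.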

Copyright information

© Springer-Verlag 1998

Authors and Affiliations

  • Sunita Sarawagi (1)
  • Rakesh Agrawal (1)
  • Nimrod Megiddo (1)
  1. IBM Almaden Research Center, San Jose, USA
