Abstract
Analysts often explore data cubes to identify anomalous regions that may represent problem areas or new opportunities. Discovery-driven exploration, proposed by S. Sarawagi et al. [5], automatically detects and marks exceptions for the user and reduces the reliance on manual discovery. However, when the data is large, materializing the whole cube is impractical because of both space and time limitations. Exploratory mining over all cube cells must therefore construct the data cube dynamically, which takes a very long time. In this paper, we investigate optimization methods that push several constraints into the mining process. By enforcing user-defined constraints, we first restrict the multidimensional space to a small constrained-cube and then mine exceptions on it. We propose two efficient constrained-cube construction algorithms, the NAIVE algorithm and the AGOA algorithm. Experimental results indicate that this kind of constraint-based exploratory mining method is efficient and scalable.
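The restrict-then-mine idea described in the abstract can be illustrated with a minimal sketch. This is not the NAIVE or the AGOA algorithm, whose details are not given here. The function names, the set-membership form of the constraints, and the z-score exception test are all assumptions made for illustration; the z-score simply stands in for whatever exception measure the paper actually uses.

```python
from itertools import combinations
from statistics import mean, pstdev


def constrained_cube(rows, dims, measure, dim_constraints):
    """Aggregate only the rows that satisfy every dimension constraint.

    dim_constraints maps a dimension name to the set of allowed values
    (a hypothetical constraint form chosen for this sketch).
    """
    kept = [r for r in rows
            if all(r[d] in allowed for d, allowed in dim_constraints.items())]
    cube = {}
    # One cuboid per subset of dimensions, including the empty group-by.
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            cells = {}
            for r in kept:
                key = tuple(r[d] for d in group)
                cells[key] = cells.get(key, 0) + r[measure]
            cube[group] = cells
    return cube


def exceptions(cube, threshold=2.0):
    """Flag cells lying more than `threshold` population standard
    deviations from the mean of their own cuboid (assumed measure)."""
    flagged = []
    for group, cells in cube.items():
        values = list(cells.values())
        if len(values) < 2:
            continue  # a single cell has nothing to be compared against
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # all cells equal, so none stands out
        for key, v in cells.items():
            if abs(v - mu) / sigma > threshold:
                flagged.append((group, key, v))
    return flagged


# Toy data: the constraint keeps only the "East" region.
rows = [
    {"region": "East", "product": "A", "sales": 10},
    {"region": "East", "product": "B", "sales": 10},
    {"region": "East", "product": "C", "sales": 10},
    {"region": "East", "product": "D", "sales": 100},
    {"region": "West", "product": "A", "sales": 500},
]
cube = constrained_cube(rows, ["region", "product"], "sales",
                        {"region": {"East"}})
flagged = exceptions(cube, threshold=1.5)
```

Because the constraint is applied before aggregation, the "West" row never contributes to any cell, so the sketch enumerates and scores far fewer cells than it would on the unconstrained data.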
References
J. Han, S. Chee, and J. Chiang: Issues for On-Line Analytical Mining of Data Warehouses. In Proc. of the 1998 SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD), 1998.
Cognos Software Corporation: Power Play 5, special edition, http://www.cognos.com/powercubes/index.html, 1997.
Pilot Software: Decision support suite, http://www.pilotsw.com/.
J. Han and Y. Fu: Discovery of multiple-level association rules from large databases. In Proc. of the 21st Int’l Conference on Very Large Databases (VLDB), 1995.
S. Sarawagi, R. Agrawal, and N. Megiddo: Discovery-driven exploration of OLAP data cubes. In Proc. of the 6th Int’l Conference on Extending Database Technology (EDBT), 1998.
V. Harinarayan, A. Rajaraman, and J. Ullman: Implementing data cubes efficiently. In Proc. of ACM-SIGMOD Int’l Conference on Management of Data, 1996.
T. Imielinski, L. Khachiyan, and A. Abdulghani: Cubegrades: generalizing association rules. Technical Report, Dept. of Computer Science, Rutgers Univ., Aug. 2000.
S. Sarawagi: Explaining differences in multidimensional aggregates. In Proc. of the 25th Int’l Conference on Very Large Databases (VLDB), 1999.
S. Sarawagi: User-adaptive exploration of multidimensional data. In Proc. of the 26th Int’l Conference on Very Large Databases (VLDB), 2000.
G. Sathe and S. Sarawagi: Intelligent Rollups in Multidimensional OLAP data. In Proc. of the 27th Int’l Conference on Very Large Databases (VLDB), 2001.
G. Dong, J. Han, J. Lam, and K. Wang: Mining Multi-Dimensional Constrained Gradients in Data Cubes. In Proc. of the 27th Int’l Conference on Very Large Databases (VLDB), 2001.
R. Bayardo, R. Agrawal, and D. Gunopulos: Constraint-based rule mining on large, dense data sets. In Proc. of 1999 Int’l Conf. on Data Engineering (ICDE), 1999.
R. Ng, L. Lakshmanan, J. Han, and A. Pang: Exploratory mining and pruning optimizations of constrained association rules. In Proc. of ACM-SIGMOD Int’l Conference on Management of Data, 1998.
J. Pei, J. Han, and L. Lakshmanan: Mining frequent itemsets with convertible constraints. In Proc. of 2001 Int’l Conf. on Data Engineering (ICDE), 2001.
R. Srikant, Q. Vu, and R. Agrawal: Mining association rules with item constraints. In Proc. of 1997 Int’l Conf. on Knowledge Discovery and Data Mining (KDD), 1997.
W. Liang, M. E. Orlowska, and J. X. Yu: Optimizing multiple dimensional queries simultaneously in multidimensional databases. VLDB Journal, 8(3–4), 2000.
Y. Zhao, P. Deshpande, J. Naughton, and A. Shukla: Simultaneous optimization and evaluation of multiple dimensional queries. In Proc. of ACM-SIGMOD Int’l Conference on Management of Data, 1998.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, C., Li, S., Wang, S., Du, X. (2002). Efficient Constraint-Based Exploratory Mining on Large Data Cubes. In: Chen, MS., Yu, P.S., Liu, B. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2002. Lecture Notes in Computer Science(), vol 2336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47887-6_38
DOI: https://doi.org/10.1007/3-540-47887-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43704-8
Online ISBN: 978-3-540-47887-4
eBook Packages: Springer Book Archive