Abstract
This work presents DWMiner, an efficient association rule mining tool that processes data directly in a data warehouse hosted on a relational DBMS. DWMiner executes the Apriori algorithm as SQL queries in parallel, using a database PC cluster middleware developed for SQL query optimization in OLAP applications. DWMiner combines intra- and inter-query parallelism to reduce the total time needed to find frequent item sets directly from a data warehouse. DWMiner was tested using the BMS-Web-View1 database from KDD-Cup 2000 and obtained linear and super-linear speedups.
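To make the idea of running Apriori as SQL queries concrete, the following is a minimal sketch and not DWMiner's actual implementation. It assumes a hypothetical table `sales(tid, item)` with one row per item in a transaction, and it uses an in-memory SQLite database purely for illustration instead of a DBMS cluster. The function name `frequent_itemsets` and the helper table `f1` are likewise assumptions. The sketch covers only the first two Apriori passes and runs sequentially.

```python
import sqlite3


def frequent_itemsets(rows, min_support):
    """Return frequent 1- and 2-item sets mapped to their support counts.

    rows: iterable of (tid, item) pairs (hypothetical schema).
    min_support: minimum number of transactions containing the item set.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # Pass 1: count transactions per item, keep items meeting the threshold.
    conn.execute("CREATE TABLE f1 (item TEXT, support INTEGER)")
    conn.execute(
        "INSERT INTO f1 "
        "SELECT item, COUNT(DISTINCT tid) FROM sales "
        "GROUP BY item HAVING COUNT(DISTINCT tid) >= ?",
        (min_support,),
    )

    result = {}
    for item, support in conn.execute("SELECT item, support FROM f1"):
        result[(item,)] = support

    # Pass 2: candidate pairs are built only from frequent items (the
    # Apriori pruning step), then counted with a self-join on tid.
    pairs = conn.execute(
        "SELECT a.item, b.item, COUNT(DISTINCT a.tid) "
        "FROM sales a "
        "JOIN sales b ON a.tid = b.tid AND a.item < b.item "
        "JOIN f1 fa ON fa.item = a.item "
        "JOIN f1 fb ON fb.item = b.item "
        "GROUP BY a.item, b.item "
        "HAVING COUNT(DISTINCT a.tid) >= ?",
        (min_support,),
    )
    for item_a, item_b, support in pairs:
        result[(item_a, item_b)] = support

    conn.close()
    return result


if __name__ == "__main__":
    demo = [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"),
            (3, "a"), (3, "c"), (4, "b"), (4, "d")]
    print(frequent_itemsets(demo, min_support=2))
```

Each pass is a single aggregate query over the transaction table, which is the kind of statement a cluster middleware could split across nodes (intra-query parallelism), while independent counting queries could be submitted concurrently (inter-query parallelism). How DWMiner actually partitions this work is not described in the abstract.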
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Almentero, B.K., Evsukoff, A.G., Mattoso, M. (2007). DWMiner: A Tool for Mining Frequent Item Sets Efficiently in Data Warehouses. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_17
DOI: https://doi.org/10.1007/978-3-540-71351-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71350-0
Online ISBN: 978-3-540-71351-7
eBook Packages: Computer Science, Computer Science (R0)