DWMiner: A Tool for Mining Frequent Item Sets Efficiently in Data Warehouses

Almentero, Bruno Kinder; Evsukoff, Alexandre Gonçalves; Mattoso, Marta

doi:10.1007/978-3-540-71351-7_17

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4395)

Abstract

This work presents DWMiner, an efficient association rule mining tool that processes data directly in a data warehouse stored in a relational DBMS. DWMiner executes the Apriori algorithm as SQL queries in parallel, using a database PC cluster middleware developed for SQL query optimization in OLAP applications. DWMiner combines intra- and inter-query parallelism to reduce the total time needed to find frequent item sets directly from a data warehouse. DWMiner was tested on the BMS-Web-View1 database from KDD-Cup 2000 and obtained linear and super-linear speedups.
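To make the idea of running Apriori as SQL queries concrete, here is a minimal sketch of the first two passes, written as plain SQL and driven from Python. It is an illustration only and not DWMiner's actual queries, which are not shown in this abstract. The table `sales(tid, item)`, the helper table `f1`, and the function name `frequent_itemsets_sql` are all assumptions made for this example, and an in-memory SQLite database stands in for the warehouse. No parallelism or cluster middleware is modelled here.

```python
import sqlite3


def frequent_itemsets_sql(transactions, min_support):
    """Return {itemset: support count} for frequent 1- and 2-item sets.

    Hypothetical helper: the schema sales(tid, item) is an assumption,
    not something taken from the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    # One row per (transaction, distinct item).
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(tid, item)
         for tid, items in enumerate(transactions)
         for item in set(items)],
    )

    result = {}

    # Pass 1: count each item and keep those meeting the minimum support.
    for item, support in conn.execute(
        "SELECT item, COUNT(*) FROM sales "
        "GROUP BY item HAVING COUNT(*) >= ?",
        (min_support,),
    ):
        result[(item,)] = support

    # Store the frequent 1-item sets so pass 2 can prune with them.
    conn.execute("CREATE TABLE f1 (item TEXT)")
    conn.executemany("INSERT INTO f1 VALUES (?)",
                     [(k[0],) for k in result])

    # Pass 2: self-join on the transaction id to form candidate pairs.
    # Apriori pruning: both items must already be frequent on their own.
    for a, b, support in conn.execute(
        "SELECT a.item, b.item, COUNT(*) "
        "FROM sales a JOIN sales b "
        "  ON a.tid = b.tid AND a.item < b.item "
        "WHERE a.item IN (SELECT item FROM f1) "
        "  AND b.item IN (SELECT item FROM f1) "
        "GROUP BY a.item, b.item HAVING COUNT(*) >= ?",
        (min_support,),
    ):
        result[(a, b)] = support

    conn.close()
    return result
```

Each pass is a single aggregate query over the fact table. This is the kind of query an OLAP-oriented middleware could in principle split across cluster nodes (intra-query parallelism), while independent candidate-counting queries could run concurrently (inter-query parallelism).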

Author information

Authors and Affiliations

COPPE/Federal University of Rio de Janeiro, P.O. Box 68511, 21941-972 Rio de Janeiro RJ, Brazil
Bruno Kinder Almentero, Alexandre Gonçalves Evsukoff & Marta Mattoso

Editor information

Michel Daydé, José M. L. M. Palma, Álvaro L. G. A. Coutinho, Esther Pacitti, João Correia Lopes

About this paper

Cite this paper

Almentero, B.K., Evsukoff, A.G., Mattoso, M. (2007). DWMiner: A Tool for Mining Frequent Item Sets Efficiently in Data Warehouses. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_17

DOI: https://doi.org/10.1007/978-3-540-71351-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71350-0
Online ISBN: 978-3-540-71351-7
eBook Packages: Computer Science, Computer Science (R0)
