Programming Relational Databases for Itemset Mining over Large Transactional Tables

Alves, Ronnie; Belo, Orlando

doi:10.1007/11595014_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3808)

Included in the following conference series:

Portuguese Conference on Artificial Intelligence

Abstract

Most itemset mining approaches are memory-based and run outside the database. In a data warehouse, however, tables are often far too large to copy into memory. A pure SQL approach, in turn, is quite inefficient, and such implementations rarely take advantage of database programming, even though RDBMS vendors offer many features for controlling and managing data. We propose a pattern-growth mining approach, implemented by means of database programming, for finding all frequent itemsets. The main idea is to avoid one-at-a-time record retrieval from the database, which saves the cost of copying and process context switching, and to avoid expensive joins and table reconstruction. Our empirical evaluation shows that the approach runs competitively with the best-known SQL-based itemset mining implementations. The performance evaluation was carried out with SQL Server 2000 (v.8) and T-SQL on several synthetic datasets.
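
The contrast between one-at-a-time record retrieval and in-database processing can be illustrated with a minimal sketch. This is not the authors' T-SQL implementation and it covers only the first step of any frequent itemset miner (counting the support of single items). It assumes SQLite through Python's standard sqlite3 module as a stand-in for the RDBMS; the table name `transactions`, its columns, and all function names are hypothetical, and each (tid, item) pair is assumed to appear at most once.

```python
import sqlite3
from collections import Counter


def build_demo_db():
    """Create an in-memory table of (tid, item) rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (tid INTEGER, item TEXT)")
    rows = [
        (1, "a"), (1, "b"), (1, "c"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "d"),
        (4, "b"), (4, "c"),
    ]
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn


def frequent_items_row_at_a_time(conn, min_support):
    """Fetch every record into the client and count there.

    Every row crosses the database/application boundary, which is the
    copying cost the abstract refers to.
    """
    counts = Counter()
    for _tid, item in conn.execute("SELECT tid, item FROM transactions"):
        counts[item] += 1
    return {item: n for item, n in counts.items() if n >= min_support}


def frequent_items_set_based(conn, min_support):
    """Let the database aggregate; only frequent items are returned."""
    cur = conn.execute(
        "SELECT item, COUNT(*) AS support "
        "FROM transactions "
        "GROUP BY item "
        "HAVING COUNT(*) >= ?",
        (min_support,),
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print(frequent_items_set_based(conn, min_support=2))
```

Both functions return the same supports; the set-based variant simply moves the counting into the database engine, so only the aggregated result leaves it.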

Author information

Authors and Affiliations

Department of Informatics, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
Ronnie Alves & Orlando Belo

Editor information

Editors and Affiliations

Portugal Telecom Inovação (PTI), Centro de Informatica e Sistemas da Universidade de Coimbra (CISUC),
Carlos Bento
Department of Informatics Engineering, Coimbra University, Portugal
Amílcar Cardoso
Centre of Human Language Technology and Bioinformatics, University of Beira Interior, 6201-001, Covilhã, Portugal
Gaël Dias

About this paper

Cite this paper

Alves, R., Belo, O. (2005). Programming Relational Databases for Itemset Mining over Large Transactional Tables. In: Bento, C., Cardoso, A., Dias, G. (eds) Progress in Artificial Intelligence. EPIA 2005. Lecture Notes in Computer Science, vol 3808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11595014_32

DOI: https://doi.org/10.1007/11595014_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30737-2
Online ISBN: 978-3-540-31646-6
eBook Packages: Computer Science, Computer Science (R0)
