Programming Relational Databases for Itemset Mining over Large Transactional Tables

  • Conference paper
Progress in Artificial Intelligence (EPIA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3808)

Abstract

Most itemset mining approaches are memory-based and run outside the database. In a data warehouse, however, tables are usually far too large to copy into memory, and a pure SQL approach is quite inefficient. Existing implementations rarely take advantage of database programming, even though RDBMS vendors offer many features for controlling and managing data. We propose a pattern growth mining approach that uses database programming to find all frequent itemsets. The main idea is to avoid one-at-a-time record retrieval from the database, saving the costs of copying and process context switching, expensive joins, and table reconstruction. The empirical evaluation shows that our approach runs competitively with the best-known SQL-based itemset mining implementations. The performance evaluation was carried out with SQL Server 2000 (v.8) and T-SQL on several synthetic datasets.
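The idea of avoiding one-at-a-time record retrieval can be illustrated with a small sketch. This is not the authors' T-SQL implementation: it assumes SQLite (through Python's standard `sqlite3` module) in place of SQL Server 2000, and the `transactions(tid, item)` table, the function names, and the sample data are all hypothetical. It shows only the first step that pattern growth methods typically start with, counting item supports inside the database with a single set-based query, so that only the aggregated result crosses the database boundary.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory transactional table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # One row per (transaction id, item); the primary key rules out duplicates.
    conn.execute(
        "CREATE TABLE transactions (tid INTEGER, item TEXT, PRIMARY KEY (tid, item))"
    )
    rows = [
        (1, "a"), (1, "b"), (1, "c"),
        (2, "a"), (2, "b"),
        (3, "a"), (3, "d"),
        (4, "b"), (4, "c"),
    ]
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn


def frequent_items(conn, min_support):
    """Return {item: support} for items found in at least min_support transactions.

    The counting is done by the database engine in one GROUP BY query,
    instead of fetching every record and counting in the client.
    """
    rows = conn.execute(
        "SELECT item, COUNT(*) AS support "
        "FROM transactions "
        "GROUP BY item "
        "HAVING COUNT(*) >= ? "
        "ORDER BY support DESC, item",
        (min_support,),
    ).fetchall()
    return dict(rows)


conn = make_demo_db()
print(frequent_items(conn, 2))
```

With a minimum support of 2, item `d` is filtered out inside the database and never reaches the client. A full pattern growth miner would go on to build and mine conditional projections from the frequent items, which this sketch does not attempt.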

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alves, R., Belo, O. (2005). Programming Relational Databases for Itemset Mining over Large Transactional Tables. In: Bento, C., Cardoso, A., Dias, G. (eds) Progress in Artificial Intelligence. EPIA 2005. Lecture Notes in Computer Science, vol 3808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11595014_32

  • DOI: https://doi.org/10.1007/11595014_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30737-2

  • Online ISBN: 978-3-540-31646-6

  • eBook Packages: Computer Science, Computer Science (R0)
