A Compressed Vertical Binary Algorithm for Mining Frequent Patterns

Palancar, J. Hdez.; León, R. Hdez.; Pagola, J. Medina; Hechavarría, A.

doi:10.1007/978-3-540-78488-3_12

Part of the book series: Studies in Computational Intelligence (SCI, volume 118)

Summary

A new algorithm named Compressed Binary Mine (CBMine) for mining association rules and frequent patterns is presented in this chapter. Its efficiency is based on a compressed vertical binary representation of the database. CBMine was compared with several Apriori implementations, such as Bodon’s Apriori implementation, and with MAFIA, another method based on a vertical binary representation. The experimental results show that CBMine performs significantly better, especially on sparse databases.
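
To make the idea of a vertical binary representation concrete, the following is a minimal Python sketch and not the chapter's actual CBMine implementation. It assumes that each item is stored as a bit column over transaction ids, that the column is split into fixed-size words, and that "compressed" means keeping only the non-zero words together with their positions. The word size and all function names (`build_vertical`, `intersect`, `support`, `frequent_pairs`) are hypothetical and chosen for illustration only.

```python
from itertools import combinations

WORD_BITS = 32  # assumed word size; the chapter preview does not state one


def build_vertical(transactions):
    """Return {item: {word_index: word}}, keeping only non-zero words.

    Bit `tid % WORD_BITS` of word `tid // WORD_BITS` is set when the
    transaction with id `tid` contains the item.
    """
    columns = {}
    for tid, items in enumerate(transactions):
        word_index, bit = divmod(tid, WORD_BITS)
        for item in items:
            column = columns.setdefault(item, {})
            column[word_index] = column.get(word_index, 0) | (1 << bit)
    return columns


def intersect(col_a, col_b):
    """Bitwise AND of two compressed columns; zero words are dropped."""
    result = {}
    for word_index, bits in col_a.items():
        both = bits & col_b.get(word_index, 0)
        if both:
            result[word_index] = both
    return result


def support(column):
    """Number of transactions covered by a column (count of set bits)."""
    return sum(bin(bits).count("1") for bits in column.values())


def frequent_pairs(transactions, min_support):
    """Apriori-style second level: pairs built from frequent single items."""
    columns = build_vertical(transactions)
    frequent = {i: c for i, c in columns.items() if support(c) >= min_support}
    pairs = {}
    for a, b in combinations(sorted(frequent), 2):
        count = support(intersect(frequent[a], frequent[b]))
        if count >= min_support:
            pairs[(a, b)] = count
    return pairs


if __name__ == "__main__":
    data = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
    print(frequent_pairs(data, min_support=2))
```

Under these assumptions, an item that occurs in few transactions occupies only a handful of words, and an intersection visits only the words that are present in the first column. That is one plausible reason a representation of this kind could pay off on sparse databases, which is where the summary reports the largest gains.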

Author information

Authors and Affiliations

Advanced Technologies Application Center (CENATAV), 7a # 21812 e/ 218 y 222, Rpto. Siboney, Playa, C.P. 12200, Ciudad de la Habana, Cuba
J. Hdez. Palancar, R. Hdez. León, J. Medina Pagola & A. Hechavarría

Editor information

Editors and Affiliations

Department of Computer Science, San Jose State University, San Jose, CA, 95192, USA
Tsau Young Lin
Department of Computer Science and Information Systems, Kennesaw State University, Building 11, Room 3060 1000 Chastain Road, Kennesaw, GA, 30144, USA
Ying Xie
Department of Computer Science, The University at Stony Brook, Stony Brook, New York, 11794-4400, USA
Anita Wasilewska
Institute of Information Science, Academia Sinica, No 128, Academia Road, Section 2 Nankang, Taipei, 11529, Taiwan
Churn-Jung Liau

About this chapter

Cite this chapter

Palancar, J.H., León, R.H., Pagola, J.M., Hechavarría, A. (2008). A Compressed Vertical Binary Algorithm for Mining Frequent Patterns. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_12

DOI: https://doi.org/10.1007/978-3-540-78488-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78487-6
Online ISBN: 978-3-540-78488-3
eBook Packages: Engineering, Engineering (R0)
