Summary
This chapter presents Compressed Binary Mine (CBMine), a new algorithm for mining association rules and frequent patterns. Its efficiency rests on a compressed vertical binary representation of the database. CBMine was compared with several Apriori implementations, such as Bodon’s Apriori algorithm, and with MAFIA, another method based on a vertical binary representation. The experimental results show that CBMine performs significantly better, especially on sparse databases.
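To make the idea of a compressed vertical binary representation concrete, the following is a minimal sketch and not the CBMine implementation itself, whose details the summary does not give. It assumes that each item is stored as a bitmap over transactions, that the bitmap is split into 32-bit words, and that compression means keeping only the non-zero words as (word index, word value) pairs. The names WORD_BITS, vertical_bitmaps, compress, intersect, and support are hypothetical, chosen for this illustration.

```python
from typing import Dict, List, Set, Tuple

WORD_BITS = 32  # assumed word size; not taken from the chapter

Compressed = List[Tuple[int, int]]  # (word index, non-zero word value)


def vertical_bitmaps(transactions: List[Set[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit t is set when transaction t contains it."""
    bitmaps: Dict[str, int] = {}
    for t, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << t)
    return bitmaps


def compress(bitmap: int) -> Compressed:
    """Split a bitmap into words and keep only the non-zero ones."""
    mask = (1 << WORD_BITS) - 1
    pairs: Compressed = []
    index = 0
    while bitmap:
        word = bitmap & mask
        if word:
            pairs.append((index, word))
        bitmap >>= WORD_BITS
        index += 1
    return pairs


def intersect(a: Compressed, b: Compressed) -> Compressed:
    """AND two compressed bitmaps by merging on word index."""
    out: Compressed = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i][0] < b[j][0]:
            i += 1
        elif a[i][0] > b[j][0]:
            j += 1
        else:
            word = a[i][1] & b[j][1]
            if word:  # zero words are dropped, so the result stays compressed
                out.append((a[i][0], word))
            i += 1
            j += 1
    return out


def support(c: Compressed) -> int:
    """Count the transactions (set bits) covered by a compressed bitmap."""
    return sum(bin(word).count("1") for _, word in c)


if __name__ == "__main__":
    txs = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}]
    bm = vertical_bitmaps(txs)
    ab = intersect(compress(bm["a"]), compress(bm["b"]))
    print(support(ab))  # prints 2: transactions 0 and 3 contain both a and b
```

Under these assumptions, the support of an itemset is the number of set bits left after intersecting the bitmaps of its items. In a sparse database most words are zero, so the compressed lists are short and the merge touches few words, which is one plausible reading of why such a representation would help most on sparse data.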
References
Grahne G. and Zhu J. Fast algorithms for frequent itemset mining using FP-trees. IEEE Transactions on Knowledge and Data Engineering, 17(10):1347–1362, 2005
Agrawal R., Imielinski T., and Swami A. N. Mining association rules between sets of items in large databases. In Buneman P. and Jajodia S. editors, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pages 207–216. Washington DC, May 26–28, 1993
Agrawal R. and Srikant R. Fast algorithms for mining association rules. In Bocca J. B., Jarke M., and Zaniolo C. editors, Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487–499. Morgan Kaufmann, San Francisco, CA, September 12–15, 1994
Bodon F. Surprising results of trie-based fim algorithms. In Goethals B., Zaki M. J., and Bayardo R. editors, Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI’04), volume 126 of CEUR Workshop Proceedings, Brighton, UK, 1 November 2004
Bodon F. Trie-based apriori implementation for mining frequent itemsequences. In Goethals B., Nijssen S., and Zaki M. J. editors, Proceedings of ACM SIGKDD International Workshop on Open Source Data Mining (OSDM’05), pages 56–65. Chicago, IL, USA, August 2005
Brin S., Motwani R., Ullman J. D., and Tsur S. Dynamic itemset counting and implication rules for market basket data. In Peckham J. editor, SIGMOD 1997, Proceedings of ACM SIGMOD International Conference on Management of Data, pages 255–264. ACM, Tucson, Arizona, USA, May 13–15, 1997
Burdick D., Calimlim M., and Gehrke J. Mafia: A maximal frequent itemset algorithm for transactional databases. In Proceedings of the Seventeenth International Conference on Data Engineering, pages 443–452. Washington DC, USA, 2001. IEEE Computer Society
Gardarin G., Pucheral P., and Wu F. Bitmap based algorithms for mining association rules. In Actes des Journées Bases de Données Avancées (BDA’98), Hammamet, Tunisie, 1998
Han J., Pei J., Yin Y., and Mao R. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1):53–87, 2004
Hipp J., Güntzer U., and Nakhaeizadeh G. Algorithms for association rule mining – a general survey and comparison. SIGKDD Explorations, 2(1):58–64, July 2000
Holt J. D. and Chung S. M. Multipass algorithms for mining association rules in text databases. Knowledge and Information Systems, 3(2):168–183, 2001
Lin T. Y. Data mining and machine oriented modeling: A granular computing approach. Applied Intelligence, 13(2):113–124, 2000
Calimlim M. and Gehrke J. Himalaya data mining tools: Mafia. http://himalaya-tools.sourceforge.net, May 2006
Fayyad U. M., Piatetsky-Shapiro G., and Smyth P. From data mining to knowledge discovery: An overview. In Fayyad U. M., Piatetsky-Shapiro G., Smyth P., and Uthurusamy R. editors, Advances in Knowledge Discovery and Data Mining, pages 1–34. AAAI, Menlo Park, CA, 1996
Gopalan R. P. and Sucahyo Y. G. High performance frequent patterns extraction using compressed fp-tree. In Proceedings of the SIAM International Workshop on High Performance and Distributed Mining, Orlando, USA, 2004
Feldman R. and Hirsh H. Finding associations in collections of text. In Machine Learning and Data Mining: Methods and Applications, pages 223–240. Wiley, New York, 1998
Feldman R., Dagan I., and Hirsh H. Mining text using keyword distributions. Journal of Intelligent Information Systems, 10(3):281–300, 1998
Chen M. S., Han J., and Yu P. S. Data mining: An overview from a data-base perspective. IEEE Transactions on Knowledge and Data Engineering, 8(6):866–883, 1996
Savasere A., Omiecinski E., and Navathe S. B. An efficient algorithm for mining association rules in large databases. In The VLDB Journal, pages 432–444, 1995
Shenoy P., Haritsa J. R., Sudarshan S., Bhalotia G., Bawa M., and Shah D. Turbo-charging vertical mining of large databases. In Proceedings of 2000 ACM SIGMOD International Conference on Management of Data, pages 22–33, 2000
Cheung D. W., Han J., Ng V. T., and Wong C. Y. Maintenance of discovered association rules in large databases: An incremental updating technique. In Proceedings of the Twelfth IEEE International Conference on Data Engineering, pages 106–114. IEEE, New Orleans, LA, 1996
Zaki M. J., Parthasarathy S., Ogihara M., and Li W. New algorithms for fast discovery of association rules. Technical Report TR651, 1997
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Palancar, J.H., León, R.H., Pagola, J.M., Hechavarría, A. (2008). A Compressed Vertical Binary Algorithm for Mining Frequent Patterns. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_12
DOI: https://doi.org/10.1007/978-3-540-78488-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78487-6
Online ISBN: 978-3-540-78488-3
eBook Packages: Engineering, Engineering (R0)