A Compressed Vertical Binary Algorithm for Mining Frequent Patterns

  • Chapter
Data Mining: Foundations and Practice

Part of the book series: Studies in Computational Intelligence (SCI, volume 118)

Summary

This chapter presents Compressed Binary Mine (CBMine), a new algorithm for mining association rules and frequent patterns. Its efficiency rests on a compressed vertical binary representation of the database. CBMine was compared with several Apriori implementations, such as Bodon’s Apriori algorithm, and with MAFIA, another method based on a vertical binary representation. The experimental results show that CBMine performs significantly better, especially on sparse databases.
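To illustrate what a compressed vertical binary representation can look like, here is a minimal Python sketch. It is not the chapter's CBMine implementation. It assumes that each item's column is a bitmap over transaction ids, split into fixed-size words, and that only nonzero words are stored. The names `WORD_BITS`, `build_columns`, `intersect`, and `itemset_support` are hypothetical and chosen for this example only.

```python
from typing import Dict, Iterable, List, Set

WORD_BITS = 32  # assumed word size for this sketch

# A compressed column maps a word index to a nonzero bitmap word.
CompressedColumn = Dict[int, int]


def build_columns(transactions: List[Set[str]]) -> Dict[str, CompressedColumn]:
    """Build one compressed vertical bitmap per item.

    Bit `tid % WORD_BITS` of word `tid // WORD_BITS` is set when
    transaction `tid` contains the item. All-zero words are never stored.
    """
    columns: Dict[str, CompressedColumn] = {}
    for tid, transaction in enumerate(transactions):
        word_index, bit = divmod(tid, WORD_BITS)
        for item in transaction:
            column = columns.setdefault(item, {})
            column[word_index] = column.get(word_index, 0) | (1 << bit)
    return columns


def intersect(a: CompressedColumn, b: CompressedColumn) -> CompressedColumn:
    """AND two compressed columns, keeping only nonzero result words."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter column
    result: CompressedColumn = {}
    for word_index, word in a.items():
        joined = word & b.get(word_index, 0)
        if joined:
            result[word_index] = joined
    return result


def support(column: CompressedColumn) -> int:
    """Count the set bits, i.e. the transactions the column covers."""
    return sum(bin(word).count("1") for word in column.values())


def itemset_support(columns: Dict[str, CompressedColumn],
                    itemset: Iterable[str]) -> int:
    """Support of a non-empty itemset via successive column intersections."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must contain at least one item")
    current = columns.get(items[0], {})
    for item in items[1:]:
        current = intersect(current, columns.get(item, {}))
    return support(current)
```

In this sketch, an item that occurs in few transactions stores few words, so intersections touch little memory. That is one plausible reason a compressed layout could help on sparse databases, although the chapter itself should be consulted for the actual data structure and its analysis.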

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Palancar, J.H., León, R.H., Pagola, J.M., Hechavarría, A. (2008). A Compressed Vertical Binary Algorithm for Mining Frequent Patterns. In: Lin, T.Y., Xie, Y., Wasilewska, A., Liau, CJ. (eds) Data Mining: Foundations and Practice. Studies in Computational Intelligence, vol 118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78488-3_12

  • DOI: https://doi.org/10.1007/978-3-540-78488-3_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78487-6

  • Online ISBN: 978-3-540-78488-3

  • eBook Packages: Engineering, Engineering (R0)