Abstract

Frequent sets lie at the basis of many Data Mining algorithms. As a result, hundreds of algorithms have been proposed in order to solve the frequent set mining problem. In this chapter, we attempt to survey the most successful algorithms and techniques that try to solve this problem efficiently.

Copyright information

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Goethals, B. (2005). Frequent Set Mining. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_17

  • DOI: https://doi.org/10.1007/0-387-25465-X_17

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24435-8

  • Online ISBN: 978-0-387-25465-4

  • eBook Packages: Computer Science, Computer Science (R0)
