Abstract
Frequent pattern mining has been a focused theme in data mining research for over a decade. Abundant literature has been dedicated to this research and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications. In this article, we provide a brief overview of the current status of frequent pattern mining and discuss a few promising research directions. We believe that frequent pattern mining research has substantially broadened the scope of data analysis and will have a deep impact on data mining methodologies and applications in the long run. However, some challenging research issues still need to be solved before frequent pattern mining can claim to be a cornerstone approach in data mining applications.
Discover the latest articles, news and stories from top researchers in related subjects.References
Afrati FN, Gionis A, Mannila H (2004) Approximating a collection of frequent sets. In: Proceedings of the 2004 ACM SIGKDD international conference knowledge discovery in databases (KDD’04), Seattle, WA, pp 12–19
Agarwal R, Aggarwal CC, Prasad VVV (2001) A tree projection algorithm for generation of frequent itemsets. J Parallel Distribut Comput 61:350–371
Aggarwal CC, Yu PS (1998) A new framework for itemset generation. In: Proceedings of the 1998 ACM symposium on principles of database systems (PODS’98), Seattle, WA, pp 18–24
Agrawal R, Gehrke J, Gunopulos D, Raghavan P (1998) Automatic subspace clustering of high dimensional data for data mining applications. In: Proceedings of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD’98), Seattle, WA, pp 94–105
Agrawal R, Imielinski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM-SIGMOD international conference on management of data (SIGMOD’93), Washington, DC, pp 207–216
Agrawal R, Shafer JC (1996) Parallel mining of association rules: design, implementation, and experience. IEEE Trans Knowl Data Eng 8:962–969
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 1994 international conference on very large data bases (VLDB’94), Santiago, Chile, pp 487–499
Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 1995 international conference on data engineering (ICDE’95), Taipei, Taiwan, pp 3–14
Ahmed KM, El-Makky NM, Taha Y (2000) A note on “beyond market basket: generalizing association rules to correlations”. SIGKDD Explorations 1:46–48
Asai T, Abe K, Kawasoe S, Arimura H, Satamoto H, Arikawa S (2002) Efficient substructure discovery from large semi-structured data. In: Proceedings of the 2002 SIAM international conference on data mining (SDM’02), Arlington, VA, pp 158–174
Aumann Y, Lindell Y (1999) A statistical theory for quantitative association rules. In: Proceeding of the 1999 international conference on knowledge discovery and data mining (KDD’99), San Diego, CA, pp 261–270
Bayardo RJ (1998) Efficiently mining long patterns from databases. In: Proceeding of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD’98), Seattle, WA, pp 85–93
Beil F, Ester M, Xu X (2002) Frequent term-based text clustering. In: Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD’02), Edmonton, Canada, pp 436–442
Bettini C, Sean Wang X, Jajodia S (1998) Mining temporal relationships with multiple granularities in time sequences. Bull Tech Committee Data Eng 21:32–38
Beyer K, Ramakrishnan R (1999) Bottom-up computation of sparse and iceberg cubes. In: Proceeding of the 1999 ACM-SIGMOD international conference on management of data (SIGMOD’99), Philadelphia, PA, pp 359–370
Blanchard J, Guillet F, Gras R, Briand H (2005) Using information-theoretic measures to assess association rule interestingness. In: Proceeding of the 2005 international conference on data mining (ICDM’05), Houston, TX, pp 66–73
Bonchi F, Giannotti F, Mazzanti A, Pedreschi D (2003) Exante: anticipated data reduction in constrained pattern mining. In: Proceeding of the 7th European conference on principles and pratice of knowledge discovery in databases (PKDD’03), pp 59–70
Bonchi F, Lucchese C (2004) On closed constrained frequent pattern mining. In: Proceeding of the 2004 international conference on data mining (ICDM’04), Brighton, UK, pp 35–42
Borgelt C, Berthold MR (2002) Mining molecular fragments: finding relevant substructures of molecules. In: Proceeding of the 2002 international conference on data mining (ICDM’02), Maebashi, Japan, pp 211–218
Brin S, Motwani R, Silverstein C (1997) Beyond market basket: generalizing association rules to correlations. In: Proceeding of the 1997 ACM-SIGMOD international conference on management of data (SIGMOD’97), Tucson, AZ, pp 265–276
Brin S, Motwani R, Ullman JD, Tsur S (1997) Dynamic itemset counting and implication rules for market basket analysis. In: Proceeding of the 1997 ACM-SIGMOD international conference on management of data (SIGMOD’97), Tucson, AZ, pp 255–264
Bucila C, Gehrke J, Kifer D, White W (2003) DualMiner: a dual-pruning algorithm for itemsets with constraints. Data Min knowl discov 7:241–272
Burdick D, Calimlim M, Gehrke J (2001) MAFIA: a maximal frequent itemset algorithm for transactional databases. In: Proceeding of the 2001 international conference on data engineering (ICDE’01), Heidelberg, Germany, pp 443–452
Calders T, Goethals B (2002) Mining all non-derivable frequent itemsets. In: Proceeding of the 2002 European conference on principles and pratice of knowledge discovery in databases (PKDD’02), Helsinki, Finland, pp 74–85
Calders T, Goethals B (2005) Depth-first non-derivable itemset mining. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, CA, pp 250–261
Cao H, Mamoulis N, Cheung DW (2005) Mining frequent spatio-temporal sequential patterns. In: Proceeding of the 2005 international conference on data mining (ICDM’05), Houston, TX, pp 82–89
Chang J, Lee W (2003) Finding recent frequent itemsets adaptively over online data streams. In: Proceeding of the 2003 international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 487–492
Chen MS, Park JS, Yu PS (1996) Data mining for path traversal patterns in a web environment. In: Proceeding of the 16th international conference on distributed computing systems, pp 385–392
Cheng CH, Fu AW, Zhang Y (1999) Entropy-based subspace clustering for mining numerical data. In: Proceeding of the 1999 international conference on knowledge discovery and data mining (KDD’99), San Diego, CA, pp 84–93
Cheng H, Yan X, Han J (2004) IncSpan: incremental mining of sequential patterns in large In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 527–532
Cheng H, Yan X, Han J (2005) Seqindex: indexing sequences by sequential pattern analysis. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, CA, pp 601–605
Cheng H, Yan X, Han J, Hsu C (2007) Discriminative frequent pattern analysis for effective classification. In: Proceeding of the 2007 international conference on data engineering (ICDE’07), Istanbul, Turkey
Cheung DW, Han J, Ng V, Fu A, Fu Y (1996) A fast distributed algorithm for mining association rules. In: Proceeding of the 1996 international conference on parallel and distributed information systems, Miami Beach, FL, pp 31–44
Cheung DW, Han J, Ng V, Wong CY (1996) Maintenance of discovered association rules in large an incremental updating technique. In: Proceeding of the 1996 international conference on data engineering (ICDE’96), New Orleans, LA, pp 106–114
Chi Y, Wang H, Yu PS, Muntz R (2004) Moment: maintaining closed frequent itemsets over a stream sliding window. In: Proceeding of the 2004 international conference on data mining (ICDM’04), Brighton, UK, pp 59–66
Cong S, Han J, Padua D (2005) Parallel mining of closed sequential patterns. In: Proceeding of the 2005 ACM SIGKDD international conference on knowledge discovery in databases (KDD’05), Chicago, IL, pp 562–567
Cong G, Tan K-L, Tung AKH, Xu X (2005) Mining top-k covering rule groups for gene expression data. In: Proceeding of the 2005 ACM-SIGMOD international conference on management of data (SIGMOD’05), Baltimore, MD, pp 670–681
Deshpande M, Kuramochi M, Karypis G (2003) Frequent sub-structure-based approaches for classifying chemical compounds. In: Proceeding of the 2002 international conference on data mining (ICDM’03), Melbourne, FL, pp 35–42
Dong G, Han J, Lam J, Pei J, Wang K, Zou W (2004) Mining constrained gradients in multi-dimensional databases. IEEE Trans Knowl Data Eng 16:922–938
Dehaspe L, Toivonen H, King R (1998) Finding frequent substructures in chemical compounds. In: Proceeding of the 1998 international conference on knowledge discovery and data mining (KDD’98), New York, NY, pp 30–36
Dong G, Li J (1999) Efficient mining of emerging patterns: discovering trends and differences. In: Proceeding of the 1999 international conference on knowledge discovery and data mining (KDD’99), San Diego, CA, pp 43–52
Eirinaki M, Vazirgiannis M (2003) Web mining for web personalization. ACM Trans Inter Tech 3:1–27
Fukuda T, Morimoto Y, Morishita S, Tokuyama T (1996) Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization. In: Proceeding of the 1996 ACM-SIGMOD international conference management of data (SIGMOD’96), Montreal, Canada, pp 13–23
Gade K, Wang J, Karypis G (2004) Efficient closed pattern mining in the presence of tough block constraints. In: Proceeding of the 2004 international conference on knowledge discovery and data mining (KDD’04), Seattle, WA, pp 138–147
Garofalakis M, Rastogi R, Shim K (1999) SPIRIT: Sequential pattern mining with regular expression constraints. In: Proceeding of the 1999 international conference on Very large data bases (VLDB’99), Edinburgh, UK, pp 223–234
Geerts F, Goethals B, Bussche J (2001) A tight upper bound on the number of candidate patterns. In: Proceeding of the 2001 international conference on data mining (ICDM’01), San Jose, CA, pp 155–162
Gionis A, Kujala T, Mannila H (2003) Fragments of order. In: Proceeding of the 2003 international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 129–136
Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2006) Assessing data mining results via swap randomization. In: Proceeding of the 2006 ACM SIGKDD international conference on knowledge discovery in databases (KDD’06), Philadelphia, PA, pp 167–176
Goethals B, Zaki M (2003) An introduction to workshop on frequent itemset mining implementations. In: Proceeding of the ICDM’03 international workshop on frequent itemset mining implementations (FIMI’03), Melbourne, FL, pp 1–13
Grahne G, Lakshmanan L, Wang X (2000) Efficient mining of constrained correlated sets. In: Proceeding of the 2000 international conference on data engineering (ICDE’00), San Diego, CA, pp 512–521
Grahne G, Zhu J (2003)Efficiently using prefix-trees in mining frequent itemsets. In: Proceeding of the ICDM’03 international workshop on frequent itemset mining implementations (FIMI’03), Melbourne, FL, pp 123–132
Han J, Dong G, Yin Y (1999) Efficient mining of partial periodic patterns in time series database. In: Proceeding of the 1999 international conference on data engineering (ICDE’99), Sydney, Australia, pp 106–115
Han J, Fu Y (1995) Discovery of multiple-level association rules from large databases. In: Proceeding of the 1995 international conference on very large data bases (VLDB’95), Zurich, Switzerland, pp 420–431
Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann
Han J, Pei J, Dong G, Wang K (2001) Efficient computation of iceberg cubes with complex measures. In: Proceeding of the 2001 ACM-SIGMOD international conference on management of data (SIGMOD’01), Santa Barbara, CA, pp 1–12
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceeding of the 2000 ACM-SIGMOD international conference on management of data (SIGMOD’00), Dallas, TX, pp 1–12
Hilderman RJ, Hamilton HJ (2001) Knowledge discovery and measures of interest. Kluwer Academic
Holder LB, Cook DJ, Djoko S (1994) Substructure discovery in the subdue system. In: Proceeding of the AAAI’94 workshop knowledge discovery in databases (KDD’94), Seattle, WA, pp 169–180
Holsheimer M, Kersten M, Mannila H, Toivonen H (1995) A perspective on databases and data mining. In Proceeding of the 1995 international conference on knowledge discovery and data mining (KDD’95), Montreal, Canada, pp 150–155
Huan J, Wang W, Bandyopadhyay D, Snoeyink J, Prins J, Tropsha A (2004) Mining spatial motifs from protein structure graphs. In: Proceeding of the 8th international conference on research in computational molecular biology (RECOMB), San Diego, CA, pp 308–315
Huan J, Wang W, Prins J (2003) Efficient mining of frequent subgraph in the presence of isomorphism. In: Proceeding of the 2003 international conference on data mining (ICDM’03), Melbourne, FL, pp 549–552
Huan J, Wang W, Prins J, Yang J (2004) Spin: mining maximal frequent subgraphs from graph databases. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 581–586
Imielinski T, Khachiyan L, Abdulghani A (2002) Cubegrades: generalizing association rules. Data Min Knowl Discov 6:219–258
Inokuchi A, Washio T, Motoda H (2000) An apriori-based algorithm for mining frequent substructures from graph data. In: Proceeding of the 2000 European symposium on the principle of data mining and knowledge discovery (PKDD’00), Lyon, France, pp 13–23
Jaroszewicz S, Scheffer T (2005) Fast discovery of unexpected patterns in data relative to a bayesian network. In: Proceeding of the 2005 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’05), Chicago, IL, pp 118–127
Jaroszewicz S, Simovici D (2004) interestingness of frequent itemsets using bayesian networks as background knowledge. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’04), Seattle, WA, pp 178–186
Ji X, Bailey J, Dong G (2005) Mining minimal distinguishing subsequence patterns with gap constraints. In: Proceeding of the 2005 international conference on data mining (ICDM’05), Houston, TX, pp 194–201
Jin R, Agrawal G (2005) An algorithm for in-core frequent itemset mining on streaming data. In Proceeding of the 2005 international conference on data mining (ICDM’05), Houston, TX, pp 210–217
Jin R, Wang C, Polshakov D, Parthasarathy S, Agrawal G (2005) Discovering frequent topological structures from graph datasets. In: Proceeding of the 2005 ACM SIGKDD international conference on knowledge discovery in databases (KDD’05), Chicago, IL, pp 606–611
Kamber M, Han J, Chiang JY (1997) Metarule-guided mining of multi-dimensional association rules using data cubes. In: Proceeding of the 1997 international conference on knowledge discovery and data mining (KDD’97), Newport Beach, CA, pp 207–210
Karp RM, Papadimitriou CH, Shenker S (2003) A simple algorithm for finding frequent elements in streams and bags. ACM Trans Database Syst, 28:51–55
Koperski K, Han J (1995) Discovery of spatial association rules in geographic information databases. In: Proceeding of the 1995 international symposium on large spatial databases (SSD’95), Portland, ME, pp 47–66
Kosala R, Blockeel H (2000) Web mining research: a survey. SIGKDD Explor 2
Kuramochi M, Karypis G (2001) Frequent subgraph discovery. In: Proceeding of the 2001 international conference on data mining (ICDM’01), San Jose, CA, pp 313–320
Kuramochi M, Karypis G (2004) GREW: a scalable frequent subgraph discovery algorithm. In Proceeding of the 2004 international conference on data mining (ICDM’04), Brighton, UK, pp 439–442
Lakshmanan LVS, Ng R, Han J, Pang A (1999) Optimization of constrained frequent set queries with 2-variable constraints. In: Proceeding of the 1999 ACM-SIGMOD international conference on management of data (SIGMOD’99), Philadelphia, PA, pp 157–168
Lakshmanan LVS, Pei J, Han J (2002) Quotient cube: how to summarize the semantics of a data cube. In: Proceeding of the 2002 international conference on very large data bases (VLDB’02), Hong Kong, China, pp 778–789
Lee Y-K, Kim W-Y, Cai YD, Han J (2003) CoMine: efficient mining of correlated patterns. In: Proceeding of the 2003 international conference on data mining (ICDM’03), Melbourne, FL, pp 581–584
Lent B, Swami A, Widom J (1997) Clustering association rules. In: Proceeding of the 1997 international conference on data engineering (ICDE’97), Birmingham, England, pp 220–231
Li Z, Chen Z, Srinivasan SM, Zhou Y (2004) C-Miner: mining block correlations in storage systems. In: Proceeding of the 2004 USENIX conference on file and storage technologies (FAST’04), San Francisco, CA, pp 173–186
Li J, Dong G, Ramamohanrarao K (2000) Making use of the most expressive jumping emerging patterns for classification. In: Proceeding of the 2000 Pacific-Asia conference on knowledge discovery and data mining (PAKDD’00), Kyoto, Japan, pp 220–232
Li X, Han J, Kim S (2006) Motion-alert: automatic anomaly detection in massive moving objects. In: IEEE international conference on intelligence and security informatics (ISI’06), San Diego, CA, pp 166–177
Li W, Han J, Pei J (2001) CMAR: accurate and efficient classification based on multiple class-association rules. In: Proceeding of the 2001 international conference on data mining (ICDM’01), San Jose, CA, pp 369–376
Li Z, Lu S, Myagmar S, Zhou Y (2004) CP-Miner: a tool for finding copy-paste and related bugs in operating system code. In: Proceeding of the 2004 symposium on operating systems design and implementation (OSDI’04), San Francisco, CA, pp 289–302
Li Z, Zhou Y (2005) PR-Miner: Automatically extracting implicit programming rules and detecting violations in large software code. In: Proceeding of the 2005 ACM SIGSOFT symposium on foundations software eng (FSE’05), Lisbon, Portugal, pp 306–315
Lin C, Chiu D, Wu Y, Chen A (2005) Mining frequent itemsets from data streams with a time-sensitive sliding window. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, pp 68–79
Liu H, Han J, Xin D, Shao Z (2006) Mining frequent patterns on very high dimensional data: a top-down row enumeration approach. In: Proceeding of the 2006 SIAM international conference on data mining (SDM’06), Bethesda, MD, pp 280–291
Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Proceeding of the 1998 international conference on knowledge discovery and data mining (KDD’98), New York, NY, pp 80–86
Liu G, Li J, Wong L, Hsu W (2006) Positive borders or negative borders: how to make lossless generator based representations concise. In: Proceeding of the 2006 SIAM international conference on data mining (SDM’06), Bethesda, MD, pp 467–471
Liu G, Lu H, Lou W, Yu JX (2003) On computing, storing and querying frequent patterns. In: Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 607–612
Liu J, Paulsen S, Sun X, Wang W, Nobel A, Prins J (2006) Mining approximate frequent itemsets in the presence of noise: algorithm and analysis. In: Proceeding of the 2006 SIAM international conference on data mining (SDM’06), Bethesda, MD, pp 405–416
Liu J, Pan Y, Wang K, Han J (2002) Mining frequent item sets by opportunistic projection. In: Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD’02), Edmonton, Canada, pp 239–248
Liu C, Yan X, Yu H, Han J, Yu PS (2005) Mining behavior graphs for “backtrace” of noncrashing bugs. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, pp 286–297
Lu H, Han J, Feng L (1998) Stock movement and n-dimensional inter-transaction association rules. In: Proceeding of the 1998 SIGMOD workshop research issues on data mining and knowledge discovery (DMKD’98), Seattle, WA, pp 12:1–12:7
Luo C, Chung S (2005) Efficient mining of maximal sequential patterns using multiple samples. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, CA, pp 415–426
Ma S, Hellerstein JL (2001) Mining partially periodic event patterns with unknown periods. In: Proceeding of the 2001 international conference on data engineering (ICDE’01), Heidelberg, Germany, pp 205–214
Manku G, Motwani R (2002) Approximate frequency counts over data streams. In: Proceeding of the 2002 international conference on very large data bases (VLDB’02), Hong Kong, China, pp 346–357
Mannila H, Toivonen H, Verkamo AI (1994) Efficient algorithms for discovering association rules. In: Proceeding of the AAAI’94 workshop knowledge discovery in databases (KDD’94), Seattle, WA, pp 181–192
Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Discov 1:259–289
Mei Q, Xin D, Cheng H, Han J, Zhai C (2006) Generating semantic annotations for frequent patterns with context analysis. In: Proceeding of the 2006 ACM SIGKDD international conference on knowledge discovery in databases (KDD’06), Philadelphia, PA, pp 337–346
Metwally A, Agrawal D, El Abbadi A (2005) Efficient computation of frequent and top-k elements in data streams. In: Proceeding of the 2005 international conference on database theory (ICDT’05), Edinburgh, UK, pp 398–412
Miller RJ, Yang Y (1997) Association rules over interval data. In: Proceeding of the 1997 ACM-SIGMOD international conference on management of data (SIGMOD’97), Tucson, AZ, pp 452–461
Nanopoulos A, Manolopoulos Y (2001) Mining patterns from graph traversals. Data Knowl Eng 37:243–266
Ng R, Lakshmanan LVS, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained associations rules. In: Proceeding of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD’98), Seattle, WA, pp 13–24
Nijssen S, Kok J (2004) A quickstart in frequent structure mining can make a difference. In: Proceeding of the 2004 ACM SIGKDD international conference on kowledge discovery in databases (KDD’04), Seattle, WA, pp 647–652
Omiecinski E (2003) Alternative interest measures for mining associations. IEEE Trans Knowl and data engineering 15:57–69
Özden B, Ramaswamy S, Silberschatz A (1998) Cyclic association rules. In: Proceeding of the 1998 international conference on data engineering (ICDE’98), Orlando, FL, pp 412–421
Pan F, Cong G, Tung AKH, Yang J, Zaki M (2003) CARPENTER: finding closed patterns in long biological datasets. In: Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 637–642
Pan F, Tung AKH, Cong G, Xu X (2004) COBBLER: combining column, and row enumeration for closed pattern discovery. In: Proceeding of the 2004 international conference on scientific and statistical database management (SSDBM’04), Santorini Island, Greece, pp 21–30
Park JS, Chen MS, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceeding of the 1995 ACM-SIGMOD international conference on management of data (SIGMOD’95), San Jose, CA, pp 175–186
Park JS, Chen MS, Yu PS (1995) Efficient parallel mining for association rules. In: Proceeding of the 4th international conference on information and knowledge management, Baltimore, MD, pp 31–36
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceeding of the 7th international conference on database theory (ICDT’99), Jerusalem, Israel, pp 398–416
Pei J, Dong G, Zou W, Han J (2002) On computing condensed frequent pattern bases. In: Proceeding of the 2002 international conference on data mining (ICDM’02), Maebashi, Japan, pp 378–385
Pei J, Han J, Lakshmanan LVS (2001) Mining frequent itemsets with convertible constraints. In: Proceeding of the 2001 international conference on data engineering (ICDE’01), Heidelberg, Germany, pp 433–332
Pei J, Han J, Mao R (2000) CLOSET: an efficient algorithm for mining frequent closed itemsets. In: Proceeding of the 2000 ACM-SIGMOD international workshop data mining and knowledge discovery (DMKD’00), Dallas, TX, pp 11–20
Pei J, Han J, Mortazavi-Asl B, Pinto H, Chen Q, Dayal U, Hsu M-C (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceeding of the 2001 international conference on data engineering (ICDE’01), Heidelberg, Germany, pp 215–224
Pei J, Han J, Mortazavi-Asl B, Wang J, Pinto H, Chen Q, Dayal U, Hsu M-C (2004) Mining sequential patterns by pattern-growth: the prefixspan approach. IEEE Trans Knowl Data Eng 16:1424–1440
Pei J, Han J, Mortazavi-Asl B, Zhu H (2000) Mining access patterns efficiently from web logs. In: Proceeding of the 2000 Pacific-Asia conference on knowledge discovery and data mining (PAKDD’00), Kyoto, Japan, pp 396–407
Pei J, Han J, Wang W (2002) Constraint-based sequential pattern mining in large databases. In: Proceeding of the 2002 international conference on information and knowledge management (CIKM’02), McLean, VA, pp 18–25
Pei J, Liu J, Wang H, Wang K, Yu PS, Yang J (2005) Efficiently mining frequent closed partial orders. In: Proceeding of the 2005 international conference on data mining (ICDM’05), Houston, TX, pp 753–756
Piatetsky-Shapiro G (1991) Notes of AAAI’91 workshop knowledge discovery in databases (KDD’91). AAAI/MIT Press, Anaheim, CA
Pinto H, Han J, Pei J, Wang K, Chen Q, Dayal U (2001) Multi-dimensional sequential pattern mining. In: Proceeding of the 2001 international conference on information and knowledge management (CIKM’01), Atlanta, GA, pp 81–88
Punin J, Krishnamoorthy M, Zaki M (2001) Web usage mining: languages and algorithms. Springer-Verlag
Ramesh G, Maniatty WA, Zaki MJ (2003) Feasible itemset distributions in data mining: theory and application. In: Proceeding of the 2003 ACM symposium on principles of database systems (PODS’03), San Diego, CA, pp 284–295
Sarawagi S, Thomas S, Agrawal R (1998) Integrating association rule mining with relational database systems: alternatives and implications. In: Proceeding of the 1998 ACM-SIGMOD international conference on management of data (SIGMOD’98), Seattle, WA, pp 343–354
Savasere A, Omiecinski E, Navathe S (1995) An efficient algorithm for mining association rules in large databases. In: Proceeding of the 1995 international conference on very large data bases (VLDB’95), Zurich, Switzerland, pp 432–443
Seppänen J, Mannila H (2004) Dense itemsets. In: Proceeding of the 2004 international conference on knowledge discovery and data mining (KDD’04), Seattle, WA, pp 683–688
Shekar B, Natarajan R (2004) A transaction-based neighbourhood-driven approach to quantifying interestingness of assoication rules. In: Proceeding of the 2004 international conference on data mining (ICDM’04), Brighton, UK, pp 194–201
Siebes A, Vreeken J, Leeuwen M (2006) Item sets that compress. In: Proceeding of the 2006 SIAM international conference on data mining (SDM’06), Bethesda, MD, pp 393–404
Silverstein C, Brin S, Motwani R, Ullman JD (1998) Scalable techniques for mining causal structures. In: Proceeding of the 1998 international conference on very large data bases (VLDB’98), New York, NY, pp 594–605
Sismanis Y, Roussopoulos N, Deligianannakis A, Kotidis Y (2002) Dwarf: shrinking the petacube. In: Proceeding of the 2002 ACM-SIGMOD international conference on management of data (SIGMOD’02), Madison, WI, pp 464–475
Srikant R, Agrawal R (1995) Mining generalized association rules. In: Proceeding of the 1995 international conference on very large data bases (VLDB’95), Zurich, Switzerland, pp 407–419
Srikant R, Agrawal R (1996) Mining sequential patterns: generalizations and performance improvements. In: Proceeding of the 5th international conference on extending database technology (EDBT’96), Avignon, France, pp 3–17
Srivastava J, Cooley R, Deshpande M, Tan PN (2000) Web usage mining: discovery and applications of usage patterns from web data. SIGKDD Explor 1:12–23
Steinbach M, Tan P, Kumar V (2004) Support envelopes: A technique for exploring the structure of association patterns. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 296–305
Tan P-N, Kumar V, Srivastava J (2002) Selecting the right interestingness measure for association patterns. In: Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD’02), Edmonton, Canada, pp 32–41
Ting R, Bailey J (2006) Mining minimal contrast subgraph patterns. In: Proceeding of the 2006 SIAM international conference on data mining (SDM’06), Bethesda, MD, pp 638–642
Toivonen H (1996) Sampling large databases for association rules. In: Proceeding of the 1996 international conference on very large data bases (VLDB’96), Bombay, India, pp 134–145
Ukkonen A, Fortelius M, Mannila H (2005) Finding partial orders from unordered 0-1 data. In: Proceeding of the 2005 international conference on knowledge discovery and data mining (KDD’05), Chicago, IL, pp 285–293
Vanetik N, Gudes E, Shimony SE (2002) Computing frequent graph patterns from semistructured data. In: Proceeding of the 2002 international conference on data mining (ICDM’02), Maebashi, Japan, pp 458–465
Wang J, Han J (2004) BIDE: Efficient mining of frequent closed sequences. In: Proceeding of the 2004 international conference on data engineering (ICDE’04), Boston, MA, pp 79–90
Wang J, Han J, Lu Y, Tzvetkov P (2005) TFP: An efficient algorithm for mining top-k frequent closed itemsets. IEEE Trans Knowl Data Eng 17:652–664
Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 236–245
Wang K, Jiang Y, Lakshmanan L (2003) Mining unexpected rules by pushing user dynamics. In: Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery in databases (KDD’03), Washington, DC, pp 246–255
Wang J, Karypis G (2005) HARMONY: efficiently mining the best rules for classification. In: Proceeding of the 2005 SIAM conference on data mining (SDM’05), Newport Beach, CA, pp 205–216
Wang W, Lu H, Feng J, Yu JX (2002) Condensed cube: an effective approach to reducing data cube size. In: Proceeding of the 2002 international conference on data engineering (ICDE’02), San Fransisco, CA, pp 155–165
Wang C, Wang W, Pei J, Zhu Y, Shi B (2004) Scalable mining of large disk-base graph databases. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 316–325
Wang H, Wang W, Yang J, Yu PS (2002) Clustering by pattern similarity in large data sets. In: Proceeding of the 2002 ACM-SIGMOD international conference on management of data (SIGMOD’02), Madison, WI, pp 418–427
Washio T, Motoda H (2003) State of the art of graph-based data mining. SIGKDD Explor 5:59–68
Xin D, Han J, Li X, Wah WB (2003) Star-cubing: computing iceberg cubes by top-down and bottom-up integration. In: Proceeding of the 2003 international conference on very large data bases (VLDB’03), Berlin, Germany, pp 476–487
Xin D, Han J, Shao Z, Liu H (2006) C-cubing: efficient computation of closed cubes by aggregation-based checking. In: Proceeding of the 2006 international conference on data engineering (ICDE’06), Atlanta, Georgia, p 4
Xin D, Han J, Yan X, Cheng H (2005) Mining compressed frequent-pattern sets. In: Proceeding of the 2005 international conference on very large data bases (VLDB’05), Trondheim, Norway, pp 709–720
Xin D, Shen X, Mei Q, Han J (2006) Discovering interesting patterns through user’s interactive feedback. In: Proceeding of the 2006 ACM SIGKDD international conference on knowledge discovery in databases (KDD’06), Philadelphia, PA, pp 773–778
Xiong H, Shekhar S, Huang Y, Kumar V, Ma X, Yoo JS (2004) A framework for discovering co-location patterns in data sets with extended spatial objects. In: Proceeding of the 2004 SIAM international conference on data mining (SDM’04), Lake Buena Vista, FL, pp 78–89
Yan X, Cheng H, Han J, Xin D (2005) Summarizing itemset patterns: a profile-based approach. In: Proceeding of the 2005 ACM SIGKDD international conference on knowledge discovery in databases (KDD’05), Chicago, IL, pp 314–323
Yan X, Han J (2002) gSpan: graph-based substructure pattern mining. In: Proceeding of the 2002 international conference on data mining (ICDM’02), Maebashi, Japan, pp 721–724
Yan X, Han J (2003) CloseGraph: mining closed frequent graph patterns. In: Proceeding of the 2003 ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), Washington, DC, pp 286–295
Yan X, Han J, Afshar R (2003) CloSpan: mining closed sequential patterns in large datasets. In: Proceeding of the 2003 SIAM international conference on data mining (SDM’03), San Francisco, CA, pp 166–177
Yan X, Yu PS, Han J (2004) Graph indexing: a frequent structure-based approach. In: Proceeding of the 2004 ACM-SIGMOD international conference on management of data (SIGMOD’04), Paris, France, pp 335–346
Yan X, Yu PS, Han J (2005) Substructure similarity search in graph databases. In: Proceeding of the 2005 ACM-SIGMOD international conference on management of data (SIGMOD’05), Baltimore, MD, pp 766–777
Yan X, Zhou XJ, Han J (2005) Mining closed relational graphs with connectivity constraints. In: Proceeding of the 2005 ACM SIGKDD international conference on knowledge discovery in databases (KDD’05), Chicago, IL, pp 324–333
Yan X, Zhu F, Han J, Yu PS (2006) Searching substructures with superimposed distance. In: Proceeding of the 2006 international conference on data engineering (ICDE’06), Atlanta, Georgia, p 88
Yang G (2004) The complexity of mining maximal frequent itemsets and maximal frequent patterns. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 344–353
Yang C, Fayyad U, Bradley PS (2001) Efficient discovery of error-tolerant frequent itemsets in high dimensions. In: Proceeding of the 2001 ACM SIGKDD international conference on knowledge discovery in databases (KDD’01), San Francisco, CA, pp 194–203
Yang LH, Lee M-L, Hsu W (2003) Efficient mining of XML query patterns for caching. In: VLDB, pp 69–80
Yang J, Wang W (2003) CLUSEQ: efficient and effective sequence clustering. In: Proceeding of the 2003 international conference on data engineering (ICDE’03), Bangalore, India, pp 101–112
Yang J, Wang W, Yu PS (2003) Mining asynchronous periodic patterns in time series data. IEEE Trans Knowl Data Eng 15:613–628
Yin X, Han J (2003) CPAR: classification based on predictive association rules. In: Proceeding of the 2003 SIAM international conference on data mining (SDM’03), San Francisco, CA, pp 331–335
Yoda K, Fukuda T, Morimoto Y, Morishita S, Tokuyama T (1997) Computing optimized rectilinear regions for association rules. In: Proceeding of the 1997 international conference on knowledge discovery and data mining (KDD’97), Newport Beach, CA, pp 96–103
Yu JX, Chong Z, Lu H, Zhou A (2004) False positive or false negative: mining frequent itemsets from high speed transactional data streams. In: Proceeding of the 2004 international conference on very large data bases (VLDB’04), Toronto, Canada, pp 204–215
Yun U, Leggett J (2005) WFIM: weighted frequent itemset mining with a weight range and a minimum weight. In: Proceeding of the 2005 SIAM international conference on data mining (SDM’05), Newport Beach, CA, pp 636–640
Zaïane OR, Han J, Zhu H (2000) Mining recurrent items in multimedia with progressive resolution refinement. In: Proceeding of the 2000 international conference on data engineering (ICDE’00), San Diego, CA, pp 461–470
Zaki MJ (1998) Efficient enumeration of frequent sequences. In: Proceeding of the 7th international conference on information and knowledge management (CIKM’98), Washington, DC, pp 68–75
Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12:372–390
Zaki MJ (2001) SPADE: an efficient algorithm for mining frequent sequences. Mach Learn 40:31–60
Zaki MJ (2002) Efficiently mining frequent trees in a forest. In: Proceeding of the 2002 ACM SIGKDD international conference on knowledge discovery in databases (KDD’02), Edmonton, Canada, pp 71–80
Zaki MJ, Hsiao CJ (2002) CHARM: an efficient algorithm for closed itemset mining. In: Proceeding of the 2002 SIAM international conference on data mining (SDM’02), Arlington, VA, pp 457–473
Zaki MJ, Lesh N, Ogihara M (1998) PLANMINE: sequence mining for plan failures. In: Proceeding of the 1998 international conference on knowledge discovery and data mining (KDD’98), New York, NY, pp 369–373
Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) Parallel algorithm for discovery of association rules. Data Min Knowl Discov 1:343–374
Zhang X, Mamoulis N, Cheung DW, Shou Y (2004) Fast mining of spatial collocations. In: Proceeding of the 2004 ACM SIGKDD international conference on knowledge discovery in databases (KDD’04), Seattle, WA, pp 384–393
Zhang H, Padmanabhan B, Tuzhilin A (2004) On the discovery of significant statistical quantitative rules. In: Proceeding of the 2004 international conference on knowledge discovery and data mining (KDD’04), Seattle, WA, pp 374–383
Zhu F, Yan X, Han J, Yu PS, Cheng H (2007) Mining colossal frequent patterns by core pattern fusion. In: Proceeding of the 2007 international conference on data engineering (ICDE’07), Istanbul, Turkey
Additional information
Communicated by Geoff Webb.
The work was supported in part by the U.S. National Science Foundation NSF IIS-05-13678/06-42771 and NSF BDI-05-15813. Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies.
About this article
Cite this article
Han, J., Cheng, H., Xin, D. et al. Frequent pattern mining: current status and future directions. Data Min Knowl Disc 15, 55–86 (2007). https://doi.org/10.1007/s10618-006-0059-1
DOI: https://doi.org/10.1007/s10618-006-0059-1