Selective association rule generation

Hahsler, Michael; Buchta, Christian; Hornik, Kurt

doi:10.1007/s00180-007-0062-z


Original Paper
Published: 25 July 2007

Volume 23, pages 303–315, (2008)

Computational Statistics

Michael Hahsler¹,
Christian Buchta² &
Kurt Hornik³


Abstract

Mining association rules is a popular and well-researched method for discovering interesting relations between variables in large databases. A practical problem is that, at medium to low support values, a large number of frequent itemsets and an even larger number of association rules are often found in a database. A widely used approach is to gradually increase minimum support and minimum confidence, or to filter the found rules using increasingly strict constraints on additional measures of interestingness, until the set of rules is reduced to a manageable size. In this paper we describe a different approach, which is based on first defining a set of “interesting” itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step, selectively generating rules for only these itemsets. The main advantage of this approach over increasing thresholds or filtering rules is that the number of rules found is significantly reduced while the support and confidence thresholds need not be increased, which might otherwise lead to missing important information in the database.
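
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the function names (`support`, `rules_for_itemsets`), the toy transactions, and the choice to try every non-empty proper subset of an itemset as the rule antecedent are assumptions made for illustration only.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for_itemsets(interesting, transactions, min_confidence=0.0):
    """Generate rules X -> Y only from the given itemsets (X and Y partition the itemset).

    Hypothetical helper: rules are never derived from itemsets outside
    `interesting`, so no global support threshold is needed here.
    """
    rules = []
    for itemset in map(frozenset, interesting):
        supp_z = support(itemset, transactions)
        if supp_z == 0:
            continue  # itemset never occurs, so no rule can be formed
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                # confidence(X -> Y) = support(X and Y together) / support(X)
                conf = supp_z / support(lhs, transactions)
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, supp_z, conf))
    return rules


# Toy data (assumed for illustration, not taken from the paper).
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]
interesting = [{"bread", "milk"}]

for lhs, rhs, supp, conf in rules_for_itemsets(interesting, transactions, 0.6):
    print(sorted(lhs), "->", sorted(rhs), f"support={supp:.2f} confidence={conf:.2f}")
```

Because rules are derived only from the listed itemsets, the size of the output is bounded by the size of that list rather than by the number of frequent itemsets at a given support threshold.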



Author information

Authors and Affiliations

Department of Information Systems and Operations, Institut für Informationswirtschaft, Wirtschaftsuniversität Wien, Augasse 2-6, 1090, Wien, Austria
Michael Hahsler
Institute for Tourism and Leisure Studies, Wirtschaftsuniversität Wien, Wien, Austria
Christian Buchta
Department of Statistics and Mathematics, Wirtschaftsuniversität Wien, Wien, Austria
Kurt Hornik


Corresponding author

Correspondence to Michael Hahsler.


About this article

Cite this article

Hahsler, M., Buchta, C. & Hornik, K. Selective association rule generation. Computational Statistics 23, 303–315 (2008). https://doi.org/10.1007/s00180-007-0062-z


Accepted: 28 February 2007
Published: 25 July 2007
Issue Date: April 2008
DOI: https://doi.org/10.1007/s00180-007-0062-z
