Selective association rule generation

  • Original Paper
  • Computational Statistics

Abstract

Mining association rules is a popular and well-researched method for discovering interesting relations between variables in large databases. A practical problem is that at medium to low support values, a large number of frequent itemsets and an even larger number of association rules are often found in a database. A widely used approach is to gradually increase minimum support and minimum confidence, or to filter the found rules using increasingly strict constraints on additional measures of interestingness, until the set of rules is reduced to a manageable size. In this paper we describe a different approach, which is based on the idea of first defining a set of “interesting” itemsets (e.g., by a mixture of mining and expert knowledge) and then, in a second step, selectively generating rules for only these itemsets. The main advantage of this approach over increasing thresholds or filtering rules is that the number of rules found is significantly reduced without having to raise the support and confidence thresholds, which might lead to missing important information in the database.
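
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `support` and `rules_for_itemsets`, the toy transactions, and the restriction to rules with a single item in the consequent are all assumptions made for illustration, and support is counted by a naive scan rather than by an efficient data structure.

```python
from typing import FrozenSet, List, Sequence, Tuple

Itemset = FrozenSet[str]
# (antecedent, consequent item, rule support, rule confidence)
Rule = Tuple[Itemset, str, float, float]


def support(itemset: Itemset, transactions: Sequence[Itemset]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rules_for_itemsets(
    itemsets: Sequence[Itemset],
    transactions: Sequence[Itemset],
    min_confidence: float,
) -> List[Rule]:
    """Generate rules only for the given (hypothetical) 'interesting' itemsets.

    Assumption: each rule has exactly one item in its consequent.
    """
    rules: List[Rule] = []
    for itemset in itemsets:
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and a consequent
        supp = support(itemset, transactions)
        if supp == 0:
            continue  # itemset never occurs, so no rule can be formed
        for consequent in sorted(itemset):
            antecedent = itemset - {consequent}
            # confidence(X => y) = supp(X u {y}) / supp(X)
            conf = supp / support(antecedent, transactions)
            if conf >= min_confidence:
                rules.append((antecedent, consequent, supp, conf))
    return rules


# Toy data (made up for this sketch).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
]
# Step 1: the set of interesting itemsets is chosen up front.
interesting = [frozenset({"bread", "butter"})]
# Step 2: rules are generated for these itemsets only.
rules = rules_for_itemsets(interesting, transactions, min_confidence=0.8)
```

Because rules are derived only from the preselected itemsets, no global support threshold has to be raised to keep the rule set small; in this sketch the confidence threshold is the only filter applied.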

Author information

Correspondence to Michael Hahsler.

About this article

Cite this article

Hahsler, M., Buchta, C. & Hornik, K. Selective association rule generation. Computational Statistics 23, 303–315 (2008). https://doi.org/10.1007/s00180-007-0062-z

  • DOI: https://doi.org/10.1007/s00180-007-0062-z