
Ensembles of jittered association rule classifiers


Abstract

Ensembles of classifiers tend to improve predictive accuracy. To obtain an ensemble of N classifiers, one typically needs to run N learning processes. In this paper we introduce and explore Model Jittering Ensembling, where a single model is perturbed in order to obtain variants that can be used as an ensemble. As base classifiers we use sets of classification association rules. The two jittering ensembling methods we propose are Iterative Reordering Ensembling (IRE) and Post-Bagging (PB). Both methods learn one rule set in a single run and then produce multiple rule sets without relearning. Empirical results on 36 data sets are positive and show that both strategies tend to reduce error with respect to the single-model association rule classifier. A bias–variance analysis reveals that while both IRE and PB reduce the variance component of the error, IRE is particularly effective in reducing the bias component. We show that Model Jittering Ensembling can offer a substantial speed-up over multiple-model ensembling. We also compare Model Jittering with several state-of-the-art classifiers in terms of predictive accuracy and computational efficiency.
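To make the jittering idea concrete, below is a minimal Python sketch of a post-bagging-style ensemble under stated assumptions: the base classifier is a decision list of classification association rules, each ensemble member is the same rule set reordered by its confidence on a bootstrap resample of the training data, and prediction is a majority vote over the members' first-firing rules. The names (Rule, post_bag, predict, predict_ensemble) and the confidence-based reordering are illustrative assumptions, not the authors' CAREN implementation, and the IRE variant is not shown.

```python
# Hypothetical sketch of post-bagging: learn ONE rule set once, then build
# an ensemble by re-scoring (and hence reordering) the SAME rules on
# bootstrap resamples of the training data, with no relearning.
import random
from collections import Counter

class Rule:
    def __init__(self, antecedent, consequent):
        self.antecedent = antecedent       # set of items that must hold
        self.consequent = consequent       # predicted class label

    def covers(self, example):
        return self.antecedent <= example  # antecedent is a subset of the example

def confidence(rule, data):
    """Fraction of covered examples whose class matches the rule's consequent."""
    covered = [(x, y) for x, y in data if rule.covers(x)]
    if not covered:
        return 0.0
    return sum(1 for _, y in covered if y == rule.consequent) / len(covered)

def post_bag(rules, train, n_models=30, seed=0):
    """Perturb a single rule set into an ensemble: re-score the same rules
    on bootstrap samples, yielding differently ordered rule lists."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]   # bootstrap resample
        scored = sorted(rules, key=lambda r: confidence(r, sample), reverse=True)
        ensemble.append(scored)
    return ensemble

def predict(ordered_rules, example, default):
    """Decision-list prediction: the first matching rule fires."""
    for r in ordered_rules:
        if r.covers(example):
            return r.consequent
    return default

def predict_ensemble(ensemble, example, default):
    """Majority vote over the jittered rule sets."""
    votes = Counter(predict(rs, example, default) for rs in ensemble)
    return votes.most_common(1)[0][0]

# Toy usage: examples are (itemset, class) pairs over string-valued items.
train = [({"a", "b"}, "+"), ({"a"}, "-"), ({"b"}, "+"), ({"c"}, "-")]
rules = [Rule({"a"}, "-"), Rule({"b"}, "+")]
ensemble = post_bag(rules, train, n_models=5)
print(predict_ensemble(ensemble, {"a", "b"}, default="+"))
```

Because each member only re-scores existing rules on a resample rather than mining rules anew, building the ensemble costs a fraction of learning N models from scratch, which is the speed-up over multiple-model ensembling that the abstract refers to.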



Author information


Corresponding author

Correspondence to Paulo J. Azevedo.

Additional information

Responsible editor: Johannes Fürnkranz and Arno Knobbe.


Cite this article

Azevedo, P.J., Jorge, A.M. Ensembles of jittered association rule classifiers. Data Min Knowl Disc 21, 91–129 (2010). https://doi.org/10.1007/s10618-010-0173-y


