Rule Learning with Probabilistic Smoothing

  • Gianni Costa
  • Massimo Guarascio
  • Giuseppe Manco
  • Riccardo Ortale
  • Ettore Ritacco
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5691)

Abstract

A hierarchical classification framework is proposed for discriminating rare classes in imprecise domains, which are characterized by rarity (of both classes and cases), noise, and low class separability. The framework couples each rule of a rule-based classifier with a local probabilistic generative model. Each model is trained over the coverage of its rule, so as to better capture globally rare cases and classes that become less rare within that coverage. Two novel schemes for tightly integrating rule-based and probabilistic classification are introduced; both classify unlabeled cases by considering multiple classifier rules together with their local probabilistic counterparts. An intensive evaluation shows that the proposed framework is competitive with, and often superior in accuracy to, established competitors, while outperforming them on rare classes.
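
The general idea of pairing each rule with a local generative model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the generative model or the two integration schemes, so the sketch assumes a categorical naive Bayes model with Laplace smoothing and a simple averaging of the local posteriors of all firing rules. All names (`Rule`, `LocalNaiveBayes`, `train_local_models`, `classify`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


class LocalNaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (an assumed choice of
    local generative model; the abstract does not name one)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, cases, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (class, attr) -> value counts
        self.domains = defaultdict(set)           # attr -> values seen in training
        for case, y in zip(cases, labels):
            for attr, value in case.items():
                self.value_counts[(y, attr)][value] += 1
                self.domains[attr].add(value)
        return self

    def posterior(self, case):
        n = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            score = math.log((self.class_counts[c] + self.alpha)
                             / (n + self.alpha * len(self.classes)))
            for attr, value in case.items():
                if attr not in self.domains:
                    continue  # attribute never seen in this coverage
                k = len(self.domains[attr]) + 1  # one extra slot for unseen values
                count = self.value_counts[(c, attr)][value]
                score += math.log((count + self.alpha)
                                  / (self.class_counts[c] + self.alpha * k))
            log_scores[c] = score
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(unnorm.values())
        return {c: v / z for c, v in unnorm.items()}


class Rule:
    """A rule 'antecedent -> label'; the antecedent maps attributes to values."""

    def __init__(self, antecedent, label):
        self.antecedent = antecedent
        self.label = label
        self.local_model = None

    def covers(self, case):
        return all(case.get(a) == v for a, v in self.antecedent.items())


def train_local_models(rules, cases, labels):
    """Train one local model per rule, using only the cases that rule covers."""
    for rule in rules:
        covered = [(x, y) for x, y in zip(cases, labels) if rule.covers(x)]
        if covered:
            xs, ys = zip(*covered)
            rule.local_model = LocalNaiveBayes().fit(xs, ys)


def classify(rules, case, default_label):
    """Average the local posteriors of all firing rules (assumed scheme)."""
    firing = [r for r in rules if r.covers(case) and r.local_model is not None]
    if not firing:
        return default_label
    totals = Counter()
    for rule in firing:
        for c, p in rule.local_model.posterior(case).items():
            totals[c] += p / len(firing)
    return max(totals, key=totals.get)
```

In a toy data set where a class makes up 2 of 10 cases globally but 2 of 4 cases inside a rule's coverage, the local model sees a balanced problem and can separate the rare class on the remaining attributes, even though the rule itself predicts the majority class.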

Keywords

Association Rule; Minority Class; Decision Region; Default Rule; Training Case
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Gianni Costa (1)
  • Massimo Guarascio (1)
  • Giuseppe Manco (1)
  • Riccardo Ortale (1)
  • Ettore Ritacco (1)
  1. ICAR-CNR, Rende, Italy
