Closed Sets for Labeled Data

  • Gemma C. Garriga
  • Petra Kralj
  • Nada Lavrač
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)

Abstract

Closed sets have been successfully applied to compact data representation in association rule learning. However, their use is mainly descriptive. This paper shows that, when considering labeled data, closed sets can be adapted for prediction and discrimination purposes by contrasting their covering properties on positive and negative examples. We formally justify that these sets characterize the space of relevant combinations of features for discriminating the target class. In practice, identifying relevant and irrelevant combinations of features through closed sets is useful in many applications. Here we apply it to compacting emerging patterns and essential rules and to learning descriptions for subgroup discovery.
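As a rough illustration of the idea in the abstract, the sketch below computes closed itemsets on the positive examples of a toy dataset and then contrasts how many positive and negative examples each one covers. It is not the authors' algorithm: the function names (`closure`, `support`, `closed_sets`), the toy data, and the brute-force enumeration are all assumptions made for this example, and the paper's relevance criterion is not reproduced here.

```python
from itertools import combinations


def closure(itemset, examples):
    """Return the largest itemset shared by every example containing `itemset`.

    Returns None when no example contains `itemset` (a convention chosen
    for this sketch only).
    """
    covered = [ex for ex in examples if itemset <= ex]
    if not covered:
        return None
    return frozenset.intersection(*covered)


def support(itemset, examples):
    """Count the examples that contain every item of `itemset`."""
    return sum(1 for ex in examples if itemset <= ex)


def closed_sets(examples):
    """Enumerate closed itemsets by brute force (fine for toy data only)."""
    items = sorted(set().union(*examples))
    found = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            closed = closure(frozenset(combo), examples)
            if closed is not None:
                found.add(closed)
    return found


# Hypothetical labeled data: each example is a set of feature names.
positives = [frozenset("abc"), frozenset("abd"), frozenset("ab")]
negatives = [frozenset("abe"), frozenset("cd")]

# Closed sets are mined on the positives; coverage is contrasted on both classes.
for cs in sorted(closed_sets(positives), key=sorted):
    print(sorted(cs), "pos:", support(cs, positives), "neg:", support(cs, negatives))
```

On this toy data the closed sets of the positive class are {a, b}, {a, b, c} and {a, b, d}. Only {a, b} also covers a negative example, which is the kind of contrast between positive and negative coverage that the abstract refers to.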

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Gemma C. Garriga (1)
  • Petra Kralj (2)
  • Nada Lavrač (2, 3)

  1. Universitat Politècnica de Catalunya, Barcelona, Spain
  2. Jožef Stefan Institute, Ljubljana, Slovenia
  3. University of Nova Gorica, Nova Gorica, Slovenia