Generic Pattern Trees for Exhaustive Exceptional Model Mining

  • Florian Lemmerich
  • Martin Becker
  • Martin Atzmueller
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7524)

Abstract

Exceptional model mining has been proposed as a variant of subgroup discovery that focuses especially on complex target concepts. Currently, efficient mining algorithms are limited to heuristic (non-exhaustive) methods. In this paper, we propose a novel approach for fast exhaustive exceptional model mining: we introduce the concept of valuation bases as an intermediate condensed data representation, and present the general GP-growth algorithm, which is based on FP-growth. Furthermore, we discuss the scope of the proposed approach by drawing an analogy to data stream mining and provide examples for several different model classes. Runtime experiments show improvements of more than an order of magnitude in comparison to a naive exhaustive depth-first search.
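
The abstract does not define valuation bases in detail, so the following is only an illustrative sketch of one plausible reading: a small, mergeable summary from which model parameters can be derived without revisiting the raw data, much like a single-pass summary in stream processing. The class name `MeanVarianceBase`, its fields, and the choice of a mean/variance target model are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MeanVarianceBase:
    """Hypothetical condensed summary of a set of numeric target values."""
    n: int = 0              # number of instances summarized
    total: float = 0.0      # sum of target values
    total_sq: float = 0.0   # sum of squared target values

    @staticmethod
    def of(value: float) -> "MeanVarianceBase":
        # Summary of a single instance.
        return MeanVarianceBase(1, value, value * value)

    def merge(self, other: "MeanVarianceBase") -> "MeanVarianceBase":
        # Combining two summaries only needs component-wise addition,
        # so summaries of disjoint instance sets can be aggregated.
        return MeanVarianceBase(
            self.n + other.n,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    def mean(self) -> float:
        return self.total / self.n

    def variance(self) -> float:
        # Population variance derived from the summary alone.
        m = self.mean()
        return self.total_sq / self.n - m * m


def summarize(values: Iterable[float]) -> MeanVarianceBase:
    """Fold a sequence of target values into one summary (single pass)."""
    base = MeanVarianceBase()
    for v in values:
        base = base.merge(MeanVarianceBase.of(v))
    return base
```

Under these assumptions, merging the summaries of two disjoint groups gives the same result as summarizing their union, which is the property a tree-based algorithm would need in order to aggregate summaries stored in its nodes. The sum-of-squares form is chosen for brevity; it can lose precision on large or poorly scaled data, where a numerically stable update would be preferable.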

Keywords

exceptional model mining, subgroup discovery

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Florian Lemmerich (1)
  • Martin Becker (1)
  • Martin Atzmueller (2)

  1. Artificial Intelligence and Applied Computer Science Group, University of Würzburg, Würzburg, Germany
  2. Knowledge & Data Engineering Group, University of Kassel, Kassel, Germany
