ISMIS 2009: Foundations of Intelligent Systems, pp 35-44
Fast Subgroup Discovery for Continuous Target Concepts
Abstract
Subgroup discovery is a flexible data mining method for a broad range of applications. It considers a given property of interest (target concept), and aims to discover interesting subgroups with respect to this concept. In this paper, we especially focus on the handling of continuous target variables and describe an approach for fast and efficient subgroup discovery for such target concepts. We propose novel formalizations of effective pruning strategies for reducing the search space, and we present the SD-Map* algorithm that enables fast subgroup discovery for continuous target concepts. The approach is evaluated using real-world data from the industrial domain.
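To make the role of quality functions and optimistic-estimate pruning concrete, the following is a minimal sketch, not the SD-Map* algorithm itself, whose details the abstract does not give. It assumes a mean-based quality function q(s) = n * (mean(s) - m0), which is common in the subgroup discovery literature for continuous targets, and the bound obtained by summing only the target values above the population mean m0. The toy data, the function names, and the plain depth-first enumeration are all hypothetical choices made for illustration.

```python
from statistics import mean

# Hypothetical toy data: nominal attributes plus a continuous target value.
ROWS = [
    ({"machine": "A", "shift": "day"}, 9.0),
    ({"machine": "A", "shift": "night"}, 7.0),
    ({"machine": "B", "shift": "day"}, 2.0),
    ({"machine": "B", "shift": "night"}, 1.0),
    ({"machine": "A", "shift": "day"}, 8.0),
    ({"machine": "B", "shift": "day"}, 3.0),
]


def covered(description, rows):
    """Target values of all rows matching every (attribute, value) selector."""
    return [t for attrs, t in rows
            if all(attrs.get(a) == v for a, v in description)]


def quality(targets, m0):
    """Assumed quality function: n * (subgroup mean - population mean)."""
    return len(targets) * (mean(targets) - m0) if targets else 0.0


def optimistic_estimate(targets, m0):
    """Upper bound on the quality of any refinement of this subgroup.

    A refinement covers a subset of these rows, and its quality equals the
    sum of (t - m0) over that subset, so keeping only the positive terms
    can never underestimate it.
    """
    return sum(t - m0 for t in targets if t > m0)


def discover(rows, k=3, max_depth=2, prune=True):
    """Depth-first top-k search; returns (top_k, number_of_evaluations)."""
    m0 = mean(t for _, t in rows)
    selectors = sorted({(a, v) for attrs, _ in rows for a, v in attrs.items()})
    top = []        # (quality, description), best first, at most k entries
    evaluated = 0

    def search(description, start):
        nonlocal evaluated
        for i in range(start, len(selectors)):
            sel = selectors[i]
            if any(a == sel[0] for a, _ in description):
                continue  # at most one value per attribute
            desc = description + (sel,)
            targets = covered(desc, rows)
            if not targets:
                continue
            evaluated += 1
            top.append((quality(targets, m0), desc))
            top.sort(key=lambda pair: -pair[0])
            del top[k:]
            threshold = top[-1][0] if len(top) == k else float("-inf")
            promising = optimistic_estimate(targets, m0) > threshold
            if len(desc) < max_depth and (promising or not prune):
                search(desc, i + 1)

    search((), 0)
    return top, evaluated
```

Calling `discover(ROWS, k=1)` and `discover(ROWS, k=1, prune=False)` returns the same best subgroup, but the pruned run evaluates fewer candidates, because a branch is skipped whenever its optimistic estimate cannot beat the current k-th best quality.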
Keywords
Quality Function, Optimistic Estimate, Target Variable, Target Concept, Pruning Strategy