Learning Predictive Clustering Rules
The two most commonly addressed data mining tasks are predictive modelling and clustering. Here we address the task of predictive clustering, which contains elements of both and generalizes them to some extent. Predictive clustering has so far been evaluated mainly in the context of trees. In this paper, we extend predictive clustering to rules. Each cluster is described by a rule, and clusters are allowed to overlap, since the sets of examples covered by different rules need not be disjoint. We propose a system for learning these predictive clustering rules that is based on a heuristic sequential covering algorithm. The heuristic takes into account both the precision of a rule (its compactness with respect to the target space) and its compactness with respect to the input space; a parameter controls the trade-off between the two. We evaluate our system on several multi-objective classification problems.
Keywords: Target Attribute, Rule Induction, Subgroup Discovery, Inductive Database, Predictive Cluster
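The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch, not the system proposed in the paper. It assumes numeric attributes, single-condition threshold rules, variance as the compactness measure, and a linear weighting by a parameter named `tau`; all function and parameter names (`dispersion`, `rule_score`, `learn_rules`, `tau`, `min_cover`, `max_rules`) are hypothetical. It shows how a sequential covering loop can pick, at each step, the rule whose covered examples are most compact in a weighted mix of target space and input space.

```python
from statistics import pvariance


def dispersion(rows, columns):
    """Mean population variance of the given numeric columns (0.0 if no rows)."""
    if not rows:
        return 0.0
    return sum(pvariance([r[c] for r in rows]) for c in columns) / len(columns)


def rule_score(covered, all_rows, inputs, targets, tau):
    """Lower is better. Assumed form: weighted sum of relative dispersions.

    tau = 1 looks only at the target space (precision of the rule);
    tau = 0 looks only at the input space.
    """
    def relative(columns):
        base = dispersion(all_rows, columns)
        return dispersion(covered, columns) / base if base > 0 else 0.0

    return tau * relative(targets) + (1 - tau) * relative(inputs)


def candidate_rules(rows, inputs):
    """Enumerate single-condition rules of the form (attribute, op, threshold)."""
    for c in inputs:
        for t in sorted({r[c] for r in rows}):
            yield (c, "<=", t)
            yield (c, ">", t)


def covers(rule, row):
    c, op, t = rule
    return row[c] <= t if op == "<=" else row[c] > t


def learn_rules(rows, inputs, targets, tau=0.5, min_cover=2, max_rules=5):
    """Sequential covering: add the best rule, drop the examples it covers, repeat."""
    remaining, rules = list(rows), []
    while len(remaining) >= min_cover and len(rules) < max_rules:
        best = None
        for rule in candidate_rules(remaining, inputs):
            covered = [r for r in remaining if covers(rule, r)]
            # Skip rules that cover too few examples or everything that is left.
            if min_cover <= len(covered) < len(remaining):
                score = rule_score(covered, rows, inputs, targets, tau)
                if best is None or score < best[0]:
                    best = (score, rule, covered)
        if best is None:
            break
        _, rule, covered = best
        # The rule predicts the mean of every target over the examples it covers.
        prediction = {t: sum(r[t] for r in covered) / len(covered) for t in targets}
        rules.append((rule, prediction))
        remaining = [r for r in remaining if not covers(rule, r)]
    return rules


# Toy data with one input attribute and two targets (made up for illustration).
data = [
    {"x": 1.0, "y1": 0.0, "y2": 0.0},
    {"x": 2.0, "y1": 0.0, "y2": 0.0},
    {"x": 3.0, "y1": 0.0, "y2": 0.0},
    {"x": 8.0, "y1": 1.0, "y2": 1.0},
    {"x": 9.0, "y1": 1.0, "y2": 1.0},
    {"x": 10.0, "y1": 1.0, "y2": 1.0},
]
rules = learn_rules(data, inputs=["x"], targets=["y1", "y2"], tau=1.0)
for rule, prediction in rules:
    print(rule, "->", prediction)
```

Because this score rewards compactness alone, it tends to favour small rules; a practical heuristic would also account for coverage, which this sketch handles only crudely through `min_cover`. Note also that examples are removed here only during learning, so the resulting rules may still overlap when applied to the full data set.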