Clustering Rules: A Comparison of Partitioning and Hierarchical Clustering Algorithms
Previous research has produced a number of different algorithms for rule discovery. Two approaches discussed here, the ‘all-rules’ algorithm and multi-objective metaheuristics, both produce a large number of partial classification rules, or ‘nuggets’, that describe different subsets of the records in the class of interest. This paper describes the application of several clustering algorithms to these rules in order to identify similar rules and to better understand the data.
Key words: clustering, partial classification, rule induction
Mathematics Subject Classifications (2000): 62H30, 68T05, 68T37
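The abstract does not say how similarity between rules is measured or which clustering algorithms are compared. As a rough illustration only, the sketch below assumes that each rule is represented by the set of record ids it covers, that two rules are compared by the Jaccard distance between those sets, and that a simple alternating k-medoids procedure is used as the partitioning method. The function names `jaccard_distance` and `k_medoids`, the naive initialisation, and the toy data are all hypothetical choices made for this example, not the paper's stated method.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of record ids."""
    union = a | b
    if not union:
        return 0.0  # two empty coverage sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def k_medoids(cover, k, max_iter=100):
    """Cluster rules by alternating assignment and medoid update.

    cover: list of sets, one per rule, holding the covered record ids
           (an assumed representation, chosen for illustration).
    Returns (medoids, labels): medoid rule indices and a cluster label per rule.
    """
    n = len(cover)
    dist = [[jaccard_distance(cover[i], cover[j]) for j in range(n)]
            for i in range(n)]
    medoids = list(range(k))  # naive deterministic start: the first k rules
    labels = [0] * n
    for _ in range(max_iter):
        # Assignment step: each rule joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
                  for i in range(n)]
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break  # converged: medoids did not change
        medoids = new_medoids
    return medoids, labels


# Toy example: six rules, each given by the record ids it covers.
rules = [{1, 2, 3, 4}, {2, 3, 4, 5}, {1, 2, 3},
         {10, 11, 12}, {11, 12, 13}, {10, 11, 12, 13}]
medoids, labels = k_medoids(rules, k=2)
print(medoids, labels)
```

On this toy data the first three rules overlap heavily with one another and not at all with the last three, so a sensible clustering separates the two groups. A hierarchical method could be run on the same distance matrix to allow the kind of comparison the title refers to.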