Conceptual K-Means Algorithm Based on Complex Features
The k-means algorithm is the most studied and most widely used tool for solving the clustering problem when the number of clusters is known a priori. To date, there is only one conceptual version of this algorithm, the conceptual k-means algorithm. One characteristic of this algorithm is its use of generalization lattices, which define relationships among the feature values. However, for many applications it is difficult to determine the best generalization lattices; moreover, there are no automatic methods for building the lattices, so this task must be done by a specialist in the area of the problem to be solved. In addition, this algorithm does not work with missing data. For these reasons, this paper proposes a new conceptual k-means algorithm that does not use generalization lattices to build the concepts and that can work with missing data. We use complex features to generate the concepts. Complex features are subsets of features with associated values that characterize the objects of a cluster and, at the same time, do not characterize objects from other clusters. Some experimental results obtained by our algorithm are shown and compared against the results obtained by the conceptual k-means algorithm.
Keywords: Genetic Algorithm, Cluster Problem, Complex Feature, Cluster Phase, Conceptual Cluster
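The notion of a complex feature described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the representation (a dictionary of feature names to required values), the helper names `matches`, `coverage`, and `characterizes`, the thresholds `min_inside` and `max_outside`, and the rule that a missing value (`None`) never matches are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Assumed representation: an object maps feature names to values,
# with None marking a missing value.
Obj = Dict[str, Optional[str]]
# Assumed representation: a complex feature is a subset of features
# together with the value each one must take.
ComplexFeature = Dict[str, str]


def matches(obj: Obj, cf: ComplexFeature) -> bool:
    """True if obj has every value required by cf.

    Assumption: a missing value (None or absent key) never matches.
    """
    return all(obj.get(name) == value for name, value in cf.items())


def coverage(objects: List[Obj], cf: ComplexFeature) -> float:
    """Fraction of objects that match the complex feature."""
    if not objects:
        return 0.0
    return sum(matches(o, cf) for o in objects) / len(objects)


def characterizes(
    cf: ComplexFeature,
    cluster: List[Obj],
    others: List[Obj],
    min_inside: float = 0.5,   # hypothetical threshold
    max_outside: float = 0.0,  # hypothetical threshold
) -> bool:
    """True if cf is frequent inside the cluster and rare outside it."""
    return (coverage(cluster, cf) >= min_inside
            and coverage(others, cf) <= max_outside)


cluster_a = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": None},      # missing shape
    {"color": "red", "shape": "round"},
]
cluster_b = [
    {"color": "red", "shape": "square"},
    {"color": "blue", "shape": "round"},
]

red_round = {"color": "red", "shape": "round"}
red_only = {"color": "red"}

print(characterizes(red_round, cluster_a, cluster_b))  # True
print(characterizes(red_only, cluster_a, cluster_b))   # False
```

In this toy data, the pair of values (red, round) appears in two of the three objects of the first cluster and in none of the second, so it passes the assumed thresholds. The single value red also appears in the second cluster, so it does not separate the two. An object with a missing value is simply not counted as a match here, which is one of several possible ways to tolerate missing data.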