Similarity and Dissimilarity Measures for Mixed Feature-Type Symbolic Data
This paper presents preliminary results on similarity and dissimilarity measures based on the Cartesian System Model (CSM), a mathematical model for manipulating mixed feature-type symbolic data. We define the notion of concept size for the description of each object in the feature space. Extending this notion to the concept sizes of the Cartesian join and the Cartesian meet of object descriptions yields various similarity and dissimilarity measures. In particular, we present asymmetric and symmetric similarity measures that are useful for pattern recognition problems.
Keywords: Cartesian system model; Symbolic data; Concept size; Pattern recognition
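As a rough illustration of the ideas named in the abstract, the following sketch handles interval-valued features only. It rests on several assumptions that are not taken from the paper: the concept size of one feature is assumed to be the interval length divided by the domain length, the concept size of a description is assumed to be the average over features, and the two ratio-based similarity functions are illustrative stand-ins rather than the measures the paper defines. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


def join(a: Interval, b: Interval) -> Interval:
    """Cartesian join of two intervals: the smallest interval covering both."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def meet(a: Interval, b: Interval) -> Optional[Interval]:
    """Cartesian meet of two intervals: their intersection, or None if empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def feature_size(e: Optional[Interval], domain: Interval) -> float:
    """Assumed per-feature concept size: interval length over domain length."""
    if e is None:
        return 0.0
    return (e[1] - e[0]) / (domain[1] - domain[0])


def concept_size(desc: List[Optional[Interval]], domains: List[Interval]) -> float:
    """Assumed concept size of a description: average of per-feature sizes."""
    return sum(feature_size(e, u) for e, u in zip(desc, domains)) / len(domains)


def symmetric_similarity(a: List[Interval], b: List[Interval],
                         domains: List[Interval]) -> float:
    """Illustrative symmetric measure: size of the meet over size of the join."""
    meets = [meet(x, y) for x, y in zip(a, b)]
    joins = [join(x, y) for x, y in zip(a, b)]
    denom = concept_size(joins, domains)
    return concept_size(meets, domains) / denom if denom > 0 else 0.0


def asymmetric_similarity(a: List[Interval], b: List[Interval],
                          domains: List[Interval]) -> float:
    """Illustrative asymmetric measure: size of the meet over the size of a."""
    meets = [meet(x, y) for x, y in zip(a, b)]
    denom = concept_size(a, domains)
    return concept_size(meets, domains) / denom if denom > 0 else 0.0


# Two objects described by two interval features (made-up values).
domains = [(0.0, 10.0), (0.0, 100.0)]
obj_a = [(2.0, 6.0), (20.0, 60.0)]
obj_b = [(4.0, 8.0), (40.0, 50.0)]

sym = symmetric_similarity(obj_a, obj_b, domains)
asym_ab = asymmetric_similarity(obj_a, obj_b, domains)
asym_ba = asymmetric_similarity(obj_b, obj_a, domains)
print(round(sym, 3), round(asym_ab, 3), round(asym_ba, 3))
```

In this sketch the symmetric measure is unchanged when the two objects are swapped, while the asymmetric one depends on which object supplies the denominator, which is the distinction the abstract draws. Nominal or other feature types would need their own join, meet, and size definitions.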
The authors thank the anonymous referees for their helpful comments. This work was supported by JSPS KAKENHI (Grants-in-Aid for Scientific Research) Grant Number 25330268.