Abstract
This paper presents a methodology of constructing robust classifiers based on a concept called a Hierarchic Similarity Model (HSM). The hierarchic similarity is interpreted as a relation between pairs of complex objects. This relation can be derived from an information system by examining the domain related aspects of similarity. In the paper, global similarity is decomposed into many local similarities by analogy with the process of perceiving similar objects. For the purpose of estimating local relations some well-known rough sets methods are used, as well as context knowledge provided by a domain expert. Then the rules modeling interactions between local similarities are constructed and used to assess the degree of a global similarity of complex objects. The obtained relation can be used to construct classifiers which may successfully compete with other popular methods like boosted decision trees or k-NN algorithm. An implementation of the proposed models in the R script language is provided together with an empirical evaluation of the similarity based classification accuracy for some common datasets. This paper is a continuation of the research started in [1].
The author would like to thank professor Andrzej Skowron for the inspiration and the useful remarks and also Aleksandra Janusz-Ochab and Marcin Szczuka for their support in writing and editing this paper. This research was supported by the grant N N516 368334 from Ministry of Science and Higher Education of the Republic of Poland and by the Innovative Economy Operational Programme 2007-2013 (Priority Axis 1. Research and development of new technologies).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Janusz, A.: A similarity relation in machine learning. Master’s thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics (2007) (in polish)
Gati, I., Tversky, A.: Studies of similarity. In: Rosch, E., Lloyd, B. (eds.) Cognition and Categorization, pp. 81–99. L. Erlbaum Associates, Hillsdale (1978)
Hahn, U., Chater, N.: Understanding similarity: A joint project for psychology, case based reasoning, and law. Artificial Intelligence Review 12, 393–427 (1998)
Goldstone, R., Medin, D., Gentner, D.: Relational similarity and the nonindependence of features in similarity judgments. Cognitive Psychology 23, 222–262 (1991)
Nguyen, S.H.T.: Regularity analysis and its applications in data mining. Ph.D thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics, Part II: Relational Patterns (1999)
Tversky, A.: Features of similarity. Psychological Review 84, 327–352 (1977)
Bazan, J., Nguyen, S.H., Nguyen, H.S., Skowron, A.: Rough set methods in approximation of hierarchical concepts. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC 2004. LNCS (LNAI), vol. 3066, pp. 346–355. Springer, Heidelberg (2004)
Bazan, J., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.J.: Automatic planning of treatment of infants with respiratory failure through rough set modeling. In: Greco, S., Hata, Y., Hirano, S., Inuiguchi, M., Miyamoto, S., Nguyen, H.S., Słowiński, R. (eds.) RSCTC 2006. LNCS (LNAI), vol. 4259, pp. 418–427. Springer, Heidelberg (2006); see also the extended version in Fundamenta Informaticae 85 (2008)
Pawlak, Z.: Information systems, theoretical foundations. Information Systems 3(6), 205–218 (1981)
Skowron, A., Stepaniuk, J.: Ontological framework for approximation. In: Ślęzak, D., Wang, G., Szczuka, M.S., Düntsch, I., Yao, Y. (eds.) RSFDGrC 2005. LNCS (LNAI), vol. 3641, pp. 718–727. Springer, Heidelberg (2005)
Husserl, E.: The Crisis of European Sciences and Transcendental Phenomenology. Northwestern University Press, Evanston (1970); German original written in 1937
Schütz, A.: The Phenomenology of the Social World. Northwestern University Press, Evanston (1967)
Nguyen, S.H., Bazan, J., Skowron, A., Nguyen, H.S.: Layered learning for concept synthesis. In: Peters, J.F., Skowron, A., Grzymała-Busse, J.W., Kostek, B.z., Świniarski, R.W., Szczuka, M.S. (eds.) Transactions on Rough Sets I. LNCS, vol. 3100, pp. 187–208. Springer, Heidelberg (2004)
Basu, S.: Semi-supervised Clustering: Probabilistic Models, Algorithms and Experiments. PhD thesis, The University of Texas at Austin (2005)
Smyth, B., McClave, P.: Similarity vs. diversity. In: Aha, D.W., Watson, I. (eds.) ICCBR 2001. LNCS (LNAI), vol. 2080, pp. 347–361. Springer, Heidelberg (2001)
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc.of the 1993 ACM SIGMOD International Conference on Management of Data SIGMOD 1993, Washington, DC, pp. 207–216 (1993)
Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundamenta Informaticae 48(1), 61–81 (2001)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Janusz, A. (2008). Similarity Relation in Classification Problems. In: Chan, CC., Grzymala-Busse, J.W., Ziarko, W.P. (eds) Rough Sets and Current Trends in Computing. RSCTC 2008. Lecture Notes in Computer Science(), vol 5306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88425-5_22
Download citation
DOI: https://doi.org/10.1007/978-3-540-88425-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88423-1
Online ISBN: 978-3-540-88425-5
eBook Packages: Computer ScienceComputer Science (R0)