Similarity Relation in Classification Problems

Janusz, Andrzej

doi:10.1007/978-3-540-88425-5_22

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5306)

Included in the following conference series:

International Conference on Rough Sets and Current Trends in Computing

Abstract

This paper presents a methodology for constructing robust classifiers based on a concept called the Hierarchic Similarity Model (HSM). Hierarchic similarity is interpreted as a relation between pairs of complex objects. This relation can be derived from an information system by examining the domain-related aspects of similarity. In the paper, global similarity is decomposed into many local similarities, by analogy with the process of perceiving similar objects. Local relations are estimated using well-known rough set methods together with context knowledge provided by a domain expert. Rules modeling the interactions between local similarities are then constructed and used to assess the degree of global similarity of complex objects. The resulting relation can be used to construct classifiers that may successfully compete with other popular methods such as boosted decision trees or the k-NN algorithm. An implementation of the proposed models in the R scripting language is provided, together with an empirical evaluation of similarity-based classification accuracy on some common datasets. This paper is a continuation of the research started in [1].
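The pipeline outlined in the abstract (local similarities, rules that combine them into a degree of global similarity, and a classifier built on the resulting relation) can be illustrated with a small sketch. This is not the author's R implementation, which is not reproduced on this page. It is written in Python, and every name in it (`LOCAL_SIMILARITIES`, `RULES`, `global_similarity`, `classify`), the attributes, the thresholds, and the rule degrees are hypothetical choices made for illustration only. In particular, the local tests are hand-written here, whereas the paper estimates them with rough set methods and expert knowledge.

```python
from collections import Counter

# Hypothetical local similarity tests, one per aspect of an object.
# Each returns True when two objects are alike in that aspect.
# The thresholds are illustrative assumptions, not values from the paper.
LOCAL_SIMILARITIES = {
    "size": lambda a, b: abs(a["size"] - b["size"]) <= 1.0,
    "weight": lambda a, b: abs(a["weight"] - b["weight"]) <= 5.0,
    "color": lambda a, b: a["color"] == b["color"],
}

# Hypothetical rules: (local similarities that must hold, degree of global similarity).
RULES = [
    ({"size", "weight", "color"}, 1.0),
    ({"size", "weight"}, 0.7),
    ({"color"}, 0.3),
]


def global_similarity(a, b):
    """Return the highest degree among rules whose conditions all hold for (a, b)."""
    satisfied = {name for name, test in LOCAL_SIMILARITIES.items() if test(a, b)}
    return max((degree for cond, degree in RULES if cond <= satisfied), default=0.0)


def classify(obj, training, k=3):
    """Label obj by a similarity-weighted vote among its k most similar examples."""
    ranked = sorted(
        training,
        key=lambda pair: global_similarity(obj, pair[0]),
        reverse=True,
    )
    votes = Counter()
    for example, label in ranked[:k]:
        votes[label] += global_similarity(obj, example)
    return votes.most_common(1)[0][0]


# Toy training set of (object, decision class) pairs, invented for this sketch.
TRAINING = [
    ({"size": 1.0, "weight": 10.0, "color": "red"}, "apple"),
    ({"size": 1.2, "weight": 12.0, "color": "green"}, "apple"),
    ({"size": 5.0, "weight": 40.0, "color": "green"}, "melon"),
    ({"size": 5.5, "weight": 43.0, "color": "green"}, "melon"),
]

query = {"size": 1.1, "weight": 11.0, "color": "red"}
print(classify(query, TRAINING))
```

Taking the maximum over matching rules is one simple way to turn rule matches into a degree of similarity; other aggregation schemes would fit the same outline.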

The author would like to thank Professor Andrzej Skowron for the inspiration and useful remarks, and Aleksandra Janusz-Ochab and Marcin Szczuka for their support in writing and editing this paper. This research was supported by grant N N516 368334 from the Ministry of Science and Higher Education of the Republic of Poland and by the Innovative Economy Operational Programme 2007-2013 (Priority Axis 1. Research and development of new technologies).

References

Janusz, A.: A similarity relation in machine learning. Master’s thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics (2007) (in Polish)
Gati, I., Tversky, A.: Studies of similarity. In: Rosch, E., Lloyd, B. (eds.) Cognition and Categorization, pp. 81–99. L. Erlbaum Associates, Hillsdale (1978)
Hahn, U., Chater, N.: Understanding similarity: A joint project for psychology, case based reasoning, and law. Artificial Intelligence Review 12, 393–427 (1998)
Goldstone, R., Medin, D., Gentner, D.: Relational similarity and the nonindependence of features in similarity judgments. Cognitive Psychology 23, 222–262 (1991)
Nguyen, S.H.T.: Regularity analysis and its applications in data mining. PhD thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics, Part II: Relational Patterns (1999)
Tversky, A.: Features of similarity. Psychological Review 84, 327–352 (1977)
Bazan, J., Nguyen, S.H., Nguyen, H.S., Skowron, A.: Rough set methods in approximation of hierarchical concepts. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC 2004. LNCS (LNAI), vol. 3066, pp. 346–355. Springer, Heidelberg (2004)
Bazan, J., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.J.: Automatic planning of treatment of infants with respiratory failure through rough set modeling. In: Greco, S., Hata, Y., Hirano, S., Inuiguchi, M., Miyamoto, S., Nguyen, H.S., Słowiński, R. (eds.) RSCTC 2006. LNCS (LNAI), vol. 4259, pp. 418–427. Springer, Heidelberg (2006); see also the extended version in Fundamenta Informaticae 85 (2008)
Pawlak, Z.: Information systems, theoretical foundations. Information Systems 3(6), 205–218 (1981)
Skowron, A., Stepaniuk, J.: Ontological framework for approximation. In: Ślęzak, D., Wang, G., Szczuka, M.S., Düntsch, I., Yao, Y. (eds.) RSFDGrC 2005. LNCS (LNAI), vol. 3641, pp. 718–727. Springer, Heidelberg (2005)
Husserl, E.: The Crisis of European Sciences and Transcendental Phenomenology. Northwestern University Press, Evanston (1970); German original written in 1937
Schütz, A.: The Phenomenology of the Social World. Northwestern University Press, Evanston (1967)
Nguyen, S.H., Bazan, J., Skowron, A., Nguyen, H.S.: Layered learning for concept synthesis. In: Peters, J.F., Skowron, A., Grzymała-Busse, J.W., Kostek, B., Świniarski, R.W., Szczuka, M.S. (eds.) Transactions on Rough Sets I. LNCS, vol. 3100, pp. 187–208. Springer, Heidelberg (2004)
Basu, S.: Semi-supervised Clustering: Probabilistic Models, Algorithms and Experiments. PhD thesis, The University of Texas at Austin (2005)
Smyth, B., McClave, P.: Similarity vs. diversity. In: Aha, D.W., Watson, I. (eds.) ICCBR 2001. LNCS (LNAI), vol. 2080, pp. 347–361. Springer, Heidelberg (2001)
Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc. of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD 1993, Washington, DC, pp. 207–216 (1993)
Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundamenta Informaticae 48(1), 61–81 (2001)

Author information

Authors and Affiliations

Faculty of Mathematics, Informatics and Mechanics, Warsaw University, ul. Banacha 2, 02-097, Warszawa, Poland
Andrzej Janusz

Editor information

Editors and Affiliations

Department of Computer Science, The University of Akron, OH 44325-4003, Akron, USA
Chien-Chung Chan
Department of Electrical Engineering and Computer Science, University of Kansas, KS 66045, Lawrence, USA
Jerzy W. Grzymala-Busse
Department of Computer Science, University of Regina, S4S 0A2, Regina, Saskatchewan, Canada
Wojciech P. Ziarko

About this paper

Cite this paper

Janusz, A. (2008). Similarity Relation in Classification Problems. In: Chan, C.-C., Grzymala-Busse, J.W., Ziarko, W.P. (eds) Rough Sets and Current Trends in Computing. RSCTC 2008. Lecture Notes in Computer Science, vol 5306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88425-5_22

DOI: https://doi.org/10.1007/978-3-540-88425-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88423-1
Online ISBN: 978-3-540-88425-5
eBook Packages: Computer Science, Computer Science (R0)
