Similarity Relation in Classification Problems

  • Conference paper
Rough Sets and Current Trends in Computing (RSCTC 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5306)

Abstract

This paper presents a methodology for constructing robust classifiers based on a concept called a Hierarchic Similarity Model (HSM). Hierarchic similarity is interpreted as a relation between pairs of complex objects. This relation can be derived from an information system by examining the domain-related aspects of similarity. In the paper, global similarity is decomposed into many local similarities, by analogy with the process of perceiving similar objects. The local relations are estimated using some well-known rough set methods together with context knowledge provided by a domain expert. Rules modeling the interactions between local similarities are then constructed and used to assess the degree of global similarity between complex objects. The resulting relation can be used to construct classifiers that may compete successfully with other popular methods such as boosted decision trees or the k-NN algorithm. An implementation of the proposed models in the R scripting language is provided, together with an empirical evaluation of similarity-based classification accuracy on some common datasets. This paper continues the research started in [1].
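The pipeline outlined in the abstract (local similarities, rules that combine them, a similarity-based classifier) can be illustrated with a minimal Python sketch. This is not the author's HSM or the R implementation mentioned above. The toy data, the attribute names, the fixed tolerances, and the two hand-written rules are all assumptions made for illustration; in the paper the local relations are estimated with rough set methods and expert knowledge rather than fixed thresholds.

```python
from collections import Counter

# Hypothetical toy training set: (attribute dict, class label).
TRAIN = [
    ({"height": 1.0, "weight": 1.2, "colour": "red"}, "A"),
    ({"height": 1.1, "weight": 1.0, "colour": "red"}, "A"),
    ({"height": 3.0, "weight": 3.1, "colour": "blue"}, "B"),
    ({"height": 2.9, "weight": 3.3, "colour": "blue"}, "B"),
]


def local_numeric(a, b, tolerance):
    """Local similarity on one numeric attribute: values within a tolerance."""
    return abs(a - b) <= tolerance


def local_similarities(x, y):
    """Evaluate each (assumed) local similarity relation for a pair of objects."""
    return {
        "size": local_numeric(x["height"], y["height"], 0.5)
        and local_numeric(x["weight"], y["weight"], 0.5),
        "appearance": x["colour"] == y["colour"],
    }


# Assumed rules: each pairs a condition on local outcomes with a degree.
# They are checked in order and the first matching rule determines the degree.
RULES = [
    (lambda s: s["size"] and s["appearance"], 1.0),
    (lambda s: s["size"] or s["appearance"], 0.5),
]


def global_similarity(x, y):
    """Degree of global similarity derived from the local similarities."""
    local = local_similarities(x, y)
    for condition, degree in RULES:
        if condition(local):
            return degree
    return 0.0


def classify(x, train=TRAIN):
    """Pick the class whose training objects are most similar to x in total."""
    votes = Counter()
    for obj, label in train:
        votes[label] += global_similarity(x, obj)
    return votes.most_common(1)[0][0]
```

For instance, `classify({"height": 1.05, "weight": 1.1, "colour": "red"})` assigns the query to the class whose members it resembles on both local aspects. Keeping the rules separate from the local relations mirrors the decomposition described in the abstract: either layer can be replaced (for example, by learned relations or learned rules) without changing the classifier.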

The author would like to thank Professor Andrzej Skowron for the inspiration and useful remarks, and Aleksandra Janusz-Ochab and Marcin Szczuka for their support in writing and editing this paper. This research was supported by grant N N516 368334 from the Ministry of Science and Higher Education of the Republic of Poland and by the Innovative Economy Operational Programme 2007-2013 (Priority Axis 1. Research and development of new technologies).

References

  1. Janusz, A.: A similarity relation in machine learning. Master’s thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics (2007) (in Polish)

  2. Gati, I., Tversky, A.: Studies of similarity. In: Rosch, E., Lloyd, B. (eds.) Cognition and Categorization, pp. 81–99. L. Erlbaum Associates, Hillsdale (1978)

  3. Hahn, U., Chater, N.: Understanding similarity: A joint project for psychology, case based reasoning, and law. Artificial Intelligence Review 12, 393–427 (1998)

  4. Goldstone, R., Medin, D., Gentner, D.: Relational similarity and the nonindependence of features in similarity judgments. Cognitive Psychology 23, 222–262 (1991)

  5. Nguyen, S.H.T.: Regularity analysis and its applications in data mining. PhD thesis, Warsaw University, Faculty of Mathematics, Informatics and Mechanics, Part II: Relational Patterns (1999)

  6. Tversky, A.: Features of similarity. Psychological Review 84, 327–352 (1977)

  7. Bazan, J., Nguyen, S.H., Nguyen, H.S., Skowron, A.: Rough set methods in approximation of hierarchical concepts. In: Tsumoto, S., Słowiński, R., Komorowski, J., Grzymała-Busse, J.W. (eds.) RSCTC 2004. LNCS (LNAI), vol. 3066, pp. 346–355. Springer, Heidelberg (2004)

  8. Bazan, J., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.J.: Automatic planning of treatment of infants with respiratory failure through rough set modeling. In: Greco, S., Hata, Y., Hirano, S., Inuiguchi, M., Miyamoto, S., Nguyen, H.S., Słowiński, R. (eds.) RSCTC 2006. LNCS (LNAI), vol. 4259, pp. 418–427. Springer, Heidelberg (2006); see also the extended version in Fundamenta Informaticae 85 (2008)

  9. Pawlak, Z.: Information systems, theoretical foundations. Information Systems 3(6), 205–218 (1981)

  10. Skowron, A., Stepaniuk, J.: Ontological framework for approximation. In: Ślęzak, D., Wang, G., Szczuka, M.S., Düntsch, I., Yao, Y. (eds.) RSFDGrC 2005. LNCS (LNAI), vol. 3641, pp. 718–727. Springer, Heidelberg (2005)

  11. Husserl, E.: The Crisis of European Sciences and Transcendental Phenomenology. Northwestern University Press, Evanston (1970); German original written in 1937

  12. Schütz, A.: The Phenomenology of the Social World. Northwestern University Press, Evanston (1967)

  13. Nguyen, S.H., Bazan, J., Skowron, A., Nguyen, H.S.: Layered learning for concept synthesis. In: Peters, J.F., Skowron, A., Grzymała-Busse, J.W., Kostek, B., Świniarski, R.W., Szczuka, M.S. (eds.) Transactions on Rough Sets I. LNCS, vol. 3100, pp. 187–208. Springer, Heidelberg (2004)

  14. Basu, S.: Semi-supervised Clustering: Probabilistic Models, Algorithms and Experiments. PhD thesis, The University of Texas at Austin (2005)

  15. Smyth, B., McClave, P.: Similarity vs. diversity. In: Aha, D.W., Watson, I. (eds.) ICCBR 2001. LNCS (LNAI), vol. 2080, pp. 347–361. Springer, Heidelberg (2001)

  16. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proc. of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD 1993, Washington, DC, pp. 207–216 (1993)

  17. Nguyen, H.S.: On efficient handling of continuous attributes in large data bases. Fundamenta Informaticae 48(1), 61–81 (2001)

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Janusz, A. (2008). Similarity Relation in Classification Problems. In: Chan, CC., Grzymala-Busse, J.W., Ziarko, W.P. (eds) Rough Sets and Current Trends in Computing. RSCTC 2008. Lecture Notes in Computer Science, vol 5306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88425-5_22

  • DOI: https://doi.org/10.1007/978-3-540-88425-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88423-1

  • Online ISBN: 978-3-540-88425-5

  • eBook Packages: Computer Science, Computer Science (R0)
