A Correlation-Based Distance Function for Nearest Neighbor Classification

  • Yanet Rodriguez
  • Bernard De Baets
  • Maria M. Garcia
  • Carlos Morell
  • Ricardo Grau
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5197)

Abstract

The Nearest Neighbor rule is a well-known classification method, widely studied in the pattern recognition community for both its simplicity and its performance. The choice of distance function is central to obtaining good accuracy on a given data set, and different distance functions have been proposed to improve performance. This paper proposes a new distance function based on the correlation of fuzzy sets, called the Fuzzy Correlation-based Difference Metric. The proposed distance function is a generalization of the Value Difference Metric and applies to both nominal and continuous attributes in a uniform way. Fuzzy sets are used to represent numeric attributes. A uninorm operator is used to aggregate local differences. Experimental results using a standard \(\mathit{k}\)-NN algorithm show a significant improvement in comparison to previously proposed distance functions.
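For orientation, the Value Difference Metric that the proposed function generalizes can be sketched for a single nominal attribute as follows. This is a minimal illustration of the VDM as it is commonly defined in the literature (the sum over classes of the absolute difference between class-conditional probabilities, raised to a power q), not the Fuzzy Correlation-based Difference Metric itself. The function names `fit_vdm` and `vdm`, the default `q=2`, and the toy data are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | attribute value) for one nominal attribute column.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    classes = sorted(set(labels))
    probs = {
        value: {c: cnt[c] / sum(cnt.values()) for c in classes}
        for value, cnt in counts.items()
    }
    return probs, classes


def vdm(probs, classes, x, y, q=2):
    """Local VDM difference between two values x and y of the same attribute."""
    return sum(abs(probs[x][c] - probs[y][c]) ** q for c in classes)


# Toy data (assumed): one nominal attribute and a binary class label.
values = ["red", "red", "blue", "blue", "green"]
labels = ["a", "a", "b", "b", "a"]

probs, classes = fit_vdm(values, labels)
print(vdm(probs, classes, "red", "blue"))   # values with opposite class profiles
print(vdm(probs, classes, "red", "green"))  # values with identical class profiles
```

Two attribute values are close under this measure when they induce similar class distributions, regardless of how the values are named or ordered. The sketch covers only the local difference for a nominal attribute; how the paper represents numeric attributes with fuzzy sets and aggregates local differences with a uninorm is not reproduced here.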

Keywords

Nearest Neighbour Classification, Distance functions, Value Difference Metric, Fuzzy Sets Theory


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yanet Rodriguez (1)
  • Bernard De Baets (2)
  • Maria M. Garcia (1)
  • Carlos Morell (1)
  • Ricardo Grau (1)

  1. Universidad Central de Las Villas, Santa Clara, Cuba
  2. Ghent University, Gent, Belgium
