Fast k Most Similar Neighbor Classifier for Mixed Data Based on a Tree Structure

  • Selene Hernández-Rodríguez
  • J. Francisco Martínez-Trinidad
  • J. Ariel Carrasco-Ochoa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4756)


In this work, a fast k most similar neighbor (k-MSN) classifier for mixed data is presented. The k nearest neighbor (k-NN) classifier has been a widely used nonparametric technique in Pattern Recognition. Many fast k-NN classifiers have been developed for numerical object descriptions, most of them relying on metric properties to avoid object comparisons. However, in some sciences, such as Medicine, Geology, and Sociology, objects are usually described by both numerical and non-numerical features (mixed data). In this case, we cannot assume that the comparison function satisfies metric properties. Therefore, our classifier is based on search algorithms suitable for mixed data and non-metric comparison functions. Experiments and a comparison against two other fast k-NN methods, using standard databases, are presented.
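To make the setting concrete, the following is a minimal sketch of an exhaustive k-MSN classifier over mixed data. It is not the tree-based fast search described in this work; it only shows the baseline that such a method would accelerate. The names `similarity` and `k_msn_classify`, the `numeric_ranges` parameter, and the particular comparison function (range-normalized difference for numerical features, exact match for non-numerical ones) are assumptions chosen for illustration.

```python
from collections import Counter


def similarity(a, b, numeric_ranges):
    """Illustrative mixed-data similarity in [0, 1] (an assumption, not the paper's function).

    numeric_ranges maps a feature index to the range (max - min) of that
    numerical feature; any index not in the map is treated as non-numerical.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            # Range-normalized difference, clipped so the score stays in [0, 1].
            diff = min(1.0, abs(x - y) / span) if span else 0.0
            total += 1.0 - diff
        else:
            # Non-numerical feature: 1 on exact match, 0 otherwise.
            total += 1.0 if x == y else 0.0
    return total / len(a)


def k_msn_classify(query, training, labels, k, numeric_ranges):
    """Exhaustive k most similar neighbor vote: compares the query with every object."""
    scored = sorted(
        zip(training, labels),
        key=lambda pair: similarity(query, pair[0], numeric_ranges),
        reverse=True,  # most similar first
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: one numerical feature (index 0) and one non-numerical feature.
training = [(1.0, "red"), (1.2, "red"), (5.0, "blue"), (5.5, "blue")]
labels = ["A", "A", "B", "B"]
numeric_ranges = {0: 4.5}

predicted = k_msn_classify((1.1, "red"), training, labels, 3, numeric_ranges)
```

Because this sketch compares the query with every training object, its cost grows linearly with the training set. Fast methods aim to skip most of those comparisons, and with a comparison function like the one above they cannot rely on metric properties such as the triangle inequality to do so.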


Nearest Neighbors Rule; Fast k-Most Similar Neighbors Search; Mixed Data



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Selene Hernández-Rodríguez (1)
  • J. Francisco Martínez-Trinidad (1)
  • J. Ariel Carrasco-Ochoa (1)
  1. Computer Science Department, National Institute of Astrophysics, Optics and Electronics, Luis Enrique Erro No. 1, Sta. María Tonantzintla, Puebla, CP: 72840, México
