Fast Most Similar Neighbor Classifier for Mixed Data

  • Conference paper in: Advances in Artificial Intelligence (Canadian AI 2007)

Abstract

The nearest neighbor (NN) classifier has been widely used in pattern recognition because of its simplicity and good behavior. To decide the class of a new object, the NN classifier performs an exhaustive comparison between the object to classify and the training set T. However, when T is large, this exhaustive comparison becomes very expensive and is sometimes inapplicable. To avoid this problem, many fast NN algorithms have been developed for numerical object descriptions, most of them relying on metric properties to avoid comparisons. However, in sciences such as Medicine, Geology, and Sociology, objects are usually described by both numerical and non-numerical attributes (mixed data). In this case, we cannot assume that the comparison function satisfies metric properties. Therefore, in this paper a fast most similar neighbor classifier based on search methods suitable for mixed data is presented. Experiments using standard databases and a comparison against two other fast NN methods are also presented.
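The abstract contrasts two ingredients: the exhaustive baseline, which compares the query against every object in T, and the mixed-data setting, where the comparison function is a similarity that need not satisfy metric properties. The Python sketch below illustrates only that exhaustive baseline, not the paper's fast algorithm: a brute-force most-similar-neighbor classifier using a simple heterogeneous (HEOM-style) similarity in which numerical attributes are compared by a range-normalized difference and categorical attributes by equality. The function names, toy data, and the particular similarity formula are assumptions made for this example.

# Minimal sketch (not the authors' algorithm): exhaustive most-similar-neighbor
# classification over mixed numerical/categorical data. All names and the toy
# data are illustrative assumptions.
from dataclasses import dataclass
from typing import Sequence, Union

Value = Union[float, str]  # each attribute is either numeric or categorical


@dataclass
class Example:
    features: Sequence[Value]
    label: str


def heom_similarity(x: Sequence[Value], y: Sequence[Value],
                    ranges: Sequence[float]) -> float:
    """Similarity in [0, 1]: numeric attributes use a range-normalized
    difference, categorical attributes use the overlap (equal / not equal)
    measure. Note this is a similarity function, not necessarily a metric."""
    total = 0.0
    for xi, yi, r in zip(x, y, ranges):
        if isinstance(xi, str) or isinstance(yi, str):   # categorical attribute
            total += 1.0 if xi == yi else 0.0
        else:                                             # numerical attribute
            diff = abs(xi - yi) / r if r > 0 else 0.0
            total += 1.0 - min(diff, 1.0)
    return total / len(x)


def classify_msn(query: Sequence[Value], training: Sequence[Example],
                 ranges: Sequence[float]) -> str:
    """Exhaustive search: compare the query against every training object and
    return the class of the most similar one. This is the O(|T|) baseline that
    fast NN/MSN methods are designed to avoid."""
    best = max(training, key=lambda ex: heom_similarity(query, ex.features, ranges))
    return best.label


if __name__ == "__main__":
    # Toy mixed-data training set: (age, blood type) -> diagnosis, purely illustrative.
    T = [
        Example((25.0, "A"), "healthy"),
        Example((62.0, "O"), "sick"),
        Example((48.0, "A"), "sick"),
    ]
    ranges = [62.0 - 25.0, 0.0]                 # per-attribute ranges for numeric features
    print(classify_msn((30.0, "A"), T, ranges))  # -> "healthy"

Because the similarity above is not assumed to be a metric, pruning rules based on the triangle inequality (as in many fast NN tree searches) do not directly apply, which is the gap the paper's mixed-data search methods address.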



Author information

S. Hernández-Rodríguez, J.F. Martínez-Trinidad, J.A. Carrasco-Ochoa

Editor information

Ziad Kobti, Dan Wu


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hernández-Rodríguez, S., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. (2007). Fast Most Similar Neighbor Classifier for Mixed Data. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science, vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_13

  • DOI: https://doi.org/10.1007/978-3-540-72665-4_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72664-7

  • Online ISBN: 978-3-540-72665-4

  • eBook Packages: Computer Science, Computer Science (R0)
