Fast Most Similar Neighbor Classifier for Mixed Data

  • Conference paper in: Advances in Artificial Intelligence (Canadian AI 2007)

Abstract

The nearest neighbor (NN) classifier has been widely used in pattern recognition because of its simplicity and good behavior. To decide the class of a new object, the NN classifier performs an exhaustive comparison between the object to classify and the training set T. However, when T is large, this exhaustive comparison becomes very expensive and is sometimes inapplicable. To avoid this problem, many fast NN algorithms have been developed for numerical object descriptions, most of them relying on metric properties to avoid comparisons. However, in sciences such as Medicine, Geology, and Sociology, objects are usually described by both numerical and non-numerical attributes (mixed data). In this case, we cannot assume that the comparison function satisfies metric properties. Therefore, in this paper a fast most similar neighbor classifier based on search methods suitable for mixed data is presented. Experiments using standard databases and a comparison against two other fast NN methods are also presented.
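The abstract contrasts two ingredients: the exhaustive baseline, which compares the query against every object in T, and the mixed-data setting, where the comparison function is a similarity that need not satisfy metric properties. The Python sketch below illustrates only that exhaustive baseline, not the paper's fast algorithm: a brute-force most-similar-neighbor classifier using a simple heterogeneous (HEOM-style) similarity in which numerical attributes are compared by a range-normalized difference and categorical attributes by equality. The function names, toy data, and the particular similarity formula are assumptions made for this example.

# Minimal sketch (not the authors' algorithm): exhaustive most-similar-neighbor
# classification over mixed numerical/categorical data. All names and the toy
# data are illustrative assumptions.
from dataclasses import dataclass
from typing import Sequence, Union

Value = Union[float, str]  # each attribute is either numeric or categorical


@dataclass
class Example:
    features: Sequence[Value]
    label: str


def heom_similarity(x: Sequence[Value], y: Sequence[Value],
                    ranges: Sequence[float]) -> float:
    """Similarity in [0, 1]: numeric attributes use a range-normalized
    difference, categorical attributes use the overlap (equal / not equal)
    measure. Note this is a similarity function, not necessarily a metric."""
    total = 0.0
    for xi, yi, r in zip(x, y, ranges):
        if isinstance(xi, str) or isinstance(yi, str):   # categorical attribute
            total += 1.0 if xi == yi else 0.0
        else:                                             # numerical attribute
            diff = abs(xi - yi) / r if r > 0 else 0.0
            total += 1.0 - min(diff, 1.0)
    return total / len(x)


def classify_msn(query: Sequence[Value], training: Sequence[Example],
                 ranges: Sequence[float]) -> str:
    """Exhaustive search: compare the query against every training object and
    return the class of the most similar one. This is the O(|T|) baseline that
    fast NN/MSN methods are designed to avoid."""
    best = max(training, key=lambda ex: heom_similarity(query, ex.features, ranges))
    return best.label


if __name__ == "__main__":
    # Toy mixed-data training set: (age, blood type) -> diagnosis, purely illustrative.
    T = [
        Example((25.0, "A"), "healthy"),
        Example((62.0, "O"), "sick"),
        Example((48.0, "A"), "sick"),
    ]
    ranges = [62.0 - 25.0, 0.0]                 # per-attribute ranges for numeric features
    print(classify_msn((30.0, "A"), T, ranges))  # -> "healthy"

Because the similarity above is not assumed to be a metric, pruning rules based on the triangle inequality (as in many fast NN tree searches) do not directly apply, which is the gap the paper's mixed-data search methods address.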



Author information

S. Hernández-Rodríguez, J.F. Martínez-Trinidad, J.A. Carrasco-Ochoa

Editor information

Ziad Kobti, Dan Wu


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hernández-Rodríguez, S., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. (2007). Fast Most Similar Neighbor Classifier for Mixed Data. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science, vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_13

  • DOI: https://doi.org/10.1007/978-3-540-72665-4_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72664-7

  • Online ISBN: 978-3-540-72665-4

  • eBook Packages: Computer Science, Computer Science (R0)
