Abstract
The nearest neighbor (NN) classifier is widely used in pattern recognition because of its simplicity and good behavior. To decide the class of a new object, the NN classifier exhaustively compares that object with every object in the training set T. When T is large, this exhaustive comparison is very expensive and can become impractical. To avoid this problem, many fast NN algorithms have been developed for numerical object descriptions, most of which rely on metric properties to avoid comparisons. However, in sciences such as Medicine, Geology, and Sociology, objects are usually described by both numerical and non-numerical attributes (mixed data). In this case, we cannot assume that the comparison function satisfies metric properties. Therefore, this paper presents a fast most similar object classifier based on search methods suitable for mixed data. Experiments on standard databases and a comparison with two other fast NN methods are also presented.
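The exhaustive comparison described in the abstract can be sketched as follows. This is a minimal illustration of the baseline only, not the fast classifier the paper proposes. The comparison function, the attribute layout, and the names `dissimilarity`, `classify_exhaustive`, `T`, and `ranges` are assumptions made for the example: numerical attributes are compared by range-normalised absolute difference and non-numerical attributes by a simple 0/1 mismatch.

```python
from typing import List, Sequence, Tuple, Union

Value = Union[float, str]


def dissimilarity(a: Sequence[Value], b: Sequence[Value],
                  ranges: Sequence[float]) -> float:
    """Hypothetical mixed-data comparison function (an assumption,
    not the function used in the paper)."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if isinstance(x, str) or isinstance(y, str):
            # Non-numerical attribute: 0 if equal, 1 otherwise.
            total += 0.0 if x == y else 1.0
        else:
            # Numerical attribute: absolute difference scaled by its range.
            total += abs(x - y) / r if r > 0 else 0.0
    return total


def classify_exhaustive(query: Sequence[Value],
                        training: List[Tuple[Sequence[Value], str]],
                        ranges: Sequence[float]) -> str:
    """Return the label of the most similar training object by
    comparing the query against every object in the training set."""
    best_label, best_d = None, float("inf")
    for obj, label in training:
        d = dissimilarity(query, obj, ranges)
        if d < best_d:
            best_label, best_d = label, d
    return best_label


# Toy training set T: (age, sex, smoking status) -> risk label.
T = [
    ((37.0, "female", "smoker"), "high"),
    ((25.0, "male", "non-smoker"), "low"),
    ((61.0, "male", "smoker"), "high"),
]
# Range of each numerical attribute; unused for non-numerical ones.
ranges = (36.0, 0.0, 0.0)

label = classify_exhaustive((30.0, "male", "non-smoker"), T, ranges)
```

Every query costs one comparison per training object, which is the expense that fast methods aim to reduce. A comparison function like the one above need not satisfy the triangle inequality in general, which is why pruning rules based on metric properties cannot be assumed to apply to mixed data.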
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Hernández-Rodríguez, S., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. (2007). Fast Most Similar Neighbor Classifier for Mixed Data. In: Kobti, Z., Wu, D. (eds) Advances in Artificial Intelligence. Canadian AI 2007. Lecture Notes in Computer Science, vol 4509. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72665-4_13
DOI: https://doi.org/10.1007/978-3-540-72665-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72664-7
Online ISBN: 978-3-540-72665-4
eBook Packages: Computer Science, Computer Science (R0)