A Fast Hybrid k-NN Classifier Based on Homogeneous Clusters

  • Stefanos Ougiaroglou
  • Georgios Evangelidis
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 381)


This paper proposes a hybrid method for fast and accurate Nearest Neighbor Classification. The method consists of a non-parametric cluster-based algorithm that produces a two-level speed-up data structure and a hybrid algorithm that accesses this structure to perform the classification. The proposed method was evaluated using eight real-life datasets and compared to four known speed-up methods. Experimental results show that the proposed method is fast and accurate, and, in addition, has low pre-processing computational cost.
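The abstract does not spell out how the two-level structure is built or how it is searched. As a rough illustration only, the sketch below assumes that clusters are split recursively with k-means until each holds a single class, that the first level stores cluster centroids and the second level stores cluster members, and that a query falls back to k-NN only when its nearest centroids disagree. All names and parameters (`build_structure`, `classify`, `k`, `m`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, seeds, iters=20):
    """Plain Lloyd iterations; returns index groups (some may be empty)."""
    centers = list(seeds)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for i, p in enumerate(points):
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[j].append(i)
        new = [mean([points[i] for i in g]) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return groups


def build_structure(points, labels):
    """Split recursively until every cluster holds a single class.

    Returns a list of (centroid, class_label, members) tuples, where
    members is a list of (point, label) pairs.
    """
    classes = sorted(set(labels))
    if len(classes) == 1:
        return [(mean(points), classes[0], list(zip(points, labels)))]
    # Assumption: seed k-means with one class mean per class present.
    seeds = [mean([p for p, l in zip(points, labels) if l == c])
             for c in classes]
    groups = [g for g in kmeans(points, seeds) if g]
    if len(groups) == 1:
        # Degenerate split (e.g. duplicate points with different labels):
        # group by class instead so the recursion terminates.
        groups = [[i for i, l in enumerate(labels) if l == c]
                  for c in classes]
    clusters = []
    for g in groups:
        clusters += build_structure([points[i] for i in g],
                                    [labels[i] for i in g])
    return clusters


def classify(query, clusters, k=3, m=3):
    """Level 1: inspect the m nearest centroids; level 2: k-NN inside them."""
    nearest = sorted(clusters, key=lambda c: dist(query, c[0]))[:m]
    votes = {c[1] for c in nearest}
    if len(votes) == 1:
        return votes.pop()  # nearby clusters agree, so level 2 is skipped
    pool = [item for c in nearest for item in c[2]]
    pool.sort(key=lambda item: dist(query, item[0]))
    return Counter(l for _, l in pool[:k]).most_common(1)[0][0]


# Toy data: two well-separated classes in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a"] * 4 + ["b"] * 4
clusters = build_structure(points, labels)
prediction = classify((0.2, 0.3), clusters, k=3, m=2)
```

In this sketch the pre-processing cost is dominated by the repeated k-means runs, and the classification cost depends on how often the centroid-level shortcut applies; the paper's actual trade-offs may differ.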


Keywords: nearest neighbors, classification, clustering



Copyright information

© IFIP International Federation for Information Processing 2012

Authors and Affiliations

  • Stefanos Ougiaroglou (1)
  • Georgios Evangelidis (1)
  1. Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece
