Intelligent Data Engineering and Automated Learning - IDEAL 2012

Volume 7435 of the series Lecture Notes in Computer Science pp 586-593

Parallel k-Most Similar Neighbor Classifier for Mixed Data

  • Guillermo Sanchez-Diaz, Universidad Autonoma de San Luis Potosi
  • Anilu Franco-Arcega, Universidad Autonoma del Estado de Hidalgo
  • Carlos Aguirre-Salado, Universidad Autonoma de San Luis Potosi
  • Ivan Piza-Davila, Instituto Tecnologico y de Estudios Superiores de Occidente
  • Luis R. Morales-Manilla, Universidad Politecnica de Tulancingo
  • Uriel Escobar-Franco, Universidad Politecnica de Tulancingo

Abstract

This paper presents a parallelization of the incremental algorithm inc-k-msn for mixed data and for similarity functions that do not satisfy metric properties. The algorithm is suitable for processing large data sets because it traverses the training data set only once and keeps in main memory only the k most similar neighbors processed at step t. Several experiments with synthetic and real data are presented.
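
The general idea can be illustrated with a minimal Python sketch. It is not the authors' inc-k-msn implementation: the `similarity` function, the chunking scheme, and the names `top_k_in_chunk` and `classify` are assumptions made for illustration only. Each worker makes a single pass over its partition of the training data while holding at most k candidates in a heap, and the partial results are then merged. Threads stand in here for whatever parallel platform is actually used.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def similarity(a, b, tol=1.0):
    """Hypothetical mixed-data similarity: fraction of attributes that agree.

    Numeric attributes agree when they differ by at most `tol`; all other
    attributes agree when they are equal. This is an assumption of the
    sketch and is not required to satisfy metric properties.
    """
    hits = 0
    for x, y in zip(a, b):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            hits += abs(x - y) <= tol
        else:
            hits += x == y
    return hits / len(a)


def top_k_in_chunk(query, chunk, k):
    """Single pass over one chunk, keeping at most k candidates in memory."""
    heap = []  # min-heap of (similarity, index, label); root is the weakest kept
    for idx, features, label in chunk:
        item = (similarity(query, features), idx, label)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return heap


def classify(query, training, k=3, workers=2):
    """Majority vote among the k most similar training objects."""
    # For simplicity the partitions are built in memory; a real system
    # would stream each partition from disk instead.
    rows = [(i, f, lab) for i, (f, lab) in enumerate(training)]
    chunks = [rows[w::workers] for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda c: top_k_in_chunk(query, c, k), chunks)
        # Merge the per-chunk candidates into the global k most similar.
        merged = heapq.nlargest(k, (item for heap in partial for item in heap))
    votes = Counter(label for _, _, label in merged)
    return votes.most_common(1)[0][0]
```

Calling `classify(query, training, k=3)` with `training` given as a list of `(features, label)` pairs returns the majority label among the three most similar objects. Because only comparisons of similarity values are used, the sketch makes no use of the triangle inequality or any other metric property.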

Keywords

K-most similar neighbor, K-nearest neighbor, classification, parallel algorithms