The Journal of Supercomputing, Volume 73, Issue 10, pp 4611–4634

GPU-based exhaustive algorithms processing kNN queries

  • Ricardo J. Barrientos
  • Fabricio Millaguir
  • José L. Sánchez
  • Enrique Arias


Efficient k-nearest neighbors (kNN) search is useful in multimedia information retrieval, data mining and pattern recognition, among other fields. A distance function determines how similar the objects are to a given kNN query object. Because computing the distance between a pair of objects (i.e., high-dimensional vectors) is known to be computationally expensive, parallel computation is an effective way of reducing running times to acceptable values in large databases. In the present work, we offer novel GPU approaches to solving kNN queries using exhaustive algorithms based on Selection Sort, Quicksort and state-of-the-art algorithms. We show that the best approach depends on the k value of the kNN query, and we achieve a speedup of up to 86.4× over the sequential counterpart. We also propose a multi-core algorithm to be used as a reference for the experiments, and a hybrid algorithm that combines the proposed algorithms with a state-of-the-art heap-based method and obtains the best performance with high k values. Finally, we extend our algorithms to deal with large databases that do not fit in GPU memory, with performance that does not deteriorate as database size increases.
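To make the idea of an exhaustive, Selection Sort-based kNN query concrete, the following is a minimal sequential sketch in Python. It is not the GPU implementation evaluated in the paper. The Euclidean distance, the function names (`euclidean`, `knn_selection`, `knn_batched`) and the `batch_size` parameter are all assumptions introduced here for illustration. The first function computes every distance and then runs only k passes of Selection Sort. The second suggests one possible way to handle a database processed in chunks, keeping only the running k best candidates between chunks.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Distance function (assumed here; the abstract does not fix one)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_selection(query: Vector, database: Sequence[Vector], k: int,
                  offset: int = 0) -> List[Tuple[float, int]]:
    """Exhaustive kNN: compute all distances, then do k Selection Sort passes.

    Returns (distance, index) pairs for the k closest objects, nearest first.
    `offset` shifts the reported indices, which is useful for chunked input.
    """
    dists = [(euclidean(query, obj), offset + i)
             for i, obj in enumerate(database)]
    k = min(k, len(dists))
    for i in range(k):
        # Find the smallest remaining element and swap it into position i.
        best = i
        for j in range(i + 1, len(dists)):
            if dists[j] < dists[best]:
                best = j
        dists[i], dists[best] = dists[best], dists[i]
    return dists[:k]


def knn_batched(query: Vector, database: Sequence[Vector], k: int,
                batch_size: int) -> List[Tuple[float, int]]:
    """Process the database chunk by chunk, keeping only the running k best."""
    best: List[Tuple[float, int]] = []
    for start in range(0, len(database), batch_size):
        chunk = database[start:start + batch_size]
        candidates = best + knn_selection(query, chunk, k, offset=start)
        candidates.sort()
        best = candidates[:k]
    return best


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0), (3.0, 4.0)]
    print(knn_selection((0.0, 0.0), points, k=3))
    print(knn_batched((0.0, 0.0), points, k=3, batch_size=2))
```

The k selection passes cost on the order of k·n comparisons for n objects, which is consistent with the abstract's observation that the best approach depends on the value of k: for small k this is cheaper than fully sorting the distances, while for large k other strategies become more attractive.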


Keywords: kNN, Quicksort, Selection Sort, GPU, Exhaustive algorithms



This research was supported by the Project of the Universidad Católica del Maule (Chile) “Plan de Desarrollo Anual Facultad de Ingeniería. Convenio de Desempeño” and by the European Commission (FEDER) and Junta de Comunidades de Castilla-La Mancha under the project PEII-2014-028-P. Powered@NLHPC: This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Computer Science, Universidad Católica del Maule, Talca, Chile
  2. Akzio Consulting, Santiago, Chile
  3. Computing Systems Department, Universidad de Castilla-La Mancha, Albacete, Spain
