GPU-based exhaustive algorithms processing kNN queries


Efficient k-nearest neighbors (kNN) search is useful in multimedia information retrieval, data mining and pattern recognition, among other fields. A distance function determines how similar each database object is to a given kNN query object. Because computing the distance between a pair of objects (i.e., high-dimensional vectors) is known to be computationally expensive, parallel computation is an effective way of reducing running times to acceptable values on large databases. In the present work, we offer novel GPU approaches to solving kNN queries using exhaustive algorithms based on Selection Sort, Quicksort and state-of-the-art algorithms. We show that the best approach depends on the k value of the kNN query, and we achieve a speedup of up to 86.4\(\times \) over the sequential counterpart. We also propose a multi-core algorithm to be used as a reference for the experiments, and a hybrid algorithm that combines the proposed algorithms with a state-of-the-art heap-based method and obtains its best performance with high k values. Finally, we extend our algorithms to deal with large databases that do not fit in GPU memory, with performance that does not deteriorate as database size increases.
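To make the idea of an exhaustive kNN query concrete, the following is a minimal sequential sketch in Python. It is not the GPU implementation described in the article: the function names (`euclidean`, `knn_selection`, `knn_quickselect`) and the choice of Euclidean distance are assumptions made for illustration only. Both variants first compute the distance from the query to every database object, then extract the k smallest, one with k passes of Selection Sort and one with Quicksort-style partitioning.

```python
import math
import random


def euclidean(a, b):
    """Distance function between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_selection(db, query, k, dist=euclidean):
    """Exhaustive kNN via k passes of Selection Sort over all distances."""
    # Pair each distance with the object's index so ties are broken by index.
    pairs = [(dist(obj, query), i) for i, obj in enumerate(db)]
    k = min(k, len(pairs))
    for pos in range(k):
        # Find the smallest remaining pair and swap it into position `pos`.
        best = min(range(pos, len(pairs)), key=pairs.__getitem__)
        pairs[pos], pairs[best] = pairs[best], pairs[pos]
    return pairs[:k]


def knn_quickselect(db, query, k, dist=euclidean, rng=random):
    """Exhaustive kNN via Quicksort-style partitioning of all distances."""
    pairs = [(dist(obj, query), i) for i, obj in enumerate(db)]
    k = min(k, len(pairs))
    result = []
    while len(result) < k:
        need = k - len(result)
        pivot = rng.choice(pairs)
        # Pairs are unique (distinct indices), so only the pivot equals itself.
        lower = [p for p in pairs if p < pivot]
        upper = [p for p in pairs if p > pivot]
        if len(lower) >= need:
            # All remaining neighbors lie below the pivot; discard the rest.
            pairs = lower
        else:
            # Everything below the pivot, and the pivot itself, is a neighbor.
            result.extend(lower)
            result.append(pivot)
            pairs = upper
    return sorted(result)
```

Each function returns a list of `(distance, index)` pairs in ascending order of distance. In this sketch the Selection Sort variant performs on the order of k·n comparisons after the distance computation, so it is attractive mainly for small k, whereas the partitioning variant has expected cost linear in n regardless of k. This is one way to see why the preferred approach can depend on the k value of the query.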




  1.

    This programmer control is larger when using the SRAM mainly as a software-controlled memory, but hardware-controlled cache must also be taken into account during the mapping.




This research was supported by the Project of the Universidad Católica del Maule (Chile) “Plan de Desarrollo Anual Facultad de Ingeniería. Convenio de Desempeño” and by the European Commission (FEDER) and Junta de Comunidades de Castilla-La Mancha under the project PEII-2014-028-P. Powered@NLHPC: This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).

Author information



Corresponding author

Correspondence to Ricardo J. Barrientos.


About this article


Cite this article

Barrientos, R.J., Millaguir, F., Sánchez, J.L. et al. GPU-based exhaustive algorithms processing kNN queries. J Supercomput 73, 4611–4634 (2017).



  • kNN
  • Quicksort
  • Selection Sort
  • GPU
  • Exhaustive algorithms