Towards an efficient static scheduling scheme for delivering queries to heterogeneous clusters in the similarity search problem

The Journal of Supercomputing

Abstract

Medium and large clusters built from hybrid CPU/graphics processing unit (GPU) nodes are now present in many datacenters. They can accelerate many kinds of applications and are well suited to applications that handle a high volume of data. The similarity search problem is one such case: large databases must be managed, and hundreds or thousands of queries per second must be answered with very short response times. However, designing and using heterogeneous computing platforms poses major challenges, because system size, energy saving, task mapping, and scheduling, among others, must all be handled efficiently. In this paper we focus on scheduling, that is, on distributing the incoming queries among all the processing components of the cluster nodes. Our algorithms exploit the available computational resources by processing queries simultaneously on the CPU cores and on the GPUs. We therefore address the problem of how to distribute the queries over the whole system to obtain the best performance, with the aim of defining a heuristic that automatically provides the best distribution. Experimental results show the benefits, in terms of execution time and energy saving, of using an appropriate scheduling scheme.
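
The abstract does not spell out the scheduling heuristic itself. As a rough illustration of what a static distribution of queries over CPU cores and GPUs can look like, the sketch below splits a batch of queries in proportion to a per-device throughput estimate. The function name `static_split`, the device labels, and the throughput figures are all hypothetical, and proportional splitting is only one simple possibility, not necessarily the scheme proposed in the paper.

```python
from typing import Dict


def static_split(num_queries: int, throughput: Dict[str, float]) -> Dict[str, int]:
    """Split a batch of queries across devices in proportion to throughput.

    `throughput` maps a device label to an assumed queries-per-second
    estimate (hypothetical values, e.g. obtained from a calibration run).
    """
    if num_queries < 0:
        raise ValueError("num_queries must be non-negative")
    if not throughput or any(rate <= 0 for rate in throughput.values()):
        raise ValueError("every device needs a positive throughput estimate")

    total = sum(throughput.values())
    # Ideal (fractional) share of the batch for each device.
    ideal = {dev: num_queries * rate / total for dev, rate in throughput.items()}
    # Start from the integer part of each share.
    counts = {dev: int(share) for dev, share in ideal.items()}
    # Give the leftover queries to the devices with the largest fractional parts,
    # so the counts always add up to num_queries.
    leftover = num_queries - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda dev: ideal[dev] - counts[dev], reverse=True)
    for dev in by_remainder[:leftover]:
        counts[dev] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical node: two CPU cores and one GPU assumed to be 8x faster per device.
    print(static_split(1000, {"cpu0": 1.0, "cpu1": 1.0, "gpu0": 8.0}))
```

Because the split is computed once from fixed estimates, it is static: a device that turns out slower than estimated becomes the bottleneck, which is why the quality of the throughput estimates matters for this kind of scheme.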

Acknowledgments

This work has been partially supported by the project Ref: TIN2009-14475-C04 and by CAPAP-H4 network (TIN2011-15734-E).

Author information

Corresponding author

Correspondence to Enrique Arias.

About this article

Cite this article

Uribe-Paredes, R., Cazorla, D., Arias, E. et al. Towards an efficient static scheduling scheme for delivering queries to heterogeneous clusters in the similarity search problem. J Supercomput 70, 527–540 (2014). https://doi.org/10.1007/s11227-013-1079-4

  • DOI: https://doi.org/10.1007/s11227-013-1079-4