DARDIS: Distributed And Randomized DIspatching and Scheduling

  • Conference paper
AI*IA 2016 Advances in Artificial Intelligence (AI*IA 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10037)

Abstract

Scheduling and dispatching are critical enabling technologies in supercomputing and grid computing. In these contexts scalability is a central issue: up to tens of thousands of tasks must be allocated and scheduled on tens of thousands of resources. This problem scale is out of reach for complete, centralized scheduling approaches. We propose DARDIS, a distributed allocation and scheduling paradigm that is lightweight, scalable, and fully customizable in many domains. In DARDIS, each task offloads to the available resources the computation of a probability index for each possible start time of that task on that resource. The task then selects a resource and a start time on the basis of these probability indices. The scheduler can be customized with different policies to fit several objective functions, such as load balancing or makespan. We evaluate our approach in the domain of grids and supercomputers, comparing DARDIS with the most widely used algorithms in these domains, and show that it can reach better solutions in several cases.
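
The dispatching step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the probability index is computed, so everything below is an assumption made for illustration: the function names, the load-profile representation, and the index itself (here, the fraction of capacity left free over the task's window) are hypothetical and are not taken from the paper.

```python
import random


def probability_index(load_profile, start, duration, capacity):
    """Hypothetical index computed on the resource side.

    Assumption: the index is the fraction of capacity that stays free
    over the window [start, start + duration). It is 0.0 when the task
    does not fit. The real DARDIS index may differ.
    """
    window = load_profile[start:start + duration]
    if len(window) < duration or max(window) >= capacity:
        return 0.0
    return 1.0 - max(window) / capacity


def dispatch(task_duration, resources, capacity, rng):
    """Task side: sample a (resource, start) pair.

    Each pair is drawn with probability proportional to the index
    returned by the resource. Returns None if no pair is feasible.
    """
    candidates, weights = [], []
    for name, profile in resources.items():
        for start in range(len(profile) - task_duration + 1):
            weight = probability_index(profile, start, task_duration, capacity)
            if weight > 0:
                candidates.append((name, start))
                weights.append(weight)
    if not candidates:
        return None
    return rng.choices(candidates, weights=weights, k=1)[0]


# Illustrative data: per-time-slot load on two resources of capacity 4.
resources = {
    "r0": [4, 4, 4, 4],  # fully loaded in every slot
    "r1": [0, 0, 4, 4],  # free only in the first two slots
}
choice = dispatch(2, resources, capacity=4, rng=random.Random(0))
```

In this toy instance only one feasible pair exists, so the random draw is forced. With several feasible pairs, changing the weighting function is where a policy such as load balancing or makespan minimization would plug in under this sketch's assumptions.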

Acknowledgments

This work was partially supported by the FP7 ERC Advance project MULTITHERMAN (g.a. 291125), by the YINS RTD project (no. 20NA21 150939), evaluated by the Swiss NSF and funded by Nano-Tera.ch with Swiss Confederation financing and by CINECA.

Author information

Corresponding author

Correspondence to Thomas Bridi.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Bridi, T., Lombardi, M., Bartolini, A., Benini, L., Milano, M. (2016). DARDIS: Distributed And Randomized DIspatching and Scheduling. In: Adorni, G., Cagnoni, S., Gori, M., Maratea, M. (eds) AI*IA 2016 Advances in Artificial Intelligence. AI*IA 2016. Lecture Notes in Computer Science, vol 10037. Springer, Cham. https://doi.org/10.1007/978-3-319-49130-1_36

  • DOI: https://doi.org/10.1007/978-3-319-49130-1_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49129-5

  • Online ISBN: 978-3-319-49130-1

  • eBook Packages: Computer Science (R0)
