
Communication Models Insights Meet Simulations

  • Pierre-François Dutot
  • Millian Poquet
  • Denis Trystram
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9523)

Abstract

It is well known that accounting for communications when scheduling jobs on large-scale parallel computing platforms is a crucial issue. In modern hierarchical platforms, communication times differ greatly depending on whether they occur inside a cluster or between clusters. Thus, allocating jobs with locality constraints in mind is a key factor in reaching good performance. However, several theoretical results prove that imposing such constraints reduces the solution space and thus possibly degrades performance. In practice, such constraints simplify implementations and most often lead to better results.

Our aim in this work is to bridge theoretical and practical intuitions, and check the differences between constrained and unconstrained schedules (namely with respect to locality and node contiguity) through simulations. We have developed a generic tool, using SimGrid as the base simulator, enabling interactions with external batch schedulers to evaluate their scheduling policies. The results confirm that insights gained through theoretical models are ill-suited to current architectures and should be reevaluated.

Keywords

FCFS with backfilling, Simulations, Heterogeneity

Notes

Acknowledgments

The work is partially supported by the ANR project MOEBUS. Experiments presented in this paper were carried out using the Grid’5000 experimental testbed, being developed under the INRIA ALADDIN development action with support from CNRS, RENATER and several Universities as well as other funding bodies (see https://www.grid5000.fr).


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Pierre-François Dutot (1, 2, 3)
  • Millian Poquet (1, 2, 3)
  • Denis Trystram (1, 2, 3, 4)
  1. Université Grenoble Alpes, LIG, Grenoble, France
  2. CNRS, LIG, Grenoble, France
  3. Inria, Grenoble, France
  4. Institut Universitaire de France, Paris, France
