Simbatch: An API for Simulating and Predicting the Performance of Parallel Resources Managed by Batch Systems

  • Y. Caniou
  • J.-S. Gay
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5415)

Abstract

In this paper, we describe Simbatch, an API that offers core functionalities to realistically simulate parallel resources and batch reservation systems. The objective is twofold: to provide a tool that efficiently predicts the usage of parallel resources based on their simulation, and to enable realistic studies of Grid scheduling heuristics that may be embedded in a Grid middleware or in a tool that deploys it. Such predictions can be used in a Grid middleware both for scheduling purposes and to dynamically tune moldable applications, on behalf of the Grid user, according to the load of the chosen parallel resource. Simbatch simulation experiments show an average error rate under 2% compared to real-life experiments conducted with the OAR batch manager.

Keywords

Performance prediction, Batch systems simulation, Grid simulation, Scheduling

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Y. Caniou (1)
  • J.-S. Gay (2)
  1. LIP-ÉNS de Lyon, Université Claude Bernard de Lyon, France
  2. LIP-ÉNS de Lyon, France
