Journal of Grid Computing, Volume 6, Issue 1, pp 77–101

Statistical Analysis and Modeling of Jobs in a Grid Environment

  • Konstantinos Christodoulopoulos
  • Vasileios Gkamas
  • Emmanouel A. Varvarigos

Abstract

Good probabilistic models of the job arrival process, and of the delay components introduced at the different stages of job processing in a Grid environment, are important for a better understanding of Grid computing. In this study, we present a thorough analysis of the job arrival process in the EGEE infrastructure and of the time a job spends in each state in the EGEE environment. We define four components of the total job delay and model each component separately. We observe that the job inter-arrival times at the Grid level can be adequately modelled by a rounded exponential distribution, while the total job delay (from the time a job is generated until the time it completes execution) is dominated by the computing element's (CE's) register and queuing times and the worker node's (WN's) execution times. Further, we evaluate the efficiency of the EGEE environment by comparing its total job delay performance with that of a hypothetical ideal super-cluster, and conclude that we would obtain similar performance if we submitted the same workload to a super-cluster of size equal to 34% of the total average number of CPUs participating in the EGEE infrastructure. We also analyze the job inter-arrival times, the CE's queuing times, the WN's execution times, and the data sizes exchanged at the kallisto.hellasgrid.gr cluster, which is a node in the EGEE infrastructure. In contrast to the Grid level, we find that at the cluster level the job arrival process exhibits self-similarity/long-range dependence. Finally, we propose simple and intuitive models for the job arrival process and the execution times at the cluster level.
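As a rough illustration of the rounded exponential model mentioned in the abstract, the following Python sketch draws exponential inter-arrival times and rounds each one to the nearest whole second. It assumes that "rounded exponential" means an exponential variate rounded to the nearest integer, which the abstract does not spell out. The function names and the rate value are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_rounded_exponential(rate, n, seed=0):
    """Draw n exponential inter-arrival times (rate in 1/seconds)
    and round each to the nearest whole second.

    Assumption: "rounded" is taken to mean nearest-integer rounding.
    """
    rng = random.Random(seed)
    return [round(rng.expovariate(rate)) for _ in range(n)]


def rounded_exponential_pmf(k, rate):
    """Return P(round(X) = k) for X ~ Exp(rate).

    The value k collects the probability mass of the interval
    [k - 0.5, k + 0.5), truncated at zero for k = 0.
    """
    if k < 0:
        return 0.0
    lower = max(k - 0.5, 0.0)
    upper = k + 0.5
    return math.exp(-rate * lower) - math.exp(-rate * upper)


if __name__ == "__main__":
    rate = 0.5  # hypothetical arrival rate in jobs per second
    samples = sample_rounded_exponential(rate, 20000)
    observed_zero = samples.count(0) / len(samples)
    print("observed P(0):", observed_zero)
    print("model    P(0):", rounded_exponential_pmf(0, rate))
```

Comparing the empirical frequencies of the rounded samples against `rounded_exponential_pmf` is one simple way to check whether logged integer-second inter-arrival times are consistent with such a model.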

Keywords

Grid computing, Job profiling, Delay components, Probabilistic modeling

Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  • Konstantinos Christodoulopoulos (1)
  • Vasileios Gkamas (1)
  • Emmanouel A. Varvarigos (1)

  1. Computer Engineering and Informatics Department and Research Academic Computer Technology Institute, University of Patras, Rio, Patras, Greece
