Workload Characteristics of a Multi-cluster Supercomputer

  • Hui Li
  • David Groep
  • Lex Wolters
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3277)

Abstract

This paper presents a comprehensive characterization of a multi-cluster supercomputer workload using twelve months of scientific research traces. The metrics we characterize include system utilization, job arrival rate and interarrival time, job cancellation rate, job size (degree of parallelism), job runtime, memory usage, and user/group behavior. Correlations between metrics (job runtime and memory usage, requested and actual runtime, etc.) are identified and extensively studied. Differences from previously reported workloads are identified, and statistical distributions are fitted for generating synthetic workloads with the same characteristics. This study provides a realistic basis for experiments in resource management and for evaluating different scheduling strategies in a multi-cluster research environment.
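As a rough illustration of the kind of metrics listed in the abstract, the following Python sketch computes job interarrival times and system utilization from a small job trace, and fits a single-parameter distribution to the interarrival times. The trace layout (tuples of start time, runtime, and processor count) and all function names are hypothetical. The exponential fit is only a placeholder chosen for simplicity; the abstract does not say which distributions the authors fitted.

```python
from statistics import mean


def interarrival_times(arrivals):
    """Return the gaps (in seconds) between consecutive job arrival timestamps."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def utilization(jobs, total_processors, start, end):
    """Fraction of available processor-seconds used within [start, end).

    jobs: iterable of (start_time, runtime, processors) tuples.
    This is a hypothetical trace layout, not the format used in the paper.
    """
    used = 0.0
    for job_start, runtime, processors in jobs:
        # Count only the part of the job that falls inside the window.
        overlap = min(job_start + runtime, end) - max(job_start, start)
        if overlap > 0:
            used += overlap * processors
    return used / (total_processors * (end - start))


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean.

    Assumption for illustration only; real workloads often need
    heavier-tailed or multi-phase distributions.
    """
    return 1.0 / mean(samples)


gaps = interarrival_times([0, 10, 30, 60])
rate = fit_exponential_rate(gaps)
load = utilization([(0, 50, 2), (50, 100, 4)], total_processors=4, start=0, end=100)
```

A synthetic arrival stream under the same assumption could then be drawn with `random.expovariate(rate)` from the standard library.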

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Hui Li (1)
  • David Groep (2)
  • Lex Wolters (1)

  1. Leiden Institute of Advanced Computer Science (LIACS), Leiden University, The Netherlands
  2. National Institute for Nuclear and High Energy Physics (NIKHEF), The Netherlands
