Comprehensive Workload Analysis and Modeling of a Petascale Supercomputer

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7698)

Abstract

The performance of supercomputer schedulers is strongly affected by the characteristics of the workloads they serve, so a good understanding of those characteristics is essential for developing and evaluating scheduling strategies for an HPC system. In this paper, we present a comprehensive analysis of the workload characteristics of Kraken, the world’s fastest academic supercomputer and 11th on the latest Top500 list, with 112,896 compute cores and a peak performance of 1.17 petaflops. We use twelve months of workload traces gathered on the system, covering around 700 thousand jobs submitted by more than one thousand users from 25 research areas. We investigate three categories of workload characteristics: 1) general characteristics, including the distribution of jobs over research fields and queues, the distribution of job sizes for an individual user, job cancellation rate, job termination rate, and walltime request accuracy; 2) temporal characteristics, including monthly machine utilization, job temporal distributions over different time periods, and job inter-arrival times, both between temporally adjacent jobs and between jobs submitted by the same user; 3) execution characteristics, including the distribution of each job attribute, such as queuing time, actual runtime, job size, and memory usage, and the correlations between these attributes. This work provides a realistic basis for scheduler design and comparison by studying the supercomputer’s workload with new approaches, such as a Gaussian mixture model, and from new viewpoints, such as that of the user community. To the best of our knowledge, this is the first study to systematically investigate the workload characteristics of a petascale supercomputer dedicated to open scientific research.
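
To make two of the metrics named in the abstract concrete, the following is a minimal Python sketch of how walltime request accuracy and job inter-arrival times might be computed from a job trace. It is not the authors' method: the `Job` record, its field names, the toy `TRACE`, and the definition of accuracy as actual runtime divided by requested walltime are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical trace record; field names are illustrative assumptions.
    user: str
    submit: float     # submission time, seconds since trace start
    requested: float  # requested walltime, seconds
    runtime: float    # actual runtime, seconds


def walltime_accuracy(job):
    """Assumed definition: actual runtime / requested walltime (1.0 = exact)."""
    return job.runtime / job.requested


def inter_arrival_times(jobs):
    """Gaps between temporally adjacent submissions across all users."""
    times = sorted(j.submit for j in jobs)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def per_user_inter_arrival(jobs):
    """Gaps between consecutive submissions by the same user."""
    by_user = {}
    for j in sorted(jobs, key=lambda j: j.submit):
        by_user.setdefault(j.user, []).append(j.submit)
    return {
        user: [later - earlier for earlier, later in zip(ts, ts[1:])]
        for user, ts in by_user.items()
    }


# Toy trace with made-up values, not data from the paper.
TRACE = [
    Job("alice", 0.0, 3600.0, 1800.0),
    Job("bob", 60.0, 7200.0, 7200.0),
    Job("alice", 300.0, 3600.0, 900.0),
    Job("bob", 900.0, 1800.0, 450.0),
]

if __name__ == "__main__":
    print([walltime_accuracy(j) for j in TRACE])
    print(inter_arrival_times(TRACE))
    print(per_user_inter_arrival(TRACE))
```

Separating the system-wide and per-user gaps mirrors the distinction the abstract draws between temporally adjacent jobs and jobs submitted by the same user.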

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

You, H., Zhang, H. (2013). Comprehensive Workload Analysis and Modeling of a Petascale Supercomputer. In: Cirne, W., Desai, N., Frachtenberg, E., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2012. Lecture Notes in Computer Science, vol 7698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35867-8_14

  • DOI: https://doi.org/10.1007/978-3-642-35867-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35866-1

  • Online ISBN: 978-3-642-35867-8

  • eBook Packages: Computer Science, Computer Science (R0)
