Workload evolution on the Cornell Theory Center IBM SP2

  • Steven Hotovy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1162)

Abstract

The Cornell Theory Center (CTC) put a 512-node IBM SP2 system into production in early 1995, and extended traces of batch jobs began to be collected in June of that year. An analysis of the workload shows that it has not only grown, but that its characteristics have changed over time. In particular, job duration increased with time, indicative of an expanding production workload. In addition, there was increasing use of parallelism.

As the load has increased and larger jobs have become more frequent, the batch management software (IBM's LoadLeveler) has had difficulty in scheduling the requested resources. New policies were established to improve the situation.

This paper will profile how the workload has changed over time and give an in-depth look at the maturing workload. It will examine how frequently certain resources are requested and analyze user submittal patterns. It will also describe the policies that were implemented to improve the scheduling situation and their effect on the workload.
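As a rough illustration of the kind of profiling described above, the following minimal sketch groups batch job records by submission month and reports the mean requested node count and mean job duration. The record layout, the `monthly_profile` helper, and the sample values are all hypothetical; they are not taken from the CTC traces or from LoadLeveler's log format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records: (submit month "YYYY-MM", nodes requested, wall-clock seconds).
# The values are illustrative only and do not come from the CTC SP2 traces.
SAMPLE_TRACE = [
    ("1995-06", 1, 600),
    ("1995-06", 4, 1800),
    ("1995-07", 8, 3600),
    ("1995-07", 16, 7200),
]


def monthly_profile(trace):
    """Return {month: (mean_nodes, mean_duration_seconds)}, ordered by month."""
    by_month = defaultdict(list)
    for month, nodes, seconds in trace:
        by_month[month].append((nodes, seconds))
    return {
        month: (mean(n for n, _ in jobs), mean(s for _, s in jobs))
        for month, jobs in sorted(by_month.items())
    }


profile = monthly_profile(SAMPLE_TRACE)
for month, (avg_nodes, avg_seconds) in profile.items():
    print(month, avg_nodes, avg_seconds)
```

A rising mean duration or node count from month to month in such a table would correspond to the trends the abstract reports (longer jobs and increasing use of parallelism), though a real analysis would also need to look at distributions rather than means alone.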

Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Steven Hotovy, Cornell Theory Center, Ithaca
