Workload evolution on the Cornell Theory Center IBM SP2
The Cornell Theory Center (CTC) put a 512-node IBM SP2 system into production in early 1995, and extended traces of batch jobs began to be collected in June of that year. An analysis of the workload shows not only that it has grown, but also that its characteristics have changed over time. In particular, job duration increased over time, which indicates an expanding production workload. Use of parallelism also increased.
As the load has increased and larger jobs have become more frequent, the batch management software (IBM's LoadLeveler) has had difficulty in scheduling the requested resources. New policies were established to improve the situation.
This paper will profile how the workload has changed over time and give an in-depth look at the maturing workload. It will examine how frequently certain resources are requested and analyze user submittal patterns. It will also describe the policies that were implemented to improve the scheduling situation and their effect on the workload.