Architecture-independent request-scheduling with tight waiting-time estimations
Over the last few years, users' interaction with parallel computer-systems has changed: the number of interactive HPC-applications has grown continuously. On partitionable MPP-systems, where each physically separated region is used exclusively, issues such as the average waiting-time matter more to users than the total system-throughput.
In this paper, we focus on the problem of scheduling an arbitrary mixture of resource-requests for batch and interactive applications in an architecture-independent manner. To help users plan their daily work, tight waiting-time estimations are indispensable. However, the resulting scheduling problem conflicts with the problem of mapping requests onto specific MPP-architectures so as to reduce their internal fragmentation.
We will show that this conflict can be alleviated by a distributed prover-verifier methodology. First, we will introduce the distributed resource-management software CCS with its architecture-independent scheduling method. The message-based approach presented is used to verify the pre-calculated schedules with the help of the system-dependent mapping instances. Simulations with the accounting data of our center have shown that tight waiting-time estimations can be made while the architecture-independent scheduling approach is still preserved. We will show that by using this methodology the mean error-value of the predicted waiting-time can be reduced by 76 %. Finally, we will discuss the impact of such a distributed resource-management system on the metacomputing challenge.
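The prover-verifier idea can be illustrated with a minimal sketch. It is not the CCS implementation, whose protocol the abstract does not describe. It assumes a hypothetical machine that only hands out partitions whose size is a power of two, and all names (`Request`, `fits`, `power_of_two`, `plan`) are invented for illustration. The architecture-independent scheduler (the prover) proposes start times by counting nodes only; the mapping instance (the verifier) re-checks each proposal against the physical partition size and rejects proposals that cannot be mapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    nodes: int      # number of processors requested
    duration: int   # requested run time in abstract time units


def fits(bookings, capacity, nodes, start, duration):
    """True if `nodes` more nodes are free during [start, start + duration)."""
    end = start + duration
    # Usage only rises where a booking starts, so checking these points suffices.
    points = [start] + [s for s, _, _ in bookings if start < s < end]
    return all(
        sum(n for s, e, n in bookings if s <= t < e) + nodes <= capacity
        for t in points
    )


def power_of_two(nodes):
    """Assumed mapping rule: partitions are rounded up to a power of two."""
    size = 1
    while size < nodes:
        size *= 2
    return size


def plan(requests, capacity, physical_size=None):
    """Return {request name: start time}.

    Without `physical_size`, only the architecture-independent prover runs.
    With it, every proposed start must also pass the verifier's check.
    """
    logical, physical, starts = [], [], {}
    for req in requests:
        need = physical_size(req.nodes) if physical_size else req.nodes
        if need > capacity:
            raise ValueError(f"{req.name} can never be mapped")
        # Candidate starts: time 0 and every time at which a planned job ends.
        candidates = sorted({0} | {end for _, end, _ in logical})
        for start in candidates:
            proposed = fits(logical, capacity, req.nodes, start, req.duration)
            verified = physical_size is None or fits(
                physical, capacity, need, start, req.duration
            )
            if proposed and verified:
                logical.append((start, start + req.duration, req.nodes))
                physical.append((start, start + req.duration, need))
                starts[req.name] = start
                break
    return starts


requests = [Request("A", 3, 10), Request("B", 3, 5), Request("C", 2, 5)]
unverified = plan(requests, capacity=8)
verified = plan(requests, capacity=8, physical_size=power_of_two)
```

In this toy setting the unverified plan promises request C an immediate start, because 3 + 3 + 2 nodes fit on 8 by count alone. The verified plan delays C until B has finished, because A and B each occupy a 4-node partition. The gap between the two start times is the kind of waiting-time estimation error that verification is meant to remove.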
Keywords: Distributed Resource Management, Request-Scheduling, Metacomputing