A Gang-Scheduling System for ASCI Blue-Pacific
The ASCI Blue-Pacific machines are large parallel systems composed of thousands of processors. We are currently developing and testing a gang-scheduling job control system for these machines that exploits space- and time-sharing in the presence of dedicated communication devices. Our initial experience with this system indicates that, though applications pay a small overhead, overall system performance, as measured by average job queue and response times, improves significantly. This gang-scheduling system is planned for deployment into production mode during 1999 at Lawrence Livermore National Laboratory.
Keywords: Lawrence Livermore National Laboratory; Average Execution Time; Memory Footprint; Kernel Extension; Gang Schedule