Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1659)

Abstract

The NAS facility has operated parallel supercomputers for the past 11 years, including the Intel iPSC/860, Intel Paragon, Thinking Machines CM-5, IBM SP-2, and Cray Origin 2000. Across this wide variety of machine architectures, across a span of 10 years, across a large number of different users, and through thousands of minor configuration and policy changes, the utilization of these machines shows three general trends: (1) scheduling using a naive FCFS first-fit policy results in 40-60% utilization, (2) switching to the more sophisticated dynamic backfilling scheduling algorithm improves utilization by about 15 percentage points (yielding about 70% utilization), and (3) reducing the maximum allowable job size further increases utilization. Most surprising is the consistency of these trends. Over the lifetime of the NAS parallel systems, we made hundreds, perhaps thousands, of small changes to hardware, software, and policy, yet utilization was affected little. In particular, these results show that the goal of achieving near 100% utilization while supporting a real parallel supercomputing workload is unrealistic.
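
To make the contrast between the first two trends concrete, here is a minimal toy sketch comparing a strict FCFS policy (the job at the head of the queue blocks everything behind it) with a simple reservation-based backfilling policy. It is not the dynamic backfilling algorithm used at NAS; the function name `simulate`, the workload `toy_jobs`, and the 8-node machine are assumptions chosen for illustration, and the resulting numbers say nothing about the utilization measured on the real systems.

```python
import heapq

def simulate(jobs, total_nodes, backfill=False):
    """Toy space-sharing simulation (hypothetical, for illustration only).

    jobs: list of (nodes, runtime) tuples, all queued at time 0 in FCFS order.
    Returns utilization = node-time of work done / (total_nodes * makespan).
    """
    if any(n > total_nodes or n <= 0 or r <= 0 for n, r in jobs):
        raise ValueError("each job needs 1..total_nodes nodes and a positive runtime")
    queue = list(jobs)
    running = []          # min-heap of (end_time, nodes)
    free = total_nodes
    now = 0.0
    while queue or running:
        # Start jobs from the head of the queue for as long as they fit.
        while queue and queue[0][0] <= free:
            nodes, runtime = queue.pop(0)
            heapq.heappush(running, (now + runtime, nodes))
            free -= nodes
        if backfill and queue:
            # Reserve a start time ("shadow") for the blocked head job.
            head_nodes = queue[0][0]
            avail, shadow = free, now
            for end, nodes in sorted(running):
                avail += nodes
                shadow = end
                if avail >= head_nodes:
                    break
            extra = avail - head_nodes  # nodes the head job will not need
            i = 1
            while i < len(queue):
                nodes, runtime = queue[i]
                ends_in_time = now + runtime <= shadow
                # A later job may jump ahead only if it cannot delay the head job.
                if nodes <= free and (ends_in_time or nodes <= extra):
                    queue.pop(i)
                    heapq.heappush(running, (now + runtime, nodes))
                    free -= nodes
                    if not ends_in_time:
                        extra -= nodes
                else:
                    i += 1
        # Advance the clock to the next job completion.
        end, nodes = heapq.heappop(running)
        now = end
        free += nodes
    work = sum(n * r for n, r in jobs)
    return work / (total_nodes * now)

# Hypothetical workload: (nodes, runtime) on an assumed 8-node machine.
toy_jobs = [(4, 10), (8, 10), (2, 5), (2, 5)]
print(simulate(toy_jobs, 8))                  # strict FCFS
print(simulate(toy_jobs, 8, backfill=True))   # with backfilling
```

In this toy workload the full-machine job at position two leaves half the machine idle under strict FCFS, while backfilling lets the two small jobs run in that gap without delaying it.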



Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jones, J.P., Nitzberg, B. (1999). Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_1

  • DOI: https://doi.org/10.1007/3-540-47954-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66676-9

  • Online ISBN: 978-3-540-47954-3

  • eBook Packages: Springer Book Archive
