The Journal of Supercomputing, Volume 34, Issue 2, pp 135–163

Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters

  • William M. Jones
  • Walter B. Ligon III
  • Louis W. Pang
  • Dan Stanzione

Abstract

In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocated jobs and the contention they often create in the network infrastructure of a dedicated computational multi-cluster.

We also present several bandwidth-aware co-allocating meta-schedulers. These schedulers take inter-cluster network utilization into account in order to mitigate degraded job run-time performance. We make use of a bandwidth-centric parallel job communication model that captures the time-varying utilization of shared inter-cluster network resources. By doing so, we are able to evaluate the performance of multi-cluster scheduling algorithms that focus not only on node resource allocation, but also on shared inter-cluster network bandwidth.
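
The contrast drawn above, between a fixed execution-time penalty and a penalty that depends on how heavily a shared inter-cluster link is used, can be illustrated with a small sketch. This is not the model from the paper: the function names, the default penalty of 1.25, the proportional-sharing rule, and the assumption that computation and communication do not overlap are all hypothetical choices made only for illustration. It also evaluates a single snapshot of link demand, whereas the abstract describes utilization that varies over time.

```python
def fixed_penalty_runtime(base_runtime, penalty=1.25):
    """Fixed-penalty view: every co-allocated job is slowed by the same
    factor, regardless of what else is using the network.
    The default penalty is an arbitrary illustrative value."""
    return base_runtime * penalty


def link_slowdown(demands, link_capacity):
    """Contention-dependent view (assumed proportional sharing).

    demands: bandwidth each co-allocated job would use if unconstrained,
             in the same units as link_capacity.
    Returns 1.0 while the link is not saturated; otherwise the factor by
    which communication stretches when the link is shared proportionally.
    """
    total_demand = sum(demands)
    return max(1.0, total_demand / link_capacity)


def bandwidth_aware_runtime(base_runtime, comm_fraction, slowdown):
    """Stretch only the communication part of the job's run time.
    Assumes computation and communication do not overlap."""
    compute_part = 1.0 - comm_fraction
    return base_runtime * (compute_part + comm_fraction * slowdown)


if __name__ == "__main__":
    # Two co-allocated jobs whose combined demand exceeds the link capacity.
    s = link_slowdown([80, 70], link_capacity=100)
    print("slowdown on saturated link:", s)
    print("bandwidth-aware run time:", bandwidth_aware_runtime(100.0, 0.5, s))
    print("fixed-penalty run time:", fixed_penalty_runtime(100.0))
```

Under these assumptions, a job co-allocated on an idle link pays no penalty at all, while the fixed-penalty view charges it the same cost as a job on a saturated link. That difference is the kind of information a bandwidth-aware scheduler could use when deciding whether to co-allocate.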

Keywords

parallel job scheduling; multiple clusters; bandwidth-aware; network contention; job co-allocation; multi-site scheduling; simulation

Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • William M. Jones (1)
  • Walter B. Ligon III (1)
  • Louis W. Pang (1)
  • Dan Stanzione (2)

  1. Parallel Architecture Research Lab, Department of Electrical and Computer Engineering, Clemson University, Clemson
  2. High Performance Computing Center, Fulton School of Engineering, Arizona State University, Tempe
