Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters
In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocated jobs and the contention they often create in the network infrastructure of a dedicated computational multi-cluster.
We also present several bandwidth-aware co-allocating meta-schedulers. These schedulers take inter-cluster network utilization into account to mitigate degraded job run-time performance. They rely on a bandwidth-centric parallel job communication model that captures the time-varying utilization of shared inter-cluster network resources. This lets us evaluate multi-cluster scheduling algorithms that consider not only node resource allocation but also shared inter-cluster network bandwidth.
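The contrast between a fixed execution time penalty and a bandwidth-dependent one can be illustrated with a small sketch. This is not the model from the paper: the function names, the `comm_fraction` parameter, the 25% penalty, and the rule that communication time stretches in proportion to link oversubscription are all assumptions chosen only to show the idea.

```python
def fixed_penalty_runtime(base_runtime, penalty=0.25):
    """Fixed-penalty view: every co-allocated job is slowed by the same
    factor, regardless of what else is using the network.
    The 25% default is an arbitrary illustrative value."""
    return base_runtime * (1.0 + penalty)


def bandwidth_aware_runtime(base_runtime, comm_fraction, total_demand, capacity):
    """Bandwidth-centric view (hypothetical simplification): only the
    communication part of the job slows down, and only when the combined
    demand of all co-allocated jobs exceeds the inter-cluster link capacity.

    base_runtime  -- runtime on a single cluster with no contention
    comm_fraction -- share of base_runtime spent communicating (0..1)
    total_demand  -- summed bandwidth demand of all jobs on the link
    capacity      -- bandwidth capacity of the link (same units as demand)
    """
    # Oversubscription factor: 1.0 while the link is not saturated.
    saturation = max(1.0, total_demand / capacity)
    compute_time = base_runtime * (1.0 - comm_fraction)
    comm_time = base_runtime * comm_fraction * saturation
    return compute_time + comm_time


if __name__ == "__main__":
    # Same job, three situations: fixed penalty, idle link, saturated link.
    print(fixed_penalty_runtime(100.0))
    print(bandwidth_aware_runtime(100.0, 0.4, total_demand=50.0, capacity=100.0))
    print(bandwidth_aware_runtime(100.0, 0.4, total_demand=200.0, capacity=100.0))
```

Under these assumptions, the fixed penalty charges the job the same slowdown whether the link is idle or saturated, while the bandwidth-dependent estimate changes with the load other co-allocated jobs place on the shared link. A bandwidth-aware scheduler could use that kind of load information when deciding where to place a job.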
Keywords: parallel job scheduling, multiple clusters, bandwidth-aware, network contention, job co-allocation, multi-site scheduling, simulation