Benefits and Drawbacks of Redundant Batch Requests

Casanova, Henri

doi:10.1007/s10723-007-9068-6

Published: 06 February 2007

Journal of Grid Computing, Volume 5, pages 235–250 (2007)

Henri Casanova¹

Abstract

Most parallel computing platforms are controlled by batch schedulers that place requests for computation in a queue until access to compute nodes is granted. Queue waiting times are notoriously hard to predict, making it difficult for users not only to estimate when their applications may start, but also to pick, among multiple batch-scheduled platforms, the one that will produce the shortest turnaround time. As a result, an increasing number of users resort to “redundant requests”: several requests are simultaneously submitted to multiple batch schedulers on behalf of a single job; once one of these requests is granted access to compute nodes, the others are canceled. Using simulation as well as experiments with a production batch scheduler, we evaluate the impact of redundant requests on (1) average job performance, (2) schedule fairness, (3) system load, and (4) system predictability. We find that some of the popularly held beliefs about the harmfulness of redundant batch requests are unfounded. We also find that the two most critical issues with redundant requests are the additional load on current middleware infrastructures and unfairness towards users who do not use redundant requests. Using our experimental results, we quantify both impacts in terms of the number of users who use redundant requests and of the amount of request redundancy these users employ.
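
The redundant-request mechanism described in the abstract can be illustrated with a minimal sketch. This is not the simulator or the scheduler interface used in the article; the class names, the scheduler names, and the `on_granted` callback are all hypothetical and serve only to show the submit-everywhere, cancel-the-rest bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """One batch request for a job, queued at a single scheduler (hypothetical model)."""
    job_id: str
    scheduler: str
    state: str = "queued"  # one of: queued, running, canceled


@dataclass
class RedundantJob:
    """A single job represented by several simultaneous requests."""
    job_id: str
    requests: dict = field(default_factory=dict)  # scheduler name -> Request

    def submit(self, schedulers):
        # Submit one request per scheduler on behalf of the same job.
        for name in schedulers:
            self.requests[name] = Request(self.job_id, name)

    def on_granted(self, scheduler):
        """Handle a grant of compute nodes by `scheduler`.

        Returns the names of the schedulers whose requests were canceled.
        """
        winner = self.requests[scheduler]
        if winner.state != "queued":
            return []  # this request was already canceled or started; ignore the grant
        winner.state = "running"
        canceled = []
        for name, req in self.requests.items():
            if name != scheduler and req.state == "queued":
                req.state = "canceled"
                canceled.append(name)
        return canceled


# Assumed example: one job submitted to three hypothetical clusters.
job = RedundantJob("job-1")
job.submit(["cluster-a", "cluster-b", "cluster-c"])
canceled = job.on_granted("cluster-b")
late = job.on_granted("cluster-a")  # a grant arriving after cancellation is ignored
```

In this sketch every redundant request costs each scheduler one submission and, for all but the winner, one cancellation, which is the kind of extra middleware load the abstract identifies as a critical issue.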

Author information

Authors and Affiliations

Department of Information and Computer Sciences, University of Hawai‘i at Manoa, 1680 East–West Rd., Post 317, Honolulu, HI, 96822, USA
Henri Casanova

Corresponding author

Correspondence to Henri Casanova.

Additional information

This work was supported by the NSF under Award 0546688.

About this article

Cite this article

Casanova, H. Benefits and Drawbacks of Redundant Batch Requests. J Grid Computing 5, 235–250 (2007). https://doi.org/10.1007/s10723-007-9068-6

Received: 05 September 2006
Accepted: 12 January 2007
Published: 06 February 2007
Issue Date: June 2007
DOI: https://doi.org/10.1007/s10723-007-9068-6
