Co-scheduling algorithms for high-throughput workload execution

Journal of Scheduling

Abstract

This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing the applications one by one, each with the maximum degree of parallelism, we aim to schedule several applications concurrently. We partition the original application set into a series of packs, which are executed one after another. A pack comprises several applications, each with an assigned number of processors, under the constraint that the total number of processors assigned within a pack does not exceed the number of available processors. The objective is to determine a partition into packs, and an assignment of processors to applications, that minimizes the sum of the execution times of the packs. We thoroughly study the complexity of this optimization problem and propose several heuristics that perform very well on a variety of workloads whose application execution times model the profiles of parallel scientific codes. We show that co-scheduling leads to faster workload completion times (40% improvement on average over traditional scheduling) and to faster response times (50% improvement). Hence, co-scheduling increases system throughput and saves energy, with significant benefits from both the user and system perspectives.
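The objective described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm or any of its heuristics: it only evaluates the cost of a given partition into packs. Two things are assumptions made for illustration: the Amdahl-style execution-time profile in `exec_time`, and the rule that a pack finishes when its slowest application finishes. All names (`exec_time`, `pack_time`, `schedule_cost`, the applications `A` and `B`) are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical Amdahl-style profile (an assumption, not taken from the paper):
# an application with total work `work` and sequential fraction `seq`
# runs in work * (seq + (1 - seq) / procs) time units on `procs` processors.
App = Tuple[float, float]      # (work, sequential fraction)
Pack = List[Tuple[str, int]]   # (application name, processors assigned)


def exec_time(app: App, procs: int) -> float:
    """Execution time of one application on `procs` processors."""
    work, seq = app
    return work * (seq + (1.0 - seq) / procs)


def pack_time(pack: Pack, apps: Dict[str, App]) -> float:
    """Assumption: a pack ends when its slowest application ends."""
    return max(exec_time(apps[name], procs) for name, procs in pack)


def schedule_cost(packs: List[Pack], apps: Dict[str, App], total_procs: int) -> float:
    """Sum of pack execution times, checking the processor constraint."""
    for pack in packs:
        used = sum(procs for _, procs in pack)
        if used > total_procs:
            raise ValueError(
                f"pack uses {used} processors, only {total_procs} available"
            )
    return sum(pack_time(pack, apps) for pack in packs)


# Toy instance: two identical applications on 8 processors.
apps = {"A": (100.0, 0.1), "B": (100.0, 0.1)}
P = 8

# Traditional scheduling: one application per pack, all processors each.
one_by_one = schedule_cost([[("A", 8)], [("B", 8)]], apps, P)
# Co-scheduling: both applications share a single pack, 4 processors each.
co_scheduled = schedule_cost([[("A", 4), ("B", 4)]], apps, P)

print(f"one by one:   {one_by_one:.2f}")
print(f"co-scheduled: {co_scheduled:.2f}")
```

In this toy instance the single shared pack costs about 32.5 time units against about 42.5 for one-by-one execution, because the sequential fraction limits what each application gains from all 8 processors. The figures depend entirely on the assumed profile and say nothing about the improvements reported in the paper.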

Acknowledgments

Anne Benoit and Yves Robert are with the Institut Universitaire de France (IUF). This work was supported in part by the ANR RESCUE project. The research of Padma Raghavan and Manu Shantharam was supported in part by the U.S. National Science Foundation through grants CCF 0963839, 1018881 and 1319448.

Author information

Corresponding author

Correspondence to Guillaume Aupy.

About this article

Cite this article

Aupy, G., Shantharam, M., Benoit, A. et al. Co-scheduling algorithms for high-throughput workload execution. J Sched 19, 627–640 (2016). https://doi.org/10.1007/s10951-015-0445-x

  • DOI: https://doi.org/10.1007/s10951-015-0445-x
