
Using Malleable Task Scheduling to Accelerate Package Manager Installations

  • Samuel Knight
  • Jeremiah Wilke
  • Todd Gamblin
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1190)

Abstract

Package managers, containers, automated testing, and Continuous Integration (CI) are becoming an essential part of HPC development workflows. These automated tools often require software recompilation. However, large stacks such as those deployed on HPC clusters can have combinatorial dependencies and may take a system several days to compile. Despite the use of simple parallelization (such as ‘make -j’), build execution times often do not scale with system resources. In such cases, overall installation time can be improved by compiling parts of the software stack independently, each scheduled on a subset of the available cores. We apply malleable-task scheduling algorithms to better exploit the available parallelism in build-system workflows and improve overall stack build time. Using a prototype implementation in the Spack package manager, malleable-task scheduling can improve build times by more than 2x.
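To make the idea concrete, the following is a minimal sketch of malleable-task scheduling over a package build DAG. It assumes an Amdahl-style speedup model for each build; the example packages, work estimates, serial fractions, and the fixed per-task core cap are hypothetical illustrations, not the scheduler actually implemented in Spack.

```python
import heapq

def build_time(work, serial_fraction, cores):
    """Amdahl-style execution time of one build on `cores` cores (assumed model)."""
    return work * (serial_fraction + (1.0 - serial_fraction) / cores)

def schedule(dag, work, serial_fraction, total_cores, cores_per_task):
    """List-schedule a build DAG, giving each ready task a fixed core slice.

    `dag` maps package -> list of dependencies. Capping each build at
    `cores_per_task` lets independent builds run side by side instead of
    one poorly scaling `make -j total_cores`.
    """
    indegree = {pkg: len(deps) for pkg, deps in dag.items()}
    children = {pkg: [] for pkg in dag}
    for pkg, deps in dag.items():
        for dep in deps:
            children[dep].append(pkg)

    ready = [pkg for pkg, n in indegree.items() if n == 0]
    running = []                      # min-heap of (finish_time, cores, pkg)
    free, clock, finish = total_cores, 0.0, {}

    while ready or running:
        # Launch ready builds while cores remain.
        while ready and free >= 1:
            pkg = ready.pop()
            cores = min(cores_per_task, free)
            free -= cores
            t = build_time(work[pkg], serial_fraction[pkg], cores)
            heapq.heappush(running, (clock + t, cores, pkg))
        # Advance to the next build completion and release its cores.
        clock, cores, pkg = heapq.heappop(running)
        free += cores
        finish[pkg] = clock
        for child in children[pkg]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return finish

# Toy stack: zlib and openssl are independent; cmake depends on zlib.
dag = {"zlib": [], "openssl": [], "cmake": ["zlib"]}
work = {"zlib": 10, "openssl": 40, "cmake": 60}
serial = {"zlib": 0.5, "openssl": 0.3, "cmake": 0.4}
print(schedule(dag, work, serial, total_cores=8, cores_per_task=4))
```

Running several builds on smaller core slices trades per-build speed for overall throughput, which pays off precisely when individual builds scale poorly with core count.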


Copyright information

© National Technology & Engineering Solutions of Sandia, LLC. 2020

Authors and Affiliations

  1. Sandia National Laboratories, Livermore, USA
  2. Lawrence Livermore National Laboratory, Livermore, USA
