Scheduling Trees of Malleable Tasks for Sparse Linear Algebra

  • Abdou Guermouche
  • Loris Marchal (corresponding author)
  • Bertrand Simon
  • Frédéric Vivien
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9233)


Scientific workloads are often described by directed acyclic task graphs. This is in particular the case for multifrontal factorization of sparse matrices, the focus of this paper, whose task graph is structured as a tree of parallel tasks. Prasanna and Musicus [19, 20] advocated using the concept of malleable tasks to model the parallel tasks involved in matrix computations. In this powerful model, each task is processed on a time-varying number of processors. Following Prasanna and Musicus, we consider malleable tasks whose speedup is \(p^\alpha\), where \(p\) is the fractional share of processors on which a task executes, and \(\alpha\) (\(0 < \alpha \le 1\)) is a task-independent parameter. First, we use actual experiments on multicore platforms to motivate the relevance of this model for our application. Then, we study the optimal time-minimizing allocation that Prasanna and Musicus derived using optimal control theory. We greatly simplify their proofs by resorting only to pure scheduling arguments. Building on the insight gained from these new proofs, we extend the study to distributed (homogeneous or heterogeneous) multicore platforms. We prove the NP-completeness of the corresponding scheduling problem, and we then propose some approximation algorithms.
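As an illustration of the speedup model above, the following minimal Python sketch computes the processing time of a task under a constant share of processors, and the split of processors under which independent tasks finish at the same time. It is a sketch under stated assumptions, not the algorithm of the paper: the function names `processing_time` and `balanced_shares` are hypothetical, shares are assumed to stay constant over time, and the tasks are assumed to be independent.

```python
def processing_time(length, share, alpha):
    """Time to process `length` units of work on a constant fractional
    share of processors, assuming a speedup of share**alpha."""
    if share <= 0:
        raise ValueError("share must be positive")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return length / share ** alpha


def balanced_shares(lengths, p, alpha):
    """Split p processors among independent tasks so that all of them
    finish simultaneously (assumption: constant shares).

    Equal finish times L_i / p_i**alpha imply p_i proportional to
    L_i**(1/alpha).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    weights = [length ** (1.0 / alpha) for length in lengths]
    total = sum(weights)
    return [p * w / total for w in weights]


if __name__ == "__main__":
    alpha = 0.5
    lengths = [1.0, 2.0]
    shares = balanced_shares(lengths, 5.0, alpha)
    for length, share in zip(lengths, shares):
        print(length, share, processing_time(length, share, alpha))
```

With \(\alpha = 0.5\), two tasks of lengths 1 and 2 sharing 5 processors receive shares 1 and 4, and both complete at time 1.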


Schedule Problem, Optimal Control Theory, Parallel Composition, Task Graph, Runtime System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. Amestoy, P., Buttari, A., Duff, I.S., Guermouche, A., L’Excellent, J., Uçar, B.: Mumps. In: Padua, D.A. (ed.) Encyclopedia of Parallel Computing, pp. 1232–1238. Springer, USA (2011)
  2. Ashcraft, C., Grimes, R.G., Lewis, J.G., Peyton, B.W., Simon, H.D.: Progress in sparse matrix methods for large linear systems on vector computers. Int. J. Supercomput. Appl. 1(4), 10–30 (1987)
  3. Augonnet, C., Thibault, S., Namyst, R., Wacrenier, P.A.: StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurrency Comput.: Prac. Experience 23(2), 187–198 (2011)
  4. Bosilca, G., Bouteiller, A., Danalis, A., Faverge, M., Herault, T., Dongarra, J.J.: PaRSEC: exploiting heterogeneity for enhancing scalability. Comput. Sci. Eng. 15(6), 36–45 (2013)
  5. Buttari, A.: Fine granularity sparse QR factorization for multicore based systems. In: International Conference on Applied Parallel and Scientific Computing, pp. 226–236 (2012)
  6. Drozdowski, M.: Scheduling parallel tasks - algorithms and complexity. In: Leung, J. (ed.) Handbook of Scheduling. Chapman and Hall/CRC, Boca Raton (2004)
  7. Duff, I.S., Reid, J.K.: The multifrontal solution of indefinite sparse symmetric linear systems. ACM Trans. Math. Softw. 9, 302–325 (1983)
  8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman & Co., New York (1979)
  9. Gautier, T., Besseron, X., Pigeon, L.: Kaapi: A thread scheduling runtime system for data flow computations on cluster of multi-processors. In: International Workshop on Parallel Symbolic Computation, pp. 15–23 (2007)
  10. Guermouche, A., Marchal, L., Simon, B., Vivien, F.: Scheduling trees of malleable tasks for sparse linear algebra. Technical report, RR-8616, INRIA, October 2014
  11. Hardy, G., Littlewood, J., Pólya, G.: Inequalities, Chap. 6.14. Cambridge Mathematical Library, Cambridge University Press, Cambridge (1952)
  12. Hénon, P., Ramet, P., Roman, J.: PaStiX: a high-performance parallel direct solver for sparse symmetric definite systems. Parallel Comput. 28(2), 301–321 (2002)
  13. Hugo, A., Guermouche, A., Wacrenier, P.A., Namyst, R.: A runtime approach to dynamic resource allocation for sparse direct solvers. In: ICPP, pp. 481–490 (2014)
  14. Jansen, K., Zhang, H.: Scheduling malleable tasks with precedence constraints. In: ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 86–95 (2005)
  15. Kellerer, H., Mansini, R., Pferschy, U., Speranza, M.G.: An efficient fully polynomial approximation scheme for the subset-sum problem. J. Comput. Syst. Sci. 66(2), 349–370 (2003)
  16. Lepère, R., Trystram, D., Woeginger, G.J.: Approximation algorithms for scheduling malleable tasks under precedence constraints. IJFCS 13(4), 613–627 (2002)
  17. Li, X.S.: An overview of SuperLU: algorithms, implementation, and user interface. ACM Trans. Math. Softw. 31(3), 302–325 (2005)
  18. Liu, J.W.H.: The role of elimination trees in sparse factorization. SIAM J. Matrix Anal. Appl. 11(1), 134–172 (1990)
  19. Prasanna, G.N.S., Musicus, B.R.: Generalized multiprocessor scheduling and applications to matrix computations. IEEE TPDS 7(6), 650–664 (1996)
  20. Prasanna, G.N.S., Musicus, B.R.: The optimal control approach to generalized multiprocessor scheduling. Algorithmica 15(1), 17–49 (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Abdou Guermouche (1)
  • Loris Marchal (2), corresponding author
  • Bertrand Simon (2)
  • Frédéric Vivien (2)

  1. University of Bordeaux and INRIA, Talence, France
  2. CNRS, INRIA and University of Lyon, LIP, ENS Lyon, Lyon, France
