Sparse Direct Solution on Parallel Computers

  • Conference paper
Numerical Analysis and Optimization (NAO 2017)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 235)


Abstract

We describe our recent work on designing algorithms and software for solving sparse systems using direct methods on parallel computers. This work has been conducted within an EU Horizon 2020 Project called NLAFET. We first discuss the solution of large sparse symmetric positive definite systems. We use a runtime system to express and execute a DAG-based Cholesky factorization. The runtime system acts as a software layer between the application and the architecture: it manages task dependencies, schedules tasks, and maintains data coherency. Although runtime systems are widely used in dense linear algebra, this approach is challenging for sparse algorithms because of the irregularity and variable granularity of the DAGs arising in these systems. We have implemented our software using the OpenMP standard and the runtime systems StarPU and PaRSEC. We compare these implementations to HSL_MA87, a state-of-the-art DAG-based solver for positive definite systems, and demonstrate comparable performance on a multicore architecture. We also consider the case when the matrix is symmetric indefinite. For highly unsymmetric systems, we use a completely different approach based on developing a parallel version of a Markowitz threshold ordering. This work is less advanced, but we discuss some of the algorithmic challenges involved. Finally, we briefly discuss using a hybrid direct-iterative solver that combines the best of the two approaches and enables the solution of even larger problems in parallel.
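
To illustrate the DAG-based approach mentioned in the abstract, the following minimal Python sketch lists the tasks of a dense tiled Cholesky factorization and infers their dependencies from the order in which tasks read and write tiles, in the spirit of a sequential task flow model. It is not the authors' sparse solver: the kernel names POTRF, TRSM and UPDATE, the function names, and the tile-level bookkeeping are assumptions made for illustration, and no numerical work is performed.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """List the tasks of a right-looking tiled Cholesky on an nt x nt tile grid.

    Each task is (name, reads, writes); reads and writes hold tile coordinates.
    A written tile is treated as read-write.
    """
    tasks = []
    for k in range(nt):
        # Factorize the diagonal tile.
        tasks.append((f"POTRF({k})", [], [(k, k)]))
        # Triangular solves on the tiles below the diagonal tile.
        for i in range(k + 1, nt):
            tasks.append((f"TRSM({i},{k})", [(k, k)], [(i, k)]))
        # Update the trailing lower-triangular tiles.
        for j in range(k + 1, nt):
            for i in range(j, nt):
                reads = sorted({(i, k), (j, k)})
                tasks.append((f"UPDATE({i},{j},{k})", reads, [(i, j)]))
    return tasks


def build_dag(tasks):
    """Infer dependencies from data accesses made in sequential submission order."""
    last_writer = {}
    readers = defaultdict(list)
    deps = {name: set() for name, _, _ in tasks}
    for name, reads, writes in tasks:
        for tile in reads:
            if tile in last_writer:          # read-after-write
                deps[name].add(last_writer[tile])
            readers[tile].append(name)
        for tile in writes:
            if tile in last_writer:          # write-after-write
                deps[name].add(last_writer[tile])
            deps[name].update(readers[tile])  # write-after-read
            last_writer[tile] = name
            readers[tile] = []
        deps[name].discard(name)
    return deps


def levels(tasks, deps):
    """Earliest level at which each task can run; tasks on one level are independent."""
    level = {}
    for name, _, _ in tasks:
        level[name] = 1 + max((level[d] for d in deps[name]), default=-1)
    return level
```

For a 3 x 3 tile grid this sketch produces 10 tasks whose longest dependency chain has 7 tasks. In the dense case the DAG is regular; the abstract notes that sparse factorizations give DAGs with irregular shape and variable task granularity, which is what makes the approach harder there.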


Notes

  1. www.nlafet.eu.

  2. An entry that is zero in A but is nonzero in the corresponding entry of the factors is termed fill-in.

  3. See http://www.scarf.rl.ac.uk/hardware.
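
The definition of fill-in in Note 2 can be illustrated with a small symbolic elimination on the sparsity pattern of a symmetric matrix. The sketch below is a standard textbook construction, not code from the paper: the function name `fill_in`, the edge-list input format, and the arrow-shaped example are assumptions chosen for illustration, and numerical cancellation is ignored.

```python
def fill_in(n, edges, order):
    """Return the fill-in entries created by eliminating variables in `order`.

    `edges` lists the off-diagonal nonzeros (i, j) of a symmetric n x n pattern.
    Eliminating a variable makes all of its not-yet-eliminated neighbours
    pairwise adjacent; each new adjacency is one fill-in entry.
    """
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    fill = set()
    eliminated = set()
    for v in order:
        nbrs = sorted(u for u in adj[v] if u not in eliminated)
        for a in range(len(nbrs)):
            for b in range(a + 1, len(nbrs)):
                p, q = nbrs[a], nbrs[b]
                if q not in adj[p]:
                    adj[p].add(q)
                    adj[q].add(p)
                    fill.add((p, q))
        eliminated.add(v)
    return fill


# Arrow pattern: variable 0 is coupled to every other variable.
arrow = [(0, i) for i in range(1, 5)]
dense_first = fill_in(5, arrow, [0, 1, 2, 3, 4])  # eliminate the hub first
dense_last = fill_in(5, arrow, [4, 3, 2, 1, 0])   # eliminate the hub last
```

Eliminating the hub variable first creates 6 fill-in entries, while eliminating it last creates none, which shows why the choice of ordering matters so much for sparse direct methods.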


Acknowledgements

This work is supported by the NLAFET Project funded by the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement 671633. We would like to thank Philippe Gambron for doing the experiments reported in Sect. 8, Jonathan Hogg for his earlier work in the project developing the first versions of some of the kernels used in Sects. 5 and 6, and Tim Davis (Texas A&M) for discussions on parallel Markowitz. We also thank Jennifer Scott and Bo Kågström (Umeå) for their comments on a draft of this manuscript.

Author information

Corresponding author

Correspondence to Iain Duff.

Appendix

Test matrices used in experiments.

Table 4 Test matrices and their characteristics. n is the matrix order, nz(A) the number of entries in the matrix A, nz(L) the number of entries in the factor L, and Flops the operation count for the matrix factorization
Table 5 Easy Indefinite. Statistics as reported by the analyse phase of SSIDS with default settings, assuming no delayed pivots
Table 6 Hard Indefinite. Statistics as reported by the analyse phase of SSIDS with default settings, using matching-based ordering, assuming no delayed pivots

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Duff, I., Lopez, F., Nakov, S. (2018). Sparse Direct Solution on Parallel Computers. In: Al-Baali, M., Grandinetti, L., Purnama, A. (eds) Numerical Analysis and Optimization. NAO 2017. Springer Proceedings in Mathematics & Statistics, vol 235. Springer, Cham. https://doi.org/10.1007/978-3-319-90026-1_4
