Abstract
We describe our recent work on designing algorithms and software for solving sparse systems of linear equations using direct methods on parallel computers. This work has been conducted within an EU Horizon 2020 project called NLAFET. We first discuss the solution of large sparse symmetric positive definite systems, for which we use a runtime system to express and execute a DAG-based Cholesky factorization. The runtime system acts as a software layer between the application and the architecture: it manages task dependencies, schedules tasks, and maintains data coherency. Although runtime systems are widely used in dense linear algebra, this approach is challenging for sparse algorithms because of the irregularity and variable granularity of the DAGs arising in these systems. We have implemented our software using the OpenMP standard and the runtime systems StarPU and PaRSEC. We compare these implementations with HSL_MA87, a state-of-the-art DAG-based solver for positive definite systems, and demonstrate comparable performance on a multicore architecture. We also consider the case when the matrix is symmetric indefinite. For highly unsymmetric systems, we use a completely different approach based on developing a parallel version of a Markowitz threshold ordering. This work is less advanced, but we discuss some of the algorithmic challenges involved. Finally, we briefly discuss a hybrid direct-iterative solver that combines the best of the two approaches and enables the solution of even larger problems in parallel.
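To make the idea of a DAG-based Cholesky factorization concrete, the following is a minimal sketch, not the authors' solver. It assumes a dense lower-triangular tile grid (the sparse case yields a far more irregular DAG), submits tasks in sequential order, and lets a toy "runtime" infer dependencies from the tiles each task reads and writes. The function names `cholesky_tasks`, `build_dag` and `task_levels` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """List the tasks of a right-looking tiled Cholesky on an nt x nt tile grid.

    Each task is (name, tiles_read, tiles_written); tiles are (row, col)
    pairs in the lower triangle. Written tiles are updated in place.
    """
    tasks = []
    for k in range(nt):
        # Factorize the diagonal tile.
        tasks.append((f"POTRF({k})", [], [(k, k)]))
        # Triangular solves for the tiles below the diagonal tile.
        for i in range(k + 1, nt):
            tasks.append((f"TRSM({i},{k})", [(k, k)], [(i, k)]))
        # Update the trailing submatrix.
        for j in range(k + 1, nt):
            for i in range(j, nt):
                if i == j:
                    tasks.append((f"SYRK({j},{k})", [(j, k)], [(j, j)]))
                else:
                    tasks.append((f"GEMM({i},{j},{k})", [(i, k), (j, k)], [(i, j)]))
    return tasks


def build_dag(tasks):
    """Infer task dependencies from data accesses, in submission order."""
    last_writer = {}
    readers = defaultdict(list)
    deps = {}
    for name, reads, writes in tasks:
        d = set()
        # A task waits for the last writer of every tile it touches.
        for tile in reads + writes:
            if tile in last_writer:
                d.add(last_writer[tile])
        # A writer also waits for earlier readers of the tile it overwrites.
        for tile in writes:
            d.update(readers[tile])
        deps[name] = d
        for tile in reads:
            readers[tile].append(name)
        for tile in writes:
            last_writer[tile] = name
            readers[tile] = []
    return deps


def task_levels(tasks, deps):
    """Earliest step at which each task could run with unlimited workers."""
    level = {}
    for name, _, _ in tasks:  # submission order is a valid topological order
        level[name] = 1 + max((level[p] for p in deps[name]), default=-1)
    return level


if __name__ == "__main__":
    tasks = cholesky_tasks(3)
    deps = build_dag(tasks)
    for name, lvl in sorted(task_levels(tasks, deps).items(), key=lambda x: x[1]):
        print(lvl, name, sorted(deps[name]))
```

Tasks that share a level have no dependencies on each other and could run concurrently. In this sketch the dependency inference is done by hand; in practice that bookkeeping is the job of the runtime system.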
Notes
- 1.
- 2. An entry that is zero in A but nonzero in the corresponding position of the factors is termed fill-in.
- 3.
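The notion of fill-in in the note above can be illustrated with a small symbolic elimination. This is a sketch under stated assumptions: the matrix is structurally symmetric, it is represented by its adjacency graph, and numerical cancellation is ignored. The function name `count_fill_in` and the arrow-shaped example matrix are hypothetical and are not taken from the chapter.

```python
def count_fill_in(adj, order):
    """Count fill-in entries created by eliminating vertices in `order`.

    `adj` maps each vertex to the set of its neighbours (the off-diagonal
    nonzero pattern of a structurally symmetric matrix). Eliminating a
    vertex makes its not-yet-eliminated neighbours pairwise adjacent;
    every edge added this way is one fill-in entry in the factor.
    """
    g = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    eliminated = set()
    fill = 0
    for v in order:
        nbrs = [u for u in g[v] if u not in eliminated]
        for idx, a in enumerate(nbrs):
            for b in nbrs[idx + 1:]:
                if b not in g[a]:
                    g[a].add(b)
                    g[b].add(a)
                    fill += 1
        eliminated.add(v)
    return fill


# Arrow matrix of order 5: vertex 0 is coupled to every other vertex.
arrow = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}

if __name__ == "__main__":
    print(count_fill_in(arrow, [0, 1, 2, 3, 4]))  # dense row eliminated first
    print(count_fill_in(arrow, [1, 2, 3, 4, 0]))  # dense row eliminated last
```

Eliminating the dense vertex first couples all remaining vertices and produces six fill-in entries, whereas eliminating it last produces none. This is why the choice of ordering matters so much for sparse direct methods.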
Acknowledgements
This work is supported by the NLAFET Project funded by the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement 671633. We would like to thank Philippe Gambron for doing the experiments reported in Sect. 8, Jonathan Hogg for his earlier work in the project developing the first versions of some of the kernels used in Sects. 5 and 6, and Tim Davis (Texas A&M) for discussions on parallel Markowitz. We also thank Jennifer Scott and Bo Kågström (Umeå) for their comments on a draft of this manuscript.
Appendix
Test matrices used in experiments.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Duff, I., Lopez, F., Nakov, S. (2018). Sparse Direct Solution on Parallel Computers. In: Al-Baali, M., Grandinetti, L., Purnama, A. (eds) Numerical Analysis and Optimization. NAO 2017. Springer Proceedings in Mathematics & Statistics, vol 235. Springer, Cham. https://doi.org/10.1007/978-3-319-90026-1_4
DOI: https://doi.org/10.1007/978-3-319-90026-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90025-4
Online ISBN: 978-3-319-90026-1
eBook Packages: Mathematics and Statistics (R0)