$\mathcal{H}$-LU factorization on many-core systems

Kriemann, Ronald

doi:10.1007/s00791-014-0226-7

Published: 30 November 2014

Volume 16, pages 105–117 (2013)

Computing and Visualization in Science


Abstract

A version of the $\mathcal{H}$-LU factorization is introduced, based on the individual computational tasks occurring during the block-wise $\mathcal{H}$-LU factorization. The dependencies between these tasks form a directed acyclic graph, which is used for efficient scheduling on parallel systems. The algorithm is especially suited for many-core processors and shows much better parallel scaling than previous $\mathcal{H}$-LU factorization algorithms.
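
To illustrate the idea of scheduling from a task graph, the following is a minimal sketch for an ordinary dense block LU factorization on an nb x nb block grid, not the algorithm of the article. Real $\mathcal{H}$-LU tasks act on a hierarchical block structure with low-rank blocks, which is not reproduced here. The task names (`lu`, `solve_row`, `solve_col`, `update`) and both function names are assumptions made for this example. The sketch builds the dependency graph and groups tasks into waves that could run concurrently.

```python
from collections import defaultdict


def build_lu_dag(nb):
    """Map each task of a dense block LU (nb x nb blocks) to its predecessors.

    Hypothetical task names:
      ("lu", k)           factorize diagonal block A_kk = L_kk U_kk
      ("solve_row", k, j) U_kj = L_kk^{-1} A_kj
      ("solve_col", i, k) L_ik = A_ik U_kk^{-1}
      ("update", i, j, k) A_ij -= L_ik U_kj
    """
    deps = defaultdict(set)
    for k in range(nb):
        lu = ("lu", k)
        deps[lu]  # register the task even if it has no predecessors
        if k > 0:
            deps[lu].add(("update", k, k, k - 1))
        for j in range(k + 1, nb):
            row, col = ("solve_row", k, j), ("solve_col", j, k)
            deps[row].add(lu)
            deps[col].add(lu)
            if k > 0:
                # the block must have received the update of step k-1
                deps[row].add(("update", k, j, k - 1))
                deps[col].add(("update", j, k, k - 1))
        for i in range(k + 1, nb):
            for j in range(k + 1, nb):
                upd = ("update", i, j, k)
                deps[upd].update({("solve_col", i, k), ("solve_row", k, j)})
                if k > 0:
                    deps[upd].add(("update", i, j, k - 1))
    return dict(deps)


def schedule_levels(deps):
    """Group tasks into waves; every task in a wave has all predecessors done."""
    remaining = {task: set(pred) for task, pred in deps.items()}
    levels = []
    while remaining:
        ready = sorted(t for t, pred in remaining.items() if not pred)
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        levels.append(ready)
        for t in ready:
            del remaining[t]
        for pred in remaining.values():
            pred.difference_update(ready)
    return levels


if __name__ == "__main__":
    for wave, tasks in enumerate(schedule_levels(build_lu_dag(3))):
        print(wave, tasks)
```

Grouping into waves is only the simplest way to read parallelism off the graph. A runtime scheduler would typically start each task as soon as its own predecessors finish rather than waiting for a whole wave.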


Author information

Authors and Affiliations

Max-Planck-Institute for Mathematics in the Sciences, Leipzig, Germany
Ronald Kriemann


Corresponding author

Correspondence to Ronald Kriemann.

Additional information

Communicated by: Gabriel Wittum.


About this article

Cite this article

Kriemann, R. $\mathcal{H}$-LU factorization on many-core systems. Comput. Visual Sci. 16, 105–117 (2013). https://doi.org/10.1007/s00791-014-0226-7


Received: 10 March 2014
Accepted: 26 July 2014
Published: 30 November 2014
Issue Date: June 2013
DOI: https://doi.org/10.1007/s00791-014-0226-7
