Abstract
A version of the \({{\fancyscript{H}}} \)-LU factorization is introduced, based on the individual computational tasks occurring during the block-wise \({{\fancyscript{H}}} \)-LU factorization. The dependencies between these tasks form a directed acyclic graph, which is used for efficient scheduling on parallel systems. The algorithm is especially suited for many-core processors and shows a much improved parallel scaling behavior compared to previous \({{\fancyscript{H}}} \)-LU factorization algorithms.
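The idea of deriving a task graph from a block-wise LU factorization can be illustrated with a minimal sketch. It assumes a dense, unpivoted, right-looking block LU on a uniform n × n block grid, which is a simplification and not the hierarchical, low-rank setting of the article. The function names `build_lu_task_dag` and `schedule_in_waves` and the task labels are hypothetical, chosen only for this illustration. Dependencies are inferred from which task last wrote each block, and the wave-by-wave schedule stands in for a real runtime, which would more likely dispatch each task as soon as its predecessors finish.

```python
from collections import defaultdict


def build_lu_task_dag(n):
    """Return {task: set of predecessor tasks} for a block LU on an n x n grid.

    Assumption: dense blocks, no pivoting; this only models the dependency
    structure, it performs no arithmetic.
    """
    deps = defaultdict(set)
    last_writer = {}  # block index (i, j) -> task that last wrote that block

    def add(task, reads, writes):
        deps[task]  # register the task even if it has no predecessors
        # A task depends on the last writer of every block it reads or updates.
        for blk in reads + [writes]:
            if blk in last_writer:
                deps[task].add(last_writer[blk])
        last_writer[writes] = task

    for k in range(n):
        add(("lu", k), [], (k, k))                       # A_kk = L_kk U_kk
        for j in range(k + 1, n):
            add(("solve_row", k, j), [(k, k)], (k, j))   # U_kj = L_kk^-1 A_kj
        for i in range(k + 1, n):
            add(("solve_col", i, k), [(k, k)], (i, k))   # L_ik = A_ik U_kk^-1
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # A_ij -= L_ik U_kj
                add(("update", i, j, k), [(i, k), (k, j)], (i, j))
    return dict(deps)


def schedule_in_waves(deps):
    """Group tasks into waves; all tasks within one wave are independent."""
    remaining = {task: set(preds) for task, preds in deps.items()}
    waves = []
    while remaining:
        ready = sorted(t for t, preds in remaining.items() if not preds)
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for preds in remaining.values():
            preds.difference_update(ready)
    return waves
```

For a 3 × 3 block grid this sketch yields 14 tasks in 7 waves: the diagonal factorizations form the critical path, while the solves and updates between them can run concurrently.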
Additional information
Communicated by: Gabriel Wittum.
About this article
Cite this article
Kriemann, R. \({{\fancyscript{H}}} \)-LU factorization on many-core systems. Comput. Visual Sci. 16, 105–117 (2013). https://doi.org/10.1007/s00791-014-0226-7
DOI: https://doi.org/10.1007/s00791-014-0226-7