\(\mathcal{H}\)-LU factorization on many-core systems

Computing and Visualization in Science

Abstract

A version of the \(\mathcal{H}\)-LU factorization is introduced that is built from the individual computational tasks occurring during the block-wise \(\mathcal{H}\)-LU factorization. The dependencies between these tasks form a directed acyclic graph, which is used for efficient scheduling on parallel systems. The algorithm is especially suited to many-core processors and shows much better parallel scaling than previous \(\mathcal{H}\)-LU factorization algorithms.
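
To illustrate how a block-wise LU factorization decomposes into tasks whose dependencies form a directed acyclic graph, the following minimal sketch builds such a graph for a flat n x n grid of tiles and groups the tasks into levels that could run concurrently. It is not the algorithm of the article: it assumes a non-hierarchical tile layout, no pivoting, and no low-rank arithmetic, and the task names (`lu`, `solve_row`, `solve_col`, `update`) and both functions are hypothetical names chosen for this example.

```python
from collections import defaultdict


def block_lu_dag(n):
    """Return {task: set of prerequisite tasks} for LU on an n x n tile grid.

    Assumed (illustrative) task types:
      ("lu", k)            factor the diagonal tile A_kk = L_kk U_kk
      ("solve_row", k, j)  compute U_kj from L_kk U_kj = A_kj
      ("solve_col", i, k)  compute L_ik from L_ik U_kk = A_ik
      ("update", i, j, k)  apply A_ij -= L_ik U_kj
    """
    deps = defaultdict(set)
    for k in range(n):
        lu = ("lu", k)
        deps[lu]  # register the task even if it has no prerequisites
        if k > 0:
            deps[lu].add(("update", k, k, k - 1))
        for j in range(k + 1, n):
            row = ("solve_row", k, j)
            col = ("solve_col", j, k)
            deps[row].add(lu)
            deps[col].add(lu)
            if k > 0:
                # The tile must have received the update of the previous step.
                deps[row].add(("update", k, j, k - 1))
                deps[col].add(("update", j, k, k - 1))
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                upd = ("update", i, j, k)
                deps[upd].update({("solve_col", i, k), ("solve_row", k, j)})
                if k > 0:
                    deps[upd].add(("update", i, j, k - 1))
    return dict(deps)


def schedule_levels(deps):
    """Group tasks into levels; all tasks in one level are independent."""
    remaining = {task: len(prereqs) for task, prereqs in deps.items()}
    successors = defaultdict(list)
    for task, prereqs in deps.items():
        for p in prereqs:
            successors[p].append(task)
    ready = sorted(t for t, count in remaining.items() if count == 0)
    levels = []
    while ready:
        levels.append(ready)
        released = []
        for task in ready:
            for s in successors[task]:
                remaining[s] -= 1
                if remaining[s] == 0:
                    released.append(s)
        ready = sorted(released)
    return levels


if __name__ == "__main__":
    for depth, tasks in enumerate(schedule_levels(block_lu_dag(3))):
        print(depth, tasks)
```

In this simplified setting, a 3 x 3 tile grid yields 14 tasks in 7 levels, with up to four tasks available at once. A real scheduler would typically start each task as soon as its own prerequisites finish rather than wait for a whole level, which is the advantage of working on the graph directly.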

Author information

Corresponding author

Correspondence to Ronald Kriemann.

Additional information

Communicated by: Gabriel Wittum.

About this article

Cite this article

Kriemann, R. \(\mathcal{H}\)-LU factorization on many-core systems. Comput. Visual Sci. 16, 105–117 (2013). https://doi.org/10.1007/s00791-014-0226-7

  • DOI: https://doi.org/10.1007/s00791-014-0226-7
