Out-of-Core Computation of the QR Factorization on Multi-core Processors
We target the development of high-performance algorithms for dense matrix operations where data resides on disk and has to be explicitly moved in and out of the main memory. We provide strong evidence that, even for a complex operation like the QR factorization, the use of a run-time system creates a separation of concerns between the matrix computations and I/O operations with the result that no significant changes need to be introduced to existing in-core algorithms. The library developer can thus focus on the design of algorithms-by-blocks, addressing disk memory as just another level of the memory hierarchy. Experimental results for the out-of-core computation of the QR factorization on a multi-core processor reveal the potential of this approach.
Keywords: dense linear algebra, out-of-core computation, QR factorization, multi-core processors, high performance
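The separation of concerns described in the abstract can be illustrated with a small sketch. This is not the authors' run-time system or the FLAME API: `TileStore`, `qr_by_blocks`, and `run_demo` are hypothetical names, the cache policy (LRU with write-back) is an assumption, and the factorization is a generic tiled QR that keeps only the triangular factor R. The point is that the algorithm-by-blocks only calls `get` and `put` on tiles, while all disk traffic stays inside the store.

```python
import os
import tempfile
import numpy as np


class TileStore:
    """Hypothetical stand-in for a run-time system: tiles live on disk and a
    bounded number of them are cached in main memory (LRU, write-back)."""

    def __init__(self, path, n_tiles, b, cache_slots=4):
        self.n_tiles, self.b, self.cache_slots = n_tiles, b, cache_slots
        # On-disk matrix stored as an n_tiles x n_tiles grid of b x b tiles.
        self.disk = np.memmap(path, dtype=np.float64, mode="w+",
                              shape=(n_tiles, n_tiles, b, b))
        self.cache = {}   # (i, j) -> in-core tile, ordered oldest first
        self.reads = 0    # number of tiles fetched from disk

    def _insert(self, key, tile):
        self.cache.pop(key, None)
        while len(self.cache) >= self.cache_slots:
            old = next(iter(self.cache))           # least recently used
            self.disk[old] = self.cache.pop(old)   # write back to disk
        self.cache[key] = tile

    def get(self, i, j):
        key = (i, j)
        if key in self.cache:
            tile = self.cache[key]
        else:
            tile = np.array(self.disk[i, j])       # explicit disk read
            self.reads += 1
        self._insert(key, tile)
        return tile

    def put(self, i, j, tile):
        self._insert((i, j), np.array(tile, dtype=np.float64))

    def to_dense(self):
        """Flush the cache and return the whole matrix as an in-core array."""
        for key, tile in self.cache.items():
            self.disk[key] = tile
        self.disk.flush()
        n = self.n_tiles * self.b
        return np.array(self.disk).transpose(0, 2, 1, 3).reshape(n, n)


def qr_by_blocks(A):
    """Tiled QR that overwrites A with R; written as if A were in-core."""
    b, nt = A.b, A.n_tiles
    for k in range(nt):
        # Factor the diagonal tile and update the tiles to its right.
        Q, R = np.linalg.qr(A.get(k, k), mode="complete")
        A.put(k, k, R)
        for j in range(k + 1, nt):
            A.put(k, j, Q.T @ A.get(k, j))
        for i in range(k + 1, nt):
            # Annihilate tile (i, k) by factoring the stacked pair [R_kk; A_ik].
            Q, R = np.linalg.qr(np.vstack([A.get(k, k), A.get(i, k)]),
                                mode="complete")
            A.put(k, k, R[:b])
            A.put(i, k, np.zeros((b, b)))
            for j in range(k + 1, nt):
                S = Q.T @ np.vstack([A.get(k, j), A.get(i, j)])
                A.put(k, j, S[:b])
                A.put(i, j, S[b:])


def run_demo(n_tiles=3, b=4, cache_slots=4):
    """Factor a random matrix held on disk; return (M, R, tile reads)."""
    rng = np.random.default_rng(0)
    M = rng.standard_normal((n_tiles * b, n_tiles * b))
    with tempfile.TemporaryDirectory() as tmp:
        A = TileStore(os.path.join(tmp, "A.bin"), n_tiles, b, cache_slots)
        for i in range(n_tiles):
            for j in range(n_tiles):
                A.put(i, j, M[i*b:(i+1)*b, j*b:(j+1)*b])
        qr_by_blocks(A)
        R, reads = A.to_dense(), A.reads
        del A   # release the memory map before the directory is removed
    return M, R, reads
```

Because every transformation applied is orthogonal, the result can be checked without forming Q: R should be upper triangular and satisfy RᵀR = MᵀM up to rounding. Changing `cache_slots` alters how many tiles are read from disk but requires no change to `qr_by_blocks`.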