Out-of-Core Computation of the QR Factorization on Multi-core Processors

  • Mercedes Marqués
  • Gregorio Quintana-Ortí
  • Enrique S. Quintana-Ortí
  • Robert van de Geijn
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5704)


We target the development of high-performance algorithms for dense matrix operations where data resides on disk and has to be explicitly moved in and out of the main memory. We provide strong evidence that, even for a complex operation like the QR factorization, the use of a run-time system creates a separation of concerns between the matrix computations and I/O operations with the result that no significant changes need to be introduced to existing in-core algorithms. The library developer can thus focus on the design of algorithms-by-blocks, addressing disk memory as just another level of the memory hierarchy. Experimental results for the out-of-core computation of the QR factorization on a multi-core processor reveal the potential of this approach.


Keywords: dense linear algebra, out-of-core computation, QR factorization, multi-core processors, high performance
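
The separation of concerns described in the abstract can be illustrated with a small sketch. This is not the authors' run-time system or their algorithm-by-blocks. It is a minimal illustration under stated assumptions: the `BlockStore` class, its LRU policy, the pickle file layout, and the function names are all hypothetical. The numerical kernel is an incremental QR update using Givens rotations, applied row by row. The point is only that the factorization loop never touches the file system directly; the store treats disk as one more level of the memory hierarchy.

```python
import math
import os
import pickle
import tempfile
from collections import OrderedDict


class BlockStore:
    """Hypothetical stand-in for a run-time layer: row blocks live on disk,
    and at most `capacity` of them are kept in main memory (LRU eviction)."""

    def __init__(self, directory, capacity=2):
        self.directory = directory
        self.capacity = capacity
        self.cache = OrderedDict()  # block id -> list of rows, in LRU order
        self.disk_reads = 0

    def _path(self, block_id):
        return os.path.join(self.directory, f"block_{block_id}.pkl")

    def write(self, block_id, rows):
        with open(self._path(block_id), "wb") as f:
            pickle.dump(rows, f)

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        with open(self._path(block_id), "rb") as f:
            rows = pickle.load(f)
        self.disk_reads += 1
        self.cache[block_id] = rows
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return rows


def fold_row_into_r(R, row):
    """Annihilate `row` against upper-triangular R with Givens rotations (in place)."""
    n = len(R)
    row = list(row)  # work on a copy so the cached block is not modified
    for j in range(n):
        a, b = R[j][j], row[j]
        if b == 0.0:
            continue  # entry is already zero, no rotation needed
        r = math.hypot(a, b)
        c, s = a / r, b / r
        for k in range(j, n):
            t = R[j][k]
            R[j][k] = c * t + s * row[k]
            row[k] = -s * t + c * row[k]


def out_of_core_r_factor(store, num_blocks, n):
    """Compute the triangular factor R of A = QR, one disk block at a time."""
    R = [[0.0] * n for _ in range(n)]
    for block_id in range(num_blocks):
        for row in store.read(block_id):  # all I/O is hidden behind the store
            fold_row_into_r(R, row)
    return R


def demo():
    # Small 6 x 3 matrix split into three 2-row blocks (illustrative data only).
    A = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0],
         [2.0, 2.0, 2.0], [1.0, 0.0, 5.0], [3.0, 1.0, 1.0]]
    with tempfile.TemporaryDirectory() as tmp:
        store = BlockStore(tmp, capacity=1)
        for i in range(3):
            store.write(i, A[2 * i: 2 * i + 2])
        R = out_of_core_r_factor(store, num_blocks=3, n=3)
    return A, R
```

Calling `demo()` returns the input matrix and the computed factor, which can be checked through the identity RᵀR = AᵀA. In this sketch only R and one row block are resident at a time, so peak memory depends on the block size and the number of columns rather than on the number of rows. A production approach would operate on square blocks with level-3 BLAS kernels and overlap I/O with computation.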





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Mercedes Marqués (1)
  • Gregorio Quintana-Ortí (1)
  • Enrique S. Quintana-Ortí (1)
  • Robert van de Geijn (2)
  1. Depto. de Ingeniería y Ciencia de Computadores, Universidad Jaume I (UJI), Castellón, Spain
  2. Department of Computer Sciences, The University of Texas at Austin, Austin
