Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD

Abstract

We address the reduction to compact band forms, via two-sided unitary transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). In the first case, we revisit the reduction to symmetric band form. In the second case, we propose a similar alternative that transforms the original matrix to (unsymmetric) band form, replacing the conventional reduction method that produces a triangular–band output. In both cases, we describe algorithmic variants of the standard Level 3 Basic Linear Algebra Subprograms (BLAS)-based procedures, enhanced with look-ahead, to overcome the performance bottleneck imposed by the panel factorization. Furthermore, our solutions employ an algorithmic block size that differs from the target bandwidth, and we illustrate the important performance benefits of this decision. Finally, we show that our alternative compact band form for the SVD is key to introducing an effective look-ahead strategy into the corresponding reduction procedure.
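
As an illustration of the first case, the following NumPy sketch reduces a real symmetric matrix to symmetric band form by factorizing one panel at a time and applying the resulting orthogonal factor from both sides. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name reduce_to_band is hypothetical, the block size is taken equal to the bandwidth w, the orthogonal factor is formed explicitly instead of being applied through Level 3 BLAS kernels, and no look-ahead is performed.

```python
import numpy as np


def reduce_to_band(A, w):
    """Reduce a real symmetric matrix to symmetric band form of bandwidth w.

    Hypothetical helper for illustration only: block size == bandwidth,
    explicit Q factors, and no look-ahead.
    """
    B = np.array(A, dtype=float)  # work on a copy
    n = B.shape[0]
    # A panel needs at least two rows below the band to have anything to zero.
    for k in range(0, n - w - 1, w):
        # Panel factorization: QR of the block below the band in columns k..k+w-1.
        Q, _ = np.linalg.qr(B[k + w:, k:k + w], mode="complete")
        # Two-sided update: Q^T from the left, Q from the right.
        # This preserves symmetry and the eigenvalues.
        B[k + w:, :] = Q.T @ B[k + w:, :]
        B[:, k + w:] = B[:, k + w:] @ Q
    return B
```

After the loop, the entries of the result with |i - j| > w are zero up to rounding error, and its eigenvalues match those of the input. In a look-ahead variant, the factorization of the next panel would be overlapped with the remainder of the trailing update, which this sequential sketch does not attempt.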

Funding

This research was partially sponsored by projects TIN2014-53495-R and TIN2015-65316-P of the Spanish Ministerio de Economía y Competitividad, project 2014-SGR-1051 from the Generalitat de Catalunya, and the EU H2020 project 732631 OPRECOMP.

Author information

Corresponding author

Correspondence to José R. Herrero.

Cite this article

Rodríguez-Sánchez, R., Catalán, S., Herrero, J.R. et al. Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD. Numer Algor 80, 635–660 (2019). https://doi.org/10.1007/s11075-018-0500-8

Keywords

  • Two-sided reduction to compact band form
  • Look-ahead
  • Symmetric eigenvalue problems
  • Singular value decomposition