Parallel reduction of four matrices to condensed form for a generalized matrix eigenvalue algorithm


The VZ algorithm proposed by Charles F. Van Loan (SIAM J. Numer. Anal., 1975) attempts to solve the generalized matrix eigenvalue problem $ACx = \lambda BDx$, where $A, B \in \mathbb{R}^{n \times m}$, $C, D \in \mathbb{R}^{m \times n}$, and $m \ge n$, without forming products and inverses. The algorithm is especially suitable for solving the generalized singular value problem. Van Loan’s approach first reduces the matrices $A$, $B$, $C$, and $D$ to a condensed form by a finite-step initial reduction. The reduction finds orthogonal matrices $Q$, $U$, $V$, and $Z$ such that $QAZ$ is upper Hessenberg, and $QBV$, $Z^T C U$, and $V^T D U$ are upper triangular. In this initial reduction, $A$ is reduced to upper Hessenberg form while the triangularity of the other three matrices is preserved. This is done by Givens rotations that annihilate the elements of $A$ one at a time, with three more rotations generated and applied to the other matrices for each annihilation. Such an algorithm is quite inefficient. In our work, we propose a blocked algorithm for the initial reduction, based on aggregated Givens rotations and matrix–matrix multiplications, which are applied in the outer loop updates. The algorithm has a second level of blocking, exploited in the inner loop. We also consider a variant of the algorithm in a hybrid CPU–GPU framework, where the compute-intensive outer loop updates are performed on the GPU and can be overlapped with the reduction in the next step, which is performed on the CPU. The application of a sequence of rotations in the inner loop, in turn, is parallelized on the CPU with a balanced operation count per thread. Since a large number of aggregated rotations are produced in every outer loop step, they are accumulated simultaneously before the outer loop updates. These adjustments speed up the original initial reduction considerably, as confirmed by numerical experiments, and thereby increase the efficiency of the whole VZ algorithm.
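To make the idea of aggregated rotations concrete, the following is a minimal sketch in plain Python and not the paper's implementation. It reduces only the first column of a single matrix toward Hessenberg form and omits the three extra rotations per annihilation that keep B, C, and D triangular. All function names (givens, rotate_rows, matmul, reduce_column_unblocked, reduce_column_aggregated) are hypothetical, and the triple-loop matmul merely stands in for a level 3 BLAS matrix–matrix product.

```python
import math
import random


def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)


def rotate_rows(M, i, j, c, s):
    """Apply the rotation to rows i and j of M, in place."""
    for k in range(len(M[i])):
        x, y = M[i][k], M[j][k]
        M[i][k] = c * x + s * y
        M[j][k] = -s * x + c * y


def matmul(X, Y):
    """Plain triple-loop product; stands in for a level 3 BLAS GEMM call."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def reduce_column_unblocked(A):
    """Zero A[2:, 0] bottom-up, applying each rotation to all columns."""
    A = [row[:] for row in A]
    for i in range(len(A) - 1, 1, -1):
        c, s = givens(A[i - 1][0], A[i][0])
        rotate_rows(A, i - 1, i, c, s)
    return A


def reduce_column_aggregated(A):
    """Same reduction, but the rotations are first accumulated into G."""
    n = len(A)
    col = [[row[0]] for row in A]          # only data needed to generate rotations
    G = [[float(i == j) for j in range(n)] for i in range(n)]
    for i in range(n - 1, 1, -1):
        c, s = givens(col[i - 1][0], col[i][0])
        rotate_rows(col, i - 1, i, c, s)   # keep the working column current
        rotate_rows(G, i - 1, i, c, s)     # aggregate: G <- R_i * G
    return matmul(G, A), G                 # one matrix-matrix update


random.seed(0)
n, m = 6, 8                                # sizes chosen for illustration, m >= n
A = [[random.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(n)]
H_ref = reduce_column_unblocked(A)
H_agg, G = reduce_column_aggregated(A)
max_diff = max(abs(H_ref[i][j] - H_agg[i][j])
               for i in range(n) for j in range(m))
```

Both routines apply the same product of rotations, so max_diff should be at the level of rounding error. The aggregated variant touches the full matrix only once, through a matrix–matrix product, which is the kind of operation that blocked and GPU-offloaded updates are built around.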


  1. Alter, O., Brown, P.O., Botstein, D.: Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms. Proc. Natl. Acad. Sci. USA 100, 3351–3356 (2003)

  2. Antoulas, A.C., Sorensen, D.C.: Approximation of large–scale dynamical systems: an overview. Int. J. Appl. Math. Comput. Sci. 11, 1093–1121 (2001)

  3. Bai, Z., Demmel, J.W.: Computing the generalized singular value decomposition. SIAM J. Sci. Comput. 14, 1464–1486 (1993)

  4. Bai, Z., Zha, H.: A new preprocessing algorithm for the computation of the generalized singular value decomposition. SIAM J. Sci. Comput. 14, 1007–1012 (1993)

  5. Benner, P.: Computational methods for linear–quadratic optimization. Supplemento ai Rendiconti del Circolo Matematico di Palermo Serie II(58), 21–56 (1999)

  6. Benner, P., Byers, R., Mehrmann, V., Xu, H.: Numerical computation of deflating subspaces of skew-Hamiltonian/Hamiltonian pencils. SIAM J. Matrix Anal. Appl. 24, 165–190 (2002)

  7. Benner, P., Byers, R., Mehrmann, V., Xu, H.: Robust numerical methods for robust control. Technical Report 06-2004, Institut für Mathematik, TU Berlin (2004)

  8. Bhuyan, K., Singh, S.B., Bhuyan, P.K.: Application of generalized singular value decomposition to ionospheric tomography. Annal. Geophys. 22, 3437–3444 (2004)

  9. Bischof, C., Van Loan, C.F.: The WY representation for products of Householder matrices. SIAM J. Sci. Stat. Comput. 8, 2–13 (1987)

  10. Bojanczyk, A., Golub, G.H., Van Dooren, P.: The periodic Schur decomposition; algorithm and applications. In: Proceedings of SPIE Conference, vol. 1770, pp. 31–42 (1992)

  11. Bosner, N.: Efficient algorithm for simultaneous reduction to the m-Hessenberg-triangular-triangular form. BIT 55, 677–703 (2015)

  12. Bosner, N., Karlsson, L.: Parallel and heterogeneous m–Hessenberg–triangular–triangular reduction. SIAM J. Sci. Comput. 39, C29–C47 (2017)

  13. Demmel, J.W., Veselić, K.: Jacobi’s method is more accurate than QR. SIAM J. Matrix Anal. Appl. 13, 1204–1245 (1992)

  14. Falk, S., Langemeyer, P., Schuff, H.K. (ed.): Das Jacobische Rotationsverfahren für reellsymmetrische Matrizenpaare I, II. Friedr. Vieweg & Sohn, Braunschweig (1960)

  15. Golub, G., Reinsch, C.: Singular value decomposition and least squares solution. Numer. Math. 14, 403–420 (1970)

  16. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore and London (1996)

  17. Hari, V.: On Cyclic Jacobi Methods for the Positive Definite Generalized Eigenvalue Problem. PhD Thesis, FernUniversität-Gesamthochschule, Hagen (1984)

  18. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)

  19. Higham, N.J., Konstantinov, M., Mehrmann, V., Petkov, P.: The sensitivity of computational control problems. IEEE Control Syst. Mag. 24, 28–43 (2004)

  20. Kågström, B., Kressner, D., Quintana-Ortí, E.S., Quintana-Ortí, G.: Blocked algorithms for the reduction to Hessenberg-triangular form revisited. BIT 48, 563–584 (2008)

  21. Kogbetliantz, E.: Diagonalization of General Complex Matrices as a New Method for Solution of Linear Equations. In: Proc. of Intern. Congr. Math, vol. 2, pp. 356–357. Amsterdam (1954)

  22. Kressner, D.: Numerical Methods for General and Structured Eigenvalue Problems. Lecture Notes in Computational Science and Engineering, vol. 46. Springer, Heidelberg (2005)

  23. Kuo, S.R., Yeih, W., Wu, Y.C.: Applications of the generalized singular-value decomposition method on the eigenproblem using the incomplete boundary element formulation. J. Sound Vibr. 235, 813–845 (2000)

  24. Lang, B.: Using level 3 BLAS in rotation-based algorithms. SIAM J. Sci. Comput. 19, 626–634 (1998)

  25. Moler, C.B., Stewart, G.W.: An algorithm for generalized matrix eigenvalue problems. SIAM J. Numer. Anal. 10, 241–256 (1973)

  26. Moore, B.C.: Principal component analysis in linear systems: controllability, observability, and model reduction. IEEE Trans. Automat. Control. 26, 17–32 (1981)

  27. Netlib: BLAS (Basic Linear Algebra Subprograms). (2017)

  28. Novaković, V., Singer, S., Singer, S.: Blocking and parallelization of the Hari–Zimmermann variant of the Falk–Langemeyer algorithm for the generalized SVD. Parallel Comput. 49, 136–152 (2015)

  29. NVIDIA: CUBLAS Library DU-06702-001_v10.0, User Guide. (2018)

  30. Paige, C.C.: Computing the generalized singular value decomposition. SIAM J. Sci. Stat. Comput. 7, 1126–1146 (1986)

  31. Schreiber, R., Van Loan, C.F.: A storage–efficient WY representation for products of Householder transformations. SIAM J. Sci. Stat. Comput. 10, 53–57 (1989)

  32. Stykel, T.: Gramian-based model reduction for descriptor systems. Math. Control Signals Syst. 16, 297–319 (2004)

  33. Tombs, M.S., Postlethwaite, I.: Truncated balanced realization of a stable non-minimal state–space system. Internat. J. Control 46, 1319–1330 (1987)

  34. Van Loan, C.F.: A general matrix eigenvalue algorithm. SIAM J. Numer. Anal. 12, 819–834 (1975)

  35. Watkins, D.S.: Product eigenvalue problems. SIAM Rev. 47, 3–40 (2005)


The author thanks the anonymous referees for their many helpful suggestions, which improved the quality of the paper.


This research has been financially supported by the Croatian Science Foundation under grant HRZZ-9345.

Author information

Corresponding author

Correspondence to Nela Bosner.

Cite this article

Bosner, N. Parallel reduction of four matrices to condensed form for a generalized matrix eigenvalue algorithm. Numer Algor 86, 153–178 (2021).


Keywords

  • Hessenberg–triangular–triangular–triangular form
  • Generalized singular value and eigenvalue problem
  • Givens rotations
  • Block and parallel implementations

Mathematics Subject Classification (2010)

  • 15A21
  • 15A18
  • 15A23
  • 65Y05
  • 65Y20