Abstract
The VZ algorithm proposed by Charles F. Van Loan (SIMA, 1975) attempts to solve the generalized type of matrix eigenvalue problem ACx = λBDx, where A, B ∈ Rn×m, C, D ∈ Rm×n, and m ≥ n, without forming products and inverses. Especially, this algorithm is suitable for solving the generalized singular value problem. Van Loan’s approach first reduces the matrices A, B, C, and D to a condensed form by the finite step initial reduction. The reduction finds orthogonal matrices Q, U, V, and Z, such that QAZ is upper Hessenberg, and QBV, ZTCU, and VTDU are upper triangular. In this initial reduction, A is reduced to upper Hessenberg form, while simultaneously preserving triangularity of other three matrices. This is done by Givens rotations, annihilating one by one element of A, and by generating three more rotations applied to other matrices per each annihilation. Such an algorithm is quite inefficient. In our work, we propose a blocked algorithm for the initial reduction, based on aggregated Givens rotations and matrix–matrix multiplications, which are applied in the outer loop updates. This algorithm has another level of blocking, exploited in the inner loop. Further, we also consider a variant of the algorithm in a hybrid CPU–GPU framework, where compute-intensive outer loop updates are performed on GPU, and can be overlapped with the reduction in the next step performed on CPU. On the other hand, application of a sequence of rotations in the inner loop is parallelized on CPU, with balanced operation count per thread. Since a large number of aggregated rotations are produced in every outer loop step, they are simultaneously accumulated before outer loop updates. These adjustments speed up original initial reduction considerably which is confirmed by numerical experiments, and the efficiency of the whole VZ algorithm is increased.
Similar content being viewed by others
References
Alter, O., Brown, P.O., Botstein, D.: Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms. Proc. Natl. Acad. Sci. USA 100, 3351–3356 (2003)
Antoulas, A.C., Sorensen, D.C.: Approximation of large–scale dynamical systems: an overview. Int. J. Appl. Math. Comput. Sci. 11, 1093–1121 (2001)
Bai, Z., Demmel, J.W.: Computing the generalized singular value decomposition. SIAM J. Sci. Comput. 14, 1464–1486 (1993)
Bai, Z., Zha, H.: A new preprocessing algorithm for the computation of the generalized singular value decomposition. SIAM J. Sci. Comput. 14, 1007–1012 (1993)
Benner, P.: Computational methods for linear–quadratic optimization. Supplemento ai Rendiconti del Circolo Matematico di Palermo Serrie II(58), 21–56 (1999)
Benner, P., Byers, R., Mehrmann, V., Xu, H.: Numerical computation of deflating subspaces of skew-Hamiltonian/Hamiltonian pencils. SIAM J. Matrix Anal. Appl. 24, 165–190 (2002)
Benner, P., Byers, R., Mehrmann, V., Xu, H.: Robust numerical methods for robust control. Technical Report 06-2004, Institut für Mathematik, TU Berlin (2004)
Bhuyan, K., Singh, S.B., Bhuyan, P.K.: Application of generalized singular value decomposition to ionospheric tomography. Annal. Geophys. 22, 3437–3444 (2004)
Bischof, C., Van Loan, C.F.: The WY representation for products of Householder matrices. SIAM J. Sci. Stat. Comput. 8, 2–13 (1987)
Bojanczyk, A., Golub, G.H., Van Dooren, P.: The periodic Schur decomposition; algorithm and applications. In: Proceedings of SPIE Conference, vol. 1770, pp. 31–42 (1992)
Bosner, N.: Efficient algorithm for simultaneous reduction to the m-Hessenberg-triangular-triangular form. BIT 55, 677–703 (2015)
Bosner, N., Karlsson, L.: Parallel and heterogeneous m–Hessenberg–triangular–triangular reduction. SIAM J. Sci. Comput. 39, C29–C47 (2017)
Demmel, J.W., Veselić, K.: Jacobi’s method is more accurate than QR. SIAM J. Matrix Anal. Appl. 13, 1204–1245 (1992)
Falk, S., Langemeyer, P. Schuff, H.K. (ed.): Das Jacobische Rotationsverfahren Fur Reel Symmetrische Matrizenpaare I, II. Friedr. Vieweg & Sohn, Braunschweig (1960)
Golub, G., Reinsch, C.: Singular value decomposition and least squares solution. Numer. Math. 14, 403–420 (1970)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd Edn. The Johns Hopkins University Press, Baltimore and London (1996)
Hari, V.: On Cyclic Jacobi Methods for the Positive Definite Generalized Eigenvalue Problem, publisher=PhD Thesis, FernUniversität-Gesamthochschule, Hagen (1984)
Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)
Higham, N.J., Konstantinov, M., Mehmann, V., Petkov, P.: The sensitivity of computational control problems. IEEE Control Syst. Mag. 24, 28–43 (2004)
Kågström, B., Kressner, D., Quintana-Ortí, E.S., Quintana-Ortí, G.: Blocked algorithms for the reduction to Hessenberg-triangular form revisited. BIT 48, 563–584 (2008)
Kogbetliantz, E.: Diagonalization of General Complex Matrices as a New Method for Solution of Linear Equations. In: Proc. of Intern. Congr. Math, vol. 2, 356–357. Amsterdam (1954)
Kressner, D.: Numerical Methods for General and Structured Eigenvalue Problems Lecture Notes in Computational Science and Engineering, vol. 46. Springer, Heidelberg (2005)
Kuo, S.R., Yeih, W., Wu, Y.C.: Applications of the generalized singular-value decomposition method on the eigenproblem using the incomplete boundary element formulation. J. Sound Vibr. 235, 813–845 (2000)
Lang, B.: Using level 3 BLAS in rotation-based algorithms. SIAM J. Sci. Comput. 19, 626–634 (1998)
Moler, C.B., Stewart, G.W.: An algorithm for generalized matrix eigenvalue problems. SIAM J. Numer. Anal. 10, 241–256 (1973)
Moore, B.C.: Principal component analysis in linear systems: controllability, observabilitiy, and model reduction. IEEE Trans. Automat. Control. 26, 17–32 (1981)
Netlib: BLAS (Basic Linear Algebra Subprograms). http://www.netlib.org/blas (2017)
Novaković, V., Singer, S., Singer, S.: Blocking and parallelization of the Hari–Zimmermann variant of the Falk–Langemeyer algorithm for the generalized SVD. Parallel Comput. 49, 136–152 (2015)
NVIDIA: CUBLAS Library DU-06702-001_v10.0, User Guide. http://docs.nvidia.com/cuda/pdf/CUBLAS_Library.pdf (2018)
Paige, C.C.: Computing the generalized singular value decomposition. SIAM J. Sci. Stat. Comput. 7, 1126–1146 (1986)
Schreiber, R., Van Loan, C.F.: A storage–efficient WY representation for products of Householder transformations. SIAM J. Sci. Stat. Comput. 10, 53–57 (1989)
Stykel, T.: Gramian-based model reduction for descriptor systems. Math. Control Signals Syst. 16, 297–319 (2004)
Tombs, M.S., Postlethwaite, I.: Truncated balanced realization of a stable non-minimal state–space system. Internat. J. Control 46, 1319–1330 (1987)
Van Loan, C.F.: A general matrix eigenvalue algorithm. SIAM J. Numer. Anal. 12, 819–834 (1975)
Watkins, D.S.: Product eigenvalue problems. SIAM Rev. 47, 3–40 (2005)
Acknowledgments
The author wishes to thank the anonymous referees for giving many helpful suggestions, which helped to improve the quality of the paper.
Funding
This research has been financially supported by the Croatian Science Foundation under grant HRZZ-9345.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Bosner, N. Parallel reduction of four matrices to condensed form for a generalized matrix eigenvalue algorithm. Numer Algor 86, 153–178 (2021). https://doi.org/10.1007/s11075-020-00883-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11075-020-00883-z
Keywords
- Hessenberg–triangular–triangular–triangular form
- Generalized singular value and eigenvalue problem
- Givens rotations
- Block and parallel implementations