New iterative methods for generalized singular-value problems

This paper presents two new iterative methods for computing generalized singular values and vectors of large sparse matrices. To accelerate convergence, we replace the usual Euclidean inner product with a different inner product; furthermore, a different inner product is chosen at each restart. A number of numerical experiments illustrate the performance of the proposed methods.


Introduction
There are many applications of the generalized singular-value decomposition (GSVD) in the literature, including the computation of the Kronecker form of the matrix pencil A − λB [5], the solution of linear matrix equations [1], weighted least squares [2], and linear discriminant analysis [6], to name but a few. In applications such as the generalized total least squares problem, the matrices A and B are large and sparse, and only a few of the generalized singular vectors corresponding to the smallest or largest generalized singular values are needed. There is a close connection between the GSVD problem and two different generalized eigenvalue problems, and many efficient numerical methods exist for solving generalized eigenvalue problems [8–11]. In this paper, we examine a Jacobi–Davidson-type subspace method related to the Jacobi–Davidson method for the SVD [5], which in turn was inspired by the Jacobi–Davidson method for the eigenvalue problem [4]. The main step of the Jacobi–Davidson-type method for the GSVD is solving the correction equations exactly, which requires the solution of linear systems of the original size at each iteration. In general, these systems are large, sparse, and nonsymmetric. For this purpose, we use the weighted Krylov subspace process to solve the correction equations exactly, and we show that the proposed method exhibits asymptotically quadratic convergence.
The paper is organized as follows. In ''Preparations'', we recall basic definitions of the generalized singular-value decomposition problem and its elementary properties. ''A new iterative method for GSVD'' introduces our new numerical methods together with an analysis of their convergence.

Preparations

Taking the previous theorem into consideration, there exist orthogonal matrices U ∈ R^{m×m}, V ∈ R^{p×p} and a nonsingular matrix X ∈ R^{n×n} such that U^T A X = diag(α_1, …, α_n) and V^T B X = diag(β_1, …, β_n). Proof See [3].
Furthermore, assume it is nonsingular. The matrix pencil then has eigenvalues λ_j = ±α_j/β_j, j = 1, …, n, corresponding to eigenvectors formed from u_j and x_j, where u_j is the jth column of U and x_j is the jth column of X.
Let D = diag(d_1, …, d_n). For two vectors u, v ∈ R^n, we define the D-scalar product (u, v)_D = v^T D u, which is well defined (i.e., an inner product) if and only if the matrix D is positive definite, that is, d_i > 0 for i = 1, …, n. The norm associated with this inner product is the D-norm ‖·‖_D, defined by ‖u‖_D = (u, u)_D^{1/2}. Since B is assumed to have full rank, (x, y)_{(B^T B)^{-1}} := y^T (B^T B)^{-1} x is an inner product, and the corresponding norm satisfies ‖x‖²_{(B^T B)^{-1}} := (x, x)_{(B^T B)^{-1}}. Inspired by the equality ‖Z‖²_F = trace(Z^T Z) for a real matrix Z, we define the (B^T B)^{-1}-Frobenius norm of Z by ‖Z‖²_{F,(B^T B)^{-1}} := trace(Z^T (B^T B)^{-1} Z).

A new iterative method for GSVD

We now present different extraction methods, which are often more appropriate for small generalized singular values than the standard one. Before dealing with these new methods, we describe our main idea, which is developed from Krylov subspace methods.
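As a minimal illustration of these definitions (the function names are ours, not the paper's), the D-inner product, the D-norm, and the (B^T B)^{-1}-inner product can be sketched in NumPy:

```python
import numpy as np

def d_inner(u, v, d):
    """D-scalar product (u, v)_D = v^T D u for D = diag(d), d_i > 0."""
    return v @ (d * u)

def d_norm(u, d):
    """D-norm ||u||_D = sqrt((u, u)_D)."""
    return np.sqrt(d_inner(u, u, d))

def btb_inv_inner(x, y, B):
    """(x, y)_{(B^T B)^{-1}} = y^T (B^T B)^{-1} x for full-rank B;
    solving with B^T B avoids forming an explicit inverse."""
    return y @ np.linalg.solve(B.T @ B, x)
```

Note that for d = (1, …, 1) these reduce to the usual Euclidean inner product and norm.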
Theorem 3.1 Assume that (σ, u, w) is a generalized singular triple: Aw = σu and A^T u = σ B^T B w, where σ is a simple nontrivial generalized singular value and ‖u‖ = ‖Bw‖ = 1, and suppose that the correction equations are solved exactly in every step. Provided that the initial vectors (ũ, w̃) are close enough to (u, w), the sequence of approximations (ũ, w̃) converges quadratically to (u, w).
Lemma 3.2 With the notation of Theorem 3.1, suppose that m steps of the weighted Arnoldi process [7] have been performed on the matrix in (7). Let H̃_m be the Hessenberg matrix whose nonzero entries are the scalars h̃_{i,j} constructed by the weighted Arnoldi process. Then the basis Ṽ_m = [ṽ_1, …, ṽ_m] constructed by this algorithm is D-orthonormal and satisfies the Arnoldi relation A Ṽ_m = Ṽ_m H̃_m + h̃_{m+1,m} ṽ_{m+1} e_m^T. Proof See [4].
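The weighted Arnoldi process referred to in the lemma can be sketched as follows (a generic implementation for a square matrix A and a positive weight vector d; a simplified sketch rather than the paper's exact code):

```python
import numpy as np

def weighted_arnoldi(A, v0, m, d):
    """m steps of the weighted Arnoldi process: builds a basis
    V = [v_1, ..., v_{m+1}] of the Krylov subspace K(A, v0) that is
    D-orthonormal for (u, v)_D = v^T D u with D = diag(d), together
    with the (m+1) x m Hessenberg matrix H, so that A V_m = V_{m+1} H."""
    n = len(v0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v0 / np.sqrt(v0 @ (d * v0))      # D-normalize start vector
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt in D
            H[i, j] = V[:, i] @ (d * w)
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.sqrt(w @ (d * w))      # D-norm of the remainder
        if H[j + 1, j] < 1e-14:                 # happy breakdown
            return V[:, : j + 1], H[: j + 2, : j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

The basis returned is D-orthonormal, i.e., V^T D V = I, which is exactly the property the lemma states.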
As with standard Krylov methods, the mth (m ≥ 1) iterate x_m = [s_m, t_m]^T of the weighted-FOM and weighted-GMRES methods belongs to the affine Krylov subspace (9). We are now ready to prove our main theorem.
Theorem 3.3 Under the assumptions of Theorem 3.1, suppose that m steps of the weighted Arnoldi process have been run on (7). Then the iterate x_m = [s_m, t_m]^T is the exact solution of the correction equation (10).

Proof The iterate x_m^{WF} of the weighted-FOM method is selected so that its residual is D-orthogonal to the Krylov subspace, while the iterate x_m^{WG} of the weighted-GMRES method is selected to minimize the residual D-norm over (9); that is, it is the solution of the least squares problem (12). In these methods, we use the D-inner product and the D-norm to compute the solution in the affine subspace (9), and we build a D-orthonormal basis of the Krylov subspace (13) by the weighted Arnoldi process. An iterate x_m of these two methods can be written as x_m = x_0 + Ṽ_m y_m, where y_m ∈ R^m.
Therefore, the corresponding residual is r_m = r_0 − A Ṽ_m y_m. As far as the weighted-GMRES method is concerned, the matrix Ṽ_{m+1} is D-orthonormal, so problem (12) reduces to finding the vector y_m^{WG} that solves the minimization problem (15). The solutions of (14) and (15) can be obtained using the QR decomposition of the matrix H̃_m, as for the FOM and GMRES algorithms.
When m is equal to the degree of the minimal polynomial of the initial residual, the Krylov subspace (13) becomes invariant. Therefore, the iterate x_m = [s_m, t_m]^T obtained by both methods is the exact solution of the correction equation (10). □

We now present the main algorithm of this paper. The following algorithm applies the FOM, GMRES, weighted-FOM, and weighted-GMRES processes to solve the correction equation (10) and, ultimately, the generalized singular-value decomposition problem. The resulting methods are denoted F-JDGSVD, G-JDGSVD, WF-JDGSVD, and WG-JDGSVD.
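The two projected solves behind the weighted-FOM and weighted-GMRES iterates can be sketched as follows (a simplified illustration with x_0 = 0; the basis V and Hessenberg matrix H are assumed to come from a weighted Arnoldi run, and the function name is ours):

```python
import numpy as np

def wfom_wgmres_iterates(V, H, beta):
    """Given a D-orthonormal basis V (n x (m+1)), the (m+1) x m
    Hessenberg matrix H, and beta = ||r_0||_D, return the
    weighted-FOM iterate (Galerkin condition: H_m y = beta e_1)
    and the weighted-GMRES iterate (minimize ||beta e_1 - H y||),
    both of the form x_m = V_m y_m."""
    m = H.shape[1]
    rhs = np.zeros(m + 1)
    rhs[0] = beta
    y_fom = np.linalg.solve(H[:m, :], rhs[:m])        # square Galerkin system
    y_gmres = np.linalg.lstsq(H, rhs, rcond=None)[0]  # small least squares
    return V[:, :m] @ y_fom, V[:, :m] @ y_gmres
```

When m reaches the degree of the minimal polynomial of the initial residual, both iterates coincide with the exact solution of the linear system, which mirrors the invariance argument in the proof above.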
As Algorithm 3.1 shows, the algorithm contains two loops: the outer iteration computes the largest generalized singular value, and the inner iteration solves the system of linear equations at each step. Numerical tests indicate a significant relation between the parameter m, the norm of the residual vector, and the computational time.

Convergence
We now demonstrate that the proposed method converges asymptotically quadratically to generalized singular values when the correction equations are solved exactly, and linearly when they are solved with a sufficiently small residual reduction. The correction equation reads

P(A − θB) [s_m; t_m] = −r.   (16)

Moreover, let αu = ũ + s with s ⊥ ũ, and βw = w̃ + t with t ⊥ w̃, for certain scalars α and β satisfying (15); these decompositions are possible as long as u^T ũ ≠ 0 and w^T w̃ ≠ 0, which follows from the assumption that the vectors (ũ, w̃) are close to (u, w). Projecting (16) yields (17). Subtracting (16) from (17) gives

P(A − θB) [s − s_m; t − t_m] = P [(μ_1 − θ)s; (μ_2 − θ)B^T B t].

Thus, for (ũ, w̃) close enough to (u, w), P(A − θB) is a bijection from ũ^⊥ × w̃^⊥ onto itself. Together with the above, this implies asymptotic quadratic convergence.

Numerical experiments
In this section, we look for the largest generalized singular value, using the default options of the proposed method [7]. We choose two diagonal matrices of dimension n = 1000, built for j = 1, 2, …, 1000 from values r_j uniformly distributed on the interval (0, 1), where ⌈·⌉ denotes the ceiling function. We take A and B as products with Q_1 and Q_2, two random orthogonal matrices. The estimated condition numbers of A and B are 4.4e2 and 5.7e0, respectively (Table 1).
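The exact construction formulas are not fully legible in the extracted text, so the following is only one plausible recipe for a test pair of this flavor (the cosine–sine profiles C and S and the sandwiching by Q_1 and Q_2 are our assumptions, not the paper's verbatim construction):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Assumed diagonal profiles: r_j uniform on (0, 1), chosen so C^2 + S^2 = I
r = rng.uniform(0.0, 1.0, size=n)
C = np.diag(r)
S = np.diag(np.sqrt(1.0 - r**2))

# Random orthogonal factors from the QR decomposition of Gaussian matrices
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))

A = Q1 @ C @ Q2.T
B = Q1 @ S @ Q2.T
# With this choice, the generalized singular values of (A, B) are
# r_j / sqrt(1 - r_j^2).
```

A convenient sanity check for this construction is the identity A^T A + B^T B = I, which follows from C² + S² = I.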
We can see that increasing the value of m decreases the number of outer and inner iterations, and therefore the computational time. Note, however, that if m is very large, the number of iterations increases because of the loss of orthogonality. This example shows that the improvement brought by the weighted methods WF-JDGSVD and WG-JDGSVD concerns both the relative error and the computational time (Fig. 1).
From Fig. 1, we can see that the suggested method WG-JDGSVD is more accurate than the other methods. The next example shows the performance of the two new methods on large sparse problems. In this test, we had difficulties computing the largest singular value for the ill-conditioned matrices A and B; due to this ill-conditioning, it turned out to be advantageous to turn off the Krylov option.

Example 4.3 Consider the matrix pair (A, B), where A is the matrix lp-ganges from the University of Florida sparse matrix collection [8]. This matrix arises from a linear programming problem. Its size is 1309 × 1706, and it has a total of Nz = 6937 nonzero elements. The estimated condition number is 2.1332e4, and B is taken as the identity matrix of conformal size (Tables 2, 3).
We should mention that, for all considered Krylov subspace sizes, each weighted method converges in fewer iterations and less time than its corresponding standard method. The convergence of F-JDGSVD and G-JDGSVD is slow, with linear asymptotic convergence, whereas WF-JDGSVD and WG-JDGSVD exhibit quadratic asymptotic convergence, because the correction equation (10) is solved exactly.
Remark 4.4 The above examples and tables show that the two suggested methods are more accurate than G-JDGSVD and F-JDGSVD for the same value of m, but their computational times are often slightly longer. Therefore, WF-JDGSVD and WG-JDGSVD can be used when computational time is less important.
Remark 4.5 The algorithm described above finds the largest generalized singular triple. We can compute multiple generalized singular triples of the pair (A, B) using a deflation technique. Suppose that U_f = [u_1, …, u_f] and W_f = [w_1, …, w_f] contain the already found generalized singular vectors, where B W_f has orthonormal columns. One can check that the deflated matrix pair has the same remaining generalized singular values and vectors as the pair (A, B) (see [3]).
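The deflated pair itself is not legible in the extracted text; the sketch below is a plausible JDGSVD-style reconstruction under the stated assumption that B W_f has orthonormal columns, not a verbatim transcription of the pair from [3]:

```python
import numpy as np

def deflate_pair(A, B, U_f, W_f):
    """Sketch of a deflation for found generalized singular triples:
    project A on the left against span(U_f) and on the right against
    the B-orthonormal range of W_f, leaving B unchanged.  Assumes
    B @ W_f has orthonormal columns; this is a plausible reconstruction
    of the deflated pair, not the exact formula from [3]."""
    m, n = A.shape
    P_left = np.eye(m) - U_f @ U_f.T            # I - U_f U_f^T
    P_right = np.eye(n) - W_f @ (B @ W_f).T @ B
    return P_left @ A @ P_right, B
```

In the special case B = I this reduces to the familiar SVD deflation, where removing the top triple leaves the second-largest singular value as the new maximum.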

Example 4.6
In the generalized singular-value decomposition, if B = I_n, the n × n identity matrix, we obtain the singular values of A. The SVD has important applications in image and data compression. For example, consider the following image, represented by a 1185 × 1917 matrix A, which we can decompose via the singular-value decomposition as A = U Σ V^T, where U is 1185 × 1185, Σ is 1185 × 1917, and V is 1917 × 1917. The matrix A can also be written as a sum of rank-1 matrices, A = Σ_{j=1}^{r} σ_j u_j v_j^T, where σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0 are the r nonzero singular values of A. In digital image processing, a matrix A of order m × n (m ≥ n) generally has a large number of small singular values. Suppose that (n − k) small singular values of A can be neglected (Fig. 2).
Then, the matrix A_k = Σ_{j=1}^{k} σ_j u_j v_j^T is a rank-k approximation of A, and storing it requires only k(m + n + 1) numbers instead of mn.
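A minimal sketch of this truncation using NumPy's SVD (the image is replaced here by a random stand-in matrix):

```python
import numpy as np

def rank_k_approx(A, k):
    """Best rank-k approximation A_k = sum_{j<=k} sigma_j u_j v_j^T,
    optimal in both the 2-norm and the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storing A_k needs only k*(m + n + 1) numbers instead of m*n, which is
# the basis of SVD image compression: keep the k largest singular values.
```

Increasing k monotonically decreases the approximation error, and k equal to the rank of A recovers A exactly.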

Conclusions
In this paper, we have suggested two new iterative methods, WF-JDGSVD and WG-JDGSVD, for computing some of the generalized singular values and corresponding vectors, and we have illustrated them on various examples. To accelerate convergence, we applied Krylov subspace methods to solve the correction equations in large sparse problems. Our methods exhibit asymptotically quadratic convergence, because the correction equations are solved exactly; by contrast, the correction equations in the F-JDGSVD and G-JDGSVD methods are solved inexactly for large sparse problems, so they achieve only linear convergence. As the cost of the WF-JDGSVD and WG-JDGSVD methods is not much larger than that of the F-JDGSVD and G-JDGSVD methods, and as the weighted methods need fewer iterations to converge, a parallel version of the weighted methods seems very attractive. The tables and figures show that, as m increases, the suggested methods are more accurate than the previous ones; moreover, as the dimension of the matrices grows, the two suggested methods remain applicable. These results are supported by the convergence theorem, which establishes asymptotically quadratic convergence to generalized singular values.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.