Golub–Kahan vs. Monte Carlo: a comparison of bidiagonalization and a randomized SVD method for the solution of linear discrete ill-posed problems

Randomized methods can be competitive for the solution of problems with a large matrix of low rank. They also have been applied successfully to the solution of large-scale linear discrete ill-posed problems by Tikhonov regularization (Xiang and Zou in Inverse Probl 29:085008, 2013). This entails the computation of an approximation of a partial singular value decomposition of a large matrix A that is of numerical low rank. The present paper compares a randomized method to a Krylov subspace method based on Golub–Kahan bidiagonalization with respect to accuracy and computing time and discusses characteristics of linear discrete ill-posed problems that make them well suited for solution by a randomized method.


Introduction
We are concerned with the solution of linear least-squares problems

$$\min_{x\in\mathbb{R}^n}\|Ax-b\|, \tag{1}$$

where $A\in\mathbb{R}^{m\times n}$ is a large matrix whose singular values "cluster" at the origin, and $b\in\mathbb{R}^m$. In particular, $A$ has many "tiny" singular values of different orders of magnitude. Least-squares problems with a matrix of this kind are commonly referred to as linear discrete ill-posed problems. They arise, for instance, from the discretization of Fredholm integral equations of the first kind; see, e.g., [10,19]. Applications of this kind of least-squares problems include image reconstruction and remote sensing. Throughout this paper $\|\cdot\|$ denotes the Euclidean vector norm or the spectral matrix norm. Both the situations when $m\ge n$ and when $m<n$ will be considered. The vector $b$ in linear discrete ill-posed problems that arise in applications typically represents measured data and is contaminated by a measurement error $e\in\mathbb{R}^m$. Thus,

$$b=b_{\rm exact}+e, \tag{2}$$

where $b_{\rm exact}$ denotes the unknown error-free vector associated with $b$. We are interested in determining the solution, $x_{\rm exact}$, of minimal Euclidean norm of the unavailable least-squares problem

$$\min_{x\in\mathbb{R}^n}\|Ax-b_{\rm exact}\|.$$
Due to the error $e$ in $b$ and the presence of small positive singular values of $A$, the solution of (1) of minimal Euclidean norm typically is not a useful approximation of $x_{\rm exact}$. To determine a meaningful approximation of $x_{\rm exact}$, one generally replaces the minimization problem (1) by a nearby problem, whose solution is less sensitive to the error $e$. This replacement is known as regularization. One of the most popular regularization methods is due to Tikhonov. It replaces (1) by a penalized least-squares problem of the form

$$\min_{x\in\mathbb{R}^n}\left\{\|Ax-b\|^2+\mu\|Lx\|^2\right\}, \tag{3}$$

where $L\in\mathbb{R}^{p\times n}$ is a regularization matrix and $\mu\ge 0$ a regularization parameter. We require that $\mathcal{N}(A)\cap\mathcal{N}(L)=\{0\}$, where $\mathcal{N}(M)$ denotes the null space of the matrix $M$. Then the penalized least-squares problem (3) has the unique solution

$$x_\mu=(A^TA+\mu L^TL)^{-1}A^Tb \tag{4}$$

for any $\mu>0$; the superscript $T$ denotes transposition. Common choices of regularization matrices $L$ include the identity matrix, denoted by $I$, and discrete approximations of differential operators; see, e.g., [5,7,19,23]. The value of the regularization parameter $\mu>0$ determines how sensitive the vector (4) is to the error in $b$ and how close it is to the desired vector $x_{\rm exact}$. We will determine $\mu$ with the aid of the discrepancy principle; see below. This requires that a bound for the error $e$ be known, i.e.,

$$\|e\|\le\varepsilon, \tag{5}$$

and that $b_{\rm exact}\in\mathcal{R}(A)$, where $\mathcal{R}(A)$ denotes the range of $A$. If these requirements are not satisfied, then other methods, including the L-curve criterion and generalized cross validation, can be used to determine a suitable value of $\mu$; see, e.g., [11,25,26,29] for discussions and illustrations. The Tikhonov solution (4) is said to satisfy the discrepancy principle if

$$\|Ax_\mu-b\|=\eta\varepsilon, \tag{6}$$

where $\eta>1$ is a user-chosen parameter that is independent of $\varepsilon$. When $\varepsilon$ in (5) is known to be an accurate estimate of $\|e\|$ and $e$ represents white Gaussian noise, then generally $\eta$ is chosen to be close to unity. Equation (6) has a unique solution $\mu_{\rm discr}=\mu>0$ for many reasonable values of $\eta>0$; see, e.g., [2].
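To make the discrepancy principle concrete, the following minimal sketch determines $\mu$ for the special case $L=I$ on a problem small enough that the full SVD of $A$ is affordable. It assumes that (3) has the form $\min_x\{\|Ax-b\|^2+\mu\|x\|^2\}$ and uses bisection on $\log\mu$ instead of the Newton iteration of [2]. The function name `tikhonov_discrepancy`, the bracket `[1e-16, 1e16]`, and the iteration count are illustrative assumptions, not part of the methods compared in this paper.

```python
import numpy as np

def tikhonov_discrepancy(A, b, eps, eta=1.01, iters=200):
    """Sketch: Tikhonov solution with L = I whose residual norm is eta*eps.

    Assumes the penalized problem min ||Ax - b||^2 + mu ||x||^2 and that
    eta*eps lies strictly between the smallest and largest attainable
    residual norms, so that the bisection bracket below is valid.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    # Squared norm of the part of b orthogonal to range(U); it appears in
    # the residual for every value of mu.
    r_perp2 = max(b @ b - beta @ beta, 0.0)

    def residual(mu):
        # b - A x_mu = U diag(mu / (s^2 + mu)) U^T b + (I - U U^T) b
        filt = mu / (s**2 + mu)
        return np.sqrt(np.sum((filt * beta) ** 2) + r_perp2)

    target = eta * eps
    lo, hi = 1e-16, 1e16          # assumed bracket for mu
    for _ in range(iters):
        mid = np.sqrt(lo * hi)    # bisection on log(mu)
        if residual(mid) < target:
            lo = mid              # residual grows with mu: increase mu
        else:
            hi = mid
    mu = np.sqrt(lo * hi)
    x = Vt.T @ (s / (s**2 + mu) * beta)
    return x, mu
```

The residual norm is an increasing function of $\mu$, which is what makes a simple bracketing zero-finder applicable here; a Newton-type method would typically need fewer function evaluations.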
Several zero-finders for determining $\mu_{\rm discr}$ are described in [30]. A proof in a Hilbert space setting that $x_\mu\to x_{\rm exact}$ as $\varepsilon\searrow 0$ can be found in [10]. It is the purpose of the present paper to compare a solution method for large-scale Tikhonov minimization problems (3) based on partial Golub-Kahan bidiagonalization of $A$ to a randomized solution method. Partial Golub-Kahan bidiagonalization is the basis for possibly the most popular Krylov subspace methods for the solution of large-scale problems (3) with a nonsymmetric or symmetric indefinite matrix $A$; see, e.g., [2,15,18,19,22,24] for discussions and illustrations of this solution approach. Iterative solution methods that are based on the Arnoldi process instead of Golub-Kahan bidiagonalization are competitive for certain problems, but may fail to determine accurate approximations of $x_{\rm exact}$ for some problems; see [6,14,15,27] for discussions and applications of the Arnoldi process to large-scale linear discrete ill-posed problems. We therefore focus on Golub-Kahan bidiagonalization in the present paper.
When solving (3) by application of $\ell$ steps of Golub-Kahan bidiagonalization, the matrix $A$ is replaced by an approximation of rank at most $\ell$. Typically, $1\le\ell\ll\max\{m,n\}$ in applications. Thus, Golub-Kahan bidiagonalization applied to the solution of (3) replaces $A$ by a low-rank approximation of the matrix $A$, and then solves the low-rank problem instead of (3).
Randomized solution methods for the solution of large-scale problems have received considerable attention; see Halko et al. [17] for a survey. When applied to the solution of (3), these methods also determine a low-rank approximation of A. They compute an approximate solution of the original problem by replacing the given matrix by its low-rank approximation, and then solve the low-rank problem so obtained. Xiang and Zou [33,34] describe applications of this approach to the solution of large-scale Tikhonov minimization problems (3). To the best of our knowledge, very few comparisons of randomized and Krylov subspace-based solution methods for linear discrete ill-posed problems are available in the literature. It is quite natural to compare these approaches to the solution of large-scale linear discrete ill-posed problems, because they both determine low-rank approximations of the large matrix A. Vatankhah et al. [32] show that randomized methods may be faster than Krylov subspace methods based on Golub-Kahan bidiagonalization for certain problems. We illustrate that, for some linear discrete ill-posed problems, methods based on Golub-Kahan bidiagonalization are competitive and we seek to shed light on for which kinds of linear discrete ill-posed problems Golub-Kahan bidiagonalization may be preferable.
This paper is organized as follows. Section 2 reviews methods based on partial Golub-Kahan bidiagonalization of the matrix A described in [2,22] for the solution of large-scale Tikhonov regularization problems (3). Section 3 outlines the randomized method proposed by Xiang and Zou [33]. The randomized method discussed in [34] also is commented on. Section 4 presents computed results, and Sect. 5 contains concluding remarks.

Solution methods based on Golub-Kahan bidiagonalization
This section reviews solution methods for the Tikhonov minimization problem described in [2,22]. They are based on reducing the matrix $A$ to a small bidiagonal matrix by the application of $1\le\ell\ll\min\{m,n\}$ steps of Golub-Kahan bidiagonalization to $A$. The number of steps $\ell$ is chosen as small as possible so that the computed solution can satisfy the discrepancy principle. Application of $\ell$ steps of Golub-Kahan bidiagonalization to $A$ with initial vector $b$ gives the decompositions

$$AV_\ell=U_{\ell+1}\bar{C}_\ell,\qquad A^TU_\ell=V_\ell C_\ell^T, \tag{7}$$

where the matrices $U_{\ell+1}\in\mathbb{R}^{m\times(\ell+1)}$ and $V_\ell\in\mathbb{R}^{n\times\ell}$ have orthonormal columns, $U_\ell\in\mathbb{R}^{m\times\ell}$ consists of the first $\ell$ columns of $U_{\ell+1}$, and

$$U_{\ell+1}e_1=b/\|b\|. \tag{8}$$

Here and throughout this paper $e_1=[1,0,\ldots,0]^T$ denotes the first axis vector of appropriate dimension. The range of $V_\ell$ is the Krylov subspace

$$\mathcal{K}_\ell(A^TA,A^Tb)=\mathrm{span}\{A^Tb,(A^TA)A^Tb,\ldots,(A^TA)^{\ell-1}A^Tb\}. \tag{9}$$

Further, the matrix $\bar{C}_\ell\in\mathbb{R}^{(\ell+1)\times\ell}$ is lower bidiagonal with positive entries $\sigma_k$ and $\rho_k$, and $C_\ell\in\mathbb{R}^{\ell\times\ell}$ is obtained by removing the last row of $\bar{C}_\ell$. We assume that $\ell$ is small enough so that the decompositions (7) with the described properties exist. This is the generic situation. The dominating computational effort required to determine the decompositions (7) is the sequential evaluation of $\ell$ matrix-vector products with each one of the matrices $A$ and $A^T$; see, e.g., [16, Sect. 10.4.4] for an algorithm.

Following [22], we compute an approximate solution of (3) by minimizing over the Krylov subspace (9) instead of over $\mathbb{R}^n$. Thus, we solve

$$\min_{x\in\mathcal{K}_\ell(A^TA,A^Tb)}\left\{\|Ax-b\|^2+\mu\|Lx\|^2\right\}, \tag{10}$$

which, by using the representation $x=V_\ell y$ and the relations (7) and (8), can be expressed as

$$\min_{y\in\mathbb{R}^\ell}\left\{\|\bar{C}_\ell y-e_1\|b\|\|^2+\mu\|LV_\ell y\|^2\right\}. \tag{11}$$

Denote the solution of (11) by $y_{\mu,\ell}$. Then $x_{\mu,\ell}=V_\ell y_{\mu,\ell}$ is an approximate solution of (3). We point out that the problems (3) and (10) only differ in the spaces over which they are minimized. The randomized method of Sect. 3 gives a minimization problem that differs in several ways from the problem (3). First consider the situation when $L=I$.
Then it is shown in [1, Theorem 5.1] that

$$\|Ax_{\mu,\ell}-b\|=\|\bar{C}_\ell y_{\mu,\ell}-e_1\|b\|\|. \tag{12}$$

It therefore suffices to choose $\mu>0$ so that the reduced problem on the right-hand side satisfies the discrepancy principle; see [2] for details. It follows that it is quite inexpensive to determine a value of $\mu>0$ such that the approximate solution $x_{\mu,\ell}=V_\ell y_{\mu,\ell}$ of (3) with $L=I$ satisfies (6). A discussion on how this can be done by using Newton's method can be found in [2]; other zero-finders are discussed in [30]. The expressions (12) decrease as $\ell$ increases. This follows from the fact that the dimension of the Krylov subspace in (10) increases with $\ell$ and that the subspaces are nested, i.e., $\mathcal{K}_\ell(A^TA,A^Tb)\subset\mathcal{K}_{\ell+1}(A^TA,A^Tb)$ for $\ell=1,2,\ldots$. We choose the number of bidiagonalization steps, $\ell$, as small as possible to satisfy the discrepancy principle for some $0<\mu<\infty$, i.e., we choose the smallest $\ell$ for which the equation $\|\bar{C}_\ell y_{\mu,\ell}-e_1\|b\|\|=\eta\varepsilon$ has a solution $0<\mu<\infty$. Further details on the choice of $\ell$ are described in [2].

We turn to the case when $L\ne I$. This situation can be handled by several approaches; see, e.g., [3,9,22]. In the numerical examples of Sect. 4, we will apply the method described in [22]. Let $L\in\mathbb{R}^{p\times n}$ and assume that $\ell$ in (7) satisfies $1\le\ell\le\min\{p,n\}$. Compute the QR factorization $Q_\ell R_\ell=LV_\ell$, where $Q_\ell\in\mathbb{R}^{p\times\ell}$ has orthonormal columns and $R_\ell\in\mathbb{R}^{\ell\times\ell}$ is upper triangular. Then (11) becomes

$$\min_{y\in\mathbb{R}^\ell}\left\{\|\bar{C}_\ell y-e_1\|b\|\|^2+\mu\|R_\ell y\|^2\right\}.$$

Typically, the matrix $R_\ell$ is nonsingular and not very ill-conditioned. Then the change of variables $z=R_\ell y$ results in the minimization problem

$$\min_{z\in\mathbb{R}^\ell}\left\{\|\bar{C}_\ell R_\ell^{-1}z-e_1\|b\|\|^2+\mu\|z\|^2\right\}.$$

It can be shown that if $\mu$ is determined so that the solution $z_{\mu,\ell}$ satisfies $\|\bar{C}_\ell R_\ell^{-1}z_{\mu,\ell}-e_1\|b\|\|=\eta\varepsilon$, then the associated approximate solution $x_{\mu,\ell}=V_\ell R_\ell^{-1}z_{\mu,\ell}$ satisfies the discrepancy principle (6); see [22] for details. Hence, it is quite cheap to determine an approximate solution of (3) that satisfies (6) also when $L\ne I$.
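The partial bidiagonalization underlying this section can be sketched in a few lines. The code below is a minimal illustration, not the implementation used for the computed examples: it follows the standard Golub-Kahan recurrences with initial vector $b$, adds full reorthogonalization for numerical robustness (an assumption made here for clarity), and assumes that no breakdown occurs, i.e., that all computed norms are positive. The function name `golub_kahan` is hypothetical.

```python
import numpy as np

def golub_kahan(A, b, ell):
    """Sketch: ell steps of Golub-Kahan bidiagonalization with start vector b.

    Returns U (m x (ell+1)), V (n x ell) with orthonormal columns and the
    lower bidiagonal C_bar ((ell+1) x ell) such that A V = U C_bar.
    Assumes no breakdown (all norms computed below are nonzero).
    """
    m, n = A.shape
    U = np.zeros((m, ell + 1))
    V = np.zeros((n, ell))
    C_bar = np.zeros((ell + 1, ell))
    U[:, 0] = b / np.linalg.norm(b)
    for j in range(ell):
        # New right vector: A^T u_j minus its component along v_{j-1}.
        v = A.T @ U[:, j]
        if j > 0:
            v -= C_bar[j, j - 1] * V[:, j - 1]
        v -= V[:, :j] @ (V[:, :j].T @ v)            # reorthogonalization
        C_bar[j, j] = np.linalg.norm(v)
        V[:, j] = v / C_bar[j, j]
        # New left vector: A v_j minus its component along u_j.
        u = A @ V[:, j] - C_bar[j, j] * U[:, j]
        u -= U[:, :j + 1] @ (U[:, :j + 1].T @ u)    # reorthogonalization
        C_bar[j + 1, j] = np.linalg.norm(u)
        U[:, j + 1] = u / C_bar[j + 1, j]
    return U, V, C_bar
```

Each pass through the loop requires one matrix-vector product with $A^T$ and one with $A$, and the columns are necessarily generated one at a time, which is why these products cannot be grouped into matrix-matrix products.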

Randomized solution methods
The reduced singular value decomposition (SVD) of the matrix $A\in\mathbb{R}^{m\times n}$, with $m\ge n$, is of the form

$$A=U\Sigma V^T, \tag{13}$$

where the matrices $U\in\mathbb{R}^{m\times n}$ and $V\in\mathbb{R}^{n\times n}$ have orthonormal columns, and

$$\Sigma=\mathrm{diag}[\sigma_1,\sigma_2,\ldots,\sigma_n]\in\mathbb{R}^{n\times n},\qquad \sigma_1\ge\sigma_2\ge\cdots\ge\sigma_n\ge 0.$$

The diagonal entries $\sigma_j$ are known as the singular values of $A$. An analogous decomposition is available when $m<n$; see, e.g., [16, Sect. 2.4] for details on the SVD of a matrix. The computation of the SVD (13) requires $O(\max\{m,n\}\min\{m,n\}^2)$ flops and, therefore, is expensive when $m$ and $n$ are large. Randomized SVD methods determine an approximation of the factorization (13) and are less expensive; see, e.g., Halko et al. [17]. Xiang and Zou [33] describe how randomized SVD methods can be applied to the solution of large-scale Tikhonov minimization problems (3) when $L=I$. We will outline their approaches and, in Sect. 4, compare the performance of these randomized methods to the Golub-Kahan bidiagonalization method of Sect. 2.
Xiang and Zou [34] also describe several randomized approaches to the solution of Tikhonov regularization problems in general form (3). Some of these methods are based on first transforming (3) to an equivalent problem with L = I , similarly as outlined at the end of Sect. 2, while others apply a randomized generalized SVD. We will not discuss the latter methods in the present paper.
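We first describe the method proposed by Xiang and Zou [33] for the approximate solution of (3) when $m\ge n$ and $L=I$. Let the entries of the matrix $\Omega_\ell\in\mathbb{R}^{n\times\ell}$, where $1\le\ell\ll n$, be independent and identically distributed normal random numbers with zero mean, and compute the QR factorization

$$A\Omega_\ell=Q_\ell R_\ell,$$

where $Q_\ell\in\mathbb{R}^{m\times\ell}$ has orthonormal columns and $R_\ell\in\mathbb{R}^{\ell\times\ell}$ is upper triangular. The matrix $R_\ell$ is assumed to be nonsingular in [33] and we will, for now, assume the same. Then the columns of $Q_\ell$ form an orthonormal basis for $\mathcal{R}(A\Omega_\ell)$. Let $B_\ell=Q_\ell^TA\in\mathbb{R}^{\ell\times n}$ and compute the reduced SVD,

$$B_\ell=W_\ell\Sigma_\ell\widetilde{V}_\ell^T, \tag{14}$$

where the matrices $W_\ell\in\mathbb{R}^{\ell\times\ell}$ and $\widetilde{V}_\ell\in\mathbb{R}^{n\times\ell}$ have orthonormal columns, and $\Sigma_\ell\in\mathbb{R}^{\ell\times\ell}$ is a diagonal matrix with nonnegative diagonal entries arranged in decreasing order. The right-hand side of

$$Q_\ell Q_\ell^TA=Q_\ell W_\ell\Sigma_\ell\widetilde{V}_\ell^T \tag{15}$$

is an approximation of the SVD (13) of $A$. The decomposition (15) is much cheaper to compute than (13) when $m$ and $n$ are large and $1\le\ell\ll n\le m$. The following proposition provides bounds for the closeness of $A$ and $Q_\ell Q_\ell^TA$.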

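Proposition 1. Let $\ell=k+p$ and let the columns of $Q_\ell$ form an orthonormal basis for $\mathcal{R}(A\Omega_\ell)$. Then

$$\sigma_{\ell+1}\le\|A-Q_\ell Q_\ell^TA\|\le\left(1+9\sqrt{k+p}\,\sqrt{\min\{m,n\}}\right)\sigma_{k+1} \tag{16}$$

with probability not less than $1-3p^{-p}$.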
Proof The left-hand side inequality is a consequence of the Eckart and Young theorem [8], and the right-hand side inequality is shown by Halko et al. [17,Corollary 10.9].
We will comment below on the significance of the upper bound (16). In order for this bound to be small, we have to choose k large enough so that σ k+1 is small, and p large enough so that the right-hand side inequality (16) holds with high probability. Common choices of p are 5 or 10.
The computational cost for determining the matrices $Q_\ell$, $W_\ell$, $\Sigma_\ell$, and $\widetilde{V}_\ell$ in (15) consists of $O(m\ell^2)$ flops for determining $Q_\ell$ from $A\Omega_\ell$, $\ell$ matrix-vector product evaluations with $A$ (to form $A\Omega_\ell$) and $\ell$ matrix-vector product evaluations with $A^T$ (to form $B_\ell=Q_\ell^TA$), as well as $O(n\ell^2)$ flops for the computation of the SVD (14) of $B_\ell$. Some of these computations can be implemented efficiently by using high-level BLAS; see, e.g., [12] for a discussion on implementation issues.
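The construction of the factors in (15) can be sketched as follows for the case $m\ge n$. This is a minimal illustration under the assumption that $A$ is available as a dense array, so that the products with $A$ and $A^T$ can be formed as matrix-matrix products; the function name `randomized_svd` and its `rng` argument are hypothetical and not taken from [33].

```python
import numpy as np

def randomized_svd(A, ell, rng=None):
    """Sketch: factors Q, W, sigma, V with Q Q^T A = Q W diag(sigma) V^T.

    A is m x n with m >= n; ell is the number of random test vectors.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    Omega = rng.standard_normal((n, ell))      # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)             # orthonormal basis, m x ell
    B = Q.T @ A                                # small matrix, ell x n
    W, sigma, Vt = np.linalg.svd(B, full_matrices=False)
    return Q, W, sigma, Vt.T

# Usage: Q @ W @ np.diag(sigma) @ V.T approximates A when the singular
# values of A decay quickly relative to ell.
```

The two products `A @ Omega` and `Q.T @ A` account for the $2\ell$ matrix-vector product evaluations mentioned above; the remaining work involves only matrices with $\ell$ columns or rows.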
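We will use the decomposition (15) to determine an approximate solution of (3). Replacing $A$ by this decomposition in (3) gives

$$\min_{x\in\mathbb{R}^n}\left\{\|Q_\ell W_\ell\Sigma_\ell\widetilde{V}_\ell^Tx-b\|^2+\mu\|Lx\|^2\right\}, \tag{17}$$

which can be expressed as

$$\min_{x\in\mathbb{R}^n}\left\{\|W_\ell\Sigma_\ell\widetilde{V}_\ell^Tx-Q_\ell^Tb\|^2+\|(I-Q_\ell Q_\ell^T)b\|^2+\mu\|Lx\|^2\right\}. \tag{18}$$

Since we would like a solution $x$ of minimal Euclidean norm, it is natural to require the solution to be of the form $x=\widetilde{V}_\ell y$ for some $y\in\mathbb{R}^\ell$. Substitution into (18) gives the minimization problem

$$\min_{y\in\mathbb{R}^\ell}\left\{\|W_\ell\Sigma_\ell y-Q_\ell^Tb\|^2+\mu\|L\widetilde{V}_\ell y\|^2\right\}. \tag{19}$$

This problem differs from (3) in several ways: i) the matrix $A$ is replaced by $Q_\ell Q_\ell^TA$, ii) the term $\|(I-Q_\ell Q_\ell^T)b\|^2$ in (18) is ignored, and iii) the space $\mathbb{R}^n$ is replaced by $\mathcal{R}(\widetilde{V}_\ell)$. The differences between the minimization problems (3) and (19) are small if $p$ and $k$, and therefore $\ell=p+k$, in Proposition 1 are sufficiently large. In particular, $k$ has to be chosen large enough in relation to how quickly the singular values of $A$ decay to zero with increasing index.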
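The step from (17) to (18) rests on an orthogonal splitting of the residual. A short derivation is given below, under the assumption that the fidelity term of (17) is $\|Q_\ell Q_\ell^TAx-b\|^2$; the symbol $P=Q_\ell Q_\ell^T$ is introduced here only as shorthand for the orthogonal projector onto $\mathcal{R}(Q_\ell)$.

```latex
\begin{aligned}
\|PAx-b\|^2 &= \|P(Ax-b)-(I-P)b\|^2 \\
            &= \|P(Ax-b)\|^2+\|(I-P)b\|^2
               \qquad\text{since } P(I-P)=0 \text{ and } P=P^T \\
            &= \|Q_\ell^TAx-Q_\ell^Tb\|^2+\|(I-Q_\ell Q_\ell^T)b\|^2
               \qquad\text{since } \|Q_\ell w\|=\|w\| .
\end{aligned}
```

The second term does not depend on $x$, so it does not affect the minimizer, but it does contribute to the residual norm that enters the discrepancy principle.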
Let x denote the solution of (17) with μ > 0 chosen so that x satisfies the discrepancy principle (6). The discrepancy principle suggests that be chosen large enough so that where the parameter is the same as in (6), and 0 < η ≤ η is a user-chosen parameter with η the same as in (6). An accurate upper bound for A − Q Q T A can be determined with high probability by evaluating (A − Q Q T A)w for sufficiently many random vectors with normally distributed entries with zero mean; see [17,Eq. (4.3) and Lemma 4.1]. The evaluation of such a bound increases the computational effort required by the randomized method. Moreover, we would like to be large enough so that We illustrate in Sect. 4 that for some problems the parameters p and k in (16) have to be chosen too large to make the randomized solution method of this section competitive with the Krylov subspace method of Sect. 2.
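The probabilistic bound referred to above can be evaluated with a handful of extra matrix-vector products. The sketch below follows the a posteriori estimate of Halko et al. [17, Eq. (4.3)], which states that $\|(I-QQ^T)A\|\le 10\sqrt{2/\pi}\,\max_i\|(I-QQ^T)A\omega^{(i)}\|$ holds with probability at least $1-10^{-r}$ for $r$ independent standard Gaussian vectors $\omega^{(i)}$. The function name `estimate_projection_error` and the default `r=10` are illustrative assumptions.

```python
import numpy as np

def estimate_projection_error(A, Q, r=10, rng=None):
    """Sketch: probabilistic upper bound for ||(I - Q Q^T) A||_2.

    Uses r Gaussian probe vectors; the returned value bounds the spectral
    norm with probability at least 1 - 10**(-r) (Halko et al., Eq. (4.3)).
    """
    rng = np.random.default_rng() if rng is None else rng
    probes = rng.standard_normal((A.shape[1], r))
    Y = A @ probes
    Y -= Q @ (Q.T @ Y)                 # apply (I - Q Q^T) to each column
    return 10.0 * np.sqrt(2.0 / np.pi) * np.linalg.norm(Y, axis=0).max()
```

The estimate costs $r$ additional products with $A$, which is the extra computational effort mentioned in the text; it is typically quite pessimistic, since the constant $10\sqrt{2/\pi}\approx 8$ is chosen to make failure unlikely.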
There is a small probability that in a particular application of the randomized method, columns of the matrix $Q_\ell$ are singular vectors associated with "tiny" singular values of $A$. These singular vectors typically "oscillate" a lot, i.e., the vector entries as a function of their index number can be thought of as the discretization of a highly oscillatory function. The presence of such vectors in the solution subspace $\mathcal{R}(Q_\ell)$ typically would result in a highly oscillatory, and therefore undesired, approximate solution of (3). This phenomenon may be considered an instability of the randomized method. However, we hasten to add that we have not observed this instability in any of the numerous computed examples that we have carried out. The occurrence of this instability is, indeed, rare.
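When $L=I$, we have $\|L\widetilde{V}_\ell y\|=\|y\|$. Otherwise, we compute the QR factorization of $L\widetilde{V}_\ell$, whose factor with orthonormal columns is in $\mathbb{R}^{p\times\ell}$ and whose upper triangular factor is in $\mathbb{R}^{\ell\times\ell}$. When the triangular factor is of full rank and fairly well-conditioned, we proceed similarly as described at the end of Sect. 2. Now consider the case when, numerically, $\mathrm{rank}(A\Omega_\ell)<\ell$. This situation may arise when the matrix $A$ is of numerical rank less than $\ell$. Then we compute the SVD of the triangular factor $R_\ell$ in the QR factorization $A\Omega_\ell=Q_\ell R_\ell$; denote its left singular vectors by $u_{\ell,1},u_{\ell,2},\ldots,u_{\ell,\ell}$ and its singular values by $\sigma_{\ell,1}\ge\sigma_{\ell,2}\ge\cdots\ge\sigma_{\ell,\ell}$. Let $\sigma_{\ell,j}$ be the smallest numerically nonvanishing singular value. Then, numerically, the columns of $Q_{\ell,j}:=Q_\ell[u_{\ell,1},u_{\ell,2},\ldots,u_{\ell,j}]\in\mathbb{R}^{m\times j}$ form an orthonormal basis for $\mathcal{R}(A\Omega_\ell)$, and we replace the matrix $Q_\ell$ in (15), (17), and (19) by $Q_{\ell,j}$.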
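The treatment of the numerically rank-deficient case can be sketched as follows. The code assumes that the sampled matrix $A\Omega_\ell$ is passed in as the array `Y`; the relative threshold `tol` that decides which singular values count as numerically nonvanishing is an assumption made for this illustration, as is the function name `numerical_range_basis`.

```python
import numpy as np

def numerical_range_basis(Y, tol=1e-12):
    """Sketch: orthonormal basis for the numerical range of Y = A @ Omega.

    Computes Y = Q R, then the SVD of the small triangular factor R, and
    keeps the left singular vectors whose singular values exceed
    tol * (largest singular value). Assumes Y is not the zero matrix.
    """
    Q, R = np.linalg.qr(Y)
    U_small, s, _ = np.linalg.svd(R)
    j = int(np.sum(s > tol * s[0]))    # numerical rank of Y
    return Q @ U_small[:, :j]          # m x j, orthonormal columns
```

Because the SVD is applied only to the small triangular factor, the additional cost of this safeguard is negligible compared with forming $A\Omega_\ell$.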
We turn to the situation when m < n. Following Xiang and Zou [33], let ∈ R ×m , with 1 ≤ m, be a random matrix with the same kind of entries as above, and let the columns of Q ∈ R n× form an orthonormal basis for a linear space that contains R(( A) T ). We compute Q by evaluating the QR factorization of ( A) T . Then we calculate the SVD of AQ , where the matrices U ∈ R n× and W ∈ R × have orthonormal columns, and ∈ R × is a diagonal matrix with nonnegative diagonal entries arranged in decreasing order. The expression is an approximation of the SVD of A.
We determine an approximate solution of (3) by solving which, with x = Q W y, can be written as Finally, let R ∈ R × be the upper triangular matrix in a QR factorization of L Q W . Since, generally, R is nonsingular and not very ill-conditioned, we may transform the Tikhonov minimization problem (21) to standard from by the change of variables z = R y. We conclude this section with a discussion on the application of the discrepancy principle, and first consider the situation when m ≥ n. Then we solve (17) by computing the solution of (19). Assume that the error e in b is normally distributed with zero mean and variance 2 . Then, since Q Q T is an orthogonal projector, the variance of Q Q T e is m 2 . Therefore, when determining the regularization parameter μ for the problem (17), we replace by m in (6).
When m < n, we solve (20) by computing the solution of (21). Since U U T is an orthogonal projector, the variance of U U T e is n 2 . Therefore, when determining the regularization parameter μ for the minimization problem (21), we replace by n in (6). The value of = p + k affects both the quality of the computed solution and the computing time required. This value has to be large enough so that AQ Q T is a sufficiently accurate approximations of A.
The computed examples of the following section illustrate this.

Computed examples
The examples reported in this section show the performance of the methods discussed in the previous sections. All computations were carried out on a Windows computer with an i7-8750H @2.2 GHz CPU and 16 GB of memory. The implementations were done in MATLAB R2018b.
The noise level is defined by In all experiments, the regularization parameter was determined with the aid of the discrepancy principle and computed by Newton's method as outlined above with initial iterate μ 0 = 0. More details on the application of Newton's method can be found in [2]. This method also was used to determine the regularization parameter in the randomized SVD (RSVD) method in a similar way. We consider several examples described in Regularization Tools by Hansen [20] and in IR Tools by Gazzola et al. [13]. Both Regularization Tools and IR Tools are program packages written in MATLAB. Problems in both one and two space-dimensions will be discussed. We compare the performance of the Krylov method of Sect. 2 and the randomized SVD method of Sect. 3. The computed examples show that rapid decay of the singular values to zero with increasing index number is essential for the success of the randomized method. We will refer to the Krylov subspace-based Tikhonov regularization method of Sect. 2 as "K-Tikhonov", and to the Tikhonov regularization method based on the randomized SVD technique as "R-Tikhonov". Throughout this section, denotes the number of bidiagonalization steps carried out by the Golub-Kahan bidiagonalization method, as well as the number of columns of the random matrix when m ≥ n, or the number of rows of the random matrix when m < n; see Sect. 3.
Before comparing the two methods, we would like to discuss the computational cost of the K-Tikhonov and R-Tikhonov methods. Let us assume that n. Then for K-Tikhonov, the computational cost is dominated by the matrix-vector product evaluations with A and the matrix-vector product evaluations with A T . The R-Tikhonov method requires the evaluation of A and A T Q . The flop count for these evaluations in K-Tikhonov and R-Tikhonov is the same, and of order O(mn ), but the evaluations in R-Tikhonov can be implemented as matrix-matrix products, while this is not possible in K-Tikhonov, because in Golub-Kahan bidiagonalization the columns of the matrices V and U +1 are determined sequentially one-by-one. As we will illustrate in the following, when A is stored as a matrix, the matrix-matrix product evaluations in R-Tikhonov are faster than the matrix-vector product evaluations in K-Tikhonov. However, when A is not explicitly stored and it therefore is not possible to evaluate the matrix-matrix products A and A T Q efficiently, i.e., when we need to compute 2 matrix-vector products (one for each column of and Q ), the computing time required by K-Tikhonov and R-Tikhonov to evaluate the necessary matrix-vector products is almost identical. Shaw. We first consider the Shaw test problem described in [31]. It is an integral equation of the first kind with a smooth kernel in one space-dimension. MATLAB code that gives a discretization of this integral equation is provided in [20]. This code gives the matrix A ∈ R 2048×2048 and a vector x exact ∈ R 2048 , from which we compute b exact = Ax exact . We add a "noise vector" e ∈ R 2048 to b exact that models white Gaussian noise with noise level δ = 0.01 to obtain the "available" noise-contaminated data vector b; cf. (2). The moderate dimension of this problem allows us to compute the solution of the Tikhonov regularized problem also in the full space since it is possible to explicitly compute the SVD of the matrix A. 
We refer to the latter approach as "standard Tikhonov". The regularization matrix L used in this example is the discrete Laplacian in one space-dimension.
We apply the K-Tikhonov and R-Tikhonov methods for different dimensions of the solution subspace and compare the results obtained with those obtained with standard Tikhonov, i.e., the solution of (3). In particular, we are interested in comparing the quality of the computed approximations of $x_{\rm exact}$ determined by the different methods, as well as in the timings. Let $x$ denote an approximate solution computed by one of the methods considered. We define the relative reconstruction error

$$\mathrm{RRE}(x)=\frac{\|x-x_{\rm exact}\|}{\|x_{\rm exact}\|}.$$

Figure 1 reports RREs and timings for the K-Tikhonov and R-Tikhonov methods. The horizontal axes in the subfigures show the dimension, $\ell$, of the solution subspaces for the K-Tikhonov and R-Tikhonov methods. We can observe that the RRE for R-Tikhonov is slightly smaller than for K-Tikhonov for $\ell>5$. Moreover, the CPU time required for the computation of the solution with R-Tikhonov is smaller than the CPU time needed for the computation of the solution with K-Tikhonov for solution subspaces of the same dimension. Finally, we note that the RRE obtained with K-Tikhonov rapidly converges to the RRE of the solution determined by standard Tikhonov, while the RRE for the solutions determined by R-Tikhonov typically is smaller. The difference in the quality of the computed solutions is made possible by the facts that the computed solutions are determined by the discrepancy principle and live in different solution subspaces.

Because the singular values of $A$ decay to zero extremely fast with increasing index $\ell$, the matrix $Q_\ell Q_\ell^TA$ approximates $A$ well; the approximation error is very close to the optimal one, i.e., to $\sigma_{\ell+1}$. Figure 2b shows that the discrepancy principle already can be satisfied in a subspace of fairly small dimension. We remark that, although the function $\ell\to\|(I-Q_\ell Q_\ell^T)b\|$ appears to be constant for $\ell$ large enough, this function is decreasing very slowly as $\ell$ increases and vanishes for $\ell=n$.

Heat. We consider the heat test problem in [20]. It is described in [4].
This problem models inverse heat conduction in one space-dimension. We use MATLAB code supplied in [20]. This code requires a parameter κ, which is set to the default value 1. Discretization gives a problem (1) with a matrix A ∈ R 2048×2048 and a vector x exact , from which we compute the exact data vector b exact = Ax exact . We add an error vector e that models white Gaussian noise and corresponds to a noise level of δ = 0.02 to  (1). Similarly as in the previous example, we let L be the discrete Laplacian in one space-dimension, and we display the RRE and CPU times for K-Tikhonov and R-Tikhonov. Figure 3 shows the RRE and CPU times for several -values. Similarly as in the Shaw example, the computing times required for the computation of the R-Tikhonov solutions are much smaller than the times required for K-Tikhonov for solution subspaces of the same dimension. For the present example, the RRE-values obtained with R-Tikhonov are slightly larger than those obtained with K-Tikhonov for solution subspaces of the same dimension, at least for < 23. However, the RREs are of the same order of magnitude for both methods for solution subspaces of the same dimension. Finally, we observe that, differently from the previous example, the RRE obtained with the K-Tikhonov method is not the same, for large, as for standard Tikhonov. This is due to the fact that the discrepancy principle does not determine a unique solution; the computed solution depends on the solution subspace used.
Like in the previous example, we report the singular values σ +1 of A and the quantities ( Fig. 4. In the present example, the singular values of A decrease to zero slower than in the previous example. Although the matrix Q Q T A is still a good approximation of A, the approximation error is visibly larger than the optimal one, given by σ +1 . This is illustrated by Fig. 4b. We observe that in order to satisfy the discrepancy principle, the parameter has to be larger than in the previous example. Phillips. Our last example in one space-dimension is the phillips test problem from [20], where MATLAB code is available. This code provides a discretization of a convolution. A background for this problem is given in [28]. We generate the matrix A and the noise-contaminated vector b in the same manner as in the previous examples. Thus, the noise is white Gaussian and corresponds to the noise level δ = 0.05. The regularization matrix L is the discrete Laplacian in one space-dimension. We report the results obtained with the K-Tikhonov and R-Tikhonov methods in Fig. 5. These results are similar to the ones obtained for the Shaw test problem. Thus, R-Tikhonov outperforms K-Tikhonov in terms of timings and RRE (for sufficiently large). Similarly to the Shaw test problem, the solutions computed with K-Tikhonov give the same RRE as the solutions computed by standard Tikhonov already for small subspace dimensions , while the solution computed by R-Tikhonov provide a better approximation of x exact .
As above we display the singular values σ +1 of A and the norms (I − Q Q T )A , (I − Q Q T )A x exact , and (I − Q Q T )b as functions of in Fig. 6. This example behaves like the Heat example and we therefore can draw the same conclusions. Nevertheless, let us observe that the decay of the singular values, even though it is slower than in the Shaw example, is still fast enough to yield a fairly accurate approximation of A using the randomized method.
The R-Tikhonov method performs well in all the above examples, even though the matrices $A$ in these examples have different properties. The matrices in the Shaw and Phillips examples are symmetric, while in the Heat example, the matrix is very far from symmetric. The singular values of the matrix decay to zero quite quickly with increasing index in the Shaw and Heat test problems, while they do not for the Phillips problem.
The above examples are discretizations of problems in one space-dimension. We now turn to problems that are discretizations of ill-posed problems in two space-dimensions. The relative performance of the methods in our comparison will be seen to be different for this kind of problem. Blur. We determine the matrix $A$ with the MATLAB function blur(45,8,1) from [20]. This function call generates a symmetric block-Toeplitz-Toeplitz-block (BTTB) matrix $A\in\mathbb{R}^{2025\times 2025}$, which models a Gaussian point spread function in two space-dimensions. The parameter value 8 is the half-bandwidth of the Toeplitz blocks. Thus, the matrix $A$ is very sparse. It is stored in sparse matrix format. The regularization matrix $L$ is the discrete Laplacian in two space-dimensions. Let the entries of $x_{\rm exact}$ be pixel values of a $45\times 45$-pixel image, with the pixels ordered column-wise. Then $b_{\rm exact}=Ax_{\rm exact}$ represents a blurred image associated with $x_{\rm exact}$. Add a vector $e$ that represents white Gaussian noise with noise level $\delta=0.03$ to $b_{\rm exact}$ to obtain the contaminated data vector $b$; cf. (2).
We compare RREs and CPU times for different values of similarly as in the previous examples. The results are reported in Fig. 7. The figure shows that the R-Tikhonov method does not perform well in terms of the RRE. In fact, the RRE is very large and does not decrease significantly as the dimension of the solution subspace, , increases. Moreover, since the matrix A is very sparse, the matrix-vector products required by the K-Tikhonov method are not computational demanding. Therefore, the computational cost of the two methods is about the same.  The relatively poor performance of the R-Tikhonov method in this example is due to the fact that the singular values of the matrix A decrease fairly slowly to zero with increasing index and are not approximated well by the singular values of the reduced matrix AQ 30 Q T 30 used to compute the R-Tikhonov solution. To see this in more detail, we plot in Fig. 8 30 match very well the largest singular values of A. This suggests that the R-Tikhonov method may not be effective for the solution of linear discrete illposed problems with a matrix A whose singular values decay fairly slowly with their index number. To illustrate this, we choose the solution subspace for the R-Tikhonov method to be = 1000, and compare the error in the computed solution with the errors in the solution determined by the K-Tikhonov method with = 30 and by standard Tikhonov. This comparison is reported in Table 1. We can see that even when = 1000, the R-Tikhonov method provides less accurate results than K-Tikhonov with = 30, and requires much more execution time (about 45 times as much).
These observations are corroborated by Fig. 9, which shows the singular values σ +1 of A and compares them to (I − Q Q T )A as functions of . The figure also  Fig. 9a that since the singular values of A do not decay fast enough to zero with increasing index , the approximation error (I − Q Q T )A is extremely large even for = 100. Moreover, we can see that the discrepancy principle cannot be satisfied for ≤ 100. By visual inspection of Fig. 9b, we can deduce that a very large value of may be required to satisfy the discrepancy principle. Hubble. We turn to a deblurring problem from [13]. Specifically, we consider the deblurring problem obtained when the available image is blurred by a medium speckle PSF and, in addition, is contaminated by 5% white Gaussian noise. The size of the image is 512 × 512 pixels; see Fig. 10. We impose periodic boundary conditions. Then the blurring matrix A ∈ R n×n , with n = 512 2 , is block circulant with circulant blocks (BCCB). Thus, A can be diagonalized by the bidimensional Fourier matrix. We can compute the eigenvalues of A in O(n log n) flops with the aid of the fast Fourier transform (FFT) algorithm; see, e.g., [21] for a discussion on image deblurring and boundary conditions. By choosing L as the discretization of the bidimensional Laplacian with periodic boundary conditions, we can solve (3) in O(n log n) flops for each value of the regularization parameter μ. This allows us to compute the solution Similarly as above, we apply both the R-Tikhonov and K-Tikhonov methods for different values of , and compare results in terms of CPU time and accuracy. The matrix A is not explicitly formed; instead we evaluate matrix-vector products with A and A T by using the FFT algorithm. Hence, in R-Tikhonov the matrices A and A T Q are computed by evaluating matrix-vector products with A and matrixvector products with A T . We therefore expect the computing time for R-Tikhonov and K-Tikhonov to be about the same. 
This is confirmed by the graphs of Fig. 11b. On the other hand, we can see from Fig. 11a and d that the R-Tikhonov method fails to accurately determine the largest singular values of A, and that the restored image determined by R-Tikhonov is of very poor quality; see Fig. 12b. This is due to the fact that, as we can see in Fig. 11c, the singular values of A do not decrease very fast to zero. Figure 12 displays the reconstructed images obtained with standard Tikhonov, R-Tikhonov, and K-Tikhonov. Visual inspection shows that K-Tikhonov is able to provide a reconstruction of similar quality as standard Tikhonov, while R-Tikhonov fails to determine an accurate approximation of x exact .

Conclusion
The application of randomized algorithms to the solution of large-scale problems has received considerable attention. This paper compares their performance with a Krylov subspace method when applied to the solution of linear discrete ill-posed problems by Tikhonov regularization. The singular values of linear discrete ill-posed problems "cluster" at the origin; however, their rate of decay towards zero with increasing index is problem dependent. The randomized method is found to be competitive for the solution of linear discrete ill-posed problems in one space-dimension, for which the singular values decay to zero fast enough with increasing index. However, when the singular values do not decrease quickly enough, the Krylov method considered outperforms the randomized method. The reason is that Krylov methods determine more appropriate solution subspaces of low dimension for linear discrete ill-posed problems than the randomized method when the singular values do not decay to zero sufficiently rapidly.
We only consider one Krylov subspace method, Golub-Kahan bidiagonalization, in this paper. However, our conclusions carry over to other Krylov subspace solution methods, such as the Arnoldi method, at least when the matrix $A$ is not too far from symmetric. When $A$ is far from symmetric, solution methods for discrete ill-posed problems based on the Arnoldi process are known not to provide satisfactory results; see, e.g., [6,14] for an illustration.