Sampling and Reconstruction in Distinct Subspaces Using Oblique Projections

We study reconstruction operators on a Hilbert space that are exact on a given reconstruction subspace. Among these, the reconstruction operator obtained by a least squares fit has the smallest operator norm and is therefore most stable with respect to noisy measurements. We then construct the operator with the smallest possible quasi-optimality constant, which is the most stable with respect to a systematic error appearing before the sampling process (model uncertainty). We describe how to vary continuously between the two reconstruction methods, so that stability can be traded for quasi-optimality. As an application we study the reconstruction of a compactly supported function from nonuniform samples of its Fourier transform.


Introduction
1.1. The reconstruction problem. In this paper we treat the following sampling problem. Let $\mathcal H$ be a separable Hilbert space over $\mathbb C$ with inner product $\langle\cdot,\cdot\rangle_{\mathcal H}$. We assume that we are given linear measurements $(\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$, $u_j \in \mathcal H$, of an unknown function $f \in \mathcal H$. We call $(u_j)_{j\in\mathbb N}$ the sampling frame and $\mathcal U := \operatorname{span}(u_j)_{j\in\mathbb N}$ the sampling space. Our goal is to approximate the function $f$ by an element in the reconstruction space $\mathcal T := \operatorname{span}(t_k)_{k\in\mathbb N}$ with $t_k \in \mathcal H$, by a series expansion $\tilde f = \sum_{k\in\mathbb N} c_k t_k$ computed from the given measurements. The main point is that in general the reconstruction space is distinct from the sampling space, whereas in classical frame theory these two spaces coincide.
1.2. Areas of application and related work. This type of sampling problem arises in many concrete applications and in the numerical modelling of infinite dimensional problems.
(i) Sampling of bandlimited functions. In [25] a bandlimited function is approximated from finitely many, nonuniform samples by means of a trigonometric polynomial. In this case the sampling space consists of the reproducing kernels $u_j(x) = \frac{\sin \pi(x-x_j)}{\pi(x-x_j)}$, $j = 1, \dots, n$, and the reconstruction vectors are $t_k(x) = e^{2\pi i k x/(2M+1)} \chi_{[-M,M]}(x)$, $|k| \le M$.
(ii) Inverse Polynomial Reconstruction Method. In this method one tries to approximate an algebraic polynomial or an analytic function from its Fourier samples. Thus the sampling space consists of the vectors $u_j(x) = e^{\pi i j x} \chi_{[-1,1]}(x)$, $j = 1, \dots, m$, and the reconstruction space consists of a suitable polynomial basis, usually the monomials $t_k(x) = x^k$, $k = 0, \dots, n$, or the Legendre polynomials. This method claims to efficiently mitigate the Gibbs phenomenon [28-30, 38], and, indeed, the modified inverse polynomial reconstruction method [27] leads to a numerically stable reconstruction when $m \gtrsim n^2$.

[Funding by the Austrian Science Fund (FWF) through grant NFN SISE (S10602) and P26273-N25, and by the Vienna Science and Technology Fund (WWTF) through project VRG12-009 and project ICT15-119.]
(iii) Fourier sampling. More generally, the goal is to approximate a compactly supported function in some smoothness class from its nonuniform Fourier samples $\hat f(\omega_j)$. Thus the sampling space consists again of the functions $u_j(x) = e^{\pi i \omega_j x} \chi_{[-1,1]}(x)$. The reconstruction space depends on the signal model and on a priori information. If $f$ is smooth and belongs to a Besov space, then the reconstruction space may be taken to be a wavelet subspace. The problems of Fourier sampling have motivated Adcock and Hansen to revisit nonuniform sampling theory and to create the impressive and useful framework of generalized sampling [6-8, 10, 33].
(iv) Model reduction in parametric partial differential equations and the generalized empirical interpolation method. In general the solution manifold of a parametric partial differential equation is quite complicated; therefore it is approximated by finite-dimensional spaces $\mathcal T_n$. The Generalized Empirical Interpolation Method (GEIM) [34, 35] builds an interpolant in an $n$-dimensional space $\mathcal T_n$ based on the knowledge of $n$ physical measurements $(\langle f, u_j\rangle_{\mathcal H})_{j=1}^n$. In [13, 36] an extension based on a least squares method has been proposed, where the dimension $m$ of $\mathcal T_m$ is smaller than the number $n$ of the measurements $(\langle f, u_j\rangle_{\mathcal H})_{j=1}^n$. A further generalization to Banach spaces is contained in [18]. The focus in [13, 36] lies in minimizing the error caused by the model mismatch. This is done by using a correction term outside of the reconstruction space, which means that (in contrast to our work) the reconstruction is allowed to lie outside of the reconstruction space. This approach is optimal in the absence of measurement noise [13].
In all these problems the canonical approximation or reconstruction is by means of a least squares fit, namely
$$\tilde f = \arg\min_{g\in\mathcal T} \sum_{j\in\mathbb N} w_j\, \big|\langle f, u_j\rangle_{\mathcal H} - \langle g, u_j\rangle_{\mathcal H}\big|^2 . \qquad (1)$$
The weights $w_j$ are usually chosen to be $w_j = 1$, but in many contexts it has turned out to be useful to use weights as a kind of cheap preconditioner. The use of adaptive weights in sampling theory goes back at least to [23, 24], and has become standard in the recent work on (Fourier) sampling, see for example [1-5, 11, 25, 26, 40].
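In finite dimensions the weighted least squares fit (1) reduces to a small linear algebra problem. The following NumPy sketch uses hypothetical random Gaussian data (not taken from the paper) and the trivial weights $w_j = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, N = 12, 4, 20                               # samples, dim(T), ambient dim
Ustar = rng.standard_normal((n, N))               # rows: sampling vectors u_j
T = np.linalg.qr(rng.standard_normal((N, m)))[0]  # columns: basis t_k of T
f = T @ rng.standard_normal(m)                    # a target that lies in T
d = Ustar @ f                                     # measurements <f, u_j>
w = np.ones(n)                                    # adaptive weights (trivial here)
# weighted least squares: minimize sum_j w_j |<g, u_j> - d_j|^2 over g = T c
c, *_ = np.linalg.lstsq(np.sqrt(w)[:, None] * (Ustar @ T),
                        np.sqrt(w) * d, rcond=None)
f_tilde = T @ c
```

Since this target lies in $\mathcal T$ and there are more samples than reconstruction vectors, the fit recovers it exactly (up to rounding).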
1.3. The reconstruction operators. In this paper we restrict ourselves to the case where the approximation $\tilde f = \sum_{k\in\mathbb N} c_k t_k$ of the unknown function $f \in \mathcal H$ is obtained by a linear and bounded reconstruction operator $Q : \ell^2(\mathbb N) \to \mathcal T$. Thus the approximation $\tilde f$ from the data $(\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$ is given by $\tilde f = Q\big((\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}\big)$. We will use two quantities to measure the quality of such a reconstruction operator. As a measure of stability with respect to measurement noise we use the operator norm $\|Q\|_{op}$. As a measure of stability with respect to model mismatch we follow [9] and use the so-called quasi-optimality constant $\mu(Q)$ (see Definition 2.1).
Let $P_{\mathcal T}$ denote the orthogonal projection onto $\mathcal T$, $f \in \mathcal H$ the target function, and $l \in \ell^2(\mathbb N)$ the noise vector. Then the input data are given by the sequence $(\langle f, u_j\rangle_{\mathcal H} + l_j)_{j\in\mathbb N}$, the reconstruction is $\tilde f = Q\big((\langle f, u_j\rangle_{\mathcal H} + l_j)_{j\in\mathbb N}\big)$, and the error is bounded by
$$\|f - \tilde f\|_{\mathcal H} \;\le\; \mu(Q)\, \|f - P_{\mathcal T} f\|_{\mathcal H} + \|Q\|_{op}\, \|l\|_2 . \qquad (2)$$
1.4. Contributions. The error bound (2) raises several questions:
• Which operators admit an error bound of the form (2)?
• Under what circumstances does such an operator exist?
• Which operator has the smallest possible operator norm Q op ?
• Which operator has the smallest possible quasi-optimality constant µ(Q)?
• Is there a way to trade-off between quasi-optimality and operator norm?
Our objective is to answer these questions in both finite-dimensional and infinite-dimensional Hilbert spaces. The results can be formulated conveniently in the language of frame theory.
(i) Characterization of all reconstruction operators. We characterize all reconstruction operators that admit an error estimate of the form (2). In fact, every dual frame of the sequence $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ yields a reconstruction satisfying (2). Conversely, every reconstruction operator subject to (2) is the synthesis operator of a dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$. Note that (2) implies that such a reconstruction operator $Q$ is exact on the reconstruction space, i.e., $f = Q\big((\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}\big)$ for all $f \in \mathcal T$. Reconstruction operators fulfilling this property are called perfect. For a precise formulation see Theorem 2.2.
The important insight of [9] is the connection between stability and the angle $\varphi_{\mathcal T,\mathcal U}$ between the sampling space and the reconstruction space. We will see that a perfect reconstruction operator exists if and only if $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. It should also be mentioned that the reconstruction operators considered in this paper are a special case of pseudoframes [32].
(ii) Least squares approximation. As already mentioned, the canonical approximation of the data $(\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$ by a vector in $\mathcal T$ is by a least squares fit. Let $U^* f = (\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$ denote the analysis operator of the frame $(u_j)_{j\in\mathbb N}$ and let $d \in \ell^2(\mathbb N)$ denote the vector containing the noisy measurements $(d_j)_{j\in\mathbb N} = (\langle f, u_j\rangle_{\mathcal H} + l_j)_{j\in\mathbb N} = U^* f + l$. Let the reconstruction operator $Q_1$ be defined by the least squares fit
$$Q_1 d = \arg\min_{g\in\mathcal T} \|U^* g - d\|_2 . \qquad (3)$$
It is folklore that the least squares solution (3) is optimal in the absence of additional information on $f$. Precise formulations of this optimality were proven in [9, Theorem 6.2] (including even non-linear reconstructions) and in [12] (in abstract Hilbert space). We will show in addition (Theorem 3.1) that $Q_1$ is the synthesis operator of the canonical dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$. Using this property we derive a simple proof of the statement that the operator $Q_1$ has the smallest possible operator norm among all perfect reconstruction operators.
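In finite dimensions the operator $Q_1$ has the closed form $Q_1 = T(U^*T)^\dagger$ used later in Theorem 3.1. A minimal NumPy sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, N = 12, 4, 20
Ustar = rng.standard_normal((n, N))               # analysis operator U*
T = np.linalg.qr(rng.standard_normal((N, m)))[0]  # orthonormal basis of T
Q1 = T @ np.linalg.pinv(Ustar @ T)                # Q_1 = T (U* T)^dagger
f = rng.standard_normal(N)                        # generic target, not in T
f1 = Q1 @ (Ustar @ f)                             # least squares reconstruction
g = T @ rng.standard_normal(m)                    # target inside T
```

The reconstruction $f_1$ always lies in $\mathcal T$, and for targets inside $\mathcal T$ the operator is perfect.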
(iii) Minimizing the quasi-optimality constant. Let $W = G^{\dagger/2} := (G^\dagger)^{1/2}$ be the square root of the Moore-Penrose pseudoinverse of the Gramian $G = U^* U$ of the sampling frame, and consider the operator $Q_0$ defined by the weighted least squares fit
$$Q_0 d = \arg\min_{g\in\mathcal T} \big\|G^{\dagger/2}\,(U^* g - d)\big\|_2 . \qquad (4)$$
We will show (Theorem 3.3) that $Q_0$ has the smallest possible quasi-optimality constant. The reduction of the quasi-optimality constant is one of the motivations for weighted least squares, see [1-3, 5]. In (4) we go a step further and use the non-diagonal matrix $W = G^{\dagger/2}$ as a weight for the least squares problem. From the point of view of linear algebra, $W$ may be seen as a preconditioner.
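A finite-dimensional sketch of (4): the weight $W = G^{\dagger/2}$ is formed from an eigendecomposition of the Gramian. The data are hypothetical and the helper `pinv_sqrt` is an ad-hoc illustration, not part of the paper:

```python
import numpy as np

def pinv_sqrt(G, tol=1e-10):
    # (G^dagger)^{1/2} for a symmetric positive semidefinite matrix
    w, V = np.linalg.eigh(G)
    inv_sqrt = np.where(w > tol * w.max(),
                        1.0 / np.sqrt(np.maximum(w, tol * w.max())), 0.0)
    return (V * inv_sqrt) @ V.T

rng = np.random.default_rng(2)
n, m, N = 12, 4, 20
Ustar = rng.standard_normal((n, N))
T = np.linalg.qr(rng.standard_normal((N, m)))[0]
G = Ustar @ Ustar.T                  # Gramian G = U*U  (n x n), here invertible
W = pinv_sqrt(G)                     # preconditioner W = G^{dagger/2}
g = T @ rng.standard_normal(m)       # target in T
c, *_ = np.linalg.lstsq(W @ (Ustar @ T), W @ (Ustar @ g), rcond=None)
f0 = T @ c                           # Q_0 reconstruction
```

Since $W$ is invertible here, the weighted fit is still perfect on $\mathcal T$.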
In [1-3, 5] and also [4, 11, 23-26] the stability with respect to a bias in the measured object is considered, i.e., the reconstruction from $U^*(f + \Delta f) = (\langle f + \Delta f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$ (stated in terms of a frame inequality in the latter). In this context, $Q_0$ is the most stable operator with respect to biased objects, see the discussion in Section 3.6.
(iv) Trading stability and quasi-optimality. It is natural to ask whether one can mix the two least squares problems (3) and (4). For $\lambda \in [0,1]$ let $\Sigma_\lambda = \lambda I + (1-\lambda) U^* U$ and define $Q_\lambda$ by
$$Q_\lambda d = \arg\min_{g\in\mathcal T} \big\|\Sigma_\lambda^{-1/2}\,(U^* g - d)\big\|_2 .$$
These reconstruction operators "interpolate" between $Q_1$ (most stable with respect to noise) and $Q_0$ (most stable with respect to model uncertainty). The parameter $\lambda$ can be seen as a regularization parameter, or alternatively the matrix $\Sigma_\lambda$ as a version of the adaptive weights in sampling. In Theorem 3.7 and Lemma 3.8 we will study this class of reconstruction operators and derive several representations of $Q_\lambda$.
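The whole family $Q_\lambda$ can be sketched at once in finite dimensions (hypothetical random data; the solve uses the normal equations of the weighted problem). Every $Q_\lambda$ is perfect on $\mathcal T$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, N = 12, 4, 20
Ustar = rng.standard_normal((n, N))
T = np.linalg.qr(rng.standard_normal((N, m)))[0]
G = Ustar @ Ustar.T                     # Gramian, invertible in this setup

def Q_lambda(lam, d):
    # solve min_c || Sigma_lambda^{-1/2} (U*T c - d) ||_2 via normal equations
    Sigma = lam * np.eye(n) + (1 - lam) * G
    A = Ustar @ T
    c = np.linalg.solve(A.T @ np.linalg.solve(Sigma, A),
                        A.T @ np.linalg.solve(Sigma, d))
    return T @ c

g = T @ rng.standard_normal(m)          # target inside T
recs = [Q_lambda(lam, Ustar @ g) for lam in (0.0, 0.5, 1.0)]
```

For $\lambda = 1$ this reduces to (3) and for $\lambda = 0$ (with invertible $G$) to (4).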
(v) Fourier resampling -numerical experiments. In the last part we carry out a numerical comparison of the various reconstruction operators on the basis of the so-called resampling problem. We approximate a function with compact support from finitely many, nonuniform samples of its Fourier transform and then resample the Fourier transform on a regular grid. For this problem we test the performance of the reconstruction operators Q λ .
The paper is organized as follows: In Section 2 we introduce the frame theoretic background, discuss the angle between subspaces, and characterize all reconstruction operators satisfying the required stability estimate (2). In Section 3 we study the various least squares problems (3) and (4) and analyze several representations of the corresponding reconstruction operators. The section is complemented by general numerical considerations. Section 4 covers the numerical experiments on Fourier sampling. The brief appendix collects some standard facts about frames.

Classification of all reconstruction operators
We will use the language of frame theory throughout the whole paper. The Appendix contains a short list of basic definitions and well-known facts from frame theory. For more details on this topic see, for instance, [15]. Let us introduce some notation. To every set of measurement vectors $(u_j)_{j\in\mathbb N}$ in a Hilbert space $\mathcal H$ (of finite or infinite dimension) we associate the synthesis operator $U$ defined formally by $Uc = \sum_{j\in\mathbb N} c_j u_j$ and the sampling space $\mathcal U = \operatorname{span}(u_j)_{j\in\mathbb N}$. The adjoint operator $U^*$ produces the measurements $U^* f = (\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}$ and is called the analysis operator. The frame operator is $S = UU^*$ and the Gramian is $G = U^* U$. With this notation, $(u_j)_{j\in\mathbb N}$ is a frame for $\mathcal U = \operatorname{span}(u_j)_{j\in\mathbb N}$ if there exist constants $A, B > 0$ such that for every $f \in \mathcal U$
$$A\, \|f\|_{\mathcal H}^2 \;\le\; \sum_{j\in\mathbb N} |\langle f, u_j\rangle_{\mathcal H}|^2 \;\le\; B\, \|f\|_{\mathcal H}^2 .$$
We always assume that $(u_j)_{j\in\mathbb N}$ is a frame for $\mathcal U$; thus $U^*$ is bounded from $\mathcal H$ to $\ell^2(\mathbb N)$ and $U^*$ has closed range in $\ell^2(\mathbb N)$. We use $R(A)$ for the range of an operator $A$ and $N(A)$ for its kernel (null space).
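In finite dimensions all of these objects are plain matrices. A small NumPy illustration with a hypothetical random frame (here $(u_j)$ spans all of $\mathbb R^6$, so every $f$ lies in $\mathcal U$ and the optimal frame bounds are the extreme eigenvalues of $S$):

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 6, 10                      # ambient dimension, number of frame vectors
U = rng.standard_normal((N, n))   # synthesis operator: columns are the u_j
S = U @ U.T                       # frame operator  S = U U*   (N x N)
G = U.T @ U                       # Gramian         G = U* U   (n x n)
f = rng.standard_normal(N)
coeffs = U.T @ f                  # analysis operator: (<f, u_j>)_j
eigs = np.linalg.eigvalsh(S)      # ascending eigenvalues of S
A, B = eigs.min(), eigs.max()     # optimal frame bounds
```

Note that $S$ and $G$ share their nonzero eigenvalues, a fact used repeatedly below.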
Likewise we assume that $(t_k)_{k\in\mathbb N}$ is a frame for the reconstruction space $\mathcal T = \operatorname{span}(t_k)_{k\in\mathbb N}$ with synthesis operator $T$ and analysis operator $T^*$.

Given a sequence of linear measurements $(\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N} = U^* f$, we try to find an approximation of $f$ in the subspace $\mathcal T$. Assuming that all occurring operators are bounded, we investigate the class of reconstruction operators $Q : \ell^2 \to \mathcal T$ such that $\tilde f = QU^* f$ is the desired reconstruction or approximation of $f$. We use two metrics to quantify the stability of a reconstruction operator $Q : \ell^2(\mathbb N) \to \mathcal T$. As a measure of stability with respect to measurement noise we use the operator norm $\|Q\|_{op}$. In order to measure how well $Q$ deals with the part of the function lying outside of the reconstruction space, we use the quasi-optimality constant from [9].

Definition 2.1. Let $Q : \ell^2(\mathbb N) \to \mathcal T$ and let $P_{\mathcal T}$ be the orthogonal projection onto $\mathcal T$. The quasi-optimality constant $\mu = \mu(Q) > 0$ is the smallest number $\mu$ such that
$$\|f - QU^* f\|_{\mathcal H} \;\le\; \mu\, \|f - P_{\mathcal T} f\|_{\mathcal H} \qquad \text{for all } f \in \mathcal H .$$
If $\mu(Q) < \infty$ we call $Q$ a quasi-optimal operator. Since $P_{\mathcal T} f$ is the element of $\mathcal T$ closest to $f$, the quasi-optimality constant $\mu$ is a measure of how well $QU^*$ performs in comparison to the orthogonal projection $P_{\mathcal T}$. Note that for $f \in \mathcal T$ we have $QU^* f = f$; thus a quasi-optimal reconstruction operator is perfect.
The following theorem characterizes all bounded quasi-optimal operators.
Theorem 2.2. Let T and U be closed subspaces of H, and (u j ) j∈N be a Bessel sequence spanning the closed subspace U. For an operator Q : ℓ 2 (N) → T the following are equivalent.
(i) There exist constants $0 \le \mu, \beta < \infty$ such that for all $f \in \mathcal H$ and $l \in \ell^2(\mathbb N)$
$$\|f - Q(U^* f + l)\|_{\mathcal H} \;\le\; \mu\, \|f - P_{\mathcal T} f\|_{\mathcal H} + \beta\, \|l\|_2 . \qquad (5)$$
(ii) $QU^* g = g$ for $g \in \mathcal T$ and $Q$ is a bounded operator.

(iii) The sequence $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ is a frame for $\mathcal T$, and there is a dual frame $(h_j)_{j\in\mathbb N} \subset \mathcal T$ of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ such that $Q$ is of the form
$$Qc = \sum_{j\in\mathbb N} c_j h_j ,$$
i.e., $Q$ is the synthesis operator of some dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$.
(iv) The operator $Q$ is bounded and $QU^*$ is a bounded oblique projection onto $\mathcal T$.
Theorem 2.2 sets up a bijection between the class of reconstruction operators and the class of all dual frames of (P T u j ) j∈N .
To prove Theorem 2.2, we need the concept of subspace angles. Among the many different definitions of the angle between subspaces (see [39, 41]), the following is most suitable for our analysis.

Definition 2.3. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a Hilbert space $\mathcal H$. The subspace angle $\varphi_{\mathcal T,\mathcal U} \in [0, \frac{\pi}{2}]$ between $\mathcal T$ and $\mathcal U$ is defined by
$$\cos(\varphi_{\mathcal T,\mathcal U}) = \inf_{t\in\mathcal T,\, \|t\|_{\mathcal H}=1} \|P_{\mathcal U}\, t\|_{\mathcal H} .$$
We observe that in general $\cos(\varphi_{\mathcal T,\mathcal U}) \ne \cos(\varphi_{\mathcal U,\mathcal T})$. For $\mathcal T \subseteq \mathcal U$ we have $\cos(\varphi_{\mathcal T,\mathcal U}) = 1$ and therefore $\varphi_{\mathcal T,\mathcal U} = 0$. If $\mathcal U \perp \mathcal T$, then $\cos(\varphi_{\mathcal T,\mathcal U}) = 0$ and $\varphi_{\mathcal T,\mathcal U} = \frac{\pi}{2}$. The following lemma collects the main properties of oblique projections and angles between subspaces.
for all f ∈ H 1 . The upper bound in (7) is sharp.
Proof of Theorem 2.2. (i) ⇒ (ii). Set $l = 0$ and choose $f \in \mathcal T$. Then (5) yields $\|f - QU^* f\|_{\mathcal H} \le \mu\, \|f - P_{\mathcal T} f\|_{\mathcal H} = 0$, hence $QU^* f = f$; setting $f = 0$ in (5) shows that $Q$ is bounded with $\|Q\|_{op} \le \beta$.
(ii) ⇒ (iii). Let $(e_j)_{j\in\mathbb N}$ be the standard basis of $\ell^2(\mathbb N)$ and let $h_j = Qe_j$. Then $Qc = \sum_{j\in\mathbb N} c_j h_j$. In particular, for $g \in \mathcal T$,
$$g = QU^* g = \sum_{j\in\mathbb N} \langle g, u_j\rangle_{\mathcal H}\, h_j = \sum_{j\in\mathbb N} \langle g, P_{\mathcal T}\, u_j\rangle_{\mathcal H}\, h_j ,$$
so $(h_j)_{j\in\mathbb N}$ is a dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$.
(iii) ⇒ (iv). Let $(h_j)_{j\in\mathbb N} \subset \mathcal T$ be a dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$, define $Q$ by $Qc = \sum_{j\in\mathbb N} c_j h_j$, and set $P := QU^*$. Since the range of $P$ is contained in $\mathcal T$ and $QU^* g = g$ for $g \in \mathcal T$, it follows that $P$ maps onto $\mathcal T$ and that $P^2 = QU^* QU^* = QU^* = P$. Since both $Q$ and $U^*$ are bounded, $P$ is bounded.
(iv) ⇒ (i). Let Q be a bounded operator, and let P := QU * be a bounded oblique projection onto T .
This finishes the proof. As a direct consequence of Theorem 2.2 and Lemma 2.4, (iii), we obtain the following characterization of the quasi-optimality constant.
Corollary 2.5. If $Q : \ell^2(\mathbb N) \to \mathcal T$ is a bounded and perfect reconstruction operator, then $P = QU^*$ is a bounded oblique projection onto $\mathcal T$. If $\mathcal W^\perp$ denotes the null space of $P$, then
$$\mu(Q) = \|QU^*\|_{op} = \frac{1}{\cos(\varphi_{\mathcal T,\mathcal W})} .$$
In the following we always work under the assumption that the angle between the reconstruction and sampling space satisfies $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. The following lemma shows that this assumption is equivalent to $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ forming a frame for $\mathcal T$ for every frame $(u_j)_{j\in\mathbb N}$ for $\mathcal U$. By Theorem 2.2(iii) this is necessary for the existence of a quasi-optimal operator. In finite dimensions, for a basis $(u_j)_{j=1}^n$ of $\mathcal U$, the projections $(P_{\mathcal T} u_j)_{j=1}^n$ can only be a spanning set for $\mathcal T$ if $\dim(\mathcal U) \ge \dim(\mathcal T)$. This means that by the assumption $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$ we restrict ourselves to an oversampled regime.

Lemma 2.6. If $\mathcal T$ and $\mathcal U$ are closed subspaces of $\mathcal H$, then the following are equivalent:
(i) $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$.
(ii) For every frame $(u_j)_{j\in\mathbb N}$ for $\mathcal U$ with frame bounds $A$ and $B$, the projected sequence $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ is a frame for $\mathcal T$ with frame bounds $A \cos^2(\varphi_{\mathcal T,\mathcal U})$ and $B$.
If one of these conditions is satisfied, then the following property holds:
(iii) $R(T^* U) = R(T^*)$; therefore both $R(T^* U)$ and $R(U^* T)$ are closed subspaces and $U^* T$ is pseudo-invertible. Furthermore, $\mathcal T \cap N(U^*) = \{0\}$.

Proof. (i) ⇒ (ii). Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with frame bounds $A$ and $B$.
The assumption $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$ and the definition of $\varphi_{\mathcal T,\mathcal U}$ imply that
$$\|P_{\mathcal U}\, g\|_{\mathcal H} \;\ge\; \cos(\varphi_{\mathcal T,\mathcal U})\, \|g\|_{\mathcal H} \qquad \text{for all } g \in \mathcal T . \qquad (9)$$
In particular, for $g \in \mathcal T$ we obtain with (9)
$$\sum_{j\in\mathbb N} |\langle P_{\mathcal U}\, g, u_j\rangle_{\mathcal H}|^2 \;\ge\; A\, \|P_{\mathcal U}\, g\|_{\mathcal H}^2 \;\ge\; A \cos^2(\varphi_{\mathcal T,\mathcal U})\, \|g\|_{\mathcal H}^2 .$$
The identity $\langle P_{\mathcal U}\, g, u_j\rangle_{\mathcal H} = \langle g, u_j\rangle_{\mathcal H} = \langle g, P_{\mathcal T}\, u_j\rangle_{\mathcal H}$ for $g \in \mathcal T$ and $j \in \mathbb N$ now shows that $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ is a frame for $\mathcal T$ with frame bounds $A \cos^2(\varphi_{\mathcal T,\mathcal U})$ and $B$.
(ii) ⇒ (i). Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with upper frame bound $B$ and let $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ be a frame for $\mathcal T$ with lower frame bound $C_1 > 0$. Since $\langle g, P_{\mathcal T}\, u_j\rangle_{\mathcal H} = \langle P_{\mathcal U}\, g, u_j\rangle_{\mathcal H}$ for $g \in \mathcal T$, we obtain
$$C_1\, \|g\|_{\mathcal H}^2 \;\le\; \sum_{j\in\mathbb N} |\langle P_{\mathcal U}\, g, u_j\rangle_{\mathcal H}|^2 \;\le\; B\, \|P_{\mathcal U}\, g\|_{\mathcal H}^2 .$$
This implies that $\cos(\varphi_{\mathcal T,\mathcal U}) = \inf_{g\in\mathcal T,\, \|g\|_{\mathcal H}=1} \|P_{\mathcal U}\, g\|_{\mathcal H} \ge \sqrt{C_1/B} > 0$.
(ii) ⇒ (iii). Since $(u_j)_{j\in\mathbb N}$ and $(t_k)_{k\in\mathbb N}$ are Bessel sequences, both $U^*$ and $T$ are bounded, and therefore $U^* T$ is also bounded. The entries of $U^* T$ are given by $(U^* T)(j,k) = \langle t_k, u_j\rangle_{\mathcal H} = \langle t_k, P_{\mathcal T}\, u_j\rangle_{\mathcal H}$, so $U^* T$ is a cross-Gramian of two frames for $\mathcal T$. Let $(\tilde u_j)_{j\in\mathbb N}$ be a dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$. Setting $c_j = \langle f, \tilde u_j\rangle_{\mathcal H}$ we obtain, for $f \in \mathcal T$, $f = \sum_{j\in\mathbb N} c_j P_{\mathcal T}\, u_j = P_{\mathcal T}\, Uc$ and hence $T^* f = T^* Uc$. It follows that
$$R(T^*) = T^*(\mathcal T) \subseteq R(T^* U) \subseteq R(T^*) , \qquad \text{i.e., } R(T^* U) = R(T^*) .$$
Since $(t_k)_{k\in\mathbb N}$ is a frame for $\mathcal T$, $R(T^*)$ is closed in $\ell^2(\mathbb N)$, and so are $R(T^* U)$ and $R(U^* T)$. This implies that both $T^* U$ and $U^* T$ possess a pseudoinverse (see Appendix 4.1).
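In finite dimensions the angle of Definition 2.3 is easy to compute from orthonormal bases: with $t = Q_{\mathcal T} c$ one has $\|P_{\mathcal U} t\| = \|Q_{\mathcal U}^T Q_{\mathcal T} c\|$, so $\cos(\varphi_{\mathcal T,\mathcal U})$ is the square root of the smallest eigenvalue of $(Q_{\mathcal U}^T Q_{\mathcal T})^T (Q_{\mathcal U}^T Q_{\mathcal T})$. A small sketch (the three-dimensional example is ours, not from the paper), which also illustrates the asymmetry of the angle:

```python
import numpy as np

def cos_angle(Qt, Qu):
    # cos(phi_{T,U}) = inf over unit t in T of ||P_U t||,
    # for matrices Qt, Qu whose columns are orthonormal bases of T and U
    M = Qu.T @ Qt
    return float(np.sqrt(max(np.linalg.eigvalsh(M.T @ M).min(), 0.0)))

e = np.eye(3)
Qt = e[:, :1]            # T = span{e1}
Qu = e[:, :2]            # U = span{e1, e2}, so T is contained in U
```

Here $\mathcal T \subset \mathcal U$ gives $\cos(\varphi_{\mathcal T,\mathcal U}) = 1$, while swapping the arguments gives $0$.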

The reconstruction operators
3.1. Least squares and the operator $Q_1$. We first consider the reconstruction operator $Q_1 : \ell^2(\mathbb N) \to \mathcal T$ corresponding to the solution of the least squares problem
$$\min_{g\in\mathcal T} \|U^* g - d\|_2 . \qquad (12)$$
This approach is analyzed in detail in [9]. Least squares approximation is by far the most frequent approximation method in applications and of fundamental importance, since it has the smallest operator norm among all perfect operators.
The following theorem reviews several representations of the operator $Q_1$. The connection of the operator $Q_1$ to the oblique projection $P_{\mathcal T, S(\mathcal T)^\perp}$ was already derived in [9, Section 4.1] for a finite-dimensional space $\mathcal T$. Our new contribution is the connection to the canonical dual frame and the systematic discussion of the various representations of a least squares problem. As we will apply the statement several times, we include a streamlined proof. As usual, $A^\dagger$ denotes the Moore-Penrose pseudoinverse of an operator $A$. For the existence of $A^\dagger$ it suffices to show that the range of $A$ is closed (see Appendix 4.1).
Theorem 3.1. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a Hilbert space $\mathcal H$ such that $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with synthesis operator $U$ and frame operator $S$. Let $(t_k)_{k\in\mathbb N}$ be a frame for $\mathcal T$ with synthesis operator $T$.
Consider the following operators:
(i) $A_1 := T (U^* T)^\dagger$.
(ii) Let $A_2$ be defined on $R(U^*)$ by
$$A_2 U^* f = P_{\mathcal T, S(\mathcal T)^\perp}\, f , \qquad f \in \mathcal H , \qquad (13)$$
and on $R(U^*)^\perp$ by $A_2 c = 0$. By (13) $A_2$ is independent of the particular choice of the reconstruction frame $(t_k)_{k\in\mathbb N}$ for $\mathcal T$.
(iii) Let $(h_j)_{j\in\mathbb N}$ be the canonical dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ and let $A_3 c = \sum_{j\in\mathbb N} c_j h_j$ be the synthesis operator of $(h_j)_{j\in\mathbb N}$.
(iv) Let $d \in \ell^2(\mathbb N)$ and let $\hat c = (\hat c_k)_{k\in\mathbb N}$ be the unique minimal norm element of the set
$$K := \arg\min_{c\in\ell^2(\mathbb N)} \|U^* T c - d\|_2 . \qquad (15)$$
Let the operator $A_4$ be defined by $A_4 d = \sum_{k=1}^\infty \hat c_k t_k = T\hat c$.
Then all four operators are equal, $Q_1 := A_1 = A_2 = A_3 = A_4$, and provide the unique solution to the least squares problem (12).

Proof.
Step 1. First we check that each $A_j$, $j = 1, \dots, 4$, is well defined from $\ell^2(\mathbb N)$ to $\mathcal T$. For $A_1$ this is clear by virtue of Lemma 2.6. For $A_2$ we need to show that the projection $P_{\mathcal T, S(\mathcal T)^\perp}$ is well defined and bounded on the whole space $\mathcal H$. According to Lemma 2.4(i) we need to verify that $S(\mathcal T)$ is closed, $\cos(\varphi_{\mathcal T, S(\mathcal T)}) > 0$, and that $\mathcal H = \mathcal T \oplus S(\mathcal T)^\perp$. For this we exploit the frame inequality (10) of $(u_j)_{j\in\mathbb N}$, (9), and the fact that $S = SP_{\mathcal U} = P_{\mathcal U} S P_{\mathcal U}$. The resulting lower bound implies that $S(\mathcal T)$ is closed. For the angle $\varphi_{\mathcal T, S(\mathcal T)}$, since $\langle g, Sg\rangle_{\mathcal H} = \langle P_{\mathcal U}\, g, S P_{\mathcal U}\, g\rangle_{\mathcal H} \ge A\, \|P_{\mathcal U}\, g\|_{\mathcal H}^2$ and $\|Sg\|_{\mathcal H} = \|S P_{\mathcal U}\, g\|_{\mathcal H} \le B\, \|P_{\mathcal U}\, g\|_{\mathcal H}$, we can continue (17) and conclude that $\cos(\varphi_{\mathcal T, S(\mathcal T)}) > 0$. It remains to prove that $\mathcal T \oplus S(\mathcal T)^\perp = \mathcal H$, or, equivalently, that $\mathcal T^\perp \cap S(\mathcal T) = \{0\}$. Let $g = Sv \in \mathcal T^\perp \cap S(\mathcal T)$ with $v = Tc$. Then $0 = \langle Td, STc\rangle_{\mathcal H} = \langle U^* Td, U^* Tc\rangle_2$ for all $d \in \ell^2(\mathbb N)$; setting $d = c$, we obtain $U^* T c = 0$. By Lemma 2.6(iii), $v = Tc \in \mathcal T \cap N(U^*) = \{0\}$, and thus $g = Sv = 0$, which implies that $\mathcal T^\perp \cap S(\mathcal T) = \{0\}$. The operator $A_3$ is the synthesis operator with respect to the canonical dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ and is therefore bounded by general frame theory. Now to $A_4$: by Lemma 2.6 the operator $U^* T$ has closed range and therefore its Moore-Penrose pseudoinverse is well defined. It is well known that $\hat c = (U^* T)^\dagger d$ is the unique element of $K$ of minimal norm. Consequently, $A_4 d = \sum_{k\in\mathbb N} \hat c_k t_k$ is bounded on $\ell^2(\mathbb N)$.
Step 2. We next show that all these operators are equal.
Claim $A_1 = A_2$. Set $R := A_1 U^* = T (U^* T)^\dagger U^*$. The equality $R^2 = R$ follows from the identity $A^\dagger A A^\dagger = A^\dagger$ for the Moore-Penrose pseudoinverse applied to $A = U^* T$. Clearly $R(R) \subseteq \mathcal T$. To prove the converse inclusion we show that $R(RT) = \mathcal T$. Using $R(T^* U) = R(T^*)$ from Lemma 2.6 and $A^\dagger A = P_{R(A^*)}$ we conclude that
$$RT = T (U^* T)^\dagger (U^* T) = T P_{R(T^* U)} = T P_{R(T^*)} = T .$$
Now let $f \in N(R)$, i.e., $T (U^* T)^\dagger U^* f = 0$. Then we have, for all $h \in \mathcal H$, $\langle (U^* T)^\dagger U^* f, T^* h\rangle_2 = 0$. Since $R(T^*) = R(T^* U)$ by Lemma 2.6(iii), this means that for all $c \in \ell^2(\mathbb N)$
$$0 = \langle (U^* T)^\dagger U^* f, T^* U c\rangle_2 = \langle U^* T (U^* T)^\dagger U^* f, c\rangle_2 = \langle P_{R(U^* T)}\, U^* f, c\rangle_2 .$$
In other words, $f \in R(UU^* T)^\perp = S(\mathcal T)^\perp$, as claimed.
Claim $A_1 = A_3$. We need to show that the operator $A_1 = T (U^* T)^\dagger$ is the synthesis operator of the canonical dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$. The frame operator $\tilde S$ of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ can be written in the form $\tilde S = P_{\mathcal T}\, U U^* P_{\mathcal T} = (U^* P_{\mathcal T})^* (U^* P_{\mathcal T})$. By Definition 4.2(iv), the canonical dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ is given by $(\tilde S^\dagger P_{\mathcal T}\, u_j)_{j\in\mathbb N}$ with synthesis operator
$$\tilde S^\dagger P_{\mathcal T}\, U = \big((U^* P_{\mathcal T})^* (U^* P_{\mathcal T})\big)^\dagger (U^* P_{\mathcal T})^* = (U^* P_{\mathcal T})^\dagger , \qquad (19)$$
where we used $A^\dagger = (A^* A)^\dagger A^*$ with $A = U^* P_{\mathcal T}$ for the last equality. Since we have already proved that $A_1 = A_2$, we know that the operator $A_1$ is independent of the particular choice of a frame for $\mathcal T$. We may therefore use the frame $(P_{\mathcal T} u_j)_{j\in\mathbb N}$ with synthesis operator $P_{\mathcal T}\, U$ instead of $T$, and as a consequence obtain that
$$A_1 = P_{\mathcal T}\, U\, \big(U^* P_{\mathcal T}\, U\big)^\dagger = (U^* P_{\mathcal T})^\dagger .$$
Comparing with (19), we have proved that $A_3 = A_1$.
Step 3. Finally we show that each operator $A_1 = \dots = A_4$ provides the unique solution to the least squares fit (12). Since $N(U^*) \cap \mathcal T = \{0\}$ by Lemma 2.6(iii), the solution $\tilde f \in \mathcal T$ of the least squares problem is unique. Since $R(T) = \mathcal T$, there exists a $c \in \ell^2(\mathbb N)$ such that $\tilde f = Tc$, and by (20) $\tilde f = Tc$ for every element $c \in K$ (cf. (15)). In particular, for the minimal norm element $\hat c = (U^* T)^\dagger d \in K$ used for the definition of the operator $A_4$, we obtain $\tilde f = T\hat c = A_4 d$.

Theorem 3.1 implies a simple proof of the statement that the operator $Q_1$ has the smallest possible operator norm among all perfect reconstruction operators. This was already proven in [9, Theorem 6.2] in a more general setup that includes non-linear reconstruction operators.
Theorem 3.2. Let $\mathcal T$ and $\mathcal U$ be two closed subspaces of a Hilbert space $\mathcal H$ such that $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. If $Q : \ell^2(\mathbb N) \to \mathcal T$ is a perfect reconstruction operator ($QU^* g = g$ for $g \in \mathcal T$), then $\|Q\|_{op} \ge \|Q_1\|_{op}$.
Proof. Let $Q : \ell^2(\mathbb N) \to \mathcal T$ be a bounded and perfect operator. From Theorem 2.2 we infer that $Q$ is the synthesis operator of a dual frame of $(P_{\mathcal T} u_j)_{j\in\mathbb N}$. From Lemma 4.5 (expansion coefficients with respect to the canonical dual frame have the minimal $\ell^2$-norm) we infer that for $g \in \mathcal T$
$$\|Q_1^* g\|_2 \;\le\; \|Q^* g\|_2 .$$
Since $Q^*$ is the analysis operator of a frame for $\mathcal T$, we have $Q^* g^\perp = Q_1^* g^\perp = 0$ for $g^\perp \in \mathcal T^\perp$. Therefore $\|Q_1^*\|_{op} \le \|Q^*\|_{op}$, and consequently $\|Q_1\|_{op} \le \|Q\|_{op}$.
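Theorem 3.2 can be checked numerically. In the finite-dimensional sketch below (hypothetical random data; the parametrization of other perfect operators is our illustration, not from the paper), we perturb $Q_1$ by a term that annihilates $R(U^*T)$: the resulting operator is still perfect, but its norm can only grow.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, N = 12, 4, 20
Ustar = rng.standard_normal((n, N))
T = np.linalg.qr(rng.standard_normal((N, m)))[0]
A = Ustar @ T
Q1 = T @ np.linalg.pinv(A)                       # minimal-norm perfect operator
# perturb Q1 by a term vanishing on R(U*T): the result is still perfect
P = A @ np.linalg.pinv(A)                        # orth. projection onto R(U*T)
Q = Q1 + T @ rng.standard_normal((m, n)) @ (np.eye(n) - P)
g = T @ rng.standard_normal(m)
```

The adjoint ranges of $Q_1$ and of the added term are orthogonal, which is exactly why $\|Q_1\|_{op} \le \|Q\|_{op}$.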
3.2. The operator $Q_0$. In the previous section we analyzed the operator $Q_1$ with the smallest operator norm. We now introduce and study the operator $Q_0$ with the smallest quasi-optimality constant. In the following we write $G^{\dagger/2} = (G^\dagger)^{1/2}$ when $G$ is a positive operator with a pseudoinverse.

Theorem 3.3. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a Hilbert space $\mathcal H$ such that $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with synthesis operator $U$ and Gramian $G = U^* U$, and let $(t_k)_{k\in\mathbb N}$ be a frame for $\mathcal T$ with synthesis operator $T$. Consider the following operators:
(i) $B_1 := T\, \big(G^{\dagger/2}\, U^* T\big)^\dagger G^{\dagger/2}$.
(ii) Let $B_2$ be defined on $R(U^*)$ by $B_2 U^* f = P_{\mathcal T, P_{\mathcal U}(\mathcal T)^\perp}\, f$, $f \in \mathcal H$, and on $R(U^*)^\perp$ by
$$B_2 c = 0 . \qquad (22)$$
Consequently $B_2$ depends only on the subspace $\mathcal T$, but not on the particular choice of a frame $(t_k)_{k\in\mathbb N}$ for $\mathcal T$.
(iii) Let $d \in \ell^2(\mathbb N)$ and let $\hat c = (\hat c_k)_{k\in\mathbb N}$ be the unique minimal norm element of the set
$$K := \arg\min_{c\in\ell^2(\mathbb N)} \big\|G^{\dagger/2}\,(U^* T c - d)\big\|_2 . \qquad (23)$$
Let the operator $B_3$ be defined by $B_3 d = \sum_{k=1}^\infty \hat c_k t_k$.
Then the operators defined by (i)-(iii) are equal, $Q_0 := B_1 = B_2 = B_3$, and provide the unique solution of the least squares problem (4).

Proof. Let $(S^{\dagger/2} u_j)_{j\in\mathbb N}$ be the tight frame for $\mathcal U$ associated to $(u_j)_{j\in\mathbb N}$. By Lemma 4.4 its analysis operator $L^*$ is given by
$$L^* = U^* S^{\dagger/2} = G^{\dagger/2}\, U^* . \qquad (24)$$
We now apply Theorem 3.1 to the frames $(t_k)_{k\in\mathbb N}$ for $\mathcal T$ and $(S^{\dagger/2} u_j)_{j\in\mathbb N}$ for $\mathcal U$.
Since $(S^{\dagger/2} u_j)_{j\in\mathbb N}$ is a tight frame for $\mathcal U$, its frame operator is $\tilde S = LL^* = P_{\mathcal U}$. As proven in Theorem 3.1, $\mathcal H = \mathcal T \oplus \tilde S(\mathcal T)^\perp = \mathcal T \oplus P_{\mathcal U}(\mathcal T)^\perp$ and the projection $P_{\mathcal T, \tilde S(\mathcal T)^\perp} = P_{\mathcal T, P_{\mathcal U}(\mathcal T)^\perp}$ is well defined and bounded.
Let us show that $B_1 = B_2$. We set $A_1 = T (L^* T)^\dagger$. The equivalence of the operators $A_1$ and $A_2$ of Theorem 3.1 says that $A_1 L^* = P_{\mathcal T, P_{\mathcal U}(\mathcal T)^\perp}$ on $R(L^*)$ and $A_1 = 0$ on $R(L^*)^\perp$. Consequently, with (24), we obtain
$$B_1 U^* = T\, \big(G^{\dagger/2}\, U^* T\big)^\dagger G^{\dagger/2}\, U^* = A_1 L^* = P_{\mathcal T, P_{\mathcal U}(\mathcal T)^\perp} .$$
In order to prove that (22) holds for $B_1$, we show that $R(U^*)^\perp = N(G^{\dagger/2})$. The set $R(U^*)$ is closed because $(u_j)_{j\in\mathbb N}$ is a frame for $\mathcal U$, and therefore $R(U^*)^\perp = N(U) = N(G) = N(G^{\dagger/2})$. That the operator $B_1$ has the representation (iii) follows from the fact that the minimal norm element of $K$ (cf. (23)) is obtained by the Moore-Penrose pseudoinverse, i.e., $\hat c = (G^{\dagger/2}\, U^* T)^\dagger G^{\dagger/2}\, d$. By Theorem 3.1, $A_1 = T (L^* T)^\dagger$ solves the least squares problem $\min_{g\in\mathcal T} \|L^* g - \tilde d\|_2$ for every $\tilde d \in \ell^2$, in particular for $\tilde d = G^{\dagger/2} d$.

Remark 3.4. The approximation or reconstruction of $f$ from $U^* f$ by means of $Q_0$ can be understood as a two-step procedure: first the input data $U^* f$ are preprocessed with $G^{\dagger/2}$, resulting in $G^{\dagger/2}\, U^* f = L^* f$; then $\tilde f$ is produced with $T (L^* T)^\dagger$, which is again the synthesis operator of a frame.
The next result shows that the operator $Q_0$ has the smallest possible quasi-optimality constant.

3.3. Combinations of $Q_0$ and $Q_1$. The operators $Q_0$ and $Q_1$ optimize different performance metrics: $Q_1$ is most stable with respect to noisy data, and $Q_0$ is optimal with respect to the deviation of the target function from the reconstruction space. It is natural to interpolate between these two operators and to try to define mixtures $Q_\lambda$ such that $\|Q_1\|_{op} \le \|Q_\lambda\|_{op} \le \|Q_0\|_{op}$ and $\mu(Q_0) \le \mu(Q_\lambda) \le \mu(Q_1)$ for $\lambda \in (0,1)$. To do this, we proceed as follows.
For $\lambda \in [0,1]$ we define
$$M_\lambda = \lambda I + (1-\lambda)\, S_1 \qquad \text{and} \qquad \Sigma_\lambda = \lambda I + (1-\lambda)\, G_1 ,$$
where $I$ denotes the identity operator on $\mathcal H$ and on $\ell^2(\mathbb N)$ respectively, $S_1 = UU^*$ is the frame operator and $G_1 := U^* U$ the Gramian of the frame $(u_j)_{j\in\mathbb N}$ for $\mathcal U$. For $\lambda > 0$, $M_\lambda$ is invertible on $\mathcal H$, $\mathcal U$ is an invariant subspace of $M_\lambda$, and $\Sigma_\lambda$ is invertible on $\ell^2(\mathbb N)$. We now set
$$u_{\lambda,j} = M_\lambda^{-1/2}\, u_j , \qquad j \in \mathbb N . \qquad (28)$$
The next lemma describes the properties of the new frame $(u_{\lambda,j})_{j\in\mathbb N}$.
Lemma 3.6. Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with frame bounds $A$ and $B$. Fix $\lambda \in (0,1]$ and let $(u_{\lambda,j})_{j\in\mathbb N}$ be defined by (28). Then $(u_{\lambda,j})_{j\in\mathbb N}$ is a frame for $\mathcal U$ with
$$\frac{A}{\lambda+(1-\lambda)A}\, \|f\|_{\mathcal H}^2 \;\le\; \sum_{j\in\mathbb N} |\langle f, u_{\lambda,j}\rangle_{\mathcal H}|^2 \;\le\; \frac{B}{\lambda+(1-\lambda)B}\, \|f\|_{\mathcal H}^2 \qquad \text{for } f \in \mathcal U , \qquad (29)$$
and $L_\lambda^* = \Sigma_\lambda^{-1/2}\, U^*$, i.e., the operator $\Sigma_\lambda^{-1/2}\, U^*$ is the analysis operator of the frame $(u_{\lambda,j})_{j\in\mathbb N}$ for $\mathcal U$.
Proof. Using $S_1 = UU^*$ for the frame operator of $(u_j)_{j\in\mathbb N}$, we obtain
$$\sum_{j\in\mathbb N} |\langle f, u_{\lambda,j}\rangle_{\mathcal H}|^2 = \langle M_\lambda^{-1/2} f,\, S_1\, M_\lambda^{-1/2} f\rangle_{\mathcal H} \qquad \text{for } f \in \mathcal U .$$
Combining this with (31) implies the frame inequality (29).
Theorem 3.7. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a separable Hilbert space $\mathcal H$ such that $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ and $(t_k)_{k\in\mathbb N}$ a frame for $\mathcal T$. For $0 < \lambda \le 1$ let $L_\lambda$ be the synthesis operator of the frame $(u_{\lambda,j} = M_\lambda^{-1/2} u_j)_{j\in\mathbb N}$ for $\mathcal U$ and $S_\lambda = L_\lambda L_\lambda^*$ the corresponding frame operator. Consider the following operators:
(i) $C_1 := T\, \big(\Sigma_\lambda^{-1/2}\, U^* T\big)^\dagger \Sigma_\lambda^{-1/2}$.
(ii) Let $C_2$ be defined on $R(U^*)$ by
$$C_2 U^* f = P_{\mathcal T, S_\lambda(\mathcal T)^\perp}\, f , \qquad f \in \mathcal H , \qquad (33)$$
and on $R(U^*)^\perp$ by $C_2 c = 0$. Consequently, $C_2$ is independent of the particular choice of the reconstruction frame $(t_k)_{k\in\mathbb N}$ for $\mathcal T$.
(iii) Let $C_3 d := T\hat c$, where $\hat c$ is the unique minimal norm element of $\arg\min_{c\in\ell^2(\mathbb N)} \|\Sigma_\lambda^{-1/2}(U^* T c - d)\|_2$.
Then these operators are equal, $Q_\lambda := C_1 = C_2 = C_3$, and $\tilde f = Q_\lambda d$ is the unique solution of the least squares problem
$$\min_{g\in\mathcal T} \big\|\Sigma_\lambda^{-1/2}\,(U^* g - d)\big\|_2 . \qquad (34)$$
Proof. We apply Theorem 3.1 to the frames $(u_{\lambda,j})_{j\in\mathbb N}$ for $\mathcal U$ and $(t_k)_{k\in\mathbb N}$ for $\mathcal T$. Let $L_\lambda^*$ be the analysis operator of $(u_{\lambda,j})_{j\in\mathbb N}$ and set $A_1 = T (L_\lambda^* T)^\dagger$. Since $A_1$ has the equivalent representation (ii) of Theorem 3.1, we obtain (33).
For showing C 1 = C 3 and (34) we repeat the proof of Theorem 3.3 verbatim.
The following lemma gives a useful upper bound on the quasi-optimality constant of the operators $Q_\lambda$.

Lemma 3.8. Let $\mathcal T$, $\mathcal U$, and $Q_\lambda$ be as in Theorem 3.7, and let $A$ and $B$ be frame bounds of $(u_j)_{j\in\mathbb N}$. Then
$$\mu(Q_\lambda) \;\le\; \frac{1}{\cos(\varphi_{\mathcal T,\mathcal U})} \left(\frac{B\,(\lambda + (1-\lambda)A)}{A\,(\lambda + (1-\lambda)B)}\right)^{1/2} . \qquad (35)$$
Proof. By Lemma 3.6, $(u_{\lambda,j})_{j\in\mathbb N}$ is a frame for $\mathcal U$ with frame bounds $\frac{A}{\lambda+(1-\lambda)A}$ and $\frac{B}{\lambda+(1-\lambda)B}$. Using (18) for the frame $(u_{\lambda,j})_{j\in\mathbb N}$ we infer (35).
We observe that the upper bound in (35) is decreasing as $\lambda \to 0$, and for $\lambda = 0$ the upper bound coincides with the operator norm $\|Q_0 U^*\|_{op} = \frac{1}{\cos(\varphi_{\mathcal T,\mathcal U})}$. Unfortunately we do not yet know how to obtain a meaningful bound on the operator norm of $Q_\lambda$.

Remark 3.9. In [12] the authors consider a regularization term (Tikhonov regularization). The reconstruction operators corresponding to such a regularized least squares fit do not fulfill $Q\big((\langle f, u_j\rangle_{\mathcal H})_{j\in\mathbb N}\big) = f$ for $f \in \mathcal T$ (see [12, equation (7)]), and therefore do not belong to the class of reconstruction operators analyzed in this paper.
If we formulate this least squares problem in terms of the normal equations, we have to solve
$$T^* U\, \Sigma_\lambda^{-1}\, U^* T\, c \;=\; T^* U\, \Sigma_\lambda^{-1}\, d . \qquad (37)$$
We observe that the cross-Gramian $U^* T \in \mathbb C^{n\times m}$ is the matrix with entries $(U^* T)(j,k) = \langle t_k, u_j\rangle_{\mathcal H}$, and the matrix $\Sigma_\lambda \in \mathbb C^{n\times n}$ is given by $\Sigma_\lambda = \lambda I + (1-\lambda)\, U^* U$. For the solution of an overdetermined least squares problem one may use a direct method, such as the QR decomposition with pivoting, with an operation count of $O(nm^2)$. Alternatively, one may approximate the solution of (36) up to a given precision $\varepsilon > 0$ by means of iterative methods, such as the conjugate gradient method applied to the normal equations, with an operation count of $O(|\log \varepsilon|\, nm)$. A concrete realization is the LSQR algorithm, see [37].
The convergence of the conjugate gradient iteration depends fundamentally on the condition number $\kappa(R_\lambda)$ of the matrix $R_\lambda = T^* U\, \Sigma_\lambda^{-1}\, U^* T$ in (37) (where $\kappa(A) = \|A\|_{op}\, \|A^{-1}\|_{op}$). The following lemma offers an estimate of the condition number under the additional assumption that the reconstruction space is spanned by an orthonormal set. This is common practice in many applications [6, 9-11, 27].

Lemma 3.10. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a separable Hilbert space $\mathcal H$ such that $\cos(\varphi_{\mathcal T,\mathcal U}) > 0$. Let $(u_j)_{j\in\mathbb N}$ be a frame for $\mathcal U$ with frame bounds $A$ and $B$, and let $(t_k)_{k\in\mathbb N}$ be an orthonormal basis for $\mathcal T$. Set $R_\lambda = T^* U\, \Sigma_\lambda^{-1}\, U^* T$. Then
$$\kappa(R_\lambda) \;\le\; \frac{B\,(\lambda + (1-\lambda)A)}{A\,(\lambda + (1-\lambda)B)\, \cos^2(\varphi_{\mathcal T,\mathcal U})} . \qquad (38)$$
Proof. From Lemma 3.6 we know that $(u_{\lambda,j} = M_\lambda^{-1/2} u_j)_{j\in\mathbb N}$ is a frame for $\mathcal U$ with frame bounds $\frac{A}{\lambda+(1-\lambda)A}$ and $\frac{B}{\lambda+(1-\lambda)B}$. Using $\|Tc\|_{\mathcal H} = \|c\|_2$ and (10) for the frame $(u_{\lambda,j})_{j\in\mathbb N}$ instead of $(u_j)_{j\in\mathbb N}$, we infer that
$$\frac{A \cos^2(\varphi_{\mathcal T,\mathcal U})}{\lambda + (1-\lambda)A}\, \|c\|_2^2 \;\le\; \langle R_\lambda c, c\rangle_2 = \|L_\lambda^* T c\|_2^2 \;\le\; \frac{B}{\lambda + (1-\lambda)B}\, \|c\|_2^2 . \qquad (39)$$
Since $R_\lambda$ is self-adjoint and positive, (38) is now a direct consequence of (39).
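The bound (38) can be checked numerically. In the sketch below (hypothetical random data, not from the paper) the frame bounds are taken as the extreme nonzero eigenvalues of the Gramian, and $\cos^2(\varphi_{\mathcal T,\mathcal U})$ is computed from orthonormal bases:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, N = 40, 6, 50
Ustar = rng.standard_normal((n, N))               # U = row space of Ustar
T = np.linalg.qr(rng.standard_normal((N, m)))[0]  # orthonormal basis of T
G = Ustar @ Ustar.T
ev = np.linalg.eigvalsh(G)
A_fr, B_fr = ev[ev > 1e-10 * ev.max()].min(), ev.max()   # frame bounds A, B
Qu = np.linalg.svd(Ustar.T, full_matrices=False)[0]      # orthonormal basis of U
M = Qu.T @ T
cos2 = np.linalg.eigvalsh(M.T @ M).min()                 # cos^2(phi_{T,U})
UT = Ustar @ T

def kappa_and_bound(lam):
    # condition number of R_lambda and the right-hand side of (38)
    Sigma = lam * np.eye(n) + (1 - lam) * G
    R = UT.T @ np.linalg.solve(Sigma, UT)
    bound = (B_fr * (lam + (1 - lam) * A_fr)) / \
            (A_fr * (lam + (1 - lam) * B_fr) * cos2)
    return np.linalg.cond(R), bound
```

For $\lambda = 0$ the bound reduces to $1/\cos^2(\varphi_{\mathcal T,\mathcal U})$, and it increases with $\lambda$, consistent with Remark 3.11.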
Remark 3.11. 1. We observe that the bound for $\kappa(R_\lambda)$ on the right-hand side of (38) is increasing in $\lambda$, and we expect that also $\kappa(R_{\lambda_1}) \le \kappa(R_{\lambda_2})$ for $\lambda_1 \le \lambda_2$. This has been tested experimentally in Section 4.
2. Note that for $\lambda > 0$ the solutions of the original least squares problem $\min_{c\in\ell^2(\mathbb N)} \|U^* T c - d\|_2$ and of the weighted problem $\min_{c\in\ell^2(\mathbb N)} \|\Sigma_\lambda^{-1/2}(U^* T c - d)\|_2$ are in general distinct. This is an important difference to classical preconditioning of square systems, where the solutions of the original and the preconditioned system coincide.
3. One may interpret the introduction of Σ −1/2 λ as a form of preprocessing of the measurement vector d. In most sampling problems the preprocessing is by a diagonal matrix [1-5, 11, 23-26, 40], where the entries are called "adaptive weights" or "density compensation factors". The use of non-diagonal matrices seems to be a new idea.
4. The use of more general matrices for preprocessing is very promising, but requires additional numerical considerations. To achieve a small numerical complexity, one needs to approximate Σ −1 λ by a simpler matrix V λ and then solve the normal equations T * UV λ U * T c = T * UV λ d. This question will be pursued in future work.

3.5. Conditions for the approximations to coincide. While in general the reconstruction operators $Q_1$, $Q_0$ and $Q_\lambda$ are different, they coincide in several situations.
Lemma 3.12. Let $\mathcal T$ and $\mathcal U$ be closed subspaces of a separable Hilbert space $\mathcal H$ and let $(u_j)_{j=1}^n$ be a frame for $\mathcal U$. If $\mathcal T \oplus \mathcal U^\perp = \mathcal H$ and $Q : \ell^2(\mathbb N) \to \mathcal T$ is a bounded, perfect reconstruction operator, then
$$QU^* = P_{\mathcal T, \mathcal U^\perp} . \qquad (40)$$
Consequently, for $\lambda \in [0,1]$,
$$Q_1 = Q_0 = Q_\lambda . \qquad (41)$$
Proof. As in the proof of Theorem 3.5 we see that $QU^* g = g$ for $g \in \mathcal T$ and $QU^* u^\perp = 0$ for $u^\perp \in \mathcal U^\perp$ imply that $R(QU^*) \supseteq \mathcal T$ and $N(QU^*) \supseteq \mathcal U^\perp$. Since by assumption $\mathcal T \oplus \mathcal U^\perp = \mathcal H$, this proves (40). Since $Q_1 c = Q_0 c = Q_\lambda c = 0$ for $c \in R(U^*)^\perp$, this implies (41).
The decomposition T ⊕ U^⊥ = H is the general assumption for consistent sampling [16, 17, 19-22]. In finite dimensions the assumption T ⊕ U^⊥ = H is fulfilled only if dim(T) = dim(U) and cos(ϕ_{T,U}) > 0 [9, Lemma 3.7]. In the case of linearly independent sampling and reconstruction vectors, the condition dim(T) = dim(U) requires as many sampling vectors as reconstruction vectors. In other words, in the critical case (between overdetermined and underdetermined) all reconstruction operators coincide.
If (u_j)_{j∈N} is a tight frame for U, its frame operator is S_1 = A P_U for some A > 0. Consequently, M_λ^{-1/2} u_j = (λ + (1 − λ)A)^{-1/2} u_j is just a constant multiple of the original tight frame. Therefore (M_λ^{-1/2} u_j)_{j∈N} is again a tight frame for every λ ∈ [0, 1], S_λ(T) = P_U(T), and Q_λ U* = P_{T, S_λ(T)^⊥} = P_{T, P_U(T)^⊥} is independent of λ. Consequently, all reconstruction operators Q_λ coincide in this case.

3.6. Stability with respect to a biased object. In [1-3, 5] and also [4, 11, 23-26] a notion of stability with respect to a bias in the measured object is considered (in the latter stated in terms of a frame inequality). This means that the measurements are made on the vector f + ∆f instead of the correct f, where ∆f ∈ H is the bias or object uncertainty. In this case the error estimate is of the form (42).

It is important to understand the conceptual difference between (42) and (5). The error estimate (5) treats the error arising from perturbed or noisy measurements U*f + ℓ. Estimate (42) treats the uncertainty of the target function (object uncertainty) and assumes that the exact measurements of the biased function f + ∆f are available. Since the operator Q_0 has the smallest possible quasi-optimality constant µ(Q_0) and since µ(Q_0) = ||Q_0 U*||_op, Theorem 3.5 yields the following corollary.

Corollary 3.14. Let cos(ϕ_{T,U}) > 0 and let Q_0 be defined as in Theorem 3.3. If an operator Q : ℓ²(N) → T satisfies the stability estimates for f ∈ H and ∆f ∈ H with constants 0 < β_i < ∞, then β_i ≥ µ(Q_0), i = 1, 2.
Consequently, if we restrict ourselves to linear mappings, Corollary 3.14 shows that Q_0 is optimal for the problem considered in [1-5, 11, 23-26].

Numerical experiments for reconstruction from Fourier measurements
In this section, we apply the various reconstruction methods to the reconstruction of a compactly supported function from non-uniform Fourier samples. This approximation problem occurs in numerous applications, for example, radial sampling of the Fourier transform is used in MRI and CT, see [31].
From the given data f̂(ω_j), j = −n, ..., n, of a compactly supported function f, we calculate the Fourier coefficients f̂(k), k = −m, ..., m, and construct a final approximation by a truncated Fourier series. This is the uniform resampling problem, see [9, 43]. If f is smooth and periodic, then the Fourier series converges exponentially fast. However, if f is non-periodic or discontinuous, then the Fourier series of f converges slowly and also suffers from the Gibbs phenomenon. Of course, for discontinuous or non-periodic functions the trigonometric polynomials of fixed degree are a bad choice for the reconstruction space. Since the function f is unknown, there will always be some model mismatch in practice, independent of the particular choice of the reconstruction vectors. Our objective in this section is not to choose optimal reconstruction functions, but rather to compare how the various reconstruction operators Q_λ deal with the model mismatch in noisy regimes. We will see that a smart choice of the parameter λ yields better approximations than the standard least squares approximation (3).
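The slow convergence for non-periodic functions can be illustrated directly. The sketch below uses f(x) = e^x on [−1/2, 1/2], the test function of the experiments below, whose Fourier coefficients have the closed form f̂(k) = (−1)^k (e^{1/2} − e^{−1/2})/(1 − 2πik), and compares truncated Fourier series of degree m = 25 and m = 100:

```python
import numpy as np

def fourier_coeff(k):
    """Closed-form Fourier coefficients of f(x) = e^x on [-1/2, 1/2]."""
    return (-1.0) ** k * (np.exp(0.5) - np.exp(-0.5)) / (1 - 2j * np.pi * k)

x = np.linspace(-0.5, 0.5, 2001)
f = np.exp(x)

def partial_sum(m):
    """Truncated Fourier series of degree m, evaluated on the grid x."""
    k = np.arange(-m, m + 1)
    return np.real(np.exp(2j * np.pi * np.outer(x, k)) @ fourier_coeff(k))

err = {m: np.linalg.norm(partial_sum(m) - f) / np.linalg.norm(f) for m in (25, 100)}
print(err)   # the relative error decays, but only slowly in m
```

Quadrupling the degree reduces the error only modestly, since the coefficients decay like 1/k for the (effectively discontinuous) periodic extension of e^x.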
We remark that the Gibbs phenomenon can be avoided by choosing a more appropriate reconstruction space, e.g., algebraic polynomials [7,27] or wavelet expansions [6,10].
Let ⟨f, g⟩_H = ∫_R f(x) \overline{g(x)} dx be the standard inner product on the Hilbert space L²(R), and let F denote the Fourier transform on L²(R) with the normalization F f(ω) = ∫_R f(x) e^{−2πiωx} dx. Let H be the subspace of L²(R) of functions with support in the interval [−1/2, 1/2]. The given data are finitely many (non-uniform) noisy Fourier measurements d_j = F f(ω_j) + ℓ_j, j = −n, ..., n, where ℓ_j ∈ C is additive noise. The noise ℓ_j is assumed to be i.i.d. Gaussian with variance (average power) σ_ℓ², and the signal-to-noise ratio (SNR) is the ratio of the average signal power to σ_ℓ². The sampling frequencies ω_j ∈ R are chosen at random in the interval [−n/2 − 2, n/2 + 2]. For the particular bases (u_j), (t_k) consisting of exponentials, the cross-Gramian U*T ∈ C^{(2n+1)×(2m+1)} has the entries (U*T)_{j,k} = sinc(k − ω_j) with sinc(t) = sin(πt)/(πt), and the preconditioning matrix Σ_λ has the entries (Σ_λ)_{j,l} = λ δ_{j,l} + (1 − λ) sinc(ω_j − ω_l). All results in this section have been averaged over 1000 independent realizations of the sampling frequencies and the noise.
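The cross-Gramian can be assembled in a few lines. This sketch assumes frequencies drawn uniformly from [−n/2 − 2, n/2 + 2] (the interval mentioned for the experiments below) and the sinc entries (U*T)_{j,k} = sinc(k − ω_j); numpy's `sinc` already uses the normalized convention sin(πt)/(πt):

```python
import numpy as np

n, m = 90, 10
rng = np.random.default_rng(2)
# Nonuniform sampling frequencies, as assumed for the experiments below.
omega = np.sort(rng.uniform(-n / 2 - 2, n / 2 + 2, size=2 * n + 1))
k = np.arange(-m, m + 1)

# (U*T)_{j,k} = int_{-1/2}^{1/2} e^{2 pi i k x} e^{-2 pi i omega_j x} dx = sinc(k - omega_j)
G = np.sinc(k[None, :] - omega[:, None])
print(G.shape)   # (2n+1) x (2m+1)
```

The same one-liner with `omega[:, None] - omega[None, :]` yields the Gram matrix needed for Σ_λ.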

Noisy samples.
In the first experiment we study the influence of the sampling rate (2m+1)/(2n+1) and the SNR on the recovery performance of the operators Q_λ. We approximate the exponential function f(x) = e^x χ_{[−1/2,1/2]}(x) from 181 noisy Fourier samples (n = 90) and reconstruct in a space of trigonometric polynomials of degree m = 10, 20, 30, 40 (with dimension 2m + 1). Table 1 lists the operator norm ||Q_λ||_op, the quasi-optimality constant µ(Q_λ), the angle ϕ_{T,U}, the condition number κ(Σ_λ^{-1/2} U*T), and the relative reconstruction errors. All values are listed in the form E ± σ, where E is the expected value and σ the standard deviation of the quantity, rounded to the third decimal place.

By (2) the (absolute) reconstruction error depends both on the quasi-optimality constant µ(Q_λ) and the operator norm ||Q_λ||_op. The numerical simulations support Theorem 3.5, which asserts that the quasi-optimality constant µ is minimal for λ = 0. As expected in view of Lemma 3.8, the quasi-optimality constant µ is increasing with λ. The angle ϕ_{T,U} is almost zero, so the reconstruction space is "almost contained" in the sampling space. Therefore Q_0 U* is nearly identical to the orthogonal projection P_T onto T.

A small angle ϕ_{T,U} is essential for stable reconstruction. Taking for example m = n leads to an angle close to π/2 (since the sampling frequencies are contained in the interval [−n/2 − 2, n/2 + 2]), which necessarily leads to an unstable scenario. Taking more measurements than reconstruction vectors is a common way to stabilize the reconstruction problem [1-8, 10, 11, 24-27, 33, 40]. The operator norm of Q_λ is decreasing in λ, and the approximation becomes less sensitive to noise, with Q_1 being the most stable reconstruction, in line with [9, Theorem 6.2] and with Theorem 3.2. The intermediate reconstruction operators offer a trade-off between sensitivity to noise and the out-of-space contributions. A suitable choice of λ then leads to more accurate reconstructions than Q_0 and Q_1.
For example for m = 10 and SNR = 20dB the average relative reconstruction error is 0.088 for Q 0 , 0.098 for Q 1 , but only 0.075 for Q 0.1 .
Interestingly, for SNR = 10dB we obtain a higher average approximation error for dimension m = 40 than for m = 10, 20, 30. Although the increase in dimension makes the distance f − P T f L 2 smaller, the high irregularity in the sampling frequencies seems to lead to unstable scenarios. This correlates nicely with the increase of the operator norm Q λ op with increasing m.
We next observe that the condition number κ(Σ_λ^{-1/2} U*T) is increasing with λ, as anticipated in Lemma 3.10. As discussed in Section 3.4, this opens the possibility of finding non-diagonal weight matrices.
For Figure 1(a) we have determined the parameter λ_opt for which the relative reconstruction error is minimal, i.e., λ_opt = argmin_{λ∈[0,1]} ||f − Q_λ d||_{L²} / ||f||_{L²}. We then plot the correlation between λ_opt and the signal-to-noise ratio. The plot confirms Theorems 3.5 and 3.2: for SNR → ∞ the optimal reconstruction is with Q_0, and for SNR → −∞ the optimal reconstruction is with Q_1.
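Such a λ_opt can be found by a simple grid search. The following sketch uses a toy finite-dimensional model with artificial model mismatch and noise (all names and sizes are illustrative) and assumes the normal equations R_λ c = T*U Σ_λ^{-1} d from (37), with Σ_λ = λI + (1−λ)U*U:

```python
import numpy as np

rng = np.random.default_rng(3)
dim, n, m = 50, 35, 12
U = rng.standard_normal((dim, n))                    # columns: sampling vectors
T, _ = np.linalg.qr(rng.standard_normal((dim, m)))   # orthonormal reconstruction basis
f = T @ rng.standard_normal(m) + 0.05 * rng.standard_normal(dim)  # out-of-space part: model mismatch
d = U.T @ f + 0.01 * rng.standard_normal(n)          # noisy measurements

A = U.T @ T                                          # cross-Gramian U*T

def rel_error(lam):
    sigma = lam * np.eye(n) + (1 - lam) * (U.T @ U)  # assumed form of Sigma_lambda
    B = np.linalg.solve(sigma, A)                    # sigma^{-1} U*T
    c = np.linalg.solve(A.T @ B, B.T @ d)            # solve R_lambda c = T*U sigma^{-1} d
    return np.linalg.norm(T @ c - f) / np.linalg.norm(f)

lams = np.linspace(0.0, 1.0, 101)
errs = [rel_error(l) for l in lams]
lam_opt = lams[int(np.argmin(errs))]
print(lam_opt, min(errs))
```

In practice f is unknown, so the grid search over the true error is an oracle; the experiments in this section use it only to benchmark the attainable trade-off.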
In Figure 1(b) we depict the approximations obtained by Q_0, Q_1 and Q_{λ_opt} for a single realization of the sampling frequencies and noise with m = 20 and SNR = 20dB. The optimal choice of the regularization parameter λ_opt yields a significantly better approximation than the standard least squares approximation (with Q_1).
In Figure 1(c) we depict the quasi-optimality constant µ(Q_λ) versus the operator norm ||Q_λ||_op for λ ∈ [0, 1] for m = 20 and SNR = 20dB. The curve consists of the points (µ(Q_λ), ||Q_λ||_op) for λ ∈ [0, 1]. It exhibits a striking change of direction at a small value of λ, as is shown by the points corresponding to λ = 10^{-6}, 0.01, 0.1. A value of λ near the edge of the curve (around 0.01) yields a good trade-off between quasi-optimality and operator norm. Figure 1(d) shows the relative reconstruction error as a function of λ for fixed SNR. This is a typical L-curve known from many regularization procedures for ill-posed problems. The plots support the interpretation of λ as a regularization parameter.

4.3. Reconstruction from measurements of a biased function. We now assume that we are given a set of Fourier measurements of a perturbation of f ∈ H, namely d = [F(f + ∆f)(ω_{−n}), ..., F(f + ∆f)(ω_n)]^T.
The coefficients a_j in (47) are i.i.d. Gaussian with variance (average power) σ². Table 2 shows the relative error for SNR = ∞, SNR = 20dB and SNR = 10dB, with m = 10 in (a), m = 20 in (b), m = 30 in (c) and m = 40 in (d). In this case the most accurate reconstruction is always given by the operator Q_0, thus confirming Corollary 3.14. In addition, the distance ||f − P_T f||_{L²} decreases with increasing dimension of the reconstruction space, and the relative error decreases up to m = 40. For m = 50 the angle between the sampling and reconstruction spaces is ϕ_{T,U} = 0.309 ± 0.094, which results in a significantly higher approximation error.

APPENDIX: FRAMES IN HILBERT SPACES
We need the definition of the Moore-Penrose pseudoinverse of an operator on a Hilbert space [15, Section 2.5]. We use the notation R(A) for the range and N(A) for the null-space of an operator A.

(ii) If the synthesis operator U : ℓ²(N) → H of (u_j)_{j∈N} is bounded and has closed range, then (u_j)_{j∈N} is called a frame for the subspace U = span(u_j)_{j∈N}.
(iii) If (u_j)_{j∈N} is a Bessel sequence, then the adjoint operator of U is the analysis operator U* : H → ℓ²(N), U*f = (⟨f, u_j⟩_H)_{j∈N}, and S = UU* : H → H, Sf = Σ_{j=1}^∞ ⟨f, u_j⟩_H u_j, is the frame operator of (u_j).

(iv) The sequence (S†u_j)_{j∈N} ⊆ U is the canonical dual frame in U, and every f ∈ U possesses the frame expansions f = Σ_{j∈N} ⟨f, S†u_j⟩_H u_j = Σ_{j∈N} ⟨f, u_j⟩_H S†u_j.
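In finite dimensions these objects reduce to matrix computations. A minimal sketch (a frame for R^d given by random columns, with the canonical dual obtained via the pseudoinverse of S) verifying both frame expansions:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 5, 8
Usyn = rng.standard_normal((d, n))      # synthesis operator: columns u_j form a frame for R^d
S = Usyn @ Usyn.T                        # frame operator S = U U*
dual = np.linalg.pinv(S) @ Usyn          # canonical dual frame: columns S^dagger u_j

f = rng.standard_normal(d)
# Frame expansions: f = sum_j <f, S^dagger u_j> u_j = sum_j <f, u_j> S^dagger u_j
f1 = Usyn @ (dual.T @ f)
f2 = dual @ (Usyn.T @ f)
print(np.linalg.norm(f - f1), np.linalg.norm(f - f2))   # both near zero
```

Since the columns span R^d here, S is invertible and the pseudoinverse coincides with the inverse; for a frame of a proper subspace U the same code reproduces f only for f ∈ U.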