On numerical realizations of Shannon’s sampling theorem

In this paper, we discuss some numerical realizations of Shannon’s sampling theorem. First we show the poor convergence of classical Shannon sampling sums by presenting sharp upper and lower bounds of the norm of the Shannon sampling operator. In addition, it is known that in the presence of noise in the samples of a bandlimited function, the convergence of Shannon sampling series may even break down completely. To overcome these drawbacks, one can use oversampling and regularization with a convenient window function. Such a window function can be chosen either in frequency domain or in time domain. We especially put emphasis on the comparison of these two approaches in terms of error decay rates. It turns out that the best numerical results are obtained by oversampling and regularization in time domain using a sinh-type window function or a continuous Kaiser–Bessel window function, which results in an interpolating approximation with localized sampling. Several numerical experiments illustrate the theoretical results.


Introduction
The classical Whittaker–Kotelnikov–Shannon sampling theorem (see [32,11,25]) plays a fundamental role in signal processing, since it describes the close relation between a bandlimited function and its equidistant samples. A function f ∈ L²(ℝ) is called bandlimited with bandwidth N/2, if the support of its Fourier transform

$$\hat f(v) := \int_{\mathbb R} f(t)\,\mathrm e^{-2\pi\mathrm i\,v t}\,\mathrm dt,\quad v\in\mathbb R,$$

is contained in [−N/2, N/2]. Let the space of all bandlimited functions with bandwidth N/2 be denoted by

$$B_{N/2}(\mathbb R) := \big\{ f\in L^2(\mathbb R)\,:\, \operatorname{supp}\hat f\subseteq \big[-\tfrac N2,\tfrac N2\big]\big\},\tag{1.1}$$

which is also known as the Paley–Wiener space. By definition (1.1), the Paley–Wiener space B_{N/2}(ℝ) consists of equivalence classes of almost everywhere equal functions. However, it can be shown (see e.g. [10, p. 6]) that there is always a smooth representative, since for each r ∈ ℕ₀ we have by the inverse Fourier transform that

$$f^{(r)}(t) = \int_{-N/2}^{N/2} \hat f(v)\,(2\pi\mathrm i\,v)^r\,\mathrm e^{2\pi\mathrm i\,v t}\,\mathrm dv,\quad t\in\mathbb R,\tag{1.2}$$

and therefore f^{(r)} ∈ C₀(ℝ), where C₀(ℝ) denotes the Banach space of continuous functions f : ℝ → ℂ vanishing as |t| → ∞, equipped with the norm

$$\|f\|_{C_0(\mathbb R)} := \max_{t\in\mathbb R} |f(t)|.$$

That is to say, we have B_{N/2}(ℝ) ⊂ C^∞(ℝ) ∩ C₀(ℝ). Then the Whittaker–Kotelnikov–Shannon sampling theorem states that any bandlimited function f ∈ B_{N/2}(ℝ) can be recovered from its samples f(k/L), k ∈ ℤ, with L ≥ N as

$$f(t) = \sum_{k\in\mathbb Z} f\Big(\tfrac kL\Big)\,\operatorname{sinc}\big(L\pi t - \pi k\big),\quad t\in\mathbb R,\tag{1.3}$$

where the sinc function is given by

$$\operatorname{sinc} x := \begin{cases} \dfrac{\sin x}{x}, & x\in\mathbb R\setminus\{0\},\\[1ex] 1, & x = 0. \end{cases}$$

It is well known that the series in (1.3) converges absolutely and uniformly on the whole of ℝ.
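For illustration, the following Python sketch evaluates a truncated version of the series (1.3) for a simple bandlimited test function; the chosen test function, the parameters, and all function names are ad hoc assumptions made only for this demonstration.

```python
import numpy as np

def sinc(x):
    # Unnormalized sinc sin(x)/x with sinc(0) = 1, as used in (1.3);
    # np.sinc is the normalized variant sin(pi x)/(pi x), hence the rescaling.
    return np.sinc(x / np.pi)

def shannon_sum(f, L, T, t):
    # T-th partial sum of (1.3): sum_{|k|<=T} f(k/L) sinc(L*pi*t - pi*k).
    k = np.arange(-T, T + 1)
    return np.sum(f(k / L) * sinc(L * np.pi * np.atleast_1d(t)[:, None] - np.pi * k), axis=1)

# Illustrative bandlimited test function of bandwidth N/2 (an assumption).
N = L = 32                                   # no oversampling, L = N
f = lambda t: sinc(N * np.pi * t)            # f in B_{N/2}(R)

t = np.linspace(-1, 1, 1001)
for T in (16, 64, 256, 1024):
    err = np.max(np.abs(f(t) - shannon_sum(f, L, T, t)))
    print(f"T = {T:5d}:  max error on [-1,1] = {err:.2e}")
```

Even for large T the maximum error decreases only slowly; this poor convergence is quantified in Section 2.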
Unfortunately, the practical use of this sampling theorem is limited, since it requires infinitely many samples, which is impossible in practice. Furthermore, the sinc function decays very slowly, so that the Shannon sampling series with L ≥ N has rather poor convergence, as can be seen from the sharp upper and lower bounds of the norm of the Shannon sampling operator (see Theorem 2.2). In addition, it is known (see [6]) that in the presence of noise in the samples f(k/L), k ∈ ℤ, of a bandlimited function f ∈ B_{N/2}(ℝ), the convergence of the Shannon sampling series may even break down completely. To overcome these drawbacks, many applications employ oversampling, i.e., a function f ∈ L²(ℝ) of bandwidth N/2 is sampled on a finer grid (1/L)ℤ with L > N, where the amount of oversampling is measured by the oversampling parameter λ := (L−N)/N > 0. In addition, we consider several regularization techniques, where a so-called window function is used. Since this window function can be chosen in the frequency domain or in the spatial domain, we study both approaches and compare their theoretical and numerical approximation properties in terms of decay rates.
On the one hand, we investigate the regularization with a window function in the frequency domain (called frequency window function), cf. e.g. [5,15,24,17,28]. Here we use a suitable function of the form

$$\hat\psi(v) := \begin{cases} 1, & |v|\le \frac N2,\\ \chi(|v|), & \frac N2 < |v| < \frac L2,\\ 0, & |v|\ge \frac L2, \end{cases}$$

where χ : [N/2, L/2] → [0, 1] is frequently chosen as some monotonously decreasing, continuous function with χ(N/2) = 1 and χ(L/2) = 0. Applying the inverse Fourier transform, we determine the corresponding function ψ in the time domain. Since ψ̂ is compactly supported, the uncertainty principle (cf. [18, p. 103, Lemma 2.39]) yields supp ψ = ℝ. Then it is known that the function f can be represented in the form

$$f(t) = \frac 1L \sum_{k\in\mathbb Z} f\Big(\tfrac kL\Big)\,\psi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R.$$

Using uniform truncation, we approximate a function f ∈ B_{N/2}(ℝ) by the T-th partial sum

$$(P_{\psi,T} f)(t) := \frac 1L \sum_{k=-T}^{T} f\Big(\tfrac kL\Big)\,\psi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R.$$

On the other hand, we examine the regularization with a window function in the time domain (called time window function), cf. e.g. [21,13,14,10]. Here a suitable window function φ : ℝ → [0, 1] with compact support [−m/L, m/L] belongs to the set Φ_{m/L} (as defined in Section 4) with some m ∈ ℕ \ {1}. Then we recover a function f ∈ B_{N/2}(ℝ) by the regularized Shannon sampling formula

$$(R_{\varphi,m} f)(t) := \sum_{k\in\mathbb Z} f\Big(\tfrac kL\Big)\,\operatorname{sinc}\big(L\pi t - \pi k\big)\,\varphi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R,$$

with L ≥ N. By the definition of the set Φ_{m/L} of window functions φ, only small truncation parameters m are needed to achieve high accuracy, resulting in short sums that can be computed very fast. In other words, this approach uses localized sampling, since for fixed t only the samples f(k/L) with |k − Lt| ≤ m contribute. Moreover, this method is an interpolating approximation, since for all n, k ∈ ℤ we have

$$\operatorname{sinc}(\pi n - \pi k)\,\varphi\Big(\tfrac{n-k}{L}\Big) = \delta_{n,k},$$

such that (R_{φ,m} f)(n/L) = f(n/L). In this paper we propose new estimates of the uniform approximation errors

$$\max_{t\in[-1,1]} \big|f(t) - (P_{\psi,T} f)(t)\big| \quad\text{and}\quad \|f - R_{\varphi,m} f\|_{C_0(\mathbb R)},$$

where we apply several window functions ψ̂ and φ. Note that for the frequency window functions ψ̂ only error estimates on certain compact intervals (such as, for example, [−1, 1]) can be given, while results on the whole of ℝ are not feasible. It is shown in the subsequent sections that the uniform approximation error decays algebraically with respect to T if ψ̂ is a frequency window function. Otherwise, if φ ∈ Φ_{m/L} is chosen as a time window function such as the sinh-type or continuous Kaiser–Bessel window function, then the uniform approximation error decays exponentially with respect to m.
The paper is organized as follows. First, in Section 2 we show the poor convergence of classical Shannon sampling sums and improve results on the upper and lower bounds of the norm of the Shannon sampling operator. Subsequently, we study the different regularization techniques. In Section 3 we start with the regularization using a frequency window function. After recapitulating a general result in Theorem 3.3, we consider window functions of different regularity and present the corresponding algebraic decay results in Theorems 3.4 and 3.7. In Section 4 we proceed with the regularization using a time window function. Here we also review the known general result in Theorem 4.1 and afterwards demonstrate the exponential decay for the considered sinh-type and continuous Kaiser–Bessel window functions in Theorems 4.2 and 4.3. Finally, in Section 5 we compare the approaches from Sections 3 and 4 and illustrate our theoretical results with numerical examples.

Poor convergence of Shannon sampling sums
In order to show that the Shannon sampling series (1.3) has rather poor convergence, we truncate the series (1.3) with T ∈ ℕ and consider the T-th Shannon sampling sum

$$(S_T f)(t) := \sum_{k=-T}^{T} f\Big(\tfrac kL\Big)\,\operatorname{sinc}\big(L\pi t - \pi k\big),\quad t\in\mathbb R.\tag{2.1}$$

Obviously, this operator can be formed for each f ∈ C₀(ℝ).

Lemma 2.1. The norm of the operator S_T : C₀(ℝ) → C₀(ℝ) is given by

$$\|S_T\| = \max_{t\in\mathbb R} \sum_{k=-T}^{T} \big|\operatorname{sinc}(L\pi t - \pi k)\big|.\tag{2.2}$$

Proof. For each f ∈ C₀(ℝ) and t ∈ ℝ we have

$$|(S_T f)(t)| \le \|f\|_{C_0(\mathbb R)} \sum_{k=-T}^{T} \big|\operatorname{sinc}(L\pi t - \pi k)\big|.$$

By defining the even nonnegative function

$$s_T(t) := \sum_{k=-T}^{T} \big|\operatorname{sinc}(L\pi t - \pi k)\big|,\quad t\in\mathbb R,\tag{2.3}$$

which is contained in C₀(ℝ), and assuming that s_T attains its maximum at t₀ ∈ ℝ, this yields ∥S_T f∥_{C₀(ℝ)} ≤ s_T(t₀) ∥f∥_{C₀(ℝ)}. The other way around, we consider the linear spline g ∈ C₀(ℝ) with nodes in (1/L)ℤ, where

$$g\Big(\tfrac kL\Big) := \operatorname{sign}\big(\operatorname{sinc}(L\pi t_0 - \pi k)\big),\quad k = -T,\dots,T,$$

and g(k/L) := 0 for |k| > T. Obviously, we have ∥g∥_{C₀(ℝ)} = 1 and (S_T g)(t₀) = s_T(t₀), and hence (2.2). Now we show that the norm ∥S_T∥ is unbounded with respect to T.
Theorem 2.2. The norm of the operator S_T : C₀(ℝ) → C₀(ℝ) can be estimated from below and above by expressions of the form (2/π)(ln T + 2 ln 2 + γ) up to terms that vanish as T → ∞, where γ = 0.5772… denotes Euler's constant.

Proof. Since the function s_T in (2.3) is even, we estimate the maximum of s_T(t) only for t ≥ 0. For k = 1, …, T, we have a_k(0) = a_k(1/L) = 0, and by trigonometric identities each term can be rewritten for t ∈ (0, 1/L). We define the functions b_k : (0, 1) → ℝ, k = 1, …, T, via the substitution x = Lt; each b_k is symmetric with respect to 1/2. Furthermore, by b′_k(x) ≥ 0 for x ∈ (0, 1/2), the function b_k is increasing on (0, 1/2) and therefore has its maximum at x = 1/2. Thus, the function a_k : [0, 1/L] → ℝ has its maximum at t = 1/(2L), i.e., max_{t∈[0,1/L]} a_k(t) = a_k(1/(2L)). Estimating the remaining term a_{T+1}(t) separately, in summary this yields an upper bound for max_{t∈[0,1/L]} s_T(t). For t ∈ [n/L, (n+1)/L] with arbitrary n ∈ ℕ, the sum s_T(t) is less than it is for t ∈ [0, 1/L], since for each n ∈ ℕ and t ∈ [0, 1/L] we have s_T(n/L + t) < s_T(t). Thus, we obtain the asserted bounds. Note that for T ≫ 1 the value

$$s_T\Big(\tfrac{1}{2L}\Big) = \frac 4\pi \sum_{k=1}^{T} \frac{1}{2k-1} + \frac{2}{\pi(2T+1)}$$

is a good approximation of the norm ∥S_T∥.
Now we estimate ∥S_T∥ by ln T. For this purpose we denote the T-th harmonic number by H_T := Σ_{k=1}^{T} 1/k, for which the bounds

$$\ln T + \gamma < H_T < \ln T + \gamma + \frac{1}{2T}$$

are known (see [33]). From (2.7) and (2.8) we conclude the logarithmic bounds (2.9) on ∥S_T∥. Although the truncated sums S_T f do converge to f, this convergence is very slow due to the poor decay of the sinc function, as can be seen from the sharp upper and lower bounds of the norm of the Shannon sampling operator in Theorem 2.2. More precisely, rigorous analysis of the approximation error, cf. [8, Theorem 1], shows that on the fixed interval [−1, 1] we have a convergence rate of only (T − L)^{−1/2}, which was also mentioned in [34,29].
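The logarithmic growth of ∥S_T∥ can be observed numerically with the following sketch, which evaluates s_T on a grid of one period and compares its maximum with the main term (2/π)(ln T + 2 ln 2 + γ) discussed above; the grid resolution and the parameter choices are illustrative only.

```python
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)        # unnormalized sinc sin(x)/x

gamma = 0.5772156649015329           # Euler's constant
L = 1                                # the norm of S_T does not depend on L
t = np.linspace(0, 1 / L, 101)       # one period suffices, maximum near t = 1/(2L)
for T in (10, 100, 1000, 10000):
    k = np.arange(-T, T + 1)
    sT = np.sum(np.abs(sinc(L * np.pi * t[:, None] - np.pi * k)), axis=1)
    approx = 2 / np.pi * (np.log(T) + 2 * np.log(2) + gamma)
    print(f"T = {T:6d}:  max s_T = {sT.max():.4f},  (2/pi)(ln T + 2 ln 2 + gamma) = {approx:.4f}")
```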
Note that general results on the whole of ℝ are not feasible for arbitrary f ∈ B_{N/2}(ℝ). Rather, the uniform approximation error ∥f − S_T f∥_{C₀(ℝ)} can only be studied under the additional assumption that f ∈ B_{N/2}(ℝ) satisfies certain decay conditions, see [12,9].
Moreover, for a special choice of the error terms ε_k the perturbation of the T-th Shannon sampling sum grows unboundedly as T → ∞, such that the Shannon sampling series is not numerically robust in the worst case analysis.
Next, we proceed with the lower bound. Due to the special choice of the error terms ε_k, together with (2.6) and (2.9), we conclude the estimate (2.12), which completes the proof.
Note that (2.10) specifies a corresponding remark of [6, p. 681], as it makes the constant explicit. In addition, we remark that the error ∥f̃ − f∥_{C₀(ℝ)}, where f̃ denotes the Shannon sampling sums computed from the erroneous samples, does not depend on the special choice of the function f or the oversampling parameter λ, see Figure 2.1. Furthermore, Figure 2.1 also illustrates that for T → ∞ the error behavior shown in Theorem 2.4 is not satisfactory. Thus, in the presence of noise in the samples f(k/L), k ∈ ℤ, the convergence of the Shannon sampling series (1.3) may even break down completely.
Remark 2.5. In the above worst case analysis we have seen that the approximation of f ∈ B_{N/2}(ℝ) by the T-th partial sum (2.1) of its Shannon sampling series with L ≥ N is not numerically robust in the deterministic sense. In contrast, a simple average case study (see [31]) shows that this approximation is numerically robust in the stochastic sense. To see this, we compute (2.1) as an inner product of the real (2T+1)-dimensional vectors (f(k/L))_{k=−T}^{T} and (sinc(Lπt − πk))_{k=−T}^{T}. Now assume that instead of the exact samples f(k/L), only the erroneous samples f(k/L) + X_k, k = −T, …, T, are given, where the real random variables X_k are uncorrelated, each having expectation E(X_k) = 0 and constant variance V(X_k) = ρ² with ρ > 0. Then we consider the stochastic approximation error

$$\Delta_T := \sum_{k=-T}^{T} X_k\,\operatorname{sinc}\big(L\pi t - \pi k\big).$$

Obviously, this error term Δ_T has the expectation E(Δ_T) = 0 and the variance

$$V(\Delta_T) = \rho^2 \sum_{k=-T}^{T} \operatorname{sinc}^2\big(L\pi t - \pi k\big).$$

From [27, p. 89, Problem 1.10.9] it follows that

$$\sum_{k\in\mathbb Z} \operatorname{sinc}^2\big(L\pi t - \pi k\big) = 1,\quad t\in\mathbb R,$$

and hence V(Δ_T) ≤ ρ².
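The following short sketch checks the above identity numerically and estimates the variance of Δ_T by a Monte Carlo simulation with uniformly distributed noise of variance ρ²; the noise model and all parameter values are illustrative assumptions.

```python
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)          # unnormalized sinc sin(x)/x

rng = np.random.default_rng(0)
L, T, rho, t = 32, 500, 1e-3, 0.3      # illustrative parameters
k = np.arange(-T, T + 1)
phi_k = sinc(L * np.pi * t - np.pi * k)

# The identity sum_k sinc^2(L*pi*t - pi*k) = 1 (truncated to |k| <= T)
print("sum of sinc^2 over |k| <= T:", np.sum(phi_k**2))

# Monte Carlo estimate of Var(Delta_T); uniform noise on [-sqrt(3)*rho, sqrt(3)*rho]
# has mean 0 and variance rho^2.
X = rng.uniform(-np.sqrt(3) * rho, np.sqrt(3) * rho, size=(5000, 2 * T + 1))
Delta = X @ phi_k
print("empirical Var(Delta_T):", Delta.var(), " vs  rho^2 =", rho**2)
```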

Regularization with a frequency window function
To overcome the drawbacks of poor convergence and numerical instability, one can apply regularization with a convenient window function either in the frequency domain or in the time domain. Often one employs oversampling, i.e., a bandlimited function f ∈ B_{N/2}(ℝ) of bandwidth N/2 is sampled on a finer grid (1/L)ℤ with L = N(1+λ), where λ > 0 is the oversampling parameter.
First, together with oversampling, we consider the regularization with a frequency window function of the form

$$\hat\psi(v) := \begin{cases} 1, & |v|\le \frac N2,\\ \chi(|v|), & \frac N2 < |v| < \frac L2,\\ 0, & |v|\ge \frac L2, \end{cases}\tag{3.1}$$

cf. [5,17,28], where χ : [N/2, L/2] → [0, 1] is frequently chosen as some monotonously decreasing, continuous function with χ(N/2) = 1 and χ(L/2) = 0. Applying the inverse Fourier transform, we determine the corresponding function in the time domain as

$$\psi(t) := \int_{-L/2}^{L/2} \hat\psi(v)\,\mathrm e^{2\pi\mathrm i\,v t}\,\mathrm dv,\quad t\in\mathbb R.\tag{3.2}$$

The simplest choice of χ is the linear function χ(v) = (L − 2v)/(L − N), which yields the linear frequency window function

$$\hat\psi_{\mathrm{lin}}(v) := \begin{cases} 1, & |v|\le \frac N2,\\ \frac{L-2|v|}{L-N}, & \frac N2 < |v| < \frac L2,\\ 0, & |v|\ge \frac L2. \end{cases}\tag{3.3}$$

Note that in a trigonometric setting a function of the form (3.3) is also often referred to as a trapezoidal or de la Vallée Poussin type window function. Obviously, ψ̂_lin(v) is a continuous linear spline supported on [−L/2, L/2], see Figure 3.1 (a). By (3.2) we receive

$$\psi_{\mathrm{lin}}(t) = \frac{\cos(N\pi t) - \cos(L\pi t)}{(L-N)\,\pi^2 t^2},\quad t\in\mathbb R\setminus\{0\},\qquad \psi_{\mathrm{lin}}(0) = \frac{N+L}{2}.\tag{3.4}$$

The function (1/L) ψ_lin is even, supported on the whole of ℝ, and has its maximum at t = 0, such that (1/L) ψ_lin(0) = (N+L)/(2L). In addition, (1/L) ψ_lin(t) has a faster decay than sinc(Nπt) for |t| → ∞, cf. Figure 3.1 (b). Note that we have lim_{λ→0} (1/L) ψ_lin(t) = sinc(Nπt). However, {(1/L) ψ_lin(· − k/L)}_{k∈ℤ} is not an orthonormal sequence and also not a Riesz sequence; to see this, one considers a certain 1-periodic function built from ψ̂_lin.

Figure 3.1: The frequency window function (3.3) and its scaled inverse Fourier transform (3.4).
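As a sanity check, the closed form (3.4) can be compared with a direct numerical evaluation of the inverse Fourier integral (3.2); the following sketch does this for illustrative parameters (all names are ad hoc).

```python
import numpy as np

def psi_lin_hat(v, N, L):
    # Linear frequency window (3.3): 1 on [-N/2, N/2], linear decay to 0 at +-L/2;
    # the clip realizes the piecewise definition in one expression.
    return np.clip((L - 2 * np.abs(v)) / (L - N), 0.0, 1.0)

def psi_lin(t, N, L):
    # Closed form (3.4); the removable singularity at t = 0 gives (N + L)/2.
    t = np.asarray(t, dtype=float)
    out = np.full_like(t, (N + L) / 2)
    nz = t != 0
    out[nz] = (np.cos(N*np.pi*t[nz]) - np.cos(L*np.pi*t[nz])) / ((L - N) * (np.pi*t[nz])**2)
    return out

N, L = 32, 48
t = np.linspace(-0.4, 0.4, 7)
# Numerical inverse Fourier transform (3.2) via a simple Riemann sum
v = np.linspace(-L/2, L/2, 20001)
dv = v[1] - v[0]
quad = (psi_lin_hat(v, N, L) * np.exp(2j*np.pi*np.outer(t, v))).sum(axis=1) * dv
print(np.max(np.abs(quad.real - psi_lin(t, N, L))))   # should be close to zero
```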
Theorem 3.3. Let f ∈ B_{N/2}(ℝ) with N ∈ ℕ and L = N(1+λ), λ > 0, be given, and assume that the samples f(k/L), k ∈ ℤ, fulfill the condition (3.5). Using oversampling and regularization with a frequency window function ψ̂ of the form (3.1), the function f can be represented as

$$f(t) = \frac 1L \sum_{k\in\mathbb Z} f\Big(\tfrac kL\Big)\,\psi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R,\tag{3.6}$$

where the series (3.6) converges absolutely and uniformly on ℝ.

Proof. Since by assumption the condition (3.5) is fulfilled and supp f̂ ⊆ [−N/2, N/2] ⊂ [−L/2, L/2], the function f̂ restricted to [−L/2, L/2] can be represented by its L-periodic Fourier series with the Fourier coefficients (1/L) f(−k/L), k ∈ ℤ. Using the inverse Fourier transform, we see that

$$f(t) = \int_{-L/2}^{L/2} \hat f(v)\,\mathrm e^{2\pi\mathrm i\,v t}\,\mathrm dv,\quad t\in\mathbb R.\tag{3.7}$$

Hence, we may write f given by (3.7) as the series (3.6), where summation and integration may be interchanged by the theorem of Fubini–Tonelli, since the samples are absolutely summable.

Note that (3.6) is not an interpolating approximation, since in general we have (1/L) ψ((n−k)/L) ≠ δ_{n,k} for n, k ∈ ℤ. Moreover, since the frequency window function ψ̂ in (3.1) is compactly supported, the uncertainty principle (cf. [18, p. 103, Lemma 2.39]) yields supp ψ = ℝ, such that (3.6) does not imply localized sampling for any choice of ψ̂. In other words, the representation (3.6) still requires infinitely many samples f(k/L). Thus, for practical realizations we need to consider a truncated version of (3.6), and hence for T ∈ ℕ we introduce the T-th partial sum

$$(P_{\psi,T} f)(t) := \frac 1L \sum_{k=-T}^{T} f\Big(\tfrac kL\Big)\,\psi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R.\tag{3.9}$$

For the linear frequency window function (3.3) this yields the partial sums P_{lin,T} f, for which the convergence result of Theorem 3.4 holds.

Proof. By (3.6) and (3.9) we have

$$f(t) - (P_{\psi,T} f)(t) = \frac 1L \sum_{|k|>T} f\Big(\tfrac kL\Big)\,\psi\Big(t - \tfrac kL\Big),$$

such that the Cauchy–Schwarz inequality implies an estimate in terms of the samples and of the decay of ψ_lin, see [10, Formula (3.8)], and thereby we obtain the asserted bound. It can easily be seen that (3.4) satisfies the required decay condition.
Example 3.5. Next, we visualize the error bound of Theorem 3.4, i.e., for a given function f ∈ B_{N/2}(ℝ) with L = N(1+λ), λ > 0, we show that the approximation error satisfies (3.10). For this purpose, the error max_{t∈[−1,1]} |f(t) − (P_{lin,T} f)(t)| is estimated by evaluating the given function f as well as its approximation P_{lin,T} f, cf. (3.9), at equidistant points t_s ∈ [−1, 1], s = 1, …, S, with S = 10⁵. Here we study a fixed bandlimited test function f ∈ B_{N/2}(ℝ). We fix N = 128 and consider the error behavior for increasing T ∈ ℕ. More specifically, in this experiment we choose several oversampling parameters λ ∈ {0.5, 1, 2} and truncation parameters T = 2^c with c ∈ {0, …, 15}. The corresponding results are depicted in Figure 3.2. Note that the error bound in (3.10) is only valid for T > L. Therefore, we have additionally marked the point T = L for each λ as a vertical dash-dotted line. It can easily be seen that the error results are much better when T > L. Note, however, that increasing the oversampling parameter λ requires a much larger truncation parameter T to obtain errors of the same size. In order to obtain convergence rates better than the one in Theorem 3.4, one may consider frequency window functions (3.1) of higher smoothness.
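A reduced version of such an experiment can be sketched as follows; the test function, the coarser evaluation grid, and the parameter ranges are illustrative assumptions and differ from the setting described above.

```python
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)                # unnormalized sinc sin(x)/x

def psi_lin(t, N, L):
    # Inverse Fourier transform (3.4) of the linear frequency window (3.3).
    t = np.asarray(t, dtype=float)
    out = np.full_like(t, (N + L) / 2)
    nz = t != 0
    out[nz] = (np.cos(N*np.pi*t[nz]) - np.cos(L*np.pi*t[nz])) / ((L - N) * (np.pi*t[nz])**2)
    return out

def P_lin_T(f, N, L, T, t):
    # T-th partial sum (3.9) with the linear frequency window.
    k = np.arange(-T, T + 1)
    return (f(k / L) * psi_lin(t[:, None] - k / L, N, L)).sum(axis=1) / L

N, lam = 128, 1.0
L = int(N * (1 + lam))
f = lambda t: sinc(N * np.pi * t / 2)**2     # illustrative f in B_{N/2}(R) with summable samples
t = np.linspace(-1, 1, 401)                  # coarser grid than S = 10^5 to keep memory small
for T in (256, 1024, 4096):                  # the bound (3.10) requires T > L
    err = np.max(np.abs(f(t) - P_lin_T(f, N, L, T, t)))
    print(f"lambda = {lam}, T = {T:5d}:  max error on [-1,1] = {err:.3e}")
```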
Thus, to obtain a smoother frequency window function, we need to additionally satisfy the first order conditions lim_{v→N/2} χ′(v) = lim_{v→L/2} χ′(v) = 0. Then the corresponding interpolation polynomial yields the cubic frequency window function ψ̂_cub in (3.15), see Figure 3.3 (a). By (3.2) we see that the inverse Fourier transform of (3.15) is given by (3.16) for t ∈ ℝ \ {0}, with ψ_cub(0) = (L+N)/2, cf. Figure 3.3 (b).
Example 3.8. Another continuously differentiable frequency window function is given in [24] as the raised cosine frequency window function

$$\hat\psi_{\cos}(v) := \begin{cases} 1, & |v|\le \frac N2,\\ \frac 12\Big(1 + \cos\frac{\pi(2|v|-N)}{L-N}\Big), & \frac N2 < |v| < \frac L2,\\ 0, & |v|\ge \frac L2, \end{cases}\tag{3.18}$$

see Figure 3.3 (a). By (3.2) the corresponding function ψ_cos in the time domain can be determined in closed form as well, see (3.19) and Figure 3.3 (b). Therefore, instead of determining smooth frequency window functions of the form (3.1) by means of interpolation as in Example 3.6, they can also be constructed by convolution, cf. [15].
Figure 3.3: (a) The frequency window functions ψ̂_cub and ψ̂_cos. (b) The corresponding functions ψ_cub and ψ_cos in the time domain.

This completes the proof.
Given such a frequency window function ψ̂_conv as in (3.20), i.e., the convolution of the indicator function of [−(L+N)/4, (L+N)/4] with an even function ρ supported on [−(L−N)/4, (L−N)/4] and normalized to ∫ρ(v) dv = 1, its inverse Fourier transform (3.2) is known by the convolution property of the Fourier transform as the product of the inverse Fourier transforms of the two factors in (3.20). Thus, to obtain a suitable window function (3.20), we need to ensure that the inverse Fourier transform ρ̌ of ρ is explicitly known. Since ρ is even by assumption, we have ρ̌ = ρ̂ with

$$\check\rho(t) = \hat\rho(t) = 2\int_{0}^{\infty} \rho(v)\,\cos(2\pi v t)\,\mathrm dv,\quad t\in\mathbb R.$$

Note that the convolutional approach (3.20) has the substantial advantage that the smoothness of (3.20) is determined by the smoothness of the chosen function ρ.

Example 3.12. In [15] an infinitely differentiable function ρ_∞ with the scaling factor c = 1/2
is considered. The corresponding frequency window function (3.20) is denoted by ψ̂_∞. However, since for this function ρ_∞ the inverse Fourier transform ρ̌_∞ cannot be stated explicitly, the function (3.22) in the time domain can only be approximated, which was done by a piecewise rational approximation ρ_rat in [15]. We remark that because of this additional approximation a numerical decay of the expected rate is doubtful, since the issue of robustness of the corresponding regularized Shannon series remains unclear. This effect can also be seen in Figure 5.1. The same comment also applies to [28], where an infinitely differentiable window function ψ̂ is aimed for as well. Since such a ψ̂ with explicitly known inverse Fourier transform (3.2) is not available, in [28] the function ψ is estimated by some Gabor approximation. Although an efficient computation scheme via the fast Fourier transform (FFT) was introduced in [29], the numerical non-robustness caused by this approximation seems to be neglected in this work.
Finally, we remark that already in [5, p. 19] it was stated that a faster decay than for ψ̂_lin from (3.3) can be obtained by choosing ψ̂ in (3.1) smoother, but at the price of a very large constant. This can also be seen in Figure 5.1, where the results for the window functions ψ̂_cub in (3.15), ψ̂_cos in (3.18), ψ_conv,2 in (3.23), and ψ_rat from Example 3.12 are plotted as well. For this reason many authors restricted themselves to the linear frequency window function ψ̂_lin in (3.3). Furthermore, the numerical results in Figure 5.1 suggest that in practice only algebraic decay rates are achievable by regularization with a frequency window function.

Regularization with a time window function
In order to obtain better decay rates, we now consider a second regularization technique, namely regularization with a convenient window function in the time domain. To this end, for L > N and any m ∈ ℕ \ {1} with 2m ≪ L, we introduce the set Φ_{m/L} of all window functions φ : ℝ → [0, 1] with the following properties:

• Each φ ∈ Φ_{m/L} is supported on [−m/L, m/L]. Further, φ is even and continuous on [−m/L, m/L].
• Each φ ∈ Φ_{m/L}, restricted to [0, m/L], is monotonously non-increasing with φ(0) = 1.

As examples of such window functions we consider the sinh-type window function

$$\varphi_{\sinh}(t) := \begin{cases} \dfrac{1}{\sinh\beta}\,\sinh\!\Big(\beta\sqrt{1 - \big(\tfrac{Lt}{m}\big)^2}\Big), & |t| \le \frac mL,\\[1.5ex] 0, & |t| > \frac mL, \end{cases}\tag{4.1}$$

with β = πm(L−N)/L, and the continuous Kaiser–Bessel window function

$$\varphi_{\mathrm{cKB}}(t) := \begin{cases} \dfrac{1}{I_0(\beta) - 1}\,\Big(I_0\!\Big(\beta\sqrt{1 - \big(\tfrac{Lt}{m}\big)^2}\Big) - 1\Big), & |t| \le \frac mL,\\[1.5ex] 0, & |t| > \frac mL, \end{cases}\tag{4.2}$$

with β = πm(L−N)/L, where I₀(x) denotes the modified Bessel function of the first kind given by

$$I_0(x) := \sum_{k=0}^{\infty} \frac{1}{(k!)^2}\,\Big(\frac x2\Big)^{2k},\quad x\in\mathbb R.$$

Both window functions are well studied in the context of the nonequispaced fast Fourier transform (NFFT), see e.g. [19] and the references therein.
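Both windows are straightforward to evaluate; the following sketch implements (4.1) and (4.2) using np.i0 and checks the defining properties of Φ_{m/L} (values in [0, 1], φ(0) = 1, compact support) for illustrative parameters.

```python
import numpy as np

def phi_sinh(t, N, L, m):
    # sinh-type window (4.1) with beta = pi*m*(L-N)/L, supported on [-m/L, m/L]
    beta = np.pi * m * (L - N) / L
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    s = np.abs(t) <= m / L
    out[s] = np.sinh(beta * np.sqrt(1 - (L * t[s] / m) ** 2)) / np.sinh(beta)
    return out

def phi_cKB(t, N, L, m):
    # continuous Kaiser-Bessel window (4.2); np.i0 is the modified Bessel function I_0
    beta = np.pi * m * (L - N) / L
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    s = np.abs(t) <= m / L
    out[s] = (np.i0(beta * np.sqrt(1 - (L * t[s] / m) ** 2)) - 1) / (np.i0(beta) - 1)
    return out

N, L, m = 64, 128, 5                           # illustrative parameters
t = np.linspace(-m / L, m / L, 9)
for phi in (phi_sinh, phi_cKB):
    print(phi.__name__, phi(t, N, L, m).round(4))   # in [0,1], phi(0)=1, phi(+-m/L)=0
```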
A visualization of the continuous Kaiser–Bessel window function (4.2) as well as the corresponding regularization ξ_cKB(t) := sinc(Lπt) φ_cKB(t) of the sinc function can be found in Figure 4.1. We remark that, in contrast to Figure 3.1, here the function ξ_cKB in the time domain is compactly supported and its Fourier transform ξ̂_cKB is supported on the whole of ℝ, whereas for the frequency window function (3.3) it is vice versa (see [18, p. 103, Lemma 2.39]). Note that a visualization for the sinh-type window function (4.1) would look the same as Figure 4.1. This concept of regularized Shannon sampling formulas with localized sampling and oversampling has already been studied by various authors. A survey of different approaches for window functions can be found in [22], while the prominent Gaussian window function was considered e.g. in [21,23,26,30,13]. Since this Gaussian window function has also been studied in [10], where the superiority of the sinh-type window function (4.1) was shown, we now focus on time window functions φ ∈ Φ_{m/L} such as (4.1) and (4.2).
Similarly to [10], for given f ∈ B_{N/2}(ℝ) and φ ∈ Φ_{m/L} we consider the regularized Shannon sampling formula

$$(R_{\varphi,m} f)(t) := \sum_{k\in\mathbb Z} f\Big(\tfrac kL\Big)\,\operatorname{sinc}\big(L\pi t - \pi k\big)\,\varphi\Big(t - \tfrac kL\Big),\quad t\in\mathbb R.\tag{4.3}$$

Its uniform approximation error ∥f − R_{φ,m} f∥_{C₀(ℝ)} can be estimated as follows.
with the corresponding error constants, see (4.4). Note that it is especially beneficial for the estimation of the error constant (4.4) if the Fourier transform of φ ∈ Φ_{m/L} is explicitly known. Now we specify the result of Theorem 4.1 for certain window functions. To this end, assume that f ∈ B_{N/2}(ℝ) with N ∈ ℕ and L = N(1+λ), λ > 0. Additionally, we choose m ∈ ℕ \ {1}. We demonstrate that for the window functions (4.1) and (4.2) the approximation error of the regularized Shannon sampling formula (4.3) decreases exponentially with respect to m. For the sinh-type window function (4.1) the following result is already known. Hence, we obtain the stated error bound, where for the choice β = πmλ/(1+λ) the bound decays exponentially in m. By [2] the function e^{−x} I₀(x) is strictly decreasing on [0, ∞) and tends to zero as x → ∞. Numerical experiments have shown that the resulting constant is uniformly bounded, and thus we conclude that the assertion holds for all m ∈ ℕ \ {1}. This completes the proof.
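The exponential decay with respect to m can be observed with the following sketch of the regularized Shannon sampling formula (4.3) with the sinh-type window (4.1); the test function and all parameters are illustrative assumptions.

```python
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)                  # unnormalized sinc sin(x)/x

def phi_sinh(t, N, L, m):
    # sinh-type window (4.1) with beta = pi*m*(L-N)/L
    beta = np.pi * m * (L - N) / L
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    s = np.abs(t) <= m / L
    out[s] = np.sinh(beta * np.sqrt(1 - (L * t[s] / m) ** 2)) / np.sinh(beta)
    return out

def R_phi_m(f, N, L, m, t):
    # Regularized Shannon sampling formula (4.3): localized sampling, since only
    # the samples f(k/L) with |k - L*t| <= m contribute.
    t = np.atleast_1d(t)
    out = np.zeros_like(t)
    for j, tj in enumerate(t):
        k = np.arange(np.ceil(L * tj - m), np.floor(L * tj + m) + 1)
        out[j] = np.sum(f(k / L) * sinc(L*np.pi*tj - np.pi*k) * phi_sinh(tj - k/L, N, L, m))
    return out

N, lam = 64, 1.0
L = N * (1 + lam)
f = lambda t: sinc(N * np.pi * t)              # illustrative f in B_{N/2}(R)
t = np.linspace(-1, 1, 2001)
for m in (2, 4, 6, 8, 10):
    err = np.max(np.abs(f(t) - R_phi_m(f, N, L, m, t)))
    print(f"m = {m:2d}:  max error = {err:.2e}")   # error decays exponentially in m
```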
As seen in Theorem 2.4, if the samples f(k/L), k ∈ ℤ, of a bandlimited function f ∈ B_{N/2}(ℝ) are not known exactly, i.e., only erroneous samples f̃_k := f(k/L) + ε_k with |ε_k| ≤ ε, k ∈ ℤ, and ε > 0 are known, the corresponding Shannon sampling series (1.3) may differ appreciably from f. Here we denote the regularized Shannon sampling formula with erroneous samples f̃_k by R̃_{φ,m} f. Then, in contrast to the Shannon sampling series (1.3), the regularized Shannon sampling formula (4.3) is numerically robust in the worst case analysis, i.e., the uniform perturbation error ∥R̃_{φ,m} f − R_{φ,m} f∥_{C₀(ℝ)} is small.
Note that it is especially beneficial for obtaining explicit error estimates if the Fourier transform (4.6) of φ ∈ Φ_{m/L} is explicitly known. In the following, we demonstrate that for the window functions (4.1) and (4.2) the perturbation error of the regularized Shannon sampling formula (4.3) only grows as O(√m).
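The mild growth of the worst-case perturbation bound with respect to m can be checked numerically as follows; the bound computed here is simply ε · max_t Σ_k |sinc(Lπt − πk) φ_sinh(t − k/L)|, and all parameters are illustrative.

```python
import numpy as np

def sinc(x):
    return np.sinc(x / np.pi)

def phi_sinh(t, N, L, m):
    beta = np.pi * m * (L - N) / L
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    s = np.abs(t) <= m / L
    out[s] = np.sinh(beta * np.sqrt(1 - (L*t[s]/m)**2)) / np.sinh(beta)
    return out

# With |eps_k| <= eps, the perturbation of (4.3) is bounded by eps times the
# localized "Lebesgue-type" sum below, which contains at most 2m + 1 nonzero terms.
N, lam, eps = 64, 1.0, 1e-3
L = N * (1 + lam)
t = np.linspace(0, 1 / L, 1001)            # one period suffices
for m in (2, 4, 8, 16, 32):
    k = np.arange(-m - 1, m + 2)
    leb = np.max(np.sum(np.abs(sinc(L*np.pi*t[:, None] - np.pi*k)
                               * phi_sinh(t[:, None] - k/L, N, L, m)), axis=1))
    print(f"m = {m:2d}:  worst-case perturbation bound = {eps * leb:.2e}")
```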
Proof. The proof follows from (4.11) together with β = πmλ/(1+λ), which yields the assertion (4.12).

The associated numerical results are displayed in Figure 5.1. We see that for all window functions the theoretical error behavior perfectly coincides with the numerical outcomes. In this regard, see also Table 5.1, which summarizes the theoretical results. Moreover, it can clearly be seen that for a higher oversampling parameter λ and a higher truncation parameter m, the error results using (4.3) become much better than the ones using (3.6), due to the exponential error decay rate shown for (4.3). That is to say, our numerical results show that regularization with a time window function performs much better than regularization with a frequency window function, since an exponential decay can (up to now) only be realized using a time window function. Furthermore, the great importance of an explicit representation of the Fourier transform of the regularizing window function can be seen, cf. Example 3.12.

Comparison of the two regularization methods
Note that the code files for this and all the other experiments are available on https://github.com/melaniekircheis/On-numerical-realizations-of-Shannons-sampling-theorem.
In summary, we found that the regularized Shannon sampling formula with the sinh-type time window function is the best of the considered methods, since this approach is the most accurate, easy to compute, robust in the worst case error, and requires less data (for comparable accuracy) than the classical Shannon sampling sums or the regularization with a frequency window function.

Then for the linear frequency window function (3.3) we show the following convergence result.

Theorem 3.4. Let f ∈ B_{N/2}(ℝ) be a bandlimited function with bandwidth N/2, N ∈ ℕ, and let L = N(1+λ) with λ > 0 be given. Assume that the samples f(k/L), k ∈ ℤ, fulfill the condition (3.5). Using oversampling and regularization with the linear frequency window function (3.3), the T-th partial sums P_{lin,T} f converge uniformly to f on [−1, 1] as T → ∞. For T > L the estimate (3.10) on max_{t∈[−1,1]} |f(t) − (P_{lin,T} f)(t)| holds.

Example 3.6. Next, we construct a continuously differentiable frequency window function by polynomial interpolation. Since the frequency window function (3.1) is even, it suffices to consider χ : [N/2, L/2] → [0, 1] only at the interval boundaries N/2 and L/2. Clearly, the linear frequency window function ψ̂_lin in (3.3) fulfills the interpolation conditions lim_{v→N/2} ψ̂_lin(v) = 1 and lim_{v→L/2} ψ̂_lin(v) = 0, but its one-sided derivatives do not vanish at these points.

Analogous to Theorem 3.4, the following error estimate can be derived.

Theorem 3.7. Let f ∈ B_{N/2}(ℝ) be a bandlimited function with bandwidth N/2, N ∈ ℕ, and let L = N(1+λ) with λ > 0 be given. Assume that the samples f(k/L), k ∈ ℤ, fulfill the condition (3.5). Using oversampling and regularization with the cubic frequency window function (3.15), the T-th partial sums P_{cub,T} f converge uniformly to f on [−1, 1] as T → ∞. For T > L the estimate (3.17) on max_{t∈[−1,1]} |f(t) − (P_{cub,T} f)(t)| holds.

Note that since ψ̂_cos in (3.18) possesses the same regularity as ψ̂_cub in (3.15), both frequency window functions meet the same error bound (3.17), cf. Figure 5.1. Note that by (3.4) and the convolution property of the Fourier transform, for L > N the linear frequency window function (3.3) can be written as the convolution

$$\hat\psi_{\mathrm{lin}}(v) = \big(\mathbf 1_{[-\frac{L+N}{4},\frac{L+N}{4}]} * \rho\big)(v) \quad\text{with}\quad \rho(v) := \tfrac{2}{L-N}\,\mathbf 1_{[-\frac{L-N}{4},\frac{L-N}{4}]}(v),$$

where 𝟙_{[a,b]} denotes the indicator function of the interval [a, b].

Remark 3.10. The frequency window functions ψ̂_cub in (3.15) and ψ̂_cos in (3.18) lack a convolutional representation (3.20). Although the corresponding functions (3.16) and (3.19) in the spatial domain are of the form (3.22), for both frequency windows the Fourier transform of the respective function ρ is only known in the sense of tempered distributions.

Example 3.11. For the special choice ρ(v) = (2n/(L−N)) M_n((2n/(L−N)) v) with n ∈ ℕ, where M_n is the centered cardinal B-spline of order n, we have ρ̌(t) = [sinc((L−N)πt/(2n))]^n. Using n = 1 this again yields (3.4), whereas for n = 2 we obtain

$$\psi_{\mathrm{conv},2}(t) = \frac{N+L}{2}\,\operatorname{sinc}\Big(\frac{N+L}{2}\,\pi t\Big)\,\Big[\operatorname{sinc}\Big(\frac{L-N}{4}\,\pi t\Big)\Big]^2.\tag{3.23}$$

Note that the frequency window function ψ̂_conv,2, cf. (3.23), possesses the same regularity as ψ̂_cub in (3.15) and ψ̂_cos in (3.18), and therefore they all meet the same error bound (3.17), cf. Figure 5.1.