Fourier series windowed by a bump function

We study the Fourier transform windowed by a bump function. We transfer Jackson's classical results on the convergence of the Fourier series of a periodic function to windowed series of a not necessarily periodic function. Numerical experiments illustrate the obtained theoretical results.


Introduction
The theory of Fourier series plays an essential role in numerous applications of contemporary mathematics. It allows us to represent a periodic function in terms of complex exponentials. Indeed, any real function $f : \mathbb{R} \to \mathbb{R}$ of period $2\pi$ that is locally of bounded variation has a pointwise converging Fourier series, i.e.
$$f(x) = \sum_{k \in \mathbb{Z}} \hat f(k)\, e^{ikx}, \qquad \hat f(k) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx}\, dx.$$
By the classical results of Jackson in 1930, see [Jac94], the decay rate of the Fourier coefficients, and therefore the convergence speed of the Fourier series, depends on the regularity of the function. If $f$ has a jump discontinuity, then the order of magnitude of the coefficients is $O(1/|k|)$ as $|k| \to \infty$. Moreover, if $f$ is a smooth function of period $2\pi$, say $f \in C^{s+1}(\mathbb{R})$ for some $s \ge 1$, then the order improves to $O(1/|k|^{s+1})$.
In the present paper we focus on the reconstruction of a not necessarily periodic function with respect to a finite interval $(-\lambda, \lambda)$. For this purpose, consider a smooth, non-periodic real function $\psi : \mathbb{R} \to \mathbb{R}$ that we want to represent by a Fourier series in $(-\lambda, \lambda)$. To this end we examine its $2\lambda$-periodic extension, see Figure 1. Whenever $\psi(-\lambda^+) \ne \psi(\lambda^-)$, the periodization has a jump discontinuity at $\lambda$, and thus the Fourier coefficients are $O(1/|k|)$. An easy way to eliminate these discontinuities at the boundary is to multiply the original function by a smooth window, compactly supported in $[-\lambda, \lambda]$. The resulting periodization has no jumps. Consequently, one expects faster convergence of the windowed Fourier sums.
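The expected gain in decay can be checked on a small example. The following is a minimal sketch, assuming the saw wave $\psi(x) = x$ on $(-\pi, \pi)$ and the Hann window $\cos^2(x/2)$ as illustrative choices (both are assumptions made here, not the setup of §5). For this pair the plain coefficients have modulus $1/|k|$, and a direct computation gives $1/(2|k|(k^2-1))$ for the windowed ones, $|k| \ge 2$. The helper name `fourier_coefficients` is hypothetical.

```python
import numpy as np

def fourier_coefficients(values, x, ks):
    """Midpoint-rule approximation of (1/2pi) * int_{-pi}^{pi} f(x) e^{-ikx} dx."""
    dx = x[1] - x[0]
    return np.array([np.sum(values * np.exp(-1j * k * x)) * dx / (2 * np.pi)
                     for k in ks])

N = 1 << 14
x = -np.pi + (np.arange(N) + 0.5) * (2 * np.pi / N)  # midpoints of a uniform grid
psi = x                                              # saw wave on (-pi, pi) (assumed example)
hann = np.cos(x / 2) ** 2                            # Hann window on (-pi, pi) (assumed example)

ks = np.array([2, 4, 8, 16, 32])
c_plain = fourier_coefficients(psi, x, ks)           # periodization has a jump
c_hann = fourier_coefficients(psi * hann, x, ks)     # periodization is C^1

for k, a, b in zip(ks, np.abs(c_plain), np.abs(c_hann)):
    print(f"k={k:3d}  plain={a:.3e}  hann={b:.3e}")
```

The plain column decays like $1/k$, while the windowed column decays like $1/k^3$, in line with the jump being moved from the function to its second derivative.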
The concept of windowed Fourier atoms was introduced by Gabor in 1946, see [Gab46]. According to [Mal09, chapter 4.2], for $(x, \xi) \in \mathbb{R}^2$ and a symmetric window function $g : \mathbb{R} \to \mathbb{R}$ satisfying $\|g\|_{L^2(\mathbb{R})} = 1$, these atoms are given by
$$g_{x,\xi}(y) := e^{i\xi y} g(y - x), \quad y \in \mathbb{R}.$$
The resulting short-time Fourier transform (STFT) of $\psi \in L^2(\mathbb{R})$ is
$$S\psi(x, \xi) := \langle \psi, g_{x,\xi} \rangle = \int_{\mathbb{R}} \psi(y)\, g(y - x)\, e^{-i\xi y}\, dy. \tag{1.1}$$
It can be understood as the Fourier transform on the real line of the windowed function $\psi \cdot g(\bullet - x)$. If $g$ is localized in a neighborhood of $x$, then the same applies to the windowed function. Hence, the spectrum of the STFT is connected to the windowed interval. In particular, Gabor investigated Gaussian windows with respect to the uncertainty principle, see [Chu92, chapter 3.1].

In many engineering applications, windows are discussed in terms of data weighting and spectral leakage. Depending on the type of signal, numerous windows have been developed, see e.g. [Har78, chapter IV]. More recently, in [MRS10] a smooth bump window has been suggested for the data analysis of gravitational waves. An essential property of this window is that it is equal to 1 in a closed subinterval of its support.
We investigate the convergence speed of Fourier series windowed by bump functions. The properties of the bumps will allow an effortless transfer of Jackson's classical results on the convergence of Fourier series for smooth functions, see Theorem 3.3 and Corollary 4.5. We complement these results by a lower error bound for the Hann window, a member of the set of $\cos^\alpha$ functions.
1.1. Outline. We start by recalling basic properties of the Fourier series for functions of bounded variation in §2. Afterwards, in §3 we present the windowed transform, see Proposition 3.2, and estimate the reconstruction errors in Theorem 3.3. In §4 we introduce the $C^s$-bumps and transfer the results of §3 to this class. As a special candidate of $C^1$-bumps, we consider the Tukey window in §4.1. Finally, we present numerical experiments in §5 that underline our theoretical results and illustrate the benefits of bump windows.

Functions of bounded variation and their Fourier series
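2.1. Functions of bounded variation. We denote by $BV_{loc}$ the set of functions $f : \mathbb{R} \to \mathbb{R}$ that are locally of bounded variation, that is, of bounded variation on every finite interval. In particular, we assume that such functions are normalized for any $x$ in the interior of the interval of definition, see [BN71, §0.6], by
$$f(x) = \tfrac{1}{2}\big(f(x^+) + f(x^-)\big).$$
We recall that a function of bounded variation is bounded, has at most a countable set of jump discontinuities, and that the pointwise evaluation is well-defined.

Lemma 2.1. Suppose that $\psi \in BV_{loc}$ as well as $\lambda > 0$ and $t \in \mathbb{R}$. Then, for all $x \in (t - \lambda, t + \lambda)$,
$$\psi(x) = \sum_{k \in \mathbb{Z}} c_\psi(k)\, e^{ik\pi x/\lambda},$$
where the coefficients $c_\psi(k)$ are given by
$$c_\psi(k) = \frac{1}{2\lambda} \int_{t-\lambda}^{t+\lambda} \psi(x)\, e^{-ik\pi x/\lambda}\, dx.$$

For the proof of Lemma 2.1 and our subsequent analysis, we will use a translation, a scaling and a periodization operator. For the center $t \in \mathbb{R}$ and a scaling factor $a > 0$, we introduce
$$T_t \psi := \psi(\bullet + t), \qquad S_a \psi := \psi(a\, \bullet).$$
For the period half length $\lambda > 0$, we denote by $P_\lambda \psi$ the $2\lambda$-periodic extension of the restriction of $\psi$ to $(-\lambda, \lambda)$, normalized at the points $\pm\lambda$ as above.

Proof. Consider the $2\pi$-periodic function $f = P_\pi S_{\lambda/\pi} T_t \psi$. Then, it follows from Lemma A.1 that $f \in BV_{loc}$ and therefore
$$f(y) = \sum_{k \in \mathbb{Z}} \hat f(k)\, e^{iky}, \quad y \in \mathbb{R}.$$
The Fourier coefficients of $f$ are given by
$$\hat f(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \psi\big(\tfrac{\lambda}{\pi} y + t\big)\, e^{-iky}\, dy = e^{ik\pi t/\lambda}\, c_\psi(k).$$
Consequently, for all $x \in (t - \lambda, t + \lambda)$ we obtain
$$\psi(x) = f\big(\tfrac{\pi}{\lambda}(x - t)\big) = \sum_{k \in \mathbb{Z}} \hat f(k)\, e^{ik\pi (x - t)/\lambda} = \sum_{k \in \mathbb{Z}} c_\psi(k)\, e^{ik\pi x/\lambda}.$$

2.3. The classical result of Jackson. In general, even if $\psi$ is a smooth function, the periodic extension $f = P_\pi S_{\lambda/\pi} T_t \psi \in BV_{loc}$ has jump discontinuities at $\pm\pi$.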
Let $V(f) < \infty$ denote the total variation of $f$ over a period. Then, by [Edw82, chapter 2.3.6],
$$|\hat f(k)| \le \frac{V(f)}{2\pi |k|}, \quad k \in \mathbb{Z} \setminus \{0\}.$$
Hence, the coefficients are $O(1/|k|)$. Moreover, the rate of the coefficients transfers to an estimate for the reconstruction errors. For an arbitrary function $f \in BV_{loc}$ of period $2\pi$ let us introduce the partial Fourier sum
$$S_n f(x) := \sum_{k=-n}^{n} \hat f(k)\, e^{ikx}, \quad n \ge 0.$$
Our analysis relies on the following classical result by Jackson on the convergence of the Fourier sum, see [Jac94, chapter II.3, Theorem IV]:

Proposition 2.2. If $f : \mathbb{R} \to \mathbb{R}$ is a function of period $2\pi$ which has an $s$th derivative with limited variation, $s \ge 1$, and if $V$ is the total variation of $f^{(s)}$ over a period, then, for $n > 0$,
$$|f(x) - S_n f(x)| \le \frac{2V}{s \pi n^s}, \quad x \in \mathbb{R}.$$
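A quick numerical sanity check of a Jackson-type bound can be sketched as follows. This is an illustration under stated assumptions: the test function $f(x) = |x|$ on $(-\pi, \pi)$ is chosen here and is not taken from the paper, and the bound is assumed in the form $|f - S_n f| \le 2V/(s\pi n^s)$. For this $f$ we have $s = 1$, and $f' = \operatorname{sign}(x)$ has two jumps of height 2 per period, so $V = 4$. The helper name `partial_sum_abs` is hypothetical.

```python
import numpy as np

def partial_sum_abs(x, n):
    """S_n f for f(x) = |x| on (-pi, pi), using the known cosine series
    |x| = pi/2 - (4/pi) * sum over odd k of cos(kx) / k^2."""
    k = np.arange(1, n + 1, 2)                      # odd frequencies up to n
    return np.pi / 2 - (4 / np.pi) * np.sum(np.cos(np.outer(x, k)) / k**2, axis=1)

x = np.linspace(-np.pi, np.pi, 2001)
V, s = 4.0, 1                                       # total variation of f' over a period, smoothness index
results = {}
for n in (1, 2, 4, 8, 16, 32, 64):
    err = np.max(np.abs(np.abs(x) - partial_sum_abs(x, n)))
    bound = 2 * V / (s * np.pi * n**s)              # assumed form of the Jackson bound
    results[n] = (err, bound)
    print(f"n={n:3d}  max error={err:.4e}  bound={bound:.4e}")
```

In this example the observed maximal error stays well below the bound and shows the same $1/n$ rate.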

The windowed transform
There seems to be no general definition of a window function, but most authors tend to think of a real function $w \not\equiv 0$, vanishing outside a given interval. In relation to the STFT in (1.1), additional properties, such as a smooth cut-off or complex values, may be required, see e.g. [Grö01, §3] and [Kai11, §2]. Whenever speaking about windows in this paper, we assume the following:

Definition 3.1. Let $\lambda > 0$. We say that a function $w \in BV_{loc}$ is a window function on the interval $(-\lambda, \lambda)$, if the following properties are satisfied: In particular, we obtain the rectangular window, if $w(x) = 1$ for all $x \in (-\lambda, \lambda)$, and for simplicity we just write $w \equiv 1$ in this case.

For $\psi \in BV_{loc}$ and a window $w$ on $(-\lambda, \lambda)$ we introduce the windowed periodization Note that $\psi_w$ is $2\pi$-periodic, and by Lemma A.1 we obtain $\psi_w \in BV_{loc}$.
3.1. The windowed representation. In analogy with the classical Fourier series of the periodization presented in Lemma 2.1, the windowed series allows an alternative representation with potentially faster convergence.
Proposition 3.2. (Windowed Fourier series) Let $\psi \in BV_{loc}$ and $\lambda > 0$ and $t \in \mathbb{R}$. If $w \in BV_{loc}$ is a window on $(-\lambda, \lambda)$, then, for all $x \in (t - \lambda, t + \lambda)$,
$$\psi(x)\, w(x - t) = \sum_{k \in \mathbb{Z}} c^w_\psi(k)\, e^{ik\pi x/\lambda},$$
where the coefficients $c^w_\psi(k)$ are given by
$$c^w_\psi(k) = \frac{1}{2\lambda} \int_{t-\lambda}^{t+\lambda} \psi(x)\, w(x - t)\, e^{-ik\pi x/\lambda}\, dx.$$
The statement in the last Proposition follows as in Lemma 2.1, for the Fourier series of the $2\pi$-periodic windowed shape $\psi_w \in BV_{loc}$.

Suppose that $\psi_w \in C^s(\mathbb{R})$, $s \ge 1$, and that ψ Note that $R^w_n \psi = T_{-t} S_{\pi/\lambda}(S_n \psi_w)$. We now analyze the errors of the reconstruction sums. To this end, we transfer Jackson's classical result in Proposition 2.2 to an estimate using the Lipschitz constant of the $s$th derivative of $\psi_w$:

Theorem 3.3. (Reconstruction, windowed series) Let $\lambda > 0$ and $0 < \rho < \lambda$ and $t \in \mathbb{R}$. Suppose that $\psi_w \in C^{s+1}(\mathbb{R})$, $s \ge 1$, and let $L_s > 0$ denote the Lipschitz constant of $\psi_w^{(s)}$ over $[-\pi, \pi]$. Then, for $n \ge 1$ the error of the reconstruction $R^w_n \psi$ in the interval $X = [t - \rho, t + \rho]$ is given by where the non-negative constant $K_\infty(\psi, \rho) \ge 0$ is given by

Proof. Let $V < \infty$ denote the total variation of $\psi_w^{(s)}$ over a period. In particular, Hence, for all $x \in \mathbb{R}$ the classical Jackson result in Proposition 2.2 yields Moreover, for all $x \in X$ we have $0 \le w(x) \le 1$ and thus, by the reverse triangle inequality, we obtain Taking the supremum proves (3.4).
Note that for $w \equiv 1$ we obtain the convergence of the plain reconstruction $R_n \psi$, where $K_\infty(\psi, \rho) = 0$. Theorem 3.3 allows a calculation of the $L^2$-error:

Corollary 3.4. The $L^2$-error of the reconstruction is given by where the non-negative constant $K_2(\psi, \rho) \ge 0$ is given by . Then, it follows from (3.5) that and therefore integration yields Consequently, (3.6) follows from

In addition to the assumptions in Theorem 3.3, let us assume that $w(x) = 1$ for all $x \in [-\rho, \rho]$. Then, it follows that $K_\infty(\psi, \rho) = 0$ and therefore $K_2(\psi, \rho) = 0$. Hence, the reconstruction errors converge to 0 as $n \to \infty$. This motivates the investigation of bump windows.

Bump windows
We will now introduce $C^s$-bump windows by singling out two additional properties: On the one hand, bump windows fall off smoothly at the boundary of their support. On the other hand, to obtain a faithful windowed shape of the original function, bump windows have to equal 1 in a closed subinterval of their support. The plots in Figure 2 show the typical shape and the action of a bump.
Definition 4.1. Let $\lambda > 0$ and $0 \le \rho < \lambda$. For some $s \ge 1$ we say that the function If $\rho = 0$, we say that the bump is degenerated. Moreover, whenever $w_{\rho,\lambda} \in C^\infty_c(\mathbb{R})$, we say that $w_{\rho,\lambda}$ is a smooth bump.
Bump windows have previously been used for the data analysis of gravitational waves, see [DIS00, Equation (3.35)] and [MRS10, §2, Equation (7)]. Moreover, bump functions occur when working with partitions of unity, e.g. in the theory of manifolds, see [Lee13, Lemma 2.22] and [Tu11, §13.1], and further, with a view to numerical applications, in so-called partition of unity methods, which are used for solving partial differential equations, see [GS00, §4.1.2]. An example of a smooth bump is given by the even function $w_{\rho,\lambda}$, given by (4.1)

As we see in the right plots of Figure 2, the product of a non-degenerated bump $w_{\rho,\lambda}$ and $\psi$ produces a (smooth) windowed shape, matching $\psi$ in $[-\rho, \rho]$ and tending to 0 at the boundaries of $(-\lambda, \lambda)$. In particular, we obtain excellent reconstructions using the smooth bump given by (4.1) in our numerical experiments.

The Hann window can be viewed as a degenerated $C^1$-bump. For $0 < \alpha < 1$, the Tukey window, see Definition 4.3, is a non-degenerated $C^1$-bump. Generally, the $C^s$-bump $w_{\rho,\lambda}$ in Definition 4.1 (bottom) is $s$-times, but not $(s + 1)$-times continuously differentiable.
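To make the notion of a smooth bump concrete, the following minimal sketch implements one common construction. It is an assumption made for illustration and not necessarily the formula (4.1) of the paper: the window equals 1 on $[-\rho, \rho]$, equals $1/(e^{z(x)} + 1)$ with $z(x) = 1/(\lambda - |x|) + 1/(\rho - |x|)$ on $\rho < |x| < \lambda$, and vanishes elsewhere. The function name `smooth_bump` is hypothetical.

```python
import numpy as np

def smooth_bump(x, rho, lam):
    """Illustrative smooth bump: 1 on [-rho, rho], 0 outside (-lam, lam),
    with a smooth transition 1 / (exp(z) + 1) in between (assumed form)."""
    ax = np.abs(np.atleast_1d(np.asarray(x, dtype=float)))
    w = np.zeros_like(ax)
    w[ax <= rho] = 1.0
    mid = (ax > rho) & (ax < lam)
    z = 1.0 / (lam - ax[mid]) + 1.0 / (rho - ax[mid])
    # 1 / (exp(z) + 1) written via tanh to avoid overflow for large |z|
    w[mid] = 0.5 * (1.0 - np.tanh(0.5 * z))
    return w

x = np.linspace(-1.5, 1.5, 3001)
w = smooth_bump(x, rho=0.5, lam=1.0)
psi_windowed = x**2 * w     # windowed shape: equals x^2 on [-rho, rho], tends to 0 at +-lam
```

The transition is symmetric in the sense that the bump takes the value 1/2 at $|x| = (\rho + \lambda)/2$, where $z = 0$.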
In the sense of Definition 4.1 the Hann window is a degenerated $C^1$-bump. In particular, for $0 < \rho' < \lambda$ it follows from Theorem 3.3 and Corollary 3.4 that the reconstruction errors for a function $\psi \not\equiv 0$ with respect to the interval $[t - \rho', t + \rho']$ are bounded from below by positive constants $K_\infty(\psi, \rho'), K_2(\psi, \rho') > 0$. This fact can also be observed in our numerical experiments, see §5.1 and §5.2.
As it turns out, the Hann window arises as a special candidate of a more general class, the Tukey windows, often called cosine-tapered windows. These windows can be imagined as a cosine lobe convolved with a rectangular window:

Definition 4.3. The Tukey window with parameter $\alpha \in (0, 1]$ is given by The Tukey window is a $C^1$-bump $w_{\rho,\lambda}$ with $\rho = (1 - \alpha)\lambda$. In particular, $\mathrm{tukey}_{1,\lambda} = \mathrm{hann}_\lambda = w_{0,\lambda}$, and for $0 < \alpha < 1$ the Tukey window is not degenerated. We note that the sum of phase-shifted Hann windows creates a Tukey window:

Lemma 4.4. Let $\tau > 0$ and $m \ge 0$. Then, for $\alpha = 1/(m + 1)$ and $\lambda = (m + 1)\tau$,

Proof. For all $x \in \mathbb{R}$ we introduce the function Obviously, $H_{\tau,m}$ is an even function. Thus, for all $x \in \mathbb{R}$ we obtain

4.2. The representation for bump windows. The windowed Fourier series in Proposition 3.2 applies to bump functions and yields the following representation in the restricted interval $[t - \rho, t + \rho]$:

Corollary 4.5. (Fourier series windowed by a bump function) Suppose that $\psi \in C^{s+1}(\mathbb{R})$, $s \ge 1$, as well as $\lambda > 0$ and $0 \le \rho < \lambda$ and $t \in \mathbb{R}$.
If $w_{\rho,\lambda} \in C^{s+1}_c(\mathbb{R})$ is a bump satisfying the three conditions in Definition 4.1, then, where the coefficients $c^w_\psi(k)$ are given by In particular, if $L_s > 0$ denotes the Lipschitz constant of $\psi_w^{(s)}$ over $[-\pi, \pi]$, then,

We note that for $w = \mathrm{hann}_\lambda$ the representation in Corollary 4.5 shrinks to a pointwise representation at $x = t$. Furthermore, the bound in (4.2) depends on the choice of the bump $w_{\rho,\lambda}$. For $\rho \approx \lambda$ the windowed transform does not improve the decay for low frequencies $k$, because in this case the action of the bump is comparable to a truncation of $\psi$, so that the Lipschitz constant $L_s$ dominates. We will illustrate this fact with numerical experiments in §5.2.
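The relation between Hann and Tukey windows from Lemma 4.4 can be checked numerically. The sketch below rests on assumptions, since the displayed formulas are not reproduced here: it uses the standard forms $\mathrm{hann}_\lambda(x) = \cos^2(\pi x/(2\lambda))$ on $(-\lambda, \lambda)$ and a Tukey window that equals 1 on $[-\rho, \rho]$ with a cosine taper on $\rho < |x| < \lambda$, $\rho = (1-\alpha)\lambda$, and it assumes that the sum runs over Hann windows of half width $\tau$ centered at $k\tau$, $k = -m, \dots, m$. The function names are hypothetical.

```python
import numpy as np

def hann(x, lam):
    """Hann window cos^2(pi x / (2 lam)) on (-lam, lam), zero elsewhere (assumed form)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < lam, np.cos(np.pi * x / (2 * lam)) ** 2, 0.0)

def tukey(x, alpha, lam):
    """Tukey window: 1 on [-rho, rho], cosine taper on rho < |x| < lam (assumed form)."""
    ax = np.abs(np.asarray(x, dtype=float))
    rho = (1.0 - alpha) * lam
    taper = 0.5 * (1.0 + np.cos(np.pi * (ax - rho) / (lam - rho)))
    return np.where(ax <= rho, 1.0, np.where(ax < lam, taper, 0.0))

def hann_sum(x, tau, m):
    """Sum of 2m + 1 Hann windows of half width tau, shifted by multiples of tau."""
    return sum(hann(x - k * tau, tau) for k in range(-m, m + 1))

tau, m = 0.5, 3
lam, alpha = (m + 1) * tau, 1.0 / (m + 1)
x = np.linspace(-3.0, 3.0, 2001)
max_diff = np.max(np.abs(hann_sum(x, tau, m) - tukey(x, alpha, lam)))
print(max_diff)
```

The identity behind this check is $\cos^2\theta + \sin^2\theta = 1$: on each interval between two neighboring centers exactly two shifted Hann windows overlap and add up to 1, while only one window remains on the two outer taper regions.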

4.3. A bound for the Lipschitz constant. We now investigate the Lipschitz constant $L_s$ in Corollary 4.5. Building on the work of Ore in [Ore38], we use an estimate on the higher order derivatives of the product of two functions, which is developed in §4.4.
For a function $f : \mathbb{R} \to \mathbb{R}$ that is $(s + 1)$-times differentiable, $s \ge 1$, with an $(s + 1)$th derivative bounded on a finite interval $(a, b)$, let us introduce the non-negative constant

Theorem 4.6. Let $\lambda > 0$ and $0 \le \rho < \lambda$ and $t \in \mathbb{R}$. Suppose that $\psi \in C^{s+1}(\mathbb{R})$ and $w_{\rho,\lambda} \in C^{s+1}_c(\mathbb{R})$ for some $s \ge 1$. Assume the existence of two non-negative constants $M_\psi, M_{\psi_{s+1}} \ge 0$, such that for $x \in (t - \lambda, t + \lambda)$: Then, the Lipschitz constant $L_s$ in Corollary 4.5 is bounded by where the non-negative constants $C_{s,\psi}, C_{s,w} \ge 0$ are given by and the constant $K_s > 0$ is given by

Proof. We use the estimate for the $(s+1)$th derivative of the product of the functions $f = w_{\rho,\lambda}$ and $g = \psi(\bullet + t)$ with respect to the interval $(-\lambda, \lambda)$, see Proposition 4.8 in the next section. This results in Moreover, for the formula of the constant $K_s$ in (4.4) we refer to Lemma 4.9.
Remark 4.7. Stirling's formula yields the following approximation of $K_s$: The sign $\sim$ means that the ratio of the quantities tends to 1 as $s \to \infty$.

$\in (a, b)$, where the combinatorial constant $K(i, s) > 0$ is defined according to We now use the general Leibniz rule to lift this result to an explicit bound for the $(s + 1)$th derivative of the product of two functions.

Proposition 4.8. Let $s \ge 1$ and $f, g : \mathbb{R} \to \mathbb{R}$, both $(s + 1)$-times differentiable in a finite interval $(a, b)$. Assume the existence of four non-negative constants such that for all $x \in (a, b)$: Then, for all $x \in (a, b)$ we have where the constants $C_{s,f}, C_{s,g} \ge 0$ are defined according to (4.3) and the constant $K_s > 0$, which only depends on $s$, is given by

Proof. By the general Leibniz rule the $(s + 1)$th derivative of $fg$ is given by We therefore obtain the following estimate for all $x \in (a, b)$: Using (4.5) for $1 \le k \le s$, we conclude that and thus

4.5. The combinatorial constant. Next, we will investigate the combinatorial constant $K_s$ and derive formula (4.4) presented in Theorem 4.6.
In Appendix B we derive an upper bound for $K_s$ based on binomial coefficients.

Numerical results
To illustrate our results in Theorem 3.3 and Corollary 4.5 we present numerical experiments for three different functions. We investigate reconstructions with the smooth bump $w_{\rho,\lambda}$ given by (4.1), compared to those with the Hann window in Definition 4.2 and the Tukey window in Definition 4.3. Besides the reconstructions we also present the decay of the coefficients and the reconstruction errors. In §5.1 we start with the saw wave function to demonstrate the superiority of the windowed transform with a smooth bump for a function having a high jump discontinuity. Afterwards, the experiments in §5.2 deal with a parabola function. The symmetric periodic extension has no discontinuities, and therefore the parabola is a good candidate to illustrate the limitations of bump windows. Last, in §5.3 we work with a rapidly decreasing function. As we will see in this example, for low frequencies all coefficients (plain, tukey, bump) nearly coincide and decrease rapidly, implying excellent reconstructions.
Remark 5.1. In the following experiments, the dependence of the windows on the parameters $\lambda$, $\rho$ and $\alpha$ is always assumed implicitly and therefore we write $\mathrm{hann} = \mathrm{hann}_\lambda$, $\mathrm{tukey} = \mathrm{tukey}_{\alpha,\lambda}$, $\mathrm{bump} = w_{\rho,\lambda}$.
For the numerical computation of the (windowed) coefficients we used the fast Fourier transform (FFT), see Appendix C.
5.1. Saw wave function. In the first example we consider the function The corresponding periodic extension $P_\lambda \psi$ results in a saw wave function. We note that $c_\psi(k)$ and $c^{\mathrm{hann}}_\psi(k)$ can be evaluated analytically and are given by , $k \in \mathbb{Z} \setminus \{-1, 0, 1\}$.
Moreover, since $\psi$ is a real function, we conclude that The upper left hand side of Figure 3 shows $|c_\psi(k)|^2 = 1/k^2$, as well as $|c^w_\psi(k)|^2$ for both windows (hann and bump). We observe that the windowed coefficients have a faster asymptotic decay than the plain Fourier coefficients. The coefficients and the reconstruction errors for the bump (green) show the best asymptotic decay. As we observe in the upper right plot of Figure 3, the bump-windowed coefficients show exponential decay. In particular, we recognize a trembling for these coefficients, while the others (plain and hann) have a smooth decay. We provide an explanation of this phenomenon in Appendix D.

The reconstructions $R_{10} \psi$ and $R^w_{10} \psi$ are visualized in Figure 4. For the bump we recognize a good convergence to the original function $\psi$ in $[-\rho, \rho]$ (dotted lines), and the typical overshoots of the Fourier sum are dampened. As expected, the reconstruction with the Hann window is accurate only in a small neighborhood of the center $t = 0$, and according to Theorem 3.3 and Corollary 3.4 the reconstruction errors converge to $K_\infty(\psi, \rho), K_2(\psi, \rho) > 0$. For the saw wave these constants can be calculated analytically in terms of $\lambda$ and $\rho$, and their values are given by $K_\infty \approx 8.91$ and $K_2 \approx 2.76$. We have marked these values with red crosses. In fact we observe a perfect match.

Figure 5. Decay of the representation coefficients for the parabola with $\rho_1 = 0.25$ (left) and $\rho_2 = 0.8$ (right). Again, the coefficients for the bump show a fast asymptotic decay.

5.2. Parabola. We consider the symmetric function Note that The plots in Figure 5 show the decay of the coefficients. Especially for low frequencies, the coefficients for the Hann window show the fastest decay. Nevertheless, we observe once more that the bump coefficients and errors have the best asymptotics, see Figure 6. As with the saw wave, the constants $K_\infty(\psi, \rho), K_2(\psi, \rho) > 0$ can be calculated analytically and are given by
$$K_\infty \approx \begin{cases} 9.1 \cdot 10^{-3}, & \text{if } \rho = 0.25, \\ 0.58, & \text{if } \rho = 0.8, \end{cases} \qquad K_2 \approx \begin{cases} 4.7 \cdot 10^{-6}, & \text{if } \rho = 0.25, \\ 0.075, & \text{if } \rho = 0.8. \end{cases}$$
We have marked these values with red crosses and verify the predicted convergence of the errors. The reconstructions $R_{50}(\psi)$ and $R^w_{50}(\psi)$ are visualized in Figure 7. For the first choice $\rho_1 = 0.25$ (left) the bump-windowed series approximates the original function only in the small interval $[-\rho_1, \rho_1] = [-0.25, 0.25]$. We note that the periodic extension of the parabola has no discontinuities and therefore the plain reconstruction gives a good approximation, even with few coefficients.
For a bad choice of the parameter $\rho$, the reconstruction with the bump gets worse. According to Theorem 4.6, the Lipschitz constant $L_s$ becomes large as $\rho \to \lambda$, implying a slow decay for low frequencies, which can particularly be observed for the choice $\rho_2 = 0.8$. This value leads to a large derivative of the smooth bump $w_{0.8,1}$ in the interval $(0.8, 1)$. For low frequencies, the coefficients and the errors for the bump show a slow decay (right plots in Figures 5 and 6) and are even worse than for the plain Fourier series.
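The growth of the derivative of the bump as $\rho \to \lambda$ can be illustrated numerically. The following sketch is an assumption-laden illustration: it uses a bump with transition $1/(e^{z} + 1)$, $z = 1/(\lambda - |x|) + 1/(\rho - |x|)$, which need not coincide with (4.1), and it estimates the first derivative by finite differences on a fine grid. The names `smooth_bump` and `max_slope` are hypothetical.

```python
import numpy as np

def smooth_bump(x, rho, lam):
    """Illustrative smooth bump (assumed form, not necessarily formula (4.1))."""
    ax = np.abs(np.atleast_1d(np.asarray(x, dtype=float)))
    w = np.zeros_like(ax)
    w[ax <= rho] = 1.0
    mid = (ax > rho) & (ax < lam)
    z = 1.0 / (lam - ax[mid]) + 1.0 / (rho - ax[mid])
    w[mid] = 0.5 * (1.0 - np.tanh(0.5 * z))        # equals 1 / (exp(z) + 1)
    return w

def max_slope(rho, lam, num=20001):
    """Finite-difference estimate of max |w'| on [-lam, lam]."""
    x = np.linspace(-lam, lam, num)
    return np.max(np.abs(np.gradient(smooth_bump(x, rho, lam), x)))

slope_small_rho = max_slope(0.25, 1.0)   # wide transition region (0.25, 1)
slope_large_rho = max_slope(0.80, 1.0)   # narrow transition region (0.8, 1)
print(slope_small_rho, slope_large_rho)
```

For this particular bump the slope at the midpoint of the transition region is $2/(\lambda - \rho)^2$ in modulus, so shrinking the transition width from 0.75 to 0.2 increases the slope by more than an order of magnitude. Higher derivatives, which enter the Lipschitz constant $L_s$, grow even faster.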

5.3. A function of rapid decrease. We also applied the transforms to
$$\psi(x) = \big(8x^3 - 24x^2 + 12x + 4\big)\, e^{-(x-1)^2/2}, \qquad \lambda = 2\pi, \quad \rho = 5.9, \quad t = 1.$$
We note that $\psi(x + 1)$ is the product of the Hermite polynomial $H_3(x) = 8x^3 - 12x$ and a Gaussian, i.e. a rescaled Hermite function. For the center we chose $t = 1$. In contrast to the previous examples, we now work with the Tukey window for $\alpha = 1 - \rho/\lambda$, see Definition 4.3. We recall that this window is a non-degenerated $C^1$-bump. The $2\lambda$-periodic extension of $\psi$ produces discontinuities with very small jumps, which can only be resolved with high frequencies. Consequently, for low frequencies all coefficients are almost the same and fall off rapidly, see Figure 8. Nevertheless, the plain coefficients are $O(1/|k|)$, while the coefficients for the smooth bump again show the best asymptotic decay. For the reconstructions we used $R_{10} \psi$ and $R^w_{10} \psi$. As we observe in the right plot of Figure 8, the rapid decrease of the coefficients yields excellent reconstructions, and no difference from the original function is visible.
Proof. For a function $f : [a, b] \to \mathbb{R}$ and a partition $P$ of some finite interval $[a, b]$ we denote by $V(f, P)$ the variation of $f$ with respect to $P$, and by $V(f)$ the total variation of $f$ on $[a, b]$. Now, for $\psi \in BV_{loc}$ and $\lambda > 0$ consider $f := P_\lambda \psi$. It remains to show that $V(f|_{[-\lambda,\lambda]})$ is a finite number. Therefore, let Thus, taking the supremum among such partitions, we conclude that

Appendix B. Upper bound for K s
Recall the representation of the combinatorial constant $K_s$ in (4.8). We want to find an estimate for the following sum, cf. equation (4.9): For the summand we calculate In particular, for $m = s - 1$ and $a = b = s + 1$, $s \ge 1$, we obtain by (B.1) and (B.2) we conclude that This proves that
$$K_s \le 2^{2s+1} \cdot s^2 \cdot \frac{(3s)!}{(2s + 1)!} \cdot \frac{2s}{s + 2}, \quad s \ge 2.$$
Consequently, the true value of K s is overestimated by the factor 2s/(s + 2).
Appendix C. Computing Fourier Integrals using the FFT
In §5 we presented numerical results for reconstructions based on windowed Fourier coefficients and windowed series, respectively. For the computation of Fourier-type integrals, such as The trapezoidal rule with grid $\{x_j\}_{j \in \{0,1,\dots,m\}}$ yields the following approximation: where the constants $r_1, r_2 \in \mathbb{R}$ are given by
$$r_1 = \frac{\psi(t + \lambda)\, w(\lambda)}{2} \quad \text{and} \quad r_2 = \frac{\psi(t - \lambda)\, w(-\lambda)}{2}.$$
In particular, the vector $\hat v$ can be calculated with the FFT. For sufficiently large values of $m$ and $N$ we get
$$\frac{1}{2\lambda} \int_{t-\lambda}^{t+\lambda} \psi(x)\, w(x - t)\, e^{-i\xi \frac{\pi}{\lambda} x}\, dx \approx e^{-i\xi \frac{\pi}{\lambda} t}\, \frac{e^{i\xi\pi}}{m} \left( r_1 e^{-2\pi i \xi} - r_2 + \hat v_{n+1} \right).$$
We note that the trapezoidal rule gives accurate results for smooth integrands and the rates of convergence can be found e.g. in [DR84, chapter 2.9].
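The approach of this appendix can be sketched in a few lines. The following is a minimal illustration under stated assumptions rather than the authors' implementation: it uses the uniform grid $x_j = t - \lambda + 2\lambda j/m$, evaluates only integer frequencies $\xi = k$ (so that $e^{-2\pi i \xi} = 1$), and obtains the sums $\sum_j v_j e^{-2\pi i k j/m}$ with `numpy.fft.fft`. The function name `windowed_coefficients` and its parameters are hypothetical.

```python
import numpy as np

def windowed_coefficients(psi, w, t, lam, n, m=4096):
    """Trapezoidal-rule approximation of
    c(k) = (1/(2 lam)) * int_{t-lam}^{t+lam} psi(x) w(x - t) e^{-i k pi x / lam} dx
    for k = -n, ..., n, using a single FFT of length m (requires n < m / 2)."""
    j = np.arange(m)
    x = t - lam + 2.0 * lam * j / m          # grid points x_0, ..., x_{m-1}
    v = psi(x) * w(x - t)                    # samples of the windowed function
    v_hat = np.fft.fft(v)                    # v_hat[k] = sum_j v_j e^{-2 pi i k j / m}
    r1 = 0.5 * psi(t + lam) * w(lam)         # endpoint terms of the trapezoidal rule
    r2 = 0.5 * psi(t - lam) * w(-lam)
    ks = np.arange(-n, n + 1)
    phase = np.exp(-1j * ks * np.pi * t / lam) * np.exp(1j * ks * np.pi)
    return ks, phase * (r1 - r2 + v_hat[ks % m]) / m

# Usage: plain coefficients (w = 1) of the saw wave psi(x) = x on (-pi, pi)
ks, c = windowed_coefficients(lambda x: x, lambda x: np.ones_like(x),
                              t=0.0, lam=np.pi, n=10)
```

In this usage example the computed values can be compared with the exact coefficients $i(-1)^k/k$ for $k \ne 0$. Non-integer frequencies $\xi$ would need a direct evaluation of the sum or a different grid, which this sketch does not cover.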

Appendix D. Oscillations of the coefficients
We focus once more on the windowed coefficients $c^w_\psi$. In the plot at the upper left hand side of Figure 3 the green line does not fall off smoothly, but in a trembling way. To explain this phenomenon, we extend the domain of the Fourier coefficients. For a $2\pi$-periodic function $f \in BV_{loc}$ and $\xi \in \mathbb{R}$ consider the number
$$\hat f(\xi) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-i\xi x}\, dx.$$
This means that we calculate the Fourier coefficients not only for integer values, but for all real numbers $\xi$. For example, the extended Fourier coefficients of the function $f \equiv 1$ are given by
$$\hat 1(\xi) := \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-i\xi x}\, dx = \frac{\sin(\pi\xi)}{\pi\xi} = \mathrm{sinc}(\xi).$$
In particular, if $k$ is an integer, we obtain the simple Fourier coefficients and
$$|\hat 1(k)| = \begin{cases} 1, & \text{if } k = 0, \\ 0, & \text{else.} \end{cases}$$
Therefore, if $k \ne 0$ is an integer, we conclude that
$$|\hat x(k)| = \frac{1}{|k|}.$$

Figure caption (cf. Figure 3): We recall that the bump-windowed coefficients (green) have a fast asymptotic decay. The left plot only shows low frequencies $\xi$, and in the right plot we observe that the bump coefficients fall below the Hann coefficients.
Thus, the coefficients of the saw wave function have a smooth decay, as we see at the right hand side of Figure 9 (orange line). We computed the extended (windowed) coefficients for ξ ∈ [1, 10] and ξ ∈ [390, 400] for the saw wave. The result can be found in Figure 10. By extending the domain of the Fourier coefficients, we observe that the trembling also occurs for the other coefficients (plain and hann).
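The extended coefficients are easy to reproduce numerically. The sketch below is an illustration only, with a hypothetical helper name `extended_coefficient`: it approximates $\hat f(\xi)$ by the trapezoidal rule and compares the result for $f \equiv 1$ with $\mathrm{sinc}(\xi) = \sin(\pi\xi)/(\pi\xi)$, which is the convention used by `numpy.sinc`.

```python
import numpy as np

def extended_coefficient(f, xi, num=200001):
    """Trapezoidal approximation of (1/2pi) * int_{-pi}^{pi} f(x) e^{-i xi x} dx
    for an arbitrary real frequency xi."""
    x = np.linspace(-np.pi, np.pi, num)
    g = f(x) * np.exp(-1j * xi * x)
    dx = x[1] - x[0]
    return (np.sum(g) - 0.5 * (g[0] + g[-1])) * dx / (2 * np.pi)

one = lambda x: np.ones_like(x)    # f = 1
saw = lambda x: x                  # f(x) = x on (-pi, pi)

for xi in (0.3, 1.0, 2.5, 7.25):
    print(xi, abs(extended_coefficient(one, xi)), abs(np.sinc(xi)))

# Between two integers the modulus of the extended coefficients oscillates,
# while sampling at integers only picks one point of each oscillation.
saw_integer = [abs(extended_coefficient(saw, k)) for k in range(1, 6)]
```

At integer frequencies the samples of $|\hat 1|$ hit exactly the zeros of the sinc function, and the samples of $|\hat x|$ lie on the smooth curve $1/|k|$, which is why no trembling is visible there.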