1 Introduction

We consider signal processing tasks based on continuous frames [1, 46]. In the general setting, an input signal s, from the Hilbert space of signals \({\mathcal {H}}\), is first analyzed into a coefficient space representation \(V_f[s]\) via the frame analysis operator \(V_f\). Then, \(V_f[s]\) is manipulated in the coefficient space by first applying a pointwise nonlinearity \(r\circ V_f[s]\) and then a linear operator T, to produce \(T(r\circ V_f[s])\), which is finally synthesized back to the signal space via \(V_f^*\). The end-to-end pipeline reads

$$\begin{aligned} {\mathcal {H}}\ni s \mapsto V_f^*T(r\circ V_f[s]). \end{aligned}$$
(1)

We call pipelines of the form (1) phase space signal processing. The linear operator T models a global change of the signal in the feature space, while the nonlinearity r allows modifying the feature coefficients term-by-term with respect to their values.

Some examples of continuous frames are the 1D continuous wavelet transform (CWT) [13, 26], the short time Fourier transform (STFT) [25], the Shearlet transform [28] and the Curvelet transform [7]. Signal processing tasks of the form (1) are used in a multitude of applications. In multipliers [2, 3, 37, 40, 47], T is a multiplicative operator and the nonlinearity is trivial \(r(x)=x\). Multipliers have applications, for example, in audio analysis [4] and in increasing the signal-to-noise ratio [36]. In signal denoising, e.g., wavelet shrinkage denoising [14, 15], and Shearlet denoising [29], the linear operator is trivial \(T=I\) and r is a nonlinearity that attenuates low values. The same is true in shearlet-based image enhancement [21, Section 4]. In the phase vocoder, T is a dilation operator and r is a so-called phase correction nonlinearity [12, 16, 32, 34, 41, 44, 45, 52]. In analysis-based iterative thresholding algorithms for inverse problems, the sparsification step in each iteration can be written as (1) with \(T=I\) and r a thresholding nonlinearity [22, 30].
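
To make the form (1) concrete, the following minimal NumPy sketch instantiates it as an STFT-shrinkage denoiser: a Gaussian-window time-frequency dictionary plays the role of the frame, the nonlinearity r is a soft threshold, T is the identity, and the inverse frame operator is applied as in the analysis formulation (6) below. The window width, grid, noise level, and threshold are illustrative choices and not taken from the paper.

```python
import numpy as np

# Sketch of pipeline (1) in its analysis form (6): STFT-shrinkage denoising
# with a Gaussian-window time-frequency dictionary, T = I and a soft threshold
# as the pointwise nonlinearity r.  Window, grid, noise level, and threshold
# are illustrative choices.
rng = np.random.default_rng(0)

M = 256
t = np.arange(M) / M                          # time grid on [0, 1)
s = np.sin(2 * np.pi * 12 * t) * np.exp(-30.0 * (t - 0.5) ** 2)
s_noisy = s + 0.1 * rng.standard_normal(M)

xs = np.linspace(0.0, 1.0, 32)                # time shifts of the window
ws = np.arange(-M // 2, M // 2)               # modulation frequencies
sigma = 0.05
atoms = (np.exp(-0.5 * ((t - xs[:, None, None]) / sigma) ** 2)
         * np.exp(2j * np.pi * ws[None, :, None] * t)).reshape(-1, M)

A = atoms.conj()                              # analysis: (A s)[g] = <s, f_g>
S = A.conj().T @ A                            # frame operator S_f = V_f^* V_f

def soft(v, tau):                             # the nonlinearity r (soft threshold)
    mag = np.maximum(np.abs(v), 1e-12)
    return np.maximum(1.0 - tau / mag, 0.0) * v

coeffs = soft(A @ s_noisy, tau=1.5)           # r applied termwise; T = I here
s_hat = np.linalg.solve(S, A.conj().T @ coeffs).real   # S_f^{-1} V_f^* (...)

print("rel. error, noisy   :", np.linalg.norm(s_noisy - s) / np.linalg.norm(s))
print("rel. error, denoised:", np.linalg.norm(s_hat - s) / np.linalg.norm(s))
```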

We note that the theory presented in this paper also applies when the analysis and synthesis in (1) are performed by two different frames sharing the same feature space. An example application is shearlet- or curvelet-based Radon transform inversion [9, 11, 20], where T is a multiplicative operator, r is a thresholding nonlinearity, analysis is done by the curvelet/shearlet frame, and synthesis is done by some modified curvelet/shearlet frame. For simplicity of the presentation, we stick to the same frame for analysis and synthesis. More accurately, in (5) and (6) we extend (1) to a pipeline based on a frame and its canonical dual.

In this paper we study quadrature discretizations of (1) based on random samples.

1.1 Quadrature Versus Discrete Frame Discretizations of Continuous Frames

An analysis operator of a continuous frame \(V_f\) has the form

$$\begin{aligned} {\mathcal {H}}\ni s \mapsto V_f[s] = \left\langle s,f_{(\cdot )}\right\rangle \in L^2(G). \end{aligned}$$
(2)

where G is a measure space called phase space, that usually has some physical interpretation (e.g., in the STFT G is the time-frequency plane), \(\{f_g\}_{g\in G}\) is the continuous frame, and \(L^2(G)\) is called the coefficient space. Accordingly, the synthesis operator \(V_f^*\) has the form [46, Theorem 2.6]

$$\begin{aligned} V_f^*(F) = \int _G F(g)f_gdg. \end{aligned}$$
(3)

As evident from the above description, phase space signal processing involves integrals, and thus some form of discretization is required. The common approach is to sample points from G to construct a discrete frame version \(\{f_{g^k}\}_{k=1}^{\infty }\) of the continuous frame (e.g., as in [13, 25]). For the discrete frame, the synthesis operator reads, for \((F_k)_k\in l^2\),

$$\begin{aligned} V_{\{f_{g^k}\}_k}^*\{F_k\}_k = \sum _{k=1}^{\infty } f_{g^k} F_k. \end{aligned}$$
(4)

Note that (4) looks like a quadrature approximation of (3). However, in the standard continuous-to-discrete frame approach, the points \(g^k\) are not chosen to approximate (3). Rather, the \(g^k\) are chosen so that the discrete system \(\{f_{g^k}\}_{k=1}^{\infty }\) satisfies the discrete frame inequality. Hence, the discrete frame is related to the continuous one by the fact that \(f_{g^k}\) are sampled from \(f_{(\cdot )}\), not by some approximation requirement. In this paper we take a different route to discretization, requiring that (4) approximates (3). Let us call the latter discretization approach the quadrature approach.

There is an advantage in the quadrature approach over the discrete frame approach when working with highly redundant continuous frames. To illustrate the idea, consider the STFT, where \(G={\mathbb {R}}^2\) is the time-frequency plane. Suppose that we extend G by adding to the time and frequency axes \(t,\omega \) a third axis c that controls the time width of the window. To discretize the resulting continuous frame \(f_{t,\omega ,c}\) to a discrete frame, one may simply choose to fix the window width axis to one single value \(c=c^0\), and sample a time-frequency 2D grid \((t^n,\omega ^m,c^0)_{n,m}\). Indeed, such an approach would result in a standard discrete STFT, which is a discrete frame. However, the information along the third axis is lost in this discretization. In fact, nothing in the continuous-to-discrete frame approach forces the discretization to faithfully represent the whole continuous phase space. In the quadrature approach, our goal is to discretize the continuous frames more faithfully, sampling uniformly all the feature directions in phase space.

1.2 Randomized Quadrature Approximations of Continuous Frames

Our motivation is hence to discretize highly redundant continuous frames in the quadrature approach. In such situations, the dimensionality of phase space is higher than that of the signal space, and hence using a randomized discretization makes sense. Our approach is motivated by randomized methods in finite-dimensional numerical linear algebra, which are a prominent approach for dealing with high dimensional data [10, 17, 18, 35, 50, 51]. The goal in this paper is to develop a similar randomized theory in an infinite dimensional setting in general separable Hilbert spaces, namely, in the phase space signal processing setting.

Randomized algorithms in the context of continuous frames have been presented in the past. Relevant sampling is a line of work in which integral transforms are randomly discretized [5, 24, 42, 49]. While the goal in our approach is to approximate the continuous frame with a quadrature sum, the goal in relevant sampling is to sample discrete frames from continuous frames.

We summarize our construction as follows.

Signal processing in phase space Let \(V_f:{\mathcal {H}}\rightarrow L^2(G)\) be the analysis operator of a continuous frame, and \(S_f=V_f^*V_f\) be the frame operator. Since \(S_f^{-1}V_f^*V_f=V_f^*V_fS_f^{-1}\) is the identity I, we consider the following two formulations of signal processing in phase space. Synthesis phase space signal processing is defined by the pipeline

$$\begin{aligned} s\mapsto V_f^* T r\circ \big (V_f[S_f^{-1}s]\big ), \end{aligned}$$
(5)

and analysis phase space signal processing is defined by

$$\begin{aligned} s\mapsto S_f^{-1}V_f^* T r\circ \big (V_f[s]\big ). \end{aligned}$$
(6)

Here, T is a bounded operator in \(L^2(G)\) and \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\) is a nonlinearity. The pipelines (5) and (6) can be seen as working with the canonical dual frame \(S_f^{-1}f\) [46] either in the analysis or in the synthesis step. We suppose that \(S_f\) has an efficient discretization in the signal space \({\mathcal {H}}\). Hence, we would like to find an efficient discretization of the rest of the pipeline, namely, of \(V_f^* T r\circ \big (V_f[s]\big )\).

Monte Carlo signal processing in phase space We study a Monte Carlo approximation of signal processing in phase space based on the pipelines (5) and (6). We first consider a Monte Carlo approximation of synthesis. For \(F\in L^2(G)\), under certain assumptions given in Sect. 3.5, we consider the approximation of the synthesis operator by

$$\begin{aligned} V_f^*F = \int _G F(g)f_gdg \approx \frac{C}{K}\sum _{k=1}^K F(g^k)f_{g^k}. \end{aligned}$$
(7)

Here, \(\{g^k\}_{k=1}^K\subset G\) is a finite set of independent random sample points, and C is a normalization constant.
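
The following sketch illustrates the approximation (7) for a Gaussian-window STFT, with the random samples drawn uniformly from a time-frequency box, so that the normalization constant C is the area of the box (this corresponds to an indicator envelope, introduced in Sect. 3.2). A dense Riemann sum over the same box stands in for the exact integral; the window, box, and sample sizes are illustrative choices.

```python
import numpy as np

# Monte Carlo approximation (7) of the synthesis operator for a Gaussian-window
# STFT: g^k are drawn uniformly from a time-frequency box and C is the box area.
# A dense Riemann sum over the same box serves as a stand-in for the exact
# integral, so the printed error should decay roughly like 1/sqrt(K).
rng = np.random.default_rng(1)

t = np.linspace(-4.0, 4.0, 256)
dt = t[1] - t[0]
s = np.exp(-t ** 2) * np.cos(2 * np.pi * 3 * t)        # test signal

def atoms(x, w):
    """Rows are STFT atoms f_{x,w}(t) = g(t - x) exp(2 pi i w t)."""
    return (np.exp(-0.5 * ((t - x[:, None]) / 0.5) ** 2)
            * np.exp(2j * np.pi * w[:, None] * t))

x_lo, x_hi, w_lo, w_hi = -4.0, 4.0, -6.0, 6.0
C = (x_hi - x_lo) * (w_hi - w_lo)                      # normalization constant

# Reference: dense-grid Riemann sum of  int_box V_f[s](g) f_g dg.
X, W = np.meshgrid(np.linspace(x_lo, x_hi, 81), np.linspace(w_lo, w_hi, 121),
                   indexing="ij")
A = atoms(X.ravel(), W.ravel())
F = (A.conj() @ s) * dt                                # V_f[s] on the grid
dA = (X[1, 0] - X[0, 0]) * (W[0, 1] - W[0, 0])
ref = (F @ A) * dA

for K in (500, 2000, 8000):
    Ak = atoms(rng.uniform(x_lo, x_hi, K), rng.uniform(w_lo, w_hi, K))
    Fk = (Ak.conj() @ s) * dt                          # V_f[s] at the samples
    mc = (C / K) * (Fk @ Ak)                           # the Monte Carlo sum (7)
    print(K, np.linalg.norm(mc - ref) / np.linalg.norm(ref))
```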

Using this approximation, in Sect. 4 we study the approximation rate of stochastic signal processing in phase space (5) and (6). The first version of the approximation reads, for the synthesis and analysis formulations respectively,

$$\begin{aligned}&s\mapsto \frac{C}{K} \sum _{k=1}^K \big (T(r\circ V_f[S_f^{-1}s])\big )(g^k)f_{g^k}, \end{aligned}$$
(8)
$$\begin{aligned}&s\mapsto \frac{C}{K} S_f^{-1}\sum _{k=1}^K \big (T(r\circ V_f[s])\big )(g^k)f_{g^k}. \end{aligned}$$
(9)

Under some general assumptions, we also approximate the signal processing pipelines (5) and (6) when T is an integral operator defined by

$$\begin{aligned} TF(g) = \int _G R(g,g')F(g')dg', \end{aligned}$$

where \(R:G^2\rightarrow {\mathbb {C}}\) (Definition 6). The synthesis and analysis approximations in this case read respectively

$$\begin{aligned}&s\mapsto \frac{C}{KL} \sum _{k=1}^K\sum _{m=1}^L R(g^k,y^m)r\big (V_f[S_f^{-1}s](y^m)\big )f_{g^k}, \end{aligned}$$
(10)
$$\begin{aligned}&s\mapsto \frac{C}{KL} S_f^{-1}\sum _{k=1}^K\sum _{m=1}^L R(g^k,y^m)r\big (V_f[s](y^m)\big )f_{g^k}. \end{aligned}$$
(11)

Here, \(\{g^k\}_{k=1}^K,\{y^m\}_{m=1}^L\subset G\) are two finite sets of independent random sample points.

Methods (10) and (11) are useful for integral operators. Methods (8) and (9) are useful when the samples \(\big (Tr\circ V_f[s]\big )(g^k)\) can be computed using some other samples \(V_f[s](y^k)\) of \(V_f[s]\), which is the case for multiplicative and diffeomorphism operators (see Sect. 6.2).
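
As a rough illustration of (10), the following sketch uses a Gaussian-window STFT, takes T to be the Gramian operator \(V_fV_f^*\) (an integral operator whose kernel is the frame kernel \(\left\langle f_g,f_{g'}\right\rangle \)), and a pointwise contraction r. Both envelopes are the indicator of a time-frequency box, so each of the two integrals carries a factor equal to the box area. The reference is a dense-grid evaluation of the enveloped pipeline; all parameters are illustrative, not prescribed by the paper.

```python
import numpy as np

# Double Monte Carlo approximation (10): output samples g^k for the synthesis
# integral and input samples y^m for the integral operator T.  Here T = Q_f,
# the Gramian of a Gaussian-window STFT (kernel = frame kernel <f_g, f_g'>),
# r(z) = z / (1 + |z|), and both envelopes are the indicator of a box, so each
# integral carries a factor C equal to the box area.  All choices illustrative.
rng = np.random.default_rng(6)

t = np.linspace(-4.0, 4.0, 256)
dt = t[1] - t[0]
s = np.exp(-t ** 2) * np.cos(2 * np.pi * 3 * t)

def atoms(x, w):
    return (np.exp(-0.5 * ((t - x[:, None]) / 0.5) ** 2)
            * np.exp(2j * np.pi * w[:, None] * t))

r = lambda z: z / (1.0 + np.abs(z))                 # pointwise nonlinearity

x_lo, x_hi, w_lo, w_hi = -4.0, 4.0, -6.0, 6.0
C = (x_hi - x_lo) * (w_hi - w_lo)

# Dense-grid reference of V_f^*( psi * Q_f( eta * r(V_f[s]) ) ).
X, W = np.meshgrid(np.linspace(x_lo, x_hi, 81), np.linspace(w_lo, w_hi, 121),
                   indexing="ij")
A = atoms(X.ravel(), W.ravel())
dA = (X[1, 0] - X[0, 0]) * (W[0, 1] - W[0, 0])
rF = r((A.conj() @ s) * dt)                         # r(V_f[s]) on the grid
u = (rF @ A) * dA                                   # V_f^*(eta * rF)
QrF = (A.conj() @ u) * dt                           # Q_f(eta * rF) = V_f[u]
ref = (QrF @ A) * dA                                # V_f^*(psi * Q_f(eta * rF))

for K in (250, 1000, 2000):                         # use L = K input samples
    Ak = atoms(rng.uniform(x_lo, x_hi, K), rng.uniform(w_lo, w_hi, K))
    Am = atoms(rng.uniform(x_lo, x_hi, K), rng.uniform(w_lo, w_hi, K))
    rFy = r((Am.conj() @ s) * dt)                   # r(V_f[s](y^m))
    Rkm = (Ak.conj() @ Am.T) * dt                   # R(g^k, y^m) = <f_{y^m}, f_{g^k}>
    est = (C * C / (K * K)) * ((Rkm @ rFy) @ Ak)    # the double sum (10)
    print(K, np.linalg.norm(est - ref) / np.linalg.norm(ref))
```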

Summary of our main results

  • We prove the convergence of the Monte Carlo methods (8)–(11) as the number of samples increases, and also derive non-asymptotic error bounds. When considering discrete signals of resolution/dimension M, embedded in the continuous signal space, the error of the stochastic method is of order \(O(\sqrt{M/K})\), where K is the number of Monte Carlo samples.

  • The computational complexity of our method does not depend on the dimension of phase space for a rich class of signal processing pipelines. This allows approximating highly redundant continuous frames efficiently using sample points which are well spread in all directions in phase space.

  • As a toy application of the theory, we show how to increase the expressive capacity of the 2D time-frequency phase space by adding a third axis, and use the construction in a phase vocoder scheme.

The proofs in this paper are inspired by the constructions in standard Monte Carlo theory (see, e.g., [6, Section 2]), adapted to infinite dimensional Hilbert spaces.

2 Background: Harmonic Analysis in Phase Space

In this section we review the theory of continuous frames, and give two examples: the STFT and the CWT. By convention, all Hilbert spaces in this paper are assumed to be separable. The Fourier transform \({\mathcal {F}}\) is defined with the following normalization

$$\begin{aligned}{}[{\mathcal {F}}s](\omega ) ={\hat{s}}(\omega )= \int _{{\mathbb {R}}}s(t)e^{-2\pi i \omega t}dt, \quad [{\mathcal {F}}^{-1}{\hat{s}}](t) = \int _{{\mathbb {R}}}{\hat{s}}(\omega )e^{2\pi i \omega t}d\omega . \end{aligned}$$
(12)

We denote the norm of a vector v in a Banach space \({\mathcal {B}}\) by \(\left\| v\right\| _{{\mathcal {B}}}\). For a measure space \(\{G,\mu \}\), we denote interchangeably by

$$\begin{aligned} \left\| f\right\| _p= \left\| f\right\| _{L^p(G)} = \Big ( \int _G \left| f(g)\right| ^p d\mu (g) \Big )^{1/p} \end{aligned}$$

the p norm of the signal \(f\in L^p(G)\), where \(1\le p < \infty \), and denote interchangeably

$$\begin{aligned} \left\| f\right\| _{\infty } = \left\| f\right\| _{L^{\infty }(G)}= \mathrm{ess}\ \mathrm{sup}_{g\in G}\left| f(g)\right| \end{aligned}$$

for \(f\in L^{\infty }(G)\). We note that an equality between two \(L^p\) functions is by definition an almost-everywhere equality. We denote the induced operator norm of operators in Banach spaces using the subscript of the Banach space, e.g., for a bounded linear operator \(T:L^p(G)\rightarrow L^p(G)\), we denote \(\left\| T\right\| _{p}\). When we want to emphasize that the norm is an operator norm, we also denote \(\left\| T\right\| _{p\rightarrow p}\).

2.1 Continuous Frames

The following definitions and claims are from [46] and [23, Chapter 2.2], with notation adopted from the latter.

Definition 1

Let \({\mathcal {H}}\) be a Hilbert space, and \((G,{\mathcal {B}},\mu )\) a locally compact topological space with Borel sets \({\mathcal {B}}\), and \(\sigma \)-finite Borel measure \(\mu \). Let \(f:G\rightarrow {\mathcal {H}}\) be a weakly measurable mapping, namely for every \(s\in {\mathcal {H}}\)

$$\begin{aligned} g\mapsto \left\langle s,f_g\right\rangle \end{aligned}$$

is a measurable function \(G\rightarrow {\mathbb {C}}\). For any \(s\in {\mathcal {H}}\), we define the coefficient function

$$\begin{aligned} V_f[s]:G\rightarrow {\mathbb {C}}\quad , \quad V_f[s](g)=\left\langle s,f_g\right\rangle _{{\mathcal {H}}}. \end{aligned}$$
(13)
  1.

    We call f a continuous frame, if \(V_f[s]\in L^2(G)\) for every \(s\in {\mathcal {H}}\), and there exist constants \(0<A\le B<\infty \) such that

    $$\begin{aligned} A\left\| s\right\| _{{\mathcal {H}}}^2 \le \left\| V_f[s]\right\| _2^2 \le B\left\| s\right\| _{{\mathcal {H}}}^2 \end{aligned}$$
    (14)

    for every \(s\in {\mathcal {H}}\).

  2.

    If it is possible to choose \(A=B\), f is called a tight frame.

  3.

    We call \({\mathcal {H}}\) the signal space, G phase space, \(V_f\) the analysis operator, and \(V_f^*\) the synthesis operator.

  4.

    We call the frame f bounded, if there exists a constant \(0<C\in {\mathbb {R}}\) such that

    $$\begin{aligned} \forall g\in G\ , \left\| f_g\right\| _{{\mathcal {H}}}\le C. \end{aligned}$$
  5.

    We call \(S_f=V_f^*V_f\) the frame operator, and \(Q_f=V_f V_f^*\) the Gramian operator.

  6.

    We call f a Parseval continuous frame, if \(V_f\) is an isometry between \({\mathcal {H}}\) and \(L^2(G)\).

Remark 2

A frame is Parseval if and only if the frame bounds can be chosen as \(A=B=1\).

For the closed form formula of the synthesis operator \(V_f^*\), we recall the notion of weak vector integrals, also called Pettis integrals, introduced in [43].

Definition 3

Let \({\mathcal {H}}\) be a separable Hilbert space, and G a measure space. Let \(v:G\rightarrow {\mathcal {H}}\) be a mapping such that the functional \(s\mapsto \int _G \left\langle s,v(g)\right\rangle dg\) is continuous in \(s\in {\mathcal {H}}\). Then the weak vector integral (or weak \({\mathcal {H}}\) integral) is defined to be the vector \(\int ^{\mathrm{w}}_G v(g)dg \in {\mathcal {H}}\) such that

$$\begin{aligned} \forall s\in {\mathcal {H}}\ \quad \int _G \left\langle s,v(g)\right\rangle dg=\left\langle s,\int ^{\mathrm{w}}_G v(g)dg\right\rangle . \end{aligned}$$

The existence of such a vector is guaranteed by the Riesz representation theorem. In this case, v is called a weakly integrable function.

Given a continuous frame, the synthesis operator can be written as [46, Theorem 2.6]

$$\begin{aligned} V_f^*[F] = \int ^{\mathrm{w}}_G F(g)f_g dg. \end{aligned}$$
(15)

Definition 4

The frame kernel \(K_f:G^2\rightarrow {\mathbb {C}}\) is defined by

$$\begin{aligned} K_f(g,g')=\left\langle f_g,f_{g'}\right\rangle = V_f[f_g](g'). \end{aligned}$$
(16)

The following result is taken from [23, Proposition 2.12].

Proposition 5

The Gramian operator \(Q_f\) is an integral operator with kernel \(K_f\). Namely, for every \(F\in L^2(G)\)

$$\begin{aligned}{}[Q_fF](g') = \int _G F(g) K_f(g,g')dg. \end{aligned}$$
(17)

For a Parseval frame, the image space \(V_f[{\mathcal {H}}]\) is a reproducing kernel Hilbert space, with kernel \(K_f(g,\cdot )\), and the orthogonal projection onto \(V_f[{\mathcal {H}}]\) is given by the Gramian operator \(Q_f=V_fV_f^*\).
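
When G is a finite set with the counting measure, a continuous frame reduces to an ordinary finite frame, and the objects above can be checked numerically. The following sketch verifies Proposition 5 (the Gramian acts by integration against the frame kernel) and reads off the frame bounds as the extreme eigenvalues of \(S_f\), for a small random frame; the dimensions and the random frame are arbitrary illustrative choices.

```python
import numpy as np

# Toy check of Definition 4 and Proposition 5: for G = {1,...,N} with the
# counting measure, a continuous frame is just a finite frame in H = C^M.
# The Gramian Q_f = V_f V_f^* then acts by summation against the frame kernel
# K_f(g, g') = <f_g, f_{g'}>, and the frame bounds are the extreme eigenvalues
# of S_f = V_f^* V_f.  Dimensions and the random frame are illustrative.
rng = np.random.default_rng(7)

M, N = 8, 50
f = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))   # rows f_g

analysis = lambda s: f.conj() @ s                 # V_f[s](g) = <s, f_g>
synthesis = lambda F: f.T @ F                     # V_f^* F = sum_g F(g) f_g

F = rng.standard_normal(N) + 1j * rng.standard_normal(N)
Q_direct = analysis(synthesis(F))                 # Q_f F = V_f V_f^* F

K_f = f @ f.conj().T                              # K_f[g, g'] = <f_g, f_{g'}>
Q_kernel = F @ K_f                                # [Q_f F](g') = sum_g F(g) K_f(g, g')
print("Proposition 5 holds:", np.allclose(Q_direct, Q_kernel))

S = f.T @ f.conj()                                # frame operator S_f
evals = np.linalg.eigvalsh(S)
print("frame bounds A, B:", evals.min(), evals.max())
```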

2.2 Examples

An important class of Parseval frames are wavelet transforms based on square integrable representations, which we call in this paper simply wavelet transforms. We refer the reader to [23, Chapters 2.3–2.5], and the classical papers [19, 27]. The wavelet system in the general theory is generated by fixing one signal \(f\in {\mathcal {H}}\), typically called the mother wavelet or the window function, and applying to it a set of transformations \(\{\pi (g)f\ |\ g\in G\}\), parameterized by a locally compact topological group G. The Haar measure is taken on G, and \(\pi :G\rightarrow {{{\mathcal {U}}}}({\mathcal {H}})\) is assumed to be a square integrable representation, where \({{{\mathcal {U}}}}({\mathcal {H}})\) is the group of unitary operators in \({\mathcal {H}}\).

The wavelet transform is defined by

$$\begin{aligned} V_f:{\mathcal {H}}\rightarrow L^2(G) \quad , \quad V_f[s](g)=\left\langle s,\pi (g)f\right\rangle . \end{aligned}$$

For any two mother wavelets \(f_1\) and \(f_2\), the reconstruction formula of the wavelet transform is given by

$$\begin{aligned} s=\frac{1}{\left\langle Af_2,Af_1\right\rangle }V_{f_2}^*V_{f_1}(s) =\frac{1}{\left\langle Af_2,Af_1\right\rangle }\int ^{\mathrm{w}}_G V_{f_1}[s](g)\pi (g)f_2 \ dg. \end{aligned}$$

Here, A is a special positive operator in \({\mathcal {H}}\), called the Duflo-Moore operator, uniquely defined for every square integrable representation \(\pi \), that determines the normalization of windows.

The short time Fourier transform The following construction is taken from [25]. Consider the signal space \(L^2({\mathbb {R}})\). Let \({\mathcal {T}}:{\mathbb {R}}\rightarrow \mathcal{U}(L^2({\mathbb {R}}))\) be the translation in \(L^2({\mathbb {R}})\), defined for \(x\in {\mathbb {R}}\) and \(f\in L^2({\mathbb {R}})\) by \([{\mathcal {T}}(x)f](t)=f(t-x)\). Let \({\mathcal {M}}:{\mathbb {R}}\rightarrow \mathcal{U}(L^2({\mathbb {R}}))\) be the modulation in \(L^2({\mathbb {R}})\), defined for \(\omega \in {\mathbb {R}}\) and \(f\in L^2({\mathbb {R}})\) by \([{\mathcal {M}}(\omega )f](t)=e^{2 \pi i \omega t}f(t)\). Denote \(\pi (x,\omega )= {\mathcal {T}}(x){\mathcal {M}}(\omega )\). For a normalized window f, the mapping

$$\begin{aligned} {\mathbb {R}}^2\ni (x,\omega )\mapsto \pi (x,\omega )f \end{aligned}$$

is a Parseval continuous frame, with the standard Lebesgue measure of the phase space \({\mathbb {R}}^2\). The resulting transform \(V_f[s](x,\omega )=\left\langle s,\pi (x,\omega )f\right\rangle \) is called the Short Time Fourier Transform (STFT).
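
The next sketch checks numerically that the STFT with an \(L^2\)-normalized Gaussian window is (up to truncation and Riemann-sum error) a Parseval frame: the integral of \(\left| V_f[s]\right| ^2\) over a sufficiently large time-frequency box recovers \(\left\| s\right\| ^2\). The signal, window, box, and grid spacing are illustrative choices.

```python
import numpy as np

# Numerical check that the STFT with an L^2-normalized Gaussian window is a
# Parseval frame: the Riemann sum of |V_f[s]|^2 over a large time-frequency
# box should recover ||s||^2 up to truncation and discretization error.
# Signal, window, box, and grid spacings are illustrative choices.
t = np.linspace(-4.0, 4.0, 256)
dt = t[1] - t[0]
s = np.exp(-t ** 2) * np.cos(2 * np.pi * 3 * t)

sigma = 0.5
norm = (sigma * np.sqrt(np.pi)) ** -0.5          # makes ||f||_{L^2} = 1

xs = np.linspace(-5.0, 5.0, 101)
ws = np.linspace(-8.0, 8.0, 129)
X, W = np.meshgrid(xs, ws, indexing="ij")
atoms = (norm * np.exp(-0.5 * ((t - X.ravel()[:, None]) / sigma) ** 2)
         * np.exp(2j * np.pi * W.ravel()[:, None] * t))
V = (atoms.conj() @ s) * dt                      # V_f[s] on the grid
dA = (xs[1] - xs[0]) * (ws[1] - ws[0])

print("||V_f[s]||^2 :", np.sum(np.abs(V) ** 2) * dA)
print("||s||^2      :", np.sum(np.abs(s) ** 2) * dt)
```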

2.2.1 The 1D Continuous Wavelet Transform

The following construction is taken from [13, 26]. Consider the signal space \(L^2({\mathbb {R}})\), and the translation \({\mathcal {T}}\) as in the STFT. Let \({\mathcal {D}}:{\mathbb {R}}\setminus \{0\}\rightarrow {{{\mathcal {U}}}}(L^2({\mathbb {R}}))\) be the dilation in \(L^2({\mathbb {R}})\), defined for \(\tau \in {\mathbb {R}}\setminus \{0\}\) and \(f\in L^2({\mathbb {R}})\) by \([{\mathcal {D}}(\tau )f](t)=\frac{1}{\sqrt{\left| \tau \right| }}f(\frac{t}{\tau })\). The set of transformations

$$\begin{aligned} {{{\mathcal {A}}}}=\{ {\mathcal {T}}(x){\mathcal {D}}(\tau )\ |\ (x,\tau )\in {\mathbb {R}}\times ({\mathbb {R}}\setminus \{0\})\} \end{aligned}$$
(18)

is closed under compositions. We can treat \({{{\mathcal {A}}}}\) as a group of tuples \({\mathbb {R}}\times ({\mathbb {R}}\setminus \{0\})\), with group product derived from the compositions of operators in (18). The group \({{{\mathcal {A}}}}\) is called the 1D affine group. The mapping

$$\begin{aligned} \pi (x,\tau )={\mathcal {T}}(x){\mathcal {D}}(\tau ) \end{aligned}$$

is a square integrable representation, with Duflo-Moore operator A defined by \( [{\mathcal {F}}A {\mathcal {F}}^*{\hat{f}}](z) = \frac{1}{\sqrt{\left| z\right| }}{\hat{f}}(z) \), where \({\mathcal {F}}\) is the Fourier transform. The resulting wavelet transform is called the Continuous Wavelet Transform (CWT).

Next, we show how the CWT atoms are interpreted as time-frequency atoms, and the CWT is interpreted as a time-frequency transform. By the change of variable \(\omega =\frac{1}{\tau }\), we obtain the Parseval frame

$$\begin{aligned} \{\pi '(x,\omega )f\}_{(x,\omega )\in {\mathbb {R}}\times ({\mathbb {R}}\setminus \{0\})} \end{aligned}$$

based on the representation \(\pi '(x,\omega )={\mathcal {T}}(x){\mathcal {D}}(\omega ^{-1})\). The parameter \(\omega \) is interpreted as frequency. The mapping \(\pi '\) is a representation of the 1D affine group with the new parameterization \(\omega =\frac{1}{\tau }\), in which the Haar measure is the standard Lebesgue measure of \({\mathbb {R}}\times ({\mathbb {R}}\setminus \{0\})\).
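
As a small illustration of the \((x,\omega )\) parameterization, the following sketch builds the atoms \(\pi '(x,\omega )f={\mathcal {T}}(x){\mathcal {D}}(\omega ^{-1})f\) for a Morlet-type mother wavelet and locates the coefficient of largest magnitude for a modulated Gaussian burst; the maximum should appear near the burst's time location and modulation frequency. The wavelet and the grids are illustrative choices.

```python
import numpy as np

# CWT atoms in the (x, omega = 1/tau) parameterization:
# [pi'(x, omega) f](t) = sqrt(|omega|) f(omega (t - x)).
# For a Morlet-type mother wavelet (illustrative, as are all parameters), the
# coefficient magnitude |V_f[s](x, omega)| should peak near the time location
# and modulation frequency of the test burst (about x = 0.5, omega = 5).
t = np.linspace(-4.0, 4.0, 1024)
dt = t[1] - t[0]
s = np.exp(-((t - 0.5) ** 2) / 0.5) * np.cos(2 * np.pi * 5 * t)

def mother(u):
    return np.exp(-0.5 * u ** 2) * np.exp(2j * np.pi * u)

def cwt_atom(x, omega):
    return np.sqrt(abs(omega)) * mother(omega * (t - x))

xs = np.linspace(-2.0, 3.0, 51)
omegas = np.linspace(1.0, 10.0, 46)
V = np.array([[np.sum(s * np.conj(cwt_atom(x, om))) * dt for om in omegas]
              for x in xs])
ix, iom = np.unravel_index(np.argmax(np.abs(V)), V.shape)
print("peak near x =", xs[ix], ", omega =", omegas[iom])
```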

3 Elements of Stochastic Signal Processing in Phase Space

In this section we develop basic approximation results that will be used later in the paper to bound the approximation error between the signal processing pipelines (5) and (6), and their stochastic approximations (7)–(11).

3.1 Phase Space Operators

We start by defining integral operators in the coefficient space.

Definition 6

Let T be a bounded linear operator in \(L^2(G)\), where G is a locally compact topological space with a \(\sigma \)-finite Borel measure.

  1.

    We call T a phase space integral operator (PSI operator) if there exists a measurable function \(R:G\times G\rightarrow {\mathbb {C}}\) with \(R(\cdot ,g)\in L^2(G)\) for almost every \(g\in G\), such that for every \(F\in L^2(G)\)

    $$\begin{aligned} TF = \int _{G} R(\cdot ,g)F(g)dg. \end{aligned}$$
    (19)
  2.

    A phase space integral operator T is called uniformly square integrable, if there is a constant \(D>0\) such that for almost every \(g\in G\)

    $$\begin{aligned} \left\| R(\cdot ,g)\right\| _{L^2(G)}=\sqrt{\int _{G} \left| R(g',g)\right| ^2 dg'} \le D. \end{aligned}$$
    (20)

Example 7

The Gramian operator \(Q_f\) of a continuous frame is a phase space integral operator by Proposition 5, with \(\left\| Q_f\right\| _2\le B\). If f is bounded, with bound \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\), then \(Q_f\) is uniformly square integrable with bound

$$\begin{aligned} \left\| K_f(g,\cdot )\right\| _{2} = \sqrt{\int _{G} \left| K_f(g,g')\right| ^2 dg'} = \sqrt{\int _{G} \left| V_f[f_g](g')\right| ^2 dg'} \le B^{1/2}C. \end{aligned}$$

3.2 Sampling in Phase Space

Let \(F\in L^2(G)\), and let f be a continuous frame. The phase space G in general does not have finite measure, and thus uniform sampling is not defined on G. However, when G has infinite measure, functions \(F\in L^2(G)\) must decay in some sense “at infinity”, so it is possible to restrict our sampling to a compact domain in G, in which F has most of its energy. More accurately, since G is \(\sigma \)-finite, it is the disjoint union of at most countably many sets of finite measure. Namely, there are disjoint measurable sets \(X_n\) of finite measure, with \(\bigcup _{n\in {\mathbb {N}}}X_n= G\), such that for every \(F\in L^2(G)\)

$$\begin{aligned} \left\| F\right\| _{L^2(G)}^2 = \int _G \left| F(g)\right| ^2dg = \sum _{n\in {\mathbb {N}}}\int _{X_n} \left| F(g)\right| ^2dg=\sum _{n\in {\mathbb {N}}}\left\| F\right\| _{L^2(X_n)}^2. \end{aligned}$$
(21)

Denote \(G_n=\bigcup _{j=1}^n X_j\), and note that \(\bigcup _{n\in {\mathbb {N}}}G_n= G\). Now, (21) is equivalent to

$$\begin{aligned} \left\| F\right\| _{L^2(G)}^2 = \lim _{n\rightarrow \infty }\left\| F\right\| _{L^2(G_n)}^2. \end{aligned}$$

Thus, for every \(\epsilon >0\), there exists an indicator function \(\psi _{\epsilon }\) (that depends on F) of a measurable set of finite measure \(\left\| \psi _{\epsilon }\right\| _1\), such that

$$\begin{aligned} \left\| \psi _{\epsilon } F - F\right\| _2<\epsilon . \end{aligned}$$
(22)

In our analysis, we allow more general forms of envelopes \(\psi _{\epsilon }\).

Definition 8

An envelope is a positive \(\psi \in L^1(G)\cap L^{\infty }(G)\) satisfying \(\left\| \psi \right\| _{\infty }\le 1\).

Given an envelope \(\psi \), samples can be drawn from G according to the probability density \(\frac{\psi (g)}{\left\| \psi \right\| _1}\). In the following analysis we fix an envelope \(\psi _{\epsilon }=\psi \) independently of a specific function \(F\in L^2(G)\). This is the common approach in classical signal processing, where a compact frequency band [a, b] is predefined independently of a specific signal. It is implicit that we can only treat signals having most of their frequency energy in [a, b]. Any frequency information outside of [a, b] is lost or projected into the band. In Sect. 5, and specifically Definition 25, we study the support of \(\psi \) required to capture most of the energy of discrete signals.
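
A simple way to realize such sampling numerically is to discretize the envelope on a fine grid and draw grid points with probabilities proportional to \(\psi \). The sketch below does this for a Gaussian taper on the time-frequency plane (values in [0, 1], hence a valid envelope); the specific envelope and grid are illustrative.

```python
import numpy as np

# Drawing phase-space samples with density psi / ||psi||_1: discretize the
# envelope on a fine grid and use the normalized values as a categorical
# distribution over grid points.  Here psi is a Gaussian taper on the
# time-frequency plane (values in [0, 1], hence a valid envelope); the envelope
# and the grid are illustrative choices.
rng = np.random.default_rng(2)

xs = np.linspace(-6.0, 6.0, 241)                 # "time" axis of phase space
ws = np.linspace(-8.0, 8.0, 321)                 # "frequency" axis
X, W = np.meshgrid(xs, ws, indexing="ij")
dA = (xs[1] - xs[0]) * (ws[1] - ws[0])

psi = np.exp(-0.5 * (X / 3.0) ** 2 - 0.5 * (W / 4.0) ** 2)
psi_l1 = np.sum(psi) * dA                        # ||psi||_1 (Riemann sum)

K = 2000
idx = rng.choice(psi.size, size=K, p=(psi / psi.sum()).ravel())
samples = np.column_stack([X.ravel()[idx], W.ravel()[idx]])

print("||psi||_1 approx.:", psi_l1)
print("first samples (x, w):")
print(samples[:3])
```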

3.3 Input Sampling in Phase Space Operators

Given a PSI operator T with kernel R, in this subsection we sample the input variable g of \(R(g',g)\), and keep the output variable \(g'\) continuous. In Sect. 4 we show that sampling the output variable \(g'\) is a special case of the framework developed in this subsection. Let \(\psi \) be an envelope, and \(g\in G\) be a random sample according to the probability distribution \(\frac{\psi (g)}{\left\| \psi \right\| _1}\). Define the random rank one operator \(T^{\psi ,1}\), applied on \(F\in L^2(G)\), by

$$\begin{aligned} g\mapsto (T^{\psi ,1}F)(g)=\left\| \psi \right\| _1 R(\cdot ,g) F(g). \end{aligned}$$

We also denote \(T^{\psi ,1}F(g';g)=\left\| \psi \right\| _1 R(g',g) F(g)\), where \(g'\) is the variable of the output function \(T^{\psi ,1}F\). Next, we define the Monte Carlo approximation of TF as a sum of independent \(T^{\psi ,1}F\) vectors.

Definition 9

(Input Monte Carlo phase space operator) Let T be a PSI operator in \(L^2(G)\) (Definition 6), \(\psi \) an envelope, \(F\in L^2(G)\), and \(K\in {\mathbb {N}}\). Let \(G_k=G\), \(k=1,\ldots ,K\), be K copies of G, and let \(\{g^k\}_{k=1}^K\) denote a random sample from \(G_1\times \ldots \times G_K\) with the probability distribution \(\prod _{k=1}^K\frac{\psi (g^k)}{\left\| \psi \right\| _1}\). Let \(T^{\psi ,1}_kF: G_k\rightarrow L^2(G)\) be the random vectors defined for \(g^k\in G_k\) by \([T^{\psi ,1}_kF](g^k) =[T^{\psi ,1}F](g^k)\), \(k=1,\ldots , K\). Define the random vector \(T^{\psi ,K}F: G_1\times \ldots \times G_K \rightarrow L^2(G)\) by

$$\begin{aligned}{}[T^{\psi ,K}F](g';g^1,\ldots ,g^K):=\frac{1}{K}\sum _{k=1}^K [T^{\psi ,1}_kF](g';g^k) = \frac{\left\| \psi \right\| _1}{K}\sum _{k=1}^K R(g', g^k) F(g^k). \end{aligned}$$
(23)

We call \(T^{\psi ,K}F\) the Monte Carlo phase space integral operator applied on F and based on K samples, approximating TF.

When the envelope \(\psi \) is fixed throughout the analysis, we often denote interchangeably \(T^KF=T^{\psi ,K}F\). In the following we fix an envelope \(\psi \).
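
The following sketch implements (23) on a toy one-dimensional phase space \(G=[0,1]\) with the trivial envelope \(\psi =1\) (so \(\left\| \psi \right\| _1=1\)), and compares \(T^KF\) with a fine-grid computation of \(T(\psi F)\); the kernel, the input function, and the sample sizes are illustrative choices.

```python
import numpy as np

# The input Monte Carlo PSI operator (23) on a toy phase space G = [0, 1] with
# the trivial envelope psi = 1 (so ||psi||_1 = 1): T^K F is compared against a
# fine-grid computation of T(psi F).  Kernel, F, and sample sizes are
# illustrative choices.
rng = np.random.default_rng(3)

R = lambda gp, g: np.exp(-20.0 * (gp - g) ** 2)        # kernel R(g', g)
F = lambda g: np.sin(2 * np.pi * g) + 0.5 * g          # input F in L^2(G)

g_out = np.linspace(0.0, 1.0, 201)                     # output variable g'
dg_out = g_out[1] - g_out[0]

g_fine = np.linspace(0.0, 1.0, 20001)                  # fine grid for T(psi F)
exact = np.sum(R(g_out[:, None], g_fine) * F(g_fine), axis=1) * (g_fine[1] - g_fine[0])

for K in (100, 400, 1600, 6400):                       # L^2 error ~ 1/sqrt(K)
    g = rng.uniform(0.0, 1.0, K)                       # samples from psi/||psi||_1
    est = np.sum(R(g_out[:, None], g) * F(g), axis=1) / K     # eq. (23)
    err = est - exact
    print(K, np.sqrt(np.sum(err ** 2) * dg_out))
```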

Remark 10

Note that \(T^KF\) is a random variable – a function with \((\{g^k\}_{k=1}^K,g')\) as the variable. Thus, we can sample \(L^2(G)\) vectors in (23), which are equivalence classes of functions, without requiring any continuity assumption on \(R(\cdot ,\cdot \cdot )F(\cdot \cdot )\). Indeed, \(T^KF\) is defined up to a set of tuples \((\{g^k\}_{k=1}^K,g')\) of measure zero.

The expected value \({\mathbb {E}}(T^K F)\in L^2(G)\) of \(T^K F\) is a function in \(L^2(G)\), defined by

$$\begin{aligned} {\mathbb {E}}(T^K F) = \int _{G^K} [T^KF]\big ((\cdot );g^1,\ldots ,g^K\big ) \frac{\psi (g^1)}{\left\| \psi \right\| _1}dg^1\ldots \frac{\psi (g^K)}{\left\| \psi \right\| _1}dg^K. \end{aligned}$$

We define the variance \({\mathbb {V}}(T^K F)\) as the integral

$$\begin{aligned} {\mathbb {V}}(T^K F) = \int _{G^K} \left| [T^KF]\big ((\cdot );g^1,\ldots ,g^K\big )-{\mathbb {E}}(T^K F)\right| ^2 \frac{\psi (g^1)}{\left\| \psi \right\| _1}dg^1\ldots \frac{\psi (g^K)}{\left\| \psi \right\| _1}dg^K. \end{aligned}$$

Given an envelope \(\psi \), by abuse of notation, we also denote by \(\psi \) the multiplicative operator

$$\begin{aligned} \psi : L^2(G)\rightarrow L^2(G) \quad , \quad [\psi F](g)=\psi (g)F(g). \end{aligned}$$

Proposition 11

Let T be a PSI operator, and \(F\in L^2(G)\). Then,

  1.

    the expected value \({\mathbb {E}}(T^KF)\) is in \(L^2(G)\) and satisfies

    $$\begin{aligned} {\mathbb {E}}(T^KF)=T(\psi F), \end{aligned}$$
    (24)
  2.

    if T is a uniformly square integrable PSI operator, with bound D, then

    $$\begin{aligned} {\mathbb {V}}(T^KF)=\frac{1}{K}{\mathbb {V}}(T^1F) \in L^{1}(G), \end{aligned}$$
    (25)

    and \(\left\| {\mathbb {V}}(T^KF)\right\| _1 \le \frac{1}{K}\left\| \psi \right\| _1D^2\left\| F\right\| _2^2\).

The proof of Item 1 of Proposition 11 follows directly from the definition of PSI operators (Definition 6). Indeed, for \(T^1F\),

$$\begin{aligned} {\mathbb {E}}(T^1F)(g') = \int _{G} \left\| \psi \right\| _1 R(g',g) F(g) \frac{\psi (g)}{\left\| \psi \right\| _1}dg = T(\psi F)(g'), \end{aligned}$$

and for \(T^K\) we use linearity. The proof of Item 2 of Proposition 11 is in the next subsection.

We next bound the average square error in approximating \(T[\psi F]\) by \(T^KF\).

Proposition 12

Let f be a continuous frame, and T a uniformly square integrable PSI operator with bound D. Then

$$\begin{aligned} {\mathbb {E}}\Big (\left\| T^KF-T(\psi F)\right\| _2^2 \Big )\le \frac{\left\| \psi \right\| _1}{K} D^2\left\| F\right\| ^2_2. \end{aligned}$$

Proof

By the Fubini–Tonelli theorem, and Proposition 11,

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big ( \left\| T^KF-T(\psi F)\right\| _2^2 \Big ) \\&\quad =\int _{G^K} \int _G \left| [T^KF](g';g^1,\ldots ,g^K)-T(\psi F)(g')\right| ^2 dg'\ \frac{\psi (g^1)}{\left\| \psi \right\| _1}dg^1\cdots \frac{\psi (g^K)}{\left\| \psi \right\| _1}dg^K \\&\quad =\left\| {\mathbb {V}}(T^KF)\right\| _1 \le \frac{\left\| \psi \right\| _1}{K} D^2\left\| F\right\| ^2_2. \end{aligned} \end{aligned}$$

\(\square \)

The expected error in Proposition 12 is pointwise in F. We note that an operator expected error bound of the form “\({\mathbb {E}}\left\| T^K-T\big (\psi (\cdot )\big )\right\| ^2_{2}= O(\frac{\left\| \psi \right\| _1}{K})\)” like in the finite dimensional matrix operator case [48] is not possible. Indeed, for any sample set \(\{g^k\}_{k=1}^K\) there is a normalized function \(F\in L^2(G)\) supported in \(G\setminus \{g^k\}_{k=1}^K\), so \(T^KF=0\) and \(\left\| T^K-T\big (\psi (\cdot )\big )\right\| _2 \ge \left\| T^KF-T(\psi F)\right\| _2=\left\| T(\psi F)\right\| _2\). We thus focus in this paper on pointwise error estimates.
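
The pointwise bound of Proposition 12 can be probed empirically by averaging the squared error over repeated draws of the sample set. The sketch below does this on a toy setup \(G=[0,1]\), \(\psi =1\): the empirical mean squared error should decay like 1/K and stay below \(\left\| \psi \right\| _1 D^2\left\| F\right\| _2^2/K\). The kernel, input, grids, and trial counts are illustrative choices.

```python
import numpy as np

# Empirical check of Proposition 12 on G = [0, 1] with psi = 1: the mean
# squared L^2 error of T^K F, averaged over repeated sample draws, should
# decay like 1/K and stay below ||psi||_1 D^2 ||F||_2^2 / K.  Kernel, input,
# grids, and trial counts are illustrative choices.
rng = np.random.default_rng(4)

R = lambda gp, g: np.exp(-20.0 * (gp - g) ** 2)
F = lambda g: np.sin(2 * np.pi * g) + 0.5 * g

g_out = np.linspace(0.0, 1.0, 201)
dg_out = g_out[1] - g_out[0]
g_fine = np.linspace(0.0, 1.0, 20001)
dg_fine = g_fine[1] - g_fine[0]

exact = np.sum(R(g_out[:, None], g_fine) * F(g_fine), axis=1) * dg_fine
D2 = np.max(np.sum(R(g_fine[:, None], g_out) ** 2, axis=0)) * dg_fine   # sup_g ||R(., g)||_2^2
F_norm2 = np.sum(F(g_fine) ** 2) * dg_fine                              # ||F||_2^2

for K in (100, 400, 1600):
    mse = 0.0
    for _ in range(200):                                                # 200 trials
        g = rng.uniform(0.0, 1.0, K)
        est = np.sum(R(g_out[:, None], g) * F(g), axis=1) / K
        mse += np.sum((est - exact) ** 2) * dg_out / 200
    print(K, "empirical MSE:", mse, " bound:", D2 * F_norm2 / K)
```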

The following is an important special case of Propositions 11 and 12.

Corollary 13

Let \(T=Q_f=V_f V^*_f\) be the Gramian operator of a bounded continuous frame with \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\) and upper frame bound B. Let \(K_f\) be the frame kernel. Then, we have

$$\begin{aligned} Q_f^K F= \frac{\left\| \psi \right\| _1}{K} \sum _{k=1}^K F(g^k)K_f(g^k,\cdot ), \end{aligned}$$
(26)

and

$$\begin{aligned} {\mathbb {E}}(Q_f^K F) = Q_f (\psi F). \end{aligned}$$

Moreover, \(Q_f\) is a uniformly square integrable PSI operator with bound \(B^{1/2}C\), and

$$\begin{aligned} {\mathbb {E}}\Big (\left\| Q_f^KF-Q_f(\psi F)\right\| _2^2 \Big )\le \frac{\left\| \psi \right\| _1}{K} BC^2\left\| F\right\| ^2_2. \end{aligned}$$

Proof

By Example 7, \(Q_f\) is a uniformly square integrable PSI operator with bound \(B^{1/2}C\). The rest of the results follow from Propositions 11 and 12. \(\square \)

3.4 Proof of Proposition 11

The proof is based on the following lemma.

Lemma 14

Let T be a PSI operator with kernel \(R(g',g)\), let \(\psi \) be an envelope, and let \(F\in L^2(G)\). Then the following holds.

  1.

    The expected value of \(T^1F\) satisfies

    $$\begin{aligned} {\mathbb {E}}(T^1F)=T(\psi F). \end{aligned}$$
    (27)
  2.

    If T is a uniformly square integrable PSI operator, then \({\mathbb {V}}(T^1F)\in L^1(G)\), and

    $$\begin{aligned} {\mathbb {V}}(T^1F)= {\mathbb {E}}\Big (\left| T^1 F\right| ^2\Big ) - \left| {\mathbb {E}}(T^1F)\right| ^2. \end{aligned}$$
    (28)

    Here, \(\left| T^1 F\right| ^2\) is the function \((g;g')\mapsto \left| \left\| \psi \right\| _1 R(g',g) F(g)\right| ^2\), and expected value is with respect to the random variable g.

  3.

    If T is a uniformly square integrable PSI operator, with bound D, then

    $$\begin{aligned} \left\| {\mathbb {V}}(T^1F)\right\| _1 \le \left\| \psi \right\| _1D^2\left\| F\right\| _2^2. \end{aligned}$$
    (29)

Proof

Part 1 was shown in the discussion below Proposition 11. For parts 2 and 3, we write

$$\begin{aligned} \begin{aligned} {\mathbb {V}}(T^1F)(g')&= \int _{G} \left| [T^1F]\big (g';g\big )\right| ^2 \frac{\psi (g)}{\left\| \psi \right\| _1}dg \\&\qquad -2\mathrm{Real} \int _{G} [T^1F]\big (g';g\big )\overline{{\mathbb {E}}(T^1 F)(g')} \frac{\psi (g)}{\left\| \psi \right\| _1}dg \\&\qquad + \int _{G} \left| {\mathbb {E}}(T^1 F)(g')\right| ^2 \frac{\psi (g)}{\left\| \psi \right\| _1}dg. \end{aligned} \end{aligned}$$
(30)

We first use the Fubini–Tonelli theorem to prove integrability with respect to \(g'\) of the first term of (30). By the fact that T is uniformly square integrable (Definition 6.2), and by \(\left\| \psi \right\| _{\infty }\le 1\),

$$\begin{aligned} \begin{aligned} \int _{G}\int _{G} \left| [T^1F]\big (g';g\big )\right| ^2 dg'\frac{\psi (g)}{\left\| \psi \right\| _1} dg&= \left\| \psi \right\| _1 \int _{G}\int _{G} \left| R(g',g)\right| ^2 dg' \left| F(g)\right| ^2\psi (g) dg \\&\le \left\| \psi \right\| _1 D^2\left\| F\right\| _2^2, \end{aligned} \end{aligned}$$
(31)

so \(\int _{G} \left| [T^1F]\big (g';g\big )\right| ^2 \frac{\psi (g)}{\left\| \psi \right\| _1}dg \in L^1(G)\) with respect to \(g'\). For the second term of (30), we have

$$\begin{aligned} \begin{aligned}&\int _{G}\int _{G} [T^1F]\big (g';g\big )\overline{{\mathbb {E}}(T^1 F)(g')} \frac{\psi (g)}{\left\| \psi \right\| _1}dg dg'\\&\quad = \int _{G}\int _{G} R(g',g)F(g) \psi (g) dg \overline{T(\psi F)(g')}dg' = \left\| {\mathbb {E}}(T^1 F)\right\| _2^2. \end{aligned} \end{aligned}$$

This leads to (28). By the non-negativity of the integrand in the definition of \({\mathbb {V}}(T^1F)(g')\), we can write \(\left\| {\mathbb {V}}(T^1F)\right\| _1 = \int _{G}{\mathbb {V}}(T^1F)(g') dg'\), so by (28) and (31), we get (29). \(\square \)

Proof of Proposition 11

Part 1 was shown in the discussion below Proposition 11. Next we show Part 2. By the Fubini–Tonelli theorem, we have

$$\begin{aligned} \begin{aligned}&\left\| {\mathbb {V}}(T^KF)\right\| _1 \\&=\int _{G^K}\int _{G} \left| \frac{1}{K}\sum _{k=1}^K [T^1_kF](g') - {\mathbb {E}}(T^KF)(g')\right| ^2 dg' \left\| \psi \right\| _1^{-K}\psi (g^1)dg^1\ldots \psi (g^K)dg^K \end{aligned} \end{aligned}$$
(32)

When expanding the square in (32), we have the diagonal term

$$\begin{aligned} \int _{G^K}\int _{G} \frac{1}{K^2}\sum _{k=1}^K \left| [T^1_kF](g') - {\mathbb {E}}(T^KF)(g')\right| ^2 dg'\left\| \psi \right\| _1^{-K}\psi (g^1)dg^1\ldots \psi (g^K)dg^K, \end{aligned}$$

and mixed terms, for \(k\ne k'\),

$$\begin{aligned} \begin{aligned}&\int _{G^K}\int _{G} \frac{1}{K^2}\Big ([T^1_kF](g^k;g') - {\mathbb {E}}(T^KF)(g')\Big ) \times \\&\quad \quad \overline{\Big ([T^1_{k'}F](g^{k'};g') - {\mathbb {E}}(T^KF)(g')\Big )} dg' \left\| \psi \right\| _1^{-K}\psi (g^1)dg^1\ldots \psi (g^K)dg^K \\&\quad =\int _{G}\int _{G} \overline{\Big ([T^1_{k'}F](g^{k'};g') - {\mathbb {E}}(T^KF)(g')\Big )} \frac{1}{K^2} \times \\&\quad \quad \quad \int _G\Big ([T^1_kF](g^k;g') - {\mathbb {E}}(T^KF)(g')\Big ) \left\| \psi \right\| _1^{-1}\psi (g^k)dg^{k} \ dg'\ \left\| \psi \right\| _1^{-1}\psi (g^{k'})dg^{k'}, \end{aligned} \end{aligned}$$

which are equal to zero, since

$$\begin{aligned} \begin{aligned}&\int _{G}\Big ([T^1_kF](g^k;g') - {\mathbb {E}}(T^KF)(g')\Big ) \left\| \psi \right\| _1^{-1}\psi (g^k)dg^{k} \\&\quad =\int _{G}[T^1_kF](g^k;g')\left\| \psi \right\| _1^{-1}\psi (g^k)dg^{k} - {\mathbb {E}}(T^1_k F)(g') = 0. \end{aligned} \end{aligned}$$

Here, the fact that \(\overline{\Big ([T^1_{k'}F](g^{k'};(\cdot )) - {\mathbb {E}}(T^KF)(\cdot )\Big )}\in L^2(G)\) for a.e. \(g^{k'}\), and the fact that \(\left\| \psi \right\| _1^{-1}\psi (g^k)dg^k\) and \(\left\| \psi \right\| _1^{-1}\psi (g^{k'})dg^{k'}\) are probability measures, justify the above use of Fubini’s theorem.

We thus have, by part 3 of Lemma 14,

$$\begin{aligned} \begin{aligned} \left\| {\mathbb {V}}(T^KF)\right\| _1&=\int _{G^K}\int _{G} \frac{1}{K^2}\sum _{k=1}^K\left| \Big ([T^1_kF](g^k;g') - {\mathbb {E}}(T^KF)(g')\Big )\right| ^2 \\&\quad dg' \left\| \psi \right\| _1^{-K}\psi (g^1)dg^1\ldots \psi (g^K)dg^K \\&= \frac{1}{K^2}\sum _{k=1}^K\int _{G}\int _{G} \left| \Big ([T^1_kF](g^k;g') - {\mathbb {E}}(T^KF)(g')\Big )\right| ^2 dg' \left\| \psi \right\| _1^{-1}\psi (g^k)dg^k \\&= \frac{1}{K^2}\sum _{k=1}^K\left\| {\mathbb {V}}(T^1_kF)\right\| _1 = \frac{1}{K}\left\| {\mathbb {V}}(T^1F)\right\| _1 \le \frac{1}{K}\left\| \psi \right\| _1D^2\left\| F\right\| _2^2. \end{aligned} \end{aligned}$$

\(\square \)

3.5 Monte Carlo Synthesis

In this subsection, we use the results of Sect. 3.3 to define and analyze the Monte Carlo approximation of synthesis.

Definition 15

Let \(\psi \) be an envelope and \(\{g^k\}_{k=1}^K\) random samples as in Definition 9. Given \(F\in L^2(G)\), the Monte Carlo synthesis \(V_f^{*\psi ,K}F\) is the random variable \(G^K\rightarrow {\mathcal {H}}\) defined as

$$\begin{aligned} V_f^{*\psi ,K} F =\frac{\left\| \psi \right\| _1}{K}\sum _{k=1}^K F(g^k)f_{g^k}. \end{aligned}$$

When the envelope \(\psi \) is fixed throughout the analysis, we often denote the Monte Carlo synthesis in short by \(V_f^{* K}\). The following proposition formulates the Monte Carlo synthesis using the Monte Carlo PSI operator \(Q_f^K\) approximating the Gramian operator \(Q_f\) (Corollary 13), and the frame operator \(S_f\).

Proposition 16

\(V_f^{*K} F = S_f^{-1}V_f^*Q_f^K F.\)

Proof

By linearity, it is enough to prove for \(K=1\). By the fact that \(S_f=V_f^*V_f\),

$$\begin{aligned} \begin{aligned} S_f^{-1}V_f^*Q_f^1 F&= S_f^{-1}V_f^* \left\| \psi \right\| _1F(g)K_f(g,\cdot ) \\&= \left\| \psi \right\| _1F(g) S_f^{-1} V_f^* V_f(f_g)= \left\| \psi \right\| _1F(g) f_g = V_f^{*1} F. \end{aligned} \end{aligned}$$

\(\square \)

Next, we show that \(V_f^{*K}F\) approximates \(V_f^*[\psi F]\).

Proposition 17

(Synthesis Monte Carlo approximation rate) Let f be a bounded continuous frame with frame bounds A and B, and \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\). Let \(\psi \in L^1(G)\) be an envelope. Then

$$\begin{aligned} {\mathbb {E}}\Big ( \left\| V_f^{*K}F-V_f^* [\psi F]\right\| _{{\mathcal {H}}}^2 \Big ) \le \frac{\left\| \psi \right\| _1}{K} \frac{B}{A}C^2 \left\| F\right\| _2^2. \end{aligned}$$

Proof

By Proposition 16 and Lemma 35 of Appendix B

$$\begin{aligned} \begin{aligned} \left\| V_f^{*K} F-V_f^*[\psi F]\right\| _{{\mathcal {H}}}&= \left\| S_f^{-1}V_f^*Q_f^K F-S_f^{-1}V_f^*Q_f[\psi F]\right\| _{{\mathcal {H}}}\\&= \left\| V_f^+\Big (Q_f^K F-Q_f[\psi F]\Big )\right\| _{{\mathcal {H}}} \\&\le \left\| V_f^+\right\| \left\| Q_f^K F-Q_f[\psi F]\right\| _2 \le A^{-1/2}\left\| Q_f^K F-Q_f[\psi F]\right\| _2 . \end{aligned} \end{aligned}$$
(33)

Indeed, by the frame bound \(A^{1/2}\left\| s\right\| _{{\mathcal {H}}} \le \left\| V_f[s]\right\| _2\) for \(s=V_f^+F\),

$$\begin{aligned} \left\| V_f^+F\right\| _{{\mathcal {H}}} \le A^{-1/2}\left\| V_fV_f^+F\right\| _2 = A^{-1/2}\left\| P_{V_f({\mathcal {H}})}F\right\| _2 \le A^{-1/2}\left\| F\right\| _2. \end{aligned}$$

Now, the result follows from Corollary 13. \(\square \)

4 Stochastic Phase Space Signal Processing of Continuous Signals

In this section we formulate and analyze the Monte Carlo approximations (8)–(11) of the signal processing in phase space pipelines (5) and (6). In Sect. 4.2 we bound the expected value of the error, and in Sect. 4.3 we bound the concentration of measure of the error.

4.1 Definition of Stochastic Phase Space Signal Processing

For Parseval frames (Definition 1.6), the frame operator is \(S_f=I\), and hence signal processing in phase space takes the form \({\mathcal {P}}_{f,T,r}:= V_f^* T r\circ V_f\). We call \({\mathcal {P}}_{f,T,r}\) Parseval signal processing in phase space even if f is non-Parseval. For non-Parseval frames, synthesis and analysis signal processing in phase space involve the multiplications \({\mathcal {P}}_{f,T,r}S_f^{-1}\) and \(S_f^{-1}{\mathcal {P}}_{f,T,r}\) respectively. Since \(\left\| S_f^{-1}\right\| _2\le A^{-1}<\infty \), it is enough to bound the error entailed by randomly approximating \({\mathcal {P}}_{f,T,r}\), and then multiply the bound by \(\left\| S_f^{-1}\right\| _2\) for non-Parseval pipelines. We hence focus only on Parseval signal processing in our analysis.

As discussed in Sect. 3.2, sampling in phase space requires enveloping. We hence formulate the following list of signal processing pipelines.

Definition 18

Let f be a continuous frame over the phase space G, T be a bounded operator in \(L^2(G)\), \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\), and \(\psi ,\eta \in L^1(G)\) two envelopes. Let \(s\in {\mathcal {H}}\) denote a generic signal.

  1.

    A signal processing pipeline is defined by

    $$\begin{aligned} {\mathcal {P}}_{f,T,r}s = V_f^* T (r\circ V_f[s]). \end{aligned}$$
    (34)
  2.

    An output enveloped signal processing pipeline is defined by

    $$\begin{aligned} {\mathcal {P}}^{\psi }_{f,T,r}s = V_f^*\psi T (r\circ V_f[s]). \end{aligned}$$
    (35)
  3.

    An input-output enveloped signal processing pipeline is defined by

    $$\begin{aligned} {\mathcal {P}}^{\psi ; \eta }_{f,T,r}s = V_f^*\psi T \Big (\eta r\circ V_f[s]\Big ). \end{aligned}$$
    (36)

The following list of Monte Carlo approximations correspond to the pipelines of Definition 18.

Definition 19

Let \({\mathcal {P}}_{f,T,r}\) be a signal processing pipeline, \(\psi \) and \(\eta \) two envelopes, and \(K,L\in {\mathbb {N}}\). Let \(s\in {\mathcal {H}}\) denote a generic signal.

  1.

    The output stochastic signal processing pipeline is defined by

    $$\begin{aligned}{}[{\mathcal {P}}s]_{f,T,r}^{\psi , K} = V_f^{*\psi , K }\big (T (r\circ V_f[s])\big ) \end{aligned}$$
    (37)
  2.

    For a phase space integral operator T, the input-output stochastic signal processing pipeline is defined by

    $$\begin{aligned}{}[{\mathcal {P}}s]_{f,T,r}^{\psi , K ;\eta ,L} = V_f^{*\psi ,K}\big (T^{\eta , L}(r\circ V_f[s])\big ) \end{aligned}$$
    (38)

We typically fix f, T, and r, in which case we omit them from the pipeline notation and denote \({\mathcal {P}}\), \({\mathcal {P}}^{\psi }\), \({\mathcal {P}}^{\psi ,K}\) etc. Equations (8)–(11) give explicit formulas for the synthesis and analysis formulations of \([{\mathcal {P}}s]^{\psi , K ;\eta ,L}\) and \([{\mathcal {P}}s]^{\psi , K}\), based on the samples in phase space. As noted in Sect. 1.2, the pipeline (37) is useful for multipliers, shrinkage, and phase vocoder, and the pipeline (38) is useful for PSI operators.
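
As an illustration of the output stochastic pipeline (37), the following sketch takes the frame to be a Gaussian-window STFT, T a time-frequency multiplier (a smooth mask), r the identity, and \(\psi \) the indicator of a box, and compares the Monte Carlo output with a dense-grid evaluation of \([{\mathcal {P}}s]^{\psi }\). The mask, the box, and the sample sizes are illustrative choices.

```python
import numpy as np

# Output stochastic pipeline (37) for a Gaussian-window STFT: T is a smooth
# time-frequency multiplier keeping |w| below about 3.5, r is the identity,
# psi is the indicator of a box, and V_f^{*psi,K} is the Monte Carlo synthesis
# of Definition 15.  A dense grid over the box provides the reference [P s]^psi.
# Mask, box, and sample sizes are illustrative choices.
rng = np.random.default_rng(5)

t = np.linspace(-4.0, 4.0, 256)
dt = t[1] - t[0]
s = (np.exp(-(t + 1) ** 2) * np.cos(2 * np.pi * 2 * t)
     + np.exp(-(t - 1) ** 2) * np.cos(2 * np.pi * 5 * t))   # two components

def atoms(x, w):
    return (np.exp(-0.5 * ((t - x[:, None]) / 0.5) ** 2)
            * np.exp(2j * np.pi * w[:, None] * t))

mask = lambda x, w: 1.0 / (1.0 + np.exp(4.0 * (np.abs(w) - 3.5)))

x_lo, x_hi, w_lo, w_hi = -4.0, 4.0, -8.0, 8.0
C = (x_hi - x_lo) * (w_hi - w_lo)                            # ||psi||_1

# Reference: [P s]^psi evaluated on a dense grid of the box.
X, W = np.meshgrid(np.linspace(x_lo, x_hi, 81), np.linspace(w_lo, w_hi, 161),
                   indexing="ij")
A = atoms(X.ravel(), W.ravel())
coef = mask(X.ravel(), W.ravel()) * ((A.conj() @ s) * dt)    # T(r(V_f[s])) on grid
dA = (X[1, 0] - X[0, 0]) * (W[0, 1] - W[0, 0])
ref = ((coef @ A) * dA).real

# Monte Carlo synthesis of the same coefficients at K random phase-space points.
for K in (500, 2000, 8000):
    gx, gw = rng.uniform(x_lo, x_hi, K), rng.uniform(w_lo, w_hi, K)
    Ak = atoms(gx, gw)
    coef_k = mask(gx, gw) * ((Ak.conj() @ s) * dt)
    out = ((C / K) * (coef_k @ Ak)).real
    print(K, np.linalg.norm(out - ref) / np.linalg.norm(ref))
```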

4.2 Expected Error in Stochastic Phase Space Signal Processing

In the following, we estimate the error of the stochastic methods.

Theorem 20

Let f be a bounded continuous frame with frame bounds A and B, and \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\). Let T be a bounded operator in \(L^2(G)\), and \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\) satisfy \(\left| r(x)\right| \le E\left| x\right| \) for some \(E\ge 0\) and every \(x\in {\mathbb {C}}\). Let \(\psi \) and \(\eta \) be two envelopes, and \(K,L\in {\mathbb {N}}\). Then, for every signal \(s\in {\mathcal {H}}\), the following two properties hold.

  1.

    Output enveloped signal processing stochastic approximation:

    $$\begin{aligned} {\mathbb {E}}\Big (\left\| [{\mathcal {P}}s]^{\psi , K}-[{\mathcal {P}}s]^{\psi }\right\| _{{\mathcal {H}}}^2\Big ) \le \frac{\left\| \psi \right\| _1}{K}A^{-1}B^2C^2 E^2\left\| T\right\| _2^2 \left\| s\right\| _{{\mathcal {H}}}^2. \end{aligned}$$
  2.

    Input-output enveloped signal processing stochastic approximation: if T is a uniformly square integrable PSI operator with bound D, then

    $$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big (\left\| [{\mathcal {P}}s]^{\psi , K;\eta ,L}-[{\mathcal {P}}s]^{\psi ;\eta }\right\| _{{\mathcal {H}}}^2\Big )\\&\quad \le 4\frac{{\left\| \eta \right\| _1}}{L}D^2 B^2 E^2 \left\| s\right\| _{{\mathcal {H}}}^2 +16\frac{{\left\| \psi \right\| _1}{\left\| \eta \right\| _1}}{KL}A^{-1}B^2 C^2 D^2E^2\left\| s\right\| _{{\mathcal {H}}}^2 \\&\qquad +16\frac{\left\| \psi \right\| _1}{K}A^{-1}B^2 C^2 E^2\left\| T\right\| _2^2 \left\| s\right\| _{{\mathcal {H}}}^2 \\&\quad = O\left( \frac{\left\| \eta \right\| _1}{L}+\frac{\left\| \psi \right\| _1}{K} + \frac{{\left\| \psi \right\| _1}{\left\| \eta \right\| _1}}{KL}\right) \left\| s\right\| _{{\mathcal {H}}}^2. \end{aligned} \end{aligned}$$

We use the following simple observation to prove Theorem 20.

Lemma 21

Let \(Z_1,Z_2,Z_3\) be non-negative real-valued random variables such that \(Z_1\le Z_2+Z_3\) pointwise in the sample set. Then,

$$\begin{aligned} {\mathbb {E}}(Z_1^2) \le 4{\mathbb {E}}(Z_2^2) + 4{\mathbb {E}}(Z_3^2). \end{aligned}$$
(39)

Proof

We have \(Z_1\le 2\max \{Z_2,Z_3\}\), where the maximum is pointwise in the sample space. Therefore, \(Z_1^2\le 4\max \{Z_2^2,Z_3^2\} \le 4Z_2^2+4Z_3^2\), and (39) follows. \(\square \)

Proof of Theorem 20

We prove 2, and note that 1 is simpler and uses similar techniques. Denote by \(\mathbf{g}=\{g^1,\ldots ,g^K\}\) the output samples underlying \(V_f^{*\psi ,K}\), and by \(\mathbf{y}=\{y^1,\ldots ,y^L\}\) the input samples underlying \(T^{\eta ,L}\). Denote \(F=r(V_f[s])\in L^2(G)\). By the triangle inequality, and by the fact that \(\left\| V_f\right\| =\left\| V_f^*\right\| \le B^{1/2}\) and \(0\le \psi (g)\le 1\),

$$\begin{aligned} \begin{aligned}&\left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T[\eta F]\right\| _{{\mathcal {H}}} \\&\quad \le \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}} + B^{1/2}\left\| T^{\eta ,L} F- T[\eta F]\right\| _{2}. \end{aligned} \end{aligned}$$
(40)

When calculating the conditional expected value of \(\left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T[\eta F]\right\| _{{\mathcal {H}}}^2 \), with respect to a fixed \(\mathbf{g}\) (denoted here by \({\mathbb {E}}(\ \cdot \ | \mathbf{g})\)), we use Lemma 21 and Proposition 12 to get

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T[\eta F]\right\| _{{\mathcal {H}}}^2 \Big | \mathbf{g}\Big ) \\&\quad \le 4\frac{{\left\| \eta \right\| _1}}{{L}}D^2 B \left\| F\right\| _2^2 +4 {\mathbb {E}} \Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}}^2 \Big | \mathbf{g}\Big ). \end{aligned} \end{aligned}$$

Thus

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T[\eta F]\right\| _{{\mathcal {H}}}^2 \Big ) \\&\le 4\frac{{\left\| \eta \right\| _1}}{{L}}D^2 B \left\| F\right\| _2^2 +4 {\mathbb {E}} \Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}}^2 \Big ). \end{aligned} \end{aligned}$$

Note that Fubini–Tonelli theorem is satisfied in the computation of \({\mathbb {E}} \Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}}^2 \Big )\) as a repeated integral of \(\mathbf{y}\) and \(\mathbf{g}\), since the integrand is positive and the measure is \(\sigma \)-finite.

Next, by Proposition 17,

$$\begin{aligned} {\mathbb {E}} \Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}}^2 \Big | \mathbf{y}\Big ) \le \frac{{\left\| \psi \right\| _1}}{K}A^{-1}BC^2\left\| T^{\eta ,L}F\right\| _{2}^2. \end{aligned}$$

Now,

$$\begin{aligned} \left\| T^{\eta ,L}F\right\| _{2} \le \left\| T^{\eta ,L}F- T[\eta F]\right\| _{2} + \left\| T[\eta F]\right\| _{2}, \end{aligned}$$

so by Lemma 21, by Proposition 12, and by the fact that \(0\le \eta (y)\le 1\),

$$\begin{aligned} \begin{aligned}&{\mathbb {E}} \Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T^{\eta ,L} F\right\| _{{\mathcal {H}}}^2\Big )\\&\quad \le \frac{{\left\| \psi \right\| _1}}{{K}}A^{-1}B C^2 \Big ( 4\frac{{\left\| \eta \right\| _1}}{{L}}D^2\left\| F\right\| _2^2 + 4\left\| T\right\| _2^2\left\| F\right\| _2^2 \Big ). \end{aligned} \end{aligned}$$

Altogether,

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big ( \left\| V_f^{*\psi ,K}T^{\eta , L}F-V_f^*\psi T[\eta F]\right\| _{{\mathcal {H}}}^2 \Big ) \\&\quad \le 4\frac{{\left\| \eta \right\| _1}}{L} D^2 B \left\| F\right\| _2^2 +16\frac{{\left\| \psi \right\| _1}}{K}A^{-1}B C^2 \frac{{\left\| \eta \right\| _1}}{L}D^2\left\| F\right\| _2^2\\&\qquad +16\frac{{\left\| \psi \right\| _1}}{K}A^{-1}B C^2 \left\| T\right\| _2^2\left\| F\right\| _2^2. \end{aligned} \end{aligned}$$

The claim now follows from \(\left\| F\right\| _2^2=\left\| r\circ V_f[s]\right\| _2^2 \le E^2 B\left\| s\right\| _{{\mathcal {H}}}^2\). \(\square \)

4.3 Concentration of Error in Stochastic Phase Space Signal Processing

Propositions 12 and 17, and Theorem 20 estimate the average square error of the stochastic approximations. In this subsection, we formulate the approximation results as bounds on the error that hold in high probability. We show how to apply the classical concentration of measure estimates, Markov’s inequality, and Bernstein’s inequality, in our setting.

4.3.1 A Bernstein Inequality in Hilbert Spaces

In the following version of Bernstein’s inequality, we define expected values of weakly integrable random vectors v over the sample set G using the weak integral (Definition 3) as \({\mathbb {E}}^{\mathrm{w}}(v)=\int _G^\mathrm{w}v(g)d\mu (g)\). The following version of Bernstein’s inequality is a direct result of [8, Theorem 2.6], and is proved in Appendix A.

Theorem 22

(Hilbert space Bernstein’s inequality) Let \({\mathcal {H}}\) be a separable Hilbert space, and G a probability space. Let \(\{v_k\}_{k=1}^K:G^K\rightarrow {\mathcal {H}}^K\) be a finite sequence of independent random weakly integrable vectors. Suppose that for every \(k=1,\ldots ,K\), \({\mathbb {E}}^{\mathrm{w}}(v_k)=0\) and \(\left\| v_k\right\| _{{\mathcal {H}}}\le B\) a.s. and assume that \(\rho _K^2> \sum _{k=1}^K {\mathbb {E}}\left\| v_k\right\| _{{\mathcal {H}}}^2\) for some constant \(\rho _K\in {\mathbb {R}}\). Then, for every \(0\le t \le \rho _K^2/B\),

$$\begin{aligned} P\left( \left\| \sum _{k=1}^K v_k\right\| _{{\mathcal {H}}}\ge t\right) \le \exp \left( -\frac{t^2}{8\rho _K^2}+\frac{1}{4}\right) . \end{aligned}$$
(41)

We note that existing variants of Bernstein’s inequality in infinite dimensional Hilbert spaces are not adequate for us. For example, the operator Bernstein’s inequality of [38] is limited to trace class operators, and thus does not even include the identity.

4.3.2 Concentration of Error Results

Markov type concentration of error results can be derived from Propositions 12 and 17, and Theorem 20, by multiplying the error bound by \(\delta ^{-1}\), and replacing the expected value with an event that has probability at least \((1-\delta )\). For example, the Markov type concentration of error version of Proposition  17 reads

$$\begin{aligned} \left\| V_f^{*K}F-V_f^* [\psi F]\right\| _{{\mathcal {H}}}^2 \le \frac{\left\| \psi \right\| _1}{K} \frac{B}{A}C^2 \left\| F\right\| _2^2\delta ^{-1} \end{aligned}$$

with probability more than \((1-\delta )\).

The following proposition summarizes the Markov and Bernstein types concentration of error bounds in output stochastic signal processing.

Theorem 23

(Output signal processing concentration of error) Let f be a bounded continuous frame with frame bounds A and B, and with \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\), let \(\psi \) be an envelope, and \(K\in {\mathbb {N}}\). Let T be a bounded operator in \(L^2(G)\), and \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\) satisfy \(\left| r(x)\right| \le E \left| x\right| \) for every \(x\in {\mathbb {C}}\), where \(E>0\). Let \(s\in {\mathcal {H}}\) and \(0<\delta <1\). Then, with probability more than \(1-\delta \), we have

$$\begin{aligned} \left\| [{\mathcal {P}}s]^{\psi , K}-[{\mathcal {P}}s]^{\psi }\right\| _{{\mathcal {H}}} \le \frac{\sqrt{\left\| \psi \right\| _1}}{\sqrt{K}} A^{-1/2}BCE\left\| T\right\| _2\left\| s\right\| _{{\mathcal {H}}} \kappa (\delta ), \end{aligned}$$
(42)

where \(\kappa (\delta )\) can be chosen as one of the following two options.

  1.

    Markov type error bound: \(\kappa (\delta )=\delta ^{-\frac{1}{2}}\).

  2.

    Bernstein type error bound: \(\kappa (\delta )=2\sqrt{2}\sqrt{\ln \Big (\frac{1}{\delta }\Big ) +\frac{1}{4}}\) in case \(\left\| T\right\| _{\infty }<\infty \) and K satisfies

    $$\begin{aligned} K \ge \left\| \psi \right\| _1\Big (\frac{C}{B^{1/2}} \frac{\left\| T\right\| _{\infty }}{\left\| T\right\| _2} + \frac{B^{1/2}}{C\left\| \psi \right\| _1} \Big )^2 \kappa (\delta )^2. \end{aligned}$$
    (43)

Proof

We prove 2 and note that 1 is simpler and based on Markov’s inequality. Denote \(F=T r(V_f[s])\). Below, we use the following bounds

$$\begin{aligned}&\left\| F\right\| _{\infty } =\left\| T r(V_f[s])\right\| _{\infty } \le \left\| T\right\| _{\infty } EC\left\| s\right\| _{{\mathcal {H}}}, \end{aligned}$$
(44)
$$\begin{aligned}&\left\| F\right\| _2=\left\| T r(V_f[s])\right\| _{2} \le \left\| T\right\| _2 E B^{1/2}\left\| s\right\| _{{\mathcal {H}}}. \end{aligned}$$
(45)

and

$$\begin{aligned} B^{1/2}C \left\| F\right\| _{\infty }+ \frac{1}{\left\| \psi \right\| _1}B\left\| F\right\| _2\le J, \end{aligned}$$
(46)

where

$$\begin{aligned} J= B^{1/2}C^2E \left\| T\right\| _{\infty } \left\| s\right\| _{{\mathcal {H}}}+ \frac{1}{\left\| \psi \right\| _1}B^{1.5}E \left\| T\right\| _2\left\| s\right\| _{{\mathcal {H}}}. \end{aligned}$$

We use Theorem 22 as follows. Define the independent random vectors

$$\begin{aligned} v_k:G^K\rightarrow L^2(G), \quad v_k({\mathbf {g}})= \frac{1}{K} \big ([Q_f^1F](\cdot ;g^k) - Q_f (\psi F) \big ), \quad k=1,\ldots , K \end{aligned}$$

where the sample set is \(\{G^K\ ;\ \prod _{k=1}^K \frac{\psi (g^k)}{\left\| \psi \right\| _1}dg^k\}\). By Corollary 13, \({\mathbb {E}}^{\mathrm{w}}(v_k)={\mathbb {E}}(v_k)=0\), and \({\mathbb {E}}(\left\| v_k\right\| _2^2) \le \frac{\left\| \psi \right\| _1}{K^2}BC^2\left\| F\right\| ^2_2\). Therefore, by (45),

$$\begin{aligned} \sum _{k=1}^K {\mathbb {E}}(\left\| v_k\right\| _2^2) \le \frac{\left\| \psi \right\| _1}{K}BC^2\left\| F\right\| ^2_2 \le \frac{\left\| \psi \right\| _1}{K}B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}. \end{aligned}$$

Moreover, by Proposition 5, Example 7, and (46), for every \(g^k\in G\)

$$\begin{aligned} \begin{aligned} \left\| v_k\right\| _2&\le \frac{1}{K}\Big (\left\| \left\| \psi \right\| _1K_f(g^k,\cdot )F(g^k) \right\| _2 + \left\| Q_f \psi F\right\| _2\Big ) \\&\le \frac{\left\| \psi \right\| _1}{K}\Big ( B^{1/2}C \left\| F\right\| _{\infty }+ \frac{1}{\left\| \psi \right\| _1}B\left\| F\right\| _2\Big ) \le \frac{\left\| \psi \right\| _1}{K} J. \end{aligned} \end{aligned}$$

Hence, by Theorem 22, for every \(0 \le t\le B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}/J\)

$$\begin{aligned} P\Big ( \left\| Q_f^K F - Q_f(\psi F)\right\| _2 \ge t \Big ) \le \exp \left( -\frac{t^2}{8B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}}\frac{K}{\left\| \psi \right\| _1} + \frac{1}{4}\right) . \end{aligned}$$
(47)

Now, set

$$\begin{aligned} \delta =\exp \left( -\frac{t^2}{8B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}}\frac{K}{\left\| \psi \right\| _1} + \frac{1}{4}\right) , \end{aligned}$$

or equivalently

$$\begin{aligned} t=\sqrt{8}\sqrt{-\ln (\delta )+\frac{1}{4}}BCE\left\| T\right\| _2\left\| s\right\| _{{\mathcal {H}}}\frac{\sqrt{\left\| \psi \right\| _1}}{\sqrt{K}}, \end{aligned}$$

and demand \(0 \le t\le B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}/J\), namely,

$$\begin{aligned} \sqrt{8}\sqrt{-\ln (\delta )+\frac{1}{4}}BCE\left\| T\right\| _2\left\| s\right\| _{{\mathcal {H}}}\frac{\sqrt{\left\| \psi \right\| _1}}{\sqrt{K}} \le B^2C^2E^2 \left\| T\right\| ^2_2\left\| s\right\| ^2_{{\mathcal {H}}}/J. \end{aligned}$$

This gives, in probability at least \((1-\delta )\),

$$\begin{aligned} \left\| Q_f^K F - Q_f\psi F\right\| _{2} \le BCE\left\| T\right\| _2\left\| s\right\| _{{\mathcal {H}}}\frac{\sqrt{\left\| \psi \right\| _1}}{\sqrt{K}}\kappa (\delta ), \end{aligned}$$

whenever K satisfies (43). Last, using Proposition 16 and (33), we get

$$\begin{aligned} \begin{aligned} \left\| [{\mathcal {P}}s]^{\psi , K}-[{\mathcal {P}}s]^{\psi }\right\| _{{\mathcal {H}}}&= \left\| S_f^{-1}V_f^*Q_f^K F - S_f^{-1}V_f^*Q_f\psi F\right\| _{{\mathcal {H}}}\\&\le A^{-1/2}\left\| Q_f^K F - Q_f\psi F\right\| _{2}, \end{aligned} \end{aligned}$$

which, combined with the bound on \(\left\| Q_f^K F - Q_f\psi F\right\| _{2}\) above, completes the proof.

\(\square \)

5 Stochastic Phase Space Signal Processing of Discrete Signals

In previous sections we showed how to randomly discretize phase space. In Theorems 20 and 23, when the numbers of samples satisfy \(K,L=Z \max \{\left\| \psi \right\| _1,\left\| \eta \right\| _1\}\), for \(Z>0\), the approximation errors are of order \(O(Z^{-1/2})\). In this section, we additionally discretize the signal space \({\mathcal {H}}\) into a finite dimensional subspace \(V_M\subset {\mathcal {H}}\) of dimension/resolution \(M\in {\mathbb {N}}\). The main goal is to relate the choices of \(\left\| \psi \right\| _1\) and \(\left\| \eta \right\| _1\) to the resolution M. We introduce a class of frames, called linear volume discretizable (LVD) frames, for which there are envelopes \(\psi _M\) and \(\eta _M\) with \(\left\| \psi _M\right\| _1,\left\| \eta _M\right\| _1=O(M)\) that contain most of the energy of \(V_f[s_M]\) for every \(s_M\in V_M\). Thus, a stochastic signal processing method for LVD frames requires \(K,L=ZM\) samples, with \(Z>0\), for the approximation error to be \(O(\frac{1}{\sqrt{Z}})\).
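
For illustration, the following minimal Python sketch (a toy example with assumed parameters, not an implementation of the method of this paper) approximates the synthesis integral \(V_f^*V_f[s]\) by importance sampling for the tight continuous frame \(f_g=(\cos g,\sin g)\) over \(G=[0,2\pi )\), whose frame operator is \(\pi I\). The relative error is expected to decay roughly like \(K^{-1/2}\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy continuous frame over G = [0, 2*pi): f_g = (cos g, sin g).
# Its frame operator is pi * I, so the exact synthesis V_f^* V_f[s] equals pi * s.
s = np.array([1.0, -0.5])
exact = np.pi * s

def mc_synthesis(K):
    """Estimate V_f^* V_f[s] with K samples drawn from the envelope
    psi = 1 on [0, 2*pi), for which ||psi||_1 = 2*pi."""
    g = rng.uniform(0.0, 2.0 * np.pi, size=K)
    atoms = np.stack([np.cos(g), np.sin(g)], axis=1)   # K x 2, rows are f_{g^k}
    coeffs = atoms @ s                                 # V_f[s](g^k) = <s, f_{g^k}>
    return (2.0 * np.pi / K) * (coeffs[:, None] * atoms).sum(axis=0)

for K in [10**2, 10**3, 10**4, 10**5]:
    err = np.linalg.norm(mc_synthesis(K) - exact) / np.linalg.norm(exact)
    print(f"K = {K:6d}   relative error ~ {err:.3e}")  # decays roughly like K**(-0.5)
```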

5.1 Discrete Signals and Linear Volume Discretization of Continuous Frames

We treat discrete signals as embedded in the Hilbert space of signals \({\mathcal {H}}\). A discrete signal is an element of a finite dimensional subspace of \({\mathcal {H}}\). On the one hand, we can analyze discrete signals directly in \({\mathcal {H}}\). On the other hand, discrete signals are determined by a finite number of scalars, so they are well adapted to numerical analysis. In our analysis, we sometimes restrict ourselves to a class of signals \({\mathcal {R}}\subset {\mathcal {H}}\) which need not be a linear space. We typically consider \({\mathcal {R}}\) defined by imposing a restriction on signals in \({\mathcal {H}}\) which is natural for real life signals of some type.

Definition 24

Let \({\mathcal {H}}\) be a Hilbert space that we call the signal space. A class of signals \({\mathcal {R}}\subset {\mathcal {H}}\) is a (possibly non-linear) subset of \({\mathcal {H}}\). A sequence of discretizations of \({\mathcal {R}}\) is a sequence of (generally non-linear) subsets \(\{V_M\subset {\mathcal {H}}\}_{M=1}^{\infty }\) that satisfies the following condition: for every \(s\in {\mathcal {R}}\) there is a sequence \(\{s_M\in V_M\}_{M=1}^{\infty }\) such that

$$\begin{aligned} \lim _{M\rightarrow \infty }\left\| s_M-s\right\| _{{\mathcal {H}}}=0. \end{aligned}$$

The resolution \(\mathrm{dim}(V_M)\) of \(V_M\) is defined to be the dimension of \(\mathrm{span}V_M\).

The idea in discretizing a continuous frame is to find an envelope \(\psi _M\) for each discrete space \(V_M\) such that for any \(s_M\in V_M\), the approximation error of \(V_f[s_M]\) by \(\psi _M V_f[s_M]\) is controlled. The envelopes \(\psi _M\) are interpreted as covering domains \(G_M\subset G\) in which most of the energy of functions from \(V_f [V_M]\) resides.

Definition 25

Let \(f:G\rightarrow {\mathcal {H}}\) be a continuous frame. Let \({\mathcal {R}}\subset {\mathcal {H}}\) be a class of signals, and \(\{V_M\}_{M=1}^{\infty }\) a discretization of \({\mathcal {R}}\).

  1.

    The continuous frame f is called linear volume discretizable (LVD) with respect to the class \({\mathcal {R}}\) and the discretization \(\{V_M\}_{M=1}^{\infty }\), if for every error tolerance \(\epsilon >0\) there is a constant \(C_{\epsilon }>0\) and \(M_0\in {\mathbb {N}}\), such that for any \(M\ge M_0\) there is an envelope \(\psi _M\) with

    $$\begin{aligned} \left\| \psi _M\right\| _1 \le C_{\epsilon }\mathrm{dim}(V_M) \end{aligned}$$
    (48)

    such that for any \(s_M\in V_M\),

    $$\begin{aligned} \frac{\left\| V_f[s_M] - \psi _M V_f[s_M]\right\| _2}{\left\| V_f[s_M]\right\| _2} < \epsilon . \end{aligned}$$
    (49)
  2.

    For a linear volume discretizable continuous frame f with respect to \({\mathcal {R}}\) and \(\{V_M\}_{M=1}^{\infty }\), and a fixed tolerance \(\epsilon >0\) with a corresponding fixed \(C_{\epsilon }\) and envelope sequence \(\{\psi _M\}_{M=1}^{\infty }\) satisfying (48) and (49), we call f together with \({\mathcal {R}}\), \(\{V_M\}_{M=1}^{\infty }\), and \(\{\psi _M\}_{M=1}^{\infty }\), an \(\epsilon \)-linear volume discretization (\(\epsilon \)-LVD) of f.
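
In practice, conditions (48) and (49) can be checked numerically on a grid covering (a bounded portion of) phase space. The following short Python helpers are a minimal sketch of such a check, with assumed array-based inputs; they are not part of the constructions of this paper.

```python
import numpy as np

def lvd_ratio(Vf_sM, psi_M, cell_area):
    """Approximate the relative truncation error (49) on a grid covering G.
    Vf_sM and psi_M are arrays of the same shape: samples of V_f[s_M] and of
    the envelope; cell_area is the phase-space measure of one grid cell."""
    num = np.sqrt(np.sum(np.abs(Vf_sM - psi_M * Vf_sM) ** 2) * cell_area)
    den = np.sqrt(np.sum(np.abs(Vf_sM) ** 2) * cell_area)
    return num / den

def envelope_mass(psi_M, cell_area):
    """Approximate ||psi_M||_1, to be compared with C_eps * dim(V_M) in (48)."""
    return np.sum(np.abs(psi_M)) * cell_area
```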

5.2 Error in Discrete Stochastic Phase Space Signal Processing

Next, we study the error in discrete stochastic phase space signal processing of LVD frames. Since the energy of \(V_f[s_M]\) may be shifted after applying an operator T on \(V_f[s_M]\), we first introduce the following definition.

Definition 26

Let G be a phase space, T a bounded linear operator in \(L^2(G)\), \(\psi \) and \(\eta \) two envelopes, and \(\epsilon >0\). We say that T maps the energy of \(\eta \) to \(\psi \) up to \(\epsilon \), if

$$\begin{aligned} \left\| T\eta - \psi T \eta \right\| _{2\rightarrow 2} \le \epsilon . \end{aligned}$$
(50)

The next theorem summarizes the expected approximation error in stochastic signal processing with \(\epsilon \)-LVD frames.

Theorem 27

Let f be a bounded continuous frame with bound \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\). Let \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\) satisfy \(\left| r(x)\right| \le E\left| x\right| \), where \(E>0\). Suppose that f together with the signal class \({\mathcal {R}}\), the discretization \(\{V_M\}_{M=1}^{\infty }\), and the envelopes \(\{\eta _M\}_{M=1}^{\infty }\), is an \(\epsilon \)-LVD of f, with constant \(C_{\epsilon }\). Let \(\{\psi _M\}_{M=1}^{\infty }\) be a sequence of envelopes satisfying

$$\begin{aligned} \left\| \psi _M\right\| _1 \le C_{\epsilon }\mathrm{dim}(V_M). \end{aligned}$$
(51)

Let T be a bounded operator on \(L^2(G)\) that maps the energy of \(\eta _M\) to \(\psi _M\) up to \(\epsilon \). Then, the following two bounds are satisfied for every \(s_M\in V_M\).

  1.
    $$\begin{aligned} \begin{aligned}&\frac{{\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]_{f,T,r}^{\psi _M, K} - [{\mathcal {P}}s_M]_{f,T,r} \right\| _{{\mathcal {H}}}^2\Big )}{{\left\| s_M\right\| _{{\mathcal {H}}}^2} } \\&\quad \le 4\frac{C_{\epsilon }\mathrm{dim}(V_M)}{K}A^{-1}B^2C^2 E^2\left\| T\right\| _2^2 +4 B^2 E^2 (1+2\left\| T\right\| _2)^2 \epsilon ^2 \end{aligned} \end{aligned}$$
    (52)
  2.

    If T is a uniformly square integrable PSI operator with bound D, then

    $$\begin{aligned} \begin{aligned}&\frac{{\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]_{f,T,r}^{\psi _M, K;\eta _M,L} - [{\mathcal {P}}s_M]_{f,T,r} \right\| _{{\mathcal {H}}}^2\Big )}{{\left\| s_M\right\| _{{\mathcal {H}}}^2} } \\&\quad \le 16\frac{C_{\epsilon }\mathrm{dim}(V_M)}{L}D^2 B^2 E^2 +64\frac{C_{\epsilon }^{2}\mathrm{dim}(V_M)^2}{KL}A^{-1}B^2 C^2 D^2E^2 \\&\qquad +64\frac{C_{\epsilon }\mathrm{dim}(V_M)}{K}A^{-1}B^2 C^2 E^2\left\| T\right\| _2^2 + 4B^2 E^2 (1+\left\| T\right\| _2)^2\epsilon ^2. \end{aligned} \end{aligned}$$
    (53)

Proof

We first prove (53). By Lemma 21,

$$\begin{aligned}&{\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]^{\psi _M, K;\eta _M,L} - [{\mathcal {P}}s_M]_{f,T,r} \right\| _{{\mathcal {H}}}^2\Big ) \nonumber \\&\quad \le 4 {\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]^{\psi _M, K;\eta _M,L}-[{\mathcal {P}}s_M]^{\psi _M;\eta _M}\right\| _{{\mathcal {H}}}^2\Big ) + 4\left\| [{\mathcal {P}}s_M]^{\psi _M;\eta _M} - [{\mathcal {P}}s_M]\right\| _{{\mathcal {H}}}^2. \end{aligned}$$
(54)

Next, we bound the second term of (54).

$$\begin{aligned} \begin{aligned} \left\| [{\mathcal {P}}s_M]^{\psi _M;\eta _M} - [{\mathcal {P}}s_M]\right\| _{{\mathcal {H}}}&= \left\| V_f^*\psi _M T \Big (\eta _M r\circ V_f[s_M]\Big ) - V_f^* T \Big ( r\circ V_f[s_M]\Big )\right\| _{{\mathcal {H}}} \\&\le B^{1/2}\left\| \psi _M T \Big (\eta _M r\circ V_f[s_M]\Big ) - T \Big ( r\circ V_f[s_M]\Big )\right\| _{2} \\&\le B^{1/2}\left\| T \eta _M r\circ V_f[s_M] -\psi _M T \eta _M r\circ V_f[s_M]\right\| _2\\&\quad + B^{1/2}\left\| Tr\circ V_f[s_M] - T \eta _M r\circ V_f[s_M]\right\| _2 \\&\le B^{1/2}\left\| T \eta _M -\psi _M T \eta _M\right\| _2 \left\| r\circ V_f[s_M]\right\| _2 \\&\quad + B^{1/2}\left\| T\right\| _2\left\| r\circ V_f[s_M] - \eta _M r\circ V_f[s_M]\right\| _2\\&\le B\epsilon E\left\| s_M\right\| _{{\mathcal {H}}} + B^{1/2}\left\| T\right\| _2\left\| (1-\eta _M)r\circ V_f[s_M]\right\| _2 . \end{aligned} \end{aligned}$$
(55)

For the second term of the last line of (55), by the LVD property,

$$\begin{aligned} \left\| (1-\eta _M)r\circ V_f[s_M]\right\| _2 \le E \left\| \eta _M V_f[s_M]-V_f[s_M]\right\| _2 \le \epsilon E B^{1/2}\left\| s_M\right\| _{{\mathcal {H}}}. \end{aligned}$$
(56)

To conclude, (54) together with (55), (56), and Theorem 20, give (53).

Next, we prove (52). As before,

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]_{f,T,r}^{\psi _M, K} - [{\mathcal {P}}s_M]_{f,T,r} \right\| _{{\mathcal {H}}}^2\Big ) \\&\quad \le 4 {\mathbb {E}}\Big (\left\| [{\mathcal {P}}s_M]^{\psi _M, K}-[{\mathcal {P}}s_M]^{\psi _M}\right\| _{{\mathcal {H}}}^2\Big ) + 4\left\| [{\mathcal {P}}s_M]^{\psi _M} - [{\mathcal {P}}s_M]\right\| _{{\mathcal {H}}}^2. \end{aligned} \end{aligned}$$
(57)

We bound the second term of (57) using (50) and (56) by

$$\begin{aligned} \begin{aligned}&\left\| [{\mathcal {P}}s_M]^{\psi _M} - [{\mathcal {P}}s_M]\right\| _{{\mathcal {H}}} = \left\| V_f^*\psi _M T r\circ V_f[s_M] -V_f^* T r\circ V_f[s_M] \right\| _{{\mathcal {H}}} \\&\quad \le B^{1/2} \left\| T r\circ V_f[s_M] - T \eta _M r\circ V_f[s_M]\right\| _{2}\\&\qquad +B^{1/2} \left\| T \eta _M r\circ V_f[s_M]-\psi _M T \eta _M r\circ V_f[s_M] \right\| _{2}\\&\qquad +B^{1/2} \left\| \psi _M T \eta _M r\circ V_f[s_M]-\psi _M T r\circ V_f[s_M] \right\| _{2}\\&\quad \le B^{1/2} \left\| T\right\| _2\left\| r\circ V_f[s_M] - \eta _M r\circ V_f[s_M]\right\| _{2}\\&\qquad +B^{1/2} \left\| T \eta _M -\psi _M T \eta _M \right\| _{2}\left\| r\circ V_f[s_M]\right\| _{2} \\&\qquad +B^{1/2} \left\| T\right\| _2\left\| \eta _M r\circ V_f[s_M]- r\circ V_f[s_M] \right\| _{2}\\&\quad \le B^{1/2} \left\| T\right\| _2 \epsilon E B^{1/2}\left\| s_M\right\| _{{\mathcal {H}}} +B^{1/2} \epsilon E B^{1/2}\left\| s_M\right\| _{{\mathcal {H}}} +B^{1/2} \left\| T\right\| _2 \epsilon E B^{1/2}\left\| s_M\right\| _{{\mathcal {H}}} \\&\quad = BE(2\left\| T\right\| _2+1)\epsilon \left\| s_M\right\| _{{\mathcal {H}}}. \end{aligned} \end{aligned}$$
(58)

This, together with (57) and Theorem 20, leads to (52). \(\square \)

Next, we formulate concentration of error results for LVD frames. A Markov type concentration of error result can be derived directly from Theorem 27. For a Bernstein type error bound, we offer the following theorem only for the output stochastic signal processing pipeline (Definition 19.1).

Theorem 28

Consider the setting of Theorem 27.1, and suppose that \(\left\| T\right\| _{\infty }<\infty \). Let \(\delta >0\), \(\kappa (\delta )=2\sqrt{2}\sqrt{\ln \Big (\frac{1}{\delta }\Big ) +\frac{1}{4}}\), and K satisfy (43). Then, with probability at least \((1-\delta )\),

$$\begin{aligned} \begin{aligned}&\frac{\left\| [{\mathcal {P}}s_M]_{f,T,r}^{\psi _M, K} - [{\mathcal {P}}s_M]_{f,T,r} \right\| _{{\mathcal {H}}}}{\left\| s_M\right\| _{{\mathcal {H}}}} \\&\quad \le \frac{\sqrt{C_{\epsilon }\mathrm{dim}(V_M)}}{\sqrt{K}} A^{-1/2}BCE\left\| T\right\| _2 \kappa (\delta ) + BE(2\left\| T\right\| _2+1) \epsilon . \end{aligned} \end{aligned}$$
(59)

Proof

The proof follows from (58) and Theorem 23.2, similarly to the proof of Theorem 27.1. \(\square \)

5.3 Discrete Stochastic Time–Frequency Signal Processing

In this subsection we present a discretization under which the STFT is LVD. In the companion paper [33] we present a discretization under which the CWT is linear volume discretizable. We analyze time signals \(s:{\mathbb {R}}\rightarrow {\mathbb {C}}\) by decomposing them into sections supported on compact time intervals. Without loss of generality, we suppose that each signal segment is supported in \([-1/2,1/2]\). Focusing on one segment, we take the signal class \({\mathcal {R}}\) to be \(L^2[-1/2,1/2]\). Let \(V_M\) be the space of trigonometric polynomials of order M (namely, finite Fourier series expansions). In the frequency domain, signals \(q\in V_M\) are represented by

$$\begin{aligned} {\hat{q}}(z)=\sum _{n=-M}^M c_n \mathrm{sinc}(z-n) \end{aligned}$$

where \(c_n\) are the Fourier coefficients of q, and \(\mathrm{sinc}\) is the Fourier transform of the indicator function of \([-1/2,1/2]\). Consider a window function f supported in the time interval \([-S,S]\) that satisfies the following: there exist constants \(C',Y>0\) and \(\kappa >1/2\) such that for every \(\left| z\right| >Y\)

$$\begin{aligned} \left| {\hat{f}}(z)\right| \le C'\left| z\right| ^{-\kappa }. \end{aligned}$$
(60)

Let \(W>0\). For each \(M\in {\mathbb {N}}\), we consider the following phase space domain \(G_M\subset G\), where G is the STFT time-frequency plane,

$$\begin{aligned} G_M = \big \{(x,\omega )\ \big |\ -WM< \omega<WM ,\ \left| x\right| < 1/2+ S\big \}. \end{aligned}$$
(61)

The area of \(G_M\) in the time-frequency plane is

$$\begin{aligned} \mu (G_M) = 2WM(1+2S). \end{aligned}$$
(62)

Denote

$$\begin{aligned} \psi _M(g) = \left\{ \begin{array}{ccc} 1 &{} , &{} g\in G_M \\ 0 &{} , &{} g\notin G_M. \end{array}\right. \end{aligned}$$
(63)
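
The following minimal Python sketch (with an assumed rescaled Hann window supported in \([-S,S]\) and illustrative grid parameters) estimates the fraction of STFT energy of a random trigonometric polynomial of order M that lies inside \(G_M\) of (61), using the envelope (63). For moderate W the fraction is expected to be close to 1, in line with Theorem 29 below.

```python
import numpy as np

rng = np.random.default_rng(1)

# Parameters of the sketch (assumptions): polynomial order M, window half-support S,
# box width parameter W of (61), and a rescaled Hann window supported on [-S, S].
M, S, W = 32, 0.25, 4.0
dt = 1.0 / 2048.0
t = np.arange(-0.5 - S, 0.5 + S, dt)

# Random trigonometric polynomial q of order M, supported on [-1/2, 1/2].
c = rng.standard_normal(2 * M + 1) + 1j * rng.standard_normal(2 * M + 1)
n = np.arange(-M, M + 1)
q = (c[None, :] * np.exp(2j * np.pi * t[:, None] * n[None, :])).sum(axis=1)
q = q * (np.abs(t) <= 0.5)

def window(u):
    return np.where(np.abs(u) <= S, 0.5 * (1.0 + np.cos(np.pi * u / S)), 0.0)

freqs = np.fft.fftfreq(t.size, dt)   # frequency axis of the FFT
inside = np.abs(freqs) < W * M       # the band |omega| < W*M of G_M in (61)

energy_in, energy_total = 0.0, 0.0
for x in np.arange(-0.5 - S, 0.5 + S, 0.01):      # time shifts with |x| < 1/2 + S
    spec = np.abs(np.fft.fft(q * window(t - x)) * dt) ** 2   # |V_f[q](x, .)|^2
    energy_total += spec.sum()
    energy_in += spec[inside].sum()

print("fraction of STFT energy inside G_M:", energy_in / energy_total)
```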

Theorem 29

Under the above setting, the STFT is LVD with respect to the class \(L^2[-1/2,1/2]\) and the discretization \(\{V_M\}_{M\in {\mathbb {N}}}\), with the envelopes \(\psi _M\) defined by (63) for large enough W that depends only on \(\epsilon \) of Definition 25.

Proof

Let \(W>1\). A direct calculation of the STFT shows

$$\begin{aligned} \int _{{\mathbb {R}}}\left| V_f[q](\omega ,x)\right| ^2 dx = \int _{{\mathbb {R}}} \left| {\hat{q}}(z)\right| ^2\left| {\hat{f}}(z-\omega )\right| ^2 dz. \end{aligned}$$
(64)

We consider \(\omega >0\) and \(z>0\), and note that the other cases are similar. For each value of \(\omega >MW\), we decompose the integral (64) along z into the two integrals in \(z\in (0,(M+\omega )/2)\) and \(z\in ((M+\omega )/2,\infty )\). For \(z\in (0,(M+\omega )/2)\), since \(\omega \ge MW\) and \(z\le (M+\omega )/2\),

$$\begin{aligned}z-\omega \le (M-\omega )/2 \le -M(W-1)/2<0,\end{aligned}$$

so \(\left| z-\omega \right| ^{-2\kappa }\) obtains its maximum at \(z=(M+\omega )/2\). Thus, by (60),

$$\begin{aligned}&\int _{0}^{(M+\omega )/2} \left| {\hat{q}}(z)\right| ^2\left| {\hat{f}}(z-\omega )\right| ^2 dz \le \left\| q\right\| _2^2 \max _{0 \le z \le (M+\omega )/2}C^{\prime 2}\left| z-\omega \right| ^{-2\kappa }\nonumber \\&\quad =\left\| q\right\| _2^2 C^{\prime 2}\left| (M+\omega )/2-\omega \right| ^{-2\kappa }= \left\| q\right\| _2^2 C^{\prime 2}\left| (M-\omega )/2\right| ^{-2\kappa } \end{aligned}$$
(65)

Integrating the bound (65) for \(\omega \in (WM,\infty )\) gives

$$\begin{aligned} \int _{WM}^{\infty }\int _{0}^{(M+\omega )/2} \left| {\hat{q}}(z)\right| ^2\left| {\hat{f}}(z-\omega )\right| ^2 dz d\omega= & {} (W-1)^{-2\kappa +1}M^{-2\kappa +1}\left\| q\right\| _2^2 O(1) \nonumber \\= & {} o_W(1)o_M(1) \left\| q\right\| _2^2. \end{aligned}$$
(66)

For \(z\in ((M+\omega )/2,\infty )\), \({\hat{q}}\) decays like \(M^{1/2}(z-M)^{-1}\). Indeed, since \(z>M\)

$$\begin{aligned} \left| \sum _{n=-M}^M c_n \mathrm{sinc}(z-n)\right|\le & {} \left\| \{c_n\}\right\| _2 \sqrt{\sum _{n=-M}^M \frac{1}{(z-n)^2} } \le \left\| q\right\| _2 \sqrt{\sum _{n=-M}^M \frac{1}{(z-M)^2} }\nonumber \\\le & {} 2\left\| q\right\| _2\sqrt{M} (z-M)^{-1}. \end{aligned}$$
(67)

Now, by (64) and (67),

$$\begin{aligned} \begin{aligned} \int _{(M+\omega )/2}^{\infty } \left| {\hat{q}}(z)\right| ^2\left| {\hat{f}}(z-\omega )\right| ^2 dz&\le 4\left\| f\right\| _2^2\left\| q\right\| _2^2 \max _{(M+\omega )/2 \le z < \infty } M(z-M)^{-2}\\&=4\left\| f\right\| _2^2\left\| q\right\| _2^2 M \big ((\omega -M)/2\big )^{-2}. \end{aligned} \end{aligned}$$
(68)

Integrating the bound (68) for \(\omega \in (WM,\infty )\) gives

$$\begin{aligned} \int _{WM}^{\infty }\int _{(M+\omega )/2}^{\infty } \left| {\hat{q}}(z)\right| ^2\left| {\hat{f}}(z-\omega )\right| ^2 dz d\omega = (W-1)^{-1}\left\| q\right\| _2^2 O(1) . \end{aligned}$$
(69)

Last, the bounds (66) and (69) are combined to give \(\left\| (I-\psi _M)V_f[q]\right\| _2= o_W(1)\left\| q\right\| _2\), so by the frame inequality

$$\begin{aligned} \frac{\left\| (I-\psi _M)V_f[q]\right\| _2}{\left\| V_f[q]\right\| _2}= o_W(1). \end{aligned}$$

This means that given \(\epsilon >0\), we may choose W large enough to guarantee \(\frac{\left\| (I-\psi _M)V_f[q]\right\| _2}{\left\| V_f[q]\right\| _2} < \epsilon \), and also guarantee that for every \(M\in {\mathbb {N}}\), \(\left\| \psi _M\right\| _1\le C_{\epsilon } M\), with \(C_{\epsilon }= 2W(1+2S)\) by (62). \(\square \)

6 Applications of Stochastic Signal Processing of Continuous Signals

In this section, we introduce two applications of the theory developed in this paper: integration of continuous frames and stochastic phase space diffeomorphism.

6.1 Integration of Linear Volume Discretizable Frames

Here, we show how to integrate a set of LVD continuous frames into one continuous LVD frame, while retaining all stochastic approximation bounds of a single LVD frame. We first show how to integrate frames.

Proposition 30

Let G and U be two topological spaces with \(\sigma \)-finite Borel measures \(\mu _G\) and \(\mu _U\) respectively, with \(\mu _U(U)=1\). Let \(A,B,C>0\) and \({\mathcal {H}}\) be a Hilbert space. For each \(u\in U\), let \(f_{\cdot ,u}:g\mapsto f_{g,u}\) be a bounded continuous frame over the phase space G and the signal space \({\mathcal {H}}\), with frame constants A, B and bound \(\left\| f_{g,u}\right\| _{{\mathcal {H}}}\le C\). Suppose that the mapping \(f:(g,u)\mapsto f_{g,u}\) is continuous. Then f is a bounded continuous frame over the phase space \(G\times U\), with frame constants A, B and bound \(\left\| f_{g,u}\right\| _{{\mathcal {H}}}\le C\).

Proof

Consider the mapping \(V_f\) that maps \(s\in {\mathcal {H}}\) to the function

$$\begin{aligned}V_f[s]:(g,u)\mapsto \left\langle s,f_{g,u}\right\rangle .\end{aligned}$$

By continuity of \((g,u)\mapsto f_{g,u}\), \(V_f[s]\) is continuous for every \(s\in {\mathcal {H}}\). Indeed

$$\begin{aligned} \left| V_f[s](g,u) - V_f[s](g',u')\right| = \left| \left\langle s,f_{g,u}\right\rangle - \left\langle s,f_{g',u'}\right\rangle \right| \le \left\| s\right\| _{{\mathcal {H}}}\left\| f_{g,u} - f_{g',u'}\right\| _{{\mathcal {H}}}. \end{aligned}$$

Thus, for every \(s\in {\mathcal {H}}\), \(V_f[s]:G\times U \rightarrow {\mathbb {C}}\) is a measurable function. For each \(u\in U\), denote by \(V_{f_{\cdot ,u}}\) the analysis operator corresponding to the continuous frame \(f_{\cdot ,u}\). By the Fubini-Tonelli theorem, for every signal \(s\in {\mathcal {H}}\)

$$\begin{aligned} \left\| V_f[s]\right\| _2^2=\iint _{G\times U} \left| \left\langle s,f_{g,u}\right\rangle \right| ^2 d(g,u)=\int _U\int _G \left| \left\langle s,f_{g,u}\right\rangle \right| ^2 dgdu = \int _U \left\| V_{f_{\cdot ,u}}[s]\right\| _2^2 du. \end{aligned}$$

Therefore,

$$\begin{aligned} A\left\| s\right\| _{{\mathcal {H}}}^2 =\int _U A \left\| s\right\| _{{\mathcal {H}}}^2 du \le \left\| V_f[s]\right\| _2^2 \le \int _U B \left\| s\right\| _{{\mathcal {H}}}^2 du = B \left\| s\right\| _{{\mathcal {H}}}^2, \end{aligned}$$

and \(\{f_{g,u}\}_{(g,u)\in G\times U}\) is a continuous frame with frame bounds A, B. \(\square \)
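
A finite-dimensional analogue of Proposition 30 can be checked numerically: if every frame operator \(S_u\) has spectrum in \([A,B]\) and the weights on U sum to 1, then the averaged frame operator also has spectrum in \([A,B]\). The following Python sketch, with random toy frames, illustrates this; it is given only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_atoms, n_u = 4, 12, 5          # dim(H), atoms per frame, number of points in U

# One random finite frame per u, with frame operator S_u = sum_g f_{g,u} f_{g,u}^T.
frames = [rng.standard_normal((n_atoms, n)) for _ in range(n_u)]
S_list = [F.T @ F for F in frames]
lows = [np.linalg.eigvalsh(S).min() for S in S_list]
highs = [np.linalg.eigvalsh(S).max() for S in S_list]
A, B = min(lows), max(highs)        # common frame constants shared by all u

# Probability weights on U (mu_U(U) = 1) and the integrated frame operator.
w = rng.random(n_u)
w = w / w.sum()
S = sum(wi * Si for wi, Si in zip(w, S_list))
eig = np.linalg.eigvalsh(S)

print(f"common bounds: A = {A:.3f}, B = {B:.3f}")
print(f"integrated frame spectrum lies in [{eig.min():.3f}, {eig.max():.3f}]")
```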

Next, we show that the LVD property is retained under integration of frames.

Proposition 31

Consider the setting of Proposition 30. Let \({\mathcal {R}}\subset {\mathcal {H}}\) be a signal class, and \(V_M\) a discretization of \({\mathcal {R}}\). Suppose that for every \(u\in U\), \(f_{\cdot ,u}\) is an LVD frame. Let \(\epsilon >0\), and let \(\{\psi _M\in L^1(G)\}_{M=1}^{\infty }\) be a sequence of envelopes. Suppose that for every \(u\in U\), \(f_{\cdot ,u}\) is \(\epsilon \)-LVD with respect to the envelopes \(\{\psi _M\in L^1(G)\}\) and the constant \(C_{\epsilon }\). For each M, denote by \(\psi _M\in L^1(G\times U)\) the envelope \((g,u)\mapsto \psi _M(g)\). Then, f is an \(\epsilon \)-LVD frame with respect to the envelopes \(\{\psi _M\in L^1(G\times U)\}\) and the bound \(C_{\epsilon }\).

Proof

By the Fubini-Tonelli theorem,

$$\begin{aligned} \left\| V_{f}[s_M]-\psi _M V_{f}[s_M]\right\| _2^2 = \int _U \left\| V_{f_{\cdot ,u}}[s_M]-\psi _M V_{f_{\cdot ,u}}[s_M]\right\| _2^2 du \le \epsilon ^2 \int _U \left\| V_{f_{\cdot ,u}}[s_M]\right\| _2^2 du = \epsilon ^2 \left\| V_{f}[s_M]\right\| _2^2. \nonumber \\ \end{aligned}$$
(70)

Moreover, since \(\mu _U(U)=1\), \(\left\| \psi _M\right\| _{L^1(G\times U)}=\left\| \psi _M\right\| _{L^1(G)}\le C_{\epsilon }\mathrm{dim}(V_M)\), so (48) holds as well.

\(\square \)

Last, we show how to integrate operators in phase space, and show that mapping the energy between envelopes up to \(\epsilon \) (Definition 26) is preserved under integration.

Proposition 32

Consider the setting of Proposition 30. Let \({\mathcal {R}}\subset {\mathcal {H}}\) be a signal class, and \(V_M\) a discretization of \({\mathcal {R}}\). Suppose that for every \(u\in U\), \(f_{\cdot ,u}\) is an LVD frame. Let \(\epsilon >0\), and let \(\{\eta _M\in L^1(G)\}_{M=1}^{\infty }\) and \(\{\psi _M\in L^1(G)\}_{M=1}^{\infty }\) be two sequences of envelopes. Suppose that for every \(u\in U\), \(f_{\cdot ,u}\) is \(\epsilon \)-LVD with respect to the envelopes \(\{\eta _M\in L^1(G)\}\) and the constant \(C_{\epsilon }\). For each \(u\in U\), let \(T_u\) be a bounded operator in \(L^2(G)\) with \(\left\| T_u\right\| _{L^2(G)}\le C_T\). Suppose that for every \(M\in {\mathbb {N}}\) and a.e. \(u\in U\), \(T_u\) maps the energy of \(\eta _M\) to \(\psi _M\) up to \(\epsilon \). Let T be the operator in \(L^2(G\times U)\) defined for \(F\in L^2(G\times U)\) by

$$\begin{aligned}TF(g,u) = T_u F(\cdot ,u)(g).\end{aligned}$$

Then T is bounded with \(\left\| T\right\| _{L^2(G\times U)}\le C_T\), and maps the energy of \(\eta _M\in L^1(G\times U)\) to \(\psi _M\in L^1(G\times U)\) up to \(\epsilon \).

Under the assumptions of Proposition 32, f and T satisfy the conditions of Theorems 27 and 28. This means that the number of random samples in the stochastic method based on f, required for a given accuracy, is comparable to the number of samples required for each of the frames \(f_{\cdot ,u}\), \(u\in U\). Namely, the addition of the new feature direction U to the phase space G does not entail an increase in computational complexity, and the error in approximating the continuous method by the discrete Monte Carlo method is of order \(O(\frac{\sqrt{\mathrm{dim}(V_M)}}{\sqrt{K}})\).

The above procedure of integrating continuous frames can be carried out when the definition of a certain continuous frame depends on some free parameters u. For example, in the STFT and the CWT, the window function and mother wavelet are free parameters. Instead of fixing the window function, we may consider a parametric space of window functions, parameterized by u, sharing the same linear volume discretization, and add the parameter u as additional dimensions to phase space. For example, in the CWT we may choose as u the spread of the mother wavelet. Integration of frames is the basis on which we construct the LTFT in Definition 33 below.

6.2 Stochastic Diffeomorphism Operator and Highly Redundant Phase Vocoder

In this subsection, we study the signal processing pipeline when T is a diffeomorphism operator, and propose a potential application in audio signal processing.

6.2.1 Stochastic Signal Processing with Diffeomorphism Operators

Let \(f:G\rightarrow {\mathcal {H}}\) be a bounded continuous frame, with bound \(\left\| f_g\right\| _{{\mathcal {H}}}\le C\), and suppose that the phase space G is a Riemannian manifold. Let \({\mathcal {R}}\subset {\mathcal {H}}\) be a class of signals, \(\{V_M\}_M\) a discretization of \({\mathcal {R}}\), and \(\{\eta _M\}_M\) a sequence of envelopes. Let \(\epsilon >0\), and suppose that \(\big \{f,{\mathcal {R}},\{V_M\}_M,\{\eta _M\}_M\big \}\) is an \(\epsilon \)-LVD of f.

Let \(d:G\rightarrow G\) be a diffeomorphism (invertible smooth mapping with smooth inverse), with Jacobian \(J_d\in L^{\infty }(G)\). Consider the diffeomorphism operator T, defined for any \(F\in L^2(G)\) by

$$\begin{aligned}{}[TF](g) = F\big (d^{-1}(g)\big ). \end{aligned}$$
(71)

Note that \(\left\| T\right\| _2= \left\| J_{d}\right\| _{\infty }^{1/2}\). Let \(r:{\mathbb {C}}\rightarrow {\mathbb {C}}\) satisfy \(\left| r(x)\right| \le E \left| x\right| \). The signal processing pipeline based on the diffeomorphism T is defined to be \({\mathcal {P}}_{f,T,r}S_f^{-1}s\) for the synthesis pipeline, and \(S_f^{-1}{\mathcal {P}}_{f,T,r}s\) for the analysis pipeline. The stochastic approximations of these pipelines are given on \(s_M\in V_M\), up to the application of \(S_f^{-1}\) from the right or from the left, by

$$\begin{aligned}{}[{\mathcal {P}}s_M]^{\eta _M, K}= \frac{\left\| \eta _M\right\| _1}{K} \sum _{k=1}^K r\big (V_f [s_M](g^k)\big )f_{d(g^k)}. \end{aligned}$$
(72)

In (72), the points \(\{g^k\}_{k=1}^K\) are sampled from the envelope \(\eta _M\). This means that the points \(\{d(g^k)\}_{k=1}^K\) are sampled from the envelope \(\psi _M(g)=\eta _M\big (d^{-1}(g)\big )J_{d^{-1}}(g)\), with \(\left\| \psi _M\right\| _1=\left\| \eta _M\big (d^{-1}(\cdot )\big )J_{d^{-1}}(\cdot )\right\| _1 = \left\| \eta _M\right\| _1\). We can use either Theorem 27 or Theorem 28 to bound the stochastic approximation error, and in either case we obtain an error of order \(O(\frac{\sqrt{\mathrm{dim}(V_M)}}{\sqrt{K}})\).
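
The estimator (72) is straightforward to implement once the analysis coefficients \(V_f[s_M](g)\) can be evaluated at sampled points. The following Python sketch spells this out with assumed interfaces and a toy frame; it is an illustration only, not the implementation accompanying this paper.

```python
import numpy as np

def stochastic_diffeo_pipeline(Vf_sM, atom, d, r, sample_eta, eta_mass, K):
    """Minimal sketch of the estimator (72). The arguments are assumed interfaces:
    Vf_sM(g)     -> the coefficient V_f[s_M](g) at a phase-space point g
    atom(g)      -> the atom f_g as a vector in a discretization of H
    d(g)         -> the diffeomorphism applied to g
    r(z)         -> the pointwise nonlinearity, with |r(z)| <= E |z|
    sample_eta() -> one point g^k drawn from the density eta_M / ||eta_M||_1
    eta_mass     -> the value ||eta_M||_1"""
    total = sum(r(Vf_sM(g)) * atom(d(g)) for g in (sample_eta() for _ in range(K)))
    return (eta_mass / K) * total

# Toy usage (assumptions): H = R^2, f_g = (cos g, sin g) over G = [0, 2*pi),
# eta_M uniform, d a shift of phase space, and r the identity.
rng = np.random.default_rng(3)
s_M = np.array([1.0, -0.5])
approx = stochastic_diffeo_pipeline(
    Vf_sM=lambda g: np.cos(g) * s_M[0] + np.sin(g) * s_M[1],
    atom=lambda g: np.array([np.cos(g), np.sin(g)]),
    d=lambda g: g + 0.3,
    r=lambda z: z,
    sample_eta=lambda: rng.uniform(0.0, 2.0 * np.pi),
    eta_mass=2.0 * np.pi,
    K=20000,
)
print(approx)   # approximates the integral of r(V_f[s_M](g)) f_{d(g)} over G
```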

6.2.2 Integer Time Dilation Phase Vocoder

A time-stretching phase vocoder is an audio effect that slows down an audio signal without dilating its frequency content. In the classical definition, G is the time-frequency plane, and \(V_f\) is the STFT. The phase vocoder can be formulated as phase space signal processing in the case where the signal is dilated by an integer factor [52, Section 7.4.3]. For an integer \(\Delta \), we consider the diffeomorphism operator T with \(d(g_1,g_2)=(\Delta g_1,g_2)\), and consider the nonlinearity r, defined by

$$\begin{aligned} r(e^{i\theta }a)=e^{i\Delta \theta }a, \end{aligned}$$
(73)

for \(a\in {\mathbb {R}}_+\) and \(\theta \in {\mathbb {R}}\). The phase vocoder is defined to be \(s\mapsto V_f^* T r\circ V_f[s]\). Note that since the STFT is a Parseval frame, there is no difference between analysis and synthesis signal processing.
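
For concreteness, the following Python sketch implements a standard integer time-stretch STFT phase vocoder by windowed FFTs and overlap-add (a textbook construction in the spirit of [52, Section 7.4.3], given only for illustration; it is not the stochastic LTFT method developed below). The synthesis hop is \(\Delta \) times the analysis hop, realizing \(d(g_1,g_2)=(\Delta g_1,g_2)\), and each coefficient phase is multiplied by \(\Delta \), realizing (73); the window length and hop sizes are assumed parameters.

```python
import numpy as np

def integer_phase_vocoder(x, delta, n_fft=1024, hop=256):
    """Minimal sketch of an integer time-stretch phase vocoder: the synthesis
    hop is delta times the analysis hop (the dilation T), and each STFT
    coefficient's phase is multiplied by the integer delta (the nonlinearity r)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    out = np.zeros(n_fft + hop * delta * (n_frames - 1))
    norm = np.zeros_like(out)
    for m in range(n_frames):
        frame = x[m * hop : m * hop + n_fft] * win
        spec = np.fft.rfft(frame)
        mag, phase = np.abs(spec), np.angle(spec)
        spec_mod = mag * np.exp(1j * delta * phase)   # r of (73)
        y = np.fft.irfft(spec_mod, n_fft) * win
        pos = m * hop * delta                         # T: frame time -> delta * time
        out[pos : pos + n_fft] += y
        norm[pos : pos + n_fft] += win ** 2
    return out / np.maximum(norm, 1e-12)              # weighted overlap-add

# Example: stretch a 440 Hz tone by a factor of 2 without changing its pitch.
sr = 16000
t = np.arange(2 * sr) / sr
stretched = integer_phase_vocoder(np.sin(2 * np.pi * 440 * t), delta=2)
```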

Next, we replace the STFT frame in the phase vocoder with a highly redundant time-frequency representation, based on a 3D phase space.

6.2.3 The Localizing Time–Frequency Transform

Here, we construct an example of a redundant time-frequency transform based on a combination of CWT atoms and STFT atoms. The CWT is better than the STFT at isolating transient high frequency events, since middle to high frequency wavelet atoms have shorter time supports than STFT atoms. On the other hand, low frequency events are smeared by the CWT, since low frequency wavelet atoms have large supports. We thus use STFT atoms to represent low frequencies, and CWT atoms to represent middle frequencies. High frequencies are represented again by STFT atoms with narrow time supports. This is done to potentially avoid false positive detection of very short transient events by very short wavelet atoms.

We then add to this 2D time-frequency system a third axis that controls the number of oscillations in the CWT atoms. We motivate this as follows. Time-frequency atoms are subject to the uncertainty principle. The more accurately a time-frequency atom measures frequency, the less accurately it measures time. Different signal features call for a different balance between the time and the frequency measurement accuracy. In polyphonic audio signals we expect a range of such appropriate balances, which means that no choice of window is appropriate for all features. Hence, the addition of the number of oscillations axis may be useful for representing a variety of features in polyphonic audio signals.

Consider a non-negative real-valued window h(t) supported in \([-1/2,1/2]\). For example, the Hann window is defined to be \(h(t) = \big (1+\cos (2\pi t)\big )/2\) for \(t\in [-1/2,1/2]\), and zero for \(t\notin [-1/2,1/2]\). Consider a parameter \(\tau \) that controls the number of oscillations in the CWT atoms. We denote by \(0<\tau _1<\tau _2\) the minimal and maximal numbers of oscillations of the wavelet atoms. The LTFT phase space is defined to be \(G={\mathbb {R}}^2\times [\tau _1,\tau _2]\), where the measure \(\mu _3\) on \([\tau _1,\tau _2]\) is any weighted Lebesgue measure with \(\mu _3([\tau _1,\tau _2])=1\). There are two transition frequencies in the LTFT, where the atoms change from STFT to CWT atoms and back. In general, we allow these transition frequencies \(0<a_{\tau }<b_{\tau }<\infty \) to depend on \(\tau \).

Definition 33

(The localizing time-frequency continuous frame) Consider the above setting. The LTFT atoms are defined for \((x,\omega ,\tau )\in {\mathbb {R}}^2\times [\tau _1,\tau _2]\), where x represents time, \(\omega \) frequencies, and \(\tau \) the number of wavelet oscillations, by

$$\begin{aligned} f_{x,\omega ,\tau }(t) = \left\{ \begin{array}{ccc} \sqrt{\frac{a_{\tau }}{\tau }}h\big (\frac{a_{\tau }}{\tau }(t-x)\big )e^{2\pi i \omega (t-x)} &{} \mathrm{if} &{} \left| \omega \right|<a_{\tau } \\ \sqrt{\frac{\left| \omega \right| }{\tau }}h\big (\frac{\left| \omega \right| }{\tau }(t-x)\big )e^{2\pi i \omega (t-x)} &{} \mathrm{if} &{} a_{\tau }\le \left| \omega \right| \le b_{\tau } \\ \sqrt{\frac{b_{\tau }}{\tau }}h\big (\frac{b_{\tau }}{\tau }(t-x)\big )e^{2\pi i \omega (t-x)} &{} \mathrm{if} &{} b_{\tau }<\left| \omega \right| \end{array} \right. \end{aligned}$$
(74)
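
The following Python sketch evaluates a single LTFT atom according to (74). It is an illustration only; the transition frequencies \(a_{\tau }\) and \(b_{\tau }\) are passed as parameters, and \(\left| \omega \right| \) is used in the scaling so that negative frequencies are covered.

```python
import numpy as np

def hann(t):
    """Hann window supported on [-1/2, 1/2], as in Sect. 6.2.3."""
    return np.where(np.abs(t) <= 0.5, 0.5 * (1.0 + np.cos(2.0 * np.pi * t)), 0.0)

def ltft_atom(t, x, omega, tau, a_tau, b_tau):
    """Evaluate the LTFT atom f_{x,omega,tau} of (74) on a time grid t."""
    if abs(omega) < a_tau:          # STFT-like atoms at low frequencies
        scale = a_tau / tau
    elif abs(omega) <= b_tau:       # CWT-like atoms at middle frequencies
        scale = abs(omega) / tau
    else:                           # fixed narrow atoms at high frequencies
        scale = b_tau / tau
    return np.sqrt(scale) * hann(scale * (t - x)) * np.exp(2j * np.pi * omega * (t - x))

# Example: a middle-frequency (CWT-like) atom with 8 oscillations in its support.
t = np.linspace(-0.1, 0.1, 2001)
atom = ltft_atom(t, x=0.0, omega=200.0, tau=8.0, a_tau=50.0, b_tau=1000.0)
```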

In the companion paper [33] we prove that the LTFT is an LVD continuous frame. This is natural in view of Sect. 6.1, since, up to the low and high frequency truncation, the LTFT is based on integrating LVD wavelet transforms.

6.2.4 LTFT-Based Phase Vocoder

In this subsection we offer a toy example application of the LTFT, namely, an integer time-stretching phase vocoder based on the formulation of Sect. 6.2.2. The integer time dilation phase modification r of (73) is said to preserve the horizontal phase coherence, and deals well with slowly varying instantaneous frequencies [39]. When the instantaneous frequency of a component in a signal changes rapidly, the phase modification r is not sufficient, and methods for “locking” the phase to a frequency bin outside the horizontal line are used in modern implementations [31, 32]. Nevertheless, in this toy application we consider only horizontal phase locking. One potential motivation for using this simplistic model is that phasiness artifacts may be alleviated by using CWT atoms, since CWT atoms have shorter time supports than STFT atoms.

An example implementation of stochastic LTFT phase vocoder (72) is given in https://github.com/RonLevie/LTFT-Phase-Vocoder. In the companion paper [33], we prove that the total number of scalar operations in the phase vocoder LTFT method is \(O(ZM + M\log (M))\), when the number of Monte Carlo samples is \(K=ZM\). In future work, we will study more modern phase vocoder implementations based on the LTFT, akin to [16, 34, 41, 45].