1 Introduction

A signal in this paper is a function from the interval \(I=[0,1]\) to the interval \((-2^{-\frac{1}{2}}, 2^{-\frac{1}{2}})\). In quantum signal processing (QSP), one represents such a signal as the imaginary part of one entry of an ordered product of unitary matrices. The factors of this product alternate between matrices depending on the functional parameter \(x\in I\) and matrices depending on a sequence \(\Psi \) of scalar parameters \(\psi _n\), which are tuned so that the product represents a given signal. We are interested in the particular representation of this type proposed in [14] and extended to infinite absolutely summable sequences in [7]. Our main observation is that after some change of variables, the map sending the sequence \(\Psi \) to the signal is identified as the nonlinear Fourier series described in [23]. Indeed, this nonlinear Fourier series as well as variants including one with SU(1, 1) matrices in [24] have been studied for a long time in different contexts such as orthogonal polynomials [21], Krein systems [6], scattering transforms [2, 22] or AKNS systems [1].

In particular, transferring knowledge from nonlinear Fourier analysis, we extend the theory in [7] from absolutely summable to square summable \(\Psi \) using a nonlinear version of the Plancherel identity. We obtain a representation of measurable signals by square summable sequences \(\Psi \). This representation extremizes a certain inequality of Plancherel type.

To state our main result, Theorem 1, we make some formal definitions. Given \(\epsilon >0\), define the signal space \(\textbf{S}_{\epsilon }\) to be the set of real valued measurable functions f on [0, 1] that satisfy the bound

$$\begin{aligned} \sup _{x\in [0,1]}|f(x)|\le 2^{- \frac{1}{2}}-\epsilon . \end{aligned}$$
(1.1)

We equip \(\textbf{S}_{\epsilon }\) with the metric induced by the Hilbert space norm

$$\begin{aligned} \left\| f \right\| \equiv \left( \frac{2}{\pi } \int \limits _{0} ^1 \left| f(x) \right| ^2 \frac{dx}{\sqrt{1-x^2}} \right) ^{\frac{1}{2}} \,. \end{aligned}$$
(1.2)

Let \(\textbf{P}\) be the space of sequences \(\Psi =(\psi _k)_{k\in {\mathbb {N}}}\) of numbers \(\psi _k\in (-\frac{\pi }{2},\frac{\pi }{2})\). We equip \(\textbf{P}\) with the metric induced by the \(\ell ^{\infty }\)-norm

$$\begin{aligned} \Vert \Psi \Vert _\infty = \sup _{k\in {\mathbb {N}}}|\psi _k|. \end{aligned}$$

For \(x\in [0,1]\), define

$$\begin{aligned} W(x):= \begin{pmatrix} x &{}\quad i\sqrt{1-x^2} \\ i\sqrt{1-x^2} &{} \quad x \end{pmatrix},\quad Z = \begin{pmatrix} 1 &{}\quad 0 \\ 0 &{}\quad -1 \end{pmatrix}. \end{aligned}$$
(1.3)

For \(\Psi \in \textbf{P}\) and \(x\in [0,1]\), define recursively

$$\begin{aligned} U_0(\Psi ,x)=e^{i\psi _0 Z} \end{aligned}$$
(1.4)

and

$$\begin{aligned} U_d(\Psi ,x)= e^{i\psi _{d} Z}W(x) U_{d-1}(\Psi ,x) W(x)e^{i\psi _{d} Z}. \end{aligned}$$
(1.5)

Define \(u_d(\Psi ,x)\) to be the upper left entry of \(U_d(\Psi ,x)\).
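The recursion (1.4) and (1.5) can be evaluated directly. The following is a minimal numerical sketch, not taken from the original text: it assumes NumPy, a finite list `psis` holding \(\psi _0,\dots ,\psi _d\), and the helper names `phase`, `W`, `qsp_unitary` and `qsp_signal` are hypothetical.

```python
import numpy as np

def phase(psi):
    # e^{i psi Z} is diagonal because Z = diag(1, -1)
    return np.diag([np.exp(1j * psi), np.exp(-1j * psi)])

def W(x):
    # The matrix W(x) from (1.3), for x in [0, 1]
    s = np.sqrt(1.0 - x * x)
    return np.array([[x, 1j * s], [1j * s, x]], dtype=complex)

def qsp_unitary(psis, x):
    # U_0 = e^{i psi_0 Z}, then U_d = e^{i psi_d Z} W U_{d-1} W e^{i psi_d Z}
    U = phase(psis[0])
    for psi in psis[1:]:
        U = phase(psi) @ W(x) @ U @ W(x) @ phase(psi)
    return U

def qsp_signal(psis, x):
    # Imaginary part of the upper left entry u_d(Psi, x)
    return qsp_unitary(psis, x)[0, 0].imag
```

For instance, `qsp_signal([0.3, -0.2, 0.1], 0.4)` evaluates \(\Im (u_2(\Psi ,x))\) at \(x=0.4\) for the sample phases chosen here.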

Theorem 1

Let \(\epsilon >0\). For each \(f \in \textbf{S}_{\epsilon }\), there exists a unique sequence \(\Psi \in \textbf{P}\) such that

$$\begin{aligned} \sum _{k\in {\mathbb {Z}}}\log (1+\tan ^2\psi _{|k|}) = -\frac{2}{\pi }\int _{0}^1 \log |1-f(x)^2| \frac{dx}{\sqrt{1-x^2}} \end{aligned}$$
(1.6)

and \(\Im (u_d(\Psi ,x))\) converges with respect to the norm (1.2) to the function f as d tends to \(\infty \). For two functions \(f,\tilde{f} \in \textbf{S}_\epsilon \) with corresponding sequences \(\Psi , \tilde{\Psi }\) as above, we have the Lipschitz bound

$$\begin{aligned} \Vert \Psi -\tilde{\Psi }\Vert _\infty \le 7.3 \epsilon ^{-\frac{3}{2}}\Vert f-\tilde{f}\Vert . \end{aligned}$$
(1.7)
Fig. 1. Illustration of QSP

Figure 1 is a simplified cartoon of QSP, conflating for illustrative purpose the group SO(3) with its double-cover SU(2) and ignoring for simplicity the reflection symmetry in the product (1.5). For a given signal f, Theorem 1 provides tuning parameters \(\psi _j\) with which we can then evaluate f at \(x = \cos \theta \) as follows. We alternatingly rotate the horizontal blue vector by \(\theta \) about the vertical axis, an action generated by the Pauli matrix X defined in Sect. 4, and by the consecutive tuning parameters \(\psi _j\) about the horizontal brown axis, an action generated by the Pauli matrix Z. The resulting rotated blue vector has height f(x).

Our proof provides an algorithm to compute \(\psi _k\) via a Banach fixed point iteration that converges exponentially fast with rate depending on \(\epsilon \). The iteration step requires the application of a Cauchy projection, which in practice may be computed using a fast Fourier transform.

The weight \((1-x^2)^{-\frac{1}{2}}\) in (1.6) has a singularity at one but not at zero. This asymmetry arises because our theory works naturally with f extended to an even function on \([-1,1]\).

After developing the relevant parts of nonlinear Fourier analysis, we prove Theorem 1 in Sect. 8. A relaxation of the threshold (1.1) will be discussed in a forthcoming paper.

The literature on both QSP and nonlinear Fourier analysis (NLFA) is extensive, and we do not try to give a complete overview here. Our first reference for the QSP algorithm discussed here is [14], which sought an optimal algorithm for Hamiltonian simulation. Various interesting properties of QSP are discussed in [5, 9]. An SU(1, 1) variant of QSP is introduced in [18]. For the task of computing the potential \((\psi _n)\) for a given target function f, several algorithms have been proposed including the so-called factorization method [3, 11, 12, 26], an optimization algorithm [25], fixed point iteration [7] and Newton’s method [8]. The factorization method in the context of nonlinear Fourier series is called the layer-stripping formula and is discussed below. The papers [7] and [25] develop the \(\ell ^1\) and \(\ell ^2\) theories for QSP with many interesting theoretical results. Many of these results are implicit in our discussion of NLFA in the present paper.

Discrete NLFA for the SU(1, 1) model was studied in [24] with particular emphasis on transferring analytic estimates for the linear Fourier transform to the nonlinear setting. For the SU(2) model, a similar discussion appears in [23]. Some important contributions to the quest for analogs of classical linear inequalities were made by [4, 13, 16, 17, 20], namely providing maximal and variational Hausdorff-Young inequalities and discussing Carleson-type theorems for the SU(1, 1) model of the nonlinear Fourier transform and some variants. For a discussion of some recent results and open questions see [19].

The interest of the third author and subsequently the other authors in quantum signal processing was initiated during an inspiring talk by L. Lin at a delightful conference at ICERM on Modern Applied and Computational Analysis. In particular, we dedicate this result to R. Coifman, who anticipated at the conference that QSP is some sort of nonlinear Fourier analysis. The third author acknowledges an invitation to the Santaló Lecture 2022, where he gave an introduction to nonlinear Fourier analysis. The authors acknowledge support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC-2047/1 – 390685813 as well as CRC 1060. We also thank Jiasu Wang for pointing out typos in the first arXiv posting of this article.

2 The nonlinear Fourier transform

We are mainly interested in nonlinear Fourier series. However, we start with an excursion to the nonlinear Fourier transform on the real line, which is a multiplicative and non-commutative version of the linear Fourier transform.

Recall the linear Fourier transform

$$\begin{aligned} \widehat{f}(\xi ):=\int _{\mathbb {R}}f(x) e^{-2\pi i x\xi } dx. \end{aligned}$$

This integral is understood to be a Lebesgue integral if f is in \(L^1({\mathbb {R}})\). If \(\widehat{f}\) is also in \(L^1({\mathbb {R}})\), then both f and \(\widehat{f}\) can be seen to be in \(L^2({\mathbb {R}})\) and one has the Plancherel identity

$$\begin{aligned} \Vert \widehat{f}\Vert _2=\Vert {f}\Vert _2. \end{aligned}$$

The Plancherel identity holds for f in a dense subset of \(L^2({\mathbb {R}})\), and one can use it to extend the Fourier transform to a unitary map from \(L^2({\mathbb {R}})\) to itself. This definition in \(L^2({\mathbb {R}})\) coincides with the integral definition when f is in \(L^2({\mathbb {R}})\cap L^1({\mathbb {R}})\).

The integral in the definition of the Fourier transform is an additive process over continuous time x. This process can alternatively be expressed by a differential evolution equation for the partial Fourier integrals

$$\begin{aligned} S(\xi ,x)=\int _{-\infty }^x f(t) e^{-2\pi i t\xi }\, dt, \end{aligned}$$

namely

$$\begin{aligned} \partial _x S(\xi ,x)= f(x) e^{-2\pi i x\xi } \end{aligned}$$

with the initial condition

$$\begin{aligned} S(\xi ,-\infty )=0 \end{aligned}$$

and the final state

$$\begin{aligned} S(\xi ,\infty )=\widehat{f}(\xi ). \end{aligned}$$

If \(f\in L^1({\mathbb {R}})\), the required analytic facts such as solvability of the differential equation and limits as x tends to \(\pm \infty \) can be elaborated with standard methods.

Exponentiation turns this additive process into a multiplicative process. Define

$$\begin{aligned} G(\xi ,x)=e^{S(\xi ,x)}. \end{aligned}$$

Then G satisfies the differential equation

$$\begin{aligned} \partial _x G(\xi ,x)= G(\xi ,x) f(x) e^{-2\pi i x\xi } \end{aligned}$$
(2.1)

with the initial condition

$$\begin{aligned} G(\xi ,-\infty )=1 \end{aligned}$$

and the final state

$$\begin{aligned} G(\xi ,\infty )=e^{\widehat{f}(\xi )}. \end{aligned}$$

In the above scalar valued setting, the multiplicative perspective is of an artificial nature. However, the multiplicative process allows for matrix valued generalizations, which lead to substantially different nonlinear Fourier transforms. For these generalizations, the complex factor \(f(x) e^{-2\pi i x\xi }\) in \({\mathbb {C}}\) in (2.1) needs to be replaced by a matrix factor. The most basic choices of such matrix factors come from real linear embeddings of \({\mathbb {C}}\) into three dimensional Lie algebras, in particular the ones associated with the Lie groups SU(1, 1) and SU(2).

The most common SU(1, 1) model of the nonlinear Fourier transform is described by the differential equation

$$\begin{aligned} \partial _x G(\xi ,x)= G(\xi ,x)\left( \begin{array}{cc} 0 &{}\quad f(x) e^{-2\pi i x\xi } \\ \overline{f(x) e^{-2\pi i x\xi }} &{}\quad 0 \end{array}\right) \end{aligned}$$
(2.2)

with the initial condition

$$\begin{aligned} G(\xi ,-\infty )=\left( \begin{array}{cc} 1 &{}\quad 0 \\ 0 &{}\quad 1 \end{array}\right) \end{aligned}$$

and the final state defined to be the SU(1, 1) nonlinear Fourier transform of f,

$$\begin{aligned} G(\xi ,\infty )=\left( \begin{array}{cc} a(\xi ) &{} b(\xi ) \\ \overline{b(\xi )} &{} \overline{a(\xi )} \end{array}\right) . \end{aligned}$$
(2.3)

As the matrix factor in (2.2) is in the Lie algebra of SU(1, 1), the solution to the differential equation stays in SU(1, 1). This explains the particular structure of the matrix in (2.3), and we also have

$$\begin{aligned} |a(\xi )|^2-|b(\xi )|^2=1. \end{aligned}$$

Analogous to the linear situation, solvability of the differential equation with limits as x tends to \(\pm \infty \) is elementary for \(f\in L^1({\mathbb {R}})\). By Picard iteration, a solution can be written as the limit of recursively defined approximations \(G_{k}\) with

$$\begin{aligned} G_{0}(\xi ,x)=\left( \begin{array}{cc} 1 &{}\quad 0 \\ 0 &{}\quad 1 \end{array}\right) \end{aligned}$$

and for \(k>0\)

$$\begin{aligned} G_{k}(\xi ,x)=\left( \begin{array}{cc} 1 &{} \quad 0 \\ 0 &{}\quad 1 \end{array}\right) +\int _{-\infty }^x G_{k-1}(\xi ,t_k)\left( \begin{array}{cc} 0 &{}\quad f( t_k) e^{-2\pi i t_k\xi } \\ \overline{f(t_k) e^{-2\pi i t_k\xi }} &{}\quad 0 \end{array}\right) \, dt_k. \end{aligned}$$

In particular, \(G_{k}-G_{k-1}\) is k-linear in f. If k is even, the k-linear term is diagonal with upper left entry

$$\begin{aligned} \int _{-\infty<t_1<t_2<\dots<t_k<\infty } \prod _{j=1}^{k/2} \overline{f(t_{2j})}{f(t_{2j-1})}e^{2\pi i \xi (t_{2j}-t_{2j-1})}\, dt_{2j-1}dt_{2j} \end{aligned}$$
(2.4)

and lower right entry the complex conjugate of (2.4). If k is odd, then the k-linear term is anti-diagonal with upper right entry

$$\begin{aligned} \int _{-\infty<t_1<t_2<\dots<t_k<\infty } f(t_{k})e^{-2\pi i \xi t_k} \prod _{j=1}^{(k-1)/2} \overline{f(t_{2j})} f(t_{2j-1})e^{2\pi i \xi (t_{2j}-t_{2j-1})}\, dt_{2j-1}dt_{2j}\nonumber \\ \end{aligned}$$
(2.5)

and lower left entry the complex conjugate of (2.5). Note that (2.4) and (2.5) are the terms in the multilinear expansions of a and b, whose lowest order terms are the constant function 1 and the linear Fourier transform of f, respectively.

The entries (2.4) and (2.5) are bounded in absolute value by the integrals

$$\begin{aligned} \int _{-\infty<t_1<t_2<\dots<t_k<\infty } \prod _{j=1}^{k} |f(t_{j})| dt_j =\frac{1}{k!} \Vert f\Vert _1^k. \end{aligned}$$
(2.6)

Hence the nonlinear Fourier transform is a real analytic map from \(L^1({\mathbb {R}})\) to the space \(L^\infty ({\mathbb {R}}, {\mathbb {C}}^{2\times 2})\). Moreover, the inverse linear Fourier transform of (2.4) can be written with the Dirac \(\delta \) and the functional variable x as

$$\begin{aligned} \int _{-\infty<t_1<t_2<\dots<t_k<\infty } \delta \left( x+\sum _{j=1}^{k/2} (t_{2j}-t_{2j-1})\right) \prod _{j=1}^{k/2} \overline{f(t_{2j})}f(t_{2j-1}) \, dt_{2j-1}dt_{2j},\nonumber \\ \end{aligned}$$
(2.7)

and similarly for (2.5). The function (2.7) is again in \(L^1({\mathbb {R}})\) with norm bounded as in (2.6). Hence the nonlinear Fourier transform is a real analytic map from \(L^1({\mathbb {R}})\) to \(A({\mathbb {R}}, {\mathbb {C}}^{2\times 2})\), the matrix valued functions with entries in the Wiener space \(A({\mathbb {R}})\), which is the linear Fourier transform of \({L^1}({\mathbb {R}})\).

With more work, one can also show that the nonlinear Fourier transform extends to an analytic map from \(L^p({\mathbb {R}})\) into a suitable space [4, 17] for \(1<p<2\). At \(p=2\), the SU(1, 1) nonlinear Fourier transform can be defined by a similar density argument as in the linear case using the nonlinear analogue of the Plancherel identity

$$\begin{aligned} \Vert f\Vert _2^2=2 \int \log {|a(\xi )|}\, d\xi =\int _{\mathbb {R}}\log (1+|b(\xi )|^2) \, d\xi , \end{aligned}$$
(2.8)

which we will elaborate on below after (2.14). However, unlike the linear setting, one obtains neither an injective map on \(L^2({\mathbb {R}})\), nor a real analytic map on \(L^2({\mathbb {R}})\) in any suitable sense. See [24] in the discrete setting and [15] for references on these respective phenomena.

The SU(2) model of the nonlinear Fourier transform is described by the solution to the differential equation

$$\begin{aligned} \partial _x G(\xi ,x)= G(\xi ,x)\left( \begin{array}{cc} 0 &{}\quad f(x) e^{-2\pi i x\xi } \\ -\overline{f(x) e^{-2\pi i x\xi }} &{}\quad 0 \end{array}\right) \end{aligned}$$
(2.9)

with the initial condition

$$\begin{aligned} G(\xi ,-\infty )=\left( \begin{array}{cc} 1 &{}\quad 0 \\ 0 &{}\quad 1 \end{array}\right) \end{aligned}$$

and whose final state is the SU(2) nonlinear Fourier transform of f

$$\begin{aligned} G(\xi ,\infty )=\left( \begin{array}{cc} a(\xi ) &{} b(\xi ) \\ -\overline{b(\xi )} &{} \overline{a(\xi )} \end{array}\right) . \end{aligned}$$
(2.10)

Here, the matrix in (2.10) is in SU(2) for each \(\xi \in {\mathbb {R}}\), and in particular

$$\begin{aligned} |a(\xi )|^2+|b(\xi )|^2=1. \end{aligned}$$
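To make the SU(2) model concrete, here is a rough numerical sketch that is not part of the original text. It assumes NumPy, approximates f by a piecewise constant function on a hypothetical truncated window `[lo, hi]`, and multiplies the exact exponentials of the matrix factor in (2.9) on each small step (midpoint rule). The function name `su2_nlft` and all parameters are illustrative assumptions.

```python
import numpy as np

def su2_nlft(f, xi, lo=-6.0, hi=6.0, steps=2000):
    """Approximate (a(xi), b(xi)) for the SU(2) model (2.9).

    Assumption: f is negligible outside [lo, hi] and is frozen at the
    midpoint of each step, so each step contributes the exact exponential
    of h * [[0, w], [-conj(w), 0]] with w = f(x) e^{-2 pi i x xi}.
    """
    h = (hi - lo) / steps
    G = np.eye(2, dtype=complex)
    for j in range(steps):
        x = lo + (j + 0.5) * h
        w = f(x) * np.exp(-2j * np.pi * x * xi)
        r = abs(w)
        c = np.cos(h * r)
        s = h * np.sinc(h * r / np.pi)  # equals sin(h r) / r, also at r = 0
        step = np.array([[c, s * w], [-s * np.conj(w), c]])
        G = G @ step  # d/dx G = G A: new factors multiply on the right
    return G[0, 0], G[0, 1]

# Small Gaussian sample signal; for small f, b should be close to the
# linear Fourier transform of f.
eps = 0.01
a, b = su2_nlft(lambda x: eps * np.exp(-np.pi * x**2), xi=0.3)
```

Each step matrix is in SU(2), so the computed pair satisfies \(|a|^2+|b|^2=1\) up to rounding, and for the small sample signal above b stays close to \(\widehat{f}(\xi )\), in line with the first order approximation mentioned in the text.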

The \(L^p\) theory for \(p<2\) in so far as discussed above is largely analogous to the case of SU(1, 1) but with suitable changes of signs in the multi-linear terms. The analogue of Plancherel however is the weaker information

$$\begin{aligned} \Vert f\Vert _2^2= \lim _{\xi \rightarrow i\infty } 2\pi i\xi \log {(a(\xi ))}, \end{aligned}$$
(2.11)

where \(\xi \) tends to \(\infty \) along the imaginary axis in the upper half plane, or more generally through any ray from the origin strictly in the upper half plane. This can be shown by doing an asymptotic expansion

$$\begin{aligned} 2\pi i\xi \log (a(\xi ))= c +O(|\xi |^{-1}) \end{aligned}$$

along such a ray as in [16] and observing that it is only the bilinear term in the multilinear expansion of a that contributes to c. The bilinear term of a, now the negative of the bilinear term of the SU(1, 1) case, is equal to

$$\begin{aligned} -\int _{-\infty<t_1<t_2 < \infty } \overline{f(t_{2})}{f(t_{1})}e^{2\pi i \xi (t_{2}-t_{1})}\, dt_{1}dt_{2} =-\int _{s>0} \int _t \overline{f(t+s)}{f(t)}e^{2\pi i \xi s}\, dtds. \nonumber \\ \end{aligned}$$
(2.12)

Multiplying by \(2\pi i \xi \) and using that

$$\begin{aligned} -2\pi i\xi e^{2\pi i\xi s}1_{\{s>0\}} \end{aligned}$$
(2.13)

is an approximating unit converging to the Dirac delta as \(\xi \) tends to infinity along a ray in the upper half plane, we obtain

$$\begin{aligned} \lim _{\xi \rightarrow i\infty } -2\pi i\xi \int _{s>0} \int _t \overline{f(t+s)}{f(t)}e^{ 2\pi i\xi s}\, dtds =\int \overline{f(t)} f(t)\, dt. \end{aligned}$$
(2.14)

This shows (2.11).
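As a quick consistency check of the normalization in (2.13), consider the special case \(\xi =i\eta \) with \(\eta >0\), that is, the imaginary axis; this restriction is an assumption made here only for illustration. Then

$$\begin{aligned} \int _{0}^{\infty } -2\pi i\xi e^{2\pi i\xi s}\, ds =\int _{0}^{\infty } 2\pi \eta e^{-2\pi \eta s}\, ds =1, \end{aligned}$$

so (2.13) has total mass one for every \(\eta \), and its mass concentrates near \(s=0\) as \(\eta \) tends to \(\infty \).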

In the SU(1, 1) setting, where \(\log (a)\) has an analytic extension to the upper half plane, one can use a contour integral over a large semicircle in the upper half plane to express the analogue of the limit (2.11) by an integral as in (2.8). Here, in the SU(2) case, \(\log (a)\) is in general not analytic in the upper half plane due to zeros of a and one cannot as easily express the limit by an integral. Instead, one resorts to tools such as factorization into inner and outer functions [10].

3 Nonlinear Fourier series

Passing from functions on \({\mathbb {R}}\) to sequences F on \({\mathbb {Z}}\), the Fourier transform, which we now call Fourier series, no longer lives on \({\mathbb {R}}\) but on the unit circle \(\mathbb {T}:=\{z\in {\mathbb {C}}: |z|=1\}\). We slightly misuse the notion of Fourier series here; usually, this notion is reserved for the inverse of the map that we call Fourier series.

There are nonlinear Fourier series with values in SU(1, 1), discussed in [24], and nonlinear Fourier series with values in SU(2), discussed in [23]. We focus here on the SU(2) model, which is relevant to the QSP model in Theorem 1.

The linear Fourier series of a sequence \(F=(F_n)_{n\in {\mathbb {Z}}}\) with finite support is defined as

$$\begin{aligned} \widehat{F}(z)=\sum _{n\in {\mathbb {Z}}} F_nz^n. \end{aligned}$$

The analogy with the Fourier transform becomes apparent when writing \(z=e^{-2\pi i \xi }\) for some \(\xi \in {\mathbb {R}}\). Indeed, if we define a measure f on the real line as

$$\begin{aligned} f(x)=\sum _{n\in {\mathbb {Z}}} F_n\delta (x-n), \end{aligned}$$

then

$$\begin{aligned} \widehat{f}(\xi )= & {} \int _{\mathbb {R}}\sum _{n\in {\mathbb {Z}}} F_n\delta (x-n)e^{-2\pi i \xi x}\, dx \\ {}= & {} \int _{\mathbb {R}}\sum _{n\in {\mathbb {Z}}} F_n\delta (x-n)e^{-2\pi i \xi n}\, dx= \sum _{n\in {\mathbb {Z}}} F_ne^{-2\pi i \xi n}=\widehat{F}(z). \end{aligned}$$

The nonlinear analog becomes an ordered product of matrices described below. We will be interested in meromorphic extensions beyond the circle \({\mathbb {T}}\), hence we consider the Riemann sphere \({\mathbb {C}}\cup \{\infty \}\) where \(\infty \) is the reciprocal of 0. For a subset \(\Omega \) of the Riemann sphere we define the reflected set

$$\begin{aligned} \Omega ^*=\{\overline{z^{-1}}:z\in \Omega \}. \end{aligned}$$
(3.1)

For a function a on \(\Omega \) we define \(a^*\) on \(\Omega ^*\) by

$$\begin{aligned} a^*(z)=\overline{a(\overline{z^{-1}})}. \end{aligned}$$
(3.2)

We note that \((\Omega ^*)^*=\Omega \) and \((a^*)^*=a\). If \(z\in \mathbb {T}\), then \(a^*(z)=\overline{a(z)}\). Define the open unit disc

$$\begin{aligned} \mathbb {D} \equiv \left\{ z \in {\mathbb {C}}~:~ \left| z \right| < 1 \right\} . \end{aligned}$$

The function a is analytic on \({\mathbb {D}}^*\) precisely if \(a^*\) is analytic on \({\mathbb {D}}\). We have

$$\begin{aligned} a(\infty )=\overline{a^*(0)}. \end{aligned}$$
(3.3)

If a is analytic on \({\mathbb {D}}^*\) and continuous up to the boundary \({\mathbb {T}}\) of \({\mathbb {D}}^*\), then we have the mean value theorem

$$\begin{aligned} a(\infty )=\overline{a^*(0)}=\overline{\int _{\mathbb {T}}a^*}=\int _{\mathbb {T}}a, \end{aligned}$$
(3.4)

where we denote by

$$\begin{aligned} \int _{\mathbb {T}}a = \int _0^{1} a(e^{2\pi i \theta })\, d\theta \end{aligned}$$

the mean value of a on \({\mathbb {T}}\), i.e., the constant term in the Fourier expansion of a.

For a sequence \(F:{\mathbb {Z}}\rightarrow {\mathbb {C}}\) with finite support, define the meromorphic matrix valued function G on the Riemann sphere by the recursive equation

$$\begin{aligned} G_k(z)= G_{k-1}(z)\frac{1}{\sqrt{1+|F_k|^2}}\left( \begin{array}{cc} 1 &{}\quad F_k z^{k} \\ -\overline{F_k} z^{-k} &{}\quad 1 \end{array}\right) \end{aligned}$$
(3.5)

with the initial condition

$$\begin{aligned} \lim _{k\rightarrow -\infty } G_k(z)= \left( \begin{array}{cc} 1 &{}\quad 0 \\ 0 &{} \quad 1 \end{array}\right) , \end{aligned}$$

and define the SU(2) nonlinear Fourier series

$$\begin{aligned} G(z)=\lim _{k \rightarrow \infty } G_k(z)= \left( \begin{array}{cc} a(z) &{}\quad b(z) \\ -{b^*(z)} &{}\quad {a^*(z)} \end{array}\right) . \end{aligned}$$
(3.6)

Existence of the limit as \(k\rightarrow \pm \infty \) is trivial thanks to the finite support of F, which makes the sequence \(G_k(z)\) eventually constant in k. The matrix factors in (3.5) are in SU(2) on \({\mathbb {T}}\) and hence so is their product. In particular,

$$\begin{aligned} a(z)a^*(z)+b(z)b^*(z)=1 \end{aligned}$$

on \({\mathbb {T}}\) and as well on the Riemann sphere by analytic continuation.
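The ordered product (3.5) can be evaluated numerically at a point of \({\mathbb {T}}\). The sketch below is an illustration rather than part of the original text: it assumes NumPy and represents a finitely supported sequence as a Python dictionary mapping n to \(F_n\); the name `nlfs_matrix` is hypothetical.

```python
import numpy as np

def nlfs_matrix(F, z):
    """Evaluate G(z) from (3.5) and (3.6) for a finitely supported F.

    F is a dict {n: F_n}; z is a nonzero complex number. Factors are
    multiplied on the right in increasing order of the index k.
    """
    G = np.eye(2, dtype=complex)
    for k in sorted(F):
        Fk = complex(F[k])
        factor = np.array([[1, Fk * z**k],
                           [-np.conj(Fk) * z**(-k), 1]])
        G = G @ (factor / np.sqrt(1 + abs(Fk) ** 2))
    return G

# Sample sequence and a point on the unit circle
G = nlfs_matrix({-1: 0.5 + 0.2j, 0: -0.3j, 2: 1.5}, np.exp(0.7j))
a, b = G[0, 0], G[0, 1]
```

On \({\mathbb {T}}\), the returned matrix should have the structure in (3.6) with \(a^*=\overline{a}\) and \(b^*=\overline{b}\), so that \(|a|^2+|b|^2=1\) up to rounding.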

Under the analogous formal transformation as above, the nonlinear Fourier series becomes the SU(2) nonlinear Fourier transform for the measure

$$\begin{aligned} f(x)=\sum _{n\in {\mathbb {Z}}} f_n\delta (x-n), \end{aligned}$$

where \(f_n=\arctan (|F_n|)F_n |F_n|^{-1}\). This value of \(f_n\) arises from the model computation

$$\begin{aligned}{} & {} \exp \left( \begin{array}{cc} 0 &{} f_0 \\ -\overline{f_0} &{} 0\end{array}\right) = \left( \begin{array}{cc} \cos |f_0| &{} {f_0}{|f_0|^{-1}}\sin |f_0| \\ -{\overline{f_0}}{|f_0|^{-1}}\sin |f_0| &{} \cos |f_0| \end{array}\right) \\ {}{} & {} \quad =\cos |f_0| \left( \begin{array}{cc} 1 &{} {f_0}{|f_0|^{-1}}\tan |f_0| \\ -{\overline{f_0}}{|f_0|^{-1}}\tan |f_0| &{} 1 \end{array}\right) =\frac{1}{\sqrt{1+|F_0|^2}}\left( \begin{array}{cc} 1 &{} F_0 \\ -\overline{F_0} &{} 1 \end{array}\right) . \end{aligned}$$

We write the SU(2) nonlinear Fourier series of the sequence F on \({\mathbb {Z}}\) as

$$\begin{aligned} {\overbrace{F}}:=(a,b) \end{aligned}$$

with a and b as defined in (3.6). We identify the row vector (a, b) with the matrix function as in (3.6). In particular, we write the product

$$\begin{aligned} (a,b)(c,d)=(ac-bd^*, ad+bc^*). \end{aligned}$$

We describe some properties of the nonlinear SU(2) Fourier series, following [23] and the analogous arguments in [24]. The first theorem describes some basic transformation properties analogous to transformation properties of the linear Fourier series. To better understand the analogy, recall from the analogous discussion of the nonlinear Fourier transform that the first order approximations of a and b are one and the linear Fourier series, respectively.

Theorem 2

Let F, H be complex valued finitely supported sequences on \({\mathbb {Z}}\) and let

$$\begin{aligned} \overbrace{F}(z)=(a,b). \end{aligned}$$

If all entries of F except possibly the zeroth entry vanish, then

$$\begin{aligned} (a(z),b(z))=(1+|F_0|^2)^{-\frac{1}{2}}(1, F_0). \end{aligned}$$
(3.7)

If \(H_n=F_{n-1}\), then

$$\begin{aligned} \overbrace{H}(z)=(a(z),zb(z)). \end{aligned}$$
(3.8)

If the support of F is entirely to the left of the support of H, then

$$\begin{aligned} \overbrace{F+H}=\overbrace{F} \overbrace{H}. \end{aligned}$$
(3.9)

If \(|c|=1\), then

$$\begin{aligned} \overbrace{cF}=(a,cb). \end{aligned}$$
(3.10)

If \(H_n=F_{-n}\), then

$$\begin{aligned} \overbrace{H}(z)=(a^*(z^{-1}), b(z^{-1})). \end{aligned}$$
(3.11)

If \(H_n=\overline{F_n}\), then

$$\begin{aligned} \overbrace{H}(z)=(a^*(z^{-1}), b^*(z^{-1})). \end{aligned}$$
(3.12)
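Several of the properties in Theorem 2 can be spot-checked numerically at a point of \({\mathbb {T}}\), where \(a^*=\overline{a}\) and \(b^*=\overline{b}\). The following sketch is not from the original text; it assumes NumPy, stores sequences as dictionaries, and uses the hypothetical helper names `nlfs` and `pair_product` together with arbitrarily chosen sample data.

```python
import numpy as np

def nlfs(F, z):
    # Row (a, b) of the ordered product (3.5), updated factor by factor
    a, b = 1.0 + 0j, 0j
    for k in sorted(F):
        Fk = complex(F[k])
        c = 1 / np.sqrt(1 + abs(Fk) ** 2)
        a, b = c * (a - b * np.conj(Fk) * z ** (-k)), c * (a * Fk * z ** k + b)
    return a, b

def pair_product(p, q):
    # (a, b)(c, d) = (ac - b d^*, ad + b c^*), with ^* = conjugation on T
    (a, b), (c, d) = p, q
    return a * c - b * np.conj(d), a * d + b * np.conj(c)

z = np.exp(0.9j)                      # sample point on the unit circle
F = {-1: 0.5 + 0.2j, 0: -0.3j}        # support to the left of that of H
H = {1: 0.7, 3: -0.1 + 0.4j}
c0 = np.exp(1.1j)                     # unimodular constant for (3.10)
a, b = nlfs(F, z)
a_inv, b_inv = nlfs(F, 1 / z)

checks = {
    "(3.8) shift": np.allclose(
        nlfs({n + 1: v for n, v in F.items()}, z), (a, z * b)),
    "(3.9) product": np.allclose(
        nlfs({**F, **H}, z), pair_product(nlfs(F, z), nlfs(H, z))),
    "(3.10) modulation": np.allclose(
        nlfs({n: c0 * v for n, v in F.items()}, z), (a, c0 * b)),
    "(3.11) reflection": np.allclose(
        nlfs({-n: v for n, v in F.items()}, z), (np.conj(a_inv), b_inv)),
}
```

Each entry of `checks` compares both sides of one identity at the single sample point; this is a sanity check under the stated assumptions, not a proof.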

Note that the properties in Theorem 2 are sufficient to uniquely determine the map from F to \(\overbrace{F}\). The next theorem describes the range of this map on the space of sequences with finite support. Let l(M, N) be the space of all complex valued sequences F on \({\mathbb {Z}}\) which are supported on the interval \(M\le k\le N\) in the strict sense that \(F_M\ne 0\) and \(F_N\ne 0\).

Theorem 3

Let \(M\le N\). The SU(2) nonlinear Fourier series maps l(M, N) bijectively to the space of pairs (a, b) such that b is the linear Fourier series of a sequence in l(M, N) and a is the linear Fourier series of a sequence in \(l(M-N,0)\) with \(0<a(\infty )\) and

$$\begin{aligned} aa^*+bb^*=1. \end{aligned}$$
(3.13)

Moreover, we have the identity

$$\begin{aligned} a(\infty )=\prod _{n\in {\mathbb {Z}}} (1+|F_n|^2)^{-1/2}. \end{aligned}$$
(3.14)

Note that (3.13) implies that a and b have no common zeros in the Riemann sphere. Moreover, |a| and |b| are bounded by 1 on \({\mathbb {T}}\) and \(a(\infty )\le 1\) with equality only if \(b=0\) and \(F=0\).

Note that if a does not have zeros in \(\mathbb {D}^*\), then \(\log (a)\) is analytic in \(\mathbb {D}^*\) and the real part of (3.4) gives

$$\begin{aligned} \log |a(\infty )| = \int _{\mathbb {T}}\log |a|. \end{aligned}$$
(3.15)

Multiplying by \(-2\) and using (3.14) and (3.13), we obtain

$$\begin{aligned} \sum _{n\in {\mathbb {Z}}} \log (1+|F_n|^2) = -\int _{\mathbb {T}}\log (1- |b|^2) \end{aligned}$$
(3.16)

in analogy to (2.8). If a has zeros in \(\mathbb {D}^*\), then we have only the inequality

$$\begin{aligned} \sum _{n\in {\mathbb {Z}}} \log (1+|F_n|^2) \ge -\int _{\mathbb {T}}\log (1- |b|^2), \end{aligned}$$
(3.17)

which can be obtained by applying the mean value theorem (3.4) to the logarithm of the quotient of \(a^*\) divided by the Blaschke product of its zeros [10].

As \(\log (1+x)\) is comparable to x for small x, under suitable pointwise smallness assumptions on F and b and absence of zeros of a in \({\mathbb {D}}^*\) we obtain from (3.16) that \(\Vert F\Vert _{l^2({\mathbb {Z}})}\) and \(\Vert b\Vert _{L^2({\mathbb {T}})}\) are comparable, in analogy to the linear situation.
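The identities (3.14), (3.16) and the inequality (3.17) can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the original text: it uses NumPy, approximates the mean over \({\mathbb {T}}\) by an average over `n` equispaced points, and the names `nlfs_on_circle` and `plancherel_sides` are hypothetical.

```python
import numpy as np

def nlfs_on_circle(F, zs):
    # Vectorized row update (a, b) <- (a, b) * factor_k from (3.5)
    a = np.ones_like(zs)
    b = np.zeros_like(zs)
    for k in sorted(F):
        Fk = complex(F[k])
        c = 1 / np.sqrt(1 + abs(Fk) ** 2)
        a, b = c * (a - b * np.conj(Fk) * zs ** (-k)), c * (a * Fk * zs ** k + b)
    return a, b

def plancherel_sides(F, n=512):
    """Return (sum log(1+|F_n|^2), -mean log(1-|b|^2), mean of a) on T."""
    zs = np.exp(2j * np.pi * np.arange(n) / n)
    a, b = nlfs_on_circle(F, zs)
    lhs = sum(np.log1p(abs(v) ** 2) for v in F.values())
    rhs = -np.mean(np.log(1 - np.abs(b) ** 2))
    return lhs, rhs, np.mean(a)   # mean of a approximates a(infinity), cf. (3.4)

# Small sample sequence: a stays close to a positive constant on D^*,
# so no zeros are expected there and (3.16) should hold with equality.
small = {-1: 0.1, 0: 0.2j, 2: -0.15}
lhs, rhs, a_inf = plancherel_sides(small)

# Larger sample sequence: here a(z) = (1 - 4/z)/5 vanishes at z = 4 in D^*,
# so only the inequality (3.17) is expected.
lhs2, rhs2, _ = plancherel_sides({0: 2.0, 1: 2.0})
```

In the first case the two sides agree up to quadrature and rounding error; in the second the left side is strictly larger, reflecting the zero of a in \({\mathbb {D}}^*\).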

4 Quantum signal processing for finite sequences

In this section, we relate at the level of finite sequences the nonlinear Fourier series to QSP.

Let \(\Psi \) be in \(\textbf{P}\) as in Theorem 1. Let \(F_n\) for \(n\in {\mathbb {Z}}\) be defined by

$$\begin{aligned} F_n=i\tan (\psi _{|n|}) \end{aligned}$$
(4.1)

and note that \((F_n)\) is even and purely imaginary, that is, for all \(n\in {\mathbb {Z}}\),

$$\begin{aligned} F_{-n}=F_n=-\overline{F_n}. \end{aligned}$$

For \(d\ge 0\), let \(G_{d}\) be the nonlinear Fourier series of the truncated sequence

$$\begin{aligned} \left( F_n 1_{ \{-d\le n\le d \}} \right) . \end{aligned}$$

We may write \(G_d(z)\) for \(z\in {\mathbb {T}}\), using the symmetries of \((F_n)\), recursively as

$$\begin{aligned} G_{0}(z)= & {} \frac{1}{\sqrt{1-F_0^2}} \begin{pmatrix} 1 &{}\quad F_0 \\ F_0 &{}\quad 1 \end{pmatrix}, \end{aligned}$$
(4.2)
$$\begin{aligned} G_{d}(z)= & {} \frac{1}{{1-F_d^2}} \begin{pmatrix} 1 &{}\quad F_dz^{-d} \\ F_d z^{d} &{}\quad 1 \end{pmatrix} G_{d-1}(z) \begin{pmatrix} 1 &{}\quad F_dz^d \\ F_d z^{-d} &{}\quad 1 \end{pmatrix}. \end{aligned}$$
(4.3)

Define X and M and recall Z as follows:

$$\begin{aligned} X = \begin{pmatrix} 0 &{}\quad 1 \\ 1 &{}\quad 0 \end{pmatrix}, \quad M = 2^{- \frac{1}{2}}\begin{pmatrix} 1 &{}\quad 1 \\ 1 &{}\quad -1 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 &{}\quad 0 \\ 0 &{}\quad -1 \end{pmatrix}. \end{aligned}$$
(4.4)

Observe that \(M^2\) is the identity matrix, that

$$\begin{aligned} XM = 2^{- \frac{1}{2}} \begin{pmatrix} 1 &{}\quad -1 \\ 1 &{}\quad 1 \end{pmatrix} =MZ, \end{aligned}$$
(4.5)

and hence also \(MZM=X\) and \(MXM=Z\).

Lemma 1

For \(x\in [0,1]\) let \(\theta \) be the unique number in \( [0,\frac{\pi }{2}]\) so that \(\cos \theta =x\) and set \(z=e^{2 i \theta }\). We have for every \(d\ge 0\) and \(U_d\) as in Theorem 1,

$$\begin{aligned} M U_d(\Psi ,x) M= \begin{pmatrix} e^{id\theta } &{}\quad 0\\ 0 &{} \quad e^{-id\theta } \end{pmatrix} G_{d}(z) \begin{pmatrix} e^{id\theta } &{}\quad 0\\ 0 &{} \quad e^{-id\theta } \end{pmatrix}. \end{aligned}$$

Note that the factor two in the exponent of the definition of z differs from the convention in [7].

Proof

We prove the Lemma by induction on d. For \(k\in {\mathbb {N}}\), we have

$$\begin{aligned} Me^{i\psi _k Z}M= & {} e^{i\psi _k MZM} \end{aligned}$$
(4.6)
$$\begin{aligned}= & {} e^{i\psi _k X}=\begin{pmatrix} \cos (\psi _k) &{} i\sin (\psi _k)\\ i \sin (\psi _k) &{} \cos (\psi _k) \end{pmatrix} ={\cos (\psi _k)}\begin{pmatrix} 1 &{} i\tan (\psi _k)\\ i \tan (\psi _k) &{}1 \end{pmatrix}\nonumber \\ \end{aligned}$$
(4.7)
$$\begin{aligned}= & {} \frac{1}{\sqrt{1+\tan (\psi _k)^2}}\begin{pmatrix} 1 &{} i\tan (\psi _k)\\ i \tan (\psi _k) &{}1 \end{pmatrix} =\frac{1}{\sqrt{1-F_k^2}}\begin{pmatrix} 1 &{} F_k\\ F_k &{}1 \end{pmatrix}.\nonumber \\ \end{aligned}$$
(4.8)

Applying this with \(k=0\) and using (4.2) and (1.4) in the form

$$\begin{aligned} Me^{i\psi _0 Z}M=M U_0(\Psi ,x) M \end{aligned}$$
(4.9)

verifies the base case \(d=0\) of the induction.

Now let \(d\ge 1\) and assume the induction hypothesis is true for \(d-1\). Noting that similarly as in (4.7),

$$\begin{aligned} W(x) = e^{i \arccos (x)X}, \end{aligned}$$

we have

$$\begin{aligned} MW(x)M =e^{i\arccos (x)MXM}=e^{i\theta Z}= \begin{pmatrix} e^{i\theta } &{}\quad 0\\ 0 &{} \quad e^{-i\theta } \end{pmatrix}. \end{aligned}$$
(4.10)

Hence, with (4.6),

$$\begin{aligned} \sqrt{1-F_d^2}MW(x)e^{i\psi _d Z} =\begin{pmatrix} e^{i\theta } &{}\quad 0\\ 0 &{} \quad e^{-i\theta } \end{pmatrix}\begin{pmatrix} 1 &{}\quad F_d\\ F_d &{}\quad 1 \end{pmatrix}M \end{aligned}$$
(4.11)

and

$$\begin{aligned} \sqrt{1-F_d^2} e^{i\psi _d Z}W(x)M = M\begin{pmatrix} 1 &{} \quad F_d\\ F_d &{}\quad 1 \end{pmatrix}\begin{pmatrix} e^{i\theta } &{}\quad 0\\ 0 &{}\quad e^{-i\theta } \end{pmatrix}. \end{aligned}$$
(4.12)

We obtain with the recursive definition (1.5) and induction hypothesis,

$$\begin{aligned}{} & {} (1-F_d^2)M U_d(\Psi ,x) M\end{aligned}$$
(4.13)
$$\begin{aligned}{} & {} \quad =(1-F_d^2)M e^{i\psi _{d}Z} W(x)M(M U_{d-1}(\Psi ,x)M) MW(x)e^{i\psi _d Z} M \nonumber \\{} & {} \quad =\begin{pmatrix} 1 &{} F_d\\ F_d &{}1 \end{pmatrix}\begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix}G_{d-1}(z) \begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix} \begin{pmatrix} 1 &{} F_d\\ F_d &{}1 \end{pmatrix} \nonumber \\{} & {} \quad =\begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix} \begin{pmatrix} 1 &{} F_dz^{-d}\\ F_d z^{d} &{}1 \end{pmatrix} G_{d-1}(z) \begin{pmatrix} 1 &{} F_d z^d\\ F_d z^{-d} &{}1 \end{pmatrix} \begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix} \nonumber \\{} & {} \quad = (1-F_d^2) \begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix} G_{d}(z) \begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix}. \end{aligned}$$
(4.14)

This proves the induction step for d and completes the proof of Lemma 1. \(\square \)
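The identity of Lemma 1 can be spot-checked numerically by comparing both sides for sample data. The sketch below is an illustration only and is not part of the original text: it assumes NumPy, a finite list `psis` holding \(\psi _0,\dots ,\psi _d\), and the hypothetical helper names `qsp_unitary`, `nlfs_matrix` and `lemma1_sides`.

```python
import numpy as np

M = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def qsp_unitary(psis, x):
    # U_d from (1.4) and (1.5)
    s = np.sqrt(1.0 - x * x)
    Wx = np.array([[x, 1j * s], [1j * s, x]], dtype=complex)
    U = np.diag([np.exp(1j * psis[0]), np.exp(-1j * psis[0])])
    for psi in psis[1:]:
        P = np.diag([np.exp(1j * psi), np.exp(-1j * psi)])
        U = P @ Wx @ U @ Wx @ P
    return U

def nlfs_matrix(F, z):
    # Ordered product (3.5) for a dict {n: F_n}
    G = np.eye(2, dtype=complex)
    for k in sorted(F):
        Fk = complex(F[k])
        factor = np.array([[1, Fk * z**k], [-np.conj(Fk) * z**(-k), 1]])
        G = G @ (factor / np.sqrt(1 + abs(Fk) ** 2))
    return G

def lemma1_sides(psis, x):
    d = len(psis) - 1
    theta = np.arccos(x)
    z = np.exp(2j * theta)
    # F_n = i tan(psi_|n|) for |n| <= d, as in (4.1)
    F = {n: 1j * np.tan(psis[abs(n)]) for n in range(-d, d + 1)}
    D = np.diag([np.exp(1j * d * theta), np.exp(-1j * d * theta)])
    return M @ qsp_unitary(psis, x) @ M, D @ nlfs_matrix(F, z) @ D

lhs, rhs = lemma1_sides([0.3, -0.2, 0.1, 0.25], x=0.4)
```

For the sample phases chosen here, `lhs` and `rhs` agree up to rounding error.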

Lemma 2

Let \(d\ge 0\) and set

$$\begin{aligned} G_d(z)=:\begin{pmatrix} a(z)&{}\quad b(z)\\ -b^*(z) &{}\quad a^*(z) \end{pmatrix}. \end{aligned}$$

For \(x\in [0,1]\), let \(\theta \) be the unique number in \([0,\frac{\pi }{2}]\) such that \(\cos \theta =x\) and set \(z=e^{2 i \theta }\). We have for \(d\ge 1\) and \(u_d\) as in Theorem 1,

$$\begin{aligned} i\Im (u_{d}(\Psi ,x))= b(z). \end{aligned}$$

Proof

We use Lemma 1 to obtain

$$\begin{aligned} U_d(\Psi ,x)= & {} M\begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix} G_{d}(z) \begin{pmatrix} e^{id\theta } &{} 0\\ 0 &{} e^{-id\theta } \end{pmatrix}M \\= & {} M\begin{pmatrix} a(z)z^d &{} b(z)\\ -b^*(z) &{} a^*(z)z^{-d} \end{pmatrix} M. \end{aligned}$$

We then compute the upper left corner

$$\begin{aligned} u_{d}(\Psi ,x)=\frac{1}{2} \begin{pmatrix} 1&1 \end{pmatrix} \begin{pmatrix} a(z)z^d &{} b(z)\\ -b^*(z) &{} a^*(z)z^{-d} \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} =\frac{1}{2}(a(z)z^d+a^*(z)z^{-d} +b(z)-b^*(z)).\nonumber \\ \end{aligned}$$
(4.15)

As z is in \({\mathbb {T}}\), the last display becomes

$$\begin{aligned} \Re (a(z)z^d)+ i \Im (b(z)). \end{aligned}$$

As \((F_n)\) is purely imaginary and even, the symmetries of the nonlinear Fourier series imply that b is also purely imaginary. In particular, (4.15) gives Lemma 2. \(\square \)
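The identity of Lemma 2 can be checked numerically. The following Python sketch is only a sanity check under stated assumptions: it takes \(W(x)=e^{i\arccos (x)X}\) as suggested by (4.10), the recursion \(U_d=e^{i\psi _dZ}W(x)U_{d-1}W(x)e^{i\psi _dZ}\) with \(U_0=e^{i\psi _0Z}\), and \(F_n=i\tan (\psi _n)\), which is the normalization that (4.11) forces when M is the Hadamard matrix. The function names `U`, `G`, `phase` and `matmul` are ours.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def phase(psi):
    """e^{i psi Z} = diag(e^{i psi}, e^{-i psi})."""
    return ((cmath.exp(1j * psi), 0j), (0j, cmath.exp(-1j * psi)))

def U(psis, x):
    """U_d(Psi, x) via U_d = e^{i psi_d Z} W(x) U_{d-1} W(x) e^{i psi_d Z} (assumed recursion)."""
    s = math.sqrt(1 - x * x)
    W = ((x, 1j * s), (1j * s, x))  # W(x) = e^{i arccos(x) X}
    out = phase(psis[0])
    for psi in psis[1:]:
        P = phase(psi)
        out = matmul(matmul(matmul(matmul(P, W), out), W), P)
    return out

def G(psis, z):
    """G_d(z) built as in the induction step (4.14), assuming F_n = i tan(psi_n)."""
    F0 = 1j * math.tan(psis[0])
    c0 = 1 / math.sqrt(1 + abs(F0) ** 2)
    out = ((c0, c0 * F0), (c0 * F0, c0))
    for d, psi in enumerate(psis[1:], start=1):
        F = 1j * math.tan(psi)
        left = ((1, F * z ** -d), (F * z ** d, 1))
        right = ((1, F * z ** d), (F * z ** -d, 1))
        out = matmul(matmul(left, out), right)
        # 1 - F^2 = 1 + |F|^2 because F is purely imaginary
        out = tuple(tuple(e / (1 - F * F) for e in row) for row in out)
    return out

if __name__ == "__main__":
    psis, x = [0.3, -0.2, 0.5, 0.1], 0.4
    z = cmath.exp(2j * math.acos(x))
    u = U(psis, x)[0][0]
    b = G(psis, z)[0][1]
    print("discrepancy |i Im(u_d) - b(z)| =", abs(1j * u.imag - b))
```

The phases should lie in \((-\frac{\pi }{2},\frac{\pi }{2})\) so that \(\cos (\psi _n)>0\), which is what the normalization by \(\sqrt{1-F_n^2}\) implicitly uses.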

The above proof gives \(b(z)=b(z^{-1})\) for \(z\in {\mathbb {T}}\) and shows that \(\Im (u_d(\Psi ,x))\) extends to an even function of \(x\in [-1,1]\).

We note that with this correspondence between NLFA and QSP established, Theorem 31 and Theorem 5 in [25] observe a version of the comparability of \(\Vert F\Vert _{l^2({\mathbb {Z}})}\) and \(\Vert b\Vert _{L^2({\mathbb {T}})}\) discussed in the remarks to (3.16) in the previous section.

5 Nonlinear Fourier series of summable sequences

While our focus in this paper is on square summable sequences, we briefly comment on the analytically simpler theory of nonlinear Fourier series of elements in the space \(\ell ^1(\mathbb {Z})\) of absolutely summable sequences on \(\mathbb {Z}\). The linear Fourier series maps \(\ell ^1({\mathbb {Z}})\) to the space \(C(\mathbb {T})\) of continuous functions on \({\mathbb {T}}\), a closed subspace of \(L^\infty (\mathbb {T})\). The actual image of \(\ell ^1({\mathbb {Z}})\) under the linear Fourier series is the Wiener algebra \(A({\mathbb {T}})\). Similar mapping properties are true for the nonlinear Fourier series.

We first recall Theorem 4 below from [23], which provides the \(L^\infty \) bounds. Consider the metric on SU(2) induced by the operator norm, i.e.,

$$\begin{aligned} {\text {dist}}(T,T'):= \Vert T - T'\Vert _{op} \end{aligned}$$
(5.1)

and let \(C (\mathbb {T}, SU(2))\) be the metric space of all continuous \(G: \mathbb {T}\rightarrow SU(2)\) with metric defined by

$$\begin{aligned} {\text {dist}}(G,G'):=\sup _{z\in {\mathbb {T}}} {\text {dist}}(G(z), G'(z)). \end{aligned}$$
(5.2)

Theorem 4

([23, Theorem 2.5]) The SU(2) nonlinear Fourier series extends uniquely to a Lipschitz map \(\ell ^1({\mathbb {Z}})\rightarrow C(\mathbb {T}, SU(2))\) with Lipschitz constant at most 3.

The use of the operator norm is of no particular relevance except possibly for the value of the Lipschitz constant, because all norms on the finite dimensional space of \(2\times 2\) matrices are equivalent. Let b and \(b'\) be the second entries of the first row of \(\overbrace{F}\) and \(\overbrace{F'}\), respectively. Then Theorem 4 in particular implies

$$\begin{aligned} \Vert b-b'\Vert _{L^{\infty }} \le 3 \Vert F-F' \Vert _{\ell ^1}. \end{aligned}$$
(5.3)

We next turn to the Wiener algebra \(A({\mathbb {T}})\). Recall that the linear Fourier series is a bijection from \(\ell ^1({\mathbb {Z}})\) onto \(A({\mathbb {T}})\) and the norm \(\Vert .\Vert _A\) on the Wiener algebra is defined so that the linear Fourier series is an isometry from \(\ell ^1({\mathbb {Z}})\) to the Wiener algebra. Let \(A(\mathbb {T},\mathbb {C}^2)\) be the space of pairs \((a,b)\) of functions in \(A({\mathbb {T}})\) and let the norm of \((a,b)\) be defined as \(\Vert a\Vert _A+\Vert b\Vert _A\).

Theorem 5

The SU(2) nonlinear Fourier series is a real analytic map from \(\ell ^1({\mathbb {Z}})\) to \(A({\mathbb {T}},{\mathbb {C}}^2)\).

Let \(R\ge 0\), \(\Vert F\Vert _{\ell ^1}, \Vert F'\Vert _{\ell ^1} \le R\). If b and \(b'\) are the second entries of the nonlinear Fourier series of F and \(F'\), respectively, then

$$\begin{aligned} \Vert b-b'\Vert _A \le e^R \Vert F-F'\Vert _{\ell ^1}. \end{aligned}$$
(5.4)

If additionally \(R \le 0.36\), then

$$\begin{aligned} \Vert F-F'\Vert _{\ell ^1} \le 2\Vert b-b'\Vert _A. \end{aligned}$$
(5.5)

Proof

We begin with a finite sequence F and write the nonlinear Fourier series as an ordered product

$$\begin{aligned} (a(z),b(z))=\prod _{j=-\infty }^\infty (1+ |F_j|^2)^{-\frac{1}{2}}(1,F_jz^j), \end{aligned}$$
(5.6)

where the non-commutative product is understood in the sense of j increasing from left to right. We decompose \((1,F_jz^j)=(1,0)+(0,F_jz^j)\) and apply the distributive law. The terms resulting from the distributive law are parameterized by increasing sequences \(j_1<\dots <j_n\) of indices, for which \(F_jz^j\) appears in the term. Hence we write the right side of (5.6) as

$$\begin{aligned} C(F)\left( \sum _{n=0}^{\infty } \sum _{j_1<j_2<\dots < j_n} \prod _{k=1}^n (0, F_{j_k} z^{j_k}) \right) \end{aligned}$$
(5.7)

with

$$\begin{aligned} C(F)= \prod _{j=-\infty }^\infty (1+ |F_j|^2)^{-\frac{1}{2}}.\end{aligned}$$

The n-th term in the sum of (5.7) is diagonal for even n and anti-diagonal for odd n. Setting

$$\begin{aligned} T_n (F^1,\dots , F^n)(z):= \sum _{j_1<j_2<\dots < j_n} \left( \prod _{\begin{array}{c} 1\le k\le n \\ k \text { is odd} \end{array}}F^k_{j_k}z^{j_k} \right) \left( \prod _{\begin{array}{c} 1\le k\le n \\ k \text { is even} \end{array}}-\overline{F^k_{j_k}} z^{-j_k}\right) , \end{aligned}$$
(5.8)

we obtain

$$\begin{aligned} a = C(F) \sum _{n=0}^{\infty } T_{2n} (F,\dots , F). \end{aligned}$$
(5.9)
$$\begin{aligned} b = C(F)\sum _{n=0}^{\infty } T_{2n+1} (F,\dots , F). \end{aligned}$$
(5.10)

We will show that both the function C and the multilinear expansions in (5.9) and (5.10) extend to analytic maps in \(\ell ^1({\mathbb {Z}})\), thereby proving that a and b extend to analytic maps in the argument \(F\in \ell ^1({\mathbb {Z}})\).
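For a finite sequence, the expansions (5.9) and (5.10) can be compared directly with the ordered product (5.6). The Python sketch below does this by brute force at a point \(z\in {\mathbb {T}}\). It represents F as a dictionary `{j: F_j}`, uses \(d^*=\overline{d}\) on the unit circle, and the function names are ours. It is meant as an illustration, not as an efficient algorithm.

```python
import cmath
import math
from itertools import combinations

def nlfs_product(F, z):
    """(a(z), b(z)) for a finite sequence F = {j: F_j} via the ordered product (5.6), |z| = 1."""
    a, b = 1 + 0j, 0j
    for j in sorted(F):
        c = 1 / math.sqrt(1 + abs(F[j]) ** 2)
        d = F[j] * z ** j
        # (a, b)(1, d) = (a - b d^*, a d + b), with d^* = conj(d) on the unit circle
        a, b = c * (a - b * d.conjugate()), c * (a * d + b)
    return a, b

def T(n, F, z):
    """T_n(F, ..., F)(z) from (5.8), summing over all j_1 < ... < j_n."""
    total = 0j
    for js in combinations(sorted(F), n):
        term = 1 + 0j
        for k, j in enumerate(js, start=1):
            w = F[j] * z ** j
            # odd k contributes F_j z^j, even k contributes -conj(F_j) z^{-j}
            term *= w if k % 2 == 1 else -w.conjugate()
        total += term
    return total

def nlfs_expansion(F, z):
    """(a(z), b(z)) via the multilinear expansions (5.9) and (5.10)."""
    C = math.prod(1 / math.sqrt(1 + abs(v) ** 2) for v in F.values())
    a = C * sum(T(n, F, z) for n in range(0, len(F) + 1, 2))
    b = C * sum(T(n, F, z) for n in range(1, len(F) + 1, 2))
    return a, b

if __name__ == "__main__":
    F = {-1: 0.3 + 0.1j, 0: -0.2j, 2: 0.5}
    z = cmath.exp(0.7j)
    print(nlfs_product(F, z))
    print(nlfs_expansion(F, z))
```

The two printed pairs should agree up to rounding, and \(|a|^2+|b|^2=1\) for both.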

We first discuss C(F). For a sequence \(H_j\) of non-negative numbers, we have

$$\begin{aligned} \prod _{j=-\infty }^\infty (1+H_j)=1+\sum _{n=1}^\infty \sum _{j_1<\dots <j_n} \prod \limits _{k=1}^n H_{j_k}\le 1+\sum _{n=1}^\infty \frac{1}{n!}\Vert H\Vert _{\ell ^1({\mathbb {Z}})}^n, \end{aligned}$$
(5.11)

which we recognize as a multi-linear expansion with infinite radius of convergence. As the map \(F_j\rightarrow F_j\overline{F_j}\) is real analytic from \(\ell ^1({\mathbb {Z}})\) to itself and the \(-\frac{1}{2}\)-th power is real analytic from \([1,\infty )\) to (0, 1], the function C extends to a real analytic map from \(\ell ^1({\mathbb {Z}})\) to (0, 1].

As for \(T_n\), taking all \(F^j = F\) and summing (5.8) in absolute value over all permutations of the indices \(j_1\) to \(j_n\), the sum separates into a product of sums and one can estimate for \(|z|=1\)

$$\begin{aligned} |T_n (F,\dots , F) (z)| \le \frac{1}{n!} \Vert F\Vert _{\ell ^1} ^n. \end{aligned}$$
(5.12)

Thus the multilinear expansions in the expression (5.9) and (5.10) of a and b have infinite radius of convergence in \(\ell ^1({\mathbb {Z}})\) and extend to real analytic maps from \(\ell ^1\) to \(L^{\infty }({\mathbb {T}},\mathbb {C}^2)\). Moreover, (5.8) is the linear Fourier series of the sequence given by

$$\begin{aligned} (\check{T}_n(F^1,\dots , F^n))_j= \sum _{\begin{array}{c} j_1<j_2<\dots < j_n \\ \sum _{k=1}^n -(-1)^kj_k=j \end{array} } \left( \prod _{\begin{array}{c} 1\le k\le n \\ k \text { is odd} \end{array}}F^k_{j_k} \right) \left( \prod _{\begin{array}{c} 1\le k\le n \\ k \text { is even} \end{array}} - \overline{F^k_{j_k}} \right) . \end{aligned}$$

Absolutely summing over j as well as over permutations of the indices from \(j_1\) to \(j_n\) yields that

$$\begin{aligned} \Vert T_n (F^1,\dots , F^n) \Vert _A \le \frac{1}{n!} \prod _{j=1}^n \Vert F^j\Vert _{\ell ^1}. \end{aligned}$$
(5.13)

Hence the multilinear expansions in (5.9) and (5.10) extend to real analytic maps from \(\ell ^1({\mathbb {Z}})\) to \(A({\mathbb {T}})\). The nonlinear Fourier series extends to a real analytic map from \(\ell ^1({\mathbb {Z}})\) to \(A({\mathbb {T}},\mathbb {C}^2)\). This proves the first statement of Theorem 5.

We turn to the proof of (5.4). By an n-fold application of the triangle inequality and (5.13) above,

$$\begin{aligned}{} & {} \Vert T_n (F,\dots , F) - T_n (F',\dots ,F') \Vert _A\nonumber \\{} & {} \quad \le \sum _{j=1}^n \Vert T_n (F',\dots , F', F-F', F,\dots , F)\Vert _A \nonumber \\{} & {} \quad \le \frac{\Vert F-F'\Vert _{\ell ^1} R^{n-1}}{(n-1)!}, \end{aligned}$$
(5.14)

where in the middle term the difference \(F-F'\) occurs in the j-th entry. On the other hand, by a telescoping sum as in (5.14) using that all factors of C(F) are bounded by 1, we get

$$\begin{aligned}{} & {} |C(F)-C(F')| \le \sum _{j=-\infty }^\infty \left| (1+ |F_j|^2)^{-\frac{1}{2}}- (1+ |F_j'|^2)^{-\frac{1}{2}}\right| \nonumber \\{} & {} \le \sum _{j=-\infty }^\infty \left| (1+ |F_j|^2)^{\frac{1}{2}}- (1+ |F_j'|^2)^{\frac{1}{2}}\right| \le \sum _{j=-\infty }^\infty |F_j-F'_j|=\Vert F-F'\Vert _{\ell ^1({\mathbb {Z}})}.\nonumber \\ \end{aligned}$$
(5.15)

Thus, by the triangle inequality,

$$\begin{aligned}{} & {} \Vert b-b'\Vert _A \le |C(F)-C(F')|\sum _{n=0}^\infty \frac{R^{2n+1}}{(2n+1)!}+C(F')\sum _{n=0}^\infty \frac{\Vert F-F'\Vert _{\ell ^1}R^{2n}}{(2n)!}\qquad \quad \end{aligned}$$
(5.16)
$$\begin{aligned}{} & {} \quad \le \left( \sinh (R) + \cosh (R) \right) \Vert F-F'\Vert _{\ell ^1} = e^R \Vert F-F'\Vert _{\ell ^1}.\qquad \qquad \end{aligned}$$
(5.17)

This proves (5.4).

We turn to the proof of (5.5). We first note a lower bound for C(F). We have

$$\begin{aligned} -2\log (C(F)) = { \sum _{j=-\infty }^{\infty } \log (1+|F_j|^2) } \le { \sum _{j=-\infty }^{\infty } |F_j|^2 } \le { \Vert F\Vert _{\ell ^1}^2 } \le { R^2 } \end{aligned}$$
(5.18)

and hence

$$\begin{aligned} C(F)\ge e^{-\frac{1}{2} R^2}. \end{aligned}$$
(5.19)

The key observation is now that \(T_1 (F)\) is the linear Fourier series of F. We will isolate this term in (5.10) by the triangle inequality as follows:

$$\begin{aligned}{} & {} \Vert b-b'\Vert _A + \Vert \sum _{n=1}^{\infty } C(F) T_{2n+1}(F,\dots ,F) - C(F')T_{2n+1}(F',\dots ,F')\Vert _A\nonumber \\{} & {} \quad \ge \Vert C(F)T_1 (F) - C(F')T_1(F')\Vert _A\nonumber \\{} & {} \quad \ge C(F) \Vert T_1(F)-T_1(F')\Vert _{A} - |C(F)-C(F')| \Vert T_1(F')\Vert _A\nonumber \\{} & {} \quad \ge e^{-\frac{1}{2}R^2} \Vert F-F'\Vert _{\ell ^1} - \Vert F-F'\Vert _{\ell ^1} \Vert F'\Vert _{\ell ^1} \ge ( e^{-\frac{1}{2}R^2}-R) \Vert F-F'\Vert _{\ell ^1}.\nonumber \\ \end{aligned}$$
(5.20)

Here we have used (5.19) and (5.15).

Estimating the second term on the far left-hand side of (5.20) analogously to (5.16) yields

$$\begin{aligned} \Vert b-b'\Vert _A + (e^R-1-R)\Vert F-F'\Vert _{\ell ^1} \ge (e^{-\frac{1}{2}R^2}-R) \Vert F-F'\Vert _{\ell ^1} \end{aligned}$$

and hence

$$\begin{aligned} \Vert b-b'\Vert _A \ge \Vert F-F'\Vert _{\ell ^1} \left( 1+e^{-\frac{1}{2}R^2} - e^R\right) , \end{aligned}$$

where the last term in parentheses is larger than \(\frac{1}{2}\) for \(R\le 0.36\). \(\square \)
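The numerical claim at the end of the proof is easy to confirm. The short Python sketch below evaluates the constant \(1+e^{-\frac{1}{2}R^2}-e^R\), which is decreasing in \(R\ge 0\), and locates by bisection the value of R at which it drops to \(\frac{1}{2}\). The function names are ours.

```python
import math

def lower_constant(R):
    """The factor 1 + exp(-R^2/2) - exp(R) in the lower bound for ||b - b'||_A."""
    return 1 + math.exp(-0.5 * R * R) - math.exp(R)

def threshold(target=0.5, lo=0.0, hi=1.0, steps=60):
    """Bisection for the R in [lo, hi] with lower_constant(R) = target (the constant is decreasing)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if lower_constant(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print("constant at R = 0.36:", lower_constant(0.36))
    print("largest admissible R:", threshold())
```

The threshold lies slightly above 0.36, so the restriction \(R\le 0.36\) in (5.5) is close to the best that this particular argument gives.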

In [7], the authors investigate similar inequalities. Up to comparing the constants, Theorem 3, Corollaries 18 and 20 of [7] state the same inequalities as (5.4) and (5.5). In fact, constants in [7] are better than the ones we obtain. This is due to the fact that we only use the triangle inequality and absorb all the multilinear terms into the first term, whereas [7] carries out a more subtle estimate through the Jacobian of \(\sum _{n=0}^{\infty }T_{2n+1}(F,\dots , F)\).

6 Nonlinear Fourier series of one sided square summable sequences

In this section, we largely follow [23] while giving a self-contained presentation.

For \(1 \le p \le \infty \), let \(H^p({\mathbb {D}})\) be the classical Hardy space associated to the disc \({\mathbb {D}}\), that is the set of functions f in \(L^p({\mathbb {T}})\) which are the linear Fourier series of a sequence supported in \([0,\infty )\). The linear Fourier series of a Hardy space function f provides an analytic extension of f to \({\mathbb {D}}\) which has non-tangential limits almost everywhere on \({\mathbb {T}}\) equal to the function f. We denote the value of the extension of f at a point \(z\in {\mathbb {D}}\) by f(z). The anti-Hardy space \(H^p ({\mathbb {D}}^*)\) consists of the functions f on \({\mathbb {T}}\) for which \(f^* \in H^p ({\mathbb {D}})\). The mean value theorem in the form of (3.4) continues to hold for functions \(a\in H^p({\mathbb {D}}^*)\). In particular, the values of functions in \(H^p({\mathbb {D}}^*)\) and \(H^p({\mathbb {D}})\) at \(\infty \) and 0, respectively, if real, are the average of the real part of the function on \({\mathbb {T}}\).

If \(f\in H^p({\mathbb {D}})\) is bounded by 1, then its extension to \({\mathbb {D}}\) is bounded by 1. If f has modulus 1 almost everywhere on \({\mathbb {T}}\), then f is called inner. If \(f(0)>0\), then \(\log |f|\) is integrable on \({\mathbb {T}}\) and

$$\begin{aligned} \int _{\mathbb {T}}\log |f|\ge \log f(0), \end{aligned}$$
(6.1)

and f is called outer if equality holds in (6.1).

If \(f\in H^p({\mathbb {D}})\), and f vanishes at 0, then the imaginary part of f is the Hilbert transform H with respect to the circle of the real part of f. If \(f\in H^p({\mathbb {D}}^*)\), and f vanishes at \(\infty \), then the imaginary part of f is the negative of the Hilbert transform of the real part of f. The Hilbert transform has operator norm one in \(L^2({\mathbb {T}})\).

On the Hilbert space \(L^2 \left( {\mathbb {T}}\right) \), there is the orthogonal projection operator \(P_{{\mathbb {D}}}\) onto \(H^2 \left( {\mathbb {D}}\right) \). We also define

$$\begin{aligned} P_{{\mathbb {D}}^*}f = ({P_{{\mathbb {D}}} ({f}^*)})^*. \end{aligned}$$

Then \(P_{{\mathbb {D}}^*}\) is the orthogonal projection of \(L^2 \left( {\mathbb {T}}\right) \) onto \(H^2 ({\mathbb {D}}^*)\). Both these operators have operator norm one on \(L^2 \left( {\mathbb {T}}\right) \).

We refer to [10] for these and further details on the theory of Hardy spaces.

Let \({{\textbf{L}}}\) be the set of pairs of measurable functions \((a,b)\) on \(\mathbb {T}\) such that

$$\begin{aligned} aa^*+bb^*=1 \end{aligned}$$
(6.2)

almost everywhere on \({\mathbb {T}}\) and a is in \(H^2({\mathbb {D}}^*)\) with \(a(\infty )>0\). We introduce the following metric on \({{\textbf{L}}}\):

$$\begin{aligned} \rho ((a,b),(c,d))=\left( \int _{{\mathbb {T}}} |a-c|^2 \right) ^{\frac{1}{2}}+ \left( \int _{{\mathbb {T}}}{|b-d|^2} \right) ^{\frac{1}{2}}+|\log (a(\infty ))-\log (c(\infty ))|.\nonumber \\ \end{aligned}$$
(6.3)

The nonlinear Fourier series of a finite sequence is in \(\textbf{L}\). Moreover, the metric \(\rho \) has the following compatibility with the product (3.9) in Theorem 2.

Theorem 6

Let \(F,\tilde{F}\) be finite sequences with support entirely to the left of the support of some other finite sequences \(G,\tilde{G}\). Let \((a,b)\), \((\tilde{a},\tilde{b})\), \((c,d)\), \((\tilde{c},\tilde{d})\) be the nonlinear Fourier series of \(F,\tilde{F}, G, \tilde{G}\), respectively. Then

$$\begin{aligned} \rho ((a,b)(c,d), (\tilde{a},\tilde{b}) (\tilde{c},\tilde{d})) \le 2\rho ((a,b), (\tilde{a},\tilde{b}) ) +2\rho ((c,d), (\tilde{c},\tilde{d})). \end{aligned}$$
(6.4)

Proof

Thanks to the support properties of the sequences, \(bd^*\) vanishes at \(\infty \) and

$$\begin{aligned} \log (a c-bd^*)(\infty )=\log (a(\infty ))+\log (c(\infty )) \end{aligned}$$

and similarly for the quantities with tildes. This shows the desired inequality for the logarithmic parts of the metric.

The functions \(a,b,c,d\) and their tilde counterparts are bounded by 1 almost everywhere on \({\mathbb {T}}\). Hence

$$\begin{aligned} |(ac-bd^*)- (\tilde{a}\tilde{c}-\tilde{b}\tilde{d}^*)|= & {} |(a-\tilde{a})c+\tilde{a}(c-\tilde{c})-(b-\tilde{b})d^*-\tilde{b}(d^*-\tilde{d^*})|\\ {}\le & {} |a-\tilde{a}|+|c-\tilde{c}|+ |b-\tilde{b}|+|d^*-\tilde{d^*}|, \end{aligned}$$

and similarly

$$\begin{aligned} |(ad+bc^*)- (\tilde{a}\tilde{d}+ \tilde{b}\tilde{c}^*)| \le |a-\tilde{a}|+|c-\tilde{c}|+ |b-\tilde{b}|+|d^*-\tilde{d^*}|. \end{aligned}$$

The desired bounds for the \(L^2\) parts of \(\rho \) then follow by the triangle inequality. \(\square \)

Note that as the absolute values of a and c are almost everywhere bounded by 1 on \(\mathbb {T}\), the metric \(\rho \) defines the same topology as the metric on \(\textbf{L}\) in [23] using an \(L^1\) integral.

Let \(\overline{{\textbf{H}}}\) be the set of functions in \({\textbf{L}}\) such that b is in \(H^2({\mathbb {D}})\). Let \({{\textbf{H}}}\) be the set of functions in \(\overline{{\textbf{H}}}\) such that \(a^*\) and b have no common inner factor in the sense that if \(a^*g^{-1}\) and \(bg^{-1}\) are in \(H^2({\mathbb {D}})\) for some inner function g on \({\mathbb {T}}\), then g is constant. Note that the bar in \(\overline{\textbf{H}}\) has the meaning of a closure rather than a complex conjugation. In fact, \(\bar{\textbf{H}}\) is complete.

For a sequence F supported on \([0,\infty )\) define \((a_k,b_k)\) for \(k\ge 0\) recursively by

$$\begin{aligned} (a_0(z),b_0(z))= (1+|F_0|^2)^{-\frac{1}{2}}( 1, F_0) \end{aligned}$$
(6.5)

and for \(k>0\)

$$\begin{aligned} (a_k(z),b_k(z) )= (a_{k-1}(z),b_{k-1}(z)){(1+|F_k|^2)^{-\frac{1}{2}}}(1, F_k z^{k}). \end{aligned}$$
(6.6)

This definition coincides with (3.6) under the identification of \(G_k\) with \((a_k,b_k)\) for sequences supported on \([0,\infty )\).

Theorem 7

Let F be a sequence in \(l^2({\mathbb {Z}})\) with support in \([0,\infty )\). The sequence \((a_k,b_k)\) as in (6.5) and (6.6) converges in \(\textbf{L}\) to an element \((a,b)\) in \(\overline{\textbf{H}}\). We have

$$\begin{aligned} a(\infty )= \prod _{n\ge 0} (1+|F_n|^2)^{-1/2}. \end{aligned}$$
(6.7)

We call the limit \((a,b)\) in this theorem the nonlinear Fourier series of F. This definition is consistent with the definition of the nonlinear Fourier series near (3.6) in the case that F has finite support. Theorem 2 continues to hold for one sided infinite sequences, by taking limits as in Theorem 7.

Proof

We first show that the sequence \((a_k,b_k)\) is Cauchy in \(\textbf{L}\). Let \(k< l\) and write

$$\begin{aligned} (a_k,b_k)(c,d) = (a_l,b_l) \end{aligned}$$
(6.8)

where \((c,d)\) is the nonlinear Fourier series of the sequence \(\left( F_n \textbf{1}_{\{k+1\le n \le l\}} \right) \) and where we have used the multiplicative property (3.9) in Theorem 2. We have by Plancherel (3.14), applied to \(a_l\), \(a_k\), and c

$$\begin{aligned} \log |a_l(\infty )|-\log |a_k(\infty )|= -\frac{1}{2} \sum _{k<n\le l}\log (1+|F_n|^2) =\log |c(\infty )|. \end{aligned}$$
(6.9)

The right-hand side tends to zero for \(k\rightarrow \infty \) because \((F_n)\) is square summable.

By (6.8) we have

$$\begin{aligned} |a_k-a_l|= |a_k(1-c)+ b_k d^*|\le |1-c|+|d|, \\ |b_l-b_k|= |a_k d+b_k (c^*-1)|\le |1-c|+|d|. \end{aligned}$$

To control the \(L^2\) norms of the left-hand sides, it suffices to control the \(L^2\) norm of each summand on the right-hand side. As \(1-c(\infty )\) is real and \(1-c\) is in \(H^2({\mathbb {D}}^*)\), the imaginary part of \(1-c\) equals the negative of the Hilbert transform of the real part of \(1-c\). Since the Hilbert transform H has norm one acting on \(L^2({\mathbb {T}})\) [10, Chapter 3], the \(L^2\) norm of the imaginary part of \(1-c\) is bounded by the \(L^2\) norm of its real part. Using this and the bound \(\Vert c\Vert _\infty \le 1\), we estimate

$$\begin{aligned} \frac{1}{4}\int _{\mathbb {T}}|1-c|^2 \le \frac{1}{2} \int _{\mathbb {T}}|\Re (1-c)|^2 \le \int _{\mathbb {T}}\Re (1-c) =1-c(\infty )\le -\log c(\infty ). \end{aligned}$$

The latter tends to zero as in (6.9) as \(k \rightarrow \infty \). Moreover,

$$\begin{aligned} \int _{\mathbb {T}}|d|^2=\int _{\mathbb {T}}1-|c|^2\le - 2\int _{\mathbb {T}}\log |c| \le -2 \log (c(\infty )), \end{aligned}$$

which also tends to zero as in (6.9). Having seen that the sequence \((a_k,b_k)\) is Cauchy with respect to all three summands in the definition of \(\rho \), it is Cauchy with respect to \(\rho \).

As each \((a_k,b_k)\) is in \(\overline{\textbf{H}}\) and \(\overline{\textbf{H}}\) is complete, \((a_k,b_k)\) has a limit in \(\overline{\textbf{H}}\).

\(\square \)

Theorem 8

Let \((a,b)\in \overline{\textbf{H}}\). There is a unique \(y\in {\mathbb {C}}\) such that there exists \((c,d)\in \overline{\textbf{H}}\) satisfying

$$\begin{aligned} (c(z),d(z)z):=(1+|y|^2)^{-1/2} (1,-y)(a(z),b(z)) \end{aligned}$$
(6.10)

for almost all \(z\in {\mathbb {T}}\). Using this statement, define the functions \((a_n,b_n)\) recursively for \(n\ge 0\) by

$$\begin{aligned} (a_0,b_0)= & {} (a,b)\nonumber \\ (a_{n+1}(z),b_{n+1}(z)z)= & {} (1+|F_n|^2)^{-\frac{1}{2}} (1,-F_n)(a_n(z),b_n(z)), \end{aligned}$$
(6.11)

where \(F_n\) is the unique number such that \((a_{n+1},b_{n+1})\) is in \(\overline{\textbf{H}}\). Then the sequence \((F_n)\) is square summable and

$$\begin{aligned} a(\infty )\le \prod _{n\ge 0} (1+|F_n|^2)^{-1/2}. \end{aligned}$$
(6.12)

If \((a,b)\not \in \textbf{H}\), then we have strict inequality in (6.12).

The sequence produced in this theorem is called the layer stripping sequence of \((a,b)\) [22]. Layer stripping is an alternation between a left multiplication by a constant matrix and a shift by z, mirroring the alternation between two types of unitary matrices in the QSP representation of the nonlinear Fourier series.

Proof

To see the first statement of Theorem 8, note that for each \(y \in {\mathbb {C}}\), the factor \((1+|y|^2)^{-1/2} (1,-y)\) is in SU(2) and thus the right-hand side of (6.10) is an almost everywhere SU(2)-valued function on \({\mathbb {T}}\). Hence \((c,d)\) is SU(2)-valued on \({\mathbb {T}}\). The matrix product on the right-hand side of (6.10) is \(( a+ {y} b^*, b-ya^*)\). We have that \(a+yb^*\) is in \(H^2({\mathbb {D}}^*)\) and \(b-ya^*\) is in \(H^2({\mathbb {D}})\). Equality in (6.10) requires \(b-ya^*\) to vanish at 0. There is a unique complex number y so that this happens, namely

$$\begin{aligned} y=b(0)/a^*(0). \end{aligned}$$
(6.13)

For this y, we note that \(b-ya^*\) can be written as a product of z with an \(H^2({\mathbb {D}})\) function, and that, using that \(a(\infty )\) is positive and thus equal to \(a^*(0)\),

$$\begin{aligned} a(\infty )+{y}b^*(\infty ) = a(\infty )+{y}\overline{b(0)}= a(\infty ) (1+|y|^2)>0. \end{aligned}$$
(6.14)

We may thus use (6.10) with this y to define \((c,d)\). In particular, by (6.14) and (6.10) we have \(c(\infty )> 0\). It follows that \((c,d) \in \bar{\textbf{H}}\). We have thus shown existence and uniqueness of y as in the first part of the theorem.

We may define \(F_n\) as in the second part of the theorem and obtain by induction

$$\begin{aligned} a_{n+1}(\infty )=a(\infty )\prod _{k=0}^{n}(1+|F_k|^2)^{\frac{1}{2}}. \end{aligned}$$

As \(a_{n+1}(\infty )\le 1\) for all n, we obtain (6.12).

Now assume \((a,b)\not \in \textbf{H}\). Then there is a non-constant inner function g such that \(\tilde{a} ^* =a ^* g^{-1}\) and \(\tilde{b}=bg^{-1}\) are in \(H^2({\mathbb {D}})\). Multiplying by a number of modulus one, we may assume that \(g(0)\ge 0\), and since g is not constant, the maximum principle gives \(g(0)<1\). Multiplying (6.11) by \(((g^*)^{-1}, 0)\) from the right, one obtains inductively that the layer stripping sequence \(\tilde{F}\) of \((\tilde{a},\tilde{b})\) is the same as that of \((a,b)\). But

$$\begin{aligned} \tilde{a}(\infty )> g^*(\infty )\tilde{a}(\infty )=a(\infty ). \end{aligned}$$

Applying (6.12) to \((\tilde{a},\tilde{b})\) then gives (6.12) for \((a,b)\) with strict inequality. \(\square \)

Theorem 9

Let F be a sequence in \(l^2({\mathbb {Z}})\) with support in \([0,\infty )\). Then the layer stripping sequence of \(\overbrace{F}=(a,b)\) is F. Moreover, \((a,b)\) is in \(\textbf{H}\). Conversely, if \((a,b)\) is any element in \(\textbf{H}\), then its layer stripping sequence is square summable and \((a,b)\) is the nonlinear Fourier series of this layer stripping sequence.

Proof

To prove the first part of the theorem, assume, to get a contradiction, that there is a sequence F such that the layer stripping sequence of \(\overbrace{F}\) is not equal to F. Let n be the minimal index such that the n-th term of F differs from the n-th term of the layer stripping sequence of \(\overbrace{F}\). We may assume that n is minimal among all hypothetical counterexamples to the first statement of the theorem.

Let \(\tilde{F}\) be the sequence supported on \([1,\infty )\) which coincides with F on \([1,\infty )\). Let \((a_k,b_k)\) and \((\tilde{a}_k,\tilde{b_k})\) be the respective sequences defined by the recursion (6.5), (6.6).

By induction, we obtain for \(k\ge 0\)

$$\begin{aligned} (1+|F_0|^2)^{-\frac{1}{2}} (1,-F_0)(a_k,b_k)=(\tilde{a}_k,\tilde{b}_k). \end{aligned}$$
(6.15)

Taking a limit as \(k\rightarrow \infty \) with Theorem 7, we obtain for the nonlinear Fourier series \((a,b)\) and \((\tilde{a},\tilde{b})\)

$$\begin{aligned} (1+|F_0|^2)^{-\frac{1}{2}} (1,-F_0)(a,b)= (\tilde{a},\tilde{b}). \end{aligned}$$
(6.16)

By Theorem 3, \(\tilde{b}_k(0)=0\), for all \(k\ge 0\), and by taking limits, as evaluation at 0 is continuous in \(H^2({\mathbb {D}})\), we have \(\tilde{b}(0)=0\). Hence there is \(d\in H^2({\mathbb {D}})\) such that \(d(z)z=\tilde{b}(z)\). Set \(c=\tilde{a}\), then

$$\begin{aligned} (1+|F_0|^2)^{-\frac{1}{2}} (1,-F_0)(a(z),b(z))= (c(z),d(z)z). \end{aligned}$$
(6.17)

By the uniqueness part of Theorem 8, we have that \(F_0\) is the zeroth entry of the layer stripping sequence of \((a,b)\). In particular, \(n\ge 1\). By definition, the later terms of the layer stripping sequence of \((a,b)\) are those of the layer stripping sequence of \((c,d)\). But \((c,d)\) is the nonlinear Fourier series of the sequence \(H_{n}=\tilde{F}_{n+1}\) by Theorem 2. It follows that the \((n-1)\)-st term of the layer stripping sequence of \((c,d)\) does not coincide with \(H_{n-1}\). This contradicts the minimality of n.

Thus we have shown that the layer stripping sequence of \((a,b)\) is equal to F. As the Plancherel identity (6.7) gives equality in (6.12), the last statement of Theorem 8 shows that \((a,b)\in \textbf{H}\).

We turn to the second part of the theorem. Let \((c,d)\in \textbf{H}\), let F be its layer stripping sequence and let \((c_k,d_k)\) be the corresponding sequence as defined in Theorem 8. By (6.12), the sequence F is square summable. Let \((a,b)\) be the nonlinear Fourier series of F and let \((a_k,b_k)\) be as in (6.5), (6.6). By induction, we show that for each \(k\ge 0\) we have

$$\begin{aligned} (a_k(z),b_k(z))(c_{k+1}(z),d_{k+1}(z)z^{k+1})=(c(z),d(z)). \end{aligned}$$
(6.18)

Namely, for \(k=0\), both sides of the equation are equal to

$$\begin{aligned} (1+|F_0|^2)^{-\frac{1}{2}} (1,F_0)(c_{1}(z),d_{1}(z)z). \end{aligned}$$

For \(k\ge 1\), the left-hand side of (6.18) is

$$\begin{aligned} (a_{k-1}(z),b_{k-1}(z)) (1+|F_k|^2)^{-\frac{1}{2}} (1,F_k z^k)(c_{k+1}(z),d_{k+1}(z)z^{k+1}) \end{aligned}$$
(6.19)

by definition of \((a_k,b_k)\). An algebraic verification analogous to a shift shows that we may replace \(z^k\) and \(z^{k+1}\) in (6.19) by 1 and z, respectively. Hence, by the definition of the sequence \((c_k,d_k)\), the expression (6.19) is equal to

$$\begin{aligned} (a_{k-1}(z),b_{k-1}(z)) (c_{k}(z),d_{k}(z)z^{k}), \end{aligned}$$
(6.20)

which by the induction hypothesis is the right-hand side of (6.18). This completes the induction step and proves (6.18).

Dividing by the left factor of (6.20), we obtain from (6.18)

$$\begin{aligned} (c_k(z),d_k(z)z^k)=(a_{k-1}^*(z),-b_{k-1}(z))(c,d). \end{aligned}$$

The entries of the matrix on the right-hand side converge in \(L^2({\mathbb {T}})\) because the entries of \((a_k^*,b_k)\) converge and such convergence is preserved under multiplication by a bounded measurable function. Therefore, the entries of the left-hand side converge in \(L^2 ({\mathbb {T}})\) as well.

The limit of \(d_k(z)z^k\) is in the closed subspace \(H^2({\mathbb {D}})\). The linear Fourier coefficients of this limit vanish because the k leading Fourier coefficients of \(d_k(z)z^k\) vanish and the map taking a function to any individual Fourier coefficient is continuous in the space \(H^2({\mathbb {D}})\). Hence \(d_k(z)z^k\) converges to zero. Also \(c_k ^*\) converges in \(H^2 ({\mathbb {D}})\). The limit is an inner function g, because \(|c_k|=\sqrt{1-|d_k|^2}\) converges pointwise almost everywhere to 1. Taking limits in (6.18) with Theorem 6 gives

$$\begin{aligned} (a,b)(g^*,0)=(c,d) \end{aligned}$$

and thus \(ag^*=c\) and \(bg=d\). As \((c,d)\in \textbf{H}\), we have by definition that g is constant. This constant is positive and hence is equal to one. It follows that \((c,d)=(a,b)\), which proves the second part of Theorem 9.

\(\square \)
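For finitely supported F, the first statement of Theorem 9 can be observed concretely. In the Python sketch below, \(a^*\) and b are polynomials in z stored as coefficient lists, `forward` implements the recursion (6.5), (6.6), and `layer_strip` implements (6.11) with \(F_n=b_n(0)/a_n^*(0)\) as in (6.13). The representation by coefficient lists and the function names are our choices for illustration.

```python
import math

def forward(F):
    """Coefficients of a^*(z) and b(z) for F = [F_0, ..., F_N], via (6.5) and (6.6)."""
    N = len(F) - 1
    c0 = 1 / math.sqrt(1 + abs(F[0]) ** 2)
    A = [complex(c0)] + [0j] * N         # a^*(z) = sum_m A[m] z^m
    B = [complex(c0 * F[0])] + [0j] * N  # b(z)   = sum_m B[m] z^m
    for k in range(1, N + 1):
        c = 1 / math.sqrt(1 + abs(F[k]) ** 2)
        # z^k a(z) and z^k b^*(z): the coefficient of z^(k-m) is the conjugate of coefficient m
        zk_a = [0j] * (N + 1)
        zk_bstar = [0j] * (N + 1)
        for m in range(k + 1):
            zk_a[k - m] = A[m].conjugate()
            zk_bstar[k - m] = B[m].conjugate()
        # a_k^* = c (a_{k-1}^* - F_k z^k b_{k-1}^*),  b_k = c (F_k z^k a_{k-1} + b_{k-1})
        A, B = (
            [c * (A[i] - F[k] * zk_bstar[i]) for i in range(N + 1)],
            [c * (F[k] * zk_a[i] + B[i]) for i in range(N + 1)],
        )
    return A, B

def layer_strip(A, B):
    """Layer stripping sequence of (a, b), given the coefficients of a^* and b."""
    A, B = list(A), list(B)
    F = []
    for _ in range(len(A)):
        y = B[0] / A[0]                  # (6.13): y = b(0) / a^*(0)
        c = 1 / math.sqrt(1 + abs(y) ** 2)
        A_new = [c * (a + y.conjugate() * b) for a, b in zip(A, B)]
        B_new = [c * (b - y * a) for a, b in zip(A, B)]
        # b - y a^* vanishes at 0, so dividing by z is a shift of the coefficients
        A, B = A_new, B_new[1:] + [0j]
        F.append(y)
    return F

if __name__ == "__main__":
    F = [0.3 + 0.2j, -0.5j, 0.1, 0.7 - 0.4j]
    A, B = forward(F)
    print(layer_strip(A, B))
```

The printed sequence should reproduce F up to rounding, and the constant coefficient `A[0]` equals \(a(\infty )\) as in (6.7).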

7 Nonlinear Fourier series of square summable sequences

In this section, we adapt arguments in [24] to the SU(2) setting. Unlike [24], we need to assume an effective bound on b and assume that a is outer. Some of the complex analytic tools need adjustments. We give a self-contained presentation.

Recall \(\textbf{H}\) and define \(\textbf{H}_0^*\) to be the set of \((a,b)\) in \({{\textbf{L}}}\) such that \((a, b^*)\) is in \(\textbf{H}\) and \(b^*(0)=0\). By the shift and mirror symmetries of Theorem 2, the results of the previous section apply in symmetric form. In particular, \(\textbf{H}_0^*\) is the space of nonlinear Fourier series of sequences in \(l^2({\mathbb {Z}})\) supported on \((-\infty , -1]\).

We split a sequence F in \(l^2(\mathbb {Z})\) as \(F_-+F_+\), where \(F_-\) is supported in \((-\infty , -1]\) and \(F_+\) is supported in \([0,\infty )\). Let \((a_-,b_-)\) in \(\textbf{H}_0^*\) and \((a_+,b_+)\) in \(\textbf{H}\) be the nonlinear Fourier series of \(F_-\) and \(F_+\), respectively. Then we define \((a,b)\) almost everywhere on \({\mathbb {T}}\) by

$$\begin{aligned} (a,b):=(a_-,b_-)(a_+,b_+). \end{aligned}$$
(7.1)

As a product of SU(2) matrices almost everywhere, \((a,b)\) is in SU(2) and thus entry-wise bounded by one almost everywhere. The identity

$$\begin{aligned} a= a_- a_+ -b_- b_+^* \end{aligned}$$
(7.2)

shows that a has an analytic extension to \(\mathbb {D}^*\) with

$$\begin{aligned} a(\infty )= a_-(\infty ) a_+(\infty ). \end{aligned}$$
(7.3)

Hence a is in \(H^2({\mathbb {D}}^*)\) with \(a(\infty )>0\) and we have \((a,b)\in \textbf{L}\).

We define \((a,b)\) as in (7.1) to be the nonlinear Fourier series of F. By Theorems 6 and 7 and the symmetries of Theorem 2, \((a,b)\) is the limit as \(k\rightarrow \infty \) in \(\textbf{L}\) of the nonlinear Fourier series of the truncations of F to the intervals \([-k,k]\). The properties in Theorem 2 continue to hold for this extension of the definition of nonlinear Fourier series. We also see with (6.7) and (7.3) that

$$\begin{aligned} a(\infty ) = \prod _{n\in {\mathbb {Z}}} (1+|F_n|^2)^{-1/2}. \end{aligned}$$
(7.4)

Let \(\textbf{B}\) be the subspace of \(\textbf{L}\) of all \((a,b)\) such that

$$\begin{aligned} \inf _{z\in \mathbb {D}^*}|a(z)|^2> \frac{1}{2}. \end{aligned}$$
(7.5)

For a function a satisfying (7.5), there is a holomorphic branch of \(\log (a^*)\) on \({\mathbb {D}}\) with nontangential limits coinciding with \(\log (a^*)\) on the boundary. By the mean value theorem for the real part of \(\log (a^*)\), \(a^*\) is outer on \({\mathbb {D}}\).

We embed \(\textbf{B}\) into the Hilbert space \( \mathcal {H} \equiv L^2 \left( {\mathbb {T}}\right) \oplus L^2 \left( {\mathbb {T}}\right) \), written as column vectors, with the norm

$$\begin{aligned} \left\| \begin{pmatrix} a \\ b \end{pmatrix} \right\| _{\mathcal {H}} = \sqrt{\left\| a \right\| _{L^2 \left( {\mathbb {T}}\right) } ^2 + \left\| b \right\| _{L^2 \left( {\mathbb {T}}\right) } ^2 }. \end{aligned}$$

For \((a,b)\) and \((c,d)\) in \(\textbf{B}\), the metrics defined by \(\mathcal {H}\) and \(\rho \) are equivalent,

$$\begin{aligned} \frac{1}{8}\rho ((a,b),(c,d)) \le \left\| \begin{pmatrix} a\\ b \end{pmatrix}- \begin{pmatrix} c\\ d \end{pmatrix}\right\| _{\mathcal {H}}\le \rho ((a,b),(c,d)). \end{aligned}$$

Indeed, the second inequality follows directly from the definition of \(\rho \), while the first follows from the additional observation that for outer functions \(a^*\) and \(c^*\),

$$\begin{aligned} \left| \log |a(\infty )|-\log |c(\infty )| \right| = \left| \int _{\mathbb {T}}\log |a|-\log |c| \right| \le 2\int _{\mathbb {T}}|a-c|\le 2\Vert a-c\Vert _{L^2({\mathbb {T}})}, \end{aligned}$$

which used an elementary inequality for the logarithm in the domain \([\frac{1}{2}, 1]\).
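The elementary inequality invoked here can be checked numerically. The following sketch assumes that the inequality in question is \(|\log x-\log y|\le 2|x-y|\) for \(x,y\in [\frac{1}{2},1]\), which follows from the derivative bound \(1/t\le 2\) on that interval; the function name and the grid size are illustrative choices.

```python
import math

def max_log_ratio(lo=0.5, hi=1.0, steps=200):
    """Largest value of |log x - log y| / |x - y| over a uniform grid on [lo, hi]."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    worst = 0.0
    for x in grid:
        for y in grid:
            if x != y:
                ratio = abs(math.log(x) - math.log(y)) / abs(x - y)
                worst = max(worst, ratio)
    return worst

# The ratio approaches 2 near x = y = 1/2 but never exceeds it.
worst_ratio = max_log_ratio()
```

The largest ratio is attained for neighbouring grid points near \(\frac{1}{2}\), where the derivative of the logarithm is largest.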

Theorem 10

For each complex valued measurable function b on \(\mathbb {T}\) with

$$\begin{aligned} {\text {ess}} \,{\text {sup}}_{z\in \mathbb {T}}|b(z)|^2< \frac{1}{2}, \end{aligned}$$
(7.6)

there is a unique measurable function a on \(\mathbb {T}\) such that \((a,b)\in \textbf{B}\).

Proof

To see existence of a, let

$$\begin{aligned} M (z) \equiv \log \sqrt{1 - \left| b (z) \right| ^2} \end{aligned}$$

for almost every \( z \in {\mathbb {T}}\). By (7.6), M is real and integrable on \({\mathbb {T}}\). Then \(M-iHM\) with H the Hilbert transform on \({\mathbb {T}}\) has an analytic extension to \({\mathbb {D}}^*\). Define

$$\begin{aligned} a :=e^{M-iHM} , \end{aligned}$$
(7.7)

which is in \(H^2({\mathbb {D}}^*)\) and satisfies

$$\begin{aligned} \left| a (z)\right| = e^{M} = \sqrt{1 - \left| b(z) \right| ^2} \end{aligned}$$

for almost every \(z\in {\mathbb {T}}\). Also \(a^{-1}\) has analytic extension to \({\mathbb {D}}^*\) and is bounded by \(2^{\frac{1}{2}}\). It follows that

$$\begin{aligned} \inf \limits _{z \in {\mathbb {D}}^*} \left| a \left( z \right) \right| ^2 > \frac{1}{2}. \end{aligned}$$

Hence \((a,b) \in \textbf{B}\).

To see uniqueness of a, let \(\tilde{a}\) be another function as claimed in the theorem. Then \(\tilde{a} a^{-1}\) and its reciprocal are analytic in the disc \({\mathbb {D}}^*\) with boundary values of modulus one almost everywhere. Hence both are bounded by 1 on the disc, so \(\tilde{a} a^{-1}\) has modulus one throughout \({\mathbb {D}}^*\) and is therefore constant. This constant is positive at \(\infty \) and thus 1. This proves uniqueness.

\(\square \)
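The construction (7.7) can be imitated numerically. The following is a minimal sketch and not part of the proof: it samples a hypothetical test function \(b(e^{it})=0.3\,i\cos t\) at an odd number of points, assumes the convention that the Hilbert transform H multiplies the n-th Fourier mode by \(-i\,{\text {sgn}}(n)\), and uses a plain discrete Fourier transform. The names `dft` and `outer_a` and the grid size are illustrative choices.

```python
import cmath
import math

N = 65  # odd number of sample points on the unit circle (illustrative choice)
ts = [2 * math.pi * j / N for j in range(N)]
freqs = range(-(N // 2), N // 2 + 1)

def dft(values):
    """Fourier coefficients c_n, for n in freqs, of samples on the circle."""
    return {n: sum(v * cmath.exp(-1j * n * t) for v, t in zip(values, ts)) / N
            for n in freqs}

def outer_a(b_samples):
    """Samples of a = exp(M - iHM) on the circle, and the value a(infinity)."""
    M = [0.5 * math.log(1 - abs(b) ** 2) for b in b_samples]
    c = dft(M)
    # Under the assumed convention, M - iHM keeps the mean of M and doubles
    # its negative frequencies, so it extends analytically to D^*.
    log_a = [c[0] + 2 * sum(c[n] * cmath.exp(1j * n * t) for n in freqs if n < 0)
             for t in ts]
    return [cmath.exp(w) for w in log_a], math.exp(c[0].real)

# Hypothetical test signal: b(e^{it}) = 0.3 i cos(t), so |b|^2 <= 0.09 < 1/2.
b = [0.3j * math.cos(t) for t in ts]
a, a_inf = outer_a(b)
```

On the sample points the output satisfies \(|a|^2+|b|^2=1\) up to rounding, and `a_inf` is the exponential of the mean of M, in line with \(a(\infty )>0\).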

Theorem 11

For each \((a,b)\in \textbf{B}\), there are unique \((a_+,b_+)\in \textbf{H}\) and \((a_-,b_-)\in \textbf{H}_0^*\) such that we have the Riemann Hilbert type factorization

$$\begin{aligned} (a_{-}, b_{-})(a_{+}, b_{+}) = (a,b) \end{aligned}$$
(7.8)

almost everywhere on \({\mathbb {T}}\). Moreover, there is a unique \(F\in \ell ^2({\mathbb {Z}})\) whose nonlinear Fourier series is \((a,b)\).

Proof

Existence and uniqueness of the factorization (7.8) shows existence and uniqueness of F by the one sided Theorem 9 and the definition (7.1) of the nonlinear Fourier series on \(\ell ^2({\mathbb {Z}})\). It therefore suffices to show existence and uniqueness of the factorization.

We first discuss uniqueness and begin by deducing necessary conditions on the factors in (7.8). Multiplying by the inverse of the matrix \((a_+,b_+)\) from the right in (7.8), we obtain

$$\begin{aligned} (a_{-}, b_{-}) = (a , b) (a_{+} ^* , -b_{+} ) = (a a_{+} ^* + b b_{+} ^* , - a b_{+} + a_{+} b) . \end{aligned}$$
(7.9)

In particular, the second component of this identity reads as

$$\begin{aligned} b_{-} = - a b_{+} + a_{+} b . \end{aligned}$$
(7.10)

Because |a| is bounded below almost everywhere on \({\mathbb {T}}\), we can divide by a to get

$$\begin{aligned} b_{+}=-\frac{b_{-}}{a} + \frac{b}{a} a_{+}. \end{aligned}$$
(7.11)

The term \(b_+\) is in \(H^2({\mathbb {D}})\). The term \(\frac{b_-}{a}\) has an analytic extension to \({\mathbb {D}}^*\) and hence is in \(H^2({\mathbb {D}}^*)\) because |a| is bounded below by \(2^{-\frac{1}{2}}\). Moreover, \(\frac{b_-}{a}\) vanishes at \(\infty \). Acting on (7.11) by the Cauchy projection \(P_{{\mathbb {D}}}\) yields

$$\begin{aligned} b_{+}= P_{{\mathbb {D}}} \left( \frac{b}{a} a_{+} \right) . \end{aligned}$$
(7.12)

We similarly rewrite the identity for the first component of (7.9) as

$$\begin{aligned} a_{+} ^*=\frac{a_{-}}{a} - \frac{b}{a} b_{+} ^*, \end{aligned}$$

and applying \(P_{{\mathbb {D}}}\) yields

$$\begin{aligned} a_{+} ^* = \frac{1}{a _{+} (\infty )} - P_{{\mathbb {D}}} \left( \frac{b}{a} b_{+} ^* \right) . \end{aligned}$$
(7.13)

Here we used that \({a_-} {a}^{-1}\) has analytic extension to \({\mathbb {D}}^*\) and applying \(P_{\mathbb {D}}\) to it gives the constant term in the linear Fourier expansion, which is equal to \({a_-(\infty )}{a(\infty )}^{-1}\), which is positive and equal to \({a_+(\infty )}^{-1}\) by (7.3).

Motivated by (7.12) and (7.13), we consider the mapping

$$\begin{aligned} (A,B) \mapsto \left( \left( 1 - P_{{\mathbb {D}}}\left( \frac{b}{a}B^* \right) \right) ^*, P_{{\mathbb {D}}} \left( \frac{b}{a} A \right) \right) , \end{aligned}$$
(7.14)

which is a contraction on \(\mathcal {H}\) because \(P_{{\mathbb {D}}}\) is a projection and by (7.5),

$$\begin{aligned} \mathop {{\text {ess}} \,{\text {sup}}}\nolimits _{z \in {\mathbb {T}}} \left| \frac{b \left( z \right) }{a \left( z \right) } \right| = \sqrt{\mathop {{\text {ess}} \,{\text {sup}}}\nolimits _{z \in {\mathbb {T}}} \frac{1}{\left| a \left( z \right) \right| ^2} - 1}< 1, \end{aligned}$$

where we used that a has limits almost everywhere on \({\mathbb {T}}\). Thus (7.14) has a unique fixed point \((A,B)\) by Banach’s fixed point theorem. Multiplying (7.12) and (7.13) by \(a_+(\infty )\) shows that

$$\begin{aligned} (A,B) = a_+ (\infty ) ( a_+, b_+) \, \end{aligned}$$
(7.15)

is the fixed point of (7.14) and therefore the right side of (7.15) is uniquely determined. Evaluating at infinity gives

$$\begin{aligned} A(\infty )= a_+(\infty )^2. \end{aligned}$$
(7.16)

Identity (7.16) is necessary and thus the positive value \(a_+(\infty )\) is unique. Dividing the necessary (7.15) by this unique number shows that \((a_+,b_+)\) is unique. And by (7.8), \((a_{-}, b_{-})\) is also unique.

For existence of a factorization (7.8), again consider the map in (7.14) on \(\mathcal {H}\), and let \((A,B) \in \mathcal {H}\) be its unique fixed point. We claim that

$$\begin{aligned} M \equiv A A^* + B B^* \end{aligned}$$
(7.17)

is constant on \({\mathbb {T}}\). Clearly, M is real on \({\mathbb {T}}\), so it suffices to show that M is in \(H^1 \left( {\mathbb {D}}^* \right) \). Indeed, this will ensure the linear Fourier coefficients of M are supported on \((-\infty , 0]\), while the conjugate antipodal symmetry of the Fourier coefficients of any real valued function ensures the negative Fourier coefficients of M vanish just as its positive coefficients do. We use the fact that \((A,B)\) is a fixed point of (7.14) to write

$$\begin{aligned} M = A \left[ 1 - P_{{\mathbb {D}}}\left( \frac{b}{a}B^* \right) \right] + B^* P_{{\mathbb {D}}} \left( \frac{b}{a} A \right) , \end{aligned}$$
(7.18)

which after adding and subtracting \(AB^*ba^{-1}\) gives

$$\begin{aligned} M= A \left[ 1 + ({\text {Id}}- P_{{\mathbb {D}}})\left( \frac{b}{a}B^* \right) \right] - B^* \left( {\text {Id}}- P_{{\mathbb {D}}} \right) \left( \frac{b}{a} A \right) . \end{aligned}$$
(7.19)

We recognize \({\text {Id}}- P_{{\mathbb {D}}}\) as the projection operator onto \(H^2 _0 \left( {\mathbb {D}}^* \right) \), i.e., the set of functions in \(H^2 \left( {\mathbb {D}}^* \right) \) with vanishing zeroth Fourier coefficient. Since the fixed point equation shows that \(A\in H^2({\mathbb {D}}^*)\) and \(B\in H^2({\mathbb {D}})\), M is a sum of products of functions in \(H^2 ({\mathbb {D}}^*)\) and therefore belongs to \(H^1 ({\mathbb {D}}^*)\).

By (7.19), the value \(M\left( \infty \right) \) equals

$$\begin{aligned} A \left( \infty \right) \left[ 1 + ({\text {Id}}- P_{{\mathbb {D}}}) \left( \frac{b}{a}B^* \right) \left( \infty \right) \right] - B^* (\infty ) \left( {\text {Id}}- P_{{\mathbb {D}}} \right) \left( \frac{b}{a} A \right) (\infty ) =A(\infty ). \end{aligned}$$

We can also write

$$\begin{aligned} A (\infty ) = M = \int \limits _{{\mathbb {T}}} M = \int \limits _{{\mathbb {T}}} \left| A\right| ^2 + \left| B\right| ^2 \ge 0 , \end{aligned}$$
(7.20)

where equality holds in the last step if and only if \(\left| A \right| = \left| B \right| = 0\) almost everywhere on \({\mathbb {T}}\). However \((A,B) = (0, 0)\) is not the fixed point of (7.14), hence \(A(\infty ) > 0\).

Define normalized versions of A and B, i.e.,

$$\begin{aligned} a_{+} \left( z \right) \equiv \frac{A(z)}{A(\infty )^{\frac{1}{2}}} , \quad b_{+} \left( z \right) \equiv \frac{B(z)}{A(\infty )^{\frac{1}{2}}} , \end{aligned}$$
(7.21)

so that \(a_{+} \in H^2 \left( {\mathbb {D}}^* \right) \), \(b_+ \in H^2 \left( {\mathbb {D}}\right) \) satisfy

$$\begin{aligned} a_{+} a_{+} ^* + b_{+} b_{+} ^* = \frac{M}{A \left( \infty \right) } = 1 \end{aligned}$$

and thus \((a_+,b_+)\in \overline{\textbf{H}}\).

Now we define

$$\begin{aligned} \left( a_{-}, b_{-} \right) \equiv \left( a,b \right) \left( a_{+} ^*, -b_{+} \right) \end{aligned}$$

Because \((a_{-}, b_{-})\) is the product of matrices in SU(2), we have

$$\begin{aligned} a_{-} a_{-} ^* + b_{-} b_{-} ^* = 1 \text { on } {\mathbb {T}}. \end{aligned}$$

We claim that \(\left( a_{-}, b_{-} \right) \in \overline{\textbf{H}}_{0} ^*\), that is \(a_-,b_- \in H^2 ({\mathbb {D}}^*)\) and \(b_-(\infty )=0\).

Indeed, because \((A,B)\) is a fixed point of (7.14), \(a_{-}\) equals

$$\begin{aligned} a a_{+} ^* + b b_{+} ^* = a \left( \frac{1}{a_{+} (\infty )} - P_{{\mathbb {D}}}\left( \frac{b}{a}b_{+} ^* \right) \right) + b b_{+} ^* = a \frac{1}{a_{+}(\infty )} + a\left( {\text {Id}}- P_{{\mathbb {D}}} \right) \left( \frac{b}{a}b_{+} ^* \right) , \end{aligned}$$

which is clearly an element of \(H ^2 \left( {\mathbb {D}}^* \right) \) with constant term \( \frac{ a(\infty )}{a_{+} (\infty )} \), which is positive as both numerator and denominator are positive. Using (7.14) again, we have

$$\begin{aligned} b_{-} = -a b_{+} + b a_{+} = - a P_{{\mathbb {D}}} \left( \frac{b}{a} a_+ \right) + b a_{+} = a\left( {\text {Id}}- P_{{\mathbb {D}}} \right) \left( \frac{b}{a} a_+ \right) , \end{aligned}$$

which again is in \(H^2 \left( {\mathbb {D}}^* \right) \) and has constant term \(b_-(\infty )=0\). Thus we see that \((a_-,b_-)\) is in \( \overline{\textbf{H}}^*_0\).

To check that \((a_+, b_+) \) and \((a_{-}, b_{-})\) are indeed in \( \textbf{H}\) and \( \textbf{H}_{0} ^*\), it remains to show that \(a_{+} ^*\) and \(b_+\) share no common nontrivial inner factor g on \({\mathbb {D}}\), and likewise for \(a_{-} ^*\) and \(b_{-}^*\). Suppose first that g is an inner function such that \({a_{+} ^*}{g^{-1}}\) and \({b_{+}}{g^{-1}}\) are both in \(H^2 ({\mathbb {D}})\). Then by (7.8), we have

$$\begin{aligned} {a ^*}{g^{-1}} = a_{-} ^* {a_{+} ^*}{g^{-1}} - b_{-} ^* {b_{+}}{g^{-1}}, \end{aligned}$$
(7.22)

which is an \(H^2 ({\mathbb {D}})\) function. Thus g is an inner factor of \(a^*\). But \(a^*\) is outer as observed near (7.5). This implies that \(|g(0)|=1\) because otherwise

$$\begin{aligned} \log |a^*(0)|<\log |a^*g^{-1}(0)|\le \int _{\mathbb {T}}\log |a^*g^{-1}|=\int _{\mathbb {T}}\log |a^*|, \end{aligned}$$
(7.23)

a contradiction. By the maximum principle, g is constant. Hence \(a_{+} ^*\) and \(b_{+}\) share no common inner factor and so \((a_+,b_+)\in \textbf{H}\). Similar reasoning with (7.22) and (7.23) shows \(a_{-}^*\) and \(b_{-}\) share no common inner factor, and hence \((a_{-}, b_{-})\) is in \( \textbf{H}_{0} ^*\).

\(\square \)
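The contraction argument in the proof suggests a simple numerical scheme: iterate the map (7.14) starting from \((1,0)\). The sketch below does this on a discrete grid and is only an illustration under stated assumptions. The test function \(b(e^{it})=0.3\,i\cos t\) is hypothetical, the function `proj_disc` is a discrete stand-in for the Cauchy projection \(P_{{\mathbb {D}}}\) that keeps the nonnegative frequencies of a discrete Fourier transform, and the outer function a is built as in (7.7) assuming that H multiplies the n-th Fourier mode by \(-i\,{\text {sgn}}(n)\).

```python
import cmath
import math

N = 65  # odd number of sample points (illustrative choice)
ts = [2 * math.pi * j / N for j in range(N)]
freqs = range(-(N // 2), N // 2 + 1)

def coeffs(values):
    """Discrete Fourier coefficients of samples on the circle."""
    return {n: sum(v * cmath.exp(-1j * n * t) for v, t in zip(values, ts)) / N
            for n in freqs}

def synth(c, keep):
    """Samples of the sum of c_n z^n over the frequencies n with keep(n) true."""
    return [sum(c[n] * cmath.exp(1j * n * t) for n in freqs if keep(n)) for t in ts]

def proj_disc(values):
    """Discrete stand-in for P_D: keep the frequencies n >= 0."""
    return synth(coeffs(values), lambda n: n >= 0)

# Hypothetical data: b = 0.3 i cos(t) and the outer a with |a|^2 + |b|^2 = 1.
b = [0.3j * math.cos(t) for t in ts]
cM = coeffs([0.5 * math.log(1 - abs(v) ** 2) for v in b])
a = [cmath.exp(cM[0] + 2 * w) for w in synth(cM, lambda n: n < 0)]
ratio = [y / x for x, y in zip(a, b)]  # samples of b / a

def step(A, B):
    """One application of the map (7.14); on the circle, X^* is the conjugate."""
    PB = proj_disc([r * Bv.conjugate() for r, Bv in zip(ratio, B)])
    PA = proj_disc([r * Av for r, Av in zip(ratio, A)])
    return [(1 - p).conjugate() for p in PB], PA

A, B = [1 + 0j] * N, [0j] * N
for _ in range(40):  # the contraction factor here is about 0.32
    A, B = step(A, B)

M = [abs(x) ** 2 + abs(y) ** 2 for x, y in zip(A, B)]
A_inf = coeffs(A)[0].real  # A(infinity) is the zeroth Fourier coefficient of A
```

After the iteration, M is constant on the grid up to small discretization errors and agrees with `A_inf`, in line with (7.17) and (7.20). Dividing A and B by the square root of `A_inf` gives approximations of \(a_+\) and \(b_+\) as in (7.21).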

Given \(\epsilon > 0\), let \(\textbf{B}_{\epsilon }\) be the subset of all elements \((a,b)\) of \(\textbf{B}\) which satisfy

$$\begin{aligned} \inf \limits _{z \in {\mathbb {D}}^*} \left| a (z) \right| \ge 2^{-\frac{1}{2}} + \epsilon . \end{aligned}$$
(7.24)

Lemma 3

Let \(\epsilon \in (0, 1 -2^{-\frac{1}{2}})\) and let \((a,b) \in \textbf{B}_{\epsilon }\). Then for \(\eta \equiv 3^{\frac{3}{2}}2^{-1}\) we have

$$\begin{aligned} {\text {ess}} \,{\text {sup}}_{z \in {\mathbb {T}}} \frac{\left| b (z) \right| }{ \left| a (z) \right| } \le 1 - \eta \epsilon . \end{aligned}$$
(7.25)

Proof

For x in the interval \(I \equiv (2^{-\frac{1}{2}}, 1)\), define the positive function

$$\begin{aligned} f(x) \equiv \sqrt{x^{-2} - 1}. \end{aligned}$$

Then

$$\begin{aligned} f'(x) = \frac{- x^{-3}}{\sqrt{x^{-2} - 1}} = \frac{-1}{\sqrt{x^{4} - x^6}} \end{aligned}$$

achieves its maximum on I when \(x^4 - x^6\) achieves its maximum. Because

$$\begin{aligned} (x^{4} - x^6)' =4 x^3 - 6 x^5 = -6 x^3\left( x - 2^{ \frac{1}{2}} 3^{- \frac{1}{2}} \right) \left( x + 2^{ \frac{1}{2}} 3^{- \frac{1}{2}} \right) \,, \end{aligned}$$

then \(x^{4} - x^6\), and hence \(f'(x)\), achieves its maximum on I at \(x=2^{ \frac{1}{2}} 3^{- \frac{1}{2}}\).

Now, let \((a,b) \in \textbf{B}_{\epsilon }\). Using first (6.2) and then our assumption (7.24), we write

$$\begin{aligned} {\text {ess}} \,{\text {sup}}_{z \in {\mathbb {T}}} \frac{\left| b (z) \right| }{ \left| a (z) \right| } = {\text {ess}} \,{\text {sup}}_{z \in {\mathbb {T}}} \sqrt{\left| a (z) \right| ^{-2} -1} \le \sqrt{\left( 2^{-\frac{1}{2}} +\epsilon \right) ^{-2} -1} = f (2^{- \frac{1}{2} } +\epsilon ). \end{aligned}$$

By the mean value theorem, there exists \(\xi \in (2^{- \frac{1}{2} }, 2^{- \frac{1}{2} } + \epsilon )\) for which this equals

$$\begin{aligned} f(2^{- \frac{1}{2}}) +f' (\xi ) \epsilon \le f( 2^{ - \frac{1}{2}}) +f' \left( 2^{ \frac{1}{2}} 3^{- \frac{1}{2}} \right) \epsilon = 1 - 3 ^{\frac{3}{2} }2^{-1} \epsilon \,. \end{aligned}$$

\(\square \)
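The bound of Lemma 3 is easy to probe numerically. The following sketch evaluates the gap between \(1-\eta \epsilon \) and \(f(2^{-\frac{1}{2}}+\epsilon )\) on a grid of admissible \(\epsilon \); the function name `worst_gap` and the grid size are illustrative choices.

```python
import math

ETA = 3 ** 1.5 / 2  # the constant eta of Lemma 3

def f(x):
    """The function f(x) = sqrt(x^{-2} - 1) from the proof of Lemma 3."""
    return math.sqrt(x ** -2 - 1)

def worst_gap(samples=1000):
    """Minimum of (1 - ETA * eps) - f(2^{-1/2} + eps) over a grid of eps."""
    eps_max = 1 - 2 ** -0.5
    gaps = []
    for k in range(1, samples):
        eps = eps_max * k / samples
        gaps.append((1 - ETA * eps) - f(2 ** -0.5 + eps))
    return min(gaps)

gap = worst_gap()
```

A nonnegative value of `gap` means that \(f(2^{-\frac{1}{2}}+\epsilon )\le 1-\eta \epsilon \) at every grid point, which is the inequality obtained from the mean value theorem above.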

Theorem 12

If we endow both \(\textbf{B}_{\epsilon }\) and \(\textbf{H}\) with the metric from \(\mathcal {H}\), then the map sending \((a,b)\in \textbf{B}_\epsilon \) to the right factor \((a_+,b_+)\in \textbf{H}\) of (7.8) in Theorem 11 is Lipschitz with constant at most \((2^{\frac{5}{2}}+4) (\eta \epsilon )^{-\frac{3}{2}}\), where \(\eta \) is the constant in (7.25).

Proof

For the fixed point \((A,B)\) of the map (7.14) we obtain

$$\begin{aligned} \left\| \begin{pmatrix} A\\ B\end{pmatrix} \right\| _{\mathcal {H} }{} & {} \le \left\| \begin{pmatrix} 0 &{} -P_{{\mathbb {D}}^* } \frac{b^*}{a^*}\\ P_{{\mathbb {D}}} \frac{b}{a} &{} 0 \end{pmatrix} \right\| _{\mathcal {H} \rightarrow \mathcal {H}}\left\| \begin{pmatrix} A\\ B\end{pmatrix} \right\| _{\mathcal {H} } + \left\| \begin{pmatrix} 1\\ 0 \end{pmatrix} \right\| _{\mathcal {H}}\nonumber \\ {}{} & {} \le \left( 1- \eta {\epsilon } \right) \left\| \begin{pmatrix} A\\ B\end{pmatrix} \right\| _{\mathcal {H} } + 1, \end{aligned}$$
(7.26)

where we used (7.25) and the fact that \(P_{{\mathbb {D}}}\) and \(P_{{\mathbb {D}}^*}\) have operator norm 1 on \(L^2 ({\mathbb {T}})\). Collecting the norms of \((A,B)^T\) on the left hand side and dividing by \(\eta \epsilon \), we obtain

$$\begin{aligned} \left\| \begin{pmatrix} A\\ B\end{pmatrix} \right\| _{\mathcal {H} } \le \frac{1}{\eta \epsilon } . \end{aligned}$$
(7.27)

By the mean value property and the Cauchy-Schwarz inequality, we obtain further

$$\begin{aligned} \left| A \left( \infty \right) \right| = \left| \int \limits _{{\mathbb {T}}} A \right| \le \left\| A \right\| _{L^2 \left( {\mathbb {T}}\right) } \le \frac{1}{ \eta \epsilon } . \end{aligned}$$
(7.28)

To see that the map from \(\textbf{B}_{\epsilon }\) to the fixed point of (7.14) is Lipschitz, let \((a,b), (c,d) \in \textbf{B}_{\epsilon }\), and let \((A,B)\) and \((C,D)\) be the respective fixed points, i.e.,

$$\begin{aligned} \begin{pmatrix} A\\ B \end{pmatrix}&= \begin{pmatrix} 1\\ 0 \end{pmatrix} + \begin{pmatrix} 0 &{}\quad -P_{{\mathbb {D}}^* } \frac{b^*}{a^*}\\ P_{{\mathbb {D}}} \frac{b}{a} &{}\quad 0 \end{pmatrix} \begin{pmatrix} A\\ B \end{pmatrix}, \end{aligned}$$
(7.29)
$$\begin{aligned} \begin{pmatrix} C\\ D \end{pmatrix}&= \begin{pmatrix} 1\\ 0 \end{pmatrix} + \begin{pmatrix} 0 &{}\quad -P_{{\mathbb {D}}^* } \frac{d^*}{c^*}\\ P_{{\mathbb {D}}} \frac{d}{c} &{}\quad 0 \end{pmatrix} \begin{pmatrix} C\\ D \end{pmatrix} . \end{aligned}$$
(7.30)

We subtract the second equation from the first to get an equation for

$$\begin{aligned}X=\begin{pmatrix} A\\ B \end{pmatrix} - \begin{pmatrix} C\\ D \end{pmatrix},\end{aligned}$$

namely

$$\begin{aligned} X= \begin{pmatrix} 0 &{} \quad -P_{{\mathbb {D}}^* } \frac{b^*}{a^*}\\ P_{{\mathbb {D}}} \frac{b}{a} &{}\quad 0 \end{pmatrix} X + \begin{pmatrix} P_{{\mathbb {D}}^* }\left( \frac{d^*}{c^*} -\frac{b^*}{a^*} \right) D\\ P_{{\mathbb {D}}} \left( \frac{b}{a} - \frac{d}{c} \right) C \end{pmatrix} . \end{aligned}$$
(7.31)

This equation is analogous to (7.26), and the same bootstrapping argument as there leading to (7.27), combined with the fact that \(P_{{\mathbb {D}}}, P_{{\mathbb {D}}^*}\) have operator norm 1, gives

$$\begin{aligned} \Vert X\Vert _{\mathcal {H}}\le \frac{1}{\eta \epsilon }\left\| \begin{pmatrix} P_{{\mathbb {D}}^* }\left( \frac{d^*}{c^*} -\frac{b^*}{a^*} \right) D\\ P_{{\mathbb {D}}} \left( \frac{b}{a} - \frac{d}{c} \right) C \end{pmatrix} \right\| _{\mathcal {H}} \le \frac{1}{\eta \epsilon } \left\| \begin{pmatrix} \left( \frac{d^*}{c^*} -\frac{b^*}{a^*} \right) D\\ \left( \frac{b}{a} - \frac{d}{c} \right) C \end{pmatrix} \right\| _{\mathcal {H}}. \end{aligned}$$
(7.32)

By (7.17), (7.20), we have that C and D are almost everywhere bounded on \({\mathbb {T}}\) by

$$\begin{aligned} \left| C \left( \infty \right) \right| ^\frac{1}{2} \le (\eta \epsilon )^{-\frac{1}{2}}, \end{aligned}$$

where the last inequality follows from (7.28). Moreover, a and c are bounded below by \(2^{-\frac{1}{2}}\) by assumption. Hence (7.32) gives

$$\begin{aligned}\Vert X\Vert _{\mathcal {H}}\le (\eta \epsilon )^{-\frac{3}{2}} \left\| \begin{pmatrix} \frac{d^*}{c^*} -\frac{b^*}{a^*} \\ \frac{b}{a} - \frac{d}{c} \end{pmatrix} \right\| _{\mathcal {H}} \le 2 (\eta \epsilon )^{-\frac{3}{2}}\left\| \begin{pmatrix} d^* a^* - b^* c^* \\ b c - d a \end{pmatrix} \right\| _{\mathcal {H}}. \end{aligned}$$

Adding and subtracting terms as in \(bc-ad = c(b-d) + d(c-a)\), and using that \(a, b, c, d\) are all bounded above by 1 in absolute value on \({\mathbb {T}}\), we obtain

$$\begin{aligned} \Vert X\Vert _{\mathcal {H}}\le 4 (\eta \epsilon )^{-\frac{3}{2}} \left\| \begin{pmatrix} a \\ b \end{pmatrix} - \begin{pmatrix} c \\ d \end{pmatrix} \right\| _{\mathcal {H}}. \end{aligned}$$
(7.33)

By (7.21), we have

$$\begin{aligned} \left\| \begin{pmatrix} a_{+} \\ b_{+} \end{pmatrix} - \begin{pmatrix} c_{+} \\ d_{+} \end{pmatrix} \right\| _{\mathcal {H}}= & {} \left\| \frac{1}{A (\infty ) ^{\frac{1}{2}}} \begin{pmatrix} A \\ B \end{pmatrix} - \frac{1}{C (\infty ) ^{\frac{1}{2}}} \begin{pmatrix} C \\ D \end{pmatrix} \right\| _{\mathcal {H}}\nonumber \\\le & {} \frac{1}{A (\infty ) ^{\frac{1}{2}}} \left\| X\right\| _{\mathcal {H} } + \left\| \begin{pmatrix} C\\ D \end{pmatrix} \right\| _{\mathcal {H}} \left| \frac{1}{C (\infty ) ^{\frac{1}{2}}} - \frac{1}{ A (\infty )^{\frac{1}{2}}} \right| .\nonumber \\ \end{aligned}$$
(7.34)

Using (7.3), (7.24), and the fact that \(a_{-} \left( \infty \right) \le 1\), we have

$$\begin{aligned} \left| A (\infty ) ^{\frac{1}{2}} \right| = |a_{+} \left( \infty \right) |=\left| \frac{a \left( \infty \right) }{a_{-} \left( \infty \right) } \right| \ge 2^{-\frac{1}{2}} \end{aligned}$$
(7.35)

and analogously for \(C (\infty )\). Further, as \((c,d) \in \textbf{L}\), we have

$$\begin{aligned} \left\| \frac{1}{C (\infty )^\frac{1}{2}} \begin{pmatrix} C\\ D \end{pmatrix} \right\| _{\mathcal {H}} = \left\| \begin{pmatrix} c\\ d \end{pmatrix} \right\| _{\mathcal {H}} =1, \end{aligned}$$
(7.36)

and using (7.35), in particular

$$\begin{aligned} C (\infty ) ^\frac{1}{2} + A (\infty ) ^\frac{1}{2} \ge 2^{\frac{1}{2}}, \end{aligned}$$

we get

$$\begin{aligned} \left| {1} - \frac{C (\infty )^{\frac{1}{2}}}{A (\infty ) ^{\frac{1}{2}}} \right|= & {} \frac{ |C (\infty ) - A (\infty )|}{ A (\infty )^{\frac{1}{2}} \left( A (\infty ) ^\frac{1}{2} + C(\infty ) ^\frac{1}{2} \right) } \le \left| A (\infty ) - C (\infty ) \right| \le \int \limits _{{\mathbb {T}}} |A - C|\nonumber \\\le & {} \left\| X \right\| _{\mathcal {H}}. \end{aligned}$$
(7.37)

Using (7.27) and the above estimates (7.35), (7.36), and (7.37), we obtain from (7.34) and (7.33)

$$\begin{aligned} \left\| \begin{pmatrix} a_{+} \\ b_{+} \end{pmatrix}-\begin{pmatrix} c_{+} \\ d_{+} \end{pmatrix}\right\| _{\mathcal {H}} \le (2^{\frac{1}{2}}+1) \left\| X \right\| _{\mathcal {H}} \le 4(2^{\frac{1}{2}}+1) (\eta \epsilon )^{-\frac{3}{2}} \left\| \begin{pmatrix} a \\ b \end{pmatrix} - \begin{pmatrix} c \\ d \end{pmatrix} \right\| _{\mathcal {H}}. \end{aligned}$$

This proves Theorem 12. \(\square \)

Theorem 13

The map sending \((a,b)\) to the coefficient \(F_0\) of the sequence F as in Theorem 11 is Lipschitz on \(\textbf{B}_{\epsilon }\) endowed with the metric \(\mathcal {H}\). The Lipschitz constant is at most \((8 + 2^{ \frac{5}{2}}) (\eta \epsilon )^{-\frac{3}{2}}\), where \(\eta \) is the constant in (7.25).

Proof

Let \((a,b)\) and \((c,d)\) in \(\textbf{B}_{\epsilon }\) be the nonlinear Fourier series of \((F_n)\) and \((G_n)\). Let \((a_+, b_+)\) and \((c_+, d_+)\) be as in the proof of Theorem 11. We have

$$\begin{aligned}2^{-\frac{1}{2}}\le {a_{+} \left( \infty \right) }, {c_{+} \left( \infty \right) }\le 1,\end{aligned}$$

where the upper bound holds for general elements of \(\textbf{L}\) and the lower bound follows from (7.3) and the assumptions on a and c. By (6.13), we have

$$\begin{aligned}{} & {} \frac{1}{2} \left| F_0 -G_0 \right| = \frac{1}{2}\left| \frac{b_+(0)}{a_+(\infty )}-\frac{d_+ (0)}{c_+ (\infty )} \right| \le \left| b_+ (0)c_+ (\infty )- d_+ (0)a_+ (\infty ) \right| \\{} & {} \quad \le |(b_+ (0)- d_+ (0))c_+ (\infty )|+|d_+ (0)(c_+ (\infty )- a_+ (\infty ))| \\{} & {} \quad \le |b_+ (0)- d_+ (0)|+ |c_+ (\infty )- a_+ (\infty )| \le \int _{{\mathbb {T}}} |b_+ -d_+|+\int _{{\mathbb {T}}} |c_+- a_+ | \\{} & {} \quad \le 2^{\frac{1}{2}} \left\| \begin{pmatrix} a_+ \\ b_+ \end{pmatrix} - \begin{pmatrix} c_+ \\ d_+ \end{pmatrix} \right\| _{\mathcal {H}} \le (8+ 2^{ \frac{5}{2}}) (\eta \epsilon )^{-\frac{3}{2}} \left\| \begin{pmatrix} a \\ b \end{pmatrix} - \begin{pmatrix} c \\ d \end{pmatrix} \right\| _{\mathcal {H}}. \end{aligned}$$

Here the last inequality followed from Theorem 12. This proves Theorem 13. \(\square \)

8 Proof of the main Theorem 1

Let f be given as in Theorem 1. Extend f to an even function on \([-1,1]\).

We first show existence of the sequence \(\Psi \) by construction. Define \(b(z)=if(x)\) where \(x=\cos \theta \) for \(\theta \in [0,\pi ]\) given by \(z=e^{2i\theta }\). As f is even, \(b(z)=b(z^{-1})\) for \(z\in {\mathbb {T}}\), and in particular b(z) is well-defined at \(z=1\) because \(f(-1) = f(1)\). Moreover, b is purely imaginary and is bounded in absolute value by \(2^{-\frac{1}{2}}-\epsilon \). By Theorem 10, there is an a such that \((a,b)\in \textbf{B}_\epsilon \). By Theorem 11, there is a sequence \(F=(F_n) \in \ell ^2 \left( {\mathbb {Z}}\right) \) so that \((a,b)\) is the nonlinear Fourier series of F.

The reflection symmetry of the purely imaginary b implies \(F_{-n} = F_n\) and \(\bar{F_n} = -F_n\) for all n. Indeed, by (3.11), extended to infinite sequences, and the fact that \(b(z^{-1}) = b(z)\), the sequence \((F_{-n})\) has nonlinear Fourier series \((a ^* (z^{-1}), b(z))\). By the uniqueness part of Theorem 10, this implies

$$\begin{aligned} a ^* (z^{-1}) = a(z). \end{aligned}$$

Thus \((a,b)\) is the nonlinear Fourier series of both \((F_n)\) and \((F_{-n})\), which by the uniqueness part of Theorem 11 implies \(F_n = F_{-n}\). And by (3.12) extended to infinite sequences, the sequence \(\bar{F_n}\) has nonlinear Fourier series

$$\begin{aligned} (a ^* (z^{-1}), b^* (z^{-1})) = (a (z), - b(z)), \end{aligned}$$

which by (3.10), again extended to infinite sequences, is the nonlinear Fourier series of \(-F_n\). Again by the uniqueness part of Theorem 11, we conclude that \(\bar{F_n} = -F_n\), i.e., \(F_n\) is purely imaginary.

We define \(\psi _n\in (-\frac{\pi }{2},\frac{\pi }{2})\) for \(n\in {\mathbb {N}}\) by

$$\begin{aligned} \psi _n \equiv \arctan \left( \frac{F_n}{i} \right) . \end{aligned}$$

We now show the desired properties of the sequence \(\Psi \). We begin with the Plancherel identity (1.6). We compute

$$\begin{aligned} \frac{1}{\pi } \int \limits _{-1} ^1 \log \left( 1 - f(x)^2 \right) \frac{dx}{\sqrt{1-x^2} }= \frac{1}{\pi }\int _0^\pi \log (1-f(\cos \varphi )^2 ) \, d\varphi \\=\int \limits _{{\mathbb {T}}} \log \left( 1 - \left| b \right| ^2 \right) = 2 \int \limits _{{\mathbb {T}}} \log \left| a \right| = 2 \log \left| a^* \left( 0 \right) \right| , \end{aligned}$$

where the last equality follows from a being outer. By (7.4), the last term equals

$$\begin{aligned} - \sum \limits _{k \in {\mathbb {Z}}} \log (1+|F_k|^2) = -\sum \limits _{k \in {\mathbb {Z}}} \log (1+\tan ^2 \psi _{|k|} ). \end{aligned}$$

This proves (1.6).
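The change of variables between F and \(\Psi \) used in this computation can be illustrated on a finite example. The sketch below takes a hypothetical even, purely imaginary sequence supported on \(|n|\le 3\), computes \(\psi _n=\arctan (F_n/i)\), and compares the two sums in the last display; the numerical values are arbitrary, and `a_inf` is the value of \(a(\infty )\) predicted by (7.4).

```python
import math

# Hypothetical even, purely imaginary sequence F_n = i * t_{|n|} for |n| <= 3.
t = [0.4, -0.2, 0.1, 0.05]
F = {n: 1j * t[abs(n)] for n in range(-3, 4)}

# psi_n = arctan(F_n / i) for n >= 0, a value in (-pi/2, pi/2).
psi = [math.atan((F[n] / 1j).real) for n in range(4)]

# The two sums appearing in the proof of the Plancherel identity.
lhs = -sum(math.log(1 + abs(F[k]) ** 2) for k in F)
rhs = -sum(math.log(1 + math.tan(psi[abs(k)]) ** 2) for k in F)

# Product formula (7.4) for a(infinity), rewritten via the logarithm.
a_inf = math.exp(lhs / 2)
```

The sums `lhs` and `rhs` agree up to rounding because \(|F_k|=|\tan \psi _{|k|}|\) for every k.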

Next we show convergence of \(\Im \left( u_d \left( \Psi , x \right) \right) \) to f in the norm (1.2). Let \((a_d, b_d)\) be the nonlinear Fourier series of the truncated sequence \((F_n 1_{\{|n| \le d \}})\). Then by Lemma 2, we have

$$\begin{aligned} b_d \left( z \right) = \Im \left( u_d \left( \Psi , x \right) \right) , \end{aligned}$$

where we again recall that the right side is an even function on \([-1,1]\). By the reasoning just above (7.4), the sequence \((a_d, b_d)\) converges to \((a,b)\) in \(\textbf{L}\), and hence \(b_d\) converges to b in \(L^2 \left( {\mathbb {T}}\right) \). We have

$$\begin{aligned} \left( \frac{1}{\pi } \int \limits _{-1}^1 \left| \Im u_d \left( \Psi , x\right) - f \left( x \right) \right| ^2 \frac{dx}{\sqrt{1-x^2}} \right) ^{\frac{1}{2}} = \left\| b_d - b \right\| _{L^2 \left( {\mathbb {T}}\right) } , \end{aligned}$$
(8.1)

and the right side converges to 0 as \(d \rightarrow \infty \) by the remarks just above (7.4). Hence \(\Im u_d \left( \Psi , x\right) \) converges to f in the norm (1.2).

This shows existence of \(\Psi \). To see uniqueness, let \(\tilde{\Psi }\) be any sequence satisfying the properties of the theorem. Set \(\tilde{F}_n = i \tan \tilde{\psi }_{|n|}\). By (1.6), the sequence \(\tilde{F}\) is square summable. Let \((\tilde{a},\tilde{b})\) be its nonlinear Fourier series, and \((\tilde{a}_d,\tilde{b}_d)\) the nonlinear Fourier series of its truncations. By the remarks just above (7.4), \(\tilde{b}_d\) converges to \(\tilde{b}\), and it also converges to b by definition of b and the convergence assumption of the theorem. Hence \(b=\tilde{b}\). Because

$$\begin{aligned} \left| \frac{\tilde{a^*}}{a^*} \right| = 1 \end{aligned}$$

on \({\mathbb {T}}\) and \(\frac{1}{a^*} \in H^2 ({\mathbb {D}})\), then \(\frac{\tilde{a^*}}{a^*}\) is an inner function on \({\mathbb {D}}\). By (7.4), (1.6), the definition of the function \(b = i f\) and then the definition (7.7) of the outer function a, we have

$$\begin{aligned} \tilde{a} (\infty ) = \prod \limits _{n \in {\mathbb {Z}}} (1 + |\tilde{F}_{n}|^2)^{ - \frac{1}{2}} = \exp \left( \frac{1}{2} \int \limits _{{\mathbb {T}}} \log (1-|{\tilde{b}}|^2) \right) = \exp \left( \int \limits _{{\mathbb {T}}} \log (|a|) \right) = a (\infty )\,. \end{aligned}$$

Thus \(\frac{\tilde{a} (\infty )}{a (\infty )} =1\) and by the maximum principle, the inner function \(\frac{\tilde{a} ^*}{a^*}\) must be constant 1, i.e., \(a=\tilde{a}\). By the uniqueness part of Theorem 11, we have \(\tilde{F}_n=F_n\). Hence \(\tilde{\Psi }=\Psi \) since for each j we have \(\psi _j, {\tilde{\psi }}_j \in (- \frac{\pi }{2}, \frac{\pi }{2})\), an interval on which \(\tan \) is injective.

Now we show that the map sending f to \(\Psi \) is Lipschitz. It suffices to show for each \(k\ge 0\) that the map from f to \(\psi _k\) is Lipschitz.

We write this map as a composition of three maps. By Theorem 13, the map sending \((a,b)\) in \(\textbf{B}_{\epsilon }\) to \(F_0\) is Lipschitz. As the shift \((a,b) \mapsto (a,b z)\) is an isometry in \(\textbf{B}_{\epsilon }\), the same holds for \(F_k\) with \(k\in {\mathbb {Z}}\) and we have for \((a,b)\) and \((\tilde{a},\tilde{b})\) in \(\textbf{B}_{\epsilon }\),

$$\begin{aligned} \left| F_k -\tilde{F}_{k} \right| \le (8+ 2^{ \frac{5}{2}}) (\eta \epsilon )^{-\frac{3}{2}} \left\| \begin{pmatrix} a\\ b \end{pmatrix} - \begin{pmatrix} \tilde{a}\\ \tilde{b} \end{pmatrix} \right\| _{\mathcal {H}} , \end{aligned}$$
(8.2)

where \(\eta \) is as in (7.25).

As \(\arctan \left( x \right) \) has slope between \(-1\) and 1, we have

$$\begin{aligned} \left| \psi _k -\tilde{\psi }_{k} \right| \le \left| F_k -\tilde{F}_{k} \right| . \end{aligned}$$
(8.3)

It remains to obtain a Lipschitz bound for the map sending f to \((a,b)\). By the cosine theorem, with \(\theta (z)\) the angle between a(z) and \(\tilde{a}(z)\), we have

$$\begin{aligned} |a- \tilde{a}|^2= & {} |a|^2+|\tilde{a}|^2-2|a||\tilde{a}|\cos \theta =(|a|-|\tilde{a}|)^2 + 2 |a||\tilde{a}|(1-\cos \theta )\nonumber \\\le & {} (|a|-|\tilde{a}|)^2 + 2 (1-\cos \theta ) \le (|a|-|\tilde{a}|)^2 + \theta ^2. \end{aligned}$$
(8.4)

Here we used that \(|a|,|\tilde{a}|\le 1\), and that \(2(1-\cos \theta )\) vanishes of order two at \(\theta =0\) and has second derivative less than or equal to two.
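Inequality (8.4) is a pointwise statement about two complex numbers of modulus at most 1, so it can be spot-checked by random sampling. The following sketch does this with a fixed seed; the variable `c` plays the role of \(\tilde{a}\), and the restriction of the moduli to \([2^{-\frac{1}{2}},1]\) mirrors the values taken on \({\mathbb {T}}\) by elements of \(\textbf{B}\). The function name and sample size are illustrative choices.

```python
import cmath
import math
import random

def check_cosine_bound(trials=10000, seed=0):
    """Largest value of |a - c|^2 - ((|a| - |c|)^2 + theta^2) over random samples."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        # Random points with modulus in [2^{-1/2}, 1] and arbitrary argument.
        a = cmath.rect(rng.uniform(2 ** -0.5, 1), rng.uniform(-math.pi, math.pi))
        c = cmath.rect(rng.uniform(2 ** -0.5, 1), rng.uniform(-math.pi, math.pi))
        theta = abs(cmath.phase(a / c))  # angle between a and c, in [0, pi]
        lhs = abs(a - c) ** 2
        rhs = (abs(a) - abs(c)) ** 2 + theta ** 2
        worst = max(worst, lhs - rhs)
    return worst

worst_excess = check_cosine_bound()
```

A nonpositive value of `worst_excess`, up to rounding, is consistent with (8.4).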

Up to sign and multiples of \(2\pi \), the angle \(\theta \) is given by the imaginary part of \(\log (a)-\log (\tilde{a})\). As \(\log (a)\) and \(\log (\tilde{a})\) have analytic extensions to \({\mathbb {D}}^*\) that are real at \(\infty \), the angle \(\theta \) is dominated in absolute value by the imaginary part of \(\log (a)-\log (\tilde{a})\), which in turn is given as \(-H(\log |a|-\log |\tilde{a}|)\) for the Hilbert transform H. Recall that the Hilbert transform has operator norm bounded by 1 on \(L^2({\mathbb {T}})\).

Inequality (8.4) yields

$$\begin{aligned}{} & {} \left\| a - \tilde{a} \right\| _{L^2 \left( {\mathbb {T}}\right) }^2 \le \left\| \left| a \right| - \left| \tilde{a} \right| \right\| _{L^2 \left( {\mathbb {T}}\right) }^2 + \left\| H({\log \left| a \right| } - {\log \left| \tilde{a} \right| }) \right\| _{L^2 \left( {\mathbb {T}}\right) }^2 \nonumber \\{} & {} \quad \le \left\| \sqrt{1- |b|^2} - \sqrt{1- |\tilde{b}|^2} \right\| _{L^2 \left( {\mathbb {T}}\right) }^2 + \frac{1}{4}\left\| {\log \left| 1- |b|^2 \right| } - {\log | 1- |\tilde{b}|^2 |} \right\| _{L^2 \left( {\mathbb {T}}\right) }^2.\nonumber \\ \end{aligned}$$
(8.5)

Using that on the interval \([0, 2^{-\frac{1}{2}}]\), the map \(x \mapsto \log (1-x^2)\) has slope bounded by \(2^{\frac{3}{2}}\) and \(x \mapsto \sqrt{ 1-x^2}\) has slope bounded by \(2^{\frac{1}{2}}\), we estimate the right side of (8.5) by

$$\begin{aligned}\le 4\left\| \left| b \right| - \left| \tilde{b} \right| \right\| _{L^2 \left( {\mathbb {T}}\right) } ^2 \le 4 \left\| b - \tilde{b} \right\| _{L^2 \left( {\mathbb {T}}\right) } ^2. \end{aligned}$$
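The two slope bounds and the resulting constant 4 can be confirmed with a short computation. The following sketch estimates the slopes by finite differences on a uniform grid; the helper `max_slope` and the grid size are illustrative choices, and the last line combines the bounds as \((2^{\frac{1}{2}})^2+\frac{1}{4}(2^{\frac{3}{2}})^2\).

```python
import math

def max_slope(g, lo, hi, steps=10000):
    """Largest finite-difference slope of g over a uniform grid on [lo, hi]."""
    h = (hi - lo) / steps
    return max(abs(g(lo + (k + 1) * h) - g(lo + k * h)) / h for k in range(steps))

top = 2 ** -0.5  # right endpoint of the interval [0, 2^{-1/2}]
slope_log = max_slope(lambda x: math.log(1 - x * x), 0.0, top)
slope_sqrt = max_slope(lambda x: math.sqrt(1 - x * x), 0.0, top)

# Constant in front of || |b| - |b~| ||^2 obtained from the two slope bounds.
constant = (2 ** 0.5) ** 2 + 0.25 * (2 ** 1.5) ** 2
```

Both estimated slopes stay below the stated bounds, and `constant` evaluates to 4 up to rounding.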

We obtain

$$\begin{aligned} \left\| \begin{pmatrix} a\\ b \end{pmatrix} - \begin{pmatrix} \tilde{a}\\ \tilde{b} \end{pmatrix} \right\| _{\mathcal {H}} ^2 \le 5\left\| b - \tilde{b} \right\| _{L^2 \left( {\mathbb {T}}\right) } ^2 = \frac{5}{\pi } \int \limits _{-1}^1 \left| f \left( x \right) - \tilde{f} \left( x \right) \right| ^2 \frac{dx}{\sqrt{1-x^2}} . \end{aligned}$$
(8.6)

Combining (8.2), (8.3) and (8.6), we obtain

$$\begin{aligned} \left| \psi _k -\tilde{\psi }_k \right| \le 5 ^{ \frac{1}{2}} (8+ 2^{ \frac{5}{2}}) (\eta \epsilon )^{-\frac{3}{2}}\left( \frac{1}{\pi } \int \limits _{-1}^1 \left| f \left( x \right) - \tilde{f} \left( x \right) \right| ^2 \frac{dx}{\sqrt{1-x^2}} \right) ^{\frac{1}{2}}. \end{aligned}$$

With the value of \(\eta \) from Lemma 3, the Lipschitz constant above is at most \(7.3 \epsilon ^{- \frac{3}{2}}\), which completes the proof of Theorem 1.
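The arithmetic behind the final constant can be reproduced in a few lines. The sketch below evaluates \(5^{\frac{1}{2}}(8+2^{\frac{5}{2}})\eta ^{-\frac{3}{2}}\) with \(\eta =3^{\frac{3}{2}}2^{-1}\); the variable names are illustrative.

```python
import math

eta = 3 ** 1.5 / 2    # constant from Lemma 3
thm12 = 2 ** 2.5 + 4  # prefactor in Theorem 12
thm13 = 8 + 2 ** 2.5  # prefactor in Theorem 13 and (8.2)

# Coefficient of eps^{-3/2} in the final Lipschitz bound.
final = math.sqrt(5) * thm13 * eta ** -1.5
```

The value of `final` is slightly below 7.3, and `thm13` equals \(2^{\frac{1}{2}}\) times `thm12`, reflecting the step from Theorem 12 to Theorem 13.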