1 Introduction

The connection between low-rank Hankel and Toeplitz operators and matrices, and the properties of the functions that generate them, plays a crucial role for instance in frequency estimation [7, 32, 46–48], system identification [14, 16, 31, 33] and approximation theory [4–6, 8–10, 42]. The reason for this is that there is a connection between the rank of such an operator and its generating function being a sum of exponential functions, where the number of terms is connected to the rank of the operator (Kronecker’s theorem). Moreover, adding the condition of positive semi-definiteness imposes further restrictions on the sum of exponentials (the Carathéodory-Fejér and Fischer theorems), a result which underlies e.g. Pisarenko’s famous method for frequency estimation [43].

We provide corresponding theorems in the multidimensional setting. In contrast to the one dimensional situation, the multidimensional framework provides substantial flexibility in how to define these operators. Whereas most previous research on multidimensional Hankel and Toeplitz type operators considers “generating functions/sequences” f that are defined on product domains, we here consider a framework where f is defined on an open connected and bounded domain \(\Omega \) in \({\mathbb R}^d\) (or discretizations thereof). Besides providing beautiful new theorems, it is our hope that the new results in this paper will pave the way for applications in multidimensional frequency estimation/approximation/compression, in analogy with the use of Toeplitz and Hankel matrices in the one dimensional setting. For this reason, we present results both in the continuous and discretized setting, and discuss how they influence each other.

To present the key ideas, we here focus mainly on the continuous theory since it is more transparent. “General domain Hankel (Toeplitz) operators” form a class of integral operators whose kernel \(K(x,y)\) is of the form \(K(x,y)=f(x+y)\) or \(K(x,y)=f(x-y)\), where f is the so called “generating function”. Their precise definition also depends on an auxiliary domain \(\Omega \) on which f is defined; we postpone detailed definitions to Sect. 2.2. We denote by \(\Gamma _f\) a generic general domain Hankel operator and by \(\Theta _f\) its Toeplitz counterpart (see Fig. 1 for an example of a discretized \(\Gamma _f\)). These operators were introduced in [3], where it is shown that if \(\Gamma _f\) or \(\Theta _f\) has rank equal to \(K<\infty \), then f is necessarily an exponential polynomial;

$$\begin{aligned} f( x)=\sum _{j=1}^J p_j(x)e^{\zeta _j\cdot x} \end{aligned}$$
(1.1)

where \(J\le K\) (assuming no cancellation), \(p_j\) are polynomials in \( x=(x_1,\ldots ,x_d)\), \( \zeta _j\in {\mathbb C}^d\) and \(\zeta _j\cdot x\) denotes the standard scalar product

$$\begin{aligned} {\zeta }_j\cdot {x}=\sum _{m=1}^d \zeta _{j,m}x_m. \end{aligned}$$

Conversely, any such exponential polynomial gives rise to finite rank \(\Gamma _f\) and \(\Theta _f\) respectively, and there is a method to determine the rank given the generating function (1.1). Most notably, the rank equals K if f is of the form

$$\begin{aligned} f( x)=\sum _{k=1}^K c_ke^{ \zeta _k\cdot x}, \end{aligned}$$
(1.2)

where \(c_k\in {\mathbb C}\) (assuming that there is no cancelation in (1.2)).
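The rank statement around (1.2) is easy to probe numerically. The following sketch is our own illustration, not code from the paper; the grid, the disc radius and the random parameters are arbitrary choices. It builds the general domain Hankel matrix of a sum of K exponentials on a discretized disc (as in Fig. 1) and checks that its rank is K.

```python
import numpy as np

# Illustration of (1.2): the general domain Hankel matrix with entries
# f(x + y), for x, y in a discretized disc, has rank K when the generating
# function f is a sum of K exponentials with distinct frequencies.
rng = np.random.default_rng(0)
N, K = 6, 3
pts = [(i, j) for i in range(-N, N + 1) for j in range(-N, N + 1)
       if i * i + j * j <= N * N]          # discretized disc (Xi = Upsilon)
zetas = 0.1 * (rng.standard_normal((K, 2)) + 1j * rng.standard_normal((K, 2)))
c = rng.uniform(1, 2, K)                   # non-zero coefficients

def f(w):
    w = np.asarray(w, dtype=complex)
    return sum(ck * np.exp(z @ w) for ck, z in zip(c, zetas))

G = np.array([[f((x[0] + y[0], x[1] + y[1])) for y in pts] for x in pts])
rank_G = np.linalg.matrix_rank(G, tol=1e-6)
```

Each exponential term contributes a rank-one kernel \(e^{\zeta\cdot x}e^{\zeta\cdot y}\), so the rank is at most K; linear independence of the exponentials makes it exactly K.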

Fig. 1

a “Generating sequence” defined on a disc \(\Omega \); b The matrix realization of the corresponding “general domain Hankel operator” (see Sect. 4.1 for further details)

The main topic of this paper is the study of how the additional condition that \(\Gamma _f\) or \(\Theta _f\) be positive semi-definite (PSD) affects the generating function f. We prove that \(\Theta _f\) then has rank K if and only if f is of the form

$$\begin{aligned} f( x)=\sum _{k=1}^K c_ke^{i \xi _k\cdot x} \end{aligned}$$
(1.3)

where \(c_k>0\) and \(\xi _k\in {\mathbb R}^d\) (Theorem 7.1), which in a certain sense is an extension of Carathéodory-Fejér’s theorem on PSD Toeplitz matrices. Correspondingly, \(\Gamma _f\) is PSD and has rank K if and only if f is of the form

$$\begin{aligned} f( x)=\sum _{k=1}^K c_ke^{ \xi _k\cdot x} \end{aligned}$$
(1.4)

where again \(c_k>0\) and \(\xi _k\in {\mathbb R}^d\) (Theorem 8.1). Similar results for Hankel matrices date back to work of Fischer [22].

The only one of the above results that has a simple counterpart in the finite dimensional discretized multivariable setting is the Carathéodory-Fejér theorem, which has been observed previously in [53] (concerning block Toeplitz matrices). In this paper we provide a general result on tensor products, which can be used to “lift” structure results in one dimension to the multi-dimensional setting. We use this to give an alternative proof of the discrete Carathéodory-Fejér theorem, which subsequently is used to prove the continuous counterpart.

Fischer’s theorem, on the other hand, has no neat version in the multivariable finite dimensional setting, but it has been generalized to so called small Hankel operators on \(\ell ^2({\mathbb N}^d)\) in [44], a paper which also contains a result analogous to (1.4).

However, the product domain setting is rather restrictive and not always a natural choice. Whereas one-dimensional generating functions are necessarily defined on an interval, there is an abundance of possible regions on which to define their multidimensional cousins. Despite this, the majority of multivariate treatments of these issues are set either directly in a block-Toeplitz/Hankel setting, or rely on tensor products. In both cases the corresponding domain of definition \(\Omega \) of the generating function/sequence is a square (or multi-cube), but for concrete applications to multidimensional frequency estimation, the available data need not be naturally defined on such a domain. In radially symmetric problems, a disc may be more suitable or, for certain boundary problems, a triangle might be more appropriate.

Concerning analogs of the above results for the discretized counterparts of \(\Theta _f\) and \(\Gamma _f\), we show in this paper how to construct “structured matrices” that approximate their continuous counterparts, and hence can be expected to inherit these desirable properties, given a sufficient sampling rate. We give simple conditions on the regularity of f and \(\Omega \) needed for this to be successful. This gives rise to an interesting class of structured matrices, which we call “general domain Hankel/Toeplitz matrices”. As an example, Fig. 1 shows a “generating sequence” f on a discretized disc, together with a plot of its general domain Hankel matrix.

The paper is organized as follows. In the next section we review the theory and at the same time introduce the operators we will be working with in the continuous setting (Sect. 2.2). The short Sect. 3 provides a tool from tensor algebra, and also introduces useful notation for the discrete setting. Section 4 discusses how to discretize the \(\Gamma _f\)’s and \(\Theta _f\)’s, including particular cases such as block Toeplitz and Hankel matrices. In Sect. 5 we prove the Carathéodory-Fejér theorem in the discrete (block) setting. Section 6 shows that the discrete operators approximate their continuous counterparts, given a sufficient sampling rate, and discusses Kronecker’s theorem. Sections 7 and 8 consider structure results for f under the PSD condition, first for \(\Theta _f\)’s and then for \(\Gamma _f\)’s. In the last section, we extend the above results to the corresponding operators on unbounded domains.

2 Review of the Field

A Toeplitz matrix is a matrix that is constant along its diagonals, i.e. the matrix elements satisfy \(a_{k,j}=a_{k+1,j+1}\) for all indices k, j such that both sides are defined. A sequence f such that \(a_{k,j}=f_{k-j}\) is called its generating sequence. Hankel matrices, on the other hand, are constant along the anti-diagonals; \(a_{k,j}=a_{k+1,j-1}\); and the sequence f such that \(a_{k,j}=f_{k+j}\) is called its generating sequence. Naturally, the set of subindices for f depends on whether we are dealing with Hankel or Toeplitz matrices (and also on whether the upper left element is taken as \(a_{1,1}\) or \(a_{0,0}\)), but this is not of importance here and hence we do not specify it.
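In code, the two definitions read as follows (a minimal sketch; the toy generating sequences are our own arbitrary choices):

```python
import numpy as np

# A Toeplitz matrix reads f_{k-j} from its generating sequence,
# a Hankel matrix reads f_{k+j}; toy sequences chosen arbitrarily.
n = 4
fT = {m: float(2 * m + 1) for m in range(-(n - 1), n)}  # indices -(n-1)..n-1
fH = [float(m * m) for m in range(2 * n - 1)]           # indices 0..2n-2
T = np.array([[fT[k - j] for j in range(n)] for k in range(n)])
H = np.array([[fH[k + j] for j in range(n)] for k in range(n)])
# T is constant on diagonals, H on anti-diagonals
toeplitz_ok = all(T[k, j] == T[k + 1, j + 1]
                  for k in range(n - 1) for j in range(n - 1))
hankel_ok = all(H[k, j] == H[k + 1, j - 1]
                for k in range(n - 1) for j in range(1, n))
```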

Suppose that the generating sequence of either a Hankel matrix H or a Toeplitz matrix T (of size \(N\times N\)) is a “discretized exponential polynomial”

$$\begin{aligned} f_n=\sum _{j=1}^J p_j(n)\lambda _j^n \end{aligned}$$
(2.1)

(where \(\lambda _j\in {\mathbb C}\) are distinct) of “cardinality”

$$\begin{aligned} K=\sum _{j=1}^J(\deg p_j +1) \end{aligned}$$
(2.2)

strictly less than N. Based on the theory of Vandermonde matrices, one can show that the rank of either H or T equals K, and that the polynomials \(p_j\) and the \(\lambda _j\)’s are unique. The converse statement is not true; consider for example the Hankel matrix

$$\begin{aligned} \left( \begin{array}{ccccc} 1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 1 \\ \end{array} \right) . \end{aligned}$$
(2.3)

Clearly, the rank is 2 but the generating sequence (1, 0, 0, 0, 0, 0, 0, 0, 1) is not of the form (2.1) with \(J=1\) or 2. However, in terms of applications this is of little consequence, because of the following stronger statement: if T or H has rank \(K<N\) then its generating sequence is “generically” of the form

$$\begin{aligned} f_n=\sum _{k=1}^K c_k\lambda _k^{n}, \end{aligned}$$
(2.4)

a fact which underlies the famous ESPRIT frequency estimation algorithm [46].
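Both the rank claim and the counterexample (2.3) are easy to verify numerically. The sketch below is our own illustration; the nodes \(\lambda_k\) and coefficients are arbitrary (but distinct, respectively non-zero).

```python
import numpy as np

# A Hankel matrix generated by a sum of K exponentials (2.4) has rank K
# when K < N; the nodes lambda_k below are arbitrary but distinct.
N, K = 8, 3
lam = np.array([0.9, -0.5, 0.3 + 0.4j])
c = np.array([1.0, 2.0, 0.5])
f = np.array([np.sum(c * lam ** n) for n in range(2 * N - 1)])
H = np.array([[f[i + j] for j in range(N)] for i in range(N)])
rank_H = np.linalg.matrix_rank(H)

# The counterexample (2.3): rank 2, but not of the form (2.1) with J <= 2.
E = np.zeros((5, 5))
E[0, 0] = E[4, 4] = 1.0
rank_E = np.linalg.matrix_rank(E)
```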

The above claims are certainly well known to specialists, but they are very hard to find in the literature. The book [28], which has two sections devoted entirely to the topic of the rank of finite Toeplitz and Hankel matrices, gives a number of exact theorems relating the rank to the “characteristic” of the corresponding matrix, which is a set of numbers related to when determinants of certain sub-matrices vanish. It is possible to deduce representations of the form (2.1) (under certain additional assumptions) from these results, but this is never stated explicitly. Another viewpoint has been taken by Mourrain et al. [11, 17, 36, 37], in which, loosely speaking, these matrices are analyzed using projective algebraic geometry, and the 1 in the bottom right corner of (2.3) corresponds to the point \(\infty \). The book [41] deals exclusively with the infinite dimensional case, and generalizations thereof. For completeness, we provide in the appendix outlines of proofs of the claims made earlier, based on results in [18, 28].

In either case, the complexity of the theory does not reflect the relatively simple interaction between rank and exponential sums indicated in the introduction. There are however a few exceptions in the discrete setting. Kronecker’s theorem says that for a Hankel operator (i.e. an infinite Hankel matrix acting on \(\ell ^2({\mathbb N})\)), the rank is K if and only if the generating sequence is of the desired form (2.1) (with \(0^0\) defined as 1), with the restriction that \(|\lambda _j|<1\) if one is only interested in bounded operators, see e.g. [13, 29, 30, 41]. Also, such an operator has finite rank and is PSD if and only if the generating sequence is of the form (2.4) with \(c_k>0\) and \(\lambda _k\in (-1,1)\), a result which has also been extended to the multivariable (tensor product) setting [44]. In contrast, there are no non-zero finite rank bounded Toeplitz operators (on \(\ell ^2({\mathbb N})\)). If boundedness is not an issue, then a version of Kronecker’s theorem holds for Toeplitz operators as well [18].

Adding the PSD condition for a Toeplitz matrix yields a simple result which is valid (without exceptions) for finite matrices. This is the essence of what is usually called the Carathéodory-Fejér theorem. The result was used by Pisarenko [43] to construct an algorithm for frequency estimation. Since then, this approach has given rise to many related algorithms, for instance the MUSIC method [47]. We reproduce the statement here for the convenience of the reader. For a proof see e.g. Theorem 12 in [2] or Section 4 in [26]. Other relevant references include [1, 15].

Theorem 2.1

Let T be a finite \({(N+1)}\times {(N+1)}\) Toeplitz matrix with generating sequence \((f_n)_{n=-N}^{N}\). Then T is PSD and \(\mathsf {Rank}~ T=K\le N\) if and only if f is of the form

$$\begin{aligned} f(n)=\sum _{k=1}^Kc_k\lambda _k^n \end{aligned}$$
(2.5)

where \(c_k>0\) and the \(\lambda _k\)’s are distinct and satisfy \(|\lambda _k|=1\).
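One direction of Theorem 2.1 can be checked directly. The sketch below is a numerical illustration with frequencies and coefficients of our own choosing: positive \(c_k\) and distinct unimodular \(\lambda_k\) force T to be PSD of rank K.

```python
import numpy as np

# Theorem 2.1, 'if' direction: f(n) = sum_k c_k lambda_k^n with c_k > 0
# and |lambda_k| = 1 yields a PSD Toeplitz matrix of rank K.
N, K = 7, 3
xi = np.array([0.3, 1.1, 2.5])            # lambda_k = exp(i*xi_k), distinct
c = np.array([1.0, 0.5, 2.0])             # positive coefficients
f = {n: np.sum(c * np.exp(1j * xi * n)) for n in range(-N, N + 1)}
T = np.array([[f[k - j] for j in range(N + 1)] for k in range(N + 1)])
eigs = np.linalg.eigvalsh(T)              # T is Hermitian since f(-n) = conj(f(n))
rank_T = np.linalg.matrix_rank(T)
```

Each term contributes \(c_k v_k v_k^*\) with \(v_k=(\lambda_k^n)_n\), which is PSD of rank one; independence of the \(v_k\) gives rank K.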

The corresponding situation for Hankel matrices H is not as clean, since (2.3) is PSD and has rank 2, but does not fit the model (2.5) with \(c_k>0\) and real \(\lambda _k\)’s. Results of this type seem to go back to Fischer [22], and we will henceforth refer to statements relating the rank of PSD Hankel-type operators to the structure of their generating sequence/function as “Fischer-type theorems” (see e.g. Theorem 5 in [2] or [22]). Corresponding results in the full rank case can be found e.g. in [50].

We end this subsection with a few remarks on the practical use of Theorem 2.1. For a finitely sampled signal, the autocorrelation matrix can be estimated by \(H^*H\) where H is a (not necessarily square) Hankel matrix generated by the signal. This matrix will obviously be PSD, but in general it will not be Toeplitz. However, under the assumption that the \(\lambda _k\)’s in (2.5) are well separated, the contribution from the scalar products of the different terms will be small and might therefore be ignored. Under these assumptions on the data, the matrix \(H^*H\) is PSD and approximately Toeplitz, which motivates the use of the Carathéodory-Fejér theorem as a means to retrieve the \(\lambda _k\)’s.
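The remark can be illustrated as follows; this is a sketch with an assumed test signal (the frequencies and matrix sizes are arbitrary choices). For well-separated frequencies, \(H^*H\) is exactly PSD and nearly constant along its diagonals.

```python
import numpy as np

# H*H for a Hankel matrix H of a signal with well-separated frequencies:
# exactly PSD, and approximately Toeplitz (nearly constant diagonals).
M, n = 200, 6                              # H is M x n (tall, not square)
xi = np.array([0.5, 1.7, 2.9])             # well-separated frequencies
c = np.array([1.0, 1.5, 0.8])
sig = np.array([np.sum(c * np.exp(1j * xi * t)) for t in range(M + n - 1)])
H = np.array([[sig[i + j] for j in range(n)] for i in range(M)])
A = H.conj().T @ H                         # PSD by construction
eig_min = np.linalg.eigvalsh(A).min()
# deviation from Toeplitz structure, measured along each diagonal
dev = max(np.abs(np.diag(A, d) - np.diag(A, d).mean()).max()
          for d in range(-(n - 1), n))
rel_dev = dev / np.abs(A).max()            # small when frequencies separate
```

The cross terms between different frequencies are bounded independently of M, while the Toeplitz part grows like M, so `rel_dev` shrinks as the number of rows increases.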

2.1 Toeplitz and Hankel Operators on the Paley–Wiener Space

The theory in the continuous case is much “cleaner” than in the discrete case. In this section we introduce the integral operator counterpart of Toeplitz and Hankel matrices, and discuss Kronecker’s theorem in this setting.

Given a function f on the interval \([-2,2]\), we define the truncated convolution operator \(\Theta _f:L^2([-1,1])\rightarrow L^2([-1,1])\) by

$$\begin{aligned} \Theta _f(g)(x)=\int _{-1}^{1} f(x-y)g(y)~dy. \end{aligned}$$
(2.6)

Replacing \(x-y\) by \(x+y\) we obtain a truncated correlation operator, which we denote by \(\Gamma _f\). Following [45], we refer to these operators as Toeplitz and Hankel operators on the Paley–Wiener space (although in [3] they were called finite interval convolution/correlation operators). It is easy to see that if we discretize these operators, i.e. replace the integrals by finite sums, then we get Toeplitz and Hankel matrices, respectively. More on this in Sect. 4.1.
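A quick sketch of the discretization claim (with an arbitrary single-exponential generating function of our choosing): replacing the integral in (2.6) by a Riemann sum on a uniform grid yields a matrix with entries \(h\,f(x_i-x_j)\), i.e. a Toeplitz matrix, and for a single exponential it has rank 1.

```python
import numpy as np

# Riemann-sum discretization of the truncated convolution operator (2.6):
# the resulting matrix h * f(x_i - x_j) is Toeplitz, and a single
# exponential generating function gives rank 1.
n = 8
x = np.linspace(-1, 1, n)
h = x[1] - x[0]
f = lambda t: np.exp(0.7 * t)              # arbitrary f on [-2, 2]
T = h * f(x[:, None] - x[None, :])
is_toeplitz = all(np.allclose(np.diag(T, d), T[0, d] if d >= 0 else T[-d, 0])
                  for d in range(-(n - 1), n))
rank_T = np.linalg.matrix_rank(T)
```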

Kronecker’s theorem (as formulated by Rochberg in [45]) then states that \(\mathsf {Rank}~ \Theta _f=K\) (and \(\mathsf {Rank}~ \Gamma _f=K\)) if and only if f is of the form

$$\begin{aligned} f(x)=\sum _{j=1}^J p_j(x) e^{\zeta _j x} \end{aligned}$$
(2.7)

where \(p_j\) are polynomials and \(\zeta _j\in {\mathbb C}\). Moreover, the rank of \(\Theta _f\) (or \(\Gamma _f\)) equals the cardinality

$$\begin{aligned} K=\sum _{j=1}^J(\deg p_j+1). \end{aligned}$$
(2.8)

However, functions of the form

$$\begin{aligned} f(x)=\sum _{k=1}^K c_k e^{\zeta _k x},\quad c_k,\zeta _k\in {\mathbb C}\end{aligned}$$
(2.9)

are known to be dense in the set of all generating functions giving rise to rank K finite interval convolution operators. Hence, the general form (2.7) is hiding the following simpler statement, which often is of practical importance: \(\Theta _f\) generically has rank K if and only if f is a sum of K exponential functions (see the Appendix for an outline of a proof of this claim). The corresponding statement is false in several variables, as shown in [3]. The polynomial factors appear in the limit when two frequencies in (2.9) approach each other and interfere destructively, e.g.

$$\begin{aligned} x=\lim _{\epsilon \rightarrow 0^+}\frac{e^{\epsilon x}-1}{\epsilon }. \end{aligned}$$
(2.10)

This heuristically explains why these factors do not appear once the PSD condition is added, since the functions on the right of (2.10) give rise to one large positive and one large negative eigenvalue.
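The eigenvalue claim behind this heuristic is easy to confirm in the discrete setting (our own toy computation): the Hankel matrix generated by \(f_n=n\), the discrete analogue of the limit in (2.10), has exactly one positive and one negative eigenvalue.

```python
import numpy as np

# The Hankel matrix H[i, j] = i + j (generating sequence f_n = n) has
# rank 2, with one positive and one negative eigenvalue.
N = 6
H = np.add.outer(np.arange(N), np.arange(N)).astype(float)
eigs = np.linalg.eigvalsh(H)
n_pos = int(np.sum(eigs > 1e-9))
n_neg = int(np.sum(eigs < -1e-9))
```

Indeed, \(H = u\mathbf{1}^T + \mathbf{1}u^T\) with \(u=(0,\ldots,N-1)\), and the Cauchy–Schwarz inequality forces the two non-zero eigenvalues to have opposite signs.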

2.2 General Domain Hankel and Toeplitz Integral Operators in Several Variables

Given any (square integrable) function f on an open connected and bounded set \(\Omega \) in \({\mathbb R}^d\), \(d\ge 1\), the natural counterpart to the operator (2.6) is the general domain Toeplitz integral operator defined by

$$\begin{aligned} \Theta _f(g)({x})=\int _{\Upsilon } f({x}-{y})g({y})~d{y},\quad {x}\in \Xi , \end{aligned}$$
(2.11)

where \(\Xi \) and \(\Upsilon \) are connected open bounded sets such that

$$\begin{aligned} \Xi -\Upsilon \subset \Omega . \end{aligned}$$
(2.12)

In [3] such operators are studied (albeit under the name general domain truncated convolution operators), and their finite rank structure is completely characterized. It is easy to see that \(\Theta _f\) has rank K whenever f has the form

$$\begin{aligned} f({x})=\sum _{k=1}^K c_k e^{{\zeta }_k\cdot {x}},\quad {x}\in \Omega , \end{aligned}$$
(2.13)

where the \({\zeta _1},\ldots ,{\zeta }_K\in {\mathbb C}^d\) are assumed to be distinct and all \(c_k\)’s are non-zero.

The reverse direction is, however, not as neat as in the one-dimensional case. It is true that the rank is finite only if f is an exponential polynomial (i.e. the multidimensional analogue of (2.7), see Theorem 4.4 in [3]), but there is no counterpart to the simple formula (2.8). However, Section 5 of [3] gives a complete description of how to determine the rank given the generating function f explicitly, Section 7 gives results on the generic rank based on the degrees of the polynomials that appear in f, as well as lower bounds on the rank, and Section 8 investigates the fact that polynomial coefficients seem to appear more frequently in the multidimensional setting. Section 9 of [3] contains an investigation related to boundedness of these operators in the case of unbounded domains, which we treat briefly in Sect. 9 of the present paper.

If we instead set \(\Omega =\Xi +\Upsilon \), then we may define the general domain Hankel integral operator (called truncated correlation operator in [3])

$$\begin{aligned} \Gamma _f(g)({x})=\int _{\Upsilon } f({x}+{y})g({y})~d{y},\quad {x}\in \Xi . \end{aligned}$$
(2.14)

This is the continuous analogue of finite Hankel (block) matrices. As in the finite dimensional case, there is no real difference between \(\Gamma _f\) and \(\Theta _f\) regarding the finite rank structure. In fact, one turns into the other under composition with the “trivial” operator \(\iota (f)({x})=f(-{x}),\) and thus all statements concerning the rank of one can easily be transferred to the other. We remark however that composition with \(\iota \) does not preserve PSD, and hence separate proofs are needed in this situation. Finally, we remark that the choice \(\Upsilon =\Xi ={\mathbb R}_+^d\) gives what is known as “small Hankel operators”. The study of their boundedness and related topics has received a lot of attention, see e.g. [21, 34, 35].

2.3 Other Multidimensional Versions

The usual multidimensional framework is that of block-Hankel and block-Toeplitz matrices, tensor products, or so called “small Hankel operators” on \(\ell ^2({\mathbb N}^d)\). In all cases, the generating sequence f is forced to live on a product domain. For example, in [52] generating sequences of the form (1.2) (where x is on a discrete grid) are considered, and conditions are given on the size of the block Hankel matrices under which the rank is K, while in [53] it is observed that the natural counterpart of the Carathéodory-Fejér theorem can be lifted by induction to the block Toeplitz setting. For the full rank case, factorizations of these kinds of operators have been investigated in [20, 49]. Extensions to multi-linear algebra are addressed for instance in [38–40]. Rank deficient block Toeplitz matrices also play an important role in [23].

Concerning “small Hankel operators”, in addition to [44] we wish to mention [27] where a formula for actually determining the rank appears, although this is based on reduction over the dimension and hence not suitable for non-product domains.

There is some heuristic overlap between [3] and [24, 25]. In [24], block Hankel matrices with polynomial generating functions are considered, and results concerning their rank are obtained (Theorem 4.6) that overlap with Proposition 5.3, Theorem 7.4 and Proposition 7.7 of [3] in the 2d case. Proposition 7 in [25] is an extension to 2d of Kronecker’s theorem for infinite block Hankel matrices (not truncated), which can be compared with Theorem 4.4 in [3].

In the discrete setting, the work of Mourrain et al. considers a general domain context, and what they call “quasi Toeplitz/Hankel matrices” corresponds to what is here called “general domain Toeplitz/Hankel matrices” (we stick to the latter term since we feel it is more informative). See e.g. Section 3.5 in [37], where such matrices are considered for solving systems of polynomial equations. In [11], discrete multidimensional Hankel operators (not truncated) are studied, and Theorem 5.7 is a description of the rank of such an operator in terms of decompositions of related ideals. Combined with Theorem 7.34 of [17], this result also implies that the generating sequence must be of the form (2.1). (See also Section 3.2 of [36], where similar results are presented.) These results can be thought of as a finite dimensional analogue (for product domains) of Theorem 1.2 and Proposition 1.4 in [3]. Theorem 5.9 gives another condition on certain ideals in order for the generating sequence to be of the simpler type, i.e. the counterpart of (1.2) instead of (1.1). In Sect. 6 of the same paper, conditions are given for when these results apply also to the truncated setting, based on rank preserving extension theorems. Similar results in the one-variable setting are found in Section 3 of [18].

Finally, we remark that the results in this paper concerning finite rank PSD Hankel operators partially overlap with results of [44] and those found in Section 4 of [36], where the formula (2.4) is found in the (non-truncated) discrete environment. In the latter reference, conditions are subsequently provided under which this applies to the truncated setting.

With these remarks we end the review and begin to present the new results of this paper. For the sake of introducing useful notation, it is convenient to start with the result on tensor products, which will be used to “lift” the one-dimensional Carathéodory-Fejér theorem to the multidimensional discrete setting.

3 A Property of Tensor Products

Let \(U_1,\ldots ,U_d\) be finite dimensional linear subspaces of \({\mathbb C}^n\). Then \(\otimes _{j=1}^d U_j\) is a linear subspace of \(\otimes _{j=1}^d {\mathbb C}^n\), and the latter can be identified with the set of \({\mathbb C}\)-valued functions on \(\{1,\ldots ,n\}^d\). Given \(f\in \otimes _{j=1}^d {\mathbb C}^n\) and \(\varvec{x}\in \{1,\ldots ,n\}^d\), we will write \(f(\varvec{x})\) for the corresponding value. For fixed \(\varvec{x}=(x_1,\ldots ,x_{d-1})\in \{1,\ldots ,n\}^{d-1}\) we define vectors

$$\begin{aligned} f_1(\varvec{x})=\Big (f(j ,x_1,\dots , x_{d-1})\Big )_{j=1}^n, f_2(\varvec{x})=\Big (f(x_1,j,x_2, \dots , x_{d-1})\Big )_{j=1}^n,~ etc., \end{aligned}$$

i.e. the vectors obtained from f by fixing all but one variable (and collecting the \(d-1\) fixed variables in \(\varvec{x}\)). We refer to these vectors as “probes” of f. If \(f\in \otimes _{j=1}^d U_j\) then it is easy to see that all probes \(f_j\) of f will be elements of \(U_j\), \(j=1,\ldots ,d\). The following theorem states that the converse is also true.

Theorem 3.1

If all possible probes \(f_j(\varvec{x})\) of a given \(f\in \otimes _{j=1}^d {\mathbb C}^n\) lie in \(U_j\), then \(f\in \otimes _{j=1}^d U_j\).

Proof

First consider the case \(d=2\). Let \(V\subset \otimes _{j=1}^2 {\mathbb C}^n\) consist of all f with the property stated in the theorem. This is obviously a linear subspace and \(U_1\otimes U_2\subset V\). If we do not have equality, we can pick a non-zero f in V which is orthogonal to \(U_1\otimes U_2\). At least one probe \(f_{1}(k)\) must then be a non-zero element \(u_1\) of \(U_1\). Given any \(u_2\in U_2\) we have

$$\begin{aligned} \langle u_1\otimes u_2, f \rangle =\sum _{j=1}^n u_{2,j}\langle u_1 ,f_1(j)\rangle = \Bigg \langle u_2, \sum _{i=1}^n \overline{u_{1,i}} f_2(i) \Bigg \rangle . \end{aligned}$$
(3.1)

From the middle representation and the choice of \(u_1\), we see that at least one value of the vector \(\sum _{i=1}^n \overline{u_{1,i}} f_2(i)\) is non-zero. Moreover, this vector is a linear combination of probes \(f_2(i)\), and hence an element of \(U_2\). But then we can pick \(u_2\in U_2\) such that the scalar product (3.1) is non-zero, contradicting the choice of f. The theorem is thus proved in the case \(d=2\).

The general case now easily follows by induction on the dimension, noting that \(\otimes _{j=1}^d {\mathbb C}^n\) can be identified with \({\mathbb C}^n\otimes (\otimes _{j=2}^{d} {\mathbb C}^n)\) and that \(\otimes _{j=1}^d U_j\) under this identification turns into \(U_1\otimes (\otimes _{j=2}^{d} U_j)\). \(\square \)

4 General Domain Toeplitz and Hankel Operators and Matrices

The operators in the title arise as discretizations of general domain Toeplitz/Hankel integral operators. These become “summing operators”, which can be represented as matrices in several ways, as we describe in the next section.

4.1 Discretization

For simplicity of notation, we here discretize using an integer grid, since grids with other sampling lengths (these are considered in Sect. 6.1) can be obtained by first dilating the respective domains. Let \(\Xi ,\Upsilon \) be any open connected and bounded domains in \({\mathbb R}^d\), and let f be a bounded function defined on \(\Omega =\Xi -\Upsilon \). We will throughout the paper use bold symbols for discrete objects, and normal font for their continuous analogues. Set

$$\begin{aligned} \varvec{\Upsilon }=\{\varvec{x}\in {\mathbb Z}^d:~\varvec{x}\in {\Upsilon }\}, \end{aligned}$$

make the analogous definition for \(\Xi /\varvec{\Xi }\) and define \(\varvec{\Omega }=\varvec{\Xi }-\varvec{\Upsilon }.\) We let \(\varvec{\Theta }_{f}\) denote what we call a general domain Toeplitz (summing) operator

$$\begin{aligned} \varvec{\Theta }_{f}(g)(\varvec{x})=\sum _{\varvec{y}\in \varvec{\Upsilon }}f(\varvec{x}-\varvec{y})g(\varvec{y}),\quad \varvec{x}\in \varvec{\Xi }, \end{aligned}$$
(4.1)

where g is an arbitrary function on \(\varvec{\Upsilon }\). We will talk of \(\varvec{\Theta }_f\) as a discretization of the corresponding integral operator \(\Theta _f\) introduced in Sect. 2.2; more on this in Sect. 6.1.

We may of course represent g as a vector, by ordering the entries in some (non-unique) way. More precisely, by picking any bijection

$$\begin{aligned} o_y:\{1,\ldots ,|\varvec{\Upsilon }|\}\rightarrow \varvec{\Upsilon } \end{aligned}$$
(4.2)

we can identify g with the vector \(\tilde{g}\) given by

$$\begin{aligned} (\tilde{g}_j)_{j=1}^{|\varvec{\Upsilon }|}=g(o_y(j)). \end{aligned}$$

Letting \(o_x\) be an analogous bijection for \(\varvec{\Xi }\), it is clear that \(\varvec{\Theta }_{f}\) can be represented as a matrix whose (i, j)’th element is \(f(o_x(i)-o_y(j))\). Such matrices will be called “general domain Toeplitz matrices”, see Fig. 2 for a small scale example. We make analogous definitions for \(\Gamma _f\) and denote the corresponding discrete operator by \(\varvec{\Gamma }_f\). We refer to this as a “general domain Hankel (summing) operator” and to its matrix realization as a “general domain Hankel matrix”. An example of such a matrix is shown in Fig. 1.
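The matrix realization just described can be sketched in a few lines (our own illustration; the triangular index set, the lexicographic ordering and the generating function are arbitrary choices):

```python
import numpy as np

# General domain Toeplitz matrix: entry (i, j) is f(o_x(i) - o_y(j)),
# where o_x, o_y order the index sets Xi and Upsilon.
Xi = [(i, j) for i in range(3) for j in range(3) if i + j <= 2]  # a triangle
Upsilon = Xi
f = lambda w: np.exp(0.3 * w[0] - 0.2 * w[1])   # single exponential on Omega
o_x = sorted(Xi)                                # lexicographic ordering
o_y = sorted(Upsilon)
T = np.array([[f((xi[0] - yj[0], xi[1] - yj[1])) for yj in o_y] for xi in o_x])
rank_T = np.linalg.matrix_rank(T)               # single exponential: rank 1
```

Any bijection would do in place of the lexicographic ordering; different orderings give permutation-equivalent matrices with the same rank.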

Fig. 2

Left Domains \(\varvec{\Xi }\), \(\varvec{\Upsilon }\), and \(\varvec{\Omega }=\varvec{\Xi }-\varvec{\Upsilon }\) with the points numbered lexicographically. Right Illustration of where the numbered points in \(\varvec{\Omega }\) show up in the corresponding matrix realization of \(\varvec{\Theta }_f\)

4.2 Block Toeplitz and Hankel Matrices

If we let \(\varvec{\Xi }\) and \(\varvec{\Upsilon }\) be multi-cubes and the ordering bijections be the lexicographical order, then the matrix realization \(\varvec{\Theta }_f\) of (4.1) becomes a block Toeplitz matrix. These are thus special cases of the more general operators considered above. Similarly, block Hankel matrices arise when representing \(\varvec{\Gamma }_f\) in the same way.

For demonstration we consider \(\varvec{\Xi }=\varvec{\Upsilon }=\{-1,0,1\}^3\) so \(\varvec{\Omega }=\{-2,\ldots ,2\}^3\). The lexicographical order then orders \(\{-1,0,1\}^3\) as

$$\begin{aligned} (-1,-1,-1),~(-1,-1,0),~(-1,-1,1),~(-1,0,-1),~(-1,0,0),\ldots ,(1,1,1). \end{aligned}$$

The matrix-realization T of a multidimensional Toeplitz operator \(\varvec{\Theta }_f\) then becomes

$$\begin{aligned} T=\left( \begin{array}{lllllllll} T_{f_3(0,0)} &{} T_{f_3(0,-1)} &{} T_{f_3(0,-2)} &{} T_{f_3(-1,0)} &{} T_{f_3(-1,-1)} &{} T_{f_3(-1,-2)} &{} T_{f_3(-2,0)} &{} T_{f_3(-2,-1)} &{} T_{f_3(-2,-2)} \\ T_{f_3(0,1)} &{} T_{f_3(0,0)} &{} T_{f_3(0,-1)} &{} T_{f_3(-1,1)} &{} T_{f_3(-1,0)} &{} T_{f_3(-1,-1)} &{} T_{f_3(-2,1)} &{} T_{f_3(-2,0)} &{} T_{f_3(-2,-1)} \\ T_{f_3(0,2)} &{} T_{f_3(0,1)} &{} T_{f_3(0,0)} &{} T_{f_3(-1,2)} &{} T_{f_3(-1,1)} &{} T_{f_3(-1,0)} &{} T_{f_3(-2,2)} &{} T_{f_3(-2,1)} &{} T_{f_3(-2,0)} \\ T_{f_3(1,0)} &{} T_{f_3(1,-1)} &{} T_{f_3(1,-2)} &{} T_{f_3(0,0)} &{} T_{f_3(0,-1)} &{} T_{f_3(0,-2)} &{} T_{f_3(-1,0)} &{} T_{f_3(-1,-1)} &{} T_{f_3(-1,-2)} \\ T_{f_3(1,1)} &{} T_{f_3(1,0)} &{} T_{f_3(1,-1)} &{} T_{f_3(0,1)} &{} T_{f_3(0,0)} &{} T_{f_3(0,-1)} &{} T_{f_3(-1,1)} &{} T_{f_3(-1,0)} &{} T_{f_3(-1,-1)} \\ T_{f_3(1,2)} &{} T_{f_3(1,1)} &{} T_{f_3(1,0)} &{} T_{f_3(0,2)} &{} T_{f_3(0,1)} &{} T_{f_3(0,0)} &{} T_{f_3(-1,2)} &{} T_{f_3(-1,1)} &{} T_{f_3(-1,0)} \\ T_{f_3(2,0)} &{} T_{f_3(2,-1)} &{} T_{f_3(2,-2)} &{} T_{f_3(1,0)} &{} T_{f_3(1,-1)} &{} T_{f_3(1,-2)} &{} T_{f_3(0,0)} &{} T_{f_3(0,-1)} &{} T_{f_3(0,-2)} \\ T_{f_3(2,1)} &{} T_{f_3(2,0)} &{} T_{f_3(2,-1)} &{} T_{f_3(1,1)} &{} T_{f_3(1,0)} &{} T_{f_3(1,-1)} &{} T_{f_3(0,1)} &{} T_{f_3(0,0)} &{} T_{f_3(0,-1)} \\ T_{f_3(2,2)} &{} T_{f_3(2,1)} &{} T_{f_3(2,0)} &{} T_{f_3(1,2)} &{} T_{f_3(1,1)} &{} T_{f_3(1,0)} &{} T_{f_3(0,2)} &{} T_{f_3(0,1)} &{} T_{f_3(0,0)} \\ \end{array} \right) \end{aligned}$$

where e.g.

$$\begin{aligned} T_{f_3(0,0)}=\left( \begin{array}{ccc} f(0,0,0) &{} f(0,0,-1) &{} f(0,0,-2) \\ f(0,0,1) &{} f(0,0,0) &{} f(0,0,-1) \\ f(0,0,2) &{} f(0,0,1) &{} f(0,0,0) \\ \end{array} \right) \end{aligned}$$

Note that this matrix has a Toeplitz structure on 3 levels: each \(3\times 3\) block \(T_{f_3(\cdot ,\cdot )}\) is Toeplitz, and these blocks are in turn arranged in a two-level \(3\times 3\) Toeplitz pattern.

4.3 Exponential Sums

We pause the general development to note some standard facts that will be needed in what follows. Fix \(N\in {\mathbb N}\), and for \(j=1,\ldots ,d\) let \(\Phi _j\) be a set of at most 2N numbers in \({\mathbb C}\). Set \({\Phi }=\Phi _1\times \cdots \times \Phi _d\).

Proposition 4.1

The set \(\{e^{{\zeta }\cdot \varvec{x}}:~{\zeta }\in {\Phi }\}\) is linearly independent, considered as a set of functions on \(\{-N,\ldots ,N\}^d\).

Proof

If \(d=1\), the result is standard, see e.g. Proposition 1.1 in [18] or [12, Sec. 3.3]. For \(d>1\), the function \(e^{{\zeta }\cdot \varvec{x}}=e^{{\zeta }_1\varvec{x}_1}\cdots e^{{\zeta }_d\varvec{x}_d}\) is an elementary tensor. The desired conclusion now follows from the \(d=1\) case and standard tensor product theory. \(\square \)

We now set \(\varvec{\Upsilon }=\varvec{\Xi }=\{-N,\ldots ,N\}^d\), and let \(\varvec{\Omega }=\{-2N,\ldots ,2N\}^d\) in accordance with Sect. 4.1. Consider functions f on \(\varvec{\Omega }\) given by

$$\begin{aligned} f(\varvec{x})=\sum _{k=1}^K c_k e^{{\zeta }_k\cdot \varvec{x}}. \end{aligned}$$
(4.3)

We say that the representation (4.3) is reduced if all \({\zeta }_k\)’s are distinct and the corresponding coefficients \(c_k\) are non-zero.

Proposition 4.2

Let \({\Phi }\) be as before. Let the function f on \(\{-2N,\ldots ,2N\}^d\) be of the reduced form (4.3) where each \({\zeta }_k\) is an element of \({\Phi }\). Then

$$\begin{aligned} \mathsf {Rank}~ \varvec{\Theta }_f=\mathsf {Rank}~ \varvec{\Gamma }_f=K. \end{aligned}$$

Proof

Pick a fixed \({\zeta }\in {\mathbb C}^d\) and consider \(f(\varvec{x})=e^{{\zeta }\cdot \varvec{x}}\); then

$$\begin{aligned} \varvec{\Theta }_{f}(g)(\varvec{x})=\sum _{\varvec{y}\in \varvec{\Upsilon }}e^{{\zeta }\cdot \varvec{x}}e^{-{\zeta }\cdot \varvec{y}}g(\varvec{y})=e^{\zeta \cdot \varvec{x}}\langle g, \overline{e^{-{\zeta }\cdot \varvec{y}}}\rangle , \end{aligned}$$
(4.4)

which has rank 1. For a general f of the form (4.3) the rank will thus be less than or equal to K. But Proposition 4.1 implies that the set \(\{e^{{\zeta }_k\cdot \varvec{x}}\}_{k=1}^K\) is linearly independent as functions on \(\varvec{\Xi }\). Thus the rank is precisely K, as desired. The argument for \(\varvec{\Gamma }_f\) is analogous. \(\square \)
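Proposition 4.2 can be confirmed numerically. The sketch below (an illustrative addition; exponents and coefficients are random choices) forms a reduced 3-term sum (4.3) and checks that both the discrete Toeplitz matrix, with entries \(f(\varvec{x}-\varvec{y})\), and its Hankel counterpart, with entries \(f(\varvec{x}+\varvec{y})\), have rank exactly \(K=3\):

```python
import itertools
import numpy as np

N, d, K = 2, 2, 3
rng = np.random.default_rng(1)
# random distinct exponents and non-zero coefficients (illustration only)
zetas = [rng.uniform(-0.5, 0.5, d) + 1j * rng.uniform(-np.pi, np.pi, d)
         for _ in range(K)]
c = rng.uniform(0.5, 2.0, K)

def f(v):
    # a reduced exponential sum of the form (4.3)
    v = np.asarray(v)
    return sum(ck * np.exp(zk @ v) for ck, zk in zip(c, zetas))

grid = list(itertools.product(range(-N, N + 1), repeat=d))
Theta = np.array([[f(np.subtract(x, y)) for y in grid] for x in grid])
Gamma = np.array([[f(np.add(x, y)) for y in grid] for x in grid])
assert np.linalg.matrix_rank(Theta) == K
assert np.linalg.matrix_rank(Gamma) == K
```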

We end this section with a technical observation concerning the one-variable case.

Proposition 4.3

Let f be a vector of length \(m>n+1\) and let \(K<n\). Let \(\zeta _1,\ldots ,\zeta _K\) be fixed and suppose that each sub-vector of f of length \(n+1\) can be written in the form (4.3). Then f can be written in this form as well.

Proof

Consider two adjacent sub-vectors, which overlap in a set of length n. Since \(K<n\), the representation (4.3) is unique on this overlap by Proposition 4.1, so the representations of the two sub-vectors coincide there and combine to one representation on the union. The result now follows by moving through all adjacent sub-vectors. \(\square \)
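The proof idea can be rendered numerically: with the \(\zeta _k\) fixed, the coefficients recovered from any window of length \(n+1>K\) are unique, so adjacent windows must yield the same coefficients. A minimal sketch (an illustrative addition; the particular exponents and coefficients are arbitrary choices):

```python
import numpy as np

zetas = np.array([0.3, -0.2, 0.0])        # K = 3 fixed, distinct exponents
c_true = np.array([1.0, 2.0, -1.5])
m, n = 12, 5                               # vector length m > n + 1, K < n
x = np.arange(m)
f = np.exp(np.outer(x, zetas)) @ c_true    # f of the form (4.3) on 0,...,m-1

def coeffs(window):
    # least-squares fit of the fixed exponentials on one sub-vector
    V = np.exp(np.outer(window, zetas))
    return np.linalg.lstsq(V, f[window], rcond=None)[0]

w1, w2 = x[0:n + 1], x[1:n + 2]            # adjacent windows, overlap length n
assert np.allclose(coeffs(w1), coeffs(w2))  # representations agree
assert np.allclose(coeffs(w1), c_true)
```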

5 The Multidimensional Discrete Carathéodory-Fejér Theorem

Throughout this section, let \(\varvec{\Upsilon },~\varvec{\Xi }\) and \(\varvec{\Omega }\) be as in Sects. 4.2 and 4.3, i.e. multi-cubes centered at 0. The following theorem was first observed in [53], with a completely different proof.

Theorem 5.1

Set \(\varvec{\Xi }=\varvec{\Upsilon }=\{-N,\ldots ,N\}^d\). Given f on \(\{-2N,\ldots ,2N\}^d\), suppose that \(\varvec{\Theta }_f\) is PSD and has rank K where \(K\le 2N\). Then f can be written as

$$\begin{aligned} f(\varvec{x})=\sum _{k=1}^K c_k e^{i {\xi }_k\cdot \varvec{x}} \end{aligned}$$
(5.1)

where \(c_k>0\) and \({\xi }_k\in {\mathbb R}^d\) are distinct and unique. Conversely, if f has this form then \(\varvec{\Theta }_f\) is PSD with rank K.
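The converse direction of Theorem 5.1 is easy to test numerically. The sketch below (an illustrative addition; frequencies and coefficients are random choices) builds \(\varvec{\Theta }_f\) for an f of the form (5.1) with distinct real frequencies and positive coefficients, and verifies that it is PSD of rank K:

```python
import itertools
import numpy as np

N, d, K = 2, 2, 3
rng = np.random.default_rng(2)
xis = [rng.uniform(-np.pi, np.pi, d) for _ in range(K)]  # real frequencies
c = rng.uniform(0.5, 2.0, K)                              # positive weights

def f(v):
    # f of the form (5.1)
    v = np.asarray(v)
    return sum(ck * np.exp(1j * xk @ v) for ck, xk in zip(c, xis))

grid = list(itertools.product(range(-N, N + 1), repeat=d))
Theta = np.array([[f(np.subtract(x, y)) for y in grid] for x in grid])
eig = np.linalg.eigvalsh(Theta)   # Hermitian, since f(-v) = conj(f(v))
assert eig.min() > -1e-9          # positive semi-definite
assert np.sum(eig > 1e-9) == K    # rank K
```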

The proof is based on the following simple observation about PSD matrices. Let \(\mathsf {Ran}~ A\) denote the range of a matrix A, and \(\mathsf {Ker}~ A\) its kernel.

Proposition 5.2

Let

$$\begin{aligned} \left( \begin{array}{cc} A &{} B \\ B^* &{} C \\ \end{array} \right) \end{aligned}$$
(5.2)

be a PSD matrix, where A, B, C are matrices (with dimensions compatible with (5.2)). Then

$$\begin{aligned} \mathsf {Ran}~ B\subset \mathsf {Ran}~ A. \end{aligned}$$

Proof

Note that the orthogonal complement of \(\mathsf {Ran}~ B\) equals \(\mathsf {Ker}~ B^*\). Since \(A=A^*\) it suffices to show that \(\mathsf {Ker}~ A\subset \mathsf {Ker}~ B^*\). Suppose that this is not the case and let \(x\in \mathsf {Ker}~ A\) be such that \(B^*x=y\ne 0.\) For \(t\in {\mathbb R}\) arbitrary we have

$$\begin{aligned} \left\langle \left( \begin{array}{cc} A &{} B \\ B^* &{} C \\ \end{array} \right) \left( \begin{array}{c} x \\ ty \\ \end{array} \right) ,\left( \begin{array}{c} x \\ ty \\ \end{array} \right) \right\rangle =2t\mathsf {Re}~ \langle B^*x,y\rangle +t^2\langle Cy,y\rangle =2t\Vert y\Vert ^2+t^2\langle Cy,y\rangle \end{aligned}$$

Since \(y\ne 0\) this expression takes negative values for some t, which is a contradiction. \(\square \)
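The range inclusion of Proposition 5.2 can be observed numerically, and the rank-deficient case is the interesting one. A minimal sketch (an illustrative addition; the Gram construction is an arbitrary way to produce a PSD matrix):

```python
import numpy as np

# A random PSD matrix M = G G^T whose upper-left block A is rank deficient:
# G has only 2 columns, so the 3x3 block A has rank 2.
rng = np.random.default_rng(3)
G = rng.standard_normal((7, 2))
M = G @ G.T
A, B = M[:3, :3], M[:3, 3:]

# Ran B ⊂ Ran A: every column of B is (numerically) solvable as A z = b
for b in B.T:
    z = np.linalg.lstsq(A, b, rcond=None)[0]
    assert np.allclose(A @ z, b, atol=1e-8)
```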

Proof of Theorem 5.1

First assume that \(\varvec{\Theta }_f\) is PSD and has rank K. Let T be a block Toeplitz representation of \(\varvec{\Theta }_f\), as described in Sect. 4.2. Recall that the Toeplitz matrix \(T_{f_d(\varvec{0})}\) is the \((2N+1)\times (2N+1)\) sub-matrix on the diagonal of T (where \(\varvec{0}\in {\mathbb Z}^{d-1}\)). This is clearly PSD and of some rank \(J_d\le K\), so by the classical Carathéodory-Fejér theorem (Theorem 2.1), \(f_d(\varvec{0})\) can be represented by

$$\begin{aligned} f_d(\varvec{0})(x)=\sum _{k=1}^{J_d} c_k e^{i\xi _k^d x},\quad x\in \{-2N,\ldots , 2N\} \end{aligned}$$
(5.3)

with \(\xi _k^d\in {\mathbb R}\). We identify functions on \(\{-N,\ldots ,N\}\) with \({\mathbb C}^{2N+1}\) in the obvious way, and define \(U_d\subset {\mathbb C}^{2N+1}\) by

$$\begin{aligned} U_d=\mathsf {Span}~ \{e^{i\xi _1^d \varvec{x}},\ldots ,e^{i\xi _{J_d}^d\varvec{x}}\}. \end{aligned}$$

The analogous subspace of \({\mathbb C}^{4N+1}\) will be called \(U_d^{ext}\). Note that \(f_d(\varvec{0})\in U_d^{ext}\) by (5.3) and that

$$\begin{aligned} \mathsf {Ran}~ T_{f_d(\varvec{0})}=U_d, \end{aligned}$$
(5.4)

which follows easily from the proof of Proposition 4.2. Set \(\Phi _d=\{\xi _1^d,\ldots ,\xi _{J_d}^d\}\).

Fix \(\varvec{j}\in \{-2N,\ldots ,2N\}^{d-1}\) with \(\varvec{j}\le \varvec{0}\) in the lexicographical order. By restricting T to a suitable subspace, it is clear that the matrix

$$\begin{aligned} \left( \begin{array}{cc} T_{f_d(\varvec{0})} &{} T_{f_d(\varvec{j})} \\ T_{f_d(\varvec{-j})} &{} T_{f_d(\varvec{0})} \\ \end{array} \right) \end{aligned}$$

is PSD (see the example in Sect. 4.2). Hence \(T_{f_d(\varvec{j})}^*=T_{f_d(-\varvec{j})}\) and Proposition 5.2 and (5.4) then imply that \(\mathsf {Ran}~ T_{f_d(\varvec{j})}\subset U_d\). It follows that all sub-vectors of \({f_d(\varvec{j})}\) of length \(2N+1\) are in \(U_d\), and thus

$$\begin{aligned} {f_d(\varvec{j})}\in U_d^{ext} \end{aligned}$$
(5.5)

by Proposition 4.3. Moreover the relation \(T_{f_d(\varvec{j})}^*=T_{f_d(-\varvec{j})}\) implies that \({f_d(-\varvec{j})}=\overline{\text {flip}~f_d(\varvec{j})}\) where the operation \(\text {flip}~ v\) reverses the order of the vector v. Since \(\overline{\text {flip}~U_d^{ext}}=U_d^{ext}\), it follows that (5.5) holds for all \(\varvec{j}\).

By choosing a different ordering and repeating the above argument, we conclude that for each l (\(1\le l\le d\)), there is a corresponding subspace \(U_l^{ext}\) of dimension \(J_l\le K\) such that all possible probes \(f_l(\cdot )\) are in \(U_{l}^{ext}\). Let \({\xi }_k\in {\mathbb R}^d\) be an enumeration of all \(J=J_1J_2\dots J_d\) multi-frequencies \(\Phi _1\times \cdots \times \Phi _d\). The corresponding J exponential functions span \(\otimes _{j=1}^d U_j\). By Theorem 3.1 we can thus write

$$\begin{aligned} f(\varvec{x})=\sum _{k=1}^{J} c_k e^{i {\xi }_k\cdot \varvec{x}}. \end{aligned}$$
(5.6)

However, by Proposition 4.2, precisely K of the coefficients \(c_k\) are non-zero. This is (5.1). The uniqueness of the multi-frequencies is immediate by Proposition 4.1 (applied with \(N:=2N\)). The linear independence of these functions also gives that the coefficients are unique. To see that \(c_k\) is positive (\(1\le k\le K\)), just pick a function g on \(\varvec{\Xi }\) which is orthogonal to all other \(e^{i {\xi }_j\cdot \varvec{x}}\), \(j\ne k\). Using the formula (4.4) it is easy to see that

$$\begin{aligned} 0\le \langle \varvec{\Theta }_{f}(g),g\rangle =c_k|\langle g,e^{i {\xi }_k\cdot \varvec{x}}\rangle |^2, \end{aligned}$$
(5.7)

and the first statement is proved.

For the converse, let f be of the form (5.1). Then \(\varvec{\Theta }_f\) has rank K by Proposition 4.2 and the PSD property follows by the fact that

$$\begin{aligned} 0\le \langle \varvec{\Theta }_{f}(g),g\rangle =\sum _{k=1}^Kc_k|\langle g,e^{i {\xi }_k\cdot \varvec{x}}\rangle |^2, \end{aligned}$$
(5.8)

in analogy with (5.7). \(\square \)

It is possible to extend this result to more general domains, as considered in Sect. 4.1. However, such extensions come with technical conditions that are not needed in the continuous case. Moreover, in the next section we will show that the discretizations of Sect. 4.1 capture the essence of their continuous counterparts, given sufficiently dense sampling. For these reasons we content ourselves with stating such extensions in the continuous case; see Sect. 7.

The above proof could also be modified to apply to block Hankel matrices, but since Fischer’s theorem requires preconditions to rule out exceptional cases, the result is not as neat. (It does, however, provide alternative proofs of the results in [44] concerning small Hankel operators.) Again, we present only the cleaner continuous version here; see Sect. 8.

6 The Multidimensional Discrete Kronecker Theorem

If we want to imitate the proof of Theorem 5.1 in Kronecker’s setting, i.e. without the PSD assumption, then we have to replace (5.3) (a sum of exponentials) with (2.7) (a sum of exponentials with polynomial coefficients). With suitable modifications, the whole argument goes through up until (5.6), where now the \(\xi _k\)’s can lie in \({\mathbb C}^d\) and the \(c_k\) can be polynomials. However, the key step of reducing the (J-term) representation (5.6) to the (K-term) representation (5.1), via Proposition 4.2, fails. Thus, the only conclusion we can draw is that f has a representation

$$\begin{aligned} f(\varvec{x})=\sum _{j=1}^J p_j(\varvec{x}) e^{{\zeta }_j\cdot \varvec{x}},\quad \varvec{x}\in \Omega , \end{aligned}$$
(6.1)

where \(J\le K\), but we have very little information on the number of terms in each \(p_j\). This is a fundamental difference compared to before. In [3], examples are presented of general domain Hankel and Toeplitz integral operators whose generating function is a single polynomial p, where \(\Gamma _p\) has rank K much lower than the number of monomials needed to represent p. It is also not the case that these polynomials are necessarily limits of functions of the form (2.13) (in a similar way as (2.10)), and hence we cannot dismiss these polynomials as “exceptional”. To obtain similar examples in the finite dimensional setting considered here, one can simply discretize the corresponding \(\Gamma _p\) found in [3] (as described in Sect. 4.1).

Nevertheless, in the continuous setting (i.e. for operators of the form \(\Theta _f\) and \(\Gamma _f\), cf. (2.11) and (2.14)) the correspondence between the rank and the structure of f is resolved in [3]. In particular, it is shown that either of these operators has finite rank if and only if f is an exponential polynomial, and that the rank equals K if f is of the (reduced) form

$$\begin{aligned} f=\sum _{k=1}^Kc_ke^{\zeta _k\cdot x}. \end{aligned}$$
(6.2)

We now show that these results apply also in the discrete setting, given that the sampling is sufficiently dense. For simplicity of notation, we only consider the case \(\Gamma _f\) from now on, but include the corresponding results for \(\Theta _f\) in the main theorems.

6.1 Discretization

Let bounded open domains \(\Upsilon ,~\Xi \) be given, and let \(l>0\) be a sampling length parameter. Set

$$\begin{aligned} \varvec{\Upsilon }_l=\{\varvec{n}l\in \Upsilon :~\varvec{n}\in {\mathbb Z}^d\}, \end{aligned}$$

(cf. (4.1)), make the analogous definition for \(\varvec{\Xi }_l\) and define \(\varvec{\Omega }_l=\varvec{\Upsilon }_l+\varvec{\Xi }_l.\) We denote the cardinality of \(\varvec{\Upsilon }_l\) by \(|\varvec{\Upsilon }_l|\), and we define \(\ell ^2(\varvec{\Upsilon }_l)\) as the Hilbert space of all functions g on \(\varvec{\Upsilon }_l\) with norm

$$\begin{aligned} \Vert g\Vert _{\ell ^2}=\sqrt{\sum _{\varvec{y}\in \varvec{\Upsilon }_l}|g({\varvec{y}})|^2}. \end{aligned}$$

We let \(\varvec{\Gamma }_{f,l}:\ell ^2(\varvec{\Upsilon }_l)\rightarrow \ell ^2(\varvec{\Xi }_l)\) denote the summing operator

$$\begin{aligned} \varvec{\Gamma }_{f,l}(g)(\varvec{x})=\sum _{\varvec{y}\in \varvec{\Upsilon }_l}f(\varvec{x}+\varvec{y})g(\varvec{y}),\quad \varvec{x}\in \varvec{\Xi }_l. \end{aligned}$$

When l is understood from the context, we will usually omit it from the notation to simplify the presentation. It clearly does not matter whether f is defined on \(\Omega \) or on \(\varvec{\Xi }_l+\varvec{\Upsilon }_l\), and we use the same notation in both cases. We define \(\varvec{\Theta }_{f,l}\) in the obvious analogous manner. Note that in Sects. 4 and 5 we worked with \(\varvec{\Theta }_{f}\), which in the new notation becomes \(\varvec{\Theta }_{f,1}\).

Proposition 6.1

There exists a constant \(C>0\), depending only on \(\Xi \), such that

$$\begin{aligned} \Vert \varvec{\Gamma }_{f,l}\Vert \le Cl^{-d/2}\Vert f\Vert _{\ell ^2(\varvec{\Omega }_l)}. \end{aligned}$$

Proof

By the Cauchy–Schwarz inequality we clearly have

$$\begin{aligned} |\varvec{\Gamma }_{f,l}(g)(\varvec{x})|\le \Vert f\Vert _{\ell ^2(\varvec{\Omega }_l)}\Vert g\Vert _{\ell ^2(\varvec{\Upsilon }_l)} \end{aligned}$$

for each \(\varvec{x}\in \varvec{\Xi }_l\). If we let \(|\varvec{\Xi }_l|\) denote the number of elements in this set, it follows that

$$\begin{aligned} \Vert \varvec{\Gamma }_{f,l}\Vert \le \Vert f\Vert _{\ell ^2(\varvec{\Omega }_l)}|\varvec{\Xi }_l|^{1/2}. \end{aligned}$$

Since \(\Xi \) is a bounded set, it is clear that \(|\varvec{\Xi }_l|l^d\) is bounded by some constant, and hence the result follows. \(\square \)
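The bound used in this proof, \(\Vert \varvec{\Gamma }_{f,l}\Vert \le \Vert f\Vert _{\ell ^2(\varvec{\Omega }_l)}|\varvec{\Xi }_l|^{1/2}\), can be observed numerically. A one-dimensional sketch with \(\Upsilon =\Xi =(0,1)\) (an illustrative addition; the generating function f is an arbitrary choice):

```python
import numpy as np

# d = 1, Upsilon = Xi = (0,1), Omega = (0,2); check the Cauchy-Schwarz bound
# ||Gamma_{f,l}|| <= |Xi_l|^{1/2} ||f||_{l^2(Omega_l)}, which is O(l^{-1/2}).
f = lambda t: np.exp(-t) * np.cos(5 * t)   # arbitrary generating function
for l in [0.1, 0.02, 0.005]:
    n_max = int(round(1 / l)) - 1
    ups = np.arange(1, n_max + 1) * l      # Upsilon_l = {l, 2l, ..., n_max l}
    Gamma = f(ups[:, None] + ups[None, :]) # kernel f(x + y)
    omega = np.arange(2, 2 * n_max + 1) * l  # Omega_l = Upsilon_l + Xi_l
    opnorm = np.linalg.norm(Gamma, 2)      # operator (spectral) norm
    assert opnorm <= np.sqrt(len(ups)) * np.linalg.norm(f(omega)) + 1e-12
```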

Theorem 6.2

Let \(\Xi \) and \(\Upsilon \) be bounded open and connected and let f be a continuous function on \(\Xi +\Upsilon \). Then

$$\begin{aligned} \mathsf {Rank}~ \varvec{\Gamma }_{f,l}\le \mathsf {Rank}~ \Gamma _f. \end{aligned}$$

Similarly, \(\mathsf {Rank}~ \varvec{\Theta }_{f,l}\le \mathsf {Rank}~ \Theta _f\) for any continuous f on \(\Xi -\Upsilon \).

Proof

Given \({\varvec{y}}\in \varvec{\Upsilon }_l\) and \(t\le l\), let \(C_{\varvec{y}}^{l,t}\) denote the multi-cube with center \({\varvec{y}}\) and sidelength t, i.e. \(C_{\varvec{y}}^{l,t}=\{y\in {\mathbb R}^d:~|y-\varvec{y}|_{\infty }<t/2\}\), where \(|\cdot |_{\infty }\) denotes the supremum norm in \({\mathbb R}^d\). Choose \(t_0\) such that \(\sqrt{d}t_0/2<\mathsf {dist} (\varvec{\Upsilon }_l,\partial \Upsilon )\). For \(t<t_0\), the set \(\{e^{l,t}_{\varvec{y}}\}_{{\varvec{y}}\in \varvec{\Upsilon }_l}\) defined by \(e^{l,t}_{\varvec{y}}=t^{-d/2}\mathbf 1 _{C_{\varvec{y}}^{l,t}}\) is then orthonormal in \(L^2(\Upsilon )\). We make analogous definitions for \(\varvec{\Xi }_l\). Clearly \(\ell ^2({\varvec{\Upsilon }_l})\) is in bijective correspondence with \(\mathsf {Span}~ \{e^{l,t}_{\varvec{y}}\}_{{\varvec{y}}\in \varvec{\Upsilon }_l}\) via the canonical map \(P^{l,t}\), i.e. \(P^{l,t}(\delta _{\varvec{y}})=e^{l,t}_{\varvec{y}}\), where \(\delta _{\varvec{y}}\) is the Kronecker \(\delta \)-function. Let \(Q^{l,t}\) denote the corresponding map \(Q^{l,t}:\ell ^2({\varvec{\Xi }_l})\rightarrow L^2(\Xi )\).

Now, clearly \(\mathsf {Rank}~ {Q^{l,t}}^*\Gamma _f P^{l,t}\le \mathsf {Rank}~ \Gamma _f\) and

$$\begin{aligned} \frac{1}{t^d}\langle {Q^{l,t}}^*\Gamma _f P^{l,t}\delta _{\varvec{y}},\delta _{\varvec{x}}\rangle =\frac{1}{t^{2d}}\int _{|x-\varvec{x}|_\infty<t/2}\int _{|y-\varvec{y}|_\infty <t/2}f(x+y) ~dy~dx. \end{aligned}$$

If we denote this number by \(\tilde{f}^t({\varvec{x}+\varvec{y}})\), we see that \(\frac{1}{t^{d}}{Q^{l,t}}^*\Gamma _f P^{l,t}=\varvec{\Gamma }_{\tilde{f}^t,l}\). It follows that \(\mathsf {Rank}~ \varvec{\Gamma }_{\tilde{f}^t,l}\le \mathsf {Rank}~ \Gamma _f\). Since f is continuous, it is easy to see that \(\lim _{t\rightarrow 0^+}\tilde{f}^t({\varvec{x}+\varvec{y}})=f(\varvec{x}+\varvec{y})\), which implies that \(\lim _{t\rightarrow 0^+}\varvec{\Gamma }_{\tilde{ f}^t,l}=\varvec{\Gamma }_{f,l}\), and the proof is complete. \(\square \)

6.2 From Discrete to Continuous

Our next result says that for sufficiently small l, the inequality in Theorem 6.2 is actually an equality. This needs some preparation. Given \({\varvec{y}}\in \varvec{\Upsilon }_l\) we abbreviate \(C_{\varvec{y}}^{l,l}\) by \(C_{\varvec{y}}^{l}\), i.e. the multi-cube with center \({\varvec{y}}\) and sidelength l. Set \(\varvec{\Upsilon }^{int}_l=\{\varvec{y}\in \varvec{\Upsilon }_l: ~C_{\varvec{y}}^{l}\subset \Upsilon \}\), i.e. the set of those \(\varvec{y}\)’s whose corresponding multi-cubes do not intersect the boundary. Moreover, for each \(\varvec{y} \in \varvec{\Upsilon }_l\), set

$$\begin{aligned} e^{l}_{\varvec{y}}=\left\{ \begin{array}{ll} l^{-d/2}\mathbf 1 _{C_{\varvec{y}}^{l}}, &{} \text {if } \varvec{y}\in \varvec{\Upsilon }^{int}_l \\ 0, &{} \text {otherwise} \end{array}\right. \end{aligned}$$

We now define \(P^{l}:\ell ^2({\varvec{\Upsilon }_l})\rightarrow L^2(\Upsilon )\) via \(P^{l}(\delta _{\varvec{y}})=e^{l}_{\varvec{y}}\). Note that this map is only a partial isometry, in fact, \({P^{l}}^*{P^{l}}\) is the projection onto \(\mathsf {Span}~ \{\delta _{\varvec{y}}:~\varvec{y}\in \varvec{\Upsilon }_l^{int}\}\), and \({P^{l}}{P^{l}}^*\) is the projection in \(L^2(\Upsilon )\) onto the corresponding subspace. We make analogous definitions for \(\varvec{\Xi }_l\), denoting the corresponding partial isometry by \(Q^l\). Set

$$\begin{aligned}N_l=N_l(\Upsilon )=|\varvec{\Upsilon }_l{\setminus }\varvec{\Upsilon }_l^{int}|,\end{aligned}$$

i.e. \(N_l\) is the number of multi-cubes \(C^l_{\varvec{y}}\) intersecting the boundary of \(\Upsilon \), and note that \(N_l=\dim \mathsf {Ker}~ P^l\). Since \(\Upsilon \) is bounded and open, it is easy to see that \(|\varvec{\Upsilon }_l^{int}|\) is proportional to \(1/l^d\). We will say that the boundary of a bounded domain \(\Upsilon \) is well-behaved if

$$\begin{aligned} \lim _{l\rightarrow 0^+}l^d N_l =0.\end{aligned}$$
(6.3)

In other words, \(\partial \Upsilon \) is well-behaved if the number of multi-cubes \(C^l_{\varvec{y}}\) properly contained in \(\Upsilon \) asymptotically outnumbers those that are not. The next proposition implies that most decent domains have well-behaved boundaries.

Proposition 6.3

Let \(\Upsilon \) be a bounded domain with Lipschitz boundary. Then \(\partial \Upsilon \) is well behaved.

Proof

By definition, for each point \(x\in \partial \Upsilon \) one can find a local coordinate system such that \(\partial \Upsilon \) locally is the graph of a Lipschitz function from some bounded domain in \({\mathbb R}^{d-1}\) to \({\mathbb R}\), see e.g. [51] or [19], Sec. 4.2. It is not hard to see that each such patch of the boundary can be covered by a collection of balls of radius l, where the number of such balls is bounded by some constant times \(1/l^{d-1}\). Since \(\partial \Upsilon \) is compact, the same statement applies to the entire boundary. However, it is also easy to see that one ball of radius l cannot intersect more than \(3^d\) multi-cubes of the type \(C_{\varvec{y}}^l\), and hence \(N_l\) is bounded by some constant times \(1/l^{d-1}\) as well. The desired statement follows immediately. \(\square \)
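Condition (6.3) is easy to witness numerically for a concrete Lipschitz domain. The sketch below (an illustrative addition) counts, for the unit disc in \({\mathbb R}^2\), the multi-cubes \(C^l_{\varvec{y}}\) that touch the boundary; a cube is flagged as interior via the sufficient criterion \(|\varvec{y}|+l\sqrt{2}/2<1\), which bounds the distance from \(\varvec{y}\) to its farthest corner, so the boundary count is, if anything, slightly overestimated:

```python
import numpy as np

def N_l(l):
    # count cubes C_y^l that meet the boundary of the unit disc in R^2
    n = np.arange(np.floor(-1 / l), np.ceil(1 / l) + 1)
    X, Y = np.meshgrid(n * l, n * l, indexing="ij")
    r = np.sqrt(X**2 + Y**2)
    inside = r < 1                         # grid points of Upsilon_l
    interior = r + l * np.sqrt(2) / 2 < 1  # cube certainly inside the disc
    return int(np.sum(inside & ~interior))

vals = [l**2 * N_l(l) for l in [0.1, 0.05, 0.01]]
assert vals[0] > vals[1] > vals[2]         # l^d N_l -> 0, as in (6.3)
```

In line with the proof, \(N_l\) grows like \(1/l\) here, so \(l^2 N_l\) decays linearly in l.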

We remark that all bounded convex domains have well-behaved boundaries, since such domains have Lipschitz boundaries (see e.g. [19, Sec. 6.3]). Also, note that the above proof yields a faster decay of \(N_l l^d\) than necessary, so most “natural” domains will have well-behaved boundaries. We are now ready for the main theorem of this section:

Theorem 6.4

Let the boundaries of the bounded open and connected domains \(\Upsilon \) and \(\Xi \) be well behaved, and let f be a continuous function on \(\textit{cl}(\Xi +\Upsilon )\). Then

$$\begin{aligned} \Gamma _f=\lim _{l\rightarrow 0^+} l^dQ^l\varvec{\Gamma }_{f,l}{P^l}^*. \end{aligned}$$
(6.4)

For f continuous on \(\textit{cl}(\Xi -\Upsilon )\) we analogously have \(\Theta _f=\lim _{l\rightarrow 0^+}l^dQ^l\varvec{\Theta }_{f,l}{P^l}^*\).

Proof

We first establish that \({P^{l}}{P^{l}}^*\) converges to the identity operator I in the strong operator topology (SOT). Let \(g\in L^2(\Upsilon )\) be arbitrary, pick any \(\epsilon >0\) and let \(\tilde{g}\) be a continuous function on \(\textit{cl}(\Upsilon )\) with \(\Vert g-\tilde{g}\Vert <\epsilon \). Then

$$\begin{aligned} \Vert g-{P^{l}}{P^{l}}^*g\Vert \le \Vert g-\tilde{g}\Vert +\Vert \tilde{g}-{P^{l}}{P^{l}}^*\tilde{g}\Vert +\Vert {P^{l}}{P^{l}}^*(\tilde{g}-g)\Vert . \end{aligned}$$

Both the first and the last term are clearly \(\le \epsilon ,\) whereas it is easy to see that the limit of the middle term as \(l\rightarrow 0^+\) equals 0, since \(\tilde{g}\) is continuous on \(\textit{cl}(\Upsilon )\) and the boundary is well-behaved. Since \(\epsilon \) was arbitrary we conclude that \(\lim _{l\rightarrow 0^+}P^l{P^l}^*g=g\), as desired. The corresponding fact for \(Q^l\) is of course then also true.

Now, since \(\Gamma _f\) is compact by Corollary 2.4 in [3], it follows by the above result and standard operator theory that

$$\begin{aligned} \Gamma _f=\lim _{l\rightarrow 0^+}Q^l{Q^l}^*\Gamma _f P^l{P^l}^*, \end{aligned}$$

and hence it suffices to show that

$$\begin{aligned} 0=\lim _{l\rightarrow 0^+}\Vert Q^l{Q^l}^*\Gamma _f P^l{P^l}^*-l^dQ^l\varvec{\Gamma }_{f,l}{P^l}^*\Vert =\lim _{l\rightarrow 0^+}\Vert Q^l({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){P^l}^*\Vert . \end{aligned}$$

Since \(Q^l\) and \({P^l}^*\) are contractions, this follows if

$$\begin{aligned} \lim _{l\rightarrow 0^+}\Vert {Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}\Vert =0. \end{aligned}$$
(6.5)

By the Tietze extension theorem, we may suppose that f is actually defined on \({\mathbb R}^d\) and has compact support there. In particular it is then uniformly continuous. Now, to establish (6.5), let \({g}={g}_1+{g}_2\in \ell ^2(\varvec{\Upsilon }_l)\) be arbitrary, where \(\mathsf {supp}~ {g}_1\subset \varvec{\Upsilon }_l^{int}\) and \(\mathsf {supp}~ {g}_2\subset \varvec{\Upsilon }_l{\setminus }\varvec{\Upsilon }_l^{int}\). By definition, \(P^l {g}_2=0\) so \({Q^l}^*\Gamma _f P^l {g}_2=0\) whereas

$$\begin{aligned} |l^d\varvec{\Gamma }_{f,l}{g}_2(\varvec{x})|\le l^d \Vert f\Vert _{\infty }N_l(\Upsilon )^{1/2}\Vert {g}_2\Vert , \end{aligned}$$

by the Cauchy–Schwarz inequality. Thus

$$\begin{aligned} |({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}_2(\varvec{x})|\le l^d \Vert f\Vert _{\infty }N_l(\Upsilon )^{1/2}\Vert {g}_2\Vert . \end{aligned}$$
(6.6)

We now provide estimates for \({g}_1\). Given \(\varvec{x}\in \varvec{\Xi }_l\) and \(\varvec{y}\in \varvec{\Upsilon }_l\), set

$$\begin{aligned} \tilde{f}({\varvec{x}+\varvec{y}})=\frac{1}{l^{2d}}\int _{|x-\varvec{x}|_{\infty }<l/2}\int _{|y-\varvec{y}|_{\infty }<l/2}f(x+y) ~dy~dx \end{aligned}$$

and note that

$$\begin{aligned} \tilde{f}({\varvec{x}+\varvec{y}})=\frac{1}{l^d}\langle {Q^{l}}^*\Gamma _f P^{l}\delta _{\varvec{y}},\delta _{\varvec{x}}\rangle \end{aligned}$$

whenever \(\varvec{x}\in \varvec{\Xi }_l^{int}\) and \(\varvec{y}\in \varvec{\Upsilon }_l^{int}\). As in the proof of Theorem 6.2 it follows that \({Q^l}^*\Gamma _f P^l {g}_1(\varvec{x})=l^d\varvec{\Gamma }_{\tilde{f},l} {g}_1(\varvec{x})\) for \(\varvec{x}\in \varvec{\Xi }_l^{int}\). For such \(\varvec{x}\) we thus have

$$\begin{aligned} |({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}_1(\varvec{x})|=|l^d\varvec{\Gamma }_{\tilde{f}-f,l} g_1(\varvec{x})|\le l^d \Vert f-\tilde{f}\Vert _{\ell ^2(\varvec{\Omega }_l)}\Vert {g}_1\Vert \end{aligned}$$
(6.7)

by Cauchy–Schwarz, and for \(\varvec{x}\in \varvec{\Xi }_l{\setminus }\varvec{\Xi }_{l}^{int}\) we get

$$\begin{aligned} |({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}_1(\varvec{x})|=|l^d\varvec{\Gamma }_{f,l} g_1(\varvec{x})|\le l^d \Vert f\Vert _{\infty }|\varvec{\Upsilon }_l|^{1/2}\Vert {g}_1\Vert \end{aligned}$$
(6.8)

due to the definition of \(Q^l\). Combining (6.6)–(6.8) we see that

$$\begin{aligned}&\Vert ({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}\Vert \le \Vert ({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}_1\Vert +\Vert ({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l}){g}_2\Vert \\&\qquad \le |\varvec{\Xi }_l^{int}|^{1/2}l^{d}\Vert f-\tilde{f}\Vert _{\ell ^2(\varvec{\Omega }_l)}\Vert {g_1}\Vert +N_l(\Xi )^{1/2}l^d \Vert f\Vert _{\infty }|\varvec{\Upsilon }_l|^{1/2}\Vert {g}_1\Vert \\&\qquad +|\varvec{\Xi }_l|^{1/2}l^d\Vert f\Vert _{\infty } N_l(\Upsilon )^{1/2}\Vert {g_2}\Vert . \end{aligned}$$

Since \(\Xi \) and \(\Upsilon \) are bounded sets, \(|\varvec{\Xi }_l|\) and \(|\varvec{\Upsilon }_l|\) are bounded by some constant C times \(1/l^d\), and as \(\Vert g_1\Vert \le \Vert g\Vert \) and \(\Vert g_2\Vert \le \Vert g\Vert \), it follows that

$$\begin{aligned}&\Vert ({Q^l}^*\Gamma _f P^l-l^d\varvec{\Gamma }_{f,l})\Vert \le C^{1/2}l^{d/2}\Vert f-\tilde{f}\Vert _{\ell ^2(\varvec{\Omega }_l)}+ C^{1/2}N_l(\Xi )^{1/2}l^{d/2} \Vert f\Vert _{\infty }\\&\qquad +\, C^{1/2}l^{d/2}\Vert f\Vert _{\infty } N_l(\Upsilon )^{1/2}. \end{aligned}$$

By Proposition 6.3 the last two terms go to 0 as l goes to 0. The same is true for the first term by noting that \(l^{d/2}\Vert f-\tilde{f}\Vert _{\ell ^2(\varvec{\Omega }_l)}\le \Vert f-\tilde{f}\Vert _{\ell ^\infty (\varvec{\Omega }_l)}l^{d/2}|\varvec{\Omega }_l|^{1/2}\) and

$$\begin{aligned} \lim _{l\rightarrow 0^+}\Vert f-\tilde{f}\Vert _{\ell ^\infty (\varvec{\Omega }_l)}=0, \end{aligned}$$

which is an easy consequence of the uniform continuity of f. Thereby (6.5) follows and the proof is complete. \(\square \)

In particular, we have the following corollary. Note that the domains need not have well-behaved boundaries.

Corollary 6.5

Let \(\Upsilon \) and \(\Xi \) be open, bounded and connected domains, and let f be a continuous function on \(\textit{cl}(\Xi +\Upsilon )\). We then have

$$\begin{aligned} \mathsf {Rank}~ \Gamma _f=\lim _{l\rightarrow 0^+}\mathsf {Rank}~ \varvec{\Gamma }_{f,l} \end{aligned}$$
(6.9)

Similarly, if f is continuous on \(\textit{cl}(\Xi -\Upsilon )\) we have \(\mathsf {Rank}~ \Theta _f=\lim _{l\rightarrow 0^+}\mathsf {Rank}~ \varvec{\Theta }_{f,l}\).

Proof

By Propositions 5.1 and 5.3 in [3], the rank of \(\Gamma _f\) is independent of \(\Upsilon \) and \(\Xi \). Combining this with Theorem 6.2, it is easy to see that it suffices to verify the corollary for any open connected subsets of \(\Upsilon \) and \(\Xi \). We can thus assume that their boundaries are well-behaved. By Theorem 6.4 and standard operator theory we have

$$\begin{aligned} \mathsf {Rank}~ \Gamma _f\le & {} \liminf _{l\rightarrow 0^+} \mathsf {Rank}~ l^dQ^l\varvec{\Gamma }_{f,l}{P^l}^*=\liminf _{l\rightarrow 0^+} \mathsf {Rank}~ Q^l\varvec{\Gamma }_{f,l}{P^l}^*\\\le & {} \liminf _{l\rightarrow 0^+} \mathsf {Rank}~ \varvec{\Gamma }_{f,l}.\end{aligned}$$

On the other hand, Theorem 6.2 gives

$$\begin{aligned} \limsup _{l\rightarrow 0^+} \mathsf {Rank}~ \varvec{\Gamma }_{f,l}\le \mathsf {Rank}~ \Gamma _f. \end{aligned}$$

\(\square \)
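Corollary 6.5 is also easy to watch in action. In the one-dimensional sketch below (an illustrative addition; the particular 3-term exponential sum is an arbitrary choice), \(\mathsf {Rank}~\Gamma _f=3\), and the ranks of the discretizations never exceed 3 (Theorem 6.2) and stabilize at 3 once l is small:

```python
import numpy as np

# d = 1, Upsilon = Xi = (0,1): f a 3-term exponential sum, Rank Gamma_f = 3
f = lambda t: np.exp(t) + np.exp(-2 * t) + np.exp(0.5 * t)
ranks = {}
for l in [0.5, 0.1, 0.02]:
    ups = np.arange(1, int(round(1 / l))) * l   # Upsilon_l
    Gamma = f(ups[:, None] + ups[None, :])      # kernel f(x + y)
    ranks[l] = np.linalg.matrix_rank(Gamma)

assert ranks[0.5] <= 3                # Theorem 6.2: the rank never exceeds 3
assert ranks[0.1] == ranks[0.02] == 3 # Corollary 6.5: equality for small l
```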

7 The Multidimensional Continuous Carathéodory-Fejér Theorem

In the two final sections we investigate how the PSD-condition affects the structure of the generating functions. This condition only makes sense as long as

$$\begin{aligned} \Xi =\Upsilon , \end{aligned}$$

which we assume from now on. In this section we show that the natural counterpart of Carathéodory-Fejér’s theorem holds for general domain Toeplitz integral operators \(\Theta _f\), and in the next we consider Fischer’s theorem for general domain Hankel integral operators.

Theorem 7.1

Suppose that \(\Xi =\Upsilon \) is open, bounded and connected, \(\Omega =\Xi -\Upsilon \), and \(f\in L^2(\Omega )\). Then the operator \(\Theta _f\) is PSD and has finite rank K if and only if there exist distinct \({\xi }_1,\ldots ,{\xi }_K\in {\mathbb R}^d\) and \(c_1,\ldots ,c_K>0\) such that

$$\begin{aligned} f=\sum _{k=1}^K c_k e^{i {\xi }_k\cdot {x}}. \end{aligned}$$
(7.1)

Proof

Suppose first that \(\Theta _f\) is PSD and has finite rank K. By Theorem 4.4 in [3], f is an exponential polynomial (i.e. can be written as (6.1)). By uniqueness of analytic continuation, it suffices to prove the result when \(\Xi =\Upsilon \) are neighborhoods of some fixed point \({x}_0\). By a translation, it is easy to see that we may assume that \({x}_0=0\). We consider discretizations \(\varvec{\Theta }_{f,l}\) of \(\Theta _f\) where l assumes the values \(2^{-j}\), \(j\in {\mathbb N}\). For j large enough (beyond J, say), the operator \(\varvec{\Theta }_{f,2^{-j}}\) has rank K (Corollary 6.5) and Theorem 5.1 applies (upon dilation of the grids). We conclude that for \(j>J\) the representation (7.1) holds (on \(\varvec{\Omega }_{2^{-j}}=\varvec{\Xi }_{2^{-j}}-\varvec{\Upsilon }_{2^{-j}}\)), but the \({\xi }_k\)’s may depend on j. However, since each grid \(\varvec{\Omega }_{2^{-j-1}}\) is a refinement of \(\varvec{\Omega }_{2^{-j}}\), Proposition 4.1 guarantees that this dependence on j may only affect the ordering, not the actual values of the set of \(\xi _k\)’s used in (7.1). We can thus choose the order at each stage so that it does not depend on j. Since f is an exponential polynomial, it is continuous, so taking the limit \(j\rightarrow \infty \) easily yields that (7.1) holds when x is a continuous variable as well.

Conversely, suppose that f is of the form (7.1). Then \(\Theta _f\) has rank K by Proposition 4.1 in [3] (see also the remarks at the end of Sect. 2.2). The PSD condition follows by the continuous analogue of (5.8). \(\square \)

8 The Multidimensional Continuous Fischer Theorem

Theorem 8.1

Suppose that \(\Xi =\Upsilon \) is open, bounded and connected, \(\Omega =\Xi +\Upsilon \), and \(f\in L^2(\Omega )\). The operator \(\Gamma _f\) is PSD and has finite rank K if and only if there exist distinct \({\xi }_1,\ldots ,{\xi }_{K}\in {\mathbb R}^d\) and \(c_1,\ldots ,c_{K}>0\) such that

$$\begin{aligned} f=\sum _{k=1}^{K} c_k e^{{\xi }_k\cdot {x}}. \end{aligned}$$
(8.1)

We remark that the continuous version above differs significantly from the discrete case, even in one dimension, since the sequence \((\lambda ^n)_{n=0}^{2N}\) generates a PSD Hankel matrix for all \(\lambda \in {\mathbb R}\) (even negative values), whereas the base \(e^{\xi _k}\) is positive in (8.1). Recall also the example (2.3), which does not fit in the discrete version of (8.1).
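The discrete phenomenon mentioned in this remark is immediate to verify: the Hankel matrix generated by \((\lambda ^n)_{n=0}^{2N}\) is the rank-one Gram matrix \(vv^T\) with \(v=(\lambda ^j)_{j=0}^{N}\), hence PSD even for negative \(\lambda \). A minimal sketch (an illustrative addition; the values of N and \(\lambda \) are arbitrary):

```python
import numpy as np

# Hankel matrix of (lambda^n), n = 0,...,2N, with a negative base lambda
N, lam = 4, -0.7
H = np.array([[lam ** (j + k) for k in range(N + 1)] for j in range(N + 1)])
v = lam ** np.arange(N + 1)
assert np.allclose(H, np.outer(v, v))        # rank-one Gram structure v v^T
assert np.linalg.eigvalsh(H).min() > -1e-12  # hence positive semi-definite
assert np.linalg.matrix_rank(H) == 1
```

No representation \(c\,e^{\xi n}\) with real \(\xi \) and \(c>0\) can produce the alternating signs of \((-0.7)^n\), which is exactly the gap between the discrete and continuous statements.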

Proof

Surprisingly, the proof is rather different from that of Theorem 7.1. First suppose that \(\Gamma _f\) is PSD and has finite rank K. Then f is an exponential polynomial, i.e. has a representation (6.1), by Theorem 4.4 in [3]. Suppose that there are non-constant polynomial factors in the representation (6.1), say \(p_1(x)e^{\zeta _1 \cdot x}\). Let N be the maximum degree of all polynomials \(\{p_j\}_{j=1}^J\). Pick a closed subset \(\tilde{\Xi }\subset \Xi \) and \(r>0\) such that \(\mathsf {dist} (\tilde{\Xi },{\mathbb R}^d{\setminus }\Xi )>2r\). Pick a continuous real valued function \(g\in L^2({\mathbb R}^d)\) with support in \(\tilde{\Xi }\) that is orthogonal to the monomial exponentials

$$\begin{aligned} \{ x^{ \alpha } e^{ \zeta _j\cdot x}\}_{| \alpha |\le N, 1\le j\le J}{\setminus }\{e^{ \zeta _1\cdot x}\} \end{aligned}$$

(where \({\alpha }\in {\mathbb N}^d\) and we use standard multi-index notation), but satisfies \(\langle g,e^{ \zeta _1\cdot x}\rangle =1\), (that such a function exists is standard, see e.g. Proposition 3.1 in [3]). A short calculation shows that

$$\begin{aligned} \langle \Gamma _f g(\cdot - z),g(\cdot - w)\rangle =p_1( z+ w)e^{ \zeta _1\cdot ( z+ w)} \end{aligned}$$
(8.2)

whenever \(| z|,| w|<r\). Since \(p_1\) is non-constant, there exists a unit length \( \nu \in {\mathbb R}^d\) such that \(q(t)=p_1(r \nu t)\) is a non-constant polynomial in t. Set \(\zeta =r \zeta _1\cdot \nu \). Consider the operator \(A:L^2([0,1])\rightarrow L^2(\Xi )\) defined via

$$\begin{aligned} A(\phi )= \int _0^1 \phi (t) g( x- {r}\nu t)\,dt. \end{aligned}$$

Clearly \(A^*\Gamma _f A\) is PSD. It follows by (8.2) and Fubini’s theorem that

$$\begin{aligned} \langle A^*\Gamma _f A(\phi ),\psi \rangle&=\int _0^1\int _0^1 p_1(r\nu t+r \nu s)e^{\zeta _1\cdot (r\nu t+r \nu s)}\phi (t)dt\overline{\psi (s)}ds\\&=\int _0^1\int _0^1 q(t+s) e^{\zeta (t+s)}\phi (t)dt\overline{\psi (s)}ds. \end{aligned}$$

With \(h(t)=q(t) e^{\zeta t}\), it follows that the operator \(\Gamma _h:L^2([0,1])\rightarrow L^2([0,1])\) is PSD. Since \(\Gamma _h\) is self-adjoint it is easy to see that \(h(t+s)=\overline{h(s+t)}\) (either by repeating arguments from Sect. 6, or by standard results from integral operator theory). In particular h is real-valued. This clearly implies that \(\zeta \in {\mathbb R}\). Now consider the operator \(B:L^2([0,1])\rightarrow L^2([0,1])\) defined by \(B(g)(t)=e^{-\zeta t}g(t)\). As before we see that \(B^*\Gamma _hB=\Gamma _q\), and this operator is PSD. Given \(0<\epsilon <1/2\), define \(C_{\epsilon }:L^2([0,1/2])\rightarrow L^2([0,1])\) by \(C_{\epsilon }(g)(t)=\frac{g(t-\epsilon )-g(t)}{\epsilon }\) (where we identify functions on \([0,1/2]\) with functions on \({\mathbb R}\) that are identically zero outside the interval). It is easy to see that

$$\begin{aligned} C_{\epsilon }^* \Gamma _qC_{\epsilon }=\Gamma _{{\epsilon ^{-2}}{(q(\cdot +2\epsilon )-2q(\cdot +\epsilon )+q(\cdot ))}}, \end{aligned}$$

in particular it is PSD. Since q is a polynomial, it is easy to see that \((q(\cdot +2\epsilon )-2q(\cdot +\epsilon )+q(\cdot ))/\epsilon ^2\) converges uniformly on compacts to \(q''\). By simple estimates based on the Cauchy–Schwarz inequality (see e.g. Proposition 2.1 in [3]), it then follows that the corresponding sequence of operators converges to \(\Gamma _{q''}\) (acting on \(L^2([0,1/2])\)), which therefore is PSD. Continuing in this way, we may assume that q has degree 1 or 2, with \(\Gamma _q\) acting on an interval [0, 3l] where 3l is a power of 1/2. We first assume that the degree is 2, and parameterize \(q(t)=a+b(t/l)+c(t/l)^2\). Performing the differentiation trick once more, we see that \(\Gamma _c\) is PSD on some smaller interval, which clearly means that \(c>0\). Now pick \(g\in L^2([0,l])\) such that \(\langle g,1\rangle =1\), \(\langle g,t\rangle =0\), \(\langle g,t^2\rangle =0\), and consider \(D:{\mathbb C}^3\rightarrow L^2([0,3l])\) defined by

$$\begin{aligned} D((c_0,c_1,c_2))=c_0 g(\cdot )+c_1 g(\cdot -l)+c_2 g(\cdot -2l). \end{aligned}$$

By (8.2), the matrix representation of \(D^*\Gamma _q D\) is

$$\begin{aligned} M=\left( \begin{array}{ccc} q(0) &{} q(l) &{} q(2l) \\ q(l) &{} q(2l) &{} q(3l) \\ q(2l) &{} q(3l) &{} q(4l) \end{array} \right) =\left( \begin{array}{ccc} a &{} a+b+c &{} a+2b+4c \\ a+b+c &{} a+2b+4c &{} a+3b+9c \\ a+2b+4c &{} a+3b+9c &{} a+4b+16c \end{array} \right) , \end{aligned}$$

which then is PSD. However, a (not so) short calculation shows that the determinant of M equals \(-8c^3\) which is a contradiction, since it is less than 0 (recall that \(c>0\)). We now consider the case of degree 1, i.e. \(c=0\) and \(b\ne 0\). As above we deduce that the matrix

$$\begin{aligned} M=\left( \begin{array}{cc} q(0) &{} q(l) \\ q(l) &{} q(2l) \end{array} \right) =\left( \begin{array}{cc} a &{} a+b+c \\ a+b+c &{} a+2b+4c \end{array} \right) , \end{aligned}$$

has to be PSD, which contradicts the fact that its determinant is \(-b^2<0\).
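The computational claims in this argument, that the second difference quotient of a quadratic equals \(q''\) exactly (so the convergence used above is immediate), that the \(3\times 3\) determinant equals \(-8c^3\), and that the \(2\times 2\) determinant with \(c=0\) equals \(-b^2\), can be verified symbolically; a quick check with sympy (the variable names are ours):

```python
import sympy as sp

a, b, c, l, t, eps = sp.symbols('a b c l t epsilon', real=True)
q = a + b*(t/l) + c*(t/l)**2

# Second-difference quotient appearing in C_eps^* Gamma_q C_eps:
# for a quadratic it equals q'' identically, for every eps.
second_diff = (q.subs(t, t + 2*eps) - 2*q.subs(t, t + eps) + q) / eps**2
print(sp.simplify(second_diff - sp.diff(q, t, 2)))  # 0

# 3x3 matrix with entries q((j+k) l), j, k = 0, 1, 2 (the degree-2 case).
M3 = sp.Matrix(3, 3, lambda j, k: q.subs(t, (j + k)*l))
print(sp.factor(M3.det()))  # -8*c**3

# 2x2 matrix with entries q((j+k) l), j, k = 0, 1, in the degree-1 case c = 0.
M2 = sp.Matrix(2, 2, lambda j, k: q.subs(t, (j + k)*l))
print(sp.expand(M2.det().subs(c, 0)))  # -b**2
```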

We thus conclude that there can be no polynomial factors in the representation (6.1). By the continuous version of Proposition 4.2 (see Proposition 4.1 in [3]), f is then of the form (6.2), i.e. \(f=\sum _{k=1}^Kc_ke^{\zeta _k\cdot x}\). From here the proof is easy. Repeating the first steps, we see that \(\zeta _k\cdot \nu \in {\mathbb R}\) for all \( \nu \in {\mathbb R}^d\), from which it follows that the \(\zeta _k\) are real valued; we therefore denote them by \( \xi _k\) henceforth. With this at hand we obviously have

$$\begin{aligned} \langle \Gamma _{f}(g),g\rangle =\sum _{k=1}^Kc_k|\langle g,e^{ {\xi }_k\cdot {x}}\rangle |^2 \end{aligned}$$
(8.3)

for all \(g\in L^2(\Xi )\). Testing (8.3) with g chosen biorthogonal to the (linearly independent) functions \(e^{ \xi _k\cdot x}\) yields \(c_k\ge 0\), and since \(\Gamma _f\) has rank K each \(c_k\) is nonzero, whereby we conclude that \(c_k>0\).

For the converse part of the statement, let f be of the form (8.1). That \(\Gamma _f\) has rank K has already been argued (Proposition 4.1 in [3]) and that \(\Gamma _f\) is PSD follows by (8.3). The proof is complete. \(\square \)
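The converse direction is easy to observe in a discretized one-dimensional setting: sampling \(f=\sum _{k=1}^K c_k e^{\xi _k x}\) with \(c_k>0\) and real \(\xi _k\) on a grid produces a Hankel-type matrix \(f(x_j+x_k)\) that is PSD of rank K. A minimal numerical sketch (grid size, coefficients and exponents are illustrative choices):

```python
import numpy as np

# Discretize the Hankel kernel f(x + y) on [0, 1] for
# f(x) = sum_k c_k exp(xi_k x) with c_k > 0 and xi_k real (the form (8.1)).
n = 50
x = np.linspace(0, 1, n)
c = np.array([2.0, 0.5, 1.0])    # positive coefficients c_k
xi = np.array([-1.0, 0.3, 2.0])  # distinct real exponents xi_k

f = lambda s: sum(ck * np.exp(xk * s) for ck, xk in zip(c, xi))
G = f(x[:, None] + x[None, :])   # Hankel-type matrix G[j, k] = f(x_j + x_k)

eigs = np.linalg.eigvalsh(G)
rank = int(np.sum(np.linalg.svd(G, compute_uv=False) > 1e-9 * n))
print(eigs.min() >= -1e-9)  # PSD up to roundoff
print(rank)                 # 3, the number K of exponential terms
```

The rank-K structure is visible because G factors as \(V\,\mathrm {diag}(c)\,V^T\), where the columns of V are the sampled exponentials.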

9 Unbounded Domains

For completeness, we formulate the results from the previous two sections for unbounded domains. \(\Gamma _f\) is defined precisely as before, i.e. via the formula (2.14), except that we now have to assume that \(f(x+\cdot )\) is in \(L^2(\Upsilon )\) for every \(x\in \Xi \) and, vice versa, \(f(\cdot +y)\in L^2(\Xi )\) for every \(y\in \Upsilon \) (see Definition 1.1 in [3]). Obviously, analogous definitions/restrictions apply to \(\Theta _f\) as well. The main difficulty with unbounded domains is that exponential polynomials can then give rise to unbounded operators. Following [3], we address this by assuming that \(\Omega \) is convex; we let \(\Delta _{\Omega }\) Footnote 2 denote the set of directions \({\vartheta }\in {\mathbb R}^d\) such that the orthogonal projection of \(\Omega \) on the half line \([0,\infty )\cdot {\vartheta }\) is a bounded set, and we let \(\Delta _{\Omega }^{\circ }\) denote its interior.

Theorem 9.1

Let \(\Xi =\Upsilon \subset {\mathbb R}^d\) be convex domains, set \(\Omega =\Xi +\Upsilon \) and let f be a function on \(\Omega \) such that \(f(x+\cdot )\in L^2(\Upsilon )\) \(\forall x\in \Xi \) and \(f(\cdot +y)\in L^2(\Xi )\) \(\forall y\in \Upsilon \). Then \(\Gamma _f\) is bounded, PSD and has finite rank if and only if f is of the form (8.1) with \( \xi _k\in \Delta _{\Omega }^{\circ }\) for all k.

Proof

This follows by straightforward modifications of the proofs in Section 9 of [3], so we content ourselves with outlining the details. The “if” direction is easy, so we focus on the “only if”. We restrict the operator \(\Gamma _f\) to functions living on a bounded subset (see Theorem 9.1 in [3]) to obtain a new operator to which Theorem 8.1 above applies. From this we deduce that f locally has the form (8.1). That this formula holds globally is an immediate consequence of uniqueness of real analytic continuation, combined with the observation that \(\Omega \) is connected. Finally, the restriction on the \(\xi _k\)’s is immediate by Theorem 9.3 in [3]. \(\square \)

The corresponding situation for general domain Toeplitz integral operators is quite different. We first note that \(\Theta _f:L^2(\Upsilon )\rightarrow L^2(\Xi )\) is bounded if and only if \(\Gamma _f:L^2(-\Upsilon )\rightarrow L^2(\Xi )\) is bounded, as mentioned in Sect. 2.2 and further elaborated on around formula (1.2) in [3]. With this, we immediately obtain the following theorem.

Theorem 9.2

Let \(\Xi ,\Upsilon \subset {\mathbb R}^d\) be convex domains, set \(\Omega =\Xi -\Upsilon \) and let f be a function on \(\Omega \) such that \(f(x-\cdot )\in L^2(\Upsilon )\) \(\forall x\in \Xi \) and \(f(\cdot -y)\in L^2(\Xi )\) \(\forall y\in \Upsilon \). Then \(\Theta _f\) is bounded and has finite rank if and only if f is an exponential polynomial (i.e. \({f({x})=\sum _{j=1}^J p_j({x}) e^{{\zeta }_j\cdot {x}}}\)) with \(\mathrm {Re}\,\zeta _j\in \Delta _{\Omega }^{\circ }\) for all j.

However, if we now again let \(\Xi =\Upsilon \) and additionally impose PSD, the proof of Theorem 9.1 combined with Theorem 7.1 shows that \( \zeta _j=i \xi _j\) for some \( \xi _j\in {\mathbb R}^d\). Theorem 9.2 then forces \(0\in \Delta _{\Omega }^{\circ }\), which can only happen if \(\Delta _{\Omega }={\mathbb R}^d\), since it is a cone. This in turn is equivalent to \(\Omega \) being bounded, so we conclude that

Theorem 9.3

Let \(\Xi =\Upsilon \subset {\mathbb R}^d\) be convex unbounded domains, set \(\Omega =\Xi -\Upsilon \) and let f be as in Theorem 9.2. Then \(\Theta _f\) is bounded, PSD and has finite rank if and only if \(f\equiv 0\).

10 Conclusions

Multidimensional versions of the Kronecker, Carathéodory-Fejér and Fischer theorems are discussed and proven in discrete and continuous settings. The first relates the rank of general domain Hankel and Toeplitz type matrices and operators to the number of exponential polynomials needed to represent the corresponding generating functions/sequences, while the latter two add the condition that the operators be positive semi-definite. The multidimensional versions of the Carathéodory-Fejér theorem behave as expected, whereas the multidimensional versions of the Kronecker theorem generically yield more complicated representations, which are more transparent in the continuous setting. Fischer’s theorem also exhibits a simpler structure in the continuous case than in the discrete one. We also show that the discrete case approximates the continuous one, given sufficiently dense sampling.