On the structure of positive semi-definite finite rank general domain Hankel and Toeplitz operators in several variables

Multivariate versions of the Kronecker theorem in the continuous multivariate setting has recently been published, that characterize the generating functions that give rise to finite rank multidimensional Hankel and Toeplitz type operators defined on general domains. In this paper we study how the additional assumption of positive semi-definite affects the characterization of the corresponding generating functions. We show that these theorems become particularly transparent in the continuous setting, by providing elegant if-and-only-if statements connecting the rank with sums of exponential functions. We also discuss how these operators can be discretized, giving rise to an interesting class of structured matrices that inherit desirable properties from their continuous analogs. In particular we describe how the continuous Kronecker theorem also applies to these structured matrices, given sufficient sampling. We also provide a new proof for the Carathéodory-Fejér theorem for block Toeplitz matrices, based on tools from tensor algebra.


Introduction
The connection between low-rank Hankel and Toeplitz operators and matrices, and properties of the functions that generate them play a crucial role for instance in frequency estimation [7,32,46,47,48], system identification [14,16,31,33] and approximation theory [4,5,6,8,9,10,42]. The reason for this is that there is a connection between the rank of such an operator and its generating function being a sum of exponential functions, where the number of terms is connected to the rank of the operator (Kronecker's theorem). Moreover, adding the condition of positive semidefinite imposes further restrictions on the sum of exponentials (Caratheódory-Fejér's and Fischer's theorem), a result which underlies e.g. Pisarenko's famous method for frequency estimation [43].
We provide corresponding theorems in the multidimensional setting. In contrast to the one dimensional situation, the multidimensional framework provides substantial flexibility in how to define these operators. Whereas most previous research on multidimensional Hankel and Toeplitz type operators considers "generating functions/sequences" f that are defined on product domains, we here consider a framework where f is defined on an open connected and bounded domain Ω in R d (or discretizations thereof). Besides providing beautiful new theorems, it is our hope that the new results in this paper will pave the way for applications in multidimensional frequency estimation/approximation/compression, in analogy with the use of Toeplitz and Hankel matrices in the one dimensional setting. For this reason, we present results both in the continuous and discretized setting, and discuss how they influence each other.
To present the key ideas, we here focus mainly on the continuous theory since it is more transparent. "General domain Hankel (Toeplitz) operators" is a class of integral operators whose kernel K(x, y) is of the form K(x, y) = f (x + y) or K(x, y) = f (x − y) 1 , and f is the so called "generating function". Their precise definition also depends on an auxiliary domain Ω on which f is defined, we postpone detailed definitions to Section 2.2. We denote by Γ f a generic general domain Hankel operator and by Θ f their Toeplitz counterparts (see Figure 1 for an example of a discretized Γ f ). These operators were introduced in [3] where it is shown that if Γ f or Θ f has rank equal to K < ∞, then f is necessarily an exponential polynomial; where J ≤ K (assuming no cancelation), p j are polynomials in x = (x 1 , . . . , x d ), ζ j ∈ C d and ζ j · x denotes the standard scalar product Conversely, any such exponential polynomial gives rise to finite rank Γ f and Θ f respectively, and there is a method to determine the rank given the generating function (1.1). Most notably, the rank equals K if f is of the form where c k ∈ C (assuming that there is no cancelation in (1.2)).
The main topic of this paper is the study of how the additional condition that Γ f or Θ f be positive semi-definite (PSD) affects the generating function f . We prove that Θ f then has rank K if and only if f is of the form where c k > 0 and ξ k ∈ R d (Theorem 7.1), which in a certain sense is an extension of Carathéodory-Fejér's theorem on PSD Toeplitz matrices. Correspondingly, Γ f is PSD and has rank K if and only if f is of the form c k e ξ k ·x (1.4) where again c k > 0 and ξ k ∈ R d (Theorem 8.1). Similar results for Hankel matrices date back to work of Fischer [22]. The only of the above results that has a simple counterpart in the finite dimensional discretized multivariable setting is the Carathéodory-Fejér's theorem, which has been observed previously in [53] (concerning block Toeplitz matrices). In this paper we provide a general result on tensor products, which can be used to "lift" structure results in one-dimension to the multi-dimensional setting. We use this to give an alternative proof of the discrete Carathéodory-Fejér theorem, which subsequently is used to prove the continuous counterpart.
Fischer's theorem on the other hand has no neat version in the multivariable finite dimensional setting, but has been generalized to so called small Hankel operators on 2 (N d ) in [44], a paper which also contains a result analog to (1.4).
However, the product domain setting is rather restrictive and not always a natural choice. Whereas one-dimensional generating functions necessarily are defined on an interval, there is an abundance of possible regions to define their corresponding multidimensional cousins. Despite this, the majority of multivariate treatments of these issues are set either directly in a block-Toeplitz/Hankel setting, or rely on tensor products. In both cases the corresponding domain of definition Ω of the generating function/sequence is a square (or multi-cube), but for concrete applications to multidimensional frequency estimation, the available data need not be naturally defined on such a domain. In radially symmetric problems, a circle may be more suitable or, for certain boundary problems, a triangle might be more appropriate.
Concerning analogs of the above results for the discretized counterparts of Θ f and Γ f , we show in this paper how to construct "structured matrices" that approximate their continuous counterparts, and hence can be expected to inherit these desirable properties, given sufficient sampling rate. We give simple conditions on the regularity of f and Ω needed for this to be successful. This gives rise to an interesting class of structured matrices, which we call "general domain Hankel/Toeplitz matrices". As an example, in Figure 1 we have a "generating sequence" f on a discretized disc, together with a plot of its general domain Hankel matrix.
The paper is organized as follows. In the next section we review the theory and at the same time introduce the operators we will be working with in the continuous setting (Section 2.2). The short Section 3 provides a tool from tensor algebra, and also introduce useful notation for the discrete setting. Section 4 discuss how to discretize the Γ f 's and Θ f 's, and we discuss particular cases such as block Toeplitz and Hankel matrices. In Section 5 we prove the Caratheodory-Fejér theorem in the discrete (block) setting. Section 6 shows that the discrete operators approximate the continuous counterparts, given sufficient sampling rate, and we discuss Kronecker's theorem. Sections 7 and 8 consider structure results for f under the PSD condition, first for Θ f 's and then for Γ f 's. In the last section, we extend the above results to the corresponding operators on unbounded domains.

Review of the field
A Toeplitz matrix is a matrix that is constant on the diagonals, i.e. the matrix elements satisfy a k,j = a k+1,j+1 for all indices k, j such that the above formula is well defined. A sequence f such that a k,j = f k−j is called its generating sequence. Hankel matrices on the other hand are constant on the anti-diagonals; a k,j = a k+1,j−1 ; and the sequence f such that a k,j = f k+j is called its generating sequence. Naturally, the set of subindices for f depends on whether we are dealing with Hankel or Toeplitz matrices (and also if the upper left element is taken as a 1,1 or a 0,0 ), but this is not of importance here and hence we do not specify it.
Suppose that the generating sequence of either a Hankel matrix H or a Toeplitz matrix T (of size N × N ) is a "discretized exponential polynomial" strictly less than N . Based on the theory of Vandermonde-matrices, one can show that the rank of either H or T equals K, and that the polynomials p j and the λ j 's are unique. The converse statement is not true; consider for example the Hankel matrix Clearly, the rank is 2 but the generating sequence (1, 0, 0, 0, 0, 0, 0, 0, 1) is not of the form (2.1) with J = 1 or 2. However, in terms of applications this doesn't matter because of the following stronger statement: If T or H has rank K < N then its generating sequence is "generically" of the form a fact which underlies the famous ESPRIT frequency estimation algorithm [46]. The above claims are certainly well known to specialists, but very hard to find in the literature. The book [28], which has two sections devoted entirely to the topic of the rank of finite Toeplitz and Hankel matrices, gives a number of exact theorems relating the rank with the "characteristic" of the corresponding matrix, which is a set of numbers related to when determinants of certain submatrices vanish. It is possible to deduce representations of the form (2.1) (under certain additional assumptions) from these results, but this is never stated explicitly. Another viewpoint has been taken by B. Mourrain et. al [11,17,36,37], in which, loosely speaking, these matrices are analyzed using projective algebraic geometry and the 1 to the bottom right corresponds to the point ∞.
The book [41] deals exclusively with the infinite dimensional case, and generalizations thereof. For completeness we provide outlines of proofs of the claims made earlier in the appendix, based on results in [28] and [18].
In either case, the complexity of the theory does not reflect the relatively simple interaction between rank and exponential sums, as indicated in the introduction. There are however a few exceptions in the discrete setting. Kronecker's theorem says that for a Hankel operator (i.e. an infinite Hankel matrix acting on 2 (N)), the rank is K if and only if the generating sequence is of the desired form (2.1) (0 0 defined as 1), with the restriction that |λ j | < 1 if one is only interested in bounded operators, see e.g. [13,29,30,41]. Also, it is finite rank and PSD if and only if the generating sequence is of the form (2.4) with c k > 0 and λ k ∈ (−1, 1), a result which also has been extended to the multivariable (tensor product) setting [44]. In contrast, there are no finite rank bounded Toeplitz operators (on 2 (N)). If boundedness is not an issue, then a version of Kronecker's theorem holds for Toeplitz operators as well [18].
Adding the PSD condition for a Toeplitz matrix yields a simple result which is valid (without exceptions) for finite matrices. This is the essence of what usually is called the Carathéodory-Fejér theorem. The result was used by Pisarenko [43] to construct an algorithm for frequency estimation. Since then, this approach has rendered a lot of related algorithms, for instance the MUSIC method [47]. We reproduce the statement here for the convenience of the reader. For a proof see e.g. Theorem 12 in [2] or Section 4 in [26]. Other relevant references include [1,15]. where c k > 0 and the λ k 's are distinct and satisfy |λ k | = 1.
The corresponding situation for Hankel matrices H is not as clean, since (2.3) is PSD and has rank 2, but do not fit with the model (2.5) for c k > 0 and real λ k 's. Results of this type seems to go back to Fischer [22], and we will henceforth refer to statements relating the rank of PSD Hankeltype operators to the structure of their generating sequence/function, as "Fischer-type theorems" (see e.g. Theorem 5 [2] or [22]). Corresponding results in the full rank case can be found e.g. in [50].
We end this subsection with a few remarks on the practical use of Theorem 2.1. For a finitely sampled signal, the autocorrelation matrix can be estimated by H * H where H is a (not necessarily square) Hankel matrix generated by the signal. This matrix will obviously be PSD, but in general it will not be Toeplitz. However, under the assumption that the λ k 's in (2.5) are well separated, the contribution from the scalar products of the different terms will be small and might therefore be ignored. Under these assumptions on the data, the matrix H * H is PSD and approximately Toeplitz, which motivates the use of the Carathéodory-Fejér theorem as a means to retrieve the λ k 's.

Toeplitz and Hankel operators on the Paley-Wiener space
The theory in the continuous case is much "cleaner" than in the discrete case. In this section we introduce the integral operator counterpart of Toeplitz and Hankel matrices, and discuss Kronecker's theorem in this setting.
Given a function on the interval [−2, 2], we define the truncated convolution operator Θ f : (2.6) Replacing x − y by x + y we obtain the a truncated correlation operator which we denote by Γ f . Following [45], we refer to these operators as Toeplitz and Hankel operators on the Paley-Wiener space (although in [3] they were called finite interval convolution/correlation operators). It is easy to see that if we discretize these operators, i.e. replace integrals by finite sums, then we get Toeplitz and Hankel matrices, respectively. More on this in Section 4.1.
Kronecker's theorem (as formulated by R. Rochberg in [45]) then states that Rank Θ f = K (and where p j are polynomials and ζ j ∈ C. Moreover, the rank of Θ f (or Γ f ) equals the cardinality However, functions of the form are known to be dense in the set of all generating functions giving rise to rank K finite interval convolution operators. Hence, the general form (2.7) is hiding the following simpler statement, which often is of practical importance. Θ f generically has rank K if and only if f is a sum of K exponential functions (see the Appendix for an outline of a proof of this claim). The corresponding statement is false in several variables, which is shown in [3]. The polynomial factors appear in the limit if two frequencies in (2.9) approach each other and interfere destructively, e.g.
This can heuristically explain why these factors do not appear when adding the PSD condition, since the functions on the right of (2.10) give rise to one large positive and one large negative eigenvalue.

General domain Hankel and Toeplitz integral operators in several variables
Given any (square integrable) function f on an open connected and bounded set Ω in R d , d ≥ 1, the natural counterpart to the operator (2.6) is the general domain Toeplitz integral operator where Ξ and Υ are connected open bounded sets such that In [3] such operators are studied, (albeit under the name general domain truncated convolution operator), and their finite rank structure is completely characterized. It is easy to see that Θ f has rank K whenever f has the form where the ζ 1 , . . . , ζ K ∈ C d are assumed to be distinct and all c k 's are non-zero. The reverse direction is however not as neat as in the one-dimensional case. It is true that the rank is finite only if f is an exponential polynomial (i.e. the multidimensional analogue of (2.7), see Theorem 4.4 in [3]), but there is no counterpart to the simple formula (2.8). However, Section 5 (in [3]) gives a complete description of how to determine the rank given the generating function f explicitly, Section 7 gives results on the generic rank based on the degree of the polynomials that appear in f , we also provide lower bounds on the rank, and Section 8 investigates the fact that polynomial coefficients seem to appear more frequently in the multidimensional setting. Section 9 contains an investigation related to boundedness of these operators for the case of unbounded domains, which we will treat briefly in Section 9 of the present paper.
If we instead set Ω = Ξ + Υ then we may define the general domain Hankel integral operator (called truncated correlation operator in [3]) (2.14) This is the continuous analogue of finite Hankel (block) matrices. As in the finite dimensional case, there is no real difference between Γ f and Θ f regarding the finite rank structure. In fact, one turns into the other under composition with the "trivial" operator ι(f )(x) = f (−x), and thus all statements concerning the rank of one can easily be transferred to the other. We remark however that composition with ι does not preserve PSD, and hence separate proofs are needed in this situation. Finally, we remark that the choice Υ = Ξ = R d + gives what is known as "small Hankel operators". The study of their boundedness and related topics have received a lot of attention, see e.g. [21,34,35].

Other multidimensional versions
The usual multidimensional framework is that of block-Hankel and block-Toeplitz matrices, tensor products, or so called "small Hankel operators" on 2 (N d ). In all cases, the generating sequence f is forced to live on a product domain. For example, in [52] they consider the generating sequences of the form (1.2) (where x is on a discrete grid) and give conditions on the size of the block Hankel matrices under which the rank is K, and in [53] it is observed that the natural counterpart of the Carathéodory-Fejér theorem can be lifted by induction to the block Toeplitz setting. For the full rank case, factorizations of these kinds of operators have been investigated in [20,49]. Extensions to multi-linear algebra are addressed for instance in [38,39,40]. Rank deficient block Toeplitz matrices also play an important role in [23].
Concerning "small Hankel operators", in addition to [44] we wish to mention [27] where a formula for actually determining the rank appears, although this is based on reduction over the dimension and hence not suitable for non-product domains.
There is some heuristic overlap between [3] and [24,25]. In [24] they consider block Hankel matrices with polynomial generating function, and obtain results concerning their rank (Theorem 4.6) that overlap with Propositions 5.3, Theorem 7.4 and Proposition 7.7 of [3] for the 2d case. Proposition 7 in [25] is an extension to 2d of Kronecker's theorem for infinite block Hankel matrices (not truncated), which can be compared with Theorem 4.4 in [3].
In the discrete setting, the work of B. Mourrain et al. considers a general domain context, and what they call "quasi Toeplitz/Hankel matrices" correspond to what here is called "general domain Toeplitz/Hankel matrices" (we stick to this term since we feel it is more informative). See e.g. Section 3.5 in [37], where such matrices are considered for solving systems of polynomial equations. In [11], discrete multidimensional Hankel operators (not truncated) are studied, and Theorem 5.7 is a description of the rank of such an operator in terms of decompositions of related ideals. Combined with Theorem 7.34 of [17], this result also implies that the generating sequence must be of the form (2.1). (See also Section 3.2 of [36], where similar results are presented.) These results can be thought of as a finite dimensional analogue (for product domains) of Theorem 1.2 and Proposition 1.4 in [3]. Theorem 5.9 gives another condition on certain ideals in order for the generating sequence to be of the simpler type, i.e. the counterpart of (1.2) instead of (1.1). In Section 6 of the same paper they give conditions for when these results apply also to the truncated setting, based on rank preserving extension theorems. Similar results in the one-variable setting is found in Section 3 of [18].
Finally, we remark that the results in this paper concerning finite rank PSD Hankel operators partially overlap heuristically with results of [44] and those found in Section 4 in [36], where the formula (2.4) is found in the (non-truncated) discrete environment. In the latter reference they subsequently provide conditions under which this applies to the truncated setting.
With these remarks we end the review and begin to present the new results of this paper. For the sake of introducing useful notation, it is convenient to start with the result on tensor products, which will be used to "lift" the one-dimensional Carathéodory-Fejér theorem to the multidimensional discrete setting.

A property of tensor products
Let U 1 , . . . , U d be finite dimensional linear subspaces of C n . Then ⊗ d j=1 U j is a linear subspace of ⊗ d j=1 C n , and the latter can be identified with the set of C-valued functions on {1, . . . , n} d . Given f ∈ ⊗ d j=1 C n and x ∈ {1, . . . , n} d , we will write f (x) for the corresponding value. For fixed i.e. the vectors obtained from f by fixing all but one variable (and collecting the d−1 fixed variables in x). We refer to these vectors as "probes" of f . If f ∈ ⊗ d j=1 U j then it is easy to see that all probes f j of f will be elements of U j , j = 1, . . . , d. The following theorem states that the converse is also true.
Proof. First consider the case d = 2. Let V ⊂ ⊗ 2 j=1 C n consist of all f with the property stated in the theorem. This is obviously linear and U 1 ⊗ U 2 ⊂ V . If we do not have equality, we can pick an f in V which is orthogonal to U 1 ⊗ U 2 . At least one probe f 1 (k) must be a non-zero element u 1 of U 1 . Given any u 2 ∈ U 2 we have From the middle representation and the choice of u 1 , we see that at least one value of the vector n j=1 u 1,i f 2 (i) is non-zero. Moreover this is a linear combination of probes f 2 (i), and hence an element of U 2 . But then we can pick u 2 ∈ U 2 such that the scalar product (3.1) is non-zero, which is a contradiction to the choice of f . The theorem is thus proved in the case d = 2.
The general case now easily follows by induction on the dimension, noting that

General domain Toeplitz and Hankel operators and matrices
The operators in the title arise as discretizations of general domain Toeplitz/Hankel integral operators. These become "summing operators", which can be represented as matrices in many ways, which we describe in the next section.

Discretization
For simplicity of notation, we here discretize using an integer grid, since grids with other sampling lengths (these are considered in Section 6.1) can be obtained by first dilating the respective domains. Let Ξ, Υ be any open connected and bounded domains in R d , and let f be a bounded function defined on Ω = Ξ−Υ. We will throughout the paper use bold symbols for discrete objects, and normal font for their continuous analogues. Set where g is an arbitrary function on Υ. We will talk of Θ f as a discretization of the corresponding integral operator Θ f , introduced in Section 2.2, more on this in Section 6.1.
Matrix realization of Θ f : We may of course represent g as a vector, by ordering the entries in some (non-unique) way. More precisely, by picking any bijection we can identify g with the vectorg given by Letting o x be an analogous bijection for Ξ, it is clear that Θ f can be represented as a matrix, where the (i, j)'th element is f (o x (i) − o y (j)). Such matrices will be called "general domain Toeplitz matrices", see Figure 2 for a small scale example. We make analogous definitions for Γ f and denote the corresponding discrete operator by Γ f . We refer to this as a "general domain Hankel (summing) operator" and its matrix realization as "general domain Hankel matrix". An example of such is shown in Figure 1.

Block Toeplitz and Hankel matrices
If we let Ξ and Υ be multi-cubes and the ordering bijections be the lexicographical order, then the matrix realization Θ f of (4.1) becomes a block Toeplitz matrix. These are thus special cases of the more general operators considered above. Similarly, block Hankel matrices arise when representing Γ f in the same way.  , (1, 1, 1).
The matrix-realization T of a multidimensional Toeplitz operator Θ f then becomes where e.g.
Note that this matrix has a Toeplitz structure on 3 levels, since each 3 × 3-block of the large matrix above is Toeplitz, and these blocks themselves form a 3 × 3 Toeplitz structure.

Exponential sums
We pause the general development to note some standard facts that will be needed in what follows. Fix N ∈ N, and for j = 1, . . . , d let Φ j be a set of at most 2N numbers in C.  Then Proof. Pick a fixed ζ ∈ C d and consider f (x) = e ζ·x then which has rank 1. For a general f of the form (4.3) the rank will thus be less than or equal to K. But Proposition 4.1 implies that the set {e ζ k ·x } K k=1 is linearly independent as functions on Ξ. Thus the rank will be precisely K, as desired. The argument for Γ f is analogous.
We end this section with a technical observation concerning 1 variable. Proposition 4.3. Let f be a vector of length m > n + 1 and K < n. Let ζ 1 , . . . , ζ K be fixed and suppose that each sub-vector of f with length n + 1 can be written of the form (4.3), then f can be written in this form as well.
Proof. Consider two adjacent sub-vectors with overlap of length n. On this overlap the representation (4.3) is unique, due to Proposition 4.1. The result now easily follows.

The multidimensional discrete Carathéodory-Fejér theorem
Throughout this section, let Υ, Ξ and Ω be as in Sections 4.2 and 4.3, i.e. multi-cubes centered at 0. The following theorem was first observed in [53], but using a completely different proof.
suppose that Θ f is PSD and has rank K where K ≤ 2N . Then f can be written as where c k > 0 and ξ k ∈ R d are distinct and unique. Conversely, if f has this form then Θ f is PSD with rank K.
The proof is based on the following simple observation about PSD matrices. Let Ran A denote the range of a matrix A, and Ker A its kernel. Proof. Note that the orthogonal complement of Ran B equals Ker B * . Since A = A * it suffices to show that Ker A ⊂ Ker B * . Suppose that this is not the case and let x ∈ Ker A be such that B * x = y = 0. For t ∈ R arbitrary we have A B B * C x ty , x ty = 2tRe B * x, y + t 2 Cy, y = 2t y 2 + t 2 Cy, y Since y = 0 this expression takes negative values for some t, which is a contradiction.
Proof of Theorem 5.1. First assume that Θ f is PSD and has rank K. Let T be a block Toeplitz representation of Θ f , as described in Section 4.2. Recall that the Toeplitz matrix T f d (0) is the (2N + 1) × (2N + 1) sub-matrix on the diagonal of T , (and 0 ∈ Z d−1 ). This is clearly PSD and of some rank J d ≤ K, so by the classical Carathéodory-Fejér theorem (Theorem 2.1), f d (0) can be represented by with ξ d k ∈ R. We identify functions on {−N . . . N } with C 2N +1 in the obvious way, and define The analogous subspace of C 4N +1 will be called U ext d . Note that f d (0) ∈ U ext d by (5.3) and that is PSD (see the example in Section 4.2). Hence However, by Proposition 4.2, precisely K of the coefficients c k are non-zero. This is (5.1). The uniqueness of the multi-frequencies is immediate by Proposition 4.1 (applied with N := 2N ). The linear independence of these functions also gives that the coefficients are unique. To see that c k is positive, (1 ≤ k ≤ K), just pick a function g on Ξ which is orthogonal to all other e iξj ·x , j = k. Using the formula (4.4) it is easy to see that and the first statement is proved. For the converse, let f be of the form (5.1). Then Θ f has rank K by Proposition 4.2 and the PSD property follows by the fact that in analogy with (5.7).
It is possible to extend this result to more general domains as considered in Section 4.1. However, such extensions are connected with some technical conditions, which are not needed in the continuous case. Moreover, in the next section we will show that the discretizations of Section 4.1 capture the essence of their continuous counterparts, given sufficient sampling. For these reasons we satisfy with stating such extensions for the continuous case, see Section 7.
The above proof could also be modified to apply to block Hankel matrices, but since Fischer's theorem is connected with preconditions to rule out exceptional cases, the result is not so neat. (It does however provide alternative proofs to the results in [44] concerning small Hankel operators). Again, we here present only the cleaner continuous version, see Section 8.

The multidimensional discrete Kronecker theorem
If we want to imitate the proof of Theorem 5.1 in Kronecker's setting, i.e. without the PSD assumption, then we have to replace (5.3) (a sum of exponentials) with (2.7) (a sum of exponentials with polynomial coefficients). With suitable modifications, the whole argument goes through up until (5.6), where now the ξ k 's can lie in C d and c k also can be polynomials. However, the key step of reducing the (J-term) representation (5.6) to the (K-term) representation (5.1), via Proposition 4.2, fails. Thus, the only conclusion we can draw is that f has a representation where J ≤ K, but we have very little information on the amount of terms in each p j . This is a fundamental difference compared to before. In [3] examples are presented of general domain Hankel and Toeplitz integral operators, whose generating function is a single polynomial p, where Γ p has rank K much lower than the amount of monomials needed to represent p. It is also not the case that these polynomials necessarily are the limit of functions of the form (2.13) (in a similar way as (2.10)), and hence we can not dismiss these polynomials as "exceptional". To obtain similar examples in the finite dimensional setting considered here, one can just discretize the corresponding Γ p found in [3] (as described in Section 4.1). Nevertheless, in the continuous setting (i.e. for operators of the form Θ f and Γ f , c.f. (2.11) and (2.14)) the correspondence between rank and the structure of f is resolved in [3]. In particular it is shown that (either of) these operators have finite rank if and only if f is an exponential polynomial, and that the rank equals K if f is of the (reduced) form We now show that these results apply also in the discrete setting, given that the sampling is sufficiently dense. For simplicity of notation, we only consider the case Γ f from now on, but include the corresponding results for Θ f in the main theorems.

Discretization
Let bounded open domains Υ, Ξ be given, and let l > 0 be a sampling length parameter. Set (c.f. (4.1)), make analogous definition for Ξ l and define Ω l = Υ l + Ξ l . We denote the cardinality of Υ l by |Υ l |, and we define 2 (Υ l ) as the Hilbert space of all functions g on Υ l and norm We let Γ f,l : 2 (Υ l ) → 2 (Ξ l ) denote the summing operator When l is understood from the context, we will usually omit it from the notation to simplify the presentation. It clearly does not matter if f is defined on Ξ + Υ or Ξ l + Υ l , and we use the same notation in both cases. We define Θ f,l in the obvious analogous manner. Note that in Section 4 and 5 we worked with Θ f , which with the new notation becomes the same as Θ f,1 .
Proposition 6.1. There exists a constant C > 0, depending only on Ξ, such that Proof. By the Cauchy-Schwartz inequality we clearly have for each x ∈ Ξ l . If we let |Ξ l | denote the amount of elements in this set, it follows that Since Ξ is a bounded set, it is clear that |Ξ l |l d is bounded by some constant, and hence the result follows. Similarly, Rank Θ f,l ≤ Rank Θ f for any continuous f on Ξ − Υ.
Proof. Given y ∈ Υ l and t ≤ l let C l,t y denote the multi-cube with center y and sidelength t, i.e. C l,t y = {y ∈ R d : |y − y| ∞ < t/2}, where | · | ∞ denotes the supremum norm in R d . Choose t 0 such that √ dt 0 /2 < dist(Υ l , ∂Υ). For t < t 0 we then have that the set {e l,t y } y∈Υ l defined by e l,t y = t −d/2 1 C l,t y is orthonormal in L 2 (Υ). We make analogous definitions for Ξ l . Clearly 2 (Υ l ) is in bijective correspondence with Span {e l,t y } y∈Υ l via the canonical map P l,t , i.e. P l,t (δ y ) = e l,t y where δ y is the "Kronecker δ−function". Let Q l,t denote the corresponding map Q l,t : If we denote this number byf t (x + y), we see that 1 t d Q l,t * Γ f P l,t = Γf t ,l . It follows that Rank Γf t ,l ≤ Rank Γ f . Since f is continuous, it is easy to see that lim t→0 +f t (x + y) = f (x + y), which implies that lim t→0 + Γf t ,l = Γ f,l , and the proof is complete.

From discrete to continuous
Our next result says that for sufficiently small l, the inequality in Theorem 6.2 is actually an equality. This needs some preparation. Given y ∈ Υ l we abbreviate C l,l y by C l y , i.e. the multi-cube with center y and sidelength l. Set Υ int l = {y ∈ Υ l : C y ⊂ Υ}, i.e. the set of those y's whose corresponding multicubes are not intersecting the boundary. Moreover, for each y ∈ Υ l , set We now define P l : 2 (Υ l ) → L 2 (Υ) via P l (δ y ) = e l y . Note that this map is only a partial isometry, in fact, P l * P l is the projection onto Span {δ y : y ∈ Υ int l }, and P l P l * is the projection in L 2 (Υ) onto the corresponding subspace. We make analogous definitions for Ξ l , denoting the corresponding partial isometry by Q l . Set N l = N l (Υ) = |Υ l \ Υ int l |, i.e. N l is the amount of multi-cubes C l y intersecting the boundary of Υ, and note that N l = dim Ker P l . Since Υ is bounded and open, it is easy to see that |Υ int l | is proportional to 1/l d . We will say that the boundary of a bounded domain Υ is well-behaved if In other words, ∂Υ is well behaved if the amount of multi-cubes C l y properly contained in Υ asymptotically outnumbers the amount that are not. The next proposition implies that most decent domains have well-behaved boundaries. Proof. By definition, for each point x ∈ ∂Υ one can find a local coordinate system such that ∂Υ locally is the graph of a Lipschitz function from some bounded domain in R d−1 to R, see e.g. [51] or [19], Sec. 4.2. It is not hard to see that each such patch of the boundary can be covered by a collection of balls of radius l, where the amount of such balls is bounded by some constant times 1/l d−1 . Since ∂Υ is compact, the same statement applies to the entire boundary. However, it is also easy to see that one ball of radius l can not intersect more than 3 d multi-cubes of the type C l y , and henceforth N l is bounded by some constant times 1/l d−1 as well. The desired statement follows immediately.
We remark that all bounded convex domains have well behaved boundaries, since such domains have Lipschitz boundaries, (see e.g. [19,Sec. 6.3]). Also, note that the above proof yielded a faster decay of N l l d than necessary, so most "natural" domains will have well-behaved boundaries. We are now ready for the main theorem of this section: For f continuous and defined on cl(Ξ − Υ) we analogously have Θ f = lim l→0 + l d Q l Θ f,l P l * ).
Proof. We first establish that P l P l * converges to the identity operator I in the SOT -topology. Let g ∈ L 2 (Υ) be arbitrary, pick any > 0 and letg be a continuous function on cl(Υ) with g −g < . Then g − P l P l * g ≤ g −g + g − P l P l * g + P l P l * (g − g) .
Both the first and the last term are clearly ≤ , whereas it is easy to see that the limit of the middle term as l → 0 + equals 0, sinceg is continuous on cl(Υ) and the boundary is well-behaved. Since was arbitrary we conclude that lim l→0 + P l P l * g = g, as desired. The corresponding fact for Q l is of course then also true. Now, since Γ f is compact by Corollary 2.4 in [3], it follows by the above result and standard operator theory that Γ f = lim l→0 + Q l Q l * Γ f P l P l * , and hence it suffices to show that Since Q l and P l * are contractions, this follows if By the Tietze extension theorem, we may suppose that f is actually defined on R n and has compact support there. In particular it will be equicontinuous. Now, to establish (6.5), let g = g 1 + g 2 ∈ 2 (Υ l ) be arbitrary, where supp g 1 ⊂ Υ int l and supp g 2 ⊂ Υ l \ Υ int l . By definition, P l g 2 = 0 so by the Cauchy-Schwartz inequality. Thus We now provide estimates for g 1 . Given x ∈ Ξ l and y ∈ Υ l , set f (x + y) = 1 l 2d |x−x|∞<l/2 |y−y|∞<l/2 f (x + y) dy dx and note thatf (x + y) = 1 l d Q l * Γ f P l δ y , δ x whenever x ∈ Ξ int l and y ∈ Υ int l . As in the proof of Theorem 6.2 it follows that Q l * Γ f P l g 1 (x) = l d Γf ,l g 1 (x) for x ∈ Ξ int l . For such x we thus have by Cauchy-Schwartz, and for x ∈ Ξ \ Ξ int l we get due to the definition of Q l . Combining (6.6)-(6.8) we see that Since Ξ and Υ are bounded sets, |Ξ l | and |Υ l | are bounded by some constant C times 1/l d , and as g 1 ≤ g and g 2 ≤ g , it follows that By Proposition 6.3 the last two terms go to 0 as l goes to 0. The same is true for the first term by noting that which is an easy consequence of the equicontinuity of f . Thereby (6.5) follows and the proof is complete.
In particular, we have the following corollary. Note that the domains need not have well-behaved boundaries.
Corollary 6.5. Let Υ and Ξ be open, bounded and connected domains, and let f be a continuous function on cl(Ξ + Υ). We then have Proof. By Propositions 5.1 and 5.3 in [3], the rank of Γ f is independent of Υ and Ξ. Combining this with Theorem 6.2, it is easy to see that it suffices to verify the corollary for any open connected subsets of Υ and Ξ. We can thus assume that their boundaries are well-behaved. By Theorem 6.4 and standard operator theory we have On the other hand, Theorem 6.2 gives

The multidimensional continuous Carathéodory-Fejér theorem
In the two final sections we investigate how the PSD-condition affects the structure of the generating functions. This condition only makes sense as long as Ξ = Υ, which we assume from now on. In this section we show that the natural counterpart of Carathéodory-Fejér's theorem holds for general domain Toeplitz integral operators Θ f , and in the next we consider Fischer's theorem for general domain Hankel integral operators. Then the operator Θ f is PSD and has finite rank K if and only if there exist distinct ξ 1 , . . . , ξ K ∈ R d and c 1 , . . . , c K > 0 such that Proof. Suppose first that Θ f is PSD and has finite rank K. By Theorem 4.4 in [3], f is an exponential polynomial (i.e. can be written as (6.1)). By uniqueness of analytic continuation, it suffices to prove the result when Ξ = Υ are neighborhoods of some fixed point x 0 . By a translation, it is easy to see that we may assume that x 0 = 0. We consider discretizations Θ f,l of Θ f where l assume values 2 −j , j ∈ N. For j large enough, (beyond J say), the operator Γ f,2 −j has rank K (Corollary 6.5) and Theorem 5.1 applies (upon dilation of the grids). We conclude that for j > J the representation (7.1) holds (on Ω 2 −j = Ξ 2 −j − Υ 2 −j ) but the ξ k 's may depend on j. However, since each grid Ω 2 −j−1 is a refinement of Ω 2 −j , Proposition 4.1 guarantees that this dependence on j may only affect the ordering, not the actual values of the set of ξ k 's used in (7.1). We can thus choose the order at each stage so that it does not depend on j. Since f is an exponential polynomial, it is continuous, so taking the limit j → ∞ easily yields that (7.1) holds when x is a continuous variable as well.
Conversely, suppose that f is of the form (7.1). Then Θ f has rank K by Proposition 4.1 in [3] (see also the remarks at the end of Section 2.2). The PSD condition follows by the continuous analogue of (5.8). We remark that the continuous version above differs significantly from the discrete case, even in one dimension, since the sequence (λ n ) 2N n=0 generates a PSD Hankel matrix for all λ ∈ R (even negative values), whereas the base e ξ k is positive in (8.1). Recall also the example (2.3), which does not fit in the discrete version of (8.1).

The multidimensional continuous Fischer theorem
Proof. Surprisingly, the proof is rather different than that of Theorem 7.1. First suppose that Γ f is PSD and has finite rank K. Then f is an exponential polynomial, i.e. has a representation (6.1), by Theorem 4.4 in [3]. Suppose that there are non-constant polynomial factors in the representation (6.1), say p 1 (x)e ζ1x . Let N be the maximum degree of all polynomials {p j } J j=1 . Pick a closed subset Ξ ⊂ Ξ and r > 0 such that dist(Ξ, R d \ Ξ) > 2r. Pick a continuous real valued function g ∈ L 2 (R d ) with support inΞ that is orthogonal to the monomial exponentials {x α e ζj ·x } |α|≤N,1≤j≤J \ {e ζ1·x } (where α ∈ N d and we use standard multi-index notation), but satisfies g, e ζ1·x = 1, (that such a function exists is standard, see e.g. Proposition 3.1 in [3]). A short calculation shows that Γ f g(· − z), g(· − w) = p 1 (z + w)e ζ1·(z+w) (8.2) whenever |z|, |w| < r. Since p 1 is non-constant, there exists a unit length ν ∈ R d such that q(t) = p 1 (rνt) is a non-constant polynomial in t. Set ζ = rζ 1 · ν. Consider the operator A : Clearly A * Γ f A is PSD. It follows by (8.2) and Fubini's theorem that With h(t) = q(t)e ζt , it follows that the operator Γ h : is self adjoint it is easy to see that h(t + s) = h(s + t), (either by repeating arguments from Section 6, or by standard results from integral operator theory). In particular h is real valued. This clearly implies that ζ ∈ R. Now consider the operator B : . As before we see that B * Γ h B = Γ q , and this operator is PSD. Given , (where we identify functions on [0, 1/2] with functions on R that are identically zero outside the interval). It is easy to see that in particular it is PSD. Since q is a polynomial, it is easy to see that (q(· + 2 ) − 2q(· + ) + q(·))/ 2 converges uniformly on compacts to q . By simple estimates based on the Cauchy-Schwartz inequality (see e.g. Proposition 2.1 in [3]), it then follows that the corresponding sequence of operators converges to Γ q (acting on L 2 ([0, 1/2])), which therefore is PSD. Continuing in this way, we see that we can assume that q is of degree 1 or 2, where Γ q acts on an interval [0, 3l] where 3l is a power of 1/2. We first assume that the degree is 2, and parameterize q(t) = a + b(t/l) + c(t/l) 2 .
a + b + c a + 2b + 4c a + b + c a + 2b + 4c a + 3b + 9c a + 2b + 4c a + 3b + 9c a + 4b + 16c   , which then is PSD. However, a (not so) short calculation shows that the determinant of M equals −8c 3 which is a contradiction, since it is less than 0 (recall that c > 0). We now consider the case of degree 1, i.e. c = 0 and b = 0. As above we deduce that the matrix has to be PSD, which contradicts the fact that its determinant is −b 2 .
By this we finally conclude that there can be no polynomial factors in the representation (6.1). By the continuous version of Proposition 4.2 (see Proposition 4.1 in [3]), we conclude that f is of the form (6.2), i.e. f = K k=1 c k e ζ k ·x . From here the proof is easy. Repeating the first steps, we conclude that ζ k · ν ∈ R for all ν ∈ R d , by which we conclude that ζ k are real valued. We therefore call them ξ k henceforth. With this at hand we obviously have for all g ∈ L 2 (Ξ), whereby we conclude that c k > 0.
For the converse part of the statement, let f be of the form (8.1). That Γ f has rank K has already been argued (Proposition 4.1 in [3]) and that Γ f is PSD follows by (8.3). The proof is complete.

Unbounded domains
For completeness, we formulate the results form the previous two sections for unbounded domains. Γ f is defined precisely as before, i.e. via the formula (2.14), except that we now have to assume that f (x+·) is in L 2 (Υ) for every x ∈ Ξ and vice versa, f (·+y) ∈ L 2 (Ξ) for every y ∈ Υ (see definition 1.1 in [3]). Obviously, analogous definitions/restrictions apply to Θ f as well. The main difficulty with unbounded domains is that exponential polynomials then can give rise to unbounded operators. Following [3], we address this by assuming that Ω is convex and we let ∆ Ω 2 denote the set of directions ϑ ∈ R d such that the orthogonal projection of Ω on the half line [0, ∞) · ϑ is a bounded set, and we let int(∆ Ω ) denote its interior.
Proof. This follows by straightforward modifications of the proofs in Section 9 of [3], so we satisfy with outlining the details. The "if" direction is easy so we focus on the "only if". We restrict the operator Γ f to functions living on a subset (see Theorem 9.1 [3]) to obtain a new operator to which Theorem 8.1 above applies. From this we deduce that f locally has the form (8.1). That this formula then holds globally is an immediate consequence of uniqueness of real analytic continuation, combined with the observation that Ω is connected. Finally, the restriction on the ξ k 's is immediate by Theorem 9.3 in [3].
The corresponding situation for general domain Toeplitz integral operators is quite different. We first note that Θ f : L 2 (Υ) → L 2 (Ξ) is bounded if and only if Γ f : L 2 (−Υ) → L 2 (Ξ) is bounded, as mentioned in Section 2.2 and further elaborated on around formula (1.2) in [3]. With this, we immediately obtain the following theorem.
However, if now again we let Ξ = Υ and we additionally impose PSD, the proof of Theorem 9.1 combined with Theorem 7.1 shows that ζ j = iξ j for some ξ ∈ R d . However, Theorem 9.2 then forces 0 = Re ζ j ∈ int(∆ Ω ), which can only happen if ∆ Ω = R d , since it is a cone. This in turn is equivalent to Ω being bounded, so we conclude that Theorem 9.3. Let Ξ = Υ ⊂ R d be convex unbounded domains, set Ω = Ξ − Υ and let f be as in Theorem 9.2. Then Θ f is bounded and PSD if and only if f ≡ 0.

Conclusions
Multidimensional versions of the Kronecker, Carathéodory-Fejér and Fischer theorems are discussed and proven in discrete and continuous settings. The former relates the rank of general domain Hankel and Toeplitz type matrices and operators to the number of exponential polynomials needed for the corresponding generating functions/sequences. The latter two include the condition that the operators be positive semi-definite. The multi-dimensional versions of the Carathéodory-Fejér theorem behave as expected, while the multi-dimensional versions of the Kronecker theorem generically yield more complicated representations, which are clearer in the continuous setting. Fischer's theorem also exhibits a simpler structure in the continuous case than in the discrete. We also show that the discrete case approximates the continuous, given sufficient sampling.
The article [18] is primarily concerned with when the converse of Theorem 11.1 holds. We note that this is the case whenever Rank H N f = Rank H N −1 f ≤ N , see Theorem 3.1 of [18]. Our next concern is to validate the statements concerning the generic form (2.4) of f , i.e. that holds "generically" given that Rank H N f = K ≤ N or Rank T N f = K ≤ N . We only outline the details in the Hankel case, the Toeplitz case being an easy consequence as in the previous proof. We adapt the concept of generic as introduced in Definition 7.2 of [3]. Briefly, this says that when dealing with a set M that is a union of manifolds of possibly different dimensions, a property holds generically if the set M F where the property fails is a union of manifolds of lower dimension than the maximum dimension of the components of M. Now let M be the set of all sequences f such that Rank H N f ≤ K. First note that by considering c k and λ k as variables in C, the expression (11.2) gives rise to a manifold of dimension 2K except at degenerate points (e.g. when two λ k 's coincide). In a similar way, we can consider generating functions of the more general form (11.1) as parts of manifolds where the coefficients of each p j as well as the λ j 's are variables. By counting the number of free variables, it is clear that this gives rise to a manifold of lower dimension than 2K. It remains to prove that the generating sequences f ∈ M that are not of the form (11.1), also are part of manifolds of dimension lower than 2K. This follows from the below proposition, which characterizes all sequences giving rise to rank K Hankel matrices. Let {e l } l∈Z denote the canonical basis of 2 (Z). where J j=1 (deg p j + 1) + k = K. Remark: Note that, arguing as before, the number of free variables in the representation (11.3) at most is 2(K −k)+k = 2K −k (with equality precisely when there are no non-constant polynomials present in the representation (11.3)). The amount of free variables is thus less than 2K − 1 unless k = 0, which concludes the argument preceding the proposition.

Proof. Let
Rank H N f have (r, k)-characteristic, as defined in Section 10, [28]. By definition Rank H r f = Rank H r−1 f = r, and hence (f n ) r n=0 is of the form (11.1) with cardinality r, by Theorem 3.1 in [18]. This expression can be used to define an alternative sequencef , such that Rank M f = r for all M ≥ r. By the uniqueness statement in Theorem 9.2 of [28] and the definition of the (r, k)−characteristic, the representation (11.3) follows. Finally, r + k = K by Theorem 11.1 of [28].