1 Introduction

Linear functionals \(L:\mathcal {V}\rightarrow \mathbb {K}\) with \(\mathbb {K}= \mathbb {R}\) or \(\mathbb {C}\) belong to the most important structures in mathematics, e.g. for separation arguments. If \(\mathcal {V}\) is a vector space of functions \(v:\mathcal {X}\rightarrow \mathbb {K}\) then L is called a moment functional if it is represented by a (non-negative) measure \(\mu \) on \(\mathcal {X}\):

$$\begin{aligned} L(v)=\int _\mathcal {X}v(x)~\textrm{d}\mu (x) \qquad \text {for all}\ v\in \mathcal {V}. \end{aligned}$$

If \(\textrm{supp}\,\mu \subseteq K\subseteq \mathcal {X}\), then L is called a K-moment functional.

Among the moment functionals the most important ones act on polynomials \(\mathcal {V}= \mathbb {R}[x_1,\ldots ,x_n]\) on some \(K\subseteq \mathbb {R}^n\), \(n\in \mathbb {N}\). The name moment comes from physics, especially from the moment of inertia, which is calculated by

$$\begin{aligned} \int _{\mathbb {R}^3} (x_1^2 + x_2^2)\cdot \rho (x_1,x_2,x_3)~\textrm{d}x \end{aligned}$$

for a rotation around the \(x_3\)-axis of a body with density distribution \(\rho \). If K is closed then Haviland’s Theorem [14, 15] states that a linear functional \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) is a K-moment functional if and only if \(L(p)\ge 0\) for all \(p\in \mathbb {R}[x_1,\ldots ,x_n]\) with \(p\ge 0\) on K. On the other hand, \(p\ge 0\) on K if and only if \(L(p)\ge 0\) for all K-moment functionals L since every point evaluation is a moment functional. These are the two directions in the duality theorem, and the many connections between the moment problem (deciding when a linear functional is a moment functional) and non-negative polynomials (and therefore optimization and many other applications) only start here. See e.g. [1, 2, 5, 8, 10, 19, 20, 21, 24, 36] and references therein for more on the moment problem, the connection to non-negative polynomials, and applications.

Besides the one-point evaluation \(L(f) = f(x)\) the following is probably the simplest moment functional.

Example 1.1

Let \(\lambda \) be the Lebesgue measure on [0, 1] and let \(\mathcal {V}= \mathbb {R}[t]\). Then the functional

$$\begin{aligned} L_{\text {Leb}}:\mathbb {R}[t]\rightarrow \mathbb {R}\quad \text {with}\quad L_{\text {Leb}}(t^d) = \int _0^1 t^d~\textrm{d}\lambda (t) = \frac{1}{d+1}\quad \text {for all}\ d\in \mathbb {N}_0, \end{aligned}$$
(1)

is the unique linear functional such that \(L(t^d) = \frac{1}{d+1}\) holds for all \(d\in \mathbb {N}_0\).\(\circ \)
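A minimal sketch of how \(L_{\text {Leb}}\) acts on a polynomial given by its coefficients, using exact rational arithmetic. The function name `l_leb` and the coefficient-list convention are illustrative assumptions and not notation from the text.

```python
from fractions import Fraction

def l_leb(coeffs):
    """Apply L_Leb to p(t) = sum_d coeffs[d] * t**d, using L_Leb(t^d) = 1/(d+1)."""
    return sum(Fraction(c) / (d + 1) for d, c in enumerate(coeffs))

# p(t) = 1 + 2t + 3t^2: each monomial contributes 1 to the integral over [0, 1].
value = l_leb([1, 2, 3])

# q(t) = (t - 1/2)^2 = 1/4 - t + t^2 is non-negative, so L_Leb(q) must be >= 0.
square_value = l_leb([Fraction(1, 4), -1, 1])
```

Linearity is built in: the value on a polynomial is determined by the values on the monomials, which is exactly the uniqueness statement of the example.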

Besides this the general [0, 1]-moment problem (also called the Hausdorff moment problem) is the easiest to decide.

Hausdorff Moment Problem 1.2

(see [13] or [19, Thm. 1.1 and 1.2])

  1. (a)

    Let \(d\in \mathbb {N}\). The following are equivalent.

    1. (i)

      \(L:\mathbb {R}[x]_{\le d}\rightarrow \mathbb {R}\) is a [0, 1]-moment functional.

    2. (ii)

      \(L(p)\ge 0\) holds for all \(p\in \mathbb {R}[x]_{\le d}\) such that \(p\ge 0\) on [0, 1].

  2. (b)

    The following are equivalent.

    1. (i)

      \(L:\mathbb {R}[x]\rightarrow \mathbb {R}\) is a [0, 1]-moment functional.

    2. (ii)

      \(L(p)\ge 0\) holds for all \(p\in \mathbb {R}[x]\) such that \(p\ge 0\) on [0, 1].

This problem is fully solved since by the univariate Positivstellensatz every polynomial \(p\in \mathbb {R}[x]\) which is non-negative on [0, 1] has the form

$$\begin{aligned} p(x) = p_1(x)+ x\cdot (1-x)\cdot p_2(x) = q_1(x) + x\cdot q_2(x) + (1-x)\cdot q_3(x) \end{aligned}$$

for some \(p_i\), \(q_i\in \sum \mathbb {R}[x]^2\) sums of squares. This also holds with the degree bound \(\deg p \le d\).
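For even degree \(d = 2n\), the representation \(p = p_1 + x(1-x)p_2\) with degree bounds can be turned into a finite test on the moments \(s_k = L(x^k)\): \(L\ge 0\) on all such p exactly when the Hankel matrix \((s_{i+j})_{i,j=0}^{n}\) and the localized matrix \((s_{i+j+1}-s_{i+j+2})_{i,j=0}^{n-1}\) are positive semidefinite. This matrix formulation is a standard reformulation that is assumed here and not stated in the text; the function names below are hypothetical.

```python
from fractions import Fraction

def is_psd(matrix):
    """Exact positive-semidefiniteness test via symmetric Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    for k in range(n):
        if a[k][k] < 0:
            return False
        if a[k][k] == 0:
            # A zero pivot is only admissible if the rest of its row vanishes.
            if any(a[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]  # Schur complement update
    return True

def is_truncated_hausdorff(s):
    """Test moments s_0, ..., s_{2n} for a [0, 1]-representing measure."""
    if len(s) % 2 == 0:
        raise ValueError("expected an odd number of moments s_0, ..., s_{2n}")
    n = (len(s) - 1) // 2
    hankel = [[s[i + j] for j in range(n + 1)] for i in range(n + 1)]
    localized = [[s[i + j + 1] - s[i + j + 2] for j in range(n)] for i in range(n)]
    return is_psd(hankel) and is_psd(localized)

# Moments of the Lebesgue measure on [0, 1] from Example 1.1 (d = 6).
lebesgue_ok = is_truncated_hausdorff([Fraction(1, k + 1) for k in range(7)])
# s_1 > s_0 cannot come from a measure supported on [0, 1].
bad_ok = is_truncated_hausdorff([1, 2, 1])
```

Exact fractions are used because Hankel matrices of moments are typically ill-conditioned, so floating-point eigenvalue tests can be unreliable.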

In higher dimensions the problem is not completely solved and several difficulties appear: in \(\mathbb {R}^n\) with \(n\ge 2\) there are non-negative polynomials which are not sums of squares [26], and a tuple \((X_1,\ldots ,X_n)\) of pairwise commuting and symmetric multiplication operators need not have an extension to pairwise commuting and self-adjoint multiplication operators \((\overline{X_1},\ldots ,\overline{X_n})\).

Reduction to lower-dimensional moment problems is one way to handle multidimensional moment problems. For example disintegration [36, Ch. 14.6] can be used which is based on the following disintegration theorem.

Theorem 1.3

(see e.g. [36, Prop. 14.27]) Let \(X\subseteq \mathbb {R}^n\) and \(Y\subseteq \mathbb {R}^m\) with \(n,m\in \mathbb {N}\) be closed subsets. Let \(\nu \) be a finite Radon measure on X and \(p:X\rightarrow Y\) be a \(\nu \)-measurable mapping. Set \(\mu := p(\nu )\). Then there exists a function \(y\mapsto \lambda _y\), \(y\in Y\) and \(\lambda _y\) Radon measures on X, which satisfies the following:

  1. (i)

    \(\textrm{supp}\,\lambda _y\subseteq p^{-1}(y)\),

  2. (ii)

    \(\lambda _y(p^{-1}(y)) = 1\) \(\mu \)-a.e., and

  3. (iii)

    for each nonnegative Borel function \(f:X\rightarrow \mathbb {R}\) we have

    $$\begin{aligned} \int _X f(x)~\textrm{d}\nu (x) = \int _Y \int _X f(x)~\textrm{d}\lambda _y(x)~\textrm{d}\mu (y). \end{aligned}$$
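A simple illustration, with choices made here and not taken from [36]: let \(X=[0,1]^2\) with \(\nu =\lambda ^2\), \(Y=[0,1]\), and \(p(x_1,x_2)=x_1\). Then \(\mu = p(\nu ) = \lambda \), and one admissible choice is \(\lambda _y = \delta _y\otimes \lambda \), which is supported on \(p^{-1}(y)=\{y\}\times [0,1]\) and has total mass 1. In this case (iii) reduces to Fubini's theorem:

$$\begin{aligned} \int _{[0,1]^2} f(x_1,x_2)~\textrm{d}\lambda ^2(x_1,x_2) = \int _0^1 \int _0^1 f(y,x_2)~\textrm{d}\lambda (x_2)~\textrm{d}\lambda (y). \end{aligned}$$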

Our approach in this paper is similar to disintegration. To understand moment functionals better and to simplify them we investigate in this article the possibility of transforming a linear (moment) functional into another linear (moment) functional based on isomorphism and transformation results between measure spaces. While (real) algebraic and functional analytic/operator theoretic results have been applied intensively, deeper measure theoretic aspects have not been studied.

In the theory of moments representing a linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) by an integral (i.e., proving the existence of a representing measure) is done by the use of the Riesz (Riesz–Markov–Kakutani) Theorem [17, 23, 28]. Other classical results in measure theory [12, 16, 25, 30, 31, 37, 38] have not been applied, mainly because these deal with general measurable functions and the moment problem is dominated by the use of polynomials [20, 21, 36]. However, in the operator theoretic approach to the moment problem we work in the general Hilbert space setting, and allowing measurable functions instead of using only polynomials is the natural framework. Also recent developments such as [18] (see Theorem 2.19) have not been considered in the theory of moments so far. The measure theoretic results we use in this paper are distributed over almost a century and hence the mathematical language and definitions used in the original literature can differ enormously from our mathematical language today. We therefore give the original references but mainly refer to [4] for unified and up-to-date formulations and definitions.

Let us have a look at the following theorem to see what kind of results we are looking for.

Theorem 1.4

Let S be a Souslin set (e.g. a Borel set \(S\subseteq \mathbb {R}^n\)), \(\mathcal {V}\) be a vector space of real measurable functions \(v:S\rightarrow \mathbb {R}\), and \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a linear functional. Then the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is an S-moment functional.

  2. (ii)

    There exists a measurable function \(f:[0,1]\rightarrow S\) such that

    $$\begin{aligned} L(v) = \int _0^1 v(f(t))~\textrm{d}\lambda (t) \end{aligned}$$
    (2)

    for all \(v\in \mathcal {V}\) where \(\lambda \) is the Lebesgue measure on [0, 1], i.e., \(\lambda \circ f^{-1}\) is a representing measure of L.

Proof

(i)\(\rightarrow \)(ii): Let \(\mu \) be a representing measure of L. By Corollary 2.15 there exists a measurable function \(f:[0,1]\rightarrow S\) such that \(\mu = \lambda \circ f^{-1}\) and hence

$$\begin{aligned} L(v) = \int _S v(x)~\textrm{d}\mu (x) = \int _S v(x)~\textrm{d}(\lambda \circ f^{-1})(x) \overset{\text {Lemma 2.1}}{=} \int _0^1 v(f(t))~\textrm{d}\lambda (t) \end{aligned}$$

for all \(v\in \mathcal {V}\).

(ii)\(\rightarrow \)(i): \(\lambda \circ f^{-1}\) is a representing measure of L by Lemma 2.1. \(\square \)

Theorem 1.4 can be seen as a complete characterization of (S-)moment functionals, i.e., every moment functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) has the form (2) for some \(f:[0,1]\rightarrow S\). Additionally, Theorem 1.4 also shows that every moment functional L is represented by \(\lambda \circ f^{-1}\) for a measurable function \(f:[0,1]\rightarrow \mathbb {R}^n\).
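The representation (2) can be checked numerically in a simple case. The sketch below assumes, purely for illustration, \(S=[0,\infty )\), \(\mu \) the exponential distribution with density \(e^{-x}\), and \(f(t) = -\ln (1-t)\) its quantile function, so that \(\lambda \circ f^{-1}=\mu \) and \(L(x^k) = k!\). The helper names and the midpoint rule are choices made here, not part of the text.

```python
import math

def moment_via_pullback(k, f, cells=100_000):
    """Approximate the integral of f(t)**k over [0, 1] with the midpoint rule."""
    h = 1.0 / cells
    return h * sum(f((i + 0.5) * h) ** k for i in range(cells))

def quantile_exp(t):
    """Quantile function of the exponential distribution with rate 1."""
    return -math.log(1.0 - t)

# Each entry approximates L(x^k) = k! for k = 0, 1, 2, 3.
moments = [moment_via_pullback(k, quantile_exp) for k in range(4)]
```

The midpoint rule avoids evaluating f at the endpoint \(t=1\), where the quantile function is unbounded.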

Hence, the aim of this paper is to characterize and represent moment functionals in the form of (2) and especially to find additional properties of \(f:[0,1]\rightarrow S\).

The notation and result in Theorem 1.4 motivate the notion of a transformation of a linear (moment) functional. We introduce the following definitions.

Definition 1.5

Let \(\mathcal {X}\) and \(\mathcal {Y}\) be two Souslin spaces, \(\mathcal {U}\) and \(\mathcal {V}\) two vector spaces of real measurable functions on \(\mathcal {X}\) resp. \(\mathcal {Y}\), and \(K:\mathcal {U}\rightarrow \mathbb {R}\) and \(L:\mathcal {V}\rightarrow \mathbb {R}\) be two linear functionals. We say L (continuously) transforms into K, symbolized by \(L\leadsto K\) resp. \(L\overset{{\textit{c}}}{\leadsto }K\), if there exists a Borel (resp. continuous) function \(f:\mathcal {X}\rightarrow \mathcal {Y}\) such that \(\mathcal {V}\circ f \subseteq \mathcal {U}\) and \(L(v) = K(v\circ f)\) for all \(v\in \mathcal {V}\).

We say L strongly (and continuously) transforms into K, symbolized by \(L\overset{{\textit{s}}}{\leadsto }K\) resp. \(L\overset{{\textit{sc}}}{\leadsto }K\), if there exists a surjective Borel (resp. surjective and continuous) function \(f:\mathcal {X}\twoheadrightarrow \mathcal {Y}\) such that \(\mathcal {V}\circ f =\mathcal {U}\) and \(L(v) = K(v\circ f)\) for all \(v\in \mathcal {V}\).

If in this definition of a transformation \(\leadsto \) a function \(f:\mathcal {X}\rightarrow \mathcal {Y}\) is fixed because it has special properties, then we denote that in the transformation by \(\overset{f}{\leadsto }\). Of course, we have the implications

$$\begin{aligned} L\overset{{\textit{s}}}{\leadsto }K \qquad \Rightarrow \qquad L\leadsto K \end{aligned}$$

and

$$\begin{aligned} L\overset{{\textit{sc}}}{\leadsto }K \qquad \Rightarrow \qquad L\overset{{\textit{c}}}{\leadsto }K \qquad \Rightarrow \qquad L\leadsto K. \end{aligned}$$

With this definition Theorem 1.4 can be reformulated to the following statement.

Corollary 1.6

\(L\!:\!\mathcal {V}\!\rightarrow \!\mathbb {R}\) is a moment functional iff \(L\!\leadsto \! [K\!:\!\mathcal {L}^1([0,1],\lambda )\!\rightarrow \!\mathbb {R}]\).

The paper is structured as follows. In Sect. 2 we will give the preliminaries on measure theory and integration. Since most of the measure theoretic terminology and results in Sect. 2 (Souslin sets, Lebesgue–Rohlin spaces, isomorphisms between measure spaces etc.) have to our knowledge never been used in connection with the moment problem before, we give the complete definitions, results, and important examples which are essential for this paper (but without proofs).

In Sect. 3 we present basic properties of transformations (Definition 1.5). E.g. in Theorem 3.3 we show that if there exists a transformation \(L\leadsto K\) and K is a moment functional, then L is also a moment functional. So, the transformation \(\leadsto \) (literally and symbolically) points towards a moment functional K in order to decide whether L itself is a moment functional.

Section 4 then contains the main results where several non-trivial transformations to [0, 1]- or \(I_k\)-moment functionals are presented, where \(I_k\) is a finite union of compact intervals in \(\mathbb {R}\). We show, as might already be apparent from Theorem 1.4, that the structure of the possible moment functionals K is quite simple. These are always [0, 1]- or \(I_k\)-moment functionals. However, this simplicity of K has the price that \(f:[0,1]\rightarrow S\) has few properties. In the worst case as in Theorem 1.4 we only have that f is measurable and each component \(f_i\) is in \(\mathcal {L}([0,1],\lambda )\). We therefore also present results where f is at least continuous and can therefore be approximated by polynomials on [0, 1] in the supremum norm.

In Sect. 5 we give the conclusions and discuss several open questions, especially the restriction that f is a rational or a polynomial map.

2 Preliminaries: Measure Theory and the Lebesgue Integral

We give here the measure theoretic results used in our paper. Of course, it is possible to go directly to Sect. 3 and the main results in Sect. 4 and consult this Sect. 2 if necessary while reading the results and proofs.

In this article we follow the monographs [9, 22], and especially [4] for the measure theory and Lebesgue integral. We denote by \(\mathcal {P}(\mathcal {X})\) the power set of a set \(\mathcal {X}\), i.e., the set of all subsets of \(\mathcal {X}\). Let \(\mathcal {A}\subseteq \mathcal {P}(\mathcal {X})\) be a \(\sigma \)-algebra on a set \(\mathcal {X}\), then we call \((\mathcal {X},\mathcal {A})\) a measurable space. A function \(f:(\mathcal {X},\mathcal {A})\rightarrow (\mathcal {Y},\mathcal {B})\) between measurable spaces is called measurable if \(f^{-1}(B)\in \mathcal {A}\) for all \(B\in \mathcal {B}\) holds.

Given \(\mathcal {F}\subseteq \mathcal {P}(\mathcal {X})\), then by \(\sigma (\mathcal {F})\) we denote the \(\sigma \)-algebra generated by \(\mathcal {F}\), i.e., the smallest \(\sigma \)-algebra containing \(\mathcal {F}\). The Borel \(\sigma \)-algebra \(\mathfrak {B}(\mathcal {X})\) of a topological (e.g. Hausdorff) space \(\mathcal {X}\) is generated by all open sets in \(\mathcal {X}\).

Given a measurable space \((\mathcal {X},\mathcal {A})\), a measure \(\mu \) on \((\mathcal {X},\mathcal {A})\) is a countably additive function \(\mu :\mathcal {A}\rightarrow [0,\infty ]\). I.e., in contrast to [4], for us all measures are non-negative unless explicitly stated to be signed. \((\mathcal {X},\mathcal {A},\mu )\) is called a measure space. \((\mathcal {X},\mathcal {A},\mu )\) is called a probability measure space if additionally \(\mu (\mathcal {X})=1\). An atom \(\delta _x\) is a measure such that

$$\begin{aligned} \delta _x(A) = {\left\{ \begin{array}{ll} 1 &{}\text {for}\ x\in A\\ 0 &{} \text {for}\ x\not \in A\end{array}\right. }. \end{aligned}$$

Let \(\mathcal {X}\) be a topological (e.g. locally compact Hausdorff) space. A measure on \((\mathcal {X},\mathfrak {B}(\mathcal {X}))\) is called a Borel measure. A Radon measure \(\mu \) is a measure on \((\mathcal {X},\mathfrak {B}(\mathcal {X}))\) such that \(\mu (K)<\infty \) for all compact \(K\subseteq \mathcal {X}\) and \(\mu (V) = \sup \{\mu (K) \,|\, K\) is compact, \(K\subseteq V\}\) for all \(V\in \mathfrak {B}(\mathcal {X})\). By \(\lambda ^n\) we denote the n-dimensional Lebesgue measure on \((\mathbb {R}^n,\mathfrak {B}(\mathbb {R}^n))\).

We have the following transformation formula.

Lemma 2.1

Let \(f:(\mathcal {Y},\mathcal {B})\rightarrow (\mathbb {R},\mathfrak {B}(\mathbb {R}))\) and \(g:(\mathcal {X},\mathcal {A})\rightarrow (\mathcal {Y},\mathcal {B})\) be measurable functions, \(\mu \) be a measure on \((\mathcal {X},\mathcal {A})\) such that \(f\circ g\) is \(\mu \)-integrable. Then \(\mu \circ g^{-1}\) is a measure on \((\mathcal {Y},\mathcal {B})\) and f is \(\mu \circ g^{-1}\)-integrable with

$$\begin{aligned} \int _\mathcal {X}(f\circ g)(x)~\textrm{d}\mu (x) = \int _\mathcal {Y}f(y)~\textrm{d}(\mu \circ g^{-1})(y). \end{aligned}$$
(3)

Proof

It is sufficient to show (3) for \(f\ge 0\):

$$\begin{aligned} \int _\mathcal {X}(f\circ g)(x)~\textrm{d}\mu (x)&= \int _0^\infty \mu ((f\circ g)^{-1}((t,\infty )))~\textrm{d}t\\&= \int _0^\infty \mu (g^{-1}(f^{-1}((t,\infty ))))~\textrm{d}t\\&= \int _0^\infty (\mu \circ g^{-1})(f^{-1}((t,\infty )))~\textrm{d}t\\&= \int _\mathcal {Y}f(y)~\textrm{d}(\mu \circ g^{-1})(y). \end{aligned}$$

\(\square \)
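A worked instance of (3), under the illustrative choices \(\mathcal {X}=\mathcal {Y}=[0,1]\), \(\mu =\lambda \), \(g(x)=x^2\), and \(f(y)=y\): the image measure satisfies \((\lambda \circ g^{-1})([0,a]) = \lambda ([0,\sqrt{a}]) = \sqrt{a}\) for all \(a\in [0,1]\), so it has the density \(\frac{1}{2\sqrt{y}}\) on (0, 1], and both sides of (3) agree:

$$\begin{aligned} \int _0^1 x^2~\textrm{d}\lambda (x) = \frac{1}{3} = \int _0^1 y\cdot \frac{1}{2\sqrt{y}}~\textrm{d}\lambda (y). \end{aligned}$$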

We have the first result from measure theory. We apply it in Proposition 4.1.

Proposition 2.2

(see e.g. [4, Prop. 9.1.11]) Let \(\mu \) be an atomless probability measure on a measurable space \((\mathcal {X},\mathcal {A})\). Then there exists an \(\mathcal {A}\)-measurable function \(f:\mathcal {X}\rightarrow [0,1]\) such that \(\mu \circ f^{-1} = \lambda \) is the Lebesgue measure on [0, 1].
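For instance (an illustration that is not taken from [4]), let \(\mathcal {X}=\mathbb {R}\) with its Borel \(\sigma \)-algebra and let \(\mu \) be an atomless probability measure with distribution function \(F(x):=\mu ((-\infty ,x])\). Then F is continuous and non-decreasing, and one may take \(f=F\): for every \(a\in (0,1)\) the set \(\{x\in \mathbb {R}\,|\, F(x)\le a\}\) is an interval \((-\infty ,x_a]\) with \(F(x_a)=a\), hence

$$\begin{aligned} (\mu \circ F^{-1})([0,a]) = \mu (\{x\in \mathbb {R}\,|\, F(x)\le a\}) = F(x_a) = a = \lambda ([0,a]). \end{aligned}$$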

The following is a central definition.

Definition 2.3

([4, Def. 6.6.1]) A set in a Hausdorff space is called a Souslin set if it is the image of a complete separable metric space under a continuous mapping. A Souslin space is a Hausdorff space that is a Souslin set.

The empty set is a Souslin set. Souslin sets are fully characterized.

Proposition 2.4

(see e.g. [4, Prop. 6.6.3]) Every non-empty Souslin set is the image of \([0,1]\setminus \mathbb {Q}\) under some continuous function and also the image of (0, 1) under some Borel mapping.

More concrete examples which are important to us are the following.

Example 2.5

The unit interval \([0,1]\subset \mathbb {R}\) is of course a complete separable metric space (with the usual distance metric \(d(x,y):= |x-y|\)). The question which sets are the continuous images of [0, 1] is partially answered by space filling curves, see e.g. [33, Ch. 5]. So the Peano curves as continuous and surjective functions

$$\begin{aligned} f:[0,1]\rightarrow [a_1,b_1]\times \cdots \times [a_n,b_n] \end{aligned}$$

with \(n\in \mathbb {N}\) and \(-\infty< a_i< b_i < \infty \) for all \(i=1,\ldots ,n\) show that all hyper-rectangles are Souslin spaces/sets. Especially [0, 1] is a Souslin set/space.

A full answer gives the following theorem.

Hahn–Mazurkiewicz’ Theorem 2.6

(see [12, 25] or e.g. [33, Thm. 6.8]) A set K in a non-empty Hausdorff space is the continuous image of [0, 1] if and only if it is compact, connected, and locally connected.

So sets \(K\subseteq \mathbb {R}^n\) are continuous images of [0, 1] if and only if they are compact, connected, and locally connected; in particular they are path-connected. Hahn–Mazurkiewicz also implies that \(\mathbb {P}\mathbb {R}^n\) is a Souslin space.\(\circ \)

Lemma 2.7

(see e.g. [4, Lem. 6.6.5, Thm. 6.6.6 and 6.7.3])

  1. (i)

    The image of a Souslin set under a continuous function to a Hausdorff space is a Souslin set.

  2. (ii)

    Every open or closed set of a Souslin space is Souslin.

  3. (iii)

    If \(A_n\) are Souslin sets in \(\mathcal {X}_n\) for all \(n\in \mathbb {N}\) then \(\prod _{n\in \mathbb {N}} A_n\) is a Souslin set in \(\prod _{n\in \mathbb {N}} \mathcal {X}_n\).

  4. (iv)

    If \(A_n\subseteq \mathcal {X}\) are Souslin sets in a Hausdorff space \(\mathcal {X}\), then \(\bigcap _{n\in \mathbb {N}} A_n\) and \(\bigcup _{n\in \mathbb {N}} A_n\) are Souslin sets.

  5. (v)

    Every Borel subset of a Souslin space is a Souslin space.

  6. (vi)

    Let \(A\subseteq \mathcal {X}\) and \(B\subseteq \mathcal {Y}\) be Souslin sets of Souslin spaces and \(f:\mathcal {X}\rightarrow \mathcal {Y}\) be a Borel function. Then f(A) and \(f^{-1}(B)\) are Souslin sets.

Remark 2.8

The reverse of Lemma 2.7(v) is in general not true. Not every Souslin set is Borel. In fact, every non-empty complete metric space without isolated points contains a non-Borel Souslin set, see e.g. [4, Cor. 6.7.11].\(\circ \)

Lemma 2.7(vi) demonstrates the difference between Souslin sets and Borel sets (in \(\mathbb {R}^n\)). While the preimage of a Borel set under a Borel function is again a Borel set, the image of a Borel set under a Borel (or even continuous) function need not be a Borel set. But as (vi) shows, for Souslin sets both the preimage and the image under Borel functions remain Souslin sets.

From Example 2.5 and Lemma 2.7 we get the following additional explicit examples of Souslin sets.

Example 2.9

\(\mathbb {R}^n\) and every compact semi-algebraic set in \(\mathbb {R}^n\) (resp. \(\mathbb {P}\mathbb {R}^n\)) are Souslin sets.\(\circ \)

Definition 2.10

Let \((\mathcal {X},\mathcal {A})\) and \((\mathcal {Y},\mathcal {B})\) be two measurable spaces. A measurable function \(\iota :(\mathcal {X},\mathcal {A})\rightarrow (\mathcal {Y},\mathcal {B})\) is called an isomorphism and the two measurable spaces isomorphic if \(\iota \) is bijective, \(\iota (\mathcal {A})=\mathcal {B}\), and \(\iota ^{-1}(\mathcal {B})=\mathcal {A}\).

The reason why we work with Souslin spaces is revealed in the following theorem.

Theorem 2.11

(see e.g. [4, Thm. 6.7.4]) Let \(\mathcal {X}\) be a Souslin space. Then there exist a Souslin set \(S\subseteq [0,1]\) and an isomorphism \(\iota :(S,\mathfrak {B}(S))\rightarrow (\mathcal {X},\mathfrak {B}(\mathcal {X}))\).

The existence of an isomorphism can be weakened. For a Borel measurable function \(f:\mathcal {X}\rightarrow \mathcal {Y}\) between two Souslin spaces \(\mathcal {X}\) and \(\mathcal {Y}\) with \(f(\mathcal {X})=\mathcal {Y}\) one always finds nice (i.e., Borel measurable) one-sided inverse functions.

Jankoff’s Theorem 2.12

(see [16] or e.g. [4, Thm. 6.9.1 and 9.1.3]) Let \(\mathcal {X}\) and \(\mathcal {Y}\) be two Souslin spaces and let \(f:\mathcal {X}\rightarrow \mathcal {Y}\) be a surjective Borel mapping. Then there exists a Borel measurable function \(g:\mathcal {Y}\rightarrow \mathcal {X}\) such that \(f(g(y))=y\) for all \(y\in \mathcal {Y}\).

In other words, restricting f to a suitable \(\mathcal {X}_0\subseteq \mathcal {X}\) makes \(\tilde{f}:=f|_{\mathcal {X}_0}\) bijective with both \(\tilde{f}\) and \(\tilde{f}^{-1}\) measurable. We have

$$\begin{aligned} \mathcal {Y}\overset{g}{\rightarrow }\ \mathcal {X}\overset{f}{\rightarrow }\ \mathcal {Y}\qquad \text {with}\qquad f\circ g = \textrm{id}_\mathcal {Y}, \end{aligned}$$

i.e., g is injective, f is surjective, and with \(\mathcal {X}_0 = \textrm{im}\, g:= g(\mathcal {Y})\) we have \(\tilde{f}^{-1} = g\).
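For example (with choices made here for illustration), let \(\mathcal {X}=[-1,1]\), \(\mathcal {Y}=[0,1]\), and \(f(x)=x^2\). Then f is a surjective Borel mapping that is not injective, \(g(y)=\sqrt{y}\) is a Borel measurable right inverse, \(\mathcal {X}_0 = g(\mathcal {Y}) = [0,1]\), and \(\tilde{f} = f|_{[0,1]}\) is a bijection with \(\tilde{f}^{-1}=g\):

$$\begin{aligned} f(g(y)) = (\sqrt{y})^2 = y \qquad \text {for all}\ y\in [0,1]. \end{aligned}$$

The right inverse is not unique: \(g(y) = -\sqrt{y}\) works equally well and selects \(\mathcal {X}_0=[-1,0]\).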

Definition 2.13

(see e.g. [4, Def. 9.2.1]) Let \((\mathcal {X},\mathcal {A},\mu )\) and \((\mathcal {Y},\mathcal {B},\nu )\) be two measure spaces with non-negative measures.

  1. (i)

    A point isomorphism \(T:\mathcal {X}\rightarrow \mathcal {Y}\) is a bijective mapping such that \(T(\mathcal {A}) = \mathcal {B}\) and \(\mu \circ T^{-1} = \nu \).

  2. (ii)

    The spaces \((\mathcal {X},\mathcal {A},\mu )\) and \((\mathcal {Y},\mathcal {B},\nu )\) are called isomorphic \(\textrm{mod}0\) if there exist sets \(N\in \mathcal {A}_\mu \), \(M\in \mathcal {B}_\nu \) with \(\mu (N)=\nu (M)=0\) and a point isomorphism \(T:\mathcal {X}\setminus N\rightarrow \mathcal {Y}\setminus M\) that are equipped with the restriction of the measures \(\mu \) and \(\nu \) and the \(\sigma \)-algebras \(\mathcal {A}_\mu \) and \(\mathcal {B}_\nu \).

A point isomorphism T between \((\mathcal {X},\mathcal {A},\mu )\) and \((\mathcal {Y},\mathcal {B},\nu )\) is of course measurable since \(\nu (B) = (\mu \circ T^{-1})(B) = \mu (T^{-1}(B))\) implies \(T^{-1}(B)\in \mathcal {A}\) for all \(B\in \mathcal {B}\).

Like Theorem 2.11 also the next result shows the importance of working on Souslin sets.

Theorem 2.14

(see e.g. [4, Thm. 9.2.2]) Let \((\mathcal {X},\mathcal {A})\) be a Souslin space with Borel probability measure \(\mu \). Then \((\mathcal {X},\mathcal {A},\mu )\) is isomorphic \(\textrm{mod}0\) to the space \(([0,1],\mathfrak {B}([0,1]),\nu )\) for some \(\nu \) Borel probability measure. If \(\mu \) is an atomless measure, then one can take for \(\nu \) the Lebesgue measure \(\lambda \).

Corollary 2.15

(see e.g. [4, Rem. 9.7.4]) Let \(\mu \) be a probability measure on a Souslin space \(\mathcal {X}\). Then there exists a measurable function \(f:[0,1]\rightarrow \mathcal {X}\) such that \(\mu =\lambda \circ f^{-1}\) where \(\lambda \) is the Lebesgue measure on [0, 1].

For both results note the difference to Proposition 2.2. In Proposition 2.2 we find for any measurable space \(\mathcal {X}\) and atomless probability measure \(\mu \) a map

$$\begin{aligned} f:\mathcal {X}\rightarrow [0,1]\qquad \text {such that}\qquad \lambda = \mu \circ f^{-1}. \end{aligned}$$

But for Souslin spaces \(\mathcal {X}\) in Corollary 2.15 we find a map

$$\begin{aligned} f:[0,1]\rightarrow \mathcal {X}\qquad \text {such that}\qquad \mu = \lambda \circ f^{-1}. \end{aligned}$$

Theorem 2.14 restricts \(f:[0,1]\rightarrow \mathcal {X}\) to isomorphisms and hence not all measures can be transformed into \(\lambda \). Atoms in the measure \(\mu \) prevent it from being isomorphic to \(\lambda \). In fact, as explained in [4, Rem. 9.7.4], Corollary 2.15 follows from Theorem 2.14 by making f constant on suitable subsets of [0, 1], which introduces the atoms of \(\mu \).
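How a constant piece of f produces an atom can be seen in a small numerical sketch. It assumes, for illustration only, the measure \(\mu = \frac{1}{2}\lambda |_{[0,1]} + \frac{1}{2}\delta _2\) on \(\mathbb {R}\) and the map f that is linear on [0, 1/2) and constant on [1/2, 1]; all names below are hypothetical.

```python
def f(t):
    """Map [0, 1] -> R: linear on [0, 1/2), constant (atom at 2) on [1/2, 1]."""
    return 2.0 * t if t < 0.5 else 2.0

def pulled_back_moment(k, cells=20_000):
    """Midpoint-rule approximation of the integral of f(t)**k over [0, 1]."""
    h = 1.0 / cells
    return h * sum(f((i + 0.5) * h) ** k for i in range(cells))

def exact_moment(k):
    """k-th moment of mu = 0.5 * Lebesgue on [0, 1] + 0.5 * delta_2."""
    return 0.5 / (k + 1) + 0.5 * 2.0 ** k

# Largest deviation between the pulled-back and the exact moments for k = 0..4.
max_gap = max(abs(pulled_back_moment(k) - exact_moment(k)) for k in range(5))
```

The linear piece spreads half of the Lebesgue mass uniformly over [0, 1), while the constant piece sends the other half to the single point 2.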

But Theorem 2.14 provides that if \(\mu \) has atoms, it can still be transformed by an isomorphism \(\textrm{mod}0\) into a measure \(\nu \) on [0, 1]. Without atoms we could choose \(\nu = \lambda \). So is it possible to transform the non-atomic part of \(\mu \) to \(\lambda \) and then add the atoms from \(\mu \) to \(\lambda \)? Yes, we can. This is done on the following spaces.

Definition 2.16

(see e.g. [4, Def. 9.4.6]) A measure space \((\mathcal {X},\mathcal {A},\mu )\) is called a Lebesgue–Rohlin space if it is isomorphic \(\textrm{mod}0\) to some measure space \((\mathcal {Y},\mathcal {B},\nu )\) with a countable basis with respect to which \(\mathcal {Y}\) is complete.

Example 2.17

(see e.g. [4, Exm. 9.4.2]) \((M,\mathfrak {B}(M),\mu )\), where M is a Borel set of a complete separable metric space \(\mathcal {X}\) and \(\mu \) is a Borel measure on M, is a Lebesgue–Rohlin space. Especially \(\mathcal {X}= \mathbb {R}^n\) or \(\mathbb {P}\mathbb {R}^n\) are complete separable metric spaces and therefore any Borel measure on a Borel subset \(M\in \mathfrak {B}(\mathbb {R}^n)\) gives a Lebesgue–Rohlin space.\(\circ \)

We can now transform any measure by an isomorphism \(\textrm{mod}0\) to the Lebesgue measure \(\lambda \) plus atoms.

Theorem 2.18

(see e.g. [4, Thm. 9.4.7]) Let \((\mathcal {X},\mathcal {A},\mu )\) be a Lebesgue–Rohlin space with a probability measure \(\mu \). Then it is isomorphic \(\textrm{mod}0\) to the interval [0, 1] with the measure \(\nu = c\lambda + \sum _{n=1}^\infty c_n\cdot \delta _{1/n}\), where \(c=1-\sum _{n=1}^\infty c_n\), \(\mu (a_n) = c_n\) and \(\{a_n\}\subseteq \mathcal {X}\) is the family of all atoms of \(\mu \).

So we can transform any measure to the Lebesgue measure \(\lambda \) on [0, 1] or to \(\lambda \) on [0, 1] plus atoms. But these transformations are performed mainly by measurable functions because the set \(\mathcal {X}\) where the original measure lives is too large. If we restrict the space where the measure lives, we get better transformations, especially continuous ones.

Theorem 2.19

(see [18] or e.g. [4, Thm. 9.7.1]) Let K be a compact metric space that is the image of [0, 1] under a continuous mapping \(\tilde{f}\) and let \(\mu \) be a Borel probability measure on K such that \(\textrm{supp}\,\mu =K\). Then there exists a continuous and surjective mapping \(f:[0,1]\rightarrow K\) such that \(\mu = \lambda \circ f^{-1}\), \(\lambda \) is the Lebesgue measure on [0, 1].

We will apply Theorem 2.19 especially in connection with Theorem 2.6. The advantage here is that f on [0, 1] is continuous and can therefore be approximated by polynomials up to any precision \(\varepsilon >0\) in the \(\sup \)-norm.
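One concrete way to obtain such polynomial approximations is via Bernstein polynomials, applied to each component of f separately. This is only one possible technique and is not prescribed by the text; the test function `kink` and the grid-based error estimate are illustrative assumptions.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f as a function on [0, 1]."""
    values = [f(k / n) for k in range(n + 1)]

    def b(t):
        return sum(values[k] * comb(n, k) * t**k * (1 - t) ** (n - k)
                   for k in range(n + 1))

    return b

def sup_error(f, n, grid=200):
    """Maximum of |f - B_n f| on an equispaced grid, a proxy for the sup norm."""
    b = bernstein(f, n)
    return max(abs(f(i / grid) - b(i / grid)) for i in range(grid + 1))

def kink(t):
    """A continuous but non-differentiable test component."""
    return abs(t - 0.5)

err_10 = sup_error(kink, 10)
err_200 = sup_error(kink, 200)
```

For merely continuous f the convergence can be slow, as the kink shows, but the error still tends to zero as the degree grows.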

3 Transformations of Linear Functionals: Basic Properties

For the transformation \(\leadsto \) between two linear functionals in Definition 1.5 we get the following technical result.

Lemma 3.1

Let \(\mathcal {X}\), \(\mathcal {Y}\), and \(\mathcal {Z}\) be Souslin spaces; \(\mathcal {U}\), \(\mathcal {V}\), and \(\mathcal {W}\) be vector spaces of real measurable functions on \(\mathcal {X}\), \(\mathcal {Y}\), and \(\mathcal {Z}\) respectively; and \(M:\mathcal {W}\rightarrow \mathbb {R}\), \(L:\mathcal {V}\rightarrow \mathbb {R}\), and \(K:\mathcal {U}\rightarrow \mathbb {R}\) be linear functionals. The following hold:

  1. (i)

    \(M\leadsto L\) and \(L\leadsto K\) imply \(M\leadsto K\).

  2. (ii)

    \(M\overset{{\textit{c}}}{\leadsto }L\) and \(L\overset{{\textit{c}}}{\leadsto }K\) imply \(M\overset{{\textit{c}}}{\leadsto }K\).

  3. (iii)

    \(M\overset{{\textit{s}}}{\leadsto }L\) and \(L\overset{{\textit{s}}}{\leadsto }K\) imply \(M\overset{{\textit{s}}}{\leadsto }K\).

  4. (iv)

    \(M\overset{{\textit{sc}}}{\leadsto }L\) and \(L\overset{{\textit{sc}}}{\leadsto }K\) imply \(M\overset{{\textit{sc}}}{\leadsto }K\).

Proof

(i): Since \(M\leadsto L\) there exists a Borel function \(f:\mathcal {Y}\rightarrow \mathcal {Z}\) such that \(\mathcal {W}\circ f\subseteq \mathcal {V}\) and \(M(w) = L(w\circ f)\) for all \(w\in \mathcal {W}\). And since \(L\leadsto K\) there exists a Borel function \(g:\mathcal {X}\rightarrow \mathcal {Y}\) such that \(\mathcal {V}\circ g\subseteq \mathcal {U}\) and \(L(v) = K(v\circ g)\) for all \(v\in \mathcal {V}\). Hence, the Borel function \(h:=f\circ g:\mathcal {X}\rightarrow \mathcal {Z}\) satisfies \(\mathcal {W}\circ h = \mathcal {W}\circ f\circ g \subseteq \mathcal {V}\circ g\subseteq \mathcal {U}\) and \(M(w) = L(w\circ f) = K(w\circ f\circ g) = K(w\circ h)\) for all \(w\in \mathcal {W}\), i.e., \(M\leadsto K\).

(ii)-(iv) follow in the same way as (i). \(\square \)

Lemma 3.1 can be seen as shortening the sequence:

$$\begin{aligned} M\leadsto L\leadsto K \qquad \Rightarrow \qquad M\leadsto K. \end{aligned}$$
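The shortening can be mirrored in a few lines of code, where a functional is modelled as a callable acting on functions. The concrete K (a finite atomic functional), the maps g and f, and the test function w are hypothetical choices made for this sketch.

```python
from fractions import Fraction

def compose(outer, inner):
    """Return the composition x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

def pull_back(functional, f):
    """Given K and f, return the functional v -> K(v o f)."""
    return lambda v: functional(compose(v, f))

# Hypothetical K: finitely many atoms (point: mass) on X = [0, 1].
ATOMS = {Fraction(0): Fraction(1, 4),
         Fraction(1, 2): Fraction(1, 4),
         Fraction(1): Fraction(1, 2)}

def K(u):
    return sum(mass * u(x) for x, mass in ATOMS.items())

def g(x):          # X = [0, 1] -> Y = [-1, 1]
    return 2 * x - 1

def f(y):          # Y = [-1, 1] -> Z = [0, 1]
    return y * y

def w(z):          # a test function on Z
    return 3 * z + 1

L = pull_back(K, g)                       # L transforms into K via g
M = pull_back(L, f)                       # M transforms into L via f
M_direct = pull_back(K, compose(f, g))    # M transforms into K via h = f o g
```

Evaluating `M(w)` and `M_direct(w)` gives the same number, which is the identity \(M(w) = K(w\circ h)\) from the proof of Lemma 3.1(i).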

The next lemma shows that a strong transformation \(L\overset{{\textit{s}}}{\leadsto }K\) implies the reverse transformation \(K\leadsto L\).

Lemma 3.2

Let \(\mathcal {X}\) and \(\mathcal {Y}\) be Souslin sets, \(\mathcal {U}\) and \(\mathcal {V}\) vector spaces of real functions on \(\mathcal {X}\) resp. \(\mathcal {Y}\), and \(L:\mathcal {V}\rightarrow \mathbb {R}\) and \(K:\mathcal {U}\rightarrow \mathbb {R}\) be linear functionals. Then \(L\overset{{\textit{s}}}{\leadsto }K\) implies \(K\leadsto L\).

Proof

Since \(L\overset{{\textit{s}}}{\leadsto }K\) there exists a surjective Borel function \(f:\mathcal {X}\rightarrow \mathcal {Y}\) such that \(L(v) = K(v\circ f)\) and \(\mathcal {V}\circ f = \mathcal {U}\). Since f is surjective by Theorem 2.12 there exists a Borel function \(g:\mathcal {Y}\rightarrow \mathcal {X}\) such that \(f(g(y))=y\) for all \(y\in \mathcal {Y}\). Let \(u\in \mathcal {U}= \mathcal {V}\circ f\), then v in \(u = v\circ f\) is unique since for \(v_1\) and \(v_2\) with that property we have

$$\begin{aligned} v_1 = v_1\circ f\circ g = u\circ g = v_2\circ f\circ g = v_2. \end{aligned}$$

Hence, \(\mathcal {U}\circ g = \mathcal {V}\) and for all \(u\in \mathcal {U}\) we have

$$\begin{aligned} K(u) = K(v\circ f) = L(v) = L(v\circ f\circ g) = L(u\circ g). \end{aligned}$$

\(\square \)

While we have so far only transformed linear functionals, the importance of the transformation is revealed in the following result. It shows that the property of being a moment functional is preserved in one or both directions.

Theorem 3.3

Let \(\mathcal {X}\) and \(\mathcal {Y}\) be Souslin sets, \(\mathcal {U}\) and \(\mathcal {V}\) vector spaces of real functions on \(\mathcal {X}\) resp. \(\mathcal {Y}\), and \(L\!:\mathcal {V}\rightarrow \mathbb {R}\) and \(K\!:\mathcal {U}\rightarrow \mathbb {R}\) be linear functionals. If \(L\leadsto K\), then

  1. (i)

    K is a moment functional

implies

  1. (ii)

    L is a moment functional.

If \(L\overset{{\textit{s}}}{\leadsto }K\), then (i) \(\Leftrightarrow \) (ii).

Proof

Since \(L\leadsto K\) there exists a Borel function \(f:\mathcal {X}\rightarrow \mathcal {Y}\) such that \(\mathcal {V}\circ f\subseteq \mathcal {U}\) and \(L(v) = K(v\circ f)\) for all \(v\in \mathcal {V}\).

(i)\(\rightarrow \)(ii): Let K be a moment functional with representing measure \(\nu \) on \(\mathcal {X}\), then

$$\begin{aligned} L(v) = K(v\circ f) = \int _\mathcal {X}(v\circ f)(x)~\textrm{d}\nu (x) \overset{\text {Lemma 2.1}}{=} \int _\mathcal {Y}v(y)~\textrm{d}(\nu \circ f^{-1})(y), \end{aligned}$$

i.e., \(\nu \circ f^{-1}\) is a representing measure of L and hence L is a moment functional.

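(ii)\(\rightarrow \)(i): If \(L\overset{{\textit{s}}}{\leadsto }K\), then Lemma 3.2 implies \(K\leadsto L\), and the implication already proved, applied to \(K\leadsto L\) with the roles of K and L exchanged, shows that K is a moment functional whenever L is one. \(\square \)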
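The change of variables used in (i)\(\rightarrow \)(ii) can be checked by hand on a finitely atomic measure. The following Python sketch is only an illustration: the helper names `pushforward` and `integrate`, the sample measure `nu`, the map `f`, and the test function `v` are assumptions made for this example and do not appear in the text.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(nu, f):
    """Return the image measure nu o f^{-1} of a finitely atomic measure.

    nu is a dict {point x: mass nu({x})}; f maps points of X to points of Y.
    """
    image = defaultdict(Fraction)
    for x, mass in nu.items():
        image[f(x)] += mass  # masses of all x with the same image f(x) add up
    return dict(image)


def integrate(v, measure):
    """Integrate the function v against a finitely atomic measure."""
    return sum(v(point) * mass for point, mass in measure.items())


# Hypothetical data: nu lives on X = {-2, -1, 1, 2}, f(x) = x^2 maps into Y = {1, 4}.
nu = {-2: Fraction(1, 4), -1: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(1, 4)}
f = lambda x: x * x
v = lambda y: 3 * y + 1  # a test function v on Y

lhs = integrate(lambda x: v(f(x)), nu)  # K(v o f) = int_X (v o f) d nu
rhs = integrate(v, pushforward(nu, f))  # L(v)     = int_Y v d(nu o f^{-1})
```

Both sides agree because every atom of \(\nu \) at x contributes its mass to the atom of \(\nu \circ f^{-1}\) at f(x), which is the content of Lemma 2.1 in this discrete setting.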

The importance of the transformation and hence Theorem 3.3 can be seen in

(4)

If K is a moment functional, then all \(L_1,\ldots , L_8\) are moment functionals. Now assume that all transformations \(\leadsto \) in (4) are strong transformations \(\overset{{\textit{s}}}{\leadsto }\). Then, if one \(L_i\) or K is a moment functional, all \(K,L_1,\ldots ,L_8\) are moment functionals.

Note, the transformation \(\leadsto \) in Definition 1.5 also covers extensions and restrictions of functionals. Let \(f = \textrm{id}_\mathcal {X}\) and let \(\mathcal {V}\) be a vector space of measurable functions on \(\mathcal {X}\), \(\mathcal {V}_0\subseteq \mathcal {V}\) be a linear subspace, and \(L:\mathcal {V}\rightarrow \mathbb {R}\) a linear functional. Then

$$\begin{aligned} L|_{\mathcal {V}_0}\overset{\textrm{id}_\mathcal {X}}{\leadsto }\ L. \end{aligned}$$

Or if \(L_i:\mathcal {V}_i\rightarrow \mathbb {R}\) are extensions of L, i.e., \(\mathcal {V}\subseteq \mathcal {V}_1\subseteq \mathcal {V}_2\subseteq \dots \subseteq \mathcal {V}_k\) with \(L_i = L_{i+1}|_{\mathcal {V}_i}\), then

$$\begin{aligned} L\overset{\textrm{id}_\mathcal {X}}{\leadsto }\ L_1 \overset{\textrm{id}_\mathcal {X}}{\leadsto }\ L_2 \overset{\textrm{id}_\mathcal {X}}{\leadsto }\dots \overset{\textrm{id}_\mathcal {X}}{\leadsto }\ L_k \qquad \text {or short}\qquad L\leadsto L_1\leadsto L_2\leadsto \dots \leadsto L_k \end{aligned}$$

shows that if \(L_k\) is a moment functional, then all \(L_i\) and L are moment functionals.

So far we introduced the transformation of a linear functional and derived its basic properties. But as seen from Theorem 1.4 and Corollary 1.6, there are also non-trivial results for the transformations. The next section is devoted to these non-trivial transformation results.

4 Non-trivial Transformations of Linear Functionals

Let \(\mathcal {V}\) be a (finite or infinite dimensional) vector space of measurable functions on a Souslin space \(\mathcal {X}\). Then by Theorem 2.11 there exist a Souslin set \(S\subseteq [0,1]\) and an isomorphism \(h:(S,\mathfrak {B}(S))\rightarrow (\mathcal {X},\mathfrak {B}(\mathcal {X}))\). This implies that \(\tilde{L}:\tilde{\mathcal {V}}\rightarrow \mathbb {R}\) with \(\tilde{\mathcal {V}}:=\{f\circ h \,|\, f\in \mathcal {V}\}\) and \(\tilde{L}(g):= L(g\circ h^{-1})\), \(g\in \tilde{\mathcal {V}}\), is a linear functional, but now the functions in \(\tilde{\mathcal {V}}\) live on \(S\subseteq [0,1]\). In particular, L is a moment functional if and only if \(\tilde{L}\) is a moment functional.

For example, let \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) be a moment functional with \(\mathcal {X}=\mathbb {R}^n\). Then \(h=(h_1,\ldots ,h_n):S\subseteq [0,1]\rightarrow \mathbb {R}^n\) is an isomorphism between \((S,\mathfrak {B}(S))\) and \((\mathbb {R}^n,\mathfrak {B}(\mathbb {R}^n))\) and \(\tilde{L}\) is a moment functional with \(\tilde{L}(h^\alpha ) = L(x^\alpha )\).

However, by Remark 2.8, S need not be a Borel set. So determining whether \(\tilde{L}\) is a moment functional might be as hard as determining whether L is a moment functional. Additionally, \(\tilde{L}\) no longer lives on polynomials but evaluates measurable functions \(h^\alpha = h_1^{\alpha _1}\cdots h_n^{\alpha _n}\) with \(\alpha =(\alpha _1,\ldots ,\alpha _n)\in \mathbb {N}_0^n\).

Allowing general Borel measurable functions on measurable spaces instead of isomorphisms we get Theorem 1.4 in the introduction. There we showed that any moment functional can be expressed as integration with respect to the Lebesgue measure \(\lambda \) on [0, 1].

The next result shows that any moment functional with an atomless representing measure has a “direction” in which it looks like (1), i.e., the Lebesgue measure on [0, 1] evaluated on \(\mathbb {R}[t]\).

Proposition 4.1

Let \(\mathcal {V}\) be a vector space of real measurable functions on a measurable space \((\mathcal {X},\mathcal {A})\) such that there exists an element \(v\in \mathcal {V}\) with \(1\le v\) on \(\mathcal {X}\) and let \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a moment functional which has an atomless representing measure. Then there exist a measurable function \(f:\mathcal {X}\rightarrow [0,1]\) and an extension \(\overline{L}:\mathcal {V}+\mathbb {R}[f]\rightarrow \mathbb {R}\) of L such that \(\overline{L}(f^d) = \frac{\overline{L}(1)}{d+1}\) for all \(d\in \mathbb {N}_0\), i.e., \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) with \(\tilde{L}(t^d):= \overline{L}(f^d)\) for all \(d\in \mathbb {N}_0\) is represented by \(\overline{L}(1)\cdot \lambda \) where \(\lambda \) is the Lebesgue measure on [0, 1].

Proof

Let \(\mu \) be an atomless representing measure of L. Since there is a \(v\in \mathcal {V}\) with \(1\le v\) on \(\mathcal {X}\), the measure \(\mu \) is finite: \(\mu (\mathcal {X})\le L(v)<\infty \). By Proposition 2.2 (after normalizing \(\mu \) if necessary) there exists a measurable \(f:\mathcal {X}\rightarrow [0,1]\) such that \(\mu \circ f^{-1} = \mu (\mathcal {X})\cdot \lambda \) on [0, 1]. Since f is measurable and \(|f|\le 1\) on \(\mathcal {X}\), all \(f^d\), \(d\in \mathbb {N}_0\), are \(\mu \)-integrable:

$$\begin{aligned} \left| \int _{\mathcal {X}} f^d(x)~\textrm{d}\mu (x)\right| \le \int _{\mathcal {X}} |f(x)|^d~\textrm{d}\mu (x)\le \int _{\mathcal {X}} 1~\textrm{d}\mu (x) = \mu (\mathcal {X})<\infty . \end{aligned}$$

Define \(\overline{L}:\mathcal {V}+\mathbb {R}[f]\rightarrow \mathbb {R}\) by \(\overline{L}(w):= \int _{\mathcal {X}} w(x)~\textrm{d}\mu (x)\). Then \(\overline{L}\) extends L, \(\overline{L}(1)=\mu (\mathcal {X})\), and

$$\begin{aligned} \overline{L}(f^d) = \int _{\mathcal {X}} f^d(x)~\textrm{d}\mu (x) \overset{\text {Lemma 2.1}}{=} \int _0^1 t^d~\textrm{d}(\mu \circ f^{-1})(t) = \mu (\mathcal {X})\cdot \int _0^1 t^d~\textrm{d}\lambda (t) = \frac{\overline{L}(1)}{d+1}, \end{aligned}$$

i.e., \(\tilde{L}\) is represented by \(\overline{L}(1)\cdot \lambda \) on [0, 1]. \(\square \)

Hence, for any moment functional with an atomless representing measure there exists a function f (a direction) such that it acts on \(\mathbb {R}[f]\cong \mathbb {R}[t]\) as (1), i.e., the Lebesgue measure on [0, 1]. Under some mild conditions every truncated moment functional in the interior of the truncated moment cone has an atomless representing measure. We can even find a linear combination of Gaussian distributions (Gaussian mixture) as a representing measure. This was proven in [7] for the first time.
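As a worked instance of Proposition 4.1 (an illustration chosen here, not taken from the text), let \(\mathcal {X}=[0,\infty )\), let \(\mu \) be the measure with density \(e^{-x}\) with respect to the Lebesgue measure, and set \(f(x):= 1-e^{-x}\). Then \(\mu \) is atomless with \(\mu (\mathcal {X})=1\), f maps \(\mathcal {X}\) into [0, 1], and the substitution \(t=f(x)\), \(\textrm{d}t = e^{-x}~\textrm{d}x\), gives

$$\begin{aligned} \int _0^\infty f(x)^d~\textrm{d}\mu (x) = \int _0^\infty (1-e^{-x})^d\cdot e^{-x}~\textrm{d}x = \int _0^1 t^d~\textrm{d}t = \frac{1}{d+1} \qquad \text {for all}\ d\in \mathbb {N}_0. \end{aligned}$$

So in the direction of this f the functional given by integration against \(\mu \) acts on \(\mathbb {R}[f]\) exactly as the Lebesgue measure on [0, 1] acts on \(\mathbb {R}[t]\).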

Using the transformation \(\leadsto \) formulation with \(L_{\text {Leb}}\) from Example 1.1 we can visualize Proposition 4.1 as

Note the contrapositive of Proposition 4.1: if there is no measurable f such that the linear functional L can be (continuously) extended to \(\mathbb {R}[f]\) with \(\overline{L}(f^d) = \frac{\overline{L}(1)}{d+1}\) for all \(d\in \mathbb {N}_0\), then L is not a moment functional with an atomless representing measure.

Theorem 1.4 and Proposition 4.1 are very general. In particular, Theorem 1.4 works on arbitrary Borel sets of \(\mathbb {R}^n\) (in fact on every Souslin space). For this generality we have to pay the price that f is in general only measurable. Additionally, since we always express L as integration with respect to \(\lambda \) on [0, 1], the chosen f depends on L. If we want additional properties for f to hold, especially continuity and independence from L, then we need to restrict the functionals we want to transform. This can be achieved by restricting the investigation to K-moment functionals on compact and path-connected sets \(K\subset \mathbb {R}^n\). Then Theorem 2.6 provides surjective and continuous functions \(f:[0,1]\rightarrow K\). We find the following result.

Theorem 4.2

Let \(n\in \mathbb {N}\) be a natural number, \(K\subset \mathbb {R}^n\) be a compact and path-connected set, and let \(\mathcal {V}\) be a vector space of real measurable functions on \((K,\mathfrak {B}(K))\). Then any surjective and continuous function \(f:[0,1]\rightarrow K\) induces for any linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) a strong and continuous transformation

$$\begin{aligned} L:\mathcal {V}\rightarrow \mathbb {R}\quad \overset{sc:f}{\leadsto }\quad \tilde{L}:\mathcal {V}\circ f\rightarrow \mathbb {R}, \end{aligned}$$

i.e., for any linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional.

  2. (ii)

    \(\tilde{L}:\mathcal {V}\circ f\rightarrow \mathbb {R}\) defined by \(\tilde{L}(v\circ f):= L(v)\) is a [0, 1]-moment functional.

If \(\tilde{\mu }\) is a representing measure of \(\tilde{L}\), then \(\tilde{\mu }\circ f^{-1}\) is a representing measure of L.

There exists a measurable function \(g:K\rightarrow [0,1]\) such that \(f(g(x))=x\) for all \(x\in K\) and if \(\mu \) is a representing measure of L, then \(\mu \circ g^{-1}\) is a representing measure of \(\tilde{L}\).

Proof

Since \(K\subset \mathbb {R}^n\) is compact and path-connected, by the Hahn–Mazurkiewicz’ Theorem 2.6 there exists a continuous and surjective function \(f:[0,1]\rightarrow K\). By Example 2.5 or Lemma 2.7 [0, 1] and K are Souslin spaces and f is Borel measurable (since it is continuous). By Theorem 2.12 there exists a measurable function \(g:K\rightarrow [0,1]\) such that

$$\begin{aligned} f(g(x))=x\quad \text {for all}\ x\in K. \end{aligned}$$
(5)

(5) implies that \(\tilde{L}\) is well-defined by \(\tilde{L}(v\circ f) = L(v)\). To show this, for \(\tilde{v}\in \mathcal {V}\circ f\) let \(v_1,v_2\in \mathcal {V}\) be such that \(v_1\circ f = \tilde{v} = v_2\circ f\). Then (5) implies \(v_1 = v_1\circ f\circ g = \tilde{v}\circ g = v_2\circ f\circ g = v_2\), i.e., for any \(\tilde{v}\in \mathcal {V}\circ f\) there is a unique \(v\in \mathcal {V}\) with \(\tilde{v} = v\circ f\).

(i)\(\rightarrow \)(ii): Let \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a K-moment functional and \(\mu \) be a representing measure of L, i.e., \(\textrm{supp}\,\mu \subseteq K\) and

$$\begin{aligned} L(v) = \int _K v(x)~\textrm{d}\mu (x) \quad \text {for all}\ v\in \mathcal {V}. \end{aligned}$$

Then

$$\begin{aligned} \tilde{L}(v\circ f) = L(v) = \int _K v(x)~\textrm{d}\mu (x)&{=} \int _K (v\circ f)(g(x))~\textrm{d}\mu (x)\\&\overset{\text {Lemma 2.1}}{=} \int _0^1 (v\circ f)(y)~\textrm{d}(\mu \circ g^{-1})(y), \end{aligned}$$

i.e., \(\mu \circ g^{-1}\) is a representing measure of \(\tilde{L}\) and hence \(\tilde{L}\) is a [0, 1]-moment functional.

(ii)\(\rightarrow \)(i): Let \(\tilde{\mu }\) be a representing measure of \(\tilde{L}:\tilde{\mathcal {V}}\rightarrow \mathbb {R}\). Then

$$\begin{aligned} L(v) = \tilde{L}(v\circ f) = \int _0^1 (v\circ f)(y)~\textrm{d}\tilde{\mu }(y) \overset{\text {Lemma 2.1}}{=} \int _K v(x)~\textrm{d}(\tilde{\mu }\circ f^{-1})(x), \end{aligned}$$

i.e., \(\tilde{\mu }\circ f^{-1}\) is a representing measure of L with \(\textrm{supp}\,\tilde{\mu }\circ f^{-1}\subseteq K\) and L is therefore a K-moment functional. \(\square \)

In the previous result the functions \(f:[0,1]\rightarrow K\) and \(g:K\rightarrow [0,1]\) do not depend on the functions \(\mathcal {V}\) or the functional \(L:\mathcal {V}\rightarrow \mathbb {R}\). They depend only on K. We can therefore fix such functions f and g and investigate any L resp. \(\tilde{L}\).
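To make the roles of f and g in Theorem 4.2 concrete, the following Python sketch treats the unit circle \(K\subset \mathbb {R}^2\), which is compact and path-connected. The choice of K, the explicit maps `f` and `g`, and the finitely atomic measure `mu` are assumptions made for this illustration only. Note that f(0) = f(1), so f is not injective, and the right inverse g is measurable but not continuous.

```python
import math


def f(t):
    """Continuous surjection [0, 1] -> unit circle K (an illustrative choice)."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


def g(x):
    """Measurable right inverse K -> [0, 1] of f, i.e. f(g(x)) = x on K."""
    return (math.atan2(x[1], x[0]) % (2 * math.pi)) / (2 * math.pi)


def monomial(alpha):
    """Return the function x -> x_1^{alpha_1} * x_2^{alpha_2}."""
    return lambda x: x[0] ** alpha[0] * x[1] ** alpha[1]


# Hypothetical K-moment functional: a finitely atomic measure mu on the circle,
# stored as a list of (point, mass) pairs.
angles = [0.3, 1.7, 2.9, 4.4, 5.8]
masses = [0.5, 1.0, 0.25, 2.0, 0.75]
mu = [(f(a / (2 * math.pi)), m) for a, m in zip(angles, masses)]

# Representing measure mu o g^{-1} of the transformed functional on [0, 1].
mu_tilde = [(g(x), m) for x, m in mu]


def L(v):
    """L(v) = integral of v against mu."""
    return sum(m * v(x) for x, m in mu)


def L_tilde_of_v_after_f(v):
    """Evaluate L~(v o f) as the integral of v o f against mu o g^{-1}."""
    return sum(m * v(f(t)) for t, m in mu_tilde)


max_right_inverse_error = max(math.dist(f(g(x)), x) for x, _ in mu)
max_moment_error = max(
    abs(L(monomial((a, b))) - L_tilde_of_v_after_f(monomial((a, b))))
    for a in range(4)
    for b in range(4)
)
```

Up to floating point rounding, both `max_right_inverse_error` and `max_moment_error` vanish, reflecting \(f\circ g = \textrm{id}_K\) and \(\tilde{L}(v\circ f) = L(v)\) for the monomials tested.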

If the continuous f can be chosen for each L, then in Theorem 4.2(ii) we can even ensure that \(\tilde{L}\) is represented by the Lebesgue measure \(\lambda \) on [0, 1] if and only if L has a representing measure \(\mu \) with \(\textrm{supp}\,\mu = K\), see Theorem 4.11 below.

In Theorem 4.2 we required that K consists of one path-connected component. If K consists of more than one component, then we can glue the parts together.

Corollary 4.3

Let \(n\in \mathbb {N}\) and \(K\subset \mathbb {R}^n\) be the union of \(k\in \mathbb {N}\cup \{\infty \}\) compact, path-connected and pairwise disjoint sets \(K_i\subset \mathbb {R}^n\): \(K = \bigcup _{i=1}^k K_i\). Let \(\mathcal {V}\) be a vector space of real valued measurable functions on \((K,\mathfrak {B}(K))\). There exists a continuous surjective function

$$\begin{aligned} f:\bigcup _{i=1}^k [2i-2,2i-1]\rightarrow K \end{aligned}$$

such that for any linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional.

  2. (ii)

    \(\tilde{L}:\tilde{\mathcal {V}}\rightarrow \mathbb {R}\) on \(\tilde{\mathcal {V}}:= \{v\circ f \,|\, v\in \mathcal {V}\}\) and defined by \(\tilde{L}(v\circ f):= L(v)\) is a \(\bigcup _{i=1}^k [2i-2,2i-1]\)-moment functional.

Proof

It is sufficient to show the existence of the function f (and g). The rest of the proof is verbatim the same as in the proof of Theorem 4.2.

Since for each \(i=1,2,\ldots ,k\) the set \(K_i\) is compact and path-connected and the translation of the unit interval [0, 1] to \([2i-2,2i-1]\) is continuous, by Theorem 2.6 there exists a continuous and surjective \(f_i:[2i-2,2i-1]\rightarrow K_i\). Define \(f:\bigcup _{i=1}^k [2i-2,2i-1]\rightarrow K\) by \(f(x):= f_i(x)\) if \(x\in [2i-2,2i-1]\) for an \(i\in \{1,2,\ldots ,k\}\). Since the intervals \([2i-2,2i-1]\) are pairwise disjoint and relatively open in their union, f is continuous and surjective.

For \(g:K\rightarrow \bigcup _{i=1}^k [2i-2,2i-1]\) we proceed in the same way. By Theorem 2.12 for each \(f_i:[2i-2,2i-1]\rightarrow K_i\) there exists a measurable \(g_i: K_i\rightarrow [2i-2,2i-1]\) with \(f_i(g_i(x))=x\) for all \(x\in K_i\). Hence, we define g by \(g(x):= g_i(x)\) if \(x\in K_i\), so that \(f(g(x))=x\) for all \(x\in K\). \(\square \)

Note that when K consists of countably many compact and path-connected components (\(k=\infty \)), then in Corollary 4.3 the function f is no longer defined on a bounded (and therefore compact) set, since its domain is \(\bigcup _{i=1}^\infty [2i-2,2i-1]\). But if, e.g., K is a compact and semi-algebraic set, then K has only finitely many path-connected components.

An advantage in Theorem 4.2 is that \(f=(f_1,\ldots ,f_n):[0,1]\rightarrow K\subset \mathbb {R}^n\) is continuous. Hence, all coordinate functions \(f_i:[0,1]\rightarrow \mathbb {R}\) are continuous. By the Stone–Weierstrass Theorem we can approximate each \(f_i\) in the \(\sup \)-norm on [0, 1] by polynomials to any precision. Therefore f can be approximated to any precision by a polynomial map. A representing measure \(\tilde{\mu }\) of \(\tilde{L}\) provides the representing measure \(\tilde{\mu }\circ f^{-1}\) of L. An approximation \(f_\varepsilon \in \mathbb {R}[t]^n\) of f, i.e., \(\sup _{t\in [0,1]} \Vert f(t) - f_\varepsilon (t)\Vert <\varepsilon \) with any (fixed) norm \(\Vert \,\cdot \,\Vert \) on \(\mathbb {R}^n\) and \(\varepsilon >0\), provides an approximate representing measure \(\tilde{\mu }\circ f_\varepsilon ^{-1}\) of L.
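A constructive way to obtain such polynomial approximations is via Bernstein polynomials, which converge uniformly on [0, 1] for every continuous function. The Python sketch below uses them in place of the abstract Stone–Weierstrass argument. The curve `f` (a parametrization of the unit circle), the helper names, and the grid-based estimate of the \(\sup \)-norm error are assumptions of this illustration.

```python
import math


def bernstein(h, degree):
    """Return the Bernstein polynomial of h on [0, 1] of the given degree."""
    values = [h(k / degree) for k in range(degree + 1)]

    def b(t):
        return sum(
            values[k] * math.comb(degree, k) * t ** k * (1 - t) ** (degree - k)
            for k in range(degree + 1)
        )

    return b


# Illustrative continuous curve f = (f_1, f_2): [0, 1] -> R^2 (unit circle).
f = (lambda t: math.cos(2 * math.pi * t), lambda t: math.sin(2 * math.pi * t))


def sup_error(degree, samples=400):
    """Estimate sup_t ||f(t) - f_eps(t)|| (Euclidean norm) on a uniform grid."""
    f_eps = [bernstein(f_i, degree) for f_i in f]
    grid = [j / samples for j in range(samples + 1)]
    return max(
        math.hypot(f[0](t) - f_eps[0](t), f[1](t) - f_eps[1](t)) for t in grid
    )


errors = {degree: sup_error(degree) for degree in (10, 40, 160)}
```

The estimated errors decrease as the degree grows, but only slowly. Bernstein polynomials are chosen here for their simplicity, not for their approximation rate.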

Let \(K\subset \mathbb {R}^n\) be a compact and path-connected set, \(\mathcal {V}= \mathbb {R}[x_1,\ldots ,x_n]\), and \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a linear functional. Then the induced functional \(\tilde{L}:\tilde{\mathcal {V}}\rightarrow \mathbb {R}\) on [0, 1] is defined by \(\tilde{L}(p\circ f):= L(p)\). It depends on \(p\circ f\), i.e., \(f^\alpha = f_1^{\alpha _1}\ldots f_n^{\alpha _n}\), \(\alpha =(\alpha _1,\ldots ,\alpha _n)\in \mathbb {N}_0^n\). So as in Theorem 1.4 the algebraic structure of \(\mathbb {R}[x_1,\ldots ,x_n]\) remains but the domain K is pulled back to [0, 1] by the continuous f.

That the algebraic structure remains also reveals one big difference between L and \(\tilde{L}\). For example, \(\mathcal {V}=\mathbb {R}[x_1,\ldots ,x_n]\) separates points and is therefore dense in \(C(K,\mathbb {R})\). But \(f:[0,1]\rightarrow K\) is a space filling curve and therefore never injective (Netto’s Theorem). Hence, there are \(t_1,t_2\in [0,1]\) with \(t_1\ne t_2\) and \(f(t_1)=f(t_2)\). The set \(\tilde{\mathcal {V}}:= \{p\circ f\,|\, p\in \mathcal {V}\}\) therefore does not separate \(t_1\) from \(t_2\) and is by the Stone–Weierstrass Theorem not dense in \(C([0,1],\mathbb {R})\). So the \(\tilde{L}\) in Theorem 4.2 and Corollary 4.3 cannot at this point be reduced to the setting of Theorem 1.2.

In the next theorem we will identify each K-moment functional with a [0, 1]-moment functional on \(\mathbb {R}[t]\), i.e., with the setting of Theorem 1.2.

Theorem 4.4

Let \(n\in \mathbb {N}\) be a natural number and \(K\subset \mathbb {R}^n\) be a compact and path-connected set. Then there exists a measurable function

$$\begin{aligned} g:K\rightarrow [0,1] \end{aligned}$$

such that for all linear functionals \(L:\mathcal {V}\rightarrow \mathbb {R}\) with \(1\in \mathcal {V}\subseteq C(K,\mathbb {R})\) the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional.

  2. (ii)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) continuously (see Footnote 1) extends to \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) such that \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) defined by \(\tilde{L}(t^d):=\overline{L}(g^d)\) for all \(d\in \mathbb {N}_0\) is a [0, 1]-moment functional, i.e.,

    (6)

If \(\mu \) is the representing measure of L, then \(\mu \circ g^{-1}\) represents \(\tilde{L}\).

Additionally, there exists a continuous and surjective function \(f\!:\![0,1]\rightarrow K\) independent of L resp. \(\tilde{L}\) such that \(f(g(x)) = x\) for all \(x\in K\) and if \(\tilde{\mu }\) is the representing measure of \(\tilde{L}\), then \(\tilde{\mu }\circ f^{-1}\) is the representing measure of L.

Proof

Since K is a compact and path-connected set, by the Hahn–Mazurkiewicz’ Theorem 2.6 there exists a continuous and surjective function \(f:[0,1]\rightarrow K\). By Lemma 2.7 [0, 1] and K are Souslin sets and hence by Theorem 2.12 there exists a measurable function \(g:K\rightarrow [0,1]\) such that

$$\begin{aligned} f(g(x))=x\quad \text {for all}\quad x\in K. \end{aligned}$$
(7)

(i)\(\rightarrow \)(ii): Let \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a K-moment functional and \(\mu \) be a representing measure of L with \(\textrm{supp}\,\mu \subseteq K\). g is measurable with \(|g|\le 1\) and hence we have that all \(g^d\), \(d\in \mathbb {N}_0\), are \(\mu \)-integrable by

$$\begin{aligned} \left| \int _K g(x)^d~\textrm{d}\mu (x)\right| \le \int _K |g(x)|^d~\textrm{d}\mu (x)\le \int _K 1~\textrm{d}\mu (x) =\mu (K) = L(1) \end{aligned}$$
(8)

and hence L extends to \(\mathbb {R}[g]\). Let \(p\in \mathbb {R}[t]\), then

$$\begin{aligned} \tilde{L}(p) = L(p\circ g) = \int _K (p\circ g)(x)~\textrm{d}\mu (x) \overset{\text {Lemma 2.1}}{=} \int _0^1 p(t)~\textrm{d}(\mu \circ g^{-1})(t) \end{aligned}$$

and \(\mu \circ g^{-1}\) is a representing measure of \(\tilde{L}\), i.e., \(\tilde{L}\) is a [0, 1]-moment functional.

(ii)\(\rightarrow \)(i): Let \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) be a [0, 1]-moment functional and \(\tilde{\mu }\) be its unique representing measure. Since by the Stone–Weierstrass Theorem \(\mathbb {R}[t]\) is dense in \(C([0,1],\mathbb {R})\) the moment functional \(\tilde{L}\) extends uniquely to \(C([0,1],\mathbb {R})\). For simplicity we denote this extension also \(\tilde{L}:C([0,1],\mathbb {R})\rightarrow \mathbb {R}\). Since \(f:[0,1]\rightarrow K\) is continuous we have \(v\circ f\in C([0,1],\mathbb {R})\) for all \(v\in \mathcal {V}\). By (7) we have \(v = v\circ f\circ g\) for all \(v\in \mathcal {V}\) and hence

$$\begin{aligned} L(v) = L(v\circ f\circ g). \end{aligned}$$
(9)

But since \(v\circ f:[0,1]\rightarrow \mathbb {R}\) is continuous and \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) uniquely extends to \(C([0,1],\mathbb {R})\) we have

$$\begin{aligned} L(v\circ f\circ g) = \tilde{L}(v\circ f). \end{aligned}$$
(10)

In summary we get

$$\begin{aligned}&L(v) \overset{(9)}{=}\ L(v\circ f\circ g) \overset{(10)}{=}\ \tilde{L}(v\circ f) = \int _0^1 (v\circ f)(t)~\textrm{d}\tilde{\mu }(t) \nonumber \\&\qquad \overset{\text {Lem. 2.1}}{=} \int _K v(x)~\textrm{d}(\tilde{\mu }\circ f^{-1})(x) \end{aligned}$$
(11)

for all \(v\in \mathcal {V}\), i.e., \(\tilde{\mu }\circ f^{-1}\) is a representing measure of L and L is therefore a K-moment functional. \(\square \)

We see that everything about L is already determined once we know how it acts (via \(\tilde{L}\)) on powers of the fixed function g, which is independent of L. Determining whether \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) is a moment functional is only a Hausdorff moment problem, and its representing measure \(\tilde{\mu }\) provides a representing measure \(\mu = \tilde{\mu }\circ f^{-1}\) of L via a fixed continuous function f, which is also independent of L.

Remark 4.5

Note that in Theorem 4.4, and therefore also in Corollary 4.8, the condition \(1\in \mathcal {V}\) can be weakened to:

$$\begin{aligned} \text {There exists a}\ v\in \mathcal {V}\subseteq C(K,\mathbb {R})\ \text {such that}\ v>0\ \text {on}\ K. \end{aligned}$$

By compactness of K and continuity of v this implies \(1\le c\cdot v\in \mathcal {V}\) for some \(c>0\), i.e., \(\mu (K)<\infty \) in (8). However, since we have to extend \(L:\mathcal {V}\rightarrow \mathbb {R}\) to \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) and \(1\in \mathbb {R}[g]\), we can assume w.l.o.g. that already \(1\in \mathcal {V}\). If \(1\not \in \mathcal {V}\) and L cannot be extended to 1, then L can certainly not be extended to \(\mathbb {R}[g]\) and the statements of Theorem 4.4 and Corollary 4.8 remain valid.\(\circ \)

Theorem 4.4 requires the existence of a continuous extension \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) of L. Under the very mild condition \(1\in \mathcal {V}\) (resp. \(v\in \mathcal {V}\) with \(v > 0\) on K by the previous remark) extensions (not necessarily continuous) exist.

Lemma 4.6

Let g be as in Theorem 4.4 (resp. Corollary 4.8) and \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a linear functional on the vector space \(\mathcal {V}\) with \(1\in \mathcal {V}\subseteq C(K,\mathbb {R})\) and \(L(1)>0\). Then there exists an extension \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) of \(L:\mathcal {V}\rightarrow \mathbb {R}\).

Proof

Since \(g:K\rightarrow I_k\subseteq [0,1]\) in Theorem 4.4 (resp. Corollary 4.8) we have \(|g|\le 1\). Hence, \(1\in \mathcal {V}\cap \mathbb {R}[g]\ne \emptyset \) and \(\mathcal {V}+\mathbb {R}[g] = \mathcal {V}\oplus (\mathbb {R}[g]{\setminus }\mathcal {V})\), i.e., \(f=f_1+f_2\in \mathcal {V}+\mathbb {R}[g]\) with unique \(f_1\in \mathcal {V}\) and \(f_2\in \mathbb {R}[g]{\setminus }\mathcal {V}\). Define

$$\begin{aligned} p:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\qquad \text {by}\qquad p(f):= |L(f_1)| + L(1)\cdot \Vert f_2\Vert _\infty \end{aligned}$$

for all \(f=f_1 + f_2\in \mathcal {V}+\mathbb {R}[g]\), \(f_1\in \mathcal {V}\), and \(f_2\in \mathbb {R}[g]{\setminus }\mathcal {V}\). Hence, \(L(f)\le p(f)\) for all \(f\in \mathcal {V}\). Then

$$\begin{aligned} p(f+h)\le p(f)+p(h) \qquad \text {and}\qquad p(\alpha \cdot f) = \alpha \cdot p(f) \end{aligned}$$

hold for all \(f,h\in \mathcal {V}+\mathbb {R}[g]\) and \(\alpha \ge 0\). By the Hahn–Banach Theorem there exists an extension \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) of L. \(\square \)

An extension \(\overline{L}\) in Lemma 4.6 is in general not unique. If \(\mathcal {V}\) is a point separating algebra on K and L is a K-moment functional, then the extension \(\overline{L}\) is unique (and continuous), since then the representing measure \(\mu \) of L is unique.

For the existence of the extension \(\overline{L}\) it is only necessary that \(1\in \mathcal {V}\) to ensure \(|g|\le 1\in \mathcal {V}\). Continuity of the functions in \(\mathcal {V}\) is actually not necessary, and hence the assumptions of Lemma 4.6 can easily be weakened.

As in Theorem 4.2 also in Theorem 4.4 the functions f and g do not depend on L or \(\tilde{L}\). They depend only on K. And as in Proposition 4.1 the functional \(\tilde{L}\) is defined in one “direction” \(\mathbb {R}[g]\cong \mathbb {R}[t]\) by \(\tilde{L}(t^d):=\overline{L}(g^d)\). But now it no longer needs to be \(L_{\text {Leb}}\) as in Example 1.1.

The problem of determining whether \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) in Theorem 4.4(ii) is a [0, 1]-moment functional is exactly the problem treated in Theorem 1.2. This problem is fully solved, analytically as well as numerically. But the function \(g:K\rightarrow [0,1]\) to establish the equivalence (i) \(\Leftrightarrow \) (ii) in Theorem 4.4 is a measurable function and not a polynomial. Hence, \(\overline{L}(g^d)\) is not directly accessible unless of course \(d=0\). Fortunately, since \(K\subset \mathbb {R}^n\) is compact, \(\mathbb {R}[x_1,\ldots ,x_n]\) is dense in \(C(K,\mathbb {R})\). Hence, for any given finite measure \(\mu \) on K, i.e., \(\mu (K)=L(1)<\infty \), we can approximate g by a polynomial \(g_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\) in the \(L^1(\mu )\)-norm to any arbitrary precision.
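To illustrate what a numerical solution looks like in the truncated case, the following Python sketch tests whether finitely many numbers \(s_0,\ldots ,s_{2m}\) are the moments of a measure on [0, 1]. It relies on a standard criterion that is not stated in this section and is used here as an assumption: such a measure exists if and only if the Hankel matrices \((s_{i+j})_{i,j=0}^{m}\) and \((s_{i+j+1}-s_{i+j+2})_{i,j=0}^{m-1}\) are positive semidefinite. All function names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def is_psd(matrix):
    """Exact positive semidefiniteness test for a symmetric rational matrix.

    Uses successive Schur complements: a PSD matrix has non-negative diagonal,
    and a zero diagonal entry forces the whole row to vanish.
    """
    a = [row[:] for row in matrix]
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot < 0:
            return False
        if pivot == 0:
            if any(a[k][j] != 0 for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


def hankel(seq, size):
    """Return the Hankel matrix (seq[i + j]) for 0 <= i, j < size."""
    return [[seq[i + j] for j in range(size)] for i in range(size)]


def is_truncated_unit_interval_moment_sequence(moments):
    """Test s_0, ..., s_{2m} for a representing measure supported on [0, 1]."""
    if len(moments) % 2 == 0:
        raise ValueError("expected an odd number of moments s_0, ..., s_{2m}")
    s = [Fraction(x) for x in moments]
    m = (len(s) - 1) // 2
    # Moments of the functional p -> L~(t * (1 - t) * p), up to degree 2m - 2.
    localized = [s[d + 1] - s[d + 2] for d in range(2 * m - 1)]
    return is_psd(hankel(s, m + 1)) and is_psd(hankel(localized, m))


lebesgue = [Fraction(1, d + 1) for d in range(5)]  # moments of lambda on [0, 1]
dirac_at_two = [1, 2, 4]                           # moments of delta_2
```

For instance, `is_truncated_unit_interval_moment_sequence(lebesgue)` holds, while the moments of \(\delta _2\) pass the first Hankel condition but fail the second one, which detects that the support leaves [0, 1].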

Theorem 4.7

Let \(n\in \mathbb {N}\) be a natural number, \(K\subset \mathbb {R}^n\) be a compact and path-connected set, and let \(g:K\rightarrow [0,1]\) be from Theorem 4.4. Then for any \(\varepsilon >0\) and K-moment functional \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) there exists a polynomial \(g_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\) such that

$$\begin{aligned} L(|g_\varepsilon -g|) \le \varepsilon \qquad \text {and}\qquad |L(g^d) - L(g_\varepsilon ^d)| \le d\cdot L(|g-g_\varepsilon |) \le d \cdot \varepsilon \end{aligned}$$

hold for all \(d\in \mathbb {N}_0\). \(g_\varepsilon \) can be chosen to be a square: \(g_\varepsilon = p_\varepsilon ^2\) for some \(p_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\).

Proof

L is a K-moment functional and therefore has a unique representing measure \(\mu \) with \(\textrm{supp}\,\mu \subseteq K\). \(g\ge 0\) and hence there exists a measurable function \(p:K\rightarrow [0,1]\) such that \(g=p^2\). Since K is compact and \(\mu (K) = L(1) < \infty \) the polynomials \(\mathbb {R}[x_1,\ldots ,x_n]\) are dense in \(L^1(K,\mu )\). By

$$\begin{aligned} \left| \int _K p(x)~\textrm{d}\mu (x)\right| \le \int _K |p(x)|~\textrm{d}\mu (x) \le \int _K 1~\textrm{d}\mu (x) = L(1)<\infty \end{aligned}$$

we have \(p\in L^1(K,\mu )\) and therefore for any \(\varepsilon >0\) there exists a \(p_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\) such that \(|p_\varepsilon |\le 1\) on K and

$$\begin{aligned} \Vert p-p_\varepsilon \Vert _{L^1(K,\mu )} = \int _K |p(x) - p_\varepsilon (x)|~\textrm{d}\mu (x) \le \frac{1}{2}\varepsilon . \end{aligned}$$

Set \(g_\varepsilon := p_\varepsilon ^2\). Then

$$\begin{aligned} L(|g-g_\varepsilon |)= & {} \int _K |g-g_\varepsilon |~\textrm{d}\mu (x) = \int _K |p^2(x) - p_\varepsilon ^2(x)|~\textrm{d}\mu (x) \nonumber \\= & {} \int _K |p-p_\varepsilon |\cdot |p+p_\varepsilon |~\textrm{d}\mu (x) \le 2\int _K |p(x) - p_\varepsilon (x)|~\textrm{d}\mu (x) \le \varepsilon .\nonumber \\ \end{aligned}$$
(12)

For \(d=0\) we have \(g^0 = g_\varepsilon ^0 = 1\), i.e., \(L(g^0) = L(1) = L(g_\varepsilon ^0)\), and for \(d=1\) we have \(|L(g)-L(g_\varepsilon )|\le L(|g-g_\varepsilon |)\le \varepsilon \). So let \(d\ge 2\). Then

$$\begin{aligned} |L(g^d) - L(g_\varepsilon ^d)|\le & {} L(|g^d - g_\varepsilon ^d|) = \int _K |g(x)^d - g_\varepsilon (x)^d|~\textrm{d}\mu (x) \nonumber \\= & {} \int _K |g(x)-g_\varepsilon (x)|\cdot \left| \sum _{i=0}^{d-1} g(x)^i\cdot g_\varepsilon (x)^{d-1-i} \right| ~\textrm{d}\mu (x) \\\le & {} d\cdot \int _K |g(x)-g_\varepsilon (x)|~\textrm{d}\mu (x) \le d\cdot \varepsilon .\nonumber \end{aligned}$$
(13)

\(\square \)
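The pointwise estimate behind (13) is elementary and may be worth isolating. It uses only that \(0\le g\le 1\) and \(|g_\varepsilon | = p_\varepsilon ^2\le 1\) on K, the latter under the assumption that \(p_\varepsilon \) is chosen with \(|p_\varepsilon |\le 1\) on K:

$$\begin{aligned} |a^d - b^d| = |a-b|\cdot \left| \sum _{i=0}^{d-1} a^i\cdot b^{d-1-i}\right| \le |a-b|\cdot \sum _{i=0}^{d-1} |a|^i\cdot |b|^{d-1-i} \le d\cdot |a-b| \qquad \text {for all}\ a,b\in [-1,1]. \end{aligned}$$

As a numerical check with values chosen for illustration, \(a=0.9\), \(b=0.8\), and \(d=3\) give \(|a^3-b^3| = 0.729-0.512 = 0.217\le 3\cdot 0.1 = 0.3\).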

Note that \(g_\varepsilon \) depends not only on \(\varepsilon >0\) but also on L resp. its representing measure \(\mu \). Since g is measurable (but not necessarily continuous) it is in general not possible to get \(\sup _{x\in K} |g(x) - g_\varepsilon (x)| \le \varepsilon \). To see that the dependence on L cannot be removed, assume we find a \(g_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\) such that for any moment functional L (with \(L(1)=1\)), i.e., any measure \(\mu \) on K with \(\mu (K)=1\), we have \(\Vert g-g_\varepsilon \Vert _{L^1(K,\mu )}\le \varepsilon \). Then for \(\mu =\delta _x\), \(x\in K\), we get

$$\begin{aligned} \sup _{x\in K} |g(x)-g_\varepsilon (x)| = \sup _{x\in K} \Vert g-g_\varepsilon \Vert _{L^1(K,\delta _x)} \le \varepsilon , \end{aligned}$$

a contradiction. So the choice of \(g_\varepsilon \) depends on L resp. \(\mu \).

Additionally, note that \(g_\varepsilon \) can be chosen not only to be a square but in fact to be any power: \(g_\varepsilon = p_\varepsilon ^k\) for a fixed \(k\in \mathbb {N}\). Just replace \(p:=\sqrt{g}\) by \(p:= \root k \of {g}\) in the proof, which is possible since \(g\ge 0\), and use the factorization from (13) also in (12).

In Corollary 4.3 we extended Theorem 4.2 from a compact and path-connected \(K\subset \mathbb {R}^n\) to an at most countable union of pairwise disjoint, compact, and path-connected \(K_i\)’s. In Theorem 4.4 we required that K is a compact and path-connected set. Since we needed compactness of [0, 1] in Theorem 4.4 we can at least extend Theorem 4.4 to a finite (disjoint) union of compact and path-connected sets.

Corollary 4.8

Let k, \(n\in \mathbb {N}\) be natural numbers and \(K\subset \mathbb {R}^n\) be the union of finitely many compact, path-connected, and pairwise disjoint sets \(K_i\): \(K = \bigcup _{i=1}^k K_i\). Then there exists a measurable function

$$\begin{aligned} g:K\rightarrow I_k:= \bigcup _{i=1}^{k} \left[ \frac{2i-2}{2k-1},\frac{2i-1}{2k-1}\right] \subset [0,1] \end{aligned}$$

such that for all linear functionals \(L: \mathcal {V}\rightarrow \mathbb {R}\) with \(1\in \mathcal {V}\subseteq C(K,\mathbb {R})\) the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional.

  2. (ii)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) continuously extends to \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) such that \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) defined by \(\tilde{L}(t^d):= \overline{L}(g^d)\) for all \(d\in \mathbb {N}_0\) is a [0, 1]-moment functional.

Proof

For all \(i=1,\ldots ,k\) the sets \(K_i\) and \([\frac{2i-2}{2k-1},\frac{2i-1}{2k-1}]\) are compact and path-connected and therefore by the Theorem 2.6 there exist continuous and surjective functions \(f_i:[\frac{2i-2}{2k-1},\frac{2i-1}{2k-1}]\rightarrow K_i\). By Lemma 2.7 all \(K_i\) and \([\frac{2i-2}{2k-1},\frac{2i-1}{2k-1}]\) are Souslin sets and hence by Theorem 2.12 there exist measurable functions \(g_i:K_i\rightarrow [\frac{2i-2}{2k-1},\frac{2i-1}{2k-1}]\) such that \(f_i(g_i(x))=x\) for all \(x\in K_i\), \(i=1,\ldots ,k\). Define

$$\begin{aligned} f&:I_k\rightarrow K=\bigcup _{i=1}^k K_i \quad \text {by}\ f(x) = f_i(x)\ \text {for}\ x\in \left[ \frac{2i-2}{2k-1},\frac{2i-1}{2k-1}\right] \end{aligned}$$

and

$$\begin{aligned} g&:K=\bigcup _{i=1}^k K_i\rightarrow I_k\quad \text {by}\ g(x) = g_i(x)\ \text {for}\ x\in K_i. \end{aligned}$$

Then \(f(g(x)) = x\) for all \(x\in K\) and \(I_k\subset [0,1]\).

(i)\(\rightarrow \)(ii) and (ii)\(\rightarrow \)(i) are verbatim the same as in the proof of Theorem 4.4. \(\square \)
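The index set \(I_k\) consists of k closed intervals of length \(\frac{1}{2k-1}\), separated by gaps of the same length, with the first interval starting at 0 and the last one ending at 1. The following Python sketch lists these intervals and locates the component containing a given point. The function names are hypothetical and only serve this illustration.

```python
from fractions import Fraction


def interval_family(k):
    """Return the k closed intervals whose union is I_k, as (left, right) pairs."""
    return [
        (Fraction(2 * i - 2, 2 * k - 1), Fraction(2 * i - 1, 2 * k - 1))
        for i in range(1, k + 1)
    ]


def component_index(t, k):
    """Return i with t in the i-th interval of I_k, or None if t lies in a gap."""
    for i, (left, right) in enumerate(interval_family(k), start=1):
        if left <= t <= right:
            return i
    return None


intervals = interval_family(3)  # the three intervals forming I_3
```

For \(k=3\) this yields \([0,\frac{1}{5}]\), \([\frac{2}{5},\frac{3}{5}]\), and \([\frac{4}{5},1]\). A point t in the i-th interval is mapped by f into \(K_i\), while points in the gaps do not belong to the domain of f.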

We are again facing the problem that g is measurable but not necessarily a polynomial. But as in Theorem 4.7 we can approximate g by polynomials.

Corollary 4.9

Let \(n,k\in \mathbb {N}\) be natural numbers, \(K\subset \mathbb {R}^n\) the union of finitely many compact, path-connected, and pairwise disjoint sets \(K_i\), \(K = \bigcup _{i=1}^k K_i\), and let \(g:K\rightarrow I_k\) be from Corollary 4.8. Then for any \(\varepsilon >0\) and K-moment functional \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) there exists a polynomial \(g_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\) such that

$$\begin{aligned} L(|g_\varepsilon -g|) \le \varepsilon \qquad \text {and}\qquad |L(g^d) - L(g_\varepsilon ^d)| \le d\cdot L(|g-g_\varepsilon |) \le d \cdot \varepsilon \end{aligned}$$

hold for all \(d\in \mathbb {N}_0\). \(g_\varepsilon \) can be chosen to be a square: \(g_\varepsilon = p_\varepsilon ^2\) for some \(p_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\).

Proof

Since \(I_k\subset [0,1]\) it is verbatim the same as the proof of Theorem 4.7. \(\square \)

Note that in Theorem 4.7 and Corollary 4.9 we have \(|\tilde{L}(t^d)|\le \tilde{L}(1) = L(1)\), i.e., for \(d > 2\cdot \tilde{L}(1)/\varepsilon \) the error bound \(d\cdot \varepsilon \) exceeds \(2\cdot \tilde{L}(1)\) and is no longer informative.

We have seen in Theorem 4.2 resp. Corollary 4.3 that a linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional (K is the countable union of compact and path-connected sets) if and only if it can be transformed by a continuous function \(f:I\rightarrow K\) to an I-moment functional (I is the countable union of intervals \([a_i,b_i]\subset \mathbb {R}\)).

If we allow not only continuous functions f, then we can generalize this. If we drop continuity of f but add bijectivity almost everywhere we find that any functional on a Borel set of \(\mathbb {R}^n\) is a moment functional if and only if we can transform it into a moment functional with representing measure “Lebesgue measure on [0, 1] plus countably many point evaluations”, see (14).

Theorem 4.10

Let \(n\in \mathbb {N}\) be a natural number, \(B\in \mathfrak {B}(\mathbb {R}^n)\) be a Borel set, and \(\mathcal {V}\) be a vector space of real measurable functions on B with \(1\in \mathcal {V}\). Then the following are equivalent.

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a B-moment functional.

  2. (ii)

    There exist Borel sets \(M\in \mathfrak {B}(B)\) and \(N\in \mathfrak {B}([0,1])\) and a bijective and measurable function (isomorphism) \(f:[0,1]{\setminus } N\rightarrow B{\setminus } M\) such that

    $$\begin{aligned} L(v) = \int _0^1 v(f(t))~\textrm{d}\nu (t) \qquad \text {with}\qquad \nu = c\cdot \lambda + \sum _{i\in \mathbb {N}} c_i\cdot \delta _{1/i} \end{aligned}$$
    (14)

    for all \(v\in \mathcal {V}\), where c, \(c_i\ge 0\) and \(c + \sum _{i\in \mathbb {N}} c_i = L(1)\), i.e., \(\nu \circ f^{-1}\) is a representing measure of L.

Proof

(ii)\(\rightarrow \)(i): Clear since \(\nu \circ f^{-1}\) is a representing measure of L.

(i)\(\rightarrow \)(ii): Let \(\mu \) be a representing measure of L. Then \((B,\mathfrak {B}(B),\mu )\) is by Example 2.17 a Lebesgue–Rohlin space and therefore by Theorem 2.18 isomorphic \(\textrm{mod}\,0\) to \(([0,1],\mathfrak {B}([0,1]),\nu )\) with \(\nu \) as in (14), i.e., there exist Borel sets \(M\in \mathfrak {B}(B)\) and \(N\in \mathfrak {B}([0,1])\) and a bijective and measurable function \(f:[0,1]{\setminus } N\rightarrow B{\setminus } M\) such that \(\nu = \mu \circ f\) and \(\mu (M)=\nu (N)=0\). Then by Lemma 2.1 for all \(v\in \mathcal {V}\) we have

$$\begin{aligned} L(v)&= \int _B v(x)~\textrm{d}\mu (x) = \int _{B\setminus M} v(f(f^{-1}(x)))~\textrm{d}\mu (x)\\ {}&= \int _{[0,1]\setminus N} v(f(t))~\textrm{d}(\mu \circ f)(t)=\int _0^1 v(f(t))~\textrm{d}\nu (t). \end{aligned}$$

\(\square \)
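
As a minimal worked instance of (14), with data chosen by us purely for illustration, let \(n=1\), \(B=\mathbb {R}\), \(\mathcal {V}=\mathbb {R}[x]\), and \(L(v) = \frac{1}{2} v(0) + \frac{1}{2}\int _1^2 v(x)~\textrm{d}x\), so that \(L(1)=1\). Take \(c = c_1 = \frac{1}{2}\) and \(c_i = 0\) for \(i\ge 2\), i.e., \(\nu = \frac{1}{2}\lambda + \frac{1}{2}\delta _1\), and define \(f(t) = 1+t\) for \(t\in [0,1)\) and \(f(1) = 0\). Then f is a bijection from [0, 1] onto \(\{0\}\cup [1,2)\), so \(N=\emptyset \) and \(M=\mathbb {R}{\setminus }(\{0\}\cup [1,2))\) is a null set of the representing measure, and

$$\begin{aligned} \int _0^1 v(f(t))~\textrm{d}\nu (t) = \frac{1}{2}\int _0^1 v(1+t)~\textrm{d}t + \frac{1}{2}\, v(f(1)) = \frac{1}{2}\int _1^2 v(x)~\textrm{d}x + \frac{1}{2}\, v(0) = L(v). \end{aligned}$$

The atom of the representing measure at \(x=0\) is carried by the point mass at \(t=1\), while the absolutely continuous part is carried by \(\lambda \).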

If we drop bijectivity almost everywhere for f then we get Theorem 1.4, i.e., in (14) we can choose \(c=L(1)\) and \(c_i=0\) for all \(i\in \mathbb {N}\).

In Theorem 1.4 and Theorem 4.10 we can only ensure that f is measurable, but not necessarily continuous or even a polynomial map. The reason is that we cannot control the support of a representing measure of L. In Theorem 4.2 we already showed that f can be chosen continuous and surjective, independent of L. But if we restrict the moment functionals resp. the support of a representing measure and choose f tailor-made for each K-moment functional, then f can be chosen to be continuous and surjective and the representing measure will be the Lebesgue measure \(\lambda \) on [0, 1].

Theorem 4.11

Let \(n\in \mathbb {N}\), \(K\subset \mathbb {R}^n\) be a compact and path-connected set, \(\mathcal {V}\) be a vector space of real functions on K, and \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a linear functional. Then the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional with representing measure \(\mu \) such that \(\textrm{supp}\,\mu = K\).

  2. (ii)

    There exists a continuous and surjective function \(f:[0,1]\rightarrow K\) such that

    $$\begin{aligned} L(v) = \int _0^1 v(f(t))~\textrm{d}\lambda (t) \end{aligned}$$

    for all \(v\in \mathcal {V}\) where \(\lambda \) is the Lebesgue measure on [0, 1], i.e.,

    $$\begin{aligned} L \quad \overset{f}{\leadsto }\quad L_{\text {Leb}}:\mathcal {L}^1([0,1],\lambda )\rightarrow \mathbb {R}. \end{aligned}$$

Proof

(i)\(\rightarrow \)(ii): Let \(L:\mathcal {V}\rightarrow \mathbb {R}\) be a K-moment functional and let \(\mu \) be its unique representing measure with \(\textrm{supp}\,\mu = K\). Since K is a compact and path-connected set, by Theorem 2.6 there exists a continuous and surjective function \(\tilde{f}:[0,1]\rightarrow K\). By Theorem 2.19 there exists a continuous and surjective function \(f:[0,1]\rightarrow K\) such that \(\mu = \lambda \circ f^{-1}\). For all \(v\in \mathcal {V}\) we get

$$\begin{aligned} L(v) = \int _K v(x)~\textrm{d}\mu (x) = \int _K v(x)~\textrm{d}(\lambda \circ f^{-1})(x) \overset{\text {Lemma 2.1}}{=} \int _0^1 v(f(t))~\textrm{d}\lambda (t). \end{aligned}$$
(15)

(ii)\(\rightarrow \)(i): By (15) \(\mu = \lambda \circ f^{-1}\) is a representing measure of L, i.e., L is a K-moment functional. To show that \(\textrm{supp}\,\mu = K\) holds, let \(U\subseteq K\) be non-empty and open in K. Since f is continuous and surjective, \(f^{-1}(U)\subseteq [0,1]\) is open and non-empty and therefore \(\mu (U) = \lambda (f^{-1}(U)) > 0\). \(\square \)
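
To make Theorem 4.11 concrete, here is a minimal one-dimensional sketch. The measure and the map are our own illustrative choice; we do not claim that this is the f produced by Theorem 2.19 in general. Let \(n=1\), \(K=[0,1]\), \(\mathcal {V}=\mathbb {R}[x]\), and \(\textrm{d}\mu (x) = 2x~\textrm{d}x\), so that \(\textrm{supp}\,\mu = K\). The map \(f:[0,1]\rightarrow [0,1]\), \(f(t)=\sqrt{t}\), is continuous and surjective, and the substitution \(t = x^2\) gives

$$\begin{aligned} \int _0^1 v(f(t))~\textrm{d}\lambda (t) = \int _0^1 v(\sqrt{t})~\textrm{d}t = \int _0^1 v(x)\cdot 2x~\textrm{d}x = L(v), \qquad \text {e.g.}\quad L(x^d) = \int _0^1 t^{d/2}~\textrm{d}t = \frac{2}{d+2}. \end{aligned}$$

In this example f is the inverse of the distribution function \(x\mapsto \mu ([0,x]) = x^2\).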

So far we transformed moment functionals to [0, 1]-moment functionals. We have seen that e.g. \(\mathbb {R}^n\)-moment functionals cannot be continuously transformed into [0, 1]-moment functionals. But we can transform \(\mathbb {R}^n\)-moment functionals continuously into \([0,\infty )\)-moment functionals. We need the following.

Lemma 4.12

Let \(n\in \mathbb {N}\) and \(\varepsilon >0\). Then there exists a continuous and surjective function \(f_\varepsilon :[0,\infty )\rightarrow \mathbb {R}^n\) with

$$\begin{aligned} t - \varepsilon \le \Vert f_\varepsilon (t)\Vert \le t + \varepsilon \end{aligned}$$

for all \(t\ge 0\) and there exists a measurable function \(g_\varepsilon :\mathbb {R}^n\rightarrow [0,\infty )\) such that

$$\begin{aligned} f_\varepsilon (g_\varepsilon (x))=x\qquad \text {and}\qquad \Vert x\Vert -\varepsilon \le g_\varepsilon (x)\le \Vert x\Vert +\varepsilon \end{aligned}$$

for all \(x\in \mathbb {R}^n\).

Proof

Set

$$\begin{aligned} A_m:= \{x\in \mathbb {R}^n \,|\, (m-1)\cdot \varepsilon \le \Vert x\Vert \le m\cdot \varepsilon \} \end{aligned}$$

for all \(m\in \mathbb {N}\). Then all \(A_m\)’s are compact and path-connected and by Theorem 2.6 there exist continuous and surjective functions \(f_{\varepsilon , m}:[(m-1)\cdot \varepsilon ,m\cdot \varepsilon ]\rightarrow A_m\) for all \(m\in \mathbb {N}\) such that \(f_{\varepsilon , m}(m\cdot \varepsilon ) = f_{\varepsilon ,m+1}(m\cdot \varepsilon )\), in particular \(\Vert f_{\varepsilon , m}(m\cdot \varepsilon )\Vert =\Vert f_{\varepsilon ,m+1}(m\cdot \varepsilon )\Vert =m\cdot \varepsilon \) for all \(m\in \mathbb {N}\). Since \(\mathbb {R}^n = \bigcup _{m\in \mathbb {N}} A_m\) we can define \(f_\varepsilon :[0,\infty )\rightarrow \mathbb {R}^n\) by \(f_\varepsilon |_{[(m-1)\cdot \varepsilon ,m\cdot \varepsilon ]}:= f_{\varepsilon ,m}\). Then for \(t\in [(m-1)\cdot \varepsilon ,m\cdot \varepsilon ]\) we have

$$\begin{aligned} t-\varepsilon \le (m-1)\cdot \varepsilon \le \Vert f_\varepsilon (t)\Vert = \Vert f_{\varepsilon ,m}(t)\Vert \le m\cdot \varepsilon \le t+\varepsilon . \end{aligned}$$
(*)

Since \(f_\varepsilon :[0,\infty )\rightarrow \mathbb {R}^n\) is surjective and \([0,\infty )\) and \(\mathbb {R}^n\) are Souslin sets by Lemma 2.7, by Theorem 2.12 there exists a measurable function \(g_\varepsilon :\mathbb {R}^n\rightarrow [0,\infty )\) with \(f_\varepsilon (g_\varepsilon (x))=x\) for all \(x\in \mathbb {R}^n\). (\(*\)) implies

$$\begin{aligned} g_\varepsilon (x)-\varepsilon \le \Vert x\Vert = \Vert f_\varepsilon (g_\varepsilon (x))\Vert \le g_\varepsilon (x) + \varepsilon \end{aligned}$$

and therefore \(\Vert x\Vert -\varepsilon \le g_\varepsilon (x) \le \Vert x\Vert + \varepsilon \) for all \(x\in \mathbb {R}^n\). \(\square \)

Similar to Theorem 4.2 we then get the continuous transformation into \([0,\infty )\)-moment functionals.

Theorem 4.13

Let \(n\in \mathbb {N}\), \(f:[0,\infty )\rightarrow \mathbb {R}^n\) be a continuous and surjective function, and \(\mathcal {V}\) be a vector space of measurable functions on \(\mathbb {R}^n\). Then for all linear functionals \(L:\mathcal {V}\rightarrow \mathbb {R}\) the following are equivalent:

  1. (i)

    \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a moment functional.

  2. (ii)

    \(\tilde{L}:\mathcal {V}\circ f\rightarrow \mathbb {R}\) defined by \(\tilde{L}(v\circ f):= L(v)\) is a \([0,\infty )\)-moment functional.

I.e., \(L\overset{{\textit{sc}}}{\leadsto }\tilde{L}\). If \(\tilde{\mu }\) is a representing measure of \(\tilde{L}\), then \(\tilde{\mu }\circ f^{-1}\) is a representing measure of L. There exists a measurable function \(g:\mathbb {R}^n\rightarrow [0,\infty )\) such that \(f(g(x))=x\) for all \(x\in \mathbb {R}^n\) and if \(\mu \) is a representing measure of L, then \(\mu \circ g^{-1}\) is a representing measure of \(\tilde{L}\).

Proof

Since \(\mathbb {R}^n\) and \([0,\infty )\) are Souslin sets and f is surjective, by Theorem 2.12 there exists a function \(g:\mathbb {R}^n\rightarrow [0,\infty )\) such that \(f(g(x))=x\) for all \(x\in \mathbb {R}^n\). It follows that \(\tilde{L}\) is well defined by \(\tilde{L}(v\circ f) = L(v)\).

(i)\(\rightarrow \)(ii): Let \(\mu \) be a representing measure of L, then

$$\begin{aligned} \tilde{L}(v\circ f)&= L(v) = \int _{\mathbb {R}^n} v(x)~\textrm{d}\mu (x) = \int _{\mathbb {R}^n} v(f(g(x)))~\textrm{d}\mu (x) \\&\overset{\text {Lemma 2.1}}{=} \int _0^\infty (v\circ f)(t)~\textrm{d}(\mu \circ g^{-1})(t), \end{aligned}$$

i.e., \(\mu \circ g^{-1}\) is a representing measure of \(\tilde{L}\).

(ii)\(\rightarrow \)(i): Let \(\tilde{\mu }\) be a representing measure of \(\tilde{L}\), then

$$\begin{aligned} L(v) = \tilde{L}(v\circ f) = \int _0^\infty (v\circ f)(t)~\textrm{d}\tilde{\mu }(t) \overset{\text {Lemma 2.1}}{=} \int _{\mathbb {R}^n} v(x)~\textrm{d}(\tilde{\mu }\circ f^{-1})(x), \end{aligned}$$

i.e., \(\tilde{\mu }\circ f^{-1}\) is a representing measure of L. \(\square \)
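
A small example, with f and L chosen by us for illustration only, shows how Theorem 4.13 acts on a point evaluation. Let \(n=1\) and \(f(t) = t\cdot \sin t\). This f is continuous, and it is surjective onto \(\mathbb {R}\) by the intermediate value theorem, since \(f(\frac{\pi }{2}+2\pi j) = \frac{\pi }{2}+2\pi j\) and \(f(\frac{3\pi }{2}+2\pi j) = -\frac{3\pi }{2}-2\pi j\) for all \(j\in \mathbb {N}_0\). For \(L(v):= v(0)\), i.e., \(\mu = \delta _0\), we have \(f(0) = f(\pi ) = 0\) and hence

$$\begin{aligned} \tilde{L}(v\circ f) = L(v) = v(0) = \int _0^\infty (v\circ f)(t)~\textrm{d}\delta _0(t) = \int _0^\infty (v\circ f)(t)~\textrm{d}\delta _\pi (t). \end{aligned}$$

Both \(\delta _0\) and \(\delta _\pi \) are therefore representing measures of \(\tilde{L}\), and both are mapped back to \(\delta _0\circ f^{-1} = \delta _\pi \circ f^{-1} = \delta _{0}\). So a representing measure of \(\tilde{L}\) need not be unique even when L has only one.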

Remark 4.14

Similar to Theorem 4.4 we get that for any \(\varepsilon >0\) and \(g_\varepsilon \) from Lemma 4.12

  1. (i)

    \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) is a moment functional

implies that

  1. (ii)

    \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) continuously extends to \(\overline{L}:\mathbb {R}[x_1,\ldots ,x_n,g]\rightarrow \mathbb {R}\) such that \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) defined by \(\tilde{L}(t^d):= \overline{L}(g^d)\) is a \([0,\infty )\)-moment functional, i.e.,

This follows easily from the fact that \(0\le g_\varepsilon (x) \le \Vert x\Vert + \varepsilon \le \Vert x\Vert ^2 + 1 + \varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\). However, it is open whether the strong direction (ii)\(\rightarrow \)(i) as in Theorem 4.4 holds in general. In Theorem 4.4 compactness of K implied that \(\mathbb {R}[x_1,\ldots ,x_n]\) is dense in \(C(K,\mathbb {R})\) and hence f could be approximated and the representing measure of L is unique. On \(\mathbb {R}^n\) neither holds and hence (ii)\(\rightarrow \)(i) can so far not be ensured in the same fashion as in Theorem 4.4.\(\circ \)

At the end of this section we want to discuss two things that can easily be missed. The first is a crucial technical remark and the second is a historical one.

For most transformations \(\leadsto \) we required that \(f:\mathcal {X}\rightarrow \mathcal {Y}\) is surjective in order to apply Theorem 2.12 and obtain a right inverse \(g:\mathcal {Y}\rightarrow \mathcal {X}\), i.e., \(f(g(y))=y\) for all \(y\in \mathcal {Y}\). E.g. in Theorem 4.4 we used this g directly to embed a [0, 1]-moment functional into an extension \(\overline{L}\) of L. However, for any \(f:\mathcal {X}\rightarrow \mathcal {Y}\) of course \(f:\mathcal {X}\rightarrow f(\mathcal {X})\) is surjective. If f is continuous and \(\mathcal {X}\) Borel, then \(f(\mathcal {X})\) remains even a Borel set. Otherwise \(f(\mathcal {X})\) is at least a Souslin set.

To demonstrate that \(f:\mathcal {X}\rightarrow \mathcal {Y}\) needs to be surjective and that the restriction \(f:\mathcal {X}\rightarrow f(\mathcal {X})\) cannot be used, let \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) be a linear functional such that \(L(p^2)\ge 0\) for all \(p\in \mathbb {R}[x_1,\ldots ,x_n]\). Let \(f\in \mathbb {R}[x_1,\ldots ,x_n]\), then define \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) by \(\tilde{L}(t^d):= L(f^d)\) for all \(d\in \mathbb {N}_0\). We have \(\tilde{L}(p^2) = L((p\circ f)^2) \ge 0\) for all \(p\in \mathbb {R}[t]\), i.e., \(\tilde{L}\) is a Hamburger moment functional and there exists a measure \(\nu \) on \(\mathbb {R}\) such that

$$\begin{aligned} \tilde{L}(p) = \int _\mathbb {R}p(t)~\textrm{d}\nu (t) \qquad \text {for all}\ p\in \mathbb {R}[t], \end{aligned}$$

i.e.,

$$\begin{aligned} L(f^d) = \tilde{L}(t^d) = \int _\mathbb {R}t^d~\textrm{d}\nu (t) \qquad \text {for all}\ d\in \mathbb {N}_0. \end{aligned}$$
(16)

The important point is that (16) does not imply that there exists a \(\mu \) such that \(L(f^d) = \int _{\mathbb {R}^n} f^d(x)~\textrm{d}\mu (x)\) for all \(d\in \mathbb {N}_0\). Theorem 2.12 incorrectly applied in (X) would suggest that there is a g such that \(f(g(t))=t\), i.e.,

$$\begin{aligned} \int _\mathbb {R}t^d~\textrm{d}\nu (t) \overset{\text {(X)}}{=} \int _\mathbb {R}f(g(t))^d~\textrm{d}\nu (t) = \int _{\mathbb {R}^n} f(x)^d~\textrm{d}(\nu \circ g^{-1})(x) \end{aligned}$$

and hence \(\nu \circ g^{-1}\) is a representing measure for \(L(f^d)\). Therefore (X) would imply \(L(f)\ge 0\) for all \(f\in \mathbb {R}[x_1,\ldots ,x_n]\) with \(f\ge 0\) since \(\nu \circ g^{-1}\) is non-negative. Haviland’s Theorem then shows that L is a moment functional. But for L we only assumed \(L(p^2)\ge 0\) for all \(p\in \mathbb {R}[x_1,\ldots ,x_n]\), and for \(n\ge 2\) there are such functionals which are not moment functionals [3, 11, 35]. This is the contradiction. To apply Theorem 2.12 we have to ensure that \(\textrm{supp}\,\nu \subseteq f(\mathbb {R}^n)\) holds.

For the historical remark, in this study we frequently encountered the case where a linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) (or its transformation) lives on measurable functions \(\mathcal {V}\), i.e., we apparently face the problem that our functions \(v\in \mathcal {V}\) live on a measurable space \((\mathcal {X},\mathcal {A})\). But a main tool in the moment problem is the Riesz (Riesz–Markov–Kakutani) Theorem and it works with (compactly supported) continuous functions on locally compact Hausdorff spaces. While the linear functional is extended to compactly supported continuous functions via e.g. the Hahn–Banach Theorem, changing or extending a measurable space \((\mathcal {X},\mathcal {A})\) to a topological space, especially to a locally compact Hausdorff space, is in general not possible. Another important case where we rather work on a measurable space than a locally compact Hausdorff space is the Richter Theorem.

Richter’s Theorem 4.15

(see [27, Satz 4]) Let \(\mathcal {V}\) be a finite-dimensional vector space of measurable functions on a measurable space \((\mathcal {X},\mathcal {A})\). Then every moment functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) has a finitely atomic representing measure

$$\begin{aligned} \sum _{i=1}^{k} c_i\cdot \delta _{x_i} \end{aligned}$$

with \(c_i > 0\), \(x_i\in \mathcal {X}\), and \(k \le \dim \mathcal {V}\).
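
As a small worked instance of Theorem 4.15 (the space, the functional, and the atoms are our own illustrative choice and are not taken from [27]), let \(\mathcal {X}=[0,1]\), \(\mathcal {V}=\textrm{span}\{1,x,x^2\}\), and \(L(v)=\int _0^1 v(x)~\textrm{d}x\). With the two-point Gauss rule on [0, 1], i.e., \(c_1=c_2=\frac{1}{2}\) and \(x_{1,2} = \frac{1}{2}\mp \frac{1}{2\sqrt{3}}\), we get

$$\begin{aligned} c_1 + c_2&= 1 = L(1),\\ c_1 x_1 + c_2 x_2&= \tfrac{1}{2} = L(x),\\ c_1 x_1^2 + c_2 x_2^2&= \tfrac{1}{4} + \tfrac{1}{12} = \tfrac{1}{3} = L(x^2). \end{aligned}$$

Hence \(\frac{1}{2}\delta _{x_1}+\frac{1}{2}\delta _{x_2}\) is a finitely atomic representing measure of L with \(k=2\le 3=\dim \mathcal {V}\) atoms.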

Many other names are connected to this result, see e.g. [8] for a historical overview. While Richter was the first to prove this result in full generality, taking the broader historical development into account it might even be justified to call it the Richter–Rogosinski–Rosenbloom Theorem [27, 29, 32].

The question when a linear functional acting on measurable functions is represented by a measure was already fully answered by P. J. Daniell in 1918 [6]. We need the following to state his theorem.

Definition 4.16

Let \(\mathcal {X}\) be a space. We call a set \(\mathcal {F}\) of functions \(f:\mathcal {X}\rightarrow \mathbb {R}\) a lattice (of functions) if the following holds:

  1. (i)

    \(c\cdot f\in \mathcal {F}\) for all \(c\ge 0\) and \(f\in \mathcal {F}\),

  2. (ii)

    \(f+g\in \mathcal {F}\) for all \(f,g\in \mathcal {F}\),

  3. (iii)

    \(\inf (f,g)\in \mathcal {F}\) for all \(f,g\in \mathcal {F}\),

  4. (iv)

    \(\inf (f,c)\in \mathcal {F}\) for all \(c\ge 0\) and \(f\in \mathcal {F}\), and

  5. (v)

    \(g-f\in \mathcal {F}\) for all \(f,g\in \mathcal {F}\) with \(f\le g\).

Some authors require that a lattice of functions is a vector space. But for proving Theorem 4.17 it is only necessary that a lattice is a cone.
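
For illustration (these two examples are ours and are not taken from [6]): the non-negative continuous functions \(\mathcal {F}= C([0,1],[0,\infty ))\) satisfy (i)–(v) of Definition 4.16 but do not form a vector space. The polynomials \(\mathbb {R}[x]\) on \(\mathcal {X}=[0,1]\), on the other hand, violate (iv), since

$$\begin{aligned} \inf (x,\tfrac{1}{2}) = {\left\{ \begin{array}{ll} x &{} \text {for}\ x\in [0,\tfrac{1}{2}],\\ \tfrac{1}{2} &{} \text {for}\ x\in [\tfrac{1}{2},1] \end{array}\right. } \end{aligned}$$

is not a polynomial: a polynomial which coincides with x on \([0,\frac{1}{2}]\) is equal to x everywhere. So polynomials do not form a lattice of functions in the sense of Definition 4.16.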

Daniell’s Representation Theorem 4.17

(Daniell 1918 [6]) Let \(\mathcal {F}\) be a lattice of functions on a space \(\mathcal {X}\) and let \(L:\mathcal {F}\rightarrow \mathbb {R}\) be such that

  1. (i)

    \(L(f+g) = L(f) + L(g)\) for all \(f,g\in \mathcal {F}\),

  2. (ii)

    \(L(c\cdot f) = c\cdot L(f)\) for all \(c\ge 0\) and \(f\in \mathcal {F}\),

  3. (iii)

    \(L(f) \le L(g)\) for all \(f,g\in \mathcal {F}\) with \(f\le g\),

  4. (iv)

    \(L(f_n)\nearrow L(g)\) as \(n\rightarrow \infty \) for all \(g\in \mathcal {F}\) and \(f_n\in \mathcal {F}\) with \(f_n\nearrow g\).

Then there exists a measure \(\mu \) on \((\mathcal {X},\mathcal {A})\) with

$$\begin{aligned} \mathcal {A}:= \sigma (\{f^{-1}((-\infty ,a]) \,|\, a\in \mathbb {R},\ f\in \mathcal {F}\}) \end{aligned}$$

such that

$$\begin{aligned} L(f) = \int _\mathcal {X}f(x)~\textrm{d}\mu (x) \end{aligned}$$

for all \(f\in \mathcal {F}\).

The most impressive part is that the functional \(L:\mathcal {F}\rightarrow \mathbb {R}\) lives only on a lattice \(\mathcal {F}\) of functions \(f:\mathcal {X}\rightarrow \mathbb {R}\) where \(\mathcal {X}\) is a set without any structure. Theorem 4.17 provides a representing measure \(\mu \) including the \(\sigma \)-algebra \(\mathcal {A}\) of the measurable space \((\mathcal {X},\mathcal {A})\).

The Riesz Representation Theorem follows directly from Theorem 4.17. \(C_0(\mathcal {X},\mathbb {R})\), \(\mathcal {X}\) a locally compact Hausdorff space, is a lattice of functions, (i) and (ii) are the linearity of L, (iii) is the non-negativity of L, and the continuity condition (iv) of L follows easily from uniform convergence in \(C_0(\mathcal {X},\mathbb {R})\).

5 Conclusion

We end with some conclusions and some open questions which appeared during our investigation.

In Sect. 3 we established basic properties of the transformation \(\leadsto \) of linear functionals. In particular, we showed in Theorem 3.3 that a strong transformation \(L\overset{{\textit{s}}}{\leadsto }K\) implies that L is a moment functional if and only if K is a moment functional. In Lemma 3.2 we have seen that \(L\overset{{\textit{s}}}{\leadsto }K\) implies the weaker statements \(L\leadsto K\) and \(K\leadsto L\). So it is natural to ask if the reverse holds.

Open Problem 5.1

Do \(L\leadsto K\) and \(K\leadsto L\) imply \(L\overset{{\textit{s}}}{\leadsto }K\)?

Note that this problem has the same structure as the Cantor–Bernstein Theorem from set theory (i.e., \(|M|\le |N|\) and \(|N|\le |M|\) imply \(|M|=|N|\)).

Additionally, can the requirement of a strong transformation be weakened? While we have seen that surjectivity of \(f:\mathcal {X}\rightarrow \mathcal {Y}\) is necessary and can in general not be omitted, it should be possible to weaken the condition that \(\mathcal {V}\circ f = \mathcal {U}\) from \(L:\mathcal {V}\rightarrow \mathbb {R}\) and \(K:\mathcal {U}\rightarrow \mathbb {R}\). It is in fact only necessary that \(\mathcal {V}\) and \(\mathcal {U}\) (and therefore L and K) can be extended to some \(\overline{\mathcal {V}}\supseteq \mathcal {V}\) and \(\overline{\mathcal {U}}\supseteq \mathcal {U}\) such that \(\overline{\mathcal {V}}\circ f = \overline{\mathcal {U}}\).

In Proposition 4.1 we have seen that for a moment functional L with an atomless representing measure there exists an integrable function f such that L extends to \(\overline{L}:\mathcal {V}+\mathbb {R}[f]\rightarrow \mathbb {R}\) with \(\overline{L}|_{\mathbb {R}[f]} = L_{\text {Leb}}\), i.e., \(\overline{L}(f^d) = \frac{L(1)}{d+1}\) for all \(d\in \mathbb {N}_0\). Because of the simplicity of \(L_{\text {Leb}}\) in Example 1.1, are there other “directions”, i.e., f’s, with similar properties?
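
To illustrate the identity \(\overline{L}(f^d) = \frac{L(1)}{d+1}\), consider the following sketch. The measure and the function are chosen by us, and we do not claim that this is the f constructed in the proof of Proposition 4.1. Let \(\textrm{d}\mu (x) = e^{-x}~\textrm{d}x\) on \([0,\infty )\), so that \(L(1)=1\), and let \(f(x) = 1-e^{-x}\). The substitution \(t = 1-e^{-x}\) gives

$$\begin{aligned} \overline{L}(f^d) = \int _0^\infty (1-e^{-x})^d\cdot e^{-x}~\textrm{d}x = \int _0^1 t^d~\textrm{d}t = \frac{1}{d+1} \qquad \text {for all}\ d\in \mathbb {N}_0. \end{aligned}$$

In this example \(1-f = e^{-x}\) has the same moments, since \(\int _0^\infty e^{-dx}\cdot e^{-x}~\textrm{d}x = \frac{1}{d+1}\), so such a “direction” is in general not unique.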

Open Problem 5.2

Are there other “directions” f with \(\overline{L}(f^d) = \frac{L(1)}{d+1}\) or a similar behavior?

The importance of this question is again revealed in Theorem 4.4 where we have a similar structure in (6):

There exists a function \(g:K\rightarrow [0,1]\) such that: A linear functional \(L:\mathcal {V}\rightarrow \mathbb {R}\) is a K-moment functional if and only if it continuously extends to some \(\overline{L}:\mathcal {V}+\mathbb {R}[g]\rightarrow \mathbb {R}\) and \(\tilde{L}:\mathbb {R}[t]\rightarrow \mathbb {R}\) defined by \(\tilde{L}(t^d):=\overline{L}(g^d)\) for all \(d\in \mathbb {N}_0\) is a [0, 1]-moment functional.

At this point the reader shall be reminded of the following functional analytic fact. Let \(L:\mathbb {R}[x_1,\ldots ,x_n]\rightarrow \mathbb {R}\) be a linear functional with \(L(p^2)\ge 0\) for all \(p\in \mathbb {R}[x_1,\ldots ,x_n]\). \((\mathbb {C}[x_1,\ldots ,x_n],\langle \,\cdot ,\,\cdot \,\rangle )\) with \(\langle p,q\rangle := L(p\cdot \overline{q})\) is a pre-Hilbert space via complexification of L by linearity (and removing the possible kernel of L), and for all \(i=1,\ldots ,n\) the multiplication operators \(X_i\) are defined by \((X_ip)(x_1,\ldots ,x_n):= x_i\cdot p(x_1,\ldots ,x_n)\) for all \(p\in \mathbb {C}[x_1,\ldots ,x_n]\). \((X_1,\ldots ,X_n)\) is a tuple of commuting symmetric operators on \((\mathbb {C}[x_1,\ldots ,x_n],\langle \,\cdot ,\,\cdot \,\rangle )\). Then L is a moment functional if and only if \((X_1,\ldots ,X_n)\) extends to a tuple \((\overline{X_1},\ldots ,\overline{X_n})\) of commuting self-adjoint operators on some Hilbert space \(\mathcal {H}\supset (\mathbb {C}[x_1,\ldots ,x_n],\langle \,\cdot \,,\,\cdot \,\rangle )\).

But extending L to \(\mathbb {R}[x_1,\ldots ,x_n,g]\supseteq \mathbb {R}[x_1,\ldots ,x_n]+\mathbb {R}[g]\) gives

By Theorem 4.4 it is sufficient to ensure that the multiplication operator G on \(\mathbb {C}[x_1,\ldots ,x_n,g]\), i.e., \((Gp)(x):= g(x)\cdot p(x)\), has a self-adjoint extension. So the tuple \((X_1,\ldots ,X_n)\) is replaced by G and the open question is loosely the following:

Open Problem 5.3

What is the functional analysis behind the g in Theorem 4.4?

Note that in the setting of Theorem 4.4 the multiplication operators are bounded since K is compact. In the setup of \(K=\mathbb {R}^n\), see Remark 4.14, we have in general unbounded operators and only the easy direction (i)\(\rightarrow \)(ii) was shown. It is open if (ii)\(\rightarrow \)(i) also holds in the unbounded case.

Open Problem 5.4

Does (ii)\(\rightarrow \)(i) in Remark 4.14 hold in general or is there a counterexample?

In Theorem 4.7 we have seen that this g in Theorem 4.4 can be approximated by polynomials \(g_\varepsilon \in \mathbb {R}[x_1,\ldots ,x_n]\). So a natural question (especially in applications) is to ask the following:

Open Problem 5.5

How does \(\deg g_\varepsilon \) in Theorem 4.7 grow as \(\varepsilon \rightarrow 0\)?

That g in Theorem 4.4 is only a measurable function but not a polynomial, even for \(\mathcal {V}= \mathbb {R}[x_1,\ldots ,x_n]\), is a consequence of the reduction of the dimension and Sard’s Theorem [34]. We reduce the dimension of K, in general \(\dim K\ge 2\), to 1, i.e., the dimension of [0, 1]. However, a transformation \(\overset{f}{\leadsto }\) does not necessarily need to reduce the dimension of K.

To remain in the algebraic setup we have to investigate transformations \(\overset{f}{\leadsto }\) of linear functionals on \(\mathbb {R}[x_1,\ldots ,x_n]\) where f is a (bi)rational or polynomial function. By Haviland’s Theorem, for closed K a linear functional L is a K-moment functional if and only if \(L(p)\ge 0\) for all \(p\in \mathbb {R}[x_1,\ldots ,x_n]\) with \(p\ge 0\) on K, i.e., it has long been known that moment functionals are closely related to the description of non-negative polynomials. These transformations of moment functionals with (bi)rational or polynomial functions might therefore give deeper insight into non-negative polynomials.

Open Problem 5.6

Do transformations \(\overset{f}{\leadsto }\) of moment functionals with polynomial or (bi)rational f give deeper insight into/characterizations of non-negative polynomials?