1 Introduction and main result

1.1 The Newtonian potential

Let us place the result of this paper into context through an example. Consider the Newtonian potential, the Green’s function of the Laplace operator on \(\mathbb{R }^d\) given by

$$\begin{aligned} \varPhi (x) = C_{d} {\left\{ \begin{array}{ll} |x|^{-(d-2)}&(d \ge 3)\\ \log 1/|x|&(d=2) \end{array}\right.} \quad \text{ for} \text{ all} \,x \in {\mathbb{R }}^d, x \ne 0. \end{aligned}$$
(1.1)

For \(d\ge 3\) and any measurable function \(\varphi : {\mathbb{R }} \rightarrow {\mathbb{R }}\) such that \(t^{d-3}\varphi (t)\) is integrable, the Newtonian potential can be written, up to a constant, as

$$\begin{aligned} |x|^{-(d-2)} = \int _{0}^{\infty } {t}^{-(d-2)} \, \varphi (|x|/t) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all}\, x \in {\mathbb{R }}^d, x\ne 0. \end{aligned}$$
(1.2)

This is true because both sides are radially symmetric and homogeneous of degree \(-(d-2)\), where homogeneity of the right-hand side simply follows from the change of variables formula. In particular, \(\varphi \) can be chosen smooth with compact support and such that \(\varphi (|x|)\) is a positive semi-definite function on \({\mathbb{R }}^d\). The last condition means that \(\varphi (|x|)\) is positive as a quadratic form: for any \(f \in C_{c}^\infty ({\mathbb{R }}^d)\), that is, \(f: {\mathbb{R }}^d \rightarrow {\mathbb{R }}\) smooth with compact support,

$$\begin{aligned} \varPhi _{t}(f,f) := \int _{{\mathbb{R }}^d \times {\mathbb{R }}^d} \varphi (|x-y|/t) f(x)f(y) \; \mathrm{d}x \, \mathrm{d}y \ge 0. \end{aligned}$$
(1.3)
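
As a worked check of the homogeneity argument, the constant in (1.2) can be made explicit. Substituting \(s = |x|/t\), so that \(t^{-(d-2)} = |x|^{-(d-2)} s^{d-2}\) and \(\mathrm{d}t/t\) becomes \(\mathrm{d}s/s\) on \((0,\infty )\), gives

$$\begin{aligned} \int _{0}^{\infty } t^{-(d-2)} \, \varphi (|x|/t) \; \frac{\mathrm{d}t}{t} = |x|^{-(d-2)} \int _{0}^{\infty } s^{d-3} \varphi (s) \; \mathrm{d}s . \end{aligned}$$

This is why integrability of \(t^{d-3}\varphi (t)\) is the natural assumption, and (1.2) holds exactly once \(\varphi \) is normalized so that the last integral equals 1 (the integral is assumed here to be non-zero).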

Similarly, if \(d=2\), and \(\varphi : {\mathbb{R }} \rightarrow {\mathbb{R }}\) is any absolutely continuous function with \(\varphi (0) =1\), with \(\varphi (t) \rightarrow 0\) as \(t \rightarrow \infty \), and such that \(\varphi ^{\prime }(t)\) is integrable, then

$$\begin{aligned} \log 1/|x| = \int _{0}^{\infty } (\varphi (|x|/t) - \varphi (1/t)) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all} \, x \in {\mathbb{R }}^2, x \ne 0 . \end{aligned}$$
(1.4)

Indeed, for \(x\ne 0\),

$$\begin{aligned} \log 1/|x| = \varphi (0) \log 1/|x| = -\int _{0}^{\infty } \varphi ^{\prime }(s) \log 1/|x| \; \mathrm{d}s = \int _{0}^\infty \varphi ^{\prime }(s) \int _{s/|x|}^s \frac{\mathrm{d}t}{t} \; \mathrm{d}s ,\qquad \end{aligned}$$
(1.5)

and thus, since \(\varphi ^{\prime }\) is integrable, by Fubini’s theorem,

$$\begin{aligned} \log 1/|x| = \int _{0}^\infty \int _{t}^{t|x|} \varphi ^{\prime }(s) \; \mathrm{d}s \; \frac{\mathrm{d}t}{t} = \int _{0}^{\infty } (\varphi (t|x|)-\varphi (t)) \; \frac{\mathrm{d}t}{t} , \end{aligned}$$
(1.6)

showing (1.4) after the change of variables \(t \mapsto 1/t\). Now suppose again that \(\varphi \) is chosen such that \(\varphi (|x|)\) is a positive semi-definite function on \({\mathbb{R }}^2\). Then the function \({\mathbb{R }}^2 \ni x \mapsto \varphi (|x|/t)-\varphi (1/t)\) is positive as a quadratic form on the domain of smooth and compactly supported functions with vanishing integral:

$$\begin{aligned} \varPhi _{t}(f,f)&:= \int _{{{\mathbb{R }}}^2\times {\mathbb{R }}^2} (\varphi (|x-y|/t)-\varphi (1/t)) f(x)f(y) \; \mathrm{d}x \, \mathrm{d}y \nonumber \\&= \int _{{\mathbb{R }}^2 \times {\mathbb{R }}^2} \varphi (|x-y|/t) \; f(x)f(y) \; \mathrm{d}x \, \mathrm{d}y \ge 0 \end{aligned}$$
(1.7)

for all \(f \in C_{c}^{\infty }({\mathbb{R }}^2)\) with \(\int f \; \mathrm{d}x = 0\).

The above shows that the Newtonian potentials (1.1) admit decompositions into integrals of compactly supported and positive semi-definite functions, with the appropriate restriction of the domain for \(d=2\).
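
The identities (1.2) and (1.4) can also be checked numerically. The following is a minimal sketch under stated assumptions, not part of the construction: it assumes the tent profile \(\varphi (s) = \max (0, 1-s)\), chosen only because the integrals are elementary (it is used here solely to test the identities, and no claim is made about positive definiteness of \(\varphi (|x|)\) on \({\mathbb{R }}^d\)). The scale integrals are evaluated by a midpoint rule in the variable \(u = \log t\). For (1.2) the comparison value is \(c\,|x|^{-(d-2)}\) with \(c = \int _0^\infty s^{d-3}\varphi (s)\,\mathrm{d}s\), which equals \(1/2\) for the tent profile when \(d=3\). All function names are hypothetical.

```python
import math


def tent(s):
    """Tent profile with phi(0) = 1 and support [0, 1] (illustrative choice)."""
    return max(0.0, 1.0 - s)


def log_scale_integral(g, u_min=-40.0, u_max=40.0, n=200_000):
    """Midpoint rule for int_0^infty g(t) dt/t after substituting t = exp(u)."""
    h = (u_max - u_min) / n
    return h * sum(g(math.exp(u_min + (k + 0.5) * h)) for k in range(n))


def rhs_1_2(r, d=3, phi=tent):
    """Right-hand side of (1.2) evaluated at |x| = r."""
    return log_scale_integral(lambda t: t ** (-(d - 2)) * phi(r / t))


def rhs_1_4(r, phi=tent):
    """Right-hand side of (1.4) evaluated at |x| = r (case d = 2)."""
    return log_scale_integral(lambda t: phi(r / t) - phi(1.0 / t))
```

With these definitions, `rhs_1_2(r)` should agree with `0.5 / r` and `rhs_1_4(r)` with `math.log(1 / r)` up to quadrature error, for \(r\) both below and above 1.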

Let us only remark at this point that the positivity of a quadratic form has the important implication that it entails the existence of a corresponding Gaussian process, discussed briefly in Sect. 1.4. It is however also of interest in mathematical physics for different reasons [22].

1.2 Finite range decompositions of quadratic forms

It is an open problem to characterize the class of positive quadratic forms, \(S: D(S) \times D(S) \rightarrow {\mathbb{R }}\), that admit decompositions into integrals (or sums) of positive quadratic forms of finite range: for all \(f,g \in D(S), t>{0}\),

$$\begin{aligned}&\left\{ \begin{array}{l} S(f,g) = \int _{0}^{\infty } S_{t}(f,g) \; \frac{\mathrm{d}t}{t},\\ S_{t}: D(S) \times D(S) \rightarrow {\mathbb{R }},\\ S_{t}(f,f) \ge 0,\\ S_{t}(f,g) = 0 \; \text{ if}\; {d}(\text{ supp}(f),\text{ supp}(g)) > \theta (t), \end{array}\right. \end{aligned}$$
(1.8)

where \(\theta : (0,\infty ) \rightarrow (0,\infty )\) is increasing and \(d\) is a distance function. The condition of finite range, the last condition in (1.8), generalizes the property of compact support of the function \(\varphi \) in (1.3) to quadratic forms that are not defined by a convolution kernel. The difficulty in decomposing quadratic forms in such a way is to achieve the two conditions of positivity and finite range simultaneously. Note that by splitting up the integral, one can obtain a decomposition into a sum from (1.8), and conversely, a decomposition into a sum can be written as an integral (without regularity in \(t\)).

For applications, not only the existence, but also the regularity of the decomposition (1.8) is important. Let \((X,\mu )\) be a metric measure space, i.e., a locally compact complete separable metric space \(X\) with a Radon measure \(\mu \) on \(X\) with full support (i.e., \(\mu \) is strictly positive), \(C_{c}(X)\) the space of continuous functions on \(X\) with compact support, and \(C_{b}(X)\) the space of bounded and continuous functions on \(X\). Let us say that the decomposition (1.8) is regular if \(C_{c}(X) \cap D(S)\) is \(S\)-dense in \(D(S)\) and if every \(S_{t}\) has a bounded continuous kernel \(s_{t} \in C_{b}(X \times X)\):

$$\begin{aligned} S_{t}(f,g) = \int s_{t}(x,y) f(x)g(y) \; \text{ d}\mu (x) \, \text{ d}\mu (y) \quad \text{ for} \text{ all} \quad f,g \in C_{c}(X) \cap D(S).\nonumber \\ \end{aligned}$$
(1.9)

For the decompositions (1.2), (1.4), the kernels are of course given in terms of the smooth function \(\varphi \) by the explicit formula

$$\begin{aligned} \phi _{t}(x,y) = t^{-(d-2)} \varphi (|x-y|/t) \quad \text{ for} \text{ all}\, x,y \in {\mathbb{R }}^d, t>0. \end{aligned}$$
(1.10)

Note that for \(d=2\) the second term in (1.4) could be omitted by (1.7), with the understanding that the quadratic form is restricted to functions with vanishing integral. It follows in particular that

$$\begin{aligned} |\phi _{t}(x,y)| \le C t^{-(d-2)} \quad \text{ uniformly} \text{ in} \text{ all}\; x,y \in {\mathbb{R }}^d. \end{aligned}$$
(1.11)

This reflects the decay of the Newtonian potential. Moreover, for all integers \(l_{x}, l_{y} \ge 0\), the derivatives of the kernel \(\phi _{t}\) decay according to

$$\begin{aligned} |D_{x}^{l_{x}} D_{y}^{{l_{y}}} \phi _{t}(x,y)| \le C_{l} t^{-(d-2)} t^{-l_{x}-l_{y}}, \end{aligned}$$
(1.12)

reflecting that \(|D^l\varPhi (x)| \le C_{l} |x|^{-(d-2)-l}\) for all \(x \in {\mathbb{R }}^d, x\ne 0\).

The main result of this paper is a rather simple construction of decompositions (1.8) with estimates like (1.11) for quadratic forms that arise by duality with Dirichlet forms in a large class. Let us call such forms Green forms, motivated by the Newtonian potential, or Green’s function, which is a special case. This is explained in Sect. 1.3.

The main idea of our method is that (1.8) can be achieved by applying formulae like (1.2) to the spectral representation of the Green form, and then exploiting finite propagation speed properties of appropriate wave flows. These are generalizations of the fact that if \(u(t,x)\) is a solution to

$$\begin{aligned} \partial _{t}^2 u-\Delta u = 0, \quad u(0,x) = u_{0}(x), \; \partial _{t} u(0,x) = 0 \end{aligned}$$
(1.13)

with compactly supported initial data \(u_{0}\) then

$$\begin{aligned} \text{ supp}(u(t, \cdot )) \subseteq N_{t}(\text{ supp}(u_0)) \end{aligned}$$
(1.14)

where \(N_t(U) = \{x \in X: d(x,U) \le t \}\) for any \(U \subset X\).

The idea of exploiting properties of the wave equation in the context of probability theory is not new. For example, Varopoulos [34] has used the finite propagation speed of the wave equation to obtain Gaussian bounds on the heat kernel of Markov chains, by decomposing it into compactly supported pieces. Our objective is slightly different in that we are interested in the constraint of positive definite decompositions.

Decompositions of singular functions into sums or integrals of smooth and compactly supported functions have a history in analysis. For example, Fefferman’s celebrated proof of pointwise almost everywhere convergence of the Fourier series [17] uses a decomposition of \(1/x\) on \({\mathbb{R }}\) like (1.2), albeit without using positive semi-definiteness. Hainzl and Seiringer [22], motivated by applications to quantum mechanics such as Ref. [18], decompose general radially symmetric functions, without assuming a priori that they are positive definite, into weighted integrals over tent functions. These, like \(\varphi (|x|)\) in (1.2), are positive semi-definite. They state sufficient conditions for the weight to be non-negative, and thus obtain decompositions like (1.2) for a class of radially symmetric potentials including \(e^{-m|x|}/|x|\) on \({\mathbb{R }}^3\). Special cases and similar results have also appeared in earlier works of Pólya [27] and of Gneiting [20, 21].

These results, like (1.2), make essential use of radial symmetry. One example of particular interest for probability theory—where radial symmetry is not given—is the Green’s function of the discrete Laplace operator:

$$\begin{aligned} \Delta _{\mathbb{Z }^d}u(x) = \sum _{e \in \mathbb Z ^d: |e|_1=1} (u(x+e)-u(x)) \quad \text{ for} \text{ any} \, u: \mathbb Z ^d \rightarrow {\mathbb{R }}, x \in \mathbb Z ^d.\qquad \end{aligned}$$
(1.15)

Brydges et al. [6] showed that also in this discrete case, the corresponding Green’s function, or more generally the resolvent, admits a decomposition like (1.8) into a sum (instead of an integral) of positive semi-definite lattice functions with estimates analogous to (1.12). Brydges and Talarczyk [10] gave a related construction which applies to quite general elliptic operators on domains in \({\mathbb{R }}^d\), but estimates on the kernels of this decomposition are only known when the coefficients are constant. Their construction was adapted by Adams et al. [1] to show that the Green’s functions of constant coefficient discrete elliptic systems on \(\mathbb Z ^d\) admit decompositions with estimates analogous to (1.12) and that the decomposition obtained this way is analytic as a function of the (constant) coefficients. These results are all based on constructions that average Poisson kernels.

Our method, as briefly sketched earlier, is different from that of [1, 6, 8, 10] and yields simpler proofs of their results about constant coefficient elliptic operators—both in discrete and continuous context. It furthermore naturally yields a decomposition into an integral instead of a sum (with integrand smooth in \(t\)), and gives effective estimates for decompositions of Green’s functions of variable coefficient operators.

1.3 Duality and spectral representation of the Green form

Let us now introduce the general set-up in which our result is framed more precisely. For motivation, we first return to the quadratic forms defined by the Newtonian potentials (1.1):

$$\begin{aligned} \varPhi (f,g) := \int _{{\mathbb{R }}^d \times {\mathbb{R }}^d} \varPhi (x-y)f(x)g(y) \; \text{ d}x \, \text{ d}y , \quad f,g \in D(\varPhi ) \end{aligned}$$
(1.16)

where

$$\begin{aligned} {\left\{ \begin{array}{ll} D(\varPhi ) = C_c^\infty ({\mathbb{R }}^d)&(d\ge 3)\\ D(\varPhi ) = \{ f \in C_c^\infty ({\mathbb{R }}^2): \int _{{\mathbb{R }}^2} f \; \text{ d}x = 0\}&(d=2). \end{array}\right.} \end{aligned}$$
(1.17)

These quadratic forms are not bounded on \(L^2({\mathbb{R }}^d)\), as is most apparent when \(d=2\). They are closely related to the Dirichlet forms given by

$$\begin{aligned} E(u,v) := \int _{{\mathbb{R }}^d} \nabla u \cdot \nabla v \; \text{ d}x , \quad u, v\in C_c^\infty ({\mathbb{R }}^d). \end{aligned}$$
(1.18)

The correspondence between the two is duality: for all \(f \in D(\varPhi )\),

$$\begin{aligned} \varPhi (f,f) = \sup \left\{ \left( \int _{{\mathbb{R }}^d} fu \; \text{ d}x \right)^2 : u \in C_c^\infty ({\mathbb{R }}^d), E(u,u) \le 1 \right\} . \end{aligned}$$
(1.19)

This set-up admits the following natural generalization: Let \((X,\mu )\) always be a metric measure space and \(L^2(X)\) be the Hilbert space of equivalence classes of real-valued square \(\mu \)-integrable functions on \(X\) with inner product \((u,v)=(u,v)_{L^2}\). Let \(E: D(E) \times D(E) \rightarrow {\mathbb{R }}\) be a closed positive quadratic form on \(L^2(X)\) with \(D(E) \subseteq L^2(X)\) a dense linear subspace. It is sometimes convenient to assume that \(E\) is regular, i.e., \(C_c(X) \cap D(E)\) is \(E\)-dense in \(D(E)\). That \(E\) is closed means that \(D(E)\) is a Hilbert space with inner product \(E(u,v)+m^2(u,v)_{L^{2}}\) for any \(m^{2}>0\). For the example (1.18), the domain \(D(E)\) of the form closure of \(C_c^\infty ({\mathbb{R }}^d)\) is the usual Sobolev space \(H^1({\mathbb{R }}^d)\) and \((u,v)_{H^1} = E(u,v)+(u,v)_{L^2}\) is the usual Sobolev inner product.

It follows [29] from closedness that \(E\) is the quadratic form associated to a unique self-adjoint operator \(L: D(L) \rightarrow L^2(X)\),

$$\begin{aligned} E(u,v) = (u,Lv) \quad \text{ for}\, u \in D(E), v \in D(L), \end{aligned}$$
(1.20)

where \(D(L)\subseteq D(E)\) is a dense linear subspace in \(L^2(X)\). Moreover, self-adjointness of \(L\) gives rise to a spectral family and functional calculus. This means in particular that for any Borel measurable \(F : [0,\infty ) \rightarrow {\mathbb{R }}\), there is a self-adjoint operator, denoted \(F(L) : D(F(L)) \rightarrow L^2(X)\), where

$$\begin{aligned} F(L)&:= \int _0^\infty F(\lambda ) \; \text{ d}P_\lambda , \end{aligned}$$
(1.21)
$$\begin{aligned} D(F(L))&:= \left\{ u \in L^2(X): \int _0^\infty F(\lambda )^2 \; \text{ d}(u, P_\lambda u) < \infty \right\} \end{aligned}$$
(1.22)

with \(P_\lambda \) the spectral family associated to \(L\), and \((u, P_\lambda u)\) is the spectral measure associated to \(L\) and \(u \in L^2(X)\). In these terms, \(E\) has the representation

$$\begin{aligned} E(u,u) = \Vert L^{{\frac{1}{2}}}u\Vert _{L^2(X)}^2 = \int _{\mathrm{spec}(L)} \lambda \; \text{ d}(u,P_\lambda u), \quad u \in D(E) = D(L^{{\frac{1}{2}}}), \end{aligned}$$
(1.23)

where \(E(u,v)\) for \(u \ne v\) is defined by the polarization identity. Similarly, the corresponding Green form can be defined by polarization and

$$\begin{aligned} \varPhi (f,f) = \Vert L^{-{\frac{1}{2}}}f\Vert _{L^2(X)}^2 = \int _{\mathrm{spec}(L)} \lambda ^{-1} \; \text{ d}(f,P_\lambda f), \quad f\in D(\varPhi ) = D(L^{-{\frac{1}{2}}}).\nonumber \\ \end{aligned}$$
(1.24)

This representation will be our starting point for the decomposition of the Green form. Before stating the result and its proof, let us sketch how the decomposition problem arises in probability theory.

1.4 Gaussian fields and statistical mechanics

Even though the linear space \(D(E)\) is complete under the metric induced by the inner product \(E(u,v)+m^2(u,v)_{L^2}\) for any \({m^2}> 0\), it is generally not complete for \({{m^2}} = 0\). It may however be completed to a Hilbert space abstractly; we denote this Hilbert space by \((H_E, (\cdot ,\cdot )_E)\). Similarly, we can complete the domain \(D(\varPhi )\) to a Hilbert space under the quadratic form \(\varPhi \); this Hilbert space is denoted by \((H_\varPhi , (\cdot ,\cdot )_\varPhi )\). \(H_E\) and \(H_\varPhi \) are dual in the following sense: The \(L^2\) inner product can be restricted to

$$\begin{aligned} \langle \cdot , \cdot \rangle : D(\varPhi ) \times D(E) \rightarrow {\mathbb{R }}, \quad \langle f, u \rangle = (f,u) = (L^{-{\frac{1}{2}}}f, L^{{\frac{1}{2}}} u) \end{aligned}$$
(1.25)

which extends to a bounded bilinear form on \(H_\varPhi \times H_E\). \(L\) acts by definition isometrically from \(D(E)\) to \(D(\varPhi )\), with respect to the norms of \(H_E\) and \(H_\varPhi \), and it extends to an isometric isomorphism from \(H_E\) to \(H_\varPhi \). Thus, \(H_\varPhi \) is identified with the dual space of \(H_E\) naturally, via the extension of the \(L^2\) pairing \(\langle \cdot , \cdot \rangle \).

Remark 1.1

To give some insight into the interpretation of the spaces \(H_E\) and \(H_\varPhi \), let us mention how \(H_E\) can be characterized in the case of the Newtonian potential [13]:

$$\begin{aligned}&H_E \cong \{ f : {\mathbb{R }}^d \rightarrow {\mathbb{R }} \text{ measurable}\,: \nonumber \\&\text{ there} \text{ exists} \text{ an}\; E\text{--Cauchy} \text{ sequence}\quad f_n \in D(E) \text{ with}\, f_n \rightarrow f \text{ a.e.} \} / \sim _d\nonumber \\ \end{aligned}$$
(1.26)

where \(\sim _d\) is the usual identification of functions that are equal almost everywhere when \(d\ge 3\). For \(d=2, \sim _d\) in contrast identifies functions that may differ by a constant almost everywhere. (It is therefore sometimes said that the massless free field does not exist in two dimensions, but that its gradient does. The massless free field is the free field corresponding to \(\varPhi \) in the terminology explained below.) To understand this distinction, take a smooth cut-off function \(\varphi _1\) on \({\mathbb{R }}^2\), e.g., with \(\varphi _1 \equiv 1\) on \(B_1(0)\) and \(\varphi _1 \equiv 0\) on \(B_2(0)^c\), set \(\varphi _n(x)=\varphi _1(x/n)\), and note that \(E(\varphi _n,\varphi _n) = n^{d-2} E(\varphi _1,\varphi _1)\). Thus, \((\varphi _n)\) is bounded in \(H_E\) whenever \(d\le 2\), and then (by the Banach–Alaoglu theorem) there is \(\psi \in H_E\) such that \(\varphi _n \rightarrow \psi \) weakly along a subsequence in \(H_E\); however, \(\varphi _n \rightarrow 1\) pointwise, so that \(\psi \equiv 1 \in H_E\). Now \(E(1,1) = 0\) implies that the constant functions must be in the same equivalence class as the zero function.
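
The scaling identity used in this remark follows from a one-line change of variables, spelled out here for convenience. Since \(\nabla \varphi _n(x) = n^{-1} (\nabla \varphi _1)(x/n)\), substituting \(y = x/n\) gives

$$\begin{aligned} E(\varphi _n,\varphi _n) = n^{-2} \int _{{\mathbb{R }}^d} |(\nabla \varphi _1)(x/n)|^2 \; \text{ d}x = n^{d-2} \int _{{\mathbb{R }}^d} |\nabla \varphi _1(y)|^2 \; \text{ d}y = n^{d-2} E(\varphi _1,\varphi _1). \end{aligned}$$

In particular, for \(d=2\) the energy is independent of \(n\), whereas for \(d \ge 3\) it diverges as \(n \rightarrow \infty \).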

It is well known that any separable real Hilbert space \((H,(\cdot , \cdot )_H)\) defines a Gaussian process indexed by \(H\) [32]. This is a probability space \((\varOmega , P)\) and a unitary map \(f \in H \rightarrow \langle f, \phi \rangle \in L^2(P)\) such that the random variables \(\langle f, \phi \rangle \) are Gaussian with variance \((f,f)_H\). Note that \(\langle f, \phi \rangle \) is merely a symbolic notation for the random variable on \(L^2(P)\) that corresponds to \(f \in H\). It cannot in general be interpreted as the pairing of \(f \in H\) with a random element \(\phi (\omega ) \in H\) defined for \(\omega \in \varOmega \); see e.g. [30].

In particular, if \((H,(\cdot ,\cdot )_H)\) is the Hilbert space \((H_\varPhi , (\cdot , \cdot )_{H_\varPhi })\), this process is called the free field or the Gaussian free field (corresponding to the Dirichlet form \(E\) or Green’s function \(\varPhi \)). The importance of free fields in statistical mechanics, and probability theory in a wider sense, is well recognized. For instance, observables of many models of statistical mechanics are intricately related to them, by relations such as the Kac–Siegert transform [4]. These models include spin models such as the Ising model, as well as Coulomb and dipole systems. In a different direction, if \(E\) is a Markovian form that satisfies some regularity conditions, there exists an associated Markov process [19], and it turns out that there are strong connections between the distributions of the local times of this Markov process and the free field associated to the same Dirichlet form; see e.g. [5, 14–16, 33]. In particular, in a generalized “non-commutative” notion of Gaussian processes that are supersymmetric, this correspondence becomes especially striking; see e.g. the review [7]. The last-mentioned correspondence is the point of departure for an analysis of the critical behavior of models of self-avoiding walks in dimension four [9].

For typical applications to statistical mechanics, the measure space \((X,\mu )\) of Sect. 1.3 is endowed with additional structure such as a distance function, a notion of smoothness, etc., as is the case for the Newtonian potential. The global properties of the free field are of special interest for statistical mechanics. An example of such a global property is, if \(X\) is an infinite graph and \(X_n \uparrow X\) is an increasing sequence of finite graphs approximating \(X\), in an appropriate sense, the behavior of

$$\begin{aligned} \int \prod _{x\in X_n} e^{-V(\phi _x)} \; \text{ d}P(\phi ), \quad \text{ as}\, n\rightarrow \infty \end{aligned}$$
(1.27)

for some \(V: {\mathbb{R }} \rightarrow {\mathbb{R }}\). The covariance \(\varPhi \) is typically long-range as in (1.1). This makes the analysis of the global properties of free fields difficult.

Decompositions like (1.8) give rise to notions of scale and corresponding multiscale decompositions of the Gaussian free field and therefore provide a point of departure for multiscale analysis. One instance of such an application is the renormalization group method; see e.g. [4] and references therein.

1.5 Main result

Let \((X,\mu )\) be a metric measure space. In addition, let \(d : X \times X \rightarrow [0,\infty ]\) be an extended pseudometric on \(X\). (Extended means that \(d(x,y)\) may be infinite and pseudo that \(d(x,y) = 0\) for \(x\ne y\) is allowed. Example 1.2 below gives an example of interest where \(d\) is not the metric of \(X\).)

Let \(E: D(E) \times D(E) \rightarrow {\mathbb{R }}\) be a regular closed symmetric form on \(L^2(X)\) as in Sect. 1.3 and denote by \(L : D(L) \rightarrow L^2(X)\) its self-adjoint generator. Theorem 1.1 assumes that \((X,\mu ,d,E)\) satisfies one of the following two finite propagation speed conditions that we now introduce: for \(\gamma > 0\), \(B>0\), and an increasing function \(\theta :(0,\infty )\rightarrow (0,\infty )\), let us say that \((X,\mu ,d,E)\) satisfies \((P_{\gamma ,\theta })\) respectively (\(P_{\theta ,B}^{{*}}\)) if:

$$\begin{aligned} \text{ supp}(\cos (L^{{\frac{1}{2}}\gamma } t)u) \subseteq N_{\theta (t)}(\text{ supp}(u)) \quad \text{ for} \text{ all} \, u \in C_c(X), t > 0, \quad (P_{\gamma ,\theta }) \end{aligned}$$

respectively

$$\begin{aligned}&E(u,u) \le B\Vert u\Vert _{L^2(X)}^2 \quad \text{ for} \text{ all} \, u \in L^2(X), \nonumber \\&\text{ supp}(L^{n}u) \subseteq N_{\theta (n)}(\text{ supp}(u)) \quad \text{ for} \text{ all} \, u \in C_c(X), n \in \mathbb N ,\quad \quad (P_{\theta ,B}^{{*}}) \end{aligned}$$

where as before \(N_t(U) = \{x \in X: d(x,U) \le t \}\) for any \(U \subset X\). The left-hand side of \((P_{\gamma ,\theta })\) is defined in terms of functional calculus for the self-adjoint operator \(L\).

Note that if \(L=-\Delta _{{\mathbb{R }}^d}\) is the Laplace operator of \({\mathbb{R }}^d\), then \(u(t,x) = [\cos (L^{1/2}t)u_0](x)\) is a solution to the standard wave equation (1.13), and the condition \((P_{\gamma ,\theta })\) with \(\gamma =1\) and \(\theta (t)=t\) is the finite propagation speed property (1.14). The property holds for more general elliptic operators and elliptic systems (not necessarily of second order), however; see Example 1.2 below. Similarly, if \(L =-\Delta _\mathbb{Z ^d}\) is the discrete Laplace operator (1.15), then (\(P_{\theta ,B}^{{*}}\)) holds with \(B=4d\) and \(\theta (n) = n\), since the spectrum of \(L\) is contained in \([0,4d]\) and \(Lu(x)\) only depends on \(u(y)\) when \(x\) and \(y\) are nearest neighbors. As for the property \((P_{\gamma ,\theta })\), the condition (\(P_{\theta ,B}^{{*}}\)) remains true for more general discrete Dirichlet forms; see Examples 1.2–1.3.
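
The support part of (\(P_{\theta ,B}^{{*}}\)) for the discrete Laplace operator (1.15) is easy to test directly. The sketch below is an illustration under stated assumptions rather than part of the argument: it represents a finitely supported \(u\) on \(\mathbb Z ^d\) as a dictionary, applies \(L = -\Delta _{\mathbb{Z }^d}\) repeatedly to a point mass, and records the largest graph distance from the origin reached by the support. The function names `apply_L`, `l1_radius` and `check_finite_range` are hypothetical.

```python
def apply_L(u):
    """Apply L = -Delta_{Z^d} from (1.15) to a finitely supported function.

    u is a dict mapping lattice sites (tuples of ints) to integer values.
    """
    out = {}
    for x, ux in u.items():
        d = len(x)
        out[x] = out.get(x, 0) + 2 * d * ux      # diagonal part: 2d * u(x)
        for i in range(d):
            for step in (1, -1):
                y = x[:i] + (x[i] + step,) + x[i + 1:]
                out[y] = out.get(y, 0) - ux      # each neighbour y receives -u(x)
    return {x: v for x, v in out.items() if v != 0}


def l1_radius(u, center):
    """Largest l^1 (graph) distance from center to a site in supp(u)."""
    return max(sum(abs(a - b) for a, b in zip(x, center)) for x in u)


def check_finite_range(d=2, n_max=5):
    """Return the support radii of L^n applied to a point mass, n = 1..n_max."""
    origin = (0,) * d
    u = {origin: 1}
    radii = []
    for _ in range(n_max):
        u = apply_L(u)
        radii.append(l1_radius(u, origin))
    return radii
```

Each radius returned by `check_finite_range` should be at most \(n\) for the \(n\)th power, in line with \(\theta (n) = n\); since the infinity distance is bounded by the graph distance, the same bound holds for the infinity distance.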

Let us introduce a further condition: the heat kernel bound (\(H_{\alpha ,\omega }\)) holds when the heat semigroup \((e^{-tL})_{t>0}\) has continuous kernels \(p_t\) for all \(t>0\) and there is \(\alpha >0\) and a bounded function \(\omega : X \rightarrow {\mathbb{R }}_+\) such that

$$\begin{aligned} p_t(x,x) \le \omega (x) t^{-\alpha /2} \quad \text{ for} \text{ all} \; x \in X. \quad (H_{\alpha ,\omega }) \end{aligned}$$

Criteria for (\(H_{\alpha ,\omega }\)) are classic; see e.g. [26] for second-order elliptic operators and also the discussion in the examples below.

Theorem 1.1

Suppose \((X,\mu ,d,E)\) satisfies \((P_{\gamma ,\theta })\) or (\(P_{\theta ,B}^{{*}}\)). Then the corresponding Green form (1.24) admits a finite range decomposition (1.8) with \(S=\varPhi \) and \(S_{t}=\varPhi _{t}\) such that the \(\varPhi _t\) are bounded quadratic forms with

$$\begin{aligned} |\varPhi _t(f,g)| \le C_{\gamma ,B} t^{2/\gamma } \Vert f\Vert _{L^2(X)}\Vert g\Vert _{L^2(X)} \quad \text{ for} \text{ all} \; f,g \in L^2(X).\qquad \end{aligned}$$
(1.28)

Moreover, (\(H_{\alpha ,\omega }\)) implies that the \(\varPhi _t\) have continuous kernels \(\phi _t\) that satisfy

$$\begin{aligned} |\phi _t(x,y)| \le C_{\alpha ,\gamma ,B} \sqrt{\omega (x)\omega (y)} t^{-(\alpha -2)/\gamma } .\qquad \end{aligned}$$
(1.29)
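
To illustrate how a statement of this type can arise, here is a hedged sketch for \(\gamma = 1\). It is one reading of the idea outlined in Sect. 1.2 under the assumptions stated here, and not necessarily the construction used in the proof. Suppose \(\varphi : {\mathbb{R }} \rightarrow [0,\infty )\) is even and bounded with \(\int _0^\infty s \varphi (s) \, \mathrm{d}s = 1\), and that its Fourier transform \(\hat{\varphi }\) is supported in \([-1,1]\), with the convention \(\varphi (x) = (2\pi )^{-1} \int \hat{\varphi }(s) e^{ixs} \, \mathrm{d}s\); for instance, \(\varphi \) may be taken as a suitable multiple of \(|\hat{\kappa }|^2\) for a smooth even \(\kappa \) supported in \([-1/2,1/2]\). The substitution \(s = \lambda ^{1/2}t\) gives, for \(\lambda > 0\),

$$\begin{aligned} \lambda ^{-1} = \int _0^\infty t^2 \varphi (\lambda ^{1/2} t) \; \frac{\mathrm{d}t}{t} , \end{aligned}$$

so inserting this into the spectral representation (1.24) suggests setting

$$\begin{aligned} \varPhi _t(f,g) := t^2 (f, \varphi (L^{{\frac{1}{2}}}t) g), \quad \varphi (L^{{\frac{1}{2}}}t) = \frac{1}{2\pi } \int _{-1}^{1} \hat{\varphi }(s) \cos (L^{{\frac{1}{2}}} t s) \; \mathrm{d}s . \end{aligned}$$

Then \(\varPhi _t(f,f) \ge 0\) because \(\varphi \ge 0\), and \(|\varPhi _t(f,g)| \le (\sup \varphi ) \, t^2 \Vert f\Vert _{L^2(X)} \Vert g\Vert _{L^2(X)}\), which has the form (1.28) with \(\gamma =1\). Moreover, \((P_{1,\theta })\) together with the monotonicity of \(\theta \) gives \(\varPhi _t(f,g) = 0\) for \(f,g \in C_c(X)\) whenever \(d(\text{ supp}(f),\text{ supp}(g)) > \theta (t)\), which is the finite range condition in (1.8).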

1.6 Examples

Example 1.1

(Elliptic operators with constant coefficients) Let \(a = (a_{ij})_{i,j=1,\dots , d}\) be a strictly positive definite matrix in \({\mathbb{R }}^{d\times d}\) and

$$\begin{aligned} E_a({u}, {v})&= \sum _{i,j=1}^d \int _{{\mathbb{R }}^d} (D_i {u}(x)) a_{ij} (D_j{v}(x)) \; \text{ d}x, \quad u,v \in C_c^\infty ({\mathbb{R }}^d) ,\end{aligned}$$
(1.30)
$$\begin{aligned} E_a^*({u}, {v})&= \sum _{i,j=1}^d \sum _{x\in \mathbb Z ^d} (\nabla _i {u}(x)) a_{ij} (\nabla _j {v}(x)), \quad u,v\in C_c(\mathbb Z ^d), \end{aligned}$$
(1.31)

where \(D_iu(x)\) is the partial derivative of \({u}(x)\) in direction \(i=1,\ldots ,d\),

$$\begin{aligned} \nabla _i u(x) = u(x+e_i)-u(x) \end{aligned}$$
(1.32)

with \(e_i\) the unit vector in the positive \(i\)th direction, and \(C_c(\mathbb Z ^d)\) is the space of functions \(u:\mathbb Z ^d \rightarrow {\mathbb{R }}\) with finite support. For \(m^2 \ge 0\), further set

$$\begin{aligned} E_{a,{m^2}}(u,v) = E_a(u,v) + m^2 \int _{{\mathbb{R }}^d} u(x)v(x) \; \text{ d}x \end{aligned}$$
(1.33)

and define \(E_{a,{m^2}}^*\) analogously. Assume that the eigenvalues of \(a\) are contained in the interval \([B_-^{{2}}, B_+^{{2}}]\), and in the discrete case also that \({m^2}\in [0, M_+^{{2}}]\), for \(B_-^{{2}},B_+^{{2}},M_+^{{2}} > 0\); these assumptions are only important for uniformity in the constants below.

In the continuous context, let \(d\) be the Euclidean distance on \(X={\mathbb{R }}^d\) and \(\mu \) be the Lebesgue measure. It follows that \((X,\mu ,d,E)\) satisfies \((P_{\gamma ,\theta })\) with \(\gamma =1\), \(\theta (t) = B_+t\); see Example 1.2 for more details. In the discrete context, let \(d\) be the infinity distance on \(X=\mathbb Z ^d\), i.e., \(d(x,y)=\max _{i=1,\ldots , d} |x_i-y_i|\), and \(\mu \) be the counting measure. Then (\(P_{\theta ,B}^{{*}}\)) holds with \(B=4dB_+^{{2}} + M_+^{{2}}\) and \(\theta (n)=n\).

Theorem 1.1 thus implies that the Green’s functions associated to \(E_{a,{m^2}}\) and \(E_{a,{m^2}}^*\) admit finite range decompositions. Let us denote their kernels by \(\phi _t(x,y; a,{m^2})\) and \(\phi _t^*(x,y; a,{m^2})\). In addition to (1.29), it is not difficult to obtain estimates on the decay of the derivatives of \(\phi _t\) and \(\phi _t^*\), like (1.12), in this situation of constant coefficients. Since these estimates are of interest for applications, we provide the details in Sect. 3.2 (in a slightly more general context). We show that there are constants \(C_{l,k}>0\) depending only on \(B_-\) and \(B_+\), and in the discrete case also on \(M_+\), such that

$$\begin{aligned} |D_a^{l_a} D_{{m^2}}^{l_{{m^2}}} D_y^{{l_y}} D_x^{l_x} \phi _t(x,y;a,m^2)| \le C_{l,k} t^{-(d-2)-l_x-l_y+2l_{{m^2}}} (1+m^2t^2)^{-k}\qquad \end{aligned}$$
(1.34)

and

$$\begin{aligned} |D_a^{l_a} D_{{m^2}}^{l_{{m^2}}} \nabla _y^{{l_y}} \nabla _x^{l_x} \phi _t^*(x,y;a,m^2)| \le C_{l,k} t^{-(d-2)-l_x-l_y+2l_{{m^2}}} (1+m^2t^2)^{-k}\nonumber \\ \end{aligned}$$
(1.35)

for all integers \(l_a, l_{{m^2}}, l_x, l_y\), and \(k\) such that

$$\begin{aligned} l_{{m^2}} < {\frac{1}{2}}(d + l_x+l_y), \end{aligned}$$
(1.36)

and that the following approximation result holds: there is \(c>0\) such that

$$\begin{aligned} \nabla _x^{l_x} \nabla _y^{{l_y}} \phi _t^*(x,y; a, m^2)&= D_x^{l_x} D_y^{{l_y}} \phi _t(cx,cy; a,m^2) \nonumber \\ \quad&+ O(t^{-(d-2)-l_x-l_y-1}(1+m^2t^2)^{-k}). \end{aligned}$$
(1.37)

This reproduces and generalizes many results of [1, 6]. More precisely, we verify that there exists a smooth function \(\bar{\phi }: {\mathbb{R }}^d \times [B_-^{{2}},B_+^{{2}}] \times [0,\infty ) \rightarrow {\mathbb{R }}\) supported in \(|x| \le B_+\) such that

$$\begin{aligned} \phi _t(x,y;a,m^2) = t^{-(d-2)} \bar{\phi }\left(\frac{x-y}{t}; a,m^2t^2\right) \end{aligned}$$
(1.38)

which has the same structure as (1.10) when \(m^2=0\); this is scale invariance. Moreover, by (1.37), the discrete Green’s function has a scaling limit and the error is of the order of the rescaled lattice spacing \(O(t^{-1})\). This result improves [8].

Example 1.2

(Elliptic operators and systems with variable coefficients) Let \(M \in \mathbb N \) and \(a_{ij}: {\mathbb{R }}^d \rightarrow {\mathbb{R }}^{M\times M}, i,j=1,\dots , d\), be the smooth coefficients of a uniformly elliptic system (or in particular, if \(M=1\), of a uniformly elliptic operator):

$$\begin{aligned} B_-^2 |\mathbf{\xi }|^2 \le \sum _{k,l=1}^M \sum _{i,j=1}^d a_{ij}^{kl}(x) \xi _i^k \xi _j^l \le B_+^2 |\mathbf{\xi }|^2 \quad \text{ for} \text{ all} \, \mathbf{\xi } \in {\mathbb{R }}^{dM}, x \in {\mathbb{R }}^d, \end{aligned}$$
(1.39)

with \(B_-,B_+ > 0\). Let us write \(\mathbf{u} = (\mathbf{u}^1, \ldots , \mathbf{u}^M) \in {\mathbb{R }}^{dM}\) with \(\mathbf{u}^i \in {\mathbb{R }}^d, i=1, \ldots , M\). Let

$$\begin{aligned} E(\mathbf{u}, \mathbf{v}) = \sum _{k,l=1}^M \sum _{i,j=1}^d \int _{{\mathbb{R }}^d} (D_i \mathbf{u}^k(x)) a_{ij}^{kl}(x) (D_j \mathbf{v}^l(x)) \; \text{ d}x, \quad \mathbf{u}, \mathbf{v} \in C_c^\infty ({\mathbb{R }}^d, {\mathbb{R }}^M)\qquad \end{aligned}$$
(1.40)

and analogously in the discrete case (as in (1.30), (1.31)).

To apply Theorem 1.1, \((X,\mu ,d)\) is defined by \(X = {\mathbb{R }}^d \times \{1, \dots , M\}, \mu \) is the product of the Lebesgue measure on \({\mathbb{R }}^d\) and the counting measure on \(\{1, \dots , M\}\), and the distance is given by \(d((x,i),(y,j)) = d(x,y)\). In particular, \(d\) is only a pseudometric on \(X\). We may use the identification of \(\mathbf{u} : {\mathbb{R }}^d \rightarrow {\mathbb{R }}^M\) and \(u: X \rightarrow {\mathbb{R }}\) by \(u(x,i) = \mathbf{u}^i(x)\).

It suffices to verify the condition \((P_{1,B_+t})\) for smooth, compactly supported \(\mathbf{u}_0: {\mathbb{R }}^d \rightarrow {\mathbb{R }}^M\). For such a \(\mathbf{u}_0\), set, by using spectral theory for self-adjoint operators:

$$\begin{aligned} \mathbf{u}(t) := \cos ((L+m^2)^{{\frac{1}{2}}}t)\mathbf{u}_0. \end{aligned}$$
(1.41)

Then, since \(\mathbf{u}_0\) is smooth, \(\mathbf{u}(t,x): {\mathbb{R }} \times {\mathbb{R }}^d \rightarrow {\mathbb{R }}^M\) is smooth jointly in \((t,x)\), and

$$\begin{aligned} \partial _t^2 \mathbf{u} + L\mathbf{u} + m^2 \mathbf{u} = 0, \quad \partial _t\mathbf{u}(0) = 0, \; \mathbf{u}(0) = \mathbf{u}_0 \end{aligned}$$
(1.42)

holds in the classical sense. If \(M=1, m^2=0\), and \(a\) is the \(d\times d\) identity matrix, \((P_{1,t})\) is the finite propagation speed of the wave equation.

Similarly, in the general situation, the property \((P_{1,{B_+}t})\) can be deduced from the finite propagation speed of first-order hyperbolic systems. This is well known, but the explicit reduction for the case of (1.42) with (1.40) is difficult to find in the literature. Let us therefore sketch how to convert (1.42) to a hyperbolic system for readers interested in this case. For example, one can define \(\mathbf{v}: {\mathbb{R }} \times {\mathbb{R }}^d \rightarrow {\mathbb{R }}^{(d+2)M}\) by

$$\begin{aligned} \mathbf{v}_{0}^k = \partial _{t} \mathbf{u}^k, \quad \mathbf{v}_{i}^k = \sum _{j=1}^d\sum _{l=1}^M a_{ij}^{kl} \partial _{x_j} \mathbf{u}^l, \quad \mathbf{v}_{d+1}^k =m \mathbf{u}^k, \quad i=1, \ldots , d, \; k=1, \ldots , M.\nonumber \\ \end{aligned}$$
(1.43)

It follows that \(\mathbf{v}\) satisfies

$$\begin{aligned} \mathbf{S}\partial _t\mathbf{v} + \sum _{j=1}^d \mathbf{A}_j \partial _{x_j} \mathbf{v} + \mathbf{B} \mathbf{v}= 0, \quad \mathbf{v}(0) = (\mathbf{0}, (aD \mathbf{u}_0)^1, \dots , (aD \mathbf{u}_0)^d, m\mathbf{u}_0)\qquad \end{aligned}$$
(1.44)

where \(\mathbf{S}, \mathbf{A}_j, \mathbf{B}: {\mathbb{R }}^d \rightarrow {\mathbb{R }}^{(d+2)M \times (d+2)M}\) are defined as the block matrices

$$\begin{aligned} \mathbf{S} = \begin{pmatrix} 1_{M\times M}&0_{dM\times M}&0_{M\times M} \\ 0_{M\times dM}&a^{-1}&0_{M\times dM} \\ 0_{M\times M}&0_{dM\times M}&1_{M\times M} \end{pmatrix}, \quad \mathbf{B} = \begin{pmatrix} 0_{1\times 1}&0_{d\times 1}&m \\ 0_{1 \times d}&0_{d\times d}&0_{1\times d} \\ -m&0_{d\times 1}&0_{1\times 1} \end{pmatrix} \otimes 1_{M\times M},\nonumber \\ \end{aligned}$$
(1.45)

and

$$\begin{aligned} \mathbf{A}_i = \begin{pmatrix} 0&-\delta _{1i}&\cdots&-\delta _{di}&0\\ -\delta _{1i}&0&\cdots&0&0 \\ \vdots&\vdots&\ddots&\vdots&0 \\ -\delta _{di}&0&\cdots&0&0\\ 0&0&\cdots&0&0 \end{pmatrix} \otimes 1_{M\times M} ,\quad i=1, \dots , d. \end{aligned}$$
(1.46)

It is immediate that this system is symmetric uniformly hyperbolic, by the symmetry and uniform ellipticity of the matrix \(a\). The property \((P_{1,B_+ t})\) now follows from the finite propagation speed of linear hyperbolic systems; see e.g. [3, 25].
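The sign conventions in (1.43)–(1.46) can be sanity-checked in the simplest case. The following is a minimal sketch, assuming \(d=M=1\), a constant coefficient \(a>0\), and the explicit (not compactly supported) solution \(u(t,x)=\cos (\omega t)\cos (kx)\) with \(\omega ^2 = ak^2+m^2\); the function name `system_residual` and the parameter values are hypothetical choices made only for this illustration. Since the system is a pointwise identity, the lack of compact support does not matter here.

```python
import math

def system_residual(t, x, a=2.0, m=0.7, k=1.3):
    """Entries of S dv/dt + A_1 dv/dx + B v for u(t, x) = cos(w t) cos(k x), d = M = 1."""
    w = math.sqrt(a * k * k + m * m)       # so that u solves (1.42) with L = -a d^2/dx^2
    ct, st = math.cos(w * t), math.sin(w * t)
    cx, sx = math.cos(k * x), math.sin(k * x)
    # v = (du/dt, a du/dx, m u) as in (1.43); derivatives are computed by hand.
    v = [-w * st * cx, -a * k * ct * sx, m * ct * cx]
    dv_dt = [-w * w * ct * cx, a * k * w * st * sx, -m * w * st * cx]
    dv_dx = [w * k * st * sx, -a * k * k * ct * cx, -m * k * ct * sx]
    S = [[1.0, 0.0, 0.0], [0.0, 1.0 / a, 0.0], [0.0, 0.0, 1.0]]
    A = [[0.0, -1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    B = [[0.0, 0.0, m], [0.0, 0.0, 0.0], [-m, 0.0, 0.0]]

    def mat_vec(M, y):
        return [sum(M[i][j] * y[j] for j in range(3)) for i in range(3)]

    return [p + q + r for p, q, r in
            zip(mat_vec(S, dv_dt), mat_vec(A, dv_dx), mat_vec(B, v))]
```

All three entries vanish up to rounding, which is consistent with \(\mathbf{S}\) and \(\mathbf{A}_1\) being symmetric and \(\mathbf{B}\) being antisymmetric in this case.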

Nash showed [26] that \((H_{d,\omega })\) holds when \(M=1\). In [23, 24], conditions are given for \((H_{d,\omega })\) to hold when \(M > 1\). In particular, this includes the constant coefficient case. The latter case can be treated using the Fourier transform; see Sect. 3.2.

Example 1.3

(Random walk on graphs) Let \((X,E)\) be a (locally finite) graph, with vertex set \(X\) and edge set \(E \subset P_2(X)\), where \(X\) is a countable (or finite) set and \(P_2(X)\) are the subsets of \(X\) with two elements. Let \(d : X \times X \rightarrow [0,\infty ]\) be the graph distance on \((X,E)\), i.e., \(d(x,y)\) is the (unweighted) length of the shortest path from \(x\) to \(y\).

Suppose that edge weights \(\mu _{xy}=\mu _{yx} \ge 0\), \(x,y\in X\) are given. These induce a natural measure, also denoted \(\mu \), on \(X\) by

$$\begin{aligned} \mu _x = \sum _{y\in X} \mu _{xy}, \quad \mu (A) = \sum _{x\in A} \mu _x \quad \text{ for} \text{ all} \, A \subseteq X. \end{aligned}$$
(1.47)

The associated Dirichlet form is

$$\begin{aligned} E(u,u) = {\frac{1}{2}}\sum _{xy \in E} \mu _{xy} (u(x)-u(y))^2 \quad \text{ for} \text{ all} \, u \in D(E) = L^2(\mu ) \end{aligned}$$
(1.48)

and its generator is given by

$$\begin{aligned} Lu(x) = \mu _x^{-1} \sum _{y \in X} \mu _{xy} (u(x)-u(y)) \quad \text{ for} \text{ all} \text{ finitely} \text{ supported} \;u: X \rightarrow {\mathbb{R }}.\nonumber \\ \end{aligned}$$
(1.49)

\(L\) is called the probabilistic Laplace operator associated to the simple random walk on the weighted graph \((X,\mu )\) with transition probabilities \(\mu _{xy}/\mu _x\). Let us remark that a probabilistic interpretation (or a maximum principle) does not hold in general for Examples 1.1–1.2 (when \(a\) is non-diagonal or vector-valued).

The Dirichlet form (1.48) is bounded on \(L^2(\mu )\) with operator norm \(2\) so that the property \((P_{\theta ,B}^{{*}})\) holds with \(\theta (n)=n\) and \(B=2\), and Theorem 1.1 is applicable.
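As an illustration, the bound and the finite range property can be checked on a small example. The sketch below assumes a path graph with six vertices and hypothetical edge weights \(1+x\) on the edge \(\{x,x+1\}\), and it interprets the sum in (1.48) as a sum over ordered pairs \((x,y)\); all function names are chosen for this example only.

```python
def laplacian(mu):
    """Matrix of L in (1.49) and vertex measure mu_x in (1.47) for a symmetric weight matrix mu."""
    n = len(mu)
    deg = [sum(row) for row in mu]                      # mu_x = sum_y mu_xy
    L = [[(1.0 if x == y else 0.0) - mu[x][y] / deg[x] for y in range(n)]
         for x in range(n)]
    return L, deg

def apply(L, u):
    """Matrix-vector product (L u)(x)."""
    return [sum(L[x][y] * u[y] for y in range(len(u))) for x in range(len(u))]

def energy(mu, u):
    """Dirichlet energy, summing over ordered pairs (x, y) with the factor 1/2."""
    n = len(u)
    return 0.5 * sum(mu[x][y] * (u[x] - u[y]) ** 2 for x in range(n) for y in range(n))

def path_weights(n):
    """Hypothetical example: path 0 - 1 - ... - (n-1) with weight 1 + x on the edge {x, x+1}."""
    mu = [[0.0] * n for _ in range(n)]
    for x in range(n - 1):
        mu[x][x + 1] = mu[x + 1][x] = 1.0 + x
    return mu

mu = path_weights(6)
L, deg = laplacian(mu)
u = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25]
form = sum(deg[x] * u[x] * Lu_x for x, Lu_x in enumerate(apply(L, u)))  # (u, L u) in L^2(mu)
norm_sq = sum(deg[x] * u[x] ** 2 for x in range(6))

# Finite range: L^n applied to a point mass at vertex 0 is supported in {0, ..., n}.
v = [1.0] + [0.0] * 5
supports = []
for n in range(1, 4):
    v = apply(L, v)
    supports.append(max(x for x in range(6) if v[x] != 0.0))
```

Here `form` agrees with `energy(mu, u)` and is at most `2 * norm_sq`, while `supports` records how far \(L^n\delta _0\) reaches for \(n=1,2,3\).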

For applications, it is often useful to add a killing rate to the random walk. The probabilistic Green density with killing rate \(\kappa \in (0,1)\) is defined by:

$$\begin{aligned} G^\kappa (x,y) = \sum _{n\ge 0} p^n(x,y) \kappa ^n = (\kappa L + (1-\kappa ))^{-1}(x,y) = (L^\kappa )^{-1}(x,y)\qquad \end{aligned}$$
(1.50)

where \(p^n(x,y)\) is the kernel of the operator \(P^n\) on \(L^2(\mu )\), with \(P = 1-L\). Note that for \(\kappa =1\), (1.50) only converges when the random walk is transient, but that \(L^{-1}\) still makes sense as a quadratic form on its appropriate domain when the random walk is recurrent, as in (1.16), (1.17) for \(d=2\). Note further that \(\mathrm{spec}(L^\kappa ) \subseteq [0,2]\) for all \(\kappa \in [0,1]\), so that Theorem 1.1 is applicable uniformly in \(\kappa \in [0,1]\).

Closely related to the killed Green’s function \(G^\kappa \) is the resolvent kernel of \(L\). The resolvent of \(L\) is defined on \(L^2(\mu )\) by \(G_{{m^2}}= (L+m^2)^{-1}\) for \(m^2 > 0\). It is related to the killed Green’s density by

$$\begin{aligned} G^\kappa = \kappa ^{-1} G_{(1-\kappa )/\kappa }. \end{aligned}$$
(1.51)
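On the spectral side, (1.50) and (1.51) reduce to scalar identities: if \(\lambda \in [0,2]\) is a spectral value of \(L\), then \(1-\lambda \) is the corresponding spectral value of \(P=1-L\). The following sketch checks these identities numerically; the function names and the truncation of the series after finitely many terms are assumptions made for the illustration.

```python
def killed_green_series(lam, kappa, terms=2000):
    """Truncated series sum_n kappa^n (1 - lam)^n, the scalar version of (1.50)."""
    return sum((kappa * (1.0 - lam)) ** n for n in range(terms))

def killed_green_closed(lam, kappa):
    """Scalar version of (kappa L + (1 - kappa))^{-1}."""
    return 1.0 / (kappa * lam + (1.0 - kappa))

def resolvent(lam, m2):
    """Scalar version of the resolvent (L + m^2)^{-1}."""
    return 1.0 / (lam + m2)
```

For instance, with \(\lambda = 1.7\) and \(\kappa = 0.9\), the three expressions `killed_green_series(1.7, 0.9)`, `killed_green_closed(1.7, 0.9)`, and `resolvent(1.7, 0.1 / 0.9) / 0.9` agree up to rounding.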

One difference compared with the killed Green’s function is that \(L+m^2\) is not bounded uniformly in \(m^2 \ge 0\). To achieve the condition (\(P_{\theta ,B}^{{*}}\)) for fixed \(B>0\), it is therefore necessary to restrict to \(m^2 \le M_+^2\) with \(M_+^2 = B - 2\).

Remark 1.2

Other examples to which Theorem 1.1 applies include Dirichlet spaces that satisfy a Davies-Gaffney estimate [31], such as weighted manifolds, and quadratic forms corresponding to powers of elliptic operators like \(\Delta ^{2}\).

1.7 Remarks

Remark 1.3

Theorem 1.1 also gives the decomposition into sums as in [1, 6, 10]: suppose that the assumptions of Theorem 1.1 are satisfied and, for notational simplicity, that the resulting decomposition has a kernel. Then, for any \(L > 1\),

$$\begin{aligned} \varPhi (x,y) = \sum _{j \in \mathbb Z } C_j(x,y) \quad \text{ for} \text{ all} \; x,y \in X \end{aligned}$$
(1.52)

where the functions \(C_j : X \times X \rightarrow {\mathbb{R }}, j\in \mathbb Z \) are given by

$$\begin{aligned} C_j(x,y) := \int _{L^{j-1}}^{L^{j}} \phi _t(x,y) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all} \, x,y \in X. \end{aligned}$$
(1.53)

They satisfy the following properties:

$$\begin{aligned}&C_j \text{ is} \text{ the} \text{ kernel} \text{ of} \text{ a} \text{ positive} \text{ semi-definite} \text{ form}, \end{aligned}$$
(1.54)
$$\begin{aligned}&C_j(x,y) = 0 \quad \text{ for} \text{ all} \, x,y \in X \text{ with} \, d(x,y) \ge L^j, \end{aligned}$$
(1.55)

and, if \((H_{\alpha ,\omega })\) holds,

$$\begin{aligned} |C_j(x,y)| \le c_\alpha (x,y) {\left\{ \begin{array}{ll} L^{-(\alpha -2)(j-1)}&(\alpha > 2) \\ L^{(2-\alpha )j}&(\alpha < 2) \\ \log (L)&(\alpha = 2) \end{array}\right.} \end{aligned}$$
(1.56)

where \(c_\alpha (x,y)\) is independent of \(L\). Thus, \((C_j)_{j\in \mathbb Z }\) is a finite range decomposition into discrete scales of the Green’s function \(\varPhi \). Similarly, gradient estimates such as (1.34), (1.35), (1.37) in Example 1.1 have obvious discrete versions.
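For the Newtonian potential in \(d=3\), the decomposition into discrete scales can be written in closed form. The sketch below assumes the profile \(\varphi (s) = \max (1-s,0)^2\) in (1.2), chosen only because the integral over \(t\) is then explicit; its positive definiteness is not checked here. With this choice, \(\int _0^\infty t^{-1}\varphi (r/t)\,\mathrm{d}t/t = 1/(3r)\), and the function name `C` and the value \(L=2\) are hypothetical.

```python
def C(j, r, L=2.0):
    """C_j(r) = int_{L^(j-1)}^{L^j} t^(-1) phi(r/t) dt/t for phi(s) = max(1 - s, 0)^2, d = 3.

    An antiderivative of t^(-2) (1 - r/t)^2 on t >= r is (1 - r/t)^3 / (3 r).
    """
    upper = max(1.0 - r / L ** j, 0.0) ** 3
    lower = max(1.0 - r / L ** (j - 1), 0.0) ** 3
    return (upper - lower) / (3.0 * r)
```

The sum over \(j\) telescopes to \(1/(3r)\), each \(C_j(r)\) vanishes for \(r \ge L^j\) as in (1.55), and \(0 \le C_j(r) \le L^{-(j-1)}\), in line with (1.56) for \(\alpha = 3\).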

Remark 1.4

More generally than in Theorem 1.1, we may consider a family of symmetric forms, \(E^s, s \in Y\), where \(Y\) is a domain in a Banach space, with generators \(L^s\). Let us assume that \(E^s\) is smooth in \(s\), in the following sense: there exists a projection-valued measure \(P\) on a measurable space \(M\) and a function \(V: Y \times M \rightarrow (0,\infty )\), smooth in \(Y\), such that

$$\begin{aligned} F(L^s) = \int _{\mathrm{spec}(L^s)} F(\lambda ) \; \text{ d}P_\lambda ^s = \int _M F(V(s, \tau )) \; \text{ d}P_\tau . \end{aligned}$$
(1.57)

An example of this condition is \(E^{s}(f,f) = E(f,f) + s(f,f)\) in which case \(V(s, \lambda ) = \lambda + s\) and \((L^{s})^{-1}\) is the resolvent of \(L\); similarly, the killed Green’s function of Example 1.3 can be expressed in this way. Then the family of kernels \(\phi ^s\) is continuous in \(s\), and if (\(H_{\alpha ,\omega }\)) holds for \(s=0\), and \(V(s,\lambda ) \ge z^2(s) V(0,\lambda ) + m^2(s)\), then

$$\begin{aligned} |\phi ^s_t(x,y)| \le C_{\alpha ,\gamma ,l} \sqrt{\omega (x)\omega (y)} (z(s)t)^{-(\alpha -2)/\gamma } (1+tm(s))^{-l} . \end{aligned}$$
(1.58)

This can be verified by a straightforward adaptation of the proof of Theorem 1.1.

2 Proof of Theorem 1.1

2.1 Spectral decomposition

The starting point for the proof is the spectral representation of the Green form (1.24):

$$\begin{aligned} \varPhi (f,f) = \int _{\mathrm{spec}(L)} \lambda ^{-1} \; \text{ d}(f, P_\lambda f) \quad \text{ for} \text{ all} \,f\in D(\varPhi ) , \end{aligned}$$
(2.1)

where \(f \in D(\varPhi )\) implies that the integral can be restricted to \(\mathrm{spec}(L) \setminus 0\). The main result follows by decomposition of the function \(\lambda ^{-1}: \mathrm{spec}(L)\setminus 0 \rightarrow {\mathbb{R }}_+\). Different decompositions are needed under the two conditions (\(P_{\gamma ,\theta }\)) and (\(P_{\theta ,B}^{{*}}\)). The main idea of the proof is that decompositions with good properties exist. This is summarized in the following lemma, which we first use to deduce Theorem 1.1 and then prove in Sect. 2.2.

Lemma 2.1

(Spectral decomposition) Suppose that \(L\) satisfies \((P_{\gamma ,\theta })\) or (\(P_{\theta ,B}^{{*}}\)); in the second case, we assume that \(\gamma =1\). Then there exists a smooth family of functions \(W_t \in C^\infty ({\mathbb{R }}), t>0\), such that for all \(\lambda \in \mathrm{spec}(L) \setminus 0, t>0\), and all integers \(l\),

$$\begin{aligned} \lambda ^{-1} = \int _0^\infty t^{\frac{2}{\gamma }} W_t(\lambda ) \; \frac{\mathrm{d}t}{t}, \end{aligned}$$
(2.2)
$$\begin{aligned} W_t(\lambda )&\ge 0,\end{aligned}$$
(2.3)
$$\begin{aligned} (1+t^{\frac{2}{\gamma }}\lambda )^l W_t(\lambda )&\le C_{l} , \end{aligned}$$
(2.4)

and that for all \(u \in C_c(X)\),

$$\begin{aligned} \text{ supp}(W_t(L)u) \subseteq N_{\theta (t)}(\text{ supp}(u)) . \end{aligned}$$
(2.5)

Remark 2.1

More precisely, we will give explicit formulae for \(W_t\) that imply

$$\begin{aligned} (1+t^{\frac{2}{\gamma }}\lambda )^l \lambda ^{m} \left| \frac{\partial ^m}{\partial \lambda ^m} W_t(\lambda )\right| \le C_{l,m} \end{aligned}$$
(2.6)

for all \(m\) and \(l\), improving (2.4). This improvement is used in Sect. 3.2.

Proof

(Theorem 1.1) It follows from (2.2) that, for any \(f \in D(\varPhi )\),

$$\begin{aligned} \varPhi (f,f)&= \int _{\mathrm{spec}(L)} \left( \int _0^\infty t^{\frac{2}{\gamma }} W_t(\lambda ) \; \frac{\mathrm{d}t}{t} \right) \; \text{ d}(f,P_\lambda f) \nonumber \\&= \int _{0}^\infty t^{\frac{2}{\gamma }} \left( \int _{\mathrm{spec}(L)} W_t(\lambda ) \; \text{ d}(f,P_\lambda f) \right) \; \frac{\mathrm{d}t}{t} \nonumber \\&= \int _{0}^\infty t^{\frac{2}{\gamma }} (f, W_t(L) f) \; \frac{\mathrm{d}t}{t} . \end{aligned}$$
(2.7)

The exchange of the order of the two integrals in the equation above is justified by non-negativity of the integrand, by (2.3). The latter also implies that \((f, W_t(L)f) \ge 0\) for all \(f \in L^2(X)\). The polarization identity allows us to recover \(\varPhi (f,g)\) for all \(f,g \in D(\varPhi )\). Finally, (2.5) completes the verification of (1.8) for \(\varPhi _t\) defined by

$$\begin{aligned} \varPhi _t(f,g) = t^{\frac{2}{\gamma }} (f,W_t(L)g). \end{aligned}$$
(2.8)

It remains to prove that (\(H_{\alpha ,\omega }\)) implies (1.29).

The semigroup property and the continuity of \(p_t\) imply that \(p_t \in C_b(X,L^2(X))\) with

$$\begin{aligned} \Vert p_t(x,\cdot )\Vert _{L^2(X)}^2 = \int _X p_t(x,y)p_t(y,x) \; \text{ d}\mu (y) = p_{2t}(x,x), \qquad \end{aligned}$$
(2.9)
$$\begin{aligned} \Vert p_t(x,\cdot )-p_t(y,\cdot )\Vert _{L^2(X)}^2 = p_{2t}(x,x)+p_{2t}(y,y)-2p_{2t}(x,y) \rightarrow 0 \quad \text{ as}\, x\rightarrow y.\nonumber \\ \end{aligned}$$
(2.10)

This implies that \(e^{-tL}: L^2(X) \rightarrow C_b(X)\) is a continuous linear operator (since \(e^{-tL}f(x) = (p_t(x,\cdot ), f)\)). Duality then implies continuity of \(e^{-tL}: C_b(X)^* \rightarrow L^2(X)\) (with respect to the strong topology on \(C_b(X)^*\)). Let \(M(X) \subseteq C_b(X)^*\) be the space of signed finite Radon measures on \(X\) equipped with the weak-* topology. Let \(m_i \in M(X)\) with \(m_i \rightarrow 0\). Then

$$\begin{aligned} \Vert e^{-tL}m_i\Vert _{L^2(X)}&= \left( \int _X \biggl ( \int _X p_t(x,y) \; \text{ d}m_i(y) \biggr )^2 \; \text{ d}\mu (x) \right)^{{\frac{1}{2}}} \nonumber \\&= \left( \int _X \int _X (p_t(y,\cdot ), p_t(z,\cdot )) \; \text{ d}m_i(y) \, \text{ d}m_i(z) \right)^{{\frac{1}{2}}} \rightarrow 0 \end{aligned}$$
(2.11)

which means that \(e^{-tL}:M(X) \rightarrow L^2(X)\) is continuous (because \(X\) is separable and therefore the weak-* topology of \(M(X)\) is metrizable). This implies that \((1+t^{2/\gamma }L)^{-l}: M(X) \rightarrow L^2(X)\) is likewise continuous for all \(l > \alpha /4\). To see this, we use the relation

$$\begin{aligned} (1+t^{2/\gamma }\lambda )^{-l} = \varGamma (l)^{-1} \int _0^\infty e^{-s} s^{l-1} e^{-st^{2/\gamma }\lambda } \; \text{ d}s \end{aligned}$$
(2.12)

which holds by the change of variables formula and the definition of Euler’s gamma function. The spectral theorem thus implies that, for any \(u \in L^2(X)\),

$$\begin{aligned} \Vert (1+t^{2/\gamma }L)^{-l}u\Vert _{L^2(X)} \le \varGamma (l)^{-1} \int _0^\infty e^{-s} s^{l-1} \Vert e^{-st^{2/\gamma } L} u\Vert _{L^2(X)}\; \text{ d}s . \end{aligned}$$
(2.13)

Since \(\mu \) has full support, \(L^2(X) \cap M(X)\) is dense in \(M(X)\) (where \(L^p(X)\) is always with respect to \(\mu \)), and the claimed continuity of \((1+t^{2/\gamma }L)^{-l}:M(X) \rightarrow L^2(X)\) follows from (2.11). In particular, the pointwise bound for \(p_t\) implies that for \(l > \alpha /4\),

$$\begin{aligned} \Vert (1+t^{2/\gamma }L)^{-l}\delta _x\Vert _{L^2(X)}&\le \varGamma (l)^{-1} \int _0^\infty e^{-s} s^{l-1} \Vert e^{-st^{2/\gamma } L} \delta _x\Vert _{L^2(X)}\; \text{ d}s \nonumber \\&\le \varGamma (l)^{-1} \sqrt{\omega (x)} t^{-\alpha /2\gamma } \int _0^\infty e^{-s} s^{l-1-\alpha /4} \; \text{ d}s \nonumber \\&= C \sqrt{\omega (x)} t^{-\alpha /2\gamma } \end{aligned}$$
(2.14)
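The integral representation (2.12), on which (2.13) and (2.14) rest, can be checked numerically in its scalar form with \(a = t^{2/\gamma }\lambda \ge 0\). The following is a minimal sketch using a trapezoidal rule on a truncated interval; the function name, the truncation point, and the number of steps are assumptions of this illustration.

```python
import math

def gamma_integral(a, l, upper=80.0, steps=200_000):
    """Trapezoidal approximation of Gamma(l)^(-1) int_0^upper e^(-s) s^(l-1) e^(-s a) ds."""
    h = upper / steps

    def f(s):
        return math.exp(-s * (1.0 + a)) * s ** (l - 1)

    total = 0.5 * (f(0.0) + f(upper)) + sum(f(i * h) for i in range(1, steps))
    return h * total / math.gamma(l)
```

For example, `gamma_integral(2.5, 3)` should be close to `(1 + 2.5) ** -3`.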

Let \(\kappa _t(\lambda ) = W_t(\lambda )^{1/2}\). Then (2.4) and the spectral theorem also imply that

$$\begin{aligned} \Vert \kappa _t(L)(1+t^{2/\gamma }L)^l\Vert _{L^2(X)\rightarrow L^2(X)} = \sup _{\lambda > 0} \kappa _t(\lambda )(1+t^{2/\gamma }\lambda )^l \le C_l, \end{aligned}$$
(2.15)

uniformly in \(t>0\). It follows from (2.14) and (2.15) that \(\kappa _t(L): M(X) \rightarrow L^2(X)\) with

$$\begin{aligned} \Vert \kappa _t(L)\delta _x\Vert _{L^2} \le C \sqrt{\omega (x)} t^{-\alpha /2\gamma } . \end{aligned}$$
(2.16)

Finally, by the Cauchy-Schwarz inequality,

$$\begin{aligned} |\phi _t(x,y)| = t^{2/\gamma } |(\kappa _t(L) \delta _y, \kappa _t(L)\delta _x)| \le t^{2/\gamma } \Vert \kappa _t(L)\delta _y\Vert _{L^2(X)} \Vert \kappa _t(L)\delta _x\Vert _{L^2(X)}\nonumber \\ \end{aligned}$$
(2.17)

which, with (2.16), proves (1.29). The continuity of \(\phi _t\) is implied by the continuity of \(\kappa _t(L): M(X) \rightarrow L^2(X)\) and of \(\delta _x\) in \(x \in X\) (in the weak-* topology). \(\square \)

Remark 2.2

The decay for \(\phi ^s\) claimed in (1.58) can be obtained by a straightforward generalization of the above argument, replacing (2.12) by

$$\begin{aligned} (1+t^{2/\gamma }z^2\lambda + t^{2/\gamma }m^2)^{-l} = \varGamma (l)^{-1} \int _0^\infty e^{-s} s^{l-1} e^{-st^{2/\gamma }m^2} e^{-s z^2 t^{2/\gamma } \lambda } \; \text{ d}s . \end{aligned}$$
(2.18)

Remark 2.3

Furthermore, by (2.4), the operators \(W_t(L)\) are smoothing for \(t>0\), in the general sense that, for any \(t>0\),

$$\begin{aligned} W_t(L): L^2(X) \rightarrow C^\infty (L), \quad \text{ where} \, C^\infty (L) := \bigcap _{n=0}^\infty D(L^n) \subset L^2(X) \end{aligned}$$
(2.19)

is the set of \(C^\infty \)-vectors for \(L\); see [28]. Standard elliptic regularity estimates imply, e.g., that \(C^\infty (L) = C^\infty (X)\) when \(E\) is the quadratic form associated to an elliptic operator with smooth coefficients.

2.2 Proof of Lemma 2.1

To complete the proof of Theorem 1.1, it remains to demonstrate Lemma 2.1. We first prove it under condition \((P_{\gamma ,\theta })\) in Lemma 2.2 below; this proof is quite straightforward using the assumption and (1.2). Subsequently, we prove Lemma 2.1 in the situation of condition (\(P_{\theta ,B}^{{*}}\)) in Lemma 2.3; here additional ideas are required.

To fix conventions, let us define the Fourier transform of an integrable function \(\varphi : {\mathbb{R }} \rightarrow {\mathbb{R }}\) by

$$\begin{aligned} \hat{\varphi }(k) = (2\pi )^{-1} \int _{\mathbb{R }} \varphi (x) e^{-ikx} \; \text{ d}x \quad \text{ for} \text{ all} \, k \in {\mathbb{R }} . \end{aligned}$$
(2.20)

Lemma 2.2

(Lemma 2.1 under  \((P_{\gamma ,\theta })\)) For any \(\varphi : {\mathbb{R }} \rightarrow [0,\infty )\) such that \(\hat{\varphi }\) is smooth and symmetric with \(\text{ supp}(\hat{\varphi }) \subseteq [-1,1]\), and for any \(\gamma > 0\), there is \(C>0\) such that

$$\begin{aligned} W_t(\lambda ) := C \varphi (\lambda ^{{\frac{1}{2}}\gamma }t) \end{aligned}$$
(2.21)

satisfies (2.2), (2.3), (2.4), and also (2.6), for all \(\lambda >0, t>0\); and if \((P_{\gamma ,\theta })\) holds, then \((W_t)\) also satisfies (2.5).

Remark 2.4

It is not difficult to see that such \(\varphi \) exist. For example, if \(\hat{\kappa }\) is a smooth real-valued function with support in \([-{\frac{1}{2}},{\frac{1}{2}}]\), then \(\varphi = |\kappa |^2\) satisfies the assumptions. For simplicity, let us assume sometimes in the following that \(\varphi \) is chosen such that \(C=1\) when Lemma 2.1 is applied.

Proof

Note that for any \(\varphi :[0,\infty )\rightarrow {\mathbb{R }}\) with \(t^{\frac{2}{\gamma }-1}\varphi (t)\) integrable, there is \(C > 0\) such that

$$\begin{aligned} \lambda ^{-1} = C \int _0^\infty t^{\frac{2}{\gamma }} \varphi (\lambda ^{{\frac{1}{2}}\gamma } t) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all} \, \lambda >0. \end{aligned}$$
(2.22)

This simply follows (as in (1.2)) because the right-hand side is homogeneous in \(\lambda \) of degree \(-1\), which is immediate by rescaling of the integration variable. This shows (2.2); (2.3) is obvious by assumption; and (2.4) follows since \(\hat{\varphi }\) is smooth. The improved estimate (2.6) follows from the chain rule (or Faà di Bruno’s formula) and

$$\begin{aligned} \lambda ^{m-{\frac{1}{2}}\gamma } \left| \frac{\partial ^m}{\partial \lambda ^m} \lambda ^{{\frac{1}{2}}\gamma } \right| \le C_{\gamma ,m} \end{aligned}$$
(2.23)

for non-negative integers \(m\), using that \(\text{ supp}(\hat{\varphi }) \subseteq [-1,1]\) implies that \(\varphi \) is smooth. Moreover, since \(\text{ supp}(\hat{\varphi }) \subset [-1,1]\), and since \(\hat{\varphi }\) is smooth,

$$\begin{aligned} W_t(L)u = C \int _{-1}^1 {\hat{\varphi }(s)} \cos (L^{{\frac{1}{2}}\gamma } ts)u \; \text{ d}s \quad \text{ for} \text{ all} \, u \in L^2(X), \end{aligned}$$
(2.24)

where the integral is the Riemann integral, i.e., the strong limit of its Riemann sums (with values in \(L^2\)). Therefore (2.5) follows from \((P_{\gamma ,\theta })\). \(\square \)
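The scaling identity (2.22) depends only on homogeneity, so it can be illustrated with any integrable profile. The sketch below assumes the Gaussian \(\varphi (t) = e^{-t^2}\), which does not satisfy the support condition on \(\hat{\varphi }\) in Lemma 2.2 and is used only because the constant is then explicit, \(C = 2/\varGamma (1/\gamma )\). The function names and the quadrature parameters are hypothetical.

```python
import math

def scale_integral(lam, gamma, upper=30.0, steps=100_000):
    """Midpoint rule for int_0^upper t^(2/gamma) phi(lam^(gamma/2) t) dt/t, phi(t) = exp(-t^2)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (2.0 / gamma - 1.0) * math.exp(-(lam ** (0.5 * gamma) * t) ** 2)
    return h * total

def scale_constant(gamma):
    """Constant C in (2.22) for the Gaussian profile: C = 2 / Gamma(1/gamma)."""
    return 2.0 / math.gamma(1.0 / gamma)
```

For \(\gamma \in \{1,2\}\) and several values of \(\lambda \), the product `scale_constant(gamma) * scale_integral(lam, gamma)` reproduces \(1/\lambda \) up to quadrature error.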

The previous proof makes essential use of the finite propagation speed of the wave equation \((P_{\gamma ,\theta })\) to prove (2.5). This property fails for discrete Dirichlet forms such as (1.31) where we instead know the property (\(P_{\theta ,B}^{{*}}\)) that polynomials of degree \(n\) of the generator have finite range \(\theta (n)\).

This leads to the following problem. Find polynomials \(W_t^*\), \(t>0\), of degree at most \(t\) satisfying the properties (2.3), (2.4), (2.6) such that the decomposition formula (2.2) for \(1/\lambda \) holds. In the proof of Lemma 2.2, the verification of (2.4) (and (2.6)) and of the decomposition formula (2.2) are directly linked to the “ballistic” scaling of the wave equation: \(W_t(\lambda )=W_1(\lambda t^2)\). To construct polynomials satisfying such “ballistic” estimates, we are led by the following remarkable discovery of Carne [11]. The Chebyshev polynomials \(T_k, k \in \mathbb Z \), defined by

$$\begin{aligned} T_k(\theta ) = \cos (k\arccos (\theta )) \quad \text{ for} \text{ all} \, \theta \in [-1,1], k \in \mathbb Z , \end{aligned}$$
(2.25)

are solutions to the discrete (in space and time) wave equation in the following sense. Let \(\nabla _+f(n) = f(n+1)-f(n)\) and \(\nabla _-f(n) = f(n-1)-f(n)\) be the discrete (forward and backward) time differences. Then, as polynomials in \(X\),

$$\begin{aligned} \nabla _-\nabla _+ T_n(X) = \nabla _+\nabla _- T_n(X) = 2(1-X) T_n(X). \end{aligned}$$
(2.26)

In particular, when \(2(X-1)=-L\) or equivalently \(X = 1-{\frac{1}{2}}L\), then \(v(n,x) = [T_n(1-{\frac{1}{2}}L) u](x)\) solves the following “Cauchy problem” for the discrete wave equation:

$$\begin{aligned} -\nabla _+\nabla _- v + Lv = 0, \quad v(0) = u, (\nabla _+v-\nabla _-v)(0)=0 . \end{aligned}$$
(2.27)
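The scalar version of (2.27) is easy to test: for a spectral value \(\lambda \in [0,4]\) of \(L\), set \(v(n) = T_n(1-{\frac{1}{2}}\lambda )\). With the definitions of \(\nabla _\pm \) above, \(-\nabla _+\nabla _- v(n) = v(n+1)+v(n-1)-2v(n)\). The following sketch (function names are chosen for this illustration) evaluates the residual of the discrete wave equation and the initial condition.

```python
import math

def cheb(n, x):
    """Chebyshev polynomial T_n(x) = cos(n arccos x) for x in [-1, 1], as in (2.25)."""
    return math.cos(n * math.acos(x))

def wave_residual(n, lam):
    """Scalar version of (2.27): -grad_+ grad_- v(n) + lam v(n) for v(n) = T_n(1 - lam/2)."""
    def v(k):
        return cheb(k, 1.0 - 0.5 * lam)
    second_difference = v(n + 1) + v(n - 1) - 2.0 * v(n)   # equals -grad_+ grad_- v(n)
    return second_difference + lam * v(n)

def initial_velocity(lam):
    """(grad_+ v - grad_- v)(0) = v(1) - v(-1)."""
    return cheb(1, 1.0 - 0.5 * lam) - cheb(-1, 1.0 - 0.5 * lam)
```

Both quantities vanish up to rounding, which reflects the three-term recurrence \(T_{n+1}+T_{n-1} = 2XT_n\) and the symmetry \(T_{-n}=T_n\).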

The analogy between the discrete- and the continuous-time wave equations is like that between the discrete- and the continuous-time random walk. It turns out that the structure of Chebyshev polynomials allows us to prove the following lemma.

Lemma 2.3

(Lemma 2.1  under  (\(P_{\theta ,B}^{{*}}\))) Let \(\varphi : {\mathbb{R }} \rightarrow [0,\infty )\) satisfy the assumptions of Lemma 2.2. Then \(W_t^*: [0,4] \rightarrow [0,\infty )\), defined by

$$\begin{aligned} W_t^*(\lambda ) := \sum _{n \in \mathbb Z } \varphi (\arccos (1-{\frac{1}{2}}\lambda ) t-2\pi nt) \quad \text{ for} \text{ all} \, \lambda \in [0,4], t>0, \end{aligned}$$
(2.28)

is the restriction of a polynomial in \(\lambda \) of degree at most \(t\) to \([0,4]\), with coefficients smooth in \(t\), and, for any \(\varepsilon > 0\), (2.2), (2.3), (2.4), (2.5), and (2.6) hold for all \(\lambda \in (0,4-\varepsilon ], t > 0\).

Proof

The proof verifies that \(W_t^*\) as defined in (2.28) has the asserted properties. Let

$$\begin{aligned} \varphi _{t}^{*}(x) := \sum _{n \in \mathbb Z } \varphi (xt-2\pi nt) = \sum _{k \in \mathbb Z } t^{-1}{\hat{\varphi }(k/t)} \cos (kx) \end{aligned}$$
(2.29)

where the second equality follows by symmetry of \(\hat{\varphi }\), the change of variables formula, and a version of the Poisson summation formula which is easily verified, for sufficiently nice \(\varphi \). Then the claim (2.2) can be expressed as

$$\begin{aligned} \lambda ^{-1} = \int _0^\infty t^{2} \varphi _t^*(\arccos (1-{\frac{1}{2}}\lambda )) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all}\, \lambda \in (0,4] . \end{aligned}$$
(2.30)

Let \(x = \arccos (1-{\frac{1}{2}}\lambda )\) or equivalently \(\lambda = 2(1-\cos x) = 4 \sin ^2(\frac{1}{2} x)\). In terms of this change of variables, (2.30) and thus the claim (2.2) are then equivalent to

$$\begin{aligned} \displaystyle \frac{1}{4}\sin ^{-2}\left({\frac{1}{2}}x\right) = \int _0^\infty t^{2} \varphi ^*_t(x) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all} \, x \in (0,\pi ]. \end{aligned}$$
(2.31)

The left-hand side defines a meromorphic function on \(\mathbb C \) with poles at \(2\pi \mathbb Z \). Its development into partial fractions is (see e.g. [2, page 204])

$$\begin{aligned} \displaystyle \frac{1}{4} \sin ^{-2} \left({\frac{1}{2}}x\right) = \sum _{n \in \mathbb Z } (x-2\pi n)^{-2} \quad \text{ for} \text{ all} x \in \mathbb C \setminus 2\pi \mathbb Z . \end{aligned}$$
(2.32)

It follows, by (2.22) with \(\gamma =1\) and \(\lambda = (x-2\pi n)^{2}\), assuming \(C=1\), that

$$\begin{aligned} \displaystyle \frac{1}{4} \sin ^{-2} \left({\frac{1}{2}}x\right) = \sum _{n \in \mathbb Z } \int _0^\infty t^{2} \varphi ((x-2\pi n)t) \; \frac{\mathrm{d}t}{t} \quad \text{ for} \text{ all} x \in (0,\pi ]. \end{aligned}$$
(2.33)

The order of the sum and the integral can be exchanged, by non-negativity of the integrand, thus showing (2.31) and therefore (2.2).
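The partial fraction expansion (2.32) converges slowly but can still be checked by direct summation. The sketch below truncates the sum at \(|n| \le N\); the function name and the value of \(N\) are assumptions, and the neglected tail is of order \(1/(2\pi ^2 N)\).

```python
import math

def partial_fraction_sum(x, N=100_000):
    """Truncated right-hand side of (2.32): sum over |n| <= N of (x - 2 pi n)^(-2)."""
    return sum((x - 2.0 * math.pi * n) ** -2 for n in range(-N, N + 1))

def closed_form(x):
    """Left-hand side of (2.32): 1 / (4 sin^2(x/2))."""
    return 0.25 / math.sin(0.5 * x) ** 2
```

For \(x \in (0,\pi ]\), the two functions agree to within the truncation error.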

To verify that \(W_t^*\) is the restriction of a polynomial, we note that by (2.28), (2.29), and \(\text{ supp}(\hat{\varphi }) \subseteq [-1,1]\),

$$\begin{aligned} W_t^{*}(\lambda ) = \varphi _t^{*}\left(\arccos \left(1-{\frac{1}{2}}\lambda \right)\right)&= \sum _{k\in \mathbb Z } t^{-1} {\hat{\varphi }(k/t)} \cos \left(k\arccos \left(1-{\frac{1}{2}}\lambda \right)\right) \nonumber \\&= \sum _{k \in \mathbb Z \cap [-t,t]} t^{-1} {\hat{\varphi }(k/t)} T_k\left(1-{\frac{1}{2}}\lambda \right) \end{aligned}$$
(2.34)

where \(T_k, k \in \mathbb Z \), are the Chebyshev polynomials defined by (2.25). This shows that \(W_t^*(\lambda )\) is indeed the restriction of a polynomial in \(\lambda \) of degree at most \(t\) to the interval \(\lambda \in [0,4]\). In particular, (2.5) is a trivial consequence of (\(P_{\theta ,B}^{{*}}\)) which states that polynomials in \(L\) of degree \(n\) have range at most \(\theta (n)\).
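The polynomial structure in (2.34) can be made concrete with a non-smooth stand-in for \(\hat{\varphi }\). The sketch below assumes the triangle function \(\hat{\varphi }(k) = \max (1-|k|,0)\) and an integer \(t=N\ge 1\); this choice is not smooth, so the decay estimate (2.4) is not expected to hold for large \(l\), and it serves only to illustrate non-negativity and the degree bound. In this case the sum is the classical Fejér kernel divided by \(N\), namely \(N^{-2}\sin ^2({\frac{1}{2}}Nx)/\sin ^2({\frac{1}{2}}x)\) with \(x=\arccos (1-{\frac{1}{2}}\lambda )\). All function names are hypothetical.

```python
import math

def chebyshev_values(K, X):
    """Return [T_0(X), ..., T_K(X)] via the recurrence T_{k+1} = 2 X T_k - T_{k-1}."""
    T = [1.0, X]
    for _ in range(2, K + 1):
        T.append(2.0 * X * T[-1] - T[-2])
    return T[:K + 1]

def W_star_triangle(lam, N):
    """Right-hand side of (2.34) for t = N and hat-phi(k) = max(1 - |k|, 0)."""
    T = chebyshev_values(N, 1.0 - 0.5 * lam)
    total = T[0] / N                                  # k = 0 term
    for k in range(1, N + 1):
        total += 2.0 * (1.0 - k / N) * T[k] / N       # k and -k contribute equally
    return total

def fejer_closed_form(lam, N):
    """N^(-2) sin^2(N x / 2) / sin^2(x / 2) with x = arccos(1 - lam/2), for lam in (0, 4]."""
    x = math.acos(1.0 - 0.5 * lam)
    return (math.sin(0.5 * N * x) / (N * math.sin(0.5 * x))) ** 2
```

Since `W_star_triangle` only uses the recurrence in \(X = 1-{\frac{1}{2}}\lambda \), it is manifestly a polynomial in \(\lambda \) of degree at most \(N\), and the closed form shows that it is non-negative on \((0,4]\).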

Finally, we verify the estimate (2.6) and thus in particular (2.4). To this end, we note that, in analogy to (2.23), for \(\lambda \in [0,4-\varepsilon ]\) and non-negative integers \(m\),

$$\begin{aligned} \lambda ^{m-{\frac{1}{2}}} \left| \frac{\partial ^m}{\partial \lambda ^m}\arccos \left(1-{\frac{1}{2}}\lambda \right)\right| \le C_{\varepsilon ,m} . \end{aligned}$$
(2.35)

For example, for \(m=1\),

$$\begin{aligned}&\frac{\partial }{\partial \lambda } \arccos \left(1-{\frac{1}{2}}\lambda \right) = {\frac{1}{2}}\left(\lambda -\displaystyle \frac{1}{4}\lambda ^2\right)^{-{\frac{1}{2}}} \le \varepsilon ^{-{\frac{1}{2}}} \lambda ^{-{\frac{1}{2}}} \quad \text{ for} \lambda \in [0,4-\varepsilon ] .\qquad \end{aligned}$$
(2.36)

Therefore (2.6) follows, by the chain rule (or Faà di Bruno’s formula), from

$$\begin{aligned} (1+t^2(1-\cos (x)))^l t^{-m} \left| \frac{\partial ^{m}}{\partial x^{m}} \varphi ^*_t(x)\right| \le C_{l,m} \end{aligned}$$
(2.37)

which we will now show. The argument is essentially a discrete version of the classic fact that the Fourier transform acts continuously on the Schwartz space of smooth and rapidly decaying functions on \({\mathbb{R }}\). To show (2.37), first note that

$$\begin{aligned} (1-\cos (x))e^{ikx} = e^{ikx}-{\frac{1}{2}}e^{i(k+1)x} -{\frac{1}{2}}e^{i(k-1)x} =: \Delta _ke^{ikx} \end{aligned}$$
(2.38)

and thus by induction, for any \(l \in \mathbb N \),

$$\begin{aligned} (1-\cos (x))^le^{ikx} = (1-\cos (x))^{l-1} \Delta _ke^{ikx} = \Delta _k (1-\cos (x))^{l-1} e^{ikx} = \Delta _k^l e^{ikx} .\nonumber \\ \end{aligned}$$
(2.39)

It follows by (2.29) and summation by parts that

$$\begin{aligned} (1+t^2(1-\cos (x)))^l t^{-m} \frac{\partial ^{m}}{\partial x^{m}} \varphi _t^*(x)&= \sum _{k\in \mathbb Z } t^{-1} {\hat{\varphi }}(k/t)(ik/t)^{m} [(1 + t^2 \Delta _k)^{l} e^{ikx}] \nonumber \\&= \sum _{k\in \mathbb Z } [(1 + t^2 \Delta _k)^l t^{-1} {\hat{\varphi }}(k/t) (ik/t)^{m}] e^{ikx}.\nonumber \\ \end{aligned}$$
(2.40)

Let \(h(s) = \displaystyle \frac{1}{2} (|s|-1) 1_{|s|\le 1}\) for \(s \in {\mathbb{R }}\). Then, for any smooth \(f: {\mathbb{R }} \rightarrow {\mathbb{R }}\),

$$\begin{aligned} \Delta _k^nf(k) = (h^{*n} * D^{2n}f)(k), \end{aligned}$$
(2.41)

where \(*\) denotes convolution of two functions on \({\mathbb{R }}\), \(h^{*n} = h * h * \cdots * h\), and \(Df\) is the derivative of \(f\). Indeed,

$$\begin{aligned} \Delta _k f(k)&= -\displaystyle \frac{1}{2} \int _0^1 [Df(k+t) - Df(k-t)] \; \text{ d}t \nonumber \\&= -\displaystyle \frac{1}{2} \int _0^1 \int _{-t}^t D^2f(k+s) \; \text{ d}s \; \text{ d}t = \int _{\mathbb{R }} D^2f(s) h(s-k) \; \text{ d}s = (h * D^2f)(k)\nonumber \\ \end{aligned}$$
(2.42)

and (2.41) then follows by induction:

$$\begin{aligned} \Delta ^{n+1}_{k}f = \Delta _k(h^{*n} * D^{2n} f) = h * D^2(h^{*n} * D^{2n} f) = h * h^{*n} * D^2D^{2n} f.\qquad \end{aligned}$$
(2.43)
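The base case (2.42) of this identity can be tested numerically. The following sketch compares \(\Delta _kf(k) = f(k)-{\frac{1}{2}}f(k+1)-{\frac{1}{2}}f(k-1)\) with a midpoint-rule approximation of \((h*D^2f)(k)\); the function names and the number of quadrature steps are assumptions of this illustration.

```python
import math

def h(s):
    """h(s) = (|s| - 1) / 2 for |s| <= 1 and 0 otherwise."""
    return 0.5 * (abs(s) - 1.0) if abs(s) <= 1.0 else 0.0

def delta(f, k):
    """Delta_k f(k) = f(k) - f(k+1)/2 - f(k-1)/2, as in (2.38)."""
    return f(k) - 0.5 * f(k + 1.0) - 0.5 * f(k - 1.0)

def h_conv_second_derivative(d2f, k, steps=20_000):
    """Midpoint rule for int h(s - k) d2f(s) ds; h vanishes outside [k - 1, k + 1]."""
    step = 2.0 / steps
    total = 0.0
    for i in range(steps):
        offset = -1.0 + (i + 0.5) * step       # offset = s - k runs over (-1, 1)
        total += h(offset) * d2f(k + offset)
    return step * total
```

With an even number of steps the kink of \(h\) at the origin falls on a cell boundary, so the midpoint rule keeps its second-order accuracy.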

It then follows using the facts that \(\sum _{k\in \mathbb Z } |h^{*n}(k-s)| \le C_n\), uniformly in \(s \in {\mathbb{R }}\), and that \(\hat{\varphi }\) is smooth and of rapid decay,

$$\begin{aligned}&t^{-1} \sum _{k\in \mathbb Z } \Big | (1 + t^2\Delta _k)^l [\hat{\varphi }(k/t) (ik/t)^{m}] \Big | \nonumber \\&\le \sum _{n=0}^l C_{l,n} t^{-1} \sum _{k\in \mathbb Z } \int _{\mathbb{R }} |h^{*n}(k-s)|\, |[D^{2n}((\cdot )^{m} \hat{\varphi })](s/t)| \; ds\nonumber \\&\le \sum _{n=0}^l C_{l,n} t^{-1} \int _{\mathbb{R }} |[D^{2n}((\cdot )^{m}\hat{\varphi })](s/t)| \; ds \nonumber \\&= \sum _{n=0}^l C_{l,n} \int _{\mathbb{R }} |[D^{2n}((\cdot )^{m}\hat{\varphi })](s)| \; ds \le C_{m,l} \end{aligned}$$
(2.44)

and thus (2.37), and therefore (2.6), follow from this inequality and (2.40).\(\square \)

Proof

(Lemma 2.1) Lemma 2.1 under \((P_{\gamma ,\theta })\) follows immediately from Lemma 2.2; under (\(P_{\theta ,B}^{{*}}\)), it follows from Lemma 2.3 by setting \(W_t(\lambda ) = W_t^*(\frac{3}{B} \lambda )\). \(\square \)

3 Extensions

3.1 Discrete approximation

In view of the discussion about Chebyshev polynomials before Lemma 2.3, it is not surprising that the functions \(W_t^*\) of Lemma 2.3 approximate the \(W_t\) of Lemma 2.2. In Proposition 3.1 below, we show that this is indeed the case with natural error \(O(t^{-1})\) as \(t\rightarrow \infty \). This result is used in Sect. 3.2 to prove (1.37).

Proposition 3.1

(Discrete approximation) Let \(\varphi \) be as in Lemmas 2.2 and 2.3, with associated functions \(W_t\) and \(W_t^*\) for \(\gamma =1\). Then, for any integer \(l\),

$$\begin{aligned} |W_t^*(\lambda ) - W_t(\lambda )| \le C_l (1\vee t)^{-1} (1+t^2 \lambda )^{-l} \quad \text{ for} \text{ all} \, \lambda \in [0,4]. \end{aligned}$$
(3.1)

In particular, \(W_t^*(\lambda /t^2) \rightarrow C \varphi (\lambda ^{{\frac{1}{2}}})\) as \(t\rightarrow \infty \).

Proof

Note that it suffices to restrict to \(t\ge 1\), since for \(t \le 1\), the claim follows from (2.4). The left-hand side of (3.1) is then proportional to the absolute value of

$$\begin{aligned} \varphi \left(\arccos \left(1-\frac{1}{2} \lambda \right)t\right)-\varphi (\lambda ^{\frac{1}{2}}t) + \sum _{n\in \mathbb Z \setminus \{0\}} \varphi \left(\arccos \left(1-{\frac{1}{2}}\lambda \right)t + 2\pi n t\right)\qquad \end{aligned}$$
(3.2)

We estimate the difference of the first two terms in (3.2) and the sum separately, and show that each of them satisfies (3.1). The first two terms can be written as

$$\begin{aligned} \varphi \left(\arccos \left(1-{\frac{1}{2}}\lambda \right)t\right)-\varphi (\lambda ^{\frac{1}{2}}t) = \left(\arccos \left(1-{\frac{1}{2}}\lambda \right)-\lambda ^{\frac{1}{2}}\right)t \zeta _t(\lambda ) \end{aligned}$$
(3.3)

with

$$\begin{aligned} \zeta _t(\lambda ) = \int _0^1 \varphi ^{\prime }(s \arccos \left(1-{\frac{1}{2}}\lambda \right)t + (1-s) \lambda ^{\frac{1}{2}}t) \; \text{ d}s. \end{aligned}$$
(3.4)

The bounds

$$\begin{aligned} \sqrt{2\lambda }&= \arccos (1-\lambda ) + O(\lambda ) \quad \text{ as} \, \lambda \rightarrow 0+ , \end{aligned}$$
(3.5)
$$\begin{aligned} \sqrt{2\lambda }&\le \arccos (1-\lambda )\le \tfrac{\pi }{2} \sqrt{2\lambda } \quad \text{ for} \text{ all} \, \lambda \in [0,2] , \end{aligned}$$
(3.6)

and the rapid decay of \(\varphi ^{\prime }\) therefore imply that

$$\begin{aligned} |\zeta _t(\lambda )| \le C_l (1+\lambda t^2)^{-l} \end{aligned}$$
(3.7)

and

$$\begin{aligned} \left| \varphi \left(\arccos \left(1-{\frac{1}{2}}\lambda \right)t\right)-\varphi (\lambda ^{\frac{1}{2}}t) \right| \le C_l t^{-1} (1+t^2\lambda )^{-l}. \end{aligned}$$
(3.8)

To estimate the sum in (3.2), we can use the rapid decay of \(\varphi \) with the inequality \(x+y \ge 2 (xy)^{1/2}\) to obtain that

$$\begin{aligned} \sum _{n\in \mathbb Z \setminus \{0\}} \varphi (xt + 2\pi n t)&\le C_l \sum _{n\in \mathbb Z \setminus \{0\}} (1+xt + 2\pi n t)^{-l} \nonumber \\&\le C_l (1+xt)^{-l/2} t^{-l/2} \sum _{n>0} n^{-l/2} \le C_l (1+xt)^{-l/2} t^{-l/2}\nonumber \\ \end{aligned}$$
(3.9)

for any \(l > 2\), with the constant changing from line to line. In particular, upon substituting \(x=\arccos \left(1-{\frac{1}{2}}\lambda \right)\), this bound and (3.5) imply

$$\begin{aligned} \sum _{n\in \mathbb Z \setminus \{0\}} \varphi \left(\arccos \left(1-{\frac{1}{2}}\lambda \right)t + 2\pi n t\right) \le C_l t^{-2l} (1+t^2\lambda )^{-l} . \end{aligned}$$
(3.10)

The claim then follows by adding (3.8) and (3.10).\(\square \)
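The elementary bounds (3.5) and (3.6) used in this proof can be checked on a grid. The sketch below is only a numerical sanity check; the function name and the explicit constant \(0.6\) in the \(O(\lambda )\) bound are assumptions of this illustration, not values taken from the text.

```python
import math

def arccos_gap(lam):
    """arccos(1 - lam) - sqrt(2 lam) for lam in [0, 2]."""
    return math.acos(1.0 - lam) - math.sqrt(2.0 * lam)

grid = [2.0 * i / 1000 for i in range(1, 1001)]        # points in (0, 2]
lower_ok = all(arccos_gap(lam) >= -1e-12 for lam in grid)
upper_ok = all(math.acos(1.0 - lam) <= 0.5 * math.pi * math.sqrt(2.0 * lam) + 1e-12
               for lam in grid)
linear_ok = all(arccos_gap(lam) <= 0.6 * lam for lam in grid)
```

On this grid all three flags are true; the upper bound in (3.6) is attained at \(\lambda = 2\).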

3.2 Estimates for systems with constant coefficients

In this section, we verify the assertions of Example 1.1. We work in the slightly more general context of second-order elliptic systems (instead of operators) with constant coefficients. These are defined as in Example 1.2, and the claims of Example 1.1 hold mutatis mutandis. The analysis is straightforward with the aid of the Fourier transform and reproduces results of [1].

3.2.1 Spectral measures

The spectral measures corresponding to the vector-valued case of (1.30) are given in terms of the Fourier transform as follows. For \(F: [0,\infty ) \rightarrow {\mathbb{R }}\),

$$\begin{aligned} (v, F(L)u) = \sum _{k,l=1}^M \int _{{\mathbb{R }}^{d}} \left[F\left( \sum _{i,j=1}^d a_{ij} \xi _i\xi _j\right)\right]_{kl} \; {\overline{\hat{v}}}^k(\xi ) {\hat{u}}^l(\xi ) \; \text{ d}\xi \end{aligned}$$
(3.11)

where \(\hat{u} = (\hat{u}^1, \dots , \hat{u}^{M})\) is the component-wise Fourier transform of \(u = (u^1, \dots , u^M)\),

$$\begin{aligned} a(\xi ) := \sum _{i,j=1}^d a_{ij} \xi _i\xi _j = \left(\sum _{i,j=1}^d a_{ij}^{kl} \xi _i\xi _j\right)_{k,l=1,\dots ,M} \end{aligned}$$
(3.12)

are symmetric positive definite \(M \times M\) matrices, for all \(\xi \in {\mathbb{R }}^d\), and the matrices \(F(a(\xi ))\) are defined in terms of the spectral decomposition of \(a(\xi )\). Similarly, for the (vector-valued case of the) discrete Dirichlet form (1.31),

$$\begin{aligned} (v, F(L)u) = \sum _{k,l=1}^M \int _{[-\pi ,\pi ]^{d}} \left[F\left(\sum _{i,j=1}^d a_{ij} (1-e^{i\xi _i})(1-e^{-i\xi _j})\right)\right]_{kl} \; {\overline{\hat{v}}}^k(\xi ) \hat{u}^l(\xi ) \; \text{ d}\xi \end{aligned}$$
(3.13)

where \(\hat{u}\) now denotes the component-wise discrete Fourier transform. Let us also write

$$\begin{aligned} a^*(\xi )&:= \sum _{i,j=1}^d a_{ij} (1-e^{i\xi _i})(1-e^{-i\xi _j}) \nonumber \\&= \left(\sum _{i,j=1}^d a_{ij}^{kl} (1-e^{i\xi _i})(1-e^{-i\xi _j}) \right)_{k,l=1,\dots ,M} . \end{aligned}$$
(3.14)

We will often use, without mentioning this further, that the spectra of \(a(\xi )\) and \(a^*(\xi )\) are bounded from above and from below by multiples of \(|\xi |^2\).
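The two-sided bound on the spectra can be illustrated in the scalar case. The sketch below assumes \(M=1\), \(d=2\), and a hypothetical symmetric positive definite coefficient matrix with eigenvalues \(1\) and \(3\). The constants \(4/\pi ^2\) and \(1\) come from \(\tfrac{2}{\pi }|\theta | \le |1-e^{i\theta }| \le |\theta |\) for \(|\theta |\le \pi \).

```python
import cmath
import math

# Hypothetical scalar example (M = 1, d = 2): a symmetric positive definite
# coefficient matrix with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
LAM_MIN, LAM_MAX = 1.0, 3.0

def a_cont(xi):
    """Continuum symbol a(xi) = sum_ij a_ij xi_i xi_j."""
    return sum(A[i][j] * xi[i] * xi[j] for i in range(2) for j in range(2))

def a_disc(xi):
    """Discrete symbol a*(xi) = sum_ij a_ij (1 - e^{i xi_i})(1 - e^{-i xi_j})."""
    w = [1.0 - cmath.exp(1j * x) for x in xi]
    total = sum(A[i][j] * w[i] * w[j].conjugate() for i in range(2) for j in range(2))
    return total.real  # the imaginary part cancels because A is symmetric

def check_grid(steps: int = 40) -> bool:
    """Check both symbols against multiples of |xi|^2 on a grid in [-pi, pi]^2."""
    ok = True
    for k1 in range(steps + 1):
        for k2 in range(steps + 1):
            xi = (-math.pi + 2 * math.pi * k1 / steps,
                  -math.pi + 2 * math.pi * k2 / steps)
            r2 = xi[0] ** 2 + xi[1] ** 2
            if r2 < 1e-12:  # skip the origin, where both symbols vanish
                continue
            lower = (4.0 / math.pi ** 2) * LAM_MIN * r2
            upper = LAM_MAX * r2
            ok = ok and lower * (1 - 1e-9) <= a_disc(xi) <= upper * (1 + 1e-9)
            ok = ok and LAM_MIN * r2 * (1 - 1e-9) <= a_cont(xi) <= upper * (1 + 1e-9)
    return ok

print(check_grid())
```

For systems (\(M>1\)) the same comparison would be made for every eigenvalue of the matrices \(a(\xi )\) and \(a^*(\xi )\). This sketch does not cover that case.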

3.2.2 Estimates

Let us introduce the following notation for derivatives: for a function \(u : {\mathbb{R }}^d \rightarrow {\mathbb{R }}\), we regard the \(l\)th derivative, \(D^lu(x)\), as an \(l\)-linear form, and \(|D^lu(x)|\) denotes a norm of this form. In terms of the Fourier transform, we denote by \({\hat{D}}^{l}(\xi )\) the corresponding “multiplier” operator from functions to \(l\)-linear forms, and by \(|{\hat{D}^l}(\xi )|\) its norm. Similarly, for a discrete function \(u: \mathbb Z ^d \rightarrow {\mathbb{R }}\), the \(l\)th order discrete difference in the positive coordinate directions is denoted by \(\nabla ^lu(x)\) and has Fourier multiplier \({\hat{\nabla }^l(\xi )}\). In particular, when \(l=1\),

$$\begin{aligned} \hat{D}(\xi ) \cong (i\xi _1, \dots , i\xi _d), \quad \hat{\nabla }(\xi ) \cong (e^{i\xi _1}-1, \dots , e^{i\xi _d}-1). \end{aligned}$$
(3.15)

Furthermore, \(k\) and \(p\) will denote integers that may be chosen arbitrarily, and \(C\) constants that can change from instance to instance and may depend on \(k\) and \(p\), as well as \(l=(l_x,l_y,l_a,l_{{m^2}}), B_+, B_-\), and \(M_+\), but not on \(x\), \(\xi \), or \(m\).

Proof

((1.38),(1.34),(1.35)) It follows by the change of variables \(\xi \mapsto t\xi \), from the fact that \(a(\xi )\) is homogeneous of degree \(2\), and from \(W_t(\lambda ) = W_1(\lambda t^2)\) that

$$\begin{aligned} \phi _t(x,y; a, m^2)&= t^2 \int _{{\mathbb{R }}^d} W_t(a(\xi ) +m^2) e^{i (x-y) \cdot \xi }\; \text{ d}\xi \nonumber \\&= t^{-(d-2)} \bar{\phi }\left(\frac{x-y}{t}; a, m^2t^2\right) \end{aligned}$$
(3.16)

with

$$\begin{aligned} \bar{\phi }(x; a, m^2) := \int _{{\mathbb{R }}^d} W_1(a(\xi ) +m^2) e^{i x \cdot \xi }\; \text{ d}\xi \end{aligned}$$
(3.17)

which is supported in \(|x|\le B_+\). This verifies (1.38). Furthermore, (1.34) is a straightforward consequence of (3.16) by differentiation and (2.6). Let us omit the details and only verify them explicitly in the discrete case (1.35). The (derivatives of the) decomposition kernel \(\phi _t^*\) can here be expressed as

$$\begin{aligned} D_a^{l_a} D_{{m^2}}^{l_{{m^2}}}\nabla _x^{l_x} \nabla _y^{{l_y}} \phi _t^*(x,y; a, {m^2}) = t^{-(d-2)-l_x-l_y+2l_{{m^2}}} \bar{\phi }_{t;l}^*(x-y; a,{m^2})\nonumber \\ \end{aligned}$$
(3.18)

with

$$\begin{aligned} \bar{\phi }_{t;l}^*(x; a,m^2) = t^{d+l_x+l_y-2l_{{m^2}}} \int _{[-\pi ,\pi ]^d} D_a^{l_a} D_{{m^2}}^{l_{{m^2}}}W_t^*(a^*(\xi ) +m^2) {\overline{\hat{\nabla }}}{}^{l_y} \hat{\nabla }^{l_x} e^{i x\cdot \xi } \; \text{ d}\xi .\nonumber \\ \end{aligned}$$
(3.19)

Thus (2.6), \(|{\hat{\nabla }}(\xi )| \le C|\xi |\), and \(\eta \cdot a^*(\xi )\eta \ge C|\xi |^2 |\eta |^2\) for \(\eta \in {\mathbb{R }}^M\) imply

$$\begin{aligned} |\bar{\phi }_{t;l}^*(x; a,m^2)|&\le C \int _{[-\pi ,\pi ]^d} (1+C|\xi |^2 t^2 +m^2 t^2)^{-k-p} (t|\xi |)^{l_x+l_y-2l_{{m^2}}} \; t^d \text{ d}\xi \nonumber \\&\le C (1+m^2 t^2)^{-k} \int _{{\mathbb{R }}^d} (1+C|\xi |^2)^{-p} |\xi |^{l_x+l_y-2l_{{m^2}}} \; \text{ d}\xi \end{aligned}$$
(3.20)

where the last integral converges if \({\frac{1}{2}}(d+l_x+l_y) > l_{{m^2}}\) and \(p\) is chosen sufficiently large. It follows that

$$\begin{aligned} |\bar{\phi }_{t;l}^*(x; a,m^2)| \le C (1+m^2t^2)^{-k} \end{aligned}$$
(3.21)

verifying the claim. \(\square \)
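The scaling identity (3.16) used in the proof above can be checked numerically in a toy setting. The sketch below assumes \(d=1\), \(a(\xi )=\xi ^2\), and a Gaussian stand-in \(W_1(\lambda )=e^{-\lambda }\). This stand-in is purely hypothetical: it has the scaling \(W_t(\lambda )=W_1(\lambda t^2)\) but does not give a kernel of finite range. The Fourier integrals are approximated by a midpoint rule on a truncated interval.

```python
import math

def w1(lam: float) -> float:
    """Hypothetical stand-in for W_1; a Gaussian is used only because it is explicit."""
    return math.exp(-lam)

def fourier_integral(weight, x: float, cutoff: float, n: int = 4000) -> float:
    """Midpoint rule for the integral of weight(xi) * cos(x * xi) over [-cutoff, cutoff].

    The weight is assumed even, so e^{i x xi} may be replaced by cos(x xi).
    """
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        xi = -cutoff + (k + 0.5) * h
        total += weight(xi) * math.cos(x * xi)
    return total * h

def phi_t(x: float, t: float, m2: float) -> float:
    """First line of (3.16) in d = 1 with a(xi) = xi^2 and W_t(lam) = W_1(lam t^2)."""
    return t ** 2 * fourier_integral(lambda xi: w1((xi * xi + m2) * t * t), x, 12.0 / t)

def phi_bar(x: float, m2: float) -> float:
    """(3.17) in d = 1 with a(xi) = xi^2."""
    return fourier_integral(lambda xi: w1(xi * xi + m2), x, 12.0)

x, t, m2 = 1.3, 0.5, 0.7
left = phi_t(x, t, m2)
right = t * phi_bar(x / t, m2 * t * t)  # t^{-(d-2)} = t when d = 1
print(left, right)
```

For this Gaussian choice both sides also agree with the closed form \(t\sqrt{\pi }\, e^{-m^2t^2} e^{-x^2/(4t^2)}\), which gives an independent check of the quadrature.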

Proof

(1.37) Let us assume that \(B=3\). Then

$$\begin{aligned} \nabla _x^{l_x}\nabla _y{\!\!}^{{l_y}} \phi _t^*(x,y)-D_x^{l_x}D_y^{{l_y}} \phi _t(x,y)&= t^2 \int _{[-\pi ,\pi ]^d} W_t^*(a^*(\xi )) \hat{\nabla }^{l_x} {\overline{\hat{\nabla }}}^{l_y} e^{i\xi \cdot (x-y)}\; \text{ d}\xi \nonumber \\&\quad - t^2 \int _{{\mathbb{R }}^d} W_t(a(\xi )) \hat{D}^{l_x} {\overline{\hat{D}}}^{l_y} e^{i\xi \cdot (x-y)}\; \text{ d}\xi .\nonumber \\ \end{aligned}$$
(3.22)

To simplify notation, we will write \({\hat{D}}^l = {\hat{D}}^{l_{x}} {\overline{\hat{D}}}^{l_{y}}={\hat{D}}^{l_x} \otimes {\overline{\hat{D}}}^{l_{y}}\) if \(l=(l_x,l_y)\), and similarly for \(\nabla \). Then the difference (3.22) may be estimated as follows. Proposition 3.1 implies

$$\begin{aligned}&\int _{[-\pi ,\pi ]^d} |W_t^*(a^*(\xi )+m^2) - W_t(a^*(\xi )+m^2)| |\hat{D}^l(\xi )| \; \text{ d}\xi \nonumber \\&\quad \le C t^{-1} \int _{{\mathbb{R }}^d} (1+C|\xi |^2 t^2 + m^2 t^2)^{-p-k} |\xi |^{l} \; \text{ d}\xi \le C t^{-d-l-1} (1+m^2t^2)^{-k}\qquad \qquad \end{aligned}$$
(3.23)

where we have assumed in the second inequality above that \(p\) was chosen sufficiently large so that the integral is convergent. Similarly, we may proceed for the other differences, always choosing \(p\) large enough in the estimates. Using (2.6) with \(m=1\) and \(|a^*(\xi ) - a(\xi )| = O(|\xi |^3)\), which follows from Taylor’s theorem, we obtain

$$\begin{aligned}&\int _{[-\pi ,\pi ]^d} |W_t(a^*(\xi )+m^2) - W_t(a(\xi )+m^2)| |\hat{D}^l(\xi )| \; \text{ d}\xi \nonumber \\&\quad \le C \int _{{\mathbb{R }}^d}|\xi | (1+C|\xi |^2 t^2 + m^2 t^2)^{-p-k} |\xi |^{l} \; \text{ d}\xi \le C t^{-d-l-1} (1+m^2t^2)^{-k} .\qquad \qquad \end{aligned}$$
(3.24)

Taylor’s theorem similarly implies that \(|\hat{\nabla }^l(\xi ) - \hat{D}^l(\xi )| \le C |\xi |^{l+1}\) so that, by (2.4),

$$\begin{aligned}&\int _{[-\pi ,\pi ]^d} |W_t^*(a^*(\xi )+m^2)| |\hat{\nabla }^l(\xi ) - \hat{D}^l(\xi )| \; \text{ d}\xi \nonumber \\&\quad \le C \int _{{\mathbb{R }}^d} (1+C|\xi |^2t^2+m^2t^2)^{-p-k} |\xi |^{l+1} \; \text{ d}\xi \le C t^{-d-l-1} (1+m^2t^2)^{-k} .\qquad \qquad \end{aligned}$$
(3.25)

Finally, we obtain by (2.4) that

$$\begin{aligned}&\int _{{\mathbb{R }}^d\setminus [-\pi ,\pi ]^d} |W_t(a(\xi )+m^2)| |\hat{D}^{l}(\xi )| \; \text{ d}\xi \nonumber \\&\quad \le C \int _{{\mathbb{R }}^d\setminus [-\pi ,\pi ]^d} (1+C|\xi |^2t^2+m^2t^2)^{-p-k} |\xi |^{l}\; \text{ d}\xi \le C t^{-2p} (1+m^2t^2)^{-k} .\nonumber \\ \end{aligned}$$
(3.26)

The combination of the previous four inequalities gives (1.37). \(\square \)
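The powers of \(t\) in (3.23)–(3.25) all come from the same rescaling \(\xi \mapsto t\xi \) of an integral of the form \(\int |\xi |^{s} (1+|\xi |^2t^2)^{-p}\, \text{ d}\xi \), which equals \(t^{-d-s}\) times a \(t\)-independent constant. A numerical sketch in \(d=1\) with \(m=0\) follows. The exponents \(s=2\), \(p=4\), the cutoff, and the function name are hypothetical choices for illustration.

```python
import math

def moment_integral(t: float, s: int, p: int, cutoff: float = 50.0, n: int = 20000) -> float:
    """Midpoint rule for the integral of |xi|^s (1 + xi^2 t^2)^(-p) over |xi| <= cutoff / t (d = 1)."""
    h = 2.0 * cutoff / (t * n)
    total = 0.0
    for k in range(n):
        xi = -cutoff / t + (k + 0.5) * h
        total += abs(xi) ** s * (1.0 + xi * xi * t * t) ** (-p)
    return total * h

s, p = 2, 4  # hypothetical exponents: s plays the role of l + 1, and p is "large enough"
rescaled = [moment_integral(t, s, p) * t ** (1 + s) for t in (0.5, 1.0, 2.0, 8.0)]
# For d = 1 the t-independent constant is a Beta function, B((s+1)/2, p - (s+1)/2).
half = 0.5 * (s + 1)
exact = math.gamma(half) * math.gamma(p - half) / math.gamma(p)
print(rescaled, exact)
```

With \(s=l+1\) and \(d=1\), the factor \(t^{-1-s}\) is exactly the power \(t^{-d-l-1}\) on the right-hand sides of (3.24) and (3.25). The condition \(p>(s+1)/2\) is what "\(p\) sufficiently large" means in this toy case.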