1 Introduction

The Riesz transform \(\nabla (-\Delta )^{-\frac{1}{2}}\) on \({\mathbb {R}}^n\), which is the Hilbert transform for \(n = 1\), is one of the most fundamental operators in Analysis. It is closely related to the development of Harmonic Analysis, and has a very wide range of applications. This classical Calderón–Zygmund convolution operator is an isometry on \(L^2({\mathbb {R}}^n)\), as seen by means of the Fourier transform. As is well known, it is bounded on \(L^p({\mathbb {R}}^n)\) for \(1< p < +\infty \) and of weak type (1, 1). There are many other aspects of the Riesz transform, such as endpoint boundedness \(H^1 \longrightarrow L^1\) and \(L^{\infty } \longrightarrow BMO\), dimension-free \(L^p\)-norm bounds, weighted inequalities as well as the Riesz transforms associated to Schrödinger operators, etc. The literature on these topics is huge. In this paper, we restrict ourselves to the \(L^p\) boundedness and the weak type (1, 1) (endpoint) estimate, in the Heisenberg setting defined below.

With the development of analysis and geometry, it is natural to extend the classical results to the settings of Lie groups, weighted Riemannian manifolds and various second-order differential operators. See for example [2, 5, 11, 23, 38,39,40,41,42,43,44, 49, 51, 52] for early work.

We now consider a smooth manifold M endowed with a measure \(d\mu \), a second-order differential operator \(\Delta \) and a first-order operator \(\nabla \) such that the Green’s formula holds; our Heisenberg setting is a special case of this. Then the (first-order) Riesz transform is defined as \(\nabla (-\Delta )^{-{1}/{2}}\), with a suitable modification in the case where 0 is an eigenvalue of \(\Delta \) in \(L^2(\mu )\). Note that this operator is always bounded on \(L^2(\mu )\). Without the Fourier transform, the usual tools to study its \(L^p\) boundedness are instead the Poisson semi-group, probabilistic Littlewood–Paley theory, the resolvent and in particular the heat semi-group and its kernel.

For the case \(2< p < +\infty \), the classical result is not always valid in this general setting. More precisely, for any fixed \(p_0 \ge 2\), there exists a Riemannian manifold of this type where the Riesz transform is bounded on \(L^p\) for \(2< p < p_0\) but unbounded for \(p > p_0\); if \(p_0 > 2\) it is unbounded also on \(L^{p_0}\) and not even of weak type \((p_0, p_0)\). See [18, 31] and [14,15,16,17, 25,26,27] as well as [28] for some concrete examples. We also mention [8, 9, 13, 20, 21] and references therein, with characterizations under the assumptions of volume doubling and Gaussian heat kernel estimates.

For the opposite case \(1< p < 2\), there are many known results but, to the best of our knowledge, no counterexample. In particular, an interesting dichotomy has been obtained in [4]: for any fixed \(p_0 \in (1, 2)\) and fixed \(n \ge 1\), either there is an \(L^{p_0}\) bound on the Riesz transform which is uniform over all complete Riemannian manifolds of dimension n, or there exists a complete n-dimensional Riemannian manifold in which the Riesz transform is unbounded on \(L^{p_0}\).

Next, we focus on the weak type (1, 1) property for the Riesz transform. There exist numerous known specific results and several general sufficient conditions. We distinguish the following cases.

  1.

    The standard assumption of Harmonic Analysis, namely the volume doubling property, holds. Then the most striking result is due to T. Coulhon and X. T. Duong. They showed in [18] that Gaussian estimates for the heat kernel imply the weak type (1, 1), without assuming Gaussian estimates for its gradient. Such gradient estimates are necessary in earlier works that use standard singular integral theory, but in general they do not hold in the setting of a complete Riemannian manifold. Furthermore, the method of Coulhon and Duong can be adapted to settings with sub-Gaussian heat kernel estimates and to even more general situations. See [17] and [37] for details.

  2.

    The volume is not doubling but of at most polynomial growth in the sense of Nazarov–Treil–Volberg (cf. for example [47]). The best-known example seems to be the classical Ornstein–Uhlenbeck operator, studied by P.A. Meyer, R. Gundy, G. Pisier and many other authors. In particular, this example satisfies Bakry and Émery’s condition \(\Gamma _2 \ge 0\) (cf. [10]). The weak type (1, 1) estimate for the Riesz transform associated to the Ornstein–Uhlenbeck operator can be found in [22]; see also [48] for higher order Riesz transforms and other related operators. Recently, A. Hassell and A. Sikora provided another interesting example, the connected sum of a finite number of Riemannian manifolds with very strong geometric and analytic conditions. For this setting, they proved the weak type (1, 1) of the Riesz transform in [28, Theorem 7.1]. Their proof is based on spectral multipliers and the resolvent. One may naturally ask whether their result can be proved by means of the singular integral theory developed in [47].

  3.

    Manifolds of exponential volume growth without spectral gap. The typical case is the affine group, for which a partial result can be found in [50] and a suitable singular integral operator theory has been established in [29]. Moreover, this theory can be adapted to some other situations.

  4.

    Manifolds of exponential volume growth with spectral gap. Under some additional assumptions such as the local volume doubling property and small-time Gaussian heat kernel estimates, the \(L^p\) boundedness of the Riesz transform for \(1< p < 2\) can be found in [18]. As for the weak type (1, 1) estimate, there is at present no adequate singular integral theory, as far as the authors know. However, in the setting of a symmetric space of the non-compact type, Anker obtained in [2] the weak type (1, 1) of the first- and second-order Riesz transforms. See also [6] for other examples. Recently, the present authors studied Riesz transforms associated to the Laplacian with drift in [34, 36] and [35]. Notice also that in the papers [36] and [35], which treat the Laplacian with drift in Euclidean space, the setting can be seen as the direct product of a Euclidean space and the simplest weighted manifold on the real line satisfying exponential volume growth with spectral gap. Let us finally observe that the setting of the present paper is natural in this context, since it consists of a sub-Laplacian with drift in a typical sub-Riemannian manifold, the Heisenberg group. We refer the reader to [46] and [1] and references therein for further details about sub-Riemannian manifolds.

2 Description of the Setting and Statements of Results

Let \({\mathbb {H}}_{n} ={\mathbb {R}}^{ n} \times {\mathbb {R}}^{ n} \times {\mathbb {R}}\) be the Heisenberg group of \(2n + 1\) real dimensions, with points written \(g = (x,y,t)\). The group law is

$$\begin{aligned} (x,y, t) \cdot (x',y', t') = \left( x + x', y+y', t + t' +2 \sum _{j = 1}^n \left( y_j \, x_j' - x_j \, y_j' \right) \right) , \end{aligned}$$
(2.1)

and the Haar measure dg on \({\mathbb {H}}_n\) coincides with the \((2n + 1)\)-dimensional Lebesgue measure.

The vector fields

$$\begin{aligned} {\mathrm{X}}_j = \frac{\partial }{\partial x_j} + 2 y_j \frac{\partial }{\partial t}, \qquad {\mathrm{Y}}_j = \frac{\partial }{\partial y_j} - 2 x_j \frac{\partial }{\partial t}, \qquad 1 \le j \le n, \end{aligned}$$

are left invariant on \({\mathbb {H}}_{n}\) and generate its Lie algebra. The associated sub-Laplacian is

$$\begin{aligned} \Delta = \sum _{j = 1}^n \left( {\mathrm{X}}_j^2 + {\mathrm{Y}}_j^2 \right) . \end{aligned}$$

Further, \(\nabla = (X, Y) = ({\mathrm{X}}_1, \ldots , {\mathrm{X}}_n, {\mathrm{Y}}_1, \ldots , {\mathrm{Y}}_n)\) is the horizontal gradient.
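
As a consistency check between the group law (2.1) and the vector fields above, one can differentiate along right translations. This is only a sketch; the symbol \(e_j\), denoting the j-th standard basis vector of \({\mathbb {R}}^n\), is introduced here for illustration and is not used elsewhere in the text. For smooth f,

$$\begin{aligned} \frac{d}{ds}\Big |_{s=0} f\big ((x,y,t)\cdot (s e_j, 0, 0)\big )&= \frac{d}{ds}\Big |_{s=0} f(x + s e_j,\, y,\, t + 2 s y_j) = {\mathrm{X}}_j f(x,y,t), \\ \frac{d}{ds}\Big |_{s=0} f\big ((x,y,t)\cdot (0, s e_j, 0)\big )&= \frac{d}{ds}\Big |_{s=0} f(x,\, y + s e_j,\, t - 2 s x_j) = {\mathrm{Y}}_j f(x,y,t), \\ [{\mathrm{X}}_j, {\mathrm{Y}}_j]&= {\mathrm{X}}_j {\mathrm{Y}}_j - {\mathrm{Y}}_j {\mathrm{X}}_j = -4\, \frac{\partial }{\partial t}. \end{aligned}$$

Since right translations commute with left translations, the first two lines exhibit the left invariance, and the commutator supplies the missing direction \(\partial /\partial t\).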

The drift is defined by a nonzero vector \( v = (a, b) \in {\mathbb {R}}^{ n} \times {\mathbb {R}}^{ n}\), and the sub-Laplacian with this drift is

$$\begin{aligned} \Delta _v = \Delta + 2 \sum _{i = 1}^n (a_i \mathrm {X}_i + b_i \mathrm {Y}_i). \end{aligned}$$

Consider the homomorphism from \({\mathbb {H}}_{n}\) to the multiplicative group \({\mathbb {R}}_+\)

$$\begin{aligned} \psi _v(g) = \exp {\left[ \sum _{i = 1}^n \left( a_i x_i + b_i y_i \right) \right] }, \end{aligned}$$

and the measure \(d\mu _v(g) = \psi _v(g)^2 \, dg\).

It is easy to verify that \(\Delta _v\) and \(\mu _v\) satisfy the Green’s formula

$$\begin{aligned} \int f \Delta _{v} w \, d\mu _v = - \int \langle \nabla f, \nabla w \rangle \, d\mu _v = \int \Delta _{v} f \, w \, d\mu _v, \end{aligned}$$

provided that f and w are smooth in \({\mathbb {H}}_{n}\) and that f or w has compact support. Thus \(\Delta _v\) is symmetric and has a negative-definite, self-adjoint extension in \(L^2(d\mu _v)\). Its spectral gap is positive and equals \(|v|^2\), since

$$\begin{aligned} \Delta _v f =\frac{1}{\psi _v} \left( \Delta - |v|^2 \right) (\psi _v f). \; \end{aligned}$$
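
The conjugation identity above can be verified directly; the following lines are a sketch for smooth f, using only the definitions of \(\psi _v\) and \(\Delta _v\). Since \(\psi _v\) does not depend on t, one has \({\mathrm{X}}_i \psi _v = a_i \psi _v\) and \({\mathrm{Y}}_i \psi _v = b_i \psi _v\), hence \(\Delta \psi _v = |v|^2 \psi _v\), and the product rule gives

$$\begin{aligned} \Delta (\psi _v f)&= \psi _v \, \Delta f + 2 \sum _{i = 1}^n \left( {\mathrm{X}}_i \psi _v \, {\mathrm{X}}_i f + {\mathrm{Y}}_i \psi _v \, {\mathrm{Y}}_i f \right) + f \, \Delta \psi _v \\&= \psi _v \left( \Delta f + 2 \sum _{i = 1}^n \left( a_i {\mathrm{X}}_i f + b_i {\mathrm{Y}}_i f \right) + |v|^2 f \right) = \psi _v \left( \Delta _v f + |v|^2 f \right) . \end{aligned}$$

Subtracting \(|v|^2 \psi _v f\) and dividing by \(\psi _v\) yields the stated formula. Because \(f \mapsto \psi _v f\) maps \(L^2(d\mu _v)\) isometrically onto \(L^2(dg)\), the operator \(-\Delta _v\) is unitarily equivalent to \(-\Delta + |v|^2\), which is how the value of the spectral gap is read off.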

The heat semigroup \((e^{h \Delta _v})_{h > 0}\) generated by \(\Delta _v\) is a diffusion semigroup to which the general Littlewood–Paley–Stein theory applies; see [51, Chap. III].

The Riesz transform \({\mathcal {R}}_k = \nabla ^k (-\Delta _v)^{-k/2}\) of any order \(k \in \{1,2,\ldots \}\) can be expressed in terms of the heat semigroup; indeed

$$\begin{aligned} {\mathcal {R}}_k = \frac{1}{\Gamma (k/2)} \int _{0}^{\infty } h^{k/2-1}\,\nabla ^k \,e^{h \Delta _v}\,dh. \end{aligned}$$
(2.2)
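
Formula (2.2) rests on an elementary Gamma function identity, recalled here as a sketch in which \(\lambda \) stands for a point of the spectrum of \(-\Delta _v\) (so \(\lambda \ge |v|^2 > 0\)). The substitution \(s = h\lambda \) gives

$$\begin{aligned} \frac{1}{\Gamma (k/2)} \int _{0}^{\infty } h^{k/2-1}\, e^{-h \lambda }\,dh = \frac{\lambda ^{-k/2}}{\Gamma (k/2)} \int _{0}^{\infty } s^{k/2-1}\, e^{-s}\,ds = \lambda ^{-k/2}. \end{aligned}$$

By the spectral theorem this represents \((-\Delta _v)^{-k/2}\) as an integral of the heat semigroup, and applying \(\nabla ^k\), at least formally, leads to (2.2).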

This operator is bounded on \(L^p(d\mu _v)\) for \(1< p < +\infty \) and any k, as verified in Lohoué and Mustapha [44, Théorème 2(ii)]; see also the remarks about the Heisenberg group in Sect. 4 of the same paper. Observe further that the Green’s formula implies that the first-order Riesz transform \(\nabla (-\Delta _v)^{-1/2}\) is an isometry on \(L^2(d\mu _v)\).
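
The isometry property can be sketched as follows, for f in a suitable dense class and with \(u = (-\Delta _v)^{-1/2} f\); the symbol u is introduced only for this illustration. Green's formula and the self-adjointness of \((-\Delta _v)^{1/2}\) give

$$\begin{aligned} \int |\nabla u|^2 \, d\mu _v = \int (-\Delta _v u)\, u \, d\mu _v = \int (-\Delta _v)^{1/2} f \;\, (-\Delta _v)^{-1/2} f \, d\mu _v = \int f^2 \, d\mu _v. \end{aligned}$$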

Our results deal with the weak type (1, 1) of Riesz transforms and some other operators.

Theorem 1

The first-order Riesz transform \({\mathcal {R}}_1\) is of weak type (1, 1) with respect to \(d\mu _v\).

We do not know whether the same holds for the second-order transform. But we have the following negative result.

Theorem 2

The Riesz transforms \({\mathcal {R}}_k\) of order \(k \ge 3\) are not of weak type (1, 1) for \(d\mu _v\).

The maximal Littlewood–Paley–Stein operators related to \(\Delta _v\) are given by

$$\begin{aligned} {\mathcal {H}}_k f(g) = \sup _{h > 0} \left| h^{{k}/{2}} \, \nabla ^k e^{h \Delta _v} f(g) \right| , \qquad k = 0, 1, \dots . \end{aligned}$$

In particular, \({\mathcal {H}}_0\) is the maximal operator of the semigroup, and Littlewood–Paley–Stein theory implies that \({\mathcal {H}}_0\) is bounded on \(L^p(d\mu _v)\), \(1< p < +\infty \). Actually, \({\mathcal {H}}_k\) is bounded on these \(L^p\) spaces for all \(k = 0,1,\dots \); see for instance Lohoué’s paper [40]. In Sect. 4.3, we give another proof of this result. As for weak type (1, 1), we prove the following result.

Theorem 3

For \(k = 0\) and \(k = 1\), the operator \({\mathcal {H}}_k\) is of weak type (1, 1) with respect to \(d\mu _v\), but not for \(k \ge 2\).

In this paper, we do not include the horizontal Littlewood–Paley–Stein functions, obtained by replacing \(\nabla ^k\) by differentiations with respect to the semigroup parameter h. However, we believe that they can be treated with similar methods, combined with arguments from [34] and [35].

We also mention the (first) Littlewood–Paley–Stein operator

$$\begin{aligned} H_1f(g)&= \left( \int _0^{+\infty } \left| h^{\frac{1}{2}} \, \nabla e^{h \Delta _v} f(g) \right| ^2 \frac{dh}{h} \right) ^{\frac{1}{2}}. \end{aligned}$$

The operator \(H_1\) is bounded on \(L^p(d\mu _v)\) for \(1< p <+\infty \). In the case \(1 < p \le 2\), this follows from results obtained in the setting of a manifold by Coulhon–Duong–Li [19, Theorems 1.2 or 1.3]; their arguments hold also in our case. Moreover, the boundedness for \(2< p < +\infty \) can be seen by adapting the techniques used in [40] or [9] to our setting. The weak type (1, 1) of \(H_1\) can be proved essentially by the method used for our Theorem 1.

The proof of our Theorem 1 and that of the result for \({\mathcal {H}}_1\) in Theorem 3 follow the same lines. The kernels of these operators are computed and estimated. In both cases, the local part of the operator is relatively simple to deal with. After several reductions, the arguments for the global parts boil down to an estimate for a maximal operator defined by taking convolutions with the characteristic functions of certain rectangles. This is Proposition 7, which is the fundamental point of our arguments.

The plan of this paper is as follows. After some preliminaries in Sect. 3, we prove the positive part of Theorem 3 in Sect. 4. The long arguments in Sect.  4.1 include Proposition 7, mentioned above and also used to prove Theorem 1 in Sect. 5. Finally, Sect. 6 contains the counterexamples needed for Theorem 2 and the negative part of Theorem 3.

3 Notation and Auxiliary Results

3.1 Notation

We will often use complex notation for \( {\mathbb {H}}_n = {\mathbb {C}}^n \times {\mathbb {R}} \). Setting \(z_j = x_j + iy_j\) and \(z = (z_1, \dots ,z_n) \), we write points of \({\mathbb {H}}_n\) as \(g = (z,t)\) instead of \((x, y, t)\) whenever convenient. Further, we let \(|z| = \left( \sum _1^n |z_j|^2\right) ^{1/2}\).

From now on, we assume that \(|v| = 1\) and that \(b= 0\) in the expression for v, so that \(v = (a, 0)\) with \(a = (a_1, \dots , a_n) \in {\mathbb {S}}^{n - 1}\). This means no loss of generality, as seen via a dilation and an orthogonal transformation. Thus \(\Delta _v = \Delta + 2\sum _{j = 1}^n a_j {\mathrm{X}}_j\) and \(\psi _v(g) = \exp {(a \cdot x)}\), where \(a \cdot x = \sum _{j = 1}^n a_j \, x_j\). When \(n \ge 2\), the vector x is decomposed into orthogonal components as \(x = (a \cdot x) a + x_{\perp }\).

We denote by \(c>0\) and \(C < \infty \) many different constants which only depend on n and the quantities k and p appearing in the statements of the theorems. By \(A \lesssim B\) and \(A \gtrsim B\) we mean \(A \le CB\) and \(A \ge cB\), respectively, for positive quantities A and B. When both these inequalities hold, we write \(A \sim B\).

3.2 The Carnot–Carathéodory Distance

The Carnot–Carathéodory distance on \({\mathbb {H}}_{n}\) will be denoted by \(d(\cdot , \cdot )\). We write \(d(g) = d(g, o)\) where \(g \in {\mathbb {H}}_n\) and \(o=(0, 0, 0)\) is the origin of \({\mathbb {H}}_{n}\), and observe that \(d(g', g) =d(g^{-1} g')\). Moreover, for \(g \in {\mathbb {H}}_{n}\) and \(r> 0\), we denote by \(B(g, r)\) the ball \(\{g' \in {\mathbb {H}}_n; \, d(g', g) < r \}\).

It is well known that for all \(g = (z,t) \in {\mathbb {H}}_n\)

$$\begin{aligned} |z| \le d(g) \sim |z| + \sqrt{|t|} \sim |a \cdot x| + |x_{\perp }|+|y|+\sqrt{|t|}, \end{aligned}$$
(3.1)

see for example [33, pp. 98–99].

More precisely, it is shown in [12, Theorem 1.36] that d for \(z \ne 0\) is given by

$$\begin{aligned} d(z,t) = |z|\, \frac{\theta }{\sin \theta }, \end{aligned}$$

where \(\theta \in (-\pi , \pi ) \) is determined by

$$\begin{aligned} \mu (\theta ) = t/ {|z|^2} \end{aligned}$$

and \(\mu :(-\pi , \pi ) \rightarrow {\mathbb {R}}\) is the strictly increasing bijection

$$\begin{aligned} \mu (\theta ) = \frac{2\theta -\sin 2\theta }{2\sin ^2\theta }. \end{aligned}$$
(3.2)

From this, we will deduce for \(z \ne 0\) a sharp estimate of the difference

$$\begin{aligned} d(z,t) - |a\cdot x| = d(z,t) - |z| +|z|- |a\cdot x| \end{aligned}$$

and start with

$$\begin{aligned} d(z,t) - |z| = |z|\, \frac{\theta -\sin \theta }{\sin \theta }. \end{aligned}$$
(3.3)

From (3.2) we see that \(\mu (\pm \pi /2) = \pm \pi /2\) and that \(|\mu (\theta )| \sim |\theta |\) for \(|\theta | \le \pi /2\). In the case when \( |t|/ {|z|^2} \le \pi /2\), we thus have \(|\theta | = |\mu ^{-1}(|t|/ {|z|^2})| \le \pi /2\), and (3.3) leads to

$$\begin{aligned} d(z,t) -|z| \sim |z|\, \theta ^2 \sim \frac{ t^2}{|z|^3}. \end{aligned}$$
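
To illustrate the last estimate, one may look at the Taylor expansions at \(\theta = 0\). This is only a sketch of the leading behaviour as \(t/|z|^2 \rightarrow 0\), whereas the two-sided estimate above holds uniformly for \(|\theta | \le \pi /2\):

$$\begin{aligned} \mu (\theta )&= \frac{\frac{4}{3} \theta ^3 + O(\theta ^5)}{2 \theta ^2 + O(\theta ^4)} = \frac{2}{3}\, \theta + O(\theta ^3), \qquad \frac{\theta -\sin \theta }{\sin \theta } = \frac{1}{6}\, \theta ^2 + O(\theta ^4), \\ d(z,t) - |z|&\approx \frac{|z|}{6} \left( \frac{3 t}{2 |z|^2} \right) ^2 = \frac{3}{8}\, \frac{t^2}{|z|^3}. \end{aligned}$$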

In the opposite case \( |t|/ {|z|^2} > \pi /2\), one has instead \(|\theta | > \pi /2\) and \(|\mu (\theta )| \sim 1/ \sin ^2\theta \), and thus \( 1/ |\sin \theta | \sim \sqrt{|t|}/|z|\). Now (3.3) implies

$$\begin{aligned} d(z,t) -|z| \sim \frac{ |z|}{|\sin \theta |} \sim |t|^{1/2}. \end{aligned}$$

In both cases one also has

$$\begin{aligned} |z|- |a\cdot x| = \frac{|z|^2- (a\cdot x)^2}{|z|+|a\cdot x| } \sim \frac{|x_\perp |^2 + |y|^2}{|z| }. \end{aligned}$$
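
The last relation can be spelled out as follows, using \(|a| = 1\) and the orthogonal decomposition \(x = (a \cdot x)\, a + x_{\perp }\) from Sect. 3.1:

$$\begin{aligned} |z|^2 - (a\cdot x)^2&= |x|^2 + |y|^2 - (a\cdot x)^2 = |x_\perp |^2 + |y|^2, \\ |z|&\le |z| + |a\cdot x| \le 2 |z|, \end{aligned}$$

so that the quotient lies between \((|x_\perp |^2 + |y|^2)/(2|z|)\) and \((|x_\perp |^2 + |y|^2)/|z|\).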

All this can be summarized as follows; here the case \(z = 0, \; t\ne 0\) is obtained by continuity.

Lemma 4

For all points \(g = (z,t) \ne o\) in \({\mathbb {H}}_{n}\),

$$\begin{aligned} d(z, t) - |a \cdot x| \sim Q(g), \end{aligned}$$
(3.4)

where

$$\begin{aligned} Q(g) = \frac{((a \cdot x)^2 + |x_{\perp }|^2 + |y|^2)(|x_{\perp }|^2+|y|^2) +t^2}{\left( |a \cdot x| + |x_{\perp }|+|y|+\sqrt{|t|}\right) ^3}. \end{aligned}$$
(3.5)

This allows us to compute the \(\mu _v\)-measure of a ball. Observe that

$$\begin{aligned} \mu _v\left( B(g, r) \right) = \psi _v^{2}(g)\,\mu _v \left( B(o, r) \right) , \qquad \forall g \in {\mathbb {H}}_n, \ r > 0. \end{aligned}$$
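
This identity follows from a change of variables, sketched here with the auxiliary variable \(g'' = g^{-1} g'\) (a notation used only in this remark). Since \(d(g', g) = d(g^{-1} g')\), one has \(B(g, r) = g \, B(o, r)\), and the left invariance of the Haar measure together with the homomorphism property of \(\psi _v\) gives

$$\begin{aligned} \mu _v\left( B(g, r) \right) = \int _{g B(o, r)} \psi _v(g')^2 \, dg' = \int _{B(o, r)} \psi _v(g g'')^2 \, dg'' = \psi _v(g)^2 \int _{B(o, r)} \psi _v(g'')^2 \, dg''. \end{aligned}$$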

Lemma 5

$$\begin{aligned} \mu _v\left( B(o, r) \right) \sim \left\{ \begin{array}{ll} r^{2 n + 2}, &{} 0< r < 1, \\ r^{n + 1} e^{2 r}, &{} r \ge 1. \end{array} \right. \end{aligned}$$
(3.6)

Proof

The case \(r<1\) is clear, since the density of the measure is of constant order of magnitude in B(o, 1). So we assume \(r \ge 1\). By \(m(h),\; 0<h < 2r\), we denote the \(2n\)-dimensional Lebesgue measure of the set

$$\begin{aligned} E_h = \{(z,t) \in B(o,r):\, a\cdot x = r-h\}. \end{aligned}$$

Then

$$\begin{aligned} \mu _v\left( B(o, r) \right) = \int _{0}^{2r}\, m(h)\,e^{2(r-h)}\,dh. \end{aligned}$$
(3.7)

Let \(g = (z,t)\) be a point in \(E_h\) for some \(h \in (0,2r)\). Then (3.4) implies that

$$\begin{aligned} Q(g) \lesssim r - |r-h| \le h. \end{aligned}$$
(3.8)

From (3.5) and (3.8) we get

$$\begin{aligned} \frac{\left( |x_{\perp }|+|y| + \sqrt{|t|}\right) ^4}{\left( r + |x_{\perp }|+|y| +\sqrt{|t|}\right) ^3} \lesssim Q(g) \lesssim r, \end{aligned}$$

which, considering the case where \(|x_{\perp }|+|y| + \sqrt{|t|} \ge r\), leads to

$$\begin{aligned} |x_{\perp }|+|y| + \sqrt{|t|} \lesssim r. \end{aligned}$$
(3.9)

This implies

$$\begin{aligned} m(h) \lesssim r^{2n+1}. \end{aligned}$$

Assume now that \(h<r/2\), in order to get a better estimate of m(h). Then \(|a\cdot x| = |r-h| \ge r/2\), and (3.8), (3.5) and (3.9) imply

$$\begin{aligned} h \gtrsim Q(g) \gtrsim \frac{r^2\,(|x_{\perp }|+|y|)^2 +t^2}{r^3}. \end{aligned}$$

From this we obtain

$$\begin{aligned} |x_{\perp }|+|y| \lesssim \sqrt{rh} \qquad \mathrm {and} \qquad |t| \lesssim r^{3/2}\,h^{1/2}, \end{aligned}$$

and we conclude that

$$\begin{aligned} m(h) \lesssim r^{n+1}\, h^n. \end{aligned}$$
(3.10)

Inserting now our two estimates for m(h) in the integral in (3.7), we get

$$\begin{aligned} \mu _v\left( B(o, r) \right) \lesssim \int _{0}^{r/2}\, r^{n+1}\, h^n\,e^{2(r-h)}\,dh + \int _{r/2}^{2r}\,r^{2n+1} \,e^{2(r-h)}\,dh\lesssim r^{n+1}\,e^{2r}. \end{aligned}$$
(3.11)
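
The two integrals in (3.11) can be bounded explicitly; the following is a sketch for \(r \ge 1\), with constants depending only on n:

$$\begin{aligned} \int _{0}^{r/2} r^{n+1}\, h^n\,e^{2(r-h)}\,dh&\le r^{n+1}\, e^{2r} \int _{0}^{\infty } h^n\, e^{-2h}\,dh = \frac{n!}{2^{n+1}}\; r^{n+1}\, e^{2r}, \\ \int _{r/2}^{2r} r^{2n+1} \,e^{2(r-h)}\,dh&\le r^{2n+1}\, e^{2r} \int _{r/2}^{\infty } e^{-2h}\,dh = \frac{1}{2} \left( r^{n} e^{-r} \right) r^{n+1}\, e^{2r} \le \frac{1}{2} \left( \frac{n}{e} \right) ^n r^{n+1}\, e^{2r}. \end{aligned}$$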

This is the upper estimate for \(r \ge 1\) in the lemma. To get also the lower estimate, we let \(1/4< h < 1/2\) and take \(|x_{\perp }|< c \,\sqrt{r}, \quad |y| < c\, \sqrt{r}\) and \(|t| < c\, r^{3/2} \). If the positive constant c here is small enough, Q(g) will be much smaller than h, and (3.4) will imply that the point g is in \(E_h\). It follows that the estimate (3.10) is sharp for these h. Now (3.7) gives the desired lower estimate, and the lemma is proved. \(\square \)

3.3 Semigroup Kernels

The heat semigroup \((e^{h \Delta })_{h > 0}\) generated by the sub-Laplacian has a convolution kernel \(p_h\), in the sense that

$$\begin{aligned} e^{h \Delta } f(g) = f *p_h(g) = \int _{{\mathbb {H}}_n} f(g') p_h((g')^{-1} g) \, dg' \end{aligned}$$

for suitable functions f. It is well known that \(p_h\) has the form (cf. [24, 30] or [45])

$$\begin{aligned} p_h(z, t) = \frac{1}{2 (4 \pi h)^{n+1}} \int _{{\mathbb {R}}} \exp {\Big (\frac{\lambda }{4 h} ( i t - | z |^2 \coth {\lambda }) \Big )} \Big ( \frac{\lambda }{\sinh {\lambda }} \Big )^n \, d\lambda . \end{aligned}$$
(3.12)

We note the homogeneity property of \(p_h\)

$$\begin{aligned} p_h(z, t) = h^{-n-1} p_1\left( \frac{z}{\sqrt{h}},\frac{t}{h}\right) , \qquad h > 0, \quad (z, t)\in {\mathbb {H}}_{n}. \end{aligned}$$
(3.13)
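
The homogeneity (3.13) can be read off from (3.12) without any change of variables. Evaluating (3.12) with \(h = 1\) at the point \((z/\sqrt{h},\, t/h)\) gives

$$\begin{aligned} p_1\left( \frac{z}{\sqrt{h}},\frac{t}{h}\right) = \frac{1}{2 (4 \pi )^{n+1}} \int _{{\mathbb {R}}} \exp {\Big (\frac{\lambda }{4} \Big ( \frac{i t}{h} - \frac{| z |^2}{h} \coth {\lambda } \Big ) \Big )} \Big ( \frac{\lambda }{\sinh {\lambda }} \Big )^n \, d\lambda = h^{n+1}\, p_h(z, t). \end{aligned}$$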

The following sharp global estimate for \(p_h\), proved in [32, Théorème 1], will play an important role:

$$\begin{aligned} p_h(z, t) \sim h^{-n - 1} \left( 1 + \frac{|z|\, d(z, t)}{h} \right) ^{-{1}/{2}}\, \left[ \frac{h + d(z, t)^2}{h + |z| \, d(z, t)} \right] ^{n - 1}\, e^{-\frac{d(z, t)^2}{4 h}} , \end{aligned}$$
(3.14)

for all \(h > 0\) and \((z, t) \in {\mathbb {H}}_{n}\).

We will also need sharp upper estimates for horizontal derivatives of \(p_h\), see [32, Théorème 2],

$$\begin{aligned} |\nabla ^k p_h(g)| \lesssim h^{-{k}/{2}} \left( 1 + \frac{d(g)}{\sqrt{h}} \right) ^k p_h(g), \qquad h > 0, \;\; g \in {\mathbb {H}}_n, \end{aligned}$$
(3.15)

for \(k = 1, 2, \dots \).

Next, we consider the sub-Laplacian with drift. The corresponding semigroup \((e^{h \Delta _v})_{h > 0}\) has an integral kernel \(p_h^{(v)}(g,g')\), in the sense that

$$\begin{aligned} e^{h \Delta _v}f(g) = \int p_h^{(v)}(g,g') f(g')\,d\mu _v(g'), \qquad h > 0, \;\; g \in {\mathbb {H}}_n, \end{aligned}$$

for suitable functions f. It is explicitly given by (cf. [3, p. 4])

$$\begin{aligned} p_h^{(v)}(g, g') = e^{- h} \; \frac{1}{\psi _v(g) \psi _v(g')} \,\; p_h((g')^{-1} g) = e^{-h-a\cdot x -a\cdot x'} \; p_h((g')^{-1} g), \end{aligned}$$
(3.16)

for all \(h > 0, \;\; g \in {\mathbb {H}}_n,\;\; g' = (x',y',t') \in {\mathbb {H}}_n\).
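
Formula (3.16) can be understood from the conjugation relation \(\Delta _v f = \psi _v^{-1} (\Delta - |v|^2)(\psi _v f)\) of Sect. 2, with the normalization \(|v| = 1\). The following is a formal sketch that ignores domain questions:

$$\begin{aligned} e^{h \Delta _v}f(g)&= e^{-h}\, \frac{1}{\psi _v(g)}\; e^{h \Delta }(\psi _v f)(g) = e^{-h}\, \frac{1}{\psi _v(g)} \int \psi _v(g')\, f(g')\, p_h((g')^{-1} g) \, dg' \\&= \int e^{- h} \; \frac{1}{\psi _v(g) \psi _v(g')} \; p_h((g')^{-1} g)\; f(g')\, \psi _v(g')^2 \, dg', \end{aligned}$$

and \(\psi _v(g')^2 \, dg' = d\mu _v(g')\).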

4 Proof of Theorem 3

The proof of the negative result for weak type (1, 1) and \(k \ge 2\) is deferred to Sect. 6.

4.1 Weak Type (1, 1) of \({\mathcal {H}}_1\)

Given \(\phi \in L^1(d\mu _v)\), we must prove that \({\mathcal {H}}_1 \phi \) is in \(L^{1,\infty }(d\mu _v)\). We prefer to work with \(f(g) =\phi (g)\, e^{2 \, a \cdot x}\), which satisfies

$$\begin{aligned} \Vert f \Vert _{L^1(dg)} = \Vert \phi \Vert _{L^1(d\mu _v)}, \end{aligned}$$

and we can assume that these functions are nonnegative.

To write \({\mathcal {H}}_1 \,\phi \) in terms of a convolution involving f, we first see from (3.16) that the kernel of \(\nabla \, e^{h \Delta _v}\) is \(\nabla _g \, p_ h^{(v)}(g, g')\). It satisfies

$$\begin{aligned} \left| \nabla _g \, p_ h^{(v)}(g, g')\right|&= \left| \left( \nabla _g \, e^{-h-a\cdot x -a\cdot x'}\right) \,p_h((g')^{-1} g) + e^{-h-a\cdot x -a\cdot x'} \,\nabla _g \, p_h((g')^{-1} g)\right| \nonumber \\&\lesssim e^{-h-2a\cdot x}\,e^{a\cdot (x - x')} \,\left( p_ h((g')^{-1} g) + |\nabla _g \, p_ h((g')^{-1} g)| \right) . \end{aligned}$$

Observe that the factor \(e^{a\cdot (x - x')}\) is a function of \((g')^{-1} g\). We now use (3.14) and (3.15) to estimate \(p_ h\) and \(|\nabla \, p_ h|\) here. If we define \(K_h(g)\) by

$$\begin{aligned} h^{-1/2}\, K_h(g)&= e^{-h}\,e^{a \cdot x}\, h^{-n - 1} \,\left( 1 + h^{-\frac{1}{2}} \left( 1+\frac{d(g)}{\sqrt{h}}\right) \,\right) \, \left( 1+\frac{|z|d(g)}{h}\right) ^{-1/2}\,\nonumber \\&\quad \times \left[ \frac{h + d(g)^2}{h + |z| d(g)} \right] ^{n - 1}\, e^{-\frac{d(g)^2}{4h}}, \end{aligned}$$
(4.1)

the result will be

$$\begin{aligned} \left| \nabla _g \, p_ h^{(v)}(g, g')\right| \lesssim e^{- 2 \, a \cdot x}\, h^{-1/2}\,K_h((g')^{-1}g). \end{aligned}$$
(4.2)

Integrating (4.2) against \(h^{1/2}\,\phi (g')\,d\mu _v(g') =h^{1/2}\, f(g')\,dg'\), we get

$$\begin{aligned} |h^{1/2}\, \nabla \, e^{h \Delta _v}\phi (g)| \lesssim e^{- 2 \, a \cdot x} \int f(g') K_h( (g')^{-1}g)\,dg' = e^{- 2 \, a \cdot x}\, f*K_h(g). \end{aligned}$$
(4.3)

We begin with the local part of \({\mathcal {H}}_1\). Thus we replace \(K_h(g)\) in (4.3) by \(K_h^{\mathrm{loc}}(g) = K_h(g)\, \chi _{\{d(g) \le 1 \}}\), and consider

$$\begin{aligned} \sup _{h>0} e^{- 2 \, a \cdot x}\, f*K_h^{\mathrm{loc}}(g). \end{aligned}$$

To estimate the right-hand side of (4.1) for \(d(g) \le 1\), we replace 4 by 8 in the last exponent. This allows us to eliminate the powers of \({d(g)^2}/{h}\) in the factors preceding the exponentials, and one finds that

$$\begin{aligned} K_h^{\mathrm{loc}}(g) \lesssim h^{-n-1}\, e^{-\frac{d(g)^2}{8 h}} \chi _{\{d(g) \le 1 \}}. \end{aligned}$$

From (4.2) it follows that the local part of \({\mathcal {H}}_1\) can be estimated in terms of the analogue of the Euclidean local Gaussian maximal operator; notice that the local homogeneous dimension of our space is \(2 n + 2\). Since the measure \(\mu _v\) is locally doubling, this implies the weak type (1, 1) of the local part of \({\mathcal {H}}_1\).

It remains to deal with the global part of \({\mathcal {H}}_1\), with kernel

$$\begin{aligned} K_h^{\mathrm{glob}}(g) = K_h(g)\, \chi _{\{d(g) > 1 \}}. \end{aligned}$$

We must estimate

$$\begin{aligned} \sup _{h>0}\, e^{- 2 \, a \cdot x}\, f*K_h^{\mathrm{glob}}(g). \end{aligned}$$
(4.4)

Thus we assume that \(d(g) > 1\) and observe that \(d(g) \gtrsim 1 + |z|\), since also \(d(g) \ge |z|\) because of (3.1).

To bound the right-hand side of (4.1), we first use the fact that

$$\begin{aligned} - h -\frac{d(g)^2}{4h} = - d(g) - \frac{(d(g)-2h)^2}{4h} \end{aligned}$$
(4.5)

to rewrite the exponentials. Hence,

$$\begin{aligned} K_h^{\mathrm{glob}}(g)&\lesssim h^{-n - {1}/{2}} \, \left( 1 + \frac{1}{\sqrt{h}} + \frac{d(g)}{h}\right) \, \left( 1+\frac{|z|d(g)}{{h}}\right) ^{-1/2}\, \nonumber \\&\quad \times \left[ 1+ \frac{ d(g)^2}{h + |z| d(g)} \right] ^{n - 1}\, \exp \left( a \cdot x - d(g) - \frac{(d(g)-2h)^2}{4h}\right) . \end{aligned}$$
(4.6)
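
The identity (4.5) used here is obtained by expanding the square:

$$\begin{aligned} - d(g) - \frac{(d(g)-2h)^2}{4h} = - d(g) - \frac{d(g)^2 - 4 h \, d(g) + 4 h^2}{4h} = - d(g) - \frac{d(g)^2}{4h} + d(g) - h = - h -\frac{d(g)^2}{4h}. \end{aligned}$$

The point of this rewriting is that the term \(-d(g)\) can be paired with \(a \cdot x \le d(g)\), while the remaining square is small only when h is close to \(d(g)/2\).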

We shall estimate this product in a way that depends on the relative size of h and d(g).

If \(h< d(g) < 4h, \; d(g) > 1\), the product of the factors preceding the exponential in (4.6) is controlled by

$$\begin{aligned} d(g)^{-n - \frac{1}{2}}\,(1+|z|)^{- \frac{1}{2}} \, \left( \frac{d(g)}{1+|z|}\right) ^{n - 1} \lesssim d(g)^{- \frac{3}{2}}\,(1+|z|)^{-n + \frac{1}{2}} \lesssim (1+|z|)^{-n-1 }. \end{aligned}$$
(4.7)

It follows that

$$\begin{aligned} K_h^{\mathrm{glob}}(g) \lesssim (1+|z|)^{-n-1 }\, \exp \left( a \cdot x - d(g)\right) . \end{aligned}$$
(4.8)
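
The bookkeeping behind (4.7) can be sketched as follows, in the regime \(h< d(g) < 4h\), \(d(g) > 1\), where \(h > 1/4\) and \(h \sim d(g) \gtrsim 1 + |z|\). Factor by factor in (4.6),

$$\begin{aligned} h^{-n - {1}/{2}}&\sim d(g)^{-n - \frac{1}{2}}, \qquad 1 + \frac{1}{\sqrt{h}} + \frac{d(g)}{h} \sim 1, \qquad \left( 1+\frac{|z|d(g)}{{h}}\right) ^{-1/2} \sim (1+|z|)^{- \frac{1}{2}}, \\ 1+ \frac{ d(g)^2}{h + |z| d(g)}&\sim 1 + \frac{d(g)}{1+|z|} \sim \frac{d(g)}{1+|z|}, \end{aligned}$$

and the exponents combine as \(-n - \frac{1}{2} + (n-1) = -\frac{3}{2}\) for \(d(g)\) and \(-\frac{1}{2} - (n-1) = -n + \frac{1}{2}\) for \(1 + |z|\). The final step in (4.7) uses \(d(g)^{-3/2} \lesssim (1+|z|)^{-3/2}\).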

If instead \(1 < d(g) \le h\), the factors preceding the exponential in (4.6) have a product controlled by \(h^{-n-1/2}\,d(g)^{n-1} \lesssim 1\), and

$$\begin{aligned} \frac{(d(g)-2h)^2}{4 h} \ge \frac{h^2}{4 h} \ge \frac{1}{4}\, \max (h,d(g)) \sim h+d(g). \end{aligned}$$

We conclude that then

$$\begin{aligned} K_h^{\mathrm{glob}}(g) \lesssim \exp \left( a \cdot x - d(g) - c \, h - c \, d(g)\right) \le \exp ( - c \, d(g)), \end{aligned}$$
(4.9)

the last inequality in view of (3.1).

It remains to consider the case \(d(g) \ge 4h, \;d(g) > 1\). Then

$$\begin{aligned} \frac{(d(g) - 2 h)^2}{4 h} \ge \frac{d(g)^2}{16 h} \gtrsim \max (d(g), h^{-1}) \sim d(g)+ h^{-1}. \end{aligned}$$
(4.10)
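
In more detail, the steps in (4.10) can be checked as follows, under the standing assumptions \(d(g) \ge 4h\) and \(d(g) > 1\):

$$\begin{aligned} d(g) - 2h \ge \frac{d(g)}{2}, \qquad \frac{d(g)^2}{16 h} = \frac{d(g)}{16}\cdot \frac{d(g)}{h} \ge \frac{d(g)}{4}, \qquad \frac{d(g)^2}{16 h} > \frac{1}{16 h}. \end{aligned}$$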

One also has \(\sqrt{h} \lesssim d(g)\) since \(d(g) > 1\), so that \(1/\sqrt{h} \lesssim d(g)/h\). This allows us to estimate the non-exponential factors in (4.6) by a constant times

$$\begin{aligned} h^{-n - {1}/{2}}\, \frac{d(g)}{h} \,\left( \frac{d(g)^2}{h}\right) ^{n - 1} \le h^{-C}\,d(g)^C \end{aligned}$$

for some C. But these powers can be absorbed by the factor \(\exp \left( -c \, d(g) - c \, h^{-1}\right) \) coming from (4.10). We conclude that in this case

$$\begin{aligned} K_h^{\mathrm{glob}}(g) \lesssim \exp \left( a \cdot x - d(g) - c \, d(g)- c \, h^{-1}\right) \le \exp \left( -c \, d(g)\right) . \end{aligned}$$
(4.11)

The following simple lemma will allow us to restrict the kernel \(K_h^{\mathrm{glob}}\) to a smaller set depending on h.

Lemma 6

If \(L\in L^1(dg)\), the operator S defined by

$$\begin{aligned} Sf(g) = e^{- 2 \, a \cdot x} f*L(g) \end{aligned}$$

is bounded from \(L^1(dg)\) into \(L^1(d\mu _v)\).

Proof

It is enough to integrate Sf(g) with respect to \(d\mu _v(g)\) and swap the order of integration. \(\square \)
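
Written out, the computation is the following. Since \(d\mu _v(g) = e^{2 \, a \cdot x}\, dg\), the exponential factors cancel, and Fubini's theorem together with the invariance of dg under left translations gives

$$\begin{aligned} \int |Sf(g)| \, d\mu _v(g) = \int |f*L(g)| \, dg \le \int |f(g')| \int |L((g')^{-1} g)| \, dg \, dg' = \Vert f \Vert _{L^1(dg)}\, \Vert L \Vert _{L^1(dg)}. \end{aligned}$$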

Applying this lemma with \(L(g)= \exp ( -c \, d(g))\), the estimate (4.2) together with (4.9) and (4.11) allows us to conclude that the operator obtained by multiplying \(K_h^{\mathrm{glob}}\) in (4.4) by the characteristic function of the set

$$\begin{aligned} \{g:\,1 < d(g) \notin (h, 4h)\} \end{aligned}$$

is of strong type (1,1). In the remaining set \(\{g:\,\max (h,1)< d(g) < 4h \}\), (4.8) yields

$$\begin{aligned} K_h^{\mathrm{glob}}(g) \lesssim (1+|z|)^{-n-1 }\,\exp \left( a \cdot x - d(g)\right) =: \widetilde{K}^{\mathrm{glob}}(g). \end{aligned}$$

Now if \(a \cdot x \le 1\) then \( \widetilde{K}^{\mathrm{glob}}(g) \lesssim e^{-d(g)}\), and Lemma 6 implies that the corresponding part of the operator is of strong type (1, 1).

Further, if \(|x_{\perp }| + |y|+\sqrt{|t|} \ge a \cdot x >1\), then (3.5) and (3.1) imply \(Q(g) \gtrsim d(g) \). From (3.4), we see that \( \widetilde{K}^{\mathrm{glob}}(g)\) is then dominated by \(e^{-c \, Q(g)} \lesssim e^{-c \, d(g)}\) for some constants c. Again, Lemma 6 shows that the corresponding part of the operator is of strong type (1, 1).

What we need to consider is thus

$$\begin{aligned} \sup _{h>0} e^{- 2 \, a \cdot x} f* \left( \widetilde{K}^{\mathrm{glob}}\,\chi _{E^h}\right) (g), \end{aligned}$$

where

$$\begin{aligned} E^h = \{g: \max (h,1)< d(g)< 4h,\;\;a \cdot x > 1, \;\;|x_{\perp }| + |y|+\sqrt{|t|}\, < a \cdot x \}. \end{aligned}$$

If \(g \in {E}^h\) for some \(h >0\), we have \(h > 1/4\) and \(h \sim d(g) \sim a \cdot x \sim 1+|z| \). This and (3.4) imply that \(\widetilde{K}^{\mathrm{glob}}(g) \lesssim (a \cdot x)^{-n - 1}\, e^{- c \, Q(g)}\) in \({E}^h\).

Defining for \(k=1,2,\dots \)

$$\begin{aligned} E_k = \left\{ g:\,2^{k-1}< a \cdot x \le 2^k,\; |x_{\perp }| + |y| +\sqrt{|t|} \le 2^k\right\} , \end{aligned}$$

we get for each \(h > 1/4\) and some C

$$\begin{aligned} {E}^h \subset \bigcup _{\begin{array}{c} h/C<2^k<Ch \\ {k\ge 1} \end{array}} E_k. \end{aligned}$$

This leads to an estimate for our remaining operator saying that

$$\begin{aligned} \sup _{h>0}\, e^{- 2 \, a \cdot x}\, f*(\widetilde{K}^{\mathrm{glob}} \, \chi _{E^h})(g) \lesssim \sup _{k\ge 1} \,e^{- 2 \, a \cdot x}\, f* M_k(g), \end{aligned}$$
(4.12)

where

$$\begin{aligned} M_k(g) = 2^{-(n + 1) k}\, e^{- c \, Q(g)}\,\chi _{E_k}(g). \end{aligned}$$
(4.13)

Let \(g = (x,y,t) \in E_k\). Then \(\sqrt{|t|} \le 2^{k}\) and so

$$\begin{aligned} |x_{\perp }| + |y| + 2^{-k} |t| \le |x_{\perp }| + |y| + \sqrt{|t|} \le 2^k. \end{aligned}$$

Thus there exists an \(m \in \{1, \dots , k\}\) for which

$$\begin{aligned} 2^{(m - 1)/2} \, 2^{k/2}&< |x_{\perp }| + |y| + 2^{-k} |t| \le 2^{m/2} \, 2^{k/2} \qquad \mathrm {if} \quad m \ge 2, \\&\qquad |x_{\perp }| + |y| + 2^{-k} |t| \le 2^{m/2} \, 2^{k/2} \qquad \quad \, \mathrm {if} \quad m = 1. \end{aligned}$$

For \(m > 1\), we then have

$$\begin{aligned} |x_{\perp }| + |y|> 2^{(m - 1)/2 - 1} \, 2^{k/2} \qquad \text{ or } \qquad 2^{-k} |t| > 2^{(m - 1)/2 - 1} \, 2^{k/2}. \end{aligned}$$

In both cases, one sees from (3.5) that \(Q(g) \gtrsim 2^m\). Thus by (4.13)

$$\begin{aligned} M_k(g) \lesssim e^{-c \, 2^m}\, 2^{-(n + 1) k}, \end{aligned}$$

and this inequality holds trivially also if \(m = 1\). For any \(g \in E_k\), this allows us to conclude that

$$\begin{aligned} M_k(g) \lesssim \sum _{m=1}^{k} e^{-c \, 2^m}\, 2^{-(n + 1) k} \,\chi _{E_{k,m}}(g), \end{aligned}$$

where

$$\begin{aligned}&E_{k,m} =\\&\quad \{g: \,2^{k-1}< a \cdot x \le 2^k,\,\,\, |x_{\perp }| \le 2^{m/2}\,2^{k/2}, \,\,\, |y| \le 2^{m/2}\,2^{k/2}, \,\,\, |t| \le 2^{m/2}\,2^{3k/2} \}. \end{aligned}$$
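
The lower bound for \(Q(g)\) in terms of \(2^m\), used above for \(m > 1\), can be traced through (3.5) as follows. The explicit powers of 2 are only indicative and are not optimized. For \(g \in E_k\) the denominator in (3.5) is at most \((2^k + 2^k)^3 = 2^{3k+3}\), and \((a \cdot x)^2 > 2^{2k-2}\). In the first alternative, \(|x_{\perp }|^2 + |y|^2 \ge \frac{1}{2} (|x_{\perp }| + |y|)^2 > 2^{m+k-4}\); in the second, \(t^2 > 2^{m-3}\, 2^{3k}\). Keeping only the corresponding term of the numerator, we get

$$\begin{aligned} Q(g) \ge \frac{(a \cdot x)^2 \,(|x_{\perp }|^2+|y|^2)}{2^{3k+3}}> \frac{2^{2k-2}\, 2^{m+k-4}}{2^{3k+3}} = 2^{m-9} \qquad \text{ or } \qquad Q(g) \ge \frac{t^2}{2^{3k+3}} > \frac{2^{m-3}\, 2^{3k}}{2^{3k+3}} = 2^{m-6}. \end{aligned}$$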

We now define operators

$$\begin{aligned} T_mf(g) = \sup _{k\ge m} e^{- 2 \, a \cdot x}\, 2^{-(n + 1) k} \, f*\chi _{E_{k,m}}(g), \qquad m = 1, 2, \dots . \end{aligned}$$

For the right-hand side of (4.12), we then have

$$\begin{aligned} \sup _{k\ge 1} e^{- 2 \, a \cdot x} f* M_k(g) \lesssim \sum _{m=1}^\infty e^{-c \, 2^m} \, T_m f(g). \end{aligned}$$
(4.14)

The following proposition will allow summation in m in the space \(L^{1,\infty }(d\mu _v)\), and make the proof of the weak type (1, 1) of \({\mathcal {H}}_1\) complete.

Proposition 7

For each \(m \in \{1, 2, \dots \}\), the operator \(T_m\) is bounded from \(L^1(dg)\) into \(L^{1,\infty }(d\mu _v)\) with quasinorm no larger than \(C\,2^{Cm}\) for some constant C.

Proof

Fixing m, we let \(0\le f \in L^1(dg)\) and take \(\lambda >0\). Choosing a large \(\Omega >0\), we consider the level set

$$\begin{aligned} L^{m}_\lambda = \{g:\, T_m f(g) \ge \lambda ,\; a \cdot x \ge -\Omega \}. \end{aligned}$$

To prove the proposition, we shall verify that

$$\begin{aligned} \mu _v(L^{m}_\lambda ) \le C \, \frac{2^{C m}}{\lambda } \int f(g') \,dg', \end{aligned}$$

with constants C independent of m, f, \(\lambda \) and \(\Omega \).

Now if \(g \in L^{m}_\lambda \), we have for some \(k\in \{m,\, m + 1, \dots \}\)

$$\begin{aligned} 2^{(n + 1) k} \le \frac{ e^{-2 \, a \cdot x}}{\lambda } \int _{g E_{k,m}^{-1}} f\,dg' \le \frac{e^{2\Omega }}{\lambda } \int f\,dg', \end{aligned}$$

and this implies an upper bound for k, say \(k\le \kappa =\kappa (f,\lambda ,\Omega ) \). Then

$$\begin{aligned} L^{m}_\lambda = \left\{ g: a \cdot x \ge - \Omega , \ \max _{m \le k \le \kappa } e^{-2 \, a \cdot x} \, 2^{-(n + 1) k} f*\chi _{E_{k,m}}( g) \ge \lambda \right\} , \end{aligned}$$

which is a closed set since each \(f*\chi _{E_{k,m}}\) is a continuous function. For each \(g \in L^{m}_\lambda \), we let \(k(g)\in \{m, m+1, \dots , \kappa \}\) be the maximal value of k for which \(e^{-2 \, a \cdot x}\, 2^{-(n + 1) k}f*\chi _{E_{k,m}}( g) \ge \lambda \).

We verify that the set \( L^{m}_\lambda \) is bounded. For \(g \in L^{m}_\lambda \) we have

$$\begin{aligned} \int _{g{\mathcal {E}}^{-1}} f\,dg' \ge e^{-2 \Omega }\lambda , \end{aligned}$$
(4.15)

where \( {\mathcal {E}}\) is the compact set

$$\begin{aligned} {\mathcal {E}} = \bigcup _{k=m}^\kappa \overline{E_{k,m}}\,. \end{aligned}$$

Since f is integrable, the integral in (4.15) tends to 0 as \(d(g)\rightarrow \infty \). Thus \(L^{m}_\lambda \) is bounded and hence a compact set.

We will construct by recursion a sequence of points

$$\begin{aligned} g_j = \big (x^{(j)}, y^{(j)}, t^{(j)}\big )\in L^{m}_\lambda , \qquad j=1,2,\dots , \end{aligned}$$

which will turn out to be finite. With each \(g_j\) we will associate an open set \(g_jP_j^{-1}\), called a forbidden region; here

$$\begin{aligned} P_j = \{g:\,&a \cdot x > -1,\;\;\; |x_{\perp }|< 2^{m/2}\,2^{2 + k(g_j)/2}, \\&|y|< 2^{m/2}\, 2^{2+k(g_j)/2},\;\;\; |t|< 2^{m/2}\, 2^{4+3k(g_j)/2} \}. \end{aligned}$$

Together, these regions will be seen to cover the level set \(L^{m}_\lambda \).

Assume \(g_i\) defined for \(1\le i < j\), where \(j\in \{1,2,\dots \}\). Our idea is to choose \(g_j\) as a point in \(L^{m}_\lambda \) but not in any region forbidden by the already selected points \(g_i\). Further, it should maximize \(k(g_j)\) and, secondly, maximize the quantity \(a \cdot x^{(j)}\).

More precisely, let

$$\begin{aligned} k_j = \max \, \left\{ k(g) \in \{m, m+1,\dots ,\kappa \}:\, g\in L^{m}_\lambda \setminus \bigcup _{1\le i < j} g_iP_i^{-1} \right\} , \end{aligned}$$

provided the set \(L^{m}_\lambda \setminus \bigcup _{1\le i < j} g_iP_i^{-1}\) is nonempty; otherwise the recursion ends. We choose \(g_j\) as a point in the compact set

$$\begin{aligned} {\mathcal {A}}_j = \left\{ g\in L^{m}_\lambda \setminus \bigcup _{1\le i < j} g_iP_i^{-1}:\,k(g)= k_j\right\} \end{aligned}$$
(4.16)

such that \(a \cdot x\) is maximal among the points of this set. To verify that \({\mathcal {A}}_j\) is closed and thus compact, assume that \(\widetilde{g_{\ell }} \longrightarrow \widetilde{g}\) as \(\ell \longrightarrow +\infty \) and that \(\widetilde{g_{\ell }} \in {\mathcal {A}}_j\). Then

$$\begin{aligned} e^{-2 \, a \cdot x} \, 2^{-(n + 1) k_j} f *\chi _{E_{k_j, m}}(\widetilde{g_{\ell }}) \ge \lambda \end{aligned}$$

for all \(\ell \), and by continuity the same inequality holds at \(\widetilde{g}\). This means that \(k(\widetilde{g}) \ge k_j\); thus \(k(\widetilde{g}) = k_j\) and \(\widetilde{g} \in {\mathcal {A}}_j\). (Here we actually verified that the function \(g \longmapsto k(g)\) is upper semicontinuous.)

Having thus defined the sequence \((g_j)\), we observe that \(1\le i < j\) implies \(k_j \le k_i\), and that if here \(k_j = k_i\) then \(a \cdot x^{(j)} \le a \cdot x^{(i)}\). We will verify the following three claims:

$$\begin{aligned}&L^{m}_\lambda \subset \bigcup _{j\ge 1} g_jP_j^{-1}, \end{aligned}$$
(4.17)
$$\begin{aligned}&\mu _v(g_jP_j^{-1}) \lesssim 2^{n m} \, e^{2 \, a \cdot x^{(j)}} \, 2^{(n + 1) k_j} \end{aligned}$$
(4.18)

and

$$\begin{aligned} \text {the sets } \quad g_j E_{k_j, m}^{-1},\,\, j= 1,2,\dots , \text { are pairwise disjoint.} \end{aligned}$$
(4.19)

This would imply Proposition 7, since we would get

$$\begin{aligned} \mu _v(L^{m}_\lambda )&\le \sum _j \mu _v(g_jP_j^{-1})\lesssim 2^{n m} \, \sum _j e^{2 \, a \cdot x^{(j)}} \, 2^{(n + 1) k_j} \\&\le 2^{n m}\sum _j \frac{1}{\lambda }\int _{g_j E_{k_j, m}^{-1}} f(g)\,dg \le 2^{n m}\, \frac{1}{\lambda }\int f(g)\,dg. \end{aligned}$$

In the third step here, we used the fact that \(g_j\in L^{m}_\lambda \).

To verify (4.18), notice that

$$\begin{aligned} \mu _v(g_jP_j^{-1})&= e^{2 \, a \cdot x^{(j)}} \int _{P_j}e^{-2 \, a \cdot x} \,dx dydt \\&= e^{2 \, a \cdot x^{(j)}} \int _{\begin{array}{c} |x_{\perp }|< 2^{m/2}\,2^{2 + k_j/2} \\ a \cdot x > -1 \end{array} } \, e^{-2 \, a \cdot x} \, \int _{\begin{array}{c} |y|< 2^{m/2} \, 2^{2+k_j/2} \\ |t|< 2^{m/2} \, 2^{4+3k_j/2} \end{array}}\, dydt\,dx \\&\lesssim e^{2 \, a \cdot x^{(j)}}\, 2^{n m}\, 2^{(n + 1) k_j}. \end{aligned}$$
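As a check on the exponents in (4.18), one can count powers of 2 in the last integral. This is only a bookkeeping sketch: it uses that a is a unit vector, so that \(dx = d(a \cdot x)\, dx_{\perp }\) with \(x_{\perp }\) ranging over an \((n-1)\)-dimensional space, and it ignores dimensional constants such as volumes of unit balls. The integral in the a direction contributes a constant, and the remaining \(2n\) variables contribute the product of the side lengths:

$$\begin{aligned} \int _{-1}^{\infty } e^{-2s}\, ds&= \frac{e^{2}}{2}, \\ \left( 2^{m/2}\, 2^{2 + k_j/2} \right) ^{(n-1) + n} \cdot 2^{m/2}\, 2^{4 + 3k_j/2}&= 2^{4n + 2}\; 2^{n m}\; 2^{(n + 1) k_j}. \end{aligned}$$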

Aiming at (4.19), we argue by contradiction and assume that \(g_i E_{k_i, m}^{-1}\) and \(g_j E_{k_j, m}^{-1}\) have a common point for some \(1\le i < j\). Then there exist points

$$\begin{aligned} \tilde{g}^{(i)} = \big (\tilde{x}^{(i)}, \tilde{y}^{(i)}, \tilde{t}^{(i)}\big ) \in E_{k_i, m}\qquad \text { and } \qquad \tilde{g}^{(j)} = \big (\tilde{x}^{(j)}, \tilde{y}^{(j)}, \tilde{t}^{(j)}\big ) \in E_{k_j, m} \end{aligned}$$

such that \(g_i\big (\tilde{g}^{(i)}\big )^{-1} =g_j\big (\tilde{g}^{(j)}\big )^{-1} \) or equivalently \(g_j = g_i \widehat{g}^{-1}\), where \(\widehat{g} =\big (\tilde{g}^{(j)}\big )^{-1} \tilde{g}^{(i)}\).

To get the contradiction, it is enough to verify that the point \(\widehat{g} = (\widehat{x}, \widehat{y}, \widehat{t})\) is in \(P_i\), since \(g_j\) cannot be in the forbidden region \(g_iP_i^{-1}\).

For the components in the a direction of these points, we have

$$\begin{aligned} a \cdot \widehat{x} = a \cdot \tilde{x}^{(i)} - a \cdot \tilde{x}^{(j)}. \end{aligned}$$

Since \(2^{k_i-1}< a \cdot \tilde{x}^{(i)} \le 2^{k_i}\) and \(2^{k_j-1} < a \cdot \tilde{x}^{(j)} \le 2^{k_j}\), this leads to \(a \cdot \widehat{x} > 2^{k_i-1}- 2^{k_j}\). Here \(k_j \le k_i\), and if this last inequality is strict, we conclude that \(a \cdot \widehat{x} > 0\). But if \(k_j = k_i\), then \(a \cdot x^{(i)} \ge a \cdot x^{(j)}\) because of the recursive construction. Since \(g_j = g_i \widehat{g}^{-1}\) gives \(a \cdot \widehat{x} = a \cdot x^{(i)} - a \cdot x^{(j)}\), we get \(a \cdot \widehat{x} \ge 0\) also in this case.

For the components orthogonal to a, we get (when \(n \ge 2\))

$$\begin{aligned} |\widehat{x}_{\perp }| = |\tilde{x}^{(i)}_{\perp } -\tilde{x}^{(j)}_{\perp }| \le |\tilde{x}^{(i)}_{\perp }| +|\tilde{x}^{(j)}_{\perp }|\le 2^{m/2} \, 2^{k_i/2} +2^{m/2} \, 2^{k_j/2} \le 2\cdot 2^{m/2}\, 2^{k_i/2}. \end{aligned}$$

In the same way, \( |\widehat{y}| \le 2\cdot 2^{m/2}\, 2^{k_i/2}\). For the t coordinates, we have

$$\begin{aligned} \widehat{t} = \tilde{t}^{(i)} - \tilde{t}^{(j)} + 2 \tilde{x}^{(j)} \cdot \tilde{y}^{(i)} - 2 \tilde{x}^{(i)} \cdot \tilde{y}^{(j)}. \end{aligned}$$

Since

$$\begin{aligned} |\tilde{x}^{(j)}| \le | a \cdot \tilde{x}^{(j)} | + |\tilde{x}^{(j)}_{\perp }| \le 2^{k_j} + 2^{m/2}\, 2^{k_j/2} \le 2\cdot 2^{k_j}, \end{aligned}$$

and similarly for \(\tilde{x}^{(i)}\), this implies

$$\begin{aligned} |\widehat{t}| \le 2^{m/2}\, 2^{3k_i/2} + 2^{m/2}\, 2^{3k_j/2} + 4\cdot 2^{k_j}\, 2^{m/2}\, 2^{k_i/2} + 4\cdot 2^{k_i} \,2^{m/2}\, 2^{k_j/2}\le 10\cdot 2^{m/2} \, 2^{3k_i/2}. \end{aligned}$$

It follows that \(\widehat{g} \in P_i\), and (4.19) is proved.
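For the record, the comparison with the definition of \(P_i\) can be laid out as follows. This only restates the bounds just obtained; the constant 10 arises as \(1 + 1 + 4 + 4\), after using \(k_j \le k_i\) in each of the four terms of the estimate for \(|\widehat{t}|\).

$$\begin{aligned} a \cdot \widehat{x}&\ge 0 > -1, \\ |\widehat{x}_{\perp }|,\; |\widehat{y}|&\le 2\cdot 2^{m/2}\, 2^{k_i/2} < 2^{m/2}\, 2^{2 + k_i/2}, \\ |\widehat{t}|&\le 10\cdot 2^{m/2}\, 2^{3k_i/2} < 2^{m/2}\, 2^{4 + 3k_i/2}. \end{aligned}$$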

To verify (4.17), observe that for any j one has

$$\begin{aligned} \int _{g_j E_{k_j, m}^{-1}}f(g') \,dg' \ge e^{-2\Omega }\lambda . \end{aligned}$$

Because of (4.19) and since f is integrable, this can only happen for a finite number of j, so the sequence \((g_j)\) must be finite. This means that the set \(L^{m}_\lambda \setminus \bigcup _{1\le i < j} g_iP_i^{-1}\) is empty for some j, which is (4.17).

Proposition 7 is proved, and so is the weak type (1, 1) of \({\mathcal {H}}_1\). \(\square \)
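The summation in m in \(L^{1,\infty }(d\mu _v)\) can be carried out by an elementary union bound. The following is a sketch under stated assumptions and not necessarily the argument the authors have in mind: the weights \(a_m = 6/(\pi ^2 m^2)\), which satisfy \(\sum _{m \ge 1} a_m = 1\), and the shorthand \(Sf = \sum _{m \ge 1} e^{-c\,2^m}\, T_m f\) are introduced here only for illustration. If \(Sf(g) > \lambda \), then \(e^{-c\,2^m}\, T_m f(g) > a_m \lambda \) for at least one m, so Proposition 7 gives

$$\begin{aligned} \mu _v\{ Sf> \lambda \}&\le \sum _{m=1}^{\infty } \mu _v\left\{ T_m f > a_m\, e^{c\,2^m} \lambda \right\} \le \sum _{m=1}^{\infty } \frac{C\, 2^{C m}}{a_m\, e^{c\,2^m}\, \lambda } \int f\, dg' \\&= \frac{C \pi ^2}{6 \lambda } \left( \sum _{m=1}^{\infty } m^2\, 2^{C m}\, e^{-c\,2^m} \right) \int f\, dg', \end{aligned}$$

and the last series converges because \(e^{-c\,2^m}\) decays faster than any exponential in m.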

4.2 Weak Type (1, 1) of \({\mathcal {H}}_0\)

Here one follows the argument just given for \({\mathcal {H}}_1\). The main difference will be that the factor \((1 + |z|)^{-n - 1}\) in (4.8) is now \((1 + |z|)^{-n - 3/2}\).

4.3 \(L^p\)-Boundedness of \({\mathcal {H}}_k\)

Fix \(k \ge 1\) and \(p \in (1, +\infty )\). We first consider small h and note that there exists a constant \(C > 1\) such that

$$\begin{aligned} {h^{k/2}} \left| \nabla _g^k \, p_h^{(v)}(g, g') \right| \lesssim p_{C h}^{(v)}(g, g'), \qquad 0 < h \le 1, \;\; g, g' \in {\mathbb {H}}_n. \end{aligned}$$
(4.20)

This can be seen from the expression (3.16) and the classical gaussian estimates for the heat kernel and its derivatives on stratified groups; see Theorems IV.4.2 and IV.4.3 of [53]. Consequently,

$$\begin{aligned} \sup _{0 < h \le 1} {h^{k/2}} \left| \nabla ^k e^{h \Delta _v} \phi (g) \right| \lesssim {\mathcal {H}}_0\, \phi (g), \end{aligned}$$

and \({\mathcal {H}}_0\) is bounded on \(L^p(\mu _v)\) as pointed out in the Introduction.

It remains to prove the \(L^p\) boundedness of

$$\begin{aligned} \sup _{h > 1} {h^{k/2}} \left| \nabla ^k e^{h \Delta _v} \phi (g) \right| . \end{aligned}$$

We use an argument inspired by [51, p. 75]. Let \(\varepsilon (p) = k/2 + 1/p'\), where \(p'\) denotes the conjugate exponent of p.

Write for \(h > 1\)

$$\begin{aligned} \left| \nabla ^k e^{h \Delta _v} \phi \right| = \left| \int _h^{+\infty } \frac{d}{ds} \left( \nabla ^k e^{s \Delta _v} \phi \right) \, ds \right| ; \end{aligned}$$

the convergence of the integral follows from Theorem IV.4.2 of [53]. By Hölder’s inequality, this is majorized by

$$\begin{aligned}&\left( \int _h^{+\infty } s^{- p' \varepsilon (p)} \, ds \right) ^{{1}/{p'}} \,\left[ \int _h^{+\infty } \left| s^{\varepsilon (p)} \, \frac{d}{ds} \left( \nabla ^k e^{s \Delta _v} \phi \right) \right| ^p \, ds \right] ^{{1}/{p}} \\&\quad \lesssim h^{-k/2} \left[ \int _1^{+\infty } \left| s^{\varepsilon (p)} \, \nabla ^k \Delta _v \, e^{s \Delta _v} \phi \right| ^p \, ds \right] ^{{1}/{p}}. \end{aligned}$$

In conclusion, we get

$$\begin{aligned} \sup _{h > 1} {h^{k/2}} \left| \nabla ^k e^{h \Delta _v} \phi (g) \right| \lesssim \left[ \int _1^{+\infty } \left| s^{\varepsilon (p)} \, \nabla ^k \Delta _v\, e^{s \Delta _v} \phi \right| ^p \, ds \right] ^{{1}/{p}}. \end{aligned}$$
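The choice of \(\varepsilon (p)\) is exactly what makes the first Hölder factor equal to a constant times \(h^{-k/2}\). As a worked check, note that \(p' \varepsilon (p) = p' k/2 + 1\), so that for \(k \ge 1\) and \(h > 0\)

$$\begin{aligned} \left( \int _h^{+\infty } s^{- p' \varepsilon (p)} \, ds \right) ^{1/p'} = \left( \int _h^{+\infty } s^{- p' k/2 - 1} \, ds \right) ^{1/p'} = \left( \frac{2}{p' k} \right) ^{1/p'} h^{-k/2}. \end{aligned}$$

In the second factor, the derivative is \(\frac{d}{ds} \nabla ^k e^{s \Delta _v} \phi = \nabla ^k \Delta _v \, e^{s \Delta _v} \phi \), and the domain of integration \((h, +\infty )\) is enlarged to \((1, +\infty )\).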

With \(s>1\) we write \(\nabla ^k \Delta _v\, e^{s \Delta _v} =\nabla ^k e^{ \Delta _v/4}\,\Delta _v \,e^{\Delta _v/4}\;e^{(s - {1}/{2}) \Delta _v}\), and for the operator norms on \(L^p(d\mu _v)\) we will have

$$\begin{aligned} \Vert \nabla ^k \Delta _v \,e^{s \Delta _v} \Vert _{p \rightarrow p} \le \Vert \nabla ^k e^{ \Delta _v/4} \Vert _{p \rightarrow p} \, \Vert \Delta _v\, e^{ \Delta _v/4} \Vert _{p \rightarrow p} \, \Vert e^{(s - {1}/{2}) \Delta _v} \Vert _{p \rightarrow p}. \end{aligned}$$

From (4.20) and the boundedness of \({\mathcal {H}}_0\), it follows that \(\Vert \nabla ^k e^{ \Delta _v/4} \Vert _{p \rightarrow p} \lesssim 1\) and \(\Vert \Delta _v \,e^{ \Delta _v/4} \Vert _{p \rightarrow p} \lesssim 1\). Using interpolation and the spectral gap, we see that \(\Vert e^{(s - {1}/{2}) \Delta _v} \Vert _{p \rightarrow p}\) is exponentially decreasing as \(s \rightarrow +\infty \). The boundedness of \({\mathcal {H}}_k\) on \(L^p(\mu _v)\) follows.
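The last implication can be spelled out as follows. This is a sketch in which the exponential decrease is written, hypothetically, as \(\Vert e^{(s - 1/2) \Delta _v} \Vert _{p \rightarrow p} \le C e^{-\delta s}\) for \(s > 1\); the constants \(C\) and \(\delta > 0\) are not named in the text and are introduced only for illustration. Taking the \(L^p(\mu _v)\) norm of the pointwise estimate above and using Fubini's theorem,

$$\begin{aligned} \Big \Vert \sup _{h > 1} h^{k/2} \left| \nabla ^k e^{h \Delta _v} \phi \right| \Big \Vert _{L^p(\mu _v)}^p&\lesssim \int _1^{+\infty } s^{p\, \varepsilon (p)} \, \Vert \nabla ^k \Delta _v \, e^{s \Delta _v} \phi \Vert _{L^p(\mu _v)}^p \, ds \\&\lesssim \Vert \phi \Vert _{L^p(\mu _v)}^p \int _1^{+\infty } s^{p\, \varepsilon (p)} \, e^{- p \delta s} \, ds < +\infty . \end{aligned}$$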

5 Proof of Theorem 1

We let \(f(g) =\phi (g)\, e^{2 \, a \cdot x}\) as in Sect. 4.1.

The first-order Riesz transform is given by

$$\begin{aligned} {\mathcal {R}}_1 = \frac{1}{\sqrt{\pi }} \int _{0}^{\infty } h^{-1/2}\,\nabla \,e^{h \Delta _v}\,dh, \end{aligned}$$

cf. (2.2). Except for a factor \(h^{-1}\), the integrand here appeared in connection with the operator \({\mathcal {H}}_1\) in the beginning of Sect. 4.1. Again, we have a kernel which is a function of \((g')^{-1} g\) multiplied by \(e^{-2a\cdot x}\), and we arrive at a convolution, cf. (4.3). Indeed, for any \(g = (x,y,t) \in {\mathbb {H}}_n\)

$$\begin{aligned} h^{-1/2}\,\nabla \,e^{h \Delta _v}\,\phi (g) = e^{-2a\cdot x}\, f * \widehat{K}_h(g), \end{aligned}$$

where

$$\begin{aligned} \widehat{K}_h(g) = h^{-1/2}\,\nabla (e^{-h+a\cdot x}\,p_h(g)). \end{aligned}$$

Moreover, (4.2) implies that \(|\widehat{K}_h| \lesssim h^{-1}\,K_h\), where \(K_h\) is given by (4.1).

For the Riesz operator, we are thus led to the expression

$$\begin{aligned} {\mathcal {R}}_1\,\phi (g) = \frac{1}{\sqrt{\pi }}\, e^{-2a\cdot x}\,\int _{0}^{\infty } f * \widehat{K}_h(g)\,dh. \end{aligned}$$
(5.1)

We shall now verify the convergence of the integral \(\int _{0}^{\infty } |\widehat{K}_h(g)|\,dh\) and estimate it, for all \(g \ne o\). It will then follow that \({\mathcal {R}}_1\) is given by (5.1) for all \(g \notin \mathrm {supp}\,\phi =\mathrm {supp}\,f\).

Assume first that \(0 < d(g) \le 2\), so that \(a\cdot x \le 2\). From (4.1) we then see that

$$\begin{aligned} \int _{0}^{\infty } |\widehat{K}_h(g)|\,dh&\lesssim \int _{0}^{\infty } h^{-1}\, {K}_h(g)\,dh \\&\lesssim \int _{0}^{\infty } h^{-n-3/2} \, \left( 1 + h^{-\frac{1}{2}} \left( 1+\frac{d(g)}{\sqrt{h}}\right) \,\right) \, \left[ 1 + \frac{ d(g)^2}{h} \right] ^{n - 1}\,\\&\qquad \qquad \times \exp \left( -\frac{d(g)^2}{4h}\right) \,dh. \end{aligned}$$

To estimate this integral, one uses the exponential factor for \(h < d(g)^2\) but not for other values of h, and finds that

$$\begin{aligned} \int _{0}^{\infty } |\widehat{K}_h(g)|\,dh \lesssim d(g)^{-2n - 2}, \qquad 0 < d(g) \le 2. \end{aligned}$$
(5.2)
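The power of \(d(g)\) in (5.2) can also be read off from a scaling argument. The following is a sketch; the abbreviation \(d = d(g)\) and the variable \(u = h/d^2\) are introduced here. With \(h = d^2 u\), one has \(h^{-n-3/2}\, dh = d^{-2n-1}\, u^{-n-3/2}\, du\), and since \(d \le 2\) the factor \(1 + h^{-1/2}(1 + d/\sqrt{h})\) is at most \(2\, d^{-1} (1 + u^{-1/2} + u^{-1})\). Hence

$$\begin{aligned}&\int _{0}^{\infty } h^{-n-3/2} \, \left( 1 + h^{-\frac{1}{2}} \left( 1+\frac{d}{\sqrt{h}}\right) \right) \, \left[ 1 + \frac{ d^2}{h} \right] ^{n - 1} \exp \left( -\frac{d^2}{4h}\right) \,dh \\&\quad \le 2\, d^{-2n-2} \int _{0}^{\infty } u^{-n-3/2} \, \left( 1 + u^{-1/2} + u^{-1} \right) \, \left( 1 + \frac{1}{u} \right) ^{n - 1} e^{-1/(4u)} \,du \lesssim d^{-2n-2}. \end{aligned}$$

The integral in u converges, because the integrand is \(O(u^{-n-3/2})\) as \(u \rightarrow +\infty \) and the exponential factor dominates as \(u \rightarrow 0\).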

Using again (3.15), one also verifies that

$$\begin{aligned} \int _{0}^{\infty } |\nabla \widehat{K}_h(g)|\,dh \lesssim d(g)^{-2n - 3}, \qquad 0 < d(g) \le 2. \end{aligned}$$
(5.3)

Assuming now \( d(g) > 1\), we first consider the integral over \(d(g)/4< h < d(g) \). For such h, we use (4.6) and the middle quantity in (4.7), and get

$$\begin{aligned}&\int _{d(g)/4}^{d(g)} h^{-1}\, {K}_h(g)\,dh\\&\lesssim e^{a \cdot x - d(g)}\, d(g)^{-\frac{5}{2}} (1 + |z|)^{\frac{1}{2} -n} \int _{d(g)/4}^ {d(g)} \exp \left\{ - \frac{(2h - d(g))^2}{4 h} \right\} \, dh \\&\sim e^{a \cdot x - d(g)}\, d(g)^{-2}\, (1 + |z|)^{\frac{1}{2}-n}. \end{aligned}$$

The integral over \(0 < h \notin (d(g)/4, d(g))\) is controlled by \(e^{a \cdot x - d(g)- cd(g)}\), as seen by means of the middle expressions in (4.9) and (4.11). Thus altogether

$$\begin{aligned} \int _{0}^{\infty } |\widehat{K}_h(g)| \,dh \lesssim e^{a \cdot x -d(g)}\, d(g)^{-2}\, (1 + |z|)^{\frac{1}{2}-n}, \qquad d(g) > 1. \end{aligned}$$
(5.4)

To prove the weak type (1,1) of \({\mathcal {R}}_1\), we split the operator into a global and a local part. Choose a smooth function \(\eta \ge 0\) in \({\mathbb {H}}_{n}\) satisfying \(\eta (g) = 1\) if \(d(g) \le 1\) and \(\eta (g) = 0\) if \(d(g) \ge 2\). Then we define \(\widehat{K}_h^{\mathrm{glob}}(g) = \widehat{K}_h(g)\,(1 - \eta (g))\) and

$$\begin{aligned} {\mathcal {R}}_1^{\mathrm{glob}}\phi (g) = \frac{1}{\sqrt{\pi }} \, e^{-2a\cdot x}\,\int _{0}^{\infty } f * \widehat{K}_h^{\mathrm{glob}}(g)\,dh. \end{aligned}$$

The local part is

$$\begin{aligned} {\mathcal {R}}_1^{\mathrm{loc}} = {\mathcal {R}}_1 - {\mathcal {R}}_1^{\mathrm{glob}}. \end{aligned}$$

For \(g \notin \mathrm {supp}\,f\), the local part is given by

$$\begin{aligned} {\mathcal {R}}_1^{\mathrm{loc}}\phi (g) = \frac{1}{\sqrt{\pi }}\, e^{-2a\cdot x}\,\int _{0}^{\infty } f * \widehat{K}_h^{\mathrm{loc}}(g)\,dh, \end{aligned}$$

where \(\widehat{K}_h^{\mathrm{loc}}(g) = \widehat{K}_h(g)\,\eta (g)\) satisfies the estimates (5.2) and (5.3), like \(\widehat{K}_h\). Notice that these are the standard estimates for singular integrals of Calderón–Zygmund type. By means of a suitable splitting of \({\mathbb {H}}_{n}\) into pieces, it can be proved first that \({\mathcal {R}}_1^{\mathrm{loc}}\) is bounded on \(L^p(\mu _v),\; 1< p < \infty \), and then that it is also of weak type (1,1), see [7, Lemma 5 p. 1316 f.].

For the global part, (5.4) implies that

$$\begin{aligned} \int _{0}^{\infty } |\widehat{K}_h^{\mathrm{glob}}(g)|\,dh \lesssim e^{a \cdot x - d(g)}\, d(g)^{-2}\, (1 + |z|)^{\frac{1}{2}-n} =: K^*(g). \end{aligned}$$
(5.5)

The operator we thus need to estimate is \(f \mapsto e^{- 2 \, a \cdot x}\, f *{K^*}\), for \(0\le f \in L^1(dg)\). It is actually enough to consider \(e^{- 2 \, a \cdot x}\, f *(K^{*}\, \chi _E)\), where

$$\begin{aligned} E = \{ g:\, a \cdot x > 1,\;\;|x_{\perp }| + |y| + \sqrt{|t|} \le a \cdot x\}. \end{aligned}$$

Indeed, for the complement of this set we can apply Lemma 6 as in Sect. 4.1.

If the point \(g \in E\) is in the slice defined by \(2^{k - 1} < a \cdot x \le 2^k\) for some \(k \in \{ 1, 2, \dots \}\), we combine (5.5) with (3.4) to conclude that

$$\begin{aligned} K^{*}(g) \lesssim 2^{-(n+3/2)k} \, e^{-cQ(g)}. \end{aligned}$$

With \(M_k\) defined by (4.13), possibly with another value of the constant c, this means that

$$\begin{aligned} K^{*}(g) \lesssim \sum _{k = 1}^{+\infty } 2^{-k/2} M_k(g). \end{aligned}$$
(5.6)

From (4.14) and Proposition 7, we know that the operator

$$\begin{aligned} f \mapsto e^{- 2 \, a \cdot x} f *M_k(g) \end{aligned}$$

is bounded from \(L^1(dg)\) into \(L^{1,\infty }(\mu _v)\), uniformly in k. The estimate (5.6) then makes it possible to sum in \(L^{1,\infty }(\mu _v)\) and obtain the weak type (1, 1) of the operator \(f \mapsto e^{- 2 \, a \cdot x} f *(K^{*}\,\chi _E)(g)\). Thus \({\mathcal {R}}_1^{\mathrm{glob}}\) is of weak type (1, 1), and the proof of Theorem 1 is complete.

6 Counterexamples

6.1 Proof of Theorem 2

Let \(k\ge 3\). Instead of \({\mathcal {R}}_k\), it is enough to find a counterexample for \((a \cdot {\mathrm{X}})^k (-\Delta _v)^{-k/2}\). We will apply this operator to a function \(\phi \) supported near the origin, and evaluate \((a \cdot {\mathrm{X}})^k (-\Delta _v)^{-k/2}\,\phi \) at points far away.

For large \(r>0\) we introduce the set

$$\begin{aligned} \Omega _r = \Big \{g=(x, y, t):\, \ r - 1< a \cdot x< r, \;\; |x_{\perp }|< \sqrt{r}, \; \; |y|< \sqrt{r},\;\; |t| < r^{3/2} \Big \}. \end{aligned}$$

If \(g \in \Omega _r\), Lemma 4 implies that \(|Q(g)| \lesssim 1\), and so

$$\begin{aligned} d(g) = a \cdot x + O(1) = r + O(1), \qquad r \rightarrow \infty . \end{aligned}$$
(6.1)

Lemma 8

If \(g \in \Omega _r\) and r is large enough, then

$$\begin{aligned} (-1)^k \, \int _0^\infty h^{k/2 - 1}\, (a \cdot {\mathrm{X}})^k \, p_h^{(v)}(g, o) \, dh \gtrsim e^{-2r}\,r^{k/2-n-2}. \end{aligned}$$
(6.2)

The kernel \(p_h^{(v)}\) was introduced in Sect. 3.3. Before proving this lemma, we use it to construct the desired counterexample. Let \(\phi \) be a nonnegative, continuous function supported in the ball \(B(o,\rho )\) for some small \(\rho \), and satisfying \(\int \phi \,d\mu _v = 1\). When the point g is not in the support of \(\phi \), one has

$$\begin{aligned} (a \cdot {\mathrm{X}})^k\, e^{h\Delta _v}\,\phi (g) = \int (a \cdot {\mathrm{X}}_g)^k \, p_h^{(v)}((g')^{-1}g)\,\phi (g')\,d\mu _v(g'). \end{aligned}$$
(6.3)

We define a subset of \(\Omega _r\) by

$$\begin{aligned} \Omega _r' = \left\{ g:\, \ r - \frac{3}{4}< a \cdot x< r-\frac{1}{4}, \;\; |x_{\perp }|< \frac{\sqrt{r}}{2}, \; \; |y|< \frac{\sqrt{r}}{2}, \;\; |t| < \frac{r^{3/2}}{2} \right\} . \end{aligned}$$

Then we can fix \(\rho >0\) such that \((g')^{-1}g\in \Omega _r\) if \(g' \in B(o,\rho )\) and \(g \in \Omega _r'\), for any large r. This is seen from the group law (2.1), and \(\rho \) will depend only on n.

We now combine (6.3) with (2.2), where \(\nabla \) is replaced by \(a \cdot {\mathrm{X}}\). With \(g \in \Omega _r'\), we can swap the order of integration and obtain

$$\begin{aligned}&(a \cdot {\mathrm{X}})^k\,(-\Delta _v)^{-k/2}\,\phi (g)\\&\quad = \frac{1}{\Gamma (k/2)}\, \int _0^\infty h^{k/2 - 1}\, \int (a \cdot {\mathrm{X}}_g)^k \, p_h^{(v)}((g')^{-1}g) \,\phi (g')\,d\mu _v(g') \, dh\\&\quad =\frac{1}{\Gamma (k/2)}\, \int \int _0^\infty h^{k/2 - 1}\, (a \cdot {\mathrm{X}}_g)^k \, p_h^{(v)}((g')^{-1}g) \, dh \,\phi (g')\,d\mu _v(g'). \end{aligned}$$

From Lemma 8, we conclude that for \(g \in \Omega _r'\)

$$\begin{aligned} (-1)^k \, (a \cdot {\mathrm{X}})^k\,(-\Delta _v)^{-k/2}\,\phi (g)&\gtrsim \int e^{-2r}\,r^{k/2-n-2} \,\phi (g')\,d\mu _v(g')\\&= \,e^{-2r}\,r^{k/2-n-2}. \end{aligned}$$

Since \(\mu _v(\Omega _r') \sim e^{2 r}\, r^{n + 1} \) and \(k\ge 3\), this violates the weak type (1,1) for \((a \cdot {\mathrm{X}})^k (-\Delta _v)^{-k/2}\) as \(r \rightarrow +\infty \) and ends the proof of Theorem 2.
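To spell out the contradiction, write \(T = (a \cdot {\mathrm{X}})^k (-\Delta _v)^{-k/2}\) and \(\lambda _r = c\, e^{-2r}\, r^{k/2-n-2}\); both are shorthands introduced here, with \(c > 0\) a small constant. The power of r is the one produced by the integration in h at the end of the proof of Lemma 8, namely \(r^{-n-3/2} \cdot r^{k/2 - 1} \cdot r^{1/2}\). Since \(|T\phi | \ge \lambda _r\) on \(\Omega _r'\) and \(\Vert \phi \Vert _{L^1(\mu _v)} = 1\), a weak type (1,1) estimate with constant C would force \(\lambda _r\, \mu _v(\Omega _r') \le C\) for all large r. But

$$\begin{aligned} \lambda _r\, \mu _v(\Omega _r') \sim e^{-2r}\, r^{k/2-n-2} \cdot e^{2r}\, r^{n+1} = r^{k/2 - 1} \longrightarrow +\infty , \qquad r \rightarrow +\infty , \end{aligned}$$

as soon as \(k \ge 3\). For \(k = 2\) the product stays bounded, so this particular test gives no contradiction in that case.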

Proof of Lemma 8

We start by estimating the integral in (6.2) taken only over \(0 < h \notin (d(g)/4, d(g))\), in terms of the kernel \(K_h\) introduced in the beginning of Sect. 4.1. Letting \(g' = o\) in (4.2), we get \(|\nabla \, p_h^{(v)}(g,o)| \lesssim e^{- 2 \, a \cdot x}\, h^{-1/2}\,K_h(g)\). In our case, we have derivatives of order k, and (3.15) says that this gives extra factors controlled by \(1 + h^{-k/2} + h^{-k} d(g)^k\).

If \(h > d(g)\), we can estimate \(K_h(g) = K_h^{\mathrm{glob}}(g)\) by means of (4.9). Then all powers of h and d(g) can be absorbed by the factors \(\exp (-ch-cd(g))\) in (4.9). As a result,

$$\begin{aligned} \int _{d(g)}^\infty h^{k/2 - 1}\, |(a \cdot {\mathrm{X}})^k \, p_h^{(v)}(g, o)| \, dh \lesssim \exp (- \, a \cdot x -d(g) -cd(g)) \sim e^{-2r -cr}, \end{aligned}$$

the last estimate because of (6.1). For the integral over \(0< h < d(g)/4\), we use instead (4.11) in a very similar way, to get

$$\begin{aligned} \int _0^{d(g)/4} h^{k/2 - 1}\, |(a \cdot {\mathrm{X}})^k \, p_h^{(v)}(g, o)| \, dh \lesssim e^{-2r -cr}. \end{aligned}$$

These parts of the integral in (6.2) are thus much smaller than the right-hand side and can be neglected. It remains to deal with the integral over \(d(g)/4< h < d(g)\), which requires much more precision.

We start by using Leibniz’ rule and (3.16) together with the fact that \((a \cdot {\mathrm{X}})\, e^{- a \cdot x} = - e^{- a \cdot x}\), to get

$$\begin{aligned} (a \cdot {\mathrm{X}})^k \,p_h^{(v)}(g, o) = e^{-h} \, e^{- a \cdot x} \, \sum _{j = 0}^k \genfrac(){0.0pt}1{k}{j} \, (-1)^{k- j} \, \left( (a \cdot {\mathrm{X}})^j\, p_h \right) (g). \end{aligned}$$
(6.4)

Now (3.12) and the definition of \({\mathrm{X}}\) show that for \(j = 0, \dots , k \) and any point \( g = ( z, t) = ( x, y, t)\)

$$\begin{aligned}&(a \cdot {\mathrm{X}})^j\, p_h(g) \nonumber \\&= \frac{1}{2 (4 \pi )^{n + 1}}\, h^{-n - 1} \int _{{\mathbb {R}}} \left( a \cdot \nabla _x + 2a \cdot y\, \partial /\partial t \right) ^j \, \exp {\left( F(\lambda )\right) } \, \left( \frac{\lambda }{\sinh {\lambda }} \right) ^n\, d\lambda \end{aligned}$$
(6.5)

where

$$\begin{aligned} F(\lambda ) = \frac{1}{4h} \left( i t\lambda - | z|^2 \lambda \coth {\lambda }\right) \end{aligned}$$

and \(\nabla _x = (\partial /\partial x_1,\dots , \partial /\partial x_n)\) is the ordinary gradient in \({\mathbb {R}}^n\).

One finds that

$$\begin{aligned}&\left( a \cdot \nabla _x + 2a \cdot y\, \partial /\partial t \right) ^j \, \exp {\left( F(\lambda )\right) } \\&\quad = \sum _{\alpha ,\beta } c_{\alpha ,\beta }\,h^{-\alpha -\beta } \,(-\lambda \coth {\lambda })^{\alpha } \, (a \cdot x)^{2\alpha +\beta -j}\,(a \cdot y)^{\beta }\, (i\lambda )^{\beta } \, \exp {\left( F(\lambda )\right) } \end{aligned}$$

for some positive constants \(c_{\alpha ,\beta }\), where the sum is taken over all nonnegative integers \(\alpha ,\,\beta \) verifying \(\alpha +\beta \le j\) and \(2\alpha +\beta \ge j\). The reason for this last inequality is that each term in the sum arises when \(\alpha \) differentiations \(a \cdot \nabla _x\) and \(\beta \) differentiations \(2a \cdot y\, \partial /\partial t\) fall on \(\exp {\left( F(\lambda )\right) }\), and the remaining \(j-\alpha -\beta \) differentiations fall on the powers of \(a \cdot x\) that will occur. Thus one must have \(j-\alpha -\beta \le \alpha \).
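The cases \(j = 1\) and \(j = 2\) illustrate the formula. In this sketch \(D = a \cdot \nabla _x + 2a \cdot y\, \partial /\partial t\) is a shorthand used only here, and we use \(|a| = 1\), \(a \cdot \nabla _x |z|^2 = 2\, a \cdot x\) and \(D(a \cdot y) = 0\). A direct computation gives

$$\begin{aligned} D \exp {\left( F(\lambda )\right) }&= \left[ \tfrac{1}{2}\, h^{-1} (-\lambda \coth {\lambda })\, (a \cdot x) + \tfrac{1}{2}\, h^{-1} (a \cdot y)\, (i\lambda ) \right] \exp {\left( F(\lambda )\right) }, \\ D^2 \exp {\left( F(\lambda )\right) }&= \Big [ \tfrac{1}{4}\, h^{-2} (-\lambda \coth {\lambda })^2\, (a \cdot x)^2 + \tfrac{1}{2}\, h^{-2} (-\lambda \coth {\lambda })\, (a \cdot x)\, (a \cdot y)\, (i\lambda ) \\&\qquad + \tfrac{1}{4}\, h^{-2} (a \cdot y)^2\, (i\lambda )^2 + \tfrac{1}{2}\, h^{-1} (-\lambda \coth {\lambda }) \Big ] \exp {\left( F(\lambda )\right) }. \end{aligned}$$

For \(j = 1\) the pairs \((\alpha ,\beta )\) are \((1,0)\) and \((0,1)\). For \(j = 2\) they are \((2,0)\), \((1,1)\), \((0,2)\) and \((1,0)\); the last term arises when the second differentiation falls on the factor \(a \cdot x\), and the pair \((0,1)\) is excluded by \(2\alpha +\beta \ge j\).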

So (6.5) implies that for \(0 \le j \le k\), with other positive constants \(c_{\alpha ,\beta }\),

$$\begin{aligned} (a \cdot {\mathrm{X}})^j\, p_h(g) = h^{-n - 1} \sum _{\alpha ,\beta } c_{\alpha ,\beta }\, (-1)^{\alpha }\, h^{-\alpha -\beta }\, (a \cdot x)^{2\alpha +\beta -j}\, (a \cdot y)^{\beta }\, I_{\alpha ,\beta } \end{aligned}$$
(6.6)

where

$$\begin{aligned} I_{\alpha ,\beta } = \int _{{\mathbb {R}}} \exp {\left( F(\lambda )\right) } \left( \frac{\lambda }{\sinh {\lambda }} \right) ^n (\lambda \coth {\lambda })^{\alpha } \, (i\lambda )^{\beta }\, \, d\lambda . \end{aligned}$$

We will estimate \(I_{\alpha ,\beta }\), assuming the point \( g = ( z, t) = ( x, y, t)\) in \(\Omega _r\) for some large r, and we take \(h \in (d(g)/4 , d(g))\). Then \(h\sim d(g)\sim r\), \(|z|^2/h\sim r\) and \(|t|/h\lesssim \sqrt{r}\). We remark that Beals, Gaveau and Greiner [12, Sect. 2] compute a similar integral, by moving the contour of integration to a line in the complex plane; in our case this is not necessary.

The main part of \(I_{\alpha ,\beta }\) comes from a neighborhood of the point \(\lambda = 0\), and we observe that \(F(0) = -|z|^2/(4h)\). Further,

$$\begin{aligned} F(\lambda ) - F(0) = -\frac{|z|^2}{4h} \,(\lambda \coth {\lambda }-1) + i\frac{t\lambda }{4h} = -\frac{|z|^2}{4h}\,\left( \frac{1}{3}\,\lambda ^2 +O(\lambda ^4)\right) + i\frac{t\lambda }{4h} \end{aligned}$$
(6.7)

as \(\lambda \rightarrow 0\). Using the symbol \(\wedge \) for the minimum, we also have

$$\begin{aligned} \lambda \coth {\lambda }-1 \sim \lambda ^2 \wedge |\lambda |,\qquad \lambda \in {\mathbb {R}}\setminus \{ 0 \}. \end{aligned}$$
(6.8)

This is because the quotient \( (\lambda \coth {\lambda }-1) /{(\lambda ^2 \wedge |\lambda |)}\) is continuous and positive for \(\lambda \ne 0\) and has positive limits at 0 and at \(\pm \infty \). Thus

$$\begin{aligned} \Re (F(\lambda ) - F(0)) < - c \, r \, \lambda ^2 \wedge |\lambda |,\qquad \lambda \in {\mathbb {R}}\setminus \{ 0 \} \end{aligned}$$

for some \(c>0\). Since \(\lambda /\sinh {\lambda }\) is bounded on \({\mathbb {R}}\) and \(|\lambda \coth {\lambda }| \lesssim 1+|\lambda |\), we will have

$$\begin{aligned} |I_{\alpha ,\beta }| \lesssim \exp {\left( -\frac{|z|^2}{4h} \right) } \, \int _{{\mathbb {R}}} e^{- c \, r \, \lambda ^2 \wedge |\lambda |} \, (1+|\lambda |^\alpha ) \, |\lambda |^{\beta }\, d\lambda . \end{aligned}$$

Here we separate the integrals over \(|\lambda |< 1\) and \(|\lambda | > 1\) and easily get

$$\begin{aligned} |I_{\alpha ,\beta }| \lesssim \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-1/2-\beta /2}. \end{aligned}$$
(6.9)
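The two pieces can be estimated as follows; this is a sketch, assuming \(r \ge 1\) and allowing the constant c to change from one occurrence to the next. For \(|\lambda | < 1\) one has \(\lambda ^2 \wedge |\lambda | = \lambda ^2\) and \(1 + |\lambda |^\alpha \le 2\), and the substitution \(u = \sqrt{c\,r}\, \lambda \) gives the main term. For \(|\lambda | > 1\) one has \(\lambda ^2 \wedge |\lambda | = |\lambda |\) and \(c\, r |\lambda | \ge c\, r/2 + c |\lambda |/2\), which gives an exponentially small term:

$$\begin{aligned} \int _{|\lambda |< 1} e^{- c \, r \, \lambda ^2} \, (1+|\lambda |^\alpha ) \, |\lambda |^{\beta }\, d\lambda&\le 2\, (c\, r)^{-(\beta + 1)/2} \int _{{\mathbb {R}}} e^{-u^2}\, |u|^{\beta }\, du \lesssim r^{-1/2-\beta /2}, \\ \int _{|\lambda | > 1} e^{- c \, r \, |\lambda |} \, (1+|\lambda |^\alpha ) \, |\lambda |^{\beta }\, d\lambda&\le e^{-c\, r/2} \int _{{\mathbb {R}}} e^{-c |\lambda |/2}\, (1+|\lambda |^\alpha ) \, |\lambda |^{\beta }\, d\lambda \lesssim e^{-c\, r/2}. \end{aligned}$$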

Next we verify that this estimate is sharp in the case \(\alpha = j\). Notice that \(\alpha = j\) forces \(\beta = 0\). We split \(I_{j,0}\) as \(I_{j,0} = I^0 + I^\infty \), where

$$\begin{aligned} I^0 = \int _{-r^{-1/4}}^{r^{-1/4}} \exp {\left( F(\lambda )\right) } \left( \frac{\lambda }{\sinh {\lambda }} \right) ^n (\lambda \coth {\lambda })^j \, d\lambda \end{aligned}$$

and \(I^\infty \) is the corresponding integral over \(|\lambda | > r^{-1/4}\).

If \(|\lambda | > r^{-1/4}\), (6.8) implies \(\lambda \coth {\lambda }-1 \gtrsim r^{-1/2} + \lambda ^2 \wedge |\lambda |\). Thus

$$\begin{aligned} |I^\infty |&\lesssim \exp {\left( -\frac{|z|^2}{4h} \right) } \, \exp {\left( -c \, r^{1/2} \right) } \, \int _{{\mathbb {R}}} e^{- c \, r \lambda ^2 \wedge |\lambda |} \, (1+|\lambda |^j) \, d\lambda \nonumber \\&\lesssim \exp {\left( -\frac{|z|^2}{4h} \right) } \, \exp {\left( -c \, r^{1/2} \right) }, \end{aligned}$$
(6.10)

for some positive constants c.

For \(I^0\) we use (6.7) and the fact that \(\lambda /\sinh {\lambda } = 1 + O(\lambda ^2)\) as \(\lambda \rightarrow 0\), getting

$$\begin{aligned} I^0 = \exp {\left( -\frac{|z|^2}{4h} \right) } \int _{-r^{-1/4}}^{r^{-1/4}} \exp {\left( -\frac{|z|^2}{12h} \lambda ^2 + i\frac{t\lambda }{4h} + O(r\lambda ^4)\right) }\, \left( 1 + O(\lambda ^2)\right) \, d\lambda . \end{aligned}$$

Since here \(r\lambda ^4 \le 1\), we can replace the term \(O(r\lambda ^4)\) in the exponent by a factor \( 1 + O(r\lambda ^4)\) outside the exponential. Thus we have

$$\begin{aligned} I^0 = \exp {\left( -\frac{|z|^2}{4h} \right) } \int _{-r^{-1/4}}^{r^{-1/4}} \exp {\left( -\frac{|z|^2}{12h} \lambda ^2 +i\frac{t\lambda }{4h} \right) } \,\left( 1 + O(r\lambda ^4 + \lambda ^2)\right) \, d\lambda . \end{aligned}$$

The effect of the \(O(\dots )\) term in this integral is controlled by

$$\begin{aligned} \exp {\left( -\frac{|z|^2}{4h} \right) } \int _{-r^{-1/4}}^{r^{-1/4}} \exp {\left( - c \, r \, \lambda ^2 \right) } \,(r\lambda ^4 + \lambda ^2)\, d\lambda \, \lesssim \, \exp {\left( -\frac{|z|^2}{4h} \right) }\, r^{-3/2}. \end{aligned}$$

What remains is

$$\begin{aligned} \exp {\left( -\frac{|z|^2}{4h} \right) } \int _{-r^{-1/4}}^{r^{-1/4}} \exp {\left( -\frac{|z|^2}{12h} \lambda ^2 + i\frac{t\lambda }{4h} \right) } \, d\lambda . \end{aligned}$$

Here we can extend the integration to all of \({\mathbb {R}}\), with an error that can be estimated as in (6.10). The resulting integral over \({\mathbb {R}}\) is an elementary Fourier transform, taken at the point t/(4h). Its value is

$$\begin{aligned} \sqrt{\pi }\,\frac{\sqrt{12h}}{|z|}\, \exp {\left( -\frac{12h}{4|z|^2} \left( \frac{t}{4h} \right) ^2 \right) } \sim r^{-1/2}, \end{aligned}$$

because \(0 \le \frac{12h}{4|z|^2} \left( \frac{t}{4h} \right) ^2 \lesssim 1\), so that the exponential factor is bounded above by 1 and below by a positive constant.
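The closed form of this Fourier transform is the standard identity \(\int _{{\mathbb {R}}} e^{-A\lambda ^2 + iB\lambda }\, d\lambda = \sqrt{\pi /A}\; e^{-B^2/(4A)}\), applied with \(A = |z|^2/(12h)\) and \(B = t/(4h)\). The following Python sketch checks the identity numerically at one sample point. It is a sanity check and not part of the proof; the function names, the midpoint rule and the sample values \(r = 100\), \(h = r/2\), \(|z| = r\), \(t = r^{3/2}\) are choices made here for illustration.

```python
import cmath
import math


def gaussian_fourier_numeric(A, B, steps=20_000):
    """Midpoint-rule value of the integral of exp(-A*x^2 + i*B*x) over the real line."""
    half_width = 12.0 / math.sqrt(A)  # |integrand| is at most exp(-144) outside this window
    dx = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += cmath.exp(-A * x * x + 1j * B * x)
    return total * dx


def gaussian_fourier_exact(A, B):
    """Closed form sqrt(pi/A) * exp(-B^2 / (4A)) quoted in the text."""
    return math.sqrt(math.pi / A) * math.exp(-B * B / (4.0 * A))


# Hypothetical sample point with the scaling of Omega_r (values chosen for illustration).
r = 100.0
h, z_abs, t = r / 2.0, r, r ** 1.5
A = z_abs ** 2 / (12.0 * h)
B = t / (4.0 * h)

numeric = gaussian_fourier_numeric(A, B)
exact = gaussian_fourier_exact(A, B)
exponent = B * B / (4.0 * A)  # equals (12h / (4|z|^2)) * (t / (4h))^2
```

At this sample point `exponent` is of order 1, so `exact` is comparable to \(\sqrt{h}/|z| \sim r^{-1/2}\), in line with the estimate above.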

We can now summarize the last few estimates, and conclude that for large r

$$\begin{aligned} I_{j,0}\, \sim \,I^0 \, \sim \,\exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-1/2}; \end{aligned}$$
(6.11)

cf. (6.9).

The next step is to insert our estimates for \(I_{\alpha ,\beta }\) in (6.6). The inequality (6.9) implies

$$\begin{aligned} \big |h^{-\alpha -\beta }\, (a \cdot x)^{2\alpha +\beta -j} \, (a \cdot y)^{\beta }\, I_{\alpha ,\beta }\big | \lesssim \, r^{\alpha + \beta /2 - j} \, |I_{\alpha ,\beta }| \lesssim \, \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-1/2 + \alpha - j}. \end{aligned}$$

In the case when \(\alpha = j\), and thus \(\beta = 0\), (6.11) shows that this estimate is sharp in the sense that

$$\begin{aligned} h^{-\alpha -\beta }\, (a \cdot x)^{2\alpha +\beta -j}\, (a \cdot y)^{\beta }\, I_{\alpha ,\beta } \,= \, h^{-j}\,(a \cdot x)^j \, I_{j,0} \, \sim \, \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-1/2 }. \end{aligned}$$

This means that the term with \(\alpha = j\) dominates in the sum in (6.6), and

$$\begin{aligned} (-1)^j \, (a \cdot {\mathrm{X}})^j\, p_h(g) \sim \, \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-n-3/2 }. \end{aligned}$$

When we now insert this estimate in (6.4), the factors \((-1)^j\) will cancel, and we conclude that

$$\begin{aligned} (-1)^k \,(a \cdot {\mathrm{X}})^k \,p_h^{(v)}(g, o)&\sim e^{-h} \, e^{- a \cdot x} \, \sum _{j = 0}^k \genfrac(){0.0pt}1{k}{j} \, \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-n-3/2 } \nonumber \\&\sim e^{-h} \, e^{- a \cdot x} \, \exp {\left( -\frac{|z|^2}{4h} \right) } \, r^{-n-3/2 }. \end{aligned}$$
(6.12)

Next, we verify that in \(\Omega _r\)

$$\begin{aligned} \frac{|z|^2}{4h} = \frac{d(g)^2}{4h} + O(1), \qquad r \rightarrow \infty . \end{aligned}$$
(6.13)

Indeed,

$$\begin{aligned} |z|^2 - d(g)^2= & {} (a \cdot x)^2 - d(g)^2 + |x_\perp |^2 + |y|^2\\= & {} (a \cdot x + d(g))(a \cdot x - d(g)) + O(r) = O(r), \end{aligned}$$

where we applied (6.1). This proves (6.13), and then (6.12) can be rewritten as

$$\begin{aligned} (-1)^k \,(a \cdot {\mathrm{X}})^k \,p_h^{(v)}(g, o) \sim e^{- a \cdot x - d(g)} \, \exp {\left( -\frac{(2h-d(g))^2}{4h} \right) } \, r^{-n-3/2 }. \end{aligned}$$

Integrating and applying again (6.1), we arrive at

$$\begin{aligned}&(-1)^k \, \int _{d(g)/4}^{d(g)} h^{k/2 - 1}\, (a \cdot {\mathrm{X}})^k \, p_h^{(v)}(g, o) \, dh\\&\quad \sim \,\, e^{-2r}\, r^{-n-3/2 } \, \int _{d(g)/4}^{d(g)} h^{k/2-1}\, \exp {\left( -\frac{(2h-d(g))^2}{4h} \right) } \, dh \sim \,\, e^{-2r}\, r^{k/2-n - 2 }, \end{aligned}$$

holding for all \(g \in \Omega _r\). Lemma 8 is proved. \(\square \)
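The last integral in h is of Laplace type: the exponential is concentrated where \(|2h - d(g)| \lesssim \sqrt{d(g)}\), which suggests the size \(d(g)^{k/2-1} \cdot d(g)^{1/2}\) and hence the total order \(e^{-2r}\, r^{-n-3/2}\, r^{(k-1)/2} = e^{-2r}\, r^{k/2-n-2}\). The following Python sketch is a numerical sanity check of this power count and not part of the proof. The function names, the step count and the sample values of d and k are choices made here for illustration, and the comparison value \(2^{1-k/2}\sqrt{\pi /2}\) comes from a Gaussian approximation around \(h = d/2\).

```python
import math


def h_integral(d, k, steps=20_000):
    """Midpoint-rule value of the integral of h^(k/2 - 1) * exp(-(2h - d)^2 / (4h)) over d/4 < h < d."""
    lo, hi = d / 4.0, d
    dh = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        h = lo + (i + 0.5) * dh
        total += h ** (k / 2.0 - 1.0) * math.exp(-((2.0 * h - d) ** 2) / (4.0 * h))
    return total * dh


def normalized(d, k):
    """The integral divided by the predicted power d^((k - 1) / 2)."""
    return h_integral(d, k) / d ** ((k - 1) / 2.0)


def gaussian_prediction(k):
    """Limit of normalized(d, k) suggested by the Gaussian approximation near h = d/2."""
    return 2.0 ** (1.0 - k / 2.0) * math.sqrt(math.pi / 2.0)


# Illustrative sample values: the ratios should be close to 1 if the power count is right.
ratios = {(d, k): normalized(d, k) / gaussian_prediction(k)
          for d in (100.0, 400.0) for k in (2, 3, 4)}
```

The case \(k = 2\) corresponds to the integral without a power of h, which has size \(d(g)^{1/2}\); this is the count used in Sect. 5 to pass from \(d(g)^{-5/2}\) to \(d(g)^{-2}\).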

6.2 Counterexample for Theorem 3

As in the preceding subsection, we consider \((a \cdot {\mathrm{X}})^k \,p_h^{(v)}(g, o)\) with \(g \in \Omega _r\). But we fix \(h=d(g)/2\) instead of integrating in h, and there is now a factor \(h^{k/2}\) that replaces \(h^{k/2 - 1}\). Instead of the estimate in Lemma 8, we now get for \(g \in \Omega _r\)

$$\begin{aligned} \left| h^{k/2} \, (a \cdot {\mathrm{X}})^k \,p_h^{(v)}(g, o)\big |_{h=d(g)/2}\right| \sim e^{- a \cdot x-d(g)}\, r^{k/2-n-3/2}\sim e^{-2r}\, r^{k/2-n-3/2}. \end{aligned}$$

As \(r \rightarrow +\infty \), this contradicts the weak type (1, 1) inequality for \({\mathcal {H}}_k\) when \(k\ge 2\).

The proof of Theorem 3 is complete.