1 Introduction

Time-dependent parabolic equations and, in particular, the problem of finding upper and lower bounds for their fundamental solutions have attracted considerable attention in recent years (see e.g. [5, 12–15, 31] and the monographs by Stroock and Varadhan [27] and van Casteren [4]). The aim of this paper is to obtain a Gaussian-type upper bound for the transition kernel of a particular kind of diffusion process (evolution) on a nilpotent meta-abelian group N. The type of evolution equation considered here comes from the study of the heat equation on a class of solvable Lie groups, the so-called higher rank NA groups, which are, by definition, semi-direct products of a nilpotent group and an abelian group of dimension greater than 1 (more on that in Sect. 1.4).

1.1 Our setting

In what follows we assume that the group N is meta-abelian

$$\begin{aligned} N=M\rtimes V, \end{aligned}$$

where M and V are abelian Lie groups with the corresponding Lie algebras \(\mathfrak m\) and \(\mathfrak v.\) We consider a family of automorphisms \(\{\Phi (a)\}_{a\in \mathbb {R}^k}\) of \(\mathfrak n,\) the Lie algebra of N, that leaves \(\mathfrak m\) and \(\mathfrak v\) invariant, where \(a\mapsto \Phi (a)\) is a homomorphism of \(\mathbb {R}^k\) into \(\text {Aut}(\mathfrak n)\). Let \(\mathfrak m\) and \(\mathfrak v\) be spanned, respectively, by \(\{Y_1,\ldots ,Y_{d_1}\}\) and \(\{X_1,\ldots ,X_{d_2}\}.\) We use these bases to identify \(\mathfrak m\) and \(\mathfrak v\) with \(\mathbb {R}^{d_1}\) and \(\mathbb {R}^{d_2}\) respectively. We also use the exponential mapping to identify M and V with \(\mathfrak m\) and \(\mathfrak v,\) and thus with \(\mathbb {R}^{d_1}\) and \(\mathbb {R}^{d_2}\) respectively. For \(x\in N\) we write \(x=m(x)v(x)=mv=(m,v),\) where \(m(x)=m\in M\) and \(v(x)=v\in V\) denote the components of x in \(M\rtimes V.\)

Now we consider the action of an abelian Lie group \(A=\mathbb {R}^k\) on N. We obtain a semi-direct product \(S=N\rtimes A=N\rtimes \mathbb R^k\) with the multiplication in S given by

$$\begin{aligned} (x,a)(y,b)=(xy^a,a+b), \end{aligned}$$

where, for \(x=\exp X,\) \(X\in \mathfrak n,\) the action of \(a\in A=\exp \mathfrak a=\mathbb R^k\) on N is defined as

$$\begin{aligned} x^a=\exp (\Phi (a)X). \end{aligned}$$

The group S is a solvable Lie group. The rank of S is, by definition, equal to \(\dim A.\) Similarly, for \(g\in S\) we write \(g=x(g)a(g)=xa=(x,a),\) where \(x(g)=x\in N\) and \(a(g)=a\in A\) denote the components of g in \(N\rtimes A.\) In what follows we identify the group A,  its Lie algebra \(\mathfrak {a},\) and \(\mathfrak {a}^*,\) the space of linear forms on \(\mathfrak {a},\) with the Euclidean space \(\mathbb {R}^k\) endowed with the usual scalar product \(\langle \cdot ,\cdot \rangle \) and the corresponding norm \(\Vert a\Vert =\langle a,a\rangle ^{1\slash 2}.\) By \(\Vert \cdot \Vert _\infty \) we denote the maximum norm \(\Vert a\Vert _\infty =\max _{1\le j\le k}|a_j|.\)
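For instance, for \(k=1,\) abelian \(N=\mathbb {R}^{d},\) and \(\Phi (a)=e^{a}\,\mathrm {Id},\) the group S is the NA group underlying real hyperbolic space; the groups considered in this paper generalize this classical picture to \(\dim A>1\) and meta-abelian N.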

Let \(\sigma \) be a continuous function from \([0,+\infty )\) to \(A=\mathbb {R}^k,\) and denote

$$\begin{aligned} \Phi ^\sigma (t)=\Phi (\sigma (t)). \end{aligned}$$

We assume also that

  1. (A1)

    in the \(\{Y_i\}_{1\le i\le d_1}\) basis on \(\mathfrak m,\) \({{\mathrm{ad}}}_X\) is lower triangular for all \(X\in \mathfrak v\) and

  2. (A2)

    the restriction \(S^\sigma \) of \(\Phi ^\sigma \) to M considered as a linear operator on \(\mathfrak m\) is given in the \(\{Y_i\}_{1\le i\le d_1}\) basis by a \(d_1\times d_1\) lower triangular matrix:

    $$\begin{aligned} S^\sigma (t)=\Phi ^\sigma (t)|_M=[s^\sigma _{ij}]_{1\le i,j\le d_1}. \end{aligned}$$

    Specifically, for \(i\ge j,\)

    $$\begin{aligned} s_{ij}^\sigma (u)=h_{ij}^M(\sigma (u))e^{\xi _j(\sigma (u))}, \end{aligned}$$

    where \(h_{ij}^M\in \mathbb {R}[a_1,\ldots ,a_k]\) are polynomials in \(a\in A=\mathbb {R}^k\) with \(h_{jj}^M=1,\) for \(1\le j\le d_1,\) and \(\xi _1,\ldots ,\xi _{d_1}\in A^*=(\mathbb {R}^k)^*.\)

  3. (A3)

    The matrix

    $$\begin{aligned} T^\sigma (t)=\Phi ^\sigma (t)|_V=[t^\sigma _{ij}]_{1\le i,j\le d_2} \end{aligned}$$

    is \(d_2\times d_2\) lower triangular and, for \(i\ge j,\)

    $$\begin{aligned} t_{ij}^\sigma (u)=h_{ij}^V(\sigma (u))e^{\vartheta _j(\sigma (u))}, \end{aligned}$$

    where \(h_{ij}^V\in \mathbb {R}[a_1,\ldots ,a_k]\) are polynomials in \(a\in A=\mathbb {R}^k\) with \(h_{jj}^V=1,\) for \(1\le j\le d_2,\) and \(\vartheta _1,\ldots ,\vartheta _{d_2}\in A^*=(\mathbb {R}^k)^*.\)
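The following minimal example, which the reader may check directly, illustrates (A1)–(A3) with a genuinely non-diagonal action. Let \(\mathfrak n\) be the three-dimensional Heisenberg algebra spanned by \(X,Y,Z\) with \([X,Y]=Z,\) and put \(\mathfrak m=\mathrm {span}\{Y,Z\},\) \(\mathfrak v=\mathrm {span}\{X\}.\) In the basis \((Y_1,Y_2)=(Y,Z)\) the operator \({{\mathrm{ad}}}_X|_{\mathfrak m}\) is lower triangular, so (A1) holds. Take \(k=1\) and \(\Phi (a)=\exp (aD),\) where D is the derivation of \(\mathfrak n\) determined by \(DY=Y+Z,\) \(DZ=Z,\) \(DX=0.\) Then

$$\begin{aligned} S^\sigma (t)=\Phi ^\sigma (t)|_{\mathfrak m}=e^{\sigma (t)}\begin{bmatrix} 1&\quad \! 0\\ \sigma (t)&\quad \! 1 \end{bmatrix},\qquad T^\sigma (t)=\Phi ^\sigma (t)|_{\mathfrak v}=[1], \end{aligned}$$

so (A2) holds with \(\xi _1=\xi _2=a\) and \(h^M_{21}(a)=a,\) while (A3) holds with \(\vartheta _1=0.\) The polynomial factor \(h^M_{21}\) is precisely the non-diagonal phenomenon absent from the diagonal case of [23].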

1.2 Evolution kernel

Let, for \(Z\in \mathfrak n,\)

$$\begin{aligned} Z(t)=\Phi ^\sigma (t)Z. \end{aligned}$$

Let

$$\begin{aligned} \mathcal L_N^\sigma (t)=\sum _{i=1}^{d_2}X_i(t)^2+\sum _{j=1}^{d_1}Y_j(t)^2. \end{aligned}$$

Now we consider the evolution process generated by \(\mathcal L_N^\sigma (t).\) By C(N) we denote the set of continuous functions on N. Let

$$\begin{aligned} C_\infty (N)=\left\{ f\in C(N):\lim _{x\rightarrow \infty }f(x)\text { exists}\right\} . \end{aligned}$$

Let \(d=\dim \mathfrak n.\) For \(X\in \mathfrak {n},\) we let \(\tilde{X}\) denote the corresponding right-invariant vector field. For a multi-index \(I=(i_1,\ldots ,i_{d}),\) \(i_j\in \mathbb {Z}^+,\) and a basis \(X_1,\ldots ,X_d\) of the Lie algebra \(\mathfrak n\) we write \(X^I=X_1^{i_1}\cdots X_d^{i_d}.\) For \(\kappa ,\ell =0,1,2,\ldots ,\infty \) we define

$$\begin{aligned} C^{(\kappa ,\ell )}(N)=\{f:\tilde{X}^IX^Jf\in C_\infty (N)\quad \text {for every }|I|<\kappa +1\text { and }|J|<\ell +1\} \end{aligned}$$

and

$$\begin{aligned} \Vert f\Vert _{(\kappa ,\ell )}^0&=\sup _{|I|=\kappa ,|J|=\ell }\Vert \tilde{X}^IX^Jf\Vert _\infty ,\\ \Vert f\Vert _{(\kappa ,\ell )}&=\sup _{|I|\le \kappa ,|J|\le \ell }\Vert \tilde{X}^IX^Jf\Vert _\infty . \end{aligned}$$

In particular \(C^{(0,2)}(N)\) with the norm \(\Vert f\Vert _{(0,2)}\) is a Banach space. It is known (see [4, 19, 28]) that there exists a unique family of bounded operators \(U^\sigma _{s,t}\) on \(C_\infty (N)\) which satisfies

  1. (i)

    \(U^\sigma _{s,s}=\mathrm {Id},\) for all \(s\ge 0,\)

  2. (ii)

    \(\lim _{h\rightarrow 0}U^\sigma _{s,s+h}f=f\) in \(C_\infty (N),\)

  3. (iii)

    \(U^\sigma _{s,r}U^\sigma _{r,t}=U^\sigma _{s,t},\) \(0\le s\le r\le t,\)

  4. (iv)

    \(\partial _sU^\sigma _{s,t}f=-\mathcal L^\sigma _N(s) U^\sigma _{s,t}f\) for every \(f\in C^{(0,2)}(N),\)

  5. (v)

    \(\partial _tU^\sigma _{s,t}f=U^\sigma _{s,t}\mathcal L^\sigma _N(t)f\) for every \(f\in C^{(0,2)}(N),\)

  6. (vi)

    \(U^\sigma _{s,t}:C^{(0,2)}(N)\rightarrow C^{(0,2)}(N)\) for all \(s\le t.\)

The family \(U^\sigma _{s,t}\) is called the evolution generated by \(\mathcal L_N^\sigma (t).\) By \(P^\sigma _{t,s}\) we denote the corresponding kernel

$$\begin{aligned} U^\sigma _{s,t}f(x)=\int _{N}P^\sigma _{t,s}(x;y)f(y)dy. \end{aligned}$$

Since \(\mathcal L^\sigma _N(t)\) commutes with left translations, so does \(U^\sigma _{s,t}.\) Hence,

$$\begin{aligned} P^\sigma _{t,s}(x;y)=P^\sigma _{t,s}(e;x^{-1}y). \end{aligned}$$

With a small abuse of notation we write

$$\begin{aligned} P^\sigma _{t,s}(x)=P^\sigma _{t,s}(e;x). \end{aligned}$$

Hence, the operator \(U^\sigma _{s,t}\) is a convolution operator with a probability measure (with a smooth density) \(P^\sigma _{t,s},\)

$$\begin{aligned} U^\sigma _{s,t}f=f*P^\sigma _{t,s}. \end{aligned}$$

We call \(P^\sigma _{t,s}(x)\) or \(P^\sigma _{t,s}(x;y)\) the evolution kernel. Sometimes \(P^\sigma _{t,s}(x;y)\) is called the transition kernel since, in probabilistic terms, \(P^\sigma _{t,s}(x;y)\) is the transition kernel of the time-dependent Markov process (or evolution) \(\omega (t)\) on N defined by the operator \(\mathcal L^{\sigma }_N(t).\) The probability that, starting from x at time s, the process \(\omega (t)\) is in a given set \(B\subset N\) is

$$\begin{aligned} \mathbf{P}_{s,x}(\omega (t)\in B)=\int _BP^\sigma _{t,s}(x;y)dy. \end{aligned}$$

By (iii), for \(s\le r\le t,\)

$$\begin{aligned} P^\sigma _{t,r}*P^\sigma _{r,s}=P^\sigma _{t,s}. \end{aligned}$$

1.3 Main result

Our aim is to estimate the evolution kernel \(P^\sigma _{t,s}.\) In order to do this, we first disintegrate the process \(\omega (t)\) into the corresponding processes on M and V respectively. Specifically, let

$$\begin{aligned} \mathcal L_M^\sigma (t)=\sum _{j=1}^{d_1}Y_j(t)^2\quad \text {and}\quad \mathcal L_V^\sigma (t)=\sum _{j=1}^{d_2}X_j(t)^2 \end{aligned}$$
(1.1)

thought of as operators on M and V respectively.

For \(v\in V,\) let

$$\begin{aligned} \mathcal L_M^\sigma (t)^v=\sum _{j=1}^{d_1}({{\mathrm{Ad}}}(v)Y_j(t))^2. \end{aligned}$$
(1.2)

Then the operator \(\mathcal L^\sigma _N(t)\) is the skew-product of the above defined operators, i.e.,

$$\begin{aligned} \mathcal L_N^{\sigma }(t)f(m,v)=\mathcal L_V^\sigma (t)f(m,\cdot )|_v+\mathcal L_M^\sigma (t)^v f(\cdot ,v)|_m ,\quad t\in \mathbb {R}^+. \end{aligned}$$

The time-dependent family of operators \(\mathcal L_V^{\sigma }(t)\) gives rise to an evolution on \(V=\mathbb {R}^{d_2}\) that is described by a kernel \(P^{V,\sigma }_{t,s}\) which may be explicitly computed, since V is abelian. For \(\eta \in C([0,+\infty ), V)\) let

$$\begin{aligned} \mathcal L_M^\sigma (t)^\eta =\sum _{j=1}^{d_1}({{\mathrm{Ad}}}(\eta (t))Y_j(t))^2. \end{aligned}$$

This family of operators gives rise to an evolution on \(M=\mathbb {R}^{d_1}\) that is described by a kernel \(P^{M,\sigma ,\eta }_{t,s}\) which may also be explicitly computed (see Sect. 4).
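Concretely, anticipating Proposition 2.4 below and the notation of Sects. 4 and 5, both kernels are Gaussian; for instance,

$$\begin{aligned} P^{V,\sigma }_{t,s}(v)=(2\pi )^{-\frac{d_2}{2}}(\det A^\sigma _V(s,t))^{-\frac{1}{2}}e^{-\frac{1}{2}A^\sigma _V(s,t)^{-1}v\cdot v},\quad \text {where }A^\sigma _V(s,t)=2\int _s^tT^\sigma (u)T^\sigma (u)^*du, \end{aligned}$$

and \(P^{M,\sigma ,\eta }_{t,s}\) has the analogous form with the covariance matrix \(A^{\sigma ,\eta }_M(s,t)\) of Sect. 4.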

One of our main tools is the following skew-product formula for \(P^\sigma _{t,s}\) (which can be proved along the lines of [23, Theorem 1.2], where the diagonal action of A on N was considered).

Theorem 1.1

For \(m\in M\) and \(v\in V,\)

$$\begin{aligned}&\int _NP^{\sigma }_{t,s}(m,v;m^\prime ,v^\prime )f(m^\prime ,v^\prime ) dm^\prime dv^\prime \\&\quad =\int \int _M P^{M,\sigma ,\eta }_{t,s}(m;m^\prime )f\left( m^\prime ,\eta (t)\right) dm^\prime d\mathbf W^{V,\sigma }_{s,v}(\eta ) \end{aligned}$$

where \(\mathbf W^{V,\sigma }_{s,v}\) is the probability measure on the space \(C([s,+\infty ),V)\) generated by the diffusion process \(\eta (t)\) starting from \(v\in V\) at time s,  with the generator \(\mathcal L_V^\sigma (t).\)

A difficulty in applying the above formula is that the process \(\eta (t)\) does not have independent coordinates. This difficulty is overcome with the help of Proposition 3.1, which gives an estimate for the joint distribution of \(\sup _{u\in [s,t]}\Vert \eta (u)\Vert _\infty \) and the position \(\eta (t)\) of the process at time t. This makes the computations quite involved.

In order to state our main theorem we need to introduce some notation. Let, for \(1\le j\le d_2,\)

$$\begin{aligned} V_j(\tau ,t)=\max _{s\in [\tau ,t]}(A_V^\sigma (\tau ,s)-A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}A_V^\sigma (\tau ,s))_{jj}, \end{aligned}$$
(1.3)

where

$$\begin{aligned} A^\sigma _V(\tau ,s)=2\int _\tau ^sT^\sigma (u)T^\sigma (u)^*du. \end{aligned}$$
(1.4)

Set

$$\begin{aligned} \begin{aligned} \mathcal S(\tau ,t)&=\sum _{i\ge j}\int _\tau ^t|s_{ij}^\sigma (u)|^2du,&\mathcal S_{\Pi }(\tau ,t)&=\prod _{j=1}^{d_1}\int _\tau ^te^{2\xi _j(\sigma (u))}du,\\ \mathcal T(\tau ,t)&=\sum _{i\ge j}\int _\tau ^t|t_{ij}^\sigma (u)|^2du,&\mathcal T_{\Pi }(\tau ,t)&=\prod _{j=1}^{d_2}\int _\tau ^te^{2\vartheta _j(\sigma (u))}du,\\ \mathcal V(\tau ,t)&=\sum _{j=1}^{d_2}V_j(\tau ,t). \end{aligned} \end{aligned}$$
(1.5)
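Roughly speaking, \(\mathcal S\) and \(\mathcal T\) control the operator norms of the covariance matrices of the evolutions on M and V [see Lemma 4.4 and (5.2)], \(\mathcal S_{\Pi }\) and \(\mathcal T_{\Pi }\) control their determinants [see Lemma 4.3 and (5.1)], while \(\mathcal V\) controls the fluctuations of the process on V conditioned on its endpoint (see Proposition 3.1).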

The main result is the following estimate.

Theorem 1.2

For every \(T>0\) there are positive constants \(c_1,c_2,c_3\) and a natural number \(k_o\) such that for all \(T\ge t\ge \tau \ge 0\) and all \((m,v)\in N,\)

$$\begin{aligned}&P^\sigma _{t,\tau }(m,v)\nonumber \\&\quad \le c_1\frac{\tilde{\Theta }(\tau ,t,v)-\Vert v\Vert _\infty +2}{\mathcal S_{\Pi }(\tau ,t)^{1\slash 2}\mathcal T_{\Pi }(\tau ,t)^{1\slash 2}} \exp \left( -\frac{c_2\Vert v\Vert ^2}{\mathcal T(\tau ,t)}-\frac{c_3\Vert m\Vert ^2}{(\tilde{\Theta }(\tau ,t,v)+1)^{2k_o}\mathcal S(\tau ,t)}\right) \nonumber \\&\quad \quad +c_1\frac{\Vert m\Vert ^\frac{1}{2k_o}}{\mathcal S_{\Pi }(\tau ,t)^{1\slash 2}\mathcal T_{\Pi }(\tau ,t)^{1\slash 2}}\exp \left( -\frac{c_2\Vert v\Vert ^2}{\mathcal T(\tau ,t)}-\frac{c_3\Vert m\Vert ^2}{(\tilde{\Theta }(\tau ,t,v)+1+\Vert m\Vert ^\frac{1}{2k_o})^{2k_o}\mathcal S(\tau ,t)}\right) \nonumber \\&\quad \quad +c_1\mathcal S_{\Pi }(\tau ,t)^{-1\slash 2}\mathcal T_{\Pi }(\tau ,t)^{-1\slash 2}\mathcal V(\tau ,t)^{1\slash 2}\exp \left( -\frac{c_2\Vert v\Vert ^2}{\mathcal T(\tau ,t)}-\frac{\Vert m\Vert ^{1\slash k_o}}{2\mathcal V(\tau ,t)}\right) , \end{aligned}$$
(1.6)

where

$$\begin{aligned} \Theta (\tau ,t,v)=\max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}v\Vert _\infty , \end{aligned}$$
(1.7)

and

$$\begin{aligned} \tilde{\Theta }(\tau ,t,v)=\Theta (\tau ,t,v)+C\sum _{j=1}^{d_2}V_j(\tau ,t)^{1\slash 2}. \end{aligned}$$
(1.8)

Remark

In Sect. 7 we give explicit estimates for the quantities \(\tilde{\Theta }(\tau ,t,v)\) and \(\mathcal V(\tau ,t).\)

Remark

Gaussian estimates in \(\mathbb {R}^n\) for fundamental solutions of time-dependent parabolic equations are usually obtained under the assumption that the operator is (uniformly) elliptic (see e.g. the classical papers by Aronson [2] and Fabes and Stroock [11]). We do not require this condition, and our estimate depends explicitly on the coefficients of the operator.

Remark

If the action of A on N is diagonal, i.e., the polynomials in the entries of the matrices \(S^\sigma (t)\) and \(T^\sigma (t)\) [see the assumptions (A2) and (A3)] satisfy \(h_{ij}^M=h_{ij}^V=0\) for \(i\not =j,\) then all the quantities appearing in Theorem 1.2 can easily be computed. We get

$$\begin{aligned} V_j(\tau ,t)=\frac{1}{2}\int _\tau ^te^{2\vartheta _j(\sigma (u))}du,\quad \mathcal S(\tau ,t)=\sum _{j=1}^{d_1}\int _\tau ^te^{2\xi _j(\sigma (u))}du \end{aligned}$$

and

$$\begin{aligned} \mathcal V(\tau ,t)=\frac{1}{2}\sum _{j=1}^{d_2}\int _\tau ^te^{2\vartheta _j(\sigma (u))}du\quad \text {and}\quad \mathcal T(\tau ,t)=2\,\mathcal V(\tau ,t). \end{aligned}$$

Finally,

$$\begin{aligned} \Theta (\tau ,t,v)=\max _{s\in [\tau ,t]}\max _{1\le j\le d_2} \frac{\int _\tau ^se^{2\vartheta _j(\sigma (u))}du}{\int _\tau ^te^{2\vartheta _j(\sigma (u))}du}\,|v_j| =\Vert v\Vert _\infty . \end{aligned}$$

In this setting Theorem 1.2 simplifies and we obtain [23, Theorem 4.1].

1.4 Applications

Since the estimate given by Theorem 1.2 may at first glance seem quite technical and complicated, it is worth explaining why it is important and where it can be used. First of all, the estimate for \(P^\sigma _{t,s}\) given by Theorem 1.2 can be applied in the analysis of left-invariant, second-order differential operators on the higher rank NA groups, i.e., the semi-direct products \(N\rtimes \mathbb {R}^k\) as described above (at this point we do not assume that \(N=M\rtimes V\)). Consider, for \(\alpha =(\alpha _1,\ldots ,\alpha _k)\in \mathbb {R}^k,\) the left-invariant differential operator of the form

$$\begin{aligned} \mathcal {L}_{\alpha }=\sum _{j=1}^{d_2}X_j(a)^2 +\sum _{j=1}^{d_1}Y_j(a)^2+\Delta _\alpha , \end{aligned}$$
(1.9)

where

$$\begin{aligned} \Delta _\alpha =\sum _{j=1}^k(\partial _{a_j}^2-2\alpha _j\partial _{a_j}). \end{aligned}$$

In this setting the properties of bounded harmonic functions on S are certainly of interest. Under some assumptions on the drift vector \(\alpha \) there exists a Poisson kernel \(\nu \) for \(\mathcal L_\alpha \) [6, 7]. That is, there is a \(C^\infty \) function \(\nu \) on N such that every bounded \(\mathcal L_\alpha \)-harmonic function F on S may be written as a Poisson integral against a bounded function f on \(S\slash A=N,\)

$$\begin{aligned} F(g)=\int _{S\slash A}f(gx)\nu (x)dx=\int _Nf(x)\check{\nu }^a(x^{-1}x_{o})dx,\quad g=(x_o,a), \end{aligned}$$

where

$$\begin{aligned} \check{\nu }^a(x)=\nu (a^{-1}x^{-1}a)\chi (a)^{-1}, \end{aligned}$$

and \(\chi \) is the modular function for the left-invariant Haar measure on S,  i.e.,

$$\begin{aligned} \chi (g)=\det ({{\mathrm{Ad}}}(g)). \end{aligned}$$
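When \(N=M\rtimes V\) satisfies (A2) and (A3), \(\chi \) can be computed explicitly: \({{\mathrm{Ad}}}(x)\) is unipotent for \(x\in N\) (as N is nilpotent), so \(\det {{\mathrm{Ad}}}(x)=1,\) while \({{\mathrm{Ad}}}(a)|_{\mathfrak n}=\Phi (a)\) and \({{\mathrm{Ad}}}(a)|_{\mathfrak a}=\mathrm {Id}.\) Since \(S^\sigma \) and \(T^\sigma \) are lower triangular with diagonal entries \(e^{\xi _j}\) and \(e^{\vartheta _j},\) this gives

$$\begin{aligned} \chi (x,a)=\det \Phi (a)=\exp \Bigg (\sum _{j=1}^{d_1}\xi _j(a)+\sum _{j=1}^{d_2}\vartheta _j(a)\Bigg ),\quad (x,a)\in N\rtimes A. \end{aligned}$$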

Conversely, the Poisson integral of any \(f\in L^\infty (N)\) is a bounded \(\mathcal L_\alpha \)-harmonic function.

It is known that the Poisson kernel \(\nu \) is equal to \(\lim _{t\rightarrow \infty }\pi _N(\mu _t),\) where \(\mu _t\) is the semigroup of measures on S generated by \(\mathcal L_\alpha \) (so that the heat semigroup \(T_t=e^{t\mathcal L_\alpha }\) acts by \(T_tf=f*\mu _t\)) and \(\pi _N(g)=x(g)\) is the projection from S onto N. To get some information on \(\mu _t\) we use a well-known formula which expresses \(T_t\) as a skew-product of the diffusions on N and A. For \(f\in C_c(N\times \mathbb {R}^k)\) and \(t\ge 0,\)

$$\begin{aligned} T_tf(x,a)=\mathbf{E}_aU^\sigma _{0,t}f(x,\sigma _t)=\mathbf{E}_a(f*_{N}P^\sigma _{t,0})(x,\sigma _t), \end{aligned}$$
(1.10)

where the expectation \(\mathbf{E}\) is taken with respect to the distribution of the process \(\sigma _t\) (Brownian motion with drift) in \(\mathbb {R}^k\) generated by \(\Delta _\alpha .\) The operator \(U^\sigma _{0,t}\) acts on the first variable of the function f (as a convolution operator). The idea of such a decomposition goes back to [16, 17, 29]. In the context of NA groups with \(\dim A=1\) this decomposition was used in [7–10], and it was later generalized by the authors and applied for \(\dim A>1,\) see e.g. [20, 22]. Note that Theorem 1.1 is a generalization of (1.10) to evolution operators.

Estimates for the Poisson kernel of the operator (1.9) were obtained by the authors in a series of papers [20–24]. However, in all these papers the action of A on N is diagonal. Thus Theorem 1.2 opens the door to considering non-diagonal actions; this will be the subject of our future research.

1.5 Structure of the paper

The outline of the rest of the paper is as follows. In Sect. 2 we state the formula for the evolution kernel in \(\mathbb {R}^n\) and recall the Borell–TIS inequality, which is used in Sect. 3 in the proof of an appropriate estimate for \(\mathbf{P}\left( \sup _{s\in [\tau ,t]}\Vert \eta (s)\Vert _\infty \ge u\text { and }\eta (t)\in B\right) \) for \(u\in \mathbb {R}\) and \(B\subset \mathbb {R}^n.\) In Sects. 4 and 5 we study evolutions on M and V, respectively. Finally, in Sect. 6 we give the proof of Theorem 1.2, and in Sect. 7 we give some estimates for the quantities defined in (1.7) and (1.8).

2 Preliminaries

2.1 Gaussian variables and fields

We follow the presentation in [1]. For \(\mathbb {R}^n\)-valued random variables X and Y their covariance matrix is defined as \(\mathrm {Cov}(X,Y)=\mathbf{E}(X-\mathbf{E}X)(Y-\mathbf{E}Y)^t.\) An \(\mathbb {R}^n\)-valued random variable X is said to be multivariate Gaussian if for every non-zero \(\alpha =(\alpha _1,\ldots ,\alpha _n)\in \mathbb {R}^n,\) the real-valued random variable \(\langle \alpha ,X\rangle =\sum _{i=1}^n\alpha _iX_i\) is Gaussian. In this case, provided the covariance matrix is non-singular, the density of X is given by the multivariate normal density

$$\begin{aligned} (2\pi )^{-n\slash 2}(\det C)^{-1\slash 2}e^{-\frac{1}{2}C^{-1}(x-m)\cdot (x-m)}, \end{aligned}$$

where \(m=\mathbf{E}X\) and \(C=\mathrm {Cov}(X,X)\) is a positive semi-definite \(n\times n\) covariance matrix. In this case we write \(X\sim \mathcal N_n(m,C)\) or simply \(X\sim \mathcal N(m,C).\)

Lemma 2.1

Let \(X\sim \mathcal N_n(m,C).\) Assume that \(d<n\) and make the partition

$$\begin{aligned} X&=(X^1,X^2)=((X_1,\ldots ,X_d),(X_{d+1},\ldots ,X_n)),\\ m&=(m^1,m^2)=((m_1,\ldots ,m_d),(m_{d+1},\ldots ,m_n)) \end{aligned}$$

and

$$\begin{aligned} C=\begin{bmatrix} C_{11}&\quad \! C_{12}\\ C_{21}&\quad \! C_{22} \end{bmatrix}, \end{aligned}$$

where \(C_{11}\) is a \(d\times d\)-matrix. Then each \(X^i\sim \mathcal N(m^i,C_{ii})\) and, for \(i\ne j,\) the conditional distribution of \(X^i\) given \(X^j\) is also Gaussian, with mean vector

$$\begin{aligned} m_{i\mid j}=m^i+C_{ij}C_{jj}^{-1}(X^j-m^j) \end{aligned}$$

and covariance matrix

$$\begin{aligned} C_{i\mid j}=C_{ii}-C_{ij}C_{jj}^{-1}C_{ji}. \end{aligned}$$

Proof

See e.g. [1, p. 8]. \(\square \)
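For instance, in the bivariate case \(n=2,\) \(d=1,\) with \(m=0\) and

$$\begin{aligned} C=\begin{bmatrix} 1&\quad \! \rho \\ \rho &\quad \! 1 \end{bmatrix}, \end{aligned}$$

Lemma 2.1 gives the familiar formulas \(m_{1\mid 2}=\rho X^2\) and \(C_{1\mid 2}=1-\rho ^2,\) i.e., the conditional distribution of \(X^1\) given \(X^2\) is \(\mathcal N(\rho X^2,1-\rho ^2).\)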

A random field is a stochastic process, taking values in some space, usually a Euclidean space, and defined over a parameter space T. A real-valued Gaussian process is a random field f on a parameter set T for which the (finite-dimensional) distributions of \((f_{t_1},\ldots ,f_{t_n})\) are multivariate Gaussian for each \(1\le n<+\infty \) and each \((t_1,\ldots ,t_n)\in T^n.\)

2.2 Gaussian inequalities

The following powerful inequality was discovered independently, and proved in very different ways, by Borell [3] and Tsirelson et al. [30]. Following [1] we call it the Borell–TIS inequality.

Theorem 2.2

(Borell–TIS inequality) Let \(f_t\) be a centered Gaussian process, almost surely bounded on T. Write \(|f|_T=\sup _{t\in T}f_t.\) Then \(\mathbf{E}|f|_T<+\infty \) and, for all \(u>0,\)

$$\begin{aligned} \mathbf{P}(|f|_T-\mathbf{E}|f|_T>u)\le e^{-u^2\slash 2\sigma _T^2}, \end{aligned}$$

where

$$\begin{aligned} \sigma _T^2=\sup _{t\in T}\mathbf{E}f_t^2. \end{aligned}$$

Proof

For the proof see the original papers [3, 30] or [1]. \(\square \)

Immediately, we get the following

Corollary 2.3

Let \(f_t\) be a centered Gaussian process, almost surely bounded on T. Then for all \(u>\mathbf{E}|f|_T,\)

$$\begin{aligned} \mathbf{P}(|f|_T>u)\le e^{-(u-\mathbf{E}|f|_T)^2\slash 2\sigma _T^2}. \end{aligned}$$
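As a simple illustration, let \(f_t=B_t\) be standard Brownian motion on \(T=[0,t_0].\) Then \(\sigma _T^2=t_0\) and, by the reflection principle, \(|f|_T=\sup _{t\le t_0}B_t\) has the same law as \(|B_{t_0}|,\) so that \(\mathbf{E}|f|_T=\sqrt{2t_0\slash \pi }\) and Corollary 2.3 gives

$$\begin{aligned} \mathbf{P}\left( \sup _{t\le t_0}B_t>u\right) \le e^{-(u-\sqrt{2t_0\slash \pi })^2\slash 2t_0},\quad u>\sqrt{2t_0\slash \pi }, \end{aligned}$$

which exhibits the correct Gaussian decay in u.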

2.3 Evolution equation in \(\mathbb {R}^n\)

Let

$$\begin{aligned} L(t)=\frac{1}{2}\sum _{i,j=1}^na_{ij}(t)\partial _{i}\partial _j+\sum _{j=1}^n \delta _j(t)\partial _j, \end{aligned}$$
(2.1)

where \(\partial _i=\partial _{x_i},\) \(a(t)=[a_{ij}(t)]\) is a symmetric, positive definite matrix, and the \(a_{ij}\) and \(\delta _j\) belong to \( C([0,\infty ),\mathbb {R})\). For \(t>s\), let \(P_{t,s}\) be the evolution kernel generated by L(t). Let, for \(1\le i,j\le n,\)

$$\begin{aligned} A_{s,t}=[A_{ij}(s,t)]&=\left[ \displaystyle \int _s^ta_{ij}(u)du\right] ,\nonumber \\ D_{s,t}=[D_j(s,t)]&=\left[ \displaystyle \int _s^t\delta _j(u)du\right] . \end{aligned}$$
(2.2)

Proposition 2.4

The evolution kernel \(P_{t,s}\) corresponding to the operator L(t) defined in (2.1) is given by

$$\begin{aligned} P_{t,s}(x)=(2\pi )^{-\frac{n}{2}}(\det A_{s,t})^{-\frac{1}{2}}e^{-\frac{1}{2}(A_{s,t}^{-1}(x-D_{s,t}))\cdot (x-D_{s,t})}. \end{aligned}$$

Proof

See e.g. [23, Proposition 2.9] \(\square \)
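As a sanity check, take \(a_{ij}(t)=2\delta _{ij}\) and \(\delta _j(t)=-2\alpha _j\) constant, i.e., the operator \(\Delta _\alpha \) of (1.9) with \(n=k.\) Then \(A_{s,t}=2(t-s)\mathrm {Id}\) and \(D_{s,t}=-2(t-s)\alpha ,\) and Proposition 2.4 returns the classical Gaussian kernel with drift,

$$\begin{aligned} P_{t,s}(x)=(4\pi (t-s))^{-\frac{n}{2}}e^{-\Vert x+2(t-s)\alpha \Vert ^2\slash 4(t-s)}, \end{aligned}$$

which is the transition density of the process \(\sigma _t\) appearing in (1.10).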

3 Main probabilistic estimate

Consider the operator L(t),  defined in (2.1), without the drift vector \(\delta (t)=(\delta _1(t),\ldots ,\delta _n(t)),\) i.e.,

$$\begin{aligned} L(t)=\frac{1}{2}\sum _{i,j=1}^na_{ij}(t)\partial _{i}\partial _j. \end{aligned}$$
(3.1)

Let \(b_s=(b_s^1,\ldots ,b_s^n)\) be the stochastic process generated by the operator L(t). Define, for \(v\in \mathbb {R}^n,\)

$$\begin{aligned} B_\varepsilon (v)=\prod _{j=1}^nB_\varepsilon ^1(v_j)\quad \text {and}\quad B_\varepsilon ^1(v_j)=[v_j-\varepsilon \slash 2,v_j+\varepsilon \slash 2], \end{aligned}$$

and let \(\Vert \cdot \Vert _\infty \) denote the \(\ell ^\infty \)-norm on \(\mathbb {R}^n,\) i.e., for a vector \(y\in \mathbb {R}^n,\) \(\Vert y\Vert _\infty =\max _{1\le j\le n}|y_j|.\)

The distribution of the process \(b_t\) starting at time \(\tau \) from v, i.e., with \(b_\tau =v,\) is denoted by \(\mathbf{P}_{\tau ,v}(\cdot ).\) This is a probability measure on the space of trajectories \(C([0,\infty ),\mathbb {R}^n).\)

Proposition 3.1

Let \(b_t\) be the process generated by L(t) defined in (3.1). For every \(T>0\) there exists a constant \(C>0\) such that, for every \(\varepsilon >0,\) \(u\ge 0,\) \(v\in \mathbb {R}^n,\) and all \(T\ge t\ge \tau \ge 0,\) the following estimate holds,

$$\begin{aligned}&(\det A_{\tau ,t})^{\frac{1}{2}}\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\text { and }b_t\in B_\varepsilon (v)\right) \\&\quad \le C\int _{B_\varepsilon (v)}e^{-(u-\Theta (\tau ,t,a)-C\sum _{j=1}^{n}V_j(\tau ,t)^{1\slash 2})^2\slash 2\mathcal V(\tau ,t)}e^{-\frac{1}{2}A_{\tau ,t}^{-1}a\cdot a}da \end{aligned}$$

for all \(u>\sup _{a\in B_\varepsilon (v)}\Theta (\tau ,t,a)+C\sum _{j=1}^{n}V_j(\tau ,t)^{1\slash 2},\) where

$$\begin{aligned} \Theta (\tau ,t,a)=\sup _{s\in [\tau ,t]}\Vert A_{\tau ,s}A_{\tau ,t}^{-1}a\Vert _\infty \end{aligned}$$

and

$$\begin{aligned} V_j(\tau ,t)=\max _{s\in [\tau ,t]}(A_{\tau ,s}-A_{\tau ,s}A_{\tau ,t}^{-1}A_{\tau ,s})_{jj},\quad \mathcal V(\tau ,t)=\sum _{j=1}^nV_j(\tau ,t). \end{aligned}$$

Proof

Let \(\tau \) be fixed. To simplify notation we write, for \(s\ge \tau ,\)

$$\begin{aligned} A_s=[A_{ij}(\tau ,s)], \end{aligned}$$

where \([A_{ij}]\) is the matrix defined in (2.2). We write

$$\begin{aligned}&\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\text { and }b_t\in B_\varepsilon (v)\right) \nonumber \\&\quad =\int _{B_\varepsilon (v)}\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\mid b_t=a\right) P_{t,\tau }(a)da. \end{aligned}$$
(3.2)

Now we estimate the conditional probability under the integral sign. For fixed \(s\le t,\) consider the 2n-dimensional random vector

$$\begin{aligned} (b_s,b_t)=(b_s^1,\ldots ,b_s^n,b_t^1,\ldots ,b_t^n)\sim \mathcal N_{2n}(0,C). \end{aligned}$$

By Proposition 2.4, \(\mathrm {Cov}(b_s,b_s)=[A_{ij}(\tau ,s)]=A_s.\) Since the process \(b_s\) has independent increments we get

$$\begin{aligned} \mathrm {Cov}(b_s,b_t)=\mathrm {Cov}(b_s,b_s+b_t-b_s)=\mathrm {Cov}(b_s,b_s)=A_s. \end{aligned}$$

By \(\mathrm {Cov}(b_t,b_s)=\mathrm {Cov}(b_s,b_t)^t\) we get,

$$\begin{aligned} C=\begin{bmatrix}A_s&\quad \! A_s\\ A_s&\quad \! A_t \end{bmatrix}. \end{aligned}$$

By Lemma 2.1 the conditional distribution of \(b_s\) given \(b_t\) is Gaussian with mean vector

$$\begin{aligned} m_{s\mid t}=A_sA_t^{-1}b_t \end{aligned}$$

and covariance matrix

$$\begin{aligned} C_{s\mid t}=A_s-A_sA_t^{-1}A_s. \end{aligned}$$

Let \(b_s(a)=(b_s^1(a),\ldots ,b_s^n(a))\) denote the process whose distribution is the conditional distribution of \(b_s\) given \(b_t=a.\) In this notation

$$\begin{aligned} \mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\mid b_t=a\right) =\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|b_s^j(a)|>u\right) . \end{aligned}$$
(3.3)

Let

$$\begin{aligned} \tilde{b}_s(a):=b_s(a)-\mathbf{E}_{\tau ,0}b_s(a)=b_s(a)-A_sA_t^{-1}a. \end{aligned}$$

Clearly,

$$\begin{aligned} \tilde{b}_s(a)\sim \mathcal N_n(0,C_{s\mid t}). \end{aligned}$$
(3.4)

We continue (3.3) as follows.

$$\begin{aligned}&\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|b_s^j(a)|>u\right) \\&\quad =\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|\tilde{b}_s^j(a)+(A_sA_t^{-1}a)_j|>u\right) \\&\quad \le \mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|\tilde{b}_s^j(a)|+\sup _{s\in [\tau ,t]}\Vert A_sA_t^{-1}a\Vert _\infty >u\right) \\&\quad =\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|\tilde{b}_s^j(a)|>u-\Theta (\tau ,t,a)\right) . \end{aligned}$$

Thus, by symmetry of \(\tilde{b}_s(a)\) (see (3.4)),

$$\begin{aligned} \mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}|b_s^j(a)|>u\right) \le 2\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a)>u-\Theta (\tau ,t,a)\right) . \end{aligned}$$
(3.5)

Denote

$$\begin{aligned} \Xi (\tau ,t,a)&=\mathbf{E}_{\tau ,0}\sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a),\nonumber \\ \sigma _t^2&=\sup _{s\in [\tau ,t]}\max _{1\le j\le n}\mathbf{E}_{\tau ,0}(\tilde{b}_s^j(a))^2. \end{aligned}$$
(3.6)

By Corollary 2.3, with \(T=\{1,\ldots ,n\}\times [\tau ,t],\)

$$\begin{aligned} \mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a)>u-\Theta (\tau ,t,a)\right) \le e^{-(u-\Theta (\tau ,t,a)-\Xi (\tau ,t,a))^2\slash 2\sigma _t^2}, \end{aligned}$$

for all \(u>\Theta (\tau ,t,a)+\Xi (\tau ,t,a).\) Combining the above estimate with (3.3) and (3.5), and putting the resulting upper bound into (3.2), we get

$$\begin{aligned}&\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\text { and }b_t\in B_\varepsilon (v)\right) \nonumber \\&\quad \le 2\int _{B_\varepsilon (v)}e^{-(u-\Theta (\tau ,t,a)-\Xi (\tau ,t,a))^2\slash 2\sigma _t^2}P_{t,\tau }(a)da\nonumber \\&\quad \le C(\det A_t)^{-\frac{1}{2}}\int _{B_\varepsilon (v)}e^{-(u-\Theta (\tau ,t,a)-\Xi (\tau ,t,a))^2\slash 2\sigma _t^2}e^{-\frac{1}{2}A_t^{-1}a\cdot a}da, \end{aligned}$$
(3.7)

for all \(u>\sup _{a\in B_\varepsilon (v)}\Theta (\tau ,t,a)+\Xi (\tau ,t,a).\) We used Proposition 2.4 in the last inequality.

Let us estimate the quantities introduced in (3.6). The coordinate process \(\tilde{b}_s^j(a)\) is a Gaussian process and, by (3.4),

$$\begin{aligned} \tilde{b}_s^j(a)\sim \mathcal N_1(0,v_j(s)),\quad \text {where }v_j(s)=v_j(\tau ,s)=(A_s-A_sA_t^{-1}A_s)_{jj}. \end{aligned}$$
(3.8)

In this notation

$$\begin{aligned} V_j(\tau ,t)=\max _{s\in [\tau ,t]}v_j(s). \end{aligned}$$

It follows from [26, (1.1)] that for every \(\eta >0\) and \(T>0\) there is \(c=c_{\eta ,T}\) such that for every \(u>0\) and all \(T\ge t\ge \tau \ge 0,\)

$$\begin{aligned} \mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a)>u\right) \le c e^{-(1-\eta )u^2\slash 2\sigma _t^2}. \end{aligned}$$

Hence, taking \(\eta =1\slash 2,\)

$$\begin{aligned} \mathbf{E}_{\tau ,0}\sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a)= & {} \int _0^\infty \mathbf{P}_{\tau ,0}\left(\sup _{s\in [\tau ,t]}\max _{1\le j\le n}\tilde{b}_s^j(a)>u\right)du\\ {}\le & {} c\int _0^\infty e^{-u^2\slash 4\sigma _t^2}du\le c\int _\mathbb {R}e^{-u^2\slash 4\sigma _t^2}du \le c\sigma _t. \end{aligned}$$

Hence,

$$\begin{aligned} \Xi (\tau ,t,a)\le c\sigma _t. \end{aligned}$$

By (3.8),

$$\begin{aligned} \sigma _t^2=\sup _{s\in [\tau ,t]}\max _{1\le j\le n}v_j(\tau ,s)\le \sum _{j=1}^nV_j(\tau ,t)=\mathcal V(\tau ,t). \end{aligned}$$

Hence, for \(u>\sup _{a\in B_\varepsilon (v)}\Theta (\tau ,t,a)+C\sum _{j=1}^nV_j(\tau ,t)^{1\slash 2},\) we can rewrite (3.7) as follows,

$$\begin{aligned}&(\det A_t)^{\frac{1}{2}}\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\text { and }b_t\in B_\varepsilon (v)\right) \\&\quad \le C\int _{B_\varepsilon (v)}e^{-(u-\Theta (\tau ,t,a)-C\sum _{j=1}^nV_j(\tau ,t)^{1\slash 2})^2\slash 2\mathcal V(\tau ,t)}e^{-\frac{1}{2}A_t^{-1}a\cdot a}da. \end{aligned}$$

Hence, the result follows. \(\square \)

With the notation as in Proposition 3.1 we have immediately the following

Corollary 3.2

For every \(T>0\) there exists a constant \(C>0\) such that for all \(\varepsilon >0\) and all \(T\ge t\ge \tau \ge 0\) the following estimate holds,

$$\begin{aligned}&\varepsilon ^{-n}(\det A_{\tau ,t})^{\frac{1}{2}}\mathbf{P}_{\tau ,0}\left( \sup _{s\in [\tau ,t]}\Vert b_s\Vert _\infty >u\text { and }b_t\in B_\varepsilon (v)\right) \\&\quad \le C\sup _{a\in B_\varepsilon (v)}e^{-(u-\Theta (\tau ,t,a)-C\sum _{j=1}^nV_j(\tau ,t)^{1\slash 2})^2\slash 2\sum _{j=1}^nV_j(\tau ,t)}e^{-\frac{1}{2}A_{\tau ,t}^{-1}a\cdot a} \end{aligned}$$

for all \(u>\sup _{a\in B_\varepsilon (v)}\Theta (\tau ,t,a)+C\sum _{j=1}^nV_j(\tau ,t)^{1\slash 2}.\)

4 Evolution on M

We choose coordinates \(y_i\) for M for which \(Y_i\) corresponds to \(\partial _i=\partial _{y_i},\) \(1\le i\le d_1.\) Let \(\eta \in C([0,\infty ),V)\) and consider the evolution on M generated by the operator

$$\begin{aligned} \mathcal L_M^\sigma (t)^\eta =\sum _{j=1}^{d_1}({{\mathrm{Ad}}}(\eta (t))Y_j(t))^2. \end{aligned}$$

Then

$$\begin{aligned} {{\mathrm{Ad}}}(\eta (t))Y_j(t)={{\mathrm{Ad}}}(\eta (t))\Phi ^{\sigma }(t)Y_j=\sum _{k=1}^{d_1}\psi _{j,k}(t)Y_k, \end{aligned}$$

and consequently,

$$\begin{aligned} \mathcal L_M^\sigma (t)^\eta&=\sum _{j=1}^{d_1}\left( {{\mathrm{Ad}}}(\eta (t))Y_j(t)\right) ^2 =\sum _{k,l=1}^{d_1}\sum _{j=1}^{d_1}\psi _{k,j}(t)\psi _{l,j}(t)Y_kY_l\\&=\sum _{k,l=1}^{d_1}(\psi (t)\psi (t)^*)_{kl}Y_kY_l, \end{aligned}$$

where \(\psi (t)=[\psi _{i,j}(t)]\) is the matrix of \({{\mathrm{Ad}}}(\eta (t))\Phi ^\sigma (t)|_M.\) Thus the matrix \([a_{ij}]\) from (2.1) for the operator \(\mathcal L_M^\sigma (t)^\eta \) is

$$\begin{aligned} a^{\sigma ,\eta }_M(t)=2[{{\mathrm{Ad}}}(\eta (t))S^\sigma (t)][{{\mathrm{Ad}}}(\eta (t))S^\sigma (t)]^*, \end{aligned}$$

where the adjoint is in the \(y_j\) coordinates. Let

$$\begin{aligned} A^{\sigma ,\eta }_M(s,t)=\int _s^t a^{\sigma ,\eta }_M(u)\, du. \end{aligned}$$

For a \(d\times d\) invertible matrix A we set

$$\begin{aligned} B(A)(x)=\frac{1}{2} A^{-1}x\cdot x\quad \text {and}\quad \mathcal D(A)=(2\pi )^{-\frac{d}{2}}(\det A)^{-\frac{1}{2}}. \end{aligned}$$

It follows from Proposition 2.4 that the evolution kernel \(P^{M,\sigma ,\eta }_{t,s}\) for the operator \(\mathcal L_M^\sigma (t)^\eta \) is Gaussian, and in our notation, is given by

$$\begin{aligned} P^{M,\sigma ,\eta }_{t,s}(m)=\mathcal D(A^{\sigma ,\eta }_M(s,t))e^{-B(A^{\sigma ,\eta }_M(s,t))(m)},\quad m\in M=\mathbb {R}^{d_{1}} \end{aligned}$$

and the corresponding transition kernel is,

$$\begin{aligned} P^{M,\sigma ,\eta }_{t,s}(m;m^\prime )=P^{M,\sigma ,\eta }_{t,s}(m-m^\prime ),\quad m,m^\prime \in M=\mathbb {R}^{d_1}. \end{aligned}$$

For a matrix A, the operator norm, that is, the norm of A considered as a linear operator from \(\ell ^2(\mathbb {R}^n)\) to \(\ell ^2(\mathbb {R}^n),\) is denoted by \(\Vert A\Vert .\) We will need the following two simple lemmas.

Lemma 4.1

Let A be a positive semi-definite matrix. Then

$$\begin{aligned} B(A)(x)\ge \Vert x\Vert ^2\slash (2\Vert A\Vert ). \end{aligned}$$

Proof

See e.g. [22, Lemma 4.1] \(\square \)
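For completeness we recall the one-line argument: if A is positive definite, then \(A\le \Vert A\Vert \,\mathrm {Id}\) in the sense of quadratic forms, hence \(A^{-1}\ge \Vert A\Vert ^{-1}\,\mathrm {Id},\) and therefore \(B(A)(x)=\frac{1}{2}A^{-1}x\cdot x\ge \Vert x\Vert ^2\slash (2\Vert A\Vert ).\)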

Lemma 4.2

Let K and D be square matrices and let

$$\begin{aligned} A=\begin{bmatrix} K&\quad \! B\\ C&\quad \! D \end{bmatrix}. \end{aligned}$$

If \(\det K\not =0\) then \(\det A=\det K\det (D-CK^{-1}B).\)

Proof

See e.g. [32]. \(\square \)

Now we prove an upper bound on \(\mathcal {D}(A^{\sigma ,\eta }_M(s,t))\) that is independent of \(\eta ,\) generalizing [22, Lemma 4.2].

Lemma 4.3

There is a constant \(C>0\) such that

$$\begin{aligned} \mathcal {D}(A^{\sigma ,\eta }_M(s,t))\le C\left( \prod _{i=1}^{d_1}\int _s^ts_{ii}^\sigma (u)^2du\right) ^{-1\slash 2}, \end{aligned}$$

where \(s^\sigma _{ij}(t)\) are the entries of the matrix \(S^\sigma (t)\) defined in assumption (A2).

Proof

By the assumptions (A1) and (A2), the operator \({{\mathrm{ad}}}_X:\mathfrak m\rightarrow \mathfrak m\) is lower triangular for all \(X\in \mathfrak v\) and \(\Phi ^\sigma (t)|_M=S^\sigma (t),\) where \(S^\sigma (t)\) is a linear operator on \(\mathfrak m\) that is lower triangular in the \(Y_i\) basis. We omit the t and \(\sigma \) dependence for the sake of simplicity. In the coordinates defined by the \(Y_i\) basis,

$$\begin{aligned} {{\mathrm{ad}}}_X=\begin{bmatrix} X_o&\quad \! 0\\ v^\mathrm {t}&\quad \! 0 \end{bmatrix},\quad {{\mathrm{Ad}}}_x=e^{{{\mathrm{ad}}}_X}=\begin{bmatrix} e^{X_o}&\quad \! 0\\ v(X)^\mathrm {t}&\quad \! 1 \end{bmatrix},\quad \text {where } x=\exp X, \end{aligned}$$

where \(X_o\) is a \((d_1-1)\times (d_1-1)\) matrix and v is a \((d_1-1)\times 1\) column vector.

Then

$$\begin{aligned} {{\mathrm{Ad}}}_xS=e^{{{\mathrm{ad}}}_X}\begin{bmatrix}S_0&\quad \! 0\\ S_1&\quad \! s_{d_1d_1}\end{bmatrix}=\begin{bmatrix}e^{X_o}S_o&\quad \! 0\\ v(X)^\mathrm {t}S_o+S_1&\quad \! s_{d_1d_1}\end{bmatrix} =:\begin{bmatrix}e^{X_o}S_o&\quad \! 0\\ F^\mathrm {t}&\quad \! s_{d_1d_1}\end{bmatrix}. \end{aligned}$$
(4.1)

Then

$$\begin{aligned} {{\mathrm{Ad}}}_xS({{\mathrm{Ad}}}_xS)^\mathrm {t}=\begin{bmatrix} e^{X_o}S_oS_o^\mathrm {t}e^{X_o^\mathrm {t}}&\quad \! G\\ G^\mathrm {t}&\quad \! s_{d_1d_1}^2+|F|^2\end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} G=e^{X_o}S_oF=e^{X_o}S_o(S_o^\mathrm {t}v(X)+S_1^\mathrm {t}). \end{aligned}$$

Hence,

$$\begin{aligned} A^{\sigma ,\eta }(s,t)=\begin{bmatrix} A_o&\quad \! B\\ B^\mathrm {t}&\quad \! A+E \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} A_o&=\int _s^te^{X_o(u)}S_o(u)S_o(u)^\mathrm {t}e^{X_o(u)^\mathrm {t}}du,&\qquad B&=\int _s^t G(u)du,\\ A&=\int _s^ts_{d_1d_1}^2(u)du,&\qquad E&=\int _s^t|F(u)|^2du. \end{aligned}$$

From Lemma 4.2,

$$\begin{aligned} \det A^{\sigma ,\eta }(s,t)&=(\det A_o)(A+E-B^\mathrm {t}A_o^{-1}B)\\&=(\det A_o)A+(\det A_o)(E-B^\mathrm {t}A_o^{-1}B)\\&=(\det A_o)A+\det \begin{bmatrix} A_o&\quad \! B\\ B^\mathrm {t}&\quad \! E \end{bmatrix}. \end{aligned}$$

The determinant on the right is non-negative since it is the \(s_{d_1d_1}=0\) case of formula (4.1). Hence,

$$\begin{aligned} \det A^{\sigma ,\eta }_M(s,t)\ge A(\det A_o). \end{aligned}$$

Our result follows by induction. \(\square \)

Now we estimate the operator norm of the matrix

$$\begin{aligned} A^{\sigma ,\eta }_M(s,t)=2\int _s^t{{\mathrm{Ad}}}(\eta (u))S^\sigma (u)\left( {{\mathrm{Ad}}}(\eta (u))S^\sigma (u)\right) ^{\mathrm {t}}du. \end{aligned}$$
(4.2)

Recall that we assume that \(S^\sigma \) is lower triangular [assumption (A2)]. Specifically, for \(i\ge j,\)

$$\begin{aligned} s_{ij}^\sigma (u)=h_{ij}^M(\sigma (u))e^{\xi _j(\sigma (u))}, \end{aligned}$$

where \(h_{ij}^M\) are polynomials in \(a\in A=\mathbb {R}^k,\) and \(h_{jj}^M=1.\)

Lemma 4.4

Let \(\eta =\eta (u)=(\eta _1(u),\ldots ,\eta _{d_2}(u))\in \mathbb {R}^{d_2}=V\) be a continuous function. There exist constants \(C>0\) and \(k_o\in \mathbb {N},\) such that

$$\begin{aligned} \Vert A^{\sigma ,\eta }_M(s,t)\Vert \le C(1+\Lambda ^\eta (s,t))^{2k_o}\sum _{i\ge j}\int _s^t|s_{ij}^\sigma (u)|^2du, \end{aligned}$$

where

$$\begin{aligned} \Lambda ^\eta (s,t)=\sup _{s\le u\le t}\Vert \eta (u)\Vert _\infty . \end{aligned}$$

Proof

We note first that, for \(x=\exp X,\) \(X\in \mathfrak {n},\)

$$\begin{aligned} {{\mathrm{Ad}}}_x\big |_{\mathfrak m}=e^{{{\mathrm{ad}}}_X}\big |_{\mathfrak m}=\sum _{\ell =0}^{k_o}\left( {{\mathrm{ad}}}_X\big |_{\mathfrak m}\right) ^\ell \slash \ell !, \end{aligned}$$

where the sum terminates because \({{\mathrm{ad}}}_X\) is nilpotent. Hence,

$$\begin{aligned} \Vert {{\mathrm{Ad}}}_x\big |_{\mathfrak m}\Vert \le C(1+\Vert {{\mathrm{ad}}}_X\Vert )^{k_o}\le C'(1+\Vert X\Vert )^{k_o}. \end{aligned}$$

Since all norms on a finite dimensional vector space are equivalent, we get

$$\begin{aligned} \Vert S^\sigma (u)\Vert ^2\le C\sum _{i\ge j}|s_{ij}^\sigma (u)|^2. \end{aligned}$$
(4.3)

Our result follows by bringing the norm inside the integral in (4.2). \(\square \)

5 Evolution on V

Now we consider the evolution process \(\eta (t)\) on V generated by

$$\begin{aligned} \mathcal L_V^\sigma (t)=\sum _{j=1}^{d_2}X_j(t)^2=\sum _{j=1}^{d_2}(T^\sigma (t)X_j)^2 \end{aligned}$$

(see (1.1)). The matrix \(a_t=[a_{ij}(t)]\) defined in (2.1) is equal to

$$\begin{aligned} a^\sigma _V(t)=2T^\sigma (t)T^\sigma (t)^*. \end{aligned}$$

Let

$$\begin{aligned} A^\sigma _V(s,t)=\int _s^ta^\sigma _V(u)du. \end{aligned}$$

One of the differences between this setting and the diagonal case considered in [23] is that the coordinates \(\eta _j(t),\) \(j=1,\ldots ,d_2,\) of the process \(\eta (t)\) are no longer independent.

Notice that exactly in the same way as in the proof of Lemma 4.3 we can show that

$$\begin{aligned} (\det A^\sigma _V(s,t))^{-1\slash 2}\le C\left( \,\prod _{j=1}^{d_2}\int _s^te^{2\vartheta _j(\sigma (u))}du\right) ^{-1\slash 2}. \end{aligned}$$
(5.1)

We have the following inequality, analogous to (4.3),

$$\begin{aligned} \Vert T^\sigma (u)\Vert ^2\le C\sum _{i\ge j}|t_{ij}^\sigma (u)|^2 \end{aligned}$$

which implies (as in Lemma 4.4)

$$\begin{aligned} \Vert A^\sigma _V(s,t)\Vert \le C\sum _{i\ge j}\int _s^t|t_{ij}^\sigma (u)|^2du. \end{aligned}$$
(5.2)

Hence, in the notation introduced in (1.3) and (1.5), and using Lemma 4.1, Corollary 3.2 reads as follows.

Proposition 5.1

For every \(T>0\) there exist constants \(C,c>0\) such that, for all \(T\ge t\ge \tau \ge 0\) and all \(\varepsilon >0,\) the following estimate holds,

$$\begin{aligned}&\varepsilon ^{-d_2}\mathbf {W}_{\tau ,0}^{V,\sigma }\left( \sup _{s\in [\tau ,t]}\Vert \eta (s)\Vert _\infty \ge u\text { and }\eta (t)\in B_\varepsilon (v)\right) \\&\quad \le C\mathcal T_{\Pi }(\tau ,t)^{-1\slash 2}\sup _{a\in B_\varepsilon (v)}\left[ e^{-(u-\Theta (\tau ,t,a)-C\sum _{j=1}^{d_2}V_j(\tau ,t)^{1\slash 2})^2\slash 2\mathcal V(\tau ,t)}\exp \left( -\frac{c\Vert a\Vert ^2}{\mathcal T(\tau ,t)}\right) \right] , \end{aligned}$$

for all \(u>\sup _{a\in B_\varepsilon (v)}\Theta (\tau ,t,a)+C\sum _{j=1}^{d_2}V_j(\tau ,t)^{1\slash 2},\) where

$$\begin{aligned} \Theta (\tau ,t,v)=\max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}v\Vert _\infty . \end{aligned}$$

6 Proof of Theorem 1.2

In this section we estimate the transition kernel for the evolution on \(N=M\rtimes V,\)

$$\begin{aligned} P^\sigma _{t,\tau }(m,v)=P^\sigma _{t,\tau }(0,0;m,v),\quad t\ge \tau . \end{aligned}$$

Proof of Theorem 1.2

We allow the constants C and D to change from line to line. By Lemmas 4.1 and 4.3, for \(t\ge \tau \) and \(m,m^\prime \in M,\)

$$\begin{aligned} P^{M,\sigma ,\eta }_{t,\tau }(m,m^\prime )&=\mathcal D(A^{\sigma ,\eta }_M(\tau ,t))e^{-B(A^{\sigma ,\eta }_M(\tau ,t))(m-m^\prime )}\nonumber \\&\le C\mathcal S_{\Pi }(\tau ,t)^{-1\slash 2}e^{-\frac{\Vert m-m^\prime \Vert ^2}{2\Vert A^{\sigma ,\eta }_M(\tau ,t)\Vert }}. \end{aligned}$$
(6.1)

By Theorem 1.1, for \(m,m^\prime \in M\) and \(v,v^\prime \in V\),

$$\begin{aligned} \int _VP^\sigma _{t,\tau }(m,v;m^\prime ,v^\prime )\psi (v^\prime )dv^\prime= & {} \int P^{M,\sigma ,\eta }_{t,\tau }(m,m^\prime )\psi (\eta (t))\, d\mathbf {W}^{V,\sigma }_{\tau ,v}(\eta )\\\le & {} C\mathcal S_{\Pi }(\tau ,t)^{-1\slash 2}\int \psi (\eta (t))e^{-\frac{\Vert m-m^\prime \Vert ^2}{2\Vert A^{\sigma ,\eta }_M(\tau ,t)\Vert }}d\mathbf {W}^{V,\sigma }_{\tau ,v} (\eta ). \end{aligned}$$

Let

$$\begin{aligned} F(m,\sigma ,\eta )=\exp \left( -\frac{D\Vert m\Vert ^2}{(1+\Lambda ^\eta (\tau ,t))^{2k_o}\mathcal S(\tau ,t)}\right) , \end{aligned}$$

where

$$\begin{aligned} \Lambda ^\eta (\tau ,t)=\sup _{\tau \le u\le t}\Vert \eta (u)\Vert _\infty . \end{aligned}$$

Then, by Lemma 4.4,

$$\begin{aligned} \mathcal S_{\Pi }(\tau ,t)^{1\slash 2}\int P^\sigma _{t,\tau }(m,v)\psi (v)dv \le C\int F(m,\sigma ,\eta )\psi (\eta (t))\, d\mathbf {W}^{V,\sigma }_{\tau ,0}(\eta ). \end{aligned}$$
(6.2)

For \(v\in \mathbb {R}^{d_2}\) given and \(\varepsilon >0\), let

$$\begin{aligned} \psi _\varepsilon (\cdot )=\varepsilon ^{-d_2}\mathbf 1_{B_\varepsilon (v)}(\cdot ), \end{aligned}$$

where

$$\begin{aligned} B_\varepsilon (v)=\prod _{j=1}^{d_2}B_\varepsilon ^1(v_j)\quad \text {and}\quad B_\varepsilon ^1(v_j)=[v_j-\varepsilon \slash 2,v_j+\varepsilon \slash 2]. \end{aligned}$$

We will estimate (6.2) with \(\psi _\varepsilon \) in place of \(\psi \) as \(\varepsilon \) tends to zero.

Let \(\mathbf{E}_{\tau ,v}^\eta \) denote expectation with respect to the distribution \(d\mathbf {W}_{\tau ,v}^{V,\sigma }(\eta )\) of \(\eta \) in the space of trajectories (\(\eta (\tau )=v\in V\)). For \(\ell =1,2,\cdots ,\) define the sets of paths in V

$$\begin{aligned} \mathcal {A}_\ell =\left\{ \eta :\ell -1\le \Lambda ^\eta (\tau ,t)=\sup _{u\in [\tau ,t]}\Vert \eta (u)\Vert _{\infty }<\ell \right\} . \end{aligned}$$

The integral on the right in (6.2) can be written as an infinite sum and estimated as follows

$$\begin{aligned}&\sum _{\ell =1}^\infty \mathbf{E}_{\tau ,0}^\eta F(m,\sigma ,\eta ) \psi _\varepsilon (\eta (t))\mathbf 1_{\mathcal {A}_\ell }(\eta )\nonumber \\&\quad \le \sum _{\ell =1}^\infty \exp \left( -\frac{D\Vert m\Vert ^2}{(1+\ell )^{2k_o}\mathcal S(\tau ,t)}\right) \mathbf{E}_{\tau ,0}^\eta \psi _\varepsilon (\eta (t))\mathbf 1_{\mathcal {A}_\ell }(\eta )\nonumber \\&\quad \le \sum _{\ell =1}^\infty \exp \left( -\frac{D\Vert m\Vert ^2}{\ell ^{2k_o}\mathcal S(\tau ,t)}\right) \mathbf{E}_{\tau ,0}^\eta \psi _\varepsilon (\eta (t))\mathbf 1_{\mathcal {A}_\ell }(\eta ), \end{aligned}$$
(6.3)

for some \(k_o\ge 1.\)

To simplify notation we introduce

$$\begin{aligned} c_\ell&=\exp \left( -\frac{D\Vert m\Vert ^2}{\ell ^{2k_o}\mathcal S(\tau ,t)}\right) ,\\ \mathcal {E}_\ell (\varepsilon )&=\mathbf{E}_{\tau ,0}^\eta \psi _\varepsilon (\eta (t))\mathbf 1_{\mathcal {A}_\ell }(\eta ) =\varepsilon ^{-n}\mathbf {W}^{V,\sigma }_{\tau ,0}\left( \eta \in \mathcal {A}_\ell \text { and }\eta (t)\in B_\varepsilon (v)\right) . \end{aligned}$$

Let \(v\not =0\) and choose \(\varepsilon \slash 2<\Vert v\Vert _\infty \). If \(\eta \in \mathcal {A}_\ell \) and \(\eta (t)\in B_\varepsilon (v)\) then \(\Vert \eta (t)\Vert _\infty \ge \Vert v\Vert _\infty -\varepsilon \slash 2\). Hence,

$$\begin{aligned} \mathcal {E}_\ell =0\quad \text {for }\ell <\Vert v\Vert _\infty -\varepsilon \slash 2. \end{aligned}$$
(6.4)

Let

$$\begin{aligned} I=I(\varepsilon )=\sum _{\ell =1}^\infty c_\ell \mathcal E_\ell (\varepsilon ), \end{aligned}$$

and set

$$\begin{aligned} \ell ^*_\varepsilon =\left\lceil \sup _{a\in B_\varepsilon (v)}\tilde{\Theta }(\tau ,t,a)\right\rceil . \end{aligned}$$

Since \(\Theta (\tau ,t,v)\ge \Vert v\Vert _\infty \) [take \(s=t\) in (1.7)], we have \(\lceil \tilde{\Theta }(\tau ,t,v)\rceil \ge \Vert v\Vert _\infty >\Vert v\Vert _\infty -\varepsilon \slash 2.\) Since \(\ell ^*_\varepsilon \ge \lceil \tilde{\Theta }(\tau ,t,v)\rceil ,\) we get that \(\ell ^*_\varepsilon >\Vert v\Vert _\infty -\varepsilon \slash 2.\) Hence,

$$\begin{aligned} I\le I_1+I_2:=\sum _{\Vert v\Vert _\infty -\varepsilon \slash 2\le \ell \le \ell ^*_\varepsilon }c_\ell \mathcal E_\ell (\varepsilon )+\sum _{\ell =\ell ^*_\varepsilon }^\infty c_\ell \mathcal E_\ell (\varepsilon ). \end{aligned}$$

We estimate \(I_1.\) Since

$$\begin{aligned} \mathcal E_\ell (\varepsilon )\le \mathcal E^\prime (\varepsilon ), \end{aligned}$$

where

$$\begin{aligned} \mathcal E^\prime (\varepsilon )=\varepsilon ^{-d_2}\mathbf {W}^{V,\sigma }_{\tau ,0}\left( \eta (t)\in B_\varepsilon (v)\right) , \end{aligned}$$

we get, by (5.1) and (5.2),

$$\begin{aligned} I_1&\le \mathcal E^\prime (\varepsilon )\sum _{\Vert v\Vert _\infty \le \ell \le \ell ^*_\varepsilon }\exp \left( -\frac{D\Vert m\Vert ^2}{\ell ^{2k_o}\mathcal S(\tau ,t)}\right) \\&\le c(\ell ^*_\varepsilon -\Vert v\Vert _\infty +1)\mathcal T_{\Pi }(\tau ,t)^{-1\slash 2} \exp \left( -\frac{c\Vert v\Vert ^2}{\mathcal T(\tau ,t)}\right) \exp \left( -\frac{D\Vert m\Vert ^2}{(\ell ^*_\varepsilon )^{2k_o}\mathcal S(\tau ,t)}\right) . \end{aligned}$$

Now we consider \(I_2.\) In this case to estimate \(\mathcal E_\ell (\varepsilon )\) we use Proposition 5.1,

$$\begin{aligned} I_2\le & {} C\mathcal T_{\Pi }(\tau ,t)^{-1\slash 2}\exp \left( -\frac{c\Vert v\Vert ^2}{\mathcal T(\tau ,t)}\right) \\&\times \sum _{\ell =\ell ^*_\varepsilon }^\infty \exp \left( -\frac{D\Vert m\Vert ^2}{\ell ^{2k_o}\mathcal S(\tau ,t)}\right) \sup _{a\in B_\varepsilon (v)}\exp \left( -\frac{(\ell -\tilde{\Theta }(\tau ,t,a))^2}{2\mathcal V(\tau ,t)}\right) . \end{aligned}$$

Now we estimate the series above. We split it into two parts, \(\ell ^*_\varepsilon \le \ell \le \ell ^*_\varepsilon +\Vert m\Vert ^{1\slash 2k_o}\) and \(\ell >\ell ^*_\varepsilon +\Vert m\Vert ^{1\slash 2k_o},\) and estimate the corresponding parts by the following two terms [we note that if \(\varepsilon \rightarrow 0\) then \(\ell ^*_\varepsilon \rightarrow \tilde{\Theta }(\tau ,t,v)\)]:

$$\begin{aligned} \Vert m\Vert ^\frac{1}{2k_o}\exp \left( -\frac{D\Vert m\Vert ^2}{(\ell ^*_\varepsilon +\Vert m\Vert ^\frac{1}{2k_o})^{2k_o}\mathcal S(\tau ,t)}\right) \end{aligned}$$

and

$$\begin{aligned}&\sum _{\ell \ge \ell ^*_\varepsilon +\Vert m\Vert ^{1\slash 2k_o}}\exp \left( -\frac{(\ell -\sup _{a\in B_\varepsilon (v)}\tilde{\Theta }(\tau ,t,a))^2}{2\mathcal V(\tau ,t)}\right) \\&\quad \le \sum _{\ell =1}^\infty \exp \left( -\frac{(\ell +1+\Vert m\Vert ^{1\slash 2k_o})^2}{2\mathcal V(\tau ,t)}\right) \\&\quad \le \int _{\Vert m\Vert ^{1\slash 2k_o}}^\infty e^{-r^2\slash 2\mathcal V(\tau ,t)}dr\le \sqrt{2}\sqrt{\mathcal V(\tau ,t)}e^{-\frac{\Vert m\Vert ^{1\slash k_o}}{2\mathcal V(\tau ,t)}}. \end{aligned}$$

The theorem follows. \(\square \)

7 Estimates for the quantities (1.7) and (1.8)

In order to apply the estimate given in Theorem 1.2 to a particular problem, one needs to control the quantities (1.7) and (1.8) appearing there. The aim of this section is to give estimates for them which make the bound in Theorem 1.2 explicit.

We will need the following classical bounds for the norm of the inverse matrix, which are due to Richter [25] (see also [18] for a different proof). Recall that for a matrix A, \(\Vert A\Vert \) stands for the operator norm \(\ell ^2\rightarrow \ell ^2.\)

Theorem 7.1

Let A be a nonsingular \(n\times n\)-matrix. Then

$$\begin{aligned} (n^{(n-2)\slash 2}|\det A|^{-1}\Vert A\Vert )^{1\slash (n-1)}\le \Vert A^{-1}\Vert \le n^{-(n-2)\slash 2}|\det A|^{-1}\Vert A\Vert ^{n-1}. \end{aligned}$$
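As a sanity check, for \(n=2\) both bounds coincide and the theorem reduces to the identity

$$\begin{aligned} \Vert A^{-1}\Vert =\frac{1}{s_2}=\frac{s_1}{s_1s_2}=\frac{\Vert A\Vert }{|\det A|}, \end{aligned}$$

where \(s_1\ge s_2>0\) are the singular values of A.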

For the matrix \(A_V^\sigma (\tau ,t),\) defined in (1.4), we have by (5.2),

$$\begin{aligned} \Vert A_V^\sigma (\tau ,t)\Vert \le \mathcal T(\tau ,t), \end{aligned}$$

where \(\mathcal T(\tau ,t)\) is defined in (1.5). Also note that \(\mathcal T(\tau ,t)\) is an increasing function of t. Hence, the following is a corollary of Theorem 7.1. The notation below is as in (1.5).

Lemma 7.2

There is a constant \(c>0\) such that for every \(t\ge \tau \ge 0\) and for every \(v\in \mathbb {R}^{d_2},\)

$$\begin{aligned} \Theta (\tau ,t,v)\le c {d_2}^{-(d_2-2)\slash 2}\mathcal T_{\Pi }(\tau ,t)^{-1}\mathcal T(\tau ,t)^{d_2}\Vert v\Vert . \end{aligned}$$

Proof

By Theorem 7.1,

$$\begin{aligned} \Theta (\tau ,t,v)&=\max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}v\Vert _\infty \\&\le c{d_2}^{-(d_2-2)\slash 2}\max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)\Vert |\det A_V^\sigma (\tau ,t)|^{-1}\Vert A_V^\sigma (\tau ,t)\Vert ^{d_2-1}\Vert v\Vert \\&\le c{d_2}^{-(d_2-2)\slash 2}\max _{s\in [\tau ,t]}\mathcal T(\tau ,s)|\det A_V^\sigma (\tau ,t)|^{-1}\mathcal T(\tau ,t)^{d_2-1}\Vert v\Vert \\&\le c{d_2}^{-(d_2-2)\slash 2}|\det A_V^\sigma (\tau ,t)|^{-1}\mathcal T(\tau ,t)^{d_2}\Vert v\Vert . \end{aligned}$$

Now (5.1) finishes the proof. \(\square \)

Lemma 7.3

There is a positive constant c such that for every \(t\ge \tau \ge 0\) and for every \(v\in \mathbb {R}^{d_2},\)

$$\begin{aligned} \tilde{\Theta }(\tau ,t,v)\le c\mathcal T_{\Pi }(\tau ,t)^{-1}\mathcal T(\tau ,t)^{d_2}\Vert v\Vert +c(\mathcal T(\tau ,t)+\mathcal T(\tau ,t)^{d_2+1}\mathcal T_{\Pi }(\tau ,t)^{-1})^{1\slash 2}. \end{aligned}$$
(7.1)

Proof

Since all norms on a finite dimensional vector space are equivalent, we get, by (1.3), for every j,

$$\begin{aligned} 0\le & {} V_j(\tau ,t)^{1\slash 2}\le c\left( \max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)-A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}A_V^\sigma (\tau ,s)\Vert \right) ^{1\slash 2}\\\le & {} c\left( \max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)\Vert +\max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)\Vert \max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,t)^{-1}A_V^\sigma (\tau ,s) \Vert \right) ^{1\slash 2}. \end{aligned}$$

From the proof of Lemma 7.2

$$\begin{aligned} \max _{s\in [\tau ,t]}\Vert A_V^\sigma (\tau ,s)A_V^\sigma (\tau ,t)^{-1}\Vert \le {d_2}^{-(d_2-2)\slash 2}\mathcal T_{\Pi }(\tau ,t)^{-1}\mathcal T(\tau ,t)^{d_2}. \end{aligned}$$

Thus

$$\begin{aligned} 0\le V_j(\tau ,t)^{1\slash 2}\le c(\mathcal T(\tau ,t)+\mathcal T(\tau ,t)^{d_2+1}\mathcal T_{\Pi }(\tau ,t)^{-1})^{1\slash 2}. \end{aligned}$$

Hence the sum \(\sum _jV_j(\tau ,t)^{1\slash 2}\) satisfies the same estimate (with a different constant). This, together with the estimate obtained in Lemma 7.2, finishes the proof. \(\square \)

From the proof of Lemma 7.3 we have the following corollary.

Lemma 7.4

There is \(c>0\) such that for all \(t\ge \tau >0,\)

$$\begin{aligned} \mathcal {V}(\tau ,t)\le c(\mathcal T(\tau ,t)+\mathcal T(\tau ,t)^{d_2+1}\mathcal T_{\Pi }(\tau ,t)^{-1}). \end{aligned}$$