1 Introduction

This paper discusses the local higher integrability of the spatial gradient of a weak solution to a double-phase parabolic system

$$\begin{aligned} u_t-{{\,\textrm{div}\,}}\mathcal {A}(z,\nabla u)=-{{\,\textrm{div}\,}}(|F|^{p-2}F+a(z)|F|^{q-2}F) \text { in } \Omega _T, \end{aligned}$$
(1.1)

where \(z=(x,t)\), \(\Omega _T=\Omega \times (0,T)\) is a space-time cylinder with a bounded open set \(\Omega \subset \mathbb {R}^n\) for \(n\geqq 2\) and \(2\leqq p<q<\infty \). Here \(\mathcal {A}(z,\xi ):\Omega _T\times \mathbb {R}^{Nn}\longrightarrow \mathbb {R}^{Nn}\) with \(N\geqq 1\) is a Carathéodory vector field for which there exist constants \(0<\nu \leqq L<\infty \) such that

$$\begin{aligned} \mathcal {A}(z,\xi )\cdot \xi \geqq \nu (|\xi |^p+a(z)|\xi |^q)\quad \text {and}\quad |\mathcal {A}(z,\xi )|\leqq L(|\xi |^{p-1}+a(z)|\xi |^{q-1}) \end{aligned}$$
(1.2)

for almost every \(z\in \Omega _T\) and every \(\xi \in \mathbb {R}^{Nn}\). It is further assumed that the source term \(F:\Omega _T\longrightarrow \mathbb {R}^{Nn}\) satisfies

$$\begin{aligned} \iint _{\Omega _T} H(z,|F|)\ \textrm{d}z=\iint _{\Omega _T}(|F|^p+a(z)|F|^q)\,\textrm{d}z<\infty . \end{aligned}$$

Here we denote \(H(z,s):\Omega _T\times \mathbb {R}^{+ }\longrightarrow \mathbb {R}^+\),

$$\begin{aligned} H(z,s)=s^p+a(z)s^q. \end{aligned}$$

We assume that the non-negative coefficient function \(a:\Omega _T\longrightarrow \mathbb {R}^+\) satisfies

$$\begin{aligned} q\leqq p+\frac{2\alpha }{n+2} \quad \text {and}\quad a\in C^{\alpha ,\frac{\alpha }{2}}(\Omega _T)\text { for some }\alpha \in (0,1]. \end{aligned}$$
(1.3)

Here \(a\in C^{\alpha ,\frac{\alpha }{2}}(\Omega _T)\) means that \(a\in L^{\infty }(\Omega _T)\) and that there exists a constant \([a]_{\alpha ,\frac{\alpha }{2};\Omega _T}<\infty \) such that

$$\begin{aligned} |a(x,t)-a(y,t)|&\leqq [a]_{\alpha ,\frac{\alpha }{2};\Omega _T}|x-y|^\alpha \quad \text {and}\\ |a(x,t)-a(x,s)|&\leqq [a]_{\alpha ,\frac{\alpha }{2};\Omega _T}|t-s|^\frac{\alpha }{2} \end{aligned}$$

for every \(x,y\in \Omega \) and \(t,s\in (0,T)\). For short we denote \([a]_{\alpha }=[a]_{\alpha ,\frac{\alpha }{2};\Omega _T}\).
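The two one-variable conditions combine into a single parabolic Hölder estimate. The following computation is a short sketch that uses only the definition above and the product structure \(\Omega _T=\Omega \times (0,T)\), which guarantees that the intermediate point \((y,t)\) belongs to \(\Omega _T\):

$$\begin{aligned} |a(x,t)-a(y,s)|&\leqq |a(x,t)-a(y,t)|+|a(y,t)-a(y,s)|\\&\leqq [a]_{\alpha }\bigl (|x-y|^\alpha +|t-s|^{\frac{\alpha }{2}}\bigr ) \end{aligned}$$

for every \(x,y\in \Omega \) and \(t,s\in (0,T)\). In particular, on a cylinder \(B_r(x_0)\times (t_0-r^2,t_0+r^2)\subset \Omega _T\) we have \(|x-y|<2r\) and \(|t-s|<2r^2\), so the oscillation of a is at most \([a]_{\alpha }(2^\alpha +2^{\frac{\alpha }{2}})r^\alpha \leqq 4[a]_{\alpha }r^\alpha \), since \(\alpha \leqq 1\).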

We summarize the existing related results in the elliptic and parabolic cases. The elliptic double-phase system

$$\begin{aligned} -{{\,\textrm{div}\,}}(|\nabla u|^{p-2}\nabla u+a(x)|\nabla u|^{q-2}\nabla u) =-{{\,\textrm{div}\,}}(|F|^{p-2}F+a(x)|F|^{q-2}F) \end{aligned}$$

in \(\Omega \), where

$$\begin{aligned} 1<p<q\leqq p+\frac{\alpha p}{n} \quad \text {and}\quad a(x)\in C^{\alpha }(\Omega )\text { for some }\alpha \in (0,1] \end{aligned}$$
(1.4)

models a class of (pq)-growth problems related to strongly anisotropic materials in the contexts of homogenization and nonlinear elasticity, see [26, 27, 28]. The proper function space for weak solutions is \(u\in W^{1,1}(\Omega ,\mathbb {R}^N)\) with

$$\begin{aligned} \int _{\Omega } H(x,|\nabla u|)\, \textrm{d}x=\int _{\Omega }(|\nabla u|^p+a(x)|\nabla u|^q)\, \textrm{d}x<\infty . \end{aligned}$$

Under (1.4) it has been proved that \(|\nabla u|\in L_{{{\,\textrm{loc}\,}}}^q(\Omega )\) in [13] (see also [21, 22] for the (pq)-growth problems). Harnack’s inequality, Hölder continuity, gradient Hölder continuity, gradient higher integrability and Calderón–Zygmund type estimates have been discussed in [2, 8, 9, 11] (see also [17, 18]). For applications and more information, we refer to [23, 24]. A standard approach in the elliptic double-phase systems is to consider two cases: for each ball \(B_r(x_0)\subset \Omega \), either

$$\begin{aligned} \inf _{B_{r}(x_0)}a(x)\leqq [a]_{\alpha }r^\alpha \quad \text {or}\quad \inf _{B_{r}(x_0)}a(x)> [a]_{\alpha }r^\alpha . \end{aligned}$$
(1.5)

The first condition in (1.5) is called the p-phase and in this case the behavior is similar to the p-Laplace systems in \(B_r(x_0)\). The second condition in (1.5) implies that

$$\begin{aligned} \sup _{B_{r}(x_0)}a(x)< 2\inf _{B_r(x_0)}a(x) \end{aligned}$$
(1.6)

and this leads to the behavior similar to the (pq)-Laplace systems in \(B_r(x_0)\). For this reason the second condition in (1.5) is called the (pq)-phase.

Parabolic double-phase problems have not been investigated until very recently. The existence of weak solutions to (1.1) has been considered in [7, 25]. These results seem to cover different ranges of exponents already in the stationary case, see [6]. It has been proved in [25] by using the difference quotient method that \(|\nabla u|\in L^q_{{{\,\textrm{loc}\,}}}(\Omega _T)\) under appropriate structural assumptions (see also [1, 4, 5, 10, 14] for the (pq)-growth problems).

The main result of this paper is an a priori estimate for the gradient of a weak solution to (1.1). We denote \(Q_{r}(z_0)=B_{r}(x_0)\times (t_0-r^2,t_0+r^2)\) and

$$\begin{aligned}&data =(n,N,p,q,\alpha ,\nu ,L,[a]_{\alpha },{{\,\textrm{diam}\,}}(\Omega ),\Vert u\Vert _{L^\infty (0,T;L^2(\Omega ))},\\&\quad \Vert H(z,|\nabla u|)\Vert _{L^1(\Omega _T)},\Vert H(z,|F|)\Vert _{L^1(\Omega _T)}). \end{aligned}$$

Theorem 1.1

Assume that (1.2) and (1.3) hold true and let u be a weak solution to (1.1). Then there exist constants \(0<\varepsilon _0=\varepsilon _0( data )\) and \(c=c( data ,\Vert a\Vert _{L^\infty (\Omega _T)})\geqq 1\) such that

for every \(Q_{2r}(z_0)\subset \Omega _T\) and \(\varepsilon \in (0,\varepsilon _0)\).

As far as we are aware this is the first regularity result for parabolic double-phase problems under the general structural conditions (1.2) and (1.3). We consider weak solutions that satisfy a technical assumption \(|\nabla u|\in L^q(\Omega _T)\), see Definition 2.1. It is also possible to obtain the main result under the assumption

$$\begin{aligned} \iint _{\Omega _T} H(z,|\nabla u|)\, \textrm{d}z=\iint _{\Omega _T}(|\nabla u|^p+a(z)|\nabla u|^q)\,\textrm{d}z<\infty , \end{aligned}$$

by applying a parabolic Lipschitz truncation, see Remark 2.2. This technique is out of the scope of this paper and it is discussed in [19]. This extends the corresponding results for the p-Laplace systems \((a(z)\equiv 0)\) in [20], where a reverse Hölder inequality for the gradient has been proved in p-intrinsic cylinders

$$\begin{aligned} Q_{\rho }^\lambda (z_0)=B_{\rho }(x_0)\times (t_0-\lambda ^{2-p}\rho ^2,t_0+\lambda ^{2-p}\rho ^2) \end{aligned}$$

by using parabolic Caccioppoli and Poincaré inequalities and a stopping time argument. Appropriate intrinsic cylinders have to be considered for other parabolic systems. The parabolic (pq)-Laplace system (\(a(z)\equiv a_0\) for some constant \(a_0>0\)) was considered in [16] and the gradient reverse Hölder inequality was proved in (pq)-intrinsic cylinders

$$\begin{aligned} G_{\rho }^\lambda (z_0)=B_{\rho }(x_0)\times \left( t_0-\tfrac{\lambda ^2}{\lambda ^p+a_0\lambda ^q}\rho ^2,t_0+\tfrac{\lambda ^2}{\lambda ^p+a_0\lambda ^q}\rho ^2\right) . \end{aligned}$$
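The relation between the two families of cylinders can be read off from their time scales. As a brief check, assuming only \(\lambda >0\) and \(a_0\geqq 0\), we have

$$\begin{aligned} \frac{\lambda ^2}{\lambda ^p+a_0\lambda ^q}\leqq \frac{\lambda ^2}{\lambda ^p}=\lambda ^{2-p}, \end{aligned}$$

so that \(G_{\rho }^\lambda (z_0)\subset Q_{\rho }^\lambda (z_0)\), with equality when \(a_0=0\). Conversely, if \(a_0\lambda ^q\leqq K\lambda ^p\) for some constant \(K\geqq 1\), then \(\lambda ^p+a_0\lambda ^q\leqq (1+K)\lambda ^p\) and

$$\begin{aligned} \frac{\lambda ^2}{\lambda ^p+a_0\lambda ^q}\geqq \frac{\lambda ^{2-p}}{1+K}, \end{aligned}$$

so in this regime the two time scales are comparable up to the factor \(1+K\).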

In [3], the gradient higher integrability result has been discussed for the parabolic \(p(\cdot )\)-Laplace type system

$$\begin{aligned} u_t-{{\,\textrm{div}\,}}( |\nabla u|^{p(z)-2}\nabla u)=-{{\,\textrm{div}\,}}(|F|^{p(z)-2}F) \end{aligned}$$

in \(\Omega _T\), where \(p(\cdot ):\Omega _T \longrightarrow \mathbb {R}^+\) is a continuous function with

$$\begin{aligned} \frac{2n}{n+2}<\inf _{z\in \Omega _T}p(z)\leqq p(\cdot )\leqq \sup _{z\in \Omega _T}p(z)<\infty \end{aligned}$$

and \(p(\cdot )\) satisfies a logarithmic modulus of continuity condition. In this case the intrinsic cylinders are of the form

$$\begin{aligned} B_{\rho }(x_0)\times \bigl (t_0-\lambda ^{\frac{2-p(z_0)}{p(z_0)}}\rho ^2,t_0+\lambda ^\frac{2-p(z_0)}{p(z_0)}\rho ^2\bigr ). \end{aligned}$$

Several new features appear in the parabolic double-phase problem (1.1) compared to the p-Laplace systems in [20] and to the (pq)-case in [16]. The first novelty in our argument is that we provide a new criterion replacing (1.5) in order to be able to adopt the stopping time argument with intrinsic cylinders in [20]. For each point

$$\begin{aligned} z_0\in \{ z\in \Omega _T:|\nabla u(z)|^p+a(z)|\nabla u(z)|^q>\Lambda \}, \end{aligned}$$

we consider \(\lambda =\lambda (z_0)>0\) such that \(\Lambda =\lambda ^p+a(z_0)\lambda ^q\). Employing the fact that \(s \rightarrow s^p + a(z_0)s^q\) is strictly increasing and noting that \(z_0\in \{ z\in \Omega _T:|\nabla u(z)|^p>\lambda ^p\}\), we may apply a stopping time argument with the p-intrinsic cylinders. The second novelty is to consider two alternatives: for \(K>1\), either

$$\begin{aligned} K\lambda ^p\geqq a(z_0)\lambda ^q\quad \text {or}\quad K\lambda ^p\leqq a(z_0)\lambda ^q. \end{aligned}$$

These are called p-intrinsic and (pq)-intrinsic cases, respectively. The p-intrinsic case is related to the p-Laplace systems and the (pq)-intrinsic case is related to the (pq)-problems. For the double-phase problems we have to consider both of them. We are convinced that this technique will be useful in other regularity results for parabolic doubly-nonlinear problems. In the p-intrinsic case, it is possible to obtain the reverse Hölder inequality in the p-intrinsic cylinders as in [20]. Roughly speaking, we have

$$\begin{aligned}&|\nabla u|^{p-2}\nabla u+a(z)|\nabla u|^{q-2}\nabla u\approx |\nabla u|^{p-2}\nabla u+K\lambda ^{p-q}|\nabla u|^{q-2}\nabla u \\&\quad \approx |\nabla u|^{p-2}\nabla u \end{aligned}$$

in the stopping time argument with a p-intrinsic cylinder. On the other hand, the (pq)-intrinsic case implies the second condition in (1.5) for a sufficiently large \(K>1\). This leads to (1.6) and we have

$$\begin{aligned} |\nabla u|^{p-2}\nabla u+a(z)|\nabla u|^{q-2}\nabla u\approx |\nabla u|^{p-2}\nabla u+a(z_0)|\nabla u|^{q-2}\nabla u \end{aligned}$$

in the stopping time argument with a p-intrinsic cylinder. Consequently, we may apply the (pq)-intrinsic cylinders to obtain the reverse Hölder inequality. Note that

$$\begin{aligned} z_0\in \{ z\in \Omega _T:|\nabla u(z)|^p+a(z)|\nabla u(z)|^q>\lambda ^p+a(z_0)\lambda ^q\} \end{aligned}$$

and \(G_\rho ^\lambda (z_0)\subset Q_\rho ^\lambda (z_0)\) with \(a_0=a(z_0)\). Thus, it is possible to obtain a stopping time argument in the (pq)-intrinsic cylinders from the stopping time argument in the p-intrinsic cylinders. Finally, the continuity of \(a(\cdot )\) implies the continuity of \(\lambda (\cdot )\) and this enables us to prove a Vitali type covering lemma. The desired estimate follows by using Fubini’s theorem.

2 Energy estimates

We apply the following definition of weak solution:

Definition 2.1

A function \(u:\Omega _T\longrightarrow \mathbb {R}^N\) with

$$\begin{aligned} u\in C(0,T;L^2(\Omega ,\mathbb {R}^N))\cap L^q(0,T;W^{1,q}(\Omega ,\mathbb {R}^N)) \end{aligned}$$

is a weak solution to (1.1), if

$$\begin{aligned} \iint _{\Omega _T}(-u\cdot \varphi _t+\mathcal {A}(z,\nabla u)\cdot \nabla \varphi )\,\textrm{d}z=\iint _{\Omega _T}(|F|^{p-2}F\cdot \nabla \varphi +a(z)|F|^{q-2}F\cdot \nabla \varphi )\,\textrm{d}z \end{aligned}$$

for every \(\varphi \in C_0^\infty (\Omega _T,\mathbb {R}^N)\).

Remark 2.2

A more standard assumption on the function space would be

$$\begin{aligned} u\in C(0,T;L^2(\Omega ,\mathbb {R}^N))\cap L^1(0,T;W^{1,1}(\Omega ,\mathbb {R}^N)) \end{aligned}$$

with

$$\begin{aligned} \iint _{\Omega _T} H(z,|\nabla u|)\, \textrm{d}z=\iint _{\Omega _T}(|\nabla u|^p+a(z)|\nabla u|^q)\,\textrm{d}z<\infty . \end{aligned}$$

However, this assumption does not seem to be enough in the proof of the energy estimate using the Steklov averages, see Lemma 2.3 below. This unexpected challenge does not occur in the elliptic case, since the mollification in time is not needed. It is possible to derive Lemma 2.3 under the natural function space assumption above by a parabolic Lipschitz truncation technique, see [19]. We emphasize that the assumption \(|\nabla u|\in L^q(\Omega _T)\) is only applied in the proof of Lemma 2.3 and it is not needed in the rest of the paper. With this observation Theorem 1.1 holds true also under the natural function space assumption above.

In the rest of this section, we provide three energy estimates. The first lemma is a parabolic double-phase Caccioppoli inequality. In general, the time derivative of a weak solution does not belong to \(L^2\) and does not even exist a priori. To be able to derive a suitable energy estimate, we use the following mollification in time. We define the Steklov average \(f_h\), with \(0<h<T\), of \(f\in L^1(\Omega _T)\) by

$$\begin{aligned} f_h(x,t)= {\left\{ \begin{array}{ll} \frac{1}{h}\int _t^{t+h}f(x,s)\,\textrm{d}s,&{}0<t<T-h,\\ 0,&{}T-h\leqq t<T. \end{array}\right. } \end{aligned}$$

For the properties of Steklov averages, we refer to [12].

We apply the following notation. A space-time cylinder in \(\mathbb {R}^{n+1}\) is denoted by

$$\begin{aligned} Q_{R,\ell }(z_0)=B_{R}(x_0)\times (t_0-\ell ,t_0+\ell ), \quad R>0,\quad \ell >0, \end{aligned}$$

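and the integral average of u over \(Q_{R,\ell }(z_0)\) is denoted by

$$\begin{aligned} u_{Q_{R,\ell }(z_0)}=\frac{1}{|Q_{R,\ell }(z_0)|}\iint _{Q_{R,\ell }(z_0)}u\,\textrm{d}z. \end{aligned}$$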

Lemma 2.3

Let u be a weak solution to (1.1). Then there exists a constant \(c=c(n,p,q,\nu ,L)\) such that

for every \(Q_{R,\ell }(z_0)\subset \Omega _T\), with \(R,\ell >0\), \(r\in [R/2,R)\) and \(\tau \in [\ell /2^2,\ell )\).

Proof

Let \(\eta \in C_0^\infty (B_{R}(x_0))\) be a cut-off function with

$$\begin{aligned} 0\leqq \eta \leqq 1,\quad \eta \equiv 1\text { in } B_{r}(x_0)\quad \text {and}\quad \Vert \nabla \eta \Vert _{L^\infty }\leqq \frac{2}{R-r}. \end{aligned}$$
(2.1)

For \(\tau \in [\ell /2^2,\ell )\), let \(h_0>0\) be sufficiently small so that there exists a cut-off function \(\zeta \in C_0^\infty (I_{\ell -h_0}(t_0))\) with

$$\begin{aligned} 0\leqq \zeta \leqq 1,\quad \zeta \equiv 1\text { in } I_{\tau }(t_0)\quad \text {and}\quad \Vert \partial _t\zeta \Vert _{L^\infty }\leqq \frac{3}{\ell -\tau }. \end{aligned}$$
(2.2)

Let \(t_*\in I_{\tau }(t_0)\) and \(\delta \in (0,h_0)\). We define \(\zeta _\delta \) as

$$\begin{aligned} \zeta _{\delta }(t)= {\left\{ \begin{array}{ll} 1,&{}t\in (-\infty ,t_*-\delta ),\\ 1-\frac{t-t_*+\delta }{\delta },&{}t\in [t_*-\delta ,t_*],\\ 0,&{}t\in (t_*,\infty ). \end{array}\right. } \end{aligned}$$
(2.3)

For \(h\in (0,h_0)\), we consider (1.1) in terms of Steklov averages and obtain

$$\begin{aligned} \partial _t[u-u_{Q_{R,\ell }(z_0)}]_h-{{\,\textrm{div}\,}}[\mathcal {A}(\cdot ,\nabla u)]_h\ =-{{\,\textrm{div}\,}}[|F|^{p-2}F+a|F|^{q-2}F]_h\qquad \end{aligned}$$
(2.4)

in \(B_{R}(x_0)\times I_{\ell -h}(t_0)\). Then we observe that

$$\begin{aligned}&[u-u_{Q_{R,\ell }(z_0)}]_h\eta ^q\zeta ^2 \zeta _\delta \in W^{1,2}_0(I_{\ell -h};L^2(B_{R}(x_0),\mathbb {R}^N))\cap L^q(I_{\ell -h};\\&\quad W_0^{1,q}(B_{R}(x_0),\mathbb {R}^N)). \end{aligned}$$

By applying \(\varphi =[u-u_{Q_{R,\ell }(z_0)}]_h\eta ^q\zeta ^2\zeta _{\delta }\) as a test function in (2.4), we have

(2.5)

We show that \(\textrm{II}\) and \(\textrm{III}\) are finite under the assumption \(|\nabla u|\in L^q(\Omega _T)\). The structural assumptions and the properties of the Steklov average lead to

The first term on the right-hand side is finite as in the case of the parabolic p-Laplace systems. The second term on the right-hand side can be written as

By Hölder’s inequality and the properties of the Steklov average, there exists a constant \(c=c(n)\) such that

This shows that \(\textrm{II}\) is finite if \(|\nabla u|\in L^q(\Omega _T)\). A similar argument applies for \(\textrm{III}\).

Estimate of \(\textrm{I}\): Integration by parts gives

(2.6)

We estimate the first term on the right-hand side of (2.6) by (2.2) and obtain

For the second term on the right-hand side of (2.6), by (2.3) we have

Thus, we get

Estimate of \(\textrm{II}\): It holds that

(2.7)

To estimate the first term in (2.7), we apply (1.2) to get

To estimate the second term in (2.7), we use (1.2) and (2.1) to conclude that

By Young’s inequality, there exists a constant \(c=c(p,q,\nu ,L)\) such that

It follows that

Estimate of \(\textrm{III}\): We apply Young’s inequality as above and obtain

By applying the estimates above in (2.5), we obtain

Since \(t_*\in I_{\tau }(t_0)\) is arbitrary, \(|B_{R}|\approx c(n)|B_{r}|\) and \(|I_{\ell }|\approx |I_{\tau }|\), we get

\(\square \)

The second lemma is a gluing lemma, which enables us to estimate integral averages over time-slices. The spatial integral average of u over \(B_R(x_0)\) is denoted by

Lemma 2.4

Let u be a weak solution to (1.1) and let \(\eta \in C_0^\infty (B_{R}(x_0))\) be a function such that

(2.8)

where \(c=c(n)\). Then there exists a constant \(c=c(n,L)\) such that

for every \(Q_{R,\ell }(z_0)=B_{R}(x_0)\times (t_0-\ell ,t_0+\ell )\subset \Omega _T\) with \(R,\ell >0\).

Proof

Let \(t_1,t_2\in (t_0-\ell ,t_0+\ell )\) with \(t_1<t_2\). For \(\delta \in (0,1)\) small enough, we define \(\zeta _\delta \in W_0^{1,\infty }(t_0-\ell ,t_0+\ell )\) by

$$\begin{aligned} \zeta _\delta (t)= {\left\{ \begin{array}{ll} \qquad 0,&{}t_0-\ell \leqq t\leqq t_1-\delta ,\\ \frac{t-t_1+\delta }{\delta },&{}t_1-\delta<t<t_1,\\ \qquad 1,&{}t_1\leqq t\leqq t_2,\\ \frac{t_2+\delta -t}{\delta },&{}t_2<t<t_2+\delta ,\\ \qquad 0,&{}t_2+\delta \leqq t\leqq t_0+\ell . \end{array}\right. } \end{aligned}$$

By applying \(\eta \zeta _\delta \in W_0^{1,\infty }(Q_{R,\ell }(z_0))\) as a test function in (1.1), we obtain

Letting \(\delta \longrightarrow 0^+\) and using the third condition in (2.8), we obtain

where \(c=c(n,L)\). This completes the proof. \(\quad \square \)

Then we consider a parabolic Poincaré inequality.

Lemma 2.5

Let u be a weak solution to (1.1). Then there exists a constant \(c=c(n,N,m,L)\) such that

for every \(Q_{R,\ell }(z_0)=B_{R}(x_0)\times (t_0-\ell ,t_0+\ell )\subset \Omega _T\) with \(R,\ell >0\), \(m\in (1,q]\) and \(\theta \in (1/m,1]\).

Proof

The triangle inequality gives

where \(c=c(m)\). By applying the Poincaré inequality in the spatial direction, we have

where \(c=c(n,N,m)\).

To complete the proof, we estimate the second term on the right-hand side in the estimate above. By Hölder’s inequality, we have

For \(\eta \in C_0^\infty (B_R(x_0))\) satisfying (2.8), it holds that

where the second term on the right-hand side can be estimated by Lemma 2.4. For the first term on the right-hand side we may apply (2.8) and obtain

Therefore, using the Poincaré inequality in the spatial direction and Hölder’s inequality, we have

This completes the proof. \(\quad \square \)

3 Parabolic Sobolev-Poincaré inequalities

This section provides a parabolic Sobolev-Poincaré inequality by adapting techniques in [20] to the double-phase case. Throughout this section, let \(z_0=(x_0,t_0)\in \Omega _T\), with \(x_0\in \Omega \) and \(t_0\in (0,T)\), be a Lebesgue point of \(|\nabla u(z)|^p+a(z)|\nabla u(z)|^q\) satisfying

$$\begin{aligned} |\nabla u(z_0)|^p+a(z_0)|\nabla u(z_0)|^q>\Lambda \end{aligned}$$
(3.1)

for some \(\Lambda >1+\Vert a\Vert _{L^\infty (\Omega _T)}\). Recall that \(H(z,s):\Omega _T\times \mathbb {R}^{+ }\longrightarrow \mathbb {R}^+\), \(H(z,s)=s^p+a(z)s^q\). For a fixed point \(z_0\), we denote

$$\begin{aligned} H_{z_0}(s)=s^p+a(z_0)s^q. \end{aligned}$$
(3.2)

Note that \(H_{z_0}(s)\) is strictly increasing and continuous with

$$\begin{aligned} \lim _{s\rightarrow 0^+}H_{z_0}(s)=0 \quad \text {and}\quad \lim _{s\rightarrow \infty }H_{z_0}(s)=\infty . \end{aligned}$$

By the intermediate value theorem for continuous functions, there exists \(\lambda =\lambda (z_0)>1\) such that

$$\begin{aligned} \Lambda =\lambda ^p+a(z_0)\lambda ^q=H_{z_0}(\lambda ). \end{aligned}$$
(3.3)
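Two elementary consequences of (3.1)–(3.3) are used repeatedly. The following check is a sketch that relies only on the monotonicity of \(H_{z_0}\) and on the choice \(\Lambda >1+\Vert a\Vert _{L^\infty (\Omega _T)}\). First, if \(\lambda \leqq 1\), then

$$\begin{aligned} H_{z_0}(\lambda )=\lambda ^p+a(z_0)\lambda ^q\leqq 1+a(z_0)\leqq 1+\Vert a\Vert _{L^\infty (\Omega _T)}<\Lambda , \end{aligned}$$

which contradicts (3.3). Hence \(\lambda >1\). Second, (3.1) and (3.3) give

$$\begin{aligned} H_{z_0}(|\nabla u(z_0)|)>\Lambda =H_{z_0}(\lambda )\quad \Longrightarrow \quad |\nabla u(z_0)|>\lambda , \end{aligned}$$

since \(H_{z_0}\) is strictly increasing. Thus \(z_0\) also belongs to the superlevel set \(\{ z\in \Omega _T:|\nabla u(z)|^p>\lambda ^p\}\).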

Let

$$\begin{aligned} M_1=\frac{1}{2|B_1|}\iint _{\Omega _T}\left( H(z,|\nabla u|)+H(z,|F|)\right) \,\textrm{d}z. \end{aligned}$$

The parameter \(K=K(n,\alpha ,[a]_{\alpha },M_1)>1\) will be determined later in (5.2).

The p-intrinsic cylinders and the (pq)-intrinsic cylinders are considered separately in the argument. In the p-intrinsic case we assume that

(3.4)

where

$$\begin{aligned} Q_{4\rho }^\lambda (z_0)= B_{4\rho }(x_0)\times I_{4\rho }^\lambda (t_0), \quad I_{4\rho }^\lambda (t_0)=(t_0-\lambda ^{2-p}(4\rho )^2,t_0+\lambda ^{2-p}(4\rho )^2), \end{aligned}$$
(3.5)

is a p-intrinsic cylinder. In the (pq)-intrinsic case we assume that

(3.6)

where

$$\begin{aligned} G_{4\rho }^\lambda (z_0)=B_{4\rho }(x_0)\times J_{4\rho }^\lambda (t_0),\quad J_{4\rho }^\lambda (t_0)=\left( t_0-\tfrac{\lambda ^2}{H_{z_0}(\lambda )}(4\rho )^2,t_0+\tfrac{\lambda ^2}{H_{z_0}(\lambda )}(4\rho )^2\right) , \end{aligned}$$
(3.7)

is a (pq)-intrinsic cylinder.

3.1 The p-intrinsic case

In this case we consider estimates in p-intrinsic cylinders as in (3.5) and assume that (3.4) holds. We begin by estimating the last term in Lemma 2.5.

Lemma 3.1

Let u be a weak solution to (1.1). Then, for \(s\in [2\rho ,4\rho ]\) and \(\theta \in ((q-1)/p,1]\), there exists a constant \(c=c(n,p,q,\alpha ,L,[a]_{\alpha },M_1)\) such that

whenever \(Q_{4\rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (3.4).

Proof

It follows from (1.3) that \(q-1<p\). By (1.3) there exists a constant \(c=c([a]_{\alpha })\) such that

(3.8)

We apply the first condition in (3.4) to estimate the second term on the right-hand side of (3.8) and obtain

In order to estimate the last term on the right-hand side of (3.8), we recall that \(|Q_{s}^\lambda (z_0)|=c(n)s^{n+2}\lambda ^{2-p}\). Hölder’s inequality gives

where \(c=c(n)\), \(\gamma =\alpha p/(n+2)\) and \(\theta \in ((q-1)/p,1]\). We have \(\gamma \in (0,p-1)\), since

$$\begin{aligned} 1<\frac{2(n+1)}{n+2}\Longrightarrow 1<\frac{(n+1)p}{n+2}\Longrightarrow \gamma =\frac{\alpha p}{n+2}<p-1. \end{aligned}$$
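Both exponent restrictions used in this proof can be traced back to (1.3) together with \(p\geqq 2\), \(n\geqq 2\) and \(\alpha \leqq 1\). Spelled out as a sketch, the first condition in (1.3) gives

$$\begin{aligned} q-1\leqq p+\frac{2\alpha }{n+2}-1\leqq p+\frac{2}{n+2}-1<p, \end{aligned}$$

so the interval \(((q-1)/p,1]\) is non-empty, and

$$\begin{aligned} p-1-\gamma \geqq p-1-\frac{p}{n+2}=\frac{(n+1)p-(n+2)}{n+2}\geqq \frac{2(n+1)-(n+2)}{n+2}=\frac{n}{n+2}>0, \end{aligned}$$

which quantifies the gap between \(\gamma \) and \(p-1\).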

It follows from the second condition in (3.4), \(\lambda \geqq 1\) and the first condition in (1.3) that

where \(c=c(n,p,q,\alpha )\). Therefore, we obtain

where \(c=c(n,p,q,\alpha ,M_1)\). Similarly, replacing \(|\nabla u|\) by |F| in the above argument, we have

This completes the proof. \(\quad \square \)

Next we provide a p-intrinsic parabolic Poincaré inequality.

Lemma 3.2

Let u be a weak solution to (1.1). Then, for \(s\in [2\rho ,4\rho ]\) and \(\theta \in ((q-1)/p,1]\), there exists a constant \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\) such that

whenever \(Q_{4\rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (3.4).

Proof

By Lemmas 2.5 and 3.1, there exists a constant \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\) such that

(3.9)

To estimate the second term on the right-hand side of (3.9), we use the second condition in (3.4) and obtain

where \(c=c(n,p)\). Similarly, the third term on the right-hand side of (3.9) is estimated as

where \(c=c(n,p,q)\). The conclusion follows from Hölder’s inequality. \(\quad \square \)

Lemma 3.3

Let u be a weak solution to (1.1). Then for \(Q_{4\rho }^{\lambda }(z_0)\subset \Omega _T\) satisfying (3.4), \(s\in [2\rho ,4\rho ]\) and \(\theta \in ((q-1)/p,1]\), there exists a constant \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\) such that

Proof

By Lemmas 2.5 and 3.1, there exists a constant \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\) such that

(3.10)

By (3.4) for the second term on the right-hand side of (3.10), we obtain

Similarly, the third and the fourth terms on the right-hand side of (3.10) can be estimated as

and

The conclusion follows from Hölder’s inequality. \(\quad \square \)

3.2 The (pq)-intrinsic case

In this case we consider estimates in (pq)-intrinsic cylinders as in (3.7) and assume that (3.6) holds. The second and third conditions in (3.6) imply

It follows that

(3.11)

Next we discuss a (pq)-intrinsic parabolic Poincaré inequality.

Lemma 3.4

Let u be a weak solution to (1.1). Then, for \(\theta \in ((q-1)/p,1]\) and \(s\in [2\rho ,4\rho ]\), there exists a constant \(c=c(n,N,p,q,L)\) such that

whenever \(G_{4\rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (3.6).

Proof

Note that

By Lemma 2.5, there exists a constant \(c=c(n,N,p,q,L)\) such that

(3.12)

Since

$$\begin{aligned} H_{z_0}'(s)=ps^{p-1}+qa(z_0)s^{q-1} \end{aligned}$$

for every \(s>0\), we have

$$\begin{aligned} sH_{z_0}'(s)\leqq q H_{z_0}(s) \leqq \tfrac{q}{p} sH_{z_0}'(s) \end{aligned}$$
(3.13)

for every \(s>0\). We estimate the last term on the right-hand side of (3.12) by (3.13) and obtain

By the same argument as above for \(H_{z_0}'(|\nabla u|+|F|)\), it follows from (3.11) that

where \(c=c(n,p,q)\). Therefore, we obtain

(3.14)

In order to estimate the first term on the right-hand side of (3.14), we apply (3.11). Keeping in mind that \(q-1<p\) and \(p\geqq 2\), we have

for any \(\theta \in ((q-1)/p,1]\) with \(c=c(n,p)\). We estimate the last term on the right-hand side of (3.14) in a similar way. Then for any \(\theta \in ((q-1)/q,1]\), we have

where \(c=c(n,p)\). Hence, we conclude that

which completes the proof. \(\quad \square \)
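The two-sided bound (3.13) used in the proof above can be verified directly from the formula for \(H_{z_0}'\). As a short check, multiplying by s gives

$$\begin{aligned} sH_{z_0}'(s)=ps^{p}+qa(z_0)s^{q}, \end{aligned}$$

and since \(p<q\) and \(a(z_0)\geqq 0\), we obtain

$$\begin{aligned} pH_{z_0}(s)=ps^p+pa(z_0)s^q\leqq sH_{z_0}'(s)\leqq qs^p+qa(z_0)s^q=qH_{z_0}(s). \end{aligned}$$

The upper bound is the first inequality in (3.13), and multiplying the lower bound by \(q/p\) gives the second one.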

Note that by replacing \(H_{z_0}^{\theta }(s)\) with \(s^{\theta p}\) in the proof of Lemma 3.4, we will also have the following result. All necessary calculations are already contained in the proof of the previous lemma.

Lemma 3.5

Let u be a weak solution to (1.1). Then, for \(\theta \in ((q-1)/p,1]\) and \(s\in [2\rho ,4\rho ]\), there exists a constant \(c=c(n,N,p,q,L)\) such that

whenever \(G_{4\rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (3.6).

4 Reverse Hölder inequalities

In this section, we assume that \(z_0\), \(\Lambda \) and \(\lambda =\lambda (z_0)\geqq 1\) satisfy (3.1)–(3.3). Let

$$\begin{aligned} K=1+40[a]_\alpha M_1^\frac{\alpha }{n+2}\quad \text {and}\quad \kappa =10K. \end{aligned}$$

The distribution sets are denoted as

$$\begin{aligned} \Psi (\Lambda )=\{ z\in \Omega _T: H(z,|\nabla u(z)|)>\Lambda \} \end{aligned}$$
(4.1)

and

$$\begin{aligned} \Phi (\Lambda )=\{ z\in \Omega _T: H(z,|F|)>\Lambda \}. \end{aligned}$$
(4.2)

We consider the p-intrinsic and (pq)-intrinsic cases separately. In the first case we assume that

(4.3)

where the p-intrinsic cylinder is defined in (3.5). In the second case we assume that

(4.4)

where the (pq)-intrinsic cylinder is defined in (3.7). We discuss reverse Hölder inequalities in both cases separately.

The following auxiliary lemmas will be employed in the argument. The first lemma is a Gagliardo–Nirenberg inequality and the second one is a standard iteration lemma, see [15, Lemma 8.3].

Lemma 4.1

Let \(B_{\rho }(x_0)\subset \mathbb {R}^n\), \(\sigma ,s,r\in [1,\infty )\) and \(\vartheta \in (0,1)\) such that

$$\begin{aligned} -\frac{n}{\sigma }\leqq \vartheta \left( 1-\frac{n}{s}\right) -(1-\vartheta )\frac{n}{r}. \end{aligned}$$

Then there exists a constant \(c=c(n,\sigma )\) such that

for every \(v\in W^{1,s}(B_{\rho }(x_0))\).

Lemma 4.2

Let \(0<r<R<\infty \) and \(h:[r,R]\longrightarrow \mathbb {R}\) be a non-negative and bounded function. Suppose there exist \(\vartheta \in (0,1)\), \(A,B\geqq 0\) and \(\gamma >0\) such that

$$\begin{aligned} h(r_1)\leqq \vartheta h(r_2)+\frac{A}{(r_2-r_1)^\gamma }+B \quad \text {for all}\quad 0<r\leqq r_1<r_2\leqq R. \end{aligned}$$

Then there exists a constant \(c=c(\vartheta ,\gamma )\) such that

$$\begin{aligned} h(r)\leqq c\left( \frac{A}{(R-r)^\gamma }+B\right) . \end{aligned}$$

4.1 The p-intrinsic case

In this case we consider estimates in p-intrinsic cylinders as in (3.5) and assume that (4.3) holds. We denote

and \(M_2=\Vert u\Vert _{L^\infty (0,T;L^2(\Omega ))}\).

Lemma 4.3

Let u be a weak solution to (1.1). Then there exists a constant \(c=c( data )\) such that

whenever \(Q_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.3).

Proof

Let \(2\rho \leqq \rho _1<\rho _2\leqq 4\rho \). By Lemma 2.3 there exists a constant \(c=c(n,p,q,\nu ,L)\) such that

(4.5)

We estimate the first term on the right-hand side of (4.5). By Lemma 3.2 with \(\theta =1\) and the second condition in (4.3), we obtain

(4.6)

where \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\). On the other hand, we observe that

By Lemma 3.3 with \(\theta =1\) and (4.3), we obtain

where \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\). On the other hand, by Lemma 4.1 with \(\sigma =q\), \(s=p\), \(r=2\) and \(\vartheta =\tfrac{p}{q}\), we obtain

where \(c=c(n,q)\). We observe that

where \(c=c(n,N,p,q,\alpha ,{{\,\textrm{diam}\,}}(\Omega ),M_2)\). Furthermore, by (4.6) we have

where \(c=c(n,N,p,q,\alpha ,L,[a]_\alpha ,{{\,\textrm{diam}\,}}(\Omega ),M_1,M_2)\).

For the second term on the right-hand side of (4.5), the Poincaré inequality implies that

where \(c=c(n,N,p)\). By Hölder’s inequality and (4.6), we have

where \(c=c(n,N,p,q,\alpha ,L,[a]_{\alpha },M_1)\).

For the last term on the right-hand side of (4.5), by (4.3) we obtain

By combining all estimates above, we conclude from (4.5) that

$$\begin{aligned} \begin{aligned} S(u,Q_{\rho _1}^\lambda (z_0))\leqq c\frac{\rho _2^q}{(\rho _2-\rho _1)^q}\lambda ^2+c\frac{ \rho _2^2}{(\rho _2-\rho _1)^2}\lambda \ S(u,Q_{\rho _2}^\lambda (z_0))^\frac{1}{2}. \end{aligned} \end{aligned}$$

Finally, we apply Young’s inequality to obtain

$$\begin{aligned} \begin{aligned} S(u,Q_{\rho _1}^\lambda (z_0))\leqq \frac{1}{2}S(u,Q_{\rho _2}^\lambda (z_0))+ c\left( \frac{\rho _2^q}{(\rho _2-\rho _1)^q}+\frac{ \rho _2^4}{(\rho _2-\rho _1)^4}\right) \lambda ^2. \end{aligned} \end{aligned}$$
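The last step is the elementary Young inequality \(XY\leqq \tfrac{1}{2}X^2+\tfrac{1}{2}Y^2\). As a sketch, with the shorthand \(X=S(u,Q_{\rho _2}^\lambda (z_0))^{\frac{1}{2}}\) and \(Y=c\rho _2^2(\rho _2-\rho _1)^{-2}\lambda \), where the letters X and Y are introduced only for this illustration, we have

$$\begin{aligned} c\frac{\rho _2^2}{(\rho _2-\rho _1)^2}\lambda \, S(u,Q_{\rho _2}^\lambda (z_0))^{\frac{1}{2}}\leqq \frac{1}{2}S(u,Q_{\rho _2}^\lambda (z_0))+\frac{c^2}{2}\frac{\rho _2^4}{(\rho _2-\rho _1)^4}\lambda ^2. \end{aligned}$$

One way to match the form required by Lemma 4.2 is to note that \(\rho _2/(\rho _2-\rho _1)>1\) and \(\rho _2\leqq 4\rho \), so that both fractions are bounded by \((4\rho )^{\gamma }(\rho _2-\rho _1)^{-\gamma }\) with \(\gamma =\max \{q,4\}\). The iteration lemma then applies on \([2\rho ,4\rho ]\) with \(\vartheta =\tfrac{1}{2}\), \(A=c(4\rho )^{\gamma }\lambda ^2\) and \(B=0\), and the resulting factor \((4\rho )^{\gamma }/(4\rho -2\rho )^{\gamma }=2^{\gamma }\) is independent of \(\rho \).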

The proof is concluded by an application of Lemma 4.2. \(\quad \square \)

Next we prove an estimate for the first term on the right-hand side of the energy estimate in Lemma 2.3 by using Lemma 4.1.

Lemma 4.4

Let u be a weak solution to (1.1). Then there exist constants \(c=c( data )\) and \(\theta _0=\theta _0(n,p,q)\in (0,1)\) such that for any \(\theta \in (\theta _0,1)\),

whenever \(Q_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.3).

Proof

By (1.3) we obtain

(4.7)

We begin with the first term on the right-hand side of (4.7). By choosing \(\sigma =p\), \(s=\theta p\) and \(r=2\), we see that any \(\theta \in (n/(n+2),1)\) satisfies the condition in Lemma 4.1 as

$$\begin{aligned} -\frac{n}{p}\leqq \theta \left( 1-\frac{n}{\theta p}\right) -(1-\theta )\frac{n}{2}\Longleftrightarrow \frac{n}{n+2}\leqq \theta . \end{aligned}$$
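The equivalence is a matter of expanding the right-hand side. As a brief check, with \(\vartheta =\theta \) in Lemma 4.1,

$$\begin{aligned} \theta \left( 1-\frac{n}{\theta p}\right) -(1-\theta )\frac{n}{2}=\theta -\frac{n}{p}-\frac{n}{2}+\frac{\theta n}{2}, \end{aligned}$$

so the term \(-n/p\) cancels on both sides and the condition reduces to

$$\begin{aligned} 0\leqq \theta \,\frac{n+2}{2}-\frac{n}{2}\Longleftrightarrow \frac{n}{n+2}\leqq \theta . \end{aligned}$$

The same arithmetic with \(\vartheta =\theta p/q\) in place of \(\theta \) gives the threshold \(nq/((n+2)p)\) that appears later in this proof.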

Thus, we obtain

where \(c=c(n,p)\).

For the second term on the right-hand side of (4.7), we apply Lemma 4.1 with \(\sigma =q\), \(s=\theta q\) and \(r=2\). For any \(\theta \in (n/(n+2),1)\), we have

where \(c=c(n,q)\). By using the first condition in (4.3), we have

Then we consider the last term on the right-hand side of (4.7). We observe that

$$\begin{aligned} \frac{nq}{(n+2)p}\leqq \frac{n}{n+2}\left( 1+\frac{2}{(n+2)p}\right) \leqq \frac{n}{n+2}\frac{n+3}{n+2}<1. \end{aligned}$$
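Each inequality in this chain follows from (1.3), \(\alpha \leqq 1\) and \(p\geqq 2\). Written out as a sketch, dividing the first condition in (1.3) by p gives

$$\begin{aligned} \frac{q}{p}\leqq 1+\frac{2\alpha }{(n+2)p}\leqq 1+\frac{2}{(n+2)p}\leqq 1+\frac{1}{n+2}=\frac{n+3}{n+2}, \end{aligned}$$

and the final strict inequality is

$$\begin{aligned} n(n+3)=n^2+3n<n^2+4n+4=(n+2)^2. \end{aligned}$$

Consequently the interval \((nq/((n+2)p),1)\) of admissible \(\theta \) is non-empty.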

Thus, by letting \(\sigma =q\), \(s=\theta p\), \(r = 2\) and \(\vartheta =\theta p/q\), the assumptions in Lemma 4.1 are satisfied for any \(\theta \in (nq/((n+2)p),1)\), since

$$\begin{aligned} -\frac{n}{q}\leqq \frac{\theta p}{q}\left( 1-\frac{n}{\theta p}\right) -\left( 1-\frac{\theta p}{q}\right) \frac{n}{2}\Longleftrightarrow \frac{n q}{(n+2)p}\leqq \theta . \end{aligned}$$

Therefore, we have

where \(c=c(n,q)\). Note that

Thus we obtain

where \(c=c(n,p,q,\alpha ,{{\,\textrm{diam}\,}}(\Omega ),M_2)\). The claim follows by combining the estimates above. \(\quad \square \)

At this stage, we have all the required tools to prove the reverse Hölder inequality when (4.3) holds true.

Lemma 4.5

Let u be a weak solution to (1.1). Then there exist constants \(c=c( data )\) and \(\theta _0=\theta _0(n,p,q)\in (0,1)\) such that for any \(\theta \in (\theta _0,1)\),

whenever \(Q_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.3).

Proof

Lemma 2.3 implies that

(4.8)

where \(c=c(n,p,q,\nu ,L)\). To estimate the first term on the right-hand side in (4.8), we apply Lemmas 4.3 and 4.4 to conclude that there exist \(\theta _0=\theta _0(n,p,q)\in (0,1)\) and \(c=c( data )\) such that for any \(\theta \in (\theta _0,1)\),

By Lemmas 3.2 and 3.3 we obtain

By recalling that \(\tfrac{\alpha p}{n+2} < p-1\) and letting

$$\begin{aligned} \beta =\min \left\{ p-1-\frac{\alpha p}{n+2},\frac{1}{2}\right\} , \end{aligned}$$

we have

To estimate the second term on the right-hand side of (4.8), we apply the Poincaré inequality with \(\theta \in (2n/((n+2)p),1)\) and Lemma 4.3 to obtain

where \(c=c( data )\). Lemma 3.2 implies that

By combining the estimates above and applying (4.8) and Young’s inequality, we obtain

The third condition in (4.3) implies that

This completes the proof. \(\quad \square \)

The following lemma will be used in the next section.

Lemma 4.6

Let u be a weak solution to (1.1). Then there exist constants \(c=c( data )\) and \(\theta _0=\theta _0(n,p,q)\in (0,1)\) such that for any \(\theta \in (\theta _0,1)\),

$$\begin{aligned} \begin{aligned}&\iint _{Q_{2\kappa \rho }^\lambda (z_0)}H(z,|\nabla u|)\,\textrm{d}z \leqq c\Lambda ^{1-\theta }\iint _{Q_{2\rho }^\lambda (z_0)\cap \Psi (c^{-1}\Lambda )}H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad +c\iint _{Q_{2\rho }^\lambda (z_0)\cap \Phi (c^{-1}\Lambda )}H(z,|F|)\,\textrm{d}z, \end{aligned} \end{aligned}$$

whenever \(Q_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.3). Here \(\Psi (\Lambda )\) and \(\Phi (\Lambda )\) are defined in (4.1) and (4.2).

Proof

The second condition in (4.3) implies that

By representing \(Q_{2\rho }^\lambda (z_0)\) as a union of \(Q_{2\rho }^\lambda (z_0)\cap \Psi ((4c)^{-1/\theta }\lambda ^p)\) and \(Q_{2\rho }^\lambda (z_0)\setminus \Psi ((4c)^{-1/\theta }\lambda ^p)\), we have

for any \(c > 0\). A similar argument gives

It follows from Lemma 4.5 that

By recalling the second and third conditions in (4.3), we obtain

Thus, we have

$$\begin{aligned} \iint _{Q_{2\kappa \rho }^\lambda (z_0)}H(z,|\nabla u|)\,\textrm{d}z&\leqq 2c\lambda ^{p(1-\theta )}\iint _{Q_{2\rho }^\lambda (z_0)\cap \Psi ((4c)^{-1/\theta }\lambda ^p)}H(z,|\nabla u|)^{\theta }\,\textrm{d}z\nonumber \\&\quad +2c\iint _{Q_{2\rho }^\lambda (z_0)\cap \Phi ((4c)^{-1}\lambda ^p)}H(z,|F|)\,\textrm{d}z. \end{aligned}$$
(4.9)

We note that

$$\begin{aligned} \frac{\lambda ^p}{4c}\geqq \frac{\lambda ^p}{(4c)^{1/\theta }}\geqq \frac{\lambda ^p}{(4c)^{1/\theta _0}}\geqq \frac{\lambda ^p+a(z_0)\lambda ^q}{2K(4c)^{1/\theta _0}}=\frac{\Lambda }{2K(4c)^{1/\theta _0}}, \end{aligned}$$

where we applied the first condition in (4.3). The estimate above implies that

$$\begin{aligned}&\Psi ((4c)^{-1/\theta }\lambda ^p)\subset \Psi ((2K(4c)^{1/\theta _0})^{-1}\Lambda ) \quad \text {and}\quad \\&\quad \Phi ((4c)^{-1}\lambda ^p)\subset \Phi ((2K(4c)^{1/\theta _0})^{-1}\Lambda ). \end{aligned}$$
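The chain of lower bounds for \(\lambda ^p/(4c)\) can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: it assumes \(4c\geqq 1\), \(K\geqq 1\) and \(\theta _0\leqq \theta <1\), and it reads the first condition in (4.3) as \(K\lambda ^p\geqq a(z_0)\lambda ^q\), in line with case (1) of Section 5.1. The function names and sampling ranges are hypothetical.

```python
import random


def chain_holds(lam, a0, p, q, K, c, theta, theta0):
    """Check the four-term chain of lower bounds starting from lam**p / (4c)."""
    Lam = lam**p + a0 * lam**q
    terms = [
        lam**p / (4 * c),
        lam**p / (4 * c) ** (1 / theta),
        lam**p / (4 * c) ** (1 / theta0),
        Lam / (2 * K * (4 * c) ** (1 / theta0)),
    ]
    tol = 1e-12 * terms[0]  # absorbs rounding in the equality cases
    return all(x >= y - tol for x, y in zip(terms, terms[1:]))


def sample_check(trials=1000, seed=0):
    """Random search for a counterexample under the stated assumptions."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.uniform(2.0, 4.0)
        q = p + rng.uniform(0.0, 0.5)
        K = rng.uniform(1.0, 50.0)
        c = rng.uniform(0.25, 100.0)  # assumption: 4c >= 1
        theta0 = rng.uniform(0.1, 0.9)
        theta = rng.uniform(theta0, 1.0)
        lam = rng.uniform(0.1, 10.0)
        # assumption: K * lam**p >= a0 * lam**q
        a0 = rng.uniform(0.0, 1.0) * K * lam ** (p - q)
        if not chain_holds(lam, a0, p, q, K, c, theta, theta0):
            return False
    return True
```

The last step of the chain is where the first condition in (4.3) enters: it gives \(\lambda ^p+a(z_0)\lambda ^q\leqq (1+K)\lambda ^p\leqq 2K\lambda ^p\). Without it the chain can fail, as the call `chain_holds(1.0, 100.0, 2.0, 2.5, 1.0, 1.0, 0.9, 0.5)` shows.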

Therefore, by replacing \(2K(4c)^{1/\theta _0}\) with c, (4.9) can be written as

$$\begin{aligned} \begin{aligned} \iint _{Q_{2\kappa \rho }^\lambda (z_0)}H(z,|\nabla u|)\,\textrm{d}z&\leqq c\Lambda ^{1-\theta }\iint _{Q_{2\rho }^\lambda (z_0)\cap \Psi (c^{-1}\Lambda )}H(z,|\nabla u|)^{\theta }\,\textrm{d}z\\&\quad +c\iint _{Q_{2\rho }^\lambda (z_0)\cap \Phi (c^{-1}\Lambda )}H(z,|F|)\,\textrm{d}z. \end{aligned} \end{aligned}$$

This completes the proof. \(\quad \square \)

4.2 The (pq)-intrinsic case

In this case we consider estimates in (pq)-intrinsic cylinders as in (3.7) and assume that (4.4) holds. We remark that constants in the estimates depend only on \(n,N,p,q,\nu ,L\) since (1.1) reduces to a parabolic (pq)-Laplace system in \(G_{2\kappa \rho }^\lambda (z_0)\). We denote

Lemma 4.7

Let u be a weak solution to (1.1). Then there exists a constant \(c=c(n,N,p,q,\nu ,L)\) such that

whenever \(G_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.4).

Proof

Let \(2\rho \leqq \rho _1<\rho _2\leqq 4\rho \). By Lemma 2.3, there exists a constant \(c=c(n,p,q,\nu ,L)\) such that

(4.10)

For the first term on the right-hand side of (4.10), we apply Lemma 3.4 together with the second and third conditions in (4.4) to obtain

where \(c=c(n,N,p,q,L)\).

For the second term on the right-hand side of (4.10), as in the proof of Lemma 4.3, we obtain

where \(c=c(n,N,p)\). By using Lemma 3.5 and (3.11), we obtain

where \(c=c(n,N,p,q,L)\). By combining estimates and arguing as in the proof of Lemma 4.3, we have

$$\begin{aligned} \begin{aligned} S(u,G_{\rho _1}^\lambda (z_0)) \leqq&\frac{1}{2}S(u,G_{\rho _2}^\lambda (z_0))+c\left( \frac{\rho _2^q}{(\rho _2-\rho _1)^q}+\frac{ \rho _2^4}{(\rho _2-\rho _1)^4}\right) \lambda ^2. \end{aligned} \end{aligned}$$

The conclusion follows by applying Lemma 4.2. \(\quad \square \)

Lemma 4.8

Let u be a weak solution to (1.1). Then there exists a constant \(c=c(n,p,q)\) such that for any \(\theta \in (n/(n+2),1)\),

whenever \(G_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.4).

Proof

From the second condition in (4.4), we obtain

By Lemma 4.1, there exists a constant \(c=c(n,p,q)\) such that for any \(\theta \in (n/(n+2),1)\), we have

and

Thus we conclude that

This completes the proof. \(\quad \square \)

Lemma 4.9

Let u be a weak solution to (1.1). Then there exist constants \(c=c(n,N,p,q,\nu ,L)\) and \(\theta _0=\theta _0(n,p,q)\in (0,1)\) such that for any \(\theta \in (\theta _0,1)\),

whenever \(G_{2\kappa \rho }^{\lambda }(z_0)\subset \Omega _T\) satisfies (4.4). Moreover, we have

$$\begin{aligned} \begin{aligned}&\iint _{G_{2\kappa \rho }^\lambda (z_0)}H(z,|\nabla u|)\,\textrm{d}z \leqq c\Lambda ^{1-\theta }\iint _{G_{2\rho }^\lambda (z_0)\cap \Psi (c^{-1}\Lambda )}H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad +c\iint _{G_{2\rho }^\lambda (z_0)\cap \Phi (c^{-1}\Lambda )}H(z,|F|)\,\textrm{d}z, \end{aligned} \end{aligned}$$

where \(\Psi (\Lambda )\) and \(\Phi (\Lambda )\) are as in (4.1) and (4.2).

Proof

Once the first estimate in the statement holds, then the second estimate follows as in the proof of Lemma 4.6.

To prove the first estimate in the statement, we apply Lemma 2.3 to obtain

(4.11)

Using Lemmas 4.8, 3.4 and 4.7 for the first term on the right-hand side of (4.11), we obtain

As in the proof of Lemma 4.5, we obtain

and from Lemma 3.5 we conclude that

For the second term on the right-hand side of (4.11), we have

where

A similar argument for |F| gives

By collecting all the estimates above and applying Young’s inequality together with the second condition in (4.4), we obtain

We use the fourth condition in (4.4) to absorb the first term on the right-hand side. This completes the proof. \(\quad \square \)

5 The proof of Theorem 1.1

In this section, we will complete the proof of Theorem 1.1. We divide the section into three subsections. In the first subsection, we construct intrinsic cylinders which are either p-intrinsic or (pq)-intrinsic, see (4.3) and (4.4). In the second subsection, we prove a Vitali type covering property for the system of intrinsic cylinders constructed in the first subsection. Note that the collection consists of two different types of intrinsic cylinders depending on the center point of the cylinders. Finally, in the last subsection, we complete the proof of the gradient estimate by applying Fubini’s theorem together with Lemma 4.2.

5.1 Stopping time argument

Let

(5.1)

where \(Q_{2r}(z_0)=B_{2r}(x_0)\times (t_0-(2r)^2,t_0+(2r)^2)\). Moreover, let

$$\begin{aligned} K=1+40[a]_{\alpha } M_1^\frac{\alpha }{n+2}\quad \text {and}\quad \kappa =10K. \end{aligned}$$
(5.2)

With \(\Psi (\Lambda )\) and \(\Phi (\Lambda )\) as in (4.1)–(4.2) and \(\rho \in [r,2r]\), we denote

$$\begin{aligned} \Psi (\Lambda ,\rho )=\Psi (\Lambda )\cap Q_{\rho }(z_0)=\{ z\in Q_{\rho }(z_0): H(z,|\nabla u(z)|)>\Lambda \} \end{aligned}$$

and

$$\begin{aligned} \Phi (\Lambda ,\rho )=\Phi (\Lambda )\cap Q_{\rho }(z_0) =\{ z\in Q_{\rho }(z_0): H(z,|F(z)|)>\Lambda \}. \end{aligned}$$

Next we apply a stopping time argument. Let \(r\leqq r_1<r_2\leqq 2r\) and

$$\begin{aligned} \Lambda >\left( \frac{4\kappa r}{r_2-r_1}\right) ^\frac{q(n+2)}{2}\Lambda _0, \end{aligned}$$
(5.3)

where \(\kappa \) is as in (5.2). For every \(w\in \Psi (\Lambda ,r_1)\), let \(\lambda _{w}>0\) be such that

$$\begin{aligned} \Lambda =\lambda _{w}^p+a(w)\lambda _{w}^q=H_{w}(\lambda _{w}), \end{aligned}$$
(5.4)

where \(H_{w}\) is as in (3.2). We claim that

$$\begin{aligned} \lambda _{w}>\left( \frac{4\kappa r}{r_2-r_1}\right) ^\frac{n+2}{2}\lambda _0. \end{aligned}$$

For a contradiction, assume that the inequality above does not hold. Then

$$\begin{aligned} \Lambda < \left( \frac{4\kappa r}{r_2-r_1}\right) ^{\frac{q(n+2)}{2}}\left( \lambda _0^p+a(w)\lambda _0^q\right) \leqq \left( \frac{ 4\kappa r}{r_2-r_1}\right) ^{\frac{q(n+2)}{2}}\Lambda _0, \end{aligned}$$

which contradicts (5.3). Therefore, for \(s\in [(r_2-r_1)/(2\kappa ),r_2-r_1)\), we have

(5.5)
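The relation (5.4) determines \(\lambda _{w}\) uniquely, since \(s\mapsto s^p+a(w)s^q\) is continuous, strictly increasing and vanishes at \(s=0\). As an illustration only, the following sketch solves (5.4) by bisection for a frozen coefficient value; the names `H` and `solve_lambda` and the iteration count are hypothetical choices, not notation from the paper.

```python
def H(s, a, p, q):
    """Double-phase function s**p + a*s**q with a frozen coefficient a >= 0."""
    return s**p + a * s**q


def solve_lambda(Lam, a, p, q, iters=200):
    """Return lam > 0 with H(lam, a, p, q) = Lam, computed by bisection."""
    lo, hi = 0.0, Lam ** (1.0 / p)  # H(hi) >= hi**p = Lam, so the root lies in (lo, hi]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if H(mid, a, p, q) < Lam:
            lo = mid
        else:
            hi = mid
    return hi
```

For \(a(w)=0\) the routine returns \(\Lambda ^{1/p}\). The contradiction argument above rests on the scaling property \(H_{w}(Cs)\leqq C^qH_{w}(s)\) for \(C\geqq 1\), which follows from \(p<q\) and can be tested with the same function `H`.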

Since \(w\in \Psi (\Lambda ,r_1)\) and (5.4) holds, it follows that \(w\in \Psi (\lambda _{w}^p,r_1)\). By the Lebesgue differentiation theorem there exists \(\rho _{w}\in (0,(r_2-r_1)/(2\kappa ))\) such that

(5.6)

and

(5.7)

for every \(s\in (\rho _{w},r_2-r_1)\). Observe that (5.6) and (5.5) imply that

$$\begin{aligned} \lambda _{w}\leqq \left( \frac{2r}{\rho _{w}} \right) ^\frac{n+2}{2} \lambda _0. \end{aligned}$$
(5.8)

For \(K>1\) as in (5.2), either

$$\begin{aligned} K\lambda _{w}^p\geqq a(w)\lambda _{w}^q \quad \text {or}\quad K\lambda _{w}^p\leqq a(w)\lambda _{w}^q. \end{aligned}$$
(5.9)

In addition, either

$$\begin{aligned} a(w)\geqq 2[a]_{\alpha }(10\rho _{w})^\alpha \quad \text {or}\quad a(w)\leqq 2[a]_{\alpha }(10\rho _{w})^\alpha . \end{aligned}$$
(5.10)

We consider three cases:

  1. (1)

    \(K\lambda _{w}^p\geqq a(w)\lambda _{w}^q\), that is, the first condition in (5.9) holds,

  2. (2)

    \(K\lambda _{w}^p\leqq a(w)\lambda _{w}^q\) and \(a(w)\geqq 2[a]_{\alpha }(10\rho _{w})^\alpha \), that is, the second condition in (5.9) and the first condition in (5.10) hold, and

  3. (3)

    \(K\lambda _{w}^p\leqq a(w)\lambda _{w}^q\) and \(a(w)\leqq 2[a]_{\alpha }(10\rho _{w})^\alpha \), that is, the second condition in (5.9) and the second condition in (5.10) hold.

First we note that (1), together with (5.6)–(5.7), implies (4.3) for p-intrinsic cylinders by replacing the center point and radius with \(w\) and \(\rho _{w}\). Next we show that (2) implies (4.4) for (pq)-intrinsic cylinders. From the second condition in (5.9) we obtain \(a(w)>0\) and \(G_{s}^{\lambda _{w}}(w)\subsetneq Q_s^{\lambda _{w}}(w)\). By (5.6)–(5.7), we obtain

for every \(s\in (\rho _{w},r_2-r_1)\). Recall that \(w\in \Psi (\Lambda ,r_1)\) and \(\Lambda =H_{w}(\lambda _{w})\). We find \(\varsigma _{w}\in (0,\rho _{w}]\) such that

(5.11)

and

for every \(s\in (\varsigma _{w},r_2-r_1)\). Moreover, it follows from the first condition in (5.10) that

$$\begin{aligned} 2[a]_{\alpha }(10\rho _{w})^\alpha \leqq a(w)\leqq \inf _{Q_{10\rho _{w}}(w)}a(z)+[a]_{\alpha }(10\rho _{w})^\alpha \end{aligned}$$

and

$$\begin{aligned} \sup _{Q_{10\rho _{w}}(w)}a(z)\leqq \inf _{Q_{10\rho _{w}}(w)}a(z)+[a]_{\alpha }(10\rho _{w})^\alpha \leqq 2\inf _{Q_{10\rho _{w}}(w)}a(z). \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{a(w)}{2}\leqq a(z)\leqq 2a(w) \text { for every } z\in Q_{10\rho _{w}}(w). \end{aligned}$$
(5.12)
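The comparability (5.12) uses only two facts: the oscillation of \(a\) over \(Q_{10\rho _{w}}(w)\) is at most \([a]_{\alpha }(10\rho _{w})^\alpha \), and \(a(w)\) is at least twice that quantity. A minimal numerical sketch of this step is given below. The scalar `h` stands for \([a]_{\alpha }(10\rho _{w})^\alpha \), `a_inf` for the infimum of \(a\) over the cylinder, and all names are hypothetical.

```python
import random


def comparable(a_w, a_z):
    """Two-sided bound a_w/2 <= a_z <= 2*a_w from (5.12)."""
    return a_w / 2 <= a_z <= 2 * a_w


def sample_check(trials=2000, seed=0):
    """Return (ok, checked): ok is False if a sampled configuration violates (5.12)."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(trials):
        h = rng.uniform(0.0, 1.0)      # oscillation bound on the cylinder
        a_inf = rng.uniform(0.0, 5.0)  # infimum of a on the cylinder
        a_w = rng.uniform(a_inf, a_inf + h)
        a_z = rng.uniform(a_inf, a_inf + h)
        if a_w < 2 * h:
            continue  # first condition in (5.10) fails, nothing to check
        checked += 1
        if not comparable(a_w, a_z):
            return False, checked
    return True, checked
```

The lower bound on \(a(w)\) cannot be dropped. With `a_inf = 0` and `h = 1` the values `a_w = 1.0` and `a_z = 0.1` are admissible for the oscillation bound, but `comparable(1.0, 0.1)` is false.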

Hence, (2) implies (5.11)–(5.12). This shows that (4.4) is satisfied by replacing the center point and radius with \(w\) and \(\varsigma _{w}\).

Finally, we prove that (3) never occurs due to (5.2). From the second condition in (5.9) and the second condition in (5.10), we have

$$\begin{aligned} K\lambda _{w}^p=a(w)\frac{K\lambda _{w}^p}{a(w)}\leqq 20[a]_{\alpha }\rho _{w}^{\alpha }\lambda _{w}^q. \end{aligned}$$

By applying (5.6) and recalling that \(\gamma =\tfrac{\alpha p}{n+2}\), we obtain

Observe that

and

$$\begin{aligned} q-\gamma +(p-2)\frac{\gamma }{p}=q-\frac{2\gamma }{p}=q-\frac{2\alpha }{n+2}\leqq p. \end{aligned}$$

It follows from (5.2) that

$$\begin{aligned} \lambda _{w}^p\leqq \frac{1}{2}\lambda _{w}^{q-\gamma +(p-2)\frac{\gamma }{p}}\leqq \frac{1}{2}\lambda _{w}^p. \end{aligned}$$

Therefore, the second condition in (5.9) and the second condition in (5.10) cannot occur together.

5.2 Vitali type covering argument

In Section 4 we considered reverse Hölder inequalities, and in Section 5.1 we discussed a stopping time argument, for p-intrinsic and (pq)-intrinsic cylinders. For each \(w\in \Psi (\Lambda ,r_1)\), we consider

$$\begin{aligned} \mathcal {Q}(w)=Q_{2\rho _{w}}^{\lambda _{w}}(w)\text { in } (1) \quad \text {and}\quad \mathcal {Q}(w)=G_{2\varsigma _{w}}^{\lambda _{w}}(w)\text { in } (2). \end{aligned}$$

We prove a Vitali type covering lemma for this collection of intrinsic cylinders. We denote

$$\begin{aligned} \mathcal {F}=\left\{ \mathcal {Q}(w): w\in \Psi (\Lambda ,r_1)\right\} \quad \text {and}\quad l_{w}= {\left\{ \begin{array}{ll} 2\rho _{w}&{}\text {in}\,(1),\\ 2\varsigma _{w}&{}\text {in}\,(2). \end{array}\right. } \end{aligned}$$

Recall that \(l_{w}\in (0,R)\) for every \(w\in \Psi (\Lambda ,r_1)\), where \(R=(r_2-r_1)/\kappa \) and \(\kappa \) is as in (5.2). Let

$$\begin{aligned} \mathcal {F}_j=\left\{ \mathcal {Q}(w)\in \mathcal {F}: \frac{R}{2^j}<l_{w}\leqq \frac{R}{2^{j-1}} \right\} ,\quad j\in {\mathbb {N}}. \end{aligned}$$

We construct subcollections \(\mathcal {G}_j\subset \mathcal {F}_j\), \(j\in {\mathbb {N}}\), recursively as follows. Let \(\mathcal {G}_1\) be a maximal disjoint collection of cylinders in \(\mathcal {F}_1\). By (5.8) we observe that the measure of each cylinder in \(\mathcal {G}_1\) is bounded from below, which implies that the collection is finite. Suppose that we have selected \(\mathcal {G}_1,...,\mathcal {G}_{k-1}\) with \(k\geqq 2\), and let

$$\begin{aligned} \mathcal {G}_k=\Bigl \{ \mathcal {Q}(w)\in \mathcal {F}_k: \mathcal {Q}(w)\cap \mathcal {Q}(v)=\emptyset \text { for every }\mathcal {Q}(v)\in \bigcup _{j=1}^{k-1}\mathcal {G}_j\Bigr \} \end{aligned}$$

be a maximal collection of pairwise disjoint cylinders. It follows that

$$\begin{aligned} \mathcal {G}=\bigcup _{j=1}^\infty \mathcal {G}_j, \end{aligned}$$
(5.13)

is a countable subcollection of pairwise disjoint cylinders in \(\mathcal {F}\). We claim that for each \(\mathcal {Q}(w)\in \mathcal {F}\), there exists \(\mathcal {Q}(v)\in \mathcal {G}\) such that

$$\begin{aligned} \mathcal {Q}(w)\cap \mathcal {Q}(v)\ne \emptyset \quad \text {and}\quad \mathcal {Q}(w)\subset \kappa \mathcal {Q}(v), \end{aligned}$$
(5.14)

where

$$\begin{aligned} \kappa \mathcal {Q}(v)=Q_{2\kappa \rho _{v}}^{\lambda _{v}}(v)\text { in } (1) \quad \text {and}\quad \kappa \mathcal {Q}(v)=G_{2\kappa \varsigma _{v}}^{\lambda _{v}}(v)\text { in } (2). \end{aligned}$$

For every \(\mathcal {Q}(w)\in \mathcal {F}\), there exists \(j \in \mathbb {N}\) such that \(\mathcal {Q}(w)\in \mathcal {F}_j\). By the construction of \(\mathcal {G}_j\), there exists a cylinder \(\mathcal {Q}(v)\in \cup _{i=1}^j \mathcal {G}_i\) for which the first condition in (5.14) holds true. Moreover, since \(l_{w}\leqq \tfrac{R}{2^{j-1}}\) and \(l_{v} \geqq \tfrac{R}{2^j}\), we have

$$\begin{aligned} l_{w}\leqq 2l_{v}. \end{aligned}$$
(5.15)
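The selection procedure can be sketched in a simplified setting. The code below is an illustration under strong assumptions, not the construction used in the paper: cylinders are replaced by open intervals on the real line with the standard metric, so the enlargement factor is 5 rather than \(\kappa \), and the intrinsic time scaling treated in the rest of this subsection plays no role. All function names are hypothetical. Radii are sorted into half-open size classes so that every radius in \((0,R)\) belongs to exactly one class.

```python
import random


def size_class(l, R):
    """Index j >= 1 with R / 2**j < l <= R / 2**(j - 1), for 0 < l <= R."""
    j = 1
    while l <= R / 2**j:
        j += 1
    return j


def select_disjoint(intervals, R):
    """Greedy selection, processing size classes from largest to smallest.

    intervals: list of (center, radius) describing the open intervals
    (center - radius, center + radius), a 1D stand-in for the cylinders.
    """
    by_class = {}
    for c, l in intervals:
        by_class.setdefault(size_class(l, R), []).append((c, l))
    chosen = []
    for j in sorted(by_class):
        for c, l in by_class[j]:
            # keep the interval only if it misses everything chosen so far
            if all(abs(c - cv) >= l + lv for cv, lv in chosen):
                chosen.append((c, l))
    return chosen


def covering_holds(intervals, chosen, factor=5):
    """Every interval meets a chosen one whose factor-fold dilation contains it."""
    for c, l in intervals:
        if not any(
            abs(c - cv) < l + lv
            and l <= 2 * lv
            and cv - factor * lv <= c - l
            and c + l <= cv + factor * lv
            for cv, lv in chosen
        ):
            return False
    return True
```

In this model the analogue of (5.15) gives the inclusion directly: if two intervals intersect and \(l_{w}\leqq 2l_{v}\), then every point of the first lies within \(2l_{w}+l_{v}\leqq 5l_{v}\) of the center of the second.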

In the remainder of this subsection, we will prove the second claim in (5.14). We note that if \(\lambda =\lambda _{w}=\lambda _{v}\) and either

$$\begin{aligned} \mathcal {Q}(w)=Q_{l_{w}}^{\lambda }(w) \quad \text {and}\quad \mathcal {Q}(v)=Q_{l_{v}}^{\lambda }(v) \end{aligned}$$

or

$$\begin{aligned} \mathcal {Q}(w)=G_{l_{w}}^{\lambda }(w) \quad \text {and}\quad \mathcal {Q}(v)=G_{l_{v}}^{\lambda }(v), \end{aligned}$$

then the second claim in (5.14) holds true if \(\kappa \geqq 5\). Indeed, when the scaling factor in the time direction is the same for both intrinsic cylinders, they are balls with respect to one and the same parabolic metric. Thus the standard proof of Vitali’s covering lemma can be applied in these cases.

Regardless of (1) and (2), for \(i\in \{v,w\}\), there exist \(2\rho _i\geqq l_i>0\) and \(\lambda _i>0\) such that

$$\begin{aligned} \Lambda =\lambda _i^p+a(z_i)\lambda _i^q \end{aligned}$$
(5.16)

and

(5.17)

We show that the second claim in (5.14) holds in all four possible cases that may occur:

  1. (i)

    \(\mathcal {Q}(v)=Q_{l_v}^{\lambda _v}(v)\) and \(\mathcal {Q}(w)=Q_{l_w}^{\lambda _w}(w)\),

  2. (ii)

    \(\mathcal {Q}(v)=G_{l_v}^{\lambda _v}(v)\) and \( \mathcal {Q}(w)=G_{l_w}^{\lambda _w}(w)\),

  3. (iii)

    \(\mathcal {Q}(v)=G_{l_v}^{\lambda _v}(v)\) and \( \mathcal {Q}(w)=Q^{\lambda _w}_{l_w}(w)\) and

  4. (iv)

    \(\mathcal {Q}(v)=Q_{l_v}^{\lambda _v}(v)\) and \(\mathcal {Q}(w)=G_{l_w}^{\lambda _w}(w)\).

Observe that in any of these cases, the first condition in (5.14) implies that \(Q_{l_{w}}(w) \cap Q_{l_{v}}(v) \ne \emptyset \) and

$$\begin{aligned} Q_{l_{w}}(w) \subset Q_{5l_{v}}(v)\subset Q_{10\rho _v}(v). \end{aligned}$$
(5.18)

This will already imply that the second claim in (5.14) holds for the spatial part of the set by enlarging the radius by factor 5. In the rest of this subsection, we show the inclusion of the time intervals when enlarging the radius with factor \(\kappa \) by considering each case separately.

First we collect a few facts that will be applied in the argument. By (5.18) we have

$$\begin{aligned} | a(w) - a(v) | \leqq [a]_\alpha (10 \rho _v)^\alpha . \end{aligned}$$
(5.19)

On the other hand, from (5.17), we may deduce that

$$\begin{aligned} \rho _v^{n+2} = \frac{1}{ 2 \left| B_1 \right| \lambda _v^2 } \iint _{Q_{\rho _{v}}^{\lambda _v}(v)}(H(z,|\nabla u|)+H(z,|F|))\,\textrm{d}z\leqq \frac{M_1}{\lambda _v^2}. \end{aligned}$$
(5.20)

If \(\lambda _w \leqq \lambda _v\), we claim that

$$\begin{aligned} \lambda _v\leqq \left( 2\left( 1+10[a]_{\alpha } M_1^\frac{\alpha }{n+2}\right) \right) ^\frac{1}{p}\lambda _w. \end{aligned}$$
(5.21)

For a contradiction, assume that (5.21) does not hold. It follows from (5.16) and (5.19) that

$$\begin{aligned} \begin{aligned} \Lambda =\lambda _w^p+a(w)\lambda _w^q\leqq \lambda _w^p+a(v)\lambda _w^q+[a]_{\alpha }(10\rho _v)^{\alpha }\lambda _w^q. \end{aligned} \end{aligned}$$
(5.22)

By (5.20) we obtain

$$\begin{aligned} \rho _v^\alpha \lambda _w^q \leqq M_1^\frac{\alpha }{n+2} \lambda _v^{-\frac{2\alpha }{n+2} } \lambda _w^q < M_1^\frac{\alpha }{n+2} \lambda _w^{q-\frac{2\alpha }{n+2}} \leqq M_1^\frac{\alpha }{n+2} \lambda _w^p, \end{aligned}$$

since \(\lambda _w \leqq \lambda _v\) and \(q \leqq p + 2\alpha /(n+2)\). Substituting the negation of (5.21) and the above display into the right-hand side of (5.22) leads to a contradiction since

$$\begin{aligned} \Lambda < \frac{1}{2}\left( \lambda _v^p+a(v)\lambda _v^q\right) =\frac{1}{2}\Lambda . \end{aligned}$$

On the other hand, if \(\lambda _v\leqq \lambda _w\), we claim that

$$\begin{aligned} \lambda _w\leqq \left( 2\left( 1+10[a]_\alpha M_1^\frac{\alpha }{n+2}\right) \right) ^\frac{1}{p}\lambda _v. \end{aligned}$$

Otherwise, it follows from (5.20) that

$$\begin{aligned} \begin{aligned} \Lambda&=\lambda _v^p+a(v)\lambda _v^q\leqq \lambda _v^p+a(w)\lambda _v^q+[a]_{\alpha }(10\rho _v)^\alpha \lambda _v^q\\&\leqq \lambda _v^p+a(w)\lambda _v^q+10[a]_\alpha M_1^\frac{\alpha }{n+2}\lambda _v^{q-\frac{2\alpha }{n+2}}\\&\leqq \left( 1+10[a]_\alpha M_1^\frac{\alpha }{n+2}\right) \lambda _v^p+a(w)\lambda _v^q< \frac{1}{2}(\lambda _w^p+a(w)\lambda _w^q)=\frac{1}{2}\Lambda . \end{aligned} \end{aligned}$$

In any case we have

$$\begin{aligned} (2K)^{-\frac{1}{p}}\lambda _w\leqq \lambda _v \leqq (2K)^\frac{1}{p} \lambda _w. \end{aligned}$$
(5.23)

Let \(v=(x_v,t_v)\) and \(w=(x_w,t_w)\) for \(x_v,x_w\in \mathbb {R}^n\) and \(t_v,t_w\in \mathbb {R}\).

(i): \(\mathcal {Q}(v)=Q_{l_v}^{\lambda _v}(v)\) and \(\mathcal {Q}(w)=Q_{l_w}^{\lambda _w}(w)\).  For any \(\tau \in I_{l_w}^{\lambda _w}(t_w)\), we apply (5.15), (5.23) and \((p-2)/p\leqq 1\) to have

$$\begin{aligned} \begin{aligned} |\tau - t_v|&\leqq |\tau - t_w| + |t_w - t_v| \leqq 2\lambda _w^{2-p} l_w^2 + \lambda _v^{2-p} l_v^2\\&\leqq \left( 16 K +1 \right) \lambda _v^{2-p} l_v^{2}\leqq 100K^2\lambda _v^{2-p}l_v^2= \lambda _v^{2-p}(\kappa l_v)^2, \end{aligned} \end{aligned}$$

which implies \(I_{l_w}^{\lambda _w}(t_w) \subset \kappa I_{ l_v}^{\lambda _v}(t_v)\). Thus, we have \(Q_{l_w}^{\lambda _w}(w) \subset \kappa Q_{ l_v}^{\lambda _v}(v)\).
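The constants in case (i) can be traced as follows. By (5.23) and \(2-p\leqq 0\) we have \(\lambda _w^{2-p}\leqq (2K)^{(p-2)/p}\lambda _v^{2-p}\leqq 2K\lambda _v^{2-p}\), and (5.15) gives \(l_w^2\leqq 4l_v^2\), which yields the factor \(16K+1\). The following sketch checks the resulting inequality on random samples. It is a sanity check only, and the function names and sampling ranges are hypothetical.

```python
import random


def case_i_bound_holds(K, p, lam_v, lam_w, l_v, l_w):
    """Check 2*lam_w**(2-p)*l_w**2 + lam_v**(2-p)*l_v**2 <= lam_v**(2-p)*(kappa*l_v)**2."""
    kappa = 10 * K
    lhs = 2 * lam_w ** (2 - p) * l_w**2 + lam_v ** (2 - p) * l_v**2
    rhs = lam_v ** (2 - p) * (kappa * l_v) ** 2
    return lhs <= rhs


def sample_check(trials=2000, seed=0):
    """Random search for a counterexample under (5.15) and (5.23)."""
    rng = random.Random(seed)
    for _ in range(trials):
        K = rng.uniform(1.0, 100.0)
        p = rng.uniform(2.0, 6.0)
        lam_v = rng.uniform(0.1, 10.0)
        # (5.23): lam_w / lam_v lies in [(2K)**(-1/p), (2K)**(1/p)]
        lam_w = lam_v * (2 * K) ** (rng.uniform(-1.0, 1.0) / p)
        l_v = rng.uniform(0.01, 1.0)
        l_w = rng.uniform(0.0, 2.0) * l_v  # (5.15): l_w <= 2*l_v
        if not case_i_bound_holds(K, p, lam_v, lam_w, l_v, l_w):
            return False
    return True
```

The comparison (5.15) of the radii is essential here. For example, `case_i_bound_holds(1.0, 2.0, 1.0, 1.0, 1.0, 10.0)` is false, since the left-hand side equals 201 while the right-hand side equals 100.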

(ii): \(\mathcal {Q}(v)=G_{l_v}^{\lambda _v}(v)\) and \( \mathcal {Q}(w)=G_{l_w}^{\lambda _w}(w)\). For any \(\tau \in J_{l_w}^{\lambda _w}(t_w)\), we have

$$\begin{aligned} |\tau - t_v| \leqq |\tau - t_w| + |t_w - t_v|\leqq 2\frac{\lambda _w^2}{H_{w}(\lambda _w)} l_w^2 + \frac{\lambda _v^2}{H_{v}(\lambda _v)} l_v^2. \end{aligned}$$

By (5.16), (5.23) and \(2/p\leqq 1\) we have

$$\begin{aligned} \frac{\lambda _w^2}{H_{w}(\lambda _w)}=\frac{\lambda _w^2}{\Lambda }\leqq 2K\frac{\lambda _v^2}{\Lambda } = 2K\frac{\lambda _v^2}{H_{v}(\lambda _v)}. \end{aligned}$$

Therefore applying (5.15), we obtain

$$\begin{aligned} |\tau - t_v| \leqq (16K+1)\frac{\lambda _v^2}{H_{v}(\lambda _v)} l_v^2\leqq 100K^2\frac{\lambda _v^2}{H_{v}(\lambda _v)}l_v^2= \frac{\lambda _v^2}{H_{v}(\lambda _v)}(\kappa l_v)^2. \end{aligned}$$

This implies that \(J_{l_w}^{\lambda _w}(t_w) \subset \kappa J_{ l_v}^{\lambda _v}(t_v)\). Thus, we have \(G_{l_w}^{\lambda _w}(w) \subset \kappa G_{ l_v}^{\lambda _v}(v)\).

(iii): \(\mathcal {Q}(v)=G_{l_v}^{\lambda _v}(v)\) and \( \mathcal {Q}(w)=Q^{\lambda _w}_{l_w}(w)\). For any \(\tau \in I_{l_w}^{\lambda _w}(t_w)\), we have from (5.16) that

$$\begin{aligned} |\tau - t_v| \leqq |\tau - t_w| + |t_w - t_v|\leqq 2 \lambda _w^{2-p}l_w^2 + \frac{\lambda _v^2}{H_v(\lambda _v)} l_v^2 = 2 \lambda _w^{2-p}l_w^2 + \frac{\lambda _v^2}{\Lambda } l_v^2\nonumber \\ \end{aligned}$$
(5.24)

Recalling \(K\lambda _w^p\geqq a(w)\lambda _w^q\), we apply (5.23), \(2/p\leqq 1\) and (5.16) to get

$$\begin{aligned} \lambda _w^{2-p} \leqq \frac{2 \lambda _w^2}{\lambda _w^p + \tfrac{a(w)}{K} \lambda _w^q} \leqq \frac{2K \lambda _w^2}{\lambda _w^p + a(w) \lambda _w^q} \leqq 4K^2 \frac{\lambda _v^2}{\lambda _w^p + a(w)\lambda _w^q}=4K^2 \frac{\lambda _v^2}{\Lambda }, \end{aligned}$$

which, together with (5.24) and (5.15), implies

$$\begin{aligned} |\tau - t_v| \leqq (32K^2+1)\frac{\lambda _v^2}{\Lambda }l_v^2\leqq 100K^2\frac{\lambda _v^2}{\Lambda }l_v^2=\frac{\lambda _v^2}{\Lambda }(\kappa l_v)^2. \end{aligned}$$

Therefore \(I_{l_w}^{\lambda _w}(t_w) \subset \kappa J_{ l_v}^{\lambda _v}(t_v)\) and \(Q_{l_w}^{\lambda _w}(w)\subset \kappa G_{l_v}^{\lambda _v}(v)\).
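In the mixed case (iii) the time scale \(\lambda _w^{2-p}\) of the p-intrinsic cylinder has to be compared with the time scale \(\lambda _v^2/\Lambda \) of the (pq)-intrinsic one. A minimal numerical sketch of the two inequalities used above is given below, assuming \(K\geqq 1\), \(K\lambda _w^p\geqq a(w)\lambda _w^q\), (5.15) and the lower bound for \(\lambda _v\) in (5.23). The function names and sampling ranges are hypothetical.

```python
import random


def time_scale_bound_holds(K, p, q, a_w, lam_w, lam_v):
    """Check lam_w**(2-p) <= 4*K**2 * lam_v**2 / Lam with Lam = lam_w**p + a_w*lam_w**q."""
    Lam = lam_w**p + a_w * lam_w**q
    return lam_w ** (2 - p) <= 4 * K**2 * lam_v**2 / Lam * (1 + 1e-12)


def case_iii_bound_holds(K, p, q, a_w, lam_w, lam_v, l_v, l_w):
    """Check 2*lam_w**(2-p)*l_w**2 + (lam_v**2/Lam)*l_v**2 <= (lam_v**2/Lam)*(10*K*l_v)**2."""
    Lam = lam_w**p + a_w * lam_w**q
    lhs = 2 * lam_w ** (2 - p) * l_w**2 + lam_v**2 / Lam * l_v**2
    rhs = lam_v**2 / Lam * (10 * K * l_v) ** 2
    return lhs <= rhs


def sample_check(trials=2000, seed=0):
    """Random search for a counterexample under the stated assumptions."""
    rng = random.Random(seed)
    for _ in range(trials):
        K = rng.uniform(1.0, 100.0)
        p = rng.uniform(2.0, 6.0)
        q = p + rng.uniform(0.0, 0.5)
        lam_w = rng.uniform(0.1, 10.0)
        # assumption of case (1) at w: K * lam_w**p >= a_w * lam_w**q
        a_w = rng.uniform(0.0, 1.0) * K * lam_w ** (p - q)
        # (5.23): lam_v / lam_w lies in [(2K)**(-1/p), (2K)**(1/p)]
        lam_v = lam_w * (2 * K) ** (rng.uniform(-1.0, 1.0) / p)
        l_v = rng.uniform(0.01, 1.0)
        l_w = rng.uniform(0.0, 2.0) * l_v  # (5.15): l_w <= 2*l_v
        if not time_scale_bound_holds(K, p, q, a_w, lam_w, lam_v):
            return False
        if not case_iii_bound_holds(K, p, q, a_w, lam_w, lam_v, l_v, l_w):
            return False
    return True
```

The check reflects the arithmetic \(2\cdot 4K^2\cdot 4+1=32K^2+1\leqq 100K^2\) for \(K\geqq 1\).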

(iv): \(\mathcal {Q}(v)=Q_{l_v}^{\lambda _v}(v)\) and \(\mathcal {Q}(w)=G_{l_w}^{\lambda _w}(w)\). For any \(\tau \in J_{l_w}^{\lambda _w}(t_w)\), we apply (5.15), (5.23) and \((p-2)/p\leqq 1\) to have

$$\begin{aligned} |\tau - t_v|&\leqq |\tau - t_w| + |t_w - t_v|\leqq 2 \frac{\lambda _w^2}{H_w(\lambda _w)}l_w^2 + \lambda _v^{2-p} l_v^2 \\&\leqq 2 \lambda _w^{2-p}l_w^2 + \lambda _v^{2-p} l_v^2 \leqq (16K+1)\lambda _v^{2-p}l_v^2\\&\leqq 100K^2\lambda _v^{2-p}l_v^2=\lambda _v^{2-p}(\kappa l_v)^2. \end{aligned}$$

Therefore \(J^{\lambda _w}_{l_w}(t_w)\subset \kappa I_{l_v}^{\lambda _v}(t_v)\) and \(G_{l_w}^{\lambda _w}(w) \subset \kappa Q_{l_v}^{\lambda _v}(v)\). Since we have covered every case, the proof of the second condition in (5.14) is completed.

5.3 Final proof of the gradient estimate

We write the countable pairwise disjoint collection \(\mathcal {G}\) defined in (5.13) as \(\mathcal {G}=\cup _{j=1}^\infty \mathcal {Q}_j\), where \(\mathcal {Q}_j=\mathcal {Q}(w_j)\) with \(w_j \in \Psi (\Lambda ,r_1)\).

Lemmas 4.6 and 4.9 imply that there exist \(c=c( data )\) and \(\theta _0=\theta _0(n,p,q)\in (0,1)\) such that

$$\begin{aligned} \iint _{\kappa \mathcal {Q}_{j}}H(z,|\nabla u|)\,\textrm{d}z&\leqq c\Lambda ^{1-\theta }\iint _{\mathcal {Q}_j\cap \Psi (c^{-1}\Lambda )}H(z,|\nabla u|)^\theta \,\textrm{d}z \\&\quad +c\iint _{\mathcal {Q}_j\cap \Phi (c^{-1}\Lambda )}H(z,|F|)\,\textrm{d}z \end{aligned}$$

for every \(j\in \mathbb {N}\) with \(\theta = (\theta _0+1)/2\). By summing over j and applying the fact that the cylinders in \(\mathcal {G}\) are pairwise disjoint, we obtain

$$\begin{aligned}{} & {} \iint _{\Psi (\Lambda ,r_1)}H(z,|\nabla u|)\,\textrm{d}z \leqq \sum _{j=1}^\infty \iint _{\kappa \mathcal {Q}_{j}}H(z,|\nabla u|)\,\textrm{d}z\nonumber \\{} & {} \qquad \leqq c\Lambda ^{1-\theta }\sum _{j=1}^\infty \iint _{\mathcal {Q}_j\cap \Psi (c^{-1}\Lambda )}H(z,|\nabla u|)^\theta \,\textrm{d}z +c\sum _{j=1}^\infty \iint _{\mathcal {Q}_j\cap \Phi (c^{-1}\Lambda )}H(z,|F|)\,\textrm{d}z\nonumber \\{} & {} \qquad \leqq c\Lambda ^{1-\theta }\iint _{\Psi (c^{-1}\Lambda ,r_2)}H(z,|\nabla u|)^\theta \,\textrm{d}z +c\iint _{\Phi (c^{-1}\Lambda ,r_2)}H(z,|F|)\,\textrm{d}z. \end{aligned}$$
(5.25)

Moreover, since

$$\begin{aligned} \iint _{\Psi (c^{-1}\Lambda ,r_1)\setminus \Psi (\Lambda ,r_1)}H(z,|\nabla u|)\,\textrm{d}z \leqq \Lambda ^{1-\theta }\iint _{\Psi (c^{-1}\Lambda ,r_2)}H(z,|\nabla u|)^{\theta }\,\textrm{d}z, \end{aligned}$$

we conclude from (5.25) that

$$\begin{aligned}{} & {} \iint _{\Psi (c^{-1}\Lambda ,r_1)}H(z,|\nabla u|)\,\textrm{d}z\nonumber \\{} & {} \quad \leqq c\Lambda ^{1-\theta }\iint _{\Psi (c^{-1}\Lambda ,r_2)}H(z,|\nabla u|)^\theta \,\textrm{d}z +c\iint _{\Phi (c^{-1}\Lambda ,r_2)}H(z,|F|)\,\textrm{d}z.\nonumber \\ \end{aligned}$$
(5.26)

For \(k\in {\mathbb {N}}\), let

$$\begin{aligned} H(z,|\nabla u|)_k=\min \{H(z,|\nabla u|),k\} \end{aligned}$$

and

$$\begin{aligned} \Psi _k(\Lambda ,\rho )=\{z\in Q_{\rho }(z_0):H(z,|\nabla u(z)|)_k>\Lambda \}. \end{aligned}$$

It is easy to see that if \(\Lambda \geqq k\), then \(\Psi _k(\Lambda ,\rho )=\emptyset \) and if \(\Lambda <k\), then \(\Psi _k(\Lambda ,\rho )=\Psi (\Lambda ,\rho )\). Therefore, we deduce from (5.26) that

$$\begin{aligned} \begin{aligned}&\iint _{\Psi _k(c^{-1}\Lambda ,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad \leqq c\Lambda ^{1-\theta }\iint _{\Psi _k(c^{-1}\Lambda ,r_2)}H(z,|\nabla u|)^\theta \,\textrm{d}z+c\iint _{\Phi (c^{-1}\Lambda ,r_2)}H(z,|F|)\,\textrm{d}z. \end{aligned} \end{aligned}$$

Recalling (5.3), we denote

$$\begin{aligned} \Lambda _1=c^{-1}\left( \frac{4\kappa r}{r_2-r_1}\right) ^\frac{q(n+2)}{2}\Lambda _0. \end{aligned}$$

Then for any \(\Lambda >\Lambda _1\), we obtain

$$\begin{aligned} \begin{aligned}&\iint _{\Psi _k(\Lambda ,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad \leqq c\Lambda ^{1-\theta }\iint _{\Psi _k(\Lambda ,r_2)}H(z,|\nabla u|)^\theta \,\textrm{d}z+c\iint _{\Phi (\Lambda ,r_2)}H(z,|F|)\,\textrm{d}z. \end{aligned} \end{aligned}$$

Let \(\varepsilon \in (0,1)\) be chosen later. We multiply the inequality above by \(\Lambda ^{\varepsilon -1}\) and integrate each term over \((\Lambda _1,\infty )\), which implies

$$\begin{aligned} \begin{aligned} \textrm{I}&=\int _{\Lambda _1}^{\infty }\Lambda ^{\varepsilon -1}\iint _{\Psi _k(\Lambda ,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z\,d\Lambda \\&\leqq c\int _{\Lambda _1}^{\infty }\Lambda ^{\varepsilon -\theta }\iint _{\Psi _k(\Lambda ,r_2)}H(z,|\nabla u|)^\theta \,\textrm{d}z\,d\Lambda \\&\quad +c\int _{\Lambda _1}^{\infty }\Lambda ^{\varepsilon -1}\iint _{\Phi (\Lambda ,r_2)}H(z,|F|)\,\textrm{d}z\,d\Lambda \\&= \textrm{II}+ \textrm{III}. \end{aligned} \end{aligned}$$

We apply Fubini’s theorem to estimate \(\textrm{I}\) and obtain

$$\begin{aligned} \begin{aligned} \textrm{I}&=\frac{1}{\varepsilon }\iint _{\Psi _k(\Lambda _1,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad -\frac{1}{\varepsilon }\Lambda _1^\varepsilon \iint _{\Psi _k(\Lambda _1,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z. \end{aligned} \end{aligned}$$
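The identity for \(\textrm{I}\) comes from a pointwise computation. For fixed \(z\), the point belongs to \(\Psi _k(\Lambda ,r_1)\) exactly when \(\Lambda <H(z,|\nabla u(z)|)_k\), so after exchanging the order of integration the inner integral is \(\int _{\Lambda _1}^{h}\Lambda ^{\varepsilon -1}\,d\Lambda =(h^\varepsilon -\Lambda _1^\varepsilon )/\varepsilon \) with \(h=H(z,|\nabla u(z)|)_k>\Lambda _1\), and it vanishes otherwise. The sketch below compares this closed form with a midpoint quadrature. It is an illustration only, and the function names and the number of quadrature steps are hypothetical.

```python
def inner_integral_closed_form(h_k, lam1, eps):
    """Integral of L**(eps - 1) over (lam1, h_k); zero if h_k <= lam1."""
    if h_k <= lam1:
        return 0.0
    return (h_k**eps - lam1**eps) / eps


def inner_integral_midpoint(h_k, lam1, eps, steps=20000):
    """Midpoint-rule approximation of the same integral."""
    if h_k <= lam1:
        return 0.0
    width = (h_k - lam1) / steps
    total = sum((lam1 + (i + 0.5) * width) ** (eps - 1) for i in range(steps))
    return total * width
```

Multiplying the closed form by \(\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \) and integrating over \(\Psi _k(\Lambda _1,r_1)\) gives the two terms in the display above.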

Since

$$\begin{aligned} \begin{aligned}&\iint _{Q_{r_1}(z_0)\setminus \Psi _k(\Lambda _1,r_1)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad \leqq \Lambda _1^{\varepsilon }\iint _{Q_{2r}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z, \end{aligned} \end{aligned}$$

we have

$$\begin{aligned} \begin{aligned} \textrm{I}\geqq&\frac{1}{\varepsilon }\iint _{Q_{r_1}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&-\frac{2}{\varepsilon }\Lambda _1^\varepsilon \iint _{Q_{2r}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z. \end{aligned} \end{aligned}$$

Similarly, by Fubini’s theorem, we have

$$\begin{aligned} \textrm{II} \leqq \frac{1}{1-\theta +\varepsilon }\iint _{Q_{r_2}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z \end{aligned}$$

and

$$\begin{aligned} \textrm{III}\leqq \frac{1}{\varepsilon }\iint _{Q_{2r}(z_0)}H(z,|F|)^{1+\varepsilon }\,\textrm{d}z. \end{aligned}$$

By combining the estimates above we obtain

$$\begin{aligned} \begin{aligned}&\iint _{Q_{r_1}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad \leqq \frac{c\varepsilon }{1-\theta +\varepsilon }\iint _{Q_{r_2}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\qquad +c\Lambda _1^\varepsilon \iint _{Q_{2r}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\qquad +c\iint _{Q_{2r}(z_0)}H(z,|F|)^{1+\varepsilon }\,\textrm{d}z. \end{aligned} \end{aligned}$$

We choose \(\varepsilon _0=\varepsilon _0( data )\in (0,1)\) so that for any \(\varepsilon \in (0,\varepsilon _0)\),

$$\begin{aligned} \frac{c\varepsilon }{1-\theta +\varepsilon }\leqq \frac{1}{2}. \end{aligned}$$
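The condition on \(\varepsilon \) can be made explicit: it is equivalent to \((2c-1)\varepsilon \leqq 1-\theta \). The paper only asserts the existence of \(\varepsilon _0\). The sketch below computes one admissible value under the additional assumption \(c\geqq 1\), which guarantees that the value lies in \((0,1)\). The function name is hypothetical.

```python
def epsilon_threshold(c, theta):
    """Largest eps with c*eps / (1 - theta + eps) <= 1/2.

    Assumes c >= 1 and 0 < theta < 1, so the result lies in (0, 1).
    """
    return (1.0 - theta) / (2.0 * c - 1.0)
```

For example, with the illustrative values \(c=3\) and \(\theta =0.8\), the threshold is \(0.2/5=0.04\). Since \(\varepsilon \mapsto c\varepsilon /(1-\theta +\varepsilon )\) is increasing, every smaller \(\varepsilon \) also satisfies the condition.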

Then, by applying Lemma 4.2 we get

$$\begin{aligned} \begin{aligned}&\iint _{Q_{r}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta +\varepsilon }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\quad \leqq c\Lambda _0^\varepsilon \iint _{Q_{2r}(z_0)}\left( H(z,|\nabla u|)_k\right) ^{1-\theta }H(z,|\nabla u|)^\theta \,\textrm{d}z\\&\qquad +c\iint _{Q_{2r}(z_0)}H(z,|F|)^{1+\varepsilon }\,\textrm{d}z. \end{aligned} \end{aligned}$$

The claim follows by letting \(k\longrightarrow \infty \) and recalling (5.1).