1 Introduction

Recently, the second order regularity for parabolic p-Laplace type equations has been studied by Høeg and Lindqvist [13], Dong et al. [10], and the authors [11]. In this article, we consider a rather general class of parabolic equations

$$\begin{aligned} u_t-|Du|^{\gamma }\big (\Delta u+(p-2)\Delta _\infty ^Nu\big )=0 \end{aligned}$$
(1.1)

with \(1<p<\infty \) and \(-1<\gamma <\infty \), where

$$\begin{aligned} \Delta _{\infty }^Nu:=\left| Du\right| ^{-2}\sum _{i,j=1}^nu_{x_i}u_{x_j}u_{x_ix_j}=\left| Du\right| ^{-2}\langle Du,D^2uDu\rangle =\left| Du\right| ^{-2}\Delta _{\infty }u \end{aligned}$$

denotes the normalized infinity Laplacian. The equation contains the game theoretic or normalized p-parabolic equation and the divergence form standard p-parabolic equation as special cases. The equation is not uniformly parabolic or in divergence form except in special cases, and it can be highly degenerate or singular in the gradient variable. Regularity for such equations has recently been studied, for example, by Imbert, Jin and Silvestre as well as Parviainen and Vázquez, as discussed below. The objective of this article is to develop a systematic approach to study the second order spatial regularity of viscosity solutions to (1.1). In this approach, proving the estimate reduces to verifying that a certain coefficient matrix is positive definite. For further notation and the definition of viscosity solutions to (1.1), we refer to Sect. 2.

In [11] we considered second order Sobolev regularity of the parabolic p-Laplace equation

$$\begin{aligned} u_t-\Delta _p u=0 \end{aligned}$$
(1.2)

where \(\Delta _pu:={{\,\mathrm{div\,}\,}}(|Du|^{p-2}Du)\) is the p-Laplace operator. Notice that, in the special case \(\gamma =p-2\), Eq. (1.1) can be formally, and also rigorously by [16], rewritten as (1.2). One of the key tools is the fundamental inequality (the name stems from Dong, Peng, Zhang and Zhou [10] for a related inequality)

$$\begin{aligned} |Du|^4|D^2u|^2 \ge 2|Du|^2|D^2uDu|^2+\frac{(|Du|^2\Delta u-\Delta _{\infty }u)^2}{n-1}-(\Delta _{\infty }u)^2 \end{aligned}$$
(1.3)

which holds for any smooth function u as shown by Sarsa [25]. Curiously, in [11] it was sufficient to use the above inequality in a simpler form, just estimating \((|Du|^2\Delta u-\Delta _{\infty }u)^2\ge 0\) on the right hand side. With the general equation in this paper, we use the inequality in its full generality. A natural approach to obtain second order Sobolev estimates is to differentiate (1.1), multiply the equation by suitable quantities containing gradients, and manipulate the result in a suitable way. Thus, among other terms, one can obtain terms in divergence form, which can be controlled. In the case of (1.2), one then uses (1.3) in a simple form as explained above and thus gets an upper bound for a quantity containing second derivatives. Part of the difficulty in dealing with the general equation instead of the p-parabolic equation stems from the fact that this approach gives rise to mixed terms of the type

$$\begin{aligned} |Du|^{-\gamma }u_t\Delta _{\infty }^Nu \end{aligned}$$

which are difficult to handle.

Another difficulty arises from the fact that u is not known to be smooth a priori when differentiating the equation, and negative powers of the gradient are problematic as the gradient might vanish. A natural approach to these problems is regularizing the equation by adding a small regularization parameter, which removes the singularity. Unfortunately, when differentiating the regularized equation, one gets another set of problematic terms that no longer match the terms in the fundamental inequality. Treating these terms is a subtle issue, and we need to guarantee that a sum of certain terms remains nonnegative by carefully analyzing the explicit coefficients of the terms.

In order to analyze the nonnegativity of the problematic terms and their coefficients systematically, we develop several techniques. We interpret the terms and their coefficients as a quadratic form and derive a range condition for the parameters from the positive definiteness condition of this quadratic form. In order to improve the range obtained in this way, we use a hidden divergence structure. Indeed, suitable mixed terms can actually be written in divergence form, and thus by adding such terms, we can manipulate the coefficients at the cost of adding divergence form terms that can be estimated.

Some steps of the above plan, in particular checking that the quadratic form is positive definite, are quite complicated when written down explicitly. Thus, for the convenience of the reader, we first provide a formal calculation in Sect. 5, where we assume that the solution is smooth and the gradient nonvanishing. In this case, the above plan gives an optimal (optimality is discussed in Example 5.1) a priori estimate (Proposition 5.1),

$$\begin{aligned} \int _{Q_r}\Big |D(|Du|^{\frac{p-2+s}{2}} Du)\Big |^2dxdt \le \frac{C}{r^2}\left( \int _{Q_{2r}}|Du|^{p+s}dxdt + \int _{Q_{2r}}|Du|^{p+s-\gamma }dxdt \right) \end{aligned}$$

in the range

$$\begin{aligned} 1<p<\infty , \quad -1<\gamma <\infty \quad \text {and} \quad n\ge 2, \end{aligned}$$

with the range condition

$$\begin{aligned} s>\max \Big \{-1-\frac{p-1}{n-1},\gamma +1-p\Big \}. \end{aligned}$$
(1.4)

The left hand side in the above estimate is of the same form as the estimate in [11]. In particular, we may set \(s=2-p\), \(s=0\) and \(s=p-2\) giving

$$\begin{aligned} D^2u,\quad D(\left| Du\right| ^{\frac{p-2}{2}}Du)\quad \text { and }\quad D(\left| Du\right| ^{p-2}Du) \end{aligned}$$

as special cases.
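
As an illustrative check of the range condition (1.4), offered as a sketch rather than as part of the proof, consider the choice \(s=2-p\) corresponding to \(D^2u\). Then \(p+s=2\) and \(p+s-\gamma =2-\gamma \), so the right hand side of the a priori estimate involves \(|Du|^{2}\) and \(|Du|^{2-\gamma }\), and the two requirements in (1.4) become

$$\begin{aligned} 2-p>\gamma +1-p \iff \gamma <1 \quad \text {and}\quad 2-p>-1-\frac{p-1}{n-1} \iff (n-2)p<3n-4. \end{aligned}$$

The second requirement holds for every \(p\) when \(n=2\), and it reads \(p<3+\frac{2}{n-2}\) when \(n\ge 3\).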

Perhaps surprisingly, removing the smoothness assumption and the assumption on the nonvanishing gradient by using the regularized equation turns out to be a problem. In particular, the additional terms resulting from the regularization add to the technical complication of showing that the quadratic form is positive definite. To reduce technical complication partly for expository reasons, we have decided to restrict ourselves to the case \(n=2\) in the regularized case. In this context we obtain the following result.

Theorem 1.1

Let \(n=2\). Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a viscosity solution to the general p-parabolic equation (1.1). If p and \(\gamma \) satisfy one of the following conditions:

  1. (i)

    \(1<p\le 5\) and \(-1<\gamma <1\); or

  2. (ii)

    \(1<p<\infty \) and \(-1<\gamma <\sqrt{2}-\frac{1}{2}\),

then \(D^2u\) exists and belongs to \(L^2_\textrm{loc}(\Omega _T)\). Moreover, we have the estimate

$$\begin{aligned} \int _{Q_{ r}}|D^2u|^2dxdt \le \frac{C}{r^2}\left( \int _{ Q_{2r}} |Du|^2 dxdt + \int _{ Q_{2r}}|Du|^{ 2-\gamma } dxdt \right) , \end{aligned}$$

where \(C=C(p,\gamma )>0\) and \(Q_r\subset Q_{2r}\Subset \Omega _T\) are concentric parabolic cylinders.

This also implies that the time derivative exists as an \(L^2\)-function, which is not evident directly from the definition.

Corollary 1.2

(Time derivative) Let \(n=2\). Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a viscosity solution to the general p-parabolic equation (1.1). If p and \(\gamma \) satisfy one of the following conditions:

  1. (i)

    \(1<p\le 5\) and \(0\le \gamma <1\); or

  2. (ii)

    \(1<p<\infty \) and \(0\le \gamma <\sqrt{2}-\frac{1}{2}\),

then the time derivative \(u_t\) exists as a function and \(u_t\in L^2_{\text {loc}}(\Omega _T)\).

At least to some extent the range condition in Theorem 1.1 is an artifact as we explain later. It would be interesting to know whether the theorem is valid in the whole range of parameters.

Next we review the known regularity results for Eq. (1.1) and explain how our results fit into the existing literature. If \(\gamma =p-2\), then Eq. (1.1) is the parabolic p-Laplace equation (1.2). For the regularity theory of weak solutions to (1.2) we refer to the monograph of DiBenedetto [8]. In particular, if u is a continuous weak solution to (1.2), then \(u\in C^\alpha _\text {loc}\) and \(Du\in C^\beta _\text {loc}\) for some \(0<\alpha ,\beta <1\).

Moreover, Lindqvist [18] showed in the degenerate case \(2<p<\infty \) that

$$\begin{aligned} D(|Du|^{\frac{p-2}{2}}Du)\in L^2_\text {loc}, \end{aligned}$$

and further that

$$\begin{aligned} D(|Du|^{p-2}Du)\in L^{\frac{p}{p-1}}_\text {loc}. \end{aligned}$$

The singular case is treated in [20]. The results then imply the existence of the time derivative \(u_t\) as a function in suitable spaces similar to Corollary 1.2. In the case of the obstacle problem the existence of the time derivative was established in [19]. Dong, Peng, Zhang and Zhou [10] gave a proof that \(D^2u\in L^2_\text {loc}\) with a sharp range \(1<p<3\). This range of p can be recovered from assumption (i) of Theorem 1.1. In the global case, estimates for \(D(|Du|^{p-2}Du)\) have been derived by Cianchi and Maz’ya in [7].
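
To make the comparison with the range \(1<p<3\) of [10] explicit, here is a routine check under the identification \(\gamma =p-2\) of (1.1) with (1.2). Assumption (i) of Theorem 1.1 then reads

$$\begin{aligned} 1<p\le 5 \quad \text {and}\quad -1<p-2<1, \quad \text {that is,}\quad 1<p<3. \end{aligned}$$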

If \(\gamma =0\), Eq. (1.1) is the normalized parabolic p-Laplace equation

$$\begin{aligned} u_t-\Delta _{p}^Nu=0 \end{aligned}$$

where \(\Delta _{p}^Nu:=\Delta u+(p-2)\Delta _\infty ^Nu\) is the normalized or game theoretic p-Laplace operator. This equation arises from a two-player stochastic game with a fixed running time, see Manfredi et al. [21], or from image processing, see Does [9]. Banerjee and Garofalo [5, 6] studied the potential theoretic aspects and boundary regularity of the normalized p-Laplacian evolution. These papers also contain Lipschitz regularity results for solutions to the normalized p-parabolic equation. The regularity method in [21] is global whereas in [23] a local game theoretic method is applied in this context. Later Jin and Silvestre [15] established \(C_{\text {loc}}^{1,\alpha }\)-regularity in space and \(C_{\text {loc}}^{0,\frac{1+\alpha }{2}}\)-regularity in time. In [13], Høeg and Lindqvist studied the second order Sobolev regularity for the normalized p-parabolic equation and showed that when \(\frac{6}{5}<p<\frac{14}{5}\), the second order spatial derivatives \(D^2u\) and the time derivative \(u_t\) belong to \(L^2_\text {loc}\). Moreover, they proved that when \(1<p<2\), the time derivative \(u_t\) belongs to \(L^2_\text {loc}\). In [3], \(C_{\text {loc}}^{1,\alpha }\)-regularity was established for the normalized p-parabolic equation with a source term. The work of Dong et al. [10] also applies to the normalized p-parabolic equation; in this case they obtained \(D^2u\in L^{2+\delta }_\text {loc}\) and \(u_t\in L^{2+\delta }_\text {loc}\) for some \(\delta >0\) if \(1<p<3+\frac{2}{n-2}\). The key result of [10] with \(\delta =0\) can be recovered from assumption (ii) of Theorem 1.1. Recently Andrade and Santos [1] established improved Sobolev regularity estimates when p is close to 2.

As stated, (1.1) is in non-divergence form and can be highly degenerate or singular. Thus even defining viscosity solutions in such a way that existence and uniqueness can be obtained becomes a nontrivial issue. This was done by Ohnuma and Sato [22], see also Giga’s monograph [12]. For viscosity solutions to the general equation (1.1), where \(1<p<\infty \) and \(-1<\gamma <\infty \) are allowed to be independent of each other, Imbert, Jin and Silvestre [14] proved in particular that \(Du\in C^\alpha _\text {loc}\) for suitable \(0<\alpha <1\). In [24], Parviainen and Vázquez established Harnack’s inequality and asymptotic behaviour by using the fact that, for radial solutions, Eq. (1.1) is equivalent to a divergence form equation but in a fictitious dimension. Attouchi [2] in the degenerate case and Attouchi and Ruosteenoja [4] in the singular case established spatial \(C^{1,\alpha }_\text {loc}\)-regularity for an equation of type (1.1) but with a source term. The elliptic Harnack’s inequality in the singular range was obtained in [17].

This article is organized as follows. In Sect. 2 we provide the necessary preliminaries. In Sect. 3 we explain the ideas of the proof of Theorem 1.1. In Sect. 4 we state several auxiliary lemmas needed in the proofs, including the fundamental inequality (1.3). Sections 5 and 6 are parallel to each other. In the former, we provide the formal calculation. In the latter, we provide a similar calculation in a regularized setting, which eventually yields Theorem 1.1. In Sect. 6.2 we prove Theorem 1.1 and Corollary 1.2. Some of the proofs for the technical lemmas are postponed to the appendix.

2 Preliminaries

We use the following notation. Let \(\Omega \subset {\mathbb {R}}^n\), \(n\ge 2\), be a domain and define the cylinder

$$\begin{aligned} \Omega _T:=\Omega \times (0,T). \end{aligned}$$

If U is compactly contained in \(\Omega \), i.e. \(U\subset \Omega \) and the closure of U is a compact subset of \(\Omega \), we write \(U \Subset \Omega \). For \(0<t_1<t_2<\infty \), we set

$$\begin{aligned} U_{t_1,t_2}:=U\times (t_1,t_2). \end{aligned}$$

Moreover, we will use parabolic cylinders of the form

$$\begin{aligned} Q_r(x_0,t_0):=B_r(x_0)\times (t_0-r^2,t_0], \end{aligned}$$

where \(B_r(x_0)\) denotes the open ball with radius \(r>0\) and center point \(x_0\in \Omega \). When no confusion arises, we may drop the reference point \((x_0,t_0)\) and write \(Q_{r }\).

Given a function \(u=u(x,t)\) of point \(x\in {\mathbb {R}}^n\) and time \(t>0\), the spatial gradient of u is denoted by \(Du=(u_{x_1},\ldots ,u_{x_n})\), and the time derivative by \(u_t\). The Hessian matrix of u is denoted by \(D^2u=(u_{x_ix_j})_{i,j=1}^n\). The Laplacian of u is given by

$$\begin{aligned} \Delta u:=\sum _{i=1}^nu_{x_ix_i} \end{aligned}$$

and the infinity Laplacian by

$$\begin{aligned} \Delta _{\infty }u:=\sum _{i,j=1}^nu_{x_i}u_{x_j}u_{x_ix_j}=\langle Du,D^2uDu\rangle \end{aligned}$$

where \(\langle \cdot ,\cdot \rangle \) stands for the inner product in \({\mathbb {R}}^n\). The normalized infinity Laplacian is denoted by

$$\begin{aligned} \Delta _{\infty }^Nu:=\frac{\Delta _{\infty }u}{|Du|^2}. \end{aligned}$$

We study viscosity solutions to the general p-parabolic equation

$$\begin{aligned} u_t-|Du|^{\gamma }\big (\Delta u+(p-2)\Delta _{\infty }^Nu\big )=0\quad \text {in } \Omega _T, \end{aligned}$$
(2.1)

where \(1<p<\infty \) and \(-1<\gamma <\infty \). The definition of suitable viscosity solutions to (2.1) requires some care because the operator may be singular. Nonetheless, a definition that fits our needs can be found in [22]. First set

$$\begin{aligned} F(Du,D^2 u):=\left| Du\right| ^{\gamma }\big (\Delta u+(p-2)\Delta _{\infty }^{N} u\big ) \end{aligned}$$

whenever \(Du\ne 0\). We define \({\mathscr {F}}\) to be a set of functions \(f\in C^{2}([0,\infty ))\) such that

$$\begin{aligned} \begin{aligned} f(0)=f'(0)=f''(0)=0, \quad f''(r)>0 \text { for all } r>0, \end{aligned} \end{aligned}$$

and moreover we require for \(g(x):=f(\left| x\right| )\) that

$$\begin{aligned} \begin{aligned} \lim _{x\rightarrow 0,x\ne 0}F(Dg(x),D^2g(x))=0. \end{aligned} \end{aligned}$$

Further, let

$$\begin{aligned} \begin{aligned} \Sigma =\{ \sigma \in C^{1}({\mathbb {R}}) :\, \sigma \text { is even},\, \sigma (0)=\sigma '(0)=0,\text { and } \sigma (r)>0 \text { for all } r\ne 0 \}. \end{aligned} \end{aligned}$$
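
For orientation, here is a sketch of typical members of these classes; the specific choices \(f(r)=r^{\beta }\) and \(\sigma (r)=r^2\) are illustrative assumptions and are not taken from [22]. For \(g(x)=|x|^{\beta }\) and \(x\ne 0\), one computes \(|Dg|=\beta |x|^{\beta -1}\), \(\Delta _{\infty }^Ng=\beta (\beta -1)|x|^{\beta -2}\) and \(\Delta g=\beta (n+\beta -2)|x|^{\beta -2}\), so that

$$\begin{aligned} F(Dg(x),D^2g(x))=\beta ^{\gamma +1}\big ((n-1)+(p-1)(\beta -1)\big )|x|^{(\beta -1)(\gamma +1)-1}. \end{aligned}$$

Hence \(f\in {\mathscr {F}}\) whenever \(\beta >\max \big \{2,1+\frac{1}{\gamma +1}\big \}\), where \(\beta >2\) guarantees that \(f\in C^2([0,\infty ))\) with \(f''(0)=0\). Similarly, \(\sigma (r)=r^2\) is even, continuously differentiable, satisfies \(\sigma (0)=\sigma '(0)=0\) and is positive for \(r\ne 0\), so \(\sigma \in \Sigma \).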

Definition 2.1

A function \(\varphi \in C^{2}(\Omega _T)\) is admissible if for any \((x_0,t_0)\in \Omega _T\) with \(D\varphi (x_0,t_0)=0\), there are \(\delta >0\), \(f\in {\mathscr {F}}\) and \(\sigma \in \Sigma \) such that

$$\begin{aligned} \left| \varphi (x,t)-\varphi (x_0,t_0)-\varphi _t(x_0,t_0)(t-t_0)\right| \le f(\left| x-x_0\right| )+\sigma (t-t_0) \end{aligned}$$

for all \((x,t)\in B_{\delta }(x_0)\times (t_0-\delta ,t_0+\delta )\).

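If \(D\varphi \ne 0\) everywhere in \(\Omega _T\), then the \(C^2\)-function \(\varphi \) is automatically admissible, since the condition above only concerns points where the gradient vanishes.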

Definition 2.2

We say that \(\varphi \) touches u at \((x_0,t_0)\in \Omega _T\) (strictly) from below if

  1. (1)

    \(u(x_0,t_0)=\varphi (x_0,t_0)\), and

  2. (2)

    \(u(x,t)>\varphi (x,t)\) for all \((x,t)\in \Omega _T\) such that \((x,t)\ne (x_0,t_0)\).

The definition for touching (strictly) from above is analogous.

Definition 2.3

A function \(u:\Omega _T\rightarrow {\mathbb {R}}\cup \{\infty \}\) is a viscosity supersolution to (2.1) if

  1. (i)

    u is lower semicontinuous,

  2. (ii)

    u is finite in a dense subset of \(\Omega _T\),

  3. (iii)

    for all admissible \(\varphi \in C^{2}(\Omega _T)\) touching u at \((x_0,t_0)\in \Omega _T\) from below

    $$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} \varphi _t(x_0,t_0)-F(D\varphi (x_0,t_0),D^2\varphi (x_0,t_0))\ge 0 &{} \text {if }D\varphi (x_0,t_0)\ne 0,\\ \varphi _t(x_0,t_0)\ge 0 &{} \text {if }D\varphi (x_0,t_0)= 0. \end{array}\right. } \end{aligned} \end{aligned}$$

The definition of a subsolution \(u: \Omega _T\rightarrow {\mathbb {R}}\cup \{-\infty \}\) is analogous except that we require upper semicontinuity, touching from above, and we reverse the inequalities above. In other words, u is a viscosity subsolution if \(-u\) is a viscosity supersolution. If a continuous function is both a viscosity super- and subsolution, it is a viscosity solution.

It is shown in [16] that if \(\gamma = p-2>-1\), then the above notion coincides with the notion of p-super/subparabolic functions, having a direct connection to the distributional weak super/subsolutions as well. Moreover, if \(\gamma \ge 0\), then viscosity solutions can be defined in a standard way by using semicontinuous envelopes, see Proposition 2.2.8 in [12].

3 Plan of proof

In this section we explain the idea of the proof of Theorem 1.1 and our plan of the proof.

3.1 Derivation of a basic estimate

In order to prove second order estimates, we first derive a key basic estimate (3.4) (or actually equality at this point). To this end, we regularize the original Eq. (1.1) and consider

$$\begin{aligned} u^{\varepsilon }_t-(\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\gamma /2}\Big (\Delta u^{\varepsilon }+(p-2)\frac{\Delta _{\infty }u^{\varepsilon }}{\left| Du^{\varepsilon }\right| ^2+\varepsilon }\Big )=0 \end{aligned}$$
(3.1)

for small \(\varepsilon >0\). Solutions to this equation are smooth according to the standard theory. We differentiate Eq. (3.1) with respect to \(x_k\), \(k=1,\ldots ,n\), and find that the spatial partial derivatives \(u^{\varepsilon }_{x_k}\), \(k=1,\ldots ,n\), solve the equation

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2-\gamma }{2}}(u^{\varepsilon }_{x_k})_t -{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2}{2}}ADu^{\varepsilon }_{x_k}\big ) \\&\quad +(p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-4-\gamma }{2}}u^{\varepsilon }_t\langle Du^{\varepsilon },Du^{\varepsilon }_{x_k}\rangle =0 \end{aligned} \end{aligned}$$
(3.2)

where

$$\begin{aligned} A=I+(p-2)\frac{Du^{\varepsilon }\otimes Du^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon } \end{aligned}$$

is a uniformly positive definite \(n\times n\)-matrix. Here I denotes the identity matrix.
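
The uniform positive definiteness can be checked directly. In the following sketch, \(\xi \in {\mathbb {R}}^n\) denotes an arbitrary vector, a symbol introduced here only for illustration. Since \(0\le \langle Du^{\varepsilon },\xi \rangle ^2\le |Du^{\varepsilon }|^2|\xi |^2\), we have

$$\begin{aligned} \langle A\xi ,\xi \rangle =|\xi |^2+(p-2)\frac{\langle Du^{\varepsilon },\xi \rangle ^2}{|Du^{\varepsilon }|^2+\varepsilon } \quad \text {and thus}\quad \min \{1,p-1\}|\xi |^2\le \langle A\xi ,\xi \rangle \le \max \{1,p-1\}|\xi |^2, \end{aligned}$$

where the bounds do not depend on \(\varepsilon \).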

We continue with the intention to study the derivatives of \(|Du|^{\frac{p-2+s}{2}}Du\); in particular the choice \(s=2-p\) corresponds to \(D^2u\). We multiply the differentiated Eq. (3.2) by \((|Du^{\varepsilon }|^2+\varepsilon )^{s/2}u^{\varepsilon }_{x_k}\) and obtain

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2-\gamma +s}{2}}u^{\varepsilon }_{x_k}(u^{\varepsilon }_{x_k})_t -(|Du^{\varepsilon }|^2+\varepsilon )^{s/2}u^{\varepsilon }_{x_k}{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2}{2}}ADu^{\varepsilon }_{x_k}\big ) \\&\quad +(p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-4-\gamma +s}{2}}u^{\varepsilon }_t\langle Du^{\varepsilon },Du^{\varepsilon }_{x_k}\rangle u^{\varepsilon }_{x_k}=0. \end{aligned} \end{aligned}$$
(3.3)

Using the chain rule

$$\begin{aligned} u^{\varepsilon }_{x_k}(u^{\varepsilon }_{x_k})_t =\frac{1}{2}\Big ((u^{\varepsilon }_{x_k})^2+\frac{\varepsilon }{n}\Big )_t, \end{aligned}$$

and summing (3.3) over \(k=1,\ldots ,n\) gives that

$$\begin{aligned} \begin{aligned}&\frac{\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p+s-\gamma }{2}}\big )_t}{p+s-\gamma } -(|Du^{\varepsilon }|^2+\varepsilon )^{s/2}\sum _{k=1}^nu^{\varepsilon }_{x_k}{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2}{2}}ADu^{\varepsilon }_{x_k}\big ) \\&\quad +(p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2-\gamma +s}{2}}u^{\varepsilon }_t\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }=0. \end{aligned} \end{aligned}$$
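
In more detail, the summation can be sketched as follows, with the abbreviation \(v:=|Du^{\varepsilon }|^2+\varepsilon \) introduced here only for readability and under the assumption \(s\ne \gamma -p\). The first terms of (3.3) sum to

$$\begin{aligned} \sum _{k=1}^nv^{\frac{p-2-\gamma +s}{2}}u^{\varepsilon }_{x_k}(u^{\varepsilon }_{x_k})_t =\frac{1}{2}v^{\frac{p-2-\gamma +s}{2}}v_t =\frac{\big (v^{\frac{p+s-\gamma }{2}}\big )_t}{p+s-\gamma }, \end{aligned}$$

and in the last terms one uses \(\sum _{k=1}^n\langle Du^{\varepsilon },Du^{\varepsilon }_{x_k}\rangle u^{\varepsilon }_{x_k}=\Delta _{\infty }u^{\varepsilon }\).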

Observing that

$$\begin{aligned}&{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}AD^2u^{\varepsilon }Du^{\varepsilon }\big )\\&\quad =\sum _{k=1}^n{{\,\mathrm{div\,}\,}}\Big (\big ((|Du^{\varepsilon }|^2+\varepsilon )^{s/2} u^{\varepsilon }_{x_k} \big )\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2}{2}}A Du^{\varepsilon }_{x_k}\big )\Big )\\&\quad =(|Du^{\varepsilon }|^2+\varepsilon )^{s/2}\sum _{k=1}^nu^{\varepsilon }_{x_k}{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2}{2}}ADu^{\varepsilon }_{x_k}\big )\\&\qquad +(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \left\{ |D^2u^{\varepsilon }|^2+(p-2+s)\frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } +s(p-2)\frac{(\Delta _{\infty }u^{\varepsilon })^2}{(|Du^{\varepsilon }|^2+\varepsilon )^2} \right\} , \end{aligned}$$

we obtain the identity

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \left\{ |D^2u^{\varepsilon }|^2+(p-2+s)\frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } +s(p-2)\frac{(\Delta _{\infty }u^{\varepsilon })^2}{(|Du^{\varepsilon }|^2+\varepsilon )^2} \right. \\&\qquad \left. +(p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon } \right\} \\&\quad = {{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}AD^2u^{\varepsilon }Du^{\varepsilon }\big ) -\frac{\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p+s-\gamma }{2}}\big )_t}{p+s-\gamma }. \end{aligned} \end{aligned}$$
(3.4)

Here we assume that \(s\ne \gamma -p\). This is not restrictive, because such a value of s violates the resulting range condition (1.4) in any case. It is important that the terms on the right hand side are in divergence form and can thus be well estimated. An important step towards the desired result would be a pointwise inequality

$$\begin{aligned} (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} |D^2u^{\varepsilon }|^2 \lesssim {{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}AD^2u^{\varepsilon }Du^{\varepsilon }\big ) -\frac{\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p+s-\gamma }{2}}\big )_t}{p+s-\gamma }, \end{aligned}$$

which could then be integrated to obtain the final result. For this we need to estimate the excess terms on the left hand side of (3.4).

3.2 Formal calculation for smooth solutions with a nonvanishing gradient

Compared to our earlier work [11] where we treated the case \(\gamma =p-2\), we now have two extra difficulties for the general case \(-1<\gamma <\infty \). The first difficulty arises from the fourth term on the left hand side of (3.4), that is,

$$\begin{aligned} (p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }. \end{aligned}$$

Note that this mixed term vanishes if \(\gamma =p-2\). In general, we call the term mixed because its sign cannot be determined from the sign of the coefficient \(p-2-\gamma \) alone.

We first discuss the difficulty of mixed terms in the formal case with \({\varepsilon }=0\), and denote a solution by u. In this case, we assume in addition that \(Du\ne 0\). As indicated above, we would like to estimate the excess term in (3.4) and obtain an estimate for \(|Du|^{p-2+s}|D^2u|^2\) with the range (1.4). To this end, we write the fundamental inequality (1.3) in the form

$$\begin{aligned} 2|D_T|Du||^2+\frac{(\Delta _T u)^2}{n-1}+(\Delta _{\infty }^Nu)^2\le |D^2u|^2 \end{aligned}$$

and employ it in identity (3.4) on the term \(|D u|^{p-2+s} |D^2 u|^2\) to obtain that

$$\begin{aligned} \begin{aligned}&|Du|^{p-2+s} \left\{ \frac{1}{n-1}(\Delta _Tu)^2+(p+s)|D_T|Du||^2+(p-1)(s+1)(\Delta _{\infty }^Nu)^2 \right. \\&\qquad \left. +(p-2-\gamma )\left| Du\right| ^{-\gamma }u_t\Delta _{\infty }^Nu\right\} \\&\quad \le {{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s}AD^2uDu\big ) -\frac{\big (\left| Du\right| ^{p+s-\gamma }\big )_t}{p+s-\gamma }, \end{aligned} \end{aligned}$$
(3.5)

where

$$\begin{aligned} \left| D_T\left| Du\right| \right| ^2:=\frac{|D^2uDu|^2}{|Du|^2}-(\Delta _{\infty }^Nu)^2 \quad \text {and}\quad \Delta _T u:=\Delta u-\Delta _{\infty }^Nu. \end{aligned}$$

Note that \(|D_T|Du||^2\ge 0\). Sometimes \(\Delta _T u\) is called the normalized 1-Laplacian for the obvious reason.
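
The coefficients in (3.5) can be traced as follows; this is a sketch of the bookkeeping under the formal assumptions \(\varepsilon =0\) and \(Du\ne 0\). By the definition of \(|D_T|Du||^2\), we have \(\frac{|D^2uDu|^2}{|Du|^2}=|D_T|Du||^2+(\Delta _{\infty }^Nu)^2\), so the fundamental inequality gives

$$\begin{aligned} \begin{aligned}&|D^2u|^2+(p-2+s)\frac{|D^2uDu|^2}{|Du|^2}+s(p-2)(\Delta _{\infty }^Nu)^2 \\&\quad \ge \frac{(\Delta _Tu)^2}{n-1}+\big (2+(p-2+s)\big )|D_T|Du||^2+\big (1+(p-2+s)+s(p-2)\big )(\Delta _{\infty }^Nu)^2, \end{aligned} \end{aligned}$$

where \(2+(p-2+s)=p+s\) and \(1+(p-2+s)+s(p-2)=(p-1)(s+1)\).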

Apart from the mixed term, which is the last term on the left hand side of (3.5), the nonnegativity of the other terms on the left hand side of (3.5) follows easily from the restriction \(s>-1\). In order to develop a systematic way of checking nonnegativity of the mixed term utilizing other terms, we use Eq. (1.1) to rewrite

$$\begin{aligned} \left| Du\right| ^{-\gamma }u_t\Delta _{\infty }^Nu=\Delta _Tu\Delta _{\infty }^Nu+(p-1)(\Delta _{\infty }^Nu)^2, \end{aligned}$$

and view the mixed term \(\Delta _Tu\Delta _{\infty }^Nu\) as a part of a quadratic form of \(\Delta _T u\) and \(\Delta _{\infty }^Nu\). That is, we consider

$$\begin{aligned} \begin{aligned} Q:&=\frac{1}{n-1}(\Delta _Tu)^2+(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2+(p-2-\gamma )\Delta _Tu\Delta _{\infty }^Nu \\&=:\langle {\bar{x}},M{\bar{x}}\rangle , \end{aligned} \end{aligned}$$

where \({\bar{x}}:=(\Delta _Tu,\Delta _{\infty }^Nu)^T\in {\mathbb {R}}^2\) and

$$\begin{aligned} M= \begin{bmatrix} \displaystyle {\frac{1}{n-1} }&{} \displaystyle {\frac{1}{2}}(p-2-\gamma )\\ \displaystyle {\frac{1}{2}(p-2-\gamma )} &{} (p-1)(p-1+s-\gamma ) \end{bmatrix} \end{aligned}$$

is a symmetric \(2\times 2\)-matrix.
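
Since \(\frac{1}{n-1}>0\), Sylvester's criterion shows that M is positive definite if and only if \(\det M>0\), that is,

$$\begin{aligned} \frac{(p-1)(p-1+s-\gamma )}{n-1}>\frac{(p-2-\gamma )^2}{4}. \end{aligned}$$

As a numerical illustration, with parameter values chosen here only as an example, let \(n=2\), \(p=2\), \(\gamma =\frac{1}{2}\) and \(s=-\frac{9}{20}\). Then \(\max \big \{-1-\frac{p-1}{n-1},\gamma +1-p\big \}=-\frac{1}{2}<s\), so (1.4) holds, but \(\det M=\frac{1}{20}-\frac{1}{16}=-\frac{1}{80}<0\).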

It turns out that in order to derive the desired estimate, it suffices to ensure, along with a few other conditions, that the quadratic form Q is strictly positive in \({\mathbb {R}}^2\backslash \{0\}\), that is, M is positive definite. However, the range condition in (1.4) does not suffice to guarantee the positive definiteness of Q, hence we need to improve the estimate. We employ the following observation: If \(q>1\), then

$$\begin{aligned} \begin{aligned} u_t |Du|^{q-2}\Delta _q^N u=u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{q-2}Du\big )&= {{\,\mathrm{div\,}\,}}(u_t|Du|^{q-2}Du)-\frac{(|Du|^{q})_t}{q} \end{aligned} \end{aligned}$$
(3.6)

holds for any smooth function u with nonvanishing gradient. In other words, the quantity on the left hand side is a ‘good term’ with a hidden divergence structure.
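
The identity (3.6) follows from the product rule. As a sketch, for smooth u with \(Du\ne 0\),

$$\begin{aligned} \begin{aligned} {{\,\mathrm{div\,}\,}}(u_t|Du|^{q-2}Du)&=u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{q-2}Du\big )+|Du|^{q-2}\langle Du_t,Du\rangle , \\ |Du|^{q-2}\langle Du_t,Du\rangle&=\frac{(|Du|^{q})_t}{q}, \end{aligned} \end{aligned}$$

while \({{\,\mathrm{div\,}\,}}\big (|Du|^{q-2}Du\big )=|Du|^{q-2}\big (\Delta u+(q-2)\Delta _{\infty }^Nu\big )=|Du|^{q-2}\Delta _q^Nu\).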

It is easier to utilize this observation with inequality (3.5), if we rewrite the right hand side of that inequality using Eq. (1.1). To be more precise,

$$\begin{aligned} \begin{aligned} {{\,\mathrm{div\,}\,}}&\big (|Du|^{p-2+s }AD^2uDu\big ) -\frac{\big (\left| Du\right| ^{p +s-\gamma }\big )_t}{p +s-\gamma } \\&= {{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s }(D^2uDu-\Delta uDu)\big ) +u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s-\gamma }Du\big ), \end{aligned} \end{aligned}$$
(3.7)

where the last term now matches with (3.6) setting \(q:=p+s-\gamma \). On the other hand, for a solution u, by Eq. (1.1), and by the definition of normalized q-Laplacian \(\Delta _q^Nu\), one has

$$\begin{aligned} u_t=|Du|^{\gamma }(\Delta _Tu+(p-1)\Delta _{\infty }^Nu), \quad \text {and}\quad \Delta _{p+s-\gamma }^Nu=\Delta _Tu+(p-1+s-\gamma )\Delta _{\infty }^Nu \end{aligned}$$

and thus

$$\begin{aligned} \begin{aligned}&u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s-\gamma }Du\big )\\&\quad =|Du|^{\gamma }\big (\Delta _Tu+(p-1)\Delta _{\infty }^Nu\big )\cdot |Du|^{p-2+s-\gamma }\cdot \big (\Delta _Tu+(p-1+s-\gamma )\Delta _{\infty }^Nu\big ) \\&\quad =|Du|^{p-2+s }\Big \{(\Delta _Tu)^2+(2 p-2+s-\gamma ) \Delta _Tu\Delta _{\infty }^Nu+(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2\Big \}. \end{aligned}\nonumber \\ \end{aligned}$$
(3.8)

The idea is to add \(u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s-\gamma }Du\big )\) with a suitable weight on both sides of (3.5): then by the above equation, it produces new coefficients on the left hand side that can be utilized later to get a better range, and controllable terms on the right hand side by (3.7). We also add another positive weight by using

$$\begin{aligned} \begin{aligned} |Du|^{p-2+s}&\Big \{|D^2 u|^2-(\Delta u)^2 +(p-2+s)\frac{|D^2uDu|^2}{|Du|^2}-(p-2+s )\Delta u\Delta _{\infty }^Nu\Big \} \\&\quad = {{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s}(D^2uDu-\Delta uDu)\big ) \end{aligned} \end{aligned}$$
(3.9)

from Lemma 4.2 below which holds for any smooth function with nonvanishing gradient. This allows us to obtain simplified coefficients in intermediate steps. Thus we obtain

$$\begin{aligned} \begin{aligned}&|Du|^{p-2+s } \left\{ \left( w_2- \frac{n-2}{n-1} w_1\right) (\Delta _T u)^2 +w_1(p +s )\left| D_T\left| Du\right| \right| ^2 \right. \\&\qquad \left. +w_2(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2+\big (w_2(2p-2+s-\gamma )-w_1(p +s )\big )\Delta _Tu\Delta _{\infty }^Nu\right\} \\&\quad \le w_1{{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s }(D^2uDu-\Delta uDu)\big ) +w_2 u_t{{\,\mathrm{div\,}\,}}\big (|Du|^{p-2+s-\gamma }Du\big ), \end{aligned}\nonumber \\ \end{aligned}$$
(3.10)

which reduces to (3.5) if \(w_1=1\) and \(w_2=1\). Calculations reveal that if the range condition (1.4) holds, then the weights \(w_1\) and \(w_2\) can be adjusted so that the weighted quadratic form

$$\begin{aligned}&\Big (w_2-\frac{n-2}{n-1}w_1\Big )(\Delta _T u)^2 +w_2(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2 \\&\quad +\big (w_2(2p-2+s-\gamma )-w_1( p +s )\big )\Delta _Tu\Delta _{\infty }^Nu \end{aligned}$$

is positive in \({\mathbb {R}}^2\backslash \{0\}\). This positivity in the formal case \(\varepsilon =0\) is shown in Lemma 5.2. By Proposition 5.1, this then implies the desired estimate

$$\begin{aligned} \int _{Q_r}\left| D(\left| Du\right| ^{\frac{p-2+s }{2}} Du)\right| ^2dxdt \le \frac{C}{r^2}\Big ( \int _{Q_{2r}}|Du|^{p +s }dxdt + \int _{Q_{2r}}|Du|^{p+s-\gamma }dxdt \Big ). \end{aligned}$$
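
For the reader's convenience, here is a sketch of how the left hand side of (3.10) arises, assuming \(w_1\ge 0\) so that the fundamental inequality may be multiplied by \(w_1\). Inserting \(\Delta u=\Delta _Tu+\Delta _{\infty }^Nu\), \(\frac{|D^2uDu|^2}{|Du|^2}=|D_T|Du||^2+(\Delta _{\infty }^Nu)^2\) and the fundamental inequality into the braces on the left hand side of (3.9) gives

$$\begin{aligned} \begin{aligned}&|D^2 u|^2-(\Delta u)^2 +(p-2+s)\frac{|D^2uDu|^2}{|Du|^2}-(p-2+s )\Delta u\Delta _{\infty }^Nu \\&\quad \ge (p+s)|D_T|Du||^2-\frac{n-2}{n-1}(\Delta _Tu)^2-(p+s)\Delta _Tu\Delta _{\infty }^Nu. \end{aligned} \end{aligned}$$

Multiplying this by \(w_1\) and adding \(w_2\) times the expression in the braces on the last line of (3.8) yields the coefficients in (3.10).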

Heuristically, in order to prove the above estimate, and setting \(s=2-p\) for simplicity, we could have left a small piece of \(|D^2 u|^2\) when applying the fundamental inequality for (3.5). Then the rest of the terms can be dropped by the above positivity result: in detail this is implemented in Lemma 4.4 also for other values of s. The obtained pointwise estimate can then be integrated by parts along with a cutoff function to get Proposition 5.1.

3.3 Solutions without smoothness assumptions and regularized equation

The second difficulty, which is related to the regularization, is that the left hand side of (3.4) consists of regularized versions of second order derivative quantities,

$$\begin{aligned} \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } \quad \text {and}\quad \frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }, \end{aligned}$$

whereas employing the fundamental inequality (1.3) results in quantities like

$$\begin{aligned} \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2}\quad \text {and}\quad \Delta _{\infty }^Nu^{\varepsilon }. \end{aligned}$$

Because of this mismatch, some of the formal calculations do not carry over as such but involve further complications. In particular, positive definiteness of the quadratic form becomes an issue.

For a certain range of parameters, the main result is obtained by a straightforward generalization of the formal calculation (\(\varepsilon =0\)) in the previous section. However, in the process of extending the range, we consider

$$\begin{aligned} \begin{aligned} S&:= w_1{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}(D^2u^{\varepsilon }Du^{\varepsilon }-\Delta u^{\varepsilon }Du^{\varepsilon })\big ) \\&\quad +w_2u^{\varepsilon }_t{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s-\gamma }{2}}Du^{\varepsilon }\big ) \\&\quad +w_3\varepsilon {{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}-1}(D^2u^{\varepsilon }Du^{\varepsilon }-\Delta u^{\varepsilon }Du^{\varepsilon })\big ) \\&\quad +w_4\varepsilon u^{\varepsilon }_t{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s-\gamma }{2}-1}Du^{\varepsilon }\big ), \end{aligned} \end{aligned}$$
(3.11)

where \(w_1,w_2,w_3,w_4\in {\mathbb {R}}\). Compared to the right hand side of (3.10), or (6.3), this sum has two additional terms with weights \(w_3\) and \(w_4\). The latter additional term has a hidden divergence structure, similarly to (3.6). These divergence structures can be used to adjust the coefficients on the left hand side of the estimate (3.10), and thus to improve the range of parameters. To be more precise, we denote

$$\begin{aligned} \theta :=\frac{|Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } \quad \text {and}\quad \kappa :=1-\theta =\frac{\varepsilon }{|Du^{\varepsilon }|^2+\varepsilon }, \end{aligned}$$
(3.12)

and obtain

$$\begin{aligned} \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon }=\theta \left| D\left| Du^{\varepsilon }\right| \right| ^2 \quad \text {and}\quad \frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }=\theta \Delta _{\infty }^Nu^{\varepsilon }. \end{aligned}$$

The second mixed term of (3.11) can also be written as a part of the quadratic form as follows

$$\begin{aligned} \begin{aligned}&\varepsilon u^{\varepsilon }_t{{\,\mathrm{div\,}\,}}\left( \left( \left| Du^{\varepsilon }\right| ^2+\varepsilon \right) ^{\frac{p-2+s-\gamma }{2}-1}Du^{\varepsilon }\right) \\&\quad =\kappa (\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\frac{p-2+s }{2}}\Big ((\Delta _Tu^{\varepsilon })^2+\big ((2p-6+s-\gamma )\theta +2 \big )\Delta _Tu^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\\&\qquad +\big ((p-2)\theta +1\big )\big ((p-4+s-\gamma )\theta +1 \big )(\Delta _{\infty }^Nu^{\varepsilon })^2\Big ) \end{aligned} \end{aligned}$$
(3.13)

with weight \(w_4\), where

$$\begin{aligned} u^{\varepsilon }_t=(|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\Big (\Delta _T u^{\varepsilon }+\big ((p-2)\theta +1\big )\Delta _{\infty }^Nu^{\varepsilon }\Big ) \end{aligned}$$

by using the regularized equation and recalling the shorthand notation \(\Delta _T u^{\varepsilon }:=\Delta u^{\varepsilon }-\Delta _{\infty }^Nu^{\varepsilon }.\) This will give rise to new coefficients and thus to a better range condition.

In order to produce new coefficients on the left hand side of (3.4), especially for the second order term \((\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\frac{p-2+s}{2}}\left| D^2u^{\varepsilon }\right| ^2\), and also to improve the range of the parameters, we add another divergence structure

$$\begin{aligned} \begin{aligned}&\varepsilon {{\,\mathrm{div\,}\,}}\big ((\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\frac{p-2+s}{2}-1}(D^2u^{\varepsilon }Du^{\varepsilon }-\Delta u^{\varepsilon }Du^{\varepsilon })\big )\\&\quad =\kappa (\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{|D^2 u^{\varepsilon }|^2-(\Delta u^{\varepsilon })^2 +(p-4+s)\theta \left| D\left| Du^{\varepsilon }\right| \right| ^2\\&\qquad -(p-4+s)\theta \Delta u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\Big \}. \end{aligned} \end{aligned}$$
(3.14)

Also observe that the above choice of the power \((p-2+s)/2-1\) will be useful in the proof of Lemma 4.5 when deriving an upper bound for the left hand side of the estimate, after integration by parts where we estimate \({\varepsilon }/(|Du^{\varepsilon }|^2+\varepsilon )\le 1\) and thus the additional \(-1\) in the power gets canceled out. Besides, the error terms obtained in Lemmas 4.4 and 4.7 in [10] can be seen as special cases of the error terms above.

Then combining (3.8), (3.9), (3.13) and (3.14) together with definition (3.11) of S, we get

$$\begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2 +(c_3-c_1)(\Delta _Tu^{\varepsilon })^2 \nonumber \\&\quad +\Big ((c_3+c_4)\big ((p-2)\theta +1\big )-c_1\Big )(\Delta _{\infty }^Nu^{\varepsilon })^2 \nonumber \\&\quad +\Big (c_3\big ((p-2)\theta +1\big )+(c_3+c_4)-(2c_1+c_2)\Big ) \Delta _Tu^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\Big \}= S, \end{aligned}$$
(3.15)

where \(c_1,c_2,c_3\) and \(c_4\) depend on \(w_1,w_2,w_3, w_4\) and \(\theta \) as computed in detail in Sect. 4.2. Then we again use the fundamental inequality on part of \(c_1|D^2u^{\varepsilon }|^2\) and find such weights \(w_1,w_2,w_3\) and \(w_4\) that the last three terms on the left hand side can be interpreted as a positive definite quadratic form and thus removed. Finally, S on the right hand side can be multiplied by a cutoff function and integrated by parts to get the final estimate. However, the nonnegativity can only be checked in certain ranges, since it needs to hold uniformly for all \(\theta \in [0,1)\).

4 Hidden divergence structures, the key estimate and auxiliary lemmas

In this section we prove several auxiliary tools. The lemmas in this section will be used to prove estimates for both \(u^{\varepsilon }\), that solves (3.1) with \(\varepsilon >0\), and u, that solves (3.1) with \(\varepsilon =0\) and \(Du\ne 0\). Therefore we state the lemmas in such a generality that applies to both of these cases.

4.1 Hidden divergence structures

In this subsection we gather some useful facts about generic smooth functions. First, if \(u:\Omega _T\rightarrow {\mathbb {R}}\), \(\Omega _T\subset {\mathbb {R}}^{n+1}\) is a smooth function, then |Du| is locally Lipschitz continuous and thus, by Rademacher’s theorem, differentiable almost everywhere on each time slice. Here and in similar occurrences in what follows, we write that D|Du| exists almost everywhere in space.

Note that if \((x_0,t_0)\in \Omega _T\) is a space-time point where |Du| is differentiable and \(Du(x_0,t_0)=0\), then \(D|Du|(x_0,t_0)=0\). Indeed, if we had \(D|Du|(x_0,t_0)\ne 0\), then we could find a point \(\xi \in \Omega \times \{t_0\}\) (close to \((x_0,t_0)\)) such that \(|Du|(\xi )<0\), which is obviously impossible. On the other hand, if \(Du(x_0,t_0)\ne 0\) for some \((x_0,t_0)\in \Omega _T\), then |Du| is differentiable at \((x_0,t_0)\) and

$$\begin{aligned} D|Du|(x_0,t_0)=\frac{D^2u(x_0,t_0)Du(x_0,t_0)}{|Du(x_0,t_0)|}. \end{aligned}$$

For each point in \(\Omega _T\) where \(Du\ne 0\), we fix an orthonormal basis of \({\mathbb {R}}^n\), \(\{e_1,\ldots ,e_n\}\), such that \(e_n=\frac{Du}{|Du|}\). Hence we have, for those points where \(Du\ne 0\),

$$\begin{aligned} \frac{D^2uDu}{|Du|} =\langle e_1,D|Du|\rangle e_1 +\ldots +\langle e_{n-1},D|Du|\rangle e_{n-1} +\bigg \langle \frac{Du}{|Du|},D|Du|\bigg \rangle \frac{Du}{|Du|}. \end{aligned}$$

For those points where |Du| is differentiable, let us define the part of D|Du| which is tangential to the spatial level sets of u as

$$\begin{aligned} D_T|Du|:= {\left\{ \begin{array}{ll} \langle e_1,D|Du|\rangle e_1 +\ldots +\langle e_{n-1},D|Du|\rangle e_{n-1} &{}\quad \text {if }Du\ne 0, \\ 0 &{}\quad \text {if }Du=0, \end{array}\right. } \end{aligned}$$

and its orthogonal counterpart, the normalized infinity Laplacian, as

$$\begin{aligned} \Delta _{\infty }^Nu:= {\left\{ \begin{array}{ll} \langle \frac{Du}{|Du|},D|Du|\rangle =\frac{\Delta _{\infty }u}{|Du|^2} &{}\quad \text {if }Du\ne 0, \\ 0 &{}\quad \text {if }Du=0. \end{array}\right. } \end{aligned}$$

We employ this notation to write

$$\begin{aligned} |D|Du||^2=\left| D_T|Du|\right| ^2+(\Delta _{\infty }^Nu)^2 \quad {\text {a.e.}}\,{\text {in}}\,{\text {space}}\,{\text {in }} \Omega _T, \end{aligned}$$
(4.1)

and

$$\begin{aligned} \Delta _T u=\Delta u-\Delta _{\infty }^Nu \quad {\text {a.e.}}\,{\text {in }} \Omega _T. \end{aligned}$$
(4.2)

Lemma 4.1

(Fundamental inequality) Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth function. Then

$$\begin{aligned} |D^2u|^2\ge 2|D_T|Du| |^2 +\frac{(\Delta _T u)^2}{n-1} +(\Delta _{\infty }^Nu )^2\quad {\text {a.e.}}\,{\text {in}}\,{\text {space}}\,{\text {in }} \Omega _T. \end{aligned}$$
(4.3)

If \(n=2\), we have equality in the place of inequality.

For the proof of Lemma 4.1, we refer to [11, 25].
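As an informal sanity check of (4.3), and not a substitute for the proof in [11, 25], the inequality can be tested pointwise on random data. The sketch below treats a symmetric matrix `H` as a stand-in for \(D^2u\) and a nonzero vector `g` as a stand-in for Du at a single point; the helper names `fundamental_gap` and `random_case` are hypothetical and chosen only for this illustration.

```python
import random

def fundamental_gap(H, g):
    """Left side minus right side of (4.3) for a symmetric H and a vector g != 0."""
    n = len(g)
    g2 = sum(x * x for x in g)                                        # |Du|^2
    Hg = [sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]    # D^2u Du
    lap = sum(H[i][i] for i in range(n))                              # Laplacian
    inf_lap_n = sum(g[i] * Hg[i] for i in range(n)) / g2              # normalized infinity Laplacian
    grad_norm_sq = sum(x * x for x in Hg) / g2                        # |D|Du||^2
    tangential_sq = grad_norm_sq - inf_lap_n ** 2                     # |D_T|Du||^2, by (4.1)
    lap_t = lap - inf_lap_n                                           # Delta_T u, by (4.2)
    hess_sq = sum(H[i][j] ** 2 for i in range(n) for j in range(n))   # |D^2u|^2
    return hess_sq - (2 * tangential_sq + lap_t ** 2 / (n - 1) + inf_lap_n ** 2)

def random_case(n, rng):
    """Random symmetric matrix and a vector bounded away from zero (illustrative data)."""
    A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    H = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]
    g = [rng.choice([-1, 1]) * rng.uniform(0.5, 1.5) for _ in range(n)]
    return H, g

rng = random.Random(0)
gaps_2d = [fundamental_gap(*random_case(2, rng)) for _ in range(200)]
gaps_3d = [fundamental_gap(*random_case(3, rng)) for _ in range(200)]
```

In this experiment the gap should vanish up to rounding when \(n=2\), in line with the equality case of Lemma 4.1, and should be nonnegative when \(n=3\).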

The following lemmas show that certain terms that at first appear to be in non-divergence form can actually be expressed in divergence form. Moreover, these structures can be utilized in tuning the coefficients in the quadratic form as explained in Sect. 3.3, and thus they improve the range we obtain. The first one, Lemma 4.2, will mainly adjust the coefficient of the term \((\left| Du^{\varepsilon }\right| ^2+\varepsilon )^{\frac{p-2+s}{2}}\left| D^2u^{\varepsilon }\right| ^2.\) The second divergence structure, Lemma 4.3, will produce new coefficients in the quadratic form Q. The proofs of both of these lemmas are direct calculations.

Lemma 4.2

(Hidden divergence structure 1) Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth function. Then for any \(\alpha \in {\mathbb {R}}\) and \(\varepsilon >0\),

$$\begin{aligned} \begin{aligned}&(|Du|^2+\varepsilon )^{\alpha /2} \left\{ |D^2u|^2-(\Delta u)^2 +\alpha \frac{|D^2uDu|^2}{|Du|^2+\varepsilon }-\alpha \Delta u\frac{\Delta _{\infty }u}{|Du|^2+\varepsilon }\right\} \\&\quad = {{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{\alpha /2}(D^2uDu-\Delta uDu)\big ). \end{aligned} \end{aligned}$$

Furthermore, if \(Du\ne 0\), then the above equality holds also for \(\varepsilon =0\).

Proof

By the product rule and the chain rule, the right hand side can be computed as

$$\begin{aligned}&{{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{\alpha /2}(D^2uDu-\Delta uDu )\big )\\&\quad = \big< D^2uDu-\Delta uDu , D\big ((|Du|^2+\varepsilon )^{\alpha /2} \big )\big>+ (|Du|^2+\varepsilon )^{\alpha /2}{{\,\mathrm{div\,}\,}}( D^2uDu-\Delta uDu )\\&\quad =\big < D^2uDu-\Delta uDu , D\big ((|Du|^2+\varepsilon )^{\alpha /2} \big )\big >+ (|Du|^2+\varepsilon )^{\alpha /2}\big (|D^2u|^2 -(\Delta u)^2 \big )\\&\quad = (|Du|^2+\varepsilon )^{\alpha /2}\Big \{|D^2u|^2-(\Delta u)^2+\alpha \frac{|D^2uDu|^2}{|Du|^2+\varepsilon }-\alpha \Delta u\frac{\Delta _{\infty }u}{|Du|^2+\varepsilon }\Big \}, \end{aligned}$$

where

$$\begin{aligned} \hspace{9 em} D\big ((|Du|^2+\varepsilon )^{\alpha / 2} \big )=\alpha (|Du|^2+\varepsilon )^{\frac{\alpha -2}{2}}D^2uDu.\hspace{10 em} \end{aligned}$$

\(\square \)
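The identity of Lemma 4.2 can also be checked numerically at a single point. The following is a minimal sketch under stated assumptions: the polynomial \(u(x,y)=x^3+xy^2+2y^2+x\) in two variables, the evaluation point, and the values of \(\alpha \) and \(\varepsilon \) are arbitrary illustrative choices, and the divergence is approximated by central differences rather than computed exactly.

```python
def grad(x, y):
    # Du for the illustrative choice u(x, y) = x^3 + x*y^2 + 2*y^2 + x
    return (3 * x * x + y * y + 1, 2 * x * y + 4 * y)

def hess(x, y):
    # Entries (u_xx, u_xy, u_yy) of D^2u for the same u
    return (6 * x, 2 * y, 2 * x + 4)

def field(x, y, alpha, eps):
    """Vector field (|Du|^2 + eps)^(alpha/2) (D^2u Du - Laplace(u) Du)."""
    gx, gy = grad(x, y)
    uxx, uxy, uyy = hess(x, y)
    lap = uxx + uyy
    weight = (gx * gx + gy * gy + eps) ** (alpha / 2)
    hgx, hgy = uxx * gx + uxy * gy, uxy * gx + uyy * gy   # D^2u Du
    return (weight * (hgx - lap * gx), weight * (hgy - lap * gy))

def divergence_side(x, y, alpha, eps, h=1e-5):
    """Right hand side of Lemma 4.2, with the divergence taken by central differences."""
    dfx = (field(x + h, y, alpha, eps)[0] - field(x - h, y, alpha, eps)[0]) / (2 * h)
    dfy = (field(x, y + h, alpha, eps)[1] - field(x, y - h, alpha, eps)[1]) / (2 * h)
    return dfx + dfy

def pointwise_side(x, y, alpha, eps):
    """Left hand side of Lemma 4.2, evaluated from Du and D^2u only."""
    gx, gy = grad(x, y)
    uxx, uxy, uyy = hess(x, y)
    lap = uxx + uyy
    q = gx * gx + gy * gy + eps
    hgx, hgy = uxx * gx + uxy * gy, uxy * gx + uyy * gy
    hess_sq = uxx ** 2 + 2 * uxy ** 2 + uyy ** 2          # |D^2u|^2
    hg_sq = hgx ** 2 + hgy ** 2                           # |D^2u Du|^2
    inf_lap = gx * hgx + gy * hgy                         # <Du, D^2u Du>
    return q ** (alpha / 2) * (hess_sq - lap ** 2 + alpha * hg_sq / q - alpha * lap * inf_lap / q)

left = pointwise_side(0.3, -0.7, -0.5, 0.1)
right = divergence_side(0.3, -0.7, -0.5, 0.1)
```

The two values `left` and `right` should agree up to the finite-difference error, which reflects the cancellation of the third order derivatives in \({{\,\mathrm{div\,}\,}}(D^2uDu-\Delta uDu)\).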

The next lemma demonstrates that a mixed term can be written in a divergence form. On the other hand by using Eq. (3.1), as explained in (3.13), the mixed term adds up in the quadratic form, and thus adding such mixed terms can be used to improve the range.

Lemma 4.3

(Hidden divergence structure 2) Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth function. Then for any \(\beta \in {\mathbb {R}}\) and \(\varepsilon >0\),

$$\begin{aligned} \begin{aligned}&u_t(|Du|^2+\varepsilon )^{\beta /2} \left( \Delta u +\beta \frac{\Delta _{\infty }u}{|Du|^2+\varepsilon }\right) \\&\quad =u_t{{\,\mathrm{div\,}\,}}\left( (|Du|^2+\varepsilon )^{\beta /2}Du\right) \\&\quad = {\left\{ \begin{array}{ll} \displaystyle {{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{\beta /2} Du\big )-\left( \frac{(|Du|^2+\varepsilon )^{\frac{\beta +2}{2}}}{\beta +2}\right) _t &{}\quad \text {if }\beta \ne -2, \\ \displaystyle {{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{-1}Du\big )-\left( \frac{\ln (|Du|^2+\varepsilon )}{2}\right) _t &{}\quad \text {if }\beta =-2. \end{array}\right. } \end{aligned} \end{aligned}$$

Furthermore, if \(Du\ne 0\), then the above equality holds also for \(\varepsilon =0\).

Proof

We give the proof when \(\beta \ne -2\); the second case is similar. By the product rule and the chain rule again, one has

$$\begin{aligned}&{{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{\beta /2} Du\big )-\left( \frac{(|Du|^2+\varepsilon )^{\frac{\beta +2}{2}}}{\beta +2}\right) _t\\&\quad =u_t{{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{\beta /2} Du\big )+(|Du|^2+\varepsilon )^{\beta /2} Du Du_t -\left( \frac{(|Du|^2+\varepsilon )^{\frac{\beta +2}{2}}}{\beta +2}\right) _t\\&\quad =u_t{{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{\beta /2} Du\big )\\&\quad = u_t\big <D\big ((|Du|^2+\varepsilon )^{\beta /2}\big ), Du\big >+u_t (|Du|^2+\varepsilon )^{\beta /2} {{\,\mathrm{div\,}\,}}(Du) \\&\quad =u_t(|Du|^2+\varepsilon )^{\beta /2}\left( \Delta u+\beta \frac{\Delta _{\infty }u}{|Du|^2+\varepsilon }\right) . \end{aligned}$$

\(\square \)
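For the reader's convenience, the omitted case \(\beta =-2\) can be sketched along the same lines. The only change is that the time derivative of the power is replaced by the time derivative of a logarithm, and the two terms containing \(\langle Du,Du_t\rangle \) cancel as before:

$$\begin{aligned}&{{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{-1} Du\big )-\left( \frac{\ln (|Du|^2+\varepsilon )}{2}\right) _t\\&\quad =u_t{{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{-1} Du\big )+\frac{\langle Du,Du_t\rangle }{|Du|^2+\varepsilon } -\frac{\langle Du,Du_t\rangle }{|Du|^2+\varepsilon }\\&\quad =u_t(|Du|^2+\varepsilon )^{-1}\left( \Delta u-2\frac{\Delta _{\infty }u}{|Du|^2+\varepsilon }\right) . \end{aligned}$$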

For \(\alpha \in {\mathbb {R}}\), we denote the ‘first good divergence structure’ as

$$\begin{aligned} GD_1(\alpha ):={{\,\mathrm{div\,}\,}}\big ((|Du|^2+\varepsilon )^{\alpha /2}(D^2u Du-\Delta uDu)\big ) \end{aligned}$$

and the ‘second good divergence structure’

$$\begin{aligned} \begin{aligned} GD_2(\alpha ):= {\left\{ \begin{array}{ll} \displaystyle {{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{\frac{\alpha -\gamma }{2}} Du\big )-\left( \frac{(|Du|^2+\varepsilon )^{\frac{\alpha -\gamma +2}{2}}}{\alpha -\gamma +2}\right) _t &{}\quad \text {if }\alpha \ne \gamma -2, \\ \displaystyle {{\,\mathrm{div\,}\,}}\big (u_t(|Du|^2+\varepsilon )^{-1} Du\big )-\left( \frac{\ln (|Du|^2+\varepsilon )}{2}\right) _t &{}\quad \text {if }\alpha =\gamma -2. \end{array}\right. } \end{aligned} \end{aligned}$$

Then as explained in (3.11), we consider the following weighted sum of these ‘good structures’,

$$\begin{aligned} \begin{aligned} S:=&w_1GD_1(p-2+s)+w_2GD_2(p-2+s)\\&+\varepsilon w_3GD_1(p-4+s)+\varepsilon w_4GD_2(p-4+s) \end{aligned} \end{aligned}$$
(4.4)

for some parameter \(s\in {\mathbb {R}}\) and some weights \(w_1,w_2,w_3,w_4\in {\mathbb {R}}\). Observe that taking into account Lemmas 4.2 and 4.3, then S introduced above coincides with S in (3.11), i.e. the notation is consistent. The reason for using the mixed term form in S there was to emphasize the idea that we can improve the range by adding the mixed terms. To derive the final estimate, we need terms in the divergence form, and therefore this form was used in the above definition of S, but as stated they are equivalent.

4.2 The key estimate

As explained in (3.15), S represents the right hand side in our key estimate, and on the left we should have the second derivatives and a positive definite quadratic form. In this section, we derive the key estimate corresponding to (3.15) in detail.

We use Lemmas 4.2 and 4.3 to rewrite S as a linear combination of time derivatives and second order spatial derivative quantities, similarly to the left hand side of (3.4). First recall the shorthand notation \(\theta \) and \(\kappa \) from (3.12),

$$\begin{aligned} \theta =\frac{|Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } \quad \text {and}\quad \kappa =\frac{\varepsilon }{|Du^{\varepsilon }|^2+\varepsilon }, \end{aligned}$$

thus \(0\le \theta ,\kappa \le 1\), \(\theta +\kappa =1\) and

$$\begin{aligned} \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon }=\theta \left| D\left| Du^{\varepsilon }\right| \right| ^2 \quad \text {and}\quad \frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }=\theta \Delta _{\infty }^Nu^{\varepsilon }. \end{aligned}$$

In particular, if \(\varepsilon =0\) and the gradient does not vanish, then \(\theta \equiv 1\) and \(\kappa \equiv 0\). Next we recall the definition of S from the above, and use the good divergence structures i.e. Lemma 4.2 (with \(\alpha =p-2+s \; \text {and}\; p-4+s\) ) and Lemma 4.3 (with \(\beta =p-2+s-\gamma \;\text {and}\; p-4+s-\gamma \)). For a smooth solution and indeed for any smooth function, we have

$$\begin{aligned} S&= w_1 (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \left\{ |D^2u^{\varepsilon }|^2-(\Delta u^{\varepsilon })^2 +(p-2+s)\left( \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon }\right. \right. \\&\quad \qquad \left. \left. - \Delta u^{\varepsilon }\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\right) \right\} \\&\qquad +w_2u^{\varepsilon }_t(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s-\gamma }{2}} \left\{ \Delta u^{\varepsilon }+(p-2+s-\gamma )\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\right\} \\&\qquad +\varepsilon w_3(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-4+s}{2}} \left\{ |D^2u^{\varepsilon }|^2-(\Delta u^{\varepsilon })^2 +(p-4+s)\left( \frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon }\right. \right. \\&\quad \qquad \left. \left. - \Delta u^{\varepsilon }\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\right) \right\} \\&\qquad +\varepsilon w_4u^{\varepsilon }_t(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-4+s-\gamma }{2}} \Big \{\Delta u^{\varepsilon }+(p-4+s-\gamma )\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\Big \}. \end{aligned}$$

Then by simplifying, we get

$$\begin{aligned} S=&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ (w_1+w_3\kappa )\big (|D^2u^{\varepsilon }|^2-(\Delta u^{\varepsilon })^2\big )\\&+\big (w_1(p-2+s)+w_3(p-4+s)\kappa \big )\theta (|D|Du^{\varepsilon }||^2-\Delta u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }) \\&+(w_2+w_4\kappa )(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta u^{\varepsilon }\\&+\big (w_2(p-2+s-\gamma )+w_4(p-4+s-\gamma )\kappa \big )\theta (|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta _{\infty }^Nu^{\varepsilon }\Big \}\\ =&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1\big (|D^2u^{\varepsilon }|^2-(\Delta u^{\varepsilon })^2\big ) +c_2(|D|Du^{\varepsilon }||^2-\Delta u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }) \\&\quad +c_3(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta u^{\varepsilon }+c_4(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta _{\infty }^Nu^{\varepsilon }\Big \} \end{aligned}$$

almost everywhere in \(\Omega _T\), where

$$\begin{aligned} {\left\{ \begin{array}{ll} c_1=w_1+w_3\kappa ,\qquad &{}c_2=\big (w_1(p-2+s)+w_3(p-4+s)\kappa \big )\theta ,\\ c_3=w_2+w_4\kappa , \qquad &{}c_4=\big (w_2(p-2+s-\gamma )+w_4(p-4+s-\gamma )\kappa \big )\theta . \end{array}\right. } \end{aligned}$$
(4.5)

Observe that given p, \(\gamma \) and s, if \(\varepsilon =0\), then \(c_1,\ldots ,c_4\) reduce to constants that only depend on \(w_1\) and \(w_2\). This shows that in the smooth case we can get the desired estimate by adjusting \(w_1\) and \(w_2\), as explained in (3.10).

By employing expressions (4.1) and (4.2), we can write

$$\begin{aligned} \begin{aligned} S&= (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2 -c_1(\Delta _Tu^{\varepsilon })^2 -c_1(\Delta _{\infty }^Nu^{\varepsilon })^2 \\&\quad -(2c_1+c_2)\Delta _Tu^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }+c_3(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta _Tu^{\varepsilon }\\&\quad +(c_3+c_4)(|Du^{\varepsilon }|^2+\varepsilon )^{-\gamma /2}u^{\varepsilon }_t\Delta _{\infty }^Nu^{\varepsilon }\Big \} \end{aligned} \end{aligned}$$
(4.6)

almost everywhere in \(\Omega _T\). Next we use the regularized Eq. (3.1) to replace the time derivative \(u^{\varepsilon }_t\) in (4.6) with spatial derivatives. Thus we arrive at the key estimate for a smooth solution to the regularized equation (which is actually an equality at this point)

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2 +(c_3-c_1)(\Delta _Tu^{\varepsilon })^2 \\&\quad +\big ((c_3+c_4)P_\theta -c_1\big )(\Delta _{\infty }^Nu^{\varepsilon })^2 \\&\quad +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )\Delta _Tu^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\Big \}=S, \end{aligned} \end{aligned}$$
(4.7)

where

$$\begin{aligned} P_\theta :=(p-2)\theta +1 \in (0,\infty ) \end{aligned}$$

for the sake of brevity. We rewrite this as

$$\begin{aligned} \begin{aligned} (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2 + R\Big \}=S \end{aligned} \end{aligned}$$
(4.8)

where

$$\begin{aligned} R&:= (c_3-c_1)(\Delta _Tu^{\varepsilon })^2 +\big ((c_3+c_4)P_\theta -c_1\big )(\Delta _{\infty }^Nu^{\varepsilon })^2 \\&\quad +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )\Delta _Tu^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\end{aligned}$$

is a quadratic form in variables \(\Delta _T u\) and \(\Delta _{\infty }^Nu\). We rewrite R as

$$\begin{aligned} R=\langle {\bar{x}},N{\bar{x}}\rangle , \end{aligned}$$

where \({\bar{x}}=(\Delta _T u^{\varepsilon },\Delta _{\infty }^Nu^{\varepsilon })^T\in {\mathbb {R}}^2\) and \(N\in {\mathbb {R}}^{2\times 2}\) is a symmetric matrix whose entries \(N_{ij}\), \(i,j=1,2\), are given by

$$\begin{aligned} {\left\{ \begin{array}{ll} N_{11}&{}=c_3-c_1\\ N_{12}=N_{21}&{}= \tfrac{1}{2}\big (c_3P_\theta +(c_3+c_4) -(2c_1+c_2)\big )\\ N_{22}&{}=(c_3+c_4)P_\theta -c_1. \end{array}\right. } \end{aligned}$$

Note that

$$\begin{aligned} \Vert N\Vert _{L^\infty (\Omega _T)} :=\sup \{|N(x,t)|:(x,t)\in \Omega _T\} \end{aligned}$$

where

$$\begin{aligned} |N(x,t)|=\sqrt{\big (N_{11}(x,t)\big )^2+\big (N_{12}(x,t)\big )^2+\big (N_{21}(x,t)\big )^2+\big (N_{22}(x,t)\big )^2}, \end{aligned}$$

has an upper bound that only depends on p, \(\gamma \) and s by fixing \(w_1,w_2,w_3\) and \(w_4\).
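The bookkeeping leading from (4.5) and (4.6) to the quadratic form R can be double-checked numerically. The sketch below is only an illustration: the function names are hypothetical, the entries of N are read off directly from the coefficients in the definition of R, and the sample weights and parameters are arbitrary. The variables `a` and `b` stand for the values of \(\Delta _Tu^{\varepsilon }\) and \(\Delta _{\infty }^Nu^{\varepsilon }\) at a point.

```python
def coefficients(w, p, gamma, s, theta):
    """c_1, ..., c_4 from (4.5) for weights w = (w1, w2, w3, w4) and theta in [0, 1]."""
    w1, w2, w3, w4 = w
    kappa = 1.0 - theta
    c1 = w1 + w3 * kappa
    c2 = (w1 * (p - 2 + s) + w3 * (p - 4 + s) * kappa) * theta
    c3 = w2 + w4 * kappa
    c4 = (w2 * (p - 2 + s - gamma) + w4 * (p - 4 + s - gamma) * kappa) * theta
    return c1, c2, c3, c4

def remainder_matrix(w, p, gamma, s, theta):
    """Symmetric matrix N with R = <x, N x>, read off from the coefficients of R."""
    c1, c2, c3, c4 = coefficients(w, p, gamma, s, theta)
    P = (p - 2) * theta + 1                                   # P_theta
    n11 = c3 - c1
    n12 = 0.5 * (c3 * P + (c3 + c4) - (2 * c1 + c2))
    n22 = (c3 + c4) * P - c1
    return ((n11, n12), (n12, n22))

def remainder_from_4_6(w, p, gamma, s, theta, a, b):
    """Terms of (4.6) other than the Hessian and tangential terms, after inserting
    (|Du|^2 + eps)^(-gamma/2) u_t = a + P_theta * b from the regularized equation."""
    c1, c2, c3, c4 = coefficients(w, p, gamma, s, theta)
    P = (p - 2) * theta + 1
    ut = a + P * b
    return (-c1 * a * a - c1 * b * b - (2 * c1 + c2) * a * b
            + c3 * ut * a + (c3 + c4) * ut * b)

def quadratic_form(N, a, b):
    return N[0][0] * a * a + 2 * N[0][1] * a * b + N[1][1] * b * b

# Arbitrary illustrative data
w, p, gamma, s, theta = (1.0, 2.0, 0.3, -0.4), 3.0, 0.5, 0.2, 0.6
a, b = 1.3, -0.7
direct = remainder_from_4_6(w, p, gamma, s, theta, a, b)
via_matrix = quadratic_form(remainder_matrix(w, p, gamma, s, theta), a, b)
```

The two numbers `direct` and `via_matrix` should coincide up to rounding. Setting `theta = 1.0` in `coefficients` reproduces the constants of the case \(\varepsilon =0\), which depend only on \(w_1\) and \(w_2\).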

4.3 Auxiliary lemmas

In this subsection we state two technical lemmas that can be used to conclude our main integral estimate.

We want to apply the fundamental inequality, Lemma 4.1, to estimate \(|D^2u^{\varepsilon }|^2\) in (4.8) from below, and to improve the range condition by using the terms we obtain in this application. However, a direct application eliminates the full Hessian \(|D^2u^{\varepsilon }|^2\) that we want to estimate. We could leave a small fraction of \(|D^2u^{\varepsilon }|^2\) (as in the method first described, for simplicity, at the end of Sect. 3.2) and apply the fundamental inequality only to the remaining part, but actually this will not be necessary: the next lemma shows that already a seemingly weaker lower bound is sufficient. This will simplify the exposition.

Lemma 4.4

Let \(u^{\varepsilon }:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to (3.1), S as in (4.4), \(c_1\) as in (4.5), and \({\varepsilon }\ge 0\). If \(\varepsilon =0\), we assume in addition that \(Du^{\varepsilon }\ne 0\). Suppose that we can select \(w_1,w_2,w_3,w_4\in {\mathbb {R}}\) such that \(c_1=c_1(n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\), \(c=c(n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\) and

$$\begin{aligned} (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}\Big \{c|D_T|Du^{\varepsilon }||^2+Q\Big \}\le S \quad {\text {a.e.}}\,{\text {in}}\,{\text {space}}\,{\text {in }}\Omega _T, \end{aligned}$$
(4.9)

where

$$\begin{aligned} Q=\langle {\bar{x}},M{\bar{x}}\rangle \end{aligned}$$

with \({\bar{x}}=(\Delta _Tu^{\varepsilon },\Delta _{\infty }^Nu^{\varepsilon })^T\in {\mathbb {R}}^2\) and a uniformly bounded positive definite (with a uniform constant) symmetric matrix \(M=M(n,p,\gamma ,s,w_1,w_2,w_3,w_4)\in {\mathbb {R}}^{2\times 2}\). Then there is \(\lambda =\lambda (n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\) such that

$$\begin{aligned} \lambda (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}|D^2u^{\varepsilon }|^2\le S \quad {\text {a.e.}}\,{\text {in}}\,{\text {space}}\,{\text {in }}\Omega _T. \end{aligned}$$

Proof

Recall that

$$\begin{aligned} S=w_1GD_1(p-2+s)+w_2GD_2(p-2+s)\\ +\varepsilon w_3GD_1(p-4+s)+\varepsilon w_4GD_2(p-4+s) \end{aligned}$$

which, as pointed out in (4.8), can be written as

$$\begin{aligned} \begin{aligned} S&= (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2+\langle {\bar{x}},N{\bar{x}}\rangle \Big \} \end{aligned} \end{aligned}$$

almost everywhere in space in \(\Omega _T\), where \({\bar{x}}\) and N are as in (4.8). Observe that we utilized Eq. (3.1) at this step to get rid of the time derivatives.

For any \(\lambda \in (0,1)\), we write

$$\begin{aligned} S=\lambda S+(1-\lambda )S \end{aligned}$$

and use the assumption (4.9) to estimate \((1-\lambda )S\) from below. We end up with

$$\begin{aligned} S&\ge (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{ \lambda c_1|D^2u^{\varepsilon }|^2 +\big (c+\lambda (c_2-c)\big )|D_T|Du^{\varepsilon }||^2 \\&\quad +\langle {\bar{x}},\big (M+\lambda (N-M)\big ){\bar{x}}\rangle \Big \}. \end{aligned}$$

We claim that we can select \(\lambda >0\) such that \(c+\lambda (c_2-c)\ge 0\) and \(M+\lambda (N-M)\) is a positive definite matrix. Indeed, since \(c>0\), then

$$\begin{aligned} c+\lambda (c_2-c) \ge c-\lambda \Vert c_2-c\Vert _{L^\infty (\Omega _T)}>0, \end{aligned}$$

uniformly if \(\lambda =\lambda (n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\) is small enough. Next we recall that the boundedness and positive definiteness of M imply

$$\begin{aligned} \Vert M\Vert _{L^\infty (\Omega _T)}\le C, \quad \text {and}\quad M_{11}\ge c \quad \text {and}\quad \det (M)\ge c \quad \text {in }\Omega _T \end{aligned}$$

by Sylvester’s criterion and choosing small enough \(c>0\). For the positive definiteness of the matrix \(M+\lambda (N-M)\) we can use Sylvester’s criterion again and check that the leading principal minors are positive if \(\lambda >0\) is small enough. The first principal minor is the upper-left corner entry, i.e.

$$\begin{aligned} \big (M+\lambda (N-M)\big )_{11} =M_{11}+\lambda (N_{11}-M_{11}) \ge c-\lambda (\Vert N\Vert _{L^{\infty }(\Omega _T)}+\Vert M\Vert _{L^\infty (\Omega _T)}), \end{aligned}$$

and the second principal minor is the determinant, i.e.

$$\begin{aligned} {\text {det}}\big (M+\lambda (N-M)\big )&= {\text {det}}(M) +\lambda \big (M_{11}N_{22}+M_{22}N_{11}-2M_{12}N_{12}-2\det (M)\big ) \\&\quad +\lambda ^2{\text {det}}(N-M) \\&\ge c-2\lambda \left( 2\Vert M\Vert _{L^\infty (\Omega _T)}\Vert N\Vert _{L^\infty (\Omega _T)}+\Vert M\Vert _{L^\infty (\Omega _T)}^2\right) \\&\quad -\lambda ^2\Vert N-M\Vert _{L^\infty (\Omega _T)}^2 \ge c-4\lambda \left( \Vert M\Vert _{L^\infty (\Omega _T)}+\Vert N\Vert _{L^\infty (\Omega _T)}\right) ^2. \end{aligned}$$

Hence we choose \(\lambda \) such that

$$\begin{aligned} 0<\lambda < \min \Big \{1, \frac{c}{\Vert c_2-c\Vert _{L^\infty (\Omega _T)}},&\frac{c}{\Vert N\Vert _{L^{\infty }(\Omega _T)}+\Vert M\Vert _{L^\infty (\Omega _T)}},\\&\frac{c}{4\left( \Vert M\Vert _{L^\infty (\Omega _T)}+\Vert N\Vert _{L^\infty (\Omega _T)}\right) ^2} \Big \}. \end{aligned}$$

Since we have now proven the nonnegativity of the excess terms, the result follows. \(\square \)
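The choice of \(\lambda \) in the proof can be illustrated with constant data. The sketch below is a simplification and rests on assumptions: M, N, c and \(c_2\) are taken to be constants, so the \(L^\infty \) norms reduce to the Frobenius norm of a matrix and to an absolute value, and \(\lambda \) is taken as half of the minimum appearing in the proof. The function names and the sample matrices are hypothetical.

```python
import math

def frob(A):
    """Frobenius norm of a 2x2 matrix, matching |N(x,t)| in Sect. 4.2."""
    return math.sqrt(sum(A[i][j] ** 2 for i in range(2) for j in range(2)))

def choose_lambda(M, N, c, c2):
    """Half of the minimum from the proof of Lemma 4.4, for constant M, N, c, c2."""
    total = frob(M) + frob(N)
    candidates = [1.0, c / total, c / (4 * total ** 2)]
    if c2 != c:
        candidates.append(c / abs(c2 - c))
    return 0.5 * min(candidates)

def blend(M, N, lam):
    """The matrix M + lam (N - M)."""
    return tuple(tuple(M[i][j] + lam * (N[i][j] - M[i][j]) for j in range(2))
                 for i in range(2))

def is_positive_definite(A):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return A[0][0] > 0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] > 0

# Illustrative data with M_11 >= c and det(M) >= c, as required in the proof
M = ((1.0, 0.2), (0.2, 0.5))
N = ((-3.0, 1.0), (1.0, -2.0))
c, c2 = 0.4, -1.5
lam = choose_lambda(M, N, c, c2)
blended = blend(M, N, lam)
tangential_coefficient = c + lam * (c2 - c)
```

For these data the blended matrix stays positive definite and the coefficient of \(|D_T|Du^{\varepsilon }||^2\) stays positive, even though N itself is negative definite and \(c_2<0\).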

The following lemma shows that we can derive the desired integral estimate from the pointwise lower bound. The proof uses rather standard techniques and is based on localization with a suitable cutoff function and then integration by parts. For the convenience of the reader, we give the details in the appendix.

Lemma 4.5

Let \(u^{\varepsilon }:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to (3.1), and S as in (4.4). If \(\varepsilon =0\), we assume in addition that \(Du^{\varepsilon }\ne 0\). Suppose that we can find weights \(w_1,w_2,w_3,w_4\in {\mathbb {R}}\) such that

$$\begin{aligned} \lambda (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}|D^2u^{\varepsilon }|^2\le S \quad {\text {a.e.}}\,{\text {in}}\,{\text {space}}\,{\text {in }}\Omega _T, \end{aligned}$$
(4.10)

for some constant \(\lambda =\lambda (n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\). If \(s\ne \gamma -p\), then for any concentric parabolic cylinders \(Q_r\subset Q_{2r}\Subset \Omega _T\) with center point \((x_0,t_0)\in \Omega _T\), we have the estimate

$$\begin{aligned} \begin{aligned}&\int _{Q_r}\left| D\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{4}}Du^{\varepsilon }\big )\right| ^2dxdt\\&\quad \le \frac{C}{r^2}\left( \int _{Q_{2r}}(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}|Du^{\varepsilon }|^2dxdt +\int _{Q_{2r}}(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p+s-\gamma }{2}}dxdt\right) \\&\qquad +\varepsilon \left( \frac{C}{r^2}\int _{Q_{2r}}\big |\ln (|Du^{\varepsilon }|^2+\varepsilon )\big |dxdt +C\int _{B_{2r}}\big |\ln (|Du^{\varepsilon }(x,t_0)|^2+\varepsilon )\big |dx\right) \end{aligned} \end{aligned}$$
(4.11)

where \(C=C(n,p,\gamma ,s,\lambda ,w_1,w_2,w_3,w_4)>0\).

The last two integrals on the right hand side of (4.11) do not appear if \(s\ne \gamma -p+2\). The source of such error terms in the case \(s=\gamma -p+2\) is the logarithm in Lemma 4.3 when \(\beta =-2\).

5 Smooth case with non-zero gradient

Let \(1<p<\infty \) and \(-1<\gamma <\infty \). In this section we assume that \(u:\Omega _T\rightarrow {\mathbb {R}}\) is a smooth solution to

$$\begin{aligned} u_t-|Du|^{\gamma }\big (\Delta u+(p-2)\Delta _{\infty }^Nu\big )=0, \end{aligned}$$
(5.1)

such that \(Du\ne 0\). That is, u does not have critical points in space. Our main result in this case is the following a priori estimate. Usually extending a regularity result to a general nonsmooth case is quite straightforward.

Proposition 5.1

Let \(n\ge 2\), \(1<p<\infty \) and \(-1<\gamma <\infty \). Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to (5.1) such that \(Du\ne 0\). If

$$\begin{aligned} s>\max \Big \{-1-\frac{p-1}{n-1},\gamma +1-p\Big \}, \end{aligned}$$
(5.2)

then for any concentric parabolic cylinders \(Q_r\subset Q_{2r}\Subset \Omega _T\), we have the estimate

$$\begin{aligned} \int _{Q_r}\left| D\left( \left| Du\right| ^{\frac{p-2+s }{2}} Du\right) \right| ^2dxdt \le \frac{C}{r^2}\left( \int _{Q_{2r}}|Du|^{p+s}dxdt + \int _{Q_{2r}}|Du|^{p+s-\gamma }dxdt \right) , \end{aligned}$$

where \(C=C(n,p,\gamma ,s)>0\).

The following Lemma, Lemma 5.2, is the main ingredient in the proof of Proposition 5.1. Thus we postpone the proof of Proposition 5.1 until after the proof of Lemma 5.2.

In the following lemma we consider the weighted sum

$$\begin{aligned} S=w_1GD_1(p-2+s)+w_2GD_2(p-2+s) \end{aligned}$$

where \(w_1,w_2\in {\mathbb {R}}\), and the notation was defined in (4.4). Note that since \(\varepsilon =0\) in this section, the terms with weights \(w_3\) and \(w_4\) in (4.4) disappear. The purpose of Lemma 5.2 is to show that under restriction (5.2), we can find positive weights \(w_1=w_1(n,p,\gamma ,s)>0\) and \(w_2=w_2(n,p,\gamma ,s)>0\) such that S has a suitably nonnegative lower bound to make Lemma 4.4 applicable. Moreover, by the proof of Lemma 5.2 and Sylvester’s criterion, we can choose the value \(c=c(n,p,\gamma ,s)>0\) small enough such that the matrix M in the proof satisfies

$$\begin{aligned} M_{11}\ge c \quad \text {and}\quad \det (M)\ge c. \end{aligned}$$

The proof of Proposition 5.1 is then finished by using Lemma 4.5.

Lemma 5.2

Let \(n\ge 2\), \(1<p<\infty \) and \(-1<\gamma <\infty \). Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to (5.1) such that \(Du\ne 0\). If (5.2) holds, then we can select \(w_1=w_1(n,p,\gamma ,s)>0\) and \(w_2=w_2(n,p,\gamma ,s)>0\), such that

$$\begin{aligned} |Du|^{p-2+s}\Big \{c|D_T|Du||^2+Q\Big \}\le S \end{aligned}$$

where \(c=c(n,p,\gamma ,s)>0\) and

$$\begin{aligned} Q=\langle {\bar{x}},M{\bar{x}}\rangle \end{aligned}$$

with \({\bar{x}}=(\Delta _Tu,\Delta _{\infty }^Nu)^T\in {\mathbb {R}}^2\) and a uniformly bounded positive definite (with a uniform constant) symmetric matrix \(M=M(n,p,\gamma ,s)\in {\mathbb {R}}^{2\times 2}\).

Proof

Similarly to (4.7), recalling that \({\varepsilon }=0\), by expressions (4.5), we arrive at

$$\begin{aligned} \begin{aligned} |Du|^{p-2+s}&\Big \{ w_1|D^2u|^2+w_1(p-2+s)|D_T|Du||^2+(w_2-w_1)(\Delta _T u)^2 \\&\quad +\big (w_2(p-1)(p-1+s-\gamma )-w_1\big )(\Delta _{\infty }^Nu)^2 \\&\quad +\big (w_2(2p-2+s-\gamma )-w_1(p+s)\big )\Delta _Tu\Delta _{\infty }^Nu \Big \}=S. \end{aligned} \end{aligned}$$
(5.3)

We estimate \(|D^2u|^2\) on the left hand side of (5.3) from below by the fundamental inequality, Lemma 4.1. This yields the following lower bound for S

$$\begin{aligned} \begin{aligned}&|Du|^{p-2+s} \Big \{ w_1(p+s)|D_T|Du||^2 +Q \Big \}\le S, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} Q:&= \Big (w_2- \frac{n-2}{n-1} w_1\Big )(\Delta _T u)^2 +w_2(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2 \\&\quad +\big (w_2(2p-2+s-\gamma )-w_1(p+s)\big )\Delta _Tu\Delta _{\infty }^Nu. \end{aligned}$$
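For the reader's convenience, the substitution step can be sketched as follows. The sketch assumes that Lemma 4.1 is applied in the normalized form \(|D^2u|^2\ge 2|D_T|Du||^2+\frac{1}{n-1}(\Delta _Tu)^2+(\Delta _{\infty }^Nu)^2\), which is an assumption on our part about how the fundamental inequality reads in the notation of Sect. 2, and it uses \(w_1>0\) for the direction of the inequality:

$$\begin{aligned} w_1|D^2u|^2+w_1(p-2+s)|D_T|Du||^2&\ge w_1(p+s)|D_T|Du||^2+\frac{w_1}{n-1}(\Delta _Tu)^2+w_1(\Delta _{\infty }^Nu)^2,\\ (w_2-w_1)+\frac{w_1}{n-1}&=w_2-\frac{n-2}{n-1}w_1,\\ \big (w_2(p-1)(p-1+s-\gamma )-w_1\big )+w_1&=w_2(p-1)(p-1+s-\gamma ). \end{aligned}$$

The last two lines are the coefficients of \((\Delta _Tu)^2\) and \((\Delta _{\infty }^Nu)^2\) in Q, while the coefficient of the mixed term in (5.3) is not affected by the inequality.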

We write Q more compactly as

$$\begin{aligned} Q=\langle {\bar{x}},M{\bar{x}}\rangle , \end{aligned}$$

where \({\bar{x}}=(\Delta _T u,\Delta _{\infty }^Nu)^T\in {\mathbb {R}}^2\) is a vector and

$$\begin{aligned} M:= \begin{bmatrix} w_2- \displaystyle {\frac{n-2}{n-1}} w_1 &{} \displaystyle {\frac{1}{2}}\big (w_2(2p-2+s-\gamma )-w_1(p+s)\big ) \\ \displaystyle {\frac{1}{2}}\big (w_2(2p-2+s-\gamma )-w_1(p+s)\big ) &{} w_2(p-1)(p-1+s-\gamma ) \end{bmatrix} \end{aligned}$$

is a symmetric \(2\times 2\)-matrix. We claim that under assumption (5.2) we can choose \(w_1,w_2\in {\mathbb {R}}\) such that M is uniformly bounded positive definite (with a uniform constant).

If \(n=2\), this is easy to see by selecting

$$\begin{aligned} w_1=2p-2+s-\gamma \quad \text {and}\quad w_2=p+s, \end{aligned}$$

because then

$$\begin{aligned} M= \begin{bmatrix} p+s &{} 0 \\ 0 &{} (p+s)(p-1)(p-1+s-\gamma ) \end{bmatrix} \end{aligned}$$

and hence

$$\begin{aligned} Q=(p+s)\left( (\Delta _T u)^2+(p-1)(p-1+s-\gamma )(\Delta _{\infty }^Nu)^2\right) . \end{aligned}$$

In other words, with such choice of \(w_1\) and \(w_2\), the mixed term \(\Delta _T u\Delta _{\infty }^Nu\) vanishes. Notice that (5.2) implies that \(w_1>0\) and \(w_2>0\).
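The positivity claims can be checked by a short computation. We assume here that (5.2) is the range \(s>\max \{-1-\frac{p-1}{n-1},\gamma +1-p\}\), which for \(n=2\) and \(\gamma >-1\) reduces to \(s>\gamma +1-p\), that is, \(p-1+s-\gamma >0\). Then

$$\begin{aligned} w_1&=2p-2+s-\gamma =(p-1)+(p-1+s-\gamma )>p-1>0,\\ w_2&=p+s=(\gamma +1)+(p-1+s-\gamma )>\gamma +1>0,\\ M_{22}&=(p+s)(p-1)(p-1+s-\gamma )>0. \end{aligned}$$

Since M is diagonal with constant positive entries depending only on p, \(\gamma \) and s, it is in particular uniformly bounded and uniformly positive definite.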

For the higher dimensional case \(n\ge 3\), we set \(w_1=1\) and find \(w_2=w_2(n,p,\gamma ,s)>0\) such that M is uniformly bounded positive definite (with a uniform constant). This is possible precisely when (5.2) holds. Since the proof is quite tedious, we postpone it to Lemma B.1 in the appendix. \(\square \)

We are ready to give the proof of Proposition 5.1.

Proof of Proposition 5.1

Let us fix \(w_1=w_1(n,p,\gamma ,s)>0\) and \(w_2=w_2(n,p,\gamma ,s)>0\) according to Lemma 5.2. Lemma 4.4 is then applicable because \(w_1>0, w_3=0\) implies that \(c_1=w_1+w_3 \kappa >0\) and the conclusion of Lemma 5.2 implies that (4.9) holds. Therefore, by Lemma 4.4 there exists \(\lambda =\lambda (n,p,\gamma ,s,w_1,w_2,w_3,w_4)>0\) such that

$$\begin{aligned} \lambda |Du|^{p-2+s}|D^2u|^2 \le S \end{aligned}$$

in \(\Omega _T\). Now the desired estimate follows from Lemma 4.5. \(\square \)

Range (5.2) in Proposition 5.1 is optimal in the following sense. In the elliptic case, see [10] and [25], the best known range is \(s>-1-\frac{p-1}{n-1}\). On the other hand, Example 5.1 below shows that in the parabolic case we cannot hope to reach any better range than \(s>\gamma +1-p\). A counterexample of this type was used in [10, Sect. 1.3] for the standard p-parabolic equation.

Example 5.1

(Counterexample) Let \(u:{\mathbb {R}}^n\times (0,\infty )\rightarrow {\mathbb {R}}\) be given by

$$\begin{aligned} u(x,t):=Ct+|x_1|^{\alpha } \end{aligned}$$

for some \(C\in {\mathbb {R}}\) and \(\alpha >0\). Note that

$$\begin{aligned} |Du|^{\gamma }\Delta _{p}^Nu=\alpha ^{\gamma +1}(\alpha -1)(p-1)|x_1|^{(\alpha -1)(\gamma +1)-1}. \end{aligned}$$

Hence, if

$$\begin{aligned} \alpha =1+\frac{1}{\gamma +1} \quad \text {and}\quad C=\alpha ^{\gamma +1}(\alpha -1)(p-1) \end{aligned}$$

then u solves (2.1) in the classical sense whenever \(x_1\ne 0\). Indeed, by a direct computation, we have

$$\begin{aligned} u_{x_1}=\alpha |x_1|^{\alpha -2}x_1, \quad u_{x_i}=0 \quad \textrm{for}\quad i=2,\ldots ,n, \end{aligned}$$

and

$$\begin{aligned} u_{x_1x_1}=\alpha (\alpha -1)|x_1|^{\alpha -2}, \quad u_{x_ix_j}=0, \end{aligned}$$

where \(i,j=1,\ldots ,n\) and i and j are not both 1.
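As a sanity check of the choice of \(\alpha \) and C, the computation can be written out as follows for \(x_1\ne 0\). We assume here that \(\Delta _p^Nu\) stands for \(\Delta u+(p-2)\Delta _{\infty }^Nu\), as in (1.1); for this u it equals \((p-1)u_{x_1x_1}\):

$$\begin{aligned} |Du|^{\gamma }&=\alpha ^{\gamma }|x_1|^{(\alpha -1)\gamma },\qquad \Delta _p^Nu=(p-1)\alpha (\alpha -1)|x_1|^{\alpha -2},\\ (\alpha -1)(\gamma +1)-1&=\frac{1}{\gamma +1}(\gamma +1)-1=0,\\ |Du|^{\gamma }\Delta _p^Nu&=\alpha ^{\gamma +1}(\alpha -1)(p-1)=C=u_t. \end{aligned}$$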

Next we verify that the function u is a viscosity solution in the whole \({\mathbb {R}}^n\times (0,\infty )\) according to Definition 2.3, also at those points where \(x_1=0\). Whenever \(x_1\ne 0\), \(x_0=(x_1,\ldots ,x_n)\), and the test function \(\varphi \) touches u at \((x_0,t_0)\) from below (the argument is analogous from above), we may use the facts that

$$\begin{aligned} D\varphi (x_0,t_0)=Du(x_0,t_0)\ne 0,\quad \varphi _t(x_0,t_0)=u_t(x_0,t_0)=C \end{aligned}$$

and

$$\begin{aligned} D^2\varphi (x_0,t_0)\le D^2u(x_0,t_0). \end{aligned}$$

Let us consider the points where \(x_1=0\). We study the degenerate case \(\gamma >0\) and the singular case \(-1<\gamma \le 0\) separately. If \(x_1=0\) and \(\gamma >0\), then there are no test functions touching u from above and for a test function \(\varphi \) touching from below, we have \(D\varphi (x_0,t_0)=Du(x_0,t_0)=0\) and

$$\begin{aligned} \varphi _t(x_0,t_0)=u_t(x_0,t_0)=C. \end{aligned}$$

Since \(C>0\), the function u is a viscosity supersolution.

The given function is also a viscosity solution whenever \(-1<\gamma \le 0\): the proof for the supersolution property is the same as in the degenerate case above. It is also a subsolution because (similarly to the degenerate case) there are no admissible test functions touching u from above. We provide a detailed proof of this fact. Striving for a contradiction, suppose that there is an admissible test function \(\varphi \) touching u at \((x_0,t_0)\) with \(x_0=(0,\ldots ,0)\) (for simplicity) from above. Then necessarily

$$\begin{aligned} \varphi _t(x_0,t_0)=C>0. \end{aligned}$$

By the definition of a viscosity solution it holds that

$$\begin{aligned} \phi (x,t)=u(x_0,t_0)+\varphi _t(x_0,t_0)(t-t_0)+f(\left| x\right| )+\sigma (t-t_0) \end{aligned}$$

is an admissible test function touching strictly from above. By strict touching and regularity of u, by translating with respect to \(x_1\) and lifting we may assume that \(\phi \) touches u at a point \((x,t_0)\), \(x=({\varepsilon },0,\ldots ,0)\), with small \({\varepsilon }>0\). Also observe that, by an approximation argument, we could assume that \(\sigma \) is a \(C^2\) function, but we omit this step. Also recall the notation \(g(y)=f(\left| y\right| )\) and that

$$\begin{aligned} \lim _{y\rightarrow 0,y\ne 0}F\big (Dg(y),D^2g(y)\big )=0. \end{aligned}$$

Then by this and the counter assumption it holds at a point \((x,t_0)\) for x close enough to \(x_0\) that

$$\begin{aligned} \phi _t(x,t_0)-F\big (D\phi (x,t_0),D^2\phi (x,t_0)\big )=\varphi _t(x_0,t_0)-F\big (Dg(x),D^2g(x)\big )>0. \end{aligned}$$
(5.4)

On the other hand, since u is a \(C^2\)-function given by an explicit formula near \((x,t_0)\), we have

$$\begin{aligned} \begin{aligned} \phi _t(x,t_0)-F\big (D\phi (x,t_0),D^2\phi (x,t_0)\big )&=\varphi _t(x_0,t_0)-F\big (Dg(x),D^2g(x)\big ) \\&=u_t(x,t_0)-F\big (Dg(x),D^2g(x)\big ) \\&\le u_t(x,t_0)-F\big (Du(x,t_0),D^2u(x,t_0)\big )=0, \end{aligned} \end{aligned}$$

which contradicts inequality (5.4).

In the above inequality we used the fact that since \(\phi \) touches u from above at \((x,t_0)\) we have \(D^2g(x)\ge D^2 u(x,t_0)\) and \(Dg(x)=Du(x,t_0)\ne 0\) and thus

$$\begin{aligned} F\big (Dg(x),D^2g(x)\big )\ge F\big (Du(x,t_0),D^2u(x,t_0)\big ). \end{aligned}$$

We study the local \(W^{1,2}\)-regularity of \(|Du|^{\frac{p-2+s}{2}}Du\) for \(s\in {\mathbb {R}}\) and see what kind of restrictions for s arise. We have

$$\begin{aligned} \left| D(|Du|^{\frac{p-2+s}{2}}Du)\right|&=\frac{1}{2}\alpha ^{\frac{p+s}{2}}(\alpha -1)(p+s)|x_1|^{\frac{(\alpha -1)(p+s)}{2}-1}\\&=C(p,s,\gamma )|x_1|^{\frac{ p+s }{2(\gamma +1)}-1}. \end{aligned}$$

The function \(D(|Du|^{\frac{p-2+s}{2}}Du)\) locally belongs to \(L^2({\mathbb {R}}^n\times (0,\infty ))\) if and only if

$$\begin{aligned} 2\Big (\frac{ p+s }{2(\gamma +1)}-1\Big )>-1, \end{aligned}$$

that is,

$$\begin{aligned} s>\gamma +1-p. \end{aligned}$$

Observe that range condition (5.2) gives this in the plane, but in higher dimensions we have an additional restriction, which is the same restriction as in the elliptic case.

In particular, when \(s=2-p\), which corresponds to \(W^{2,2}\)-regularity, the range

$$\begin{aligned} -1<\gamma <1 \end{aligned}$$

is sharp in the plane.

Remark 5.1

Proposition 5.1 also holds in the case \(n=1\). Recall that the key point is identity (3.4), that is,

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \left\{ |D^2u^{\varepsilon }|^2+(p-2+s)\frac{|D^2u^{\varepsilon }Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon } +s(p-2)\frac{(\Delta _{\infty }u^{\varepsilon })^2}{(|Du^{\varepsilon }|^2+\varepsilon )^2} \right. \\&\qquad \left. +(p-2-\gamma )(|Du^{\varepsilon }|^2+\varepsilon )^{- {\gamma }/{2}}u^{\varepsilon }_t\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon } \right\} \\&\quad = {{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}AD^2u^{\varepsilon }Du^{\varepsilon }\big ) -\frac{\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p+s-\gamma }{2}}\big )_t}{p+s-\gamma }, \end{aligned} \end{aligned}$$

provided that \(s\ne \gamma -p\). If \(n=1\) this reduces to

$$\begin{aligned} \begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{1+(p-2+s)\theta +s(p-2)\theta ^2 \\&\quad +(p-2-\gamma )\big ((p-2)\theta +1\big )\theta \Big \}|D^2u^{\varepsilon }|^2 \\&= {{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}(D^2u^{\varepsilon }Du^{\varepsilon }-\Delta u^{\varepsilon }Du^{\varepsilon }) \big ) \\&\quad +u^{\varepsilon }_t{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s-\gamma }{2}}Du^{\varepsilon }\big ). \end{aligned} \end{aligned}$$
(5.5)

The left hand side of (5.5) is

$$\begin{aligned}&(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}} \Big \{(p-1)(p-1+s-\gamma )\theta ^2 +(2p-2 +s-\gamma )\theta \kappa +\kappa ^2\Big \}|D^2u^{\varepsilon }|^2 \\&\quad \ge \lambda (|Du^{\varepsilon }|^2+\varepsilon )^{\frac{p-2+s}{2}}|D^2u^{\varepsilon }|^2, \end{aligned}$$

for some constant \(\lambda =\lambda (p,\gamma ,s)>0\), provided that \(s>\gamma +1-p\). From this it is easy to derive the desired integral estimate. We conclude that Proposition 5.1 holds in case \(n=1\) without the additional smoothness assumptions for u, and with the interpretation

$$\begin{aligned} s>\max \Big \{-1-\frac{p-1}{n-1},\gamma +1-p\Big \}=\max \{-\infty ,\gamma +1-p\}=\gamma +1-p. \end{aligned}$$
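The two claims about the left hand side of (5.5) can be verified by a direct expansion. The following is a sketch using only \(\kappa =1-\theta \) and \(\theta ,\kappa \in [0,1]\); the explicit constant at the end is one admissible choice of \(\lambda \), not necessarily the one intended in the remark. First,

$$\begin{aligned}&(p-1)(p-1+s-\gamma )\theta ^2+(2p-2+s-\gamma )\theta (1-\theta )+(1-\theta )^2\\&\quad =1+(2p-4+s-\gamma )\theta +(p-2)(p-2+s-\gamma )\theta ^2\\&\quad =1+(p-2+s)\theta +s(p-2)\theta ^2+(p-2-\gamma )\big ((p-2)\theta +1\big )\theta , \end{aligned}$$

which is the bracket in (5.5). Second, if \(s>\gamma +1-p\), then \(p-1+s-\gamma >0\) and \(2p-2+s-\gamma >p-1>0\), so the mixed term is nonnegative and, since \(\theta ^2+\kappa ^2\ge \frac{1}{2}(\theta +\kappa )^2=\frac{1}{2}\),

$$\begin{aligned} (p-1)(p-1+s-\gamma )\theta ^2+(2p-2+s-\gamma )\theta \kappa +\kappa ^2\ge \frac{1}{2}\min \big \{(p-1)(p-1+s-\gamma ),1\big \}. \end{aligned}$$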

6 Removing the smoothness assumption

Section 5 gives a formal derivation of the regularity estimate under the assumption that the gradient of a solution does not vanish. In this section, we remove the additional assumption in a certain range of parameters by regularizing the equation and then passing to the limit to obtain the result for the original equation.

6.1 Regularization

Let \(u^{\varepsilon }:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to the equation

$$\begin{aligned} u^{\varepsilon }_t-(|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\Big (\Delta u^{\varepsilon }+(p-2)\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\Big )=0 \end{aligned}$$
(6.1)

where \(1<p<\infty \), \(-1<\gamma <\infty \), and \(\varepsilon >0\) is a regularization parameter. As explained in Sect. 3.3, there is a mismatch between the second order differential quantities in the fundamental inequality and those in the regularized equation, and consequently in the basic estimate. Because of this, some of the formal calculations do not work as such, even if most of the steps work for general s. In particular, positive definiteness of the quadratic form becomes an issue.

In this section, partly for the convenience of the reader, we have decided to limit ourselves to the planar case \(n=2\) and focus on the square-integrability of the second order derivatives \(D^2u\), that is, we consider the case \(s=2-p\). In this case the range condition in (1.4), that is,

$$\begin{aligned} s>\max \Big \{-1-\frac{p-1}{n-1},\gamma +1-p\Big \} \end{aligned}$$

reduces to

$$\begin{aligned} 1<p<\infty \quad \text {and}\quad -1<\gamma <1. \end{aligned}$$
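The reduction is a one-line computation, spelled out here for convenience. With \(n=2\) and \(s=2-p\), the first entry of the maximum is always below s, and the second entry gives the restriction on \(\gamma \):

$$\begin{aligned} -1-\frac{p-1}{n-1}\Big |_{n=2}=-p<2-p, \qquad 2-p>\gamma +1-p\iff \gamma <1. \end{aligned}$$

Together with the standing assumptions \(1<p<\infty \) and \(\gamma >-1\), this is exactly the range displayed above.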

Then range (i)

$$\begin{aligned} 1<p\le 5 \quad \text {and}\quad -1<\gamma <1, \end{aligned}$$
(6.2)

in Theorem 1.1 and the Proposition below will be obtained by a straightforward generalization of the formal calculation (\(\varepsilon =0\)). That is, we consider the sum

$$\begin{aligned} S= w_1{{\,\mathrm{div\,}\,}}(D^2u^{\varepsilon }Du^{\varepsilon }-\Delta u^{\varepsilon }Du^{\varepsilon }) +w_2u^{\varepsilon }_t{{\,\mathrm{div\,}\,}}\big ((|Du^{\varepsilon }|^2+\varepsilon )^{-\frac{\gamma }{2}}Du^{\varepsilon }\big ), \end{aligned}$$
(6.3)

and show that if (6.2) holds, then we can find \(w_1,w_2>0\) such that

$$\begin{aligned} c \left| D_T\left| Du^{\varepsilon }\right| \right| ^2 +Q\le S, \end{aligned}$$

where \(c>0\) and Q is positive definite. For range (ii) in the Proposition below which is the same as in Theorem 1.1, we instead consider the full S as defined in (3.11) or equivalently in (4.4).

Our main result for \(u^{\varepsilon }\) is the following.

Proposition 6.1

Let \(n=2\). Let \(u^{\varepsilon }:\Omega _T\rightarrow {\mathbb {R}}\) be a smooth solution to (6.1). If p and \(\gamma \) satisfy one of the following conditions:

(i) \(1<p\le 5\) and \(-1<\gamma <1\); or

(ii) \(1<p<\infty \) and \(-1<\gamma <\sqrt{2}-\frac{1}{2}\),

then for any concentric parabolic cylinders \(Q_r\subset Q_{2r}\Subset \Omega _T\) with center point \((x_0,t_0)\in \Omega _T\), we have the estimate

$$\begin{aligned} \begin{aligned} \int _{Q_r}|D^2u^{\varepsilon }|^2dxdt&\le \frac{C}{r^2}\left( \int _{Q_{2r}}|Du^{\varepsilon }|^2dxdt +\int _{Q_{2r}}(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{2-\gamma }{2}}dxdt\right) \\&\quad +\varepsilon \left( \frac{C}{r^2}\int _{Q_{2r}}\big |\ln (|Du^{\varepsilon }|^2+\varepsilon )\big |dxdt +C\int _{B_{2r}}\big |\ln (|Du^{\varepsilon }(x,t_0)|^2+\varepsilon )\big |dx \right) \end{aligned} \end{aligned}$$

where \(C=C(p,\gamma )>0\).

The proof of Proposition 6.1 is postponed to the end of the section. Its main ingredients are Lemmas 6.2 and 6.3 below: Lemma 6.2 yields case (i) and Lemma 6.3 yields case (ii). In both lemmas we consider the same weighted sum as before, now selecting \(s=2-p\), that is,

$$\begin{aligned} S&=w_1GD_1(p-2+s)+w_2GD_2(p-2+s)+\varepsilon w_3GD_1(p-4+s)\nonumber \\&\quad +\varepsilon w_4GD_2(p-4+s)\nonumber \\&=w_1GD_1(0)+w_2GD_2(0)+w_3\varepsilon GD_1(-2)+w_4\varepsilon GD_2(-2), \end{aligned}$$
(6.4)

where \(w_1,w_2,w_3,w_4\in {\mathbb {R}}\) are some weights, and the notation was defined in (4.4).

The purpose of Lemmas 6.2 and 6.3 is to show that under restrictions (i) and (ii), respectively, we can find suitable weights \(w_1,w_2,w_3\) and \(w_4\), that only depend on p and \(\gamma \), such that S has a suitable lower bound.

Lemma 6.2

Let \(n=2\), S be as in (6.4), and (i) in Proposition 6.1 hold. For \(\eta =\eta (p,\gamma )>0\) small enough, if

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} w_1= p-\gamma -2\sqrt{(p-1)(1-\gamma )} +\eta , \quad &{}w_2=2, \\ w_3=0, \quad &{}w_4=0, \end{array}\right. } \end{aligned} \end{aligned}$$

then

$$\begin{aligned} S\ge c|D_T|Du^{\varepsilon }||^2+Q \end{aligned}$$

where \(c=c(p,\gamma )>0\) and

$$\begin{aligned} Q=\langle {\bar{x}},M{\bar{x}}\rangle \end{aligned}$$

with \({\bar{x}}=(\Delta _Tu^{\varepsilon },\Delta _{\infty }^Nu^{\varepsilon })^T\in {\mathbb {R}}^2\) and a uniformly bounded positive definite (with a uniform constant) symmetric matrix \(M=M(p,\gamma )\in {\mathbb {R}}^{2\times 2}\).

Lemma 6.3

Let \(n=2\), S be as in (6.4), and (ii) in Proposition 6.1 hold. If

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} w_1=p-\gamma ,\quad &{}w_2=2 \\ w_3=4-p+\gamma ,\quad &{}w_4=2, \end{array}\right. } \end{aligned} \end{aligned}$$

then a statement similar to that in Lemma 6.2 holds.

To begin with, recall from (4.7) that S can be written as

$$\begin{aligned} S&= c_1|D^2u^{\varepsilon }|^2+c_2|D_T|Du^{\varepsilon }||^2+(c_3-c_1)(\Delta _T u^{\varepsilon })^2+\big ((c_3+c_4)P_\theta -c_1\big ) (\Delta _{\infty }^Nu^{\varepsilon })^2 \\&\quad +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }, \end{aligned}$$

where

$$\begin{aligned} {\left\{ \begin{array}{ll} c_1=w_1+w_3\kappa , \quad &{}c_2=-2w_3\theta \kappa , \\ c_3=w_2+w_4\kappa ,\quad &{}c_4=-(w_2\gamma +w_4\kappa (2+\gamma ))\theta , \end{array}\right. } \end{aligned}$$
(6.5)

and

$$\begin{aligned} P_\theta =(p-2)\theta +1\in (0,\infty ), \quad \theta =\frac{|Du^{\varepsilon }|^2}{|Du^{\varepsilon }|^2+\varepsilon }\in [0, 1), \quad \kappa =1-\theta =\frac{\varepsilon }{|Du^{\varepsilon }|^2+\varepsilon }\in (0,1]. \end{aligned}$$

Fundamental equality (4.3) in the plane yields that

$$\begin{aligned} \begin{aligned} S&=c_1\big (2|D_T|Du^{\varepsilon }| |^2 + (\Delta _T u^{\varepsilon })^2 +(\Delta _{\infty }^Nu^{\varepsilon })^2\big )+c_2|D_T|Du^{\varepsilon }||^2+(c_3-c_1)(\Delta _T u^{\varepsilon })^2\\&\quad +\big ((c_3+c_4)P_\theta -c_1\big ) (\Delta _{\infty }^Nu^{\varepsilon })^2 +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\\&=(2c_1 +c_2)|D_T|Du^{\varepsilon }||^2+Q, \end{aligned} \end{aligned}$$
(6.6)

where

$$\begin{aligned} Q&= c_3 (\Delta _T u^{\varepsilon })^2+ (c_3+c_4)P_\theta (\Delta _{\infty }^Nu^{\varepsilon })^2 \\&\quad +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big ) \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon } \end{aligned}$$
(6.7)

is a quadratic form in \( \Delta _T u^{\varepsilon }\) and \( \Delta _{\infty }^Nu^{\varepsilon }\). We write Q compactly as

$$\begin{aligned} Q=\langle {\bar{x}},M{\bar{x}}\rangle , \end{aligned}$$

where \({\bar{x}}=(\Delta _T u^{\varepsilon },\Delta _{\infty }^Nu^{\varepsilon })^T\in {\mathbb {R}}^2\) is a vector and

$$\begin{aligned} M:= \begin{bmatrix} c_3 &{} \displaystyle {\frac{1}{2}\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )} \\ \displaystyle {\frac{1}{2}\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )} &{} (c_3+c_4)P_\theta \end{bmatrix} \end{aligned}$$

is a symmetric \(2\times 2\)-matrix.

To prove Lemmas 6.2 and 6.3, it now suffices to check that (6.6) satisfies all the requirements of the lemmas: The coefficient of \(|D_T|Du^{\varepsilon }||^2\) in (6.6) needs to be bounded from below by a positive constant, that is,

$$\begin{aligned} 2c_1+c_2=2(w_1+w_3\kappa )-2w_3\theta \kappa \ge c \end{aligned}$$
(6.8)

uniformly in \(\Omega _T\). For the quadratic form Q, we need to analyse the uniform boundedness and uniform positive definiteness of matrix M. Uniform boundedness is quite straightforward, so we focus our attention on the uniform positive definiteness. By Sylvester’s condition, it suffices to check that

$$\begin{aligned} c_3=w_2+w_4\kappa \ge c, \end{aligned}$$
(6.9)

and

$$\begin{aligned} \det (M)=c_3(c_3+c_4)P_\theta -\frac{\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )^2}{4}\ge c \end{aligned}$$
(6.10)

uniformly in \(\Omega _T\). Next we prove Lemma 6.2, which implies nonnegativity of the necessary terms when \(1<p\le 5\) and \(-1<\gamma <1\). In this case a simple choice of the weights \(w_3=w_4=0\) will work.
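Regarding the use of Sylvester’s condition above, the following sketch indicates how (6.9) and (6.10), combined with uniform boundedness, give a quantitative lower bound for Q. Here K is a hypothetical constant, introduced only for this illustration, with \(|M_{ij}|\le K\) for all entries of M. Since \(M_{11}\ge c>0\) and \(\det (M)\ge c>0\), both eigenvalues of M are positive, \(M_{22}>0\), and

$$\begin{aligned} \lambda _{\min }(M)=\frac{\det (M)}{\lambda _{\max }(M)}\ge \frac{\det (M)}{M_{11}+M_{22}}\ge \frac{c}{2K}, \qquad \text {so that}\qquad Q=\langle {\bar{x}},M{\bar{x}}\rangle \ge \frac{c}{2K}|{\bar{x}}|^2. \end{aligned}$$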

Proof

(Proof of Lemma 6.2) Similarly to the smooth case, we start with \(w_3=w_4=0\), plug these values into (6.5), and obtain

$$\begin{aligned} {\left\{ \begin{array}{ll} c_1=w_1,\quad &{}c_2=0,\\ c_3=w_2, \quad &{}c_4=-w_2\gamma \theta . \end{array}\right. } \end{aligned}$$

This together with (6.6) gives

$$\begin{aligned} \begin{aligned} S=2w_1|D_T|Du^{\varepsilon }||^2+ w_2(\Delta _T u^{\varepsilon })^2+w_2 P_\theta R_\theta (\Delta _{\infty }^Nu^{\varepsilon })^2+\big (w_2(P_\theta +R_\theta )-2w_1\big )\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }, \end{aligned} \end{aligned}$$

where we denote \(R_\theta :=1-\gamma \theta \in (0,2),\) for the sake of brevity. To simplify the above identity, we select \(w_2=2.\) Thus

$$\begin{aligned} \begin{aligned} S=&2w_1|D_T|Du^{\varepsilon }||^2+ 2\Big ((\Delta _T u^{\varepsilon })^2+ P_\theta R_\theta (\Delta _{\infty }^Nu^{\varepsilon })^2+( P_\theta +R_\theta -w_1)\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\Big )\\ =&2w_1|D_T|Du^{\varepsilon }||^2+Q, \end{aligned} \end{aligned}$$

where the matrix of the quadratic form Q is

$$\begin{aligned} M(\theta ):= \begin{bmatrix} 2 &{} P_\theta +R_\theta -w_1 \\ P_\theta +R_\theta -w_1 &{} 2P_\theta R_\theta \end{bmatrix}. \end{aligned}$$

The determinant of \(M(\theta )\) is uniformly positive if and only if

$$\begin{aligned} P_\theta R_\theta -\frac{(P_\theta +R_\theta -w_1)^2}{4}\ge c>0, \end{aligned}$$

that is,

$$\begin{aligned} X_2(\theta ):= (\sqrt{P_\theta } +\sqrt{R_\theta })^2>w_1> (\sqrt{P_\theta } -\sqrt{R_\theta })^2=:X_1(\theta ) \end{aligned}$$

uniformly in \(\Omega _T\). Thus it suffices to verify

$$\begin{aligned} \inf _\theta X_2(\theta )>\sup _\theta X_1(\theta ). \end{aligned}$$
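The passage from the determinant condition to the two-sided bound on \(w_1\) can be made explicit by the following elementary identity, which we include as a sketch; it uses only \(P_\theta ,R_\theta >0\) and the definitions of \(X_1\) and \(X_2\):

$$\begin{aligned} P_\theta R_\theta -\frac{(P_\theta +R_\theta -w_1)^2}{4}&=\frac{1}{4}\big (2\sqrt{P_\theta R_\theta }-(P_\theta +R_\theta -w_1)\big )\big (2\sqrt{P_\theta R_\theta }+(P_\theta +R_\theta -w_1)\big )\\&=\frac{1}{4}\big (w_1-X_1(\theta )\big )\big (X_2(\theta )-w_1\big ). \end{aligned}$$

In particular, if \(w_1-\sup _\theta X_1\ge \delta \) and \(\inf _\theta X_2-w_1\ge \delta \) for some \(\delta >0\), then the left hand side is at least \(\delta ^2/4\) for every \(\theta \), which gives the required uniformity.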

Computing the derivative of \(X_1\) with respect to \(\theta \), one has

$$\begin{aligned} X_1^\prime (\theta ) =&p-2-\gamma -\frac{(p-2) R_\theta -\gamma P_\theta }{\sqrt{P_\theta R_\theta }}\\ =&\frac{(\sqrt{P_\theta }-\sqrt{R_\theta })\big ((p-2)\sqrt{R_\theta }+\gamma \sqrt{P_\theta }\big )}{ \sqrt{P_\theta R_\theta }}. \end{aligned}$$

Then, in the eligible range of parameters, \(X_1^\prime \) vanishes at some \(\theta \in (0,1)\) if and only if \(p-2+\gamma =0\), in which case \(P_\theta =R_\theta \) and \(X_1\equiv 0\). Hence, by considering the values at the endpoints, we obtain the supremum of \(X_1\) with respect to \(\theta \):

$$\begin{aligned} \sup _\theta X_1 (\theta )&=\max \{X_1 (0), X_1 (1)\}\\&=\max \Big \{0, \big (\sqrt{p-1}-\sqrt{1-\gamma }\big )^2 \Big \}\\&= \big (\sqrt{p-1}-\sqrt{1-\gamma }\big )^2. \end{aligned}$$

Similarly, we obtain the derivative of \(X_2\) with respect to \(\theta \):

$$\begin{aligned} X_2^\prime (\theta )=\frac{(\sqrt{P_\theta }+\sqrt{R_\theta })\big ((p-2)\sqrt{R_\theta }-\gamma \sqrt{P_\theta }\big )}{ \sqrt{P_\theta R_\theta }}, \end{aligned}$$

and thus the eligible stationary point is

$$\begin{aligned} \theta _2=\frac{p-2-\gamma }{(p-2)\gamma } \end{aligned}$$

if \((p-2)\gamma >0\). Then

$$\begin{aligned} \inf _\theta X_2(\theta )&=\min \Big \{X_2(0),X_2\Big (\frac{p-2-\gamma }{(p-2)\gamma }\Big ), X_2(1)\Big \}\\&=\min \Big \{4, \frac{p-2}{\gamma }+\frac{\gamma }{p-2}+2,\big (\sqrt{p-1}+\sqrt{1-\gamma }\big )^2 \Big \}\\&=\min \Big \{4, \big (\sqrt{p-1}+\sqrt{1-\gamma }\big )^2 \Big \}. \end{aligned}$$
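The value at the stationary point, and the reason it can be dropped from the minimum, can be checked as follows. This is a sketch for the case \((p-2)\gamma >0\), in which \(\theta _2\) is defined:

$$\begin{aligned} P_{\theta _2}&=1+(p-2)\theta _2=1+\frac{p-2-\gamma }{\gamma }=\frac{p-2}{\gamma },\qquad R_{\theta _2}=1-\gamma \theta _2=1-\frac{p-2-\gamma }{p-2}=\frac{\gamma }{p-2},\\ X_2(\theta _2)&=P_{\theta _2}+R_{\theta _2}+2\sqrt{P_{\theta _2}R_{\theta _2}}=\frac{p-2}{\gamma }+\frac{\gamma }{p-2}+2\ge 4, \end{aligned}$$

where the last inequality is the arithmetic-geometric mean inequality applied to the positive number \(\frac{p-2}{\gamma }\) and its reciprocal.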

Obviously, we have

$$\begin{aligned} \big (\sqrt{p-1}+\sqrt{1-\gamma }\big )^2>\big (\sqrt{p-1}-\sqrt{1-\gamma }\big )^2. \end{aligned}$$

Note that

$$\begin{aligned} 4>\big (\sqrt{p-1}-\sqrt{1-\gamma }\big )^2 \end{aligned}$$

is equivalent to

$$\begin{aligned} 1<p\le 5 \quad \text {and}\quad -1<\gamma <1, \end{aligned}$$

or \(5<p<7+4\sqrt{2}\) and \(-1<\gamma <-2-p+4\sqrt{p-1}\). Thus if p and \(\gamma \) satisfy range (i), for small enough \(\eta =\eta (p,\gamma )>0\), in addition to the above choice \(w_2=2\), we set

$$\begin{aligned} w_1=\big (\sqrt{p-1}-\sqrt{1-\gamma }\big )^2+\eta . \end{aligned}$$

The proof is finished. \(\square \)
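For completeness, the equivalence used in the proof for the condition \(4>(\sqrt{p-1}-\sqrt{1-\gamma })^2\) can be sketched as follows, under the standing assumptions \(p>1\) and \(-1<\gamma <1\). Since \(\sqrt{1-\gamma }<\sqrt{2}<2\), only the inequality \(\sqrt{p-1}-\sqrt{1-\gamma }<2\) matters. It holds automatically when \(p\le 5\), because then \(\sqrt{p-1}\le 2\). For \(p>5\) we have

$$\begin{aligned} \sqrt{p-1}-2<\sqrt{1-\gamma }&\iff p+3-4\sqrt{p-1}<1-\gamma \iff \gamma <-2-p+4\sqrt{p-1},\\ -1<-2-p+4\sqrt{p-1}&\iff (p+1)^2<16(p-1)\iff p^2-14p+17<0\iff p<7+4\sqrt{2}, \end{aligned}$$

where the last equivalence uses \(p>5>7-4\sqrt{2}\). The second line states when the admissible interval for \(\gamma \) is nonempty.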

Next we prove Lemma 6.3, which implies nonnegativity of the necessary terms when \(1<p<\infty \) and \(-1<\gamma <\sqrt{2}-\frac{1}{2}\). In this case, we use a choice of the weights which leads to the vanishing coefficient of the mixed term \(\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\) in (6.7).

To be more precise, at the beginning of this section, we obtained three conditions (6.8), (6.9) and (6.10), i.e. that

$$\begin{aligned} {\left\{ \begin{array}{ll} 2c_1+c_2&{}=2(w_1+w_3\kappa )-2w_3\theta \kappa \ge c,\\ c_3&{}=w_2+w_4\kappa \ge c,\\ \det (M)&{}=c_3(c_3+c_4)P_\theta -\frac{\left( c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\right) ^2}{4}\ge c \end{array}\right. } \end{aligned}$$
(6.11)

need to hold uniformly in \(\Omega _T\). Here

$$\begin{aligned} M= \begin{bmatrix} c_3 &{} \displaystyle {\frac{1}{2}\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )} \\ \displaystyle {\frac{1}{2}\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big )} &{} (c_3+c_4)P_\theta \end{bmatrix} \end{aligned}$$

is the coefficient matrix of quadratic form

$$\begin{aligned} Q=c_3 (\Delta _T u^{\varepsilon })^2+ (c_3+c_4)P_\theta (\Delta _{\infty }^Nu^{\varepsilon })^2 +\big (c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\big ) \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }. \end{aligned}$$

To simplify the computations in checking the last condition in (6.11), we will consider a special case where the coefficient \(c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\) of the mixed term \(\Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\) vanishes.

Lemma 6.4

(Vanishing mixed term) The mixed term \( \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\) in Q vanishes, i.e.

$$\begin{aligned} c_3P_\theta +(c_3+c_4)-(2c_1+c_2)=0 \end{aligned}$$

uniformly in \(\Omega _T\) if and only if

$$\begin{aligned} {\left\{ \begin{array}{ll} 2w_1=(p-\gamma )w_2, \\ 2w_3=(4-p+\gamma )w_4, \\ (p-2-\gamma )(w_4-w_2)=0. \end{array}\right. } \end{aligned}$$

Proof

Recall that \(\theta =1-\kappa \), \(\kappa >0\), and \(P_\theta =(p-2)\theta +1\). Then recalling the expressions of \(c_1,\ldots ,c_4\) in (6.5), we can write the coefficient of the mixed term \( \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\) as a polynomial of \(\kappa \) as

$$\begin{aligned}&c_3P_\theta +(c_3+c_4)-(2c_1+c_2)\\&\hspace{1 em}=\big ((4-p+\gamma )w_4-2w_3\big )\kappa ^2+(p-2-\gamma )(w_4-w_2)\kappa +(p-\gamma )w_2-2w_1. \end{aligned}$$

Setting all the coefficients to zero yields the desired condition. \(\square \)
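The polynomial expression in the proof of Lemma 6.4 can be verified term by term. The following sketch uses only (6.5), \(\theta =1-\kappa \) and \(P_\theta =p-1-(p-2)\kappa \):

$$\begin{aligned} c_3P_\theta&=(w_2+w_4\kappa )\big (p-1-(p-2)\kappa \big )=(p-1)w_2+\big ((p-1)w_4-(p-2)w_2\big )\kappa -(p-2)w_4\kappa ^2,\\ c_3+c_4&=w_2+w_4\kappa -\big (w_2\gamma +w_4\kappa (2+\gamma )\big )(1-\kappa )=(1-\gamma )w_2+\big (\gamma w_2-(1+\gamma )w_4\big )\kappa +(2+\gamma )w_4\kappa ^2,\\ 2c_1+c_2&=2(w_1+w_3\kappa )-2w_3(1-\kappa )\kappa =2w_1+2w_3\kappa ^2. \end{aligned}$$

Adding the first two lines and subtracting the third gives the constant term \((p-\gamma )w_2-2w_1\), the linear coefficient \((p-2-\gamma )(w_4-w_2)\) and the quadratic coefficient \((4-p+\gamma )w_4-2w_3\).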

By the above lemma, we easily obtain the following result.

Corollary 6.5

If

$$\begin{aligned} {\left\{ \begin{array}{ll} w_1 =p-\gamma ,\quad &{}w_2=2, \\ w_3 =4-p+\gamma ,\quad &{}w_4=2, \end{array}\right. } \end{aligned}$$
(6.12)

then the mixed term \( \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\) in Q vanishes.

The above corollary gives a choice of the coefficients \(w_1,w_2,w_3\) and \(w_4\) to obtain the vanishing coefficient of mixed term \( \Delta _T u^{\varepsilon }\Delta _{\infty }^Nu^{\varepsilon }\). This then helps us in proving Lemma 6.3.

Proof of Lemma 6.3

If \(w_1,w_2,w_3\) and \(w_4\) satisfy (6.12), then by Corollary 6.5, the last condition in (6.11) reduces to checking that

$$\begin{aligned} \det (M)=c_3(c_3+c_4)P_\theta \ge c. \end{aligned}$$

Since

$$\begin{aligned} P_\theta =(p-2)\theta +1\ge \min \{p-1,1\}>0, \end{aligned}$$

sufficient conditions to obtain (6.11) can be written as

$$\begin{aligned} {\left\{ \begin{array}{ll} 2c_1+c_2&{}=2(w_1+w_3\kappa )-2w_3\theta \kappa \ge c,\\ c_3&{}=w_2+w_4\kappa \ge c,\\ c_3+c_4&{}=w_2+w_4\kappa -\big (w_2\gamma +w_4\kappa (2+\gamma )\big )\theta \ge c \end{array}\right. } \end{aligned}$$

uniformly in \(\Omega _T.\)

First, using values (6.12) in the first condition and replacing \(\theta \) by \(1-\kappa \), we have

$$\begin{aligned} 2c_1+c_2 =&2(p-\gamma )+2(4-p+\gamma )\kappa ^2. \end{aligned}$$

Since \(\kappa \) is positive, the derivative with respect to \(\kappa \), that is \(4(4-p+\gamma )\kappa \), has a fixed sign. Hence \(2c_1+c_2\) is monotone in \(\kappa \), and its minimum corresponds to either \(\kappa =0\) or \(\kappa =1\). Thus

$$\begin{aligned} \begin{aligned} 2c_1+c_2\ge \min \{2(p-\gamma ), 8\}>0. \end{aligned} \end{aligned}$$

For the second condition, when \(w_2=w_4=2\), it is obvious that

$$\begin{aligned} c_3=2+2\kappa \ge 2>0. \end{aligned}$$

Finally, for the last condition, plugging in values (6.12) gives

$$\begin{aligned} c_3+c_4 =&2(1-\gamma )-2\kappa +2(2+\gamma )\kappa ^2. \end{aligned}$$

When the derivative of \(c_3+c_4\) with respect to \(\kappa \) vanishes, that is, \(-2+4(2+\gamma )\kappa =0,\) one has

$$\begin{aligned} \kappa _1=\frac{1}{2(2+\gamma )}\in (0,1]. \end{aligned}$$

Then the minimum point is one of the boundary points or the extreme point \(\kappa _1 \). Selecting \(\kappa =\kappa _1\), we have

$$\begin{aligned} c_3+c_4=2(1-\gamma )-\frac{1}{2(2+\gamma )}>0 \end{aligned}$$
(6.13)

if and only if

$$\begin{aligned} -1<\gamma <\sqrt{2}-\frac{1}{2}. \end{aligned}$$
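The threshold for \(\gamma \) comes from a quadratic inequality, which we sketch here. Since \(2+\gamma >0\), multiplying the expression in (6.13) by \(2(2+\gamma )\) does not change its sign, and

$$\begin{aligned} 2(1-\gamma )-\frac{1}{2(2+\gamma )}>0&\iff 4(1-\gamma )(2+\gamma )>1\iff 4\gamma ^2+4\gamma -7<0\\&\iff -\frac{1}{2}-\sqrt{2}<\gamma <-\frac{1}{2}+\sqrt{2}. \end{aligned}$$

Intersecting with the standing assumption \(\gamma >-1\) gives the stated range.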

If \(\kappa =0\), we have \(c_3+c_4=2(1-\gamma )\), and if \(\kappa =1\), then \(c_3+c_4=4\). It follows that the minimum is given by strictly positive expression (6.13), and the proof is finished. \(\square \)

The proof of Proposition 6.1 now immediately follows.

Proof of Proposition 6.1

The result immediately follows from the previous lemmas, since under assumption (i), Lemma 6.2 implies that (4.9) holds and thus Lemma 4.4 is applicable. Similarly under assumption (ii), Lemma 6.3 implies that Lemma 4.4 is applicable. Now the desired estimate follows from Lemma 4.5. \(\square \)

6.2 Passing to the original equation

In this section we justify the limiting argument to let \(\varepsilon \rightarrow 0\) in Proposition 6.1 and thus derive our main result, Theorem 1.1.

Proof of Theorem 1.1

Let \(u:\Omega _T\rightarrow {\mathbb {R}}\) be a viscosity solution to

$$\begin{aligned} u_t-|Du|^{\gamma }\big (\Delta u+(p-2)\Delta _\infty ^Nu\big )=0. \end{aligned}$$

Let us fix concentric parabolic cylinders \(Q_r\subset Q_{2r}\Subset \Omega _T\) with center point \((x_0,t_0)\in \Omega _T\) and moreover, let us fix a smooth subdomain \(U\Subset \Omega \) and \(0<t_1<t_2<T\) such that \(Q_{2r}\Subset U_{t_1,t_2}\Subset \Omega _T\). For \(\varepsilon >0\) small, let us consider the Dirichlet problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} u^{\varepsilon }_t-(|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\Big (\Delta u^{\varepsilon }+(p-2)\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\Big )=0 \quad &{}{\text {in }} U_{t_1,t_2};\\ u^{\varepsilon }=u \quad &{}{\text {on }} \partial _pU_{t_1,t_2}, \end{aligned} \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} \partial _pU_{t_1,t_2}:=({{\overline{U}}}\times \{t_1\})\cup ( \partial U\times (t_1,t_2]) \end{aligned}$$

is the parabolic boundary of \(U_{t_1,t_2}\). By the classical theory of uniformly parabolic equations, the above problem has a unique solution \(u^{\varepsilon }\in C^\infty ( U_{t_1,t_2})\cap C( {\overline{U}} _{t_1,t_2})\).

Proposition 6.1 is applicable to \(u^{\varepsilon }\) and we conclude that

$$\begin{aligned} \begin{aligned} \int _{Q_r}|D^2u^{\varepsilon }|^2dxdt&\le \frac{C}{r^2}\left( \int _{Q_{2r}}|Du^{\varepsilon }|^2dxdt +\int _{Q_{2r}}(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{2-\gamma }{2}}dxdt\right) \\&\quad +\varepsilon \left( \frac{C}{r^2}\int _{Q_{2r}}\big |\ln (|Du^{\varepsilon }|^2+\varepsilon )\big |dxdt \right. \\&\quad \left. +C\int _{B_{2r}}\big |\ln \big (|Du^{\varepsilon }(x,t_0)|^2+\varepsilon \big )\big |dx \right) \end{aligned} \end{aligned}$$
(6.14)

where \(C=C(p,\gamma )>0\). By [14], for any \(Q_R\Subset U_{t_1,t_2}\) there exist positive constants \(\alpha \in (0,1)\) and \(C>0\), that are allowed to depend on p, \(\gamma \), \({{\,\mathrm{dist\,}\,}}(Q_R,\partial U_{t_1,t_2})\) and \(\Vert u\Vert _{L^\infty (U_{t_1,t_2})}\), such that

$$\begin{aligned} \Vert Du^{\varepsilon }\Vert _{C^{\alpha }(Q_R)}\le C. \end{aligned}$$
(6.15)

By the Arzelà-Ascoli theorem, \(u^{\varepsilon }\) and \(Du^{\varepsilon }\) both converge locally uniformly, up to a subsequence, that is,

$$\begin{aligned} u^{\varepsilon }\xrightarrow {\varepsilon \rightarrow 0}{\bar{u}} \quad {\text {and}}\quad Du^{\varepsilon }\xrightarrow {\varepsilon \rightarrow 0}D{\bar{u}} \end{aligned}$$

for some continuous function \({\bar{u}}:U_{t_1,t_2}\rightarrow {\mathbb {R}}\), which by a barrier argument is continuous up to the parabolic boundary, and whose spatial gradient \(D{\bar{u}}\) is locally continuous.

By the well-known stability properties of viscosity solutions [12], \({\bar{u}}\) is a viscosity solution to

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} {\bar{u}}_t-|D{\bar{u}}|^\gamma \big (\Delta {\bar{u}}+(p-2)\Delta _{\infty }^N{\bar{u}}\big )=0 \quad &{}{\text {in }} U_{t_1,t_2};\\ {\bar{u}}=u \quad &{}{\text {on }} \partial _pU_{t_1,t_2}. \end{aligned} \end{array}\right. } \end{aligned}$$

By the uniqueness theorem for viscosity solutions [22], we conclude that \({\bar{u}}=u\).

By the bound (6.15), the right hand side of (6.14) is bounded from above by a constant independent of \(\varepsilon \). Thus \(\{D^2u^{\varepsilon }\}_{\varepsilon }\) is bounded in \(L^2(Q_{r})\), and consequently we may extract a subsequence that converges weakly in \(L^2(Q_r)\). Further, integrating by parts against smooth compactly supported test functions and using the locally uniform convergence of \(u^{\varepsilon }\) to \(u\), we see that the weak limit is \(D^2u\), and thus \(D^2u\in L^2_\text {loc}(\Omega _T)\). Finally, by the weak lower semicontinuity of the \(L^2\)-norm, we conclude that

$$\begin{aligned} \int _{Q_{r}}|D^2u|^2dxdt&\le \liminf _{\varepsilon \rightarrow 0}\int _{Q_{r}}|D^2u^{\varepsilon }|^2dxdt \\&\le \liminf _{\varepsilon \rightarrow 0} \left( \frac{C}{r^2}\left( \int _{Q_{2r}} |Du^{\varepsilon }|^2 dxdt +\int _{Q_{2r}}(|Du^{\varepsilon }|^2+\varepsilon )^{\frac{2-\gamma }{2}}dxdt\right) \right. \\&\quad \left. +\varepsilon \left( \frac{C}{r^2}\int _{Q_{2r}}\big |\ln (|Du^{\varepsilon }|^2+\varepsilon )\big |dxdt +C\int _{B_{2r}}\big |\ln \big (|Du^{\varepsilon }(x,t_0)|^2+\varepsilon \big )\big |dx \right) \right) \\&= \frac{C}{r^2}\left( \int _{Q_{2r}}|Du|^2 dxdt +\int _{Q_{2r}}|Du|^{2-\gamma } dxdt\right) , \end{aligned}$$

which is the desired estimate. \(\square \)
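The last equality in the proof above uses that the \(\varepsilon \)-weighted logarithmic terms vanish in the limit. A short sketch of this step, assuming \(Q_{2r}\subset Q_R\) and writing \(M\) (a symbol introduced here for illustration) for a bound of \(|Du^{\varepsilon }|\) in \(Q_{2r}\) provided by (6.15): since \(\varepsilon \le |Du^{\varepsilon }|^2+\varepsilon \le M^2+\varepsilon \) and \(s\mapsto |\ln s|\) attains its maximum over a closed interval at an endpoint, we have

$$\begin{aligned} \varepsilon \int _{Q_{2r}}\big |\ln (|Du^{\varepsilon }|^2+\varepsilon )\big |dxdt\le \varepsilon \,|Q_{2r}|\max \big \{|\ln \varepsilon |,\big |\ln (M^2+\varepsilon )\big |\big \}\xrightarrow {\varepsilon \rightarrow 0}0. \end{aligned}$$

The integral over \(B_{2r}\) can be handled in the same way.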

It is possible to improve the ranges in Theorem 1.1. However, the computations become more technical, even though they follow the same ideas as above, and thus we have chosen to omit them. In any case, the question of whether the full range obtained in the smooth case in Proposition 5.1 can also be obtained here remains open.

Next we give the proof of Corollary 1.2.

Proof of Corollary 1.2

Assume that \(u^{\varepsilon }\) is a smooth solution to (6.1), and observe that

$$\begin{aligned} \left| u^{\varepsilon }_t\right| =&\left| (|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\left( \Delta u^{\varepsilon }+(p-2)\frac{\Delta _{\infty }u^{\varepsilon }}{|Du^{\varepsilon }|^2+\varepsilon }\right) \right| \nonumber \\ \le&(|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}(\left| \Delta u^{\varepsilon }\right| +\left| p-2\right| \left| D^2u^{\varepsilon }\right| )\nonumber \\ \le&(p+2)(|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\left| D^2u^{\varepsilon }\right| . \end{aligned}$$
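The first inequality above rests on the Cauchy-Schwarz inequality. As a brief sketch of this step,

$$\begin{aligned} \left| \Delta _{\infty }u^{\varepsilon }\right| =\left| \langle Du^{\varepsilon },D^2u^{\varepsilon }Du^{\varepsilon }\rangle \right| \le \left| D^2u^{\varepsilon }\right| |Du^{\varepsilon }|^2\le \left| D^2u^{\varepsilon }\right| (|Du^{\varepsilon }|^2+\varepsilon ), \end{aligned}$$

so that the quotient \(\left| \Delta _{\infty }u^{\varepsilon }\right| /(|Du^{\varepsilon }|^2+\varepsilon )\) is bounded pointwise by \(\left| D^2u^{\varepsilon }\right| \).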

As above, the spatial gradient \(Du^{\varepsilon }\) is Hölder continuous with a bound independent of \(\varepsilon \), and since \(\gamma \) is nonnegative, we have in \(Q_{2r}\)

$$\begin{aligned} (|Du^{\varepsilon }|^2+\varepsilon )^{\gamma /2}\le C. \end{aligned}$$

For all \( Q_r\subset Q_{2r}\Subset \Omega _T\), we have

$$\begin{aligned}&\int _{Q_r} |u^{\varepsilon }_t|^2 dxdt \nonumber \\&\quad \le (p+2)^2\int _{Q_r} (|Du^{\varepsilon }|^2+\varepsilon )^{\gamma } |D^2u^{\varepsilon }|^2 dxdt \nonumber \\&\quad \le (p+2)^2 \left| \left| (|Du^{\varepsilon }|^2+\varepsilon )^{\gamma }\right| \right| _{L^\infty (Q_r)} \int _{Q_r} |D^2u^{\varepsilon }|^2 dxdt. \end{aligned}$$

Then we use (6.14) to estimate the right hand side of the above inequality. Similarly to the proof of Theorem 1.1, up to a subsequence, \(\{ u^{\varepsilon }_t \}_{\varepsilon }\) converges weakly in \(L^2(Q_r)\). By integration by parts, the weak limit is \(u_t\). In particular, \(u_t\) exists as a function and \(u_t\in L^2_{\text {loc}}(\Omega _T)\). \(\square \)
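The identification of the weak limit can be sketched as follows, where the test function \(\varphi \in C_0^\infty (Q_r)\) and the symbol \(v\) for the weak \(L^2(Q_r)\)-limit of \(u^{\varepsilon }_t\) are notation introduced here for illustration. Since each \(u^{\varepsilon }\) is smooth and \(u^{\varepsilon }\rightarrow u\) uniformly in \(Q_r\) along the subsequence, we have

$$\begin{aligned} \int _{Q_r}v\varphi \,dxdt=\lim _{\varepsilon \rightarrow 0}\int _{Q_r}u^{\varepsilon }_t\varphi \,dxdt=-\lim _{\varepsilon \rightarrow 0}\int _{Q_r}u^{\varepsilon }\varphi _t\,dxdt=-\int _{Q_r}u\varphi _t\,dxdt, \end{aligned}$$

which says precisely that \(v\) is the weak time derivative of \(u\) in \(Q_r\).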