A systematic approach on the second order regularity of solutions to the general parabolic p-Laplace equation

We study a general form of a degenerate or singular parabolic equation $u_t-|Du|^{\gamma}\big(\Delta u+(p-2)\Delta_\infty^N u\big)=0$ that generalizes both the standard parabolic p-Laplace equation and the normalized version that arises from stochastic game theory. We develop a systematic approach to study second order Sobolev regularity and show that $D^2u$ exists as a function and belongs to $L^2_{\text{loc}}$ for a certain range of parameters. In this approach proving the estimate boils down to verifying that a certain coefficient matrix is positive definite. As a corollary we obtain, under suitable assumptions, that a viscosity solution has a Sobolev time derivative belonging to $L^2_{\text{loc}}$.

In this article, we consider a rather general class of parabolic equations
$$u_t-|Du|^{\gamma}\big(\Delta u+(p-2)\Delta_\infty^N u\big)=0 \tag{1.1}$$
with $1 < p < \infty$ and $-1 < \gamma < \infty$, where $\Delta_\infty^N u := |Du|^{-2}\langle D^2u\,Du, Du\rangle$ denotes the normalized infinity Laplacian. The equation contains the game theoretic or normalized p-parabolic equation and the divergence form standard p-parabolic equation as special cases. The equation is not uniformly parabolic or in divergence form except in special cases, and it can be highly degenerate or singular in the gradient variable. Regularity for such equations has been recently studied for example by Imbert, Jin and Silvestre as well as Parviainen and Vázquez, as discussed below. The objective of this article is to develop a systematic approach to study the second order spatial regularity of viscosity solutions to (1.1). In this approach proving the estimate reduces to verifying that a certain coefficient matrix is positive definite. For further notation and the definition of viscosity solutions to (1.1), we refer to Section 2.
In [11] we considered second order Sobolev regularity of the parabolic p-Laplace equation
$$u_t-\Delta_p u=0, \tag{1.2}$$
where $\Delta_p u := \operatorname{div}(|Du|^{p-2}Du)$ is the p-Laplace operator. Notice that, in the special case $\gamma = p-2$, equation (1.1) can be formally, and also rigorously by [16], rewritten as (1.2). One of the key tools is the fundamental inequality (the name stems from Dong, Peng, Zhang and Zhou [10] for a related inequality)
$$|Du|^4|D^2u|^2 \ge 2|Du|^2|D^2u\,Du|^2+\frac{(|Du|^2\Delta u-\Delta_\infty u)^2}{n-1}-(\Delta_\infty u)^2, \tag{1.3}$$
which holds for any smooth function u as shown by Sarsa in [25]. Curiously, in [11] it was sufficient to use the above inequality in a simpler form, just estimating $(|Du|^2\Delta u-\Delta_\infty u)^2 \ge 0$ on the right hand side. With the general equation in this paper, we use the inequality in its full generality. A natural approach to obtain second order Sobolev estimates is to differentiate (1.1), multiply the equation by suitable quantities containing gradients, and rearrange. Thus, among other terms, one obtains terms in divergence form, which can be controlled. In the case of (1.2), one then uses (1.3) in a simple form as explained above and thus gets an upper bound for a quantity containing second derivatives. Part of the difficulty in dealing with the general equation instead of the p-parabolic equation stems from the fact that this approach gives rise to mixed terms of the type $|Du|^{-\gamma}u_t\Delta_\infty^N u$, which are difficult to handle.
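The fundamental inequality lends itself to a quick numerical sanity check. The sketch below assumes the form $|Du|^4|D^2u|^2 \ge 2|Du|^2|D^2u\,Du|^2+(|Du|^2\Delta u-\Delta_\infty u)^2/(n-1)-(\Delta_\infty u)^2$ with $\Delta_\infty u = \langle D^2u\,Du, Du\rangle$, and tests it on random gradients and random symmetric Hessians. The function names are illustrative only, and a random test is of course no substitute for the proof in [25].

```python
import random

def fundamental_inequality_slack(grad, hess):
    """Left side minus right side of the assumed form of (1.3); needs n >= 2."""
    n = len(grad)
    g2 = sum(g * g for g in grad)                          # |Du|^2
    hess_norm2 = sum(h * h for row in hess for h in row)   # |D^2u|^2
    hg = [sum(hess[i][j] * grad[j] for j in range(n)) for i in range(n)]  # D^2u Du
    hg_norm2 = sum(v * v for v in hg)                      # |D^2u Du|^2
    lap = sum(hess[i][i] for i in range(n))                # Laplacian
    inf_lap = sum(hg[i] * grad[i] for i in range(n))       # <D^2u Du, Du>
    lhs = g2 * g2 * hess_norm2
    rhs = 2 * g2 * hg_norm2 + (g2 * lap - inf_lap) ** 2 / (n - 1) - inf_lap ** 2
    return lhs - rhs

def random_symmetric(n, rng):
    """Random symmetric n x n matrix with entries in [-1, 1]."""
    a = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(n)] for i in range(n)]

def min_slack(n, trials=2000, seed=0):
    """Smallest slack observed over random samples in dimension n."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        grad = [rng.uniform(-1, 1) for _ in range(n)]
        worst = min(worst, fundamental_inequality_slack(grad, random_symmetric(n, rng)))
    return worst
```

In dimension two the slack should vanish up to rounding, which reflects the fact that the inequality becomes an equality in the plane.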
Another difficulty arises from the fact that u is not known to be smooth a priori when differentiating the equation, and negative powers of the gradient are problematic as the gradient might vanish. A natural approach to these problems is to regularize the equation by adding a small regularization parameter, which removes the singularity. Unfortunately, when differentiating the regularized equation, one gets another set of problematic terms that no longer match the terms in the fundamental inequality. Treating these terms is a subtle issue, and we need to guarantee that a sum of certain terms remains nonnegative by carefully analyzing the explicit coefficients of the terms.
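The effect of the regularization on the gradient weight can be illustrated with a few lines of code. This is a minimal sketch assuming that the singular weight $|Du|^{\gamma}$ is replaced by $(|Du|^2+\varepsilon)^{\gamma/2}$; this specific form and the function names are assumptions made for illustration.

```python
def singular_weight(grad_norm, gamma):
    """|Du|^gamma; for gamma < 0 this is undefined at grad_norm == 0."""
    return grad_norm ** gamma

def regularized_weight(grad_norm, gamma, eps):
    """(|Du|^2 + eps)^(gamma/2); finite for every grad_norm when eps > 0."""
    return (grad_norm ** 2 + eps) ** (gamma / 2)
```

For $\gamma < 0$ the regularized weight is bounded by $\varepsilon^{\gamma/2}$ at critical points, and away from critical points it converges to the original weight as $\varepsilon \to 0$.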
In order to analyze the nonnegativity of the problematic terms and their coefficients systematically, we develop several techniques. We interpret the terms and their coefficients as a quadratic form and derive a range condition for the parameters from the positive definiteness condition of this quadratic form. In order to improve the range obtained in this way, we use a hidden divergence structure. Indeed, suitable mixed terms can actually be written in divergence form, and thus by adding such terms, we can manipulate the coefficients at the cost of adding divergence form terms that can be estimated. Some steps of the above plan, in particular checking that the quadratic form is positive definite, are quite complicated when written down explicitly, and thus for the convenience of the reader we first provide a formal calculation in Section 5, where we assume that the solution is smooth and the gradient nonvanishing. In this case, the above plan gives an optimal (optimality is discussed in Example 5.1) a priori estimate (Proposition 5.1) with the range condition (1.4). The left hand side in the above estimate is of the same form as the estimate in [11]. In particular, we may set $s = 2-p$, $s = 0$ and $s = p-2$, giving estimates for $D^2u$, $D(|Du|^{\frac{p-2}{2}}Du)$ and $D(|Du|^{p-2}Du)$ as special cases. Perhaps surprisingly, removing the smoothness assumption and the assumption on the nonvanishing gradient by using the regularized equation turns out to be a problem. In particular, the additional terms resulting from the regularization add to the technical complication of showing that the quadratic form is positive definite. To reduce technical complication, partly for expository reasons, we have decided to restrict ourselves to the case $n = 2$ in the regularized case. In this context we obtain the following result.
This also implies that the time derivative exists as an $L^2$-function, which is not evident directly from the definition.
Corollary 1.2 (Time derivative). Let $n = 2$. Let $u : \Omega_T \to \mathbb{R}$ be a viscosity solution to the general p-parabolic equation (1.1). If $p$ and $\gamma$ satisfy one of the following conditions: then the time derivative $u_t$ exists as a function and $u_t \in L^2_{\text{loc}}(\Omega_T)$.
At least to some extent the range condition in Theorem 1.1 is an artifact, as we explain later. It would be interesting to know whether the theorem is valid in the whole range of parameters.
Next we review the known regularity results of equation (1.1) and explain how our results fit into the existing literature. If $\gamma = p-2$, then equation (1.1) is the parabolic p-Laplace equation (1.2). For the regularity theory of weak solutions to (1.2) we refer to the monograph of DiBenedetto [8]. In particular, if u is a continuous weak solution to (1.2), then $u \in C^{\alpha}_{\text{loc}}$ and $Du \in C^{\beta}_{\text{loc}}$ for some $0 < \alpha, \beta < 1$. Moreover, Lindqvist [18] showed in the degenerate case $2 < p < \infty$ that The singular case is treated in [20]. The results then imply the existence of the time derivative $u_t$ as a function in suitable spaces, similarly to Corollary 1.2. In the case of the obstacle problem the existence of the time derivative was established in [19]. Dong, Peng, Zhang and Zhou [10] gave a proof that $D^2u \in L^2_{\text{loc}}$ with a sharp range $1 < p < 3$. This range of p can be recovered from assumption (i) of Theorem 1.1. In the global case, estimates for $D(|Du|^{p-2}Du)$ have been derived by Cianchi and Maz'ya in [7].
If $\gamma = 0$, equation (1.1) is the normalized parabolic p-Laplace equation $u_t = \Delta_p^N u$, where $\Delta_p^N u := \Delta u+(p-2)\Delta_\infty^N u$ is the normalized or game theoretic p-Laplace operator. This equation arises from a two-player stochastic game with a fixed running time, see Manfredi, Parviainen and Rossi [21], or from image processing, see Does [9]. Banerjee and Garofalo [5,6] studied the potential theoretic aspects and boundary regularity of the normalized p-Laplacian evolution. These papers also contain Lipschitz regularity results for solutions to the normalized p-parabolic equation. The regularity method in [21] is global whereas in [23] a local game theoretic method is applied in this context. Later Jin and Silvestre [15] established $C^{1,\alpha}_{\text{loc}}$-regularity in space and $C^{0,\frac{1+\alpha}{2}}_{\text{loc}}$-regularity in time. In [13], Høeg and Lindqvist studied the second order Sobolev regularity for the normalized p-parabolic equation and showed that when $\frac{6}{5} < p < \frac{14}{5}$, the second order spatial derivatives $D^2u$ and the time derivative $u_t$ belong to $L^2_{\text{loc}}$. Moreover, they also proved that when $1 < p < 2$, $u_t$ also belongs to $L^2_{\text{loc}}$. In [3], $C^{1,\alpha}_{\text{loc}}$-regularity was established for the normalized p-parabolic equation with a source term. The work of Dong, Peng, Zhang and Zhou [10] also applies to the normalized p-parabolic equation; in this case they obtained The key result of [10] with $\delta = 0$ can be recovered from assumption (ii) of Theorem 1.1. Recently Andrade and Santos [1] established improved Sobolev regularity estimates when p is close to 2.
As stated, (1.1) is in non-divergence form and can be highly degenerate or singular. Thus even defining viscosity solutions in such a way that existence and uniqueness can be obtained becomes a nontrivial issue. This was done by Ohnuma and Sato in [22], see also Giga's monograph [12]. For viscosity solutions to the general equation (1.1), where $1 < p < \infty$ and $-1 < \gamma < \infty$ are allowed to be independent of each other, Imbert, Jin and Silvestre [14] proved in particular that $Du \in C^{\alpha}_{\text{loc}}$ for suitable $0 < \alpha < 1$. In [24], Parviainen and Vázquez established Harnack's inequality and asymptotic behaviour by using the fact that for radial solutions equation (1.1) is equivalent to a divergence form equation but in a fictitious dimension. Attouchi [2] in the degenerate case and Attouchi and Ruosteenoja [4] in the singular case established spatial $C^{1,\alpha}_{\text{loc}}$-regularity for an equation of type (1.1) but with a source term. The elliptic Harnack's inequality in the singular range was obtained in [17].
This article is organized as follows. In Section 2 we provide the necessary preliminaries. In Section 3 we explain the ideas of the proof of Theorem 1.1. In Section 4 we state several auxiliary lemmas needed in the proofs, including the fundamental inequality (1.3). Sections 5 and 6 are parallel to each other. In the former, we provide the formal calculation. In the latter, we provide a similar calculation in a regularized setting, which eventually yields Theorem 1.1. In Section 6.2 we prove Theorem 1.1 and Corollary 1.2. Some of the proofs of the technical lemmas are postponed to the appendix.

PRELIMINARIES
We use the following notation. Let $\Omega \subset \mathbb{R}^n$, $n \ge 2$, be a domain and define the cylinder Moreover, we will use parabolic cylinders of the form where $B_r(x_0)$ denotes the open ball with radius $r > 0$ and center point $x_0 \in \Omega$. When no confusion arises, we may drop the reference point $(x_0,t_0)$ and write $Q_r$.
Given a function $u = u(x,t)$ of point $x \in \mathbb{R}^n$ and time $t > 0$, the spatial gradient of u is denoted by $Du = (u_{x_1}, \dots, u_{x_n})$, and the time derivative by $u_t$. The Hessian matrix of u is denoted by $D^2u = (u_{x_i x_j})_{i,j=1}^n$. The Laplacian of u is given by $\Delta u := \sum_{i=1}^n u_{x_i x_i}$ and the infinity Laplacian by $\Delta_\infty u := \langle D^2u\,Du, Du\rangle$, where $\langle\cdot,\cdot\rangle$ stands for the inner product in $\mathbb{R}^n$. The normalized infinity Laplacian is denoted by $\Delta_\infty^N u := |Du|^{-2}\Delta_\infty u$. We study viscosity solutions to the general p-parabolic equation
$$u_t-|Du|^{\gamma}\big(\Delta u+(p-2)\Delta_\infty^N u\big)=0, \tag{2.1}$$
where $1 < p < \infty$ and $-1 < \gamma < \infty$. The definition of suitable viscosity solutions to (2.1) requires some care because the operator may be singular. Nonetheless, a definition that fits our needs can be found in [22]. First set whenever $Du \neq 0$. We define $\mathcal{F}$ to be the set of functions $f \in C^2([0,\infty))$ such that and moreover we require for $g(x) := f(|x|)$ that $\lim_{x\to 0,\,x\neq 0} F(Dg(x), D^2g(x)) = 0$. Further, let If $D\varphi \neq 0$, a $C^2$-function is automatically admissible.
The definition for touching (strictly) from above is analogous.
(i) u is lower semicontinuous, (ii) u is finite in a dense subset of $\Omega_T$, (iii) for all admissible $\varphi \in C^2(\Omega_T)$ touching u at $(x_0,t_0) \in \Omega_T$ from below The definition of a subsolution $u : \Omega_T \to \mathbb{R}\cup\{-\infty\}$ is analogous except that we require upper semicontinuity, touching from above, and we reverse the inequalities above: in other words, u is a viscosity subsolution if $-u$ is a viscosity supersolution. If a continuous function is both a viscosity super- and subsolution, it is a viscosity solution.
It is shown in [16] that if γ = p−2 > −1, then the above notion coincides with the notion of p-super/subparabolic functions, having a direct connection to the distributional weak super/subsolutions as well.Moreover, if γ ≥ 0, then viscosity solutions can be defined in a standard way by using semicontinuous envelopes, see Proposition 2.2.8 in [12].
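The operators introduced in this section act pointwise on the gradient and the Hessian, so they are easy to evaluate once $Du$ and $D^2u$ are known at a point. The following sketch assumes the standard definitions $\Delta u = \operatorname{tr} D^2u$, $\Delta_\infty u = \langle D^2u\,Du, Du\rangle$ and $\Delta_\infty^N u = |Du|^{-2}\Delta_\infty u$; the function names are hypothetical.

```python
import math

def laplacian(hess):
    """Trace of the Hessian."""
    return sum(hess[i][i] for i in range(len(hess)))

def infinity_laplacian(grad, hess):
    """<D^2u Du, Du>."""
    n = len(grad)
    return sum(hess[i][j] * grad[i] * grad[j] for i in range(n) for j in range(n))

def normalized_infinity_laplacian(grad, hess):
    """<D^2u Du, Du> / |Du|^2, defined only where Du != 0."""
    g2 = sum(g * g for g in grad)
    if g2 == 0:
        raise ValueError("normalized infinity Laplacian needs Du != 0")
    return infinity_laplacian(grad, hess) / g2

def general_operator(grad, hess, p, gamma):
    """|Du|^gamma (Laplacian + (p-2) normalized infinity Laplacian), for Du != 0."""
    grad_norm = math.sqrt(sum(g * g for g in grad))
    bracket = laplacian(hess) + (p - 2) * normalized_infinity_laplacian(grad, hess)
    return grad_norm ** gamma * bracket
```

As a check, take $u(x) = |x|^2/2$ in the plane at $x = (3,4)$, so that $Du = x$ and $D^2u = I$. With $p = 4$ and $\gamma = p-2 = 2$ the operator returns $(n+p-2)|x|^{p-2} = 100$, which agrees with $\Delta_p u = \operatorname{div}(|x|^{p-2}x)$ at that point.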

PLAN OF PROOF
In this section we explain the idea of the proof of Theorem 1.1 and our plan of the proof.
3.1. Derivation of a basic estimate. In order to prove second order estimates, we first derive a key basic estimate (3.4) (or actually an equality at this point). To this end, we regularize the original equation (1.1) and consider the regularized equation (3.1) for small $\varepsilon > 0$. Solutions to this equation are smooth according to the standard theory. We differentiate equation (3.1) with respect to $x_k$, $k = 1, \dots, n$, and find that the spatial partial derivatives $u^\varepsilon_{x_k}$, $k = 1, \dots, n$, solve the equation where Here $I$ denotes the identity matrix.
We continue with the intention to study the derivatives of $|Du|^{\frac{p-2+s}{2}}Du$; in particular the choice $s = 2-p$ corresponds to $D^2u$. We multiply the differentiated equation (3.2) by a suitable weight containing $u^\varepsilon_{x_k}$ and obtain (3.3). Using the chain rule and summing (3.3) over $k = 1, \dots, n$, we obtain the identity (3.4). Here we assume that $s \neq \gamma - p$. This is not restrictive, because such a value of s violates the resulting range condition (1.4) in any case. It is important that the terms on the right hand side are in divergence form and can thus be well estimated. An important step towards the desired result would be a pointwise inequality, which could then be integrated to obtain the final result; for this we need to estimate the excess terms on the left hand side of (3.4).

3.2. Formal calculation for smooth solutions with a nonvanishing gradient. Compared to our earlier work [11], where we treated the case $\gamma = p-2$, we now have two extra difficulties in the general case $-1 < \gamma < \infty$. The first difficulty arises from the fourth term on the left hand side of (3.4), that is, Note that this mixed term vanishes if $\gamma = p-2$. In general we regard the term as mixed in the sense that we cannot determine its sign from the sign of the coefficient $p-2-\gamma$.
We first discuss the difficulty of mixed terms in the formal case with $\varepsilon = 0$, and denote a solution by u. In this case, we assume in addition that $Du \neq 0$. As indicated above, we would like to estimate the excess term in (3.4) and obtain an estimate for $|Du|^{p-2+s}|D^2u|^2$ with the range (1.4). To this end, we write the fundamental inequality (1.3) in the form and employ it in identity (3.4) on the term $|Du|^{p-2+s}|D^2u|^2$ to obtain that where $\Delta_T u := \Delta u-\Delta_\infty^N u$. Note that $|D_T|Du||^2 \ge 0$. Sometimes $\Delta_T u$ is called the normalized 1-Laplacian for the obvious reason. Except for the mixed term, which is the last term on the left hand side of (3.5), the nonnegativity of the other terms on the left hand side of (3.5) follows easily from the restriction $s > -1$. In order to develop a systematic way of checking nonnegativity of the mixed term utilizing other terms, we use equation (1.1) to rewrite the terms containing $u_t$, and view the mixed term $\Delta_T u\,\Delta_\infty^N u$ as a part of a quadratic form of $\Delta_T u$ and $\Delta_\infty^N u$. That is, we consider It turns out that in order to derive the desired estimate, it suffices to ensure, along with a few other conditions, that the quadratic form Q is strictly positive in $\mathbb{R}^2\setminus\{0\}$, that is, M is positive definite. However, the range condition in (1.4) does not suffice to guarantee the positive definiteness of Q, hence we need to improve the estimate. We employ the following observation: If $q > 1$, then holds for any smooth function u with nonvanishing gradient. In other words, the quantity on the left hand side is a 'good term' with a hidden divergence structure.
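The reduction of the sign question for the mixed term to positive definiteness of a $2\times 2$ matrix can be made concrete. The sketch below treats a generic symmetric matrix $M$ acting on the pair $(\Delta_T u, \Delta_\infty^N u)$ and checks positivity by Sylvester's criterion; the sample matrices are hypothetical and are not the coefficient matrices of the paper.

```python
import math

def quadratic_form(m, x):
    """Q(x) = <x, M x> for a symmetric 2x2 matrix m and x in R^2."""
    return m[0][0] * x[0] * x[0] + 2 * m[0][1] * x[0] * x[1] + m[1][1] * x[1] * x[1]

def is_positive_definite_2x2(m):
    """Sylvester's criterion: both leading principal minors are positive."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

def min_on_unit_circle(m, samples=3600):
    """Smallest sampled value of Q on the unit circle (a numerical cross-check)."""
    return min(
        quadratic_form(m, (math.cos(2 * math.pi * k / samples),
                           math.sin(2 * math.pi * k / samples)))
        for k in range(samples)
    )
```

The off-diagonal entry plays the role of the mixed coefficient: it may have either sign, and the form stays positive exactly when the determinant condition holds.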
It is easier to utilize this observation with inequality (3.5) if we rewrite the right hand side of that inequality using equation (1.1). To be more precise, where the last term now matches (3.6) upon setting $q := p+s-\gamma$. On the other hand, for a solution u, by equation (1.1) and by the definition of the normalized q-Laplacian $\Delta_q^N u$, one has and thus The idea is to add $u_t\operatorname{div}(|Du|^{p-2+s-\gamma}Du)$ with a suitable weight on both sides of (3.5): by the above equation, this produces new coefficients on the left hand side that can be utilized later to get a better range, and controllable terms on the right hand side by (3.7). We also add another positive weight by using from Lemma 4.2 below, which holds for any smooth function with nonvanishing gradient. This allows us to obtain simplified coefficients in intermediate steps. Thus we obtain (3.10), which reduces to (3.5) if $w_1 = 1$ and $w_2 = 1$. Calculations reveal that if the range condition (1.4) holds, then the weights $w_1$ and $w_2$ can be adjusted so that the weighted quadratic form is positive definite. This positivity in the formal case $\varepsilon = 0$ is shown in Lemma 5.2. By Proposition 5.1, this then implies the desired estimate Heuristically, in order to prove the above estimate, and setting $s = 2-p$ for simplicity, we could have left a small piece of $|D^2u|^2$ when applying the fundamental inequality for (3.5).
Then the rest of the terms can be dropped by the above positivity result: in detail this is implemented in Lemma 4.4, also for other values of s. The obtained pointwise estimate can then be integrated by parts along with a cutoff function to get Proposition 5.1.

3.3. Solutions without smoothness assumptions and regularized equation. The second difficulty, which is related to the regularization, is that the left hand side of (3.4) consists of regularized versions of second order derivative quantities, whereas employing the fundamental inequality (1.3) results in quantities like This mismatch means that some of the formal calculations do not work as such but require further care: in particular, positive definiteness of the quadratic form becomes an issue.
For a certain range of parameters, the main result is obtained by a straightforward generalization of the formal calculation ($\varepsilon = 0$) in the previous section. However, in the process of extending the range, we consider where $w_1, w_2, w_3, w_4 \in \mathbb{R}$. Compared to the right hand side of (3.10), or (6.3), this sum has two additional terms with weights $w_3$ and $w_4$. The latter additional term has a hidden divergence structure, similarly to (3.6). These divergence structures can be used to adjust the coefficients on the left hand side of the estimate (3.10), and thus to improve the range of parameters. To be more precise, we denote and obtain The second mixed term of (3.11) can also be written as a part of the quadratic form as follows with weight $w_4$, where by using the regularized equation and recalling the shorthand notation $\Delta_T u^\varepsilon := \Delta u^\varepsilon-\Delta_\infty^N u^\varepsilon$. This will give rise to new coefficients and thus to a better range condition.
In order to produce new coefficients on the left hand side of (3.4), especially for the second order term, and also to improve the range of the parameters, we add another divergence structure Also observe that the above choice of the power $(p-2+s)/2-1$ will be useful in the proof of Lemma 4.5 when deriving an upper bound for the left hand side of the estimate, after integration by parts, where we estimate $\varepsilon/(|Du^\varepsilon|^2+\varepsilon) \le 1$ and thus the additional $-1$ in the power gets canceled out. Besides, the error terms obtained in Lemmas 4.4 and 4.7 in [10] can be seen as special cases of the error terms above. Then combining (3.8), (3.9), (3.13) and (3.14) together with definition (3.11) of S, we get where $c_1$, $c_2$, $c_3$ and $c_4$ depend on $w_1$, $w_2$, $w_3$, $w_4$ and $\theta$ as computed in detail in Section 4.2.
Then we again use the fundamental inequality on part of $c_1|D^2u^\varepsilon|^2$ and find weights $w_1$, $w_2$, $w_3$ and $w_4$ such that the last three terms on the left hand side can be interpreted as a positive definite quadratic form and thus removed. Finally, S on the right hand side can be multiplied by a cutoff function and integrated by parts to get the final estimate. However, the nonnegativity can only be checked in certain ranges, since it needs to hold uniformly for all $\theta \in [0, 1)$.
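The requirement that positivity holds uniformly in $\theta$ can be explored numerically before attempting a proof. The sketch below assumes the shorthand $\theta = |Du^\varepsilon|^2/(|Du^\varepsilon|^2+\varepsilon)$ and $\kappa = \varepsilon/(|Du^\varepsilon|^2+\varepsilon)$, which is consistent with $\theta \equiv 1$, $\kappa \equiv 0$ for $\varepsilon = 0$ but is an assumption here, and it tests a user-supplied $\theta$-dependent $2\times 2$ matrix on a grid. The matrix families in the usage note are hypothetical, and a grid test only gives evidence, not a proof.

```python
def theta_kappa(grad_norm, eps):
    """Assumed shorthand: theta = |Du|^2/(|Du|^2+eps), kappa = eps/(|Du|^2+eps)."""
    denom = grad_norm ** 2 + eps
    return grad_norm ** 2 / denom, eps / denom

def positive_definite_for_all_theta(matrix_of_theta, samples=1000):
    """Check Sylvester's criterion for matrix_of_theta(theta) on a grid in [0, 1)."""
    for k in range(samples):
        theta = k / samples
        m = matrix_of_theta(theta)
        if not (m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0):
            return False
    return True
```

For instance, the family $\theta \mapsto \begin{pmatrix} 1 & \theta \\ \theta & 2 \end{pmatrix}$ passes the test, while $\theta \mapsto \begin{pmatrix} 1 & 2\theta \\ 2\theta & 2 \end{pmatrix}$ fails once $\theta^2 \ge 1/2$.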

LEMMAS
In this section we prove several auxiliary tools. The lemmas in this section will be used to prove estimates for both $u^\varepsilon$, which solves (3.1) with $\varepsilon > 0$, and u, which solves (3.1) with $\varepsilon = 0$ and $Du \neq 0$. Therefore we state the lemmas in a generality that applies to both of these cases.
4.1. Hidden divergence structures. In this subsection we gather some useful facts about generic smooth functions. First, if u : Note that if $(x_0,t_0) \in \Omega_T$ is a space-time point where $|Du|$ is differentiable and $Du(x_0,t_0) = 0$, then $D|Du|(x_0,t_0) = 0$. Indeed, if we had $D|Du|(x_0,t_0) \neq 0$, then we could find a point $\xi \in \Omega\times\{t_0\}$ (close to $(x_0,t_0)$) such that $|Du|(\xi) < 0$, which is obviously impossible. On the other hand, if $Du(x_0,t_0) \neq 0$ for some $(x_0,t_0) \in \Omega_T$, then $|Du|$ is differentiable at $(x_0,t_0)$ and $D|Du| = \frac{D^2u\,Du}{|Du|}$. For those points where $|Du|$ is differentiable, let us define the part of $D|Du|$ which is tangential to the spatial level sets of u as and its orthogonal counterpart, the normalized infinity Laplacian, as We employ this notation to write and If $n = 2$, we have equality in place of the inequality.
For the proof of Lemma 4.1, we refer to [25,11].
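The splitting of $D|Du|$ into a tangential and a normal part can be illustrated numerically. The sketch below assumes that, at a point with $Du \neq 0$, one has $D|Du| = D^2u\,Du/|Du|$, that the normal component $\langle D|Du|, Du/|Du|\rangle$ equals $\Delta_\infty^N u$, and that $D_T|Du|$ is the remainder; it then evaluates the planar identity $|D^2u|^2 = 2|D_T|Du||^2+(\Delta_T u)^2+(\Delta_\infty^N u)^2$ with $\Delta_T u = \Delta u-\Delta_\infty^N u$. These formulas and the function names are assumptions made for illustration.

```python
import math

def gradient_norm_derivative(grad, hess):
    """D|Du| = D^2u Du / |Du| at a point where Du != 0."""
    n = len(grad)
    norm = math.sqrt(sum(g * g for g in grad))
    return [sum(hess[i][j] * grad[j] for j in range(n)) / norm for i in range(n)]

def split_normal_tangential(grad, hess):
    """Return (normal component of D|Du|, tangential part D_T|Du|)."""
    n = len(grad)
    norm = math.sqrt(sum(g * g for g in grad))
    nu = [g / norm for g in grad]                        # unit normal of the level set
    d_norm = gradient_norm_derivative(grad, hess)
    normal = sum(d_norm[i] * nu[i] for i in range(n))    # normalized infinity Laplacian
    tangential = [d_norm[i] - normal * nu[i] for i in range(n)]
    return normal, tangential

def planar_identity_gap(grad, hess):
    """|D^2u|^2 - (2|D_T|Du||^2 + (Delta_T u)^2 + (Delta_inf^N u)^2) for n = 2."""
    normal, tangential = split_normal_tangential(grad, hess)
    lap_t = hess[0][0] + hess[1][1] - normal             # Delta_T u
    hess_norm2 = sum(h * h for row in hess for h in row)
    return hess_norm2 - (2 * sum(t * t for t in tangential) + lap_t ** 2 + normal ** 2)
```

With $Du = (1,2)$ and Hessian entries $u_{11} = 0.3$, $u_{12} = -0.7$, $u_{22} = 1.1$, the normal component is $1.9/5 = 0.38$ and both sides of the planar identity equal $2.28$.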
The following lemmas show that certain terms that first appear to be in non-divergence form can actually be expressed in divergence form. On the other hand, these structures can be utilized in tuning the coefficients in the quadratic form as explained in Section 3.3, and thus they improve the range we obtain. The first, Lemma 4.2, will mainly adjust the coefficient of the term The second divergence structure, Lemma 4.3, will produce certain new coefficients in the quadratic form Q. The proofs of both of these lemmas are direct calculations.
Lemma 4.2 (Hidden divergence structure 1). Let $u : \Omega_T \to \mathbb{R}$ be a smooth function. Then for any $\alpha \in \mathbb{R}$ and $\varepsilon > 0$, Furthermore, if $Du \neq 0$, then the above equality holds also for $\varepsilon = 0$.
Proof. By the chain rule, the right hand side The next lemma demonstrates that a mixed term can be written in divergence form. On the other hand, by using equation (3.1), as explained in (3.13), the mixed term contributes to the quadratic form, and thus adding such mixed terms can be used to improve the range.
Furthermore, if $Du \neq 0$, then the above equality holds also for $\varepsilon = 0$.
Proof. We give the proof when $\beta \neq -2$; the second case is similar. By the chain rule again, one has For $\alpha \in \mathbb{R}$, we denote the 'first good divergence structure' as and the 'second good divergence structure' Then as explained in (3.11), we consider the following weighted sum of these 'good structures', for some parameter $s \in \mathbb{R}$ and some weights $w_1, w_2, w_3, w_4 \in \mathbb{R}$. Observe that, taking into account Lemmas 4.2 and 4.3, S introduced above coincides with S in (3.11), i.e. the notation is consistent. The reason for using the mixed term form in S there was to emphasize the idea that we can improve the range by adding the mixed terms. To derive the final estimate, we need terms in divergence form, and therefore this form was used in the above definition of S, but as stated they are equivalent.

4.2. The key estimate. As explained in (3.15), S represents the right hand side in our key estimate, and on the left we should have the second derivatives and a positive definite quadratic form. In this section, we derive the key estimate corresponding to (3.15) in detail.
We use Lemmas 4.2 and 4.3 to rewrite S as a linear combination of time derivatives and second order spatial derivative quantities, similarly to the left hand side of (3.4). First recall the shorthand notation $\theta$ and $\kappa$ from (3.12) In particular, if $\varepsilon = 0$ and the gradient does not vanish, then $\theta \equiv 1$ and $\kappa \equiv 0$. Next we recall the definition of S from above, and use the good divergence structures, i.e. For a smooth solution and indeed for any smooth function, we have Then by simplifying, we get almost everywhere in $\Omega_T$, where Observe that given $p$, $\gamma$ and $s$, if $\varepsilon = 0$, then $c_1, \dots, c_4$ reduce to constants that only depend on $w_1$ and $w_2$, which shows that in the smooth case, by adjusting $w_1$ and $w_2$, we can get the desired estimate as explained in (3.10). By employing expressions (4.1) and (4.2), we can write almost everywhere in $\Omega_T$. Next we use the regularized equation (3.1) to replace the time derivatives $u_t$ in (4.6) with spatial derivatives. Thus we arrive at the key estimate for a smooth solution to the regularized equation (which is actually an equality at this point) where for the sake of brevity. We rewrite this as where Note that has an upper bound that only depends on $p$, $\gamma$ and $s$ once $w_1$, $w_2$, $w_3$ and $w_4$ are fixed.

4.3. Auxiliary lemmas. In this subsection we state two technical lemmas that can be used to conclude our main integral estimate. We want to apply the fundamental inequality, Lemma 4.1, to estimate $|D^2u^\varepsilon|^2$ in (4.8) from below to improve the range condition by using the terms we obtain in this application. However, the direct application will eliminate the full Hessian $|D^2u^\varepsilon|^2$ that we want to estimate. We could leave a small fraction of $|D^2u^\varepsilon|^2$ (as the method was first described, for simplicity, at the end of Section 3.2) and apply the fundamental inequality only to the remaining part, but actually this will not be necessary: the next lemma shows that already a seemingly weaker lower bound is sufficient. This will simplify the exposition.
Lemma 4.4. Let $u^\varepsilon : \Omega_T \to \mathbb{R}$ be a smooth solution to (3.1), S as in (4.4), $c_1$ as in (4.5), and $\varepsilon \ge 0$. If $\varepsilon = 0$, we assume in addition that $Du^\varepsilon \neq 0$. Suppose that we can select $w_1, w_2, w_3, w_4 \in \mathbb{R}$

and a uniformly bounded positive definite (with a uniform constant) symmetric matrix M
Proof. Recall that which, as pointed out in (4.8), can be written as almost everywhere in $\Omega_T$, where $x$ and $N$ are as in (4.8). Observe that we utilized equation (3.1) at this step to get rid of the time derivatives.
For any $\lambda \in (0, 1)$, we write and use the assumption (4.9) to estimate $(1-\lambda)S$ from below. We end up with We claim that we can select $\lambda > 0$ such that $c+\lambda(c_2-c) \ge 0$ and $M+\lambda(N-M)$ is a positive definite matrix. Indeed, since $c > 0$, then and the second principal minor is the determinant, i.e.
Hence we choose $\lambda$ such that Since we have now proven the nonnegativity of the excess terms, the result follows.
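The existence of an admissible $\lambda$ rests on the fact that positive definiteness is an open condition: if $M$ is positive definite, then $M+\lambda(N-M)$ stays positive definite for small $\lambda > 0$, whatever the symmetric matrix $N$ is. The sketch below finds such a $\lambda$ by repeated halving, which is a swapped-in search and not the explicit choice made in the proof; the matrices in the usage note are hypothetical.

```python
def is_positive_definite_2x2(m):
    """Sylvester's criterion: both leading principal minors are positive."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0

def interpolate(m, n, lam):
    """Entrywise M + lam (N - M)."""
    return [[m[i][j] + lam * (n[i][j] - m[i][j]) for j in range(2)] for i in range(2)]

def find_lambda(m, n, max_halvings=60):
    """Halve lam, starting from 1/2, until M + lam (N - M) is positive definite."""
    if not is_positive_definite_2x2(m):
        raise ValueError("M must be positive definite")
    lam = 0.5
    for _ in range(max_halvings):
        if is_positive_definite_2x2(interpolate(m, n, lam)):
            return lam
        lam /= 2
    raise RuntimeError("no admissible lam found")
```

For $M = I$ and $N$ with diagonal entries $-5$ and off-diagonal entries $3$, the search rejects $\lambda = 1/2, 1/4, 1/8$ and accepts $\lambda = 1/16$. In the lemma the scalar condition $c+\lambda(c_2-c) \ge 0$ has to hold for the same $\lambda$, which can be arranged in the same way since $c > 0$.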
The following lemma shows that we can derive the desired integral estimate from the pointwise lower bound. The proof uses rather standard techniques and is based on localization with a suitable cutoff function and then integration by parts. For the convenience of the reader, we give the details in the appendix.
Lemma 4.5. Let $u^\varepsilon : \Omega_T \to \mathbb{R}$ be a smooth solution to (3.1), and S as in (4.4). If $\varepsilon = 0$, we assume in addition that $Du^\varepsilon \neq 0$. Suppose that we can find weights $w_1, w_2, w_3, w_4 \in \mathbb{R}$ for some constant $\lambda = \lambda(n, p, \gamma, s, w_1, w_2, w_3, w_4) > 0$. If $s \neq \gamma - p$, then for any concentric parabolic cylinders $Q_r \subset Q_{2r} \Subset \Omega_T$ with center point $(x_0,t_0) \in \Omega_T$, we have the estimate where $C = C(n, p, \gamma, s, \lambda, w_1, w_2, w_3, w_4) > 0$.
The last two integrals on the right hand side of (4.11) do not appear if $s \neq \gamma-p+2$. The source of such error terms in the case $s = \gamma-p+2$ is the logarithm in Lemma 4.3 when $\beta = -2$.

SMOOTH CASE WITH NON-ZERO GRADIENT
Let $1 < p < \infty$ and $-1 < \gamma < \infty$. In this section we assume that u is a smooth solution such that $Du \neq 0$. That is, u does not have critical points in space. Our main result in this case is the following a priori estimate. Usually extending a regularity result to a general nonsmooth case is quite straightforward.
Proposition 5.1. Let $n \ge 2$, $1 < p < \infty$ and $-1 < \gamma < \infty$. Let $u : \Omega_T \to \mathbb{R}$ be a smooth solution to (5.1) then for any concentric parabolic cylinders $Q_r \subset Q_{2r} \Subset \Omega_T$, we have the estimate
The following lemma, Lemma 5.2, is the main ingredient in the proof of Proposition 5.1. Thus we postpone the proof of Proposition 5.1 until after the proof of Lemma 5.2.
In the following lemma we consider the weighted sum where $w_1, w_2 \in \mathbb{R}$, and the notation was defined in (4.4). Note that since $\varepsilon = 0$ in this section, the terms with weights $w_3$ and $w_4$ in (4.4) disappear. The purpose of Lemma 5.2 is to show that under restriction (5.2), we can find positive weights $w_1 = w_1(n, p, \gamma, s) > 0$ and $w_2 = w_2(n, p, \gamma, s) > 0$ such that S has a suitably nonnegative lower bound to make Lemma 4.4 applicable. Moreover, by the proof of Lemma 5.2 and Sylvester's condition, we can choose the value $c = c(n, p, \gamma, s) > 0$ small enough such that for M in the proof it holds that $M_{11} \ge c$ and $\det(M) \ge c$.
The proof of Proposition 5.1 is then finished by using Lemma 4.5.

and a uniformly bounded positive definite (with a uniform constant) symmetric matrix M
Proof. Similarly as in (4.7), recalling that $\varepsilon = 0$, by expressions (4.5), we arrive at (5.3). We estimate $|D^2u|^2$ on the left hand side of (5.3) from below by the fundamental inequality, Lemma 4.1. This yields the following lower bound for S where is a symmetric $2\times 2$-matrix. We claim that under assumption (5.2) we can choose $w_1, w_2 \in \mathbb{R}$ such that M is uniformly bounded positive definite (with a uniform constant).
If $n = 2$, this is easy to see by selecting suitable $w_1$ and $w_2$. In other words, with such a choice of $w_1$ and $w_2$, the mixed term $\Delta_T u\,\Delta_\infty^N u$ vanishes. Notice that (5.2) implies that $w_1 > 0$ and $w_2 > 0$.
For the higher dimensional case $n \ge 3$, we set $w_1 = 1$ and find $w_2 = w_2(n, p, \gamma, s) > 0$ such that M is uniformly bounded positive definite (with a uniform constant). This is possible precisely when (5.2) holds. Since the proof is quite tedious, we postpone it to Lemma B.1 in the appendix.
We are ready to give the proof of Proposition 5.1.
Proof of Proposition 5.1. Let us fix $w_1 = w_1(n, p, \gamma, s) > 0$ and $w_2 = w_2(n, p, \gamma, s) > 0$ according to Lemma 5.2. Lemma 4.4 is then applicable because $w_1 > 0$ and $w_3 = 0$ imply that $c_1 = w_1+w_3\kappa > 0$, and the conclusion of Lemma 5.2 implies that (4.9) holds. Therefore, by Lemma 4.4 there exists $\lambda = \lambda(n, p, \gamma, s, w_1, w_2, w_3, w_4) > 0$ such that in $\Omega_T$. Now the desired estimate follows from Lemma 4.5.
Range (5.2) in Proposition 5.1 is optimal in the following sense: In the elliptic case, [10] and [25], the best known range is $s > -1-\frac{p-1}{n-1}$. On the other hand, Example 5.1 below shows that in the parabolic case we cannot hope to reach any better range than $s > \gamma+1-p$. A counterexample of this type was used in [10, Section 1.3] for the standard p-parabolic equation.
for some $C \in \mathbb{R}$ and $\alpha > 0$. Note that then u solves (2.1) in the classical sense whenever $x_1 \neq 0$. Indeed, by a direct computation, we have , $u_{x_i x_j} = 0$, where $i, j = 1, \dots, n$ and i and j are not both 1.
Since $C > 0$, the function u is a viscosity supersolution. The given function is also a viscosity solution whenever $-1 < \gamma \le 0$: the proof for the supersolution property is the same as in the degenerate case above. It is also a subsolution because (similarly to the degenerate case) there are no admissible test functions touching u from above. We provide a detailed proof of this fact. Striving for a contradiction, suppose that there is an admissible test function $\varphi$ touching u at $(x_0,t_0)$ with $x_0 = (0, \dots, 0)$ (for simplicity) from above. Then necessarily By the definition of a viscosity solution it holds that is an admissible test function touching strictly from above. By strict touching and regularity of u, by translating with respect to $x_1$ and lifting we may assume that $\phi$ touches u at a point $(x,t_0)$, $x = (\varepsilon, 0, \dots, 0)$, with small $\varepsilon > 0$. Also observe that by an approximation, we could assume that $\sigma$ is a $C^2$ function, but we omit this step as well. Also recall the notation $g(y) = f(|y|)$ and that $\lim_{y\to 0,\,y\neq 0} F(Dg(y), D^2g(y)) = 0$.
Then by this and the counter assumption, it holds at a point $(x, t_0)$, for $x$ close enough to $x_0$, that inequality (5.4) is valid. On the other hand, since $u$ is now a $C^2$ function with an explicit formula, we have an inequality which contradicts (5.4).
In the above inequality we used the fact that, since $\varphi$ touches $u$ from above at $(x, t_0)$, we have $D^2 g(x) \ge D^2 u(x, t_0)$ and $Dg(x) = Du(x, t_0) \neq 0$, and thus

We study the local $W^{1,2}$-regularity of $|Du|^{\frac{p-2+s}{2}} Du$ for $s \in \mathbb{R}$ and see what kind of restrictions for $s$ arise. We have

The function $D\big(|Du|^{\frac{p-2+s}{2}} Du\big)$ locally belongs to $L^2$ only when $s$ satisfies a corresponding restriction. Observe that range condition (5.2) gives this in the plane, but in higher dimensions we have an additional restriction, which is the same restriction as in the elliptic case. When $s = 2 - p$, that is, for $W^{2,2}$-regularity, the range is sharp in the plane.
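To see how the two restrictions interact for $W^{2,2}$-regularity, one can specialize them to $s = 2 - p$. The computation below is only a sketch, under the assumption that the restrictions in question are the two lower bounds quoted earlier in this section, namely $s > \gamma + 1 - p$ and $s > -1 - \frac{p-1}{n-1}$; the display of (5.2) itself is not reproduced here.

```latex
% Assumption: the two lower bounds on s are those quoted in the text; n >= 2.
\begin{align*}
2 - p > \gamma + 1 - p
  &\iff \gamma < 1, \\
2 - p > -1 - \frac{p-1}{n-1}
  &\iff (3 - p)(n - 1) + (p - 1) > 0 \\
  &\iff 3n - 4 - p\,(n - 2) > 0 .
\end{align*}
% n = 2: the last inequality reads 2 > 0 and holds for every p.
% n >= 3: the last inequality is equivalent to
\[
p < \frac{3n - 4}{n - 2} = 3 + \frac{2}{n - 2} .
\]
```

Under this assumption, the planar case imposes only $\gamma < 1$, while an upper bound on $p$ appears for $n \ge 3$, in agreement with the remark that the additional restriction arises only in higher dimensions.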
Remark 5.1. The case $n = 1$ also holds. Recall that the key point is identity (3.4), that is, (5.5). The left hand side of (5.5) is for some constant $\lambda = \lambda(p, \gamma, s) > 0$, provided that $s > \gamma + 1 - p$. From this it is easy to derive the desired integral estimate. We conclude that Proposition 5.1 holds in the case $n = 1$ without the additional smoothness assumptions on $u$, and with the interpretation

REMOVING THE SMOOTHNESS ASSUMPTION
Section 5 gives a formal derivation of the regularity estimate under the assumption that the gradient of a solution does not vanish. In this section, we remove this additional assumption in a certain range of parameters by regularizing the equation and then passing to the limit to obtain the result for the original equation.

6.1. Regularization. Let $u^\varepsilon : \Omega_T \to \mathbb{R}$ be a smooth solution to equation (6.1), where $1 < p < \infty$, $-1 < \gamma < \infty$, and $\varepsilon > 0$ is a regularization parameter. As explained in Section 3.3, there is a mismatch between the second order differential quantities in the fundamental inequality and those in the regularized equation, and consequently in the basic estimate. Because of this mismatch, some of the formal calculations do not work as such, even if most of the steps work for general $s$. In particular, positive definiteness of the quadratic form becomes an issue.
In this section, partly for the convenience of the reader, we limit ourselves to the planar case $n = 2$ and focus on the square-integrability of the second order derivatives $D^2 u$, that is, we consider the case $s = 2 - p$. In this case the range condition in (1.4), that is, condition (6.2), which corresponds to range (i) in Theorem 1.1 and in the Proposition below, will be obtained by a straightforward generalization of the formal calculation ($\varepsilon = 0$). That is, we consider the sum and show that if (6.2) holds, then we can find $w_1, w_2 > 0$ such that where $c > 0$ and $Q$ is positive definite. For range (ii) in the Proposition below, which is the same as in Theorem 1.1, we instead consider the full $S$ as defined in (3.11) or equivalently in (4.4).
Our main result for $u^\varepsilon$ is the following.

Proposition 6.1. Let $n = 2$. Let $u^\varepsilon : \Omega_T \to \mathbb{R}$ be a smooth solution to (6.1). If $p$ and $\gamma$ satisfy one of the following conditions:

The proof of Proposition 6.1 is postponed to the end of the section. The main ingredients of the proof are Lemma 6.2, which yields case (i), and Lemma 6.3, which yields case (ii). In both lemmas we consider the same weighted sum as before, now selecting $s = 2 - p$, i.e.
where $w_1, w_2, w_3, w_4 \in \mathbb{R}$ are some weights, and the notation was defined in (4.4). The purpose of Lemma 6.2 and Lemma 6.3 is to show that under restrictions (i) and (ii), respectively, we can find suitable weights $w_1$, $w_2$, $w_3$ and $w_4$, depending only on $p$ and $\gamma$, such that $S$ has a suitable lower bound.

Lemma 6.2. Let $n = 2$, let $S$ be as in (6.4), and let (i) in Proposition 6.1 hold. For $\eta = \eta(p, \gamma) > 0$ small enough, if and a uniformly bounded positive definite (with a uniform constant) symmetric matrix $M = M(p, \gamma) \in \mathbb{R}^{2 \times 2}$.

Lemma 6.3. Let $n = 2$, let $S$ be as in (6.4), and let (ii) in Proposition 6.1 hold. If then a statement similar to that in Lemma 6.2 holds.
To begin with, recall from (4.7) that $S$ can be written as and Fundamental equality (4.3) in the plane yields that where To prove Lemma 6.2 and 6.3, it now suffices to check that (6.6) satisfies all the requirements of the lemmas. The coefficient of $|D_T|Du^\varepsilon||^2$ in (6.6) needs to be bounded from below by a positive constant, that is, uniformly in $\Omega_T$. For the quadratic form $Q$, we need to analyse the uniform boundedness and uniform positive definiteness of the matrix $M$. Uniform boundedness is quite straightforward, so we focus our attention on the uniform positive definiteness. By Sylvester's criterion, it suffices to check that and uniformly in $\Omega_T$.

Next we prove Lemma 6.2, which implies nonnegativity of the necessary terms when $1 < p \le 5$ and $-1 < \gamma < 1$. In this case the simple choice of weights $w_3 = w_4 = 0$ will work.
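For a symmetric $2 \times 2$ matrix, the use of Sylvester's criterion can be made explicit by completing the square. The following is a standard computation included for orientation; the entry names $M_{11}$, $M_{12}$, $M_{22}$ and the constants $c$, $C$ are generic and are not the specific expressions of (6.6).

```latex
% Generic symmetric matrix M with entries M_{11}, M_{12}, M_{22}; x = (x_1, x_2).
\[
\langle x, Mx\rangle
= M_{11}x_1^2 + 2M_{12}x_1x_2 + M_{22}x_2^2
= M_{11}\Big(x_1 + \frac{M_{12}}{M_{11}}\,x_2\Big)^2
  + \frac{\det(M)}{M_{11}}\,x_2^2 ,
\qquad M_{11} \neq 0 .
\]
% Hence M_{11} > 0 and det(M) > 0 imply that M is positive definite.
% Quantitative version: if M_{11} \ge c > 0, \det(M) \ge c and |M_{ij}| \le C, then
% both eigenvalues are positive, \lambda_{\max}(M) \le \operatorname{tr}(M) \le 2C, and
\[
\lambda_{\min}(M) = \frac{\det(M)}{\lambda_{\max}(M)}
\ge \frac{\det(M)}{\operatorname{tr}(M)} \ge \frac{c}{2C},
\qquad\text{so that}\qquad
\langle x, Mx\rangle \ge \frac{c}{2C}\,|x|^2 .
\]
```

This is why uniform lower bounds for the upper-left entry and the determinant, together with uniform boundedness of the entries, suffice for uniform positive definiteness.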
Proof of Lemma 6.2. Similarly to the smooth case, we start with $w_3 = w_4 = 0$, plug these values into (6.5), and obtain This together with (6.6) gives where we denote $R_\theta := 1 - \gamma\theta \in (0, 2)$ for the sake of brevity. To simplify the above identity, we select $w_2 = 2$. Thus Then, in the eligible range of parameters, $X_1'(\theta) = 0$ if and only if $p - 2 + \gamma = 0$. Hence, by considering the values at the endpoints, we obtain the supremum of $X_1$ with respect to $\theta$: Obviously, we have Thus, if $p$ and $\gamma$ satisfy range (i), then for small enough $\eta = \eta(p, \gamma) > 0$, in addition to the above choice $w_2 = 2$, we set The proof is finished.
Next we prove Lemma 6.3, which implies nonnegativity of the necessary terms when $1 < p < \infty$ and $-1 < \gamma < \sqrt{2} - \frac{1}{2}$. In this case, we use a choice of the weights which makes the coefficient of the mixed term $\Delta_T u^\varepsilon \Delta_\infty^N u^\varepsilon$ in (6.7) vanish. To be more precise, at the beginning of this section we obtained three conditions (6.8), (6.9) and (6.10), i.e. the lower bounds (6.11), each of the form $\ge c$, need to hold uniformly in $\Omega_T$. Here $M$ is the coefficient matrix of the quadratic form $Q$. To simplify the computations in checking the last condition in (6.11), we will consider a special case where the coefficient of the mixed term vanishes.

Proof. Recall that $\theta = 1 - \kappa$, $\kappa > 0$, and $P_\theta = (p - 2)\theta + 1$. Then, recalling the expressions of $c_1, \dots, c_4$ in (6.5), we can write the coefficient of the mixed term $\Delta_T u^\varepsilon \Delta_\infty^N u^\varepsilon$ as a polynomial of $\kappa$ as Setting all the coefficients to zero, we obtain the desired condition.
By the above lemma, we easily obtain the following result.

Corollary 6.5. If then the mixed term $\Delta_T u^\varepsilon \Delta_\infty^N u^\varepsilon$ in $Q$ vanishes.

The above corollary gives a choice of the coefficients $w_1$, $w_2$, $w_3$ and $w_4$ for which the coefficient of the mixed term $\Delta_T u^\varepsilon \Delta_\infty^N u^\varepsilon$ vanishes. This then helps us in proving Lemma 6.3.

Proof of Lemma 6.3. If $w_1$, $w_2$, $w_3$ and $w_4$ satisfy (6.12), then by Corollary 6.5, the last condition in (6.11) reduces to checking that 0, sufficient conditions to obtain (6.11) can be written as uniformly in $\Omega_T$.
First, using values (6.12) in the first condition and replacing $\theta$ by $1 - \kappa$, we have Since $\kappa$ is positive, the sign of the derivative with respect to $\kappa$, that is, of $4(4 - p + \gamma)\kappa$, is fixed. Thus $2c_1 + c_2$ is monotone in $\kappa$, and its minimum is attained at either $\kappa = 0$ or $\kappa = 1$. Hence $2c_1 + c_2 \ge \min\{2(p - \gamma), 8\} > 0$. For the second condition, when $w_2 = w_4 = 2$, it is obvious that Finally, for the last condition, plugging in values (6.12) and rewriting, we have When the derivative of $c_3 + c_4$ with respect to $\kappa$ vanishes, that is, Then the minimum point is one of the boundary points or the extreme point $\kappa_1$. Selecting $\kappa = \kappa_1$, we have if and only if If $\kappa = 0$, we have $c_3 + c_4 = 2(1 - \gamma)$, and if $\kappa = 1$, then $c_3 + c_4 = 4$. It follows that the minimum is given by the strictly positive expression (6.13), and the proof is finished.
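The monotonicity step for the first condition can be checked by hand. The formula below is not copied from the source display; it is reconstructed, as an assumption, from the derivative $4(4 - p + \gamma)\kappa$ and the endpoint values $2(p - \gamma)$ and $8$ stated above, with $\kappa \in [0, 1]$ as suggested by the endpoints used in the text. The name $h$ is shorthand introduced here.

```latex
% h(\kappa) := 2c_1 + c_2 with the weights (6.12); reconstructed from
% h'(\kappa) = 4(4 - p + \gamma)\kappa and h(0) = 2(p - \gamma).
\[
h(\kappa) = 2(p - \gamma) + 2(4 - p + \gamma)\,\kappa^2 ,
\qquad
h(1) = 2(p - \gamma) + 2(4 - p + \gamma) = 8 .
\]
% The sign of h' does not change on (0, 1], so h is monotone there and
\[
h(\kappa) \ge \min\{h(0), h(1)\} = \min\{2(p - \gamma),\, 8\} > 0 ,
\]
% where 2(p - \gamma) > 0 because p > 1 and \gamma < \sqrt{2} - 1/2 < 1.
```

The value at $\kappa = 1$ agrees with the endpoint value $8$ quoted in the proof, which serves as a consistency check for the reconstruction.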
The proof of Proposition 6.1 now immediately follows.
Proof of Proposition 6.1. The result immediately follows from the previous lemmas: under assumption (i), Lemma 6.2 implies that (4.9) holds and thus Lemma 4.4 is applicable.
Similarly, under assumption (ii), Lemma 6.3 implies that Lemma 4.4 is applicable. Now the desired estimate follows from Lemma 4.5.
6.2. Passing to the original equation. In this section we justify the limiting argument $\varepsilon \to 0$ in Proposition 6.1 and thus derive our main result, Theorem 1.1.
Proof of Theorem 1.1. Let $u : \Omega_T \to \mathbb{R}$ be a viscosity solution to is the parabolic boundary of $U_{t_1, t_2}$. By the classical theory of uniformly parabolic equations, the above problem has a unique solution Proposition 6.1 is applicable to $u^\varepsilon$ and we conclude that where $C = C(p, \gamma) > 0$. By [14], for any $Q_R \Subset U_{t_1, t_2}$ there exist positive constants $\alpha \in (0, 1)$ and $C > 0$, that are allowed to depend on $p$, $\gamma$, By the uniqueness theorem for viscosity solutions [22], we conclude that $\bar{u} = u$. By employing bound (6.15), we find that the right hand side of (6.14) is bounded from above by a constant independent of $\varepsilon$. Thus $\{D^2 u^\varepsilon\}_\varepsilon$ is bounded in $L^2(Q_r)$, and consequently we may extract a subsequence that converges weakly in $L^2(Q_r)$. Further, using integration by parts, we see that the limit is $D^2 u$, and thus $D^2 u \in L^2_{\mathrm{loc}}(\Omega_T)$. Finally, we conclude that which is the desired estimate.
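The identification of the weak limit by integration by parts can be spelled out as follows. This is a sketch under two assumptions that are labeled here and not quoted from the source displays: along the chosen subsequence $u^\varepsilon \to u$ locally uniformly, and $D^2 u^\varepsilon \rightharpoonup V$ weakly in $L^2(Q_r)$. The symbols $V$ and $\psi$ are notation introduced only for this sketch.

```latex
% Assumptions: u^\varepsilon \to u locally uniformly, D^2 u^\varepsilon \rightharpoonup V in L^2(Q_r).
% For every test function \psi \in C_c^\infty(Q_r) and all indices i, j:
\[
\int_{Q_r} u^\varepsilon\,\psi_{x_i x_j}\,dx\,dt
= \int_{Q_r} u^\varepsilon_{x_i x_j}\,\psi\,dx\,dt .
\]
% Letting \varepsilon \to 0, uniform convergence on the left, weak convergence on the right:
\[
\int_{Q_r} u\,\psi_{x_i x_j}\,dx\,dt
= \int_{Q_r} V_{ij}\,\psi\,dx\,dt ,
\]
% so V = D^2 u in the sense of distributions. Weak lower semicontinuity of the norm gives
\[
\|D^2 u\|_{L^2(Q_r)} \le \liminf_{\varepsilon \to 0}\,\|D^2 u^\varepsilon\|_{L^2(Q_r)} .
\]
```

The last inequality is what allows a bound that is uniform in $\varepsilon$ to be transferred to the limit $u$.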
It is possible to improve the ranges in Theorem 1.1.However, the computations get more technical, even if they follow the same ideas as above, and thus we have chosen to omit them.In any case the question whether the full range obtained in the smooth case in Proposition 5.1 can also be obtained here remains an open problem.
Next we give the proof of Corollary 1.2.
of the lemma. Assumption (A.1) can be written as for some absolute constant $C > 0$. We multiply (A.2) by $\varphi^2$ and integrate over $Q_{2r}$, and apply integration by parts to each integral on the right hand side to obtain for any $\eta > 0$ and some $C = C(n, p, \gamma, w_1, w_2, w_3, w_4) > 0$. Above we also employed estimate (A.4). Finally, we select $\eta > 0$ small enough and employ (A.3) and (A.5) to arrive at the desired estimate.

small enough. Next we recall that the boundedness and positive definiteness of $M$ imply $\|M\|_{L^\infty(\Omega_T)} \le C$, and $M_{11} \ge c$ and $\det(M) \ge c$ in $\Omega_T$ by Sylvester's criterion, upon choosing $c > 0$ small enough. For the positive definiteness of the matrix $M + \lambda(N - M)$ we can use Sylvester's criterion again and check that the leading principal minors are positive if $\lambda > 0$ is small enough. The first principal minor is the upper-left corner entry, i.e.

then $|Du|$ is locally Lipschitz continuous and thus, by Rademacher's theorem, differentiable almost everywhere on each time slice. Here and in similar occurrences in what follows, we write that $D|Du|$ exists almost everywhere in space.

APPENDIX B. POSITIVE DEFINITENESS CONDITION FOR THE COEFFICIENT MATRIX

In the proof of Lemma 5.2, we wrote one of the key estimates as
$$|Du|^{p-2+s}\Big(w_1 (p + s)\,|D_T|Du||^2 + Q\Big) \le S,$$
where $Q$ is a quadratic form in $\Delta_T u$ and $\Delta_\infty^N u$ whose mixed term is $\big(w_2(2p - 2 + s - \gamma) - w_1(p + s)\big)\Delta_T u\,\Delta_\infty^N u$. This can also be written as $Q = \langle x, Mx \rangle$, where $x = (\Delta_T u, \Delta_\infty^N u)^T \in \mathbb{R}^2$ is a vector and Then we stated that if $w_1 = 1$ and the range condition is satisfied, we can select $w_2 = w_2(n, p, \gamma, s) > 0$ in such a way that $Q$ is positive definite, which then allows us to get rid of the excess terms. Next we prove this fact.

Proof. We will show that $\det(M) > 0$ and $w_2 - \frac{n-2}{n-1} > 0$ with uniform lower bounds, and thus by Sylvester's criterion $M$ is uniformly bounded and positive definite with a uniform constant. We fix $w_1 = 1$ and introduce the following shorthand notation, We observe that $P, K, G, E > 0$ under the assumptions of the lemma. Using this notation, we rewrite the determinant as
$$\det(M) = a\Big(w_2 - \frac{n-2}{n-1}\Big)^2 + b\Big(w_2 - \frac{n-2}{n-1}\Big) + c.$$
The discriminant of such a polynomial is Notice that $b^2 - 4ac > 0$ and hence our polynomial has two distinct roots, unless $G = P$, in which case our polynomial is of the first order and has one root. Moreover, $\det(M) > 0$ if and only if $w_2 - \frac{n-2}{n-1}$ lies between these roots, that is, where These formulas are valid if $G \neq P$. Indeed, recall that $a < 0$ if $G \neq P$.
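The sign analysis of $\det(M)$ in the proof above can be summarized in generic form. In the sketch below, $t$ is shorthand introduced here for $w_2 - \frac{n-2}{n-1}$, and $a$, $b$, $c$ denote the coefficients of the polynomial whose discriminant $b^2 - 4ac$ appears in the proof; their explicit expressions in terms of $P$, $K$, $G$, $E$ are not reproduced. The facts $a < 0$ for $G \neq P$ and $b^2 - 4ac > 0$ are taken from the text, and $b \neq 0$ in the first order case is an assumption.

```latex
% t := w_2 - (n-2)/(n-1); a, b, c are the coefficients used in the proof.
\[
\det(M) = a\,t^2 + b\,t + c .
\]
% Case G \neq P: a < 0 and b^2 - 4ac > 0, so there are two real roots
\[
t_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}
\;<\;
t_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a},
\qquad
\det(M) = a\,(t - t_1)(t - t_2) > 0 \iff t_1 < t < t_2 .
\]
% Case G = P: the polynomial is of first order, \det(M) = b\,t + c,
% with the single root t = -c/b (assuming b \neq 0).
```

Since $a < 0$ reverses the order of the two roots, the smaller root is the one carrying the plus sign in front of the square root.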