Introduction

We study the following reaction-diffusion system in a bounded domain \(\Omega \subset {\mathbb {R}}^d\) with smooth boundary:

$$\begin{aligned} \partial _t{\textbf{u}}={\textbf{D}}\Delta _x{\textbf{u}}-{\textbf{f}}({\textbf{u}})+{\textbf{g}},\ \ {\textbf{u}}\big |_{t=0}=\textbf{u}_\textbf{0},\ {\textbf{u}}\big |_{\partial \Omega }=0 \end{aligned}$$
(1.1)

(endowed with the Dirichlet boundary conditions). Here \({\textbf{u}}=(u_1,\cdots ,u_k)\) is an unknown vector-valued function, \({\textbf{D}}:=(d_{ij})_{i,j=1}^k\) is a given constant diffusion matrix, and \({\textbf{f}}({\textbf{u}}):=(f_1({\textbf{u}}),\cdots ,f_k({\textbf{u}}))\) and \({\textbf{g}}:=(g_1,\cdots , g_k)\) are the given nonlinearity and external forces, respectively.

Equations of the form (1.1) model various classical phenomena in modern science, e.g., heat conduction, chemical kinetics, various quantum effects (Ginzburg-Landau equations), mathematical biology (Fitz-Hugh-Nagumo or Keller-Segel equations, if we allow the nonlinearity to depend also on \(\nabla _x{\textbf{u}}\)), etc., and have been intensively studied from both mathematical and applied points of view, see [3, 8, 10, 15, 24, 30, 34, 39, 41, 43] and references therein. In a sense, this is the most studied and perhaps the simplest model example of an evolutionary PDE which may generate non-trivial dynamics.

Since the analytic properties of the linear system associated with (1.1) are completely understood, the analogous properties for the nonlinear equation depend strongly on whether or not we are able to treat the term \({\textbf{f}}({\textbf{u}})\) as a perturbation. As usual, if we want to have global existence of a solution, we need to find the proper a priori estimates, usually with the help of energy functionals or some "wisely" chosen Lyapunov type functionals. This, in turn, requires some restrictions on the function \({\textbf{f}}\) and matrix \({\textbf{D}}\) (to prevent the finite-time blow up of solutions). Then, if the a priori estimates obtained are strong enough to treat the nonlinearity as a perturbation (the so-called subcritical case), the analytic properties of the nonlinear equation are usually the same as for the dominating linear one and a more or less complete theory is available. In contrast to this, in the supercritical case, the nonlinearity is strong enough to destroy the nice properties of the underlying linear equation, for instance, to produce the finite-time blow up of initially smooth solutions (despite the fact that the "energy" remains bounded and dissipative, see [6] for such phenomena in the complex Ginzburg-Landau equation, [36] for chemical kinetics equations or [25] for chemotaxis models). Usually, the sub/super criticality of the considered equation is determined by the growth rate of the nonlinearity \({\textbf{f}}({\textbf{u}})\), and the threshold depends on the a priori estimates available (through the choice of the phase space for the problem) and the space dimension (through Sobolev embedding theorems). Thus, the typical picture for equation (1.1) is the following: we have the so-called critical growth exponent \(p=p_{crit}>1\) and an extra condition

$$\begin{aligned} |{\textbf{f}}({\textbf{u}})|\le C(1+|{\textbf{u}}|^p),\ {\textbf{u}}\in {\mathbb {R}}^k \end{aligned}$$
(1.2)

on the nonlinearity and the equation is subcritical if \(p<p_{crit}\), critical if \(p=p_{crit}\) and supercritical if \(p>p_{crit}\), see [3, 8, 39, 41] for more details.

Note that in the scalar case \(k=1\), equation (1.1) possesses the maximum/comparison principle as well as a global Lyapunov functional (which can be obtained by formal multiplication of the equation by \(\partial _tu\) and integration in x) and these two properties simplify greatly the study of the equation. Indeed, one can take \(L^\infty (\Omega )\) as a natural phase space and compare the solution u of (1.1) with the corresponding solutions of ODEs

$$\begin{aligned} y'_\pm (t)=-f(y_{\pm }(t))\pm \Vert g\Vert _{L^\infty },\ \ y_\pm (0)=\pm \Vert u(0)\Vert _{L^\infty }. \end{aligned}$$

This gives the global existence of the solution u(t) under the following Osgood type conditions which are somehow close to the optimal ones, namely, the global solvability holds if there exists a positive smooth function \(\psi \) such that

$$\begin{aligned} f(u){\text {sgn}}(u)\ge -\psi (|u|), \ \ \forall u\in {\mathbb {R}}\ \text { and } \int _0^\infty \frac{dz}{\psi (z)}=\infty , \end{aligned}$$

see [26, 38] and references therein. Moreover, the dissipativity (in, say, \(L^2(\Omega )\) as well as in \(L^\infty (\Omega )\)) will hold if we take \(\psi \equiv C\), i.e.,

$$\begin{aligned} f(u){\text {sgn}}(u)\ge -C, \ \ \forall u\in {\mathbb {R}}, \end{aligned}$$
(1.3)

see, e.g., [3, 39, 41]. A natural class of nonlinearities is given by the polynomials \(f(u)=a_{2n+1}u^{2n+1}+\cdots +a_0\), \(n\in {\mathbb {N}}\), with the "right" sign of the leading order coefficient: \(a_{2n+1}>0\). In this case, the restriction (1.2) on the growth rate of f is not necessary and we have well-posedness and dissipativity of the considered scalar equation no matter how fast the non-linearity grows.
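As a concrete illustration of condition (1.3) (a hypothetical example chosen here, not taken from the cited references), consider the cubic polynomial \(f(u)=u^3-u\), which has the "right" sign \(a_3=1>0\). Writing \(s=|u|\), an elementary minimization gives

$$\begin{aligned} f(u){\text {sgn}}(u)=s^3-s\ge \min _{s\ge 0}(s^3-s)=-\frac{2}{3\sqrt{3}},\ \ \text {with the minimum attained at } s=\frac{1}{\sqrt{3}}, \end{aligned}$$

so (1.3) holds with \(C=\frac{2}{3\sqrt{3}}\). The same reasoning applies to any odd-degree polynomial with positive leading coefficient, since then \(f(u){\text {sgn}}(u)\rightarrow +\infty \) as \(|u|\rightarrow \infty \) and the expression is bounded on bounded sets.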

In contrast to this, in the case of the "wrong sign" \(a_{2n+1}<0\), we always have blow up in finite time at least for some initial data and the properties of solutions u(t) strongly depend on the growth exponent p in (1.2). Since this case is out of the scope of our paper, we will not discuss it here and refer the interested reader to [38] (see also references therein) for more details.

Unfortunately, universal conditions on \({\textbf{f}}\) and \({\textbf{D}}\) which would allow one to avoid the finite-time blow up and give the dissipativity in nice phase spaces are known in the scalar case \(k=1\) only. For systems, many different classes of sufficient conditions have been suggested, strongly depending on the area of science where the considered system comes from. For instance, from the point of view of chemical kinetics, it is natural to assume that \({\textbf{D}}\) is diagonal with non-negative entries and \({\textbf{f}}({\textbf{u}})\) satisfies the balance law

$$\begin{aligned} \sum _{i=1}^k f_i({\textbf{u}})\le 0 \end{aligned}$$
(1.4)

which mimics the mass conservation law for the concentrations \(u_i\) of reagents (which usually belong to the non-negative cone in \({\mathbb {R}}^k\)). The natural energy here is the \(L^1\)-norm of the solution \({\textbf{u}}(t)\) (the total mass is conserved or at least non-increasing), see [12, 22, 34, 37, 38] and the references therein for more details. We note that in the supercritical case the solutions may blow up in finite time despite the conservation of total mass, see [36].

Alternatively, the so-called invariant region technique is often used in order to get the global existence of solutions of (1.1). The idea of this technique is to find a bounded invariant region \({\textbf{R}}\) in the phase space of the problem and consider only the solutions \({\textbf{u}}(t)\) satisfying \({\textbf{u}}(0)\in {\textbf{R}}\). Since the trajectory cannot leave the invariant region, this gives the global existence of solutions for such initial data and their boundedness. Usually this method is applied for the case where \({\textbf{R}}\) is a segment

$$\begin{aligned} {\textbf{R}}:=\{{\textbf{u}}\in L^\infty (\Omega ),\ \ a_i\le u_i\le b_i, \ i=1,\cdots ,k\},\ \ a_i,b_i\in {\mathbb {R}}\end{aligned}$$

and \({\textbf{D}}\) is a diagonal matrix. Necessary and sufficient conditions for such a region to be invariant can be found, e.g., in [40].

Clearly, assumptions (1.4) are not appropriate for many other types of equations of the form (1.1), for instance, for complex Ginzburg-Landau or Fitz-Hugh-Nagumo equations (and the invariant region technique is hardly applicable for non-diagonal matrices \({\textbf{D}}\)), so other types of assumptions should be used instead. The most widespread (especially in the literature related to the attractor theory, see [3, 8, 39, 41]) is the following dissipativity condition:

$$\begin{aligned} {\textbf{f}}({\textbf{u}}).{\textbf{u}}\ge -C,\ \ {\textbf{u}}\in {\mathbb {R}}^k \end{aligned}$$
(1.5)

which is a straightforward extension of (1.3) to the case of systems and which is usually accompanied by the assumption that \({\textbf{D}}\) has a positive symmetric part. These assumptions are related to the so-called \(L^2\)-energy identity

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert {\textbf{u}}(t)\Vert ^2_{L^2}+({\textbf{D}}\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}(t))+ ({\textbf{f}}({\textbf{u}}(t)),{\textbf{u}}(t))=({\textbf{g}},{\textbf{u}}(t)) \end{aligned}$$
(1.6)

which can be formally obtained by multiplying equation (1.1) by \({\textbf{u}}\) and integrating over x and which gives (due to these assumptions) the dissipative control of the \(L^2\)-norm of \({\textbf{u}}(t)\), see Lemma 3.1. Here and below \(({\textbf{u}},\textbf{v}):=\int _{\Omega }{\textbf{u}}(x).\textbf{v}(x)\,dx\) and

$$\begin{aligned} ({\textbf{D}}\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}):=\sum _{i=1}^d({\textbf{D}}\partial _{x_i}{\textbf{u}},\partial _{x_i}{\textbf{u}}). \end{aligned}$$

As one can see from (1.6), assumption (1.5) may be slightly weakened by adding the term \(-k|{\textbf{u}}|^2\) with a sufficiently small k (depending on the first eigenvalue of the Laplacian in \(\Omega \) and matrix \({\textbf{D}}\)) to the right-hand side. This term may be essential, e.g., when non-linearities with sub-linear growth rate are considered. Since in this paper we mainly concentrate on fast growing non-linearities, we restrict ourselves to the case \(k=0\) only.

More important is that the critical exponent which corresponds to this energy control (and the choice \(H=L^2(\Omega )\) as a phase space):

$$\begin{aligned} {\bar{p}}_{crit}:=1+\frac{4}{d} \end{aligned}$$

is rather restrictive (the most natural cubic nonlinearity is supercritical in the 3D case) and not much can be said in general about the supercritical case, where the uniqueness of solutions may be lost and finite-time blow up of the \(L^\infty \)-norm may occur (see [6] for numerical evidence of blow up in the 3D complex Ginzburg-Landau equation; see also [8, 35] and references therein for the study of the long-time behavior of solutions without uniqueness using the multi-valued or trajectory approaches). We also mention here the so-called anisotropic dissipativity assumption:

$$\begin{aligned} \sum _{i=1}^k f_i({\textbf{u}})u_i|u_i|^{l_i}\ge -C, \end{aligned}$$

where \(l=(l_1,\cdots , l_k)\) is a sufficiently large vector, introduced in [19]. This restriction accompanied by the assumption that \({\textbf{D}}\) is diagonal gives \(p_{crit}=\infty \) if \(l=l(d)\) is large enough.

A natural alternative is to use the so-called monotonicity assumption:

$$\begin{aligned} \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\xi .\xi \ge -K|\xi |^2,\ \forall \xi ,{\textbf{u}}\in {\mathbb {R}}^k \end{aligned}$$
(1.7)

for some \(K\in {\mathbb {R}}\). Here and below

$$\begin{aligned} \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})={\textbf{f}}'({\textbf{u}}) :=(\partial _{{u}_j}f_i({\textbf{u}}))_{i,j=1}^k \end{aligned}$$

is the Fréchet derivative (= Jacobi matrix) of the function \({\textbf{f}}({\textbf{u}})\). This assumption is also very widespread in the literature related to attractors and is naturally related to the \(H^1\)-energy identity:

$$\begin{aligned}&\frac{1}{2}\frac{d}{dt}\Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}+({\textbf{D}}\Delta _x{\textbf{u}}(t),\Delta _x{\textbf{u}}(t))+ \nonumber \\&\quad +\,\,(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t))\nabla _x{\textbf{u}}(t),\nabla _x{\textbf{u}}(t))=-({\textbf{g}},\Delta _x{\textbf{u}}(t)) \end{aligned}$$
(1.8)

which is obtained by formal multiplication of (1.1) by \(-\Delta _x{\textbf{u}}\) and integration over x. Together with (1.7) this gives the dissipative control of the \(H^1\)-norm of the solution, see Lemma 3.2 for the details. The critical growth exponent associated with this \(H^1\)-energy control is

$$\begin{aligned} {{\widetilde{p}}}_{crit}=1+\frac{4}{d-2},\ d>2 \end{aligned}$$
(1.9)

and can be found in many works, see [3, 8] and references therein. However, as pointed out in [46], the monotonicity assumption (1.7) gives for free the control of the \(H^2\)-norm of the solution \({\textbf{u}}(t)\) together with the \(L^2\)-norm of \({\textbf{f}}({\textbf{u}}(t))\), namely, we have a priori estimates for the solutions in the nonlinear space

$$\begin{aligned} {\mathbb {D}}:=\{{\textbf{u}}\in H^2(\Omega )\cap H^1_0(\Omega ),\ {\textbf{f}}({\textbf{u}})\in L^2(\Omega )\}, \end{aligned}$$

due to the control of the \(L^2\)-norm of \(\partial _t{\textbf{u}}(t)\), see Lemma 3.3 and Corollary 3.4 below, and this gives us much better value of the critical exponent:

$$\begin{aligned} p_{crit}=1+\frac{4}{d-4},\ d>4. \end{aligned}$$
(1.10)

As far as we know, up to now this is the best growth restriction which guarantees (of course, under the monotonicity assumption (1.7)) the global existence of smooth solutions, and it is widely used nowadays not only for reaction-diffusion equations, but for many other related problems (like Cahn-Hilliard equations, see [35] and references therein; strongly damped wave equations, see [11, 27] and references therein, etc.). For these reasons, we will use formula (1.10) throughout the paper as the definition of the critical exponent for the considered case.
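For orientation, the three exponents \({\bar{p}}_{crit}=1+\frac{4}{d}\), \({\widetilde{p}}_{crit}=1+\frac{4}{d-2}\) and \(p_{crit}=1+\frac{4}{d-4}\) introduced above can be evaluated in a few sample dimensions (the dimensions below are chosen only for illustration):

$$\begin{aligned} d=5:&\quad {\bar{p}}_{crit}=\tfrac{9}{5},\ \ {\widetilde{p}}_{crit}=\tfrac{7}{3},\ \ p_{crit}=5;\\ d=6:&\quad {\bar{p}}_{crit}=\tfrac{5}{3},\ \ {\widetilde{p}}_{crit}=2,\ \ p_{crit}=3;\\ d=8:&\quad {\bar{p}}_{crit}=\tfrac{3}{2},\ \ {\widetilde{p}}_{crit}=\tfrac{5}{3},\ \ p_{crit}=2. \end{aligned}$$

In particular, a cubic nonlinearity (\(p=3\)) exceeds both \({\bar{p}}_{crit}\) and \({\widetilde{p}}_{crit}\) for \(d=5\) and \(d=6\), while with respect to (1.10) it is subcritical for \(d=5\) and exactly critical for \(d=6\).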

We also note that the monotonicity assumption (1.7) gives the uniqueness of weak solutions (= solutions in the energy phase space \(H=L^2(\Omega )\), see Sect. 5 below) even in the supercritical case which, in turn, allows one to get a lot of information about the solutions and their long-time behavior in the supercritical case as well. The theory of equations (1.1) in the critical or supercritical cases is of great current interest, see for instance [8, 9, 10, 14, 35, 47, 48] and references therein. However, in most cases rather essential extra restrictions on the nonlinearity \({\textbf{f}}\) are posed, like the following two-sided estimate:

$$\begin{aligned} C(|{\textbf{u}}|^{p-1}-1)\ge \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K+\alpha |{\textbf{u}}|^{p-1}, \ \ {\textbf{u}}\in {\mathbb {R}}^k,\ \ C,\alpha >0 \end{aligned}$$
(1.11)

which really simplifies the situation, but automatically excludes some interesting new phenomena which may appear in a general case.

The aim of the present paper is to give a comprehensive study of weak and strong solutions (=solutions in the phase space \(\mathbb D\)) of problem (1.1) as well as their long-time behavior in the supercritical case \(p>p_{crit}\) with dissipative (assumption (1.5) is fulfilled) and monotone (assumption (1.7) is satisfied) nonlinearities trying to avoid/minimize further restrictions on \({\textbf{f}}\).

Strong and weak solutions of problem (1.1) have been constructed in [46] (see also Theorem 4.3 of Sect. 4 and Theorem 5.2 of Sect. 5). The construction of strong solutions is more or less standard. However, in contrast to the usual case where two-sided conditions like (1.11) are posed, the phase space \({\mathbb {D}}\) is non-linear and it is not clear even whether or not smooth functions are dense in it, so some care is required. In [46] this difficulty has been overcome using Galerkin approximations and the theory of monotone operators. In the present paper, we prefer to give a more explicit approximation scheme which does not use, at least in a straightforward way, the monotone operator technique and which is more convenient for what follows, see Sect. 5 for the details.

The absence of any two-sided control for the growth of the nonlinearity \({\textbf{f}}\) also leads to extra difficulties on the level of weak solutions. Indeed, since we cannot guarantee that \({\textbf{f}}({\textbf{u}})\in L^1\), we cannot treat the equation in the sense of distributions and have to use variational inequalities, see Definition 5.1. In addition, even the parabolic smoothing property (whether or not a weak solution becomes strong at the next time moment) becomes non-trivial and has been posed in [46] as an open problem.

Our first main result gives a positive answer to this question in the case where the nonlinearity has a polynomial growth rate.

Theorem 1.1

Let the nonlinearity \({\textbf{f}}\) satisfy the assumptions (1.2) (for some \(p>0\)), (1.5) and (1.7), the diffusion matrix have positive symmetric part and \({\textbf{g}}\in L^2(\Omega )\). Then any weak solution \({\textbf{u}}(t)\) starting from \({\textbf{u}}(0)\in H=L^2(\Omega )\) belongs to \({\mathbb {D}}\) for any \(t>0\). In addition, the strong solutions of (1.1) are dissipative in \({\mathbb {D}}\)-norm as well.

The proof of this theorem is based on estimation of \({\textbf{f}}({\textbf{u}})\) in Lebesgue spaces \(L^q(\Omega )\) with \(0<q<1\) and is given in Sect. 6.

Our next result shows that the critical growth exponent can be slightly improved.

Theorem 1.2

Let the assumptions of Theorem 1.1 hold and let, in addition, the growth exponent p of the nonlinearity satisfy

$$\begin{aligned} p<p_{crit}+\varepsilon =1+\frac{4}{d-4}+\varepsilon ,\ \ d>4 \end{aligned}$$

for some small positive \(\varepsilon =\varepsilon ({\textbf{D}})\). Then any weak solution \({\textbf{u}}(t)\) of problem (1.1) starting from \({\textbf{u}}(0)\in L^2(\Omega )\) belongs to \(L^\infty (\Omega )\) for any \(t>0\). In particular, finite-time blow up of smooth solutions is impossible and the actual regularity of a solution \({\textbf{u}}(t)\) is restricted by the regularity of \(\Omega \), \({\textbf{f}}\) and \({\textbf{g}}\) only. In the case where this data is \(C^\infty \)-smooth, the corresponding solution \({\textbf{u}}(t)\) will be also \(C^\infty \) for any \(t>0\).

We now turn to the attractors. The existence of a global attractor for problem (1.1) in H has been verified in [46]; however, the question about strong attraction in \({\mathbb {D}}\) has remained open. Our next result gives a positive answer to this question under the extra restriction

$$\begin{aligned} |\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})|\le C(|{\textbf{f}}({\textbf{u}})|+1+|{\textbf{u}}|),\ \ {\textbf{u}}\in {\mathbb {R}}^k. \end{aligned}$$
(1.12)

Theorem 1.3

Let the assumptions of Theorem 1.1 hold and let, in addition, \({\textbf{f}}\) satisfy (1.12). Then the solution semigroup S(t) associated with problem (1.1) possesses a compact global attractor \({\mathcal {A}}\) in the phase space \(\mathbb D\).

The proof of this theorem is given in Sect. 7 and is based on energy type arguments. We expect that assumption (1.12) is technical and can be removed, but it is strongly related to the validity of the integration by parts formula

$$\begin{aligned} ({\textbf{f}}({\textbf{u}}),\Delta _x{\textbf{u}})=-(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}),\ \ {\textbf{u}}\in \mathbb D, \end{aligned}$$

see the discussion in Sect. 9 below.

Finally, we study the finite-dimensionality of the constructed global attractor \({\mathcal {A}}\) in \({\mathbb {D}}\) and the existence of the so-called exponential attractor (see [16,17,18, 35] and also Sect. 8 for more details).

Theorem 1.4

Let the assumptions of Theorem 1.3 hold and let, in addition, some extra convexity assumptions on the function \({\textbf{u}}\rightarrow |{\textbf{f}}({\textbf{u}})|\) be posed (see formula (8.3)). Then the solution semigroup S(t) associated with equation (1.1) possesses an exponential attractor \({\mathcal {M}}\) in H and, in particular, the fractal dimension of the global attractor \({\mathcal {A}}\) in H is finite.

The finite-dimensionality of the global attractor \({\mathcal {A}}\) has been established in [46] under similar assumptions using the so-called method of l-trajectories developed in [32, 33]. In Sect. 8 we suggest an alternative, more transparent method for constructing an exponential attractor which does not utilize l-trajectories and works directly in the phase space.

Note that the results of the paper are heavily based on monotonicity arguments, so we cannot consider more or less general non-linearities \({\textbf{f}}({\textbf{u}},\nabla _x{\textbf{u}})\) which depend on the spatial gradient of \({\textbf{u}}\). However, there is an important particular case where our theory works with very minor changes, namely, where

$$\begin{aligned} {\textbf{f}}({\textbf{u}},\nabla _x{\textbf{u}}):={\textbf{f}}({\textbf{u}})+\textbf{L}\nabla _x{\textbf{u}} \end{aligned}$$

and \(\textbf{L}\) is a linear map (a mild dependence of \(\textbf{L}\) on \({\textbf{u}}\) is also allowed). Some results in this direction are obtained in [47]. We also note that many of our results can be extended to the case of quasilinear systems where the Laplacian \(\Delta _x{\textbf{u}}\) is replaced, for instance, by the so-called p-Laplacian for which the theory of monotone operators is applicable, see [3, 5, 43, 45] and references therein. We will return to this type of problem elsewhere.

The paper is organized as follows.

Notation and spaces which will be used throughout the paper are introduced in Sect. 2 and the standard a priori estimates for the solutions \({\textbf{u}}(t)\) of problem (1.1) are recalled in Sect. 3.

The existence of strong solutions for problem (1.1) is verified in Sect. 4 based on special approximations of the nonlinearity \({\textbf{f}}\). The definition of a weak solution of problem (1.1) in the sense of variational inequalities as well as the proof of its global existence and uniqueness is given in Sect. 5. Moreover, the existence of a global attractor \({\mathcal {A}}\) for the solution semigroup S(t) is also verified there.

The weak to strong smoothing property, see Theorem 1.1, is verified in Sect. 6. The further regularity of strong solutions is obtained in Sect. 7. In particular, the proofs of Theorems 1.2 and 1.3 are given there. Some results about the partial regularity of the elliptic problem associated with equations (1.1), which are of independent interest, are obtained in Appendix A. The existence of an exponential attractor \({\mathcal {M}}\), see Theorem 1.4, is established in Sect. 8.

Finally, Sect. 9 discusses natural extensions of the developed theory to other classes of dissipative PDEs, in particular, to fractional reaction-diffusion systems and (fractional) Cahn-Hilliard type equations. At the end of this section we also discuss some important (at least from our point of view) open problems for further investigation.

Assumptions and preliminaries

Throughout the paper we consider the following reaction-diffusion system in a bounded domain \(\Omega \subset {\mathbb {R}}^d\):

$$\begin{aligned} \partial _t{\textbf{u}}={\textbf{D}}\Delta _x{\textbf{u}}-{\textbf{f}}({\textbf{u}})+{\textbf{g}},\ \ {\textbf{u}}\big |_{\partial \Omega }=0,\ \ {\textbf{u}}\big |_{t=0}={\textbf{u}}_0. \end{aligned}$$
(2.1)

Here \({\textbf{u}}=(u_1,\cdots ,u_k)\) is an unknown vector-valued function, \({\textbf{D}}\) is a constant diffusion matrix satisfying

$$\begin{aligned} {\textbf{D}}+{\textbf{D}}^*>0, \end{aligned}$$
(2.2)

\({\textbf{g}}\in L^2(\Omega )\) is a given external force and the nonlinearity \({\textbf{f}}\) is assumed to satisfy the following conditions:

$$\begin{aligned} {\left\{ \begin{array}{ll} 1.\ {\textbf{f}}\in C^1({\mathbb {R}}^k,{\mathbb {R}}^k),\\ 2.\ {\textbf{f}}({\textbf{u}}).{\textbf{u}}\ge -C, \ {\textbf{u}}\in {\mathbb {R}}^k, \\ 3.\ \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K,\ \ {\textbf{u}}\in {\mathbb {R}}^k, \end{array}\right. } \end{aligned}$$
(2.3)

where C and K are some fixed constants, \({\textbf{u}}.\textbf{v}\) stands for the standard inner product in \({\mathbb {R}}^k\) and \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\) means \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\xi .\xi \ge -K|\xi |^2\) for all \(\xi \in {\mathbb {R}}^k\).
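A simple model example satisfying (2.2) and (2.3) may help to fix ideas. The specific choices below (the number of components \(k=2\), the exponent \(p>1\), the constant \(\lambda \ge 0\) and the number \(a\in {\mathbb {R}}\)) are assumptions made here for illustration only. Take

$$\begin{aligned} {\textbf{D}}=\begin{pmatrix} 1 &{} -a\\ a &{} 1 \end{pmatrix},\ \ {\textbf{f}}({\textbf{u}})=|{\textbf{u}}|^{p-1}{\textbf{u}}-\lambda {\textbf{u}}. \end{aligned}$$

Then \({\textbf{D}}+{\textbf{D}}^*\) equals twice the identity matrix, so (2.2) holds for every a. Moreover, \({\textbf{f}}\in C^1\) because \(p>1\), and for \({\textbf{u}}\ne 0\)

$$\begin{aligned} {\textbf{f}}({\textbf{u}}).{\textbf{u}}&=|{\textbf{u}}|^{p+1}-\lambda |{\textbf{u}}|^2,\\ \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\xi .\xi&=|{\textbf{u}}|^{p-1}|\xi |^2+(p-1)|{\textbf{u}}|^{p-3}({\textbf{u}}.\xi )^2-\lambda |\xi |^2\ge -\lambda |\xi |^2, \end{aligned}$$

while at \({\textbf{u}}=0\) the Jacobi matrix is \(-\lambda \) times the identity. The first expression is bounded from below because \(p+1>2\), which gives the second condition of (2.3), and the second line gives the third condition with \(K=\lambda \). Note that no upper bound on p is needed for this verification.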

For any \(l\in {\mathbb {N}}\) and any \(1\le p\le \infty \) we denote by \(W^{l,p}(\Omega )\) the Sobolev space of distributions \(u\in \mathcal D'(\Omega )\) such that u and all its partial derivatives up to order l inclusively belong to the Lebesgue space \(L^p(\Omega )\). As usual, for non-integer values of l, we define \(W^{l,p}(\Omega ):=B^l_{p,p}(\Omega )\) using real interpolation (\(B^l_{p,p}\) is a classical Besov space, see e.g., [42]). Moreover, the symbol \(W^{l,p}_0(\Omega )\) stands for the closure of \(C^\infty _0(\Omega )\) in \(W^{l,p}(\Omega )\) and the space \(W^{-l,p}(\Omega )\) is defined as a dual space to \(W^{l,q}_0(\Omega )\), \(\frac{1}{p}+\frac{1}{q}=1\), with respect to the standard inner product in \(H=L^2(\Omega )\). To simplify the notation, we will write \(H^l(\Omega )\) instead of \(W^{l,2}(\Omega )\). We will also systematically use the Sobolev spaces \(W^{l,p}(\Omega ,{\mathbb {R}}^k)\) of vector-valued functions \({\textbf{u}}:\Omega \rightarrow {\mathbb {R}}^k\) and will write for brevity \(W^{l,p}(\Omega )\) instead of \(W^{l,p}(\Omega ,{\mathbb {R}}^k)\) if this does not lead to misunderstandings.

In the sequel, we will also use the space \(L^p(\Omega )\) with \(0<p<1\) and use the standard notation

$$\begin{aligned} \Vert {\textbf{u}}\Vert _{L^p(\Omega )}:=\left( \int _\Omega |{\textbf{u}}(x)|^p\,dx\right) ^{1/p} \end{aligned}$$

simply ignoring the fact that it is not a norm. Recall that the topology in this space is defined by the metric \(d_p({\textbf{u}},\textbf{v}):=\Vert {\textbf{u}}-\textbf{v}\Vert _{L^p}^p\).
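The fact that \(d_p\) is indeed a metric rests on the elementary inequality \((a+b)^p\le a^p+b^p\), valid for \(a,b\ge 0\) and \(0<p<1\). A short verification of the triangle inequality, included here for the reader's convenience (with \(\textbf{w}\) an arbitrary third function in \(L^p(\Omega )\)), reads:

$$\begin{aligned} d_p({\textbf{u}},\textbf{w})=\int _\Omega |{\textbf{u}}-\textbf{w}|^p\,dx\le \int _\Omega \left( |{\textbf{u}}-\textbf{v}|+|\textbf{v}-\textbf{w}|\right) ^p\,dx\le \int _\Omega |{\textbf{u}}-\textbf{v}|^p\,dx+\int _\Omega |\textbf{v}-\textbf{w}|^p\,dx=d_p({\textbf{u}},\textbf{v})+d_p(\textbf{v},\textbf{w}). \end{aligned}$$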

We say that the function \({\textbf{u}}(t,x)\) is a strong solution of (2.1) if

$$\begin{aligned} {\textbf{u}}\in C_{\textrm{w}}(0,T; H^2(\Omega )\cap H^1_0(\Omega )),\ \ {\textbf{f}}({\textbf{u}})\in C_{\textrm{w}}(0,T; L^2(\Omega )) \end{aligned}$$
(2.4)

and equation (2.1) is satisfied in the sense of distributions (here and below \(C(a,b;X)\) means the space of X-valued functions defined and continuous on a closed interval \([a,b]\) and the lower index "w" stands for the continuity in the weak topology). In particular, for strong solutions we require that the initial data \({\textbf{u}}_0\in {\mathbb {D}}\), where

$$\begin{aligned} {\mathbb {D}}:= & {} \{{\textbf{u}}\in H^2(\Omega )\cap H^1_0(\Omega ),\ {\textbf{f}}({\textbf{u}})\in L^2(\Omega )\},\ \nonumber \\ \Vert {\textbf{u}}\Vert ^2_{{\mathbb {D}}}:= & {} \Vert {\textbf{u}}\Vert ^2_{H^2}+\Vert {\textbf{f}}({\textbf{u}})\Vert ^2_{L^2}. \end{aligned}$$
(2.5)

Note that in general \({\mathbb {D}}\) is not a linear space and this causes a lot of extra difficulties in comparison with the case of linear phase space. We define the topology in the space \(\mathbb D\) using the embedding

$$\begin{aligned} j:{\mathbb {D}}\rightarrow [H^2(\Omega )\cap H^1_0(\Omega )]\times L^2(\Omega ),\ \ j({\textbf{u}})=\{{\textbf{u}},{\textbf{f}}({\textbf{u}})\}. \end{aligned}$$

In particular, the sequence \({\textbf{u}}_n\rightarrow {\textbf{u}}\) strongly in \({\mathbb {D}}\) if \({\textbf{u}}_n\rightarrow {\textbf{u}}\) in \(H^2(\Omega )\) and \({\textbf{f}}({\textbf{u}}_n)\rightarrow {\textbf{f}}({\textbf{u}})\) in \(L^2(\Omega )\). Analogously, we say that \({\textbf{u}}_n\rightharpoonup {\textbf{u}}\) weakly in \({\mathbb {D}}\) if \({\textbf{u}}_n\rightarrow {\textbf{u}}\) weakly in \(H^2(\Omega )\) and \({\textbf{f}}({\textbf{u}}_n)\rightarrow {\textbf{f}}({\textbf{u}})\) weakly in \(L^2(\Omega )\). We will also write for brevity \(C_{\text {w}}(0,T;{\mathbb {D}})\) instead of (2.4).

A priori estimates

In this section, we give a number of more or less standard estimates for strong solutions of problem (2.1) which will be justified later. We start with the dissipative estimate in the space \(H=L^2(\Omega )\).

Lemma 3.1

Let \({\textbf{g}}\in H\), assumptions (2.2)–(2.3) hold and let \({\textbf{u}}\) be a sufficiently smooth solution of (2.1). Then, the following estimate is valid:

$$\begin{aligned}&\Vert {\textbf{u}}(T)\Vert ^2_{L^2}+\int _T^{T+1}\Vert {\textbf{u}}(t)\Vert ^2_{H^1}\,dt+ \int _T^{T+1}(|{\textbf{f}}({\textbf{u}}(t)).{\textbf{u}}(t)|,1)\,dt\le \nonumber \\&\quad \le Ce^{-\alpha T}\Vert {\textbf{u}}_0\Vert ^2_{L^2}+C(\Vert {\textbf{g}}\Vert ^2_{L^2}+1), \end{aligned}$$
(3.1)

where the positive constants C and \(\alpha \) are independent of T and \({\textbf{u}}_0\).

Proof

We multiply equation (2.1) by \({\textbf{u}}\) and integrate over x. This gives

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert {\textbf{u}}\Vert ^2_{L^2}+ ({\textbf{D}}\nabla _x{\textbf{u}},\nabla _x{\textbf{u}})+({\textbf{f}}({\textbf{u}}),{\textbf{u}})=({\textbf{g}},{\textbf{u}}), \end{aligned}$$

where \(({\textbf{u}},\textbf{v}):=\int _\Omega {\textbf{u}}(x).\textbf{v}(x)\,dx\) is a standard inner product in H and

$$\begin{aligned} ({\textbf{D}}\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}):=\sum _{i=1}^d({\textbf{D}}\partial _{x_i}{\textbf{u}},\partial _{x_i}{\textbf{u}})= \int _{x\in \Omega }{\text {Tr}}({\textbf{D}}\nabla _x{\textbf{u}}(\nabla _x{\textbf{u}})^*)\,dx. \end{aligned}$$

Using the dissipativity assumption \({\textbf{f}}({\textbf{u}}).{\textbf{u}}\ge -C\) and positivity of the matrix \({\textbf{D}}\) together with the Friedrichs (Poincaré) inequality, we arrive at

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert {\textbf{u}}\Vert ^2_{L^2}+\alpha \Vert {\textbf{u}}\Vert ^2_{L^2}+\alpha \Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}+ (|{\textbf{f}}({\textbf{u}}).{\textbf{u}}|,1)\le C(\Vert {\textbf{g}}\Vert ^2_{L^2}+1) \end{aligned}$$

for some positive constants C and \(\alpha \). The Gronwall inequality applied to this relation gives (3.1) and finishes the proof of the lemma. \(\square \)
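The Gronwall step in the proof above can be spelled out as follows. The abbreviations \(y(t):=\Vert {\textbf{u}}(t)\Vert ^2_{L^2}\) and \(G:=\Vert {\textbf{g}}\Vert ^2_{L^2}+1\) are introduced here only for this sketch, and the constants are not optimized. Dropping the non-negative gradient and nonlinear terms in the last differential inequality gives \(y'+2\alpha y\le 2CG\), and therefore

$$\begin{aligned} y(T)\le e^{-2\alpha T}y(0)+2CG\int _0^Te^{-2\alpha (T-s)}\,ds\le e^{-2\alpha T}\Vert {\textbf{u}}_0\Vert ^2_{L^2}+\frac{C}{\alpha }G. \end{aligned}$$

Integrating the same differential inequality over \([T,T+1]\) and discarding the term \(\frac{1}{2}y(T+1)\ge 0\) yields

$$\begin{aligned} \int _T^{T+1}\left( \alpha \Vert {\textbf{u}}(t)\Vert ^2_{L^2}+\alpha \Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}+(|{\textbf{f}}({\textbf{u}}(t)).{\textbf{u}}(t)|,1)\right) dt\le CG+\frac{1}{2}y(T), \end{aligned}$$

and combining the two relations gives (3.1) after renaming the constants (note that \(e^{-2\alpha T}\le e^{-\alpha T}\)).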

The next lemma gives the analogous dissipative estimate for the \(H^1\)-norm of the solution.

Lemma 3.2

Let the assumptions of Lemma 3.1 hold and \({\textbf{u}}\) be a sufficiently regular solution of (2.1). Then, the following estimate is valid:

$$\begin{aligned}&\Vert {\textbf{u}}(T)\Vert _{H^1}^2+\int _T^{T+1}\Vert {\textbf{u}}(t)\Vert _{H^2}^2\,dt+\nonumber \\&\quad + \int _T^{T+1}(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}}(t).\nabla _x{\textbf{u}}(t)|,1)\,dt\le Ce^{-\alpha T}\Vert {\textbf{u}}_0\Vert ^2_{H^1}+C(\Vert {\textbf{g}}\Vert ^2_{L^2}+1),\nonumber \\ \end{aligned}$$
(3.2)

where the positive constants C and \(\alpha \) are independent of T and \({\textbf{u}}_0\).

Proof

We multiply equation (2.1) by \(-\Delta _x{\textbf{u}}\) and integrate over x to get

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}+({\textbf{D}}\Delta _x{\textbf{u}},\Delta _x{\textbf{u}})+ (\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}})=-({\textbf{g}},\Delta _x{\textbf{u}}). \end{aligned}$$

Using the inequality \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\) and positivity of matrix \({\textbf{D}}\) again, we arrive at

$$\begin{aligned}&\frac{d}{dt}\Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}+\alpha \Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}+ \alpha \Vert \Delta _x{\textbf{u}}\Vert ^2_{L^2}+\nonumber \\&\quad +\,\,(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}}.\nabla _x{\textbf{u}}|,1) \le C(\Vert {\textbf{g}}\Vert ^2_{L^2}+\Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}). \end{aligned}$$
(3.3)

Applying the Gronwall inequality to this relation and using (3.1) in order to control the integral of \(\Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}\), we end up with (3.2) and finish the proof of the lemma. \(\square \)

Let now \({\varvec{\theta }}=\partial _t{\textbf{u}}\). Then this function solves

$$\begin{aligned} \partial _t{\varvec{\theta }}={\textbf{D}}\Delta _x{\varvec{\theta }}-\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}){\varvec{\theta }},\ \ {\varvec{\theta }}\big |_{t=0} ={\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)+{\textbf{g}},\ {\varvec{\theta }}\big |_{\partial \Omega }=0. \end{aligned}$$
(3.4)

The next lemma gives the \(L^2\)-estimate for the time derivative \({\varvec{\theta }}\).

Lemma 3.3

Let the assumptions of Lemma 3.1 hold and let \({\textbf{u}}\) be a sufficiently regular solution of equation (2.1). Then the following estimate is valid:

$$\begin{aligned}&\Vert {\varvec{\theta }}(T)\Vert ^2_{L^2}+\int _T^{T+1}\Vert {\varvec{\theta }}(t)\Vert _{H^1}^2\,dt+\nonumber \\&\quad + \int _{T}^{T+1}(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}){\varvec{\theta }}(t).{\varvec{\theta }}(t)|,1)\,dt \le C\Vert {\textbf{u}}_0\Vert _{\mathbb D}^2e^{K_1T}+Ce^{K_1 T}(\Vert {\textbf{g}}\Vert ^2+1),\qquad \end{aligned}$$
(3.5)

where the positive constants C and \(K_1\) are independent of T and \({\textbf{u}}_0\).

Proof

We multiply equation (3.4) by \({\varvec{\theta }}\) and use assumption \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\) and positivity of matrix \({\textbf{D}}\) to get

$$\begin{aligned} \frac{d}{dt}\Vert {\varvec{\theta }}\Vert ^2_{L^2}+\alpha \Vert \nabla _x{\varvec{\theta }}\Vert ^2_{L^2}+ (|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}){\varvec{\theta }}.{\varvec{\theta }}|,1)\le 2K\Vert {\varvec{\theta }}\Vert ^2_{L^2}. \end{aligned}$$

Applying the Gronwall inequality to this relation, we get the desired estimate and finish the proof of the lemma. \(\square \)
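For the reader's convenience, the Gronwall step can be written out as follows. This is a sketch only: the notation \(y(t):=\Vert {\varvec{\theta }}(t)\Vert ^2_{L^2}\) is introduced here, and we assume \(K\ge 0\) and that the \({\mathbb {D}}\)-norm controls both the \(H^2\)-norm of \({\textbf{u}}\) and the \(L^2\)-norm of \({\textbf{f}}({\textbf{u}})\), as its use in this section suggests.

$$\begin{aligned} y'(t)\le 2Ky(t)\ \ \Longrightarrow \ \ y(T)\le e^{2KT}y(0),\ \ y(0)=\Vert {\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)+{\textbf{g}}\Vert ^2_{L^2}\le C(\Vert {\textbf{u}}_0\Vert ^2_{{\mathbb {D}}}+\Vert {\textbf{g}}\Vert ^2_{L^2}). \end{aligned}$$

Integrating the differential inequality over \([T,T+1]\) and using the bound for y just obtained, we also get

$$\begin{aligned} \alpha \int _T^{T+1}\Vert \nabla _x{\varvec{\theta }}(t)\Vert ^2_{L^2}\,dt+\int _T^{T+1}(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}){\varvec{\theta }}(t).{\varvec{\theta }}(t)|,1)\,dt\le y(T)+2K\int _T^{T+1}y(t)\,dt\le (1+2Ke^{2K})e^{2KT}y(0). \end{aligned}$$

In particular, under these assumptions, \(K_1=2K\) is one admissible choice of the exponent in (3.5).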

As a corollary of this lemma, we get the key control for the norm of the solution in the space \({\mathbb {D}}\).

Corollary 3.4

Let the assumptions of Lemma 3.1 hold and let \({\textbf{u}}\) be a sufficiently regular solution of (2.1). Then the following estimate is valid:

$$\begin{aligned} \Vert {\textbf{u}}(T)\Vert _{{\mathbb {D}}}^2\le Ce^{K_1 T}\Vert {\textbf{u}}_0\Vert ^2_{{\mathbb {D}}}+Ce^{K_1T}(1+\Vert {\textbf{g}}\Vert ^2_{L^2}), \end{aligned}$$
(3.6)

where the positive constants C and \(K_1\) are independent of T and \({\textbf{u}}_0\).

Proof

We rewrite equation (2.1) as an elliptic problem

$$\begin{aligned} {\textbf{D}}\Delta _x{\textbf{u}}(T)-{\textbf{f}}({\textbf{u}}(T))=\tilde{\textbf{g}}(T):={\varvec{\theta }}(T)-{\textbf{g}} \end{aligned}$$
(3.7)

for every fixed T. Multiplying then this equation by \(\Delta _x{\textbf{u}}(T)\) (without integration in time!) and using the control for \({\varvec{\theta }}(T)\) obtained above (together with the elliptic regularity estimate for the Laplacian and the assumption \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\)), we arrive at the estimate

$$\begin{aligned} \Vert {\textbf{u}}(T)\Vert _{H^2}^2\le C(\Vert {\textbf{g}}\Vert ^2_{L^2}+\Vert {\varvec{\theta }}(T)\Vert _{L^2}^2+\Vert \nabla _x{\textbf{u}}(T)\Vert _{L^2}^2). \end{aligned}$$

Expressing the \(L^2\)-norm of \({\textbf{f}}({\textbf{u}})\) from equation (3.7), we get

$$\begin{aligned} \Vert {\textbf{u}}(T)\Vert ^2_{{\mathbb {D}}}\le C(\Vert {\textbf{g}}\Vert ^2_{L^2}+\Vert {\varvec{\theta }}(T)\Vert _{L^2}^2+\Vert \nabla _x{\textbf{u}}(T)\Vert _{L^2}^2). \end{aligned}$$

Estimating the right-hand side of this inequality by (3.5) and (3.2) and using that

$$\begin{aligned} {\varvec{\theta }}(0)=\partial _t{\textbf{u}}(0)={\textbf{D}}\Delta _x{\textbf{u}}(0)-{\textbf{f}}({\textbf{u}}(0))+{\textbf{g}}, \end{aligned}$$

we arrive at the desired estimate and finish the proof of the corollary. \(\square \)

Remark 3.5

Note that, in contrast to estimates for the \(L^2\) and \(H^1\) norms of the solution \({\textbf{u}}(t)\), the obtained estimate for the \(\mathbb D\)-norm of \({\textbf{u}}(t)\) is not dissipative and even grows exponentially in time. We will remove this drawback later (under some extra assumptions on \({\textbf{f}}\)).

We conclude this section by establishing the global Lipschitz continuity with respect to the initial data which plays a crucial role in constructing weak solutions for (2.1).

Lemma 3.6

Let the assumptions of Lemma 3.1 hold and let \({\textbf{u}}_1(t)\) and \({\textbf{u}}_2(t)\) be two sufficiently regular solutions of equation (2.1). Then the following estimate is valid:

$$\begin{aligned}&\Vert {\textbf{u}}_1(T)-{\textbf{u}}_2(T)\Vert ^2_{L^2}+\nonumber \\&\quad +\int _0^T\Vert {\textbf{u}}_1(t)-{\textbf{u}}_2(t)\Vert ^2_{H^1}\,dt\le Ce^{K_1 T}\Vert {\textbf{u}}_1(0)-{\textbf{u}}_2(0)\Vert ^2_{L^2}, \end{aligned}$$
(3.8)

where the positive constants C and \(K_1\) are independent of T, \({\textbf{u}}_1\) and \({\textbf{u}}_2\).

Proof

Indeed, let \(\textbf{v}(t)={\textbf{u}}_1(t)-{\textbf{u}}_2(t)\). Then this function solves

$$\begin{aligned} \partial _t\textbf{v}={\textbf{D}}\Delta _x\textbf{v}-[{\textbf{f}}({\textbf{u}}_1)-{\textbf{f}}({\textbf{u}}_2)]. \end{aligned}$$
(3.9)

Multiplying this equation by \(\textbf{v}\), using that, due to the monotonicity assumption \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\),

$$\begin{aligned} {[}{\textbf{f}}({\textbf{u}}_1)-{\textbf{f}}({\textbf{u}}_2)].[{\textbf{u}}_1-{\textbf{u}}_2]\ge -K|\textbf{v}|^2, \end{aligned}$$

and arguing as in the proof of Lemma 3.3, we arrive at the desired estimate. \(\square \)
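The monotonicity inequality used above can be derived from the integral form of the mean value theorem. The following lines are a sketch; the interpolation parameter s and the constant \(\alpha >0\) coming from the positivity of \({\textbf{D}}\) are named here for illustration.

$$\begin{aligned} {[}{\textbf{f}}({\textbf{u}}_1)-{\textbf{f}}({\textbf{u}}_2)].\textbf{v}=\int _0^1\nabla _{\textbf{u}}{\textbf{f}}(s{\textbf{u}}_1+(1-s){\textbf{u}}_2)\textbf{v}.\textbf{v}\,ds\ge -K|\textbf{v}|^2,\ \ \textbf{v}={\textbf{u}}_1-{\textbf{u}}_2. \end{aligned}$$

Multiplying (3.9) by \(\textbf{v}\), integrating over x and using \(({\textbf{D}}\nabla _x\textbf{v},\nabla _x\textbf{v})\ge \alpha \Vert \nabla _x\textbf{v}\Vert ^2_{L^2}\), we then have

$$\begin{aligned} \frac{d}{dt}\Vert \textbf{v}\Vert ^2_{L^2}+2\alpha \Vert \nabla _x\textbf{v}\Vert ^2_{L^2}\le 2K\Vert \textbf{v}\Vert ^2_{L^2}, \end{aligned}$$

and the Gronwall inequality, followed by integration in time over \([0,T]\), gives an estimate of the form (3.8).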

Existence of strong solutions

Although the construction of a solution from the a priori estimates obtained above is more or less standard, it is a bit delicate here since the phase space \({\mathbb {D}}\) is in general nonlinear and, in particular, it is not clear whether or not smooth functions are dense in \({\mathbb {D}}\). For this reason, we sketch the proof here.

We expect that the existence result can also be obtained using monotone operator theory (e.g., the standard Yosida approximations), but we prefer to give an alternative, slightly more transparent proof. Our idea is to approximate the nonlinearity \({\textbf{f}}\) by a sequence \({\textbf{f}}_n\) of functions which are globally Lipschitz continuous without destroying assumptions (2.3). Then, on the one hand, the existence of solutions for such \({\textbf{f}}_n\) is well-known and, on the other hand, as is not difficult to see, all estimates obtained above will be uniform with respect to n. Thus, it will only remain to pass to the limit \(n\rightarrow \infty \). We start with the approximation of \({\textbf{f}}\).

Lemma 4.1

Let the function \({\textbf{f}}\) satisfy assumptions (2.3). Then, there exists a sequence of functions \({\textbf{f}}_n\in C^1({\mathbb {R}}^k,{\mathbb {R}}^k)\) such that

$$\begin{aligned} 1.\ \ {\textbf{f}}_n({\textbf{u}}).{\textbf{u}}\ge -C,\ \ 2.\ \ \nabla _{\textbf{u}}{\textbf{f}}_n({\textbf{u}})\ge -K \end{aligned}$$
(4.1)

uniformly with respect to n. Moreover,

$$\begin{aligned} {\textbf{f}}_n\rightarrow {\textbf{f}} \end{aligned}$$
(4.2)

in \(C_{loc}({\mathbb {R}}^k,{\mathbb {R}}^k)\) and

$$\begin{aligned} |\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}})|\le C_n,\ \ {\textbf{u}}\in {\mathbb {R}}^k, \end{aligned}$$
(4.3)

where the constant \(C_n\) may depend on n.

Sketch of the proof

Let us first introduce a smooth scalar convex function \(\Psi (z)\) (we may also require that \(\Psi '(s)\ge 0\) for \(s\ge 0\)) in such a way that \(\nabla _{\textbf{u}}\Psi (|{\textbf{u}}|^2)\) grows faster than \(|{\textbf{f}}({\textbf{u}})|\) as \(|{\textbf{u}}|\rightarrow \infty \). Then, for every fixed \(\varepsilon >0\), we consider the function \({\textbf{f}}({\textbf{u}})+\varepsilon \nabla _{\textbf{u}}\Psi (|{\textbf{u}}|^2)\). Since the second term dominates the first one if \(|{\textbf{u}}|^2\) is large enough, introducing a first smooth cut-off function \(\theta _R(|{\textbf{u}}|^2)\) such that \(\theta _R=1\) if \(|{\textbf{u}}|^2\le R\) and \(\theta _R=0\) if \(|{\textbf{u}}|^2\ge 2R\), we may find \(R=R_\varepsilon \ge \frac{1}{\varepsilon }\) such that the function

$$\begin{aligned} {\bar{\textbf{f}}}_{\varepsilon } ({\textbf{u}}):=\theta _{R_\varepsilon }(|{\textbf{u}}|^2){\textbf{f}}({\textbf{u}})+\ \varepsilon \nabla _{\textbf{u}} \Psi (|{\textbf{u}}|^2) \end{aligned}$$

satisfies assumptions (2.3) uniformly with respect to \({\textbf{u}}\in {\mathbb {R}}^k\). Note that \({\bar{\textbf{f}}}_\varepsilon ({\textbf{u}})=\varepsilon \nabla _{\textbf{u}} \Psi (|{\textbf{u}}|^2)= 2\varepsilon \Psi '(|{\textbf{u}}|^2){\textbf{u}}\) for \(|{\textbf{u}}|^2\ge 2R_\varepsilon \) and we may make it linear for \(|{\textbf{u}}|^2\ge 3R_\varepsilon \) by cutting off \(\Psi ''\) on the interval \(2R_\varepsilon \le |{\textbf{u}}|^2\le 3R_\varepsilon \). For instance, we may introduce another smooth cut-off function \({\tilde{\theta }}_{R_\varepsilon }(\tau )\) which equals 1 for \(\tau \le 2R_\varepsilon \) and zero for \(\tau \ge 3R_\varepsilon \) and define

$$\begin{aligned} \Psi '_{R_\varepsilon }(s):=\int _0^{s}{\tilde{\theta }}_{R_\varepsilon }(\tau )\Psi ''(\tau )d\tau +\Psi '(0). \end{aligned}$$

This gives

$$\begin{aligned} {\textbf{f}}_\varepsilon ({\textbf{u}})=\theta _{R_\varepsilon }(|{\textbf{u}}|^2){\textbf{f}}({\textbf{u}})+ \varepsilon \nabla _{\textbf{u}}\Psi _{R_\varepsilon }(|{\textbf{u}}|^2). \end{aligned}$$

Taking finally \(\varepsilon =\varepsilon _n=\frac{1}{n}\), we get the desired approximating sequence. \(\square \)
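To see why adding the term \(\varepsilon \nabla _{\textbf{u}}\Psi (|{\textbf{u}}|^2)\) does not spoil assumptions (2.3), one may compute its Jacobian explicitly. The computation below is a sketch under the assumptions \(\Psi '\ge 0\) and \(\Psi ''\ge 0\) on \({\mathbb {R}}_+\); the vector \(\xi \in {\mathbb {R}}^k\) is introduced for illustration.

$$\begin{aligned} \nabla _{\textbf{u}}\Psi (|{\textbf{u}}|^2)=2\Psi '(|{\textbf{u}}|^2){\textbf{u}},\ \ \nabla _{\textbf{u}}\left( 2\Psi '(|{\textbf{u}}|^2){\textbf{u}}\right) \xi .\xi =2\Psi '(|{\textbf{u}}|^2)|\xi |^2+4\Psi ''(|{\textbf{u}}|^2)({\textbf{u}}.\xi )^2\ge 0. \end{aligned}$$

Thus, the added term is monotone and satisfies \(2\varepsilon \Psi '(|{\textbf{u}}|^2){\textbf{u}}.{\textbf{u}}\ge 0\), so both inequalities of (2.3) are preserved with the same constants wherever the cut-off \(\theta _R\) equals one. The same computation applies to \(\Psi _{R_\varepsilon }\), since \(\Psi '_{R_\varepsilon }\ge \Psi '(0)\ge 0\) and \(\Psi ''_{R_\varepsilon }={\tilde{\theta }}_{R_\varepsilon }\Psi ''\ge 0\) provided that \(0\le {\tilde{\theta }}_{R_\varepsilon }\le 1\). In the transition region, where the cut-off \(\theta _R\) varies, an extra argument based on the choice of \(R_\varepsilon \) is needed, which we do not reproduce here.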

We now introduce the approximating system for (2.1)

$$\begin{aligned} \partial _t{\textbf{u}}={\textbf{D}}\Delta _x{\textbf{u}}-{\textbf{f}}_n({\textbf{u}})+{\textbf{g}},\ \ {\textbf{u}}\big |_{t=0}={\textbf{u}}_0^n,\ \ {\textbf{u}}\big |_{\partial \Omega }=0, \end{aligned}$$
(4.4)

where the functions \({\textbf{f}}_n\) are constructed in Lemma 4.1. However, the choice of the approximating initial data \({\textbf{u}}_0^n\) requires some care. Indeed, we cannot just fix \({\textbf{u}}_0^n={\textbf{u}}_0\) since \(\Vert {\textbf{f}}_n({\textbf{u}}_0)\Vert _{L^2}\) need not be uniformly bounded and, as a result, we may lose the estimate of the \({\mathbb {D}}\)-norm of the limit solution. Instead, we define \({\textbf{u}}_0^n\) as a solution of the following auxiliary elliptic problem:

$$\begin{aligned} {\textbf{D}}\Delta _x\textbf{v}-{\textbf{f}}_n(\textbf{v})-K\textbf{v}=\textbf{G}:= {\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)-K{\textbf{u}}_0,\ \ \textbf{v}\big |_{\partial \Omega }=0. \end{aligned}$$
(4.5)

The next lemma gives useful properties of the solutions of this auxiliary problem.

Lemma 4.2

Let the functions \({\textbf{f}}_n\) be as above and \({\textbf{u}}_0\in {\mathbb {D}}\). Then, for every fixed n, problem (4.5) has a unique solution \(\textbf{v}={\textbf{u}}_0^n\). Moreover, \(\Vert {\textbf{u}}_0^n\Vert _{H^2}\) and \(\Vert {\textbf{f}}_n({\textbf{u}}_0^n)\Vert _{L^2}\) are uniformly bounded as \(n\rightarrow \infty \) and

$$\begin{aligned} {\textbf{u}}_0^n\rightharpoondown {\textbf{u}}_0,\ \ {\textbf{f}}_n({\textbf{u}}_0^n)\rightharpoondown {\textbf{f}}({\textbf{u}}_0) \end{aligned}$$
(4.6)

in the spaces \(H^2\) and \(L^2\) respectively.

Proof

Indeed, the existence and uniqueness of a solution for (4.5) is obvious since \({\textbf{f}}_n({\textbf{u}})+K{\textbf{u}}\) are monotone and globally Lipschitz continuous. Let us prove uniform bounds. Indeed, multiplying (4.5) by \(\Delta _x\textbf{v}=\Delta _x{\textbf{u}}_0^n\) and using the monotonicity, we get the estimate

$$\begin{aligned} \Vert {\textbf{u}}_0^n\Vert _{H^2}^2\le C\Vert \textbf{G}\Vert ^2_{L^2}\le C\Vert {\textbf{u}}_0\Vert ^2_{{\mathbb {D}}}, \end{aligned}$$

so \({\textbf{u}}_0^n\) is uniformly bounded in \(H^2\). Expressing \({\textbf{f}}_n(\textbf{v})\) from equation (4.5), we see that \({\textbf{f}}_n({\textbf{u}}_0^n)\) are also uniformly bounded.

Let us verify the convergence. Since \({\textbf{u}}_0^n\) is uniformly bounded in \(H^2\), passing to a subsequence if necessary, we may assume that \({\textbf{u}}_0^n\rightharpoondown \textbf{w}\) weakly in \(H^2\) as \(n\rightarrow \infty \) and \({\textbf{u}}_0^n\rightarrow \textbf{w}\) strongly in \(H^1\). Then, up to a further subsequence, we have the convergence \({\textbf{u}}_0^n(x)\rightarrow \textbf{w}(x)\) almost everywhere. Moreover, from this convergence and Lemma 4.1, we may conclude that \({\textbf{f}}_n({\textbf{u}}_0^n(x))\rightarrow {\textbf{f}}(\textbf{w}(x))\) almost everywhere. Since \({\textbf{f}}_n({\textbf{u}}_0^n)\) are uniformly bounded in \(L^2\), passing to a subsequence again, we infer that \({\textbf{f}}_n({\textbf{u}}_0^n)\rightharpoondown {\textbf{f}}(\textbf{w})\) weakly in \(L^2\). Passing after that to the weak limit \(n\rightarrow \infty \) in equations (4.5), we see that the limit function \(\textbf{w}\) solves

$$\begin{aligned} {\textbf{D}}\Delta _x\textbf{w}-{\textbf{f}}(\textbf{w})-K\textbf{w}=\textbf{G}={\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)-K{\textbf{u}}_0,\ \ \textbf{w}\big |_{\partial \Omega }=0. \end{aligned}$$
(4.7)

Finally, since the solution \(\textbf{w}\in {\mathbb {D}}\) of equation (4.7) is unique (again due to the monotonicity of \({\textbf{f}}({\textbf{u}})+K{\textbf{u}}\)), we conclude that \(\textbf{w}={\textbf{u}}_0\). The uniqueness also gives that passing to a subsequence was not necessary and the whole sequence \({\textbf{u}}_0^n\) converges to \({\textbf{u}}_0\). \(\square \)
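The uniqueness claim for (4.7) can be checked by a short energy argument, sketched below. Here \(\textbf{w}_1,\textbf{w}_2\in {\mathbb {D}}\) denote two hypothetical solutions of (4.7), \(\textbf{z}:=\textbf{w}_1-\textbf{w}_2\), and \(\alpha >0\) is a constant coming from the positivity of \({\textbf{D}}\). Subtracting the two equations, multiplying by \(\textbf{z}\) and integrating by parts (recall that \(\textbf{z}\big |_{\partial \Omega }=0\)), we get

$$\begin{aligned} 0=({\textbf{D}}\nabla _x\textbf{z},\nabla _x\textbf{z})+({\textbf{f}}(\textbf{w}_1)-{\textbf{f}}(\textbf{w}_2)+K\textbf{z},\textbf{z})\ge \alpha \Vert \nabla _x\textbf{z}\Vert ^2_{L^2}, \end{aligned}$$

where the second term is non-negative due to the monotonicity of \({\textbf{f}}({\textbf{u}})+K{\textbf{u}}\) and is well-defined since \({\textbf{f}}(\textbf{w}_i)\in L^2\) for \(\textbf{w}_i\in {\mathbb {D}}\). Thus \(\nabla _x\textbf{z}=0\) and, due to the Dirichlet boundary conditions, \(\textbf{z}=0\).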

We are now ready to state and prove the main result of this section.

Theorem 4.3

Let the nonlinearity \({\textbf{f}}\) and matrix \({\textbf{D}}\) satisfy assumptions (2.3) and (2.2), \({\textbf{g}}\in L^2\) and \({\textbf{u}}_0\in {\mathbb {D}}\). Then, problem (2.1) possesses a unique strong solution \({\textbf{u}}(t)\in {\mathbb {D}}\) which satisfies all estimates formally obtained in Section 3.

Proof

We approximate the desired solution \({\textbf{u}}\) by the approximate solutions \({\textbf{u}}_n(t)\) of problems (4.4), where \({\textbf{f}}_n\) and \({\textbf{u}}_0^n\) are chosen as in Lemmas 4.1 and 4.2. Then, since \({\textbf{f}}_n\) is globally Lipschitz continuous, the existence and uniqueness of a solution \({\textbf{u}}_n(t)\) of (4.4) is straightforward. At the next step, we need to check that all estimates of Section 3 are indeed uniform with respect to n (the justification of all these estimates for the case of sublinear growth rate is also obvious). This is obvious for Lemmas 3.1 and 3.2 (as well as for Lemma 3.6) since \({\textbf{u}}_0^n\rightarrow {\textbf{u}}_0\) strongly in \(H^1\) and \({\textbf{f}}_n\) satisfy (2.3) uniformly with respect to n. Thus, we only need to look at the estimates related to time differentiation and \({\mathbb {D}}\)-norms. The key role in these estimates is played by the \(L^2\)-norm of the time derivative \({\varvec{\theta }}_n(t):=\partial _t{\textbf{u}}_n(t)\), and its \(L^2\)-norm at time t is estimated by its \(L^2\)-norm at time \(t=0\). But, due to our construction,

$$\begin{aligned} {\varvec{\theta }}_n(0)={\textbf{D}}\Delta _x{\textbf{u}}_0^n-{\textbf{f}}_n({\textbf{u}}_0^n)+{\textbf{g}}= {\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)+{\textbf{g}}+K({\textbf{u}}_0^n-{\textbf{u}}_0). \end{aligned}$$
(4.8)

Therefore, according to Lemma 4.2,

$$\begin{aligned} \Vert {\varvec{\theta }}_n(0)\Vert _{L^2}^2\le C(\Vert {\textbf{u}}_0\Vert ^2_{\mathbb D}+\Vert {\textbf{g}}\Vert ^2_{L^2}+1). \end{aligned}$$
(4.9)

For this reason, the analogue of estimate (3.6) on the level of approximations reads

$$\begin{aligned} \Vert \partial _t{\textbf{u}}_n(t)\Vert ^2_{L^2}+\Vert {\textbf{u}}_n(t)\Vert ^2_{H^2}+\Vert {\textbf{f}}_n({\textbf{u}}_n(t))\Vert ^2_{L^2}\le Ce^{K_1t}(\Vert {\textbf{u}}_0\Vert ^2_{{\mathbb {D}}}+\Vert {\textbf{g}}\Vert ^2_{L^2}+1), \end{aligned}$$
(4.10)

where positive constants C and \(K_1\) are independent of t, \({\textbf{u}}_0\) and n.

Once the uniform estimates are obtained, we may pass to the limit \(n\rightarrow \infty \) in equations (4.4) and construct the desired strong solution \({\textbf{u}}(t)\) of the limit problem (2.1) (the passage to the limit in the nonlinear term is done exactly as in Lemma 4.2). The uniqueness of a solution is an immediate corollary of Lemma 3.6 (which does not require justification on the level of strong solutions). Finally, passing to the limit in the corresponding estimates for \({\textbf{u}}_n\), we prove that the limit solution \({\textbf{u}}(t)\) indeed satisfies all of the estimates of Section 3. The only non-immediate point is the passage to the limit in terms like \((\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n)\nabla _x{\textbf{u}}_n,\nabla _x{\textbf{u}}_n)\) (and in the analogous term containing \(\partial _t{\textbf{u}}_n\)) since we do not have any control of the integral norms of \(\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n)\). However, the passage to the limit can be performed here using the condition \(\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n)\ge -K\), the fact that \(\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n(t,x))\rightarrow \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t,x))\) almost everywhere and convexity arguments. Namely, under these conditions, we may establish that

$$\begin{aligned}&\int _T^{T+1}(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t))\nabla _x{\textbf{u}}(t),\nabla _x{\textbf{u}}(t))\,dt\le \nonumber \\&\quad \le \liminf _{n\rightarrow \infty }\int _T^{T+1}(\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n(t)) \nabla _x{\textbf{u}}_n(t),\nabla _x{\textbf{u}}_n(t))\,dt, \end{aligned}$$
(4.11)

see, e.g., [4], Theorem 5.4. Thus, the theorem is proved. \(\square \)
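For the reader's convenience, we write out the verification of identity (4.8) used in the proof. It is a direct consequence of the auxiliary problem (4.5) taken at \(\textbf{v}={\textbf{u}}_0^n\), with \(\textbf{G}\) as in the right-hand side of (4.7):

$$\begin{aligned} {\varvec{\theta }}_n(0)&={\textbf{D}}\Delta _x{\textbf{u}}_0^n-{\textbf{f}}_n({\textbf{u}}_0^n)+{\textbf{g}}=\textbf{G}+K{\textbf{u}}_0^n+{\textbf{g}}=\nonumber \\&={\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)-K{\textbf{u}}_0+K{\textbf{u}}_0^n+{\textbf{g}}= {\textbf{D}}\Delta _x{\textbf{u}}_0-{\textbf{f}}({\textbf{u}}_0)+{\textbf{g}}+K({\textbf{u}}_0^n-{\textbf{u}}_0). \end{aligned}$$

Taking the \(L^2\)-norm and using the triangle inequality together with the uniform bound for \({\textbf{u}}_0^n\) from Lemma 4.2, we recover a bound of the form (4.9). The point of the construction is that the possibly unbounded quantity \({\textbf{f}}_n({\textbf{u}}_0)\) never appears.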

Weak solutions, dissipativity and attractors

In the last section, we have proved the global existence and uniqueness of strong solutions of (2.1). Thus, the solution semigroup

$$\begin{aligned} S(t):{\mathbb {D}}\rightarrow {\mathbb {D}},\ \ S(t){\textbf{u}}_0:={\textbf{u}}(t) \end{aligned}$$
(5.1)

is well-defined. Moreover, according to Lemma 3.6, this semigroup is globally Lipschitz continuous in the \(L^2\)-metric:

$$\begin{aligned} \Vert S(t){\textbf{u}}_0^1-S(t){\textbf{u}}_0^2\Vert _{L^2}^2\le Ce^{K_1t}\Vert {\textbf{u}}_0^1-{\textbf{u}}_0^2\Vert _{L^2}^2,\ \ {\textbf{u}}_0^1,{\textbf{u}}_0^2\in {\mathbb {D}}. \end{aligned}$$
(5.2)

Therefore, we can extend this semigroup by continuity from \({\mathbb {D}}\) to its closure in \(L^2\) which obviously coincides with the whole \(L^2\) (since \(C_0^\infty (\Omega )\subset {\mathbb {D}}\) and \(C_0^\infty (\Omega )\) is dense in \(L^2\)). Hence, the semigroup

$$\begin{aligned} {\widehat{S}}(t){\textbf{u}}_0:=\lim _{n\rightarrow \infty }S(t){\textbf{u}}_0^n,\ \ {\textbf{u}}_0^n\in {\mathbb {D}},\ \ {\textbf{u}}_0=\lim _{n\rightarrow \infty }{\textbf{u}}_0^n \end{aligned}$$
(5.3)

is well-defined. Moreover, the limit in (5.3) can be considered in the space \(C(0,T;L^2(\Omega ))\), so the trajectories \({\textbf{u}}(t):={\widehat{S}}(t){\textbf{u}}_0\) automatically belong to the space \(C(0,T;L^2(\Omega ))\) for all \(T\ge 0\).

Our next step is to understand in what sense the trajectory \({\textbf{u}}(t)\) thus constructed satisfies the initial equation (2.1). Note that, for general \({\textbf{u}}_0\in L^2\), we do not know whether or not \({\textbf{f}}({\textbf{u}})\in L^1\). Indeed, the only control related to \({\textbf{f}}({\textbf{u}})\) which we have up to now follows from estimate (3.1) and claims that \({\textbf{f}}({\textbf{u}}).{\textbf{u}}\in L^1\) (being pedantic, this is proved for the strong solutions only, but it can be easily extended to weak solutions using the Fatou lemma). Unfortunately, this is not enough to control the \(L^1\)-norm of the function \({\textbf{f}}({\textbf{u}})\) itself, so we cannot treat the term \({\textbf{f}}({\textbf{u}})\) in the distributional sense.

Instead, we use the ideas from the monotone operator theory and variational inequalities, see e.g., [5]. Namely, following [21], we take an arbitrary test function

$$\begin{aligned} \textbf{v}\in C_{\textrm{w}}(0,T;{\mathbb {D}})\cap C^1_{\textrm{w}}(0,T;L^2(\Omega )),\ \ \forall T\in {\mathbb {R}}_+, \end{aligned}$$
(5.4)

and multiply formally equation (2.1) by \({\textbf{u}}(t)-\textbf{v}(t)\) and integrate over [0, T]. Then, integrating by parts and using that

$$\begin{aligned} -({\textbf{D}}\Delta _x{\textbf{u}}-{\textbf{D}}\Delta _x\textbf{v},{\textbf{u}}-\textbf{v})\ge 0,\ \ ({\textbf{f}}({\textbf{u}})-{\textbf{f}}(\textbf{v}),{\textbf{u}}-\textbf{v})\ge -K\Vert {\textbf{u}}-\textbf{v}\Vert ^2_{L^2}, \end{aligned}$$

we end up with

$$\begin{aligned}&\frac{1}{2}\Vert {\textbf{u}}(T)-\textbf{v}(T)\Vert ^2_{L^2}-\frac{1}{2}\Vert {\textbf{u}}(0)-\textbf{v}(0)\Vert ^2_{L^2}+ \nonumber \\&\quad \quad +\int _0^T(\partial _t\textbf{v}(t),{\textbf{u}}(t)-\textbf{v}(t))\,dt\le \nonumber \\&\quad \le \int _0^T({\textbf{D}}\Delta _x\textbf{v}(t)-{\textbf{f}}(\textbf{v}(t))+{\textbf{g}},{\textbf{u}}(t)-\textbf{v}(t))\,dt +K\int _0^T\Vert {\textbf{u}}(t)-\textbf{v}(t)\Vert ^2_{L^2}\,dt.\nonumber \\ \end{aligned}$$
(5.5)

The advantage of this approach is that the variational inequality (5.5) makes sense for all \({\textbf{u}}\in C(0,T;L^2(\Omega ))\) and, therefore, can be used to define a weak solution \({\textbf{u}}(t)\) of problem (2.1).
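The formal computation leading to (5.5) can be written out as follows. This is a sketch for a sufficiently regular solution \({\textbf{u}}\) and a test function \(\textbf{v}\) satisfying (5.4). The left-hand side of equation (2.1) multiplied by \({\textbf{u}}-\textbf{v}\) gives

$$\begin{aligned} (\partial _t{\textbf{u}},{\textbf{u}}-\textbf{v})=\frac{1}{2}\frac{d}{dt}\Vert {\textbf{u}}-\textbf{v}\Vert ^2_{L^2}+(\partial _t\textbf{v},{\textbf{u}}-\textbf{v}), \end{aligned}$$

whereas for the right-hand side we have

$$\begin{aligned} ({\textbf{D}}\Delta _x{\textbf{u}}-{\textbf{f}}({\textbf{u}})+{\textbf{g}},{\textbf{u}}-\textbf{v})&=({\textbf{D}}\Delta _x\textbf{v}-{\textbf{f}}(\textbf{v})+{\textbf{g}},{\textbf{u}}-\textbf{v})+\nonumber \\&\quad +({\textbf{D}}\Delta _x({\textbf{u}}-\textbf{v}),{\textbf{u}}-\textbf{v})-({\textbf{f}}({\textbf{u}})-{\textbf{f}}(\textbf{v}),{\textbf{u}}-\textbf{v})\le \nonumber \\&\le ({\textbf{D}}\Delta _x\textbf{v}-{\textbf{f}}(\textbf{v})+{\textbf{g}},{\textbf{u}}-\textbf{v})+K\Vert {\textbf{u}}-\textbf{v}\Vert ^2_{L^2}. \end{aligned}$$

Equating the two expressions and integrating over \([0,T]\), we arrive at (5.5). Note that the nonlinearity enters the final inequality only through \({\textbf{f}}(\textbf{v})\), which belongs to \(L^2\) for the test functions considered.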

Definition 5.1

A function \({\textbf{u}}\in C(0,T;L^2(\Omega ))\), \(T\in {\mathbb {R}}_+\) is a weak solution of problem (2.1) if \({\textbf{u}}(0)={\textbf{u}}_0\) and the variational inequality (5.5) holds for every \(T\in {\mathbb {R}}_+\) and every test function \(\textbf{v}\) satisfying (5.4).

We are now ready to state the key result of this section.

Theorem 5.2

Let \({\textbf{g}}\in L^2(\Omega )\), the diffusion matrix \({\textbf{D}}\) enjoy \({\textbf{D}}+{\textbf{D}}^*>0\) and \({\textbf{f}}\) satisfy assumptions (2.3). Then, for every \({\textbf{u}}_0\in L^2(\Omega )\), there exists a unique weak solution \({\textbf{u}}(t)\) of problem (2.1) and this solution has the form \({\textbf{u}}(t)=\widehat{S}(t){\textbf{u}}_0\), where the extension \({\widehat{S}}(t)\) of the solution semigroup S(t) is defined by (5.3).

Proof

Indeed, any strong solution is a weak solution of (2.1) (since all manipulations used in the derivation of the variational inequality (5.5) are justified on the level of strong solutions). Let now \({\textbf{u}}(t)=\lim _{n\rightarrow \infty }{\textbf{u}}_n(t)\), where \({\textbf{u}}_n(t)\) are the strong solutions \({\textbf{u}}_n(t):=S(t){\textbf{u}}_0^n\), \({\textbf{u}}_0^n\in {\mathbb {D}}\) and \({\textbf{u}}_0^n\rightarrow {\textbf{u}}_0\) in \(L^2\). The variational inequality for \({\textbf{u}}_n\) reads

$$\begin{aligned}&\frac{1}{2}\Vert {\textbf{u}}_n(T)-\textbf{v}(T)\Vert ^2_{L^2}-\frac{1}{2}\Vert {\textbf{u}}_n(0)-\textbf{v}(0)\Vert ^2_{L^2}+\nonumber \\&\qquad + \int _0^T(\partial _t\textbf{v}(t),{\textbf{u}}_n(t)-\textbf{v}(t))\,dt\le \nonumber \\&\quad \le \int _0^T({\textbf{D}}\Delta _x\textbf{v}(t)-{\textbf{f}}(\textbf{v}(t))+{\textbf{g}},{\textbf{u}}_n(t)-\textbf{v}(t))\,dt+ K\int _0^T\Vert {\textbf{u}}_n(t)-\textbf{v}(t)\Vert ^2_{L^2}\,dt.\nonumber \\ \end{aligned}$$
(5.6)

Using that \({\textbf{u}}_n\rightarrow {\textbf{u}}\) in \(C(0,T;L^2(\Omega ))\) and passing to the limit \(n\rightarrow \infty \) in (5.6), we see that \({\textbf{u}}(t)\) satisfies (5.5) and, therefore, \({\textbf{u}}(t):={\widehat{S}}(t){\textbf{u}}_0\) is the desired weak solution of (2.1).

Conversely, let \({\bar{\textbf{u}}}(t)\) be a weak solution of (2.1) and let \({\textbf{u}}(t):={\widehat{S}}(t){\bar{\textbf{u}}}(0)\). Then, there exists a sequence \({\textbf{u}}_0^n\in {\mathbb {D}}\) such that \({\textbf{u}}_0^n\rightarrow {\bar{\textbf{u}}}(0)\) in \(L^2\) and the sequence of strong solutions \({\textbf{u}}_n(t)=S(t){\textbf{u}}_0^n\) converges as \(n\rightarrow \infty \) to the weak solution \({\textbf{u}}(t)\). We need to show that \({\textbf{u}}(t)={\bar{\textbf{u}}}(t)\).

Indeed, by the definition of a strong solution, \({\textbf{u}}_n(t)\) satisfies the assumptions of (5.4) and therefore can be used as a test function \(\textbf{v}={\textbf{u}}_n\) in the variational inequality (5.5) for the weak solution \({\bar{\textbf{u}}}\). Taking \(\textbf{v}={\textbf{u}}_n\) in it and using that \(\partial _t{\textbf{u}}_n={\textbf{D}}\Delta _x{\textbf{u}}_n-{\textbf{f}}({\textbf{u}}_n)+{\textbf{g}}\), we get

$$\begin{aligned} \frac{1}{2}\Vert {\bar{\textbf{u}}}(T)-{\textbf{u}}_n(T)\Vert _{L^2}^2\le K\int _0^T\Vert {\bar{\textbf{u}}}(t)-{\textbf{u}}_n(t)\Vert ^2_{L^2}\,dt+\frac{1}{2}\Vert {\bar{\textbf{u}}}(0)-{\textbf{u}}_n(0)\Vert ^2_{L^2}. \end{aligned}$$

Passing to the limit \(n\rightarrow \infty \) in this inequality, we get

$$\begin{aligned} \Vert {\bar{\textbf{u}}}(T)-{\textbf{u}}(T)\Vert ^2_{L^2}\le 2K\int _0^T\Vert {\bar{\textbf{u}}}(t)-{\textbf{u}}(t)\Vert ^2_{L^2}\,dt \end{aligned}$$

and, since T is arbitrary, the Gronwall inequality gives that \({\bar{\textbf{u}}}(t)={\textbf{u}}(t)\) for all t. Thus, the theorem is proved. \(\square \)
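The final Gronwall step may be spelled out as follows. The auxiliary functions y and Y are introduced here for illustration only. Let \(y(t):=\Vert {\bar{\textbf{u}}}(t)-{\textbf{u}}(t)\Vert ^2_{L^2}\), which is continuous since both functions belong to \(C(0,T;L^2(\Omega ))\), and let \(Y(T):=\int _0^Ty(t)\,dt\). Then

$$\begin{aligned} Y'(T)=y(T)\le 2KY(T),\ \ Y(0)=0\ \ \Longrightarrow \ \ \frac{d}{dT}\left( e^{-2KT}Y(T)\right) \le 0\ \ \Longrightarrow \ \ Y(T)\le 0. \end{aligned}$$

Since \(Y\ge 0\) by definition, we conclude that \(Y\equiv 0\) and, therefore, \(y\equiv 0\).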

Remark 5.3

There is an alternative possibility to relate a weak solution \({\textbf{u}}(t)\) with equation (2.1), namely, for every \({\varvec{\psi }}\in C_0^\infty ({\mathbb {R}}^k,{\mathbb {R}}^k)\), the identity

$$\begin{aligned} \partial _t({\varvec{\psi }}({\textbf{u}}))=\nabla _{\textbf{u}}{\varvec{\psi }}({\textbf{u}}){\textbf{D}}\Delta _x{\textbf{u}}- \nabla _{\textbf{u}}{\varvec{\psi }}({\textbf{u}}){\textbf{f}}({\textbf{u}})+\nabla _{\textbf{u}}{\varvec{\psi }}({\textbf{u}}){\textbf{g}} \end{aligned}$$
(5.7)

should be satisfied in the sense of distributions, see [46]. It is not difficult to verify that, indeed, any weak solution \({\textbf{u}}(t)\) should satisfy this identity. The drawback of this approach is that it is unclear whether or not (5.7) is enough for the uniqueness.

Using identity (5.7) it is not difficult to show that under the additional restriction

$$\begin{aligned} |{\textbf{f}}({\textbf{u}})|\le C(|{\textbf{f}}({\textbf{u}}).{\textbf{u}}|+|{\textbf{u}}|+1) \end{aligned}$$
(5.8)

which guarantees that \({\textbf{f}}({\textbf{u}})\in L^1([0,T]\times \Omega )\), any weak solution satisfies equation (2.1) in the sense of distributions. However, in contrast to the scalar case \(k=1\), in the case of systems, (5.8) is an extra restriction which we prefer to avoid.

As a next step, we note that the weak solutions are dissipative. Indeed, passing to the limit in the estimate of Lemma 3.1 for strong solutions, we derive that

$$\begin{aligned} \Vert {\widehat{S}}(T){\textbf{u}}_0\Vert _{L^2}^2+\int _T^{T+1}\Vert \widehat{S}(t){\textbf{u}}_0\Vert ^2_{H^1}\,dt\le Ce^{-\alpha T}\Vert {\textbf{u}}_0\Vert ^2_{L^2}+C(1+\Vert {\textbf{g}}\Vert ^2_{L^2}) \end{aligned}$$
(5.9)

which is a standard dissipative estimate for the semigroup \(\widehat{S}(t)\). Analogously, passing to the limit in the estimate of Lemma 3.2, we get the dissipative estimate in \(H^1\) for weak solutions

$$\begin{aligned} \Vert {\widehat{S}}(T){\textbf{u}}_0\Vert _{H^1}^2+\int _T^{T+1}\Vert \widehat{S}(t){\textbf{u}}_0\Vert ^2_{H^2}\,dt\le Ce^{-\alpha T}\Vert {\textbf{u}}_0\Vert ^2_{H^1}+C(1+\Vert {\textbf{g}}\Vert ^2_{L^2}). \end{aligned}$$
(5.10)

In addition, estimates (5.9) and (5.10) give in a standard way the \(L^2\)-\(H^1\) smoothing property for the semigroup \({\widehat{S}}(t)\), namely, the following estimate holds:

$$\begin{aligned} \Vert {\widehat{S}}(t){\textbf{u}}_0\Vert _{H^1}^2\le C\frac{t+1}{t}\left( e^{-\alpha t} \Vert {\textbf{u}}_0\Vert _{L^2}^2+1+\Vert {\textbf{g}}\Vert ^2_{L^2}\right) . \end{aligned}$$
(5.11)
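The standard argument behind (5.11) can be sketched as follows. The computation is formal: it is performed on the level of strong solutions and then extended by passing to the limit, and the constants C, \(C'\), \(C''\) below are generic. Dropping the non-negative terms in the left-hand side of (3.3) and multiplying by t, we get, for \(t\in (0,1]\),

$$\begin{aligned} \frac{d}{dt}\left( t\Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}\right) \le \Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}+Ct\left( \Vert {\textbf{g}}\Vert ^2_{L^2}+\Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}\right) \le (1+C)\Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}+C\Vert {\textbf{g}}\Vert ^2_{L^2}. \end{aligned}$$

Integrating over \([0,t]\) and estimating the integral of \(\Vert \nabla _x{\textbf{u}}\Vert ^2_{L^2}\) by (5.9) with \(T=0\), we arrive at

$$\begin{aligned} t\Vert \nabla _x{\textbf{u}}(t)\Vert ^2_{L^2}\le (1+C)\int _0^1\Vert \nabla _x{\textbf{u}}(s)\Vert ^2_{L^2}\,ds+C\Vert {\textbf{g}}\Vert ^2_{L^2}\le C'\left( \Vert {\textbf{u}}_0\Vert ^2_{L^2}+1+\Vert {\textbf{g}}\Vert ^2_{L^2}\right) ,\ \ t\in (0,1]. \end{aligned}$$

For \(t\ge 1\), we apply this bound at time one to the trajectory starting from \({\widehat{S}}(t-1){\textbf{u}}_0\) and use (5.9) once more:

$$\begin{aligned} \Vert {\widehat{S}}(t){\textbf{u}}_0\Vert ^2_{H^1}\le C'\left( \Vert {\widehat{S}}(t-1){\textbf{u}}_0\Vert ^2_{L^2}+1+\Vert {\textbf{g}}\Vert ^2_{L^2}\right) \le C''\left( e^{-\alpha t}\Vert {\textbf{u}}_0\Vert ^2_{L^2}+1+\Vert {\textbf{g}}\Vert ^2_{L^2}\right) . \end{aligned}$$

Combining the two cases gives an estimate of the form (5.11).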

These estimates ensure that the ball \({\mathcal {B}}_R:=\{{\textbf{u}}\in H^1_0,\ \Vert {\textbf{u}}\Vert _{H^1}\le R\}\) will be a compact (in \(L^2\)) absorbing ball for the semigroup \({\widehat{S}}(t)\) if \(R=R(\Vert {\textbf{g}}\Vert _{L^2})\) is large enough. Recall that the latter means that for every bounded set B of \(L^2\) there exists a time \(T=T(B)\) such that

$$\begin{aligned} {\widehat{S}}(t)B\subset {\mathcal {B}}_R \end{aligned}$$

for all \(t\ge T\). This fact, in turn, allows us to establish the existence of a global attractor for the solution semigroup \(\widehat{S}(t)\) in the phase space \(H=L^2(\Omega )\). We recall that, by definition, a set \({\mathcal {A}}\subset H\) is a global attractor for a semigroup \({\widehat{S}}(t): H\rightarrow H\) if the following conditions are satisfied:

  (1) \({\mathcal {A}}\) is a compact subset of H;

  (2) \({\mathcal {A}}\) is strictly invariant: \({\widehat{S}}(t)\mathcal A={\mathcal {A}}\), for all \(t\ge 0\);

  (3) It attracts the images of bounded sets as \(t\rightarrow \infty \), namely, for any bounded \(B\subset H\) and any neighbourhood \(\mathcal O({\mathcal {A}})\), there exists \(T=T(B,{\mathcal {O}})\) such that

    $$\begin{aligned} {\widehat{S}}(t)B\subset {\mathcal {O}}({\mathcal {A}}) \end{aligned}$$

    for all \(t\ge T\).

The next theorem may be considered as the second key result of this section.

Theorem 5.4

Let the assumptions of Theorem 5.2 be satisfied. Then the weak solution semigroup \({\widehat{S}}(t)\) possesses a global attractor \({\mathcal {A}}\) in \(H=L^2(\Omega )\) which is a bounded set of \(H^1(\Omega )\) and possesses the following description:

$$\begin{aligned} {\mathcal {A}}={\mathcal {K}}\big |_{t=0}, \end{aligned}$$
(5.12)

where \({\mathcal {K}}\subset C_b({\mathbb {R}},L^2(\Omega ))\) is the set of all complete bounded trajectories of the semigroup \({\widehat{S}}(t)\):

$$\begin{aligned} {\mathcal {K}}:=\{{\textbf{u}}\in C_b({\mathbb {R}},L^2(\Omega )),\ {\textbf{u}}(t+h)={\widehat{S}}(t){\textbf{u}}(h),\ h\in {\mathbb {R}},\ t\in {\mathbb {R}}_+\}. \end{aligned}$$
(5.13)

Proof

According to the abstract attractor existence theorem, see e.g., [3], we need to verify two assumptions:

  (1) The operators \({\widehat{S}}(t): H\rightarrow H\) are continuous for every fixed t;

  (2) The semigroup \({\widehat{S}}(t)\) possesses a compact absorbing set in H.

The first assumption is guaranteed by Lemma 3.6 and the second one follows from estimate (5.11). Thus, the global attractor \({\mathcal {A}}\) exists. The fact that \({\mathcal {A}}\) is a bounded subset of \(H^1\) follows from the fact that the attractor is always a subset of an absorbing set and the representation formula (5.12) is a standard corollary of the attractor existence theorem. Thus, the theorem is proved. \(\square \)
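The second assumption can be made quantitative. The following is a sketch in which the radius \(\rho \) and the specific choices of R and T(B) are ours and are by no means optimal. Let \(\rho :=\sup _{{\textbf{u}}_0\in B}\Vert {\textbf{u}}_0\Vert _{L^2}\) and let C be the constant from (5.11). Since \(\frac{t+1}{t}\le 2\) for \(t\ge 1\), estimate (5.11) gives

$$\begin{aligned} \Vert {\widehat{S}}(t){\textbf{u}}_0\Vert ^2_{H^1}\le 2C\left( e^{-\alpha t}\rho ^2+1+\Vert {\textbf{g}}\Vert ^2_{L^2}\right) ,\ \ t\ge 1,\ \ {\textbf{u}}_0\in B. \end{aligned}$$

Therefore, with

$$\begin{aligned} R^2:=4C(1+\Vert {\textbf{g}}\Vert ^2_{L^2}),\ \ T(B):=\max \left\{ 1,\frac{1}{\alpha }\ln \left( 1+\frac{\rho ^2}{1+\Vert {\textbf{g}}\Vert ^2_{L^2}}\right) \right\} , \end{aligned}$$

we have \(e^{-\alpha t}\rho ^2\le 1+\Vert {\textbf{g}}\Vert ^2_{L^2}\) and, consequently, \({\widehat{S}}(t)B\subset {\mathcal {B}}_R\) for all \(t\ge T(B)\). The compactness of \({\mathcal {B}}_R\) in \(L^2\) follows from the compactness of the embedding \(H^1_0(\Omega )\subset L^2(\Omega )\) for bounded \(\Omega \).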

Weak to strong smoothing property

In this section, we establish that any weak solution \({\textbf{u}}(t)\) of problem (2.1) becomes strong for \(t>0\). The main difficulty here is the fact that we cannot in general estimate \(|{\textbf{f}}({\textbf{u}})|\) through \({\textbf{f}}({\textbf{u}}).{\textbf{u}}\) and, for this reason, we do not know whether or not \({\textbf{f}}({\textbf{u}})\) and \(\partial _t{\textbf{u}}\) are distributions. This makes the situation with the parabolic smoothing property a bit more delicate than usual. We overcome this difficulty under the extra assumption that

$$\begin{aligned} |{\textbf{f}}({\textbf{u}})|\le C(1+|{\textbf{u}}|^p) \end{aligned}$$
(6.1)

for some \(p>1\) by using the \(L^q\)-spaces with \(q<1\). Namely, we will use the fact that

$$\begin{aligned} \Vert {\textbf{f}}({\textbf{u}})\Vert _{L^{2/p}}\le C(\Vert {\textbf{u}}\Vert ^{p}_{L^2}+1), \end{aligned}$$
(6.2)

where, for \(q<1\), we denote by \(\Vert \textbf{v}\Vert _{L^q}\) exactly the same expression as for the case \(q\ge 1\) (simply ignoring the fact that it is no longer a norm). Thus, at least on the level of approximations, we may expect that, for \({\varvec{\theta }}=\partial _t{\textbf{u}}\),

$$\begin{aligned} \Vert {\varvec{\theta }}(t)\Vert _{L^2(0,1;L^2(\Omega ))+L^\infty (0,1;L^{2/p}(\Omega ))}\le C(\Vert {\textbf{u}}_0\Vert _{H^1}^{p}+\Vert {\textbf{g}}\Vert ^{p}_{L^2}+1) \end{aligned}$$
(6.3)

and this can be used in order to establish the smoothing property for \({\varvec{\theta }}\). Namely, the following theorem holds.

Theorem 6.1

Let \(d\ge 3\), the assumptions of Theorem 5.2 and (6.1) hold, and let \({\textbf{u}}_0\in H^1\) and \({\textbf{u}}(t)\) be the corresponding weak solution of equation (2.1). Then, \({\textbf{u}}(t)\in {\mathbb {D}}\) for all \(t>0\) and the following estimate is valid:

$$\begin{aligned} \Vert \partial _t{\textbf{u}}(t)\Vert _{L^2}^2+\Vert {\textbf{u}}(t)\Vert ^2_{{\mathbb {D}}}\le Ct^{-N} \left( Q(\Vert {\textbf{u}}_0\Vert _{H^1})+Q(\Vert {\textbf{g}}\Vert _{L^2})\right) ,\ \ t\in (0,1], \end{aligned}$$
(6.4)

where the exponent \(N=N(d,p)\), positive constant C and the monotone increasing function Q are independent of \({\textbf{u}}_0\) and t. Moreover, the function Q(z) can be chosen as a polynomial of z.

Proof

We first note that it is enough to verify (6.4) for \({\textbf{u}}_0\in {\mathbb {D}}\) only, in which case \({\textbf{u}}(t)\) is a strong solution. Moreover, it is enough to obtain the estimate for \(\partial _t{\textbf{u}}\) only since the estimate for \(\Vert {\textbf{u}}(t)\Vert _{\mathbb D}\) will follow from the elliptic problem (3.7). Second, we approximate the strong solution \({\textbf{u}}(t)\) by the solutions \({\textbf{u}}_n(t)\) of auxiliary problems (4.4). Finally, analyzing the proof of Lemma 4.1, we see that assumption (6.1) allows us to impose the extra assumption

$$\begin{aligned} |{\textbf{f}}_n({\textbf{u}})|\le C(1+|{\textbf{u}}|^{p_1}) \end{aligned}$$
(6.5)

for some \(p_1>p>1\) and constant C independent of n (e.g., by taking \(\Psi ({\textbf{u}}):=|{\textbf{u}}|^{p_1+1}\)).

Let \({\varvec{\theta }}_n(t):=\partial _t{\textbf{u}}_n(t)\). Then, this function solves the equation

$$\begin{aligned} \partial _t{\varvec{\theta }}_n={\textbf{D}}\Delta _x{\varvec{\theta }}_n-\nabla _{\textbf{u}}{{\textbf{f}}_n}({\textbf{u}}_n){\varvec{\theta }}_n. \end{aligned}$$
(6.6)

Multiplying this equation by \(t^N{\varvec{\theta }}_n(t)\) where \(N>1\) is a sufficiently big number, we end up with

$$\begin{aligned}&\frac{d}{dt}(t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2})+\alpha t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{H^1}- \nonumber \\&\quad -2K(t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2})\le C t^{N-1}\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2}. \end{aligned}$$
(6.7)

We need to estimate the term on the right-hand side. To this end, we fix some \(s\in (0,2)\) which will be specified below and write

$$\begin{aligned} \Vert {\varvec{\theta }}_n\Vert ^2_{L^2}= & {} (|{\varvec{\theta }}_n|^2,1)=(|{\varvec{\theta }}_n|^s,|{\varvec{\theta }}_n|^{2-s})\le \nonumber \\\le & {} C(|\Delta _x{\textbf{u}}_n|^s,|{\varvec{\theta }}_n|^{2-s})+C(|{\textbf{g}}|^s,|{\varvec{\theta }}_n|^{2-s})+C(|{\textbf{f}}_n({\textbf{u}}_n)|^s,|{\varvec{\theta }}_n|^{2-s}),\nonumber \\ \end{aligned}$$
(6.8)

where we have used equation (4.4) in order to express \({\varvec{\theta }}_n\) through \({\textbf{u}}_n\). Let us estimate every term in the RHS separately. Applying the Hölder and Young inequalities to the first term, we get

$$\begin{aligned} t^{N-1}(|\Delta _x{\textbf{u}}_n|^s,|{\varvec{\theta }}_n|^{2-s})\le & {} C\Vert \Delta _x{\textbf{u}}_n\Vert _{L^2}^s\left( t^{\frac{N-1}{2-s}} \Vert {\varvec{\theta }}_n\Vert _{L^2}\right) ^{2-s}\le \nonumber \\\le & {} C\Vert \Delta _x{\textbf{u}}_n\Vert ^2_{L^2}+ t^{2\frac{N-1}{2-s}}\Vert {\varvec{\theta }}_n\Vert ^2_{L^2}. \end{aligned}$$
(6.9)

This gives us a good estimate if N is chosen in such a way that \(2\frac{N-1}{2-s}\ge N\). The second term in the RHS of (6.8) can be estimated analogously to have

$$\begin{aligned} t^{N-1}(|{\textbf{g}}|^s,|{\varvec{\theta }}_n|^{2-s})\le C\Vert {\textbf{g}}\Vert ^2_{L^2}+ t^{2\frac{N-1}{2-s}}\Vert {\varvec{\theta }}_n\Vert ^2_{L^2}. \end{aligned}$$
(6.10)
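The requirement on N in (6.9) and (6.10) can be made explicit. The following computation is only a sketch of the arithmetic behind it and uses nothing beyond \(0<s<2\) and \(t\in (0,1]\):

$$\begin{aligned} 2\frac{N-1}{2-s}\ge N\ \Longleftrightarrow \ 2N-2\ge 2N-sN\ \Longleftrightarrow \ N\ge \frac{2}{s},\qquad \text{and then}\ \ t^{2\frac{N-1}{2-s}}\le t^{N},\ \ t\in (0,1]. \end{aligned}$$

Thus, for such N, the last terms in (6.9) and (6.10) are controlled by \(t^N\Vert {\varvec{\theta }}_n\Vert ^2_{L^2}\), which is exactly the quantity differentiated in (6.7).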

Let us now estimate the most complicated third term. To this end, we use the embedding theorem \(H^1\subset L^r\) where \(\frac{1}{r}=\frac{1}{2}-\frac{1}{d}\) together with the Hölder inequality with exponents \(q_1\) and \(q_2\), \(\frac{1}{q_1}+\frac{1}{q_2}=1\), with \((2-s)q_2=r\) to get

$$\begin{aligned} (|{\textbf{f}}_n({\textbf{u}}_n)|^s,|{\varvec{\theta }}_n|^{2-s})\le \Vert {\textbf{f}}_n({\textbf{u}}_n)\Vert _{L^{sq_1}}^s \Vert {\varvec{\theta }}_n\Vert _{H^1}^{2-s}. \end{aligned}$$

Moreover, due to assumption (6.5), we have

$$\begin{aligned} \Vert {\textbf{f}}_n({\textbf{u}}_n)\Vert _{L^{2/p_1}}\le C(\Vert {\textbf{u}}_n\Vert _{L^2}+1)^{p_1}\le C(\Vert {\textbf{u}}_0\Vert _{L^2} +1+\Vert {\textbf{g}}\Vert _{L^2})^{p_1}, \end{aligned}$$

where C is independent of n. We may also fix \(sq_1=\frac{2}{p_1}\) to get

$$\begin{aligned} (|{\textbf{f}}_n({\textbf{u}}_n)|^s,|{\varvec{\theta }}_n|^{2-s})\le C(\Vert {\textbf{u}}_0\Vert _{L^2}+ 1+\Vert {\textbf{g}}\Vert _{L^2})^{sp_1}\Vert {\varvec{\theta }}_n\Vert ^{2-s}_{H^1} \end{aligned}$$
(6.11)

and end up with the following system for the exponents \(q_1\), \(q_2\) and s

$$\begin{aligned} \frac{1}{q_1}+\frac{1}{q_2}=1,\ \ \frac{1}{q_1}=\frac{sp_1}{2},\ \ \frac{1}{q_2}=(2-s)\left( \frac{1}{2}-\frac{1}{d}\right) . \end{aligned}$$

Solving this system, we get

$$\begin{aligned} s=\frac{4}{d(p_1-1)+2},\ \ q_1=\frac{d(p_1-1)+2}{2p_1} \end{aligned}$$

and we see that \(0<s<2\) and \(1<q_1<\infty \), so all of the exponents are in the prescribed range and (6.11) holds indeed. Applying the Young inequality, we arrive at

$$\begin{aligned} t^{N-1}(|{\textbf{f}}_n({\textbf{u}}_n)|^s,|{\varvec{\theta }}_n|^{2-s})\le Q_\varepsilon (\Vert {\textbf{u}}_0\Vert _{L^2})+Q_\varepsilon (\Vert {\textbf{g}}\Vert _{L^2})+ \varepsilon t^{2\frac{N-1}{2-s}}\Vert {\varvec{\theta }}_n\Vert ^2_{H^1}, \end{aligned}$$
(6.12)

where \(\varepsilon >0\) is arbitrarily small and the polynomial monotone function \(Q_\varepsilon \) is independent of \({\textbf{u}}_0\). Combining estimates (6.8), (6.9), (6.10) and (6.12) for estimating the RHS of (6.7) and fixing \(\varepsilon >0\) to be small enough and N satisfying \(2\frac{N-1}{2-s}\ge N\), we arrive at

$$\begin{aligned} \frac{d}{dt}\left( t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2}\right)\le & {} K_1\left( t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2}\right) + C\Vert \Delta _x{\textbf{u}}_n(t)\Vert ^2_{L^2}\nonumber \\&+\,\,Q(\Vert {\textbf{u}}_0^n\Vert _{H^1})+Q(\Vert {\textbf{g}}\Vert _{L^2}),\ \ t\le 1.\qquad \end{aligned}$$
(6.13)

Applying the Gronwall inequality to this relation and using (5.10) for estimating the integral of the \(H^2\)-norm of the solution, we end up with

$$\begin{aligned} t^N\Vert {\varvec{\theta }}_n(t)\Vert ^2_{L^2}\le Q(\Vert {\textbf{u}}_0^n\Vert _{H^1})+Q(\Vert {\textbf{g}}\Vert _{L^2}),\ \ t\in (0,1] \end{aligned}$$
(6.14)

and passing to the limit \(n\rightarrow \infty \), we derive the desired estimate for \(\partial _t{\textbf{u}}(t)\). Thus, the theorem is proved. \(\square \)
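For the reader's convenience, we sketch how the exponents in the proof above fit together. This is only a verification of the arithmetic and introduces no new assumptions; the abbreviation A below is a notation used for this check only. Inserting the last two relations of the system for \(q_1\), \(q_2\) and s into the first one and multiplying by 2d gives

$$\begin{aligned} sp_1d+(2-s)(d-2)=2d\ \Longleftrightarrow \ s\left( d(p_1-1)+2\right) =4,\qquad q_1=\frac{2}{sp_1}=\frac{d(p_1-1)+2}{2p_1}, \end{aligned}$$

and the admissible ranges follow from

$$\begin{aligned} s<2\ \Longleftrightarrow \ d(p_1-1)>0,\qquad q_1>1\ \Longleftrightarrow \ (d-2)(p_1-1)>0, \end{aligned}$$

both of which hold since \(d\ge 3\) and \(p_1>1\). Estimate (6.12) is then the Young inequality with the conjugate exponents \(\frac{2}{s}\) and \(\frac{2}{2-s}\): writing \(A:=C(\Vert {\textbf{u}}_0\Vert _{L^2}+1+\Vert {\textbf{g}}\Vert _{L^2})^{sp_1}\) for the factor in (6.11),

$$\begin{aligned} t^{N-1}A\Vert {\varvec{\theta }}_n\Vert ^{2-s}_{H^1}=A\left( t^{\frac{N-1}{2-s}}\Vert {\varvec{\theta }}_n\Vert _{H^1}\right) ^{2-s}\le C_\varepsilon A^{\frac{2}{s}}+\varepsilon t^{2\frac{N-1}{2-s}}\Vert {\varvec{\theta }}_n\Vert ^2_{H^1}, \end{aligned}$$

where \(A^{2/s}\) is a constant multiple of \((\Vert {\textbf{u}}_0\Vert _{L^2}+1+\Vert {\textbf{g}}\Vert _{L^2})^{2p_1}\), which is why \(Q_\varepsilon \) can be taken polynomial.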

Combining this result with the \(L^2\) to \(H^1\) smoothing property (5.11), we get the following result.

Corollary 6.2

Under the assumptions of Theorem 6.1, the weak solution semigroup \({\widehat{S}}(t)\) possesses the following smoothing property:

$$\begin{aligned} \Vert {\widehat{S}}(t){\textbf{u}}_0\Vert _{{\mathbb {D}}}^2\le C t^{-M}\left( Q(\Vert {\textbf{u}}_0\Vert _{L^2})+ Q(\Vert {\textbf{g}}\Vert _{L^2})\right) , \ t\in (0,1], \end{aligned}$$
(6.15)

where the positive constants C and \(M=M(d,p)\) and monotone polynomial function Q are independent of t and \({\textbf{u}}_0\in L^2(\Omega )\).

Thus, under the assumption (6.1), any weak solution indeed becomes strong for \(t>0\) (the extra assumption \(d\ge 3\) is not essential since for \(d\le 2\) the equation is subcritical and the smoothing property is obvious). This, in particular, gives the following result on the regularity of the global attractor.

Corollary 6.3

Let the assumptions of Theorem 6.1 hold. Then the global attractor \({\mathcal {A}}\) of the solution semigroup \(\widehat{S}(t)\) constructed in Theorem 5.4 is a bounded subset of \({\mathbb {D}}\):

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{{\mathbb {D}}}\le Q(\Vert {\textbf{g}}\Vert _{L^2}) \end{aligned}$$
(6.16)

for some monotone increasing function Q.

Indeed, this assertion is an immediate corollary of (6.15) and the strict invariance of the global attractor.
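A minimal sketch of this argument reads as follows. We assume here (this is standard for attractors constructed from an absorbing ball, but it is not restated in this section) that \(\Vert {\mathcal {A}}\Vert _{L^2}\le Q_0(\Vert {\textbf{g}}\Vert _{L^2})\) for some monotone function \(Q_0\); the symbol \(Q_0\) is introduced only for this illustration. By strict invariance, \({\mathcal {A}}={\widehat{S}}(1){\mathcal {A}}\), so every \(a\in {\mathcal {A}}\) has the form \(a={\widehat{S}}(1)a_0\) with \(a_0\in {\mathcal {A}}\), and (6.15) with \(t=1\) gives

$$\begin{aligned} \Vert a\Vert ^2_{{\mathbb {D}}}\le C\left( Q(\Vert a_0\Vert _{L^2})+Q(\Vert {\textbf{g}}\Vert _{L^2})\right) \le C\left( Q(Q_0(\Vert {\textbf{g}}\Vert _{L^2}))+Q(\Vert {\textbf{g}}\Vert _{L^2})\right) . \end{aligned}$$

The right-hand side is a monotone function of \(\Vert {\textbf{g}}\Vert _{L^2}\) only, which is the form of the bound claimed in (6.16).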

Further regularity and strong attraction

In this section we study the question of whether or not the constructed strong solution \({\textbf{u}}\) is actually more regular (than \({\textbf{u}}\in C_{\text {w}}(0,T;{\mathbb {D}})\)). We start with a partial result on the regularity of \(\partial _t{\textbf{u}}\) which does not require any extra assumptions on \({\textbf{f}}\) and \({\textbf{g}}\).

Proposition 7.1

Let the assumptions of Theorem 4.3 hold. Then, there exists a positive number \(r>0\) depending only on the matrix \({\textbf{D}}\) such that, for any strong solution \({\textbf{u}}(t)\in {\mathbb {D}}\), the following estimate holds:

$$\begin{aligned} t\Vert \partial _t{\textbf{u}}(t)\Vert _{L^{r+2}}^{r+2}+\int _0^ts\Vert \nabla _x(|\partial _t{\textbf{u}}(s)|^{\frac{r+2}{2}})\Vert ^2_{L^2}\,ds\le C \Vert \partial _t{\textbf{u}}(0)\Vert _{L^2}^{r+2}, \end{aligned}$$
(7.1)

where \(0\le t\le 1\) and the constant C is independent of t and \({\textbf{u}}\).

Proof

Let \({\varvec{\theta }}:=\partial _t{\textbf{u}}\). Then this function satisfies equation (3.4). Let us multiply this equation by \({\varvec{\theta }}|{\varvec{\theta }}|^r\) and integrate in x. This gives

$$\begin{aligned} \frac{1}{r+2}\frac{d}{dt}\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}- ({\textbf{D}}\Delta _x{\varvec{\theta }},{\varvec{\theta }}|{\varvec{\theta }}|^r)+ (\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}){\varvec{\theta }},{\varvec{\theta }}|{\varvec{\theta }}|^r)=0. \end{aligned}$$

Integrating by parts in the second term, we get

$$\begin{aligned}&-({\textbf{D}}\Delta _x{\varvec{\theta }}, {\varvec{\theta }}|{\varvec{\theta }}|^r)\ge \\&\quad \ge ({\textbf{D}}\nabla _x{\varvec{\theta }},\nabla _x{\varvec{\theta }}|{\varvec{\theta }}|^r)- Cr(|\nabla _x{\varvec{\theta }}|^2,|{\varvec{\theta }}|^r)\ge (\alpha -Cr)(|\nabla _x{\varvec{\theta }}|^2,|{\varvec{\theta }}|^r) \end{aligned}$$

for some positive \(\alpha \). Fixing now \(r>0\) small enough and estimating the term containing \({\textbf{f}}\) using \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\), we arrive at

$$\begin{aligned} \frac{1}{r+2}\frac{d}{dt}\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}+ \frac{\alpha }{2}(|\nabla _x{\varvec{\theta }}|^2,|{\varvec{\theta }}(t)|^r)\le K\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}. \end{aligned}$$

Multiplying this estimate by t and integrating in time, we arrive at

$$\begin{aligned} t\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}+\int _0^ts(|\nabla _x{\varvec{\theta }}(s)|^2,|{\varvec{\theta }}(s)|^r)\,ds\le C\int _0^t\Vert {\varvec{\theta }}(s)\Vert ^{r+2}_{L^{r+2}}\,ds \end{aligned}$$
(7.2)

for \(0\le t\le 1\). To estimate the right-hand side of this inequality, we use estimate (3.5) and the Sobolev embedding theorem together with the Hölder inequality, which give that, for sufficiently small \(r=r(d)>0\),

$$\begin{aligned} \int _0^t \Vert {\varvec{\theta }}(s)\Vert ^{r+2}_{L^{r+2}}\,ds\le C\left( \Vert {\varvec{\theta }}\Vert _{L^\infty (0,t;L^2)}+\Vert {\varvec{\theta }}\Vert _{ L^2(0,t;H^1)}\right) ^{r+2}\le C\Vert {\varvec{\theta }}(0)\Vert _{L^{2}}^{r+2} \end{aligned}$$

and finishes the proof of the proposition. \(\square \)
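Three steps of the proof above are left implicit; the following sketch records them under the standing assumptions of the proposition (the interpolation parameter a is a notation used here only). First, the weighted gradient term in (7.2) controls the gradient term in (7.1): since \(|\nabla _x|{\varvec{\theta }}||\le |\nabla _x{\varvec{\theta }}|\) almost everywhere,

$$\begin{aligned} \left| \nabla _x\left( |{\varvec{\theta }}|^{\frac{r+2}{2}}\right) \right| ^2=\left( \frac{r+2}{2}\right) ^2|{\varvec{\theta }}|^r\,|\nabla _x|{\varvec{\theta }}||^2\le \left( \frac{r+2}{2}\right) ^2|{\varvec{\theta }}|^r\,|\nabla _x{\varvec{\theta }}|^2. \end{aligned}$$

Second, the multiplication by t relies on the identity

$$\begin{aligned} t\frac{d}{dt}\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}=\frac{d}{dt}\left( t\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}\right) -\Vert {\varvec{\theta }}(t)\Vert ^{r+2}_{L^{r+2}}, \end{aligned}$$

which, after integration over (0, t) with \(t\le 1\), gives (7.2) with a constant depending only on K, r and \(\alpha \). Third, one admissible range of r in the last estimate of the proof comes from the interpolation inequality

$$\begin{aligned} \Vert {\varvec{\theta }}\Vert _{L^{r+2}}\le C\Vert {\varvec{\theta }}\Vert ^{1-a}_{L^2}\Vert {\varvec{\theta }}\Vert ^{a}_{H^1},\qquad a=\frac{dr}{2(r+2)}, \end{aligned}$$

where the integrability in time of \(\Vert {\varvec{\theta }}\Vert ^{a(r+2)}_{H^1}\) requires \(a(r+2)=\frac{dr}{2}\le 2\), that is, \(r\le \frac{4}{d}\).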

The obtained extra regularity in time can be transformed to extra regularity in space assuming that the right-hand side \({\textbf{g}}\) is slightly more regular.

Corollary 7.2

Let the assumptions of Proposition 7.1 hold and let, in addition,

$$\begin{aligned} {\textbf{g}}\in L^{q}(\Omega ) \end{aligned}$$
(7.3)

for some \(q>2\). Then, there exists \(r=r({\textbf{D}},q)>0\) such that

$$\begin{aligned} t\Vert \nabla _x{\textbf{u}}(t)\Vert _{L^{\frac{d(r+2)}{d-2}}}^{r+2}\le C\left( \Vert {\textbf{u}}(0)\Vert _{{\mathbb {D}}}+\Vert {\textbf{g}}\Vert _{L^q}\right) ^{r+2} \end{aligned}$$
(7.4)

for \(0\le t\le 1\). In particular, if the attractor \({\mathcal {A}}\) is a bounded set in \({\mathbb {D}}\) then it is also bounded in \(W^{1,\frac{d(r+2)}{d-2}}(\Omega )\).

Indeed, due to Proposition 7.1, we control the \(L^{r+2}\)-norm of \(\partial _t{\textbf{u}}\). Rewriting problem (2.1) as an elliptic boundary value problem

$$\begin{aligned} {\textbf{D}}\Delta _x{\textbf{u}}(t)-{\textbf{f}}({\textbf{u}}(t))=\tilde{\textbf{g}}(t):=\partial _t{\textbf{u}}(t)-{\textbf{g}}, \end{aligned}$$

we get also the control for the \(L^{r+2}\)-norm of \(\tilde{\textbf{g}}(t)\) (pointwise in time). Applying the elliptic regularity result proved in the Appendix (see Theorem A.1) to this equation, we arrive at the desired estimate (7.4).

The obtained partial regularity results allow us to establish the crucial \(L^\infty \)-estimates for critical and slightly supercritical growth rate of the nonlinearity \({\textbf{f}}\). Namely, the following result holds.

Theorem 7.3

Let the assumptions of Proposition 7.1 hold and \(d\ge 4\). Then there exists a constant \(\varepsilon =\varepsilon ({\textbf{D}})>0\) such that, for every nonlinearity \({\textbf{f}}\) satisfying in addition (6.1) with the exponent p restricted by the assumption

$$\begin{aligned} p<p_{crit}+\varepsilon ,\ \ p_{crit}:=1+\frac{4}{d-4}, \end{aligned}$$
(7.5)

and the external forces \({\textbf{g}}\) satisfying (7.3) for some \(q>\frac{d}{2}\), any weak solution \({\textbf{u}}(t)\) of problem (2.1) possesses the following smoothing property:

$$\begin{aligned} \Vert {\textbf{u}}(t)\Vert _{L^\infty }\le Q_t(\Vert {\textbf{u}}(0)\Vert _{L^2}+\Vert {\textbf{g}}\Vert _{L^q}),\ \ t\in (0,1] \end{aligned}$$
(7.6)

for some monotone function \(Q_t\) which depends on t, but is independent of \({\textbf{u}}\) and \({\textbf{g}}\).

Proof

Note that, due to estimate (6.15), we may assume from the very beginning that \({\textbf{u}}_0\in {\mathbb {D}}\) and work with strong solutions only. The derivation of (7.6) can be done by standard bootstrapping arguments, iterating the classical interior regularity result for the linear parabolic equation

$$\begin{aligned} \partial _t{\textbf{u}}-{\textbf{D}}\Delta _x{\textbf{u}}=\textbf{h}(t) \end{aligned}$$
(7.7)

which we state in the following lemma.

Lemma 7.4

Let \({\textbf{u}}\) be a weak solution of (7.7). Then the following interior regularity holds:

$$\begin{aligned} \Vert {\textbf{u}}\Vert _{L^\infty (T,T+1;W^{2-\mu ,s})}\le C_{T,s,\mu }\left( \Vert {\textbf{u}}\Vert _{L^2(0,T+1;L^2)}+ \Vert \textbf{h}\Vert _{L^\infty (0,T+1;L^s)}\right) , \end{aligned}$$
(7.8)

where \(\mu >0\) is arbitrarily small, \(1<s<\infty \) and \(T>0\).

Sketch of the proof

This estimate, in turn, can be easily deduced from the fact that this linear equation generates an analytic semigroup in \(L^s(\Omega )\) (since the operator \(A=-{\textbf{D}}\Delta _x\) is sectorial in \(L^s(\Omega )\) with the domain \(W^{2,s}(\Omega )\cap W^{1,s}_0(\Omega )\), see e.g., [42]) or, alternatively, from the anisotropic maximal \(L^q(L^s)\)-regularity estimate for parabolic equations. Let us sketch the first approach. Using the variation of constants formula

$$\begin{aligned} {\textbf{u}}(t)=e^{-At}{\textbf{u}}_0+\int _0^te^{-A(t-\tau )}\textbf{h}(\tau )\,d\tau \end{aligned}$$

together with the estimate

$$\begin{aligned} \Vert e^{-A(t-\tau )}\Vert _{{\mathcal {L}}(L^s, W^{2-\mu , s})}=\Vert e^{-A(t-\tau )}\Vert _{{\mathcal {L}}(L^s, D(A^{\frac{2-\mu }{2}}))}\le \frac{C_s}{(t-\tau )^{\frac{2-\mu }{2}}}, \end{aligned}$$

we get that the solution of (7.7) with \({\textbf{u}}(0)=0\) satisfies

$$\begin{aligned} \Vert {\textbf{u}}(t)\Vert _{W^{2-\mu ,s}}\le C_{\mu ,s,q,T}\Vert \textbf{h}\Vert _{L^q(0,T;L^s)},\ \ t\in [0,T] \end{aligned}$$

for any \(1<s,q<\infty \) and \(0<\mu <2\) satisfying \(\frac{(2-\mu )q}{2(q-1)}<1\). The desired interior estimate is a standard corollary of this estimate. Indeed, applying the last estimate to the function \(t^N{\textbf{u}}(t)\) (where \(N=N(d,s,q)>1\) is big enough), we get

$$\begin{aligned} \Vert t^N{\textbf{u}}\Vert _{L^\infty (0,T+1;W^{2-\mu ,s})}\le C\Vert t^{N-1}{\textbf{u}}\Vert _{L^q(0,T+1;L^s)}+C\Vert \textbf{h}\Vert _{L^\infty (0,T+1;L^s)} \end{aligned}$$

and using the Sobolev embedding theorem and the Hölder inequality, we arrive at

$$\begin{aligned} \Vert t^{N-1}{\textbf{u}}\Vert _{L^q(0,T+1;L^s)}\le \frac{1}{2C}\Vert t^N{\textbf{u}}\Vert _{L^\infty (0,T+1;W^{2-\mu ,s})}+ C'\Vert {\textbf{u}}\Vert _{L^2(0,T+1;L^2)}. \end{aligned}$$

Finally, inserting the last estimate into the previous one, we have

$$\begin{aligned} \Vert t^N{\textbf{u}}\Vert _{L^\infty (0,T+1;W^{2-\mu ,s})}\le C\left( \Vert {\textbf{u}}\Vert _{L^2(0,T+1;L^2)}+\Vert \textbf{h}\Vert _{L^\infty (0,T+1;L^s)}\right) \end{aligned}$$

which gives (7.8) and finishes the proof of the lemma. \(\square \)
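The restriction relating \(\mu \) and q in the sketch above comes from the Hölder inequality in time. As a check (with \(q'=\frac{q}{q-1}\) denoting the conjugate exponent, a notation introduced here only for this computation), for zero initial data the variation of constants formula and the semigroup estimate give

$$\begin{aligned} \Vert {\textbf{u}}(t)\Vert _{W^{2-\mu ,s}}\le C_s\int _0^t\frac{\Vert \textbf{h}(\tau )\Vert _{L^s}}{(t-\tau )^{\frac{2-\mu }{2}}}\,d\tau \le C_s\left( \int _0^t(t-\tau )^{-\frac{(2-\mu )q'}{2}}\,d\tau \right) ^{\frac{1}{q'}}\Vert \textbf{h}\Vert _{L^q(0,t;L^s)}, \end{aligned}$$

and the time integral is finite if and only if \(\frac{(2-\mu )q'}{2}<1\), which is the same as \(q>\frac{2}{\mu }\). Similarly, the function \(\textbf{v}(t):=t^N{\textbf{u}}(t)\) has zero initial data and solves

$$\begin{aligned} \partial _t\textbf{v}-{\textbf{D}}\Delta _x\textbf{v}=t^N\textbf{h}(t)+Nt^{N-1}{\textbf{u}}(t), \end{aligned}$$

which explains the two terms on the right-hand side of the estimate for \(t^N{\textbf{u}}\).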

We are now ready to continue the proof of the theorem. From the interior estimate (7.8) and Sobolev embedding theorem, we derive the iterative estimate

$$\begin{aligned} \Vert {\textbf{u}}\Vert _{L^\infty (T_{k+1},1;L^{q_{k+1}})}\le C_k\left( \Vert {\textbf{u}}\Vert _{L^2(0,1;L^2)}+ \Vert \textbf{h}\Vert _{L^\infty (T_k,1;L^{s_k})}\right) , \end{aligned}$$
(7.9)

where \(T_{k+1}>T_k\) and

$$\begin{aligned} q_{k+1}:=\min \left\{ \infty , \frac{s_k d}{d-s_k(2-\mu )}\right\} . \end{aligned}$$

In our situation \(\textbf{h}(t)={\textbf{g}}-{\textbf{f}}({\textbf{u}}(t))\) and, due to our growth restriction (6.1), we have

$$\begin{aligned} \Vert \textbf{h}\Vert _{L^{s_k}}\le C(\Vert {\textbf{g}}\Vert _{L^q}+1+\Vert {\textbf{u}}\Vert ^p_{L^{q_k}}), \end{aligned}$$

where \(s_k:=\min \{q,q_kp^{-1}\}\). Thus, in order to prove the theorem, it is sufficient to verify that the sequence \(q_k\) defined via

$$\begin{aligned} q_0=\frac{d(r+2)}{d-r-4},\ \ q_{k+1}=\frac{q_k d}{pd-q_k(2-\mu )},\ \ \mu \ll 1 \end{aligned}$$

will become larger than \(q>\frac{d}{2}\) in finitely many steps (we have used here estimate (7.4) and the embedding \(W^{1,\frac{d(r+2)}{d-2}}\subset L^{q_0}\) to initialize the iterations and the embedding \(W^{2-\mu ,q}\subset L^\infty \) which holds for sufficiently small \(\mu \) due to the condition \(q>\frac{d}{2}\)).
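The recursion and its starting value can be checked directly. The following lines are a sketch of that bookkeeping, assuming \(s_k=q_kp^{-1}\) (in the case \(s_k=q\) the iteration already terminates) and \(r<d-4\); the letter m is used for this computation only. Substituting \(s_k=q_kp^{-1}\) into the formula for \(q_{k+1}\) gives

$$\begin{aligned} q_{k+1}=\frac{\frac{q_k}{p}\,d}{d-\frac{q_k}{p}(2-\mu )}=\frac{q_kd}{pd-q_k(2-\mu )}, \end{aligned}$$

and the Sobolev embedding \(W^{1,m}\subset L^{\frac{dm}{d-m}}\) with \(m=\frac{d(r+2)}{d-2}<d\) gives

$$\begin{aligned} d-m=\frac{d(d-r-4)}{d-2},\qquad \frac{dm}{d-m}=\frac{d(r+2)}{d-r-4}=q_0. \end{aligned}$$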

Obviously this sequence will be monotone increasing if (and only if)

$$\begin{aligned} p-\frac{q_0}{d}(2-\mu )<1. \end{aligned}$$

Then it must converge to \(+\infty \), so we only need to verify the last inequality. Using assumption (7.5) and the explicit formula for \(q_0\), we only need the inequality

$$\begin{aligned} \frac{4}{d-4}+\varepsilon -(2-\mu )\frac{r+2}{d-r-4}<0. \end{aligned}$$

It remains to note that the last inequality is satisfied if \(\mu \ll 1\) and \(\varepsilon <\varepsilon _0=\varepsilon _0(r)\) for some positive \(\varepsilon _0\) if \(r>0\). This finishes the proof of the theorem.
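The size of \(\varepsilon _0\) can be read off from an elementary identity; the computation below is a sketch valid for \(d>4\) and \(0<r<d-4\). Since

$$\begin{aligned} \frac{2(r+2)}{d-r-4}-\frac{4}{d-4}=\frac{2(r+2)(d-4)-4(d-r-4)}{(d-4)(d-r-4)}=\frac{2r(d-2)}{(d-4)(d-r-4)}, \end{aligned}$$

the last inequality is equivalent to

$$\begin{aligned} \varepsilon +\mu \,\frac{r+2}{d-r-4}<\frac{2r(d-2)}{(d-4)(d-r-4)}, \end{aligned}$$

whose right-hand side is strictly positive exactly when \(r>0\). Hence any \(\varepsilon \) below this quantity is admissible once \(\mu \) is taken small enough.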

Remark 7.5

The growth rate of the nonlinearity is no longer important once the \(L^\infty \)-estimate for the solutions is obtained, so further regularity can be obtained by bootstrapping exactly as in the subcritical case. Thus, under the growth restriction (7.5), the actual regularity of a solution is determined by the smoothness of \(\Omega \), \({\textbf{f}}\) and \({\textbf{g}}\) only (if all of them are \(C^\infty \)-smooth, the solutions will be also \(C^\infty \)-smooth). In other words, we may say that the critical growth exponent for \({\textbf{f}}\) in our problem (2.1) is slightly larger than \(p_{crit}=1+\frac{4}{d-4}\). We also note that the value \(\varepsilon =\varepsilon ({\textbf{D}})\) somehow measures how far the matrix \({\textbf{D}}\) is from the scalar matrix. It is easy to show that \(\varepsilon ({\textbf{D}})=\infty \) if \({\textbf{D}}\) is scalar.

We now turn to the question of whether or not the attraction to \({\mathcal {A}}\) holds in the space \({\mathbb {D}}\). Since in this case we at least need the dissipativity of our semigroup in \({\mathbb {D}}\), we assume that \({\textbf{f}}\) has a polynomial growth rate (i.e., that (6.1) is satisfied for some \(p\in {\mathbb {R}}_+\)). Of course, the most interesting case here is the supercritical one, when the assumption (7.5) is not satisfied. Unfortunately, we do not know the answer to this question in general and have to impose some extra restrictions which, however, look natural. Namely, we assume that the nonlinearity also satisfies

$$\begin{aligned} |\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})|\le C(1+|{\textbf{f}}({\textbf{u}})|+|{\textbf{u}}|),\ \ {\textbf{u}}\in {\mathbb {R}}^k. \end{aligned}$$
(7.10)

Then, the following result holds.

Theorem 7.6

Let the assumptions of Proposition 7.1 hold and let also assumptions (2.3) and (7.10) be satisfied. Then the image \({\widehat{S}}(1)B_R\) of any closed ball \(B_R\) of radius R in H is a compact set in \({\mathbb {D}}\). In particular the global attractor \({\mathcal {A}}\) is compact in \({\mathbb {D}}\) and attracts in the strong topology of \({\mathbb {D}}\) as well.

Proof

We only need to prove the compactness of \({{\hat{S}}}(1)B_R\), the rest is a corollary of the standard attractor’s existence theorem. The fact that this set is closed is also standard and we leave it to the reader. So, we will only check pre-compactness below.

The proof of this fact is a combination of parabolic regularity estimates which gives the pre-compactness of the set

$$\begin{aligned} {\mathcal {B}}_R:=\{\partial _t{\textbf{u}}(1),\, {\textbf{u}}(t)={\widehat{S}}(t){\textbf{u}}_0,\ {\textbf{u}}_0\in B_R\} \end{aligned}$$

in \(L^2(\Omega )\) and energy type estimates for the elliptic equation which then give the desired compactness in \(H^2(\Omega )\).

Step 1. \({\mathcal {B}}_R\) is compact in \(L^2(\Omega )\). We already know that \({\mathcal {B}}_R\) is a bounded set in \(L^{2+r}(\Omega )\), due to Proposition 7.1 and Corollary 6.2. In order to get the desired compactness we will use the standard interpolation embedding:

$$\begin{aligned} W^{1-\kappa ,1}(\Omega )\cap L^{r+2}(\Omega )\subset H^{(1-\kappa )\frac{r}{2(r+1)}}(\Omega ), \end{aligned}$$
(7.11)

see [42]. This embedding together with the compactness of the embedding \(H^{\varepsilon }\subset L^2\) will give the desired result if we prove boundedness of \({\mathcal {B}}_R\) in \(W^{1-\kappa ,1}\) for some \(0<\kappa <1\). To this end, we note that according to Corollary 6.2, \({\textbf{u}}(t)\in {\mathbb {D}}\) for \(t\in [1/2,1]\) and is uniformly bounded there if \({\textbf{u}}_0\in B_R\). Thus, according to assumption (7.10), \(\Vert \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t))\Vert _{L^2}\) is uniformly bounded. Since the \(L^{2+r}\)-norm of \(\partial _t{\textbf{u}}(t)\) is also bounded due to Proposition 7.1, we have the estimate

$$\begin{aligned} \Vert \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\partial _t{\textbf{u}}\Vert _{L^s}\le C_R,\ \ t\in [1/2,1],\ \ \frac{1}{s}=\frac{1}{2}+\frac{1}{r+2}. \end{aligned}$$

Thus, applying the maximal \(L^s\)-regularity estimate (with \(s>1\), see e.g., [29]) to equation

$$\begin{aligned} \partial _t{\varvec{\theta }}-{\textbf{D}}\Delta _x{\varvec{\theta }}=-\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t))\partial _t{\textbf{u}}(t),\ \ {\varvec{\theta }}:=\partial _t{\textbf{u}}, \end{aligned}$$

using the anisotropic Sobolev embedding

$$\begin{aligned} W^{1,s}(0,1;L^s)\cap L^s(0,1;W^{2,s})\subset C(0,1;W^{2(1-\frac{1}{s}),s}), \end{aligned}$$

and arguing as at the end of the proof of Lemma 7.4, we arrive at

$$\begin{aligned} \Vert {\varvec{\theta }}(1)\Vert _{W^{2(1-\frac{1}{s}),s}}\le & {} C(\Vert \partial _t{\varvec{\theta }}\Vert _{L^s(3/4,1;L^s)}+ \Vert \Delta _x{\varvec{\theta }}\Vert _{L^s(3/4,1;L^s)}) \le \nonumber \\\le & {} C(\Vert \nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\partial _t{\textbf{u}}\Vert _{L^s(1/2,1;L^s)}+\Vert \partial _t{\textbf{u}}\Vert _{L^2(1/2,1;L^2)})\le C_R'.\nonumber \\ \end{aligned}$$
(7.12)

This estimate gives the desired boundedness of \({\mathcal {B}}_R\) in \(W^{1-\kappa ,1}(\Omega )\) and completes the first step of the proof.
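The exponents used in Step 1 can be checked as follows; this is only a sketch of the bookkeeping, and the interpolation parameter \(\vartheta \) is a notation introduced here for illustration. From \(\frac{1}{s}=\frac{1}{2}+\frac{1}{r+2}\) we get

$$\begin{aligned} s=\frac{2(r+2)}{r+4}>1,\qquad 2\left( 1-\frac{1}{s}\right) =2\left( \frac{1}{2}-\frac{1}{r+2}\right) =\frac{r}{r+2}, \end{aligned}$$

so (7.12) bounds \({\varvec{\theta }}(1)\) in \(W^{\frac{r}{r+2},s}(\Omega )\), and on the bounded domain \(\Omega \) this gives boundedness in \(W^{1-\kappa ,1}(\Omega )\) for any \(\kappa \) with \(1-\kappa <\frac{r}{r+2}\). The smoothness index in (7.11) is consistent with interpolating between \(W^{1-\kappa ,1}\) and \(L^{r+2}\) with the parameter \(\vartheta \) fixed by the target integrability exponent 2:

$$\begin{aligned} \frac{1}{2}=\vartheta +\frac{1-\vartheta }{r+2}\ \Longleftrightarrow \ \vartheta =\frac{r}{2(r+1)},\qquad (1-\kappa )\vartheta +0\cdot (1-\vartheta )=(1-\kappa )\frac{r}{2(r+1)}. \end{aligned}$$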

Step 2. Compactness in \(H^2\). Let us consider a sequence of solutions \({\textbf{u}}_n(t)\), \({\textbf{u}}_n(0)\in B_R\) and find a subsequence which is convergent strongly in \(H^2\) to some solution \({\textbf{u}}(t)\). Due to the result of Step 1, we may assume without loss of generality that \({\textbf{u}}_n(1)\rightarrow {\textbf{u}}(1)\) weakly in \(H^2\) and \(\partial _t{\textbf{u}}_n(1)\rightarrow \partial _t{\textbf{u}}(1)\) strongly in \(L^2\). In other words, we need to pass to the limit \(n\rightarrow \infty \) in the semilinear elliptic equation

$$\begin{aligned} {\textbf{D}}\Delta _x{\textbf{u}}_n(1)-{\textbf{f}}({\textbf{u}}_n(1))=\textbf{h}_n:=\partial _t{\textbf{u}}_n(1)-{\textbf{g}}. \end{aligned}$$
(7.13)

Without loss of generality we may assume also that \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge 0\). We will utilize the so-called energy method. Assume at this moment that we are able to integrate by parts and get

$$\begin{aligned} ({\textbf{f}}({\textbf{u}}),\Delta _x{\textbf{u}})=-(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}),\ \ {\textbf{u}}\in \mathbb D, \end{aligned}$$
(7.14)

which will be verified at the end of the proof. Then, multiplying (7.13) by \(\Delta _x{\textbf{u}}_n\) and integrating over x, we get the energy identity

$$\begin{aligned} ({\textbf{D}}\Delta _x{\textbf{u}}_n,\Delta _x{\textbf{u}}_n)+ (\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}_n)\nabla _x{\textbf{u}}_n,\nabla _x{\textbf{u}}_n)=(\textbf{h}_n,\Delta _x{\textbf{u}}_n). \end{aligned}$$
(7.15)

Our aim here is to pass to the limit \(n\rightarrow \infty \) in this equality and compare it with the energy equality for the limit solution. Indeed, using the convexity arguments (similarly to (4.11)), we get

$$\begin{aligned}&({\textbf{D}}\Delta _x{\textbf{u}}(1),\Delta _x{\textbf{u}}(1))\le \liminf _{n\rightarrow \infty }({\textbf{D}}\Delta _x{\textbf{u}}_n(1),\Delta _x{\textbf{u}}_n(1)),\nonumber \\&(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}})\le \liminf _{n\rightarrow \infty }(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}_n)\nabla _x{\textbf{u}}_n,\nabla _x{\textbf{u}}_n) \end{aligned}$$
(7.16)

and due to the strong convergence \(\textbf{h}_n\rightarrow \textbf{h}\), we have

$$\begin{aligned} (\textbf{h},\Delta _x{\textbf{u}}(1))=\lim _{n\rightarrow \infty }(\textbf{h}_n,\Delta _x{\textbf{u}}_n(1)). \end{aligned}$$

Then, the comparison with the limit energy identity

$$\begin{aligned} ({\textbf{D}}\Delta _x{\textbf{u}},\Delta _x{\textbf{u}})+(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}})=(\textbf{h},\Delta _x{\textbf{u}}(1)) \end{aligned}$$

shows that we must have

$$\begin{aligned} \lim _{n\rightarrow \infty }({\textbf{D}}\Delta _x{\textbf{u}}_n(1),\Delta _x{\textbf{u}}_n(1))=({\textbf{D}}\Delta _x{\textbf{u}}(1),\Delta _x{\textbf{u}}(1)). \end{aligned}$$

Together with the weak convergence \(\Delta _x{\textbf{u}}_n(1)\rightarrow \Delta _x{\textbf{u}}(1)\) this gives the strong convergence \(\Delta _x{\textbf{u}}_n(1)\rightarrow \Delta _x{\textbf{u}}(1)\) in \(L^2\) and, therefore, the strong convergence \({\textbf{u}}_n(1)\rightarrow {\textbf{u}}(1)\) in \(H^2\). From the equation (7.13) we finally establish that \({\textbf{f}}({\textbf{u}}_n(1))\rightarrow {\textbf{f}}({\textbf{u}}(1))\) also strongly. Thus, the compactness of \({{\hat{S}}}(1)B_R\) in \({\mathbb {D}}\) is proved. Hence, the theorem is proved modulo the integration by parts formula (7.14), which we prove in the following lemma. \(\square \)
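The comparison argument in Step 2 can be spelled out as follows. This is a sketch in which the abbreviations \(a_n\), \(b_n\), a, b are introduced only for this purpose, and the coercivity \(({\textbf{D}}\textbf{v},\textbf{v})\ge \alpha \Vert \textbf{v}\Vert ^2_{L^2}\) with \(\alpha >0\) is assumed in the same form as it is used in the earlier energy estimates. Let \(a_n:=({\textbf{D}}\Delta _x{\textbf{u}}_n(1),\Delta _x{\textbf{u}}_n(1))\), \(b_n:=(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}_n)\nabla _x{\textbf{u}}_n,\nabla _x{\textbf{u}}_n)\), and let a, b be the same expressions for the limit function. Then \(a_n+b_n\rightarrow a+b\) by the energy identities, while (7.16) gives \(a\le \liminf a_n\) and \(b\le \liminf b_n\), so

$$\begin{aligned} \limsup _{n\rightarrow \infty }a_n=\lim _{n\rightarrow \infty }(a_n+b_n)-\liminf _{n\rightarrow \infty }b_n\le (a+b)-b=a\le \liminf _{n\rightarrow \infty }a_n. \end{aligned}$$

Consequently \(a_n\rightarrow a\), and using the weak convergence in the cross terms,

$$\begin{aligned} \alpha \Vert \Delta _x{\textbf{u}}_n(1)-\Delta _x{\textbf{u}}(1)\Vert ^2_{L^2}\le a_n-({\textbf{D}}\Delta _x{\textbf{u}}_n(1),\Delta _x{\textbf{u}}(1))-({\textbf{D}}\Delta _x{\textbf{u}}(1),\Delta _x{\textbf{u}}_n(1))+a\rightarrow 0. \end{aligned}$$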

Lemma 7.7

Let the nonlinearity \({\textbf{f}}\) satisfy the assumptions of Theorem 7.6. Then integration by parts (7.14) is valid for every \({\textbf{u}}\in {\mathbb {D}}\).

Proof of the lemma

We first establish the identity

$$\begin{aligned} ({\textbf{f}}({\textbf{u}}),{\text {div}}W)=-(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}}, W) \end{aligned}$$
(7.17)

for all vector fields \(W\in C^\infty ({{\bar{\Omega }}})\). Due to our assumption (7.10) both parts of this equality make sense. The identity may be proved by approximating the function \({\textbf{f}}\) by "good" functions \({\textbf{f}}_n\) as in Lemma 4.1. Since \({\textbf{f}}\) has a polynomial growth, we may take, say,

$$\begin{aligned} \Psi (z)=e^{\sqrt{z+1}} \end{aligned}$$

and this allows us to keep also assumption (7.10) uniformly in n. Let \({\textbf{u}}_n\) be the corresponding approximating functions for \({\textbf{u}}\) constructed as in (4.5). Then, we first verify the integration by parts for \({\textbf{f}}_n\) and \({\textbf{u}}_n\) (which is trivial since everything is smooth) and after that pass to the limit \(n\rightarrow \infty \) (which is also straightforward since, as in Lemma 4.2, we have weak convergence \({\textbf{f}}_n({\textbf{u}}_n)\rightarrow {\textbf{f}}({\textbf{u}})\) in \(L^2\) and, due to our assumption (7.10), we also have weak convergence \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}_n)\nabla _x{\textbf{u}}_n\) to \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}}\) in \(L^{1+\varepsilon }\) for small positive \(\varepsilon \)). Thus, the integration by parts (7.17) is verified for smooth vector fields W.

Note that the \(C^\infty \) smoothness assumption on the vector field W can be relaxed to

$$\begin{aligned} W\in H^1(\Omega )\cap L^q(\Omega ) \end{aligned}$$

(where q is large enough that \(\frac{1}{q}+\frac{1}{1+\varepsilon }\le 1\)) by density arguments.

We now construct a sequence of Lipschitz continuous cut-off functions

$$\begin{aligned} \varphi _n(z)={\left\{ \begin{array}{ll} 1,\ \ z\le n,\\ 1-\ln \frac{z}{n},\ \ z\in [n,en],\\ 0,\ \ z>ne. \end{array}\right. } \end{aligned}$$
(7.18)

Then, the sequence \(\varphi _n(z)\) is monotone increasing in n and is convergent point-wise to one. Moreover, the following estimate holds:

$$\begin{aligned} |\varphi _n'(z)z|\le 1,\ \ z\in {\mathbb {R}}\end{aligned}$$
(7.19)

(there is no problem in constructing a similar smooth sequence, but we prefer to give a relatively simple explicit expression). Then, we define a vector field \(W=W_n\) as follows:

$$\begin{aligned} W_n(x):=\varphi _n(|\nabla _x{\textbf{u}}|^2)\nabla _x{\textbf{u}}. \end{aligned}$$

Then, as a simple calculation shows, \(W_n\in L^\infty (\Omega )\) and, due to condition (7.19),

$$\begin{aligned} \Vert \nabla _xW_n\Vert _{L^2}\le C\Vert \nabla ^2_x{\textbf{u}}\Vert _{L^2}, \end{aligned}$$

where the constant C is independent of n. Thus, we may conclude that \({\text {div}}W_n\rightarrow \Delta _x{\textbf{u}}\) weakly in \(L^2(\Omega )\). Moreover, we may insert \(W_n\) into the integration by parts formula (7.17) and get

$$\begin{aligned} ({\textbf{f}}({\textbf{u}}),{\text {div}}W_n)=-(\varphi _n(|\nabla _x{\textbf{u}}|^2)\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}). \end{aligned}$$

It only remains to pass to the limit \(n\rightarrow \infty \) here. Passing to the limit in the left-hand side is immediate and to pass to the limit \(n\rightarrow \infty \) in the right-hand side, it is enough to note that \((\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})+K)\nabla _x{\textbf{u}}.\nabla _x{\textbf{u}}\) is non-negative and belongs to \(L^1(\Omega )\). The monotonicity of \(\varphi _n\) in n and its point-wise convergence to one allow us to apply the Beppo Levi monotone convergence theorem and get the desired result. Thus, the lemma is proved and the theorem is also proved. \(\square \)
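The properties of the cut-off functions (7.18) used in the proof of the lemma can be verified by a direct computation; the following is a sketch, with \(z=|\nabla _x{\textbf{u}}|^2\) and \(\partial _j\) denoting the derivative in \(x_j\). For almost every z,

$$\begin{aligned} \varphi _n'(z)={\left\{ \begin{array}{ll} -\frac{1}{z},\ \ z\in (n,en),\\ 0,\ \ \text {otherwise}, \end{array}\right. }\qquad \text {so that}\ \ |\varphi _n'(z)z|\le 1, \end{aligned}$$

and, for fixed \(z\in [n,en]\), the value \(1-\ln \frac{z}{n}=1-\ln z+\ln n\) is increasing in n. Moreover, \(|W_n|\le \sqrt{en}\) since \(\varphi _n(z)=0\) for \(z>en\), and by the chain rule and the Cauchy-Schwarz inequality,

$$\begin{aligned} \partial _jW_n=\varphi _n(z)\,\partial _j\nabla _x{\textbf{u}}+2\varphi _n'(z)\left( \nabla _x{\textbf{u}}.\partial _j\nabla _x{\textbf{u}}\right) \nabla _x{\textbf{u}},\qquad |\partial _jW_n|\le \left( 1+2|\varphi _n'(z)z|\right) |\partial _j\nabla _x{\textbf{u}}|\le 3|\partial _j\nabla _x{\textbf{u}}|, \end{aligned}$$

so one may take \(C=3\) in the bound for \(\Vert \nabla _xW_n\Vert _{L^2}\).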

Remark 7.8

We expect that the integration by parts formula (7.14) holds without the extra assumption (7.10), however, it is not clear how to verify it. The key difficulty here is that \({\mathbb {D}}\) is a nonlinear set and it is not easy to construct good smooth approximations for functions \({\textbf{u}}\in {\mathbb {D}}\).

The first step in the proof of Theorem 7.6 can be also done using the energy type arguments. To this end one just needs to verify the energy identity

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert {\varvec{\theta }}(t)\Vert ^2_{L^2}+({\textbf{D}}\nabla _x{\varvec{\theta }}(t),\nabla _x{\varvec{\theta }}(t)) +(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}}(t)){\varvec{\theta }}(t),{\varvec{\theta }}(t))=0 \end{aligned}$$
(7.20)

which can be verified similarly to the proof of Lemma 7.7. In the case of reaction diffusion system (2.1) this is not necessary since Proposition 7.1 gives a simpler way to verify the compactness. However, it may be useful in the case of higher order equations where the technique of Proposition 7.1 may not work.

Finite dimensionality and exponential attractors

In this section we discuss the finite-dimensionality of the global attractor for problem (2.1) and the existence of the so-called exponential attractor. We recall that a set \(\mathcal M\subset H\) is called an exponential attractor of the semigroup \({\widehat{S}}(t):H\rightarrow H\) if the following conditions are satisfied:

  1. \({\mathcal {M}}\) is compact in H;

  2. \({\mathcal {M}}\) is semi-invariant: \({\widehat{S}}(t){\mathcal {M}}\subset {\mathcal {M}}\);

  3. \({\mathcal {M}}\) has a finite fractal dimension in H: \(\dim _F({\mathcal {M}},H)<\infty \);

  4. \({\mathcal {M}}\) attracts the images of bounded subsets of H exponentially as time tends to infinity, i.e., for every bounded set B,

    $$\begin{aligned} \textrm{dist}({{\hat{S}}}(t)B,{\mathcal {M}})\le Q(\Vert B\Vert _H)e^{-\alpha t} \end{aligned}$$

    for some positive \(\alpha \) and monotone function Q which are independent of B.

It is well known that an exponential attractor, if it exists, always contains the global attractor, so the existence of \({\mathcal {M}}\) automatically implies the finite-dimensionality of the global attractor. In contrast to global attractors, exponential attractors are usually more robust with respect to perturbations and allow us to control the rate of attraction in terms of physical parameters of the system considered. The price to pay is that an exponential attractor is not unique, see [16, 18, 35] for more details.
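The inclusion of the global attractor in \({\mathcal {M}}\) follows in one line from the definitions. A brief sketch, assuming only that the global attractor \({\mathcal {A}}\) is bounded and strictly invariant, \({\widehat{S}}(t){\mathcal {A}}={\mathcal {A}}\), and that \(\textrm{dist}\) stands for the non-symmetric Hausdorff semi-distance in H:

$$\begin{aligned} \textrm{dist}({\mathcal {A}},{\mathcal {M}})=\textrm{dist}({\widehat{S}}(t){\mathcal {A}},{\mathcal {M}})\le Q(\Vert {\mathcal {A}}\Vert _H)e^{-\alpha t}\rightarrow 0\ \ \text {as}\ \ t\rightarrow \infty . \end{aligned}$$

Since \({\mathcal {M}}\) is closed, this gives \({\mathcal {A}}\subset {\mathcal {M}}\) and, by the monotonicity of the fractal dimension, \(\dim _F({\mathcal {A}},H)\le \dim _F({\mathcal {M}},H)<\infty \).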

The existence of an exponential attractor is usually verified using the following abstract result for discrete semigroups \({{\hat{S}}}(n):={{\hat{S}}}^n:H\rightarrow H\) generated by the map \({{\hat{S}}}:H\rightarrow H\).

Proposition 8.1

Let H and V be two Banach spaces such that V is compactly embedded in H. Assume that there exists a bounded closed set \(B\subset H\) and a map \({{\hat{S}}}:B\rightarrow B\) such that

$$\begin{aligned} \Vert {{\hat{S}}} (\xi _1)-{{\hat{S}}}(\xi _2)\Vert _V\le K\Vert \xi _1-\xi _2\Vert _H,\ \ \xi _1,\xi _2\in B. \end{aligned}$$
(8.1)

Then the corresponding discrete semigroup \({{\hat{S}}}(n):B\rightarrow B\) possesses an exponential attractor \({\mathcal {M}}\subset B\).

For the proof of this proposition, see [17, 18].

In applications, B is usually an absorbing ball of the considered continuous semigroup \({{\hat{S}}}(t):H\rightarrow H\), \({\widehat{S}}:={\widehat{S}}(T)\) for some properly chosen T, and (8.1) is verified using the parabolic smoothing property for the equation for the difference of two solutions. If the existence of a discrete exponential attractor \({\mathcal {M}}_d\) is established, the exponential attractor for the continuous semigroup can be constructed by the standard formula:

$$\begin{aligned} {\mathcal {M}}:=\cup _{t\in [T,2T]}{{\hat{S}}}(t){\mathcal {M}}_d \end{aligned}$$

and in order to get its finite-dimensionality, we need to assume in addition that the semigroup is also Hölder continuous in time:

$$\begin{aligned} \Vert {{\hat{S}}}(t_1)\xi _1-{{\hat{S}}}(t_2)\xi _2\Vert _{H}\le L\left( \Vert \xi _1-\xi _2\Vert _H+|t_1-t_2|^\alpha \right) ,\ \ \end{aligned}$$
(8.2)

for some \(\alpha \in (0,1]\) and all \(t_i\in [T,2T]\) and \(\xi _i\in B\), see [18] for the details.
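A sketch of why (8.2) suffices for the finite-dimensionality, assuming that the fractal dimension is understood as the upper box-counting dimension: by (8.2), the map \((t,\xi )\mapsto {{\hat{S}}}(t)\xi \) is Lipschitz continuous from \([T,2T]\times {\mathcal {M}}_d\), endowed with the metric \(|t_1-t_2|^\alpha +\Vert \xi _1-\xi _2\Vert _H\), onto \({\mathcal {M}}\). The interval \([T,2T]\) with the metric \(|t_1-t_2|^\alpha \) has fractal dimension \(\frac{1}{\alpha }\), so

$$\begin{aligned} \dim _F({\mathcal {M}},H)\le \dim _F([T,2T]\times {\mathcal {M}}_d)\le \frac{1}{\alpha }+\dim _F({\mathcal {M}}_d,H)<\infty . \end{aligned}$$

In particular, for a semigroup which is Lipschitz continuous in time (\(\alpha =1\)), passing from discrete to continuous time increases the dimension bound by at most one.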

The main result of this section is the following theorem.

Theorem 8.2

Let the nonlinearity \({\textbf{f}}\) satisfy assumptions (2.3), (6.1) for some \(p\in {\mathbb {R}}_+\), (7.10) and the following convexity property: there exists a convex function \(\Psi :{\mathbb {R}}^k\rightarrow {\mathbb {R}}_+\) such that

$$\begin{aligned} C_2(\Psi ({\textbf{u}})-1-|{\textbf{u}}|^2)\le |{\textbf{f}}({\textbf{u}})|^2\le C_1(\Psi ({\textbf{u}})+|{\textbf{u}}|^2+1),\ \ {\textbf{u}}\in {\mathbb {R}}^k, \end{aligned}$$
(8.3)

for some positive constants \(C_1\) and \(C_2\). Let also \({\textbf{g}}\in L^2(\Omega )\) and \({\textbf{D}}\) satisfy (2.2). Then problem (2.1) possesses an exponential attractor \({\mathcal {M}}\) in the space \(H:=L^2(\Omega )\) which is a compact set in \({\mathbb {D}}\).

Proof

According to Corollary 6.2 a ball \(B=B_R\) in \({\mathbb {D}}\) of a sufficiently large radius R is an absorbing set for the solution semigroup \({{\hat{S}}}(t):H\rightarrow H\) associated with equation (2.1). Let us fix \(T>0\) big enough that \({{\hat{S}}}(T)B\subset B\) and set \({{\hat{S}}}:={{\hat{S}}}(T)\). Then, according to estimate (3.8), the semigroup \({{\hat{S}}}(t)\) is Lipschitz continuous with respect to the initial data for every fixed t. Moreover, since \(\partial _t{\textbf{u}}(t)\) is bounded in the \(L^2\)-norm if \({\textbf{u}}(0)\in {\mathbb {D}}\), this semigroup is also Lipschitz continuous in time, so condition (8.2) is satisfied with \(\alpha =1\). Therefore, in order to verify the existence of an exponential attractor, it is enough to check the smoothing property (8.1) for the properly chosen space V. To this end, we need to establish a number of smoothing estimates for the difference of solutions of equation (2.1).

Let \({\textbf{u}}_1(t)\) and \({\textbf{u}}_2(t)\) be two solutions of (2.1) starting from the absorbing ball B. Then their difference \({\varvec{\theta }}(t):={\textbf{u}}_1(t)-{\textbf{u}}_2(t)\) solves the equation

$$\begin{aligned} \partial _t{\varvec{\theta }}={\textbf{D}}\Delta _x{\varvec{\theta }}-\textbf{L}(t){\varvec{\theta }},\ \ \textbf{L}(t):=\int _0^1\nabla _{\textbf{u}}{\textbf{f}}(s{\textbf{u}}_1(t)+(1-s){\textbf{u}}_2(t))\,ds. \end{aligned}$$
(8.4)

We recall that, due to the assumption \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\), multiplication of this equation by \({\varvec{\theta }}\) gives the basic Lipschitz continuity estimate:

$$\begin{aligned} \Vert {\varvec{\theta }}(T)\Vert ^2_{L^2}+\int _0^T\Vert \nabla _x{\varvec{\theta }}(t)\Vert ^2_{L^2}\,dt\le Ce^{KT}\Vert {\varvec{\theta }}(0)\Vert _{L^2}^2, \end{aligned}$$
(8.5)

see Lemma 3.6. Moreover, multiplying (8.4) by \({\varvec{\theta }}|{\varvec{\theta }}|^r\) and arguing exactly as in the proof of Proposition 7.1, we get the estimate

$$\begin{aligned} \Vert {\varvec{\theta }}(T)\Vert _{L^{r+2}}\le Ce^{KT}T^{-1}\Vert {\varvec{\theta }}(0)\Vert _{L^2} \end{aligned}$$
(8.6)

for some sufficiently small positive r depending only on the matrix \({\textbf{D}}\).
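For orientation, the basic estimate (8.5) can be sketched formally as follows. We assume here that condition (2.2) yields the coercivity \(({\textbf{D}}\nabla _x{\varvec{\theta }},\nabla _x{\varvec{\theta }})\ge \kappa \Vert \nabla _x{\varvec{\theta }}\Vert ^2_{L^2}\) for some \(\kappa >0\) (the symbol \(\kappa \) is introduced for this illustration only) and that \(K\ge 0\). Multiplying (8.4) by \({\varvec{\theta }}\), integrating by parts and using \(\textbf{L}(t)\ge -K\), we get

$$\begin{aligned} \frac{d}{dt}\Vert {\varvec{\theta }}(t)\Vert ^2_{L^2}+2\kappa \Vert \nabla _x{\varvec{\theta }}(t)\Vert ^2_{L^2}\le -2(\textbf{L}(t){\varvec{\theta }}(t),{\varvec{\theta }}(t))\le 2K\Vert {\varvec{\theta }}(t)\Vert ^2_{L^2}, \end{aligned}$$

and the Gronwall inequality gives

$$\begin{aligned} \Vert {\varvec{\theta }}(T)\Vert ^2_{L^2}+2\kappa \int _0^T\Vert \nabla _x{\varvec{\theta }}(t)\Vert ^2_{L^2}\,dt\le e^{2KT}\Vert {\varvec{\theta }}(0)\Vert ^2_{L^2}, \end{aligned}$$

which is (8.5) up to the values of the constants.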

In order to get a smoothing estimate for \({\varvec{\theta }}\), we argue as in Step 1 of the proof of Theorem 7.6. Namely, from (7.10) and (8.3), we conclude that

$$\begin{aligned}&|\nabla _{\textbf{u}}{\textbf{f}}(s{\textbf{u}}_1+(1-s){\textbf{u}}_2)|^2\le \nonumber \\&\quad \le C(|{\textbf{f}}(s {\textbf{u}}_1+(1-s){\textbf{u}}_2)|^2+ 1+|{\textbf{u}}_1|^2+|{\textbf{u}}_2|^2)\le \nonumber \\&\quad \le C'(\Psi (s {\textbf{u}}_1+(1-s){\textbf{u}}_2)+ 1+|{\textbf{u}}_1|^2+|{\textbf{u}}_2|^2)\le \nonumber \\&\quad \le C'(\Psi ({\textbf{u}}_1)+\Psi ({\textbf{u}}_2)+ 1+|{\textbf{u}}_1|^2+|{\textbf{u}}_2|^2)\le \nonumber \\&\quad \le C''(|{\textbf{f}}({\textbf{u}}_1)|^2+|{\textbf{f}}({\textbf{u}}_2)|^2+|{\textbf{u}}_1|^2+|{\textbf{u}}_2|^2+1),\ s\in (0,1) \end{aligned}$$
(8.7)

and, therefore, since \({\textbf{f}}({\textbf{u}}(t))\) is uniformly bounded in \(L^2\)-norm for our solutions \({\textbf{u}}_1\) and \({\textbf{u}}_2\), we have

$$\begin{aligned} \Vert \textbf{L}(t)\Vert _{L^2}\le C,\ \ t\in [0,T]. \end{aligned}$$
(8.8)

This estimate, in turn, implies (together with (8.5), Sobolev embedding theorem and Hölder inequality) that

$$\begin{aligned} \Vert \textbf{L}(t){\varvec{\theta }}\Vert _{L^s(0,T;L^s)}\le C_T\Vert {\varvec{\theta }}(0)\Vert _{L^2} \end{aligned}$$

for some \(1<s<2\). Applying now the \(L^s\) interior regularity estimate to equation (8.4) and arguing as in the proof of Theorem 7.6, we get

$$\begin{aligned} \Vert {\varvec{\theta }}(T)\Vert _{W^{2(1-\frac{1}{s}),s}}\le C_T\Vert {\varvec{\theta }}(0)\Vert _{L^2} \end{aligned}$$

which together with the embedding (7.11) gives

$$\begin{aligned} \Vert {\varvec{\theta }}(T)\Vert _{H^\varepsilon }\le C_T\Vert {\varvec{\theta }}(0)\Vert _{L^2} \end{aligned}$$

for some positive exponent \(\varepsilon \). Setting finally \(V=H^\varepsilon (\Omega )\) we get the desired smoothing property (8.1) and finish the proof of the theorem. \(\square \)
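The Hölder step used in the proof above can be made explicit for one admissible exponent. The following is only a sketch, assuming \(d\ge 3\) and using nothing but (8.5) and (8.8); the exponent actually needed in the proof may be different, e.g., if (8.6) is also taken into account. Take \(s=\frac{d}{d-1}\in (1,2)\), so that \(\frac{1}{s}=\frac{1}{2}+\frac{d-2}{2d}\). Then the Hölder inequality in space and the Sobolev embedding \(H^1_0\subset L^{\frac{2d}{d-2}}\) give

$$\begin{aligned} \Vert \textbf{L}(t){\varvec{\theta }}(t)\Vert _{L^s}\le \Vert \textbf{L}(t)\Vert _{L^2}\Vert {\varvec{\theta }}(t)\Vert _{L^{\frac{2d}{d-2}}}\le C\Vert \nabla _x{\varvec{\theta }}(t)\Vert _{L^2}, \end{aligned}$$

and the Hölder inequality in time together with (8.5) gives

$$\begin{aligned} \int _0^T\Vert \textbf{L}(t){\varvec{\theta }}(t)\Vert ^s_{L^s}\,dt\le C^sT^{1-\frac{s}{2}}\left( \int _0^T\Vert \nabla _x{\varvec{\theta }}(t)\Vert ^2_{L^2}\,dt\right) ^{\frac{s}{2}}\le C_T\Vert {\varvec{\theta }}(0)\Vert ^s_{L^2}. \end{aligned}$$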

Remark 8.3

The finite-dimensionality of the global attractor \({\mathcal {A}}\) has been established under similar assumptions on \({\textbf{f}}\) in [46] using the so-called method of l-trajectories, see also [32, 33]. In the present work we suggest a simplified version of the proof which is based on multiplication of equation (8.4) by quantities like \({\varvec{\theta }}|{\varvec{\theta }}|^r\). Although the proof becomes more transparent, it is slightly less general than the one suggested in [46] since this multiplication is suitable for reaction-diffusion systems and may not work for more general ones (e.g., higher order equations). In such cases one should return to the method of l-trajectories.

Generalizations and concluding remarks

In this concluding section we briefly consider other types of equations for which the technique developed above works (with some minor changes which we will discuss) and state some interesting open problems. We start with the case of fractional Laplacians and the corresponding reaction-diffusion equations which are becoming more and more popular nowadays, see [1, 2, 23, 31] and references therein for more details.

Fractional reaction-diffusion systems

Let us define \(A:=(-\Delta _x)^{\alpha }\), \(0<\alpha <1\), as the fractional power of the Laplacian \(-\Delta _x\) in the domain \(\Omega \) endowed with Dirichlet boundary conditions:

$$\begin{aligned} Au:=\frac{1}{\Gamma (-\alpha )}\int _0^\infty (e^{t\Delta _x}u-u)\frac{dt}{t^{1+\alpha }}, \end{aligned}$$

where \(\Gamma (z)\) is the standard Euler \(\Gamma \)-function, see [7, 42] for more details, although we believe that similar results can be obtained for other types of fractional Laplacians.
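The formula for A can be checked on eigenfunctions. A short sketch, assuming that \(-\Delta _xe=\lambda e\) with \(\lambda >0\) and Dirichlet boundary conditions, so that \(e^{t\Delta _x}e=e^{-\lambda t}e\): integrating by parts and using the identity \(\Gamma (1-\alpha )=-\alpha \Gamma (-\alpha )\), we get

$$\begin{aligned} \int _0^\infty (e^{-\lambda t}-1)\frac{dt}{t^{1+\alpha }}=-\frac{\lambda }{\alpha }\int _0^\infty t^{-\alpha }e^{-\lambda t}\,dt=-\frac{\Gamma (1-\alpha )}{\alpha }\lambda ^\alpha =\Gamma (-\alpha )\lambda ^\alpha . \end{aligned}$$

Thus, \(Ae=\lambda ^\alpha e\), in agreement with the spectral definition of the fractional power. The boundary terms in the integration by parts vanish exactly because \(0<\alpha <1\).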

Let us consider the following fractional reaction-diffusion system:

$$\begin{aligned} \partial _t{\textbf{u}}+{\textbf{D}}(-\Delta _x)^\alpha {\textbf{u}} +{\textbf{f}}({\textbf{u}})={\textbf{g}},\ \ {\textbf{u}}\big |_{\partial \Omega }=0, \end{aligned}$$
(9.1)

where the function \({\textbf{f}}\) and the matrix \({\textbf{D}}\) satisfy assumptions (2.3) and (2.2) respectively. In this case, the definition of the phase space \({\mathbb {D}}\) should be transformed as follows:

$$\begin{aligned} {\mathbb {D}}_\alpha :=\{{\textbf{u}}\in H^{2\alpha }_{\Delta _x},\ \ {\textbf{f}}({\textbf{u}})\in L^2(\Omega )\}, \end{aligned}$$
(9.2)

where \(H^{2\alpha }_{\Delta _x}:=D((-\Delta _x)^\alpha )\).

All of the estimates and results stated above for the case \(\alpha =1\) can be extended in a straightforward way to the general case \(0<\alpha <1\). The only non-trivial point is the estimation of terms like \(((-\Delta _x)^\alpha {\textbf{u}}, {\textbf{f}}({\textbf{u}}))\) or \(((-\Delta _x)^\alpha {\textbf{u}},{\textbf{u}}|{\textbf{u}}|^r)\). In the case when \(\Omega ={\mathbb {R}}^d\) (or in the case of periodic BC), we have a nice explicit formula for such inner products which trivializes the required estimates (see e.g., [42]), namely,

$$\begin{aligned} (A{\textbf{u}},\textbf{v})=C_\alpha \int _{{\mathbb {R}}^d}\int _{{\mathbb {R}}^d}\frac{({\textbf{u}}(x)-{\textbf{u}}(y)).(\textbf{v}(x)-\textbf{v}(y))}{|x-y|^{d+2\alpha }}\,dx\,dy. \end{aligned}$$
(9.3)

In particular, it gives the positivity of \((A{\textbf{u}},{\textbf{f}}({\textbf{u}}))\) if \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge 0\). Fortunately, there is an extension of this formula to the case of bounded domains (see [7]), namely,

$$\begin{aligned} (A{\textbf{u}},\textbf{v})= & {} \int _\Omega \int _\Omega ({\textbf{u}}(x)-{\textbf{u}}(y)).(\textbf{v}(x)-\textbf{v}(y)) K_{\Omega ,\alpha }(x,y)\,dx\,dy+\nonumber \\&+ \int _\Omega {\textbf{u}}(x).\textbf{v}(x)B_{\Omega ,\alpha }(x)\,dx \end{aligned}$$
(9.4)

for some non-negative functions \(K_{\Omega ,\alpha }\) and \(B_{\Omega ,\alpha }\). This formula allows us to get the same type of estimates as for the local case \(\alpha =1\).
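To illustrate how (9.4) is used, here is a sketch of the key lower bound, assuming \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\) and, for simplicity, \({\textbf{f}}(0)=0\) (this normalization is ours). By the integral mean value formula,

$$\begin{aligned} ({\textbf{u}}(x)-{\textbf{u}}(y)).({\textbf{f}}({\textbf{u}}(x))-{\textbf{f}}({\textbf{u}}(y)))&=\int _0^1\nabla _{\textbf{u}}{\textbf{f}}(s{\textbf{u}}(x)+(1-s){\textbf{u}}(y))\,ds\,({\textbf{u}}(x)-{\textbf{u}}(y)).({\textbf{u}}(x)-{\textbf{u}}(y))\\&\ge -K|{\textbf{u}}(x)-{\textbf{u}}(y)|^2 \end{aligned}$$

and, analogously, \({\textbf{u}}(x).{\textbf{f}}({\textbf{u}}(x))\ge -K|{\textbf{u}}(x)|^2\). Since the functions \(K_{\Omega ,\alpha }\) and \(B_{\Omega ,\alpha }\) are non-negative, inserting \(\textbf{v}={\textbf{f}}({\textbf{u}})\) into (9.4) gives

$$\begin{aligned} (A{\textbf{u}},{\textbf{f}}({\textbf{u}}))\ge -K(A{\textbf{u}},{\textbf{u}}). \end{aligned}$$

In particular, \((A{\textbf{u}},{\textbf{f}}({\textbf{u}}))\ge 0\) if \(K=0\), exactly as in the case of the whole space.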

Namely, analogously to Theorem 4.3, we have the existence and uniqueness of strong solutions \({\textbf{u}}\in C_w(0,T;\mathbb D_\alpha )\) for any \(\textbf{u}_\textbf{0}\in {\mathbb {D}}_\alpha \) and \({\textbf{g}}\in L^2(\Omega )\) (we always assume in this section that assumptions (2.2) and (2.3) are satisfied), so the solution semigroup

$$\begin{aligned} S(t):{\mathbb {D}}_\alpha \rightarrow {\mathbb {D}}_\alpha \end{aligned}$$

is well-defined and globally Lipschitz continuous in \(L^2(\Omega )\). Moreover, exactly as in §5, this semigroup can be extended by continuity to the semigroup \({\widehat{S}}(t)\) acting on the whole phase space \(L^2(\Omega )\). The corresponding trajectory \({\textbf{u}}(t):={\widehat{S}}(t)\textbf{u}_\textbf{0}\) belongs to \(C(0,T; L^2(\Omega ))\) and can be interpreted as a unique weak solution of problem (9.1) in the sense of Definition 5.1 where \({\mathbb {D}}\) is replaced by \({\mathbb {D}}_\alpha \) and \(\Delta _x\) by \(-(-\Delta _x)^\alpha \) respectively. Furthermore, exactly as in §5, this semigroup possesses a global attractor \({\mathcal {A}}\) in \(L^2(\Omega )\).

For the convenience of the reader, we state below the analogues of two key results for the fractional case which can be obtained analogously to the case \(\alpha =1\). A comprehensive study of this case is out of the scope of this paper, so we leave the details to the reader. We will return to this problem elsewhere.

Theorem 9.1

Let the matrix \({\textbf{D}}\) and the nonlinearity \({\textbf{f}}\) satisfy (2.2) and (2.3) respectively and let, in addition, the nonlinearity \({\textbf{f}}\) satisfy (6.1) with the exponent p restricted by the assumption

$$\begin{aligned} p<p_{crit}(\alpha ),\ \ p_{crit}(\alpha ):=1+\frac{4\alpha }{d-4\alpha } \end{aligned}$$
(9.5)

if \(d\ge 4\alpha \), and let the external forces satisfy \({\textbf{g}}\in L^q(\Omega )\) for some \(q>\frac{d}{2\alpha }\). Then any weak solution \({\textbf{u}}(t)\) of problem (9.1) starting from \({\textbf{u}}(0)\in H\) possesses the following smoothing property:

$$\begin{aligned} \Vert {\textbf{u}}(t)\Vert _{L^\infty }\le Q_t(\Vert {\textbf{u}}(0)\Vert _{L^2}+\Vert {\textbf{g}}\Vert _{L^q}),\ \ t\in (0,1] \end{aligned}$$
(9.6)

for some monotone function \(Q_t\) depending on t, but independent of \({\textbf{u}}\) and \({\textbf{g}}\).

Remark 9.2

Note that this result is not very helpful if \(0<\alpha <\frac{1}{2}\) since the direct \(H^1\)-estimate, which is obtained by multiplication of the equation by \(-\Delta _x{\textbf{u}}\), gives the control of the \(H^1\)-norm (of course, assuming in addition that \({\textbf{g}}\in H^{1-\alpha }\)), which is better than the \(H^{2\alpha }\)-control finally obtained from \({\textbf{u}}\in {\mathbb {D}}_\alpha \). However, it is useful for \(\alpha \ge \frac{1}{2}\). In particular, in the case \(0< \alpha <\frac{3}{4}\), we may have the supercritical growth rate in the physical dimension \(d=3\), so the main results become applicable for \(d=3\) as well. Note also that many of the results of our paper may be extended to the case \(\alpha >1\) (e.g., to the Swift-Hohenberg type equations where \(\alpha =2\)), but in this case we will not be able to multiply the equation by \(A{\textbf{u}}\) since the term \((A{\textbf{u}},{\textbf{f}}({\textbf{u}}))\) will be out of control, so we may multiply it only by \(\Delta _x{\textbf{u}}\) and this gives the control of the \(H^{\frac{1+\alpha }{2}}\)-norm of \({\textbf{u}}(t)\) (not \(H^{2\alpha }\) as before).
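A few sample values of the critical exponent (9.5) may help to read the remark above. They are plain arithmetic from the formula, which can be rewritten as \(p_{crit}(\alpha )=\frac{d}{d-4\alpha }\) for \(d>4\alpha \):

$$\begin{aligned} p_{crit}(1)\big |_{d=5}=5,\ \ \ p_{crit}\left( \tfrac{1}{2}\right) \big |_{d=3}=1+\frac{2}{3-2}=3,\ \ \ p_{crit}(\alpha )\big |_{d=3}=\frac{3}{3-4\alpha }\rightarrow \infty \ \ \text {as}\ \ \alpha \rightarrow \tfrac{3}{4}-0. \end{aligned}$$

Thus, for \(d=3\) a finite critical exponent, and hence a non-empty supercritical range, exists exactly when \(\alpha <\frac{3}{4}\).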

We now state the result about exponential attractors for the supercritical case.

Theorem 9.3

Let the nonlinearity \({\textbf{f}}\) satisfy assumptions (2.3), (6.1) for some \(p\in {\mathbb {R}}_+\), (7.10) and the following convexity property: there exists a convex function \(\Psi :{\mathbb {R}}^k\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} C_2(\Psi ({\textbf{u}})-1-|{\textbf{u}}|^2)\le |{\textbf{f}}({\textbf{u}})|^2\le C_1(\Psi ({\textbf{u}})+|{\textbf{u}}|^2+1),\ \ {\textbf{u}}\in {\mathbb {R}}^k, \end{aligned}$$
(9.7)

for some positive constants \(C_1\) and \(C_2\). Let also \(0<\alpha <1\), \({\textbf{g}}\in L^2(\Omega )\) and \({\textbf{D}}\) satisfy (2.2). Then problem (9.1) possesses an exponential attractor \(\mathcal M\) in the space \(H:=L^2(\Omega )\) which is a bounded set in \(\mathbb D_\alpha \).

Remark 9.4

We expect that we may add a sufficiently small \(\varepsilon >0\) to the critical exponent \(p_{crit}\) in (9.5) (exactly as in the case of classical diffusion, see (7.5)) and also prove that the exponential attractor \({\mathcal {M}}\) is not only bounded, but also compact in \({\mathbb {D}}_\alpha \). However, to get these results, we need to verify the analogue of Theorem A.1 for the fractional Laplacian and this requires extra effort in comparison with the local case \(\alpha =1\), so we prefer not to state these results here.

Cahn-Hilliard type systems

Let us consider the following fractional Cahn-Hilliard system in \(\Omega \subset {\mathbb {R}}^d\):

$$\begin{aligned} \partial _t{\textbf{u}}+(-\Delta _x)^\beta ({\textbf{D}}(-\Delta _x)^\alpha {\textbf{u}}+{\textbf{f}}({\textbf{u}})-{\textbf{g}})=0 \end{aligned}$$
(9.8)

endowed with Dirichlet boundary conditions. We assume here that \(0<\beta \le 1\), \(0<\alpha \le 1\). Note that \(\alpha =\beta =1\) corresponds to the classical Cahn-Hilliard system and \(\beta =0\), \(\alpha =1\) to the reaction-diffusion system considered above. See [1, 35, 41] and references therein for more details concerning classical and fractional CH-equations. It is natural to take \({\mathbb {D}}_\alpha \) as the phase space for this problem and rewrite it in the following form:

$$\begin{aligned} \partial _t(-\Delta _x)^{-\beta } {\textbf{u}}+{\textbf{D}}(-\Delta _x)^\alpha {\textbf{u}}+{\textbf{f}}({\textbf{u}})={\textbf{g}}. \end{aligned}$$
(9.9)

Then we may utilize the monotonicity of the function \({\textbf{f}}\) and apply the theory developed above to this equation (see also [35] for the case \(\alpha =\beta =1\)). In this case, weak solutions are naturally defined in the space \(H:=H^{-\beta }(\Omega )\) and strong solutions live in \({\mathbb {D}}_\alpha \).
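A formal sketch of why \(H^{-\beta }(\Omega )\) is the natural space for weak solutions. We assume, as in the reaction-diffusion case, that (2.2) yields \(({\textbf{D}}\xi ,\xi )\ge \kappa |\xi |^2\) for some \(\kappa >0\) (the symbol \(\kappa \) is ours), that \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\), and we use the norms \(\Vert {\varvec{\theta }}\Vert _{H^{-\beta }}:=\Vert (-\Delta _x)^{-\beta /2}{\varvec{\theta }}\Vert _{L^2}\), \(\Vert {\varvec{\theta }}\Vert _{H^{\alpha }}:=\Vert (-\Delta _x)^{\alpha /2}{\varvec{\theta }}\Vert _{L^2}\). The difference \({\varvec{\theta }}\) of two solutions of (9.9) solves \(\partial _t(-\Delta _x)^{-\beta }{\varvec{\theta }}+{\textbf{D}}(-\Delta _x)^\alpha {\varvec{\theta }}+\textbf{L}(t){\varvec{\theta }}=0\) with \(\textbf{L}(t)\) defined as in (8.4), and multiplication by \({\varvec{\theta }}\) gives

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert {\varvec{\theta }}\Vert ^2_{H^{-\beta }}+\kappa \Vert {\varvec{\theta }}\Vert ^2_{H^\alpha }\le K\Vert {\varvec{\theta }}\Vert ^2_{L^2}\le K\Vert {\varvec{\theta }}\Vert _{H^{-\beta }}^{\frac{2\alpha }{\alpha +\beta }}\Vert {\varvec{\theta }}\Vert _{H^{\alpha }}^{\frac{2\beta }{\alpha +\beta }}\le \frac{\kappa }{2}\Vert {\varvec{\theta }}\Vert ^2_{H^\alpha }+C\Vert {\varvec{\theta }}\Vert ^2_{H^{-\beta }}, \end{aligned}$$

where the second step is the interpolation inequality between \(H^{-\beta }\) and \(H^\alpha \) and the third one is the Young inequality. The Gronwall inequality then gives the Lipschitz continuity of the solution semigroup in \(H^{-\beta }(\Omega )\), which is the analogue of (8.5).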

The key result on the existence of exponential attractors now reads.

Theorem 9.5

Let the nonlinearity \({\textbf{f}}\) satisfy assumptions (2.3), (6.1) for some \(p\in {\mathbb {R}}_+\), (7.10) and the following convexity property: there exists a convex function \(\Psi :{\mathbb {R}}^k\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} C_2(\Psi ({\textbf{u}})-1-|{\textbf{u}}|^2)\le |{\textbf{f}}({\textbf{u}})|^2\le C_1(\Psi ({\textbf{u}})+|{\textbf{u}}|^2+1),\ \ {\textbf{u}}\in {\mathbb {R}}^k, \end{aligned}$$
(9.10)

for some positive constants \(C_1\) and \(C_2\). Let also \({\textbf{g}}\in L^2(\Omega )\) and \({\textbf{D}}\) satisfy (2.2). Then problem (9.8) possesses an exponential attractor \({\mathcal {M}}\) in the space \(H:=H^{-\beta }(\Omega )\) which is a bounded set in \(\mathbb D_\alpha \).

We leave the rigorous proof of this theorem to the reader.

Open problems

We conclude this section with a discussion of some open questions and possible further improvements of the theory developed above.

Problem 1. We start with the already posed question about the validity of the integration by parts formula

$$\begin{aligned} ({\textbf{f}}({\textbf{u}}),\Delta _x{\textbf{u}})=-(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}},\nabla _x{\textbf{u}}) \end{aligned}$$
(9.11)

for every \({\textbf{u}}\in {\mathbb {D}}\). We know that both sides of this equality are well-defined for any \({\textbf{u}}\in {\mathbb {D}}\). However, since we do not know whether smooth functions are dense in \({\mathbb {D}}\), we cannot verify the identity in a standard way, so we need to use something else. We have proved this identity under the extra assumption (7.10) which allows us to control the Lebesgue norm of \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\nabla _x{\textbf{u}}\) and simplifies the situation. Clarifying the situation with this integration by parts in general would be very useful for establishing energy equalities for many other equations containing monotone nonlinearities which, in turn, may give compactness of the corresponding global attractors. We believe that (7.10) is a technical assumption, but, surprisingly, we have been unable to remove it (or to find a proper reference).

Problem 2. The next problem is related to the smoothness of weak/strong solutions of problem (2.1). We have established that, under the assumption (6.1) that \({\textbf{f}}\) has a polynomial growth rate, the problem possesses an H to \({\mathbb {D}}\) smoothing property. It would be interesting to understand whether or not this polynomial growth restriction is really necessary for the smoothing (ideally, to construct a non-smoothing weak solution for problem (2.1)), say, with exponential or stronger nonlinearities. A natural idea here is to extend the proof of Theorem 6.1 to the case where \(\partial _t{\textbf{u}}\) belongs to some weaker spaces than \(L^p(\Omega )\) with \(p\ll 1\) using the technique of Orlicz spaces. But more detailed analysis shows that this does not work already when \(\ln (1+|{\textbf{f}}({\textbf{u}})|)\in L^1\), so we may expect the existence of such exotic non-smoothing weak solutions for fast growing nonlinearities.

The phenomenon of delayed regularization is well-known in the class of nonlinear diffusion problems, see [44] and references therein. For example, the equation

$$\begin{aligned} \partial _tu|\partial _tu|^p=\Delta _xu, \ u\big |_{\partial \Omega }=0, \ \ p\ge 0 \end{aligned}$$

is well-posed in a natural energy phase space \(\Phi =W^{1,2}_0(\Omega )\). However, the solutions of this equation do not possess the standard parabolic smoothing property if, say, \(p>4\) and \(d=3\). Indeed, the energy identity for this equation reads

$$\begin{aligned} \frac{1}{2}\Vert \nabla _xu(T)\Vert _{L^2}^2+\int _0^T\Vert \partial _tu(t)\Vert ^{p+2}_{L^{p+2}}\,dt=\frac{1}{2}\Vert \nabla _xu(0)\Vert ^2_{L^2}, \end{aligned}$$

so if \(u(0)\notin L^{p+2}(\Omega )\), we have \(u(T)\notin L^{p+2}(\Omega )\) for any finite \(T>0\). However, if we start from a more regular phase space \(\Psi :=W^{1,2}_0(\Omega )\cap L^\infty (\Omega )\), we will have instantaneous further regularization, see [20]. The open question is whether or not something similar happens in the case of system (1.1) of reaction-diffusion equations with fast growing nonlinearity \({\textbf{f}}\) satisfying (2.3).
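The absence of smoothing in this example can be traced as follows. This is a formal sketch, assuming enough regularity to multiply the equation by \(\partial _tu\) and integrate by parts:

$$\begin{aligned} \Vert \partial _tu(t)\Vert ^{p+2}_{L^{p+2}}=(\Delta _xu(t),\partial _tu(t))=-\frac{1}{2}\frac{d}{dt}\Vert \nabla _xu(t)\Vert ^2_{L^2}, \end{aligned}$$

so \(\partial _tu\in L^{p+2}(0,T;L^{p+2}(\Omega ))\) with the norm controlled by \(\Vert \nabla _xu(0)\Vert _{L^2}\) (the constant factor in front of the gradient terms plays no role in what follows). Then, by the Hölder inequality in time,

$$\begin{aligned} \Vert u(T)-u(0)\Vert _{L^{p+2}}\le \int _0^T\Vert \partial _tu(t)\Vert _{L^{p+2}}\,dt\le T^{\frac{p+1}{p+2}}\left( \int _0^T\Vert \partial _tu(t)\Vert ^{p+2}_{L^{p+2}}\,dt\right) ^{\frac{1}{p+2}}<\infty . \end{aligned}$$

Hence \(u(T)\in L^{p+2}(\Omega )\) if and only if \(u(0)\in L^{p+2}(\Omega )\). For \(d=3\) the Sobolev embedding \(W^{1,2}_0(\Omega )\subset L^6(\Omega )\) is sharp, so for \(p>4\) (i.e., \(p+2>6\)) the energy space contains initial data which do not belong to \(L^{p+2}(\Omega )\), and the corresponding solutions never enter \(L^{p+2}(\Omega )\) in finite time.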

Another related question is about generating singularities in finite time in equations like (2.1). It is known that general reaction-diffusion systems may generate singularities in higher norms even if the natural energy norm remains finite and dissipative, see e.g., [36] for RDS satisfying a balance law (=mass conservation law), [25] for the case of reaction-diffusion with chemotaxis or [6] for Ginzburg-Landau equations in \({\mathbb {R}}^3\) (see also references therein). However, to the best of our knowledge, there are no such examples in the class of equations (2.1) with nonlinearities satisfying \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\). As we know, in this case the \(H^2\)-norm cannot blow up, so this is the question of possible blow up of higher norms in high space dimension \(d>4\).

Problem 3. Finally, we turn to the finite-dimensionality of global attractors. The most popular scheme for proving this result is based on the volume contraction technique, see [3, 41] and references therein. Using this technique, we need to estimate the l-dimensional traces \({\text {Tr}}_l{\mathcal {L}}_{\textbf{u}}\), where

$$\begin{aligned} {\mathcal {L}}_{\textbf{u}}{} \textbf{v}={\textbf{D}}\Delta _x\textbf{v}-\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\textbf{v} \end{aligned}$$

is the linearized operator on the trajectory \({\textbf{u}}(t)\) of the equation (2.1) lying on the attractor. Formal estimates of this quantity depend only on K (if the assumption \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\ge -K\) is posed) and are independent of the norm of \({\textbf{u}}(t)\) and any norms of \({\textbf{f}}({\textbf{u}})\).
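The formal trace estimate mentioned above can be sketched as follows. We assume that (2.2) yields \(({\textbf{D}}\xi ,\xi )\ge \kappa |\xi |^2\) for some \(\kappa >0\) and that the eigenvalues of the Dirichlet Laplacian (counted with multiplicity in the vector case) satisfy the Weyl type bound \(\lambda _j\ge cj^{2/d}\); the symbols \(\kappa \) and c are introduced for this illustration only. For any family \(\textbf{v}_1,\cdots ,\textbf{v}_l\) which is orthonormal in \(L^2(\Omega )\), we have

$$\begin{aligned} \sum _{i=1}^l({\mathcal {L}}_{\textbf{u}}\textbf{v}_i,\textbf{v}_i)\le -\kappa \sum _{i=1}^l\Vert \nabla _x\textbf{v}_i\Vert ^2_{L^2}+Kl\le -\kappa \sum _{j=1}^l\lambda _j+Kl\le -\frac{\kappa cd}{d+2}\,l^{1+\frac{2}{d}}+Kl, \end{aligned}$$

where the second step is the min-max principle. The right-hand side is negative if \(l>\left( \frac{(d+2)K}{d\kappa c}\right) ^{d/2}\), which formally bounds the dimension of the attractor through K, \(\kappa \), d and \(\Omega \) only.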

However, to justify this method we need to verify the differentiability of the semigroup \({{\hat{S}}}(T)\) with respect to the initial data (at least the so-called uniform quasi-differentiability on the attractor, see [41]) and such a differentiability usually does not hold in supercritical cases.

This was the main reason to use the alternative scheme based on Proposition 8.1 for verifying the finite-dimensionality. In this scheme the differentiability is not required, but, as the price to pay, we get essentially worse estimates than expected since now the norm of \(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})|\) enters all dimension estimates.

It would be interesting to overcome this drawback and remove the dependence on \(|\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})|\) from these estimates, e.g., by finding a "clever" choice of spaces H and V in Proposition 8.1. So far we know how to do this in the scalar case only, due to the possibility to multiply (8.4) by \({\text {sgn}}\,\theta \) and use the Kato inequality. This in turn gives the estimate of the \(L^1\)-norm of \(L(t)\theta \) through quantities depending only on K. To the best of our knowledge, nothing similar is known for the vector case.
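The scalar \(L^1\)-estimate mentioned above can be sketched as follows. This is a formal computation, assuming the scalar version \(\partial _t\theta =d_0\Delta _x\theta -L(t)\theta \) of (8.4) with a diffusion coefficient \(d_0>0\) (our notation), \(L(t)\ge -K\), \(K\ge 0\), and the Kato inequality in the form \((\Delta _x\theta ,{\text {sgn}}\,\theta )\le 0\) for functions vanishing on the boundary. Multiplication by \({\text {sgn}}\,\theta \) gives

$$\begin{aligned} \frac{d}{dt}\Vert \theta (t)\Vert _{L^1}+\Vert (L(t)+K)\theta (t)\Vert _{L^1}\le K\Vert \theta (t)\Vert _{L^1}, \end{aligned}$$

so \(\Vert \theta (t)\Vert _{L^1}\le e^{Kt}\Vert \theta (0)\Vert _{L^1}\) and, integrating in time,

$$\begin{aligned} \int _0^T\Vert (L(t)+K)\theta (t)\Vert _{L^1}\,dt\le e^{KT}\Vert \theta (0)\Vert _{L^1},\ \ \ \Vert L\theta \Vert _{L^1(0,T;L^1)}\le (2e^{KT}-1)\Vert \theta (0)\Vert _{L^1}. \end{aligned}$$

The constants here depend on K and T only and not on any norm of \(\nabla _{\textbf{u}}{\textbf{f}}({\textbf{u}})\).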