1 Introduction

We consider a class of functionals with the so-called (p, q)-growth. The prominent example we have in mind is

$$\begin{aligned} {\mathcal {G}}(u) := \int _{\Omega } |\nabla u(x)|^p \mathrm {d}x + \int _{\Omega } a(x) \, |\nabla u(x)|^q \mathrm {d}x. \end{aligned}$$
(1.1)

Here, \(\Omega \subset {\mathbb {R}}^d\) is a bounded Lipschitz domain, \(u: \Omega \rightarrow {\mathbb {R}}\) is the argument of the functional \({\mathcal {G}}\), \(a: \Omega \rightarrow [0,\infty )\) is a given nonnegative function and \(1 \leqq p< q <\infty \) are given numbers. The functional \({\mathcal {G}}\) is an interesting toy model for studying the minimisation of functionals with so-called non-standard growth. Indeed, depending on whether \(a = 0\) or \(a > 0\), \({\mathcal {G}}\) exhibits either p-growth or q-growth.

A well-known feature of the functional \({\mathcal {G}}\) is the so-called Lavrentiev phenomenon. For instance, there exist a function \(a \in C^{\alpha }({\overline{\Omega }})\) with \(\alpha \in (0,1)\), exponents p, q fulfilling \(p<d<d+{\alpha }<q\) and boundary data \(u_0 \in W^{1,q}(\Omega )\) such that

$$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p}(\Omega )} {\mathcal {G}}(u) < \inf _{u \in u_0 + W_0^{1,q}(\Omega )} {\mathcal {G}}(u). \end{aligned}$$
(1.2)

On the other hand, it is known that if \(q \leqq p + \alpha \,\frac{p}{d}\), the Lavrentiev phenomenon does not occur for the toy model (1.1); see [21]. Under the additional assumption \(u_0 \in L^{\infty }(\Omega )\), the range of exponents has been improved to \(q \leqq p + \alpha \) [16, Proposition 3.6, Remark 5]. The latter work heavily depends on the properties of minimizers, and the \(L^{\infty }\) bound for the minimizer of the functional (1.1) forms a nontrivial part of the result in [16].

In this paper we prove that neither the assumption \(u_0 \in L^{\infty }(\Omega )\) nor any additional property of the minimizer (higher integrability, continuity) is needed for the absence of the Lavrentiev phenomenon. More precisely, we prove that one does not observe the Lavrentiev phenomenon if

$$\begin{aligned} q \leqq p + \alpha \, \max \left( 1, \frac{p}{d} \right) \end{aligned}$$
(1.3)

and boundary data \(u_0 \in W^{1,q}(\Omega )\). In this case, we have

$$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p}(\Omega )} {\mathcal {G}}(u) = \inf _{u \in u_0 + W_0^{1,q}(\Omega )} {\mathcal {G}}(u) = \inf _{u \in u_0 + C_c^{\infty }(\Omega )} {\mathcal {G}}(u). \end{aligned}$$
(1.4)

This significantly improves the available results for the case \(p < d\). Moreover, our proof is elementary as it is based on a simple regularisation argument together with Young’s convolution inequality. In particular, we do not use estimates on minimizers of the functional (1.1). Consequently, our method easily extends to vector-valued maps and covers variable-exponent functionals as well (see Section 3.2).

The question of whether (1.2) or (1.4) holds true is related to the density of \(C_c^{\infty }(\Omega )\) in the Musielak–Orlicz–Sobolev space \(W^{1,\psi }_0(\Omega )\) corresponding to the functional (1.1), see (4.1)–(4.3) for definitions. In this context, we prove that the density result holds true for p, q satisfying (1.3), which is again better than the so-far known regime of exponents announced in [1].

Let us discuss the result of the paper within the context of previous works related to this topic. The first studies concerning functionals changing their ellipticity rate at each point have been carried out by Zhikov [39,40,41,42]. In particular, in [41] he observed that it may happen that (1.4) does not hold, extending thus similar observations made by Lavrentiev [24] and Mania [26]. Another related direction of research is the regularity of minimizers. Although the fundamental results for minimizers were obtained by Marcellini [27,28,29,30] more than 20 years ago, it is in fact still an active topic of research; see for instance [2, 4,5,6,7, 10, 12, 16,17,18,19, 31,32,33, 35, 38].

Going back to the functional (1.1), the available results for boundary data \(u_0 \in W^{1,q}(\Omega )\) provide both positive and negative answers to the question whether (1.4) holds true. On the one hand, if \(q \leqq p + \frac{p\,\alpha }{d}\) then (1.4) is indeed valid [20, 21]. On the other hand, if \(q > p + \alpha \, \max \left( 1, \frac{p-1}{d-1}\right) \) then the counterexample in [3, Theorem 34] shows that (1.4) is violated (see also [21, Lemma 7] for a weaker result concerning the case \(p< d< d +{\alpha } < q\) obtained with more elementary methods). In this paper we establish (1.4) for \(q \leqq p + \alpha \,\max {\left( 1, \frac{p}{d}\right) }\), which partially fills the gap between the currently known positive and negative results concerning the Lavrentiev phenomenon. Moreover, in view of [3, Theorem 34], our result is sharp for \(p \leqq d\).
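To make the comparison of the three ranges concrete, the following minimal sketch evaluates the thresholds quoted above for given p, d and \(\alpha \). The function name `thresholds` and the sample values are illustrative assumptions and do not come from the cited works.

```python
def thresholds(p, d, alpha):
    """Return the three bounds on q discussed in the text (requires d >= 2).

    classical: (1.4) holds for q <= p + alpha * p / d            [20, 21]
    improved:  (1.4) holds for q <= p + alpha * max(1, p / d)    (range (1.3))
    counter:   (1.4) may fail for q > p + alpha * max(1, (p-1)/(d-1))  [3]
    """
    classical = p + alpha * p / d
    improved = p + alpha * max(1.0, p / d)
    counter = p + alpha * max(1.0, (p - 1.0) / (d - 1.0))
    return classical, improved, counter


# Illustrative values only: d = 3, alpha = 0.5 and three choices of p.
for p in (1.5, 3.0, 4.5):
    classical, improved, counter = thresholds(p, d=3, alpha=0.5)
    print(f"p={p}: classical {classical:.3f}, improved {improved:.3f}, "
          f"counterexample above {counter:.3f}")
```

For \(p \leqq d\) the improved bound and the counterexample bound coincide, which is the sharpness statement above; for \(p > d\) a gap between them remains.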

Next, we wish to address two issues that appeared in previous papers on this topic. First, in [17, Lemma 4.1] there is the following claim: for every \(\varepsilon > 0\) and ball \(B_r(x) \subset \Omega \), there exist \(p_{\varepsilon } < q_{\varepsilon }\) satisfying

$$\begin{aligned} \varepsilon \, p_{\varepsilon }> q_{\varepsilon } - p_{\varepsilon } - \alpha _{\varepsilon } \, \frac{p_{\varepsilon }}{d} > 0, \end{aligned}$$
(1.5)

a coefficient \(a_{\varepsilon } \in C^{\alpha }({\overline{\Omega }})\) and boundary data \(u_0 \in W^{1,q}(B_r(x))\cap L^{\infty }(B_r(x))\) such that

$$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p_{\varepsilon }}(B_r(x))} {\mathcal {G}}(u) < \inf _{u \in u_0 + W_0^{1,p_{\varepsilon }}(B_r(x)) \cap W^{1,q_{\varepsilon }}_{\text {loc}}(B_r(x))} {\mathcal {G}}(u). \end{aligned}$$

Although this is a very nice result, it does not prove that the range of exponents \(q \leqq p + \alpha \,\frac{p}{d}\) is optimal for the absence of the Lavrentiev phenomenon, and it does not contradict our result about the range stated in (1.3). In fact, the authors refer to the counterexample from [21] constructed for exponents satisfying \(p< d< d+{\alpha } < q\), i.e. exponents that do not meet our range because the distance between p and q is greater than \(\alpha \). More precisely, it is shown that there exist \(p_{\varepsilon }\) and \(q_{\varepsilon }\), and it follows from the proof that they are constructed in the following way: for \(\delta > 0\) to be specified later, one defines \(p_{\varepsilon }:=d-\delta \), \(q_{\varepsilon }:= d+{\alpha } + \delta \) and uses the corresponding counterexample constructed in [21]. Then, when \(p_{\varepsilon } \geqq 1\), we have

$$\begin{aligned} \varepsilon \, p_{\varepsilon } \geqq \varepsilon , \qquad q_{\varepsilon } - p_{\varepsilon } - \alpha _{\varepsilon } \, \frac{p_{\varepsilon }}{d} = 2\,\delta + \alpha \, \frac{\delta }{d} = \delta \, \left( 2 + \frac{\alpha }{d} \right) , \end{aligned}$$

so that (1.5) is satisfied if we let \(\delta := \frac{\varepsilon }{2\,\left( 2 + \alpha /d\right) }\). Consequently, \(p_{\varepsilon } \rightarrow d\) as \(\varepsilon \rightarrow 0\), which is in perfect agreement with (1.3).
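The arithmetic behind this choice of \(\delta \) can be checked numerically. The sketch below is only an illustration: the function name `check_1_5` and the sample values are our assumptions, and \(\alpha _{\varepsilon }\) is taken equal to a fixed \(\alpha \) as in the computation above.

```python
def check_1_5(eps, d, alpha):
    """Evaluate condition (1.5) for p_eps = d - delta, q_eps = d + alpha + delta."""
    delta = eps / (2.0 * (2.0 + alpha / d))
    p_eps = d - delta
    q_eps = d + alpha + delta
    # Middle quantity in (1.5); analytically it equals delta * (2 + alpha / d) = eps / 2.
    gap = q_eps - p_eps - alpha * p_eps / d
    return {
        "delta": delta,
        "p_eps": p_eps,
        "q_eps": q_eps,
        "gap": gap,
        # (1.5): eps * p_eps > gap > 0
        "holds_1_5": 0.0 < gap < eps * p_eps,
        # The pair lies outside the range (1.3), since q_eps - p_eps = alpha + 2 * delta.
        "outside_1_3": q_eps > p_eps + alpha * max(1.0, p_eps / d),
    }


result = check_1_5(eps=0.1, d=3, alpha=0.5)
print(result)
```

The flag `outside_1_3` records that the exponents produced this way never satisfy (1.3), so the counterexample and the positive result do not overlap.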

Second, we also want to compare our result with [16], where the authors proved that the Lavrentiev phenomenon is not observed for \(q \leqq p + {\alpha }\) in the particular case when minimizers of (1.1) are bounded. This requires an extra assumption on the boundary data, namely that \(u_0\) is bounded, so that the maximum principle [25] can be applied. In addition, the reasoning in [16] is based on a so-called Morrey type estimate on the gradient of the minimizer, which is not an obvious result in itself. In contrast, we prove that the Lavrentiev phenomenon does not occur independently of the properties of minimizers or the boundedness of the boundary data. Our methods are elementary and are based on simple estimates on convolutions. We point out that one could naively think that our result is a consequence of [16] and a simple approximation argument (the boundary data \(u_0 \in W^{1,q}(\Omega )\) is approximated with a sequence \(\{u_{0,n}\}_{n \in {\mathbb {N}}} \subset W^{1,q}(\Omega ) \cap L^{\infty }(\Omega )\)), but it is not necessarily true that the corresponding sequence of minimizers has a subsequence converging to a minimizer of the limit problem.

Finally, we emphasize the main novelties of the paper. Standard methods [17, 20] for proving (1.4) are based on the regularization of an arbitrary function \(u \in W_0^{1,p}(\Omega )\) satisfying \({\mathcal {G}}(u) < \infty \) with a sequence of smooth functions \(u^{\varepsilon } = u *\eta _{\varepsilon }\) and passing to the limit \({\mathcal {G}}(u^{\varepsilon }) \rightarrow {\mathcal {G}}(u)\) as \(\varepsilon \rightarrow 0\). The latter is not trivial because the integrand in (1.1) is x-dependent. More precisely, if the integrand is convex and autonomous (i.e. it does not depend on x) one can use Jensen’s inequality and the Vitali convergence theorem to prove that \({\mathcal {G}}(u^{\varepsilon }) \rightarrow {\mathcal {G}}(u)\) whenever \({\mathcal {G}}(u) < \infty \). In particular, there is no Lavrentiev phenomenon in this case; see also [9].

The strategy to deal with the non-autonomous case is to approximate the integrand locally with an autonomous function that does not depend on x (see Lemma 5.4) so that one can exploit Jensen’s inequality. The approximation requires a good estimate on \(\left\| \nabla u^{\varepsilon }\right\| _{\infty }\), which results in a constraint on the exponents p and q. The estimate on the gradient is obtained by writing \(\nabla u^{\varepsilon } = \nabla u *\eta _{\varepsilon }\) and using the fact that \(\nabla u \in L^p(\Omega )\). Our main contribution is the observation that it is sufficient to approximate only bounded functions u (i.e. \(u \in L^{\infty }(\Omega )\)). It turns out that for \(p < d\), it is a better strategy to write \(\nabla u^{\varepsilon } = u *\nabla \eta _{\varepsilon }\) and exploit the estimate \(u \in L^{\infty }(\Omega )\) rather than \(\nabla u \in L^p(\Omega )\). We remark that these observations have already been used in our recent paper on parabolic equations [11], but at that point we did not observe that similar ideas may bring new information to the analysis of the Lavrentiev phenomenon.

The structure of the paper is as follows: in Section 2 we present the main result, Theorem 2.3. The theorem holds true under rather complicated assumptions, so in Section 3 we discuss two representative examples. In Section 4 we review the most important properties of the Musielak–Orlicz–Sobolev spaces. We explain there why it is sufficient to approximate only bounded functions, see Lemmas 4.2 and 4.3. Then, in Section 5 we present the proof of the main result in the particular case of the functional \({\mathcal {G}}\) as in (1.1) and \(\Omega = B\) (i.e. a unit ball). In this case we may neglect many technical difficulties and clearly present the main ideas. Section 6 is devoted to the proof of Theorem 2.3 in the general case. Finally, in Section 7 we briefly discuss how to extend our work to the case of vectorial problems.

2 Main result

Let us first set notation. We always assume that \(\Omega \subset {\mathbb {R}}^d\) is a bounded Lipschitz domain and d is the dimension of the space. We write B for the unit open ball centered at 0. For balls with radius r we use \(B_r\) and if the center is at some general point x, we write \(B_r(x)\) so that \(B_1(0) = B\) and \(B_r(0) = B_r\). Concerning function spaces, we write \(C_c^{\infty }(\Omega )\) for the space of smooth compactly supported functions, \(W^{1,p}(\Omega )\) and \(W^{1,p}_0(\Omega )\) are usual Sobolev spaces, \(W^{1,\psi }(\Omega )\) and \(W^{1,\psi }_0(\Omega )\) are the Musielak–Orlicz–Sobolev spaces defined in Section 4 while \(C^{\alpha }({\overline{\Omega }})\) is the space of Hölder continuous functions on \({\overline{\Omega }}\) with exponent \(\alpha \in (0,1]\). Finally, \(\eta _{\varepsilon }: {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) is a usual mollification kernel.

We already introduced the key motivation of the paper, i.e., the functional (1.1), but the main result concerns more general cases. In the paper we focus on functionals of the form

$$\begin{aligned} {\mathcal {H}}(u) = \int _{\Omega } \psi (x, |\nabla u(x)|) \mathrm {d}x, \end{aligned}$$
(2.1)

where \(\psi \) is the so-called \({\mathcal {N}}\)-function and it satisfies the following assumptions:

Assumption 2.1

We assume that an \({\mathcal {N}}\)-function \(\psi : \Omega \times {\mathbb {R}}^+ \rightarrow {\mathbb {R}}^+\) satisfies

  1. (A1)

    (vanishing at 0) \(\psi (x,\xi ) = 0\) if and only if \(\xi = 0\),

  2. (A2)

    (convexity) for each x, the map \({\mathbb {R}}^+ \ni \xi \mapsto \psi (x,\xi )\) is convex,

  3. (A3)

    (\(p-q\) growth) there exist exponents \(1 \leqq p< q < \infty \) and \(\xi _0 \geqq 1\) and constants \(C_1\) and \(C_2\) such that

    $$\begin{aligned} C_1\, |\xi |^p \leqq \psi (x,\xi ) \text{ for } \xi \geqq \xi _0, \qquad \psi (x,\xi ) \leqq C_2 \, (1+|\xi |^q) \text{ for } \text{ all } \xi \geqq 0, \end{aligned}$$
  4. (A4)

    (\(\Delta _2\) condition) there exists a constant \(C_4\) such that

    $$\begin{aligned} \psi (x, 2\xi ) \leqq C_4 \, \psi (x,\xi ). \end{aligned}$$
  5. (A5)

    (autonomous lower-bound) there is a function \(m_{\psi }: {\mathbb {R}}^+ \rightarrow {\mathbb {R}}^+\) and \(\xi _0\) such that for \(\xi \geqq \xi _0\) we have \(m_{\psi }(\xi ) \leqq \psi (x,\xi )\) and \(\frac{m_{\psi }(\xi )}{\xi } \rightarrow \infty \) as \(\xi \rightarrow \infty \).

Assumption 2.2

Let \(\psi \) be an \({\mathcal {N}}\)-function satisfying Assumption 2.1. We assume that for all \(D>1\), there are constants \(M = M(p,q,D)\) and \(N = N(p,q,D)\) such that

$$\begin{aligned} \psi (z,\xi ) \leqq M \, \psi (y,\xi ) + N \end{aligned}$$

for all balls \(B_{\gamma }(x)\), all \(y, z \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\), all \(\xi \in \left[ 0, D\gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\right] \) and all \(\gamma \in \left( 0,\frac{1}{2}\right) \).

Let us make a few comments on Assumptions 2.1 and 2.2. Conditions (A1)–(A2) are standard in the theory of Orlicz spaces while (A3) reflects the growth of the \({\mathcal {N}}\)-function being trapped between \(p-\) and \(q-\)growth. Condition (A4) ensures good functional analytic properties in \(W_0^{1,\psi }(\Omega )\), cf. Lemma 4.1. Finally, condition (A5) guarantees good behaviour of the second conjugate of \(\psi \) and it is only needed for Step 4 in the proof of Lemma 6.2. In particular, this condition is needed only for general \({\mathcal {N}}\)-functions, so it does not have to hold in the special case \(\varphi (x,\xi ) = |\xi |^p + a(x)\,|\xi |^q\).

Assumption 2.1 thus reflects the basic functional setting. The real cornerstone of the paper is however Assumption 2.2. It is in fact an abstractly formulated condition on continuity of \(\psi \). To understand it better, we note that it is always possible to estimate, for all \(x\in \Omega \),

$$\begin{aligned} \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}} \psi (y,\xi ) \leqq \psi (x,\xi ). \end{aligned}$$

Assumption 2.2 states that the above estimate can be inverted (with a suitable constant). As it seems to be hard to verify it directly, we provide two model examples of \({\mathcal {N}}\)-functions \(\psi \) satisfying this condition in Section 3. In particular, we would like to emphasize that the prototypic functional (1.1) satisfying (1.3) also fulfils Assumption 2.2.

The main result of this paper is

Theorem 2.3

Let \({\mathcal {H}}\) be a functional defined by (2.1) with \(\psi \) satisfying Assumptions 2.1 and 2.2. Then, for all \(u_0 \in W^{1,q}(\Omega )\) we have

$$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p}(\Omega )} {\mathcal {H}}(u) = \inf _{u \in u_0 + W_0^{1,q}(\Omega )} {\mathcal {H}}(u) = \inf _{u \in u_0 + C_c^{\infty }(\Omega )} {\mathcal {H}}(u). \end{aligned}$$

Moreover, the space \(C_c^{\infty }(\Omega )\) is dense in the Musielak–Orlicz–Sobolev space \(W^{1,\psi }_0(\Omega )\).

3 Examples of N-functions satisfying Assumption 2.2

3.1 Standard double phase functionals

In this section, we prove that the \({\mathcal {N}}\)-function

$$\begin{aligned} \varphi (x,\xi ) = |\xi |^p + a(x) \, |\xi |^q \end{aligned}$$

satisfies Assumption 2.2 provided that \(a \in C^{\alpha }({\overline{\Omega }})\) and \(q \leqq p +\alpha \,\max \left( 1,\, \frac{p}{d} \right) \). The related functional reads as

$$\begin{aligned} {\mathcal {G}}(u) := \int _{\Omega } |\nabla u(x)|^p + a(x) \, |\nabla u(x)|^q \mathrm {d}x. \end{aligned}$$

To show this, we use the following lemma, whose assumptions are evidently satisfied by the example given above:

Lemma 3.1

Suppose that \(\psi \) satisfies Assumption 2.1 with exponents p and q. Moreover, assume that there is \(\alpha \in (0,1]\) and constant \(C_3\) such that, for all \(x_1, x_2 \in \Omega \) and \(\xi \geqq \xi _0\), we have that

$$\begin{aligned} \left| \psi (x_1, \xi ) - \psi (x_2,\xi )\right| \leqq C_3\, |x_1 - x_2|^{\alpha } \, (1 + |\xi |^q). \end{aligned}$$
(3.1)

Then, \(\psi \) satisfies Assumption 2.2 provided that \(q \leqq p + \alpha \, \max \left( 1, \frac{p}{d}\right) \).

Proof of Lemma 3.1

First, we may assume that \(\xi > \xi _0\) as for \(\xi \in [0,\xi _0]\) we have that

$$\begin{aligned} \psi (x,\xi ) \leqq C_2 \, (1+|\xi |^q) \leqq C_2(1+|\xi _0|^q) + \psi (y,\xi ) \end{aligned}$$
(3.2)

so the assertion follows with \(M=1\) and \(N=C_2 \, (1+|\xi _0|^q)\). Hence, we fix \(\xi > \xi _0\) and some ball \(B_{\gamma }(x)\) such that \(B_{\gamma }(x) \cap {\overline{\Omega }}\) is not empty. Thanks to (3.1), we have for all \(y, z \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\),

$$\begin{aligned} \psi (z,\xi ) \geqq \psi (y,\xi ) - C_3\,\left( 1 + |\xi |^{q}\right) \, |y - z|^{\alpha } \geqq \psi (y,\xi ) - C_3\,\left( 1 + |\xi |^{q}\right) \, \gamma ^{\alpha }. \end{aligned}$$

As \(\xi \geqq \xi _0 \geqq 1\), we have in fact that

$$\begin{aligned} \psi (z,\xi ) \geqq \psi (y,\xi ) - 2\, C_3\,|\xi |^{q} \, \gamma ^{\alpha }. \end{aligned}$$
(3.3)

To bootstrap this estimate, we fix \(\delta \in (0,1)\) and write

$$\begin{aligned} \psi (z,\xi ) = \delta \, \psi (z,\xi ) + (1-\delta ) \, \psi (z,\xi ) \geqq \delta \, \psi (y,\xi ) - \delta \, 2\, C_3 \, |\xi |^{q} \, \gamma ^{\alpha } + (1-\delta ) \, C_1 |\xi |^p, \end{aligned}$$
(3.4)

where we used (3.3) to estimate the first term and lower bound \(\psi (z,\xi ) \geqq C_1 \, |\xi |^p\) to estimate the second term. Now, we may write

$$\begin{aligned} 2\, \delta \,C_3\, |\xi |^{q} \, \gamma ^{\alpha } = 2\, \delta \, C_3\, |\xi |^{q-p} \,|\xi |^{p} \, \gamma ^{\alpha } \leqq 2\,\delta \, C_3 \, D^{q-p} \, \gamma ^{\alpha - (q - p)\,\min \left( 1,\, \frac{d}{p}\right) } \, |\xi |^p, \end{aligned}$$
(3.5)

where we used \(|\xi | \leqq D\,\gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\). As \(q - p \leqq \alpha \, \max \left( 1, \frac{p}{d}\right) \), we have

$$\begin{aligned} \alpha - (q - p)\,\min \left( 1,\, \frac{d}{p}\right) \geqq \alpha - \alpha \, \max \left( 1, \frac{p}{d}\right) \, \min \left( 1,\, \frac{d}{p}\right) = \alpha - \alpha = 0. \end{aligned}$$

It follows that \( \gamma ^{\alpha - (q - p)\,\min \left( 1,\, \frac{d}{p}\right) } \leqq 1 \) for \(\gamma \in \left( 0, \frac{1}{2}\right) \). Hence, coming back to (3.4) we obtain

$$\begin{aligned} \psi (z,\xi )\geqq & {} \delta \, \psi (y,\xi ) - \delta \, 2\, C_3\, D^{q-p} \, |\xi |^p+ (1-\delta ) \, C_1 \, |\xi |^p \\= & {} \delta \, \psi (y,\xi ) + \left( (1 - \delta )\,C_1 - \delta \,2\, C_3\,D^{q-p} \right) \, |\xi |^p. \end{aligned}$$

We choose \(\delta = \frac{C_1}{C_1 + 2\,C_3\,D^{q-p}}\) so that \(\left( (1 - \delta )\,C_1 - \delta \,2\, C_3\,D^{q-p} \right) \, |\xi |^p = 0\). Hence, for all \(y, z \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\),

$$\begin{aligned} \psi (z,\xi ) \geqq \delta \, \psi (y,\xi ), \end{aligned}$$

so combining this with (3.2), the proof is concluded with \(M = \max \left( 1/\delta , 1\right) \) and \(N = C_2\left( 1+|\xi _0|^q\right) \). \(\square \)
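As an informal sanity check of Lemma 3.1, the following sketch samples the inequality of Assumption 2.2 for the double phase integrand with the hypothetical coefficient \(a(x) = |x|^{\alpha }\), whose Hölder seminorm equals 1. All names and sample values are our assumptions. Points are sampled on a line through the origin, and we use \(C_3 = 2^{\alpha }\) to account for \(|y - z| \leqq 2\gamma \) inside a ball of radius \(\gamma \); this is a numerical illustration under these assumptions, not part of the proof.

```python
import random


def psi(a_val, xi, p, q):
    """Double phase integrand xi^p + a * xi^q for a fixed coefficient value a."""
    return xi ** p + a_val * xi ** q


def lemma_constants(C1, C2, C3, D, p, q, xi0=1.0):
    """delta, M and N chosen as in the proof of Lemma 3.1."""
    delta = C1 / (C1 + 2.0 * C3 * D ** (q - p))
    return delta, max(1.0 / delta, 1.0), C2 * (1.0 + xi0 ** q)


def worst_violation(d=2, p=1.5, alpha=0.5, D=3.0, trials=5000, seed=0):
    """Largest sampled value of psi(z, xi) - (M * psi(y, xi) + N).

    A negative return value means that no sampled triple violates Assumption 2.2.
    """
    q = p + alpha * max(1.0, p / d)      # endpoint of the range (1.3)
    rate = min(1.0, d / p)               # exponent of gamma in the bound on xi
    C1, C2, C3 = 1.0, 2.0, 2.0 ** alpha  # constants for a(x) = |x|^alpha (assumed)
    delta, M, N = lemma_constants(C1, C2, C3, D, p, q)
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        gamma = rng.uniform(1e-4, 0.5)
        x = rng.uniform(-1.0, 1.0)       # centre of the ball, on a line through 0
        lo, hi = max(-1.0, x - gamma), min(1.0, x + gamma)
        y, z = rng.uniform(lo, hi), rng.uniform(lo, hi)
        xi = rng.uniform(0.0, D * gamma ** (-rate))
        a_y, a_z = abs(y) ** alpha, abs(z) ** alpha
        worst = max(worst, psi(a_z, xi, p, q) - (M * psi(a_y, xi, p, q) + N))
    return worst


print("worst sampled violation:", worst_violation())
```

The helper `lemma_constants` also makes the choice of \(\delta \) explicit: with this value the coefficient \((1-\delta )\,C_1 - 2\,\delta \,C_3\,D^{q-p}\) vanishes.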

3.2 Variable exponent double phase functionals

In this section we prove that the \({\mathcal {N}}\)-function

$$\begin{aligned} \phi (x,\xi ) := |\xi |^{p(x)} +a(x)\,|\xi |^{q(x)} \end{aligned}$$
(3.6)

satisfies our Assumption 2.2. The related functional reads as

$$\begin{aligned} {\mathcal {J}}(u,\Omega ):= \int _{\Omega } \left[ |\nabla u|^{p(x)} +a(x)\,|\nabla u|^{q(x)}\right] \mathrm {d}x. \end{aligned}$$
(3.7)

Assumption 3.2

We assume that

  1. (B1)

    (\(p-q\) growth) there exist p, q with \(1 < p \leqq q\) such that the functions \(p(x), q(x): \Omega \rightarrow [1, \infty )\) satisfy \(p \leqq p(x) \leqq q(x) \leqq q\),

  2. (B2)

    (\(\log \)-Hölder continuity) there are constants \(C_p, C_q\) such that for all \(x, y \in \Omega \) with \(|x-y| \leqq \min \left( \text{ diam }\, \Omega , \frac{1}{2} \right) \) we have

    $$\begin{aligned} |p(x) - p(y)| \leqq -\frac{C_p}{\log |x-y|}, \qquad \qquad |q(x) - q(y)| \leqq -\frac{C_q}{\log |x-y|}. \end{aligned}$$
  3. (B3)

    (\(\alpha \)-Hölder continuity) \(a \in C^{\alpha }({\overline{\Omega }})\) with constant \(|a|_{\alpha }\).

Lemma 3.3

Under Assumption 3.2, \({\mathcal {N}}\)-function \(\phi \) defined with (3.6) satisfies Assumption 2.2 for q and p such that \(q \leqq p + \alpha \, \max \left( 1, \frac{p}{d}\right) \).

Proof

As in the proof of Lemma 3.1, we only need to consider \(\xi \geqq 1\). Let us estimate \(\frac{\phi (x,\xi )}{\phi (y,\xi )}\) for \(x, y \in \Omega \) such that \(|x-y| \leqq \min \left( \text{ diam }\, \Omega , \frac{1}{2} \right) \). Using \(a \geqq 0\), we have that

$$\begin{aligned} \begin{aligned} \frac{\phi (x,\xi )}{\phi (y,\xi )}&= \frac{|\xi |^{p(x)} +a(x)\,|\xi |^{q(x)}}{|\xi |^{p(y)} +a(y)\,|\xi |^{q(y)}} = \frac{|\xi |^{q(x)}}{|\xi |^{q(y)}} \frac{|\xi |^{p(x)-q(x)} +a(x)}{|\xi |^{p(y)-q(y)} +a(y)} \leqq \\&\leqq |\xi |^{q(x)-q(y)}\, \left[ \frac{|\xi |^{p(x)-q(x)}}{|\xi |^{p(y)-q(y)} + a(y)} + \frac{a(x) - a(y)}{|\xi |^{p(y)-q(y)} + a(y)} + \frac{a(y)}{|\xi |^{p(y)-q(y)} + a(y)} \right] \\&\leqq |\xi |^{q(x)-q(y)}\, \left[ \frac{|\xi |^{p(x)-q(x)}}{|\xi |^{p(y)-q(y)}} + \frac{a(x) - a(y)}{|\xi |^{p(y)-q(y)}} + 1 \right] \\&\leqq |\xi |^{q(x)-q(y)}\, \left[ {|\xi |^{p(x)-p(y)}}\,{|\xi |^{q(y)-q(x)}} + \frac{a(x) - a(y)}{|\xi |^{p(y)-q(y)}} + 1 \right] \\&\leqq |\xi |^{-\frac{C_q}{\log |x-y|}}\, \left[ {|\xi |^{-\frac{C_p}{\log |x-y|}}}\,{|\xi |^{-\frac{C_q}{\log |x-y|}}} + {|a|_{\alpha }\,|x-y|^{\alpha }} \, {|\xi |^{q(y)-p(y)}} + 1 \right] \\&\leqq |\xi |^{-\frac{C_q}{\log |x-y|}}\, \left[ {|\xi |^{-\frac{C_p}{\log |x-y|}}}\,{|\xi |^{-\frac{C_q}{\log |x-y|}}} + {|a|_{\alpha }\,|x-y|^{\alpha }} \, {|\xi |^{\alpha \,\max \left( 1,\, \frac{p}{d} \right) }} + 1 \right] \end{aligned} \end{aligned}$$
(3.8)

Now, let \(D> 1\) and \(\gamma \in \left( 0,\frac{1}{2}\right) \). Suppose that \(q \leqq p + \alpha \, \max \left( 1,\, \frac{p}{d}\right) \), \(|x-y| \leqq \gamma \) and \(\xi \in \left[ 1, D\gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\right] \). Let \(C>0\). Then,

$$\begin{aligned} |\xi |^{-\frac{C}{\log |x-y|}} \leqq \left( D\gamma ^{-\min \left( 1,\, \frac{d}{p}\right) } \right) ^{-\frac{C}{\log \gamma }} = D^{-\frac{C}{\log \gamma }} \, \gamma ^{\frac{C}{\log \gamma }\,\min \left( 1,\, \frac{d}{p}\right) } = D^{-\frac{C}{\log \gamma }} \, e^{C\, \min \left( 1,\, \frac{d}{p}\right) }. \end{aligned}$$

We apply this estimate with \(C = C_p, C_q\) and find a constant E, depending only on D, \(C_p\) and \(C_q\), such that

$$\begin{aligned} D^{-\frac{C_q}{\log \gamma }} \, e^{C_q}, D^{-\frac{C_p}{\log \gamma }} \, e^{C_p} \leqq E \end{aligned}$$

for all \(\gamma \in \left( 0,\frac{1}{2}\right) \). It follows that

$$\begin{aligned} |\xi |^{-\frac{C_p}{\log |x-y|}}, |\xi |^{-\frac{C_q}{\log |x-y|}} \leqq E. \end{aligned}$$

Using (3.8), we obtain

$$\begin{aligned} \frac{\phi (x,\xi )}{\phi (y,\xi )}\leqq & {} E\,\left( E^2 + |a|_{\alpha }\,\gamma ^{\alpha } \, \left( D\gamma ^{-\min \left( 1,\, \frac{d}{p}\right) } \right) ^{\alpha \, \max \left( 1,\, \frac{p}{d} \right) } + 1 \right) \\= & {} E\,\left( E^2 + D^{\alpha \, \max \left( 1,\, \frac{p}{d} \right) }\,|a|_{\alpha } + 1 \right) =:M, \end{aligned}$$

where we used \(\max \left( 1,\, \frac{p}{d} \right) \, \min \left( 1,\, \frac{d}{p}\right) = 1\) in the last line. We deduce that

$$\begin{aligned} \phi (x,\xi ) \leqq M \, \phi (y,\xi ). \end{aligned}$$

\(\square \)
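The key estimate on \(|\xi |^{-C/\log |x-y|}\) used in the proof above can be illustrated numerically. In the sketch below the function names are our own, and the explicit bound \(D^{C/\log 2}\, e^{C \min (1,\, d/p)}\) is one admissible choice for a constant of the type E (it uses \(-1/\log \gamma < 1/\log 2\) for \(\gamma \in (0, \frac{1}{2})\) and \(D > 1\)); the proof itself only requires that such a constant exists.

```python
import math


def log_hoelder_factor(xi, dist, C):
    """The factor xi^(-C / log(dist)) for xi >= 1 and 0 < dist < 1."""
    return xi ** (-C / math.log(dist))


def uniform_bound(D, C, d, p):
    """One admissible gamma-independent bound, valid for gamma in (0, 1/2), D > 1."""
    return D ** (C / math.log(2.0)) * math.exp(C * min(1.0, d / p))


# Illustrative values (assumptions): D = 3, C = 1, d = 3, p = 2.
D, C, d, p = 3.0, 1.0, 3, 2.0
rate = min(1.0, d / p)
for gamma in (0.49, 0.1, 1e-3, 1e-6):
    xi_max = D * gamma ** (-rate)                  # largest admissible xi
    value = log_hoelder_factor(xi_max, gamma, C)   # worst case |x - y| = gamma
    closed_form = D ** (-C / math.log(gamma)) * math.exp(C * rate)
    print(f"gamma={gamma}: value {value:.4f}, closed form {closed_form:.4f}, "
          f"bound {uniform_bound(D, C, d, p):.4f}")
```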

4 Musielak–Orlicz–Sobolev spaces

Our results are based on smooth approximation in the Musielak–Orlicz spaces, so we first recall their definitions and basic properties. For more details, we refer to the monographs [13, 23]. We consider an \({\mathcal {N}}\)-function \(\psi :\Omega \times {\mathbb {R}}^+ \rightarrow {\mathbb {R}}\) satisfying (A1)–(A4) in Assumption 2.1. For \(f: \Omega \rightarrow {\mathbb {R}}^d\) such that \(\int _{\Omega } \psi (x, |f(x)|) \mathrm {d}x < \infty \), we define the related Luxemburg norm by

$$\begin{aligned} \Vert f \Vert _{\psi } = \inf \left\{ \lambda >0: \int _{\Omega } \psi \left( x, \frac{\left| f(x)\right| }{\lambda }\right) \mathrm {d}x \leqq 1 \right\} . \end{aligned}$$
(4.1)

Finally, the Musielak–Orlicz–Sobolev spaces are defined as

$$\begin{aligned} W^{1,\psi }(\Omega ) = \{w \in W^{1,1}(\Omega ): \Vert \nabla w \Vert _{\psi } < \infty \}, \qquad W^{1,\psi }_0(\Omega ) = W^{1,1}_0(\Omega ) \cap W^{1,\psi }(\Omega ), \end{aligned}$$
(4.2)

where the latter corresponds to the space of functions vanishing at the boundary. These are normed spaces with norm

$$\begin{aligned} \Vert w\Vert _{1,\psi } = \Vert w\Vert _{1} + \Vert \nabla w\Vert _{\psi }. \end{aligned}$$
(4.3)

One can think of \(W^{1,\psi }(\Omega )\) as the space of functions having gradient integrable with p or q power depending on whether \(a = 0\) or not.
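To illustrate the Luxemburg norm (4.1), the following minimal sketch approximates it for the double phase integrand on a discretised domain. The discretisation by Riemann sums, the bisection search and all names are our assumptions, not a construction used in the paper. The search relies on the fact that the modular is nonincreasing in \(\lambda \).

```python
def modular(values, weights, a, p, q, lam):
    """Riemann-sum approximation of the integral of psi(x, |f| / lam),
    where psi(x, t) = t^p + a(x) * t^q."""
    return sum(w * ((abs(f) / lam) ** p + ai * (abs(f) / lam) ** q)
               for f, w, ai in zip(values, weights, a))


def luxemburg_norm(values, weights, a, p, q, tol=1e-12):
    """Smallest lam > 0 with modular <= 1, located by bisection."""
    if all(f == 0 for f in values):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(values, weights, a, p, q, hi) > 1.0:
        hi *= 2.0                        # enlarge until the modular drops below 1
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(values, weights, a, p, q, mid) <= 1.0:
            hi = mid                     # mid is admissible, search below it
        else:
            lo = mid
    return hi


# Illustrative data (assumptions): f(x) = x on (0, 1), a(x) = sqrt(x), p = 2, q = 3.
n = 1000
xs = [(i + 0.5) / n for i in range(n)]
weights = [1.0 / n] * n
print(luxemburg_norm(xs, weights, [x ** 0.5 for x in xs], 2.0, 3.0))
```

When \(a = 0\) the integrand reduces to \(|\xi |^p\) and the computed value agrees with the discrete \(L^p\) norm, which gives a simple consistency check.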

We summarize some properties of the Musielak–Orlicz spaces in the following lemma. They are mainly consequences of (A4) in Assumption 2.1. The proof can be found in many texts on Orlicz spaces [13, 23], yet for the sake of completeness, we present the proof in “Appendix A.2”.

Lemma 4.1

Let \(\psi \) satisfy Assumption 2.1 and let \(f, f_n: \Omega \rightarrow {\mathbb {R}}^d\). Then,

  1. (C1)

    \(\Vert f\Vert _{\psi }< \infty \iff \int _{\Omega } \psi (x,c|f(x)|) \mathrm {d}x < \infty \) for some \(c>0\) \(\iff \) \(\int _{\Omega } \psi (x,c|f(x)|) \mathrm {d}x < \infty \) for all \(c >0\),

  2. (C2)

    \(\Vert f_n - f\Vert _{\psi } \rightarrow 0 \iff \) for some \(c>0\) \(\int _{\Omega } \psi (x,c\,|f_n(x) - f(x)|) \mathrm {d}x \rightarrow 0\) \(\iff \) for all \(c >0\) \(\int _{\Omega } \psi (x,c|f_n(x) - f(x)|) \mathrm {d}x \rightarrow 0\),

  3. (C3)

    if \(\Vert f\Vert _{\psi } < \infty \) and any of the conditions in (C2) is satisfied, we have \(\int _{\Omega } \psi (x,|f_n(x)|) \mathrm {d}x \rightarrow \int _{\Omega } \psi (x,|f(x)|) \mathrm {d}x\),

  4. (C4)

    if \(f_n \rightarrow f\) a.e. on \(\Omega \), \(\Vert f\Vert _{\psi } < \infty \) and the sequence \(\{\psi (x,|f_n(x)|)\}_{n \in {\mathbb {N}}}\) is uniformly integrable then \(\Vert f_n - f\Vert _{\psi } \rightarrow 0\),

  5. (C5)

    if \(\Vert f\Vert _{\psi } < \infty \) then \(f \in L^1(\Omega )\).

The next two lemmas show that to prove the absence of the Lavrentiev phenomenon, it is sufficient to demonstrate that every \(u\in W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\) can be approximated in the topology of \(W^{1,\psi }\) by smooth functions from \(C_c^{\infty }(\Omega )\).

The first lemma shows that it is enough to consider only bounded functions. Notice that we do not impose any specific assumption on the \({\mathcal {N}}\)-function \(\psi \) here.

Lemma 4.2

Space \(W_0^{1,\psi }(\Omega ) \cap L^{\infty }(\Omega )\) is dense in \(W_0^{1,\psi }(\Omega )\).

Proof

Let \(u \in W_0^{1,\psi }(\Omega )\). Consider truncation of u defined as

$$\begin{aligned} T_k(u) = {\left\{ \begin{array}{ll} u &{} \text{ if } |u| \leqq k, \\ k\, \frac{u}{|u|} &{} \text{ if } |u| > k. \end{array}\right. } \end{aligned}$$
(4.4)

Clearly, \(T_k(u) \in L^{\infty }(\Omega )\). Moreover, chain rule for Sobolev maps implies that \(\nabla T_k(u) = \nabla u \, \mathbbm {1}_{|u|\leqq k}\) so that \(\nabla T_k(u) \rightarrow \nabla u\) a.e. as \(k \rightarrow \infty \). As \(\psi (x,0) = 0\), we have

$$\begin{aligned} 0 \leqq \psi (x, \left| \nabla T_k(u)\right| ) = \psi (x, \left| \nabla u\right| ) \, \mathbbm {1}_{|u| \leqq k} \leqq \psi (x,\left| \nabla u\right| ) \end{aligned}$$

so that the sequence \(\left\{ \psi (x, \left| \nabla T_k(u)\right| )\right\} _{k \in {\mathbb {N}}}\) is uniformly integrable. Application of (C4) from Lemma 4.1 concludes the proof. \(\square \)
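The pointwise domination used in this proof has a simple discrete analogue, sketched below for a scalar function on a one-dimensional grid. The finite-difference setting, the sample function and all names are our assumptions. Since the truncation is 1-Lipschitz, every difference quotient of \(T_k(u)\) is dominated by that of u, so the discrete modular of the truncation never exceeds that of u and is nondecreasing in k.

```python
def truncate(u, k):
    """Scalar truncation T_k(u): u if |u| <= k, otherwise k * sign(u)."""
    return max(-k, min(k, u))


def discrete_modular(samples, h, a, p, q):
    """Sum of h * psi(x_i, |difference quotient|) with psi(x, t) = t^p + a(x) t^q."""
    total = 0.0
    for i in range(len(samples) - 1):
        slope = abs(samples[i + 1] - samples[i]) / h
        total += h * (slope ** p + a[i] * slope ** q)
    return total


# Illustrative data (assumptions): an unbounded profile u(x) = x^(-1/4) - 1 on (0, 1].
h = 1e-3
xs = [(i + 1) * h for i in range(1000)]
u = [x ** -0.25 - 1.0 for x in xs]
a = [x ** 0.5 for x in xs]
full = discrete_modular(u, h, a, 1.5, 2.0)
for k in (1, 2, 4):
    truncated = [truncate(v, k) for v in u]
    print(k, discrete_modular(truncated, h, a, 1.5, 2.0), "<=", full)
```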

Lemma 4.3

Suppose that \(\psi \) satisfies (A1)–(A4) in Assumption 2.1. Let \(p < q\) be exponents as in (A3) in Assumption 2.1. Suppose that for every \(u\in W_0^{1,\psi }(\Omega ) \cap L^{\infty }(\Omega )\) there exists a sequence \(\{u^n\}_{n=1}^{\infty } \subset C_c^{\infty }(\Omega )\) such that \(\Vert u^n-u\Vert _{1,\psi } \rightarrow 0\) as \(n\rightarrow \infty \). Then, the space \(C_c^{\infty }(\Omega )\) is dense in \(W_0^{1,\psi }(\Omega )\) and the Lavrentiev phenomenon does not occur, i.e., for all \(u_0 \in W^{1,q}(\Omega )\) we have that

$$\begin{aligned} \inf _{u\in u_0 + W^{1,p}_0(\Omega )} {\mathcal {H}}(u) = \inf _{u\in u_0 + W^{1,q}_0(\Omega )} {\mathcal {H}}(u) = \inf _{u\in u_0 + C_c^{\infty }(\Omega )} {\mathcal {H}}(u). \end{aligned}$$

Proof

Thanks to Lemma 4.2 and the assumption of the lemma, \(C_c^{\infty }(\Omega )\) is dense in \(W_0^{1,\psi }(\Omega )\). Let \(u^* \in u_0 + W_0^{1,p}(\Omega )\) be a minimizer of \({\mathcal {H}}\), i.e.

$$\begin{aligned} \inf _{u\in u_0 + W^{1,p}_0(\Omega )} {\mathcal {H}}(u) = {\mathcal {H}}(u^*). \end{aligned}$$

The minimizer exists by a usual application of the direct method in the calculus of variations, cf. [37, Theorem 2.7]. Note that we always have that

$$\begin{aligned} {\mathcal {H}}(u^*) = \inf _{u\in u_0 + W^{1,p}_0(\Omega )} {\mathcal {H}}(u) \leqq \inf _{u\in u_0 + W^{1,q}_0(\Omega )} {\mathcal {H}}(u) \leqq \inf _{u\in u_0 + C_c^{\infty }(\Omega )} {\mathcal {H}}(u), \end{aligned}$$

because \(p < q\). To prove the reversed inequality, we write \(u^* = u_0 + {\overline{u}}\) where \({\overline{u}} \in W_0^{1,p}\). Note that \(u_0 \in W^{1,\psi }(\Omega )\) (because \(W^{1,q}(\Omega ) \subset W^{1,\psi }(\Omega )\)) and \(u^* \in W^{1,\psi }(\Omega )\) (because \({\mathcal {H}}(u^*) < \infty \) cf. Lemma 4.1 (C1)). It follows that \({\overline{u}} = u^* - u_0 \in W^{1,\psi }_0(\Omega )\). Now, consider the sequence \(\{u_n\}_{n\in {\mathbb {N}}} \subset C_c^{\infty }(\Omega )\) such that \(u_n \rightarrow {\overline{u}}\) in \(W^{1,\psi }(\Omega )\) which exists due to the assumptions. It follows that \(u_n + u_0 \rightarrow {\overline{u}} + u_0 = u^*\) in \(W^{1,\psi }(\Omega )\). In particular, \({\mathcal {H}}(u_0 + u_n) \rightarrow {\mathcal {H}}(u^*)\) cf. Lemma 4.1 (C3). Note that \(u_0 + u_n \in u_0 + C_c^{\infty }(\Omega )\). It follows that

$$\begin{aligned} \inf _{u\in u_0 + C_c^{\infty }(\Omega )} {\mathcal {H}}(u) \leqq {\mathcal {H}}(u_0 + u_n) \rightarrow {\mathcal {H}}(u^*) \qquad \text{ as } n \rightarrow \infty . \end{aligned}$$

\(\square \)

5 Proof of Theorem 2.3 in the special case

In this section we prove Theorem 2.3 in the case when \(\Omega = B\) (unit ball centered at 0) and the \({\mathcal {N}}\)-function is defined via the formula

$$\begin{aligned} \varphi (x,\xi ) = |\xi |^p + a(x) \, |\xi |^q. \end{aligned}$$
(5.1)

The corresponding functional then takes the form

$$\begin{aligned} {\mathcal {G}}(u) := \int _B |\nabla u(x)|^p + a(x) \, |\nabla u(x)|^q \mathrm {d}x. \end{aligned}$$

Note that, if \(a \in C^{\alpha }({\overline{B}})\) and \(q\leqq p + \alpha \, \max \left( 1, \,\frac{p}{d}\right) \), it follows from Lemma 3.1 that \(\varphi \) satisfies Assumption 2.2.

The purpose of this section is to present the main parts of the proof while avoiding the technical difficulties. More precisely, we do not need to take care of difficulties coming from

  • geometric properties of general Lipschitz domain \(\Omega \),

  • situation when for general \({\mathcal {N}}\)-function \(\psi \) there is no local minimizer of the map \(x \mapsto \psi (x,\xi )\) valid for all values of \(\xi \).

We start by introducing the mollification that will be used to define the approximation.

Definition 5.1

(Mollification with squeezing) For \(\varepsilon \in (0, 1/4)\) we set \(\eta _{\varepsilon }(x) = \frac{1}{\varepsilon ^d} \eta \left( \frac{x}{\varepsilon }\right) \) where \(\eta \) is a usual mollification kernel. Then, for arbitrary \(u: {\mathbb {R}}^d \rightarrow {\mathbb {R}}\), we define \(u^{\varepsilon }: {\mathbb {R}}^d \rightarrow {\mathbb {R}}\) as

$$\begin{aligned} u^{\varepsilon }(x) = \int _{{\mathbb {R}}^d} \eta _{\varepsilon }(y) \, u\left( \frac{x}{1-2 \varepsilon } - y \right) \mathrm {d}y. \end{aligned}$$

Lemma 5.2

Let \(u \in W^{1,1}_0(B)\) be extended by zero onto \({\mathbb {R}}^d\). Then, \(u^{\varepsilon } \in C_c^{\infty }(B)\). Moreover, \(\frac{x}{1-2 \varepsilon } - y \in B_{5\varepsilon }(x)\) for all \(x \in B\) and all y such that \(|y|\leqq \varepsilon \).

Proof

Smoothness follows from standard properties of convolutions cf. [22, Appendix C.4]. To see the compact support, let \(|x| \geqq 1 - \varepsilon \) and \(|y| \leqq \varepsilon \). Then,

$$\begin{aligned} \left| \frac{x}{1- 2\,\varepsilon } - y \right|\geqq & {} \frac{1-\varepsilon }{1 - 2\,\varepsilon } - \varepsilon = \frac{1-\varepsilon }{1 - 2\,\varepsilon } - \frac{\varepsilon -2\varepsilon ^2}{1 - 2\,\varepsilon } \\= & {} \frac{1 - 2\,\varepsilon + 2\,\varepsilon ^2}{1 - 2\,\varepsilon } = 1 + \frac{2\,\varepsilon ^2}{1 - 2\,\varepsilon } > 1 \end{aligned}$$

so that \(u\left( \frac{x}{1-2 \varepsilon } - y \right) = 0\). It follows that \(u^{\varepsilon }\) is supported in \(B_{1-\varepsilon }\). To see the second property, we estimate

$$\begin{aligned} \left| x - \frac{x}{1-2 \varepsilon } + y \right| \leqq |x| \frac{2\varepsilon }{1-2\varepsilon } + |y| \leqq 4 \varepsilon + \varepsilon = 5\varepsilon , \end{aligned}$$

where we used \(\frac{1}{1 - 2\,\varepsilon } \leqq 2\), i.e. \(\varepsilon \leqq \frac{1}{4}\). \(\square \)
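The two geometric facts of Lemma 5.2 can be sanity-checked numerically. The following is only an illustrative sketch and not part of the proof: it works in dimension \(d = 2\) (an arbitrary choice), tests random samples rather than all points, and the helper names `squeezed_point`, `random_point_with_norm` and `check_lemma_5_2` are hypothetical.

```python
import math
import random


def squeezed_point(x, y, eps):
    """Return x / (1 - 2*eps) - y, the argument of u in Definition 5.1."""
    scale = 1.0 / (1.0 - 2.0 * eps)
    return tuple(scale * xi - yi for xi, yi in zip(x, y))


def random_point_with_norm(rng, radius):
    """Random point of R^2 with prescribed Euclidean norm (illustrative, d = 2)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (radius * math.cos(angle), radius * math.sin(angle))


def check_lemma_5_2(eps, samples=2000, seed=0):
    """Check both claims of Lemma 5.2 on random samples; True if all pass."""
    rng = random.Random(seed)
    for _ in range(samples):
        y = random_point_with_norm(rng, rng.uniform(0.0, eps))
        # Claim 1: if |x| >= 1 - eps, the shifted point leaves the unit ball,
        # so u (extended by zero) vanishes there.
        x_out = random_point_with_norm(rng, rng.uniform(1.0 - eps, 1.0))
        if math.hypot(*squeezed_point(x_out, y, eps)) <= 1.0:
            return False
        # Claim 2: for x in B, the shifted point stays within 5*eps of x.
        x_in = random_point_with_norm(rng, rng.uniform(0.0, 1.0))
        z = squeezed_point(x_in, y, eps)
        if math.hypot(z[0] - x_in[0], z[1] - x_in[1]) > 5.0 * eps:
            return False
    return True
```

Calling `check_lemma_5_2(0.1)` exercises both estimates for \(\varepsilon = 0.1\); any value in \((0, 1/4)\) may be used.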

Before formulating the main theorem of this section, we state and prove two results: a technical lemma concerning the approximating sequence and a simple observation concerning the \({\mathcal {N}}\)-function \(\varphi \).

Lemma 5.3

Let \(u \in W^{1,1}_0(B)\) be such that \({\mathcal {G}}(u)<\infty \) and consider its extension to \({\mathbb {R}}^d\). Then,

  1. (D1)

    \(\varphi \left( \frac{x}{1-2\,\varepsilon }, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }\right) \right) \rightarrow \varphi (x,\left| \nabla u(x)\right| )\) in \(L^1({\mathbb {R}}^d)\),

  2. (D2)

    \( \int _{{\mathbb {R}}^d} \varphi \left( \frac{x}{1-2\,\varepsilon }-y, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) \eta _{\varepsilon }(y) \mathrm {d}y \rightarrow \varphi \left( x, \left| \nabla u\right| (x) \right) \) in \(L^1({\mathbb {R}}^d)\).

Proof

To see (D1), we note that the convergence holds in the pointwise sense. Moreover, the considered sequence is supported only for \(x \in B_{1-2\varepsilon }\). Therefore, to establish convergence in \(L^1({\mathbb {R}}^d)\), it is sufficient to prove equiintegrability of the sequence \(\left\{ \varphi \left( \frac{x}{1-2\,\varepsilon }, \left| \nabla u \right| \left( \frac{x}{1-2\,\varepsilon }\right) \right) \right\} _{\varepsilon }\) and apply the Vitali convergence theorem. To this end, we need to prove that

$$\begin{aligned} \forall _{\eta>0}\, \exists _{\delta >0} \, \forall _{\varepsilon \in (0,1/4)} \, \forall _{A\subset B, |A|\leqq \delta } \qquad \, \int _{A} \varphi \left( \frac{x}{1-2\,\varepsilon }, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }\right) \right) \mathrm {d}x \leqq \eta . \end{aligned}$$

We fix \(\eta \) and arbitrary \(A \subset B\). Using change of variables we have that

$$\begin{aligned}&\int _{A} \varphi \left( \frac{x}{1-2\,\varepsilon }, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }\right) \right) \mathrm {d}x \\&\quad = (1-2\,\varepsilon )^d \int _{A/(1-2\,\varepsilon )} \varphi (x, \left| \nabla u\right| (x) )\mathrm {d}x \leqq \int _{2A} \varphi (x,\left| \nabla u\right| (x)) \mathrm {d}x, \end{aligned}$$

where, for \(c \in {\mathbb {R}}^+\), \(cA\) denotes the usual scaled set \(\{c\,x : x \in A\}\). By assumption, we have that \({\mathcal {G}}(u) < \infty \), so if we set

$$\begin{aligned} \omega (\tau ) := \sup _{C \subset {\mathbb {R}}^d: |C|\leqq \tau } \int _C \varphi (x,\left| \nabla u\right| (x)) \mathrm {d}x, \end{aligned}$$

then \(\omega \) is a non-decreasing function, continuous at 0. Therefore, we may find \(\tau \) such that \(\omega (\tau ) \leqq 2^{-q} \, \eta \). Then, we choose \(\delta = 2^{-d} \, \tau \) to conclude the proof of (D1). Finally, the convergence result (D2) follows from Young’s convolution inequality and (D1). \(\square \)

Lemma 5.4

Let \(\varphi \) be given by (5.1). Then for all balls \(B_{\gamma }(x)\) such that \(\overline{B_{\gamma }(x)} \cap {\overline{B}}\) is nonempty, there exists \(x^* \in \overline{B_{\gamma }(x)} \cap {\overline{B}}\) such that, for all \(\xi \),

$$\begin{aligned} \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{B}}} \varphi (y,\xi ) = \varphi (x^*,\xi ). \end{aligned}$$

Proof

Using continuity of a and compactness of \(\overline{B_{\gamma }(x)} \cap {\overline{B}}\) we have that

$$\begin{aligned} \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{B}}} \varphi (y,\xi ) = \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{B}}} \left[ |\xi |^p + a(y) \, |\xi |^q \right] = |\xi |^p + |\xi |^q \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{B}}} a(y), \end{aligned}$$

and we choose \(x^*\) such that \(\inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{B}}} a(y) = a(x^*)\); this choice does not depend on \(\xi \). \(\square \)
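The point of Lemma 5.4 is that a single minimizer of the weight works for every \(\xi \) at once. A minimal numerical sketch of this observation follows, assuming a one-dimensional slice of \(B\), a grid search instead of an exact minimization, and a hypothetical weight \(a(y) = |y - 0.3|^{\alpha }\) with sample exponents \(p = 1.5\), \(q = 2\), \(\alpha = 0.5\); all function names are illustrative.

```python
def phi(a_value, xi, p=1.5, q=2.0):
    """Double-phase integrand |xi|^p + a |xi|^q from (5.1); p, q are sample exponents."""
    return abs(xi) ** p + a_value * abs(xi) ** q


def weight(y, alpha=0.5):
    """Hypothetical C^alpha weight a(y) = |y - 0.3|^alpha on a 1D slice of B."""
    return abs(y - 0.3) ** alpha


def minimiser_on_ball(center, gamma, n=401):
    """Grid minimiser of the weight over the closed ball intersected with [-1, 1]."""
    grid = [center - gamma + 2.0 * gamma * k / (n - 1) for k in range(n)]
    grid = [y for y in grid if abs(y) <= 1.0]
    return min(grid, key=weight), grid


def same_minimiser_for_all_xi(center, gamma, xis):
    """True if one point x_star minimises phi(., xi) over the grid for every xi."""
    x_star, grid = minimiser_on_ball(center, gamma)
    return all(
        phi(weight(x_star), xi) == min(phi(weight(y), xi) for y in grid)
        for xi in xis
    )
```

The check succeeds because \(\varphi (y,\xi )\) depends on \(y\) only through \(a(y)\) and is nondecreasing in \(a(y)\); for a general \({\mathcal {N}}\)-function no such structure is available.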

Theorem 5.5

(Theorem 2.3 in the special case) Let \(u \in W^{1,\varphi }_0(B) \cap L^{\infty }(B)\) with \(a \in C^{\alpha }({\overline{B}})\). Suppose that

$$\begin{aligned} 1 \leqq p < q \leqq p + {\alpha } \max \left( 1, \frac{p}{d}\right) . \end{aligned}$$

Consider the sequence \(u^{\varepsilon }\) as in Definition 5.1 with \( \varepsilon \in \left( 0, \frac{1}{4} \right) . \) Then,

  1. (E1)

    \(u^{\varepsilon } \in C_c^{\infty }(B)\),

  2. (E2)

    \({\mathcal {G}}\left( u^{\varepsilon }\right) \rightarrow {\mathcal {G}}(u)\) as \(\varepsilon \rightarrow 0\),

  3. (E3)

    \(u^{\varepsilon } \rightarrow u\) in \(W^{1,\varphi }(B)\) as \(\varepsilon \rightarrow 0\),

  4. (E4)

    \(C_c^{\infty }(B)\) is dense in \(W^{1,\varphi }_0(B)\) and Lavrentiev phenomenon does not occur, i.e. for all boundary data \(u_0 \in W^{1,q}(B)\)

    $$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p}(B)} {\mathcal {G}}(u) = \inf _{u \in u_0 + W_0^{1,q}(B)} {\mathcal {G}}(u) = \inf _{u \in u_0 + C_c^{\infty }(B)} {\mathcal {G}}(u). \end{aligned}$$

Proof

The first property follows from construction. To prove convergence, we note that

$$\begin{aligned} {\mathcal {G}}\left( u^{\varepsilon }\right) = \int _B \varphi (x, \left| \nabla u^{\varepsilon }\right| (x)) \mathrm {d}x. \end{aligned}$$

We would like to take mollification out of the function \(\varphi \) using its convexity and Jensen’s inequality. However, this is not possible as function \(\varphi \) depends also on x explicitly. To overcome this problem, we apply Assumption 2.2, which allows us to approximate the function \(\varphi (x,\xi )\) locally by a function depending only on \(\xi \). Notice that \(\varphi \) satisfies Assumption 2.2 thanks to Lemma 3.1 and the structural assumption (5.1).

\(\underline{{{\textbf {Case 1}}~p \leqq d.}}\) In this case we have \(q \leqq p + \alpha \). Using Young’s convolution inequality, we obtain

$$\begin{aligned} \left\| \nabla u^{\varepsilon } \right\| _{\infty } \leqq \frac{1}{1-2\,\varepsilon } \, \left\| u \right\| _{\infty } \, \left\| \nabla \eta _{\varepsilon } \right\| _{1} \leqq D\, (5\varepsilon )^{-1}, \end{aligned}$$
(5.2)

where we used \(\frac{1}{1-2\,\varepsilon } \leqq 2\) and \(\left\| \nabla \eta _{\varepsilon } \right\| _{1} = \varepsilon ^{-1} \left\| \nabla \eta \right\| _{1}\), and we choose \(D:= 10\,\left\| u \right\| _{\infty } \,\left\| \nabla \eta \right\| _{1}\). Let \(x \in B\). Applying Assumption 2.2 with \(\gamma = 5\,\varepsilon \) and Lemma 5.4 we obtain \(x^* \in \overline{B_{5\varepsilon }(x)} \cap {\overline{B}}\) and constants M, N such that

$$\begin{aligned} \varphi (x, \left| \nabla u^{\varepsilon } \right| (x) ) \leqq M\, \varphi (x^*, \left| \nabla u^{\varepsilon }\right| (x) ) + N. \end{aligned}$$
(5.3)

Note that

$$\begin{aligned} \varphi (x^*, \left| \nabla u^{\varepsilon }(x)\right| )&= \varphi \left( x^*, \frac{1}{1-2\,\varepsilon } \left| \int _{B_{\varepsilon }} \nabla u \left( \frac{x}{1-2\,\varepsilon }-y\right) \, \eta _{\varepsilon }(y) \mathrm {d}y \right| \right) \leqq \\&\leqq \left( \frac{1}{1-2\,\varepsilon }\right) ^q \, \varphi \left( x^*, \int _{B_{\varepsilon }} \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \, \eta _{\varepsilon }(y) \mathrm {d}y\right) \\&\leqq 2^q \, \varphi \left( x^*, \int _{B_{\varepsilon }} \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \, \eta _{\varepsilon }(y) \mathrm {d}y\right) , \end{aligned}$$

where we used that \(\varphi \) is of the form (5.1). Then, Jensen’s inequality implies that

$$\begin{aligned}&\varphi \left( x^*, \int _{B_{\varepsilon }} \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \, \eta _{\varepsilon }(y) \mathrm {d}y\right) \\&\quad \leqq \int _{B_{\varepsilon }} \varphi \left( x^*, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) \eta _{\varepsilon }(y) \mathrm {d}y. \end{aligned}$$

If \(\frac{x}{1-2\,\varepsilon }-y\) does not belong to \({\overline{B}}\) then \(\varphi \left( x^*, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) = 0\). Otherwise, Lemma 5.2 implies \(\frac{x}{1-2\,\varepsilon }-y \in {\overline{B}} \cap \overline{B_{5\,\varepsilon }(x)}\), so that

$$\begin{aligned} \varphi \left( x^*, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) \leqq \varphi \left( \frac{x}{1-2\,\varepsilon }-y, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) , \end{aligned}$$

due to the minimality of \(x^*\) and nonnegativity of a. As \(x \in B\) was arbitrary, we obtain the inequality

$$\begin{aligned} \varphi (x, \left| \nabla u^{\varepsilon }\right| (x) ) \leqq 2^q \, M\, \int _{B_{\varepsilon }} \varphi \left( \frac{x}{1-2\,\varepsilon }-y, \left| \nabla u\right| \left( \frac{x}{1-2\,\varepsilon }-y\right) \right) \eta _{\varepsilon }(y) \mathrm {d}y + N, \end{aligned}$$
(5.4)

valid for all \(x \in B\). Now, we observe that \(\varphi (x, \left| \nabla u^{\varepsilon }\right| (x) )\) converges a.e. to \(\varphi (x, \left| \nabla u\right| (x))\). Moreover, the right-hand side of (5.4) converges in \(L^1(B)\) cf. Lemma 5.3 (D2), so that \(\left\{ \varphi (x, \left| \nabla u^{\varepsilon }\right| (x) ) \right\} _{\varepsilon }\) is uniformly integrable in \(L^1(B)\). Therefore, the Vitali convergence theorem implies

$$\begin{aligned} \varphi (x, \left| \nabla u^{\varepsilon }\right| (x) ) \rightarrow \varphi (x, \left| \nabla u\right| (x)) \qquad \text{ in } L^1(B) \text{ as } \varepsilon \rightarrow 0. \end{aligned}$$

Thanks to the triangle inequality we obtain (E2). To see (E3), we note the simple estimate \(|a+b|^q \leqq 2^{q-1} \left( |a|^q +|b|^q\right) \) so that

$$\begin{aligned} \varphi \left( x,\left| \nabla u(x) - \nabla u^{\varepsilon }(x)\right| \right) \leqq 2^{q-1} \varphi \left( x,\left| \nabla u\right| (x) \right) + 2^{q-1}\varphi \left( x, \left| \nabla u^{\varepsilon }\right| (x)\right) . \end{aligned}$$

It follows that the sequence \(\left\{ \varphi \left( x,\left| \nabla u(x) - \nabla u^{\varepsilon }(x)\right| \right) \right\} _{\varepsilon }\) is again uniformly integrable and the Vitali convergence theorem yields that

$$\begin{aligned} \varphi \left( x,\left| \nabla u(x) - \nabla u^{\varepsilon }(x)\right| \right) \rightarrow 0 \qquad \text{ in } L^1(B) \text{ as } \varepsilon \rightarrow 0, \end{aligned}$$

concluding the proof of (E3). This shows that any bounded function in \(W_0^{1,\varphi }(B)\) can be approximated with smooth compactly supported functions so that (E4) follows from Lemma 4.3.

\({\underline{{\textbf {Case 2}}~p > d.}}\) In this case we have \(q \leqq p + \alpha \,\frac{p}{d}\). Note that

$$\begin{aligned} \nabla u^{\varepsilon }(x) = \frac{1}{1-2\,\varepsilon } \int _{B_{\varepsilon }} \nabla u \left( \frac{x}{1-2\,\varepsilon }-y\right) \, \eta _{\varepsilon }(y) \mathrm {d}y. \end{aligned}$$

Therefore, instead of (5.2), we can compute that

$$\begin{aligned} \left\| \nabla u^{\varepsilon } \right\| _{\infty } \leqq \frac{1}{1-2\,\varepsilon } \, \left\| \nabla u\left( \frac{\cdot }{1-2\varepsilon }\right) \right\| _{p} \, \left\| \eta _{\varepsilon } \right\| _{p'} \leqq 2\, \left\| \nabla u\left( \frac{\cdot }{1-2\varepsilon }\right) \right\| _{p} \, \left\| \eta _{\varepsilon } \right\| _{p'}, \end{aligned}$$
(5.5)

where \(p'\) is the usual Hölder conjugate exponent. Using change of variables we obtain

$$\begin{aligned} \left\| \eta _{\varepsilon } \right\| _{p'}^{p'} = \int _{B_{\varepsilon }} \frac{1}{\varepsilon ^{d\, p'}} \left| \eta \left( \frac{x}{\varepsilon } \right) \right| ^{p'} \mathrm {d}x = \varepsilon ^{d\,(1-p')} \int _{B} \left| \eta (x)\right| ^{p'} \mathrm {d}x = \varepsilon ^{-p'\, \frac{d}{p}} \Vert \eta \Vert _{p'}^{p'}, \end{aligned}$$

so that \( \left\| \eta _{\varepsilon } \right\| _{p'} = \varepsilon ^{-\frac{d}{p}} \Vert \eta \Vert _{p'}\). Using a change of variables again,

$$\begin{aligned} \left\| \nabla u\left( \frac{\cdot }{1-2\varepsilon }\right) \right\| _{p} \leqq \left\| \nabla u \right\| _{p}, \end{aligned}$$

which is finite as \({\mathcal {G}}(u)<\infty \). Therefore, (5.5) boils down to

$$\begin{aligned} \left\| \nabla u^{\varepsilon } \right\| _{\infty } \leqq D\, (5\varepsilon )^{-\frac{d}{p}}, \end{aligned}$$

where \(D:= 2 \cdot 5^{\frac{d}{p}}\,\left\| \nabla u \right\| _{p} \, \Vert \eta \Vert _{p'}\) (the factor 2 comes from (5.5)). Using Assumption 2.2 we obtain estimate (5.3). The rest of the proof is exactly the same as in Case 1. \(\square \)
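The scaling identity \(\left\| \eta _{\varepsilon } \right\| _{p'} = \varepsilon ^{-\frac{d}{p}} \Vert \eta \Vert _{p'}\) used in Case 2 can be checked by quadrature. The sketch below is a hedged illustration only: it assumes \(d = 1\) (so Case 2 means \(p > 1\)), uses the unnormalised bump \(\exp (-1/(1-x^2))\) as a stand-in for the kernel \(\eta \), and approximates the integrals with a midpoint rule; the names `bump`, `lp_norm` and `scaling_defect` are hypothetical.

```python
import math


def bump(x):
    """Unnormalised mollifier profile supported in (-1, 1) (illustrative kernel)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def lp_norm(func, half_width, exponent, n):
    """Midpoint-rule approximation of the L^exponent norm on (-half_width, half_width)."""
    h = 2.0 * half_width / n
    total = sum(abs(func(-half_width + (k + 0.5) * h)) ** exponent for k in range(n))
    return (total * h) ** (1.0 / exponent)


def scaling_defect(eps, p, n=20000):
    """Relative gap between ||eta_eps||_{p'} and eps^(-1/p) * ||eta||_{p'} for d = 1."""
    p_conj = p / (p - 1.0)  # Hoelder conjugate exponent, requires p > 1

    def eta_eps(x):
        # eta_eps(x) = eps^(-d) * eta(x / eps) with d = 1
        return bump(x / eps) / eps

    lhs = lp_norm(eta_eps, eps, p_conj, n)
    # A different number of nodes is used on purpose, so the two sides
    # are not computed from identical sample points.
    rhs = eps ** (-1.0 / p) * lp_norm(bump, 1.0, p_conj, n + 7919)
    return abs(lhs - rhs) / rhs
```

The identity does not require the kernel to be normalised, which is why the normalising constant is omitted here.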

6 Proof of Theorem 2.3 in the general case

In this section we generalize the construction from Section 5 to prove Theorem 2.3 in the general case.

6.1 Second convex conjugate function

For a general \({\mathcal {N}}\)-function \(\psi \) satisfying Assumption 2.1, Lemma 5.4 is not necessarily true. Therefore, to control mollifications, we need a different method to approximate \(\psi (x,\xi )\) with a function depending only on \(\xi \). The construction below is fairly standard and has appeared in many works before; see [14, 15].

We start more generally. Let \(f: {\mathbb {R}}\rightarrow {\mathbb {R}}\). We define the convex conjugate \(f^*:{\mathbb {R}}\rightarrow {\mathbb {R}}\cup \{+\infty \}\) of f as

$$\begin{aligned} f^*(\eta ) = \sup _{\xi \in {\mathbb {R}}} \left( \xi \cdot \eta - f(\xi )\right) . \end{aligned}$$

Moreover, the second convex conjugate \(f^{**}\) of f is defined as

$$\begin{aligned} f^{**}(\xi ) = \sup _{\eta \in {\mathbb {R}}} \left( \xi \cdot \eta - f^*(\eta )\right) . \end{aligned}$$

We now list some basic properties of the convex conjugates cf. [37, Propositions 2.21, 2.28].

Lemma 6.1

Let \(f, g: {\mathbb {R}}\rightarrow {\mathbb {R}}\). Then, the following holds true:

  1. (F1)

    \(f^*\) and \(f^{**}\) are convex functions,

  2. (F2)

    if \(f \leqq g\) on \({\mathbb {R}}\), then \(g^* \leqq f^*\) on \({\mathbb {R}}\),

  3. (F3)

    if \(f \leqq g\) on \({\mathbb {R}}\), then \(f^{**} \leqq g^{**}\) on \({\mathbb {R}}\),

  4. (F4)

    if f is convex then \(f^{**} = f\) on \({\mathbb {R}}\),

  5. (F5)

    \(f^{**}\) is the greatest convex minorant of f.
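Properties (F4) and (F5) can be observed on a discretised example. The sketch below is an illustration under explicit assumptions: the suprema over \({\mathbb {R}}\) are replaced by maxima over finite grids of points and slopes, and the non-convex test function \(\min ((\xi -1)^2, (\xi +1)^2)\) is a hypothetical choice; the names `conjugate`, `double_well` and `biconjugate_demo` are not from the text.

```python
def conjugate(values, points, slopes):
    """Discrete convex conjugate: max over sampled points of (point * slope - value)."""
    return [max(x * s - v for x, v in zip(points, values)) for s in slopes]


def double_well(xi):
    """Non-convex test function (hypothetical choice): min((xi-1)^2, (xi+1)^2)."""
    return min((xi - 1.0) ** 2, (xi + 1.0) ** 2)


def biconjugate_demo():
    """Return grid points, samples of f and samples of the discrete f**."""
    points = [i / 50.0 for i in range(-150, 151)]  # xi in [-3, 3]
    slopes = [j / 10.0 for j in range(-60, 61)]    # eta in [-6, 6]
    f = [double_well(x) for x in points]
    f_star = conjugate(f, points, slopes)
    # Conjugating f* again (roles of points and slopes swapped) gives f**.
    f_bistar = conjugate(f_star, slopes, points)
    return points, f, f_bistar
```

On this example the discrete \(f^{**}\) lies below f, vanishes at \(\xi = 0\) where \(f(0) = 1\) (the well between \(\pm 1\) is flattened), and agrees with f at \(\xi = 2\), where f is locally convex and coincides with its convex envelope.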

Now, we apply these notions to \({\mathcal {N}}\)-functions. Given \({\mathcal {N}}\)-function \(\psi (x,\xi )\) satisfying Assumption 2.1, we extend it by 0 for \(\xi < 0\) (hence this extension is surely convex), we consider a ball \(B_{\gamma }(x)\) such that \(\overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\) is nonempty and we define

$$\begin{aligned} \psi _{x,\,\gamma }(\xi ): {\mathbb {R}}\rightarrow {\mathbb {R}}, \qquad \qquad \psi _{x,\,\gamma }(\xi ) := \inf _{y \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}} \psi (y,\xi ). \end{aligned}$$
(6.1)

Lemma 6.2

Let \(\psi \) be as in Assumptions 2.1, 2.2 and \(\psi _{x, \, \gamma }\) be as in (6.1).

  1. (G1)

    Let \({\mathcal {D}}>1\). Then, there are constants \({\mathcal {M}} = {\mathcal {M}}(p,q,{\mathcal {D}})\), \({\mathcal {N}} = {\mathcal {N}}(p,q,{\mathcal {D}})\) such that

    $$\begin{aligned} \psi (y,\xi ) \leqq {\mathcal {M}} \, \psi _{x,\, \gamma }^{**}(\xi ) + {\mathcal {N}} \end{aligned}$$
    (6.2)

    for all balls \(B_{\gamma }(x)\), all \(y \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\), all \(\xi \) such that \(\xi \leqq {\mathcal {D}}\, \gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\) and all \(\gamma \in \left( 0,\frac{1}{2}\right) \).

  2. (G2)

    It holds \(0 \leqq \psi _{x,\, \gamma }^{**}(\xi ) \leqq \psi (y, \xi )\) for all balls \(B_{\gamma }(x)\), all \(y \in \overline{B_{\gamma }(x)} \cap {\overline{\Omega }}\) and all \(\xi \in {\mathbb {R}}\).

One could try to prove Lemma 6.2 by applying property (F3) from Lemma 6.1 to the estimate appearing in Assumption 2.2. However, this estimate is valid only on some bounded interval rather than the whole real line. The correct argument is presented in [13] but since it contains some imperfections, we present it below.

Proof of Lemma 6.2

The proof of (G2) follows easily from (F3) and (F4) stated in Lemma 6.1. For (G1) we split the proof into several steps. Recall that a convex function has a supporting line at every point, so that for all \(\eta \in {\mathbb {R}}\) there exists a supporting line \(h_{\eta }\) such that \(\psi _{x,\gamma }^{**}(\xi ) \geqq h_{\eta }(\xi )\) for all \(\xi \in {\mathbb {R}}\) and \(\psi _{x,\gamma }^{**}(\eta ) = h_{\eta }(\eta )\).

Step 1. The map \({\mathbb {R}}\ni \xi \mapsto {\psi }_{x,\,\gamma }(\xi )\) is locally Lipschitz continuous.

Proof

Fix \(y \in B_{\gamma }(x)\) and interval \([-R,R] \subset {\mathbb {R}}\). The map \({\mathbb {R}}\ni \xi \mapsto {\psi }(y,\xi )\) is convex so its difference quotients are monotone. Hence, for \(\xi _1, \xi _2 \in [-R,R]\) with \(\xi _1 < \xi _2\) we have

$$\begin{aligned} \frac{{\psi }(y,\xi _2) - {\psi }(y,-R - 1)}{\xi _2 -(- R - 1)} \leqq \frac{{\psi }(y,\xi _2) - {\psi }(y,\xi _1)}{\xi _2 - \xi _1} \leqq \frac{{\psi }(y,R+1) - {\psi }(y,\xi _1)}{R+1 - \xi _1} \end{aligned}$$

Since \(|{\psi }(y,R + 1)| \leqq C_2\,(1+(R+1))^q\) cf. Assumption 2.1 (A3), the map \(\xi \mapsto {\psi }(y,\xi )\) is Lipschitz continuous on \([-R,R]\) with constant \(2\,C_2\,(1+(R+1))^q\). By the triangle inequality we have that

$$\begin{aligned} {\psi }(y,\xi _1)\leqq & {} |{\psi }(y,\xi _1) - {\psi }(y,\xi _2)| + {\psi }(y,\xi _2)\\\leqq & {} 2\,C_2\,(1+(R+1))^q\, |\xi _1 - \xi _2|+ {\psi }(y,\xi _2). \end{aligned}$$

Finding a sequence \(\{y_k\}\) such that \({\psi }(y_k,\xi _2) \rightarrow {\psi }_{x,\,\gamma }(\xi _2)\) we obtain

$$\begin{aligned} {\psi }_{x,\,\gamma }(\xi _1) \leqq \liminf _{k \rightarrow \infty } {\psi }(y_k,\xi _1) \leqq 2\,C_2\,(1+(R+1))^q\, |\xi _1 - \xi _2|+ {\psi }_{x,\,\gamma }(\xi _2). \end{aligned}$$

As \(\xi _1\) and \(\xi _2\) can be interchanged, the conclusion follows. \(\square \)

Step 2. For \(\xi \leqq 0\) we have \({\psi }_{x,\,\gamma }^{**}(\xi ) = 0\) and for \(\xi > 0\) we have \({\psi }_{x,\,\gamma }^{**}(\xi ) \geqq 0\). In particular, estimate (6.2) is satisfied for function \({\psi }\) and \(\xi \leqq 0\).

Proof

First, note that the identically zero function is convex. As \(0 \leqq \psi _{x,\,\gamma }(\xi )\), Lemma 6.1 (F5) implies

$$\begin{aligned} 0 \leqq \psi _{x,\,\gamma }^{**}(\xi ). \end{aligned}$$
(6.3)

Finally, as \(\psi _{x,\,\gamma }^{**}(\xi ) \leqq \psi _{x,\,\gamma }(\xi )\) and \(\psi _{x,\,\gamma }(\xi ) = 0\) for \(\xi \leqq 0\), we deduce \(\psi _{x,\,\gamma }^{**}(\xi ) = 0\) for \(\xi \leqq 0\). \(\square \)

Step 3. Fix \(\eta \) such that \(0 \leqq \eta \leqq {\mathcal {D}}\, \gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\) and assume that \(h_{\eta }(\xi ) = \psi _{x,\,\gamma }^{**}(\xi )\) only for \(\xi = \eta \). Then, \(\psi _{x,\,\gamma }^{**}(\eta ) = \psi _{x,\,\gamma }(\eta )\) and estimate (6.2) is satisfied for \(\xi = \eta \).

Proof

Suppose that \(\psi _{x,\,\gamma }^{**}(\eta ) < \psi _{x,\,\gamma }(\eta )\) (we always have \(\psi _{x,\,\gamma }^{**}(\eta ) \leqq \psi _{x,\,\gamma }(\eta )\)!). Using Lipschitz continuity from Step 1, we find two lines such that \(\psi _{x,\,\gamma }\) is above them (see dotted lines in Fig. 1). Hence, we observe that \(\psi _{x,\,\gamma }^{**}\) is not the largest convex minorant of \(\psi _{x,\,\gamma }\), see Fig. 1. It follows that \(\psi _{x,\,\gamma }^{**}(\eta ) = \psi _{x,\,\gamma }(\eta )\) and estimate (6.2) follows directly from Assumption 2.2. \(\square \)

Step 4. Fix \(\eta \) such that \(0 \leqq \eta \leqq {\mathcal {D}}\, \gamma ^{-\min \left( 1,\, \frac{d}{p}\right) }\) and assume that \(h_{\eta }(\xi ) = \psi _{x,\,\gamma }^{**}(\xi )\) for all \(\xi \) in some interval [a, b] containing \(\eta \) (so that \(h_{\eta }\) and \(\psi _{x,\,\gamma }^{**}\) coincide on a line segment). Then, estimate (6.2) is satisfied for \(\xi = \eta \).

Proof

First, from Step 2 we may assume that \(a \geqq 0\) and by assumption (A5) we can assume \(b < \infty \) (as \(\psi (x,\xi )\) is superlinear as \(\xi \rightarrow \infty \)). Second, the reasoning from Step 3 shows that

$$\begin{aligned} \psi _{x,\,\gamma }(a) = \psi _{x,\,\gamma }^{**}(a), \qquad \qquad \psi _{x,\,\gamma }(b) = \psi _{x,\,\gamma }^{**}(b). \end{aligned}$$

Moreover, by the assumption, \(\psi _{x,\,\gamma }^{**}\) is affine on [a, b], so writing \(\eta = t\,a + (1-t)\,b\) with \(t \in [0,1]\) we obtain

$$\begin{aligned} \psi _{x,\,\gamma }^{**}(\eta ) = t\, \psi _{x,\,\gamma }^{**}(a) + (1-t)\, \psi _{x,\,\gamma }^{**}(b) = t\, \psi _{x,\,\gamma }(a) + (1-t)\, \psi _{x,\,\gamma }(b). \end{aligned}$$

By definition of \(\psi _{x,\,\gamma }\), there exist sequences \(\{x^a_n\}_{n \in {\mathbb {N}}}\), \(\{x^b_n\}_{n \in {\mathbb {N}}} \subset B_\gamma (x)\) such that

$$\begin{aligned} \psi _{x,\,\gamma }^{**}(\eta ) \geqq t\, \psi (x^a_n,\,a) + (1-t)\, \psi (x^b_n,\,b) - \frac{1}{n}. \end{aligned}$$
(6.4)

With these at hand, we proceed to the final proof. By definition and convexity,

$$\begin{aligned} \psi _{x, \gamma }(\eta ) \leqq \psi (x^b_n,\eta ) \leqq t\, \psi (x^b_n, a) + (1-t)\,\psi (x^b_n, b) \end{aligned}$$
(6.5)

To apply (6.4), we have to replace \(\psi (x^b_n, a)\) with \(\psi (x^a_n, a)\). This can be done with Assumption 2.2: we note that \(|x_n^a - x^b_n| \leqq 2\,\gamma \) so if we let \(D := 2^{\min \left( 1,\frac{d}{p}\right) } \, {\mathcal {D}}\) we have

$$\begin{aligned} |\eta | \leqq {\mathcal {D}}\, \gamma ^{-\min \left( 1,\, \frac{d}{p}\right) } = D \, \left( 2\,\gamma \right) ^{-\min \left( 1,\, \frac{d}{p}\right) } \end{aligned}$$

and, since \(0 \leqq a \leqq \eta \), Assumption 2.2 implies existence of constants M(D), N(D) (we skip dependence of these constants on p and q as these exponents are fixed) such that

$$\begin{aligned} \psi (x^b_n,a) \leqq M(D)\, \psi (x^a_n, a) + N(D). \end{aligned}$$

It follows from (6.5) that

$$\begin{aligned} \psi _{x, \gamma }(\eta ) \leqq t\, \left( M(D)\, \psi (x^a_n, a) + N(D) \right) + (1-t)\, \psi (x^b_n, b) \end{aligned}$$

Letting \({\widetilde{M}}(D) := \max (M(D),1)\) and exploiting (6.4) we have that

$$\begin{aligned} \psi _{x, \gamma }(\eta )\leqq & {} {\widetilde{M}}(D) \left( t\, \psi (x^a_n,\,a) + (1-t)\, \psi (x^b_n,\,b) \right) + N(D) \\\leqq & {} {\widetilde{M}}(D) \, \psi ^{**}_{x,\,\gamma }(\eta ) + \frac{{\widetilde{M}}(D)}{n} + N(D). \end{aligned}$$

Sending \(n \rightarrow \infty \), we deduce that

$$\begin{aligned} \psi _{x, \gamma }(\eta ) \leqq {\widetilde{M}}(D) \, \psi ^{**}_{x,\,\gamma }(\eta ) + N(D). \end{aligned}$$

Exploiting Assumption 2.2 once again, we obtain for all \(y \in B_{\gamma }(x)\)

$$\begin{aligned} \psi (y,\eta ) \leqq M({\mathcal {D}})\, \psi _{x, \gamma }(\eta ) + N({\mathcal {D}}) \leqq M({\mathcal {D}})\,{\widetilde{M}}({D}) \, \psi ^{**}_{x,\,\gamma }(\eta ) + M({\mathcal {D}})\, N({D}) + N({\mathcal {D}}). \end{aligned}$$

The conclusion follows with \({\mathcal {M}}:= M({\mathcal {D}})\,{\widetilde{M}}({D})\) and \({\mathcal {N}} := M({\mathcal {D}})\, N({D}) + N({\mathcal {D}})\). \(\square \)

Step 5. The cases considered in Steps 2-4 are the only possible ones.

Proof

Clearly, the supporting line \(h_{\eta }\) touches the graph of \(\psi _{x,\gamma }^{**}\) in at least one point. The case where it touches at exactly one point was studied in Step 3, while the situation when it touches along some interval [a, b] was analyzed in Step 4. Now, suppose that there are \(\eta< \eta _1 < \eta _2\) such that

$$\begin{aligned} \psi _{x,\gamma }^{**}(\eta ) = h_{\eta }(\eta ), \qquad \psi _{x,\gamma }^{**}(\eta _2) = h_{\eta }(\eta _2), \qquad \psi _{x,\gamma }^{**}(\eta _1) > h_{\eta }(\eta _1). \end{aligned}$$

Then, \(\psi _{x,\gamma }^{**}\) is not convex, which is a contradiction. \(\square \)

Fig. 1

We assume that there is \(\eta > 0\) such that functions \(\psi _{x,\,\gamma }^{**}\) (black line) and \(\psi _{x,\,\gamma }\) (grey line) satisfy \(\psi _{x,\,\gamma }^{**}(\eta ) < \psi _{x,\,\gamma }(\eta )\) and tangent line \(h_{\eta }\) touches \(\psi _{x,\,\gamma }^{**}\) only at \(\eta \). As \(\psi _{x,\,\gamma }\) is Lipschitz continuous, we can estimate it from below (dotted lines). Then, the function obtained by combining \(\psi _{x,\,\gamma }^{**}\) and the dashed line is convex. It lies below \(\psi _{x,\,\gamma }\) and above \(\psi _{x,\,\gamma }^{**}\) raising contradiction with Lemma 6.1 (F5)

6.2 Geometric issues

As \(\Omega \) is not a ball in general, we cannot define compactly supported approximation by retracting the function to the interior part of \(\Omega \) as in Definition 5.1. However, one can still do that for star-shaped domains.

Definition 6.3

 

  1. (1)

    A bounded domain \(U \subset {\mathbb {R}}^d\) is said to be star-shaped with respect to \({\overline{x}}\) if every ray starting from \({\overline{x}}\) intersects with \(\partial U\) at one and only one point.

  2. (2)

    A bounded domain \(U \subset {\mathbb {R}}^d\) is said to be star-shaped with respect to the ball \(B_{\gamma }(x_0)\) if U is star-shaped with respect to all \(y \in B_{\gamma }(x_0)\).

The following lemma shows that star-shaped domains can be uniformly shrunk, which allows us to define compactly supported approximations:

Lemma 6.4

Let \(U \subset {\mathbb {R}}^d\) be a star-shaped domain with respect to the ball \(B_R\). Let \(\kappa _{\varepsilon } = 1- \frac{4\,\varepsilon }{R}\). Then, \({{\,\mathrm{dist}\,}}(\kappa _{\varepsilon } \, U, \partial U) \geqq 2\,\varepsilon \). In particular,

$$\begin{aligned} \overline{\kappa _{\varepsilon } \, U + \varepsilon \, B } \subset U. \end{aligned}$$

More generally, if U is star-shaped with respect to the ball \(B_R(x_0)\),

$$\begin{aligned} \overline{\kappa _{\varepsilon } \, (U-x_0) + \varepsilon \, B } \subset (U-x_0). \end{aligned}$$

Proof

Let \(b \in \partial U\) and let \(c \in \partial (\kappa _{\varepsilon } U)\) such that c lies on the interval [0, b]. Let T be a sphere of radius R perpendicular to the interval [0, b] and let S be the cone with base T and apex b, see Fig. 2. First, we have

$$\begin{aligned} {{\,\mathrm{dist}\,}}(\partial U, c) \geqq {{\,\mathrm{dist}\,}}(\partial S, c). \end{aligned}$$

Let \(\alpha \) be half of the apex angle of the cone S, see Fig. 2. It follows that

$$\begin{aligned} {{\,\mathrm{dist}\,}}(\partial S, c) = \sin (\alpha ) \, |b-c| = \sin (\alpha ) \, (1-\kappa _{\varepsilon }) \, |b| = \sin (\alpha ) \, \frac{4\,\varepsilon }{R} \, |b|, \end{aligned}$$

so it is sufficient to estimate \(\sin (\alpha )\) from below. Using notation from Fig. 2, the length of the interval [d, b] equals \(\frac{|b|}{\cos (\alpha )}\). Therefore,

$$\begin{aligned} \sin (\alpha ) = \frac{R}{|b|/\cos (\alpha )} \implies \tan (\alpha ) = \frac{R}{|b|} \end{aligned}$$

As \(\sin ^2(\alpha ) = \frac{\tan ^2(\alpha )}{1+\tan ^2(\alpha )}\) we have that

$$\begin{aligned} \sin ^2(\alpha ) = \frac{R^2}{R^2 + |b|^2} \geqq \frac{R^2}{4|b|^2} \implies \sin (\alpha ) \geqq \frac{R}{2|b|}, \end{aligned}$$

where we used \(R^2 \leqq |b|^2 \leqq 3\,|b|^2\), so that \(R^2 + |b|^2 \leqq 4\,|b|^2\). Combining the estimates above, we conclude that \({{\,\mathrm{dist}\,}}(\partial U, c) \geqq \frac{R}{2|b|} \, \frac{4\,\varepsilon }{R} \, |b| = 2 \varepsilon \). As this argument can be repeated for all \(c \in \partial (\kappa _{\varepsilon } U)\), we obtain \({{\,\mathrm{dist}\,}}(\partial U, \kappa _{\varepsilon } U) \geqq 2\,\varepsilon \). The second statement follows from the observation that the set \(U-x_0\) is star-shaped with respect to the ball \(B_R\). \(\square \)
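The trigonometric chain in the proof of Lemma 6.4 is easy to evaluate numerically. The sketch below only re-computes the lower bound \(\sin (\alpha ) \, (1-\kappa _{\varepsilon }) \, |b|\) with \(\tan (\alpha ) = R/|b|\) for given values of \(|b| \geqq R\) and \(\varepsilon < R/4\); it does not construct the domain U or the cone S, and the function names are hypothetical.

```python
import math


def half_apex_sine(b_norm, R):
    """sin(alpha) for the cone in the proof, where tan(alpha) = R / |b|."""
    return math.sin(math.atan2(R, b_norm))


def cone_distance_bound(b_norm, R, eps):
    """Lower bound sin(alpha) * (1 - kappa_eps) * |b| for dist(boundary of U, c)."""
    kappa = 1.0 - 4.0 * eps / R  # shrinking factor kappa_eps from Lemma 6.4
    return half_apex_sine(b_norm, R) * (1.0 - kappa) * b_norm
```

For instance, with \(R = 1\), \(|b| = 1\) and \(\varepsilon = 0.1\) the bound equals \(0.4/\sqrt{2}\), which indeed exceeds \(2\,\varepsilon = 0.2\).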

Fig. 2

Three dimensional adaptation of the construction performed in Lemma 6.4. Point b belongs to the boundary of star-shaped domain U while point \(c = \kappa _{\varepsilon } b\) belongs to the boundary of the rescaled set \(\kappa _{\varepsilon } U\). Sphere T is perpendicular to the interval [0, b] and since U is star-shaped with respect to the ball \(B_R\), the cone S with base T and apex b lies inside U

On a star-shaped domain we can define mollification with squeezing as in Definition 5.1.

Definition 6.5

(Mollification with squeezing on star-shaped domain) Let U be a star-shaped domain with respect to the ball \(B_R(x_0)\). Given \(u \in W^{1,1}_0(U)\) we extend it with 0 to \({\mathbb {R}}^d\) and define

$$\begin{aligned} {\mathcal {S}}^{\varepsilon }_U u(x) := \int _{{\mathbb {R}}^d} u\left( x_0 + \frac{x-x_0 -y}{\kappa _{\varepsilon }} \right) \, \eta _{\varepsilon }(y) \mathrm {d}y, \end{aligned}$$

where \(\kappa _{\varepsilon } = 1 - \frac{4\,\varepsilon }{R}\).

The reader may think about the case \(x_0 = 0\) first.

Lemma 6.6

Function \({\mathcal {S}}^{\varepsilon }_U u\) from Definition 6.5 belongs to \(C_c^{\infty }(U)\).

Proof

The smoothness is clear from standard properties of convolutions. Concerning compact support, we claim that \({\mathcal {S}}^{\varepsilon }_U u\) is supported in \(x_0 + \overline{\kappa _{\varepsilon } \, (U-x_0) + \varepsilon \, B }\) which is a compact subset of U due to Lemma 6.4. Indeed, let \(x \notin x_0 + \overline{\kappa _{\varepsilon } \, (U-x_0) + \varepsilon \, B }\) and suppose that there is y with \(|y| \leqq \varepsilon \) such that \(x_0 + \frac{x-x_0 -y}{\kappa _{\varepsilon }} \in U\). Then, we can write

$$\begin{aligned} x = x_0 + \kappa _{\varepsilon } \, \left( x_0 + \frac{x - x_0 -y}{\kappa _{\varepsilon }} - x_0\right) + y \end{aligned}$$

so that \(x \in x_0 + \overline{\kappa _{\varepsilon } \, (U-x_0) + \varepsilon \, B}\), which is a contradiction. It follows that for \(x \notin x_0 + \overline{\kappa _{\varepsilon } \, (U-x_0) + \varepsilon \, B}\) and every y we have either

$$\begin{aligned} x_0 + \frac{x-x_0 -y}{\kappa _{\varepsilon }} \in U \text{ and } |y| > \varepsilon \qquad \text{ or } \qquad x_0 + \frac{x-x_0 -y}{\kappa _{\varepsilon }} \notin U \end{aligned}$$

so that the integral \(\int _{{\mathbb {R}}^d} u\left( x_0 + \frac{x-x_0 -y}{\kappa _{\varepsilon }} \right) \, \eta _{\varepsilon }(y) \mathrm {d}y = 0\). \(\square \)

To move from star-shaped domains to Lipschitz ones we will use the following decomposition cf. [34, Lemma 3.14]:

Lemma 6.7

Suppose that \(\Omega \subset {\mathbb {R}}^d\) is a bounded Lipschitz domain. Then, there exist domains \(\{U_i\}_{i=1,\ldots ,n}\) such that

$$\begin{aligned} {\overline{\Omega }} \subset \bigcup _{i=1}^n U_i, \end{aligned}$$

and \(\Omega \cap U_i\) is star-shaped with respect to some ball \(B_{R_i}(x_i)\).

6.3 Approximating sequence and proof of Theorem 2.3

We are now in a position to define the approximating sequence. Let \(\Omega \) be a bounded Lipschitz domain. From Lemma 6.7 we obtain a family of domains such that \({\overline{\Omega }} \subset \bigcup _{i=1}^n U_i\) where \(\{\Omega \cap U_i\}_{i=1,\ldots ,n}\) are star-shaped domains with respect to balls \(B_{R}(x_i)\) (without loss of generality, we may assume that the radii of the balls are the same by taking \(R := \min _{i=1,\ldots ,n} R_i\)). In particular, \(\{U_i\}_{i=1,\ldots ,n}\) is an open covering of \({\overline{\Omega }}\), so there exists a partition of unity subordinate to this covering: a family of functions \(\{\theta _i\}_{i=1,\ldots ,n}\) such that

$$\begin{aligned} \theta _i \in C_c^{\infty }(U_i), \qquad 0 \leqq \theta _i \leqq 1, \qquad \sum _{i=1}^n \theta _i = 1 \text{ on } {\overline{\Omega }}. \end{aligned}$$

Given \(u \in W^{1,1}_0(\Omega ) \cap L^{\infty }(\Omega )\), we extend it by 0 as above and set

$$\begin{aligned} {\mathcal {S}}^{\varepsilon } u := \sum _{i=1}^n {\mathcal {S}}^{\varepsilon }_{U_i}(u \, \theta _i) = \sum _{i=1}^n \int _{B_{\varepsilon }} ( u \, \theta _i) \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \, \eta _{\varepsilon }(y) \mathrm {d}y, \end{aligned}$$
(6.6)

where \(\kappa _{\varepsilon } = 1 - \frac{4\,\varepsilon }{R}\). We note that since u vanishes outside of \(\Omega \), function \(u\, \theta _i\) is supported in \(\Omega \cap U_i\) which is star-shaped.

Before formulating the main result of this section, we state and prove two technical lemmas concerning the approximating sequence.

Lemma 6.8

Let \(\kappa _{\varepsilon } = 1 - \frac{4\,\varepsilon }{R}\), \(x \in \Omega \) and \(|y| \leqq \varepsilon \). Then, there exists a constant \(C_{\Omega ,R}\) such that for \(\varepsilon \leqq \frac{R}{8}\) we have \(x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }} \in B_{\varepsilon C_{\Omega ,R}}(x)\).

Proof

Note that for \(\varepsilon \leqq \frac{R}{8}\), we have \(\frac{1}{\kappa _{\varepsilon }} \leqq 2\). We compute that

$$\begin{aligned} \left| x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }} - x \right|= & {} \left| (x_i-x)\left( 1-\frac{1}{\kappa _{\varepsilon }}\right) - \frac{y}{\kappa _{\varepsilon }} \right| \\\leqq & {} |x_i - x| \frac{1-\kappa _{\varepsilon }}{\kappa _{\varepsilon }} + \frac{\varepsilon }{\kappa _{\varepsilon }} \leqq 8\,\frac{|x_i - x|}{R}\, \varepsilon + 2\,\varepsilon . \end{aligned}$$

As \(|x_i - x| \leqq \text{ diam }(\Omega )\) (the diameter of \(\Omega \)), we choose \(C_{\Omega , R} := \frac{8 \, \text {diam}(\Omega )}{R} + 2\). \(\square \)
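As a quick numerical illustration of these constants (the values below are hypothetical and chosen only for arithmetic convenience), suppose \(\text{diam}(\Omega ) = 2\) and \(R = \frac{1}{2}\). Then

$$\begin{aligned} C_{\Omega ,R} = \frac{8 \cdot 2}{1/2} + 2 = 34, \qquad \varepsilon \leqq \frac{R}{8} = \frac{1}{16}, \qquad \kappa _{\varepsilon } = 1 - 8\,\varepsilon \geqq \frac{1}{2}, \end{aligned}$$

so for the largest admissible value \(\varepsilon = \frac{1}{16}\) the shifted and rescaled point lies within distance \(\varepsilon \, C_{\Omega ,R} = \frac{17}{8}\) of x. The bound is far from sharp, but it is linear in \(\varepsilon \), which appears to be the only feature needed in the sequel.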

Lemma 6.9

Let \(u \in W^{1,1}_0(\Omega )\) be such that \({\mathcal {H}}(u)<\infty \) and consider its extension to \({\mathbb {R}}^d\). Then,

  1. (H1)

    \(\psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, (\left| \nabla u\right| \, \theta _i)\left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \rightarrow \psi (x,\left| \nabla u\right| \, \theta _i)\) in \(L^1({\mathbb {R}}^d)\),

  2. (H2)

    \( \int _{B_{\varepsilon }} \psi \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}, (\left| \nabla u\right| \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right) \eta _{\varepsilon }(y) \mathrm {d}y \rightarrow \psi \left( x, \left| \nabla u\right| \, \theta _i \right) \) in \(L^1({\mathbb {R}}^d)\).

Proof

Concerning (H1), we note that the convergence holds in the pointwise sense. Moreover, the considered sequence is supported on \(\Omega \cap U_i\). Therefore, to establish convergence in \(L^1({\mathbb {R}}^d)\), it is sufficient to prove equiintegrability of the sequence \(\left\{ \psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, (\left| \nabla u\right| \, \theta _i)\left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \right\} _{\varepsilon }\) and apply the Vitali convergence theorem. To this end, we need to prove

$$\begin{aligned}&\forall {\eta>0}\,\, \exists {\delta >0} \,\, \forall {A\subset \Omega \cap U_i, |A|\leqq \delta } \,\, \\&\quad \qquad \int _{A} \psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, (\left| \nabla u\right| \, \theta _i)\left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \mathrm {d}x \leqq \eta . \end{aligned}$$

We fix \(\eta \) and an arbitrary \(A \subset \Omega \cap U_i\); the estimates below are uniform in \(\varepsilon \leqq \frac{R}{8}\). First, using convexity,

$$\begin{aligned}&\psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, (\left| \nabla u\right| \, \theta _i)\left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \\&\quad \leqq \psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, \left| \nabla u\right| \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \end{aligned}$$

as \(0 \leqq \theta _i \leqq 1\). Second, using change of variables we have that

$$\begin{aligned}&\int _{A} \psi \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}, \left| \nabla u\right| \left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \right) \mathrm {d}x \\&\quad = (\kappa _{\varepsilon })^d \int _{{\widetilde{A}}} \psi (x,\left| \nabla u\right| (x)) \mathrm {d}x \leqq \int _{{\widetilde{A}}} \psi (x,\left| \nabla u\right| (x)) \mathrm {d}x, \end{aligned}$$

where \({\widetilde{A}}\) is a set obtained from A after the performed change of variables. Note that measures of these sets satisfy \( |{\widetilde{A}}|\leqq \frac{1}{\kappa _{\varepsilon }^d} |A| \leqq 2^d \, |A|. \) Having this in mind, we let

$$\begin{aligned} \omega (\tau ) := \sup _{C \subset {\mathbb {R}}^d: |C|\leqq \tau } \int _C \psi (x,\left| \nabla u\right| (x)) \mathrm {d}x. \end{aligned}$$

The function \(\omega \) is non-decreasing and continuous at 0 because \({\mathcal {H}}(u) < \infty \). Therefore, we may find \(\tau \) such that \(\omega (\tau ) \leqq \eta \). Then, we choose \(\delta = 2^{-d} \, \tau \) to conclude the proof of (H1). Finally, (H2) follows from Young’s convolution inequality and (H1). \(\square \)
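The last step of the proof can be spelled out as follows. This is only a sketch: it assumes that \(\eta _{\varepsilon }\) is a standard nonnegative mollifier with \(\Vert \eta _{\varepsilon }\Vert _1 = 1\), and the shorthands \(f_{\varepsilon }\) and f are introduced here purely for readability. Let \(f(x) := \psi (x, \left| \nabla u\right| \, \theta _i)\) and \(f_{\varepsilon }(x) := f\left( x_i + \frac{x-x_i}{\kappa _{\varepsilon }}\right) \), so that (H1) reads \(f_{\varepsilon } \rightarrow f\) in \(L^1({\mathbb {R}}^d)\). Since \(x_i + \frac{x-x_i-y}{\kappa _{\varepsilon }} = x_i + \frac{(x-y)-x_i}{\kappa _{\varepsilon }}\), the integral in (H2) equals \((f_{\varepsilon } * \eta _{\varepsilon })(x)\), and

$$\begin{aligned} \left\| f_{\varepsilon } * \eta _{\varepsilon } - f \right\| _{1}&\leqq \left\| (f_{\varepsilon } - f) * \eta _{\varepsilon } \right\| _{1} + \left\| f * \eta _{\varepsilon } - f \right\| _{1} \\&\leqq \left\| f_{\varepsilon } - f \right\| _{1} \, \left\| \eta _{\varepsilon } \right\| _{1} + \left\| f * \eta _{\varepsilon } - f \right\| _{1}. \end{aligned}$$

The first term tends to 0 by (H1), and the second by the usual \(L^1\) convergence of mollifications, which applies because \(f \in L^1({\mathbb {R}}^d)\) (as \(f \leqq \psi (x, \left| \nabla u\right| )\) and \({\mathcal {H}}(u) < \infty \)).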

Theorem 6.10

(Theorem 2.3 in the general case) Let \(u \in W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\) where \(\psi \) satisfies Assumptions 2.1 and 2.2. Let \({\mathcal {H}}\) be given by (2.1). Suppose that

$$\begin{aligned} q \leqq p + {\alpha } \max \left( 1, \frac{p}{d}\right) . \end{aligned}$$

Consider sequence \(\left\{ {\mathcal {S}}^{\varepsilon } u\right\} _{\varepsilon > 0}\) as in (6.6) with \(\varepsilon \leqq \frac{R}{8}\). Then,

  1. (I1)

    \({\mathcal {S}}^{\varepsilon } u \in C_c^{\infty }(\Omega )\),

  2. (I2)

    \({\mathcal {H}}\left( {\mathcal {S}}^{\varepsilon } u\right) \rightarrow {\mathcal {H}}(u)\) as \(\varepsilon \rightarrow 0\),

  3. (I3)

    \({\mathcal {S}}^{\varepsilon } u \rightarrow u\) in \(W^{1,\psi }(\Omega )\) as \(\varepsilon \rightarrow 0\),

  4. (I4)

    space \(C_c^{\infty }(\Omega )\) is dense in \(W^{1,\psi }_0(\Omega ) \) and Lavrentiev phenomenon does not occur, i.e. for all boundary data \(u_0 \in W^{1,q}(\Omega )\):

    $$\begin{aligned} \inf _{u \in u_0 + W_0^{1,p}(\Omega )} {\mathcal {H}}(u) = \inf _{u \in u_0 + W_0^{1,q}(\Omega )} {\mathcal {H}}(u) = \inf _{u \in u_0 + C_c^{\infty }(\Omega )} {\mathcal {H}}(u). \end{aligned}$$

Proof

The first property follows from Lemma 6.6. To prove (I2), we note that

$$\begin{aligned} {\mathcal {H}}\left( {\mathcal {S}}^{\varepsilon } u\right) = \int _{\Omega } \psi (x, \nabla {\mathcal {S}}^{\varepsilon } u(x) ) \mathrm {d}x. \end{aligned}$$

To take the mollification out of the function \(\psi \), we want to use Jensen’s inequality and Lemma 6.2. The latter requires an estimate on \(\left\| \nabla {\mathcal {S}}^{\varepsilon } u \right\| _{\infty }\).

\({\underline{{\textbf {Case 1: }}p \leqq d.}}\) In this case we have \(q \leqq p + \alpha \). Using Young’s convolution inequality we obtain

$$\begin{aligned} \left\| \nabla {\mathcal {S}}^{\varepsilon } u \right\| _{\infty } \leqq \sum _{i=1}^n \left\| u\, \theta _i \right\| _{\infty } \, \left\| \nabla \eta _{\varepsilon } \right\| _{1} \leqq \sum _{i=1}^n \left\| u \right\| _{\infty } \, \left\| \nabla \eta _{\varepsilon } \right\| _{1} \leqq {\mathcal {D}}\, (\varepsilon \, C_{\Omega , R})^{-1}, \end{aligned}$$
(6.7)

where we choose \({\mathcal {D}}:= n\, \Vert u\Vert _{\infty } \,\left\| \nabla \eta \right\| _{1} \, C_{\Omega , R}\) and \(C_{\Omega , R}\) is a constant from Lemma 6.8. Let \(x \in \Omega \) be fixed. Applying Lemma 6.2 with \(\gamma = \varepsilon \,C_{\Omega , R}\) we obtain constants \({\mathcal {M}}\), \({\mathcal {N}}\) such that

$$\begin{aligned} \psi \left( x, \left| \nabla {\mathcal {S}}^{\varepsilon } u \right| \right) \leqq {\mathcal {M}} \, \psi _{x,\,\gamma }^{**}\left( \left| \nabla {\mathcal {S}}^{\varepsilon } u\right| \right) + {\mathcal {N}}, \end{aligned}$$
(6.8)

where function \(\psi _{x,\,\gamma }^{**}\) is the second convex conjugate of the function defined in (6.1). Now, we want to estimate \(\psi _{x,\,\gamma }^{**}\left( \left| \nabla {\mathcal {S}}^{\varepsilon } u\right| \right) \). Due to its convexity, Jensen’s inequality implies

$$\begin{aligned} \psi _{x,\,\gamma }^{**}\left( \,\left| \nabla {\mathcal {S}}^{\varepsilon } u\right| \right)&= \psi _{x,\,\gamma }^{**}\left( \left| \sum _{i=1}^n \int _{B_{\varepsilon }} \nabla _x ( u \, \theta _i) \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \, \eta _{\varepsilon }(y) \mathrm {d}y \right| \,\right) \\&\leqq \int _{B_{\varepsilon }} \psi _{x,\,\gamma }^{**}\left( \sum _{i=1}^n \left| \nabla _x (u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \,\right) \, \eta _{\varepsilon }(y) \mathrm {d}y \\&\leqq \frac{1}{2}\,\int _{B_{\varepsilon }} \psi _{x,\,\gamma }^{**}\left( \frac{2}{\kappa _{\varepsilon }} \,\sum _{i=1}^n \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \, \eta _{\varepsilon }(y) \mathrm {d}y \\&\quad + \frac{1}{2}\,\int _{B_{\varepsilon }} \psi _{x,\,\gamma }^{**}\left( \frac{2}{\kappa _{\varepsilon }} \,\sum _{i=1}^n \left| (u \, \nabla \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \,\right) \, \eta _{\varepsilon }(y) \mathrm {d}y\\&=: X+Y. \end{aligned}$$
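The passage from the second to the third line above combines the chain rule with a midpoint convexity bound. A sketch of this step, writing \(z_i := x_i + \frac{x-x_i-y}{\kappa _{\varepsilon }}\) (a shorthand introduced here only for readability) and assuming that \(\psi _{x,\,\gamma }^{**}\) is convex and nondecreasing on \([0,\infty )\):

$$\begin{aligned} \nabla _x \left[ (u \, \theta _i)(z_i) \right]&= \frac{1}{\kappa _{\varepsilon }} \left( (\nabla u \, \theta _i)(z_i) + (u \, \nabla \theta _i)(z_i) \right) , \\ \psi _{x,\,\gamma }^{**}(a + b)&= \psi _{x,\,\gamma }^{**}\left( \tfrac{1}{2}\,(2a) + \tfrac{1}{2}\,(2b) \right) \leqq \tfrac{1}{2}\, \psi _{x,\,\gamma }^{**}(2a) + \tfrac{1}{2}\, \psi _{x,\,\gamma }^{**}(2b), \qquad a, b \geqq 0. \end{aligned}$$

Taking \(a = \frac{1}{\kappa _{\varepsilon }} \sum _{i=1}^n \left| (\nabla u \, \theta _i)(z_i)\right| \) and \(b = \frac{1}{\kappa _{\varepsilon }} \sum _{i=1}^n \left| (u \, \nabla \theta _i)(z_i)\right| \), and using the triangle inequality together with monotonicity to bound the argument by \(a + b\), yields the two integrals denoted X and Y.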

Concerning term Y, using upper bound from Lemma 6.2 and q-growth cf. Assumption 2.1 (A3) we can estimate as follows:

$$\begin{aligned}&\psi _{x,\,\gamma }^{**}\left( \frac{2}{\kappa _{\varepsilon }} \,\left| \sum _{i=1}^n (u \, \nabla \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \\&\quad \leqq C_2 \, \left( 1+\left| \frac{2\,n}{\kappa _{\varepsilon }}\, \Vert u\Vert _{\infty } \, \sup _{i=1,\ldots ,n} \Vert \nabla \theta _i\Vert _{\infty } \right| ^q\right) \\&\quad \leqq C_2 \, \left( 1+\left| 4\,n\, \Vert u\Vert _{\infty } \, \sup _{i=1,\ldots ,n} \Vert \nabla \theta _i\Vert _{\infty } \right| ^q\right) . \end{aligned}$$

Here we used \(\frac{1}{\kappa _{\varepsilon }} \leqq 2\) for \(\varepsilon \leqq \frac{R}{8}\) in the second inequality. Therefore, using \(\int _{B_{\varepsilon }} \eta _{\varepsilon }(y) \mathrm {d}y = 1\),

$$\begin{aligned} {\mathcal {M}}\,Y \leqq {\mathcal {M}}\,C_2 \, \left( 1+\left| 4\,n\, \Vert u\Vert _{\infty } \, \sup _{i=1,\ldots ,n} \Vert \nabla \theta _i\Vert _{\infty } \right| ^q\right) =: {\mathcal {Y}} < \infty . \end{aligned}$$

Concerning term X, we use convexity to get

$$\begin{aligned} {\mathcal {M}}\,X \leqq \frac{{\mathcal {M}}}{2\,n}\,\sum _{i=1}^n\int _{B_{\varepsilon }} \psi _{x,\,\gamma }^{**}\left( \frac{2\,n}{\kappa _{\varepsilon }} \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \, \eta _{\varepsilon }(y) \mathrm {d}y, \end{aligned}$$

so that we can study each summand independently. If \(x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\) does not belong to \({\overline{\Omega }}\), then \(\psi _{x,\,\gamma }^{**}\left( \frac{2n}{\kappa _{\varepsilon }} \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \right) = 0\) because \(\nabla u\) vanishes at this point, cf. Lemma 6.2 (G2). Otherwise, \(x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }} \in {\overline{\Omega }} \cap \overline{B_{C_{\Omega ,R}\, \varepsilon }(x)}\), cf. Lemma 6.8, so that

$$\begin{aligned}&\psi _{x,\,\gamma }^{**}\left( \frac{2\,n}{\kappa _{\varepsilon }} \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \\&\quad \leqq \psi \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}, \frac{2\,n}{\kappa _{\varepsilon }}\, \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) , \end{aligned}$$

due to Lemma 6.2. Applying Assumption 2.1 (A4) iteratively, we obtain

$$\begin{aligned}&\psi \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}, \frac{2\,n}{\kappa _{\varepsilon }}\, \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \\&\quad \leqq C_4^k\, \psi \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}, \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) , \end{aligned}$$

where k is the smallest natural number such that \(\frac{2\,n}{\kappa _{\varepsilon }} \leqq 2^k\) (see (A.2) in the proof of Lemma A.2). Using (6.8) and the fact that \(x \in \Omega \) was arbitrary, we obtain the inequality

$$\begin{aligned} \begin{aligned}&\psi \left( x,\left| \nabla {\mathcal {S}}^{\varepsilon } u\right| \right) \leqq {\mathcal {N}} + {\mathcal {Y}}\, + \\&\qquad + C_4^k\, \frac{{\mathcal {M}}}{2\,n}\, \sum _{i=1}^n \int _{B_{\varepsilon }} \psi \left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}, \left| (\nabla u \, \theta _i)\left( x_i + \frac{x-x_i - y}{\kappa _{\varepsilon }}\right) \right| \, \right) \eta _{\varepsilon }(y) \mathrm {d}y, \end{aligned} \end{aligned}$$
(6.9)

valid for all \(x\in \Omega \). Now, we observe that \(\psi \left( x,\left| \nabla {\mathcal {S}}^{\varepsilon } u\right| \right) \) converges a.e. to \(\psi (x, \left| \nabla u\right| )\). Moreover, the last term on the right-hand side of (6.9) is convergent in \(L^1(\Omega )\), cf. Lemma 6.9 (H2). Therefore, Corollary A.2 (Vitali convergence theorem) implies

$$\begin{aligned} \psi (x, \left| \nabla {\mathcal {S}}^{\varepsilon } u\right| ) \rightarrow \psi (x, \left| \nabla u\right| ) \qquad \text{ in } L^1(\Omega ) \text{ as } \varepsilon \rightarrow 0. \end{aligned}$$

Thanks to triangle inequality we obtain (I2). Now, (I3) follows from Lemma 4.1 (C4) while property (I4) follows from Lemma 4.3.

\({\underline{{\textbf {Case 2: }}p > d.}}\) In this case we have \(q \leqq p +\alpha \, \frac{p}{d}\). In this situation, instead of (6.7), we compute using change of variables

$$\begin{aligned} \left\| \nabla {\mathcal {S}}^{\varepsilon } u \right\| _{\infty } \leqq \frac{1}{\kappa _{\varepsilon }}\,\sum _{i=1}^n \left\| \nabla (u\, \theta _i) \right\| _{p} \, \left\| \eta _{\varepsilon } \right\| _{p'} \leqq 2 \,\sum _{i=1}^n \left\| \nabla (u\, \theta _i) \right\| _{p} \, \left\| \eta _{\varepsilon } \right\| _{p'}, \end{aligned}$$
(6.10)

where \(p'\) is the usual Hölder conjugate exponent. Using change of variables we obtain

$$\begin{aligned} \left\| \eta _{\varepsilon } \right\| _{p'}^{p'} = \int _{B_{\varepsilon }} \frac{1}{\varepsilon ^{d\, p'}} \left| \eta \left( \frac{x}{\varepsilon } \right) \right| ^{p'} \mathrm {d}x = \varepsilon ^{d\,(1-p')} \int _{B} \left| \eta (x)\right| ^{p'} \mathrm {d}x = \varepsilon ^{-p'\, \frac{d}{p}} \Vert \eta \Vert _{p'}^{p'}, \end{aligned}$$

so that \( \left\| \eta _{\varepsilon } \right\| _{p'} = \varepsilon ^{-\frac{d}{p}} \Vert \eta \Vert _{p'}\). Concerning the term with function u,

$$\begin{aligned} \left\| \nabla (u\, \theta _i) \right\| _{p} \leqq \left\| \nabla u \right\| _{p} \, \left\| \theta _i \right\| _{\infty } + \left\| u \right\| _{p} \, \left\| \nabla \theta _i \right\| _{\infty }, \end{aligned}$$

which is finite as \({\mathcal {H}}(u)<\infty \) and \(u \in L^{\infty }(\Omega )\). Therefore, (6.10) boils down to

$$\begin{aligned} \left\| \nabla {\mathcal {S}}^{\varepsilon } u \right\| _{\infty } \leqq D\, \left( \varepsilon \,C_{\Omega ,R}\right) ^{-\frac{d}{p}}, \quad D:= 2\, C_{\Omega ,R}^{\frac{d}{p}}\,\Vert \eta \Vert _{p'} \,\sum _{i=1}^n \left( \left\| \nabla u \right\| _{p} \, \left\| \theta _i \right\| _{\infty } + \left\| u \right\| _{p} \, \left\| \nabla \theta _i \right\| _{\infty }\right) . \end{aligned}$$
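The exponent bookkeeping behind the last displays can be verified directly. The following is only a consistency check, using \(p' = \frac{p}{p-1}\) and the abbreviation \(\Sigma := \sum _{i=1}^n \left( \left\| \nabla u \right\| _{p} \, \left\| \theta _i \right\| _{\infty } + \left\| u \right\| _{p} \, \left\| \nabla \theta _i \right\| _{\infty }\right) \), which is introduced here for brevity:

$$\begin{aligned} d\,(1-p')&= d \left( 1 - \frac{p}{p-1} \right) = -\frac{d}{p-1} = -p'\,\frac{d}{p}, \\ 2 \, \Sigma \, \left\| \eta _{\varepsilon } \right\| _{p'}&= 2 \, \Sigma \, \varepsilon ^{-\frac{d}{p}} \, \Vert \eta \Vert _{p'} = \left( 2\, C_{\Omega ,R}^{\frac{d}{p}} \, \Vert \eta \Vert _{p'} \, \Sigma \right) \left( \varepsilon \, C_{\Omega ,R} \right) ^{-\frac{d}{p}} = D\, \left( \varepsilon \, C_{\Omega ,R}\right) ^{-\frac{d}{p}}. \end{aligned}$$

The factor \(C_{\Omega ,R}^{\frac{d}{p}}\) in D is thus inserted only to express the bound in terms of \(\gamma = \varepsilon \, C_{\Omega ,R}\), the scale used in Case 1 when applying Lemma 6.2.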

Now, we can apply Lemma 6.2 (G1) to obtain estimate (6.8). The rest of the proof is exactly the same. \(\square \)

7 Extension of Theorem 2.3 to vector-valued maps

Many authors consider variational problems with vector-valued functions. However, in our work functionals depend only on the length of the gradient, so there is almost no difficulty in extending our result to the vector-valued setting. In this section, we write \({{\textbf {u}}}=(u^1, \ldots , u^n)\) for the map \({{\textbf {u}}}: \Omega \rightarrow {\mathbb {R}}^n\). For simplicity, we use the same notation for spaces of vector-valued functions as for spaces of scalar-valued ones.

The main point that needs explanation is a generalisation of Lemma 4.2, where we applied truncation to approximate functions from \(W^{1,\psi }(\Omega )\) by functions from \(W^{1,\psi }(\Omega ) \cap L^{\infty }(\Omega )\).

Lemma 7.1

Let \(u: \Omega \rightarrow {\mathbb {R}}^n\), \({{\textbf {u}}}=(u^1, \ldots , u^n)\) where \(n \in {\mathbb {N}}\). Suppose that \(u \in W^{1,\psi }(\Omega )\). Then, \(u^i \in W^{1,\psi }(\Omega )\). Moreover, suppose that for each \(i = 1, \ldots , n\) we have \(u^i_k \rightarrow u^i\) in \(W^{1,\psi }(\Omega )\). Let \({{\textbf {u}}}_{\mathbf{k}}:= (u^1_k, \ldots , u^n_k)\). Then, \({{\textbf {u}}}_{\mathbf{k}} \rightarrow {{\textbf {u}}}\) in \(W^{1,\psi }(\Omega )\).

Proof

We observe that if we interpret \(|\nabla {{\textbf {u}}}|\) component-wise, we have \(|\nabla u^i| \leqq |\nabla {{\textbf {u}}}|\). By convexity of \(\xi \mapsto \psi (x,\xi )\), we have that

$$\begin{aligned} 0 \leqq \psi (x,|\nabla {u^i}|) \leqq \psi (x,|\nabla {{\textbf {u}}}|). \end{aligned}$$

To see the second statement, we note that

$$\begin{aligned} |\nabla {{\textbf {u}}} - \nabla {{\textbf {u}}}_{\mathbf{k}}| \leqq \sum _{i=1}^n |\nabla { u^i} - \nabla { u_k^i}|, \end{aligned}$$

so that by convexity of the mapping \(\xi \mapsto \psi (x,\xi )\),

$$\begin{aligned} 0 \leqq \psi (x,|\nabla {{\textbf {u}}} - \nabla {{\textbf {u}}}_{\mathbf{k}}|) \leqq \frac{1}{n} \sum _{i=1}^n\psi \left( x, n\, |\nabla { u^i} - \nabla { u_k^i}|\right) . \end{aligned}$$

Using Lemma 4.1 (C2), we conclude that \({{\textbf {u}}}_{\mathbf{k}} \rightarrow {{\textbf {u}}}\) in \(W^{1,\psi }(\Omega )\). \(\square \)
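The convexity step used in the proof above can be made explicit. The following sketch assumes that \(t \mapsto \psi (x,t)\) is convex and nondecreasing on \([0,\infty )\); for nonnegative numbers \(a_1, \ldots , a_n\) one then has

$$\begin{aligned} \psi \left( x, \sum _{i=1}^n a_i \right) = \psi \left( x, \frac{1}{n} \sum _{i=1}^n n\, a_i \right) \leqq \frac{1}{n} \sum _{i=1}^n \psi \left( x, n \, a_i \right) , \end{aligned}$$

which is applied with \(a_i = |\nabla u^i - \nabla u^i_k|\) after bounding \(|\nabla {{\textbf {u}}} - \nabla {{\textbf {u}}}_{\mathbf{k}}|\) by the sum of the \(a_i\). For the pure power integrand \(\psi (x,t) = t^p\), the first term of the toy model (1.1), this reduces to the familiar inequality \(\left( \sum _{i=1}^n a_i\right) ^p \leqq n^{p-1} \sum _{i=1}^n a_i^p\).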

Theorem 7.2

Suppose that \(q \leqq p + \alpha \max \left( 1, \frac{p}{d} \right) \). Let \({\mathcal {H}}\) be a functional defined by (2.1) with \(\psi \) satisfying Assumptions 2.1 and 2.2. Then, for all \({{\textbf {u}}}_{\mathbf{0}} \in W^{1,q}(\Omega )\) we have that

$$\begin{aligned} \inf _{{{\textbf {u}}} \in {{\textbf {u}}}_{{{\textbf {0}}}} + W_0^{1,p}(\Omega )} {\mathcal {H}}({{\textbf {u}}}) = \inf _{{{\textbf {u}}} \in {{\textbf {u}}}_{{{\textbf {0}}}} + W_0^{1,q}(\Omega )} {\mathcal {H}}({{\textbf {u}}}) = \inf _{{{\textbf {u}}} \in {{\textbf {u}}}_{\mathbf{0}} + C_c^{\infty }(\Omega )} {\mathcal {H}}({{\textbf {u}}}). \end{aligned}$$

Moreover, space \(C_c^{\infty }(\Omega )\) is dense in the Musielak–Orlicz–Sobolev space \(W^{1,\psi }_0(\Omega )\).

Proof

We first prove that \(C_c^{\infty }(\Omega )\) is dense in the Musielak–Orlicz–Sobolev space \(W^{1,\psi }_0(\Omega )\). This follows from the following facts:

  • \(W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\) is dense in \(W^{1,\psi }_0(\Omega )\). Indeed, let \({{\textbf {u}}} \in W^{1,\psi }_0(\Omega )\) and \({{\textbf {u}}}_{\mathbf{k}} := (T_k(u^1), \ldots , T_k(u^n))\) where \(T_k\) was defined in (4.4). It follows from Lemmas 4.2 and 7.1 that \({{\textbf {u}}}_{\mathbf{k}} \rightarrow {{\textbf {u}}}\) in \(W^{1,\psi }(\Omega )\).

  • \(C_c^{\infty }(\Omega )\) is dense in \(W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\). Indeed, let \({{\textbf {u}}} \in W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\). Then, each \(u^i \in W^{1,\psi }_0(\Omega ) \cap L^{\infty }(\Omega )\). By Theorem 2.3, we have a sequence \(\{u^i_k\}_{k \in {\mathbb {N}}} \subset C_c^{\infty }(\Omega )\) such that \(u^i_k \rightarrow u^i\) in \(W^{1,\psi }(\Omega )\). Let \({{\textbf {u}}}_{\mathbf{k}}:= (u^1_k, \ldots , u^n_k)\). By Lemma 7.1, \({{\textbf {u}}}_{\mathbf{k}} \rightarrow {{\textbf {u}}}\) in \(W^{1,\psi }(\Omega )\).

Having density of \(C_c^{\infty }(\Omega )\) in \(W^{1,\psi }_0(\Omega )\) in hand, the absence of Lavrentiev phenomenon follows as in the proof of Lemma 4.3. \(\square \)