1 Introduction

This paper deals with generalized Orlicz spaces, also known as Musielak–Orlicz spaces. This field has been very active recently [2, 5, 6, 14, 16, 19, 29, 31, 32], boosted by work on the double phase problem by Baroni, Colombo and Mingione, e.g. [4, 13]. The generalized Orlicz case unifies the study of the double phase problem and of variable exponent growth, which has been widely researched over the last 20 years [15]. Many studies deal with isotropic energies of the type

$$\begin{aligned} \int _\Omega \varphi (x, |\nabla u|)\, \mathrm{{d}}x \end{aligned}$$

but recently also the anisotropic case

$$\begin{aligned} \int _\Omega \Phi (x, \nabla u)\, \mathrm{{d}}x \end{aligned}$$

has been considered, e.g., in [1, 7, 8, 9, 24, 27]. As an example of an anisotropic energy with non-standard growth we could take a double-phase functional where the q-phase is directional:

$$\begin{aligned} \int _\Omega |\nabla u|^p + a(x) |\partial _{x_1} u|^q\, \mathrm{{d}}x\,; \end{aligned}$$

here only variation in the \(x_1\)-direction makes a contribution to the energy in the q-phase \(\{a>0\}\).
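To see how this example fits the anisotropic framework, the following worked computation may help; the integrand \(\Phi \) and the test functions \(g\) below are written out here only for illustration. The functional corresponds to

$$\begin{aligned} \Phi (x, \xi ) = |\xi |^p + a(x)\, |\xi _1|^q, \qquad \xi =(\xi _1,\ldots ,\xi _n)\in \mathbb {R}^n. \end{aligned}$$

If, for instance, \(u(x)=g(x_2)\) depends only on \(x_2\), then \(\partial _{x_1}u=0\) and

$$\begin{aligned} \Phi (x, \nabla u) = |g'(x_2)|^p, \end{aligned}$$

so the weight \(a\) plays no role, whereas for \(u(x)=g(x_1)\) one obtains the full double-phase integrand \(|g'(x_1)|^p + a(x)\, |g'(x_1)|^q\). In particular, \(\Phi (x,\xi )\) is not a function of \(|\xi |\) alone.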

Counterexamples (see, e.g., [3, 15]) show that more advanced results such as the boundedness of averaging operators or density of smooth functions require connecting \(\Phi (x, \xi )\) for different values of x. To this end, we developed in the isotropic case the (A1) condition [18, 22] (see also [28]), which is essentially optimal for the boundedness of the maximal operator. In the anisotropic case Chlebicka, Gwiazda, Zatorska-Goldstein and co-authors [1, 7, 8, 9, 10, 11, 12, 17] have developed a theory based on their (M) condition. To state and compare these conditions, let us define

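$$\begin{aligned} \Phi _B^+(\xi ) := \sup _{x\in B\cap \Omega } \Phi (x, \xi ) \quad \text {and}\quad \Phi _B^-(\xi ) := \inf _{x\in B\cap \Omega } \Phi (x, \xi ). \end{aligned} \tag{1.1}$$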

In essence, the (A1) condition says that \(\Phi _B^+\) can be bounded by \(\Phi _B^-\) in small balls \(B\subset \mathbb {R}^n\) in a quantitative way, whereas (M) says that it can be similarly bounded by the greatest convex minorant \((\Phi _B^-)^\mathrm{conv}\) of \(\Phi _B^-\) (see Definition 3.2). Obviously, the latter is a stronger condition, and it is also more difficult to verify, since the relationship between \((\Phi _B^-)^\mathrm{conv}\) and \(\Phi _B^-\) may be complicated in the anisotropic case.

In the isotropic case \(\Phi _B^- \lesssim (\Phi _B^-)^\mathrm{conv}\) so (M) and (A1) are equivalent. In the anisotropic case this inequality does not hold (see Example 4.1), but we are nevertheless able to prove the equivalence of the conditions by a more careful analysis.

Theorem 1.2

Let \(\Phi :\Omega \times \mathbb {R}^m\rightarrow [0,\infty ]\) be a strong \(\Phi \)-function. Then (A1) and (M) are equivalent.

This result and the techniques introduced in this paper will allow for the development of a theory of anisotropic generalized Orlicz spaces with more natural assumptions. As an example we prove the following Jensen-type inequality.

Corollary 1.3

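(Jensen-type inequality) Let \(\Phi \) satisfy (A1) and \(f\in L^\Phi _\mu (\Omega ; \mathbb {R}^m)\). Then there exists \(\beta >0\) such that

$$\begin{aligned} \Phi _B^+\bigg (\frac{\beta }{\mu (B)} \int _B f\, \mathrm{{d}}\mu \bigg ) \leqslant \frac{1}{\mu (B)} \int _B \Phi (x, f)\, \mathrm{{d}}\mu + 1 \end{aligned}$$

when \(\varrho _\Phi (f)\leqslant 1\) and \(\mu (B)\leqslant 1\).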

Although the extra assumption \(\varrho _\Phi (f)\leqslant 1\) in the corollary may seem strange, it arises naturally, for instance, when dealing with local regularity, and it is known that the anisotropic Jensen inequality does not hold without restrictions.

Let us next define precisely the concepts we are using and characterize functions \(\Phi \) for which the equivalence \(\Phi ^\mathrm{conv}\simeq \Phi \) holds (Corollary 2.4). In Sect. 3, we define the conditions (A1) and (M) and give preliminary remarks regarding the definitions. Finally, in Sect. 4 we prove the main results mentioned above.

2 Almost Convexity and the Greatest Convex Minorant

I refer to the monographs [9, 18] for background on isotropic and anisotropic generalized Orlicz spaces, respectively. We consider functions \(\Phi :\Omega \times \mathbb {R}^m\rightarrow [0,\infty ]\); the capital letter \(\Phi \) is used to highlight the distinction from the isotropic case \(L^\varphi \) in [18] where \(\varphi :\Omega \times [0,\infty )\rightarrow [0,\infty ]\). The idea is to define

$$\begin{aligned} \varrho _\Phi (v) := \int _\Omega \Phi (x, v)\, \mathrm{{d}}\mu \quad \text {and}\quad \Vert v\Vert _\Phi := \inf \{\lambda >0 \,|\, \varrho _\Phi \left( \tfrac{v}{\lambda }\right) \leqslant 1\} \end{aligned}$$

for a vector field \(v\in L^1_\mu (\Omega ; \mathbb {R}^m)\). The space \(L^\Phi _\mu (\Omega ; \mathbb {R}^m)\) is defined by the requirement \(\Vert v\Vert _\Phi <\infty \). We use the equivalence relation \(\Phi \simeq \Psi \) which means that there exists \(\beta >0\) such that

$$\begin{aligned} \Phi (\beta \xi ) \leqslant \Psi (\xi ) \leqslant \Phi \left( \tfrac{\xi }{\beta }\right) . \end{aligned}$$

Here and in the rest of the paper \(\beta \) denotes a parameter which is given by one or more conditions; if the conditions hold with different \(\beta _k\), then we can use \(\beta :=\min _k\beta _k\) for all the conditions so that we may just as well use only the one common \(\beta \). Since the parameter \(\lambda \) is inside \(\Phi \) in the definition of \(\Vert v\Vert _\Phi \), this is the natural way to compare functions \(\Phi \) (cf. Example 2.2). To ensure that the integral in \(\varrho _\Phi \) makes sense and \(\Vert \cdot \Vert _\Phi \) is a norm we require some conditions.

Definition 2.1

Let \(\Omega \subset \mathbb {R}^n\) be an open set. We say that \(\Phi :\Omega \times \mathbb {R}^m\rightarrow [0,\infty ]\) is a strong \(\Phi \)-function, and write \(\Phi \in \Phi _{\mathrm{s}}(\Omega )\), if the following four conditions hold:

  1. (1)

    \(x \mapsto \Phi (x, \xi )\) is measurable for every \(\xi \in \mathbb {R}^m\).

  2. (2)

    \(\displaystyle \Phi (x, 0) = \lim _{\xi \rightarrow 0} \Phi (x,\xi ) =0\) and \(\displaystyle \lim _{\xi \rightarrow \infty }\Phi (x,\xi )=\infty \) for a.e. \(x\in \Omega \).

  3. (3)

    \(\xi \mapsto \Phi (x,\xi )\) is continuous in the topology of \([0,\infty ]\) for a.e. \(x\in \Omega \).

  4. (4)

    \(\Phi \) is convex for a.e. \(x\in \Omega \):

    $$\begin{aligned} \Phi (x, \alpha \xi + \alpha '\xi ') \leqslant \alpha \Phi (x, \xi ) + \alpha ' \Phi (x, \xi '), \quad \alpha , \alpha '\geqslant 0,\ \alpha +\alpha '=1. \end{aligned}$$

With these conditions, \(\Vert \cdot \Vert _\Phi \) is a norm. Note that continuity in \(\xi \) follows from convexity if \(\Phi \) is real-valued, so (3) is only needed to ensure that \(\Phi \) does not jump to \(\infty \). Note also that this class of strong \(\Phi \)-functions is broader than that studied in [9], since we do not require upper and lower bounds in terms of N-functions independent of x. For instance, this definition allows for \(L^1\)- and \(L^\infty \)-type growth. In [18] in the isotropic case we relaxed (3) and (4) further, and so used “strong” for this class, even though it is still less restrictive than N-functions.

For the study of \(\Phi \)-functions depending on the space-variable x, we use local approximations with the functions \(\Phi _B^+\) and \(\Phi _B^-\) from (1.1) [18, 22]. However, \(\Phi _B^-\) need not be convex even if each \(\Phi (x,\cdot )\) is (just think of \(\min \{t,t^2\}\)). In the isotropic case, \(\varphi _B^-\) nevertheless satisfies the following weaker variant of (4) above:

  1. (W4)

    \(\Phi \) is almost convex if there exists \(\beta >0\) such that

    $$\begin{aligned} \Phi \big (x, \beta (\alpha \xi + \alpha '\xi ')\big ) \leqslant \alpha \Phi (x, \xi ) + \alpha ' \Phi (x, \xi '), \end{aligned}$$

    for a.e. \(x\in \Omega \) and \(\alpha , \alpha '\geqslant 0\) with \(\alpha +\alpha '=1\).

Unfortunately, even this does not hold for \(\Phi _B^-\) in the anisotropic case (see Example 4.1).

The constant \(\beta \) in the almost convexity condition (W4) should be inside the function since we do not assume doubling, or even finite, functions, as the following example illustrates. A constant outside is possible, but too restrictive.

Example 2.2

Let \(\varphi _\infty (t):=\infty \chi _{(1,\infty )}(t)\) be the function generating the space \(L^\infty (\Omega )\). Define \(\Phi (\xi ) := \varphi _\infty (\Vert \xi \Vert _{1/2} ) = \varphi _\infty (|\xi _1| + 2\sqrt{|\xi _1 \xi _2|}+|\xi _2|)\) in \(\mathbb {R}^2\). Consider \(\alpha =\frac{1}{2}\) and the basis vectors \(\xi =e_1\) and \(\xi '=e_2\) in (W4). Then \(\Vert \tfrac{e_1+e_2}{2}\Vert _{1/2}=2\) so \(\Phi (\tfrac{e_1+e_2}{2}) = \infty \) and the inequality

$$\begin{aligned} \Phi \left( \tfrac{e_1+e_2}{2}\right) \leqslant \tfrac{L}{2} \left[ \Phi (e_1) + \Phi (e_2)\right] \end{aligned}$$

does not hold for any \(L<\infty \). However, the almost convexity inequality (W4)

$$\begin{aligned} \Phi \left( \beta \tfrac{e_1+e_2}{2}\right) \leqslant \tfrac{1}{2} \left[ \Phi (e_1) + \Phi (e_2)\right] = 0 \end{aligned}$$

holds for \(\beta \leqslant \frac{1}{2}\) since in this case \(\Phi \left( \beta \tfrac{e_1+e_2}{2}\right) = 0\).
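The arithmetic in this example can be checked numerically. The following Python sketch is only a sanity check under the conventions of the example; the function names `quasi_norm_half`, `phi_inf`, `Phi` and `scaled_midpoint` are hypothetical and introduced here for illustration.

```python
import math

def quasi_norm_half(xi):
    """The quasi-norm ||xi||_{1/2} = (sqrt|xi_1| + sqrt|xi_2|)^2 on R^2."""
    return (math.sqrt(abs(xi[0])) + math.sqrt(abs(xi[1]))) ** 2

def phi_inf(t):
    """phi_infty(t): infinite for t > 1 and zero otherwise."""
    return math.inf if t > 1 else 0.0

def Phi(xi):
    """Phi(xi) = phi_infty(||xi||_{1/2})."""
    return phi_inf(quasi_norm_half(xi))

def scaled_midpoint(beta):
    """The point beta * (e_1 + e_2) / 2."""
    return (beta / 2, beta / 2)

# Quasi-norm of the unscaled midpoint (e_1 + e_2) / 2.
mid_norm = quasi_norm_half(scaled_midpoint(1.0))
# Right-hand side of (W4) for xi = e_1, xi' = e_2 and alpha = 1/2.
rhs = 0.5 * (Phi((1.0, 0.0)) + Phi((0.0, 1.0)))

print(mid_norm, Phi(scaled_midpoint(1.0)), rhs, Phi(scaled_midpoint(0.5)))
```

The midpoint has quasi-norm approximately 2, so \(\Phi \) is infinite there while the right-hand side is zero; scaling the argument by \(\beta =\frac{1}{2}\) brings the point back into the set where \(\Phi =0\).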

If we choose \(\xi '=0\) in the almost convexity condition (W4), then we obtain (aInc)\(_1\):

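$$\begin{aligned} \Phi (x, \beta \lambda \xi ) \leqslant \lambda \Phi (x, \xi ) \quad \text {for all } \lambda \in [0,1]. \end{aligned}$$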

In the special case \(\beta =1\), i.e. for a convex function, we have (Inc)\(_1\):

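$$\begin{aligned} \Phi (x, \lambda \xi ) \leqslant \lambda \Phi (x, \xi ) \quad \text {for all } \lambda \in [0,1]. \end{aligned}$$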

These inequalities mean that the function \(t\mapsto \frac{\Phi (x, t\xi )}{t}\) is almost increasing or increasing, hence the notation aInc\(_1\) and Inc\(_1\). In [18] we showed that these inequalities are useful substitutes for convexity in the isotropic case. In particular, it is easy to see that \(\Phi _B^+\) and \(\Phi _B^-\) satisfy aInc\(_1\) or Inc\(_1\) if \(\Phi \) does. For the anisotropic case the almost convexity is more appropriate since it also carries information about non-radial behavior.

Let us denote by \(\Phi ^\mathrm{conv}\) the greatest convex minorant of \(\Phi \). This function is often denoted by \(\Phi ^{**}\), since it can be obtained by applying the conjugation operation \(^*\) twice [9, Corollary 2.1.42], but we will not use this fact here. We next show a connection between the greatest convex minorant and the almost convexity condition. The following is a version of Carathéodory’s Theorem from convex analysis. Probably it is known, but a proof is included for completeness, since I could not find a reference.

Lemma 2.3

Let \(\Phi :\mathbb {R}^m\rightarrow [0,\infty ]\). Then

$$\begin{aligned} \Phi ^\mathrm{conv}(\xi ) = \min \bigg \{\sum _{k=1}^{m+1} \alpha _k \Phi (\xi _k) \,\bigg |\, \sum _{k=1}^{m+1} \alpha _k \xi _k =\xi ,\ \sum _{k=1}^{m+1} \alpha _k = 1,\, \alpha _k\geqslant 0 \bigg \}. \end{aligned}$$

Proof

Consider the epigraph of \(\Phi \),

$$\begin{aligned} E:= \{ (\xi , t) \in \mathbb {R}^m\times \mathbb {R}\,|\, \Phi (\xi )\leqslant t \} \subset \mathbb {R}^{m+1}. \end{aligned}$$

By Carathéodory’s Theorem (see, e.g., [30, Theorem 2.1.3]), every point in the convex hull of E can be represented as a convex combination of at most \(m+2\) points \(\xi _k\) from E. Furthermore, we observe that if any of the points \(\xi _k\) are from the interior of E, then the convex combination is also in the interior of the convex hull. Thus the points of the boundary, i.e. the graph of \(\Phi ^\mathrm{conv}\), are given as a convex combination of points in the boundary of E, i.e. on the graph of \(\Phi \). Hence

$$\begin{aligned} \Phi ^\mathrm{conv}(\xi ) = \sum _{k=1}^{m+2} \alpha _k \Phi (\xi _k) \quad \text {for some}\quad \sum _{k=1}^{m+2} \alpha _k \xi _k =\xi ,\ \sum _{k=1}^{m+2} \alpha _k = 1,\, \alpha _k\geqslant 0. \end{aligned}$$

This is the claim, except with one extra point \(\xi _{m+2}\).

However, \(\xi \) lies in the convex hull of \(\xi _1,\ldots , \xi _{m+2}\in \mathbb {R}^m\). Thus by Carathéodory’s Theorem in \(\mathbb {R}^m\), \(\xi \) can be expressed as the convex combination of at most \(m+1\) of the points \(\xi _1\), ..., \(\xi _{m+2}\). By re-labeling if necessary, we obtain

$$\begin{aligned} \sum _{k=1}^{m+1} \alpha _k' \xi _k =\xi ,\ \sum _{k=1}^{m+1} \alpha _k' = 1,\, \alpha _k'\geqslant 0. \end{aligned}$$

Since the points \(\big (\xi _k, \Phi (\xi _k)\big )_{k=1}^{m+2}\) all lie on the same hyper-plane for a boundary point, we also have

$$\begin{aligned} \Phi ^\mathrm{conv}(\xi ) = \sum _{k=1}^{m+1} \alpha _k' \Phi (\xi _k). \end{aligned}$$

\(\square \)

We can now show that \(\Phi \simeq \Phi ^\mathrm{conv}\) for almost convex functions.

Corollary 2.4

The function \(\Phi :\mathbb {R}^m\rightarrow [0, \infty ]\) is almost convex if and only if \(\Phi \simeq \Phi ^\mathrm{conv}\).

Proof

Clearly, \(\Phi ^\mathrm{conv}\leqslant \Phi \) since \(\Phi ^\mathrm{conv}\) is defined as a minorant of \(\Phi \). Let \(2^i\geqslant m+1\) and set \(\alpha _k:=0\) and \(\xi _k:=0\) for \(k>m+1\). By the almost convexity condition,

$$\begin{aligned} \Phi \bigg (\beta ^i \sum _{k=1}^{2^i}\alpha _k \xi _k \bigg ) \leqslant \alpha _{i,1} \Phi \bigg (\beta ^{i-1} \sum _{k=1}^{2^{i-1}}\frac{\alpha _k}{\alpha _{i,1}} \xi _k \bigg ) + \alpha _{i,2} \Phi \bigg (\beta ^{i-1} \sum _{k=2^{i-1}+1}^{2^i}\frac{\alpha _k}{\alpha _{i,2}} \xi _k \bigg ) \end{aligned}$$

where

$$\begin{aligned} \alpha _{i,1} := \sum _{k=1}^{2^{i-1}}\alpha _k \quad \text {and}\quad \alpha _{i,2} := \sum _{k=2^{i-1}+1}^{2^i}\alpha _k. \end{aligned}$$

Iterating this i times, we obtain that

$$\begin{aligned} \Phi \bigg (\beta ^i \sum _{k=1}^{2^i}\alpha _k \xi _k \bigg ) \leqslant \sum _{k=1}^{2^i} \alpha _k \Phi ( \xi _k). \end{aligned}$$

By Lemma 2.3 and this inequality,

$$\begin{aligned} \Phi ^\mathrm{conv}(\xi ) \geqslant \min \bigg \{\sum _{k=1}^{2^i} \alpha _k \Phi (\xi _k) \,\bigg |\, \sum _{k=1}^{2^i} \alpha _k \xi _k =\xi ,\ \sum _{k=1}^{2^i} \alpha _k = 1,\, \alpha _k\geqslant 0 \bigg \} \geqslant \Phi (\beta ^i \xi ). \end{aligned}$$

Thus the almost convexity implies that \(\Phi \simeq \Phi ^\mathrm{conv}\).

If, on the other hand, \(\Phi \simeq \Phi ^\mathrm{conv}\) with constant \(\beta \), then we directly obtain

$$\begin{aligned} \Phi \big (\beta (\alpha \xi {\,+\,} \alpha '\xi ')\big ) {\,\leqslant \,} \Phi ^\mathrm{conv}(\alpha \xi {\,+\,} \alpha '\xi ') {\,\leqslant \,} \alpha \Phi ^\mathrm{conv}(\xi ) {\,+\,} \alpha ' \Phi ^\mathrm{conv}(\xi ') {\,\leqslant \,} \alpha \Phi (\xi ) {\,+\,} \alpha ' \Phi (\xi '). \end{aligned}$$

\(\square \)
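As an illustration of the iteration in the first part of the proof, consider the case \(m=2\) (chosen here only as an example), so that \(i=2\), \(\alpha _4=0\) and \(\xi _4=0\). Assuming that \(\alpha _{2,1}=\alpha _1+\alpha _2>0\) and \(\alpha _{2,2}=\alpha _3>0\), two applications of the almost convexity condition give

$$\begin{aligned} \Phi \big (\beta ^2 (\alpha _1\xi _1+\alpha _2\xi _2+\alpha _3\xi _3)\big )&\leqslant \alpha _{2,1}\, \Phi \Big (\beta \, \tfrac{\alpha _1\xi _1+\alpha _2\xi _2}{\alpha _{2,1}}\Big ) + \alpha _{2,2}\, \Phi (\beta \xi _3) \\&\leqslant \alpha _1 \Phi (\xi _1) + \alpha _2 \Phi (\xi _2) + \alpha _3 \Phi (\xi _3). \end{aligned}$$

Taking the minimum over all such representations of \(\xi =\alpha _1\xi _1+\alpha _2\xi _2+\alpha _3\xi _3\) and using Lemma 2.3 yields \(\Phi (\beta ^2\xi )\leqslant \Phi ^\mathrm{conv}(\xi )\) in this case.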

For almost convex functions we easily obtain a Jensen inequality with an extra constant.

Corollary 2.5

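(Jensen’s inequality) Let \(E\subset \mathbb {R}^m\) have positive, finite measure \(\mu (E)\). If \(\Phi \in C(E; [0, \infty ])\) is almost convex, then there exists \(\beta \) such that

$$\begin{aligned} \Phi \bigg (\frac{\beta }{\mu (E)} \int _E f\, \mathrm{{d}}\mu \bigg ) \leqslant \frac{1}{\mu (E)} \int _E \Phi (f)\, \mathrm{{d}}\mu \end{aligned}$$

for every \(f\in L^\Phi _\mu (E; \mathbb {R}^m)\).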

Proof

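By Corollary 2.4 and Jensen’s inequality for the convex function \(\Phi ^\mathrm{conv}\),

$$\begin{aligned} \Phi \bigg (\frac{\beta }{\mu (E)} \int _E f\, \mathrm{{d}}\mu \bigg ) \leqslant \Phi ^\mathrm{conv}\bigg (\frac{1}{\mu (E)} \int _E f\, \mathrm{{d}}\mu \bigg ) \leqslant \frac{1}{\mu (E)} \int _E \Phi ^\mathrm{conv}(f)\, \mathrm{{d}}\mu \leqslant \frac{1}{\mu (E)} \int _E \Phi (f)\, \mathrm{{d}}\mu . \end{aligned}$$

\(\square \)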

3 Definition of and Remarks on Conditions

The aInc\(_1\) and almost convexity (W4) conditions connect \(\Phi (x, \xi )\) for different values of \(\xi \) with x fixed. However, more advanced results such as the density of smooth functions in Sobolev spaces require connecting \(\Phi (x, \xi )\) for different values of x, cf. [6]. This is the purpose of the conditions (A1-\(\Psi \)) and (M-\(\Psi \)), which generalize (A1) and (M).

However, let us first start with the more elementary condition (A0): there exists \(\beta >0\) such that

$$\begin{aligned}\Phi (x, \beta \xi ) \leqslant 1 \leqslant \Phi \left( x,\tfrac{1}{\beta }\xi \right) \end{aligned}$$

for all \(\xi \in \mathbb {R}^m\) with \(|\xi |=1\) and all \(x\in \Omega \). Note that (A0) is implicit in the assumption \(m_1(|\xi |)\leqslant \Phi (x,\xi )\leqslant m_2(|\xi |)\) for N-functions \(m_1\) and \(m_2\) used in [1, 7,8,9,10,11, 17]. This property is inherited by other versions of \(\Phi \):

Lemma 3.1

If \(\Phi \in \Phi _{\mathrm{s}}(\Omega )\) satisfies (A0), then so do \(\Phi _B^+\), \(\Phi _B^-\) and \((\Phi _B^-)^\mathrm{conv}\).

Proof

Taking the supremum or infimum over \(x\in \Omega \) in (A0) of \(\Phi \) gives (A0) for \(\Phi _B^+\) and \(\Phi _B^-\). Since \((\Phi _B^-)^\mathrm{conv}\leqslant \Phi _B^-\), the left inequality of (A0) follows for \((\Phi _B^-)^\mathrm{conv}\). If \(|\xi |\geqslant \tfrac{1}{\beta }\), then by Inc\(_1\) and (A0) we conclude that

$$\begin{aligned} \Phi (x, \xi ) \geqslant \beta \,|\xi |\, \Phi \left( x, \tfrac{1}{\beta }\tfrac{\xi }{|\xi |}\right) \geqslant \beta \,|\xi |. \end{aligned}$$

Hence, for all \(\xi \in \mathbb {R}^m\), \(\Phi (x, \xi ) \geqslant (\beta \,|\xi |-1)_+\) (since \((\beta \,|\xi |-1)_+=0\) when \(|\xi |\leqslant \tfrac{1}{\beta }\)) and so \(\Phi _B^-(\xi ) \geqslant (\beta \,|\xi |-1)_+\). But the right-hand side is a convex function, so it follows that \((\Phi _B^-)^\mathrm{conv}(\xi ) \geqslant (\beta \,|\xi |-1)_+\) since \((\Phi _B^-)^\mathrm{conv}\) is defined as the greatest convex minorant. Consequently,

$$\begin{aligned} (\Phi _B^-)^\mathrm{conv}\left( \tfrac{2}{\beta }\xi \right) \geqslant \left( \beta \,\tfrac{2}{\beta }-1\right) _+ = 1, \end{aligned}$$

when \(\xi \in \mathbb {R}^m\) with \(|\xi |=1\), so \((\Phi _B^-)^\mathrm{conv}\) satisfies (A0) with constant \(\tfrac{\beta }{2}\). \(\square \)

The condition (A1) was introduced in [22] (see also [18, 28]) and is essentially optimal for the boundedness of the maximal operator in isotropic generalized Orlicz spaces. It also implies the Hölder continuity of solutions and (quasi)minimizers [5, 20, 21]. For higher regularity, we introduced in [23] a vanishing-(A1) condition along the same lines. These previous studies apply to the isotropic case, i.e. \(m=1\). In [24, 25] we generalized the (A1)-conditions to the anisotropic case, although only the quasi-isotropic case was considered in the main results.

Chlebicka, Gwiazda, Zatorska-Goldstein and co-authors [1, 7, 8, 9, 10, 11, 12, 17] considered the assumption (M) in the anisotropic case. In the next definition their condition is reformulated to make it easier to compare with the (A1) condition (see also Lemma 3.4). Note also that some of the earlier works included additional restrictions in the condition.

Definition 3.2

Let \(\Phi , \Psi \in \Phi _{\mathrm{s}}(\Omega )\). We say that \(\Phi \) satisfies (A1-\(\Psi \)) or (M-\(\Psi \)) if for any \(K>0\) there exists \(\beta >0\) such that

$$\begin{aligned} \Phi _B^+(\beta \xi )\leqslant \Phi _B^-(\xi )+1 \qquad \text {when }\Psi _B^-(\xi )\leqslant \tfrac{K}{\mu (B)} \end{aligned}$$

or

$$\begin{aligned} \Phi _B^+(\beta \xi )\leqslant (\Phi _B^-)^\mathrm{conv}(\xi )+1 \qquad \text {when }(\Psi _B^-)^\mathrm{conv}(\xi )\leqslant \tfrac{K}{\mu (B)} \end{aligned}$$

for all balls \(B\subset \mathbb {R}^n\) with \(\mu (B)\leqslant 1\) and \(\xi \in \mathbb {R}^m\).

When \(\Psi (t):=t^s\) or \(\Psi :=\Phi \), we use the abbreviations (A1-s) and (M-s), or (A1) and (M), respectively.

The role of \(\Psi \) is to calibrate the almost continuity requirement with the information available on the function we are interested in; it was developed from the initial condition (A1) over the course of several studies [5, 20, 21]. For instance, we showed in [5, Theorem 3.9] that the weak Harnack inequality holds for non-negative supersolutions of \(\mathrm{div}\left( \varphi '(|\nabla u|)\frac{\nabla u}{|\nabla u|}\right) =0\) if the isotropic \(\Phi \)-function \(\varphi \) satisfies (A1-\(\psi \)) and the supersolution satisfies \(u\in W^{1,\psi }(\Omega )\), where \(\psi \in \Phi _{\mathrm{w}}(\Omega )\) is a potentially different function. Note that this involves a trade-off, since a larger \(\psi \) means more restriction on u and less restriction on \(\varphi \).

As far as I know, Chlebicka, Gwiazda, Zatorska-Goldstein and co-authors considered (M) only in the cases \(\Psi (t):=t\) and \(\Psi (t):=t^p\) (i.e. (M-1) and (M-p) in the notation above). However, the next example illustrates why this does not lead to optimal results.

Example 3.3

(Variable exponent double phase) Let \(\varphi (x, t):= t^{p(x)} + a(x) t^{q(x)}\) where \(a\in C^{0, \alpha }(\Omega )\), \(a\geqslant 0\) and \(1<p \leqslant q\). Now the (A1) or (M) conditions reduce to

$$\begin{aligned} \frac{q(x)}{p(x)} \leqslant 1+ \frac{\alpha }{n} \quad \Longleftrightarrow \quad \Big (\frac{q}{p}\Big )^+ \leqslant 1+ \frac{\alpha }{n}. \end{aligned}$$

Let \(p^-:= \inf _{x\in \Omega } p(x)\) and \(p^+:= \sup _{x\in \Omega } p(x)\). If we only use fixed exponent gauges such as (A1-\(p^-\)) or (M-\(p^-\)), then we instead end up with the condition

$$\begin{aligned} \frac{q(x)}{p^-} \leqslant 1+ \frac{\alpha }{n} \quad \Longleftrightarrow \quad \frac{q^+}{p^-} \leqslant 1+ \frac{\alpha }{n} \end{aligned}$$

which is worse, and quite unnatural as the largest value of q is bounded by the smallest value of p.
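To get a concrete feel for the gap between the two conditions, consider the following hypothetical data, chosen here purely for illustration: \(n=2\), \(\alpha =\frac{1}{2}\), an exponent \(p\) with \(p^-=2\) and \(p^+=4\), and \(q(x):=\frac{6}{5}\,p(x)\). Then

$$\begin{aligned} 1+\frac{\alpha }{n} = \frac{5}{4}, \qquad \Big (\frac{q}{p}\Big )^+ = \frac{6}{5} \leqslant \frac{5}{4}, \qquad \frac{q^+}{p^-} = \frac{24/5}{2} = \frac{12}{5} > \frac{5}{4}, \end{aligned}$$

so the first condition is satisfied whereas the second one fails.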

As a final remark about the formulation, we note that earlier papers used a form without the “+1” and instead restricted the range of \(\Psi _B^-\). However, if (A0) holds, then these formulations are equivalent. We prove this for (M); the same argument applies to (A1).

Lemma 3.4

Let \(\Phi \in \Phi _{\mathrm{s}}(\Omega )\) satisfy (A0). Then (M) holds if and only if

$$\begin{aligned} \Phi _B^+(\beta \xi )\leqslant (\Phi _B^-)^\mathrm{conv}(\xi ) \qquad \text {when }(\Phi _B^-)^\mathrm{conv}(\xi )\in \left[ 1, \tfrac{K}{\mu (B)}\right] . \end{aligned}$$

Proof

If the condition of the lemma holds, then (M) needs only to be checked when \((\Phi _B^-)^\mathrm{conv}(\xi )\leqslant 1\). This inequality and (A0) imply that \(|\xi |\leqslant \frac{1}{\beta }\). Thus \(\Phi _B^+(\beta ^2 \xi )\leqslant 1\) by (A0), so (M) holds with constant \(\beta ^2\).

Assume conversely that (M) holds and \((\Phi _B^-)^\mathrm{conv}(\xi )\in \left[ 1, \tfrac{K}{\mu (B)}\right] \). Then it follows that

$$\begin{aligned} \Phi _B^+(\beta \xi )\leqslant (\Phi _B^-)^\mathrm{conv}(\xi ) + 1 \leqslant 2(\Phi _B^-)^\mathrm{conv}(\xi ). \end{aligned}$$

Then Inc\(_1\) implies that \(\Phi _B^+\left( \frac{\beta }{2} \xi \right) \leqslant \frac{1}{2}\Phi _B^+(\beta \xi )\leqslant (\Phi _B^-)^\mathrm{conv}(\xi )\), so the condition of the lemma holds with constant \(\frac{\beta }{2}\). \(\square \)

4 Equivalence of Conditions

In the previous section we introduced and motivated the conditions (A1) and (M) and their variants. We now move on to the main result, and consider their relation to one another.

Since \((\Phi _{B}^-)^\mathrm{conv}\leqslant \Phi _{B}^-\), (M-\(\Psi \)) implies (A1-\(\Psi \)). If \(\varphi \) is isotropic and satisfies aInc\(_1\), then I showed in [22] that \(\varphi _{B}^-(\beta t) \leqslant (\varphi _{B}^-)^\mathrm{conv}(t)\). Hence the two conditions are equivalent in this case. However, as pointed out in [9, Remarks 2.3.14 and 3.7.6], this approach is not possible in the anisotropic case. Since I did not understand the examples implicit in these remarks without consulting the authors, I include here an explicit example based on ideas of Piotr Nayar communicated to me by Iwona Chlebicka.

Example 4.1

Let \(m=2\) and \(\Phi _k(\xi _1e_1 + \xi _2e_2):= \xi _k^2\). Then both \(\Phi _1\) and \(\Phi _2\) are convex and \(\Phi _1(e_2)=\Phi _2(e_1)=0\). Denote \(\Phi :=\min \{\Phi _1,\Phi _2\}\). It follows that

$$\begin{aligned} \Phi ^\mathrm{conv}(\alpha _1e_1+\alpha _2e_2) \leqslant \alpha _1\Phi ^\mathrm{conv}(e_1)+\alpha _2\Phi ^\mathrm{conv}(e_2) \leqslant \alpha _1\Phi (e_1)+\alpha _2\Phi (e_2) = 0, \end{aligned}$$

where \(\alpha _1+\alpha _2=1\) and \(\alpha _1,\alpha _2\geqslant 0\). Since also \(\Phi (te_1)=\Phi (te_2)=0\) for every \(t\in \mathbb {R}\), the same argument with \(e_1\) and \(e_2\) replaced by \(t_1e_1\) and \(t_2e_2\) shows that \(\Phi ^\mathrm{conv}\equiv 0\). Since \(\Phi (\beta (e_1+e_2)) = \Phi _1(\beta (e_1+e_2))=\beta ^2>0\) for every \(\beta >0\), the relation \(\Phi \simeq \Phi ^\mathrm{conv}\) does not hold.
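The failure of almost convexity for this \(\Phi \) can also be seen numerically. The Python sketch below is only a sanity check; the helper `midpoint_gap` is a hypothetical name introduced for illustration. It compares \(\Phi \) at the midpoint \(t(e_1+e_2)\) of the points \(2te_1\) and \(2te_2\) with the average of \(\Phi \) at these two points.

```python
def Phi(xi):
    """Phi = min{Phi_1, Phi_2} with Phi_k(xi) = xi_k^2 on R^2."""
    return min(xi[0] ** 2, xi[1] ** 2)

def midpoint_gap(t):
    """Return (Phi at the midpoint, average of Phi at the endpoints)
    for the endpoints 2t*e_1 and 2t*e_2, whose midpoint is t*(e_1 + e_2)."""
    a, b = (2 * t, 0.0), (0.0, 2 * t)
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return Phi(mid), 0.5 * Phi(a) + 0.5 * Phi(b)

for t in (0.5, 1.0, 10.0):
    print(t, midpoint_gap(t))
```

The average over the endpoints is always zero while the value at the midpoint is \(t^2\); since \(\Phi (\beta t(e_1+e_2))=\beta ^2t^2>0\), no choice of \(\beta >0\) inside the function can repair the inequality.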

Even though \(\Phi _{B}^-(\beta \xi ) \leqslant (\Phi _{B}^-)^\mathrm{conv}(\xi )\) does not hold in general, we next construct an almost convex minorant which is comparable to \(\Phi _B^-\) when (A1) holds and which can therefore be used in (M). We prove that (A1) implies (M) in the main case \(\Psi :=\Phi \), which corresponds to the natural energy space \(L^\Phi \) or \(W^{1,\Phi }\). The implication for (A1-\(\Psi \)) and (M-\(\Psi \)) when \(\Psi \ne \Phi \) remains an open problem.

Let \(\Phi :\mathbb {R}^m\rightarrow [0,\infty ]\) be a strong \(\Phi \)-function independent of x. Denote \(K_s:= \{ \Phi \leqslant s\}\) and observe that it is a convex compact set which includes 0 in its interior. Define

$$\begin{aligned} \Vert \xi \Vert _{K_s} := \inf \{ \lambda >0 \,|\, \tfrac{\xi }{\lambda }\in K_s \} \qquad \text {and} \qquad N_s(\xi ) := s \max \{1, \Vert \xi \Vert _{K_s} \}. \end{aligned}$$

Here \(\Vert \cdot \Vert _{K_s} \) is the Minkowski functional of the set \(K_s\), first studied by Kolmogorov [26]. The Luxemburg norm \(\Vert \cdot \Vert _\Phi \) defined previously is another example of a Minkowski functional. Note that \(N_s\) is a convex function with \(\{N_s \leqslant s\} = K_s\). Since \(\Phi \) is convex, \(\Phi (\lambda \xi ) \leqslant \lambda \Phi (\xi )\) for \(\lambda \leqslant 1\). Thus \(N_s \leqslant \Phi \) outside \(K_s\), \(N_s\geqslant \Phi \) in \(K_s\) and \(N_s=\Phi \) on the boundary \(\partial K_s\). In other words, we take the s-level set of \(\Phi \) and replace \(\Phi \) outside of it by the function \(N_s\) which grows linearly.
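As a simple illustration, with the model function chosen here only as an example, take \(\Phi (\xi ):=|\xi |^2\). Then \(K_s\) is the closed ball of radius \(\sqrt{s}\) centered at the origin, so

$$\begin{aligned} \Vert \xi \Vert _{K_s} = \frac{|\xi |}{\sqrt{s}} \qquad \text {and} \qquad N_s(\xi ) = s \max \Big \{1, \frac{|\xi |}{\sqrt{s}} \Big \} = \max \{s, \sqrt{s}\, |\xi |\}. \end{aligned}$$

For \(|\xi |\geqslant \sqrt{s}\) we have \(N_s(\xi )=\sqrt{s}\, |\xi |\leqslant |\xi |^2=\Phi (\xi )\), for \(|\xi |\leqslant \sqrt{s}\) we have \(N_s(\xi )=s\geqslant |\xi |^2=\Phi (\xi )\), and equality holds when \(|\xi |=\sqrt{s}\), in agreement with the general observations above.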

As Example 4.1 shows, the minimum of two convex functions need not be even almost convex. However, in the next proposition we show that \(\min \{\Phi , N_s\}\) is almost convex, since the two functions are somehow compatible. This will be used to construct a convex minorant of \(\Phi _B^-\). The proposition also demonstrates the utility of the almost convexity condition, as it seems much more difficult to choose \(N_s\) to make the minimum convex while still being a minorant of \(\Phi _B^-\).

Proposition 4.2

Let \(\Phi :\mathbb {R}^m\rightarrow [0,\infty ]\) be a strong \(\Phi \)-function. Then \(M_s := \min \{\Phi , N_s\}\) is almost convex.

Proof

Note that \(M_s=\Phi \chi _{K_s} + N_s \chi _{\mathbb {R}^m\setminus K_s}\) and let \(\alpha ,\alpha '>0\) with \(\alpha +\alpha '=1\). If \(\xi ,\xi ' \not \in K_s\), then the convexity of \(N_s\) implies that

$$\begin{aligned} M_s\big (\beta (\alpha \xi {\,+\,} \alpha '\xi ')\big ) {\,\leqslant \,} N_s\big (\beta (\alpha \xi {\,+\,} \alpha '\xi ')\big ) {\,\leqslant \,} \alpha N_s(\xi ) {\,+\,} \alpha ' N_s(\xi ') {\,=\,} \alpha M_s(\xi ) {\,+\,} \alpha ' M_s(\xi '). \end{aligned}$$

If \(\xi ,\xi '\in K_s\), then the inequality follows from the convexity of \(\Phi \), which holds by assumption. Therefore it suffices to show that

$$\begin{aligned} M_s\big (\beta (\alpha \xi + \alpha ' \xi ')\big ) \leqslant \alpha \Phi (\xi ) + \alpha ' N_s(\xi ') \end{aligned}$$

when \(\xi \in K_s\) and \(\xi '\not \in K_s\). Define \({\tilde{\xi }} := \alpha \xi + \alpha ' \xi '\) and \({\tilde{\zeta }} :=\frac{1}{2} {\tilde{\xi }}\). We will show that

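$$\begin{aligned} M_s({\tilde{\zeta }}) \leqslant C\big (\alpha \Phi (\xi ) + \alpha ' N_s(\xi ')\big ). \end{aligned} \tag{4.3}$$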

Observe that \(M_s\) satisfies Inc\(_1\), since \(N_s\) and \(\Phi \) do. By Inc\(_1\), (4.3) implies the previous inequality with constant \(\beta :=\frac{1}{2C}\) and concludes the proof. We consider two cases to prove (4.3).

Case 1:

\({\tilde{\zeta }} \in K_s\). Then \(M_s({\tilde{\zeta }}) \leqslant s \leqslant N_s(\xi ')\) and so (4.3) holds with \(C=2\) when \(\alpha ' > \frac{1}{2}\). Thus we may assume that \(\alpha \geqslant \frac{1}{2}\). Now if \(M_s({\tilde{\zeta }})\leqslant 2 \Phi (\xi )\), then (4.3) holds with \(C=4\). Hence we further assume that \(M_s({\tilde{\zeta }})> 2\Phi (\xi )\). We may assume that \(\xi \), \(\xi '\) and 0 are not collinear since in the collinear case we can choose \(\xi '_k\rightarrow \xi '\) such that \(\xi \), \(\xi '_k\) and 0 are not collinear and use the continuity of \(M_s\) and \(N_s\). Let \(\zeta '\) be the intersection of the segment \([0, \xi ']\) and the line through \(\xi \) and \({\tilde{\zeta }}\) (see Fig. 1). If \(\zeta '\in K_s\), then \({\tilde{\zeta }} = \theta \xi + (1-\theta )\zeta '\) for some \(\theta \in (0,1)\). By the convexity of \(\Phi \) and \(M_s({\tilde{\zeta }})> 2\Phi (\xi )\) we have

$$\begin{aligned} M_s({\tilde{\zeta }}) = \Phi ({\tilde{\zeta }}) \leqslant \theta \Phi (\xi ) + (1-\theta ) \Phi (\zeta ') \leqslant \tfrac{1}{2} M_s({\tilde{\zeta }}) + M_s(\zeta '). \end{aligned}$$

Thus \(M_s(\zeta ')=\Phi (\zeta ') \geqslant \frac{1}{2}M_s({\tilde{\zeta }})\). If, on the other hand, \(\zeta '\not \in K_s\), then \(M_s(\zeta ')\geqslant s\geqslant M_s({\tilde{\zeta }})\). In either case, we have \(M_s(\zeta ')\geqslant \frac{1}{2} M_s({\tilde{\zeta }})\). Consider the parallelogram \((0, \xi , {\tilde{\xi }}, {\tilde{\xi }}-\xi )\). Let \(\eta '\) be the intersection of the segments \([{\tilde{\xi }}, {\tilde{\xi }}-\xi ]\) and \([0,\xi ']\) (see Fig. 1). From \({\tilde{\xi }}=(1-\alpha ')\xi +\alpha '\xi '\) we observe that

$$\begin{aligned} \alpha ' = \frac{|{\tilde{\xi }}-\xi |}{|\xi -\xi '|} = \frac{|\eta '|}{|\xi '|} \geqslant \frac{|\zeta '|}{|\xi '|} =: \nu \in (0,1); \end{aligned}$$

the second equality follows since the triangles \((\xi ', {\tilde{\xi }}, \eta ')\) and \((\xi ', \xi , 0)\) are similar. Thus, by \(M_s(\zeta ')\geqslant \frac{1}{2} M_s({\tilde{\zeta }})\) from the previous paragraph and Inc\(_1\) of \(M_s\),

$$\begin{aligned} N_s(\xi ') = M_s(\xi ') \geqslant \tfrac{1}{\nu }M_s( \nu \xi ' ) = \tfrac{1}{\nu }M_s( \zeta ' ) \geqslant \tfrac{1}{2\nu } M_s({\tilde{\zeta }}) \geqslant \tfrac{1}{2\alpha '} M_s({\tilde{\zeta }}), \end{aligned}$$

where we used the conclusion of the previous paragraph in the penultimate step. The inequality

$$\begin{aligned} M_s\big ({\tilde{\zeta }}\big ) \leqslant 2\alpha ' N_s(\xi ') \end{aligned}$$

follows, so (4.3) holds with \(C=2\).

Fig. 1 Construction of auxiliary points

Case 2:

\({\tilde{\zeta }} \not \in K_s\). Let \(\nu := \Vert {\tilde{\zeta }} \Vert _{K_s}^{-1} < 1\). Since \(K_s\) is closed, it follows from the definition of \(\Vert \cdot \Vert _{K_s}\) that \(\Phi (\nu {\tilde{\zeta }}) = s\) and \(\nu {\tilde{\zeta }} \in \partial K_s\). Furthermore, \(N_s(\nu {\tilde{\zeta }})=s=M_s(\nu {\tilde{\zeta }})\) and so

$$\begin{aligned} M_s({\tilde{\zeta }}) = s \Vert {\tilde{\zeta }}\Vert _{K_s} = \tfrac{1}{\nu }s&= \tfrac{1}{\nu }M_s(\nu {\tilde{\zeta }}) \leqslant \tfrac{4}{\nu }(\alpha \Phi (\nu \xi ) + \alpha ' N_s(\nu \xi ')) \leqslant 4(\alpha \Phi (\xi ) + \alpha ' N_s(\xi ')), \end{aligned}$$

where we used the previous case for \(\nu {\tilde{\zeta }} \in K_s\) in the first inequality and Inc\(_1\) for the last step. \(\square \)

We are ready to prove the main theorem, i.e. the equivalence of (A1) and (M).

Proof of Theorem 1.2

Since \((\Phi _{B}^-)^\mathrm{conv}\leqslant \Phi _B^-\), it follows from (M) that

$$\begin{aligned} \Phi _B^+(\beta \xi ) \leqslant (\Phi _B^-)^\mathrm{conv}(\xi )+1 \leqslant \Phi _B^-(\xi )+1, \end{aligned}$$

when \(\xi \in \mathbb {R}^m\) with \(\Phi _B^-(\xi )\leqslant \frac{K}{\mu (B)}\), where \(B\subset \mathbb {R}^n\) is a ball, which gives (A1).

Assume now conversely that (A1) holds and let \(s:=\frac{K}{\mu (B)}+1\) for a ball \(B\subset \mathbb {R}^n\) with \(\mu (B)\leqslant 1\). Define \(N_s\) as before based on \(K_s:=\{\xi \in \mathbb {R}^m\,|\, \Phi _B^+(\beta \xi ) \leqslant s\}\) and set \(M_s(\xi ) := \min \{\Phi _B^+(\beta \xi ), N_s(\xi )\}\). By Proposition 4.2, \(M_s\) is almost convex so \(M_s(\beta '\xi ) \leqslant (M_s)^\mathrm{conv}(\xi )\) by Corollary 2.4.

If \(\xi \in K_s\), then \(M_s(\xi ) = \Phi _B^+(\beta \xi )\leqslant s\). Now either \(\Phi _B^-(\xi )\leqslant \frac{K}{\mu (B)}\) in which case (A1) implies that \(\Phi _B^+(\beta \xi ) \leqslant \Phi _{B}^-(\xi )+1\), or \(\Phi _B^-(\xi )> \frac{K}{\mu (B)}\) in which case \(\Phi _B^+(\beta \xi )\leqslant s \leqslant \Phi _{B}^-(\xi )+1\). Combining the two cases, we find that

$$\begin{aligned} \Phi _B^+(\beta \xi ) \leqslant \Phi _{B}^-(\xi )+1\qquad \text {for all } \xi \in K_s. \end{aligned}$$

If \(\xi \not \in K_s\), then \(\nu := \Vert \xi \Vert _{K_s}^{-1} < 1\). As in Case 2 of the previous proof \(\nu \xi \in \partial K_s\) and \(\Phi _B^+(\beta \nu \xi )=s\). If \(\Phi _B^-(\nu \xi )< \frac{K}{\mu (B)}\), then (A1) implies that \(\Phi _B^+(\beta \nu \xi ) \leqslant \Phi _B^-(\nu \xi ) + 1< s\), which is a contradiction. Therefore \(\Phi _B^-(\nu \xi )\geqslant \frac{K}{\mu (B)} = s-1\) and so

$$\begin{aligned} M_s(\xi ) = N_s(\xi ) = \tfrac{1}{\nu }s \leqslant \tfrac{1}{\nu }\tfrac{s}{s-1} \Phi _{B}^-(\nu \xi ) \leqslant \tfrac{s}{s-1} \Phi _{B}^-(\xi ), \end{aligned}$$

where we used Inc\(_1\) of \(\Phi _B^-\) in the last step. Note that \(\frac{s}{s-1}=1+\frac{\mu (B)}{K}\leqslant 1+\frac{1}{K}\) since we assumed that \(\mu (B)\leqslant 1\).

In the previous two paragraphs we have shown that \(M_s\leqslant \left( 1+\frac{1}{K}\right) \Phi _B^- + 1\). Therefore, the convex minorant of \(M_s\) is also a convex minorant of \(\left( 1+\frac{1}{K}\right) \Phi _B^-+1\), and we conclude that \((M_s)^\mathrm{conv}\leqslant \left( 1+\tfrac{1}{K}\right) (\Phi _B^-)^\mathrm{conv}+1\) since \((\Phi _B^-)^\mathrm{conv}\) is the greatest convex minorant of \(\Phi _B^-\). We noted above that \(M_s(\beta '\xi ) \leqslant (M_s)^\mathrm{conv}(\xi )\). Therefore,

$$\begin{aligned} M_s(\beta ' \xi ) \leqslant \left( 1+\tfrac{1}{K}\right) (\Phi _B^-)^\mathrm{conv}(\xi )+1 \qquad \text {for all }\xi \in \mathbb {R}^m. \end{aligned}$$

Let us show that (M) holds. Assume that \((\Phi _B^-)^\mathrm{conv}(\xi )\leqslant \frac{K}{\mu (B)}\). By Inc\(_1\) and the conclusion of the previous paragraph,

$$\begin{aligned} M_s\left( \tfrac{K}{K+1}\beta ' \xi \right) \leqslant \tfrac{K}{K+1} M_s(\beta ' \xi ) \leqslant (\Phi _B^-)^\mathrm{conv}(\xi )+1 \leqslant s. \end{aligned}$$

Therefore \(\tfrac{K}{K+1}\beta ' \xi \in K_s\) and \(M_s\left( \tfrac{K}{K+1}\beta ' \xi \right) =\Phi _B^+\left( \tfrac{K}{K+1}\beta \beta ' \xi \right) \). Thus

$$\begin{aligned} \Phi _B^+\left( \tfrac{K}{K+1}\beta \beta ' \xi \right) \leqslant (\Phi _B^-)^\mathrm{conv}(\xi )+1 \end{aligned}$$

and we have established (M) with constant \(\frac{K}{K+1} \beta \beta '\). \(\square \)

Remark 4.4

From the proof of Proposition 4.2, we see that the almost convexity constant equals \(\frac{1}{8}\). From the proof of Corollary 2.4, we see that \(\beta '=8^{-i}\), where i is the smallest integer with \(2^i\geqslant m+1\). Thus \(2^{-i}>\frac{1}{2(m+1)}\) so that \(\beta '>(2(m+1))^{-3}\). Then we see from the proof of the previous theorem that the constant in (M) can be chosen as \((2(m+1))^{-3} \frac{K}{K+1} \beta \) when \(\beta \) is the constant from (A1). In other words, the constants from the two conditions are comparable up to a constant depending on the dimension when \(K\geqslant 1\).
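For instance, in the case \(m=2\) (taken here only as an illustration) the smallest admissible integer is \(i=2\), so that

$$\begin{aligned} \beta ' = 8^{-2} = \frac{1}{64} > \frac{1}{216} = (2(m+1))^{-3}, \end{aligned}$$

and with \(K=1\) the constant \(\frac{K}{K+1}\beta \beta '\) obtained in the proof of the theorem equals \(\frac{\beta }{128}\).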

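The assumption \(\Phi _B^-(\xi )\leqslant \frac{K}{\mu (B)}\) from (A1) is somewhat difficult to verify. In the isotropic case, if we assume that \(\varrho _\varphi (f)\leqslant 1\), then we can conclude from Jensen’s inequality that

$$\begin{aligned} \varphi _B^-\bigg (\frac{\beta }{\mu (B)} \int _B |f|\, \mathrm{{d}}\mu \bigg ) \leqslant \frac{1}{\mu (B)} \int _B \varphi _B^-(|f|)\, \mathrm{{d}}\mu \leqslant \frac{1}{\mu (B)} \int _B \varphi (x, |f|)\, \mathrm{{d}}\mu \leqslant \frac{1}{\mu (B)}. \end{aligned} \tag{4.5}$$

Thus we may apply (A1) to conclude that

$$\begin{aligned} \varphi _B^+\bigg (\frac{\beta ^2}{\mu (B)} \int _B |f|\, \mathrm{{d}}\mu \bigg ) \leqslant \varphi _B^-\bigg (\frac{\beta }{\mu (B)} \int _B |f|\, \mathrm{{d}}\mu \bigg ) + 1 \leqslant \frac{1}{\mu (B)} \int _B \varphi (x, |f|)\, \mathrm{{d}}\mu + 1. \end{aligned}$$

This argument is not possible in the anisotropic case, since \((\Phi _B^-)^\mathrm{conv}\) is not comparable to \(\Phi _B^-\). Fortunately, the condition of (M) is easier to use.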

Proof of Corollary 1.3

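By Theorem 1.2, \(\Phi \) satisfies (M). Since \((\Phi _B^-)^\mathrm{conv}\) is convex, it follows by Jensen’s inequality that

$$\begin{aligned} (\Phi _B^-)^\mathrm{conv}\bigg (\frac{1}{\mu (B)} \int _B f\, \mathrm{{d}}\mu \bigg ) \leqslant \frac{1}{\mu (B)} \int _B (\Phi _B^-)^\mathrm{conv}(f)\, \mathrm{{d}}\mu \leqslant \frac{1}{\mu (B)} \int _B \Phi (x, f)\, \mathrm{{d}}\mu \leqslant \frac{1}{\mu (B)}. \end{aligned}$$

Therefore we can use (M) with \(K:=1\) and \(\xi :=\frac{1}{\mu (B)} \int _B f\, \mathrm{{d}}\mu \), together with the previous inequality, to conclude that

$$\begin{aligned} \Phi _B^+\bigg (\frac{\beta }{\mu (B)} \int _B f\, \mathrm{{d}}\mu \bigg ) \leqslant (\Phi _B^-)^\mathrm{conv}\bigg (\frac{1}{\mu (B)} \int _B f\, \mathrm{{d}}\mu \bigg ) + 1 \leqslant \frac{1}{\mu (B)} \int _B \Phi (x, f)\, \mathrm{{d}}\mu + 1. \end{aligned}$$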
\(\square \)