1 Introduction

The celebrated John–Nirenberg inequality for \(\mathrm {BMO} \) functions has been extensively studied by many authors in different situations since its appearance in the original work [10] by F. John and L. Nirenberg. In the Euclidean space \({\mathbb {R}}^n\) endowed with a doubling measure \(\mu \) (see (1.2)) this inequality reads

$$\begin{aligned} \mu \left( \left\{ x\in Q:|f(x)-f_{Q,\mu }|>t\right\} \right) \le c_1e^{-t/\left( c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\right) }\mu (Q),\qquad t>0, \end{aligned}$$
(1.1)

where \(f_{Q,\mu }:=\int _Qf(x)\,\mathrm {d}\mu (x)/\mu (Q)\), and it is satisfied for every cube Q in \({\mathbb {R}}^n\) and every \(\mathrm {BMO}(\mathrm {d}\mu )\) function f with universal constants \(c_1,c_2>1\). Here \(\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\) denotes the \(\mathrm {BMO}(\mathrm {d}\mu )\) norm of f, which is defined as

$$\begin{aligned} \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} :=\sup _{Q\in {\mathcal {Q}}} \frac{1}{\mu (Q)}\int _Q|f(x)-f_{Q,\mu }|\,\mathrm {d}\mu (x), \end{aligned}$$

where the supremum is taken over the class \({\mathcal {Q}}\) of all cubes Q in \({\mathbb {R}}^n\). Finiteness of this quantity defines membership of f in the class \(\mathrm {BMO}(\mathrm {d}\mu )\) of functions with bounded mean oscillation. Recall that a measure \(\mu \) is said to satisfy the doubling condition if there are positive constants \(c_\mu \) and \(n_\mu \) such that, for every pair of cubes Q and \({\tilde{Q}}\) in \({\mathbb {R}}^n\) with \(Q\subset {\tilde{Q}}\), the inequality

$$\begin{aligned} \mu ({\tilde{Q}})\le c_\mu \left( \frac{\ell ({\tilde{Q}})}{\ell (Q) }\right) ^{n_\mu }\mu (Q) \end{aligned}$$
(1.2)

holds. The constants \(c_\mu \) and \(n_\mu \) are called the doubling constant and the doubling dimension of \(\mu \), respectively. Note that \(c_\mu \) cannot be smaller than 1, as one sees by taking \({\tilde{Q}}=Q\) in (1.2).

It is known that a John–Nirenberg inequality

$$\begin{aligned} \mu \left( \left\{ x\in Q:|f(x)-f_{Q,\mu }|>t\right\} \right) \le Ce^{-c(f)\cdot t }\mu (Q),\qquad t>0,\qquad Q\in {\mathcal {Q}}, \end{aligned}$$
(1.3)

with constants \(c(f),C>0\), characterizes the membership of a function f in the \(\mathrm {BMO}(\mathrm {d}\mu )\) class, and in that case C and c(f) can be taken to be the constants \(c_1\) and \((c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )})^{-1}\) from (1.1). The John–Nirenberg inequality (1.3) for a locally integrable function f is in turn equivalent to the validity of a precise estimate (actually, a family of estimates) of the form

$$\begin{aligned} \left( \frac{1}{\mu (Q)}\int _Q|f(x)-f_{Q,\mu }|^p\,\mathrm {d}\mu (x)\right) ^{\frac{1}{p}}\le c(\mu )\cdot p \cdot C(f) \end{aligned}$$
(1.4)

for all cubes \(Q\in {\mathcal {Q}}\) and all \(p>1\), with \(c(\mu )>0\) and \(C(f)>0\) independent of p and Q. Moreover, the constant C(f) in (1.4) can be replaced by \(\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\). See [18] and also [19, p.146] for this.
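
To illustrate why exponential decay of the level sets forces linear growth in p, we sketch how (1.3) implies an estimate of the form (1.4); the constant obtained this way is not claimed to be sharp. Assuming (1.3) with constants C and c(f), the layer-cake formula for \(t\mapsto t^p\) gives

$$\begin{aligned} \frac{1}{\mu (Q)}\int _Q|f(x)-f_{Q,\mu }|^p\,\mathrm {d}\mu (x)&=\frac{p}{\mu (Q)}\int _0^\infty t^{p-1}\mu \left( \left\{ x\in Q:|f(x)-f_{Q,\mu }|>t\right\} \right) \,\mathrm {d}t\\ &\le C\,p\int _0^\infty t^{p-1}e^{-c(f)\cdot t}\,\mathrm {d}t=\frac{C\,\Gamma (p+1)}{c(f)^p}. \end{aligned}$$

Since \(\Gamma (p+1)\le p^p\) and \(C^{1/p}\le \max \{1,C\}\) for \(p\ge 1\), taking p-th roots bounds the left-hand side of (1.4) by \(\max \{1,C\}\cdot p/c(f)\).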

It turns out then that having inequality (1.4) for every \(p>1\) is a precise quantitative expression of the John–Nirenberg inequality at all the \(L^p\) scales, and we will therefore refer to (1.4) as the quantitative John–Nirenberg inequality at the \(L^p\) scale. The aim of this work is to obtain variants of the quantitative John–Nirenberg inequality in the spirit of (1.4) by replacing the \(L^p\) norms by different norms. To be precise, the main topic of this paper is the search for a method that allows one to get precise inequalities like (1.4) for \(\mathrm {BMO}(\mathrm {d}\mu )\) functions beyond the \(L^p(\mathrm {d}w)\) scale. Here and in the remainder of this work, we denote by \(\mathrm {d}w\) the measure given by \(w(x)\mathrm {d}x\), where w is a weight, that is, a nonnegative locally integrable function in \({\mathbb {R}}^n\).

A possible approach to extending (1.4) is to study different function spaces endowed with (at least) a quasi-norm that allows us to define a sort of local average or, pushing the approach even further, to replace the averages on the left-hand side by a family of norms or quasi-norms with uniformly bounded quasi-triangle inequality constants. The precise definitions will be given in detail in Definition 2.2. Accepting for a moment that we do have such a notion, our aim will be to find a method giving estimates of the form

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{Z_Q}\le c(\mu )\psi ({\mathcal {Z}})\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} \end{aligned}$$
(1.5)

for every cube \(Q\in {\mathcal {Q}}\), where \({\mathcal {Z}}=\{Z_Q\}_{Q\in {\mathcal {Q}}}\) is the aforementioned family of norms or quasi-norms and \(\psi ({\mathcal {Z}})\) is a constant depending on the family. These \(Z_Q\) norms could be given, for instance, in terms of modified averaged measures of the form \(\mathrm {d}\nu /Y(Q)\), where \(Y:{\mathcal {Q}}\rightarrow (0,\infty )\) is some functional defined over cubes.

Let us depict a possible and quite natural path for getting results of this type. Take a function \(\phi \) and suppose that the local Luxemburg type norm

$$\begin{aligned} \Vert f\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }:=\inf \left\{ \uplambda >0:\frac{1}{\mu (Q)}\int _Q\phi \left( \frac{|f(x)|}{\uplambda }\right) \,\mathrm {d}\mu (x)\le 1 \right\} , \end{aligned}$$
(1.6)

is well defined for every cube Q in \({\mathbb {R}}^n\). If \(\phi \) is an increasing function with \(\phi (0)=0\) which is absolutely continuous on every compact interval of \([0,\infty )\), then we know by Fubini’s theorem that the following so-called layer-cake representation formula holds:

$$\begin{aligned} \int _{Q}\phi \left[ |f(x)|\right] \,\mathrm {d}\mu (x)=\int _0^\infty \phi '(t)\mu \left( \left\{ x\in Q:|f(x)|>t\right\} \right) \,\mathrm {d}t, \end{aligned}$$

for any cube Q of \({\mathbb {R}}^n\) and any measurable function f. Let us suppose that \(f\in \mathrm {BMO}(\mathrm {d}\mu )\). We know then that f satisfies the John–Nirenberg inequality (1.1) and so, for any \(\uplambda >0\),

$$\begin{aligned} \frac{1}{\mu (Q)}\int _Q\phi \left( \frac{|f(x)-f_{Q,\mu }|}{\uplambda }\right) \,\mathrm {d}\mu (x)&= \frac{1}{\mu (Q)}\int _0^\infty \phi '(t)\mu \left( \left\{ x\in Q: |f(x)-f_{Q,\mu }|>\uplambda t \right\} \right) \,\mathrm {d}t\\ &\le c_1 \int _0^\infty \phi '(t)e^{-\uplambda t/\left( c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\right) }\,\mathrm {d}t \\ &= c_1{\mathcal {L}}\{\phi '\}\left( \frac{\uplambda }{c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}} \right) , \end{aligned}$$

where \({\mathcal {L}}\) represents the Laplace transform. If in addition the function \(\phi \) is convex, then one has that \( \phi '\) is positive, which makes \({\mathcal {L}}\{\phi '\}\) a decreasing function on \((0,\infty )\). Therefore, we can invert it and so, we know that \(c_1{\mathcal {L}}\{\phi '\}\left( \uplambda /c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\right) \le 1\) if and only if \(\uplambda \ge c_2 \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}{\mathcal {L}}\{\phi '\}^{-1}\left( \frac{1}{c_1}\right) \). Hence, for any function \(f\in \mathrm {BMO}(\mathrm {d}\mu )\), and for a function \(\phi \) as the one depicted above, we have that

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}{\mathcal {L}}\{\phi '\}^{-1}\left( \frac{1}{c_1}\right) . \end{aligned}$$
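
As an illustration, consider the choice \(\phi (t)=e^t-1\) (this particular function is our own example and is not needed in the sequel). It is increasing and convex with \(\phi (0)=0\), and, writing \(s=\uplambda /\left( c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\right) \), one computes

$$\begin{aligned} {\mathcal {L}}\{\phi '\}(s)=\int _0^\infty e^{t}e^{-st}\,\mathrm {d}t=\frac{1}{s-1},\qquad s>1, \end{aligned}$$

while \({\mathcal {L}}\{\phi '\}(s)=\infty \) for \(0<s\le 1\). Hence \(c_1{\mathcal {L}}\{\phi '\}(s)\le 1\) if and only if \(s\ge 1+c_1\), that is, \({\mathcal {L}}\{\phi '\}^{-1}\left( \frac{1}{c_1}\right) =1+c_1\), and the estimate above becomes

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le (1+c_1)\,c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}. \end{aligned}$$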

Moreover, given any doubling measure \(\mu \), there exists a function \({\tilde{f}}\in \mathrm {BMO}(\mathrm {d}\mu )\) satisfying that

$$\begin{aligned} \mu \left( \left\{ x\in Q:|{\tilde{f}}(x)-{\tilde{f}}_{Q,\mu }|>t\right\} \right) \ge C(\mu )e^{-t/ c(\mu ) }\mu (Q),\qquad t>0 \end{aligned}$$

for any cube Q in \({\mathbb {R}}^n\), where \(C(\mu )\) and \(c(\mu )\) are positive constants depending only on the underlying measure \(\mu \). This proves that the exponential behaviour of the level sets in the John–Nirenberg inequality (1.1) is the best one can get in general for \(\mathrm {BMO}(\mathrm {d}\mu )\) functions. It also says that the estimate

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_2\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}{\mathcal {L}}\{\phi '\}^{-1}\left( \frac{1}{c_1}\right) \end{aligned}$$

for every cube Q in \({\mathbb {R}}^n\) is essentially optimal, since there is a function \({\tilde{f}}\in \mathrm {BMO}(\mathrm {d}\mu )\) and positive constants \(C(\mu )\) and \(c(\mu )\) such that

$$\begin{aligned} \Vert {\tilde{f}}-{\tilde{f}}_{Q,\mu }\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\ge c(\mu )\Vert {\tilde{f}}\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}{\mathcal {L}}\{\phi '\}^{-1}\left( \frac{1}{C(\mu )}\right) \end{aligned}$$

for every cube Q in \({\mathbb {R}}^n\).

This then provides a method for proving quantitative John–Nirenberg inequalities like (1.5) with an optimal control of the constant \(\psi ({\mathcal {Z}})\) as long as the family of norms is given by a Luxemburg norm defined by a function \(\phi \) like the one considered above. Note that this approach gives an alternative proof of the sharp inequality (1.4), as the \(L^p\) norm is a particular case of the Luxemburg norm given above if we choose \(\phi _p(t)=t^p\), and the quantity \({\mathcal {L}}\{\phi _p'\}^{-1}\left( \frac{1}{C}\right) \) behaves asymptotically like p when \(p\rightarrow \infty \), for any \(C>0\). However, although it is easy to compute the inverse of the Laplace transform of \(\phi _p'\), this does not seem to be the case for other functions \(\phi \). Also, the method is confined to the study of norms given by the Luxemburg method in terms of some special functions \(\phi \), and this rules out interesting norms such as, for instance, those of variable Lebesgue spaces. It is our purpose in this paper to give a general procedure which allows one to prove a quantitative John–Nirenberg inequality like (1.5) in a wider context without computing the inverse of a Laplace transform.
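
For the reader's convenience, we sketch the computation for \(\phi _p(t)=t^p\), \(p\ge 1\). First, the condition \(\frac{1}{\mu (Q)}\int _Q\left( |f(x)|/\uplambda \right) ^p\,\mathrm {d}\mu (x)\le 1\) holds if and only if \(\uplambda \ge \left( \frac{1}{\mu (Q)}\int _Q|f(x)|^p\,\mathrm {d}\mu (x)\right) ^{1/p}\), so the Luxemburg norm (1.6) is exactly the \(L^p\) average. Second, since \(\phi _p'(t)=pt^{p-1}\),

$$\begin{aligned} {\mathcal {L}}\{\phi _p'\}(s)=p\int _0^\infty t^{p-1}e^{-st}\,\mathrm {d}t=\frac{\Gamma (p+1)}{s^p},\qquad \text {so that}\qquad {\mathcal {L}}\{\phi _p'\}^{-1}\left( \frac{1}{C}\right) =\left( C\,\Gamma (p+1)\right) ^{1/p}. \end{aligned}$$

By Stirling's formula, \(\Gamma (p+1)^{1/p}\) is asymptotic to \(p/e\) and \(C^{1/p}\rightarrow 1\) as \(p\rightarrow \infty \), which gives the linear growth in p.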

Our method is based on a generalization of the self-improving result [18, Theorem 1.5], in which a very simple method based on the Calderón-Zygmund decomposition is used (see also [11, pp. 31–32], where the original ideas inspiring the general result can be found). The special case of [18, Theorem 1.5] which is of interest for us is the following.

Theorem A

Let \(\mu \) be a doubling measure in \({\mathbb {R}}^n\). There exists a geometric constant \(c(\mu )>0\) such that, given any \(p\ge 1\), the inequality

$$\begin{aligned} \left( \frac{1}{\mu (Q)}\int _Q |f(x)-f_{Q,\mu }|^p\,\mathrm {d}\mu (x)\right) ^{1/p}\le c(\mu )\cdot p\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} \end{aligned}$$

holds for any cube Q in \({\mathbb {R}}^n\) and any function \(f\in \mathrm {BMO}(\mathrm {d}\mu )\).

The main contribution of our work is to provide analogous results for more general objects on the left-hand side of the inequality in Theorem A. To that end, we will develop further some ideas from [18] and will also consider some of the concepts appearing in [16], thus including in the theory more general \(\mathrm {BMO}\) spaces defined by different oscillations.

Now, to be able to present the main result, we need to describe the key concept for our purposes, the family of norms. We will be relying upon families of norms \(\Vert \cdot \Vert _{Z_{Q}}\) with \(Q\in {\mathcal {Q}}.\) These families can have as a particular case local averages as the one given in (1.6). A possible choice for these families is given by the construction \(\Vert f\Vert _{Z_{Q}}:=\Vert f\cdot \chi _{Q}\Vert _{X\left( {\mathbb {R}}^{n},\mathrm {d}\nu /Y(Q)\right) }\) where X is given by an integral expression and \(Y:{\mathcal {Q}}\rightarrow (0,\infty )\) is a functional. This is for instance the case of function norms defined by a Luxemburg norm, and is the approach we took for our examples. Another possible choice for the definition of a local average is \(\Vert f\Vert _{X\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) }:=\frac{\Vert f\cdot \chi _{Q}\Vert _{X\left( {\mathbb {R}}^{n},\mathrm {d}\nu \right) }}{\Vert \chi _{Q}\Vert _{X\left( {\mathbb {R}}^{n},\mathrm {d}\nu \right) }}\), which makes sense for any quasi-normed function space over the measure space \(({\mathbb {R}}^{n},\mathrm {d}\nu )\). This is the choice made for instance in [6].

As we already mentioned, in the case \(\Vert f\Vert _{Z_{Q}}:=\Vert f\cdot \chi _{Q}\Vert _{X\left( {\mathbb {R}}^{n},\mathrm {d}\nu /Y(Q)\right) }\), we gain generality in our results by considering the functional Y defined over the family of cubes in \({\mathbb {R}}^n\). Trivial examples are \(Y(Q)=w(Q)\) or the functional \(Y(Q)=w_r(Q)\) defined by

$$\begin{aligned} w_r(Q):=\mu (Q)^{1/r'}\left( \int _Q w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}, \qquad r>1, \end{aligned}$$

for a weight w (see the discussion after (2.5) for more details about this functional \(w_r\)). In this case, the conditions we impose on the family of quasi-norms are actually conditions on the functional Y.

We present now our general theorem, that can be seen as a template from which we can derive a series of particular cases of self-improving results for different classical function spaces.

Theorem 1.1

Let \(\mu \) be a doubling measure on \({\mathbb {R}}^n\). Let \({\mathcal {Z}}=\{Z_Q\}_{Q\in {\mathcal {Q}}}\) be a family of Banach spaces or quasi-Banach spaces with triangle inequality constant uniformly bounded by \(K\ge 1\). Assume that \({\mathcal {Z}}\) satisfies the generalized \(A_\infty (\mathrm {d}\mu )\) condition with associated increasing bijection \(\Psi \) (see Definition 2.3) and that it is good (see Definition 2.4). Then there is a constant \(C(\mu ,\Psi )>0\) such that, for any \(f\in \mathrm {BMO}(\mathrm {d}\mu )\), the following holds:

$$\begin{aligned} \left\| (f-f_{Q,\mu })\chi _Q\right\| _{Z_Q}\le C\left( \mu ,\Psi \right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )},\qquad Q\in {\mathcal {Q}}. \end{aligned}$$

Moreover, we can take

$$\begin{aligned} C\left( \mu , \Psi \right) :=\inf _{L>\max \left\{ 1,\left[ {\Psi } \left( ({C_{\mathcal {Z}}}\cdot K)^{-1} \right) \right] ^{-1}\right\} } c_\mu 2^{n_\mu }\frac{L }{ 1- {C_{\mathcal {Z}}}\cdot K\cdot {\Psi }^{-1}\left( \frac{1}{L}\right) }, \end{aligned}$$

where \({C_{\mathcal {Z}}}\) is the constant in the \(A_\infty (\mathrm {d}\mu )\) condition for \({\mathcal {Z}}\).
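
To get a feeling for the size of this constant, suppose, purely for illustration, that \(\Psi ^{-1}(t)=t^{1/p}\) for some \(p\ge 1\) and that \({C_{\mathcal {Z}}}=K=1\). These values are an assumption made here; we expect them to correspond to the \(L^p\) averages of Theorem A, but this has to be checked against Definition 2.3. The infimum then runs over \(L>1\), and the admissible choice \(L=(1+1/p)^p\) gives \(\Psi ^{-1}(1/L)=p/(p+1)\), so that

$$\begin{aligned} C\left( \mu ,\Psi \right) \le c_\mu 2^{n_\mu }\frac{(1+1/p)^p}{1-\frac{p}{p+1}}=c_\mu 2^{n_\mu }\left( 1+\frac{1}{p}\right) ^p(p+1)\le c_\mu 2^{n_\mu }\,e\,(p+1). \end{aligned}$$

Under these assumptions the constant grows linearly in p, in agreement with Theorem A.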

In such a generality, it is not easy to grasp the reach of the theorem, but its power becomes clear in light of the large variety of particular examples that can be treated in a unified manner.

Two different explicit examples will be given. The first one provides a quantitative John–Nirenberg inequality like (1.5) for Orlicz type norms \(\Vert \cdot \Vert _{\phi (L)(w)}\) defined by submultiplicative Young functions. The application of this approach to the specific norms \(\Vert \cdot \Vert _{L^p\log ^\alpha L(\mathrm {d}x)}\), \(p\ge 1\), \(\alpha \ge 0\) is investigated. In this case, the following result is obtained.

Corollary 1.1

Let \(\mu \) be a doubling measure in \({\mathbb {R}}^n\) and consider \(p> 1\), \(\alpha \ge 0\). Then

$$\begin{aligned} \begin{aligned} \Vert f-f_{Q,\mu }&\Vert _{L^p\log ^\alpha L\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_\mu 2^{n_{\mu }}e2^{\alpha }\left( p+\alpha +1\right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} \end{aligned} \end{aligned}$$

for every cube Q in \({\mathbb {R}}^n\) and every function \(f\in \mathrm {BMO}(\mathrm {d}\mu )\).

Observe that this extends the classical case in Theorem A to a wider collection of spaces, as the precise estimate for the \(L^p\) case is obtained by taking \(\alpha =0\).
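
Explicitly, and assuming that \(L^p\log ^0 L\) is understood as \(L^p\), setting \(\alpha =0\) in Corollary 1.1 and using \(p+1\le 2p\) for \(p\ge 1\) gives

$$\begin{aligned} \left( \frac{1}{\mu (Q)}\int _Q |f(x)-f_{Q,\mu }|^p\,\mathrm {d}\mu (x)\right) ^{1/p}\le c_\mu 2^{n_{\mu }}e\left( p+1\right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}\le 2e\,c_\mu 2^{n_{\mu }}\cdot p\,\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}, \end{aligned}$$

which is the inequality in Theorem A with the admissible choice \(c(\mu )=2e\,c_\mu 2^{n_\mu }\).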

The second example which will be presented corresponds to the variable Lebesgue norms \(\Vert \cdot \Vert _{L^{p(\cdot )}(\mathrm {d}x)}\), which shows that our method is more flexible than the one based on the use of the Laplace transform. The precise statement is the following (see Sect. 4.2 for the precise details and definitions).

Corollary 1.2

Consider an essentially bounded exponent function \(p:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) with finite essential upper bound. There exists a constant \(C(n)>0\) such that

$$\begin{aligned} \Vert f-f_Q\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le C(n) p^+ \Vert f\Vert _{\mathrm {BMO}} \end{aligned}$$

for every cube Q in \({\mathbb {R}}^n\) and every function \(f\in \mathrm {BMO}\).

From such an inequality, we deduce the following generalized John–Nirenberg inequality.

Corollary 1.3

Let \(p:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be an essentially bounded exponent function with finite essential upper bound. There exists a constant \(C(n,p^+)>0\) such that the John–Nirenberg type inequality

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le 2 e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}} \end{aligned}$$

holds for every cube Q in \({\mathbb {R}}^n\) and every function \(f\in \mathrm {BMO}\).

This John–Nirenberg type inequality is related to that obtained in [6], where a different \(L^{p(\cdot )}\) average is considered. It is a remarkable fact that no further condition has to be imposed on p to satisfy the above inequalities, in contrast with the result in [6], where, besides the essential uniform boundedness, local log-Hölder conditions on the exponent function p are imposed.

The type of techniques which are studied here are flexible enough to be applicable in many different situations. An example of this is the fact that new generalized Karagulyan type estimates can be obtained under suitable conditions for these quasi-norms.

Throughout this work, we will write \(A\lesssim B\) whenever there is some constant \(C>0\), independent of the relevant parameters, such that \(A\le C\cdot B\). We will stress the dependence of some constant C on a certain parameter \(\alpha \) by including it in parentheses like this: \(C(\alpha )\). The notation \(A\gtrsim B\) will mean that \(B\lesssim A\), and \(A\asymp B\) will be used in case both \(A\lesssim B \) and \(A\gtrsim B\) hold at the same time.

The rest of the paper is organized as follows: in Sect. 2 we recall some previous self-improving results and we discuss their hypotheses. This leads us to consider a generalization of \(A_\infty \) weights in relation to \(L^p\) norms which we later extend to the context of general quasi-normed spaces. In Sect. 3 we use the generalized \(A_\infty \) condition to settle Theorem 1.1. Section 4 is devoted to providing corollaries of Theorem 1.1, among which Corollaries 1.1 and 1.2 are included. We include an appendix with the proof of Theorem 2.1, which is a generalization to the setting of doubling measures of [16, Theorem 1.2].

2 The \(A_\infty \) condition of a functional with respect to a quasi-norm

In this section we provide the fundamental tools and concepts used to prove the main results of the paper. Some aspects of the previous self-improving result [18, Theorem 1.5] and some results in [16] will be discussed to motivate one of the new concepts which will be introduced here, namely, the generalized \(A_\infty \) type condition adapted to general quasi-normed function spaces. The basic assumptions on the quasi-norms we will consider in this work will also be introduced here. The core ideas for the results in this paper come essentially from [18, Theorem 1.5]. We state this result here for the convenience of the reader. We first recall that a weight w is an \(A_\infty \) weight if there exist some \(\delta ,C>0\) such that, given a cube Q in \({\mathbb {R}}^n\), the inequality

$$\begin{aligned} \frac{w(E)}{w(Q)}\le C\left( \frac{\mu (E)}{\mu (Q)}\right) ^\delta \end{aligned}$$
(2.1)

holds for any measurable subset \(E\subset Q\). We also recall the standard notation \(\Delta (Q)\) for the family of countable disjoint families of subcubes of a given cube Q.

Theorem B

Let \(\mu \) be a doubling measure in \({\mathbb {R}}^n\) and consider \(w\in A_\infty (\mathrm {d}\mu )\). Suppose that a functional \(a:{\mathcal {Q}}\rightarrow (0,\infty )\) satisfies the \(SD_p^s(w)\) condition, namely, that there exist \(p\ge 1\), \(s>1\) and \(\Vert a\Vert >0\) such that, for every cube Q in \({\mathbb {R}}^n\), the inequality

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\left( \frac{a(Q_j)}{a(Q)}\right) ^p\frac{w(Q_j)}{w(Q)}\right) ^{1/p}\le \Vert a\Vert \left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right) ^{1/s} \end{aligned}$$
(2.2)

holds for any \(\{Q_j\}_{j\in {\mathbb {N}}}\) in \(\Delta (Q)\). There exists a constant \(C(\mu )>0\) such that, for every \(f\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,\mathrm {d}\mu )\) with

$$\begin{aligned} \frac{1}{\mu (Q)}\int _Q|f(x)-f_{Q,\mu }|\,\mathrm {d}\mu (x)\le a(Q),\qquad Q\in {\mathcal {Q}}, \end{aligned}$$
(2.3)

the estimate

$$\begin{aligned} \left( \frac{1}{\mu (Q)}\int _Q|f(x)-f_{Q,\mu }|^p\,\mathrm {d}\mu (x)\right) ^{1/p}\le C(\mu )\,s\,\Vert a\Vert ^s\, a(Q), \end{aligned}$$
(2.4)

holds for any cube Q in \({\mathbb {R}}^n\).

The proof of Theorem B is based on a Calderón-Zygmund decomposition which takes advantage of the two main hypotheses of the result, the \(SD_p^s(w)\) condition and the \(A_\infty (\mathrm {d}\mu )\) condition on w.

As already observed in [18, Remark 1.6], the \(A_\infty (\mathrm {d}\mu )\) condition on w seems to be an artifice of the proof and may not be needed for getting the general result. The authors use the \(A_\infty \) condition as a tool for proving that the auxiliary functional \(a_\varepsilon (Q):=a(Q)+\varepsilon \) satisfies a smallness condition like (2.2) provided that the original functional a satisfies it. More specifically, they deal with the following computation for any cube Q and any \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\):

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\frac{a_\varepsilon (Q_j)^pw(Q_j)}{a_\varepsilon (Q)^pw(Q)}\right) ^{1/p}\le \left( \sum _{j\in {\mathbb {N}}}\frac{a (Q_j)^pw(Q_j)}{a (Q)^pw(Q)}\right) ^{1/p}+ \left( \frac{ w\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{ w(Q)}\right) ^{1/p}. \end{aligned}$$

Note that the \(A_\infty (\mathrm {d}\mu )\) condition (2.1) is what allows one to bound the second term in the sum above to finally get a smallness condition like (2.2) on \(a_\varepsilon \) for any \(\varepsilon >0\).
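
For completeness, here is a sketch of how the computation above can be justified; it only uses \(a\le a_\varepsilon \), Minkowski's inequality and the disjointness of the cubes. Since \(a_\varepsilon (Q_j)/a_\varepsilon (Q)\le a(Q_j)/a(Q)+1\), we get

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\left( \frac{a_\varepsilon (Q_j)}{a_\varepsilon (Q)}\right) ^p\frac{w(Q_j)}{w(Q)}\right) ^{1/p}&\le \left( \sum _{j\in {\mathbb {N}}}\left( \frac{a(Q_j)}{a(Q)}+1\right) ^p\frac{w(Q_j)}{w(Q)}\right) ^{1/p}\\ &\le \left( \sum _{j\in {\mathbb {N}}}\left( \frac{a(Q_j)}{a(Q)}\right) ^p\frac{w(Q_j)}{w(Q)}\right) ^{1/p}+\left( \sum _{j\in {\mathbb {N}}}\frac{w(Q_j)}{w(Q)}\right) ^{1/p}, \end{aligned}$$

and \(\sum _{j\in {\mathbb {N}}}w(Q_j)=w\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) \) because the cubes are pairwise disjoint. Applying (2.1) with \(E=\bigcup _{j\in {\mathbb {N}}}Q_j\) then yields

$$\begin{aligned} \left( \frac{w\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{w(Q)}\right) ^{1/p}\le C^{1/p}\left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{\mu (Q)}\right) ^{\delta /p}. \end{aligned}$$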

The need for this condition in a self-improving result like Theorem B has been investigated in [14], where the first author studies alternative arguments avoiding the \(A_\infty (\mathrm {d}\mu )\) condition on the weight to get a self-improving result of that kind. Although the results there are not fully satisfactory in the sense that they do not recover the improvement (2.4) without the \(A_\infty (\mathrm {d}\mu )\) condition, they are good enough to get a new unified approach for getting classical and fractional weighted Poincaré-Sobolev inequalities. The approach taken there consists in replacing the weight w in the \(SD_p^s(w)\) condition (2.2) by a slightly more general functional \(w_r\) defined by

$$\begin{aligned} w_r(Q):=\mu (Q)^{1/r'}\left( \int _Q w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}, \end{aligned}$$

thus getting a modified \(SD_p^s(w)\) condition which reads as follows

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\left( \frac{a(Q_j)}{a(Q)}\right) ^p\frac{w_r(Q_j)}{w_r(Q)}\right) ^{1/p}\le \Vert a\Vert \left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right) ^{1/s} \end{aligned}$$
(2.5)

for any cube Q and any \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\).

Observe that \(w_r(Q)\) is the result of applying Jensen’s inequality to the classical functional defined by w(Q) for any cube \(Q\in {\mathcal {Q}}\). Functionals of this kind already appeared in several works, for instance [2, 3, 17], in which the authors study sufficient conditions for the two-weighted weak and strong-type (respectively) boundedness of fractional integrals, Calderón-Zygmund operators and commutators. There, one can find the following straightforward properties of \(w_r\):

  1. (1)

    \(w(E)\le w_r(E)\) for any measurable set E of nonzero measure.

  2. (2)

    If \(E\subset F\) are two sets of nonzero measure, then

    $$\begin{aligned} w_r(E)\le \left( \frac{\mu (E)}{\mu (F)}\right) ^{1/r'}w_r(F). \end{aligned}$$
    (2.6)
  3. (3)

    If \(E=\bigcup _{j\in {\mathbb {N}}}E_j\) for some disjoint family \(\{E_j\}_{j\in {\mathbb {N}}}\), then

    $$\begin{aligned} \sum _{j\in {\mathbb {N}}}w_r(E_j) \le w_r(E). \end{aligned}$$
    (2.7)
  4. (4)

    If two measurable sets E and F satisfy \(E\subset F\), then

    $$\begin{aligned} w_r(E)\le w_r(F). \end{aligned}$$
    (2.8)
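
For the reader's convenience, we sketch why the first three properties hold; the argument only uses Hölder's inequality with exponents \(r'\) and r and is not taken from the references above. For property (1),

$$\begin{aligned} w(E)=\int _E w(x)\,\mathrm {d}\mu (x)\le \mu (E)^{1/r'}\left( \int _E w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}=w_r(E). \end{aligned}$$

For (2.6), if \(E\subset F\), then

$$\begin{aligned} w_r(E)\le \mu (E)^{1/r'}\left( \int _F w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}=\left( \frac{\mu (E)}{\mu (F)}\right) ^{1/r'}w_r(F). \end{aligned}$$

For (2.7), Hölder's inequality for sums and the disjointness of the sets \(E_j\) give

$$\begin{aligned} \sum _{j\in {\mathbb {N}}}\mu (E_j)^{1/r'}\left( \int _{E_j} w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}\le \left( \sum _{j\in {\mathbb {N}}}\mu (E_j)\right) ^{1/r'}\left( \sum _{j\in {\mathbb {N}}}\int _{E_j} w(x)^r\,\mathrm {d}\mu (x)\right) ^{1/r}=w_r(E). \end{aligned}$$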

The above properties are what allow one to prove a smallness condition for the perturbations \(a_\varepsilon \) of a functional a. In particular, condition (2.6) is what makes it possible to work with these perturbations without assuming the \(A_\infty (\mathrm {d}\mu )\) condition on the weight.

Also related to this problem, and closer to the results which will be studied here, is the work [16], where the embedding of \(\mathrm {BMO}(\mathrm {d}\mu )\) into certain weighted \(\mathrm {BMO}\) spaces is characterized. To be precise, the authors consider the following weighted \(\mathrm {BMO}\) spaces.

Definition 2.1

Let us consider a positive functional \(Y:{\mathcal {Q}}\rightarrow (0,\infty )\) defined over the family \({\mathcal {Q}}\) of all cubes of \({\mathbb {R}}^n\). Consider a measure \(\mu \) in \({\mathbb {R}}^n\) and pick a weight \(v\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,\mu )\). We define the class of functions with bounded \((v\,\mathrm {d}\mu ,Y)\)-mean oscillations as

$$\begin{aligned} \mathrm {BMO}_{v\,\mathrm {d}\mu ,Y}:=\left\{ f\in L_{\mathrm {loc}}^1({\mathbb {R}}^n,\mathrm {d}\mu ):\Vert f\Vert _{\mathrm {BMO}_{v\,\mathrm {d}\mu ,Y}}<\infty \right\} , \end{aligned}$$
(2.9)

where

$$\begin{aligned} \Vert f\Vert _{\mathrm {BMO}_{v\,\mathrm {d}\mu ,Y}}:= \sup _{Q\in {\mathcal {Q}}}\frac{1}{Y(Q)}\int _Q|f(x)-f_{Q,\mu }|v(x)\,\mathrm {d}\mu (x). \end{aligned}$$
(2.10)

For the special case \(v=1\), \(Y(Q)=\mu (Q)\) the notation \(\mathrm {BMO}(\mathrm {d}\mu )\) will be adopted.

It is one of the main results in [16] that the embedding inequality

$$\begin{aligned} \Vert f\Vert _{\mathrm {BMO}_{v\,\mathrm {d}x,Y}}\le B\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}x)} \end{aligned}$$

is valid if and only if the weight v and the functional Y satisfy the Fujii-Wilson type \(A_\infty \) condition

$$\begin{aligned}{}[v]_{A_{\infty ,Y}}:=\sup _{Q\in {\mathcal {Q}}}\frac{1}{Y(Q)}\int _Q M(v\chi _Q)(x)\,\mathrm {d}x<\infty , \end{aligned}$$
(2.11)

which, in case \(Y(Q)=|Q|\), coincides with the \(A_\infty \) condition stated in (2.1) (see [5]). The following theorem generalizes the aforementioned result in [16] to the setting of doubling measures. A proof of it is provided in Appendix A.

Theorem 2.1

Let \(\mu \) be a doubling measure in \({\mathbb {R}}^n\) and consider a functional \(Y:{\mathcal {Q}}\rightarrow (0,\infty )\). The following two conditions on a weight \(w\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,\mathrm {d}\mu )\) are equivalent:

  1. (1)

    There is some constant \(B>0\) such that

    $$\begin{aligned} \frac{1}{Y(Q)}\int _Q|f(x)-f_{Q,\mu }|\,\mathrm {d}w(x)\le B \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} \end{aligned}$$

    for every function \(f\in \mathrm {BMO}(\mathrm {d}\mu )\) and every cube Q in \({\mathbb {R}}^n\).

  2. (2)

    The weight w is an \(A_{\infty ,Y}(\mathrm {d}\mu )\) weight, i.e.

    $$\begin{aligned}{}[w]_{A_{\infty ,Y}(\mathrm {d}\mu )}:=\sup _{Q\in {\mathcal {Q}}}\frac{1}{Y(Q)}\int _Q M_\mu (w\chi _Q)(x)\,\mathrm {d}\mu (x)<\infty . \end{aligned}$$
    (2.12)

Moreover, there exist positive constants \(C_1\) and \(C_2\) such that \(C_1B \le [w]_{A_{\infty ,Y}(\mathrm {d}\mu )}\le C_2 B\).

Note that \(Y(Q):=w_r(Q)\), \(r>1\), is a possible choice of Y in the above theorem, and thus the particular case of a constant functional in the main theorem of [14] proves that w is an \(A_{\infty ,w_r}(\mathrm {d}\mu )\) weight for any \(r>1\). Evidently, the Fujii-Wilson \(A_\infty (\mathrm {d}\mu )\) weights studied for instance in [8] are \(A_{\infty ,Y}(\mathrm {d}\mu )\) weights for the functional Y defined by \(Y(Q):=w(Q)\) for every cube Q in \({\mathbb {R}}^n\). In particular, this answers the question on the need for the \(A_\infty (\mathrm {d}\mu )\) condition in Theorem B at least in the case of a constant functional a. Indeed, on the one hand, as weights satisfying the \(A_\infty (\mathrm {d}\mu )\) condition (2.1) are precisely those satisfying the Fujii-Wilson \(A_{\infty ,w}(\mathrm {d}\mu )\) condition (2.12), Theorem B ensures that, for any weight satisfying the \(A_\infty (\mathrm {d}\mu )\) condition (2.1), the self-improving inequality

$$\begin{aligned} \frac{1}{w(Q)}\int _Q|f(x)-f_{Q,\mu }|\,\mathrm {d}w(x)\le B \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )},\qquad Q\in {\mathcal {Q}} \end{aligned}$$

holds. On the other hand, any weight for which the above self-improvement holds must be an \(A_\infty (\mathrm {d}\mu )\) weight by virtue of Theorem 2.1 and [5, Theorems 3.1 (b) and 4.2 (b)]. Thus, according to Theorem B, the \(SD_p^s(w)\) condition (2.2) for constant functionals is equivalent to the \(A_\infty (\mathrm {d}\mu )\) condition on the weight w, i.e. \(w\in A_\infty (\mathrm {d}\mu )\) if and only if there are \(s>0\) and \(C>0\) such that, given a cube Q in \({\mathbb {R}}^n\) and \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\),

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}} \frac{w(Q_j)}{w(Q)}\right) ^{1/p}\le C\left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right) ^{1/s} \end{aligned}$$
(2.13)

for some \(p\ge 1\) (or equivalently, for every \(p\ge 1\)). In fact, in this case it happens that \([w]_{A_\infty (\mathrm {d}\mu )}\asymp s/p\), where s is the best possible exponent in the above condition. More generally, a condition in the spirit of (2.13) is considered in [16], which extends the situation to more general functionals Y (including the case \(Y(Q):=w_r(Q)\), \(r>1\)) and which reads as follows: given \(p\ge 1\), there is \(s>0\) such that for any cube Q in \({\mathbb {R}}^n\) and any \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\),

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\frac{Y(Q_j)}{Y(Q)}\right) ^{1/p}\le C\left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right) ^{1/s}. \end{aligned}$$
(2.14)

This condition may be regarded as an \(A_\infty (\mathrm {d}\mu )\) condition at scale p for the functional Y where, in analogy with the case \(Y(Q):=w(Q)\), one could denote by \([Y]_{A_\infty (\mathrm {d}\mu ,p)}\) (or \([Y]_{A_\infty (\mathrm {d}\mu )}\) in case \(p=1\)) the best possible s in the above condition. Observe that this generalizes the usual case, where for an \(A_\infty (\mathrm {d}\mu )\) weight we have that \([w]_{A_\infty (\mathrm {d}\mu ,p)}\asymp p[w]_{A_\infty (\mathrm {d}\mu )}\). Also, note that, by taking into account properties (2.7) and (2.6) of \(w_r\),

$$\begin{aligned} \begin{aligned} \sum _{j\in {\mathbb {N}}}\frac{w_r(Q_j)}{w_r(Q)}\le \frac{w_r\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{w_r(Q)}\le \left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{\mu (Q)}\right) ^{1/r'}, \end{aligned} \end{aligned}$$

so \(w_r\in A_\infty (\mathrm {d}\mu )\) and \([w_r]_{A_\infty (\mathrm {d}\mu )}\le r'\) for every \(r>1\). This is the model example for the embedding result [16, Theorem 1.6], which is a particular case of [14, Theorem 2].

As announced in the Introduction, our goal in this paper is to get self-improving inequalities in the spirit of that in Theorem A (which, as already said, is the corollary of Theorem B we are interested in), replacing the \(L^p\) norms by different norms or even quasi-norms. Therefore, a brief reminder of the main concepts in the theory of quasi-normed spaces of functions is in order.

Definition 2.2

Let X be a vector space. A function \(\Vert \cdot \Vert :X\rightarrow [0,\infty )\) is called a quasi-norm if there is a constant \(K\ge 1\) such that

  1. (1)

    \(\Vert x\Vert =0\) if and only if \(x=0\).

  2. (2)

    \(\Vert \uplambda x\Vert =|\uplambda |\Vert x\Vert \) for \(\uplambda \in {\mathbb {R}}\) and \(x\in X\).

  3. (3)

    \(\Vert x_1+x_2\Vert \le K\left( \Vert x_1\Vert +\Vert x_2\Vert \right) \) for all \(x_1, x_2\in X\). The constant K will be called the geometric constant of \(\Vert \cdot \Vert \).

A quasi-norm \(\Vert \cdot \Vert \) over a vector space X will be denoted by \(\Vert \cdot \Vert _X\). In case \(K=1\), the prefix “quasi” will be dropped and \(\Vert \cdot \Vert _X\) will simply be called a norm.
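As a quick illustration of Definition 2.2 (a standard example recalled here, not taken from the discussion above), let \(\nu \) be a measure in \({\mathbb {R}}^n\) and \(0<p<1\). The functional \(\Vert f\Vert _{L^p(\mathrm {d}\nu )}:=\left( \int |f|^p\,\mathrm {d}\nu \right) ^{1/p}\) is then a quasi-norm, and one admissible geometric constant is \(K=2^{1/p-1}\). The sketch below uses \((a+b)^p\le a^p+b^p\) for \(a,b\ge 0\) and the convexity of \(t\mapsto t^{1/p}\):

$$\begin{aligned} \Vert f+g\Vert _{L^p(\mathrm {d}\nu )}&\le \left( \Vert f\Vert _{L^p(\mathrm {d}\nu )}^p+\Vert g\Vert _{L^p(\mathrm {d}\nu )}^p\right) ^{1/p} = 2^{1/p}\left( \frac{\Vert f\Vert _{L^p(\mathrm {d}\nu )}^p+\Vert g\Vert _{L^p(\mathrm {d}\nu )}^p}{2}\right) ^{1/p} \\&\le 2^{1/p-1}\left( \Vert f\Vert _{L^p(\mathrm {d}\nu )}+\Vert g\Vert _{L^p(\mathrm {d}\nu )}\right) . \end{aligned}$$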

Consider now the measure space \(({\mathbb {R}}^n,\mathrm {d}\nu )\), where \(\nu \) is a measure on \({\mathbb {R}}^n\). If \(L^0({\mathbb {R}}^n,\mathrm {d}\nu )\) is the vector lattice of all measurable functions modulo \(\nu \)-null functions, the positive cone of \(L^0({\mathbb {R}}^n,\mathrm {d}\nu )\) will be denoted by \(L^0({\mathbb {R}}^n,\mathrm {d}\nu )^+\). If \(X(\mathrm {d}\nu )\) is an order ideal of \(L^0({\mathbb {R}}^n,\mathrm {d}\nu )\) (i.e. a vector subspace of \(L^0({\mathbb {R}}^n,\mathrm {d}\nu )\) such that \(f\in X(\mathrm {d}\nu )\) for any \(f\in L^0({\mathbb {R}}^n,\mathrm {d}\nu )\) satisfying \(|f|\le |g|\) \(\nu \)-a.e. for some \(g\in X(\mathrm {d}\nu ) \)), a quasi-norm \(\Vert \cdot \Vert _{X(\mathrm {d}\nu )}\) on \(X(\mathrm {d}\nu )\) is said to be a lattice quasi-norm if \(\Vert f\Vert _{X(\mathrm {d}\nu )}\le \Vert g\Vert _{X(\mathrm {d}\nu )}\) whenever \(f,g\in X(\mathrm {d}\nu )\) satisfy \(|f|\le |g|\) \(\nu \)-a.e. In this case, the pair \((X(\mathrm {d}\nu ),\Vert \cdot \Vert _{X(\mathrm {d}\nu )})\) (or, sometimes, simply \(X(\mathrm {d}\nu )\)) is called a quasi-normed function space based on \(({\mathbb {R}}^n,\mathrm {d}\nu )\). For a given measurable subset E of \({\mathbb {R}}^n\), the notation \(\Vert f\Vert _{X(E,\mathrm {d}\nu )}:=\Vert f \chi _E\Vert _{X(\mathrm {d}\nu )}\) will be used. Recall also the discussion on the concept of local average below Theorem A in the Introduction.

The normed function spaces introduced here coincide with those called normed Köthe function spaces [20, Ch. 15], which are defined as those for which a function norm \(\rho :{\mathcal {M}}^+(\nu )\rightarrow [0,\infty ]\) is finite, where \({\mathcal {M}}^+(\nu )\) is the class of nonnegative measurable functions up to \(\nu \)-a.e. null functions. See [15, Remark 2.3 (ii)] for more details about this.

If we want to study self-improving results in the spirit of Theorem A (or, more generally, in the spirit of [16, Theorem 1.6]) for norms different from the \(L^p\) ones, a good strategy is to try to write the conditions in these theorems in terms of the \(L^p\) norm. If the resulting condition makes sense for a different norm, it may be the correct condition for such a generalization. It turns out that this strategy works. Indeed, consider the general condition (2.14) and pick any family \(\{h_j\}_{j\in {\mathbb {N}}}\) of nonnegative functions satisfying \( \left\| h_j \chi _{Q_j}\right\| _{L^p\left( Q_j,\frac{\mathrm {d}w}{Y(Q_j)}\right) }=1\), where \(\{Q_j\}_{j\in {\mathbb {N}}}\) is a family of pairwise disjoint subcubes of a cube Q. We can make the following computations:

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\frac{Y(Q_j)}{Y(Q)}\right) ^{1/p}&= \left( \sum _{j\in {\mathbb {N}}} \frac{1}{Y(Q)}\int _{Q_j} h_j(x)^p\,\mathrm {d}w(x) \right) ^{1/p} \\&= \left( \frac{1}{Y(Q)}\int _{Q }\sum _{j\in {\mathbb {N}}} h_j(x)^p\chi _{Q_j}(x)\,\mathrm {d}w(x) \right) ^{1/p} \\&= \left( \frac{1}{Y(Q)}\int _{Q} \left[ \sum _{j\in {\mathbb {N}}} h_j(x) \chi _{Q_j}(x)\right] ^p\,\mathrm {d}w(x) \right) ^{1/p} \\&= \left\| \sum _{j\in {\mathbb {N}}} h_j \chi _{Q_j}\right\| _{L^p\left( Q,\frac{\mathrm {d}w}{Y(Q)}\right) }. \end{aligned}$$

In this way, we have written the left-hand side of (2.14) in terms of the \(L^p(\mathrm {d}w)\) norm. This, and the fact that in the self-improving results there is no special reason why this left-hand side must be controlled by a power function of \(\mu \left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) /\mu (Q)\), leads us to make the following definition, in clear parallelism with the comments below (2.14).

Definition 2.3

Let \(\mu \) be a measure in \({\mathbb {R}}^n\). A family \({\mathcal {Z}}=\{Z_Q\}_{Q\in {\mathcal {Q}}}\) of Banach spaces, or of quasi-Banach spaces with uniformly bounded triangle inequality constant, will be said to satisfy an \(A_\infty (\mathrm {d}\mu )\) condition if there exist some constant \({C_{\mathcal {Z}}}>0\) and some increasing bijection \(\Psi :[0,1]\rightarrow [0,1]\) such that

$$\begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} h_j \chi _{Q_j }\right\| _{Z_Q}\le {C_{\mathcal {Z}}}\Psi ^{-1}\left[ \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right] \end{aligned}$$
(2.15)

for every \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\), \(Q\in {\mathcal {Q}}\) and every family of functions \(\{h_j\}_{j\in {\mathbb {N}}}\) satisfying \(\Vert h_j\Vert _{Z_{Q_j}(\mathrm {d}\mu )}=1\) for every \(j\in {\mathbb {N}}\).

This condition generalizes the above \(A_\infty (\mathrm {d}\mu )\) condition (2.14) for families of Banach spaces.
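To see this concretely, one may take (as an illustrative choice) \(Z_Q:=L^p\left( Q,\frac{\mathrm {d}w}{Y(Q)}\right) \) and \(\Psi (t):=t^s\), so that \(\Psi ^{-1}(t)=t^{1/s}\). For nonnegative functions \(h_j\) normalized in \(Z_{Q_j}\), the computation preceding Definition 2.3 turns (2.15) into (2.14) with \(C={C_{\mathcal {Z}}}\):

$$\begin{aligned} \left( \sum _{j\in {\mathbb {N}}}\frac{Y(Q_j)}{Y(Q)}\right) ^{1/p}=\left\| \sum _{j\in {\mathbb {N}}} h_j \chi _{Q_j}\right\| _{L^p\left( Q,\frac{\mathrm {d}w}{Y(Q)}\right) }\le {C_{\mathcal {Z}}}\left( \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right) ^{1/s}. \end{aligned}$$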

With this condition at hand, it is possible to prove a new self-improving result which generalizes [18, Theorem 1.5], [16, Theorem 1.6] and [14, Theorem 2] in the case a constant functional a is considered. First some technical lemmas have to be proved. We will start by imposing some conditions on the family of norms that we will deal with.

Definition 2.4

Let \({\mathcal {Z}}=\{Z_Q\}_{Q\in {\mathcal {Q}}}\) be a family of Banach spaces, or of quasi-Banach spaces with uniformly bounded triangle inequality constant, and let \(\nu \) be a measure in \({\mathbb {R}}^n\). The family \({\mathcal {Z}}\) will be said to be good if:

  1. (1)

    (Fatou’s property) If \(\{f_k\}_{k\in {\mathbb {N}}}\) are positive functions in \(Z_Q(\mathrm {d}\nu )\) with \(f_k\uparrow f\) \(\nu \)-a.e. then \(\Vert f_k\Vert _{Z_Q}\uparrow \Vert f\Vert _{Z_Q}\).

  2. (2)

    \(\Vert \chi _Q\Vert _{Z_Q}\le 1\) for every cube Q in \({\mathbb {R}}^n\). This will be called the average property of \({\mathcal {Z}}\).

Example 2.1

Consider \(L^{p}({\mathbb {R}}^{n},\mathrm {d}\nu )\) with \(p\ge 1\). If we choose

$$\begin{aligned} \Vert f\Vert _{Z_{Q}}:=\Vert f\cdot \chi _{Q}\Vert _{L^p\left( {\mathbb {R}}^n, {\text {d}} \nu /Y(Q)\right) } \end{aligned}$$

where Y is any functional satisfying that \(\nu (Q)\le Y(Q)\), for every cube Q in \({\mathbb {R}}^{n}\), then \(\{Z_{Q}\}_{Q\in {\mathcal {Q}}}\) is a good family. Note that this example relies upon localized \(L^p\) norms. Analogous examples could be provided localizing weak Lebesgue spaces, which are defined for \(0<p<\infty \) as

$$\begin{aligned} L^{p,\infty }({\mathbb {R}}^{n},\mathrm {d}\nu ):=\left\{ f\in L^{0}({\mathbb {R}}^{n},\mathrm {d}\nu ):\Vert f\Vert _{L^{p,\infty }({\mathbb {R}}^{n},\mathrm {d}\nu )}<\infty \right\} , \end{aligned}$$

where \(\nu \) can be the usual underlying doubling measure \(\mu \) or any other measure depending or not on \(\mu \). Here we use the standard notation \(\Vert f\Vert _{L^{p,\infty }({\mathbb {R}}^{n},\mathrm {d}\nu )}\) for the weak norm defined as

$$\begin{aligned} \Vert f\Vert _{L^{p,\infty }({\mathbb {R}}^{n},\mathrm {d}\nu )}:=\sup _{t>0}t\nu \left( \{x\in {\mathbb {R}}^{n}:|f(x)|>t\}\right) ^{\frac{1}{p}}. \end{aligned}$$
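For the localized \(L^p\) norms of Example 2.1, the average property (2) of Definition 2.4 can be checked directly. The following one-line verification is only a sketch; it uses the assumption \(\nu (Q)\le Y(Q)\) and tacitly assumes \(Y(Q)>0\):

$$\begin{aligned} \Vert \chi _Q\Vert _{Z_Q}=\left( \frac{1}{Y(Q)}\int _Q\,\mathrm {d}\nu \right) ^{1/p}=\left( \frac{\nu (Q)}{Y(Q)}\right) ^{1/p}\le 1. \end{aligned}$$

Fatou's property (1) for these norms follows from the monotone convergence theorem.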

3 A new quantitative self-improving theorem for BMO functions

In this section we prove the new general self-improving result in Theorem 1.1. We start by proving some preliminary lemmas which will allow us to reduce the proof to the case of bounded functions.

3.1 Lemmata

We first include the following trivial lemma regarding the oscillations of a function.

Lemma 3.1

Let \(f\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,d\mu )\) and let \(p\ge 1\). If E is a subset of \({\mathbb {R}}^n\) with positive and finite \(\mu \)-measure, then

$$\begin{aligned} \begin{aligned} \inf _{c\in {\mathbb {R}}}\left( \frac{1}{\mu (E)}\int _E |f(x)-c|^pd\mu (x)\right) ^{1/p}&\le \left( \frac{1}{\mu (E)}\int _E |f(x)-f_E|^pd\mu (x)\right) ^{1/p}\\&\le 2\inf _{c\in {\mathbb {R}}}\left( \frac{1}{\mu (E)}\int _E |f(x)-c|^pd\mu (x)\right) ^{1/p}. \end{aligned} \end{aligned}$$
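Although the lemma is elementary, a short sketch of the second inequality may be helpful (the first one is immediate, since \(c=f_E\) is an admissible choice). Here \(f_E\) is understood as the \(\mu \)-average of f over E. For any \(c\in {\mathbb {R}}\), Minkowski's inequality followed by Jensen's inequality gives

$$\begin{aligned} \left( \frac{1}{\mu (E)}\int _E |f-f_E|^p\,\mathrm {d}\mu \right) ^{1/p}&\le \left( \frac{1}{\mu (E)}\int _E |f-c|^p\,\mathrm {d}\mu \right) ^{1/p}+|c-f_E| \\&= \left( \frac{1}{\mu (E)}\int _E |f-c|^p\,\mathrm {d}\mu \right) ^{1/p}+\left| \frac{1}{\mu (E)}\int _E (c-f)\,\mathrm {d}\mu \right| \\&\le 2\left( \frac{1}{\mu (E)}\int _E |f-c|^p\,\mathrm {d}\mu \right) ^{1/p}, \end{aligned}$$

and it remains to take the infimum over \(c\in {\mathbb {R}}\).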

Recall that, for given \(L<U\), the notation \(\tau _{LU}\) is used for the function \(\tau _{LU}:{\mathbb {R}}\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} \tau _{LU}(a):={\left\{ \begin{array}{ll} L&{}\text {if }a<L,\\ a&{}\text {if }L\le a\le U,\\ U&{}\text {if }a>U. \end{array}\right. } \end{aligned}$$

These functions allow one to define the truncations \(\tau _{LU}g\) of a given function g by

$$\begin{aligned} \tau _{LU}g(x):=\tau _{LU}(g(x)),\qquad L<U,\ x\in {\mathbb {R}}^n. \end{aligned}$$

Lemma 3.2

Let \(\nu \) be any Borel measure in \({\mathbb {R}}^n\) and consider \(f\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,\mathrm {d}\nu )\). Then, for every cube Q in \({\mathbb {R}}^n\),

$$\begin{aligned} \frac{1}{\nu (Q)}\int _Q|f-f_{Q,\nu }|\,\mathrm {d}\nu\le & {} \sup _{L<U}\frac{1}{\nu (Q)}\int _Q|\tau _{LU}f-(\tau _{LU}f)_{Q,\nu }|\,\mathrm {d}\nu \\\le & {} \frac{2}{\nu (Q)}\int _Q|f-f_{Q,\nu }|\,\mathrm {d}\nu . \end{aligned}$$

Proof

Let Q be a cube in \({\mathbb {R}}^n\). Observe first that, given \(L<U\), one has \(|\tau _{LU}(a)-\tau _{LU}(b)|\le |a-b|\) for every \(a,b\in {\mathbb {R}}\). This allows us to write

$$\begin{aligned} \begin{aligned} \frac{1}{\nu (Q)}\int _Q|\tau _{LU}f-(\tau _{LU}f)_{Q,\nu }|\,\mathrm {d}\nu&\le 2 \inf _{c\in {\mathbb {R}}}\frac{1}{\nu (Q)}\int _Q|\tau _{LU}f-c|\,\mathrm {d}\nu \\&\le \frac{2}{\nu (Q)}\int _Q|\tau _{LU}f-\tau _{LU}(f_{Q,\nu })|\,\mathrm {d}\nu \\&\le \frac{2}{\nu (Q)}\int _Q|f-f_{Q,\nu }|\,\mathrm {d}\nu , \end{aligned} \end{aligned}$$

for every \(L<U\). Here Lemma 3.1 has been used.

On the other hand, by Fatou’s lemma,

$$\begin{aligned} \begin{aligned} \frac{1}{\nu (Q)}\int _Q|f-f_{Q,\nu }|\,\mathrm {d}\nu&\le \liminf _{\begin{array}{c} L\rightarrow -\infty ,\\ U\rightarrow \infty \end{array}} \frac{1}{\nu (Q)}\int _Q|\tau _{LU}f-(\tau _{LU}f)_{Q,\nu }|\,\mathrm {d}\nu \\&\le \sup _{L<U}\frac{1}{\nu (Q)}\int _Q|\tau _{LU}f-(\tau _{LU}f)_{Q,\nu }|\,\mathrm {d}\nu , \end{aligned} \end{aligned}$$

and the result follows. Here the local integrability of f was used to ensure \(f_{Q,\nu }=\lim _{\begin{array}{c} L\rightarrow -\infty ,\\ U\rightarrow \infty \end{array}}(\tau _{LU}f)_{Q,\nu }\) by dominated convergence. \(\square \)

Lemma 3.3

Let \(\mu ,\nu \) be Borel measures in \({\mathbb {R}}^n\) and let \(f\in L^1_{\mathrm {loc}}({\mathbb {R}}^n,\mathrm {d}\mu )\). Assume that \({\mathcal {Z}}=\{Z_Q\}_{Q\in {\mathcal {Q}}}\) is a good family. Then, for every cube Q in \({\mathcal {Q}}\),

$$\begin{aligned} \Vert (f-f_{Q,\mu })\chi _Q\Vert _{Z_Q}\le \sup _{L<U}\Vert (\tau _{LU}f-(\tau _{LU}f)_{Q,\mu })\chi _Q \Vert _{Z_Q}. \end{aligned}$$

Proof

Let Q be a cube in \({\mathcal {Q}}\). By Fatou’s property (1) in Definition 2.4,

$$\begin{aligned} \begin{aligned} \Vert (f -f_{Q,\mu })\chi _Q\Vert _{Z_Q}&\le \liminf _{\begin{array}{c} L\rightarrow -\infty ,\\ U\rightarrow \infty \end{array}} \Vert (\tau _{LU}f -(\tau _{LU}f)_{Q,\mu })\chi _Q\Vert _{Z_Q} \\&\le \sup _{L<U}\Vert (\tau _{LU}f -(\tau _{LU}f)_{Q,\mu })\chi _Q\Vert _{Z_Q}, \end{aligned} \end{aligned}$$

and the result follows. Here the local integrability of f was used to ensure \(f_{Q,\mu }=\lim _{\begin{array}{c} L\rightarrow -\infty ,\\ U\rightarrow \infty \end{array}}(\tau _{LU}f)_{Q,\mu }\) by dominated convergence. \(\square \)

3.2 Proof of the self-improving theorem

We are now in a position to present the proof of Theorem 1.1.

Proof

Lemmas 3.2 and 3.3 allow us to work under the assumption that f is a bounded function. Since f is in \(\mathrm {BMO}(\mathrm {d}\mu )\), for every cube P in \({\mathbb {R}}^n\) the following inequality holds:

$$\begin{aligned} \frac{1}{\mu (P)}\int _P \frac{|f(x)-f_{P,\mu }|}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\,\mathrm {d}\mu (x)\le 1. \end{aligned}$$
(3.1)

Let \(L>1\) and let Q be any cube in \({\mathbb {R}}^n\). Inequality (3.1) allows us to apply the local Calderón-Zygmund decomposition to \(\frac{f(x)-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\) on Q at level L. This gives a family of disjoint subcubes \(\{Q_j\}_{j\in {\mathbb {N}}}\subset {\mathcal {D}}(Q)\) with the properties

$$\begin{aligned} L < \frac{1}{\mu (Q_j)}\int _{Q_j}\frac{|f(x)-f_{Q,\mu }|}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\,\mathrm {d}\mu (x) \le c_\mu 2^{n_\mu } L. \end{aligned}$$
(3.2)

For a simpler presentation, let us introduce the notation

$$\begin{aligned} g(x):=\frac{f(x)-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}} \qquad \text {and} \qquad g_j(x):=\frac{f(x)-f_{Q_j,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}. \end{aligned}$$

The function \(g\chi _Q\) can be decomposed as

$$\begin{aligned} \begin{aligned} g(x)\chi _Q(x)&=\sum _{j\in {\mathbb {N}}}g(x)\chi _{Q_j}(x)+g(x)\chi _{Q\backslash \bigcup _{j\in {\mathbb {N}}} Q_j}(x)\\&=\sum _{j\in {\mathbb {N}}}\left[ g_j(x)+\frac{f_{Q_j,\mu }-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\right] \chi _{Q_j}(x)+g(x)\chi _{Q\backslash \bigcup _{j\in {\mathbb {N}}} Q_j}(x). \end{aligned} \end{aligned}$$

On the one hand, by the Lebesgue differentiation theorem,

$$\begin{aligned} \left| g(x)\chi _{Q\backslash \bigcup _{j\in {\mathbb {N}}} Q_j}(x)\right| \le L, \end{aligned}$$

for \(\mu \)-almost every \(x\in Q\) and, on the other hand, the second term in the sum

$$\begin{aligned} \sum _{j\in {\mathbb {N}}}\left[ g_j(x)+\frac{f_{Q_j,\mu }-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\right] \chi _{Q_j}(x) \end{aligned}$$

can be bounded as follows

$$\begin{aligned} \left| \frac{f_{Q_j,\mu }-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\right| \le \frac{1}{\mu (Q_j)}\int _{Q_j}\frac{|f(x)-f_{Q,\mu }|}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\,\mathrm {d}\mu (x) \le c_\mu 2^{n_\mu } L, \end{aligned}$$

for every \(j\in {\mathbb {N}}\).

Therefore, the absolute value of g can be bounded by

$$\begin{aligned} \begin{aligned} \left| g(x)\right| \chi _Q(x)&\le \sum _{j\in {\mathbb {N}}}\left| g_j(x)\right| \chi _{Q_j}(x)+ (c_\mu 2^{n_\mu }+1)L\chi _Q(x). \end{aligned} \end{aligned}$$

Hence, by using the quasi-triangle inequality, the Average property (2) from Definition 2.4 and the disjointness of the cubes \(Q_j\),

$$\begin{aligned} \left\| g\right\| _{Z_Q}\le K \left\| \sum _{j\in {\mathbb {N}}} g_j\chi _{Q_j}\right\| _{Z_Q}+C_{K,\mu }L \end{aligned}$$
(3.3)

where \(C_{K,\mu }:=K(c_\mu 2^{n_\mu }+1)\).
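The steps behind (3.3) can be spelled out as follows. This is a sketch which assumes, as in Section 2, that each \(\Vert \cdot \Vert _{Z_Q}\) is a lattice quasi-norm with geometric constant at most K, and which reads \(\Vert g\Vert _{Z_Q}\) as \(\Vert g\chi _Q\Vert _{Z_Q}\):

$$\begin{aligned} \left\| g\right\| _{Z_Q}&\le \left\| \sum _{j\in {\mathbb {N}}}|g_j|\chi _{Q_j}+(c_\mu 2^{n_\mu }+1)L\chi _Q\right\| _{Z_Q} \\&\le K\left\| \sum _{j\in {\mathbb {N}}}g_j\chi _{Q_j}\right\| _{Z_Q}+K(c_\mu 2^{n_\mu }+1)L\left\| \chi _Q\right\| _{Z_Q} \\&\le K\left\| \sum _{j\in {\mathbb {N}}}g_j\chi _{Q_j}\right\| _{Z_Q}+C_{K,\mu }L. \end{aligned}$$

The first step is the lattice property applied to the pointwise bound on \(|g|\chi _Q\), the second one uses the quasi-triangle inequality together with \(\left| \sum _{j}g_j\chi _{Q_j}\right| =\sum _{j}|g_j|\chi _{Q_j}\) (disjointness of the cubes), and the last one is the average property.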

The key property of the cubes \(\{Q_j\}_{j\in {\mathbb {N}}}\) in the Calderón-Zygmund decomposition at level L of \( g(x)=\frac{f(x)-f_{Q,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}} \chi _Q(x)\) is the fact that, by (3.2),

$$\begin{aligned} \sum _{j\in {\mathbb {N}}}\mu (Q_j) \le \sum _{j\in {\mathbb {N}}}\frac{1}{L}\int _{Q_j}|g(x)|\,\mathrm {d}\mu (x)\le \frac{1}{L}\int _{Q}|g(x)|\,\mathrm {d}\mu (x)\le \frac{\mu (Q)}{L}, \end{aligned}$$
(3.4)

where (3.1) has been used.

A brief remark is in order here to explain the main idea. Note that in (3.3) we have essentially the same object on both sides of the inequality but at different levels. That is, we are trying to control the “local average” \(\left\| g\right\| _{Z_Q}\) in terms of a local average of \(\sum _j g_j\chi _{Q_j}\). In the classical case of \(L^p\) weighted norms, the localization of the functions \(g_j\chi _{Q_j}\) allows one to move the norm inside the sum. Here, however, we will appeal to the generalized \(A_\infty \) condition for the family \({\mathcal {Z}}\). To that end, let us define

$$\begin{aligned} {\mathbb {X}} := \sup _{P\in {\mathcal {Q}}}\left\| \frac{f -f_{P,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\chi _P\right\| _{Z_P}. \end{aligned}$$
(3.5)

This supremum is finite since, by the Average property (2) from Definition 2.4 and the boundedness of f, for any cube \(P\in {\mathcal {Q}}\),

$$\begin{aligned} \left\| \frac{f -f_{P,\mu }}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}\chi _P\right\| _{Z_P}\le 2\frac{\Vert f\Vert _{L^\infty ({\mathbb {R}}^n,\mathrm {d}\mu )}}{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}}<\infty . \end{aligned}$$

This allows us to make computations with \({\mathbb {X}}\) and, in particular, to introduce the local averages of the functions \(g_j\chi _{Q_j}\) as follows:

$$\begin{aligned} \left\| g\right\| _{Z_Q}\le & {} K \left\| \sum _{j\in {\mathbb {N}}} g_j\chi _{Q_j}\right\| _{Z_Q}+ C_{K,\mu }\cdot L\\= & {} K\left\| \sum _{j\in {\mathbb {N}}} \frac{ \left\| g_j\right\| _{Z_{Q_j}}}{ \left\| g_j\right\| _{Z_{Q_j}}} g_j\chi _{Q_j}\right\| _{Z_{Q}}+ C_{K,\mu }\cdot L \end{aligned}$$

Using the \({\mathbb {X}}\) defined above, we get

$$\begin{aligned} \left\| g\right\| _{Z_{Q}}\le {\mathbb {X}}\cdot K \left\| \sum _{j\in {\mathbb {N}}} \frac{ g_j}{ \left\| g_j\right\| _{Z_{Q_j}}} \chi _{Q_j}\right\| _{Z_{Q}}+ C_{K,\mu }\cdot L \end{aligned}$$
(3.6)

This is where the generalized \(A_\infty (\mathrm {d}\mu )\) condition for \({\mathcal {Z}}\) comes into play. In particular, we get a bound that does not depend on the cube Q (recall that \(\{Q_j\}_{j\in {\mathbb {N}}}\) and Q satisfy the smallness relation (3.4)), and then one can take the supremum on the left-hand side to get

$$\begin{aligned} {\mathbb {X}}\le {\mathbb {X}}\cdot {C_{\mathcal {Z}}}\cdot K\cdot {\Psi }^{-1}\left( \frac{1}{L}\right) + C_{K,\mu }\cdot L, \end{aligned}$$

where \({C_{\mathcal {Z}}}\) is the constant in the definition of the aforementioned generalized \(A_\infty \) condition. One can now choose \(L> \max \left\{ 1,\left[ {\Psi } \left( (C_{\mathcal {Z}}\cdot K)^{-1} \right) \right] ^{-1}\right\} \). Thanks to this, it is possible to isolate \({\mathbb {X}}\) at the left-hand side as follows

$$\begin{aligned} {\mathbb {X}} \left[ 1-{C_{\mathcal {Z}}}\cdot K\cdot {\Psi }^{-1}\left( \frac{1}{L}\right) \right] \le C_{K,\mu }L. \end{aligned}$$

Equivalently,

$$\begin{aligned} {\mathbb {X}} \le C_{K,\mu }\frac{L }{ 1-{C_{\mathcal {Z}}} \cdot K\cdot {\Psi }^{-1}\left( \frac{1}{L}\right) } \end{aligned}$$

for every \(L>\max \left\{ 1,\left[ {\Psi } \left( ({C_{\mathcal {Z}}}\cdot K)^{-1} \right) \right] ^{-1}\right\} \). It just remains to optimize the right-hand side on \(L>\max \left\{ 1,\left[ {\Psi }\left( ({C_{\mathcal {Z}}}\cdot K)^{-1} \right) \right] ^{-1}\right\} \) to get the desired result. \(\square \)
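To illustrate the final optimization, consider the model case \(\Psi (t)=t^s\), \(s>0\), so that \(\Psi ^{-1}(1/L)=L^{-1/s}\). This choice, and the shorthand \(a:={C_{\mathcal {Z}}}\cdot K\) (assumed to satisfy \(a\ge 1\)), are made here only for the sake of the example. The admissible range is then \(L>a^s\) and, with \(u:=L^{1/s}\), the right-hand side equals \(C_{K,\mu }u^{s+1}/(u-a)\), which blows up as \(u\rightarrow a^+\) and as \(u\rightarrow \infty \) and has its only critical point at \(u=a(1+1/s)\). Hence, under these assumptions,

$$\begin{aligned} {\mathbb {X}}\le C_{K,\mu }\min _{L>a^s}\frac{L}{1-aL^{-1/s}}=C_{K,\mu }\,(s+1)\left[ a\left( 1+\frac{1}{s}\right) \right] ^{s}\le e\,(s+1)\,a^{s}\,C_{K,\mu }. \end{aligned}$$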

4 Applications of the self-improving theorem

As a first easy consequence of our general self-improving result, we include the following corollary regarding the classical \(A_\infty \) condition. We show that it suffices to check the usual condition with the usual power functions replaced by any increasing bijection. We refer the reader to [5] for the classical definition and a number of equivalent conditions.

Corollary 4.1

Let w be any weight satisfying, for some increasing bijection \(\Phi :[0,1]\rightarrow [0,1]\), the condition

$$\begin{aligned} \frac{w\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{w(Q)}\le C\Phi ^{-1}\left[ \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{\mu (Q)}\right] \end{aligned}$$

for every cube Q and every \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\). Then \(w\in A_\infty (\mathrm {d}\mu )\).

Proof

Considering \(\Vert \cdot \Vert _{Z_Q}=\Vert \cdot \Vert _{L^p\left( Q,\frac{\mathrm {d}w}{w(Q)}\right) }\) for each cube Q, by Theorem 1.1, we get that the weight w satisfies also that

$$\begin{aligned} \sup _{\Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}=1} \Vert f-f_{Q,\mu }\Vert _{L^p\left( Q,\frac{\mathrm {d}w}{w(Q)}\right) }<\infty , \end{aligned}$$

and so, by Theorem 2.1, it happens that \(w\in A_\infty (\mathrm {d}\mu )\). \(\square \)
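The application of Theorem 1.1 in this proof rests on checking the generalized \(A_\infty (\mathrm {d}\mu )\) condition (2.15) for this family; we assume here, as the proof suggests, that this is the hypothesis Theorem 1.1 requires. A sketch, using the computation preceding Definition 2.3 with \(Y=w\) and nonnegative functions \(h_j\) normalized in \(Z_{Q_j}\):

$$\begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} h_j \chi _{Q_j}\right\| _{L^p\left( Q,\frac{\mathrm {d}w}{w(Q)}\right) }=\left( \frac{w\left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{w(Q)}\right) ^{1/p}\le C^{1/p}\left( \Phi ^{-1}\left[ \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}}Q_j\right) }{\mu (Q)}\right] \right) ^{1/p}. \end{aligned}$$

Thus (2.15) would hold with \({C_{\mathcal {Z}}}=C^{1/p}\) and \(\Psi ^{-1}:=(\Phi ^{-1})^{1/p}\), that is, \(\Psi (t)=\Phi (t^p)\), which is again an increasing bijection of [0, 1].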

In the sequel, we present two particular examples of application of Theorem 1.1.

4.1 BMO-type improvement at the Orlicz spaces scale

The first example has to do with Orlicz norms for submultiplicative Young functions. The aim is to write a quantitative self-improving result for the control on the mean oscillations of \(\mathrm {BMO}(\mathrm {d}\mu )\) functions to a control on Orlicz mean oscillations.

A special type of convex function is used to define Orlicz norms.

Definition 4.1

A convex function \(\phi :[0, \infty )\rightarrow [0, \infty )\) is said to be a Young function if \(\phi (0)=0\) and \(\lim _{t\rightarrow \infty }\phi (t)=\infty \). Throughout the remainder of the paper we will assume that every Young function \(\phi \) has the additional property \(\phi (1)=1\). We will say that \(\phi \) is a quasi-submultiplicative Young function with associated constant \(c>0\) if, additionally, \(\phi (t_1\cdot t_2)\le c\phi (t_1)\cdot \phi (t_2)\) for every \(t_1,t_2\ge 0\). If \(c=1\) we will simply say that \(\phi \) is submultiplicative. If there is \(k>2\) such that \(\phi (2t)\le k\phi (t)\) for every \(t\ge t_0\) for some \(t_0\ge 0\), we will say that \(\phi \) satisfies the \(\Delta _2\) (or doubling) condition.

Example 4.1

As examples of doubling Young functions one can find the power functions \(\phi _p(t):=t^p\). These are clearly doubling functions since they are submultiplicative. In general, every submultiplicative Young function \(\phi \) is a doubling Young function but not only submultiplicative functions satisfy this condition, as this is also fulfilled by quasi-submultiplicative Young functions such as \(\phi _{p,\alpha }(t):=\log (e+1)^{-\alpha }t^p\log (e+t)^\alpha \), \(p\ge 1\), \(\alpha >0\).
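The implication from (quasi-)submultiplicativity to the \(\Delta _2\) condition is a one-line check. In the sketch below, c is the constant from Definition 4.1 and the choice \(t_1=2\) is made for illustration:

$$\begin{aligned} \phi (2t)\le c\,\phi (2)\,\phi (t),\qquad t\ge 0, \end{aligned}$$

so the \(\Delta _2\) condition holds with \(t_0=0\) and any \(k>2\) satisfying \(k\ge c\,\phi (2)\). For the power functions one has exactly \(\phi _p(2t)=2^p\phi _p(t)\).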

Given any Young function \(\phi \), any Borel measure \(\nu \) in \({\mathbb {R}}^n\) and any cube Q in \({\mathbb {R}}^n\) one can define the \(\phi (L)(\nu )\)-mean average of a function f over Q with the Luxemburg norm

$$\begin{aligned} \Vert f\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) }:=\inf \left\{ \uplambda >0:\frac{1}{\nu (Q)}\int _Q \phi \left( \frac{|f(x)|}{\uplambda }\right) \,\mathrm {d}\nu (x)\le 1\right\} , \end{aligned}$$

which is the localized version of the Luxemburg norm defining the Orlicz space \(\phi (L)({\mathbb {R}}^n,\mathrm {d}\nu )\) given by the finiteness of the norm

$$\begin{aligned} \Vert f\Vert _{\phi (L)\left( {\mathbb {R}}^n,\mathrm {d}\nu \right) }:=\inf \left\{ \uplambda >0: \int _{{\mathbb {R}}^n} \phi \left( \frac{|f(x)|}{\uplambda }\right) \,\mathrm {d}\nu (x)\le 1\right\} . \end{aligned}$$

Note that if \(\phi _1,\phi _2\) are Young functions satisfying \(\phi _1(t)\le \phi _2(kt)\) for \(t>t_0\), for some \(k>0\) and \(t_0\ge 0\), then \(\phi _2(L)\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) \subset \phi _1(L)\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) \) (see [12, Theorem 13.1]). Therefore one can find infinitely many Orlicz spaces different from a Lebesgue space between any two Lebesgue spaces \(L^p\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) \) and \(L^q\left( Q,\frac{\mathrm {d}\nu }{\nu (Q)}\right) \), \(p<q\).

Orlicz spaces are examples of quasi-normed function spaces with a good quasi-norm as introduced in the beginning of this section and moreover they are Banach function spaces, i.e. the quasi-norm \(\Vert \cdot \Vert _{\phi (L)\left( {\mathbb {R}}^n, \mathrm {d}\nu \right) }\) is in fact a norm and the resulting space is complete.

Let us call \({\mathcal {Z}}_\phi \) the family of norms \(\left\| \cdot \right\| _{Z_Q}=\left\| \cdot \right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\). The following lemma shows that for \({\mathcal {Z}}_\phi \) it suffices to check the \(A_\infty (\mathrm {d}\mu )\) condition in Definition 2.3 just for characteristic functions instead of considering the arbitrary functions \(h_j\).

Lemma 4.1

Let \(\phi \) be a quasi-submultiplicative Young function with associated quasi-submultiplicative constant \(c>0\). The family of norms \({\mathcal {Z}}_\phi \) satisfies a generalized \(A_\infty (\mathrm {d}\mu )\) condition with associated increasing bijection \(\Phi :[0,1]\rightarrow [0,1]\) if and only if there is \(C>0\) such that

$$\begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} \chi _{Q_j } \right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le C\Phi ^{-1}\left[ \frac{\mu \left( \bigcup _{j\in {\mathbb {N}}} Q_j\right) }{\mu (Q)}\right] \end{aligned}$$
(4.1)

for every \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\), \(Q\in {\mathcal {Q}}\).

Proof

Indeed, consider a cube Q, a sequence \(\{Q_j\}_{j\in {\mathbb {N}}}\) of disjoint subcubes of Q and \(\{h_j\}_{j\in {\mathbb {N}}}\) a sequence of functions satisfying \(\Vert h_j\Vert _{\phi (L)\left( Q_j,\frac{\mathrm {d}\mu }{\mu (Q_j)}\right) }=1\) for every \(j\in {\mathbb {N}}\). Then,

$$\begin{aligned} \begin{aligned} \frac{1}{\mu (Q)} \int _Q \phi \left( \frac{\sum _{j\in {\mathbb {N}}} h_j(x)\chi _{Q_j}(x) }{\uplambda }\right) \,\mathrm {d}\mu (x)&= \sum _{j\in {\mathbb {N}}} \frac{\mu (Q_j)}{\mu (Q)}\frac{1}{\mu (Q_j)}\int _{Q_j} \phi \left( \frac{h_j (x) }{\uplambda }\right) \,\mathrm {d}\mu (x)\\&\le c\sum _{j\in {\mathbb {N}}} \frac{\mu (Q_j)}{\mu (Q)}\phi \left( \frac{ 1}{\uplambda }\right) \frac{1}{\mu (Q_j)}\int _{Q_j}\phi (h_j(x)) \,\mathrm {d}\mu (x)\\&\le c\sum _{j\in {\mathbb {N}}}\frac{1}{\mu (Q)} \int _{Q_j}\phi \left( \frac{ 1 }{\uplambda }\right) \,\mathrm {d}\mu (x)\\&\le c \frac{1}{\mu (Q)}\int _{Q}\phi \left( \frac{ \sum _{j\in {\mathbb {N}}} \chi _{Q_j}(x) }{\uplambda }\right) \,\mathrm {d}\mu (x). \end{aligned} \end{aligned}$$

Hence,

$$\begin{aligned} \begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} h_j\chi _{Q_j} \right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }&\le c \left\| \sum _{j\in {\mathbb {N}}} \chi _{Q_j} \right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }. \end{aligned} \end{aligned}$$

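This shows that (4.1) implies the generalized \(A_\infty (\mathrm {d}\mu )\) condition. The converse follows by taking \(h_j=\chi _{Q_j}\), since \(\phi (1)=1\) gives \(\Vert \chi _{Q_j}\Vert _{\phi (L)\left( Q_j,\frac{\mathrm {d}\mu }{\mu (Q_j)}\right) }=1\), that is, characteristic functions have average 1. \(\square \)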
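The passage from the modular inequality to the norm inequality in the proof above uses convexity; a brief sketch follows. Write \(\uplambda _1:=\left\| \sum _{j\in {\mathbb {N}}}\chi _{Q_j}\right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\) (a shorthand introduced here, assumed positive), and recall that \(c\ge 1\) and that \(\phi (t/c)\le \phi (t)/c\) by convexity and \(\phi (0)=0\). Then

$$\begin{aligned} \frac{1}{\mu (Q)}\int _Q\phi \left( \frac{\sum _{j\in {\mathbb {N}}}|h_j|\chi _{Q_j}}{c\uplambda _1}\right) \,\mathrm {d}\mu&\le \frac{1}{c}\,\frac{1}{\mu (Q)}\int _Q\phi \left( \frac{\sum _{j\in {\mathbb {N}}}|h_j|\chi _{Q_j}}{\uplambda _1}\right) \,\mathrm {d}\mu \\&\le \frac{1}{\mu (Q)}\int _Q\phi \left( \frac{\sum _{j\in {\mathbb {N}}}\chi _{Q_j}}{\uplambda _1}\right) \,\mathrm {d}\mu \le 1, \end{aligned}$$

so that \(c\uplambda _1\) is admissible in the infimum defining the Luxemburg norm of \(\sum _{j\in {\mathbb {N}}}h_j\chi _{Q_j}\).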

As a consequence, we get that the family \({\mathcal {Z}}_\phi \) satisfies a generalized \(A_\infty (\mathrm {d}\mu )\) condition.

Lemma 4.2

Let \(\phi \) be a quasi-submultiplicative Young function with associated quasi-submultiplicative constant \(c>0\). Then the family \({\mathcal {Z}}_\phi \) satisfies a generalized \(A_\infty (\mathrm {d}\mu )\) condition with associated increasing bijection \(\Psi \) determined by \(\Psi ^{-1}(t):= 1/\phi ^{-1}(1/t)\).

Proof

By the above lemma, we just have to check the condition for characteristic functions. Let us then take a cube Q and any family \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\). Then, if one considers \(\uplambda _0:=1/\phi ^{-1}\left[ \mu (Q)/\sum _{j\in {\mathbb {N}}}\mu (Q_j)\right] \),

$$\begin{aligned} \begin{aligned} \frac{1}{\mu (Q)}\int _Q\phi \left( \frac{\sum _{j\in {\mathbb {N}}}\chi _{Q_j}(x)}{\uplambda _0} \right) \,\mathrm {d}\mu (x)&= \sum _{j\in {\mathbb {N}}} \frac{1}{\mu (Q)}\int _{Q_j}\phi \left( \frac{\chi _{Q_j}(x)}{\uplambda _0}\right) \,\mathrm {d}\mu (x)\\&= \sum _{j\in {\mathbb {N}}}\frac{\mu (Q_j)}{\mu (Q)}\phi \left( \frac{1}{\uplambda _0}\right) \\&=\sum _{j\in {\mathbb {N}}}\frac{\mu (Q_j)}{\mu (Q)}\frac{\mu (Q)}{\sum _{k\in {\mathbb {N}}}\mu (Q_k)}=1. \end{aligned} \end{aligned}$$

This implies that, for any cube Q and any family \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\),

$$\begin{aligned} \begin{aligned} \left\| \sum _{j\in {\mathbb {N}}}\chi _{Q_j} \right\| _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }&=\inf \left\{ \uplambda >0:\frac{1}{\mu (Q)}\int _Q\phi \left( \frac{\sum _{j\in {\mathbb {N}}}\chi _{Q_j}(x)}{\uplambda }\right) \,\mathrm {d}\mu (x)\le 1 \right\} \\&\le \frac{1}{\phi ^{-1}\left[ \frac{\mu (Q)}{\sum _{j\in {\mathbb {N}}}\mu (Q_j)}\right] }. \end{aligned} \end{aligned}$$

The smallness condition is then satisfied for the increasing bijection of [0, 1] given by \(\Psi ^{-1}(t):= 1/\phi ^{-1}(1/t)\). \(\square \)
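For orientation, inverting the relation \(\Psi ^{-1}(t)=1/\phi ^{-1}(1/t)\) gives the bijection itself, and the power functions \(\phi _p(t)=t^p\) recover the power-type control of (2.14). The specialization to \(\phi _p\) is an illustration added here:

$$\begin{aligned} \Psi (t)=\frac{1}{\phi (1/t)},\quad t\in (0,1],\qquad \text {and}\qquad \phi _p(t)=t^p\ \Longrightarrow \ \Psi ^{-1}(t)=t^{1/p},\quad \Psi (t)=t^{p}. \end{aligned}$$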

From the lemma above it can be deduced, through a simple application of Theorem 1.1, the following general result for Orlicz spaces.

Corollary 4.2

Let \(\mu \) be a doubling measure in \({\mathbb {R}}^n\). Let \(\phi \) be a quasi-submultiplicative Young function with associated quasi-submultiplicative constant \(c>0\). Let us further assume that \(\phi \) is differentiable for \(t>1\) and let \([\phi ]_1\) and \([\phi ]_2\) be the best constants satisfying \([\phi ]_1\phi (t)\le t\phi '(t)\le [\phi ]_2\phi (t)\), \(t>1\). If \(f\in \mathrm {BMO}(\mathrm {d}\mu )\) then

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{\phi (L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_\mu 2^{n_\mu } \phi \left[ c\left( 1+\frac{1}{[\phi ]_1}\right) \right] \left( [\phi ]_2+1\right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )} \end{aligned}$$
(4.2)

for every cube Q in \({\mathbb {R}}^n\).

Proof

The inequality follows by a direct application of Theorem 1.1. The only thing which remains is to prove a bound for the constant \(C\left( \mu , \Psi \right) \). Observe that, as the constant C in the generalized \(A_\infty (\mathrm {d}\mu )\) condition of \({\mathcal {Z}}_\phi \) is at most c (see the proof of Lemma 4.1), we have that

$$\begin{aligned} C\left( \mu , \Psi \right) \le \inf _{L>\max \{1,\Psi (c^{-1})^{-1}\}} c_\mu 2^{n_\mu }\frac{L }{ 1-c {\Psi }^{-1}\left( \frac{1}{L}\right) }. \end{aligned}$$

Since by the preceding lemma in this case we have \(\Psi ^{-1}(t)=1/\phi ^{-1}(1/t)\), then we can write

$$\begin{aligned} C\left( \mu , \Psi \right) \le \inf _{L>\max \{1,\Psi (c^{-1})^{-1}\}} c_\mu 2^{n_\mu }\frac{L\phi ^{-1}(L) }{ {\phi }^{-1}\left( L\right) -c} . \end{aligned}$$

It is a simple real analysis exercise to find that the smallest value for the above function of L is attained at the smallest \(L>\max \{1,\Psi (c^{-1})^{-1}\}\) satisfying the identity

$$\begin{aligned} L=\phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] . \end{aligned}$$

Observe that such an L exists always because \(\phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] \) is a bounded function of L. Indeed, observe that, as \(\phi \) is an increasing function, we can make the change of variables \(L=\phi (s)\) to get

$$\begin{aligned} \phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] =\phi \left[ c+c\frac{\phi (s)}{s\phi '(s)}\right] \le \phi \left[ c+c\frac{1}{[\phi ]_1}\right] \end{aligned}$$

The existence is established then by checking that \(\phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] \) is greater than 1 or greater than \(\Psi (c^{-1})^{-1}\), depending on whether \(\max \{1,\Psi (c^{-1})^{-1}\}\) is one quantity or the other. If \(\max \{1,\Psi (c^{-1})^{-1}\}=1\) then it happens that \(c=1\) (note that c is not allowed to be below 1 by condition \(\phi (1)=1\)) and then

$$\begin{aligned} \phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right]>1 \iff \frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)} >0, \end{aligned}$$

which trivially holds. In case \(\max \{1,\Psi (c^{-1})^{-1}\}=\Psi (c^{-1})^{-1}\), one just has to check the existence of \(L>\Psi (c^{-1})^{-1}\) such that

$$\begin{aligned} L=\phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] , \end{aligned}$$

but note that \(L>\Psi (c^{-1})^{-1}\iff c^{-1}>\Psi ^{-1}(L^{-1})\), which for our choice of \(\Psi ^{-1}\) reads \(\phi ^{-1}(L)>c\). By the continuity properties of the function under consideration, the desired existence will be proved if

$$\begin{aligned} c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}>c, \end{aligned}$$

and this holds trivially because \(\phi ^{-1}\) is a positive increasing function.

Therefore, by calling \(A_\phi \) the set of those \(L >\max \{1,\Psi (c^{-1})^{-1}\}\) satisfying the condition \(L=\phi \left[ c+c\frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right] ,\)

$$\begin{aligned} \begin{aligned} C\left( \mu , \Psi \right)&\le \inf _{L\in A_\phi }c_\mu 2^{n_\mu } \phi \left[ c\left( 1+ \frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}\right) \right] \left( \frac{\phi ^{-1}(L)}{L[\phi ^{-1}]'(L)}+1\right) \\&\le \inf _{\phi (s)\in A_\phi }c_\mu 2^{n_\mu } \phi \left[ c\left( 1+\frac{\phi (s)}{s\phi '(s)}\right) \right] \left( \frac{s\phi '(s)}{\phi (s)}+1\right) , \end{aligned} \end{aligned}$$

where the change of variables \(L=\phi (s)\), \(s>1\) has been used again. \(\square \)
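
For orientation, the fixed-point condition can be solved explicitly for a pure power. The following is only a sketch, assuming \(\phi (t)=t^p\) with \(p\ge 1\) (a choice made here purely for illustration) and a constant \(c\ge 1\) as above. Then \(\phi ^{-1}(L)=L^{1/p}\), and

$$\begin{aligned} \frac{L[\phi ^{-1}]'(L)}{\phi ^{-1}(L)}=\frac{L\cdot \frac{1}{p}L^{\frac{1}{p}-1}}{L^{\frac{1}{p}}}=\frac{1}{p},\qquad L=\phi \left[ c+\frac{c}{p}\right] =c^{p}\left( 1+\frac{1}{p}\right) ^{p}. \end{aligned}$$

This value satisfies \(L>1\) and \(\phi ^{-1}(L)=c(1+1/p)>c\), so under this assumption \(A_\phi \) reduces to a single point and the formula for the constant becomes

$$\begin{aligned} C\left( \mu ,\Psi \right) =c_\mu 2^{n_\mu }c^{p}\left( 1+\frac{1}{p}\right) ^{p}(p+1)\le c_\mu 2^{n_\mu }c^{p}\,e\,(p+1). \end{aligned}$$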

Example 4.2

As an application we compute the example

$$\begin{aligned} {\phi }_{p,\alpha }(t)=t^{p}(1+\log ^{+}(t))^{\alpha },\qquad \alpha >0, \quad p\ge 1, \end{aligned}$$

which is a submultiplicative Young function defining the Orlicz space \(L^p(\log L)^\alpha \). First we note that, indeed, \( {\phi }_{p,\alpha }\) is submultiplicative, i.e.

$$\begin{aligned} {\phi }_{p,\alpha }(st)\le {\phi }_{p,\alpha }(s) {\phi }_{p,\alpha }(t),\qquad s,t>0. \end{aligned}$$

We note that if \(0<s<1\) and/or \(0<t<1\) the inequality trivially holds. Hence we shall assume that \(s,t>1\). Note that then

$$\begin{aligned} {\phi }_{p,\alpha }(st)=s^{p}t^{p}(1+\log ^+(st))^{\alpha }=s^{p}t^{p}(1+\log (st))^{\alpha } \end{aligned}$$

and it suffices to show that

$$\begin{aligned} 1+\log (st)\le (1+\log (s))(1+\log (t)) \end{aligned}$$

Indeed, since \(\log (s)\log (t)>0\) for \(s,t>1\),

$$\begin{aligned} 1+\log (st)&=1+\log (s)+\log (t)\\&\le 1+\log (s)+\log (t)+\log (s)\log (t)\\&=\left( 1+\log (s)\right) \left( 1+\log (t)\right) , \end{aligned}$$

and hence we are done.
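
The cases set aside at the beginning of the argument can be checked in the same spirit. A short verification, using only that \(\log ^+\) is nondecreasing and that \({\phi }_{p,\alpha }(u)=u^{p}\) for \(0<u\le 1\): if \(0<s\le 1<t\), then \(st\le t\) and

$$\begin{aligned} {\phi }_{p,\alpha }(st)=s^{p}t^{p}(1+\log ^+(st))^{\alpha }\le s^{p}\,t^{p}(1+\log (t))^{\alpha }={\phi }_{p,\alpha }(s){\phi }_{p,\alpha }(t), \end{aligned}$$

while if \(0<s,t\le 1\), then \(st\le 1\) and

$$\begin{aligned} {\phi }_{p,\alpha }(st)=s^{p}t^{p}={\phi }_{p,\alpha }(s){\phi }_{p,\alpha }(t). \end{aligned}$$

The case \(0<t\le 1<s\) follows by symmetry.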

Now observe that

$$\begin{aligned} {\phi }_{p,\alpha }'(t)={\left\{ \begin{array}{ll} pt^{p-1} &{} \text {if}\quad t<1,\\ pt^{p-1}\left( 1+\log (t)\right) ^{\alpha }+\alpha t^{p-1}\left( 1+\log (t)\right) ^{\alpha -1} &{} \text {if}\quad t>1. \end{array}\right. } \end{aligned}$$

If \(t>1\), then

$$\begin{aligned} {\phi }_{p,\alpha }(t)=t^{p}(1+\log ^{+}(t))^{\alpha }=t^{p}(1+\log (t))^{\alpha }, \end{aligned}$$

and

$$\begin{aligned} t {\phi }_{p,\alpha }'(t)=pt^{p}(1+\log (t))^{\alpha }+\alpha t^{p}(1+\log (t))^{\alpha -1}. \end{aligned}$$

Hence,

$$\begin{aligned} \frac{t {\phi }_{p,\alpha }'(t)}{ {\phi }_{p,\alpha }(t)}=p+\frac{\alpha }{1+\log (t)},\qquad t>1, \end{aligned}$$

and we then have that

$$\begin{aligned} p\le \frac{t {\phi }_{p,\alpha }'(t)}{ {\phi }_{p,\alpha }(t)}\le p+\alpha . \end{aligned}$$
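
As a quick check of how the two bounds arise, note that \(t\mapsto \alpha /(1+\log (t))\) is decreasing on \((1,\infty )\), so the quotient computed above lies strictly between its limits at the endpoints:

$$\begin{aligned} \lim _{t\rightarrow 1^{+}}\left( p+\frac{\alpha }{1+\log (t)}\right) =p+\alpha ,\qquad \lim _{t\rightarrow \infty }\left( p+\frac{\alpha }{1+\log (t)}\right) =p. \end{aligned}$$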

These bounds are optimal for \(t>1\), and so we have that \([ {\phi }_{p,\alpha }]_{1}=p\) and \([ {\phi }_{p,\alpha }]_{2}=p+\alpha \). By Corollary 4.2,

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{ {\phi }_{p,\alpha }(L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_{\mu }2^{n_{\mu }} {\phi }_{p,\alpha }\left( 1+\frac{1}{[ {\phi }_{p,\alpha }]_{1}}\right) \left( [ {\phi }_{p,\alpha }]_{2}+1\right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}, \end{aligned}$$

and observe that, in this particular case,

$$\begin{aligned} {\phi }_{p,\alpha }\left( 1+\frac{1}{[ {\phi }_{p,\alpha }]_{1}}\right) \left( [ {\phi }_{p,\alpha }]_{2}+1\right) = {\phi }_{p,\alpha }\left( 1+\frac{1}{p}\right) \left( p+\alpha +1\right) , \end{aligned}$$

and

$$\begin{aligned} {\phi }_{p,\alpha }\left( 1+\frac{1}{p}\right) =\left( 1+\frac{1}{p}\right) ^{p}\left( 1+\log \left( 1+\frac{1}{p}\right) \right) ^{\alpha }\le e2^{\alpha }. \end{aligned}$$
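
The last inequality can be verified factor by factor. A sketch, using only the elementary bound \(\log (1+x)\le x\) for \(x\ge 0\) and the assumption \(p\ge 1\):

$$\begin{aligned} \left( 1+\frac{1}{p}\right) ^{p}=\exp \left( p\log \left( 1+\frac{1}{p}\right) \right) \le \exp \left( p\cdot \frac{1}{p}\right) =e,\qquad 1+\log \left( 1+\frac{1}{p}\right) \le 1+\log 2<2. \end{aligned}$$

For instance, with the illustrative values \(p=1\) and \(\alpha =1\), one gets \({\phi }_{1,1}(2)=2(1+\log 2)\approx 3.39\), which is indeed below \(e\,2^{1}\approx 5.44\).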

Consequently,

$$\begin{aligned} \Vert f-f_{Q,\mu }\Vert _{ {\phi }_{p,\alpha }(L)\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\le c_{\mu }2^{n_{\mu }}e2^{\alpha }\left( p+\alpha +1\right) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}. \end{aligned}$$

This proves Corollary 1.1 in the Introduction.

Remark 4.1

We observe that different choices for defining the same Orlicz norm may give different quantitative controls when applying our self-improving result. Indeed, one may check that, for instance, the alternative choice \({\tilde{\phi }}_{p,\alpha }(t):=[\log (e+1)]^{-\alpha }t^p[\log (e+t)]^\alpha \), \(\alpha \ge 0\), \(p>1\) for defining the norm \(\Vert \cdot \Vert _{L^p\log ^\alpha L}\) leads to the following estimate

$$\begin{aligned} \begin{aligned} \Vert f-f_{Q,\mu }&\Vert _{L^p(\log L)^\alpha \left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\\&\le c_\mu 2^{n_\mu } e\\&\quad \left[ \log \left( e+1\right) \right] ^{\alpha (p-1)}[\log (e+2\log (1+e)^\alpha )]^\alpha (p+\alpha +1) \Vert f\Vert _{\mathrm {BMO}(\mathrm {d}\mu )}. \end{aligned} \end{aligned}$$

This difference comes mainly from the fact that the Young function \({\tilde{\phi }}_{p,\alpha }\) is not submultiplicative but quasi-submultiplicative. Observe that the Young function we chose in the example above gives a cleaner constant. This difference makes us wonder about the sharpness of the estimates we get with our method. Nevertheless, observe that, in any case (that is, by choosing \(\phi _{p,\alpha }\) or \({\tilde{\phi }}_{p,\alpha }\)) we recover the sharp estimate in Theorem A by choosing \(\alpha =0\).
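
To make the last observation concrete, here is the substitution \(\alpha =0\) in both constants (we do not restate Theorem A here; the computation only shows that the two choices agree). For \(\phi _{p,0}(t)=t^{p}\) the constant obtained in the example above becomes

$$\begin{aligned} c_{\mu }2^{n_{\mu }}e\,2^{0}\left( p+0+1\right) =c_{\mu }2^{n_{\mu }}e\,(p+1), \end{aligned}$$

and for \({\tilde{\phi }}_{p,0}(t)=t^{p}\) the constant in this remark becomes

$$\begin{aligned} c_\mu 2^{n_\mu }e\left[ \log \left( e+1\right) \right] ^{0}\left[ \log (e+2)\right] ^{0}(p+1)=c_{\mu }2^{n_{\mu }}e\,(p+1). \end{aligned}$$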

Remark 4.2

For a Young function \(\phi \) we can define the weak Orlicz quasi-norm

$$\begin{aligned} \Vert f\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}:=\inf \left\{ \uplambda>0\,:\,\sup _{t>0}\phi (t)\mu \left( \left\{ x\in {\mathbb {R}}^{n}\,:\,|f(x)|>\uplambda t\right\} \right) \le 1\right\} . \end{aligned}$$

It is easy to prove that

$$\begin{aligned} \Vert cf\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}=|c|\Vert f\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )} \end{aligned}$$

for any \(c\in {\mathbb {R}}\) and also that

$$\begin{aligned} \Vert f+g\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}\le 2\left( \Vert f\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}+\Vert g\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}\right) . \end{aligned}$$
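
A possible argument for the quasi-triangle inequality runs as follows. It is a sketch, assuming as usual that \(\phi \) is convex with \(\phi (0)=0\), so that \(\phi (2t)\ge 2\phi (t)\); the auxiliary numbers a and b are introduced here only for the proof. Take \(a>\Vert f\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}\) and \(b>\Vert g\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}\). Since \(\{|f+g|>2(a+b)t\}\subset \{|f|>2at\}\cup \{|g|>2bt\}\), for every \(t>0\) with \(\phi (t)>0\) we get

$$\begin{aligned} \phi (t)\mu \left( \left\{ |f+g|>2(a+b)t\right\} \right)&\le \phi (t)\mu \left( \left\{ |f|>a\cdot 2t\right\} \right) +\phi (t)\mu \left( \left\{ |g|>b\cdot 2t\right\} \right) \\&\le \frac{2\phi (t)}{\phi (2t)}\le 1. \end{aligned}$$

Hence \(\uplambda =2(a+b)\) is admissible in the infimum defining \(\Vert f+g\Vert _{{\mathcal {M}}_{\phi }(\mathrm {d}\mu )}\), and letting a and b decrease to the quasi-norms of f and g gives the claim.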

Spaces of this kind were studied in [9]. This quasi-norm can be localized by defining

$$\begin{aligned} \left\| f\right\| _{{\mathcal {M}}_{\phi }\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }:=\inf \left\{ \uplambda>0\,:\,\sup _{t>0}\frac{\phi (t)}{\mu (Q)}\mu \left( \left\{ x\in Q\,:\,|f(x)|>\uplambda t\right\} \right) \le 1\right\} \end{aligned}$$

for every cube Q in \({\mathbb {R}}^n\). Choosing \(\Vert \cdot \Vert _{Z_Q}=\left\| \cdot \right\| _{{\mathcal {M}}_{\phi }\left( Q,\frac{\mathrm {d}\mu }{\mu (Q)}\right) }\), it is possible to show that counterparts of Lemmas 4.1 and 4.2 also hold in the case of weak Orlicz spaces. Hence our approach yields results for this family of spaces as well.

4.2 BMO-type improvement at the variable Lebesgue spaces scale

We finish this section with another application, now in the setting of variable Lebesgue spaces. Note that in this case the method of the Laplace transform does not apply. Let \(p:{\mathbb {R}}^n\rightarrow [1,\infty ]\) be a Lebesgue measurable function and denote \(p^-:=\mathrm {ess\, inf}_{x\in {\mathbb {R}}^n}p(x)\) and \(p^+:=\mathrm {ess\, sup}_{x\in {\mathbb {R}}^n}p(x)\). Assume \(p^+<\infty \). The Lebesgue space with variable exponent \(p(\cdot )\) is the space of Lebesgue measurable functions f such that

$$\begin{aligned} \Vert f\Vert _{L^{p(\cdot )}}:=\inf \left\{ \uplambda >0:\int _{{\mathbb {R}}^n} \left( \frac{|f(x)|}{\uplambda }\right) ^{p(x)}\,\mathrm {d}x\le 1 \right\} <\infty . \end{aligned}$$

One can associate to this space the local averages

$$\begin{aligned} \Vert f\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) } :=\inf \left\{ \uplambda >0:\frac{1}{|Q|}\int _{Q} \left( \frac{|f(x)|}{\uplambda }\right) ^{p(x)}\,\mathrm {d}x\le 1 \right\} . \end{aligned}$$

Note that, by choosing \(\uplambda _0=1\), one has

$$\begin{aligned} \frac{1}{|Q|}\int _{Q} \left( \frac{|\chi _Q(x)|}{\uplambda _0}\right) ^{p(x)}\,\mathrm {d}x=\frac{1}{|Q|}\int _{Q} \left( |\chi _Q(x)| \right) ^{p(x)}\,\mathrm {d}x=1, \end{aligned}$$

and therefore \(\Vert \chi _Q\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le 1\). This in particular means that the family of norms \(\Vert \cdot \Vert _{Z_Q}=\Vert \cdot \Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\), which we call \({\mathcal {Z}}_{p(\cdot )}\), satisfies property (2) in Definition 2.4. Property (1) follows from [1, Theorem 2.59].

Example 4.3

We can show here that the family \({\mathcal {Z}}_{p(\cdot )}\) satisfies the generalized \(A_\infty (\mathrm {d}\mu )\) condition in Definition 2.3 for any essentially bounded exponent function p. Indeed, let Q be a cube in \({\mathbb {R}}^n\) and consider a family \(\{Q_j\}_{j\in {\mathbb {N}}}\in \Delta (Q)\) and a sequence \(\{h_j\}_{j\in {\mathbb {N}}}\) of functions satisfying \(\Vert h_j\Vert _{L^{p(\cdot )}\left( Q_j,\frac{\mathrm {d}x}{|Q_j|}\right) }=1\) for every \(j\in {\mathbb {N}}\). Then

$$\begin{aligned} \begin{aligned} \frac{1}{|Q|}\int _Q \left( \frac{\sum _{j\in {\mathbb {N}}} h_j(x)\chi _{Q_j}(x)}{\uplambda }\right) ^{p(x)}\,\mathrm {d}x&= \sum _{j\in {\mathbb {N}}} \frac{|Q_j|}{|Q|}\frac{1}{|Q_j|}\int _{Q_j} \left( \frac{h_j(x)}{\uplambda }\right) ^{ p(x)}\,\mathrm {d}x \end{aligned} \end{aligned}$$

and so, by taking \(\uplambda = \left( \frac{\sum _{j\in {\mathbb {N}}}|Q_j|}{|Q|}\right) ^{1/ p^+}\), one finds that

$$\begin{aligned} \begin{aligned} \frac{1}{|Q|}\int _Q \left( \frac{\sum _{j\in {\mathbb {N}}} h_j(x)\chi _{Q_j}(x)}{\uplambda }\right) ^{ p(x)}\,\mathrm {d}x&\le \sum _{j\in {\mathbb {N}}} \frac{|Q_j|}{|Q|}\frac{1}{|Q_j|}\int _{Q_j}\\&\quad \left( \frac{h_j(x)}{\left( \frac{\sum _{j\in {\mathbb {N}}}|Q_j|}{|Q|}\right) ^{1/ p^+}}\right) ^{ p(x)}\,\mathrm {d}x\\&\le \sum _{j\in {\mathbb {N}}} \frac{|Q_j|}{|Q|} \frac{1}{ \frac{\sum _{j\in {\mathbb {N}}}|Q_j|}{|Q|} } \frac{1}{|Q_j|}\int _{Q_j} h_j(x)^{p(x)} \,\mathrm {d}x\\&\le 1, \end{aligned} \end{aligned}$$

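where we have used that \(\uplambda \le 1\) (recall that \(\sum _{j\in {\mathbb {N}}}|Q_j|\le |Q|\)), so that \(\uplambda ^{-p(x)}\le \uplambda ^{-p^+}\), together with [1, Proposition 2.21]. This proves that, for any \(r\ge 1\),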

$$\begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} h_j\chi _{Q_j}\right\| _{L^{rp(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) } \le \left( \frac{\sum _{j\in {\mathbb {N}}}|Q_j|}{|Q|}\right) ^{1/rp^+}. \end{aligned}$$
(4.3)
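
As a sanity check of (4.3), consider the classical case of a constant exponent. The following sketch assumes \(p(x)\equiv p\), \(r=1\), and pairwise disjoint cubes \(Q_j\subset Q\), as already used in the first identity of this example. The normalization of \(h_j\) then reads \(\frac{1}{|Q_j|}\int _{Q_j}|h_j(x)|^{p}\,\mathrm {d}x=1\), and

$$\begin{aligned} \left\| \sum _{j\in {\mathbb {N}}} h_j\chi _{Q_j}\right\| _{L^{p}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }=\left( \sum _{j\in {\mathbb {N}}}\frac{|Q_j|}{|Q|}\frac{1}{|Q_j|}\int _{Q_j}|h_j(x)|^{p}\,\mathrm {d}x\right) ^{1/p}=\left( \frac{\sum _{j\in {\mathbb {N}}}|Q_j|}{|Q|}\right) ^{1/p}, \end{aligned}$$

so (4.3) holds with equality in this case, since \(p^+=p\).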

An application of (4.3) along with Theorem 1.1 proves Corollary 1.2 in the Introduction. We now apply this corollary to obtain a John–Nirenberg type inequality. Note first that, for given \(1/p^-\le s<\infty \), one has that \(\Vert |f|^s\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }=\Vert f\Vert _{L^{sp(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }^s\), see [1, Proposition 2.18]. Let \(t>0\) and take \(r\ge 1\). For any cube Q in \({\mathbb {R}}^n\), define the subset \(E_t\subset Q\) by \(E_t:=\{x\in Q: |f(x)-f_Q|\ge t \}\). Then,

$$\begin{aligned}&\frac{1}{|Q|}\int _Q \left( \frac{\chi _{E_t}(x) }{\frac{1}{t^{r }}\Vert f-f_Q\Vert ^{r }_{L^{ rp(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }}\right) ^{p(x)}\,\mathrm {d}x \\&\quad = \frac{1}{|Q|}\int _{E_t}\left( \frac{t^r }{\Vert |f-f_Q|^r\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) } }\right) ^{p(x)}\,\mathrm {d}x\\&\quad \le \frac{1}{|Q|} \int _{Q} \left( \frac{|f(x)-f_Q|^r }{\Vert |f-f_Q|^r\Vert _{L^{ p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) } }\right) ^{p(x)}\,\mathrm {d}x. \end{aligned}$$

Hence, for any \(t>0\), one has the following Chebychev type inequality

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le \frac{1}{t^r}\Vert f-f_Q\Vert _{L^{ rp(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }^r. \end{aligned}$$
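
The passage from the chain of integrals to this inequality can be spelled out as follows. This is a sketch in which the shorthand \(\uplambda _t\) is introduced only for this explanation, and \(f-f_Q\) is assumed not to vanish identically on Q. Set \(\uplambda _t:=t^{-r}\Vert f-f_Q\Vert ^{r}_{L^{rp(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }=t^{-r}\Vert |f-f_Q|^{r}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\). The middle step of the chain uses that \(t^{r}\le |f(x)-f_Q|^{r}\) on the set \(\{x\in Q: |f(x)-f_Q|\ge t\}\), and the last integral is at most 1 by the definition of the norm of \(|f-f_Q|^{r}\) as an infimum. Therefore

$$\begin{aligned} \frac{1}{|Q|}\int _Q \left( \frac{\chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}(x)}{\uplambda _t}\right) ^{p(x)}\,\mathrm {d}x\le 1, \end{aligned}$$

which means that \(\uplambda _t\) is admissible in the infimum defining the norm of the characteristic function, and the norm is bounded by \(\uplambda _t\).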

Then, by Corollary 1.2 applied to the exponent function \(rp(\cdot )\),

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le \frac{1}{t^r}\left[ C(n) rp^+\Vert f\Vert _{\mathrm {BMO}} \right] ^r, \end{aligned}$$

for every \(r\ge 1\). For \(t\ge 2 C(n)p^+\Vert f\Vert _{\mathrm {BMO}}\), take \(r=t/(2 C(n)p^+\Vert f\Vert _{\mathrm {BMO}})\) to find that

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le 1/2^r=e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}}, \end{aligned}$$

where \(C(n,p^+)=(2C(n)p^+)^{-1}\log 2\). When \(t\le 2C(n)p^+\Vert f\Vert _{\mathrm {BMO}}\), we have that the inequality \(e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}}\ge 1/2\) holds and, therefore, by the Average property of the norm,

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le \Vert \chi _{Q}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le 1\le 2e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}} \end{aligned}$$

for these values of t. Hence, for every \(t>0\), we obtain

$$\begin{aligned} \Vert \chi _{\{x\in Q: |f(x)-f_Q|\ge t\}}\Vert _{L^{p(\cdot )}\left( Q,\frac{\mathrm {d}x}{|Q|}\right) }\le 2 e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}}. \end{aligned}$$
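
For the reader's convenience, the arithmetic behind the choice of r in the first case can be written out. In this sketch the shorthand \(B:=C(n)p^+\Vert f\Vert _{\mathrm {BMO}}\) is introduced only for readability. With \(r=t/(2B)\), which satisfies \(r\ge 1\) exactly when \(t\ge 2B\),

$$\begin{aligned} \frac{1}{t^{r}}\left[ Br\right] ^{r}=\left( \frac{Br}{t}\right) ^{r}=\left( \frac{1}{2}\right) ^{r}=\exp \left( -\frac{\log 2}{2B}\,t\right) =e^{-C(n,p^+)t/\Vert f\Vert _{\mathrm {BMO}}}. \end{aligned}$$

At the threshold \(t=2B\) the right-hand side equals \(1/2\), which is why the factor 2 suffices in the range \(t\le 2B\).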

The bound just established for every \(t>0\) proves Corollary 1.3. This John–Nirenberg type inequality is closely related to the one in [6, Theorem 3.2]. Note that, although the condition \(p^+<\infty \) is used, no further condition is assumed on the exponent function p. This inequality also implies and generalizes the inequality

$$\begin{aligned} \frac{1}{|Q|}\int _Q e^{\uplambda |f(x)-f_Q|}\,\mathrm {d}x\le C \end{aligned}$$

for \(\uplambda < C(n,p^+)\), as does the one in [6, Theorem 3.2].