1 Introduction

Let \({{\mathcal {S}}}\) be a sparse family of dyadic cubes, let \(b\in L^1_{\text {loc}}({{\mathbb {R}}}^n)\), \(m\in {{\mathbb {N}}}\) and \(1\le r<s\le \infty \). The key object of study in this paper is the bilinear sparse form defined by

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g):=\sum _{Q\in {\mathcal S}}\big \langle |b-\langle b\rangle _Q|^{m}|f|\big \rangle _{r,Q}\langle |g|\rangle _{s',Q}|Q|. \end{aligned}$$

This object appears naturally when one studies iterated commutators of various operators T and pointwise multiplication by b.

Let \(r<p,q<s\) and let \(\mu , \lambda \) be weights. Our goal is to obtain quantitative weighted \(L^p(\mu )\times L^{q'}(\lambda ^{1-q'})\)-bounds for \({{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m\) in the Bloom setting [2] both in the diagonal and off-diagonal cases. By the Bloom setting one means that an assumption on b is imposed in terms of the Bloom weight \(\nu \) depending on \(\mu \) and \(\lambda \) in a suitable way.

In the following cases the Bloom bounds for \({{\mathcal {B}}}_{{\mathcal S},b,r,s}^m\) have been considered before:

  • \(r=1\), \(s=\infty \), \(m\ge 1\) and \(p=q\) [20, 21].

  • \(r>1\), \(s=\infty \), \(m\ge 1\) and \(p=q\) [24].

  • \(r=1\), \(s=\infty \), \(m=1\) and \(p\ne q\) [10, 14].

Our results below are quantitative and cover all possible combinations of \(1\le r<p,q<s\le \infty \) and \(m\ge 1\). In particular, the bounds we obtain are new in the following settings:

  • Limited range: \(r>1\) or \(s<\infty \).

  • Iterated and off-diagonal: \(m>1\) and \(p\not =q\).

In the Bloom setting, prior works have primarily focused on estimates for commutators of Calderón–Zygmund operators, which by [20, 21] boil down to estimates for \({\mathcal B}_{{{\mathcal {S}}},b,1,\infty }^m\). The study of the boundedness in the case \(m=1\) for these operators in the full range \(p,q \in (1,\infty )\) has recently been completed by Hänninen–Sinko and the second author [10]. For a comprehensive overview of the development of both unweighted and (Bloom) weighted estimates for these commutators, as well as a discussion on the necessity of the conditions on b, we direct the reader to the introductions of, e.g., [10, 11, 14].

Our key application is a quantitative Bloom weighted estimate for iterated commutators of a rather general class of operators. This class includes, for example, Calderón–Zygmund operators, (maximal) rough homogeneous singular integral operators and Bochner–Riesz operators at the critical index. We refer to [22, Remark 4.4] for a further list of operators that fall within the scope of our theory. Note that for various operators on this list, no Bloom weighted (or even unweighted) commutator estimates were known previously.

Our approach builds upon a sparse domination procedure for commutators developed by Rivera-Ríos and the first and third authors [20], with subsequent generalizations by various authors. We will proceed in three steps:

  (1) In Sect. 3, we will prove that iterated commutators of certain sublinear operators T can be dominated by two sparse forms: \({{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)\) and the dual form \({{\mathcal {B}}}_{{{\mathcal {S}}},b,s',r'}^m(g,f)\). Our key novel point here is in Lemma 3.4, which allows one to reduce \(m+1\) sparse forms to only 2 sparse forms.

  (2) We prove Bloom weighted estimates for \({{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)\). To do so, we first extend a result of Li [23] and Fackler–Hytönen [8] to certain fractional sparse forms in Sect. 4. Afterwards, in Sect. 5 we combine the proof strategy of Hänninen, Sinko and the second author [10] with the change of measure formula of Cascante–Ortega–Verbitsky [4] to estimate \({{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)\) in the Bloom setting. Furthermore, in the case \(q\le p\) and \(m \ge 2\), we provide a second Bloom weighted estimate for \({{\mathcal {B}}}_{{\mathcal S},b,r,s}^m(f,g)\) using a proof strategy suggested by Li [24]. In particular, Theorem 5.1 presents two incomparable quantitative bounds based on the approaches from [21] and [24].

  (3) Combining the first two steps, in Sect. 6, we obtain quantitative Bloom weighted estimates for iterated commutators. We apply this result to Calderón–Zygmund operators, (maximal) rough homogeneous singular integral operators and Bochner–Riesz operators at the critical index.

Since our estimates are quantitative, it is natural to ask about their sharpness. Here we encounter an interesting phenomenon, which is new even for Calderón–Zygmund operators. To be more precise, in [21], the first and the third authors jointly with Rivera-Ríos showed for a Calderón–Zygmund operator T and for \(m\ge 1\) (extending their previous work [20] for \(m=1\)) that

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim \Vert b\Vert _{{{{\,\textrm{BMO}\,}}}_{\nu }}^m([\lambda ]_{A_p}[\mu ]_{A_p})^{\frac{m+1}{2}\max (1,\frac{1}{p-1})}, \end{aligned}$$
(1.1)

where \({{\,\textrm{BMO}\,}}_{\nu }\) stands for the weighted \({{\,\textrm{BMO}\,}}\) space with weight \(\nu :=(\mu /\lambda )^{\frac{1}{pm}}\), and \(T_b^m\) is the m-th order commutator of T with a locally integrable function b.

Intuitively, one could conjecture that (1.1) is sharp, since in the case of equal weights \(\lambda =\mu =w\) one obtains the sharp one-weight estimate proved in [6] by Chung, Pereyra and Pérez. However, following this intuition, any bound by \([\lambda ]_{A_p}^{\alpha _p}[\mu ]_{A_p}^{\beta _p}\) with \(\alpha _p\) and \(\beta _p\) satisfying

$$\begin{aligned} \alpha _p+\beta _p=(m+1)\max (1,\tfrac{1}{p-1}) \end{aligned}$$

would be sharp.
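
To spell out the one-weight reduction behind this remark (a short check, assuming only (1.1), the definition of \(\nu \) and the usual normalization of \({{\,\textrm{BMO}\,}}_{\nu }\)): for \(\lambda =\mu =w\) one has \(\nu \equiv 1\), and the right-hand side of (1.1) becomes

$$\begin{aligned} \Vert b\Vert _{{{\,\textrm{BMO}\,}}}^m([w]_{A_p}[w]_{A_p})^{\frac{m+1}{2}\max (1,\frac{1}{p-1})}=\Vert b\Vert _{{{\,\textrm{BMO}\,}}}^m[w]_{A_p}^{(m+1)\max (1,\frac{1}{p-1})}. \end{aligned}$$

Any product \([\lambda ]_{A_p}^{\alpha _p}[\mu ]_{A_p}^{\beta _p}\) with \(\alpha _p+\beta _p=(m+1)\max (1,\tfrac{1}{p-1})\) collapses to the same power of \([w]_{A_p}\), so the one-weight case alone cannot distinguish between such bounds.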

Observe that the notion of sharpness in the Bloom setting (or in the two-weight setting, in general) has not been defined before. It is easy to see that a bound by \([\lambda ]_{A_p}^{\alpha _p}[\mu ]_{A_p}^{\beta _p}\) is stronger than a bound by \([\lambda ]_{A_p}^{\alpha '_p}[\mu ]_{A_p}^{\beta '_p}\) if and only if \(\alpha _p\le \alpha '_p\) and \(\beta _p\le \beta '_p\) and at least one of these inequalities is strict. If, for example, \(\alpha _p<\alpha '_p\) and \(\beta _p>\beta '_p\), the bounds are incomparable. This leads us to the following definition.

Definition 1.1

Let \(p\in (1,\infty )\), \(\mu ,\lambda \in A_p\) and let T be an operator. We say that the estimate

$$\begin{aligned} \Vert T\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim [\lambda ]_{A_p}^{\alpha _p}[\mu ]_{A_p}^{\beta _p} \end{aligned}$$

is sharp if neither of the exponents \(\alpha _p\) and \(\beta _p\) can be decreased.

Having this definition at hand, we are ready to present our result about the sharpness of (1.1). This result comes as a surprise to us because it says that the estimate (1.1) is sharp for all \(1<p<\infty \) only if \(m=1\). To be more precise, we have the following.

Theorem 1.2

Let T be a Dini-continuous Calderón–Zygmund operator.

  (i) If \(m=1\), then the estimate (1.1) is sharp for all \(p \in (1,\infty )\).

  (ii) If \(m\ge 2\), then the estimate (1.1) is not sharp for any \(p\not \in [\frac{1+3m}{2m},\frac{1+3m}{m+1}]\).

In other words, if \(m\ge 2\) and \(p\not \in [\frac{1+3m}{2m},\frac{1+3m}{m+1}]\), the second Bloom weighted estimate obtained in Sect. 5 is incomparable with (1.1), and therefore, combined with the result from [6], a sharp bound for \(T_b^m\) in the sense of Definition 1.1 does not exist. Observe that Theorem 1.2 leaves open an interesting question about the sharpness of (1.1) when \(m\ge 2\) and \(p\in [\frac{1+3m}{2m},\frac{1+3m}{m+1}]\).
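
As a quick consistency check on the excluded interval (elementary arithmetic, not taken from the proofs below; the symbol \(p_m\) is introduced only for this remark), its endpoints are Hölder conjugate. Indeed, for \(p_m:=\frac{1+3m}{2m}\) we have

$$\begin{aligned} p_m'=\frac{p_m}{p_m-1}=\frac{1+3m}{(1+3m)-2m}=\frac{1+3m}{m+1}. \end{aligned}$$

Hence the interval is invariant under \(p\mapsto p'\). For instance, for \(m=2\) it equals \([\frac{7}{4},\frac{7}{3}]\), and since \(\frac{1+3m}{2m}=\frac{3}{2}+\frac{1}{2m}\) and \(\frac{1+3m}{m+1}=3-\frac{2}{m+1}\), the intervals increase to \((\frac{3}{2},3)\) as \(m\rightarrow \infty \).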

We shall see in Sect. 6 that a similar phenomenon with two incomparable bounds holds for a large class of operators.

1.1 Notation

We will make extensive use of the notation “\(\lesssim \)” to indicate inequalities up to an implicit multiplicative constant. These implicit constants may depend on \(p,q,n,m\), but not on any of the functions under consideration. If these implicit constants depend on the weights \(\mu , \lambda \), this will be denoted by “\(\lesssim _{\mu ,\lambda }\)”.

2 Preliminaries

2.1 Dyadic lattices

Denote by \({{\mathcal {Q}}}\) the set of all cubes \(Q\subseteq {{\mathbb {R}}}^n\) with sides parallel to the axes. For a cube \(Q \in \mathcal {Q}\) with side length \(\ell (Q)\) and \(\alpha >0\) we denote the cube with the same center as Q and side length \(\alpha \ell (Q)\) by \(\alpha Q\).

Given a cube \(Q\in {{\mathcal {Q}}}\), denote by \({{\mathcal {D}}}(Q)\) the set of all dyadic cubes with respect to Q, that is, the cubes obtained by repeated subdivision of Q and each of its descendants into \(2^n\) congruent subcubes. Following [18, Definition 2.1], a dyadic lattice \({{\mathscr {D}}}\) in \({{\mathbb {R}}}^n\) is any collection of cubes such that

  (i) Any child of \(Q\in {{\mathscr {D}}}\) is in \({{\mathscr {D}}}\) as well, i.e. \(\mathcal {D}(Q) \subseteq \mathscr {D}\).

  (ii) Any \(Q',Q''\in {{\mathscr {D}}}\) have a common ancestor, i.e. there exists a \(Q\in {{\mathscr {D}}}\) such that \(Q',Q''\in {{\mathcal {D}}}(Q)\).

  (iii) For every compact set \(K\subseteq {{\mathbb {R}}}^n\), there exists a cube \(Q\in {{\mathscr {D}}}\) containing K.

Throughout the paper, \(\mathscr {D}\) will always denote a dyadic lattice.

Definition 2.1

Let \(\eta \in (0,1)\) and let \({{\mathcal {S}}}\subseteq \mathcal {Q}\) be a family of cubes. We say that \({{\mathcal {S}}}\) is \(\eta \)-sparse if, for every cube \(Q\in {{\mathcal {S}}}\), there exists a subset \(E_Q\subseteq Q\) such that \(|E_Q|\ge \eta |Q|\) and the sets \(\{E_Q\}_{Q\in {{\mathcal {S}}}}\) are pairwise disjoint. We will omit the sparseness number \(\eta \) when its value is non-essential.

For a cube \(Q \in \mathcal {Q}\) and \(f \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\) we define \(\langle f\rangle _Q:= \frac{1}{|Q|} \int _Q f\) and for \(r>0\) and a positive function \(f \in L^r_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\) we set

$$\begin{aligned} \langle f\rangle _{r,Q}:= \langle f^r\rangle _{Q}^{1/r}= \Bigl (\frac{1}{|Q|} \int _Q f^r\Bigr )^{1/r}. \end{aligned}$$

We define the maximal operator by

$$\begin{aligned} Mf&:= \sup _{Q\in \mathcal {Q}} \,\langle |f|\rangle _{1,Q} \chi _Q \end{aligned}$$

and set

$$\begin{aligned} M_rf&:=M(|f|^r)^{1/r}= \sup _{Q\in \mathcal {Q}} \langle |f|\rangle _{r,Q} \chi _Q. \end{aligned}$$

2.2 Weights

By a weight w we mean a non-negative \(w \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). For \(1<p<\infty \) we say that w belongs to the Muckenhoupt \(A_p\)-class and write \(w \in A_p\) if

$$\begin{aligned}{}[w]_{A_p}:=\sup _{Q\in {{\mathcal {Q}}}} \,\langle w\rangle _{1,Q} \langle w^{-1}\rangle _{\frac{1}{p-1},Q}<\infty . \end{aligned}$$

For \(1 \le r<\infty \) we say that w belongs to the reverse Hölder class and write \(w \in {\textrm{RH}_r}\) if

$$\begin{aligned}{}[w]_{\textrm{RH}_r}:=\sup _{Q\in {{\mathcal {Q}}}}\frac{\langle w\rangle _{r,Q}}{\langle w\rangle _{1,Q}}<\infty . \end{aligned}$$

Furthermore, we say that w belongs to the Muckenhoupt \(A_\infty \)-class and write \(w \in A_{\infty }\) if

$$\begin{aligned}{}[w]_{A_{\infty }}:=\sup _{Q\in {\mathcal Q}}\frac{1}{w(Q)}\int _QM(w\chi _Q)<\infty . \end{aligned}$$

We will frequently use that by the definition of the \(A_p\)-constant, we have

$$\begin{aligned}{}[w^{1-p'}]_{A_{p'}}=[w]_{A_p}^{\frac{1}{p-1}}. \end{aligned}$$
(2.1)
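
For completeness, here is a sketch of the computation behind (2.1), using only the definition of the \(A_p\)-constant and the identity \(p'-1=\frac{1}{p-1}\). For any cube \(Q\),

$$\begin{aligned} \langle w^{1-p'}\rangle _{1,Q}\,\langle w^{p'-1}\rangle _{\frac{1}{p'-1},Q}=\langle w^{-\frac{1}{p-1}}\rangle _{Q}\,\langle w\rangle _{Q}^{\frac{1}{p-1}}=\Bigl (\langle w\rangle _{1,Q}\,\langle w^{-1}\rangle _{\frac{1}{p-1},Q}\Bigr )^{\frac{1}{p-1}}, \end{aligned}$$

and taking the supremum over \(Q\in {{\mathcal {Q}}}\) on both sides gives the identity.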

Moreover, we have

$$\begin{aligned}{}[w]_{A_{\infty }}\le c_{n}[w]_{A_p}, \end{aligned}$$

by [12, Proposition 2.2].

The following quantitative self-improvement lemma from [13] will play a key role in our applications.

Proposition 2.2

([13, Theorems 1.1 and 1.2]) There exists a constant \(c_n>0\) such that for \(w \in A_p\) with \(1<p<\infty \) we have

$$\begin{aligned}{}[w]_{{\textrm{RH}}_{1+\frac{1}{c_n[w]_{A_{p}}}}}\le c_n \qquad \text { and }\qquad [w]_{A_{p-\varepsilon }}\le c_n \, [w]_{A_p} \end{aligned}$$

with \(\varepsilon = \frac{p-1}{1+c_n[w]^{\frac{1}{p-1}}_{A_{p}}}\).

3 A sparse domination principle for commutators

In this section we will prove a general sparse domination principle for iterated commutators, following the line of research started in [20] by Rivera-Ríos and the first and the third authors. In order to state our result, let us introduce some notation.

Given a linear operator T and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\), define the first order commutator \(T_b^1\) by

$$\begin{aligned} T_b^1(f):= bT(f)-T(bf). \end{aligned}$$

Next, for \(m\in {{\mathbb {N}}}, m\ge 2\), define the higher order commutators \(T_b^m\) inductively as the commutator of \(T_b^{m-1}\) with b, i.e.

$$\begin{aligned} T_b^m(f):=bT_b^{m-1}(f)-T_b^{m-1}(bf). \end{aligned}$$

It is easy to see that

$$\begin{aligned} T_b^mf(x)= T\bigl ((b(x)-b(\cdot ))^mf\bigr )(x), \qquad x \in {\mathbb {R}}^n. \end{aligned}$$
(3.1)
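
For linear T, the induction step behind (3.1) can be sketched as follows, reading the recursive definition as the commutator of \(T_b^{m-1}\) with b (this is our reading of the notation). Fix \(x\in {\mathbb {R}}^n\) and assume (3.1) holds with \(m-1\) in place of m. Since \(b(x)\) is a scalar for fixed x, linearity of T gives

$$\begin{aligned} b(x)T_b^{m-1}f(x)-T_b^{m-1}(bf)(x)&=T\bigl ((b(x)-b(\cdot ))^{m-1}b(x)f\bigr )(x)-T\bigl ((b(x)-b(\cdot ))^{m-1}b(\cdot )f\bigr )(x)\\&=T\bigl ((b(x)-b(\cdot ))^{m}f\bigr )(x). \end{aligned}$$

The base case \(m=1\) is the definition of \(T_b^1\).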

Now assume that T is a general, not necessarily linear, operator. In this case we use formula (3.1) as the definition of \(T_b^m\).

For \(1\le s\le \infty \) we define the sharp grand maximal truncation operator

$$\begin{aligned} {{\mathcal {M}}}^{\#}_{T,s}f(x):=\sup _{Q\ni x}{{\,\textrm{osc}\,}}_s\big (T(f\chi _{{{\mathbb {R}}}^n\setminus 3Q});Q\big ), \qquad x \in {\mathbb {R}}^n, \end{aligned}$$

where

$$\begin{aligned} {{\,\textrm{osc}\,}}_s(f;Q):=\Bigl (\frac{1}{|Q|^2}\int _{Q\times Q}|f(x')-f(x'')|^s\hspace{2pt}\textrm{d}x'\hspace{2pt}\textrm{d}x''\Bigr )^{1/s} \end{aligned}$$

and the supremum is taken over all \(Q \in \mathcal {Q}\) containing x.

We will use the following boundedness property of T and \(\mathcal {M}_{T,s}^{\#}\).

Definition 3.1

Given an operator T and \(r \in [1,\infty )\), we say that T is locally weak \(L^r\)-bounded if there exists a non-increasing function \(\varphi _{T,r}:(0,1) \rightarrow [0,\infty )\) such that for any cube \(Q\in \mathcal {Q}\) and \(f \in L^r(Q)\) one has

$$\begin{aligned} \bigl |\bigl \{x \in Q:|T(f \chi _Q)(x)|>\varphi _{T,r}(\lambda )\langle |f|\rangle _{r,Q}\bigr \}\bigr | \le \lambda \, |Q|, \qquad \lambda \in (0,1). \end{aligned}$$

This definition was given in [19] and was called the \(W_r\) property of T. Note that the usual weak \(L^r\)-boundedness of T implies the local weak \(L^r\)-boundedness of T with

$$\begin{aligned} \varphi _{T,r}(\lambda ):= \lambda ^{-1/r}\Vert T\Vert _{L^r({\mathbb {R}}^n) \rightarrow L^{r,\infty }({\mathbb {R}}^n)}, \qquad \lambda \in (0,1). \end{aligned}$$
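
To see this (a standard computation, included here as a sketch), abbreviate \(\Vert T\Vert :=\Vert T\Vert _{L^r({\mathbb {R}}^n) \rightarrow L^{r,\infty }({\mathbb {R}}^n)}\), assume \(\langle |f|\rangle _{r,Q}>0\), and apply the weak-type bound to \(f\chi _Q\) at height \(t:=\lambda ^{-1/r}\Vert T\Vert \langle |f|\rangle _{r,Q}\):

$$\begin{aligned} \bigl |\bigl \{x \in Q:|T(f \chi _Q)(x)|>t\bigr \}\bigr | \le \frac{\Vert T\Vert ^r\,\Vert f\chi _Q\Vert _{L^r({\mathbb {R}}^n)}^r}{t^r}=\frac{\Vert T\Vert ^r\,|Q|\,\langle |f|\rangle _{r,Q}^r}{\lambda ^{-1}\Vert T\Vert ^r\,\langle |f|\rangle _{r,Q}^r}=\lambda \, |Q|. \end{aligned}$$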

Moreover, if T is locally weak \(L^{r_0}\)-bounded for some \(r_0 \in [1,\infty )\), it is locally weak \(L^r\)-bounded for all \(r>r_0\) by Hölder’s inequality with \(\varphi _{T,r}(\lambda )=\varphi _{T,r_0}(\lambda )\).

The main result of this section is the following abstract sparse domination principle for iterated commutators.

Theorem 3.2

Let \(1\le r<s\le \infty , m\in {{\mathbb {N}}}\) and let T be a sublinear operator. Assume that T and \({{\mathcal {M}}}^{\#}_{T,s}\) are locally weak \(L^r\)-bounded. Then there exist \(C_{m,n}>1\) and \(\lambda _{m,n}<1\) so that, for any \(f,g\in L^{\infty }_c({{\mathbb {R}}}^n)\) and \(b\in L^1_{\text {loc}}({{\mathbb {R}}}^n)\), there is a \(\frac{1}{2\cdot 3^n}\)-sparse collection of cubes \({{\mathcal {S}}}\) such that

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}|T_b^mf||g|&\le C\Big (\sum _{Q\in {{\mathcal {S}}}}\big \langle |b-\langle b\rangle _Q|^{m}|f|\big \rangle _{r,Q}\big \langle |g|\big \rangle _{s',Q}|Q|\\&\hspace{1cm}+\sum _{Q\in {{\mathcal {S}}}}\big \langle |f|\big \rangle _{r,Q}\big \langle |b-\langle b\rangle _Q|^{m}|g|\big \rangle _{s',Q}|Q|\Big ), \end{aligned}$$

where

$$\begin{aligned} C:=C_{m,n}\big (\varphi _{T,r}(\lambda _{m,n})+\varphi _{\mathcal {M}^{\#}_{T,s},r}(\lambda _{m,n})\big ). \end{aligned}$$
(3.2)

We refer to [22, Remark 4.4] for a list of operators satisfying the assumptions of Theorem 3.2. Theorem 3.2 is an immediate corollary of the following two statements.

Theorem 3.3

Under the assumptions of Theorem 3.2 we have

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}|T_b^mf||g|\le C\sum _{k=0}^m\Big (\sum _{Q\in {{\mathcal {S}}}}\big \langle |b-\langle b\rangle _Q|^{m-k}|f|\big \rangle _{r,Q} \big \langle |b-\langle b\rangle _Q|^{k}|g|\big \rangle _{s',Q}|Q|\Big ), \end{aligned}$$

where C is given by (3.2).

Lemma 3.4

Let \(1\le r,t<\infty \) and \(m\in {{\mathbb {N}}}\). Let \(f,g\in L^{\infty }_c({{\mathbb {R}}}^n)\) and \(b\in L^1_{\text {loc}}({{\mathbb {R}}}^n)\). Fix a cube \(Q \in \mathcal {Q}\) and for \(0\le k\le m\) define

$$\begin{aligned} c_k:=\big \langle |b-\langle b\rangle _Q|^{m-k}|f|\big \rangle _{r,Q}\big \langle |b-\langle b\rangle _Q|^{k}|g|\big \rangle _{t,Q}. \end{aligned}$$

Then we have \(c_k\le c_0+c_m.\)

Indeed, note that Lemma 3.4 allows us to reduce the summation over \(k=0,\dots ,m\) in Theorem 3.3 to the two extreme terms \(k=0\) and \(k=m\), yielding the formulation of Theorem 3.2.

Before turning to the proofs, let us mention a brief history of the above results.

  • In the case where T is a Dini-continuous Calderón–Zygmund operator, \(m=1\), \(r=1\) and \(s=\infty \), Theorem 3.2 goes back to Rivera-Ríos and the first and the third authors [20].

  • In the case where \(m\ge 1\) and T is a generalized Hörmander singular integral operator, the corresponding version of Theorem 3.3 was obtained by Ibañez-Firnkorn and Rivera-Ríos [15].

  • The closest precursors of Theorem 3.2 were obtained by

    • Rivera–Ríos [28, Theorem 3.1] in the case \(m=1\). We note that in this work the bilinear maximal operator \({{\mathcal {M}}}_T(f,g)\), introduced in [17], was used instead of \({{\mathcal {M}}}^{\#}_{T,s}\).

    • Ibañez–Firnkorn and Rivera–Ríos [16, Theorem 4.4] in the case \(m\ge 1\) and with \({{\mathcal {M}}}^{\#}_{T,\infty }\).

  • For a general account of similar sparse domination results we refer to our recent work [22].

Compared to [28, Theorem 3.1] and [16, Theorem 4.4], our novel points are the following.

  • Instead of \({{\mathcal {M}}}_T(f,g)\) or \({{\mathcal {M}}}^{\#}_{T,\infty }\), we deal with a more flexible operator \({{\mathcal {M}}}^{\#}_{T,s}\). Here we continue the line of research originated in [19, 22, 26], where various variants of \({\mathcal M}^{\#}_{T,s}\) were considered.

  • We use a local weak \(L^r\)-boundedness assumption, originating from [19], rather than the usual weak \(L^r\)-boundedness assumption on T and \({{\mathcal {M}}}^{\#}_{T,s}\).

  • Our most important novel point in this section is in Lemma 3.4, which seems to be new. This lemma allows us to significantly simplify the main applications of Theorem 3.2 to quantitative weighted norm inequalities.

The proof of Lemma 3.4 is quite elementary and we therefore present it first.

Proof of Lemma 3.4

For \(x,y \in {\mathbb {R}}^n\) denote

$$\begin{aligned} \varphi (x,y):=|b(x)-\langle b\rangle _Q|^{m-k}|f(x)||b(y)-\langle b\rangle _Q|^{k}|g(y)|\chi _{Q\times Q}(x,y). \end{aligned}$$

Then \(c_k\) for \(0\le k\le m\) can be written in the form

$$\begin{aligned} c_k=\Big \Vert x \mapsto \Vert y \mapsto \varphi (x,y)\Vert _{L^t(\frac{\hspace{2pt}\textrm{d}y}{|Q|})}\Big \Vert _{L^r(\frac{\hspace{2pt}\textrm{d}x}{|Q|})}. \end{aligned}$$

From this, we obtain the conclusion by using the estimate

$$\begin{aligned} |b(x)-\langle b\rangle _Q|^{m-k}|b(y)-\langle b\rangle _Q|^{k}\le |b(x)-\langle b\rangle _Q|^{m}+|b(y)-\langle b\rangle _Q|^{m} \end{aligned}$$

along with Minkowski’s inequality. \(\square \)
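
The pointwise estimate used in the proof is elementary; one way to verify it (a sketch, with \(A,B\) as shorthand introduced only here) is to note that for \(A,B\ge 0\) and \(0\le k\le m\),

$$\begin{aligned} A^{m-k}B^{k}\le \max (A,B)^{m-k}\max (A,B)^{k}=\max (A,B)^{m}\le A^{m}+B^{m}. \end{aligned}$$

Applying this with \(A=|b(x)-\langle b\rangle _Q|\) and \(B=|b(y)-\langle b\rangle _Q|\) bounds \(\varphi \) pointwise by the sum of the two functions obtained by taking \(k=0\) and \(k=m\) in its definition. Since \(r,t\ge 1\), the triangle inequality in the mixed norm then yields \(c_k\le c_0+c_m\).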

Next we turn to the proof of Theorem 3.3. Its proof is based on the well-known ideas developed in the previous works (e.g., [17, 19, 22, 26]).

Proof of Theorem 3.3

Let \(Q \in \mathcal {Q}\) be a cube that contains the supports of f and g. We will show that there exists a \(\frac{1}{2}\)-sparse family \({{\mathcal {F}}}\subset \mathcal {D}(Q)\) such that

$$\begin{aligned} \begin{aligned}&\int _{{{\mathbb {R}}}^n}|T_b^m(f)||g|=\int _{Q}|T_b^m(f\chi _{3Q})||g|\\&\hspace{0.1cm}\le C\sum _{k=0}^m\Big (\sum _{P\in {\mathcal F}}\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P} \big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',3P}|P|\Big ), \end{aligned} \end{aligned}$$
(3.3)

where C is given by (3.2). Taking \(\mathcal {S} = \{3P:P \in \mathcal {F}\}\) afterwards yields the result.

We construct the family \({{\mathcal {F}}}\subset \mathcal {D}(Q)\) inductively. Set \({{\mathcal {F}}}_0=\{Q\}\). Next, given a collection of pairwise disjoint cubes \({{\mathcal {F}}}_j\), let us describe how to construct \({{\mathcal {F}}}_{j+1}\).

Fix a cube \(P\in {{\mathcal {F}}}_j\). For \(k=0,\dots ,m\) denote

$$\begin{aligned} \eta _k:=(b-\langle b\rangle _{3P})^kf\chi _{3P}, \end{aligned}$$

and consider the sets

$$\begin{aligned} \Omega _k(P):=\bigl \{x\in P:|T(\eta _{k})(x)|>\varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})\langle |\eta _k|\rangle _{r,3P}\bigr \} \end{aligned}$$

and

$$\begin{aligned} {{\mathcal {M}}}_k(P):=\bigl \{x\in P:|\mathcal {M}^{\#}_{T,s}(\eta _{k})(x)|>\varphi _{\mathcal {M}^{\#}_{T,s},r} (\tfrac{1}{(m+1)6^{n+2}})\langle |\eta _k|\rangle _{r,3P}\bigr \}. \end{aligned}$$

Then

$$\begin{aligned} |\Omega _k(P)|\le \tfrac{1}{(m+1)6^{n+2}}|3P|\le \tfrac{1}{3(m+1)2^{n+2}}|P|, \end{aligned}$$

and the same bound holds for \(|{{\mathcal {M}}}_k(P)|\). Since the maximal operator \(M_r\) is weak \(L^r\)-bounded with constant independent of r, there exists a \(c_{n,m}>0\) such that

$$\begin{aligned} M_k(P):=\bigl \{x\in P:M_r(\eta _{k})(x)>c_{n,m}\langle |\eta _k|\rangle _{r,3P}\bigr \} \end{aligned}$$

also satisfies

$$\begin{aligned} |M_k(P)|\le \tfrac{1}{3(m+1)2^{n+2}}|P|. \end{aligned}$$

Therefore, setting

$$\begin{aligned} \Omega (P):=\bigcup _{k=0}^m\bigl (\Omega _k(P)\cup {{\mathcal {M}}}_k(P)\cup M_k(P)\bigr ), \end{aligned}$$

we have \(|\Omega (P)|\le \frac{1}{2^{n+2}}|P|\).
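
The constants here fit together as follows (a short arithmetic check). Since \(|3P|=3^n|P|\) and \(6^{n+2}=2^{n+2}3^{n+2}\),

$$\begin{aligned} \tfrac{1}{(m+1)6^{n+2}}|3P|=\tfrac{3^n}{(m+1)2^{n+2}3^{n+2}}|P|=\tfrac{1}{9(m+1)2^{n+2}}|P|\le \tfrac{1}{3(m+1)2^{n+2}}|P|. \end{aligned}$$

As \(\Omega (P)\) is a union of \(3(m+1)\) sets, each of measure at most \(\tfrac{1}{3(m+1)2^{n+2}}|P|\), we get

$$\begin{aligned} |\Omega (P)|\le 3(m+1)\cdot \tfrac{1}{3(m+1)2^{n+2}}|P|=\tfrac{1}{2^{n+2}}|P|. \end{aligned}$$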

We apply the local Calderón–Zygmund decomposition to \(\chi _{\Omega (P)}\) at height \(\frac{1}{2^{n+1}}\). We obtain a family of pairwise disjoint cubes \(\mathcal {S}_P\subseteq \mathcal {D}(P)\) such that \(|\Omega (P)\setminus \bigcup _{P' \in \mathcal {S}_P}P'|=0\) and for every \(P'\in \mathcal {S}_P\),

$$\begin{aligned} \tfrac{1}{2^{n+1}}|P'|\le |P'\cap \Omega (P)|\le \tfrac{1}{2}|P'|. \end{aligned}$$
(3.4)

In particular, it follows that

$$\begin{aligned} \sum _{P'\in \mathcal {S}_P}|P'|\le 2^{n+1}|\Omega (P)|\le \tfrac{1}{2} |P|. \end{aligned}$$
(3.5)

We define \({{\mathcal {F}}}_{j+1}=\cup _{P\in {{\mathcal {F}}}_j}\mathcal {S}_P\). Setting \({{\mathcal {F}}}=\cup _{j=0}^{\infty }{{\mathcal {F}}}_j\), we note by (3.5) that \({{\mathcal {F}}}\) is \(\frac{1}{2}\)-sparse.
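
The sparseness claim can be checked directly (a sketch; the sets \(E_P\) below are the natural candidates, and the notation is ours). For \(P\in {{\mathcal {F}}}_j\) set \(E_P:=P\setminus \bigcup _{P'\in \mathcal {S}_P}P'\). By (3.5),

$$\begin{aligned} |E_P|\ge |P|-\sum _{P'\in \mathcal {S}_P}|P'|\ge \tfrac{1}{2}|P|. \end{aligned}$$

The sets \(E_P\) are pairwise disjoint: cubes of the same generation are disjoint, and any cube of \({{\mathcal {F}}}\) strictly contained in P lies in some \(P'\in \mathcal {S}_P\), hence outside \(E_P\). For the dilated family \(\{3P:P\in {{\mathcal {F}}}\}\) the same sets satisfy \(|E_P|\ge \frac{1}{2\cdot 3^n}|3P|\), which matches the sparseness constant in Theorem 3.2.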

Now, by iteration, to prove (3.3) it suffices to show for \(j \in {\mathbb {N}}\) and \(P \in \mathcal {F}_j\) that

$$\begin{aligned} \int _{P} \bigl |T_b^m(f\chi _{3P})\bigr ||g|&\le C\sum _{k=0}^m\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P} \big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',3P}|P| \\ {}&\quad +\sum _{P' \in \mathcal {F}_{j+1}:P' \subseteq P}\int _{P'}\bigl |T_b^m(f\chi _{3P'})\bigr ||g|, \end{aligned}$$

where C is given by (3.2). Set \(F_j:=\cup _{P\in {{\mathcal {F}}}_j}P\). Noting that

$$\begin{aligned} \int _{P} \bigl |T_b^m(f\chi _{3P})\bigr ||g|&\le \int _{P\setminus F_{j+1}} \bigl |T_b^m(f\chi _{3P})\bigr ||g|\\&\quad +\sum _{P' \in \mathcal {F}_{j+1} : P' \subseteq P} \int _{P'} \bigl |T_b^m(f\chi _{3P \setminus 3P'})\bigr ||g| \\ {}&\quad +\sum _{P' \in \mathcal {F}_{j+1}:P' \subseteq P}\int _{P'}\bigl |T_b^m(f\chi _{3P'})\bigr ||g|, \end{aligned}$$

it thus suffices to show that

$$\begin{aligned} \begin{aligned}&\int _{P\setminus F_{j+1}} \bigl |T_b^m(f\chi _{3P})\bigr ||g| +\sum _{P' \in \mathcal {F}_{j+1} : P' \subseteq P} \int _{P'} \bigl |T_b^m(f\chi _{3P \setminus 3P'})\bigr ||g|\\&\quad \le C\sum _{k=0}^m\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P} \big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',3P}|P|. \end{aligned} \end{aligned}$$
(3.6)

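We first consider the first term on the left-hand side of (3.6). Since \(T_b^mf=T_{b-c}^mf\) for any \(c\in {\mathbb {C}}\), and since \(\Omega (P)\) is contained in \(F_{j+1}\) up to a set of measure zero, we have by the sublinearity of T and the definition of \(\Omega _k(P)\),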

$$\begin{aligned}&\int _{P\setminus F_{j+1}} \bigl |T_b^m(f\chi _{3P})\bigr ||g|\nonumber \\&\hspace{1cm}\le \sum _{k=0}^m\binom{m}{k}\int _{P\setminus F_{j+1}}|T\big ((b-\langle b\rangle _{3P})^{m-k}f\chi _{3P}\big )||b-\langle b\rangle _{3P}|^{k}|g|\\&\hspace{1cm} \le C_1\sum _{k=0}^m\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P}\big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{1,3P}|P|,\nonumber \end{aligned}$$
(3.7)

where \(C_1:=2^m3^n\varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})\).

Now consider the second term in (3.6). Fix \(P' \in \mathcal {F}_{j+1}\) such that \(P'\subseteq P\) and denote

$$\begin{aligned} \psi _k(x):=T\big ((b-\langle b\rangle _{3P})^{m-k}f\chi _{3P\setminus 3P'}\big )(x), \qquad x \in {\mathbb {R}}^n. \end{aligned}$$

Then, for \(y\in P'\) to be specified later we have

$$\begin{aligned} \begin{aligned} \int _{P'}&\bigl |T_b^m(f\chi _{3P \setminus 3P'})\bigr | |g|\le 2^m\sum _{k=0}^m\int _{P'}|\psi _k||b-\langle b\rangle _{3P}|^{k}|g|\\&\le 2^m\sum _{k=0}^m\int _{P'}|\psi _k(x)-\psi _k(y)||b(x)-\langle b\rangle _{3P}|^{k}|g(x)|\hspace{2pt}\textrm{d}x\\&\hspace{1cm}+2^m\sum _{k=0}^m|\psi _k(y)|\int _{P'}|b(x)-\langle b\rangle _{3P}|^{k}|g(x)|\hspace{2pt}\textrm{d}x. \end{aligned} \end{aligned}$$
(3.8)

Denote

$$\begin{aligned} \xi _k(x):=\bigl (b(x)-\langle b\rangle _{3P}\bigr )^{m-k}f(x)\chi _{3P'}(x),\qquad x \in {\mathbb {R}}^n, \end{aligned}$$

and consider the sets

$$\begin{aligned} {\widetilde{\Omega }}_k(P'):=\{x\in P':|T(\xi _k)(x)|>\varphi _{T,r}(\tfrac{1}{4(m+1)3^n})\langle |\xi _k|\rangle _{r,3P'}\}. \end{aligned}$$

Set \({\widetilde{\Omega }}(P'):=\cup _{k=0}^m{\widetilde{\Omega }}_k(P')\), for which we have \(|{\widetilde{\Omega }}(P')|\le \frac{1}{4}|P'|\). Now, define the good part of the cube \(P'\) as

$$\begin{aligned} G_{P'}:=P'\setminus \bigl (\Omega (P)\cup {\widetilde{\Omega }}(P')\bigr ). \end{aligned}$$

Then, by (3.4), we have

$$\begin{aligned} |G_{P'}|\ge \big (\tfrac{1}{2}-\tfrac{1}{4}\big )|P'|=\tfrac{1}{4}|P'|, \end{aligned}$$

and for all \(y\in G_{P'}\) we have

$$\begin{aligned} |\psi _k(y)|&\le \bigl |T\big ((b-\langle b\rangle _{3P})^{m-k}f\chi _{3P}\big )(y)\bigr |+\bigl |T\big ((b-\langle b\rangle _{3P})^{m-k}f\chi _{3P'}\big )(y)\bigr |\\&\le \varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})\langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P}\\&\hspace{1cm}+\varphi _{T,r}(\tfrac{1}{4(m+1)3^n})\langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P'}. \end{aligned}$$

Further, by the definition of \(M_k(P)\),

$$\begin{aligned} \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P'}\le c_{n,m}\langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P}. \end{aligned}$$

Hence, for all \(y\in G_{P'}\), we have

$$\begin{aligned} |\psi _k(y)|\le 2c_{n,m}\,\varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})\langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P}. \end{aligned}$$
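
The passage to a single constant in the last display relies on the monotonicity of \(\varphi _{T,r}\); a brief check, assuming as we may that \(c_{n,m}\ge 1\): since \(6^{n+2}=36\cdot 2^n\cdot 3^n\ge 4\cdot 3^n\), we have

$$\begin{aligned} \tfrac{1}{(m+1)6^{n+2}}\le \tfrac{1}{4(m+1)3^n}\qquad \text {and hence}\qquad \varphi _{T,r}(\tfrac{1}{4(m+1)3^n})\le \varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}}), \end{aligned}$$

because \(\varphi _{T,r}\) is non-increasing.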

From this, integrating (3.8) over \(y\in G_{P'}\), using Hölder’s inequality and the definition of the sets \({{\mathcal {M}}}_k(P)\), we obtain

$$\begin{aligned}&\int _{P'}\bigl |T_b^m(f\chi _{3P \setminus 3P'})\bigr ||g|\\&\le 4\cdot 2^m \sum _{k=0}^m{{\,\textrm{osc}\,}}_s\big (T(\eta _{m-k}\chi _{{{\mathbb {R}}}^n\setminus 3P'});P'\big )\big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',P'}|P'|\\&\hspace{0.1cm}+2c_{n,m}\,\varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})\sum _{k=0}^m\langle |b-\langle b\rangle _{3P}|^{m-k}|f|\rangle _{r,3P}\big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{1,P'}|P'|\\&\le C_2\sum _{k=0}^m\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P}\big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',P'}|P'|, \end{aligned}$$

where

$$\begin{aligned} C_2:={\tilde{c}}_{n,m}\big (\varphi _{T,r}(\tfrac{1}{(m+1)6^{n+2}})+\varphi _{\mathcal {M}^{\#}_{T,s},r}(\tfrac{1}{(m+1)6^{n+2}})\big ). \end{aligned}$$

By Hölder’s inequality, for any \(q \in [1,\infty )\),

$$\begin{aligned} \sum _{P' \in \mathcal {F}_{j+1}: P' \subseteq P}\langle |h|\rangle _{q,P'}|P'|\le \langle |h|\rangle _{q,P}|P|. \end{aligned}$$
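
In more detail (a sketch, using that the cubes \(P'\in \mathcal {F}_{j+1}\) with \(P'\subseteq P\) are pairwise disjoint), writing \(\frac{1}{q}+\frac{1}{q'}=1\) and summing over these cubes,

$$\begin{aligned} \sum _{P'}\langle |h|\rangle _{q,P'}|P'|&=\sum _{P'}\Bigl (\int _{P'}|h|^q\Bigr )^{1/q}|P'|^{1/q'}\le \Bigl (\sum _{P'}\int _{P'}|h|^q\Bigr )^{1/q}\Bigl (\sum _{P'}|P'|\Bigr )^{1/q'}\\&\le \Bigl (\int _{P}|h|^q\Bigr )^{1/q}|P|^{1/q'}=\langle |h|\rangle _{q,P}|P|. \end{aligned}$$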

Therefore,

$$\begin{aligned}&\sum _{P' \in \mathcal {F}_{j+1} : P' \subseteq P}\int _{P'}\bigl |T_b^m(f\chi _{3P \setminus 3P'})\bigr ||g|\\&\hspace{1cm}\le C_2\sum _{k=0}^m\big \langle |b-\langle b\rangle _{3P}|^{m-k}|f|\big \rangle _{r,3P}\big \langle |b-\langle b\rangle _{3P}|^{k}|g|\big \rangle _{s',3P}|P|, \end{aligned}$$

which, along with (3.7), proves (3.6). This completes the proof. \(\square \)

Remark 3.5

Under the assumptions of Theorem 3.2 and by the “three lattice theorem” (see, e.g., [18]), there exist \(3^n\) dyadic lattices \({{\mathscr {D}}}_j\) so that for any \(f,g\in L^{\infty }_c({{\mathbb {R}}}^n)\) and \(b\in L^1_{\text {loc}}({\mathbb R}^n)\), there exist sparse families \({{\mathcal {S}}}_j\subseteq {{\mathscr {D}}}_j\) such that

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}|T_b^mf||g|&\le C\sum _{j=1}^{3^n}\Bigl (\sum _{Q\in {{{\mathcal {S}}}_j}}\big \langle |b-\langle b\rangle _Q|^{m}|f|\big \rangle _{r,Q}\big \langle |g|\big \rangle _{s',Q}|Q|\\&\hspace{1cm}+\sum _{Q\in {{{\mathcal {S}}}_j}}\big \langle |f|\big \rangle _{r,Q}\big \langle |b-\langle b\rangle _Q|^{m}|g|\big \rangle _{s',Q}|Q|\Bigr ), \end{aligned}$$

where C is given by (3.2).

4 Weighted estimates for fractional sparse forms

In the next section, we will prove quantitative Bloom weighted estimates for the sparse forms in the conclusion of Theorem 3.2. As a preparation, we establish a weighted estimate for fractional sparse forms in this section.

Theorem 4.1

Let \(1\le r<p\le q<s\le \infty \), set \(\alpha = \frac{r}{p}-\frac{r}{q}\) and let \(w \in A_{q/r} \cap \textrm{RH}_{(s/q)'}\). For any sparse family of cubes \({{\mathcal {S}}}\subseteq {{\mathscr {D}}}\), \(f \in L^p(w^{p/q})\) and \(g \in L^{q'}(w^{1-q'})\) we have

$$\begin{aligned} \sum _{Q\in {\mathcal S}}\langle |f|\rangle _{\frac{r}{1+\alpha },Q}\langle |g|\rangle _{s',Q}|Q|^{1+\frac{\alpha }{r}}\lesssim [w]_{A_{q/r}}^{\beta }[w]_{{\textrm{RH}}_{(s/q)'}}^{\beta }\Vert f\Vert _{L^p(w^{p/q})}\Vert g\Vert _{L^{q'}(w^{1-q'})} \end{aligned}$$

with

$$\begin{aligned} \beta ={\max \Bigl ( \tfrac{s(1-\tfrac{\alpha }{r})-1}{s-q},\tfrac{1}{q-r}\Bigr )}. \end{aligned}$$

For \(\alpha =0\), this theorem was proved by Bernicot–Frey–Petermichl [1, Proposition 6.4]. The general case \(\alpha \ge 0\) is a combination of generalizations by Li [23] and Fackler–Hytönen [8].

Remark 4.2

Following the notation of Nieraeth [27], for \(0<r<p<s\le \infty \) and a weight w, define

$$\begin{aligned}{}[w]_{p,(r,s)}:= \sup _{Q \in \mathcal {Q}}\, \langle w^{-1}\rangle _{\frac{pr}{p-r},Q} \langle w\rangle _{\frac{sp}{s-p},Q}. \end{aligned}$$

Upon inspection of the proof, it is clear that the conclusion of Theorem 4.1 can be more symmetrically phrased as

$$\begin{aligned} \sum _{Q\in {\mathcal S}}\langle |f|\rangle _{\frac{r}{1+\alpha },Q}\langle |g|\rangle _{s',Q}|Q|^{1+\frac{\alpha }{r}}\lesssim [w]_{{q,(r,s)}}^{\beta q}\Vert f\Vert _{L^p(w^{p})}\Vert g\Vert _{L^{q'}(w^{-q'})} \end{aligned}$$

for all weights w such that \([w]_{{q,(r,s)}}<\infty \).

As a direct corollary of Theorem 4.1, in the case \(r=1\) and \(s= \infty \), we recover [10, Lemma 3.2], which is a special case of [8, Theorem 1.1]:

Corollary 4.3

Let \(1<p \le q <\infty , \) set \(\alpha = \frac{1}{p}-\frac{1}{q}\) and let \(w \in A_{q}\). For any sparse family of cubes \({{\mathcal {S}}}\subseteq {{\mathscr {D}}}\) and \(f \in L^p(w^{p/q})\) we have

$$\begin{aligned} \Bigl \Vert \sum _{Q\in {\mathcal S}}\langle |f|\rangle _{\frac{1}{1+\alpha },Q}|Q|^{{\alpha }} \chi _Q\Bigr \Vert _{L^q(w)} \lesssim [w]_{A_q}^{\max (1-\alpha ,\frac{1}{q-1})} \Vert f\Vert _{L^p(w^{p/q} )} . \end{aligned}$$

In particular, we recover the well-known bound

$$\begin{aligned} \Bigl \Vert \sum _{Q\in {{\mathcal {S}}}}\langle |f|\rangle _{1,Q} \chi _Q\Bigr \Vert _{L^q(w)}\lesssim [w]_{A_q}^{\max (1,\frac{1}{q-1})}\Vert f\Vert _{L^q(w )}. \end{aligned}$$
(4.1)
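As a sanity check on the exponent in Corollary 4.3, the following sketch specializes the exponent \(\beta \) of Theorem 4.1 to \(r=1\) and \(s=\infty \). The value at \(s=\infty \) is read here as the limit \(s\to \infty \), which is an interpretation rather than something stated explicitly above.

$$\begin{aligned} \lim _{s\to \infty }\frac{s(1-\alpha )-1}{s-q}=1-\alpha ,\qquad \frac{1}{q-r}\Big |_{r=1}=\frac{1}{q-1},\qquad \text {so}\qquad \beta =\max \bigl (1-\alpha ,\tfrac{1}{q-1}\bigr ). \end{aligned}$$

For \(p=q\) one has \(\alpha =0\), \(\langle |f|\rangle _{\frac{1}{1+\alpha },Q}|Q|^{\alpha }=\langle |f|\rangle _{1,Q}\) and \(w^{p/q}=w\), which is how (4.1) arises from the corollary.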

The proof of Theorem 4.1 is based on three main ingredients, the first of which is a very slight generalization of a result of Li [23].

Theorem 4.4

([23, Theorem 1.2]) Let \(1<p\le q<s\le \infty \) and \(r \in (0,p)\). Let w and \(\sigma \) be weights and \(\lambda _Q\ge 0\) for every \(Q \in \mathscr {D}\). Let \(\mathcal {S}\subseteq \mathscr {D}\) be a sparse family of cubes and suppose that \({{\mathcal {N}}}\) is the best constant such that

$$\begin{aligned} \sum _{Q\in {\mathcal {S}}}\langle |f|\rangle _{r,Q}\langle |g|\rangle _{s',Q}\lambda _Q\le {{\mathcal {N}}}\Vert f\Vert _{L^p(w)}\Vert g\Vert _{L^{q'}(\sigma )}. \end{aligned}$$

Denote \(u:=w^{\frac{r}{r-p}}\), \(v:=\sigma ^{\frac{s'}{s'-q'}}\), set

$$\begin{aligned} \tau _Q:=\langle u\rangle _Q^{\frac{1}{r}-1}\langle v\rangle _Q^{-\frac{1}{s}}\frac{\lambda _Q}{|Q|},\qquad Q \in \mathscr {D} \end{aligned}$$

and for \(R \in \mathscr {D}\) define

$$\begin{aligned} T_R f:=\sum _{Q\in {\mathcal {S}}:Q\subseteq R}\tau _Q\langle f\rangle _Q\chi _Q. \end{aligned}$$

Then

$$\begin{aligned} {{\mathcal {N}}}\eqsim \sup _{R\in {\mathcal {S}}}\frac{\Vert T_R(u)\Vert _{L^q(v)}}{u(R)^{1/p}}+\sup _{R\in {\mathcal {S}}}\frac{\Vert T_R(v)\Vert _{L^{p'}(u)}}{v(R)^{1/q'}}. \end{aligned}$$
(4.2)

Proof

In the case \(r \ge 1\), the theorem is exactly [23, Theorem 1.2]. The case \(r<1\) is proven analogously. Indeed, only the proof of the equivalence [23, (2.1)] \(\Leftrightarrow \) [23, (2.2)] needs to be adapted. To handle the average of f in the implication \(\Leftarrow \), one replaces the maximal operator argument by Hölder’s inequality. Conversely, for the implication \(\Rightarrow \), one replaces Hölder’s inequality by a maximal operator argument, using the boundedness of \( M_{1,u}^{\mathcal {S}}\) on \(L^p(u)\). \(\square \)

In order to estimate the two terms in (4.2), we will use the following norm equivalence from Cascante–Ortega–Verbitsky [4].

Lemma 4.5

([4]) Let \(p \in [1,\infty )\), let w be a weight and let \(\lambda _Q \ge 0\) for all \(Q\in \mathscr {D}\). Then

$$\begin{aligned} \Big \Vert \sum _{Q\in {\mathscr {D}}}\lambda _Q\chi _Q\Big \Vert _{L^p(w)}\eqsim \Bigl (\sum _{Q\in {\mathscr {D}}}\lambda _Q\Big (\frac{1}{w(Q)}\sum _{Q'\in {{\mathscr {D}}}, Q'\subseteq Q}\lambda _{Q'}w(Q')\Big )^{p-1}w(Q)\Bigr )^{1/p}. \end{aligned}$$

The final ingredient in the proof of Theorem 4.1 is the following result from Fackler–Hytönen [8].

Lemma 4.6

([8, Lemma 4.2]) Let w and \(\sigma \) be weights and \(\alpha ,\beta ,\gamma \ge 0\) with \(\alpha >0\) and \(\alpha +\beta +\gamma \ge 1\). Then for any sparse family \({\mathcal S}\subseteq {{\mathscr {D}}}\) and \(R \in \mathscr {D}\) we have

$$\begin{aligned} \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R}|Q|^{\alpha }\sigma (Q)^{\beta }w(Q)^{\gamma }\lesssim |R|^{\alpha }\sigma (R)^{\beta }w(R)^{\gamma }. \end{aligned}$$

Combining these three ingredients with the proof strategy from [8], we can now prove Theorem 4.1.

Proof of Theorem 4.1

A direct computation shows that, in the notation of Theorem 4.4, we have \(u=w^{-\frac{r}{q-r}}\), \(v=w^{\frac{s}{s-q}}\) and

$$\begin{aligned} \tau _Q=\langle w^{-\frac{r}{q-r}}\rangle _Q^{\frac{1+\alpha }{r}-1}\langle w^{\frac{s}{s-q}}\rangle _Q^{-\frac{1}{s}}|Q|^{\frac{\alpha }{r}}. \end{aligned}$$
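The direct computation can be spelled out as follows. This is a sketch under the assumptions that Theorem 4.4 is applied with \(r\) replaced by \({\tilde{r}}:=\frac{r}{1+\alpha }\), with the weights \(w^{p/q}\) and \(\sigma =w^{1-q'}\), with \(\lambda _Q=|Q|^{1+\frac{\alpha }{r}}\), and that \(\frac{\alpha }{r}=\frac{1}{p}-\frac{1}{q}\); the symbol \({\tilde{r}}\) is introduced only for this illustration.

$$\begin{aligned} {\tilde{r}}-p&=\frac{r-p(1+\alpha )}{1+\alpha }=\frac{p(r-q)}{q(1+\alpha )},\\ u&=\bigl (w^{\frac{p}{q}}\bigr )^{\frac{{\tilde{r}}}{{\tilde{r}}-p}}=w^{\frac{p}{q}\cdot \frac{rq}{p(r-q)}}=w^{-\frac{r}{q-r}},\\ \frac{s'}{s'-q'}&=\frac{s(q-1)}{q-s},\\ v&=\bigl (w^{1-q'}\bigr )^{\frac{s'}{s'-q'}}=w^{-\frac{1}{q-1}\cdot \frac{s(q-1)}{q-s}}=w^{\frac{s}{s-q}}. \end{aligned}$$

The first line uses \(p\alpha =r-\frac{rp}{q}\). With \(\frac{1}{{\tilde{r}}}-1=\frac{1+\alpha }{r}-1\) and \(\frac{\lambda _Q}{|Q|}=|Q|^{\frac{\alpha }{r}}\), the stated formula for \(\tau _Q\) follows.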

Let us first consider the first testing condition in (4.2). We will show that

$$\begin{aligned} \Vert T_R(u)\Vert _{L^q(v)}\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q}-\frac{1}{s}}[w^{-\frac{r}{q-r}}]_{A_{\infty }}^{\frac{1}{q}}u(R)^{1/p}. \end{aligned}$$
(4.3)

By Lemma 4.5, we have

$$\begin{aligned} \Vert T_R(u)\Vert _{L^q(v)}^q\eqsim \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{\frac{s}{s-q}}\rangle _Q^{2-q-\frac{1}{s}}\langle w^{-\frac{r}{q-r}}\rangle _Q^{\frac{1+\alpha }{r}}|Q|^{2-q+\frac{\alpha }{r}}\Psi (Q)^{q-1}, \end{aligned}$$

where

$$\begin{aligned} \Psi (Q):= \sum _{Q'\in {{\mathcal {S}}}:Q'\subseteq Q} \langle w^{\frac{s}{s-q}}\rangle _{Q'}^{1-\frac{1}{s}}\langle w^{-\frac{r}{q-r}}\rangle _{Q'}^{\frac{1+\alpha }{r}}|Q'|^{1+\frac{\alpha }{r}}. \end{aligned}$$
(4.4)

For \(\delta >0\) we have

$$\begin{aligned}&\Psi (Q)\le [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\delta }\\&\hspace{1.5cm}\cdot \sum _{Q'\in {{\mathcal {S}}}:Q'\subseteq Q}\langle w^{\frac{s}{s-q}}\rangle _{Q'}^{1-\frac{1}{s}-\delta }\langle w^{-\frac{r}{q-r}}\rangle _{Q'}^{\frac{1+\alpha }{r}-\delta \frac{s}{s-q}\frac{q-r}{r}}|Q'|^{1+\frac{\alpha }{r}}. \end{aligned}$$

Our goal now is to use Lemma 4.6. Its assumptions imply the following restrictions on \(\delta \):

$$\begin{aligned} 1-\tfrac{1}{s}-\delta \ge 0\Leftrightarrow & {} \delta \le 1-\tfrac{1}{s}\\ \tfrac{1+\alpha }{r}-\delta \tfrac{s}{s-q}\tfrac{q-r}{r}\ge 0\Leftrightarrow & {} \delta \le \tfrac{1+\alpha }{q-r}\cdot \tfrac{s-q}{s}\\ \tfrac{1}{s}-\tfrac{1}{r}+\delta \bigl (1+\tfrac{s}{s-q}\tfrac{q-r}{r}\bigr )>0\Leftrightarrow & {} \delta > \tfrac{1}{q}-\tfrac{1}{s}. \end{aligned}$$

Moreover, the assumption \(\alpha +\beta +\gamma \ge 1\) of Lemma 4.6 holds trivially because \(\alpha +\beta +\gamma =1+\frac{\alpha }{r}\). We conclude that \(\delta \in \big (\frac{1}{q}-\frac{1}{s},1-\frac{1}{s}\big ]\) and the set of \(\delta \) satisfying all restrictions will be non-empty if

$$\begin{aligned} \tfrac{1}{q}-\tfrac{1}{s}<\tfrac{1+\alpha }{q-r}\cdot \tfrac{s-q}{s}. \end{aligned}$$

It is easily seen that this estimate is true for all \(r<q<s\) and \(\alpha \ge 0\).
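For completeness, here is a short verification of this claim; it only uses \(q<s\), \(r>0\) and \(\alpha \ge 0\). Dividing by the positive factor \(\frac{s-q}{s}\), we get

$$\begin{aligned} \tfrac{1}{q}-\tfrac{1}{s}=\tfrac{1}{q}\cdot \tfrac{s-q}{s}<\tfrac{1+\alpha }{q-r}\cdot \tfrac{s-q}{s}\quad \Longleftrightarrow \quad \tfrac{1}{q}<\tfrac{1+\alpha }{q-r}\quad \Longleftrightarrow \quad q-r<q(1+\alpha ), \end{aligned}$$

and the last inequality holds because \(q-r<q\le q(1+\alpha )\). The remaining upper bound is compatible as well, since \(\frac{1}{q}-\frac{1}{s}<1-\frac{1}{s}\) for \(q>1\).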

Taking \(\delta >0\) satisfying the above restrictions and applying Lemma 4.6, we obtain

$$\begin{aligned} \Psi (Q)\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\delta } \langle w^{\frac{s}{s-q}}\rangle _{Q}^{1-\frac{1}{s}-\delta }\langle w^{-\frac{r}{q-r}}\rangle _{Q}^{\frac{1+\alpha }{r}-\delta \frac{s}{s-q}\frac{q-r}{r}}|Q|^{1+\frac{\alpha }{r}}. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert T_R(u)\Vert _{L^q(v)}^q&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\delta (q-1)} \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{\frac{s}{s-q}}\rangle _{Q}^{\frac{s-q}{s}-\delta (q-1)} \\ {}&\hspace{2cm}\cdot \langle w^{-\frac{r}{q-r}}\rangle _Q^{q\frac{1+\alpha }{r}-(q-1)\delta \frac{s}{s-q}\frac{q-r}{r}}|Q|^{1+\frac{q\alpha }{r}}\\&\le [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{s-q}{s}}\sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{-\frac{r}{q-r}}\rangle _Q^{1+\frac{q\alpha }{r}}|Q|^{1+\frac{q\alpha }{r}}. \end{aligned}$$

Further,

$$\begin{aligned} \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{-\frac{r}{q-r}}\rangle _Q^{1+\frac{q\alpha }{r}}|Q|^{1+\frac{q\alpha }{r}}&\le \Big (\int _R w^{-\frac{r}{q-r}}\Big )^{\frac{q\alpha }{r}}\sum _{Q\in {{\mathcal {S}}}:Q\subseteq R}\int _Qw^{-\frac{r}{q-r}}\\&\lesssim [w^{-\frac{r}{q-r}}]_{A_{\infty }}\Big (\int _Rw^{-\frac{r}{q-r}}\Big )^{1+\frac{q\alpha }{r}}, \end{aligned}$$

which, along with the previous estimate, proves (4.3).
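The passage from the last two displays to (4.3) can be sketched as follows, assuming the relation \(\frac{\alpha }{r}=\frac{1}{p}-\frac{1}{q}\) between the parameters. The first inequality of the last display uses \(\langle u\rangle _Q^{1+\frac{q\alpha }{r}}|Q|^{1+\frac{q\alpha }{r}}=u(Q)^{1+\frac{q\alpha }{r}}\le u(R)^{\frac{q\alpha }{r}}u(Q)\) for \(Q\subseteq R\). Taking q-th roots then gives

$$\begin{aligned} \Vert T_R(u)\Vert _{L^q(v)}\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q}\cdot \frac{s-q}{s}}[w^{-\frac{r}{q-r}}]_{A_{\infty }}^{\frac{1}{q}}\,u(R)^{\frac{1}{q}+\frac{\alpha }{r}},\qquad \tfrac{1}{q}\cdot \tfrac{s-q}{s}=\tfrac{1}{q}-\tfrac{1}{s},\qquad \tfrac{1}{q}+\tfrac{\alpha }{r}=\tfrac{1}{p}. \end{aligned}$$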

Now consider the second testing condition in (4.2). Let us show that

$$\begin{aligned} \Vert T_R(v)\Vert _{L^{p'}(u)}\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q} -\frac{1}{s}}[w^{\frac{s}{s-q}}]_{A_{\infty }}^{\frac{1}{p'}}v(R)^{1/q'}. \end{aligned}$$
(4.5)

Again by Lemma 4.5, we have

$$\begin{aligned} \Vert T_R(v)\Vert _{L^{p'}(u)}^{p'}\eqsim \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{\frac{s}{s-q}}\rangle _Q^{1-\frac{1}{s}}\langle w^{-\frac{r}{q-r}}\rangle _Q^{\frac{1+\alpha }{r}+1-p'}|Q|^{2-p'+\frac{\alpha }{r}}\Psi (Q)^{p'-1}, \end{aligned}$$

where \(\Psi (Q)\) is as in (4.4). By the above estimate for \(\Psi (Q)\),

$$\begin{aligned} \Vert T_R(v)\Vert _{L^{p'}(u)}^{p'}&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\delta (p'-1)} \sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{\frac{s}{s-q}}\rangle _{Q}^{p'(1-\frac{1}{s})-\delta (p'-1)}\\&\hspace{2cm}\cdot \langle w^{-\frac{r}{q-r}}\rangle _Q^{p'\frac{1+\alpha }{r}-(p'-1)(1+\delta \frac{s}{s-q}\frac{q-r}{r})}|Q|^{1+\frac{p'\alpha }{r}}\\&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{p'(\frac{1}{q}-\frac{1}{s})}\sum _{Q\in {{\mathcal {S}}}:Q\subseteq R} \langle w^{\frac{s}{s-q}}\rangle _{Q}^{1+\frac{p'\alpha }{r}}|Q|^{1+\frac{p'\alpha }{r}}\\&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{p'(\frac{1}{q}-\frac{1}{s})}[w^{\frac{s}{s-q}}]_{A_{\infty }}\Big (\int _Rw^{\frac{s}{s-q}}\Big )^{1+\frac{p'\alpha }{r}}, \end{aligned}$$

from which (4.5) follows.

Combining the two estimates for the testing conditions, inequalities (4.3) and (4.5), we obtain

$$\begin{aligned} \mathcal {N}&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q}-\frac{1}{s}}[w^{-\frac{r}{q-r}}]_{A_{\infty }}^{\frac{1}{q}} +[w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q}-\frac{1}{s}}[w^{\frac{s}{s-q}}]_{A_{\infty }}^{\frac{1}{p'}}\\&\lesssim [w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{q}-\frac{1}{s}+\frac{1}{q}\frac{s-q}{s}\frac{r}{q-r}} +[w^{\frac{s}{s-q}}]_{A_{\frac{s}{s-q}\frac{q-r}{r}+1}}^{\frac{1}{p'}+\frac{1}{q}-\frac{1}{s}}, \end{aligned}$$

where \(\mathcal {N}\) is as in Theorem 4.4. From this, since

$$\begin{aligned}{}[w^t]_{A_{t(q-1)+1}}\le [w]_{A_q}^t[w]_{\textrm{RH}_t}^t, \end{aligned}$$

we obtain the conclusion with

$$\begin{aligned} \beta&= {\tfrac{s}{s-q}\max (\tfrac{1}{p'}+\tfrac{1}{q}-\tfrac{1}{s},\tfrac{1}{q}-\tfrac{1}{s}+\tfrac{1}{q}\tfrac{s-q}{s}\tfrac{r}{q-r})}\\&={\max \bigl ( \tfrac{s(1-\tfrac{\alpha }{r})-1}{s-q}, \tfrac{1}{q-r}\bigr )}. \end{aligned}$$

\(\square \)

5 Bloom weighted bounds for sparse forms associated to commutators

In this section we consider one of the sparse forms in the conclusion of Theorem 3.2, namely, \({{\mathcal {B}}}_{{\mathcal S},b,r,s}^m(f,g)\) as defined in the introduction.

Let us start with some definitions. Given \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\), a weight \(\nu \) and \(\alpha \ge 0\), define the weighted, fractional \({{\,\textrm{BMO}\,}}\)-seminorm as

$$\begin{aligned} \Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }^{\alpha }}:=\sup _{Q \in \mathcal {Q}}\frac{1}{\nu (Q)^{1+\frac{\alpha }{n}}}\int _Q|b-\langle b\rangle _Q|. \end{aligned}$$

We omit \(\alpha \) from our notation if \(\alpha =0\). Furthermore, given a cube \(Q \in \mathcal {Q}\), define the oscillation

$$\begin{aligned} \Omega _\nu (b,Q):= \frac{1}{\nu (Q)} \int _Q |b-\langle b\rangle _Q| \end{aligned}$$

and the weighted sharp maximal function

$$\begin{aligned} M^{\#}_\nu (b):=\sup _{Q \in \mathcal {Q}} \,\Omega _\nu (b,Q) \chi _Q. \end{aligned}$$

Note that \( \Vert b\Vert _{{{\,\textrm{BMO}\,}}_\nu } = \Vert M^{\#}_\nu (b)\Vert _{L^\infty ({\mathbb {R}}^n)}. \)

Theorem 5.1

Let \(1\le r<p, q<s\le \infty \), \(m \in {\mathbb {N}}\) and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). Assume that \(\mu \in A_{p/r}\) and \(\lambda \in A_{q/r}\cap {\textrm{RH}}_{(s/q)'}\). Set

$$\begin{aligned} \alpha := -\tfrac{1}{t}:= \tfrac{1}{pm}-\tfrac{1}{qm}, \end{aligned}$$

\(\alpha _+:= \max \{\alpha ,0\}\) and define the Bloom weight

$$\begin{aligned} \nu ^{1+\alpha }:= \mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{qm}}. \end{aligned}$$

For any sparse family \({{\mathcal {S}}}\subseteq \mathscr {D}\), \(f \in L^p(\mu )\) and \(g \in L^{q'}(\lambda ^{1-q'})\) we have

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)\lesssim C(\mu ,\lambda ) \Vert f\Vert _{L^p(\mu )}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}{\left\{ \begin{array}{ll} \Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }^{\alpha n}}^m, \quad &{}p\le q,\\ \Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m, \quad &{}q\le p, \end{array}\right. } \end{aligned}$$

where

[Display defining \(C(\mu ,\lambda )\), consisting of the two bounds labelled (5.1) and (5.2)]

with

$$\begin{aligned} \beta _{\mu _1}&:=\max \bigl (\tfrac{1}{r},\tfrac{1}{p-r}\bigr )+\big (1-\tfrac{1}{rm}\big )(rm-\lfloor rm\rfloor )\max \big (\tfrac{1-\alpha _+}{r-\alpha p},\tfrac{1}{(1+\alpha )p-r}\big )\\&\hspace{2.85cm}+\tfrac{q}{p}\sum _{j=1}^{\lfloor rm\rfloor -1}\tfrac{j}{rm}\max \big (\tfrac{1-\alpha _+}{r+jq\alpha }, \tfrac{1}{(1-j\alpha )q-r}\big ),\\ \beta _{\mu _2}&:= \tfrac{rm}{p-r}\\ \beta _{\lambda _1}&:= \tfrac{p}{q}\tfrac{1}{rm}(rm-\lfloor rm\rfloor )\max \big (\tfrac{1-\alpha _+}{r-\alpha p},\tfrac{1}{(1+\alpha )p-r}\big )\\&\hspace{2.85cm}+\sum _{j=1}^{\lfloor rm\rfloor -1}\big (1-\tfrac{j}{rm}\big )\max \big (\tfrac{1-\alpha _+}{r+jq\alpha }, \tfrac{1}{(1-j\alpha )q-r}\big ),\\ \beta _{\lambda _2}&:=\max \bigl (\tfrac{s(1-\tfrac{\alpha }{r})-1}{s-q},\tfrac{1}{q-r}\bigr ). \end{aligned}$$

Remark 5.2

The point of (5.2) is that, in the case \(q\le p\) and \(m\ge 2\), it provides an additional bound for \(C(\mu ,\lambda )\) which is, in general, incomparable with (5.1). See Sect. 6 for a further discussion of this phenomenon.

Before turning to the proof, let us discuss some particular cases of Theorem 5.1.

Remark 5.3

Suppose that \(s=\infty \) and thus \([\lambda ]_{\textrm{RH}_{(s/q)'}}=1\). In the diagonal case \(p=q\), we have \(\alpha =0\). So, in this case,

$$\begin{aligned} \beta _{\mu _1}&=\bigl (rm-\lfloor rm\rfloor +\tfrac{\lfloor rm\rfloor }{2rm}(1+\lfloor rm\rfloor )\bigr )\max \big (\tfrac{1}{r},\tfrac{1}{p-r}\big ),\\ \beta _{\lambda _1}&=\bigl (\lfloor rm\rfloor -\tfrac{\lfloor rm\rfloor }{2rm}(1+\lfloor rm\rfloor )\bigr )\max \big (\tfrac{1}{r},\tfrac{1}{p-r}\big ). \end{aligned}$$
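These two formulas can be verified by summing the coefficients in the definitions of \(\beta _{\mu _1}\) and \(\beta _{\lambda _1}\). In the sketch below we write \(k:=\lfloor rm\rfloor \) and use that, for \(\alpha =0\) and \(p=q\), every maximum in those definitions equals \(\max \big (\tfrac{1}{r},\tfrac{1}{p-r}\big )\) and \(\tfrac{q}{p}=1\).

$$\begin{aligned} 1+\bigl (1-\tfrac{1}{rm}\bigr )(rm-k)+\sum _{j=1}^{k-1}\tfrac{j}{rm}&=rm-k+\tfrac{k}{rm}+\tfrac{k(k-1)}{2rm}=rm-k+\tfrac{k(k+1)}{2rm},\\ \tfrac{1}{rm}(rm-k)+\sum _{j=1}^{k-1}\bigl (1-\tfrac{j}{rm}\bigr )&=k-\tfrac{k}{rm}-\tfrac{k(k-1)}{2rm}=k-\tfrac{k(k+1)}{2rm}. \end{aligned}$$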

If we additionally assume that \(rm\in {{\mathbb {N}}}\), we get

$$\begin{aligned} \beta _{\mu _1}&=\tfrac{rm+1}{2}\max \big (\tfrac{1}{r},\tfrac{1}{p-r}\big ),\\ \beta _{\lambda _1}&=\tfrac{rm-1}{2}\max \big (\tfrac{1}{r},\tfrac{1}{p-r}\big ). \end{aligned}$$

In particular, for \(r=1\), we have

$$\begin{aligned} \beta _{\mu _1}=\beta _{\lambda _1}+\beta _{\lambda _2} =\tfrac{m+1}{2}\max \big (1,\tfrac{1}{p-1}\big ). \end{aligned}$$

Therefore, if \(r=1\), \(s=\infty \) and \(p=q\), we have

$$\begin{aligned} C(\mu ,\lambda )\le {\left\{ \begin{array}{ll} \bigl ([\mu ]_{A_{p}} [\lambda ]_{A_{p}}\bigr )^{\frac{m+1}{2}\max (1,\frac{1}{p-1})},\quad &{} m\ge 1,\\ {[}\mu ]_{A_{p}}^{\frac{m}{p-1}}[\lambda ]_{A_{p}}^{\max (1,\frac{1}{p-1})},\quad &{}m\ge 2, \end{array}\right. } \end{aligned}$$

the first of which was obtained in [21]. Note that the second estimate is, in general, incomparable to the first.

Remark 5.4

In the spirit of Remark 4.2, we note that the conclusion of Theorem 5.1 can be replaced by

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)&\lesssim \Vert f\Vert _{L^p(\mu ^p)}\Vert g\Vert _{L^{q'}(\lambda ^{-q'})} \\ {}&\quad \cdot {\left\{ \begin{array}{ll} \Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }^{\alpha n}}^m [\mu ]_{p, (r,\infty )}^{p\beta _{\mu _1} } [\lambda ]_{q,(r,\infty )}^{q\beta _{\lambda _1}}[\lambda ]_{q,(r,s)}^{q\beta _{\lambda _2}}, &{}p \le q,\, m\ge 1,\\ \Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m [\mu ]_{p, (r,\infty )}^{p\beta _{\mu _1} } [\lambda ]_{q,(r,\infty )}^{q\beta _{\lambda _1}}[\lambda ]_{q,(r,s)}^{q\beta _{\lambda _2}},\qquad &{}q \le p,\, m\ge 1,\\ \Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m [\mu ]_{p, (r,\infty )}^{p\beta _{\mu _2} } [\lambda ]_{q,(r,s)}^{q\beta _{\lambda _2}}, &{}q \le p,\, m\ge 2, \end{array}\right. } \end{aligned}$$

for all weights \(\mu \) such that \( [\mu ]_{p, (r,\infty )}<\infty \) and weights \(\lambda \) such that \([\lambda ]_{q,(r,s)}<\infty \).

Several statements below will be needed to prove Theorem 5.1, starting with the following lemma from Rivera-Ríos and the first and third authors [20].

Lemma 5.5

([20, Lemma 5.1]) Let \(b\in L^1_{\text {loc}}({{\mathbb {R}}}^n)\) and let \({\mathcal S}\subseteq {{\mathscr {D}}}\) be a sparse family. There exists a sparse family \({{\mathcal {S}}}'\subseteq {{\mathscr {D}}}\) such that \({\mathcal S}\subseteq {{\mathcal {S}}}'\) and for every cube \(Q\in {{\mathcal {S}}}'\),

$$\begin{aligned} |b-\langle b\rangle _Q|\chi _Q\lesssim \sum _{P\in {{\mathcal {S}}}':P\subseteq Q}\big \langle |b-\langle b\rangle _P|\big \rangle _{P}\chi _P. \end{aligned}$$

We also need the following additional result from Cascante–Ortega–Verbitsky [4].

Lemma 5.6

([4, (2.4)]) Let \(p \in [1,\infty )\) and \(\lambda _Q \ge 0\) for all \(Q \in \mathscr {D}\). Then we have

$$\begin{aligned} \Big (\sum _{Q\in {{\mathscr {D}}}}\lambda _Q\chi _Q\Big )^p\le p\sum _{Q\in {{\mathscr {D}}}}\lambda _Q\chi _Q\Big (\sum _{Q'\in {{\mathscr {D}}}:Q'\subseteq Q}\lambda _{Q'}\chi _{Q'}\Big )^{p-1}. \end{aligned}$$

We are now ready to prove Theorem 5.1 in the case \(p \le q\).

Proof of (5.1) in Theorem 5.1 in the case \(p\le q\)

By Lemma 5.5, there exists a sparse collection of cubes \(\mathcal {S}\subseteq \mathcal {S}'\subseteq {{\mathscr {D}}}\) such that for any \(Q \in \mathcal {S}\),

$$\begin{aligned} \bigl \langle |b-\langle b\rangle _Q|^mf\bigr \rangle _{r,Q}^r&\lesssim \frac{1}{|Q|} \int _Q \Bigl (\sum _{P \in \mathcal {S'}:P\subseteq Q} \frac{1}{|P|}\int _P |b-\langle b\rangle _P|\chi _P\Bigr )^{rm} |f|^r \\&\lesssim \Vert b\Vert _{{{\,\textrm{BMO}\,}}_\nu ^{\alpha n}}^{rm} \frac{1}{|Q|} \int _Q \Bigl (\sum _{P \in \mathcal {S}':P\subseteq Q}\frac{\nu (P)^{1+\alpha }}{|P|}\chi _P \Bigr )^{rm} |f|^r. \end{aligned}$$
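The second inequality only uses the definition of the fractional seminorm with parameter \(\alpha n\), for which the exponent of \(\nu (Q)\) is \(1+\frac{\alpha n}{n}=1+\alpha \). As a brief sketch, for every cube P,

$$\begin{aligned} \frac{1}{|P|}\int _P|b-\langle b\rangle _P|\le \Vert b\Vert _{{\textrm{BMO}}_\nu ^{\alpha n}}\,\frac{\nu (P)^{1+\alpha }}{|P|}, \end{aligned}$$

and the constant \(\Vert b\Vert _{{\textrm{BMO}}_\nu ^{\alpha n}}\) is then pulled out of the sum raised to the power rm.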

Let \(k:=\lfloor rm\rfloor \) and \(\gamma :=rm-(k-1) \in [1,2)\). Applying Lemma 5.6 successively \((k-1)\) times yields

$$\begin{aligned}&\int _Q \Bigl (\sum _{P \in \mathcal {S}':P\subseteq Q}\frac{\nu (P)^{1+\alpha }}{|P|}\chi _P \Bigr )^{rm} |f|^r\\&\quad \lesssim \sum _{P_{k-1} \subseteq \cdots \subseteq P_1 \subseteq Q} \frac{\nu (P_1)^{1+\alpha }}{|P_1|} \cdots \frac{\nu (P_{k-1})^{1+\alpha }}{|P_{k-1}|} \int _{P_{k-1}} \Bigl (\sum _{P_k \subseteq P_{k-1}}\frac{\nu (P_k)^{1+\alpha }}{|P_k|}\chi _{P_k} \Bigr )^{\gamma }|f|^r, \end{aligned}$$

where we omitted the assumption \(P_1,\ldots ,P_k \in \mathcal {S}'\) from our notation for brevity.

For \(\delta \ge 0\) we denote

$$\begin{aligned} \mathcal {A}_{\mathcal {S}',\delta }(\varphi ):= \sum _{Q\in {\mathcal S}'}\langle |\varphi |\rangle _{\frac{1}{1+\delta },Q}|Q|^{{\delta }} \chi _Q, \end{aligned}$$

in which we omit \(\delta \) if \(\delta = 0\). Using Lemma 4.5 with the weight \(w = |f|^r\) and Minkowski’s inequality, we have

$$\begin{aligned}&\int _{P_{k-1}}\Bigl (\sum _{P_k \subseteq P_{k-1}}\frac{\nu (P_k)^{1+\alpha }}{|P_k|}\chi _{P_k} \Bigr )^{\gamma }|f|^r\\&\eqsim \sum _{P_k\subseteq P_{k-1}}\frac{\nu ({P_k})^{1+\alpha }}{|{P_k}|} \Bigl (\int _{P_k} |f|^r\Bigr )^{2-\gamma } \Big (\sum _{P\subseteq P_k}\frac{\nu (P)^{1+\alpha }}{|P|}\int _{P}|f|^r\Big )^{\gamma -1}\\&\le \sum _{P_k\subseteq P_{k-1}}\frac{\nu ({P_k})^{1+\alpha }}{|{P_k}|} \Bigl (\int _{P_k} |f|^r\Bigr )^{2-\gamma } \Big (\int _{P_k}{{\mathcal {A}}}_{{{\mathcal {S}}}'}(|f|^r)^{\frac{1}{1+\alpha }}\nu \Big )^{(1+\alpha )(\gamma -1)}\\&= \sum _{{P_k}\subseteq P_{k-1}}{\nu ({P_k})^{1+\alpha }} \langle |f|^r\rangle _{1,P_k}^{2-\gamma } \cdot \bigl \langle \mathcal {A}_{\mathcal {S}'}(|f|^r)\nu \bigr \rangle _{\frac{1}{1+\alpha },P_k}^{\gamma -1}|P_k|^{\alpha (\gamma -1)}\\&\le \Bigl (\int _{P_{k-1}} \Bigl (\mathcal {A}_{\mathcal {S}'}(|f|^r)^{2-\gamma } \cdot \mathcal {A}_{\mathcal {S}',\alpha }\bigl ({{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r) \nu ^{1+\alpha } \bigr )^{\gamma -1} \cdot \nu ^{1+\alpha }\Bigr )^{\frac{1}{1+\alpha }} \Bigr )^{1+\alpha }\\&=: \Bigl (\int _{P_{k-1}} h^{\frac{1}{1+\alpha }} \Bigr )^{1+\alpha }. \end{aligned}$$

Let us further write

$$\begin{aligned} \mathcal {A}_{\mathcal {S}',\alpha ,\nu }(\varphi ):=\mathcal {A}_{\mathcal {S}',\alpha }(\varphi )\nu ^{1+\alpha }, \end{aligned}$$

and let \(\mathcal {A}_{\mathcal {S}',\alpha ,\nu }^j\) be the j-th iteration of \(\mathcal {A}_{\mathcal {S}',\alpha ,\nu }\). Using Minkowski’s inequality \((k-1)\) more times, we find

$$\begin{aligned} \sum _{P_{k-1} \subseteq \cdots \subseteq P_1 \subseteq Q} \frac{\nu (P_1)^{1+\alpha }}{|P_1|} \cdots \frac{\nu (P_{k-1})^{1+\alpha }}{|P_{k-1}|}&\Bigl (\int _{P_{k-1}} h^{\frac{1}{1+\alpha }} \Bigr )^{1+\alpha } \\ {}&\le \Bigl (\int _{Q} \bigl (\mathcal {A}^{k-1}_{\mathcal {S}',\alpha ,\nu }(h)\bigr )^{\frac{1}{1+\alpha }}\Bigr )^{1+\alpha }, \end{aligned}$$

which, along with the previous estimates, implies

$$\begin{aligned} \bigl \langle |b-\langle b\rangle _Q|^mf\bigr \rangle _{r,Q}^r\lesssim \Vert b\Vert _{{{\,\textrm{BMO}\,}}_\nu ^{\alpha n}}^{rm} \bigl \langle \mathcal {A}^{k-1}_{\mathcal {S}',\alpha ,\nu }(h)\bigr \rangle _{\frac{1}{1+\alpha },Q}|Q|^\alpha . \end{aligned}$$

From this, we conclude

$$\begin{aligned} {{\mathcal {B}}}_{\mathcal {S},b,r,s}^m(f,g)&\lesssim \Vert b\Vert _{{{\,\textrm{BMO}\,}}_\nu ^{\alpha n}}^{m}\sum _{Q\in {{\mathcal {S}}}}\langle (\mathcal {A}^{k-1}_{\mathcal {S}',\alpha ,\nu }(h))^{\frac{1}{r}}\rangle _{\frac{r}{1+\alpha },Q}\langle |g|\rangle _{s',Q}|Q|^{1+\frac{\alpha }{r}}. \end{aligned}$$

Now, for \(j=1,\ldots ,k-1\), define

$$\begin{aligned} \tfrac{1}{u_j}&:=\tfrac{1}{q}+j\tfrac{\alpha }{r} = \tfrac{j}{rm}\tfrac{1}{p}+(1-\tfrac{j}{rm})\tfrac{1}{q},\\ w_j&:= \mu ^{\frac{j}{rm}\frac{u_j}{p}}\lambda ^{(1-\frac{j}{rm})\frac{u_j}{q}}. \end{aligned}$$
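The two expressions for \(\frac{1}{u_j}\) agree because \(\alpha =\frac{1}{m}\bigl (\frac{1}{p}-\frac{1}{q}\bigr )\). As a quick check:

$$\begin{aligned} \tfrac{1}{q}+j\tfrac{\alpha }{r}=\tfrac{1}{q}+\tfrac{j}{rm}\bigl (\tfrac{1}{p}-\tfrac{1}{q}\bigr )=\tfrac{j}{rm}\tfrac{1}{p}+\bigl (1-\tfrac{j}{rm}\bigr )\tfrac{1}{q}. \end{aligned}$$

Since \(0<\frac{j}{rm}<1\) for \(1\le j\le k-1\), the exponent \(\frac{1}{u_j}\) is a convex combination of \(\frac{1}{p}\) and \(\frac{1}{q}\), and \(w_j\) interpolates between \(\mu \) and \(\lambda \) with the same coefficients.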

By Theorem 4.1, we have

$$\begin{aligned} {{\mathcal {B}}}_{\mathcal {S},b,r,s}^m(f,g)&\lesssim C(\lambda ) \,\Vert b\Vert _{{{\,\textrm{BMO}\,}}_\nu ^{\alpha n}}^m \Vert (\mathcal {A}^{k-1}_{\mathcal {S}',\alpha ,\nu }(h))^{\frac{1}{r}}\Vert _{L^{u_1}(\lambda ^{u_1/q})}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}, \end{aligned}$$

where

$$\begin{aligned} C(\lambda ):=\big ([\lambda ]_{A_{q/r}}[\lambda ]_{{\textrm{RH}}_{(s/q)'}}\big )^{\max \bigl ( \frac{s(1-\frac{\alpha }{r})-1}{s-q},\frac{1}{q-r}\bigr )}. \end{aligned}$$

Next, we apply Corollary 4.3 \((k-1)\) times to obtain

$$\begin{aligned} \Vert \mathcal {A}^{k-1}_{\mathcal {S}',\alpha ,\nu }(h)^{\frac{1}{r}}\Vert _{L^{u_1}(\lambda ^{u_1/q})}^r&=\Vert {{\mathcal {A}}}_{{{\mathcal {S}}}',\alpha }({{\mathcal {A}}}_{{{\mathcal {S}}}',\alpha ,\nu }^{k-2}(h))\Vert _{L^{u_1/r}(w_1)}\\&\lesssim [w_1 ]_{A_{u_1/r}}^{\max (1-\alpha , \frac{r}{u_1-r})} \Vert \mathcal {A}^{k-2}_{\mathcal {S}',\alpha ,\nu }(h)\Vert _{L^{u_2/r}(w_1)}\\ {}&\lesssim \cdots \\&\lesssim \Bigl (\prod _{j=1}^{k-1} [w_j]_{A_{u_j/r}}^{\max (1-\alpha , \frac{r}{u_j-r})}\Bigr ) \Vert h\Vert _{L^{u_k/r}(w_{k-1})}. \end{aligned}$$
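The first equality in this chain rests on a weight identity, sketched here from the definitions of \(\nu \) and \(w_1\). Since \(\nu ^{1+\alpha }=\mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{qm}}\), we have

$$\begin{aligned} \bigl (\nu ^{1+\alpha }\bigr )^{\frac{u_1}{r}}\lambda ^{\frac{u_1}{q}}=\mu ^{\frac{1}{rm}\frac{u_1}{p}}\lambda ^{(1-\frac{1}{rm})\frac{u_1}{q}}=w_1, \end{aligned}$$

so the factor \(\nu ^{1+\alpha }\) in \({{\mathcal {A}}}_{{{\mathcal {S}}}',\alpha ,\nu }\) is absorbed into the weight. The fractional parameter needed for Corollary 4.3 is \(\frac{r}{u_2}-\frac{r}{u_1}=\alpha \), with \(u_2\) given by the same formula as \(u_1\), which matches the index of \({{\mathcal {A}}}_{{{\mathcal {S}}}',\alpha }\).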

Now note that, for \(\frac{1}{v}:= \frac{1}{p} -\frac{\alpha }{r}\), we have

$$\begin{aligned} \tfrac{1}{u_k} = \tfrac{1}{q} + (rm-(\gamma -1))\tfrac{\alpha }{r} = (2-\gamma )\tfrac{1}{p} +(\gamma -1)\tfrac{1}{v}. \end{aligned}$$

So, defining

$$\begin{aligned} w_v:=\mu ^{(1-\frac{1}{rm})\frac{v}{p}} \lambda ^{\frac{1}{rm}\frac{v}{q}}, \end{aligned}$$

we have by Hölder’s inequality,

$$\begin{aligned}&\Vert h\Vert _{L^{u_k/r}(w_{k-1})}\\&=\bigl \Vert {{\mathcal {A}}}_{\mathcal {S}'}(|f|^r)^{2-\gamma } {{{\mathcal {A}}}}_{\mathcal {S}',\alpha }\bigl ({{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r) \nu ^{1+\alpha }\bigr )^{\gamma -1} \nu ^{1+\alpha }\bigr \Vert _{L^{u_k/r}( \mu ^{(1-\frac{\gamma }{rm})\frac{u_{k}}{p}}\lambda ^{\frac{\gamma }{rm}\frac{u_{k}}{q}})}\\&=\bigl \Vert {{\mathcal {A}}}_{\mathcal {S}'}(|f|^r)^{2-\gamma } {{{\mathcal {A}}}}_{\mathcal {S}',\alpha }\bigl ({{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r) \nu ^{1+\alpha }\bigr )^{\gamma -1} \lambda ^{\frac{\gamma -1}{qm}} \mu ^{\frac{r}{p}-\frac{\gamma -1}{pm}}\bigr \Vert _{L^{u_k/r}}\\&\le \bigl \Vert {\mathcal A}_{\mathcal {S}'}(|f|^r)\bigr \Vert _{L^{p/r}(\mu )}^{2-\gamma } \bigl \Vert {{\mathcal A}}_{\mathcal {S}',\alpha }\bigl ({{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r) \nu ^{1+\alpha }\bigr )\bigr \Vert _{L^{v/r}(w_v)}^{\gamma -1}. \end{aligned}$$

For the first term on the right-hand side, by (4.1), we have

$$\begin{aligned} \bigl \Vert {{\mathcal {A}}}_{\mathcal {S}'}(|f|^r) \bigr \Vert _{L^{p/r}(\mu )}^{2-\gamma }&\lesssim [\mu ]_{A_{p/r}}^{(2-\gamma )\max (1,\frac{r}{p-r})} \Vert f\Vert _{L^p(\mu )}^{(2-\gamma )r}. \end{aligned}$$

For the second term, by applying Corollary 4.3 and (4.1), we have

$$\begin{aligned} \bigl \Vert {{{\mathcal {A}}}}_{\mathcal {S}',\alpha }&\bigl ({{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r) \nu ^{1+\alpha }\bigr )\bigr \Vert _{L^{v/r}(w_v)}^{\gamma -1} \\&\lesssim [w_v ]_{A_{v/r}}^{(\gamma -1) \max (1-\alpha ,\frac{r}{v-r})} \bigl \Vert {{{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r)}\bigr \Vert _{L^{p/r}(\mu )}^{\gamma -1}\\&\lesssim [w_v ]_{A_{v/r}}^{(\gamma -1) \max (1-\alpha ,\frac{r}{v-r})} [\mu ]_{A_{p/r}}^{(\gamma -1)\max (1,\frac{r}{p-r})} \Vert f\Vert _{L^p(\mu )}^{(\gamma -1)r}. \end{aligned}$$

Collecting our estimates, we have shown

$$\begin{aligned} {{\mathcal {B}}}_{\mathcal {S},b,r,s}^m(f,g)\lesssim C(\mu ,\lambda ) \Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }^{\alpha n}}^m \Vert f\Vert _{L^p(\mu )}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})} \end{aligned}$$

with

$$\begin{aligned} C(\mu ,\lambda )&:= C(\lambda ) \cdot [\mu ]_{A_{p/r}}^{\max (\frac{1}{r},\frac{1}{p-r})} \cdot [w_v]_{A_{v/r}}^{(\gamma -1) \max (\frac{1-\alpha }{r},\frac{1}{v-r})} \\ {}&\quad \cdot \Bigl (\prod _{j=1}^{k-1} [ w_j]_{A_{u_j/r}}^{\max (\frac{1-\alpha }{r}, \frac{1}{u_j-r})}\Bigr ). \end{aligned}$$
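The bookkeeping behind this expression is short and can be sketched from the displays above: the two powers of \([\mu ]_{A_{p/r}}\) combine because \((2-\gamma )+(\gamma -1)=1\), and every exponent is divided by r because the iteration estimates the r-th power of the norm.

$$\begin{aligned} (2-\gamma )\max \bigl (1,\tfrac{r}{p-r}\bigr )+(\gamma -1)\max \bigl (1,\tfrac{r}{p-r}\bigr )&=\max \bigl (1,\tfrac{r}{p-r}\bigr ),\\ \tfrac{1}{r}\max \bigl (1,\tfrac{r}{p-r}\bigr )&=\max \bigl (\tfrac{1}{r},\tfrac{1}{p-r}\bigr ),\\ \tfrac{1}{r}\max \bigl (1-\alpha ,\tfrac{r}{u_j-r}\bigr )&=\max \bigl (\tfrac{1-\alpha }{r},\tfrac{1}{u_j-r}\bigr ). \end{aligned}$$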

By Hölder’s inequality, since \(\frac{1}{v}=(1-\frac{1}{rm})\frac{1}{p}+\frac{1}{rm}\frac{1}{q}\), we have

$$\begin{aligned}{}[w_v]_{A_{v/r}} = [\mu ^{(1-\frac{1}{rm})\frac{v}{p}} \lambda ^{\frac{1}{rm}\frac{v}{q}} ]_{A_{v/r}}\le [\mu ]_{A_{p/r}}^{(1-\frac{1}{rm})\frac{v}{p}}[\lambda ]_{A_{q/r}}^{\frac{1}{rm}\frac{v}{q}}. \end{aligned}$$

Similarly, we have

$$\begin{aligned}{}[w_j]_{A_{u_j/r}} = [\mu ^{\frac{j}{rm}\frac{u_j}{p}}\lambda ^{(1-\frac{j}{rm})\frac{u_j}{q}}]_{A_{u_j/r}}\le [\mu ]_{A_{p/r}}^{\frac{j}{rm}\frac{u_j}{p}}[\lambda ]_{A_{q/r}}^{(1-\frac{j}{rm})\frac{u_j}{q}}. \end{aligned}$$
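The Hölder step behind the last two displays can be sketched for a general convex combination. Assume the standard normalization \([w]_{A_{\rho }}=\sup _Q\langle w\rangle _{1,Q}\langle w^{-\frac{1}{\rho -1}}\rangle _{1,Q}^{\rho -1}\) (an assumption here, since the definition is not restated in this section), and let \(\theta \in [0,1]\) and \(\frac{1}{\eta }=\frac{\theta }{p}+\frac{1-\theta }{q}\); the symbols \(\theta \), \(\rho \) and \(\eta \) are introduced only for this illustration. Since \(\theta \frac{\eta }{p}+(1-\theta )\frac{\eta }{q}=1\) and \(\theta \frac{\eta }{p}(p-r)+(1-\theta )\frac{\eta }{q}(q-r)=\eta -r\), two applications of Hölder's inequality give

$$\begin{aligned} \bigl \langle \mu ^{\theta \frac{\eta }{p}}\lambda ^{(1-\theta )\frac{\eta }{q}}\bigr \rangle _{1,Q}&\le \langle \mu \rangle _{1,Q}^{\theta \frac{\eta }{p}}\langle \lambda \rangle _{1,Q}^{(1-\theta )\frac{\eta }{q}},\\ \bigl \langle \bigl (\mu ^{\theta \frac{\eta }{p}}\lambda ^{(1-\theta )\frac{\eta }{q}}\bigr )^{-\frac{r}{\eta -r}}\bigr \rangle _{1,Q}^{\frac{\eta -r}{r}}&\le \langle \mu ^{-\frac{r}{p-r}}\rangle _{1,Q}^{\theta \frac{\eta }{p}\frac{p-r}{r}}\langle \lambda ^{-\frac{r}{q-r}}\rangle _{1,Q}^{(1-\theta )\frac{\eta }{q}\frac{q-r}{r}}. \end{aligned}$$

Multiplying the two lines and taking the supremum over Q yields \([\mu ^{\theta \frac{\eta }{p}}\lambda ^{(1-\theta )\frac{\eta }{q}}]_{A_{\eta /r}}\le [\mu ]_{A_{p/r}}^{\theta \frac{\eta }{p}}[\lambda ]_{A_{q/r}}^{(1-\theta )\frac{\eta }{q}}\). The choice \(\theta =1-\frac{1}{rm}\) corresponds to \(w_v\), and \(\theta =\frac{j}{rm}\) corresponds to \(w_j\).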

From this and from the above expression for \(C(\mu ,\lambda )\), the values of \(\beta _{\mu _1}\) and \(\beta _{\lambda _1}\) follow by direct computation. \(\square \)

We now turn to the case \(q \le p\). We start with the estimate in (5.1), which works for any \(m\ge 1\).

Proof of (5.1) in Theorem 5.1 in the case \(q\le p\)

By Lemma 5.5, there exists a sparse collection of cubes \(\mathcal {S}\subseteq \mathcal {S}'\subseteq {{\mathscr {D}}}\) such that for any \(Q \in \mathcal {S}\),

$$\begin{aligned} \bigl \langle |b-\langle b\rangle _Q|^mf\bigr \rangle _{r,Q}^r&\lesssim \frac{1}{|Q|} \int _Q \Bigl (\sum _{P \in \mathcal {S'}:P\subseteq Q} \frac{1}{|P|}\int _P |b-\langle b\rangle _P|\chi _P\Bigr )^{rm} |f|^r \\&=\frac{1}{|Q|} \int _Q \Bigl (\sum _{P \in \mathcal {S'}:P\subseteq Q} \frac{\nu (P)}{|P|}\Omega _{\nu }(b,P)\chi _P\Bigr )^{rm}|f|^r. \end{aligned}$$

Let \(k:=\lfloor rm\rfloor \) and \(\gamma :=rm-(k-1) \in [1,2)\). Applying Lemma 5.6 successively \((k-1)\) times yields

$$\begin{aligned}&\int _Q \Bigl (\sum _{P \in \mathcal {S}':P\subseteq Q}\frac{\nu (P)}{|P|}\Omega _{\nu }(b,P)\chi _P \Bigr )^{rm} |f|^r\\&\quad \lesssim \sum _{P_{k-1} \subseteq \cdots \subseteq P_1 \subseteq Q} \frac{\nu (P_1)}{|P_1|}\Omega _{\nu }(b,P_1) \cdots \frac{\nu (P_{k-1})}{|P_{k-1}|}\Omega _{\nu }(b,P_{k-1})\\&\qquad \quad \cdot \int _{P_{k-1}} \Bigl (\sum _{P_k \subseteq P_{k-1}}\frac{\nu (P_k)}{|P_k|}\Omega _{\nu }(b,P_k)\chi _{P_k} \Bigr )^{\gamma }|f|^r, \end{aligned}$$

where we omitted the assumption \(P_1,\ldots ,P_k \in \mathcal {S}'\) from our notation for brevity.

Define

$$\begin{aligned} \mathcal {A}_{\mathcal {S}'}(\varphi ) := \sum _{Q\in {{\mathcal {S}}}'}\langle |\varphi |\rangle _{{1},Q} \chi _Q,\\ \mathcal {A}_{\mathcal {S}',\nu ,b}(\varphi ):=\mathcal {A}_{\mathcal {S}'}(\varphi )M^{\#}_\nu (b)\nu \end{aligned}$$

and let \(\mathcal {A}_{\mathcal {S}',\nu ,b}^j\) be the j-th iteration of \(\mathcal {A}_{\mathcal {S}',\nu ,b}\). By Lemma 4.5 and Hölder’s inequality, we can estimate

$$\begin{aligned}&\int _{P_{k-1}} \Bigl (\sum _{P_k\subseteq P_{k-1}}\frac{\nu (P_k)}{|P_k|} \Omega _{\nu }(b,P_{k})\chi _{P_k} \Bigr )^\gamma |f|^r\\&\quad \eqsim \sum _{P_k\subseteq P_{k-1}}\frac{\nu (P_k)}{|P_k|} \Omega _{\nu }(b,P_{k}) \Bigl (\int _{P_k} |f|^r\Bigr )^{2-\gamma }\Bigl (\sum _{P\subseteq P_k}\frac{\nu (P)}{|P|} \Omega _{\nu }(b,P) \int _{P} |f|^r\Bigr )^{\gamma -1}\\&\quad \le \sum _{P_k\subseteq P_{k-1}}{\nu (P_k)}\Omega _\nu (b,P_k)\Bigl (\frac{1}{|P_k|}\int _{P_k}|f|^r\Bigr )^{2-\gamma }\Bigl (\frac{1}{|P_k|}\int _{P_k}{{{\mathcal {A}}}}_{{\mathcal {S}',\nu ,b}}(|f|^r)\Bigr )^{\gamma -1}\\&\quad \le \int _{P_{k-1}}{ \mathcal {A}_{\mathcal {S}'}(|f|^r)^{2-\gamma } \cdot \mathcal {A}_{\mathcal {S}'}({{\mathcal A}}_{{\mathcal {S}',\nu ,b}}(|f|^r))^{\gamma -1}}\cdot M^{\#}_\nu (b)\nu =: \int _{P_{k-1}} h. \end{aligned}$$

Next, we can iteratively estimate

$$\begin{aligned}&\sum _{P_{k-1} \subseteq \cdots \subseteq P_1 \subseteq Q} \frac{\nu (P_1)}{|P_1|}\Omega _{\nu }(b,P_1) \cdots \frac{\nu (P_{k-1})}{|P_{k-1}|}\Omega _{\nu }(b,P_{k-1})\int _{P_{k-1}}h\\&\quad \le \sum _{P_{k-2} \subseteq \cdots \subseteq P_1 \subseteq Q} \frac{\nu (P_1)}{|P_1|}\Omega _{\nu }(b,P_1) \cdots \frac{\nu (P_{k-2})}{|P_{k-2}|}\Omega _{\nu }(b,P_{k-2})\int _{P_{k-2}}\mathcal {A}_{\mathcal {S}',\nu ,b}(h)\\&\quad \le \cdots \le \int _Q\mathcal {A}_{\mathcal {S}',\nu ,b}^{k-1}(h). \end{aligned}$$

Combined with the previous estimates, this implies

$$\begin{aligned} {{\mathcal {B}}}_{\mathcal {S},b,r,s}^m(f,g)\lesssim \sum _{Q\in {\mathcal S}}\langle \mathcal {A}_{\mathcal {S}',\nu ,b}^{k-1}(h)^{\frac{1}{r}}\rangle _{r,Q}\langle |g|\rangle _{s',Q}|Q|. \end{aligned}$$

From this, using Theorem 4.1, we obtain

$$\begin{aligned} {{\mathcal {B}}}_{\mathcal {S},b,r,s}^m(f,g)\lesssim \big ([\lambda ]_{A_{q/r}}[\lambda ]_{{\textrm{RH}}_{(s/q)'}}\big )^{\max (\frac{s-1}{s-q},\frac{1}{q-r})}\Vert \mathcal {A}_{\mathcal {S}',\nu ,b}^{k-1}(h)\Vert _{L^{q/r}(\lambda )}^{1/r}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}. \end{aligned}$$

Define for \(j=1,\ldots ,k-1\)

$$\begin{aligned} \tfrac{1}{u_j}&:= \tfrac{1}{q}- \tfrac{j}{rt} = \tfrac{j}{rm}\tfrac{1}{p}+(1-\tfrac{j}{rm})\tfrac{1}{q},\\ w_j&:= \mu ^{\frac{j}{rm}\frac{u_j}{p}}\lambda ^{(1-\frac{j}{rm})\frac{u_j}{q}}. \end{aligned}$$

Applying Hölder’s inequality along with (4.1) \((k-1)\) times, we estimate

$$\begin{aligned} \Vert \mathcal {A}^{k-1}_{\mathcal {S}',\nu ,b}(h)\Vert _{L^{q/r}(\lambda )}&= \bigl \Vert M^{\#}_{\nu }(b)\nu ^{1/t}\mathcal {A}_{{\mathcal {S}'}}(\mathcal {A}^{k-2}_{\mathcal {S}',\nu ,b}h)\nu ^{1/t'} \lambda ^{r/q}\bigr \Vert _{L^{q/r}}\\&\le \Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )} \Vert \mathcal {A}_{\mathcal {S}'}(\mathcal {A}^{k-2}_{\mathcal {S}',\nu ,b}h)\Vert _{L^{u_1/r}(w_1)}\\&\lesssim [w_1]_{A_{u_1/r}}^{\max (1, \frac{r}{u_1-r})} \Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )} \Vert \mathcal {A}^{k-2}_{\mathcal {S}',\nu ,b}h\Vert _{L^{u_1/r}(w_1)}\\ {}&\lesssim \cdots \\&\lesssim \Bigl (\prod _{j=1}^{k-1} [w_j ]_{A_{u_j/r}}^{\max (1, \frac{r}{u_j-r})}\Bigr ) \Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )}^{k-1} \Vert h\Vert _{L^{u_{k-1}/r}(w_{k-1})}. \end{aligned}$$
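The Hölder step in the second line can be checked as follows; this is a sketch using only \(\alpha =-\frac{1}{t}\), the definition of \(\nu \) and the definition of \(u_1\). The exponents are admissible because \(\frac{r}{q}=\frac{1}{t}+\frac{r}{u_1}\). The factor \(M^{\#}_{\nu }(b)\nu ^{1/t}\) has \(L^t\)-norm equal to \(\Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )}\). For the remaining factor, \(\nu ^{1/t'}=\nu ^{1+\alpha }=\mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{qm}}\) gives

$$\begin{aligned} \bigl (\nu ^{\frac{1}{t'}}\lambda ^{\frac{r}{q}}\bigr )^{\frac{u_1}{r}}=\mu ^{\frac{1}{rm}\frac{u_1}{p}}\lambda ^{(1-\frac{1}{rm})\frac{u_1}{q}}=w_1, \end{aligned}$$

which is the weight appearing in the \(L^{u_1/r}(w_1)\)-norm.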

Now define \(\frac{1}{v} = \frac{1}{p}+\frac{1}{rt}\) and

$$\begin{aligned} w_v:= \mu ^{(1-\frac{1}{rm})\frac{v}{p}}\lambda ^{\frac{1}{rm}\frac{v}{q}}. \end{aligned}$$

Noting that \(1-\frac{k-1}{rm}=\frac{\gamma }{rm}\) and thus

$$\begin{aligned} \tfrac{1}{u_{k-1}} = \tfrac{1}{q} - \tfrac{k-1}{rt}&= \tfrac{1}{p}+\tfrac{\gamma }{rt} =\tfrac{1}{rt} +(2-\gamma )\tfrac{1}{p} + (\gamma -1)\tfrac{1}{v}, \end{aligned}$$
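Both equalities in this display follow from \(\frac{1}{q}-\frac{1}{p}=\frac{m}{t}=\frac{rm}{rt}\), which is a restatement of \(-\frac{1}{t}=\frac{1}{pm}-\frac{1}{qm}\), together with \(\gamma =rm-(k-1)\). As a short check:

$$\begin{aligned} \tfrac{1}{q}-\tfrac{k-1}{rt}&=\tfrac{1}{p}+\tfrac{rm-(k-1)}{rt}=\tfrac{1}{p}+\tfrac{\gamma }{rt},\\ \tfrac{1}{rt}+(2-\gamma )\tfrac{1}{p}+(\gamma -1)\bigl (\tfrac{1}{p}+\tfrac{1}{rt}\bigr )&=\tfrac{1}{p}+\tfrac{\gamma }{rt}. \end{aligned}$$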

we can estimate by Hölder’s inequality,

$$\begin{aligned} \Vert h\Vert _{L^{u_{k-1}/r}(w_{k-1})}&= \Vert h\Vert _{L^{u_{k-1}/r}(\mu ^{(1-\frac{\gamma }{rm})\frac{u_{k-1}}{p}}\lambda ^{\frac{\gamma }{rm}\frac{u_{k-1}}{q}} )}\\&\le \Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )} \bigl \Vert \mathcal {A}_{\mathcal {S}'}(|f|^r)\bigr \Vert _{L^{p/r}(\mu )}^{2-\gamma }\\&\quad \cdot \bigl \Vert {{{\mathcal {A}}}}_{\mathcal {S}'}\bigl ({{\mathcal A}}_{\mathcal {S}',\nu ,b}(|f|^r)\bigr )\bigr \Vert _{L^{v/r}( w_v)}^{\gamma -1}. \end{aligned}$$

For the second term on the right-hand side we have, by (4.1),

$$\begin{aligned} \bigl \Vert {\mathcal A}_{\mathcal {S}'}(|f|^r)\bigr \Vert _{L^{p/r}(\mu )}^{2-\gamma }\lesssim [\mu ]_{A_{p/r}}^{(2-\gamma )\max (1,\frac{r}{p-r})} \Vert f\Vert _{L^p(\mu )}^{(2-\gamma )r}, \end{aligned}$$

and for the third term we have

$$\begin{aligned}&\bigl \Vert {{{\mathcal {A}}}}_{\mathcal {S}'}\bigl ({{{\mathcal {A}}}}_{\mathcal {S}',\nu ,b}(|f|^r)\bigr )\bigr \Vert _{L^{v/r}( w_v)}^{\gamma -1} \\&\quad \lesssim [ w_v ]_{A_{v/r}}^{(\gamma -1)\max (1,\frac{r}{v-r})} \bigl \Vert {{{\mathcal {A}}}}_{\mathcal {S}',\nu ,b}(|f|^r)\bigr \Vert _{L^{v/r}(w_v )}^{\gamma -1}\\&\quad \lesssim [ w_v]_{A_{v/r}}^{(\gamma -1)\max (1,\frac{r}{v-r})}\Vert M^{\#}_{\nu }b\Vert _{L^t(\nu )}^{\gamma -1} \bigl \Vert {{{\mathcal {A}}}}_{\mathcal {S}'}(|f|^r)\bigr \Vert _{L^{p/r}(\mu )}^{\gamma -1}\\&\quad \lesssim [ w_v]_{A_{v/r}}^{(\gamma -1)\max (1,\frac{r}{v-r})} [\mu ]_{A_{p/r}}^{(\gamma -1)\max (1,\frac{r}{p-r})} \Vert M^{\#}_{\nu }(b)\Vert _{L^t(\nu )}^{\gamma -1}\Vert f\Vert _{L^p(\mu )}^{(\gamma -1)r}. \end{aligned}$$

Collecting our estimates, we have shown

$$\begin{aligned} {{\mathcal {B}}}_{b,r,s}^m(f,g) \lesssim C(\mu ,\lambda )\, [\lambda ]_{{\textrm{RH}}_{(s/q)'}}^{\max (\frac{s-1}{s-q},\frac{1}{q-r})} \Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m \Vert f\Vert _{L^p(\mu )}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})} \end{aligned}$$

with

$$\begin{aligned} C(\mu ,\lambda )&= [\mu ]_{A_{p/r}}^{\max (\frac{1}{r},\frac{1}{p-r})} \cdot [w_v]_{A_{v/r}}^{(\gamma -1)\max (\frac{1}{r},\frac{1}{v-r})}\\&\qquad \cdot \Bigl (\prod _{j=1}^{k-1} [w_j ]_{A_{u_j/r}}^{\max (\frac{1}{r}, \frac{1}{u_j-r})}\Bigr )\cdot [\lambda ]_{A_{q/r}}^{\max (\frac{s-1}{s-q},\frac{1}{q-r})}. \end{aligned}$$

By Hölder’s inequality,

$$\begin{aligned}{}[w_v]_{A_{v/r}} = [\mu ^{(1-\frac{1}{rm})\frac{v}{p}}\lambda ^{\frac{1}{rm}\frac{v}{q}}]_{A_{v/r}}\le [\mu ]_{A_{p/r}}^{(1-\frac{1}{rm})\frac{rv}{p}}[\lambda ]_{A_{q/r}}^{\frac{1}{m}\frac{v}{q}} \end{aligned}$$

and

$$\begin{aligned}{}[w_j]_{A_{u_j/r}}=[\mu ^{\frac{j}{rm}\frac{u_j}{p}}\lambda ^{(1-\frac{j}{rm})\frac{u_j}{q}} ]_{A_{u_j/r}}\le [\mu ]_{A_{p/r}}^{\frac{j}{rm}\frac{u_j}{p}}[\lambda ]_{A_{q/r}}^{(1-\frac{j}{rm})\frac{u_j}{q}}. \end{aligned}$$

From this and from the above expression for \(C(\mu ,\lambda )\), the values of \(\beta _{\mu _1}\) and \(\beta _{\lambda _1}\) follow by direct computation. \(\square \)
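As a sanity check, the exponent identities used in the proof above can be verified by hand. The sketch below assumes, as in Theorem 6.1(ii) and as the chain of estimates suggests, that \(\frac{1}{t}=\frac{1}{qm}-\frac{1}{pm}\) (so that \(\frac{1}{q}=\frac{1}{p}+\frac{m}{t}\)); it also uses \(\gamma =rm-(k-1)\), which is a restatement of the relation \(1-\frac{k-1}{rm}=\frac{\gamma }{rm}\), and \(\frac{1}{v}=\frac{1}{p}+\frac{1}{rt}\).

$$\begin{aligned} \tfrac{1}{q}-\tfrac{k-1}{rt}&=\tfrac{1}{p}+\tfrac{m}{t}-\tfrac{k-1}{rt}=\tfrac{1}{p}+\tfrac{rm-(k-1)}{rt}=\tfrac{1}{p}+\tfrac{\gamma }{rt},\\ \tfrac{1}{rt}+(2-\gamma )\tfrac{1}{p}+(\gamma -1)\tfrac{1}{v}&=\tfrac{1}{rt}+(2-\gamma )\tfrac{1}{p}+(\gamma -1)\bigl (\tfrac{1}{p}+\tfrac{1}{rt}\bigr )=\tfrac{1}{p}+\tfrac{\gamma }{rt},\\ \bigl (1-\tfrac{1}{rm}\bigr )\tfrac{v}{p}+\tfrac{1}{rm}\tfrac{v}{q}&=v\Bigl (\tfrac{1}{p}+\tfrac{1}{rm}\bigl (\tfrac{1}{q}-\tfrac{1}{p}\bigr )\Bigr )=v\bigl (\tfrac{1}{p}+\tfrac{1}{rt}\bigr )=1. \end{aligned}$$

The last line shows that the exponents of \(\mu \) and \(\lambda \) in \(w_v\) sum to one, which is what makes Hölder's inequality applicable to the averages of \(w_v\).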

For the estimate (5.2) in Theorem 5.1 in the case \(q \le p\), we need a Fefferman–Stein-type lemma (see, e.g., [25]).

Lemma 5.7

Let \(p \in (1,\infty )\), \(0<\delta <1\) and let w be a weight. For any sparse family \(\mathcal {S} \subseteq \mathscr {D}\) and \(f \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\) we have

$$\begin{aligned} \Bigl \Vert \sum _{Q\in {\mathcal S}}\langle |f|\rangle _{1,Q}\chi _Q\Bigr \Vert _{L^p(w)}\lesssim p'\tfrac{1}{\delta ^{1/p'}}\Vert f\Vert _{L^p(M_{1+\delta }w)}. \end{aligned}$$

Using this lemma, we can now prove the estimate (5.2) in Theorem 5.1 in the case \(q \le p\). This approach was suggested qualitatively in the case \(p=q\) by Li [24].

Proof of (5.2) in Theorem 5.1

By Lemma 5.5, there exists a sparse collection of cubes \(\mathcal {S}\subseteq \mathcal {S}'\subseteq {{\mathscr {D}}}\) such that for any \(Q \in \mathcal {S}\),

$$\begin{aligned} \big \langle |b-\langle b\rangle _Q|^m|f|\big \rangle _{r,Q}\lesssim |Q|^{-1/r}\Bigl \Vert \sum _{Q\in {{\mathcal {S}}}'}\langle M^{\#}_\nu b \cdot \nu \rangle _{1,Q}\chi _Q\Bigr \Vert _{L^{rm}(|f|^r)}^m. \end{aligned}$$

Since \(m\ge 2\), we have by Lemma 5.7,

$$\begin{aligned} \big \langle |b-\langle b\rangle _Q|^m|f|\big \rangle _{r,Q}\lesssim |Q|^{-1/r}\frac{1}{\delta ^{m/(rm)'}}\Vert M^{\#}_\nu b\cdot \nu \chi _Q\Vert _{L^{rm}(M_{1+\delta }(|f|^r))}^m. \end{aligned}$$

Therefore, by Theorem 4.1, Buckley’s estimate [3] and Hölder’s inequality, we have for \(0<\delta <1\),

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}},b,r,s}^m(f,g)&\lesssim \tfrac{1}{\delta ^{m/(rm)'}} \sum _{Q\in {{\mathcal {S}}}}\big \langle (M^{\#}_\nu b\cdot \nu )^m M_{(1+\delta )r}f \big \rangle _{r,Q}\langle |g|\rangle _{s',Q} |Q|\\&\lesssim \tfrac{1}{\delta ^{m/(rm)'}} \bigl ([\lambda ]_{A_{q/r}}[\lambda ]_{{\textrm{RH}}_{(s/q)'}}\bigr )^{\max (\frac{s-1}{s-q},\frac{1}{q-r})}\\ {}&\hspace{2cm}\cdot \Vert (M^{\#}_\nu b\cdot \nu )^m M_{(1+\delta )r}f\Vert _{L^q(\lambda )} \Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}\\&\lesssim \tfrac{1}{\delta ^{m/(rm)'}} \bigl ([\lambda ]_{A_{q/r}}[\lambda ]_{{\textrm{RH}}_{(s/q)'}}\bigr )^{\max (\frac{s-1}{s-q},\frac{1}{q-r})}\\ {}&\hspace{2cm}\cdot \Vert M_{(1+\delta )r}f\Vert _{L^p(\mu )} \Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}\Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m\\&\lesssim \tfrac{1}{\delta ^{m/(rm)'}}[\mu ]_{A_{\frac{p}{(1+\delta )r}}}^{\frac{1}{p-(1+\delta )r}}\bigl ([\lambda ]_{A_{q/r}} [\lambda ]_{{\textrm{RH}}_{(s/q)'}}\bigr )^{\max (\frac{s-1}{s-q},\frac{1}{q-r})}\\&\quad \cdot \Vert f\Vert _{L^p(\mu )}\Vert g\Vert _{L^{q'}(\lambda ^{1-q'})}\Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m. \end{aligned}$$

Let \(c_n>0\) be the constant from Proposition 2.2 and set

$$\begin{aligned} \tfrac{1}{\delta }:={(p/r)'\cdot c_n} \, [\mu ]_{A_{p/r}}^{\frac{r}{p-r}}. \end{aligned}$$

(5.2) now follows from Proposition 2.2. \(\square \)
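The power of \([\mu ]_{A_{p/r}}\) produced by this choice of \(\delta \) can be tracked as follows. This is only a sketch: it assumes that, for the above \(\delta \), Proposition 2.2 allows one to replace \([\mu ]_{A_{p/((1+\delta )r)}}^{1/(p-(1+\delta )r)}\) by a constant multiple of \([\mu ]_{A_{p/r}}^{1/(p-r)}\), which is how the proposition appears to be used here. The implicit constants depend on \(p,r,m,n\).

$$\begin{aligned} \tfrac{m}{(rm)'}&=m\bigl (1-\tfrac{1}{rm}\bigr )=\tfrac{rm-1}{r},\\ \tfrac{1}{\delta ^{m/(rm)'}}&\eqsim [\mu ]_{A_{p/r}}^{\frac{r}{p-r}\cdot \frac{rm-1}{r}}=[\mu ]_{A_{p/r}}^{\frac{rm-1}{p-r}},\\ \tfrac{1}{\delta ^{m/(rm)'}}\,[\mu ]_{A_{p/r}}^{\frac{1}{p-r}}&\eqsim [\mu ]_{A_{p/r}}^{\frac{rm-1}{p-r}+\frac{1}{p-r}}=[\mu ]_{A_{p/r}}^{\frac{rm}{p-r}}. \end{aligned}$$

Under this assumption the total exponent is \(\frac{rm}{p-r}\), which agrees with the exponent of \([\mu ]_{A_{p/r}}\) in the alternative constant used in the proof of Theorem 6.2.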

Remark 5.8

The proof of (5.2) in Theorem 5.1 also works in the case \(m=1\) and \(r>1\) with a constant depending on r, as can be seen from the proof. However, in applications to concrete operators, this will yield worse dependence on the weight characteristics of \(\mu \) and \(\lambda \) than (5.1).

The method of proof of (5.2) is likely also applicable to the case \(p<q\), again yielding an incomparable bound to (5.1). This would require one to develop a fractional version of Lemma 5.7. Since our interest in quantitative estimates is mainly in the case \(p=q\), we leave this extension to the interested reader.

6 Weighted bounds for commutators

In this final section we will apply the results from the previous sections to concrete operators. Let us first formulate a general result, in a qualitative form, which is an immediate corollary of Theorems 3.2 and 5.1 and duality.

Theorem 6.1

Let \(1\le r<p,q< s\le \infty \) and \(m\in {{\mathbb {N}}}\). Let T be a sublinear operator and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). Assume the following conditions:

  • Suppose T and \({{\mathcal {M}}}^{\#}_{T,s}\) are locally weak \(L^r\)-bounded.

  • Let \(\mu \in A_{p/r}\cap {\textrm{RH}}_{(s/p)'}\), \(\lambda \in A_{q/r}\cap {\textrm{RH}}_{(s/q)'}\) and define the Bloom weight

    $$\begin{aligned} \nu ^{1+\frac{1}{pm}-\frac{1}{qm}}:= \mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{qm}}. \end{aligned}$$

Then:

  (i) If \(p\le q\) and \(\alpha := \frac{1}{pm}-\frac{1}{qm}\), we have

    $$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu ) \rightarrow L^q(\lambda )}\lesssim _{\mu ,\lambda } \Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }^{\alpha n}}^m. \end{aligned}$$

  (ii) If \(q\le p\) and \(\frac{1}{t}:=\frac{1}{qm}-\frac{1}{pm}\), we have

    $$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^q(\lambda )}\lesssim _{\mu ,\lambda } \Vert M^{\#}_\nu b\Vert _{L^t(\nu )}^m. \end{aligned}$$

Proof

Fix \(f,g \in L^\infty _c({\mathbb {R}}^n)\). By Remark 3.5, there exist \(3^n\) dyadic lattices \({{\mathscr {D}}}_j\) and sparse families \({{\mathcal {S}}}_j\subseteq {{\mathscr {D}}}_j\) such that

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}|T_b^mf||g|\lesssim \sum _{j=1}^{3^n}\big ({{\mathcal {B}}}_{{{\mathcal {S}}}_j,b,r,s}^m(f,g)+{{\mathcal {B}}}_{{{\mathcal {S}}}_j,b,s',r'}^m(g,f)\big ). \end{aligned}$$

Therefore, the claims follow from applying Theorem 5.1 twice, directly and dually. We observe that in order to apply Theorem 5.1 to the dual terms \({{\mathcal {B}}}_{{\mathcal S}_j,b,s',r'}^m(g,f)\) in the dual spaces, we need the conditions

  • \(\lambda ^{1-q'}\in A_{q'/s'}\)

  • \(\mu ^{1-p'}\in A_{p'/s'}\cap {\textrm{RH}}_{(r'/p')'}\)

which follow directly from our assumptions on \(\mu \) and \(\lambda \). \(\square \)
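The bookkeeping behind the dual application can be made explicit. The following sketch assumes that Theorem 5.1, applied to a form \({{\mathcal {B}}}^m_{{{\mathcal {S}}},b,r,s}\) on \(L^p(\mu )\times L^{q'}(\lambda ^{1-q'})\), requires \(\mu \in A_{p/r}\) and \(\lambda \in A_{q/r}\cap {\textrm{RH}}_{(s/q)'}\); this is inferred from the hypotheses of Theorem 6.1 and is not quoted from the statement of Theorem 5.1. For the dual form the parameters are replaced according to \((r,s,p,q)\mapsto (s',r',q',p')\) and the weights according to \((\mu ,\lambda )\mapsto (\lambda ^{1-q'},\mu ^{1-p'})\), and the second function is measured in

$$\begin{aligned} L^{(p')'}\bigl ((\mu ^{1-p'})^{1-(p')'}\bigr )=L^{p}\bigl (\mu ^{(1-p')(1-p)}\bigr )=L^p(\mu ),\qquad \text {since } (p'-1)(p-1)=1. \end{aligned}$$

With this substitution the assumed conditions turn into \(\lambda ^{1-q'}\in A_{q'/s'}\) and \(\mu ^{1-p'}\in A_{p'/s'}\cap {\textrm{RH}}_{(r'/p')'}\), which are exactly the two conditions listed above.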

We refer to [22, Remark 4.4] for a list of operators satisfying the assumptions, and thus the conclusion, of Theorem 6.1. Note that even unweighted bounds for commutators with some of the operators on that list were previously unknown.

Next, we examine a quantitative form of Theorem 6.1 in an important particular case of interest.
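For orientation, in the diagonal case \(p=q\) the Bloom weight of Theorem 6.1 reduces to the one used below:

$$\begin{aligned} \nu ^{1+\frac{1}{pm}-\frac{1}{pm}}=\nu =\mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{pm}}=\bigl (\tfrac{\mu }{\lambda }\bigr )^{\frac{1}{pm}},\qquad \alpha =\tfrac{1}{pm}-\tfrac{1}{pm}=0. \end{aligned}$$

Presumably \({{\,\textrm{BMO}\,}}_{\nu }^{\alpha n}\) with \(\alpha =0\) coincides with \({{\,\textrm{BMO}\,}}_{\nu }\); this reading of the notation is an assumption, since the definition lies outside the present section.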

Theorem 6.2

Let \(1<p<\infty \) and \(m\in {{\mathbb {N}}}\). Let T be a sublinear operator and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). Assume the following conditions:

  • Suppose that for all \(1<r<2<s<\infty \), both T and \({{\mathcal {M}}}^{\#}_{T,s}\) are locally weak \(L^r\)-bounded, and

    $$\begin{aligned} \varphi _{T,r}(\lambda _{m,n})+\varphi _{\mathcal {M}^{\#}_{T,s},r}(\lambda _{m,n})\le \psi (r',s), \end{aligned}$$

    where \(\lambda _{m,n}>0\) is the constant provided by Theorem 3.2 and \(\psi :[1,\infty )^2 \rightarrow [1,\infty )\) is non-decreasing in both variables.

  • Let \(\mu ,\lambda \in A_{p}\) and define the Bloom weight \(\nu := (\frac{\mu }{\lambda })^{\frac{1}{pm}}.\)

Then

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}\lesssim K_{p}(\mu ,\lambda ) C_{p,\psi }(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m, \end{aligned}$$

where

$$\begin{aligned} K_{p}(\mu ,\lambda )&\,\le {\left\{ \begin{array}{ll} \big ([\mu ]_{A_p}[\lambda ]_{A_p}\big )^{\frac{m+1}{2}\max (1,\frac{1}{p-1})}, &{}m\ge 1,\\ {[}\mu ]_{A_p}^{\frac{m}{p-1}}[\lambda ]_{A_p}^{\max (1,\frac{1}{p-1})}+ {[}\mu ]_{A_p}^{\max (1,\frac{1}{p-1})}[\lambda ]_{A_p}^m, \qquad &{}m\ge 2, \end{array}\right. }\\ C_{p,\psi }(\mu ,\lambda )&:=\psi \big (c_{p,n,m}\max ([\mu ]_{A_p},[\lambda ]_{A_p})^{\frac{1}{p-1}}, c_{p,n,m}\max ([\mu ]_{A_p},[\lambda ]_{A_p})\big ). \end{aligned}$$

Proof

Fix \(f,g \in L^\infty _c({\mathbb {R}}^n)\). By Remark 3.5, for any \(1<r<2<s<\infty \) there exist \(3^n\) dyadic lattices \({{\mathscr {D}}}_j\) and sparse families \({{\mathcal {S}}}_j\subseteq {{\mathscr {D}}}_j\) such that

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}|T_b^mf||g|\lesssim \psi (r',s)\sum _{j=1}^{3^n}\big ({{\mathcal {B}}}_{{\mathcal S}_j,b,r,s}^m(f,g)+{{\mathcal {B}}}_{{\mathcal S}_j,b,s',r'}^m(g,f)\big ). \end{aligned}$$
(6.1)

Suppose that \(1<r<\min (\frac{m+1}{m},p)\) and \(\max (p,m+1)<s<\infty \). Then \(\lfloor rm\rfloor =m\) and \(\lfloor s'm\rfloor =m\), and hence, by Theorem 5.1,

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}}_j,b,r,s}^m(f,g)\lesssim C_{p,r,s}(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m \Vert f\Vert _{L^p(\mu )}\Vert g\Vert _{L^{p'}(\lambda ^{1-p'})}, \end{aligned}$$

where either

$$\begin{aligned} \begin{aligned} C_{p,r,s}(\mu ,\lambda )&:= \bigl ([\mu ]_{A_{p/r}}^{(r-1)m+\frac{m+1}{2r}}[\lambda ]_{A_{p/r}}^{m-\frac{m+1}{2r}} \bigr )^{\max (\frac{1}{r},\frac{1}{p-r})}\\ {}&\quad \cdot \bigl ([\lambda ]_{A_{p/r}}[\lambda ]_{{\textrm{RH}}_{(s/p)'}}\bigr )^{\max (\frac{s-1}{s-p},\frac{1}{p-r})}. \end{aligned} \end{aligned}$$

or, if \(m\ge 2\), alternatively

$$\begin{aligned} C_{p,r,s}(\mu ,\lambda ):= [\mu ]_{A_{p/r}}^{\frac{rm}{p-r}}\bigl ([\lambda ]_{A_{p/r}} [\lambda ]_{{\textrm{RH}}_{(s/p)'}}\bigr )^{\max (\frac{s-1}{s-p},\frac{1}{p-r})}. \end{aligned}$$

Moreover, Theorem 5.1 also yields

$$\begin{aligned} {{\mathcal {B}}}_{{{\mathcal {S}}}_j,b,s',r'}^m(g,f)\lesssim C_{p',s',r'}(\lambda ^{1-p'},\mu ^{1-p'})\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m \Vert g\Vert _{L^{p'}(\lambda ^{1-p'})}\Vert f\Vert _{L^p(\mu )}. \end{aligned}$$

Now, let \(c_n>0\) be the constant in Proposition 2.2 and define

$$\begin{aligned} {\bar{r}}&= 1+\tfrac{1}{p'\cdot c_n}\cdot \max ([\mu ]_{A_p},[\lambda ]_{A_p})^{-\frac{1}{p-1}} ,\\ {\bar{s}}&= p\bigl (1+ c_n \cdot \max ([\mu ]_{A_p},[\lambda ]_{A_p})\bigr ). \end{aligned}$$

Then we have for all \(1<r\le {\bar{r}}\) and \({\bar{s}}\le s<\infty \) that

$$\begin{aligned} C_{p,r,s}(\mu ,\lambda )&\lesssim {\left\{ \begin{array}{ll} \big ([\mu ]_{A_p}[\lambda ]_{A_p}\big )^{\frac{m+1}{2}\max (1,\frac{1}{p-1})},\qquad &{}m \ge 1,\\ {[}\mu ]_{A_p}^{\frac{m}{p-1}}[\lambda ]_{A_p}^{\max (1,\frac{1}{p-1})}, \qquad &{}m\ge 2, \end{array}\right. } \end{aligned}$$

and, using (2.1), also

$$\begin{aligned} C_{p',s',r'}(\lambda ^{1-p'},\mu ^{1-p'})&\lesssim {\left\{ \begin{array}{ll} \big ([\mu ]_{A_p}[\lambda ]_{A_p}\big )^{\frac{m+1}{2}\max (1,\frac{1}{p-1})},\qquad &{}m\ge 1,\\ {[}\mu ]_{A_p}^{\max (1,\frac{1}{p-1})}[\lambda ]_{A_p}^m, \qquad &{}m\ge 2. \end{array}\right. } \end{aligned}$$

Therefore, if \(r =\frac{1}{2}(\min (\frac{m+1}{m},p,{\bar{r}})+1)\) and \(s = 2\max (p,m+1,{\bar{s}})\), combining the above estimates with (6.1) completes the proof. \(\square \)
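The admissibility of this choice of \(r\) and \(s\) can be checked directly. In the sketch below we write \(M:=\max ([\mu ]_{A_p},[\lambda ]_{A_p})\), a shorthand introduced only for this computation, and use the standard fact that \(M\ge 1\). Since \(1<r<\frac{m+1}{m}\) and \(s>m+1\), we have

$$\begin{aligned} m<rm<m+1,\qquad m<s'm=m+\tfrac{m}{s-1}<m+1, \end{aligned}$$

which gives \(\lfloor rm\rfloor =\lfloor s'm\rfloor =m\). Moreover, \(r-1=\frac{1}{2}\bigl (\min (\frac{m+1}{m},p,{\bar{r}})-1\bigr )\), so that

$$\begin{aligned} r'&=1+\tfrac{1}{r-1}=1+2\max \bigl (m,\tfrac{1}{p-1},p'c_n M^{\frac{1}{p-1}}\bigr )\le c_{p,n,m}M^{\frac{1}{p-1}},\\ s&=2\max \bigl (p,m+1,p(1+c_nM)\bigr )\le c_{p,n,m}M. \end{aligned}$$

Since \(\psi \) is non-decreasing in both variables, this yields \(\psi (r',s)\le C_{p,\psi }(\mu ,\lambda )\).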

If T and \(\mathcal {M}^{\#}_{T,s}\) are both locally weak \(L^1\)-bounded, we can take \(\psi \) in Theorem 6.2 constant in the first coordinate, which we record as the following corollary.

Corollary 6.3

Let \(1<p<\infty \) and \(m\in {{\mathbb {N}}}\). Let T be a sublinear operator and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). Assume the following conditions:

  • Suppose that for all \(2<s<\infty \), both T and \({{\mathcal {M}}}^{\#}_{T,s}\) are locally weak \(L^1\)-bounded, and

    $$\begin{aligned} \varphi _{\mathcal {M}^{\#}_{T,s},1}(\lambda _{m,n})\le \psi (s), \end{aligned}$$

    where \(\lambda _{m,n}>0\) is the constant provided by Theorem 3.2 and \(\psi :[1,\infty ) \rightarrow [1,\infty )\) is non-decreasing.

  • Let \(\mu ,\lambda \in A_{p}\) and define the Bloom weight \(\nu := (\frac{\mu }{\lambda })^{\frac{1}{pm}}.\)

Then

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}\lesssim K_p(\mu ,\lambda ) C_{p,\psi }(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m, \end{aligned}$$

where \(K_p(\mu ,\lambda )\) is as in Theorem 6.2 and

$$\begin{aligned} C_{p,\psi }(\mu ,\lambda ):=\psi \big (c_{p,n,m}\max ([\mu ]_{A_p},[\lambda ]_{A_p})\big ). \end{aligned}$$

Proof

Since T and \({{\mathcal {M}}}^{\#}_{T,s}\) are locally weak \(L^1\)-bounded, they are locally weak \(L^r\)-bounded for all \(r>1\), and

$$\begin{aligned} \varphi _{T,r}(\lambda _{m,n})+\varphi _{\mathcal {M}^{\#}_{T,s},r}(\lambda _{m,n})&\le \varphi _{T,1}(\lambda _{m,n})+\varphi _{\mathcal {M}^{\#}_{T,s},1}(\lambda _{m,n})\\&\le \varphi _{T,1}(\lambda _{m,n})+\psi (s). \end{aligned}$$

Applying Theorem 6.2 with

$$\begin{aligned} \psi (r',s):=\varphi _{T,1}(\lambda _{m,n})+\psi (s) \end{aligned}$$

finishes the proof. \(\square \)

6.1 Calderón–Zygmund operators

As discussed in the introduction, Bloom weighted estimates for commutators have been widely studied for Calderón–Zygmund operators. As a first application of our results, we will compare the weighted estimates that we obtained in the diagonal \(p=q\) case for Calderón–Zygmund operators and discuss their sharpness in the sense of Definition 1.1. In particular, let us prove Theorem 1.2.

Recall that a linear operator T is called a Dini-continuous Calderón–Zygmund operator if it is \(L^2\)-bounded and for \(f \in L^\infty _c({\mathbb {R}}^n)\) has a representation

$$\begin{aligned} Tf(x)=\int _{{{\mathbb {R}}}^n}K(x,y)f(y)dy\qquad x\not \in \text {supp}\,f, \end{aligned}$$

where

$$\begin{aligned} |K(x,y)-K(x',y)|+|K(y,x)-K(y,x')|\le \omega \left( \frac{|x-x'|}{|x-y|}\right) \frac{1}{|x-y|^n}, \end{aligned}$$

whenever \(|x-y|>2|x-x'|\), with the modulus of continuity \(\omega :[0,1]\rightarrow [0,\infty )\) satisfying \(\int _0^1\omega (t)\frac{dt}{t}<\infty .\)
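As a standard illustration, not taken from the text, the Hölder-type modulus \(\omega (t)=t^{\varepsilon }\) with \(\varepsilon \in (0,1]\) satisfies the Dini condition, since

$$\begin{aligned} \int _0^1\omega (t)\frac{dt}{t}=\int _0^1t^{\varepsilon -1}\,dt=\frac{1}{\varepsilon }<\infty . \end{aligned}$$

Thus kernels with the usual Hölder regularity of order \(\varepsilon \) fall under the above definition.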

Proof of Theorem 1.2

It is well-known that T and \({{\mathcal {M}}}^{\#}_{T,\infty }\) are weak \(L^1\)-bounded (see, e.g., [9, 19]). Next, \({\mathcal M}^{\#}_{T,s}\le {{\mathcal {M}}}^{\#}_{T,\infty }\) for every \(s\ge 1\). Therefore, after normalizing such that \(\Vert b\Vert _{{{{\,\textrm{BMO}\,}}}_{\nu }}=1\), by Corollary 6.3 we have

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim ([\lambda ]_{A_p}[\mu ]_{A_p})^{\frac{m+1}{2}\max (1,\frac{1}{p-1})}. \end{aligned}$$
(6.2)

and, if \(m\ge 2\), also

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim [\lambda ]_{A_p}^{\max (1,\frac{1}{p-1})}[\mu ]_{A_p}^{\frac{m}{p-1}}+[\mu ]_{A_p}^{\max (1,\frac{1}{p-1})}[\lambda ]_{A_p}^{m}. \end{aligned}$$
(6.3)

First let us take \(m\ge 2\). Then (6.3) implies for \(p\ge 2\) that

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim [\mu ]_{A_p}^{\max (1,\frac{m}{p-1})}[\lambda ]_{A_p}^{m}. \end{aligned}$$

If \(p>\frac{1+3m}{m+1}\), then \(\max (1,\frac{m}{p-1})<\frac{m+1}{2}\), and therefore we obtain that (6.2) is not sharp in the sense of Definition 1.1. Moreover, for \(1<p\le 2\) (6.3) implies

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim [\mu ]_{A_p}^{\frac{m}{p-1}}[\lambda ]_{A_p}^{\max (m,\frac{1}{p-1})}. \end{aligned}$$

If \(p<\frac{1+3m}{2m}\), then \(\max (m,\frac{1}{p-1})<\frac{m+1}{2}\frac{1}{p-1}\). Therefore, we again obtain that (6.2) is not sharp.
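The two thresholds come from elementary rearrangements, which we record for convenience together with a numerical instance. For \(m\ge 2\) and \(p>1\),

$$\begin{aligned} \tfrac{m}{p-1}<\tfrac{m+1}{2}&\iff p-1>\tfrac{2m}{m+1}\iff p>\tfrac{1+3m}{m+1},\\ m<\tfrac{m+1}{2}\tfrac{1}{p-1}&\iff p-1<\tfrac{m+1}{2m}\iff p<\tfrac{1+3m}{2m}. \end{aligned}$$

For example, for \(m=2\) the thresholds are \(\frac{7}{3}\) and \(\frac{7}{4}\), so the argument above shows that (6.2) is not sharp for \(p<\frac{7}{4}\) and for \(p>\frac{7}{3}\), and it gives no information for \(p\in [\frac{7}{4},\frac{7}{3}]\).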

Suppose now that \(m=1\). Let us show that in this case (6.2) is sharp for all \(p \in (1,\infty )\). By duality, the sharpness of the exponent \(\max (1,\frac{1}{p-1})\) of \([\lambda ]_{A_p}\) is equivalent to the sharpness of the exponent \(\max (1,\frac{1}{p'-1})\) of \([\mu ]_{A_{p'}}\). Therefore, it suffices to establish the sharpness of the exponent \(\frac{1}{p-1}\) of \([\lambda ]_{A_p}\) and \([\mu ]_{A_p}\) for \(1<p\le 2\).

Let \(1<p\le 2\). We will provide examples in dimension \(n=1\) for the Hilbert transform H. Let us prove first that the exponent \(\frac{1}{p-1}\) of \([\lambda ]_{A_p}\) cannot be decreased. Let \(0<\delta <1\) and define \(\mu :=1\) and \(\lambda (x):=|x|^{(p-1)(1-\delta )}\). It is well known that \([\lambda ]_{A_p}\eqsim \delta ^{1-p}\). Observe that \(\nu =\lambda ^{-1/p}=|x|^{\frac{(\delta -1)}{p'}}\). Define \(b:=\nu \). It is easy to see that \(\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}\le 2\). Define \(f(x):=|x|^{\frac{(\delta -1)}{p}} \chi _{(0,1)}(x)\). Then

$$\begin{aligned} ([\lambda ]_{A_p}[\mu ]_{A_p})^{\frac{1}{p-1}}\Vert f\Vert _{L^p(\mu )}\eqsim \tfrac{1}{\delta ^{1+1/p}}. \end{aligned}$$

Therefore, the sharpness of the exponent \(\frac{1}{p-1}\) would follow if we show that

$$\begin{aligned} \Vert H_b^1f\Vert _{L^p(\lambda )}\gtrsim \tfrac{1}{\delta ^{1+1/p}}. \end{aligned}$$
(6.4)

Observe that for \(x \in {\mathbb {R}}\)

$$\begin{aligned} H_{b}^1f(x)&=|x|^{\frac{(\delta -1)}{p'}} H(|y|^{\frac{(\delta -1)}{p}}\chi _{(0,1)})(x)- H(|y|^{\delta -1}\chi _{(0,1)})(x)\\&=:h_1(x)-h_2(x). \end{aligned}$$

By the unweighted \(L^p\) boundedness of H,

$$\begin{aligned} \Vert h_1\Vert _{L^p(\lambda )}=\Big (\int _{{{\mathbb {R}}}}|H(|y|^{\frac{(\delta -1)}{p}}\chi _{(0,1)})(x)|^pdx\Big )^{1/p}\lesssim \tfrac{1}{\delta ^{1/p}}. \end{aligned}$$

On the other hand, \(\frac{1}{\delta } |x|^{\delta -1}\lesssim |h_2(x)|\) for all \(x\in (0,1)\), and therefore

$$\begin{aligned} \tfrac{1}{\delta ^{1+1/p}} \lesssim \Vert h_2\Vert _{L^p(\lambda )}, \end{aligned}$$

which proves (6.4).
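For the reader's convenience, the exponent computations behind this example can be written out as follows, taking the pointwise lower bound for \(h_2\) on \((0,1)\) as given. Recall that \(\mu =1\), \(\lambda (x)=|x|^{(p-1)(1-\delta )}\) and \(\frac{p}{p'}=p-1\).

$$\begin{aligned} \Vert f\Vert _{L^p(\mu )}^p&=\int _0^1x^{\delta -1}\,dx=\tfrac{1}{\delta },\\ \Vert h_1\Vert _{L^p(\lambda )}^p&=\int _{{{\mathbb {R}}}}|x|^{(\delta -1)(p-1)}|Hf(x)|^p|x|^{(p-1)(1-\delta )}\,dx=\int _{{{\mathbb {R}}}}|Hf(x)|^p\,dx,\\ \Vert h_2\Vert _{L^p(\lambda )}^p&\gtrsim \tfrac{1}{\delta ^p}\int _0^1x^{(\delta -1)p+(p-1)(1-\delta )}\,dx=\tfrac{1}{\delta ^p}\int _0^1x^{\delta -1}\,dx=\tfrac{1}{\delta ^{p+1}}. \end{aligned}$$

Combining these with the triangle inequality gives, for constants \(c,C>0\) independent of \(\delta \),

$$\begin{aligned} \Vert H_b^1f\Vert _{L^p(\lambda )}\ge \Vert h_2\Vert _{L^p(\lambda )}-\Vert h_1\Vert _{L^p(\lambda )}\ge \tfrac{1}{\delta ^{1+1/p}}(c-C\delta )\ge \tfrac{c}{2}\tfrac{1}{\delta ^{1+1/p}}\qquad \text {for } 0<\delta \le \tfrac{c}{2C}. \end{aligned}$$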

Let us show now that the exponent \(\frac{1}{p-1}\) of \([\mu ]_{A_p}\) cannot be decreased. The example is very similar. Define \(\lambda :=1\) and \(\mu (x):=|x|^{(p-1)(1-\delta )}\). Then \([\mu ]_{A_p}\eqsim \delta ^{1-p}\). Observe that \(\nu =\mu ^{1/p}=|x|^{\frac{(1-\delta )}{p'}}\). Define \(b:=\nu \), for which we have \(\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}\eqsim 1\). Define \(f(x):=|x|^{\delta -1} \chi _{(0,1)}(x)\). Then

$$\begin{aligned} ([\lambda ]_{A_p}[\mu ]_{A_p})^{\frac{1}{p-1}}\Vert f\Vert _{L^p(\mu )}\eqsim \tfrac{1}{\delta ^{1+1/p}}. \end{aligned}$$

Therefore, the sharpness of the exponent \(\frac{1}{p-1}\) would follow if we show that

$$\begin{aligned} \Vert H_b^1f\Vert _{L^p}\gtrsim \tfrac{1}{\delta ^{1+1/p}}. \end{aligned}$$
(6.5)

Observe that

$$\begin{aligned} H_{b}^1f(x)&=|x|^{\frac{(1-\delta )}{p'}}H(|y|^{(\delta -1)}\chi _{(0,1)})(x)-H(|y|^{\frac{\delta -1}{p}}\chi _{(0,1)})(x)\\&=:h_1(x)-h_2(x). \end{aligned}$$

Exactly as above, \(\Vert h_2\Vert _{L^p}\lesssim \tfrac{1}{\delta ^{1/p}}\). On the other hand, \(\frac{1}{\delta }|x|^{\frac{\delta -1}{p}}\lesssim |h_1(x)|\) for all \(x\in (0,1)\), and therefore,

$$\begin{aligned} \frac{1}{\delta ^{1+1/p}} \lesssim \Vert h_1\Vert _{L^p(\lambda )}, \end{aligned}$$

which proves (6.5). This completes the proof. \(\square \)
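The exponents in the second example can be checked in the same way, again taking the pointwise lower bound on \((0,1)\) as given. With \(\lambda =1\), \(\mu (x)=|x|^{(p-1)(1-\delta )}\) and \(f(x)=|x|^{\delta -1}\chi _{(0,1)}(x)\),

$$\begin{aligned} \Vert f\Vert _{L^p(\mu )}^p&=\int _0^1x^{(\delta -1)p+(p-1)(1-\delta )}\,dx=\int _0^1x^{\delta -1}\,dx=\tfrac{1}{\delta },\\ \tfrac{1-\delta }{p'}+(\delta -1)&=(\delta -1)\bigl (1-\tfrac{1}{p'}\bigr )=\tfrac{\delta -1}{p},\\ \Vert h_1\Vert _{L^p}^p&\gtrsim \tfrac{1}{\delta ^p}\int _0^1x^{\delta -1}\,dx=\tfrac{1}{\delta ^{p+1}}. \end{aligned}$$

The second line explains the power of \(|x|\) in the lower bound for \(h_1\), and the third line is the \(p\)-th power of the estimate preceding (6.5).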

As we mentioned in the introduction, Theorem 1.2 leaves open a question about the sharpness of (1.1) when \(m\ge 2\) and \(p\in [\frac{1+3m}{2m},\frac{1+3m}{m+1}]\). Indeed, our example is based on the obvious fact that \(\nu \in {{\,\textrm{BMO}\,}}_{\nu }\). In the case \(m=1\) the choice \(b=\nu \) shows the sharpness of (1.1) for all \(1<p<\infty \). However, it is easy to check that in the case \(m\ge 2\) the same choice is not enough in order to show the sharpness of (1.1).

6.2 Further applications

We conclude this paper by applying our results to several other concrete examples of operators, for which (quantitative) Bloom weighted bounds of their commutators have not been known before.

Given an operator T and \(s\ge 1\), define the (non-sharp) grand maximal truncation operator \({{\mathcal {M}}}_{T,s}\) by

$$\begin{aligned} {{\mathcal {M}}}_{T,s}f(x):=\sup _{Q\ni x}\Big (\frac{1}{|Q|}\int _Q|T(f\chi _{{{\mathbb {R}}}^n\setminus 3Q})|^s\Big )^{1/s}, \qquad x \in {\mathbb {R}}^n. \end{aligned}$$

Observe that

$$\begin{aligned} {{\mathcal {M}}}^{\#}_{T,s}f\le 2\, {{\mathcal {M}}}_{T,s}f. \end{aligned}$$
(6.6)

Example 6.4

Consider a class of rough homogeneous singular integrals defined by

$$\begin{aligned} T_{\Omega }f(x)=\text {p.v.}\int _{{{\mathbb {R}}}^n}f(x-y)\frac{\Omega (y/|y|)}{|y|^n}\hspace{2pt}\textrm{d}y, \qquad x \in {\mathbb {R}}^n, \end{aligned}$$

for \(\Omega \in L^{\infty }(S^{n-1})\) with zero average over the sphere. Fix \(p \in (1,\infty )\), let \(m\in {\mathbb {N}}\) and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\).

It is a well-known result of Seeger [29] that \(T_{\Omega }\) is of weak type (1, 1). Moreover, it was shown by the first author [17] that for \(s>2\), the grand maximal truncation operator \({{\mathcal {M}}}_{T_{\Omega },s}\) is weak \(L^1\)-bounded with

$$\begin{aligned} \Vert {{\mathcal {M}}}_{T_{\Omega },s}\Vert _{L^{1}({{\mathbb {R}}}^n)\rightarrow L^{1,\infty }({{\mathbb {R}}}^n)}\lesssim s. \end{aligned}$$

Therefore, by (6.6), \({{\mathcal {M}}}_{T_{\Omega },s}^{\#}\) satisfies the same bound. From this, by Corollary 6.3, we obtain for \(\mu ,\lambda \in A_p\) and \(\nu := (\frac{\mu }{\lambda })^{\frac{1}{pm}}\)

$$\begin{aligned} \Vert (T_{\Omega })_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}\lesssim C_1(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m, \end{aligned}$$

where

$$\begin{aligned} C_1(\mu ,\lambda ):=\max \bigl ([\mu ]_{A_p},[\lambda ]_{A_p}\bigr )K_p(\mu ,\lambda ) \end{aligned}$$

with \(K_p(\mu ,\lambda )\) as in Theorem 6.2.

Furthermore, since \(T_{\Omega }\) is essentially self-adjoint, we have

$$\begin{aligned} \Vert (T_{\Omega })_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}=\Vert (T_{\Omega })_b^m\Vert _{L^{p'}(\lambda ^{1-p'})\rightarrow L^{p'}(\mu ^{1-p'})}. \end{aligned}$$

Using (2.1), this yields

$$\begin{aligned} \Vert (T_{\Omega })_b^m\Vert _{L^p(\mu )\rightarrow L^p(\lambda )}\lesssim C_2(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m, \end{aligned}$$

where

$$\begin{aligned} C_2(\mu ,\lambda )&:=\max \bigl ([\lambda ^{1-p'}]_{A_{p'}},[\mu ^{1-p'}]_{A_{p'}}\bigr )K_{p'}(\lambda ^{1-p'},\mu ^{1-p'})\\&\,=\max \bigl ([\mu ]_{A_p},[\lambda ]_{A_p}\bigr )^{\frac{1}{p-1}}K_p(\mu ,\lambda ). \end{aligned}$$

Therefore, we finally obtain that

$$\begin{aligned} \Vert (T_{\Omega })_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}&\lesssim \max \bigl ([\mu ]_{A_p},[\lambda ]_{A_p}\bigr )^{\min (1,\frac{1}{p-1})} K_p(\mu ,\lambda )\,\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m. \end{aligned}$$
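The final exponent arises by taking the better of the two bounds. Writing \(t:=\max ([\mu ]_{A_p},[\lambda ]_{A_p})\ge 1\) (a shorthand used only here), and assuming that (2.1) is the standard duality relation \([w^{1-p'}]_{A_{p'}}=[w]_{A_p}^{\frac{1}{p-1}}\), which is how it appears to be used above, we have

$$\begin{aligned} \min \bigl (C_1(\mu ,\lambda ),C_2(\mu ,\lambda )\bigr )=\min \bigl (t,t^{\frac{1}{p-1}}\bigr )K_p(\mu ,\lambda )=t^{\min (1,\frac{1}{p-1})}K_p(\mu ,\lambda ). \end{aligned}$$

In other words, the direct estimate is better for \(p\le 2\) and the dual one for \(p\ge 2\).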

Example 6.5

Consider now a class of maximal rough homogeneous singular integrals defined by

$$\begin{aligned} T_{\Omega }^{\star }f(x)=\sup _{\varepsilon>0}\Big |\int _{|y|>\varepsilon }f(x-y)\frac{\Omega (y/|y|)}{|y|^n}dy\Big |, \qquad x \in {\mathbb {R}}^n, \end{aligned}$$

for \(\Omega \in L^{\infty }(S^{n-1})\) with zero average over the sphere. Fix \(p \in (1,\infty )\), let \(m\in {\mathbb {N}}\) and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\).

It was shown by Di Plinio–Hytönen–Li [7] that for \(1<r<2\),

$$\begin{aligned} \Vert T_{\Omega }^{\star }f\Vert _{L^{r,\infty }({\mathbb {R}}^n)}\lesssim r'\Vert f\Vert _{L^r({\mathbb {R}}^n)}. \end{aligned}$$
(6.7)

Let us deduce from the recent work of Tao–Hu [32] that

$$\begin{aligned} \varphi _{\mathcal {M}^{\#}_{T_{\Omega }^{\star },s},r}(\lambda _{m,n})\lesssim s\log r'. \end{aligned}$$
(6.8)

Denote \(\Phi (t):=t\log \log (\textrm{e}^2+t)\). In [32] the authors established that for \(s >2\) and \(\alpha >0\),

$$\begin{aligned} |\{x\in {{\mathbb {R}}}^n: {{\mathcal {M}}}_{T_{\Omega }^{\star },s}f(x)>\alpha \}|\lesssim s\int _{{{\mathbb {R}}}^n}\Phi \left( \frac{|f|}{\alpha }\right) dx. \end{aligned}$$

From this, using the estimate

$$\begin{aligned} \Phi (t)\lesssim (\log r')t+t^r,\qquad t>0 \end{aligned}$$

for \(1<r<2\), we obtain

$$\begin{aligned} |\{x\in Q: {{\mathcal {M}}}_{T_{\Omega }^{\star },s}(f\chi _Q)(x)>\alpha \}|\lesssim s\Big ((\log r')\int _Q\frac{|f|}{\alpha }+\int _Q\Big (\frac{|f|}{\alpha }\Big )^r\Big ). \end{aligned}$$

Hence, for \(\lambda \in (0,1)\), we have

$$\begin{aligned} \bigl |\bigl \{x\in Q: {\mathcal M}_{T_{\Omega }^{\star },s}(f\chi _Q)(x)>\tfrac{s\log r'}{\lambda }\langle |f|\rangle _{r,Q}\bigr \}\bigr |\lesssim \lambda |Q|. \end{aligned}$$

Along with (6.6), this implies (6.8).
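The passage from the modular estimate to the level set bound is a direct substitution, sketched here under the assumption that \(\langle |f|\rangle _{r,Q}=\bigl (\frac{1}{|Q|}\int _Q|f|^r\bigr )^{1/r}\) and that \(\log \) denotes the natural logarithm. Taking \(\alpha :=\frac{s\log r'}{\lambda }\langle |f|\rangle _{r,Q}\) with \(\lambda \in (0,1)\), \(s>2\) and \(1<r<2\) (so that \(r'>2\)), we get

$$\begin{aligned} s(\log r')\int _Q\frac{|f|}{\alpha }&=\lambda |Q|\,\frac{\langle |f|\rangle _{1,Q}}{\langle |f|\rangle _{r,Q}}\le \lambda |Q|,\\ s\int _Q\Bigl (\frac{|f|}{\alpha }\Bigr )^r&=s|Q|\,\frac{\lambda ^r}{(s\log r')^r}=\lambda |Q|\cdot \lambda ^{r-1}s^{1-r}(\log r')^{-r}\le (\log 2)^{-2}\,\lambda |Q|. \end{aligned}$$

The first line uses Hölder's inequality \(\langle |f|\rangle _{1,Q}\le \langle |f|\rangle _{r,Q}\), and the second uses \(\lambda ^{r-1}\le 1\), \(s^{1-r}\le 1\) and \(\log r'>\log 2\).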

Using (6.7) and (6.8), we are in a position to apply Theorem 6.2 with \(\psi (r',s):=r'+(\log r')s\), from which it follows that for \(\mu ,\lambda \in A_p\) and \(\nu := (\frac{\mu }{\lambda })^{\frac{1}{pm}}\)

$$\begin{aligned} \Vert (T_{\Omega }^{\star })_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}\lesssim \bigl (t^{\frac{1}{p-1}}+t\log t \bigr ) K_p(\mu ,\lambda )\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m \end{aligned}$$

with \(K_p(\mu ,\lambda )\) as in Theorem 6.2 and \(t = \max \bigl ([\mu ]_{A_p},[\lambda ]_{A_p}\bigr )\).
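The form of the factor in front of \(K_p(\mu ,\lambda )\) can be seen by evaluating \(C_{p,\psi }\) from Theorem 6.2 for \(\psi (r',s)=r'+(\log r')s\). In the sketch below \(c=c_{p,n,m}\) is assumed to satisfy \(c\ge 1\), and \(t\ge 1\):

$$\begin{aligned} \psi \bigl (ct^{\frac{1}{p-1}},ct\bigr )=ct^{\frac{1}{p-1}}+ct\bigl (\log c+\tfrac{1}{p-1}\log t\bigr )\lesssim _{p,n,m}t^{\frac{1}{p-1}}+t+t\log t. \end{aligned}$$

The middle term can be absorbed into the other two, since \(t\le \textrm{e}\,t^{\frac{1}{p-1}}\) for \(1\le t\le \textrm{e}\) and \(t\le t\log t\) for \(t\ge \textrm{e}\).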

Example 6.6

Consider the Bochner–Riesz operator at the critical index \(B_{(n-1)/2}\), which is defined by

$$\begin{aligned} (B_{(n-1)/2}f)\,\widehat{\,}\,(\xi ):=(1-|\xi |^2)_{+}^{\frac{n-1}{2}}\widehat{f}(\xi ), \qquad \xi \in {\mathbb {R}}^n. \end{aligned}$$

It is a well-known result of Christ [5] that \(B_{(n-1)/2}\) is weak \(L^1\)-bounded. Furthermore, it is implicit in the work of Shrivastava–Shuin [30, 31] that for \(s \in [1,\infty )\)

$$\begin{aligned} \Vert {{\mathcal {M}}}_{B_{(n-1)/2},s}\Vert _{L^{1}({{\mathbb {R}}}^n)\rightarrow L^{1,\infty }({{\mathbb {R}}}^n)}\lesssim s. \end{aligned}$$
(6.9)

Therefore, arguing as in Example 6.4, we have for \(p \in (1,\infty )\), \(m\in {\mathbb {N}}\) and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\) that for all \(\mu ,\lambda \in A_p\) and \(\nu := (\frac{\mu }{\lambda })^{\frac{1}{pm}}\)

$$\begin{aligned} \Vert (B_{(n-1)/2})_b^m\Vert _{L^p(\mu ) \rightarrow L^p(\lambda )}&\lesssim \max \bigl ([\mu ]_{A_p},[\lambda ]_{A_p}\bigr )^{\min (1,\frac{1}{p-1})} K_p(\mu ,\lambda )\,\Vert b\Vert _{{{\,\textrm{BMO}\,}}_{\nu }}^m \end{aligned}$$

with \(K_p(\mu ,\lambda )\) as in Theorem 6.2.

We add some details about (6.9). It is well-known (see, e.g., [9, Section 5.2.1]) that the kernel of \(B_{(n-1)/2}\) is given by \(K_{(n-1)/2}:=K+\Phi \), where

$$\begin{aligned} K(x):=c_n\frac{\cos (2\pi |x|-\pi n/2)}{|x|^n}\chi _{\{|x|\ge 1\}}, \qquad x \in {\mathbb {R}}^n, \end{aligned}$$

and \(|\Phi (x)|\lesssim \frac{1}{1+|x|^{n+1}}\). Next, let \(\phi \) be a radial smooth function supported in B(0, 2) such that \(\phi =1\) on B(0, 1) and set

$$\begin{aligned} \psi _j(x):=\phi (2^{-j}|x|)-\phi (2^{-j+1}|x|), \qquad x \in {\mathbb {R}}^n. \end{aligned}$$

Then we obtain that

$$\begin{aligned} B_{(n-1)/2}f(x)=T_1f(x)+T_2f(x)+Tf(x), \end{aligned}$$

where

$$\begin{aligned} T_1f(x)&:=f*\Phi (x),{} & {} x \in {\mathbb {R}}^n,\\ T_2f(x)&:=f*(\phi K)(x),{} & {} x \in {\mathbb {R}}^n,\\ Tf(x)&:=\sum _{j=1}^{\infty }f*(\psi _jK)(x),{} & {} x \in {\mathbb {R}}^n. \end{aligned}$$
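This decomposition rests on a telescoping identity for the cut-off functions, which we spell out for convenience. Since \(\phi =1\) on B(0, 1), for every fixed \(x\in {\mathbb {R}}^n\) we have

$$\begin{aligned} \sum _{j=1}^{N}\psi _j(x)=\phi (2^{-N}|x|)-\phi (|x|)\longrightarrow 1-\phi (|x|)\qquad \text {as } N\rightarrow \infty , \end{aligned}$$

and hence

$$\begin{aligned} K_{(n-1)/2}=\Phi +\phi K+\sum _{j=1}^{\infty }\psi _jK. \end{aligned}$$

Note also that \(\phi K\) is bounded and supported in \(\{1\le |x|\le 2\}\), because K vanishes for \(|x|<1\) and \(\phi \) vanishes for \(|x|\ge 2\).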

Using that \(|T_if|\lesssim Mf\) for \(i=1,2\), we obtain

$$\begin{aligned} {{\mathcal {M}}}_{B_{(n-1)/2},s}f(x)\lesssim Mf(x)+{{\mathcal {M}}}_{T,s}f(x), \qquad x \in {\mathbb {R}}^n. \end{aligned}$$

Therefore, it suffices to prove (6.9) for \({\mathcal M}_{T,s}\). In order to do that, we fix an integer N and further decompose \(T=S_1+S_2\), where

$$\begin{aligned} S_1f(x):=\sum _{j=1}^Nf*(\psi _jK)(x),\qquad S_2f(x):=\sum _{j=N+1}^{\infty }f*(\psi _jK)(x). \end{aligned}$$

Let \(\varepsilon :=2^{-N}\). It was shown in [31] that \(S_1\) is a Dini-continuous Calderón–Zygmund operator with Dini-constant bounded by \(\log \frac{1}{\varepsilon }\), and that \(\Vert S_2\Vert _{L^2\rightarrow L^2}\lesssim \varepsilon ^{\alpha }\) for some \(\alpha \in (0,1]\). Moreover, the kernels \(\psi _jK\) have radial smoothness (as was observed in [25]), which makes it possible to apply Seeger’s machinery from [29]. Using all these ingredients, the proof goes through as in [17] by the first author.

Remark 6.7

Let \(p,q \in (1,\infty )\), \(m\in {\mathbb {N}}\) and \(b \in L^1_{{{\,\textrm{loc}\,}}}({\mathbb {R}}^n)\). By a similar argument as employed in Sect. 6.1 and Examples 6.4, 6.5 and 6.6, now using Theorem 6.1, we also get for all \(\mu \in A_p\) and \(\lambda \in A_q\) and

$$\begin{aligned} \nu ^{1+\frac{1}{pm}-\frac{1}{qm}}:= \mu ^{\frac{1}{pm}}\lambda ^{-\frac{1}{qm}} \end{aligned}$$

that we have

$$\begin{aligned} \Vert T_b^m\Vert _{L^p(\mu ) \rightarrow L^q(\lambda )}&\lesssim _{\mu ,\lambda } {\left\{ \begin{array}{ll} \Vert b\Vert _{{{\,\textrm{BMO}\,}}^{\alpha n}_{\nu }}^m, \qquad \qquad &{} p\le q,\\ \Vert M^{\#}_\nu (b)\Vert _{L^t(\nu )}^m, &{} q \le p. \end{array}\right. } \end{aligned}$$

with \(\alpha := \frac{1}{pm}-\frac{1}{qm}\) and \(\frac{1}{t}:= \frac{1}{qm}-\frac{1}{pm}\), where T is either a Calderón–Zygmund operator or \(T \in \{ T_\Omega , T_\Omega ^{\star }, B_{(n-1)/2}\}\). We leave the exact dependence on \([\mu ]_{A_p}\) and \([\lambda ]_{A_q}\) in these cases to the interested reader.