Off-Diagonal Estimates for Bilinear Commutators

We find a minimal notion of non-degeneracy for bilinear singular integral operators $T$ and identify testing conditions on the multiplying function $b$ that characterize the $L^p \times L^q \to L^r$, $1<p,q<\infty$ and $r>\frac{1}{2}$, boundedness of the bilinear commutator $[b,T]_1(f,g) = bT(f,g) - T(bf,g)$. Our arguments cover almost all arrangements of the integrability exponents $p, q, r$, with a single open problem presented at the end. Additionally, the arguments extend to the multilinear setting.


INTRODUCTION
The study of commutator estimates has its roots in the work of Nehari [18], where the boundedness of the commutator $[b,H]$ of the Hilbert transform $H$ and a multiplying function $b$, where $Hf(x) = \mathrm{p.v.}\int_{\mathbb{R}} f(x-y)\,\frac{dy}{y}$, was characterized by Hankel operators. Later, Coifman, Rochberg and Weiss extended Nehari's result by real analytic methods and showed that

(1.1) $\|b\|_{\mathrm{BMO}} \lesssim \sum_{j=1}^{d} \|[b,R_j]\|_{L^p(\mathbb{R}^d)\to L^p(\mathbb{R}^d)} \lesssim \|b\|_{\mathrm{BMO}} := \sup_I \frac{1}{|I|}\int_I |b - \langle b\rangle_I|, \qquad p \in (1,\infty),$

where the supremum is taken over all cubes $I \subset \mathbb{R}^d$ and $\langle b\rangle_I = \frac{1}{|I|}\int_I b$. The upper bound in (1.1) was proved in [4] for general Calderón-Zygmund operators $T$, while for the lower bounds they worked with the Riesz transforms $R_j$; commutator upper bounds are usually valid for all Calderón-Zygmund operators (CZO), while the lower bounds require some non-degeneracy. The lower bound in (1.1) was improved separately by both Janson [10] and Uchiyama [20] to $\|b\|_{\mathrm{BMO}} \lesssim \|[b,T]\|_{L^p(\mathbb{R}^d)\to L^p(\mathbb{R}^d)}$ under certain non-degeneracy assumptions on the kernel of $T$ which encompass any single Riesz transform (in contrast with (1.1) involving all the $d$ Riesz transforms). Janson's proof also gives an off-diagonal characterization (1.2) of the boundedness of the commutator when $1 < p < q < \infty$. The off-diagonal characterization (1.3) in the case $1 < q < p < \infty$ turned out to be harder and was only recently solved by the approximate weak factorization (awf) argument in Hytönen [9]. Commutator estimates imply factorization results for Hardy spaces, see [4]; they have applications in partial differential equations by compensated compactness and div-curl lemmas, see [3]; and they have been crucial in the recent investigations of the Jacobian problem, see Lindberg [16] and [9].
The awf argument is strong in that it gives a unified approach to all three cases, those on the lines (1.1), (1.2), (1.3); in that it works for many singular integrals with kernels satisfying only minimal non-degeneracy assumptions; and in that it is flexible enough to grant, e.g., multi-parameter and multilinear extensions. For the multi-parameter variants of the awf argument see Airta, Hytönen, Li, Martikainen, Oikari [1] and Oikari [19], where, respectively, the commutators on the line (1.4) were treated. On the line (1.4), $1 < p_1, p_2, q_1, q_2 < \infty$, $T_i$ is a one-parameter CZO on $\mathbb{R}^{d_i}$, for $i = 1, 2$, and $T$ is a bi-parameter CZO on $\mathbb{R}^{d_1+d_2}$. The adaptability of the awf argument to the bi-parameter settings was not effortless, and for both commutators on the line (1.4) the characterization of some cases is still open.
In this article we extend the awf argument to the bilinear setting and study the two commutators $[b,T]_1(f,g) = bT(f,g) - T(bf,g)$ and $[b,T]_2(f,g) = bT(f,g) - T(f,bg)$ as mappings $L^p \times L^q \to L^r$ for $r > \frac{1}{2}$ and $1 < p, q < \infty$. Our cases separate according to the following three conditions: the diagonal $\frac{1}{r} = \frac{1}{p}+\frac{1}{q}$, the sub-diagonal $\frac{1}{r} < \frac{1}{p}+\frac{1}{q}$, and the super-diagonal $\frac{1}{r} > \frac{1}{p}+\frac{1}{q}$. If $r > 1$ then we are in the Banach range of exponents and if $r \le 1$ then we are in the quasi-Banach range of exponents. In Chaffee [2] the necessity of $b \in \mathrm{BMO}$ on the diagonal in the Banach range of exponents was shown for kernels expandable locally as a Fourier series. A unified approach to the diagonal and sub-diagonal cases was given in Guo, Lian and Wu [8], which covers the diagonal in the whole quasi-Banach range; however, on the sub-diagonal they only treat the linear case.
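For orientation, the three regimes can be set against the function space that the later sections attach to $b$. This is a hedged summary rather than a statement quoted from the source: the pairing of regimes with spaces is read off from the diagonal discussion above, from Theorem 4.15, and from the super-diagonal bound in Section 5, and the relation defining $s$ is an assumption inferred from the Hölder step in the proof of the super-diagonal upper bound.

```latex
% Summary sketch. The relation for s is inferred, not quoted from the source.
\[
\begin{array}{lll}
\text{diagonal} & \frac{1}{r} = \frac{1}{p}+\frac{1}{q} & b \in \mathrm{BMO} \\[4pt]
\text{sub-diagonal} & \frac{1}{r} < \frac{1}{p}+\frac{1}{q} &
  b \in \dot C^{0,\alpha}, \quad \alpha = d\Big(\frac{1}{p}+\frac{1}{q}-\frac{1}{r}\Big) \\[4pt]
\text{super-diagonal} & \frac{1}{r} > \frac{1}{p}+\frac{1}{q} &
  b \in \dot L^{s}, \quad \frac{1}{s} = \frac{1}{r}-\frac{1}{p}-\frac{1}{q}
\end{array}
\]
```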
In addition to involving new cases (the super-diagonal case is new in the bilinear setting), our results generalize previous work: the definition of non-degeneracy is weaker than those supposed in [14], [12], [2], [8] and [15]; the awf argument allows us to consider complex-valued functions $b$, whereas [15] was limited to the real-valued case; the full quasi-Banach range is reached in the diagonal and sub-diagonal cases, whereas [2] is limited to the Banach range; and the awf argument encompasses bilinear CZOs with both Dini and rough kernels. Lastly, because we study the quasi-Banach range, the arguments involve additional twists absent from previous research articles. Our full results are recorded as Theorems 4.1, 4.2, 4.15 and 5.20, the following being a condensed version.

DEFINITIONS AND PRELIMINARIES
2.1. Basic notation. We let $\Sigma = \Sigma(\mathbb{R}^d)$ denote the linear span of indicator functions of cubes on $\mathbb{R}^d$. Similarly we denote $L^1_{\mathrm{loc}}(\mathbb{R}^d) = L^1_{\mathrm{loc}}$, $\int_{\mathbb{R}^d} = \int$, and so on, mostly leaving out the ambient space if this information is obvious. We denote averages with $\langle f\rangle_A = \frac{1}{|A|}\int_A f$. In this paper we study the $L^p \times L^q \to L^r$ boundedness and hence it is useful to denote $\sigma(p,q)^{-1} = p^{-1} + q^{-1}$; then, Hölder's inequality reads $\|fg\|_{L^{\sigma(p,q)}} \le \|f\|_{L^p}\|g\|_{L^q}$. Lastly, we denote $A \lesssim B$, if $A \le CB$ for some constant $C > 0$ depending only on the dimension of the underlying space, on integration exponents and on other absolute constants appearing in the assumptions that we do not care about. Then $A \sim B$, if $A \lesssim B$ and $B \lesssim A$. Subscripts on constants ($C_{a,b,c,\dots}$) and quantifiers ($\lesssim_{a,b,c,\dots}$) signify their dependence on those subscripts.
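A few sample values of $\sigma(p,q)$ may help to locate the Banach and quasi-Banach ranges. The following is only a worked illustration of the definition above; the choice of sample exponents is ours, with $p=q=3$ and $p=q=2$ being the pairs that appear later in the text.

```latex
\[
\sigma(p,q) = \frac{pq}{p+q}, \qquad
\sigma(2,2) = 1, \qquad
\sigma(3,3) = \tfrac{3}{2}, \qquad
\sigma\big(\tfrac{4}{3},\tfrac{4}{3}\big) = \tfrac{2}{3}.
\]
% Since p, q > 1 we have 1/p + 1/q < 2, so sigma(p,q) > 1/2 always holds.
% On the diagonal r = sigma(p,q), which is why the range r > 1/2 is natural:
% sigma(3,3) = 3/2 lies in the Banach range, sigma(4/3,4/3) = 2/3 in the
% quasi-Banach range, and sigma(2,2) = 1 is the borderline case.
```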

Bilinear singular integrals.
We denote the diagonal with $\Delta$ and say that a mapping $K : (\mathbb{R}^d)^3 \setminus \Delta \to \mathbb{C}$ is a bilinear Calderón-Zygmund kernel if it satisfies the size estimate (2.1) and the regularity estimate (2.2). Here the function $\omega$ is increasing, subadditive, and such that $\omega(0) = 0$ and $\|\omega\|_{\mathrm{Dini}} = \int_0^1 \omega(t)\,\frac{dt}{t} < \infty$. We also assume that the appearing constants $C_K$, $\|\omega\|_{\mathrm{Dini}}$ are the best possible. We denote the collection of all such kernels with $CZ(2, d, \omega)$ and associated to this class is the norm
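For the reader's convenience, here is a commonly used normalization of the size and regularity estimates for a bilinear kernel with modulus of continuity $\omega$. It is a sketch under the assumption that the source follows the standard convention; the exact constants, the admissible range of $x'$, and the precise form of the regularity estimate in the source may differ.

```latex
% Assumed standard form; not quoted from the source.
\begin{align*}
|K(x,y,z)| &\le \frac{C_K}{(|x-y|+|x-z|)^{2d}}, \\
|K(x,y,z)-K(x',y,z)| &\le
  \omega\Big(\frac{|x-x'|}{|x-y|+|x-z|}\Big)\frac{1}{(|x-y|+|x-z|)^{2d}},
\end{align*}
% The second estimate is assumed for |x-x'| \le \tfrac12 \max(|x-y|,|x-z|),
% together with the symmetric estimates in the variables y and z.
```

The exponent $2d$ is consistent with the lower bound $|K(x,y,y)| \gtrsim r^{-2d}$ used in the non-degeneracy definitions.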

Definition. A bilinear operator $T : \Sigma \times \Sigma \to L^1_{\mathrm{loc}}$ is said to be a (variable kernel) bilinear SIO, if there exists a bilinear kernel $K \in CZ(2, d, \omega)$ so that

Definition.
A bilinear operator T Ω : Σ 2 → L 1 loc is said to be a rough bilinear SIO whenever it is well-defined as where

2.3. Truncations of bilinear SIOs. We let $K \in CZ(2, d, \omega)$ and let $T_\varepsilon$, $\varepsilon > 0$, denote the associated truncated operator. A particular case of Cotlar's inequality in the bilinear setting, recorded on the line (2.6), controls the truncations in terms of $T$ and the Hardy-Littlewood maximal operator $M$. For (2.6), see e.g. Grafakos, Torres [7]. Since $T$, $M$ are bounded, it follows that $\sup_{\varepsilon>0} \|T_\varepsilon\|_{L^p\times L^q\to L^{\sigma(p,q)}} < \infty$ for $1 < p, q < \infty$. For rough kernels Cotlar's inequality was not found. However, to achieve a uniform bound on the truncations we need less. It was very recently shown in Theorem 1.1 of [6] that under the assumptions $\Omega \in L^q(S^{2d-1})$, for some $q > \frac{4}{3}$, and $\int_{S^{2d-1}} \Omega = 0$, there holds that $\|T^*_\Omega\|_{L^2\times L^2\to L^1} \lesssim \|\Omega\|_{L^q(S^{2d-1})}$, and this implies a uniform bound of the desired type.

2.4. Boundedness assumptions on $T$. The majority of this paper is devoted to proving commutator lower bounds, and there we do not need any boundedness assumptions on the operator $T$, only some non-degeneracy assumptions on the kernel $K$ of $T$ and some very weak regularity conditions, see Section 2.5 below. For the upper bounds we impose some boundedness on $T$, and this will vary depending on whether we are in the sub-diagonal, diagonal or super-diagonal case.
We will assume (1) for the diagonal upper bound, (2) for the super-diagonal upper bound and (3) for the sub-diagonal upper bound.
2.8.Remark.The bilinear Riesz transforms, one of which is (2) for all points z, there exists two points x, y such that max a,b∈{x,y,z} It is immediate from the size estimate that if y, r and then x, z are given as in the item (1) of Definition 2.11 then max a,b∈{x,y,z} |a − b| ∼ r.Indeed, to see this, we simply check that ) which shows the claim.2.13.Remark.We will use the assumption (1) to prove Theorem 1.6 for the index i = 1 and the assumption (2) for the index i = 2.It follows that we will run the proofs of all of our results through with the assumption (1) of Definition 2.11 and it is clear how to modify them to get the case i = 2.

2.15. Remark. The kernel of the bilinear Riesz transform $R_i$ satisfies both items (1) and (2) in Definition 2.11 and is also non-degenerate when considered as a rough bilinear SIO as in Definition 2.14.
In [15] the following definition of non-degeneracy is given; to contrast it with the nondegeneracy we name it the strong non-degeneracy.It is straightforward that strong non-degeneracy is stronger than non-degeneracy.
2.17. Proposition. Let $K$ be a strongly non-degenerate kernel. Then, the kernel $K$ is non-degenerate.
Proof. We only show the point (1) from Definition 2.11. Fix a point $y \in \mathbb{R}^d$ and $r > 0$; then by strong non-degeneracy there exists a point $x \in B(y, r)^c$ so that $|K(x, y, y)| \gtrsim r^{-2d}$. Consequently, it remains to write the previous estimate as $|K(x, y, z)| \gtrsim r^{-2d}$, where $z = y$, and to notice that $x \in B(y, r)^c$.

2.18. Definition. We say that a bilinear SIO $T$ is non-degenerate if its kernel $K$ is non-degenerate. Similarly, a bilinear CZO is non-degenerate, if it is a non-degenerate bilinear SIO that satisfies at least one of the properties (1), (2), (3) as in Definition 2.7.

BILINEAR APPROXIMATE WEAK FACTORIZATION
When proving the commutator lower bounds we do not need the full strength of the kernel assumption (2.2) and we will replace this with the following weaker assumption: the function ω satisfies ω(0) = 0, is increasing, is subadditive and such that Another strengthening is in that the awf argument only requires us to consider the following off-support information on the kernel K, where Q 0 , Q 1 , Q 2 are cubes of the same size such that max i=0,2 dist(Q i , Q 1 ) ∼ ℓ(Q 1 ).To press the point, there is no reference whatsoever to the operator T and everything is defined with the kernel K only.We move to prove the main technical Propositions 3.3 and 3.15.Recall that we only need the assumption (1) from Definition 2.11 to show the lower bounds for the commutator [b, T ] 1 .In the following we always work with three cubes Q 0 , Q 1 , Q 2 and variables are reserved to be used as follows,

Definition.
A dyadic grid on R d is a collection D of cubes satisfying the following.
Let Q 1 ⊂ R d be a cube with centre point c Q 1 and let D 0 and D 2 be arbitrary dyadic grids.
Then, there exists a constant , so that the following items hold.
(i) The cubes are separated and have size as follows

4)
(ii) There holds that where ω(A −1 ) → 0 as A → ∞. (iv) There holds that Moreover, the similar estimates to (3.6) and (3.7)where we always integrate over any two of the cubes Q 0 , Q 1 , Q 2 with the corresponding variables x, y, z, hold.
3.8.Remark.We only need to set up Q i ∈ D i , i = 0, 2, for the study of the super-diagonal case r = 1.
Proof of the case (1): As we will mostly manage without the property $Q_i \in D_i$, we first find any two cubes $Q_i$, $i = 0, 2$, satisfying the rest of the claims.
We fix a cube The fact that we have $\sim$ above where indicated by $*$ follows from the discussion after Definition 2.11, see line (2.12). Hence the claim (3.5) holds. Then, we let be the cubes respectively with the centre points $c_{Q_0}$ and $c_{Q_2}$. Then, it is clear that the claims on the line (3.4) hold.
Towards the remaining two claims, we first estimate Then, as for all points ) is applicable and we estimate first of the three intermediate terms as where in the last estimate we used the sub-additivity of ω.
The remaining two terms estimate similarly and consequently we find that Now, by choosing A large enough, subtracting and adding K(c Q 0 , c Q 1 , c Q 2 ) and using (3.9) and (3.10) we actually find that which is an improvement of (3.5).Similarly, by using the estimates (3.9) and (3.10), the claims (3.6) and (3.7) follow immediately.
We still need to argue that we can arrange Q i ∈ D i , for i = 0, 2. Assume that we have shown the claims for the triple of cubes and hence (3.5) is checked.The remaining claims are similarly immediate (with Q i d in place of Q i ) and follow as before.

Proof of the case (2):
We first check the claims with balls in place of cubes.By the nondegeneracy assumption let θ = (θ 0 , θ 2 ) ∈ S 2d−1 be a non-zero Lebesgue point of Ω.Then, fix a ball B 1 with centre c B 1 and radius r.Let the points c B 0 , c B 2 be defined by the following identities , and let B i be a ball with centre c B i and radius r.It is then clear that (3.4) holds and that It remains to check (3.6) and (3.7).Let x ∈ B 0 , y ∈ B 1 , z ∈ B 2 be arbitrary and write for a specific u a ∈ B(0, 1) depending on the parameter a ∈ {x, y, z}.To ease notation we write Ω(h ′ ) = Ω(h) and K Ω = K.Then, we have where and With a choice of A large enough we find that where as indicated by * the mean valued theorem was applied with x → x 2d and we used the estimate With a fixed y, the point u x − u y varies over B(0, 2) and with a fixed y, x the point u x − u z varies over B(0, 2).Hence, we estimate where Having the preceding estimate together with (3.13) shows (3.6), As before, (3.7) follows from (3.5) and (3.6).Lastly, we replace the balls with the desired cubes.Let Q 1 be a cube with centre point ) satisfy the claims (3.4) and (3.5).Of the remaining claims, the claim (3.6) follows, for example, by using the just shown result for balls, From now on whenever we fix a cube Q 1 , the associated cubes Q 0 and Q 2 will stand for the cubes generated through Proposition 3.3.If a function has support in the cube Q i then it has the subscript i or Q i , e.g. if spt(g) ⊂ Q i , then we write g = g i = g Q i .
3.15.Proposition.Suppose that K is a non-degenerate bilinear kernel.Then, there exists a large parameter A so that supposing the following items: (i) let Q 1 be a cube and let Q 0 , Q 2 stand for the cubes generated by Proposition 3.3 above, (ii) let f be a locally bounded function with zero mean supported on the cube Q 1 , (iii) let g i be functions such that spt(g i ) hold, the function f can be written as and we have the following size and support localization information where the implicit constants on the line (3.17)depend only on the implicit constants present in the point (iii) and are otherwise independent of the functions g i , i = 0, 1, 2.Moreover, there holds that ´Q1 f = 0.

3.18. Remark. If we were dealing only with the integrability exponents $p, q, r \in (1, \infty)$, then we could choose the appearing functions $g_i$ simply as $1_{Q_i}$; however, since we allow $r \in (0, 1]$, quite arbitrary functions $g_i$ have to be allowed, see the point (iii) in the statement.
Proof. We write out the function $f$ as We first check that the function $h_1$ is well-defined. We denote with and split into two parts, By the lines (3.5) and (3.6), respectively, of Proposition 3.3 we find and Consequently, after choosing $A$ sufficiently large, the estimates (3.19) and (3.20) imply that and hence, that the function $h_1$ is well-defined. Then, from (3.21) the first claim on the line (3.17) is also clear. Next, we control the term $w$. We expand We estimate the left term on the right-hand side first and for this fix a point y ∈ spt(h By the lines (3.6) and (3.7) of Proposition 3.3 we have where we used the assumption (iii). Consequently, for $x \in \mathrm{spt}(g_0) \subset Q_0$ there holds that For the remaining term, we use $\int_{Q_1} f = 0$ to estimate By having the above two estimates together it follows that There holds that All the properties of the function $f$ on the cube $Q_1$ that allowed us to run through the first iteration of the decomposition are enjoyed by the function $w$ on the cube $Q_0$. Also, for the kernel Exchanging the roles of $T$ and $T^{1*}$ we iterate the above once more and we write out the function $w$ as Repeating the above arguments, we find that Consequently, we have checked the remaining claims on the line (3.17) and it remains to check that $\int \tilde f = 0$; however, this follows by using the adjoints similarly as it did for the function $w$ on the line (3.22).
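The two iterations described in the proof can be sketched algebraically as follows. This is a reconstruction under stated assumptions, not the source's displays: the final five-term identity is taken from the factorization of $f_{P_1}$ written out in Section 5, the definitions of $h_1$, $w$, $h_0$ and $\tilde f$ are inferred from it, and $T^{1*}$ is assumed to denote the adjoint in the first slot, $\langle T(f,g), h\rangle = \langle T^{1*}(h,g), f\rangle$ with the bilinear pairing $\langle u, v\rangle = \int uv$.

```latex
% Sketch of the awf iteration; the definitions below are inferred, see lead-in.
\begin{align*}
h_1 &:= \frac{f}{T^{1*}(g_0,g_2)}, &
w &:= g_0\,T(h_1,g_2), \\
f &= h_1 T^{1*}(g_0,g_2) - g_0 T(h_1,g_2) + w, \\
h_0 &:= \frac{w}{T(g_1,g_2)}, &
\tilde f &:= g_1\,T^{1*}(h_0,g_2), \\
w &= h_0 T(g_1,g_2) - g_1 T^{1*}(h_0,g_2) + \tilde f .
\end{align*}
% Zero mean is inherited at each step by moving the adjoint:
\[
\int w = \langle T(h_1,g_2), g_0\rangle = \langle T^{1*}(g_0,g_2), h_1\rangle = \int f = 0,
\qquad
\int \tilde f = \langle T^{1*}(h_0,g_2), g_1\rangle = \langle T(g_1,g_2), h_0\rangle = \int w = 0 .
\]
```

In this reading, the well-definedness of $h_1$ and $h_0$ amounts to the denominators being bounded away from zero on $Q_1$ and $Q_0$, which is what the estimates from Proposition 3.3 are used for.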
In the remaining propositions of this section we relate the oscillation to commutator norms.Recall, that the oscillation of a function Also, let γ ∈ (0, 1).Then, a subset F ′ ⊂ F is said to be a γ-major subset, if 3.23.Proposition.Suppose that K is a bilinear non-degenerate kernel, b ∈ L 1 loc and γ ∈ (0, 1).Fix a cube Q 1 and let g i = 1 E Q i , for i = 0, 1, 2, where E Q i ⊂ Q i is a γ-major subset.Then, there holds that and we have the following size and support localization information, where the implicit constants depend only on γ.
Proof. By $b - \langle b\rangle_{Q_1}$ having zero mean on the cube $Q_1$ and duality find a function $f$ with the properties By Proposition 3.15 we write out the function $f$ to arrive at The claims on the line (3.25) follow immediately from $\|f\|_{L^\infty} \le 2$, the choice of the functions $g_i$ and the corresponding information in Proposition 3.15. Then, by $\int \tilde f = 0$ and the bound (3.17) on the error term, we estimate Consequently, we find that Now, as $b \in L^1_{\mathrm{loc}}$, the common term shared on both sides of the estimate is finite, and hence, by choosing $A$ large enough, we absorb it to the left-hand side and the claim follows.
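The way the factorization produces commutator pairings can be made explicit. The identity below is a sketch derived from the five-term decomposition of $f$ as it is written out in Section 5, again assuming that $T^{1*}$ is the adjoint in the first slot and that $\langle u, v\rangle = \int uv$; it is not a display recovered from the source.

```latex
% Pair the decomposition of f against b and move b across the adjoint.
\begin{align*}
\int b f
&= \int b\,\big[h_1 T^{1*}(g_0,g_2) - g_0 T(h_1,g_2)\big]
 + \int b\,\big[h_0 T(g_1,g_2) - g_1 T^{1*}(h_0,g_2)\big]
 + \int b \tilde f \\
&= -\big\langle [b,T]_1(h_1,g_2),\, g_0 \big\rangle
 + \big\langle [b,T]_1(g_1,g_2),\, h_0 \big\rangle
 + \int b \tilde f .
\end{align*}
% If \tilde f is supported in Q_1 and has zero mean, the error term only sees
% the oscillation of b on Q_1:
\[
\int b \tilde f = \int_{Q_1} \big(b - \langle b\rangle_{Q_1}\big)\tilde f .
\]
```

This is the form in which the error term shares a common factor with the left-hand side and can be absorbed once $A$ is large.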
The off-support norms that model the commutator norm will be given next.When 1 < r < ∞ we will use the following off-support norm.
3.26.Definition.Let p, q, r ∈ (1, ∞).Let K be a kernel that is locally bounded outside the diagonal ∆ and let b ∈ L 1 loc .Then, we define the off-support norm where the supremum is taken over all triples of cubes Q

Remark. It is immediate from Hölder's inequality that
, whenever K is the kernel of T .For example, when r = 1 and r ′ = ∞, we have When 0 < r < 1 we will use the following off-support norm.
3.28.Definition.Let r ∈ (0, ∞) and p, q ∈ (1, ∞), let K be any kernel that is locally bounded outside the diagonal and let b ∈ L 1 loc .Then, we define the weak off-support norm O ∞,A p,q,r (b; K) to be the smallest constant C such that for all triples of cubes Q 0 , Q 1 , Q 2 of the same size and functions there exists a major subset We now fix the constant A to be so large that all the above propositions where it appears are applicable.Hence, we will also drop the superscript A from the off-support norms 3.26 and 3.28 and only write O ∞ p,q,r , O p,q,r .As O ∞ p,q,r (b; K) ≤ O p,q,r (b; K), also O ∞ p,q,r is a reasonable off-support norm in the Banach range of exponents.Before connecting the off-support norms to the commutator, we remark the following a priori upper bound.
3.29.Remark.If K is a bilinear kernel satisfying the size estimate (2.1), then This is quickly seen as follows: fix triples Q i , f i for i ∈ {0, 1, 2} as in the Definition 3.26 and let Q be a minimal cube such that Q 0 , Q 1 ⊂ Q.Then, by the triangle inequality we see that it is enough to control two symmetric terms of which the other one is and is controlled as Next, we relate the weak off-support norm O ∞ p,q,r (b; K) to the commutator.For this, recall that a function f belongs to the space The following Lemma 3.30 is standard, see e.g.Section 2.4.Dualization of quasi-norms in the book [17] of Muscalu and Schlag.Taken together, the following two propositions control the oscillation with the commutator norm.
3.31.Proposition.Let p, q, r ∈ (0, ∞) be arbitrary exponents.Then, there holds that whenever T has the kernel K and the commutator is well-defined.
Proof.Consider a triple Q 0 , Q 1 , Q 2 and functions f 1 , f 2 as in the Definition 3.28 of O ∞ p,q,r (b; K).Clearly we may assume that the right-hand side of (3.32) is finite and hence that [b, T ] 1 (f 1 , f 2 ) L r,∞ < ∞.Then, denote F = Q 0 and let F ′ ⊂ F be the major subset given by the item (2) in Lemma 3.30 such that for all functions |f 0 | ≤ 1 F ′ there holds that which implies the claim.
3.33. Proposition. Let $p, q, r \in (0, \infty)$ be arbitrary exponents and let $K$ be a bilinear non-degenerate kernel. Then, for all cubes

Proof. Fix a cube $Q_1$ and let $Q_0$, $Q_2$ be the cubes given by Proposition 3.15 and let Then, according to the definition of $O^\infty_{p,q,r}(b; K)$ let $F' \subset Q_0$ be a major subset and define $g_0 = 1_{F'}$. Then, by Proposition 3.23, we find that

In this section we will be either on the diagonal, meaning that $r^{-1} = \sigma(p, q)^{-1}$, or on the sub-diagonal, meaning that $r^{-1} < \sigma(p, q)^{-1}$. In both cases the lower bounds are formulated simultaneously in Theorem 4.1 and the upper bounds in Theorems 4.2 and 4.15.
The following upper bound is well-known and is recorded e.g. in [15].
The sub-diagonal upper bound in Theorem 4.15 requires some preparation consisting of extending parts from the linear theory to the bilinear setting.We refer the reader to Grafakos [5] for a complete account of the corresponding linear theory.4.3.Proposition.Let U, T : Σ × Σ → L 1 loc be bilinear SIOs with the same kernel.Then, there exists a function m ∈ L 1 loc so that c and the function m is bounded.Proof.We will first show the so called consistency condition: Let Q ⊂ R d be a cube and f 1 , f 2 ∈ Σ, then almost everywhere We reduce this to two parts, clearly (4.4) follows if we show that and We only show the claim (4.6), the claim (4.5) being similar.
As the operators U, T share the kernel K, i.e.U ε = T ε , the claim (4.6) follows if we show that for H ∈ {U, T } and all points x ∈ R d , there exists ε > 0 such that Assume first that x ∈ (Q ∪ ∂Q) c (the claim is made modulo sets of measure zero and hence we remove the boundary).Then, choose ε = 1 2 dist(x, ∂Q) so that for all points y ∈ Q there holds that max(|x and we find both sides of (4.7) to be zero.Then, let x ∈ Q \ ∂Q and again fix ε = 1 2 dist(x, ∂Q).Then, as above, we see that Hence, the identity (4.6) holds almost everywhere.Then, we define the function m by and, as the intersection of two cubes is a cube, the property (4.4) shows that this is welldefined.
Then, let f i , i = 1, 2, be simple and let x ∈ R d .Fix a cube Q such that spt(f 1 ) ∪ spt(f 2 ) ∪ {x} ⊂ Q.Then, by (4.4) and linearity there holds that Consequently, we have shown that U − T = m on Σ × Σ, and this also gives m ∈ L 1 loc by testing against simple functions.
If U, T are bounded operators (say = mf 1 f 2 follows by approximating L 4 functions with those in the class Σ (for which the identity holds) and as L ∞ c ⊂ L 4 the desired identity follows.Also, by testing against simple functions, it follows by the boundedness of U, T that necessarily 4.9.Proposition.Let K be a kernel such that sup ε>0 T ε L p ×L q →L σ(p,q) < ∞ for some exponents satisfying 1 < p, q < ∞ and 1 ≤ σ(p, q).Then, there exists a bounded bilinear operator T 0 : L p × L q → L σ(p,q) with the kernel K and a sequence ε k → 0 such that In addition, if T is a CZO with the kernel K, then (by Proposition 4.3) there exists a bounded function m such that T 0 = T + m.
Proof.We will show the argument with the exponents p = q = 3 and σ(p, q) = 3 2 .Let F be a countable dense subset of L 3 .By the bound sup ε>0 T ε L 3 ×L 3 →L 3 2 < ∞, Hölder's inequality and a diagonalization argument, we find a sequence ε k → 0 such that for all defines a bounded linear functional on L 3 ∩ F with norm By Cauchy sequences this extends as a bounded linear functional to the whole of L 3 and then the Riesz representation theorem gives a function Then, we define the operator is a bounded bilinear operator with the kernel K that satisfies (4.10) for functions f 1 , f 2 ∈ F and f 3 ∈ L 3 .Again, by Cauchy sequences T 0 extends as a bounded bilinear functional to the whole L 3 × L 3 and then it remains to argue that T 0 has the kernel K and that (4.10) holds for f 1 , f 2 , f 3 ∈ L 3 .That T 0 has the kernel K follows by how T 0 was extended to L 3 × L 3 via Cauchy sequences, the kernel representation being valid in F × F and the dominated convergence theorem.Similarly we find that (4.10) holds for f 1 , f 2 , f 3 ∈ L 3 .As L ∞ c ⊂ L 3 , we are done.4.12.Proposition.Let T be a SIO with a kernel K such that sup ε>0 T ε L p ×L q →L σ(p,q) < ∞ for some exponents satisfying 1 < p, q < ∞ and 1 ≤ σ(p, q), and let f 1 , f 2 ∈ L ∞ c and b ∈ Ċ0,α .Then, there holds that Then by Proposition 4.9 we have where the last step marked with * follows by the dominated convergence theorem after the following estimate (uniform in ε k ) where the finiteness follows simply by the fact that f, g ∈ L ∞ c and that the appearing singularity is weak enough to be locally integrable (see also the last estimate in the proof of Theorem 4.15).Now as (4.14) holds for all test functions f 3 the claim on the line (4.13)follows.
4.15. Theorem. Let $b \in L^1_{\mathrm{loc}}$, let $\frac{1}{2} < r, p, q < \infty$ be such that $r^{-1} < \sigma(p, q)^{-1}$, let $\alpha := d\big(\sigma(p, q)^{-1} - r^{-1}\big)$, and let $T$ be a bilinear SIO such that $\sup_{\varepsilon>0} \|T_\varepsilon\|_{L^{p_0}\times L^{q_0}\to L^{\sigma(p_0,q_0)}} < \infty$ for one tuple of exponents $1 < p_0, q_0 < \infty$ with $\sigma(p_0, q_0) \ge 1$. Then, $\|[b,T]_1\|_{L^p\times L^q\to L^r} \lesssim \|b\|_{\dot C^{0,\alpha}}$.

Proof. Clearly we may assume that $\|b\|_{\dot C^{0,\alpha}} < \infty$, as otherwise the claimed estimate is immediate. By density it is enough to prove the claim for functions $f_1, f_2 \in L^\infty_c$. Then, by Proposition 4.12 we write the commutator in a closed form and estimate it in terms of the operator $I_\alpha$. The operator $I_\alpha$ is the multilinear fractional integral of Kenig and Stein, see [11], where its boundedness is fully characterized: it satisfies exactly the claimed estimates.
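The pointwise estimate behind this proof can be sketched as follows. The display is a reconstruction under assumptions: the closed form of the commutator is taken to be the usual kernel representation, the size estimate is taken in the standard form $|K(x,y,z)| \le C_K(|x-y|+|x-z|)^{-2d}$, the Hölder seminorm is $\|b\|_{\dot C^{0,\alpha}} = \sup_{x\ne y}|b(x)-b(y)|/|x-y|^{\alpha}$, and the last integral is assumed to be the form of $I_\alpha$ meant in the text.

```latex
% Sketch under the assumptions named in the lead-in.
\begin{align*}
\big|[b,T]_1(f_1,f_2)(x)\big|
&= \Big| \iint \big(b(x)-b(y)\big) K(x,y,z)\, f_1(y) f_2(z)\, dy\, dz \Big| \\
&\le C_K \|b\|_{\dot C^{0,\alpha}}
   \iint \frac{|x-y|^{\alpha}\,|f_1(y)|\,|f_2(z)|}{(|x-y|+|x-z|)^{2d}}\, dy\, dz \\
&\le C_K \|b\|_{\dot C^{0,\alpha}}
   \iint \frac{|f_1(y)|\,|f_2(z)|}{(|x-y|+|x-z|)^{2d-\alpha}}\, dy\, dz .
\end{align*}
% The choice of alpha in the theorem is exactly the fractional-integration scaling:
\[
\alpha = d\Big(\frac{1}{p}+\frac{1}{q}-\frac{1}{r}\Big)
\quad\Longleftrightarrow\quad
\frac{1}{r} = \frac{1}{p}+\frac{1}{q}-\frac{\alpha}{d} .
\]
```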
Then, there holds that $\|[b,T]_1\|_{L^p\times L^q\to L^r} \lesssim \|b\|_{\dot L^s}$.
Proof.We first estimate By Hölder's inequality and the boundedness of T we find that By the boundedness of T and Hölder's inequality we have Taking the infimum over all c ∈ C shows the claim.
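The Hölder steps of this proof can be sketched as below. The exponent relations are assumptions inferred from the structure of the argument, since the statement's hypotheses are not fully legible in the source: we take $\frac{1}{r} = \frac{1}{s} + \frac{1}{p} + \frac{1}{q}$, define $p_1$ by $\frac{1}{p_1} = \frac{1}{s}+\frac{1}{p}$, so that $\sigma(p_1,q) = r$, and assume that $T$ is bounded both $L^p\times L^q \to L^{\sigma(p,q)}$ and $L^{p_1}\times L^q\to L^r$.

```latex
% Sketch; c is an arbitrary complex constant, and the exponents are as assumed
% in the lead-in. Constants commute with T, so [b,T]_1 = [b-c,T]_1.
\begin{align*}
[b,T]_1(f,g) &= (b-c)\,T(f,g) - T\big((b-c)f,\,g\big), \\
\|(b-c)\,T(f,g)\|_{L^r}
 &\le \|b-c\|_{L^s}\,\|T(f,g)\|_{L^{\sigma(p,q)}}
 \le \|T\|_{L^p\times L^q\to L^{\sigma(p,q)}}\,\|b-c\|_{L^s}\,\|f\|_{L^p}\|g\|_{L^q}, \\
\big\|T\big((b-c)f,\,g\big)\big\|_{L^r}
 &\le \|T\|_{L^{p_1}\times L^q\to L^{r}}\,\|(b-c)f\|_{L^{p_1}}\|g\|_{L^q}
 \le \|T\|_{L^{p_1}\times L^q\to L^{r}}\,\|b-c\|_{L^s}\,\|f\|_{L^p}\|g\|_{L^q}.
\end{align*}
% Taking the infimum over c gives the bound in terms of
% \|b\|_{\dot L^s} = \inf_{c \in \mathbb{C}} \|b-c\|_{L^s}.
```

When $r < 1$ the two terms are combined with the quasi-triangle inequality, which only affects the implicit constant.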

Definition.
We say that a collection of sets S is γ-sparse, if there exists a pairwise disjoint collection of γ-major subsets Our sparse collections will be built by splitting into dyadic scales.For a cube and a dyadic grid Q ∈ D, we denote D Q = {P ∈ D : P ⊂ Q}.Let f be a locally integrable function and let Q ∈ D be a cube, then we set and form the principal stopping time family S ⊂ D Q by For an arbitrary collection S ⊂ D of dyadic cubes and for each Q ∈ S we let ch S (Q) consist of the maximal cubes P ∈ S such that P Q.For a given cube Q ∈ S we denote E Q = Q \ ∪ P ∈ch S Q P and for each Q ∈ D we let ΠQ = Π S Q denote the minimal cube P in S such that Q ⊂ P (on the condition that it exists).With this notation then, ch S (P ) = {Q ∈ S : Q P, ΠQ = P }.We also denote ch 0 S (P ) = ch S (P ) and ch k+1 S (P ) = ∪ Q∈ch k S (P ) ch S Q. Lastly, given a cube Q ∈ D, we denote The following Lemma is recorded e.g. in [9].
where the number N is finite and depends only on f L ∞ (Q) , as and the functions f P satisfy: (1) ´fP = 0, (2) P ∈S k f P s ∞ 1 P (M f ) s , s > 0. 5.5.Lemma.Let S be a sparse collection, let γ > 0 and let D be a dyadic grid.To each cube Proof.Let P , H ∈ S be such that H P .Then, from that dist(H, H) ≤ γℓ(H) and ℓ(H) ℓ( H), it follows that there exists a constant β ∼ γ so that H ⊂ β P .Consequently, we find that where we used ℓ( H) ℓ(H) in the second estimate and the sparseness of S in the third and the fourth estimates.We have shown that the collection S is Carleson and as the Carleson condition is equivalent with sparseness for dyadic collections, for this fact see e.g. the book of Lerner and Nazarov [13], the claim follows.5.6.Lemma.Let p ∈ (1, ∞) and S be a sparse collection.Then, for any constants a Q there holds that Proof.The claim follows by duality and the following estimate loc , 1 ≤ r, p, q < ∞ and let K be locally bounded away from the diagonal.Then, we define the super-diagonal off-support norm where and the supremum is taken over all finite collections of triples of cubes of the same diameter such that max 5.8.Remark.If r > 1, then we can replace the entries 1 spt(f i,k ) , in the three terms of 5.9.Remark.For bilinear operators U there holds that where ε i , ε ′ i are independent random signs, over some probability spaces with expectations denoted respectively as E, E ′ , meaning that Eε Then, Hölder's inequality shows that for r ≥ 1 we have O Σ p,q,r (b; K) ≤ [b, T ] 1 L p ×L q →L r , and consequently, that O Σ p,q,r is a reasonable off-support constant for r ≥ 1.
Proof.Let Q 1 be an arbitrary dyadic cube and let D be a dyadic grid containing the cube Q 1 .Fix a constant M > 0 and let f be a function such that Note that s > r and hence s, s ′ > 1 are both in the Banach range of exponents.Let S ⊂ D Q denote the sparse collection of cubes we obtain through Lemma 5.3.Write the function f as on the line (5.4) and by Proposition 3.15 factorize each of the terms f P 1 , P 1 ∈ S , as in (3.16), to arrive at f P 1 = h P 1 T 1 * (g P 0 , g P 2 ) − g P 0 T (h P 1 , g P 2 ) + h P 0 T (g P 1 , g P 2 ) − g P 1 T 1 * (h P 0 , g P 2 ) + f P 1 , where we have written h i = h P i and g i = g P i .Next, we will specify how the cubes P 0 , P 2 and the functions g P i are chosen.
By Proposition 3.3 we can assume P 0 , P 2 ∈ D.Then, by Lemma 5.5 the collection S 0 = {P 0 : P 1 ∈ S } ⊂ D is sparse and we will denote the pairwise disjoint major subsets with E P 0 .By Proposition 3.15 we are free to choose the functions g P i under the condition |g i | Q i g i ∞ 1 and clearly the following choices suffice, Now, the off-support norm only controls finite sums, but the collection S is potentially infinite.Hence, we empty the collection S through an increasing chain of finite subcollections where we denote f Σ = P 1 ∈S f P 1 and the implicit change of integration and summation is easily checked by the dominated convergence theorem after the subsequent estimates.
We first consider the case r > 1 in which all the three terms of RHS are estimated similarly.From that dist(P 1 , P i ) ℓ(P i ) and ℓ(P 1 ) ℓ(P i ) it follows that there exists an absolute constant C > 0 such that CP i ⊃ P 1 .This means that the collections {CP i : P 1 ∈ S } are sparse with the major subsets E P 1 .Hence, by Lemma 5.6, for i ∈ {0, 1, 2} and v ∈ (1, ∞) and u ∈ (0, ∞), there holds that where in the last estimate we used the point-wise estimate (3) from Lemma 5.3.Now, we find that ≤ O Σ p,q,r (b; K), where we used s ′ > 1 in the second estimate for the boundedness of the maximal function.
In the case r = 1 the first two terms of RHS estimate the same as in the case r > 1 and the last term estimates differently RHS O Σ p,q,r (b; K) ∞ 1 E P 0 L r ′ (R d ) = O Σ p,q,r (b; K) ≤ O Σ p,q,r (b; K) 1 L ∞ (R d ) = O Σ p,q,r (b; K), the crucial step here was the disjointness of the sets E P 0 .The common term on both sides of the estimate (5.17) is finite (recall that b ∈ L 1 loc and for each f as in the supremum f ∞ < M ), and hence by choosing A sufficiently large, by absorbing the common term to the left-hand side we find that sup (5.11) | ˆQ bf | O Σ p,q,r (b; K). (5.18)Then, as s > 1, the proof is concluded with exactly the same argument by Riesz' representation theorem as in [9].For the convenience of the reader we give the full details.Denote L ∞ c,0 = {ϕ : ´ϕ = 0, ϕ ∈ L ∞ c }, where L ∞ c denotes bounded and compactly supported functions.As the right-hand side of (5.18) is independent of the cube Q and the constant M, we find that O Σ p,q,r (b; K) defines a bounded linear functional in a dense subset of L s ′ .By density and linearity we find a linear extension Λ : L s → C of Λ such that Λ L s ′ →C ≤ Λ L s ′ ∩L ∞ c,0 →C .By the Riesz representation theorem there exists a function a satisfying a L s ≤ Λ L s ′ →C and Λf = ´af, for all f ∈ L s ′ .Especially, as Λ extends Λ, there holds that = p.v. ˆR f (x − y) dy y , was characterized by Hankel operators.Later, Coifman and Rochberg and Weiss extended Nehari's result by real analytic methods and showed that (1.1) b BMO d j=1 [b, R j ] L p (R d )→L p (R d ) b BMO := sup I I |b − b I |, p ∈ (1, ∞), where |A| denotes the Lebesgue measure of the set A. The indicator function of a set A is denoted by 1 A .

2.16. Definition. A kernel $K : (\mathbb{R}^d)^3 \setminus \Delta \to \mathbb{C}$ is said to be strongly non-degenerate if for each given point $y \in \mathbb{R}^d$ and $r > 0$ there exists a point $x \in B(y, r)^c$ such that $|K(x, y, y)| \gtrsim r^{-2d}$.
satisfy the claims on the line (3.4), and as (3.11) is valid especially with the triple of points

Let $\psi^k_x = 1_{B(x,k^{-1})}/|B(x,k^{-1})|$ be an approximation to identity at the point $x$ and define $\varphi^k_{x,y} = \psi^k_x - \psi^k_y$. Then $\varphi^k_{x,y} \in L^{s'} \cap L^\infty_{c,0}$ and we find by (5.19) and the Lebesgue differentiation theorem that $b(x) - b(y) = \lim_{k\to\infty} \int b\varphi^k_{x,y} = \lim_{k\to\infty} \int a\varphi^k_{x,y} = a(x) - a(y)$. It follows that $b = a + c$ for some constant $c$, and especially that $\|b\|_{\dot L^s} \lesssim O^{\Sigma}_{p,q,r}(b; K)$. We are done.

Having Propositions 5.1 and 5.10 together gives us

2d, and (3.6) together with (3.5) implies (3.7). The last claim (we can integrate over any two of the cubes) follows by noting that the estimate for the term II was point-wise and inspecting the estimate (3.14) for the term I.