Inequalities for $L^p$-norms that sharpen the triangle inequality and complement Hanner's Inequality

In 2006 Carbery raised a question about an improvement on the na\"ive norm inequality $\|f+g\|_p^p \leq 2^{p-1}(\|f\|_p^p + \|g\|_p^p)$ for two functions in $L^p$ of any measure space. When $f=g$ this is an equality, but when the supports of $f$ and $g$ are disjoint the factor $2^{p-1}$ is not needed. Carbery's question concerns a proposed interpolation between the two situations for $p>2$. The interpolation parameter measuring the overlap is $\|fg\|_{p/2}$. We prove an inequality of this type that is stronger than the one Carbery proposed. Moreover, our stronger inequalities are valid for all $p$.


Introduction and main theorem
Since $|z|^p$ is a convex function of $z$ for $p \ge 1$, for any measure space the $L^p$ unit ball, $\{f : \int |f|^p \le 1\}$, is convex. One way to express this is with Minkowski's triangle inequality $\|f+g\|_p \le \|f\|_p + \|g\|_p$. Another is the inequality
\[ \|f+g\|_p^p \le 2^{p-1}\left(\|f\|_p^p + \|g\|_p^p\right), \tag{1.1} \]
valid for any functions $f$ and $g$ on any measure space. There is equality if and only if $f = g$ and, in Theorem 1.1, we improve (1.1) substantially when $f$ and $g$ are far from equal.
In 2006 Carbery proposed [3] several plausible refinements of (1.1) for $p \ge 2$, of which the strongest was
\[ \|f+g\|_p^p \le \left(1 + \frac{\|fg\|_{p/2}}{\|f\|_p\,\|g\|_p}\right)^{p-1}\left(\|f\|_p^p + \|g\|_p^p\right). \tag{1.2} \]
There is equality in (1.2) both when $f = g$ and when $fg = 0$. Thus, (1.2), if true, can be viewed as a refinement of (1.1) in which there is equality not only when $f = g$ but also when $fg = 0$.
The ratio $\Gamma := \|fg\|_{p/2}/(\|f\|_p\|g\|_p)$ varies between $0$ and $1$ and, therefore, the factor $(1+\Gamma)^{p-1}$ varies between $1$ and $2^{p-1}$, interpolating between the two cases of equality in (1.2).
c 2018 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.
Work partially supported by NSF grants DMS-1501007 (E.A.C.), DMS-1363432 (R.L.F.), PHY-1265118.

We propose and prove a strengthening of (1.2) in which $\Gamma$ is replaced by the quantity
\[ \widetilde\Gamma := \frac{\|fg\|_{p/2}}{\left(\frac{\|f\|_p^p + \|g\|_p^p}{2}\right)^{2/p}}, \tag{1.3} \]
which is smaller by virtue of the arithmetic-geometric mean inequality.
Our improved inequalities are not restricted to $p > 2$, but are valid for all $p \in \mathbb{R}$, as stated in Theorem 1.1. There we write, for $p \in [0,1] \cup [2,\infty)$,
\[ \|f+g\|_p^p \le \left(1 + \widetilde\Gamma\right)^{p-1}\left(\|f\|_p^p + \|g\|_p^p\right). \tag{1.4} \]
We note that inequality (1.2) involves three kinds of quantities on the right side ($\|fg\|_{p/2}$, $\|f\|_p^p + \|g\|_p^p$ and $\|f\|_p\|g\|_p$), while our inequality involves only two ($\|fg\|_{p/2}$ and $\|f\|_p^p + \|g\|_p^p$), a simplification that is essential for our proof. The inequality reverses if $p \in (-\infty, 0) \cup (1, 2)$, where, for $p \in (1, 2)$, it is assumed that $f$ and $g$ are positive almost everywhere.
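As a quick numerical sanity check (not part of the proof), the inequality (1.4) and the comparison $\widetilde\Gamma \le \Gamma \le 1$ can be tested on a finite measure space, that is, for vectors in $\mathbb{R}^n$ with counting measure; the helper function below is ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def lp_data(f, g, p):
    """Return ||f||_p^p, ||g||_p^p and ||fg||_{p/2} for counting measure."""
    fp = np.sum(f**p)
    gp = np.sum(g**p)
    fg = np.sum((f * g)**(p / 2))**(2 / p)
    return fp, gp, fg

p = 4.0  # a value in the range p >= 2
for _ in range(1000):
    f = rng.random(10)
    g = rng.random(10)
    fp, gp, fg = lp_data(f, g, p)
    Gamma = fg / (fp**(1 / p) * gp**(1 / p))    # Carbery's ratio in (1.2)
    Gamma_t = fg / ((fp + gp) / 2)**(2 / p)     # the smaller ratio (1.3)
    assert Gamma_t <= Gamma <= 1 + 1e-12        # AM-GM and Cauchy-Schwarz
    lhs = np.sum((f + g)**p)                    # ||f+g||_p^p
    assert lhs <= (1 + Gamma_t)**(p - 1) * (fp + gp) * (1 + 1e-12)  # (1.4)
```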
We note that in proving the theorem, we may always assume that f and g are nonnegative. In fact, the right side of (1.4) only depends on |f | and |g| and the left side does not decrease for p ≥ 0 and does not increase for p < 0 if f and g are replaced by |f | and |g|. The latter follows since |f + g| ≤ |f | + |g| implies |f + g| p ≤ (|f | + |g|) p for p > 0 and |f + g| p ≥ (|f | + |g|) p for p < 0.
Carbery proved that his proposed inequality is valid when f and g are characteristic functions. Our theorem can also be easily proved in this special case.
Indeed, for $\lambda \in (0,1)$, convexity of $|z|^p$ gives $|f+g|^p = |\lambda(f/\lambda) + (1-\lambda)(g/(1-\lambda))|^p \le \lambda^{1-p}|f|^p + (1-\lambda)^{1-p}|g|^p$, and hence $\|f+g\|_p^p \le \lambda^{1-p}\|f\|_p^p + (1-\lambda)^{1-p}\|g\|_p^p$. The choice $\lambda = 1/2$ is (1.1), while optimizing over $\lambda$ yields $\|f+g\|_p \le \|f\|_p + \|g\|_p$, which is Minkowski's inequality.
When p = 1 and f, g ≥ 0, (1.1) is an identity; otherwise when p > 1, there is equality in (1.1) if and only if f = g. When the supports of f and g are disjoint, however, (1.1) is far from an equality and the factor 2 p−1 is not needed. There is equality in Minkowski's inequality whenever f is a multiple of g or vice-versa. Hence although (1.1) is equivalent to Minkowski's inequality, it becomes an equality in fewer circumstances.
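The equivalence just described can be illustrated numerically: for $\lambda \in (0,1)$, convexity of $|z|^p$ gives the bound $\|f+g\|_p^p \le \lambda^{1-p}\|f\|_p^p + (1-\lambda)^{1-p}\|g\|_p^p$, whose minimum over $\lambda$ is $(\|f\|_p + \|g\|_p)^p$. A grid-search sketch (the setup and names are ours):

```python
import numpy as np

p = 3.0
a, b = 2.0, 5.0          # stand-ins for ||f||_p and ||g||_p

lam = np.linspace(1e-6, 1 - 1e-6, 200001)
bound = lam**(1 - p) * a**p + (1 - lam)**(1 - p) * b**p
best = bound.min()       # attained near lam = a/(a+b)
assert abs(best - (a + b)**p) < 1e-3   # Minkowski's bound (2+5)^3 = 343
```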
There is another well-known refinement of Minkowski's inequality for $1 < p < \infty$, namely Hanner's inequality [4,2,6], which gives the exact modulus of convexity of $B_p$, the unit ball in $L^p$. For $p \ge 2$ and unit vectors $u$ and $v$, Hanner's inequality says that
\[ \left\|\frac{u+v}{2}\right\|_p^p + \left\|\frac{u-v}{2}\right\|_p^p \le 1, \]
which is also a consequence of one of Clarkson's inequalities [1]. When $u$ and $v$ have disjoint supports, $\|u+v\|_p^p = \|u-v\|_p^p = 2$, and then the left hand side is $2^{2-p}$, so that for $p > 2$ Hanner's inequality is far from saturated at disjointly supported unit vectors. By contrast, for unit vectors $u$ and $v$ the condition $uv = 0$ yields equality in the inequality of Theorem 1.1.

The proof of Theorem 1.1 proceeds in three parts.

Part A: We show how to reduce the inequality to a simpler one involving only one function, namely $\alpha := f/(f+g)$ for $f, g > 0$, which takes values in $[0,1]$, and a reference measure that is a probability measure. This exploits the fact that the only important quantity is the ratio of $f$ to $g$. This part is very easy.
Part B: In the second part, which is more difficult than Part A, we show that Theorem 1.1 is true if it is true when the function $\alpha$ is constant. (This is the same as saying that $f$ and $g$ are proportional to each other.) When $\alpha := f/(f+g)$ is constant and the reference measure is a probability measure, (1.4) yields the inequality
\[ 1 \le \bigl(\alpha^p + (1-\alpha)^p\bigr)\left(1 + \left(\frac{2\alpha^{p/2}(1-\alpha)^{p/2}}{\alpha^p + (1-\alpha)^p}\right)^{2/p}\right)^{p-1} \tag{1.6} \]
for numbers $\alpha \in [0,1]$ and $p \in [0,1] \cup [2,\infty)$, and (1.6) with the reverse inequality for $p \in (-\infty,0) \cup (1,2)$.

Remark 1.2. The quantity
\[ R := \frac{2\alpha^{p/2}(1-\alpha)^{p/2}}{\alpha^p + (1-\alpha)^p} \]
lies in $[0,1]$ for all $\alpha$ and $p$. Therefore, $R^q$ decreases as $q$ increases. Consider the family of inequalities
\[ 1 \le \bigl(\alpha^p + (1-\alpha)^p\bigr)\left(1 + R^q\right)^{p-1}, \tag{1.7} \]
of which (1.6) is the case $q = 2/p$. Thus for $p \ge 2$, the inequality strengthens as $q$ increases, and for $p \in [0,1]$, it strengthens as $q$ decreases. Likewise, for $p \in [1,2]$ the reverse of (1.7) is stronger for smaller $q$, and for $p < 0$, it is stronger for larger $q$.
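The scalar inequality (1.6), its reversal for $p \in (1,2)$ and $p < 0$, and the role of the exponent $q = 2/p$ in (1.7) are easy to probe on a grid. A numerical illustration only (the function names are ours):

```python
import numpy as np

def rhs(alpha, p, q):
    """Right side of (1.7); q = 2/p recovers (1.6)."""
    s = alpha**p + (1 - alpha)**p
    R = 2 * (alpha * (1 - alpha))**(p / 2) / s
    return s * (1 + R**q)**(p - 1)

alphas = np.linspace(1e-3, 1 - 1e-3, 999)
for p in [-1.0, 0.3, 0.7, 1.0, 1.5, 2.0, 2.5, 4.0, 10.0]:
    vals = rhs(alphas, p, 2 / p)
    if 0 < p <= 1 or p >= 2:
        assert vals.min() >= 1 - 1e-9    # (1.6): right side is at least 1
    else:
        assert vals.max() <= 1 + 1e-9    # reversed for p in (-inf,0) and (1,2)
```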
Part C: With Parts A and B complete, the proof reduces to a seemingly elementary inequality, parametrized by $p$, for a number $\alpha \in [0,1]$. The proof of this inequality is Part C.

While the validity of (1.6) appears to be a consequence of Theorem 1.1, one can also view Theorem 1.1 as a consequence of (1.6). Moreover, the power $2/p$ in (1.6) is optimal: for $p > 2$ (resp. for $p \in (0,1)$), inequality (1.7) is false if $q > 2/p$ (resp. if $q < 2/p$).
1.1. Restatement of Theorem 1.3 in terms of means. Inequality (1.6) can be restated in terms of $q$th power means [5]: for $x, y > 0$ and $q \neq 0$, define
\[ M_q(x,y) := \left(\frac{x^q + y^q}{2}\right)^{1/q}. \]
Note that $M_0(x,y) := \sqrt{xy}$ is the geometric mean of $x$ and $y$ and $M_{-1}(x,y)$ is their harmonic mean. In these terms our inequality reads
\[ M_1(x,y) \le M_p(x,y)^{1/p}\left(\frac{M_p(x,y) + M_{-p}(x,y)}{2}\right)^{(p-1)/p} \tag{1.8} \]
for $p \in (0,1] \cup [2,\infty)$, with the reverse inequality for $p \in (-\infty,0) \cup (1,2)$.

Proof. A simple calculation shows that for all $p > 0$,
\[ \frac{M_{-p}(x,y)}{M_p(x,y)} = \frac{2^{2/p}\,xy}{(x^p + y^p)^{2/p}}. \]
Thus, taking $x = \alpha$ and $y = 1-\alpha$, the inequality (1.6) can be written as
\[ 1 \le 2\,M_p(x,y)^p\left(1 + \frac{M_{-p}(x,y)}{M_p(x,y)}\right)^{p-1}. \]
Then, by homogeneity and the fact that $M_1(\alpha, 1-\alpha) = 1/2$, (1.6) is equivalent to (1.8).

The following way to write our inequality sharpens and complements the arithmetic-geometric mean inequality for any two numbers $x, y > 0$, provided one has information on $M_p(x,y)$.
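Both the identity $M_p(x,y)\,M_{-p}(x,y) = xy$ underlying the proof and the inequality (1.8) itself can be spot-checked numerically (an illustration only; the helper names are ours):

```python
import numpy as np

def M(q, x, y):
    """q-th power mean of two positive numbers, q != 0."""
    return ((x**q + y**q) / 2)**(1 / q)

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.random(2) + 0.01
    for p in [0.5, 1.0, 1.5, 2.0, 3.0, 7.0]:
        assert abs(M(p, x, y) * M(-p, x, y) - x * y) < 1e-10  # M_p M_{-p} = xy
        lhs = M(1, x, y)
        rhs = M(p, x, y)**(1 / p) * ((M(p, x, y) + M(-p, x, y)) / 2)**((p - 1) / p)
        if p <= 1 or p >= 2:
            assert lhs <= rhs + 1e-10    # inequality (1.8)
        else:
            assert lhs >= rhs - 1e-10    # reversed for p in (1,2)
```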
Remark 1.6. Since $p \ge 1$, all of the quantities being compared in these inequalities are non-negative.
Despite the classical appearance of (1.8), we have not been able to find it in the literature, most of which concerns inequalities for means $M_p(x_1, \dots, x_n) = \bigl(\frac{1}{n}\sum_{j=1}^n x_j^p\bigr)^{1/p}$ of an $n$-tuple of non-negative numbers, often with more general weights. The obvious generalization of (1.8) from two to three non-negative numbers $x$, $y$ and $z$ is false, as one sees by taking $z = 0$: then there is no help from $M_{-p}(x,y,z)$ on the right. A valid generalization to more variables probably involves means over $M_{-p}(x_j, x_k)$ for the various pairs. In any case, as far as we know, (1.8) is new.
A truly remarkable feature of the inequality (1.8) is that it is surprisingly close to equality uniformly in the arguments. To see this, let f (α, p) denote the right hand side of (1.6).
Contour plots of this function for various ranges of $p$ are shown in Figs. 1, 2 and 3 below. As the plots indicate, $f(\alpha, p)$ is quite close (within two percent) to the constant $1$ over the range $p \ge 1$ and $\alpha \in [0,1]$.
Moreover, the "landscape" is quite flat: The gradient has a small norm over the whole domain. For p < 0, there is equality only at α = 1/2, and the inequality is not so uniformly close to an identity. The contour plot is less informative, and hence is not recorded here. This is the case in which the inequality is easiest to prove.
It is possible to give a simple direct proof of the inequality for certain integer values of p, as we discuss in Section 5. We also give a simple proof that for p > 2 and for p < 0, validity of the inequality at p implies validity of the inequality at 2p, and we briefly discuss an application of this to the problem in which functions are replaced by operators and integrals are replaced by traces.
Remark 1.7. We close the introduction by briefly discussing one other way to write the inequality (1.6). Introduce a new variable $s \in (0,1)$. Rewriting (1.6) and taking the $\frac{1}{p-1}$th root of both sides, we may rearrange terms to obtain (1.11). Taking the $\frac{1}{p-1}$th root eliminates the change of direction in the inequality at $p = 1$, and the inequality now takes on a non-trivial form at $p = 1$: defining $f_p(s)$ by (1.12) for $p \neq 1$, one easily computes the limit at $p = 1$. Theorem 1.3 is equivalent to the assertion that (1.13) holds for all $s \in (0,1)$. In this form, the inequality is easy to check for some values of $p$; for example, for $p = -1$ the resulting expression is clearly positive. One can give simple proofs of (1.13) for other integer values of $p$, e.g., $p = 3$ and $p = 4$, along these lines, but this change of variables is not what we use to prove the general inequality. It is, however, convenient for checking optimality of the power $2/p$ in (1.6).

Part A. Reduction from two functions to one
While Theorem 1.1 involves two functions $f$ and $g$, one can use the arbitrariness of the measure to reduce the question to a single function defined on a probability space (that is, a measure $\nu$ with $\int 1\,d\nu = 1$). We have already observed that it suffices to prove the inequality in the case where $f$ and $g$ are both non-negative. For non-negative functions $f$ and $g$, set $\alpha := f/(f+g)$. Replacing the underlying measure $dx$ by the new measure $d\nu := (f+g)^p\,dx/\|f+g\|_p^p$, we see that it suffices to prove the following inequality for $p \in [0,1] \cup [2,\infty)$, and the reverse for $p \in (-\infty,0) \cup (1,2)$, for a single function $0 \le \alpha \le 1$ on a probability space:
\[ 1 \le \left(\int \bigl(\alpha^p + (1-\alpha)^p\bigr)\,d\nu\right)\left(1 + \frac{\bigl(\int (\alpha(1-\alpha))^{p/2}\,d\nu\bigr)^{2/p}}{\bigl(\frac{1}{2}\int (\alpha^p + (1-\alpha)^p)\,d\nu\bigr)^{2/p}}\right)^{p-1}. \tag{2.1} \]
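The change of measure in Part A can be checked numerically: with $\alpha = f/(f+g)$ and $d\nu = (f+g)^p\,dx/\|f+g\|_p^p$, the quantities entering (2.1) are exactly the quantities of (1.4) normalized by powers of $\|f+g\|_p$. A discrete sketch (counting measure on $\mathbb{R}^8$; the names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 4.0
f = rng.random(8) + 0.1
g = rng.random(8) + 0.1

# Quantities of (1.4) for the counting measure.
norm_sum = np.sum((f + g)**p)              # ||f+g||_p^p
S = np.sum(f**p) + np.sum(g**p)            # ||f||_p^p + ||g||_p^p
FG = np.sum((f * g)**(p / 2))**(2 / p)     # ||fg||_{p/2}

# Reduced setting: probability measure nu and the single function alpha.
nu = (f + g)**p / norm_sum
alpha = f / (f + g)
assert abs(nu.sum() - 1.0) < 1e-12         # nu is a probability measure
S_red = np.sum((alpha**p + (1 - alpha)**p) * nu)
FG_red = np.sum((alpha * (1 - alpha))**(p / 2) * nu)**(2 / p)

# The reduction divides both sides of (1.4) by ||f+g||_p^p.
assert abs(S_red - S / norm_sum) < 1e-10
assert abs(FG_red - FG / norm_sum**(2 / p)) < 1e-10
```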

Part B. Reduction to a constant function
In this section we prove the following.
Proposition 3.1. If $p \in [0,1] \cup [2,\infty)$, then inequality (2.1) is true for all functions $\alpha$ (which is equivalent to (1.4) for all $f, g$) if and only if it is true for all constant functions, that is, for all numbers $\alpha \in [0,1]$. Similarly, if $p \in (-\infty,0) \cup (1,2)$, then the reverse of inequality (2.1) is true for all functions $\alpha$ (which is equivalent to the reverse of (1.4) for all $f, g$) if and only if it is true for all constant functions, that is, if for all numbers $\alpha \in [0,1]$ the reverse of (1.6) holds.
To prove this Proposition we need a definition and a lemma.

Proof. To prove this lemma we use the chain rule to compute the second derivative of $H$. As a first step we define a useful reparametrization as follows: $e^{2x} := a/(1-a)$. A quick computation shows that $h = (2\cosh x)^{-p}$ and $b = 2\cosh(px)(2\cosh x)^{-p}$. Thus, $h = b/(2\cosh(px))$. By symmetry, we can restrict our attention to the half-line $x \ge 0$.
We now compute the first two derivatives. Our goal is to show that (3.6) has the correct sign (depending on $p$) for all $x \ge 0$.
Part C. The inequality for constant functions

For $p > 0$, there is evidently equality in (4.1) for $\alpha \in \{0, \tfrac12, 1\}$, and for $p < 0$, there is equality for $\alpha = 1/2$. Thus for the proof of (4.1) it suffices to consider $\alpha \in (1/2, 1)$ for $p > 0$, and $\alpha \in (0, 1/2)$ for $p < 0$, and it is convenient to change variables. By taking logarithms we see that the claimed inequality (4.1) is equivalent to an inequality for a function $f$. In order to discuss the sign changes of $f$ we compute its derivative. Clearly, it suffices to consider the sign changes of the second factor and therefore to consider the sign changes of a function $g$. We shall show that for $c > 0$, $g$ has a unique sign change in $(0,1)$ and it changes sign from $-$ to $+$ if $c \in (0, 1/2)$ and from $+$ to $-$ if $c \in (1/2, \infty)$. Moreover, for $c < 0$ we shall show that $g$ is negative on $(0,1)$. Clearly, these properties of $g$ imply the claimed properties of $f$ and therefore will conclude the proof.
We next observe that the second term in (4.3) is positive.
Proof. First, consider the case $c \in [0,1)$. Then concavity of the map $t \mapsto t^c$ implies $1 - c + ct^{c-1} \ge 1$, and the claim follows from $t^c + 1 > 1$.
Next, for $c > 1$ the argument is similar, using convexity of the map $t \mapsto t^c$.
Finally, for $c < 0$, convexity of the map $t \mapsto t^{1-c}$ implies the claim. This concludes the proof of the lemma.
Because of Lemma 4.1, we can define the function $h$. We shall show that for $c > 0$, $h$ has a unique sign change in $(0,1)$ and it changes sign from $-$ to $+$ if $c \in (0,1/2)$ and from $+$ to $-$ if $c \in (1/2,\infty)$. Moreover, for $c < 0$ we shall show that $h$ is negative on $(0,1)$. Clearly, these properties of $h$ imply the claimed properties of $g$ and therefore will conclude the proof.
We will prove this by investigating the sign changes of $h'$. Namely, we shall show that for $c > 0$, $h'$ has a unique sign change in $(0,1)$ and it changes sign from $+$ to $-$ if $c \in (0,1/2)$ and from $-$ to $+$ if $c \in (1/2,\infty)$. Moreover, for $c < 0$ we shall show that $h'$ is positive on $(0,1)$.
Let us show that this implies the claimed properties of $h$. Indeed, an elementary limiting argument identifies the behavior of $h$ at the endpoints. Therefore, in order to complete the proof of Theorem 1.3, we need to discuss the sign changes of $h'$. We compute $h'$, and we shall show that for $c > 0$, the function $v$ appearing in its numerator has a unique sign change in $(0,1)$, from $+$ to $-$ if $c \in (0,1/2)$ and from $-$ to $+$ if $c \in (1/2,\infty)$. Moreover, for $c < 0$ we shall show that $v$ is positive on $(0,1)$.
Since, by Lemma 4.1, the denominator in the above expression for $h'$ is positive, these properties of $v$ clearly imply those of $h'$ and therefore complete the proof of the theorem.
In order to prove the claimed properties of $v$ we shall study the sign changes of $v'$. We shall show that for $c > 0$, $v'$ has a unique sign change in $(0,1)$ and it changes sign from $+$ to $-$ if $c \in (0,1/2)$ and from $-$ to $+$ if $c \in (1/2,\infty)$. Moreover, for $c < 0$ we shall show that $v'$ is positive.
Let us now argue that these properties of $v'$ indeed imply the claimed properties of $v$.
We compute the relevant derivatives. From these formulas we easily infer the behavior at the endpoints. Thus, we are left with studying the sign changes of $v'$. In order to do so, we need to distinguish several cases. For $c < 1$ we will argue via the sign changes of $v''$, while for $c > 1$ we will argue directly.
The fact that p is positive means that w is convex. Since w(0) = +∞ and w(1) < 0, we conclude that w has only one root.
Case $c \in (-\infty, 0)$. We want to show that $v'$ is positive.
In what follows we assume $c > 2$. Let us rewrite (4.5) as $v'(t) = 2c\,t^{2c-3}u(t)$ with $u$ defined accordingly. We need to show that $u$ changes sign only once. We have $u(0) = -\infty$ and $u'(0) < 0$. It suffices to show that $u'' < 0$ on $(0,1)$. Since $u''(0) < 0$ and $u''(1) < 0$, the latter claim will follow from showing that $u'''$ has a constant sign. The factor $b$ appearing in the expression for $u'''$ has the property that $b(0) = +\infty$ and $b(1) = (c+6)(c-1) > 0$. This concludes the proof of the inequality of Theorem 1.3.

We now consider $g_{r,p}(s)$ instead of $f_p(s)$. A motivation for this reparametrization is that, for fixed $p$, the function on the right hand side of (1.6) is equal to $1$ up to order $O((\alpha - 1/2)^4)$ at $\alpha = 1/2$. In the variable $s$, the leading term in the Taylor expansion will be of second order, and we prove the sharpness by an expansion at this point.
A Taylor expansion shows that $g_{r,p}(s) = p(1-r)s + o(s)$.
Since the exponent $q$ in (1.7) corresponds to $r\,(2/p)$, this, together with the remarks leading to (1.13), justifies the statements referring to $q$ in Theorem 1.3.
The proof of Theorem 1.3 is now complete.
What made this proof work is the fact that the inequality holds for $p = 2$ (as an identity, but that is unimportant). Then, using Minkowski's inequality as in (5.2), together with the numerical inequality (5.3), we arrive at the inequality for $p = 4$. This is a first instance of the general doubling proposition, to be proved next. The inequality (5.3) is a special case of the general inequality (5.4) proved below.
This strategy can be adapted to give a direct proof of the inequality for other integer values of $p$, e.g., $p = 3$. When $p$ is an integer and $f$ and $g$ are non-negative, one has the binomial expansion $(f+g)^p = f^p + g^p + \text{mixed terms}$. Under the assumption that $\|f\|_p^p + \|g\|_p^p = 2$, one is left with estimating the mixed terms, and one can use Hölder's inequality for this. When $p$ is not an integer, there is no useful expression for $(f+g)^p - f^p - g^p$.

Proposition 5.1. If the reverse of (1.4) is valid at $p$ for all $f, g > 0$, then the reverse of (1.4) is valid with $p$ replaced by $2p$ for all $f, g > 0$.
The proof of Proposition 5.1 relies on the following lemma.
Defining $a := 1 + \alpha^2$, $b := 1 + \alpha$ and $c := 2$, and defining $\varphi(t) := x^t$, the right hand side is the same as a combination of $\varphi(a)$, $\varphi(b)$ and $\varphi(c)$. For $\alpha \in [0,1)$ we have $a < b < c$ and therefore this quantity is positive when $\varphi$ is concave, and negative when $\varphi$ is convex. For $\alpha \in (1,\infty)$ we have $a > b > c$ and therefore this quantity is negative when $\varphi$ is concave, and positive when $\varphi$ is convex.

5.2. A generalization to Schatten norms. For $p \in [1,\infty)$, an operator $A$ on some Hilbert space belongs to the Schatten $p$-class $\mathcal{S}_p$ in case $(A^*A)^{p/2}$ is trace class, and the Schatten $p$-norm on $\mathcal{S}_p$ is defined by $\|A\|_p = (\mathrm{Tr}[(A^*A)^{p/2}])^{1/p}$. One possible non-commutative analog of (part of) Theorem 1.1 would assert that for positive $A, B \in \mathcal{S}_p$, $p > 2$,
\[ \mathrm{Tr}\bigl[(A+B)^p\bigr] \le \left(1 + \frac{\bigl(\mathrm{Tr}[A^{p/2}B^{p/2}]\bigr)^{2/p}}{\bigl(\tfrac12(\mathrm{Tr}[A^p] + \mathrm{Tr}[B^p])\bigr)^{2/p}}\right)^{p-1}\bigl(\mathrm{Tr}[A^p] + \mathrm{Tr}[B^p]\bigr). \tag{5.5} \]
Note that for $p = 2$, (5.5) holds as an identity.
In this setting, it is not clear how to implement analogs of Parts A and B of our proof for functions. However, the direct proofs sketched at the beginning of this section do allow us to prove the validity of (5.5) for all $p = 2^k$, $k \in \mathbb{N}$.
Theorem 5.3. If (5.5) is valid for some $p \ge 2$ and all positive $A, B \in \mathcal{S}_p$, then it is valid for $2p$ and all positive $A, B \in \mathcal{S}_{2p}$. In particular, since (5.5) holds as an identity for $p = 2$, it is valid for $p = 2^k$ for all $k \in \mathbb{N}$.
Proof. Let $A$ and $B$ be positive operators in $\mathcal{S}_{2p}$, and assume that $\|A\|_{2p}^{2p} + \|B\|_{2p}^{2p} = 2$, which, by homogeneity, entails no loss of generality. Define $X := \tfrac12(AB + BA)$ and $Y := A^2 + B^2$.
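For $p = 4$ (the case $k = 2$ of Theorem 5.3) the conjectured inequality (5.5) can be spot-checked on random positive matrices. A numerical illustration only (the setup is ours; for integer $p$ the traces are computed from matrix products, so no eigenvalue computation is needed):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4  # p = 2^2, covered by Theorem 5.3

def tr(M):
    """Trace of a square matrix as a Python float."""
    return float(np.trace(M))

for _ in range(200):
    X = rng.standard_normal((5, 5)); A = X @ X.T   # random positive matrices
    Y = rng.standard_normal((5, 5)); B = Y @ Y.T
    C = A + B
    lhs = tr(C @ C @ C @ C)                        # Tr[(A+B)^4]
    S = tr(A @ A @ A @ A) + tr(B @ B @ B @ B)      # Tr[A^4] + Tr[B^4]
    cross = tr(A @ A @ B @ B)                      # Tr[A^{p/2} B^{p/2}] >= 0
    Gamma_t = cross**(2 / p) / (S / 2)**(2 / p)    # analog of (1.3)
    assert lhs <= (1 + Gamma_t)**(p - 1) * S * (1 + 1e-10)   # (5.5)
```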