Multivariate Trace Inequalities

We prove several trace inequalities that extend the Golden–Thompson and the Araki–Lieb–Thirring inequality to arbitrarily many matrices. In particular, we strengthen Lieb’s triple matrix inequality. As an example application of our four matrix extension of the Golden–Thompson inequality, we prove remainder terms for the monotonicity of the quantum relative entropy and strong sub-additivity of the von Neumann entropy in terms of recoverability. We find the first explicit remainder terms that are tight in the commutative case. Our proofs rely on complex interpolation theory as well as asymptotic spectral pinching, providing a transparent approach to treat generic multivariate trace inequalities.


Introduction
Trace inequalities are mathematical relations between different multivariate trace functionals. Often these relations are straightforward equalities if the matrices involved commute, but they can be difficult to prove in the non-commuting case.
Arguably one of the most powerful trace inequalities is the celebrated Golden-Thompson (GT) inequality [23,64]. It states that for any two Hermitian matrices $H_1$ and $H_2$ we have
$$\mathrm{tr}\,\exp(H_1 + H_2) \le \mathrm{tr}\,\exp(H_1)\exp(H_2). \tag{1}$$
We note that in case $H_1$ and $H_2$ commute, (1) holds with equality. Inequality (1) has been generalized in various directions (see, e.g., [3,12,32,38,39,42,57,59]). For example, it has been shown that it remains valid when the trace is replaced by any unitarily invariant norm [41,58,65], and an extension to three non-commuting matrices was suggested in [44].
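As a quick numerical illustration of (1), the following minimal sketch compares both sides of the GT inequality for random Hermitian matrices and checks the equality case for commuting matrices. The helper names (`random_hermitian`, `exp_hermitian`, `gt_sides`) and the choice of test matrices are illustrative assumptions, not part of the original text.

```python
import numpy as np

def random_hermitian(d, rng):
    """Random d x d Hermitian matrix (illustrative test input)."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (a + a.conj().T) / 2

def exp_hermitian(h):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(w)) @ v.conj().T

def gt_sides(h1, h2):
    """Return (tr exp(H1 + H2), tr exp(H1) exp(H2)) as real numbers."""
    lhs = np.trace(exp_hermitian(h1 + h2)).real
    rhs = np.trace(exp_hermitian(h1) @ exp_hermitian(h2)).real
    return lhs, rhs

rng = np.random.default_rng(0)
h1, h2 = random_hermitian(4, rng), random_hermitian(4, rng)
lhs, rhs = gt_sides(h1, h2)
print("GT holds:", lhs <= rhs)

# Commuting case: H1 and 2*H1 commute, so both sides coincide.
lhs_c, rhs_c = gt_sides(h1, 2 * h1)
print("equality when commuting:", np.isclose(lhs_c, rhs_c))
```

Using the eigendecomposition for the exponential keeps the sketch free of dependencies beyond NumPy; it is only valid for Hermitian inputs.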
The GT inequality has found applications ranging from statistical physics [64] and random matrix theory [1,67] to quantum information theory [45,46]. Straightforward extensions of this inequality to three matrices are incorrect, as discussed in Appendix A. In this work, for any $n \in \mathbb{N}$, Hermitian matrices $\{H_k\}_{k=1}^n$ and any $p \ge 1$, we show that
$$\log \Big\| \exp\Big(\sum_{k=1}^n H_k\Big) \Big\|_p \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \log \Big\| \prod_{k=1}^n \exp\big((1+\mathrm{i}t)H_k\big) \Big\|_p, \tag{2}$$
where $\|\cdot\|_p$ denotes the Schatten $p$-norm and $\beta_0$ is a fixed probability distribution on $\mathbb{R}$, independent of the other parameters. An extensive discussion of this result is provided in Sect. 3.2. The precise statement is given in Corollary 3.3. Note that the expression $\exp((1+\mathrm{i}t)H_k)$ decomposes as $\exp(H_k)\exp(\mathrm{i}tH_k)$, where the latter is a unitary rotation.
Since the Schatten $p$-norm is unitarily invariant, it follows that the integrand in (2) is independent of $t$ for $n = 2$. Inequality (2) thus constitutes an $n$-matrix extension of the GT inequality and further simplifies to (1) for $n = 2$ and $p = 2$. For $n = 3$ and $p = 2$ our result strengthens Lieb's triple matrix inequality [44], as shown in Lemma 3.4.

The GT inequality can be seen as a limiting case of the more general Araki-Lieb-Thirring (ALT) inequality [4,43]. The latter states that, for any positive semi-definite matrices $A_1$ and $A_2$, $q > 0$, and $r \in (0,1]$,
$$\mathrm{tr}\,\big(A_1^{r/2} A_2^{r} A_1^{r/2}\big)^{q/r} \le \mathrm{tr}\,\big(A_1^{1/2} A_2 A_1^{1/2}\big)^{q}. \tag{3}$$
The inequality holds in the opposite direction for $r \ge 1$ by an appropriate substitution. The GT inequality for Schatten $p$-norms is implied by the Lie-Trotter product formula in the limit $r \to 0$. The ALT inequality has also been extended in various directions (see, e.g., [2,39,70]). In this work, we provide an $n$-matrix extension of the ALT inequality. For any $n \in \mathbb{N}$, positive semi-definite matrices $\{A_k\}_{k=1}^n$, $r \in (0,1]$, and any $p \ge 1$, we show that
$$\log \Big\| \Big| \prod_{k=1}^n A_k^{r} \Big|^{1/r} \Big\|_p \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_r(t)\, \log \Big\| \prod_{k=1}^n A_k^{1+\mathrm{i}t} \Big\|_p, \tag{4}$$
where $\beta_r$ are a family of probability distributions on $\mathbb{R}$, independent of the other parameters. In this article we use the convention that $0^z = 0$ for any $z \in \mathbb{C}$. We refer to Theorem 3.2 for a precise statement and discussion. Our extension of the GT inequality again follows in the limit $r \to 0$ by the Lie-Trotter product formula. We also provide an extension of the ALT and GT inequality for general square matrices (see Theorem 3.5).

We apply our results to quantum information theory and show how they can be used to prove strong sub-additivity. This yields remainder terms on the monotonicity of relative entropy in terms of recoverability, strengthening the Fawzi-Renner bound [21] and subsequent improvements [9,11,35,62,63,71]. We find that for any positive semi-definite operator $\sigma$, and any trace-preserving completely positive map $\mathcal{N}$, there exists a trace-preserving completely positive recovery map $\mathcal{R}_{\sigma,\mathcal{N}}$ that satisfies
$$D(\rho\|\sigma) - D\big(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)\big) \ge D_M\big(\rho \,\big\|\, \mathcal{R}_{\sigma,\mathcal{N}} \circ \mathcal{N}(\rho)\big) \tag{5}$$
for any quantum state $\rho$.
Here the bound is given in terms of the measured relative entropy, $D_M(\cdot\|\cdot)$, as in [63]. The recovery map is the explicit universal (i.e., independent of $\rho$) rotated Petz recovery map introduced in [35]. We thus provide the first explicit lower bound that is tight in the commutative case. A precise statement and further results are presented in Sect. 4. We believe that the proof techniques used in this article, based on asymptotic spectral pinching and complex interpolation theory, yield a transparent method to derive multivariate trace inequalities, which should be applicable beyond the extensions of the GT and ALT inequalities studied here.

Section 2 introduces the method of asymptotic spectral pinching and explains how it can be used to prove trace inequalities. Section 3 then explains how trace inequalities can be derived via complex interpolation theory. Readers interested in the proof of (2) and (4) may directly proceed to Sect. 3.

Trace Inequalities via Asymptotic Spectral Pinching
One contribution of this article is the presentation of a transparent method, based on asymptotic spectral pinching, that can be used to prove several trace inequalities. The results in this section hold for p-norms with p > 0 and are in this sense slightly more general than the results mentioned in the introduction. However, the asymptotic spectral pinching method does not yield an explicit form of the distributions β 0 and β r in (2) and (4), respectively.
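2.1. The asymptotic spectral pinching method. Let '$\ge$' denote the Löwner partial order on positive semi-definite matrices. Any positive semi-definite matrix $A$ has a spectral decomposition $A = \sum_{\lambda} \lambda P_\lambda$, where $\lambda \in \mathrm{spec}(A) \subset \mathbb{R}$ are the distinct eigenvalues and $P_\lambda$ are mutually orthogonal projectors. The spectral pinching map with respect to $A$ is
$$\mathcal{P}_A : X \mapsto \sum_{\lambda \in \mathrm{spec}(A)} P_\lambda X P_\lambda. \tag{6}$$
Such maps are trace-preserving, completely positive, unital, self-adjoint, and can be viewed as dephasing operations that remove off-diagonal blocks of a matrix. Moreover, they satisfy the following properties: (i) $\mathcal{P}_A[X]$ commutes with $A$ for any $X \ge 0$, (ii) $\mathrm{tr}\,\mathcal{P}_A[X]A = \mathrm{tr}\,XA$ for any $X \ge 0$, and (iii) we have the pinching inequality [28],
$$\mathcal{P}_A[X] = \frac{1}{|\mathrm{spec}(A)|} \sum_{y=1}^{|\mathrm{spec}(A)|} U_y X U_y^\dagger \ge \frac{1}{|\mathrm{spec}(A)|}\, X \quad \text{for any } X \ge 0, \tag{7}$$
where $\mathrm{spec}(A) = \{\lambda_1, \ldots, \lambda_{|\mathrm{spec}(A)|}\}$ and $U_y = \sum_{z=1}^{|\mathrm{spec}(A)|} \exp\big(\frac{\mathrm{i}2\pi yz}{|\mathrm{spec}(A)|}\big) P_{\lambda_z}$ are unitaries. The inequality step in (7) follows from the facts that $U_y X U_y^\dagger \ge 0$ and $U_{|\mathrm{spec}(A)|} = \mathrm{id}$.

The following observation is crucial. Let $A$ be a positive semi-definite $d \times d$ matrix. The cardinality $|\mathrm{spec}(A^{\otimes m})|$ grows polynomially in $m$ due to the fact that the number of distinct eigenvalues of $A^{\otimes m}$ is bounded by the number of different types of sequences of $d$ symbols of length $m$, a concept widely used in information theory. More precisely, [15, Lemma II.1] gives
$$|\mathrm{spec}(A^{\otimes m})| \le \binom{m+d-1}{d-1} \le \frac{(m+d-1)^{d-1}}{(d-1)!} = O(\mathrm{poly}(m)), \tag{8}$$
where $\mathrm{poly}(m)$ denotes an arbitrary polynomial in $m$. Another useful property of the pinching operation is that it exhibits the following integral representation.

Lemma 2.1. Let $A$ be a positive definite matrix. There exists a probability measure $\mu$ on $\mathbb{R}$ such that
$$\mathcal{P}_A[X] = \int_{-\infty}^{\infty} \mu(\mathrm{d}t)\, A^{\mathrm{i}t} X A^{-\mathrm{i}t} \quad \text{for all } X \ge 0. \tag{9}$$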
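The proof of Lemma 2.1 is given in Appendix B. More information about pinching maps can be found in [13, Section 4.4]. To illustrate the method, we use it to prove the GT inequality (1). For positive definite matrices $A$ and $B$ and any $m \in \mathbb{N}$ we have
\begin{align}
\log \mathrm{tr}\,\exp(\log A + \log B) &= \frac{1}{m} \log \mathrm{tr}\,\exp\big(\log A^{\otimes m} + \log B^{\otimes m}\big) \tag{10}\\
&\le \frac{1}{m} \log \mathrm{tr}\,\exp\big(\log \mathcal{P}_{B^{\otimes m}}[A^{\otimes m}] + \log B^{\otimes m}\big) + \frac{\log \mathrm{poly}(m)}{m} \tag{11}\\
&= \frac{1}{m} \log \mathrm{tr}\,\mathcal{P}_{B^{\otimes m}}[A^{\otimes m}]\, B^{\otimes m} + \frac{\log \mathrm{poly}(m)}{m} \tag{12}\\
&= \log \mathrm{tr}\,AB + \frac{\log \mathrm{poly}(m)}{m}. \tag{13}
\end{align}
The first equality (10) follows because the trace is multiplicative under tensor products. The sole inequality in (11) follows by the pinching inequality (7), i.e., Property (iii), together with the fact that the logarithm is operator monotone and $\mathrm{tr}\,\exp(\cdot)$ is monotone. Equality (12) uses Property (i), which ensures that $\mathcal{P}_{B^{\otimes m}}[A^{\otimes m}]$ commutes with $B^{\otimes m}$, so that GT holds as an equality for these matrices. Equality (13) employs Property (ii) and again the multiplicativity of the trace under tensor products. Considering the limit $m \to \infty$ directly implies the GT inequality (1). As we will see later, this proof already suggests an extension of the GT inequality to $n$ matrices by iterative pinching.

Let us emphasize the high-level intuition of the proof method presented above. We know that the GT inequality is trivial if the operators commute, and spectral pinching forces our operators to commute. At the same time, the pinching should not destroy too much of the operator on which it acts. This is indeed the case (guaranteed by the pinching inequality) if we consider an $m$-fold tensor product of our operators and the limit $m \to \infty$.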
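The pinching map and its Properties (i) to (iii) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the function name `pinch`, the rounding tolerance used to group degenerate eigenvalues, and the test matrices are all hypothetical choices made for this example.

```python
import numpy as np

def pinch(a, x, decimals=9):
    """Return (P_A[X], |spec(A)|), where P_A[X] = sum_lambda P_lambda X P_lambda."""
    w, v = np.linalg.eigh(a)
    labels = np.round(w, decimals)          # group numerically equal eigenvalues
    distinct = np.unique(labels)
    out = np.zeros(x.shape, dtype=complex)
    for lam in distinct:
        cols = v[:, labels == lam]          # orthonormal basis of the eigenspace
        proj = cols @ cols.conj().T         # spectral projector P_lambda
        out += proj @ x @ proj
    return out, len(distinct)

rng = np.random.default_rng(1)
d = 4
# A has the degenerate spectrum {1, 1, 2, 3} in a random orthonormal basis.
q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
a = q @ np.diag([1.0, 1.0, 2.0, 3.0]) @ q.conj().T
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
x = g @ g.conj().T                          # a positive semi-definite X

px, n_spec = pinch(a, x)
commutator = np.linalg.norm(px @ a - a @ px)             # property (i)
trace_gap = abs(np.trace(px @ a) - np.trace(x @ a))      # property (ii)
min_eig = np.linalg.eigvalsh(n_spec * px - x).min()      # property (iii)
print(n_spec, commutator, trace_gap, min_eig)
```

Grouping eigenvalues by rounding is a pragmatic way to recover the spectral projectors of a degenerate matrix in floating point; a different tolerance may be needed for badly conditioned inputs.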

A convexity result for Schatten quasi-norms.
Let us define the Schatten $p$-norm of any matrix $L$ as
$$\|L\|_p := \big(\mathrm{tr}\,|L|^p\big)^{1/p} \quad \text{for } p \ge 1, \tag{14}$$
where $|L| := \sqrt{L^\dagger L}$. We extend this definition to all $p > 0$, but note that $\|L\|_p$ is not a norm for $p \in (0,1)$ since it does not satisfy the triangle inequality. In the limit $p \to \infty$ we recover the operator norm and for $p = 1$ we obtain the trace norm.
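The definition can be made concrete with a short sketch that computes $\|L\|_p$ from the singular values of $L$ and illustrates the special cases mentioned above, including the failure of the triangle inequality for $p \in (0,1)$. The function name `schatten_norm` and the example matrices are illustrative assumptions.

```python
import numpy as np

def schatten_norm(l, p):
    """(tr |L|^p)^(1/p) from the singular values of L; p = inf gives the operator norm."""
    s = np.linalg.svd(l, compute_uv=False)
    if np.isinf(p):
        return float(s.max())
    return float(np.sum(s ** p) ** (1.0 / p))

l = np.diag([3.0, 1.0])
trace_norm = schatten_norm(l, 1)            # sum of singular values
large_p = schatten_norm(l, 200)             # approaches the operator norm
op_norm = schatten_norm(l, np.inf)

# Failure of the triangle inequality for p = 1/2.
l1, l2 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
lhs = schatten_norm(l1 + l2, 0.5)           # (1 + 1)^2
rhs = schatten_norm(l1, 0.5) + schatten_norm(l2, 0.5)
print(trace_norm, large_p, op_norm, lhs, rhs)
```

In this example the quasi-norm of the sum exceeds the sum of the quasi-norms, which is exactly what the triangle inequality would forbid.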
Schatten norms are functions of the singular values and thus unitarily invariant. They satisfy $\|L\|_p = \|L^\dagger\|_p$ and $\|L\|_{2p}^2 = \|LL^\dagger\|_p = \|L^\dagger L\|_p$. They are also multiplicative under tensor products. We note that the Schatten $p$-norm with $p \ge 1$ is the unique norm that is unitarily invariant and multiplicative under tensor products [5, Theorem 4.2]. Due to the triangle inequality, $p$-norms for $p \ge 1$ are convex. In particular, for any probability measure $\mu$ on a measurable space $(\mathcal{X}, \Sigma)$ and a collection $\{L_x\}_{x \in \mathcal{X}}$ of matrices, we have
$$\Big\| \int \mu(\mathrm{d}x)\, L_x \Big\|_p \le \int \mu(\mathrm{d}x)\, \|L_x\|_p \quad \text{for } p \ge 1. \tag{15}$$
Quasi-norms with $p \in (0,1)$ are no longer convex. However, we show that these quasi-norms still satisfy an asymptotic convexity property for tensor products of matrices in the following sense. We believe that this result may be of independent interest.
The proof is given in Appendix C. Combining this with (15) shows that for all p > 0 we have the following quasi-convexity property

Main results and proofs via pinching.
In this section we present two results obtained via the spectral pinching method, which are extensions of the ALT and the GT inequality, respectively, for arbitrarily many matrices. We want to emphasize that in addition to the fact that Theorem 2.3 is valid for Schatten quasi-norms, i.e., p ∈ (0, 1), the proof technique via pinching has the advantage of being transparent and intuitive.
Before we present the proof, let us give an equivalent statement, inequality (19), which follows by the simple substitution $p \leftarrow 2q$ and $A_k \leftarrow \sqrt{A_k}$ for $q > 0$. For $n = 2$ the right-hand side of (19) is independent of $t$ and we recover the ALT inequality in (3).

Proof of Theorem 2.3.
We prove the result for positive definite matrices and note that the generalization to positive semi-definite matrices follows by continuity under the convention that 0 z = 0 for any z ∈ C. For convenience of exposition we provide only the proof of Theorem 2.3 for three matrices (i.e, n = 3). The generalization to n matrices follows by appropriately iterating the technical steps presented below. Using the multiplicativity of the trace under tensor products, we write Then, employing the pinching inequality and the monotonicity of tr(·) q/r , we find where o(1) simply denotes an additive term that vanishes as m → ∞. The second inequality uses the fact that t → t r is operator concave for r ∈ (0, 1]. The final step uses property (i) of pinching maps. Repeating these steps shows that The integral representation of pinching maps (see Lemma 2.1) ensures that there exist probability measures μ and ν on R such that where the second inequality uses Lemma 2.2. The final step follows from the fact that Schatten (quasi) norms are unitarily invariant. Considering the limit m → ∞ implies (19) and thus completes the proof.
The multivariate Lie-Trotter product formula (see, e.g., [10, Problem IX.8.5]) states that
$$\lim_{r \to 0} \Big( \prod_{k=1}^n \exp(r L_k) \Big)^{1/r} = \exp\Big( \sum_{k=1}^n L_k \Big) \tag{28}$$
for square matrices $\{L_k\}_{k=1}^n$. This allows us to derive a multivariate extension of the GT inequality from the extended ALT inequality above by taking the limit $r \to 0$. In particular, combining the product formula with Theorem 2.3 implies an extension of the GT inequality to $n$ matrices.

Corollary 2.4. Let $p > 0$, $n \in \mathbb{N}$ and consider a collection $\{H_k\}_{k=1}^n$ of Hermitian matrices. Then inequality (29) holds. For $n = 2$ the right-hand side of (29) is independent of $t$ and we recover the GT inequality (1) for the choice $p = 2$.
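The convergence asserted by the Lie-Trotter product formula can be observed numerically. The sketch below takes $r = 1/m$ and compares $(\prod_k \exp(H_k/m))^m$ with $\exp(\sum_k H_k)$ for three Hermitian matrices; restricting to Hermitian inputs (so that the exponential can be computed by eigendecomposition) and the specific choice of Pauli matrices are assumptions made for this illustration only.

```python
import numpy as np

def exp_hermitian(h):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(w)) @ v.conj().T

def trotter_product(hs, m):
    """(exp(H_1/m) ... exp(H_n/m))^m, i.e. the product formula at r = 1/m."""
    step = np.eye(hs[0].shape[0], dtype=complex)
    for h in hs:
        step = step @ exp_hermitian(h / m)
    return np.linalg.matrix_power(step, m)

def relative_error(hs, m):
    """Relative Frobenius distance between the Trotter product and exp(sum H_k)."""
    target = exp_hermitian(sum(hs))
    return np.linalg.norm(trotter_product(hs, m) - target) / np.linalg.norm(target)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
hs = [sx, sz, 0.5 * sy]                     # pairwise non-commuting
errors = {m: relative_error(hs, m) for m in (10, 100, 10000)}
print(errors)
```

The error shrinks as $m$ grows, in line with the limit $r \to 0$ used to pass from ALT-type to GT-type inequalities.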

Trace Inequalities via Interpolation Theory
For $p$-norms we can prove a more explicit and also more general version of Theorem 2.3 based on an entirely different technique: complex interpolation theory.
3.1. The complex interpolation method. The main ingredient for most of our proofs in this section is a complex interpolation result for Schatten norms, commonly attributed to Stein [60], and based on Hirschman's improvement of the Hadamard three-lines theorem [33]. Epstein [20] showed how interpolation theory can be used in matrix analysis.
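Complex interpolation theory has recently garnered attention in quantum information theory for proving entropy inequalities. Beigi [6] and Dupuis [17] used variations of the Riesz-Thorin theorem based on Hadamard's three line theorem to show properties of the sandwiched Rényi divergence and the conditional Rényi entropy, respectively. Wilde [71] first used these techniques to derive remainder terms for the monotonicity of quantum relative entropy (see Sect. 4 for more details). Extensions and further applications of this approach are discussed by Dupuis and Wilde [18]. Hirschman's refinement was first studied in this context by Junge et al. [35], where the following theorem essentially appeared:

Theorem 3.1 (Stein-Hirschman). Let $S := \{z \in \mathbb{C} : 0 \le \mathrm{Re}(z) \le 1\}$ and let $G$ be a map from $S$ to bounded linear operators on a separable Hilbert space that is holomorphic in the interior of $S$ and continuous on the boundary. Let $p_0, p_1 \in [1, \infty]$, $\theta \in (0,1)$, define $p_\theta$ by
$$\frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}, \quad \text{and for } t \in \mathbb{R} \text{ define} \quad \beta_\theta(t) := \frac{\sin(\pi\theta)}{2\theta\,\big[\cosh(\pi t) + \cos(\pi\theta)\big]}. \tag{30}$$
Then, if $z \mapsto \|G(z)\|_{p_{\mathrm{Re}(z)}}$ is uniformly bounded on $S$, the following bound holds:
$$\log \|G(\theta)\|_{p_\theta} \le \int_{-\infty}^{\infty} \mathrm{d}t\, \Big( \beta_{1-\theta}(t) \log \|G(\mathrm{i}t)\|_{p_0}^{1-\theta} + \beta_\theta(t) \log \|G(1+\mathrm{i}t)\|_{p_1}^{\theta} \Big). \tag{31}$$

Fig. 1. The probability densities $\beta_\theta$ defined in (30).

For the sake of completeness a proof is given in Appendix D. We note that for any $\theta \in (0,1)$ the function $\beta_\theta$ is non-negative and $\int_{-\infty}^{\infty} \mathrm{d}t\, \beta_\theta(t) = 1$, so that $\beta_\theta$ can be interpreted as a probability density function on $\mathbb{R}$. These distributions are depicted in Fig. 1. Furthermore, the following limits hold:
$$\lim_{\theta \searrow 0} \beta_\theta(t) = \frac{\pi}{2}\big(\cosh(\pi t) + 1\big)^{-1} =: \beta_0(t) \quad \text{and} \quad \lim_{\theta \nearrow 1} \beta_\theta(t) = \delta(t). \tag{32}$$
Here $\beta_0$ is another probability density function on $\mathbb{R}$ and $\delta(t)$ denotes the Dirac $\delta$-distribution.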
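The claim that each $\beta_\theta$ is a probability density can be checked numerically. The sketch below assumes the closed form $\beta_\theta(t) = \sin(\pi\theta) / (2\theta[\cosh(\pi t) + \cos(\pi\theta)])$ with limit $\beta_0(t) = \frac{\pi}{2}(\cosh(\pi t)+1)^{-1}$; the function names, the integration window and the grid size are illustrative choices.

```python
import numpy as np

def beta(theta, t):
    """Density beta_theta(t) for theta in [0, 1); theta = 0 uses the limiting form."""
    t = np.asarray(t, dtype=float)
    if theta == 0:
        return np.pi / (2.0 * (np.cosh(np.pi * t) + 1.0))
    return np.sin(np.pi * theta) / (
        2.0 * theta * (np.cosh(np.pi * t) + np.cos(np.pi * theta)))

def trapezoid(y, t):
    """Composite trapezoidal rule on the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

t = np.linspace(-30.0, 30.0, 200001)
masses = {theta: trapezoid(beta(theta, t), t) for theta in (0, 0.25, 0.5, 0.9)}
print(masses)

# beta_theta approaches beta_0 pointwise as theta -> 0.
gap = float(np.max(np.abs(beta(1e-6, t) - beta(0, t))))
print(gap)
```

As $\theta$ approaches 1 the density concentrates around $t = 0$, which is consistent with the Dirac limit, so a finer grid would be needed for values of $\theta$ very close to 1.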

Main results and proofs via interpolation theory.
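In this section we prove our main results, which are extensions of the ALT and the GT inequality to arbitrarily many matrices.

Theorem 3.2. Let $p \ge 1$, $r \in (0,1]$, $\beta_r$ as defined in (30), $n \in \mathbb{N}$, and consider a collection $\{A_k\}_{k=1}^n$ of positive semi-definite matrices. Then
$$\log \Big\| \Big| \prod_{k=1}^n A_k^{r} \Big|^{1/r} \Big\|_p \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_r(t)\, \log \Big\| \prod_{k=1}^n A_k^{1+\mathrm{i}t} \Big\|_p. \tag{33}$$

Proof. The case $r = 1$ holds trivially with equality, so suppose $r \in (0,1)$. We prove the result for positive definite matrices and note that the generalization to positive semi-definite matrices follows by continuity. We define the function $G(z) := \prod_{k=1}^n A_k^z = \prod_{k=1}^n \exp(z \log A_k)$, which satisfies the regularity assumptions of Theorem 3.1. Furthermore we pick $\theta = r$, $p_0 = \infty$ and $p_1 = p$ such that $p_\theta = \frac{p}{r}$. We find
$$\log \|G(1+\mathrm{i}t)\|_{p_1}^{\theta} = r \log \Big\| \prod_{k=1}^n A_k^{1+\mathrm{i}t} \Big\|_p \quad \text{and} \quad \log \|G(\mathrm{i}t)\|_{p_0}^{1-\theta} = (1-r) \log \Big\| \prod_{k=1}^n A_k^{\mathrm{i}t} \Big\|_\infty = 0, \tag{34}$$
since the matrices $A_k^{\mathrm{i}t}$ are unitary. Moreover, we have
$$\log \|G(\theta)\|_{p_\theta} = \log \Big\| \prod_{k=1}^n A_k^{r} \Big\|_{p/r} = r \log \Big\| \Big| \prod_{k=1}^n A_k^{r} \Big|^{1/r} \Big\|_p. \tag{35}$$
Plugging this into Theorem 3.1 yields the desired inequality.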
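Let us now remark on several aspects of this inequality. First, we note that the substitution $p \leftarrow 2q$ and $A_k \leftarrow \sqrt{A_k}$ allows us to rewrite (33) in a more suggestive form. For $q \ge \frac{1}{2}$ and $r \in (0,1]$, we have
$$\log \mathrm{tr}\,\big(A_1^{r/2} A_2^{r/2} \cdots A_{n-1}^{r/2} A_n^{r} A_{n-1}^{r/2} \cdots A_2^{r/2} A_1^{r/2}\big)^{q/r} \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_r(t)\, \log \mathrm{tr}\,\big(A_1^{1/2} A_2^{(1+\mathrm{i}t)/2} \cdots A_{n-1}^{(1+\mathrm{i}t)/2} A_n A_{n-1}^{(1-\mathrm{i}t)/2} \cdots A_2^{(1-\mathrm{i}t)/2} A_1^{1/2}\big)^{q}. \tag{36}$$
For $n = 2$ the term on the right-hand side is independent of $t$ and we recover the ALT inequality in (3). However, we only recover the result for $q \ge \frac{1}{2}$ using complex interpolation theory. This can be fixed by proving a multivariate extension of the ALT inequality based on pinching (see Theorem 2.3).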
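Also note that we can always remove the logarithm in the above inequalities by using its concavity and Jensen's inequality. Moreover, for $q \in [\frac{1}{2}, 1]$ we may pull the integration inside the quasi-norm (by employing the fact that $X \mapsto \log \|X\|_p$ is concave for $p \in [0,1]$ on the positive definite cone), which yields the following relaxation:
$$\log \mathrm{tr}\,\big(A_1^{r/2} \cdots A_{n-1}^{r/2} A_n^{r} A_{n-1}^{r/2} \cdots A_1^{r/2}\big)^{q/r} \le \log \mathrm{tr}\,\Big( \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_r(t)\, A_1^{1/2} A_2^{(1+\mathrm{i}t)/2} \cdots A_{n-1}^{(1+\mathrm{i}t)/2} A_n A_{n-1}^{(1-\mathrm{i}t)/2} \cdots A_2^{(1-\mathrm{i}t)/2} A_1^{1/2} \Big)^{q}. \tag{37}$$
Next, recall the multivariate Lie-Trotter product formula in (28). Again, this allows us to derive an extension of the GT inequality to arbitrarily many matrices by taking the limit $r \to 0$ of (36).

Corollary 3.3. Let $p \ge 1$, $\beta_0$ as defined in (32), $n \in \mathbb{N}$, and consider a collection $\{H_k\}_{k=1}^n$ of Hermitian matrices. Then
$$\log \Big\| \exp\Big( \sum_{k=1}^n H_k \Big) \Big\|_p \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \log \Big\| \prod_{k=1}^n \exp\big((1+\mathrm{i}t) H_k\big) \Big\|_p. \tag{38}$$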
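The multivariate GT inequality of Corollary 3.3 (stated as (2) in the introduction) can be probed numerically for $n = 3$ and $p = 2$. The sketch below is a plausibility check, not a proof: it assumes $\beta_0(t) = \frac{\pi}{2}(\cosh(\pi t)+1)^{-1}$, truncates the $t$-integral to a finite window, and uses random Hermitian test matrices; all function names are hypothetical.

```python
import numpy as np

def beta0(t):
    """Assumed closed form of the limiting density beta_0."""
    return np.pi / (2.0 * (np.cosh(np.pi * t) + 1.0))

def schatten_norm(l, p):
    s = np.linalg.svd(l, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

def log_norm_of_product(eigs, p, t):
    """log || prod_k exp((1 + it) H_k) ||_p from precomputed eigendecompositions."""
    d = eigs[0][1].shape[0]
    prod = np.eye(d, dtype=complex)
    for w, v in eigs:
        prod = prod @ ((v * np.exp((1 + 1j * t) * w)) @ v.conj().T)
    return np.log(schatten_norm(prod, p))

def gt_both_sides(hs, p, t_max=12.0, n_grid=4801):
    """Left- and right-hand side of the n-matrix GT inequality (trapezoidal rule in t)."""
    eigs = [np.linalg.eigh(h) for h in hs]
    w_sum = np.linalg.eigvalsh(sum(hs))
    lhs = np.log(np.sum(np.exp(p * w_sum)) ** (1.0 / p))   # log ||exp(sum_k H_k)||_p
    t = np.linspace(-t_max, t_max, n_grid)
    y = beta0(t) * np.array([log_norm_of_product(eigs, p, s) for s in t])
    rhs = float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)
    return float(lhs), rhs

def random_hermitian(d, rng):
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (a + a.conj().T) / 2

rng = np.random.default_rng(2)
hs = [random_hermitian(3, rng) for _ in range(3)]
lhs, rhs = gt_both_sides(hs, p=2)
print(lhs, rhs, lhs <= rhs)

# For n = 2 the integrand does not depend on t (unitary invariance).
eigs2 = [np.linalg.eigh(h) for h in hs[:2]]
spread = abs(log_norm_of_product(eigs2, 2, 0.0) - log_norm_of_product(eigs2, 2, 1.7))
print(spread)
```

Because the tails of $\beta_0$ decay exponentially, truncating the integral at $|t| = 12$ changes the result by far less than the typical gap between the two sides.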
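Let us take a closer look at the case $n = 3$ and $p = 2$. Substituting $H_k \leftarrow \frac{1}{2} H_k$ and using the concavity of the logarithm and Jensen's inequality, we relax Corollary 3.3 to
$$\mathrm{tr}\,\exp(H_1 + H_2 + H_3) \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \mathrm{tr}\,\exp(H_1) \exp\Big(\frac{1+\mathrm{i}t}{2} H_2\Big) \exp(H_3) \exp\Big(\frac{1-\mathrm{i}t}{2} H_2\Big). \tag{39}$$
This is to be contrasted with Lieb's triple matrix inequality [44], which asserts that
$$\mathrm{tr}\,\exp(H_1 + H_2 + H_3) \le \int_0^{\infty} \mathrm{d}\lambda\, \mathrm{tr}\,\exp(H_1) \big(\exp(-H_2) + \lambda\big)^{-1} \exp(H_3) \big(\exp(-H_2) + \lambda\big)^{-1}. \tag{40}$$
As the next lemma shows, it turns out that these two expressions are in fact equivalent. We believe that this result might be of independent interest as it allows us to write the Fréchet derivative of the operator logarithm using an integration over rotations.

Lemma 3.4. Let $A$ be a positive definite matrix and $B$ a Hermitian matrix. Then
$$\int_0^{\infty} \mathrm{d}\lambda\, (A + \lambda)^{-1} B\, (A + \lambda)^{-1} = \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, A^{-\frac{1+\mathrm{i}t}{2}} B\, A^{-\frac{1-\mathrm{i}t}{2}}. \tag{41}$$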
The proof is given in Appendix E. The above lemma also gives a further means to understand the probability distribution $\beta_0$ which we obtained from Hirschman's interpolation theorem. Whereas Lieb's triple matrix inequality in (40) has not been extended to more than three matrices, the alternative representation obtained in (39) through Corollary 3.3 naturally extends to arbitrarily many matrices. Finally, it should be noted that Lieb's triple matrix inequality has been shown to be equivalent to many other interesting statements (such as Lieb's concavity theorem [44]), and hence it is valuable to have an entirely different proof of these results.

Corollary 3.3 is valid for Hermitian matrices, but we can extend its scope to general square matrices using the same techniques.
, and note that both (L k ) and (L k ) are Hermitian. Now define which satisfies the regularity assumption of Theorem 3.1. We note that G(it) is unitary, and thus log G(it) ∞ vanishes. We again pick θ = r ∈ (0, 1), p 0 = ∞ and p 1 = p such that p θ = p r , and find r log exp r n k=1 L k Dividing by r and taking the limit r → 0 then yields the desired result via the Lie-Trotter product formula.
We note that for the case of normal matrices $N$, the matrices $\mathrm{Re}(N)$ and $\mathrm{Im}(N)$ commute, which allows us to slightly simplify the above formula by employing the fact that $\exp(\mathrm{Re}(N)) = |\exp(N)|$. For two normal matrices the result simplifies further, generalizing an inequality by Li and Zhao [42]. Finally, we note that (46) can be viewed as an ALT inequality for general square matrices.

An Application: Entropy Inequalities
In this section we show that the multivariate extension of the GT inequality derived in Corollary 3.3 can be used to derive remainder terms in terms of recoverability for certain entropy inequalities. For positive semi-definite matrices $\rho$, $\sigma$ with $\mathrm{tr}\,\rho = 1$, Umegaki's quantum relative entropy [69] is defined as $D(\rho\|\sigma) := \mathrm{tr}\,\rho(\log\rho - \log\sigma)$ if $\rho \ll \sigma$ and as $+\infty$ if $\rho \not\ll \sigma$. Here, $\rho \ll \sigma$ denotes that the support of $\rho$ is contained in the support of $\sigma$. We recall the following variational formula for the relative entropy [53] (see also [7]):
$$D(\rho\|\sigma) = \sup_{\omega > 0} \big\{ \mathrm{tr}\,\rho \log \omega - \log \mathrm{tr}\,\exp(\log\sigma + \log\omega) \big\}. \tag{48}$$
The measured relative entropy is given as $D_M(\rho\|\sigma) := \sup_{(\mathcal{X}, M)} D(P_{\rho,M}\|P_{\sigma,M})$ [7,16,31,52], where the optimization is over positive operator valued measures (POVMs) $M$ on the power-set of a finite set $\mathcal{X}$, the probability mass functions are given by $P_{\rho,M}(x) = \mathrm{tr}\,\rho M(x)$, and $D(P\|Q)$ is the Kullback-Leibler divergence [40]. We recall the following variational formula [7,54]:
$$D_M(\rho\|\sigma) = \sup_{\omega > 0} \big\{ \mathrm{tr}\,\rho \log \omega - \log \mathrm{tr}\,\sigma\omega \big\}. \tag{49}$$
A fundamental entropy inequality [46,47] states that the quantum relative entropy is monotone under trace-preserving and completely positive maps $\mathcal{N}$, i.e.,
$$D(\rho\|\sigma) \ge D\big(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)\big). \tag{50}$$
This is closely related to the celebrated strong sub-additivity of quantum entropy [45,46], stating that
$$H(\rho_{AB}) + H(\rho_{BC}) \ge H(\rho_{ABC}) + H(\rho_B) \tag{51}$$
for any positive semi-definite matrix $\rho_{ABC}$ on a composite Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$ with $\mathrm{tr}\,\rho_{ABC} = 1$. Here $\rho_{AB}$, $\rho_{BC}$, and $\rho_B$ are marginals of $\rho_{ABC}$ obtained via the partial trace, and $H(\rho) = -\mathrm{tr}\,\rho\log\rho$ denotes the von Neumann entropy.

Motivated by recoverability questions in quantum information theory, (50) and (51) have been refined in a series of recent works [9,11,21,35,62,63,71], making use of complex interpolation theory as well as asymptotic spectral pinching. With the four matrix extension of the GT inequality given by Corollary 3.3, we find the following statement, which answers an open question stated in [35].
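To make the quantities above concrete, the following sketch computes Umegaki's relative entropy for full-rank states and checks its monotonicity under the partial trace, which is one instance of a trace-preserving completely positive map. The helper names, dimensions and random test states are illustrative assumptions.

```python
import numpy as np

def random_state(d, rng):
    """Random full-rank density matrix (illustrative test input)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def log_pd(a):
    """Matrix logarithm of a positive definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """D(rho || sigma) = tr rho (log rho - log sigma) for positive definite inputs."""
    return float(np.trace(rho @ (log_pd(rho) - log_pd(sigma))).real)

def trace_out_B(x, d_a, d_b):
    """Partial trace over the second tensor factor of an operator on H_A (x) H_B."""
    return np.einsum("ibjb->ij", x.reshape(d_a, d_b, d_a, d_b))

rng = np.random.default_rng(3)
d_a, d_b = 2, 3
rho_ab, sigma_ab = random_state(d_a * d_b, rng), random_state(d_a * d_b, rng)
rho_a, sigma_a = trace_out_B(rho_ab, d_a, d_b), trace_out_B(sigma_ab, d_a, d_b)

d_full = relative_entropy(rho_ab, sigma_ab)
d_marg = relative_entropy(rho_a, sigma_a)
print(d_full, d_marg, d_full - d_marg)
```

The difference `d_full - d_marg` is the quantity for which the recoverability results of this section provide non-trivial lower bounds.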

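Theorem 4.1 (Strengthened monotonicity for partial trace). Let $\rho_{AB}$ and $\sigma_{AB}$ be positive semi-definite matrices on $\mathcal{H}_A \otimes \mathcal{H}_B$ such that $\rho_{AB} \ll \sigma_{AB}$ and $\mathrm{tr}\,\rho_{AB} = 1$. Then
$$D(\rho_{AB}\|\sigma_{AB}) - D(\rho_A\|\sigma_A) \ge D_M\big(\rho_{AB} \,\big\|\, \mathcal{R}_{\sigma_{AB},\mathrm{tr}_B}(\rho_A)\big), \tag{52}$$
with the rotated Petz recovery map given by
$$\mathcal{R}_{\sigma_{AB},\mathrm{tr}_B}(\cdot) := \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \mathcal{R}^{[t]}_{\sigma_{AB},\mathrm{tr}_B}(\cdot) \quad \text{and} \quad \mathcal{R}^{[t]}_{\sigma_{AB},\mathrm{tr}_B}(\cdot) := \sigma_{AB}^{\frac{1+\mathrm{i}t}{2}} \Big( \sigma_A^{-\frac{1+\mathrm{i}t}{2}} (\cdot)\, \sigma_A^{-\frac{1-\mathrm{i}t}{2}} \otimes \mathrm{id}_B \Big) \sigma_{AB}^{\frac{1-\mathrm{i}t}{2}}. \tag{53}$$

Proof. Let us recall Corollary 3.3 applied for $n = 4$ and $p = 2$. Using the concavity of the logarithm and Jensen's inequality, it yields, for Hermitian matrices $\{H_k\}_{k=1}^4$,
$$\log \mathrm{tr}\,\exp(H_1 + H_2 + H_3 + H_4) \le \log \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \mathrm{tr}\,\exp(H_1) \exp\Big(\frac{1+\mathrm{i}t}{2} H_2\Big) \exp\Big(\frac{1+\mathrm{i}t}{2} H_3\Big) \exp(H_4) \exp\Big(\frac{1-\mathrm{i}t}{2} H_3\Big) \exp\Big(\frac{1-\mathrm{i}t}{2} H_2\Big). \tag{54}$$
Moreover, by definition of the relative entropy for positive definite operators $\rho_{AB}$ and $\sigma_{AB}$, we have
$$D(\rho_{AB}\|\sigma_{AB}) - D(\rho_A\|\sigma_A) = D\big(\rho_{AB} \,\big\|\, \exp(\log\sigma_{AB} + \log\rho_A \otimes \mathrm{id}_B - \log\sigma_A \otimes \mathrm{id}_B)\big). \tag{55}$$
For positive semi-definite operators $\rho_{AB}$ and $\sigma_{AB}$, the Hermitian operators $\log\sigma_{AB}$, $\log\rho_A$ and $\log\sigma_A$ are well-defined under the convention $\log 0 = 0$. Under this convention, the above equality (55) also holds for positive semi-definite operators as long as $\rho_{AB} \ll \sigma_{AB}$, which is required by the theorem. By the variational formula for the relative entropy (48) we thus find
\begin{align}
D(\rho_{AB}\|\sigma_{AB}) - D(\rho_A\|\sigma_A) &= \sup_{\omega_{AB} > 0} \Big\{ \mathrm{tr}\,\rho_{AB} \log\omega_{AB} - \log \mathrm{tr}\,\exp\big(\log\sigma_{AB} + \log\rho_A \otimes \mathrm{id}_B - \log\sigma_A \otimes \mathrm{id}_B + \log\omega_{AB}\big) \Big\} \tag{56}\\
&\ge \sup_{\omega_{AB} > 0} \Big\{ \mathrm{tr}\,\rho_{AB} \log\omega_{AB} - \log \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \mathrm{tr}\,\omega_{AB}\, \sigma_{AB}^{\frac{1+\mathrm{i}t}{2}} \Big( \sigma_A^{-\frac{1+\mathrm{i}t}{2}} \rho_A\, \sigma_A^{-\frac{1-\mathrm{i}t}{2}} \otimes \mathrm{id}_B \Big) \sigma_{AB}^{\frac{1-\mathrm{i}t}{2}} \Big\} \tag{57}\\
&= D_M\big(\rho_{AB} \,\big\|\, \mathcal{R}_{\sigma_{AB},\mathrm{tr}_B}(\rho_A)\big), \tag{58}
\end{align}
where the single inequality follows by the four matrix extension of the GT inequality in (54). The final step uses the variational formula (49) for the measured relative entropy.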
We note that the four matrix extension of the GT inequality is the only inequality used in the proof of Theorem 4.1. More properties of the recovery map R σ AB ,tr B given by (53) are discussed in [35].
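Theorem 4.1 implies two other interesting statements. If we substitute $\rho_{AB} \leftarrow \rho_{ABC}$ and $\sigma_{AB} \leftarrow \mathrm{id}_A \otimes \rho_{BC}$, and take the partial trace over $C$, we immediately find a remainder term for the conditional quantum mutual information $I(A:C|B)_\rho := H(\rho_{AB}) + H(\rho_{BC}) - H(\rho_{ABC}) - H(\rho_B)$, namely
$$I(A:C|B)_\rho \ge D_M\big(\rho_{ABC} \,\big\|\, (\mathcal{I}_A \otimes \mathcal{R}_{\rho_{BC},\mathrm{tr}_C})(\rho_{AB})\big), \tag{59}$$
where $\mathcal{I}_A$ is the identity map and $\mathcal{R}_{\rho_{BC},\mathrm{tr}_C}$ is defined in (53). Moreover, using the Stinespring dilation theorem [61] and the fact that the relative entropy is invariant under isometries, Theorem 4.1 generalizes to the following result.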

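Corollary 4.2 (Strengthened monotonicity).
Let $\rho$, $\sigma$ be positive semi-definite matrices such that $\rho \ll \sigma$, $\mathrm{tr}\,\rho = 1$, and $\mathcal{N}$ be a trace-preserving completely positive map acting on these matrices. Then
$$D(\rho\|\sigma) - D\big(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)\big) \ge D_M\big(\rho \,\big\|\, \mathcal{R}_{\sigma,\mathcal{N}} \circ \mathcal{N}(\rho)\big), \tag{60}$$
with the rotated Petz recovery map given by
$$\mathcal{R}_{\sigma,\mathcal{N}}(\cdot) := \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \mathcal{R}^{[t]}_{\sigma,\mathcal{N}}(\cdot) \quad \text{and} \quad \mathcal{R}^{[t]}_{\sigma,\mathcal{N}}(\cdot) := \sigma^{\frac{1+\mathrm{i}t}{2}}\, \mathcal{N}^\dagger\Big( \mathcal{N}(\sigma)^{-\frac{1+\mathrm{i}t}{2}} (\cdot)\, \mathcal{N}(\sigma)^{-\frac{1-\mathrm{i}t}{2}} \Big) \sigma^{\frac{1-\mathrm{i}t}{2}}. \tag{61}$$

Proof. Let us introduce the Stinespring dilation of $\mathcal{N}$, denoted $U$, and the states $\rho_{AB} = U\rho U^\dagger$, $\sigma_{AB} = U\sigma U^\dagger$ such that $\mathcal{N}(\rho) = \rho_A$ and $\mathcal{N}(\sigma) = \sigma_A$. Then, using the fact that the relative entropy is invariant under isometries, we have
$$D(\rho\|\sigma) - D\big(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)\big) = D(\rho_{AB}\|\sigma_{AB}) - D(\rho_A\|\sigma_A) \ge D_M\big(\rho_{AB} \,\big\|\, \mathcal{R}_{\sigma_{AB},\mathrm{tr}_B}(\rho_A)\big) = D_M\big(\rho \,\big\|\, \mathcal{R}_{\sigma,\mathcal{N}} \circ \mathcal{N}(\rho)\big),$$
where the inequality is due to Theorem 4.1 and the last equality uses again invariance under isometries and the fact that $\mathcal{R}_{\sigma_{AB},\mathrm{tr}_B}(\cdot) = U\, \mathcal{R}_{\sigma,\mathcal{N}}(\cdot)\, U^\dagger$.

We note that Corollary 4.2 is no longer valid if we replace the measured relative entropy in (60) with a relative entropy. This leads us to believe that (60) cannot be further improved.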
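The right-hand side of (60) can be relaxed using Uhlmann's fidelity, $F(\rho, \sigma) := \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$, since $D_M(\rho\|\sigma) \ge -\log F(\rho, \sigma)$. Therefore, Corollary 4.2 implies
$$D(\rho\|\sigma) - D\big(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)\big) \ge -\log F\big(\rho, \mathcal{R}_{\sigma,\mathcal{N}} \circ \mathcal{N}(\rho)\big).$$
Moreover, Corollary 4.2 can be transformed into universal remainder terms (in terms of recoverability with the measured relative entropy) for other entropy inequalities, such as concavity of the conditional entropy and joint convexity of the relative entropy [8].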
We refer to [35, Section 5] for a more detailed discussion of these bounds.
We also want to refer the reader to Appendix F where we give a different derivation that yields lower and upper bounds on the difference of relative entropies in Theorem 4.1. This derivation follows the structure of Lieb and Ruskai's original proof of strong sub-additivity [45,46], i.e., it uses the Peierls-Bogoliubov inequality followed by an extension of the GT inequality (see also [14]). However, whereas Lieb and Ruskai use the three matrix extension of the GT inequality (Lieb's triple matrix inequality) we use the four matrix extension of the GT inequality, leading us to a stronger statement.

Discussion
We discussed two techniques to prove trace inequalities. One is based on asymptotic pinching and the other one uses complex interpolation theory. Both methods lead to transparent and direct proofs of generalized multivariate extensions of the GT and also the more general ALT inequalities. We believe that these methods can be used to prove trace inequalities beyond the extensions of the GT and ALT inequalities studied in this article. For example in [2,32], complementary GT and ALT inequalities have been shown in terms of matrix means. It is left for future research to investigate if these inequalities can be obtained (and possibly be extended to the multivariate case) via pinching or interpolation theory.
Hansen gave an alternative multivariate extension of the GT inequality [25] that can be considered an interpolation between the original GT inequality (1) and the operator Jensen inequality [26,27]. It would be interesting to unify his and our result.
Moreover, Lieb showed that his triple matrix inequality (40) is equivalent to many other interesting statements such as several concavity results [44]. As Corollary 3.3 generalizes the triple matrix inequality, it is natural to ask if it can be used to prove more general concavity results.
Ahlswede and Winter noticed that the GT inequality can be used to prove tail bounds for sums of random matrices via the Laplace transform method [1]. As the (original) GT inequality is only valid for two matrices it has to be applied sequentially. Later, Tropp realized that sharper tail bounds can be obtained by using Lieb's concavity theorem instead of the GT inequality [67]. An interesting question is whether the multivariate extension of the GT inequality derived in this article (see Corollary 3.3) can be used to prove tail bounds for random matrices.
Acknowledgements. Elliott H. Lieb's talk at the "Beyond I.I.D. in Information Theory" workshop in Banff inspired us to derive an alternative proof for his triple matrix inequality. We thank Christian Majenz for allowing us to include his counterexample for a three matrix GT inequality without rotations. We would also like to

A. About the Probability Distribution in Corollary 3.3
As observed in [64]

For the following discussion, let us define
$$\kappa := \mathrm{tr}\,\exp(\log A_1 + \log A_2 + \log A_3) \quad \text{and} \quad \gamma(t) := \mathrm{tr}\,A_1 A_2^{\frac{1+\mathrm{i}t}{2}} A_3 A_2^{\frac{1-\mathrm{i}t}{2}}$$
for three positive definite matrices $A_1$, $A_2$, and $A_3$. As discussed in (39), Corollary 3.3 implies that
$$\kappa \le \int_{-\infty}^{\infty} \mathrm{d}t\, \beta_0(t)\, \gamma(t). \tag{69}$$
It is a natural question to investigate how much freedom we have in choosing a probability distribution (different from $\beta_0$) such that (69) remains valid, where the distribution should be independent of the matrices $A_1$, $A_2$, and $A_3$. The following two examples indicate that it might be difficult to find a distribution different from $\beta_0$ that satisfies (69), since it can be neither too narrow (around $t = 0$) nor too flat. Let us consider the positive semi-definite matrices [48] given in (70). As a second example we consider the positive semi-definite matrices given in (71). Figure 2 compares $\kappa$ with $\gamma(t)$. We note that the matrices (70) also show that $\kappa > \gamma(0)$ is possible, i.e., a three matrix extension of the GT inequality without any phases does not hold in general [48].

Fig. 2. Comparison of $\kappa$ with $\gamma(t)$ for the matrices (70) and (71). If we want $\kappa \le \int \mu(\mathrm{d}t)\, \gamma(t)$ to hold for some probability measure $\mu$ on $\mathbb{R}$ that does not depend on $A_1$, $A_2$ and $A_3$, these two examples show that $\mu$ can be neither too narrow (around $t = 0$) nor too flat.
Furthermore for a fixed τ > 0 we define the triangular function We next pickμ(ξ ) = r 2 (ξ ) which clearly satisfies the requirements (i) and (ii) mentioned above. Its inverse Fourier transform can be computed as It is immediate to verify that μ as given in (77) is a probability distribution on R (i.e., μ(t) ≥ 0 for all t ∈ R and R dt μ(t) = 1). This is the distribution that satisfies the assertion of Lemma 2.1 and thus completes the proof.

C. Proof of Lemma 2.2
Let H denote the Hilbert space of dimension d where the matrices A x act on. For any x, consider the spectral decomposition A x = k λ k |k k| in Dirac bra-ket notation. Introducing an isometric space H , we define the vector |v x ∈ H ⊗ H by |v x = k √ λ k |k ⊗ |k -i.e., the purification of A x . Now note that the projectors (|v x v x |) ⊗m lie in the symmetric subspace of (H ⊗ H ) ⊗m whose dimension grows as poly(m). 7 Moreover, we have Then by Carathéodory's theorem (see, e.g., [19,Theorem 18]) there exists a discrete probability measure P on ∈ I ⊂ X with |I| = poly(m) such that

D. Proof of Theorem 3.1
We follow the argument given in [35,Appendix A], and take care of the explicit conditions on the Schatten norms of G(z). We recall Hirschman's strengthening [33] (see also [24,Lemma 1.3.8]) of Hadamard's three line theorem.
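Lemma D.1 (Hirschman). Let $S := \{z \in \mathbb{C} : 0 \le \mathrm{Re}(z) \le 1\}$ and let $g(z)$ be uniformly bounded on $S$, holomorphic on the interior of $S$ and continuous on the boundary. Then for $\theta \in (0,1)$, we have
$$\log |g(\theta)| \le \int_{-\infty}^{\infty} \mathrm{d}t\, \Big( \beta_{1-\theta}(t) \log |g(\mathrm{i}t)|^{1-\theta} + \beta_\theta(t) \log |g(1+\mathrm{i}t)|^{\theta} \Big). \tag{87}$$
Moreover, the assumption that the function is uniformly bounded can be relaxed to
$$\sup_{z \in S} \exp\big(-a|\mathrm{Im}(z)|\big) \log |g(z)| \le A \quad \text{for some constants } A < \infty \text{ and } a < \pi. \tag{88}$$
We are now prepared to prove Theorem 3.1.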
Proof of Theorem 3.1. For x ∈ [0, 1], define q x as the Hölder conjugate of p x such that p −1 x + q −1 x = 1. Hence, using the definition of p x in (30), we have Now for our fixed θ ∈ (0, 1) the operator G(θ ) is bounded by assumption and thus allows a polar decomposition, G(θ ) = U , where is positive semi-definite and U is a partial isometry [55, Theorem VI.10] We find that z → X (z) is anti-holomorphic on S and Consequently, the Hilbert-Schmidt inner product g(z) := tr X (z) † G(z) is holomorphic and bounded on S because the Hölder inequality (see, e.g., [34,Theorem 7.8]) yields Hence, our assumptions on G(z) imply that g(z) satisfies the assumptions of Lemma D.1. It remains to verify the following relations using the Hölder inequality in (92): Substituting this into Lemma D.1 yields the desired result.

E. Proof of Lemma 3.4
The first expression for the derivative given in (41) is well known and can be derived using integral representations of the operator logarithm (see, e.g., [13]). Now let A = k μ k |k k| for an orthonormal eigenbasis {|k } k of A. The claim is thus equivalent to Thus, it suffices to show that Since β 0 (t) is symmetric in t, we have and the claim follows because we also have (98)
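The scalar identity behind Lemma 3.4 can be checked numerically. In an eigenbasis of $A$ with eigenvalues $\mu_k$, the $\lambda$-integral contributes the factor $\int_0^\infty \mathrm{d}\lambda\, (\mu_k+\lambda)^{-1}(\mu_l+\lambda)^{-1} = (\log\mu_k - \log\mu_l)/(\mu_k - \mu_l)$ to the $(k,l)$ matrix element, while the rotation integral contributes $\int \mathrm{d}t\, \beta_0(t)\, \mu_k^{-(1+\mathrm{i}t)/2} \mu_l^{-(1-\mathrm{i}t)/2}$. The sketch below compares the two, assuming $\beta_0(t) = \frac{\pi}{2}(\cosh(\pi t)+1)^{-1}$; function names, integration window and grid size are illustrative choices.

```python
import numpy as np

def beta0(t):
    """Assumed closed form of the limiting density beta_0."""
    return np.pi / (2.0 * (np.cosh(np.pi * t) + 1.0))

def divided_difference_log(mu_k, mu_l):
    """Closed form of the lambda-integral of 1 / ((mu_k + lambda)(mu_l + lambda))."""
    if np.isclose(mu_k, mu_l):
        return 1.0 / mu_k
    return (np.log(mu_k) - np.log(mu_l)) / (mu_k - mu_l)

def rotation_integral(mu_k, mu_l, t_max=15.0, n_grid=60001):
    """Integral of beta_0(t) mu_k^{-(1+it)/2} mu_l^{-(1-it)/2} (trapezoidal rule)."""
    t = np.linspace(-t_max, t_max, n_grid)
    phase = np.exp(-0.5j * t * (np.log(mu_k) - np.log(mu_l)))
    y = beta0(t) * phase / np.sqrt(mu_k * mu_l)
    return float((np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0).real)

pairs = [(2.0, 0.5), (3.0, 3.0), (0.1, 7.0)]
diffs = [abs(divided_difference_log(a, b) - rotation_integral(a, b)) for a, b in pairs]
print(diffs)
```

Since $\beta_0$ is symmetric in $t$, the imaginary part of the rotation integral vanishes, which is why only the real part is returned.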

F. Additional Recoverability Bounds
The purpose of this section is to present two additional entropy inequalities that also follow from the multivariate extension of the GT inequality given by Corollary 3.3. These two bounds have been proven before [18,35].

Proposition 1.
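Let $\rho_{AB}$ and $\sigma_{AB}$ be positive semi-definite matrices on $\mathcal{H}_A \otimes \mathcal{H}_B$ such that $\rho_{AB} \ll \sigma_{AB}$ and $\mathrm{tr}\,\rho_{AB} = 1$, and let $\mathcal{R}^{[t]}_{\sigma_{AB},\mathrm{tr}_B}$ be as defined in (53). Then the bounds (99) and (100) hold, where $D_2(\rho\|\sigma) := \log \mathrm{tr}\,\rho^2\sigma^{-1}$ is Petz' Rényi relative entropy of order 2.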

(101)
We first prove (99). Let ρ AB and σ AB be positive semi-definite matrices on H A ⊗ H B such that ρ AB σ AB and trρ AB = 1. For G 1 = log ρ AB and G 2 = 1 2 (log ρ A ⊗ id B + log σ AB − log σ A ⊗ id B − log ρ AB ), this gives where the penultimate step uses the extension of the GT inequality from Corollary 3.3 for n = 4 and p = 1. It remains to prove (100). Applying the Peierls-Bogoliubov inequality (101) for G 1 = log ρ AB and G 2 = log σ A ⊗ id B + log ρ AB − log ρ A ⊗ id B − log σ AB , we find where the second inequality follows by Corollary 3.3 applied for n = 4 and p = 2.
Following the same line of arguments as in the proof of Corollary 4.2, (99) can be extended to the case of arbitrary trace-preserving completely positive maps. This then reproduces a result in [35, Section 3].