Approximate tensorization of the relative entropy for noncommuting conditional expectations

In this paper, we derive a new generalisation of the strong subadditivity of the entropy to the setting of general conditional expectations onto arbitrary finite-dimensional von Neumann algebras. The latter inequality, which we call approximate tensorization of the relative entropy, can be expressed as a lower bound for the sum of relative entropies between a given density and its respective projections onto two intersecting von Neumann algebras in terms of the relative entropy between the same density and its projection onto an algebra in the intersection, up to multiplicative and additive constants. In particular, our inequality reduces to the so-called quasi-factorization of the entropy for commuting algebras, which is a key step in modern proofs of the logarithmic Sobolev inequality for classical lattice spin systems. We also provide estimates on the constants in terms of conditions of clustering of correlations in the setting of quantum lattice spin systems. Along the way, we show the equivalence between conditional expectations arising from Petz recovery maps and those of general Davies semigroups.


Introduction
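In the last few decades, entropy has proven to be a fundamental object in various fields of mathematics and theoretical physics. Its quantum analogue characterizes the optimal rate at which two different states of a system can be discriminated when an arbitrary number of copies of the system is available. Given two states $\rho, \sigma$ of a finite-dimensional von Neumann algebra $\mathcal{N} \subset \mathcal{B}(\mathcal{H})$, it is given by the quantum relative entropy
$$D(\rho\|\sigma) := \mathrm{Tr}\big[\rho\,(\ln\rho - \ln\sigma)\big].$$
Probably the most fundamental property of entropy is the following strong subadditivity inequality (SSA) [34]: given a tripartite system $\mathcal{H}_{ABC} := \mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$ and a state $\rho \equiv \rho_{ABC}$ on $\mathcal{H}_{ABC}$,
$$S(\rho_{ABC}) + S(\rho_B) \le S(\rho_{AB}) + S(\rho_{BC}), \tag{SSA}$$
where $S(\rho) := -\mathrm{Tr}[\rho \ln \rho]$ is the von Neumann entropy and where, for any subsystem $D$ of $ABC$, $\rho_D := \mathrm{Tr}_{D^c}[\rho_{ABC}]$ denotes the marginal state on $D$. Restated in terms of the quantum relative entropy, (SSA) takes the following form:
$$D\Big(\rho_{ABC}\,\Big\|\,\rho_B \otimes \frac{\mathbb{1}_{AC}}{d_{AC}}\Big) \le D\Big(\rho_{ABC}\,\Big\|\,\rho_{AB} \otimes \frac{\mathbb{1}_{C}}{d_{C}}\Big) + D\Big(\rho_{ABC}\,\Big\|\,\rho_{BC} \otimes \frac{\mathbb{1}_{A}}{d_{A}}\Big), \tag{1.1}$$
where $d_D$ denotes the dimension of $\mathcal{H}_D$.
More generally, let $\mathcal{M} \subset \mathcal{N}_1, \mathcal{N}_2 \subset \mathcal{N}$ be von Neumann algebras with conditional expectations $E_{\mathcal{M}}, E_1, E_2$ onto $\mathcal{M}, \mathcal{N}_1, \mathcal{N}_2$ that satisfy the commuting square assumption $E_1 \circ E_2 = E_2 \circ E_1 = E_{\mathcal{M}}$. Then, for all states $\rho$,
$$D(\rho\|E_{\mathcal{M}*}(\rho)) \le D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho)). \tag{1.2}$$
Inequality (1.1) is recovered by taking the coarse-graining maps to be the partial traces onto the subalgebras $\mathcal{N}_1 \equiv \mathcal{B}(\mathcal{H}_{AB})$, $\mathcal{N}_2 \equiv \mathcal{B}(\mathcal{H}_{BC})$ and $\mathcal{M} \equiv \mathcal{B}(\mathcal{H}_B)$, respectively. Thus, inequality (1.2) can be seen as an operator algebraic generalization of the (SSA) inequality.
However, the commuting square assumption, and subsequently inequality (1.2), are not satisfied in most of the cases of interest that appear in information-theoretical settings or quantum many-body systems. Indeed, in the context of interacting lattice spin systems, conditional expectations arising e.g. from the large time limit of a dissipative evolution on subregions of the lattice generally do not satisfy the commuting square assumption. In this case, approximations of (SSA) were found in the classical case (i.e. when all algebras are commutative) and when $\mathcal{M} \equiv \mathbb{C}\mathbb{1}_{\mathcal{H}}$ [14]. For classical lattice spin systems, these inequalities, termed approximate tensorization of the relative entropy (also known in the literature as quasi-factorization of the relative entropy [14,17]), take the following form:
$$D(\rho\|\sigma) \le \frac{1}{1 - 2c_1}\,\big(D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho))\big), \tag{1.3}$$
where $\sigma := E_{\mathcal{M}*}(\rho)$ for all states $\rho$, and $c_1 := \|E_1 \circ E_2 - E_{\mathcal{M}} : L_1(\sigma) \to L_\infty(\mathcal{N})\|$ is a constant that measures the violation of the commuting square condition for the quadruple $(\mathcal{M}, \mathcal{N}_1, \mathcal{N}_2, \mathcal{N})$.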
For reasons that will become clear in the remaining parts of the article, we refer to the constant c 1 as the clustering of correlations constant in this introduction.
An inequality of the form of (1.3) is the main ingredient in modern proofs of modified logarithmic Sobolev inequalities (MLSI) which govern the rapid thermalization of classical lattice spin systems evolving according to a Glauber dynamics and in the high temperature regime [14], [17]. Furthermore, the aforementioned quantum versions of (1.3) for different conditional relative entropies have been used in the past years to obtain some examples of positive MLSI for quantum spin systems [10,3,11]. Our main motivation in the current paper is a continuation of those results by further generalizing (1.3) to a more abstract setting, with the aim of providing new interesting examples of positive MLSI. In fact, after the first version of this manuscript, the main results contained here have allowed some of the authors to solve a long-standing open problem regarding a system-size independent MLSI for certain evolutions that converge to Gibbs states of nearest-neighbour commuting Hamiltonians at high enough temperature in [12].
Main results: In this paper, building on the previous approximate tensorization results of the form (1.3), we go one step further and introduce a weak approximate tensorization for the relative entropy, denoted throughout the text by AT(c, d), which amounts to the existence of constants c ≥ 1 and d ≥ 0 such that (see Theorem 2)
$$D(\rho\|E_{\mathcal{M}*}(\rho)) \le c\,\big(D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho))\big) + d. \tag{AT(c, d)}$$
Whenever d = 0, we refer to the previous bound as a strong approximate tensorization for the relative entropy. Nevertheless, as opposed to the classical setting, conditional expectations arising from dissipative evolutions on quantum lattice spin systems generically do not satisfy the commuting square condition even at infinite temperature. This difference is exclusively due to the non-commutativity of the underlying algebras. The additive constant d is meant to account for this correction to the classical case. Note that, at infinite temperature, the conditional expectations are self-adjoint with respect to the Hilbert-Schmidt inner product, a property referred to as symmetric in [21,4]. Under this condition, a different extension of (SSA) was proposed in [20]. In our framework, the inequality derived in [20] leads to an AT(1, d), where d can be regarded as measuring the violation of the commuting square condition at infinite temperature. On the other hand, our strong approximate tensorization constant c can be regarded as a finite temperature relaxation of the case c = 1 in [20].
The first AT(c, d) inequality that we obtain is presented in Proposition 2, where we use the change of measure argument from [27] in order to directly connect the previous AT(1, d) inequality from [20] for symmetric conditional expectations to an AT(c, d′) inequality for the general case, where c is a spectral quantity depending solely on the invariant states of the smallest algebra M and d′ is proportional to d. In particular, whenever d = 0, this result allows us to transfer strong approximate tensorization for symmetric conditional expectations to strong approximate tensorization for general conditional expectations. However, in this inequality the multiplicative constant cannot be related to the clustering of correlations constant c_1 in the case of interacting systems, and can in general be exponentially larger. Our main result, stated in Theorem 2, fills precisely this gap. Moreover, the inequality reduces to the classical inequality of [14] for commutative algebras.
In Section 5, we apply the previous results on weak approximate tensorization to the context of lattice spin systems with commuting Hamiltonians. In particular, we show in Theorem 3 that classical evolutions over quantum systems (termed embedded Glauber dynamics) satisfy AT(c, 0) with the same constant as in the classical case. As an independent but important result, we also prove in Theorem 1 that the conditional expectations associated to the heat-bath dynamics and to the Davies dynamics coincide. This, in particular, allows us to transfer results proven in recent years for one of the dynamics to the other, and vice versa.
Applications: As mentioned previously, the main application of these inequalities is in the context of mixing times of continuous-time local Markovian evolutions over quantum lattice spin systems, although we expect these inequalities and their proof techniques to find other applications in quantum information theory. In [14], Cesi used his inequality in order to show the exponential convergence in relative entropy of classical Glauber dynamics on lattice systems towards equilibrium, independently of the lattice size, in the form of a positive MLSI constant (defined in Section 4.1). In a subsequent paper [12] that appeared after the first version of the current manuscript, we made use of the approximate tensorization inequality to show similar convergence results for dissipative quantum Gibbs samplers.
Moreover, in this paper we illustrate the potential of these techniques in the aforementioned context of mixing times by estimating the MLSI constant whenever the generator of the dynamics is constructed from Pinching onto a pair of different, orthonormal bases. Additionally, we use our main results in approximate tensorization to obtain new entropic uncertainty relations in Section 4.2.
Outline of the paper: In Section 2, we review basic mathematical concepts used in this paper, and more particularly the notion of a non-commutative conditional expectation. We derive theoretical expressions for the strong (c) and weak (d) constants for general von Neumann algebras in Section 3, where our main result is stated as Theorem 2. We subsequently apply them to obtain strengthenings of uncertainty relations and examples of positivity of MLSI in Section 4. Moreover, in Section 5, we derive explicit bounds on the constants c and d for conditional expectations associated to Gibbs samplers on lattice spin systems in terms of the interactions of the corresponding Hamiltonian. In Section 6, we discuss the results presented in our paper and how they have been applied to different contexts after the appearance of the first version of our manuscript. Finally, in Appendix A, we review the conditional expectations arising from Petz recovery maps and from Davies generators and show that both conditional expectations coincide. We conclude by collecting the proofs of some technical results in Appendix B.

Notations and definitions
In this section, we fix the basic notation used in the paper, and introduce the necessary definitions.

Basic notations
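Let $(\mathcal{H}, \langle\cdot|\cdot\rangle)$ be a finite-dimensional Hilbert space of dimension $d_{\mathcal{H}}$. We denote by $\mathcal{B}(\mathcal{H})$ the Banach space of bounded operators on $\mathcal{H}$, by $\mathcal{B}_{sa}(\mathcal{H})$ the subspace of self-adjoint operators on $\mathcal{H}$, and by $\mathcal{B}_+(\mathcal{H})$ the cone of positive semidefinite operators on $\mathcal{H}$. The adjoint of an operator $Y$ is written as $Y^*$. We will also use the same notations $\mathcal{N}_{sa}$ and $\mathcal{N}_+$ in the case of a von Neumann subalgebra $\mathcal{N}$ of $\mathcal{B}(\mathcal{H})$. The identity operator on $\mathcal{N}$ is denoted by $\mathbb{1}_{\mathcal{N}}$, dropping the index $\mathcal{N}$ when it is unnecessary. In the case of $\mathcal{B}(\mathbb{C}^\ell)$, $\ell \in \mathbb{N}$, we will also use the notation $\mathbb{1}$ for $\mathbb{1}_{\mathbb{C}^\ell}$. Similarly, given a map $\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$, we denote its dual with respect to the Hilbert-Schmidt inner product by $\Phi_*$. We also denote by $\mathrm{id}_{\mathcal{B}(\mathcal{H})}$, or simply $\mathrm{id}$, resp. $\mathrm{id}_\ell$, the identity superoperator on $\mathcal{B}(\mathcal{H})$, resp. $\mathcal{B}(\mathbb{C}^\ell)$.
We denote by $\mathcal{D}(\mathcal{H})$ the set of positive semidefinite, trace-one operators on $\mathcal{H}$, also called density operators, by $\mathcal{D}_+(\mathcal{H})$ the subset of full-rank density operators, and by $\mathcal{D}_\le(\mathcal{H})$ the set of subnormalized density operators. In the following, we will often identify a density matrix $\rho \in \mathcal{D}(\mathcal{H})$ with the state it defines, that is, the positive linear functional $\mathcal{B}(\mathcal{H}) \ni X \mapsto \mathrm{Tr}(\rho X)$. More generally, given a von Neumann subalgebra $\mathcal{N} \subseteq \mathcal{B}(\mathcal{H})$ with block decomposition $\mathcal{N} := \bigoplus_l M_{n_l} \otimes \mathbb{1}_{m_l}$, we denote by $\mathcal{D}(\mathcal{N})$ the set of states of the form
$$\rho = \bigoplus_l p_l\, \rho_l \otimes \tau_l \tag{2.1}$$
for some probability distribution $(p_l)_l$, some $n_l \times n_l$ states $\rho_l$ and some $m_l \times m_l$ full-rank states $\tau_l$. The sets $\mathcal{D}_+(\mathcal{N})$ and $\mathcal{D}_\le(\mathcal{N})$ are defined similarly.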

Entropic quantities and L p spaces
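Throughout this paper, we will use various distance measures between states and between observables. Given a state $\rho \in \mathcal{D}(\mathcal{N})$, its von Neumann entropy is defined by
$$S(\rho) := -\mathrm{Tr}[\rho \ln \rho].$$
If $\rho \equiv \rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$ is the state of a bipartite quantum system, its conditional entropy is defined by
$$S(A|B)_\rho := S(\rho_{AB}) - S(\rho_B),$$
where $\rho_B := \mathrm{Tr}_A(\rho)$ corresponds to the marginal of $\rho$ over the subsystem $\mathcal{H}_B$. More generally, given two positive semidefinite operators $\rho, \sigma \in \mathcal{B}_+(\mathcal{H})$, the relative entropy between $\rho$ and $\sigma$ is defined as follows [44]:
$$D(\rho\|\sigma) := \begin{cases} \mathrm{Tr}[\rho\,(\ln\rho - \ln\sigma)] & \text{if } \mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma), \\ +\infty & \text{otherwise.} \end{cases}$$
Moreover, given (possibly subnormalized) positive semidefinite operators $\rho \ge 0$ and $\sigma > 0$, their max-relative entropy is defined as [18]:
$$D_{\max}(\rho\|\sigma) := \inf\{\lambda \in \mathbb{R} \,:\, \rho \le e^{\lambda}\sigma\} = \ln \|\sigma^{-1/2}\rho\,\sigma^{-1/2}\|_\infty.$$
From the max-relative entropy, we can define the max-information of a (possibly subnormalized) bipartite state $\rho_{AB} \in \mathcal{D}_\le(\mathcal{H}_A \otimes \mathcal{H}_B)$ as follows [8]:
$$I_{\max}(A : B)_\rho := \inf_{\tau_B \in \mathcal{D}(\mathcal{H}_B)} D_{\max}(\rho_{AB}\|\rho_A \otimes \tau_B).$$
Given a subalgebra $\mathcal{N}$ of $\mathcal{B}(\mathcal{H})$ and $\sigma \in \mathcal{D}_+(\mathcal{N})$, we define the modular maps $\Gamma_\sigma : \mathcal{N} \to \mathcal{B}(\mathcal{H})$ and $\Delta_\sigma : \mathcal{N} \to \mathcal{N}$ as follows:
$$\Gamma_\sigma(X) := \sigma^{1/2} X \sigma^{1/2}, \qquad \Delta_\sigma(X) := \sigma X \sigma^{-1}.$$
Then for any $p \ge 1$ and $X \in \mathcal{N}$, its non-commutative weighted $L_p(\sigma)$-norm is defined as [30]:
$$\|X\|_{L_p(\sigma)} := \Big(\mathrm{Tr}\big[\,|\Gamma_\sigma^{1/p}(X)|^p\big]\Big)^{1/p},$$
and $\|X\|_{L_\infty(\sigma)} = \|X\|_\infty$, the operator norm of $X$, which we will also often more simply denote by $\|X\|$. We call the space $\mathcal{B}(\mathcal{H})$ endowed with the norm $\|\cdot\|_{L_p(\sigma)}$ the quantum $L_p(\sigma)$ space. In the case $p = 2$, we have a Hilbert space, with corresponding $\sigma$-KMS scalar product
$$\langle X, Y\rangle_\sigma := \mathrm{Tr}\big[\sigma^{1/2} X^* \sigma^{1/2} Y\big]. \tag{2.2}$$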
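To make the entropic quantities above concrete, the following NumPy sketch evaluates them on small matrices. It is only an illustration: the function names, the cutoff `1e-12` for the convention 0 ln 0 = 0, and the restriction to a full-rank second argument are assumptions made here, not part of the paper.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho ln rho], with the convention 0 ln 0 = 0."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

def matrix_log(sigma):
    """Logarithm of a positive definite Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(sigma)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho (ln rho - ln sigma)], assuming sigma is full rank."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    tr_rho_ln_rho = np.sum(w * np.log(w))
    tr_rho_ln_sigma = np.real(np.trace(rho @ matrix_log(sigma)))
    return float(tr_rho_ln_rho - tr_rho_ln_sigma)

def max_relative_entropy(rho, sigma):
    """D_max(rho||sigma) = ln || sigma^{-1/2} rho sigma^{-1/2} ||_inf, sigma full rank."""
    w, v = np.linalg.eigh(sigma)
    inv_sqrt = (v * w ** -0.5) @ v.conj().T
    return float(np.log(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt).max()))

def conditional_entropy(rho_ab, d_a, d_b):
    """S(A|B) = S(rho_AB) - S(rho_B), where rho_B = Tr_A(rho_AB)."""
    rho_b = np.trace(rho_ab.reshape(d_a, d_b, d_a, d_b), axis1=0, axis2=2)
    return von_neumann_entropy(rho_ab) - von_neumann_entropy(rho_b)

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalized Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real
```

For normalized states one expects 0 ≤ D(ρ‖σ) ≤ D_max(ρ‖σ), which gives a quick consistency check of the three functions against each other.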
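Weighted $L_p$ norms enjoy the following useful properties:
- Hölder's inequality: for any $p, \hat{p} \ge 1$ such that $p^{-1} + \hat{p}^{-1} = 1$, and any $X, Y \in \mathcal{N}$:
$$|\langle X, Y\rangle_\sigma| \le \|X\|_{L_p(\sigma)}\, \|Y\|_{L_{\hat{p}}(\sigma)}.$$
Here, $\hat{p}$ is the Hölder conjugate of $p$.
- Duality of norms: for any $p \ge 1$ of Hölder conjugate $\hat{p}$, and any $X \in \mathcal{N}$:
$$\|X\|_{L_p(\sigma)} = \sup_{\|Y\|_{L_{\hat{p}}(\sigma)} \le 1} \langle Y, X\rangle_\sigma.$$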
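The weighted norms and Hölder's inequality can be checked numerically. The sketch below assumes the standard form ‖X‖_{L_p(σ)} = (Tr|σ^{1/2p} X σ^{1/2p}|^p)^{1/p} and the σ-KMS inner product Tr[σ^{1/2} X* σ^{1/2} Y]; the helper names are hypothetical.

```python
import numpy as np

def psd_power(sigma, p):
    """sigma^p for a positive definite Hermitian matrix sigma."""
    w, v = np.linalg.eigh(sigma)
    return (v * w ** p) @ v.conj().T

def weighted_norm(x, sigma, p):
    """||X||_{L_p(sigma)}; p = inf gives the operator norm."""
    if np.isinf(p):
        return float(np.linalg.norm(x, 2))
    s = psd_power(sigma, 1.0 / (2.0 * p))
    singular_values = np.linalg.svd(s @ x @ s, compute_uv=False)
    return float(np.sum(singular_values ** p) ** (1.0 / p))

def kms_inner(x, y, sigma):
    """sigma-KMS inner product <X, Y>_sigma = Tr[sigma^{1/2} X^* sigma^{1/2} Y]."""
    r = psd_power(sigma, 0.5)
    return np.trace(r @ x.conj().T @ r @ y)

def random_state(dim, rng):
    """Random full-rank density matrix used as the weight sigma."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    s = g @ g.conj().T
    return s / np.trace(s).real

def holder_gap(x, y, sigma, p, p_hat):
    """||X||_p ||Y||_p_hat - |<X, Y>_sigma|, expected to be non-negative."""
    return (weighted_norm(x, sigma, p) * weighted_norm(y, sigma, p_hat)
            - abs(kms_inner(x, y, sigma)))
```

With these conventions the identity has unit norm for every p, since Tr[σ] = 1.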

Conditional expectations
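Here, we introduce the main object studied in this paper:
Definition 1. Let $\mathcal{M} \subset \mathcal{N}$ be a von Neumann subalgebra of $\mathcal{N}$ and let $\sigma$ be a state. A linear map $E : \mathcal{N} \to \mathcal{M}$ is called a conditional expectation with respect to $\sigma$ of $\mathcal{N}$ onto $\mathcal{M}$ if: (a) $\|E[X]\| \le \|X\|$ for all $X \in \mathcal{N}$; (b) $E[X] = X$ for all $X \in \mathcal{M}$; (c) $\mathrm{Tr}[\sigma E[X]] = \mathrm{Tr}[\sigma X]$ for all $X \in \mathcal{N}$.
A conditional expectation satisfies the following useful properties (see [42] for proofs and more details):
Proposition 1. Conditional expectations generically satisfy the following properties:
(i) The map $E$ is completely positive and unital.
(ii) For any $X \in \mathcal{N}$ and any $Y, Z \in \mathcal{M}$, $E[Y X Z] = Y E[X] Z$.
(iii) $E$ is self-adjoint with respect to the scalar product $\langle\cdot,\cdot\rangle_\sigma$. In other words:
$$\Gamma_\sigma \circ E = E_* \circ \Gamma_\sigma,$$
where $E_*$ denotes the adjoint of $E$ with respect to the Hilbert-Schmidt inner product.
(iv) $E$ commutes with the modular automorphism group of $\sigma$: for any $s \in \mathbb{R}$,
$$\Delta_\sigma^{is} \circ E = E \circ \Delta_\sigma^{is}.$$
(v) Uniqueness: given a von Neumann subalgebra $\mathcal{M} \subset \mathcal{N}$ and a faithful state $\sigma$, the existence of a conditional expectation $E$ is equivalent to the invariance of $\mathcal{M}$ under the modular automorphism group $(\Delta_\sigma^{is})_{s \in \mathbb{R}}$. In this case, $E$ is uniquely determined by $\sigma$.
Given a conditional expectation $E$ onto $\mathcal{M}$, we write $\mathcal{D}(\mathcal{M})$ for the set of states that are invariant under $E_*$. In other words, the states $\tau_l$ in the decomposition (2.1) are now fixed by $E$. Similarly, the set of subnormalized states on the algebra $\mathcal{N}$ is defined as $\mathcal{D}_\le(\mathcal{N})$. We also introduce the concept of a conditional covariance: given a von Neumann subalgebra $\mathcal{M} \subset \mathcal{N}$, a conditional expectation $E_{\mathcal{M}}$ from $\mathcal{N}$ onto $\mathcal{M}$ and a quantum state $\sigma \in \mathcal{D}_+(\mathcal{M})$, where $\mathcal{D}(\mathcal{M})$ is defined with respect to $E_{\mathcal{M}}$, we define the conditional covariance functional as follows: for any two $X, Y \in \mathcal{N}$,
$$\mathrm{Cov}_{\mathcal{M},\sigma}(X, Y) := \langle X - E_{\mathcal{M}}[X],\, Y - E_{\mathcal{M}}[Y]\rangle_\sigma. \tag{2.5}$$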
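As an illustration of properties (i) to (iii), consider the simplest non-tracial example, chosen here for concreteness and not taken from the paper: on a bipartite space, the map E[X] = Tr_B[X (1 ⊗ τ)] ⊗ 1 projects onto the subalgebra B(H_A) ⊗ 1 and preserves every product state ρ_A ⊗ τ. The sketch below implements E and its Hilbert-Schmidt dual; all names are hypothetical.

```python
import numpy as np

def cond_exp(x, tau, d_a, d_b):
    """E[X] = Tr_B[X (1_A (x) tau)] (x) 1_B, a conditional expectation onto B(H_A) (x) 1_B."""
    x4 = x.reshape(d_a, d_b, d_a, d_b)              # indices (a, b, a', b')
    reduced = np.einsum('abcd,db->ac', x4, tau)     # Tr_B[X (1 (x) tau)]
    return np.kron(reduced, np.eye(d_b))

def cond_exp_dual(rho, tau, d_a, d_b):
    """Hilbert-Schmidt dual E_*(rho) = Tr_B[rho] (x) tau."""
    rho_a = np.trace(rho.reshape(d_a, d_b, d_a, d_b), axis1=1, axis2=3)
    return np.kron(rho_a, tau)

def psd_sqrt(sigma):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(sigma)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def random_state(dim, rng):
    """Random full-rank density matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    s = g @ g.conj().T
    return s / np.trace(s).real
```

With σ = ρ_A ⊗ τ, property (iii) reads σ^{1/2} E[X] σ^{1/2} = E_*(σ^{1/2} X σ^{1/2}), which can be verified directly on random inputs.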

Two examples of classes of conditional expectations
In this subsection, we provide more details about the conditional expectations that we will consider in the case of Gibbs states on lattice spin systems in Section 5. Some properties and new results of independent interest regarding these conditional expectations are deferred to Appendix A for the sake of clarity.

Conditional expectations generated by a Petz recovery map
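Let $\sigma$ be a faithful density matrix on a finite-dimensional algebra $\mathcal{N}$ and let $\mathcal{M} \subset \mathcal{N}$ be a subalgebra. We denote by $E_\tau$ the conditional expectation onto $\mathcal{M}$ with respect to the completely mixed state (i.e. $E_\tau$ is self-adjoint with respect to the Hilbert-Schmidt inner product). We also adopt the following notations: we write $\sigma_{\mathcal{M}} = E_\tau(\sigma)$ and
$$\mathcal{A}_\sigma(X) := \sigma_{\mathcal{M}}^{-1/2}\, E_\tau\big[\sigma^{1/2} X \sigma^{1/2}\big]\, \sigma_{\mathcal{M}}^{-1/2}.$$
Remark that $\mathcal{A}_\sigma$ is also the unique map such that for all $X \in \mathcal{N}$ and all $Y \in \mathcal{M}$:
$$\langle Y, \mathcal{A}_\sigma(X)\rangle_{\sigma_{\mathcal{M}}} = \langle Y, X\rangle_\sigma.$$
The adjoint of $\mathcal{A}_\sigma$ is the Petz recovery map of $E_\tau$ with respect to $\sigma$, denoted by $\mathcal{R}_\sigma$:
$$\mathcal{R}_\sigma(\rho) := \sigma^{1/2} \sigma_{\mathcal{M}}^{-1/2}\, \rho_{\mathcal{M}}\, \sigma_{\mathcal{M}}^{-1/2} \sigma^{1/2},$$
where $\rho_{\mathcal{M}} := E_\tau(\rho)$. It is proved in [13] that $\mathcal{A}_\sigma$ is a conditional expectation if and only if $\sigma X \sigma^{-1} \in \mathcal{M}$ for all $X \in \mathcal{M}$. In the general case, we denote by
$$E_\sigma := \lim_{n \to \infty} \mathcal{A}_\sigma^n$$
the projection on its fixed-point algebra for the $\sigma$-KMS inner product, which is a conditional expectation as we assumed $\sigma$ to be faithful. That is, $E_\sigma$ is the orthogonal projection for $\langle\cdot,\cdot\rangle_\sigma$ on the algebra
$$\mathcal{F}(\mathcal{A}_\sigma) := \{X \in \mathcal{N} \,:\, \mathcal{A}_\sigma(X) = X\}.$$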
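The relation between the map and its adjoint, the Petz recovery map, can be illustrated numerically. The sketch below assumes the standard Petz form A_σ(X) = σ_M^{-1/2} E_τ[σ^{1/2} X σ^{1/2}] σ_M^{-1/2} and takes, purely as an example, M = B(H_A) ⊗ 1 with E_τ the normalized partial trace; function names are hypothetical.

```python
import numpy as np

def psd_power(a, p):
    """a^p for a positive definite Hermitian matrix a."""
    w, v = np.linalg.eigh(a)
    return (v * w ** p) @ v.conj().T

def e_tau(x, d_a, d_b):
    """Tracial conditional expectation onto B(H_A) (x) 1_B: X -> Tr_B[X] (x) 1_B / d_B."""
    reduced = np.trace(x.reshape(d_a, d_b, d_a, d_b), axis1=1, axis2=3)
    return np.kron(reduced, np.eye(d_b)) / d_b

def a_sigma(x, sigma, d_a, d_b):
    """Heisenberg-picture map A_sigma (assumed standard Petz form)."""
    s_half = psd_power(sigma, 0.5)
    sm_inv_half = psd_power(e_tau(sigma, d_a, d_b), -0.5)
    return sm_inv_half @ e_tau(s_half @ x @ s_half, d_a, d_b) @ sm_inv_half

def petz_recovery(rho, sigma, d_a, d_b):
    """Petz recovery map R_sigma, the Hilbert-Schmidt adjoint of A_sigma."""
    s_half = psd_power(sigma, 0.5)
    sm_inv_half = psd_power(e_tau(sigma, d_a, d_b), -0.5)
    return s_half @ sm_inv_half @ e_tau(rho, d_a, d_b) @ sm_inv_half @ s_half

def random_state(dim, rng):
    """Random full-rank density matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    s = g @ g.conj().T
    return s / np.trace(s).real
```

In this setting one expects A_σ to be unital, R_σ to be trace preserving with R_σ(σ) = σ, and Tr[ρ A_σ(X)] = Tr[R_σ(ρ) X] for all ρ and X.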

Conditional expectations coming from Davies semigroups
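The basic model for the evolution of an open system in the Markovian regime is given by a quantum Markov semigroup (or QMS) $(\mathcal{P}_t)_{t \ge 0}$ acting on $\mathcal{B}(\mathcal{H})$. Such a semigroup is characterised by its generator, called the Lindbladian $\mathcal{L}$, which is defined on $\mathcal{B}(\mathcal{H})$ by
$$\mathcal{L}(X) = \lim_{t \to 0} \frac{1}{t}\big(\mathcal{P}_t(X) - X\big)$$
for all $X \in \mathcal{B}(\mathcal{H})$. Recall that by the GKLS Theorem [35,25], $\mathcal{L}$ takes the following form: for all $X \in \mathcal{B}(\mathcal{H})$,
$$\mathcal{L}(X) = i[H, X] + \frac{1}{2}\sum_{k}\Big(2\, L_k^* X L_k - L_k^* L_k X - X L_k^* L_k\Big),$$
where $H \in \mathcal{B}_{sa}(\mathcal{H})$, the sum runs over a finite number of Lindblad operators $L_k \in \mathcal{B}(\mathcal{H})$, and $[\cdot,\cdot]$ denotes the commutator defined as $[X, Y] := XY - YX$ for all $X, Y \in \mathcal{B}(\mathcal{H})$. The QMS is said to be faithful if it admits a full-rank invariant state $\sigma$. When the state $\sigma$ is the unique invariant state, the semigroup is called primitive. Further assuming the self-adjointness of the generator $\mathcal{L}$ with respect to the inner product (2.2) (or detailed balance condition), there exists a conditional expectation $E \equiv E_{\mathcal{F}}$ onto the fixed-point subalgebra $\mathcal{F}(\mathcal{L}) := \{X \in \mathcal{B}(\mathcal{H}) : \mathcal{L}(X) = 0\}$ such that
$$\mathcal{P}_t(X) \underset{t \to \infty}{\longrightarrow} E_{\mathcal{F}}[X]$$
for all $X \in \mathcal{B}(\mathcal{H})$.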
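The GKLS form can be turned into a few lines of code. The sketch below uses the Heisenberg-picture convention L(X) = i[H, X] + Σ_k (L_k* X L_k − ½{L_k* L_k, X}), which is the standard one and is assumed here to match the paper's; the function name is hypothetical.

```python
import numpy as np

def lindbladian(x, h, lindblad_ops):
    """Heisenberg-picture GKLS generator applied to the observable x.

    L(X) = i[H, X] + sum_k ( L_k^* X L_k - (L_k^* L_k X + X L_k^* L_k) / 2 )
    """
    out = 1j * (h @ x - x @ h)
    for l_k in lindblad_ops:
        l_dag = l_k.conj().T
        out = out + l_dag @ x @ l_k - 0.5 * (l_dag @ l_k @ x + x @ l_dag @ l_k)
    return out

# Example: with H = 0 and Lindblad operators given by the projections onto an
# orthonormal basis, the generator reduces to (pinching onto the diagonal) - id.
basis_projections = [np.outer(e, e) for e in np.eye(3)]
```

For the dephasing example, the fixed points of the generator are exactly the diagonal matrices, so the limiting conditional expectation is the pinching onto the diagonal.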
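We now focus on a particular class of QMS called Davies QMS. Such semigroups are obtained in the weak coupling limit of a system and a heat bath. Let $H$ be a self-adjoint operator on $\mathcal{H}$, representing the Hamiltonian of the system. The corresponding Gibbs state at inverse temperature $\beta$ is defined as
$$\sigma_\beta := \frac{e^{-\beta H}}{\mathrm{Tr}[e^{-\beta H}]}. \tag{2.8}$$
Next, consider the Hamiltonian $H^{HB}$ of the heat bath, as well as a set of system-bath interactions $\{S_\alpha \otimes B_\alpha\}$, for some label $\alpha$. Here, we do not assume anything on the $S_\alpha$'s. The Hamiltonian of the universe composed of the system and its heat bath is given by
$$H \otimes \mathbb{1} + \mathbb{1} \otimes H^{HB} + \sum_\alpha S_\alpha \otimes B_\alpha.$$
Assuming that the bath is in a Gibbs state, by a standard argument (e.g. weak coupling limit, see [39]), the evolution on the system can be approximated by a quantum Markov semigroup whose generator is of the following form:
$$\mathcal{L}(X) = i[H, X] + \sum_{\omega, \alpha} \chi^\beta_\alpha(\omega)\Big(S_\alpha(\omega)^* X S_\alpha(\omega) - \frac{1}{2}\big\{S_\alpha(\omega)^* S_\alpha(\omega), X\big\}\Big).$$
The Fourier coefficients of the two-point correlation functions of the environment $\chi^\beta_\alpha$ satisfy the following KMS condition:
$$\chi^\beta_\alpha(-\omega) = e^{-\beta\omega}\, \chi^\beta_\alpha(\omega).$$
The operators $S_\alpha(\omega)$ are the Fourier coefficients of the system couplings $S_\alpha$, which means that they satisfy the following equation for any $t \in \mathbb{R}$:
$$e^{-itH} S_\alpha\, e^{itH} = \sum_\omega e^{it\omega} S_\alpha(\omega),$$
where the sum is over a finite number of frequencies. This implies in particular the following useful relation:
$$\Delta_\sigma(S_\alpha(\omega)) = e^{\beta\omega} S_\alpha(\omega).$$
The above identity means that the operators $S_\alpha(\omega)$ form a basis of eigenvectors of $\Delta_\sigma$. Next, we define the conditional expectation onto the algebra $\mathcal{F}(\mathcal{L})$ of fixed points of $\mathcal{L}$ with respect to the Gibbs state $\sigma = \sigma_\beta$ as follows [28]:
$$E_{\mathcal{F}}[X] := \lim_{t \to \infty} e^{t\mathcal{L}}(X). \tag{2.14}$$
Some results regarding the fixed-point algebra associated to this conditional expectation are contained in Appendix A. In particular, we prove there Theorem 1, which is of independent interest and shows that the conditional expectations arising from Petz recovery maps and from Davies generators coincide.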

Weak approximate tensorization of the relative entropy
This section is devoted to the main results of this article, namely approximate tensorization inequalities for the relative entropy.
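Definition 2. Let $\mathcal{M} \subset \mathcal{N}_1, \mathcal{N}_2 \subset \mathcal{N}$ be finite-dimensional von Neumann algebras and $E_{\mathcal{M}}, E_1, E_2$ associated conditional expectations onto $\mathcal{M}$, resp. $\mathcal{N}_1$, $\mathcal{N}_2$. These conditional expectations are said to satisfy a weak approximate tensorization with constants $c \ge 1$ and $d \ge 0$, denoted by AT(c, d), if, for any state $\rho \in \mathcal{D}(\mathcal{N})$:
$$D(\rho\|E_{\mathcal{M}*}(\rho)) \le c\,\big(D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho))\big) + d. \tag{3.1}$$
The approximate tensorization is said to be strong if $d = 0$.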
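Remark 1. One can easily get similar inequalities for $k \ge 2$ algebras $\mathcal{M} \subset \mathcal{N}_1, \ldots, \mathcal{N}_k \subset \mathcal{N}$ by simply averaging over the inequalities for each pair $k_1 \neq k_2 \in [k]$. Denoting by $c$ and $d$ the maximal constants obtained by considering two algebras $\mathcal{N}_{k_1}$ and $\mathcal{N}_{k_2}$ pairwise, and using that each algebra appears in $k - 1$ of the $k(k-1)/2$ pairs, we would thus obtain
$$D(\rho\|E_{\mathcal{M}*}(\rho)) \le \frac{2c}{k} \sum_{j=1}^{k} D(\rho\|E_{j*}(\rho)) + d.$$
For the sake of clarity, we will restrict to the case $k = 2$ in the rest of the article.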
The first technical result presented in this section is Lemma 1, derived from the so-called multivariate trace inequalities [41]. It bounds $D(\rho\|E_{\mathcal{M}*}(\rho))$ by the sum $D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho))$ plus an additive error term, which we estimate via different approaches in Sections 3.2 to 3.4. Lemma 1 directly yields a generalization of a result of [20] for conditional expectations with respect to non-tracial states in Corollary 1. Moreover, using a noncommutative change of measure argument [4], we provide in Proposition 2 some first estimates of the strong and weak constants c and d in AT(c, d) in terms of the maximal and minimal eigenvalues of a common invariant state of the three conditional expectations involved.
Next, in Theorem 2, we use a different technique involving Pinching maps onto certain subspaces that appear in a block-diagonal decomposition of $\mathcal{M}$ (this setting is properly introduced in Section 3.3) to obtain an inequality with both a multiplicative and an additive error term. The additive error term strongly depends on the Pinching map with respect to $E_{\mathcal{M}*}(\rho)$ and is subsequently estimated in Proposition 3. Furthermore, the multiplicative error term can be interpreted as arising from a condition of clustering of correlations for the state $E_{\mathcal{M}*}(\rho)$ (see Section 3.4).

A technical lemma
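In the next result, we derive a bound on the difference between $D(\rho\|E_{\mathcal{M}*}(\rho))$ and the sum of the relative entropies $D(\rho\|E_{i*}(\rho))$, which is our key tool in finding constants c and d for which AT(c, d) is satisfied. The result is inspired by the work of [14,17] and makes use of the multivariate trace inequalities introduced in [41]:
Lemma 1. Let $\mathcal{M} \subset \mathcal{N}_1, \mathcal{N}_2 \subset \mathcal{N}$ be finite-dimensional von Neumann algebras and $E_{\mathcal{M}}, E_1, E_2$ their corresponding conditional expectations. Then the following inequality holds for any $\rho \in \mathcal{D}(\mathcal{N})$, writing $\rho_j := E_{j*}(\rho)$ and $\rho_{\mathcal{M}} := E_{\mathcal{M}*}(\rho)$:
$$D(\rho\|\rho_{\mathcal{M}}) \le D(\rho\|\rho_1) + D(\rho\|\rho_2) + \ln \int_{-\infty}^{\infty} \mathrm{Tr}\Big[\rho_1\, \rho_{\mathcal{M}}^{\frac{-1-it}{2}}\, \rho_2\, \rho_{\mathcal{M}}^{\frac{-1+it}{2}}\Big]\, \beta_0(t)\, dt, \tag{3.2}$$
with the probability distribution function
$$\beta_0(t) = \frac{\pi}{2}\,\big(\cosh(\pi t) + 1\big)^{-1}.$$
Proof. The first step of the proof consists in showing the following bound:
$$D(\rho\|\rho_{\mathcal{M}}) - \big(D(\rho\|\rho_1) + D(\rho\|\rho_2)\big) \le \ln \mathrm{Tr}[M], \qquad M := \exp\big[-\ln\rho_{\mathcal{M}} + \ln\rho_1 + \ln\rho_2\big]. \tag{3.3}$$
Indeed, a direct computation gives
$$D(\rho\|\rho_{\mathcal{M}}) - \big(D(\rho\|\rho_1) + D(\rho\|\rho_2)\big) = \mathrm{Tr}\big[\rho\,(-\ln\rho - \ln\rho_{\mathcal{M}} + \ln\rho_1 + \ln\rho_2)\big] = -D(\rho\|M).$$
Moreover, since $\mathrm{Tr}[M] \neq 1$ in general, from the non-negativity of the relative entropy of two states it follows that:
$$D(\rho\|M) = D\Big(\rho\,\Big\|\,\frac{M}{\mathrm{Tr}[M]}\Big) - \ln \mathrm{Tr}[M] \ge -\ln \mathrm{Tr}[M].$$
In the next step, we bound the error term making use of [33, Theorem 7] and [41, Lemma 3.4], concerning Lieb's extension of the Golden-Thompson inequality and Sutter, Berta and Tomamichel's rotated expression for Lieb's pseudo-inversion operator using multivariate trace inequalities, respectively. Let us recall that Theorem 7 of [33] states that for observables $f$, $g$ and $h$, we have
$$\mathrm{Tr}\big[\exp(-f + g + h)\big] \le \mathrm{Tr}\big[e^{h}\, T_{e^{f}}(e^{g})\big],$$
where $T_f$ is given by:
$$T_f(g) := \int_0^\infty (f + t)^{-1}\, g\, (f + t)^{-1}\, dt.$$
An alternative definition of this superoperator in terms of multivariate trace inequalities was provided in Lemma 3.4 of [41], namely
$$T_f(g) = \int_{-\infty}^{\infty} \beta_0(t)\, f^{\frac{-1-it}{2}}\, g\, f^{\frac{-1+it}{2}}\, dt,$$
with $\beta_0$ as in the statement of the lemma. Now, we apply both results to inequality (3.3) with $f = \ln\rho_{\mathcal{M}}$, $g = \ln\rho_1$ and $h = \ln\rho_2$, and use the cyclicity of the trace together with the symmetry of $\beta_0$, to obtain
$$\ln \mathrm{Tr}[M] \le \ln \int_{-\infty}^{\infty} \mathrm{Tr}\Big[\rho_1\, \rho_{\mathcal{M}}^{\frac{-1-it}{2}}\, \rho_2\, \rho_{\mathcal{M}}^{\frac{-1+it}{2}}\Big]\, \beta_0(t)\, dt,$$
which concludes the proof of the lemma.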
Note that, if a constant $d > 0$ is such that
$$\int_{-\infty}^{\infty} \mathrm{Tr}\Big[\rho_1\, \rho_{\mathcal{M}}^{\frac{-1-it}{2}}\, \rho_2\, \rho_{\mathcal{M}}^{\frac{-1+it}{2}}\Big]\, \beta_0(t)\, dt \le e^{d}$$
for every $\rho \in \mathcal{D}(\mathcal{N})$, then inequality (3.2) constitutes a result of approximate tensorization AT(1, d). Using this observation, we obtain an arguably more direct proof of a result appearing in [20], which we generalize to the case of non-tracial states. Indeed, the proof of [20] required the introduction of so-called amalgamated $L_p$ spaces, a technical tool that we do not require.
Then the following weak approximate tensorization AT(1, d) holds: Proof. We focus on the last term on the right-hand side of (3.2). First, note that: We have by definition of d that there exists a state η ∈ D(M) such that for any t ∈ R: for some density X M ∈ M given by ρ Remark 2. In [23], the authors showed that, for doubly stochastic conditional expectations (i.e. E i * = E i , E M * = E M ), the following equation holds: Given the following block decomposition of the algebras N 2 and M, where a kl denotes the number of copies of the block M n k contained in the block M m l . In the context of lattice spin systems, this typically corresponds to the infinite temperature regime.

Approximate tensorization via noncommutative change of measure
Corollary 1 states a correction to exact tensorization with a unique weak constant. We expect this result to be relevant for doubly stochastic conditional expectations, where this additive term is purely quantum. However, the weak constant d is suboptimal in general. In this section and the following one, we provide tools to improve the latter at the cost of replacing the optimal strong constant by c > 1. This intuition is inspired by the classical setting, where the weak constant can be removed at the cost of a worsening of the strong constant [14,17]. Given a state $\sigma$ that is invariant for the conditional expectations $E_{\mathcal{M}}$, $E_1$ and $E_2$, we define the doubly stochastic conditional expectations $E^{(0)}_{\mathcal{M}}$, $E^{(0)}_1$ and $E^{(0)}_2$ onto the same algebras $\mathcal{M}$, $\mathcal{N}_1$ and $\mathcal{N}_2$, respectively. Then, the following proposition is a direct consequence of a recent noncommutative change of measure argument in [27] under the assumption that strong approximate tensorization for the relative entropy holds for these doubly stochastic conditional expectations.
Proposition 2. Let $d$ be the constant defined as in Corollary 1 for the doubly stochastic conditional expectations, and let us assume that AT(1, d) holds for them, i.e. for every $\rho \in \mathcal{D}(\mathcal{H})$,
$$D(\rho\|E^{(0)}_{\mathcal{M}*}(\rho)) \le D(\rho\|E^{(0)}_{1*}(\rho)) + D(\rho\|E^{(0)}_{2*}(\rho)) + d.$$
Then, the following result of AT(c, d′) with $c = \frac{\lambda_{\max}(\sigma)}{\lambda_{\min}(\sigma)}$ and $d' = \lambda_{\max}(\sigma)\, d_{\mathcal{H}}\, d$ holds:
$$D(\rho\|E_{\mathcal{M}*}(\rho)) \le \frac{\lambda_{\max}(\sigma)}{\lambda_{\min}(\sigma)}\,\big(D(\rho\|E_{1*}(\rho)) + D(\rho\|E_{2*}(\rho))\big) + \lambda_{\max}(\sigma)\, d_{\mathcal{H}}\, d.$$
We defer the proof of this result to Appendix B.1, as it merely follows the lines of [27].

Approximate tensorization via Pinching map
Proposition 2 states an approximate tensorization inequality with the advantage over Corollary 1 that the weak constant d vanishes when the doubly stochastic conditional expectations projecting onto the same subalgebras form a commuting square. However, the multiplicative constant typically blows up as the size of the system increases. In the following theorem, we take care of this issue by employing a pinching argument in place of the change of measure argument used in Proposition 2.
Before stating the result, let us fix some notations. As before, we are interested in proving (weak) approximate tensorization results for the quadruple of algebras M ⊂ N 1 , N 2 ⊂ N . As a subalgebra of B(H) for some Hilbert space H, M bears the following block diagonal decomposition: given H = i∈IM H i ⊗ K i : where P i corresponds to the projection onto the i-th diagonal block in the decomposition of M, and each τ i is a full-rank state on K i . We further make the observation that, since the restrictions of the conditional expectations E 1 , E 2 and E M on B(H i ⊗ K i ) only act non-trivially on the factor B(K i ), there exist conditional expectations E (i) j and (E M ) (i) acting on B(K i ) and such that In order to get another form of approximate tensorization, we wish to compare the state ρ with a classicalquantum state according to the decomposition given by M. To this end we introduce the Pinching map with respect to each H i : define ρ Hi ≡ Tr Ki [P i ρ P i ]. Then each ρ Hi can be diagonalized individually: The Pinching map we are interested in is then: Remark that we have for all ρ ∈ D(N ): Theorem 2. Assume Then, the following inequality holds: for any η ∈ D(N ) such that η = P ρM (η) and Tr Ki [P i η P i ] = ρ Hi . In particular, any state η of the form η := i∈IM ρ Hi ⊗ τ ′ i , for an arbitrary family of subnormalized states τ ′ i , satisfies these conditions. Alternatively, we can get Consequently, AT(c, d) holds with

where the infimum in the second line runs over η such that η = P ρM (η) and Tr Ki [P i η P i ] = ρ Hi .
Proof. The proof starts similarly to that of Corollary 1. We once again simply need to bound the integral on the right-hand side of (3.2). By considering η as in the statement of the theorem and writing for the moment d : To simplify the notation, let us write: $\eta_{12} := E_{1*} \circ E_{2*}(\eta)$. Now, note that the following holds: since $E_{\mathcal{M}*}$, $E_{1*}$ and $E_{2*}$ are conditional expectations in the Schrödinger picture and, thus, trace preserving. Therefore, where we have used that $\ln(x + 1) \le x$ for positive real numbers. Defining $X := \Gamma^{-1}_{\rho_{\mathcal{M}}}(\rho)$ and Y t := ρ and we can rewrite the previous expression as thus obtaining the following inequality Now, we focus on the integrand on the right-hand side of the above inequality. Denote for any $A \in \mathcal{B}(\mathcal{H})$, We also write $A^{(\lambda,i)} = |\lambda^{(i)}\rangle\langle\lambda^{(i)}| \otimes A^{(\lambda,i)}$ by a slight abuse of notation. Then Next, by Hölder's inequality each summand in the right-hand side above is upper bounded by where we use Young's inequality in the last line. Using Pinsker's inequality and summing over the indices $i$ and $\lambda^{(i)}$, we find that Equation (3.9) follows after rearranging the terms. In order to obtain Equation (3.10), we exploit that $\rho_{\mathcal{M}}$ is a fixed point of $\mathcal{P}_{\rho_{\mathcal{M}}}$ and therefore We can then apply Equation (3.9) to $\mathcal{P}_{\rho_{\mathcal{M}}}(\rho)$ and remark that the weak constant vanishes. The result follows after remarking that $\mathcal{P}_{\rho_{\mathcal{M}}} \circ E_{\mathcal{M}*} = E_{\mathcal{M}*} \circ \mathcal{P}_{\rho_{\mathcal{M}}}$ and applying the data-processing inequality to the map $\mathcal{P}_{\rho_{\mathcal{M}}}$.
Remark 3. In the case of a classical evolution over a classical system, taking $\eta = \mathcal{P}_{\rho_{\mathcal{M}}}(\rho)$ shows that $d = 0$ in Equation (3.11), and thus we get back the strong approximate tensorization of [14]. In Section 5.2, we will see that this also remains true for classical evolutions over quantum systems. The estimation of the constant c under a condition of clustering of correlations is discussed in the next section.
The next proposition provides a short analysis of the weak constant in Theorem 2. We note that the interpretation of this term as a deviation to the classical case is direct from the pinching argument, which explicitly "pinches" on a classical basis. However, as opposed to Proposition 2, we were unable to prove that the weak constant necessarily vanishes when the doubly stochastic conditional expectations form a commuting square.
and where P M := i∈IM P i (·)P i . Furthermore, given i ∈ I N , denote by I Then, .
The proof of this result is deferred to Appendix B.2.

Clustering of correlations
In this section we shift slightly our focus and study the multiplicative constant of the previous results, instead of the additive one. More specifically, we provide an interpretation of the multiplicative constant appearing in the last section in terms of certain notions of clustering of correlations. The latter play a particularly relevant role when applied in the context of quantum spin lattices [12].
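The constant $c_1 := \max_{i \in I_{\mathcal{M}}} \|E^{(i)}_1 \circ E^{(i)}_2 - (E_{\mathcal{M}})^{(i)} : L_1(\tau_i) \to L_\infty\|$ in Theorem 2 provides a bound on the following covariance-type quantity: for any $i \in I_{\mathcal{M}}$ and any $X, Y \in L_1(\tau_i)$,
$$\big\langle X,\, \big(E^{(i)}_1 \circ E^{(i)}_2 - (E_{\mathcal{M}})^{(i)}\big)(Y)\big\rangle_{\tau_i} \le c_1\, \|X\|_{L_1(\tau_i)}\, \|Y\|_{L_1(\tau_i)}. \tag{3.14}$$
We call the above property conditional $L_1$ clustering of correlations, and denote it by condL_1(c_1). Conversely, one can show by duality of $L_p$-norms that if condL_1(c′_1) holds for some positive constant $c'_1$, then
$$\max_{i \in I_{\mathcal{M}}} \|E^{(i)}_1 \circ E^{(i)}_2 - (E_{\mathcal{M}})^{(i)} : L_1(\tau_i) \to L_\infty\| \le c'_1.$$
In [28], the authors introduced a different notion of clustering of correlations in order to show the positivity of the spectral gap of Gibbs samplers.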
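Definition 3. We say that $\mathcal{M} \subset \mathcal{N}_1, \mathcal{N}_2 \subset \mathcal{N}$ satisfies strong $L_2$ clustering of correlations with respect to the state $\sigma \in \mathcal{D}(\mathcal{M})$ with constant $c_2 > 0$ if for all $X, Y \in \mathcal{N}$,
$$\mathrm{Cov}_{\mathcal{M},\sigma}(E_1[X], E_2[Y]) \le c_2\, \|X\|_{L_2(\sigma)}\, \|Y\|_{L_2(\sigma)}.$$
Definition 3 does not depend on the state $\sigma \in \mathcal{D}(\mathcal{M})$ chosen. This is the content of the next theorem, whose proof is presented in Appendix B.3.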
Remark 4. As a consequence of the previous theorem, we realize that the condition assumed in [28] of strong L 2 clustering of correlation with respect to one invariant state, to prove positivity of the spectral gap for the Davies dynamics, would be analogous to assuming strong L 2 clustering of correlation with respect to any invariant state.
It is easy to see that the above notion of strong L 2 clustering of correlations implies that of a conditional L 2 clustering, denoted by condL 2 (c 2 ), simply defined by replacing the L 1 norms by L 2 norms in Equation (3.14), or equivalently by assuming that One can ask whether the converse holds. We prove it under the technical assumption that the composition of conditional expectations E 1 • E 2 cancels off-diagonal terms in the decomposition of M: This is for instance the case when M ⊂ N 1 , N 2 ⊂ N forms a commuting square. The proof of this result is also deferred to Appendix B.3.
We conclude this section by noting a crucial difference between L 2 and L 1 clusterings: similarly to Definition 3, one could define a notion of strong L 1 clustering of correlations with respect to a state σ ∈ D(M): This would in particular imply condL 1 (c 1 (σ)). With this notion, and from an argument very similar to that of the proof of Theorem 2, we could show the following bound on the error term in Lemma 1: From this, one would conclude a strong approximate tensorization result if one could find a uniform bound on c 1 (σ) for any σ ∈ D(M). However, and as opposed to the case of strong L 2 clustering, the constant c 1 (σ) depends on the state σ, and can in particular diverge: this is the case whenever there exists i ∈ I M such that dim(H i ) < ∞, and for a state σ := |ψ⟩⟨ψ| Hi ⊗ τ i that is pure on H i . This justifies our choice of condL 1 as the better notion of L 1 clustering in the quantum setting. After the submission of this manuscript, new insights into this particular problem were provided in [24]. We defer a discussion of their results to Section 6.

Applications
This section is devoted to two applications of the results of the previous section. In Section 4.1, we show the usefulness of Theorem 2 in the context of modified logarithmic Sobolev inequalities. Then, in Section 4.2, we derive new entropic uncertainty relations.

Modified logarithmic Sobolev inequalities for biased bases
Take H = C ℓ and assume that the algebra N 1 is the diagonal algebra in some orthonormal basis $\{|e^{(1)}_k\rangle\}_k$, whereas N 2 is the diagonal algebra in the basis $\{|e^{(2)}_k\rangle\}_k$. Moreover, choose M to be the trivial algebra C1 ℓ . Hence, for each i ∈ {1, 2}, E i denotes the Pinching map onto the diagonal algebra spanned by the projections $|e^{(i)}_k\rangle\langle e^{(i)}_k|$. Then, for any X ≥ 0: so that by choosing η = ρ = P ρM (ρ) in Theorem 2, as long as ε < 1, for any ρ ∈ D(C ℓ ), AT((1 − ε) −1 , 0) holds: This result is related to Example 4.5 of [31]. There, the author obtains an inequality that can be rewritten in the following form: where δ here is related to ε in our example by:
The approximate tensorization derived in (4.1) can be used to find exponential convergence in relative entropy of the primitive quantum Markov semigroup e tL , where Indeed, for any state ρ ∈ D(H), denoting by ρ t the evolved state e tL (ρ) at time t, the fact that $D(\rho_t \| \ell^{-1}\mathbb{1}) \le e^{-\alpha t} D(\rho \| \ell^{-1}\mathbb{1})$ holds for some α > 0 is equivalent to the so-called modified logarithmic Sobolev inequality. Let us recall that L is said to satisfy a positive modified logarithmic Sobolev inequality (MLSI for short) if there exists a constant α > 0 such that the following inequality holds for every ρ ∈ D(H): In such a case, the optimal α for which the previous inequality holds is called the modified logarithmic Sobolev constant. In this particular setting, by [27,Lemma 3.4], the MLSI for L can be written as $\alpha D(\rho \| \ell^{-1}\mathbb{1}) \le D(\rho \| E_{1*}(\rho)) + D(E_{1*}(\rho) \| \rho) + D(\rho \| E_{2*}(\rho)) + D(E_{2*}(\rho) \| \rho)$.
By positivity of the relative entropy, it suffices to prove the existence of a constant α > 0 such that $\alpha D(\rho \| \ell^{-1}\mathbb{1}) \le D(\rho \| E_{1*}(\rho)) + D(\rho \| E_{2*}(\rho))$. This last inequality is equivalent to (4.1) for α = 1 − ε. Therefore, Theorem 2 implies that the generator L defined above satisfies an MLSI with constant bounded below by 1 − ε.
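As a sanity check of the exponential decay discussed above, the following minimal sketch treats a single qubit (ℓ = 2) with the two Pinching maps taken in the mutually unbiased Z and X bases. It assumes that the generator has the form L = E 1 + E 2 − 2 id, which is our reading of the entropy production terms in the MLSI above and not a formula quoted from the text; the Bloch-vector parametrization, the function names and the initial state are likewise illustrative choices. Under these assumptions the semigroup acts diagonally on the Bloch vector, so no matrix exponential is needed.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in nats) of the distribution (p, 1 - p)."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


def rel_entropy_to_mixed(r):
    """D(rho || 1/2) = ln 2 - S(rho) for a qubit with Bloch vector r."""
    norm = min(math.sqrt(sum(c * c for c in r)), 1.0)
    return math.log(2.0) - binary_entropy((1.0 + norm) / 2.0)


def evolve(r, t):
    """Bloch vector of e^{tL}(rho), ASSUMING L = E_Z + E_X - 2 id.

    E_Z keeps only the z component and E_X only the x component, so the
    generator acts diagonally: x -> -x, y -> -2y, z -> -z.
    """
    x, y, z = r
    return (x * math.exp(-t), y * math.exp(-2.0 * t), z * math.exp(-t))


def decay_profile(r0, times):
    """Relative entropy to the maximally mixed state along the evolution."""
    return [rel_entropy_to_mixed(evolve(r0, t)) for t in times]


if __name__ == "__main__":
    r0 = (0.3, 0.5, 0.6)  # hypothetical initial state inside the Bloch ball
    times = [0.0, 0.5, 1.0, 2.0, 4.0]
    for t, d in zip(times, decay_profile(r0, times)):
        print(f"t = {t:3.1f}   D(rho_t || 1/2) = {d:.6f}")
```

For this unbiased choice the printed values are expected to decay at least as fast as e^{−t} D(ρ ‖ 1/2); for bases that are not mutually unbiased, the rate would instead be governed by the constant 1 − ε of (4.1).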

Tightened entropic uncertainty relations
Given a function f ∈ L 2 (R) and its Fourier transform [45] the following uncertainty relation: where, given a probability distribution function g, V (g) denotes its variance. The uncertainty inequality means that |f | 2 and |F [f ]| 2 cannot both be concentrated arbitrarily close to their corresponding means. An entropic strengthening of (4.3) was derived independently by Hirschman [26] and Stam [40], and later tightened by Beckner [6]: where $H(g) := -\int_{\mathbb{R}} g(x) \ln g(x)\, dx$ stands for the differential entropy functional. In the quantum mechanical setting, this inequality has the interpretation that the total amount of uncertainty, as quantified by the entropy, of non-commuting observables (e.g. the position and momentum of a particle) is uniformly lower bounded by a positive constant independently of the state of the system. For an extensive review of entropic uncertainty relations for classical and quantum systems, we refer to the recent survey [15].
More generally, given two POVMs X := {X x } x and Y := {Y y } y on a quantum system A, and in the presence of side information M that might help to better predict the outcomes of X and Y, the following state-dependent tightened bound was found in [19] (see also [7] for the special case of measurements in two orthonormal bases and [36] for the case without memory): for any bipartite state ρ AM ∈ D(H A ⊗ H M ), with $c' = \max_{x,y}\{\mathrm{Tr}(X_x Y_y)\}$, where Φ Z denotes the quantum-classical channel corresponding to the measurement Z ∈ {X, Y}: The above inequality has recently been extended in [22] to the setting where the POVMs are replaced by two arbitrary quantum channels. In this section, we restrict ourselves to the setting of [7], so that the measurement channels reduce to the Pinching maps of Section 4.1. First of all, we notice that the relation (4.4) easily follows from Corollary 1: where the last equality is derived from [27,Lemma 3.4].
Hence, since by virtue of Corollary 1 we have Now, taking into account the computations of Section 4.1, notice that obtaining thus expression (4.4).
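The memoryless case (d M = 1) of relation (4.4) can be probed numerically on a qubit. The sketch below assumes that, for two orthonormal bases and trivial memory, (4.4) reads H(X) + H(Y) ≥ − ln c′ + S(ρ), with H the Shannon entropy of the outcome distribution and S the von Neumann entropy; this is the usual form of the relation of [19] and is stated here as an assumption on the normalisation. The two bases are represented by unit Bloch axes n1 and n2, for which c′ = (1 + |n1 · n2|)/2; all function names and the sampling scheme are hypothetical.

```python
import math
import random


def shannon(probs):
    """Shannon entropy in nats, ignoring zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def outcome_entropy(r, n):
    """Entropy of measuring the qubit with Bloch vector r along the unit axis n."""
    p = (1.0 + dot(r, n)) / 2.0
    return shannon((p, 1.0 - p))


def von_neumann_entropy(r):
    norm = min(math.sqrt(dot(r, r)), 1.0)
    return shannon(((1.0 + norm) / 2.0, (1.0 - norm) / 2.0))


def uncertainty_gap(r, n1, n2):
    """H(X) + H(Y) - (-ln c' + S(rho)); nonnegative when the relation holds."""
    c = (1.0 + abs(dot(n1, n2))) / 2.0  # max_{x,y} |<e_x|f_y>|^2 for a qubit
    return (outcome_entropy(r, n1) + outcome_entropy(r, n2)
            + math.log(c) - von_neumann_entropy(r))


def random_bloch_vector(rng):
    """Random point of the Bloch ball (rejection sampling from the cube)."""
    while True:
        r = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        if dot(r, r) <= 1.0:
            return r


def min_gap(samples=2000, seed=0):
    """Smallest gap found over random states and random relative angles."""
    rng = random.Random(seed)
    n1 = (0.0, 0.0, 1.0)
    worst = float("inf")
    for _ in range(samples):
        theta = rng.uniform(0.0, math.pi)
        n2 = (math.sin(theta), 0.0, math.cos(theta))
        worst = min(worst, uncertainty_gap(random_bloch_vector(rng), n1, n2))
    return worst


if __name__ == "__main__":
    print("smallest gap found:", min_gap())
```

The gap vanishes for the completely mixed state when the two bases are mutually unbiased, which is the regime where the bound is tight.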
However, close to the completely mixed state, this inequality is not tight whenever X and Y are not mutually unbiased bases (i.e. ∃x ∈ X , y ∈ Y such that $|\langle X_x | Y_y \rangle|^2 > 1/d_A$). Here, we derive the following strengthening of Equation (4.4) when d M = 1 as a direct consequence of Theorem 2:
Corollary 2. Given a finite alphabet Z ∈ {X , Y}, let E Z denote the Pinching channel onto the orthonormal basis $\{|e^{(Z)}_z\rangle\}_{z \in Z}$ corresponding to the measurement Z. Assume further that Then the following strengthened entropic uncertainty relation holds for any state ρ ∈ D(H A ), (4.5)
Proof. Following the first lines of Example 1 for d M = 1, we have . Then, by virtue of Theorem 2, for any η = P ρM (ρ), and by further choosing η = ρ, the last two terms above vanish. Thus, we have: To conclude, just notice that
Analogously, we can study the case of three different orthonormal bases (see [7]). For that, let us recall that given von Neumann subalgebras N 1 , N 2 , N 3 ⊂ N and M ⊂ N 1 ∩ N 2 ∩ N 3 , if we consider their associated conditional expectations E i with respect to a state σ, and for each pair (N i , N j ) a result of AT(c ij , d ij ) holds, then, summing the three pairwise inequalities, for every ρ ∈ D(N ):
$$D(\rho \| E^{M}_{*}(\rho)) \le \frac{2}{3} \max_{i<j}\{c_{ij}\} \Big( D(\rho \| E_{1*}(\rho)) + D(\rho \| E_{2*}(\rho)) + D(\rho \| E_{3*}(\rho)) \Big) + \frac{d_{12} + d_{13} + d_{23}}{3} . \qquad (4.6)$$
Corollary 3. Given a finite alphabet I ∈ {X , Y, Z}, and using the same notation as in Corollary 2, assume that Then the following strengthened entropic uncertainty relation holds for any state ρ ∈ D(H A ),

Lattice spin systems with commuting Hamiltonians
In this section, we further control the strong and weak constants appearing in Theorem 2 in the context of lattice spin systems, and compare them with previous conditions in the classical and quantum literature. The main result of this section is Theorem 3, where we show that the classical Glauber dynamics embedded in a quantum system satisfies a strong approximate tensorization AT(1, 0) at infinite temperature and an approximate tensorization AT(c, 0) with small multiplicative constant when the temperature is high enough. This result is contained in Section 5.2.
Given a finite lattice Λ ⊂⊂ Z d , we define the tensor product Hilbert space $H := H_\Lambda \equiv \bigotimes_{k \in \Lambda} H_k$, where for each k ∈ Λ, H k ≃ C ℓ , ℓ ∈ N. Then, let Φ : Λ → B sa (H Λ ) be an r-local potential, i.e. for any j ∈ Λ, Φ(j) is self-adjoint and supported on a ball of radius r around site j. We assume further that ‖Φ(j)‖ ≤ K for some constant K < ∞. The potential Φ is said to be a commuting potential if for any i, j ∈ Λ, [Φ(i), Φ(j)] = 0. Given such a local, commuting potential, the Hamiltonian on a subregion A ⊆ Λ is defined as $H_A := \sum_{j \in A} \Phi(j)$. Next, the Gibbs state corresponding to the region A and at inverse temperature β is defined as $\sigma_A := e^{-\beta H_A} / \mathrm{Tr}[e^{-\beta H_A}]$.
Note that this is in general not equal to the state Tr B [σ Λ ]. We begin by introducing Davies semigroups on lattice spin systems. Together with heat-bath generators defined through Petz recovery maps [3,29,43], these are the most studied examples of Markovian dynamics in this context. Thanks to Theorem 1, we know that the conditional expectations arising from both dynamics coincide. Hence, for the rest of the paper, all the results presented will be independent of the choice of underlying dynamics.

Davies generators on lattice spin systems
Consider the setting introduced in Section 2.3 and, in particular, the Hamiltonian modelling the system-bath interaction. As mentioned before, the evolution on the system can be approximated by a quantum Markov semigroup whose generator is of the following form: Similarly, define the generator L D,β A by restricting the sum in Equation (5.3) to the sublattice A: Note that L D,β A acts non-trivially on the boundary of A, denoted by A ∂ := {k ∈ Λ : d(k, A) ≤ r}. Then, for any region A ⊂ Λ, we define the conditional expectation onto the algebra N A of fixed points of L D,β A with respect to the Gibbs state σ = σ Λ as follows [28]: given an adequate decomposition $H_\Lambda := \bigoplus_{i \in I_{N_A}} H^A_i \otimes K^A_i$ of the total Hilbert space H Λ of the lattice spin system, for some fixed full-rank states σ A i on K A i . It was shown in Lemma 11 of [28] that the generator of the Davies semigroup corresponding to a local commuting potential is frustration-free. This means that the state σ is in the kernels of all L D,β A , A ⊆ Λ. Therefore, the conditional expectations E D,β A are all defined with respect to σ.
In the next section, we study the weak approximate tensorization of the conditional expectations E D,β A ≡ E β A in the case of a classical Hamiltonian. We start with the following simple observation for commuting Hamiltonians.
Proposition 5. Let A, B ⊂ Λ be two regions separated by at least a distance 2r, that is such that A ∂ ∩ B ∂ = ∅. Then N A and N B form a commuting square, that is, Consequently, for all ρ ∈ D(H Λ ), Proof. Remark that by definition of the map L D,β A , it only acts non-trivially on A ∂ and as identity on (A ∂ ) c . Consequently, as E A = lim t→∞ e tL D,β A , this property carries over to the conditional expectation and we have . This shows the result since A ∂ ∩ B ∂ = ∅.

Classical Hamiltonian over quantum systems
In this section, we investigate the case of a quantum lattice spin system undergoing a classical Glauber dynamics, whose framework was already studied in [16]. These semigroups correspond to Davies generators whose Hamiltonian is classical, that is, diagonal in a product basis of H Λ . In order to make the connection with the classical Glauber dynamics over a classical system (i.e. initially diagonal in the product basis), we introduce the generator more explicitly: consider a lattice spin system over Γ = Z d with classical configuration space S = {+1, −1}, and, for each Λ ⊂ Γ, denote by Ω Λ = S Λ the space of configurations over Λ. Next, given a classical finite-range, translationally invariant potential {J A } A∈Γ and a boundary condition τ ∈ Ω Λ c , define the Hamiltonian over Λ as The classical Gibbs state corresponding to such Hamiltonian is then given by Next, define the Glauber dynamics for a potential J as the Markov process on Ω Λ with the generator where ∇ x f (σ) = f (σ x ) − f (σ) and σ x is the configuration obtained by flipping the spin at position x. The numbers c J (x, σ) are called transition rates and must satisfy the following assumptions: 1. There exist c m , c M such that 0 < c m ≤ c J (x, σ) ≤ c M < ∞ for all x, σ.
2. c J (x, .) depends only on spin values in b r (x).

3. For all
4. Detailed balance: for all x ∈ Γ, and all σ
These assumptions constitute sufficient conditions for the corresponding Markov process to have the Gibbs states over Λ as stationary points. Next, we introduce the notion of a quantum embedding of the aforementioned classical Glauber dynamics. This is the Lindbladian whose Lindblad operators are given by $L_{x,\eta} := c_J(x, \eta)\, |\eta^x\rangle\langle\eta| \otimes \mathbb{1}$, ∀x ∈ Λ, η ∈ Ω bx(r) . (5.9) It was shown in [16] that such a dynamics is KMS-symmetric with respect to the state µ τ Λ as embedded into the computational basis. Moreover, the set of fixed points in the Schrödinger picture corresponds to the convex hull of the set of Gibbs states over Λ, {µ τ Λ |τ ∈ Ω Λ c }. In the Heisenberg picture, this implies that the fixed-point algebras F (L A ) are expressed as Equivalently, where σ ω A denotes the Gibbs state µ ω A embedded into the computational basis. With this expression at hand, we can prove that classical Hamiltonians over quantum systems satisfy the same approximate tensorization as in the classical case.
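The assumptions on the transition rates listed above can be illustrated on a small example. The sketch below is a minimal check, assuming a nearest-neighbour Ising ring of four spins with periodic boundary (rather than a boundary condition τ) and heat-bath rates c J (x, σ) = 1/(1 + exp(β[H(σ x ) − H(σ)])); the model, the parameter values and all function names are hypothetical choices that do not come from the text. It verifies detailed balance in the usual form c J (x, σ) µ(σ) = c J (x, σ x ) µ(σ x ), and that the Gibbs measure is stationary for the resulting generator.

```python
import itertools
import math

# Hypothetical parameters: ring size, coupling, inverse temperature.
N, J, BETA = 4, 1.0, 0.7


def energy(sigma):
    """Nearest-neighbour Ising energy on a ring (periodic boundary, assumed)."""
    return -J * sum(sigma[i] * sigma[(i + 1) % N] for i in range(N))


def flip(sigma, x):
    """Configuration sigma^x obtained by flipping the spin at site x."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]


def rate(x, sigma):
    """Heat-bath transition rate (an assumed choice of c_J)."""
    delta = energy(flip(sigma, x)) - energy(sigma)
    return 1.0 / (1.0 + math.exp(BETA * delta))


configs = list(itertools.product((+1, -1), repeat=N))
Z = sum(math.exp(-BETA * energy(s)) for s in configs)
gibbs = {s: math.exp(-BETA * energy(s)) / Z for s in configs}


def detailed_balance_defect():
    """Largest violation of c(x, s) mu(s) = c(x, s^x) mu(s^x)."""
    return max(
        abs(rate(x, s) * gibbs[s] - rate(x, flip(s, x)) * gibbs[flip(s, x)])
        for s in configs for x in range(N)
    )


def stationarity_defect():
    """Largest net probability flow into a configuration under the generator."""
    return max(
        abs(sum(gibbs[flip(s, x)] * rate(x, flip(s, x)) - gibbs[s] * rate(x, s)
                for x in range(N)))
        for s in configs
    )


if __name__ == "__main__":
    print("detailed balance defect:", detailed_balance_defect())
    print("stationarity defect:    ", stationarity_defect())
```

With this choice the rates are bounded between 0 and 1, depend only on the two neighbours of x, and are translation invariant on the ring, so the first three assumptions hold by construction; only detailed balance and stationarity are checked numerically.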
Theorem 3. Let A, B ⊂ Λ. Then, at β = 0, N A and N B form a commuting square, that is, and consequently, for all ρ ∈ D(H Λ ), At finite temperature β > 0, AT(c, 0) holds with
Proof. Equation (5.12) is a direct consequence of the definition of the conditional expectations at β = 0: i.e. E β=0 In order to prove that AT(c, 0) holds at positive temperature, we use our main result on approximate tensorization based on Pinching techniques, namely Theorem 2. More specifically, for every ρ ∈ D(H Λ ), we denote ρ M := E A∪B * (ρ) and apply Equation (3.9) to η = P ρM (ρ). Thus, we only need to check that $D_{\max}\big(E_{A*} \circ E_{B*}(\rho) \,\|\, E_{A*} \circ E_{B*}(\eta)\big) = 0$. We denote by P A the pinching map onto the computational basis on a subset A of Λ. By a simple computation we see that P (A∪B) ∂ • P ρM = P (A∪B) ∂ and so that E A * • E B * (ρ) = E A * • E B * (P ρM (ρ)), which completes the proof.
In Theorem 3, we have shown that strong approximate tensorization AT(1, 0) holds at infinite temperature for classical Hamiltonians. However, let us remark that it is not clear (and we strongly believe the opposite) that this remains true for non-classical commuting Gibbs states. A first idea to support this intuition has been shown in Proposition 5. We leave a thorough study of this fact for future work.

Outlook
In this paper, we introduce and study an extension of the celebrated strong subadditivity of the entropy: given algebras N = N 1 ∩ N 2 , N 1 , N 2 ⊆ M, with corresponding conditional expectations E 1 : M → N 1 , E 2 : M → N 2 and E N : M → N , there exist constants c ≥ 1 and d ≥ 0 such that
$$D(\rho \| E^{N}_{*}(\rho)) \le c\, \big( D(\rho \| E_{1*}(\rho)) + D(\rho \| E_{2*}(\rho)) \big) + d . \qquad (6.1)$$
By analogy with its classical counterpart, we dubbed this inequality approximate tensorization of the relative entropy.
Since the first submission of this paper, (6.1) has found several extensions and applications in quantum information theory and quantum many-body systems. First, the inequality was used to derive the first proof of the positivity of the modified logarithmic Sobolev constant, independently of the system size, for Gibbs states of nearest neighbour commuting Hamiltonians on a regular lattice [12]. For this specific class of Gibbs states, the authors showed that the analysis can indeed be reduced to the case of states ρ for which the additive error term in Theorem 2 vanishes, hence providing a direct application of our main result.
More recently, [24] (as well as a new version of [31]) proved a strong approximate tensorization result with multiplicative constant depending on the L 2 clustering of the conditional expectations as well as the dimension of the system. Their approximate tensorization was then used to find asymptotically tight exponential entropic decay to equilibrium for various models of noise including quantum Markov semigroups generated by classical graph Laplacians, approximate k-designs, or the quantum Kac master equation. In their extension of (6.1), the noisy system can also be coupled to an arbitrarily large noiseless environment. Although it is tight in the sense that d = 0 and that it reduces to the exact tensorization in the commuting square setting, their bound still provides poor control of the multiplicative constant in the context of Gibbs samplers. We expect that both methods combined will prove useful in proving the uniform positivity of the MLSI constant for generic quantum Gibbs samplers in the near future. Indeed, these techniques, together with a version of Theorem 2, will be used soon to derive positivity of an MLSI for Davies generators in 1D systems [2].

A.1 Conditional expectations generated by a Petz recovery map
Here, we further discuss the notion of conditional expectations coming from the Petz recovery map. The discussion is largely inspired by some results in [13].
Let σ be a faithful density matrix on the finite-dimensional algebra N and let M ⊂ N be a subalgebra. We denote by E τ the conditional expectation onto M with respect to the completely mixed state (i.e. E τ is self-adjoint with respect to the Hilbert-Schmidt inner product). Let us recall the notations and notions introduced in Section 2.3 regarding the adjoint of the Petz recovery map and the conditional expectation constructed from it. We show below the form that these concepts take for a bipartite system. Example 2. Our main example is the case of a bipartite system AB. In this case, N = B(H AB ) and M = 1 HA ⊗ B(H B ). Let σ = σ AB be a faithful density matrix on AB. The partial trace with respect to H A is an example of a conditional expectation E τ which is not compatible with σ AB , in general. With this choice, we obtain: Proposition 7. For any state η ∈ D(N ) such that E σ * (η) = η and any state ρ ∈ D(N ), we have

Proof. Equation (A.3) is a direct consequence of Equation (A.2) when applied to η = E σ * (ρ), so we focus on the first equation (note that it can be seen as a counterpart of Equation (A.1) for the difference of relative entropies). To this end, we need the following state σ Tr defined in [1] and heavily exploited in [5]: It has the property that for all X ∈ F (A σ ), [X, σ Tr ] = 0 (see Lemma 3.1 in [1]). Then it is enough to prove that for all η ∈ D(N ) such that E σ * (η) = η, we have: Now any such η can be written as η = Xσ Tr with X ∈ F (A σ ). Note that, by definition of F (A σ ), X ∈ M, so that E τ (η) = XE τ (σ Tr ). Using the commutation between X and σ Tr and expanding the RHS of the previous equation, we get the result.

A.2 Davies semigroups
Here we consider the conditional expectation associated to the Davies dynamics that was presented in Section 2.4. Our first result is a characterization of the fixed-point algebra in the Davies case.
where the notation {·} ′ denotes the centralizer of the set.
Proof. We recall that F (L D,β ) = {S α (ω)} ′ . Hence, since $\sigma^{it} S_\alpha \sigma^{-it}$ can be expressed as a linear combination of the S α (ω)'s by Equation (2.12), it directly follows that To prove the opposite direction, we let X ∈ {σ it S α σ −it ; t ≥ 0} ′ . This means in particular that, for all t ∈ R, and all α: Since the equation holds for all t ∈ R, we can differentiate it up to N − 1 times at 0, where N ≡ |{ω}|, to get that, for any 0 ≤ n ≤ N − 1: $\sum_{\omega} \omega^n [X, S_\alpha(\omega)] = 0$.
Using an arbitrary labelling of the N distinct frequencies ω 1 , ..., ω N , the resulting N linear equations can be rewritten as
$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ \omega_1 & \omega_2 & \cdots & \omega_N \\ \vdots & \vdots & & \vdots \\ \omega_1^{N-1} & \omega_2^{N-1} & \cdots & \omega_N^{N-1} \end{pmatrix} \begin{pmatrix} [X, S_\alpha(\omega_1)] \\ [X, S_\alpha(\omega_2)] \\ \vdots \\ [X, S_\alpha(\omega_N)] \end{pmatrix} = 0 .$$
Since all the frequencies ω i are distinct, their Vandermonde matrix is invertible. Hence, [X, S α (ω)] = 0 for all ω, so that X ∈ F (L D,β ).
with analogous expressions for N 1 and N 2 with their respective conditional expectations. Then, we can express this relative entropy as an infimum over D Lin . Indeed, Lemma 3.4 in [27] states that for all full-rank positive semi-definite Y ∈ M, $D_{\mathrm{Lin}}(\rho \| \Gamma_\sigma(Y)) = D_{\mathrm{Lin}}(\rho \| E^{M}_{*}(\rho)) + D_{\mathrm{Lin}}(E^{M}_{*}(\rho) \| \Gamma_\sigma(Y))$.

B.2 Proof of Proposition 3
Proof of Proposition 3. We first proceed by proving the bound d ≤ d 1 + d 2 . For all ρ ∈ D(N ), we can use the chain rule on the max-relative entropy to obtain: where the second inequality follows from the data processing inequality for D max . Then where we write A (i) := P i A P i for any A ∈ B(H). This last D max is exactly I max H i : K i ρ (i) after minimizing on η. We are left with proving the two separate bounds on d 1 and d 2 respectively. The first bound is a simple consequence of the data processing inequality for D max and the Pinching inequality. The second bound is a consequence of Lemma B.7 in [8].

B.3 Proofs of Lemma 2 and Proposition 4
Before proving Lemma 2, we need to prove a technical lemma.
Lemma 3. Given a conditional expectation E : N → M ⊂ N ⊂ B(H) that is invariant with respect to two different full-rank states, ρ and σ, the following holds:
Proof of Lemma 3. Since we are in finite dimension, the von Neumann algebra M takes the following form: Consider now the Hilbert-Schmidt decomposition of P i XP i with respect to $(B(H_i), \langle\cdot,\cdot\rangle_{\sigma_i})$ and $(B(K_i), \langle\cdot,\cdot\rangle_{\tau_i})$: Thus we have and therefore where in the third line we use that $(f^{(i)}_\alpha)_\alpha$ is an orthogonal family for every i ∈ I M . This shows that which is equivalent to strong L 2 clustering.