1 Introduction

The classical Brascamp–Lieb (BL) problem asks, given a finite sequence of surjective linear maps \(L_k:{\mathbb {R}}^m \rightarrow {\mathbb {R}}^{m_k}\) and \(q_k\in {\mathbb {R}}_+\) for \(k\in [n]\), for the optimal constant \(C\in {\mathbb {R}}\) such that [7, 10, 14, 47]

$$\begin{aligned} \int _{{\mathbb {R}}^m} \prod _{k=1}^n f_k\big (L_k x\big ) \,\textrm{d}x \le \exp (C)\prod _{k=1}^n\Vert f_k\Vert _{1/q_k} \end{aligned}$$
(1)

holds for all non-negative functions \(f_k:{\mathbb {R}}^{m_k}\rightarrow {\mathbb {R}}_+\), \(k\in [n]\), where \(\Vert \cdot \Vert _p\) denotes the p-norm. Many classical integral inequalities fall into this framework, such as the Hölder inequality, Young’s inequality, and the Loomis-Whitney inequality. A celebrated theorem by Lieb asserts that the optimal constant in Eq. (1) can be computed by optimizing over centred Gaussians \(f_k\) alone [47].

Remarkably, Eq. (1) has a dual, entropic formulation in terms of the differential entropy \(H(g):=-\int \,g(x)\log g(x)\, \textrm{d}x\). Namely, Eq. (1) holds for all \(f_1,\dots ,f_n\) as above if, and only if, for all probability densities g on \({\mathbb {R}}^m\) with finite differential entropy, we have [19]

$$\begin{aligned} H(g)\le \sum _{k=1}^nq_k H(g_k)+C \, . \end{aligned}$$
(2)

Here, \(g_k\) denotes the marginal probability density on \({\mathbb {R}}^{m_k}\) corresponding to \(L_k\), i.e., the push-forward of g along \(L_k\) defined by \(\int _{{\mathbb {R}}^m} \phi (L_k x)g(x) \textrm{d}x = \int _{{\mathbb {R}}^{m_k}} \phi (y) g_k(y) \textrm{d}y\) for all bounded, continuous functions \(\phi \) on \({\mathbb {R}}^{m_k}\). The duality between Eqs. (1) and (2) readily generalizes to arbitrary measure spaces and measurable maps [19].

Of particular interest is the so-called geometric case where each \(L_k\) is a surjective partial isometry and \(\sum _{k=1}^n q_k \, L_k^\dagger L_k = \mathbbm {1}_{{\mathbb {R}}^m}\) [2,3,4,5,6,7]. In this case, Eqs. (1) and (2) hold with \(C=0\). This setup includes the Hölder and Loomis-Whitney inequalities. Equivalently, we are given n subspaces \(V_k\subseteq {\mathbb {R}}^m\) (the supports of the \(L_k\)) such that \(\sum _{k=1}^n q_k \, \Pi _k = \mathbbm {1}_{{\mathbb {R}}^m}\), where \(\Pi _k\) denotes the orthogonal projection onto \(V_k\). In this case we can think of the marginal densities \(g_k\) as functions on \(V_k\), namely

$$\begin{aligned} g_{V_k}(y) = \int _{V_k^\perp } g(y + z) \textrm{d}z \qquad \forall y \in V_k \, . \end{aligned}$$
(3)

In particular, if \(V_k\) is a coordinate subspace of \({\mathbb {R}}^m\) then \(g_{V_k}\) is nothing but the usual marginal probability density of the corresponding random variables, justifying our terminology. As a concrete example, let \(V_1\), \(V_2\) be the two coordinate subspaces of \({\mathbb {R}}^2\) and \(q_1=q_2=1\); then Eq. (2) amounts to the sub-additivity property of the differential entropy, which is dual to the trivial estimate \(\int _{{\mathbb {R}}^2} f_1(x_1) f_2(x_2) \,\textrm{d}x \le \Vert f_1\Vert _1 \Vert f_2\Vert _1\). In contrast, already for three equiangular lines in \({\mathbb {R}}^2\) (a ‘Mercedes star’ configuration) and \(q_1=q_2=q_3=\frac{2}{3}\), neither inequality is immediate.
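The Mercedes star example can at least be probed numerically. The following Python sketch (assuming NumPy; all variable and function names are illustrative and not taken from the literature) checks the geometric condition \(\sum _k q_k \Pi _k = \mathbbm {1}\) for three equiangular lines with \(q_k=\frac{2}{3}\), and evaluates Eq. (2) with \(C=0\) on randomly drawn centred Gaussian densities, using the standard formula \(H = \frac{1}{2}\log \det (2\pi e\,\Sigma )\) for a Gaussian with covariance \(\Sigma \). This is only a sanity check on a sample of Gaussians, not a proof.

```python
import numpy as np

# Three unit vectors at mutual angles of 120 degrees (Mercedes star).
angles = np.deg2rad([90.0, 210.0, 330.0])
vectors = [np.array([np.cos(a), np.sin(a)]) for a in angles]
projections = [np.outer(v, v) for v in vectors]  # rank-one orthogonal projections
q = [2.0 / 3.0] * 3

# Geometric condition: sum_k q_k * Pi_k should equal the identity on R^2.
frame_sum = sum(qk * P for qk, P in zip(q, projections))
geometric_ok = bool(np.allclose(frame_sum, np.eye(2)))

def entropy_gap(cov):
    """Right-hand side minus left-hand side of Eq. (2) with C = 0,
    for a centred Gaussian density with covariance matrix cov."""
    h_joint = 0.5 * np.log(np.linalg.det(2 * np.pi * np.e * cov))
    # The marginal along the line spanned by v is Gaussian with variance v^T cov v.
    h_marginals = [0.5 * np.log(2 * np.pi * np.e * (v @ cov @ v)) for v in vectors]
    return float(sum(qk * h for qk, h in zip(q, h_marginals)) - h_joint)

rng = np.random.default_rng(0)
gaps = []
for _ in range(200):
    M = rng.normal(size=(2, 2))
    gaps.append(entropy_gap(M @ M.T + 0.1 * np.eye(2)))  # random positive definite covariance
min_gap = min(gaps)
```

For the isotropic Gaussian the gap vanishes, consistent with \(C=0\) being optimal in the geometric case.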

Recently, the BL duality has been extended on the entropic side to not only include entropy inequalities as in Eq. (2) but also relative entropy inequalities in terms of the Kullback–Leibler divergence [52]. The dual analytic form then again corresponds to generalized Young inequalities as in Eq. (1) but now for weighted p-norms. Interestingly, this extended BL duality covers many fundamental entropic statements from information theory and more. This includes, e.g., hypercontractivity inequalities, strong data processing inequalities, and transportation-cost inequalities [53].

Here, we raise the question of how the aforementioned BL dualities can be extended to the non-commutative setting. Our main motivation comes from quantum information theory, where quantum entropy inequalities are pivotal and dual formulations often promise new insights. BL dualities for non-commutative integration have previously been studied by Carlen and Lieb [20]. Amongst other contributions, they gave BL dualities similar to Eqs. (1)–(2) leading to generalized sub-additivity inequalities for quantum entropy.

In this paper, we extend the classical duality results of [52, 53] to the quantum setting—thereby generalizing Carlen and Lieb’s BL duality to the quantum relative entropy and general quantum channel evolutions. In particular, we derive in Sect. 2 a fully quantum BL duality for quantum relative entropy and discuss its properties. In Sect. 3 we then discuss a plethora of examples from quantum information theory that are covered by our quantum BL duality. As novel inequalities, we give quantum versions of the geometric Brascamp–Lieb inequalities discussed above, whose entropic form can be interpreted as an uncertainty relation for certain Gaussian quantum operations (Sect. 3.2).

Note added: Since the first version of our manuscript, our geometric quantum Brascamp–Lieb inequalities from Sect. 3.2 have been extended to the conditional case [50] and to more general Gaussian quantum operations [29]. We briefly mention these extensions in Sect. 3.2.

Notation. Let A and B be separable Hilbert spaces. We denote the set of bounded operators on A by \({{\,\textrm{L}\,}}(A)\), the set of trace-class operators on A by \(\textrm{T}(A)\), the set of Hermitian operators on A by \({{\,\textrm{Herm}\,}}(A)\), the set of positive operators on A by \({{\,\mathrm{P_\succ }\,}}(A)\), and the set of positive semi-definite operators on A by \({{\,\mathrm{P_\succeq }\,}}(A)\). A density operator or quantum state is a positive semi-definite trace-class operator with unit trace; we denote the set of density operators on A by \({{\,\textrm{S}\,}}(A)\). The set of trace-preserving and positive maps from \(\textrm{T}(A)\) to \(\textrm{T}(B)\) is denoted by \({{\,\textrm{TPP}\,}}(A,B)\) and the set of trace-preserving and completely positive maps from \(\textrm{T}(A)\) to \(\textrm{T}(B)\) is denoted by \({{\,\textrm{TPCP}\,}}(A,B)\). For \({\mathcal {E}}\in {{\,\textrm{TPP}\,}}(A,B)\) the adjoint map \({\mathcal {E}}^\dagger \), which is a unital and positive map from \({{\,\textrm{L}\,}}(B)\) to \({{\,\textrm{L}\,}}(A)\), is defined by \({{\,\textrm{tr}\,}}{\mathcal {E}}(X)^\dagger Y = {{\,\textrm{tr}\,}}X^\dagger {\mathcal {E}}^\dagger (Y)\) for all \(X \in \textrm{T}(A)\) and \(Y \in {{\,\textrm{L}\,}}(B)\). When it is clear from the context, we sometimes leave out identity operators, i.e., we may write \(\rho _A \sigma _{AB} \rho _B\) for \((\rho _A \otimes \mathbbm {1}_B) \sigma _{AB} (\mathbbm {1}_A \otimes \rho _B)\).
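As a concrete instance of the adjoint map, consider the partial trace \({\mathcal {E}}={{\,\textrm{tr}\,}}_B\), whose adjoint is \(Y \mapsto Y\otimes \mathbbm {1}_B\). The following Python sketch (assuming NumPy and finite dimensions; the function names are hypothetical) checks the defining relation \({{\,\textrm{tr}\,}}{\mathcal {E}}(X)^\dagger Y = {{\,\textrm{tr}\,}}X^\dagger {\mathcal {E}}^\dagger (Y)\) on random matrices and confirms that the adjoint is unital.

```python
import numpy as np

def partial_trace_B(X, dA, dB):
    """Partial trace over the second factor of an operator on C^dA (x) C^dB."""
    return np.trace(X.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

def adjoint_partial_trace_B(Y, dB):
    """Adjoint of the partial trace: Y -> Y (x) identity on B."""
    return np.kron(Y, np.eye(dB))

rng = np.random.default_rng(1)
dA, dB = 2, 3
X = rng.normal(size=(dA * dB, dA * dB)) + 1j * rng.normal(size=(dA * dB, dA * dB))
Y = rng.normal(size=(dA, dA)) + 1j * rng.normal(size=(dA, dA))

# Defining relation of the adjoint: tr E(X)^dagger Y = tr X^dagger E^dagger(Y).
lhs = np.trace(partial_trace_B(X, dA, dB).conj().T @ Y)
rhs = np.trace(X.conj().T @ adjoint_partial_trace_B(Y, dB))

# Trace preservation of E corresponds to unitality of its adjoint.
unital = bool(np.allclose(adjoint_partial_trace_B(np.eye(dA), dB), np.eye(dA * dB)))
```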

The von Neumann entropy of a density operator \(\rho \in {{\,\textrm{S}\,}}(A)\) is defined as (see Footnote 1)

$$\begin{aligned} H(\rho ) := -{{\,\textrm{tr}\,}}\rho \log \rho \end{aligned}$$

and can be infinite (only) if A is infinite-dimensional. The quantum relative entropy of \(\omega \in {{\,\textrm{S}\,}}(A)\) with respect to \(\tau \in {{\,\mathrm{P_\succeq }\,}}(A)\) is given by

$$\begin{aligned} D(\omega \Vert \tau ):= {{\,\textrm{tr}\,}}\omega (\log \omega -\log \tau )\quad \text {if}\, \omega \ll \tau \,\,\text {and as} +\infty \,\, \text {otherwise}, \end{aligned}$$

where \(\omega \ll \tau \) denotes that the support of \(\omega \) is contained in the support of \(\tau \). The von Neumann entropy can be expressed as a relative entropy, \(H(\rho ) = -D(\rho \Vert \mathbbm {1})\), where \(\mathbbm {1}\) denotes the identity operator. For \(\rho _{AB} \in {{\,\textrm{S}\,}}(A \otimes B)\) with \(H(A)_\rho <\infty \), the conditional entropy of A given B is defined as [45]

$$\begin{aligned} H(A|B)_{\rho }:= H(A)_\rho - D(\rho _{AB}\Vert \rho _A \otimes \rho _B), \end{aligned}$$

where the notation \(H(A)_\rho := H(\rho _A)\) refers to the entropy of the reduced density operator \(\rho _A = {{\,\textrm{tr}\,}}_B(\rho )\) on A. For A and B finite-dimensional we can also write \(H(A|B)_\rho = H(AB)_\rho - H(B)_\rho \).
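The relations between these quantities are easy to check numerically in finite dimensions. The sketch below (assuming NumPy, natural logarithms, and full-rank states; the helper names are illustrative) verifies \(H(\rho ) = -D(\rho \Vert \mathbbm {1})\) and the agreement of the two expressions for \(H(A|B)_\rho \) on a random bipartite state.

```python
import numpy as np

def herm_fn(M, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling only)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    """Von Neumann entropy H(rho) = -tr rho log rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]  # convention 0 log 0 = 0
    return float(-np.sum(w * np.log(w)))

def relative_entropy(omega, tau):
    """D(omega || tau) = tr omega (log omega - log tau), assuming full rank."""
    diff = herm_fn(omega, np.log) - herm_fn(tau, np.log)
    return float(np.trace(omega @ diff).real)

rng = np.random.default_rng(2)
dA, dB = 2, 3
rho_AB = random_state(dA * dB, rng)
T = rho_AB.reshape(dA, dB, dA, dB)
rho_A = np.trace(T, axis1=1, axis2=3)  # trace out B
rho_B = np.trace(T, axis1=0, axis2=2)  # trace out A

entropy_via_D = -relative_entropy(rho_AB, np.eye(dA * dB))
cond_via_D = entropy(rho_A) - relative_entropy(rho_AB, np.kron(rho_A, rho_B))
cond_via_H = entropy(rho_AB) - entropy(rho_B)
```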

Throughout this manuscript the default is that Hilbert spaces are finite-dimensional unless explicitly stated otherwise (such as in Sect. 3.2).

2 Brascamp–Lieb Duality for Quantum Relative Entropies

In this section, we describe our main result (Theorem 2.1) and discuss some of its mathematical properties.

2.1 Main result

The following result establishes a version of the Brascamp–Lieb dualities of [19, 52, 53] for quantum relative entropies.

Theorem 2.1

(Quantum Brascamp–Lieb duality). Let \(n\in {\mathbb {N}}\), \({q}=(q_1,\ldots ,q_n)\in {\mathbb {R}}_+^n\), \({{\mathcal {E}}}=({\mathcal {E}}_1,\ldots ,{\mathcal {E}}_n)\) with \({\mathcal {E}}_k \in {{\,\textrm{TPP}\,}}(A,B_k)\) for \(k\in [n]\), \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), \({\sigma }=(\sigma _1,\ldots ,\sigma _n)\) with \(\sigma _k \in {{\,\mathrm{P_\succ }\,}}(B_k)\) for \(k\in [n]\), and \(C\in {\mathbb {R}}\). Then, the following two statements are equivalent:

$$\begin{aligned}&\displaystyle \sum _{k=1}^n q_k D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big ) \le D(\rho \Vert \sigma )+C \quad \forall \rho \in {{\,\textrm{S}\,}}(A) , \end{aligned}$$
(4)
$$\begin{aligned}&\displaystyle {{\,\textrm{tr}\,}}\exp \! \left( \! \log \sigma \!+\! \sum _{k=1}^n {\mathcal {E}}_k^\dagger (\log \omega _k)\! \right) \le \exp (C) \prod _{k=1}^n\! \left\Vert \exp \bigl (\log \omega _k + q_k \log \sigma _k \bigr ) \right\Vert _{1/q_k} \; \forall \omega _k\in {{\,\mathrm{P_\succ }\,}}(B_k) ,\qquad \end{aligned}$$
(5)

where \(\left\Vert L\right\Vert _p:= ({{\,\textrm{tr}\,}}|L|^p)^{\frac{1}{p}}\) is the Schatten p-norm for \(p\in [1,\infty ]\) and an anti-norm for \(p\in (0,1]\) (see Footnote 2). Moreover, Eq. (5) holds for all \(\omega _k\in {{\,\mathrm{P_\succ }\,}}(B_k)\) if and only if it holds for all \(\omega _k\in {{\,\textrm{S}\,}}(B_k)\) with full support.

We refer to Eq. (4) as a quantum Brascamp–Lieb inequality in entropic form, and to Eq. (5) as a quantum Brascamp–Lieb inequality in analytic form. The latter can be understood as a quantum version of a Young-type inequality. The two formulations in Eqs. (4) and (5) encompass a large class of concrete inequalities, as we will see in Sect. 3 below; we are also often interested in identifying the smallest constant \(C\in {\mathbb {R}}\) such that either inequality holds. To this end, both directions of Theorem 2.1 are of interest:

  1. To prove quantum entropy inequalities, Theorem 2.1 allows us to alternatively work with matrix exponential inequalities in the analytic form. That this approach can give crucial insights was already discovered in the original proof of strong sub-additivity of the von Neumann entropy [49], which relied on Lieb’s triple matrix inequality for the exponential function (see also [31, 60] for more recent works). We discuss similar examples in Sect. 3.3.

  2. In the commutative setting, we know that for deriving Young-type inequalities it can be beneficial to work in the entropic form [19, 21]. As the quantum relative entropy has natural properties mirroring its classical counterpart, this translates to the non-commutative setting. We discuss corresponding examples in Sect. 3.1 and Sect. 3.2.

The proof of Theorem 2.1 relies on the following formula for the Legendre transform of the quantum relative entropy and its dual.

Fact 2.2

(Variational formula for quantum relative entropy [56]) Let \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\). Then:

  • For all \(\rho \in {{\,\textrm{S}\,}}(A)\) we have

    $$\begin{aligned} D(\rho \Vert \sigma ) = \sup \limits _{\omega \in {{\,\mathrm{P_\succeq }\,}}(A)} \left\{ {{\,\textrm{tr}\,}}\rho \log \omega - \log {{\,\textrm{tr}\,}}\exp (\log \omega + \log \sigma ) \right\} . \end{aligned}$$
    (6)

    Furthermore, the supremum is attained for \(\omega = \exp (\log \rho - \log \sigma )/{{\,\textrm{tr}\,}}\exp (\log \rho - \log \sigma )\).

  • For all \(H \in {{\,\textrm{Herm}\,}}(A)\), we have

    $$\begin{aligned} \log {{\,\textrm{tr}\,}}\exp (H +\log \sigma ) = \sup _{\omega \in {{\,\textrm{S}\,}}(A)} \left\{ {{\,\textrm{tr}\,}}H \omega - D(\omega \Vert \sigma ) \right\} \, . \end{aligned}$$
    (7)

    Furthermore, the supremum is attained for \(\omega = \exp (H +\log \sigma )/{{\,\textrm{tr}\,}}\exp (H + \log \sigma )\).

These variational formulas are powerful on their own for proving quantum entropy inequalities, as, e.g., the first term in Eq. (6) only depends on \(\rho \) (but not on \(\sigma \)) and the second term only on \(\sigma \) (but not on \(\rho \)). We refer to [60] for a more detailed discussion.
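Both variational formulas can be checked numerically in small dimensions. The following Python sketch (assuming NumPy and full-rank matrices; all helper names are illustrative) evaluates the objective of Eq. (7) at the stated optimizer and at random states, and the objective of Eq. (6) at \(\omega = \exp (\log \rho - \log \sigma )\). Since the objective in Eq. (6) is invariant under rescaling of \(\omega \), the normalization of the optimizer is omitted.

```python
import numpy as np

def herm_fn(M, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling only)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def relative_entropy(omega, tau):
    """D(omega || tau) for full-rank omega and positive definite tau."""
    diff = herm_fn(omega, np.log) - herm_fn(tau, np.log)
    return float(np.trace(omega @ diff).real)

rng = np.random.default_rng(3)
d = 3
sigma = 2.5 * random_state(d, rng)  # positive definite, deliberately unnormalized
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (G + G.conj().T) / 2            # random Hermitian matrix
log_sigma = herm_fn(sigma, np.log)

# Eq. (7): log tr exp(H + log sigma) = sup over states of tr(H omega) - D(omega || sigma).
expK = herm_fn(H + log_sigma, np.exp)
lhs7 = float(np.log(np.trace(expK).real))

def objective7(omega):
    return float(np.trace(H @ omega).real) - relative_entropy(omega, sigma)

at_optimizer7 = objective7(expK / np.trace(expK).real)
trials7 = [objective7(random_state(d, rng)) for _ in range(100)]

# Eq. (6): D(rho || sigma) is attained at omega = exp(log rho - log sigma).
rho = random_state(d, rng)

def objective6(omega):
    log_omega = herm_fn(omega, np.log)
    log_partition = np.log(np.trace(herm_fn(log_omega + log_sigma, np.exp)).real)
    return float(np.trace(rho @ log_omega).real - log_partition)

omega_star = herm_fn(herm_fn(rho, np.log) - log_sigma, np.exp)
at_optimizer6 = objective6(omega_star)
```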

We mention that Carlen and Lieb use the variational characterization of the von Neumann entropy to derive Brascamp–Lieb dualities, and they comment [20, bottom of page 564] that their proof strategy extends to the relative entropy via Petz’s variational expression for the relative entropy (Fact 2.2), which is the approach taken here.

Proof of Theorem 2.1

We first show that Eq. (4) implies Eq. (5). Let \(H_k:=\log \omega _k\) and define \(H\in {{\,\textrm{Herm}\,}}(A)\) and \(\rho \in {{\,\textrm{S}\,}}(A)\) by

$$\begin{aligned} H:=\sum _{k=1}^n {\mathcal {E}}^\dagger _k(H_k) \qquad \text {and} \qquad \rho := \frac{\exp (H +\log \sigma )}{{{\,\textrm{tr}\,}}\exp (H + \log \sigma )} \ \, , \end{aligned}$$
(8)

respectively. Then,

$$\begin{aligned} \log {{\,\textrm{tr}\,}}\exp \left( \log \sigma + \sum _{k=1}^n {\mathcal {E}}^\dagger _k(H_k) \right)&= \log {{\,\textrm{tr}\,}}\exp ( H + \log \sigma ) \\&= {{\,\textrm{tr}\,}}H \rho - D(\rho \Vert \sigma ) \\&= \sum _{k=1}^n {{\,\textrm{tr}\,}}{\mathcal {E}}^\dagger _k(H_k) \rho - D(\rho \Vert \sigma ) \\&\le C + \sum _{k=1}^n q_k\Big ( {{\,\textrm{tr}\,}}\frac{H_k}{q_k} {\mathcal {E}}_k(\rho ) - D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big ) \Big ) \\&\le C + \sum _{k=1}^n q_k \log {{\,\textrm{tr}\,}}\exp \left( \frac{H_k}{q_k} + \log \sigma _k \right) \, , \end{aligned}$$

where we used Eq. (7) in both the second and the last step and Eq. (4) in the penultimate step. By substituting \(H_k = \log \omega _k\) and taking the exponential on both sides we obtain Eq. (5).

We now show that, conversely, Eq. (5) implies Eq. (4). Fix \(\rho \in {{\,\textrm{S}\,}}(A)\) and let \(\omega =\exp (H)\), with H defined as in Eq. (8) in terms of \(H_k = \log (\omega _k)\) for \(\omega _k \in {{\,\mathrm{P_\succ }\,}}(B_k)\) that we will choose later. Then, using Eq. (6),

$$\begin{aligned} D(\rho \Vert \sigma )&\ge {{\,\textrm{tr}\,}}\rho \log \omega - \log {{\,\textrm{tr}\,}}\exp (\log \omega + \log \sigma ) \\&= \sum _{k=1}^n {{\,\textrm{tr}\,}}\rho \, {\mathcal {E}}^\dagger _k(H_k) - \log {{\,\textrm{tr}\,}}\exp \left( \sum _{k=1}^n {\mathcal {E}}^\dagger _k(H_k) + \log \sigma \right) \\&= \sum _{k=1}^n {{\,\textrm{tr}\,}}{\mathcal {E}}_k(\rho ) \log \omega _k - \log {{\,\textrm{tr}\,}}\exp \left( \log \sigma + \sum _{k=1}^n {\mathcal {E}}^\dagger _k(\log \omega _k) \right) \\&\ge \sum _{k=1}^n q_k \left( {{\,\textrm{tr}\,}}{\mathcal {E}}_k(\rho ) \frac{\log \omega _k}{q_k} - \log {{\,\textrm{tr}\,}}\exp \Big ( \frac{\log \omega _k}{q_k} + \log \sigma _k \Big ) \right) -C\\&= \sum _{k=1}^n q_k D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big )-C, \end{aligned}$$

where the last inequality uses Eq. (5) and the final step follows from Eq. (6) provided we choose \(\omega _k^{1/q_k}\) as the maximizer for the variational expression of \(D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big )\). \(\square \)

Remark 2.3

As the variational characterizations from Fact 2.2 hold in the general \(W^*\)-algebra setting [56], the BL duality in Theorem 2.1 extends to separable Hilbert spaces.

Remark 2.4

The BL duality in Theorem 2.1 can be extended to \(\sigma \in {{\,\mathrm{P_\succeq }\,}}(A)\) and \({\sigma }=(\sigma _1,\ldots ,\sigma _n)\) with \(\sigma _k \in {{\,\mathrm{P_\succeq }\,}}(B_k)\) for \(k\in [n]\) when

  1. \({\mathcal {E}}_k(\rho ) \ll \sigma _k\) for all \(\rho \in {{\,\textrm{S}\,}}(A)\) with \(\rho \ll \sigma \), and

  2. \({\mathcal {E}}_k^\dagger (\log \omega _k)\ll \sigma \) for all \(\omega _k\in {{\,\mathrm{P_\succeq }\,}}(B_k)\) with \(\omega _k\ll \sigma _k\).

Then, the BL duality still holds but for the alternative conditions

$$\begin{aligned} \rho \in {{\,\textrm{S}\,}}(A)\;\text {with}\;\rho \ll \sigma \;\text {in}\,\, \mathrm{Eq.}\, (4)\quad \text { and } \quad \omega _k\in {{\,\mathrm{P_\succ }\,}}(B_k)\;\text {with}\;\omega _k\ll \sigma _k\;\text {in}\,\, (5). \end{aligned}$$

To see this, note that the variational formula in Eq. (6) still holds for \(\sigma \in {{\,\mathrm{P_\succeq }\,}}(A)\) as long as \(\rho \ll \sigma \), with the supremum taken over \(\omega \in {{\,\mathrm{P_\succeq }\,}}(A)\) with \(\omega \ll \sigma \). Similarly, Eq. (7) still holds for \(H \in {{\,\textrm{Herm}\,}}(A)\) with \(H\ll \sigma \), with the supremum taken over \(\omega \in {{\,\textrm{S}\,}}(A)\) with \(\omega \ll \sigma \). The proof of Theorem 2.1 then also goes through in the more general form.

In many important applications, we are interested in using Theorem 2.1 either in the situation that \(\sigma _k = {\mathcal {E}}_k(\sigma )\) for all \(k\in [n]\), or in a setting where \(\sigma =\mathbbm {1}_A\) and \(\sigma _k=\mathbbm {1}_{B_k}\) for all \(k\in [n]\). In the latter case, Theorem 2.1 specializes to the following equivalence between von Neumann entropy inequalities and Young-type inequalities:

Corollary 2.5

Let \(n\in {\mathbb {N}}\), \({q}=(q_1,\ldots ,q_n)\in {\mathbb {R}}_+^n\), \({{\mathcal {E}}}=({\mathcal {E}}_1,\ldots ,{\mathcal {E}}_n)\) with \({\mathcal {E}}_k \in {{\,\textrm{TPP}\,}}(A,B_k)\) for \(k\in [n]\), and \(C\in {\mathbb {R}}\). Then, the following two statements are equivalent:

$$\begin{aligned}&H(\rho ) \le \sum _{k=1}^n q_k H\big ({\mathcal {E}}_k(\rho ) \big )+C \quad \forall \rho \in {{\,\textrm{S}\,}}(A) , \end{aligned}$$
(9)
$$\begin{aligned}&{{\,\textrm{tr}\,}}\exp \Big ( \sum _{k=1}^n {\mathcal {E}}^\dagger _k(\log \omega _k) \Big ) \le \exp (C)\prod _{k=1}^n \left\Vert \omega _k\right\Vert _{1/q_k} \quad \forall \omega _k \in {{\,\textrm{S}\,}}(B_k) . \end{aligned}$$
(10)
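For orientation, the specialization from Theorem 2.1 can be spelled out as follows; this is a short sketch of the substitution \(\sigma =\mathbbm {1}_A\), \(\sigma _k=\mathbbm {1}_{B_k}\) and uses only the identity \(H(\rho ) = -D(\rho \Vert \mathbbm {1})\). On the entropic side,

$$\begin{aligned} \sum _{k=1}^n q_k D\big ({\mathcal {E}}_k(\rho ) \Vert \mathbbm {1}_{B_k} \big ) \le D(\rho \Vert \mathbbm {1}_A)+C \quad \Longleftrightarrow \quad -\sum _{k=1}^n q_k H\big ({\mathcal {E}}_k(\rho ) \big ) \le -H(\rho ) + C , \end{aligned}$$

which rearranges to Eq. (9). On the analytic side, \(\log \mathbbm {1} = 0\) gives

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \Big ( \log \mathbbm {1}_A + \sum _{k=1}^n {\mathcal {E}}^\dagger _k(\log \omega _k) \Big ) = {{\,\textrm{tr}\,}}\exp \Big ( \sum _{k=1}^n {\mathcal {E}}^\dagger _k(\log \omega _k) \Big ) \quad \text {and} \quad \left\Vert \exp \bigl (\log \omega _k + q_k \log \mathbbm {1}_{B_k} \bigr ) \right\Vert _{1/q_k} = \left\Vert \omega _k\right\Vert _{1/q_k} , \end{aligned}$$

so Eq. (5) becomes Eq. (10). The restriction to density operators \(\omega _k\) (with full support, so that \(\log \omega _k\) is defined) is justified by the last statement of Theorem 2.1.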

Carlen and Lieb previously proved a variant of Corollary 2.5 in the \(W^*\)-algebra setting assuming that the maps \({\mathcal {E}}_k^\dagger \) are \(W^*\)-homomorphisms and that \(q_k\in [0,1]\) [20, Theorem 2.2]. One interesting special case is when the \({\mathcal {E}}_k\) are partial trace maps. The entropic form Eq. (9) then corresponds to generalized sub-additivity inequalities for the von Neumann entropy (cf. Sect. 3.1).

2.2 Weighted anti-norms

In the commutative setting, the right-hand side of Eq. (5) can conveniently be understood as a product of \(\sigma _k\)-weighted norms or anti-norms of the operators \(\omega _k\) [52, 53]. It is natural to ask whether such an interpretation also holds quantumly. To this end, given \(p \in (0,1]\) and \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), define

$$\begin{aligned} {\left| \left| \left| \omega \right| \right| \right| }_{\sigma ,p} := \big ( {{\,\textrm{tr}\,}}\exp ( \log \omega ^p + \log \sigma ) \big )^{\frac{1}{p}}=\left\Vert \exp \Big (\log \omega + \frac{1}{p} \log \sigma \Big )\right\Vert _p , \end{aligned}$$

for all \(\omega \in {{\,\mathrm{P_\succ }\,}}(A)\). The following proposition, which follows readily from [41], shows that \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is an anti-norm provided that \(p\le 1\). For \(p>1\), it is easy to find \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\) such that the functional \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is neither a norm nor an anti-norm.

Proposition 2.6

For \(p \in (0,1]\) and \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is homogeneous and concave, hence an anti-norm.

Proof

Clearly, \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is homogeneous. Since moreover \(p\in (0,1]\), [41, Lemma D.1] asserts that its concavity on the set of positive matrices is equivalent to the concavity of its p-th power, i.e.,

$$\begin{aligned} \omega \mapsto {{\,\textrm{tr}\,}}\exp ( p \log \omega + H ), \end{aligned}$$
(11)

where \(H = \log \sigma \). A well-known result of Lieb [46] states that Eq. (11) is indeed concave for any Hermitian matrix H. Thus, \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is concave. As a consequence of homogeneity and concavity, we obtain that \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is super-additive, as \({\left| \left| \left| \omega +\omega '\right| \right| \right| }_{\sigma ,p} = 2 {\left| \left| \left| \frac{1}{2}\omega + \frac{1}{2}\omega '\right| \right| \right| }_{\sigma ,p} \ge {\left| \left| \left| \omega \right| \right| \right| }_{\sigma ,p} + {\left| \left| \left| \omega '\right| \right| \right| }_{\sigma ,p}\) for all \(\omega \), \(\omega '\in {{\,\mathrm{P_\succ }\,}}(A)\). We conclude that \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\) is an anti-norm. \(\square \)

Thus, the quantum Brascamp–Lieb inequality in its analytic form Eq. (5) can be written as

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \left( \log \sigma + \sum _{k=1}^n {\mathcal {E}}_k^\dagger (\log \omega _k) \right) \le \exp (C)\prod _{k=1}^n {\left| \left| \left| \omega _k\right| \right| \right| }_{\sigma _k,1/q_k}\quad \forall \omega _k\in {{\,\textrm{S}\,}}(B_k), \end{aligned}$$
(12)

where, assuming that all \(q_k \ge 1\), the right-hand side can be interpreted in terms of anti-norms, pleasantly generalizing Eq. (10).
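The two expressions in the definition of \({\left| \left| \left| \cdot \right| \right| \right| }_{\sigma ,p}\), as well as the homogeneity and super-additivity asserted in Proposition 2.6, can be probed numerically. The sketch below (assuming NumPy, positive definite inputs, and the sample value \(p=\frac{1}{2}\); function names are illustrative) does so on random matrices. It is a sanity check only and does not replace the proof.

```python
import numpy as np

def herm_fn(M, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.conj().T

def random_positive(d, rng):
    """Random positive definite matrix (illustrative sampling only)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

def weighted_antinorm(omega, sigma, p):
    """(tr exp(p log omega + log sigma))^(1/p), the first form of the definition."""
    K = p * herm_fn(omega, np.log) + herm_fn(sigma, np.log)
    return float(np.trace(herm_fn(K, np.exp)).real ** (1.0 / p))

def schatten_form(omega, sigma, p):
    """|| exp(log omega + (1/p) log sigma) ||_p, the second form of the definition."""
    L = herm_fn(herm_fn(omega, np.log) + herm_fn(sigma, np.log) / p, np.exp)
    w = np.abs(np.linalg.eigvalsh(L))
    return float(np.sum(w ** p) ** (1.0 / p))

rng = np.random.default_rng(4)
d, p = 3, 0.5
sigma = random_positive(d, rng)

forms_agree, homogeneous, rel_gaps = True, True, []
for _ in range(100):
    a, b = random_positive(d, rng), random_positive(d, rng)
    na, nb = weighted_antinorm(a, sigma, p), weighted_antinorm(b, sigma, p)
    forms_agree &= bool(np.isclose(na, schatten_form(a, sigma, p)))
    homogeneous &= bool(np.isclose(weighted_antinorm(3.0 * a, sigma, p), 3.0 * na))
    # Super-additivity: the anti-norm of a sum dominates the sum of anti-norms.
    rel_gaps.append((weighted_antinorm(a + b, sigma, p) - (na + nb)) / (na + nb))
min_rel_gap = min(rel_gaps)
```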

2.3 Convexity and tensorization

For fixed \(n\in {\mathbb {N}}\), \({{\mathcal {E}}}=({\mathcal {E}}_1,\ldots ,{\mathcal {E}}_n)\) with \({\mathcal {E}}_k \in {{\,\textrm{TPP}\,}}(A,B_k)\), \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), and \({\sigma }=(\sigma _1,\ldots ,\sigma _n)\) with \(\sigma _k \in {{\,\mathrm{P_\succ }\,}}(B_k)\), we define the Brascamp–Lieb (BL) set as

$$\begin{aligned} {{\,\textrm{BL}\,}}\Big ({{\mathcal {E}}},{\sigma },\sigma \Big ) := \Big \{\big ({q},C\big ) \in {\mathbb {R}}_+^n\times {\mathbb {R}}:~\mathrm{Eq.}\,(4)/\mathrm{Eq.}\,(5)\, \text {holds} \Big \}. \end{aligned}$$

We record the following elementary property.

Proposition 2.7

(Convexity). The set \({{\,\textrm{BL}\,}}({{\mathcal {E}}},{\sigma },\sigma )\) is convex.

Proof

We use the characterization using the entropic form Eq. (4). Let \(({q}^{(i)},C^{(i)})\in {{\,\textrm{BL}\,}}({{\mathcal {E}}},{\sigma },\sigma )\) for \(i \in \{1,2\}\). Let \(\theta \in [0,1]\) and \((q, C)\) the corresponding convex combination, i.e., \({q}:= \theta \, {q}^{(1)} + (1-\theta ) \, {q}^{(2)}\) and \(C:= \theta \, C^{(1)} + (1-\theta ) \, C^{(2)}\). Then, for all \(\rho \in {{\,\textrm{S}\,}}(A)\),

$$\begin{aligned} \sum _{k=1}^n q_k D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big )&= \theta \sum _{k=1}^n q^{(1)}_k D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big )+ (1 - \theta ) \sum _{k=1}^n q^{(2)}_k D\big ({\mathcal {E}}_k(\rho ) \Vert \sigma _k \big )\\&\le \theta \Bigl ( D(\rho \Vert \sigma ) + C^{(1)} \Bigr )+ (1 - \theta ) \Bigl ( D(\rho \Vert \sigma ) + C^{(2)} \Bigr )= D(\rho \Vert \sigma ) + C. \end{aligned}$$

Thus, \(({q},C)\in {{\,\textrm{BL}\,}}({{\mathcal {E}}},{\sigma },\sigma )\). \(\square \)

In the commutative case, the BL set satisfies a tensorization property [53, Section V.B], and we can ask if a similar property holds in the non-commutative case as well. Namely, do we have that for \(\left( {q},C^{(i)}\right) \in {{\,\textrm{BL}\,}}\Big ({{\mathcal {E}}}^{(i)},{\sigma }^{(i)},\sigma ^{(i)}\Big )\) with \(i\in \{1,2\}\) and

$$\begin{aligned} {{\mathcal {E}}}:=\left( {\mathcal {E}}_1^{(1)}\otimes {\mathcal {E}}_1^{(2)},\dots ,{\mathcal {E}}_n^{(1)}\otimes {\mathcal {E}}_n^{(2)}\right) \quad \text {as well as} \quad {\sigma }:=\left( \sigma _1^{(1)}\otimes \sigma _1^{(2)},\dots ,\sigma _n^{(1)}\otimes \sigma _n^{(2)}\right) \end{aligned}$$

that

$$\begin{aligned} \left( {q},C^{(1)}+C^{(2)}\right) {\mathop {\in }\limits ^{?}}{{\,\textrm{BL}\,}}\left( {{\mathcal {E}}},{\sigma },\sigma ^{(1)}\otimes \sigma ^{(2)}\right) . \end{aligned}$$
(13)

As we will see in several examples (Sect. 3), tensorization does not hold in general due to the potential presence of entanglement. Indeed, the problem of deciding in which cases Eq. (13) holds can be understood as a general information-theoretic additivity problem, which contains the (non-)additivity of the minimum output entropy as a special case (cf. Eq. (39) in Sect. 3.4).

3 Applications of Quantum Brascamp–Lieb Duality

The purpose of this section is to present examples from quantum information theory where the duality from Theorem 2.1 is applicable. The majority of examples concern entropy inequalities that are of interest from an operational viewpoint. Theorem 2.1 then shows that all entropy inequalities of suitable structure have a dual formulation as an analytic inequality, and vice versa. Depending on the scenario, one form may be easier to prove than the other, and we find that these reformulations often give additional insight.

3.1 Generalized (strong) sub-additivity

In this section, we discuss entropy inequalities that generalize the sub-additivity and strong sub-additivity properties of the von Neumann entropy. Recall that the latter states that \(H(AB) + H(BC) \ge H(ABC) + H(B)\) for \(\rho _{ABC}\in {{\,\textrm{S}\,}}(A \otimes B \otimes C)\) [49].

We first state the following result from [20, Theorem 1.4 & Theorem 3.1], which gives generalized sub-additivity relations and their dual analytic form. Here, the second argument in the relative entropy is always equal to the identity. Throughout this section, all quantum channels are given by partial trace channels.

Corollary 3.1

(Quantum Shearer and Loomis–Whitney inequalities, [20]). Let \(S_1\), ..., \(S_n\) be non-empty subsets of [m] such that every \(s\in [m]\) belongs to at least p of those subsets. Then, the following inequalities hold and are equivalent:

$$\begin{aligned}&H(A_1 \dots A_m) \le \frac{1}{p} \sum _{k=1}^n H(\{A_s\}_{s\in S_k}) \quad \forall \rho \in {{\,\textrm{S}\,}}(A_1 \otimes \cdots \otimes A_m) , \end{aligned}$$
(14)
$$\begin{aligned}&{{\,\textrm{tr}\,}}\exp \left( \sum _{k=1}^n \mathbbm {1}_{{\bar{S}}_k} \otimes \log \omega _{S_k} \right) \le \prod _{k=1}^n \left\Vert \omega _{S_k}\right\Vert _p \quad \forall \omega _{S_k} \in {{\,\textrm{S}\,}}(\otimes _{s\in S_k} A_s) \, , \end{aligned}$$
(15)

where \({\bar{S}}\) denotes the complement of a subset S of [m].

Inequalities in the form of Eq. (14) have been termed quantum Shearer’s inequalities and their analytic counterparts as in Eq. (15) are known as quantum Loomis-Whitney inequalities. Interestingly, and as explained in [20, Section 1.3], the latter cannot directly be deduced from standard matrix trace inequalities such as Golden–Thompson combined with Cauchy–Schwarz. That Eqs. (14) and (15) are equivalent follows from Corollary 2.5 by choosing \(C=0\), \(q_k = \frac{1}{p}\), and \({\mathcal {E}}_k(\cdot )={{\,\textrm{tr}\,}}_{\bar{S}_k}(\cdot )\). The following result provides a conditional version of the quantum Shearer inequality with side information.
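For illustration, both forms can be tested numerically for three qubits with \(S_1=\{1,2\}\), \(S_2=\{2,3\}\), \(S_3=\{1,3\}\) and \(p=2\). The Python sketch below (assuming NumPy and full-rank random states; all names, including the helper `embed13`, are hypothetical) computes the gap in Eq. (14) and the gap in Eq. (15). It samples random instances only and is not a substitute for the proof.

```python
import numpy as np

def herm_fn(M, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * fn(w)) @ V.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling only)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]
    return float(-np.sum(w * np.log(w)))

I2 = np.eye(2)

def shearer_gap(rho):
    """(1/2)[H(A1A2) + H(A2A3) + H(A1A3)] - H(A1A2A3) for a three-qubit state."""
    T = rho.reshape(2, 2, 2, 2, 2, 2)
    r12 = np.einsum('abcdec->abde', T).reshape(4, 4)  # trace out qubit 3
    r23 = np.einsum('abcaef->bcef', T).reshape(4, 4)  # trace out qubit 1
    r13 = np.einsum('abcdbf->acdf', T).reshape(4, 4)  # trace out qubit 2
    return 0.5 * (entropy(r12) + entropy(r23) + entropy(r13)) - entropy(rho)

def embed13(L13):
    """Embed an operator on qubits (1,3) into qubits (1,2,3), acting as identity on qubit 2."""
    T = np.kron(L13, I2).reshape(2, 2, 2, 2, 2, 2)  # tensor factors ordered (1,3,2)
    return T.transpose(0, 2, 1, 3, 5, 4).reshape(8, 8)

def loomis_whitney_gap(w12, w23, w13):
    """Right-hand side minus left-hand side of Eq. (15) with p = 2."""
    K = (np.kron(herm_fn(w12, np.log), I2)
         + np.kron(I2, herm_fn(w23, np.log))
         + embed13(herm_fn(w13, np.log)))
    lhs = np.trace(herm_fn(K, np.exp)).real
    rhs = np.prod([np.sqrt(np.trace(w @ w).real) for w in (w12, w23, w13)])
    return float(rhs - lhs)

rng = np.random.default_rng(5)
min_shearer = min(shearer_gap(random_state(8, rng)) for _ in range(50))
min_lw = min(loomis_whitney_gap(*(random_state(4, rng) for _ in range(3)))
             for _ in range(50))
```

For maximally mixed \(\omega _{S_k}\), both sides of Eq. (15) equal \(\frac{1}{8}\) in this example, so the inequality is saturated there.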

Proposition 3.2

(Conditional quantum Shearer inequality). Let \(S_1\), ..., \(S_n\) be non-empty subsets of [m] such that every \(s\in [m]\) belongs to exactly p of those subsets. Then,

$$\begin{aligned} H(A_1 \dots A_m|B)&\le \frac{1}{p} \sum _{k=1}^n H(\{A_s\}_{s\in S_k}|B) \quad \forall \rho \in {{\,\textrm{S}\,}}(A_1 \otimes \cdots \otimes A_m \otimes B) \, . \end{aligned}$$
(16)

For \(m=n=2\), \(S_1 = \{1\}\), \(S_2 = \{2\}\), \(p=1\), Eq. (16) reduces to \(H(A_1 A_2 | B) \le H(A_1 | B) + H(A_2 | B)\), which is equivalent to the strong sub-additivity of the von Neumann entropy (see Footnote 3).

Note that, in contrast to Corollary 3.1, in the conditional case it is not enough to assume that every \(s\in [m]\) belongs to at least p of the subsets. This is clear from the following proof. For a concrete counterexample, note that for \(m=2\), \(n=3\), \(S_1=S_2=\{1\}\), \(S_3=\{2\}\), \(p=1\), Eq. (16) is violated for, e.g., a maximally entangled state between \(A_1\) and B.

Proof of Corollary 3.1 and Proposition 3.2

We adapt the argument of [20] to the conditional case. If S and T are two subsets of [m] then strong sub-additivity implies that

$$\begin{aligned} H(\{A_s\}_{s\in S \cup T}|B) + H(\{A_s\}_{s\in S \cap T}|B) \le H(\{A_s\}_{s\in S}|B) + H(\{A_s\}_{s\in T}|B) \, . \end{aligned}$$

This means that we obtain a stronger version of Eq. (16) if we replace any two subsets \(S_k\), \(S_l\) by \(S_k \cup S_l\), \(S_k \cap S_l\). Moreover, each such replacement preserves the number of times that any \(s\in [m]\) is contained in the subsets \(S_1,\dots ,S_n\). We can successively apply replacement steps until we arrive at the situation where \(S_k \subseteq S_l\) or \(S_l \subseteq S_k\) for any two subsets. Without loss of generality, this means that it suffices to prove Corollary 3.1 and Proposition 3.2 in the case that \(S_1 \supseteq \dots \supseteq S_n\). In this case, \(S_1 = \dots = S_p = [m]\), since each \(s\in [m]\) is contained in at least p of the subsets. The corresponding inequality Eq. (16) can thus be simplified to

$$\begin{aligned} 0 \le \sum _{k=p+1}^n H(\{A_s\}_{s\in S_k}|B). \end{aligned}$$

If \(B=\emptyset \), as in Corollary 3.1, this inequality holds since the von Neumann entropy is never negative. And if each \(s\in [m]\) belongs to exactly p of the subsets, as in Proposition 3.2, then \(S_{p+1} = \dots = S_n = \emptyset \), so the inequality holds trivially. \(\square \)
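The replacement procedure used in the proof can be made concrete. The following Python sketch (standard library only; the function names are illustrative) repeatedly replaces a non-nested pair \(S_k\), \(S_l\) by \(S_k \cup S_l\), \(S_k \cap S_l\) until the family is a chain, and records how often each element is covered. The loop terminates because each replacement of a non-nested pair strictly increases \(\sum _k |S_k|^2\), which is bounded.

```python
from collections import Counter

def is_nested(S, T):
    """True if one of the two sets contains the other."""
    return S <= T or T <= S

def uncross(subsets):
    """Replace non-nested pairs (S, T) by (S | T, S & T) until the family is a chain.
    Returns the chain sorted by decreasing size; empty sets may appear."""
    family = [frozenset(S) for S in subsets]
    while True:
        pair = next(((i, j)
                     for i in range(len(family))
                     for j in range(i + 1, len(family))
                     if not is_nested(family[i], family[j])), None)
        if pair is None:
            break
        i, j = pair
        S, T = family[i], family[j]
        family[i], family[j] = S | T, S & T
    return sorted(family, key=len, reverse=True)

def multiplicities(family):
    """Number of subsets in the family containing each element."""
    return Counter(s for S in family for s in S)

# Example with m = 3 and p = 2: every element lies in exactly two subsets.
original = [{1, 2}, {2, 3}, {1, 3}]
chain = uncross(original)
```

In this example the chain consists of two copies of \([3]\) followed by the empty set, matching the structure \(S_1=\dots =S_p=[m]\), \(S_{p+1}=\dots =S_n=\emptyset \) used at the end of the proof.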

Remark 3.3

Corollary 3.1 and Proposition 3.2 also hold for separable Hilbert spaces, as the variational characterizations from Fact 2.2 hold in the general \(W^*\)-algebra setting [56].

3.2 Brascamp–Lieb inequalities for Gaussian quantum operations

In this section, we present quantum versions of the classical Brascamp–Lieb inequalities as in Eqs. (1) and (2), where probability distributions on \({\mathbb {R}}^m\) are replaced by quantum states on \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m)\), the Hilbert space of square-integrable wave functions on \({\mathbb {R}}^m\). We focus on the geometric case discussed in the introduction. The marginal distribution with respect to a subspace \(X\subseteq {\mathbb {R}}^m\) has the following natural quantum counterpart. Define a TPCP map \({\mathcal {E}}_X\) as the composition of the unitary \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m) \cong {{\,\textrm{L}\,}}^2(X) \otimes {{\,\textrm{L}\,}}^2(X^\perp )\) with the partial trace over the second tensor factor. Given a density operator \(\rho \) on \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m)\), we can think of

$$\begin{aligned} \rho _X={\mathcal {E}}_X(\rho ) \end{aligned}$$

as the reduced density operator corresponding to X. This is the natural quantum version of the marginal probability density in Eq. (3) of the introduction. Indeed, if we identify \(\rho \) with its kernel in \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m \times {\mathbb {R}}^m)\), and likewise for \(\rho _X\), then we have the completely analogous formula

$$\begin{aligned} \rho _X(y,y') = \int _{X^\perp } \rho \bigl (y + z, y' + z\bigr ) \, \textrm{d}z \qquad \forall y, y' \in X \, . \end{aligned}$$

This definition is very similar in spirit to the quantum addition operation in the quantum entropy power inequality of [43] (see also [28, 44]) and in fact contains the latter as a special case. In linear optical terms, \(\rho _X\) can be interpreted as the reduced state of \(\dim X\) many output modes obtained by subjecting \(\rho \) to a network of beamsplitters with arbitrary transmissivities.

The following result establishes quantum versions of the Brascamp–Lieb dualities as in Eqs. (1) and (2) for the geometric case.

Proposition 3.4

(Geometric quantum Brascamp–Lieb inequalities). Let \(X_1,\dots ,X_n \subseteq {\mathbb {R}}^m\) be subspaces and let \(q_1,\dots ,q_n\ge 0\) such that \(\sum _{k=1}^n q_k \Pi _k = \mathbbm {1}_{{\mathbb {R}}^m}\), where \(\Pi _k\) denotes the orthogonal projection onto \(X_k\). Then, for all \(\rho \in {{\,\textrm{S}\,}}({{\,\textrm{L}\,}}^2({\mathbb {R}}^m))\) with finite second moments,

$$\begin{aligned} H(\rho )&\le \sum _{k=1}^n q_k H(\rho _{X_k}) . \end{aligned}$$
(17)

Furthermore, for all \(\omega _{X_k} \in {{\,\textrm{S}\,}}({{\,\textrm{L}\,}}^2(X_k))\) such that \(\exp \Bigl ( \sum _{k=1}^n \mathbbm {1}_{X_k^\perp } \otimes \log \omega _{X_k} \Bigr )\) has finite second moments, it holds that (see Footnote 4)

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \Big ( \sum _{k=1}^n \mathbbm {1}_{X_k^\perp } \otimes \log \omega _{X_k} \Big )&\le \prod _{k=1}^n \left\Vert \omega _{X_k}\right\Vert _{1/q_k}. \end{aligned}$$
(18)

Note that if \(X_k\) is spanned by a subset \(S_k\subseteq [m]\) of the m coordinates of \({\mathbb {R}}^m\), then \(\rho _{X_k}\) is nothing but the reduced density matrix of subsystems \(S_k\), which appears on the right-hand side of the quantum Shearer inequality Eq. (14). Thus, Proposition 3.4 implies Corollary 3.1 in the case that all \(s\in [m]\) are contained in exactly p of the subsets \(S_k\).
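As a small numerical illustration of the geometric condition \(\sum _{k} q_k \Pi _k = \mathbbm {1}_{{\mathbb {R}}^m}\), the following Python sketch (assuming NumPy; the helper names `projector` and `is_geometric` are ours and purely illustrative) checks two standard configurations: the coordinate planes in \({\mathbb {R}}^3\) with \(q_k=1/2\), and three lines in \({\mathbb {R}}^2\) at mutual angles of 120 degrees with \(q_k=2/3\).

```python
import numpy as np

def projector(spanning_vectors):
    """Orthogonal projection onto the span of linearly independent vectors."""
    B = np.asarray(spanning_vectors, dtype=float).T   # columns span the subspace
    Q, _ = np.linalg.qr(B)                            # orthonormal basis (reduced QR)
    return Q @ Q.T

def is_geometric(subspaces, weights, tol=1e-12):
    """Return True if sum_k q_k * Pi_k equals the identity."""
    m = len(subspaces[0][0])
    total = sum(q * projector(X) for X, q in zip(subspaces, weights))
    return bool(np.allclose(total, np.eye(m), atol=tol))

e1, e2, e3 = np.eye(3)
loomis_whitney = [[e1, e2], [e1, e3], [e2, e3]]           # coordinate planes, q_k = 1/2

angles = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]
three_lines = [[[np.cos(a), np.sin(a)]] for a in angles]  # lines in R^2, q_k = 2/3

print(is_geometric(loomis_whitney, [0.5] * 3))
print(is_geometric(three_lines, [2 / 3] * 3))
```

Taking the trace of the checked identity gives \(\sum _k q_k \dim X_k = m\) in both examples (\(3\cdot \tfrac{1}{2}\cdot 2 = 3\) and \(3\cdot \tfrac{2}{3}\cdot 1 = 2\)).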

To establish Proposition 3.4, we will first prove the entropic form Eq. (17) using a quantum version of the heat flow approach from [8, 21] (cf. the recent works [22,23,24] on entropy inequalities for quantum Markov semigroups). We assume some familiarity with Gaussian quantum systems (see, e.g., [40]) and follow the framework of König and Smith [43]. That framework relies on regularity assumptions on the quantum state, which were subsequently removed by De Palma and Trevisan [28].

Let \(X \subseteq {\mathbb {R}}^m\) be a subspace and \(m_X\) its dimension. For all \(x\in X\), define position and momentum operators on \({{\,\textrm{L}\,}}^2(X)\) by \((Q_{X,x} \psi )(y):= (x \cdot y) \psi (y)\) and \(P_{X,x}:= -\textrm{i}\partial _x\). Denote by \({\mathcal {N}}_t^X\) the non-commutative heat flow or heat semigroup [28, 43], which is a one-parameter semigroup, meaning \({\mathcal {N}}_0^X = \mathbbm {1}\) and \({\mathcal {N}}_s^X \circ {\mathcal {N}}_t^X = {\mathcal {N}}_{s+t}^X\) for \(s,t\ge 0\). On a suitable domain it is generated by

$$\begin{aligned} {\mathcal {L}}_X(\rho ) := - \frac{1}{4} \sum _{j=1}^{m_X} \Big ( [Q_{X,e_j}, [Q_{X,e_j}, \rho ]] + [P_{X,e_j}, [P_{X,e_j}, \rho ]] \Big ), \end{aligned}$$

where \(\{e_j\}_{j=1}^{m_X}\) is an arbitrary orthonormal basis of X (but we will not directly use this specific form). For every \(t\ge 0\), \({\mathcal {N}}^X_t\) is a Gaussian TPCP map, hence fully determined by its action on covariance matrices and mean vectors (see Footnote 5), which is given by

$$\begin{aligned} \Sigma \mapsto \Sigma + t \mathbbm {1}\quad \text {and}\quad \mu \mapsto \mu \, . \end{aligned}$$
(19)

In particular, the heat flow is independent of the choice of orthonormal basis in X. The generalized partial trace maps \({\mathcal {E}}_X:\rho \mapsto \rho _X\) defined above are also Gaussian and act by

$$\begin{aligned} \Sigma \mapsto \Sigma |_X \quad \text {and}\quad \mu \mapsto \mu |_X \, , \end{aligned}$$
(20)

where \(\mu |_X\) denotes the restriction of \(\mu \) onto \(X\oplus X\) and likewise for \(\Sigma |_X\). The non-commutative heat flow is compatible with the maps \({\mathcal {E}}_X\), i.e.,

$$\begin{aligned} {\mathcal {E}}_X \circ {\mathcal {N}}_t^{{\mathbb {R}}^m} = {\mathcal {N}}_t^X \circ {\mathcal {E}}_X. \end{aligned}$$

Indeed, since both channels (and hence their composition) are Gaussian, it suffices to verify that the action commutes on the level of mean vectors and covariance matrices, and the latter is clear from Eqs. (19) and (20). See also [28, Lemma 2]. Thus, we may unambiguously introduce the notation

$$\begin{aligned} \rho ^{(t)}_X := {\mathcal {E}}_X\big ({\mathcal {N}}_t^{{\mathbb {R}}^m}(\rho )\big ) = \mathcal N_t^X\big ({\mathcal {E}}_X(\rho )\big ) = {\mathcal {N}}_t^X\big (\rho _X\big ) \end{aligned}$$
(21)

for the reduced density operator on \({{\,\textrm{L}\,}}^2(X)\) at time t. Similarly, we may show that \({\mathcal {E}}_X\) is compatible with phase-space translations (cf. [43, Lemma XI.1]). For \(x\in X\), define the unitary one-parameter groups

$$\begin{aligned} {\mathcal {Q}}_{X,x}^{(t)}(\rho ) := \textrm{e}^{-\textrm{i}t P_{X,x}} \rho \, \textrm{e}^{\textrm{i}t P_{X,x}} \quad \text {and}\quad {\mathcal {P}}_{X,x}^{(t)}(\rho ) := \textrm{e}^{\textrm{i}t Q_{X,x}} \rho \, \textrm{e}^{-\textrm{i}t Q_{X,x}}. \end{aligned}$$

They are Gaussian, leave the covariance matrices invariant, and send mean vectors \(\mu \mapsto \mu + t (x^T,0)\) and \(\mu \mapsto \mu + t (0,x^T)\), respectively. By comparing with Eq. (20), we find that

$$\begin{aligned} {\mathcal {Q}}_{X,x}^{(t)} \circ {\mathcal {E}}_X = {\mathcal {E}}_X \circ {\mathcal {Q}}_{{\mathbb {R}}^m,x}^{(t)} \quad \text {and}\quad {\mathcal {P}}_{X,x}^{(t)} \circ {\mathcal {E}}_X = {\mathcal {E}}_X \circ {\mathcal {P}}_{{\mathbb {R}}^m,x}^{(t)} \, . \end{aligned}$$
(22)
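Since all maps involved are Gaussian, the compatibility relations can be checked purely on first and second moments. The following Python sketch (assuming NumPy) does this for a random subspace. The phase-space ordering (positions first, then momenta), the normalization in which \(\Sigma \ge \mathbbm {1}\) is a valid covariance matrix, and all variable names are our own illustrative assumptions rather than part of the framework of [43].

```python
import numpy as np

rng = np.random.default_rng(0)
m, m_X, t = 4, 2, 0.7

# Columns of B: an orthonormal basis of a random m_X-dimensional subspace X.
B, _ = np.linalg.qr(rng.normal(size=(m, m_X)))
R = np.kron(np.eye(2), B)            # embeds X (+) X into R^m (+) R^m

def restrict_cov(Sigma):             # Eq. (20) on covariance matrices
    return R.T @ Sigma @ R

def restrict_mean(mu):               # Eq. (20) on mean vectors
    return R.T @ mu

def heat(Sigma, t):                  # Eq. (19)
    return Sigma + t * np.eye(Sigma.shape[0])

A = rng.normal(size=(2 * m, 2 * m))
Sigma = np.eye(2 * m) + A @ A.T      # assumed-valid covariance matrix (Sigma >= 1)
mu = rng.normal(size=2 * m)

# Heat flow commutes with restriction to X.
heat_ok = np.allclose(restrict_cov(heat(Sigma, t)), heat(restrict_cov(Sigma), t))

# Position translation by x in X commutes with restriction, cf. Eq. (22).
c = rng.normal(size=m_X)             # coordinates of x in the basis B
x = B @ c
shift_big = np.concatenate([x, np.zeros(m)])
shift_small = np.concatenate([c, np.zeros(m_X)])
shift_ok = np.allclose(restrict_mean(mu + t * shift_big),
                       restrict_mean(mu) + t * shift_small)

print(heat_ok, shift_ok)
```

Both checks reduce to \(B^T B = \mathbbm {1}_X\), which is exactly why the heat flow and the translations descend to the subspace.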

In the following we shall make use of two crucial properties of the heat flow that will allow us to ‘linearize’ the proof of the entropy inequality: First, the entropy of \(\rho _X^{(t)}\) grows logarithmically as \(t \rightarrow \infty \) and becomes asymptotically independent of the state \(\rho \), as proved in [43, Corollary III.4] and [28, Theorem 5]:

$$\begin{aligned} \big |H\big (\rho ^{(t)}_X\big ) / m_X - (1 - \log 2 + \log t) \big | \rightarrow 0. \end{aligned}$$
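For a single mode (\(m_X=1\)) the asymptotics can be made explicit. The Python sketch below assumes a one-mode thermal state with covariance matrix \(\nu _0 \mathbbm {1}\), in the normalization where the vacuum has \(\nu _0=1\), and uses the standard Gaussian entropy formula \(g(N)=(N+1)\log (N+1)-N\log N\) with \(N=(\nu -1)/2\). These conventions and the function names are assumptions made for illustration.

```python
import math

def g(N):
    """Entropy (in nats) of a one-mode thermal state with mean photon number N."""
    if N <= 0.0:
        return 0.0
    return (N + 1.0) * math.log(N + 1.0) - N * math.log(N)

def entropy_after_heat_flow(nu0, t):
    """Entropy after time t; Eq. (19) maps nu0 * 1 to (nu0 + t) * 1."""
    nu = nu0 + t
    return g((nu - 1.0) / 2.0)

def gap(nu0, t):
    """Distance between the entropy and the state-independent asymptote."""
    return abs(entropy_after_heat_flow(nu0, t) - (1.0 - math.log(2.0) + math.log(t)))

for t in (1e2, 1e4, 1e6):
    print(t, gap(3.0, t))
```

The printed gaps shrink as \(t\) grows, in line with the displayed limit, which is the only input used in what follows.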

In particular, this implies that any valid inequality of the form Eq. (17) must satisfy the inequality

$$\begin{aligned} \sum _{k=1}^n q_k m_{X_k} \ge m \, , \end{aligned}$$
(23)

since this is precisely equivalent to the validity of Eq. (17) as \(t\rightarrow \infty \). For us, this condition follows by taking the trace on both sides of our assumption that \(\sum _{k=1}^n q_k \Pi _k = \mathbbm {1}_{{\mathbb {R}}^m}\). To state the second property of the heat flow that we will need, we momentarily assume sufficient regularity of the states under consideration, following [43]. Then, the Fisher information of a one-parameter family of states \(\big \{\sigma ^{(s)}\big \}\) is defined as

$$\begin{aligned} J\big (\big \{\sigma ^{(s)}\big \}\big ) := \partial ^2_{s} D\big (\sigma ^{(0)}\big \Vert \sigma ^{(s)}\big )\big |_{s=0}. \end{aligned}$$

It satisfies the following version of the data processing inequality [43, Theorem IV.4]: For any TPCP map \(\mathcal E\),

$$\begin{aligned} J\big (\big \{{\mathcal {E}}\big (\sigma ^{(s)}\big )\big \}\big ) \le J\big (\big \{\sigma ^{(s)}\big \}\big ) \, . \end{aligned}$$
(24)

For a covariant family of the form \(\sigma ^{(s,K)}:= \textrm{e}^{\textrm{i}s K} \sigma \, \textrm{e}^{-\textrm{i}s K}\), the Fisher information can be computed as [43, Lemma IV.5]

$$\begin{aligned} J\big (\{\sigma ^{(s,K)}\}\big ) = {{\,\textrm{tr}\,}}\sigma [K,[K,\log \sigma ]]. \end{aligned}$$
(25)

We can now state the quantum de Bruijn identity [43, Theorem V.1], which computes the derivative of the entropy along the heat flow in terms of the Fisher information:

$$\begin{aligned} \partial _t H\big (\rho ^{(t)}_X\big ) = \frac{1}{4} J\big (\rho ^{(t)}_X\big ) \, , \end{aligned}$$
(26)

where the total Fisher information \(J(\sigma _X)\) of a state \(\sigma _X\) on \({{\,\textrm{L}\,}}^2(X)\) is defined by

$$\begin{aligned} J(\sigma _X) := \sum _{j=1}^{m_X} J\big (\big \{\sigma ^{(s,Q_{X,e_j})}\big \}\big ) + J\big (\big \{\sigma ^{(s,P_{X,e_j})}\big \}\big )\, , \end{aligned}$$
(27)

for an arbitrary orthonormal basis \(\{e_j\}_{j=1}^{m_X}\) of X. While above we assumed regularity, the Fisher information \(J(\sigma _X)\) can be defined for any state with finite second moments, and the de Bruijn identity (26) generalizes as well [28, Definition 7 & Proposition 1] (see Footnote 6).

Proof of Proposition 3.4

We first prove the entropic Eq. (17) by considering \(\rho ^{(t)}:= \rho ^{(t)}_{{\mathbb {R}}^m}\) for \(t\ge 0\). As \(t\rightarrow \infty \), Eq. (17) holds up to arbitrarily small error, as explained below Eq. (23). To show its validity at \(t=0\), we would therefore like to argue that \(\partial _t H(\rho ^{(t)}) \ge \sum _{k=1}^n q_k \, \partial _t H\big (\rho ^{(t)}_{X_k}\big )\) for all \(t\ge 0\). In view of the de Bruijn identity Eq. (26) and the notation of Eq. (21), it suffices to establish the following super-additivity property of the Fisher information for all states \(\sigma \) on \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m)\) with finite second moments:

$$\begin{aligned} \sum _{k=1}^n q_k J(\sigma _{X_k}) \le J(\sigma ) \, . \end{aligned}$$
(28)

We first prove this under the regularity assumptions of [43], so that Eq. (25) applies. We will abbreviate \(Q_j:= Q_{{\mathbb {R}}^m,e_j}\) and \(P_j:= P_{{\mathbb {R}}^m,e_j}\), where \(\{e_j\}_{j=1}^m\) is the standard basis of \({\mathbb {R}}^m\). For all \(x \in X_k\), it holds that

$$\begin{aligned} J\big (\big \{\sigma _{X_k}^{(s,Q_{X_k,x})}\big \}\big )&= J\big (\big \{{\mathcal {P}}^{(s)}_{X_k,x}({\mathcal {E}}_{X_k}(\sigma ))\big \}\big ) \\&= J\big (\big \{{\mathcal {E}}_{X_k}({\mathcal {P}}^{(s)}_{{\mathbb {R}}^m,x}(\sigma ))\big \}\big ) \\&\le J\big (\big \{{\mathcal {P}}^{(s)}_{{\mathbb {R}}^m,x}(\sigma )\big \}\big ) \\&= {{\,\textrm{tr}\,}}\sigma [Q_{{\mathbb {R}}^m,x}, [Q_{{\mathbb {R}}^m,x}, \log \sigma ]] \\&= \sum _{j,j'=1}^m x_j x_{j'} {{\,\textrm{tr}\,}}\sigma [Q_j, [Q_{j'}, \log \sigma ]] \\&= \sum _{j,j'=1}^m \big (x x^T\big )_{j,j'} {{\,\textrm{tr}\,}}\sigma [Q_j, [Q_{j'}, \log \sigma ]] \, , \end{aligned}$$

where the second step is by the compatibility of phase-space translations and generalized partial trace, Eq. (22), the third step uses the data-processing inequality for the Fisher information, Eq. (24), the fourth step follows from Eq. (25), and the last two steps expand \(Q_{{\mathbb {R}}^m,x} = \sum _{j=1}^m x_j Q_j\). If we apply the same argument to \(J\big (\big \{\sigma _{X_k}^{(s,P_{X_k,x})}\big \}\big )\) and sum both inequalities over an orthonormal basis \(\{x\}\) of \(X_k\), we obtain

$$\begin{aligned} J(\sigma _{X_k}) \le J(\sigma , \Pi _k) \, , \end{aligned}$$
(29)

where we used the shorthand \(J(\sigma , A):= \sum _{j,j'=1}^m A_{j,j'} \bigl ( {{\,\textrm{tr}\,}}\sigma [Q_j, [Q_{j'}, \log \sigma ]] + {{\,\textrm{tr}\,}}\sigma [P_j, [P_{j'}, \log \sigma ]] \bigr )\) for any positive semidefinite \(m\times m\) matrix A. Note that \(J(\sigma ,A)\) is linear in A. Thus, our assumption that \(\sum _{k=1}^n q_k \Pi _k = \mathbbm {1}_{{\mathbb {R}}^m}\) (with all \(q_k\ge 0\)) implies the desired inequality:

$$\begin{aligned} \sum _k q_k \, J(\sigma _{X_k}) \le \sum _k q_k \, J(\sigma , \Pi _{k}) = J(\sigma , \sum _k q_k \Pi _{k}) = J(\sigma , \mathbbm {1}_{{\mathbb {R}}^m}) = J(\sigma ). \end{aligned}$$
(30)

This establishes Eq. (28) and hence Eq. (17) for states that are sufficiently regular. While Eq. (25) need not apply in general, the Fisher information \(J(\sigma )\) and the de Bruijn identity (26) have been generalized to arbitrary states with finite second moments [28], as discussed above. The quantity \(J(\sigma ,A)\) can be defined in the same manner so that Eqs. (29) and (30) hold verbatim; see [29, Definition 6, Propositions 6 & 9] and Footnote 7.

The analytic form in Eq. (18) then follows from a slight extension of Theorem 2.1, or more specifically the special case discussed in Corollary 2.5. Namely, we need to incorporate on the entropic side the finite second moment assumption from Eq. (17). By inspection, the variational formulae from Lemma 2.2, when applied to operators with finite second moments, still hold if the respective suprema are taken only over operators with finite second moments. Hence, following the proof of the BL duality in Theorem 2.1, we can still go from the entropic to the analytic form when assuming that the operator in exponential form on the left-hand side of the analytic form has finite second moments. \(\square \)
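For Gaussian states, Eq. (17) can be tested numerically at the level of covariance matrices. The sketch below (Python with NumPy) is only a sanity check under stated assumptions: covariance matrices are ordered as positions followed by momenta, any \(\Sigma \ge \mathbbm {1}\) is taken as a valid covariance matrix, and the entropy is computed from the symplectic eigenvalues \(\nu _j\) as \(\sum _j g((\nu _j-1)/2)\). The function names are ours.

```python
import numpy as np

def symplectic_eigenvalues(Sigma):
    m = Sigma.shape[0] // 2
    Omega = np.block([[np.zeros((m, m)), np.eye(m)],
                      [-np.eye(m), np.zeros((m, m))]])
    ev = np.abs(np.linalg.eigvals(1j * Omega @ Sigma))  # eigenvalues come in pairs +-nu
    return np.sort(ev)[::2]

def gaussian_entropy(Sigma):
    nu = symplectic_eigenvalues(Sigma)
    N = np.maximum((nu - 1.0) / 2.0, 1e-15)   # mean photon numbers, guarded away from 0
    return float(np.sum((N + 1) * np.log(N + 1) - N * np.log(N)))

def restrict(Sigma, B):
    """Covariance matrix of rho_X, cf. Eq. (20); columns of B are an orthonormal basis of X."""
    R = np.kron(np.eye(2), B)
    return R.T @ Sigma @ R

def bl_gap(Sigma, bases, weights):
    """sum_k q_k H(rho_{X_k}) - H(rho), which Eq. (17) asserts is nonnegative."""
    rhs = sum(q * gaussian_entropy(restrict(Sigma, B)) for B, q in zip(bases, weights))
    return rhs - gaussian_entropy(Sigma)

I3 = np.eye(3)
planes = [I3[:, [0, 1]], I3[:, [0, 2]], I3[:, [1, 2]]]   # coordinate planes, q_k = 1/2

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
Sigma = np.eye(6) + A @ A.T
print(bl_gap(Sigma, planes, [0.5] * 3))
```

For a product of identical thermal states, \(\Sigma = \nu \mathbbm {1}\), the gap vanishes, so Eq. (17) is saturated there.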

While the preceding discussion was restricted to the geometric case, we can also consider the general case of surjective linear maps \(L_k:{\mathbb {R}}^m \rightarrow {\mathbb {R}}^{m_k}\), as in Sect. 1. For this, write \(L_k\) as the composition of an invertible map \(M_k\in {{\,\textrm{GL}\,}}(m)\) and the projection onto the first \(m_k\) coordinates. Define a unitary operator \(U_k\) on \({{\,\textrm{L}\,}}^2({\mathbb {R}}^m)\) by \((U_k g)(x):= g(M_k^{-1} x) / \sqrt{|\det M_k|}\). Then, \({\mathcal {E}}_k(\rho ):= {{\,\textrm{tr}\,}}_{m_k+1,\dots ,m}(U_k \rho U_k^\dagger )\) defines a TPCP map that is the natural quantum version of the marginalization \(g\mapsto g_k\) (same notation as in Eq. (2)). We leave it for future work to determine under which conditions such quantum Brascamp–Lieb inequalities hold in general.

Note added: In follow-up work, Eq. (17) from Proposition 3.4 has been extended to the conditional case with side information [50, Theorem 7.3] for Gaussian states, based on [44]. Subsequently, the latter assumption was removed by De Palma and Trevisan [29], who further generalized Proposition 3.4 and also fully resolved the aforementioned question.

3.3 Entropic uncertainty relations

In this section, we explain how the duality of Theorem 2.1 and Corollary 2.5 offers an elegant way to prove entropic uncertainty relations (cf. the related work [57]). In order to compare our uncertainty bounds with the previous literature, we work in the current subsection with the explicit logarithm function relative to base two.

Example 3.5

(Maassen–Uffink). For \(\rho _A\in {{\,\textrm{S}\,}}(A)\) the Maassen–Uffink entropic uncertainty relation [54] for two arbitrary basis measurements,

$$\begin{aligned} M_{{\mathbb {X}}}(\cdot )=\sum _x\langle x|\cdot |x\rangle |x\rangle \langle x|_X \; \quad \text {and}\; \quad M_{{\mathbb {Z}}}(\cdot )=\sum _z\langle z|\cdot |z\rangle |z\rangle \langle z|_Z , \end{aligned}$$

asserts in its strengthened form [11] that

$$\begin{aligned} H(X) + H(Z) \ge -\log c(X,Z) + H(A)\quad \text {with}\quad c(X,Z):=\max _{x,z} | \! \left\langle x|z \right\rangle \! |^2. \end{aligned}$$
(31)

The constant \(c(X,Z)\) is tight in the sense that there exist quantum states that achieve equality for certain measurement maps. Equation (10) of Corollary 2.5 for \(n=2\), \(q_1=q_2=1\), \({\mathcal {E}}_1=M_{{\mathbb {X}}}\), and \({\mathcal {E}}_2=M_{{\mathbb {Z}}}\) then immediately gives the equivalent analytic form

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \left( M_{{\mathbb {X}}}^\dagger ( \log \omega _1) + M_{{\mathbb {Z}}}^\dagger (\log \omega _2) \right) \le c(X,Z) \quad \forall \omega _1,\omega _2\in {{\,\textrm{S}\,}}(A)\, . \end{aligned}$$
(32)

In other words, in order to prove Eq. (31) it suffices to show Eq. (32). Now, since the logarithm is operator concave and \(M_{{\mathbb {X}}}^\dagger \) is a unital map, the operator Jensen inequality [36] implies

$$\begin{aligned} M_{{\mathbb {X}}}^\dagger (\log \omega _1) \le \log M_{{\mathbb {X}}}^\dagger (\omega _1). \end{aligned}$$

Together with the monotonicity of \(X \mapsto {{\,\textrm{tr}\,}}\exp (X)\) [18, Theorem 2.10] and the Golden–Thompson inequality (see Footnote 8) [34, 61], this establishes the analytic form of Eq. (32):

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \left( M_{{\mathbb {X}}}^\dagger ( \log \omega _1) + M_{{\mathbb {Z}}}^\dagger (\log \omega _2) \right)&\le {{\,\textrm{tr}\,}}\exp \left( \log M_{{\mathbb {X}}}^\dagger (\omega _1) + \log M_{{\mathbb {Z}}}^\dagger (\omega _2) \right) \\&\le {{\,\textrm{tr}\,}}M_{{\mathbb {X}}}^\dagger (\omega _1) M_{{\mathbb {Z}}}^\dagger (\omega _2)\\&\le c(X,Z). \end{aligned}$$

Thus, the entropic Maassen–Uffink relation Eq. (31) follows from our Corollary 2.5.
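The chain of inequalities above is easy to probe numerically. The following Python sketch (assuming NumPy) takes the \({\mathbb {X}}\) basis to be the standard basis and the \({\mathbb {Z}}\) basis to be the columns of a random unitary, and restricts to diagonal \(\omega _1,\omega _2\) with diagonals `p` and `r`. By the operator Jensen step this is the extremal case, but the restriction and all names are our illustrative choices.

```python
import numpy as np

def expm_hermitian(H):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

def random_unitary(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, _ = np.linalg.qr(G)
    return Q

def maassen_uffink_terms(p, r, U):
    """Return (lhs of Eq. (32), Golden-Thompson bound, c(X,Z))."""
    log_term = np.diag(np.log(p)) + U @ np.diag(np.log(r)) @ U.conj().T
    lhs = np.trace(expm_hermitian(log_term)).real
    overlaps = np.abs(U) ** 2                      # |<x|z>|^2
    golden_thompson = float(p @ overlaps @ r)      # tr M_X^dag(omega_1) M_Z^dag(omega_2)
    return lhs, golden_thompson, float(overlaps.max())

rng = np.random.default_rng(0)
d = 3
U = random_unitary(d, rng)
p, r = rng.dirichlet(np.ones(d)), rng.dirichlet(np.ones(d))
print(maassen_uffink_terms(p, r, U))
```

The three printed numbers should appear in non-decreasing order, mirroring the two inequalities used in the proof.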

We note that the approach of proving entropic uncertainty relations via the Golden–Thompson inequality was pioneered by Frank & Lieb [31] and is conceptually different from the original proofs that are either based on complex interpolation theory for Schatten p-norms [54] or the monotonicity of quantum relative entropy [26]. We refer to [25] for a review of entropic uncertainty relations. As a possible extension one could choose non-trivial pre-factors \(q_k\ne 1\) and study the optimal uncertainty bounds in that setting as well (as done in [57] without the H(A) term). Another natural extension is to general quantum channels instead of measurements (as detailed in [12, 32]). The constant \(c(X,Z)\) from Eq. (31) is multiplicative for tensor product measurements. However, we might ask more generally if for given measurements the optimal lower bound in Eq. (31) becomes multiplicative for tensor product measurements. This amounts to an instance of the tensorization question from Eq. (13) and we refer to [32, 57] for a discussion.

An advantage of our BL analysis is that it suggests tight generalizations to multiple measurements by means of the multivariate extension of the Golden–Thompson inequality [60]. A basic example is as follows.

Example 3.6

(Six-state [27]) For \(\rho _A\in {{\,\textrm{S}\,}}(A)\) with \(\text {dim}(A)=2\) and measurement maps \(M_{{\mathbb {X}}}, M_{{\mathbb {Y}}}, M_{{\mathbb {Z}}}\) of the Pauli matrices \(\sigma _X,\sigma _Y,\sigma _Z\) we have

$$\begin{aligned} H(X) + H(Y) + H(Z) \ge 2+H(A). \end{aligned}$$
(33)

Moreover, this relation is tight in the sense that there exist quantum states that achieve equality. Note that applying the Maassen–Uffink relation Eq. (31) to each pair among the three Pauli measurements and averaging only yields the weaker bound

$$\begin{aligned} H(X) + H(Y) + H(Z) \ge \frac{3}{2}+\frac{3}{2}H(A). \end{aligned}$$

The equivalent analytic form of Eq. (33) is given by Corollary 2.5 as

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \left( M^\dagger _{{\mathbb {X}}}(\log \omega _1) + M^\dagger _{{\mathbb {Y}}}(\log \omega _2)+M^\dagger _{{\mathbb {Z}}}(\log \omega _3)\right) \le \frac{1}{4} \quad \forall \omega _1,\omega _2, \omega _3\in {{\,\textrm{S}\,}}(A). \end{aligned}$$

The same steps as in the proof of the Maassen–Uffink relation, together with Lieb’s triple matrix inequality [46], then yield the upper bound (see Footnote 9)

$$\begin{aligned}&\int _{0}^{\infty } {{\,\textrm{tr}\,}}M_{{\mathbb {X}}}(\omega _1) \frac{1}{M_{{\mathbb {Z}}}(\omega _3)^{-1} +t} M_{{\mathbb {Y}}}(\omega _2) \frac{1}{M_{{\mathbb {Z}}}(\omega _3)^{-1} +t} \textrm{d}t \\&\quad =\sum _{x,y} \langle x | \omega _1 | x \rangle \langle y | \omega _2 | y \rangle \int _{0}^{\infty } |\langle x | \frac{1}{M_{{\mathbb {Z}}}(\omega _3)^{-1} +t} | y \rangle |^2 \textrm{d}t \\&\quad \le \max _{x,y} \int _{0}^{\infty } |\langle x | \frac{1}{M_{{\mathbb {Z}}}(\omega _3)^{-1} +t} | y \rangle |^2 \textrm{d}t. \end{aligned}$$

In the penultimate step we used that

$$\begin{aligned} M_{{\mathbb {X}}}(\omega )&= \sum _{x \in \{x_0,x_1\}} \langle x | \omega | x \rangle |x\rangle \!\langle x| \quad \text {where } \left\{ | x_0 \rangle = \frac{1}{\sqrt{2}}(1,1)^T, \, | x_1 \rangle = \frac{1}{\sqrt{2}}(1,-1)^T \right\} , \\ M_{{\mathbb {Y}}}(\omega )&= \sum _{y \in \{y_0,y_1\}} \langle y | \omega | y \rangle |y\rangle \!\langle y| \quad \text {where } \left\{ | y_0 \rangle = \frac{1}{\sqrt{2}}(1,\textrm{i})^T, \, | y_1 \rangle = \frac{1}{\sqrt{2}}(1,-\textrm{i})^T \right\} , \\ M_{{\mathbb {Z}}}(\omega )&= \sum _{z \in \{z_0,z_1\}} \langle z | \omega | z \rangle |z\rangle \!\langle z| \quad \text {where } \left\{ | z_0 \rangle = (1,0)^T, \, | z_1 \rangle = (0,1)^T \right\} . \end{aligned}$$
(34)

As \((M_{{\mathbb {Z}}}(\omega _3)^{-1} +t)^{-1}= \sum _{z} \frac{1}{\langle z | \omega _3 | z \rangle ^{-1} + t} |z\rangle \!\langle z|\), we get

$$\begin{aligned} \left| \langle x | \frac{1}{M_{{\mathbb {Z}}}(\omega _3)^{-1} +t} | y \rangle \right| ^2 = \left| \sum _{z} \frac{1}{\langle z | \omega _3 | z \rangle ^{-1} + t} \langle x|z\rangle \langle z|y\rangle \right| ^2. \end{aligned}$$

Together with \( \langle x|z_0\rangle \langle z_0|y\rangle = \frac{1}{2}\) and \( \langle x|z_1\rangle \langle z_1|y\rangle = \pm \frac{\textrm{i}}{2}\) for all \(x \in \{x_0,x_1\},\, y \in \{y_0,y_1 \}\) we find the upper bound

$$\begin{aligned} \frac{1}{4} \int _{0}^{\infty } \Big ( (\langle z_0 | \omega _3 | z_0 \rangle ^{-1} + t)^{-2} + (\langle z_1 | \omega _3 | z_1 \rangle ^{-1} +t)^{-2} \Big )\textrm{d}t&= \frac{1}{4} \big ( \langle z_0 | \omega _3 | z_0 \rangle + \langle z_1 | \omega _3 | z_1 \rangle \big ) =\frac{1}{4}. \end{aligned}$$

This then concludes the proof of the six-state entropic uncertainty relation Eq. (33).
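The entropic form Eq. (33) can be checked directly in the Bloch-ball parametrization of qubit states. The following Python sketch (standard library only) evaluates the gap \(H(X)+H(Y)+H(Z)-2-H(A)\) in bits. The parametrization \(\rho = (\mathbbm {1}+ r\cdot \vec {\sigma })/2\) and the function names are assumptions made for illustration.

```python
import math
import random

def h2(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def six_state_gap(r):
    """H(X) + H(Y) + H(Z) - 2 - H(A) for the qubit state with Bloch vector r."""
    rx, ry, rz = r
    norm = math.sqrt(rx * rx + ry * ry + rz * rz)
    measured = h2((1 + rx) / 2) + h2((1 + ry) / 2) + h2((1 + rz) / 2)
    return measured - 2.0 - h2((1 + norm) / 2)

def random_bloch_vector(gen):
    """Rejection-sample a point of the unit ball."""
    while True:
        r = [gen.uniform(-1, 1) for _ in range(3)]
        if sum(c * c for c in r) <= 1.0:
            return r

gen = random.Random(0)
print(min(six_state_gap(random_bloch_vector(gen)) for _ in range(1000)))
print(six_state_gap((0.0, 0.0, 1.0)))   # Pauli eigenstate
```

Pauli eigenstates and the maximally mixed state both give a vanishing gap, which illustrates the tightness claim.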

3.4 Minimum output entropy

The Brascamp–Lieb duality from Theorem 2.1 and Corollary 2.5 can also be usefully applied to general quantum channels. Recall that the minimum output entropy of a map \({\mathcal {E}}\in {{\,\textrm{TPCP}\,}}(A,B)\) is defined by

$$\begin{aligned} H_{\min }({\mathcal {E}}):= \min _{\rho \in {{\,\textrm{S}\,}}(A)} H\big ( {\mathcal {E}}(\rho ) \big ) \, . \end{aligned}$$
(35)

The computation of minimum output entropy is in general NP-complete [9]. Nevertheless, it is a fundamental information measure [58] that has been used, e.g., to prove super-additivity of the Holevo information [38]. Corollary 2.5 for \(n=2\), \(q_1=q_2=1\), \({\mathcal {E}}_1={\mathcal {I}}\), and \({\mathcal {E}}_2 = {\mathcal {E}}\) gives the following result.

Corollary 3.7

(Minimum output entropy). For \({\mathcal {E}}\in {{\,\textrm{TPCP}\,}}(A,B)\) and \(C \in {\mathbb {R}}\), the following two statements are equivalent:

$$\begin{aligned}&C \le H\big ( {\mathcal {E}}(\rho ) \big ) \quad \forall \rho \in {{\,\textrm{S}\,}}(A) \, , \end{aligned}$$
(36)
$$\begin{aligned}&{{\,\textrm{tr}\,}}\exp (\log \omega _1 + {\mathcal {E}}^\dagger (\log \omega _2)) \le \exp (-C) \quad \forall \omega _1 \in {{\,\textrm{S}\,}}(A),\,\omega _2 \in {{\,\textrm{S}\,}}(B) \, . \end{aligned}$$
(37)

Moreover, we have

$$\begin{aligned} H_{\min }({\mathcal {E}}) = - \max _{\omega \in {{\,\textrm{S}\,}}(B)} \lambda _{\max }({\mathcal {E}}^{\dagger }( \log \omega )) . \end{aligned}$$
(38)

It is unclear if the form Eq. (38) could give new insights on the tensorization question of when the minimal output entropy of tensor product channels becomes additive. That is, for which \({\mathcal {E}}, {\mathcal {F}}\in {{\,\textrm{TPCP}\,}}(A,B)\) do we have

$$\begin{aligned} H_{\min }({\mathcal {E}}\otimes {\mathcal {F}}){\mathop {=}\limits ^{?}} H_{\min }({\mathcal {E}}) + H_{\min }({\mathcal {F}}). \end{aligned}$$
(39)

We note that probabilistic counterexamples are known [38], which shows that the tensorization question Eq. (13) is in general answered in the negative.

Proof of Corollary 3.7

We give two proofs of Eq. (38), one based on the variational characterization of the relative entropy from Eq. (6), and the other based on the dual formulation from Eq. (37). Using the former approach, we see that

$$\begin{aligned} H_{\min }({\mathcal {E}}) =\min _{\rho \in {{\,\textrm{S}\,}}(A)} H\big ( {\mathcal {E}}(\rho ) \big )&=\min _{\rho \in {{\,\textrm{S}\,}}(A)} -D\big ({\mathcal {E}}(\rho )\Vert \mathbbm {1}\big )\\&=\min _{\rho \in {{\,\textrm{S}\,}}(A)} - \left( \max _{\omega \in {{\,\mathrm{P_\succ }\,}}(B)} {{\,\textrm{tr}\,}}{\mathcal {E}}(\rho ) \log \omega - \log {{\,\textrm{tr}\,}}\omega \right) \\&=\min _{\rho \in {{\,\textrm{S}\,}}(A), \omega \in {{\,\textrm{S}\,}}(B)} - {{\,\textrm{tr}\,}}\rho \, {\mathcal {E}}^{\dagger }(\log \omega ) \\&=- \max _{\rho \in {{\,\textrm{S}\,}}(A), \omega \in {{\,\textrm{S}\,}}(B)} {{\,\textrm{tr}\,}}\rho \, {\mathcal {E}}^{\dagger }(\log \omega ) \\&= - \max _{\omega \in {{\,\textrm{S}\,}}(B)} \lambda _{\max }\big ( {\mathcal {E}}^\dagger (\log \omega ) \big ) \, , \end{aligned}$$

where the final step follows from the variational formula of the largest eigenvalue.

Alternatively we can verify Eq. (38) in the analytic picture. To see this, we note that using the equivalence between Eqs. (36) and (37) as well as the monotonicity of the logarithm,

$$\begin{aligned} H_{\min }({\mathcal {E}})&=-\max _{\omega _1 \in {{\,\textrm{S}\,}}(A), \,\omega _2 \in {{\,\textrm{S}\,}}(B)} \log {{\,\textrm{tr}\,}}\exp \bigl (\log \omega _1 + {\mathcal {E}}^\dagger (\log \omega _2)\bigr ). \end{aligned}$$
(40)

Next, note that, for any Hermitian H, the Golden–Thompson inequality gives

$$\begin{aligned} \max _{\omega _1 \in {{\,\textrm{S}\,}}(A)} {{\,\textrm{tr}\,}}\exp \bigl (\log \omega _1 + H\bigr ) \le \max _{\omega _1 \in {{\,\textrm{S}\,}}(A)} {{\,\textrm{tr}\,}}\omega _1 \exp (H) = \lambda _{\max }(\exp (H)) = \exp (\lambda _{\max }(H)), \end{aligned}$$

where the second step uses again the variational formula for the largest eigenvalue. This inequality is in fact an equality, since the upper-bound is attained if we choose \(\omega _1\) to be a projector onto an eigenvector of H with largest eigenvalue (any such \(\omega _1\) commutes with H). If we use this to evaluate Eq. (40), then we obtain the desired result.
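The saturation of the Golden–Thompson step can be observed numerically by approximating the projector with a full-rank state. In the Python sketch below (assuming NumPy), `near_projector(eps)` puts weight \(1-\varepsilon \) on a top eigenvector of a random Hermitian `H` and spreads \(\varepsilon \) uniformly over the remaining eigenvectors. This regularization and the names are our own illustrative choices.

```python
import numpy as np

def funm_hermitian(A, f):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def trace_exp(omega, H):
    """tr exp(log omega + H) for positive definite omega and Hermitian H."""
    return np.trace(funm_hermitian(funm_hermitian(omega, np.log) + H, np.exp)).real

rng = np.random.default_rng(2)
d = 3
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (G + G.conj().T) / 2
lam, V = np.linalg.eigh(H)          # eigenvalues in ascending order
target = np.exp(lam[-1])            # exp(lambda_max(H))

def near_projector(eps):
    weights = np.full(d, eps / (d - 1))
    weights[-1] = 1.0 - eps         # weight on the top eigenvector
    return (V * weights) @ V.conj().T

def random_state():
    W = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = W @ W.conj().T
    return rho / np.trace(rho).real

print(trace_exp(near_projector(1e-6), H), target)
print(trace_exp(random_state(), H) <= target)
```

As \(\varepsilon \rightarrow 0\) the first value approaches \(\exp (\lambda _{\max }(H))\), while generic states stay below it.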

Example 3.8

(Qubit depolarizing channel) The minimal output entropy of the qubit depolarizing channel

$$\begin{aligned} {\mathcal {E}}_p :X \mapsto (1-p) X + p \frac{\mathbbm {1}_{{\mathbb {C}}^2}}{2} {{\,\textrm{tr}\,}}X\quad \text {for}\quad p \in [0,1] \end{aligned}$$
(41)

is given by \(H_{\min }({\mathcal {E}}_p) = h\big (p/2\big )\), where \(h(x):=-x\log x - (1-x) \log (1-x)\) is the binary entropy function. In the entropic picture, this follows as the concavity of the entropy ensures that the optimizer in Eq. (35) can always be taken to be a pure state; the unitary covariance property of the depolarizing channel then implies that we only need to evaluate the output entropy for a single arbitrary pure state. In the analytic picture, we can use Eq. (38) to see that

$$\begin{aligned} H_{\min }({\mathcal {E}}_p) = - \max _{\omega \in {{\,\textrm{S}\,}}(B)} \lambda _{\max }({\mathcal {E}}_p^{\dagger }( \log \omega )) = - \max _{t \in [0,1]} \Big \{ \big (1-\frac{p}{2}\big ) \log t + \frac{p}{2} \log (1-t)\Big \} = h\Big (\frac{p}{2} \Big ), \end{aligned}$$

where the second step follows from unitary covariance and the final step uses that \(t^\star = 1-p/2\) is the optimizer.
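The final optimization over \(t\) is one-dimensional and can be confirmed by a simple grid search. The Python sketch below (standard library only, natural logarithm) is merely a numerical cross-check of the stated optimizer. The grid resolution and function names are arbitrary choices of ours.

```python
import math

def h(x):
    """Binary entropy in nats."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def analytic_value(p, grid=100_000):
    """-max_t {(1 - p/2) log t + (p/2) log(1 - t)} over an interior grid of (0, 1)."""
    best = max((1.0 - p / 2.0) * math.log(k / grid)
               + (p / 2.0) * math.log(1.0 - k / grid)
               for k in range(1, grid))
    return -best

for p in (0.1, 0.3, 0.5, 1.0):
    print(p, analytic_value(p), h(p / 2.0))
```

The two printed columns agree to within the grid accuracy, consistent with \(t^\star = 1-p/2\).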

3.5 Data-processing inequality

The examples given so far employed Corollary 2.5, but in this section we give an example that demonstrates Theorem 2.1 in its full strength (with \(\sigma \), \(\sigma _k\ne \mathbbm {1}\)). The data-processing inequality (DPI) for the quantum relative entropy is a cornerstone in quantum information theory [51, 55, 62]. It states that, for \(\rho \in {{\,\textrm{S}\,}}(A)\) and \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), the quantum relative entropy cannot increase when applying a channel \({\mathcal {E}}\in {{\,\textrm{TPP}\,}}(A,B)\) to both arguments, i.e.,

$$\begin{aligned} D\big ({\mathcal {E}}(\rho ) \Vert {\mathcal {E}}(\sigma )\big ) \le D(\rho \Vert \sigma ) . \end{aligned}$$

The DPI is mathematically equivalent to many other fundamental results, including the strong sub-additivity of quantum entropy [48, 49]. The DPI fits naturally into our Brascamp–Lieb duality framework. That is, Theorem 2.1 applied for \(n=1\), \(q_1=1\), \(\sigma _1={\mathcal {E}}(\sigma )\), and \(C=0\) implies the following duality.

Corollary 3.9

(DPI duality). For \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\) and \({\mathcal {E}}\in {{\,\textrm{TPP}\,}}(A,B)\) the following inequalities hold and are equivalent:

$$\begin{aligned} D\big ({\mathcal {E}}(\rho ) \Vert {\mathcal {E}}(\sigma ) \big )&\le D(\rho \Vert \sigma ) \quad \forall \rho \in {{\,\textrm{S}\,}}(A) \, , \nonumber \\ {{\,\textrm{tr}\,}}\exp \big (\log \sigma + {\mathcal {E}}^\dagger (\log \omega ) \big )&\le {{\,\textrm{tr}\,}}\exp \big (\log \omega + \log {\mathcal {E}}(\sigma )\big ) \quad \forall \omega \in {{\,\mathrm{P_\succ }\,}}(B) \, . \end{aligned}$$
(42)

As a simple example for \({{\,\textrm{tr}\,}}\sigma \le {{\,\textrm{tr}\,}}\rho =1\), one can immediately see that \(D(\rho \Vert \sigma )\ge 0\) by considering the trace map \({\mathcal {E}}(\cdot )={{\,\textrm{tr}\,}}(\cdot )\). Namely, data processing for the trace map takes the trivial analytic form \({{\,\textrm{tr}\,}}\log \omega \le 0\) for quantum states \(\omega \in {{\,\textrm{S}\,}}(A)\).

Given that the DPI is quite powerful, we suspect that Eq. (42) may be of interest too. We note that Eq. (42) does not immediately follow from existing results and thus seems novel. For example, employing the operator concavity of the logarithm, the operator Jensen inequality, and the Golden–Thompson inequality we get

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \big (\log \sigma + {\mathcal {E}}^\dagger (\log \omega ) \big )\le {{\,\textrm{tr}\,}}\exp \big ( \log \sigma + \log {\mathcal {E}}^{\dagger }(\omega ) \big )\le {{\,\textrm{tr}\,}}{\mathcal {E}}^{\dagger }(\omega ) \sigma ={{\,\textrm{tr}\,}}\omega {\mathcal {E}}(\sigma ). \end{aligned}$$
(43)

This immediately implies Hansen’s multivariate Golden–Thompson inequality [35, Inequality (1)], but is in general still weaker than Eq. (42) as the Golden–Thompson inequality applied to the right-hand side of Eq. (42) likewise gives

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \big (\log \omega + \log {\mathcal {E}}(\sigma )\big )\le {{\,\textrm{tr}\,}}\omega {\mathcal {E}}(\sigma ). \end{aligned}$$
(44)

Only when \(\sigma = \mathbbm {1}\) and \({\mathcal {E}}\) is unital does Eq. (42) simplify to \({{\,\textrm{tr}\,}}\exp ( {\mathcal {E}}^\dagger (\log \omega )) \le {{\,\textrm{tr}\,}}\omega \), reducing to Eq. (43) (see Footnote 10).
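To get a feeling for how Eq. (42) sits between the two sides of Eq. (43), one can evaluate all three quantities for a randomly drawn channel. The Python sketch below (assuming NumPy) builds a qubit channel from Kraus operators cut out of a random isometry. This construction, the choice of three Kraus operators, and all names are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

def funm_hermitian(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def random_kraus(d_in, d_out, k, rng):
    G = rng.normal(size=(k * d_out, d_in)) + 1j * rng.normal(size=(k * d_out, d_in))
    V, _ = np.linalg.qr(G)                      # isometry: V^dagger V = identity
    return [V[i * d_out:(i + 1) * d_out, :] for i in range(k)]

def channel(kraus, X):                          # E(X) = sum_i K_i X K_i^dagger
    return sum(K @ X @ K.conj().T for K in kraus)

def adjoint(kraus, Y):                          # E^dagger(Y) = sum_i K_i^dagger Y K_i
    return sum(K.conj().T @ Y @ K for K in kraus)

def random_positive(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return G @ G.conj().T + 0.1 * np.eye(d)

def dpi_chain(sigma, omega, kraus):
    log = lambda A: funm_hermitian(A, np.log)
    tr_exp = lambda A: np.trace(funm_hermitian(A, np.exp)).real
    lhs = tr_exp(log(sigma) + adjoint(kraus, log(omega)))    # left side of Eq. (42)
    mid = tr_exp(log(omega) + log(channel(kraus, sigma)))    # right side of Eq. (42)
    rhs = np.trace(omega @ channel(kraus, sigma)).real       # bound in Eqs. (43), (44)
    return lhs, mid, rhs

def trial(seed, d=2, k=3):
    rng = np.random.default_rng(seed)
    kraus = random_kraus(d, d, k, rng)
    return dpi_chain(random_positive(d, rng), random_positive(d, rng), kraus)

print(trial(0))
```

The printed triple should be non-decreasing: the first inequality is Eq. (42), the second is the Golden–Thompson step of Eq. (44).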

3.6 Strong data-processing inequalities

It is natural to study potential strengthenings of the DPI, and a priori one may seek additive or multiplicative improvements. Additive strengthenings of the DPI have recently generated interest in quantum information theory [30, 42, 59, 60]. Here, we consider multiplicative improvements of the DPI, which have been called strong data-processing inequalities in the literature. To this end, define the contraction coefficient of \({\mathcal {E}}\in {{\,\textrm{TPCP}\,}}(A,B)\) at \(\sigma \in {{\,\textrm{S}\,}}(A)\) as

$$\begin{aligned} \eta (\sigma ,{\mathcal {E}}):= \sup _{{{\,\textrm{S}\,}}(A) \ni \rho \ne \sigma } \frac{D\big ({\mathcal {E}}(\rho ) \Vert {\mathcal {E}}(\sigma ) \big )}{D(\rho \Vert \sigma )}. \end{aligned}$$
(45)

The data-processing inequality then ensures that \(\eta (\sigma ,{\mathcal {E}}) \le 1\), and we say that \({\mathcal {E}}\) satisfies a strong data-processing inequality at \(\sigma \) if \(\eta (\sigma ,{\mathcal {E}})<1\). Theorem 2.1 for \(n=1\), \(C=0\), \(\sigma _1={\mathcal {E}}(\sigma )\), and \(q_1=\eta (\sigma ,{\mathcal {E}})^{-1}\) implies the following equivalence.

Corollary 3.10

(Strong DPI duality). For \({\mathcal {E}}\in {{\,\textrm{TPP}\,}}(A,B)\), \(\sigma \in {{\,\mathrm{P_\succ }\,}}(A)\), and \(\eta >0\), the following two statements are equivalent:

$$\begin{aligned} D\big ( {\mathcal {E}}(\rho ) \Vert {\mathcal {E}}(\sigma ) \big )&\le \eta D( \rho \Vert \sigma ) \quad \forall \rho \in {{\,\textrm{S}\,}}(A) \, , \end{aligned}$$
(46)
$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \bigl (\log \sigma + {\mathcal {E}}^{\dagger }(\log \omega ) \big )&\le \left\Vert \exp \Bigl (\log \omega + \frac{1}{\eta } \log {\mathcal {E}}(\sigma ) \Bigr )\right\Vert _{\eta } \quad \forall \omega \in {{\,\textrm{S}\,}}(B) \, . \end{aligned}$$
(47)

Thus, to determine \(\eta (\sigma ,{\mathcal {E}})\), we aim to find the smallest constant \(\eta \in [0,1]\) such that Eq. (46) or, equivalently, Eq. (47) holds. For unital \({\mathcal {E}}\) and maximally mixed \(\sigma =\mathbbm {1}/d\), \(d:=\dim (A)\), the duality in Corollary 3.10 simplifies to

$$\begin{aligned} \log d - H\big ( {\mathcal {E}}(\rho )\big )&\le \eta \big ( \log d - H(\rho ) \big ) \quad \forall \rho \in {{\,\textrm{S}\,}}(A) \, , \end{aligned}$$
(48)
$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \bigl ( {\mathcal {E}}^\dagger (\log \omega ) \bigr )&\le d^{\frac{\eta -1}{\eta }} \left\Vert \omega \right\Vert _{\eta } \quad \forall \omega \in {{\,\textrm{S}\,}}(B) . \end{aligned}$$
(49)

Often we are also interested in the global contraction coefficient of \({\mathcal {E}}\), obtained by maximizing \(\eta (\sigma , {\mathcal {E}}) \) over all \(\sigma \in {{\,\textrm{S}\,}}(A)\), i.e.,

$$\begin{aligned} \eta ({\mathcal {E}}):= \sup _{\sigma \in {{\,\textrm{S}\,}}(A)} \eta (\sigma , {\mathcal {E}}) \, . \end{aligned}$$
(50)

Example 3.11

(Qubit depolarizing channel). For the qubit depolarizing channel \({\mathcal {E}}_p\) from Eq. (41), which is unital, we claim that

$$\begin{aligned} \eta \left( \frac{\mathbbm {1}_{{\mathbb {C}}^2}}{2},{\mathcal {E}}_p\right) =(1-p)^2. \end{aligned}$$
(51)

To prove this in the entropic picture we start by recalling that \(\eta ({\mathcal {E}}_p)=(1-p)^2\) [39], which already gives \(\eta (\frac{\mathbbm {1}_{{\mathbb {C}}^2}}{2},{\mathcal {E}}_p)\le (1-p)^2\). Thus, it suffices to find states \(\rho \in {{\,\textrm{S}\,}}(A)\) such that

$$\begin{aligned} (1-p)^2 \le \frac{D({\mathcal {E}}_p(\rho ) \Vert \frac{\mathbbm {1}_{{\mathbb {C}}^2}}{2})}{D(\rho \Vert \frac{\mathbbm {1}_{{\mathbb {C}}^2}}{2})} =\frac{1 - H\big ( {\mathcal {E}}_p(\rho )\big )}{1 - H(\rho )} \end{aligned}$$
(52)

up to arbitrarily small error. The states \(\rho _{\varepsilon } = {{\,\textrm{diag}\,}}(\frac{1}{2}+\varepsilon , \frac{1}{2}-\varepsilon )\) satisfy this condition in the limit \(\varepsilon \rightarrow 0\). Indeed,

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1 - H\big ( {\mathcal {E}}_p(\rho _{\varepsilon })\big )}{1 - H(\rho _{\varepsilon })} = \lim _{\varepsilon \rightarrow 0} \frac{1-h\big ((1-p) (1/2 + \varepsilon ) + p/2 \big )}{1-h(1/2 + \varepsilon )} = (1-p)^2 \, , \end{aligned}$$
(53)

as follows from the Taylor expansion of the binary entropy function \(h(\cdot )\).
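The limit in Eq. (53) can also be checked numerically. The following is a minimal sketch, assuming only that \({\mathcal {E}}_p\) maps the eigenvalue \(\frac{1}{2}+\varepsilon \) of the diagonal state \(\rho _{\varepsilon }\) to \((1-p)(\frac{1}{2}+\varepsilon )+\frac{p}{2}\), as used in Eq. (53). The function names `binary_entropy` and `entropy_ratio` are illustrative choices and do not appear elsewhere in this work.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy h(x) in bits, with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def entropy_ratio(eps: float, p: float) -> float:
    """Ratio (1 - H(E_p(rho_eps))) / (1 - H(rho_eps)) from Eq. (52).

    Here rho_eps = diag(1/2 + eps, 1/2 - eps), and the channel output has
    the eigenvalue (1 - p) * (1/2 + eps) + p / 2, as in Eq. (53).
    """
    out = (1.0 - p) * (0.5 + eps) + p / 2.0
    return (1.0 - binary_entropy(out)) / (1.0 - binary_entropy(0.5 + eps))


if __name__ == "__main__":
    p = 0.3  # illustrative depolarizing parameter
    for eps in (1e-1, 1e-2, 1e-3):
        # The ratio should approach (1 - p)^2 as eps decreases.
        print(eps, entropy_ratio(eps, p), (1.0 - p) ** 2)
```

For very small \(\varepsilon \) both numerator and denominator are differences of nearly equal numbers, so the sketch should not be pushed far below \(\varepsilon \approx 10^{-4}\) in double precision.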

In the analytic form of Eq. (49), the statement of Eq. (51) is equivalent to the claim that \(\eta =(1-p)^2\) is the smallest \(\eta \in [0,1]\) such that

$$\begin{aligned} {{\,\textrm{tr}\,}}\exp \big ((1-p) \log \omega + \frac{p}{2} \mathbbm {1}_{{\mathbb {C}}^2 }{{\,\textrm{tr}\,}}\log \omega \big ) \le 2^{\frac{\eta -1}{\eta }} \left\Vert \omega \right\Vert _{\eta } \quad \text {for all} \quad \omega \in {{\,\textrm{S}\,}}(B) \, . \end{aligned}$$
(54)

Without loss of generality we can assume that \(\omega = {{\,\textrm{diag}\,}}(t,1-t)\) for \(t \in [0,1]\). Then, the statement above simplifies to showing that \(\eta =(1-p)^2\) is the smallest \(\eta \in [0,1]\) such that

$$\begin{aligned} \big (t (1-t) \big )^{\frac{p}{2}} \big (t^{1-p} +(1-t)^{1-p} \big ) \le 2^{\frac{\eta -1}{\eta }} \big (t^\eta +(1-t)^{\eta } \big )^{\frac{1}{\eta }} \quad \text {for all} \quad t \in [0,1] \, . \end{aligned}$$
(55)
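The scalar inequality in Eq. (55) lends itself to a quick numerical sanity check. The sketch below evaluates both sides on a finite grid of \(t\) values and reports the largest value of the left-hand side minus the right-hand side. It is only a grid-based check, not a proof, and the names `lhs`, `rhs`, and `worst_gap` as well as the grid size are assumptions made for illustration.

```python
def lhs(t: float, p: float) -> float:
    """Left-hand side of Eq. (55)."""
    return (t * (1.0 - t)) ** (p / 2.0) * (t ** (1.0 - p) + (1.0 - t) ** (1.0 - p))


def rhs(t: float, eta: float) -> float:
    """Right-hand side of Eq. (55)."""
    return 2.0 ** ((eta - 1.0) / eta) * (t ** eta + (1.0 - t) ** eta) ** (1.0 / eta)


def worst_gap(p: float, eta: float, grid_size: int = 2001) -> float:
    """Largest value of lhs - rhs over an interior grid of (0, 1).

    A value clearly above rounding error indicates that Eq. (55)
    is violated somewhere on the grid for this choice of eta.
    """
    ts = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(lhs(t, p) - rhs(t, eta) for t in ts)


if __name__ == "__main__":
    p = 0.3  # illustrative depolarizing parameter
    eta_star = (1.0 - p) ** 2
    print("eta = (1-p)^2:     ", worst_gap(p, eta_star))
    print("eta = 0.9 (1-p)^2: ", worst_gap(p, 0.9 * eta_star))
```

Both sides of Eq. (55) equal 1 at \(t=\frac{1}{2}\) for every \(\eta \), so any violation for \(\eta <(1-p)^2\) shows up in a neighbourhood of \(t=\frac{1}{2}\) rather than at the point itself.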

3.7 Super-additivity of relative entropy

Another type of strengthening of the DPI is as follows. The quantum relative entropy is super-additive for product states in the second argument. That is, for \(\rho _{AB}\), \(\sigma _{AB} \in {{\,\textrm{S}\,}}(A \otimes B)\) we have

$$\begin{aligned} D(\rho _{AB} \Vert \sigma _A \otimes \sigma _B) \ge D(\rho _A \Vert \sigma _A) + D(\rho _B \Vert \sigma _B) \, . \end{aligned}$$
(56)

This directly follows from the non-negativity of the relative entropy, since \(D(\rho _{AB} \Vert \sigma _A \otimes \sigma _B) - D(\rho _A \Vert \sigma _A) - D(\rho _B \Vert \sigma _B) = D(\rho _{AB} \Vert \rho _A \otimes \rho _B) \ge 0\). If the state in the second argument is not a product state we can apply the DPI twice and find

$$\begin{aligned} D(\rho _{AB} \Vert \sigma _{AB}) \ge t D(\rho _A \Vert \sigma _A) + (1-t) D(\rho _B \Vert \sigma _B) \quad \text {for all } t \in [0,1] \, . \end{aligned}$$
(57)
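For commuting (diagonal) states, the relative entropy reduces to the Kullback-Leibler divergence of probability distributions, and the identity behind Eq. (56) as well as the bound in Eq. (57) can be checked directly. The following sketch covers only this classical special case; the helper names and the example distributions are hypothetical and chosen purely for illustration.

```python
import math
from typing import List, Sequence, Tuple

Joint = List[List[float]]  # joint distribution indexed as joint[a][b]


def marginals(joint: Joint) -> Tuple[List[float], List[float]]:
    """Return the marginal distributions on A (rows) and B (columns)."""
    on_a = [sum(row) for row in joint]
    on_b = [sum(col) for col in zip(*joint)]
    return on_a, on_b


def product(on_a: Sequence[float], on_b: Sequence[float]) -> Joint:
    """Product distribution on A x B."""
    return [[x * y for y in on_b] for x in on_a]


def flatten(joint: Joint) -> List[float]:
    return [x for row in joint for x in row]


def kl(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence D(p || q) in nats (requires q > 0 where p > 0)."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0.0)


if __name__ == "__main__":
    rho = [[0.5, 0.1], [0.1, 0.3]]    # example joint distribution
    sigma = [[0.3, 0.2], [0.1, 0.4]]  # example non-product reference
    rho_a, rho_b = marginals(rho)
    sig_a, sig_b = marginals(sigma)

    # Identity behind Eq. (56): the gap equals D(rho_AB || rho_A x rho_B).
    gap = (kl(flatten(rho), flatten(product(sig_a, sig_b)))
           - kl(rho_a, sig_a) - kl(rho_b, sig_b))
    print(gap, kl(flatten(rho), flatten(product(rho_a, rho_b))))

    # Eq. (57): convex combinations of the two DPI bounds.
    total = kl(flatten(rho), flatten(sigma))
    for t in (0.0, 0.5, 1.0):
        print(t, total, t * kl(rho_a, sig_a) + (1.0 - t) * kl(rho_b, sig_b))
```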

A natural question is thus to find parameters \(\alpha (\sigma _{AB}), \beta (\sigma _{AB})\) with \(\alpha (\sigma _A \otimes \sigma _B) = \beta (\sigma _A \otimes \sigma _B) =1\) such that

$$\begin{aligned} D(\rho _{AB} \Vert \sigma _{AB}) \ge \alpha (\sigma _{AB}) D(\rho _A \Vert \sigma _A) + \beta (\sigma _{AB}) D(\rho _B \Vert \sigma _B) \, . \end{aligned}$$
(58)

Recently, it was shown [17] that Eq. (58) indeed holds for

$$\begin{aligned} \alpha (\sigma _{AB}) = \beta (\sigma _{AB}) = \Big ( 1 + 2 \Big \Vert \sigma _A^{-\frac{1}{2}}\otimes \sigma _B^{-\frac{1}{2}} \sigma _{AB} \sigma _A^{-\frac{1}{2}}\otimes \sigma _B^{-\frac{1}{2}}- \mathbbm {1}_{AB}\Big \Vert _\infty \Big )^{-1}. \end{aligned}$$
(59)

Applying Theorem 2.1 for \(n=2\), \(\sigma _1=\sigma _A\), \(\sigma _2=\sigma _B\), \(C=0\), \({\mathcal {E}}_1={{\,\textrm{tr}\,}}_B\), \({\mathcal {E}}_2 = {{\,\textrm{tr}\,}}_A\), \(q_1= \alpha \), and \(q_2 = \beta \) gives the following BL duality.

Corollary 3.12

(Duality for super-additivity of relative entropy). For \(\sigma _{AB} \in {{\,\mathrm{P_\succ }\,}}(A \otimes B)\) with \({{\,\textrm{tr}\,}}\sigma _{AB}=1\), \(\alpha >0\), and \(\beta >0\), the following two statements are equivalent:

$$\begin{aligned} \alpha D(\rho _A \Vert \sigma _A) + \beta D(\rho _B \Vert \sigma _B)&\le D(\rho _{AB} \Vert \sigma _{AB}) \quad \forall \rho _{AB} \in {{\,\textrm{S}\,}}(A \otimes B) , \end{aligned}$$
(60)
$$\begin{aligned} {{\,\textrm{tr}\,}}\exp ( \log \sigma _{AB} + \log \omega _A + \log \omega _B ) \!&\le \! \left\Vert \exp (\log \omega _A +\alpha \log \sigma _A)\right\Vert _{\frac{1}{\alpha }} \left\Vert \exp (\log \omega _B + \beta \log \sigma _B )\right\Vert _{\frac{1}{\beta }} \nonumber \\& \forall \omega _A \in {{\,\textrm{S}\,}}(A),\omega _B \in {{\,\textrm{S}\,}}(B) \, . \end{aligned}$$
(61)

We leave it as an open question to find parameters \(\alpha (\sigma _{AB})\) and \(\beta (\sigma _{AB})\) different from Eq. (59), satisfying Eq. (60) or equivalently Eq. (61).

4 Conclusion

Our fully quantum Brascamp–Lieb dualities suggest a number of possible extensions. Taking inspiration from the commutative case [53], these could include, e.g., Gaussian optimality questions, hypercontractivity inequalities, transportation cost inequalities, strong converses in Shannon theory, entropy power inequalities [1], or algorithmic and complexity-theoretic questions [15, 16, 33]. For some of these applications it seems that an extension of Barthe’s reverse Brascamp–Lieb duality [7] to the non-commutative setting would be useful.