1 Introduction

The concept of a relative entropy is fundamental in quantum information theory. One of the most important examples is the quantum relative entropy (QRE), defined for a quantum state \(\rho \) and a positive semidefinite operator \(\sigma \) with \(\mathrm{\,supp\,}\rho \subseteq \mathrm{\,supp\,}\sigma \) as

$$\begin{aligned} D(\rho \Vert \sigma ) := \mathrm{Tr\,}(\rho (\log \rho -\log \sigma )). \end{aligned}$$

The QRE satisfies \(D(\rho \Vert \sigma )\ge 0\) if both \(\rho \) and \(\sigma \) are quantum states and has an important operational interpretation as a measure of distinguishability of two quantum states, as it characterizes the minimal type-II error in asymmetric quantum hypothesis testing [23, 38]. In addition, the QRE acts as a parent quantity for various entropic quantities (such as the von Neumann entropy, the conditional entropy, and the Holevo quantity), which characterize the optimal rates of information-theoretic tasks in the asymptotic memoryless setting. A key ingredient in establishing these characterizations of information-theoretic tasks is the data processing inequality (DPI), which states that the QRE cannot increase under the joint action of a quantum operation \(\Lambda \),

$$\begin{aligned} D(\rho \Vert \sigma ) \ge D(\Lambda (\rho )\Vert \Lambda (\sigma )). \end{aligned}$$
(1)

Using the operational interpretation of \(D(\cdot \Vert \cdot )\) as a measure of distinguishability, we can interpret (1) in the following way: A physical transformation (modeled by the quantum operation \(\Lambda \)) of a quantum system cannot enhance our ability to distinguish between two quantum states \(\rho \) and \(\sigma \) describing the system.
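As a numerical illustration of (1), the following minimal sketch computes the QRE for full-rank states and compares it before and after a partial trace, which is one particular quantum operation. The helper names (`random_state`, `relative_entropy`, `trace_out_b`), the dimensions, and the random seed are illustrative assumptions and not part of the paper; logarithms are taken to base 2.

```python
import numpy as np

def random_state(dim, rng):
    """Random full-rank density matrix (hypothetical helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def log2m(a):
    """Base-2 logarithm of a positive definite matrix via its eigenbasis."""
    w, v = np.linalg.eigh(a)
    return (v * np.log2(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr rho (log rho - log sigma), full-rank inputs only."""
    return np.trace(rho @ (log2m(rho) - log2m(sigma))).real

def trace_out_b(x, da, db):
    """Partial trace over the second tensor factor of H_A (x) H_B."""
    return np.trace(x.reshape(da, db, da, db), axis1=1, axis2=3)

rng = np.random.default_rng(0)
da, db = 2, 3
rho = random_state(da * db, rng)
sigma = random_state(da * db, rng)

d_before = relative_entropy(rho, sigma)
d_after = relative_entropy(trace_out_b(rho, da, db), trace_out_b(sigma, da, db))
print("D before:", d_before, "D after partial trace:", d_after)
```

The printed values should satisfy `d_before >= d_after >= 0` up to rounding, in line with the DPI and the non-negativity of the QRE on states.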

A natural question, however, is to ask when a quantum operation does not affect the distinguishability of \(\rho \) and \(\sigma \). More precisely, given a quantum operation \(\Lambda \), we are interested in characterizing those \(\rho \) and \(\sigma \) for which we have equality in the DPI, that is,

$$\begin{aligned} D(\rho \Vert \sigma ) = D(\Lambda (\rho )\Vert \Lambda (\sigma )). \end{aligned}$$
(2)

The answer to this question was given by Petz [40, 41], who proved that (2) holds if and only if there exists a recovery map given by a quantum operation \(\mathcal {R}\) which reverses the action of \(\Lambda \) on \(\rho \) and \(\sigma \), that is, \(\mathcal {R}(\Lambda (\rho )) = \rho \) and \(\mathcal {R}(\Lambda (\sigma )) = \sigma \) (see Sect. 5 for a precise statement). This property of \(\Lambda \) is also called sufficiency [22, 25,26,27, 34, 35, 40, 41]. Petz’s result about equality in the DPI has found important applications in quantum information theory. For example, in [20] it was used to characterize the case of equality in the strong subadditivity of the von Neumann entropy [30], giving rise to the concept of a short quantum Markov chain. Moreover, sparked by a breakthrough result by Fawzi and Renner [16] relating the notion of recoverability to states with small conditional mutual information, there has been a recent surge of interest in the topic of recoverability [7, 8, 15, 28, 45, 46, 52]. Note that strong subadditivity is equivalent to non-negativity of the quantum conditional mutual information, and hence there is an intimate connection between recoverability and saturation of strong subadditivity.

In general, we call a real-valued functional \(\mathcal {D}(\cdot \Vert \cdot )\) on pairs of positive semidefinite operators a (generalized) relative entropy if it is non-negative on quantum states and satisfies the DPI \(\mathcal {D}(\rho \Vert \sigma )\ge \mathcal {D}(\Lambda (\rho )\Vert \Lambda (\sigma ))\) for any quantum operation \(\Lambda \). An important family of relative entropies is given by the quantum Rényi divergences, two important variants of which are known as the \(\alpha \)-relative Rényi entropy (\(\alpha \)-RRE) and the \(\alpha \)-sandwiched Rényi divergence (\(\alpha \)-SRD). These can be seen as special cases of a two-parameter family of relative entropies known as \(\alpha \)-z-Rényi relative entropies [2]. For \(\alpha \in (0,\infty ){\setminus } \lbrace 1\rbrace \) and positive semidefinite operators \(\rho \) and \(\sigma \), the \(\alpha \)-RRE \(D_\alpha (\rho \Vert \sigma )\) [39] is defined as

$$\begin{aligned} D_\alpha (\rho \Vert \sigma ) := {\left\{ \begin{array}{ll} \frac{1}{\alpha -1}\log \left\{ (\mathrm{Tr\,}\rho )^{-1} \mathrm{Tr\,}\left( \rho ^\alpha \sigma ^{1-\alpha }\right) \right\} &{} \!\begin{aligned} &{}\text {if}\, \mathrm{\,supp\,}\rho \subseteq \mathrm{\,supp\,}\sigma \,\mathrm{or}\\ &{} (\alpha \in (0,1)\,\text {and}\, \rho \not \perp \sigma ) \end{aligned} \\ +\infty &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The values at 0, 1 and \(\infty \) are determined by taking the respective limits, with \(\lim _{\alpha \rightarrow 1}D_\alpha (\rho \Vert \sigma ) = D(\rho \Vert \sigma )\). The \(\alpha \)-RRE is non-negative for quantum states \(\rho \) and \(\sigma \), and satisfies the DPI for \(\alpha \in [0,2]\) [29, 39, 47]. It has direct operational interpretations as generalized cut-off rates in quantum hypothesis testing [31] and error exponents in composite hypothesis testing [18]. Hiai et al. [24] derived necessary and sufficient conditions for equality in the DPI for \(D_\alpha (\cdot \Vert \cdot )\) (and more generally for the class of f-divergences). We discuss this result in Sect. 5.

The \(\alpha \)-SRD [36, 53] is defined for \(\alpha \in (0,\infty ){\setminus } \lbrace 1\rbrace \) and positive semidefinite operators \(\rho \) and \(\sigma \) as

$$\begin{aligned} \widetilde{D}_\alpha (\rho \Vert \sigma )&:= {\left\{ \begin{array}{ll} \frac{1}{\alpha -1}\log \left\{ (\mathrm{Tr\,}\rho )^{-1} \mathrm{Tr\,}\left[ \left( \sigma ^{(1-\alpha )/2\alpha } \rho \sigma ^{(1-\alpha )/2\alpha } \right) ^\alpha \right] \right\} &{} \!\begin{aligned} &{}\text {if}\, \mathrm{\,supp\,}\rho \subseteq \mathrm{\,supp\,}\sigma \,\mathrm{or}\\ &{} (\alpha \in (0,1)\, \text {and} \rho \not \perp \sigma ) \end{aligned} \\ +\infty &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

As before, we define \(\widetilde{D}_*(\cdot \Vert \cdot )\) for \(*\in \lbrace 0,1,\infty \rbrace \) by taking the respective limits and note that we again have \(\lim _{\alpha \rightarrow 1}\widetilde{D}_\alpha (\rho \Vert \sigma ) = D(\rho \Vert \sigma )\). However, in general \(\widetilde{D}_0(\rho \Vert \sigma ) \ne D_0(\rho \Vert \sigma )\) [14]. Furthermore, \(\widetilde{D}_\infty (\rho \Vert \sigma )\) coincides with the max-relative entropy \(D_{\text {max}}(\rho \Vert \sigma )\) [12]. The \(\alpha \)-SRD satisfies \(\widetilde{D}_\alpha (\rho \Vert \sigma )\ge 0\) for states \(\rho \) and \(\sigma \) and has operational interpretations as the strong converse exponent in various settings in quantum hypothesis testing [11, 18, 32] and classical-quantum channel coding [33]. As proved in [3, 17] (see also [21, 36, 53]), it satisfies the DPI for the range \(\alpha \in [1/2,\infty )\):

$$\begin{aligned} \widetilde{D}_\alpha (\rho \Vert \sigma ) \ge \widetilde{D}_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma )). \end{aligned}$$
(3)

Moreover, there are counterexamples to (3) for the range \(\alpha \in (0,1/2)\) [6].
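The definition of the \(\alpha \)-SRD and the DPI (3) can be explored numerically. The sketch below is only an illustration under stated assumptions: the helper names, the seed, and the choice of the partial trace as the quantum operation are ours, and the states are mixed with the identity so that all matrix powers are well conditioned.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Power of a positive semidefinite matrix, taken on its support."""
    a = (a + a.conj().T) / 2  # remove rounding asymmetry before eigh
    w, v = np.linalg.eigh(a)
    wp = np.array([x**p if x > tol else 0.0 for x in w])
    return (v * wp) @ v.conj().T

def random_state(dim, rng):
    """Random well-conditioned density matrix (hypothetical helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    w = g @ g.conj().T
    return 0.5 * w / np.trace(w).real + 0.5 * np.eye(dim) / dim

def sandwiched_renyi(rho, sigma, alpha):
    """alpha-SRD in base 2, assuming supp(rho) lies in supp(sigma)."""
    gamma = (1 - alpha) / (2 * alpha)
    s = psd_power(sigma, gamma)
    q = np.trace(psd_power(s @ rho @ s, alpha)).real
    return np.log2(q / np.trace(rho).real) / (alpha - 1)

def relative_entropy(rho, sigma):
    """QRE in base 2 for full-rank inputs."""
    def log2m(a):
        w, v = np.linalg.eigh(a)
        return (v * np.log2(w)) @ v.conj().T
    return np.trace(rho @ (log2m(rho) - log2m(sigma))).real

def trace_out_b(x, da, db):
    """Partial trace over the second tensor factor."""
    return np.trace(x.reshape(da, db, da, db), axis1=1, axis2=3)

rng = np.random.default_rng(1)
da, db = 2, 2
rho, sigma = random_state(da * db, rng), random_state(da * db, rng)
rho_a, sigma_a = trace_out_b(rho, da, db), trace_out_b(sigma, da, db)

# DPI gap D(rho||sigma) - D(rho_A||sigma_A) for several alpha >= 1/2.
gaps = {}
for alpha in (0.5, 0.75, 2.0, 5.0):
    gaps[alpha] = (sandwiched_renyi(rho, sigma, alpha)
                   - sandwiched_renyi(rho_a, sigma_a, alpha))
    print("alpha =", alpha, "DPI gap =", gaps[alpha])

# The value close to alpha = 1 should be close to the QRE.
near_one = sandwiched_renyi(rho, sigma, 1 + 1e-5)
print("near alpha = 1:", near_one, "QRE:", relative_entropy(rho, sigma))
```

All gaps should be non-negative for these values of \(\alpha \), and the value near \(\alpha = 1\) should agree with the QRE to several digits.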

2 Main result

Our main result in this paper, Theorem 1 below, is a necessary and sufficient condition for equality in the DPI (3). In order to state it properly, we first introduce some necessary notation and terminology.

Throughout this paper we only consider finite-dimensional Hilbert spaces. All logarithms are taken to base 2. For a Hilbert space \(\mathcal {H}\) we write \(\mathcal {B}(\mathcal {H})\) for the algebra of linear operators on \(\mathcal {H}\), and we denote by \(\mathcal {P}(\mathcal {H}):= \lbrace \rho \in \mathcal {B}(\mathcal {H}):\rho \ge 0\rbrace \) and \(\mathcal {D}(\mathcal {H}):= \lbrace \rho \in \mathcal {P}(\mathcal {H}):\mathrm{Tr\,}\rho =1\rbrace \) the sets of positive semidefinite operators and density matrices (or quantum states), respectively. We denote by \(\mathrm{rk}\,\ A\) the rank of an operator A, and by \(\mathrm{\,supp\,}A\) the support of A, i.e. the orthogonal complement of the kernel of A. We write \(A \not \perp B\) if \(\mathrm{\,supp\,}A \cap \mathrm{\,supp\,}B\) contains at least one non-zero vector. For a Hermitian operator A, we denote by \(\mathrm{spec}\,A\subseteq \mathbb {R}\) the set of eigenvalues of A. For a pure state \(|\psi \rangle \in \mathcal {H}\) we write \(\psi =|\psi \rangle \langle \psi |\in \mathcal {D}(\mathcal {H})\) for the corresponding rank-1 density matrix. Given a linear map \(\mathcal {L}:\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {K})\) between Hilbert spaces \(\mathcal {H}\) and \(\mathcal {K}\), the adjoint map \(\mathcal {L}^\dagger :\mathcal {B}(\mathcal {K})\rightarrow \mathcal {B}(\mathcal {H})\) is the unique map satisfying \(\langle \mathcal {L}^\dagger (Y),X\rangle = \langle Y,\mathcal {L}(X)\rangle \) for all \(X\in \mathcal {B}(\mathcal {H})\) and \(Y\in \mathcal {B}(\mathcal {K})\), where \(\langle A,B\rangle := \mathrm{Tr\,}(A^\dagger B)\) is the Hilbert-Schmidt inner product. 
A linear map \(\Phi :\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {K})\) between Hilbert spaces \(\mathcal {H}\) and \(\mathcal {K}\) is called n-positive if \(\mathrm{id}_n\otimes \Phi :\mathcal {B}(\mathbb {C}^n)\otimes \mathcal {B}(\mathcal {H}) \rightarrow \mathcal {B}(\mathbb {C}^n)\otimes \mathcal {B}(\mathcal {K})\) is positive, where \(\mathrm{id}_n\) denotes the identity map on \(\mathcal {B}(\mathbb {C}^n)\). A map is completely positive if it is n-positive for all \(n\in \mathbb {N}\). A quantum operation (or quantum channel) \(\Lambda \) between Hilbert spaces \(\mathcal {H}\) and \(\mathcal {K}\) is a linear, completely positive, and trace-preserving map \(\Lambda :\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {K})\).
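The defining relation of the adjoint map can be checked on a concrete example. The sketch below assumes a Kraus form \(\Lambda (X)=\sum _k K_k X K_k^\dagger \), which is a standard representation of quantum operations but is not used in this paper; the amplitude-damping Kraus operators and the parameter `g` are illustrative choices.

```python
import numpy as np

g = 0.3  # damping parameter (illustrative choice)
kraus = [
    np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
    np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex),
]

def channel(x):
    """Lambda(X) = sum_k K_k X K_k^dagger."""
    return sum(k @ x @ k.conj().T for k in kraus)

def adjoint(y):
    """Lambda^dagger(Y) = sum_k K_k^dagger Y K_k."""
    return sum(k.conj().T @ y @ k for k in kraus)

def hs_inner(a, b):
    """Hilbert-Schmidt inner product <A, B> = Tr(A^dagger B)."""
    return np.trace(a.conj().T @ b)

rng = np.random.default_rng(2)
x = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

lhs = hs_inner(adjoint(y), x)   # <Lambda^dagger(Y), X>
rhs = hs_inner(y, channel(x))   # <Y, Lambda(X)>
unital = adjoint(np.eye(2))     # equals the identity iff Lambda preserves the trace
print(lhs, rhs)
print(unital)
```

In this form, trace preservation of \(\Lambda \) is equivalent to \(\Lambda ^\dagger \) mapping the identity to the identity, which the last line displays.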

Our main result is given by the following theorem:

Theorem 1

Let \(\alpha \in [1/2,1)\cup (1,\infty )\) and set \(\gamma =(1-\alpha )/2\alpha \). Furthermore, let \(\rho \in \mathcal {D}(\mathcal {H})\) and \(\sigma \in \mathcal {P}(\mathcal {H})\) with \(\mathrm{\,supp\,}\rho \subseteq \mathrm{\,supp\,}\sigma \) if \(\alpha >1\) or \(\rho \not \perp \sigma \) if \(\alpha < 1\), and let \(\Lambda :\mathcal {B}(\mathcal {H}) \rightarrow \mathcal {B}(\mathcal {K})\) be a quantum operation. We have equality in the data processing inequality (3),

$$\begin{aligned} \widetilde{D}_\alpha (\rho \Vert \sigma ) = \widetilde{D}_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma )), \end{aligned}$$

if and only if

$$\begin{aligned} \sigma ^\gamma \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^{\alpha -1} \sigma ^\gamma = \Lambda ^\dagger \!\left( \Lambda (\sigma )^\gamma \left[ \Lambda (\sigma )^\gamma \Lambda (\rho ) \Lambda (\sigma )^\gamma \right] ^{\alpha -1} \Lambda (\sigma )^\gamma \right) . \end{aligned}$$

For \(\alpha > 1\) and positive trace-preserving maps, Theorem 1 was also proved using the framework of non-commutative \(L_p\)-spaces [13]. The case of equality in the DPI for the \(\alpha \)-SRD was also discussed in two papers by Hiai and Mosonyi [22] and Jenčová [25], both of which focused on the aspect of sufficiency. The connections between Theorem 1 and these results are discussed in Sect. 5. The rest of this paper is organized as follows: In Sect. 3, we analyze the proof of the DPI (3) for the \(\alpha \)-SRD as given in [17], extracting a necessary and sufficient condition for equality in (3) and thus proving Theorem 1. We present applications of Theorem 1 to entanglement and distance measures in Sect. 4. Finally, in Sect. 5 we compare our result to the recoverability/sufficiency results mentioned above and state some open questions.

3 Proof of the main result

For the remainder of the discussion we will assume that \(\rho \in \mathcal {D}(\mathcal {H})\) and \(\sigma \in \mathcal {P}(\mathcal {H})\) with \(\mathrm{\,supp\,}\rho \subseteq \mathrm{\,supp\,}\sigma \) if \(\alpha >1\), or \(\rho \not \perp \sigma \) if \(\alpha \in [1/2,1)\). We set \(\gamma =(1-\alpha )/2\alpha \) and define the trace functional

$$\begin{aligned} \widetilde{Q}_\alpha (\rho \Vert \sigma ):= \mathrm{Tr\,}\!\left[ \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^\alpha \right] \!, \end{aligned}$$

which is invariant under joint unitary conjugation and under tensoring both arguments with the same state: for any unitary U and any state \(\tau \), we have

$$\begin{aligned} \widetilde{Q}_\alpha \!\left( U\rho U^\dagger \Vert U\sigma U^\dagger \right)&= \widetilde{Q}_\alpha (\rho \Vert \sigma ), \end{aligned}$$
(4)
$$\begin{aligned} \widetilde{Q}_\alpha (\rho \otimes \tau \Vert \sigma \otimes \tau )&= \widetilde{Q}_\alpha (\rho \Vert \sigma ) . \end{aligned}$$
(5)
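To see why (5) holds, one can spell out the following short computation. It uses only the definition of \(\widetilde{Q}_\alpha \), the identity \((2\gamma +1)\alpha = 1\) (which follows from \(\gamma =(1-\alpha )/2\alpha \)), and \(\mathrm{Tr\,}\tau = 1\), with powers of \(\tau \) taken on its support:

$$\begin{aligned} \widetilde{Q}_\alpha (\rho \otimes \tau \Vert \sigma \otimes \tau )&= \mathrm{Tr\,}\!\left[ \left( \sigma ^\gamma \rho \sigma ^\gamma \otimes \tau ^{2\gamma +1}\right) ^\alpha \right] \\&= \mathrm{Tr\,}\!\left[ \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^\alpha \right] \mathrm{Tr\,}\!\left[ \tau ^{(2\gamma +1)\alpha }\right] = \widetilde{Q}_\alpha (\rho \Vert \sigma )\, \mathrm{Tr\,}\tau = \widetilde{Q}_\alpha (\rho \Vert \sigma ). \end{aligned}$$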

The \(\alpha \)-SRD can be expressed in terms of this trace functional as \(\widetilde{D}_\alpha (\rho \Vert \sigma ) = \frac{1}{\alpha -1}\log \widetilde{Q}_\alpha (\rho \Vert \sigma )\), and hence, \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\) inherits the invariance properties (4) and (5) from \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\). By virtue of the Stinespring Representation Theorem [44], the DPI (3) is thus equivalent to monotonicity of the \(\alpha \)-SRD under partial trace:

$$\begin{aligned} \widetilde{D}_\alpha (\rho _{AB}\Vert \sigma _{AB}) \ge \widetilde{D}_\alpha (\rho _A\Vert \sigma _A), \end{aligned}$$
(6)

where the subscripts AB and A indicate the Hilbert spaces \(\mathcal {H}_{AB}=\mathcal {H}_A\otimes \mathcal {H}_B\) and \(\mathcal {H}_A\) on which the density matrices act, and the partial trace is taken over the B system. Since the logarithm is monotonically increasing, the monotonicity of \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\) under partial trace (6) is in turn equivalent to the following monotonicity properties of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\):

$$\begin{aligned} \widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB})&\le \widetilde{Q}_\alpha (\rho _A\Vert \sigma _A)\quad \text {for}\, \alpha \in [1/2,1),\nonumber \\ \widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB})&\ge \widetilde{Q}_\alpha (\rho _A\Vert \sigma _A)\quad \text {for}\, \alpha \in (1,\infty ). \end{aligned}$$
(7)

We set \(d=\dim \mathcal {H}_B\) and let \(\lbrace V_i \rbrace _{i=1}^{d^2}\) be a representation of the discrete Heisenberg-Weyl group on \(\mathcal {H}_B\), satisfying the following relation (see, e.g. [51] or [54]):

$$\begin{aligned} \frac{1}{d^2}\sum _i (\mathbbm {1}_A\otimes V_i) \rho _{AB} (\mathbbm {1}_A\otimes V_i^\dagger )&= \rho _A\otimes \pi _B, \end{aligned}$$
(8)

where \(\pi _B = \mathbbm {1}_B/d\) denotes the completely mixed state on B. The crucial ingredient in proving (7) is then the joint concavity/convexity of the trace functional \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\):

Proposition 2

([17]) The functional \((\rho ,\sigma )\mapsto \widetilde{Q}_\alpha (\rho \Vert \sigma )\) is jointly concave for \(\alpha \in [1/2,1)\) and jointly convex for \(\alpha \in (1,\infty )\).

Remark 3

The joint convexity/concavity of the trace functional \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) is a special case of the joint convexity/concavity of a more general trace functional underlying the \(\alpha \)-z-Rényi relative entropies mentioned in Sect. 1, which was proved by Hiai [21] using the theory of Pick functions. A more accessible proof can be found in the arXiv version of [2].

Joint convexity/concavity of the trace functional \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) as stated in Proposition 2 can be used to prove the monotonicity of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) under partial trace (7) as follows: Abbreviating \(V_i\equiv \mathbbm {1}_A\otimes V_i\), we have for \(\alpha >1\) that

$$\begin{aligned} \widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB})&= \widetilde{Q}_\alpha \! \left( V_i\rho _{AB} V_i^\dagger \,\Vert \,V_i \sigma _{AB} V_i^\dagger \right) \nonumber \\&= \frac{1}{d^2} \sum _i \widetilde{Q}_\alpha \!\left( V_i\rho _{AB} V_i^\dagger \,\Vert \,V_i \sigma _{AB} V_i^\dagger \right) \nonumber \\&\ge \widetilde{Q}_\alpha \left( d^{-2} \sum _i V_i\rho _{AB} V_i^\dagger \,\Vert \,d^{-2} \sum _i V_i \sigma _{AB} V_i^\dagger \right) \nonumber \\&= \widetilde{Q}_\alpha ( \rho _A\otimes \pi _B\Vert \sigma _A\otimes \pi _B)\nonumber \\&= \widetilde{Q}_\alpha (\rho _A \Vert \sigma _A). \end{aligned}$$
(9)

In the first equality we used the invariance of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) under joint unitary conjugation (4). The inequality follows from the joint convexity of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) as stated in Proposition 2. In the third equality we used property (8) of the Heisenberg-Weyl operators, and in the last equality we used the invariance of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) under tensoring with a fixed state (5). For \(\alpha \in [1/2,1)\), we go through the same steps as above to show that

$$\begin{aligned} \widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB}) \le \widetilde{Q}_\alpha (\rho _A\Vert \sigma _A), \end{aligned}$$

only this time employing the joint concavity of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) from Proposition 2 in (9).
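The twirling identity (8) and the resulting monotonicity (7) can be verified numerically. The sketch below is an illustration under assumptions of ours: it realizes the Heisenberg-Weyl operators as products \(X^jZ^k\) of the usual shift and clock matrices, and the helper names, dimensions, and seed are hypothetical.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Power of a positive semidefinite matrix, taken on its support."""
    a = (a + a.conj().T) / 2
    w, v = np.linalg.eigh(a)
    wp = np.array([x**p if x > tol else 0.0 for x in w])
    return (v * wp) @ v.conj().T

def random_state(dim, rng):
    """Random well-conditioned density matrix (hypothetical helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    w = g @ g.conj().T
    return 0.5 * w / np.trace(w).real + 0.5 * np.eye(dim) / dim

def q_tilde(rho, sigma, alpha):
    """Trace functional Tr[(sigma^gamma rho sigma^gamma)^alpha]."""
    gamma = (1 - alpha) / (2 * alpha)
    s = psd_power(sigma, gamma)
    return np.trace(psd_power(s @ rho @ s, alpha)).real

def heisenberg_weyl(d):
    """The d^2 operators X^j Z^k built from the shift X and the clock Z."""
    shift = np.roll(np.eye(d), 1, axis=0)
    clock = np.diag(np.exp(2j * np.pi * np.arange(d) / d))
    return [np.linalg.matrix_power(shift, j) @ np.linalg.matrix_power(clock, k)
            for j in range(d) for k in range(d)]

def twirl_b(x, da, d):
    """Left-hand side of (8): average over 1_A (x) V_i acting on B."""
    ops = [np.kron(np.eye(da), v) for v in heisenberg_weyl(d)]
    return sum(v @ x @ v.conj().T for v in ops) / d**2

def trace_out_b(x, da, db):
    """Partial trace over the second tensor factor."""
    return np.trace(x.reshape(da, db, da, db), axis1=1, axis2=3)

rng = np.random.default_rng(3)
da, d = 2, 3
rho, sigma = random_state(da * d, rng), random_state(da * d, rng)
rho_a, sigma_a = trace_out_b(rho, da, d), trace_out_b(sigma, da, d)

twirled = twirl_b(rho, da, d)
target = np.kron(rho_a, np.eye(d) / d)  # rho_A (x) pi_B
print("max deviation in (8):", np.abs(twirled - target).max())

# Monotonicity (7): Q increases under partial trace for alpha < 1, decreases for alpha > 1.
print(q_tilde(rho, sigma, 0.5), "<=", q_tilde(rho_a, sigma_a, 0.5))
print(q_tilde(rho, sigma, 2.0), ">=", q_tilde(rho_a, sigma_a, 2.0))
```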

To derive an equality condition for (7) (and hence, (3)), we take a closer look at Proposition 2. The key ingredient in its proof in [17] is to rewrite the trace functional \(\widetilde{Q}_\alpha (\rho \Vert \sigma )\) as follows: Defining the function

$$\begin{aligned} f_\alpha (H,\rho ,\sigma ):= \alpha \mathrm{Tr\,}\rho H - (\alpha -1) \mathrm{Tr\,}\left[ \left( \sigma ^{-\gamma } H \sigma ^{-\gamma }\right) ^{\alpha /(\alpha -1)}\right] \!, \end{aligned}$$

it holds that

$$\begin{aligned} \widetilde{Q}_\alpha (\rho \Vert \sigma ) = {\left\{ \begin{array}{ll} \inf \nolimits _{H\ge 0} f_\alpha (H,\rho ,\sigma ) &{} \text {if}\, \alpha \in [1/2,1)\\ \sup \nolimits _{H\ge 0} f_\alpha (H,\rho ,\sigma ) &{} \text {if}\, \alpha >1.\\ \end{array}\right. } \end{aligned}$$
(10)

The joint concavity/convexity of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) then follows from showing that \((\rho ,\sigma )\mapsto f_\alpha (H,\rho ,\sigma )\) is jointly concave/convex for fixed H and the respective ranges of \(\alpha \). Moreover, in the course of proving the validity of (10), Frank and Lieb [17] show that for fixed \(\rho ,\sigma \) a critical point of \(f_\alpha (H,\rho ,\sigma )\) satisfying \(\partial f_\alpha (H,\rho ,\sigma )/\partial H = 0\) is given by

$$\begin{aligned} \hat{H} = \sigma ^\gamma (\sigma ^\gamma \rho \sigma ^\gamma )^{\alpha -1} \sigma ^\gamma . \end{aligned}$$
(11)

As \(H\mapsto f_\alpha (H,\rho ,\sigma )\) is concave for \(\alpha >1\) and convex for \(\alpha \in [1/2,1)\), the critical point \(\hat{H}\) in (11) is a maximum of \(f_\alpha (H,\rho ,\sigma )\) for \(\alpha >1\) and fixed \(\rho \) and \(\sigma \), and a minimum of \(f_\alpha (H,\rho ,\sigma )\) for \(\alpha \in [1/2,1)\) and fixed \(\rho \) and \(\sigma \). Consequently, it holds that

$$\begin{aligned} \widetilde{Q}_\alpha (\rho \Vert \sigma ) = f_\alpha (\hat{H},\rho ,\sigma )\quad \text {for all}\, \alpha \in [1/2,1)\cup (1,\infty ). \end{aligned}$$
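That \(\hat{H}\) from (11) indeed reproduces the trace functional can also be checked by direct substitution. The computation below is a sketch that assumes, purely to keep it short, that \(\sigma \) is invertible; powers of \(\sigma ^\gamma \rho \sigma ^\gamma \) are taken on its support. By cyclicity of the trace,

$$\begin{aligned} \mathrm{Tr\,}\rho \hat{H} = \mathrm{Tr\,}\!\left[ \sigma ^\gamma \rho \sigma ^\gamma \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^{\alpha -1}\right] = \widetilde{Q}_\alpha (\rho \Vert \sigma ), \qquad \left( \sigma ^{-\gamma }\hat{H}\sigma ^{-\gamma }\right) ^{\alpha /(\alpha -1)} = \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^{\alpha }, \end{aligned}$$

and therefore

$$\begin{aligned} f_\alpha (\hat{H},\rho ,\sigma ) = \alpha \, \widetilde{Q}_\alpha (\rho \Vert \sigma ) - (\alpha -1)\, \widetilde{Q}_\alpha (\rho \Vert \sigma ) = \widetilde{Q}_\alpha (\rho \Vert \sigma ). \end{aligned}$$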

In the following, we show that \(H\mapsto f_\alpha (H,\rho ,\sigma )\) is in fact strictly concave/convex, such that the optimizer \(\hat{H}\) in (11) is a unique maximizer/minimizer. To this end, we employ the following result, which is proved e.g. in [10, Thm. 2.10]:

Theorem 4

Let A be a Hermitian matrix with \(\mathrm{spec} \ A\subseteq \mathcal {D}\subseteq \mathbb {R}\), and let \(g:\mathcal {D}\rightarrow \mathbb {R}\) be a continuous, (strictly) convex function. Then the function \(A\mapsto \mathrm{Tr\,}g(A)\) is (strictly) convex.

Let us first consider \(\alpha >1\). The function \(H\mapsto \mathrm{Tr\,}[( \sigma ^{-\gamma } H \sigma ^{-\gamma })^{\alpha /(\alpha -1)}]\) is the composition of the linear function \(X\mapsto \sigma ^{-\gamma }X\sigma ^{-\gamma }\) and the functional \(A\mapsto \mathrm{Tr\,}A^{\alpha /(\alpha -1)}\), the latter being strictly convex by Theorem 4 upon choosing \(g:\mathbb {R}_+\rightarrow \mathbb {R}_+,\, g(x)= x^{\alpha /(\alpha -1)}\). As \(\alpha >1\), the function \(H\mapsto -(\alpha -1)\mathrm{Tr\,}[( \sigma ^{-\gamma } H \sigma ^{-\gamma })^{\alpha /(\alpha -1)}]\) is, therefore, strictly concave, and hence, \(f_\alpha (H,\rho ,\sigma )\) is strictly concave, since it is the sum of a linear function and a strictly concave function. In the case \(\alpha \in [1/2,1)\), a similar argument shows that \(f_\alpha (H,\rho ,\sigma )\) is strictly convex.

We have seen in (9) above that the joint concavity/convexity of the trace functional \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\) is the only step in the proof of (6) involving an inequality. Let us analyze this step further. For \(\alpha >1\), we abbreviate \(\rho _i=V_i\rho _{AB}V_i^\dagger \), \(\sigma _i=V_i\sigma _{AB} V_i^\dagger \), and \(\lambda _i=d^{-2}\) for \(i=1,\ldots ,d^2\), such that \(\rho = \sum _i\lambda _i \rho _i = \rho _A\otimes \pi _B\) and \(\sigma = \sum _i\lambda _i\sigma _i = \sigma _A\otimes \pi _B\) by (8). We then consider the operators

$$\begin{aligned} \bar{H}&:= \mathrm{arg\,max}_H f_\alpha (H,\rho ,\sigma )&H_i&:= \mathrm{arg\,max}_H f_\alpha (H, \rho _i,\sigma _i), \end{aligned}$$

which are well-defined by the preceding discussion. The inequality in (9) can now be written as

$$\begin{aligned} \widetilde{Q}_\alpha (\rho \Vert \sigma ) = f_\alpha \!\left( \bar{H},\rho ,\sigma \right) \le \sum _i \lambda _i f_\alpha \!\left( \bar{H}, \rho _i,\sigma _i\right) \le \sum _i \lambda _i f_\alpha \!\left( H_i,\rho _i,\sigma _i\right) = \sum _i \lambda _i \widetilde{Q}_\alpha (\rho _i\Vert \sigma _i). \end{aligned}$$
(12)

Assume now that we have equality in the joint convexity, that is, \(\widetilde{Q}_\alpha (\rho \Vert \sigma ) = \sum _i \lambda _i \widetilde{Q}_\alpha (\rho _i\Vert \sigma _i)\). Then the chain of inequalities in (12) collapses, and in particular we obtain

$$\begin{aligned} f_\alpha \!\left( \bar{H}, \rho _i,\sigma _i\right) = f_\alpha \!\left( H_i,\rho _i,\sigma _i\right) \quad \text {for every}\quad i=1,\ldots ,d^2. \end{aligned}$$

In other words, the operator \(\bar{H}\) maximizes \(f_\alpha (H,\rho _i,\sigma _i)\) for every \(i=1,\ldots ,d^2\), and since the maximizing element of \(f_\alpha (H,\rho _i,\sigma _i)\) is unique, we obtain \(\bar{H}= H_i\) for every \(i=1,\ldots ,d^2\). In the case \(\alpha \in [1/2,1)\), we define \(\bar{H}:= \mathrm{arg\,min}_H f_\alpha (H,\rho ,\sigma )\) and \(H_i := \mathrm{arg\,min}_H f_\alpha (H, \rho _i,\sigma _i)\). The inequalities in (12) are now reversed due to the joint concavity of \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\), and since \(f_\alpha (\cdot ,\rho _i,\sigma _i)\) attains a minimum at \(H_i\). Again, we obtain \(\bar{H}= H_i\) for every \(i=1,\ldots ,d^2\).

Using the explicit form of the optimal \(\hat{H}\) from (11), we can write out the condition \(\bar{H}= H_i\) with the choices for \(\rho _i\), \(\sigma _i\), and \(\lambda _i\) made above, obtaining for every \(i=1,\ldots ,d^2\)

$$\begin{aligned}&\sigma _A^\gamma \left( \sigma _A^\gamma \rho _A \sigma _A^\gamma \right) ^{\alpha -1} \sigma _A^\gamma \otimes \pi _B^{2\gamma + (\alpha -1)(2\gamma +1)} \nonumber \\&\quad = (\mathbbm {1}_A\otimes V_i) \sigma _{AB}^\gamma \left( \sigma _{AB}^\gamma \rho _{AB} \sigma _{AB}^\gamma \right) ^{\alpha -1} \sigma _{AB}^\gamma \left( \mathbbm {1}_A\otimes V_i^\dagger \right) \!. \end{aligned}$$
(13)

Since \(2\gamma + (\alpha -1)(2\gamma + 1) = 0\), the dimension factor of \(\pi _B\) cancels, and eliminating the unitary \(V_i\) in (13) yields

$$\begin{aligned} \sigma _A^\gamma \left( \sigma _A^\gamma \rho _A \sigma _A^\gamma \right) ^{\alpha -1} \sigma _A^\gamma \otimes \mathbbm {1}_B = \sigma _{AB}^\gamma \left( \sigma _{AB}^\gamma \rho _{AB} \sigma _{AB}^\gamma \right) ^{\alpha -1} \sigma _{AB}^\gamma . \end{aligned}$$
(14)

This is a necessary condition for equality in the monotonicity of the trace functional \(\widetilde{Q}_\alpha (\cdot \Vert \cdot )\). Furthermore, it is easy to see that (14) is also sufficient, as \(\widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB}) = \widetilde{Q}_\alpha (\rho _A\Vert \sigma _A)\) follows from multiplying (14) by \(\rho _{AB}\), taking the trace, and using cyclicity of the trace. In summary, we have, therefore, proved the following:

Proposition 5

Let \(\alpha \in [1/2,1)\cup (1,\infty )\), then we have \(\widetilde{D}_\alpha (\rho _{AB}\Vert \sigma _{AB}) = \widetilde{D}_\alpha (\rho _A\Vert \sigma _A)\) if and only if

$$\begin{aligned} \sigma _A^\gamma \left( \sigma _A^\gamma \rho _A \sigma _A^\gamma \right) ^{\alpha -1} \sigma _A^\gamma \otimes \mathbbm {1}_B = \sigma _{AB}^\gamma \left( \sigma _{AB}^\gamma \rho _{AB} \sigma _{AB}^\gamma \right) ^{\alpha -1} \sigma _{AB}^\gamma . \end{aligned}$$
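For completeness, the two short computations behind the passage from (13) to Proposition 5 can be written out as follows. The shorthand \(\hat{H}_A\) and \(\hat{H}_{AB}\) for the operators on the left-hand and right-hand side of (14), respectively, is introduced here only for readability. First, since \(2\gamma = (1-\alpha )/\alpha \) and \(2\gamma +1 = 1/\alpha \), the exponent of \(\pi _B\) in (13) vanishes:

$$\begin{aligned} 2\gamma + (\alpha -1)(2\gamma +1) = \frac{1-\alpha }{\alpha } + \frac{\alpha -1}{\alpha } = 0. \end{aligned}$$

Second, if (14) holds, that is, \(\hat{H}_A\otimes \mathbbm {1}_B = \hat{H}_{AB}\), then multiplying by \(\rho _{AB}\) and taking the trace gives

$$\begin{aligned} \widetilde{Q}_\alpha (\rho _{AB}\Vert \sigma _{AB}) = \mathrm{Tr\,}\!\left[ \rho _{AB} \hat{H}_{AB}\right] = \mathrm{Tr\,}\!\left[ \rho _{AB} \left( \hat{H}_A\otimes \mathbbm {1}_B\right) \right] = \mathrm{Tr\,}\!\left[ \rho _A \hat{H}_A\right] = \widetilde{Q}_\alpha (\rho _A\Vert \sigma _A), \end{aligned}$$

where the first and last equalities use cyclicity of the trace together with the explicit form (11).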

We are now in a position to prove our main result, Theorem 1:

Proof of Theorem 1

For the quantum operation \(\Lambda :\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {K})\), the Stinespring Representation Theorem [44] asserts that there is a Hilbert space \(\mathcal {H}'\), a pure state \(|\tau \rangle \in \mathcal {H}'\otimes \mathcal {K}\), and a unitary U acting on \(\mathcal {H}\otimes \mathcal {H}'\otimes \mathcal {K}\) such that for every \(\rho \in \mathcal {B}(\mathcal {H})\) we have

$$\begin{aligned} \Lambda (\rho ) = \mathrm{Tr\,}_{12}\left( U(\rho \otimes \tau )U^\dagger \right) \!, \end{aligned}$$

where \(\mathrm{Tr\,}_{12}\) denotes the partial trace over \(\mathcal {H}\) and \(\mathcal {H}'\), that is, the first two factors of \(\mathcal {H}\otimes \mathcal {H}'\otimes \mathcal {K}\). We then have

$$\begin{aligned} \widetilde{D}_\alpha (\rho \Vert \sigma )&= \widetilde{D}_\alpha \!\left( U(\rho \otimes \tau )U^\dagger \,\Vert \,U(\sigma \otimes \tau )U^\dagger \right) \\&\ge \widetilde{D}_\alpha \!\left( \mathrm{Tr\,}_{12}\left( U(\rho \otimes \tau )U^\dagger \right) \,\Vert \,\mathrm{Tr\,}_{12}\left( U(\sigma \otimes \tau )U^\dagger \right) \right) \\&= \widetilde{D}_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma )), \end{aligned}$$

where the first line follows from (4) and (5), and the inequality follows from (6). By Proposition 5 we have equality in the second line if and only if

$$\begin{aligned} \left( U(\sigma \otimes \tau )U^\dagger \right) ^{\gamma } \left[ \left( U(\sigma \otimes \tau )U^\dagger \right) ^\gamma U(\rho \otimes \tau )U^\dagger \left( U(\sigma \otimes \tau )U^\dagger \right) ^\gamma \right] ^{\alpha -1} \left( U(\sigma \otimes \tau )U^\dagger \right) ^{\gamma } \\ = \mathbbm {1}_{\mathcal {H}\otimes \mathcal {H}'} \otimes \Lambda (\sigma )^{\gamma }\left[ \Lambda (\sigma )^\gamma \Lambda (\rho )\Lambda (\sigma )^\gamma \right] ^{\alpha -1} \Lambda (\sigma )^\gamma . \end{aligned}$$

Using the fact that \(f(UXU^\dagger )=Uf(X)U^\dagger \) for every function f and unitary U, this is equivalent to

$$\begin{aligned} U\left( \sigma ^{\gamma }\left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^{\alpha -1} \sigma ^\gamma \otimes \tau \right) U^\dagger = \mathbbm {1}_{\mathcal {H}\otimes \mathcal {H}'} \otimes \Lambda (\sigma )^{\gamma }\left[ \Lambda (\sigma )^\gamma \Lambda (\rho )\Lambda (\sigma )^\gamma \right] ^{\alpha -1} \Lambda (\sigma )^\gamma . \end{aligned}$$

The theorem now follows from the fact that the adjoint of \(\Lambda \) is given by \(\Lambda ^\dagger (\omega ) = V^\dagger (\mathbbm {1}_{\mathcal {H}\otimes \mathcal {H}'} \otimes \omega )V\), where \(V = U(\mathbbm {1}_\mathcal {H}\otimes |\tau \rangle )\) is the Stinespring isometry of \(\Lambda \) satisfying \(V^\dagger V = \mathbbm {1}_{\mathcal {H}}\). \(\square \)
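The equality condition of Theorem 1 can be tested numerically for the partial trace \(\Lambda = \mathrm{Tr\,}_B\), whose adjoint is \(\Lambda ^\dagger (X) = X\otimes \mathbbm {1}_B\). The sketch below is an illustration only: the helper names, the value \(\alpha = 2\), the dimensions, and the seed are assumptions of ours. It compares a product example, where both arguments share the same full-rank factor on B and equality is expected, with a generic pair of states.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Power of a positive semidefinite matrix, taken on its support."""
    a = (a + a.conj().T) / 2
    w, v = np.linalg.eigh(a)
    wp = np.array([x**p if x > tol else 0.0 for x in w])
    return (v * wp) @ v.conj().T

def random_state(dim, rng):
    """Random well-conditioned density matrix (hypothetical helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    w = g @ g.conj().T
    return 0.5 * w / np.trace(w).real + 0.5 * np.eye(dim) / dim

def h_hat(rho, sigma, alpha):
    """sigma^g (sigma^g rho sigma^g)^(alpha-1) sigma^g with g = (1-alpha)/(2 alpha)."""
    gamma = (1 - alpha) / (2 * alpha)
    s = psd_power(sigma, gamma)
    return s @ psd_power(s @ rho @ s, alpha - 1) @ s

def sandwiched_renyi(rho, sigma, alpha):
    """alpha-SRD in base 2 for a normalized first argument."""
    gamma = (1 - alpha) / (2 * alpha)
    s = psd_power(sigma, gamma)
    return np.log2(np.trace(psd_power(s @ rho @ s, alpha)).real) / (alpha - 1)

def trace_out_b(x, da, db):
    """Partial trace over the second tensor factor."""
    return np.trace(x.reshape(da, db, da, db), axis1=1, axis2=3)

rng = np.random.default_rng(4)
da, db, alpha = 2, 2, 2.0
eye_b = np.eye(db)

# Equality case: rho_A (x) tau versus sigma_A (x) tau with tau full rank.
rho_a, sigma_a, tau = (random_state(2, rng) for _ in range(3))
rho_eq, sigma_eq = np.kron(rho_a, tau), np.kron(sigma_a, tau)
lhs_eq = h_hat(rho_eq, sigma_eq, alpha)
rhs_eq = np.kron(h_hat(rho_a, sigma_a, alpha), eye_b)  # adjoint of Tr_B
gap_eq = (sandwiched_renyi(rho_eq, sigma_eq, alpha)
          - sandwiched_renyi(rho_a, sigma_a, alpha))

# Generic case: correlated random states, for which a strict DPI gap is expected.
rho_g, sigma_g = random_state(da * db, rng), random_state(da * db, rng)
rho_ga, sigma_ga = trace_out_b(rho_g, da, db), trace_out_b(sigma_g, da, db)
lhs_g = h_hat(rho_g, sigma_g, alpha)
rhs_g = np.kron(h_hat(rho_ga, sigma_ga, alpha), eye_b)
gap_g = (sandwiched_renyi(rho_g, sigma_g, alpha)
         - sandwiched_renyi(rho_ga, sigma_ga, alpha))

print("product case: gap", gap_eq, "condition error", np.abs(lhs_eq - rhs_eq).max())
print("generic case: gap", gap_g, "condition error", np.abs(lhs_g - rhs_g).max())
```

In the product case both the DPI gap and the deviation from the operator identity should vanish up to rounding, whereas in the generic case both should be visibly non-zero.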

4 Applications

In this section we discuss applications of Theorem 1. Our goal is to generalize a set of results by Carlen and Lieb [9] about the Araki–Lieb inequality and entanglement of formation by proving the corresponding results for Rényi quantities. In Sect. 4.1 we state a Rényi version of the Araki–Lieb inequality (Lemma 8) and analyze the case of equality (Theorem 9). In Sect. 4.2 we first prove a general lower bound on the Rényi entanglement of formation (analogous to the corresponding bound on the entanglement of formation in [9]) and then use the results from Sect. 4.1 to show that this lower bound is achieved by states saturating the Rényi version of the Araki–Lieb inequality. These results are presented in Theorem 13. Finally, in Sect. 4.3 we discuss the case of equality in a well-known upper bound on the entanglement fidelity in terms of the usual fidelity, which we state in Proposition 15.

We start with a few definitions. For \(\rho \in \mathcal {D}(\mathcal {H})\) the von Neumann entropy \(S(\rho )\) is defined as \(S(\rho ):= -\mathrm{Tr\,}(\rho \log \rho ) = -D(\rho \Vert \mathbbm {1})\), and we write \(S(A)_\rho \equiv S(\rho _A)\) for a state \(\rho _A\) acting on a Hilbert space \(\mathcal {H}_A\). The conditional entropy \(S(A|B)_\rho \) is defined as \(S(A|B)_\rho := S(AB)_\rho - S(B)_\rho = -D(\rho _{AB}\Vert \mathbbm {1}_A\otimes \rho _B)\). In our discussion we consider the following Rényi generalization of the conditional entropy, first defined in [36]: For \(\alpha \in [1/2,\infty )\), the \(\alpha \)-Rényi conditional entropy of a bipartite state \(\rho _{AB}\) is defined as

$$\begin{aligned} \widetilde{S}_\alpha (A|B)_\rho := -\min _{\sigma _B} \widetilde{D}_\alpha (\rho _{AB}\Vert \mathbbm {1}_A\otimes \sigma _B), \end{aligned}$$

where the minimization is over states \(\sigma _B\), and we set \(\widetilde{S}_1(A|B)_\rho := \lim _{\alpha \rightarrow 1}\widetilde{S}_\alpha (A|B)_\rho = S(A|B)_\rho \). The \(\alpha \)-Rényi conditional entropy satisfies the following duality relation:

Proposition 6

([3, 36]) Let \(\rho _{ABC}\) be a pure state with marginals \(\rho _{AB}\) and \(\rho _{AC}\). For \(\alpha ,\beta \in [1/2,\infty )\) such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 2\), we have

$$\begin{aligned} \widetilde{S}_\alpha (A|B)_\rho = -\widetilde{S}_\beta (A|C)_\rho . \end{aligned}$$
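As a quick worked example of the pairing of parameters in Proposition 6, solving \(\frac{1}{\alpha } + \frac{1}{\beta } = 2\) for \(\beta \) with \(\alpha > 1/2\) gives

$$\begin{aligned} \beta = \frac{\alpha }{2\alpha -1}, \qquad \text {e.g.}\quad \alpha = 2 \;\Rightarrow \; \beta = \tfrac{2}{3}, \qquad \alpha = \tfrac{3}{4} \;\Rightarrow \; \beta = \tfrac{3}{2}, \qquad \alpha = 1 \;\Rightarrow \; \beta = 1. \end{aligned}$$

In particular, \(\alpha > 1\) is paired with \(\beta \in (1/2,1)\) and vice versa, so the duality relation always connects the two parameter ranges that are treated separately in Sect. 3.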

4.1 Rényi version of Araki–Lieb inequality and the case of equality

The Araki–Lieb inequality [1] states that for every bipartite state \(\rho _{AB}\),

$$\begin{aligned} S(AB)_\rho \ge \left| S(A)_\rho - S(B)_\rho \right| . \end{aligned}$$
(15)
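The inequality (15) is easy to probe numerically. The following sketch, whose helper names, dimensions, and seed are illustrative assumptions, evaluates the slack \(S(AB)_\rho - |S(A)_\rho - S(B)_\rho |\) for a random mixed state and for a random pure state; for the latter, \(S(AB)_\rho = 0\) and \(S(A)_\rho = S(B)_\rho \), so the slack should vanish.

```python
import numpy as np

def entropy(rho, tol=1e-12):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > tol]
    return float(-np.sum(w * np.log2(w)))

def marginals(x, da, db):
    """Return (rho_A, rho_B) of an operator on H_A (x) H_B."""
    t = x.reshape(da, db, da, db)
    return np.trace(t, axis1=1, axis2=3), np.trace(t, axis1=0, axis2=2)

def araki_lieb_slack(rho, da, db):
    """S(AB) - |S(A) - S(B)|, which (15) asserts to be non-negative."""
    rho_a, rho_b = marginals(rho, da, db)
    return entropy(rho) - abs(entropy(rho_a) - entropy(rho_b))

rng = np.random.default_rng(5)
da, db = 2, 3
dim = da * db

# Random mixed state on AB.
g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
mixed = g @ g.conj().T
mixed = mixed / np.trace(mixed).real

# Random pure state on AB.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi = psi / np.linalg.norm(psi)
pure = np.outer(psi, psi.conj())

slack_mixed = araki_lieb_slack(mixed, da, db)
slack_pure = araki_lieb_slack(pure, da, db)
print("slack (mixed):", slack_mixed)
print("slack (pure):", slack_pure)
```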

There are a few different characterizations for the case of equality in the Araki–Lieb inequality [9, 37, 55]. Here, we concentrate on a result by Carlen and Lieb:

Theorem 7

(Equality in the Araki–Lieb inequality; [9]) For a bipartite state \(\rho _{AB}\) denote by \(r_{AB}\), \(r_A\), and \(r_B\) the ranks of \(\rho _{AB}\), \(\rho _A\), and \(\rho _B\), respectively. The state \(\rho _{AB}\) saturates the Araki–Lieb inequality (15),

$$\begin{aligned} S(AB)_\rho = S(B)_\rho - S(A)_\rho , \end{aligned}$$

if and only if the following conditions are satisfied:

(i) \(r_B = r_A r_{AB}\).

(ii) The state \(\rho _{AB}\) has a spectral decomposition of the form

$$\begin{aligned} \rho _{AB} = \sum _{i=1}^{r_{AB}} \lambda _i |i\rangle \langle i|_{AB}, \end{aligned}$$

where the vectors \(\lbrace |i\rangle _{AB}\rbrace _{i=1}^{r_{AB}}\) are such that \(\mathrm{Tr\,}_B|i\rangle \langle j|_{AB} = \delta _{ij}\rho _A\) for \(i,j=1,\ldots ,r_{AB}\).

We can regard the Araki–Lieb inequality (15) as a pair of lower bounds on the conditional entropies:

$$\begin{aligned} S(A|B)_\rho&\ge - S(A)_\rho&S(B|A)_\rho&\ge - S(B)_\rho . \end{aligned}$$
(16)

In the following, we only focus on the bound \(S(A|B)_\rho \ge - S(A)_\rho \), noting that all the results we obtain hold for \(S(B|A)_\rho \) in an analogous manner. The formulation (16) of the Araki–Lieb inequality admits a simple proof based on duality as follows: With a purification \(|\rho \rangle _{ABC}\) of \(\rho _{AB}\), we have

$$\begin{aligned} S(A|B)_\rho&= - S(A|C)_\rho = D(\rho _{AC}\Vert \mathbbm {1}_A\otimes \rho _C) \ge D(\rho _A\Vert \mathbbm {1}_A) = -S(A)_\rho , \end{aligned}$$
(17)

where the first equality follows from duality for the conditional entropy, and the inequality follows from the DPI for the QRE with \(\Lambda = \mathrm{Tr\,}_C\).

The advantage of phrasing the Araki–Lieb inequality in the form of (16) is that we can easily generalize it to Rényi quantities. To this end, we simply replace the von Neumann quantities in (17) by the Rényi conditional entropy and the Rényi entropy, defined as

$$\begin{aligned} S_\alpha (A)_\rho = -\widetilde{D}_\alpha (\rho _A\Vert \mathbbm {1}_A) = \frac{1}{1-\alpha } \log \mathrm{Tr\,}\rho _A^\alpha . \end{aligned}$$
(18)
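The Rényi entropy in (18) depends only on the spectrum of \(\rho _A\), so it is straightforward to evaluate numerically. The sketch below, with hypothetical helper names `renyi_entropy` and `dual_order`, computes \(S_\alpha \) from a list of eigenvalues and the dual order \(\beta = \alpha /(2\alpha -1)\) solving \(\frac{1}{\alpha }+\frac{1}{\beta }=2\). The natural logarithm and the example spectrum are assumptions made for illustration.

```python
import math

def renyi_entropy(spectrum, alpha):
    """S_alpha = log(sum_i p_i^alpha) / (1 - alpha), as in (18); alpha = 1 gives the von Neumann limit."""
    probs = [p for p in spectrum if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)

def dual_order(alpha):
    """Return beta with 1/alpha + 1/beta = 2, i.e. beta = alpha / (2 alpha - 1)."""
    if alpha <= 0.5:
        raise ValueError("alpha must exceed 1/2 for a finite dual order")
    return alpha / (2.0 * alpha - 1.0)

spectrum = [0.5, 0.3, 0.2]   # hypothetical eigenvalues of rho_A
for alpha in (0.75, 1.0, 2.0):
    print(alpha, dual_order(alpha), renyi_entropy(spectrum, alpha))
```

The endpoint \(\alpha =1/2\) is rejected in `dual_order` because the relation \(\frac{1}{\alpha }+\frac{1}{\beta }=2\) then has no finite solution \(\beta \).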

With \(\alpha ,\beta \in [1/2,\infty )\) such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 2\), we then have

$$\begin{aligned} \widetilde{S}_\alpha (A|B)_\rho = -\widetilde{S}_\beta (A|C)_\rho = \widetilde{D}_\beta (\rho _{AC}\Vert \mathbbm {1}_A\otimes \tilde{\sigma }_C) \ge \widetilde{D}_\beta (\rho _A\Vert \mathbbm {1}_A) = -S_\beta (A)_\rho . \end{aligned}$$
(19)

Here, we used Proposition 6 in the first equality, chose an optimizing state \(\tilde{\sigma }_C\) for \(\widetilde{S}_\beta (A|C)_\rho \) in the second equality, and the inequality is simply the DPI for the \(\beta \)-SRD with respect to \(\Lambda =\mathrm{Tr\,}_C\). We also obtain the upper bound \(\widetilde{S}_\alpha (A|B)_\rho \le S_\alpha (A)_\rho \) by a simple application of the DPI with respect to \(\Lambda =\mathrm{Tr\,}_B\). Hence, we have proved

Lemma 8

(Rényi version of the Araki–Lieb inequality) Let \(\rho _{AB}\) be a bipartite state, and \(\alpha ,\beta \in [1/2,\infty )\) be such that \(\frac{1}{\alpha } + \frac{1}{\beta } = 2\); then

$$\begin{aligned} -S_\beta (A)_\rho \le \widetilde{S}_\alpha (A|B)_\rho \le S_\alpha (A)_\rho . \end{aligned}$$
(20)

Since the inequality in the lower bound of Lemma 8 stems from the DPI for \(\widetilde{D}_\beta (\cdot \Vert \cdot )\) (cf. (19)), we can apply Theorem 1 (in the form of Proposition 5) to investigate the case of equality. By Proposition 5 we have \(\widetilde{D}_\beta (\rho _{AC}\Vert \mathbbm {1}_A\otimes \tilde{\sigma }_C) = \widetilde{D}_\beta (\rho _A\Vert \mathbbm {1}_A)\) if and only if

$$\begin{aligned} \rho _A^{\beta -1} \otimes \mathbbm {1}_C = \left( \mathbbm {1}_A\otimes \tilde{\sigma }_C^\delta \right) \left( \left( \mathbbm {1}_A\otimes \tilde{\sigma }_C^\delta \right) \rho _{AC} \left( \mathbbm {1}_A\otimes \tilde{\sigma }_C^\delta \right) \right) ^{\beta -1} \left( \mathbbm {1}_A\otimes \tilde{\sigma }_C^\delta \right) \!, \end{aligned}$$
(21)

where \(\delta = (1-\beta )/(2\beta )\). It is easy to see that (21) is equivalent to \(\rho _{AC} = \rho _A\otimes \tilde{\sigma }_C\), that is, if \(\rho _{ABC}\) is a purification of \(\rho _{AB}\), then the marginal \(\rho _{AC}\) is of product form. We can then go through the same steps as in the proof of Theorem 1.4 in [9] (which we stated as Theorem 7 above) to arrive at the following Rényi generalization of this result:

Theorem 9

(Equality in the Rényi version of the Araki–Lieb inequality). Let \(\rho _{AB}\) be a bipartite state with purification \(\rho _{ABC}\), and let \(\alpha ,\beta \in [1/2,\infty )\) be such that \(1/\alpha + 1/\beta = 2\). Denote by \(r_{AB}\), \(r_{A}\), and \(r_B\) the ranks of \(\rho _{AB}\), \(\rho _A\), and \(\rho _B\), respectively. We have equality in the Rényi version of the Araki–Lieb inequality,

$$\begin{aligned} \widetilde{S}_\alpha (A|B)_\rho = -S_\beta (A)_\rho , \end{aligned}$$

if and only if the following conditions are satisfied:

  1. (i)

    \(r_B = r_A r_{AB}.\)

  2. (ii)

    The state \(\rho _{AB}\) has a spectral decomposition of the form

    $$\begin{aligned} \rho _{AB} = \sum _{i=1}^{r_{AB}} \lambda _i |i\rangle \langle i|_{AB}, \end{aligned}$$

    where the vectors \(\lbrace |i\rangle _{AB}\rbrace _{i=1}^{r_{AB}}\) are such that \(\mathrm{Tr\,}_B|i\rangle \langle j|_{AB} = \delta _{ij}\rho _A\) for \(i,j=1,\ldots ,r_{AB}\).

Remark 10

For the upper bound \(\widetilde{S}_\alpha (A|B)_\rho \le S_\alpha (A)_\rho \) in Lemma 8, we have equality if and only if \(\widetilde{D}_\alpha (\rho _{AB}\Vert \mathbbm {1}_A\otimes \tilde{\sigma }_B) = \widetilde{D}_\alpha (\rho _A\Vert \mathbbm {1}_A)\), where \(\tilde{\sigma }_B\) is a state optimizing \(\widetilde{S}_\alpha (A|B)_\rho \). Similar to above, we obtain from Proposition 5 that this is the case if and only if \(\rho _{AB} = \rho _A\otimes \tilde{\sigma }_B\).

4.2 Rényi entanglement of formation

Let \(\rho _{AB}\) be a bipartite state; then the entanglement of formation (EoF) \(E_F(\rho _{AB})\) [4, 5] is defined as the least expected entropy of entanglement of any ensemble of pure states realizing \(\rho _{AB}\):

$$\begin{aligned} E_F(\rho _{AB}) := \min _{\lbrace p_i,\psi _i\rbrace } \sum _i p_i S(\mathrm{Tr\,}_B{\psi _i}), \end{aligned}$$
(22)

where the minimum is over ensembles of pure states \(\lbrace p_i,\psi _i\rbrace \) such that \(\rho _{AB} = \sum _i p_i |\psi _i\rangle \langle \psi _i|\). This entanglement measure satisfies \(E_F(\rho _{AB})\ge 0\) for all \(\rho _{AB}\) and is furthermore faithful, that is, \(E_F(\rho _{AB}) = 0\) if and only if \(\rho _{AB}\) is separable. The EoF is an upper bound on the (two-way) distillable entanglement [4, 5]. Moreover, its regularized version \(E_F^\infty (\rho _{AB}):= \lim _{n\rightarrow \infty }E_F(\rho _{AB}^{\otimes n})/n\) is equal to the asymptotic entanglement cost of preparing the state \(\rho _{AB}\) [19]. Carlen and Lieb [9] prove the following result, which provides a lower bound on \(E_F(\rho _{AB})\) that is achieved by states saturating the Araki–Lieb inequality (Theorem 7):

Theorem 11

([9]) Let \(\rho _{AB}\) be a bipartite state. Then

$$\begin{aligned} E_F(\rho _{AB}) \ge \max \left\{ -S(A|B)_\rho , -S(B|A)_\rho ,0\right\} \!, \end{aligned}$$
(23)

and this bound is saturated by states satisfying the conditions of Theorem 7. That is, for states \(\rho _{AB}\) with \(S(A|B)_\rho = -S(A)_\rho \), we have

$$\begin{aligned} E_F(\rho _{AB}) = -S(A|B)_\rho . \end{aligned}$$

Remark 12

If \(S(A|B)_\rho = -S(A)_\rho \), then \(E_F(\rho _{AB}) = -S(A|B)_\rho \ge -S(B|A)_\rho \) by (23).

Using the results of Sect. 4.1, our goal in this section is to obtain a Rényi generalization of Theorem 11. To this end, we consider the Rényi entanglement of formation (REoF) \(E_{F,\alpha }(\rho _{AB})\) [49, 50], which is obtained from the definition of \(E_F(\rho _{AB})\) in (22) by replacing the von Neumann entropy with the Rényi entropy of order \(\alpha \ge 0\) [as defined in (18)]:

$$\begin{aligned} E_{F,\alpha }(\rho _{AB}) := \min _{\lbrace p_i,\psi _i\rbrace } \sum _i p_i S_\alpha (\mathrm{Tr\,}_B{\psi _i}) \end{aligned}$$

Note that in [43] the authors consider a different Rényi generalization of the EoF based on the \(\alpha \)-Rényi conditional entropy. As in the von Neumann case, the REoF satisfies \(E_{F,\alpha }(\rho _{AB})\ge 0\) for all \(\rho _{AB}\), and it is faithful as well. We prove the following generalization of Theorem 11 for \(\alpha >1\):

Theorem 13

Let \(\rho _{AB}\) be a bipartite state, and let \(\alpha >1\) and \(\beta = \alpha /(2\alpha -1)\in (1/2,1)\) such that \(1/\alpha + 1/\beta = 2\). Then we have the following bound on the REoF:

$$\begin{aligned} E_{F,\alpha }(\rho _{AB}) \ge \max \left\{ -\widetilde{S}_\beta (A|B)_\rho , -\widetilde{S}_\beta (B|A)_\rho ,0\right\} \!. \end{aligned}$$
(24)

If \(\rho _{AB}\) saturates the Rényi version (20) of the Araki–Lieb inequality with Rényi parameter \(\beta \), that is, \(\widetilde{S}_\beta (A|B)_\rho = -S_\alpha (A)_\rho \), then

$$\begin{aligned} E_{F,\alpha }(\rho _{AB}) = -\widetilde{S}_\beta (A|B)_\rho . \end{aligned}$$
(25)

Proof

Let \(\lbrace q_i, \phi _i\rbrace \) be an ensemble of pure states minimizing the REoF, that is, \( E_{F,\alpha }(\rho _{AB}) = \sum _i q_i S_\alpha (\mathrm{Tr\,}_B(\phi _i)). \) We define a purification \(\rho _{ABC}\) of \(\rho _{AB}\) by \(|\rho \rangle _{ABC} = \sum _i \sqrt{q_i} |\phi _i\rangle _{AB}|i\rangle _C\), where \(\lbrace |i\rangle _C\rbrace _i\) is an orthonormal basis for \(\mathcal {H}_C\). Denoting by \(\tilde{\sigma }_C\) the state optimizing the Rényi conditional entropy \(\widetilde{S}_\alpha (A|C)_\rho \), we have

$$\begin{aligned} \widetilde{S}_\beta (A|B)_\rho&= -\widetilde{S}_\alpha (A|C)_\rho \\&= \widetilde{D}_\alpha (\rho _{AC}\Vert \mathbbm {1}_A\otimes \tilde{\sigma }_C)\\&= \widetilde{D}_\alpha \!\left( \sum _{i,j} \sqrt{q_i q_j} \mathrm{Tr\,}_B|\phi _i\rangle \langle \phi _j|_{AB}\otimes |i\rangle \langle j|_{C} \,\Vert \,\mathbbm {1}_A\otimes \tilde{\sigma }_C\right) , \end{aligned}$$

where the first line follows from Proposition 6. We now apply the pinching map \(\rho \mapsto \sum _i |i\rangle \langle i|_C\rho |i\rangle \langle i|_C\) (which is a quantum operation) to both states and use the DPI for \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\). Setting \(\lambda _i = \langle i|\tilde{\sigma }_C|i\rangle \), we obtain

$$\begin{aligned} \widetilde{S}_\beta (A|B)_\rho&\ge \widetilde{D}_\alpha \!\left( \sum _i q_i \mathrm{Tr\,}_B\phi _i \otimes |i\rangle \langle i|_C \,\Vert \,\mathbbm {1}_A \otimes \sum _i \lambda _i |i\rangle \langle i|_C\right) \\&= \frac{1}{\alpha -1} \log \left\{ \mathrm{Tr\,}\left[ \sum _i q_i^\alpha \lambda _i^{1-\alpha } \left( \mathrm{Tr\,}_B\phi _i\right) ^\alpha \otimes |i\rangle \langle i|_C\right] \right\} \\&= \frac{1}{\alpha -1} \log \left\{ \sum _i q_i \left( q_i/\lambda _i\right) ^{\alpha -1} \mathrm{Tr\,}\left[ \left( \mathrm{Tr\,}_B\phi _i\right) ^{\alpha }\right] \right\} \\&\ge \frac{1}{\alpha -1} \sum _i q_i \log \left\{ \left( q_i/\lambda _i\right) ^{\alpha -1} \mathrm{Tr\,}\left[ \left( \mathrm{Tr\,}_B\phi _i\right) ^{\alpha }\right] \right\} \\&= \sum _i q_i \frac{1}{\alpha -1}\log \mathrm{Tr\,}\left[ \left( \mathrm{Tr\,}_B\phi _i\right) ^{\alpha }\right] + \sum _i q_i \log (q_i/\lambda _i)\\&= - E_{F,\alpha }(\rho _{AB}) + D\!\left( \lbrace q_i\rbrace \Vert \lbrace \lambda _i\rbrace \right) \\&\ge - E_{F,\alpha }(\rho _{AB}). \end{aligned}$$

In the first equality we used the fact that the states \(\sum _i q_i \mathrm{Tr\,}_B\phi _i \otimes |i\rangle \langle i|_C\) and \(\mathbbm {1}_A \otimes \sum _i \lambda _i |i\rangle \langle i|_C\) commute, and hence \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\) reduces to the ordinary \(\alpha \)-RRE \(D_\alpha (\omega \Vert \tau ) = \frac{1}{\alpha -1}\log \mathrm{Tr\,}(\omega ^\alpha \tau ^{1-\alpha })\). In the second inequality we used concavity of the logarithm together with \(\alpha >1\), and in the last inequality we used non-negativity of the classical Kullback–Leibler divergence, defined for probability distributions P and Q on an alphabet \(\mathcal {X}\) as \(D(P\Vert Q) = \sum _{x\in \mathcal {X}} P(x)\log \left( P(x)/Q(x)\right) \), provided that \(P(x)=0\) whenever \(Q(x)=0\). Note that the latter is satisfied as \(\mathrm{\,supp\,}\rho _C\subseteq \mathrm{\,supp\,}\tilde{\sigma }_C\) holds for the optimizing state \(\tilde{\sigma }_C\) of \(\widetilde{S}_\alpha (A|C)_\rho \) [36]. The bound \(E_{F,\alpha }(\rho _{AB})\ge -\widetilde{S}_\beta (B|A)_\rho \) follows in an analogous way, yielding (24).
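The two purely classical steps of the chain above (the concavity step and the non-negativity of the Kullback–Leibler divergence) can be checked numerically. The following minimal sketch takes as inputs the weights \(q_i\), the diagonal entries \(\lambda _i\), and the numbers \(t_i = \mathrm{Tr\,}[(\mathrm{Tr\,}_B\phi _i)^\alpha ]\), and returns the three successive expressions. The function name `proof_chain` and the numerical inputs are hypothetical and chosen only for illustration; strictly positive \(q_i\) and \(\lambda _i\) are assumed.

```python
import math

def proof_chain(alpha, q, lam, t):
    """Return (before_jensen, after_jensen, minus_avg_entropy) for alpha > 1.

    q   : ensemble weights q_i (positive, summing to 1)
    lam : diagonal entries lambda_i (positive, summing to 1)
    t   : t_i = Tr[(Tr_B phi_i)^alpha], each in (0, 1]
    """
    c = 1.0 / (alpha - 1.0)
    # (1/(alpha-1)) log sum_i q_i (q_i/lambda_i)^(alpha-1) t_i
    before_jensen = c * math.log(
        sum(qi * (qi / li) ** (alpha - 1.0) * ti for qi, li, ti in zip(q, lam, t))
    )
    # Classical Kullback-Leibler divergence D({q_i} || {lambda_i})
    kl = sum(qi * math.log(qi / li) for qi, li in zip(q, lam))
    # sum_i q_i (1/(alpha-1)) log t_i = - sum_i q_i S_alpha(Tr_B phi_i)
    minus_avg_entropy = sum(qi * c * math.log(ti) for qi, ti in zip(q, t))
    after_jensen = minus_avg_entropy + kl
    return before_jensen, after_jensen, minus_avg_entropy

top, middle, bottom = proof_chain(
    alpha=2.0, q=[0.5, 0.3, 0.2], lam=[0.4, 0.4, 0.2], t=[1.0, 0.5, 0.68]
)
print(top, middle, bottom)
```

For an optimal ensemble the last returned value equals \(-E_{F,\alpha }(\rho _{AB})\); the gap between the last two values is exactly \(D(\lbrace q_i\rbrace \Vert \lbrace \lambda _i\rbrace )\) and vanishes when \(\lambda _i = q_i\).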

To prove (25), we first note that by Theorem 9 the state \(\rho _{AB}\) satisfies \(\widetilde{S}_\beta (A|B)_\rho = -S_\alpha (A)_\rho \) if and only if the rank condition of Theorem 9(i) holds and \(\rho _{AB}\) has a spectral decomposition of the form

$$\begin{aligned} \rho _{AB} = \sum _i \lambda _i |i\rangle \langle i|_{AB}, \end{aligned}$$

where the vectors \(\lbrace |i\rangle _{AB} \rbrace \) satisfy \(\mathrm{Tr\,}_B|i\rangle \langle j|_{AB} = \delta _{ij}\rho _A\). We can now employ the same argument used in [9] in the proof of the second assertion of Theorem 11, to prove the corresponding assertion of Theorem 13:

$$\begin{aligned} E_{F,\alpha }(\rho _{AB})&= \min _{\lbrace p_i,\psi _i\rbrace } \sum _i p_i S_\alpha (\mathrm{Tr\,}_B\psi _i)\\&\le \sum _i \lambda _i S_\alpha (\mathrm{Tr\,}_B|i\rangle \langle i|_{AB})\\&= \sum _i \lambda _i S_\alpha (A)_\rho \\&= S_\alpha (A)_\rho \\&= -\widetilde{S}_\beta (A|B)_\rho , \end{aligned}$$

where in the inequality we chose the particular ensemble \(\lbrace \lambda _i, |i\rangle _{AB}\rbrace \) realizing \(\rho _{AB}\). This upper bound on \(E_{F,\alpha }(\rho _{AB})\), together with the general lower bound in (24), yields the claim. \(\square \)

Remark 14

  1. (i)

    The proof method of the lower bound (24) for \(E_{F,\alpha }(\rho _{AB})\) in Theorem 13 can be specialized to the quantum relative entropy \(D(\cdot \Vert \cdot )\), providing a new proof of (23) in Theorem 11:

    $$\begin{aligned} S(A|B)_\rho&= - S(A|C)_\rho \\&= D(\rho _{AC}\Vert \mathbbm {1}_A\otimes \rho _C)\\&= D\!\left( \sum _{i,j} \sqrt{q_i q_j} \mathrm{Tr\,}_B|\phi _i\rangle \langle \phi _j|_{AB}\otimes |i\rangle \langle j|_{C} \,\Vert \,\mathbbm {1}_A\otimes \rho _C\right) \\&\ge D\!\left( \sum _i q_i \mathrm{Tr\,}_B\phi _i \otimes |i\rangle \langle i|_C \,\Vert \,\mathbbm {1}_A \otimes \sum _i \lambda _i |i\rangle \langle i|_C\right) \\&= D(\lbrace q_i\rbrace \Vert \lbrace \lambda _i\rbrace ) + \sum _i q_i D\!\left( \mathrm{Tr\,}_B\phi _i\Vert \mathbbm {1}_A\right) \\&= D(\lbrace q_i\rbrace \Vert \lbrace \lambda _i\rbrace ) - \sum _i q_i S(\mathrm{Tr\,}_B\phi _i)\\&\ge - E_F(\rho _{AB}). \end{aligned}$$

    The bound \(S(B|A)_\rho \ge - E_F(\rho _{AB})\) can be proved in an analogous way.

  2. (ii)

    If a state \(\rho _{AB}\) satisfies \(\widetilde{S}_\beta (A|B)_\rho = -S_\alpha (A)_\rho \) for \(\alpha >1\) and \(\beta = \alpha /(2\alpha -1)\), then \(E_{F,\alpha }(\rho _{AB}) = -\widetilde{S}_\beta (A|B)_\rho \ge -\widetilde{S}_\beta (B|A)_\rho \) by (24) in Theorem 13.

4.3 Entanglement fidelity

For a state \(\rho \in \mathcal {D}(\mathcal {H})\) and a quantum channel \(\mathcal {N}:\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\), the entanglement fidelity \(F_e(\rho ,\mathcal {N})\) [42] is defined as

$$\begin{aligned} F_e(\rho ,\mathcal {N}) := \langle \psi ^\rho |(\mathcal {N}\otimes \mathrm{id}_{\mathcal {H}'})(\psi ^\rho )|\psi ^\rho \rangle , \end{aligned}$$
(26)

where the state \(|\psi ^\rho \rangle \in \mathcal {H}\otimes \mathcal {H}'\) purifies \(\rho \). Since any two purifications of \(\rho \) are related by an isometry acting on the purifying system, the definition (26) of the entanglement fidelity is independent of the chosen purification. For a mixed state \(\rho \) with spectral decomposition \(\rho = \sum _{i=1}^{\mathrm{rk}\,\rho } \lambda _i |i\rangle \langle i|_{\mathcal {H}}\), a canonical purification is given by

$$\begin{aligned} |\psi ^\rho \rangle = \sum _{i=1}^{\mathrm{rk}\,\rho } \sqrt{\lambda _i} |i\rangle _\mathcal {H}\otimes |i\rangle _{\mathcal {H}'} \end{aligned}$$

for suitable orthonormal vectors \(\lbrace |i\rangle _{\mathcal {H}'}\rbrace _{i=1}^{\mathrm{rk}\,\rho }\) in \(\mathcal {H}'\). Hence, in the following discussion we can assume without loss of generality that \(\dim \mathcal {H}' = \mathrm{rk}\,\rho \).

The entanglement fidelity \(F_e(\rho ,\mathcal {N})\) can be expressed in terms of the usual fidelity \(F(\omega ,\tau ):= \Vert \sqrt{\omega }\sqrt{\tau }\Vert _1\) (see Footnote 2) as

$$\begin{aligned} F_e(\rho ,\mathcal {N}) = F(\psi ^\rho ,(\mathcal {N}\otimes \mathrm{id}_{\mathcal {H}'})(\psi ^\rho ))^2. \end{aligned}$$

We have \(\Vert \sqrt{\omega }\sqrt{\tau }\Vert _1 = \mathrm{Tr\,}(\sqrt{\tau } \omega \sqrt{\tau } )^{1/2}\) by definition of the trace norm, and hence the fidelity is related to the 1/2-SRD via

$$\begin{aligned} F(\omega ,\tau ) = \widetilde{Q}_{1/2}(\omega \Vert \tau ). \end{aligned}$$
(27)

It follows from the DPI for \(\widetilde{Q}_{1/2}(\cdot \Vert \cdot )\) that the fidelity is non-decreasing under partial trace (see Footnote 3). This can be used to prove the following upper bound on the entanglement fidelity, where we write \(\mathcal {N}(\psi ^\rho )\equiv (\mathcal {N}\otimes \mathrm{id}_{\mathcal {H}'})(\psi ^\rho )\):

$$\begin{aligned} F_e(\rho ,\mathcal {N})&= F(\psi ^\rho , \mathcal {N}(\psi ^\rho ))^2 \le F(\rho ,\mathcal {N}(\rho ))^2 \end{aligned}$$
(28)

Due to (28), the entanglement fidelity provides a more stringent notion of distance between quantum states than the fidelity. However, it is clear that we have equality in (28) if the state \(\rho \) is pure. Using our condition for equality in the DPI for the 1/2-SRD from Theorem 1 (resp. Proposition 5), we can prove that this is in fact the only case of equality:

Proposition 15

Let \(\rho \in \mathcal {D}(\mathcal {H})\) and \(\mathcal {N}:\mathcal {B}(\mathcal {H})\rightarrow \mathcal {B}(\mathcal {H})\) be a quantum channel; then we have

$$\begin{aligned} F_e(\rho ,\mathcal {N}) = F(\rho ,\mathcal {N}(\rho ))^2 \end{aligned}$$

if and only if \(\rho \) is pure.

Proof

We have already noted above that purity of \(\rho \) is sufficient for equality in (28). If \(F_e(\rho ,\mathcal {N}) = F(\rho ,\mathcal {N}(\rho ))^2\), then (27) implies that we have equality in the DPI for the 1/2-SRD with respect to \(\Lambda =\mathrm{Tr\,}_{\mathcal {H}'}\). Hence, Proposition 5 yields

$$\begin{aligned}&\mathcal {N}(\rho )^{1/2} \left( \mathcal {N}(\rho )^{1/2}\rho \mathcal {N}(\rho )^{1/2}\right) ^{-1/2} \mathcal {N}(\rho )^{1/2} \otimes \mathbbm {1}_{\mathcal {H}'}\\&\quad = \mathcal {N}(\psi ^\rho )^{1/2} \left( \mathcal {N}(\psi ^\rho )^{1/2} \psi ^\rho \mathcal {N}(\psi ^\rho )^{1/2}\right) ^{-1/2} \mathcal {N}(\psi ^\rho )^{1/2}, \end{aligned}$$

from which we obtain

$$\begin{aligned} \rho \otimes \mathbbm {1}_{\mathcal {H}'} = c(\rho ,\mathcal {N}) \mathcal {N}(\rho )^{-1} \mathcal {N}(\psi ^\rho ) \psi ^\rho \mathcal {N}(\psi ^\rho ) \mathcal {N}(\rho )^{-1} \end{aligned}$$
(29)

for a suitable constant \(c(\rho ,\mathcal {N})\). Note that the right-hand side of (29) has rank 1, and hence \(\rho \otimes \mathbbm {1}_{\mathcal {H}'}\) is a pure state. But this is only possible if \(\rho \) is pure and \(\dim \mathcal {H}'=1\). \(\square \)
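The bound (28) can be illustrated on a single qubit. The sketch below is a minimal example under stated assumptions: it uses the standard Kraus-operator expression \(F_e(\rho ,\mathcal {N}) = \sum _k |\mathrm{Tr\,}(\rho K_k)|^2\), which is not stated in the text above, a bit-flip channel with real Kraus operators \(\sqrt{1-e}\,\mathbbm {1}\) and \(\sqrt{e}\,X\), and diagonal input states \(\rho = \mathrm{diag}(p,1-p)\), for which the channel output stays diagonal and the fidelity reduces to the classical expression. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two real 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def bit_flip_kraus(e):
    """Kraus operators sqrt(1-e) * I and sqrt(e) * X of the bit-flip channel."""
    s0, s1 = math.sqrt(1.0 - e), math.sqrt(e)
    return [[[s0, 0.0], [0.0, s0]], [[0.0, s1], [s1, 0.0]]]

def apply_channel(rho, kraus):
    """N(rho) = sum_k K_k rho K_k^T (the Kraus operators here are real)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        term = matmul(matmul(k, rho), transpose(k))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def entanglement_fidelity(rho, kraus):
    """F_e = sum_k Tr(rho K_k)^2 for real rho and K_k."""
    return sum(trace(matmul(rho, k)) ** 2 for k in kraus)

def squared_fidelity_diagonal(rho, tau):
    """F(rho, tau)^2 for diagonal (commuting) states: (sum_i sqrt(r_i t_i))^2."""
    return sum(math.sqrt(rho[i][i] * tau[i][i]) for i in range(2)) ** 2

def compare(p, e):
    """Return (F_e, F^2) for rho = diag(p, 1-p) under the bit-flip channel."""
    rho = [[p, 0.0], [0.0, 1.0 - p]]
    kraus = bit_flip_kraus(e)
    return entanglement_fidelity(rho, kraus), squared_fidelity_diagonal(rho, apply_channel(rho, kraus))

print("pure input  (p = 1.0):", compare(1.0, 0.1))
print("mixed input (p = 0.5):", compare(0.5, 0.1))
```

For this particular channel the two quantities coincide for the pure input, while the maximally mixed input gives a strict inequality in (28).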

5 Conclusion and open questions

We have shown that equality in the DPI for the \(\alpha \)-SRD \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\) holds for a quantum operation \(\Lambda \), a quantum state \(\rho \), and a positive semidefinite operator \(\sigma \) (with suitable support conditions) if and only if the following algebraic condition is satisfied (setting \(\gamma =(1-\alpha )/(2\alpha )\)):

$$\begin{aligned} \sigma ^\gamma \left( \sigma ^\gamma \rho \sigma ^\gamma \right) ^{\alpha -1} \sigma ^\gamma = \Lambda ^\dagger \!\left( \Lambda (\sigma )^\gamma \left[ \Lambda (\sigma )^\gamma \Lambda (\rho ) \Lambda (\sigma )^\gamma \right] ^{\alpha -1} \Lambda (\sigma )^\gamma \right) \end{aligned}$$
(30)

In the case of the \(\alpha \)-RRE \(D_\alpha (\cdot \Vert \cdot )\) for \(\alpha \in [0,2]\) (which includes the QRE corresponding to \(\alpha =1\)), a necessary and sufficient algebraic condition for equality in the DPI is given by [24, 40, 41]

$$\begin{aligned} \Lambda ^\dagger \!\left( \Lambda (\sigma )^{-z}\Lambda (\rho )^z\right) = \sigma ^{-z}\rho ^{z}\quad \text {for all}\, z\in \mathbb {C}. \end{aligned}$$

This can be rephrased in terms of the existence of a recovery map by an argument detailed in [24]: We have \(D_\alpha (\rho \Vert \sigma ) = D_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma ))\) if and only if there is a recovery map in the form of a quantum operation \(\mathcal {R}_{\sigma ,\Lambda }\) such that

$$\begin{aligned} \mathcal {R}_{\sigma ,\Lambda }(\Lambda (\sigma ))&= \sigma&\mathcal {R}_{\sigma ,\Lambda }(\Lambda (\rho ))&= \rho . \end{aligned}$$
(31)

In general, a quantum operation \(\Lambda \) is called sufficient with respect to a set \(\mathcal {S}\subseteq \mathcal {D}(\mathcal {H})\) of quantum states if there exists a quantum operation \(\mathcal {R}\) satisfying \(\mathcal {R}(\Lambda (\tau )) = \tau \) for all \(\tau \in \mathcal {S}\). Hence, (31) says that \(\Lambda \) is sufficient for \(\lbrace \rho ,\sigma \rbrace \). Furthermore, the particular map \(\mathcal {R}_{\sigma , \Lambda }\) admits an explicit formula on the support of \(\Lambda (\sigma )\):

$$\begin{aligned} \mathcal {R}_{\sigma ,\Lambda }(\cdot ) = \sigma ^{1/2} \Lambda ^\dagger \!\left( \Lambda (\sigma )^{-1/2} \cdot \Lambda (\sigma )^{-1/2}\right) \sigma ^{1/2}. \end{aligned}$$
(32)

Since \(\mathcal {R}_{\sigma ,\Lambda }(\Lambda (\sigma )) = \sigma \) holds by definition (32) of the recovery map, the non-trivial part of (31) is the assertion \(\mathcal {R}_{\sigma ,\Lambda }(\Lambda (\rho )) = \rho \). Note that by a theorem of Petz [40] a quantum channel \(\Lambda \) is sufficient for \(\lbrace \rho ,\sigma \rbrace \) if and only if \(\mathcal {R}_{\sigma ,\Lambda }(\Lambda (\rho )) = \rho \) holds for the map defined in (32). We also observe that the recovery map \(\mathcal {R}_{\sigma ,\Lambda }\) is independent of \(\alpha \), and the existence of a map \(\mathcal {R}\) satisfying (31) forces equality in the DPI for any \(\alpha \in [0,2]\),

$$\begin{aligned} D_\alpha (\rho \Vert \sigma ) \ge D_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma )) \ge D_\alpha (\mathcal {R}(\Lambda (\rho ))\Vert \mathcal {R}(\Lambda (\sigma ))) = D_\alpha (\rho \Vert \sigma ), \end{aligned}$$
(33)

where the first inequality follows from applying the DPI with respect to \(\Lambda \), and the second follows from applying the DPI with respect to \(\mathcal {R}\). Thus, we have equality in the DPI for the \(\alpha \)-RRE for all \(\alpha \in [0,2]\) once it holds for some \(\alpha \in [0,2]\).
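In the commutative (classical) case the recovery map (32) reduces to Bayes' rule, which makes (31) and (33) easy to check by hand or numerically. The sketch below is a minimal illustration under that assumption: states are probability vectors, \(\Lambda \) is a column-stochastic matrix `W`, and \(\Lambda ^\dagger \) acts by the transposed matrix. The channel, the distributions, and all function names are hypothetical and chosen only for illustration.

```python
import math

def apply_channel(W, p):
    """(W p)(y) = sum_x W[y][x] p(x); each column of W sums to 1."""
    return [sum(W[y][x] * p[x] for x in range(len(p))) for y in range(len(W))]

def petz_recover(W, sigma, g):
    """Commutative form of (32): R(g)(x) = sigma(x) * sum_y W[y][x] g(y) / (W sigma)(y)."""
    w_sigma = apply_channel(W, sigma)
    return [
        sigma[x] * sum(W[y][x] * g[y] / w_sigma[y] for y in range(len(W)) if w_sigma[y] > 0)
        for x in range(len(sigma))
    ]

def renyi_divergence(p, q, alpha):
    """Classical D_alpha(p||q) = log(sum_x p(x)^alpha q(x)^(1-alpha)) / (alpha - 1), alpha != 1."""
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1.0)

# Hypothetical channel: merge inputs 0 and 1 into output 0, map input 2 to output 1.
W = [[1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
sigma = [0.2, 0.2, 0.6]
rho_good = [0.3, 0.3, 0.4]   # rho/sigma is constant on the merged inputs
rho_bad = [0.5, 0.1, 0.4]    # rho/sigma differs on the merged inputs

for name, rho in (("good", rho_good), ("bad", rho_bad)):
    recovered = petz_recover(W, sigma, apply_channel(W, rho))
    gap = renyi_divergence(rho, sigma, 2.0) - renyi_divergence(
        apply_channel(W, rho), apply_channel(W, sigma), 2.0
    )
    print(name, recovered, gap)
```

In this toy example the state that is recovered exactly is also the one with zero gap in the DPI, while the other state is neither recovered nor saturating.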

Taking a closer look at the condition (30) for equality in the DPI for the \(\alpha \)-SRD, it is easy to see that choosing \(\alpha =2\) in (30) yields precisely the statement \(\mathcal {R}_{\sigma ,\Lambda }(\Lambda (\rho )) = \rho \). Hence, in the case \(\alpha =2\) we have equality in the DPI for the 2-SRD if and only if the recovery map \(\mathcal {R}_{\sigma ,\Lambda }\) defined in (32) satisfies (31). This was already observed in [13] for positive trace-preserving maps.

Shortly after completion of the present paper, the connection between sufficiency and equality in the DPI for the \(\alpha \)-SRD was presented by Jenčová [25] and Hiai and Mosonyi [22]. The main result of [25] is that a 2-positive trace-preserving map \(\Lambda \) is sufficient with respect to \(\lbrace \rho ,\sigma \rbrace \) if and only if \(\widetilde{D}_\alpha (\Lambda (\rho )\Vert \Lambda (\sigma )) = \widetilde{D}_\alpha (\rho \Vert \sigma )\) holds for some \(\alpha >1\). By the theorem of Petz [40] mentioned above, we therefore have equality in the DPI for the \(\alpha \)-SRD for any \(\alpha >1\) if and only if the map \(\mathcal {R}_{\sigma ,\Lambda }\) defined in (32) satisfies (31). Furthermore, an argument similar to (33) for \(\widetilde{D}_\alpha (\cdot \Vert \cdot )\) shows that equality holds in the DPI for the \(\alpha \)-SRD for all \(\alpha >1\) if it holds for some \(\alpha >1\). This result settles the sufficiency question for the \(\alpha \)-SRD for the range \(\alpha >1\) and 2-positive trace-preserving maps (which include all quantum operations). In [22] sufficiency is analyzed for 2-positive bistochastic maps \(\Lambda \), that is, both \(\Lambda \) and \(\Lambda ^\dagger \) are 2-positive and trace-preserving. The main theorem of [22] regarding the \(\alpha \)-SRD states conditions for sufficiency of \(\Lambda \) for certain ranges of \(\alpha \) (including the range \(\alpha \in [1/2,1)\)) under the additional assumption that one of the two states \(\rho \) and \(\sigma \) is a fixed point of \(\Lambda \). In fact, this result is obtained as a corollary of a more general theorem analyzing sufficiency for the \(\alpha \)-z-Rényi relative entropies under similar assumptions.

In the light of our main result (Theorem 1) and the results of [22] and [25], it remains an open question whether equality in the DPI for the \(\alpha \)-SRD in the range \(\alpha \in (1/2,1)\) is equivalent to sufficiency of \(\Lambda \) in our setting, in which \(\Lambda \) is an arbitrary quantum operation and \(\rho \) and \(\sigma \) are states with \(\rho \not \perp \sigma \). Note that for \(\alpha =1/2\) it is known that such a general sufficiency result cannot hold [32] (see Footnote 4).

Regarding our results in Sect. 4 about entanglement measures and distances, it would be interesting to see whether the entropic bounds proved therein can be used to characterize information-theoretic tasks.