1 Introduction

The relative entropy between two density matrices \(\omega _\eta , \omega _\psi \), defined as

$$\begin{aligned} S(\omega _\psi | \omega _\eta ) = {\text {Tr}}(\omega _\psi {\text {ln}}\omega _\psi ) - {\text {Tr}}(\omega _\psi {\text {ln}}\omega _\eta ), \end{aligned}$$
(1)

is an asymptotic measure of their distinguishability. Classically, \(e^{-NS(\{p_i\} | \{q_i\})}\) approaches, for large N, the probability of a sample of N letters distributed according to the true distribution \(\{p_i\}\), when this probability is calculated according to an incorrect guess \(\{q_i\}\). In the setting of general von Neumann algebras \({\mathcal{A}}\), density matrices should be replaced by algebraic states, i.e., positive, linear, normalized and ultra-weakly continuous functionals \(\omega _\eta , \omega _\psi : {\mathcal{A}}\rightarrow \mathbb {C}\). In the case of matrix algebras \({\mathcal{A}}= M_n(\mathbb {C})\), the correspondence to density matrices is given by \(\omega _\psi (a) = {\text {Tr}}(a\omega _\psi ), a \in {\mathcal{A}}\). Araki [2, 3] has shown how to generalize the relative entropy to general von Neumann algebras. Instead of \({\text {ln}}\omega _\psi \) and \({\text {ln}}\omega _\eta \), which have no obvious counterparts in a general von Neumann algebra, he used relative modular operators.
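As a quick illustration of (1), consider the commuting case, which is not spelled out in the text: if both density matrices are diagonal in the same basis, with eigenvalues \(p_i\) for \(\omega _\psi \) and \(q_i\) for \(\omega _\eta \), then (1) reduces to the classical expression. The two-letter distributions below are arbitrary choices made only for this example.

$$\begin{aligned} S(\omega _\psi | \omega _\eta ) = \sum _i p_i \, {\text {ln}}\frac{p_i}{q_i}, \qquad \{p_i\} = \{ \tfrac{1}{2}, \tfrac{1}{2} \}, \ \{q_i\} = \{ \tfrac{1}{4}, \tfrac{3}{4} \} \ \Longrightarrow \ S = \tfrac{1}{2} {\text {ln}}2 + \tfrac{1}{2} {\text {ln}}\tfrac{2}{3} = \tfrac{1}{2} {\text {ln}}\tfrac{4}{3} \approx 0.144 . \end{aligned}$$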

By far the most fundamental property of S, from which in fact essentially all others follow, is its monotonicity under a channel. A channel \(T:{\mathcal{B}}\rightarrow {\mathcal{A}}\) between von Neumann algebras is a completely positive, unital, normal linear map. It induces a corresponding action on states by pull-back, \(\omega _\psi \mapsto \omega _\psi \circ T\). It was shown in [36] for general von Neumann algebras that

$$\begin{aligned} S(\omega _\psi | \omega _\eta ) \ge S( \omega _\psi \circ T| \omega _\eta \circ T). \end{aligned}$$
(2)

In quantum information theory, T is related to data processing, so (2) is sometimes called the data-processing inequality (DPI).

S plays an important role when characterizing the entanglement between subsystems. Over the years, several generalizations of S with different operational meaning have therefore been given, see e.g., [19]. One such generalization is the one-parameter family of “sandwiched relative Renyi divergences (entropies)” \(D_s, s \in [1/2,1) \cup (1,\infty )\) proposed by [32, 38], which generalize the classical alpha-Renyi entropies and have several interesting properties. For example, like the relative entropy S, they can be defined for general von Neumann algebras [4, 6], have an operational meaning [31], satisfy the DPI [5, 6, 15], and interpolate between S (\(s \rightarrow 1\)) and the fidelity F (\(s \rightarrow 1/2\)) [6, 32, 38].

The purpose of this note is to point out a related variational expression, \(\Phi _s(\omega _\psi | \omega _\eta ), s \in (1/2,1)\), [Eq. (41)] inspired by a corresponding characterization of S due to Kosaki [23]. \(\Phi _s\) reduces to a multiple of the fidelity in the classical case. Thus, it cannot be seen as a generalization of the classical Renyi entropies, but it is shown to provide an upper bound for the sandwiched relative Renyi entropies \(D_s, s \in (1/2,1)\). Furthermore, \(\Phi _s\) is shown to have several other desirable properties. For example, it satisfies the DPI, and it is ordered with respect to the states that it depends on.

By construction, our divergence \(\Phi _s(\omega _\psi | \omega _\eta )\) is defined for arbitrary von Neumann algebras, thus in particular type III. As is well-known [9], the algebras of observables in algebraic quantum field theory (QFT) [16] are of this type. In the second half of this note, we will give an application of \(\Phi _s\) to QFT. We consider a QFT described by a Haag–Kastler net [16] \({\mathcal{F}}=\{ {\mathcal{F}}(A) \}\) and a net of subfactors \({\mathcal{A}}= \{ {\mathcal{A}}(A) \}\) in the sense of [28]. If \(A_n, B_n\) are disjoint regions separated by a corridor of size \(\sim 1/n\), we can consider a conditional expectation “\(E_n = E_{A_n} \otimes E_{B_n}\)” projecting \({\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n)\) to \({\mathcal{A}}(A_n) \vee {\mathcal{A}}(B_n)\). The partial state of the vacuum with respect to the subsystem \({\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n)\) is called \(\omega _\Omega \). We show [Theorem 1]

$$\begin{aligned} \lim _{n \rightarrow \infty } \Phi _s(\omega _\Omega | \omega _\Omega \circ E_n) = {\text {ln}}[{\mathcal{F}}:{\mathcal{A}}], \end{aligned}$$
(3)

which yields a formula (60) for F (fidelity) as a limiting case. Here, \([{\mathcal{F}}:{\mathcal{A}}]\) is the Jones index [21, 24] of the subnet [28], whose values are restricted to \(\{ 4 \cos ^2(\pi /n) : n=3,4,\dots \} \cup [4,\infty ]\). An example is a subtheory \({\mathcal{A}}\subset {\mathcal{F}}\) of charge neutral operators under a finite gauge group G, in which case \([{\mathcal{F}}:{\mathcal{A}}]=|G|\). Similar results could be obtained in analogous settings in higher dimensions.
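To make the allowed index values concrete, the first few members of the discrete series quoted above evaluate as follows (a direct computation, using \(\cos (\pi /5) = (1+\sqrt{5})/4\)):

$$\begin{aligned} 4\cos ^2(\pi /3) = 1, \quad 4\cos ^2(\pi /4) = 2, \quad 4\cos ^2(\pi /5) = \tfrac{3+\sqrt{5}}{2} \approx 2.618, \quad 4\cos ^2(\pi /6) = 3, \end{aligned}$$

and the values accumulate at 4 as \(n \rightarrow \infty \). In the gauge group example, the choice \(G = {\mathbb Z}_2\) (picked here purely for illustration) has \([{\mathcal{F}}:{\mathcal{A}}]=2\), so the right side of (3) would be \({\text {ln}}2\).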

We also point out a dual result for the inclusion \({\mathcal{F}}' \subset {\mathcal{A}}'\) and the dual conditional expectations \(E_n'\) in the case of the fidelity. This last result is a consequence of an “entropic (un)certainty relation”, given in Corollaries 1, 3, which generalize a result by [30] to general von Neumann algebras. A noteworthy special case of Corollary 3 is the following. Consider a finite index inclusion \({\mathcal{M}}\supset {\mathcal{N}}\) of factors and \(E:{\mathcal{M}}\rightarrow {\mathcal{N}}\) the corresponding minimal conditional expectation with dual conditional expectation \(E':{\mathcal{N}}' \rightarrow {\mathcal{M}}'\). Then, we have

$$\begin{aligned} F_{\mathcal{M}}(\omega _\psi | \omega _\psi \circ E) \cdot F_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') \ge \frac{1}{\sqrt{[{\mathcal{M}}:{\mathcal{N}}]}}. \end{aligned}$$
(4)

Here, \(\omega _\psi \) is the state on \({\mathcal{M}}\) induced by a vector \(|\psi \rangle \) (“purification”) in a natural cone of a standard representation of \({\mathcal{M}}\) and \(\omega '_\psi \) the corresponding state on \({\mathcal{N}}'\) induced by the same vector. F is the fidelity between two states. Such relations remind one of the Heisenberg uncertainty principle. We plan to come back to this in the future.

Notations and conventions. Calligraphic letters \({\mathcal{A}}, {\mathcal{M}}, \dots \) denote von Neumann algebras. Calligraphic letters \(\mathscr {H}, \mathscr {K}, \dots \) denote linear spaces. We use the physicist’s “ket”-notation \(|\psi \rangle \) for vectors in a Hilbert space. The scalar product is written as \( \langle \psi | \psi ' \rangle \) and is anti-linear in the first entry. The norm of a vector is written simply as \(\Vert |\psi \rangle \Vert =: \Vert \psi \Vert \). Each vector \(|\psi \rangle \in \mathscr {H}\) gives rise to a positive definite linear functional on the von Neumann algebra \({\mathcal{M}}\) acting on \(\mathscr {H}\) via

$$\begin{aligned} \omega _\psi (m) = \langle \psi | m \psi \rangle , \quad m \in {\mathcal{M}}. \end{aligned}$$
(5)

The commutant of \({\mathcal{M}}\) is denoted as \({\mathcal{M}}'\) and consists of those bounded operators commuting with all elements of \({\mathcal{M}}\).

2 von Neumann algebras and relative entropy

2.1 Relative modular theory and entropy

Let \(({\mathcal{M}}, J, \mathscr {P}_{\mathcal{M}}^\natural , \mathscr {H})\) be a von Neumann algebra in standard form acting on a Hilbert space \(\mathscr {H}\), with natural cone \(\mathscr {P}^\natural _{\mathcal{M}}\) and modular conjugation J (for an explanation of these terms, see [8, 35] as general references). We will use relative modular operators \(\Delta _{\psi ,\zeta }\) associated with two vectors \(|\zeta \rangle \in \mathscr {P}^\natural _{\mathcal{M}}, |\psi \rangle \in \mathscr {H}\). The nonnegative, self-adjoint operator \(\Delta _{\psi ,\zeta }\) is characterized by

$$\begin{aligned} J \Delta _{\psi ,\zeta }^{1/2}\left( a | \zeta \rangle + | \chi \rangle \right) = \pi ^{\mathcal{M}}(\zeta ) a^* | \psi \rangle \,, \quad \forall \,\, a \in {\mathcal{M}}\,,\,\, | \chi \rangle \in (1-\pi ^{{\mathcal{M}}'}(\zeta )) \mathscr {H} . \end{aligned}$$
(6)

Here, \(\pi ^{{\mathcal{M}}'}(\psi )\) is the support projection of the vector \(|\psi \rangle \), defined as the orthogonal projection onto the closure of the subspace \({\mathcal{M}}|\psi \rangle \subset \mathscr {H}\). The nonzero support of \(\Delta _{\psi ,\zeta }\) is \(\pi ^{\mathcal{M}}(\psi ) \pi ^{{\mathcal{M}}'}(\zeta )\), and complex powers such as \(\Delta _{\psi ,\zeta }^z\) are understood via the functional calculus on this support and are defined as 0 on \(1-\pi ^{\mathcal{M}}(\psi ) \pi ^{{\mathcal{M}}'}(\zeta )\). \(\Delta _{\psi ,\zeta }^{1/2}\) depends on \(|\psi \rangle \) only via the associated state functional (5).

According to [2, 3], if the support projections satisfy \(\pi ^{\mathcal{M}}(\psi ) \ge \pi ^{\mathcal{M}}(\zeta )\), the relative entropy may be defined by

$$\begin{aligned} S(\zeta | \psi ) = -\lim _{\alpha \rightarrow 0^+} \frac{\langle \zeta | \Delta ^\alpha _{\psi , \zeta } \zeta \rangle -1}{\alpha } , \end{aligned}$$
(7)

otherwise, it is by definition infinite. The relative entropy may be viewed as a function of the functionals (5) \(\omega _\psi , \omega _\zeta \) on \({\mathcal{M}}\). One can therefore also write \(S(\omega _\zeta | \omega _\psi )\) without ambiguity. In the case of the matrix algebra \({\mathcal{M}}=M_n(\mathbb {C})\), where \(\omega _\zeta \) and \(\omega _\psi \) are identified with density matrices as \(\omega _\psi (a) = {\text {Tr}}(a\omega _\psi )\), etc., the relative entropy is the usual expression (1).
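The reduction of (7) to (1) for matrix algebras can be sketched as follows. We assume, for this illustration only, that both density matrices are invertible, and we use the familiar standard form of \(M_n(\mathbb {C})\) acting by left multiplication on Hilbert–Schmidt operators, in which \(|\zeta \rangle = \omega _\zeta ^{1/2}\) and \(\Delta _{\psi ,\zeta }\) acts as \(a \mapsto \omega _\psi a \omega _\zeta ^{-1}\). These identifications are standard but are not stated in the text.

$$\begin{aligned} \langle \zeta | \Delta ^\alpha _{\psi ,\zeta } \zeta \rangle = {\text {Tr}}(\omega _\zeta ^{1/2} \, \omega _\psi ^{\alpha } \, \omega _\zeta ^{1/2} \, \omega _\zeta ^{-\alpha }) = {\text {Tr}}(\omega _\psi ^{\alpha } \omega _\zeta ^{1-\alpha }) = 1 + \alpha \, {\text {Tr}}(\omega _\zeta {\text {ln}}\omega _\psi - \omega _\zeta {\text {ln}}\omega _\zeta ) + O(\alpha ^2). \end{aligned}$$

Subtracting 1, dividing by \(\alpha \) and letting \(\alpha \rightarrow 0^+\) in (7) then gives \(S(\zeta | \psi ) = {\text {Tr}}(\omega _\zeta {\text {ln}}\omega _\zeta ) - {\text {Tr}}(\omega _\zeta {\text {ln}}\omega _\psi )\), which has the form (1).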

Kosaki [23] has given the following variational formula for two normalized state functionals \(\omega _\psi , \omega _\zeta \) on \({\mathcal{M}}\):

$$\begin{aligned} S(\omega _\zeta | \omega _\psi ) = \sup _{n \in {\mathbb N}} \sup _{x: (1/n,\infty ) \rightarrow {\mathcal{M}}} \left\{ {\text {ln}}n - \int _{1/n}^\infty [ \omega _\zeta (x(t)^* x(t)) + t^{-1} \omega _\psi (y(t) y(t)^*)] t^{-1} \mathrm{d}t \right\} , \end{aligned}$$
(8)

where the second supremum is over all step functions x(t) with finite range valued in \({\mathcal{M}}\), and where \(y(t) = 1-x(t)\). Formula (8) no longer makes explicit reference to modular theory, and the dependence on the state functionals (as opposed to vectors) is manifest. Some further explanations and uses of Kosaki’s formula are discussed, e.g., in [33], ch. 5.

2.2 Conditional expectations, index, and relative entropy

Let \({\mathcal{M}}, {\mathcal{N}}\) be two von Neumann algebras. A linear operator \(T:{\mathcal{M}}\rightarrow {\mathcal{N}}\) is called a channel if it is ultra-weakly continuous (“normal”), unital, \(T(1)=1\), and completely positive. The latter means that the induced mapping \(T \otimes id_n: {\mathcal{M}}\otimes M_n(\mathbb {C}) \rightarrow {\mathcal{N}}\otimes M_n(\mathbb {C})\) maps positive elements to positive elements for all \(n \ge 1\).

If \({\mathcal{N}}\subset {\mathcal{M}}\) is a von Neumann sub-algebra, then a quantum channel \(E: {\mathcal{M}}\rightarrow {\mathcal{N}}\) is called a conditional expectation if

$$\begin{aligned} E(n_1 m n_2)= n_1 E(m) n_2 \end{aligned}$$
(9)

for \(m \in {\mathcal{M}}, n_i \in {\mathcal{N}}\). The space of such conditional expectations is called \(C({\mathcal{M}}, {\mathcal{N}})\). A faithful normal operator-valued weight is an unbounded and unnormalized positive linear map \(N: {\mathcal{M}}\rightarrow {\mathcal{N}}\) with the same bimodule property and with dense domain in \({\mathcal{M}}_+\) (\(=\) non-negative elements of \({\mathcal{M}}\)) [17]. The space of such operator-valued weights is denoted \(P({\mathcal{M}}, {\mathcal{N}})\), and clearly \(C({\mathcal{M}}, {\mathcal{N}})\) is a subset thereof. Both \(C({\mathcal{M}},{\mathcal{N}})\) and \(P({\mathcal{M}},{\mathcal{N}})\) may be empty.

Let \({\mathcal{M}},{\mathcal{N}}\) be factors. If there exists \(E \in C({\mathcal{M}},{\mathcal{N}})\), then the best constant \(\lambda >0\) such that

$$\begin{aligned} E(m^*m) \ge \lambda ^{-1} m^*m \quad \text {for all }m \in {\mathcal{M}}\end{aligned}$$
(10)

is called ind(E), the index of E. If there is any conditional expectation at all, then there is one for which \(\lambda \) is minimal [18]. This \(\lambda = [{\mathcal{M}}:{\mathcal{N}}]\) is the Jones–Kosaki index of the inclusion [21, 24, 34].

Haagerup [17] has established a canonical correspondence \(N \in P({\mathcal{M}},{\mathcal{N}}) \leftrightarrow N^{-1} \in P({\mathcal{N}}', {\mathcal{M}}')\) satisfying \((N^{-1})^{-1} = N, (N_1 \circ N_2)^{-1} = N_2^{-1} \circ N_1^{-1}\). One can connect this to the notion of a “spatial derivative” [10]. To this end, let \({\mathcal{M}}\) be a von Neumann algebra acting on \(\mathscr {H}\), let \(|\zeta \rangle , |\psi \rangle \in \mathscr {H}\), and let \(B(\mathscr {H})\) be the von Neumann algebra of all bounded operators on \(\mathscr {H}\). Applying (5) to \({\mathcal{M}}\) and the commutant \({\mathcal{M}}'\), we get state functionals \(\omega _\zeta '\), respectively, \(\omega _\psi \) on \({\mathcal{M}}'\), respectively, \({\mathcal{M}}\). Now, the functional \(\omega _\psi : {\mathcal{M}}\rightarrow \mathbb {C}\) is a special case of a conditional expectation, so the dual conditional expectation \(\omega _\psi ^{-1}\) is in \(P(B(\mathscr {H}), {\mathcal{M}}')\). Thus, \(\omega '_\zeta \circ \omega _\psi ^{-1}\) is a weight on \(B(\mathscr {H})\). Such a weight defines a densely defined positive definite (sesquilinear) quadratic form on \(\mathscr {H}\) by

$$\begin{aligned} q_{\psi ,\zeta }(\phi _1, \phi _2) = \omega '_\zeta \circ \omega _\psi ^{-1}(|\phi _2\rangle \langle \phi _1|), \end{aligned}$$
(11)

and the operator on \(\mathscr {H}\) representing \(q_{\psi ,\zeta }\) is called the “spatial derivative,” \(\Delta _{{\mathcal{M}}}(\omega '_\zeta / \omega _\psi )\). It can be seen to depend only on the functionals \(\omega _\zeta '\), respectively, \(\omega _\psi \) on \({\mathcal{M}}'\), respectively, \({\mathcal{M}}\). \(\Delta _{{\mathcal{M}}}(\omega '_\zeta / \omega _\psi )\) equals the relative modular operator \(\Delta _{{\mathcal{M}};\zeta ,\psi }\) in case \(|\psi \rangle \in {{\mathscr {P}}}^\natural _{\mathcal{M}}\). It follows that if \(|\zeta \rangle \) is in the form domain of \({\text {ln}}\Delta _{{\mathcal{M}}}(\omega '_\zeta / \omega _\psi )\), then the relative entropy may also be written as \( S(\zeta |\psi ) = \langle \zeta | {\text {ln}}\Delta _{{\mathcal{M}}}(\omega '_\zeta / \omega _\psi ) \zeta \rangle . \) This representation and the structures established by [10, 17] have an immediate corollary for a conditional expectation \(E: {\mathcal{M}}\rightarrow {\mathcal{N}}\). First, by [10], thm. 9, the spatial derivative has the duality property

$$\begin{aligned} \Delta _{{\mathcal{M}}}(\omega '_\zeta / \omega _\psi ) = \Delta _{{\mathcal{M}}'}(\omega _\psi / \omega '_\zeta )^{-1}. \end{aligned}$$
(12)

Furthermore, \( \omega '_\psi \circ (\omega _\psi \circ E)^{-1} = (\omega '_\psi \circ E^{-1}) \circ \omega _\psi ^{-1}, \) so [24]

$$\begin{aligned} \Delta _{{\mathcal{M}}}(\omega '_\psi / \omega _\psi \circ E) = \Delta _{{\mathcal{N}}}(\omega '_\psi \circ E^{-1}/ \omega _\psi ) = \Delta _{{\mathcal{N}}'}(\omega _\psi /\omega '_\psi \circ E^{-1})^{-1}. \end{aligned}$$
(13)

Taking a log and the expectation value with respect to the vector \(|\psi \rangle \) then gives:

$$\begin{aligned} S_{{\mathcal{M}}}(\omega _\psi | \omega _\psi \circ E) + S_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E^{-1}) = 0. \end{aligned}$$
(14)

Note that \(E^{-1}\) is not normalized unless \(E=id\). If \({\mathcal{M}}\) is a factor and \(ind(E)=\lambda < \infty \), then it can be shown from (10) that 1 is in the domain of \(E^{-1}\) and \(\lambda 1 = E^{-1}(1)\). Therefore,

$$\begin{aligned} E' = \lambda ^{-1}E^{-1} \end{aligned}$$
(15)

is a (normalized) conditional expectation \(E' \in C({\mathcal{N}}', {\mathcal{M}}')\) [24]. In fact, if E is minimal, then so is \(E'\). The standard scaling properties of the relative entropy then give the following immediate corollary, which generalizes a result of [30], where the special case of finite-dimensional type I von Neumann algebras was treated by an explicit method:

Corollary 1

Let \({\mathcal{N}}\subset {\mathcal{M}}\) be an inclusion of von Neumann factors with finite index \([{\mathcal{M}}:{\mathcal{N}}]<\infty \), acting standardly on a Hilbert space \(\mathscr {H}\). Assume that \(E \in C({\mathcal{M}},{\mathcal{N}})\) is the minimal conditional expectation and \(E' \in C({\mathcal{N}}', {\mathcal{M}}')\) the dual minimal conditional expectation. For \(|\psi \rangle \in \mathscr {H}\), we have

$$\begin{aligned} S_{{\mathcal{M}}}(\omega _\psi | \omega _\psi \circ E) + S_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') = {\text {ln}}[{\mathcal{M}}:{\mathcal{N}}]. \end{aligned}$$
(16)

(Note that \(\omega _\psi '\) in the second expression means the functional (5) on \({\mathcal{N}}'\), etc.)
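The scaling step that leads from (14) to (16) can be made explicit. For a constant \(\lambda > 0\) and a normalized state \(\omega \), the relative entropy satisfies \(S(\omega | \lambda \omega ') = S(\omega | \omega ') - {\text {ln}}\lambda \); in the matrix case, this is immediate from (1) because \({\text {ln}}(\lambda \omega _\eta ) = {\text {ln}}\lambda + {\text {ln}}\omega _\eta \), and we take the general case for granted here. Assuming \(\Vert \psi \Vert = 1\) and inserting \(E^{-1} = \lambda E'\) from (15) with \(\lambda = [{\mathcal{M}}:{\mathcal{N}}]\) into (14) gives

$$\begin{aligned} 0 = S_{{\mathcal{M}}}(\omega _\psi | \omega _\psi \circ E) + S_{{\mathcal{N}}'}(\omega _\psi ' | \lambda \, \omega _\psi ' \circ E') = S_{{\mathcal{M}}}(\omega _\psi | \omega _\psi \circ E) + S_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') - {\text {ln}}\lambda , \end{aligned}$$

which is (16) after moving \({\text {ln}}\lambda \) to the other side.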

Results of a similar flavor have also been given by [39]. Interesting physical applications of the above “certainty relation” (16) involving Wilson–‘t Hooft operators in four-dimensional quantum Yang–Mills theory have recently been pointed out by [12, 30]. In such a situation, the algebras are expected to be of type III [9]. Then, the minimal conditional expectation E and its dual \(E'\) can be described more explicitly using Q-systems [27]. In this framework, \({\mathcal{M}}\) is generated by \({\mathcal{N}}\) together with a single operator, v, [see “Appendix”, Eq. (77)] and \({\mathcal{N}}'\) is generated by \({\mathcal{M}}'\) together with a single operator, \(v'\). The operators \(w=j_{\mathcal{N}}(v') \in {\mathcal{N}}\), \(w'=j_{\mathcal{M}}(v) \in {\mathcal{M}}'\) and the “canonical” endomorphisms

$$\begin{aligned} \gamma =j_{\mathcal{N}}j_{\mathcal{M}}: {\mathcal{M}}\rightarrow {\mathcal{N}}, \quad \gamma ' = j_{\mathcal{M}}j_{\mathcal{N}}: {\mathcal{N}}' \rightarrow {\mathcal{M}}' \end{aligned}$$
(17)

can be defined, where \(j_{\mathcal{N}}(n)=J_{\mathcal{N}}n J_{\mathcal{N}}\) and \(J_{\mathcal{N}}\) is the modular conjugation (Footnote 1) of \({\mathcal{N}}\), etc. The expectations \(E,E'\) are then given by (here \(d=[{\mathcal{M}}:{\mathcal{N}}]^{1/2}\))

$$\begin{aligned} E(m)=\frac{1}{d} w^* \gamma (m) w, \quad E'(n')= \frac{1}{d} w^{\prime *} \gamma '(n') w^\prime . \end{aligned}$$
(18)

Another property is that \(J_{\mathcal{M}}v'= v' J_{\mathcal{N}}, J_{\mathcal{M}}v= v J_{\mathcal{N}}\).

The operator \(v'\) is closely related to the idea of quantum error correcting codes as described by [14]: For the sake of easier comparison, define

$$\begin{aligned} V:= v'/\sqrt{d}, \quad V':= v/\sqrt{d}, \end{aligned}$$
(19)

with the normalizations made such that \(V, V'\) are isometries. For any \(|\psi \rangle , |\zeta \rangle \in \mathscr {H}\) we define states \(\omega _\zeta \) on \({\mathcal{N}}\) and \(\omega _\zeta '\) on \({\mathcal{N}}'\) by (5). Then, we have the implications

$$\begin{aligned} {\left\{ \begin{array}{ll} \omega _{\zeta }'|_{{\mathcal{N}}'} = \omega _{\psi }'|_{{\mathcal{N}}'} \Longrightarrow &{} \omega _{V\zeta }'|_{{\mathcal{M}}'} = \omega _{V\psi }'|_{{\mathcal{M}}'}\\ \omega _{\zeta }|_{{\mathcal{N}}} = \omega _{\psi }|_{{\mathcal{N}}} \ \ \Longrightarrow &{} \omega _{V\zeta }|_{{\mathcal{M}}} = \omega _{V\psi }|_{{\mathcal{M}}}, \end{array}\right. } \end{aligned}$$
(20)

so \({\mathcal{M}}\) is “standardly c-reconstructible” from \({\mathcal{N}}\) in the terminology of [14]. In the context of holography, \({\mathcal{N}}\) would be a bulk observable algebra, \({\mathcal{M}}\) a corresponding CFT algebra and the subspace \(V\mathscr {H}\subset \mathscr {H}\) the “code subspace.” Dually, the operator \(V'\) is used in a similar way to “standardly c-reconstruct” \({\mathcal{N}}'\) from \({\mathcal{M}}'\), with similar relations. While the existence and properties of the operator V are equivalent to the existence of some conditional expectation \(E:{\mathcal{M}}\rightarrow {\mathcal{N}}\) alone [14], thm. 7, the existence of the operator \(V'\) for the dual code does not follow from these results but requires a finite index (and a minimal conditional expectation).

These facts can be used to give an “error correction version” of the certainty relation expressed by Corollary 1. We simply observe the equalities

$$\begin{aligned} E(m) = \frac{1}{d} w^* \gamma (m) w = \frac{1}{d} j_{\mathcal{N}}(v^{\prime *}) j_{\mathcal{N}}j_{\mathcal{M}}(m) j_{\mathcal{N}}(v^{\prime }) = \frac{1}{d} J_{\mathcal{N}}v^{\prime *} J_{\mathcal{M}}m J_{\mathcal{M}}v^{\prime } J_{\mathcal{N}}= V^* m V \end{aligned}$$
(21)

and, similarly, we get \(E'(n') = V^{\prime *} n' V'\) for \(n' \in {\mathcal{N}}'\). This gives in view of Corollary 1:

Corollary 2

(Error correcting code version) Let \({\mathcal{M}}\supset {\mathcal{N}}\) be an inclusion of type III von Neumann factors with finite index and let \(|\psi \rangle \in \mathscr {H}\). Let V be a code operator as in (20) and \(V'\) the dual code operator. Then,

$$\begin{aligned} S_{{\mathcal{M}}}( \omega _\psi | \omega _{V\psi } ) + S_{{\mathcal{N}}'}(\omega _\psi ' | \omega _{V'\psi }') = {\text {ln}}[{\mathcal{M}}:{\mathcal{N}}]. \end{aligned}$$
(22)
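The passage from Corollary 1 to Corollary 2 rests on two short computations, sketched here under the conventions of (18), (19) and (21). First, since the modular conjugations are anti-unitary involutions, taking adjoints in \(J_{\mathcal{M}}v' = v' J_{\mathcal{N}}\) gives \(v^{\prime *} J_{\mathcal{M}}= J_{\mathcal{N}}v^{\prime *}\), which is what collapses the last step of (21):

$$\begin{aligned} \frac{1}{d} J_{\mathcal{N}}v^{\prime *} J_{\mathcal{M}}\, m \, J_{\mathcal{M}}v^{\prime } J_{\mathcal{N}}= \frac{1}{d} J_{\mathcal{N}}J_{\mathcal{N}}v^{\prime *} \, m \, v^{\prime } J_{\mathcal{N}}J_{\mathcal{N}}= \frac{1}{d} v^{\prime *} m v^{\prime } = V^* m V . \end{aligned}$$

Second, the state \(\omega _\psi \circ E\) is the vector state of \(V|\psi \rangle \):

$$\begin{aligned} \omega _\psi \circ E(m) = \langle \psi | V^* m V \psi \rangle = \langle V\psi | m V \psi \rangle = \omega _{V\psi }(m), \quad m \in {\mathcal{M}}, \end{aligned}$$

and in the same way \(\omega _\psi ' \circ E' = \omega _{V'\psi }'\) on \({\mathcal{N}}'\). Substituting both identities into (16) gives (22).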

2.3 Sandwiched Renyi divergence

A family of entropy functionals for von Neumann algebras extrapolating the relative entropy are the “sandwiched Renyi divergences (entropies)” [32, 38]. In the general von Neumann algebra setting, they can be defined in terms of certain \(L_p\) norms [5, 6], which were defined by [4] (see also [6, 20]) relative to a fixed cyclic and separating vector \(|\psi \rangle \in \mathscr {H}\) in the natural cone of a standard representation of a von Neumann algebra \({\mathcal{M}}\).

More precisely, for \(1 \le p \le 2\), \(L_p({\mathcal{M}}, \psi )\) is defined as the completion of \(\mathscr {H}\) with respect to the following pseudo-norm (Footnote 2):

$$\begin{aligned} \Vert \zeta \Vert _{p,\psi } = \inf \{ \Vert \Delta _{\phi , \psi }^{(1/2)-(1/p)}\zeta \Vert : \Vert \phi \Vert =1, \pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta ) \}. \end{aligned}$$
(23)

We deviate slightly from [4] by imposing \(\pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta )\) instead of \(\pi ^{\mathcal{M}}(\phi ) \ge \pi ^{\mathcal{M}}(\zeta )\), but this makes no difference for the resulting formula in the case of finite-dimensional von Neumann algebras, nor for the results that we invoke below. It is well-known that any vector \(|\zeta '\rangle \in \mathscr {H}\) such that \(\omega _{\zeta '}' = \omega _\zeta '\) as state functionals on \({\mathcal{M}}'\) is related to \(|\zeta \rangle \in \mathscr {H}\) by \(|\zeta '\rangle = u|\zeta \rangle \), where \(u \in {\mathcal{M}}, u^* u = \pi ^{{\mathcal{M}}}(\zeta )\). Furthermore, it is known [4], thm. 5 that \(\Vert m \zeta \Vert _{p,\psi } \le \Vert m \Vert \Vert \zeta \Vert _{p,\psi }\) for any \(m \in {\mathcal{M}}\), and from this one can easily see that \(\Vert \zeta \Vert _{p,\psi }\) depends on \(|\zeta \rangle \in \mathscr {H}\) only via the induced functional \(\omega '_\zeta \) on \({\mathcal{M}}'\). Vice versa, to get an invariant for state functionals on \({\mathcal{M}}\), one should therefore consider the \(L_p\) norms relative to \({\mathcal{M}}'\) as in the following definition.

Definition 1

Let \({\mathcal{M}}\) be a von Neumann algebra in standard form acting on \(\mathscr {H}\), \(|\zeta \rangle \in \mathscr {H}\). The “sandwiched Renyi divergences” [32, 38] \(D_s, s \in [1/2,1)\) are defined by

$$\begin{aligned} D_s(\omega _\zeta | \omega _\psi ) = (s-1)^{-1} \, {\text {ln}}\Vert \zeta \Vert _{2s,\psi ,{\mathcal{M}}'}^{2s} \end{aligned}$$
(24)

with norm taken relative to \({\mathcal{M}}'\).

The \(L_p\) norms for \(p > 2\) and corresponding sandwiched Renyi entropies for \(s>1\) may be defined by duality [4], thm. 1, but the precise definition is not needed for the purposes of this paper. The generalization of the definition to non-faithful state functionals \(\omega _\psi \) whose representing vector \(|\psi \rangle \) is not separating is given in [6], who also prove key properties of \(D_s\) [5, 15, 32, 38] in the general von Neumann setting. States on the finite-dimensional type I factor \({\mathcal{A}}= M_n(\mathbb {C})\) correspond to density matrices via \(\omega _\psi (a)={\text {Tr}}(a\omega _\psi )\). In that case, the definition gives

$$\begin{aligned} D_s(\omega _\zeta | \omega _\psi ) = (s-1)^{-1} \, {\text {ln}}{\text {Tr}}(\omega _\psi ^{(1-s)/(2s)} \omega _\zeta \omega _\psi ^{(1-s)/(2s)})^s. \end{aligned}$$
(25)
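As a consistency check on (25), suppose (an assumption made only for this illustration) that the two density matrices commute, with eigenvalues \(p_i\) for \(\omega _\zeta \) and \(q_i\) for \(\omega _\psi \) in a common eigenbasis. The sandwiching then becomes trivial, and one recovers the classical Renyi divergence of order s:

$$\begin{aligned} D_s(\omega _\zeta | \omega _\psi ) = (s-1)^{-1} \, {\text {ln}}\sum _i \left( q_i^{(1-s)/s} \, p_i \right) ^s = (s-1)^{-1} \, {\text {ln}}\sum _i p_i^{s} \, q_i^{1-s} . \end{aligned}$$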

The sandwiched Renyi divergences extrapolate the relative entropy S which can be recovered as the limit \(s \rightarrow 1^-\) [6], thm. 13. At the other end, for \(s \rightarrow 1/2^+\), one recovers the negative log squared fidelity. In fact, the \(L_1\) norm relative to \({\mathcal{M}}'\) is related to the fidelity [1, 37] relative to \({\mathcal{M}}\) by

$$\begin{aligned} \Vert \zeta \Vert _{1,\psi ,{\mathcal{M}}'} = \sup \{ |\langle \zeta | a' \psi \rangle | : a' \in {\mathcal{M}}', \Vert a'\Vert =1 \} = F_{{\mathcal{M}}}(\omega _\zeta | \omega _\psi ), \end{aligned}$$
(26)

see [13], lem. 3 (1), which generalizes [4], lem. 5.3 when \(\psi \) is not necessarily faithful.

By [6], prop. 4, one has \(D_s \le S\) for \(s \le 1\). Therefore, Corollary 1 implies:

Corollary 3

For a finite index inclusion \({\mathcal{N}}\subset {\mathcal{M}}\) of von Neumann factors with minimal conditional expectation \(E:{\mathcal{M}}\rightarrow {\mathcal{N}}\), we have

$$\begin{aligned} D_{s}^{\mathcal{M}}(\omega _\psi | \omega _\psi \circ E) + D_{s}^{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') \le {\text {ln}}[{\mathcal{M}}:{\mathcal{N}}] \end{aligned}$$
(27)

for any \(s \in [1/2,1)\) and any vector \(|\psi \rangle \in \mathscr {H}\) with induced state functionals \(\omega '_\psi \) on \({\mathcal{N}}'\) and \(\omega _\psi \) on \({\mathcal{M}}\).

A noteworthy special case arises for \(s=1/2\):

$$\begin{aligned} F_{\mathcal{M}}(\omega _\psi | \omega _\psi \circ E) \cdot F_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') \ge \frac{1}{\sqrt{[{\mathcal{M}}:{\mathcal{N}}]}}. \end{aligned}$$
(28)

There are also evident error correcting code formulations of this analogous to Corollary 2.
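For completeness, here is the short computation behind (28). By (24) and (26), \(D_{1/2}(\omega _\zeta | \omega _\psi ) = -2 \, {\text {ln}}\Vert \zeta \Vert _{1,\psi ,{\mathcal{M}}'} = -2 \, {\text {ln}}F_{\mathcal{M}}(\omega _\zeta | \omega _\psi )\). Inserting this into (27) at \(s=1/2\) gives

$$\begin{aligned} -2 \, {\text {ln}}\left[ F_{\mathcal{M}}(\omega _\psi | \omega _\psi \circ E) \cdot F_{{\mathcal{N}}'}(\omega _\psi ' | \omega _\psi ' \circ E') \right] \le {\text {ln}}[{\mathcal{M}}:{\mathcal{N}}], \end{aligned}$$

and dividing by \(-2\) and exponentiating yields (28).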

3 Variational formulas

Here, we point out a variational expression related to the \(L_p\) norms in the range \(p \in (1,2)\) similar to Kosaki’s formula [23] for the relative entropy. To simplify the discussion, we assume that the fixed reference vector \(|\psi \rangle \in \mathscr {P}^\natural _{\mathcal{M}}\) in the definition of the \(L_p\) norm is cyclic and separating for the von Neumann algebra \({\mathcal{M}}\) in standard form acting on \(\mathscr {H}\). Let \(\Delta ' \equiv \Delta '_{\psi ,\phi }\) be the relative modular operator defined using \({\mathcal{M}}'\) and the unit vector \(|\phi \rangle \in \mathscr {P}^\natural _{\mathcal{M}}\). Then, the support of \(\Delta '\) is \(\pi ^{\mathcal{M}}(\phi )\). \(\Delta ^{\prime 1/2}\) is a nonnegative self-adjoint operator whose domain we denote by \(\mathscr {D}(\phi )\) (it is equal to the closure in the graph norm of \({\mathcal{M}}' |\phi \rangle \oplus (1-\pi ^{{\mathcal{M}}}(\phi ))\mathscr {H}\)).

We can apply [33], lem. 5.8, showing that, for \(t>0\), \(|\zeta \rangle \in \mathscr {D}(\phi )\), we have

$$\begin{aligned} \langle \Delta ^{\prime }(\Delta ^{\prime }+t)^{-1} \zeta | \zeta \rangle = \inf \{ \Vert \xi \Vert ^2 + t^{-1} \Vert \Delta ^{\prime 1/2} \eta \Vert ^2 : |\xi \rangle + |\eta \rangle = |\zeta \rangle , \quad |\xi \rangle , |\eta \rangle \in \mathscr {D}(\phi ) \}. \end{aligned}$$
(29)
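A scalar version of (29) shows where the resolvent comes from. Replacing \(\Delta '\) by a number \(\lambda > 0\) and \(|\zeta \rangle \) by 1 (a toy case, not the operator statement of [33]), one minimizes over the splitting \(1 = x + y\) with real x, y:

$$\begin{aligned} \inf _{y \in \mathbb {R}} \left\{ (1-y)^2 + t^{-1} \lambda \, y^2 \right\} = \frac{\lambda }{\lambda + t}, \quad \text {attained at } y = \frac{t}{\lambda + t}, \ x = \frac{\lambda }{\lambda + t} . \end{aligned}$$

Indeed, setting the derivative to zero gives \(y (1 + \lambda /t) = 1\), and the value at the minimizer is \((\lambda ^2 + \lambda t)/(\lambda + t)^2 = \lambda /(\lambda +t)\). The same computation on the spectral decomposition of \(\Delta '\) suggests the minimizers \(|\eta \rangle = t(\Delta ' + t)^{-1} |\zeta \rangle \) and \(|\xi \rangle = \Delta '(\Delta ' + t)^{-1} |\zeta \rangle \) in (29).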

Now, we note the well-known formula

$$\begin{aligned} \lambda ^\alpha = \frac{\sin (\pi \alpha )}{\pi } \int _0^\infty \frac{\lambda }{t+\lambda } t^{\alpha -1} \mathrm{d}t \end{aligned}$$
(30)

when \(\lambda >0, \alpha \in (0,1)\), which is commonly used to trade the power \(\Delta ^{ \prime \alpha }\) for the resolvent \(\Delta ^{\prime }(\Delta ^{\prime }+t)^{-1}\), for which we have the variational expression (29). Then, arguing in the same way as in the proof of [33], prop. 5.10 gives

$$\begin{aligned} \Vert \Delta ^{ \prime \alpha /2} \zeta \Vert ^2 = \frac{\sin (\pi \alpha )}{\pi } \inf _{\xi , \eta :\mathbb {R}_+ \rightarrow \mathscr {D}(\phi )} \int _0^\infty [\Vert \xi (t)\Vert ^2 + t^{-1} \Vert \Delta ^{\prime 1/2} \eta (t) \Vert ^2] t^{\alpha -1} \mathrm{d}t,\qquad \end{aligned}$$
(31)

where \(|\zeta \rangle \in \mathscr {D}(\phi )\) and where the infimum is taken over all step functions \(\xi ,\eta :[0,\infty ] \rightarrow \mathscr {D}(\phi )\) with finite range such that \(|\xi (t)\rangle =|\zeta \rangle \) for sufficiently small \(t>0\), such that \(|\eta (t)\rangle =|\zeta \rangle \) for sufficiently large t, and such that \(|\xi (t)\rangle +|\eta (t)\rangle =|\zeta \rangle \). The support of \(\Delta '\) is \(\pi ^{\mathcal{M}}(\phi )\); thus, we have \(\Vert \Delta ^{\prime 1/2} \eta (t) \Vert = \Vert \Delta ^{\prime 1/2} \pi ^{\mathcal{M}}(\phi ) \eta (t) \Vert \). Taking into account the definition of the \(L_1\) norm relative to \({\mathcal{M}}\) and the relation \(\Delta _{\phi ,\psi }^{-1} = \Delta '_{\psi ,\phi }\), see e.g., [4], thm. C.1, we get \(\Vert \Delta ^{\prime 1/2} \eta (t) \Vert \ge \Vert \pi ^{\mathcal{M}}(\phi ) \eta (t) \Vert _{1,\psi , {\mathcal{M}}}\). The last expression is also equal to the fidelity \(F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta (t)}' | \omega _\psi ')\) relative to \({\mathcal{M}}'\) by (26). Taking furthermore into account the trivial fact that \(\Vert \pi ^{\mathcal{M}}(\phi ) \xi (t) \Vert \le \Vert \xi (t) \Vert \), we get

$$\begin{aligned}&\Vert \Delta ^{\prime \alpha /2} \zeta \Vert ^2 \ge \frac{\sin (\pi \alpha )}{\pi } \inf _{\xi , \eta :\mathbb {R}_+ \rightarrow \mathscr {D}(\phi )} \int _0^\infty \nonumber \\&[\omega _{\pi ^{\mathcal{M}}(\phi ) \xi (t)}'(1) + t^{-1} F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta (t)}' | \omega _\psi ')^2] t^{\alpha -1} \mathrm{d}t, \end{aligned}$$
(32)

where still \(|\zeta \rangle \in \mathscr {D}(\phi )\) and the infimum is still taken over all step functions as described above. We want to enlarge the domain of \(|\zeta \rangle \) for which (32) is valid.
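For reference, the integral formula (30) used above can be verified by a change of variables. Substituting \(t = \lambda u\) and using the Beta integral \(\int _0^\infty u^{\alpha -1} (1+u)^{-1} \mathrm{d}u = \Gamma (\alpha ) \Gamma (1-\alpha ) = \pi /\sin (\pi \alpha )\), valid for \(\alpha \in (0,1)\), one finds

$$\begin{aligned} \int _0^\infty \frac{\lambda }{t+\lambda } \, t^{\alpha -1} \mathrm{d}t = \lambda ^{\alpha } \int _0^\infty \frac{u^{\alpha -1}}{1+u} \, \mathrm{d}u = \frac{\pi }{\sin (\pi \alpha )} \, \lambda ^\alpha , \end{aligned}$$

which is (30) after multiplying by \(\sin (\pi \alpha )/\pi \). Both endpoints are integrable precisely because \(0< \alpha < 1\): the integrand behaves like \(t^{\alpha -1}\) near \(t=0\) and like \(t^{\alpha -2}\) for large t.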

Lemma 1

(32) holds for \(|\zeta \rangle \) in the domain of \(\Delta ^{\prime \alpha /2}\) provided that we take the infimum over functions \(\xi , \eta \) that are now valued in \(\mathscr {H}\) and have analogous properties otherwise.

Proof

First, by applying an increasing sequence of spectral projections we define \(|\zeta _n\rangle := E_{[0,n]}(\Delta ')|\zeta \rangle \in \mathscr {D}(\phi )\). Then, \(|\zeta _n\rangle \rightarrow |\zeta \rangle \), \(\Delta ^{\prime \alpha /2} |\zeta _n\rangle \rightarrow \Delta ^{\prime \alpha /2} |\zeta \rangle \) strongly, and the above inequality (32) holds for \(|\zeta _n\rangle \).

We next consider a sufficiently large n such that \(\Vert \zeta -\zeta _n \Vert \) and \(\Vert \Delta ^{\prime \alpha /2} \zeta - \Delta ^{\prime \alpha /2} \zeta _n \Vert \) are \(<\varepsilon \). For this fixed n, we then consider step functions \(\xi _n,\eta _n:[0,\infty ] \rightarrow \mathscr {D}(\phi )\) with finite range such that \(|\xi _n(t)\rangle =|\zeta _n\rangle \) for sufficiently small \(t>0\), such that \(|\eta _n(t)\rangle =|\zeta _n\rangle \) for sufficiently large t, such that \(|\xi _n(t) \rangle + |\eta _n(t)\rangle = |\zeta _n\rangle \), and such that the infimum on the right side of (32), applied to \(|\zeta _n\rangle \), is nearly achieved by the functions \(\xi _n,\eta _n\) up to a small tolerance, \(\varepsilon \). We also define new functions \({\tilde{\xi }}_n,{\tilde{\eta }}_n\) by

$$\begin{aligned} |{\tilde{\eta }}_n(t)\rangle = {\left\{ \begin{array}{ll} |\eta _n(t)\rangle &{} \text {for }t< 1,\\ |\zeta \rangle - |\xi _n(t)\rangle &{} \text {for }t \ge 1, \end{array}\right. } \quad |{\tilde{\xi }}_n(t)\rangle = {\left\{ \begin{array}{ll} |\xi _n(t)\rangle &{} \text {for }t\ge 1,\\ |\zeta \rangle - |\eta _n(t)\rangle &{} \text {for }t < 1. \end{array}\right. } \end{aligned}$$
(33)

The new functions have properties analogous to those of the old ones, with \(|\zeta _n\rangle \) replaced by \(|\zeta \rangle \). Furthermore, for \(t<1\), we have \(\Vert {\tilde{\eta }}_n(t) - \eta _n(t) \Vert = 0, \Vert \tilde{\xi }_n(t) - \xi _n(t) \Vert = \Vert \zeta - \zeta _n\Vert \), whereas for \(t \ge 1\), we have \(\Vert {\tilde{\xi }}_n(t) - \xi _n(t) \Vert = 0, \Vert \tilde{\eta }_n(t) - \eta _n(t) \Vert = \Vert \zeta - \zeta _n\Vert \). We now claim that

$$\begin{aligned} \Vert \Delta ^{\prime \alpha /2} \zeta \Vert ^2 \ge \frac{\sin (\pi \alpha )}{\pi } \int _0^\infty [\omega _{\pi ^{\mathcal{M}}(\phi ) {\tilde{\xi }}_n(t)}'(1) + t^{-1} F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \tilde{\eta }_n(t)}' | \omega _\psi ')^2] t^{\alpha -1} \mathrm{d}t - c\varepsilon , \end{aligned}$$
(34)

where c is a constant depending only on \(\alpha \) and \(|\zeta \rangle \) but not on n.

By construction such an inequality holds with \(c=1\) for \(|\zeta _n\rangle \) and the functions \(\xi _n,\eta _n\), and we want to deduce (34) from that. Also, by construction \(\Vert \Delta ^{\prime \alpha /2} \zeta \Vert ^2\) differs from \(\Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert ^2\) by at most \(c\varepsilon \). Then, if we split the integral in (34) into contributions from \(t \in [0,1)\) and \(t \in [1,\infty )\), we can see that the integral for the functions \(\xi _n,\eta _n\) is equal to the integral for the functions \({\tilde{\xi }}_n,{\tilde{\eta }}_n\) up to an error which is bounded in terms of the sum of the integrals

$$\begin{aligned} \begin{aligned}&\int _0^1 | \Vert \pi ^{\mathcal{M}}(\phi ) \xi _n(t)\Vert ^2 - \Vert \pi ^{\mathcal{M}}(\phi ) {\tilde{\xi }}_n(t)\Vert ^2 | t^{\alpha -1} \mathrm{d}t, \\&\int _1^\infty |F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta _n(t)}' | \omega _\psi ')^2 - F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) {\tilde{\eta }}_n(t)}' | \omega _\psi ')^2| t^{\alpha -2} \mathrm{d}t, \end{aligned} \end{aligned}$$
(35)

times numerical coefficients depending only on \(\alpha \). The second integral is shown to be of the order \(\Vert \zeta -\zeta _n\Vert <\varepsilon \) as follows. The fidelity is continuous under strong limits (see e.g. [13], lem. 11) which in the case at hand gives

$$\begin{aligned} |F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta _n(t)}' | \omega _\psi ') - F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \tilde{\eta }_n(t)}' | \omega _\psi ')| \le \Vert {\tilde{\eta }}_n(t) - \eta _n(t) \Vert , \end{aligned}$$
(36)

where we can use \(\Vert {\tilde{\eta }}_n(t) - \eta _n(t) \Vert = \Vert \zeta - \zeta _n\Vert <\varepsilon \). In the second integral, we actually have the squares of the fidelities, so we use the elementary inequality \(|a^2-{\tilde{a}}^2| \le |a-{\tilde{a}}|^2 + 2 |a| |a-{\tilde{a}}|\), where a and \({\tilde{a}}\) are the fidelities under the second integral (35) associated with \(\eta _n\) and \({\tilde{\eta }}_n\). For the term corresponding to \(2 |a| |a-{\tilde{a}}|\), we then need a bound on \(\int _1^\infty F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta _n(t)}' | \omega _\psi ') t^{\alpha -2} \mathrm{d}t\). Such a bound can be obtained immediately from (32), the Cauchy–Schwarz inequality, and the construction of the functions \(\xi _n, \eta _n\):

$$\begin{aligned} \left( \int _1^\infty F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta _n(t)}' | \omega _\psi ') t^{\alpha -2} \mathrm{d}t \right) ^2 \le c(\Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert ^2+\varepsilon ). \end{aligned}$$
(37)

Then, by construction \(\Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert \le \varepsilon + \Vert \Delta ^{\prime \alpha /2} \zeta \Vert \) which in turn is uniformly bounded in n. Also the first integral in (35) is shown to be of the order \(\Vert \zeta -\zeta _n\Vert <\varepsilon \). Here, we use first the reverse triangle inequality

$$\begin{aligned} | \Vert \pi ^{\mathcal{M}}(\phi ) \xi _n(t)\Vert - \Vert \pi ^{\mathcal{M}}(\phi ) {\tilde{\xi }}_n(t)\Vert | \le \Vert {\tilde{\xi }}_n(t) - \xi _n(t) \Vert , \end{aligned}$$
(38)

where we can use \(\Vert {\tilde{\xi }}_n(t) - \xi _n(t) \Vert = \Vert \zeta - \zeta _n\Vert <\varepsilon \). In the first integral, we actually have the squares of the norms, so we use the elementary inequality \(|a^2-\tilde{a}^2| \le |a-{\tilde{a}}|^2 + 2 |a| |a-{\tilde{a}}|\), where a and \(\tilde{a}\) are now the norms under the first integral (35) associated with \(\xi _n\) and \({\tilde{\xi }}_n\). For the term corresponding to \(2 |a| |a-{\tilde{a}}|\), we then need a bound on \(\int _0^1 \Vert \pi ^{\mathcal{M}}(\phi ) \xi _n(t)\Vert t^{\alpha -1} \mathrm{d}t\). Such a bound can be obtained immediately from (32), the Cauchy–Schwarz inequality, and the construction of the functions \(\xi _n, \eta _n\):

$$\begin{aligned} \left( \int _0^1 \Vert \pi ^{\mathcal{M}}(\phi ) \xi _n(t)\Vert t^{\alpha -1} \mathrm{d}t \right) ^2 \le c(\Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert ^2+\varepsilon ). \end{aligned}$$
(39)

Again, \(\Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert \) is uniformly bounded in n. Thus, we see that the integrals (35) have an upper bound of the form \(c\varepsilon \). Hence, our argument implies that (34) holds. Consequently, the inequality (32) also holds up to a tolerance of order \(c\varepsilon \) if we take the infimum over the set of all step functions \(\xi ,\eta :[0,\infty ] \rightarrow \mathscr {H}\) with finite range such that \(|\xi (t)\rangle =|\zeta \rangle \) for sufficiently small \(t>0\), such that \(|\eta (t)\rangle =|\zeta \rangle \) for sufficiently large t, and such that \(|\xi (t)\rangle +|\eta (t)\rangle =|\zeta \rangle \). Since \(\varepsilon \) was arbitrarily small, the statement follows. \(\square \)
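
For orientation, the two elementary estimates used in the proof can be spelled out as follows. This is only a sketch with non-optimized constants; \(F_n(t)\) is a shorthand introduced here for the fidelity \(F_{{\mathcal{M}}'}(\omega _{\pi ^{\mathcal{M}}(\phi ) \eta _n(t)}' | \omega _\psi ')\).

$$\begin{aligned} |a^2-{\tilde{a}}^2|&= |a-{\tilde{a}}| \, |a+{\tilde{a}}| \le |a-{\tilde{a}}| \, (|a-{\tilde{a}}| + 2|a|), \\ \left( \int _1^\infty F_n(t) \, t^{\alpha -2} \mathrm{d}t \right) ^2&\le \int _1^\infty t^{\alpha -2} \mathrm{d}t \int _1^\infty t^{-1} F_n(t)^2 \, t^{\alpha -1} \mathrm{d}t = \frac{1}{1-\alpha } \int _1^\infty t^{-1} F_n(t)^2 \, t^{\alpha -1} \mathrm{d}t. \end{aligned}$$

Since both terms in the integrand of (32) are non-negative and \(\xi _n,\eta _n\) attain the infimum up to \(\varepsilon \), the last integral is at most \(\frac{\pi }{\sin (\pi \alpha )} \Vert \Delta ^{\prime \alpha /2} \zeta _n \Vert ^2 + \varepsilon \), which gives a bound of the form (37). The estimate (39) follows in the same way, using \(\int _0^1 t^{\alpha -1} \mathrm{d}t = 1/\alpha \).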

Suppose now that \(\pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta )\). The structure of the right side of (32) implies that the range of the step functions may be restricted to \(\pi ^{\mathcal{M}}(\phi ) \mathscr {H}= \pi ^{\mathcal{M}}(\zeta ) \mathscr {H}\). Suppose we have step functions \(\xi ,\eta :[0,\infty ] \rightarrow \pi ^{\mathcal{M}}(\zeta ) \mathscr {H}\) such that the infimum in (32) is attained up to a small tolerance. Then, we can pick step functions \(x',y': [0,\infty ] \rightarrow {\mathcal{M}}'\) such that \(x'(t) + y'(t) = 1\), \(x'(t)=1\) for sufficiently small \(t>0\), \(y'(t)=1\) for sufficiently large \(t>0\), and such that \(\Vert x'(t) \zeta - \xi (t)\Vert \) and \(\Vert y'(t) \zeta - \eta (t)\Vert \) are small for all \(t>0\). Because the fidelity is continuous, see e.g., [13], lem. 11, we may replace the infimum in (32) by the infimum over all functions \(|\xi (t) \rangle , |\eta (t)\rangle \) of the form \(x'(t) |\zeta \rangle , y'(t) |\zeta \rangle \), where \(x'(t)+y'(t)=1\) and \(x',y': \mathbb {R}_+ \rightarrow {\mathcal{M}}'\) are step functions with finite range such that \(x'(t)=1\) for sufficiently small \(t>0\) and \(y'(t)=1\) for sufficiently large \(t>0\). Taking \(\alpha = 2/p-1 \in (0,1)\), and defining \(p', c_p\) as in (42), this yields

$$\begin{aligned} \Vert \Delta ^{\prime (1/p)-(1/2)}\zeta \Vert ^2\ge & {} c_p \inf _{x',y':\mathbb {R}_+ \rightarrow {\mathcal{M}}'} \int _0^\infty [\omega _{\zeta }'(x'(t)^* x'(t))\nonumber \\&+ t^{-1} F_{{\mathcal{M}}'}(y'(t) \omega _{\zeta }' y'(t)^*| \omega _\psi ')^2] t^{-2/p'} \mathrm{d}t, \end{aligned}$$
(40)

for all \(|\zeta \rangle \) in the domain of \(\Delta ^{\prime (1/p)-(1/2)}\) (which depends upon the unit vector \(|\phi \rangle \in \mathscr {P}^\natural _{\mathcal{M}}\)) such that \(\pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta )\).
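
The bookkeeping behind the passage from (32) to (40) can be summarized in the following short computation, which only uses \(\alpha = 2/p-1\) and the definitions of \(p'\) and \(c_p\) in (42). It is included as a consistency check rather than as a new statement.

$$\begin{aligned} \frac{\alpha }{2} = \frac{1}{p} - \frac{1}{2}, \qquad \alpha - 1 = \frac{2}{p} - 2 = -\frac{2}{p'}, \qquad \frac{\sin (\pi \alpha )}{\pi } = \frac{\sin (2\pi /p - \pi )}{\pi } = -\frac{\sin (2\pi /p)}{\pi } = c_p. \end{aligned}$$

In particular, \(1<p<2\) corresponds to \(0<\alpha <1\), and \(2\pi /p \in (\pi , 2\pi )\), so that \(c_p>0\).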

We now take into account the relation \(\Delta _{\phi ,\psi }^{-1} = \Delta '_{\psi ,\phi }\), see e.g., [4], thm. C. 1, on the left side of (40). We note that the right side no longer depends on the choice of \(|\phi \rangle \), whereas on the left side, we can drop the condition that \(|\phi \rangle \) is in the natural cone, because \(\Delta _{\phi ,\psi }\) is unchanged if we replace \(|\phi \rangle \) by \(u' |\phi \rangle \), \(u' \in {\mathcal{M}}', u'^{*} u'= \pi ^{{\mathcal{M}}'}(\phi )\) and such a replacement preserves \(\pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta )\).

Then, we minimize the left side of (40) for fixed \(|\zeta \rangle \) over unit vectors \(|\phi \rangle \in \mathscr {H}\) such that \(\pi ^{\mathcal{M}}(\phi ) = \pi ^{\mathcal{M}}(\zeta )\) and such that \(|\zeta \rangle \) is in the domain of \(\Delta ^{(1/2)-(1/p)}_{\phi ,\psi }\), which gives the \(L_p\) norm (23) of \(|\zeta \rangle \) relative to \({\mathcal{M}}\). As a consequence, the following proposition follows after reversing the roles of \({\mathcal{M}}\) and \({\mathcal{M}}'\):

Proposition 1

Let \({\mathcal{M}}\) be a von Neumann algebra in standard form with cyclic and separating vector \(|\psi \rangle \) in the natural cone. For any \(1< p < 2\), and \(|\zeta \rangle \in \mathscr {H}\), we have the variational formula

$$\begin{aligned} \Vert \zeta \Vert _{p,\psi ,{\mathcal{M}}'}^2 \ge c_p \inf _{x:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _\zeta (x(t)^*x(t)) + t^{-1} F_{{\mathcal{M}}}(y(t) \omega _\zeta y(t)^* | \omega _\psi )^2] t^{-2/p'} \mathrm{d}t , \end{aligned}$$
(41)

for the \(L_p\)-norm relative to \({\mathcal{M}}'\) and \(|\psi \rangle \). \(F_{{\mathcal{M}}}\) is the fidelity relative to \({\mathcal{M}}\),

$$\begin{aligned} c_p=-\frac{\sin (2\pi /p)}{\pi }>0, \quad \frac{1}{p} + \frac{1}{p'}=1, \end{aligned}$$
(42)

\(y(t)=1-x(t)\), \(x: \mathbb {R}_+ \rightarrow {\mathcal{M}}\) a step function with finite range such that \(x(t)=1\) for sufficiently small \(t>0\) and \(x(t)=0\) for sufficiently large \(t>0\), and we use the notation \((x\omega x^*)(a)=\omega (x^* a x)\).

Remarks

Both sides of the inequality (41) only depend on \(|\zeta \rangle \in \mathscr {H}\) via the functional \(\omega _\zeta \) on \({\mathcal{M}}\).

We will now start to investigate the variational expression in the proposition in its own right. For easier reference, we make the following definition where p corresponds to 2s.

Definition 2

Let \({\mathcal{M}}\) be a von Neumann algebra in standard form acting on \({\mathcal{H}}\), \(s\in (1/2,1)\). The “generalized fidelity” is defined by

$$\begin{aligned} \Phi _s(\omega _\zeta | \omega _\psi ) = {\text {ln}}\left\{ c_{2s} \inf _{x:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _\zeta (x(t)^*x(t)) + t^{-1} F(y(t) \omega _\zeta y(t)^* | \omega _\psi )^2] t^{\frac{1-s}{s}} \frac{\mathrm{d}t}{t} \right\} ^{\frac{s}{s-1}} \end{aligned}$$
(43)

with the infimum and notations as defined in Proposition 1.

Remarks

(1) The normalizations of \(\Phi _s\) are chosen in such a way that \( \Phi _s \ge D_s \) by Proposition 1.

(2) The terminology “generalized fidelity” is due to the following observation. Consider \({\mathcal{M}}= M_n(\mathbb {C})\) and diagonal (normalized) density matrices \(\omega _\zeta = diag(p_1, \dots , p_n), \omega _\psi = diag(q_1, \dots , q_n)\). We use the abbreviation \(F=F(\omega _\zeta | \omega _\psi ) = \sum _i \sqrt{p_iq_i}\) for the fidelity. By considering the variational expression in the definition of \(\Phi _s\) with diagonal \(x(t) = diag(x_1(t), \dots , x_n(t))\), one can easily convince oneself that the infimum can be reached by approximations of

$$\begin{aligned} x_i(t) = \sqrt{\frac{q_i}{p_i}} \frac{F}{t+1} \end{aligned}$$
(44)

by step functions. Inserting this into the variational formula one gets \(\Phi _s \ge -\frac{s}{1-s}{\text {ln}}F^2\). Corollary 6 shows that an inequality of this type with a worse constant is true generally. On the other hand, as given in Corollary 6, we always have the reverse inequality which implies that \(\Phi _s =-\frac{s}{1-s}{\text {ln}}F^2\) in the present case. This becomes (minus log of) the squared fidelity when \(s=1/2\).
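
The claim about (44) can be made plausible by the following formal computation. It is a sketch under the assumptions \(p_i>0\) and \(y_i(t) = 1-x_i(t) \ge 0\), so that the fidelity of the diagonal functionals is \(\sum _i \sqrt{p_iq_i} \, y_i(t)\); the shorthand \(G\) is introduced here only for convenience. For fixed t, one minimizes the integrand pointwise:

$$\begin{aligned}&\frac{\partial }{\partial x_i} \Big [ \sum _j p_j x_j^2 + t^{-1} G^2 \Big ] = 2 p_i x_i - 2 t^{-1} \sqrt{p_iq_i} \, G = 0, \qquad G := \sum _j \sqrt{p_jq_j} \, (1-x_j), \\&\Longrightarrow \quad x_i = \sqrt{\frac{q_i}{p_i}} \frac{G}{t}, \qquad G = F - \frac{G}{t} \sum _j q_j = F - \frac{G}{t} \quad \Longrightarrow \quad G = \frac{tF}{t+1}, \quad x_i = \sqrt{\frac{q_i}{p_i}} \frac{F}{t+1}, \\&\sum _i p_i x_i^2 + t^{-1} G^2 = \frac{F^2}{(t+1)^2} + \frac{tF^2}{(t+1)^2} = \frac{F^2}{t+1}. \end{aligned}$$

Integrating against \(c_p \, t^{-2/p'} \mathrm{d}t\) with \(p=2s\), and using \(c_p \int _0^\infty (t+1)^{-1} t^{-2/p'} \mathrm{d}t = 1\), the variational expression evaluated on (44) equals \(F^2\).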

Using this formula for \(\Phi _s\) in the commutative case, the inequality \(\Phi _s \ge D_s\) (Proposition 1) is seen to be equivalent to Hölder’s inequality applied to the fidelity \(F=\sum _i \sqrt{p_iq_i}\). Indeed, taking the Hölder exponents to be \(p=2s, p'=(2s)/(2s-1)\), one has

$$\begin{aligned} \sum _i \sqrt{p_iq_i} = \sum _i (p_i^{1/2} q_i^{(1-s)/(2s)}) q_i^{(2s-1)/(2s)} \le \left( \sum _i p_i^s q_i^{1-s} \right) ^{1/(2s)} \end{aligned}$$
(45)

using also \(\sum _i q_i=1\). Then, applying log to this inequality and using that \(D_s = (s-1)^{-1} {\text {ln}}\sum _i p_i^s q_i^{1-s}\) in the commutative case, we get \(-\frac{s}{1-s}{\text {ln}}F^2 = \Phi _s \ge D_s\), i.e., Proposition 1 in the commutative case.
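
To make the last step explicit: taking logarithms in (45) and multiplying by \(s/(s-1)<0\) reverses the inequality. A small numerical instance is added as a sanity check; the choice \(n=2\), \(s=3/4\) and the values of \(p_i, q_i\) are our own illustration and are not taken from the text, and the numbers are rounded.

$$\begin{aligned}&{\text {ln}}F^2 \le \frac{1}{s} {\text {ln}}\sum _i p_i^s q_i^{1-s} \quad \Longrightarrow \quad -\frac{s}{1-s} {\text {ln}}F^2 \ge \frac{1}{s-1} {\text {ln}}\sum _i p_i^s q_i^{1-s} = D_s, \\&(p_1,p_2) = (1/2,1/2), \ (q_1,q_2) = (1/4,3/4), \ s=3/4: \quad F^2 \approx 0.933, \quad -\frac{s}{1-s} {\text {ln}}F^2 \approx 0.208, \quad D_s \approx 0.106. \end{aligned}$$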

(3) The properties shown in the following indicate that \(\Phi _s\) has many of the desired properties of a divergence. To the best of our knowledge, \(\Phi _s\) is a new generalization of the negative log squared fidelity.

We now investigate some properties of \(\Phi _s\). First, consider \(|\zeta _1\rangle , |\zeta _2\rangle \) such that \(\omega _{\zeta _1} \le \omega _{\zeta _2}\) in the sense of functionals on the von Neumann algebra \({\mathcal{M}}\). It is well-known that such a condition implies the existence of \(a' \in {\mathcal{M}}'\) such that \(|\zeta _1\rangle = a'|\zeta _2\rangle \) and \(\Vert a'\Vert \le 1\). Then, (26) immediately gives:

$$\begin{aligned} \begin{aligned} F_{\mathcal{M}}(y\omega _{\zeta _1}y^*, \omega _\psi )&= \sup \{ |\langle y \zeta _1 | b' \psi \rangle | : b' \in {\mathcal{M}}', \Vert b' \Vert =1\}\\&= \sup \{ |\langle y a' \zeta _2 | b' \psi \rangle | : b' \in {\mathcal{M}}', \Vert b' \Vert =1\}\\&= \sup \{ |\langle y \zeta _2 | a^{\prime *} b' \psi \rangle | : b' \in {\mathcal{M}}', \Vert b' \Vert =1\}\\&\le \sup \{ |\langle y \zeta _2 | c' \psi \rangle | : c' \in {\mathcal{M}}', \Vert c' \Vert =1\}\\&= F_{\mathcal{M}}(y\omega _{\zeta _2}y^*, \omega _\psi ) \end{aligned} \end{aligned}$$
(46)

for any \(y \in {\mathcal{M}}\), since \(\Vert a^{\prime *} b'\Vert \le 1\) so the sup in the fourth line is over a larger set. But then the variational formula (41) gives without difficulty \(\Phi _s(\omega _{\zeta _1} | \omega _{\psi }) \ge \Phi _s(\omega _{\zeta _2} | \omega _{\psi })\). Similarly, consider \(|\psi _1\rangle , |\psi _2\rangle \) such that \(\omega _{\psi _1} \le \omega _{\psi _2}\). By the same argument \(F(y\omega _{\zeta }y^*, \omega _{\psi _1}) \le F(y\omega _{\zeta }y^*, \omega _{\psi _2})\), and the variational formula (41) thereby gives the following corollary:

Corollary 4

For normal positive functionals on a von Neumann algebra \(\omega _{\zeta _1} \le \omega _{\zeta _2}\) and \(\omega _{\psi _1} \le \omega _{\psi _2}\), we have also \(\Phi _s(\omega _{\zeta _1} | \omega _{\psi _1}) \ge \Phi _s(\omega _{\zeta _2} | \omega _{\psi _2})\) when \(1> s > 1/2\).

As an application, consider an inclusion of von Neumann factors \({\mathcal{N}}\subset {\mathcal{M}}\) together with a conditional expectation \(E: {\mathcal{M}}\rightarrow {\mathcal{N}}\) of finite index \(ind(E)=\lambda <\infty \), and a unit vector \(|\zeta \rangle \). Then by definition, \(\omega _\zeta \circ E \ge \lambda ^{-1} \omega _\zeta \). The identity \(\Phi _s(\omega _\zeta | \lambda ^{-1} \omega _\psi ) = \Phi _s(\omega _\zeta | \omega _\psi ) + {\text {ln}}\lambda \) [Corollary 6, (3)] and the previous corollary give

$$\begin{aligned} \Phi _s(\omega _\zeta | \omega _\zeta \circ E) \le {\text {ln}}\lambda \end{aligned}$$
(47)

because \(\Phi _s(\omega _\psi | \omega _\psi )=0\).
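
Spelled out, the chain of (in)equalities behind (47) reads as follows; it merely combines the ingredients just quoted, applied with \(\omega _\psi = \omega _\zeta \).

$$\begin{aligned} \Phi _s(\omega _\zeta | \omega _\zeta \circ E) \le \Phi _s(\omega _\zeta | \lambda ^{-1} \omega _\zeta ) = \Phi _s(\omega _\zeta | \omega _\zeta ) + {\text {ln}}\lambda = {\text {ln}}\lambda . \end{aligned}$$

Here, the first step is Corollary 4 applied to the second argument, using \(\lambda ^{-1} \omega _\zeta \le \omega _\zeta \circ E\).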

We can also prove the DPI for \(\Phi _s\) in the context of properly infinite von Neumann algebras using only properties of the fidelity in the range \(1/2 \le s \le 1\).

Corollary 5

Let \({\mathcal{M}}, {\mathcal{N}}\) be properly infinite von Neumann algebras and \(T: {\mathcal{N}}\rightarrow {\mathcal{M}}\) a channel. Then, for two normal state functionals \(\omega _\zeta , \omega _\psi \) on \({\mathcal{M}}\) we have \(\Phi _s(\omega _\zeta \circ T| \omega _\psi \circ T) \le \Phi _s(\omega _\zeta | \omega _\psi )\) for \(s \in (1/2,1)\).

Proof

By [25], thm. 2.10 (which assumes properly infinite von Neumann algebras), T can be written in Stinespring form \(T(b)=v^*\rho (b)v\), where \(v \in {\mathcal{M}}, v^* v=1, vv^*=q\) (q a projection) and \(\rho :{\mathcal{N}}\rightarrow {\mathcal{M}}\) a homomorphism of von Neumann algebras. Then, it is sufficient to prove the theorem separately for the case (i) \(T_1(a) = v^*av\) and the case (ii) \(T_2(b)=\rho (b)\).

(i) Using (26) with \({\mathcal{M}}'\) in place of \({\mathcal{M}}\), we have for \(y \in {\mathcal{M}}\):

$$\begin{aligned} \begin{aligned} F_{{\mathcal{M}}}(y\omega _{v\zeta }y^* | \omega _{v\psi })&= \sup \{ |\langle yv\zeta | x'v\psi \rangle | : \Vert x' \Vert =1, x' \in {\mathcal{M}}' \}\\&= \sup \{ |\langle yv\zeta | vx'\psi \rangle | : \Vert x' \Vert =1, x' \in {\mathcal{M}}' \}\\&= \sup \{ |\langle v^*yv\zeta | x'\psi \rangle | : \Vert x' \Vert =1, x' \in {\mathcal{M}}' \} \\&= F_{{\mathcal{M}}}( (v^*yv) \omega _{\zeta } (v^*yv)^* | \omega _\psi ). \end{aligned} \end{aligned}$$
(48)

Furthermore,

$$\begin{aligned} \omega _{v\zeta }( x^* x) = \omega _\zeta (v^* x^* x v) \ge \omega _\zeta ((v^*xv)^* v^*xv) . \end{aligned}$$
(49)

Then, we have

$$\begin{aligned}&c_p \inf _{x: \mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _{v\zeta }( x(t)^* x(t)) + t^{-1} F_{\mathcal{M}}( y(t) \omega _{v\zeta } y(t)^* | \omega _{v\psi })^2] t^{-2/p'} \mathrm{d}t \nonumber \\&\quad = c_p \inf _{x: \mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _{v\zeta }( x(t)^* x(t)) + t^{-1} F_{\mathcal{M}}( (v^* y(t) v) \omega _{\zeta } (v^* y(t) v)^* | \omega _{\psi })^2] t^{-2/p'} \mathrm{d}t \nonumber \\&\quad \ge c_p \inf _{x:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _\zeta (X(t)^* X(t)) + t^{-1} F_{{\mathcal{M}}}( Y(t) \omega _{\zeta } Y(t)^* | \omega _\psi )^2 ] t^{-2/p'} \mathrm{d}t , \end{aligned}$$
(50)

where \(Y(t)=v^*y(t) v, X(t)=v^* x(t) v\). Note that these are particular examples of piecewise constant functions valued in \({\mathcal{M}}\) with finite range such that \(X(t)+Y(t)=1\) and such that \(Y(t)=0\) for sufficiently small t and \(X(t)=0\) for sufficiently large t. Thus, taking the infimum over all such functions can only make the right side smaller. This results in \(\Phi _s(\omega _{v\zeta }| \omega _{v\psi }) \le \Phi _s(\omega _{\zeta }| \omega _{\psi })\) using the definition of \(\Phi _s\) (43) (where \(2s=p\)).

(ii) We have

$$\begin{aligned}&c_p \inf _{x:\mathbb {R}_+ \rightarrow \rho ({\mathcal{N}})} \int _0^\infty [ \omega _\zeta (x(t)^* x(t)) + t^{-1} F_{\rho ({\mathcal{N}})}(y(t)\omega _\zeta y(t)^* | \omega _\psi )^2 ] t^{-2/p'} \mathrm{d}t \nonumber \\&\quad \ge \ c_p \inf _{X:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [ \omega _\zeta (X(t)^* X(t)) + t^{-1} F_{\rho ({\mathcal{N}})}(Y(t) \omega _\zeta Y(t)^* | \omega _\psi )^2 ] t^{-2/p'} \mathrm{d}t \nonumber \\&\quad \ge \ c_p \inf _{X:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [ \omega _\zeta (X(t)^* X(t)) + t^{-1} F_{{\mathcal{M}}}(Y(t) \omega _\zeta Y(t)^* | \omega _\psi )^2 ] t^{-2/p'} \mathrm{d}t,\qquad \end{aligned}$$
(51)

where in the first step we took the infimum over the larger set of piecewise constant functions X valued in \({\mathcal{M}}\) with finite range such that \(1-X(t)=Y(t)=0\) for sufficiently small t and \(X(t)=0\) for sufficiently large t. In the second step, we used the monotonicity \(F_{\rho ({\mathcal{N}})} \ge F_{\mathcal{M}}\) since \(\rho ({\mathcal{N}})\) is a von Neumann subalgebra of \({\mathcal{M}}\), by (26). This yields \(\Phi _s(\omega _{\zeta } \circ \rho | \omega _{\psi } \circ \rho ) \le \Phi _s(\omega _{\zeta }| \omega _{\psi })\). \(\square \)
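
The reduction to the two cases can be made explicit as follows. This is a sketch that only uses \(T(b)=v^*\rho (b)v\) and the two inequalities just established. Since \(\omega _\zeta \circ T(b) = \langle v\zeta | \rho (b) v\zeta \rangle = \omega _{v\zeta } \circ \rho (b)\), and likewise for \(|\psi \rangle \), we get

$$\begin{aligned} \Phi _s(\omega _\zeta \circ T| \omega _\psi \circ T) = \Phi _s(\omega _{v\zeta } \circ \rho | \omega _{v\psi } \circ \rho ) \le \Phi _s(\omega _{v\zeta }| \omega _{v\psi }) \le \Phi _s(\omega _\zeta | \omega _\psi ), \end{aligned}$$

where the first inequality is case (ii) applied to the vectors \(v|\zeta \rangle , v|\psi \rangle \) and the second inequality is case (i).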

Applying the DPI to the channel \( {\mathcal{A}}\rightarrow {\mathcal{A}}\oplus \cdots \oplus {\mathcal{A}}, a \mapsto a \oplus \cdots \oplus a\) and the states \(\rho = \oplus _i \lambda _i \omega _{\psi _i}, \sigma = \oplus _i \lambda _i \omega _{\zeta _i}\) implies that \(\Phi _s\) is jointly convex by a standard argument, see e.g., [32], proof of prop. 1,

$$\begin{aligned} \sum _i \lambda _i \Phi _s(\omega _{\zeta _i} | \omega _{\psi _i}) \ge \Phi _s(\sum _i \lambda _i \omega _{\zeta _i} | \sum _j \lambda _j \omega _{\psi _j}) \end{aligned}$$
(52)

where the sum is finite and \(\lambda _i \ge 0, \sum \lambda _i=1\). Next, we obtain the following corollary:

Corollary 6

Let \({\mathcal{M}}\) be a von Neumann algebra and \(s \in (1/2,1)\).

(1) We have for \(\Vert \zeta \Vert =1\)

$$\begin{aligned} \Phi _s(\omega _\zeta | \omega _\psi ) \ge -{\text {ln}}F(\omega _\zeta | \omega _\psi )^2. \end{aligned}$$
(53)

(2) We have for \(\Vert \psi \Vert =1\)

$$\begin{aligned} \Phi _s(\omega _\zeta | \omega _\psi ) \le -\frac{s}{1-s} {\text {ln}}F(\omega _\zeta | \omega _\psi )^2. \end{aligned}$$
(54)

(3) \(\Phi _s(\omega _\zeta | \lambda \omega _\psi ) = \Phi _s(\omega _\zeta | \omega _\psi ) - {\text {ln}}\lambda \) for \(\lambda > 0\).

(4) We have for \(\Vert \psi \Vert =1=\Vert \zeta \Vert \) that \(\lim _{s \rightarrow (1/2)^+} \Phi _{s} (\omega _\zeta | \omega _\psi ) = -{\text {ln}}F(\omega _\zeta | \omega _\psi )^2\).

(5) \(\Phi _s(\omega _\zeta | \omega _\psi ) \ge 0\) for \(\Vert \psi \Vert =1=\Vert \zeta \Vert \) with equality iff \(\omega _\zeta = \omega _\psi \).

Proof

For (1), we choose an approximation of

$$\begin{aligned} x(t) = \frac{F(\omega _\zeta | \omega _\psi )^2}{t +F(\omega _\zeta | \omega _\psi )^2} 1 \end{aligned}$$
(55)

by step functions. Then, we apply the variational definition of \(\Phi _s\) (43) and the integral formula (30) upon which the result follows by a simple calculation using \(\Vert \zeta \Vert =1\).
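
The calculation may be sketched as follows, assuming \(F = F(\omega _\zeta | \omega _\psi ) > 0\) and writing \(p=2s\), \(\alpha = 2/p-1 = (1-s)/s\), so that \(t^{-2/p'} = t^{\alpha -1}\). We assume here the standard identity \(\int _0^\infty (u+1)^{-1} u^{\alpha -1} \mathrm{d}u = \pi /\sin (\pi \alpha ) = c_p^{-1}\), which we take to be the content of the integral formula referred to above. For the scalar function (55), one has \(y(t) = t/(t+F^2)\) and \(F(y(t) \omega _\zeta y(t)^* | \omega _\psi ) = y(t) F\), hence

$$\begin{aligned} \omega _\zeta (x(t)^* x(t)) + t^{-1} F(y(t) \omega _\zeta y(t)^* | \omega _\psi )^2&= \frac{F^4}{(t+F^2)^2} + \frac{tF^2}{(t+F^2)^2} = \frac{F^2}{t+F^2}, \\ c_p \int _0^\infty \frac{F^2}{t+F^2} \, t^{\alpha -1} \mathrm{d}t&= c_p F^{2\alpha } \int _0^\infty \frac{u^{\alpha -1}}{u+1} \mathrm{d}u = F^{2(1-s)/s}, \end{aligned}$$

where \(t = F^2 u\) was substituted in the second line. Since the infimum in the definition of \(\Phi _s\) is at most this value and the exponent \(s/(s-1)\) is negative, we obtain \(\Phi _s \ge \frac{s}{s-1} {\text {ln}}F^{2(1-s)/s} = -{\text {ln}}F^2\).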

For (2), we first use the supremum characterization of the fidelity (26), by which we have \(F(y\omega _\zeta y^*,\omega _\psi )^2\ge |\langle \psi | y \zeta \rangle |^2 = \Vert P_\psi y \zeta \Vert ^2\), where \(P_\psi = |\psi \rangle \langle \psi |\) is a projector because \(\Vert \psi \Vert =1\). Then (\(p=2s\)),

$$\begin{aligned} \begin{aligned}&c_p \inf _{x:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _\zeta (x(t)^*x(t)) + t^{-1} F(y(t) \omega _\zeta y(t)^* | \omega _\psi )^2] t^{-2/p'} \mathrm{d}t \\&\quad \ge \ c_p \inf _{x: \mathbb {R}_+ \rightarrow {\mathcal{M}}'} \int _0^\infty [\Vert x(t) \zeta \Vert ^2 + t^{-1} \Vert P_\psi y(t) \zeta \Vert ^2] t^{-2/p'} \mathrm{d}t \\&\quad = \ c_p \int _0^\infty \langle \zeta | P_\psi (t+P_\psi )^{-1} \zeta \rangle t^{-2/p'} \mathrm{d}t \\&\quad = \ c_p \Vert P_\psi \zeta \Vert ^2 \int _0^\infty (t+1)^{-1} t^{-2/p'} \mathrm{d}t = | \langle \zeta | \psi \rangle |^2. \end{aligned} \end{aligned}$$
(56)

This remains true if we change \(|\zeta \rangle \rightarrow u' |\zeta \rangle \) for any unitary \(u'\) from \({\mathcal{M}}'\), thus giving

$$\begin{aligned} \begin{aligned}&c_p \inf _{x:\mathbb {R}_+ \rightarrow {\mathcal{M}}} \int _0^\infty [\omega _\zeta (x(t)^*x(t)) + t^{-1} F(y(t) \omega _\zeta y(t)^* | \omega _\psi )^2] t^{-2/p'} \mathrm{d}t \\&\quad \ge \sup \{ | \langle u' \zeta | \psi \rangle |^2 : u' \in {\mathcal{M}}' \ \ \text {unitary}\} = F(\omega _\zeta | \omega _\psi )^2, \end{aligned} \end{aligned}$$
(57)

using a well-known characterization [1] of the fidelity in the last step. The rest then follows from the definition (43) of \(\Phi _s\).

For (3), we use the homogeneity of the fidelity \(F(y(t)\omega _\zeta y(t)^* | \lambda \omega _\psi ) = \sqrt{\lambda } F( y(t)\omega _\zeta y(t)^* | \omega _\psi )\) and apply the change of variables \(t' = t/\lambda \) in the integral (43).
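
In more detail, here is a sketch with \(p=2s\), \(\alpha = (1-s)/s\), \(t^{-2/p'} = t^{\alpha -1}\), and \(F_t\) a shorthand introduced here for the fidelity of \(y(t) \omega _\zeta y(t)^*\) and \(\omega _\psi \). Rescaling the second state by \(\lambda \) multiplies \(F_t^2\) by \(\lambda \), and with \(t = \lambda t'\) we find

$$\begin{aligned} \int _0^\infty [\omega _\zeta (x(t)^* x(t)) + t^{-1} \lambda F_t^2] \, t^{\alpha -1} \mathrm{d}t = \lambda ^{\alpha } \int _0^\infty [\omega _\zeta (x(\lambda t')^* x(\lambda t')) + t'^{-1} F_{\lambda t'}^2] \, t'^{\alpha -1} \mathrm{d}t'. \end{aligned}$$

Since \(t' \mapsto x(\lambda t')\) runs over the same class of step functions, the infimum is multiplied by \(\lambda ^{\alpha }\), and \(\frac{s}{s-1} {\text {ln}}\lambda ^{(1-s)/s} = -{\text {ln}}\lambda \).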

Item (4) is a combination of (1) and (2).

Item (5) follows from the properties \(F(\omega _\zeta | \omega _\psi ) \le 1\), \(F(\omega _\zeta | \omega _\psi )=1\) iff \(\omega _\zeta = \omega _\psi \), and items (1) and (2). \(\square \)

4 Application to quantum field theory

Here, we consider an application of \(\Phi _s\) to quantum field theory inspired by [26]. For simplicity and concreteness, we consider chiral conformal quantum field theories (CFTs) on a single lightray (real line) or equivalently the circle in the conformally compactified picture. But the arguments are of a rather general nature and would apply with some fairly obvious modifications to general quantum field theories in higher dimensions under appropriate hypotheses.

We assume axioms that are standard in algebraic quantum field theory [16]. According to this axiom scheme, fulfilled by many examples, a chiral CFT is an assignment \({\mathcal{A}}: I \mapsto {\mathcal{A}}(I)\), wherein \(I \subset S^1\) is an open interval and \({\mathcal{A}}(I)\) a von Neumann algebra acting on a fixed Hilbert space \(\mathscr {H}\) with the following properties:

  1. (Isotony) If \(I_1 \subset I_2\) then \({\mathcal{A}}(I_1) \subset {\mathcal{A}}(I_2)\).

  2. (Commutativity) If \(I_1 \cap I_2\) is empty, then \([{\mathcal{A}}(I_1) , {\mathcal{A}}(I_2)]=\{0\}\).

  3. (Möbius covariance) There is a strongly continuous unitary representation U on \(\mathscr {H}\) of the Möbius group \(G=\widetilde{SL_2({\mathbb R})/{\mathbb Z}_2}\) which is consistent with the standard action of this group on the circle by fractional linear transformations, in the sense \(U(g) {\mathcal{A}}(I) U(g)^* = {\mathcal{A}}(gI)\) for all \(g \in G\).

  4. (Positive energy) The one-parameter subgroup of rotations has a positive generator \(L_0\) under the representation U.

  5. (Vacuum) There is a unique vector \(|\Omega \rangle \in {\mathcal{H}}\), called the vacuum, which is invariant under all \(U(g), g \in G\).

  6. (Additivity) Let I and \(I_n\) be intervals such that \(I = \cup _n I_n\). Then, \({\mathcal{A}}(I)=\vee _n {\mathcal{A}}(I_n)\) (strong closure).

The special situation we would like to study here is that of two chiral CFTs \({\mathcal{A}}, {\mathcal{F}}\) in the above sense such that \({\mathcal{A}}(I) \subset {\mathcal{F}}(I)\) is an inclusion of von Neumann factors acting on the same Hilbert space \(\mathscr {H}\) for any open interval I, and transforming under the same representation, U. By general arguments, these factors have to be of type III [9]. A typical example is when \({\mathcal{A}}\) is the Virasoro net (operator algebras generated by the stress energy tensor) and \({\mathcal{F}}\) is an extension of finite index as classified in [22]. For further details on such a setting, see e.g., [27, 28]. We will also assume that the Jones-Kosaki index \(\lambda \equiv [{\mathcal{F}}(I):{\mathcal{A}}(I)]\) is finite (hence independent of I by [28]). By [27], lemma 13, this implies that for each I there is a conditional expectation \(E_I: {\mathcal{F}}(I) \rightarrow {\mathcal{A}}(I)\), satisfying the Pimsner–Popa inequality (10). We assume that \(E_I\) leaves the vacuum vector invariant, \(\omega _\Omega \circ E_I = \omega _\Omega \) for all intervals I. Furthermore, these conditional expectations are assumed to be consistent in the sense \(E_I |_{{\mathcal{F}}(J)} = E_J\) for \(J \subset I\) [28]. Consider two sets of intervals (identifying \(S^1\) with the real line via a stereographic projection):

$$\begin{aligned} A_n = (a, -1/n), \quad B_n = (1/n, b), \end{aligned}$$
(58)

wherein n is a natural number and \(a<0,b>0\). We consider the von Neumann algebra inclusion \({\mathcal{A}}(A_n) \vee {\mathcal{A}}(B_n) \subset {\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n)\), and we let \(E_n\) be the conditional expectation \({\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n) \rightarrow {\mathcal{A}}(A_n) \vee {\mathcal{A}}(B_n)\) such that

$$\begin{aligned} E_n(a_n b_n) = E_{A_n}(a_n) E_{B_n}(b_n) \quad \forall a_n \in {\mathcal{F}}(A_n), b_n \in {\mathcal{F}}(B_n). \end{aligned}$$
(59)

Thus, \(E_n\) only projects out degrees of freedom of the individual parts of the system in (58) separately (see Footnote 3). In the limit as \(n \rightarrow \infty \) (denoted as \(\lim _n\) in the following), these systems touch each other. We can show the following theorem.

Theorem 1

We have \(\lim _n \Phi _s(\omega _\Omega | \omega _\Omega \circ E_n) = {\text {ln}}[ {\mathcal{F}}: {\mathcal{A}}]\) for \(s \in [1/2,1)\).

Corollary 7

We have

$$\begin{aligned} \lim _n F(\omega _\Omega |\omega _\Omega \circ E_n) = [{\mathcal{F}}: {\mathcal{A}}]^{-1/2}. \end{aligned}$$
(60)

Proof

Obvious in view of Corollary 6, 4) and Theorem 1. \(\square \)
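
The argument can be unpacked as follows. This is a sketch that only uses items (1) and (2) of Corollary 6 together with Theorem 1 for \(s \in (1/2,1)\); the shorthands \(F_n = F(\omega _\Omega | \omega _\Omega \circ E_n)\) and \(\lambda = [{\mathcal{F}}: {\mathcal{A}}]\) are used for brevity. For each such s,

$$\begin{aligned}&-{\text {ln}}F_n^2 \le \Phi _s(\omega _\Omega | \omega _\Omega \circ E_n) \le -\frac{s}{1-s} {\text {ln}}F_n^2, \\&\Longrightarrow \quad -{\text {ln}}\lambda \le \liminf _n {\text {ln}}F_n^2 \le \limsup _n {\text {ln}}F_n^2 \le -\frac{1-s}{s} {\text {ln}}\lambda . \end{aligned}$$

Letting \(s \rightarrow (1/2)^+\), the upper bound tends to \(-{\text {ln}}\lambda \), so \(\lim _n F_n^2 = \lambda ^{-1}\), which is (60).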

Proof of Theorem 1

The proof strategy is similar to that of a result by Longo and Xu [26], who considered the relative entropy S instead of the divergence \(\Phi _s\). As in their proof, we make use of a variational definition, in our case that of the divergence \(\Phi _s\).

First assume that \(1/2<s<1\). We use the notation \(d^2 = \lambda \equiv [{\mathcal{F}}(I):{\mathcal{A}}(I)] < \infty \) which is independent of I [28]. Let \(|\psi _n\rangle \) be a vector such that \(\omega _{\psi _n} = \omega _\Omega \circ E_n\), as a functional on \({\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n)\).

Lemma 2

There exists a sequence \(\{ f_n \} \subset {\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n)\) such that \(f_n \rightarrow 1\) strongly and

$$\begin{aligned} \lim _n \omega _\Omega (f_n) = 1, \quad \lim _n \omega _\Omega (f_n^* f_n) = 1, \quad \lim _n \omega _{\psi _n}(f_n^*f_n) = \lambda ^{-1}. \end{aligned}$$
(61)

Proof

The proof is given in [26], prop. 4.5. However, we rephrase it somewhat in preparation for the discussion in the next section. A finite index inclusion \({\mathcal{N}}\subset {\mathcal{M}}\) of properly infinite von Neumann factors is characterized uniquely by its associated Q-system [7, 29] \((x,w,\theta )\), wherein \(x,w \in {\mathcal{N}}\) obey certain relations relative to the canonical endomorphism \(\theta \) of \({\mathcal{N}}\), see appendix A.

Applying this structure to the inclusions \({\mathcal{A}}(A_n) \subset {\mathcal{F}}(A_n)\) we get \(v_{A_n} \in {\mathcal{F}}(A_n)\) and similarly for \(B_n\). These are fixed uniquely by demanding that the corresponding conditional expectations, see appendix A, be given by the \(|\Omega \rangle \) preserving conditional expectation \(E_{A_n}\) etc. By translation-dilation covariance, this implies for example that \(v_{A_n} \rightarrow v_A\) strongly as \(n \rightarrow \infty \). Another standard result in this setting, shown, e.g., in [26], lemma 2.9, is that \(v_{A_n}\) can be “transported” to \(v_{B_n}\) in the sense that there is a unitary \(u_{B_nA_n} \in {\mathcal{A}}(a,b) \cap {\mathrm {Hom}}(\theta _{B_n}, \theta _{A_n})\), such that \(v_{B_n} = u_{B_nA_n}v_{A_n}\). By additivity, we may find a sequence of unitaries \(a_{n,k} \in {\mathcal{A}}(A_n), b_{n,k} \in {\mathcal{A}}(B_n)\) such that \(\sum _{k=1}^{N(n)} b_{n,k}^* a_{n,k}^{} - u_{B_nA_n} \rightarrow 0\) as \(n\rightarrow \infty \), in the strong sense. Then, let

$$\begin{aligned} V_{A_n,k} = \frac{1}{\sqrt{d}} a_{n,k} v_{A_n} \in {\mathcal{F}}(A_n),\quad V_{B_n,k}^* = \frac{1}{\sqrt{d}} v_{B_n}^* b_{n,k}^* \in {\mathcal{F}}(B_n). \end{aligned}$$
(62)

Finally, let

$$\begin{aligned} f_n = \sum _{k=1}^{N(n)} V_{B_n,k}^* V_{A_n,k}^{}. \end{aligned}$$
(63)

Then, it follows that \(f_n \rightarrow d^{-1} v_{B}^* v_{B}^{}=1\) strongly by construction and the relations of Q-systems, see appendix A. This already implies the first two of the claimed limits in (61). On the other hand,

$$\begin{aligned} \begin{aligned} \omega _\Omega \circ E_n(f_n^* f_n)&= \sum _{k,l} \omega _\Omega \circ E_n(V_{A_n,k}^* V_{B_n,k}V_{B_n,l}^* V_{A_n,l}) \\&= \sum _{k,l} \omega _\Omega \circ E_n(V_{A_n,k}^* V_{A_n,l} V_{B_n,k}V_{B_n,l}^*) \\&= \sum _{k,l} \omega _\Omega (E_{A_n}(V_{A_n,k}^* V_{A_n,l})E_{B_n}(V_{B_n,k}V_{B_n,l}^*)) \\&= \sum _{k,l} \omega _\Omega (V_{A_n,k}^* V_{A_n,l}E_{B_n}(V_{B_n,k}V_{B_n,l}^*)) \\&= d^{-3} \sum _{k,l} \omega _\Omega (v_{A_n}^*a_{n,k}^* a_{n,l}^{} v_{A_n} b_{n,k}^{} b_{n,l}^*) \\&= d^{-3} \sum _{k,l} \omega _\Omega (v_{A_n}^*a_{n,k}^* b_{n,k}^{} a_{n,l}^{} b_{n,l}^* v_{A_n}^{}) \rightarrow d^{-2} \end{aligned} \end{aligned}$$
(64)

using commutativity in the first line, the definition of \(E_n\) in the second line, \(E_I |_{{\mathcal{F}}(J)} = E_J\) for \(J \subset I\) and \(\omega _\Omega \circ E_I = \omega _\Omega \) in the third line, the bimodule property of \(E_{B_n}\) as well as \(E_{B_n}(v_{B_n}v_{B_n}^*) = d^{-1}\) by properties of the Q-system in the fourth line, commutativity again in the fifth line, and \(\sum a_{n,k}^* b_{n,k} a_{n,l} b_{n,l}^* \rightarrow 1\) strongly as \(n \rightarrow \infty \) and \(v_{A_n}^* v_{A_n}^{}=d \cdot 1\) in the last line (using properties of the Q-system). \(\square \)
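As a sketch of why \(f_n \rightarrow 1\) (using only the relations quoted above, and treating the passage to the limit formally), one may insert (62) into (63):

$$\begin{aligned} f_n = \frac{1}{d}\, v_{B_n}^* \Big( \sum _{k=1}^{N(n)} b_{n,k}^* a_{n,k}^{} \Big) v_{A_n}^{} \approx \frac{1}{d}\, v_{B_n}^* u_{B_nA_n}^{} v_{A_n}^{} = \frac{1}{d}\, v_{B_n}^* v_{B_n}^{} = 1. \end{aligned}$$

Here "\(\approx \)" is shorthand, introduced only for this sketch, for equality up to a term that vanishes strongly as \(n \rightarrow \infty \). The second equality is the transport relation \(v_{B_n} = u_{B_nA_n}v_{A_n}\), and the last one uses the Q-system normalization \(v_{B_n}^* v_{B_n}^{} = d \cdot 1\).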

Next, we define

$$\begin{aligned} x_n(t) = {\left\{ \begin{array}{ll} 1-\frac{t}{t+\lambda ^{-1}} f_n &{} \text {if }1/k \le t \le k\\ 0 &{} \text {if }t>k\\ 1 &{} \text {if }t<1/k. \end{array}\right. } \end{aligned}$$
(65)

Using the properties (61) of \(f_n\), we have for \(t \in (1/k, k)\):

$$\begin{aligned} \begin{aligned}&\lim _n \omega _\Omega (x_n(t)^* x_n(t)) = \frac{\lambda ^{-2}}{(t+\lambda ^{-1})^2}\\&\limsup _n F(y_n(t) \omega _\Omega y_n(t)^* | \omega _{\psi _n})^2 \le \limsup _n \Vert y_n(t)^* \psi _n \Vert ^2 = \frac{\lambda ^{-1}t^2}{(t+\lambda ^{-1})^2}, \end{aligned} \end{aligned}$$
(66)

using in the second line the Cauchy–Schwarz inequality in order to estimate the fidelity characterized through (26). Therefore, for fixed k, we have

$$\begin{aligned}&\limsup _n \int _{1/k}^k \left[ \omega _\Omega (x_n(t)^* x_n(t)) + t^{-1} F(y_n(t) \omega _\Omega y_n(t)^* | \omega _{\psi _n})^2 \right] t^{-(2s-1)/s} \mathrm{d}t \nonumber \\&\quad \le \int _{1/k}^k \left[ \frac{\lambda ^{-2}}{(t+\lambda ^{-1})^2} + \frac{\lambda ^{-1}t}{(t+\lambda ^{-1})^2} \right] t^{-(2s-1)/s} \mathrm{d}t \nonumber \\&\quad = \ \ c_{2s}^{-1} \lambda ^{(s-1)/s} - \frac{s}{1-s} k^{(s-1)/s} - \frac{s}{2s-1} k^{-(2s-1)/s} \nonumber \\&\qquad + s\sum _{m=1}^\infty (-1)^m \left\{ \left( \frac{\lambda }{k} \right) ^m \frac{1}{ms+(1-s)} + \left( \frac{1}{\lambda k} \right) ^{m+1} \frac{1}{ms+(2s-1)} \right\} ,\qquad \end{aligned}$$
(67)

using the integral (30) and the definition of \(c_p\) from prop. 1 in the last step. The last sum is of order \(O(k^{-1})\) uniformly in \(s \in [1/2,1]\). On the other hand, using the definition of \(x_n(t)\) in the range \(t<1/k\), we have

$$\begin{aligned} \begin{aligned}&\limsup _n \int _0^{1/k} \left[ \omega _\Omega (x_n(t)^* x_n(t)) + t^{-1} F(y_n(t) \omega _\Omega y_n(t)^* | \omega _{\psi _n})^2 \right] t^{-(2s-1)/s} \mathrm{d}t\\&\quad = \int _0^{1/k} t^{-(2s-1)/s} \mathrm{d}t = \frac{s}{1-s} k^{(s-1)/s}, \end{aligned} \end{aligned}$$
(68)

while using the definition of \(x_n(t)\) in the range \(t>k\), we have

$$\begin{aligned} \begin{aligned}&\limsup _n \int _k^\infty \left[ \omega _\Omega (x_n(t)^* x_n(t)) + t^{-1} F(y_n(t) \omega _\Omega y_n(t)^* | \omega _{\psi _n})^2 \right] t^{-(2s-1)/s} \mathrm{d}t\\&\quad = \int _k^\infty t^{-(2s-1)/s-1} \mathrm{d}t = \frac{s}{2s-1} k^{-(2s-1)/s}. \end{aligned} \end{aligned}$$
(69)
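The three pieces (67), (68) and (69) can be assembled as follows. This is a consistency sketch rather than part of the original argument, and the shorthand \(\alpha = (2s-1)/s \in (0,1)\) is introduced here only for illustration. The integrand on the right side of (67) simplifies, and over the full half-line one may use the standard formula \(\int _0^\infty t^{-\alpha }(t+a)^{-1} \mathrm{d}t = \pi a^{-\alpha }/\sin (\pi \alpha )\), valid for \(0<\alpha <1\) and \(a>0\):

$$\begin{aligned} \begin{aligned} \frac{\lambda ^{-2}}{(t+\lambda ^{-1})^2} + \frac{\lambda ^{-1}t}{(t+\lambda ^{-1})^2}&= \frac{\lambda ^{-1}}{t+\lambda ^{-1}}, \\ \int _0^\infty \frac{\lambda ^{-1}}{t+\lambda ^{-1}} \, t^{-\alpha } \mathrm{d}t&= \frac{\pi }{\sin (\pi \alpha )} \lambda ^{\alpha -1} = \frac{\pi }{\sin (\pi \alpha )} \lambda ^{(s-1)/s}. \end{aligned} \end{aligned}$$

This agrees with the leading term of (67) provided \(c_{2s}^{-1} = \pi /\sin (\pi (2s-1)/s)\), an identification that should be checked against the definition of \(c_p\) in prop. 1. Moreover, when (67), (68) and (69) are added, the terms \(\frac{s}{1-s} k^{(s-1)/s}\) and \(\frac{s}{2s-1} k^{-(2s-1)/s}\) cancel, so that only \(c_{2s}^{-1} \lambda ^{(s-1)/s}\) and the \(O(k^{-1})\) sum remain.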

Consequently, when \(s=p/2\), the variational expression (41) is estimated by (see footnote 4)

$$\begin{aligned} \begin{aligned}&\limsup _n c_{2s} \int _0^{\infty } \left[ \omega _\Omega (x_n(t)^* x_n(t)) + t^{-1} F(y_n(t) \omega _\Omega y_n(t)^* | \omega _{\psi _n})^2 \right] t^{-2/(2s)'} \mathrm{d}t \\&\quad \le \ \lambda ^{(s-1)/s} + O(k^{-1}) \end{aligned} \end{aligned}$$
(70)

for any \(k>0\), where \(O(k^{-1})\) denotes a term bounded in absolute value by \(Ck^{-1}\) uniformly in \(s \in [1/2,1]\). Letting \(k \rightarrow \infty \), this term disappears, and then using the definition of \(\Phi _s\) and of \(|\psi _n\rangle \) gives

$$\begin{aligned} \liminf _n \Phi _s(\omega _\Omega | \omega _\Omega \circ E_n) \ge {\text {ln}}\lambda . \end{aligned}$$
(71)

On the other hand, we have already seen in (47) that \(\Phi _s(\omega _\Omega | \omega _\Omega \circ E_n) \le {\text {ln}}\lambda \). The proof of the theorem is therefore complete for the case \(1/2<s<1\).

Now we turn to the limiting case \(s \rightarrow (1/2)^+\) and revisit the proof above. By inspection, in order to obtain an expression in (70) not exceeding \(\lambda ^{(s-1)/s} + O(k^{-1}) + \varepsilon \) for a given \(\varepsilon >0\), we need \(n \ge n_0(k,\varepsilon )\), where \(n_0\) does not depend on \(s \in [1/2,1]\). Furthermore, we have argued in the proof that \(O(k^{-1})\) is uniform in \(s \in [1/2,1]\). Thus, the limit \(s \rightarrow (1/2)^+\) may be taken and we learn that \(F(\omega _\Omega | \omega _{\psi _n})^2 \le \lambda ^{-1} + O(k^{-1}) + \varepsilon \) when \(n \ge n_0(k,\varepsilon )\). Since \(k\) and \(\varepsilon \) are arbitrary, \(\limsup _n F(\omega _\Omega | \omega _{\psi _n})^2 \le \lambda ^{-1}\), and the rest is as before.

Remarks

Cor. 1 for \(s=1/2\) gives a dual formulation of this result when applied to \({\mathcal{M}}_n={\mathcal{F}}(A_n) \vee {\mathcal{F}}(B_n), {\mathcal{N}}_n = {\mathcal{A}}(A_n) \vee {\mathcal{A}}(B_n)\) and \(E_n': ({\mathcal{A}}(A_n) \vee {\mathcal{A}}(B_n))' \rightarrow (\mathcal {F}(A_n) \vee {\mathcal{F}}(B_n))'\), which is the dual conditional expectation. Indeed, if we combine cor.s 1, 7 and 6, 4), we immediately find that

$$\begin{aligned} \lim _n F(\omega _\Omega ' | \omega _\Omega ' \circ E_n') = 1. \end{aligned}$$
(72)

5 Conclusions

We end this paper by commenting on the physical significance of the result in sec. 4. For this, it is instructive to have in mind the example of a Haag–Kastler QFT [16], \({\mathcal{F}}\), containing charged fields. These map the vacuum \(|\Omega \rangle \) to states with net (flavor) charge. The subset of charge neutral operators is \({\mathcal{A}}\). On the full Hilbert space \({\mathcal{H}}\) (including charged states), the gauge group G acts by global unitaries which transform the charged fields and leave the vacuum invariant. The conditional expectation \(E_I: {\mathcal{F}}(I) \rightarrow {\mathcal{A}}(I)\) is the Haar-average over G and projects onto the charge neutral operators (“observables”) in a given region I, which is left invariant because gauge transformations commute with translations by the Coleman–Mandula theorem (see e.g., [16]). Assuming that G is a finite group with |G| elements, the index is \(|G|=[{\mathcal{F}}:{\mathcal{A}}]\).

Given two spacelike related regions \(A_n\) and \(B_n\) separated by a finite corridor of size \(\sim 1/n\), the conditional expectation \(E_n\) defined by (59) is basically the tensor product \(E_{A_n} \otimes E_{B_n}\). \(\Phi _s(\omega _\Omega | \omega _\Omega \circ E_n)\) in a sense accounts for the correlations between \(A_n\) and \(B_n\) that are visible using charge operators only in both subsystems. This interpretation becomes more and more precise when the regions move together. The above intuitive argument has been substantiated (in a somewhat heuristic way) in the very lucid paper by [11], in the case of the relative entropy S, so that we should use Kosaki’s variational formula for S (8) instead of the variational definition of \(\Phi _s\) (43). They first argue, using known properties of S in connection with conditional expectations, that the mutual information between \(A_n\) and \(B_n\) in the vacuum state satisfies

$$\begin{aligned} I_{\mathcal{F}}(A_n | B_n) - I_{\mathcal{A}}(A_n | B_n) = S(\omega _\Omega | \omega _\Omega \circ E_n). \end{aligned}$$
(73)

When \(n \rightarrow \infty \), it is plausible that the mutual information on the left side is dominated by correlations between charge carrying operators localized very near the edges where \(A_n\) and \(B_n\) approach each other. Furthermore, although each term in \(I_{\mathcal{F}}(A_n | B_n) - I_{\mathcal{A}}(A_n | B_n)\) is expected to diverge, the difference ought to be a finite number related to the order of G. In fact, by investigating more closely the right side of the equation, they argue that \(S(\omega _\Omega | \omega _\Omega \circ E_n)\) converges to \({\text {ln}}|G|\) when \(n \rightarrow \infty \).

Actually, the core of the argument by [11] has a similar flavor to that given in the proof of Theorem 1, in the following sense. In our proof, a key step is the construction of the “vertex operators” which have in a sense maximal correlation across the separating corridor between \(A_n\) and \(B_n\), as stated in lem. 2. To simplify, let us take half lines \(A_n, B_n\) separated by a corridor of width 2/n symmetrically around the origin. Proceeding somewhat informally to simplify the discussion, we consider instead the isometric vertex operators \(V_n = u_{C_n B_n} v_{B_n}/{\sqrt{d}}\) where \(C_n=(1/n,2/n)\) and \(u_{C_n B_n}\) is a unitary charge transporter from \(B_n\) to \(C_n\). Then, \(V_n\) is localized in (1/n, 2/n), and it creates an incoherent superposition of all irreducible charges in this interval by the Q-system construction, see app. A. Letting \(J=J_{{\mathcal{F}}}\) be the modular conjugation associated with the half-line \((0,\infty )\), we can say that \({\bar{V}}_n=J V_n J\) creates the opposite charges in the opposite interval \((-2/n,-1/n)\) because J is basically the PCT operator exchanging \(A_n\) with \(B_n\), and particle with anti-particle (Bisognano–Wichmann theorem [16]).

Thus, the correlation function which we want to maximize, in analogy with Lemma 2, is

$$\begin{aligned} 1 \ge \langle \Omega | \bar{V}_n V_n^{} \Omega \rangle = \langle \Omega | V_{n}^{} \Delta ^{1/2} V_{n}^* \Omega \rangle , \end{aligned}$$
(74)

where the inequality is simply the Cauchy–Schwarz inequality. The modular flow \(\Delta ^{it}\) corresponds to dilations by \(e^t\) (Bisognano–Wichmann theorem), and \(V_n |\Omega \rangle \) should be approximately dilation invariant moving ever closer to the edge of \(B_n\) when \(n \rightarrow \infty \). Thus, the limit of \(\langle \Omega | {\bar{V}}_n V_n^{} \Omega \rangle \) should indeed be 1. Arguing just as in Lemma 2, one can likewise see at least formally that \(\langle \psi _n | {\bar{V}}_n V_n^{} \psi _n \rangle \) should tend to \(\lambda ^{-1}\).
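Both statements in (74) can be traced back to standard Tomita–Takesaki relations. The following is only a sketch, assuming that \(V_n\) belongs to the von Neumann algebra of the half-line \((0,\infty )\) with modular data \((J, \Delta )\), so that \(J\Delta ^{1/2} a |\Omega \rangle = a^* |\Omega \rangle \) for a in that algebra, \(J|\Omega \rangle = |\Omega \rangle \), and \({\bar{V}}_n = JV_nJ\) lies in the commutant:

$$\begin{aligned} \begin{aligned} \langle \Omega | {\bar{V}}_n V_n^{} \Omega \rangle&= \langle \Omega | V_n^{} J V_n^{} J \Omega \rangle = \langle V_n^* \Omega | J V_n^{} \Omega \rangle = \langle V_n^* \Omega | \Delta ^{1/2} V_n^* \Omega \rangle , \\ |\langle \Omega | {\bar{V}}_n V_n^{} \Omega \rangle |&\le \Vert {\bar{V}}_n^* \Omega \Vert \, \Vert V_n^{} \Omega \Vert = \Vert V_n^* \Omega \Vert \, \Vert V_n^{} \Omega \Vert \le 1. \end{aligned} \end{aligned}$$

The first line uses that \({\bar{V}}_n\) commutes with \(V_n\) together with \(J V_n |\Omega \rangle = \Delta ^{1/2} V_n^* |\Omega \rangle \), which follows by applying \(J\Delta ^{1/2}\) to \(V_n^*|\Omega \rangle \). The second line uses that J is anti-unitary and that \(V_n\) is an isometry, \(V_n^* V_n^{} = 1\).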

Thus, in this sense, the quantity \(S(\omega _\Omega | \omega _\Omega \circ E_n)\) is dominated in the limit \(n \rightarrow \infty \) by particle anti-particle pair correlations very close to the edges across the corridor in accordance with the intuitive picture proposed by [11].