Approximate Recovery and Relative Entropy I: General von Neumann Subalgebras

We prove the existence of a universal recovery channel that approximately recovers states on a von Neumann subalgebra when the change in relative entropy, with respect to a fixed reference state, is small. Our result is a generalization of previous results that applied to type-I von Neumann algebras by Junge et al. [arXiv:1509.07127]. We broadly follow their proof strategy but consider here arbitrary von Neumann algebras, where qualitatively new issues arise. Our results hinge on the construction of certain analytic vectors and computations/estimations of their Araki–Masuda $L_p$ norms. We comment on applications to the quantum null energy condition.


Introduction
Quantum error correction is an important tool in quantum computation but has physical manifestations well beyond this domain. For example, it has become influential in the study of topological aspects of many-body quantum physics [27,43,61], renormalization group approaches to interacting theories [42,57], random quantum systems [17], and even basic aspects of quantum gravity in the AdS/CFT correspondence [1,21,25].
A general question that can be abstracted from these concrete settings is to what extent a quantum channel (completely positive trace preserving linear operation) T on density matrices ρ can be inverted, i.e. to what extent ρ can be recovered from T(ρ). One way to address this question, which emerged from the work of [29] and has subsequently been generalized e.g. in [11,19,29,38,39,41,55,56,62], is via entropy. For example, it is standard that the relative entropy between two states is non-increasing under a channel, S(ρ|σ) − S(T(ρ)|T(σ)) ≥ 0. (1)
The difference expresses to what extent T makes two states ρ, σ less distinguishable. Thinking of σ as a reference state, the general idea is that if the above difference (or a related information theoretic quantity) is small, then ρ can be recovered from T(ρ) well, and ideally, one should have an explicit formula for the recovery map. Rather attractive results, expressing a certain strengthened version of the "data processing inequality" (DPI) (1), have been given in [41,56]. These inequalities, while not equivalent, share the same key qualitative features: smallness of the difference in (1) implies that ρ can be recovered with "high precision" from T(ρ) via an explicit recovery channel constructed from the reference state σ. The difference between the results is the precise information theoretic measure of "high precision".
Quantum computers typically manipulate finite dimensional Hilbert spaces, and accordingly the above mentioned results such as [41] or [56] have focussed on matrix algebras (or their natural generalization, so-called type I algebras). However, many applications of error correction to quantum field theory and gravity go beyond this simple setting and a general treatment requires more sophisticated tools, including tools from the theory of operator algebras. While one might hope to approximate any of these physical systems by finite quantum systems, this point of view can obscure crucial physical features that are more naturally expressed in a less restrictive approach. We will give an example of this in the context of quantum field theory, where operator algebraic approaches have a long tradition, see e.g. [31]. At the same time, the operator algebra approach is so general that expressing proofs of fundamental quantum information results in this language exposes the core nature of such proofs and ends up simplifying the approach in many situations.
Indeed, many of the original theorems in quantum information have their origin in the study of operator algebras.
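As a finite-dimensional illustration of the data processing inequality (1), the following sketch checks numerically that the relative entropy does not increase under a partial trace. It assumes matrix algebras only; the helper names (`random_density_matrix`, `partial_trace_c`) and the choice of random full-rank states are ours and serve purely as an example.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalised complex Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def herm_log(a):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """S(rho|sigma) = tr(rho log rho - rho log sigma)."""
    return np.trace(rho @ (herm_log(rho) - herm_log(sigma))).real

def partial_trace_c(x, d_b, d_c):
    """Channel T(x) = tr_C x for x acting on C^{d_b} (x) C^{d_c}."""
    return np.einsum('ikjk->ij', x.reshape(d_b, d_c, d_b, d_c))

rng = np.random.default_rng(0)
d_b, d_c = 2, 3
rho = random_density_matrix(d_b * d_c, rng)
sigma = random_density_matrix(d_b * d_c, rng)

# Left-hand side of (1) for the channel T = tr_C.
gap = relative_entropy(rho, sigma) - relative_entropy(
    partial_trace_c(rho, d_b, d_c), partial_trace_c(sigma, d_b, d_c))
print("S(rho|sigma) - S(T rho|T sigma) =", gap)
```

The printed difference is the quantity whose smallness controls recoverability in the results discussed above.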
In this paper, we generalize the results of [41], pertaining to the approximate reversibility of quantum channels, from a type-I von Neumann algebra setting to general von Neumann algebras (Theorem 2). At the heart of these results is a strengthened version of the monotonicity [59] of relative entropy (Theorem 1). In the present paper (part I), we treat the sub-algebra case, which involves a simple quantum channel called an inclusion. Many of the technical innovations presented in this paper, such as our regularization procedure described in Sect. 3.5.2 or the interpolation vector described in Sect. 3.4, can be generalized to the case of a general quantum channel and also play a major role there. However, the latter case also presents some novel problems, and is therefore treated in a separate follow-up paper (part II) in which we also take the opportunity to present alternative proofs of some of our technical lemmas. Our technical innovations are also relevant when addressing the generalization of [56] to general von Neumann algebras. However, the extra effort in that case is even more considerable and we therefore prefer to separate it from this series to retain a certain amount of readability.
Along the way, we prove in the present paper also two theorems that might be of independent interest. The first (Theorem 3) concerns the computations of the derivatives of the "sandwiched" and "Petz" relative Renyi entropies for two nearby states. We call this result a first law because of its similarity to the first law of black hole thermodynamics in the setting of AdS/CFT [10,24]. The second (Theorem 4) pertains to a regularization procedure for relative entropy that produces states with finite relative entropy and also allows for continuous extrapolation of relative entropy when removing the regulator. The vectors that result from this procedure are important here because they lead to extended domains of holomorphy that allow us to proceed towards the proof of strengthened monotonicity with a similar argument as in the finite dimensional setting.
We will also discuss an application to the study of the quantum information aspects of quantum field theory that requires this general von Neumann algebra setting. In the field theory context, new results using operator algebra methods have made it possible to make rigorous statements about the dynamics of interacting theories. For example, we propose that the quantum null energy condition, a bound on the local energy density (that has already been proven with other methods [9,14]), is tightly linked to the strengthened monotonicity result that we derive in this paper.
Notations and conventions: Calligraphic letters A, M, . . . denote von Neumann algebras. Calligraphic letters H, K, . . . denote more general linear spaces or subsets thereof. $S_a = \{z \in \mathbb{C} \mid 0 < \mathrm{Re}(z) < a\}$ denotes an open strip, and we often write $S = S_1$. We typically use the physicist's "ket"-notation $|\psi\rangle$ for vectors in a Hilbert space. The scalar product is written $(|\psi\rangle, |\psi'\rangle)_H = \langle\psi|\psi'\rangle$ (2) and is anti-linear in the first entry. The norm of a vector is sometimes written simply as $\| |\psi\rangle \| =: \|\psi\|$. The action of a linear operator T on a ket is sometimes written as $T|\phi\rangle = |T\phi\rangle$. In this spirit, the norm of a bounded linear operator T on H is written as $\|T\| = \sup_{|\psi\rangle : \|\psi\| = 1} \|T\psi\|$.

Tomita-Takesaki theory.
Here we outline some elements of von Neumann algebra theory relevant for this work; for more details, see [15,16,54,58]. The motivating example for the definition, as well as for many constructions, is the algebra $A = M_n(\mathbb{C})$ of matrices of size n. The irreducible representation of this algebra is on $\mathbb{C}^n$, but one may also consider the "standard" representation of A on $H = \mathbb{C}^n \otimes \mathbb{C}^n \cong M_n(\mathbb{C})$ by left multiplication. H is a Hilbert space with the Hilbert-Schmidt inner product, $\langle\zeta_1|\zeta_2\rangle = \mathrm{tr}(\zeta_1^* \zeta_2)$. The standard representation is highly reducible, as expressed by the fact that the commutant of A, denoted A', is isomorphic to A itself (acting by right multiplication). One advantage of the standard representation is that mixed states, i.e. density matrices ω, can be viewed as vectors in H: $\mathrm{tr}(\omega a) = \langle\omega^{1/2}|a\,\omega^{1/2}\rangle$, a ∈ A. Another advantage is that besides the operators a from A, we have other useful linear operators on H. One such object is the "relative modular operator," defined for two density matrices σ, ρ by $\Delta_{\rho,\sigma} = \rho \otimes \sigma^{-1}$ (defined at first for invertible σ). Using this operator, we can for instance consider $\langle\rho^{1/2}|\log \Delta_{\rho,\sigma}\,\rho^{1/2}\rangle = \mathrm{tr}(\rho \log \rho - \rho \log \sigma) = S(\rho|\sigma)$, which defines the so-called relative entropy. It turns out that, while the individual terms on the right side do not exist for a general von Neumann algebra, the relative modular operator can still be defined, as can the relative entropy.
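The matrix-algebra formula for the relative entropy in terms of the relative modular operator can be checked numerically. The sketch below assumes the identification of H with n × n matrices in row-major vectorised form, so that left multiplication by a and right multiplication by b become the Kronecker products a ⊗ 1 and 1 ⊗ bᵀ; this vectorisation convention and all function names are our own choices for the example.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalised complex Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its spectrum."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def relative_entropy_trace(rho, sigma):
    """S(rho|sigma) = tr(rho log rho - rho log sigma)."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

def relative_entropy_modular(rho, sigma):
    """<rho^{1/2}| log Delta_{rho,sigma} rho^{1/2}> in the standard representation.

    log Delta acts as zeta -> (log rho) zeta - zeta (log sigma); with row-major
    vectorisation this is log(rho) (x) 1 - 1 (x) log(sigma)^T.
    """
    n = rho.shape[0]
    eye = np.eye(n)
    log_delta = (np.kron(herm_fun(rho, np.log), eye)
                 - np.kron(eye, herm_fun(sigma, np.log).T))
    xi = herm_fun(rho, np.sqrt).reshape(-1)   # the vector |rho^{1/2}>
    return np.vdot(xi, log_delta @ xi).real   # Hilbert-Schmidt inner product

rng = np.random.default_rng(1)
rho = random_density_matrix(3, rng)
sigma = random_density_matrix(3, rng)
s_trace = relative_entropy_trace(rho, sigma)
s_modular = relative_entropy_modular(rho, sigma)
print(s_trace, s_modular)
```

Both expressions agree up to rounding, which is the finite-dimensional content of the identity quoted above.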
In this work, we shall employ the relative modular operator and its properties to investigate the relative entropy for general von Neumann algebras. A von Neumann algebra, A, is by definition a subspace of the set of all bounded operators B(H) containing the unit operator 1 that is closed under products, the star operation (denoted $a^*$), and limits in the ultra-weak operator topology. States on A are linear functionals that are positive, $\rho(a^* a) \ge 0$, normalized, ρ(1) = 1, and "normal", i.e., continuous in the ultra-weak operator topology. In the above matrix algebra example, states are in one-to-one correspondence with density matrices: ρ(a) = tr(ρa), where we do not distinguish notationally between the functional and the density matrix. The set of normal states is contained in the "predual" $A_*$ of A, i.e. the set of all ultra-weakly continuous linear functionals on A. One defines the support projection $\pi_A$ associated to a state ρ as the smallest projection $\pi = \pi_A(\rho)$ in A that satisfies ρ(π) = 1. Faithful states by definition have unit support projection.
We will work with the von Neumann algebra in a so called standard form, (A, H , J, P ), where A acts on the Hilbert space H and where there is an anti-linear, unitary involution J and a self-dual "natural" cone P left pointwise invariant by J. The properties of such a standard form are extracted from the above matrix algebra example, but its existence and detailed properties in the general case are quite non-trivial theorems (proven in [32]); here we only mention: One has JAJ = A where A ⊂ B(H ), the "commutant", is the von Neumann algebra of all bounded operators on H that commute with A. The natural cone defines a set of vectors in the Hilbert space that canonically represent states on A via and where we use the notation ω ψ (·) ≡ ψ| · |ψ ∈ A for the linear functional on A induced by a vector ψ ∈ H . The vector in the natural cone representing ω ψ will also be denoted by |ξ ψ . It is known that it is related to |ψ by a partial isometry v ψ ∈ A , Furthermore, it is known that 2 proximity of the state functionals implies that of the vector representatives in the natural cone and vice versa, in the sense that holds. We now introduce the modular operators that are central to our discussion of relative entropy [2][3][4] and non-commutative L p -spaces [5]. This is most straightforward if we have cyclic and separating vector |η for the algebra A, meaning that {a|η : a ∈ A} is dense in H and that a|η = 0 implies that a = 0. Then Tomita-Takesaki theory establishes that one can define an anti-linear, unitary operator J and a positive, self-adjoint operator Δ η by the relations Δ η is in general unbounded. J can be used in this case to define a standard form, with P given by the closure of {aJaJ|η : a ∈ A}, but we emphasize that a standard form exists generally even without a faithful state |η . From now on, we regard such a standard form, hence J, as fixed. We will continue to take |η ∈ P . 
We will also need the concept of relative modular operator Δ φ,ψ [2], which generalizes that given for matrix algebras above (see Sect.4.1 for further discussion.) In a slight generalization of the above definitions, let |φ , |ψ ∈ P . Then there is a non-negative, self-adjoint operator Δ φ,ψ characterized by The non-zero support of Δ φ,ψ is π A (φ)π A (ψ)H , and the functions Δ z φ,ψ are understood via the functional calculus on this support and are defined as 0 on 1 − π A (φ)π A (ψ). We can similarly define relative modular operators for vectors outside of the natural cone, for a detailed discussion of such matters see e.g., [5], app. C. For example, we may use the well known transformation property of the A is a partial isometry (with appropriate initial and final support), to define: Similarly we can define the relative modular operators for the commutant in direct analogy. We will often denote it by Δ φ,ψ . When |ψ = |φ we will denote these operators as Δ φ,φ ≡ Δ φ . This is the nonrelative modular operator already discussed from which we can define modular flow: where a ∈ A and we have taken φ to be cyclic and separating. The modular flow can also be extracted from the relative modular operators: for any |ψ ∈ H . The modular operators satisfy various relations that we need to draw on below and we simply quote these here (recall that |η ∈ P ): for t ∈ R, z ∈ C and a ∈ A and where these equations make sense when acting on vectors in appropriate domains-we are more specific about this when we get to use these equations. The Connes cocycle (Dψ : Dφ) t is the partial isometry from A defined by (t ∈ R) According to [2][3][4], if π A (φ) ≥ π A (ψ), the relative entropy may be defined as otherwise, it is by definition equal to +∞. The relative entropy only depends on the functionals ω ψ , ω φ but not on the particular choice of vectors that define them.

Inclusions of von Neumann algebras.
We adopt the convention that the corresponding support projection will be labelled in the following manner: and we have where for two self-adjoint elements a, b ∈ A we say that a ≤ b if b − a is a non-negative operator. Given $\rho, \sigma \in A_*$, we may define the relative entropy $S_A(\rho|\sigma) \equiv S(\rho|\sigma)$ as above, and we put By monotonicity of the relative entropy [59], Given a faithful state $\sigma \in A_*$, an isometry $V_\sigma : K \to H$ can be naturally defined as follows [47,50,51]: where we use the notation $|\xi^B_\sigma\rangle$ for the vector representative of the state $\sigma \circ \iota \in B_*$ in the natural cone of the algebra B and $|\xi^A_\sigma\rangle$ for the vector representative of the state $\sigma \in A_*$ in the natural cone of the algebra A. As reviewed in Appendix B, this embedding $V_\sigma$ commutes with the action of b, and satisfies V * . We now recall the concept of approximate sufficiency. First, recall that a linear mapping α : A → B is called a "channel" if it is completely positive, ultra-weakly continuous and α(1) = 1, see [47]. Definition 1. Following [47,49] we say that the inclusion B ⊂ A is ε-approximately sufficient for a set of states $S \subset A_*$, if there exists a fixed channel called "recovery channel", for which the recovered state is close to the original state in the sense that Here we take all ρ ∈ S to be normalized, ρ(1) = 1.
Note that if B ⊂ A is ε-sufficient for S, then B ⊂ A is ε-sufficient for the closed convex hull of states conv(S).
We would now like to construct an α that works as a recovery map for a set of states that are close in relative entropy under restriction to the sub-algebra. We take the relative entropy to compare to a fixed state $\sigma \in A_*$. That is, we consider the set The required recovery channel is related to the so-called Petz map, which is defined in the sub-algebra context (and faithful σ) as (see e.g., [47], sec. 8): It maps operators on H to operators on K, and furthermore As shown in [47], prop. 8.3, this map satisfies the defining properties of a recovery channel used in Definition 1; in fact, in the subalgebra context considered here it is equal to the generalized conditional expectation introduced even earlier by [8]. In the non-faithful case there is a slightly more complicated expression that we will discuss below in Lemma 1.
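For orientation, the Petz map has a familiar matrix-algebra counterpart. The sketch below assumes A = B ⊗ C with the inclusion b ↦ b ⊗ 1 and works in the Schrödinger picture on density matrices, using the standard finite-dimensional formula X_B ↦ σ_BC^{1/2} (σ_B^{-1/2} X_B σ_B^{-1/2} ⊗ 1_C) σ_BC^{1/2}. This is the dual, density-matrix form rather than the Heisenberg-picture map α : A → B used in the text, and the function names are hypothetical.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalised complex Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def herm_power(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def partial_trace_c(x, d_b, d_c):
    """tr_C x for x acting on C^{d_b} (x) C^{d_c}."""
    return np.einsum('ikjk->ij', x.reshape(d_b, d_c, d_b, d_c))

def petz_recover(x_b, sigma_bc, d_b, d_c):
    """Density-matrix form of the Petz map for the partial trace over C."""
    sigma_b = partial_trace_c(sigma_bc, d_b, d_c)
    s_half = herm_power(sigma_bc, 0.5)
    sb_inv_half = herm_power(sigma_b, -0.5)
    inner = np.kron(sb_inv_half @ x_b @ sb_inv_half, np.eye(d_c))
    return s_half @ inner @ s_half

rng = np.random.default_rng(2)
d_b, d_c = 2, 3
sigma_bc = random_density_matrix(d_b * d_c, rng)
rho_bc = random_density_matrix(d_b * d_c, rng)

# The reference state itself is always recovered exactly.
recovered_sigma = petz_recover(partial_trace_c(sigma_bc, d_b, d_c), sigma_bc, d_b, d_c)
# A different state is mapped to a valid density matrix, in general not rho_bc.
recovered_rho = petz_recover(partial_trace_c(rho_bc, d_b, d_c), sigma_bc, d_b, d_c)
print(np.allclose(recovered_sigma, sigma_bc), np.trace(recovered_rho).real)
```

The reference state σ is recovered exactly, while for a generic ρ the output is only a normalized positive matrix; how close it is to ρ is what the main theorems quantify.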

Main theorems.
Given two states $\rho, \sigma \in A_*$, the fidelity is defined as [60]: Some of its properties in our setting are discussed in Lemma 3 below. One of the two main theorems we would like to establish is: Theorem 1. (Faithful case) Monotonicity of relative entropy can be strengthened to where we assume that ρ, σ are normal, σ is faithful and where $\alpha^t_\sigma : A \to B$ is the rotated Petz map, defined as p(t) is the normalized probability density defined by Furthermore, the dual of ι acting on density matrices $\rho_A \equiv \rho_{BC}$ is the partial trace over system C: $\rho_{BC} \mapsto \rho_B = \mathrm{tr}_C\, \rho_{BC}$. Then, the statement of the theorem becomes in terms of density matrices: obtained previously by [41]. In the special case where one puts for a density matrix $\rho_{DBC}$ associated with A = D ⊗ B ⊗ C for another matrix algebra D, one finds a lower bound on the conditional mutual information [29]. For further explanations regarding our methods in the finite-dimensional case see Sect. 4.1.
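The link to the conditional mutual information can be made concrete for matrix algebras. The sketch below assumes the reference state σ = ρ_D ⊗ ρ_BC and the subalgebra D ⊗ B; this particular choice is our assumption for the example and need not coincide with the one intended in the text. With it, the relative entropy difference equals I(D:C|B) = S(DB) + S(BC) − S(DBC) − S(B).

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalised complex Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def herm_log(a):
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def entropy(rho):
    """von Neumann entropy -tr(rho log rho)."""
    w = np.linalg.eigvalsh(rho)
    return float(-np.sum(w * np.log(w)))

def relative_entropy(rho, sigma):
    return np.trace(rho @ (herm_log(rho) - herm_log(sigma))).real

rng = np.random.default_rng(5)
d_d, d_b, d_c = 2, 2, 2
rho_dbc = random_density_matrix(d_d * d_b * d_c, rng)

# Reduced states; tensor indices are (D, B, C, D', B', C') = 'abcdef'.
r = rho_dbc.reshape(d_d, d_b, d_c, d_d, d_b, d_c)
rho_db = np.einsum('abcdec->abde', r).reshape(d_d * d_b, d_d * d_b)
rho_bc = np.einsum('abcaef->bcef', r).reshape(d_b * d_c, d_b * d_c)
rho_b = np.einsum('abcaec->be', r)
rho_d = np.einsum('abcdbc->ad', r)

# Conditional mutual information I(D:C|B).
cmi = entropy(rho_db) + entropy(rho_bc) - entropy(rho_dbc) - entropy(rho_b)

# Relative entropy difference for sigma = rho_D (x) rho_BC (assumed choice),
# whose restriction to D (x) B is rho_D (x) rho_B.
gap = (relative_entropy(rho_dbc, np.kron(rho_d, rho_bc))
       - relative_entropy(rho_db, np.kron(rho_d, rho_b)))
print(cmi, gap)
```

The two printed numbers coincide, so any lower bound on the relative entropy difference becomes, under this assumed choice of σ, a lower bound on the conditional mutual information.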
We may extend this theorem to the case where σ is non-faithful. The basic idea is contained in the following lemma:

Lemma 1. Consider a sub-algebra ι(B) ⊂ A, of a general von Neumann algebra, and a normal state σ with support projectors π
Then the following statements hold: (i) The projected sub-algebras, are (σ-finite) von Neumann sub-algebras, acting respectively on $H_\pi = \pi_A(\sigma)\pi'_A(\sigma)H$ and $K_\pi = \pi_B(\sigma)\pi'_B(\sigma)K$. The projected inclusion is defined as: where we defined the (ultra-weakly continuous) *-isomorphism of von Neumann algebras The projected algebras are in a standard form. For example, the standard form of $A_\pi$ is $(A_\pi, H_\pi, J_A, \pi(\sigma)\pi'(\sigma)P)$, where $J_A$ maps the subspace $H_\pi$ to itself. (ii) The relative entropy satisfies for all states such that (iii) Consider a channel on the projected algebras: We can construct a new channel on the algebras of interest α : A → B via: Then for all $\rho \in A_*$ with $\pi_A(\rho) \le \pi_A(\sigma)$ we have: and Similarly: (iv) The explicit form of the resulting Petz map coming from the inclusion $\iota_\pi(B_\pi) \subset A_\pi$ is: where the embedding $V^{(\iota_\pi)}_\sigma$ is defined for the projected inclusion as and where $\xi^A_\sigma$ and $\xi^B_\sigma$ are now cyclic and separating for $A_\pi$ and $B_\pi$ respectively.
Proof. The proof of this lemma uses standard properties of support projectors and is left to the reader.
Note that the modular automorphism groups in (41) can be understood as being associated to the non-cyclic and separating vector ξ A σ (resp. ξ B σ ) for the original algebra A (resp. B), which are however defined to project to zero away from the H π (resp. K π ) subspace. So, for example ς σ can be understood in this way, as being defined on the subspaces K π and projecting to zero away from this subspace via: An obvious corollary is: (Theorem 1 in the non-faithful case) Theorem 1 continues to hold when σ is non-faithful but still π A (ρ) ≤ π A (σ). The recovery map is now given by (41).
From this result one can characterize approximate sufficiency using relative entropy: Theorem 2. Consider a set of normal states S on a general von Neumann algebra A with a subalgebra B. If S contains a state σ such that for all ρ ∈ S the following condition holds: then there exists a universal recovery channel α S such that B ⊂ A is ε-approximately The explicit form of the recovery map is: where $\alpha^t_\sigma$ was given in (41). We can make sense of the latter integral as a Lebesgue integral of a weakly measurable function with values in B, thought of as a Banach space.
Remark 2. Results similar in spirit to Theorems 1, 2, applicable to (mostly finite-dimensional) type I von Neumann algebras, can be found in [11,19,29,38,39,41,55,56,62]. Among these results, the first giving a lower bound with an explicit, ρ-independent recovery map as in our Theorem 1 (hence suitable as a way to prove our Theorem 2), was [41]. As already described in Remark 5, their result can be seen as a special case of our Theorem 1 in the case of embeddings. The inequalities given in [56], demonstrated there for finite dimensional type I von Neumann algebras, are sharper than our Theorem 2, although not obviously so than Theorem 1 because the bounds in [56] naturally come with the "integral inside" the information theoretic quantity while Theorem 1 has the "integral outside". A generalization of [56] to arbitrary von Neumann algebras, including the case of general channels T, is possible [37] but does not seem to follow from Theorem 1 and instead requires ideas beyond those presented in this work.
An example of a set of states that satisfy the assumptions in Theorem 2 is simply S = R (σ) δ (22) for any state σ. If we were to additionally assume that A is σ-finite then we could also pick S to be any closed convex set of states such that To see this, note that the σ-finite condition imposes that all families of mutual orthogonal projectors in A are at most countable. This is satisfied for von Neumann algebras that act on a separable Hilbert space, and is equivalent to the assumption that there is a faithful state in A . Then (46) is sufficient for finding a σ that works with Theorem 2 due to the following basic result: Given a closed convex subset of normal states S ⊂ A for a σ-finite von Neumann algebra A then we can always find a σ ∈ S such that: Proof. Given in Appendix A.

Proof of Main Theorems
Our eventual goal in this section is to prove our main results, Theorems 1 and 2.
As discussed above, without loss of generality we can take σ to be faithful and so we will assume this from now on. The proof is divided into several steps. In Sect. 3.1, we first fix some notation and recall basic facts about the vectors that we are dealing with. In Sect. 3.2, we introduce the non-commutative L p -space by Araki and Masuda [5] and explain its-in principle well-known-relation to the fidelity. We make certain minor modifications to the standard setup and prove a simple but important intermediate result which we call a "first law", Theorem 3. In Sect. 3.3, we motivate the definition of certain interpolating vectors that will be of main interest in the following subsections and in Sect. 3.4 we prove some of their basic properties. Sect. 3.5 is the most technical section. It introduces certain regularized ("filtered") versions of our interpolating vectors and their properties. Our definition of filtered vectors involves a certain cutoff, P , that is defined in terms of relative modular operators. A quite general result of independent interest is that the relative entropy behaves continuously as this cutoff is removed, Theorem 5. Armed with this technology, we can then complete the proofs in Sect. 3.6 using an interpolation result for Araki-Masuda L p -spaces, Lemma 9.
3.1. Isometries V ψ for general states, notation. Since the two states σ, ρ play a central role in Theorem 1 we will use a special notation for the vectors that represent these states in their respective natural cones: where |η A ∈ H (|η B ∈ K ) are cyclic and separating for A (B). We will also choose to label various objects, such as support projectors, and the modular operators discussed below, for the most part with the vector rather than the linear functional as we did in Sect. 2. This will be convenient since we will occasionally have to work with vectors that do not necessarily live in the natural cone. For example, given a |χ ∈ H we define: where ω χ ∈ A is the induced linear functional of |χ ∈ H on the commutant. For vectors |ξ in the natural cone we have a symmetry between the support projectors π A (ξ) = J A π A (ξ)J A . We use similar notation for objects associated to the algebras B. When the only algebra in question is A, we write We have already recalled that a general vector |χ ∈ H is related to a unique vector in the natural cone inducing the same linear functional on A. More precisely, there is a partial isometry in v χ ∈ A such that Now consider a vector |ψ A = ξ A ψ ∈ P A and define a corresponding vector in The vector V η |ψ B ∈ H induces the same linear functional on ι(B) as |ψ A , where we use exchangeably the notation V η = V σ for the embedding (18). Thus there exists a partial isometry u ψ;η in ι(B) , with implied initial and final support, relating the two vectors Combining this with (19) Since this notation is cumbersome we will simply define a new isometry V ψ : It will also be convenient to have V χ defined for states |χ ∈ H that are not necessarily in the natural cone. In that case, we extend this definition further: These satisfy 3.2. L p -spaces, fidelity and relative entropy. In this part we introduce various quantum information measures that will be useful to characterize sufficiency. 
We have already seen the importance of relative entropy and the fidelity. What we need are quantities interpolating between them. These will be provided by the non-commutative L p norm associated with a von Neumann algebra, with reference to a state/vector. There exist different definitions of such norms/spaces in the literature; here we basically follow the version by Araki and Masuda [5], suitably generalized to non-faithful states. Such a generalization was considered by [12], see also [40] for related work.

Definition 2. [5]
Let M be a von Neumann algebra in standard form acting on a Hilbert space H . For 1 ≤ p ≤ 2 the Araki-Masuda L p (M, ψ) norms, with reference to a fixed vector |ψ ∈ H , are defined by 3 : Remark 3. (1) The norm is always finite for this range of p. We will use the L p norms mostly for the commutant algebra A of A. Then, For any unitary u ∈ A , one shows that u ζ A p,ψ = ζ A p,ψ , so the norm only depends upon the functional ω ζ induced by |ζ on A. Furthermore, we will usually take |ψ in the natural cone of A. Then ζ A p,ψ only depends on the linear functionals ω ζ , ω ψ on A induced by |ζ , |ψ ∈ H .
When p = 2, the L p norm becomes the projected Hilbert space norm: Taking a derivative at p = 2 will give the relative entropy comparing |ζ with |ψ as linear functionals on A, see below.
At p = 1 we have the following lemma: where ω φ , ω ψ ∈ A are the induced linear functionals for |φ , |ψ , respectively. 2. The fidelity may also be written as

It is related to the linear functional norm (Fuchs-van-der-Graff inequalities) by
Proof. While these results are standard, we include the proof in the Appendix C.1 because we also treat the non-faithful case for the generalized Araki-Masuda norm in (57). See also the proof in [12].
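In the matrix-algebra case the Fuchs-van de Graaf inequalities can be verified directly. The sketch below assumes the unsquared fidelity F(ρ, σ) = ‖ρ^{1/2} σ^{1/2}‖₁ and identifies the norm of the difference of the functionals with the trace norm ‖ρ − σ‖₁, in which case the inequalities read 2(1 − F) ≤ ‖ρ − σ‖₁ ≤ 2(1 − F²)^{1/2}. The convention and the helper names are assumptions made for this example.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (normalised complex Wishart matrix)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def herm_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)          # guard against tiny negative rounding
    return (v * np.sqrt(w)) @ v.conj().T

def trace_norm(x):
    """Sum of the singular values of x."""
    return np.linalg.svd(x, compute_uv=False).sum()

def fidelity(rho, sigma):
    """Unsquared fidelity F = || rho^{1/2} sigma^{1/2} ||_1 (assumed convention)."""
    return trace_norm(herm_sqrt(rho) @ herm_sqrt(sigma))

rng = np.random.default_rng(3)
rho = random_density_matrix(4, rng)
sigma = random_density_matrix(4, rng)

F = fidelity(rho, sigma)
dist = trace_norm(rho - sigma)
lower = 2.0 * (1.0 - F)
upper = 2.0 * np.sqrt(max(0.0, 1.0 - F**2))
print(lower, dist, upper)
```

The three printed numbers appear in increasing order, so a fidelity close to 1 forces the two states to be close as functionals, which is how the fidelity bounds of the main theorems translate into statements about approximate sufficiency.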
We will also need the following result that is potentially of independent interest.

Theorem 3. (First Law for Renyi Relative Entropy)
Consider a one parameter family of vectors |ζ λ ∈ H for λ ≥ 0, which are normalized ζ λ = 1 and satisfy where |ψ = |ζ 0 . Then: (1) The Petz-Renyi relative entropy satisfies: where > 0 and there is no other constraint on x(λ). (2) The sandwiched Renyi relative entropy satisfies: with no other constraint on how the function p(λ) behaves under the limit.
In order to prove this, we first prove the following lemma: Lemma 4. Given two normalized vectors |ψ , |ζ ∈ H , we have: and we have the elementary bound: Proof.
(1) This is demonstrated by an application of Harnack's inequality (see e.g. [28], sec. 2, Theorem 11), which applies to any h(z) that is harmonic and non-negative in some connected open set O: for all compact subsets K ⊂ O there exists a constant 1 ≤ C(K, O) < ∞ such that $\sup_{z \in K} h(z) \le C(K, O) \inf_{z \in K} h(z)$, where notably this constant is independent of the particular h satisfying the assumptions.
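The uniformity of the Harnack constant can be illustrated in the simplest geometry. The sketch below takes O to be the open unit disk and K the closed disk of radius r, rather than the strips used in the proof, where the classical estimate gives C(K, O) = ((1 + r)/(1 − r))². The test functions are Poisson-kernel type functions Re((a + z)/(a − z)) with |a| = 1; the choice of domain and of test functions is ours and is meant only as an example.

```python
import numpy as np

def poisson_harmonic(z, a):
    """Re((a + z)/(a - z)) for |a| = 1: positive and harmonic in the unit disk."""
    return ((a + z) / (a - z)).real

rng = np.random.default_rng(4)
r = 0.5
# Sample points of the compact set K = {|z| <= r}.
z = r * np.sqrt(rng.uniform(size=500)) * np.exp(2j * np.pi * rng.uniform(size=500))

# Harnack constant for K inside the unit disk; it does not depend on h.
harnack_constant = ((1 + r) / (1 - r)) ** 2

ratios = []
smallest_value = np.inf
for phase in rng.uniform(0.0, 2.0 * np.pi, size=50):
    h = poisson_harmonic(z, np.exp(1j * phase))
    smallest_value = min(smallest_value, h.min())
    ratios.append(h.max() / h.min())   # sampled sup_K h / inf_K h for this h

worst_ratio = max(ratios)
print(worst_ratio, harnack_constant)
```

Every sampled ratio stays below the same constant, independently of which positive harmonic function is chosen; this independence is the property exploited in the proof.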
We work with the real part of two holomorphic functions in two strips: These functions are continuous on the closure of the above strips and they are non-negative since for normalized vectors | ψ| Δ z ψ,ζ |ζ |, | ζ| Δ z ψ,ζ |ζ | ≤ 1 by an easy application of the Hadamard three lines theorem-these facts are standard results of Tomita-Takesaki theory for the relative modular operators. There is no need for any of the vectors to be in the natural cone.
We can thus apply Harnack's inequality. Using the fact that: and picking the compact subset K 1 ⊂ O 1 with 0 ∈ K 1 we have: We have to relate this to h 2 (z) which is what we are most concerned with. We can relate the two functions using the Cauchy-Schwarz inequality where the two defining strips overlap, 0 ≤ Rez ≤ 1/2: which translates to: We can split the compact set K in the statement of the lemma into two compact pieces Repeatedly applying Harnack's inequality as above gives the following upper bound for C K : where it was necessary to add the points {0, 1/4} since they may not have been in the original K.
(2) This result is basically the well-known Araki-Lieb-Thirring inequality [7], for a proof in the von Neumann algebra setting see [12], Theorem 12, for L p norms based on a not necessarily cyclic and separating vector |ψ .

Proof of Theorem 3.
(1) is a consequence of Lemma 4 (1): We can take K = [0, 1 − ε], which satisfies the assumptions of this lemma, so: Then using differentiability of ln(x) at x = 1 and the chain rule we show (64).

Exact recoverability/sufficiency.
This section is meant as an informal summary of some of the results given in [50,51], defining the exact notion of recoverability or sufficiency. We will focus only on the properties associated to sufficiency that we make contact with in this paper, and we will also treat only the case of faithful linear functionals and drop all support projectors here. By definition, the quantum channel ι : B → A is exactly reversible for at least two fixed states ρ, σ if there exists a recovery channel α : A → B such that: and similarly for σ. Since the relative entropy is monotonic [59] under both α, ι, we must have $S_A(\rho|\sigma) = S_B(\rho|\sigma)$, see (17) for our notation. Representing σ, ρ by vectors in the natural cone as in (48) and using a standard integral representation of the relative entropy based on the spectral theorem and the elementary identity (x, y > 0) we get that vanishes. Known properties of the modular operators imply that the integrand is positive [47,50,51]. Therefore, for all β > 0, which can be integrated against a specific kernel that we will not write to arrive at a statement about the relative modular flow: Further manipulations give a derivation that the Petz map is a perfect recovery channel, although we will not go through this. Here we simply note that it is a reasonable guess at this point that for the approximate version of recoverability, one should require that $|\psi_A\rangle$ be close to $\Delta^{-it}_{\eta_A,\psi_A} V_\psi \Delta^{it}_{\eta_B,\psi_B} |\psi_B\rangle$ in some metric. We will use the non-commutative Araki-Masuda $L_p$ norms to provide such a metric.

Interpolating vector.
Motivated by the above discussion we consider the following vector in H : defined at first for purely imaginary z, and assuming at first that |ψ , |η are in the natural cone (of A), see (48) for our notation.
Remark 4. The vector defined here is similar in spirit but does not quite coincide with the interpolating vector considered by [41]. It seems possible to consider other vectors instead, and we briefly comment on this in Appendix E.
Our first result will be an analytic continuation of the vector (84) into a strip: , weakly continuous in the closure of the strip and has the following explicit form at the top and bottom edges: The norm of the vector |Γ ψ (z) is bounded by 1 everywhere in the closure of S 1/2 , and |Γ ψ (0) = |ψ A . 2. On the top edge of the strip S 1/2 this vector induces the the following state on A: where a + is any non-negative self-adjoint element in A, and where α t η is the rotated Petz map (27) for the state σ induced by |η .
Remark 5. 1) A variant of this theorem holds when |ψ is replaced by a unit vector |χ that is not necessarily in the natural cone. In this case, we should define with v χ as in (51). The limiting values (85b), (85a) at the boundaries of the strip are then readily computed using (52). In particular, (85b) takes the same form as before as seen using (55), (56), which also implies |Γ χ (0) = |χ . (86) follows from (9).
(1) Given an a ∈ A , consider the function: which using Tomita-Takesaki theory is analytic in the strip S 1/2 , continuous in the closure, and bounded by: where θ = Re(1/2−z). The maximum is achieved by continuity and compactness of the interval. This bound is however not uniform over vectors a |η A ∈ H with norm 1. For this, we need to use the Phragmen-Lindelöff theorem. Our function has the following form at the edges of the strip (t ∈ R): where we made use of the expressions/definitions in (85a) and (85b) respectively. The first equation above is rather trivial but the second equation requires some lines of algebra: where in the first line we used (8), in the second we inserted Δ it η;B for free giving rise to b which is in A from the last equation in (12), we used (18) in the third line after which we passed Δ −1/2 η,ψ;A to the right which is allowed since this vector is now in the domain of this operator. We used (56) in line four and finally b can be rewritten as: using (13). This finally leads to (90b). Since both expressions in (85a) and (85b) involve products of partial isometries we have the following bound on the edges of the strip: which then extends inside the strip via the Phragmen-Lindelöf theorem. That theorem also requires the (weaker) bound we derived in (89) and it applies inside the closure of the strip. Since A |η A is dense in the Hilbert space we can extend the definition of g(z) to the full Hilbert space, at which point it is a continuous anti-linear functional on all vectors, weakly (hence strongly) holomorphic in S 1/2 . This then defines a vector in H which is then our definition of (84) on the strip S 1/2 . The bound on the norm of this vector follows also from Phragmen-Lindelöf theorem. For the continuity statements we further need the limit of g(z), as a |η approaches an arbitrary vector, to be uniform in z. This follows easily from the uniform boundedness of g(z) and the Banach-Steinhaus principle.
(2) The final property (86) follows from a short calculation: where we used (11) in the second line, the positivity of ς t η (a + ) in the third line, the bound π A (ψ) ≤ 1 in the fourth line, the fact that V * η A V η ⊂ B (see (187)) and again (11) for the B algebra in the fifth line.

Strengthened monotonicity.
3.5.1. Basic strategy
We will apply interpolation theory to the vector |Γ_ψ(z)⟩, following the basic strategy of [41]. By Theorem 4 (2) we get the rotated Petz recovered state on the top of the strip at z = 1/2 + it, so we need to interpolate to the L_1(A′, ψ) norm there, where it becomes the fidelity by Lemma 3 (1). Close to z = 0 we will need to approach the p = 2 norm (the π(ψ) projected Hilbert space norm) by (59), where we will show that we can extract the difference in relative entropy. A generalized sum rule, using sub-harmonic analysis, relates the z = 0 limit to an integral over the fidelities of the z = 1/2 + it vector.
Extracting the relative entropy difference is the most difficult part of the proof and requires some modifications to the basic strategy. We proceed by extending the domain of holomorphy to a larger strip so that we can take derivatives at z = 0 easily. This requires defining a class of states with filtered spectrum for the relative modular operator. We then approach the original state as a limit. After a continuity argument, we show that this is sufficient to prove a strengthened monotonicity statement for all states with finite B relative entropy.
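As a bookkeeping aid for this strategy, the interpolation exponents can be tracked explicitly. The following is a sketch only: it assumes the usual linear interpolation in 1/p between p_0 = 2 at Re z = 0 and p_1 = 1 at Re z = 1/2, with θ = Re z, and it uses the exponent p_θ = 2/(1 + 2θ) that appears later in this section.

```latex
\begin{align*}
\frac{1}{p_\theta}
  &= \frac{1-2\theta}{p_0} + \frac{2\theta}{p_1},
  \qquad p_0 = 2,\quad p_1 = 1,\quad 0 \le \theta \le \tfrac12, \\
  &= \frac{1-2\theta}{2} + 2\theta
   = \frac12 + \theta
  \quad\Longrightarrow\quad
  p_\theta = \frac{2}{1+2\theta}, \\
p_\theta\big|_{\theta=0} &= 2
  \quad\text{(projected Hilbert space norm)},
  \qquad
p_\theta\big|_{\theta=1/2} = 1
  \quad\text{(fidelity)} .
\end{align*}
```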

Filtering and continuity
Our first task will be to extend |Γ_ψ(z)⟩ holomorphically into the larger strip {−1/2 < Re z ≤ 1/2}. This might not be possible for general |ψ⟩, so to make progress we work with vectors that have approximately bounded spectral support for the relative modular operator Δ_{η,ψ}. Thus we now introduce a filtering procedure that produces from |ψ⟩ a vector |ψ_P⟩ with approximately bounded spectral support.
For convenience, we work with |η⟩, |ψ⟩ ∈ H in the natural cone, and consider a related vector |ψ_P⟩ (which is not in the natural cone of A), defined by: where f̃_P is the Fourier transform of a certain function f_P and provides a kind of damping. All modular operators and support projections in this subsection refer to A, and since we only consider one algebra in this subsection, we drop the subscripts to lighten the notation. Note that ln Δ_{η,ψ} is defined on π′(ψ)π(ψ)H since Δ_{η,ψ} is only invertible there. Away from this subspace the operator acts as 0.
We take f_P to have the following properties, motivated by the desire to prove nice continuity statements as P → ∞. Since we want to think of P as a cutoff, we take f_P to be a scaling function: f_P(t) = P f(Pt), (96) and now specify properties of f(t). (Note that the Fourier transform satisfies f̃_P(p) = f̃(p/P).)
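The Fourier relation noted above is what one obtains from the scaling form f_P(t) = P f(Pt), as the following short check shows. It is a sketch that assumes P > 0 and the Fourier convention \tilde f(p) = \int f(t) e^{-ipt} dt, which is not fixed in this excerpt; the opposite sign convention leads to the same conclusion.

```latex
\begin{align*}
f_P(t) &= P\, f(Pt), \\
\tilde f_P(p)
  &= \int_{\mathbb{R}} P\, f(Pt)\, e^{-ipt}\, dt
   = \int_{\mathbb{R}} f(s)\, e^{-i(p/P)s}\, ds
   = \tilde f(p/P), \qquad s = Pt, \\
\|f_P\|_1 &= \int_{\mathbb{R}} P\, |f(Pt)|\, dt = \|f\|_1 .
\end{align*}
```

In particular the L^1 norm does not depend on the cutoff P, while \tilde f_P(p) tends to \tilde f(0) pointwise as P → ∞.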

Definition 3.
We call the function f in (96) a smooth filtering function if it satisfies the following properties.
(A) The Fourier transform f̃ exists as a real and non-negative Schwartz-space function. This implies that the original function f is Schwartz and has finite L^1(R) norm, ‖f‖_1 < ∞.
(B) f(t) has an analytic continuation to the upper complex half plane such that the L^1(R) norm of the shifted function satisfies ‖f(· + iθ)‖_1 < ∞ for 0 < θ < ∞.
Note that the Fourier transform of the shifted function satisfies: Examples of such smooth filtering functions include Gaussians as well as the Fourier transforms of smooth functions f̃ with compact support. The norms satisfy: where the latter inequality is well known as the Hausdorff-Young inequality. We now establish some properties of the resulting vector |ψ_P⟩:
Lemma 5. The filtered vector |ψ_P⟩ defined in (95), based on a smooth filtering function f, has the following properties:
1. lim_{P→∞} |ψ_P⟩ = f̃(0)|ψ⟩ strongly.
2. There exists an element a_P ∈ A′ such that |ψ_P⟩ = a_P|ψ⟩, and such that a_P π(ψ) = a_P.
(1) Letting E_{η,ψ}(dλ) be the spectral resolution of ln Δ_{η,ψ}, we have We can take the pointwise limit P → ∞ using dominated convergence (since f̃ is a bounded function); this immediately gives the statement.
(4) Note that Δ^{1/2}_{ψ,η}|η⟩ = J|ψ⟩ = |ψ⟩ since |ψ⟩ is in the natural cone. Then, shifting the integration contour as is legal by Definition 3 (A), is a Connes-cocycle for A′, and hence an element of A′. Now define Since the Connes-cocycle is isometric, the norm of a_P may be bounded by
(5) We have ⟨ψ_P|a_+|ψ_P⟩ = ⟨η|a_+^{1/2} a_P^* a_P a_+^{1/2}|η⟩ ≤ ‖a_P‖² ⟨η|a_+|η⟩, which gives the statement in view of (4).
We would now like to see how the relative entropy between |η⟩, |ψ_P⟩ behaves in the limit P → ∞. We will find the conditions on f for which the relative entropy converges to that between |η⟩, |ψ⟩ as P → ∞.
Theorem 5. Suppose |ψ⟩, |η⟩ are states on a von Neumann algebra A, assumed to be in the natural cone, and suppose |ψ_P⟩ is given by (95) with scaling function (96) satisfying property (A) of Definition 3 and f̃(0) = 1. Then: 3. The relative entropy behaves continuously for P → ∞ iff the Fourier transform of the scaling function, f̃, is a Gaussian centered at the origin.
Proof.
(1) In view of (101), [3,4], thm. 3.6, eq. (3.7), applied to the algebra A′, gives: for all β > 0. [Footnote 4: We apply [3,4], thm. 3.6, eq. (3.7) to the commutant A′ using (8), which amounts to switching A ↔ A′ as well as any support projectors π ↔ π′. Note further that [3,4], thm. 3.6 refers to the natural cone, but the specific representative of the linear functional does not affect the modular operators above since Δ_{ξ_{ψ_P},η} = Δ_{ψ_P,η}, using the notation (51) (now for the commutant).] The following type of integral representation for the relative entropy is well known, see e.g. [47]: and the integral converges iff the relative entropy is finite. The bound in (112) can be used to bound (113) from above due to the first equation in (12), and this gives: Using the spectral decomposition of ln Δ_{η,ψ}, we can write
This integral converges because p f̃(p/P)² is uniformly bounded, by the Schwartz condition in Definition 3 (A). Thus the right hand side of (114) is finite and so we have shown (1).
(2) Let us continue by first assuming that S(ψ|η) < ∞. Strong convergence of |ψ_P⟩, Lemma 5 (1), guarantees that lim_P ⟨ψ_P|ψ_P⟩ = 1 since f̃(0) = 1. Now the integral on the right hand side of (115) can be split into two parts: where we have applied the dominated convergence theorem to each term using the facts that f̃_P(p) is bounded and that the relative entropy is finite. Taking the lim sup on both sides of (114) gives the first inequality in (110). Lower semi-continuity of relative entropy [3,4] gives the second inequality.
If instead S(ψ|η) = ∞, then we find from lower semi-continuity: thus the limit must exist on the extended positive real line where it is infinite. This shows (2).
(3) Note that ‖f‖_1 ≥ ‖f̃‖_∞ ≥ f̃(0) = 1, so we get the continuity in (111) iff the Hausdorff-Young inequality is saturated and f̃(0) = ‖f̃‖_∞. It was shown by Lieb [44] that the only functions that saturate the Hausdorff-Young bound are in fact Gaussians. The condition f̃(0) = ‖f̃‖_∞ then simply means that the Gaussian f̃ must be centered at the origin.
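A worked Gaussian example may help to see why the centering matters in part (3). It is a sketch under the assumed Fourier convention \tilde f(p) = \int f(t) e^{-ipt} dt; the real offset p_0 is a hypothetical parameter introduced only for this illustration, and \tilde f is normalized so that \tilde f(0) = 1.

```latex
\begin{align*}
\tilde f(p) &= \exp\!\Big(-\tfrac12 (p-p_0)^2 + \tfrac12 p_0^2\Big),
  \qquad \tilde f(0) = 1, \\
f(t) &= \frac{1}{2\pi}\int_{\mathbb{R}} \tilde f(p)\, e^{ipt}\, dp
      = \frac{e^{p_0^2/2}}{\sqrt{2\pi}}\; e^{i p_0 t}\, e^{-t^2/2}, \\
\|f\|_1 &= e^{p_0^2/2} = \|\tilde f\|_\infty \;\ge\; \tilde f(0) = 1,
  \qquad \text{with equality iff } p_0 = 0 .
\end{align*}
```

The Hausdorff-Young bound is saturated for every p_0, but the chain of inequalities collapses to equalities only for the Gaussian centered at the origin.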

Updated interpolating vector
We now consider again our interpolating vector (84). With the intention to extend the domain of holomorphy, we consider the filtered states |ψ_P⟩ instead of |ψ⟩. Although |ψ_P⟩ is not in the natural cone, we can still define |Γ_{ψ_P}(z)⟩ in view of Remark 5 (1). This will, however, not be sufficient by itself: it turns out that we also have to apply a projector Π_Λ to our vectors, so we consider where E_{ψ_P}(dλ) is the spectral decomposition of ln Δ_{ψ_P}, so that lim_{Λ→∞} Π_Λ = π(ψ_P)π′(ψ_P) in the strong sense. We intend to send the regulators Λ, P → ∞, and in that process we will tune f̃(0) to maintain ‖ψ_P‖ = 1, and require (A) and (B) of Definition 3. With those changes in place, we claim the following updated version of Theorem 4.

We have
Proof. In order for the proof to run in parallel with that of Theorem 4, we consider instead of |ψ P the corresponding vector |ξ ψP in the natural cone of A. By Remark 5 (1), and transformation formulas such as Δ z ψP = v ψP * Δ z ξ ψ P v ψP (which give corresponding transformation formulas for Π Λ ), we find that Π Λ,ψP | Γ ψP (z) = v ψP Π Λ,ξ ψ P |Γ ξ ψ P (z) . The partial isometry v ψP is evidently of no consequence for the claims made in this lemma. By abuse of notation, we can assume without loss of generality for the rest of this proof that |ψ P is in the natural cone.
(1) Then, as in the proof of Theorem 4, we also use the shorthand Δ ηB,ψP B = Δ η,ψP ;B etc. With these notations understood, let us write out which is initially defined only for purely imaginary z. We now consider the bracketed operator above: Δ −z ψP Δ z η,ψP . It is well known that the majorization condition (102) ensures that this operator has an analytic continuation to the strip −1/2 < Rez < 0. For completeness we give this argument here using a similar approach as in the proof of Theorem 4.
Thus, we define, dropping temporarily the subscript A as all quantities refer to this algebra: where c ∈ A, d′ ∈ A′ and |ζ⟩ ∈ (1 − π′(ψ_P))H. This function is holomorphic in the lower strip {−1/2 < Re z < 0} and is continuous in the closure due to Tomita-Takesaki theory. As in the proof of Theorem 4, we can easily derive an upper bound on |G(z)| that is not uniform in c, d′. We can then improve this to a uniform bound using the Phragmén-Lindelöf theorem by checking the top and bottom edges of the strip. At the top we have: and at the bottom we need the following calculation: Consequently, where in the first line we dropped the support projectors and defined the modular flow ς′ on A′. In the second line we finally used the majorization condition (102) that is true for these filtered states. These bounds at the edges of the strip, and the weaker bound derived earlier, can be extended into the full strip such that G(z) ‖f(· + iP)‖_1^{2z} is holomorphic and bounded by 1 everywhere for −1/2 ≤ Re(z) ≤ 0. Since c^*|ψ_P⟩ + |ζ⟩ and d′|η⟩ for all c ∈ A and d′ ∈ A′ are dense, we can extend the definition of the operator Δ^{−z}_{ψ_P} Δ^z_{η,ψ_P} to the entire Hilbert space, where it remains bounded. Since the limit of G(z), as c^*|ψ_P⟩ and d′|η⟩ approach two general vectors in the Hilbert space, is uniform in z, we get the same continuity statement for Δ^{−z}_{ψ_P} Δ^z_{η,ψ_P} in the weak operator topology. We also get holomorphy for this operator in the interior of the strip. Note that since Δ^{−z}_{ψ_P} Δ^z_{η,ψ_P} = (Dψ_P : Dη)_{−iz} π′(ψ_P) for the Connes-cocycle (Dψ_P : Dη)_{−iz} ∈ A holds along z = it for real t, it continues to take this form in the lower strip. Now let us turn to the first bracketed operator in (121), Π_Λ Δ^z_{ψ_P}, which is a holomorphic operator (and thus continuous in the strong operator topology) in the entire strip due to the projection on a bounded support of the spectrum of ln Δ_{ψ_P}. In fact, the operator norm satisfies ‖Π_Λ Δ^z_{ψ_P}‖ ≤ e^{−Λ Re z} for Re z ≤ 0.
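The uniform bound used here follows the pattern of the three-lines (Phragmén-Lindelöf) estimate, which can be sketched as follows. The symbols F, M_a, M_b are generic placeholders, C stands for the constant that bounds G on the lower edge for unit-norm test vectors (identified in the text with an L^1 norm of the shifted filter), and the description of Π_Λ as the spectral projection of ln Δ onto [−Λ, Λ] is an assumption made for this illustration.

```latex
% F holomorphic in a < Re z < b, continuous on the closure,
% and of admissible growth (e.g. bounded) in the strip.
\begin{align*}
\sup_{y}|F(a+iy)| \le M_a,\quad \sup_{y}|F(b+iy)| \le M_b
\;\;\Longrightarrow\;\;
|F(x+iy)| \le M_a^{\frac{b-x}{b-a}}\, M_b^{\frac{x-a}{b-a}},
\qquad a \le x \le b .
\end{align*}
% Lower strip of the text: a = -1/2, b = 0, M_b = 1, M_a = C > 0.
\begin{align*}
|G(z)| \le C^{-2\,\mathrm{Re}\,z}
\quad\Longleftrightarrow\quad
\big|G(z)\, C^{2z}\big| \le 1,
\qquad -\tfrac12 \le \mathrm{Re}\,z \le 0 .
\end{align*}
% Spectral cutoff, assuming \Pi_\Lambda projects onto |\ln\Delta| \le \Lambda.
\begin{align*}
\|\Pi_\Lambda \Delta^{z}\|
 \le \sup_{|\lambda|\le\Lambda} e^{\lambda\,\mathrm{Re}\,z}
 = e^{\Lambda\,|\mathrm{Re}\,z|}
 = e^{-\Lambda\,\mathrm{Re}\,z},
\qquad \mathrm{Re}\,z \le 0 .
\end{align*}
```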
Finally we analyze the following vector appearing in (121), Δ −z η,ψP ;B |ψ P B which is holomorphic in {−1/2 < Rez < 0} and strongly continuous in the closure of this region due to Tomita-Takesaki theory. This vector is also norm bounded by 1.
At this stage, we can combine the above holomorphy statements in (121) showing that this vector is analytic in the lower strip {−1/2 < Rez < 0}. For the continuity statement in z, note that an operator that is uniformly bounded and continuous in the weak operator topology such as Δ −z ψP ,η Δ z η , acting on a strongly continuous vector Δ −z η,ψP ;B |ψ P B gives a weakly continuous vector. Similarly, an operator that is continuous in the strong operator topology Π Λ Δ z ψP acting on a weakly continuous vector-the output of the last statement-gives a weakly continuous vector. Now we use the vector-valued edge of the wedge theorem (see e.g. [35], app. A), in conjunction with Theorem 4, which already establishes an analytic extension to the upper strip 0 < Rez < 1/2. We thereby extend Π Λ |Γ ψP (z) holomorphically to the full strip −1/2 < Rez < 1/2.
(2) The bound (119) follows by combining the operator norm bounds above.
(3) Holomorphy at z = 0 allows us to take the derivative in (120a) on the bra and ket separately, and it is easy to see that they give the same contribution. The equality in (120a) also relies on Π_Λ|ψ_P⟩ = |ψ_P⟩. The second line (120b) follows by working with the right hand side of (120a) and taking the derivative as a limit along z = it for t → 0. This gives: where the latter limits can be shown to exist when the |ψ_P⟩ relative entropies are finite, as is indeed the case by Theorem 5 (1), see [47], thm. 5.7. The first equality in (127) can be shown more explicitly by subtracting the two sides and observing that this is an inner product of two vectors. After applying the Cauchy-Schwarz inequality, one again uses the finiteness of the |ψ_P⟩ relative entropy, by Theorem 5 (1), to show that this difference vanishes in the limit:

L p norms of updated interpolating vector
We now study L p norms of the updated interpolating vector (118) and its limits as P, Λ → ∞, z → 0 and p → 1 or p → 2. First we consider p = 1.

We have
Proof.
(1) For the first equality, we need an appropriate continuity property of the L 1 -norm which is provided in Lemma 11, Appendix C.2. It shows that strong convergence of the vectors implies the convergence of the L 1 norm. For the limit Λ → ∞, this follows from the strong convergence of Π Λ to π (ψ P )π(ψ P ). In fact, we can drop these support projectors because by definition π (ψ P ) |Γ ψP (z) = |Γ ψP (z) and also because the L p norms satisfy (58).
(2) We use the fact that, where the fidelity F (ω ψP , ω ψP • ι • α t η ) is concerned, we can pick another vector that gives the same linear functional. We can replace: Then, in view of Lemma 11, Appendix C.2, we only need establish the strong convergence of |ψ P B and of |ψ P A as P → ∞, and this follows by combining Lemma 5 (1) and eq. (6) [remembering the notations (48)].
Next, we consider simultaneously approaching p = 2 and z = 0.

Proof. Define the normalized vector
We can then use Lemma 6, (120a) to show that: So we can apply the "first law" (65) for the L_p norms in Lemma 3 to |ζ_θ⟩, to conclude since p_θ = 2/(1 + 2θ) satisfies the assumptions of Lemma 3. The L_p norms are homogeneous, so we can pull out the normalization: and this gives the desired answer after applying (120a) again.
Proof. See Appendix D. In the commutative setting this is closely related to the Stein interpolation theorem [53]. In the non-commutative setting, a proof appears for type I factors and the usual non-commutative Schatten L p norms in [41]. We will make sure that it works in the setting of the Araki-Masuda L p norms defined in (57).

Proof of Theorems 1 and 2.
We close out this long section by combining the above auxiliary results into proofs of the main theorems.
Proof of Theorem 1. Given the two normal states ρ, σ we consider as above representers |ψ⟩, |η⟩ in the natural cone. From this we construct the filtered vector |ψ_P⟩ as in (95). We then apply Lemma 9 with p_1 = 1, p_0 = 2, M = A′, |G(z)⟩ = Π_Λ|Γ_{ψ_P}(z)⟩ and use that the L_2 norm is actually the (projected) Hilbert space norm, see eq. (59), so Taking the limit θ → 0^+ with the aid of Lemma 8 we have: where the limit exists due to Lemma 7 (1) and where we have used the monotonicity of ln. Taking the limit P → ∞ we get, in view of Lemma 7 (2), Theorem 5 (3) for a Gaussian filtering function satisfying (A) and (B) of Definition 3, and lower semi-continuity of the B relative entropy, that We can then re-write the answer in terms of the original states ρ, σ and we arrive at (26). (Recall that we are using α^t_η = α^t_σ interchangeably.)
Theorem 1 forms the basis of the next proof:
Proof of Theorem 2. Since all states ρ_i ∈ S have finite relative entropy with respect to σ ∈ S we learn that π(ρ_i) ≤ π(σ). This implies, via Lemma 1 (in particular (40)), that if ι_π(B_π) ⊂ A_π is ε-approximately sufficient for S_π then ι(B) ⊂ A is ε-approximately sufficient for S. Here and we have used (34b). The recovery channel α_S is derived from the recovery channel for ι_π(B_π) ⊂ A_π. This latter recovery channel α_{S_π} then pertains to the "faithful" version of this theorem, and is derived from Theorem 1, as we will show below. In this way we can proceed by simply assuming that σ is faithful for A, now without loss of generality. In particular we may take (45) to be determined by the faithful Petz map in (27).
In the faithful case we first check that the map (45) is indeed a recovery channel. This follows since α^t_σ are recovery channels for each t ∈ R (generalizing the results in [48] to non-zero t) and so the weighted t integral is also clearly unital and completely positive.
We now check the continuity property of (45). The integral is rigorously defined as follows. For all a ∈ A the function t → α^t_σ(a) is continuous in t in the ultra-weak topology (thus Lebesgue measurable) and bounded on R. So gives a continuous linear functional and thus defines an element in B (the continuous dual of the predual) that we call α_S(a). Continuity in the linear functional norm follows from the convergence of the following integral: This also guarantees that the resulting operator α_S(a) = ∫_R p(t) α^t_σ(a) dt is a bounded operator: We need to check the ultraweak continuity of a → α_S(a). For all ρ ∈ B_* we define the integral in much the same way as above, as a Lebesgue integral on continuous functions valued in A_*. That is, the evaluation of this expression on a ∈ A defines an ultraweakly continuous functional on A. This follows since the sequence ∫_R p(t) ρ∘α^t_σ(a_n) dt (149) converges to the integral of the pointwise limit by the dominated convergence theorem, as p(t)|ρ∘α^t_σ(a)| ≤ p(t)‖ρ‖‖a‖ is integrable. Putting all the pieces together we find that is ultraweakly continuous, since for all ρ ∈ B_*, converges to zero whenever a_n → a ultraweakly. The proof is then completed by rewriting Theorem 1 using the concavity of fidelity. For this, we require a version of Jensen's inequality for the concave functional σ → F(ρ, σ) on normal states on A with respect to the measure p(t)dt. This would give us where ρ is a state in A_*. Then Theorem 1 becomes: which implies that B is ε-approximately sufficient as claimed by the theorem. We are not aware of a proof of Jensen's inequality for convex functionals of a Banach space valued random variable that would apply straight away to the case considered here. In particular, it is not evident that the integrals in question can be approximated by Riemann sums in the general case, as was done in [41]. So we now demonstrate (152) by a more explicit argument using the detailed structure of the fidelity.
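The logic of this last step can be summarized schematically. The sketch assumes that p(t) is a probability density on R and that Theorem 1 has the integrated form shown in the first line, with the prefactor 2 corresponding to the unsquared fidelity (an assumption about the normalization in (26)). The abbreviations F_t := F(ρ, ρ∘ι∘α^t_σ) and α_S := ∫ p(t) α^t_σ dt are introduced here only for illustration.

```latex
\begin{align*}
S(\rho|\sigma) - S(\rho\circ\iota\,|\,\sigma\circ\iota)
 &\ge -2\int_{\mathbb{R}} p(t)\,\ln F_t \; dt
   && \text{(Theorem 1, assumed form)} \\
 &\ge -2\,\ln\!\int_{\mathbb{R}} p(t)\, F_t \; dt
   && \text{(Jensen, $\ln$ concave)} \\
 &\ge -2\,\ln F\big(\rho,\ \rho\circ\iota\circ\alpha_S\big)
   && \text{(concavity of $F$, eq.\ (152))} .
\end{align*}
```

Only the last inequality requires the infinite-dimensional concavity statement (152); the middle step is the scalar Jensen inequality for the logarithm.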
Consider the Hilbert space Y = L 2 (R; H ; p(t)dt) ∼ = H⊗L 2 (R; p(t)dt) of strongly measurable square integrable functions valued in H . Vectors |Υ in this space are (equivalence classes of) functions t → |Υ t . Y is evidently a module for A. We denote this von Neumann algebra by A ⊗ 1 since it acts trivially in the second L 2 tensor factor of Y . Now define the fidelity as: We next formulate a lemma that will allow us to complete the proof.

Lemma 10. Let |Υ , |Ψ ∈ Y induce linear functionals on A ⊗ 1 such that
where a_+ is an arbitrary non-negative element in A and σ, ρ states on A. Then if |Υ_t⟩, |Ψ_t⟩ are strongly continuous then F(Υ_t, Ψ_t) is continuous, and we have
Proof. If |Υ_t⟩, |Ψ_t⟩ are strongly continuous then F(Υ_t, Ψ_t) is continuous in t by (225), since the fidelity is the L_1 norm, see app. C. The idea is now to construct a suitable family of elements y_t ∈ A′. This family should be chosen so as to satisfy simultaneously: (i) ‖y_t‖ ≤ 1, (ii) t → y_t is strongly continuous, (iii) in the sup definition of the fidelity (61), we are suitably close to saturating the supremum in the sense that F(Υ_t, Ψ_t) is approximately |⟨Υ_t|y_t Ψ_t⟩|. Then (ii) implies that y_t|Ψ_t⟩ is weakly measurable and thus strongly measurable by the Pettis measurability theorem, see e.g. [52], thm. 3.1.1. By (i) we then see that the map t → y_t|Ψ_t⟩ is in the Hilbert space Y, because boundedness of y_t clearly implies that it is square integrable. (ii) holds for instance if the function y_t is continuous in the norm topology, and we will attempt to choose it in this way. Then y_t, as a function, will define an element Y in (A ⊗ 1)′ that can be used in the variational principle (154). We must therefore have, using concavity of the fidelity in the same manner as in (131), using the variational principle (154) to obtain the last inequality, and using that the fidelity only depends on functionals in the first. The evident strategy is now to make our choice (iii) of the function y_t in such a way that the right side is close to the right side of (156), while being continuous in the operator norm topology and while satisfying ‖y_t‖ < 1, so that (i) and (ii) hold as discussed.
To this end, consider the open unit ball in A′ in the norm topology, A′_1 ≡ {x ∈ A′ : ‖x‖ < 1}. For all t we define next a subset X_t ⊂ A′_1 by This set is open in the norm topology because the second set on the right hand side of (158) is open in the weak operator topology and so it is open in the norm topology, too. It is non-empty since we know that in the sup definition of fidelity it is sufficient to take ‖x‖ < 1 and still achieve F(Ψ_t, Υ_t).
We will be interested in the norm closures X̄_t. What we then need to do is select a function from this set, y_t ∈ X̄_t, that varies continuously in the operator norm. This problem can be solved by the Michael selection theorem [45]. Indeed, we can consider the mapping t ∈ R → X̄_t ∈ 2^{A′} as a map from the paracompact space R to subsets of A′, thought of as a Banach space (with the operator norm). If it can be shown that the sets X̄_t are nonempty, closed and convex and that this map is "lower hemicontinuous", then by the Michael selection theorem there is a continuous selection y_t ∈ X̄_t, as we require.
We have seen that the sets are closed and nonempty. Convexity follows from where the first equation is schematic but is hopefully clear, and where p_1, p_2 ≥ 0, p_1 + p_2 = 1. This implies that X_t is convex and hence its closure is also convex. Lower hemicontinuity at some point t is the property that for any open set V ⊂ A′ that intersects X̄_t there exists a δ such that X̄_{t′} ∩ V ≠ ∅ for all |t − t′| < δ. We see this for the case at hand as follows. Take V satisfying the assumption, and note that V ∩ X_t is also non-empty. Pick a y ∈ V ∩ X_t. There exists an ε′ < ε such that: Then, by the strong continuity of |Υ_t⟩ resp. |Ψ_t⟩ and continuity of F(Ψ_t, Υ_t), we see that this condition is stable: Given ε − ε′ > 0 there does indeed exist a δ such that which implies that y ∈ V ∩ X_{t′} ⊂ V ∩ X̄_{t′} as required. From Michael's theorem we therefore get the desired norm continuous y_t satisfying for all t. Using that the fidelity is real and (157), and that ε can be made arbitrarily small, then readily implies the lemma.
We now use this lemma with |Υ t := |Γ ψ (i/2+t) , which is weakly continuous by Theorem 4 (1). Actually, it is even strongly continuous since it is given by the product of bounded operators and Δ it η;A , Δ it η;B , which are strongly continuous as they are 1-parameter groups of unitaries generated by a self-adjoint operator by Stone's theorem, see e.g. [26], sec. 5.3. We also take |Ψ t = |ψ , which is obviously strongly continuous as it is just constant. Then |Υ induces a state dominated by ρ • ι • α S , by Theorem 4 (2), and |Ψ induces ρ by definition, and |Υ t induces ρ • ι • α t σ . We thereby arrive at the concavity result (152), and this concludes the proof of Theorem 2.

Examples
Here we illustrate our method and results in two representative examples.

Example: finite type-I algebras.
To compare our method to that of [41] in the subalgebra case, we work out our interpolating vector (84) in the matrix algebra case. Thus let A = M_n(C) and B = M_m(C), C = B′ ∩ A, embedded as the subalgebra b → ι(b) = b ⊗ 1_C where n = m × k and these integers label the size of the matrices. We will work in the standard Hilbert space (H ≅ M_n(C) ≅ (C^n)^* ⊗ C^n) and identify state functionals such as σ with density matrices. So for example σ_A ∈ M_n(C), and we assume for simplicity that this has full rank (faithful state).
H ≅ M_n(C) is both a left and right module for A, and the inner product on H is the Hilbert-Schmidt inner product. The natural cone of A is defined to be the subset of positive semi-definite matrices in H. The modular conjugation and relative modular operators (of A) associated with this natural cone are: where we invert the density matrix ρ_A on its support. The natural cone vectors correspond to the unique positive square root of the corresponding density matrix, now thought of as pure states in the standard Hilbert space. The embedding is: Using these replacements it is easy to compute our interpolating vector (84) |Γ_ψ(z)⟩ by starting with the expression in (85a) and The L_p(A′, ψ) norms can be computed using the well known correspondence between these norms and the sandwiched relative entropy discussed in [12]. This gives: where in the last equation we set p = p_θ and used 1/p_θ − 1/2 = θ, and where and we recognize this latter expression as [41], eq. (25) with α there given by p_θ/2.
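For orientation, the standard matrix-algebra expressions behind this example can be sketched as follows. The sketch assumes that A = M_n(C) acts on H ≅ M_n(C) by left multiplication, that ρ_A and σ_A are density matrices with reductions ρ_B, σ_B obtained by the partial trace over C, and that σ_A and σ_B have full rank. The sign convention for t in the rotated map is an assumption and may differ from the one used in (27).

```latex
\begin{align*}
\langle X | Y\rangle &= \operatorname{Tr}(X^\dagger Y), &
J\,|X\rangle &= |X^\dagger\rangle, \\
|\xi_\rho\rangle &= |\rho_A^{1/2}\rangle, &
\Delta_{\sigma,\rho}\,|X\rangle &= |\sigma_A\, X\, \rho_A^{-1}\rangle, \\
S(\rho|\sigma) &= \operatorname{Tr}\rho_A(\ln\rho_A - \ln\sigma_A) .
\end{align*}
% Rotated Petz map acting on density matrices (Schroedinger picture):
\begin{align*}
\rho_B \;\longmapsto\;
\sigma_A^{\frac12 - it}
 \Big[\big(\sigma_B^{-\frac12 + it}\,\rho_B\,\sigma_B^{-\frac12 - it}\big)
      \otimes 1_C\Big]\,
\sigma_A^{\frac12 + it},
\qquad
\sigma_B \;\longmapsto\; \sigma_A \ \text{ for every } t .
\end{align*}
```

The last statement is the defining property of a recovery map for the reference state: inserting ρ_B = σ_B makes the bracket equal to the identity, so the output is σ_A for every t.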

Example: half-sided modular inclusions.
Half-sided modular inclusions were introduced in [64,65] and consist of the following data: An inclusion B ⊂ A of von Neumann algebras acting on a common Hilbert space H , containing a common cyclic and separating vector |η . Furthermore, for t ≥ 0, it is required that Δ it η,A BΔ −it η,A ⊂ B, hence the terminology "half-sided." This situation is common for light ray algebras in chiral CFTs, where |η is the vacuum.
Wiesbrock's theorem [64,65] is the result that for any half-sided modular inclusion, there exists a 1-parameter unitary group U (s), s ∈ R with self-adjoint, non-negative generator which can be normalized so that for t ∈ R. Furthermore, the unitaries Δ it η,A , U(s) fulfill the Borchers commutation relations [13] and in particular B = U (1)AU (1) * , J A U (s)J A = U (−s). For any a > 0, the inclusion A a = U (a)AU (a) * ⊂ A is then also half sided modular.
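The interplay between the translation group and the two modular flows can be sketched as follows. This is an illustration only: it assumes the Borchers relation in dilation form with a convention-dependent sign κ = ±1 (a placeholder introduced here), and that U(s) leaves |η⟩ invariant. It is not a substitute for the normalization stated in the theorem.

```latex
\begin{align*}
\Delta^{it}_{\eta,A}\, U(s)\, \Delta^{-it}_{\eta,A}
 &= U\big(e^{2\pi\kappa t}s\big),
 \qquad \kappa = \pm 1 \ \text{(convention)}, \\
\Delta^{it}_{\eta,B} &= U(1)\,\Delta^{it}_{\eta,A}\,U(1)^*
 \qquad \text{since } B = U(1)\,A\,U(1)^*,\ \ U(1)|\eta\rangle = |\eta\rangle, \\
\Delta^{it}_{\eta,A}\,\Delta^{-it}_{\eta,B}
 &= \Delta^{it}_{\eta,A}\,U(1)\,\Delta^{-it}_{\eta,A}\,U(1)^*
  = U\big(e^{2\pi\kappa t} - 1\big), \\
\Delta^{it}_{\eta,A}\, B\, \Delta^{-it}_{\eta,A}
 &= U\big(e^{2\pi\kappa t}\big)\, A\, U\big(e^{2\pi\kappa t}\big)^* .
\end{align*}
```

Since U(a)AU(a)^* ⊂ A for a > 0, the last line is contained in B whenever e^{2πκt} ≥ 1, which is how the half-sided condition selects the sign once the direction of t is fixed.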
For a half-sided modular inclusion, the embedding is trivial, V_η = 1. Using this information, one can easily show that in the case of the half-sided modular inclusions A_a = U(a)AU(a)^* ⊂ A, the rotated Petz recovery channel, denoted here as α^t_a : B → A to emphasize the dependence on a, is: Theorem 1 therefore gives the following corollary, conjectured in [21], after a change of integration variable. It is now well known that the QFT averaged null energy condition (ANEC) can be proven [22] using monotonicity of relative entropy for algebras associated to space-time regions that are null deformations of each other. The ANEC is a weaker version of the null energy condition, a positivity bound on the stress energy tensor contracted with a null direction. The null energy condition (NEC), which is known to be violated by quantum effects in QFT, plays a central role in proving singularity theorems and the second law of black hole thermodynamics for classical gravity coupled to matter. Finding the correct quantum generalization of the NEC is an important task for studying aspects of quantum gravity and the ANEC is one such candidate. Indeed the setting of von Neumann algebras satisfying a half-sided modular inclusion is believed (proven in physical terms in [23]) to underlie the QFT of null-deformed regions. It is thus natural to wonder what the strengthened monotonicity bound (172) implies in this physical setting. In [21], it was speculated that there is a relation to the quantum null energy condition (QNEC), an improved candidate for the quantum generalization of the NEC that constrains the second null derivative of relative entropy. Indeed [21] already rigorously proved the QNEC using the same setup, but with an overall different method. It is still interesting to try to connect the QNEC to (172) since such a connection would open up further quantum information insights into QFT and quantum gravity.

Corollary 2. Let B ⊂ A be a half-sided modular inclusion with respect to
In order to motivate a concrete version of this conjecture, consider first the interpolation vector (84) for the case of a half-sided modular inclusion. We have, V ψ = u ψ;η ∈ B [from (55)] is the partial isometry that takes |ψ A in the natural cone P A (defined w.r.t. |η ) to the state representer in P B (also defined w.r.t. |η ). The interpolation vector (84) thereby becomes in the case of half sided modular inclusions The vector (173) is similar to a vector studied in [21] in order to prove the quantum null energy condition (QNEC). Furthermore, some preliminary calculations of the Renyi-relative entropy difference using this vector and for some limited class of states point to the following: This is a more refined version of a conjecture appearing in [21]. A corollary to this conjecture, if proven, would be a new proof of the QNEC since the recovery channel is translationally invariant so applying the same result to a further translated null cut one can use monotonicity of the fidelity to prove that
If Π S is the empty set then it must be the case that π(ρ i ) = π(ρ j ) for all ρ i,j ∈ S , since otherwise we could use convexity to show a contradiction: So in this case (47) is trivial. We may thus assume from now on that Π S is non-empty. By Zorn's lemma we can pick a maximal family of mutually orthogonal projectors from Π S , where family means a subset of Π S , and maximal means that there are no other orthogonal families of projectors that are strictly larger under the order of inclusion. Call the maximal family q max . By the σ-finite condition, it is a countable family Given q max we define: The infinite sum converges in the linear functional norm and so by convexity and closedness of S we find that σ ∈ S . The support projector for this state satisfies: (understood as a direct sum in the norm topology.) By the maximality condition we can show (47). To see this, suppose that this is not true for some ρ k then: for all n. This contradicts the maximality of q max , which is absurd.

B. Isometric Embedding
We work with σ ∈ A faithful which implies that σ • ι ∈ B is faithful. Thus the corresponding vectors |ξ A σ , |ξ B σ in the natural cones are cyclic and separating.
By a trivial calculation, one sees that V σ defined in (18) is a norm-preserving (densely defined) map from K to H . So the map extends to the full Hilbert space as an isometric embedding V * σ V σ = 1 K . A similar argument shows that: where this equation applies on the subspace of H that is generated by B: In other words, |ξ A σ is not cyclic for ι(B) and π B (σ) defines the associated support projector for the commutant algebra.
The embedding satisfies: since we can approximate any |χ = lim n c n ξ B σ ∈ K for suitable c n ∈ B, and take the limit on both sides of: Thus, for all vectors |χ 1,2 ∈ K , or: The commutant satisfies: which can be verified via a short calculation for a ∈ A and b ∈ B: where we used the fact that π K ∈ ι(B) and A ⊂ ι(B) .

C.1. Proof of Lemma 3 (Fidelity and the Araki-Masuda norm).
Proof. (1) In this proof, all L 1 norms are taken relative to the commutant A as in from (57), and we want to relate this to the fidelity, where φ, ψ are normalized vectors. This relation is proven in [5], lem. 5.3 for a cyclic and separating vector |ψ . We will now remove this condition. The linear functional that appears in (190) A can be written using a polar decomposition ψ| · |φ = ξ| · u |ξ (191) for some |ξ in the natural cone and a partial isometry u with initial support (u ) * u = π (ξ). This polar decomposition has the property that the largest projector in A that satisfies ξ| x p u |ξ = 0 for all x is p = 1 − π (u ξ) = 1 − u (u ) * . 6 Thus: and since A |ψ = π(ψ)H we derive that the final support projector satisfies: where in the second line we used (8) and in the third we used the anti-unitarity of J. The above relation can be rewritten as: where we have freely added ζ ∈ (1 − π(ψ)π (ξ))H since π(ψ)(u ) * |φ is in the subspace π(ψ)π (ξ)H , and this subspace is also the support of Δ ξ,ψ . Now since the vector on the left of (195) is dense: π (ξ)A |ψ + (1 − π(ψ)π (ξ))H = H we learn that (Δ ξ,ψ ) 1/2 |ψ is in the domain of (Δ ξ,ψ ) 1/2 and so that where we used (193). The next step is to show that which implies that This is what we wanted to derive. The later equality in (198) is fairly standard, but for completeness we go through this. Without loss of generality we take |χ in (189) such that u Δ ξ,ψ |ψ is in the domain of (Δ χ,ψ ) −1/2 and also such that π (χ) ≥ π (u Δ ξ,ψ |ψ ) = π (u ξ) and |χ = 1. We would like to use the following result that we will justify later (for now the reader should feel free to verify this for type-I algebras with density matrices): where j(u ) = Ju J and all the domains in the above equation are appropriate. 
Now apply the Cauchy-Schwarz inequality: Taking the infimum over all such |χ⟩ we find that: The other inequality is found since the optimal vector in the infimum is |χ⟩ = u′|ξ⟩/‖ξ‖, where (200) becomes: which implies that: and this establishes equality. We now only need to prove (200). To do this we will analytically continue the equation: away from z = is for s real. We simply take an inner product with a dense set of vectors a|χ⟩ + |ζ⟩ where a ∈ A and |ζ⟩ ∈ (1 − π′(χ))H: since we know that |ξ⟩ is in the domain of (Δ_{ξ,ψ})^{1/2} (since we established that |ψ⟩ is in the domain of Δ_{ξ,ψ}) it is clear that we can analytically continue the two functions above into the strip 0 < Re z < 1/2 with continuity in the closure (using standard results in Tomita-Takesaki theory). Agreement along z = is implies agreement in the full strip. Setting z = 1/2 we have a uniform bound (with ‖a|χ⟩‖ ≤ 1) on the left hand side since we started with the assumption where ⟨ψ|u′φ⟩ = e^{iϕ} sin(θ). We can then take x to be an operator in this subspace. Note that: such that the maximum is achieved for x = σ_3 = diag(1, −1), which has an operator norm of 1. So the norm of this linear functional is 2 cos θ, giving the last equality in (213). Taking the infimum over u′ in (213), we have: In the other direction we can pick φ and ψ to live in the natural cone without loss of generality, and then we have where the latter quantity is real since both vectors are in the cone. We use the inequality (67) for p = 1 that we reproduce here: Altogether, we have Note that the fidelity lies between 0 and 1 and: where equality is achieved on the left iff the two linear functionals are the same and on the right if the supports of the two linear functionals are orthogonal. We can see this as follows. Note that for ‖x‖ ≤ 1: so that |ω_ψ(x) − ω_φ(x)| lies between 0 and 2. Equality is achieved for x = π(ψ) − π(φ) with orthogonal support.
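The two-dimensional computation above can be checked numerically in the simplest type-I case. The sketch below is only an illustration under stated assumptions: it takes A = M_2(C) acting on C^2 (so partial isometries in the commutant reduce to phases and the fidelity of two vector states is |⟨ψ|φ⟩|), and the function names and test angles are hypothetical, not taken from the text. It evaluates the norm of the functional x ↦ ω_ψ(x) − ω_φ(x) for ⟨ψ|φ⟩ = e^{iϕ} sin θ and compares it with 2 cos θ and with the standard Fuchs-van de Graaf bounds.

```python
import cmath
import math


def functional_norm(psi, phi):
    """Norm of x -> <psi|x|psi> - <phi|x|phi> on 2x2 matrices (unit vectors)."""
    # D = |psi><psi| - |phi><phi| is Hermitian and traceless, so its
    # eigenvalues are +lam and -lam with lam = sqrt(D00^2 + |D01|^2).
    # The norm of the functional is the trace norm of D, i.e. 2 * lam.
    d00 = abs(psi[0]) ** 2 - abs(phi[0]) ** 2
    d01 = psi[0] * psi[1].conjugate() - phi[0] * phi[1].conjugate()
    return 2.0 * math.sqrt(d00 ** 2 + abs(d01) ** 2)


def overlap(psi, phi):
    """Fidelity |<psi|phi>| of two vector states on the full matrix algebra."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi)))


def check(theta, varphi):
    """Return (fidelity, functional norm) for <psi|phi> = e^{i varphi} sin(theta)."""
    psi = (complex(1.0), complex(0.0))
    phi = (cmath.exp(1j * varphi) * math.sin(theta), complex(math.cos(theta)))
    return overlap(psi, phi), functional_norm(psi, phi)


if __name__ == "__main__":
    for theta in (0.1, 0.7, 1.3):
        fid, norm = check(theta, 0.4)
        # The norm equals 2 cos(theta), independently of the phase varphi.
        assert abs(norm - 2.0 * math.cos(theta)) < 1e-12
        # Fuchs-van de Graaf: 1 - F <= norm / 2 <= sqrt(1 - F^2).
        assert 1.0 - fid <= norm / 2.0 + 1e-12
        assert norm / 2.0 <= math.sqrt(1.0 - fid ** 2) + 1e-12
```

For vector states on the full matrix algebra the upper bound is saturated, which is why the infimum over partial isometries is needed in the general case to reach the fidelity.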

C.2. Proof of Lemma 11 (Continuity of fidelity).
In this section, all L_1 norms refer to the commutant algebra A′, as in (Lemma 3): where the supremum is over partial isometries u′ ∈ A′.
Lemma 11. For a von Neumann algebra A in standard form acting on a Hilbert space H and any |ψ_i⟩, |φ_i⟩ ∈ H,
Proof. The variational expression (222) immediately allows one to deduce the triangle inequality for the L_1-norms. Note that: and that ‖ψ‖_{1,φ} = ‖φ‖_{1,ψ} are further trivial consequences of the variational definition. For normalized vectors |ψ_1⟩, |ψ_2⟩, |φ_1⟩, |φ_2⟩ we derive for the L_1-norms relative to A′: where to go to the second line we used the reverse triangle inequality twice, and in the last step we used (224).
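The continuity argument can be illustrated in the commutative case. The following sketch rests on two assumptions that are not stated in the text as reproduced here: first, that the bound of Lemma 11 takes the form |F(ψ_1, φ_1) − F(ψ_2, φ_2)| ≤ ‖ψ_1 − ψ_2‖ + ‖φ_1 − φ_2‖ for normalized vectors, which is what the proof outline (triangle inequality, symmetry, reverse triangle inequality twice) suggests; second, that A is the algebra of diagonal matrices on C^n, for which the supremum over partial isometries in the commutant gives F = Σ_i |ψ_i||φ_i|. All function names are hypothetical.

```python
import math
import random


def normalize(v):
    """Rescale a complex vector to unit Euclidean norm."""
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v]


def distance(u, v):
    """Hilbert space distance between two vectors."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(u, v)))


def fidelity_abelian(psi, phi):
    """sup over diagonal partial isometries u of |<psi|u phi>| = sum |psi_i||phi_i|."""
    return sum(abs(a) * abs(b) for a, b in zip(psi, phi))


def random_unit_vector(rng, dim):
    return normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)])


def perturb(rng, v, scale):
    """Nearby unit vector, so that the bound is tested in a non-trivial regime."""
    return normalize([x + scale * complex(rng.gauss(0, 1), rng.gauss(0, 1)) for x in v])


def max_violation(trials=500, dim=6, scale=0.05, seed=1):
    """Largest value of lhs - rhs of the assumed continuity bound over random samples."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        psi1 = random_unit_vector(rng, dim)
        phi1 = random_unit_vector(rng, dim)
        psi2 = perturb(rng, psi1, scale)
        phi2 = perturb(rng, phi1, scale)
        lhs = abs(fidelity_abelian(psi1, phi1) - fidelity_abelian(psi2, phi2))
        rhs = distance(psi1, psi2) + distance(phi1, phi2)
        worst = max(worst, lhs - rhs)
    return worst


if __name__ == "__main__":
    # A non-positive result means no sampled configuration violates the bound.
    assert max_violation() <= 1e-12
```

In this abelian setting the bound follows directly from the pointwise inequality | |a| − |b| | ≤ |a − b| and Cauchy-Schwarz, so the check is a consistency test rather than evidence for the general statement.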
(1) First assume that ω_ψ is faithful, and without loss of generality that the right side of (139) is not infinity. Then the standard theory developed in [5] applies, which we use heavily in this proof. It is no restriction to impose further that |ψ⟩ ∈ P_M, for if not we may pass to the GNS representation of ω_ψ and work with the natural cone P_M, the closure of Δ_ψ^{1/4} M_+ |ψ⟩. We use the notation Denote the dual of a Hölder index p by p′, defined so that 1/p + 1/p′ = 1. [5] have shown that the non-commutative L_p(M, ψ)-norm of a vector |ζ⟩ relative to |ψ⟩ can be characterized by (dropping the superscript on the norm) They have furthermore shown that when p ≥ 2, any vector |ζ⟩ ∈ L_p(M, ψ) has a unique generalized polar decomposition, i.e. can be written in the form |ζ⟩ = uΔ^{1/p}_{φ,ψ}|ψ⟩, where u is a unitary or partial isometry from M. Furthermore, they show that ζ p ,ψ = φ p . We may thus choose a u and a normalized |φ⟩, so that perhaps up to a small error which we can let go to zero in the end. Now we define p_θ as in the statement, so that and we define an auxiliary function f(z) by where α_θ(t), β_θ(t) are as in Lemma 9.
Applying this to g = f gives the statement of the theorem.
(2) Let us now extend this result to the case where ω_ψ is not faithful for M when p_0 = 2, p_1 = 1. In this case, the norms under the integral (139) become the ordinary Hilbert space norm (p_0 = 2) and the fidelity (p_1 = 1). We employ the following common trick where we use case (1) above for the modified functional where |η⟩ is a cyclic and separating vector, which exists since M is assumed to be sigma-finite, and where |ψ_ε⟩ is in the natural cone. Then |ψ_ε⟩ in (236) with 0 < ε < 1 now gives a faithful state, to which the standard theory of [5] applies.
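The role of the endpoints p_0 = 2 and p_1 = 1 can be made concrete with a small bookkeeping sketch. It assumes the usual complex-interpolation convention on the strip 0 ≤ Re z ≤ 1/2, namely 1/p_θ = (1 − 2θ)/p_0 + 2θ/p_1; this convention is an assumption on our part, chosen because it reproduces the value p_θ = 2/(2θ + 1) quoted in Appendix E. The function names are hypothetical.

```python
import math
from fractions import Fraction


def dual_index(p):
    """Hölder dual p' defined by 1/p + 1/p' = 1 (returns inf for p = 1)."""
    p = Fraction(p)
    if p < 1:
        raise ValueError("Hölder index must satisfy p >= 1")
    if p == 1:
        return math.inf
    return 1 / (1 - 1 / p)


def p_theta(theta, p0=Fraction(2), p1=Fraction(1)):
    """Interpolated index, assuming 1/p_theta = (1 - 2*theta)/p0 + 2*theta/p1."""
    theta = Fraction(theta)
    if not 0 <= theta <= Fraction(1, 2):
        raise ValueError("theta must lie in [0, 1/2]")
    return 1 / ((1 - 2 * theta) / p0 + 2 * theta / p1)


def matches_closed_form(theta):
    """Compare with the closed form 2 / (2*theta + 1) for p0 = 2, p1 = 1."""
    theta = Fraction(theta)
    return p_theta(theta) == 2 / (2 * theta + 1)


if __name__ == "__main__":
    assert p_theta(0) == 2                 # Hilbert space norm endpoint
    assert p_theta(Fraction(1, 2)) == 1    # fidelity endpoint
    assert all(matches_closed_form(Fraction(k, 20)) for k in range(11))
    assert dual_index(2) == 2
    assert dual_index(p_theta(Fraction(1, 4))) == 4
```

Exact rational arithmetic is used so that the comparison with the closed form is an identity check rather than a floating-point approximation.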

E. An Alternative Strategy for Proving Theorem 1
It is conceivable that our approach based on the vector (84) can be modified by choosing other interpolating vectors, and this may lead to new insights relating the argument to somewhat different entropic quantities. Here we sketch an approach which seems to avoid the use of L_p-norms, thus leading potentially to a substantial simplification. To this end, we consider now a vector |Ξ_ψ(z, φ)⟩ = Δ^z_{ψ,ξ;B} Δ^{−z}_{η,ξ;B} Δ^z_{η,φ;A} |ψ⟩, (239) similar to vectors considered in [20]. Here, |ξ⟩ is some vector such that π_B(ξ) ⊃ π_B(ψ), and in this appendix we find it more convenient to think of B as defined on the same Hilbert space as A. The vector (239) does not depend on the precise choice of |ξ⟩ (but it does depend on the vector |η⟩ in the natural cone of A, although we suppress this).
The vector (239) is defined a priori only for imaginary z. But if we consider the set of states majorizing |ψ⟩, defined as C(ψ, A′) = {|φ⟩ ∈ H : ‖a′ψ‖ ≤ c_φ ‖a′φ‖ ∀a′ ∈ A′}, then for |φ⟩ in this dense linear subspace of H, it has an analytic continuation to the half strip S_{1/2} = {0 < Re z < 1/2} that is weakly continuous on the boundary. This can be demonstrated by the same type of argument as in [20], prop. 2.5, making repeated use of the following lemma by [20], lem. 2.1:
Lemma 13. Suppose |G(z)⟩ is a vector valued analytic function for z ∈ S_{1/2}, and A is a self-adjoint positive operator. Then A^z |G(z)⟩ is an analytic function of z ∈ S_{1/2} if ‖A^z G(z)‖ is bounded on the boundary of S_{1/2}.
Then, for imaginary z = it we get Δ^z_{η,φ;A}|ψ⟩ = u′(z)v(z)|ψ⟩, which has an analytic continuation to S_{1/2} as v(z)|ψ⟩ is analytic there by Tomita-Takesaki theory. One next applies the lemma with |G(z)⟩ = Δ^z_{η,φ;A}|ψ⟩ and A^z = Δ^{−z}_{η,ψ;B} (choosing |ξ⟩ = |ψ⟩ here). The conditions are verified using standard relations of relative Tomita-Takesaki theory as given e.g. in [5], app. C, such as (12), which is bounded for real t. On the other hand, at the lower boundary A^{it}|G(it)⟩ is bounded by definition. Continuing this type of argument gives the following lemma.
The relationship with other approaches can be seen through the quantity In the setup of finite-dimensional von Neumann subfactors described in Sect. 4.1, we can write If we take z = θ real then the infimum over τ_A (the density matrix representing |φ⟩) readily yields an L_p-norm for p_θ = 2/(2θ + 1), We recognize this again as (169) corresponding to an expression also studied by [41]. The strategy is now the following. First, Lemma 12 also applies to the holomorphic Hilbert-space valued function |Ξ_ψ(z, φ)⟩ (because z → ln ‖Ξ_ψ(z, φ)‖ is subharmonic). So we have for 0 < θ < 1/2 that Since ∀t ∈ R, ‖Ξ_ψ(it, φ)‖ ≤ 1, α_θ(t) > 0, we can drop the first term under the integral. Then, we want to divide by θ and take the infimum over |φ⟩ ∈ C(A′, ψ), ‖φ‖ = 1. The next lemma will allow us to deal with the second term under the integral. Since |φ⟩ ∈ C(A′, ψ), we can write |ψ⟩ = a|φ⟩, where a ∈ A is self-adjoint, see [54], 5.21. Then: for all |φ⟩ ∈ C(A′, ψ).
Proof. On the left hand side of (245), we may choose |ξ⟩ = |η⟩. It is most convenient to work with state vectors in the natural cones; for notation see (48). Define b = Δ (The choice π A (ψ) = J 2 A ≤ π_A(φ) ≤ π_A(η) = 1 guarantees that the supports of vectors on A are multiplied in the correct way, so we keep the π_A's implicit in the derivation; everything should be understood to happen on π_A(ψ).) In the derivation we used the definition of the Petz recovery map, see e.g. [47] proof of prop. 8.4, such that ∀ a ∈ A, b ∈ B, Thus, we have (245). We obtain the claim in the lemma by taking the infimum in the set C(A′, ψ) on both sides of (245) and using (60).
The lemma and concavity of ln allow us to conclude from (244) that where |ζ_S⟩ is a vector representative of the state ω_ψ ∘ ι ∘ α_S on A and α_S is the recovery channel (45). Note that taking the infimum over |φ⟩ ∈ C(ψ, A′) on the right side yields 2 ln F(ω_ψ, ω_ψ ∘ ι ∘ α_S). On the other hand, it is plausible to expect that for the term on the left side of (244), we obtain inf φ∈C(A′,ψ) If this latter equation could be demonstrated, which is possible at a formal level (see footnote 7), then it is clear that we would obtain an alternative proof of Theorem 2 (though not of Theorem 1). When attempting to demonstrate (249) (or equivalently (250)), one faces technical difficulties similar to those in the proof strategy described in the body of the text. There, we were forced to introduce suitably regularized versions |ψ_P⟩ of the vector in question. Thus, while the strategy discussed in this appendix nicely avoids the use of L_p-spaces up to a certain point, it is not clear whether their use can be avoided altogether. We think that this would be an interesting research project.