Channel Divergences and Complexity in Algebraic QFT

We consider a notion of divergence between quantum channels in relativistic continuum quantum field theory (QFT) that is derived from the Belavkin–Staszewski relative entropy and the concept of bimodules for general von Neumann algebras. Key properties of the divergence that we shall prove based on a new variational formulation of that relative entropy are the subadditivity under composition and additivity under the tensor product between channels. Based on these properties, we propose to use the channel divergence relative to the trivial (identity-) channel as a novel measure of complexity. Using the properties of our channel divergence, we prove in the generality necessary for the algebras in QFT that the corresponding complexity has several reasonable properties: (i) the complexity of a composite channel is not larger than the sum of its parts, (ii) it is additive for channels localized in spacelike separated regions, (iii) it is convex, (iv) for an N-ary measurement channel it is $\log N$, (v) for a conditional expectation associated with an inclusion of QFTs with finite Jones index it is given by $\log(\text{Jones Index})$.


Introduction
A problem of both theoretical and practical interest in quantum information theory is to assess the "complexity" of a quantum state or operation. A natural approach is to take as a measure of complexity the minimum number of operations from an underlying set considered as "basic" [1][2][3]. Typical results in this context include bounds on the growth of complexity under time evolution, see e.g. [4,5]. There are also proposals in the context of the AdS-CFT correspondence, linking the growth of complexity of a state in the boundary quantum field theory (QFT) to various geometric quantities in the bulk, see e.g. [6][7][8].
One may ask how to define a notion of complexity directly in a relativistic continuum QFT without reference to holographic ideas. In QFT, one faces the immediate problem of identifying a suitable set of basic operations relative to which the complexity of a composite operation is to be assessed. If one wants to maintain a close analogy to ideas such as [4,5], it appears that one would have to specify a preferred set of local quantum field operators, possibly within some lattice regularization of the theory. For instance, for Gaussian field theories, concrete proposals include e.g. [9,10], whereas e.g. [11][12][13] emphasize the role of symmetry operations, especially in theories with very large symmetry algebras. For sufficiently generic QFTs it seems to us that both the operator basis and the lattice regularization would be highly non-unique.
One approach to this issue is to take a broader view of the problem, departing from the notion of basic operation and focussing attention instead on a suitable notion of "distance", $D(S\|T)$, between two channels S, T. Complexity would then be defined as $c(T) = D(\mathrm{id}\|T)$, the distance to the trivial (identity) channel. Of course, one would like specific properties from D to connect to the idea of complexity. Natural requirements would be:
• Subadditivity: $c(T_1 \circ T_2) \le c(T_1) + c(T_2)$, expressing that the complexity of a composite channel is not bigger than the sum of its parts. In particular, for a 1-parameter (Markov-) semi-group $T_t$, $t \ge 0$, of channels, we automatically get at most linear growth in time, $c(T_{N t_0}) \le C_0 N$.
• Locality: $c(T_1 \circ T_2) = c(T_1) + c(T_2)$ if $T_1$ and $T_2$ are localized in spacelike related parts of the system. This expresses that the complexity c respects Einstein causality/locality.
• Convexity: Thinking about performing operations $T_i$ randomly with some probabilities $p_i$, it is natural to ask that $c(\sum_i p_i T_i) \le \sum_i p_i\, c(T_i)$.
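The linear growth bound quoted under subadditivity follows from a short induction. The following is a sketch that assumes only the semigroup law $T_{s+t} = T_s \circ T_t$ and subadditivity of $c$; identifying the constant as $C_0 = c(T_{t_0})$ is our reading, not a statement from the text.

```latex
% Linear growth of complexity along a one-parameter semigroup (sketch).
% Assumptions: T_{s+t} = T_s \circ T_t, and c(T_1 \circ T_2) \le c(T_1) + c(T_2).
\begin{align*}
c(T_{N t_0}) &= c\bigl(T_{t_0} \circ T_{(N-1) t_0}\bigr) \\
             &\le c(T_{t_0}) + c\bigl(T_{(N-1) t_0}\bigr) \\
             &\le \dots \le N\, c(T_{t_0}) =: C_0\, N .
\end{align*}
```

The bound is informative only when $c(T_{t_0}) < \infty$ for the chosen time step $t_0$.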
One way to obtain a notion of channel divergence, hence c, is to start with a corresponding divergence $D(\phi\|\psi)$ in the ordinary sense (see e.g. [14]) between quantum states ϕ, ψ (see footnote 1), by considering how much the actions of T and S on a state can deviate as quantified by this divergence. With this idea in mind, a naive guess might be to consider $\sup_\psi D(\psi \circ S\,\|\,\psi \circ T)$, where the maximization is over all normalized states of the system and $\psi \circ T$ is the action of the channel on the state ψ, viewed in this paper as an expectation functional on observables, see footnote 1. However, as is well-known, this notion is actually inadequate for quantum systems because one can obtain refined information about the action of channels by coupling the systems in question to an ancillary system and considering states that have a suitably engineered entanglement between the original system and the ancillary system. So one should define instead (see footnote 2)
$$D(S\|T) = \sup_{\psi} D\bigl(\psi \circ (S \otimes \mathrm{id}_A)\,\big\|\,\psi \circ (T \otimes \mathrm{id}_A)\bigr),$$
where the maximization is now over all states of the original observable algebra M tensored with the ancillary algebra A. This is the definition that we shall also adopt, up to some technical caveats related to the fact that we will be dealing with von Neumann algebras of a sufficiently general type as appropriate for QFT: in such a setting, it is most natural to model the ancillary system by another von Neumann algebra, A, and it does not seem natural to restrict the nature of that system, e.g. by imposing that A should have a particular type such as $I_n$. Then we must also have an enlarged Hilbert space on which both the original von Neumann algebras as well as the ancillary algebra A act, i.e. we must consider bi-modules of von Neumann algebras, see e.g. [16,17] and references therein.
Of course the main question is what D we should start from. One possibility might be a geometric approach along the lines of [2,3]. In [18], on the other hand, the authors propose to use a particular quantum version [19] of the classical "Wasserstein-distance", see e.g. [20], and derive several convincing properties of the corresponding notion of complexity, including the ones listed above. The quantum Wasserstein distance as defined by [19] is for finite dimensional systems with Hilbert space of the form $(\mathbb{C}^d)^{\otimes N}$. While it may be possible to generalize it to von Neumann algebras of type III appearing in QFT [21], we proceed differently here and work with the so-called Belavkin-Staszewski (BS) divergence [22] $D_{BS}$. That divergence has been considered recently in the context of channel discrimination by [24] and is
$$D_{BS}(\phi\|\psi) = \mathrm{Tr}\bigl[\rho_\phi \log\bigl(\rho_\phi^{1/2}\rho_\psi^{-1}\rho_\phi^{1/2}\bigr)\bigr]$$
for matrix algebras. Here, $\rho_\psi$ is the density matrix representing the expectation functional ψ, i.e. $\psi(m) = \mathrm{Tr}(m\rho_\psi)$ for all m ∈ M. A generalization to arbitrary von Neumann algebras is possible [25,26]. Our reason for considering $D_{BS}$ is that [27] (see also [28]) have shown in the finite dimensional setting that it gives rise to a channel divergence with a subadditivity property under composition and an additivity property under the tensor product of channels. This would not be the case for other well-known divergences such as, say, the more commonly used Araki-Umegaki relative entropy [29]. The classic works [30,31], while somewhat similar in spirit to ours, use that relative entropy to define an entropy-like quantity for the inclusion channel ι : N → M between two von Neumann algebras possessing a corresponding conditional expectation E : M → N. Though this quantity has many useful properties, it does not seem to have the above-mentioned (sub-)additivity properties (at least not in total generality, see Remark 4.5 of [31]).
Footnote 1: In this paper we follow the conventions in operator algebras that a state ψ is a positive functional on the observable algebra. For matrix algebras, $\psi(m) = \mathrm{Tr}(m\rho_\psi)$, where $\rho_\psi$ is the corresponding density matrix, see Sect. 2.1 for our conventions.
Footnote 2: The role of ancillary systems for complexity has also been discussed in a different context e.g. by [15].
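For orientation, the finite-dimensional BS divergence can be compared with the Umegaki relative entropy in a few lines. This is a sketch for invertible density matrices; the symbols $\rho$, $\sigma$, $p_i$, $q_i$ and $D_U$ are our notation, and the inequality in the last display is the standard comparison known from the finite-dimensional literature, not a result of this paper.

```latex
% Finite-dimensional BS divergence for invertible density matrices rho, sigma
\[
D_{BS}(\rho\|\sigma) = \operatorname{Tr}\bigl[\rho \log\bigl(\rho^{1/2}\sigma^{-1}\rho^{1/2}\bigr)\bigr].
\]
% Commuting case: rho = diag(p_1,...,p_d), sigma = diag(q_1,...,q_d).
% Then rho^{1/2} sigma^{-1} rho^{1/2} = diag(p_i / q_i), and both divergences
% reduce to the classical Kullback-Leibler divergence:
\[
D_{BS}(\rho\|\sigma) = \sum_{i=1}^{d} p_i \log\frac{p_i}{q_i} = D_U(\rho\|\sigma),
\qquad
D_U(\rho\|\sigma) := \operatorname{Tr}\bigl[\rho(\log\rho - \log\sigma)\bigr].
\]
% Non-commuting case: the two differ in general, with
\[
D_U(\rho\|\sigma) \le D_{BS}(\rho\|\sigma).
\]
```

So any complexity built from $D_{BS}$ is at least as large as the analogous quantity built from the Umegaki relative entropy evaluated on the same pair of states.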
In this work, we analyze the channel divergence based on the BS divergence in the context of general von Neumann algebras, and prove that the corresponding notion of complexity c has the above properties in QFT. While we have to leave for future investigations the question how our c is related to conventional notions related to computational cost à la [2,3], or even to holographic proposals such as [6][7][8], we prove a number of further properties of our complexity c. Here, by the index we mean the Jones index [32] of the inclusion $\rho(A(O)) \subset A(O)$, and $d_\rho$ is the statistical (or "quantum-") dimension of the sector.
Items (1), (2) are basically negative results, but perhaps not totally unreasonable if we remember that any local operation in a continuum QFT (i.e. an operation in a finite spacetime region) must still involve an infinite number of degrees of freedom. The channels in items (3), (4) are conditional expectations. This suggests that these are to be regarded as the basic operations in QFT. Particular measurements in (3) implementing the idea of "setting individual q-bits" can be constructed trivially as follows. Imagine the QFT has a "basic" real scalar field φ and consider a cube of side-length δ in a time slice. Let f be a non-negative test function supported in the cube, let $S = \int \varphi(0, x) f(x)\, d^{n-1}x$, where n is the dimension of spacetime, and let $p_\pm$ be the projectors corresponding to a positive/negative measurement of S. Shifting the cube periodically in the n − 1 spatial directions we can obtain a finite lattice $\Lambda$ with corresponding projections $p_{x,\sigma}$, $\sigma = \pm$, $x \in \Lambda$, associated with each point x of the (dual) lattice. Then we can define projections $e(\{\sigma\}) = \prod_{x \in \Lambda} p_{x,\sigma(x)}$, each corresponding to measuring a particular lattice configuration $\{\sigma\}$.
The complexity of the corresponding measurement channel is clearly $\log 2^{|\Lambda|} = |\Lambda| \log 2$, where $|\Lambda|$ is the number of lattice points.
As an example of item (4), consider the QFT of an N-component free complex Klein-Gordon quantum field $\varphi_I(x)$, I = 1, . . ., N. We consider as observables the SU(N) singlet operators (gauge invariant observables) under the SU(N)-symmetry. Consider a state Ψ in the Hilbert space which is in some non-trivial representation R of SU(N). Then Ψ cannot be generated from the vacuum Ω by the action of any charge neutral operator a, so the representation of charge neutral operators built on Ψ is not unitarily equivalent to the vacuum representation. In fact, by DHR theory [34,35], there exists an endomorphism ρ of the local algebra generated by SU(N) singlet operators such that
$$\langle \Omega, \rho(a)\Omega\rangle = \langle \Psi, a\Psi\rangle \quad \text{for all } SU(N) \text{ singlet operators } a, \qquad (5)$$
and ρ implements the charged sector with representation R. The complexity of the associated conditional expectation is the logarithm of the Jones index [32], the smallest non-trivial value of which is 2, realized e.g. by the sector ρ of the (4, 3) minimal (Ising) model with quantum dimension $d_\rho = \sqrt{2}$. We conjecture that for any localized channel T
$$\text{either } c(T) \ge \log 2 \text{ or } T = \mathrm{id} \quad \text{(conjecture)}, \qquad (6)$$
which is reminiscent of the Landauer bound [17,36].
This paper is organized as follows. In Sect. 2, we first recall the theory of f-divergences and operator means for states on von Neumann algebras and introduce our main technical tool, a variational characterization of $D_{BS}$ (Proposition 2.17). In Sect. 3 we introduce $D_{BS}$ for channels of von Neumann algebras of general type, and prove some basic properties. In Sect. 4, we apply these results to QFT.
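The two examples of the Introduction can be made numerically concrete. The sketch below assumes that the $2^{|\Lambda|}$ projections $e(\{\sigma\})$ form an $N$-ary measurement in the sense of item (iv) of the abstract, and that the Jones index of the inclusion $\rho(A(O)) \subset A(O)$ equals $d_\rho^2$; the three-cube lattice is a hypothetical instance chosen for illustration.

```latex
% Lattice measurement: one binary outcome sigma(x) = +/- per lattice point x.
\[
N = \#\{\sigma : \Lambda \to \{+,-\}\} = 2^{|\Lambda|},
\qquad
c = \log N = |\Lambda| \log 2 .
\]
% Hypothetical instance: |Lambda| = 3 cubes give 2^3 = 8 configurations,
% hence c = 3 log 2; each additional "q-bit" contributes log 2.
%
% Ising sector: statistical dimension d_rho = sqrt(2).
\[
[A(O) : \rho(A(O))] = d_\rho^{\,2} = 2,
\qquad
\log(\text{Jones index}) = 2 \log d_\rho = \log 2 .
\]
```

Under these assumptions both examples meet the conjectured lower bound $\log 2$ with equality in the smallest case: one lattice site, respectively the Ising sector.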
All Hilbert spaces appearing in this paper are assumed to be separable. The squared norm $\|m\|^2$ of an operator m ∈ M is defined to be the supremum of the spectrum $\sigma(mm^*)$ of the positive operator $mm^*$. The subset of all such positive operators is denoted by $M_+$ (positive part). A linear functional ψ is called normal if it is ultra-weakly continuous, and a positive linear functional ψ is called faithful if $\psi(mm^*) = 0 \Rightarrow m = 0$. The set of normal states is also denoted by $M_{*,+}$. The existence of a normal faithful positive linear functional is guaranteed since we are assuming that H is separable. On a matrix algebra every state is of the form
$$\psi(m) = \mathrm{Tr}(m\rho_\psi) \qquad (7)$$
for a unique density matrix $\rho_\psi$.
• Channels: Generalizing the notion of state, a channel T : M → N is a normal, positive, unital (meaning T(1) = 1) linear map which is also completely positive, meaning that $T \otimes \mathrm{id} : M \odot B(K) \to N \odot B(K)$ is positive, where $\odot$ means the algebraic tensor product and where K is any Hilbert space. If ψ is a state on N, then $\psi \circ T(m) := \psi(T(m))$ is a state on M, and $\psi \mapsto \psi \circ T$ corresponds to the dual action of channels on states (Schrödinger picture). In much of the quantum information theory literature, the Schrödinger picture is considered, but of course this is just a matter of convention. For finite dimensional von Neumann algebras N, M, the action of T in the Schrödinger picture may also be thought of as an action $T^+$ on density matrices (7),
$$\rho_{\psi \circ T} = T^+(\rho_\psi).$$
Then $T^+$ is completely positive and trace preserving (corresponding to T(1) = 1).
• Standard form: A vector Ω ∈ H is called cyclic if MΩ is dense in H in the strong topology, and it is called separating if mΩ = 0 ⇒ m = 0. Such a representation on H of M and vector Ω can always be obtained by the GNS-representation of a faithful normal state ω. A cyclic and separating vector is also called standard and a representation of M on a Hilbert space with standard vector is called a standard representation. Associated with Ω is an anti-linear involution J on H such that JΩ = Ω and JMJ = M′, called the modular conjugation. The closure of the set of vectors of the form aJaJΩ, a ∈ M, is called the "natural cone".
• Conditional expectations: The index $\lambda_E$ of a conditional expectation E : M → N is the infimum over all positive real numbers λ such that $E(mm^*) \ge \lambda^{-1} mm^*$ for all m ∈ M.
• Jones-index: Assume that N, M are factors such that there exists a conditional expectation E : M → N. If $\lambda_E < \infty$, there exists a unique $E_0$ called "minimal conditional expectation" [38] such that $\lambda_{E_0}$ is minimal, and in such a case $\lambda_{E_0} =: [M : N]$ is called the Jones-Kosaki index [32,39] of the inclusion. Otherwise we set $[M : N] = \infty$.
• Non-commutative $L^p$ spaces: These are Banach spaces interpolating between the space of normal functionals on M and M itself. They are defined relative to some standard vector Ω and denoted as $L^p(M, \Omega)$, see [40,41]. One has $L^2(M, \Omega) = H$. Beyond this, we will only need $L^\infty(M, \Omega)$, which is a linear subspace of H. We will mainly use the following characterization of this space [40]: As a vector space $L^\infty(M, \Omega) = M\Omega$. The Banach space norm is $\|\xi\|_{L^\infty(M,\Omega)} = \|m\|$, where m ∈ M is the unique element such that ξ = mΩ.
• Opposite algebra: The opposite algebra $M^{\mathrm{op}}$ of a von Neumann algebra M is identical as a vector space with *-operation, but has the reversed product $m^{\mathrm{op}} n^{\mathrm{op}} = (nm)^{\mathrm{op}}$.
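For matrix algebras, the duality between the Heisenberg-picture channel $T$ and its Schrödinger-picture counterpart $T^+$ described under "Channels" can be spelled out with Kraus operators. The operators $K_i$ are an assumption of this sketch: they exist for completely positive maps between matrix algebras by the Kraus/Choi theorem, but do not appear in the text.

```latex
% T : M -> N unital and completely positive between matrix algebras,
% m in M, rho a density matrix representing a state on N.
\[
T(m) = \sum_i K_i^{*}\, m\, K_i,
\qquad
\sum_i K_i^{*} K_i = 1 \quad (\text{equivalently } T(1) = 1).
\]
% Dual (Schroedinger picture) action on density matrices:
\[
T^{+}(\rho) = \sum_i K_i\, \rho\, K_i^{*},
\qquad
\operatorname{Tr} T^{+}(\rho) = \operatorname{Tr}\Bigl(\rho \sum_i K_i^{*} K_i\Bigr) = \operatorname{Tr}\rho .
\]
% Duality relation, using cyclicity of the trace term by term:
\[
\operatorname{Tr}\bigl(T^{+}(\rho)\, m\bigr)
 = \sum_i \operatorname{Tr}\bigl(\rho\, K_i^{*} m K_i\bigr)
 = \operatorname{Tr}\bigl(\rho\, T(m)\bigr).
\]
```

The last line is the statement $(\psi \circ T)(m) = \mathrm{Tr}(m\, T^+(\rho_\psi))$, and the middle line shows why unitality of $T$ corresponds to trace preservation of $T^+$.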

Maximal f-divergence for bounded operators.
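See [25, 42-44] as general references. Central to the concept of operator mean and the divergences studied in this paper are the notions of operator monotone and operator convex functions.
Definition 2.1. Let I ⊂ R be an interval. f : I → R is said to be operator monotone if f(A) ≤ f(B) whenever A, B are self adjoint operators on a Hilbert space such that A ≤ B and that their spectra satisfy σ(A), σ(B) ⊂ I; it is said to be operator convex if $f(\lambda A + (1-\lambda)B) \le \lambda f(A) + (1-\lambda) f(B)$ for all such A, B (not necessarily ordered) and all λ ∈ [0, 1], and operator concave if −f is operator convex.
For a continuous function f : [0, $t_0$) → R, it is known that (f is operator convex and f(0) ≤ 0) is equivalent to (f(t)/t is operator monotone on (0, $t_0$)). Furthermore, if f : [0, ∞) → R is operator monotone, then it is also operator concave. While the converse is not true, it is the case that (f operator concave and f(t) ≥ 0 for all t ∈ [0, ∞)) implies (f is operator monotone on [0, ∞)).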
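The following well-known representation (9) allows one to reduce many constructions involving operator monotone functions to certain weighted averages of a special operator monotone function. Consider a continuous operator monotone function f on [0, ∞). There exist constants a, b and a unique finite positive Radon measure μ on [0, ∞), such that
$$f(t) = a + bt + \int_{(0,\infty)} \frac{(1+s)\,t}{t+s}\, d\mu(s). \qquad (9)$$
The Kubo-Ando theorem establishes a one-to-one correspondence between operator connections and non-negative operator monotone functions on [0, ∞), see [45, Theorems 3.3, 3.4]. The isomorphism is provided by σ → f, where $f(t) I_H := I_H\, \sigma\, (t I_H)$. Its inverse f → σ is defined by taking the integral expression (9) of a non-negative operator monotone function f on [0, ∞), and then defining the corresponding σ as
$$A \sigma B = aA + bB + \int_{(0,\infty)} \frac{1+s}{s}\, \bigl((sA) : B\bigr)\, d\mu(s). \qquad (10)$$
Here, A : B is the parallel sum operator connection which is defined as the bounded quadratic form (see e.g. [43, Lemma 3.1.5])
$$\langle \xi, (A : B)\xi \rangle = \inf\{\langle \eta, A\eta \rangle + \langle \zeta, B\zeta \rangle : \eta + \zeta = \xi\}.$$
If A and B are positive operators with bounded inverses, then
$$A : B = (A^{-1} + B^{-1})^{-1}.$$
For the power functions $f_\alpha(t) = t^\alpha$, α ∈ (0, 1), one has $t^\alpha = \frac{\sin \pi\alpha}{\pi} \int_0^\infty \frac{t}{t+s}\, s^{\alpha-1}\, ds$. The corresponding measure $\mu_\alpha$ and constants $a_\alpha$, $b_\alpha$ as in (9) are therefore
$$d\mu_\alpha(s) = \frac{\sin \pi\alpha}{\pi}\, \frac{s^{\alpha-1}}{1+s}\, ds, \qquad a_\alpha = b_\alpha = 0.$$
Particular examples are the left- and right trivial means (for α = 0, 1) and the geometric mean for which α = 1/2.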
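A quick consistency check of the integral representation (9) against the Kubo-Ando connection can be done for scalar arguments. The sketch assumes the connection has the form $A \sigma B = aA + bB + \int_{(0,\infty)} \frac{1+s}{s}\,((sA):B)\, d\mu(s)$ and uses the scalar parallel sum $a : b = ab/(a+b)$; both are the standard forms specialized to positive numbers.

```latex
% Parallel sum of positive numbers a, b > 0
\[
a : b = \bigl(a^{-1} + b^{-1}\bigr)^{-1} = \frac{ab}{a+b}.
\]
% Insert A = 1, B = t > 0 into the integrand of the connection formula
\[
\frac{1+s}{s}\,\bigl((s \cdot 1) : t\bigr)
 = \frac{1+s}{s} \cdot \frac{s\,t}{s+t}
 = \frac{(1+s)\,t}{t+s}.
\]
% Hence the connection evaluated on (1, t) reproduces f
\[
1\,\sigma\,t = a + b\,t + \int_{(0,\infty)} \frac{(1+s)\,t}{t+s}\, d\mu(s) = f(t).
\]
```

This is the scalar shadow of the correspondence $f(t) I_H = I_H\, \sigma\, (t I_H)$ between connections and operator monotone functions.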
Example 2.7 (Logarithm). The logarithm f(t) = log t is operator monotone on (0, ∞) and formally has a = −∞, b = 0 and $d\mu_{\log}(t) = t^{-1}(1+t)^{-1}\, dt$. We will typically consider the approximation $f_n(t) = \log(t + \frac{1}{n})$.
Consider two bounded positive operators A, B such that B ≤ λA for some λ < ∞. Then $f(A^{-1/2} B A^{-1/2}) \le f(\lambda) I$ is a bounded operator and the Kubo-Ando mean σ corresponding to the operator monotone function f can also be expressed as [45, Theorem 3.3]
$$A \sigma B = A^{1/2} f\bigl(A^{-1/2} B A^{-1/2}\bigr) A^{1/2},$$
as one may see using the integral representations (9), (10) as well as the expression for parallel sum (12). The following divergences first appeared in [46] and were developed further in [47,48].
Definition 2.8. Consider a non-negative operator monotone function f : [0, ∞) → [0, ∞) characterized by (9) and positive trace class operators A, B such that B ≤ λA for some λ > 0. Then the "maximal quantum f-divergence" of A with respect to B is defined by where σ is the operator connection corresponding to f.
Remark 2.9. For general positive trace class operators A, B for which B ≤ λA possibly does not hold for any λ < ∞, one defines Here C is any bounded positive operator such that $\lambda^{-1}(A + B) \le C \le \lambda(A + B)$ for some λ < ∞. By the monotonicity property of operator connections, the limit exists because the sequence is monotone decreasing. The limit is independent of the particular choice of C as a special case of Lemma 2.12 below.

Maximal f-divergence for von Neumann algebras.
Operator connections and maximal f-divergences can be generalized from bounded operators to more general settings such as to suitable classes of unbounded positive quadratic forms [49][50][51]. In this work, we will mainly be interested in the notion of operator connection and maximal f-divergence between two positive normal functionals ϕ, ψ on a von Neumann algebra M. This setting is investigated in great detail in [25,26] to which we refer as general references. Based on the results [25, Appendix D] one can for instance obtain quite easily a variational characterization of the maximal f-divergence which will be the basis of most developments in this work. The starting point in the von Neumann algebra setting is the Connes cocycle together with the following well-known result [52], see e.g. [40] for the definitions of the modular operators $\Delta_\psi$ and Connes cocycles $[D\psi : D\phi]_t$.
It therefore makes sense to make the following definition [26].
Definition 2.13. Let ϕ, ψ be normal, positive functionals on M, and let f : [0, ∞) → [0, ∞) be an operator monotone function. The maximal quantum f-divergence of ϕ with respect to ψ is defined as where φ is any positive normal functional on M satisfying φ ∼ ϕ + ψ. Note that we may choose φ = ϕ + ψ.
Remark 2.15. The attentive reader will notice that compared to [26], we require f in Definition 2.11 to be operator monotone rather than operator convex, the order of the states is reversed and we have a logarithm. The presence of the logarithm is just for convenience to make the entropy additive under tensor products. If we ignore the logarithm then our definition reduces to $-\hat{S}_{-f}(\psi\|\phi)$ of [26] for non-negative operator monotone functions, noting that −f is operator convex. Of course, the definition of [26] works for the larger class of all operator convex functions.
In the present work, a variational formula for $D_f$ and the BS divergence $D_{BS}$ will take center stage and from this, many of these properties could be seen directly in retrospect. First, one defines an analogue of the parallel sum (12) for two normal positive functionals ϕ, ψ on the von Neumann algebra M by (z ∈ M)
$$(\phi : \psi)(zz^*) = \inf\{\phi(xx^*) + \psi(yy^*) : x, y \in M,\ x + y = z\}.$$
Then ϕ : ψ is a positive normal functional on $M_+$ which is extended to all of M by writing a general element as a difference of elements from $M_+$. Using the notion of parallel sum, one can next define a notion of operator mean ϕσψ associated with an operator monotone function f : [0, ∞) → [0, ∞) with representation (9) between two positive normal functionals on M by an analogue of the formula (10):
$$\phi \sigma \psi = a\phi + b\psi + \int_{(0,\infty)} \frac{1+s}{s}\, \bigl((s\phi) : \psi\bigr)\, d\mu(s),$$
where a, b, dμ with a, b < ∞ correspond to the operator monotone function f : [0, ∞) → [0, ∞) as in (9). By combining [25, Theorem D.7, D.8, D.10], it follows that From this relation and (20) one can obtain the variational formula with ease, see also [44]. Then we have where the infimum is taken over all step functions x : (0, ∞) → M with finite range such that $x_t = 1$ for sufficiently small t, such that $x_t = 0$ for sufficiently large t, and where $y_t := 1 - x_t$.
The above proposition does not cover the operator monotone function f(t) = log(t) (formally a = −∞ in the representation (9)). Since this case underlies the Belavkin-Staszewski (BS) divergence and is particularly interesting for us, we treat it explicitly. Consider first a pair of positive normal functionals on M such that ϕ ∼ ψ. Define the BS divergence as For a general pair of positive normal functionals such that ϕ ∼ ψ does not hold, we define $D_{BS}(\phi\|\psi)$ analogously to Definition 2.13. The BS divergence can be seen as the limit α → 1 of the maximal geometric α-divergence corresponding to $f_\alpha(t) = t^\alpha$, α ∈ (0, 1).
Proposition 2.17. We have where the first sup is taken over n ∈ ℕ, while the second is over finite range step functions x on $(\frac{1}{n}, \infty)$ as in Proposition 2.16.
Proof. First we assume ϕ ∼ ψ and consider the approximating sequence $f_n(t) := \log(t + \frac{1}{n})$, t ≥ 0, of operator monotone functions, which have an integral representation of the form (9). The spectral theorem implies that the operator in Lemma 2.10 has a spectral representation where λ < ∞. Thus as $f_n(t) \downarrow \log(t)$ for all t > 0, by the monotone convergence theorem we have that $F_n(\phi\|\psi) \uparrow D_{BS}(\phi\|\psi)$. On the other hand, the same argument of Proposition 2.16 applies to $F_n$ (compare the definition of $F_n$ with Definition 2.11), i.e.
Here the sup is, as usual, over finite range step-functions.Thus, the claim follows.
Next, let ϕ, ψ be an arbitrary pair of normal and positive functionals on M. Following Definition 2.13, we notice that as $D_{BS}(\phi$ Using the variational formula (26) for $F_n(\phi + \varepsilon\varphi\,\|\,\psi + \varepsilon\varphi)$, since we may take φ = ϕ + ψ, Therefore The proof is completed by combining the inequalities (27), (30).
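The monotone convergence step in the proof above rests on elementary properties of the regularized logarithms $f_n(t) = \log(t + \frac{1}{n})$, which can be checked directly for $t > 0$. This is a sketch of the scalar facts only; how they translate into the monotonicity of $F_n$ depends on the sign convention in the definition of $F_n$, which is not reproduced here.

```latex
% Pointwise behaviour of f_n(t) = log(t + 1/n) for t > 0
\begin{align*}
f_n(t) - f_{n+1}(t) &= \log\frac{t + \tfrac{1}{n}}{t + \tfrac{1}{n+1}} > 0, \\
f_n(t) - \log t     &= \log\Bigl(1 + \frac{1}{n\,t}\Bigr) \xrightarrow[n \to \infty]{} 0 .
\end{align*}
% At the endpoint t = 0 each f_n is finite, unlike the logarithm itself:
\[
f_n(0) = -\log n .
\]
```

So the sequence $f_n$ decreases pointwise to $\log$ on $(0, \infty)$ while staying finite at $t = 0$, which is what makes each $f_n$ admissible where $\log$ itself is not.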
We now list some of the main properties of $D_f$ and $D_{BS}$. (Lower semi-continuity) Let $\phi_n$, $\psi_n$ be sequences of normal positive functionals converging pointwise to normal positive functionals ϕ, ψ as n → ∞. Then $D_f(\phi\|\psi) \le \liminf_n D_f(\phi_n\|\psi_n)$, and similarly for $D_{BS}$.
Remark 2.18. Item 2 is not provided in [25, Theorem 4.4, Proposition 4.5]; indeed it is presented as a conjecture for general von Neumann algebras [25, Problem 4.13]. The variational expressions given in Propositions 2.17 and 2.16 provide an immediate proof of this for $D_f$ and $D_{BS}$, see for example [54] for the analogous argument for the (Araki) relative entropy.

Bimodules and f-Divergences for Channels
When it is clear from the context, we will denote a bimodule by the underlying Hilbert space H. For more details on bimodules see e.g. [16,17].
Remark 3.2. We will use the natural notation $n\xi m := \ell_H(n)\, r_H(m)\xi$, n ∈ N, m ∈ M, ξ ∈ H. An important case is when the bimodule Hilbert space is the identity bimodule $L^2(M, \Omega)$, which is the bimodule arising from a standard representation and standard vector Ω of M, so it is unique up to unitary equivalence. As a vector space, $L^2(M, \Omega)$ is realized up to unitary equivalence as the GNS Hilbert space H of some chosen faithful normal state ω with associated cyclic and separating GNS vector Ω. The right- and left action defining the M − M bimodule structure of $L^2(M, \Omega)$ are defined as
$$\ell(m)\xi = m\xi, \qquad r(m)\xi = J m^* J \xi,$$
where $J = J_\Omega$ is the modular conjugation associated with Ω that sends M anti-unitarily to M′.
The following proposition [17, Proposition 2.6] will be referenced below: and η is cyclic for $\ell_{H_T}(N) \vee r_{H_T}(M)$. Moreover, such a bimodule and unit vector are unique up to unitary transformations.
In the following, f : [0, ∞) → [0, ∞) is an operator monotone function (such that a, b < ∞ in its representation (9)).
Definition 3.4. Consider a pair of channels S, T : N → M, and a von Neumann algebra A. We extend the channels to $S \otimes \mathrm{id}_A,\ T \otimes \mathrm{id}_A : N \odot A^{\mathrm{op}} \to M \odot A^{\mathrm{op}}$. Let π be any binormal representation of $M \odot A^{\mathrm{op}}$. Then for every vector $\xi \in H_\pi$ in the M − A bimodule given by π, we can consider the states $\phi_{S,\pi,\xi} = \phi_\xi \circ \pi \circ (S \otimes \mathrm{id}_A)$ and $\phi_{T,\pi,\xi} = \phi_\xi \circ \pi \circ (T \otimes \mathrm{id}_A)$. Consider $D_f(\phi_{S,\pi,\xi}\|\phi_{T,\pi,\xi})$ as defined by Proposition 2.17, now involving the supremum over finite range step functions x with values in $N \odot A^{\mathrm{op}}$. Then we define
$$D_f(S\|T) := \sup_{(A,\pi,\xi)} D_f(\phi_{S,\pi,\xi}\|\phi_{T,\pi,\xi}), \qquad (34)$$
where the supremum is over the triples (A, π, ξ) consisting of a von Neumann algebra A, bimodule π as above and normalized ξ ∈ H. We make the analogous definition for the BS divergence.
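One immediate consequence of taking a supremum over ancillas in Definition 3.4 is that the ancilla-free quantity discussed in the Introduction is a lower bound. The following is a sketch, assuming that the trivial ancilla $A = \mathbb{C}$ (for which $M \odot A^{\mathrm{op}} \cong M$) is admitted among the triples $(A, \pi, \xi)$.

```latex
% Restricting the supremum to the trivial ancilla A = C can only decrease it.
\[
D_f(S\|T)
 = \sup_{(A,\pi,\xi)} D_f\bigl(\phi_{S,\pi,\xi}\,\big\|\,\phi_{T,\pi,\xi}\bigr)
 \;\ge\;
 \sup_{\pi,\,\xi} D_f\bigl(\phi_\xi \circ \pi \circ S\,\big\|\,\phi_\xi \circ \pi \circ T\bigr).
\]
% On the right, pi ranges over normal representations of M alone and xi over
% unit vectors, so the states phi_xi \circ \pi are vector states of M.
```

The inequality can be strict, which is the reason given in the Introduction for coupling to an ancillary system in the first place.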
Remark 3.5. When M is finite-dimensional, our definition for $D_{BS}$ agrees with that of [27]. This follows from Proposition 2.17 and the part of Proposition 3.7 referring to finite dimensional type I algebras. In fact [27] also consider the channel divergence for the function If M is infinite dimensional and finite (direct sum of factors of type $II_1$) then where by $L^2(\ell^2(\mathbb{N}))$ we mean the Hilbert-Schmidt operators on the separable Hilbert space $\ell^2(\mathbb{N})$ and by The same holds for the BS divergence.
Proof. By definition To prove the reverse inequality, we can assume for the sake of simplicity that N, M, A are all factors; the general case may be treated by performing the usual decomposition into a direct sum of factors.
Case (1) M is of type $I_\infty$, $II_\infty$, III. Then the sup in Definition 3.4 can always be realized for a properly infinite A because we can take the tensor product A ⊗ B(H) and the corresponding bimodule if necessary. Consider an M − A bimodule H. In this case, [17, Corollary 2.7] implies that there exists a normal homomorphism θ : A → M, such that $H_\pi$ is isomorphic to $L^2_\theta(M)$. In other words, there exists a unitary $U : H \to L^2(M)$ intertwining the right representation of H with the right representation of $L^2_\theta(M)$. For a vector ξ ∈ H, denote η := Uξ. Then $D_f(\phi_{S,\pi,\xi}\|\phi_{T,\pi,\xi}) = D_f(\phi_{S,\pi_\theta,\eta}\|\phi_{T,\pi_\theta,\eta})$, where $\pi_\theta$ is the bimodule representation relative to $L^2_\theta(M)$. Now, using the variational formula in Proposition 2.16 where the inequality follows because the second sup is over a larger set which is easily seen by setting $v_t := (\mathrm{id}_N \otimes \theta)(x_t)$ and $w_t = 1 - v_t$. The right side is $D_f(\phi_{S,\pi,\eta}\|\phi_{T,\pi,\eta})$, again by our variational formula. Taking the supremum over unit vectors η in the natural cone then demonstrates the reverse inequality and we are done.
Case (2) M is of type $I_n$, i.e. $M = M_n(\mathbb{C})$. Let (ξ, π, A) be a nearly optimal triple in Definition 3.4 with corresponding bimodule H, up to tolerance ε. By replacing $r_H(A)$ if necessary with the potentially larger von Neumann algebra $\ell_H(M)'$ (which is type I), we can assume that $r_H(A) = B(K)$, as well as $H = \mathbb{C}^n \otimes K$. Now let P be the orthogonal projection on H with range $\ell_H(M)\xi$. Then $P \in \ell_H(M)'$ so $P = r_H(p)$ for some orthogonal projection p ∈ A, and by the Schmidt-decomposition theorem, p has rank ≤ n. Going through the definitions, we then have for n ∈ N, a ∈ A: Let x be a step function valued in $M \odot A$ such that the infimum in Proposition 2.16 is achieved up to tolerance ε. Observe that and we get a similar relation for S → T and Now observe that $1_n \otimes p$ is the unit in $M \odot pAp$ and that $\hat{x}_t + \hat{y}_t = 1_n \otimes p$, so $\hat{x}_t \in M \odot pAp$ is an admissible step function in the variational principle of Proposition 2.16. Furthermore pAp is naturally isomorphic to a subalgebra of $M_n(\mathbb{C}) = M$ (since the rank of p is ≤ n), and PH (which contains ξ = Pξ) is isometric to a subspace of $\mathbb{C}^n \otimes \mathbb{C}^n$. Therefore, the right side of (39) is less than or equal to $D_f(\phi_{S,\hat{\pi},\hat{\xi}}\|\phi_{T,\hat{\pi},\hat{\xi}})$, where $\hat{\pi}$ is the representation associated with the standard M − M bimodule $\mathbb{C}^n \otimes \mathbb{C}^n$, and where $\hat{\xi}$ is the unit vector in that bimodule corresponding to ξ.
and $A \otimes B(\ell^2(\mathbb{N}))$ are properly infinite. Thus, by the same reasoning as above in (1), we know that the maximizing bimodule can be taken to be a standard bimodule for $M \otimes B(\ell^2(\mathbb{N}))$. Restricting this standard bimodule to an $M - M \otimes B(\ell^2(\mathbb{N}))$-bimodule gives a bimodule whose associated f-divergence as in Definition 3.4 is not smaller than that of the original M − A-bimodule.
For the BS divergence, we consider the variational principle given by Proposition 2.17 instead of Proposition 2.16.

Basic properties of channel divergence.
We will now prove some basic properties of the channel divergences. In the next lemmas, $D_f$ is the divergence associated with a non-negative operator monotone function f : [0, ∞) → [0, ∞) with the representation (9) such that a, b < ∞ and $D_{BS}$ is the BS divergence (basically corresponding to f(t) = log t). In our proof of the next theorem, we cannot use directly the proofs [25, Theorem 4.4, Proposition 4.5] for $D_f(\phi\|\psi)$ because our definition of $D_f(T\|S)$ is based fundamentally on a variational principle for test functions valued in $M \odot A^{\mathrm{op}}$, which is not a von Neumann algebra. Fortunately, it is well-known [53] (see also [54, Chapter 5]) that variational principles can provide alternative proofs.
(Joint convexity) Let $p_i$, $q_j$ be probability distributions over a finite set and $T_i$, $S_j$ : and similarly for $D_{BS}$. 4. (Dilation) Given channels S, T : ). The same holds for $D_{BS}(S\|T)$.
Proof.
(1) Consider an M − A bimodule π and unit vector $\xi \in H_\pi$ that achieves the supremum in the variational definition (34) up to a tolerance ε. Then The first inequality is proven as follows. Considering an admissible step function $x : (0, \infty) \to M \odot A$ in Proposition 2.16 such that $D_f$ is achieved up to the tolerance ε, we see using again the variational principle in the last line. The statement for the BS divergence likewise follows from the corresponding variational formula, see Proposition 2.17.
(2) Let (ξ, π, A) be a nearly optimal triple as in definition (34) for up to tolerance ε. Consider a step function $x_t \in N \odot A^{\mathrm{op}}$ as in Proposition 2.16 that is nearly optimal in the variational characterization of $D_f(\phi_{S_1 \circ T,\pi,\xi}\|\phi_{S_2 \circ T,\pi,\xi})$ up to a tolerance ε > 0. Then clearly $(T \otimes \mathrm{id})(x_t) \in M \odot A^{\mathrm{op}}$ is an admissible step function in the variational characterization of $D_f(\phi_{S_1,\pi,\xi}\|\phi_{S_2,\pi,\xi})$, and so we have, using Kadison's theorem and therefore the statement follows because ε was arbitrary. The proof for the BS divergence is similar and now based on Proposition 2.17. With this, (4) is proven for $D_f$. In the case of $D_{BS}$ we use instead Proposition 2.17.
Our basic strategy for proving certain more profound properties of the channel divergence below will be to reduce the statements to those for finite dimensional matrix algebras obtained recently in [27], and to do this we will need to restrict attention from now on to hyperfinite von Neumann algebras. [A von Neumann algebra N on a Hilbert space H is said to be hyperfinite if there exists a sequence ("filtration") $\{N_n\}_{n \in \mathbb{N}} \subset N$ of finite dimensional subalgebras, increasing, i.e. $N_n \subset N_{n+1}$, such that $N = (\bigcup_n N_n)''$.] Examples of hyperfinite factors are all type I factors, i.e. B(H), $M_n(\mathbb{C})$, but there also exist hyperfinite factors of types II and III. Let N, M be hyperfinite factors, let S, T : N → M be channels, and let $N_n$, $M_n$ be filtrations of N and M, respectively. Following [55] one can construct for each n a "generalized conditional expectation" $E_n : M \to M_n$ as follows. Let $\omega_n = \omega{\restriction}M_n$ be the restriction of the faithful normal state ω on M, with GNS representation $\pi_n$, GNS Hilbert space $H_n$ and GNS vector $\Omega_n$. Let $J_n$ be the modular conjugation associated with $\Omega_n$ and define a partial isometry where J is the modular conjugation associated with Ω and M. By construction, each $E_n$ is a channel.
Lemma 3.9 (see [56]). $E_n(m) \to m$ strongly as n → ∞ for all m ∈ M.
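A concrete filtration for the simplest hyperfinite factor may help fix ideas. The sketch below takes $H = \ell^2(\mathbb{N})$ with a fixed orthonormal basis; the projections $p_n$ and the choice of corner algebras are our illustrative assumptions, chosen so that the subalgebras have non-trivial units $p_n$ converging strongly to 1.

```latex
% H = \ell^2(\mathbb{N}) with orthonormal basis (e_k)_{k \ge 1},
% p_n = orthogonal projection onto span{e_1, ..., e_n}.
\[
N_n := p_n\, B(H)\, p_n \cong M_n(\mathbb{C}),
\qquad
N_n \subset N_{n+1},
\]
% The increasing union consists of the finitely supported matrices and is
% weakly dense, so its double commutant is all of B(H):
\[
\Bigl(\bigcup_{n} N_n\Bigr)'' = B(H),
\qquad
p_n \to 1 \ \text{strongly as } n \to \infty .
\]
```

Here each $N_n$ is a finite dimensional subalgebra with unit $p_n \ne 1$, which is why approximation arguments along a filtration work with $p_n$ in place of the unit.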
We get:

Proof. Consider first the case D_f. Let (ξ, π, A) be a nearly optimal triple as in the definition (34) of D_f(S‖T). We let p_n be the abstract units in M_n, which form an increasing net of projections in M. By the von Neumann density theorem, p_n → 1 strongly. By monotonicity, i.e. by the data processing inequality applied to the inclusion channel

Let B := ⋃_n M_n, which is a *-subalgebra of M whose weak closure is B″ = M. By the von Neumann density theorem, B is strongly dense in M. Let x_t be an admissible step function as in Proposition 2.16, valued in M ⊙ A^op, which approximates D_f(ϕ_{S,π,ξ} ‖ ϕ_{T,π,ξ}) up to an arbitrarily chosen tolerance ε. Because x_t has finite range and because B is strongly dense in M, we can construct a sequence of step functions x_{n,t} in M_n ⊙ A such that x_{n,t} is constant on each interval where x_t is constant, such that x_{n,t} = p_n for any t which is so small that x_t = 1, and such that moreover x_{n,t} → x_t strongly on each such interval as n → ∞. Then ϕ_{S,π,ξ}(x_{n,t} x*_{n,t}) → ϕ_{S,π,ξ}(x_t x*_t) and, letting y_{n,t} = p_n − x_{n,t} ∈ M_n, ϕ_{T,π,ξ}(y_{n,t} y*_{n,t}) → ϕ_{T,π,ξ}(y_t y*_t) as n → ∞, uniformly in t. We insert the step functions x_{n,t}, y_{n,t} and the unit p_n instead of x_t, y_t and 1 into the right side of the variational formula of Proposition 2.16.

The convergence properties of the step functions x_{n,t}, y_{n,t} and the unit p_n mean that the right side converges to

We can take ε smaller and smaller, proving that lim inf_n D_f(S↾M_n ‖ T↾M_n) ≥ D_f(S‖T), which demonstrates the proposition.

For the BS divergence, we proceed in a similar way, now using the variational principle of Proposition 2.17.
Combining the previous lemma and the martingale property we get:

follows in view of the lower semi-continuity of the channel divergence because E_m is pointwise strongly (hence weakly) convergent by Lemma 3.9. Therefore, we see that, simply

The statement now follows from the martingale property. The proof for the BS divergence is similar and based instead on Proposition 2.17.
The following property, observed and proven first in [27] for matrix algebras, is crucial for this work.

Proposition 3.12 (Internal subadditivity). Let S_2, T_2 : N → R and S_1, T_1 : R → M be channels between hyperfinite von Neumann algebras. Then we have

Proof. Let E_m : R → R_m, F_k : M → M_k be sequences of generalized conditional expectations as described above.

In the first two lines we used lower semi-continuity, in the third line we used the martingale property, in the fourth line we used the result of [27] in the context of finite-dimensional von Neumann algebras, and in the last step we used the martingale property and Lemma 3.11.

Remark 3.13. A noteworthy special case of the proposition arises when M = ℂ, i.e. S_1, T_1 are states. In this case the subadditivity corresponds to the "chain rule" of [27].
Consider channels S_i, T_i : N_i → M_i between hyperfinite von Neumann algebras represented on Hilbert spaces H_i, where i = 1, 2. We can form the weak closure of N_1 ⊙ N_2 in B(H_1 ⊗ H_2) and denote this (hyperfinite) von Neumann algebra by N_1 ⊗ N_2, and we proceed similarly for M_i. Then it follows that

and similarly for T_1 ⊗ T_2. Then we have:

Proposition 3.14 (External additivity). Let S_i, T_i : N_i → M_i be channels between the hyperfinite von Neumann algebras

Proof. Similar to the proof of internal subadditivity, using again Lemma 3.11 and that D_BS is additive under the tensor product in the finite-dimensional case by results of [27].
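As a consistency check of external additivity, the sketch below assumes that Proposition 3.14 asserts additivity of D_BS under tensor products (as announced in the abstract) and combines it with the value log N for an N-ary measurement from Proposition 3.20. The measurement channels 𝓜_1, 𝓜_2 with N_1 and N_2 outcomes are illustrative choices, and it is assumed that Proposition 3.20 applies to the tensor product algebra.

```latex
% Assumed form of the additivity statement, followed by a consistency check.
% \mathcal{M}_k(a) = \sum_i e^{(k)}_i a e^{(k)}_i are N_k-ary measurements (illustrative).
\begin{align*}
D_{BS}(S_1 \otimes S_2 \,\|\, T_1 \otimes T_2)
  &= D_{BS}(S_1 \| T_1) + D_{BS}(S_2 \| T_2), \\
D_{BS}(\mathrm{id} \otimes \mathrm{id} \,\|\, \mathcal{M}_1 \otimes \mathcal{M}_2)
  &= \log N_1 + \log N_2 = \log (N_1 N_2).
\end{align*}
```

The second line agrees with a direct count: 𝓜_1 ⊗ 𝓜_2 is itself a measurement channel built from the N_1 N_2 mutually orthogonal projections e^{(1)}_i ⊗ e^{(2)}_j.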

Channel divergences for Kraus channels.
Let M be a von Neumann algebra in standard form acting on the Hilbert space H with cyclic and separating vector Ω. We consider a class of channels T, S : M → M of so-called "Kraus type" investigated in the context of general von Neumann algebras by [57]. By definition, these are of the form

with m, a_i, b_j ∈ M and N, M ∈ ℕ. Our aim is to give a formula for the channel divergence D_BS(S‖T) of two Kraus channels in terms of their "Choi operators", also introduced in this context by [57]. To this end, we define the Choi operators C_S, C_T ∈ B(H) for such channels as, respectively,

By construction C_S ∈ B(H) is a non-negative operator of finite rank such that Tr_H C_S = Σ_{i=1}^N ‖a_i Ω‖² = 1, and similarly for C_T.

Let C ⊂ B(H) be the *-subalgebra of all operators of the form Σ_{i=1}^N c_i |Ω⟩⟨Ω| d_i for some N ∈ ℕ, c_j, d_j ∈ M. By [57, Theorem 4], the spectral projections of the operators C_S, C_T are in C and consequently this algebra is closed under the spectral calculus. Now suppose σ is the Kubo-Ando connection associated with a non-negative operator monotone function f : [0, ∞) → [0, ∞) with a, b < ∞ in (9). It follows from (15) that the expressions under the limit in

are non-negative elements from C. Since the arguments of the mean σ are decreasing and strongly convergent as ε → 0, the limit not only exists by the properties of the Kubo-Ando connections, but is also in C, i.e. C_S σ C_T ∈ C, see the proof of [57, Theorem 4].

Hence C_S σ C_T is in particular a non-negative finite rank operator in B(H).
For the operator monotone function f(t) = log t on (0, ∞), similar arguments show that the operators [C_S + ε(C_S + C_T)] σ [C_T + ε(C_S + C_T)] are still in C for ε > 0. The limit ε → 0 of this decreasing sequence exists but possibly only in the sense of an unbounded quadratic form. In fact, as long as ε > 0, one can see e.g. from (15), C_S, C_T ≤ 1 together with V* f(A) V ≤ f(V* A V) for contractions V, positive A ∈ B(H)_+ and operator monotone functions f : (0, ∞) → ℝ (see e.g. [58]), that

with the convention that the right side is +∞ if C_S σ C_T is unbounded.
Proof. First assume that C_S σ C_T is bounded, so S,T ∈ L^∞(M, Ω) ≅ M by the preceding remark. By Proposition 3.7 we can restrict attention to the standard bimodule H = L²(M, Ω) in the variational definition (34) of channel divergence. Furthermore, since MΩ is strongly dense in H as is standard, it is sufficient to restrict to vectors ξ ∈ H of the form ξ = xΩ, x ∈ M in the variational definition. We get using the definitions and the notations m = J m op * J ∈ M , x = J x op * J and

We also have a similar formula replacing S by T and a_j by b_j. The variational principle for the maximal BS divergence (Proposition 2.17) thereby gives us by (54),

where the first supremum is over n ∈ ℕ, the second supremum is over the finite range step functions 0 for sufficiently large t, and where we use the abbreviations

Since the strong closure of π(M ⊙ M^op) is strongly dense in B(H), the step functions V_t can be used to approximate in the strong topology any given finite range step function (0, ∞) → B(H) which is zero for sufficiently large t and 1 for sufficiently small t. Let P be any orthogonal projection onto a finite-dimensional subspace of H containing the (finite-dimensional) ranges of X C_S X* and X C_T X*. Then it follows that we may further replace V_t by P V_t P and W_t by P W_t P, and the variational formula [44, Remark 9.2] (or our Proposition 2.17) therefore tells us that

where we used that P was arbitrary so long as its range contains the ranges of X C_S X* and X C_T X* to go to the third line, and where we used the transformer equality (see e.g. [25, Lemma D.3]) to go to the fourth line. The last step is admissible if we assume that x, hence X, is invertible, which we assume momentarily is the case. Since we know that S,T ∈ L^∞(M, Ω), there is m ∈ M such that S,T = m , therefore

If we could show that xΩ with x ∈ M ranging over the invertible elements is dense in H, then this formula would hold for all ξ ∈ H. This follows, in fact, from the hyperfinite property because invertible elements are norm dense in a finite-dimensional von Neumann algebra, and M is the strong closure of hyperfinite algebras. Thus, we get a strongly convergent sequence x_n → x with x_n invertible for any x ∈ M. Applying this to x := J x J and choosing x n = J x n J gives the statement. Taking the supremum over our strongly dense set of vectors ξ with unit norm now gives the statement of the proposition because m = S,T L ∞ (M, Ω).
Let us now assume that C_S σ C_T is not bounded. The completely positive maps T_ε := T + ε(S + T), S_ε := S + ε(S + T) do not suffer from this problem for ε > 0 and are (non-normalized) increasing (as ε → 0) sequences of Kraus channels. By monotonicity of the operator mean σ, C_{S_ε} σ C_{T_ε} is an increasing sequence of self-adjoint operators in C whose range remains in a fixed finite-dimensional subspace of H. Hence it is convergent to the unbounded operator C_S σ C_T in norm, from which we can see that there must be x ∈ M such that −Tr_H[x* x (C_{S_ε} σ C_{T_ε})] diverges to +∞, hence so does D_BS(S_ε‖T_ε) by (56). However, since T_ε, S_ε are decreasing sequences of channels, by monotonicity

Examples.
As a simple special case of Kraus channels we consider S, T in (47) of the form

In other words, {a_j} respectively {b_j} each generate algebras isomorphic to the Cuntz algebras on N respectively M isometries.

By the integral representation (10) for this mean, we therefore get

As n → ∞, the operator means P σ_n Q are decreasing (hence convergent) to the potentially unbounded quadratic form P σ Q = [(P ∧ Q) − P]∞, where σ corresponds to the operator monotone function f(t) = log t. Therefore, if P σ Q is to be bounded, we must have (P ∧ Q) − P = 0, so P must be a subprojection of Q, or otherwise D_BS(S‖T) = ∞ by Proposition 3.17. If P = Q, then N = M, and there must be R_{ij} ∈ ℂ such that a_i Ω = Σ_{j=1}^N R_{ij} b_j Ω, and since Ω is separating, we must have

The Cuntz algebra relations then show that (R_{ij}) is a unitary matrix and then clearly S = T. If P < Q, then clearly N < M and it follows that P σ_n Q is decreasing (hence convergent) to 0 and D_BS(S‖T) = 0.
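The limit P σ_n Q → [(P ∧ Q) − P]∞ can be checked by a short computation. The sketch below rests on three assumptions that are not spelled out in the surviving text: the integral representation is written with integration variable t and function argument s; a constant function c corresponds to the connection A σ B = cA (as for the left trivial mean); and the parallel sum of projections is (tP) : Q = t/(1+t) · P ∧ Q.

```latex
% Assumptions: integration variable t, argument s; (tP):Q = \frac{t}{1+t} P \wedge Q.
\begin{align*}
f_n(s) &= \log\!\big(s + \tfrac{1}{n}\big)
        = -\log n + \int_{1/n}^{\infty} \frac{ts}{t+s}\,\frac{dt}{t^{2}}, \\
P \,\sigma_n\, Q
  &= -\log n \; P + \int_{1/n}^{\infty} \big[(tP):Q\big]\,\frac{dt}{t^{2}}
   = -\log n \; P + \Big( \int_{1/n}^{\infty} \frac{dt}{t(1+t)} \Big)\, P \wedge Q \\
  &= -\log n \; P + \log(1+n)\; P \wedge Q .
\end{align*}
% Case P \wedge Q = P (i.e. P \le Q):  P \sigma_n Q = \log(1 + 1/n)\, P \to 0.
% Otherwise: P \sigma_n Q \to -\infty on the range of P - P \wedge Q.
```

Under these assumptions the sequence is decreasing in n, stays bounded exactly when P ≤ Q, and then converges to 0, which matches the case distinction in the text.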
Another very simple but conceptually relevant example is:

Proposition 3.20. Let M be a finite-dimensional or properly infinite, hyperfinite von Neumann algebra and let e_j ∈ M be N mutually orthogonal projections such that Σ_i e_i = 1, 0 < e_i < 1. We consider the Kraus channel

corresponding to an N-ary measurement. Then

Proof. The Choi operator associated with M is C_M = Σ_i e_i |Ω⟩⟨Ω| e_i and that for the identity channel id is C_id = |Ω⟩⟨Ω|. We begin by working out the parallel sum ⟨ξ, [(t C_id) : C_M] ξ⟩ using the variational definition (11). A minimizer ζ_0 in that definition has to satisfy

The vectors e_i Ω are non-zero and linearly independent because Ω is separating and because the e_i's are orthogonal and non-trivial. We therefore see that

for all i = 1, ..., N and any solution ζ_0 is a minimizer for the variational problem (11).

To find a solution we consider the ansatz ζ_0 = Σ_i a_i ‖e_i Ω‖^{−2} e_i Ω, leading to a linear system for the unknown complex coefficients a_i. A solution is

Substituting the corresponding ζ_0 into the variational definition (11) yields

noting that the dependence upon e_i has cancelled. In other words [(t C_id) :

Next we use the integral representation (10) for the Kubo-Ando means σ_n associated with the functions f_n(s) = log(1/n + s). The corresponding measures dμ_n are read off from the integral representations f_n(s) = −log n + ∫_{1/n}^∞ s/(t+s) dt/t. This gives for the Kubo-Ando mean C_id σ C_M associated with f(t) = log t, as required for the BS divergence,

Next we use the definition (52) for S = id, T = M, giving

By Proposition 3.17, we therefore have D_BS(id‖M) = log N as we wanted to show.
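The cancellation of the e_i-dependence and the value log N can be reproduced by a direct calculation. The following is a sketch under stated assumptions, not the authors' derivation: it uses the shorthand v_i := e_i Ω (introduced here), the rank-one parallel sum formula (t|ψ⟩⟨ψ|) : B = |ψ⟩⟨ψ| / (1/t + ⟨ψ, B^{-1}ψ⟩) for ψ in the range of B, and the representation A σ_n B = −log n · A + ∫ [(tA) : B] dt/t², which is assumed to be the form taken by (10) here.

```latex
% Assumed shorthand: v_i := e_i \Omega, so that \langle \Omega, v_i \rangle = \|v_i\|^2.
\begin{align*}
C_{\mathcal M}^{-1}\Big|_{\operatorname{span}\{v_i\}}
  &= \sum_{i=1}^{N} \frac{|v_i\rangle\langle v_i|}{\|v_i\|^{4}}, \qquad
\langle \Omega, C_{\mathcal M}^{-1}\, \Omega \rangle
   = \sum_{i=1}^{N} \frac{\|v_i\|^{4}}{\|v_i\|^{4}} = N, \\
(t\, C_{\mathrm{id}}) : C_{\mathcal M}
  &= \frac{t}{1 + tN}\; |\Omega\rangle\langle\Omega| , \\
C_{\mathrm{id}} \,\sigma_n\, C_{\mathcal M}
  &= \Big( -\log n + \int_{1/n}^{\infty} \frac{dt}{t\,(1 + tN)} \Big)\, |\Omega\rangle\langle\Omega|
   = \log\!\Big( \frac{n+N}{nN} \Big)\, |\Omega\rangle\langle\Omega| \\
  &\xrightarrow[\,n \to \infty\,]{}\; -\log N \; |\Omega\rangle\langle\Omega| .
\end{align*}
```

The norms ‖e_i Ω‖ drop out in the first line, so the limit depends only on the number N of outcomes, consistent with D_BS(id‖M) = log N.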
Our final example concerns finite index inclusions of von Neumann factors. We let e be the Jones projection for the inclusion, i.e. M is generated by N and e. Then E(e) = d^{−2} 1. Let π be the representation of M ⊙ M^op coming from the standard bimodule L²(M) with underlying Hilbert space H. Recall that for ξ ∈ H we have by definition ϕ_{ξ,E,π}(m ⊗ m^op) = ⟨ξ, E(m) J (m^op)* J ξ⟩ for the quantity appearing in (34) for the channel E : M → N. We use this bimodule in the variational characterization of Proposition 2.17 involving a supremum over n ∈ ℕ and admissible step functions x : (1/n, ∞) → M ⊙ M^op (as well as y_t := 1 − x_t). We obtain a lower bound by constructing a specific step function x_n for each n ∈ ℕ and showing that the limit n → ∞ of the variational expression in Proposition 2.17 tends to a quantity that is at least log d².

For this, we choose a standard vector Ω ∈ H for M and let ξ := eΩ/‖eΩ‖. We also let

and we let y_t = 1 − x_t. Since eξ = ξ and E(e) = d^{−2} 1, we get

in the range t ≤ n. This gives us

Then it follows from the variational characterization of the BS divergence (Proposition 2.17) that

x : (1/n, ∞) → M ⊙ A^op such that the supremum in the variational definition (34) is saturated up to tolerance ε:

The right side is ≤ log d² + D_BS(id‖id) = log d², using the variational definition again. Since ε > 0 can be as small as we like, we have shown D_BS(id‖E) ≤ log d². We have already shown D_BS(id‖E) ≥ log d² in a), so the proof is complete.
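A simple instance of Proposition 3.21 may help fix ideas. The inclusion below, the integer k, and the normalized trace tr_k are illustrative choices made here and are not taken from the text; the example assumes that id ⊗ tr_k is the minimal conditional expectation of this inclusion.

```latex
% Illustrative example: N a factor, \mathrm{tr}_k the normalized trace on M_k(\mathbb{C}).
\begin{align*}
N \otimes 1 \;\subset\; N \otimes M_k(\mathbb{C}), \qquad
E &= \mathrm{id} \otimes \mathrm{tr}_k , \qquad
[\, N \otimes M_k(\mathbb{C}) : N \otimes 1 \,] = k^{2}, \\
d &= k, \qquad D_{BS}(\mathrm{id} \,\|\, E) = \log d^{2} = 2 \log k .
\end{align*}
```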

Applications to QFT
4.1. Algebraic QFT. We recall the axioms in the algebraic approach to QFT, see [60] as a general reference. In the preceding sections we have described properties of the

on H implementing α_g(a) = U(g) a U(g)* for all a ∈ A. There is a vector Ω (the vacuum) which is cyclic for A and such that U(g)Ω = Ω for all g ∈ P. Positive energy means that if x ∈ ℝ^n ⊂ P is a translation by x, we can write

and the vector generator P = (P_0, P_1, ..., P_{n−1}) has spectral values p in the forward lightcone

For technical purposes, we also impose a "nuclearity condition." The main purpose of that condition is to ensure a certain regularity of the theory, and several closely related versions of such a condition have been proposed. As far as we can see, most of these would be equally suitable for our purposes. For definiteness, we impose [61]:

(a6) (BW-nuclearity) Let A be a ball of radius r in a Cauchy surface, and let O_r be the corresponding causal diamond. Consider the map

where β > 0 and where H = P_0 is the Hamiltonian, i.e. the time component of P in item (a4). It is required that there exist positive constants s > 0 and c = c(r) > 0 such that for r > 0, β > 0 the nuclear 1-norm of this map is bounded by e^{(c/β)^s}. The nuclear 1-norm is discussed further e.g. in [62].

We now comment on two well-known important consequences of these results for our analysis, see [60] for further details and references. First, by the Reeh-Schlieder theorem, Ω is cyclic and separating for each A(O), so the vacuum automatically provides a standard form for each local von Neumann algebra. Secondly, each A(O) is a hyperfinite factor of type III_1 [63], which is a unique object up to von Neumann isomorphism by [64].
As a consequence, we can apply all of our results on the channel divergences D BS to the local algebras A(O).
It is important to stress that a priori, A(O) is defined only for causal diamonds associated with simply connected subsets of a Cauchy surface. If K is any open, causally complete subset of ℝ^n, we could define either

DHR-Representations: See [34,35,60,65]. The Hilbert space H may be considered as the defining (vacuum) representation of the net, but it is physically relevant to also consider other representations. We shall consider representations π of A on a Hilbert space H_π which are ultraweakly continuous when restricted to any A(O) and which satisfy:
• (DHR-selection criterion) [34,35] π|_{A(O)′ ∩ A} is unitarily equivalent to the vacuum representation for some O.
• (BF-selection criterion) [65] The automorphisms α_g in (a3) are unitarily implemented in π, i.e. there exists a strongly continuous positive energy representation U_π(g) such that π(α_g(a)) = U_π(g) π(a) U_π(g)*, and such that the generator P_π of translations U_π(x) = exp(−iη(P_π, x)) on H_π has an isolated mass shell in its spectrum, i.e. spec(P_π) ⊂ {p : η(p, p) = M², p_0 > 0} ∪ {p : η(p, p) ≥ m², p_0 > 0} for some m² > M² > 0.

If we let V be a unitary implementing the unitary equivalence in the first item, then ρ(a) := V* π(a) V is an endomorphism of A such that ρ|_{A(O)′ ∩ A} = id. (76) One says that ρ is a localized endomorphism (in O) for this reason. Furthermore, ρ is transportable in the following sense. Let O_1 := O, ρ_1 := ρ and let O_2 be another causal diamond. Then there exists a unitary u_21 ∈ A(O_1) ∨ A(O_2) such that Ad u_21 ∘ ρ_1 =: ρ_2 is an endomorphism satisfying the DHR- and BF-selection criteria that is localized in O_2. We will refer to the endomorphisms arising from the selection criteria above as localized, transportable endomorphisms. Let ρ be a transportable irreducible endomorphism localized in some O. As is known, the selection criteria imply a considerable amount of further algebraic structure associated with ρ. First, we have a so-called conjugate transportable endomorphism ρ̄ together with solutions r, r̄ ∈ A(O) and

) is given by E_ρ = ρ ∘ ρ′. d_ρ is referred to as the "statistical dimension" of ρ. By the index-statistics theorem [33],

Similar constructions apply to reducible endomorphisms/representations. For a variant of this theory for conformal field theories in n = 2 spacetime dimensions see [33,66].
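The relation between the Jones projection, the left inverse and the statistical dimension can be verified in a few lines. The sketch below assumes the usual DHR conventions, with overbars restored by hand and ρ′ denoting the left inverse; these decorations are an assumption about the notation, since they are not legible in the surviving text.

```latex
% Assumed conventions: \bar\rho is the conjugate of \rho, and
%   \bar\rho\rho(a)\, r = r a, \quad \rho\bar\rho(a)\, \bar r = \bar r a,
%   r^* r = d_\rho 1 = \bar r^* \bar r, \quad \bar r^* \rho(r) = 1 = r^* \bar\rho(\bar r),
%   \rho'(a) := d_\rho^{-1}\, r^* \bar\rho(a)\, r, \quad e_\rho := d_\rho^{-1}\, \bar r \bar r^*,
%   E_\rho := \rho \circ \rho'.
\begin{align*}
\rho'(\rho(a)) &= d_\rho^{-1}\, r^*\, \bar\rho\rho(a)\, r = d_\rho^{-1}\, r^* r\, a = a, \\
e_\rho^{2} &= d_\rho^{-2}\, \bar r\, (\bar r^* \bar r)\, \bar r^* = d_\rho^{-1}\, \bar r \bar r^* = e_\rho, \\
\rho'(e_\rho) &= d_\rho^{-2}\, \big(r^* \bar\rho(\bar r)\big)\big(r^* \bar\rho(\bar r)\big)^{*} = d_\rho^{-2}\, 1, \\
E_\rho(e_\rho) &= \rho\big(\rho'(e_\rho)\big) = d_\rho^{-2}\, 1 .
\end{align*}
```

With these conventions E_ρ(e_ρ) = d_ρ^{−2} 1 has the same form as the relation E(e) = d^{−2} 1 used for finite index inclusions, with d = d_ρ.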

Complexity of channels in AQFT.
Let T be a completely positive map of the quasilocal algebra A such that, for some sufficiently large causal diamond O, it restricts to a channel of A(O). By [17, Theorem 2.10], we may write

where v is an isometry of A(O) and θ is an endomorphism of A(O). This motivates the following definition. Let x ∈ ℝ^n, let O + x be the translate of O and let α_x(a) = U(x)* a U(x) be the translate of an element a ∈ A(O) to A(O + x) as in (a3). We consider T

Since θ is by assumption an endomorphism satisfying the DHR- and BF-selection criteria, translations are implemented in the sector θ by a strongly continuous group of unitaries

i.e. it has the same form as (79) but with θ_x, v_x now localized in O + x. We now make a proposal for the complexity of a channel in algebraic quantum field theory.

Definition 4.3. The complexity of a localizable and transportable channel T is defined as

where O is any sufficiently large causal diamond such that T|_{A(O)′ ∩ A} = id.

To be precise, we should demonstrate:

Lemma 4.4. The definition of c(T) does not depend on the sufficiently large causal diamond O chosen in (82).

n be a net of finite-dimensional type I algebras exhausting A(O_1)^c, and let M_n be a net of finite-dimensional type I algebras exhausting A(O_1), which exist as a consequence of requirement (a6), see [21].

and by the martingale property for D_BS we have

Proof.
(1)-(3), (6), (10) are taken from Sect. 3.

(4) Let T_i be localized in O_i, where O_1 and O_2 are spacelike related with strictly positive distance. By locality, T_1 ∘ T_2(a_1 a_2) = T_1(a_1) T_2(a_2), a_i ∈ A(O_i). Then the BW-nuclearity assumption (a6) implies the split property for the algebras A(O_1) and A(O_2), see [21] or [60, Chapter V.5.2] as a general reference. So there is a unitary W : H → H ⊗ H such that W*(a_1 ⊗ a_2)W = a_1 a_2 and consequently

where T_1 ⊗ T_2 is the tensor product channel on A(O_1) ⊗ A(O_2) and Ad_W X = W* X W. In particular, the map T_1 ∘ T_2 is normal on A(O_1) ∨ A(O_2). Then, by applying internal subadditivity twice,

Since W is unitary, we have the reverse inequality by the same argument backwards, so c(T_1 ∘ T_2) = c(T_1 ⊗ T_2) = c(T_1) + c(T_2), by external additivity.
(5) This follows from Sect. 3; we only need to show that M is localized and transportable. Since A(O) is properly infinite, there are isometries a_i, i = 1, ..., N in A(O) satisfying the Cuntz algebra relations (58). Then we set v* = Σ_i e_i a*_i and θ(m) := Σ_j a_j m a*_j, m ∈ A(O). It follows that v*v = 1, that θ is a localized, transportable endomorphism, and that M(m) = v* θ(m) v, as desired.

μ. For details, see [68].

Example 4.9. A situation similar to the previous example arises in local gauge theories of Yang-Mills type in n = 4 dimensions based on a compact local gauge group G: Take K to be the causal completion of a solid torus in a Cauchy surface. Then we have A(K), B(K) as in (75). As argued in [69], we should have [B(K) : A(K)] = dim Z(G), where Z(G) is the center of the gauge group (e.g. ℤ_N in the case of G = SU(N)). Thus we relate a property of the gauge group to the complexity of the conditional expectation E : B(K) → A(K). An intuitive reasoning for what E does in terms of 't Hooft and Wilson loops is given in [69].

where φ_I(f) = ∫ φ_I(x) f(x) d⁴x are the smeared KG quantum fields and the vacuum vector and C is a factor such that = 1. Let dim(λ) be the dimension of the space of tensors with Young-tableau symmetry λ. By DHR theory [34,35], there exists a localized (within O), transportable endomorphism ρ of the net such that , ρ(a) = , a for a ∈ A (92) and the statistical dimension d_ρ of this ρ equals the Young-tableau dimension dim(λ). It is given by a standard formula in terms of the shape of the Young tableau, see e.g. [71], so we obtain from (7) in Theorem 4.5 in this example,

In the following example diagram λ :

with k = 13 and N = 10, the right side is 2 log 135.
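For orientation, one standard way to evaluate dim(λ) is the hook-content formula. The sketch below states it and works it out for the small diagram λ = (2,1) with N = 3; this choice is made here purely for illustration and is not the example diagram of the text. It is assumed that dim(λ) refers to the dimension of the space of N-component tensors with Young symmetry λ.

```latex
% Hook-content formula; (i,j) is the box in row i and column j of \lambda,
% h(i,j) is its hook length. Worked case \lambda = (2,1), N = 3 (illustrative).
\begin{align*}
\dim(\lambda) &= \prod_{(i,j) \in \lambda} \frac{N + j - i}{h(i,j)}, \\
\lambda = (2,1),\ N = 3: \quad
\dim(\lambda) &= \frac{(3+0)\,(3+1)\,(3-1)}{3 \cdot 1 \cdot 1} = 8, \qquad
c(E_\rho) = \log d_\rho^{2} = 2 \log 8 .
\end{align*}
```

Here the three boxes (1,1), (1,2), (2,1) have contents 0, 1, −1 and hook lengths 3, 1, 1.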

Conclusions
In this work we have proposed a notion of complexity of a channel based on a specific information theoretic notion of distance to the identity channel.It would clearly be interesting to understand better the uniqueness of our definition within the axiom scheme that we proposed.It would also be interesting to understand better the relation of our proposal, if any, to holographic approaches.
is the channel corresponding to a non-trivial representation of the QFT ("charge superselection sector"), then c(ρ) = ∞.
3. If M(a) = Σ_{i=1}^N e_i a e_i is the channel corresponding to a local N-ary von Neumann measurement, then c(M) = log N.
4. Let E_ρ be the (minimal) conditional expectation from A(O) to ρ(A(O)), where ρ is a charge superselection sector (charged representation), then

H), and M = (M′)′ is the bicommutant. A von Neumann algebra is called a factor if M ∩ M′ = ℂ1. One denotes by A ∨ B = (A ∪ B)″ the von Neumann algebra generated by *-algebras of bounded operators A, B.
• States: A state is a linear, positive, normal, normalized functional ψ : M → ℂ, where positive means ψ(mm*) ≥ 0 for all m ∈ M and normalized means ψ(1) = 1.

Example 2.5 (Left and right trivial means). The left trivial mean σ_1 is induced by the function f(x) ≡ 1 and gives A σ_1 B = A. The right trivial mean σ_x is induced by f(x) = x and gives A σ_x B = B.

Example 2.6 (α-geometric means). The α-geometric means are defined in terms of the operator monotone function

Remark 2.14. If M = B(H) is a type I von Neumann factor (or more generally, a direct sum of factors), positive normal functionals on M are in one-to-one correspondence with positive trace class operators on H. Under this identification, the above definition of maximal f-divergence reduces to Definition 2.8 and Remark 2.9.
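For density matrices the Belavkin–Staszewski relative entropy has a well-known closed form, which can serve as a reference point for the type I case. The sketch below assumes an invertible σ and uses this standard matrix formula rather than the wording of Definition 2.8, which is not reproduced here; the probability vectors p, q are illustrative.

```latex
% Standard matrix formula (\sigma invertible), its commuting special case,
% and a two-point numerical example with illustrative values.
\begin{align*}
D_{BS}(\rho \,\|\, \sigma)
  &= \operatorname{Tr}\!\big[ \rho \,\log\!\big( \rho^{1/2} \sigma^{-1} \rho^{1/2} \big) \big], \\
[\rho, \sigma] = 0: \quad
D_{BS}(\rho \,\|\, \sigma) &= \sum_i p_i \log \frac{p_i}{q_i}, \\
p = (\tfrac12, \tfrac12),\ q = (\tfrac14, \tfrac34): \quad
D_{BS} &= \tfrac12 \log 2 + \tfrac12 \log \tfrac23 = \tfrac12 \log \tfrac43 .
\end{align*}
```

In the commuting case the expression reduces to the classical Kullback–Leibler divergence of the eigenvalue distributions.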

2.4. Properties of maximal f-divergence. Many properties of D_f and D_BS, and of the corresponding connections σ between states, in the setting of von Neumann algebras are known, see e.g. [25, Theorem 4.4, Proposition 4.5].

D_f and D_BS, see [25, Theorem 4.4, Proposition 4.5].
1. (Data processing inequality) Let T : N → M be a positive, normal, unital linear map between von Neumann algebras M and N satisfying the Schwarz property

3.1. Definitions.

Definition 3.1 (Bimodules). Given von Neumann algebras N, M, an N−M bimodule is a triple (H, ℓ_H, r_H) where H is a Hilbert space, and ℓ_H : N → B(H) and r_H : M → B(H) are a normal representation and a normal anti-representation, respectively, such that ℓ_H(N) and r_H(M) commute.

Theorem 3.8.
1. (Lower semi-continuity) Let T_n, S_n be channels such that T_n(m) → T(m), S_n(m) → S(m) weakly for any m ∈ M. Then D_f(S‖T) ≤ lim inf_n D_f(S_n‖T_n), and similarly D_BS(S‖T) ≤ lim inf_n D_BS(S_n‖T_n).
2. (Data processing inequality) Let S_1, S_2 : N → M, T : R → N be channels between von Neumann algebras. Then D

(3) The variational principles expressed in definition (34) and Proposition 2.16 display D_f(S‖T) as a double supremum of affine functionals of S, T. Joint convexity follows. The details are similar to (2).

(4) Let H be an M ⊗ B(ℓ²(ℕ)) − A bimodule. Since a left representation is a right representation of the opposite algebra, this is also an M − B(ℓ²(ℕ))^op ⊗ A bimodule, hence included in the maximization in Definition 3.4 of D_f(S‖T). Thus, we have D_f(S‖T) ≥ D_f(S ⊗ id_{B(ℓ²(ℕ))} ‖ T ⊗ id_{B(ℓ²(ℕ))}). On the other hand, let (π, A, ξ) be a nearly optimal triple in Definition 3.4 of D_f(S‖T), up to tolerance ε, where π corresponds to some M − A bimodule H. Then H̃ = ℓ²(ℕ) ⊗ H is an M ⊗ B(ℓ²(ℕ)) − A bimodule. Let η be any unit vector in ℓ²(ℕ) and set ξ̃ = ξ ⊗ η as well as S̃ = S ⊗ id_{B(ℓ²(ℕ))}. If x_t is a step function achieving the supremum in the variational formula (Proposition 2.16) of D_f(ϕ_{π,S,ξ} ‖ ϕ_{π,T,ξ}) up to tolerance ε, it follows that x̃_t := x_t ⊗ 1_{ℓ²(ℕ)} is a valid step function in the variational definition of D_f(ϕ_{π̃,S̃,ξ̃} ‖ ϕ_{π̃,T̃,ξ̃}) achieving the same value. So we have D

and similarly for the BS divergence.

Proof. Consider the channels E_m ∘ S, E_m ∘ T : N → M_m and an M_m − A bimodule H_m, representation π_m, and vector ξ_m ∈ H_m as in the definition of D_f(E_m ∘ T ‖ E_m ∘ S) such that the supremum (34) is achieved. By Lemma 3.7, we may assume the bimodule in question to be the standard M_m − M_m bimodule L²(M_m). From the channel E_m : M → M_m and the functional ⟨ξ_m, · ξ_m⟩, we then get an induced M − M_m bimodule in view of Proposition 3.3. It immediately follows that D_f(S‖T) ≥ D_f(E_m ∘ S ‖ E_m ∘ T) because in the variational definition (34) of D_f(S‖T) we take the supremum over the larger set of all bimodules, whereas D_f(E_m ∘ S ‖ E_m ∘ T) corresponds precisely to the induced M − M_m bimodule just described. Thus, we see that D

(50) Thus, for f(t) = log t, the corresponding Kubo-Ando mean C_S σ C_T defines a negative, possibly unbounded quadratic form given by a finite rank operator in C on its domain. Assume that C_S σ C_T is bounded, hence in C. Then the positive finite rank operator −C_S σ C_T may be written as a linear combination of its eigenprojections as Σ_{j=1}^K c_j |Ω⟩⟨Ω| c*_j for some c_j ∈ M, K ∈ ℕ_0, which gives, for m ∈ M, S,T , m * m S,T = , m * m . (51)

Definition 3.15. Let σ be the Kubo-Ando mean for f(t) = log t and assume that C_S σ C_T is bounded (hence in C). Then we define S,T ∈ L²(M, Ω)_+ as the unique representer of the positive normal functional on M associated with the non-negative finite rank operator −C_S σ C_T for the operator mean associated with f(t) = log t, S,T , m S,T = −Tr_H[m (C_S σ C_T)], m ∈ M. (52)

Remark 3.16. By the Connes-Radon-Nikodym theorem and (51), there must be m ∈ M such that S,T = m ∈ M . We therefore have S,T ∈ L^∞(M, Ω) ≅ M by the well-known characterization of this space.

Proposition 3.17. For two Kraus channels S, T on the finite-dimensional or properly infinite hyperfinite von Neumann algebra M standardly represented on L²(M, Ω) we have D_BS(S‖T) = S,∞ (M, Ω)

Corollary 3.18. Remark 3.19.
Let M be a finite-dimensional or properly infinite von Neumann algebra and let S, T be Kraus channels such that (58) holds for some N, M ∈ ℕ. Then either N = M and T = S, or we have D_BS(S‖T) = ∞, or N < M and D_BS(S‖T) = 0. In particular, note that if T(m) = u m u* with u ∈ M unitary and S = id, we have D_BS(id‖T) = D_BS(T‖id) = ∞ unless u = λ1.

Proof. It follows from the Cuntz algebra relations that the corresponding Choi operators C_T = Q and C_S = P are orthogonal projections of rank N respectively M on H. Denote by P ∧ Q the orthogonal projection onto the intersection of the ranges of P and Q. Consider the operator monotone functions f_n(s) := log(s + 1/n), s ≥ 0, which have integral representations f_n(s) = −log n + ∫_{1/n}^∞ s/(t+s) dt/t, and let σ_n be the corresponding operator means. By the proof of [45, Theorem 3.7], we have (tP) :

Proposition 3.21. Let N ⊂ M be a finite index inclusion of von Neumann factors with associated minimal conditional expectation E : M → N. Then D_BS(id‖E) = log[M : N].

Proof. a) We let d² = [M : N] and we first show D_BS(id‖E) ≥ log d² using the variational definition (34) for the channel divergence in the case of the BS divergence.

By the variational characterization of the channel divergence as a supremum of D_BS(ϕ_{ξ,id,π} ‖ ϕ_{ξ,E,π}) over triples (A, π, ξ) we therefore have D_BS(id‖E) ≥ log d².

b) The conditional expectation satisfies the Pimsner-Popa bound E ≥ d^{−2} id [38,59]. Let ε > 0. Then we can choose a triple (π, A, ξ) (consisting of a von Neumann algebra A, a binormal representation π on H of M ⊙ A^op, and a unit vector ξ in H), an n ∈ ℕ, and an admissible step function (1/n, ∞)

channel divergence D_BS in the general context of von Neumann algebras. In the context of local QFT, one has additional structure due to spacetime localization, and it turns out that this structure fits very well with the notion of channel divergence. We restrict to the setting of Minkowski spacetime (ℝ^n, η) for n ≥ 2. A causal diamond O is the causal completion of an open, simply connected subset U with compact closure of a Cauchy surface, where the causal structure is induced by the Minkowski metric. A QFT in the algebraic setting ('AQFT') is an assignment of simply connected causal diamonds to von Neumann factors, O → A(O), represented on the same Hilbert space H, subject to the following conditions:

(a1) (Isotony) A(O_1) ⊂ A(O_2) if O_1 ⊂ O_2. We write A = ⋃_O A(O) with completion in the operator norm.
(a2) (Causality) [A(O_1), A(O_2)] = {0} if O_1 is space-like related to O_2.
(a3) (Relativistic covariance) For each g ∈ P covering a Poincaré transformation (Λ, a) ∈ P = SO_+(n − 1, 1) ⋉ ℝ^n, there is an automorphism α_g on A such that α_g A(O) = A(ΛO + a) for all causal diamonds O and such that α_g α_{g′} = α_{gg′} and α_{(1,0)} = id is the identity.
(a4) (Vacuum) There is a strongly continuous positive energy representation g → U(g)

In either case, a prime on a region O or K means the causal complement. For topologically trivial causal diamonds O with compact closure it is a result that A(O′) = A(O)′ (Haag duality), so by (a5), A(O) = B(O) for topologically trivial causal diamonds. Either A or B gives a net in the above sense with the possible exception of condition (a5) in the case of B. B(K) is in general strictly bigger than A(K) for topologically non-trivial regions K.

d_ρ ≥ 1 to the intertwining relations ρ̄ρ(a) r = r a, ρρ̄(a) r̄ = r̄ a (a ∈ A) (77) and conjugacy relations r* r = d_ρ 1, r̄* r̄ = d_ρ 1, r̄* ρ(r) = 1 = r* ρ̄(r̄). (78) A left inverse of ρ is given by ρ′(a) := d_ρ^{−1} r* ρ̄(a) r. The Jones projection for the extension A(O) of ρ(A(O)) is given by e_ρ = d_ρ^{−1} r̄ r̄* and the minimal conditional expectation is E

equality, we used that M_n ∨ M_n^c ≅ M_n ⊗ M_n^c as von Neumann algebras because M_n and M_n^c are finite-dimensional, and that T acts trivially on M_n^c by locality. In the third step we used external additivity of D_BS. In the last step we used again the martingale property. The same could be shown for O_1 → O_2. Thus the definition of c(T) is independent of whether we take O_1 or O_2 in (82).

Theorem 4.5. The complexity c has the following properties (T, T_i localized, transportable channels):
1. (Identity) c(id) = 0.
2. (Internal subadditivity) c(T_1 ∘ T_2) ≤ c(T_1) + c(T_2).
3. (Convexity) Let {p_i} be a probability distribution on a finite set. Then c(Σ_i p_i T_i) ≤ Σ_i p_i c(T_i).
4. (Locality) Let T_1 and T_2 be channels localized in spacelike related causal diamonds with strictly positive distance. Then c(T_1 ∘ T_2) = c(T_1) + c(T_2).
5. (N-ary local measurement) Let M(a) = Σ_i e_i a e_i be the channel describing an N-ary local measurement associated with the N mutually orthogonal non-trivial projections e_i ∈ A(O), Σ_i e_i = 1. Then M is localized and transportable and c(M) = log N.
6. (Net extensions) Let B be a net extending A [67] with corresponding conditional expectation E. Then c(E) = log[B(O) : A(O)]. (84)
7. (Localized transportable endomorphisms I) Let ρ be a localized transportable endomorphism with conditional expectation E_ρ and statistical dimension d_ρ. Then E_ρ is a localized transportable channel and c(E_ρ) = log d_ρ². (85)
8. (Translations) If T_x is the translate of T by x ∈ ℝ^n as in (81), then c(T_x) = c(T).
9. (Localized transportable endomorphisms II) If ρ ≠ id is a localized transportable endomorphism of A(O), then c(ρ) = ∞.
10. (Local unitaries) Let u ∈ A(O) be a unitary and U(a) = u* a u be the corresponding channel on A(O). Then if U ≠ id, we have c(U) = ∞.
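A few immediate consequences of these properties can be written out for measurement channels. The sketch below combines items 1 to 5 of the theorem; the channels 𝓜, 𝓜_1, 𝓜_2 and the weight p ∈ [0,1] are illustrative choices, and it is assumed (as remarked for localized, transportable channels) that convex combinations stay within the class.

```latex
% \mathcal{M}(a) = \sum_i e_i a e_i is an N-ary measurement; p \in [0,1] is illustrative.
\begin{align*}
\mathcal{M} \circ \mathcal{M}(a)
  &= \sum_{i,j} e_i e_j \, a \, e_j e_i = \sum_i e_i a e_i = \mathcal{M}(a)
  \;\Longrightarrow\; c(\mathcal{M} \circ \mathcal{M}) = \log N \le 2 \log N, \\
c(\mathcal{M}_1 \circ \mathcal{M}_2)
  &= \log N_1 + \log N_2
  \quad \text{(spacelike separated diamonds, positive distance)}, \\
c\big( p\, \mathrm{id} + (1-p)\, \mathcal{M} \big)
  &\le p\, c(\mathrm{id}) + (1-p)\, c(\mathcal{M}) = (1-p) \log N .
\end{align*}
```

The first line shows that internal subadditivity can be a strict inequality (for N ≥ 2), while the second shows that it is saturated for spacelike separated channels.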

Example 4.10. An example of localized endomorphisms $\rho$ as in item 7 of Theorem 4.5 in a free field theory is the following [70, Section 4.7]. Consider an $N$-component free complex Klein-Gordon quantum field $\phi_I(x)$, $I = 1, \dots, N$, in $n = 4$ dimensions. We get a net $A$ of all observables that are invariant under the obvious action of the $SU(N)$-symmetry. Consider a tensor $T_{I_1 \dots I_k}$, $I_j = 1, \dots, N$, whose symmetry properties under index permutations are characterized by some Young tableau $\lambda = (\lambda_1, \dots, \lambda_s)$, where $\lambda_1 \ge \dots \ge \lambda_s$ and where $\lambda_i$ is the number of boxes in the $i$-th row. Next, take test functions $f_I$ with support in a causal diamond $O$. Define

$$[\ldots] = C \sum_{I_1, \dots, I_k = 1}^{N} T_{I_1 \dots I_k}\, \phi_{I_1}(f_1) \cdots \phi_{I_k}(f_k) \tag{91}$$
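The Young tableau data $\lambda = (\lambda_1, \dots, \lambda_s)$ used in this example can be handled concretely. The following sketch checks that a tuple of row lengths is a valid tableau shape with at most $N$ rows and computes the dimension of the irreducible $SU(N)$ representation it labels via the standard hook-content formula. This excerpt does not state how $d_\rho$ depends on $\lambda$, so the code makes no claim about that relation; the function names are hypothetical.

```python
def validate_shape(rows, n):
    """Raise ValueError unless rows is a weakly decreasing tuple of positive ints with at most n rows."""
    if len(rows) > n:
        raise ValueError("a shape with more than N rows labels no SU(N) representation")
    if any(length <= 0 for length in rows):
        raise ValueError("row lengths must be positive")
    if any(rows[i] < rows[i + 1] for i in range(len(rows) - 1)):
        raise ValueError("row lengths must be weakly decreasing")

def irrep_dimension(rows, n):
    """Hook-content formula: product over boxes of (n + column - row) / hook length."""
    validate_shape(rows, n)
    numerator, denominator = 1, 1
    for r, length in enumerate(rows):
        for c in range(length):
            arm = length - c - 1                                   # boxes to the right in the same row
            leg = sum(1 for below in rows[r + 1:] if below > c)    # boxes below in the same column
            numerator *= n + c - r                                 # n plus the content of the box
            denominator *= arm + leg + 1                           # hook length of the box
    return numerator // denominator                                # the quotient is always an integer

for shape in [(1,), (2,), (1, 1), (2, 1)]:
    print(shape, irrep_dimension(shape, 3))
```

For $N = 3$ the shapes $(1)$, $(2)$, $(1,1)$ and $(2,1)$ give the fundamental, symmetric, antisymmetric and adjoint representations respectively.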

• Von Neumann algebra: A von Neumann algebra $M$ is a $*$-subalgebra of the algebra $B(H)$ of bounded operators on a Hilbert space $H$ that is closed in the weak operator topology. The weak topology is defined by the matrix elements, i.e. the open neighborhoods are [...]

[...] be a channel and let $\varphi$ be a normal state of $M$ with vector representative $\xi \in L^2(M)_+$ in the natural cone. There exists an $N$-$M$ bimodule $H_T$ and a vector $\eta \in H_T$ such that [...]

[...] (34)

[...] case is not considered in the present work since this function is not operator monotone but operator convex, and it is not obvious to what extent the variational formula in Proposition 2.16 still applies in this case.

Even though we will stick with the above definition in what follows, one may ask to what extent it is necessary to consider all bimodules in the definition of the channel divergence (34). If $M$ is properly infinite (a direct sum of factors of types I$_\infty$, II$_\infty$ or III) or a direct sum of type I$_n$ factors, then we have [...]

Remark 3.6. Consider a normal homomorphism $\theta : A \to M$. Then a new bimodule $H_\theta$ can be constructed by twisting the identity bimodule $L^2(M)$ on the right using $\theta$. More explicitly, $H_\theta = L^2(M)$ as a Hilbert space, the left action of $M$ is the one coming from the structure of $L^2(M)$ as a left $M$-module, while the right action of $A$ is defined by $r_\theta(a)\eta := \eta\theta(a)$, $\eta \in L^2(M)$, $a \in A$. In this case, the bimodule is denoted by $L^2_\theta(M,\ )$.

Definition 4.1. A channel $T : A \to A$ is called localized and transportable if it is of the form (79) for some localized (in some causal diamond $O$) transportable endomorphism $\theta$ and some isometry $v \in A(O)$.

Remark 4.2. (1) Note that by definition, $T|_{A(O)' \cap A} = \mathrm{id}$, i.e. $T$ is the identity in the causal complement of $O$.

(2) It is easy to see that the set of localized and transportable channels is stable under composition, i.e. the composition is again of the form (79). It is also closed under convex combinations: Let $T_i$ be localized, transportable channels of the form (79) with $v_i$, $\theta_i$, and $p_i$ a probability distribution on a finite set. Since $A(O)$ is type III [21], there are isometries $a_i$ in $A(O)$ satisfying the Cuntz algebra relations (58). Then set $v = \sum_i \sqrt{p_i}\, a_i v_i$ and $\theta(m) = \sum_i a_i \theta_i(m) a_i^*$, $m \in A(O)$. It follows that $\theta$ is a localized, transportable endomorphism of $A$, that $v$ is an isometry of $A(O)$, and that $\sum_i p_i T_i$ is of the form (79).

(3) One may generalize the definition to channels between two nets $A$, $B$.
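The closure under convex combinations can be verified by a short computation with $v = \sum_i \sqrt{p_i}\, a_i v_i$ and $\theta(m) = \sum_i a_i \theta_i(m) a_i^*$. The sketch below rests on two assumptions, since neither equation is reproduced in this section: that the form (79) reads $T_i(m) = v_i^* \theta_i(m) v_i$, and that the Cuntz relations (58) are $a_i^* a_j = \delta_{ij} 1$ and $\sum_i a_i a_i^* = 1$. It does not address localization or transportability of $\theta$.

```latex
\begin{align*}
% v is an isometry (uses a_i^* a_j = \delta_{ij} 1 and v_i^* v_i = 1):
v^* v &= \sum_{i,j} \sqrt{p_i p_j}\; v_i^* a_i^* a_j v_j
       = \sum_i p_i\, v_i^* v_i = \sum_i p_i\, 1 = 1, \\
% theta is unital (uses \sum_i a_i a_i^* = 1) and multiplicative:
\theta(1) &= \sum_i a_i a_i^* = 1, \qquad
\theta(m)\theta(m') = \sum_{i,j} a_i \theta_i(m)\, a_i^* a_j\, \theta_j(m') a_j^*
       = \sum_i a_i \theta_i(m m') a_i^* = \theta(m m'), \\
% the pair (v, theta) implements the convex combination:
v^* \theta(m) v &= \sum_{i,j,k} \sqrt{p_i p_k}\; v_i^* a_i^* a_j\, \theta_j(m)\, a_j^* a_k v_k
       = \sum_i p_i\, v_i^* \theta_i(m) v_i = \sum_i p_i\, T_i(m).
\end{align*}
```

The square roots $\sqrt{p_i}$ in $v$ are what produce the weights $p_i$ after the two orthogonality relations collapse the triple sum.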