Perturbation Theory for the Logarithm of a Positive Operator

In various contexts in mathematical physics one needs to compute the logarithm of a positive unbounded operator. Examples include the von Neumann entropy of a density matrix and the flow of operators with the modular Hamiltonian in the Tomita-Takesaki theory. Often, one encounters the situation where the operator under consideration, which we denote by $\Delta$, can be related by a perturbative series to another operator $\Delta_0$, whose logarithm is known. We set up a perturbation theory for the logarithm $\log \Delta$. It turns out that the terms in the series possess a remarkable algebraic structure, which enables us to write them in the form of nested commutators plus some "contact terms."


I. INTRODUCTION
In many different problems in mathematical physics one needs to compute the logarithm of a positive operator. This is commonplace in asymptotic quantum information theory, where one is interested in various quantities constructed from the logarithm of a density matrix. For instance, given a reduced density matrix $\rho$, one needs to compute $\log \rho$ to find the von Neumann entropy and relative entropies. In a general quantum system, in the Tomita-Takesaki theory, given the modular operator $\Delta_\Omega$ of a state $|\Omega\rangle$ or the relative modular operator $\Delta_{\Psi\Omega}$ of two states, one needs to compute their logarithms to obtain the modular flow operator $\Delta_\Omega^{it}$, or to calculate relative entropies. In this case, the positive operator in question, $\Delta_\Omega$, is unbounded.
Consider an unbounded positive operator $\Delta$. In general, obtaining $\log \Delta$ directly is difficult, since it has a simple form only in the spectral decomposition of the operator $\Delta$. We consider the following situation: (i) $\Delta$ is related to some other positive operator $\Delta_0$ by a smooth deformation, i.e. there exists a continuous parameter $\lambda$ and a family of operators $\Delta(\lambda)$ that interpolate between $\Delta(0) = \Delta_0$ and $\Delta(1) = \Delta$; (ii) the logarithm $\log \Delta_0$ is known explicitly. Imagine setting up a perturbative series for $\log \Delta(\lambda)$ in terms of $\log \Delta_0$ for small $\lambda$. If the perturbation series converges for $\lambda \le 1$, one can extend the series to $\lambda = 1$.
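As a purely illustrative sanity check of this idea, consider the commuting (scalar) case. The sketch below assumes a linear family $\Delta(\lambda) = \Delta_0 + \lambda v$ of positive numbers; this family, and the names `log_series`, `delta0`, `v`, and `lam`, are assumptions made here for illustration and are not the construction used in the paper. In this toy case the series is the ordinary Taylor series of the logarithm, which converges for $|\lambda v/\Delta_0| < 1$ and can therefore be evaluated at $\lambda = 1$ whenever $|v| < \Delta_0$.

```python
import math

def log_series(delta0, v, lam, terms=80):
    """Partial sum of the expansion of log(delta0 + lam*v) around lam = 0.

    Scalar toy model (everything commutes):
        log(delta0 + lam*v) = log(delta0) + sum_{m>=1} (-1)^(m+1) x^m / m,
    with x = lam*v/delta0, valid for |x| < 1.
    """
    x = lam * v / delta0
    total = math.log(delta0)
    for m in range(1, terms + 1):
        total += (-1) ** (m + 1) * x ** m / m
    return total

# Evaluate the series at lam = 1 and compare with the exact logarithm.
approx = log_series(2.0, 1.0, 1.0)
exact = math.log(2.0 + 1.0)
print(approx, exact)
```

For noncommuting operators the terms no longer reduce to powers of a single ratio, which is where the nested commutators and contact terms of the paper enter.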
It is the goal of this paper to set up such a perturbation theory. For a discussion of the fractional powers and the logarithm of bounded operators in the Hilbert space see [1,2].
For bounded operators that belong to the Lie algebra of a Lie group one often uses the Baker-Campbell-Hausdorff (BCH) expansion to compute the logarithm; see [3]. See also the recent discussions [4-9] in the context of quantum field theory.
Our main result is a series expansion in which the terms $Q_m$ can be written in the form of nested commutators plus "contact" contributions, Eq. (1.2), with the kernel

$$F_{\epsilon_i}(t_1, t_2, \dots, t_m) = f(t_1)\, g_{\epsilon_1}(t_2 - t_1)\, g_{\epsilon_2}(t_3 - t_2) \cdots g_{\epsilon_{m-1}}(t_m - t_{m-1})\, f(t_m), \tag{1.4}$$

where $f$ and $g_\epsilon$ are given in (1.5). In (1.2), the $P_m$ are terms with two fewer integrals ("contact terms"), whose structure is somewhat complicated and will be given later. The first few terms of this series are given in (1.6)-(1.9). It is important in (1.2) that one performs the integrals keeping the $\epsilon$'s nonzero and only then takes the $\epsilon_i \to 0$ limit. We have included the quintic contact term $P_5$ in Appendix B.
In the special case where the operators $\Delta$ and $\Delta_0$ are both bounded, one can use the spectral representation of these operators to match our expansion and the BCH expansion order by order in $\lambda$.
The plan of the paper is as follows. In Sec. II, we outline the main steps leading to the proof of (1.2) and give explicit expressions for $F_m$. In Sec. III and Sec. IV we fill in the details of the proof. In Appendix A we give a simple harmonic-oscillator example to illustrate the use of (1.6)-(1.9). In Appendix B we present the explicit expression for $P_{m=5}$. Appendices C-E include various fine details of the proof.

A. Setup
We start with the integral representation (2.1) of the logarithm of a (positive, invertible) operator $\Delta$. For unbounded operators $\Delta$, the integral on the right-hand side should be thought of as a limit of Riemann sums in the strong operator topology induced by the domain of the logarithm of $\Delta$. Thus, we have the operator equality (2.2) on the common domain of $\log \Delta_0$ and $\log \Delta$.
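A scalar sanity check may help fix ideas. The sketch below assumes the representation takes the standard form $\log \Delta = \int_0^\infty d\beta\, \big[(1+\beta)^{-1} - (\Delta+\beta)^{-1}\big]$, which is an assumption about the displayed equation rather than a quotation of it, and it replaces the operator by a positive number $x$. The function name and the quadrature scheme (substitution $\beta = u/(1-u)$ plus the midpoint rule) are illustrative choices.

```python
import math

def log_via_resolvent_integral(x, n=20000):
    """Approximate log(x), x > 0, from
        integral_0^inf dbeta [ 1/(1+beta) - 1/(x+beta) ].

    The two terms are combined as (x-1)/((1+beta)(x+beta)) to avoid
    cancellation at large beta. The half-line is mapped to (0, 1) by
    beta = u/(1-u), and the midpoint rule never evaluates u = 1.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        beta = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2      # d(beta)/du
        integrand = (x - 1.0) / ((1.0 + beta) * (x + beta))
        total += integrand * jacobian
    return total * h

for x in (0.1, 2.0, 50.0):
    print(x, log_via_resolvent_integral(x), math.log(x))
```

Subtracting the same expression for a second number $x_0$ cancels the $(1+\beta)^{-1}$ term, which is the scalar analogue of taking the difference $\log\Delta - \log\Delta_0$.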
Introduce the operator $\alpha$, which we take to depend on a continuous parameter $\lambda$ and to vanish as $\lambda \to 0$. To lighten the notation, we keep the $\lambda$ dependence implicit. We stress that $\alpha$ is an unbounded operator despite the fact that it is proportional to a small parameter $\lambda$.
Since the function $f(x) = \sqrt{x}\,(x + \beta)^{-1}$ is bounded for positive $\beta$ and $x$, the operator $\Delta^{1/2}(\Delta+\beta)^{-1}$ is a bounded operator in the Hilbert space. To make sense of the perturbation theory we assume that there exists a constant $c$ such that $\Delta(\lambda) < c\,\Delta_0$. As a result, the operator $\Delta_0^{1/2}(\Delta + \beta)^{-1}$ is also bounded. This is part of what we mean by $\lambda$ being a small perturbation.
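The boundedness claim can be made quantitative for the scalar function: by the arithmetic-geometric mean inequality $x + \beta \ge 2\sqrt{x\beta}$, one has $\sqrt{x}/(x+\beta) \le 1/(2\sqrt{\beta})$, with equality at $x = \beta$. The short check below is only a numerical illustration of that elementary bound; the function names are hypothetical.

```python
import math

def f(x, beta):
    """Scalar function sqrt(x)/(x + beta) for x >= 0 and beta > 0."""
    return math.sqrt(x) / (x + beta)

def sup_bound(beta):
    """Supremum of f(., beta) over x >= 0, attained at x = beta."""
    return 1.0 / (2.0 * math.sqrt(beta))

for beta in (0.01, 1.0, 100.0):
    # Logarithmic grid of x values from 1e-6 to 1e6.
    grid = [10.0 ** (e / 10.0) for e in range(-60, 61)]
    largest = max(f(x, beta) for x in grid)
    print(beta, largest, sup_bound(beta))
```

The bound grows like $\beta^{-1/2}$ as $\beta \to 0$, so the operator is bounded for each fixed $\beta > 0$ but not uniformly in $\beta$.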
For $\beta > 0$, we define the bounded operator $A$ to rewrite the integrand of Eq. (2.2) as in (2.5), where we have introduced the operators $B$ and $\delta$. One might want to naively expand $(1 - \Delta_0^{1/2} A \alpha)^{-1}$ in the second line of (2.5) in a power series in $\Delta_0^{1/2} A \alpha$. But this is not a good expansion, as $\Delta_0 A \alpha$ is an unbounded operator. This naive expansion is similar to the approach taken in [9]. It leads to singular integrals, which are sensible only if one provides a prescription to deform the integration contour. To circumvent this problem, in the third line of (2.5) we introduced the operator $\delta$, which is bounded with norm $\|\delta\| \le 2$. To see this, note that the spectrum of the closure of $\alpha$ is contained in $(-\infty, 1)$, and on this set $\frac{x}{1 - x/2}$ is a bounded function with range in $(-2, 2)$. On a dense domain, $\delta$ and its closure agree. Finally, expanding $B$ in the spectral decomposition of $\Delta_0$, we find that the spectrum of $B$ is contained in $[0, \tfrac{1}{2}]$. Therefore, $\|B\| = 1/2$ and by the Cauchy-Schwarz inequality $\|B\delta\| \le 1$.
For $\|B\delta\| < 1$, expanding the third line of (2.5) in powers of $B\delta$ gives a convergent series.
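The mechanism behind this statement is the geometric (Neumann) series: for a bounded operator $M$ with $\|M\| < 1$, the partial sums of $\sum_{n \ge 0} M^n$ converge in norm to $(1 - M)^{-1}$. The sketch below illustrates this with a small matrix `M` standing in for a bounded operator of norm less than one; the matrix entries and helper names are assumptions chosen for illustration and do not represent the paper's $B\delta$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(m, terms):
    """Partial sum I + M + M^2 + ... + M^(terms-1)."""
    n = len(m)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = matmul(power, m)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total

# Stand-in matrix: maximum absolute row sum 0.5, so its induced norm is < 1.
M = [[0.2, 0.3], [0.1, 0.4]]
S = neumann_sum(M, 60)
one_minus_M = [[identity(2)[i][j] - M[i][j] for j in range(2)] for i in range(2)]
residual = matmul(one_minus_M, S)   # equals I - M^60, i.e. close to I
print(residual)
```

When the norm equals one, the partial sums need not converge in norm, which is why the borderline case requires separate treatment.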
In general, it is not possible to exclude the case $\|B\delta\| = 1$. To justify our expansion, we restrict to those vectors $|x\rangle$ in the Hilbert space which satisfy

$$\|B\delta A|x\rangle\| < \|A|x\rangle\|. \tag{2.9}$$

On this set, we have the operator equality (2.10), and the sum on its right-hand side is pointwise convergent. We will not specify $|x\rangle$ below, but its presence should always be kept in mind.
Using (2.5)-(2.10) in (2.2) we then find (2.11). We now further rewrite this expression using the one-parameter unitary group $\Delta_0^{it}$ generated by $\log \Delta_0$. In particular, we use the integral expressions (2.12) and (2.13) for $A$ and $B$. In Appendix D, we show that the $\epsilon \to 0$ limit appearing in (2.13) exists.
Plugging (2.12) into (2.11), we find (2.14). Notice that if we exchange the order of the $\beta$- and $t$-integrals (with the associated $\epsilon_i \to 0$ limit), the $\beta$-integral can be performed explicitly, giving (2.15). Equation (2.14) can then be further written (shifting the sum over $m$ to start from 1) as (2.16)-(2.18), where the kernel $F$ is defined in (2.18). A variant of (2.16)-(2.18) has appeared previously in [9]. The main goal of the paper is to show that the kernel (2.18) has remarkable symmetry properties, which enable one to write (2.20).

The contact term $P_m$ consists of terms which contain two fewer integrals. It has the structure (2.21), where $s$ sums over all possible ways in which three $t_i$'s are selected from the set $\{t_1, \dots, t_m\}$ such that at least two of the indices on the chosen $t_i$ are adjacent. The three chosen $t_i$'s are set equal to $t$, with the rest relabeled as $t_1, \dots, t_{m-3}$. $J_s$ is a kernel which can be obtained from $F_{\epsilon_i}$ by applying a number of operations which are described in the next subsection. For now, we give some simple examples. As the selected indices become larger, the number of terms in $J$ increases and the terms also become more complicated. For instance, $J$ can have a term of the form $\frac{1}{f(t)^2}\, F(t_3, t_2, t, t_1)\, F(t, t_4, t_5, \dots, t_{m-3})$.

B. Basic ideas for the proof
The kernel (2.18) looks complicated, but it satisfies a number of remarkable identities under permutations of its arguments. To explain the basic idea leading to the proof of (2.20), we first need to establish some notation.
Let $S_m$ be the symmetric group of permutations of $m$ distinct objects. We use cycle notation for its elements; any index not listed in a cycle is left untouched. We define the action of an element $\sigma \in S_m$ on a product of operators $\delta(t_{i_1})\delta(t_{i_2})\cdots\delta(t_{i_m})$ and on a general function as in (2.24) and (2.25), respectively. Note that in (2.24) $\sigma$ acts on the left, while in (2.25) it acts on the right. See Appendix C for further explanations and examples. Let us also introduce the special permutations

$$\mu_j = (j\,(j-1)\,(j-2)\cdots 4\,3\,2\,1), \qquad \Lambda_j = (1\,2\,3\,4\cdots(j-1)\,j), \qquad \mu_j = \Lambda_j^{-1}. \tag{2.26}$$

One can show that the following statements are true:

1. Introduce the operator (2.27), where id denotes the identity operation. Then (2.28) holds.

2. The identity (2.29) holds, where the operation $\Sigma_m$ is defined in (2.30).

3. With $\Xi^m_k$ defined in (2.31), the identity (2.32) holds, where $N_j[O]$ are "contact terms" containing only $m - 2$ integrals. They will be given explicitly at the end.
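To make the cycle notation concrete, the following sketch builds $\Lambda_j$ and $\mu_j$ as explicit maps on $\{1, \dots, n\}$ and checks that they are inverses of each other. It assumes the common convention that a cycle $(a_1\, a_2 \cdots a_k)$ sends $a_1 \to a_2$, ..., $a_k \to a_1$; the paper's own conventions for how permutations act on operators and functions are fixed in its Appendix C and are not modeled here. All function names are hypothetical.

```python
def cycle_to_map(cycle, n):
    """Permutation of {1..n} from a single cycle (a1 a2 ... ak).

    Assumed convention: a1 -> a2, ..., ak -> a1; indices not listed
    in the cycle are left untouched.
    """
    perm = {i: i for i in range(1, n + 1)}
    for pos, a in enumerate(cycle):
        perm[a] = cycle[(pos + 1) % len(cycle)]
    return perm

def compose(p, q):
    """Composition (p o q)(i) = p(q(i)), i.e. q acts first."""
    return {i: p[q[i]] for i in q}

def Lambda(j, n):
    """The cycle (1 2 3 ... j) on {1..n}."""
    return cycle_to_map(list(range(1, j + 1)), n)

def mu(j, n):
    """The cycle (j (j-1) ... 2 1) on {1..n}."""
    return cycle_to_map(list(range(j, 0, -1)), n)

n, j = 6, 4
identity_map = {i: i for i in range(1, n + 1)}
print(Lambda(j, n))
print(mu(j, n))
print(compose(mu(j, n), Lambda(j, n)) == identity_map)
```

Reversing the order of the entries in a cycle always produces the inverse permutation, which is the content of the relation $\mu_j = \Lambda_j^{-1}$.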
These statements imply (2.33), which leads to (2.20) upon using (2.28) together with the identification (2.34).

We now give the explicit expressions for $N_j$. First, let us introduce some definitions. For integers $q_2 < q_1$ we define the objects in (2.35)-(2.36), where $M$ is some operator and on the right-hand side of (2.36) the integrations are over all distinct $t_i$'s (i.e. $q_1 - 2$ integrations). Given an operator $O(t_1, t_2, \dots, t_m)$, for each $p \le k$, where $k$ is the lower index in the object $\Xi^m_k$, we define the operator $W$ as in (2.37), where $\chi_{pj}$ is given by (2.38) and we have used the shorthand notations (2.39). We also introduce

$$\tilde{O}_j(y_1, y_2, \dots, y_{p+2}) = W(y_{p+2}, y_{p+1}, \dots, y_3, y_2, y_1), \tag{2.40}$$

which amounts to the relabelling (2.41). Finally, the expression for $N_j[O]$ is given by (2.42).

In summary, to obtain the kernel $J$ of (2.21) corresponding to $t_j$, $t_{r+1}$, $t_r$ for some $1 \le r < j - 1$, we use the following algorithm: To obtain $J$ corresponding to the choice $t_j = t_{j-1} = t_r$ when $1 < r < j - 2$, the previous steps are followed with the choice $p = r$ and no sum over $p$ in the last step. To obtain $J$ corresponding to the choice $t_j = t_{j-1} = t_1$, follow the previous steps after setting $p = 1$ and, to the result, add the term $2m\,(g_\epsilon(t_1 - t_{j-1})\, g_\epsilon(t_j - t_1))$. This exhausts all the cases.

The rest of the paper is devoted to establishing (2.28)-(2.32) and justifying the existence of the $\epsilon_i \to 0$ limit. In Sec. III we prove (2.28) and (2.29). In Sec. IV we prove (2.32). Appendix D discusses in detail the $\epsilon_i \to 0$ limit. In Appendix E we examine more carefully the interchange of the $\beta$-integral and the $\epsilon_i \to 0$ limit used in (2.14).

III. PERMUTATION IDENTITIES (I)
In this section, we present a proof of Eq. (2.29). Consider the integral (3.1). Continuing the process of relabeling repeatedly, we get (3.4). In Eq. (3.4), the permutations act on the operators. But for subsequent applications, we need the permutations to act on functions. This is achieved by the observations (3.5)-(3.10), where in (3.6) we used (C5), in (3.7) we used the permutation invariance of $n$-dimensional integrals, in (3.8) we chose $\sigma = \tau^{-1}$, and finally in (3.10) we used (C6) and the fact that $\mu_j$ and $\Lambda_j$ are inverses of each other. Using (3.10) in (3.4) we then find (2.29).

IV. PERMUTATION IDENTITIES (II)
In this section, we prove (2.32), which is the most nontrivial step in the proof of (2.20).
Let us first note the identity (4.1), where $h(t)$ is a function regular at $t = 0$. In subsequent manipulations, we will abbreviate identities of this type by dropping the integral and the limit, as in (4.2), which should cause no confusion. We also remind the reader of the shorthand notation introduced in (2.39). For instance, from (2.18) we have (4.3) and (4.4). Before proving (2.32) we first prove a lemma.
Lemma: Consider the operator (4.5), where $O(t_1, \dots, t_m)$ is a general operator and $F_{\epsilon_m}$ is defined in (4.3). For all $m \ge 2$, we have (4.6), where $R^{(m)}_q[O]$ is defined in Eq. (2.36). We remind the reader that $\Lambda_l$ only acts on the kernel function $F$.
Proof: We prove this by induction on $m$. Taking $m = 2$, $\Lambda_2 = (12)$, we get (4.7), which is trivially true by the antisymmetry of the function $g$. So the correct base case is $m = 3$. Performing a few relabelings of the $t_i$, we get (4.8). Using a hyperbolic identity of the form $4\cosh(\pi t_2)\sinh(\cdots) = \cdots$, Eq. (4.9), and some algebraic manipulations, we can write the term in the parentheses of (4.8) as (4.10). Integrating against $O(t_1, t_2, t_3)$ and taking $\epsilon_i \to 0$, we get (4.11). The second term of (4.11) can be simplified by noting the identity (4.12), valid for $f$ continuous in $a$, bounded and integrable. Thus, we obtain (4.13). From the definitions (2.35)-(2.36) we note (4.14), and thus (4.15), which establishes the lemma for the case $m = 3$.
Suppose the lemma holds for some $m \ge 4$, i.e., (4.16). Then, we have (4.17). With a computation very similar to the case $m = 3$ we find the relation (4.18). Finally, we perform the integration over $t_m$ in the first term on the right-hand side of (4.18) and use the induction hypothesis to obtain (4.19), which proves the lemma.

B. Final proof
We now prove (2.32), which, using the definition (4.5), can be written as (4.20). Our strategy is to use induction on $m$ with a fixed $k$. First, consider $m = k + 1$. In this case the required identity can be checked by an explicit computation and follows from $g_\epsilon(-t) = -g_\epsilon(t)$. This completes the proof for the case $m = k + 1$. Next, consider $m = k + 2$, which is the base case for our induction argument. By explicit computation one can show that the identity holds in terms of an operator $O'$, where in the last step we have used the fact that $g_\epsilon(t)$ is an odd function. This completes the proof for $m = k + 2$. Now, suppose (2.32) holds for $m \ge k + 3$. Remember the definition of $\Xi^m_k$ from (2.31).

It can be checked explicitly that (4.31) holds. Let us denote the sum of the second and the third terms in (4.31) as $V$. First, split $V$ into $S_1$ and $V_1$, where we have repeatedly used the shorthand notation and set $y_3 = t_1$ for the moment. We find (4.38).

Integrating $S_1$ against $O(t_1, \dots, t_m)$ and taking $\epsilon \to 0$, we find that, after applying the Lemma, this is precisely the $p = 1$, $j = m$ term in $N_j[O]$ in Eq. (2.42) that we are looking for. Now, $V_1$ can be further split into $S_2$ and $V_2$. It is convenient to rename $y_1 = t_m$, $y_2 = t_{m-1}$, $y_3 = t_2$ and $y_4 = t_1$. Using these equalities we can write $S_2$ as (4.47). If we integrate $S_2$ in (4.47) against $O(t_1, \dots, t_m)$ and take all the $\epsilon \to 0$, we find the $p = 2$, $j = m$ term in $N_j[O]$ in Eq. (2.42).

Now, the pattern is clear: we split the remainder $V_j$ into $S_{j+1}$ and $V_{j+1}$ until $j = k$. The general term $S_p$ is given in (4.49). Note that $\Theta^m_p$ is a direct generalization of (4.43) and $\Xi^j_0 = 1$. The basic idea to compute $S_p$ is the same as that of $S_2$. One has to show that $S_p$ can be written as (4.51). It is important to remember that $\chi'_{pm}$ does not depend on any of the $y$ variables. It then follows from the Lemma that after integrating against $O(t_1, \dots, t_m)$ and sending $\epsilon \to 0$, $S_p$ has the form required in Eq. (2.42).
To finish the proof, we need to demonstrate (4.51). For this purpose, let us look at the first term of $S_p$ in (4.49), given in (4.53). Expanding $\Xi$ explicitly we obtain (4.60). The first equality of (4.60) says that the action of $\Lambda_m K_j$ on $\chi_2$ is independent of $j$, and we have named the resulting function $\chi'_{pm}$ in (4.51). The second equality says that $\chi'_{pm}$ can also be obtained from the permutation $\Lambda_{m-1} \cdots \Lambda_{m-p}$ acting on $\chi_2$. The reasons behind (4.60) are: (i) the number of cycles in each permutation string acting on $\chi_2$ is the same; (ii) the length of each cycle is larger than the highest index occurring in $\chi_2$. As a simple example, consider the following permutation on some function $G(t_1, t_2, t_3)$:

$$\Lambda_{10}\Lambda_9\Lambda_8\, G(t_1, t_2, t_3) = \Lambda_6\Lambda_5\Lambda_4\, G(t_1, t_2, t_3). \tag{4.61}$$

The $\chi'_{pm}$ in Eq. (4.60) has the precise expression we saw for the $j = m$ contact contribution in Eq. (2.42), which is

$$G(t_{m-1}, t_p, t_{p-1}, \dots, t_{j+2}, t_m, t_{j+1}, t_j, \dots, t_2, t_1). \tag{4.64}$$

Collecting everything together, we thus find that (4.53) can be written in the form required by (4.51).

imply that $\frac{e^{\lambda(1-\epsilon)}}{e^{\lambda}+\beta}$ is dominated by an integrable function. Then, the Lebesgue dominated convergence theorem [11]

The steps of the algorithm for obtaining the kernel $J$ of (2.21) are:

2. For each such $k$ and $p$, apply the string of permutations on the right-hand side of Eq. (2.38) to the first $j - p - 2$ arguments of $F_\epsilon$ to obtain $\chi_{pj}$. Sum over these permutations.

3. Keep all arguments above $j$ unpermuted, and delete the arguments between $j - p - 2$ and $j$. Divide the result by $\frac{1}{f(t_j) f(t_{j-1})}$.

4. Define the variables $y_i$ as in (2.41). Apply the permutation $\Lambda_{p+2-r}$ to $F(y_1, \dots, y_{p+2})$ and divide by $4 g_\epsilon(y_1 - y_{p+2-r})\, g_\epsilon(y_{p+3-r} - y_1)$. Now set $t_j = t_{r+1} = t_r$.

5. Multiply this by the result obtained in Step 3 after re-expressing $y_i$ in terms of $t_i$. Perform the sum over $p$. Sum over $k$ after multiplying by $(-1)^{k-1}$. Multiply the whole result by $2\pi^m$.