Perturbative Method for Mutual Information and Thermal Entropy of Scalar Quantum Fields

A new approach is presented to compute entropy for massless scalar quantum fields. By perturbing a skewed correlation matrix composed of field operator correlation functions, the mutual information is obtained for disjoint spherical regions of size $r$ at separation $R$, including an expansion to all orders in $r/R$. This approach also permits a perturbative expansion for the thermal field entropy difference in the small temperature limit ($T \ll 1/r$).


Introduction
It has been appreciated for some time that the entropy of a black hole is not an extensive quantity. It is proportional to the black hole's surface area A rather than its volume [1,2],

S_BH = A / (4G),

where G is Newton's gravitational constant. This equation is considered to be one hint for developing a theory of quantum gravity. Indeed, the non-extensive nature of the black hole's entropy has been linked to a possible breakdown of local quantum field theory due to gravitational effects [3][4][5][6][7][8][9]. This and similar inquiries may benefit from further work on the entropy of quantum fields.
In [10], Srednicki noted a connection between black hole entropy and the von Neumann entropy of a quantum field theory, defined as the expectation value of the negative logarithm of the density operator ρ of the state,

S(ρ) = −⟨ln ρ⟩ = −Tr(ρ ln ρ).

In short, the entropy of a massless scalar field on a sphere follows an area law similar to that of black hole entropy [10]. For a general review, see Casini and Huerta [11].
Unfortunately, the quantum field theoretic entropy on a sphere has an ultraviolet divergence which complicates its study. For instance, Srednicki had to impose an ultraviolet cutoff by discretizing the sphere with a finite lattice. Rather than deal with this divergence directly, some authors choose to study a related quantity, the mutual information of two spatially separated regions [12][13][14][15]. Given a bipartite state with density operator ρ_AB and reduced density operators ρ_A and ρ_B, the mutual information can be defined in terms of entropy,

I = S(ρ_A) + S(ρ_B) − S(ρ_AB).

In this work, we will focus on the mutual information between two spheres separated by a distance R. In a d-dimensional scalar field theory, this quantity has been shown to follow an area law similar to entropy: for large R it is proportional to the product of the surface areas A_1 and A_2 of the two spheres [11]. Shiba studied this case numerically for a massless field by approximating a quantum field theory with a finite number of harmonic oscillators and found estimates for the coefficient C in two and three dimensions [16]. Cardy developed an analytic method to find these coefficients and found good agreement with Shiba's results [17]. This was later extended by Agón and Faulkner [18], as well as by Chen et al. to arbitrary dimensions [19], with an explicit coefficient expressed in terms of the radii r_1 and r_2 of the spheres and the Gamma function Γ. These analytical computations involved computing a related quantity, the Rényi mutual information, built from the Rényi entropy S_a, which converges to the standard entropy as a → 1,

S_a(ρ) = 1/(1 − a) log(Tr(ρ^a)).
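As a concrete illustration of these definitions (a toy lattice computation in the spirit of Shiba's approach, not the continuum calculation of this paper), the mutual information of two small regions can be evaluated numerically for a chain of coupled harmonic oscillators. The coupling matrix, the mass m, and the site choices below are illustrative assumptions; entropies are computed from the standard symplectic eigenvalues of the reduced correlation matrices.

```python
import numpy as np

def ground_state_correlators(K):
    """Vacuum two-point functions X = <phi phi>, P = <pi pi> for H = (pi.pi + phi.K.phi)/2."""
    w2, U = np.linalg.eigh(K)              # squared normal-mode frequencies
    w = np.sqrt(w2)
    X = (U * (0.5 / w)) @ U.T              # X = K^{-1/2} / 2
    P = (U * (0.5 * w)) @ U.T              # P = K^{1/2} / 2
    return X, P

def entropy(X, P):
    """Von Neumann entropy of a Gaussian state with vanishing <phi pi> correlations."""
    nu = np.sqrt(np.linalg.eigvals(4 * X @ P).real)   # symplectic eigenvalues (>= 1)
    nu = np.clip(nu, 1.0 + 1e-12, None)
    return float(np.sum((nu + 1) / 2 * np.log((nu + 1) / 2)
                        - (nu - 1) / 2 * np.log((nu - 1) / 2)))

# Nearest-neighbour chain; a small mass m regulates the would-be zero mode.
N, m = 60, 0.2
K = (2 + m**2) * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
X, P = ground_state_correlators(K)

def reduced(sites):                        # partial trace = taking a block of X and P
    ix = np.ix_(sites, sites)
    return X[ix], P[ix]

A, B = [20, 21, 22], [30, 31, 32]          # two disjoint three-site 'spheres'
S_A, S_B = entropy(*reduced(A)), entropy(*reduced(B))
S_AB = entropy(*reduced(A + B))
I = S_A + S_B - S_AB                       # mutual information of the two regions
```

Moving B farther from A makes I decay toward zero, the lattice analogue of the large-R falloff discussed above.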
References [17][18][19] used a replica trick to compute S_a for integer a ≥ 2 and analytically continued to a = 1 to compute the leading coefficient of the mutual information. This is notably different from the strategy Shiba pursued using numerical calculations on a lattice.
The first goal of this paper is to present an alternative technique to exactly compute these leading coefficients, and all subsequent coefficients, which is more in line with numerical lattice techniques but relies on neither analytic continuation nor the replica trick.
This alternative technique will also allow us to compute the entropy difference for thermal fields. Parallel to the work on mutual information, there has been considerable effort on understanding the divergences of the entropy on a sphere and the ways to regularize it [11]. The most straightforward approach is to simply subtract the entropy of the vacuum from that of an excited state on the sphere,

∆S = S(ρ) − S(ρ_0),

where ρ is the density operator of the state of interest and ρ_0 is the density operator of the vacuum. This is reasonable because only the change in entropy has physical meaning. In an influential paper [20], Casini provided an argument for the finiteness of this quantity. Casini also pointed out the connection this quantity may have with black hole entropy. A slight reformulation of the area law for black hole entropy is the Bekenstein bound: the entropy of a physical state should be bounded from above by the entropy of a black hole of the same size and with mass equal to its energy. This entropy is a constant λ times the state's energy E times a characteristic length R,

S ≤ λER.

Casini provided an interpretation of this bound in quantum field theory, using the modular Hamiltonian Ĥ of the vacuum ρ_0,

ln ρ_0 = −Ĥ − c Î.
Here, c is some constant representing the free energy of the vacuum. In place of the entropy is the entropy difference ∆S, and in place of λER is what we will call Casini's K,

K = ⟨Ĥ⟩_ρ − ⟨Ĥ⟩_{ρ_0}.

Casini noted that the difference K − ∆S equals the relative entropy S(ρ||ρ_0), which is always nonnegative, ensuring the bound ∆S ≤ K,

S(ρ||ρ_0) = ⟨ln(ρ) − ln(ρ_0)⟩_ρ. (12)

The second goal of this paper is to study the entropy difference and Casini's K for a massless scalar thermal field partially traced to a spherical region, which is a relatively simple example of a nontrivial excited state. Specifically, we will construct an expansion for the entropy difference for such a thermal field at low temperature.
We will handle both of these problems using the same framework. We will use the fact that both the vacuum on two spheres and a thermal field on one sphere are Gaussian states, which are particularly tractable. Furthermore, both of these examples have the scalar vacuum as a limiting case, which allows us to use a perturbative approach. For the mutual information computation, when the distance between the two spheres is infinite, the spheres do not interact, becoming two copies of a vacuum on a single sphere. For the thermal entropy difference computation, a thermal field approaches the vacuum as the temperature goes to zero. The massless scalar vacuum on a sphere is conformally equivalent to the vacuum on a half plane and is well understood as a result [21]. We use the term nearly conformal to refer to families of states with this property of limiting to conformal states.
Our central aim is to create a general perturbative strategy to compute expansions of information theoretic and thermodynamic quantities for nearly conformal field theories. The general strategy will be to compute an expression for the series coefficients on a finite lattice, which is equivalent to computing a series expansion for a finite dimensional Gaussian state. We then take the continuum limit to find our coefficients. We will apply this strategy to our two goals.
The outline of this paper is as follows. Section 2 presents preliminary definitions for the classical and Hilbert spaces in our setup, along with associated correlation functions and projections. Section 3 gives some background on Gaussian states and provides a general description of our perturbative method, including a definition of the skewed correlation matrix. Section 4 develops the skewed correlation matrix for a massless quantum scalar field on a sphere. Section 5 gives the perturbative expansion away from a state that is the real scalar vacuum on a sphere. Sections 6 and 7 work out the general series for the mutual information of distant spheres and the temperature of a thermal field, respectively. Section 8 gives a brief argument showing why one should expect an area law to appear for the entropy difference in a scalar field in general. In Section 9 we present general conclusions and future directions. Appendix A details general mathematical features of Gaussian states in lattice field theories, including the modular Hamiltonian. Appendix B lays out some mathematical background on symplectic matrices, the Takagi factorization, and spherical and cylindrical symmetries of classical fields on a sphere, which may be useful to reference at certain points in the derivation.

Lattice Field Theory and the Classical Space
In this section, we summarize the approach we use to treat Gaussian states in scalar lattice field theory. We then give the general formula for a perturbative expansion for the entropy and mutual information of Gaussian states on a lattice. For a detailed derivation of this series, see Appendix A. Understanding the Gaussian states on the lattice will then allow us to take the continuum limit in Section 4. As such, many of the results in this section will be presented in a way that makes taking the continuum limit as natural as possible.

The Classical Space
Any quantum Hilbert space is the quantization of some classical space, with classical variables promoted to operators in the quantum space. For a real scalar field on a finite lattice of size N, the field operator φ̂_j on the j-th lattice point corresponds to a classical variable φ_j describing the value of the field at that point. The classical space is therefore itself a vector space, isomorphic to the vector space R^N.
Since the classical space is a vector space, one might ask whether there are any natural linear transformations on this vector space, and indeed there are. Consider the two point correlation functions of some state ρ, defined as

X_jk = ⟨φ̂_j φ̂_k⟩,  P_jk = ⟨π̂_j π̂_k⟩,  (V_off)_jk = (1/2)⟨φ̂_j π̂_k + π̂_k φ̂_j⟩.

Here, φ̂_j denotes the field operators, π̂_j denotes their conjugate momentum operators, and ⟨•⟩ denotes the expectation value with respect to ρ. We can then define matrices X, P, and V_off on the classical space whose entries are the two point correlation functions. These matrices will be the central tool we use to analyze entropy in lattice field theory. However, note that we are now dealing with linear operators in both the quantum Hilbert space and the classical space. To concisely distinguish between the two, this work will refer to quantum operators as 'operators' and use hats: Ô. We will use no hats for operators on the classical space and refer to them as 'matrices', 'classical matrices' or 'linear mappings on the classical space'. This convention seems natural for now. But in the continuum limit, we will see that the classical space is infinite dimensional, and linear operators on infinite dimensional spaces are typically not called matrices. Regardless, we will still use this convention in the continuum limit for consistency.
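As a small self-contained sketch (with an assumed nearest-neighbour coupling matrix K and an illustrative mass m), the matrices X and P for a lattice vacuum can be built explicitly and then act as ordinary linear maps on the classical space R^N; for instance, a·Xa gives the variance of the smeared field Σ_j a_j φ_j.

```python
import numpy as np

N, m = 8, 0.5                                   # illustrative lattice size and mass
K = (2 + m**2) * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
w2, U = np.linalg.eigh(K)                       # normal modes of the coupled oscillators
X = (U * (0.5 / np.sqrt(w2))) @ U.T             # X_jk = <phi_j phi_k> in the vacuum
P = (U * (0.5 * np.sqrt(w2))) @ U.T             # P_jk = <pi_j pi_k>; V_off vanishes here
a = np.ones(N)                                  # a classical vector in R^N
var = a @ X @ a                                 # variance of the smeared field sum_j a_j phi_j
```

For the vacuum, 4XP is the identity on the classical space, the matrix statement that the global state is pure.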

Classical Spaces and Tensor Products in Lattice Field Theory
To describe mutual information in lattice field theory using the formalism of the classical space, we must understand how the classical space of a tensor product behaves in our setup. More precisely, suppose we have a quantum Hilbert space H, with corresponding finite dimensional classical space C, which decomposes into a tensor product H_1 ⊗ H_2. Suppose H_1 and H_2 have their own classical vector spaces C_1 and C_2. Furthermore, suppose the field operators in H_1 and H_2 are field operators in H as well. We must find how C, C_1 and C_2 are related.
Since the classical space is built from the field operators, C_1 and C_2 are subspaces of C. Because they are finite dimensional, they are closed subspaces as well. Because H = H_1 ⊗ H_2, we can create an orthonormal basis in H by constructing a complete set of commuting observables out of the field operators in H_1 and those in H_2. In particular, this means that any field operator can be written as a linear combination of field operators in H_1 and H_2. Put in terms of the classical space, this means that a basis for C can be constructed from elements of C_1 and C_2. We can also show that C_1 and C_2 do not intersect except at the zero vector. If they did, the corresponding field and conjugate momentum operators would be operators in both H_1 and H_2. This is impossible because it would imply that an operator on H_1 does not commute with an operator on H_2. Putting the above facts together, we see that C = C_1 ⊕ C_2. Therefore, a tensor product in the quantum space corresponds to a direct sum in the classical space.

Block Matrices of a Classical Matrix in Lattice Field Theory
In this paper, it will frequently be useful to divide a matrix into sub-matrices, or blocks. To make the generalization to the continuum limit as simple as possible, we provide a basis independent way to define block matrices in the classical space.
Suppose the classical space C decomposes as a direct sum of closed subspaces, C = ⊕_{i=1}^n C_i. This includes the special case C = C_1 ⊕ C_2. We can define the classical projection onto C_i as the matrix P_i from C to C_i such that P_i v equals v if v ∈ C_i and equals 0 if v ∈ C_j for any j ≠ i. We define the dual of the projection, P̃_i, as the matrix that naturally embeds vectors from C_i into C. Given a matrix A and a decomposition of C, we define the block of A from C_i to C_j with respect to that decomposition as A_{C_i,C_j} = P_j A P̃_i.
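In a finite-dimensional setting this definition reduces to taking ordinary sub-blocks. A quick numpy sketch (with arbitrary illustrative dimensions), representing the projections and their duals as rectangular matrices:

```python
import numpy as np

N1, N2 = 3, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((N1 + N2, N1 + N2))        # an arbitrary matrix on C = C1 (+) C2

P1 = np.hstack([np.eye(N1), np.zeros((N1, N2))])   # projection C -> C1
P2 = np.hstack([np.zeros((N2, N1)), np.eye(N2)])   # projection C -> C2
# In this basis the dual (the embedding C_i -> C) is just the transpose.
A_11 = P1 @ A @ P1.T                               # block of A from C1 to C1
A_12 = P2 @ A @ P1.T                               # block of A from C1 to C2
```

The projector formulation matters because in the continuum limit there is no preferred basis of lattice sites to slice along, while projections onto subregions still make sense.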
In addition, we will often be considering matrices on the vector space C² = C ⊕ C in this paper. This vector space can be decomposed as C² = ⊕_{i=1}^n C_i², where C_i² = C_i ⊕ C_i. Therefore we can define, analogously to the above, blocks from C_i² to C_j². This definition of a block matrix will be necessary to write the mutual information in a way that generalizes naturally to the continuum limit. As we will see, it will also be necessary to properly account for spherical and cylindrical symmetry in the continuum limit. But before we take the continuum limit, we must complete our description of the lattice field theory case in the next section.

Perturbative Series for Entropy of Gaussian States in Lattice Field Theory
In this section, we summarize the results of Appendix A, which gives an overview of Gaussian states. Gaussian states are a special subset of states which are completely defined by their two point correlation matrices along with their field expectation values. However, we will always assume that the latter are zero, meaning our Gaussian states are completely defined by their two point correlation matrices alone. To keep things mathematically straightforward, we continue to focus on a scalar field on a lattice of size N, giving us a classical space isomorphic to R^N. We will define a single matrix, which we call the skewed correlation matrix, that completely describes a Gaussian state, and describe how it behaves under the operation of the partial trace. We then give general expressions for entropy and mutual information in lattice field theory in terms of the skewed correlation matrix. Lastly, we outline how to perturb the entropy of a state expressed in terms of the skewed correlation matrix, which is the central result we will use in the continuum limit.

The Skewed Correlation Matrix and the Partial Trace
We combine all of the correlation functions into one matrix, which we call the skewed correlation matrix α, whose blocks are chosen to respect the decomposition C² = C ⊕ C. This matrix encodes all information about a Gaussian state and proves to be a natural way to describe the state. For example, suppose we have a tensor product space H = H_1 ⊗ H_2. As discussed in Section 2.2, we can write the classical space as a direct sum C = C_1 ⊕ C_2, and decompose α into blocks accordingly. If we have a Gaussian state ρ on H with skewed correlation matrix α, then the partial trace of ρ onto H_1 is also Gaussian. In addition, the partial trace of ρ onto H_1 has skewed correlation matrix equal to the block of α from C_1² to C_1² with respect to the decomposition C² = C_1² ⊕ C_2².

Entropy and Mutual Information
In particular, if we define the function

h(z) = (z/2) arccot(z) + (1/4) ln( −(z² + 1)/4 ),

the entropy S of a Gaussian state with skewed correlation matrix α is

S = Tr_{C²} h(α),

where Tr_{C²} is the trace over the finite dimensional space C². Suppose we split the skewed correlation matrix into blocks according to the decomposition described in Section 2.3. We can then write the skewed correlation matrix α_0 of the non-interacting state ρ_1 ⊗ ρ_2 as the block-diagonal part of α, and the mutual information of this state between the two systems is

I = Tr_{C²} h(α_0) − Tr_{C²} h(α).

Lastly, suppose we have a Gaussian state with skewed correlation matrix α along with a 'vacuum' Gaussian state with skewed correlation matrix α_0 and modular Hamiltonian Ĥ. We can write the quantity K defined in Equation (11) as

K = Tr_{C²} [ h′(α_0) (α − α_0) ]. (20)
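The function h can be checked numerically for a single mode. The skewed correlation matrix of one Gaussian mode contributes a conjugate pair of eigenvalues ±iν, and summing h over the pair should reproduce the familiar entropy function of the symplectic eigenvalue ν (the value ν = 1.7 below is an arbitrary test point):

```python
import numpy as np

def h(z):
    """h(z) = (z/2) arccot(z) + (1/4) ln(-(z^2 + 1)/4), with arccot(z) = arctan(1/z)."""
    z = complex(z)
    return (z / 2) * np.arctan(1 / z) + 0.25 * np.log(-(z**2 + 1) / 4)

def s(nu):
    """Standard entropy of a Gaussian mode with symplectic eigenvalue nu >= 1."""
    return (nu + 1) / 2 * np.log((nu + 1) / 2) - (nu - 1) / 2 * np.log((nu - 1) / 2)

nu = 1.7
S_pair = (h(1j * nu) + h(-1j * nu)).real     # the trace over C^2 pairs +i nu with -i nu
```

At z = iν the logarithm's argument −(z² + 1)/4 = (ν² − 1)/4 is positive, so h is real on the spectrum, and h(±i) = 0 in the limit ν → 1, consistent with a pure state having zero entropy.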

The Perturbative Expansion for Entropy
Suppose that we have two Gaussian states with skewed correlation matrices α_0 and α_1. If we write the difference between the skewed correlation matrices as ∆α = α_1 − α_0, we can write the entropy difference between the two states as a series:

S_1 − S_0 = Σ_{n=1}^∞ (1/n) (1/2πi) ∮ h′(z) Tr_{C²} { [ (z − α_0)^{−1} ∆α ]^n } dz, (21)

where the contour encloses the (imaginary) spectrum of α_0. Note that, thanks to the appearance of the resolvent (z − α_0)^{−1}, this series is only useful if the diagonalization of α_0, or the resolvent itself, is known. If this is the case, this formula can be used to calculate a perturbation series for the entropy S_1 for small ∆α. Furthermore, as we will see, it is in a form that extends very naturally to the continuum limit.
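As a single-mode sanity check on the structure of the leading term (a toy verification, not the paper's derivation), the derivative of the entropy with respect to the symplectic eigenvalue ν should match the h′-weighted contribution of the eigenvalue pair ±iν, with h′(z) = (1/2) arccot(z):

```python
import numpy as np

def s(nu):                                  # Gaussian-mode entropy as a function of nu
    return (nu + 1) / 2 * np.log((nu + 1) / 2) - (nu - 1) / 2 * np.log((nu - 1) / 2)

def hprime(z):                              # h'(z) = (1/2) arccot(z) = (1/2) arctan(1/z)
    return 0.5 * np.arctan(1 / complex(z))

nu, d = 2.0, 1e-6
dS_numeric = (s(nu + d) - s(nu - d)) / (2 * d)       # finite-difference dS/dnu
dS_first_order = (2j * hprime(1j * nu)).real         # h'-weighted pair +i nu, -i nu
```

Both expressions reduce to arctanh(1/ν), which is the single-mode content of the n = 1 term of the series.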

Casini's K and Relative Entropy
Consider the first term in Equation (21), and let us specify that α_0 is a vacuum state. Then this is precisely the value of Casini's K given in Equation (20). This provides a nice demonstration of the usefulness of this quantity. Furthermore, if we recall the identity ∆S = K − S(1||0), where S(1||0) is the relative entropy between the two states, we see that the remaining terms in our expansion make up the relative entropy. Therefore, we have a useful expression with which to study the entropy difference, and moreover, we have obtained a series expansion for the relative entropy: minus the sum of the n ≥ 2 terms of the series.

Eigenbasis of Skewed Correlation Matrix via Modular Hamiltonian
To use the perturbative series described in Equation (21) in practice, we need to have some information about the eigenvalues of α_0. Here, we summarize the results of Section A.10, which provides a strategy to understand the spectral decomposition of α_0 that generalizes nicely to the case of a vacuum in scalar field theory. Suppose we have some Gaussian state with skewed correlation matrix α_0. Recall the definition of the modular Hamiltonian,

ln ρ_0 = −Ĥ − c Î.

The modular Hamiltonian can be written in terms of α_0 as a quadratic form,

Ĥ = (1/2) V̂ · h′(α_0) J V̂, (24)

up to an additive constant. Here V̂ is the vector valued operator (φ̂, π̂) and J is the matrix

J = ( 0, I ; −I, 0 ), (25)

where I is the identity matrix on C. Now, suppose we have solved the eigenvalue problem for f(α_0), separating each eigenvector into its two components, which are each an element of C. Then, given any matrix A on C², we can write its entries in the f(α_0) (and therefore α_0) eigenbasis. This allows us to write the resolvent of α_0 in the α_0 eigenbasis, where it is diagonal with entries 1/(z − λ) for each eigenvalue λ. Solving the eigenvalue problem for f(α_0) instead of α_0 may seem strange in the case of lattice field theory, as both eigenvalue problems would be equally difficult. But, as we will see, in the continuum limit f(α_0) has a more easily computed spectrum than α_0, because the former is a differential operator and the latter is not. Therefore, understanding the spectrum of f(α_0) will provide an indirect approach to understanding the spectrum of α_0.
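The payoff of knowing the eigenbasis is that the resolvent becomes diagonal. A finite-dimensional numpy sketch (with a random antisymmetric matrix standing in for α_0, since its spectrum is purely imaginary like that of a skewed correlation matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
alpha0 = B - B.T                            # antisymmetric: purely imaginary eigenvalues
lam, U = np.linalg.eig(alpha0)              # solve the eigenvalue problem once
z = 2.0j                                    # a probe point assumed to lie off the spectrum
R_direct = np.linalg.inv(z * np.eye(5) - alpha0)
R_eigen = U @ np.diag(1 / (z - lam)) @ np.linalg.inv(U)   # entries 1/(z - lambda) in the eigenbasis
```

Once the eigenvalue problem is solved, every resolvent evaluation in the contour integral costs only a diagonal rescaling, which is what makes the indirect route through the modular Hamiltonian worthwhile.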
Scalar Field Theory as the Continuum Limit of Scalar Lattice Field Theory

This section addresses the definition of scalar quantum field theory in terms of the skewed correlation matrix introduced in the previous section, and proceeds to define the vacuum and finite temperature correlation matrices for scalar quantum field theory in Sections 4.4 and 4.5. We then explicitly compute the Gaussian decomposition of the vacuum skewed correlation matrix introduced in Section 3.5, and find that it involves a Fourier-type transform. Following this decomposition, we will be able to perturbatively expand expressions for the entropy around the vacuum. The first three subsections deal with questions of mathematical rigor; readers interested only in the main treatment and correlation matrices may skip to Section 4.4.

The Classical Space for a Scalar Field
For a scalar quantum field theory on an open set Ω, the classical space is also a vector space: the set of tempered distributions on Ω. These can be informally defined as the generalized functions (e.g. the Dirac delta function) that grow no faster than a polynomial at infinity. More formally, the set of tempered distributions S*(Ω, R) consists of the continuous linear functionals on the Schwartz space on Ω, written S(Ω, R). The Schwartz space is the set of all infinitely differentiable functions f from Ω to R such that the product of any derivative of f with any polynomial is bounded.
In the case of quantum field theory, the two point correlation functions for the field operators φ̂ and π̂ are often informally defined as

X(x, y) = ⟨φ̂(x) φ̂(y)⟩,  P(x, y) = ⟨π̂(x) π̂(y)⟩.
For the case of a lattice field theory, recall that it was useful to establish a correspondence between correlation functions and linear maps on the classical space, which amounted to defining a matrix with entries equal to the correlation functions (this was the construction of the matrix α). For a quantum field theory, we can do something analogous by defining a linear map with a two point correlation function as its integral kernel. To do so, we can use the Schwartz kernel theorem, which guarantees a linear map A : S(Ω, R) → S*(Ω, R) [22]. The explicit form of this linear map can be expressed by writing the correlation functions as an integral kernel. For example, we can write the field correlation matrix X as

(X f)(y) = ∫_Ω dx X(y, x) f(x). (30)

Recall that, for consistency with the lattice field theoretic case, we will still use the term 'matrices' for these linear maps to distinguish them from quantum operators. In addition, although the domain of definition of these matrices is not the entire classical space, we will still refer to them as linear maps on the classical space for simplicity of terminology.
In addition, recall that, if C is the classical space, in much of this work we are concerned with the skewed correlation matrix, which is a linear map on C² = C ⊕ C. For quantum field theory, C² can be thought of as the set of two component generalized functions on Ω of the general form Φ = (φ_1, φ_2), where the components φ_1 and φ_2 are tempered distributions. As suggested by this representation of C², a matrix on C² can be represented as a 2 × 2-matrix valued kernel K(y, x) which operates on an element of C² analogously to Equation (30). If the classical space decomposes as a direct sum as described above, then the definition of a block matrix given in Section 2.3 carries over to the case of quantum field theory. This will allow us to have a well defined notion of mutual information between fields on two sets that do not share a boundary. Similarly, Appendix B.6 describes how the classical space can be decomposed into spherical wave mode subspaces when Ω is a sphere. As noted in the appendix, the definitions given in Section 2.3 carry over in that case as well.
However, suppose Ω_1 and Ω_2 are open sets that do share a boundary. Alternatively, suppose Ω_2 = Ω − Ω_1 is not open. In the former case, we will not have C = C_1 ⊕ C_2. In the latter case, there is no sensible way to define a quantum field theory on Ω_2. Furthermore, in both cases C_1 may not be a subspace of C at all. This prevents us from defining a notion of a block matrix perfectly analogous to the lattice field theoretic case. These messy technicalities are a large part of why the quantum field theoretic cases are treated here without full rigour.
We will still need some notion of a block matrix, however. For example, suppose that Ω is the whole space R^d and Ω_1 is some open subset, such as a sphere or the union of two disjoint spheres. Furthermore, suppose that we know the skewed correlation matrix on the whole space. If we want to generalize the result of Section 3.1 to a scalar field, we will still need some way to write a block matrix. To do so, we can formally define the projector onto a region Ω_1 by restricting the domain of its input,

(P_1 f)(x) = f(x) for x ∈ Ω_1.

We can also formally write the dual of the projection, which acts on tempered distributions on functions defined on Ω_1, as extension by zero,

(P̃_1 φ)(x) = φ(x) for x ∈ Ω_1, and 0 otherwise. (34)

While we cannot comfortably define the other blocks of A, we can define the block matrix of A from C_1 to C_1 as

A_{C_1,C_1} = P_1 A P̃_1. (35)

There is a problem with this definition, however. We expect that every correlation matrix will take a vector in the Schwartz space to the set of tempered distributions. This is guaranteed by the Schwartz kernel theorem, as explained in Section 4.1. But this is not the case for the momentum correlation matrix. To illustrate this, suppose Ω is the whole space and Ω_1 is the unit sphere. As can be checked in Section 4.4, the momentum correlation matrix P for the vacuum on the whole space satisfies the requirements of the Schwartz kernel theorem. Yet, its block matrix on the sphere P_{C_1,C_1} takes a constant function to a spherically symmetric function that diverges as ∼ 1/(1 − r) as it approaches the boundary of the sphere at r = 1. Such a function cannot be a tempered distribution on Ω_1. The reason for this divergence at the boundary is the discontinuity at the boundary in Equation (34). This means that the image of P̃_1 is discontinuous and therefore not in S(R^d, R), which is the domain of definition of A. Put another way, this means that the linear map A_{C_1,C_1} (cf. Eq. (35)) is not well defined.
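The ∼ 1/(1 − r) blow-up has a simple lattice shadow that is easy to exhibit numerically. Below, a discrete analogue of P = ½(−∇²)^{1/2} (a Dirichlet chain, with illustrative sizes assumed) is restricted to the middle half of the lattice and applied to a constant function; the output is small deep inside the region but grows sharply at its edges:

```python
import numpy as np

N = 400
K = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)   # lattice -Laplacian (Dirichlet chain)
w2, U = np.linalg.eigh(K)
P = (U * (0.5 * np.sqrt(w2))) @ U.T                    # lattice analogue of (1/2)(-Lap)^{1/2}

region = np.arange(N // 4, 3 * N // 4)                 # the 'sphere': middle half of the chain
block = P[np.ix_(region, region)]                      # the block P_{C1,C1}
out = block @ np.ones(len(region))                     # block applied to a constant function
# out is near zero in the interior of the region but large near its two edges,
# the discrete counterpart of the boundary divergence described above.
```

On the full lattice the same operator annihilates slowly varying functions in the bulk; it is the sharp cutoff of the restriction that produces the edge growth.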
This divergence at the boundary is ultimately responsible for the divergence of thermodynamic quantities like the entropy on a sphere. Since we are computing quantities in quantum field theory known or expected to be finite, it is likely that a more rigorous treatment of the problems we tackle here would show that these divergences cancel out and do not affect the validity of the final result. Indeed, a quick review of the following sections will show that the final calculations involve only well-defined linear maps on the classical space, even though the intermediate steps may not. We will make this assumption going forward and write the block matrix formally for now.
Suppose that we have the skewed correlation matrix α for a Gaussian state on the whole space with components X, P, and V_off. We can now use our notion of the block matrix to write the skewed correlation matrix for an open subset Ω_1 with classical space C_1 as the block of α from C_1² to C_1². In Section 4.5, we will apply this to the case of a scalar field at finite temperature, partially traced to a unit sphere.

The Trace over the Classical Space for a Scalar Field
For the case of lattice field theory, the meaning of a trace over the square of the classical space C² = C ⊕ C is unambiguous, since the classical space is finite dimensional. But for a quantum field theory on an open set Ω, there is in general no well-defined notion of a trace on the space of tempered distributions. This is problematic because it makes it difficult to generalize the Tr_{C²} appearing in Equation (21). Despite this, suppose we have two skewed correlation matrices α_1 and α_0 on C²_Ω and define ∆α as the difference between them. We will justify a definition for the trace such that the trace appearing in Equation (21), namely Tr_{C²} { [ (z − α_0)^{−1} ∆α ]^n }, is both well-defined and finite assuming z = ai for |a| ≥ 1.
To ensure that this trace is well-defined and finite, three conditions are needed. Firstly, the open set Ω must be bounded. Secondly, the kernel of ∆α must have smooth entries on the closure of Ω. The third condition is that α_0 is densely defined in L²(Ω)² under the Hilbert space norm.
Here, L²(Ω) is the Hilbert space of square integrable functions on Ω and L²(Ω)² = L²(Ω) ⊕ L²(Ω). We also write S_Ω = S(Ω, R) for the Schwartz functions on Ω. We note that any Schwartz function on a bounded Ω is square integrable, meaning S_Ω ⊂ L²(Ω). Since α_0 is densely defined on L²(Ω)², its resolvent is by definition a bounded endomorphism on L²(Ω)² if z is not in the spectrum of α_0. Since the spectrum of a skewed correlation matrix is always imaginary in lattice field theory, its spectrum should remain so in the continuum limit. Therefore, if z = ai for a ≥ 1 or a ≤ −1, the resolvent should be bounded.
Next, consider the matrix ∆α on L²(Ω)². Since ∆α has a smooth kernel and is defined on a bounded set, we can conclude that ∆α belongs to the set of trace class operators [23]. In particular, this means that it is a continuous and compact endomorphism on L²(Ω)², its spectrum is countable, and its trace (i.e. the sum of its eigenvalues) is finite.
We can now pull in an important fact from functional analysis: the product of any trace class operator with a bounded operator is a trace class operator [24]. Using the fact that ∆α is trace class and (z − α_0)^{−1} is bounded, we see that the n-th power of (z − α_0)^{−1} ∆α is trace class, so its trace is well-defined and finite. But this is all for L²(Ω)², not the classical space C²_Ω. In fact, we have still not addressed the problem that, in an abstract sense, there is no obvious way to define a trace over a space of tempered distributions. So far we have only written this trace formally. But how can we actually define it? Recall that ∆α, like any classical matrix on C²_Ω, is in reality a linear mapping from the two-component Schwartz space S²_Ω, which is dense in L²(Ω)², into the tempered distributions. By the continuity of (z − α_0)^{−1}, we can therefore conclude that both the preimage and the image of the classical matrix [(z − α_0)^{−1} ∆α]^n are dense subspaces of L²(Ω)². Therefore, there should exist some orthonormal basis of L²(Ω)² contained in both the image and the preimage. But traces in Hilbert spaces are defined using an orthonormal basis. This justifies defining the trace of the classical matrix to be its trace over L²(Ω)².

The Correlation Matrix of the Vacuum
Consider a real massless scalar field φ in (d + 1)-dimensional Minkowski space with Lagrangian density L = ½ φ □ φ, where □ denotes the d'Alembertian operator. We select a preferred time direction and construct the corresponding Hamiltonian as

Ĥ = ½ ∫ d^d x [ π̂² + φ̂ K φ̂ ],  K = −∇².

Generalizing from the case of finitely many oscillators given in Appendix A.3, this has the form of the Hamiltonian of finitely many coupled oscillators, with the sum replaced by an integral and with K equal to the negative Laplacian −∇², which is indeed positive definite. Therefore, we extend the correlation functions of a generalized thermal coupled oscillator in Appendix A.3.
For the special case of the vacuum state, the nonzero blocks of the correlation matrix are the following fractional powers of the negative Laplacian:

X_{0,d} = ½ (−∇²)^{−1/2},  P_{0,d} = ½ (−∇²)^{1/2}.

However, these formulas for the correlation operators are inconvenient for actual computation. For the vacuum case, explicit representations of these operators are found in the mathematical literature on pseudo-differential operators [25]. For d > 1, these operators act on a field φ as integrals against power law kernels, with a principal value (p.v.) prescription regulating the integral divergences. These formulae match the correlation functions found in the physics literature [26], which are computed using different methods than our approach of taking the continuum limit of lattice field theory. Here, the subscripts 0 and d on the operators label the temperature of the field and the dimension, respectively. Now, suppose we partial trace this state onto the unit sphere Ω. The corresponding correlation matrix is a block matrix of the full correlation matrix. Following the prescription of Section 4.2, the correlation matrix on Ω is simply the restriction of this operator to functions on Ω: the kernel is unchanged, but both of its arguments are confined to Ω rather than ranging over all of Ω and its complement Ω^C.

The Correlation Matrix at Nonzero Temperature
Now, assume the entire field is thermalized at temperature T. Let α_{T,d} be the skewed correlation matrix for a d-dimensional field at temperature T, and let α_{T,Ω,d} be the same for this state partially traced to the sphere. The kernels of the correlation matrices for the unit sphere are simply the kernels for the whole space restricted to the sphere. Once again generalizing from the case of a generalized thermal coupled set of oscillators given in Equation (149) in Appendix A, the correlation matrix for the whole space will have the form

X_{T,d} = ½ (−∇²)^{−1/2} coth( (−∇²)^{1/2} / (2T) ),  P_{T,d} = ½ (−∇²)^{1/2} coth( (−∇²)^{1/2} / (2T) ).

We know the correlation operators for the vacuum, which corresponds to T = 0. This allows us to write the difference between the skewed correlation matrices, ∆α_{T,d} = α_{T,d} − α_{0,d}, in terms of the negative Laplacian as

∆X_{T,d} = (−∇²)^{−1/2} ( e^{(−∇²)^{1/2}/T} − 1 )^{−1},  ∆P_{T,d} = (−∇²)^{1/2} ( e^{(−∇²)^{1/2}/T} − 1 )^{−1}.

The 2 × 2 matrix kernel of this ∆α_{T,d} has a very simple representation in Fourier space, with nonzero entries

1 / ( |k| (e^{|k|/T} − 1) )  and  |k| / (e^{|k|/T} − 1).

To write this kernel in position space, we need to use the convolution theorem and find the d-dimensional (inverse) Fourier transforms (in a distributional sense) of these functions of |k|. For d ≥ 3, these functions are absolutely integrable, meaning that the Fourier transform is well-defined in the sense of standard functions. Moreover, they have exponential decay at infinity, implying that their Fourier transforms are real analytic.
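The coth form above can be checked mode by mode: for each Fourier mode, the thermal oscillator identity coth(x) = 1 + 2/(e^{2x} − 1) turns the vacuum-subtracted X correlator into the Bose-Einstein-weighted kernel. A quick numerical check (with an arbitrary temperature and a sample of |k| values):

```python
import numpy as np

T = 0.5
k = np.linspace(0.1, 10.0, 200)                    # sample of Fourier modes |k|
X_T = 1 / (2 * k) / np.tanh(k / (2 * T))           # X_T(k) = coth(k/(2T)) / (2k)
X_0 = 1 / (2 * k)                                  # vacuum value X_0(k) = 1/(2|k|)
dX = 1 / (k * np.expm1(k / T))                     # claimed kernel 1/(|k| (e^{|k|/T} - 1))
```

The difference dX falls off exponentially in |k|, which is the Fourier-space statement of the real analyticity of the position-space kernel noted above.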
For d ≤ 2, however, f_{2,X} is not absolutely integrable, and finding the corresponding kernel requires more care. For this reason, and because we lose useful properties of the kernels, we will only consider thermal fields for d ≥ 3.
For d = 3, these Fourier transforms can be easily calculated using a Hankel transform of order 1/2, and we write the nonzero blocks of α_{T,Ω,3} accordingly.
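For d = 3, the radial Fourier transform reduces to a sine transform of the Bose factor. A small numerical check of the standard identity ∫₀^∞ sin(qr)/(e^{q/T} − 1) dq = (πT/2) coth(πTr) − 1/(2r) is sketched below (using mpmath; this is our illustration, not code from the paper):

```python
import mpmath as mp

def bose_sine_transform(r, T):
    # numerically integrate sin(q r) / (e^{q/T} - 1) over q in (0, inf);
    # the integrand is finite at q = 0 and decays exponentially
    return mp.quad(lambda q: mp.sin(q*r)/mp.expm1(q/T), [0, mp.inf])

def closed_form(r, T):
    # standard closed form: (pi T / 2) coth(pi T r) - 1/(2 r)
    return mp.pi*T/2*mp.coth(mp.pi*T*r) - 1/(2*r)
```

Assuming this identity, the position-space thermal kernel for d = 3 is manifestly real analytic in r, consistent with the discussion above.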

The Spectrum of α via a Local Modular Hamiltonian
We wish to generalize the expansion in Section 3.3 to the case of quantum field theory, where the unperturbed state is the vacuum on the sphere with skewed correlation matrix α_{0,Ω,d}. To do so, we must understand the spectral decomposition of α_{0,Ω,d} so that we can construct its resolvent (z − α_{0,Ω,d})^{−1}. If the classical space were finite dimensional, diagonalizing α would be straightforward. But in a quantum field theory on an open set Ω, it is much more difficult to diagonalize α directly if Ω is not the whole space R^d. If Ω is a sphere, we can see from Section 4.4 that α_{0,Ω,d} is some mix of a pseudo-differential operator and an integral operator. The spectra of these operators are generally not well understood. Fortunately, α_{0,Ω,d} belongs to a special case which allows us to diagonalize it by generalizing the result of Section 3.5. If Ω is a half space, or if Ω is a sphere and the quantum field theory is conformally invariant, the modular Hamiltonian is local. The massless scalar field is such an example of a conformally invariant theory. In this case, the positive definite matrix appearing in the modular Hamiltonian h′(α_{0,Ω,d})J is a differential operator, where J is the operator defined in Equation (25) and h′(t) = (1/2) arccot(t). Therefore h′(α_{0,Ω,d}) is also a differential operator. So, in principle, the eigenvalue problem of h′(α_{0,Ω,d}) should be equivalent to, but much easier to consider than, that of α_{0,Ω,d} directly.
Casini and Huerta studied the entropy of an n-sphere in a scalar vacuum, and among their results they give an explicit formula for the modular Hamiltonian of this state [21]. Comparing Equations 7 and 8 of Casini and Huerta's work with our formula for the modular Hamiltonian in Equation (162), and defining p(r) = 1 − r², we get an expression for h′(α_{0,Ω,d}) as a differential operator acting on the two components φ_1 and φ_2 of a vector in C.
The eigenvalue problem of h′(α_{0,Ω,d}) therefore becomes a coupled partial differential equation, which we uncouple by transforming it into a second-order PDE in φ_2. Since this differential operator is spherically symmetric, we can symmetrize it by considering fields of the form φ(x) = ψ(r)Y_ℓ^{m,η}, as described in Section B.6. We get an ODE on the interval [0, 1) which depends only on ℓ. On the surface, this looks like a Sturm-Liouville equation, but it is not quite one. If we put it into Sturm-Liouville form, the weight function diverges at the boundary, which nullifies an assumption necessary for classical Sturm-Liouville theory to apply. There are also no obvious boundary conditions at the boundary from the setup. The only sensible restriction is that the function is smooth inside the sphere and, in particular, at r = 0. This means that the spectrum will likely be continuous, which matches Casini and Huerta's result that this operator is a Laplacian in some hyperbolic space [21]. To take advantage of existing results in the literature, we make the substitutions r = tanh(x) and ψ = sinh(x)^ℓ cosh(x)^{d+ℓ−1} Φ. We define γ = ℓ + (d−1)/2 and rewrite the ODE accordingly. There is also the matter of boundary conditions. Using the Frobenius method, the two independent solutions for ψ behave like ψ(r) ∼ r^ℓ and ψ(r) ∼ r^{−ℓ−d+2} near r = 0. The exception is if d = 2 and ℓ = 0, in which case the second solution is logarithmic. In either case, the solution smooth at the origin of the n-ball is the r^ℓ solution. This gives a boundary condition of Φ(0) = A for some A ≠ 0.
If we try to find another boundary condition at r = 1 by using the Frobenius method again, we see that ψ(r) ∼ (1 − r)^{±iλ}. So the solution is oscillatory as it approaches the surface of the sphere, which is what ultimately implies a continuous spectrum. Φ(x) is oscillatory as x → ∞, so we reasonably expect there to be some Fourier-type transform giving orthogonality relations which would allow us to use the eigenvectors of Equation (50) as a continuous basis. Fortunately, this continuity of the spectrum, as opposed to the discreteness of a classical Sturm-Liouville spectrum, turns out to be very helpful, as it turns what would otherwise be difficult-to-handle infinite sums into integrals. Koornwinder [28] studied solutions to Equation (50) with initial condition Φ(0) = 1. The solutions are the unnormalized Jacobi functions, which we will label Φ_γ(x, λ). The Jacobi functions can be represented in terms of the hypergeometric function. From Ref. [28], the Jacobi functions do indeed form the kernel of a Fourier-type transform called the Fourier-Jacobi transform. Equations 1.2 and 1.3 in [28] imply an orthogonality relation, which allows us to write, up to a normalization constant, the second component of the symmetrized eigenvector of h′(α_{0,Ω,d}) after transforming back into radial coordinates. We label this function ψ_{ℓ,d}(r, λ). It is worth noting that despite the apparent dependence on i, this is a real function, due to the Pfaff transformation of the hypergeometric function. These functions satisfy an orthogonality relation following from Equation (52). We can easily find the first component ψ_{ℓ,d,1} of the eigenvector using the second relation in Equation (47), allowing us to write the eigenvectors of h′(α). To normalize the eigenvectors, we insert them into a continuous version of the orthogonality relation in Equation (187) in Appendix A.10, giving Equation (56). Now, using the orthogonality relation for spherical harmonics and comparing Equations (54) and (56), we can solve for the normalization constant A. We have now completely described the normalized eigenvectors of h′(α_{0,Ω,d}) through Equations (53), (55), and (57).
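The reality of these eigenfunctions, noted above via the Pfaff transformation, can also be checked numerically from the hypergeometric representation. The sketch below assumes Koornwinder's convention Φ_λ^{(α,β)}(x) = ₂F₁(½(ρ + iλ), ½(ρ − iλ); α + 1; −sinh²x) with ρ = α + β + 1; the parameter values are illustrative only:

```python
import mpmath as mp

def jacobi_function(lam, alpha, beta, x):
    """Unnormalized Jacobi function (Koornwinder's convention, assumed here)."""
    rho = alpha + beta + 1
    a = (rho + 1j*lam)/2
    b = (rho - 1j*lam)/2   # complex conjugate of a for real lam
    # first two parameters are conjugates and the rest are real,
    # so the value is real for real x despite the apparent i-dependence
    return mp.hyp2f1(a, b, alpha + 1, -mp.sinh(x)**2)
```

One also sees directly that Φ(0) = 1, matching the initial condition used above.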

The Eigenbasis of the Skewed Vacuum Correlation Matrix
Now that we have the spectral decomposition of α_{0,Ω,d}, we can extend Equation (27) to transform a matrix into the α_{0,Ω,d} eigenbasis.
That is, given an arbitrary matrix A in C² with a 2 × 2 matrix-valued kernel K(x, y), we define the kernel K̃_A of A with respect to α_{0,Ω,d}, or alternatively the kernel of A in the α_{0,Ω,d} eigenbasis, in Equation (58). Note that placing a matrix in the α_{0,Ω,d} eigenbasis also gives the decomposition of the matrix into the spherical wave mode blocks outlined in Appendix B.6. Exactly as the name suggests, this expression is equivalent to our original kernel up to an isomorphism of vector spaces. In particular, we can extend Equation (28) to compute the kernel R̃_{z,ℓ,m,η,ℓ′,m′,η′}(λ, λ′) of the resolvent of α_{0,Ω,d} in the α_{0,Ω,d} eigenbasis.
Now that we have used conformal invariance to completely understand the spectrum of α_{0,Ω,d}, we can perturb it to construct series for the entropy and mutual information of nearby states. We describe this in the next section.

Perturbing a Scalar Massless Vacuum on a Sphere
In this section, we give a general procedure for expanding around the vacuum of a massless scalar on a sphere, which will be applicable to both cases considered in this paper: the mutual information of distant spheres and the entropy difference of low temperature states. Given the open unit sphere Ω, define C_Ω as its classical space: the set of tempered distributions on the unit sphere. Let α(t) be a one parameter family of skewed correlation matrices on C²_Ω with a power series representation α(t) = Σ_{k=0}^∞ α^{(k)} t^k that satisfies α(0) = α_{0,Ω,d}, the skewed correlation matrix of the d-dimensional vacuum on a sphere. Here, the α^{(k)} are a sequence of matrices on the sphere such that α^{(0)} = α_{0,Ω,d}. In particular, this allows us to write the difference Δα = α(t) − α(0) as a power series Δα = Σ_{k=1}^∞ α^{(k)} t^k. Before we begin, we check that the assumptions outlined in Section 4.3, which allow us to take the trace over L²(Ω)², hold. Clearly, Ω is bounded. We choose to assume that the entries of the kernel of Δα are smooth, as that will be true for both cases we look at in this paper. Lastly, we note that α_{0,Ω,d} is densely defined on L²(Ω)². For example, we can take the domain of definition of α_{0,Ω,d} to be the set of smooth functions on Ω with compact support. To confirm this, note that X_{0,Ω,d} is already a bounded operator on L²(Ω)² [29]. Furthermore, since a function in the domain of definition is zero on some neighborhood of the boundary, its image under P_{0,Ω,d} is bounded and therefore in L²(Ω)², meaning α_{0,Ω,d} is indeed densely defined.
For cleanliness, in this section we label the wave mode subspaces discussed in Appendix B.6 with a single index υ = (ℓ, m, η). Suppose that we know the form of the block matrix of α^{(k)} from υ_1 to υ_2, which we label α^{(k)}_{υ_1,υ_2}. In particular, suppose we know its form in the α_{0,Ω,d} basis, as defined in Equation (58); we label the symmetrized components of the α^{(k)} in this basis accordingly. Given the matrix function h′(t) = (1/2) arccot(t), our goal is to compute the power series representation for Tr_{L²(Ω)²}[p(α(t))] in t. We begin with the quantum field theoretic version of Equation (21) and substitute in our power series representation for Δα(t). We can see that the term of order t^N has a single contribution of Q_N from Tr_{C²_Ω}[p′(α)Δα], and the remaining contributions come from all possible ways to sum a set of positive integers k_1, …, k_n to N with n ≥ 2. Furthermore, each of these terms can be subdivided by splitting α_{0,Ω,d} and the α^{(k)} into wave mode block matrices. As described in Section B.3, the block matrices of their products are given by a sum of paths over the wave mode subspaces C²_υ. Since we are taking a trace, we only need to include the paths that begin and end at the same subspace. Each term contributing to P_N therefore corresponds to a sequence of tuples (k, υ) with length greater than one. This sequence generates a sequence of symmetrized matrices. Practicing a bit of foresight, it will be very helpful to group the terms which produce the same contribution due to the invariance of the trace under cyclic permutations. To avoid listing the same term twice, we need the following definition. A circular list is an ordered list under the equivalence relation of cyclic equivalence. We take circular lists to be periodic, with x_j = x_{n+j} for length n.
Given a circular list τ, we define S(τ) to be the number of cyclic permutations that leave τ unchanged. Equivalently, S(τ) is the number of equal ordered lists τ can be sliced into. We denote by χ_N the set of all circular lists of tuples (k_i, υ_i) such that Σ_i k_i = N. We can now write the t^N term using a sum over χ_N, where the factor 1/S(τ) prevents the sum from over-counting cyclically equivalent terms. We then find the kernel of the product Π_{i=1}^n α^{(k_i)}. To take the trace, we set λ_{n+1} = λ_1 and integrate over λ_1. Therefore, the coefficients of the expansion can be written in terms of iterated integrals.
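The symmetry factor S(τ) defined above is simple to compute. A minimal sketch (the helper name is ours, not the paper's):

```python
def cyclic_symmetry(tau):
    """S(tau): the number of cyclic rotations that leave the circular list unchanged.

    Equivalently, the number of equal ordered lists tau can be sliced into.
    """
    tau = list(tau)
    n = len(tau)
    return sum(1 for s in range(n) if tau[s:] + tau[:s] == tau)
```

For example, a circular list of two identical tuples has S = 2, so its trace contribution is halved relative to a naive sum over ordered lists.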
Recall that the contour integral encloses the portion of the imaginary axis with |z| > 1. We can then use the Cauchy integral theorem to evaluate the contour integral, giving our final form. We note that, in general, there may be an infinite number of terms in each P_N. In the cases we consider later, the sum reduces to a finite number of nonzero terms.

Mutual Information of Identical Spheres Separated by R
Consider two spheres Ω_1 and Ω_2 of unit radius whose centres are separated by a distance R. Our goal is to find an expansion for the mutual information between these two spheres in terms of their inverse separation 1/R for large R. After making a simplification owing to the fact that the spheres are identical, we will expand the skewed correlation matrix in powers of 1/R. We can then use the general procedure of Section 5 to compute the full expansion and the lowest order term of the mutual information.
If R > 2, the spheres do not intersect and we can decompose the classical space as a direct sum C = C_1 ⊕ C_2, where C is the classical space on Ω_1 ∪ Ω_2, C_1 is the classical space on Ω_1, and C_2 is the classical space on Ω_2. Note that C_1 and C_2 are both isomorphic to C_Ω, the classical space for a scalar field on the unit sphere. We use the result of Section 3.2 to find the mutual information. If we set up spherical coordinates on the spheres such that the central axes of the spheres point towards each other, the problem has reflection symmetry. Therefore, in the proper basis, we have a simplified block form for the correlation matrix α. Here α_0 is the correlation matrix of a sphere of radius 1. With the correct choices of coordinate systems, this is precisely α_{0,Ω,d}, the operator analyzed in Section 4.6. We can now write the mutual information and compute it using Equation (181), with h(t) = (t/2) arccot(t) + (1/4) ln(−(t² + 1)/4). After multiplying out the block matrices and simplifying, we find that the left term contributes nothing. For the right term, the odd-n terms in the sum do not contribute, and the even-n terms can be simplified into essentially the same form as Equation (181). This allows us to apply the results of Section 5 directly to α_1 instead of to the block matrix ( 0 α_1 ; α_1 0 ).
Note that it would still be possible to use our approach to analyze two spheres of different sizes, but this final simplification would not be possible and we would need to work with the full forms of the skewed correlation matrices throughout. Regardless, the case of spheres with different radii is related to the case of equal radii by conformal symmetry.

Expanding and Symmetrizing α_1
Now we need to find an expansion for α_1 in powers of 1/R. For concreteness, place one sphere, Ω_1, of radius 1 centered on the origin, and another, Ω_2, of radius 1 centered on the point R = (R, 0, …, 0). Then we write α_1 as a linear map from functions on Ω_1 to functions on Ω_2. This operator has a 2 × 2 matrix-valued kernel, which can be found by restricting the domain and image of Equation (40). Here x ∈ Ω_1 and y ∈ Ω_2. Since Ω_1 and Ω_2 do not share a boundary, this kernel has smooth entries, and so we can comfortably use the expansion of Section 5. We re-parameterize y so that it is centered on its sphere with its positive x direction pointing towards the sphere at the origin, y → R − y. If we define the function H^a_k in terms of the Gegenbauer polynomials C^a_k described in Section B.5 using spherical coordinates, we can write K as an infinite series in 1/R. Here u(n) is the Heaviside step function, which is equal to 0 for negative integers and 1 for non-negative integers.
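The 1/R series here rests on the Gegenbauer generating function (1 − 2xt + t²)^{−a} = Σ_k C_k^a(x) t^k, which is what produces the C_k^a factors in H^a_k. A numerical sanity check via the standard three-term recurrence (a sketch, not the paper's code):

```python
def gegenbauer_values(kmax, a, x):
    """C_k^a(x) for k = 0..kmax via the standard three-term recurrence:
    k C_k = 2x(k + a - 1) C_{k-1} - (k + 2a - 2) C_{k-2}."""
    C = [1.0, 2.0*a*x]
    for k in range(2, kmax + 1):
        C.append((2*x*(k + a - 1)*C[k-1] - (k + 2*a - 2)*C[k-2]) / k)
    return C[:kmax + 1]

# generating function: (1 - 2*x*t + t*t)**(-a) == sum_k C_k^a(x) t^k for |t| < 1
```

Truncating the series at modest order already reproduces the closed form to machine precision for |t| well below 1, mirroring the rapid convergence of the 1/R expansion for well-separated spheres.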
It will be convenient to decompose each term in the expansion of α_1 into spherical wave modes. Notice that α_1 is symmetric with respect to the representation of SO(d − 1) corresponding to rotations around R. As discussed in Section B.6, this imposes selection rules on the spherical block matrices. We label the symmetrized block matrices of the k-th term of Equation (75) in d spatial dimensions accordingly; to compute them, we need to evaluate the following integral for 0 < r, s < 1. We will not solve this integral in general. However, we will describe an algorithm to compute an equivalent expansion for H^a_k directly, and show that this expansion is equivalent to a finite dimensional linear algebra problem. We begin by rewriting H^a_k(x + y) in terms of solid harmonics using the connection formula for Gegenbauer polynomials given by Equation 18.18.16 in NIST's Digital Library of Mathematical Functions [30]. Comparing with Equation (209), we see that H^a_k is a homogeneous polynomial of degree k.
Here we have defined a constant D_{d,k,j,a} and used the constant B_{d,a} defined in Equation (204). Since H^a_k(x) is a homogeneous polynomial of degree k, H^a_k(x + y) is a homogeneous polynomial in both x and y whose total degree in x and y combined is k. Therefore, we can expand it in terms of the basis functions G^{m,η}_{ℓ,j} defined in Equation (209). The first line of this equation writes the homogeneous polynomial on the left hand side as a linear combination of all possible products of homogeneous polynomials G with the same total degree on the right hand side. But the homogeneous polynomials form a finite dimensional vector space, meaning that the constants A^{k,a,d}_{n,m,j_1,j_2} can be computed through a straightforward finite dimensional linear system of equations. Here, the indices k and a are the indices of the Gegenbauer polynomials, the index d is the spatial dimension, and n, m, j_1, j_2 label a basis for the homogeneous polynomials. There is likely no neat formula for these coefficients in general, though we can work with special cases: we easily compute the expansion for k = 0 later, and there are likely more universal terms that appear in the expansion for every dimension. Practically, if one wants to compute this decomposition for a specific dimension, say d = 3, one can quickly find the A^{k,a,d}_{n,m,j_1,j_2} exactly by solving this linear algebra problem. The only requirement is knowing the Cartesian form of the spherical harmonics for that dimension so that the G polynomials can be written.
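To illustrate the kind of finite linear system involved, here is a toy version in two variables at degree k = 2 with a hypothetical basis (sympy; the basis and names are illustrative, not the paper's G^{m,η}_{ℓ,j}):

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')

def basis(u1, u2, deg):
    # hypothetical bases of homogeneous polynomials in 2 variables, degrees 0..2
    return {0: [sp.Integer(1)],
            1: [u1, u2],
            2: [u1**2 - u2**2, 2*u1*u2, u1**2 + u2**2]}[deg]

# decompose H(x + y) for the degree-2 harmonic H(u) = u1^2 - u2^2
H = (x1 + y1)**2 - (x2 + y2)**2

unknowns, ansatz = [], sp.Integer(0)
for dx in range(3):                      # degree in x; degree in y is 2 - dx
    for gx in basis(x1, x2, dx):
        for gy in basis(y1, y2, 2 - dx):
            A = sp.Symbol(f'A_{len(unknowns)}')
            unknowns.append(A)
            ansatz += A * gx * gy

# matching coefficients of every monomial gives a finite linear system
eqs = sp.Poly(sp.expand(H - ansatz), x1, x2, y1, y2).coeffs()
sol = sp.solve(eqs, unknowns)
```

The same pattern scales to the paper's case: fix d and k, write both sides in a monomial basis, and solve the resulting linear system exactly.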
This expansion gives some additional selection rules for K^k_{ℓ_1,ℓ_2,m}, obtained by noting the possible degrees of the spherical harmonics. Now, let us define two new sets of constants by re-indexing A^{k,a,d}_{n,m,j_1,j_2} and factoring in the other constants in Equation (75). By grouping terms with matching spherical harmonics, we can write the general form of K^{k,d}_{ℓ_1,ℓ_2,m} in Equation (81). For clarity, we remind the reader that d is the spatial dimension, ℓ_1 denotes the SO(d) index of the preimage wave mode subspace, ℓ_2 denotes the SO(d) index of the image wave mode subspace, m denotes their shared SO(d − 1) wave mode index guaranteed by Schur's Lemma, and k means that we are looking at the order 1/R^k term of Δα.

The Symmetrized Kernel with Respect to α_{0,Ω,d}
We compute the symmetrized kernel in the basis of eigenvectors of α_{0,Ω,d}, labelled K̃^{k,d}_{ℓ_1,ℓ_2,m}. We substitute Equation (81) into the definition of K̃ given in Equation (58), using the eigenvectors of α_{0,Ω,d} described in Equation (55). Here, we define a function Q^ν_{d,ℓ,δ}, where we recall the definition of ψ_{ℓ,d} in Equation (53) and our notation γ = ℓ + (d−1)/2. Now, we make the substitution u = r². Using 7.512.5 in Gradshteyn and Ryzhik [31], we can evaluate this integral in terms of a ₃F₂ hypergeometric function. We can then reduce the ₃F₂ to ₂F₁ hypergeometric functions: noting that ν is always an integer, we use 15.5.2 and 15.5.3 in the Digital Library of Mathematical Functions [30], as well as Gauss's summation theorem for the hypergeometric function ₂F₁, to rewrite this function in terms of the Gamma function. Once the finite dimensional linear algebra problem for the constants F^{P,k,j}_{ℓ_1,ℓ_2,m} and F^{X,k,j}_{ℓ_1,ℓ_2,m} has been solved, one can use Equation (86) to write K̃^{k,d}_{ℓ_1,ℓ_2,m} in terms of algebraic combinations of Gamma functions.
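The reduction ends with Gauss's summation theorem, ₂F₁(a, b; c; 1) = Γ(c)Γ(c − a − b)/(Γ(c − a)Γ(c − b)) for Re(c − a − b) > 0. A quick numerical check with mpmath (a sketch, with illustrative parameter values):

```python
import mpmath as mp

def gauss_sum_rhs(a, b, c):
    # Gamma-function side of Gauss's 2F1 summation theorem,
    # valid for Re(c - a - b) > 0
    return mp.gamma(c)*mp.gamma(c - a - b)/(mp.gamma(c - a)*mp.gamma(c - b))
```

Comparing this against a direct evaluation of ₂F₁ at unit argument confirms the identity used in the Gamma-function reduction above.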

The General Expansion for Mutual Information
We can now describe the general form of the 1/R^N term of the expansion for mutual information using the results of Section 5.
To properly parameterize the terms, we need the following definitions. Given a spatial dimension d and an exponent N in the 1/R expansion, define χ_{d,N} to be the set of all length-n circular lists of tuples of the form (k, ℓ, m) such that n is even, Σ_{j=1}^n (d − 1 + k_j) = N, |ℓ_j| + |ℓ_{j+1}| ≤ k_j, ℓ_j − ℓ_{j+1} ≡ k_j mod 2, m_j = m, and |m| ≤ min_j(|ℓ_j|). Here, k is a non-negative integer, ℓ is the index that labels a representation of SO(d), and m is the index that labels a representation of SO(d − 1). Before we write the final expression, we highlight two important consequences of our selection rules. Firstly, we can prove that N is always even (i.e., only even powers of 1/R occur in the expansion).
Secondly, this is a finite sum, since χ_{d,N} is a finite set. Therefore, assuming all integrals converge, all sums we write for any term of our expansion for mutual information will be finite.
Recalling that h(t) = (t/2) arccot(t) + (1/4) ln(−(t² + 1)/4), as introduced in Section 3.2, we see that h′(−i coth(λ)) = iλ/2, which allows us to write the general formula for the 1/R^N term of the mutual information in d dimensions. In Section B.5, we review spherical harmonic functions. From these we define N(d, ℓ), the number of spherical harmonics in spatial dimension d at spherical harmonic degree ℓ. In the following expression, N(d − 1, m) is obtained by summing over all remaining indices in the spherical harmonics, which we labelled by η, as described in Sections B.5 and B.6. Specifically, Section B.6 uses Schur's Lemma to show that the τ summation in Equation (89) is independent of η. We take N(1, m) to be equal to 1. The expression for K̃^k_{ℓ_j,ℓ_{j+1},m} is given by Equations (82) and (86), and we remind ourselves that S(τ) is the number of cyclic permutations that leave τ unchanged. The full expansion for mutual information in d dimensions follows, where we note that if the sum over χ_{d,N} in Equation (89) is empty then the contribution of order N is zero. We will see that the first nonzero term occurs at N = 2d − 2.
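The count N(d, ℓ) is the standard dimension of the space of degree-ℓ harmonic polynomials in d variables, N(d, ℓ) = C(ℓ + d − 1, d − 1) − C(ℓ + d − 3, d − 1). A sketch for d ≥ 2 (the helper name is ours):

```python
from math import comb

def num_spherical_harmonics(d, ell):
    """N(d, ell): dimension of the space of degree-ell harmonic polynomials
    in d variables, via C(ell + d - 1, d - 1) - C(ell + d - 3, d - 1)."""
    def c(n, k):
        # binomial coefficient extended to vanish outside 0 <= k <= n
        return comb(n, k) if n >= k >= 0 else 0
    return c(ell + d - 1, d - 1) - c(ell + d - 3, d - 1)
```

For d = 3 this reproduces the familiar count 2ℓ + 1, and for d = 2 it gives 1 at ℓ = 0 and 2 for ℓ ≥ 1.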

Mutual Information at Very Large Distances
We now look at the lowest order term in the expansion for mutual information. The smallest possible N is N = 2d − 2, which occurs when τ = ((0, 0, 0), (0, 0, 0)), S(τ) = 2, and N(d − 1, 0) = 1. We can therefore simplify Equation (89) dramatically in this case. We only need the lowest order term in the expansion of K(y, x) in Equation (75). The corresponding expansion in terms of the spherical wave modes is trivial, since the kernel is constant. Noting that the surface area of the d-dimensional unit sphere is 2π^{d/2}/Γ(d/2), we can write the only nonzero term in the spherical wave mode expansion of K. To find the kernel with respect to α_{0,Ω,d}, we find Q^0_{d,0,0} using Equation (86). Here, we used the fact that |Γ(ib)|² = π/(b sinh(πb)). Substituting this into Equation (82) and using the Legendre duplication formula for the Gamma function, we find the kernel K̃^0_{0,0,0} in the α_0 basis. We substitute this form for K̃ into Equation (91).
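The Gamma-function identity used here, |Γ(ib)|² = π/(b sinh(πb)) for real b ≠ 0, can be verified numerically (a sketch):

```python
import mpmath as mp

def gamma_modulus_sq(b):
    # |Gamma(i b)|^2 computed directly from the complex Gamma function
    return abs(mp.gamma(1j*b))**2

def gamma_modulus_sq_closed(b):
    # the reflection-formula consequence: pi / (b sinh(pi b))
    return mp.pi/(b*mp.sinh(mp.pi*b))
```

The identity follows from the reflection formula Γ(z)Γ(1 − z) = π/sin(πz) together with Γ(1 + z) = zΓ(z).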
We now use a hyperbolic identity to rewrite the integrand.

Table 1: The leading order coefficient of the mutual information of distant spheres of unit radius for a massless scalar field, obtained using the perturbative method detailed in this paper. These coefficients were obtained previously in [17][18][19] using analytic continuation methods. For comparison to numerical computations, as in [16], we have included the numeric forms of these coefficients.
The integral over u can be computed with Barnes' beta integral [32]. The remaining integral can also be computed using Barnes' beta integral, and we get the final result.
We can reinsert units and let r be the radius of the two spheres.
As we would expect from the positivity of mutual information, the leading coefficient is positive. For d = 3, we find the lowest order coefficient is 4/15. For d = 2, we get 1/3. These values match the results of Cardy [17] as well as the extensions by Agón and Faulkner [18] and Chen et al. [19] to arbitrary dimensions.
Furthermore, we note that Equation (81) implies that with a suitable solution for the coefficients F bearing indices k, j, d, etc., which amounts to solving a finite dimensional linear algebra problem, we can compute corrections to (101) to arbitrary order in r/R. We will comment on this further in the Conclusions (Section 9). For now, we turn to the computation of the entropy difference for thermal fields.

Entropy Difference of Thermal Massless Fields for d ≥ 3
In this section, we study fields thermalized at temperature T and then restricted to a unit sphere. We consider the entropy difference between a field at temperature T and the vacuum at temperature 0, ΔS(T) = S(ρ_T) − S(ρ_0).
In particular, we construct a series for this quantity at small T. The procedure is very similar to that for the expansion of the mutual information. Recalling that Δα_{T,Ω,d} = α_{T,Ω,d} − α_{0,Ω,d} is the difference between the d-dimensional skewed correlation operators at temperatures T and 0 on the sphere, we can use Equation (181) and Equation (16) to write a series for ΔS(T). Recall that the kernel of Δα_{T,Ω,d} is smooth for d ≥ 3 but has a singularity for d = 2. Thus, we only consider d ≥ 3, where the finiteness of each trace in this expansion is guaranteed by our work in Section 4.3.
To explicitly compute the series for ΔS(T), we first compute the expansion for Δα_{T,Ω,d}. We can then compute the general expansion for the entropy and its lowest order term, just as with mutual information. But there is another interesting feature of this expansion: we will see in the next section that Casini's K pops out of it. This quantity is easy to compute in full, which we do in Section 7.3.

Expanding and Symmetrizing Δα_{T,Ω,d}
Following our derivation for mutual information, we begin by decomposing Δα_{T,Ω,d}. Our goal in this section is to find a series expansion for this difference at small temperatures, assuming d ≥ 3.
As discussed in Section 4.5, finding the kernel of Δα_{T,Ω,d} requires finding the inverse Fourier transforms of the functions f_{d,X} and f_{d,P}. We label these inverse Fourier transforms g_{d,X} and g_{d,P}. We know g_{d,X} and g_{d,P} are real analytic. Furthermore, since f_{d,X} and f_{d,P} are spherically symmetric, g_{d,X} and g_{d,P} are as well. Therefore, g_{d,X} and g_{d,P} are expandable as radial power series of even order. The powers of T are required to balance units. The coefficients A^k_{d,X} and A^k_{d,P} can be found by repeatedly taking the Laplacian and evaluating at zero at a temperature of T = 1. To illustrate this, we look at A^k_{d,X}, which has a convenient representation in Fourier space. We can write this integral in radial coordinates, leaving a single radial integral to solve. We expand coth(t/2) − 1 as a power series in e^{−t}. The integral in the sum can be solved with the Gamma function, and the sum can then be written in terms of the Riemann zeta function, giving a final expression for this integral. This allows us to write a full expression for A^k_{d,X} and, using identical reasoning, A^k_{d,P}. A comparison with the Taylor series at T = 0 of the thermal correlation operator for d = 3, written in Equation (45), shows that the two match. These constants allow us to expand the kernel of Δα_{T,Ω,d} in powers of temperature. Here u(n) is again the Heaviside step function. Observe that this matrix is spherically symmetric. Therefore, as discussed in Section B.6, the only nonzero blocks are those where all indices match. Moreover, the nonzero block matrices depend only on ℓ. Finding these block matrices requires decomposing the power |x − y|^{2k} into spherical harmonics. Similar to the case of mutual information, we note that |x − y|^{2k} is a homogeneous polynomial in both x and y whose combined degree in x and y is 2k. This means there exists a
decomposition into the homogeneous polynomials G^{m,η}_{ℓ,j} given in Equation (209). This expansion implies an additional selection rule for ℓ, namely ℓ ≤ k. The coefficients F^k_{ℓ,j} can be computed by solving a finite dimensional linear algebra problem over the vector space of polynomials, just as in the case of mutual information. This allows us to write the symmetrized kernel of the k-th term of Δα_{T,Ω,d} in d spatial dimensions, which we label K^{k,d}_ℓ. We note that this has the same functional form as the analogous kernel for mutual information in Equation (81). Exactly as in that case, we can find its kernel with respect to α_{0,Ω,d}, which we label K̃^{k,d}_ℓ. Here we reuse the function Q defined in Equation (86), and we remind the reader that d is the spatial dimension, that ℓ denotes the SO(d) index of the wave mode subspace, which Schur's Lemma guarantees to be the same for the image and preimage, and that k means that we are looking at the order T^k term of Δα_{T,Ω,d}.
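The coefficients A^k_{d,X} above reduce to the standard Bose-type moment ∫₀^∞ t^{s−1}(coth(t/2) − 1) dt = 2Γ(s)ζ(s) for Re s > 1, since coth(t/2) − 1 = 2/(e^t − 1). A numerical check (a sketch, with an illustrative value of s):

```python
import mpmath as mp

def bose_moment(s):
    # integrate t^{s-1} (coth(t/2) - 1) = 2 t^{s-1} / (e^t - 1) over (0, inf)
    return mp.quad(lambda t: 2*t**(s - 1)/mp.expm1(t), [0, mp.inf])

def bose_moment_closed(s):
    # closed form via the Gamma and Riemann zeta functions
    return 2*mp.gamma(s)*mp.zeta(s)
```

This is the step that brings the Riemann zeta function into the temperature expansion.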

General Expansion for the Entropy Difference and Relative Entropy
We can now describe the general form of the T^N term of the expansion for the entropy difference. As we showed in Section 3.4, the entropy difference can be split into the relative entropy and Casini's K. Assume that Casini's K can be decomposed into a power series. As we will see in Section 7.3, computing its coefficients is relatively straightforward because h′(α_{0,Ω,d}) is a differential operator. So, we focus on the decomposition of the relative entropy.
We can use the results of Section 5. Just as with mutual information, our selection rules for ℓ, m, η limit us to finitely many paths, meaning all sums we write for any term of our expansion for the relative entropy will be finite. We parameterize the terms as follows. Given d and N, define χ̃_{d,N} to be the set of all circular lists of tuples of the form (k, ℓ) such that Σ_{j=1}^n (d − 1 + 2k_j) = N, n ≥ 2, ℓ_j = ℓ, and ℓ ≤ min_j(k_j). Here, k and ℓ are non-negative integers.
Recalling that h′(−i coth(λ)) = iλ/2, we write the general formula for the T^N term of the relative entropy in d dimensions. Note that the factor N(d, ℓ) comes from summing over all m and η, since the summand is independent of them. We can then find the coefficients of the entropy difference using S_{d,N} = Q_{d,N} − S̃_{d,N}. Given these coefficients, the relative entropy and the entropy difference have corresponding expansions.

Evaluating Casini's K
K_Ω(T) is straightforward to compute using the x basis. Using the form of the modular Hamiltonian in Equation (46) and recalling the definitions of g_{d,X} and g_{d,P} as the matrix entries of the kernel of Δα_{T,Ω,d} (see the text above Equation (105)), we find the kernel of (1/2) h′(α_{0,Ω,d}) Δα_{T,Ω,d}. To find the trace, we take the trace of this 2 × 2 matrix, set y = x, and integrate over Ω. Let us focus on the final term first. We begin by taking the limit outside the integral and then use the divergence theorem. But recall that p(x) = 1 − r² is zero on ∂Ω and g_{d,X} is smooth everywhere, so this term is zero. We can solve the remaining integrals in Equation (122) by using the formulae for the volume and surface area of the unit sphere. Now, using Equation (113), we can entirely solve for K_Ω(T). We see that K_Ω(T) contributes terms of order T^{d−1} and T^{d+1} to the entropy difference.

Table 2: The leading order coefficients of the entropy difference between a low temperature field and the vacuum on a sphere for a massless scalar field. Note that the terms in the expansion are proportional to rT for a sphere of radius r at temperature T.

Entropy Difference for Very Small Temperatures
We look at the lowest order term in the expansion for relative entropy, cf. Equation (22). The smallest possible N is N = 2d − 2, which occurs when τ = ((0, 0), (0, 0)), n = 2, S(τ) = 2, and N(d, 0) = 1. We find an expression for the leading coefficient that is similar to the coefficient for mutual information. We again only need the lowest order term in the expansion of K_{d,Ω,T}(y, x) in Equation (114). We note that this has the same form as the lowest order term in Equation (92) up to a factor of 2ζ(d − 1). Because of this, and because Equation (127) has the same form as Equation (91), we can use our result for the leading coefficient of mutual information. As we would expect from the positivity of relative entropy, its leading term is positive. This preserves the inequality ΔS(T) ≤ K_Ω(T) to lowest order. We now have the first three leading coefficients of the entropy difference. Note that if we reinsert units and let r be the radius of the sphere, the lowest order term is proportional to the surface area of the sphere.

Entropy Difference and Area Laws
We observed that, for a thermal field, the lowest order term of the entropy expansion had an 'area law' form. In this Section, we will show that this is a general feature of low energy expansions of the entropy difference on the sphere, so long as the kernel of ∆α is smooth.
Consider a skewed correlation matrix family α(t 1 , . . ., t q ) with α(0, . . ., 0) = α 0,Ω,d , where the t i have units of energy. Suppose we have a general expansion for ∆α = α(t 1 , . . ., t q ) − α(0, . . ., 0). Consider the corresponding expansion of the kernel of ∆α:

We note that the entries of K(x, y) will have different units, following from their definitions. Recalling the form of the entries of the skewed correlation matrix, we see that the kernel for the X entry must have units of E^(d−1), where E is energy, the kernel of the P entry must have units of E^(d+1), and the kernels of the V off and V† off entries must have units of E^d. Since the scalar field is conformal, the only quantities with units in the problem are x, y, and the t i . So, by unit analysis, if an entry of K(x, y) has units of E^δ, the corresponding entry of K k 1 ,...,k q (x, y) must be a homogeneous function of order Σ i k i − δ. But we also know that the entries of K are smooth, meaning each of the entries of the K k 1 ,...,k q is smooth. Since we cannot have a smooth homogeneous function of negative order, the lowest order term for each entry must have total order δ. The lowest order term for K must then be of order min(d, d − 1, d + 1) = d − 1, with the only nonzero entry being a constant function in the X component.

Now consider the lowest order term for the entropy measured on a unit sphere, making a choice of units such that R = 1. Just as with the thermal field, it will originate from Casini's K and we get

But we can easily compute this using the same procedure as Section 7.3. If A d is the surface area of the unit sphere, we have

But now we can reinsert units to consider a sphere of any radius r,

So we do indeed see an area law for the entropy difference at lowest order.
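The scaling argument can be summarized schematically (our paraphrase, with hypothetical coefficients c k 1 ,...,k q for the surviving X entry):

```latex
K_{k_1,\dots,k_q}(\mu x, \mu y)
  = \mu^{\left(\sum_i k_i\right) - \delta}\, K_{k_1,\dots,k_q}(x, y),
\qquad
K(x, y) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\sum_{\sum_i k_i = d-1} c_{k_1,\dots,k_q}\, t_1^{k_1} \cdots t_q^{k_q}
\;+\; (\text{higher order}).
```

The first relation is the homogeneity forced by unit analysis; smoothness then excludes negative orders, leaving the constant X entry shown on the right as the leading term.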

Conclusion
We have outlined a new perturbative strategy for computing an expansion for the entropy and mutual information of nearly conformal states on a sphere. We applied this to the problem of computing the mutual information of two distant spheres in a massless scalar field and found that the lowest order term agrees with preexisting literature. We then applied this to the case of the entropy difference of thermal fields and found the lowest order terms. In particular, we found that the lowest order term is proportional to the area of the sphere, which we showed is a general feature of a low energy expansion for the entropy difference.
Given that we have a full expansion for the mutual information and the entropy difference, it would be natural to use them to compute the next-to-leading order terms and compare them with other results in the literature, such as Chen et al. [33]. It would also be valuable to look at a thermal field for the d = 2 case. Recalling that the correlation matrix X has an unremovable logarithmic divergence, we note that Casini's K is clearly infinite by the results of Section 7.3. However, the relative entropy may still be finite, as the traces appearing in its expansion are finite, owing to the fact that ∆α(T ) is Hilbert-Schmidt for d = 2. It may be the case that the entropy is infinite but the relative entropy is finite.
Additionally, the approach we have outlined is very general and extends to any physical quantity which is a matrix function of the skewed correlation matrix. These include the Rényi entropy, the standard Rényi mutual information used by Cardy to analytically continue to the standard mutual information, the Petz-Rényi relative entropy between a state and the vacuum [34], and the alternative definition of Rényi mutual information given by Kudler-Flam [35]. In particular, it would be interesting to compute the standard Rényi mutual information for integer index and compare with Cardy. The perturbative approach we have laid out is particularly valuable because, even if the coefficients of an expansion do not have a simple analytic form, one can still compute them by numerical integration. Since the number of iterated integrals needed to compute the coefficients depends only on the order of the coefficients, and not on the dimension of the underlying field theory, this approach provides a numerical strategy for computing these expansions which avoids some of the troubles of dimensionality present in lattice field theory.
Furthermore, our approach should extend to general Gaussian bosonic states. In particular, future work could look at both the mutual information and the entropy difference for a thermal massive scalar field by expanding the correlation matrix in terms of mass as well as distance and/or temperature. We also expect there to be an analogous strategy for expansions of nearly conformal fermionic Gaussian states built from the continuum limit of fermionic lattice field theory.
On the mathematical side of things, the derivation of this expansion presented in this work is at times non-rigorous, and more work could be done to establish the validity of both the general expansion for an analytic function of a linear mapping in Section A.9 and the specific expansion for Gaussian states on a sphere presented in Section 5. We made some progress in Section 4.3 by explaining why the traces in each term were finite. However, we did not establish why the integrals for each term were finite other than by direct computation.
We also point out that our derivation for the perturbation of a function of the skewed correlation matrix relies only on the fact that the set of skewed correlation matrices is a convex cone whose spectra lie in a specific subset of the complex plane. We note that this property is shared by the set of density operators. Therefore, the mathematical approach of perturbing the resolvent may also give expansions for the entropy difference and mutual information of nearly conformal non-Gaussian states when applied to the density operator rather than the skewed correlation matrix. A particularly simple case to study would be a weakly interacting λφ^4 theory.
Finally, we would be remiss not to point out the connection between quantum field theory and gravity implied by the appearance of an area law at lowest order for the entropy difference. It would be interesting to investigate any connection between this area law and the numerical one observed by Srednicki. Perhaps the entropy of a lattice field theory with lattice separation a is related to the entropy difference between the scalar vacuum on a sphere and some state with energy scale a^−1.
It would also be valuable to investigate whether the area law for the entropy difference is a general characteristic of quantum field theories. If so, a possible origin of the black hole entropy would be as the lowest order term in the expansion of the entropy difference in some quantum theory of gravity. This would be akin to the classical kinetic energy difference appearing as the lowest order term in the expansion for the relativistic kinetic energy difference. If true, the black hole entropy would really be an entropy difference, ∆S BH = ∆A/(4G). This is of course pure speculation. Indeed, the existence of an area law for a scalar field theory may simply be a coincidence arising from the fact that the scaling dimension of a scalar field happens to be (d − 1)/2, giving the two point correlation function the same units as inverse area.

A Gaussian States in Lattice Field Theory
In this Appendix, we provide a general discussion of Gaussian states. Gaussian states are a special kind of state which are completely described by their two point correlation functions. All states that we care about in this work are Gaussian. To avoid mathematical technicalities, we only consider systems with finite dimensional classical spaces. The first two sections give the definition of a Gaussian state and an example. The next section discusses the decomposition of Gaussian states into a tensor product of one dimensional harmonic oscillator states. We then use this decomposition to find formulas for entropy and mutual information in terms of traces in the classical space. Then, we transition to quantum field theory and outline how to find the analogue of the Gaussian decomposition for a subset of Gaussian states which includes the massless scalar vacuum on the sphere. Finally, we describe how to perturb the entropy and mutual information from a state whose Gaussian decomposition is known to a state where the decomposition isn't known. This perturbative technique is the strategy we use to compute the central results of this paper.

A.1 Wigner Functions
To discuss Gaussian states, we first need to introduce the concept of Wigner functions. Suppose that we have a quantum system with a classical space C. The Wigner function is a function of the vector v = (v 1 , v 2 ) ∈ C^2 which holds all of the same information as the density operator, completely describing a quantum state. It can be obtained from the density operator ρ. Given w ∈ C with components w k , the ket |w⟩ denotes the simultaneous eigenvector of the field operators such that φ̂ k |w⟩ = w k |w⟩.
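For concreteness, a standard form of this map (a convention we supply here for illustration, with ℏ = 1 and n = dim C; normalizations vary across the literature) is:

```latex
W(v_1, v_2) \;=\; \frac{1}{(2\pi)^{n}} \int_{C} \mathrm{d}w\;
e^{-i\, v_2 \cdot w}\,
\Big\langle v_1 + \tfrac{w}{2} \Big|\, \rho\, \Big| v_1 - \tfrac{w}{2} \Big\rangle .
```

With this normalization, W is real and integrates to one over C^2.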

A.2 Gaussian States
In general, computing thermodynamic or quantum informational quantities for arbitrary quantum states can be difficult. To simplify calculations, we will only consider a subset of states known as Gaussian states, which are completely determined by their two point correlation functions. A Gaussian state is defined as a state with a Gaussian Wigner function. A general review of Gaussian states is given by Ferraro [36], from which much of this section is borrowed. Here σ is the covariance matrix, which is a real symmetric matrix, and v 0 is the mean vector. They are defined by the following equations, where {•, •} denotes an anti-commutator, ⟨•⟩ denotes an expectation value, and V̂ j denotes the components of the vector valued operator which combines the fields and their conjugate momenta: V̂ = (φ̂, π̂). From the canonical commutation relation (see Appendix B, Equation (192)), σ must be such that σ + iJ is positive definite. Here, J is the symplectic matrix defined in Appendix B.1, where I is the identity matrix on C. Because σ is real, the complex conjugate of σ + iJ, which is σ − iJ, is also positive definite. Therefore, since σ = ((σ + iJ) + (σ − iJ))/2, σ itself is also positive definite. Note that knowing the mean vector and the covariance matrix is equivalent to knowing the state. In every example that we will look at here, the mean vector will be zero, v 0,j = ⟨V̂ j ⟩ = 0.
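In symbols, the defining relations read as follows (our rendering, in the normalization where the vacuum covariance is the identity; factor-of-two conventions differ between references):

```latex
v_{0,j} = \big\langle \hat V_j \big\rangle, \qquad
\sigma_{jk} = \Big\langle \big\{ \hat V_j - \langle \hat V_j \rangle,\;
                                 \hat V_k - \langle \hat V_k \rangle \big\} \Big\rangle,
\qquad
W(v) = \frac{e^{-(v - v_0)^{T} \sigma^{-1} (v - v_0)}}{\pi^{n} \sqrt{\det \sigma}} .
```

For a single vacuum mode this gives σ = I and W(v) = e^{−(v_1^2 + v_2^2)}/π, consistent with the condition λ i > 1 for non-pure modes discussed below.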
We will call such Gaussian states centered. Centered Gaussian states are therefore completely determined by their covariance matrix. Additionally, it will sometimes be convenient for our purposes to split σ into four square block matrices; here X, P , and V off are the matrices with entries given by the two point correlation functions defined in Equation (13).
A useful fact about Gaussian states is that partial traces of Gaussian states are also Gaussian states. To be precise, let H 1 be a subsystem of H, with classical spaces C 1 and C respectively. If ρ is a Gaussian state in H with correlation matrix σ, then ρ C 1 is Gaussian in H 1 with correlation matrix equal to the block matrix σ C 1 ,C 1 , as defined in Section 2.3.

A.3 Generalized Thermal Coupled Oscillator
An important case of a Gaussian state is the generalized thermal coupled oscillator. Consider n coupled oscillators with Hamiltonian 2Ĥ = Σ i π̂ i ^2 + Σ ij K ij φ̂ i φ̂ j , where K is an n × n positive definite coupling matrix. Now, consider a Bogoliubov transformation from the standard vector valued field operators φ̂ and π̂ to normal modes φ̂′ = Q^T φ̂ and π̂′ = Q^T π̂. Here Q is the matrix whose columns are the eigenvectors of K. This transforms the Hamiltonian into a commuting sum of one-particle Hamiltonians with normal frequencies ω i , the square roots of the eigenvalues of K. We can construct a Gaussian state by joining each individual normal mode of the oscillator with frequency ω i to a heat bath with inverse temperature β i . This is described by the density matrix

where φ̂′ i and π̂′ i are the operator components of φ̂′ and π̂′ . Since a one particle thermal oscillator is Gaussian, the entire tensor product state will also be Gaussian. Its two point correlation functions in terms of the normal mode operators can be easily computed as

Now, we can invert the Bogoliubov transformation and write the correlation matrices of the original field operators as

Here we define a symmetric matrix B which has the same eigenvectors as K, but with eigenvalues β i .
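These correlation matrices are easy to assemble numerically. The sketch below is ours (the function name is hypothetical), and assumes the standard single-mode thermal results ⟨φ′ i ^2⟩ = coth(β i ω i /2)/(2ω i ) and ⟨π′ i ^2⟩ = (ω i /2) coth(β i ω i /2):

```python
import numpy as np

def thermal_correlations(K, betas):
    """X = <phi phi> and P = <pi pi> correlation matrices for the
    generalized thermal coupled oscillator 2H = pi.pi + phi.K.phi,
    with normal mode i coupled to a bath at inverse temperature betas[i]."""
    w2, Q = np.linalg.eigh(K)                     # K = Q diag(omega^2) Q^T
    omega = np.sqrt(w2)
    coth = 1.0 / np.tanh(np.asarray(betas) * omega / 2.0)
    X = Q @ np.diag(coth / (2.0 * omega)) @ Q.T   # <phi_i phi_j>
    P = Q @ np.diag(omega * coth / 2.0) @ Q.T     # <pi_i pi_j>
    return X, P
```

In the zero-temperature limit coth → 1, recovering the vacuum correlations X = K^(−1/2)/2 and P = K^(1/2)/2.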

A.4 Decomposing an Arbitrary Gaussian State
All work in this section will assume the classical space is n dimensional. We start with the Wigner function of a centered Gaussian state, where σ is the correlation matrix. From the symplectic Takagi factorization reviewed in Section B.2, there exists a symplectic matrix S such that S^T σS = D, where D 0 = diag(λ 1 , . . ., λ n ) with each λ i > 1 and

We can rewrite this expression as σ^−1 = SD^−1 S^T . Therefore, if we canonically transform the classical vector v into w = S^T v, we can transform an arbitrary Gaussian state into one with a diagonal Wigner function. That is, this is now the Wigner function of a Gaussian state with diagonal correlation matrix D.
Comparing Equations (151), (145), and (149), we see that this new Gaussian state is equivalent to a generalized thermal coupled oscillator with B = 2 arccoth(D) and K = I. In other words, we can decompose an arbitrary Gaussian state as the tensor product of n independent oscillators, all with unit frequency, at inverse temperatures β i = 2 arccoth(λ i ), noting that λ i > 1.
We wish to explicitly write the decomposed density operator of our Gaussian state. To do so, we must first define the transformed vector valued field operators φ̂′ and π̂′ corresponding to w as

Our density operator can now be written as the tensor product of n canonical ensembles with unit frequency, at inverse temperatures

where φ̂′ i and π̂′ i are the operator components of φ̂′ and π̂′ .
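This decomposition immediately yields the entropy: each mode at β i = 2 arccoth(λ i ) contributes the entropy of a unit-frequency thermal oscillator. The following sketch (our function names; we assume the convention above in which a valid σ has λ i ≥ 1) computes the λ i and the resulting Von Neumann entropy:

```python
import numpy as np

def symplectic_eigenvalues(sigma):
    """The lambda_i in the diagonal form D of a 2n x 2n covariance matrix
    sigma acting on (phi, pi): the eigenvalues of iJ sigma are +-lambda_i."""
    n = sigma.shape[0] // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])
    lam = np.sort(np.abs(np.linalg.eigvals(1j * J @ sigma)))
    return lam[::2]                        # each lambda_i appears twice

def gaussian_entropy(sigma):
    """Von Neumann entropy of a centered Gaussian state: a sum over
    unit-frequency thermal modes at beta_i = 2 arccoth(lambda_i)."""
    s = 0.0
    for lam in symplectic_eigenvalues(sigma):
        if lam > 1.0 + 1e-12:              # modes with lambda = 1 are pure
            s += ((lam + 1) / 2) * np.log((lam + 1) / 2) \
               - ((lam - 1) / 2) * np.log((lam - 1) / 2)
    return s
```

For a single mode at inverse temperature β, λ = coth(β/2) and the formula reduces to the familiar (n̄ + 1) ln(n̄ + 1) − n̄ ln n̄ with n̄ = 1/(e^β − 1).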

A.5 The Modular Hamiltonian
The general decomposition given above allows us to write the operator logarithm of ρ, ln ρ:

We can turn this back into matrix form using a trace over the space C^2 by using the vector operator Ŵ = (φ̂′ , π̂′ ), defined analogously to the vector valued operator V̂. The operator logarithm becomes
(156) Note that the transpose here denotes a transpose in the classical space. Now, we invert the Bogoliubov transformation and write Ŵ in terms of V̂.

In other words, for all t ∈ [0, 1], −Jα(t) is a valid correlation matrix of some Gaussian state. Therefore, we can apply the results of Appendix B.2 to the spectrum of α(t). Namely, α(t) is diagonalizable and its spectrum lies in the following set:

Given some matrix function p, define p(t) = Tr C^2 [p(α(t))], where Tr C^2 is a trace over two copies of the classical space C.17 The goal of this section is to find a series expansion p(t) = Σ n=0 P n t^n . The standard chain rule does not generalize to matrix functions, and the non-commutativity of operators makes using a series expansion of f difficult. Instead, we write p(t) as the following contour integral, analogously to Equation (200):

Here ξ is any contour enclosing the portion of the imaginary line with absolute value greater than one.18 In general, the derivative of a matrix inverse is d/dt (B(t)^−1 ) = −B(t)^−1 Ḃ(t) B(t)^−1 . Using this formula and the product rule for matrices,19 we can derive the derivatives of p(t),

This formula can be proved by induction. Put in terms of P n , this formula becomes

We substitute t = 1 and sum over all n to get

If we look at Equation (176), we see it has the form of a geometric series. Summing this series returns us to Equation (173) with t = 1. This serves as a consistency check, but also suggests that this formula may be much more generally applicable to perturbative problems than its derivation here suggests. However, Equation (176) turns out not to be the most convenient form for the expansion. To find a better one, we take a closer look at each P n . Firstly, if n = 0, then

For n ≥ 1, we can use the cyclic invariance of the trace to rewrite P n . We then average over all of the ways to write P n and get

17 There is no well-defined notion of trace for all of C^2 for a quantum field theory. We once again proceed formally and handle this problem in
Section 4.3 under certain assumptions. 18 This set is unbounded, meaning that integrating over ξ should realistically be viewed as the limit of integrals over bounded contours. We will overlook such details here.
But now observe that the sum inside the contour integral is the derivative of −(1/n) Tr[((z − α 0 )^−1 ∆α)^n ] with respect to z. Therefore, we can use integration by parts20 to rewrite this integral as

For n = 1, this gives us a particularly nice result:

We can therefore write a general series expansion for Tr C^2 [p(α 1 )] as

In summary, this formula describes a perturbative series for the entropy of α 1 in terms of the resolvent of α 0 and the difference of skewed correlation matrices ∆α = α 1 − α 0 .
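In finite dimensions, the resulting expansion can be checked numerically. The sketch below (our notation) evaluates the partial sums of (1/n)(2πi)^(−1) ∮ p′(z) Tr[((z − α 0 )^−1 ∆α)^n ] dz on a discretized circular contour; a circle suffices here because the test matrices are Hermitian, whereas the skewed correlation matrices of the main text have imaginary spectra:

```python
import numpy as np

def trace_series(p_prime, A0, dA, order, radius, n_quad=400):
    """Partial sum of the resolvent expansion for Tr p(A0 + dA) - Tr p(A0),
    evaluated on a circular contour of the given radius enclosing the spectrum."""
    dim = A0.shape[0]
    total = 0.0 + 0.0j
    for k in range(n_quad):
        z = radius * np.exp(2j * np.pi * k / n_quad)
        dz = 2j * np.pi * z / n_quad               # dz = i z dtheta
        R_dA = np.linalg.inv(z * np.eye(dim) - A0) @ dA
        M = np.eye(dim)
        for n in range(1, order + 1):
            M = M @ R_dA                           # ((z - A0)^{-1} dA)^n
            total += p_prime(z) * np.trace(M) / n * dz / (2j * np.pi)
    return total.real
```

For a cubic p, the terms with n > 3 vanish under the contour integral, so the order-3 partial sum reproduces Tr p(A0 + ∆α) − Tr p(A0) exactly.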

A.10 Diagonalizing the Skewed Correlation Matrix via the Modular Hamiltonian
Recall the form of the modular Hamiltonian in terms of the skewed correlation matrix,

where the matrix f (α)J is positive definite. Note that we have dropped the unnecessary classical trace, as the operator is already a scalar valued operator. Suppose that, for some reason, the spectral decomposition of the operator appearing in the modular Hamiltonian, f (α), is easier to compute than that of α. How do we compute the spectral decomposition of α in terms of the spectral decomposition of f (α)? Denote by ⟨•, •⟩ the following inner product on C^2 :

This is a sensible inner product because f (α)J is positive definite and J† = J^−1 = −J, making Jf (α) positive definite as well. f (α) is skew-Hermitian with respect to this inner product, which follows from

where we make use of the fact that Jf (α) is symmetric. This implies that the eigenvalues of f (α) are imaginary. Granted, we knew this already, because α has imaginary eigenvalues and f is an odd function, but it is a good consistency check nonetheless. Now, consider the eigenvalue problem for f (α):

where we label the eigenvalues with the index k. Since f (α) is skew-Hermitian, its eigenvectors are orthogonal with respect to the Jf (α) inner product, a fact which can be rewritten in terms of the standard inner product on C^2 as

But, using the fact that v k is an eigenvector of f (α), we can remove f (α) from the orthogonality relation:

This orthogonality relation allows us to transform into the α-eigenbasis. Suppose that we separate each eigenvector v k into its two component vectors on C,

Then, given any matrix A on C^2 , we can write its entries in the f (α) eigenbasis as

In particular, we can write the resolvent of α in the f (α) eigenbasis as

B Mathematical Background

B.1 Symplectic Matrices
Let C be an n-dimensional complex vector space. We define the following operator on C^2 :

where I is the identity matrix on C. Observe that J^2 = −1. We say an operator is symplectic if it preserves the bilinear form v†Jw. Equivalently, a symplectic operator is an operator S satisfying the equation S†JS = J. The real symplectic matrices of dimension 2n form a Lie group, denoted Sp(2n, R). The Lie algebra of this group is denoted sp(2n, R) and consists of real matrices A satisfying JA + A^T J = 0, the symplectic analogues of skew-symmetric matrices. Mathematicians call these matrices Hamiltonian. However, this nomenclature would be confusing here due to the Hamiltonian energy operator, so we will avoid it and simply say 'in sp(2n, R)' instead.
To see the relevance of symplectic matrices to quantum mechanics, suppose we are working in a quantum system with classical space C. We define the operator valued vector V̂ = (φ̂, π̂). For finite dimensions, this is a vector in C^2 . The canonical commutation relation can be written as the following equivalence between operator valued matrices on C^2 :

This relationship defines a bilinear form on the classical space. A Bogoliubov transformation is a linear transformation in the classical space that preserves this commutation relation. In other words, a Bogoliubov transformation is a linear canonical transformation in the classical space.
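As a minimal illustration (the construction is ours, and the sign convention for J with J^2 = −1 is assumed), the normal-mode transformation of Section A.3, which rotates φ̂ and π̂ by the same orthogonal matrix Q, acts on C^2 as the block matrix S = Q ⊕ Q and is symplectic:

```python
import numpy as np

# A Bogoliubov transformation of the kind used in Section A.3: rotate
# fields and momenta by the same orthogonal matrix Q (a random rotation).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
S = np.block([[Q, np.zeros((3, 3))], [np.zeros((3, 3)), Q]])

# One common sign convention for J, satisfying J^2 = -1.
J = np.block([[np.zeros((3, 3)), np.eye(3)], [-np.eye(3), np.zeros((3, 3))]])

assert np.allclose(J @ J, -np.eye(6))   # J^2 = -1
assert np.allclose(S.T @ J @ S, J)      # S preserves the symplectic form
```

Since S is real, S† = S^T here, so the check above is exactly the symplectic condition S†JS = J.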

B.2 The Symplectic Takagi Factorization
Let C be an n-dimensional complex vector space, and suppose that we have a real positive definite matrix A on the direct sum C ⊕ C. Then we can define an inner product ⟨v, w⟩ = v†Aw on C^2 . Now, consider the matrix JA. This matrix is skew-Hermitian with respect to this inner product, as can be seen here:

JA is also clearly real. Therefore, by the spectral theorem for skew-Hermitian matrices, JA is diagonalizable with eigenvalues ±iλ 1 , . . ., ±iλ n . Given the matrix D 0 = diag(λ 1 , . . ., λ n ), we define a matrix W .
W and JA have the same eigenvalues and are both diagonalizable, so they are similar. Furthermore, direct calculation shows that they are both in sp(2n, R). Therefore, they are symplectically similar, so by [37] there exists a real symplectic matrix S such that

Now, from the definition of symplectic matrices, we have S^T = −JS^−1 J. This allows us to find a symplectic matrix S such that S^T AS is diagonal, as shown here:

We will call this diagonal matrix D. This is called the symplectic Takagi factorization of A.
In this paper, we are particularly interested in positive definite matrices A such that A + iJ is also positive definite. In this case, S^T (A + iJ)S = D + iJ is also positive definite. This is true if and only if each λ i > 1.

B.3 Block Matrices of a Product
Let C be a vector space decomposed into a direct sum of closed subspaces C = ⊕ n i=1 C i . Suppose we have a sequence of matrices A 1 , A 2 , . . ., A m on C and we know their blocks A k i 1 ,i 2 . We want to find the block matrix from C i to C j of the product A 1 · · · A m . To do this, we consider a sum over all paths of length m from C i to C j . Such paths can be represented schematically as

where i, i 1 , . . ., i m−1 , j label subspaces. The block matrix of the product can now easily be written as

(A 1 · · · A m ) ij = Σ i 1 ,...,i m−1 (A 1 ) i,i 1 (A 2 ) i 1 ,i 2 · · · (A m ) i m−1 ,j .
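A quick numerical check of this path-sum formula (the dimensions and helper names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
dims = [2, 3]                          # C = C_1 (+) C_2
cuts = np.cumsum([0] + dims)           # block boundaries inside C

def block(A, i, j):
    """The block of A mapping C_j into C_i."""
    return A[cuts[i]:cuts[i + 1], cuts[j]:cuts[j + 1]]

A1, A2, A3 = (rng.standard_normal((5, 5)) for _ in range(3))
prod = A1 @ A2 @ A3

# Sum over all paths i -> i1 -> i2 -> j of length 3 through the subspaces.
i, j = 0, 1
path_sum = sum(block(A1, i, i1) @ block(A2, i1, i2) @ block(A3, i2, j)
               for i1 in range(2) for i2 in range(2))
assert np.allclose(path_sum, block(prod, i, j))
```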

B.4 Writing a Matrix Function as a Contour Integral
Let C be a finite dimensional vector space. Given a diagonalizable matrix A on C with eigenvalues λ k and eigenvectors v k , we want to find the matrix function f (A) for an analytic function f . The standard approach is to use the diagonalization A = QDQ^−1 and simply write f (A) = Qf (D)Q^−1 , where f (D) is diagonal with (f (D)) ii = f (λ i ). f (A) can then be written as a sum, where w k are the columns of (Q^−1 )^T . However, this approach does not lend itself easily to perturbations, as perturbing the matrix A changes both the eigenvalues and the eigenvectors. We will instead use an alternative formula for f (A), assuming f is holomorphic on some open set containing the eigenvalues of A. We write f (A) using a contour integral over ξ, which is any contour enclosing all eigenvalues of A,

f (A) = (1/2πi) ∮ ξ f (z)(z − A)^−1 dz.

This equation can be proven by decomposing (z − A)^−1 in index form and using the residue theorem.
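The contour formula is straightforward to verify numerically. The sketch below (our function name) discretizes a circle enclosing the spectrum and compares against an exactly known matrix exponential:

```python
import numpy as np

def matrix_function_contour(f, A, radius, n_quad=200):
    """f(A) = (1 / 2 pi i) times the contour integral of f(z) (z - A)^{-1},
    with the contour a circle of the given radius enclosing every
    eigenvalue of A."""
    dim = A.shape[0]
    out = np.zeros((dim, dim), dtype=complex)
    for k in range(n_quad):
        z = radius * np.exp(2j * np.pi * k / n_quad)
        dz = 2j * np.pi * z / n_quad            # dz = i z dtheta
        out += f(z) * np.linalg.inv(z * np.eye(dim) - A) * dz
    return out / (2j * np.pi)
```

For A = [[0, 1], [1, 0]], with eigenvalues ±1, the exponential is [[cosh 1, sinh 1], [sinh 1, cosh 1]], and the quadrature reproduces it to high precision because the trapezoid rule converges exponentially for analytic periodic integrands.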

B.5 Spherical Harmonics in Higher Dimensions
Consider the surface of the unit sphere in d dimensions which we label ∂Ω.We can parameterize ∂Ω using hyperspherical coordinates.
x 1 = cos(θ 1 )
x 2 = sin(θ 1 ) cos(θ 2 )
x 3 = sin(θ 1 ) sin(θ 2 ) cos(θ 3 )
. . .

by the rotations that keep the x 1 and x 2 axes fixed, and so on. Consider the set S ∂Ω of square integrable functions on ∂Ω. Given each rotation group G in the sequence above, with a d dimensional matrix representation Q, we can naturally define a corresponding representation on S ∂Ω by rotation of the input.22 The quadratic Casimir invariants of these representations form a complete set of commuting Hermitian operators.23 Their simultaneous eigenvectors are called the spherical harmonics, and they form a symmetry adapted basis of S ∂Ω for each representation in the sequence. The spherical harmonics are labeled by the indices ℓ, m, η 1 , η 2 , . . ., η d−3 . Spherical harmonics of constant ℓ form a basis for an irreducible representation of SO(d), harmonics of constant ℓ and m label an irreducible representation of SO(d − 1), and so on. In this work, only the first two indices are important to isolate, so we will compress the remaining indices into a single η, labeling the spherical harmonics as Y m,η ℓ . For d = 2, ℓ can be any integer and the remaining indices are superfluous; we will define them to always be zero. For d = 3, ℓ and m are integers satisfying ℓ ≥ |m|, while η is also superfluous and will also be defined to be zero. For d ≥ 4, ℓ and m are non-negative integers satisfying ℓ ≥ m.
The spherical harmonics are normalized such that ∫ ∂Ω dS |Y m,η ℓ |^2 = 1, and they form an orthonormal basis for S ∂Ω . In the case of m = η = 0, a spherical harmonic has a relatively simple form in spherical coordinates.
However, they are still related to the Gegenbauer polynomials through a limit,

We point out that the expression in the limit is well-defined for all real numbers d > 2, not just integers. We define N (d, ℓ) as the dimension of the representation of SO(d) composed of d dimensional spherical harmonics of rank ℓ. Note that this is both the number of d-dimensional spherical harmonics with first index ℓ and the number of (d + 1)-dimensional harmonics with first index ℓ 0 and second index ℓ. There exists a simple formula for N (d, ℓ) [38], using the binomial coefficient (n k),24

We highlight a useful property of the spherical harmonics. Any function on the boundary of the sphere has a unique harmonic extension to the sphere's interior, due to the Poisson kernel formula [39]. The extension of a spherical harmonic Y m,η ℓ is called a solid harmonic. The solid harmonics R m,η ℓ are special because they are homogeneous d-variate polynomials of degree ℓ. Note that the function f (r) = r^2 = Σ i x i ^2 is also a homogeneous polynomial. By combining these polynomials, we can construct the following set of homogeneous polynomials using two indices satisfying j ≤ ⌊n/2⌋, G m,η n,j (x) = |x|^2j R m,η n−2j (x). The exception is d = 2, where we must be careful to account for the fact that the index of a solid harmonic can be negative. In this case, the set of homogeneous polynomials is G n,j (x) = |x|^2j R 0,0 n−2j (x) for n ≥ 0 and G n,j (x) = |x|^2j R 0,0 −n+2j (x) for n < 0.
(210) On one hand, these polynomials are all linearly independent because they lie in different wave mode subspaces. On the other hand, we can compute the total number of these homogeneous polynomials with a telescoping sum via Equation (207), giving us25

This is exactly the dimension of the space of d-variate homogeneous polynomials of degree ℓ, by a combinatorial stars and bars argument. Therefore, the functions G m,η ℓ,j form a basis for this space. Since any polynomial can be written as a sum of homogeneous polynomials, these functions can also serve as a basis for all multivariate polynomials if we allow ℓ to vary.
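The counting can be checked directly. The sketch below assumes the standard dimension formula N(d, ℓ) = C(d+ℓ−1, ℓ) − C(d+ℓ−3, ℓ−2) for d ≥ 3 (which reproduces N(3, ℓ) = 2ℓ + 1) and verifies that the telescoping sum over the G m,η ℓ,j matches the stars-and-bars count of homogeneous polynomials:

```python
from math import comb

def N(d, l):
    """Number of d-dimensional spherical harmonics of rank l (d >= 3)."""
    return comb(d + l - 1, l) - (comb(d + l - 3, l - 2) if l >= 2 else 0)

def homogeneous_dim(d, l):
    """Dimension of the space of degree-l homogeneous polynomials in d variables."""
    return comb(d + l - 1, d - 1)

# The polynomials G^{m,eta}_{l,j} with j <= floor(l/2) exhaust the degree-l
# homogeneous polynomials: summing N(d, l - 2j) over j telescopes.
for d in range(3, 7):
    for l in range(9):
        assert sum(N(d, l - 2 * j) for j in range(l // 2 + 1)) \
               == homogeneous_dim(d, l)

assert [N(3, l) for l in range(4)] == [1, 3, 5, 7]
```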

B.6 Accounting for Spherical and Cylindrical Symmetry on a Sphere
In this work, we deal with quantum field theories on the unit sphere in arbitrary dimensions, denoted by Ω. In this case, it will be useful to decompose fields into spherical wave modes. We recall from the main body that the classical space of fields is the set of tempered distributions on the sphere. Given any tempered distribution on the sphere, written in spherical coordinates as φ(r, θ 1 , . . ., θ d−1 ), we can (formally) decompose it in terms of spherical harmonics,