Petz recovery from subsystems in conformal field theory

We probe the multipartite entanglement structure of the vacuum state of a CFT in 1+1 dimensions, using recovery operations that attempt to reconstruct the density matrix in some region from its reduced density matrices on smaller subregions. We use an explicit recovery channel known as the twirled Petz map, and study distance measures such as the fidelity, relative entropy, and trace distance between the original state and the recovered state. One setup we study in detail involves three contiguous intervals $A$, $B$ and $C$ on a spatial slice, where we can view these quantities as measuring correlations between $A$ and $C$ that are not mediated by the region $B$ that lies between them. We show that each of the distance measures is both UV finite and independent of the operator content of the CFT, and hence depends only on the central charge and the cross-ratio of the intervals. We evaluate these universal quantities numerically using lattice simulations in critical spin chain models, and derive their analytic forms in the limit where $A$ and $C$ are close using the OPE expansion. In the case where $A$ and $C$ are far apart, we find a surprising non-commutativity of the replica trick with the OPE limit. For all values of the cross-ratio, the fidelity is strictly better than a general information-theoretic lower bound in terms of the conditional mutual information. We also compare the mutual information between various subsystems in the original and recovered states, which leads to a more qualitative understanding of the differences between them. Further, we introduce generalizations of the recovery operation to more than three adjacent intervals, for which the fidelity is again universal with respect to the operator content.


Introduction
In the last two decades, entanglement has provided many valuable insights into quantum many-body systems, quantum field theories, and quantum gravity. A quantity called entanglement entropy has played a central role in these studies. Given a state ρ and a subsystem A with complementĀ, the entanglement entropy is defined as (1.1) From an information-theoretic perspective, if the state is pure, ρ = |ψ ψ|, then the entanglement entropy is the answer to an intuitive operational question: what is the largest number of bell pairs between A andĀ that can be extracted by acting only with local operations and classical communication (LOCC) between A andĀ? 1 The first studies of entanglement entropy in many-body systems considered its behaviour in conformal field theories [1][2][3]. For a single interval in a conformal field theory in (1+1) spacetime dimensions, the entanglement entropy of the vacuum state was found to take a simple and universal form, independent of the operator content of the theory. For an interval of length R in a CFT with central charge c and UV cutoff , the vacuum entanglement entropy is given by This universal expression has turned out to be vastly useful, with applications ranging from identifying a c-function that behaves monotonically under RG flow in quantum field theory [4] to motivating a key element of the AdS/CFT correspondence known as the Ryu-Takayanagi formula [5]. While the entanglement entropy quantifies bipartite entanglement in pure states, much remains to be understood about the multipartite entanglement structure in quantum manybody systems, even in familiar settings like the vacuum state of a (1+1)-D CFT. A variety of quantities have been introduced in order to probe this structure, including the mutual information [6,7], negativity [8,9], and reflected entropy [10]. However, unlike the entanglement entropy, these quantities do not have a clear operational interpretation.
In this paper, we study a set of new information-theoretic measures associated with multiple regions in the vacuum state of a (1+1)-D CFT. These measures address operational questions about how well we can reconstruct the density matrix of a combined system from the reduced density matrices of smaller subsystems. We will find that these measures are independent of the matter content of the CFT, and depend only on its central charge. The  Figure 1. We trace out the subsystem C from a state ρ ABC , and then attempt to recover the original state with a channel N B→BC that acts non-trivially only on B.
universality of (1.2) can be understood from the fact that the entanglement entropy of a single interval is determined by a two-point function of primary twist operators. The measures we study in this paper are n-point functions of twist operators for n ≥ 4, but nevertheless turn out to be universal.
For most of this paper, we focus on the operational question shown in Fig. 1. Suppose we have a state ρ ABC on the union of three subsystems A, B and C. We trace out one of the subsystems C, and are left with the reduced state ρ AB . We would then like to recover an approximation to the state ρ ABC , with the restriction that we can act with some non-trivial quantum channel on B, but not on A. How close can the recovered state N B→BC (ρ AB ) be to the state ρ ABC ?
Intuitively, the restricted set of operations we allow can generate direct correlations between B and C, but any correlations that they generate between A and C must be mediated by B. This operational question is thus a concrete way of probing correlations between the subsystems A and C that are not mediated by B. Ideally, we would want to minimize the distance between the two states for all possible choices of the channel N B→BC . It is not clear how to carry out such a minimization procedure in practice, so we will instead make use of an explicit channel called the twirled Petz map, which can be defined for any state: The parameter λ can be any real number. The λ = 0 case is called the Petz map. Previous works [11][12][13] identified certain setups in quantum field theory where the above recovery operation works perfectly. As shown in [14][15][16], perfect recovery for the setup of One setup where the CMI vanishes is when A and C are null regions on either side of a spacelike interval B in a (1+1)-D CFT, shown in Fig. 2 (a). To see this, note that since AB is related to the spacelike slice 1 by unitary evolution, where in the last inequality, we have used (1.2) together with the fact that the vacuum entanglement entropy in some region in a relativistic theory is a function only of the Lorentzinvariant length of the region. Similarly using S(ρ BC ) = S(ρ 2 ) and S(ρ ABC ) = S(ρ 3 ) and substituting into (1.6), we find that the CMI vanishes.  Fig. 1 can be carried out perfectly. B is a spatial slice, A and C are null, and 1, 2 and 3 are spacelike.
(b) shows the setup we study in this paper, where A, B and C are spacelike intervals and the recovery is imperfect.
As emphasized in [13], the fact that A and C are null regions is important for the perfect recoverability in this setup. The statement of perfect recoverability and the vanishing of the CMI are also equivalent to a third statement that ρ ABC has a "Markov state" structure [14], which in particular implies that the reduced density matrix on AC is separable between these two systems. Due to the Reeh-Schlieder theorem [17,18], the reduced density matrix for any two regions of non-zero volume in the vacuum state of a QFT cannot be separable [19]. Indeed, if we consider the case shown in Fig. 2 (b), where A, B, and C are three adjacent regions on a spatial slice, we can use (1.2) to see that the CMI is non-zero and is given by (1.8) indicating that perfect recovery is not possible in general. When L B is much greater than at least one out of L A and L C , so that the cross-ratio η is much smaller than 1, the CMI is small and we can use certain information-theoretic inequalities to give a non-trivial lower bound on how well the recovery works [20][21][22]. We review these inequalities in Section 2.1.
To better understand the entanglement structure of the vacuum state, it is useful to directly quantify the imperfect recoverability in the setup of Fig. 2 (b) for arbitrary sizes of A, B and C. This is our goal in the present paper, where we study the difference between ρ ABC and the recovered stateρ by a few different distance measures: the fidelity, trace distance, relative entropy, and a oneparameter generalization of the Renyi entropy. 2 In principle, we should be careful in applying the map (1.3) in a quantum field theory, since the reduced density matrices appearing in it are not well-defined without putting a UV cutoff on the theory using some lattice regularization. Despite this, we will find that the various distance measures between ρ ABC andρ ABC are independent of the UV cutoff and are finite in the continuum limit. This shows that although ρ ABC is constructed from reduced density matrices, it is able to capture the large amount of short-distance entanglement between adjacent subsystems that is present in ρ ABC . Due to conformal invariance, we further find that these measures depend on the interval lengths only through the cross-ratio η defined in (1.8).
More remarkably still, we find that each of these distance measures between ρ ABC and ρ ABC depends on the specific CFT being studied only through its central charge. This universality can be seen by using replica tricks which allow us to express these quantities as analytic continuations of four-point functions of twist operators for an integer number of copies of the CFT. Standard techniques for studying such correlation functions involve mapping them to a partition function on a covering space [23]. The genus of the covering space is determined by the structure of the twist operators, and turns out to be zero for the quantities studied here, leading to a universal result.
To find these universal functions numerically, we use lattice simulations in critical spin chain models including the Ising model and the free fermion and free boson CFTs. We find that the λ = 0 case of (1.3) corresponds to the best recovery. As L B becomes much smaller than both L A and L C , so that η → 1, both the logarithm of the fidelity F (ρ ABC ,ρ ABC ) and the relative entropy D(ρ ABC ,ρ ABC ) diverge, 3 − log F (ρ ABC ,ρ ABC ) = − c 9 log(1 − η) + O(1) (1.10) The coefficient of the logarithmically divergent contribution to the relative entropy is the same as that of the CMI in (1.8), while the coefficient in − log F is smaller, consistent with the bounds of [20][21][22]. We explain these limiting behaviours using the OPE expansions in the twist operator formalism. This formalism also allows us to relate the O(1) contributions in both (1.10) and (1.11) to other entanglement quantities, some of which are not obviously related to the fidelity or relative entropy from a general information-theoretic perspective.
In the limit of small η, where at least one out of L A and L C is much smaller than L B , we numerically find at λ = 0 that both − log F and D approach 0 with a quadratic dependence on η, − log F (ρ ABC ,ρ ABC ) = f 2 c η 2 + O(η 4 ) (1.12) (1. 13) for some numerically determined universal constants f 2 and d 1 . In contrast, the upper bound on (1.12) from [20][21][22] is linear in η. We also study − log F at other values of λ, and find that it approaches zero with a λ-dependent power as η → 0. Using the OPE expansion for this limit in the replica formalism, we are able to correctly see that both quantities are zero in the strict η = 0 limit. However, this formalism can also be used to argue that all coefficients in the expansion of − log F or D away from the strict η = 0 limit should vanish, which contradicts the numerically observed power laws in (1.12) and (1.13). We expect that the incorrect prediction from the replica formalism in this case likely comes from a noncommutation between the replica limit and this OPE limit; see Sec. 2.5 for details. In addition to the above quantitative measures of the distance between ρ ABC andρ ABC , we can also ask more qualitative questions about which subregions or which types of correlations account for the distance between the two states. This is useful both for understanding the structure of the original vacuum state, and for understanding the structure of the state formed by (1.3) from an information-theoretic perspective. Immediately from the definition of (1.3), we can see thatρ ABC has identical reduced density matrices to ρ ABC on A and on BC. However, the correlations between A and B, and between B and C, are different iñ ρ ABC from those in ρ ABC . We consider the following mutual information differences between the original and the recovered state: (1.14) δI(A : B) and δI(A : C) are both UV-finite functions of the cross-ratio. δI(A : B) depends only on the central charge of the CFT, while δI(A : C) is also sensitive to the operator content. In the η → 1 limit, where the intervals are close, δI(A : B) approaches a constant non-zero value, while δI(A : C) shows a divergence, similar to the CMI, − log F , and the relative entropy. In fact, the divergence has precisely the same form as that of the CMI. So in this limit, we see thatρ ABC fails to capture a diverging amount of correlations between A and C, as well as some finite amount of correlations between A and B.
In the limit η → 0 where the intervals A and C are far, δI(A : B) is larger than δI(A : C), showing that the loss of correlations between A and B accounts for a larger part of the difference between ρ ABC andρ ABC in this limit. The behaviour of δI(A : C) in this limit is non-universal, but still shows some interesting general features. It is well-known that I(A : C) in the vacuum state decays as [7] I(A : C) ρ = κ η 2(h+h) , η → 0 .
(1. 15) for some constant κ, where h (h) is the (anti-)holomorphic dimension of the lowest-dimension primary operator after the identity. The mutual information between A and C inρ ABC turns out to have precisely the same leading behaviour, so that δI(A : C) has a smaller leading power than (1.15). In this sense, the leading correlations between A and C in the far-interval limit are mediated by B.
We also introduce some natural generalizations of the recovery process of Fig. 1 to multiple intervals and multiple steps. We show general information-theoretic lower bounds on the recovery fidelity of multiple-step protocols for four subsystems in terms of the CMI, by extending the methods of [20][21][22]. For the CFT vacuum state, we show explicitly that these quantities are universal for the case of four adjacent intervals. Based on this, and a general argument discussed in the conclusions, it is natural to conjecture that the generalization to an arbitrary number of adjacent intervals is also universal. As expected, we find quantitatively that the fidelity of the single-step recovery process is higher than that of the multiple-step ones, which in turn are larger than the general information-theoretic lower-bounds we show for them.
The plan of the paper is as follows. In Section 2, we study the fidelity between ρ ABC andρ ABC , and develop the general formulation in terms of twist operators which is also used with some small modifications for all other quantities studied in later sections. In Section 3, we study the relative entropy, a one-parameter generalization called the Petz-Renyi relative entropy, and the trace distance between ρ ABC andρ ABC , and compare both sides of various general information-theoretic quantities relating these distance measures. In Section 4, we explore the differences in correlations between different subsystems in the two states. In Section 5, we discuss generalizations to multiple intervals and multiple steps. We discuss a number of future directions motivated by our results in Section 6.
Appendix A provides details of the tensor network algorithms and numerical methods used to evaluate various quantities. In Appendix B, we review the covering space method of [23], derive the values of certain OPE coefficients from it, and discuss some challenges with analytic continuation using this method. In Appendix C, we show general informationtheoretic bounds on multiple-step recovery tasks for four intervals.
2 Fidelity between ρ ABC andρ ABC In this section, we compare the state ρ ABC with the stateρ ABC obtained using the Petz map by evaluating the fidelity between the two states. We define the fidelity and discuss a general information-theoretic bound on it in Section 2.1, and present numerical results in Section 2.2. In Section 2.3, we develop a replica formalism for this quantity and discuss its consequences, including universality. Sections 2.4 and 2.5 discuss the OPE limits.

Review of fidelity and information-theoretic bounds
In any quantum-mechanical system, the fidelity between two states ρ and σ is defined as This quantity satisfies the following upper and lower bounds: In particular, the upper bound holds for normalized states even in infinite-dimensional systems. F (ρ, σ) = 1 if and only if ρ = σ, and F (ρ, σ) = 0 if the support of ρ is orthogonal to that of σ. The quantity −2 log F (ρ, σ) can therefore be seen as a measure of distance between ρ and σ, which is sometimes referred to as the min-relative entropy. For the later discussion, it is useful to note that we can also write The right-hand side is defined as the sum of the square root of the eigenvalues of ρσ. Although ρσ is not a Hermitian matrix in general, its eigenvalues are equal to those of √ ρσ √ ρ, and so in particular are real and nonnegative. This is due to the fact that for any two square matrices A and B, the matrices AB and BA have the same eigenvalues. To see (2.3), take A = √ ρσ and B = √ ρ.
To understand physically why the fidelity is a good measure of the distance between two states, suppose we measure both states using some positive operator-valued measurement In this sense, the fidelity tells us how well two states can be distinguished by an optimal measurement [24]. In any quantum system and for any subsystems A, B, C, for the pair of states ρ ABC and ρ (λ) ABC , we have a lower bound on the fidelity in terms of the CMI defined in (1.6). Recall that the strong subadditivity of entropy is the statement that (1.6) is non-negative. As mentioned in the introduction, when the strong subadditivity inequality is saturated, i.e. I(A : C|B) = 0, [14][15][16] showed thatρ (λ) ABC = ρ ABC for any real value of λ. More generally, we have the following inequalities between the CMI and F (ρ ABC ,ρ (λ) ABC ) [20][21][22] These inequalities were shown in [20][21][22] for general states in quantum mechanical systems, which are type I von Neumann algebras. They follow from a more general inequality which strengthens the data-processing inequality for any quantum channel. A proof of these inequalities for general quantum channels was given for Type III von Neumann algebras, which include quantum field theories, in [25].
where β(λ) = π 2 1 1 + cosh(πλ) (2.7) Note that (2.5) implies (2.6) due to concavity of the function log F . When the CMI is small, these inequalities tell us that both the optimal case of P (λ) and its average with the probability distribution (2.7) work well.
Let us now turn to our setup in Fig. 2(b) of three adjacent intervals on a spatial slice in the vacuum state of a (1+1)-D CFT, and recall the formula (1.8) for the CMI. Note that the dependence of the UV cutoff in (1.2) gets cancelled out in the CMI, so that the latter is a well-defined quantity in the continuum. We can immediately use the above discussion to conclude certain properties of the fidelity: ABC ) which does not depend on . Assuming that the qualitative features are similar for different λ, we conclude that the fidelity does not approach 0 for → 0. We will see more explicitly in the following sections that this quantity is independent of the UV cutoff.
2. In the limit η → 0, we must have F (ρ ABC ,ρ (λ) ABC ) → 1 for all λ, since (2.5) implies that the average value approaches 1, which is also the largest possible value.
Note further that the expression (1.8) for the CMI depends only on the central charge of the CFT and not on its detailed operator content. One natural way for the above bounds to be satisfied is if the fidelity is also universal with respect to the operator content. We will find that this is indeed the case.

Numerical results
Before introducing a replica trick formalism to study F (ρ ABC ,ρ ABC ) analytically, let us summarize some numerical results for this quantity from lattice calculations. We consider lattice discretizations of three CFTs: the Ising CFT with c = 0.5, the tricritical Ising CFT with c = 0.7, and the compactified free boson CFT with c = 1. We make use of critical quantum spin chains which realize these CFTs, with length L and periodic boundary conditions. The Ising CFT and the tricritical Ising CFT are realized using the O'Brien-Fendley model [26] This model has a critical line at 0 ≤ λ ≤ λ * ≈ 0.428. At λ = 0, the model is the transverse field Ising model. At λ = λ * , the model flows to the tricritical Ising CFT. For 0 < λ < λ * , we have an RG flow from the tricritical Ising CFT to the Ising CFT. The free boson CFT is realized by the XXZ spin chain where −1 ≤ ∆ < 1, and the compactification radius R is related to ∆ as ∆ = cos(2π/R 2 ) [27]. The operator content depends on R but the central charge is constant and equal to 1 along the critical line. We obtain the ground state in each case using the periodic uniform matrix product state method [28] and compute the fidelity F (ρ,ρ) numerically. In computing the fidelity, we make use of the Uhlmann theorem [29] to reduce the numerical cost, which enables us to go to up to L = 128 spins for the Ising CFT and L = 60 for the free boson CFT. See Appendix A for details on the numerical implementation.  We observe the following features of the fidelity F (ρ ABC ,ρ ABC ): 1. In Fig. 3, for each model and choice of λ, we consider the quantity F (ρ ABC ,ρ  The orange curve shows the λ = 0 case, which has a slope 2 as η → 0, indicating − log F ∝ η 2 . The blue curve shows the averged fidelity −log F , which has a linear dependence in η as η → 0 (parallel to the green CMI line).
We find that there is no dependence on the individual L A , L B , L C except through the cross-ratio, which in particular shows that F (ρ ABC ,ρ ABC ) is independent of the UV cutoff.
are strictly smaller than I(A : C|B)/2. 5 5. At small η, where at least one out of A and C is much smaller than B, − log F (λ=0) has a perturbative expansion that starts at quadratic order, where f 2 ≈ 0.070 is a numerically determined constant. For a general value of λ, we observe that − log F (λ) ∝ η p at small η, where the power p decreases as λ increases. We show this small η behaviour in Fig. 5. where f ≈ 0.055.
Note that the expansion of the conditional mutual information (1.8) at small η starts at linear order: The inequality Eq. (2.5) requires that f ≤ 1/6, which is satisfied but not saturated.
6. For the regime 1 − η 1, where A and C are both much larger than B, we have an expansion for any λ of the form See Fig. 6. The coefficients a 0 and a 1 are λ-dependent constants. So in particular the inequality (2.4) is again not saturated:

Replica formalism and properties of the fidelity
In this section, we first introduce a replica trick in order to derive the fidelity through analytic continuation. This maps the problem to studying the partition function on M copies of the theories for some integer M , with different boundary conditions connecting the copies in A, B, and C. We show how to write this partition function as a four-point function of twist operators associated with different permutations in S M , the symmetric group of M elements. By examining the properties of this four-point function, we explain various features of the fidelity observed in the previous section, including independence of the UV cutoff and universality with respect to the operator content. We then consider the OPE limits of this four-point function.

Replica trick and analytic continuation
Recall that we want to evaluate the quantity As we discuss below, standard techniques involving partition functions on an integer number of copies of the theory will allow us to calculate a quantity of the form for some a > 0. 6 In particular, the case where we set m i , n i equal to the values in (2.19) falls within this range. Provided (2.20) also satisfies the growth conditions of Carlson's theorem, we can try to uniquely infer the value of by analytic continuation of m i , n i from positive integer values of these parameters. Next, we want to consider F (λ) k for general complex values of k. Recalling from the discussion below (2.3) that the eigenvalues of ρσ for two density matrices ρ, σ are non-negative, we see that (2.22) is well-defined for complex values of k. We further expect based on numerical evidence that F (λ) k is upper-bounded by 1 for Re(k) > 0, and analytic in this regime. Then provided we can find an analytic continuation of F λ k that satisfies Carlson's theorem, we can evaluate the quantity (2.19).
Below we will sometimes refer to this analytic continuation of all five variables as the replica limit: (2.23) 6 We checked this statement numerically in some special cases. For example, for k = 1, m1 = m2 = 1 2 and n1 = n2 = n, we find that (2.20) is bounded for n −0.8.

Representation in terms of twist operators
Let us now set up the calculation of the quantity (2.20) for positive integer values of all parameters in a QFT in 1+1 dimensions. We consider the theory on the manifold R 2 . Results for the CFT on a cylinder, which was considered in the numerics, can be derived by conformal mapping.
Recall that the matrix elements of the vacuum state ρ = |0 0| of a QFT between two states |φ and |φ are given by where I E is the Euclidean action, and (2.20) is then given by a Euclidean path integral on N k copies of the theory, with N = m 1 + n 1 + m 2 + n 2 + 2. On the I-th copy, we have a matrix element φ I |0 0|φ I . The partial traces and matrix multiplications in (2.20) indicate that in each of the subsystems P = A, B, C, and the complement ABC, we should identify each φ I with some φ I . More explicitly, we have the conditions φ I ( x) = φ τ P (I) ( x) for certain permutations τ P ∈ S N k . Since ABC is traced out in each copy of ρ, τ ABC is the identity permutation. The permutations in A, B, and C are given by (see Fig. 7 for diagrams) Here we use the cycle notation for each permutation, so for instance τ = (1, 2, 3) in S 5 refers to the permutation which sends 1 to 2, 2 to 3, 3 to 1, and the remaining elements 4 and 5 to themselves. In enumerating the cycles of various permutations below, we will often explicitly discuss only cycles of length greater than 1. Each of the permutations τ A,B,C consists of a single cycle, which has length 2k, N k, and (m 1 + m 2 + 1)k respectively. The path integral for (2.20) thus involves N k copies of the fields ψ I (x µ ), whose combined action is given by . In all such figures, the copies are labelled from 1 to N k from left to right unless explicitly stated otherwise. There are k groups of N = n 1 +m 1 +n 2 +m 2 +2 copies. The arrows show how the copies map to each other under the permutation.
Copies that are not connected by arrows map to themselves.

Copy 1
Copy & (1) Figure 8. The Euclidean path integral for (2.20) on N k copies, with boundary conditions relating the different copies specified by τ A,B,C , is shown on the left. This can be reinterpreted as a four-point function of twist operators in a theory with N k copies of the fields living on R 2 , as shown on the right.
path integral at the points x i . For a permutation τ ∈ S n , we define the twist operator Σ τ (p) as an operator that implements the following boundary condition on a small circle around the point p: Then we have the following representation for the path integral: (2.32) The twist operators inserted at x 2 and x 3 are associated with the permutations and τ −1 C τ B =(m 1 , m 1 + 1, ..., m 1 + n 1 + n 2 + 1) (N + m 1 , N + m 1 + 1, ..., N + m 1 + n 1 + n 2 + 1) These permutations are shown in Fig. 9. Note in particular that: • τ −1 B τ A has k cycles of length m 1 + n 1 + 1, and k cycles of length m 2 + n 2 + 1. • τ −1 C τ B has k cycles of length n 1 + n 2 + 2. Figure 9. We show the permutations (2.33) and (2.34), using similar notation to Fig. 7. Note that in τ −1 B τ A , we put the N k-th copy to the left of the first.
In order to see that the relevant twist operators should be the ones appearing in (2.31), we can start with the field φ I and go in a small anticlockwise circle around x i in the left figure of Fig. 8, and see which φ J it gets mapped to. We show an example in Fig. 10.
In later sections, we will study other quantities such as the relative entropy between ρ ABC andρ ABC , or the entanglement entropy of various subsystems inρ ABC . By similar reasoning, we will express these quantities as four-point functions of the form (2.31), with the difference that τ A,B,C refer to some other permutations depending on the quantity.

Consequences of the twist operator representation
Let us now make some comments both on the formal properties and the physical consequences of the expression (2.31) or (2.32): 1. In a 1+1 D CFT, each Σ τ defined by (2.30) is a primary scalar operator. If τ has t cycles with a 1 , a 2 , ..., a t elements respectively, then the holomorphic dimension of Σ τ is Hence, the holomorphic dimensions of the operators inserted at the points x i are respectively (2.39) These operators are scalars, so their antiholomorphic dimensions,∆ 1 , ...,∆ 4 , are also given by (2.36)-(2.39).
2. In the replica limit (2.23), note that the lengths of all cycles of each of the permutations (2.26)-(2.28) and (2.33)-(2.34) become 1. As a result, the dimensions in (2.36)-(2.39) are zero in this limit. Naively, (2.31) in the replica limit seems to reduce to a four-point function of identity operators. Despite this, we find that the fidelity has a non-trivial behaviour as a function of η, as we discuss more explicitly in Section 2.4.
3. The twist operators in (2.31) are not local operators: as discussed in the literature on CF T ⊗M /S M orbifolds, we get genuine local operators only on summing over operators corresponding to all permutations in a given conjugacy class. The correlation function is therefore well-defined only on specifying the branch cuts of the twist operators, as we have done in Fig. 8. Due to this non-locality, the correlation function is not invariant under arbitrary reorderings of the twist operators, but it is invariant under cyclic reorderings, as discussed in [31,32]. The OPE of any two such operators will also depend on their ordering.
4. As discussed in [23], we can evaluate any correlation function of twist operators such as (2.31) by mapping the original manifold R 2 , on which the fields are multi-valued due to the presence of twist operators, to a "covering space" on which the fields become single-valued. The covering space has a metric ds 2 with non-trivial curvature, and can also have non-trivial topology. By the Weyl anomaly, where ds 2 is some simple metric on the covering space. S L is the Liouville action that relates the partition functions with metrics ds 2 and ds 2 . Its only dependence on the specific CFT involved is through an overall factor of the central charge c; the remaining factor is determined entirely by the structure of the permutations. We review these methods in some more detail in Appendix B.
5. The genus of the covering space is determined by the cycle structure of the permutations in the correlation function. Suppose the total number of cycles involved in all the twist operators of a given correlation function is l, the total number of copies of the theory that appear in one or more of the l cycles is s, and the length of the i'th cycle is n i . Then the genus of the covering space is given by the Riemann-Hurwitz formula: Putting the data for the permutations in (2.31) into this formula, we find g = 0 for any n i , m i , k. Hence we can takeg in (B.4) to be the flat metric on R 2 . Then putting (B.4) into (2.32), we find (2.42) In the replica limit, ABC ) is equal to the limit (2.23) of −S L . As discussed in the previous point, the only dependence on the specific CFT in this quantity is the overall factor of the central charge. This explains the universality of − log F/c that we found numerically in Fig. 3.
6. For positive integer values of m i , n i , k, the quantity (2.31) has an explicit dependence on the UV cutoff, similar to the Renyi entropies. To see this, note that the boundary condition (2.30) should be imposed in a small circle of radius around the point p. The Liouville action depends on this UV cutoff . In order to cancel out the dependence on , we can divide (2.31) by the quantity (2.43) for some arbitrary length a. See for instance Appendix D of [33] for an explanation of this cancellation. Note that N can be expressed in terms of the density matrix ρ Q on an interval Q of length a as In the replica limit (2.23), we simply have N = Tr[ρ Q ] = 1, which indicates that this limit of S L does not depend on the UV cutoff. This is consistent with the vanishing of the dimensions (2.36)-(2.39) in the replica limit. This explains the cutoff-independence of the fidelity, which we anticipated from upper and lower bounds in Section 2.1 and observed numerically in Section 2.2.
7. Due to conformal invariance, since (2.31) is a four-point function of primary scalar operators, it takes the following form [34]: . (2.46) In the replica limit, the prefactor involving |x ij | becomes 1 as all ∆ i are zero, so the fidelity only depends on the cross ratio, as we observed numerically. Note in particular that the fidelity is unaffected on interchanging the values of L A and L C , despite the asymmetric roles of A and C in the definition (1.3) of the twirled Petz map. This invariance is an interesting feature of this quantity specific to conformally invariant theories.
In principle, the covering space method introduced in [23] and used in points 4 and 5 above can be used to find the expression for − log F (η)/c if we can evaluate and analytically continue S L . An important ingredient of this calculation is finding the covering map that takes us from the base space to the covering space. In Appendix B, we write down a parameterization of the covering map for integer values of m i , n i , k. However, in order to fix certain constants in the map which appear in S L , we need to solve polynomial equations of arbitrarily high degree for general m i , n i , k. No general analytic solutions can be found for such equations, which in turn prevents us from obtaining an analytic expression for S L in terms of m i , n i , k using this method.
In the next two subsections, we study the two OPE limits η → 1 and η → 0 of the four-point function (2.31), which are more tractable than finding the expression for arbitrary η. We will find that we are able to explain all numerical observations of the η → 1 limit, and find and confirm some interesting interpretations of the coefficients of the expansion in that limit, but the η → 0 limit has some subtle features.

η → 1 limit
The cross-ratio η approaches 1 in the limit where L A and L C are both much larger than L B . When η becomes exactly equal to 1, the region B disappears, and the states ρ ABC ,ρ ABC reduce to ρ AC , ρ A ⊗ ρ C . The fidelity F (ρ AC , ρ A ⊗ ρ C ) is equal to zero in the continuum limit → 0, so we should find that F (ρ ABC ,ρ ABC ) approaches zero as η → 1. In this section, we will find the way in which this quantity approaches zero by using its OPE expansion.
Recall that the correlation function of interest is Since various other quantities we study will also be expressible as four-point functions of this kind, let us write down a series expansion allowing the permutations τ A,B,C to be general, and later specialize to the specific permutations (2.26)-(2.28) for the fidelity. As Here 2 and 3 are shorthand for the operators Σ τ −1 B τ A and Σ τ −1 C τ B , and f ABC is the coefficient appearing in the three-point function A(x 1 )B(x 2 )C(x 3 ) . Since A, B, C are twist operators, the ordering in the three-point function is important. We consider the A(x 1 )B(x 2 )C(x 3 ) with x 1 < x 2 < x 3 , and put the branch cuts between x 1 , x 2 , and between x 2 , x 3 .Ō p is the primary operator with which O p has a non-zero two-point function. (For non-gauge-invariant twist operators, O p =Ō p .) Putting the expansion (2.48) into the four-point function (2.31), we find where c Op is the overall coefficient in the two-point function of O p andŌ p , the overall factor involving x ij is and g is the conformal block for any (1+1)-D CFT which takes into account contributions from descendants, and has an expansion of the form ; ∆p are universal coefficients that are fixed by conformal symmetry, and that have been evaluated for instance in [35].
Let us now identify the operators O p appearing in the OPE (2.48), whose dimensions will determine the behaviour of (2.47) in the η → 1 limit. Since the twists associated with Σ τ −1 do not cancel with each other, all primary operators appearing in the OPE must also implement the same twisted boundary conditions at infinity as the combined effect of these operators. See Fig. 11. Below we will label the primary operators appearing in the Figure 11. Each operator appearing in the OPE of Σ τ −1 B τ A and Σ τ −1 C τ B should give rise to the same branch cut at long distances as the combined effect of The discussion so far applied to any correlation function of the form (2.47). Let us now consider the specific case of the fidelity, by taking the permutations τ A,B,C to be the ones in (2.26)-(2.28). Then τ −1 C τ A is the following permutation, shown in Fig. 12: τ −1 C τ A has k non-trivial cycles, each of length m 1 + m 2 + 2, and its dimension is therefore Figure 12. The permutation τ −1 C τ A from (2.52).
The remaining primary operators that create the boundary conditions in Fig. 11 are excitations of Σ τ −1 C τ A with fractional modes. Recall that in a 1+1-D CFT without twisted boundary conditions, we can give the following mode decomposition of any quasiprimary field with dimension h: If we have q copies of the quasiprimary field φ I , which are multiple-valued due to the presence of a twist operator Σ τq (0) associated with τ q = (1, ..., q), then the mode decomposition includes fractional powers of z: For the holomorphic and antiholomorphic parts of the stress tensor T (z),T (z) the associated modes are labelled L m/q ,L m/q . The commutation relations of the operators L m/q and φ m/q can be worked out by mapping these operators to the covering space on which the fields become single-valued; see [36,37] for details. Using this algebra, we find that each φ −m/q Σ τq for −h ≤ m ≤ q − 1 and each L −m/q Σ τq for −2 ≤ m ≤ q − 1 is a primary operator. For a twist operator with multiple cycles such as Σ τ −1 C τ A , we can independently dress any subset of cycles with fractional modes. For instance, the lowest-dimension fractional mode operators are ones where a single cycle i is dressed. We list a few such operators and their holomorphic and anti-holomorphic dimensions.
Now if the operator O b appeared in the OPE, it would give a contribution to (2.72) which depends on the dimension h, and hence on the operator content of the theory. Since we argued in the previous subsection that the quantity F m 1 ,n 1 ,m 2 ,n 2 ,k should not not have any such dependence, the product f 23Ōp f 1Op4 for all such non-universal operators must be zero. 7 The lowest-dimension dressed operators contributing to (2.72) are then (2.57) and (2.58). 7 As an aside, the OPE coefficients f23O p and f1O p 4 do not individually have to be zero , which have series expansions in terms of f 23Ōp and f1O p4 respectively. These quantities do not have genus zero and can therefore have non-universal contributions in their OPE expansion. So it seems that we can have f23O p be non-zero for some non-universal Op as long as f1O p 4 is zero, and vice versa.
Putting together the contributions from O a , O c , O d , the first two powers appearing in the η → 1 expansion of F m 1 ,n 1 ,m 2 ,n 2 ,k are (2.59) with ∆ a defined in (2.53). In the replica limit (2.23), f 1 (x ij , ∆ i ) becomes 1, and the powers of (1 − η) in the leading and subleading terms become c/9 and c/9 + 2/3 respectively. These agree with the powers observed numerically in Fig. 6. The descendants in the conformal block of Σ τ −1 C τ A give further subleading corrections starting at O(1 − η). Let us now turn to understanding the coefficients in (2.59). In the first coefficient, f 1Oa4 and f 23Ōa are both proportional to three-point functions of twist operators, and c Oa is proportional to a two-point function of twist operators. As mentioned in section 2.3.3 and reviewed in more detail in Appendix B.1, each of these correlators can be written in the form e S L where S L is the Liouville action associated with some covering map, which is proportional to c. The genus of the covering surface for each of these cases is zero. So the coefficient in front of (1 − η) 2∆a can be written as e a 0 c for some universal constant a 0 , as noted in (2.16).
Let us now try to derive this constant a 0 . First note that the coefficient c Oa can be related to the Renyi entropy of an interval: where a 1 and a 2 are the endpoints of some interval A, and a 12 = a 1 − a 2 . Taking the replica limit, we see that c Oa is related to the third Renyi entropy of a single interval, Next, consider the factor f 1Oa4 , which is defined by the following three-point function (with y 1 < y 2 < y 3 on some spatial slice according to our conventions): Say R is the interval between y 1 and y 2 , and S the interval between y 2 and y 3 . By inverting the reasoning that we used in Section 2.3.2 to express a quantity involving reduced density matrices as a correlation function of twist operators, we can rewrite (2.62) as We are interested in the replica limit of this expression, where we have 3), the quantity on the RHS can be identified to be the fidelity between ρ R ⊗ ρ S and ρ RS for two adjacent intervals R and S. Recall from the discussion at the beginning of this section that in the strict η = 1 limit, the region B vanishes and ρ ABC ,ρ ABC reduce to ρ AC , ρ A ⊗ ρ C for adjacent intervals A, C. It is therefore reasonable that the universal coefficient associated with the fidelity between these two limiting states contributes to e a 0 c . In Appendix B.2, we use the Liouville action to find which agrees with the numerically evaluated value from (2.64), f 1Oa4 √ c Oa = e 0.11465 c , up to finite size corrections. 8 Let us next try to understand the constant f 23Ōa . By similar reasoning to (2.63), we have Taking the replica limit, is not an obviously identifiable information-theoretic quantity. One simple observation we can make is that in the case where ρ RS is a product state ρ R ⊗ ρ S , we have so it may be possible to interpret the log of the LHS of (2.68) as a measure of entanglement between R and S in ρ RS . It would be interesting to better understand this quantity and the reason why it appears in this limit of F (ρ ABC ,ρ ABC ). We can further try to derive the numerical value of the cutoff-independent quantity f 23Ōa √ ca using the Liouville action, similar to (2.68). In Appendix B.3, we outline the calculation of (2.66) using the Liouville action for integer values of m i , n i , k. We find that one of the parameters which appears in the Liouville action can only be derived by solving a polynomial equation whose degree increases with m i , n i . As a result, no analytic solution can be found for general m i , n i , and we cannot obtain f 23Ōa √ ca by analytic continuation using this method. Indeed, as we discuss in the appendix, the Liouville action and the equations for the covering map seem to depend only on λ-independent combinations of the parameters m i , n i . So if there had been a solution that we could analytically continue, this would have led to inconsistency with the explicit λ-dependence in this quantity that we show below. This case presents a 8 We evaluate the numerical value of the ratio as this is cutoff-independent, unlike the individual factors f1O a 4 and √ ca which depend on the UV regularization. Note that the yij refer to the lengths of various intervals and not to the lengths divided by the UV cutoff. simpler version of the issues encountered in trying to obtain an analytic continuation of the four-point function (2.47) using the Liouville action.
We can, however, check that the constant obtained using the quantities on the RHS of (2.61), (2.64), and (2.67), agrees with the constant a 0 obtained from the numerical fitting of F (ρ ABC ,ρ ABC ) in (2.16). We check this in Fig. 13, finding excellent agreement for all λ. So we find that although we cannot obtain the numerical value of a 0 using analytic continuation, the analytic continuation works well at the level of relating it to other entanglement quantities. It was discussed in [31,32] that correlation functions of twist operators are invariant under cyclic reorderings. In particular, for a three-point function of twist operators, this implies that the constant f ABC should be equal to f CAB . However, the two correlation functions A(y 1 )B(y 2 )C(y 3 ) and C(y 1 )A(y 2 )B(y 3 ) can have very different expressions in terms of reduced density matrices. For instance, the coefficient f 1Oa4 that we related to the fidelity of ρ RS and ρ R ⊗ ρ S in (2.64) can also be written as We also checked the relations (2.70) and (2.71) numerically. Understanding the physical reason for such relations between different combinations of the reduced density matrices, which are likely unique to conformal field theories, would be an interesting question for future work.
For the coefficient of the second term in (2.59), there does not seem to be an obvious way to relate it to some function of the density matrices, but we can show using the general expressions for such OPE coefficients in terms of the covering map (see for instance Section 4.2 of [37]) that it is equal to a 1 c e a 0 c for some universal constant a 1 .

η → 0 limit
Let us now consider the η 1 OPE limit, which corresponds to either L A L B , or L C L B , or both. Note that the L A L B ≈ L C and L C L B ≈ L A cases have different interpretations due to the different roles played by A and C inρ ABC , but the answer is the same for both. Recall that from the discussion of the lower bound in Section 2.1, we expect the fidelity to approach 1 in this limit.
We can now use the OPEs between O 1 , O 2 and O 3 , O 4 to get the following series: and the coefficients f ABC , c Op and functions g are defined as in the previous subsection, but note that the order of dimensions appearing in g has changed. The set of operators O p contributing to the expansion are also different from the case in the previous subsection. The We can see that ∆ τ B is zero in the replica limit (2.23), indicating that the fidelity approaches a constant in this limit. In order to see that this constant is 1, we can interpret the quantities , and c τ B in terms of the density matrix. With y 1 < y 2 < y 3 on some spatial slice, and R and S defined as the intervals y 1 y 2 and y 2 y 3 , we have (2.77) Taking the replica limit of (2.75), (2.76), and (2.77), we find Putting this into (2.72), we find that the fidelity approaches 1 as η → 0.
For integer values of the parameters m i , n i , k, the corrections to the leading behaviour of (2.72) come from the descendants of Σ τ −1 B , and from other universal primary operators O p in the twisted sector of Σ τ −1 B . In the replica limit, where all dimensions ∆ 1 , ..., ∆ 4 as well as ∆ τ B go to zero, all coefficients B in the conformal block appear to go to zero. 9 In principle, we could have fractional modes of Σ τ −1 B appearing in the OPE: the lowestdimension such operator consistent with universality would be L − 2 N k . However, we can provide the following simple argument that all conformal block coefficients and OPE coefficients should go to zero on taking the replica limit of the m i , n i parameters, even when k is some positive integer. To see this, consider the following four-point functions and their OPE expansions: . Hence, the coefficients of all other contributions must 9 We have checked that this is true for the expressions in [35] and [38] up to k = 6. become zero in the replica limit. Indeed, we can observe by comparing (2.79), (2.81) with (2.75), (2.76) that the argument for these coefficients of subleading contributions reducing to zero is closely related to the argument for the leading coefficient to be 1 in the replica limit.
The natural conclusion from the above discussion would be that the fidelity approaches 1 faster than any power of η in the η → 0 limit, similar to discussions of the negativity in the far-interval limit [8,9]. But recall from the numerical results of Sec. 2.2 that we found F (ρ ABC ,ρ ABC ) ∝ η 2 in this limit, up to very small values of the fidelity close to 10 −6 . It seems likely, then, that the expression we get from first taking the OPE limit and then analytically continuing in m i , n i (as we do here) does not agree with the answer we get from first analytically continuing and then expanding for small η. The latter procedure should be the correct one, but cannot be implemented with the Liouville action methods due to the difficulties discussed in Appendix B.4.

Relative entropy and trace distance
In this section, we will consider two other measures of the distance between ρ ABC andρ ABC : the relative entropy and the trace distance. We study the relative entropy in Sec. 3.1, where the calculation and results turn out to be qualitatively quite similar to the discussion of the fidelity. We discuss the behaviour of the trace distance in 3.2. We compare both sides of various general inequalities relating the fidelity, trace distance, and relative entropy in Sec. 3.3.

Relative entropy
The relative entropy is the following measure of distance between two states ρ and σ: (3.1) A one-parameter generalization called the Petz-Renyi relative entropy has also been widely discussed in the literature [39,40]: (3.1) is the limit of (3.2) as α → 1. The D α are monotonically increasing in α. We can see that both (3.1) and (3.2) are zero if and only if ρ = σ, and diverge if the support of ρ is orthogonal to that of σ. These behaviours are similar to those noted for − log F (ρ, σ) in Section 2.1. However, unlike the fidelity, the relative entropies for α = 1/2 are not symmetric between ρ and σ. One consequence of this asymmetry is that these quantities diverge whenever the support of ρ is not contained in the support of σ, but not necessarily the other way round. In our discussion below, we take ρ = ρ ABC and σ =ρ ABC : this is the natural choice given that in the limit L B → 0, ρ ABC approaches ρ AC whileρ ABC approaches ρ A ⊗ ρ C . The case α = 1/2 is symmetric and corresponds to Holevo's just-as-good fidelity, which is equal to the fidelity between the canonical purifications of ρ and σ.
The relative entropy can also be given an interpretation in terms of distinguishing ρ and σ by measurements [41,42], but the setup is different from the one discussed for the fidelity in Sec. 2.1. Suppose we are given n copies of a certain state, and know that the state is one out of ρ and σ. We want to carry out some two-outcome measurement {M, I − M } on n copies, and guess that the state is ρ if the outcome is M , and σ otherwise. In general, we cannot choose M such that both p 1 , the probability of a "type 1" error where we are given ρ but guess σ, and p 2 , the probability of a "type 2" error where are given σ but guess ρ, are identically zero. If we choose the optimal M to minimize p 2 , while also requiring that p 1 vanishes as n → ∞, then we get an exponential decay p 2 = e −nD(ρ||σ) as n → ∞. In [43], the Petz-Renyi relative entropies for other values of α ∈ [0, 1] are also given interpretations in terms of similar protocols, which require more stringent conditions on the decay of p 1 with n.  Despite these differences between the fidelity and D α , we will find that the D α shares several features of the fidelity for the states ρ ABC ,ρ ABC in our setup. To evaluate (3.2), we can use the following replica trick [44]: first evaluate for integer α and β, and then take the replica limit β → 1 − α. We also replace the powers appearing inρ ABC with integers m i , n i as before. By similar manipulations to those in Section 2.3.2, we can express (3.3) as a four-point function of the form (2.47), now associated with twist operators on M = α + β(m 1 + m 2 + n 1 + n 2 + 1) copies. We use the same notation τ A,B,C for the permutations as in Sec. 2, but they now refer to the permutations shown in Fig. 14 and 15. In particular, • τ A has 1 cycle with α + β elements.
• τ C has one cycle with α + β(m 1 + m 2 ) elements. By plugging these cycle structures into (2.35), we find that the dimensions ∆ 1 , ..., ∆ 4 all go to zero on taking the replica values of m i , n i and β = 1 − α for any α. By similar reasoning to point 6 in Sec. 2.3.3, we can see that D α (ρ||σ) is independent of the UV cutoff. Plugging the cycle structures into the Riemann-Hurwitz formula (2.41), we find that the genus of the covering space is zero. The total number of copies also goes to 1 in the replica limit for any α, so D α (ρ||σ)/c is the same function of η in any CFT. We confirm this universal behaviour using numerical computations in the lattice models introduced in Sec. 2.2. See Fig. 16 for an illustration. We restrict to the case λ = 0 for numerical results in this section. in (2.72) is as shown in Fig. 15. This permutation has one cycle of length (m 1 + m 2 + 1)β + 1.
In the replica limit for D α , the dimension of this operator becomes This gives a leading behaviour Tr[ρ αρ1−α ] = e b 0 c (1 − η) 2∆α for some universal constant b 0 which can in principle be obtained from the Liouville actions for certain three-point functions. This coefficient can again be related to quantities associated with density matrices on two adjacent intervals. Like in (2.69), we can again write We can obtain the expressions for f 1α4 and c α in terms of the density matrix in a similar way to (2.61) and (2.64), in terms of the density matrices on two adjacent intervals R = y 1 y 2 and S = y 2 y 3 : From (3.6), we see that like in the case of the fidelity, one of the factors contributing to e b 0 c is related to the Petz-Renyi relative entropy between ρ RS and ρ R ⊗ ρ S in this limit.
The OPE coefficient f 23α is somewhat more complicated; the corresponding function of the reduced density matrices on R and S for integer values of β is shown using a tensor network representation in Fig. 17. This can be expressed in a relatively compact way in terms of two Figure 17. Tensor network representation of the quantity corresponding to the OPE coefficient f 23α for integer values of β. Each A stands for the matrix ρ n2 R ρ m1+m2 RS ρ n1 R , which becomes ρ . The dashed lines represent index contractions in S, and the solid lines represent index contractions in R.
copies of the Hilbert space on RS, H R 1 S 1 ⊗ H R 2 S 2 , as follows: where ρ T refers to the transpose of ρ, and |MAX A 1 A 2 is the unnormalized maximally entangled state between A 1 and A 2 : for some orthonormal basis |a of A. Also note that for a product state ρ RS = ρ R ⊗ ρ S , the RHS of (3.8) divided by Tr[ρ 3−2α S ] is equal to 1. Again, it would be interesting to see if this quantity has some general interpretation in terms of entanglement.
The leading, divergent behaviour of the entropy D α in the η → 1 limit is then given by The α → 1 limit of the above expression is Note that (3.11) has the same leading divergence as the conditional mutual information. We can heuristically understand this from the fact that both D(ρ ABC ||ρ ABC ) and the CMI reduce to I(A : C) on setting L B to zero. There are two possible sources of leading corrections away from the η → 1 limit: the descendants of τ −1 C τ A , and other primary operators. The next primary operator appearing in the OPE expansion is the fractional mode L − 2 (m 1 +m 2 +1)β+1 Σ τ −1 C τ A , which in the replica limit has dimensions ∆ 2,α = ∆ α + 2 3 − 2α ,∆ 2,α = ∆ α (3.12) We also have its antiholomorphic versionL − 2 (m 1 +m 2 +1)β+1 For α > 1 2 , the contribution is sub-dominant compared to the contribution from the linear term in the conformal block of Σ τ −1 C τ A . Including both corrections, we get Here we have used the universal formula for the first coefficient of the conformal block expansion defined in (2.51). b 1 is some universal constant which would in principle be determined from a covering map. In Fig. 18, we compare (3.13) to the numerical results for the Ising model, allowing b 0 and b 1 to be fitting parameters, and find good agreement.

η → 0 limit
Since we expect that the difference between the states ρ ABC andρ ABC vanishes in the η → 0 limit, we should expect each of the D α to approach zero in this limit. The four-point function that corresponds to Tr[ρ α ABCρ β ABC ] should therefore approach 1 in the replica limit. The expansion of the four-point function can be written as in (2.72), and the leading operator O p in the expansion is Σ τ B appearing in Fig. 14, which has one cycle of length α + β(m 1 + n 1 + m 2 + n 2 + 1). The dimension of this operator goes to zero on substituting β = 1 − α and the replica values of m i , n i , showing that Tr[ρ αρ1−α ] approaches a constant as η → 0. We can see that the constant is 1 by very similar arguments to the discussion from (2.75)-(2.78). Moreover, on considering corrections away from this limit, we run into precisely the same issue as that around (2.79)-(2.82), arriving at the conclusion that all OPE coefficients vanish in the replica limit. In contrast, numerically we again find quadratic behaviour of each of the D α at small η, The results for the Ising model are shown in Fig. 19, where the coefficient d α monotonically increases with α ∈ (0, 1].

Trace distance
We now turn to the trace distance, The trace distance takes values between 0, for the case where ρ = σ, and 1, for the case where the support of ρ is orthogonal to that of σ. Like the fidelity, this is a symmetric measure of the distance between two states. Operationally, if we consider all possible projectors P , or all possible positive operators satisfying P ≤ 1, the trace distance is given by It therefore tells us how well ρ and σ can be distinguished by comparing the probabilities of some measurement outcome, optimized over all possible measurements.
For the trace distance between ρ ABC andρ ABC in the CFT vacuum state, we can we can use the following replica trick, introduced in [45]: compute for even n e , and analytically continue to n e → 1. This replica trick allows us to see that similar to the fidelity and the relative entropy, the trace distance depends on the CFT only through its central charge and is independent of its operator content. To see this, we can note that for any n e , all terms appearing in (3.18) have the form for some integer l. All possible 0 ≤ l ≤ n e /2 can appear, and all possible x i , y i such that l i=1 (x i + y i ) = n e . The total number of copies is s = M Y + X, where M = m 1 + n 1 + m 2 + n 2 + 1, Y = l i=1 y i , and X = l i=1 x i . We can show that (3.19) can be again be written as a four-point function of the form (2.47), where 1. τ A has one cycle with N τ A = X+Y elements.
Using the Riemann-Hurwitz formula, each such correlation function has genus zero. We confirm this universality in Fig. 20, where the data points for the XXZ model with different compactification radius nearly coincide. Since (3.18) involves an increasing number of terms which each have exponential dependence on c, we cannot extract the c-dependence in the replica limit as an overall factor like in the case of − log F or the relative entropy. Let us restrict to understanding the small η behaviour numerically. In this regime, Fig. 20 shows that the trace distance is proportional to √ cη. We can also consider more general UV-regulated distance norms T n (ρ,ρ) = 1 2 ||ρ −ρ|| n ||ρ|| n , ||ρ|| n = (tr(|ρ| n )) 1/n (3.20) for other integers n, which reduces to (3.16) for n = 1. In all cases we find The proportional constant can be determined numerically, e.g., t 1 ≈ 0.28. Based on the numerical results,T n (ρ,ρ)/ √ c still weakly depends on c at large η.

Comparison between different distance measures
For any two quantum states, the trace distance T , fidelity F , and relative entropy D satisfy the following inequalities Recall that at small η, the three distance measures obey Eqs. (2.13), (3.15) and (3.20). In the first inequality, 1 − F ≤ T is trivially satisfied as T and 1 − F have different powers of η.
The other two are satisfied if and only if the following constraint holds on the coefficients, This is numerically found to be the case, where t 1 ≈ 0.28, √ 2f 2 ≈ 0.37 and log 2 2 d 1 ≈ 0.44. We see that both bounds are satisfied. In particular, T in (3.22) is much closer to √ 1 − F 2 than to 1−F . For any two pure states, we have T = √ 1 − F 2 . This suggests that the relation between ρ ABC andρ ABC is somewhat similar to the difference between two pure states. It would be good to understand this more precisely.

Differences in mutual information between ρ ABC andρ ABC
So far, we have quantitatively studied the differences between ρ ABC andρ ABC using various distance measures. Let us now try to understand more qualitatively which properties of the density matrix account for the distance between the two states. As noted in the introduction, the definition of P (λ) ensures that the reduced density matrices of ρ ABC andρ ABC on each of the subsystems A, B, C, and BC are precisely equal to those of ρ ABC . The differences between the two states must therefore be in the correlations between A and B, correlations between A and C, and tripartite correlations among the three systems.
Let us first understand the difference in the mutual information I(A : B) between the two states. Sinceρ AB = Tr C P B→BC (ρ AB ), and the channel Tr C P B→BC leaves ρ A ⊗ ρ B unchanged, we must have by the monotonicity of the relative entropy under quantum channels. Since the reduced density matrices in A and B are the same for both states, we have where S(ρ AB ) is the entanglement entropy of the vacuum state in a single interval and is given by (1.2). Below we will consider the Renyi version 3) and take its q → 1 limit to get (4.2). The q-th Renyi entropy of ρ AB can be expressed as a two-point function of twist operators of dimension ∆ (q) defined in (2.35), located at x 1 and for some constant r q which is proportional to 4∆ (q) . To find (4.2), let us then consider the twist operator representation of the q-th Renyi entropy ofρ AB , which can again be expressed as a four-point function like the quantities studied in previous sections, The total number of copies is q(m 1 + n 1 + m 2 + n 2 + 1), and the permutations appearing in this correlation function can be worked out similarly to previous sections. We find that: • τ A has one cycle of length q.
• τ −1 C τ B has one cycle of length q, and q cycles of length n 1 + n 2 + 2.
• τ C has q cycles of length m 1 + m 2 .
Putting this cycle structure into the Riemann-Hurwitz formula, we see that this correlation function has genus zero for any q. We confirm the resulting universal behaviour of δI(A : B) using numerical simulations in the free fermion and Ising CFTs, which have the same central charge c = 1/2 but different operator content, in Fig. 21. See appendix A.5. for a discussion of the methods used for the free fermion calculation. On taking the replica values of m i , n i , we see that the dimensions of the operators at x 2 and x 4 go to zero, and the dimensions of the operators at x 1 and x 3 are ∆ (q) . The UV divergence therefore cancels out between the two terms in the Renyi version of (4.2), and hence also in the q → 1 limit. Moreover, this difference depends only on the cross-ratio η. Let us understand the η → 1 limit of (4.5), where we can again use the general form (2.72), and the leading operator is O q = Σ τ −1 C τ A . For the quantity Tr[(ρ AB ) q ], τ −1 C τ A has one cycle of length q, and q cycles of length m 1 + m 2 . In the replica limit, this operator therefore has dimension ∆ (q) , and the prefactor f 1 defined in (2.50) becomes . (4.6) We therefore see that in the η → 1 limit, the leading behaviour of (4.5) is We can also observe that the coefficient appearing in the two-point function of O q in the replica limit is the same as the coefficient in the Renyi entropy of ρ AB (4.4). Putting (4.4) and (4.7) into (4.3), we find that the difference in the two mutual informations approaches a constant in the η → 1 limit, We have already identified r q in equation (4.4), and f 23Ōq and f 1Oq4 can also be given expressions in terms of the density matrix using similar logic to the previous sections. In fact, Figure 22. Tensor network representation of the quantity appearing in the η → 1 limit of δ q I(A : B) for q = 3. Dashed lines represent index contractions in S and solid lines represent index contractions in R.
we can see that f 1Oq4 is also equal to r q . f 23Ōq is related to the quantity shown in Fig. 22, which can be written in terms of a matrix A on three copies of S as Here R 1 , R 2 are two copies of R, and |MAX was defined in (3.9). For a product state . We confirm that the η → 1 limit of the q = 1 case approaches a constant non-zero value approximately equal to 0.18 in Fig. 21. It is interesting that even though we may naively interpret the η → 1 limit as a case where B vanishes, the difference in I(A : B) between the two states does not go to zero in this case. Roughly, this seems to come from a competition between the fact that B is becoming smaller, but at the same time the distance between the states ρ ABC andρ ABC is diverging according to − log F or D.
Let us now consider the mutual information difference for A and C, Its Renyi version δI q (A : C) can be defined similarly to (4.3). Recall that the definition of ρ ABC was such that any correlations between A and C had to be generated by a map that acts only on the subsystem B in ρ AB . It is therefore natural to expect that δI(A : C) should be positive, and we find in Fig. 21 that this is indeed the case. Moreover, the dependence on the individual interval lengths in the two terms of (4.10) cancels out such that (4.10) depends only on η. Let us first recall the behaviour of S(ρ AC ), the entropy of two non-adjacent intervals in the vacuum state. While this quantity is non-universal for general η, its leading behaviour in the η → 1 limit comes from the identity operator in the OPE between the operators at x 2 and x 3 , and is universal: Let us now consider the twist operator representation of Tr[ρ q AC ], which again has the general form (2.47). In this case, • τ A has one cycle of length q.
• τ −1 C τ B has q cycles of length n 1 + n 2 + 2, and one cycle of length q.
• τ C has one cycle of length q(m 1 + m 2 ).
From the cycle structure in this case, we can see that the genus of the covering surface is q −1, like in the case of Tr[ρ q AC ]. We can also see that in the replica limit for m i , n i , the dimension of each of the four operators becomes ∆ (q) . In the η → 1 limit, we again use the expansion (2.72). The leading operator O q = Σ τ −1 C τ A for this case has one cycle of length q, and one cycle of length (m 1 + m 2 )q, so that its dimension in the replica limit is 2∆ (q) . Putting this into (2.72), we find that the leading contribution in the η → 1 limit is The leading behaviour of the difference in mutual informations is then This agrees with the fitting in Fig. 21 up to finite size effects. It also precisely agrees with the behaviour of the conditional mutual information I(A : C|B). The CMI is often heuristically interpreted as quantifying correlations between A and C that are not mediated by B. From way in whichρ ABC is constructed, it is natural to view the quantity δI(A : C) as another measure of the unmediated correlations between A and C. Such unmediated correlations diverge as we make both A and C much larger than B, and both measures of such correlations turn out to agree in this limit. Let us now try to understand to the η → 0 limit of both quantities δI(A : B) and δI(A : C). Due to the issues with analytic continuation in this OPE limit discussed in previous sections, let us simply interpret the numerical results of Fig. 21 instead of trying to derive them from the twist operator formalism. We see that δI(A : C) vanishes faster than δI(A : B) in this limit. For small enough η, this suggests that the difference between ρ ABC andρ ABC is mostly accounted for by the difference between the reduced density matrices on AB rather than on AC. We confirm this in Fig. 23, where we compare the trace distances T (ρ ABC ,ρ ABC ), T (ρ AB ,ρ AB ), and T (ρ AC ,ρ AC ). Note that we cannot necessarily interpret this as coming from L C becoming much smaller than L B ; we can keep L B = L C constant and decrease η by decreasing L A .
At small η, we find from Fig. 21 that where d is the Hilbert space dimension. Since the Hilbert space dimension is infinite in the present setup, this inequality does not seem to imply any constraints. However, as discussed in [46], there may be versions of the inequality that can be applied to special subsets of states in infinite-dimensional systems. The behaviour (4.17) of δI(A : C) is interesting when compared with the leading behaviour of I(A : C) ρ . From the general arguments of [7], the leading power law of the mutual information between A and C in the vacuum state as η → 0 is where h +h is the total dimension of the lowest-dimension primary operator in the theory. For the free fermion CFT, h +h = 1/2, and for the Ising CFT, h +h = 1/8. The fact that this leading power law does not appear in δI(A : C) shows that I(A : C)ρ has exactly the same leading power law and coefficient. Physically, this tells us that we can think of the correlations between A and C captured by the leading power law (4.19) as correlations mediated by B, since they are present in both ρ ABC andρ ABC . This is consistent with the fact that the CMI is insensitive to the operator content of the theory and in particular to the correlations captured by (4.19), although we do not find a quantitative matching between these two measures of unmediated correlations in this case. In particular, I(A : C|B) is universal while δI(A : C) is not.

Sequential recovery on multiple regions
So far, we have considered a protocol for reconstructing the state by acting with a nontrivial channel from B to BC for arbitrary sizes of B and C, as in Fig. 1. One natural constraint we can impose on a physical reconstruction process is that in a single step, both the input and the output of the map can have volume no larger than some fixed V . Then consider a state ρ A 1 ...An , where each A i has volume V /2. We can refer to this as the "target state" which we wish to reconstruct. Suppose we start with the state ρ A 1 A 2 , and attempt to reconstruct the target state with a series of n − 2 steps, as follows. At the j-th step, we act with P A j+1 →A j+1 A j+2 , where P R→RS for any R, S refers to the map where ρ R , ρ RS are the reduced density matrices of the target state. If the state after n − 2 such steps isρ A 1 ...An , then the fidelity provides a new measure of multipartite correlations among the regions A i in the target state ρ.
In a state where the correlations decay on some length scale R, such as the ground state of a gapped system, it seems reasonable to expect that as long as V R d in d dimensions, the above sequential recovery procedure works just as well as a single step where we act on ρ A 1 A 2 with P A 2 →A 2 A 3 ...An . This reasoning was used to propose a multiple-step recovery process for systems with finite correlation length in [47]. In states with long-range entanglement, such as the vacuum state of a CFT or states obtained from non-equilibrium dynamics, we should expect (5.2) to be worse than the fidelity of reconstruction by a single-step process, and to contain further information about the entanglement structure of the state.
More concretely, let us consider a setup where the full system is divided into four subsystems. A few different recovery protocols we may consider this setup are shown in Fig.  24: 1. Protocol 1(A) is simply the protocol in Fig. 1, which was studied in the previous sections, with CD playing the role of C in that earlier setup. In Protocol 1(B), we act with P B→BC in the first step, forming the output state σ (1) ABC . In the next step, we act with P BC→BCD on σ (1) ABC . While this procedure naively appears different from Protocol 1(A), we can immediately see from the definition of P R→RS that it gives rise to exactly the same final state on ABCD, which we callρ (1) ABCD . 10 Recall from the discussion of Sec. 2.1 that we have the following lower-bound on the fidelity of this recovery: In Protocol 2, we start with the state ρ AB , and the first step is the same as in Protocol 1(B). In the next step, we act with the map P C→CD , which is restricted to only act nontrivially on C. This gives rise to a new stateρ (2) ABCD . By an extension of the methods used in [22], we show in Appendix C that the fidelity of this state with the target state obeys the following lower-bound in any quantum-mechanical system: 3. In Protocol 3, we start with ρ BC and act with P B→AB , which gives rise to a state σ ABC . 11 In the next step, we act with the same map as in Protocol 2, ending up with a state ρ The equivalence between comes from the fact that by the strong subadditivity inequality, each of the terms on the LHS is always non-negative. In Protocol 2, I(A : C|B) = 0 implies that the recovery in the first step is perfect, so that σ  10 The superscripts here refer to the different protocols and should not be confused with the value of λ, which we never write explicitly in this section. 11 Note that σ ABC has the same reduced density matrix as ρABC on BC, but different reduced density matrices on AB and AC. σ (2) ABC has the same reduced density matrix as ρABC on AB, but different reduced density matrices on BC and AC. Figure 24. In all cases, we start with the reduced density matrix of the target state on some subsystem. The blue shaded regions indicate where the next stage of the map acts non-trivially.

Protocol 1(A)
known as the "chain rule" of the CMI. The last two conditions on the RHS of (5.7) are therefore together equivalent to I(AB : D|C) = 0, which in turn implies that the second step of the recovery process in Protocol 2 yields the state ρ ABCD . (5.4) is therefore consistent with our expectations for the conditions for perfect recovery using Protocol 2. The conditions for perfect recovery using Protocol 3 can be understood similarly.
Let us now turn to understanding the fidelities associated with the sequential recovery procedures for the setup where A, B, C, D are adjacent intervals in the vacuum state of a (1+1)D CFT. By using the replica trick in a similar way to the discussion in previous sections, both F (2) and F (3) can be expressed as replica limits of five-point functions of twist operators: In both cases we still have five replica parameters, k, n 1 , n 2 , m 1 , m 2 , and the replica limit is the one defined in (2.23). The total number of copies is now s = k(2 + 2(m 1 + n 1 + m 2 + n 2 )). The five relevant permutations in the two cases have the following structure: 1. For F (2) , • τ −1 A has one cycle with 2k elements. • τ −1 B τ A has k cycles with m 1 + n 1 + 1 elements, and k cycles with m 2 + n 2 + 1 elements.
Putting these cycle structures into the Riemann-Hurwitz formula, we find that both (log F (2) )/c and (log F (3) )/c should be independent of any details of the CFT. The dimensions of all five operators go to zero in the replica limit, and these quantities are independent of the UV cutoff. In Fig. 25, we compare F (1) , F (2) , F (3) in the free fermion CFT, taking λ = 0 for each Petz map, for a setup where we fix L A = L D , L B = L C , and vary 12 .
We consider the regime of small η, where as discussed in previous sections, log F (1) = f 2 cη 2 for f 2 ≈ 0.07. We find that with f < f , confirming that the sequential recovery is less effective than the single-step case. For this setup, we have Again, the fidelity is much better than we would expect from the lower bounds (5.4) and (5.5).

Conclusions and discussion
In this paper, we studied the extent to which the reduced density matrix of the CFT vacuum state in 1+1 dimensions on some region can be reconstructed from smaller subregions by an explicit recovery channel called the twirled Petz map. We found that a variety of distance measures between the original and reconstructed states have a universal form that depends only on the central charge and the cross ratio. The recovery as measured by the fidelity turns out to be better than the minimum value expected from general information-theoretic bounds. We found the universal form numerically for all values of η, and analytically in the limit η → 1. We also studied differences in mutual information between the original and recovered states, and multiple-interval generalizations of the universal quantities associated with recovery. While the universality of the various quantities we considered seems rather mysterious from the arguments based on the covering space and the Riemann-Hurwitz formula, we can provide a simpler argument as follows: each of the quantities we considered can be written (after using the replica trick for some powers) as vacuum expectation values of the following form, 0|ρ 1 ρ 2 ...ρ n |0 (6.1) where each of the ρ i is a reduced density matrix of the vacuum state on some single interval. For example, from (2.3), we can write the fidelity between ρ ABC andρ ABC as The relative entropy, trace distance, and the multiple-interval generalizations F (2) , F (3) , and ..An ) defined in Section 5 can similarly all be expressed in the form (6.1). Now recall that in a conformal field theory, the density matrix of a ball-shaped region has a universal form in terms of the stress tensor T µν of the theory [48], which can be written for a single interval A = [− a 2 , a 2 ] in 1+1 dimensions as where C is some constant that ensures normalization. Expressions like (6.1) and (6.2) thus involve correlation functions of exponentials of the stress tensor in the vacuum state. In general, any operator constructed entirely from the stress tensor is expected to have a universal expectation value in the vacuum state of a (1+1)D CFT that is determined by the Virasoro algebra.
One interesting future direction would be to use the expression (6.3) for the single-interval density matrix, or the even more explicit expressions for the density matrix and correlation matrix in the free fermion CFT [49], in order to analytically evaluate the universal expressions for the fidelity and other quantities without having to use the replica trick. The expressions for quantities like F (ρ ABC ,ρ ABC ) in terms of the correlation matrix of the free fermion vacuum state are rather complicated, but it may be possible to simplify these with some further work. Certain expressions derived for the Petz map associated with a general fermion Gaussian channel in [50] may also be useful for this purpose.
An interesting conceptual question would be to understand better why the stateρ ABC is able to capture the short-distance entanglement structure of ρ ABC , so that the distance measures between ρ ABC andρ ABC , and the mutual information difference δI (A : B), are all well-defined quantities in the continuum limit. From the rigorous perspective of algebraic QFT, it is somewhat challenging even to understand why the mutual information of two non-adjacent intervals turns out to be a well-defined quantity in the continuum limit, as discussed in [51]. Extending such proofs to the conditional mutual information of three adjacent intervals and the distance measures between ρ ABC andρ ABC should lead to a deeper understanding of the structure of the QFT vacuum state.
Our results also provide a concrete setting for thinking about more general informationtheoretic questions about the twirled Petz map. In all cases with non-zero conditional mutual information that we considered in this paper, the general inequalities (2.4)-(2.6) were not saturated. We can ask whether or not it is possible to find any quantum states that saturate these inequalities when the CMI is non-zero. If such states can be found, then do they have some special structure, analogous to the Markov state structure of states that saturate the strong subadditivity inequality [14]? If not, then is there a stronger lower bound on the conditional mutual information in terms of some other recovery operation, or a stronger lower bound on the fidelity in terms of some other information-theoretic quantity in the state ρ ABC ? Similar questions can be asked about the multiple-interval generalizations of Sec. 5.
The striking difference in the behaviour of the maximum and the average fidelity for small η that we found in Fig. 4 is also worth understanding better. Recall that in this limit, the maximum fidelity showed a different power law from the conditional mutual information lower bound, while the average fidelity only differed from the lower bound by an overall constant. Due to our inability to understand the small η limit using the methods of this paper, we were not able to provide an understanding of this observation, but it would be interesting to see if this is an example of more general relations between the CMI and the average, or between the average and the maximum.
It would also be interesting to make more precise the sense in which tracing out C and then recovering with the Petz map acting on B leads to a loss of correlations between A and C that are not mediated by B. Our results in Sec. 4 show clearly that the map does not only change the correlations between A and C in the reconstructed state relative to the original state: for small η, the change in the correlations between A and B is much more significant. It would be instructive to use some simple toy models to better understand the tradeoffs between entanglement of A with B and with C that are achieved by the map, and how these are related to the entanglement structure of the original state. It should also be interesting to better understand the physical reason for the fact that δI(A : B) does not vanish in the η → 1 limit.
The quantities (2.68), (3.8), and (4.9) which arise naturally from the η → 1 limits of various measures, would also be interesting to understand better, as would the interpretation of the cyclic invariance discussed at the end of Sec. 2.4.
In any relativistic quantum field theory, it is possible to show a non-trivial lower bound on the conditional mutual information I(A : C|B) in setup of Fig. 2 (b), for the case where L A = L C L B . This is simply a restatement of the c-theorem of [4]. The starting point is the strong subadditivity inequality in the setup of Fig. 2 (a), with R 1 = R 3 = ∆ R 2 = R. We now consider the case of the vacuum state of a general relativistic QFT, where the strong subaddivitiy inequality is in general not saturated, but we still have S(AB) = S(BC) = S( R(R + 2∆)), and S(ABC) = S(R + 2∆). Then expanding LHS of the strong subadditivity inequality in this setup to quadratic order ∆/R, we find This inequality is interpreted in the literature as the monotonic decay of the c-function c(R) = RS (R) with R. Another interpretation is that we can write the LHS of (6.4) as S(R + ∆) + S(R + ∆) − S(R) − S(R + 2∆), so that it is the conditional mutual information I(A : C|B) for the configuration in Fig. 2(b) with three adjacent intervals of lengths ∆, R, and ∆ on a spatial slice. So in a relativistic QFT, the usual strong subadditivity inequality CMI ≥ 0 applied to the setup of Fig. 2(a) gives a stronger lower bound on the CMI for the setup of Fig. 2(b). We can then ask how this lower bound compares to the general information theoretic bounds of (2.4) and (2.6). The results of this paper show that in a conformal field theory, the c-theorem is stronger than the general information-theoretic bounds -the former is saturated, while the latter are not. If there does exist some stronger information-theoretic lower bound on the CMI, it would be interesting to see whether that coincides with the bound from the c-theorem. Yet another interesting question is whether the distance measures between ρ ABC and ρ ABC can be identified with some bulk geometric quantities in holographic CFTs. The reduced density matrix of a subsystem in the boundary CFT is expected to encode the properties of a bulk region known as its entanglement wedge [52,53]. Intuitively, the fidelity or relative entropy between ρ ABC andρ ABC should be related to the part of the entanglement wedge of ABC that lies outside the entanglement wedges of AB, B, and BC, from whichρ ABC is constructed. It would be interesting to try to identify a bulk dual by extending the methods of [54] used for the entanglement entropy, although this may be quite challenging due to the lack of replica symmetry in the permutations that define these quantities, and the issues with taking the replica limit that we encountered in the CFT calculations in Section 2.5 and Appendix B. One starting point may be to look for a bulk quantities which have the limiting behaviours in equations (1.10)-(1.13) in empty AdS 3 .
Finally, the differences between ρ ABC andρ ABC could provide a new way of characterizing the properties of interesting states with long-range entanglement that appear in other contexts in quantum many-body systems, such as topologically ordered states, or states obtained from nonequilibrium dynamics.

Acknowledgments
We would like to thank Gauri Batra, Lorenz Eberhardt, Thomas

A Numerical algorithm
In this section we explain how to compute the fidelity and other distance measures between ρ ABC andρ ABC numerically. For the spin chain models we use the periodic uniform MPS technique. This section will be mostly concerned about this technique. For the free fermion model, we may alternatively use the correlation matrix techniques, although high numerical precision will be needed due to inversion of an ill-conditioned matrix.

A.1 Coarse graining of periodic uniform MPS
Starting with the critial spin chain Hamiltonian with N spins, we can use a periodic uniform matrix product state with finite bond dimension χ to represent the ground state. Note that χ has to increase with the system size N [55,56]. The MPS can be obtained by energy minimization of the Hamiltonian [28] 13 . Let A, B, C, D be its four parties. We can obtain a coarse-grained state |ψ ABCD using the standard coarse graining of matrix product state [57], where the dimensions of the coarse grained Hilbert spaces are smaller than χ 2 (the Schimidt rank of the density matrices ρ i , i = A, B, C, D). In practice, since most of the Schmidt coefficients are very small, we can further reduce the dimension of the coarse grained Hilbert space d i by paying a small truncation error. For the detailed algorithm, see the Supplemental Material of [58]. We will use this approximation to reduce the numerical cost. With the sizes of the critical system under consideration (Ising model with L ≤ 128, XXZ model with L ≤ 60), we find that d i = 64 is enough to reproduce accurate results with truncation error on the order of 10 −7 .

A.2 Fidelity
As shown in Ref. [29], one may utilize the Uhlmann theorem to efficiently compute the fidelity of tensor network states. Here we will apply similar techniques to the Petz map fidelity.
The Uhlmann theorem states that the fidelity between two mixed states is the maximal fidelity between their purifications. Let ρ and σ be two mixed states, and |ψ ρ and |ψ σ be two arbitrary purifications, then In Ref. [29], it has been shown that if we can decompose ρ = XX † and σ = Y Y † , then the above expression reduces to where Tr|A| = Tr √ A † A is the trace norm (sum of singular values of A). Now we compute the matrices X, Y and X † Y . Let |ψ ABCD be a four-party pure state The state can be obtained using the coarse graining algorithm of the MPS in the last subsection. The reduced density matrix ρ ABC can be expressed as ρ ABC = XX † , where where we have introduced an auxiliary Hilbert space C with basis |c and the Hilbert space is isomorphic to C. Under the isomorphism between the bra and the ket, the matrix X becomes exactly the original pure state |X ABCD = |ψ ABCD and the matrix Y becomes a purification of the Petz recovered stateρ ABC , where In terms of tensor networks, the matrices X, Y and X † Y are shown in Fig. 26.

A.3 Renyi relative entropy
The Renyi relative entropy can also be computed efficiently using the MPS at a similar cost. The quantity is defined as for 0 < α < 1. The limit of α → 1 gives the usual relative entropy D(ρ||ρ) = tr(ρ log ρ − ρ logρ). (A.9) To start with, we use singular value decomposition of the matrix X to get the entanglement spectrum of ρ ABC , In order to compute the entanglement spectrum of the Petz recovered stateρ ABC , we singular value decompose the matrix Y , which gives Finally, we can compute The most expensive part of the algorithm is the singular value decomposition of Y , which is a 6-leg tensor. One crucial trick to reduce the numerical cost is that we never compute the matrix Y explicitly. Instead, we follow the following steps to compute the singular values and singular vectors, where the intermediate tensors have at most 4 external legs.
First of all, we can use a coarse graining isometry w : C D → E to reduce the auxiliary dimension in |Y . The isometry projects onto the subspace spanned by the support of ρ C D , neglecting eigenvalues below = 10 −7 . The coarse graining can be achieved with a small dimension d E since ρ C D has small entanglement, which follows from the fact that ρ C D isomorphic to the ground state density matrix on CD. The resulting tensor |Ỹ ABCEC * = w|Y ABCC DC * is a purification ofρ ABC on ABC ⊗ EC * . Next, we compute the eigenvalues q j and eigenvectors |φ j EC * of the density matrix ρ EC * =Ỹ †Ỹ . This part is the most expensive part of the algorithm, where the dominant cost is a diagonalization of a matrix of dimension d E d C . The order of contraction to obtain ρ EC * is such that we never encounter an intermediate tensor with more than four external legs. Finally, we obtain the singular vectors |φ j ABC by √ q j |φ j ABC = φ j | EC * |Ỹ ABCEC * . In order to compute these norms, we have to obtain the eigenvalues of ρ −ρ. Naively, ρ ABC andρ ABC are both 6-leg tensors, so computing the eigenvalues of ρ −ρ would require a diagonalization of a 6-leg tensor. However, we can use a similar trick as in the previous section to reduce the cost to a diagonalization of 4-leg tensors. The key observation is that the support of ρ ABC is at most d D dimensional, and the support ofρ ABC is at most d E d C dimensional. Thus, their difference ρ −ρ is supported on a subspace which is at almost d E d C + d D dimensional (the direct sum of the two supports). In the case of d D d E d C , the dominant cost is similar to diagonalizing ρ EC * in the previous section. However, the support of the two density matrices are not orthogonal subspaces, and in fact they are almost identical at small η. This introduces an extra complication of diagonalizing an ill-conditioned norm matrix, which increases the error to the order of 10 −4 in the worst case considered (see details below).
We define a norm matrix N ij as 18) where i runs in the support of ρ ABC , which is at most d D dimensional, and j runs in the support ofρ ABC , which is at most d E d C dimensional. Assuming the supports are linearly independent, we define an extended norm matrix, which encodes the overlap of basis vectors in the direct sum of the two supports.Ñ The extended norm matrix is ill-conditioned if the two supports are close. In this basis, the two density matrices can be represented as where P and Q are diagonal matrices with entries p i and q i , respectively. In order to obtain the trace distance, we have to find an orthonormal basis of the direct sum of the support. To do this, we diagonalizeÑ = W DW † , where W is unitary and D is diagonal. The difference ρ −ρ in the orthonormal basis defined by the columns of W is Thus, the distance norm is Finding the orthonormal basis is not a numerically stable process ifÑ is ill-conditioned. In the worst case considered, that is, the case with smallest η, which happens for L A = L C = 2, L B = 62 with the total length L = 128 of the Ising model, the total error is roughly escalated to 10 −4 , which is roughly 10 3 times the truncation error 10 −7 . At larger η, the escalation is alleviated as expected, where the total error falls onto the same order of the truncation error.

A.5 Comments on the correlation matrix method for the free fermion
For the free fermion CFT, one may alternatively use the correlation matrix method to compute the Petz map fidelity. This allows us to compute the fidelity exactly with the numerical cost scaling as O(L 3 ), where L is the size of the subsystem A, B, C. The reconstruction is given in explicit form in Ref. [50], see their Eq. (44) and Eq. (51) in particular. Here we provide additional necessary input for the specific CFT example.
The key idea is that the reduced density matrix is completely determined by the twopoint correlation functions of the fermionic operator. Let us consider the critical Hamiltonian which is dual to the critical Ising model under the Jordan-Wigner transformation, where γ i is the fermion operator with anticommutation relation {γ i , γ j } = 2δ ij . The correlation matrix of any Gaussian state is defined as G jl = (i/2)tr([γ j , γ l ]ρ). The ground state correlation matrix can be obtained analytically using the Fourier transform, .
Now we can apply the techniques in Ref. [50] to obtain the Petz map fidelity. Although the formula is straightforward, one has to take extra caution to take the matrix inverse in their Eq. (44). This is because the state is weakly entangled and the majority of the eigenvalues of G of an interval is close to ±i. This then requires very high precision in the numerics. One may use arbitrary precision linear algebra provided in programming language such as Julia or Python. Such a need for high precision is already noted in Ref. [59] when they considered the entanglement Hamiltonian. In our numerics, we observe that approximately 5L digits of floating point precision is needed for diagonalization of a correlation matrix on 2L contiguous fermions.

B Covering maps and Liouville actions
We review the general setup and formulas for the Liouville action in Sec. B.1, and then derive various specific correlation functions of twist operators relevant for the main text in the subsequent sections.

B.1 Review of Liouville action
Let us review the methods of [23] for calculating a correlation function of twist operators by mapping the original space to a covering space. We will closely follow the notation of Appendix D of [33], and we refer the reader to that reference for a clear and detailed derivation of the formulas briefly summarized below. Suppose we have M copies of the CFT, and a K-point function of twist operators in S M , The right-hand side should be seen as a more precise version of the definition of the twist operators from (2.30). It refers to a partition function on M copies of the CFT, with a circle of radius cut out around each of the z j , at which we put the boundary conditions that a field Ψ I on the I'th copy is sent to Ψ τ j (I) on going around the circle. The definition also includes an IR cutoff δ, so the metric on the original space is given by which describes two large discs glued together at radius 1/δ. The main idea of [23] is that if we can find a map z(t) such that the fields as a function of the coordinate t become single-valued, then on the space labelled by t, known as the covering space, (B.1) can be equated with a partition function of the theory without any twisted boundary conditions. We will refer to the original space labelled by z as the base space below. The covering space can in general be some higher genus surface, but we can restrict to the case where it is genus zero for the correlation functions of interest in this paper. Even in this case, the covering space path integral is non-trivial due to the fact that the metric as a function of the coordinate t is no longer the flat metric. Instead, if (ds 2 )g is the flat metric dtdt on the covering space, then the curved metric induced from the base space is where the last equality can be seen as the definition of φ(t), and we have not explicitly written down a similar equation for the second half of the sphere in (B.2). Then by the Weyl anomaly, where S L is the Liouville action associated with φ, whereR refers to the curvature on the covering space. We see that the main ingredient for computing S L is to find the covering map z(t), and use it to find φ(t). Suppose the twist operator τ 1 consists of a single cycle (1, 2, ..., p), so that on going around z 1 , we have ψ 1 → ψ 2 → ... → ψ p → ψ 1 . This means that to return to ψ 1 , we need to go around z 1 p times by an angle of 2π on the base space. For the fields to be single-valued on the covering space, going an angle of 2πp around z 1 should correspond to going around an angle of 2π around the image t 1 of z 1 . So close to t 1 , the map should have the form for some constant a 1 . Now if τ 1 consists of multiple cycles which have lengths p 1 , ..., p N 1 , we should have N 1 distinct images t 1 , ..., t N 1 of z 1 , such that for some constants a i . All such images of z j with p i > 1, which correspond to the presence of non-trivial cycles on the base space, are referred to as "ramified points" on the covering space, and p i is referred to as the ramification index. Now if τ 2 has N 2 cycles of length p N 1 +1 , ..., p N 1 +2 , then z 2 should have N 2 ramified images, close to which we should have a similar behaviour to (B.7) with z 1 replaced with z 2 , and so on for other z j up to j = K − 1. It is convenient to set the last point z K = ∞ using global conformal invariance. (More precisely, this means it is placed at the center of the second disc in (B.2).) Let τ K have N K cycles, which we label q 0 , ..., q K−1 . The first cycle can be mapped to infinity on the covering space without loss of generality, and the remaining cycles correspond to multiple poles of the covering map at t = t ∞ i , for i = 1, ..., In addition to this, the map may also have simple poles at points i , which do not correspond to any cycles of the permutation but do contribute to the Liouville action. We can add these poles to the set of {t ∞ i }, with q i = 1, and also label their residues with b i . The above discussion gives us the parameters {a i , p i }, i = 1, ..., N J , where N J is the total number of cycles in all permutations at finite points z j , and {b i , q i }, i = 0, ..., N K + N P − 1, where N K is the total number of cycles of the twist operator placed at infinity and N P is the total number of simple poles. In order to find the a i , b i , we have to find the explicit form of z(t) which is consistent with the above expansions close to each of the t i and t ∞ i . As shown in [33], once we do find these parameters, S L can be expressed entirely in terms of them and in terms of the IR and UV cutoffs. After dividing the twist operators by an appropriate normalization factor so that their two-point function at unit separation becomes 1, we get an expression that depends only on {a i , p i }, {b i , q i }, and the IR cutoff δ: Some general methods for finding a genus zero covering map were developed in [23] and [36]. We will make use of these methods in our discussion of various examples below.
B.2 Liouville action for fidelity of ρ RS and ρ R ⊗ ρ S Let us evaluate the quantity f 1Oa4 √ c Oa , which appeared in the coefficient of the η → 1 limit of the F (ρ ABC ,ρ ABC ) in Sec. 2.4. The main purpose of this calculation is to show a simple example of a quantity appearing in this paper where the Liouville action can be computed explicitly and analytically continued, before discussing more complicated cases in later sections.
Recall that in the replica limit, f 1Oa4 √ c Oa is UV finite. It can therefore be written as a limit of the appropriately normalized twist operators written in the previous section. In the discussion of Sec. 2.4, f 1Oa4 was related to Tr[(ρ R ρ m 1 +m 2 S ρ RS ) k ] for two adjacent intervals R and S, which became Tr[(ρ R ρ S ρ RS ) 1 2 ] on taking the replica limits of m 1 , m 2 , k. Since we are eventually interested in this replica limit, let us simplify the calculation by writing Tr[(ρ R ρ S ρ RS ) . The latter quantity for integer k can be related to a three-point function of the following permutations on 3k copies: 3, 4, 6, ..., 3(k − 1) + 1, 3k) (B.11) τ 3 = (2, 3, 5, 6, ..., 3k − 1, 3k) (B.13) τ 1 and τ 3 each have one cycle of length 2k, and τ 2 has k cycles of length 3. In calculating Tr[(ρ R ρ S ρ RS ) k ], τ 1 , τ 2 , τ 3 appear in order from left to right at the three endpoints of R and S. To find the covering map, it is convenient to use the cyclic invariance of the three-point function mentioned in Section 2.4 to put τ 3 , τ 1 , and τ 2 at 0, 1, and ∞ respectively. We then have To use the formula (B.10) for the correlation function of normalized twist operators, we need to find the covering map z(t) and in particular the parameters a i , b i . From the cycle structures of τ 3 , τ 1 , and τ 2 , z = 0 and z = 1 should have one ramified image each with ramification order 2k, and these can be mapped to t = 0 and t = 1 using global conformal transformations. z = ∞ should have k ramified images with ramification order 3. We can use global conformal invariance to map one of these images to infinity, but the remaining images t ∞ 1 , ..., t ∞ k−1 cannot be arbitrary. While it is in general complicated to evaluate three-point functions involving multiple-cycle twist operators for this reason, we can use a trick inspired by Appendix C of [10] to find the covering map for this particular case.
We first look for the covering map z(w) for a three-point function of single-cycle twist operators on three copies of the theory of length 2, 2, and 3 respectively placed at z = 0, z = 1, and z = ∞. Say we fix the images of z = 0, 1, ∞ under this map to be t = 0, 1, ∞ respectively. Then we can compose this with a map w(t) for a two-point function of singlecycle twist operators of length k at the points w = 0 and w = 1 respectively. By expanding both w(t) and z(w) close to their ramified points, we can convince ourselves that the map z(t) resulting from this composition has precisely the structure we need for the three-point function (B.14). Now the two steps of finding z(w) and w(t) are both simple. We want z(w) to have the The point needs to be determined. Now from the behaviour of the derivative z (t) close to 0, 1, and , it must have the form z (t) = C t n 1 +n 2 +1 (t − 1) m 1 +m 2 +1 (t − ) m 2 +n 2 +2 .

(B.35)
This also has the right behaviour z (t) ∝ t m 1 +n 1 as t → ∞, so we can take C to be a tindependent constant. Now to determine the parameters and C and the overall constant on integrating (B.35), we need to impose the following conditions: 1. Since z (t) is a total derivative, its residue at the multiple pole should vanish [31,36]. This determines in terms of m i , n i .
2. z(t = 0) = 0, and z(t = 1) = 1. These conditions determine C and the constant of integration in terms of m i , n i .
After solving for these parameters, we can find the a i , b i in (B.34) in terms of m i , n i and substitute into (B.10) to find Q, and find f 23Ōa √ c Oa by taking the replica limit. It turns out, however, that it is not possible to find an analytic expression for in terms of m i , n i from point 1. The condition is Res t= z (t) = 0 (B.36) where Res t= z (t) = d m 2 +n 2 +1 dt m 2 +n 2 +1 t n 1 +n 2 +1 (t − 1) m 1 +m 2 +1 | t= (B.37) where (x) n refers to the Pochhammer symbol x(x + 1)...(x + n − 1), and Γ and 2 F 1 to the Gamma function and the hypergeometric function. The final expression (B.39) is not particularly illuminating, but we can already see from (B.38) that the condition for the vanishing of the residue is a polynomial equation in whose degree increases with m 1 , n 1 . This means that we cannot obtain an analytic expression for in terms of arbitrary integers m i , n i . As a result, we cannot obtain S L and f 23Ōa √ c Oa in the replica limit using analytic continuation in this case.
One reason we should not be too surprised that there did not turn out to be an analytic solution in this case is that all expressions above depend on m i , n i only through the combinations m 2 + n 2 , m 1 + n 1 , n 1 + n 2 , m 1 + m 2 . The dependence on the parameter λ of the twirled Petz map cancels out in each of these combinations. So any analytic expression we had found from the above procedure would have resulted in a value of f 23Ōa √ c Oa that did not depend on λ. This would be inconsistent with the nontrivial λ-dependence that we observe in Fig. 13.

B.4 Covering map for the fidelity of ρ ABC andρ ABC
Recall that we equated the replica version of the fidelity to a four-point function of twist operators: F k,n 1 ,n 2 ,m 1 ,m 2 = Σ −1 where τ −1 A has one cycle with 2k elements, τ −1 B τ A has k cycles with m 1 + n 1 + 1 elements and k cycles with m 2 + n 2 + 1 elemets, τ −1 C τ B has k cycles with n 1 + n 2 + 2 elements, and τ C has one cycle with (m 1 + m 2 + 1)k elements. It is convenient to first use cyclic invariance and then put three of the points at 0, 1, and ∞, so that we instead need to evaluate (B.41) Next, using a composition trick similar to [10] and Sec. B.2, above, we can reduce the problem to two steps: first, we need to find the covering map z(w) for a four-point function for certain permutations τ i on N = n 1 + m 1 + n 2 + m 2 + 2 copies, and then we compose this with a covering map w(t) for single-cycle twist operators of length k at the images of η and 1 in w-space under the first map z(w). In trying to do the first step of finding z(w), we will encounter a more complicated version of the same difficulty as in the previous subsection, which will prevent us from obtaining an analytic expression. In (B.42), τ 1 has one cycle with n 1 + n 2 + 2 elements, τ 2 has one cycle with m 1 + m 2 + 1 elements, τ 3 has one cycle with 2 elements, and τ 4 has two cycles: one with m 1 + n 1 + 1 elements, and one with m 2 + n 2 + 1. So the map z(w) should have the following behaviours: z ≈ a 1 w n 1 +n 2 +2 , w → 0 (B.43) z ≈ η + a 2 (w − w 1 ) m 1 +m 2 +1 , w → w 1 (B.44) Here w 1 , w 2 are points in the w plane which need to be determined. As in the previous sections, based on the above behaviours we can write z (w) = C w n 1 +n 2 +1 (w − w 1 ) m 1 +m 2 (w − 1) (w − w 2 ) m 2 +n 2 +2 (B.48) As w → ∞, (B.48) implies that z (w) ∝ w m 1 +m 2 , which is the expected behaviour from (B.47), so there is no need to include additional poles, and C is a constant with respect to w. To determine w 1 , w 2 , and C, we must use the following conditions: 1. From the fact that z is a total derivative, Res w=w 2 z (w) = 0 . (B.49) 2. z(0) = 0, z(1) = 1, z(w 1 ) = η, z(w 2 ) = ∞ .
Again, the difficulties will arise from the residue condition (B.49). Note that Res w=w 2 z (w) = d m 2 +n 2 +1 dw m 2 +n 2 +1 w n 1 +n 2 +1 (w − w 1 ) m 1 +m 2 (w − 1) | w=w 2 (B.50) Using the binomial expansion for (w − w 1 ) m 1 +m 2 , we find Now we need to solve for the above expression being equal to zero to express w 2 in terms of w 1 or vice versa, and then impose the requirements in point 2. But again from (B.51), we see that we have a polynomial equation of arbitrarily high degree in either w 1 or w 2 for general m i , n i , which cannot be solved analytically. So again, we cannot obtain an analytic expression for the Liouville action using this method. Once again, note that all expressions in this section involved λ-independent combinations of the m i , n i . So not being able to solve the above equations and analytically continue is important for consistency with the non-trivial λ-dependence we saw in this quantity in Sec.

C Proof of lower bounds for multiple-step recovery
Let us show the general bounds of (5.4) and (5.5) for the multiple-step recovery process. We discuss the case (5.4) in detail, and briefly mention the similar proof of (5.5).
Let us start with the twirled Petz map for a general channel N , following [22]. If N is a map S → B, then the twirled Petz map P N ,σ (for some reference state σ ∈ S) is defined as P (λ) σ,N (ρ) = σ in terms of which the adjoint can be expressed simply as Now we consider the composition of two maps, N 1 : S 1 → S 2 , and N 2 : S 2 → B, as shown in Fig. 28. For each of these we can introduce isometric extensions, using the isometries U S 1 →S 2 E 1 and V S 2 →BE 2 respectively. Then we can express the adjoints as follows: and let G : S → L(H) a bounded map that is holomorphic on the interior of S and continuous on the boundary. Let θ ∈ (0, 1) and define where we use the norm A p p = Tr(A † A) p/2 . Now let us take p 0 = 2, p 1 = 1, and θ ∈ (0, 1), so that p θ = 2/(1 + θ). Then note that for all α ∈ (1/2, 1), where we have defined (C.13) We expect that for general channels N 1 and N 2 , lim α→1∆ α = D(ρ||σ 1 ) − D(N 1 (ρ)||N 1 (σ 1 )) + D(N 1 (ρ)||σ 2 ) − D(N 2 • N 1 (ρ)||N 2 (σ 2 )) (C.14) For the particular case of Fig. 29 = D (ρ ABCD || exp (log ρ CD − log ρ C + log ρ BC − log ρ B + log ρ AB )) = lim α→1D α (ρ ABCD || exp (log ρ CD − log ρ C + log ρ BC − log ρ B + log ρ AB )) (C. 19) whereD α is the Sandwiched Renyi relative entropy [61], where the product on the left-hand side can be taken in any order. Using this for the exponential in (C.19), with n = (2α)/(1 − α), we find that the final expression in (C. 19) is precisely the α → 1 limit of (C.18) (using the fact that X † X has the same eigenvalues as XX † for any X). The proof of (5.5) can be seen very similarly by taking N 1 = Tr D , σ 1 = ρ AB ⊗ ρ CD , and N 2 = Tr A , σ 2 = ρ AB ⊗ ρ C in the above discussion. In particular, for (C.13) in this case, we get a product of the same set of matrices ρ CD that appear in (C.18) in a different order. Recall that in the α → 1 limit, from (C.21), we can choose to put the matrices in the exponent in (C. 19) in any order.