Dissimilarities of reduced density matrices and eigenstate thermalization hypothesis

We calculate various quantities that characterize the dissimilarity of reduced density matrices for a short interval of length $\ell$ in a two-dimensional (2D) large central charge conformal field theory (CFT). These quantities include the R\'enyi entropy, entanglement entropy, relative entropy, Jensen-Shannon divergence, as well as the Schatten 2-norm and 4-norm. We adopt the method of operator product expansion of twist operators, and calculate the short interval expansion of these quantities up to order $\ell^9$ for the contributions from the vacuum conformal family. The formal forms of these dissimilarity measures and the derived Fisher information metric from contributions of general operators are also given. As an application of the results, we use these dissimilarity measures to compare the excited and thermal states, and examine the eigenstate thermalization hypothesis (ETH) by showing how they behave in the high temperature limit. This helps to understand how ETH in 2D CFT can be defined more precisely. We discuss the possibility that all the dissimilarity measures considered here vanish when comparing the reduced density matrices of an excited state and a generalized Gibbs ensemble thermal state. We also discuss ETH for a microcanonical ensemble thermal state in a 2D large central charge CFT, and find that it is approximately satisfied for a small subsystem and violated for a large subsystem.


Introduction
Motivated by the eigenstate thermalization hypothesis (ETH) [1,2] and its generalization, the subsystem ETH [3,4], it is important to characterize quantitatively the difference between the excited state and the thermal state. One such characterization is to quantify the difference between the reduced density matrices of these two states over a local region. This is also an interesting question in its own right in quantum information theory. For two-dimensional (2D) conformal field theory (CFT), many quantities have been adopted for examining ETH, such as correlation functions [5,6], entanglement entropy, Rényi entropy, relative entropy [3,4,7,8,9,10], trace square [11], etc. Due to the infinite number of degrees of freedom in CFT, not every quantity is suitable for examining ETH [3,4], unless its behavior for both excited and thermal states is known precisely.
It was proposed in [12] to use correlation functions of twist operators to calculate the Rényi entropy in a 2D CFT, i.e., the partition function of the Riemann surface resulting from the replica trick. When there is no compact form for these twist-operator correlation functions, one can use operator product expansion (OPE) of twist operators to calculate the short interval expansion of Rényi entropy [13][14][15][16][17].
Following this method, in this paper we will calculate various quantities which are just the sums of some partition functions, and moreover can be used to characterize the dissimilarity of the reduced density matrices of thermal and excited states, and other states on various Riemann surfaces.
Our results can be used to examine ETH. The ETH and subsystem ETH were originally defined by comparing a highly excited state with the microcanonical ensemble thermal state [1][2][3][4]. Motivated by [18,19], as well as [5][6][7][8], in [10] we compared the excited state with the canonical ensemble thermal state, and adopted the so-called weak ETH [18,19]. In [10] the short-interval expansions of the entanglement entropies for the excited state and the canonical ensemble thermal state were calculated to order $\ell^8$, and it was found that their difference, which is just the relative entropy, is only suppressed by powers of the large central charge $c$, instead of being exponentially suppressed. In this paper we show that the Jensen-Shannon divergence and Schatten 2-norm behave similarly. For a more refined consideration, one should compare the excited state with the generalized Gibbs ensemble (GGE) thermal state [20][21][22][23][24][25]. We will discuss the possibility that all the dissimilarities considered in this paper vanish when comparing the reduced density matrices of an excited state and a suitably defined GGE thermal state. As a by-product, we also check ETH for the microcanonical ensemble thermal state, using the dissimilarity measures to compare it with the energy eigenstate.
The rest of this paper is arranged as follows. In section 2 we give prescriptions of the method and show how to get the partition functions from the OPE of twist operators. Moreover, in subsection 2.5 we apply the prescriptions to evaluate the Rényi and entanglement entropies. In section 3 we calculate the various dissimilarity measures between reduced density matrices. In section 4 we apply our results to examine ETH for the canonical ensemble thermal state. In section 5 we discuss possible scenarios of ETH for the GGE thermal state. In section 6 we discuss ETH for a microcanonical ensemble thermal state in a 2D large central charge CFT, and find that it is approximately satisfied for a small subsystem and violated for a large subsystem. We conclude with a discussion in section 7. In appendix A we calculate the relative entropy from the modular Hamiltonian as a consistency check. In appendix B we consider the contributions from general operators, and get the formal forms of the various dissimilarity measures and the Fisher information metric. Some lengthy and not so enlightening results of section 3 are collected in appendix C.

Prescriptions of the method
In this section we first give the useful basics of the vacuum conformal family in two-dimensional large central charge CFT and then show how we calculate the partition functions on various Riemann surfaces using OPE of the twist operators.

CFT basics
In this paper we only consider the contributions from the holomorphic sector of the vacuum conformal family in a two-dimensional large central charge CFT; the generalization to the antiholomorphic sector is straightforward. We need the quasiprimary operators up to level 9, i.e., $T$, $\mathcal{A}$, $\mathcal{B}$, $\mathcal{D}$, $\mathcal{E}$, $\mathcal{H}$, $\mathcal{I}$ and $\mathcal{J}$, as shown in table 1. The definitions, normalization factors, and conformal transformations of the quasiprimary operators up to level 8, as well as some useful structure constants, can be found in [10,16,17,26]. The level 9 operator $\mathcal{J}$ is built from normal-ordered products, with $(XY)$ denoting the normal ordering of two operators $X$ and $Y$. Under a general conformal transformation $z \to f(z)$ it transforms as $\mathcal{J}(z) = f'^9 \mathcal{J}(f) + \cdots + \frac{c(2c-1)(5c+22)(4 s^2 s''' + 15 s'^3 - 18 s s' s'')}{259200}$, where $s$ denotes the Schwarzian derivative and $\cdots$ represents the omitted terms that are proportional to $T$, $\mathcal{A}$, $\mathcal{B}$, $\mathcal{D}$ and their derivatives.

OPE of twist operators
For one short interval $A = [0, \ell]$ on a Riemann surface $\mathcal{R}$, the replica trick leads to a CFT on an $n$-fold Riemann surface $\mathcal{R}^n$. The partition function on $\mathcal{R}^n$ can be written as a two-point function of twist operators $\mathcal{T}$ and $\tilde{\mathcal{T}}$ in an $n$-fold CFT on $\mathcal{R}$ [12], $\mathrm{tr}_A \rho_A^n = \langle \mathcal{T}(\ell) \tilde{\mathcal{T}}(0) \rangle_{\mathcal{R}^n}$, (2.5) and the $n$ folds of the CFT, which we call CFT$^n$, are independent except for the connection by the twist operators. In this paper we only consider Riemann surfaces $\mathcal{R}$ with translation symmetry, so the one-point functions are all constants. Using the OPE of twist operators [13][14][15][16][17], we get the expansion (2.6), and in the summation we only need to consider the quasiprimary operators $\Phi_K$ in CFT$^n$ that are direct products of the quasiprimary operators in different replicas of the CFT. Considering only the contributions from the vacuum conformal family, we list the quasiprimary operators in CFT$^n$ up to level 9 in table 2. Up to level 8, the coefficients $d_K$ can be found in [16,26], and the level 9 coefficient (2.7) follows from the method in [15]. Interestingly, there is no contribution from level 9 operators, which consist of $\mathcal{J}$ only.
Table 2: The holomorphic nonidentity quasiprimary operators of CFT$^n$ up to level 9 that are considered in this paper. We have omitted the replica indices and their constraints, which can be easily figured out and can also be found in [26].
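Each CFT$^n$ quasiprimary operator $\Phi_K$ in (2.6) has the form $\Phi_K^{j_1 j_2 \cdots j_k} = X_1^{j_1} X_2^{j_2} \cdots X_k^{j_k}$, with $X_1, X_2, \cdots, X_k$ being nonidentity quasiprimary operators in table 1, and there are also some constraints on the $k$ replica indices $j_1, j_2, \cdots, j_k$. The one-point functions $\langle \Phi_K^{j_1 j_2 \cdots j_k} \rangle_{\mathcal{R}^n} = \langle X_1 \rangle_{\mathcal{R}} \langle X_2 \rangle_{\mathcal{R}} \cdots \langle X_k \rangle_{\mathcal{R}}$ are independent of the replica indices, so we can define $b_K$ from the OPE coefficient $d_K^{j_1 j_2 \cdots j_k}$ by summing over the replica indices subject to these constraints [17], $b_K = \sum_{j_1, j_2, \cdots, j_k} d_K^{j_1 j_2 \cdots j_k}$. (2.10) Up to level 8 the form of $b_K$ can be found in [10,17], and from (2.7) we obtain the level 9 result (2.11). We then write (2.5) explicitly as the expansion (2.12). Due to the absence of a level 9 contribution, the unknown terms in (2.12) start from $O(\ell^{10})$.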
In this paper we consider several different Riemann surfaces that serve as environments of a short interval $A = [0, \ell]$, and they are shown in figure 1. Note that the complex plane case, figure 1(a), can be obtained as a limit of the other six cases.
• In figure 1(a), the interval is on an infinite straight line in the ground state of the CFT. The manifold is just the complex plane $\mathcal{R}(\emptyset)$, and we denote the density matrix of the total system as $\rho(\emptyset)$ and the reduced density matrix as $\rho_A(\emptyset)$.
• In figure 1(b), the interval is on a circle of length $L$ in the ground state, and the manifold is a vertical cylinder $\mathcal{R}(L)$. We have the density matrix $\rho(L)$ and the reduced density matrix $\rho_A(L)$.
• In figure 1(c), the interval is on a circle in the excited state $|\phi\rangle$ of a primary operator $\phi$ with conformal weight $h_\phi$ and normalization $\alpha_\phi = 1$. The manifold is a vertical cylinder capped with an operator inserted at each of the two ends, and we denote it as $\mathcal{R}(L, \phi)$. We have the density matrix $\rho(L, \phi)$ and the reduced density matrix $\rho_A(L, \phi)$.
• In figure 1(d), the interval is on an infinite straight line in the thermal state with inverse temperature $\beta$. The manifold is a horizontal cylinder $\mathcal{R}(\beta)$, and it is the modular transformation of $\mathcal{R}(L)$. We have the density matrix $\rho(\beta)$ and the reduced density matrix $\rho_A(\beta)$.
• Figure 1(e) is the modular transformation of figure 1(c). The interval is on an infinite straight line in the thermal state with inverse temperature $\beta$, and there are also boundary conditions imposed on both ends of the horizontal cylinder. Each boundary condition is effectively represented by the insertion of a primary operator $\phi$. We have the Riemann surface $\mathcal{R}(\beta, \phi)$, the density matrix $\rho(\beta, \phi)$ and the reduced density matrix $\rho_A(\beta, \phi)$.
• In figure 1(f), the interval is on a circle of length $L$ in the thermal state with inverse temperature $\beta$. The temperature is low, $\beta \gg L$, and the manifold is a fat torus. In the limit $\beta/L \to \infty$, it becomes the vertical cylinder of figure 1(b). We have the Riemann surface $\mathcal{R}(L, q)$, the density matrix $\rho(L, q)$ and the reduced density matrix $\rho_A(L, q)$, with the definition $q = e^{-2\pi\beta/L}$.
• In figure 1(g), the interval is on a circle of length $L$ in the thermal state with inverse temperature $\beta$. The temperature is high, $L \gg \beta$, the manifold is a thin torus, and it is the modular transformation of the fat torus of figure 1(f). In the limit $L/\beta \to \infty$, it becomes the horizontal cylinder of figure 1(d). We have the Riemann surface $\mathcal{R}(\beta, p)$, the density matrix $\rho(\beta, p)$ and the reduced density matrix $\rho_A(\beta, p)$.
We need the one-point functions $\langle X \rangle_{\mathcal{R}}$ with $X = T, \mathcal{A}, \mathcal{B}, \mathcal{D}, \mathcal{E}, \mathcal{H}, \mathcal{I}$ for $\mathcal{R}$ being each of the Riemann surfaces in figure 1. In practice, we only need to consider the cases of $\mathcal{R}(L, \phi)$ and $\mathcal{R}(L, q)$, and the other cases can be obtained from them by simple substitutions and/or limits. For the case $\mathcal{R}(L, \phi)$ one can find the results in [10]. For the case $\mathcal{R}(L, q)$ one can find the results up to level 6 in [17]. Using the method in appendix B of [17], the conformal transformations of $\mathcal{E}$, $\mathcal{H}$, $\mathcal{I}$ in [10], as well as the structure constants in [17] and (2.1), we get the remaining one-point functions on $\mathcal{R}(L, q)$.

Partition function from twist operators
Gluing $n$ reduced density matrices $\rho_{A,j}$ on $n$ different Riemann surfaces $\mathcal{R}_j$ with $j = 0, 1, \cdots, n-1$, one gets a CFT on the Riemann surface $\mathcal{R}^n = \mathcal{R}_0 \oplus \cdots \oplus \mathcal{R}_{n-1}$. This suggests assuming that the partition function on $\mathcal{R}^n$ can still be written as a two-point function of twist operators, $\mathrm{tr}_A(\rho_{A,0} \cdots \rho_{A,n-1}) = \langle \mathcal{T}(\ell) \tilde{\mathcal{T}}(0) \rangle_{\mathcal{R}_0 \oplus \cdots \oplus \mathcal{R}_{n-1}}$. (2.14) Each replica of the CFT lives on one of the Riemann surfaces, and different replicas are connected only by twist operators. For the $n = 2$ and $n = 3$ cases see, for example, [11,21,22,27,28,29,30,31,32,33], but we are not sure whether the relation is applicable for general $n$ when the $Z_n$ replica symmetry is lost. Actually, in this paper we only use the relaxed relation (2.15), in which the $Z_n$ replica symmetry is recovered after permutations. Thus when we write (2.14), we actually mean (2.15), with the caveat that (2.15) is essentially an assumption for which we have no concrete proof.
For two different Riemann surfaces $\mathcal{R}$ and $\mathcal{S}$, we may define respectively two reduced density matrices $\rho_A$ and $\sigma_A$. In this paper, we need to calculate the partition function $\mathrm{tr}_A(\rho_A^m \sigma_A^{n-m})$ with $n$ being an integer and $m = 0, 1, \cdots, n$. Using (2.15), we see that it is just the right-hand side of (2.12) with substitutions for the products of one-point functions, where $\mathcal{X}$, $\mathcal{Y}$ denote general quasiprimary operators. A general substitution involves the binomial coefficients $C_n^k$ and $C_m^k$, and on its right-hand side we have omitted various terms with some $\mathcal{R}$'s being replaced by $\mathcal{S}$'s.
In section 3.2, we need to calculate the partition function (2.19), with the traces of mixed products of $\rho_A$ and $\sigma_A$ being understood as the left-hand side of (2.15). Using the binomial summation formulas, we get that (2.19) is just the right-hand side of (2.12) with the substitutions (2.21). In section 3.3, we need to calculate $\mathrm{tr}_A(\rho_A - \sigma_A)^n$. Using the fact that $\sum_{m=0}^{n} C_n^m (-)^{n-m} m^k = 0$ for $k = 0, 1, \cdots, n-1$, we get (2.24). Note that the summation over $\{\mathcal{X}_1, \mathcal{X}_2, \cdots, \mathcal{X}_n\}$ is over different sets of nonidentity quasiprimary operators, and the order of the operators in each set does not matter. For $n = 2$ it is just the result in [11].
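Note that for general $n$, $b_{\mathcal{X}_1 \mathcal{X}_2 \cdots \mathcal{X}_n}$ is complicated and has no universal form; it is related to the $n$-point correlation function on the complex plane, $\langle \mathcal{X}_1(z_1) \mathcal{X}_2(z_2) \cdots \mathcal{X}_n(z_n) \rangle_{\mathcal{C}}$.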
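The alternating binomial identity quoted above is easy to check with exact integer arithmetic. The following is a minimal Python sketch; the function name `alternating_binomial_sum` is hypothetical and introduced only for illustration. It also checks the standard fact, not stated in the text, that the first nonvanishing case $k = n$ gives $n!$.

```python
from math import comb, factorial


def alternating_binomial_sum(n: int, k: int) -> int:
    """Return sum_{m=0}^{n} C(n, m) * (-1)^(n-m) * m^k using exact integers."""
    # Python evaluates 0**0 as 1, which is the convention needed for m = 0, k = 0.
    return sum(comb(n, m) * (-1) ** (n - m) * m ** k for m in range(n + 1))


for n in range(1, 7):
    # The sum vanishes for every k = 0, 1, ..., n-1 ...
    assert all(alternating_binomial_sum(n, k) == 0 for k in range(n))
    # ... and the first nonvanishing case k = n gives n!.
    assert alternating_binomial_sum(n, n) == factorial(n)
```

The vanishing for $k < n$ is what removes all terms with fewer than $n$ nonidentity operators from an alternating sum over $m$.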

The n → 1 limit
If we are only interested in the $n \to 1$ limit instead of the general $n$ result, there is a simpler calculation [34,35]. For each CFT$^n$ operator $\Phi_K$, we may define a coefficient $a_K$ from the $n \to 1$ limit of $b_K$, with $b_K$ being defined in (2.10). Using the results for $b_K$ in [10,17], we get the relevant results for $a_K$. For the reduced density matrix $\rho_A$ on the Riemann surface $\mathcal{R}$, we get the expansion (2.28). For the reduced density matrices $\rho_A$ and $\sigma_A$, defined respectively on the Riemann surfaces $\mathcal{R}$ and $\mathcal{S}$, we get a mixed quantity that equals the right-hand side of (2.28) with appropriate substitutions. Similarly, we get a further quantity that equals the right-hand side of (2.28) with the substitutions (2.21).

Rényi and entanglement entropies on various Riemann surfaces
Using the above prescriptions, we can evaluate the entanglement and Rényi entropies on various Riemann surfaces, some of which have been obtained before. The results will then serve in the next section for calculating the dissimilarity measures between reduced density matrices.
For a reduced density matrix $\rho_A$, the Rényi entropy is defined as $S_n = -\frac{1}{n-1} \log \mathrm{tr}_A \rho_A^n$, and taking the $n \to 1$ limit one gets the entanglement entropy $S = -\mathrm{tr}_A(\rho_A \log \rho_A)$. The Rényi entropy can be calculated from (2.12), and the entanglement entropy can be calculated from the $n \to 1$ limit of the Rényi entropy or directly from (2.28).
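The $n \to 1$ limit can be illustrated numerically on a finite spectrum. The sketch below assumes the standard definitions $S_n = -\frac{1}{n-1}\log \mathrm{tr}\,\rho^n$ and $S = -\mathrm{tr}(\rho\log\rho)$, and represents a density matrix by a list of its eigenvalues; the function names and the sample spectrum are illustrative choices, not taken from the paper.

```python
from math import log


def renyi_entropy(spectrum, n):
    """S_n = -log(tr rho^n) / (n - 1), with rho given by its eigenvalues."""
    return -log(sum(x ** n for x in spectrum if x > 0)) / (n - 1)


def entanglement_entropy(spectrum):
    """S = -tr(rho log rho), with rho given by its eigenvalues."""
    return -sum(x * log(x) for x in spectrum if x > 0)


# Illustrative spectrum of a three-level reduced density matrix.
spectrum = [0.6, 0.3, 0.1]

# Approaching n = 1 from above reproduces the entanglement entropy.
assert abs(renyi_entropy(spectrum, 1 + 1e-6) - entanglement_entropy(spectrum)) < 1e-5
# The Renyi entropy does not increase with n, so S_2 <= S.
assert renyi_entropy(spectrum, 2) <= entanglement_entropy(spectrum)
```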
We calculate the Rényi entropies and entanglement entropies for the seven Riemann surfaces in figure 1, as summarized in figure 2. The Rényi entropy $S_n(L, \phi)$ and entanglement entropy $S(L, \phi)$ have been calculated in [10], and we will not repeat the results here. Since now at level 9 we have (2.11), the unknown terms $O(\ell^9)$ in the results of [10] are actually of order $O(\ell^{10})$.
Figure 2: The seven Rényi entropies we can calculate using OPE of the twist operators. In practice, we only need to calculate $S_n(L, \phi)$ and $S_n(L, q)$, as marked in blue, and the other cases can be obtained easily from them.
For the reduced density matrix $\rho_A(\beta, \phi)$, we obtain the corresponding Rényi entropy and entanglement entropy. For $\rho_A(L, q)$, the Rényi entropy and entanglement entropy have been calculated using OPE of the twist operators to order $\ell^7$ in [17], and here we calculate the results to order $\ell^9$. In the large $c$ limit we organize the Rényi entropy into the leading part, the next-to-leading part, the next-to-next-to-leading part, etc., and to order $\ell^9$ only the first three parts are nonvanishing. The leading and next-to-leading parts match the results in [39][40][41], which were calculated by another method. The order $\ell^8$ term of the next-to-next-to-leading part is a new result. Taking the $n \to 1$ limit we get the entanglement entropy. The Rényi entropy and entanglement entropy for $\rho_A(\beta, p)$ are just the modular transformation of those for $\rho_A(L, q)$. Without considering the subtlety of boundary conditions at the entangling surface [42,43], the Rényi entropy and entanglement entropy for $\rho_A(\emptyset)$, $\rho_A(L)$ and $\rho_A(\beta)$ are of universal forms and depend only on the central charge [12]; for example, $S_n(\emptyset) = \frac{c(n+1)}{12n} \log\frac{\ell}{\epsilon}$, $S(\emptyset) = \frac{c}{6} \log\frac{\ell}{\epsilon}$, (2.41) with $\epsilon$ the UV cutoff, and similarly for $\rho_A(L)$ and $\rho_A(\beta)$. To order $\ell^9$ the above results can be obtained easily as limits and/or substitutions of $S_n(L, q)$, $S(L, q)$.
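A short-interval expansion of a universal result can be sketched numerically. The example below assumes the standard holomorphic-sector form for an interval on a circle, $S_n(L) = \frac{c(n+1)}{12n}\log\big(\frac{L}{\pi\epsilon}\sin\frac{\pi\ell}{L}\big)$, which is an assumption here because the explicit expression is not reproduced in this text. The UV cutoff cancels in the difference from the infinite-line result, leaving $\frac{c(n+1)}{12n}\log\frac{\sin x}{x}$ with $x = \pi\ell/L$. The function names are hypothetical.

```python
from math import log, pi, sin


def renyi_difference_circle_vs_line(c, n, ell, L):
    """S_n(L) - S_n(line) for the assumed universal forms; the cutoff cancels."""
    x = pi * ell / L
    return c * (n + 1) / (12 * n) * log(sin(x) / x)


def short_interval_series(c, n, ell, L, order=8):
    """Truncated expansion of the same difference in powers of x = pi*ell/L."""
    x = pi * ell / L
    # log(sin x / x) = -x^2/6 - x^4/180 - x^6/2835 - x^8/37800 - ...
    coefficients = {2: 1 / 6, 4: 1 / 180, 6: 1 / 2835, 8: 1 / 37800}
    series = -sum(a * x ** k for k, a in coefficients.items() if k <= order)
    return c * (n + 1) / (12 * n) * series


exact = renyi_difference_circle_vs_line(c=24, n=2, ell=0.1, L=1.0)
# Keeping terms through x^8 reproduces the closed form to high accuracy.
assert abs(exact - short_interval_series(24, 2, 0.1, 1.0, order=8)) < 1e-9
# Truncating at x^2 is visibly less accurate for the same interval.
assert abs(exact - short_interval_series(24, 2, 0.1, 1.0, order=2)) > 1e-5
```

The same pattern, a closed form compared against its truncated series in $\ell/L$, is a convenient sanity check for any of the short-interval results whenever a closed form is available.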

Dissimilarities of reduced density matrices
In this section we evaluate various dissimilarity measures between reduced density matrices, which include relative entropy, Jensen-Shannon divergence, Schatten 2-norm and 4-norm. Some lengthy and not so enlightening results are collected in appendix C and the attached Mathematica notebook in arXiv.

Relative entropy
The relative entropy is also called the Kullback-Leibler divergence. For two reduced density matrices $\rho_A$ and $\sigma_A$, the relative entropy is defined as $S(\rho_A \| \sigma_A) = \mathrm{tr}_A(\rho_A \log \rho_A) - \mathrm{tr}_A(\rho_A \log \sigma_A)$. To calculate the relative entropy, one may first calculate the $n$-th relative entropy and then take the $n \to 1$ limit. The relative entropy is not symmetric in its two arguments, and one may define the symmetrized relative entropy. To calculate the symmetrized relative entropy, one can first calculate the $n$-th symmetrized relative entropy and then take the $n \to 1$ limit. It turns out that the 2nd symmetrized relative entropy is positive definite and can be used to characterize the dissimilarity of $\rho_A$ and $\sigma_A$. In fact, it is directly related to the overlap of the two reduced density matrices.
As shown in figure 3, we use OPE of twist operators as described in section 2 to calculate four relative entropies. For $\rho_A(L_1, \phi_1)$ and $\rho_A(L_2, \phi_2)$ we have the relative entropy (C.1). For the special case $L_1 = L_2$, (C.1) matches the result in [10]. For $\rho_A(L_1, \phi)$ and $\rho_A(L_2, q)$ we have the relative entropies (C.2) and (C.3). For $\rho_A(L_1, q_1)$ and $\rho_A(L_2, q_2)$ we have the relative entropy (C.4).
As shown in figure 4, we calculate three symmetrized relative entropies (C.5), (C.6), and (C.7) using OPE of twist operators. We also get the 2nd symmetrized relative entropies (C.8), (C.9), and (C.10).
Figure 4: The 27 symmetrized relative entropies we can calculate using OPE of the twist operators. We only need to calculate the three ones marked in blue. This figure also applies to the 2nd symmetrized relative entropy, Jensen-Shannon divergence, as well as the Schatten 2-norm and 4-norm in the following subsections.
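The replica-style route to the relative entropy can be sketched on a toy example. The code below restricts to commuting (diagonal) density matrices, so that traces reduce to sums over eigenvalues, and it assumes one common form of the $n$-th relative entropy, $S_n(\rho\|\sigma) = \frac{1}{n-1}\big[\log \mathrm{tr}\,\rho^n - \log \mathrm{tr}(\rho\,\sigma^{n-1})\big]$. Both the restriction and this particular form are assumptions made for illustration, and the function names are hypothetical.

```python
from math import log


def relative_entropy(p, q):
    """S(rho||sigma) = tr(rho log rho) - tr(rho log sigma) for diagonal rho, sigma."""
    return sum(a * (log(a) - log(b)) for a, b in zip(p, q) if a > 0)


def nth_relative_entropy(p, q, n):
    """Assumed replica form: [log tr rho^n - log tr(rho sigma^(n-1))] / (n - 1)."""
    tr_rho_n = sum(a ** n for a in p)
    tr_rho_sigma = sum(a * b ** (n - 1) for a, b in zip(p, q))
    return (log(tr_rho_n) - log(tr_rho_sigma)) / (n - 1)


# Illustrative eigenvalue lists of two commuting density matrices.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# The n -> 1 limit of the replica quantity reproduces the relative entropy.
assert abs(nth_relative_entropy(p, q, 1 + 1e-6) - relative_entropy(p, q)) < 1e-5
# The relative entropy is nonnegative but not symmetric in its arguments.
assert relative_entropy(p, q) > 0 and relative_entropy(q, p) > 0
assert abs(relative_entropy(p, q) - relative_entropy(q, p)) > 1e-4
```

For non-commuting matrices the same limit holds, but the traces must be evaluated with the matrices themselves rather than with their eigenvalues.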

Jensen-Shannon divergence
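The Jensen-Shannon divergence of two reduced density matrices $\rho_A$ and $\sigma_A$ is defined as $JS(\rho_A, \sigma_A) = S\big(\frac{\rho_A + \sigma_A}{2}\big) - \frac{1}{2} S(\rho_A) - \frac{1}{2} S(\sigma_A)$, with $S$ being the von Neumann entropy. One can also define the Jensen-Shannon distance. To calculate the Jensen-Shannon divergence, we first calculate the Jensen-Rényi divergence, which is defined in the same way with $S_n$ being the Rényi entropies, and then take the $n \to 1$ limit. We then get (3.14). Explicitly, we obtain (C.11), (C.12), and (C.13).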
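Basic properties of the Jensen-Shannon divergence can be checked on commuting density matrices. The sketch below assumes the standard definition $JS(\rho,\sigma) = S\big(\frac{\rho+\sigma}{2}\big) - \frac{1}{2}S(\rho) - \frac{1}{2}S(\sigma)$ and represents each state by its list of eigenvalues in a common eigenbasis; the function names are hypothetical.

```python
from math import log


def entropy(p):
    """Von Neumann entropy of a density matrix given by its eigenvalues."""
    return -sum(x * log(x) for x in p if x > 0)


def jensen_shannon(p, q):
    """JS divergence for two density matrices diagonal in the same basis."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - (entropy(p) + entropy(q)) / 2


p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

# Symmetric, vanishing for identical states, and bounded by log 2.
assert abs(jensen_shannon(p, q) - jensen_shannon(q, p)) < 1e-12
assert abs(jensen_shannon(p, p)) < 1e-12
assert 0 < jensen_shannon(p, q) <= log(2)
# Orthogonal pure states saturate the upper bound log 2.
assert abs(jensen_shannon([1.0, 0.0], [0.0, 1.0]) - log(2)) < 1e-12
```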

Schatten 2-norm and 4-norm
For a general matrix $\rho$, the Schatten $n$-norm is defined as $\|\rho\|_n = (\mathrm{tr}\,|\rho|^n)^{1/n}$, with $|\rho| = \sqrt{\rho^\dagger \rho}$. For $n = 1$ it is just the trace norm, and for $n = 2$ it is just the Hilbert-Schmidt norm. For two reduced density matrices $\rho_A$, $\sigma_A$, we calculate the Schatten $n$-norm of their difference $\rho_A - \sigma_A$. For $n = 1$ it is just the trace distance, and for $n = 2$ it is just the trace square. Since the reduced density matrices are Hermitian, when $n = 2p$ is an even integer we have the simpler expression $\|\rho_A - \sigma_A\|_{2p}^{2p} = \mathrm{tr}_A(\rho_A - \sigma_A)^{2p}$. When there is no ambiguity, we also call $\|\rho_A - \sigma_A\|_{2p}^{2p}$ the Schatten $2p$-norm. We use (2.24) and get the Schatten 2-norms (C.14), (C.15), (C.16) and Schatten 4-norms (C.17), (C.18), (C.19).
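The simplification for even $n$ can be illustrated with real symmetric $2 \times 2$ matrices. The sketch below compares $\mathrm{tr}(\rho - \sigma)^{2p}$ with $\sum_i |\lambda_i|^{2p}$ computed from the eigenvalues $\lambda_i$ of $\rho - \sigma$; the matrices and function names are illustrative assumptions, not objects from the paper.

```python
from math import sqrt


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace_of_power(rho, sigma, power):
    """tr (rho - sigma)^power, computed by repeated matrix multiplication."""
    delta = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(power):
        result = matmul(result, delta)
    return result[0][0] + result[1][1]


def schatten_power_via_eigenvalues(rho, sigma, n):
    """sum_i |lambda_i|^n for the real symmetric 2x2 matrix rho - sigma."""
    a = rho[0][0] - sigma[0][0]
    b = rho[0][1] - sigma[0][1]
    d = rho[1][1] - sigma[1][1]
    mean, radius = (a + d) / 2, sqrt(((a - d) / 2) ** 2 + b ** 2)
    return abs(mean + radius) ** n + abs(mean - radius) ** n


rho = [[0.7, 0.2], [0.2, 0.3]]    # illustrative density matrix
sigma = [[0.5, 0.0], [0.0, 0.5]]  # maximally mixed state

# For even n = 2p the absolute value is redundant and a plain trace suffices.
for p in (1, 2):
    assert abs(trace_of_power(rho, sigma, 2 * p)
               - schatten_power_via_eigenvalues(rho, sigma, 2 * p)) < 1e-12
# For n = 1 the plain trace vanishes while the trace norm does not.
assert abs(trace_of_power(rho, sigma, 1)) < 1e-12
assert schatten_power_via_eigenvalues(rho, sigma, 1) > 0.5
```

The last two assertions show why odd $n$, and in particular the trace distance, is not accessible through traces of integer powers alone.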

ETH for canonical ensemble thermal state
Whether ETH is satisfied or not depends on how it is precisely defined, and for different quantities there may be different criteria. The local ETH is defined in terms of local operators [1,2]. More precisely, it constrains the matrix elements of the operator $A$ in the basis of energy eigenstates $\{|\phi_a\rangle\}$: the expectation value of $A$ in a single eigenstate equals that in a coherent state in a narrow energy window around this single eigenstate, and it also equals the microcanonical ensemble average of $A$ in this narrow energy window, up to exponential suppression in the entropy $S(E)$. A generalization of local ETH is the subsystem ETH, which is defined in terms of reduced density matrices [3,4]. It states that in the excited state $|\phi\rangle$ of energy $E$ the reduced density matrix $\rho_{A,\phi}$ of a small region $A$ is close to some universal density matrix $\rho_{A,E}$ in trace distance. In this paper we do not check directly the local ETH or subsystem ETH. Instead we compare the reduced density matrix of the excited energy eigenstate with the reduced density matrices of some explicit thermal states. In this section we consider the canonical ensemble thermal state, in section 5 the GGE thermal state, and in section 6 the microcanonical ensemble thermal state. We use several quantities to characterize the difference between the reduced density matrices of the excited and thermal states. To claim whether ETH is satisfied or not, we need to set up a criterion for each quantity, which is beyond the scope of the present paper. Our results can be viewed as a first step towards such criteria. However, based on the observations in [3][4][5][6][7][8][44][45][46], we can make some claims for the Rényi entropy and entanglement entropy, as we will discuss at the end of this section.
As a first step towards defining and checking ETH for the canonical ensemble thermal state, we calculate various quantities to characterize the dissimilarity of the reduced density matrix $\rho_A(L, \phi)$ for the excited state and $\rho_A(\beta)$ for the thermal state. Note that ETH compares a highly excited state with a high temperature state, so we use $\rho_A(\beta)$ to approximate $\rho_A(\beta, p)$. The excited state $|\phi\rangle$ is heavy and we write the conformal weight as $h_\phi = c\,\epsilon_\phi$, and by requiring that the two states have the same energy we get the identification of the inverse temperature $\beta$ [5,6]. We have the difference of the Rényi entropies (4.6), which has been calculated in [3,9,10]. The difference of the entanglement entropies is (4.7), and it has been calculated in [10]. We have the relative entropies (4.8), and the first one has been calculated in [10] by a different method. Note that $S(\rho_A(L, \phi) \| \rho_A(\beta))$ and $S(\rho_A(\beta) \| \rho_A(L, \phi))$ happen to be the same at order $\ell^8$, and we expect them to differ at higher orders. We also have the symmetrized relative entropy and the 2nd symmetrized relative entropy. The Jensen-Rényi divergence and Jensen-Shannon divergence are respectively given in (4.10). We also have the Schatten 2-norm (4.11). As we have said at the beginning of this section, with the above results we cannot claim whether ETH is satisfied for an individual quantity without a precise criterion of ETH. As stated in [3,4], in a CFT not every quantity is good for defining ETH. For the Rényi entropies of the excited and thermal states to be equal, it is necessary that the subsystem is much smaller than the whole system, $\ell/L \to 0$ [44,45].
If one defines ETH for the canonical ensemble as $S_n(L, \phi) - S_n(\beta) \to 0$ when $\ell/L \to 0$, then from (4.6) one concludes that such an ETH is satisfied. However, this criterion seems too strong to yield useful results for general cases. Instead, we can think of the Rényi entropy as a more refined quantity than the entanglement entropy for characterizing the violation of local thermality of an energy eigenstate.
Similarly, the Jensen-Shannon divergence is a better quantity for defining ETH than the Jensen-Rényi divergence, since the former is always nonnegative due to the concavity of the von Neumann entropy while the latter is not. This can be seen in equations (4.10). Note that at order $\ell^8$, the Jensen-Rényi divergence is of order $c^2$ and the Jensen-Shannon divergence is of order $c^0$. This is reminiscent of the fact that the Rényi entropy difference is of order $c$ and the entanglement entropy difference is of order $c^0$. This is another indication that the Jensen-Rényi divergence, like the Rényi entropy, is not a good quantity for defining ETH. The Rényi entropy is just a higher genus free energy, which is consistent with its being of order $c$. However, since the Jensen-Rényi divergence is not a free energy or a sum of free energies, it need not be of order $c$ or subleading to order $c$.
The Schatten 2-norm (4.11), or equivalently the trace square distance, depends on the UV regulator and vanishes as $\epsilon/\ell \to 0$. It is not a good quantity for defining ETH, either.
For a large $c$ CFT, it was found in [7,8] that the leading order $c$ entanglement entropies of the excited and canonical ensemble thermal states are the same as long as $0 < \ell/L < 1/2$. If ETH for the entanglement entropy is defined in this way with $0 < \ell/L < 1/2$, the result (4.7) clearly shows the violation of ETH at the next-to-leading order of large $c$ [10].
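The identification between the inverse temperature and the conformal weight used in this section can be sketched as follows. The code assumes the standard cylinder one-point functions of the stress tensor, $\langle T \rangle_{\mathcal{R}(L,\phi)} = \frac{\pi^2 (c - 24 h_\phi)}{6 L^2}$ and $\langle T \rangle_{\mathcal{R}(\beta)} = -\frac{\pi^2 c}{6 \beta^2}$, in one common sign convention. These expressions and the function names are assumptions for illustration, since the explicit formulas are not reproduced in this text. Equating the two gives $\beta = L/\sqrt{24 h_\phi/c - 1}$.

```python
from math import pi, sqrt


def stress_tensor_excited(c, h, L):
    """<T> on the vertical cylinder with a primary of weight h (assumed form)."""
    return pi ** 2 * (c - 24 * h) / (6 * L ** 2)


def stress_tensor_thermal(c, beta):
    """<T> on the horizontal cylinder at inverse temperature beta (assumed form)."""
    return -pi ** 2 * c / (6 * beta ** 2)


def matched_inverse_temperature(c, h, L):
    """Solve <T>_excited = <T>_thermal for beta; requires h > c/24."""
    ratio = 24 * h / c - 1
    if ratio <= 0:
        raise ValueError("the state must be heavy enough: h > c/24")
    return L / sqrt(ratio)


c, h, L = 100.0, 50.0, 2 * pi
beta = matched_inverse_temperature(c, h, L)
# With this beta the two one-point functions of T agree.
assert abs(stress_tensor_excited(c, h, L) - stress_tensor_thermal(c, beta)) < 1e-9
```

Only the stress tensor is matched by this choice of $\beta$; higher quasiprimaries such as the level 4 operator generally differ between the two states, which is the origin of the dissimilarities discussed above.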

ETH for GGE thermal state
All the dissimilarities between the excited and thermal states in the previous section originate from the fact that the level 4 operator $\mathcal{A}$ has different expectation values [9,10], $\langle \mathcal{A} \rangle_{\mathcal{R}(L,\phi)} \neq \langle \mathcal{A} \rangle_{\mathcal{R}(\beta)}$. (5.1)
A more refined consideration is that one should not compare the excited state with the canonical ensemble thermal state, but should instead consider the generalized Gibbs ensemble (GGE) thermal state [20][21][22][23][24][25]. The GGE state has a density matrix of generalized Gibbs form, with $J_i$ being some conserved charges and $\beta_i$ being the corresponding chemical potentials. By requiring that the ETH comparison is done within the same macroscopic super-selection sector, we should impose that the conserved charges have the same expectation values in the two states, so that one gets the relation of $h_\phi$ with the GGE parameters $\beta$, $\mu_i$. In the vacuum conformal family, there are an infinite number of commuting conserved charges $I_{2k+1}$ with $k = 0, 1, \cdots$ [47,48]. We may choose the GGE state built from these charges. We then require equal expectation values in the excited state and the GGE state for all vacuum conformal family quasiprimary operators $\mathcal{X}$. Since there are more equations than unknown chemical potentials, we do not know if there is a unique solution for all $\beta$, $\beta_{2k+1}$, $k = 1, 2, \cdots$. If this is the case, all the dissimilarities considered in this paper vanish, so that there is no difference between the reduced density matrices of the excited state and the GGE thermal state.
Furthermore, in GGE it is not necessary that all the conserved charges commute with each other [22]. For each nonidentity quasiprimary operator in the vacuum conformal family, say $\mathcal{X}$, we may define a conserved charge, and then define the GGE state with these charges. To be more concrete, we consider a toy model of GGE, in which the expectation value of an arbitrary operator $\mathcal{X}$ is given by (5.10).
We get the expectation value in the GGE as an expansion in the small chemical potential $\mu$. The correlation functions on the cylinder $\mathcal{R}(\beta)$ can be calculated by mapping the cylinder to the complex plane with the conformal transformation $z = e^{2\pi w/\beta}$. Note that the above expectation value should be independent of the position $w_0$. Using the integral formula with $S = 4$ and $S = 8$, we finally get (5.13). In the excited state $|\phi\rangle$ of a holomorphic primary operator $\phi$ with conformal weight $h_\phi = c\,\epsilon_\phi$, there are the expectation values [9,10] (5.14). To consider the ETH comparison within the same super-selection sector, we equate (5.13) and (5.14), which gives (5.15), and solve for the inverse temperature $\beta$ and chemical potential $\mu$ in terms of $\epsilon_\phi$, $c$, $L$. Since ETH for the canonical ensemble is known to work well at the leading order of the large $c$ limit [7,8], the solution should reduce to the canonical one (5.16) at leading order. On the other hand, since the finite $c$ correction causes a mismatch between the excited state and the canonical thermal state that is only power-suppressed in $1/c$ [10], we need to find the solution of (5.15) for the GGE with power corrections in $1/c$ to (5.16) as follows. To make the $1/c$ expansions in (5.13) well-defined, we need the leading order $\mu \sim 1/c^\alpha$ with $\alpha > 1$. Since there is no subleading term in $\langle T \rangle_\phi$, we need the leading order correction to $\beta$ to be of order $1/c^{\alpha-1}$. We then make the ansatz (5.17) for the solution to equations (5.15), with the constants $\alpha$, $a$, $b$ to be determined. It is easy to see that $\langle \mathcal{A} \rangle_{\rm GGE} = \langle \mathcal{A} \rangle_\phi$ cannot be satisfied for $\alpha \geq 2$. Thus, we have $1 < \alpha < 2$ in the ansatz (5.17). However, the coefficient $b$ in the ansatz (5.17) cannot be determined at the present expansion order of (5.13); it might be determined uniquely at higher expansion orders.

ETH for microcanonical ensemble thermal state
The local ETH [1,2] and its corollaries such as the subsystem ETH [3,4] were originally formulated by comparing the energy eigenstate with the microcanonical (ensemble) thermal state. Although the difference between canonical and microcanonical thermal states is power-law suppressed in the limit of a large number of degrees of freedom, it is still interesting to check ETH directly for the microcanonical thermal state. In this section we will do this using OPE of twist operators as described in section 2.
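The microcanonical thermal state to be considered is the equal-weight sum of the pure states $|\phi_i\rangle$ with $i = 1, 2, \cdots, \Omega$, i.e., its density matrix is given by $\rho_{\rm me} = \frac{1}{\Omega} \sum_{i=1}^{\Omega} |\phi_i\rangle\langle\phi_i|$, (6.1) where the $\phi_i$'s are nonidentity primary operators of conformal weights $(h_{\phi_i}, \bar h_{\phi_i})$. For the microcanonical thermal state, we should require the condition (6.2) on these conformal weights for all $i = 1, 2, \cdots, \Omega$, where $(h_\phi, \bar h_\phi)$ is the conformal weight of the excited state $\phi$ with which we will compare for checking ETH.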
For simplicity, we can choose an orthonormal set of $\phi_i$'s, i.e., $\langle \phi_i | \phi_j \rangle = \delta_{ij}$. (6.3) We also choose $\phi$ as one of the $\Omega$ operators $\phi_i$, i.e., $\phi \in \{\phi_i\}$.
Globally, the pure excited state density matrix $\rho_\phi = |\phi\rangle\langle\phi|$ and the microcanonical thermal state density matrix $\rho_{\rm me} = \frac{1}{\Omega} \sum_{i=1}^{\Omega} \rho_{\phi_i}$ are very different. This can be seen from various dissimilarity measures, starting from their von Neumann entropies, $S(\rho_\phi) = 0$, $S(\rho_{\rm me}) = \log\Omega$, (6.4) then the relative entropy $S(\rho_\phi \| \rho_{\rm me}) = \log\Omega$, (6.5) and the Jensen-Shannon divergence $JS(\rho_\phi, \rho_{\rm me}) = \log 2 + \frac{1}{2}\log\Omega - \frac{\Omega+1}{2\Omega}\log(\Omega+1)$. (6.6) Instead, ETH should be explored with local observables. If ETH holds, for an arbitrary local observable $\mathcal{X}$ the expectation values in the two states should agree, which is the condition (6.7). If $\mathcal{X}$ is an operator in the vacuum conformal family, it is easy to see that (6.7) holds by the fact (6.2). On the other hand, if $\mathcal{X}$ is some nonidentity primary operator or one of its descendants, then ETH imposes the constraints (6.8) on the OPE coefficients $C_{\phi_i \phi_i \mathcal{X}}$. This implies that not every CFT satisfies ETH.
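The global results (6.4), (6.5) and (6.6) can be verified in a finite-dimensional toy model. The sketch below represents the pure state and the equal-weight mixture of $\Omega$ orthonormal states by their diagonal entries in the common basis $\{|\phi_i\rangle\}$, and compares the numerically evaluated Jensen-Shannon divergence with the closed form (6.6). The function names are hypothetical.

```python
from math import log


def entropy(p):
    """Von Neumann entropy of a density matrix given by its eigenvalues."""
    return -sum(x * log(x) for x in p if x > 0)


def global_measures(omega):
    """Return (S(rho_me), S(rho_phi||rho_me), JS(rho_phi, rho_me)) for Omega states."""
    rho_phi = [1.0] + [0.0] * (omega - 1)  # pure state, chosen as the first phi_i
    rho_me = [1.0 / omega] * omega         # equal-weight mixture of all phi_i
    relative = sum(a * (log(a) - log(b)) for a, b in zip(rho_phi, rho_me) if a > 0)
    mixture = [(a + b) / 2 for a, b in zip(rho_phi, rho_me)]
    js = entropy(mixture) - (entropy(rho_phi) + entropy(rho_me)) / 2
    return entropy(rho_me), relative, js


def js_closed_form(omega):
    """Right-hand side of (6.6)."""
    return log(2) + 0.5 * log(omega) - (omega + 1) / (2 * omega) * log(omega + 1)


for omega in (2, 5, 100):
    s_me, relative, js = global_measures(omega)
    assert abs(s_me - log(omega)) < 1e-9       # (6.4)
    assert abs(relative - log(omega)) < 1e-9   # (6.5)
    assert abs(js - js_closed_form(omega)) < 1e-9  # (6.6)
```

As $\Omega$ grows, the closed form approaches $\log 2$, the maximal value of the Jensen-Shannon divergence, which quantifies how different the two global states are.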
However, in a large $c$ CFT, it is often a good approximation to consider contributions only from the vacuum conformal family, and this is what we adopt in this paper. We now divide the circle of length $L$, on which the large $c$ CFT lives, into a small subsystem $A$ of length $\ell$ and its large complement $B$ of length $L - \ell$. We can define the reduced density matrices $\rho_{A,\phi}$ and $\rho_{B,\phi}$ for the excited state $\rho_\phi$, and $\rho_{A,\rm me}$ and $\rho_{B,\rm me}$ for the microcanonical thermal state $\rho_{\rm me}$. We then use OPE of twist operators to calculate dissimilarity measures for comparing $\rho_{A,\phi}$ with $\rho_{A,\rm me}$, and for comparing $\rho_{B,\phi}$ with $\rho_{B,\rm me}$. We only include contributions from the vacuum conformal family in the following calculation. For the small subsystem $A$, from (6.7) we get $\mathrm{tr}_A(\rho_{A,\phi}^m \rho_{A,\rm me}^{n-m}) \approx \mathrm{tr}_A \rho_{A,\phi}^n$, $m = 0, 1, \cdots, n$, (6.9) and we further get the entanglement entropy, relative entropy, and Jensen-Shannon divergence $S(\rho_{A,\phi}) \approx S(\rho_{A,\rm me})$, $S(\rho_{A,\phi} \| \rho_{A,\rm me}) \approx 0$, $JS(\rho_{A,\phi}, \rho_{A,\rm me}) \approx 0$. (6.10) For the large subsystem $B$, we use (6.3) and the result (6.11) of [52], and get $\mathrm{tr}_B(\rho_{B,\phi}^m \rho_{B,\rm me}^{n-m}) \approx \Omega^{m-n}\, \mathrm{tr}_A \rho_{A,\phi}^n$, $m = 1, 2, \cdots, n$. (6.12) Then we get the corresponding dissimilarity measures for the large subsystem, which show that $\rho_{B,\phi}$ and $\rho_{B,\rm me}$ are not close. The above result agrees with the expectation from ETH, which states that the energy eigenstate approximates the microcanonical thermal state only for a small enough subsystem but not for a large one. This is also verified by the numerical simulations for lattice models done in [44], as long as the size of the subsystem is smaller than half of the total system size. When the size of the subsystem $A$ becomes as large as half the total system size, the trace square distance starts to deviate from zero, and the behavior indicates that one may be able to extract some critical exponents from the behavior around $\ell = L/2$.

Conclusion and discussion
We have used the OPE of the twist operators to calculate various quantities that characterize the dissimilarity of two reduced density matrices. These quantities include the Rényi entropy, entanglement entropy, relative entropy, Jensen-Shannon divergence, as well as the Schatten 2-norm and 4-norm. We first consider contributions from only the holomorphic sector of the vacuum conformal family, and expand all the quantities in the length $\ell$ of the short interval up to order $\ell^9$. As an application of the results, for ETH we show how these dissimilarity measures behave for the excited and thermal states in the high temperature limit. As we have shown in this paper, all these quantities can capture the dissimilarity of the two reduced density matrices. Furthermore, we discuss the possibility of defining ETH with a GGE thermal state. By using GGE, we provide a possible scenario to define ETH and resolve the mismatch between ETH and highly excited states in large $c$ CFT. We also discuss ETH for the microcanonical ensemble thermal state. In the appendix we give the formal forms of the entanglement entropy, relative entropy, Jensen-Shannon divergence, and Fisher information metric with contributions from general operators.
In the method of twist operators we cannot calculate the trace distance, which is essential for the definition of subsystem ETH [3,4]. The trace distance is just the Schatten $n$-norm with $n = 1$, and the absolute value in the definition makes it hard to evaluate when $n$ is an odd integer. It would be nice if the trace distance could be calculated in CFT.
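The role of the absolute value can be made concrete with finite-dimensional matrices. The sketch below assumes the unnormalized definition $\|\rho-\sigma\|_n = (\mathrm{tr}\,|\rho-\sigma|^n)^{1/n}$, so the paper's normalization may differ by constant factors. The helper names are illustrative. For even $n$ the norm follows from $\mathrm{tr}\,(\rho-\sigma)^n$, a plain trace of matrix products of the kind that appears in (6.9), whereas for $n=1$ one needs the eigenvalues of $\rho-\sigma$ themselves.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (toy input, not a CFT state)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def schatten_norm(rho, sigma, n):
    """(sum_i |lambda_i|^n)^(1/n), with lambda_i the eigenvalues of rho - sigma."""
    lam = np.linalg.eigvalsh(rho - sigma)
    return float(np.sum(np.abs(lam) ** n) ** (1.0 / n))

def schatten_norm_even(rho, sigma, n):
    """For even n, tr (rho - sigma)^n suffices and no absolute value is needed."""
    if n % 2 != 0:
        raise ValueError("this shortcut only works for even n")
    d = rho - sigma
    trace_power = np.trace(np.linalg.matrix_power(d, n)).real
    return float(trace_power ** (1.0 / n))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = random_density_matrix(6, rng)
    sigma = random_density_matrix(6, rng)
    for n in (2, 4):
        print(n, schatten_norm(rho, sigma, n), schatten_norm_even(rho, sigma, n))
    print("n = 1 (trace distance, unnormalized):", schatten_norm(rho, sigma, 1))
```

The even-$n$ shortcut agrees with the eigenvalue definition, while for odd $n$ the quantity $\mathrm{tr}\,(\rho-\sigma)^n$ generally differs from $\mathrm{tr}\,|\rho-\sigma|^n$ because $\rho-\sigma$ has eigenvalues of both signs.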

Acknowledgement
We would like to thank Alexandre Belin, Xi Dong, Thomas Faulkner, Nabil Iqbal, Zuhair U. Khandker, Guojing Liu, Gábor Sárosi and Huajia Wang for helpful discussions. We thank the anonymous JHEP referee of our previous paper [10] for discussions about higher order conserved charges the generalized

A Relative entropy from modular Hamiltonian
We calculate the relative entropies using the modular Hamiltonian as shown in figure 5; some of them have been calculated by the same method in [61,62]. This appendix serves as a check of the relative entropies obtained from twist operators in section 3.1.

Figure 5: The 20 relative entropies we can calculate using modular Hamiltonian and entanglement entropy. We only need to calculate the two relative entropies $S(\rho_A(L_1,\phi)\|\rho_A(L_2))$ and $S(\rho_A(L_1,q)\|\rho_A(L_2))$ as marked in blue.
For a reduced density matrix $\rho_A$, the modular Hamiltonian $H(\rho_A)$ is defined as

$$H(\rho_A) = -\log\rho_A.$$

For two reduced density matrices $\rho_A$, $\sigma_A$, the relative entropy can be written as

$$S(\rho_A\|\sigma_A) = \mathrm{tr}_A\big(\rho_A H(\sigma_A)\big) - S(\rho_A),$$

with $H(\sigma_A)$ being the modular Hamiltonian of $\sigma_A$. The modular Hamiltonian is known only for the cases of $\rho_A(\emptyset)$, $\rho_A(L)$ and $\rho_A(\beta)$, with the explicit forms given in [43,63,64].$^5$ We have only incorporated contributions from the holomorphic sector. As shown in figure 5, we use the entanglement entropy and modular Hamiltonian to calculate the relative entropies $S(\rho_A(L_1,\phi)\|\rho_A(L_2))$ and $S(\rho_A(L_1,q)\|\rho_A(L_2))$. The results are in accord with (C.1) and (C.2), and with (C.3) and (C.4), respectively.
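The rewriting of the relative entropy in terms of the modular Hamiltonian can be checked on finite-dimensional matrices. The sketch below assumes the standard definitions $H(\sigma) = -\log\sigma$ and $S(\rho\|\sigma) = \mathrm{tr}(\rho\log\rho) - \mathrm{tr}(\rho\log\sigma)$, and uses random full-rank density matrices as stand-ins for the CFT reduced density matrices. All function names are illustrative.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (toy stand-in for rho_A or sigma_A)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def matrix_log(rho):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T

def modular_hamiltonian(rho):
    """H(rho) = -log(rho)."""
    return -matrix_log(rho)

def entropy(rho):
    """Von Neumann entropy of a full-rank density matrix."""
    w = np.linalg.eigvalsh(rho)
    return float(-np.sum(w * np.log(w)))

def relative_entropy_direct(rho, sigma):
    """tr(rho log rho) - tr(rho log sigma)."""
    return float(np.trace(rho @ (matrix_log(rho) - matrix_log(sigma))).real)

def relative_entropy_modular(rho, sigma):
    """tr(rho H(sigma)) - S(rho)."""
    return float(np.trace(rho @ modular_hamiltonian(sigma)).real - entropy(rho))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = random_density_matrix(4, rng)
    sigma = random_density_matrix(4, rng)
    print(relative_entropy_direct(rho, sigma), relative_entropy_modular(rho, sigma))
```

The two functions return the same number, which is why only the modular Hamiltonian of the second argument and the entanglement entropy of the first are needed.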

B Contributions from general operators
In the main text, we only consider the contributions from the holomorphic part of the vacuum conformal family to order $\ell^9$. In this appendix we consider the contributions from general holomorphic and antiholomorphic operators, and we get closed forms of the entanglement entropy, relative entropy, and Jensen-Shannon divergence.

$^5$ For the modular Hamiltonian of excited states, see [65,66].
For a short interval $A = [0,\ell]$ on a Riemann surface $\mathcal{R}$ that has translational symmetry, we have the reduced density matrix $\rho_A$ and get its short interval expansion, with the summation over $\{\mathcal{X}_1\cdots\mathcal{X}_k\}$ running over different sets of all the nonidentity holomorphic and antiholomorphic quasiprimary operators. For a quasiprimary operator $\mathcal{X}$, we use $\Delta_{\mathcal{X}}$ to denote its scaling dimension. From this expansion we get the entanglement entropy $S(\rho_A)$. For the same short interval $A = [0,\ell]$ on another Riemann surface $\mathcal{S}$ that also has translational symmetry, we have the reduced density matrix $\sigma_A$ and a similar expression for the entanglement entropy $S(\sigma_A)$, and hence the difference of the two entanglement entropies.

For $k$ quasiprimary operators $\mathcal{X}_1,\cdots,\mathcal{X}_k$ and two translation invariant Riemann surfaces $\mathcal{R}$, $\mathcal{S}$, we may define the functions $F_i(\mathcal{X}_1,\cdots,\mathcal{X}_k|\mathcal{R},\mathcal{S})$ with $0\le i\le k$, normalized as in (B.5). Then we get the relative entropy (B.6) and the symmetrized relative entropy (B.7). Using the summation

$$\sum_{m=0}^{n} C_n^m\, C_m^i\, C_{n-m}^{k-i} = 2^{n-k}\, C_n^k\, C_k^i, \tag{B.8}$$

and (B.5), we can carry out the sum over $m$ and get the Jensen-Shannon divergence (B.10).

With the above results, we can also calculate the short interval expansion of the Fisher information metric. We parameterize the states of the CFT by $\theta^\alpha$, and we have the density matrix $\rho(\theta)$, and formally the Riemann surface $\mathcal{R}(\theta)$. For the reduced density matrix $\rho_A(\theta)$, the Fisher information metric is related to the relative entropy and Jensen-Shannon divergence [67,68]. From (B.6) or (B.10) we get the short interval expansion of the Fisher information metric, with the definition (B.14). In principle, the Fisher information metric can be used to define the distance on the state space, i.e., all the thermal and quasi-primary states of 2D CFTs as considered in this paper. Though we do not know at present how to efficiently characterize the state space by this metric, we expect that it may help to visualize the ETH geometrically in future studies.
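The summation (B.8) can be verified by brute force. The sketch below assumes that $C_n^m$ denotes the binomial coefficient "$n$ choose $m$", a reading of the notation that the check itself supports. The function names are illustrative.

```python
from math import comb

def lhs(n, i, k):
    """Left-hand side of (B.8): sum over m of C(n,m) C(m,i) C(n-m,k-i)."""
    return sum(comb(n, m) * comb(m, i) * comb(n - m, k - i) for m in range(n + 1))

def rhs(n, i, k):
    """Right-hand side of (B.8): 2^(n-k) C(n,k) C(k,i)."""
    return 2 ** (n - k) * comb(n, k) * comb(k, i)

def check_identity(max_n):
    """Compare both sides for all 0 <= i <= k <= n <= max_n."""
    return all(
        lhs(n, i, k) == rhs(n, i, k)
        for n in range(max_n + 1)
        for k in range(n + 1)
        for i in range(k + 1)
    )

if __name__ == "__main__":
    print(check_identity(20))
```

The identity also has a simple counting interpretation: choose the $k$ marked replicas out of $n$, assign $i$ of them to the first group, and let each of the remaining $n-k$ replicas fall on either side of the split, which gives the factor $2^{n-k}$.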
In section 4, for the reduced density matrices $\rho_A(L,\phi)$ and $\rho_A(\beta)$ of the excited state and the canonical ensemble thermal state, we have calculated the relative entropy $S(\rho_A(L,\phi)\|\rho_A(\beta))$ (4.8) and the Jensen-Shannon divergence $JS(\rho_A(L,\phi),\rho_A(\beta))$ (4.10) with contributions from only the vacuum conformal family, and found that they are non-vanishing and positive at order $\ell^8$ and $c^0$. One question is whether they can be cancelled by the addition of suitable non-vacuum conformal families. We address this issue below.
A general fermionic operator $\psi$ need not be considered separately: without loss of generality, we consider a hermitian nonidentity bosonic primary operator $\mathcal{X}$ with normalization $\alpha_{\mathcal{X}}$, scaling dimension $\Delta_{\mathcal{X}} = h_{\mathcal{X}} + \bar h_{\mathcal{X}}$ and spin $s_{\mathcal{X}} = h_{\mathcal{X}} - \bar h_{\mathcal{X}}$. Note that $s_{\mathcal{X}}$ is an integer, and so $i^{4 s_{\mathcal{X}}} = 1$. From (B.6), we get the leading correction of the conformal family $\mathcal{X}$ to the relative entropy $S(\rho_A(L,\phi)\|\rho_A(\beta))$ (4.8). Inserting the coefficient $a_{\mathcal{X}\mathcal{X}}$ from [60], we get (B.18).

Since $\mathcal{X}$ is hermitian, $\alpha_{\mathcal{X}}$ is real and positive, $\alpha_{\mathcal{X}} > 0$, and on a complex plane its hermitian conjugate follows the rule of [69,70]. By definition $\phi$ is also a hermitian primary operator, $[\phi(0)]^\dagger = \phi(\infty)$. From the three-point correlation function on the complex plane we get that $C_{\phi\mathcal{X}\phi}$ is real. When $C_{\phi\mathcal{X}\phi} = 0$, the conformal family $\mathcal{X}$ does not contribute to $S(\rho_A(L,\phi)\|\rho_A(\beta))$, and so we only need to consider the case that $C_{\phi\mathcal{X}\phi}$ is real and non-vanishing. In this case the correction (B.18) is real and positive. Similarly, from (B.10), we get that the leading correction of the conformal family $\mathcal{X}$ to the Jensen-Shannon divergence $JS(\rho_A(L,\phi),\rho_A(\beta))$ (4.10) is real and positive.

In summary, in a unitary CFT, the non-vanishing results of the relative entropy $S(\rho_A(L,\phi)\|\rho_A(\beta))$ (4.8) and Jensen-Shannon divergence $JS(\rho_A(L,\phi),\rho_A(\beta))$ (4.10) with contributions of only the vacuum conformal family cannot be cancelled by the addition of any non-vacuum conformal families.

C Collection of results in section 3
In this appendix we collect some lengthy equations from section 3. In these equations we omit some complicated parts and denote them by $\cdots$. The full forms can be found in the Mathematica notebook attached to the arXiv submission.