U(1) axial symmetry and Dirac spectra in QCD at high temperature

We derive some exact results concerning the anomalous U(1)$_A$ symmetry in the chirally symmetric phase of QCD at high temperature. We discuss the importance of topology and finite-volume effects on the U(1)$_A$ symmetry violation characterized by the difference of chiral susceptibilities. In particular, we present a reliable method to measure the anomaly strength in lattice simulations with fixed topology. We also derive new spectral sum rules and a novel Banks-Casher-type relation. Through our spectral analysis we arrive at a simple alternative proof of the Aoki-Fukaya-Taniguchi "theorem" on the effective restoration of the U(1)$_A$ symmetry at high temperature.


Introduction
The physics of the U(1) axial symmetry in quantum chromodynamics (QCD) has been a subject of intensive research over many years. While the QCD Lagrangian is invariant under U($N_f$)$_R$ × U($N_f$)$_L$ at the classical level, the U(1)$_A$ symmetry is broken by quantum effects [1] (see (2.1)). In fact, experimentally observed hadron spectra in the vacuum do not fully respect the U($N_f$)$_R$ × U($N_f$)$_L$ symmetry: the $\eta'$ meson associated with the U(1)$_A$ symmetry is much heavier than the other pseudoscalar mesons associated with chiral symmetry breaking. This U(1)$_A$ problem was resolved [2,3] through the discovery of nonperturbative topological excitations in Euclidean spacetime, called instantons.
Recently, much attention has been focused on another "U(1) A puzzle" in QCD at T > T c , where T c is the (pseudo)critical temperature of the chiral transition. Namely, even though the U(1) A anomaly relation (2.1) itself does not receive any modification at finite temperature [4], 1 the U(1) A symmetry could be effectively restored at the level of mesonic two-point (or higher-point) correlation functions [5]. This problem is of particular interest due to the role played by the axial anomaly in determining the order of chiral phase transition [6]. It was claimed by Cohen [7] that massless two-flavor QCD at T > T c should be effectively symmetric under U(1) A in the sense that two-point correlators in the π, σ, δ, and η channels become all degenerate. This work was soon followed by counterarguments [8,9]. Later it was recognized by Laine and Vepsäläinen [10] that the U(1) A violation in the flavor-singlet (axial) vector channel can be shown rigorously at least for high enough temperatures without any assumptions. This conclusion qualitatively agrees with [11] that shows U(1) A violation at high temperatures by calculating a nonzero splitting of scalar and pseudoscalar screening masses using a semiclassical dilute instanton gas picture [12]. More recently, Aoki et al. [13] claimed to have shown the effective restoration of the U(1) A symmetry rigorously under certain assumptions, although the validity of their assumptions appears to be nontrivial and subtle (see Sec. 5).
Alongside these theoretical studies, the U(1)$_A$ problem at finite temperature has also been studied intensively in first-principles lattice QCD simulations [14,15], but a consensus has not yet been reached: effective restoration of the U(1)$_A$ symmetry was reported in simulations with overlap fermions [16] and domain-wall fermions [17-20], whereas a violation of the U(1)$_A$ symmetry was reported in simulations using staggered fermions [21-24] and domain-wall fermions [25] (footnote 2). We warn that some of the simulations [17,21-25] were performed for 2 + 1 flavors; the effect of a heavier strange quark on the possible U(1)$_A$ violation in the light-quark sector is not completely clear yet.
In this paper, we do not try to solve this U(1)$_A$ puzzle at finite temperature. Rather, we derive some rigorous results involving the U(1)$_A$ symmetry in high-temperature QCD, assuming that the U(1)$_A$ symmetry is violated. (Since the U(1)$_A$ violation must be present at least for high enough temperatures [10,11], we consider its presence at all $T > T_c$ to be quite plausible.) We first derive general expressions for chiral susceptibilities and the topological susceptibility at $T > T_c$ using the method of [15,26]. They are used to highlight that, while the majority of the U(1)$_A$ violation at small volume comes from exact (topological) zero modes, the dominant contribution at large volume comes from nonzero modes. We estimate finite-volume effects suffered by lattice QCD simulations with fixed topology, and propose a way to measure the U(1)$_A$-violating effects reliably in a finite volume. Furthermore, we rigorously derive new sum rules and a new Banks-Casher-type relation for the Dirac eigenvalue spectra at $T > T_c$. This relation provides a link between the connected two-point correlation function of Dirac eigenvalues and the U(1)$_A$ anomaly. As a by-product of our spectral analysis, we find a remarkably simple proof of the Aoki-Fukaya-Taniguchi "theorem" on the effective restoration of the U(1)$_A$ symmetry at high temperature, under the same assumptions as in [13]. It should be emphasized that all our results are based on a systematic analysis of QCD. We expect that testing our exact relations and proposal in future lattice simulations should be a useful step towards the resolution of the U(1)$_A$ puzzle at finite temperature.
Footnote 1: We noticed that the tensor decomposition of the anomalous correlation function at finite temperature in [4] is incomplete, and that many more terms need to be included; see Appendix A. This does not, however, affect the conclusion of [4] that the anomaly relation (2.1) remains unchanged at finite temperature.
Footnote 2: In [24] the overlap Dirac operator was used to probe the Dirac spectra while configurations were generated with improved staggered fermions.
The paper is organized as follows. In Sec. 2, we review the argument of [10] for U(1)$_A$ violation at high temperature. In Sec. 3, we discuss the importance of topology and finite-volume effects on the breaking of the U(1)$_A$ symmetry, as well as its implication for lattice QCD simulations. In Sec. 4, we derive new spectral sum rules and a Banks-Casher-type relation for Dirac spectra concerning the U(1)$_A$ anomaly. In Sec. 5, we comment on the Aoki-Fukaya-Taniguchi "theorem" in [13]. Section 6 contains our conclusions.
In Appendix A, we point out and correct the deficiency in the tensor decomposition of the anomalous correlation function in QCD at finite temperature studied by Itoyama and Mueller [4]. In Appendix B, we discuss the microscopic scaling that is different from (4.12) in the main text. In Appendix C we derive (4.19) in the main text. Throughout this paper, we will work on QCD with quarks in the fundamental representation of the gauge group SU(3).

U(1) A anomaly at high temperature
In this section we review an argument for the U(1) A anomaly in massless QCD at high temperature, given by Laine and Vepsäläinen [10]. 3 Their argument is based on the anomaly relation for the axial current j Aµ = ψγ µ γ 5 ψ, and the Debye screening of the gauge fields at high temperature. Here N f is the number of flavors, g is the QCD coupling constant and G a µν is the gluon field strength with a being the color index.
Let us work in Euclidean spacetime with the imaginary time τ = it ∈ [0, β] and a spatial box of size L 1 × L 2 × L 3 . We assume that both quarks and gauge fields obey periodic boundary conditions in spatial directions, whilst quarks (gauge fields) obey the anti-periodic (periodic) boundary condition in the temporal direction. As our interest is in the screening of gauge fields, we shall consider a spatial correlation function. Without loss of generality we choose a spatial separation in the x 3 direction. Then, for the axial "charge" (integrated over the volume transverse to the x 3 axis) . is a statistical average, and the coordinates x 3 and y 3 are entirely arbitrary. The fact that Q 3 (x 3 )Q 3 (y 3 ) = 0 when U(1) A is unbroken may be shown in two steps as follows. Suppose we add a constant external field a that couples to the axial charge Q A 3 . Then the action acquires an additional contribution δS = a dx 3 Q A 3 (x 3 ). This leads to the susceptibility where V 4 = βL 1 L 2 L 3 is the total volume of spacetime. As U(1) A is conserved by assumption, χ A must be finite in the thermodynamic limit; hence χ A < ∞ as L 3 → ∞. This completes the first step of the proof.
In the second step, we use the fact that Q A 3 (x 3 ) is independent of x 3 . Actually, this is a direct consequence of the (assumed) conservation of the axial current: where the last step follows from the boundary conditions for fields. Note that (2.6) holds rigorously in a finite volume. Then it is clear that we must have Combining (2.7) with the finiteness of χ A in the limit L 3 → ∞, we conclude that Q A 3 (x 3 )Q A 3 (0) must vanish in the thermodynamic limit for an arbitrary x 3 . This completes the second step of the proof.
Next, recall that the anomaly relation (2.1) can be rewritten as a total derivative, In terms of this K µ , the so-called Chern-Simons charge reads as Because the gauge fields are Debye screened at high temperature, the correlation of Q CS 3 should decay exponentially: where ξ is the correlation length (or the inverse Debye mass); ξ −1 ≈ gT at leading order in g. Although the correlator (2.10) may appear to vanish trivially due to the non-gauge-invariance of Q CS 3 , this is not necessarily true. As Q CS 3 is gauge invariant up to surface terms, it becomes gauge invariant and its correlator becomes nonzero once we fix boundary conditions for the gauge fields.
Since Q A 3 and Q CS 3 have the same quantum numbers, Q CS 3 contributes to the correlator of Q A 3 . Hence we expect cannot be a constant as a function of x 3 , indicating that the U(1) A symmetry is certainly violated at the level of correlation functions of quark bilinears in QCD at sufficiently high temperature.
It would then be quite natural to expect, by continuity, that the U(1) A symmetry be violated at any T > T c . In the following, we shall assume U(1) A violation for T > T c in the thermodynamic limit and pursue its consequences in detail.

Topology and finite-volume effect
In this section we write down the most general QCD partition function for $T > T_c$ in terms of quark masses and derive general expressions for the chiral susceptibilities and topological susceptibility based on the method of [15,26]. Our arguments here are based on symmetries of QCD and a systematic expansion in terms of a small parameter $m/T \ll 1$, and are fully under theoretical control. We then elucidate the contributions of topology and finite-volume effects to the violation of the U(1)$_A$ symmetry (characterized by the difference of two-point functions $\chi_\pi - \chi_\delta$ to be defined below), and discuss possible implications for lattice QCD simulations. For definiteness, we will concentrate on two-flavor QCD below.

Partition function and topological susceptibility
We consider the partition function of two-flavor QCD at finite temperature as a function of quark masses $m_{u,d}$. Since there are no massless modes at $T > T_c$ in the chiral limit (footnote 4), the free energy density should be analytic in quark mass (footnote 5). To write down the general form of the free energy, we consider a generic quark mass matrix $M$ (a $2 \times 2$ matrix in flavor space) and let $M$ transform under the symmetry $G \equiv$ SU(2)$_R$ × SU(2)$_L$ × U(1)$_A$, so that the quark mass term in the QCD Lagrangian is invariant under $G$. Here $\psi_{R,L}$ are the right- and left-handed quarks, which transform under $G$. Noting that the free energy density at $T > T_c$ is invariant under the restored SU(2)$_R$ × SU(2)$_L$ chiral symmetry but not under the U(1)$_A$ symmetry, the partition function of QCD in a spatial volume $V_3$ can be expanded in terms of a small parameter $m_{u,d}/T \ll 1$ as in (3.2) and (3.3) [15,26], where $f_0$, $f_2$ and $f_A$ are functions of $T$ and $V_3$. We assume that this expansion has a nonzero radius of convergence. The term $\propto f_A$ represents the effect of the axial anomaly: for a U(1)$_A$ rotation $\psi \to e^{i\gamma_5\theta_A}\psi$, this term transforms as $\det M \to e^{4i\theta_A}\det M$, so it breaks U(1)$_A$ down to $\mathbb{Z}_4$. The absence of $O(M)$ terms is consistent with the vanishing chiral condensate in the chiral limit for $T > T_c$. In the following we will disregard the $O(M^4)$ terms in the free energy as they are suppressed by additional powers of $m_{u,d}/T \ll 1$. Since the partition function (3.2) is obtained with a systematic expansion, this will be called the "effective theory" in this paper (although there is no dynamical field in it).
Footnote 4: In the imaginary-time formalism, contributions of massless quarks are infrared (IR) finite because the lowest Matsubara frequency for fermions $\sim \pi T$ acts as an effective IR cutoff.
Footnote 5: It should be stressed that analyticity of the partition function breaks down if the zero-temperature part is thrown away. For example, in a non-interacting theory the free energy of quarks after subtraction of the zero-temperature part includes a term $\sim m^4 \log(m/\pi T)$ [27], which is not analytic in $m$.
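For orientation, the expansion described above can be written out explicitly. The following is a reconstruction, not a quotation of (3.2) and (3.3): the signs and normalizations are assumptions, chosen so that they reproduce the results quoted later in the text ($\chi_\eta = 4f_2 - 4f_A$, $\chi_\pi - \chi_\delta = 8f_A$, and sector weights whose Taylor expansion starts with $(V_4 f_A m_u m_d)^{|Q|}$).

```latex
% Reconstructed form of the effective free energy (sign conventions are assumptions).
\begin{align}
  Z(T,V_3,M) &= \exp\!\left[-\frac{V_3}{T}\, f(T,V_3,M)\right], \\
  f(T,V_3,M) &= f_0 - f_2\,\mathrm{tr}\,M^\dagger M
               - f_A\left(\det M + \det M^\dagger\right) + O(M^4).
\end{align}
% Effect of a U(1)_A rotation on the two invariants (two flavors):
\begin{align}
  \mathrm{tr}\,M^\dagger M \to \mathrm{tr}\,M^\dagger M, \qquad
  \det M \to e^{4i\theta_A}\det M ,
\end{align}
% so the f_A term is invariant only for \theta_A \in \{0,\pi/2,\pi,3\pi/2\}, i.e. Z_4.
% For real diagonal masses, M = diag(m_u, m_d):
\begin{equation}
  f = f_0 - f_2\,(m_u^2 + m_d^2) - 2 f_A\, m_u m_d + O(m^4).
\end{equation}
```

With this form, the $f_2$ term is blind to U(1)$_A$ and all anomaly effects at this order are carried by the single coefficient $f_A$.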
We now turn to the study of topological sectors. As is well known, the $\theta$ angle can be incorporated into the partition function via $M \to M e^{i\theta/N_f}$ [28], where $N_f = 2$ is of interest here. Then the partition function in a sector of given topological charge $Q$ is given by (3.7), where $V_4 \equiv V_3/T$ is the spacetime volume, $I_Q$ is the modified Bessel function of order $Q$, and $M = \mathrm{diag}(m_u, m_d)$ was substituted. Intriguingly, the probability distribution of $Q$ is proportional to $I_Q$ in one-flavor QCD, too [28]. The Taylor expansion of (3.7) in powers of quark masses starts with $(V_4 f_A m_u m_d)^{|Q|}$, which is the contribution of exact zero modes. Hence the topological sectors with $Q \neq 0$ will all drop out in the chiral limit if $V_4$ is finite. By contrast, topological fluctuations will not be suppressed at all even near the chiral limit if $V_4$ is sufficiently large. This subtle balance between topology and volume has an important practical consequence for lattice simulations, as we will discuss shortly.
An important quantity that characterizes topological fluctuations is the mean square of the topological charge at $\theta = 0$, given in (3.8), where (3.7) was used. The topological susceptibility is then given by (3.9) [15,26]. Alternatively, one can reach (3.9) by considering that, with the replacement $M \to M e^{i\theta/2}$ in (3.3), the $\theta$-dependence of the free energy takes the form (3.10). The topological susceptibility is then (3.11), which is the same as (3.9). Note that our result, obtained assuming $f_A \neq 0$, does not agree with the result by Aoki et al.
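The statements above about the sector weights can be checked numerically. The sketch below assumes that the normalized $\theta$-dependence is $Z(\theta)/Z(0) = \exp[x(\cos\theta - 1)]$ with $x = 2V_4 f_A m_u m_d$ (a reconstruction consistent with the Bessel-function weights and the leading power $(V_4 f_A m_u m_d)^{|Q|}$ quoted in the text, not a formula copied from it). The function names `sector_weights` and `charge_moment` are illustrative only.

```python
import math


def sector_weights(x, q_max=40, n_theta=1024):
    """Normalized weights Z_Q/Z for Q = -q_max..q_max.

    Assumption: Z(theta)/Z(0) = exp(x*(cos(theta) - 1)), and Z_Q is the
    Fourier coefficient (1/2pi) * integral of exp(-i*Q*theta) * Z(theta).
    The integral is done with the trapezoid rule on a periodic grid.
    """
    thetas = [2.0 * math.pi * k / n_theta for k in range(n_theta)]
    integrand = [math.exp(x * (math.cos(t) - 1.0)) for t in thetas]
    weights = {}
    for q in range(-q_max, q_max + 1):
        # The integrand is even in theta, so only the cosine part survives.
        total = sum(v * math.cos(q * t) for v, t in zip(integrand, thetas))
        weights[q] = total / n_theta
    return weights


def charge_moment(weights, power):
    """Moment <Q^power> of the topological charge distribution."""
    return sum((q ** power) * w for q, w in weights.items())


if __name__ == "__main__":
    x = 3.0  # hypothetical value of 2*V4*f_A*m_u*m_d
    w = sector_weights(x)
    print("sum of weights:", sum(w.values()))
    print("<Q^2>:", charge_moment(w, 2), " expected x =", x)
    print("<Q^4>:", charge_moment(w, 4), " expected 3x^2 + x =", 3 * x * x + x)
```

Under this assumption $\langle Q^2\rangle = x$, so the topological susceptibility $\langle Q^2\rangle/V_4$ is proportional to $m_u m_d$, in line with the behavior $\chi_{\rm top} \propto m_u m_d$ stated in the text.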
The topological susceptibility here should be contrasted with that of the QCD vacuum, with Σ being the magnitude of the chiral condensate [28]. This difference can be understood in the following way. In the presence of chiral symmetry breaking, the quark mass dependence of the free energy can be expanded in terms of the quark mass as where U denotes the SU(2) Nambu-Goldstone field associated with spontaneous chiral symmetry breaking. The topological susceptibility is dominated by the second term in (3.13) and is given by (3.12), with the higher order contributions being O(M 2 ). When chiral symmetry is restored (Σ = 0), on the other hand, the leading contribution to the topological susceptibility is O(M 2 ) and is given by (3.9). The behavior χ top ∝ m u m d is also found in the 2SC phase of dense QCD for the same reason [30]. From the partition function (3.3), higher moments of Q can also be derived [26] as

Chiral susceptibilities and U(1) A anomaly
We now turn to two-point correlation functions of quark bilinears (also called chiral susceptibilities). The utility of chiral susceptibilities in two-flavor QCD as a convenient probe for the U(1)$_A$ anomaly was advocated long ago [5,7], and nowadays they are measured in lattice simulations with dynamical quarks [16,17,25]. It is therefore of primary interest to relate these susceptibilities to the coefficients $f_0$, $f_2$ and $f_A$ of the effective theory (3.3) [15]. In this paper we will work with the following definitions for chiral susceptibilities, where $\bar\psi\psi = \bar u u + \bar d d$ and $\bar\psi i\gamma_5\psi = \bar u i\gamma_5 u + \bar d i\gamma_5 d$ in our notation, and $\tau_3$ is the third Pauli matrix. Note that these definitions do not necessarily coincide with those in the literature.
In the same way one can also show $\chi_\eta = 4f_2 - 4f_A$. The equality $\chi_\sigma = \chi_\pi$, as well as $\chi_\delta = \chi_\eta$, is a direct consequence of the restored SU(2)$_R$ × SU(2)$_L$ chiral symmetry. The axial anomaly manifests itself in the difference [15] $\chi_\pi - \chi_\delta = 8f_A + O(M^2)$ (3.19), where the correction is displayed for completeness. Practically, the correlators of $\pi$ and $\delta$ are most convenient in lattice simulations, because they have no disconnected components (for degenerate masses). Note that, if one wishes, it is entirely straightforward to extend the calculation for (3.19) to higher orders by including U(1)$_A$-violating terms such as $(\det M)^2$ and $(\mathrm{tr}\, M M^\dagger)(\det M)$ in the free energy (3.3). However, this is not expected to bring about quantitative differences when $M/T \ll 1$. In [11], the two-point correlators $\langle\pi(x)\pi(0)\rangle$ and $\langle\delta(x)\delta(0)\rangle$ (rather than their spatial integrals, $\chi_\pi$ and $\chi_\delta$) were calculated directly in high-temperature QCD by using the 't Hooft vertex of instantons. They found that the difference of the two correlators decreases at high temperature but does not vanish exactly, in agreement with the general argument presented in Sec. 2.
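The values quoted above can be recovered in a few lines. The following worked derivation is a sketch under two assumptions that are not taken verbatim from the text: (i) $\log Z = V_4\,[-f_0 + f_2\,\mathrm{tr}\,M^\dagger M + f_A(\det M + \det M^\dagger)]$ up to $O(M^4)$, and (ii) each susceptibility equals $V_4^{-1}$ times the second derivative of $\log Z$ with respect to the corresponding deformation of the mass matrix, evaluated at zero deformation.

```latex
% Worked check (assumed conventions; see the lead-in above).
\begin{align}
  \pi:&\quad M=\mathrm{diag}(m+ib,\,m-ib), &
  \tfrac{1}{V_4}\log Z &\ni 2(f_2+f_A)(m^2+b^2)
  &&\Rightarrow\ \chi_\pi = 4f_2+4f_A, \\
  \delta:&\quad M=\mathrm{diag}(m+c,\,m-c), &
  \tfrac{1}{V_4}\log Z &\ni 2f_2(m^2+c^2)+2f_A(m^2-c^2)
  &&\Rightarrow\ \chi_\delta = 4f_2-4f_A, \\
  \eta:&\quad M=(m+ib)\,\mathbf{1}, &
  \tfrac{1}{V_4}\log Z &\ni 2f_2(m^2+b^2)+2f_A(m^2-b^2)
  &&\Rightarrow\ \chi_\eta = 4f_2-4f_A, \\
  \sigma:&\quad M=(m+s)\,\mathbf{1}, &
  \tfrac{1}{V_4}\log Z &\ni 2(f_2+f_A)(m+s)^2
  &&\Rightarrow\ \chi_\sigma = 4f_2+4f_A .
\end{align}
% Hence \chi_\sigma=\chi_\pi, \chi_\delta=\chi_\eta, and
% \chi_\pi-\chi_\delta = 8 f_A at this order.
```

The degeneracies $\chi_\sigma = \chi_\pi$ and $\chi_\delta = \chi_\eta$ then appear automatically, and the whole U(1)$_A$ splitting is proportional to $f_A$.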
It was emphasized in [8,9] that the dominant contribution to $\chi_\pi - \chi_\delta$ comes from exact zero modes in the $Q = \pm 1$ sector. A more recent paper [13] argues to the contrary that the contribution of exact zero modes is suppressed in the thermodynamic limit. In what follows we aim to clarify this issue.
Let us first decompose the anomalous contribution (3.19) into contributions from each topological sector. We assume $\theta = 0$ in the following. Since the second terms in (3.15b) and (3.15c) vanish for degenerate masses, (3.20) and (3.21) follow, where it is tacitly assumed in (3.20) that the first term is evaluated for $M = \mathrm{diag}(m + ib, m - ib)$ and the second term for $M = \mathrm{diag}(m + c, m - c)$. In (3.21) we defined the contribution $P_Q$ from the sector of topological charge $Q$ as in (3.22). Depending on the sign of $Q$, one may cast $P_Q$ into the suggestive form (3.25). The first terms in (3.25) are the contributions from exact zero modes. This can be easily seen by plugging $Z_Q \propto (m^2 + b^2)^{|Q|}$ and $Z_Q \propto (m^2 - c^2)^{|Q|}$ into the first and the second terms in (3.22), respectively. Therefore the U(1)$_A$-violating contribution (3.21) may be split into the zero-mode fraction $S_{\rm z}$ and the nonzero-mode fraction $S_{\rm nz}$. In addition, the contribution of the $Q = \pm 1$ sectors to $S_{\rm z}$ is denoted by $S_{\pm 1}$. The quantities $S_{\rm z}$, $S_{\rm nz}$ and $S_{\pm 1}$ are plotted in Figure 1 as functions of $x \equiv 2V_4 f_A m^2$. We observe that, in a small volume or near the chiral limit ($x \ll 1$), $\chi_\pi - \chi_\delta$ is dominated by the contribution of exact zero modes in the $Q = \pm 1$ sector, as argued in [8,9]. By contrast, if we take the thermodynamic limit ($x \gg 1$), the contribution of nonzero modes dominates, and the exact zero modes are completely irrelevant. This can be understood from (3.8): since $\langle Q^2 \rangle \sim V_4 f_A m^2$, one naturally expects $|Q| = O(\sqrt{V_4})$, implying that the first term in (3.25) is suppressed in a large volume. On the other hand, the second term in (3.25) tends to $8f_A$, which is the same value as in the full theory (3.19). This means that the anomaly ($f_A \neq 0$) in the thermodynamic limit must be attributed to nonzero Dirac eigenmodes. The $Q = \pm 1$ sector does not play a distinguished role. Indeed, one can show for $x \gg 1$ that $Z_Q/Z$ obeys a Gaussian distribution (see also [28]), according to which $Z_Q/Z$ is appreciable for $|Q| \lesssim \sqrt{V_4 f_A m^2}$ and is suppressed otherwise.
Therefore, if the volume is sufficiently large with a fixed nonzero mass, all contributions to $\chi_\pi - \chi_\delta$ from the sectors with $|Q| \lesssim \sqrt{V_4 f_A m^2}$ are equally important, in contradistinction to the finite-volume regime ($x \ll 1$) where only the $Q = \pm 1$ sectors contribute to $\chi_\pi - \chi_\delta$.
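The crossover between zero-mode and nonzero-mode dominance can be reproduced with a short numerical sketch. It rests on reconstructed expressions, not on formulas copied from the text: the sector weight is taken to be $Z_Q/Z = e^{-x} I_Q(x)$, the sector contribution is taken to be $P_Q = 8f_A\,[\,|Q|/x + I_{|Q|+1}(x)/I_{|Q|}(x)\,]$, and the fractions are normalized by $8f_A$. The helper names `bessel_i` and `mode_fractions` are hypothetical.

```python
import math


def bessel_i(order, x, max_terms=500):
    """Modified Bessel function I_order(x), integer order, via its power series."""
    n = abs(order)
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    # Leading term (x/2)^n / n!, computed in log space to avoid overflow.
    term = math.exp(n * math.log(x / 2.0) - math.lgamma(n + 1))
    total = term
    for k in range(1, max_terms):
        term *= (x / 2.0) ** 2 / (k * (k + n))
        total += term
        if term <= 1e-17 * total:
            break
    return total


def mode_fractions(x, q_max=None):
    """Return (S_z, S_nz, S_pm1) for x > 0 under the assumptions in the lead-in.

    S_z   : fraction of chi_pi - chi_delta carried by exact zero modes
    S_nz  : fraction carried by nonzero modes
    S_pm1 : part of S_z coming from the Q = +1 and Q = -1 sectors
    """
    if q_max is None:
        q_max = int(x + 10.0 * math.sqrt(x) + 20)
    norm = math.exp(-x)
    # Zero-mode part: sum over Q of (Z_Q/Z) * |Q|/x, using the Q <-> -Q symmetry.
    s_z = norm / x * sum(2 * q * bessel_i(q, x) for q in range(1, q_max + 1))
    # Nonzero-mode part: sum over Q of (Z_Q/Z) * I_{|Q|+1}/I_{|Q|}.
    s_nz = norm * (bessel_i(1, x)
                   + 2.0 * sum(bessel_i(q + 1, x) for q in range(1, q_max + 1)))
    s_pm1 = 2.0 * norm * bessel_i(1, x) / x
    return s_z, s_nz, s_pm1


if __name__ == "__main__":
    for x in (0.1, 1.0, 10.0, 50.0):
        s_z, s_nz, s_pm1 = mode_fractions(x)
        print(f"x={x:5.1f}  S_z={s_z:.4f}  S_nz={s_nz:.4f}  S_pm1={s_pm1:.4f}")
```

With these expressions the two fractions add up to one for every $x$, which is the statement that the total $\chi_\pi - \chi_\delta$ stays at $8f_A$ while its origin shifts from zero modes to nonzero modes as $x$ grows. The series-based Bessel routine is only meant for moderate $x$ (roughly $x \lesssim 500$, beyond which $I_Q(x)$ overflows double precision).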
To avoid confusion, we stress that the total amount of $\chi_\pi - \chi_\delta$ is equal to $8f_A$ irrespective of the value of $x$; the order-of-limit issue does not arise, of course, because there is no long-range order in QCD above $T_c$. The reason the exchange of dominance occurs between zero and nonzero modes as we vary the volume is that a long-range correlation is induced once the global topological charge is fixed [28].
Figure 2. The magnitude of $(\chi_\pi - \chi_\delta)_{Q=0}$ normalized by $(\chi_\pi - \chi_\delta)_{\rm full}$ as a function of $x = 2V_4 f_A m^2$ (vertical axis: $I_1(x)/I_0(x)$; horizontal axis: $x$).

Implications for lattice QCD simulations
We now discuss implications of the above results for lattice QCD simulations. So far the U(1)$_A$ anomaly at high temperature has been thoroughly investigated on the lattice (as reviewed in Sec. 1), but despite these efforts, a definitive conclusion on the (non-)restoration of the U(1)$_A$ symmetry has not yet been reached. This is not surprising, considering that the physics of the U(1)$_A$ anomaly is highly sensitive to the explicit breaking of chiral symmetry by lattice discretization; even domain-wall fermions have serious problems, as pointed out in [20]. In this regard, the most reliable simulations are those in [16] employing dynamical overlap fermions. They reported restoration of the U(1)$_A$ symmetry based on simulations with a fixed global topological charge ($Q = 0$). They also evaluated possible finite-size effects associated with the topology fixing, by using the formalism developed in [29,33]. Here we wish to revisit this issue based on our effective-theory framework. It follows from (3.21) that in the topologically trivial sector ($Q = 0$) we have $(\chi_\pi - \chi_\delta)_{Q=0} = 8f_A\, I_1(x)/I_0(x)$ (3.30). The ratio of (3.30) to $(\chi_\pi - \chi_\delta)_{\rm full} = 8f_A$ is plotted in Figure 2. It shows that the ratio tends to 0 for small $x$ and obscures the nonzero value in the full theory. This signals a strong finite-volume effect at small $x$. It seems necessary to ensure at least $x = 2V_4 f_A m^2 \gtrsim 1$ in order to observe a nonzero value of $\chi_\pi - \chi_\delta$ clearly. Our result so far is rigorous, as long as $f_A \neq 0$ and the $O(M^4)$ correction to (3.3) can be neglected. At sufficiently high temperature $T \gg T_c$ we may resort to the dilute instanton gas approximation [12], which yields (3.31) for $N_c = 3$ and $N_f = 2$. Since $f_A$ then decays so rapidly with increasing temperature, it would be a challenging task to achieve a sufficiently large volume that satisfies $2V_4 f_A m^2 \gtrsim 1$ while keeping $m$ small. On the other hand, near $T_c$, the asymptotic formula (3.31) breaks down and we do not know exactly how small $f_A$ is.
One way to extract $f_A$ from a topology-fixed simulation is as follows: if the simulation volume $V_4$ is not large and hence $V_4 f_A m^2 \ll 1$, then we have, to a good approximation, from (3.30), $(\chi_\pi - \chi_\delta)_{Q=0} \simeq 8 V_4 f_A^2 m^2$ (3.32), where we used $I_1(x)/I_0(x) \simeq x/2$ for $x \ll 1$. Therefore in principle one can extract $f_A$ by fitting the lattice data to the formula (3.32). This is a proposal for future lattice simulations with overlap fermions.
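The proposal above can be illustrated with a minimal sketch. It assumes that the $Q = 0$ sector gives $(\chi_\pi - \chi_\delta)_{Q=0} = 8 f_A\, I_1(x)/I_0(x)$ with $x = 2V_4 f_A m^2$ (consistent with the ratio plotted in Figure 2), and that all quantities are expressed in common lattice units. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math


def bessel_ratio_i1_i0(x, max_terms=500):
    """Return I_1(x)/I_0(x) by summing the power series of both functions."""
    half_sq = (x / 2.0) ** 2
    term0, term1 = 1.0, x / 2.0
    sum0, sum1 = term0, term1
    for k in range(1, max_terms):
        term0 *= half_sq / (k * k)
        term1 *= half_sq / (k * (k + 1))
        sum0 += term0
        sum1 += term1
        if term0 <= 1e-17 * sum0:
            break
    return sum1 / sum0


def delta_q0(f_a, v4, m):
    """(chi_pi - chi_delta) in the Q = 0 sector, under the assumed formula."""
    x = 2.0 * v4 * f_a * m * m
    return 8.0 * f_a * bessel_ratio_i1_i0(x)


def estimate_f_a_small_x(delta, v4, m):
    """Invert the small-x form delta ~ 8*V4*f_A^2*m^2 to estimate f_A."""
    return math.sqrt(delta / (8.0 * v4 * m * m))


if __name__ == "__main__":
    # Hypothetical inputs in lattice units, chosen so that x = 0.02 << 1.
    f_a_true, v4, m = 0.01, 100.0, 0.1
    measured = delta_q0(f_a_true, v4, m)
    print("Q=0 signal / full signal:", measured / (8.0 * f_a_true))
    print("estimated f_A:", estimate_f_a_small_x(measured, v4, m))
```

In this example the $Q = 0$ signal is about one percent of the full value $8f_A$, yet the small-$x$ inversion recovers $f_A$ closely. In practice one would fit data at several masses or volumes and check that $x \ll 1$ holds self-consistently for the fitted $f_A$.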

Dirac eigenvalue spectra and U(1) A anomaly
In this section we derive new spectral sum rules and a novel Banks-Casher-type relation which link the Dirac spectrum to the violation of the U(1) A symmetry in high-temperature QCD. We then discuss possible forms of the spectral functions. Throughout this section we will focus on two-flavor QCD. We denote the purely imaginary eigenvalues of the Euclidean Dirac operator D = γ µ (∂ µ + igA µ ) by {iλ n } n with λ n ∈ R and define the spectral density for a fixed gauge field A µ in a finite volume as

Spectral sum rules I: macroscopic limit
The partition function of two-flavor QCD in the sector of topological charge Q is given by where the average is taken with respect to the pure Yang-Mills action, and |Q| exact zero modes are implicitly included in the product. Equating the microscopic partition function (4.2) to that of the effective theory (3.7) and taking their derivatives with respect to m u and/or m d , one finds various nontrivial formulas for correlation functions of the Dirac eigenvalues. The simplest one is given by with I Q (x) ≡ dI Q (x)/dx. Now . . . Q stands for the average with full N f = 2 QCD measure.
In the thermodynamic limit, the number of Dirac eigenvalues scales linearly with V 4 , so let us define the one-point function (or the macroscopic spectral density) by in terms of which (4.5) reads as Note that R 1 (λ) depends on m u and m d implicitly through the QCD measure used for averaging. R 1 (λ) is expected to have no dependence on Q since topology is irrelevant once the thermodynamic limit is taken. Strictly speaking, we have to specify a UV cutoff scheme in order to make (4.7) fully meaningful. Equation (4.7) will be used later in Sec. 5.
There exists another relation that directly relates $R_1(\lambda)$ to the U(1)$_A$ anomaly [15,34]. Setting $m_u = m_d \equiv m$ and evaluating the chiral susceptibilities (3.15) in the basis of Dirac eigenstates, one can straightforwardly obtain (4.8). By subtraction we arrive at (4.9) [15,34]. Combining this formula with (3.19), we find (4.10). It is clear from this relation that small Dirac eigenvalues are necessary for $f_A$ to be nonzero. Indeed $f_A = 0$ follows immediately if $R_1(\lambda)$ has a spectral gap near zero at $T > T_c$. Equations (4.7), (4.9) and (4.10) highlight essential properties of $R_1(\lambda)$. Unfortunately the form of $R_1(\lambda)$ in the near-zero region cannot be deduced uniquely from (4.7) and (4.10) alone. Actually there are infinitely many functions that satisfy (4.10), e.g., those in (4.11), where $n \geq 1$ and $0 \leq k \leq 2n + 1$ are arbitrary integers. Recently, the Dirac spectrum in two-flavor QCD at $T \gtrsim T_c$ has been studied intensively in lattice QCD simulations [16,17,24,25] (see also [14,15,35-39] for early works). The three possibilities $R_1(\lambda) \sim m$, $\lambda$, and $m^2\delta(\lambda)$ were examined in detail in [17,25], whereas the Breit-Wigner form $R_1(\lambda) \sim \rho_0 A/(\lambda^2 + A^2)$ was nicely fitted to the lattice data in [24,40] (but see [19,20,41,42] for detailed investigations of lattice artifacts stemming from partial reweighting). The $\delta$ form is motivated by the dilute instanton gas picture [12] in QCD at $T \to \infty$, but the exact $\delta$ form is unlikely to emerge at $T \gtrsim T_c$ due to the overlap of neighboring instantons and anti-instantons. For the moment, contrasting results from different simulations do not allow us to draw a definitive conclusion on the form of $R_1(\lambda)$.

Spectral sum rules II: microscopic limit
There is yet another way to take the thermodynamic limit in (4.5): if we let all of $\lambda$, $m_u$ and $m_d$ scale as $1/\sqrt{V_4 f_A}$ in the $V_4 \to \infty$ limit, the dependence on the topology does persist. To see this, let us define a rescaled dimensionless spectral density (4.12), which is analogous to the microscopic spectral density in the $\varepsilon$-regime [28,43]; note, however, that the relevant scale of eigenvalues here is $1/\sqrt{V_4}$ rather than $1/V_4$ (footnote 10). Then (4.5) becomes (4.13). Notice that all $O(m^3)$ corrections in (4.5) drop out in this limit; hence (4.13) is exact. This, of course, comes with the caveat that such a limit is meaningful only when the assumption $f_A \neq 0$ is correct. (See Appendix B for another microscopic scaling.) It is straightforward to extend the rescaling (4.12) to higher-order spectral correlation functions. We conjecture that spectral fluctuations on the scale $1/\sqrt{V_4}$ should be universal, i.e., determined solely by global symmetries and independent of the detailed form of QCD interactions in the ultraviolet. Although such a new "microscopic limit" prompts us to construct a random matrix theory that describes the Dirac spectrum in this regime, we have not been successful yet. The difficulty in finding a proper random matrix theory may have something to do with the fact that no global symmetry is spontaneously broken at $T > T_c$, unlike in the QCD vacuum [28,43] and high-density QCD [44,46,47].
Footnote 10: Intriguingly, a similar unusual scaling $\sim 1/\sqrt{V_4}$ also appears in color-superconducting phases of QCD at high density [44,45], in the superfluid phase of two-color QCD [46,47] and in an exotic phase proposed by Stern [48-50].
We can derive infinitely many spectral sum rules rigorously by expanding the expression (4.14) in powers of quark masses, where $Q \geq 0$ is assumed and the product runs only over $\lambda_n > 0$. The average is taken with the massless two-flavor QCD measure. Note that both sides of (4.14) are normalized to unity in the chiral limit. Two examples of the sum rules are given in (4.15), where the sum runs over $\lambda_n > 0$. Note that the first sum rule receives UV-divergent contributions from large eigenvalues with density $\propto \lambda^3$, implying that $f_2$ is a regularization-scheme-dependent quantity. By contrast, the second sum rule is dominated by contributions from $O(1/\sqrt{V_4})$ eigenvalues. UV eigenvalues only give an $O(V_4)$ correction to the RHS, which is negligibly small compared to the $O(V_4^2)$ term in the $V_4 \to \infty$ limit. Indeed one can show that $f_A$ is free from UV divergences (see Appendix B of [49]). The suppression of the second sum rule for large $Q$ is due to the repulsion of Dirac eigenvalues from the origin by $Q$ exact zero modes.
In terms of the "microscopic" spectral density, the sum rules (4.15) take the form (4.16). To obtain the universal function $\rho_Q(\zeta; 0, 0)$ analytically is an important open problem that deserves further investigation.

New Banks-Casher-type relation
If the axial anomaly is present at high temperature, it will be manifested not only in the Dirac eigenvalue density but also in the n-point spectral correlation functions for n ≥ 2. To examine this possibility, let us introduce the connected two-point correlation function (see e.g., [51]) 11 R C depends on m u,d implicitly through the averaging weight. Note that R C satisfies the constraint If eigenvalues are entirely uncorrelated, they obey the Poisson statistics. In this case, the two-point function is related to the one-point function (cf. [52, (5.4)] and [53, (3.33)]) as In the rest of this subsection we will ignore topological zero modes altogether. As argued in Sec. 3.2, this is justifiable in the macroscopic limit with a positive path-integral measure.
where N denotes the number of chiral pairs of Dirac eigenvalues {±iλ n } (hence the total number of eigenvalues is 2N ). The δ-functions in first term represent a trivial selfcorrelation. In Appendix C we outline the derivation of (4.19) for completeness. Then we obtain for the uncorrelated Dirac spectra As N grows linearly with V 4 , V 4 /N has a well-defined thermodynamic limit. One can check that (4.20) satisfies (4.18). Nontrivial two-level correlations among eigenvalues can be characterized by the deviation of R C from the Poisson case. Let us define the two-point cluster function T 2 (λ, λ ) by We observe that T 2 (λ, λ ) = T 2 (λ , λ) and owing to chiral symmetry. For the Poisson distribution, T 2 vanishes identically by definition. To see how R C and T 2 are related to the axial anomaly, we note that (3.16) and (3.17) together with χ σ = χ π imply where χ disc is the disconnected scalar susceptibility defined by . (4.24) In the limit m u = m d = m, we substitute (4.21) into (4.24), obtaining where (4.7) and (4.10) have been used. Recalling (4.23) we obtain Finally, taking the chiral limit in (4.28), we arrive at a new Banks-Casher-type relation for massless two-flavor QCD at T > T c . To the best of our knowledge, (4.29) is a new result. It reveals that the U(1) A anomaly is encoded, not only in the spectral density as in (4.10), but also in the nontrivial two-level correlations among near-zero eigenvalues. If the near-zero Dirac eigenvalues are entirely uncorrelated, then T 2 vanishes and leads to f A = 0, suggesting effective restoration of the U(1) A symmetry. Whether an analogue of (4.29) can be derived in N f > 2 QCD at T > T c is an interesting open problem. Some remarks are in order. In taking the chiral limit we have replaced 1/[(iλ+m)(iλ + m)] with π 2 δ(λ)δ(λ ). Strictly speaking, in doing so we have tacitly assumed that the typical scale over which T 2 (λ, λ ) varies is much larger than m. 
Whether this is true or not in actual QCD is a dynamical problem and must be checked separately. 12 We also remark that (4.29) cannot be extended to T < T c because of the infrared-singular behavior One might suspect that the correlation in the Dirac spectra revealed by (4.29) is at variance with the quasi-instanton picture proposed in [26] where a Poisson distribution of topological objects (i.e., dressed instantons called quasi-instantons) was argued at all T > T c . 13 In the limit T → ∞, where the interaction is weak, the quasi-instanton gas is expected to reduce to the conventional dilute bare instanton gas [12]. Let us try to explain how they can be consistent with each other. The point is that the Poisson distribution of topological zero modes (quasi-instantons) does not necessarily mean the Poisson distribution of Dirac eigenvalues.
In the quasi-instanton picture, independently distributed topological charges are expected to generate small Dirac eigenvalues that can be described, to a good approximation, by a spectral density with a δ-peak at the origin. Let us discuss how to deal with this case explicitly within the present spectral analysis. The spectral density over a gauge field A µ now assumes the form (4.30), where ρ̃ A (λ) is the density of eigenvalues away from zero and c A ≥ 0 is an integer equal to the total number of topological objects, N = N + + N − . 14 If we assume that the density of nonzero eigenvalues ρ̃ A (λ) is so small that the anomalous contribution f A in (4.10) originates solely from the δ-peak at the origin, then (4.31) follows. Next, by plugging (4.30) into (4.17) we find (4.32), where R̃ C is the two-point connected function for eigenvalues away from zero. The last two lines of (4.32) vanish if there is no correlation between zero and nonzero modes, which we assume. We then substitute (4.32) into (4.24) to obtain (4.33). As we have been assuming that the density of nonzero modes is sufficiently low, the second term can be neglected in the chiral limit compared to the first. Then (4.31) and χ disc = 2f A (cf. (4.23)) imply (4.34). This coincidence between the average and the variance of c A indicates that c A is Poisson distributed, which is what the quasi-instanton picture in [26] suggests. We mention that Poisson statistics of topological objects was indeed observed in recent lattice data at T = 1.5T c [24].
12 This smoothness condition is necessary to derive the original Banks-Casher relation, too [28].
13 Topological objects similar to our quasi-instantons have been advocated for color-superconducting phases of QCD at high density [30,54]. While quasi-instantons in hot QCD do not interact with each other [26], those in dense QCD weakly interact via exchange of (pseudo) Nambu-Goldstone modes [30,54].
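As a quick numerical illustration of the statement around (4.34), the sketch below draws integer counts from a Poisson distribution and checks that the sample mean and the sample variance agree. The sampler (Knuth's multiplication method), the function names, the seed, and the mean value 5.0 are illustrative assumptions; none of them is taken from [26] or from lattice data.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    # Multiply uniform random numbers until the product drops below exp(-mean).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    avg = sum(values) / n
    var = sum((v - avg) ** 2 for v in values) / (n - 1)
    return avg, var


# Hypothetical ensemble: 20000 "configurations", each with a count c_A
# of topological objects drawn independently with mean 5.0 (assumed value).
rng = random.Random(12345)
counts = [sample_poisson(5.0, rng) for _ in range(20000)]
avg_c, var_c = mean_and_variance(counts)
print(f"<c_A> = {avg_c:.3f}")
print(f"<c_A^2> - <c_A>^2 = {var_c:.3f}")
print(f"variance / mean = {var_c / avg_c:.3f}")
```

For a Poisson distribution the ratio of variance to mean is expected to be close to one, up to statistical fluctuations of the finite sample. A ratio that stays away from one as the sample grows would signal correlations among the topological objects.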
However, in the real world, quasi-instantons will not be strictly noninteracting (due to the O(m⁴) term in the free energy), and the δ-peak of the spectral density may not be narrow enough to rigorously justify the above treatment. Also, the correlations between zero modes and nonzero modes will not be negligible in general. With these caveats in mind, we still believe that the quasi-instanton picture in [26] and the exact Banks-Casher-type relation (4.29) can serve as a useful starting point for a fuller analytical and numerical investigation of the Dirac spectrum in QCD at high temperature in the future.
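The smoothness condition behind the chiral limit of (4.28), namely that T 2 varies over a scale much larger than m, can be illustrated with a one-dimensional toy integral. For an even test function g(λ), the kernel m/(λ² + m²) acts like πδ(λ) only when g is nearly constant over |λ| of order m. The Gaussian g of width w, the numerical values, and the midpoint quadrature below are hypothetical choices made for this sketch; they are not the actual T 2 of QCD.

```python
import math


def smeared_value(m, width, half_range=10.0, steps=200000):
    """Midpoint-rule estimate of the integral over [-half_range, half_range]
    of g(lam) * m / (lam^2 + m^2), with g(lam) = exp(-lam^2 / (2 width^2))."""
    h = 2.0 * half_range / steps
    total = 0.0
    for i in range(steps):
        lam = -half_range + (i + 0.5) * h
        g = math.exp(-lam * lam / (2.0 * width * width))
        total += g * m / (lam * lam + m * m)
    return total * h


m = 0.01
# g(0) = 1, so the delta-function replacement predicts the value pi.
ratio_smooth = smeared_value(m, width=1.0) / math.pi   # width >> m
ratio_sharp = smeared_value(m, width=0.01) / math.pi   # width ~ m
print("width >> m : integral / pi =", ratio_smooth)
print("width ~  m : integral / pi =", ratio_sharp)
```

In the first case the ratio is close to one, so replacing the kernel by πδ(λ) is a good approximation. In the second case the test function varies on the same scale as m and the ratio deviates substantially from one, which is the situation the remark on the chiral limit warns against.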

Comment on the Aoki-Fukaya-Taniguchi theorem
Contrary to our assumption that f A ≠ 0 for T > T c , Aoki et al. [13] claim that, under certain assumptions, the violation of the U(1) A symmetry is invisible in correlation functions of scalar and pseudoscalar quark bilinears for T > T c in two-flavor QCD. (This claim does not generalize to the vector and axial-vector sector, as we discussed in Sec. 2.) There are two key assumptions in their analysis: 15
1. The Dirac spectral density can be expanded in a Taylor series near the origin, as in (5.1), with a radius of convergence that does not vanish in the chiral limit. In particular, there is no δ(λ) term. The notation ⟨ρ A n ⟩ m makes it clear that these coefficients depend on the quark mass m.
It should be noted that none of the examples in (4.11) satisfies the first assumption. Precisely speaking, Ref. [13] assumes that the spectral density for a given gauge field A µ can be expanded in a Taylor series, while assumption 1 above is concerned only with the spectral density averaged over all gauge fields. 16 While their original proof [13] is rather involved, we now show that a much simpler proof of f A = 0 for N f = 2 is possible, based on our analysis in the preceding sections. Namely, one can easily prove the following theorem:
Theorem. Under the two assumptions above, f A = 0.
Proof. From (5.1) and the Banks-Casher relation [55], the chiral condensate is proportional to the chiral limit of the coefficient ⟨ρ A 0 ⟩ m . For T > T c , where ⟨ψ̄ψ⟩ = 0, the condition (5.3) must therefore hold. Since ⟨ρ A 0 ⟩ m is an analytic function of m² according to the second assumption, (5.3) implies (5.4). On the other hand, it follows from (4.7) that (5.5) holds, where a UV cutoff Λ was inserted. Note that, to derive this expression, we have only used (i) analyticity of the free energy and (ii) irrelevance of exact zero modes in the thermodynamic limit (as explained in Sec. 3). If the limit m u → 0 is taken with m d fixed, the RHS of (5.5) converges to 2f A m d + O(m d ³). 17 Next, we introduce an arbitrary scale ε > 0 that is smaller than the radius of convergence of (5.1) in the chiral limit. Then we split the LHS of (5.5) at λ = ε as in (5.7), where (5.1) was used. The second term in (5.7) is obviously O(m u ), whereas the first term is less trivial. To check its behavior near the chiral limit, we use the elementary integrals (5.8). As the leading term in the limit m u → 0 comes from (5.8a), we can deduce the leading behavior of the first term. Plugging this into (5.7) and recalling (5.4), we obtain the leading behavior of the LHS of (5.5) in the limit m u → 0. This is to be compared with the RHS of (5.5), whose leading term in this limit is 2f A m d . Thus we conclude that f A = 0. This completes the proof.
This short proof is made possible by treating m u and m d as two independent variables, unlike the original one [13], where only the case of degenerate masses was considered. Of course, whether the two assumptions are correct in QCD is highly nontrivial. If f A is nonvanishing in QCD, as has been assumed in the preceding sections, then one has to relax at least one of the two conditions above. Considering that recent lattice simulations [17,24,25] have demonstrated a singular peak structure in R 1 (λ) at small λ, it seems natural to abandon the Taylor expansion (5.1). This issue deserves further investigation.
15 Aoki et al. used overlap fermions on the lattice to regularize UV divergences. This is not crucial in the following discussion, however.
16 We thank Sinya Aoki for clarifying this point to us.
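The small-m u estimates used in the proof rest on elementary integrals of powers of λ against a Lorentzian kernel. As a sketch, assume the kernel is proportional to m/(λ² + m²); its overall normalization in (5.5) is not reproduced here and is an assumption, as are the function names and the value of ε. Under this assumption the first few moments over [0, ε] have simple closed forms, and only the n = 0 moment survives as m → 0.

```python
import math


def moment_closed_form(n, eps, m):
    """Closed form of the integral over [0, eps] of lam^n * m / (lam^2 + m^2)."""
    if n == 0:
        return math.atan(eps / m)                        # tends to pi/2 as m -> 0
    if n == 1:
        return 0.5 * m * math.log(1.0 + (eps / m) ** 2)  # behaves like m*log(1/m)
    if n == 2:
        return m * (eps - m * math.atan(eps / m))        # O(m)
    raise ValueError("closed form implemented only for n = 0, 1, 2")


def moment_numeric(n, eps, m, steps=200000):
    """Midpoint-rule cross-check of the same integral."""
    h = eps / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += lam ** n * m / (lam * lam + m * m)
    return total * h


eps = 0.5  # hypothetical cutoff, standing in for a scale inside the radius of convergence
for m in (1e-1, 1e-2, 1e-3, 1e-4):
    values = [moment_closed_form(n, eps, m) for n in range(3)]
    print(f"m = {m:g}: n=0 -> {values[0]:.6f}, n=1 -> {values[1]:.6f}, n=2 -> {values[2]:.6f}")
```

The n = 0 moment approaches π/2 independently of ε, while the higher moments vanish with m. This is why, in a Taylor expansion of the spectral density, only the constant coefficient can contribute at leading order in the chiral limit.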

Conclusions
In this paper, we derived some rigorous results on the violation of the U(1) A symmetry in two-flavor QCD at T > T c , which is characterized by the difference of chiral susceptibilities, χ π − χ δ (see (3.19)), and is parametrized by f A (see (3.3) for the definition). We clarified how the different topological sectors conspire to violate the U(1) A symmetry and how the violation varies with the spatial volume of the system. We demonstrated that any moment of the topological charge at T > T c can be obtained once just a single parameter, f A , is fixed. We also derived new spectral sum rules and a Banks-Casher-type relation that relate the anomaly strength f A to statistical correlations in Dirac spectra. As a by-product of the sum rules, we found a simple proof of the Aoki-Fukaya-Taniguchi "theorem" on the effective restoration of the U(1) A symmetry [13]. Since nontrivial assumptions are required to prove this theorem, we cannot yet conclude that the U(1) A symmetry is restored in QCD. However, we hope that our simplified proof helps to clarify the importance of these assumptions.
All of our new exact relations can, in principle, be tested on the lattice. In particular, the relation (3.32) can be used to extract the value of f A at T > T c even in a small volume with fixed topology (Q = 0). Finally, we note that the determination of f A should also be important from a phenomenological point of view, as it is related, through (3.9), to the temperature-dependent mass of the QCD axion, an input for the evolution of the axion density, which might account for the dark matter density of the universe; see, e.g., [56,57] for recent works.
where j µ = ψ̄γ µ ψ is the vector current and j Aµ = ψ̄γ µ γ 5 ψ is the axial current. The color and flavor degrees of freedom are suppressed for simplicity. We take the rest frame of the medium as η µ = (1, 0).

B Another microscopic scaling
In (4.12), all dimensionful quantities are rescaled by √(2V 4 f A ). On the other hand, it is also allowed (from a mathematical point of view) to use √(2V 4 f 2 ) for the rescaling. Defining the rescaled variables accordingly, we obtain from (4.5) a modified spectral relation. We suspect, however, that this scheme is precarious because f 2 is dominated by UV-divergent contributions from the perturbative Dirac spectra R 1 (λ) ∼ λ³, which generally depend on the regularization scheme. In contrast, f A is free from UV divergences (see Appendix B in [49]). This leads us to consider the microscopic scaling by f A as the most natural one.

C Derivation of (4.19)
The relation (4.19) for uncorrelated Dirac spectra can be shown as follows. With 2N Dirac eigenvalues {±iλ n } (n = 1, ..., N), the product ρ A (λ)ρ A (λ′) is, from (4.1), a double sum over k and l of the terms {δ(λ − λ k ) + δ(λ + λ k )}{δ(λ′ − λ l ) + δ(λ′ + λ l )}. We separate the terms with k = l from those with k ≠ l and, for the latter, factorize the average, which is justified by the absence of correlations among different eigenvalues. Then, using a trivial identity, we arrive at the desired formula. One can see by integrating over λ and λ′ that both sides are correctly normalized to 4N².
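For concreteness, the steps of this appendix can be sketched as follows. The notation is an assumption made for this sketch: ρ^A(λ) denotes the sum of δ-functions over all 2N eigenvalues without any volume normalization, and the N chiral pairs are taken to be independent and identically distributed, so that each pair contributes ⟨ρ^A(λ)⟩/N to the average density. The displayed form of (4.19) in the main text may differ from this by normalization conventions.

```latex
\begin{align}
\rho^A(\lambda) &= \sum_{k=1}^{N}
  \bigl\{\delta(\lambda-\lambda_k)+\delta(\lambda+\lambda_k)\bigr\}, \\
\bigl\langle \rho^A(\lambda)\,\rho^A(\lambda')\bigr\rangle
 &= \sum_{k=1}^{N}\Bigl\langle
   \bigl\{\delta(\lambda-\lambda_k)+\delta(\lambda+\lambda_k)\bigr\}
   \bigl\{\delta(\lambda'-\lambda_k)+\delta(\lambda'+\lambda_k)\bigr\}
   \Bigr\rangle \notag\\
 &\quad + \sum_{k\neq l}
   \bigl\langle\delta(\lambda-\lambda_k)+\delta(\lambda+\lambda_k)\bigr\rangle
   \bigl\langle\delta(\lambda'-\lambda_l)+\delta(\lambda'+\lambda_l)\bigr\rangle , \\
\bigl\{\delta(\lambda-\lambda_k)+\delta(\lambda+\lambda_k)\bigr\}
\bigl\{\delta(\lambda'-\lambda_k)+\delta(\lambda'+\lambda_k)\bigr\}
 &= \bigl\{\delta(\lambda-\lambda')+\delta(\lambda+\lambda')\bigr\}
    \bigl\{\delta(\lambda-\lambda_k)+\delta(\lambda+\lambda_k)\bigr\}, \\
\bigl\langle \rho^A(\lambda)\,\rho^A(\lambda')\bigr\rangle
 &= \bigl\{\delta(\lambda-\lambda')+\delta(\lambda+\lambda')\bigr\}
    \bigl\langle\rho^A(\lambda)\bigr\rangle
  + \Bigl(1-\frac{1}{N}\Bigr)
    \bigl\langle\rho^A(\lambda)\bigr\rangle
    \bigl\langle\rho^A(\lambda')\bigr\rangle .
\end{align}
```

Integrating the last line over λ and λ′ gives 2 · 2N + (1 − 1/N)(2N)² = 4N², in agreement with the normalization quoted above. If the connected two-point function is normalized by one power of V 4 , with R 1 = ⟨ρ^A⟩/V 4 (again an assumption of this sketch), the 1/N term becomes −(V 4 /N) R 1 (λ)R 1 (λ′), consistent with the combination V 4 /N that appears in the main text.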