Spontaneous $CP$ breaking in QCD and the axion potential: an effective Lagrangian approach

Using the well-known low-energy effective Lagrangian of QCD --valid for small (non-vanishing) quark masses and a large number of colors-- we study in detail the regions of parameter space where $CP$ is spontaneously broken/unbroken for a vacuum angle $\theta= \pi$. In the $CP$-broken region there are first order phase transitions as one crosses $\theta=\pi$, while on the (hyper)surface separating the two regions, there are second order phase transitions signaled by the vanishing of the mass of a pseudo Nambu-Goldstone boson and by a divergent QCD topological susceptibility. The second order point sits at the end of a first order line associated with the $CP$ spontaneous breaking, in the appropriate complex parameter plane. When the effective Lagrangian is extended by the inclusion of an axion these features of QCD imply that standard calculations of the axion potential have to be revised when the QCD parameters fall in the above mentioned $CP$-broken region, in spite of the fact that the axion solves the strong-$CP$ problem. These latter results could be of interest for axionic dark matter calculations if the topological susceptibility of pure Yang-Mills theory falls off sufficiently fast when temperature is increased towards the QCD deconfining transition.


Introduction
Already in the early seventies Dashen recognized [1] that phases in the quark mass matrix could spontaneously break $CP$, and the possibility that such a phenomenon could explain the observed $CP$ violation in kaon physics was explored [2]. It turned out that these violations were too large to explain the experiments with $K$ mesons and would give a much too high value for the electric dipole moment of the neutron and for the $\eta \to 2\pi$ decay amplitude [3]. At about the same time Weinberg pointed out [4] that possible $CP$ violating phases can be eliminated through chiral rotations of the quark fields. These rotations included an anomalous $U_A(1)$ transformation and therefore generated a $CP$ violating term proportional to $F\tilde{F}$. However, at the time such a term was considered innocuous since it amounts to adding a total derivative to the Lagrangian (and, indeed, it is irrelevant to all orders in perturbation theory). It therefore looked as if QCD automatically conserved $CP$.
The phenomenological problem with that naive conclusion is that the same triviality of $F\tilde{F}$ implies the famous $U(1)$ problem, expressed for instance by the anomalously large $\eta'$ mass. After the discovery of the instanton solutions and of the presence of different topological sectors in pure Yang-Mills (YM) theory, it was soon realized [5] that the $U(1)$ problem might be solved, although this remained somewhat controversial for a while [6]. The observation [7,8] that, in the framework of large-$N$ QCD, the mass matrix of the mesons contains, besides the terms related to the quark masses, an extra parameter connected to the topological susceptibility of pure YM theory opened the way to a quantitative resolution of the $U(1)$ problem [9,10,11,12]. Unfortunately, the resolution of the $U(1)$ problem brought back the question of $CP$ conservation in strong interactions. Indeed, $CP$ violating phases of the quark mass matrix could no longer be rotated away, so that QCD would not automatically preserve $CP$. The YM Lagrangian could be supplemented with an extra term, given by the topological charge density and containing a parameter, the so-called vacuum angle $\theta$, that also breaks $CP$. By performing an anomalous $U_A(1)$ transformation of the quark fields, it turns out that the relevant observable quantity is a combination of the $\theta$ parameter and the phases present in the quark mass matrix $M$, given by $\bar\theta \equiv \theta + \arg\det m$. The $CP$ violation induced by a nonvanishing $\bar\theta$ was first used to estimate the resulting electric dipole moment of the neutron in [14]. The estimate was later refined in [15] by identifying a leading logarithmic contribution, thus establishing a limit on $\bar\theta$ of order $10^{-9}$ to $10^{-10}$, for the smallness of which QCD, on its own, has no explanation. In Sect. 4 we will come back to this problem and to its resolution with the help of an axion.
The next step was the construction and study of an extension [16,17,18,19] of the effective Lagrangian of the light pseudo Nambu-Goldstone bosons (the non-linear $\sigma$-model) to include a term linear in the topological charge density, reproducing both the $U_A(1)$ anomaly and the $\theta$ term of the microscopic theory, as well as a quadratic term whose coefficient is associated with the topological susceptibility of pure YM theory.
The $\theta$ dependence of physical quantities, in the framework of the effective Lagrangian for mesons, was studied in detail in Refs. [17,19], where it was found that $CP$ is broken for a generic non-zero value of $\theta$ while, for $\theta = \pi$ (where $CP$ is a symmetry of the theory), it could be either spontaneously broken or unbroken, depending on the values of the quark masses and the topological susceptibility.
The possibility of spontaneous breaking of $CP$ from the introduction of phases in the quark mass matrix was taken up again in [24,25,26] in the framework of the low-energy effective Lagrangian for the pseudoscalar mesons, where it was shown that at $\theta = \pi$ there are indeed two regions in parameter space, one where $CP$ is conserved and the other where $CP$ is broken, separated by a surface whose shape depends on the quark mass ratios. An important result of the analysis of Ref. [25] is that, on the separating surface, one of the mesons becomes massless.
Recently, the discussion of the case θ = π has been taken up again in a very interesting paper [27] where it was proven, under a few very plausible assumptions, that, even for finite N , CP must be spontaneously broken at θ = π in SU (N ) YM theory. The main ingredient in the derivation of this result is the use of 't Hooft's anomaly constraint for the mixed anomaly of the discrete CP and center symmetries. This first order transition nicely fits with the spontaneous CP breaking in QCD at θ = π in the decoupling (heavy quark mass) limit.
In the first part of this paper we discuss again the $\theta$ dependence of chiral, large-$N$ QCD in its low-energy approximation, using the above mentioned effective Lagrangian and concentrating our attention on what happens in the neighborhood of $\theta = \pi$. Besides the quark masses, parametrized in terms of the $N_f$ parameters $-2 m_i \langle\bar\psi\psi\rangle \equiv \mu_i^2 F_\pi^2$, there is an additional parameter, the topological susceptibility of YM theory, $\chi_{YM}$, which, as already mentioned, plays a crucial role in the large-$N$ resolution of the $U(1)$ problem. In this enlarged parameter space (with respect to the one considered in [25]) there is a hypersurface separating the region where $CP$ is conserved from the one where $CP$ is spontaneously broken. On the hypersurface itself the theory exhibits a second order phase transition where one of the pseudo Nambu-Goldstone bosons (PNGBs) becomes exactly massless and the topological susceptibility of QCD diverges. Inside the $CP$ broken region the ground state makes a sudden, finite jump as $\bar\theta$ goes from $\pi^-$ to $\pi^+$, corresponding to a first order phase transition. In an appropriate complex parameter space (discussed in Sect. 3) the second order point resides at the endpoint of a first order line associated with $CP$ breaking and starting at $-\infty$, where the decoupling to YM occurs. The position of the second order end-point depends on all the other parameters (mass ratios, topological susceptibility).
These results can be seen as a rather straightforward generalization of those of [25,26] to the case of a generic value of χ Y M and of [28,29] to the case of a generic quark mass matrix (the equal mass case is indeed quite special since it is always in the CP broken phase except in the case of a single light flavor). In [29] the issue of CP breaking in QCD was also addressed for finite N , and the theories residing on the resulting domain walls were studied.
In the second part of this paper we turn our attention to the case in which QCD has been augmented by the addition of an axion field, the best known way to solve, in a natural way, the strong-CP problem. The axion can be easily incorporated in the effective Lagrangian (see e.g. [23]). We then find that the QCD results of the previous Sections have an interesting bearing on the properties of the axion potential near the boundary of its periodicity interval. Depending again on where one is in the QCD parameter space the axion potential can differ significantly from the one commonly used in the literature (see e.g. [30]). Furthermore, in the immediate vicinity of the critical hypersurface the very concept of an axion potential ceases to be physically meaningful since the dynamics is described by two very light pseudoscalars whose mass is of the order of the geometric mean between the PNGB mass and the conventional axion mass. Quite naturally, in that region the mass eigenstates are strongly mixed combinations of the two. Although at zero temperature real QCD is quite deeply inside the CP conserving region, one cannot exclude a-priori the possibility that, as one moves towards the deconfining, chiral-symmetry-restoring temperature, QCD may move (in parameter space) towards the critical hypersurface or even inside the CP breaking region. If true, this could have interesting physical effects, e.g. on the standard computation of axionic dark matter abundance. As we will discuss, some precise lattice calculations in quenched QCD at finite temperature would be highly desirable in order to settle this point.
The paper is organized as follows. In Sect. 2 we review the main properties and consequences of the low-energy effective Lagrangian at generic values of the $\theta$ angle and quark masses. In Sect. 3.1 we study in detail the behavior at $\theta = \pi$ in the case of a single flavor, while in Sects. 3.2 and 3.3 we discuss the cases of two flavors and of more than two flavors, respectively. Non-trivial checks that the results derived from the effective Lagrangian exactly satisfy general Ward-Takahashi identities (WTIs) are presented in Appendix A. In Sect. 4 we consider QCD with a very generic additional axionic degree of freedom and discuss the axion potential in the different situations described above. In particular we examine the "realistic" case of two or three unequal mass light flavors. Some final remarks are presented in Sect. 5.
Chiral, large-N QCD at arbitrary θ: a reminder
For the sake of being self-contained we summarize in this section some already known facts. We will refer, where appropriate, to the original literature for further details.
Assuming confinement and spontaneous chiral symmetry breaking by a quark-antiquark condensate at a generic value of θ, QCD, for three light quarks (m i Λ QCD ) and a large number of colors (N 1) 3 , is described at low-energy by the following effective La-grangian [16,17,18,19] Here F π is the pion decay constant (F π ∼ 95M eV in the real world with N = 3) 4 and the 3 × 3 matrix U describes, non-linearly, the spontaneous breaking of the approximate U (3) L ⊗ U (3) R chiral symmetry in terms of nine light PNGBs so that where T a ij are the matrices satisfying the algebra of U (3) normalized as Tr(T a T b ) = δ ab . Furthermore, µ 2 is proportional to the quark mass matrix 5 which, without loss of generality, can be taken to be real, diagonal and non negative (provided a θ-term is added). More precisely, in terms of the quark masses m i and condensate at θ = 0, ψ ψ , µ 2 is defined by Although the physically relevant case is the one with two or three light flavors, for the sake of generality, we will consider hereafter the case of N f light flavors (hence now i, j = 1, . . . , N f ). Q is the QCD topological charge density that appears in the divergence of the U A (1) current Modulo the mass term, the Lagrangian as needed. The quadratic term in Q contains a coefficient, χ Y M , which turns out to be nothing but the topological susceptibility of pure YM theory in the large-N limit. Finally, the last term takes into account of the presence of a non-zero θ parameter. 4 Remember that F π grows like √ N for large N . 5 In the literature µ 2 is often denoted by M . In this paper we prefer this different notation in order to avoid confusion with a different use of the symbol M .
The $2\pi$ periodicity in $\theta$ (which in the underlying QCD theory is related to the quantization of $\nu \equiv \int d^4x\, Q(x)$) can be easily checked at the level of (2.1). Indeed, a shift in $\theta$ by $2\pi$ can be reabsorbed, thanks to the anomaly term in (2.1), by a chiral rotation by $2\pi$ of a component (say $U_{11}$) of $U$ under which even the mass term in (2.1) is invariant. We also note that, under $CP$, $Q \to -Q$ and $U \to U^\dagger$. Thus naively, in our convention of real positive quark masses, only the last term in (2.1) breaks $CP$ unless $\theta = 0$. However, even if $\theta = \pm\pi$, $CP$ is not explicitly broken since $2\pi$ periodicity implies that $\theta = +\pi$ and $\theta = -\pi$ are equivalent. Nonetheless, as discussed below, $CP$ can be spontaneously broken at $\theta = \pm\pi$.
In the infinite-N limit the anomaly effectively turns off and the physical PNGB spectrum consists of N 2 f unmixed states of mass In general, one could add to the previous Lagrangian a U (N f ) L ⊗ U (N f ) R invariant function of Q, U and U † . However, it can be shown [16,17,18,19] that the only surviving terms at large N are those appearing in (2.1). Before we proceed further let us notice that the Lagrangian (2.1) for a single flavor is exactly the Lagrangian one gets by using the two-dimensional bosonization rules in the massive Schwinger model, where the kinetic term of the gauge field corresponds to the first term in the second line of (2.1) with a ≡ e 2 π , F π = 1 √ 2π , while the term coupling the fermions to the gauge field corresponds to the anomaly term with the logarithm. The other terms are also reproduced as also noticed in Ref. [26]. A similar structure appears also in other two-dimensional models as the one discussed in Ref. [31]. In those models, as also in the massive Schwinger model, the bosonized Lagrangian is equivalent to the original microscopic Lagrangian, while, in our case, the effective Lagrangian (2.1) is only valid at low energy, for small quark masses, and for large N . However, the fact that in all these cases one gets the same Lagrangian indicates that our results may not necessarily be valid only at large N .
Since the equation of motion of Q(x) is algebraic, we could integrate out Q(x) from the start. However, as later on we will want to compute the QQ correlator, we prefer to rewrite Eq. (2.1) as follows: The presence of the θ term implies that, for unequal masses, the vacuum does not correspond anymore to U being proportional to the unit matrix 7 . We are obliged to introduce a separate VEV for each flavor by writing Inserting Eq. (2.8) in the previous Lagrangian the vacua of the theory correspond to the minima of the following potential 9) and are therefore obtained by looking for the stable solutions of the equations (2.11) The Eqs. (2.10) determine φ i and all physical quantities in terms of µ 2 i , a and θ. Denoting this solution by φ i =φ i (µ 2 i , a, θ), and computing Q from the quadratic part of the Lagrangian in (2.14), we finally identify Q with Defining a newÛ matrix in terms of the shifted fieldŝ as well a shifted Q fieldQ we get a Lagrangian that depends onÛ andQ as follows where we have defined The first line of Eq. (2.14) (apart from the first term which is a constant) describes the spectrum and the interaction of the PNGBs, the second, being odd underΦ → −Φ, gives the CP violating contributions (controlled by its coefficient , and the third line will be useful to determine the topological susceptibility in QCD. As we shall see below, while for θ = 0 the CP violating coefficient is zero, for θ = ±π it can be non-zero. The latter case has to be attributed to the spontaneous breaking of CP by some non-CP -invariant VEVs. The spectrum of the PNGBs is obtained by restricting our attention to the terms quadratic inΦ, coming from the first line of (2.14), for which we get Separating inΦ the generators in the Cartan sub-algebra from the otherŝ we have from L 2 the following two-point correlation functions in momentum space and H ij is a matrix with 1 in all entries. 
The masses M i of the physical states in the Cartan sub-algebra are obtained by diagonalizing the matrix M 2 ij and satisfy the equation For p 2 = 0 one gets In the last part of this section we use the Lagrangian (2.14) to compute the two-point correlator ofQ (note that, by definition Q = v i = 0) and relate the topological susceptibilities of YM and QCD. Since there is no quadratic term involving v i with the combination ofQ and v j appearing in the last line of Eq. (2.14), we get immediately the following two-point correlation function  where in the last step we have used Eq. (2.19). From the relation .

(2.26)
Finally, from the last line of Eq. (2.14) we get  . (2.29) In particular, for p 2 = 0 one gets the topological susceptibility in QCD with N f flavors Since our effective Lagrangian is, strictly speaking, valid for N → ∞ (where the η is a PNGB), the quark condensate in the previous equation should be evaluated in the leading planar order proportional to N . The next to the leading terms should not be included.
In particular, this means that the next-to-leading contributions, which are affected by logarithmic divergences [32,33,34] and make the quenched quark condensate ill-defined, are avoided. As a last remark we wish to stress an important property of both Eqs. (2.21) and (2.30), namely that they both reduce to the case of a theory with $N_f - 1$ flavors when one of the quark masses becomes very large. If all quarks become much heavier than $a$ (which can still be the case in the chiral regime since $a$ scales like $1/N$ at large $N$) then $\chi_{QCD} \to \chi_{YM}$. Finally, when any quark flavor becomes massless the QCD topological susceptibility goes to zero, as it should on general grounds.
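The limiting behaviours just listed can be checked numerically with a minimal sketch. It assumes that the susceptibility has the closed form $\chi_{QCD}/\chi_{YM} = 1/(1 + a\sum_i 1/\mu_i^2(\theta))$, with $\mu_i^2(\theta) \equiv \mu_i^2\cos\phi_i$; this form is an assumption adopted here because it reproduces the three limits quoted in the text, and the function name `chi_ratio` is purely illustrative.

```python
import math

def chi_ratio(mu2_eff, a):
    """Assumed ratio chi_QCD / chi_YM = 1 / (1 + a * sum_i 1/mu_i^2(theta)).

    mu2_eff: list of effective mass parameters mu_i^2 * cos(phi_i); entries may
             be negative away from theta = 0.
    a:       the parameter proportional to chi_YM (same units as mu_i^2).
    """
    if any(m == 0.0 for m in mu2_eff):
        return 0.0                      # a massless flavor kills the susceptibility
    denom = 1.0 + sum(a / m for m in mu2_eff)
    if denom == 0.0:
        return math.inf                 # critical point: the susceptibility diverges
    return 1.0 / denom

if __name__ == "__main__":
    print(chi_ratio([1e9, 1e9], 1.0))   # all quarks heavy: close to 1
    print(chi_ratio([0.0, 1.0], 1.0))   # one massless flavor: 0
    print(chi_ratio([0.5, 1e12], 1.0))  # one heavy flavor decouples
    print(chi_ratio([0.5], 1.0))        # compare with the one-flavor value
```

In this parametrization a single flavor at $\theta = \phi = \pi$ has $\mu^2(\theta) = -\mu^2$, so the ratio diverges at $\mu^2 = a$, which is the second order point discussed in the next section.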
In Appendix A we provide the form of various two-point functions at small (but not necessarily vanishing) momenta and show that they satisfy exactly (i.e. without O(1/N ) corrections) all the expected anomalous and non-anomalous Ward-Takahashi identities (WTIs).

QCD phase diagrams
In this section we discuss the phase diagrams of QCD at zero temperature and chemical potential for different numbers of quark flavors N f . The parameter space in which we consider possible phase transitions is spanned by the (N f + 1) parameters µ 2 i ≥ 0 and θ (with 0 ≤ θ < 2π) while considering χ Y M and F π (and thus a) as given. In Sect. 4 we will see how those phase diagrams acquire a different meaning in the presence of a QCD axion and also briefly mention possible non-zero temperature effects.
To make our terminology clear: we will be talking about $CP$ conservation or violation referring, respectively, to the vanishing or non-vanishing of the quantity $\chi_{YM}\big(\theta - \sum_{j=1}^{N_f}\phi_j\big)$ in Eq. (2.14). Sometimes the breaking of $CP$ is explicit (e.g. for generic values of $\theta$) while in other cases it is spontaneous (as for $\theta = \pi$). We will try to make the distinction when needed in order to avoid confusion.

N f = 1
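In the case of a single flavor the potential in Eq. (2.9) becomes, up to an irrelevant factor,

$$V(\phi) = -\epsilon\cos\phi + \frac{1}{2}(\theta-\phi)^2, \qquad \epsilon \equiv \frac{\mu^2}{a}, \tag{3.1}$$

from which we can compute its derivatives with respect to $\phi$:

$$V' = \epsilon\sin\phi - (\theta-\phi), \qquad V'' = \epsilon\cos\phi + 1, \qquad V''' = -\epsilon\sin\phi. \tag{3.2}$$

Let us distinguish two cases.

$\epsilon < 1$. In this case $V'' > 0$ so that there can only be a single stable minimum with positive mass. This is confirmed by solving graphically the equation $V' = 0$, as illustrated in Fig. 1. At $\theta = 0$ the minimum is at $\phi = 0$ while at $\theta = \pi$ it is at $\phi = \pi$. In both cases $CP$ is unbroken. At $0 < \theta < \pi$ ($\pi < \theta < 2\pi$) the minimum is at some $0 < \phi < \theta$ ($\theta < \phi < 2\pi$) and $CP$ is explicitly broken.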
$\epsilon \ge 1$. This case is much richer. Since now $V''$ can be negative, some stationary points can correspond to maxima rather than minima of $V$. For a zero mass ground state we should require $V' = V'' = 0$. But for it to be the absolute minimum we should also have $V''' = 0$ and $V'''' > 0$. However, from (3.2) we see that $V''' = 0$ is only possible if $\phi = \pi$ mod$(\pi)$ and therefore (from the first and last of Eqs. (3.2)) if $\theta = \pi$. Let us then consider this case in more detail. For $\theta = \pi$ there is always a stationary point at $\phi = \pi$ which, however, for the case $\epsilon > 1$, corresponds to a maximum ($V'' < 0$). Since $V$ is bounded from below there should be minima elsewhere. Indeed, for $\epsilon = 1 + \delta$, $\delta \ll 1$, one easily finds two (degenerate) minima. For $\epsilon = 1$ the three stationary points degenerate at $\phi = \pi$ and the stable minimum corresponds to a massless $CP$ conserving ground state.
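To make the discussion more quantitative let us assume that $\theta = \pi$ and that $\phi = \pi - \delta$ where $\delta$ is a small quantity. We can determine $\delta$ by plugging it into the first equation in (3.2), getting

$$\epsilon\sin\delta - \delta = 0 \quad\Longrightarrow\quad \delta\left[(\epsilon - 1) - \frac{\epsilon}{6}\,\delta^2 + O(\delta^4)\right] = 0. \tag{3.3}$$

In this way we find again the solution $\delta = 0$, which corresponds to a maximum for $\epsilon > 1$, together with two stable minima related by $CP$ (see below) at

$$\delta = \pm\sqrt{\frac{6(\epsilon-1)}{\epsilon}}. \tag{3.4}$$

This can be seen by plugging (3.4) in the second of the equations (3.2), obtaining respectively

$$V''\big|_{\delta=0} = 1 - \epsilon, \qquad V''\big|_{\delta\neq 0} \simeq 2(\epsilon - 1). \tag{3.5}$$

This implies that the solution with $\delta = 0$ is a stable one for $\epsilon \le 1$, while the two other solutions are stable for $\epsilon > 1$ (see Fig. 2). At $\epsilon = 1$ there is a second order phase transition where the PNGB becomes massless. Indeed the mass squared is given by the second derivative of the potential computed at the minimum, yielding

$$M^2 = a\,(1 + \epsilon\cos\phi) = a + \mu^2\cos\phi, \tag{3.6}$$

as follows from (2.21) with $N_f = 1$. Notice that $M^2$ goes to zero for $\epsilon = 1$, $\theta = \phi = \pi$.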
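The small-$\delta$ analysis above can be cross-checked numerically. The following is a minimal sketch, assuming the reduced one-flavor potential $V(\phi) = -\epsilon\cos\phi + \frac{1}{2}(\theta-\phi)^2$ with $\epsilon \equiv \mu^2/a$ (a normalization assumed here, chosen to match the properties of $V'$, $V''$ and $V'''$ described in the text); the helper names are illustrative only.

```python
import math

def d2V(phi, eps):
    """Second derivative of the assumed reduced potential
    V(phi) = -eps*cos(phi) + (theta - phi)**2 / 2; it does not depend on theta."""
    return 1.0 + eps * math.cos(phi)

def broken_offset(eps, iters=200):
    """Positive root delta of eps*sin(delta) = delta on (0, pi).

    At theta = pi the stationary points are phi = pi and, for eps > 1,
    phi = pi -/+ delta. For eps <= 1 only phi = pi exists and 0.0 is returned.
    """
    if eps <= 1.0:
        return 0.0
    f = lambda d: eps * math.sin(d) - d
    lo, hi = 1e-9, math.pi              # f(lo) > 0 > f(hi) when eps > 1
    for _ in range(iters):              # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for eps in (1.01, 1.1, 2.0):
        delta = broken_offset(eps)
        leading = math.sqrt(6.0 * (eps - 1.0) / eps)   # leading-order estimate
        print(eps, delta, leading, d2V(math.pi - delta, eps), d2V(math.pi, eps))
```

Close to $\epsilon = 1$ the exact root should approach the leading-order estimate, with positive curvature at $\phi = \pi \mp \delta$ and negative curvature at $\phi = \pi$; further from the critical point the leading-order formula is only a rough guide.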
If we move away from $\theta = \pi$ while $\epsilon > 1$ we can have different situations. Below a critical $\epsilon(\theta)$ there is only one minimum, while above it an extra pair of stationary points appears. One of them is a local maximum, the other a local minimum. Which is the absolute minimum depends on $\theta$. For $\theta < \pi$ the true minimum is at $\phi < \theta$ while for $\theta > \pi$ it is at $\phi > \theta$, as illustrated in Fig. 3. Precisely at $\theta = \pi$ there is a two-fold degeneracy, easily understood as due to the spontaneous breaking of $CP$. This abrupt change in the minimum of the potential around $\theta = \pi$ signals a first order phase transition all along the line $\mu^2 e^{i\theta} \in [-\infty, -a]$, ending at the second order phase transition point $\theta = \pi$, $\mu^2 = a$, as first observed in [26] and more recently discussed in [28,29].
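The jump of the ground state across $\theta = \pi$ can be illustrated with a brute-force scan. This is only a sketch, again assuming the reduced potential $V(\phi) = -\epsilon\cos\phi + \frac{1}{2}(\theta-\phi)^2$ with $\epsilon \equiv \mu^2/a$ and treating $\phi$ as a real variable on a single branch; the function names and the grid size are arbitrary choices made for illustration.

```python
import math

def V(phi, eps, theta):
    """Assumed reduced one-flavor potential (overall factor dropped)."""
    return -eps * math.cos(phi) + 0.5 * (theta - phi) ** 2

def global_minimum(eps, theta, n=100001):
    """Grid search for the absolute minimum in phi.

    Since V(theta) <= eps and V >= -eps + (theta - phi)**2 / 2, the minimum
    satisfies |theta - phi| <= 2*sqrt(eps), so the window below always contains it.
    """
    half = 2.0 * math.sqrt(eps) + 1.0
    lo = theta - half
    step = 2.0 * half / (n - 1)
    return min((lo + k * step for k in range(n)), key=lambda p: V(p, eps, theta))

if __name__ == "__main__":
    for eps in (0.5, 2.0):
        below = global_minimum(eps, math.pi - 0.01)
        above = global_minimum(eps, math.pi + 0.01)
        print(eps, below, above, above - below)
```

For $\epsilon < 1$ the minimum should move continuously through $\phi = \pi$, whereas for $\epsilon > 1$ it should jump by a finite amount between the two $CP$-conjugate minima as $\theta$ crosses $\pi$.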
The second order phase transition is signalled not only by the mass gap going to zero, but also by the divergence of the topological susceptibility (generally defined as the $QQ$ correlator at zero momentum) at $\epsilon = 1$, $\theta = \pi$. This follows from Eq. (2.30) for $N_f = 1$, which diverges for $\epsilon = 1$ at $\theta = \phi = \pi$.
Figs. 2 and 3 illustrate the shape of the potential for different values of $\epsilon$ and for $\theta = \pi$ or $\theta \neq \pi$, respectively. Note that the potentials shown in Figs. 2 and 3 do not look periodic in $\phi$ while they should. Indeed the potential is multi-valued because of the log term in the effective Lagrangian (2.7) and the correct branch has to be chosen as we vary $\phi$. Periodicity is thus restored at the expense of non-analyticity points (cusps) in $V$ at particular values of $\phi$. For instance, for $\theta = \pi$ (Fig. 2) the cusps are at $\phi = 0$ mod$(2\pi)$, while for a generic $\theta$ they are at $\theta + \pi$ mod$(2\pi)$.

N f = 2
In the case N f = 2 with unequal masses (say, µ 2 1 < µ 2 2 ) the equations to be solved are For θ = π the solutions are simply The masses of the two pseudoscalar mesons can be read from Eq. (2.22) and are given by (3.10) valid for arbitrary θ. It is easy to check that the mass squared with the minus sign is massless if the following condition is satisfied (3.11) Notice that, if both µ 2 1,2 (θ) are positive, the previous condition cannot be satisfied because the r.h.s. is always negative, while the l.h.s. is always positive. In particular, it cannot be satisfied at θ = 0. But at θ = φ 1 = π, the previous condition becomes This means that, if the condition is fulfilled, CP is unbroken because θ − φ 1 − φ 2 = 0. Although the second solution in (3.9) conserves CP , it does not correspond to the absolute minimum and does not satisfy (3.11).
On the other hand, if µ −2 1 < µ −2 2 + a −1 not even the first solution in Eq. (3.9) corresponds to a minimum and other solutions takes over. As in the case N f = 1, let us consider the following example. Defining one finds, to leading order in σ 1, the two further solutions In the general case the solutions can be found numerically. Fig. 4 illustrates again the three distinct cases for θ = π, while Fig. 5 does the same for θ = π. We see clearly that, as in the N f = 1 case, the critical surface µ −2 1 = µ −2 2 + a −1 separates the situation with a single solution from the one with several solutions. In the latter case CP is spontaneously broken and the ground state jumps as we go from θ < π to θ > π. On the critical surface there is a massless excitation and the QCD topological susceptibility blows up.
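The role of the critical surface $\mu_1^{-2} = \mu_2^{-2} + a^{-1}$ can be illustrated by a stability check of the $CP$ conserving point. The sketch below assumes the reduced two-flavor potential $V = -\epsilon_1\cos\phi_1 - \epsilon_2\cos\phi_2 + \frac{1}{2}(\theta - \phi_1 - \phi_2)^2$ with $\epsilon_i \equiv \mu_i^2/a$ and $\mu_1^2 \le \mu_2^2$ (an assumed normalization), evaluated at $\theta = \pi$, $(\phi_1, \phi_2) = (\pi, 0)$; all function names are hypothetical.

```python
def hessian_cp_conserving(eps1, eps2):
    """Hessian of the assumed reduced potential at theta = pi, (phi1, phi2) = (pi, 0).

    d^2V/dphi_i dphi_j = delta_ij * eps_i * cos(phi_i) + 1, with cos(pi) = -1, cos(0) = 1.
    """
    return [[1.0 - eps1, 1.0],
            [1.0, 1.0 + eps2]]

def is_local_minimum(h):
    """A symmetric 2x2 matrix is positive definite iff determinant and trace are positive."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    trace = h[0][0] + h[1][1]
    return det > 0.0 and trace > 0.0

def cp_conserving_by_criterion(mu1_sq, mu2_sq, a):
    """Criterion quoted in the text: CP is conserved if 1/mu1^2 > 1/mu2^2 + 1/a."""
    return 1.0 / mu1_sq > 1.0 / mu2_sq + 1.0 / a

if __name__ == "__main__":
    a = 1.0
    for mu1_sq, mu2_sq in ((0.3, 1.0), (0.8, 1.0), (0.5, 0.5)):
        h = hessian_cp_conserving(mu1_sq / a, mu2_sq / a)
        print(mu1_sq, mu2_sq, is_local_minimum(h),
              cp_conserving_by_criterion(mu1_sq, mu2_sq, a))
```

Under these assumptions the determinant of the Hessian is $\epsilon_2 - \epsilon_1 - \epsilon_1\epsilon_2$, which is positive exactly when $\mu_1^{-2} > \mu_2^{-2} + a^{-1}$, so the two tests should always agree.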
In this generic case the phase structure resembles the N f = 1 case. In the complex µ 2 1 e iθ plane (µ 2 1 is the smallest mass parameter) we find a line of first order transitions along the negative axis ending on a second order transition point where one mass goes to zero. The position of the second order point depends on the other parameters (mass ratios, a). We can also see this structure in the complex det µ 2 plane, as discussed in the next subsection.
Let us close with a short discussion of the peculiarities of the equal mass case, $\mu_1^2 = \mu_2^2 = \mu^2$. In this case the condition (3.13) cannot be satisfied except, asymptotically, if we send $\mu^2/a$ to zero. In other words, as discussed in [29], the first order phase transition line now extends over the whole negative real axis, terminating at the origin. However, before jumping too quickly to this conclusion we should observe that the potential becomes very flat for small $\mu^2/a$, so much so that it develops a flat direction at $O(\mu^2/a)$. This continuous vacuum degeneracy is lifted at $O((\mu^2/a)^2)$ so that the $CP$ violating minimum is found to lie $O((\mu^2/a)^2)$ below the $CP$ conserving one. The existence of this quasi-flat direction and its lifting at $O(m^2)$ was first pointed out in [24] and further discussed in [29]. In general, $O(m^2)$ corrections are not included in effective Lagrangians like (2.1) but, in the context of our double limit $m/\Lambda \to 0$, $N \to \infty$ with $mN/\Lambda$ fixed (recall $a \sim \Lambda^2/N$), the split in the potential between the two vacua is of order $\Lambda^4 (mN/\Lambda)^2$ while the $O(m^2)$ corrections we are ignoring are at least a factor $1/N$ lower. We can thus conclude that, above a sufficiently large $N$, $CP$ is broken for two equal mass flavors.

N f ≥ 3
For a generic number of flavors we have to solve Eqs. (2.10). It can be immediately seen that for θ = π we have the following solution that generalizes to N f flavors what we found for two flavors, namely 11 It can be immediately checked that the determinant in Eq. (2.21) is positive if the condition is satisfied. In the corresponding region of parameter space we have a CP conserving stable solution since θ − On the surface where (3.17) is replaced by an equality, the topological susceptibility diverges, as follows from Eq. (2.30), and there is a massless state, signalling a second order phase transition. In the region where, instead, ∆ < 0, the solution in Eq. (3.16) ceases to be a minimum and we have to look for new solutions corresponding to minima where we will find that CP is spontaneously broken.
In terms of the dimensionless quantities the criticality condition can be written as and the zero-mass eigenvector is simply given by Clearly the above expression is consistent with decoupling when one of the $\rho$'s goes to zero. $CP$ is broken (unbroken) when the l.h.s. of (3.19) is larger (smaller) than the r.h.s. It is always broken if $\Sigma > 1$. If instead we look at the equal mass case, $\rho_i = 1$, we see that $\Delta < 0$ except in the case $N_f = 1$ and $\mu^2/a < 1$ and in the case $N_f = 2$ and $\mu = 0$ [29]. As before, in the generic mass case we have a line of first order transitions in the complex $\mu_1^2 e^{i\theta}$ plane ending on a second order point where one physical mass goes to zero. The second order point resides at the intersection of the negative $\mu_1^2$ line with the critical hypersurface and its position therefore depends on the other parameters (mass ratios, $a$).
11 We can find many other stationary solutions that preserve $CP$ by choosing an arbitrary number of $\phi_i$ to be $\pm\pi$ with their sum adding up to $\theta = \pi$. However, it is trivial to show that the solution in Eq. (3.16) is, among those, the one with the lowest energy and thus the one to be compared with other (in general $CP$ breaking) solutions.
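The statements about the equal mass case can be checked with a short sketch for an arbitrary number of flavors. It assumes that the two-flavor criterion generalizes to $1/\mu_1^2 > 1/a + \sum_{i\ge 2} 1/\mu_i^2$ for $CP$ conservation (with $\mu_1^2$ the smallest mass parameter), which is what one obtains from the Hessian $\mathrm{diag}(-\epsilon_1, \epsilon_2, \ldots, \epsilon_{N_f})$ plus the all-ones matrix of the assumed reduced potential $V = -\sum_i \epsilon_i\cos\phi_i + \frac{1}{2}(\theta - \sum_i\phi_i)^2$, $\epsilon_i \equiv \mu_i^2/a$. The names `cp_margin` and `hessian_times` are illustrative.

```python
def cp_margin(mu_sq, a):
    """Assumed criterion: positive -> CP conserved at theta = pi, negative -> broken,
    zero -> on the critical hypersurface. mu_sq is a list of the mu_i^2."""
    m = sorted(mu_sq)
    return 1.0 / m[0] - 1.0 / a - sum(1.0 / x for x in m[1:])

def hessian_times(v, eps):
    """Apply H = diag(-eps[0], eps[1], ...) + (all-ones matrix) to the vector v.

    This is the Hessian of the assumed reduced potential at theta = pi,
    phi = (pi, 0, ..., 0), with eps sorted so that eps[0] is the smallest."""
    d = [-eps[0]] + list(eps[1:])
    s = sum(v)
    return [d[i] * v[i] + s for i in range(len(v))]

if __name__ == "__main__":
    print(cp_margin([1.0, 1.0, 1.0], 1.0))   # equal masses, N_f = 3: negative
    print(cp_margin([0.5], 1.0))             # N_f = 1, mu^2 < a: positive
    eps = [0.4, 1.0, 2.0]                    # chosen so that 1/0.4 = 1 + 1/1 + 1/2
    v = [-1.0 / eps[0], 1.0 / eps[1], 1.0 / eps[2]]
    print(cp_margin(eps, 1.0), hessian_times(v, eps))  # both vanish up to rounding
```

On the critical hypersurface the candidate null vector, with components proportional to $(-1/\epsilon_1, 1/\epsilon_2, \ldots)$, is annihilated by the Hessian, and its component along a flavor vanishes when that flavor becomes very heavy, in line with the decoupling property mentioned above.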
We end this section giving a definition of the critical hypersurface in terms of the quantity D ≡ det(µ 2 /a 2 ) = det( ), where, however, µ 2 is now the matrix introduced in (2.1) after having absorbed the θ angle by a chiral rotation 12 . The critical value of D, D c , is negative (corresponding toθ = − arg D = ±π) and its absolute value depends only on the ratios ρ i introduced earlier. Indeed the condition for CP violation can be expressed as follows

Spontaneous CP violation and the axion potential
We shall now discuss some consequences of the considerations made in the previous sections when an extra dynamical low-energy degree of freedom, the axion, is added to those of chiral QCD. As pointed out independently by Weinberg [35] and Wilczek [36], the existence of an axion is a necessary consequence of the Peccei-Quinn (PQ) resolution [37] of the strong-$CP$ problem. The latter consists in the observation that present bounds on the electric dipole moment of the neutron force the $\theta$ angle (actually $\bar\theta$) to be less than $10^{-9}$ [15]. Of course, if one of the quarks were massless, the strong-$CP$ problem would be automatically solved since $\theta$ could be rotated away (equivalently $\bar\theta = 0$). Unfortunately, the low-energy spectrum of QCD is inconsistent with the data if one of the quark flavors is massless. A generic way to introduce the PQ resolution of the problem, and the axion, parallels the massless quark solution while avoiding its unwanted consequences. One assumes the existence of a new axial $U(1)$ global symmetry, only broken by the QCD anomaly (in QCD that symmetry would be the chiral rotation of the massless quark field). Then the existence of the axion follows from Goldstone's theorem associated with the spontaneous breaking of this symmetry. The axion is only a PNGB because there is no anomaly-free spontaneously broken exact symmetry. The only additional free parameters with respect to QCD are the so-called axion decay constant $F_\alpha$, the analog of $F_\pi$, and $\alpha_{PQ}$, denoting the strength of the contribution of the new sector to the $U_A(1)$ anomaly. Instead, the $\theta$ parameter can be rotated away, as we shall now discuss in detail.

Including the axion in the QCD effective Lagrangian
In view of the above considerations, the axion can be easily incorporated in the QCD effective Lagrangian discussed in Sect. 2 as if there were an extra zero-mass fermion, condensing at the scale F α , and contributing to the anomaly with a coefficient α P Q (relative to the weight of a QCD fermion). This can be simply implemented by introducing, together with U and Φ, similarly related axionic fields α and N The generalization of the Lagrangian (2.1) then reads 13 Restricting, for the sake of simplicity, our analysis to the fields in the Cartan sub- 13 See Ref. [23].
algebra of the QCD pseudoscalar mesons, the previous Lagrangian becomes where again we have allowed for a non-trivial expectation U as in Eq. (2.8) and we have also introduced an expectation value for α(x) and a shifted axion field σ as α( Proceeding now as in Sect. 2, we determine the phases φ i and β by minimizing The stationary points of this potential are solutions of the equations and are given byφ where V (φ i ,β) is a constant. Thus unlike the QCD case, physics has become θ-independent and CP conserving. As we shall see in the following subsection, the full richness of the QCD case reappears once we consider the axion potential.
For the moment, in analogy with Eq. (2.14), we rewrite (4.8) in the form (4.9) The mass spectrum of the system can be found by diagonalizing the quadratic part of Eq. (4.9) which reads where H is an N f + 1-column vector and A is the squared-mass matrix The mass spectrum is the result of the diagonalization of A and can be read off from (as in Eq. (2.11)) and b = Fπα P Q Fα . The M i are the masses of the physical states that diagonalize the mass matrix. By going to p 2 = 0, Eq. (4.12) implies 13) where the product on the r.h.s. includes the axion as well as the Cartan PNGB masses. Note that, unlike the non-axionic case, for non-vanishing m i , a and b, this determinant is always positive implying no massless state (and indeed a non-tachyonic spectrum). This would have also been the case had we considered QCD with one massless flavor (in that case b = 1). In particular, for small b, the mass of the axion is given by looking for a zero at small p 2 of the term in square brackets in Eq. (4.12). Neglecting p 2 with respect to µ 2 i one obtains (4.14) This reduces to the usual expression for the axion mass [35,38] in the limit a, µ 2 s µ 2 u,d . Alternatively, using Eq. (2.30) and the definition of b, we can write another formula often used in the literature (see e.g. Ref. [39]). Finally, from the term in the last line of Eq. (4.9) and the matrix definition in Eq. (4.10) we get (having Q = 0) the following two-point correlation function (4.16) that vanishes at p 2 = 0 signalling that the topological susceptibility in a theory where QCD is "augmented" by another sector that includes the axion, is zero consistently with the fact that the dependence on the θ parameter disappears.
For the physically interesting case we have to take $b \ll 1$, so that the spectrum contains a very light pseudoscalar, the physical axion, which is the original field $\sigma$ up to an $O(b)$ admixture of PNGBs. This is all well known. We will now discuss how things take an interesting turn when we go from properties of the spectrum (i.e. of small fluctuations around the minimum of $V$) to those of the full potential at a finite distance from its minimum.

The axion potential
From Eq. (4.9) we can immediately read off the axion-PNGB potential, Eq. (4.17). In the literature one introduces the concept of an axion potential after integrating out the remaining $N_f$ degrees of freedom, under the assumption that they are much heavier than the axion. In principle this requires diagonalizing the mass matrix so as to identify the lowest-lying state, the physical axion, which will be a mixture of $\sigma$ and the $v_i$. In the limit of very small $b$, which is where physics lies, one can neglect these mixings and identify $\sigma$ with the axion, modulo some exceptional cases to be discussed below.
For the physically interesting case of two light flavors the axion potential was first derived in [17] under the assumption $\mu_1^2, \mu_2^2 \ll a$, with the result [30]

$$V_{axion}(\sigma) = -\frac{F_\pi^2}{2}\sqrt{\mu_1^4 + \mu_2^4 + 2\mu_1^2\mu_2^2\cos\left(\frac{\sqrt{2}\,\alpha_{PQ}}{F_\alpha}\,\sigma\right)}\,, \qquad (4.18)$$

which for $N_f = 1$ simply becomes

$$V_{axion}(\sigma) = -\frac{F_\pi^2}{2}\,\mu^2\cos\left(\frac{\sqrt{2}\,\alpha_{PQ}}{F_\alpha}\,\sigma\right). \qquad (4.19)$$

We see, however, that by considering the axion potential at a generic value of $\sigma$ we have effectively recovered, mutatis mutandis, the situation discussed in QCD at fixed $\theta$. This is why the discussion of Sect. 3 becomes very relevant here. Indeed, the previous analysis shows that, precisely around $\sigma = \pi F_\alpha/(\sqrt{2}\,\alpha_{PQ})$, some PNGB mass can become arbitrarily small. In this case integrating out the PNGB fields is no longer justified and a more careful analysis is needed. In other cases the naive solution for the $v_i$ corresponds to a maximum and has to be replaced with the right solution. The rest of this section is devoted to such an analysis for different numbers of quark flavors.
In the following, for simplicity of notation, we shall denote by $\varphi_i$ and $\zeta$ the dimensionless quantities $-\frac{\sqrt{2}}{F_\pi}v_i$ and $\frac{\sqrt{2}\,\alpha_{PQ}}{F_\alpha}\sigma$, respectively. In this notation the potential (4.17) takes the simple form of Eq. (4.20). For $N_f = 1$ the potential $V(\zeta, \varphi)$ has two distinct stationary points, one at $\zeta = \varphi = 0$ and one at $\zeta = \varphi = \pi$. The first is a true minimum, the second a saddle point. Let us now consider the stationary points in $\varphi$ at fixed $\zeta$ in order to compute $V_{axion}(\zeta)$, distinguishing three cases (Fig. 1 may help in following the discussion).
• $\mu^2/a < 1$. In this case there is a single stationary point at $\bar\varphi(\zeta) \le \zeta$, which grows monotonically with $\zeta$, interpolating between the two stationary points of $V$. In this case the potential (4.19) is easily recovered. At $\zeta = \pi$ the potential is smooth and reaches a maximum lying $\mu^2F_\pi^2$ above the absolute minimum. One can easily check that, for $\mu^2/a$ not too close to 1, the mass of the PNGB is always much larger than the scale of variation of the axion potential, so that integrating out that degree of freedom is justified. We shall discuss separately the case $|1 - \mu^2/a| \ll 1$.
• $\mu^2/a > 1$. In this case, as one varies $\zeta$ from 0 to $\pi$, $\bar\varphi(\zeta)$ always remains smaller than $\zeta$. Actually, above a value of $\zeta$ that depends on $\mu^2/a$, new stationary points in $\varphi$ (lying above $\varphi = \pi$) appear, but they have higher energy. This is nothing but the situation we have described and discussed around Fig. 3. In particular, as we approach $\zeta = \pi$, $\bar\varphi$ approaches a finite value smaller than $\pi$, behaving as $\pi a/\mu^2$ for $\mu^2/a \gg 1$. Precisely at $\zeta = \pi$ this minimum becomes degenerate with one at $\varphi > \pi$ which, upon a shift by $2\pi$, is just its $CP$ transform. Again, for $\mu^2/a$ not too close to 1, integrating out the PNGB appears fully justified but, instead of (4.19), we get [...], where for a moment we have reintroduced the canonical $\sigma$ field. In particular, the axion mass is now controlled by $a$ rather than by $\mu^2$. At the boundary of its periodicity interval $V_{axion}$ now reaches its maximal value $\frac{1}{2}\chi_{YM}\pi^2 \ll \mu^2F_\pi^2$ (in the small-$a$ limit). Furthermore, at that point its first derivative is non-vanishing (and positive) and, since the potential is periodic, the first derivative is discontinuous there, giving a spike at $\zeta = \pi$. This, of course, is related to the fact that the solution for $\varphi$ jumps abruptly as we go through $\theta = \pi$ (see again Fig. 3).
• $|1 - \mu^2/a| \ll 1$. This third regime is perhaps the most interesting one, at least theoretically. Let us consider the mass matrix (better, the matrix of second derivatives) around $\zeta = \varphi = \pi$. In the (PNGB, axion) basis it takes the form

$$A = \begin{pmatrix} a - \mu^2 & ab \\ ab & ab^2 \end{pmatrix}.$$

We see that, if $|\mu^2 - a| = O(ba)$, the off-diagonal entries become of the same order as the difference between the two diagonal ones (remember that $b \ll 1$). This is precisely the situation in which the two eigenvectors are strongly mixed w.r.t. the original (axion-PNGB) basis. Indeed, the maximal mixing occurs at $\mu^2 = a(1 - b^2)$, since then the matrix $A$ becomes

$$A = \begin{pmatrix} ab^2 & ab \\ ab & ab^2 \end{pmatrix},$$

whose eigenvectors are $(1, \pm 1)$, with eigenvalues $b^2a \pm ba$. In fact, as we go through the point $\mu^2 = a$, the two eigenvectors evolve very quickly (i.e. as $\mu^2$ goes from $a - O(ab)$ to $a + O(ab)$) from almost pure axion to almost pure PNGB or vice versa. This is clearly shown by the numerical calculation presented in Fig. 7. Since $\det A < 0$, the spectrum always consists of a normal and a tachyonic state, but the latter is mainly in the PNGB direction at large $\mu^2$, while it becomes mainly axion-like at small $\mu^2$. This means that, had we started the evolution of the PNGB plus axion system at $\zeta = \varphi = \pi$, the system would move immediately towards smaller $\zeta$ if $\mu^2 < a$ while, for $\mu^2 > a$, it would first roll down to the true minimum in $\varphi$ and only then roll down towards $\zeta = 0$, $\varphi = 0$.
It is also quite clear that in this particular range of $\mu^2/a$ and $\zeta$ it is not possible to describe the system in terms of $V_{axion}(\zeta)$ alone, since the other degree of freedom is as light as the axion itself. Only a description in terms of $V(\zeta, \varphi)$ is fully adequate.
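The first two regimes can be explored with a few lines of Python. The sketch below assumes that, for $N_f = 1$ and in units of $aF_\pi^2/2$, the potential is $-r\cos\varphi + \frac{1}{2}(\zeta-\varphi)^2$ with $r = \mu^2/a$. This form is an assumption, chosen because it reproduces the stationary points at $\zeta=\varphi=0,\pi$ and the barrier height $\mu^2F_\pi^2$ quoted above; it is not copied from Eq. (4.20). The function names are hypothetical, and the search is restricted to $0\le\varphi\le\pi$, which is where the lowest branch lies for $0\le\zeta\le\pi$ under this assumption.

```python
import math


def v(zeta, phi, r):
    """ASSUMED N_f = 1 potential in units of a*F_pi^2/2, with r = mu^2/a."""
    return -r * math.cos(phi) + 0.5 * (zeta - phi) ** 2


def phi_bar(zeta, r, iters=200):
    """Ternary search for the minimum in phi on [0, pi] at fixed zeta.

    On this interval the assumed potential first decreases and then
    increases in phi (for 0 <= zeta <= pi), so the search is valid.
    """
    lo, hi = 0.0, math.pi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v(zeta, m1, r) < v(zeta, m2, r):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def v_axion(zeta, r):
    """Axion potential measured from its value at zeta = 0 (same units)."""
    return v(zeta, phi_bar(zeta, r), r) + r


def slope(zeta, r):
    """dV_axion/dzeta = zeta - phi_bar(zeta), by the envelope theorem."""
    return zeta - phi_bar(zeta, r)


if __name__ == "__main__":
    for r in (0.5, 20.0):  # one value below and one above mu^2/a = 1
        print(r, v_axion(math.pi, r), slope(math.pi, r))
```

For $r<1$ the height at $\zeta=\pi$ is $2r$ in these units (i.e. $\mu^2F_\pi^2$) with vanishing slope, while for $r\gg 1$ the height stays below $\pi^2/2$ (i.e. $\frac{1}{2}\chi_{YM}\pi^2$ if $\chi_{YM}=aF_\pi^2/2$) and the slope at $\zeta=\pi$ is positive, which is the spike described in the second bullet.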

$N_f \ge 2$ and discussion
The real world has two very light quarks, $u$ and $d$, a light one, $s$, and three heavy quarks. The latter play no role in our discussion. Thus the case of physical interest is $N_f = 2$ or 3. Also, at zero temperature, the quantitative solution of the $U(1)$ problem requires [7] [...]. The ratios $\mu_u^2 : \mu_d^2 : \mu_s^2 : a$ are about $1 : 2 : 40 : 18$. In what follows we shall use these numbers together with the results we obtained from the large-$N$ effective action approach, even though in the real world $N = 3$. The success of the large-$N$ solution to the $U(1)$ problem suggests that, at least in this sector, the large-$N$ expansion converges quite fast.
We should keep in mind, however, that, while quark mass ratios are expected to be constant below the QCD deconfining temperature (they depend on phenomena occurring at the electroweak-breaking scale), the temperature dependence of $\chi_{YM}$ could differ from that of the quark condensate, implying a possible (strong?) $T$-dependence of $\mu^2/a$. An increase of that ratio by an order of magnitude would bring us inside the $CP$-broken region. The available lattice measurements [40,41,42] do not seem to favor this possibility. We defer further comments on this issue to the concluding section.
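The "order of magnitude" can be made concrete with a short Python estimate. The sketch assumes that the boundary of the $CP$-broken region is given by $1/\mu_u^2 = 1/a + \sum_{i\neq u} 1/\mu_i^2$, with $u$ the lightest flavor. This is an assumed form of the massless-boson condition: it reduces to $\mu^2 = a$ for $N_f=1$ and to the critical value $\mu_u^2/a = 1-\mu_u^2/\mu_d^2$ for $N_f=2$, but it is not quoted from the text. The function name is hypothetical.

```python
import math


def critical_a(mu2_lightest, mu2_others):
    """Value of a below which the CP-broken region is entered.

    ASSUMPTION: boundary at 1/mu_u^2 = 1/a + sum_{i != u} 1/mu_i^2.
    Returns math.inf when no finite boundary exists (e.g. m_u = m_d),
    meaning the broken phase is reached for every value of a.
    """
    inv_a = 1.0 / mu2_lightest - sum(1.0 / m for m in mu2_others)
    return 1.0 / inv_a if inv_a > 0.0 else math.inf


if __name__ == "__main__":
    a_now = 18.0  # from the ratios mu_u^2 : mu_d^2 : mu_s^2 : a = 1 : 2 : 40 : 18
    for others in ([2.0], [2.0, 40.0]):  # N_f = 2 and N_f = 3
        print(len(others) + 1, a_now / critical_a(1.0, others))
```

With the quoted ratios, $a$ would have to drop by a factor of about 9 for $N_f=2$ and about 8.6 for $N_f=3$ at fixed $\mu_i^2$, in line with the estimate above.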
In the following we will therefore consider the case of two or three quark flavors of different masses and allow for arbitrary ratios $\mu_i^2/a$. The situation is now more involved than in the $N_f = 1$ case, but qualitatively similar. The stationary points of the potential (4.20) are

$$\zeta = 0, \pi \;\; \mathrm{mod}\,(2\pi)\,; \qquad \varphi_i = 0, \pi \;\; \mathrm{mod}\,(2\pi)\,.$$

The absolute minimum is as usual the trivial one, $\zeta = \varphi_i = 0$. In general it is legitimate to integrate out the PNGB degrees of freedom by minimizing their potential at fixed $\zeta$ and then inserting the solution $\bar\varphi_i(\zeta)$ in $V(\zeta, \varphi_i)$. If $\mu_i^2 \ll a$ this can be easily done. In the two-flavor case this gives the result (4.18). In the three-flavor case, recalling that [...], we see that the result (4.18) still holds up to corrections $O(\mu_{u,d}^2/\mu_s^2)$. This is indeed the result used in the literature.
What happens if, for some physical reason, $\chi_{YM}$ drops so fast with $T$ that $a$ becomes of order $\mu_{u,d}^2$ or even smaller? We can understand the situation by considering what happens at the saddle point corresponding to $\zeta = \varphi_u = \pi$, $\varphi_d = \varphi_s = 0$. We have seen in Sects. 3.2 and 3.3 that the condition for having a massless boson (in the absence of the axion) is given by Eq. (4.27). Precisely around this point we expect a large mixing to occur between the would-be massless PNGB and the axion and, as one goes through that region, we expect the tachyonic boson to change its dominant component from axionic to mesonic. This is indeed fully supported by the numerical results shown in Figs. 7 and 8 for $N_f = 1$ and $N_f = 2$, respectively. We have solved, using Mathematica, the minimization conditions at fixed $\zeta$ and reconstructed in this way the axion potential (see Fig. 9). We then clearly see that, at small $\mu_{u,d}^2/a$, the potential has a regular maximum around $\zeta = \pi$ which coincides with the one of (4.18), and agrees well with (4.18) elsewhere. As we increase $\mu_{u,d}^2/a$ above the critical value $1 - \mu_u^2/\mu_d^2$ (see Eq. (4.27)), the potential is lower than the one given by (4.18) even at $\zeta = \pi$ and, by periodicity, must develop a spike at that point. As we finally go much beyond the critical point, the true potential has nothing to do with the conventional one.
As in the $N_f = 1$ case, here too the description of physics in terms of a single axion field is no longer appropriate when we are in the vicinity of the condition (4.27). In that case only one "heavy" field can be integrated out and a description in terms of two light fields is more appropriate.
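The comparison between the true and the conventional potential at $\zeta=\pi$ can be sketched in Python with a brute-force scan (not the Mathematica procedure used for Fig. 9). The sketch assumes that the $N_f=2$ potential, in units of $F_\pi^2/2$, is $-\mu_u^2\cos\varphi_u-\mu_d^2\cos\varphi_d+\frac{a}{2}(\zeta-\varphi_u-\varphi_d)^2$, and that its large-$a$ limit is $-\sqrt{\mu_u^4+\mu_d^4+2\mu_u^2\mu_d^2\cos\zeta}$. Both expressions, the restriction of the scan to $[0,\pi]^2$, and all names are assumptions made for illustration.

```python
import math


def v2(zeta, phi_u, phi_d, mu2_u, mu2_d, a):
    """ASSUMED N_f = 2 axion-PNGB potential in units of F_pi^2 / 2."""
    return (-mu2_u * math.cos(phi_u) - mu2_d * math.cos(phi_d)
            + 0.5 * a * (zeta - phi_u - phi_d) ** 2)


def v_conventional(zeta, mu2_u, mu2_d):
    """ASSUMED large-a ("conventional") axion potential, same units."""
    return -math.sqrt(mu2_u ** 2 + mu2_d ** 2
                      + 2.0 * mu2_u * mu2_d * math.cos(zeta))


def v_axion_grid(zeta, mu2_u, mu2_d, a, n=256):
    """Minimum over a uniform (n+1) x (n+1) grid of (phi_u, phi_d) in [0, pi]^2.

    A grid minimum is an upper bound on the true minimum over that square.
    """
    best = math.inf
    for i in range(n + 1):
        phi_u = math.pi * i / n
        for j in range(n + 1):
            phi_d = math.pi * j / n
            best = min(best, v2(zeta, phi_u, phi_d, mu2_u, mu2_d, a))
    return best


if __name__ == "__main__":
    mu2_u, mu2_d = 1.0, 2.0  # critical mu_u^2/a = 1 - mu_u^2/mu_d^2 = 0.5
    for a in (18.0, 1.0):    # below and above the critical ratio
        print(a, v_axion_grid(math.pi, mu2_u, mu2_d, a),
              v_conventional(math.pi, mu2_u, mu2_d))
```

For $a=18$ the scan returns the conventional value at $\zeta=\pi$, while for $a=1$ (so that $\mu_u^2/a$ exceeds the critical value) it finds a strictly lower value, which is the behaviour described above.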

Conclusions
The phase structure of QCD associated with spontaneous $CP$ breaking at $\theta = \pi$ may (depending on parameters such as quark masses and topological susceptibility, their ratios and their temperature dependence) have important implications for the axion potential and its cosmological "phenomenology".
In the present work we employed the effective chiral Lagrangian approach to investigate the inter-relation between spontaneous $CP$ breaking in QCD at $\theta = \pi$ and the axion potential near the boundary of its periodicity interval. Formally, the effective Lagrangian approach is applicable at low energies and, in particular, when all mass parameters (notably quark masses) are small with respect to the QCD scale, $\Lambda$. We also look at the large-$N$ limit, in which the ratios of quark masses to $\Lambda$ can be small but still much larger than $1/N$. This allows us to identify and reliably investigate the existence, at $\theta = \pi$, of a second order phase transition point on the hypersurface dividing the region in parameter space where $CP$ is spontaneously broken from the one where it is not. The second order point is characterized by one of the PNGB masses going to zero and by a divergent topological susceptibility (which can be seen as the order parameter).
For generic masses the phase structure of QCD reveals a line of first order transitions, associated with spontaneous $CP$ breaking at $\theta = \pi$, along the negative real axis in the complex $\mu_1^2e^{i\theta}$ mass plane ($\mu_1$ being the lowest quark mass). The first order line extends all the way from $-\infty$ to the second order point without reaching the chiral point at the origin. The position of the second order transition depends on all other parameters (mass ratios and the susceptibility-related parameter we called $a$). A similar phase structure is obtained by working in the complex quark-mass-determinant plane.
It is the existence of this second order point which has the most dramatic effect on the axion potential. Clearly, upon introducing the axion field into the effective Lagrangian there is no longer any $\theta$ dependence and no strong-$CP$ breaking. However, precisely around the point in parameter space (quark masses and topological susceptibility) where, in the absence of the axion, the condition for having a zero-mass boson is met, we find large mixing between the would-be massless particle and the axion. In this region one cannot integrate out all the PNGBs, since one of them becomes very light, with a mass of the order of the axion mass. Hence, in this region, the notion of an axion potential which depends on just the axion field (obtained by integrating out all the PNGBs) is not viable and should be replaced by a potential which depends on the two above-mentioned light degrees of freedom, as discussed in Sect. 4. This potential is obtained by integrating out all the other, much heavier PNGBs.
Given the actual physical values of the parameters (for $N_f = 2$ and $N_f = 3$) we see that, at zero temperature, we are not in the region of parameter space where the concept of an axion potential and the derived result for the axion mass should be modified. However, suppose that, as we raise the temperature while staying below the deconfinement transition (which for QCD is not a sharp transition), the YM topological susceptibility (and hence the parameter $a$) drops faster with temperature than the quark condensate, so as to allow $\mu^2/a$ to increase by about an order of magnitude. We would then enter this intriguing region (see Fig. 10).
It seems, however, that lattice calculations (see e.g. [40,41,42] as well as [43,44]) show a rather mild $T$-dependence of both $\chi_{YM}$ and the quenched chiral condensate, with a sharp drop (but not necessarily a vanishing) of both above a similar value of $T$. There does not seem to be a clean window in which $\mu^2/a$ increases by the above-mentioned order of magnitude. It would be desirable to have detailed lattice data on both $\chi_{YM}$ and the planar chiral condensate from a single group using the same Monte Carlo configurations. It would be particularly interesting to study the pure number $\chi_{YM}/(m\langle\bar\psi\psi\rangle)$ in the vicinity of the above-mentioned drop and also to check its $N$-dependence (expected to be $1/N$).
An obviously related issue is whether there is a critical temperature $T_{top}$ above which $\chi_{YM}$ vanishes, at least in the large-$N$ limit (dilute instantons [45,46,47], for instance, predict $\chi_{YM} \sim e^{-cN}$) and, in that case, whether $T_{top}$ can be higher than $T_{ch}$, the temperature above which chiral symmetry is restored. Under reasonable assumptions, claims that $\chi_{YM}$ should vanish above $T_{ch}$ were made in the past [48,49], leaving open the possibility that $\chi_{YM}$ goes to zero either together with or before $\langle\bar\psi\psi\rangle$.

Figure caption: The red line separates the phases with broken and unbroken $CP$ at $\theta = \pi$ and corresponds to a second-order transition with a massless particle. For $m_d \gg m_u$ we recover the $N_f = 1$ case represented by the vertical axis. Also shown is the $m_u = m_d$ case, lying entirely in the $CP$-broken phase. The real world at $T = 0$ is far up on the blue line (representing $m_d \sim 2m_u$). As $T$ is increased towards $T_{dec}$ the real world will stay on the blue line (since $m_d/m_u$ is $T$-independent) but may move down and cross the red line as indicated in the picture. Present lattice data seem to disfavour this possibility.
Although some old lattice calculations [50] appear to point in the opposite direction (and such a possibility has its own effective Lagrangian formulation [51]), more recent simulations of the pure gauge theory [52,53] suggest the existence of a similar (or even identical) value for the temperatures of deconfinement, chiral restoration and $U_A(1)$ restoration. Above the transition temperature the dilute instanton gas approximation seems to set in. Actually there is lattice evidence [54] that $\chi_{YM}$ drops rather fast above $T_c$ for large $N$ (and even at $N = 3$ a substantial decrease of $\chi_{YM}$ is visible [55,56,57]) and may actually go to zero above it for $N \to \infty$. However, it is not clear what the ratio $\langle\bar\psi\psi\rangle_{planar}/\chi_{YM}$ does around $T_c$. It would thus be very interesting to plan new lattice projects dedicated to the calculation of $\chi_{YM}$ and $\langle\bar\psi\psi\rangle$ in the planar limit across the phase transition.
Recently, using the mixed $CP$/Center discrete anomaly matching (together with some other plausible assumptions), it was shown [27] that in YM theory the $CP$ symmetry is spontaneously broken at $\theta = \pi$ and zero temperature, and that the temperature $T_{res}$ at which $CP$ is restored is not lower than the deconfinement temperature, i.e. $T_{res} \ge T_{dec}$. This result seems to go in favor of the scenario advocated in [48,49]. Breaking of $CP$ in YM connects smoothly with $CP$ breaking in, say, $N_f = 1$ QCD at $\mu^2/a > 1$. If, as we increase the temperature, $CP$ were restored before reaching $T_{dec}$, it would suggest that, in the QCD analog, $\mu^2/a$ goes down until, at $T_{res}$, it reaches 1, which is precisely the opposite of what we were advocating, i.e. a ratio $\mu^2/a$ increasing with temperature. Hence the statement $T_{res} \ge T_{dec}$ is an (admittedly very mild) indication in favor of the scenario in which the finite temperature axion potential has to be revised in a certain range of temperature. Even if such a revision were necessary, it remains to be seen whether it would make any substantial difference with respect to the standard calculations [39] (see also [58], [59]) of the axionic dark matter abundance.

A Ward-Takahashi identities
In this Appendix we derive the WTIs for the anomalous $U_A(1)$ currents in QCD and check that the two-point amplitudes derived from the effective Lagrangian in Sect. 2 exactly satisfy them. We start from the anomaly equation in Eq. (2.4), but written for a single flavor:

$$\partial_\mu J^\mu_{5i} = 2Q + 2m_iP_i\,; \qquad J^\mu_{5i} = \bar\psi_i\gamma^\mu\gamma_5\psi_i\,; \qquad P_i = i\bar\psi_i\gamma_5\psi_i\,.$$

[...] while, for $O(y) = 2m_jP_j(y)$, the commutator gives $[Q_{5i}, P_j] = -2i\bar\psi_i\psi_i\delta_{ij}$ and we get

$$\int d^4x\, e^{ipx}\langle 2Q(x)\,2m_jP_j(y)\rangle + 2i\mu_i^2F_\pi^2\delta_{ij} + \int d^4x\, e^{ipx}\langle 2m_iP_i(x)\,2m_jP_j(y)\rangle = -i\int d^4x\, e^{ipx}\,p_\mu\langle J^\mu_{5i}(x)\,2m_jP_j(y)\rangle\,, \qquad (A.5)$$

having made use of the Gell-Mann-Oakes-Renner relation $-2\delta_{ij}m_i\langle\bar\psi_i\psi_i\rangle = \delta_{ij}\mu_i^2F_\pi^2$. One checks that the following two-point amplitudes satisfy the previous anomalous WTIs and we get