The material of Chaps. 7 and 8 is in principle sufficient to construct the low-energy effective field theory (EFT) for any system with a spontaneously broken internal symmetry. However, the implementation of the general methodology for a concrete physical system often still requires a nontrivial amount of effort. Moreover, the actual phenomenology of the EFT may depend sensitively on the internal and spacetime symmetries present. In this chapter, I will therefore work out in detail two applications of the general formalism, one in particle physics and one in condensed-matter physics. Apart from serving as an extensive illustration, this will allow us to dive deeper into some technical details of EFTs for Nambu–Goldstone (NG) bosons: the consistency of the derivative expansion of the effective Lagrangian, and the topological aspects of quasi-invariant Lagrangians.

1 Chiral Perturbation Theory of Mesons

Historically, the development of EFT for NG bosons was largely motivated by the need for a phenomenological description of low-energy hadron physics. On the one hand, computation of hadron properties from first principles is challenging because the strong nuclear interaction is not amenable to a perturbative treatment at energies below the intrinsic scale of quantum chromodynamics (QCD). On the other hand, the spectrum of the lightest hadrons exhibits scale separation; see Fig. 9.1. The eight lightest particles in the spectrum (the three pions, four kaons and the \(\eta \)-meson) are all pseudoscalars with varying quark flavor composition. Their masses are much lower than what one would expect given the constituent mass of about \(300\,\mathrm {MeV}\) for the \(u,d\) quarks and nearly \(500\,\mathrm {MeV}\) for the s quark. This suggests that the low-energy physics of the light pseudoscalar mesons should be captured by an EFT.

Fig. 9.1 Mass spectrum of light hadrons. All the masses in parentheses are shown in \(\mathrm {MeV}\). Current numerical values as of writing this book can be found in the Review of Particle Physics [1]

The key to understanding the observed scale separation in the hadron spectrum is provided by spontaneous symmetry breaking (SSB). The gauge interaction of QCD is vector-like, that is, gluons couple equally to left- and right-handed quarks. Moreover, the interaction is insensitive to the quark flavor. As a consequence, in the limit of vanishing quark masses, QCD with \(n_{\mathrm {f}}\) quark flavors possesses a \(G\simeq \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {L}}\times \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {R}}\times \mathrm {U}(1)_{\mathrm {B}}\) symmetry. Under an element \((g_{\mathrm {L}},g_{\mathrm {R}},\mathrm{e} ^{\mathrm{i} \epsilon })\in G\), the left- and right-handed quark spinors \(\Psi _{\mathrm {L},\mathrm {R}}\) transform as

$$\displaystyle \begin{aligned} \Psi_{\mathrm{L}}\to\mathrm{e}^{\mathrm{i}\epsilon/3}g_{\mathrm{L}}\Psi_{\mathrm{L}}\;,\qquad \Psi_{\mathrm{R}}\to\mathrm{e}^{\mathrm{i}\epsilon/3}g_{\mathrm{R}}\Psi_{\mathrm{R}}\;. {} \end{aligned} $$
(9.1)

The unitary matrices \(g_{\mathrm {L},\mathrm {R}}\) act on the flavor index of the quark field. The vector-like \(\mathrm {U}(1)_{\mathrm {B}}\) group corresponds to conservation of baryon number, and the factors \(1/3\) in (9.1) indicate the baryon number of a single quark. In addition to this internal chiral symmetry, QCD is invariant under the spacetime Poincaré group, and under the discrete symmetries of charge conjugation, parity and time reversal.

In the ground state of (still massless) QCD, the chiral symmetry is spontaneously broken down to the “vector” subgroup, \(H\simeq \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {V}}\times \mathrm {U}(1)_{\mathrm {B}}\), consisting of elements of the type \((g,g,\mathrm{e} ^{\mathrm{i} \epsilon })\), that is, \(g_{\mathrm {L}}=g_{\mathrm {R}}\). This implies the existence of \(n_{\mathrm {f}}^2-1\) pseudoscalar NG bosons. Restricting to \(n_{\mathrm {f}}=2\) accounts for the lightest mesons: the pion triplet. The additional five modes that appear with \(n_{\mathrm {f}}=3\) are the strange pseudoscalar mesons (kaons) and the \(\eta \)-meson. Of course, in reality, none of these is exactly massless. They are all pseudo-NG bosons owing to the fact that the chiral symmetry is only approximate, being explicitly broken by current quark masses. The fact that the strange mesons are heavier than the pions is a consequence of the s quark being considerably heavier than the u and d quarks. The masses of the c, t and b quarks are so high (above the intrinsic scale of QCD) that with \(n_{\mathrm {f}}\geq 4\), QCD does not possess even an approximate chiral symmetry. In practice, it is therefore sufficient to focus on the \(n_{\mathrm {f}}=2,3\) cases.

The classical Lagrangian of massless QCD is also invariant under the axial symmetry \(\mathrm {U}(1)_{\mathrm {A}}\), which amounts to \(\Psi _{\mathrm {L}}\to \mathrm{e} ^{\mathrm{i} \epsilon }\Psi _{\mathrm {L}}\), \(\Psi _{\mathrm {R}}\to \mathrm{e} ^{-\mathrm{i} \epsilon }\Psi _{\mathrm {R}}\). Since this symmetry is also spontaneously broken in the QCD vacuum, one would expect another, flavor-singlet, pseudoscalar meson. Yet the lightest available candidate is the \(\eta '\)-meson with a mass of \(958\,\mathrm {MeV}\). This is too high to be accounted for by explicit symmetry breaking due to quark masses. The resolution of this so-called \(\mathrm {U}(1)_{\mathrm {A}}\) problem is that the axial symmetry is broken at the quantum level by the axial anomaly, arising from nonperturbative gluon dynamics. See Sect. 13.6 of [2] for further details.

The EFT for the light pseudoscalar mesons, constructed below, is called chiral perturbation theory (ChPT). The reader will find a modern graduate-level exposition of ChPT in the dedicated monograph [3]. However, the pioneering works of Gasser and Leutwyler [4, 5] remain a valuable source of insight. The construction of ChPT will follow the general machinery developed in Sect. 8.2 with the appropriate coset space, \(G/H\simeq [\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {L}}\times \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {R}}]/\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {V}}\). Here I have tacitly dropped the \(\mathrm {U}(1)_{\mathrm {B}}\) factor of the symmetry group of QCD. This remains unbroken and moreover leaves the meson fields intact, and so has no effect on the invariant part of the ChPT Lagrangian. I will reinstate the \(\mathrm {U}(1)_{\mathrm {B}}\) symmetry in Sect. 9.1.4, where it will help us build the quasi-invariant (anomalous) part of the Lagrangian.

1.1 Power Counting

In a Lorentz-invariant EFT, the effective Lagrangian is organized by the total number, \(n=s+t\), of spatial and temporal derivatives,

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}[\pi]=\sum_{n\geq2}\mathcal{L}^{(n)}_{\mathrm{eff}}[\pi]\;. {} \end{aligned} $$
(9.2)

Discounting possible tadpole operators, linear in \(\pi ^a\), a minimum of two derivatives is enforced by the nonlinearly realized internal symmetry and Lorentz invariance. The latter is ensured by contracting Lorentz indices of spacetime derivatives with the Minkowski metric \(g_{\mu \nu }\) or the Levi-Civita (LC) tensor, \(\varepsilon _{\mu \nu \lambda \dotsb }\). Hence, in an even number of spacetime dimensions D (which is the case of QCD, where \(D=4\)), only operators with even n can contribute to (9.2).

In order to assess the relative importance of the individual parts \(\mathcal {L}^{(n)}_{\mathrm {eff}}\), consider a generic Feynman diagram \(\Gamma \), contributing to a given observable. Suppose the diagram contains altogether I internal propagators, L loops and \(V_n\) interaction vertices from each of \(\mathcal {L}^{(n)}_{\mathrm {eff}}\). Upon Fourier transform, the diagram will evaluate to a homogeneous function of energy–momenta on the external legs. Adding up all the powers of energy–momentum from the vertices, propagators and loop integrals, the degree of the diagram as a function of the external energy–momenta then reads

$$\displaystyle \begin{aligned} \deg\Gamma=DL-2I+\sum_{n\geq2}nV_n\;. {} \end{aligned} $$
(9.3)

The variable I can be eliminated using the identity \(I=L+\sum _nV_n-1\), known from graph theory to hold for any connected graph (see, for instance, Sect. 13.4 of [6]). This leads to

$$\displaystyle \begin{aligned} \deg\Gamma=2+(D-2)L+\sum_{n\geq2}(n-2)V_n\;. {} \end{aligned} $$
(9.4)

Diagrammatic contributions to a given observable can now be organized by increasing powers of energy–momentum; higher powers imply stronger suppression at low energies. Any Feynman diagram will have \(\deg \Gamma \geq 2\). The leading-order (LO) contribution to the observable corresponds to \(\deg \Gamma =2\) and consists of all tree-level (\(L=0\)) diagrams with all vertices from \(\mathcal {L}^{(2)}_{\mathrm {eff}}\). This justifies a posteriori the focus of this book on classical Lagrangians with the lowest possible number of derivatives.

The degree \(\deg \Gamma \) can be increased by adding vertices from higher-order (\(n\geq 3\)) parts of the Lagrangian or by adding loops. In \(D=4\) dimensions, the next-to-leading order (NLO) corresponds to \(\deg \Gamma =4\). This can be reached in two different ways. Either we restrict ourselves to tree-level diagrams but allow for one vertex from \(\mathcal {L}^{(4)}_{\mathrm {eff}}\), or allow one loop but keep all vertices from \(\mathcal {L}^{(2)}_{\mathrm {eff}}\). It is easy to extend this reasoning to classify possible contributions to any observable at even higher orders of the derivative expansion.
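The counting rules (9.3) and (9.4) are simple enough to check mechanically. The following Python sketch evaluates both forms of \(\deg \Gamma \) for a diagram specified by D, L and the vertex counts \(V_n\). The function names and the sample diagrams are illustrative assumptions, not part of the formalism.

```python
def degree_via_propagators(D, L, V):
    """Degree from (9.3). V maps n -> V_n; the number of internal
    propagators is fixed by the graph identity I = L + sum(V_n) - 1."""
    I = L + sum(V.values()) - 1
    return D * L - 2 * I + sum(n * Vn for n, Vn in V.items())


def degree(D, L, V):
    """Degree from (9.4), with the number of propagators eliminated."""
    return 2 + (D - 2) * L + sum((n - 2) * Vn for n, Vn in V.items())


# Sample diagrams in D = 4 (the vertex counts are arbitrary illustrations).
lo_tree = degree(4, 0, {2: 3})         # tree level, all vertices from n = 2
nlo_tree = degree(4, 0, {2: 2, 4: 1})  # tree level, one vertex from n = 4
nlo_loop = degree(4, 1, {2: 2})        # one loop, all vertices from n = 2
print(lo_tree, nlo_tree, nlo_loop)     # prints: 2 4 4
```

Both NLO options land at degree 4, as stated above. Calling `degree(2, L, {2: 3})` for any number of loops L returns 2, which illustrates the lack of loop suppression in \(D=2\).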

Let me conclude the discussion of power counting with several remarks. First, for pseudo-NG bosons, the propagator \(\mathrm{i} /(p^2-m^2)\) is actually not a homogeneous function of the energy–momentum \(p^\mu \) because of the mass m. That can be fixed by assigning the mass a formal counting degree 1 so that \(m^2\) counts equally to \(p^2\). This affects the classification of operators in the effective Lagrangian of ChPT that incorporate the effects of explicit breaking of chiral symmetry by the quark mass, \(m_q\). We expect based on Sect. 6.4.1 that m will scale as \(\sqrt {m_q}\). Hence, we can continue using (9.4), as long as we count the quark mass as a small quantity of degree \(\deg m_q=2\). Similarly, we will want to couple ChPT to a set of background gauge fields, \(A_\mu \). In order for the power counting to be consistent with background gauge invariance, we must assign these fields the degree \(\deg A_\mu =1\).

Second, the counting rule (9.4) also tells us how to renormalize the EFT. In order to remove an overall divergence in a Feynman graph \(\Gamma \), we need a counterterm among the operators in \(\mathcal {L}^{(\deg \Gamma )}_{\mathrm {eff}}\). Since all diagrams at \(\deg \Gamma =2\) are tree-level, the effective couplings in \(\mathcal {L}^{(2)}_{\mathrm {eff}}\) are finite constants that do not run with the renormalization scale. The effective couplings in \(\mathcal {L}^{(n)}_{\mathrm {eff}}\) with \(n>2\) must include divergences that cancel against loop diagrams with degree n. Their running will be determined by the corresponding renormalization group equation. Importantly, to any finite order in the derivative expansion, only a finite number of operators, and hence a finite number of counterterms, is needed.

Third, note that for \(D=2\), all loop diagrams with the same \(V_{n\geq 4}\) and arbitrary \(V_2\) contribute at the same order of the derivative expansion. The lack of suppression of loop effects hints that the EFT is no longer weakly coupled. Eventually, it turns out that the infrared fluctuations of NG bosons are so wild in \(D=2\) dimensions that they destroy the order parameter leading to SSB. I will return to this in Sect. 15.2.

Finally, the basic counting rule (9.4) and many of the above comments apply equally to any EFT with only type-A NG bosons, relativistic or not. The LO Lagrangian will still be \(\mathcal {L}^{(2)}_{\mathrm {eff}}\). The LO contribution to any observable will come from tree-level diagrams with all vertices from \(\mathcal {L}^{(2)}_{\mathrm {eff}}\). The classification of contributions at higher orders may however be modified by the presence of operators with odd \(n\geq 3\) regardless of the spacetime dimension D. The details for any particular choice of effective Lagrangian and D are easy to work out using (9.4).

1.2 Effective Lagrangian

The coset space of QCD, \(G/H\simeq [\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {L}}\times \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {R}}]/\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {V}}\), is symmetric. Let me therefore start by recalling some general properties of symmetric coset spaces; see Sect. 7.3.2 for full detail.

By definition, a coset space is symmetric if the Lie algebra \(\mathfrak {g}\) of G possesses an automorphism \(\mathcal {R}\) that acts as identity on the unbroken subalgebra \(\mathfrak {h}\) and as minus identity on the complementary subspace \(\mathfrak {g}/\mathfrak {h}\) of \(\mathfrak {g}\). Loosely speaking, \(\mathcal {R}\) changes the sign of all broken generators of G while leaving all unbroken generators intact. The automorphism \(\mathcal {R}\) can be, at least locally, lifted from the Lie algebra \(\mathfrak {g}\) to the Lie group G. One can then choose the coset representative \(U(\pi )\) so that \(\mathcal {R}(U(\pi ))=U(\pi )^{-1}\). With this choice, one can build a matrix-valued NG field that transforms linearly under the entire group G,

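$$\displaystyle \begin{aligned} \Sigma(\pi)\equiv U(\pi)^2\;,\qquad \Sigma(\pi)\to g\Sigma(\pi)\mathcal{R}(g)^{-1}\;,\quad g\in G\;. {} \end{aligned} $$
(9.5)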

The automorphism \(\mathcal {R}\) makes it easy to project out the broken component of the gauged Maurer–Cartan (MC) form (8.47),

$$\displaystyle \begin{aligned} \Omega_\perp (\pi,A)=\frac{1}{2}[\Omega (\pi,A)-\mathcal{R}(\Omega (\pi,A))]\;, \end{aligned} $$
(9.6)

where \(A\equiv A_\mu \mathrm{d} x^\mu \) is the \(\mathfrak {g}\)-valued 1-form gauge field of G. Upon a brief manipulation using the definition of \(\mathcal {R}\), we find that

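$$\displaystyle \begin{aligned} \Omega_\perp (\pi,A)=-\frac{\mathrm{i}}{2}U(\pi)^{-1}D\Sigma(\pi)U(\pi)^{-1}=\frac{\mathrm{i}}{2}U(\pi)D\Sigma(\pi)^{-1}U(\pi)\;, {} \end{aligned} $$
(9.7)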

where

$$\displaystyle \begin{aligned} D\Sigma(\pi)=\mathrm{d}\Sigma(\pi)-\mathrm{i} A\Sigma(\pi)+\mathrm{i}\Sigma(\pi)\mathcal{R}(A) {} \end{aligned} $$
(9.8)

is the G-covariant derivative of \(\Sigma (\pi )\), and \(D\Sigma (\pi )^{-1}\) is defined analogously.

What does all this translate to in case of ChPT? Here \(\mathcal {R}\) acts by swapping the transformations of left- and right-handed quarks, \(\mathcal {R}(g_{\mathrm {L}},g_{\mathrm {R}})=(g_{\mathrm {R}},g_{\mathrm {L}})\). It is then natural to choose the coset representative as

$$\displaystyle \begin{aligned} U(\pi)=(u(\pi),u(\pi)^{-1})\quad \text{where }u(\pi)\in\mathrm{SU}(n_{\mathrm{f}})\;. \end{aligned} $$
(9.9)

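The matrix variable \(\Sigma (\pi )=(u(\pi )^2,u(\pi )^{-2})\) is subject to the linear transformation \(\Sigma (\pi )\to (g_{\mathrm {L}}u(\pi )^2g_{\mathrm {R}}^{-1},g_{\mathrm {R}}u(\pi )^{-2}g_{\mathrm {L}}^{-1})\). All information about the NG fields can then be encoded in a single \(\mathrm {SU}(n_{\mathrm {f}})\)-valued matrix variable,

$$\displaystyle \begin{aligned} \mathcal{U}(\pi)\equiv u(\pi)^2\;,\qquad \mathcal{U}(\pi)\to g_{\mathrm{L}}\mathcal{U}(\pi)g_{\mathrm{R}}^{-1}\;. {} \end{aligned} $$
(9.10)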

The Lie algebra of \(\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {L}}\times \mathrm {SU}(n_{\mathrm {f}})_{\mathrm {R}}\) is \(\mathfrak {su}(n_{\mathrm {f}})_{\mathrm {L}}\oplus \mathfrak {su}(n_{\mathrm {f}})_{\mathrm {R}}\). In our pair notation, the matrix-valued gauge field \(A_\mu \) can thus be decomposed into left- and right-handed components, \(A^{\mathrm {L}}_\mu \) and \(A^{\mathrm {R}}_\mu \), which are independent \(\mathfrak {su}(n_{\mathrm {f}})\)-valued gauge fields acting respectively on the left- and right-handed quarks. A straightforward manipulation then leads to \(D\Sigma (\pi )=(D\mathcal {U}(\pi ),\mathcal {U}(\pi )^{-1})+(\mathcal {U}(\pi ),D\mathcal {U}(\pi )^{-1})\), where

$$\displaystyle \begin{aligned} \begin{aligned} D_\mu\mathcal{U}(\pi)&=\partial_\mu\mathcal{U}(\pi)-\mathrm{i} A_\mu^{\mathrm{L}}\mathcal{U}(\pi)+\mathrm{i}\mathcal{U}(\pi)A_\mu^{\mathrm{R}}\;,\\ D_\mu\mathcal{U}(\pi)^{-1}&=\partial_\mu\mathcal{U}(\pi)^{-1}-\mathrm{i} A_\mu^{\mathrm{R}}\mathcal{U}(\pi)^{-1}+\mathrm{i}\mathcal{U}(\pi)^{-1}A_\mu^{\mathrm{L}}\;. \end{aligned} {} \end{aligned} $$
(9.11)

This eventually gives

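$$\displaystyle \begin{aligned} \Omega_\perp (\pi,A)=-\frac{\mathrm{i}}{2}\big[(u(\pi)^{-1}D\mathcal{U}(\pi)u(\pi)^{-1},\mathbb{1})+(\mathbb{1},u(\pi)D\mathcal{U}(\pi)^{-1}u(\pi))\big]\;. {} \end{aligned} $$
(9.12)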

Before we can write down even the LO effective Lagrangian of ChPT, we still have to discuss the explicit breaking of chiral symmetry by the quark masses. The mass term in the microscopic Lagrangian of QCD takes the form \(\overline {\Psi }_{\mathrm {L}}\mathcal {M}\Psi _{\mathrm {R}}+\overline {\Psi }_{\mathrm {R}}{\mathcal {M}}^{\dagger }\Psi _{\mathrm {L}}\), where \(\mathcal {M}= \operatorname {\mathrm {diag}}(m_u,m_d,\dotsc )\) is a real diagonal matrix, collecting the quark masses. In the spirit of Sect. 8.2.3, this is promoted to a complex matrix field that transforms under the chiral symmetry as \(\mathcal {M}\to g_{\mathrm {L}}\mathcal {M}g_{\mathrm {R}}^{-1}\), thus restoring exact chiral invariance of the QCD Lagrangian. The basic building block for incorporating the effects of explicit symmetry breaking in ChPT is then the matrix field

$$\displaystyle \begin{aligned} \Xi(\pi,\mathcal{M})=u(\pi)^{-1}\mathcal{M} u(\pi)^{-1}\;. \end{aligned} $$
(9.13)

Being linear in quark masses, this is assigned the order \(\deg \Xi =2\).

The operators we now have for building the LO effective Lagrangian of ChPT are, schematically, \(\Omega _{\perp \mu }\Omega _\perp ^\mu \) and \(\Xi \). The only way to ensure invariance of these operators under the linearly realized unbroken subgroup, \(\mathrm {SU}(n_{\mathrm {f}})_{\mathrm {V}}\), is to take a trace. Moreover, \(\Xi \) has to enter the Lagrangian through \( \operatorname {\mathrm {tr}}(\Xi +{\Xi }^{\dagger })\). This is required by parity invariance of QCD and the fact that parity interchanges left- and right-handed quarks, thereby acting on both \(u(\pi )\) and \(\mathcal {M}\) by Hermitian conjugation. At the end of the day, there are only two independent operators one can put into the LO Lagrangian,

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(2)}=\frac{f_\pi^2}4\operatorname{\mathrm{tr}}\big[D_\mu{\mathcal{U}(\pi)}^{\dagger}D^\mu\mathcal{U}(\pi)\big]+\frac{f_\pi^2B}2\operatorname{\mathrm{tr}}\big[\mathcal{M}{\mathcal{U}}^{\dagger}(\pi)+{\mathcal{M}}^{\dagger}\mathcal{U}(\pi)\big]\;. {} \end{aligned} $$
(9.14)

Accordingly, there are two independent parameters, conventionally denoted as \(f_\pi \) and B, both with mass dimension 1.

Example 9.1

Let us work out some immediate consequences of the LO Lagrangian (9.14). To that end, I will drop the background gauge fields and parameterize \(\mathcal {U}(\pi )\) in terms of a Hermitian traceless matrix \(\Pi (\pi )\) as \(\mathcal {U}(\pi )=\exp [\mathrm{i} \Pi (\pi )/f_\pi ]\). This is a matrix version of the familiar exponential parameterization of the coset space; the factor \(f_\pi \) is inserted to give \(\Pi \) mass dimension 1. Upon expansion in powers of \(\Pi \), the Lagrangian becomes

$$\displaystyle \begin{aligned} \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(2)}={}&\frac{f_\pi^2B}2\operatorname{\mathrm{tr}}(\mathcal{M}+{\mathcal{M}}^{\dagger})\\ &+\frac 14\operatorname{\mathrm{tr}}\big[\partial_\mu\Pi(\pi)\partial^\mu\Pi(\pi)\big]-\frac B4\operatorname{\mathrm{tr}}\big[(\mathcal{M}+{\mathcal{M}}^{\dagger})\Pi(\pi)^2\big]+\mathcal{O}(\Pi^4)\;, \end{aligned} {} \end{aligned} $$
(9.15)

where I used that \(\mathcal {M}\) is ultimately Hermitian to drop terms linear and cubic in \(\Pi (\pi )\).
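For readers who want to see the intermediate steps, the expansion can be sketched as follows. The sketch assumes vanishing background fields and a Hermitian \(\Pi (\pi )\), and keeps only terms up to second order in \(\Pi \):

$$\displaystyle \begin{aligned} \mathcal{U}&=\mathbb{1}+\frac{\mathrm{i}}{f_\pi}\Pi-\frac{1}{2f_\pi^2}\Pi^2+\mathcal{O}(\Pi^3)\;,\\ \operatorname{\mathrm{tr}}(\partial_\mu{\mathcal{U}}^{\dagger}\partial^\mu\mathcal{U})&=\frac{1}{f_\pi^2}\operatorname{\mathrm{tr}}(\partial_\mu\Pi\partial^\mu\Pi)+\mathcal{O}(\Pi^4)\;,\\ \operatorname{\mathrm{tr}}(\mathcal{M}{\mathcal{U}}^{\dagger}+{\mathcal{M}}^{\dagger}\mathcal{U})&=\operatorname{\mathrm{tr}}(\mathcal{M}+{\mathcal{M}}^{\dagger})-\frac{\mathrm{i}}{f_\pi}\operatorname{\mathrm{tr}}\big[(\mathcal{M}-{\mathcal{M}}^{\dagger})\Pi\big]-\frac{1}{2f_\pi^2}\operatorname{\mathrm{tr}}\big[(\mathcal{M}+{\mathcal{M}}^{\dagger})\Pi^2\big]+\mathcal{O}(\Pi^3)\;. \end{aligned} $$

Multiplying by the prefactors \(f_\pi ^2/4\) and \(f_\pi ^2B/2\) from (9.14) reproduces (9.15). The term linear in \(\Pi \) is proportional to \(\mathcal {M}-{\mathcal {M}}^{\dagger }\) and therefore drops out for a Hermitian mass matrix; the same holds for the cubic term.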

The first, constant term contributes to the energy density of the chiral-symmetry-breaking vacuum. Taking the derivative of the vacuum energy density with respect to any of the quark masses in turn gives the vacuum expectation value (VEV) of the corresponding mass operator; cf. (5.16). The latter serves as the order parameter for spontaneous breaking of chiral symmetry. The LO prediction of ChPT, based on (9.15), is that this so-called chiral condensate is independent of the quark flavor. The total condensate, summed over all quark flavors, equals

$$\displaystyle \begin{aligned} \langle{\overline{\Psi}\Psi}\rangle =-n_{\mathrm{f}} f_\pi^2B\;. \end{aligned} $$
(9.16)

Let us now turn attention to the bilinear part of (9.15). It is convenient to further parameterize the matrix \(\Pi (\pi )\) in terms of \(n_{\mathrm {f}}^2-1\) meson fields \(\pi ^a\) as \(\Pi (\pi )=\pi ^a\lambda _a\). Here \(\lambda _a\) is a basis of traceless Hermitian \(n_{\mathrm {f}}\times n_{\mathrm {f}}\) matrices, normalized as \( \operatorname {\mathrm {tr}}(\lambda _a\lambda _b)=2\delta _{ab}\). This ensures canonical normalization of the kinetic term for the meson fields. In case of \(n_{\mathrm {f}}=2\), the \(\lambda _a\)s are just the usual Pauli matrices \(\tau _a\), whereas for \(n_{\mathrm {f}}=3\), the so-called Gell-Mann matrices can be used instead. The next step is to find the eigenvalues of the mass matrix for \(\pi ^a\). Setting \(n_{\mathrm {f}}=3\) and using the known flavor composition of the light pseudoscalar mesons, these eigenvalues can be identified with the masses of the individual states in Fig. 9.1 as

$$\displaystyle \begin{aligned} m_{\pi^0}^2&=\frac{2B}3\left(m_u+m_d+m_s-\sqrt{m_u^2+m_d^2+m_s^2-m_um_d-m_um_s-m_dm_s}\right)\;,\\ m_{\pi^\pm}^2&=B(m_u+m_d)\;,\\ {} m_{K^\pm}^2&=B(m_u+m_s)\;,\\ m_{K^0}^2&=B(m_d+m_s)\;,\\ m_\eta^2&=\frac{2B}3\left(m_u+m_d+m_s+\sqrt{m_u^2+m_d^2+m_s^2-m_um_d-m_um_s-m_dm_s}\right)\;. \end{aligned} $$
(9.17)

This determines the five different meson masses in terms of the three current quark masses, yet the latter are not uniquely fixed by (9.17). Indeed, any overall rescaling of the quark masses can be absorbed into a redefinition of the B parameter. It is however possible to use (9.17) to eliminate B and the quark masses altogether and thus obtain a constraint on the meson spectrum,

$$\displaystyle \begin{aligned} 2(m_{\pi^\pm}^2+m_{K^\pm}^2+m_{K^0}^2)=3(m_{\pi^0}^2+m_\eta^2)\;. \end{aligned} $$
(9.18)

This so-called Gell-Mann–Okubo formula can be interpreted, for instance, as giving an estimate for the mass of the \(\eta \)-meson in terms of those of the pions and kaons. With the data shown in Fig. 9.1 as input, one gets \(m_\eta \approx 568\,\mathrm {MeV}\), which is less than four per cent off the correct value.
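The relations (9.17) and (9.18) can be checked numerically with a few lines of Python. In the sketch below, the function names are illustrative, the quark masses used for the consistency check are arbitrary, and the meson masses are approximate values supplied here as an assumption rather than read off Fig. 9.1.

```python
import math


def meson_masses_sq(B, mu, md, ms):
    """Squared meson masses at LO, transcribed from (9.17)."""
    total = mu + md + ms
    root = math.sqrt(mu**2 + md**2 + ms**2 - mu * md - mu * ms - md * ms)
    return {
        "pi0": 2 * B / 3 * (total - root),
        "pi+-": B * (mu + md),
        "K+-": B * (mu + ms),
        "K0": B * (md + ms),
        "eta": 2 * B / 3 * (total + root),
    }


def gmo_residual(m2):
    """Left-hand side minus right-hand side of (9.18)."""
    return 2 * (m2["pi+-"] + m2["K+-"] + m2["K0"]) - 3 * (m2["pi0"] + m2["eta"])


def eta_mass_from_gmo(m_pi_pm, m_pi0, m_K_pm, m_K0):
    """Solve (9.18) for the eta mass, given the pion and kaon masses."""
    m_eta_sq = (2 / 3) * (m_pi_pm**2 + m_K_pm**2 + m_K0**2) - m_pi0**2
    return math.sqrt(m_eta_sq)


# (9.18) holds identically for any B and quark masses (arbitrary inputs).
print(gmo_residual(meson_masses_sq(1.0, 2.0, 5.0, 95.0)))

# Approximate meson masses in MeV (assumed inputs, not taken from the text).
print(eta_mass_from_gmo(139.57, 134.98, 493.68, 497.61))
```

The first number vanishes up to rounding errors, and the second reproduces the estimate of about \(568\,\mathrm {MeV}\) quoted above.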

The machinery developed in Sect. 8.2 can in principle be applied to an arbitrarily high order of the derivative expansion of ChPT. However, already at NLO (\(n=4\)), constructing the effective Lagrangian is a nontrivial exercise. I will therefore content myself with spelling out the final result and refer the reader to Sect. 3.2 of [7], where all details are worked out rather pedantically. For both \(n_{\mathrm {f}}=2\) and \(n_{\mathrm {f}}=3\), the invariant part of the ChPT Lagrangian at NLO can be written as

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(4)}={}&c_1\operatorname{\mathrm{tr}}(D_\mu\mathcal{U} D^\mu{\mathcal{U}}^{\dagger} D_\nu\mathcal{U} D^\nu{\mathcal{U}}^{\dagger})+c_2\operatorname{\mathrm{tr}}(D_\mu\mathcal{U} D^\mu{\mathcal{U}}^{\dagger})\operatorname{\mathrm{tr}}(D_\nu\mathcal{U} D^\nu{\mathcal{U}}^{\dagger})\\ &+c_3\operatorname{\mathrm{tr}}(D_\mu\mathcal{U} D_\nu{\mathcal{U}}^{\dagger})\operatorname{\mathrm{tr}}(D^\mu\mathcal{U} D^\nu{\mathcal{U}}^{\dagger})\\ &+c_4\operatorname{\mathrm{tr}}(F^{\mathrm{L}}_{\mu\nu}D^\mu\mathcal{U} D^\nu{\mathcal{U}}^{\dagger}+F^{\mathrm{R}}_{\mu\nu}D^\mu{\mathcal{U}}^{\dagger} D^\nu\mathcal{U})\\ {} &+c_5\operatorname{\mathrm{tr}}(F^{\mathrm{L}}_{\mu\nu}\mathcal{U} F^{\mathrm{R}\mu\nu}{\mathcal{U}}^{\dagger})+c_6(F^{\mathrm{L}}_{\mu\nu}F^{\mathrm{L}\mu\nu}+F^{\mathrm{R}}_{\mu\nu}F^{\mathrm{R}\mu\nu})\\ &+d_1\operatorname{\mathrm{tr}}(\mathcal{M}{\mathcal{U}}^{\dagger})\operatorname{\mathrm{tr}}({\mathcal{M}}^{\dagger}\mathcal{U})+d_2\big[(\operatorname{\mathrm{tr}}\mathcal{M}{\mathcal{U}}^{\dagger})^2+(\operatorname{\mathrm{tr}}{\mathcal{M}}^{\dagger}\mathcal{U})^2\big]\\ &+d_3\operatorname{\mathrm{tr}}(\mathcal{M}{\mathcal{U}}^{\dagger}\mathcal{M}{\mathcal{U}}^{\dagger}+{\mathcal{M}}^{\dagger}\mathcal{U}{\mathcal{M}}^{\dagger}\mathcal{U})+d_4\operatorname{\mathrm{tr}}(\mathcal{M}{\mathcal{M}}^{\dagger})\\ &+d_5\operatorname{\mathrm{tr}}(\mathcal{M}{\mathcal{U}}^{\dagger}+{\mathcal{M}}^{\dagger}\mathcal{U})\operatorname{\mathrm{tr}}(D_\mu\mathcal{U} D^\mu{\mathcal{U}}^{\dagger})\\ &+d_6\operatorname{\mathrm{tr}}\big[(\mathcal{M}{\mathcal{U}}^{\dagger}+\mathcal{U}{\mathcal{M}}^{\dagger})D_\mu\mathcal{U} D^\mu{\mathcal{U}}^{\dagger}\big]\;. \end{aligned} $$
(9.19)

Here \(F^{\mathrm {L}}_{\mu \nu }\) and \(F^{\mathrm {R}}_{\mu \nu }\) are the field-strength tensors of the background gauge fields. The 12 effective couplings \(c_{\text{1--6}}\) and \(d_{\text{1--6}}\) are mutually independent in the \(n_{\mathrm {f}}=3\) case. In case of \(n_{\mathrm {f}}=2\), special properties of \(2\times 2\) matrices make the \(c_1\) operator redundant with \(c_2\), and the \(d_6\) operator redundant with \(d_5\).

The form of the NLO Lagrangian shown in (9.19) matches the detailed derivation offered in [7]. However, in the literature, a somewhat different basis of operators is often used. For the reader’s convenience, I list here detailed relations between the parameters \(c_{\text{1--6}},d_{\text{1--6}}\) introduced above and the more common NLO couplings of ChPT \(L_{\text{1--10}},H_{\text{1--2}}\), cf. Sect. 3.5.1 of [3],

$$\displaystyle \begin{aligned} \begin{gathered} c_1=L_3\;,\qquad c_2=L_1\;,\qquad c_3=L_2\;,\qquad c_4=-\mathrm{i} L_9\;,\\ c_5=L_{10}\;,\qquad c_6=H_1\;,\qquad d_1=2(L_6-L_7)\;,\qquad d_2=L_6+L_7\;,\\ d_3=L_8\;,\qquad d_4=H_2\;,\qquad d_5=L_4\;,\qquad d_6=L_5\;. \end{gathered} \end{aligned} $$
(9.20)

1.3 Interaction with External Fields

For further illustrations of the use of ChPT, I will utilize its simplest version: the \(n_{\mathrm {f}}=2\) ChPT in the “isospin-symmetric” limit. In this limit, one sets \(m_u=m_d=m\) so that all three pions have the same squared mass, \(m^2_\pi =2Bm\). Accordingly, the LO Lagrangian (9.14) reduces to

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(2)}=\frac{f_\pi^2}4\big[\operatorname{\mathrm{tr}}(D_\mu{\mathcal{U}}^{\dagger}D^\mu\mathcal{U})+m_\pi^2\operatorname{\mathrm{tr}}(\mathcal{U}+{\mathcal{U}}^{\dagger})\big]\;. {} \end{aligned} $$
(9.21)

The kinetic term can be further expanded and organized by powers of the background gauge fields,

$$\displaystyle \begin{aligned} {} \operatorname{\mathrm{tr}}(D_\mu{\mathcal{U}}^{\dagger} D^\mu\mathcal{U})&=\operatorname{\mathrm{tr}}\big[\partial_\mu{\mathcal{U}}^{\dagger}\partial^\mu\mathcal{U}-\mathrm{i} A^{\mathrm{L}}_\mu(\mathcal{U}\partial^\mu{\mathcal{U}}^{\dagger}-\partial^\mu\mathcal{U}{\mathcal{U}}^{\dagger})\\ &\qquad \ -\mathrm{i} A^{\mathrm{R}}_\mu({\mathcal{U}}^{\dagger}\partial^\mu\mathcal{U}-\partial^\mu{\mathcal{U}}^{\dagger}\mathcal{U})\\ &\qquad \ -2\mathcal{U} A^{\mathrm{R}}_\mu{\mathcal{U}}^{\dagger} A^{\mathrm{L}\mu}+A^{\mathrm{L}}_\mu A^{\mathrm{L}\mu}+A^{\mathrm{R}}_\mu A^{\mathrm{R}\mu}\big]\;. \end{aligned} $$
(9.22)
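The expansion (9.22) is a purely algebraic identity and can be verified numerically. The following Python sketch does so for \(n_{\mathrm {f}}=2\) and a single Lorentz component, using random matrices. The helper functions and the random inputs are assumptions made for illustration only; the covariant derivative is taken from (9.11).

```python
import math
import random

random.seed(0)

# Minimal 2x2 complex-matrix helpers based on nested lists.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def dag(X):
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def tr(X):
    return X[0][0] + X[1][1]

def su2_algebra():
    """Random traceless Hermitian matrix a.tau."""
    a1, a2, a3 = (random.uniform(-1, 1) for _ in range(3))
    return [[a3, a1 - 1j * a2], [a1 + 1j * a2, -a3]]

def su2_group():
    """Random SU(2) matrix exp(i a.tau) = cos|a| + i sin|a| (a.tau)/|a|."""
    H = su2_algebra()
    norm = math.sqrt((tr(mul(H, H)) / 2).real)
    one = [[1, 0], [0, 1]]
    return add(scale(math.cos(norm), one), scale(1j * math.sin(norm) / norm, H))

U = su2_group()
dU = mul(scale(1j, U), su2_algebra())  # tangent vector, dU = i U H
AL, AR = su2_algebra(), su2_algebra()  # Hermitian background fields
Ud, dUd = dag(U), dag(dU)

# Left-hand side of (9.22): tr(DU^dagger DU) with DU from (9.11).
DU = add(dU, scale(-1j, mul(AL, U)), scale(1j, mul(U, AR)))
lhs = tr(mul(dag(DU), DU))

# Right-hand side of (9.22), term by term.
rhs = tr(add(
    mul(dUd, dU),
    scale(-1j, mul(AL, add(mul(U, dUd), scale(-1, mul(dU, Ud))))),
    scale(-1j, mul(AR, add(mul(Ud, dU), scale(-1, mul(dUd, U))))),
    scale(-2, mul(mul(U, AR), mul(Ud, AL))),
    mul(AL, AL),
    mul(AR, AR),
))
print(abs(lhs - rhs))  # expected to be of the order of rounding errors
```

The check only relies on the unitarity of \(\mathcal {U}\) and the Hermiticity of the gauge fields, so it does not depend on the particular random matrices drawn.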

Knowing the explicit dependence on the background fields is, among other things, a useful starting point for deriving the Noether currents of chiral symmetry. I will however focus on illustrating the implications of some particular choices of actual, physical background fields.

Example 9.2

As explained in Example 8.7, a chemical potential parameterizing the statistical equilibrium of a many-body system can be introduced in the Lagrangian as a constant temporal gauge field. Thus, the effects of nonzero density of isospin, that is the diagonal generator of \(\mathrm {SU}(2)_{\mathrm {V}}\), can be captured by setting

$$\displaystyle \begin{aligned} A^{\mathrm{L}}_\mu=A^{\mathrm{R}}_\mu=\delta_{\mu0}\mu_{\mathrm{I}}\frac{\tau _3}2\;, \end{aligned} $$
(9.23)

where \(\mu _{\mathrm {I}}\) is the isospin chemical potential. The actual statistical ground state is found by minimizing the Hamiltonian of ChPT with respect to \(\mathcal {U}\). A detailed analysis shows that the ground state can be represented by a real orthogonal matrix,

$$\displaystyle \begin{aligned} \langle{\mathcal{U}}\rangle =\begin{pmatrix} \cos\alpha & \sin\alpha\\ -\sin\alpha & \cos\alpha \end{pmatrix}\;. {} \end{aligned} $$
(9.24)

For \(\left \lvert {\mu _{\mathrm {I}}}\right \rvert \leq m_\pi \), the value of the angle \(\alpha \) minimizing energy is \(\alpha =0\), implying \(\langle {\mathcal {U}}\rangle =\mathbb {1}\). This is the usual QCD vacuum. On the other hand, for \(\left \lvert {\mu _{\mathrm {I}}}\right \rvert \geq m_\pi \), the ground state corresponds to \(\cos \alpha =m_\pi ^2/\mu _{\mathrm {I}}^2\). This state describes Bose–Einstein condensation of charged pions. The condensate carries nonzero isospin density obtained as minus the derivative of the Hamiltonian density with respect to \(\mu _{\mathrm {I}}\),

$$\displaystyle \begin{aligned} n_{\mathrm{I}}=f_\pi^2\mu_{\mathrm{I}}\sin^2\alpha=f_\pi^2\mu_{\mathrm{I}}\left(1-\frac{m_\pi^4}{\mu_{\mathrm{I}}^4}\right)\;. \end{aligned} $$
(9.25)

Next, let us look at the spectrum of excitations above the ground state (9.24). This can be extracted by expanding the Lagrangian (9.21) to second order in fluctuations around (9.24). In the vacuum phase (\(\left \lvert {\mu _{\mathrm {I}}}\right \rvert <m_\pi \)), the neutral pion maintains its relativistic dispersion relation, \(E_{\pi ^0}(\boldsymbol p)=\sqrt {\boldsymbol p^2+m_\pi ^2}\). The energy of the charged pions is, on the other hand, trivially shifted by the chemical potential, \(E_{\pi ^\pm }(\boldsymbol p)=\sqrt {\boldsymbol p^2+m_\pi ^2}\mp \mu _{\mathrm {I}}\). In the pion condensation phase (\(\left \lvert {\mu _{\mathrm {I}}}\right \rvert >m_\pi \)), the dispersion of the neutral pion changes to

$$\displaystyle \begin{aligned} E_{\pi^0}(\boldsymbol p)=\sqrt{\boldsymbol p^2+\mu_{\mathrm{I}}^2}\;. \end{aligned} $$
(9.26)

This is a relativistic-looking dispersion, except that the “mass” equals \(\left \lvert {\mu _{\mathrm {I}}}\right \rvert \). That is not a coincidence. In the pion condensation phase, the neutral pion mode can be interpreted as a massive NG boson of the isospin \(\mathrm {SU}(2)_{\mathrm {V}}\) symmetry; see Sect. 6.4.3 and [8] for details. In the charged pion sector, it is no longer possible to distinguish isospin (or electric charge) eigenstates as a result of SSB. There are two excitation branches that are mixtures of \(\pi ^+\) and \(\pi ^-\), and their squared energies are

$$\displaystyle \begin{aligned} E_\pm(\boldsymbol p)^2=\boldsymbol p^2+\frac{\mu_{\mathrm{I}}^2}2(1+3\cos^2\alpha)\pm\frac{\mu_{\mathrm{I}}}2\sqrt{(1+3\cos^2\alpha)^2\mu_{\mathrm{I}}^2+16\boldsymbol p^2\cos^2\alpha}\;. \end{aligned} $$
(9.27)

The lower of the two branches is gapless, \(E_-(\mathbf 0)=0\). This is the NG boson of the spontaneously broken isospin symmetry. Further discussion of meson condensates in QCD can be found for instance in the pedagogical review [9].
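The formulas of this example are easy to explore numerically. The Python sketch below transcribes the ground-state angle, the isospin density (9.25) and the dispersion relations (9.26) and (9.27). The function names and the sample values \(m_\pi =140\,\mathrm {MeV}\) and \(f_\pi =92\,\mathrm {MeV}\) are assumptions chosen for illustration.

```python
import math


def cos_alpha(mu_I, m_pi):
    """Ground-state angle: alpha = 0 in the vacuum phase,
    cos(alpha) = m_pi^2 / mu_I^2 in the pion condensation phase."""
    if abs(mu_I) <= m_pi:
        return 1.0
    return m_pi**2 / mu_I**2


def isospin_density(mu_I, m_pi, f_pi):
    """Isospin density (9.25); vanishes identically in the vacuum phase."""
    c = cos_alpha(mu_I, m_pi)
    return f_pi**2 * mu_I * (1.0 - c**2)


def neutral_pion_energy(p, mu_I, m_pi):
    """Neutral pion: gap m_pi in the vacuum phase, |mu_I| as in (9.26)
    in the condensed phase."""
    gap = max(abs(mu_I), m_pi)
    return math.sqrt(p**2 + gap**2)


def charged_branches(p, mu_I, m_pi):
    """(E_-, E_+) from (9.27), intended for |mu_I| >= m_pi.
    abs(mu_I) is used so that E_- is the lower branch for either sign."""
    c2 = cos_alpha(mu_I, m_pi) ** 2
    a = 0.5 * mu_I**2 * (1.0 + 3.0 * c2)
    b = 0.5 * abs(mu_I) * math.sqrt(
        (1.0 + 3.0 * c2) ** 2 * mu_I**2 + 16.0 * p**2 * c2
    )
    e_minus_sq = max(p**2 + a - b, 0.0)  # guard against rounding errors
    return math.sqrt(e_minus_sq), math.sqrt(p**2 + a + b)


m_pi, f_pi = 140.0, 92.0  # MeV, assumed sample values
for mu_I in (100.0, 140.0, 200.0):
    print(mu_I, isospin_density(mu_I, m_pi, f_pi),
          charged_branches(0.0, mu_I, m_pi))
```

At \(\left \lvert {\mu _{\mathrm {I}}}\right \rvert =m_\pi \), the two branches of (9.27) at zero momentum reduce to 0 and \(2m_\pi \), which matches the vacuum-phase energies of the charged pions at the phase boundary.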

QCD alone does not encompass all of particle physics. Hadrons can also interact via the weak and electromagnetic forces. ChPT makes it easy to couple pseudoscalar mesons to the electroweak sector of the Standard Model (see, for instance, Sect. 20.2 of [10] for an overview). Indeed, we can imitate the coupling of quarks to the electroweak gauge bosons by setting

$$\displaystyle \begin{aligned} A^{\mathrm{L}}_\mu=\frac g2\boldsymbol{\tau}\cdot\boldsymbol{A}_\mu+\frac{g'}6B_\mu\;,\qquad A^{\mathrm{R}}_\mu=g'QB_\mu\;. \end{aligned} $$
(9.28)

Here \(\boldsymbol A_\mu \) is a triplet of potentials of the weak isospin gauge group, \(\mathrm {SU}(2)_{\mathrm {I}}\). Similarly, \(B_\mu \) is the potential of the hypercharge gauge group, \(\mathrm {U}(1)_{\mathrm {Y}}\). The corresponding gauge couplings are \(g,g'\). Finally, Q is the matrix of electric charges of the u and d quarks,

$$\displaystyle \begin{aligned} Q=\begin{pmatrix} 2/3 & 0\\ 0 & -1/3 \end{pmatrix}=\frac 16\mathbb{1}+\frac{1}{2}\tau _3\;. {} \end{aligned} $$
(9.29)

A complete set of LO electroweak interactions of pions is then obtained by inserting the above definitions into (9.22).

Example 9.3

A glance at (9.22) shows that adding the electroweak gauge sector leads to nontrivial effects even in the ground state of ChPT, \(\langle {\mathcal {U}}\rangle =\mathbb {1}\). Namely, the last three terms in (9.22) generate a mass term for the electroweak gauge bosons,

$$\displaystyle \begin{aligned} \frac{f_\pi^2}4\operatorname{\mathrm{tr}}\big(A^{\mathrm{L}}_\mu A^{\mathrm{L}\mu}+A^{\mathrm{R}}_\mu A^{\mathrm{R}\mu}-2A^{\mathrm{L}}_\mu A^{\mathrm{R}\mu}\big)&=\frac{f_\pi^2}4\operatorname{\mathrm{tr}}\big[(A^{\mathrm{L}}_\mu-A^{\mathrm{R}}_\mu)(A^{\mathrm{L}\mu}-A^{\mathrm{R}\mu})\big]\\ &=\frac{f_\pi^2}8\big[(gA^1_\mu)^2+(gA^2_\mu)^2\\ &\qquad \quad +(gA^3_\mu-g'B_\mu)^2\big]\\ &=\frac{f_\pi^2}8\big[2g^2W_\mu^+W^{-\mu}+(g^2+g^{\prime 2})Z_\mu Z^\mu\big]\;. \end{aligned} $$
(9.30)

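Here, I introduced the charged gauge boson fields by \(W^\pm_\mu\equiv(A^1_\mu\mp\mathrm{i}A^2_\mu)/\sqrt 2\), and the neutral weak gauge boson via \(Z_\mu\equiv A^3_\mu\cos\theta_{\mathrm{W}}-B_\mu\sin\theta_{\mathrm{W}}\). Finally, the Weinberg angle \(\theta _{\mathrm {W}}\) is related to the gauge couplings by \(\tan\theta_{\mathrm{W}}=g'/g\). Thus, spontaneous breaking of chiral symmetry in QCD leads to the following contributions to the masses of the electroweak gauge bosons,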

$$\displaystyle \begin{aligned} m_W^2=\frac 14f_\pi^2g^2\;,\qquad m_Z^2=\frac 14f_\pi^2(g^2+g^{\prime 2})\;. \end{aligned} $$
(9.31)

Given the characteristic scale of QCD, encoded in the value of \(f_\pi \) (fixed precisely below), these contributions are tiny. However, the idea that the gauge boson masses might be generated by strong dynamics that spontaneously breaks chiral symmetry is intriguing. It lies behind the “technicolor” scenario of dynamical electroweak symmetry breaking. In this scenario, the Higgs boson is not elementary, but rather a composite bound state of constituent “techniquarks.” The technicolor analog of \(f_\pi \) is expected to be of the order of the electroweak scale, that is a few hundred \(\mathrm {GeV}\). See [11] for a pedagogical introduction to technicolor models.
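To get a feeling for how small these contributions are, the following Python sketch evaluates (9.31) numerically. It is only an illustration under stated assumptions: the measured W-boson mass and \(\sin^2\theta_{\mathrm{W}}\approx 0.231\) are external inputs not given in the text, the standard relation \(\tan\theta_{\mathrm{W}}=g'/g\) is used, the rounded \(f_\pi\approx 91\,\mathrm{MeV}\) is borrowed from (9.35), and the helper name `gauge_boson_masses` is hypothetical.

```python
import math

# Inputs. Only G_F and the rounded f_pi appear in the text; the rest are
# assumed, approximate reference values used for illustration.
G_F = 1.166e-5          # Fermi constant in GeV^-2, see (9.33)
M_W_MEASURED = 80.4     # measured W mass in GeV (assumed external input)
SIN2_THETA_W = 0.231    # sin^2 of the Weinberg angle (assumed external input)
F_PI = 0.091            # pion decay constant in GeV, rounded value from (9.35)

# Invert (9.33), G_F = g^2 / (4 sqrt(2) m_W^2), to estimate the weak coupling g.
g = math.sqrt(4.0 * math.sqrt(2.0) * G_F * M_W_MEASURED**2)
# Standard relation tan(theta_W) = g'/g gives the hypercharge coupling.
g_prime = g * math.sqrt(SIN2_THETA_W / (1.0 - SIN2_THETA_W))


def gauge_boson_masses(f, g, g_prime):
    """Masses implied by (9.31): m_W = f g / 2, m_Z = f sqrt(g^2 + g'^2) / 2."""
    m_w = 0.5 * f * g
    m_z = 0.5 * f * math.sqrt(g**2 + g_prime**2)
    return m_w, m_z


m_w_qcd, m_z_qcd = gauge_boson_masses(F_PI, g, g_prime)
# Decay constant that would be needed to generate the full measured W mass.
f_required = 2.0 * M_W_MEASURED / g

print(f"g = {g:.3f}, g' = {g_prime:.3f}")
print(f"QCD contribution: m_W = {1e3 * m_w_qcd:.1f} MeV, m_Z = {1e3 * m_z_qcd:.1f} MeV")
print(f"decay constant needed for the measured m_W: {f_required:.0f} GeV")
```

With these inputs, the QCD contribution to \(m_W\) comes out at roughly \(30\,\mathrm{MeV}\), whereas reproducing the measured mass would require a decay constant of about \(246\,\mathrm{GeV}\).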

The physical value of \(f_\pi \) can be fixed by likewise utilizing the coupling of pions to the electroweak sector of the Standard Model. All we need is a single observable that does not depend on any other as yet unknown parameter. A suitable candidate is the leptonic decay of the charged pion.

Example 9.4

The charged pion \(\pi ^+\) decays with the probability of \(99.988\%\) into an antimuon \(\mu ^+\) and a muon neutrino \(\nu _\mu \). The leading perturbative contribution to the amplitude for this decay is shown in Fig. 9.2. The conversion of an on-shell pion into a virtual W-boson is described by our Lagrangian (9.21). Indeed, by expanding it to the first order in both \(W^\pm _\mu \) and \(\pi^\pm\), we find a bilinear term that couples \(W^\pm_\mu\) to \(\partial^\mu\pi^\mp\) with a strength proportional to \(gf_\pi\). The subsequent decay of the virtual W-boson into a lepton pair is governed by the “charged-current” interaction of the Standard Model, specifically the operator that couples the W-boson to the left-handed lepton current \(\bar\nu_\mu\gamma^\lambda(1-\gamma_5)\mu\) and its Hermitian conjugate. I used an obvious notation for the spinor fields representing the leptons. Putting all the pieces together, the invariant amplitude for the process becomes

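$$\displaystyle \begin{aligned} \mathcal{A}_{\pi^+\to\mu^++\nu_\mu}=f_\pi G_{\mathrm{F}}k_\mu\bar u(p)\gamma^\mu(1-\gamma_5)v(q)\;. {} \end{aligned} $$
(9.32)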

Here \(k^\mu ,p^\mu ,q^\mu \) are respectively the four-momenta of the pion, muon neutrino and antimuon. Also, \(u(p)\) and \(v(q)\) are the corresponding Dirac spinors; polarization indices are suppressed for clarity. Finally, \(G_{\mathrm {F}}\) is the Fermi coupling constant,

$$\displaystyle \begin{aligned} G_{\mathrm{F}}=\frac{g^2}{4\sqrt{2} m_W^2}\approx1.166\times10^{-5}\,\mathrm{GeV}^{-2}\;. \end{aligned} $$
(9.33)

Upon squaring the amplitude and summing over polarizations of the particles in the final state, the integrated decay rate in the rest frame of the pion is found to be

$$\displaystyle \begin{aligned} \Gamma_{\pi^+\to\mu^++\nu_\mu}=\frac{f_\pi^2G_{\mathrm{F}}^2}{4\pi}\frac{m_\mu^2(m_\pi^2-m_\mu^2)^2}{m_\pi^3}\;. {} \end{aligned} $$
(9.34)

I have treated the neutrino as a massless particle. The masses of the pion and antimuon are, respectively, \(m_\pi \approx 139.6\,\mathrm {MeV}\) and \(m_\mu \approx 105.7\,\mathrm {MeV}\). Finally, we need an input on the lifetime of the charged pion, \(\tau \approx 2.60\times 10^{-8}\,\mathrm {s}\). This converts to the total decay rate of \(\Gamma \approx 2.53\times 10^{-8}\,\mathrm {eV}\). At the end of the day, we get an estimate for the pion decay constant,

$$\displaystyle \begin{aligned} f_\pi\approx91\,\mathrm{MeV}\;. \end{aligned} $$
(9.35)

Fig. 9.2 Feynman diagram for the leading (tree-level) contribution to charged pion decay. The \(\pi ^+\)–\(W^+\) coupling is provided by ChPT, whereas the interaction vertex between \(W^+\) and the charged lepton current follows from the Standard Model of electroweak interactions
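The arithmetic of Example 9.4 can be retraced with a few lines of Python. This is a minimal sketch, assuming the standard value \(\hbar\approx 6.582\times 10^{-16}\,\mathrm{eV\,s}\) for the unit conversion (not quoted in the text); the function name `rate_over_fpi_squared` is hypothetical. Like the example itself, it identifies the total decay rate with the muonic one.

```python
import math

HBAR_EV_S = 6.582e-16      # reduced Planck constant in eV s (assumed standard value)
G_F_MEV = 1.166e-5 * 1e-6  # Fermi constant (9.33): GeV^-2 converted to MeV^-2
M_PI = 139.6               # charged pion mass in MeV
M_MU = 105.7               # muon mass in MeV
TAU_PI = 2.60e-8           # charged pion lifetime in s


def rate_over_fpi_squared(m_pi, m_lepton, g_f):
    """Right-hand side of (9.34) divided by f_pi^2, in MeV^-1."""
    return g_f**2 * m_lepton**2 * (m_pi**2 - m_lepton**2) ** 2 / (4.0 * math.pi * m_pi**3)


gamma_ev = HBAR_EV_S / TAU_PI   # total decay rate in eV
gamma_mev = gamma_ev * 1e-6     # same rate in MeV
f_pi = math.sqrt(gamma_mev / rate_over_fpi_squared(M_PI, M_MU, G_F_MEV))

print(f"Gamma = {gamma_ev:.3e} eV")
print(f"f_pi  = {f_pi:.1f} MeV")
```

The output reproduces \(\Gamma \approx 2.53\times 10^{-8}\,\mathrm{eV}\) and a decay constant close to \(91\,\mathrm{MeV}\).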

The result (9.34) gives us more than a good estimate of the total decay rate of the charged pion. Namely, leptons from different families have identical weak interactions. Upon replacing \(m_\mu \) with the electron mass \(m_e\), (9.34) therefore also gives us the decay rate for the process \(\pi ^+\to e^++\nu _e\). This is more conveniently expressed in terms of the branching ratio,

$$\displaystyle \begin{aligned} R_{\pi^+\to e^++\nu_e}\approx\frac{\Gamma_{\pi^+\to e^++\nu_e}}{\Gamma_{\pi^+\to\mu^++\nu_\mu}}=\frac{m_e^2}{m_\mu^2}\frac{(m_\pi^2-m_e^2)^2}{(m_\pi^2-m_\mu^2)^2}\approx1.28\times10^{-4}\;. \end{aligned} $$
(9.36)

This is very close to the experimental value, which is about \(1.23\times 10^{-4}\) [1]. The suppression of the electron decay channel compared to the muon one is purely kinematical. By angular momentum conservation, one of the leptons in the final state must be left-handed and one right-handed. Yet, the W-boson only couples to left-handed fermion fields. The combination of these two effects requires a helicity flip and is responsible for the proportionality of (9.34) to the lepton mass squared.
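The ratio (9.36) is easy to check numerically. The sketch below assumes the electron mass \(m_e\approx 0.511\,\mathrm{MeV}\) (not quoted in the text) and splits the result into the factor \(m_e^2/m_\mu^2\), which stems from the proportionality of (9.34) to the lepton mass squared, and the remaining mass-dependent factor. The function name `leptonic_ratio` is hypothetical.

```python
M_PI = 139.6   # charged pion mass in MeV
M_MU = 105.7   # muon mass in MeV
M_E = 0.511    # electron mass in MeV (assumed input, not quoted in the text)


def leptonic_ratio(m_pi, m_light, m_heavy):
    """Ratio of two leptonic decay rates following from (9.34), as in (9.36)."""
    mass_factor = (m_light / m_heavy) ** 2
    remaining_factor = ((m_pi**2 - m_light**2) / (m_pi**2 - m_heavy**2)) ** 2
    return mass_factor * remaining_factor, mass_factor, remaining_factor


ratio, mass_factor, remaining_factor = leptonic_ratio(M_PI, M_E, M_MU)

print(f"m_e^2 / m_mu^2   = {mass_factor:.3e}")
print(f"remaining factor = {remaining_factor:.2f}")
print(f"ratio (9.36)     = {ratio:.3e}")
```

The factor \(m_e^2/m_\mu^2\approx 2.3\times 10^{-5}\) is solely responsible for the suppression; the remaining factor of about \(5.5\) actually favors the electron channel.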

1.4 Effects of the Chiral Anomaly

So far I have tacitly assumed, following Sect. 8.2, that the effective action of ChPT is gauge-invariant in the presence of the background fields. This allowed us to construct strictly gauge-invariant Lagrangians at LO (9.14) and NLO (9.19) of the derivative expansion of ChPT. Are there any contributions to the ChPT Lagrangian that are merely quasi-invariant? A detailed derivation of such contributions and their coupling to background gauge fields would require a differential-geometric approach akin to that of Sect. 8.1. The problem of finding all such Wess–Zumino (WZ), or Wess–Zumino–Witten, terms was studied thoroughly in the 1980s and 1990s. For a discussion close in spirit to this book, I refer the reader to [12, 13]. A pedagogical account of the method including explicit expressions for quasi-invariant Lagrangians for a broad class of coset spaces can be found in [14].

Here I will resort to a trick, which gives the right answer in the case of \(n_{\mathrm {f}}=2\) quark flavors. Suppose we were able to construct a current \(J^\mu \), conserved off-shell, that is without imposing the equation of motion (EoM) for the NG fields in our EFT. If in addition the EFT includes an Abelian gauge field \(A_\mu \), then the operator \(A_\mu J^\mu \) is quasi-invariant and can be added to the Lagrangian density. It remains to guess what \(J^\mu \) and \(A_\mu \) might be within ChPT. The current is the tricky bit. For the moment, I will simply write it down; a partial a posteriori justification will be offered below:

$$\displaystyle \begin{aligned} \begin{aligned} J^\mu_{\mathrm{GW}}=\lambda\varepsilon^{\mu\nu\alpha\beta}\operatorname{\mathrm{tr}}\bigg\{&(\mathcal{U} D_\nu{\mathcal{U}}^{\dagger})(\mathcal{U} D_\alpha{\mathcal{U}}^{\dagger})(\mathcal{U} D_\beta{\mathcal{U}}^{\dagger})\\ &-\frac{3\mathrm{i}}2\big[(D_\nu\mathcal{U}{\mathcal{U}}^{\dagger})F^{\mathrm{L}}_{\alpha\beta}-(D_\nu{\mathcal{U}}^{\dagger}\mathcal{U})F^{\mathrm{R}}_{\alpha\beta}\big] \bigg\}\;. \end{aligned} {} \end{aligned} $$
(9.37)

This is the Goldstone–Wilczek (GW) current; the overall factor \(\lambda \) is in principle arbitrary. In the absence of the background fields, the GW current would be manifestly conserved thanks to the antisymmetry of the LC tensor and cyclicity of trace. With the background in place, it is a nontrivial but straightforward exercise to show that

$$\displaystyle \begin{aligned} \partial_\mu J^\mu_{\mathrm{GW}}=\frac{3\lambda}4\varepsilon^{\mu\nu\alpha\beta}\operatorname{\mathrm{tr}}\big(-F^{\mathrm{L}}_{\mu\nu}F^{\mathrm{L}}_{\alpha\beta}+F^{\mathrm{R}}_{\mu\nu}F^{\mathrm{R}}_{\alpha\beta}\big)\;. {} \end{aligned} $$
(9.38)

Contrary to what was promised, the current is not conserved, except for backgrounds that are purely vector-like, \(A^{\mathrm {L}}_\mu =A^{\mathrm {R}}_\mu \). Luckily, this is not a problem. In fact, the nonconservation of the GW current turns out to be exactly what is needed to implement correctly the microscopic physics of QCD within ChPT.

To that end, recall that the flavor symmetry of QCD has a single \(\mathrm {U}(1)\) factor, namely the baryon number \(\mathrm {U}(1)_{\mathrm {B}}\). This can also be coupled to a background gauge field, \(A^{\mathrm {B}}_\mu \), even if such a field may not have an obvious experimental realization. The presence of a coupling to \(A^{\mathrm {B}}_\mu \) in the ChPT action implies that it is possible to create baryon number solely out of meson fields. This intriguing possibility was first proposed by Skyrme in the 1960s. In the presence of the chiral background fields \(A^{\mathrm {L,R}}_\mu \), baryon number is not conserved due to the chiral anomaly. An explicit calculation (see Sect. 22.3 of [15]) shows that this anomaly is reproduced at the ChPT level by (9.38) if we set \(\lambda =-1/(24\pi ^2)\). This brings us to the final expression for the WZ term in the ChPT Lagrangian for \(n_{\mathrm {f}}=2\) quark flavors,

$$\displaystyle \begin{aligned} \begin{aligned} \mathcal{L}_{\mathrm{WZ}}^{(4)}=-\frac 1{24\pi^2}\varepsilon^{\mu\nu\alpha\beta}A^{\mathrm{B}}_\mu\operatorname{\mathrm{tr}}\bigg\{&(\mathcal{U} D_\nu{\mathcal{U}}^{\dagger})(\mathcal{U} D_\alpha{\mathcal{U}}^{\dagger})(\mathcal{U} D_\beta{\mathcal{U}}^{\dagger})\\ &-\frac{3\mathrm{i}}2\big[(D_\nu\mathcal{U}{\mathcal{U}}^{\dagger})F^{\mathrm{L}}_{\alpha\beta}-(D_\nu{\mathcal{U}}^{\dagger}\mathcal{U})F^{\mathrm{R}}_{\alpha\beta}\big] \bigg\}\;. \end{aligned} {} \end{aligned} $$
(9.39)

The superscript \({ }^{(4)}\) indicates that the WZ term enters at the NLO of the derivative expansion of ChPT. Note also that it does not come with an arbitrary coupling. The normalization of the WZ term is fixed by anomaly matching and does not receive any radiative corrections.

The WZ term (9.39) is manifestly invariant under background gauge transformations from the \(\mathrm {SU}(2)_{\mathrm {L}}\times \mathrm {SU}(2)_{\mathrm {R}}\) chiral group. On the contrary, under a background \(\mathrm {U}(1)_{\mathrm {B}}\) transformation, \(A^{\mathrm {B}}_\mu \to A^{\mathrm {B}}_\mu +\partial _\mu \epsilon ^{\mathrm {B}}\), the corresponding WZ action changes by

$$\displaystyle \begin{aligned} \updelta S_{\mathrm{WZ}}=-\int\mathrm{d}^4x\,\epsilon^{\mathrm{B}}(x)\partial_\mu J^\mu_{\mathrm{GW}}(x)\;, {} \end{aligned} $$
(9.40)

with the divergence of \(J^\mu _{\mathrm {GW}}\) given by (9.38). Watch the interplay of two symmetries: the variation of the action under \(\mathrm {U}(1)_{\mathrm {B}}\) is proportional to the background fields for \(\mathrm {SU}(2)_{\mathrm {L}}\times \mathrm {SU}(2)_{\mathrm {R}}\). What we have here is an example of a mixed ’t Hooft anomaly. This should be contrasted to the naive \(\mathrm {U}(1)_{\mathrm {A}}\) axial symmetry of QCD. The divergence of the axial current receives a contribution from the dynamical gluon fields, giving an example of an Adler–Bell–Jackiw anomaly. This kind of anomaly fundamentally invalidates a would-be classical symmetry of a quantum system. On the other hand, a symmetry exhibiting a ’t Hooft anomaly still implies exact relations (Ward identities) for the generating functional of the theory. This makes ’t Hooft anomalies a powerful tool for constraining low-energy EFTs, as I have illustrated here.

The construction of the WZ term (9.39) is not a mere academic exercise, as one might suspect from the presence of the “baryon number gauge field.” The term has measurable consequences for the electromagnetic interactions of pions. To see why, recall (9.29), which shows that electric charge does not belong to the chiral Lie algebra \(\mathfrak {su}(2)_{\mathrm {L}}\times \mathfrak {su}(2)_{\mathrm {R}}\) due to not being traceless. Interactions of pions with an external electromagnetic field (with all other background fields switched off) can then be generated by setting

$$\displaystyle \begin{aligned} A^{\mathrm{L}}_\mu=A^{\mathrm{R}}_\mu=\frac e2\tau _3A^Q_\mu\;,\qquad A^{\mathrm{B}}_\mu=\frac e2A^Q_\mu\;. \end{aligned} $$
(9.41)

Here \(A^Q_\mu \) is the electromagnetic gauge potential and e the electromagnetic coupling. The effects of interaction with the electromagnetic field via the WZ term are most striking in case of the neutral pion.

Example 9.5

Let us keep only the electromagnetic background field and the neutral pion \(\pi ^0\). The charged pions are discarded by using the replacement \(\mathcal {U}\to \exp (\mathrm{i} \pi ^0\tau _3/f_\pi )\). Upon simple integration by parts, the whole WZ term (9.39) then boils down to

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{WZ}}^{(4)}\to-\frac{e^2}{32\pi^2f_\pi}\pi^0\varepsilon^{\mu\nu\alpha\beta}F^Q_{\mu\nu}F^Q_{\alpha\beta}\;. \end{aligned} $$
(9.42)

This operator governs the electromagnetic decay of \(\pi ^0\) into a pair of photons. Denoting the four-momentum of the pion as \(k^\mu \) and those of the photons as \(p^\mu ,q^\mu \), the invariant amplitude for the decay turns out to be

$$\displaystyle \begin{aligned} \mathcal{A}_{\pi^0\to\gamma+\gamma}=-\frac{e^2}{4\pi^2f_\pi}\varepsilon^{\mu\nu\alpha\beta}p_\mu\epsilon^*_\nu(p)q_\alpha\epsilon^*_\beta(q)\;. \end{aligned} $$
(9.43)

Here \(\epsilon ^*_\mu (p)\) and \(\epsilon ^*_\mu (q)\) are the polarization vectors of the photons in the final state. It remains to take the square and sum over polarizations of the photons. The final result for the decay rate in the rest frame of the pion is

$$\displaystyle \begin{aligned} \Gamma_{\pi^0\to\gamma+\gamma}=\frac{\alpha^2m_\pi^3}{64\pi^3f_\pi^2}\;, \end{aligned} $$
(9.44)

where \(\alpha \equiv e^2/(4\pi )\) is the fine structure constant. Using the numerical input \(m_\pi \approx 135.0\,\mathrm {MeV}\) and \(\alpha \approx 7.297\times 10^{-3}\) along with the value for \(f_\pi \) found in Example 9.4, our final result is \(\Gamma_{\pi^0\to\gamma+\gamma}\approx 8.0\,\mathrm{eV}\). This is less than \(3\%\) off the experimental value of \(7.81\,\mathrm {eV}\) [1].
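The numerical evaluation of (9.44) can be sketched as follows. The inputs are those quoted in the text, with the rounded \(f_\pi\approx 91\,\mathrm{MeV}\) from (9.35) assumed for the decay constant; the function name `neutral_pion_width` is hypothetical.

```python
import math

ALPHA = 7.297e-3         # fine structure constant
M_PI0 = 135.0            # neutral pion mass in MeV
F_PI = 91.0              # pion decay constant in MeV, rounded value from (9.35)
GAMMA_MEASURED_EV = 7.81 # experimental width in eV, as quoted in the text


def neutral_pion_width(alpha, m_pi, f_pi):
    """Decay rate (9.44) for pi0 -> gamma gamma, in the units of m_pi."""
    return alpha**2 * m_pi**3 / (64.0 * math.pi**3 * f_pi**2)


gamma_ev = neutral_pion_width(ALPHA, M_PI0, F_PI) * 1e6  # MeV -> eV
deviation = gamma_ev / GAMMA_MEASURED_EV - 1.0

print(f"Gamma(pi0 -> 2 gamma) = {gamma_ev:.2f} eV")
print(f"relative deviation from the measured value: {100.0 * deviation:.1f} %")
```

Since the rate scales as \(1/f_\pi^2\), a change of \(f_\pi\) by half a percent shifts the prediction by about one percent, which is comparable to the quoted deviation from experiment.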

The two-flavor WZ term (9.39) has remarkable physical consequences, yet vanishes by construction in the absence of external fields. A new twist in the story comes for \(n_{\mathrm {f}}=3\) quark flavors. Here another WZ term appears, which remains nonzero even in the absence of background fields. This governs scattering processes with an odd number of mesons, such as \(K^++K^-\to \pi ^++\pi ^-+\pi ^0\), which would otherwise be forbidden in ChPT. The mathematical origin of this WZ term parallels that of the quasi-invariant Lagrangians with one time derivative, analyzed in Sect. 8.1. The Lagrangian density of the WZ term can be mapped to a 4-form \(\omega _{\mathrm {WZ}}\) such that the 5-form \(\mathrm{d} \omega _{\mathrm {WZ}}\) is chirally invariant and closed but not exact. Such 5-forms are classified by the fifth de Rham cohomology group of the coset space. The coset space \([\mathrm {SU}(3)_{\mathrm {L}}\times \mathrm {SU}(3)_{\mathrm {R}}]/\mathrm {SU}(3)_{\mathrm {V}}\) has a unique generator of degree-5 cohomology,

$$\displaystyle \begin{aligned} \mathrm{d}\omega_{\mathrm{WZ}}\propto\operatorname{\mathrm{tr}}\big[(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\big]\;. \end{aligned} $$
(9.45)

The overall normalization is again fixed by matching to the flavor anomalies of QCD. More details about the geometric nature of this WZ term and its coupling to background gauge fields can be found in the original work of Witten [16].

The two-flavor coset space \([\mathrm {SU}(2)_{\mathrm {L}}\times \mathrm {SU}(2)_{\mathrm {R}}]/\mathrm {SU}(2)_{\mathrm {V}}\), being three-dimensional, obviously has vanishing fifth cohomology group. However, it does have a nontrivial third de Rham cohomology with a single generator,

$$\displaystyle \begin{aligned} \omega_{\mathrm{GW}}\propto\operatorname{\mathrm{tr}}\big[(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\wedge(\mathcal{U}^{-1}\mathrm{d}\mathcal{U})\big]\;. \end{aligned} $$
(9.46)

When pulled back to the four-dimensional Minkowski spacetime, \(\omega _{\mathrm {GW}}\) is just the Hodge dual of the GW current (9.37) in the absence of external fields. This explains why such an identically conserved current exists in the first place. Moreover, the integral of \(\omega _{\mathrm {GW}}\) over \(\mathbb {R}^3\) defines a conserved charge that is a topological invariant. Upon a suitable normalization, it coincides with the Brouwer degree of the pion fields viewed as a map \(\mathcal{U}:\mathbb {R}^3\to \mathrm {SU}(2)\simeq S^3\); cf. Example A.25. A skyrmion is a configuration of pion fields for which the topological charge is nonvanishing. Thanks to the coupling to \(A^{\mathrm {B}}_\mu \) in (9.39), the topological charge has the interpretation as baryon number. This provides a mathematical foundation for the Skyrme model of baryons.

2 Spin Waves in Ferro- and Antiferromagnets

I have already used ferromagnets repeatedly to illustrate various features of SSB, including the peculiarities of the spectrum of NG bosons in nonrelativistic systems. In order to make the present section self-contained, I will however start with a concise summary of the basic facts.

Ferro- and antiferromagnets are phases of matter that exhibit spin order. Although such order may also be induced in relativistic matter, I will implicitly have in mind its realization in ordinary crystalline solids. The advantage of this restriction is that in the nonrelativistic limit, spin can be treated as an internal degree of freedom. One can then base the construction of EFT for (anti)ferromagnets on spontaneous breakdown of the internal \(G\simeq \mathrm {SU}(2)\) spin symmetry. In both types of systems, the unbroken subgroup is \(H\simeq \mathrm {U}(1)\), corresponding to spin rotations about the axis of spin alignment. The coset space is therefore \(G/H\simeq \mathrm {SU}(2)/\mathrm {U}(1)\simeq S^2\). From the point of view of low-energy EFT, the only difference between ferro- and antiferromagnets is a nonzero VEV of spin in the former. This is directly reflected by the spectrum of NG bosons: spin waves, or magnons. In ferromagnets, there is a single type-B magnon, whereas antiferromagnets feature two type-A magnons.

The generators of \(G\simeq \mathrm {SU}(2)\) can be taken as \(\tau _A/2\). Without loss of generality, we may choose the spin axes so that \(H\simeq \mathrm {U}(1)\) is generated by \(\tau _3/2\). The coset space \(\mathrm {SU}(2)/\mathrm {U}(1)\) is then symmetric thanks to the inner automorphism \(\mathcal {R}(g)=R^{-1}gR\) with \(g\in \mathrm {SU}(2)\) and \(R=\mathrm{i} \tau _3\). This makes it possible to map the coset representative \(U(\pi )\) on a unit-vector variable \(\boldsymbol n(\pi )\in S^2\) via

$$\displaystyle \begin{aligned} \boldsymbol{\tau}\cdot\boldsymbol{n}(\pi)\equiv N(\pi)=U(\pi)^2\tau _3=U(\pi)\tau _3 U(\pi)^{-1}\;. {} \end{aligned} $$
(9.47)
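The map (9.47) can be made concrete with a small numerical sketch. It assumes the exponential parameterization \(U(\pi )=\exp (\mathrm{i} \pi ^a\tau _a/2)\) with \(a=1,2\) and uses the closed form \(\exp (\mathrm{i} \theta \,\boldsymbol m\cdot \boldsymbol \tau /2)=\cos (\theta /2)+\mathrm{i} \sin (\theta /2)\,\boldsymbol m\cdot \boldsymbol \tau \) for a unit vector \(\boldsymbol m\). The helper names `coset_representative` and `unit_vector` are hypothetical.

```python
import math

# Pauli matrices as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def coset_representative(pi1, pi2):
    """U(pi) = exp(i pi^a tau_a / 2) with a = 1, 2 (assumed parameterization)."""
    theta = math.hypot(pi1, pi2)
    if theta == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    m1, m2 = pi1 / theta, pi2 / theta
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    # m.tau has only off-diagonal entries m1 -/+ i m2.
    return [[c, 1j * s * (m1 - 1j * m2)], [1j * s * (m1 + 1j * m2), c]]


def unit_vector(pi1, pi2):
    """Components n_A = tr(tau_A N) / 2 of N = U tau_3 U^{-1}, see (9.47)."""
    u = coset_representative(pi1, pi2)
    u_inv = coset_representative(-pi1, -pi2)  # U(pi)^{-1} = U(-pi)
    n_matrix = matmul(matmul(u, TAU[2]), u_inv)
    return [0.5 * trace(matmul(t, n_matrix)).real for t in TAU]


n = unit_vector(0.3, -0.4)
print("n =", [round(c, 6) for c in n])
print("|n|^2 =", round(sum(c * c for c in n), 12))
```

For this parameterization, the third component of \(\boldsymbol n\) equals \(\cos |\pi |\), so that \(\pi ^a=0\) corresponds to the reference direction \((0,0,1)\).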

The matrix field \(N(\pi )\) transforms linearly in the adjoint representation of G. As a consequence, the G-covariant derivative (9.8) boils down to

$$\displaystyle \begin{aligned} D\boldsymbol n(\pi)=\mathrm{d}\boldsymbol n(\pi)+\boldsymbol A\times\boldsymbol n(\pi)\;, \end{aligned} $$
(9.48)

where \(\boldsymbol A_\mu \) is a triplet of background gauge fields of \(\mathrm {SU}(2)\). The covariant derivative \(D_\mu \boldsymbol n(\pi )\) is the basic building block for construction of EFT for (anti)ferromagnets.

In addition to the internal spin symmetry, I will assume invariance under continuous spacetime translations and continuous spatial rotations. This is of course just a crude idealization of real materials, where such ideal symmetry may be explicitly broken by a variety of perturbations. These include especially the anisotropy induced by the underlying crystal lattice, and the effects of spin–orbit coupling. I will nevertheless initially assume the ideal, unperturbed limit. An outline of some phenomenological consequences of explicit symmetry breaking is deferred to Sect. 9.2.3.

2.1 Power Counting and Effective Lagrangian

The general philosophy of the derivative expansion of the EFT for (anti)ferromagnets copies closely that for ChPT, detailed in Sect. 9.1.1. However, the two cases differ substantially due to the qualitatively different spectra of (anti)ferromagnetic magnons. I will start with the more nontrivial, genuinely nonrelativistic case of ferromagnets.

2.1.1 Ferromagnets

The energy of ferromagnetic magnons is quadratic in momentum, at least in the long-wavelength limit. In order to assign to a given Feynman diagram a well-defined degree, we therefore count momentum as order one and energy as order two. The Schrödinger-like propagator of the magnon then has overall degree \(-2\). In close parallel with (9.3), the degree of a Feynman diagram \(\Gamma \) becomes

$$\displaystyle \begin{aligned} \deg\Gamma=(D+1)L-2I+\sum_{s,t}(s+2t)V_{s,t}\;. \end{aligned} $$
(9.49)

Here \(V_{s,t}\) denotes the number of vertices from \(\mathcal {L}_{\mathrm {eff}}^{(s,t)}\), the part of effective Lagrangian with s spatial and t temporal derivatives. As before, the number of propagators I can be eliminated via \(I=L+\sum _{s,t}V_{s,t}-1\), which leads to the final result

$$\displaystyle \begin{aligned} \deg\Gamma=2+(D-1)L+\sum_{s,t}(s+2t-2)V_{s,t}\;. {} \end{aligned} $$
(9.50)

The LO of the derivative expansion, \(\deg \Gamma =2\), corresponds to tree-level diagrams (\(L=0\)) with all vertices satisfying \(s+2t=2\). Thus, the LO effective Lagrangian consists of \(\mathcal {L}_{\mathrm {eff}}^{(2,0)}\) and \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\). Before we construct these, let us briefly consider higher orders of the derivative expansion. First, note that unlike in ChPT, here we have a well-defined derivative expansion even in \(D=2\) spacetime dimensions. Indeed, ferromagnetic order that can be described by a derivatively coupled low-energy EFT exists even in one-dimensional spin chains. This is special to type-B NG bosons, as I will show in Sect. 15.2.

What exactly constitutes the NLO of the derivative expansion depends on the number of dimensions. For \(D=2\) (ferromagnetic chains), the NLO corresponds to \(\deg \Gamma =3\). It collects contributions from one-loop diagrams with all vertices from the LO Lagrangian, and from tree-level diagrams with one vertex of degree three, \(s+2t=3\), such as \(\mathcal {L}_{\mathrm {eff}}^{(3,0)}\), if any such operators exist. (They may be forbidden by parity.) In \(D=4\) dimensions (bulk ferromagnets), on the other hand, there are two options. In case a degree-three operator is allowed, then tree-level diagrams with one such vertex constitute the sole contribution with \(\deg \Gamma =3\). Otherwise, the NLO corresponds to \(\deg \Gamma =4\), and consists of tree-level diagrams with one vertex from \(\mathcal {L}_{\mathrm {eff}}^{(4,0)}\), \(\mathcal {L}_{\mathrm {eff}}^{(2,1)}\), or \(\mathcal {L}_{\mathrm {eff}}^{(0,2)}\). Up to and including NLO, there are no quantum corrections; loops only start contributing at \(\deg \Gamma =5\). Perhaps the most interesting is the case of \(D=3\) (thin ferromagnetic films or layers). Barring the possible existence of degree-three operators, the NLO here is \(\deg \Gamma =4\). It includes both one-loop diagrams with all vertices from the LO Lagrangian and tree-level diagrams with one vertex from the NLO Lagrangian.
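The bookkeeping behind these statements can be automated. The following Python sketch implements (9.50), cross-checks it against (9.49) with the number of propagators eliminated via \(I=L+\sum _{s,t}V_{s,t}-1\), and tabulates the degree of a one-loop diagram built from LO vertices in various dimensions. The function names and the dictionary encoding of vertices are illustrative assumptions.

```python
def degree_ferromagnet(dim, loops, vertices):
    """Degree (9.50); `vertices` maps (s, t) to the number of such vertices."""
    return 2 + (dim - 1) * loops + sum(
        (s + 2 * t - 2) * count for (s, t), count in vertices.items()
    )


def degree_from_propagators(dim, loops, vertices):
    """Degree (9.49), with I fixed by I = L + (number of vertices) - 1."""
    internal = loops + sum(vertices.values()) - 1
    return (dim + 1) * loops - 2 * internal + sum(
        (s + 2 * t) * count for (s, t), count in vertices.items()
    )


lo_vertex = {(2, 0): 1}                   # a single vertex from the LO Lagrangian
tree_with_nlo = {(2, 0): 2, (4, 0): 1}    # tree diagram with one (4,0) vertex

for dim in (2, 3, 4):
    one_loop = degree_ferromagnet(dim, 1, lo_vertex)
    tree = degree_ferromagnet(dim, 0, tree_with_nlo)
    print(f"D = {dim}: one-loop LO diagram has degree {one_loop}, "
          f"tree diagram with a (4,0) vertex has degree {tree}")
```

The one-loop degree grows as \(D+1\), which is why loops compete with the four-derivative tree-level vertices only in \(D=3\).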

Clearly, the setup of the derivative expansion depends very sensitively on the specific choice of material (which affects discrete symmetries such as parity) and sample (which controls the dimension D). I will therefore limit the discussion to the effective Lagrangian at LO. A detailed classification of possible operators up to order four in derivatives, including the effects of the discrete crystal and time-reversal symmetries, can be found in [17].

The \(\mathcal {L}_{\mathrm {eff}}^{(2,0)}\) part of the LO Lagrangian is trivial. According to the general analysis in Sect. 8.2, we expect it to be of the type \(\kappa _{ab}\delta ^{rs}\Omega ^a_r\Omega ^b_s\), where the coupling \(\kappa _{ab}\) must be invariant under the adjoint action of \(H\simeq \mathrm {U}(1)\). The broken part of the gauged MC form (9.7) now takes the specific form

(9.51)

The H-invariant part of the symmetric tensor product of \(\Omega _\perp \) with itself is projected out by taking the trace. This leads immediately to the Lagrangian

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(2,0)}=-\frac{\varrho_{\mathrm{s}}}4\operatorname{\mathrm{tr}}[\boldsymbol DN(\pi)\cdot\boldsymbol DN(\pi)]=-\frac{\varrho_{\mathrm{s}}}2\delta^{rs}D_r\boldsymbol n(\pi)\cdot D_s\boldsymbol n(\pi)\;. {} \end{aligned} $$
(9.52)

The parameter \(\varrho _{\mathrm {s}}\) is called spin stiffness and controls the gradient energy arising from “bending” the uniform ground state magnetization.

In \(d=2\) spatial dimensions, it is also possible to construct an invariant operator using the antisymmetric tensor product, \(\lambda _{ab}\varepsilon ^{rs}\Omega ^a_r\Omega ^b_s\). This leads to the operator \(\varepsilon ^{rs}\boldsymbol n\cdot (D_r\boldsymbol n\times D_s\boldsymbol n)\). In the absence of background fields, the latter is a pure surface term; its integral over \(\mathbb {R}^2\) gives, up to normalization, the Brouwer degree of the map \(\boldsymbol n:\mathbb {R}^2\to S^2\). That is, however, no longer the case when the EFT is coupled to background gauge fields. A quick calculation shows that up to surface terms, \(\varepsilon ^{rs}\boldsymbol n\cdot (D_r\boldsymbol n\times D_s\boldsymbol n)\simeq \varepsilon ^{rs}\boldsymbol {n}\cdot \boldsymbol {F}_{rs}\), where \(\boldsymbol F_{\mu \nu }=\partial _\mu \boldsymbol A_\nu -\partial _\nu \boldsymbol A_\mu +\boldsymbol A_\mu \times \boldsymbol A_\nu \) is the field strength of the background. When added to the Lagrangian, this operator will modify the EoM for spin waves; cf. the discussion of EoM in Sect. 8.3. In the following, I will nevertheless disregard this contribution to the EFT. First, it only exists in \(d=2\) dimensions and moreover violates time reversal, under which \(\boldsymbol n(\boldsymbol x,t)\to -\boldsymbol n(\boldsymbol x,-t)\). Second, it requires a specific, nontrivial background to be nonzero, and thus does not affect the propagation of free magnons.
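The topological nature of this operator in the absence of background fields can be illustrated numerically. The sketch below integrates \(\boldsymbol n\cdot (\partial _x\boldsymbol n\times \partial _y\boldsymbol n)=\tfrac 12\varepsilon ^{rs}\boldsymbol n\cdot (\partial _r\boldsymbol n\times \partial _s\boldsymbol n)\) over a large square, using the normalization \(1/(4\pi )\). The field configuration (inverse stereographic projection), the grid parameters and all function names are assumptions chosen for illustration only.

```python
import math


def n_field(x, y):
    """Illustrative unit-vector field: inverse stereographic projection onto S^2."""
    r2 = x * x + y * y
    return (2.0 * x / (1.0 + r2), 2.0 * y / (1.0 + r2), (1.0 - r2) / (1.0 + r2))


def density(x, y, eps=1e-4):
    """n . (d_x n  x  d_y n), with derivatives from central finite differences."""
    n = n_field(x, y)
    dx = [(a - b) / (2.0 * eps) for a, b in zip(n_field(x + eps, y), n_field(x - eps, y))]
    dy = [(a - b) / (2.0 * eps) for a, b in zip(n_field(x, y + eps), n_field(x, y - eps))]
    cross = (
        dx[1] * dy[2] - dx[2] * dy[1],
        dx[2] * dy[0] - dx[0] * dy[2],
        dx[0] * dy[1] - dx[1] * dy[0],
    )
    return sum(a * b for a, b in zip(n, cross))


def brouwer_degree(half_width=30.0, cells=240):
    """Midpoint-rule integral of the density over a square, divided by 4 pi."""
    h = 2.0 * half_width / cells
    total = 0.0
    for i in range(cells):
        x = -half_width + (i + 0.5) * h
        for j in range(cells):
            y = -half_width + (j + 0.5) * h
            total += density(x, y)
    return total * h * h / (4.0 * math.pi)


winding = brouwer_degree()
print(f"numerical winding number: {winding:.4f}")
```

The result differs from an integer only because the integration domain is finite; for this configuration the density falls off as \(1/r^4\), so the truncation error is small.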

Let us now focus on the \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\) part of the LO Lagrangian. According to the general discussion in Sect. 8.2, this reads

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(0,1)}=-M\omega ^3_a(\pi)\dot\pi^a+M\nu^3_A(\pi)A^A_0\;, {} \end{aligned} $$
(9.53)

where M is the density of spin (magnetization) in the ferromagnetic ground state. The second term in (9.53) is seen to equal \(M\boldsymbol A_0\cdot \boldsymbol n(\pi )\). To evaluate the first term, we project out the third component of the MC form by taking trace with \(\tau _3\). Then we apply the exponential parameterization, \(U(\pi )=\exp (\mathrm{i} \pi ^a\tau _a/2)\), and use (7.31),

$$\displaystyle \begin{aligned} -M\omega ^3_a(\pi)\dot\pi^a&=\mathrm{i} M\operatorname{\mathrm{tr}}\big[\tau _3U(\pi)^{-1}\partial_0U(\pi)\big]\\ &=-\frac M2\dot\pi^a\int_0^1\mathrm{d}\tau\operatorname{\mathrm{tr}}\big[\tau _3U(\tau\pi)^{-1}\tau _aU(\tau\pi)\big]\\ &=-M\dot\pi^a\int_0^1\mathrm{d}\tau\,n_a(\tau\pi)\simeq M\pi^a\int_0^1\mathrm{d}\tau\,\dot n_a(\tau\pi)\;, \end{aligned} $$
(9.54)

where \(\simeq \) indicates equality up to a total derivative. The index a runs over \(1,2\); we can, however, formally extend \(\pi ^a\) to a three-component vector \(\boldsymbol \pi \) and write \(\pi ^a\dot n_a=\boldsymbol \pi \cdot \dot {\boldsymbol n}=(\boldsymbol n\times \boldsymbol \pi )\cdot (\boldsymbol n\times \dot {\boldsymbol n})\). Using the definition (9.47) to take the derivative \(\partial _\tau N(\tau \pi )\), we find that \(\boldsymbol n(\tau \pi )\times \boldsymbol \pi =\partial _\tau \boldsymbol n(\tau \pi )\). Putting all the pieces together, we then arrive at the final expression for the LO effective Lagrangian for ferromagnets,

$$\displaystyle \begin{aligned} \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}={}&M\int_0^1\mathrm{d}\tau\,\partial_\tau\boldsymbol n(\tau\pi)\cdot[\boldsymbol n(\tau\pi)\times\dot{\boldsymbol n}(\tau\pi)]+M\boldsymbol A_0\cdot\boldsymbol n(\pi)\\ &-\frac{\varrho_{\mathrm{s}}}2\delta^{rs}D_r\boldsymbol n(\pi)\cdot D_s\boldsymbol n(\pi)\;. \end{aligned} {} \end{aligned} $$
(9.55)

The second term in (9.55) is the usual Zeeman coupling of spin to an external magnetic field, whereas the third term represents the gradient energy of the spin configuration. The first term, however, deserves a comment, since the presence of integration over the parameter \(\tau \) makes it look nonlocal. To get rid of the integral, let us take a step back. Consider a one-parameter family of fields, \(\boldsymbol n(\tau ,\pi )\), \(\tau \in [0,1]\), such that \(\boldsymbol n(0,\pi )=(0,0,1)\equiv \boldsymbol n_0\) (ground state) and \(\boldsymbol n(1,\pi )=\boldsymbol n(\pi )\). The explicit choice of interpolation used in (9.55) corresponds to \(\boldsymbol n(\tau ,\pi )=\boldsymbol n(\tau \pi )\). It is easy to check that upon a smooth deformation of the field, \(\updelta \boldsymbol n(\tau ,\pi )\), the variation of the action only depends on \(\updelta \boldsymbol n(1,\pi )=\updelta \boldsymbol n(\pi )\). The concrete choice of interpolation between the \(\tau =0\) and \(\tau =1\) limits therefore does not matter. Assuming for simplicity that \(n^3(\boldsymbol x,t)\) as a function on the spacetime is non-negative everywhere, we can change the interpolation to

$$\displaystyle \begin{aligned} \boldsymbol n(\tau,\pi)=\big(\tau n^1(\pi),\tau n^2(\pi),\sqrt{1-\tau^2[(n^1(\pi))^2+(n^2(\pi))^2]}\big)\;, \end{aligned} $$
(9.56)

which makes it possible to carry out the integral over \(\tau \),

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}=-M\frac{\varepsilon_{ab}n^a(\pi)\dot n^b(\pi)}{1+n^3(\pi)}+M\boldsymbol A_0\cdot\boldsymbol n(\pi)-\frac{\varrho_{\mathrm{s}}}2\delta^{rs}D_r\boldsymbol n(\pi)\cdot D_s\boldsymbol n(\pi)\;. {} \end{aligned} $$
(9.57)

This is as far as we can get. The Lagrangian is local and manifestly invariant under \(H\simeq \mathrm {U}(1)\). Moreover, it only depends on the NG fields \(\pi ^a\) through \(\boldsymbol n(\pi )\), and thus does not rely on the exponential parameterization of \(U(\pi )\), originally used to derive (9.55).
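The step from (9.55) to (9.57) can be verified numerically at a single spacetime point. The sketch below sets \(M=1\), evaluates the integrand \(\partial _\tau \boldsymbol n\cdot (\boldsymbol n\times \dot {\boldsymbol n})\) for the interpolation (9.56), integrates it over \(\tau \) with the midpoint rule, and compares the result with the closed form \(-\varepsilon _{ab}n^a\dot n^b/(1+n^3)\). The sample values of \(n^a\) and \(\dot n^a\) are arbitrary, and the function names are hypothetical.

```python
import math


def integrand(tau, n1, n2, dn1, dn2):
    """d_tau n . (n x dot n) for the interpolation (9.56), with n3 >= 0."""
    r2 = n1 * n1 + n2 * n2
    w = math.sqrt(1.0 - tau * tau * r2)       # third component of n(tau, pi)
    n = (tau * n1, tau * n2, w)
    dtau_n = (n1, n2, -tau * r2 / w)          # derivative with respect to tau
    dot_n = (tau * dn1, tau * dn2, -tau * tau * (n1 * dn1 + n2 * dn2) / w)
    cross = (
        n[1] * dot_n[2] - n[2] * dot_n[1],
        n[2] * dot_n[0] - n[0] * dot_n[2],
        n[0] * dot_n[1] - n[1] * dot_n[0],
    )
    return sum(a * b for a, b in zip(dtau_n, cross))


def berry_term_numeric(n1, n2, dn1, dn2, steps=2000):
    """Midpoint-rule integral over tau in [0, 1] of the first term in (9.55)."""
    h = 1.0 / steps
    return h * sum(integrand((k + 0.5) * h, n1, n2, dn1, dn2) for k in range(steps))


def berry_term_closed(n1, n2, dn1, dn2):
    """First term of (9.57) with M = 1 and epsilon_12 = 1."""
    n3 = math.sqrt(1.0 - n1 * n1 - n2 * n2)
    return -(n1 * dn2 - n2 * dn1) / (1.0 + n3)


sample = (0.3, -0.5, 0.7, 0.2)  # arbitrary n^1, n^2 and their time derivatives
numeric = berry_term_numeric(*sample)
closed = berry_term_closed(*sample)
print(f"tau integral: {numeric:.8f}")
print(f"closed form:  {closed:.8f}")
```

The time derivative of \(n^3\) is not an independent input here; it follows from the constraint \(\boldsymbol n\cdot \boldsymbol n=1\).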

2.1.2 Antiferromagnets

As far as the construction of the effective Lagrangian is concerned, antiferromagnets are much simpler than ferromagnets. The spectrum consists of two type-A NG bosons whose energy is, in the long-wavelength limit, linear in momentum. For the sake of power counting, we therefore have to treat spatial and temporal derivatives on equal footing. The resulting expression for the degree of a given Feynman diagram is a trivial generalization of (9.4) we found in ChPT,

$$\displaystyle \begin{aligned} \deg\Gamma=2+(D-2)L+\sum_{s,t}(s+t-2)V_{s,t}\;. \end{aligned} $$
(9.58)

The only difference to ChPT is that spatial and temporal derivatives may enter the effective Lagrangian independently.

Owing to the same symmetry-breaking pattern, the building blocks for constructing the EFT are the same for ferro- and antiferromagnets. The main difference is that the part with a single time derivative, \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\), is missing in the latter; antiferromagnets have zero net magnetization. The LO Lagrangian then consists of \(\mathcal {L}_{\mathrm {eff}}^{(0,2)}\) and \(\mathcal {L}_{\mathrm {eff}}^{(2,0)}\), and we can write it down at once,

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}=\frac{\varrho_{\mathrm{s}}}{2v^2}\big[D_0\boldsymbol n(\pi)\cdot D_0\boldsymbol n(\pi)-v^2\delta^{rs}D_r\boldsymbol n(\pi)\cdot D_s\boldsymbol n(\pi)\big]\;. {} \end{aligned} $$
(9.59)

There are two independent parameters which are easy to relate to physical observables. The spin stiffness \(\varrho _{\mathrm {s}}\) measures the gradient energy of the order parameter, whereas v turns out to be the phase velocity of antiferromagnetic magnons.

Similarly to ferromagnets, the organization of the derivative expansion beyond LO depends sensitively on D and the presence of discrete symmetries such as parity or time reversal. I will therefore stop the discussion of power counting here, and turn to the consequences of the EFT at LO.

2.2 Equation of Motion and Magnon Spectrum

We already know the number and type of magnons in both ferro- and antiferromagnets. However, the EFT tells us more, in particular what the corresponding fluctuations of the order parameter look like. To that end, I will drop the background gauge fields and derive the EoM corresponding to the LO effective Lagrangian.

Let us start with ferromagnets. As hinted above, taking a variation of (9.55) gives a surface term in \(\tau \), which allows one to write the variation of the action solely in terms of \(\updelta \boldsymbol n(\pi )\),

$$\displaystyle \begin{aligned} \updelta S_{\mathrm{eff}}^{\mathrm{LO}}=\int\mathrm{d}^D\!x\,\updelta\boldsymbol n\cdot\big(M\boldsymbol n\times\dot{\boldsymbol n}+\varrho_{\mathrm{s}}\boldsymbol\nabla^2\boldsymbol n\big)\;. \end{aligned} $$
(9.60)

The variation \(\updelta \boldsymbol n\) is not arbitrary but rather should keep \(\boldsymbol n\) on the coset space, that is the unit sphere \(S^2\). In other words, \(\updelta \boldsymbol n\) should be a tangent vector to the sphere. The vanishing of \(\updelta S_{\mathrm {eff}}^{\mathrm {LO}}\) therefore requires that

$$\displaystyle \begin{aligned} M\boldsymbol n\times\dot{\boldsymbol n}+\varrho_{\mathrm{s}}\boldsymbol\nabla^2\boldsymbol n=\lambda\boldsymbol n\;; \end{aligned} $$
(9.61)

\(\lambda \) can be interpreted as a Lagrange multiplier for the constraint \(\boldsymbol {n}\cdot \boldsymbol {n}=1\). We can get rid of it by taking a cross product with \(\boldsymbol n\), which gives the Landau–Lifshitz equation,

$$\displaystyle \begin{aligned} \dot{\boldsymbol n}=\frac{\varrho_{\mathrm{s}}}M\boldsymbol n\times\boldsymbol\nabla^2\boldsymbol n\;. {} \end{aligned} $$
(9.62)
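For completeness, the intermediate step can be spelled out. Taking the cross product of (9.61) with \(\boldsymbol n\) from the left removes the right-hand side, and the double cross product simplifies thanks to \(\boldsymbol n\cdot \boldsymbol n=1\) and its consequence \(\boldsymbol n\cdot \dot {\boldsymbol n}=0\),

$$\displaystyle \begin{aligned} \boldsymbol n\times(\boldsymbol n\times\dot{\boldsymbol n})=\boldsymbol n(\boldsymbol n\cdot\dot{\boldsymbol n})-\dot{\boldsymbol n}(\boldsymbol n\cdot\boldsymbol n)=-\dot{\boldsymbol n}\quad\Rightarrow\quad -M\dot{\boldsymbol n}+\varrho_{\mathrm{s}}\boldsymbol n\times\boldsymbol\nabla^2\boldsymbol n=\boldsymbol 0\;, \end{aligned} $$

which is (9.62) after dividing by M.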

Previously, in Sect. 4.3.1, I derived this equation using the Hamiltonian (symplectic) formulation of field theory from the postulated Poisson bracket for the spin variable \(\boldsymbol n(\boldsymbol x)\); cf. (4.29). The Lagrangian and Hamiltonian descriptions of ferromagnets are of course equivalent. For instance, the symplectic 2-form (4.33) can be recovered by noting that according to (9.53), the symplectic potential is \(-M\omega ^3\). The exterior derivative thereof is easily evaluated using the MC equation (8.11). This shows that the fundamental Poisson bracket for \(\boldsymbol n(\boldsymbol x)\) is already automatically built into the low-energy Lagrangian EFT for ferromagnets.

To solve the Landau–Lifshitz equation (9.62) is a hard problem due to its nonlinearity. What one can do easily is to linearize the equation in small fluctuations around the ground state. Inserting \(\boldsymbol n=\boldsymbol n_0+\updelta \boldsymbol n\) and keeping only terms linear in \(\updelta \boldsymbol n\), we get

$$\displaystyle \begin{aligned} \updelta\dot{\boldsymbol n}=\frac{\varrho_{\mathrm{s}}}M\boldsymbol n_0\times\boldsymbol\nabla^2\updelta\boldsymbol n\;. {} \end{aligned} $$
(9.63)

We can now look for plane-wave solutions by using the ansatz

$$\displaystyle \begin{aligned} \updelta\boldsymbol n(\boldsymbol x,t)=\boldsymbol A\mathrm{e}^{-\mathrm{i} Et}\mathrm{e}^{\mathrm{i}\boldsymbol{p}\cdot\boldsymbol{x}}\;, {} \end{aligned} $$
(9.64)

where \(\boldsymbol A\) is a complex amplitude orthogonal to \(\boldsymbol n_0=(0,0,1)\). Inserting the ansatz in (9.63) shows that the energy and momentum satisfy the dispersion relation

$$\displaystyle \begin{aligned} E(\boldsymbol p)=\frac{\varrho_{\mathrm{s}}}M\boldsymbol p^2\;, {} \end{aligned} $$
(9.65)

typical for type-B NG bosons. The amplitude must satisfy the constraint \(\mathrm{i} \boldsymbol A=\boldsymbol n_0\times \boldsymbol A\). This is solved by any \(\boldsymbol A\propto (1,-\mathrm{i} ,0)\). Ferromagnetic spin waves are circularly polarized in the plane transverse to the direction of the ground state magnetization, \(\boldsymbol n_0\), regardless of the direction of momentum \(\boldsymbol p\). This can be understood as Larmor precession of individual spins around the effective magnetic field generated by the spin-polarized background.
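The statements about the dispersion relation and polarization are easy to check numerically. The following sketch inserts the plane wave (9.64) into the linearized equation (9.63) and evaluates the residual; the helper names and the parameter values are hypothetical and chosen purely for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors with (possibly complex) components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ferromagnet_energy(p_squared, rho_s, M):
    """Dispersion relation (9.65): E = (rho_s / M) p^2."""
    return rho_s / M * p_squared


def linearized_ll_residual(A, E, p_squared, rho_s, M, n0=(0.0, 0.0, 1.0)):
    """Residual of (9.63) for the plane wave (9.64).

    Time and space derivatives act as -iE and -p^2, so (9.63) becomes
    -iE A = -(rho_s / M) p^2 n0 x A; the returned vector is LHS - RHS.
    """
    rhs = cross(n0, A)
    return tuple(-1j * E * a + rho_s / M * p_squared * r for a, r in zip(A, rhs))


# Illustrative parameter values (arbitrary units, not taken from the text).
rho_s, M, p_squared = 2.0, 0.5, 1.5
E = ferromagnet_energy(p_squared, rho_s, M)
A = (1.0, -1j, 0.0)  # circular polarization transverse to n0
print(max(abs(c) for c in linearized_ll_residual(A, E, p_squared, rho_s, M)))
```

The residual vanishes up to rounding for \(\boldsymbol A\propto (1,-\mathrm{i} ,0)\). The opposite helicity, \(\boldsymbol A\propto (1,\mathrm{i} ,0)\), does not solve the equation with positive energy.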

Antiferromagnets can be treated in the same way, without the complications due to the single-time-derivative operator in the Lagrangian. The EoM obtained from (9.59) can be written as

$$\displaystyle \begin{aligned} \frac{\varrho_{\mathrm{s}}}{v^2}\big(\partial_0^2\boldsymbol n-v^2\boldsymbol\nabla^2\boldsymbol n\big)=\lambda\boldsymbol n\;, \end{aligned} $$
(9.66)

where \(\lambda \) is a Lagrange multiplier. Upon linearization around the ground state, \(\boldsymbol n_0\), we find plane-wave solutions of the same general form as in (9.64). However, the dispersion relation is now \(E(\boldsymbol p)=v\left \lvert {\boldsymbol p}\right \rvert \), typical for type-A NG bosons. The complex amplitude \(\boldsymbol A\) must be orthogonal to \(\boldsymbol n_0\). We conclude that antiferromagnetic spin waves are also polarized in the plane transverse to the direction of \(\boldsymbol n_0\), regardless of the direction of momentum. However, the polarization can be both linear and circular, or in general elliptic. There are therefore two different, independent types of antiferromagnetic magnons, which can be chosen to be linearly polarized.
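The linearization can be made explicit as follows. Writing \(\boldsymbol n=\boldsymbol n_0+\updelta \boldsymbol n\) with \(\boldsymbol n_0\cdot \updelta \boldsymbol n=0\) to first order, the left-hand side of (9.66) is transverse to \(\boldsymbol n_0\), whereas the right-hand side is parallel to it at this order. The Lagrange multiplier therefore drops out, and the fluctuation obeys a free wave equation,

$$\displaystyle \begin{aligned} \big(\partial_0^2-v^2\boldsymbol\nabla^2\big)\updelta\boldsymbol n=\boldsymbol 0\quad\Rightarrow\quad \big(-E^2+v^2\boldsymbol p^2\big)\boldsymbol A=\boldsymbol 0\quad\Rightarrow\quad E(\boldsymbol p)=v\left\lvert{\boldsymbol p}\right\rvert \;. \end{aligned} $$

Since no cross product with \(\boldsymbol n_0\) appears, the two transverse components of \(\boldsymbol A\) decouple, which is why any polarization in the transverse plane is allowed.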

2.3 Effects of Symmetry-Breaking Perturbations

So far, we have assumed exact symmetry under \(\mathrm {SU}(2)\) spin rotations, continuous spacetime translations and spatial rotations. Let us now briefly consider the effects of some phenomenologically important perturbations. These can be classified into two broad groups: perturbations controlled by external fields, and those intrinsic to the given material.

The most natural tunable perturbation that the spins in (anti)ferromagnets can be exposed to is an external magnetic field, \(\boldsymbol B\). Insofar as its effect on orbital degrees of freedom can be neglected, the magnetic field couples directly to the conserved charge of \(G\simeq \mathrm {SU}(2)\): the total spin. It can therefore be treated as a vector-valued chemical potential. In the low-energy EFT, this is implemented by setting \(\boldsymbol A_\mu (x)=\delta _{\mu 0}\boldsymbol B(x)\); the magnetic moment of the spins is absorbed into the definition of \(\boldsymbol B\). I will now show that treating \(\boldsymbol B\) as a background gauge field allows us to make some exact statements about the magnon spectrum. Importantly, we do not have to introduce any new arbitrary parameters into the Lagrangian.

Example 9.6

According to (9.55), the effect of an external magnetic field on ferromagnets is taken into account by adding the Zeeman term, \(M\boldsymbol B\cdot \boldsymbol n(\pi )\), to the Lagrangian. As long as the magnetic field is uniform (which I will from now on assume), the ground state \(\boldsymbol n_0\) will remain uniform as well. However, its orientation is no longer arbitrary, but rather has to be aligned parallel to \(\boldsymbol B\). The magnetic field selects a unique stable equilibrium state. The effect on the magnon spectrum is also easy to work out. The left-hand side of the Landau–Lifshitz equation (9.62) has to be modified by replacing \(\dot {\boldsymbol n}\to D_0\boldsymbol n=\dot {\boldsymbol n}+\boldsymbol B\times \boldsymbol n\). Upon linearization, the plane-wave magnon solutions are still found to be circularly polarized in the plane transverse to \(\boldsymbol n_0\parallel \boldsymbol B\). The only effect of the magnetic field is a constant shift of the dispersion relation (9.65),

$$\displaystyle \begin{aligned} E(\boldsymbol p)=\left\lvert{\boldsymbol B}\right\rvert +\frac{\varrho_{\mathrm{s}}}M\boldsymbol p^2\;. {} \end{aligned} $$
(9.67)
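The shift can be verified in one line. Assuming \(\boldsymbol n_0=\boldsymbol B/\left \lvert {\boldsymbol B}\right \rvert \), so that \(\boldsymbol B\times \updelta \boldsymbol n=\left \lvert {\boldsymbol B}\right \rvert \boldsymbol n_0\times \updelta \boldsymbol n\), the linearized equation with the plane-wave ansatz (9.64) becomes

$$\displaystyle \begin{aligned} -\mathrm{i} E\boldsymbol A+\left\lvert{\boldsymbol B}\right\rvert \boldsymbol n_0\times\boldsymbol A=-\frac{\varrho_{\mathrm{s}}}M\boldsymbol p^2\boldsymbol n_0\times\boldsymbol A\quad\Rightarrow\quad \mathrm{i} E\boldsymbol A=\left(\left\lvert{\boldsymbol B}\right\rvert +\frac{\varrho_{\mathrm{s}}}M\boldsymbol p^2\right)\boldsymbol n_0\times\boldsymbol A\;. \end{aligned} $$

With the circular polarization \(\mathrm{i} \boldsymbol A=\boldsymbol n_0\times \boldsymbol A\), this reproduces (9.67).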

The reason why the response of ferromagnets to a uniform magnetic field is so simple is that the conserved charge that \(\boldsymbol B\) couples to remains unbroken. Energy levels can therefore be labeled by the projection of spin onto the direction of \(\boldsymbol B\). The excitation energy of a state with spin S (relative to the ground state) will be shifted by the magnetic field by \(-S\left \lvert {\boldsymbol B}\right \rvert \). The ferromagnetic ground state is maximally polarized, and a single magnon carries a unit of spin less than the ground state. This explains why the magnon receives a gap equal to \(\left \lvert {\boldsymbol B}\right \rvert \). This is an exact result valid to all orders of the derivative expansion; the ferromagnetic magnon is an example of a massive NG boson in the sense of Sect. 6.4.3.

Example 9.7

The effect of magnetic fields on antiferromagnets is somewhat less trivial. According to (9.59), their LO Lagrangian (in the absence of perturbations other than \(\boldsymbol B\)) becomes

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}=\frac{\varrho_{\mathrm{s}}}{2v^2}\big\{[\partial_0\boldsymbol n(\pi)+\boldsymbol B\times\boldsymbol n(\pi)]^2-v^2\delta^{rs}\partial_r\boldsymbol n(\pi)\cdot \partial_s\boldsymbol n(\pi)\big\}\;. \end{aligned} $$
(9.68)

The corresponding Hamiltonian density is

$$\displaystyle \begin{aligned} \mathcal{H}_{\mathrm{eff}}^{\mathrm{LO}}=\frac{\varrho_{\mathrm{s}}}{2v^2}\big\{[\partial_0\boldsymbol n(\pi)]^2-[\boldsymbol B\times\boldsymbol n(\pi)]^2+v^2\delta^{rs}\partial_r\boldsymbol n(\pi)\cdot \partial_s\boldsymbol n(\pi)\big\}\;. \end{aligned} $$
(9.69)

The response of the ground state is quite different from ferromagnets: the energy is minimized by any \(\boldsymbol n_0\perp \boldsymbol B\). Hence the \(\mathrm {U}(1)\) group of spin rotations around the direction of \(\boldsymbol B\) is spontaneously broken. We expect the spectrum to contain one true NG boson, whereas the other magnon should receive a gap from the magnetic field.

To see this explicitly, let us choose \(\boldsymbol B=(0,0,\left \lvert {\boldsymbol B}\right \rvert )\) and \(\boldsymbol n_0=(1,0,0)\). We can use \(n^2\) and \(n^3\) as two independent fluctuations of the order parameter, and parameterize the latter as

$$\displaystyle \begin{aligned} \boldsymbol n(x)=\big(\sqrt{1-[(n^2(x))^2+(n^3(x))^2]},n^2(x),n^3(x)\big)\;. {} \end{aligned} $$
(9.70)

To the second order in the fluctuations and up to a total time derivative, the Lagrangian then reads

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}\simeq\frac{\varrho_{\mathrm{s}}}{2v^2}\bigg\{\sum_{i=2,3}\big[(\partial_0n^i)^2-v^2\boldsymbol\nabla n^i\cdot\boldsymbol\nabla n^i\big]+\boldsymbol B^2\big[1-(n^3)^2\big]\bigg\}+\dotsb\;. \end{aligned} $$
(9.71)

This makes it clear that \(n^2(x)\) remains gapless, as expected. On the other hand, the \(n^3(x)\) mode receives a gap, its full dispersion relation being \(E(\boldsymbol p)=\sqrt {v^2\boldsymbol p^2+\boldsymbol B^2}\). The conclusion that \(E(\mathbf 0)=\left \lvert {\boldsymbol B}\right \rvert \) is exact. The spectrum of an ideal antiferromagnet in a uniform magnetic field contains one true NG boson and one massive NG boson.
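The two branches can be read off from the quadratic Lagrangian (9.71). Varying it with respect to \(n^2\) and \(n^3\) and inserting plane waves gives

$$\displaystyle \begin{aligned} \big(\partial_0^2-v^2\boldsymbol\nabla^2\big)n^2=0\;,\quad \big(\partial_0^2-v^2\boldsymbol\nabla^2+\boldsymbol B^2\big)n^3=0\quad\Rightarrow\quad E_2(\boldsymbol p)=v\left\lvert{\boldsymbol p}\right\rvert \;,\quad E_3(\boldsymbol p)=\sqrt{v^2\boldsymbol p^2+\boldsymbol B^2}\;. \end{aligned} $$

The term \(-\boldsymbol B^2(n^3)^2\) in (9.71) thus acts as a mass term for \(n^3\) only.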

As opposed to the effect of external fields, perturbations induced by the underlying crystal lattice are intrinsic to the given material and therefore cannot be “switched off.” A prominent position among these is occupied by anisotropy in either spatial or spin structure of the microscopic interactions. I will content myself with the simplest illustrative example of such a crystal anisotropy, whereby one spin axis is distinguished from the other two,

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{pert}}=\epsilon(n^3)^2\;. \end{aligned} $$
(9.72)

For \(\epsilon >0\), the perturbation favors spin alignment along the third axis; this kind of anisotropy is called easy-axis. In the opposite case of \(\epsilon <0\), spin alignment along the first or second axis is preferred. This is an easy-plane anisotropy. The effects of easy-axis and easy-plane anisotropy on ferro- and antiferromagnets are very different. It is therefore best to discuss them separately. In each individual case, I will proceed by first identifying the ground state and then expanding the Lagrangian to second order in fluctuations.

Example 9.8

Let us start with ferromagnets. In the absence of background gauge fields but presence of the anisotropy, the Lagrangian (9.57) becomes

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}=-M\frac{\varepsilon_{ab}n^a\dot n^b}{1+n^3}-\frac{\varrho_{\mathrm{s}}}2\delta^{rs}\partial_r\boldsymbol n\cdot\partial_s\boldsymbol n+\epsilon(n^3)^2\;. \end{aligned} $$
(9.73)

In easy-axis ferromagnets, the ground state is unique up to overall sign, \(\boldsymbol n_0=(0,0,1)\). The two independent fluctuations can be taken as \(n^1\) and \(n^2\). Both the anisotropy and the ground state preserve the \(\mathrm {U}(1)\) group of spin rotations around the third axis. We can thus look for normal modes as eigenstates of this symmetry. This motivates the introduction of a complex field, \(\psi \equiv (n^1+\mathrm{i} n^2)/\sqrt {2}\). Dropping the energy density of the ground state and expanding the Lagrangian to second order in \(\psi \), we get

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}\simeq\mathrm{i} M{\psi}^{\dagger}\partial_0\psi-\varrho_{\mathrm{s}}\boldsymbol\nabla{\psi}^{\dagger}\cdot\boldsymbol\nabla\psi-2\epsilon{\psi}^{\dagger}\psi+\dotsb\;. \end{aligned} $$
(9.74)

This leads to a Schrödinger-like equation, describing circularly polarized spin waves with dispersion relation

$$\displaystyle \begin{aligned} E(\boldsymbol p)=\frac{2\epsilon}M+\frac{\varrho_{\mathrm{s}}}M\boldsymbol p^2\;. {} \end{aligned} $$
(9.75)
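Explicitly, varying (9.74) with respect to \({\psi }^{\dagger }\) and inserting a plane wave \(\psi \propto \mathrm{e} ^{-\mathrm{i} Et}\mathrm{e} ^{\mathrm{i} \boldsymbol {p}\cdot \boldsymbol {x}}\) gives

$$\displaystyle \begin{aligned} \mathrm{i} M\partial_0\psi+\varrho_{\mathrm{s}}\boldsymbol\nabla^2\psi-2\epsilon\psi=0\quad\Rightarrow\quad ME-\varrho_{\mathrm{s}}\boldsymbol p^2-2\epsilon=0\;. \end{aligned} $$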

The anisotropy gives the magnon a gap since both generators of \(\mathrm {SU}(2)\), spontaneously broken in the preferred ground state, are also broken explicitly. In contrast to (9.67), the gap predicted by (9.75), \(E(\mathbf 0)=2\epsilon /M\), is only a LO result and will receive corrections at higher orders of the derivative expansion. The same is true for all the other magnon dispersion relations, derived below. The magnon has become a pseudo-NG boson.

In the easy-plane case, any uniform state with \(\langle {n^3}\rangle =0\) minimizes the energy. I will choose the ground state as \(\boldsymbol n_0=(1,0,0)\) and the independent fluctuations as \(n^2\) and \(n^3\). Upon series expansion in the latter using the parameterization (9.70), the Lagrangian becomes

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}\simeq Mn^3\partial_0n^2-\frac{\varrho_{\mathrm{s}}}2\sum_{i=2,3}\boldsymbol\nabla n^i\cdot\boldsymbol\nabla n^i-\left\lvert{\epsilon}\right\rvert (n^3)^2+\dotsb\;. \end{aligned} $$
(9.76)

We already found the spectrum of this type of Lagrangian back in Sect. 3.2.1; I can therefore just write down the final result,

$$\displaystyle \begin{aligned} E(\boldsymbol p)=\frac{\varrho_{\mathrm{s}}}M\left\lvert{\boldsymbol p}\right\rvert \sqrt{\boldsymbol p^2+\frac{2\left\lvert{\epsilon}\right\rvert }{\varrho_{\mathrm{s}}}}\;. \end{aligned} $$
(9.77)
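A reader who does not want to look up Sect. 3.2.1 can check (9.77) directly against the linearized EoM that follow from (9.76). The sketch below does this for a single plane wave; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def easy_plane_energy(p, rho_s, M, eps):
    """Closed-form dispersion relation (9.77); p is the magnitude of momentum."""
    return rho_s / M * abs(p) * math.sqrt(p**2 + 2 * abs(eps) / rho_s)


def eom_residuals(p, E, rho_s, M, eps):
    """Residuals of the two linearized EoM of (9.76) for a plane wave (p != 0).

    With n^i = a_i exp(-iEt + i p.x), varying (9.76) gives
        n^3:  -iE M a_2 - (rho_s p^2 + 2|eps|) a_3 = 0,
        n^2:   iE M a_3 - rho_s p^2 a_2            = 0.
    We fix a_2 = 1, solve the second equation for a_3, and return both residuals.
    """
    a2 = 1.0
    a3 = rho_s * p**2 * a2 / (1j * E * M)
    r3 = -1j * E * M * a2 - (rho_s * p**2 + 2 * abs(eps)) * a3
    r2 = 1j * E * M * a3 - rho_s * p**2 * a2
    return r3, r2


# Illustrative values in arbitrary units (not taken from the text).
p, rho_s, M, eps = 0.3, 1.0, 1.0, -0.5
E = easy_plane_energy(p, rho_s, M, eps)
print([abs(r) for r in eom_residuals(p, E, rho_s, M, eps)])
```

Both residuals vanish up to rounding only if E satisfies (9.77). The relative factor of \(\mathrm{i} \) between the amplitudes of \(n^2\) and \(n^3\) reflects the fact that the two fields are canonically conjugate to each other.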

The dispersion relation remains gapless but has become linear. This is because the \(\mathrm {U}(1)\) group of spin rotations, left intact by the anisotropy, is now spontaneously broken. The anisotropy has turned the magnon into a type-A NG boson.

Example 9.9

Next we turn to antiferromagnets, which are now for a change much easier to analyze. In the absence of background gauge fields but upon adding the anisotropy term, the Lagrangian (9.59) turns into

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{\mathrm{LO}}=\frac{\varrho_{\mathrm{s}}}{2v^2}\big[(\partial_0\boldsymbol n)^2-v^2\delta^{rs}\partial_r\boldsymbol n\cdot\partial_s\boldsymbol n\big]+\epsilon(n^3)^2\;. \end{aligned} $$
(9.78)

This Lagrangian is diagonal in the spin index of \(n^i\). We can therefore read off the spectrum immediately upon identification of the ground state and its independent fluctuations. In easy-axis antiferromagnets, the ground state is \(\boldsymbol n_0=(0,0,1)\) up to a sign, and its fluctuations are \(n^1\) and \(n^2\). These excite two gapped magnons,

$$\displaystyle \begin{aligned} E_{1,2}(\boldsymbol p)=\sqrt{v^2\boldsymbol p^2+\frac{2v^2\epsilon}{\varrho_{\mathrm{s}}}}\;. \end{aligned} $$
(9.79)

We can choose the basis of independent spin waves freely, either as linear, circular, or generally elliptic. In the easy-plane case, on the other hand, we can choose the ground state as \(\boldsymbol n_0=(1,0,0)\) and its fluctuations as \(n^2\) and \(n^3\). Here we find two different excitation branches with different dispersion relations, corresponding to linearly polarized spin waves,

$$\displaystyle \begin{aligned} E_2(\boldsymbol p)=v\left\lvert{\boldsymbol p}\right\rvert \;,\qquad E_3(\boldsymbol p)=\sqrt{v^2\boldsymbol p^2+\frac{2v^2\left\lvert{\epsilon}\right\rvert }{\varrho_{\mathrm{s}}}}\;. \end{aligned} $$
(9.80)
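The two cases can be collected in a single helper, which may be handy for plotting. This is a minimal sketch; the function name `afm_anisotropy_branches` is my own, and the momentum enters only through its magnitude.

```python
import math


def afm_anisotropy_branches(p, v, rho_s, eps):
    """Magnon energies of an antiferromagnet with the anisotropy (9.72).

    p is the magnitude of momentum. For eps > 0 (easy axis) both branches
    are gapped, cf. (9.79); for eps < 0 (easy plane) one branch stays
    gapless, cf. (9.80).
    """
    gapped = math.sqrt(v**2 * p**2 + 2 * v**2 * abs(eps) / rho_s)
    if eps > 0:
        return gapped, gapped
    return v * abs(p), gapped


# Illustrative values in arbitrary units (not taken from the text).
print(afm_anisotropy_branches(p=0.0, v=2.0, rho_s=1.0, eps=0.5))   # easy axis
print(afm_anisotropy_branches(p=0.0, v=2.0, rho_s=1.0, eps=-0.5))  # easy plane
```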

The reason why one of the excitations remains gapless is that there is an exact \(\mathrm {U}(1)\) symmetry, left intact by the perturbation, that is spontaneously broken.

I will conclude the discussion of perturbations in spin systems with a very peculiar example that leads to fascinating phenomenology. Until now, I have ruled out the existence of \(\mathcal {L}_{\mathrm {eff}}^{(1,0)}\) based on invariance under spatial rotations. However, one can, in fact, construct an operator with a single spatial derivative that does not break rotations, as long as two conditions are satisfied. First, parity must be broken, typically by the structure of the underlying crystal lattice. Second, spin–orbit coupling must be taken into account. This breaks the separate symmetries under spatial (orbital) and spin rotations to a single \(\mathrm {SU}(2)\), under which spatial coordinates \(\boldsymbol x\) and the spin vector \(\boldsymbol n\) transform simultaneously. One can then add to the Lagrangian of (anti)ferromagnets the Dzyaloshinskii–Moriya (DM) term,

$$\displaystyle \begin{aligned} \mathcal{L}_{\mathrm{eff}}^{(1,0)}=-\frac{2\pi\varrho_{\mathrm{s}}}{\lambda_{\mathrm{DM}}}\boldsymbol n\cdot(\boldsymbol\nabla\times\boldsymbol n)\;, {} \end{aligned} $$
(9.81)

where \(\lambda _{\mathrm {DM}}\) is a new parameter with the dimension of length. This length scale is fixed by the choice of concrete material, and is usually much larger than the scale of the underlying crystal lattice. For instance, in MnSi one finds \(\lambda _{\mathrm {DM}}\approx 18\,\mathrm {nm}\) and in FeGe \(\lambda _{\mathrm {DM}}\approx 70\,\mathrm {nm}\) [18]. This justifies treating the perturbation coupling \(2\pi \varrho _{\mathrm {s}}/\lambda _{\mathrm {DM}}\) as a small parameter of degree one in the power counting. The Lagrangian \(\mathcal {L}_{\mathrm {eff}}^{(1,0)}\) can then be consistently included in the LO of the derivative expansion.

Example 9.10

The ground state induced by the DM interaction can be discussed jointly for ferro- and antiferromagnets if one restricts to time-independent spin configurations. In the absence of external gauge fields and anisotropy, the LO Hamiltonian then reduces to

$$\displaystyle \begin{aligned} \mathcal{H}_{\mathrm{eff}}^{\mathrm{LO}}=\frac{\varrho_{\mathrm{s}}}2\delta^{rs}\partial_r\boldsymbol n\cdot\partial_s\boldsymbol n+\frac{2\pi\varrho_{\mathrm{s}}}{\lambda_{\mathrm{DM}}}\boldsymbol n\cdot(\boldsymbol\nabla\times\boldsymbol n)\;. \end{aligned} $$
(9.82)

The condition for minimum energy follows by “completing the square,”

$$\displaystyle \begin{aligned} \begin{aligned} \mathcal{H}_{\mathrm{eff}}^{\mathrm{LO}}&\simeq\frac{\varrho_{\mathrm{s}}}2\big[(\boldsymbol{\nabla}\cdot\boldsymbol{n})^2+(\boldsymbol\nabla\times\boldsymbol n)^2\big]+\frac{2\pi\varrho_{\mathrm{s}}}{\lambda_{\mathrm{DM}}}\boldsymbol n\cdot(\boldsymbol\nabla\times\boldsymbol n)\\ &=\frac{\varrho_{\mathrm{s}}}2(\boldsymbol{\nabla}\cdot\boldsymbol{n})^2+\frac{\varrho_{\mathrm{s}}}2\left(\boldsymbol\nabla\times\boldsymbol n+\frac{2\pi}{\lambda_{\mathrm{DM}}}\boldsymbol n\right)^2-\frac{2\pi^2\varrho_{\mathrm{s}}}{\lambda_{\mathrm{DM}}^2}\;. \end{aligned} \end{aligned} $$
(9.83)

The second term will be minimized by any spin configuration satisfying the first-order differential equation \(\boldsymbol \nabla \times \boldsymbol n=-(2\pi /\lambda _{\mathrm {DM}})\boldsymbol n\). Since the divergence of a curl vanishes, any such configuration has \(\boldsymbol {\nabla }\cdot \boldsymbol {n}=0\). This automatically minimizes the first term, \((\boldsymbol {\nabla }\cdot \boldsymbol {n})^2\), and thus the whole Hamiltonian.

Let us now temporarily forget that \(\boldsymbol n(\boldsymbol x)\) should be a unit vector at any \(\boldsymbol x\), and Fourier-transform it to momentum space. The complex amplitude \(\boldsymbol n_{\boldsymbol p}\) of the plane wave with momentum \(\boldsymbol p\) is then subject to the condition \(\mathrm{i} \boldsymbol p\times \boldsymbol n_{\boldsymbol p}=-(2\pi /\lambda _{\mathrm {DM}})\boldsymbol n_{\boldsymbol p}\). This requires that \(\boldsymbol p\perp \boldsymbol n_{\boldsymbol p}\) and \(\left \lvert {\boldsymbol p}\right \rvert =2\pi /\lambda _{\mathrm {DM}}\) for any \(\boldsymbol p\) such that \(\boldsymbol n_{\boldsymbol p}\) is nonzero. This gives the parameter \(\lambda _{\mathrm {DM}}\) the interpretation as the wavelength of a spatially inhomogeneous order, induced by the DM interaction. An arbitrary linear combination of plane waves with fixed \(\left \lvert {\boldsymbol p}\right \rvert \) is, however, not allowed by the constraint \(\boldsymbol {n}\cdot \boldsymbol {n}=1\). Without going into further technical details, let me spell out the final result. The ground state corresponds to a real helix that is right-handed for \(\lambda _{\mathrm {DM}}>0\) and left-handed for \(\lambda _{\mathrm {DM}}<0\). The direction of the axis of the helix is arbitrary and spontaneously breaks the symmetry under spatial and spin rotations. Choosing it for illustration along the z-axis, and \(\boldsymbol n\) to point along the x-axis at \(z=0\), the ground state becomes

$$\displaystyle \begin{aligned} \langle{\boldsymbol n(\boldsymbol x)}\rangle =(\cos2\pi z/\lambda_{\mathrm{DM}},\sin2\pi z/\lambda_{\mathrm{DM}},0)\;. \end{aligned} $$
(9.84)

In the presence of the DM interaction, the spin system becomes a helimagnet.
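It is straightforward to confirm numerically that the helix (9.84) satisfies \(\boldsymbol \nabla \times \boldsymbol n=-(2\pi /\lambda _{\mathrm {DM}})\boldsymbol n\) and saturates the lower bound \(-2\pi ^2\varrho _{\mathrm {s}}/\lambda _{\mathrm {DM}}^2\) on the energy density implied by (9.83). The sketch below uses central finite differences; the helper names, the step size and the sample values are assumptions made for illustration only.

```python
import math


def helix(z, lam_dm):
    """Helical ground state (9.84) with the axis along z."""
    k = 2 * math.pi / lam_dm
    return (math.cos(k * z), math.sin(k * z), 0.0)


def helix_curl(z, lam_dm, h=1e-5):
    """Curl of a field that depends only on z: (-dn_y/dz, dn_x/dz, 0).

    The z-derivatives are approximated by central finite differences.
    """
    n_plus, n_minus = helix(z + h, lam_dm), helix(z - h, lam_dm)
    dnx = (n_plus[0] - n_minus[0]) / (2 * h)
    dny = (n_plus[1] - n_minus[1]) / (2 * h)
    return (-dny, dnx, 0.0)


def energy_density(z, lam_dm, rho_s, h=1e-5):
    """Hamiltonian density (9.82) evaluated on the helix."""
    n = helix(z, lam_dm)
    curl = helix_curl(z, lam_dm, h)
    grad_sq = curl[0] ** 2 + curl[1] ** 2  # equals (dn/dz)^2 for this field
    n_dot_curl = sum(a * b for a, b in zip(n, curl))
    return rho_s / 2 * grad_sq + 2 * math.pi * rho_s / lam_dm * n_dot_curl


lam_dm, rho_s, z = 18.0, 1.0, 3.7  # illustrative values in arbitrary units
k = 2 * math.pi / lam_dm
print(helix_curl(z, lam_dm), tuple(-k * c for c in helix(z, lam_dm)))
print(energy_density(z, lam_dm, rho_s), -2 * math.pi**2 * rho_s / lam_dm**2)
```

Each printed pair agrees up to the finite-difference error, independently of the chosen z.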

The magnon spectrum of helimagnets is no less fascinating than their ground state. In case the underlying spin order is ferromagnetic, there is a single magnon branch (cf. Example 6.5). It has an unusual, strongly anisotropic dispersion relation. Along the axis of the helix, the energy is linear in momentum for wavelengths much longer than \(\lambda _{\mathrm {DM}}\), whereas in the two transverse directions it is quadratic [19]. One can show in addition that in the former case, the spin wave is linearly polarized, while in the latter case it is polarized circularly as usual in ferromagnets.

2.4 Some Topological Aspects of Ferromagnets

The list of interesting features of spin systems does not end with the multitude of textures that can be induced by various perturbations. Ferromagnets in particular exhibit a number of intriguing topological properties that can be traced to the \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\) part of the effective Lagrangian. A brief sample is presented below. A reader wishing to learn more about the topology of spin systems is encouraged to consult [20].

Let us start by rewriting the single-time-derivative part of the ferromagnet Lagrangian (9.55) in a way that exposes its geometric nature. Suppose the field \(\boldsymbol n(\boldsymbol x,t)\) converges to a constant for \(t\to \pm \infty \); this is a usual assumption when setting up the variational principle for fields. Viewed as a function of \(\tau \) and t, \(\boldsymbol n\) then maps \([0,1]\times \mathbb {R}\) to a “disk” D on \(S^2\) whose boundary \(\Gamma \) carries the physical field (\(\tau =1\)). The action defined by (9.55) can be cast as

$$\displaystyle \begin{aligned} S_{\mathrm{eff}}^{\mathrm{LO}}\{\boldsymbol n,\boldsymbol A\}=M\int\mathrm{d}^d\!\boldsymbol x\int_D\Omega[\boldsymbol n](\boldsymbol x,t,\tau)+\dotsb\;, {} \end{aligned} $$
(9.85)

where \(\Omega [\boldsymbol n]\equiv \boldsymbol n\cdot (\partial _t\boldsymbol n\times \partial _\tau \boldsymbol n)\mathrm{d} t\wedge \mathrm{d} \tau \); the ellipsis stands for the spacetime integral of the second and third term in (9.55). Note that \(\boldsymbol n\cdot (\mathrm{d} \boldsymbol n\times \mathrm{d} \boldsymbol n)\) is the area form on \(S^2\). The action is therefore determined geometrically by the area of the domain on \(S^2\), bounded by the curve \(\Gamma \); see Fig. 9.3 for a visualization.

Fig. 9.3

Geometric interpretation of the single-time-derivative term in the effective Lagrangian (9.55) for ferromagnets. The spin configuration \( \boldsymbol n( \boldsymbol x,t)\) maps the time axis to a closed curve \(\Gamma \) on the coset space \(S^2\). Manifest invariance of the Lagrangian under \(\mathrm {SU}(2)\) spin rotations can be saved at the cost of extending the integration from \(\Gamma \) to the disk D (shaded area) whose boundary is \(\Gamma \)

But how do we know which domain? I showed already that smoothly varying the interpolation \(\boldsymbol n(\tau ,\pi )\) between the fixed limits at \(\tau =0\) and \(\tau =1\) does not change the action. However, there are distinct classes of maps that cannot be smoothly deformed into each other. We could, for instance, think of the interpolation as filling the domain D shown by shading in Fig. 9.3, or its complement on the sphere. There is no a priori way to distinguish the “inside” and “outside” of the curve \(\Gamma \). The interpolation \(\boldsymbol n(\tau ,\pi )\) could even cover the whole sphere multiple times, before converging to the curve \(\Gamma \) in the limit \(\tau \to 1\). The only way around this intrinsic ambiguity is to ensure that it is not physically observable.

To that end, consider two domains, \(D_1\) and \(D_2\), swept by two different interpolations \(\boldsymbol n_1(\tau ,\pi )\) and \(\boldsymbol n_2(\tau ,\pi )\) of the same physical field, \(\boldsymbol n_1(1,\pi )=\boldsymbol n_2(1,\pi )=\boldsymbol n(\pi )\). We can glue the two maps into a single one, \(\tilde {\boldsymbol n}(\tau ,\pi )\), \(\tau \in [0,2]\), by setting

$$\displaystyle \begin{aligned} \begin{aligned} \tilde{\boldsymbol n}(\tau,\pi)&\equiv\boldsymbol n_1(\tau,\pi)\quad &\text{for }\tau\in[0,1]\;,\\ \tilde{\boldsymbol n}(\tau,\pi)&\equiv\boldsymbol n_2(2-\tau,\pi)\quad &\text{for }\tau\in[1,2]\;. \end{aligned} \end{aligned} $$
(9.86)

This new map satisfies \(\tilde {\boldsymbol n}(0,\pi )=\tilde {\boldsymbol n}(2,\pi )=\boldsymbol n_0\). Thanks to the compactification of the time axis to the circle \(S^1\), we can think of it as a map \(S^2\to S^2\) with the domain spanned by the variables \(\tau ,t\). The points \(\tau =0\) and \(\tau =2\) are the poles of the domain and \(\tau =1\) is the equator. As a consequence,

$$\displaystyle \begin{aligned} \int_{D_1}\Omega[\boldsymbol n_1]-\int_{D_2}\Omega[\boldsymbol n_2]=4\pi w[\tilde{\boldsymbol n}]\;, \end{aligned} $$
(9.87)

where \(w[\tilde {\boldsymbol n}]\) is the integer-valued Brouwer degree (A.145) of \(\tilde {\boldsymbol n}\). The conclusion is that the classical action of a ferromagnet suffers from a topological ambiguity, shifting it by \(4\pi MVw[\tilde {\boldsymbol n}]\) where V is spatial volume. The functional integral of the EFT as a quantum theory will still be well-defined provided the action is only ambiguous up to an integer multiple of \(2\pi \). This requires that the total spin MV be quantized in half-integers. The low-energy EFT for ferromagnets, constructed solely based on the geometry of the coset space, “knows” about quantization of spin!
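The quantization condition is simple enough to illustrate numerically. The sketch below evaluates the phase factor \(\exp (4\pi \mathrm{i} MVw)\) by which the weight \(\mathrm{e} ^{\mathrm{i} S}\) in the functional integral changes under the ambiguity; the function name `ambiguity_phase` and the sample values of the total spin are hypothetical.

```python
import cmath
import math


def ambiguity_phase(total_spin, winding):
    """Phase factor exp(i * 4 pi * M V * w) multiplying exp(iS).

    total_spin -- the product M V
    winding    -- the integer Brouwer degree w
    """
    return cmath.exp(1j * 4 * math.pi * total_spin * winding)


for total_spin in (0.5, 1.0, 1.5, 0.25):
    phase = ambiguity_phase(total_spin, winding=1)
    print(total_spin, abs(phase - 1) < 1e-12)
```

The phase factor is trivial for the three half-integer values, whereas \(MV=1/4\) gives \(-1\) and would render the functional integral ill-defined.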

Now that we have established the topological nature of the Lagrangian, let us look at some of its consequences. The most immediate one is the presence of a Berry phase; see Sect. 1.5 of [21] for a general introduction and [22] for the application to EFTs for NG bosons. Let us denote the quantum-mechanical ground state of a ferromagnet, corresponding to the order parameter \(\langle {\boldsymbol n}\rangle =\boldsymbol n_0\), as \(\left \lvert {\boldsymbol n_0}\right \rangle \). Suppose we expose the ferromagnet to a weak, uniform magnetic field \(\boldsymbol B\). This forces it to align its magnetization with the field so that \(\langle {\boldsymbol n}\rangle \equiv \boldsymbol n(\boldsymbol B)=\boldsymbol B/\left \lvert {\boldsymbol B}\right \rvert \), with the corresponding vacuum state \(\left \lvert {\boldsymbol n(\boldsymbol B)}\right \rangle \).

Consider now a time-dependent magnetic field \(\boldsymbol B(t)\) and arrange the ferromagnet state \(\left \lvert {\psi (0)}\right \rangle \) at \(t=0\) to be the vacuum state \(\left \lvert {\boldsymbol n(\boldsymbol B(0))}\right \rangle \). If the variation of \(\boldsymbol B(t)\) with time is sufficiently slow, the evolution of the state of the system \(\left \lvert {\psi (t)}\right \rangle \) will be adiabatic. We can set the ground state energy E to zero to eliminate the trivial phase factor of \(\mathrm{e} ^{-\mathrm{i} Et}\). Even then, we cannot expect \(\left \lvert {\psi (t)}\right \rangle \) to equal \(\left \lvert {\boldsymbol n(\boldsymbol B(t))}\right \rangle \). Rather,

$$\displaystyle \begin{aligned} \left\lvert{\psi(t)}\right\rangle =\exp[\mathrm{i}\gamma_{\mathrm{B}}(t)]\left\lvert{\boldsymbol n(\boldsymbol B(t))}\right\rangle \;, {} \end{aligned} $$
(9.88)

where \(\gamma _{\mathrm {B}}(t)\) is the Berry phase. Taking the time derivative of (9.88) and projecting the result onto \(\left \lvert {\psi (t)}\right \rangle \) leads to \(\dot \gamma _{\mathrm {B}}(t)=\mathrm{i} \left \langle {\boldsymbol n(\boldsymbol B(t))}\middle \vert {\partial _0}\middle \vert {\boldsymbol n(\boldsymbol B(t))}\right \rangle \). The total phase accumulated over an interval \([t_1,t_2]\), \(\int _{t_1}^{t_2}\mathrm{d} t\,\dot \gamma _{\mathrm {B}}(t)\), only depends on the path followed by \(\boldsymbol B\), not on its precise time dependence. To underline the geometric nature of the Berry phase, we introduce the Berry connection,

$$\displaystyle \begin{aligned} \omega_{\mathrm{B}}(\boldsymbol B)\equiv\mathrm{i}\left\langle{\boldsymbol n(\boldsymbol B)}\middle\vert{\mathrm{d}}\middle\vert{\boldsymbol n(\boldsymbol B)}\right\rangle \;, {} \end{aligned} $$
(9.89)

which is a 1-form on the space of all quantum ground states of the ferromagnet. The gauge freedom associated with the Berry connection amounts to the arbitrary choice of phase of \(\left \lvert {\boldsymbol n(\boldsymbol B)}\right \rangle \) for every value of \(\boldsymbol B\). Since the state \(\left \lvert {\boldsymbol n(\boldsymbol B)}\right \rangle \) only depends on the direction of \(\boldsymbol B\), the Berry connection naturally induces a 1-form on the coset space \(S^2\) of the ferromagnet. When the magnetic field varies so that it traces a closed loop in the parameter space, the total Berry phase is

$$\displaystyle \begin{aligned} \gamma_{\mathrm{B}}(\Gamma)=\int_\Gamma\omega_{\mathrm{B}}=\int_D\mathrm{d}\omega_{\mathrm{B}}\;. {} \end{aligned} $$
(9.90)

Here \(\Gamma \) is the loop on \(S^2\) traced by the order parameter in the process and D the disk bounded by it, as in Fig. 9.3. The 2-form \(\mathrm{d} \omega _{\mathrm {B}}\) is called the Berry curvature.

To see the connection between the Berry phase and the effective Lagrangian for ferromagnets, recall the parameterization of the coset space by matrices \(U(\pi )\). With a slight abuse of notation, one can write \(\left \lvert {\boldsymbol n(\boldsymbol B)}\right \rangle =U(\pi (\boldsymbol B))\left \lvert {\boldsymbol n_0}\right \rangle \). The Berry connection then reads

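$$\displaystyle \begin{aligned} \omega_{\mathrm{B}}=\mathrm{i}\left\langle{\boldsymbol n_0}\middle\vert{U(\pi)^{-1}\mathrm{d} U(\pi)}\middle\vert{\boldsymbol n_0}\right\rangle \;. \end{aligned} $$
(9.91)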

Comparing this to (9.53), we see that up to the factor of volume, the Berry connection is precisely the 1-form that defines the Lagrangian density \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\). On the other hand, it follows from (9.85) and the Stokes theorem that \(-\mathrm{d} \omega ^3\) is the area form on \(S^2\). The Berry phase (9.90) therefore equals the total spin of the system times the area of D.

Mathematically, the area 2-form is the single generator of the second de Rham cohomology group of \(S^2\). The topological features of \(\mathcal {L}_{\mathrm {eff}}^{(0,1)}\) stem from pulling this 2-form back to the space of variables \(\tau ,t\). However, interesting physics also arises from pulling it back to the whole spacetime. For simplicity, I will now restrict to \(d=2\) spatial dimensions. The resulting closed spacetime 2-form is then Hodge-dual to a current, conventionally normalized as

$$\displaystyle \begin{aligned} J^\mu[\boldsymbol n]=\frac 1{8\pi}\varepsilon^{\mu\nu\lambda}\varepsilon_{ijk}n^i\partial_\nu n^j\partial_\lambda n^k\;. {} \end{aligned} $$
(9.92)

This current is conserved off-shell, similarly to the GW current introduced in Sect. 9.1.4. Following the analogy, we expect the density \(J^0[\boldsymbol n]\) to give rise to a topological charge, that is, a topological invariant of the field \(\boldsymbol n(\boldsymbol x,t)\). Indeed, \(\int \mathrm{d}^2\boldsymbol x\,J^0[\boldsymbol n]\) is just the Brouwer degree \(w[\boldsymbol n]\) of the map \(\boldsymbol n:\mathbb {R}^2\to S^2\). In \(d=2\) dimensions, there are smooth spin configurations carrying a nonzero value of \(w[\boldsymbol n]\), called skyrmions, or sometimes baby skyrmions, to distinguish them from their counterparts in QCD. See the dedicated monograph [20] for further details about this fascinating subject.
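
The integer nature of the topological charge can be illustrated numerically. The sketch below is an assumption-laden example rather than a statement about any specific material: it takes \(\varepsilon^{012}=1\), so that the density in (9.92) becomes \(J^0=\boldsymbol n\cdot(\partial_1\boldsymbol n\times\partial_2\boldsymbol n)/(4\pi)\), and it uses an inverse stereographic projection as a hypothetical skyrmion profile. The function names and the grid parameters are chosen purely for illustration.

```python
import numpy as np


def skyrmion_field(x, y, size=1.0):
    """Unit-vector field from an inverse stereographic projection (assumed profile).

    n points along +z at the origin and along -z far away from it.
    """
    r2 = (x**2 + y**2) / size**2
    return np.array([2 * x / size, 2 * y / size, 1 - r2]) / (1 + r2)


def brouwer_degree(n, h):
    """Integrate n . (d_x n x d_y n) / (4 pi) over a square grid with spacing h.

    n has shape (3, N, N); axis 1 is the x direction, axis 2 the y direction.
    """
    dn_dx = np.gradient(n, h, axis=1)
    dn_dy = np.gradient(n, h, axis=2)
    density = np.sum(n * np.cross(dn_dx, dn_dy, axis=0), axis=0) / (4 * np.pi)
    return float(np.sum(density) * h**2)


def degree_on_grid(size=1.0, half_width=20.0, points=801):
    """Topological charge of the assumed profile on a finite square box."""
    xs = np.linspace(-half_width, half_width, points)
    h = xs[1] - xs[0]
    x, y = np.meshgrid(xs, xs, indexing="ij")
    return brouwer_degree(skyrmion_field(x, y, size), h)


if __name__ == "__main__":
    # Close to 1; the small deficit comes from the finite box and grid spacing.
    print(degree_on_grid())
```

Changing the `size` parameter rescales the skyrmion without changing the result, as long as the box remains much larger than the skyrmion and the grid much finer.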

Recall now the symplectic approach to ferromagnets, outlined in Sect. 4.3.1. It is easy to check explicitly that as a functional on the phase space of a ferromagnet, \(w[\boldsymbol n]\) has a vanishing Poisson bracket with any other functional. This is an example of a general feature that topological charges do not generate any flow on the phase space. Let us see what happens if we deform \(w[\boldsymbol n]\) by inserting an arbitrary function of spatial coordinates. Changing for convenience the overall normalization, we define a class of functionals,

$$\displaystyle \begin{aligned} Q_f[\boldsymbol n]\equiv\frac M2\int\mathrm{d}^2\boldsymbol x\,f(\boldsymbol x)\varepsilon^{rs}\varepsilon_{ijk}n^i(\boldsymbol x)\partial_rn^j(\boldsymbol x)\partial_sn^k(\boldsymbol x)\;. \end{aligned} $$
(9.93)

It is a simple exercise to verify that the Poisson algebra of these functionals reproduces the Poisson algebra of functions on \(\mathbb {R}^2\),

$$\displaystyle \begin{aligned} \{Q_f,Q_g\}=Q_{\varepsilon^{rs}\partial_rf\partial_sg}\equiv Q_{\{f,g\}}\;. {} \end{aligned} $$
(9.94)

In addition, the functional \(Q_f\) generates a flow on the phase space through

$$\displaystyle \begin{aligned} \{\boldsymbol n(\boldsymbol x),Q_f[\boldsymbol n]\}=\varepsilon^{rs}\partial_rf(\boldsymbol x)\partial_s\boldsymbol n(\boldsymbol x)\;. \end{aligned} $$
(9.95)

Curiously, this makes it possible to identify \(P^r\equiv Q_{\varepsilon _{rs}x^s}\) with the generator of spatial translations, that is momentum. The real surprise comes now: according to (9.94),

$$\displaystyle \begin{aligned} \{P^r[\boldsymbol n],P^s[\boldsymbol n]\}=4\pi M\varepsilon^{rs}w[\boldsymbol n]\;. \end{aligned} $$
(9.96)
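
The step from (9.94) to (9.96) can be spelled out as follows. This is merely a worked check, assuming that spatial indices are Euclidean so that their upper and lower positions are equivalent, and that \(w[\boldsymbol n]=\int \mathrm{d}^2\boldsymbol x\,J^0[\boldsymbol n]\) with \(\varepsilon^{0rs}=\varepsilon^{rs}\) in (9.92). Setting \(f=\varepsilon_{ru}x^u\) and \(g=\varepsilon_{sv}x^v\), one finds

$$\displaystyle \begin{aligned} \{f,g\}=\varepsilon^{ab}\partial_af\,\partial_bg=\varepsilon^{ab}\varepsilon_{ra}\varepsilon_{sb}=\varepsilon_{rs}\;,\qquad \{P^r,P^s\}=\varepsilon^{rs}Q_1=\varepsilon^{rs}\,\frac M2\cdot 8\pi w[\boldsymbol n]=4\pi M\varepsilon^{rs}w[\boldsymbol n]\;, \end{aligned} $$

where \(Q_1\) denotes the functional (9.93) with \(f(\boldsymbol x)=1\).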

For skyrmions, which carry nonzero \(w[\boldsymbol n]\), the two components of momentum do not commute with each other. This has a simple physical interpretation. Namely, for nonzero \(w[\boldsymbol n]\),

$$\displaystyle \begin{aligned} x_w^r[\boldsymbol n]\equiv\frac 1{8\pi w[\boldsymbol n]}\int\mathrm{d}^2\boldsymbol x\,x^r\varepsilon^{uv}\varepsilon_{ijk}n^i(\boldsymbol x)\partial_un^j(\boldsymbol x)\partial_vn^k(\boldsymbol x)=-\frac{\varepsilon^{rs}P_s[\boldsymbol n]}{4\pi Mw[\boldsymbol n]} \end{aligned} $$
(9.97)

represents the center of the topological charge distribution. Conservation of momentum in the absence of external forces then implies conservation of the center of charge. Skyrmions are pinned to a fixed position unless an external field is applied to them. The fact that \(\{x^r_w[\boldsymbol n],x^s_w[\boldsymbol n]\}=\varepsilon ^{rs}/(4\pi Mw[\boldsymbol n])\) hints that the dynamics of skyrmions closely resembles that of a charged particle in a magnetic field.
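
The analogy with a charged particle can be made concrete by a quick numerical check. The sketch below rests on assumptions that are not part of the text above: a particle of charge \(q\) in a uniform magnetic field \(B\) in the symmetric gauge, canonical brackets \(\{x^r,p_s\}=\delta^r_s\), and hypothetical function names and parameter values. Its guiding-center coordinates have a constant Poisson bracket of magnitude \(1/(qB)\), to be compared with \(1/(4\pi M\lvert w[\boldsymbol n]\rvert)\) for the skyrmion; the relative sign depends on conventions.

```python
Q_CHARGE = 1.5  # hypothetical charge q
B_FIELD = 2.0   # hypothetical uniform magnetic field B


def poisson_bracket(F, G, z, eps=1e-3):
    """Canonical bracket {F, G} at the phase-space point z = (x, y, px, py).

    Partial derivatives are taken by central differences, which are exact
    (up to rounding) for the linear functions used below.
    """
    def partial(fun, i):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        return (fun(*zp) - fun(*zm)) / (2 * eps)

    total = 0.0
    for i in range(2):  # i = 0, 1 label x and y; i + 2 the conjugate momentum
        total += partial(F, i) * partial(G, i + 2)
        total -= partial(F, i + 2) * partial(G, i)
    return total


def pi_x(x, y, px, py):
    """Kinetic momentum p_x - q A_x in the symmetric gauge A = (-B y / 2, B x / 2)."""
    return px + Q_CHARGE * B_FIELD * y / 2


def pi_y(x, y, px, py):
    """Kinetic momentum p_y - q A_y in the symmetric gauge."""
    return py - Q_CHARGE * B_FIELD * x / 2


def guiding_x(x, y, px, py):
    """x coordinate of the guiding center of the cyclotron orbit."""
    return x + pi_y(x, y, px, py) / (Q_CHARGE * B_FIELD)


def guiding_y(x, y, px, py):
    """y coordinate of the guiding center of the cyclotron orbit."""
    return y - pi_x(x, y, px, py) / (Q_CHARGE * B_FIELD)


if __name__ == "__main__":
    z = (0.3, -0.7, 1.1, 0.4)  # arbitrary phase-space point
    print(poisson_bracket(guiding_x, guiding_y, z), -1 / (Q_CHARGE * B_FIELD))
```

In this convention the guiding-center coordinates have vanishing brackets with the kinetic momenta, so they are conserved in the absence of other forces, mirroring the conservation of \(x^r_w[\boldsymbol n]\).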