5.1 Introduction and Outline

Since Wilson’s seminal papers of the mid-1970s, the lattice approach to Quantum Chromodynamics has become increasingly important for the study of the strong interaction at low energies, and has now turned into a mature and established technique. In spite of the fact that the lattice formulation of Quantum Field Theory has been applied to virtually all fundamental interactions, it is appropriate to discuss this topic in a chapter devoted to QCD, since by far the largest part of activity is focused on the strong interaction. Lattice QCD is, in fact, the only known method which allows ab initio investigations of hadronic properties, starting from the QCD Lagrangian formulated in terms of quarks and gluons.

5.1.1 Historical Perspective

In order to illustrate the wide range of applications of the lattice formulation, we give a brief historical account below.

First applications of the lattice approach in the late 1970s employed analytic techniques, predominantly the strong coupling expansion, in order to investigate colour confinement and also the spectrum of glueballs. While these attempts gave valuable insights, it soon became clear that in the case of non-Abelian gauge theories such expansions were not sufficient to produce quantitative results.

First numerical investigations via Monte Carlo simulations, focusing in particular on the confinement mechanism in pure Yang–Mills theory, were carried out around 1980. The following years already saw several valiant attempts to study QCD numerically, yet it was realized that the available computer power was grossly inadequate to incorporate the effects of dynamical quarks. It was then that the so-called “quenched approximation” of QCD was proposed as a first step towards solving full QCD numerically. This approximation rests on the ad hoc assumption that the dominant non-perturbative effects are mediated by the gluon field. Hadronic observables can then be computed on a pure gauge background with far less numerical effort compared to the real situation where quarks have a feedback on the gluon field. The main focus of activity during the 1980s was on bosonic theories: numerical simulations were used to compute the glueball spectrum in pure Yang–Mills theory. Another important result during this period concerned \(\phi^4\) theory and the implications of its supposed “triviality” for the Higgs-Yukawa sector of the Standard Model. Using a combination of analytic and numerical techniques, the triviality of \(\phi^4\) theory could be rigorously established.

Except for a brief spell of activity around the turn of the decade to simulate QCD with dynamical fermions, most projects in the 1990s were devoted to exploring quenched QCD. Having recognized that the available computers and the efficiency of known algorithms were far from sufficient to perform “realistic” simulations of QCD with controlled errors, lattice physicists resorted to exploring the quenched approximation and its limitations for a number of phenomenologically interesting quantities. Although the systematic error that arises from neglecting dynamical quarks could not be quantified reliably, many important quantities, such as quark and hadron masses, the strong coupling constant and weak hadronic matrix elements, were computed for the first time. One of the icons of that period was surely a plot of the masses of the lightest hadrons in the continuum limit of quenched QCD, produced by the CP-PACS Collaboration: their results indicated that the quenched approximation works surprisingly well (at least for these quantities), since the computed spectrum agreed with experimental determinations at the level of 10%. At the same time, a number of sophisticated techniques were developed during the 1990s, which helped to control systematic effects, mainly pertaining to the influence of lattice artefacts, as well as the renormalization of local operators in the lattice regularized theory and their relation to continuum schemes such as \({\overline {{\mathrm {MS}}}}\). Perhaps the most significant development at the end of the 1990s was the clarification of the issue of chiral symmetry and lattice regularization. Following this work it is now understood under which conditions the lattice formulation is compatible with chiral symmetry. The importance of this development extends far beyond QCD and implies new prospects for the non-perturbative study of chiral gauge theories.

Since 2000 the focus has decidedly shifted from the quenched approximation to serious attempts to simulate QCD with dynamical quarks, thereby tackling the biggest remaining systematic uncertainty. Progress in this area has been determined not just by the vast increase in computer power since the very first Monte Carlo simulations, but even more so by the development of new algorithmic ideas, combined with the use of alternative discretizations that are numerically more efficient. At the time of writing this contribution (2007), the whole field is actually in a state of transition: although the quenched approximation is being abandoned, the latest results from simulations with dynamical quarks have not yet reached the same level of accuracy as earlier quenched calculations in regard to controlling systematic errors due to lattice artefacts and effects from renormalization. It can thus be expected that many of the results discussed later in this chapter will soon be superseded by more accurate numbers. In turn, the quenched approximation will be completely obsolete in a few years' time, except perhaps to test new ideas or for exploratory studies of more complex quantities.

5.1.2 Outline

We begin with an introduction to the basic concepts of the lattice formulation of QCD. This includes the field theoretical foundations, discretizations of the QCD Lagrangian, as well as simulation algorithms and other technical aspects related to the actual calculation of physical observables from suitable correlation functions. The following sections deal with various applications. Lattice calculations of the hadron spectrum are described in Sect. 5.3. Section 5.4 is devoted to lattice investigations of the confinement phenomenon. Determinations of the fundamental parameters of QCD, namely the strong coupling constant and the quark masses, are a major focus of this article, and are presented in Sect. 5.5. Another important property of QCD, namely the spontaneously broken chiral symmetry, is discussed in some detail in Sect. 5.6, which also includes a brief introduction to analytical non-perturbative approaches to the strong interaction, based on effective field theories. Lattice calculations of weak hadronic matrix elements, which serve to pin down the elements of the Cabibbo–Kobayashi–Maskawa matrix, are covered in Sect. 5.7. We end this contribution with a few concluding remarks.

In addition to the topics listed above, lattice simulations of QCD have also made important contributions to the determination of the phase structure of QCD, including results for the critical temperature of the deconfinement phase transition. Nevertheless, in this chapter we restrict the discussion to QCD at zero temperature and refer the reader to other parts of this volume.

5.2 The Lattice Approach to QCD

The essential features of the lattice formulation can be summarized by the following statement:

Lattice QCD is the non-perturbative approach to the gauge theory of the strong interaction through regularized, Euclidean functional integrals. The regularization is based on a discretization of the QCD action which preserves gauge invariance at all stages.

This definition includes all basic ingredients: starting from the functional integral itself avoids any particular reference to perturbation theory. This is what we mean when we call lattice QCD an ab initio method. The Euclidean formulation, which is obtained by rotating to imaginary time, reveals the close relation between Quantum Field Theory and Statistical Mechanics. In particular, the Euclidean functional integral is equivalent to the partition function of the corresponding statistical system. This equivalence is particularly transparent if the field theory is formulated on a discrete space-time lattice. Via this relation, the whole toolkit of condensed matter physics, including high-temperature expansions and, perhaps most importantly, Monte Carlo simulations, is at the disposal of the field theorist.

Many of the basic concepts introduced in this section are discussed in several common textbooks on the subject [1,2,3,4], which can be consulted for further details.

5.2.1 Euclidean Quantization

The generic steps in the Euclidean quantization procedure of a lattice field theory are the following:

  1. Define the classical, Euclidean field theory in the continuum;

  2. Discretize the corresponding Lagrangian;

  3. Quantize the theory by defining the functional integral;

  4. Determine the particle spectrum from Euclidean correlation functions.

We shall now illustrate this procedure for a simple example, namely the theory for a neutral scalar field.

Step 1

Consider a real, classical field ϕ(x), with \(x=(x_0,x_1,x_2,x_3)\), whose time variable \(x_0\) is obtained by analytically continuing t to \(-{\mathrm{i}}x_0\). The Euclidean action \(S_{\mathrm{E}}[\phi]\) is defined as

$$\displaystyle \begin{aligned} S_{\mathrm{E}}[\phi] = \int \mathrm{d}^4x \left\{ \frac{1}{2} \partial_\mu \phi(x)\partial_\mu \phi(x) +V(\phi) \right\}, \qquad \partial_\mu\equiv\frac{\partial}{\partial x_\mu}, \end{aligned} $$


$$\displaystyle \begin{aligned} V(\phi)=\frac{1}{2}m^2\phi(x)^2 +\frac{\lambda}{4!}\phi(x)^4. \end{aligned} $$

Step 2

In order to discretize the theory, a hyper-cubic lattice, ΛE, is introduced as the set of discrete space-time points, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} \Lambda_{\mathrm{E}}&\displaystyle =&\displaystyle \left\{ x\in\mathbb{R}^4\left|x^0/a=1,\ldots,N_{\mathrm{t}}; \, x^j/a=1,\ldots,N_{\mathrm{s}}\right.,\,j=1,2,3\right\},\\ T&\displaystyle =&\displaystyle N_{\mathrm{t}}a,\; L=N_{\mathrm{s}}a. \end{array} \end{aligned} $$

Thus, the coordinates of any space-time point are integer multiples of the lattice spacing a. The total number of lattice sites is \(N_{\mathrm {t}}\times N_{\mathrm {s}}^3\), while the physical space-time volume is \(T\times L^3\). The discretized action is then given by

$$\displaystyle \begin{aligned} S_{\mathrm{E}}[\phi] = a^4\sum_{x\in\Lambda_{\mathrm{E}}} \left\{ \frac{1}{2}d_\mu\phi(x)d_\mu\phi(x)+\frac{1}{2}m^2\phi(x)^2 +\frac{\lambda}{4!}\phi(x)^4 \right\}, \end{aligned} $$

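where the lattice derivatives can be defined as

$$\displaystyle \begin{aligned} d_\mu\phi(x) := \frac{1}{a}\left(\phi(x+a\hat\mu)-\phi(x)\right), \qquad d_\mu^*\phi(x) := \frac{1}{a}\left(\phi(x)-\phi(x-a\hat\mu)\right). \end{aligned} $$

Here and below \(\hat \mu \) denotes a unit vector in the direction μ. Via a Fourier transform, the Euclidean lattice \(\Lambda _{\mathrm {E}}\) is related to the dual lattice, \(\Lambda _{\mathrm {E}}^*\), defined by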

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \Lambda_{\mathrm{E}}^*=\left\{ p\in\mathbb{R}^4\left| p_0=\frac{2\pi}{T}n^0,\, p_j=\frac{2\pi}{L}n^j\right.\right\} \\ &\displaystyle &\displaystyle n^0=-\frac{N_{\mathrm{t}}}{2},-\frac{N_{\mathrm{t}}}{2}+1,\ldots, \frac{N_{\mathrm{t}}}{2}-1, \quad n^j=-\frac{N_{\mathrm{s}}}{2},-\frac{N_{\mathrm{s}}}{2}+1,\ldots, \frac{N_{\mathrm{s}}}{2}-1.\qquad \, \end{array} \end{aligned} $$

This implies not only that the momenta \(p_0\) and \(p_j\) are quantized in units of \(2\pi/T\) and \(2\pi/L\), respectively, but also that a momentum cutoff has been introduced, since

$$\displaystyle \begin{aligned} -\frac{\pi}{a}\leq p_\mu \leq \frac{\pi}{a}. \end{aligned} $$

As we shall see below, this way of introducing a momentum cutoff can be extended to gauge theories in such a way that gauge invariance is respected. An important point to realize is that the lattice action is not unique: it is only required that the discretized expression for \(S_{\mathrm{E}}\) reproduces the continuum result as the lattice spacing a is taken to zero.
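The discretization of Step 2 can be illustrated with a short numerical sketch. The following Python code is not part of the original text: it assumes periodic boundary conditions (which the text does not specify) and uses the forward difference for the lattice derivative. The function names `scalar_action` and `lattice_momenta` are chosen here purely for illustration.

```python
import numpy as np

def scalar_action(phi, a=1.0, m=1.0, lam=0.0):
    """Discretized Euclidean action of a real scalar field.

    phi has shape (N_t, N_s, N_s, N_s). Periodic boundary conditions
    are an assumption of this sketch.
    """
    phi = np.asarray(phi, dtype=float)
    kinetic = np.zeros(phi.shape)
    for mu in range(4):
        # forward difference: (phi(x + a*mu_hat) - phi(x)) / a
        d_mu = (np.roll(phi, -1, axis=mu) - phi) / a
        kinetic += 0.5 * d_mu**2
    potential = 0.5 * m**2 * phi**2 + lam / 24.0 * phi**4
    return a**4 * np.sum(kinetic + potential)

def lattice_momenta(n_sites, a=1.0):
    """Allowed momenta 2*pi*n/(n_sites*a), n = -n_sites/2, ..., n_sites/2 - 1."""
    n = np.arange(-(n_sites // 2), n_sites - n_sites // 2)
    return 2.0 * np.pi * n / (n_sites * a)

# Example: a constant field has no kinetic energy, so only V(phi) contributes.
phi_const = np.ones((4, 2, 2, 2))
S_const = scalar_action(phi_const, a=1.0, m=1.0, lam=0.0)
p_allowed = lattice_momenta(8, a=0.5)
```

For the constant field the result is simply the potential summed over all lattice sites, and the momenta returned by `lattice_momenta` all lie inside the interval from −π/a to π/a, which makes the momentum cutoff explicit.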

Step 3

The theory is quantized via the Euclidean functional integral

$$\displaystyle \begin{aligned} Z_{\mathrm{E}} := \int D[\phi]\,\mathrm{e}^{-S_{\mathrm{E}}[\phi]},\qquad D[\phi]=\prod_{x\in\Lambda_{\mathrm{E}}} \mathrm{d}\phi(x). \end{aligned} $$

Here one sees explicitly that the discretization procedure has given a mathematical meaning to the integration measure, which reduces to that of an ordinary, multi-dimensional integration.

One can now define Euclidean correlation functions of local fields through

$$\displaystyle \begin{aligned} \left\langle \phi(x_1)\cdots\phi(x_n)\right\rangle = \frac{1}{Z_{\mathrm{E}}}\int D[\phi] \phi(x_1)\cdots\phi(x_n) \mathrm{e}^{-S_{\mathrm{E}}[\phi]}. \end{aligned} $$

In the continuum limit, these correlation functions approach the Schwinger functions, which encode the physical information about the spectrum within the Euclidean formulation. Osterwalder and Schrader [5] have laid down the general criteria which must be satisfied such that the information in Minkowskian space-time can be reconstructed from the Schwinger functions.

Step 4

The particle spectrum is extracted from the exponential fall-off of the Euclidean two-point correlation function. To this end, one must define the Euclidean time evolution operator. The transfer matrix \(\mathsf{T}\) describes time propagation by a finite Euclidean time interval a. The functional integral can be expressed in terms of the transfer matrix as

$$\displaystyle \begin{aligned} Z_{\mathrm{E}} = {\mathrm{Tr}}\,\mathsf{T}^{N_{\mathrm{t}}}, \end{aligned} $$

where the trace is taken over the basis |α〉 of the Hilbert space of physical states. In order to obtain expressions which are more reminiscent of those in Minkowski space-time, one can define a Hamiltonian \({\mathsf{H}}_{\mathrm{E}}\) by

$$\displaystyle \begin{aligned} \mathsf{T}=:\mathrm{e}^{-a{\mathsf{H}}_{\mathrm{E}}}. \end{aligned} $$

If |α〉 denotes an eigenstate of the transfer matrix with eigenvalue \(\lambda_\alpha\), i.e.

$$\displaystyle \begin{aligned} \mathsf{T}|\alpha\rangle = \lambda_\alpha|\alpha\rangle =\mathrm{e}^{-aE_\alpha}|\alpha\rangle, \end{aligned} $$

then one can work out the spectral decomposition of the two-point correlation function, viz.

$$\displaystyle \begin{aligned} \left\langle\phi(x)\phi(y)\right\rangle &= \frac{1}{Z_{\mathrm{E}}}\int D[\phi]\, \phi(x)\phi(y)\,\mathrm{e}^{-S_{\mathrm{E}}[\phi]} \\ &= \sum_\alpha\mathrm{e}^{-(E_\alpha-E_0)(x_0-y_0)} \big\langle\alpha\big|\hat\phi(0,\vec{y})\big|0\big\rangle \big\langle0\big|\hat\phi(0,\vec{x})\big|\alpha\big\rangle. \end{aligned} $$

Here, the quantity \((E_\alpha-E_0)\) is the so-called mass gap, i.e. the energy of the state |α〉 above the vacuum. For large Euclidean time separations \((x_0-y_0)\) the lowest state dominates the two-point function, i.e. all higher states die out exponentially. The spectral decomposition of the two-point function forms the basis for numerical simulations of lattice field theories, as the mass (or energy) of a given state is given by the dominant exponential fall-off at large Euclidean times (see Sect. 5.2.3).
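The way in which the lowest state dominates at large Euclidean times can be made concrete with a small numerical sketch. The Python code below is an illustration added here, not part of the original text: it uses a synthetic correlator made of two exponentials with invented amplitudes and energies, and a logarithmic ratio of neighbouring time slices (commonly called an “effective mass”) to read off the fall-off. The name `effective_mass` is hypothetical.

```python
import numpy as np

def effective_mass(corr, a=1.0):
    """m_eff(t) = log(C(t) / C(t + a)) / a for C sampled at t = 0, a, 2a, ..."""
    corr = np.asarray(corr, dtype=float)
    return np.log(corr[:-1] / corr[1:]) / a

# Synthetic two-point function (illustrative numbers only):
# a ground state with energy gap 0.5 and one excited state with gap 1.5.
t = np.arange(20)
corr = 1.0 * np.exp(-0.5 * t) + 0.8 * np.exp(-1.5 * t)

m_eff = effective_mass(corr)
# At small t the excited state contaminates m_eff; at large t it settles
# at the ground-state gap of the synthetic data.
```

In an actual simulation the correlator carries statistical noise, so that the plateau in the effective mass has to be identified within errors; this sketch ignores that complication.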

5.2.2 Lattice Actions for QCD

Our goal now is to find a lattice transcription of the Euclidean QCD action in the continuum, i.e.

$$\displaystyle \begin{aligned} S_{\text{QCD}} = \int \mathrm{d}^4x\bigg\{-\frac{1}{2g_0^2}{\mathrm{Tr}}\,(F_{\mu\nu}F_{\mu\nu}) +\sum_{f=u,d,s\ldots} \bar{\psi}_f\left(\gamma_\mu D_\mu+m_f\right)\psi_f \bigg\}, \end{aligned} $$

where \(g_0\) denotes the gauge coupling, and our conventions are chosen such that the covariant derivative is defined through

$$\displaystyle \begin{aligned} D_\mu = \partial_\mu + A_\mu, \end{aligned} $$

while the field tensor reads

$$\displaystyle \begin{aligned} F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu+[A_\mu,A_\nu],\qquad A_\mu^\dag = -A_\mu. \end{aligned} $$

Before attempting to write down a discretized version, we must first elucidate the notion of a lattice gauge field in a non-Abelian theory. In fact, in this case it turns out that the gauge potential \(A_\mu\) must be abandoned when the theory is discretized. The reason is that the familiar non-Abelian transformation law, i.e.

$$\displaystyle \begin{aligned} A_\mu(x) \to g(x)A_\mu(x)g(x)^{-1} +g(x)\partial_\mu g(x)^{-1},\quad g(x)\in\text{SU(3)}, \end{aligned} $$

no longer holds exactly when \(\partial_\mu\) is replaced by its discrete counterpart \(d_\mu\) of Eq. (5.5). Strict gauge invariance at the level of the regularized theory cannot be maintained in this fashion.

The definition of a lattice gauge field relies on the concept of the parallel transporter. If a quark moves in the presence of a background gauge field from y to x, it picks up a non-Abelian phase factor, given by

$$\displaystyle \begin{aligned} U(x,y)=\text{P.O.}\exp\left\{ -\int_y^x \mathrm{d}{z_\mu}\,A_\mu(z)\right\}, \end{aligned} $$

where “P.O.” denotes path ordering, which is required as a consequence of the non-Abelian nature of the gauge field. In contrast to the gauge potential \(A_\mu\), which is an element of the Lie algebra of SU(3), the parallel transporter U(x, y) is an element of the gauge group itself. On the lattice, the parallel transporter between neighbouring lattice sites x and \(x+a\hat \mu \) is called a link variable:

$$\displaystyle \begin{aligned} U(x,x+a\hat\mu)\equiv U_\mu(x), \qquad U(x+a\hat\mu,x)=U(x,x+a\hat\mu)^{-1}=U_\mu(x)^{-1}. \end{aligned} $$

A consistent and manifestly gauge invariant discretization of QCD is obtained by identifying the gauge degrees of freedom with the link variables \(U_\mu(x)\), which transform under the gauge group as

$$\displaystyle \begin{aligned} U_\mu(x) \to g(x)\,U_\mu(x)\,g(x+a\hat\mu)^{-1},\quad g(x),\,g(x+a\hat\mu)\in \text{SU(3)}. {} \end{aligned} $$

The connection with the gauge potential \(A_\mu(x)\) is somewhat subtle: if \(U_\mu(x)\) denotes a given link variable in the discretized theory, it can be used to define a vector field \(A_\mu(x)\) as an element of the Lie algebra of SU(3) via

$$\displaystyle \begin{aligned} \mathrm{e}^{aA_\mu(x)} \equiv U_\mu(x). \end{aligned} $$

In turn, if \(A_\mu ^{\mathrm {c}}\) is a given gauge potential in the continuum theory, one can always find a link variable which approximates \(A_\mu ^{\mathrm {c}}\) up to cutoff effects.

Now we turn to the problem of defining a discretized version of the Yang–Mills action. To this end we define the plaquette \(P_{\mu\nu}(x)\) as the product of link variables around an elementary square of the lattice:

$$\displaystyle \begin{aligned} P_{\mu\nu}(x) \equiv U_\mu(x)U_\nu(x+a\hat{\mu}) U_\mu(x+a\hat{\nu})^{-1} U_\nu(x)^{-1}. \end{aligned} $$

A graphical representation is shown in Fig. 5.1. Using the transformation property in Eq. (5.22), it is easy to convince oneself that the plaquette transforms as \(P_{\mu\nu}(x)\to g(x)P_{\mu\nu}(x)g(x)^{-1}\), so that its trace is manifestly gauge invariant. Moreover, it serves to define the simplest discretization of the Yang–Mills action, the Wilson plaquette action [6]

$$\displaystyle \begin{aligned} S_{\mathrm{G}}[U] = \beta\sum_{x\in\Lambda_{\mathrm{E}}} \sum_{\mu<\nu}\Big( 1-\frac{1}{3}\text{Re Tr }\,P_{\mu\nu}(x)\Big). {} \end{aligned} $$

It has become a standard textbook exercise to verify that for small lattice spacings

$$\displaystyle \begin{aligned} S_{\mathrm{G}}[U] \longrightarrow -\frac{1}{2g_0^2}\int \mathrm{d}^4x\, {\mathrm{Tr}}\,(F_{\mu\nu}F_{\mu\nu}) +{\mathrm{O}}(a^2), \end{aligned} $$

provided that one relates the parameter β to the bare gauge coupling via \(\beta =6/g_0^2\) in Eq. (5.25). We have remarked already that the discretization of a field theory is not unique, and hence one is free to add further gauge invariant terms to the plaquette action which formally vanish as a → 0, but which produce a discretization with an accelerated rate of convergence to the continuum limit. The most widely chosen alternatives are the Symanzik [7] and Iwasaki [8] actions.
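The gauge invariance of the plaquette action can be checked numerically on a very small lattice. The following Python sketch is an illustration added here and makes several assumptions that are not in the text: a periodic lattice with only two sites per direction, random SU(3) matrices generated from a QR decomposition (which is not an exact Haar-measure sampler, but suffices for this check), and helper names such as `random_su3`, `shift` and `gauge_transform` that are purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2  # sites per direction; tiny periodic lattice, for illustration only

def random_su3():
    """Random SU(3) matrix via QR decomposition (not Haar-exact)."""
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    q, _ = np.linalg.qr(z)
    return q / np.linalg.det(q) ** (1.0 / 3.0)  # rescale so that det = 1

def shift(x, mu):
    """Site x + a*mu_hat with periodic wrap-around."""
    y = list(x)
    y[mu] = (y[mu] + 1) % N
    return tuple(y)

def plaquette(U, x, mu, nu):
    """U_mu(x) U_nu(x+mu) U_mu(x+nu)^-1 U_nu(x)^-1 (inverse = dagger)."""
    return (U[mu][x] @ U[nu][shift(x, mu)]
            @ U[mu][shift(x, nu)].conj().T @ U[nu][x].conj().T)

def wilson_action(U, beta):
    """beta * sum_x sum_{mu<nu} (1 - Re Tr P_mu_nu(x) / 3)."""
    total = 0.0
    for x in np.ndindex(N, N, N, N):
        for mu in range(4):
            for nu in range(mu + 1, 4):
                total += 1.0 - np.trace(plaquette(U, x, mu, nu)).real / 3.0
    return beta * total

def gauge_transform(U, g):
    """U_mu(x) -> g(x) U_mu(x) g(x + mu)^-1."""
    V = np.empty_like(U)
    for mu in range(4):
        for x in np.ndindex(N, N, N, N):
            V[mu][x] = g[x] @ U[mu][x] @ g[shift(x, mu)].conj().T
    return V

U = np.empty((4, N, N, N, N, 3, 3), dtype=complex)
g = np.empty((N, N, N, N, 3, 3), dtype=complex)
for x in np.ndindex(N, N, N, N):
    g[x] = random_su3()
    for mu in range(4):
        U[mu][x] = random_su3()

S_before = wilson_action(U, beta=6.0)
S_after = wilson_action(gauge_transform(U, g), beta=6.0)
```

The two values `S_before` and `S_after` agree up to rounding errors, since the factors g(x) cancel pairwise around the closed loop and only the outermost pair survives, which drops out under the trace. Production codes would of course vectorize these loops; the explicit site loop is kept here for readability.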

Fig. 5.1 Graphical representation of the plaquette \(P_{\mu\nu}(x)\) in the (μ, ν)-plane. The arrow between sites \(x+a\hat \mu \) and x denotes the link variable \(U_\mu(x)\)

Quark and antiquark fields, ψ(x) and \(\bar {\psi }(x)\), are associated with the lattice sites and transform under the gauge group as

$$\displaystyle \begin{aligned} \psi(x) \to g(x)\psi(x),\qquad \bar{\psi}(x) \to \bar{\psi}(x)g(x)^{-1}. \end{aligned} $$

Using the transformation property of the link variables, it is straightforward to write down a discretized version of the covariant derivative, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} \nabla_\mu\psi(x) &\displaystyle :=&\displaystyle \frac{1}{a}\left( U_\mu(x)\psi(x+a\hat\mu) -\psi(x) \right) \\ \nabla_\mu^*\psi(x) &\displaystyle :=&\displaystyle \frac{1}{a}\left( \psi(x) -U_\mu(x-a\hat\mu)^{-1}\psi(x-a\hat\mu) \right), {} \end{array} \end{aligned} $$

where ∇μ and \(\nabla _\mu ^*\) denote the “forward” and “backward” derivatives, respectively. Finally, we note that in Euclidean space-time, the Dirac matrices can be defined to satisfy \(\left \{\gamma _\mu ,\gamma _\nu \right \}=2\delta _{\mu \nu }\).

Before we attempt to construct the fermionic part of the action of lattice QCD, it is useful to identify the basic properties that the discretized, massless Dirac operator, D, should satisfy:

  (a) D is local;

  (b) \(\widetilde {D}(p)={\mathrm {i}}\gamma _\mu p_\mu +{\mathrm {O}}(ap^2)\);

  (c) \(\widetilde {D}(p)\) is invertible for p ≠ 0;

  (d) \(\gamma_5 D + D\gamma_5 = 0\).

Locality, i.e. the absence of long-ranged interactions, is a basic property of any quantum field theory describing elementary particles. Property (b) implies that the correct continuum behaviour of the quark-gluon interaction is reproduced. Furthermore, condition (c) ensures that the correct fermion spectrum is obtained: fermion masses are associated with poles of \(\{\widetilde {D}(p)\}^{-1}\), which, in the continuum theory, only occur at vanishing four-momentum. Finally, property (d) ensures that the massless theory respects chiral symmetry.

Using the definition of the covariant derivative and the conventions for the Dirac matrices in Euclidean space-time, we can now write down the simplest discretized version of the massless lattice Dirac operator:

$$\displaystyle \begin{aligned} D_{\text{disc}} = \textstyle{1\over2}\gamma_\mu(\nabla_\mu+\nabla_\mu^*). \end{aligned} $$

It turns out, however, that this “naïve” discretization violates condition (c) and therefore produces spurious fermionic degrees of freedom. This is the so-called fermion doubling problem, which is most easily explained by considering \(D_{\text{disc}}\) in momentum space for the free theory. The Fourier transform yields

$$\displaystyle \begin{aligned} \widetilde{D}_{\text{disc}}(p) = {\mathrm{i}}\gamma_\mu\frac{1}{a}\sin{}(ap_\mu) = {\mathrm{i}}\gamma_\mu p_\mu + {\mathrm{O}}(a^2). \end{aligned} $$

The discretization procedure has thus replaced \(p_\mu\) by a sine function. While the Taylor expansion guarantees that condition (b) is satisfied, the occurrence of \(\sin {}(ap_\mu )\) implies that \(\widetilde {D}_{\text{disc}}(p)\) vanishes not only at \(p_\mu=0\), but also whenever each component \(p_\mu\), μ = 0, …, 3, is equal to either 0 or \(\pi/a\) in the permitted range of momenta, thereby violating condition (c). The massless propagator \(\{\widetilde {D}_{\text{disc}}(p)\}^{-1}\) therefore has \(2^4=16\) poles, and thus there is a 16-fold degeneracy of the fermion spectrum.
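The counting of the poles can be verified with a few lines of code. The Python sketch below is added for illustration and is not part of the original text; it evaluates the squared modulus of the free naïve operator in momentum space, which is proportional to the unit matrix in spinor space, at the 16 momenta whose components are either 0 or π/a. The function name `naive_dirac_sq` is hypothetical.

```python
import itertools
import numpy as np

def naive_dirac_sq(p, a=1.0):
    """sum_mu sin^2(a p_mu) / a^2, i.e. D^dagger D of the free naive operator."""
    p = np.asarray(p, dtype=float)
    return np.sum(np.sin(a * p) ** 2) / a**2

a = 1.0
# All momenta whose four components are either 0 or pi/a
corners = list(itertools.product([0.0, np.pi / a], repeat=4))
zeros = [p for p in corners if np.isclose(naive_dirac_sq(p, a), 0.0)]
n_zeros = len(zeros)

# A generic momentum inside the Brillouin zone is not a zero
generic = naive_dirac_sq([0.3, -0.7, 1.1, 0.2], a)
```

All 16 corner momenta are zeros of the operator, while a generic momentum gives a strictly positive value, in line with the 16-fold degeneracy stated above.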

As we shall see below, the fermion doubling problem is closely linked with the issue of chiral symmetry on the lattice. For now we simply list the various methods that have been devised to address fermion doubling. Historically the first was due to Wilson (“Wilson fermions”) [6]. Here, the degeneracy is lifted completely, but the price to pay is the explicit breaking of chiral symmetry at the level of the regularized theory. Another method, due to Kogut and Susskind (“staggered fermions”) [9], is based on the idea of spreading individual spinor components over the corners of an elementary hypercube of the lattice. Although the degeneracy is only lifted partially (from 16 to 4), this formulation has the advantage of leaving a subgroup of chiral symmetry unbroken. More recent developments include the use of so-called “domain wall” [10, 11] or “overlap” [12] fermions. These formulations leave chiral symmetry unbroken in principle, and also succeed in lifting the degeneracy completely. Finally, there are the so-called “perfect” actions [13], which are based on a renormalization group approach and which are in principle completely free of lattice artefacts. An exact realization of the perfect action which can be used in simulations is, however, difficult to obtain. In practice, one typically uses a so-called truncated fixed point action. Domain wall and overlap fermions, as well as perfect actions are particular realizations of a class of discretizations dubbed “Ginsparg-Wilson fermions”. They have the remarkable feature that chiral symmetry is preserved, while the fermion doubling problem is completely avoided. We shall come back to this issue in more detail below.

For now we turn specifically to Wilson’s treatment of the fermion doubling problem. It exploits the fact that the discretization is not unique. Thus, one can add a term to \(D_{\text{disc}}\), which formally vanishes as a → 0, but which pushes the masses of the unwanted doubler states to the cutoff scale at any non-zero value of the lattice spacing. Explicitly, the massless Wilson-Dirac operator \(D_{\mathrm{w}}\) reads

$$\displaystyle \begin{aligned} D_{\mathrm{w}} = \textstyle{1\over2}\gamma_\mu(\nabla_\mu+\nabla_\mu^*) -\textstyle{1\over2}\,ar\nabla_\mu^*\nabla_\mu, \end{aligned} $$

where r is the so-called Wilson parameter, which is usually set to one. The Fourier transform of \(D_{\mathrm{w}}\) for a trivial gauge field reads

$$\displaystyle \begin{aligned} \widetilde{D}_{\mathrm{w}}(p) = {\mathrm{i}}\gamma_\mu\frac{1}{a}\sin{}(ap_\mu) +\frac{2r}{a}\sin^2\left(\frac{ap_\mu}{2}\right), \end{aligned} $$

which explicitly demonstrates (for the free theory, at least) that the poles at \(p_\mu=\pi/a\) receive additional contributions proportional to \(r/a\), which is of the order of the cutoff for r = O(1). Although this procedure leads to a complete lifting of the degeneracy, it has a number of unwanted features: first, it should be noted that the Wilson fermion action differs from the classical action in the continuum by terms of order a, as a result of adding the counterterm proportional to r. By contrast, the leading discretization effects of the Wilson plaquette action for Yang–Mills theory are only \({\mathrm{O}}(a^2)\). The Wilson fermion formulation will thus have a reduced rate of convergence towards the continuum limit. Secondly, the addition of the Wilson term results in an explicit breaking of chiral symmetry, since the massless theory is no longer invariant under global axial rotations, such as

$$\displaystyle \begin{aligned} \psi(x)\to \mathrm{e}^{{\mathrm{i}}\alpha\gamma_5}\psi(x),\qquad \bar{\psi}(x)\to \bar{\psi}(x)\mathrm{e}^{{\mathrm{i}}\alpha\gamma_5}, \end{aligned} $$

which implies that property (d) is violated. While the rate of convergence to the continuum limit can be accelerated by employing what is known as “O(a) improvement” (see below), the explicit breaking of chiral symmetry cannot be cured within the Wilson theory. Thus, quantities like the quark condensate, which arises from the spontaneous breaking of chiral symmetry, cannot be studied in a conceptually “clean” manner using Wilson fermions. A detailed discussion of how this can be achieved with the help of a more sophisticated fermionic discretization (“Ginsparg-Wilson fermions”) is presented in Sect. 5.6. However, for most applications of lattice QCD, explicit chiral symmetry breaking is merely an inconvenience, but no serious obstacle.
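The lifting of the doublers by the Wilson term can be illustrated for the free theory. The Python sketch below is an addition for illustration only; it evaluates the momentum-space Wilson term \((2r/a)\sum_\mu\sin^2(ap_\mu/2)\) at the 16 zeros of the naïve operator. The function name `wilson_term` and the chosen values of a and r are assumptions of this example.

```python
import itertools
import numpy as np

def wilson_term(p, a=1.0, r=1.0):
    """(2r/a) * sum_mu sin^2(a p_mu / 2): the free Wilson term in momentum space."""
    p = np.asarray(p, dtype=float)
    return 2.0 * r / a * np.sum(np.sin(0.5 * a * p) ** 2)

a, r = 0.1, 1.0
doubler_shift = {}
for corner in itertools.product([0.0, np.pi / a], repeat=4):
    n_pi = sum(c > 0.0 for c in corner)  # components sitting at pi/a
    doubler_shift[n_pi] = wilson_term(corner, a, r)

# For small momenta the term is only a cutoff effect, approximately r*a*p^2/2
small_p = wilson_term([0.1, 0.0, 0.0, 0.0], a, r)
```

The dictionary `doubler_shift` shows that the physical pole at p = 0 is unaffected, whereas a doubler with n components at π/a receives a contribution 2rn/a, which grows like the cutoff as a → 0.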

We have already remarked when discussing the discretized Yang–Mills part of the QCD action that the non-uniqueness of the discretization opens the possibility to construct lattice actions with an accelerated rate of convergence towards the continuum limit. A systematic way to do this is the so-called Symanzik improvement programme [14], in which lattice artefacts can be removed order by order in the lattice spacing. In a nutshell, the improvement programme amounts to extending the renormalization procedure of a field theory to the level of irrelevant operators, i.e. operators that formally vanish as a → 0. In this sense one adds suitable counterterms, which for any non-zero value of a produce a cancellation of the cutoff effects at a given order, provided that their coefficients are tuned appropriately. For QCD with Wilson fermions, Sheikholeslami and Wohlert [15] have shown that the Symanzik improvement programme to lowest order is realized by adding one O(a) counterterm to the Wilson-Dirac operator \(D_{\mathrm{w}}\). The resulting expression in the massless case reads

$$\displaystyle \begin{aligned} D_{\mathrm{sw}} = D_{\mathrm{w}} +\frac{ia}{4}\,c_{\mathrm{sw}}\sigma_{\mu\nu}\widehat{F}_{\mu\nu}, {} \end{aligned} $$

where \(\sigma _{\mu \nu }=\frac {{\mathrm {i}}}{2}[\gamma _\mu ,\gamma _\nu ]\), and \(\widehat {F}_{\mu \nu }\) is a lattice transcription of the gluon field strength tensor \(F_{\mu\nu}\). A suitable representation of \(\widehat {F}_{\mu \nu }\) in terms of plaquette variables is given by

$$\displaystyle \begin{aligned} \widehat{F}_{\mu\nu}(x) = \frac{1}{8a^2}\left(Q_{\mu\nu}(x)-Q_{\nu\mu}(x)\right), \end{aligned} $$

where \(Q_{\mu\nu}(x)\) is the sum of the four plaquettes emanating from the site x, as depicted in Fig. 5.2. The object \(Q_{\mu\nu}(x)\) is aptly called the “clover” leaf. In order to remove all lattice artefacts of order a in hadron masses, the improvement coefficient \(c_{\mathrm{sw}}\) must be fixed by imposing a suitable improvement condition. Without going into details here, we note that it is possible to find such a condition, which can also be evaluated at the non-perturbative level [16, 17]. The resulting, non-perturbatively O(a) improved Wilson action can then be used to compute, say, hadron masses whose values differ from the continuum result by terms of only \({\mathrm{O}}(a^2)\).

Fig. 5.2 Four plaquettes that must be summed over to yield the quantity \(Q_{\mu\nu}(x)\) in the lattice definition of the field strength tensor. The site x is at the center of the “clover” leaf

The Wilson-Dirac operator for a quark with bare mass \(m_0\) is simply \((D_{\mathrm{w}}+m_0)\). However, the form of the Wilson fermion action, \(S_{\mathrm {F}}^{\mathrm {W}}[U,\bar {\psi },\psi ]\), which is found in the literature is usually expressed in terms of the “hopping parameter” κ rather than \(m_0\). By rescaling the fermion fields according to

$$\displaystyle \begin{aligned} \psi(x)\rightarrow\sqrt{2\kappa}\,\psi(x),\qquad \bar{\psi}(x)\rightarrow\bar{\psi}(x)\,\sqrt{2\kappa}, {} \end{aligned} $$

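one obtains

$$\displaystyle \begin{aligned} S_{\mathrm{F}}^{\mathrm{W}}[U,\bar{\psi},\psi] = a^3\sum_{x\in\Lambda_{\mathrm{E}}}\bigg\{ \bar{\psi}(x)\psi(x) -\kappa\sum_\mu\Big[ \bar{\psi}(x)(r-\gamma_\mu)U_\mu(x)\psi(x+a\hat\mu) +\bar{\psi}(x)(r+\gamma_\mu)U_\mu(x-a\hat\mu)^{-1}\psi(x-a\hat\mu) \Big]\bigg\}. \end{aligned} $$

The hopping parameter κ is related to the bare mass \(m_0\) via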

$$\displaystyle \begin{aligned} \kappa=\frac{1}{2am_0+8r}, {} \end{aligned} $$

while the dimensionless parameter r is usually set to one. Taken together with the plaquette action of Eq. (5.25), the Wilson action for QCD is thus conveniently parameterized in terms of the bare parameters (β, κ), with \(\beta =6/g_0^2\) and κ as above, instead of the bare gauge coupling and quark mass \((g_0, m_0)\).

Another consequence of adding the Wilson term to the naïve lattice action is the resulting additive renormalization of the quark mass. In other words, the point where the quark mass vanishes is a priori unknown. The value that must be subtracted is called the critical quark mass, which corresponds to the critical value of the hopping parameter, \(\kappa_{\mathrm{c}}\). The bare subtracted quark mass is then given by

$$\displaystyle \begin{aligned} m = \frac{1}{2a}\bigg(\frac{1}{\kappa}-\frac{1}{\kappa_{\mathrm{c}}}\bigg). \end{aligned} $$

From Eq. (5.38) one easily infers that the critical value of κ in the free theory occurs at

$$\displaystyle \begin{aligned} \kappa_{\mathrm{c}} =\frac{1}{8},\qquad r=1, \end{aligned} $$

while for non-zero \(g_0\) the value of \(\kappa_{\mathrm{c}}\) must be determined, for instance, by adjusting κ to the point where the pion mass vanishes.
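The relations between the hopping parameter and the quark mass are easily turned into code. The Python sketch below is an illustration added here, using only the two formulas quoted above; the function names are hypothetical, and the numerical value of the bare mass in the example is arbitrary.

```python
def kappa_from_mass(am0, r=1.0):
    """Hopping parameter kappa = 1 / (2 a m_0 + 8 r); am0 is the mass in lattice units."""
    return 1.0 / (2.0 * am0 + 8.0 * r)

def subtracted_mass(kappa, kappa_c, a=1.0):
    """Bare subtracted quark mass m = (1/kappa - 1/kappa_c) / (2 a)."""
    return (1.0 / kappa - 1.0 / kappa_c) / (2.0 * a)

# Free theory with r = 1: vanishing bare mass corresponds to kappa_c = 1/8
kappa_c_free = kappa_from_mass(0.0)

# In the free theory the subtracted mass reproduces the bare mass
kappa = kappa_from_mass(0.05)
m_sub = subtracted_mass(kappa, kappa_c_free)
```

In the interacting theory `kappa_c` is not known analytically and has to be supplied from a separate determination, for instance from the vanishing of the pion mass as described above.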

We now turn to discussing one alternative to using Wilson’s solution to the fermion doubling problem, namely the so-called “staggered” (or Kogut-Susskind) fermions. One might think that the doubling problem arises since there are too many fermion degrees of freedom in the discretized theory, if one associates a four-component Dirac spinor with each individual lattice site. Pictorially, the main idea of Kogut and Susskind was to “thin out” the degrees of freedom by distributing single spinor components over different lattice sites. In their particular formulation, the 16 corners of a four-dimensional hypercube serve to accommodate the individual components of four Dirac spinors. Therefore, if these hypercubes are regarded as the main building blocks for the fermionic discretization, rather than the lattice sites themselves, this procedure will result in a partial lifting of the degeneracy from 16 fermion species down to four. It is clear, though, that a simple distribution of spinor components is not sufficient to define the action, since the Dirac matrices mix different spinor components. Thus, the staggered fermion action is only obtained after performing a diagonalization in spinor space, which then decouples the individual components.

Rather than describing the details of this procedure, which can be found in most textbooks, we simply state the result. Starting from the usual four-component spinor and performing a spin-diagonalization, the lattice action for staggered fermions with bare mass m 0 coupled to the gauge field is derived as


where \(\chi_\alpha\) denotes a one-component Grassmann variable. The spin-diagonalization has thus replaced the Dirac matrices \(\gamma_\mu\) by real, position-dependent phase factors \(\eta_\mu(x)\), which are given by

$$\displaystyle \begin{aligned} \eta_0(x) = 1,\qquad \eta_j(x) = (-1)^{n_0+\ldots+n_{j-1}},\quad n_j=x_j/a. \end{aligned} $$
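The phase factors are simple enough to tabulate directly. The sketch below evaluates \(\eta_\mu(x)\) on the 16 corners of a hypercube and checks that the phases around an elementary square multiply to a relative minus sign, which is how the anticommutation of the Dirac matrices survives the spin-diagonalization. The helper names (`eta`, `shift`, `phases_anticommute`) are hypothetical.

```python
from itertools import product


def eta(mu: int, n: tuple) -> int:
    """Staggered phase eta_mu(x) = (-1)^(n_0 + ... + n_{mu-1}) for the site n = x/a."""
    return -1 if sum(n[:mu]) % 2 else 1


def shift(n: tuple, mu: int) -> tuple:
    """Site reached from n by one step in the positive mu direction."""
    return tuple(c + (1 if i == mu else 0) for i, c in enumerate(n))


def phases_anticommute(n: tuple) -> bool:
    """Check eta_mu(x) eta_nu(x+mu) = -eta_nu(x) eta_mu(x+nu) for all mu != nu."""
    return all(
        eta(mu, n) * eta(nu, shift(n, mu)) == -eta(nu, n) * eta(mu, shift(n, nu))
        for mu in range(4)
        for nu in range(4)
        if mu != nu
    )


# The 16 corners of a four-dimensional hypercube.
hypercube = list(product((0, 1), repeat=4))
```

Because the phases depend only on the parities of the coordinates, checking the corners of one hypercube covers every site of the lattice.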

At the level of the classical action, the spinor components are completely decoupled, and the action is decomposed into four identical pieces. In order to occupy all 16 corners of a four-dimensional hypercube with one-component Grassmann variables, one needs four Dirac spinors, each of which contributes a term like Eq. (5.41) to the overall action. This produces the fourfold degeneracy of staggered fermions, with the remnant doubler states being referred to as “tastes”, in order to distinguish them from physical flavours. The formulation using the one-component fields within a hypercube can be re-expressed in terms of the spin-taste basis [18], from which one can infer directly that the taste symmetry is broken. However, one axial generator of the taste symmetry remains unbroken. The fermion mass in the staggered approach is therefore protected against any additive renormalization through the associated global axial U(1) symmetry, unlike the case of the Wilson action. While the various tastes decouple in the continuum limit, non-vanishing interactions between the tastes at O(a 2) in the lattice spacing are induced, leading to large lattice artefacts. The Symanzik improvement programme can be employed to reduce these taste-changing interactions [19], and the resulting “improved staggered fermions” (the so-called “Asqtad”-action being one particular example [20]) have been widely used in a series of simulations.

For a long time lattice physicists have struggled to find a fermionic discretization which would both solve the doubling problem and be compatible with chiral symmetry. In fact, physicists grew increasingly doubtful that this could be achieved, following the proof of a “No-Go theorem” by Nielsen and Ninomiya [21], which stated that the conditions (a)–(d) mentioned above could not be satisfied simultaneously. Since one does not want to give up locality and property (b), this would imply that either (c) or (d) must be violated. Indeed, the Wilson and staggered discretizations seem to confirm this expectation: while the Wilson fermion action removes all doublers, it breaks chiral symmetry, leading to an additive renormalization of the quark mass, as well as several other consequences. By contrast, the staggered formulation preserves a U(1) subgroup of chiral symmetry at the price of only partially removing the spurious degrees of freedom.

A way to circumvent the Nielsen–Ninomiya theorem was already pointed out by Ginsparg and Wilson in 1982 [22], when they suggested relaxing condition (d) in favour of

$$\displaystyle \begin{aligned} \gamma_5{D}+D\gamma_5 = aD\gamma_5{D}. {} \end{aligned} $$

However, it was not until 1997 that this condition, now commonly referred to as the Ginsparg-Wilson relation, was confronted with a non-trivial solution. It was shown [23] that the so-called “perfect action” constructed from a renormalization group approach satisfies equation (5.43). It was also realized that any lattice Dirac operator that is a solution to the Ginsparg-Wilson relation also satisfies the Atiyah–Singer index theorem, i.e.

$$\displaystyle \begin{aligned} \left\{\gamma_5,D\right\} = aD\gamma_5{D}\quad \Leftrightarrow\quad \mbox{index}(D)=a^5\sum_{x\in\Lambda_{\mathrm{E}}}\textstyle{1\over2}{\mathrm{Tr}}\,\left(\gamma_5 D\right) = n_{-}-n_{+}, \end{aligned} $$

such that the operator D exhibits \(|n_{-}-n_{+}|\) exact chiral zero modes. Finally, it was shown [24] that the Ginsparg-Wilson relation implies an exact symmetry of the associated action, with infinitesimal variations proportional to

$$\displaystyle \begin{aligned} \delta\psi=\gamma_5(1-aD)\psi,\qquad \delta\bar{\psi}=\bar{\psi}\gamma_5. \end{aligned} $$

Moreover, this symmetry reproduces the correct chiral anomaly in the flavour singlet case, and therefore all the hallmarks of the correct chiral behaviour are present in the lattice theory: chiral zero modes, an exact index theorem and the chiral anomaly derived from the Ward identities associated with the exact symmetry.

Another line in the development of lattice fermion actions that preserve chiral symmetry goes back to Kaplan’s domain wall fermion approach [10], which was subsequently applied to QCD by Furman and Shamir [11]. Without going into detail, we state that the basic idea is to introduce an extra, fifth dimension and to couple the fermions to a mass defect (the so-called “domain wall height”) in that extra dimension. To make this more explicit, let x, y denote the coordinates in the four-dimensional bulk, and s, t the coordinates in the 5th dimension, which has finite length N 5. The gauge fields are trivial in the 5th direction, and the Dirac operator then has the general structure

$$\displaystyle \begin{aligned} D_{\text{dwf}}(x,s;y,t) = D^\parallel(x,y)\delta_{st} + \delta(x-y)D_{st}^\perp \end{aligned} $$

where \(D^\parallel(x,y)\) is the usual Wilson-Dirac operator with a negative mass term, − M, which represents the domain wall height. The operator \(D_{st}^\perp \) couples fermions in the 5th dimension and contains the physical bare quark mass \(m_0\). It can then be shown that for \(m_0=0\) and in the limit \(N_5\to\infty\) there are no fermion doublers and, more importantly, chiral modes of opposite chirality are trapped in the four-dimensional domain walls at \(s=1,N_5\).

However, in a real lattice simulation of domain wall fermions, one has to work with a finite value of N 5, so that the decoupling of chiral modes is not exact. One expects, though, an exponential suppression of the remnant chiral symmetry breaking effects, and this has been confirmed in several simulations. Furthermore, the rate of suppression may be accelerated by optimizing the choice of lattice action for the gauge fields. Hence, the domain wall formulation of QCD offers a method to realize almost exact chiral symmetry at non-zero lattice spacing at the expense of simulating a five-dimensional theory.

Another operator which correctly reproduces the chiral properties of QCD at non-zero lattice spacing was constructed by Neuberger [12]. Its definition is

$$\displaystyle \begin{aligned} D_{\mathrm{N}} = \frac{1}{\overline{a}} \left( 1-\frac{A}{\sqrt{A^{\dagger}A}} \right), \quad A=1+s-aD_{\mathrm{w}},\quad \overline{a}=\frac{a}{1+s}, {} \end{aligned} $$

where D w is the massless Wilson-Dirac operator, and |s| < 1 is a tunable parameter. By defining Q = −γ 5A, one can rewrite Eq. (5.47) as

$$\displaystyle \begin{aligned} D_{\mathrm{N}} = \frac{1}{\overline{a}} \left( 1+\gamma_5\mbox{sign}(Q) \right). \end{aligned} $$

The Neuberger-Dirac operator D N removes all doublers from the spectrum, and can easily be shown to satisfy the Ginsparg-Wilson relation [12]. The occurrence of an inverse square root in D N raises two issues. First, it is a priori not clear whether or not D N is local. Second, the application of D N in a computer program is potentially very costly, since the sign-function of the matrix Q must be implemented using, for instance, a polynomial approximation.
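That an operator of the form \(1+\gamma_5\,\mathrm{sign}(Q)\) satisfies the Ginsparg-Wilson relation follows from \(\gamma_5^2=1\) and \(\mathrm{sign}(Q)^2=1\) alone, and this can be checked numerically in a toy setting. The sketch below is not a lattice computation: it uses 2×2 stand-ins for \(\gamma_5\) and for the Hermitian matrix Q (both chosen arbitrarily for illustration), works in units where \(\overline{a}=1\), and obtains the matrix sign function from the Newton iteration X → (X + X⁻¹)∕2 rather than from a polynomial approximation.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b, s=1.0):
    """Return a + s*b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def matrix_sign(q, iterations=50):
    """Newton iteration X -> (X + X^-1)/2, which converges to sign(Q)."""
    x = q
    for _ in range(iterations):
        x = [[0.5 * v for v in row] for row in add(x, inverse(x))]
    return x


ONE = [[1.0, 0.0], [0.0, 1.0]]
GAMMA5 = [[1.0, 0.0], [0.0, -1.0]]            # toy stand-in for gamma_5
Q = [[0.3, 0.8 - 0.5j], [0.8 + 0.5j, -0.4]]   # Hermitian, one eigenvalue of each sign

SIGN_Q = matrix_sign(Q)
D = add(ONE, matmul(GAMMA5, SIGN_Q))          # D_N in units of 1/a_bar

# Ginsparg-Wilson relation: gamma5 D + D gamma5 = D gamma5 D (with a_bar = 1).
lhs = add(matmul(GAMMA5, D), matmul(D, GAMMA5))
rhs = matmul(matmul(D, GAMMA5), D)
residual = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The residual vanishes up to rounding errors, independently of the particular Q chosen, as long as Q has no zero eigenvalue.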

In order to qualify as a viable discretization of the quark action, “strict” locality, meaning that only fields in a local neighbourhood of a given lattice site are coupled, is not actually required. If D(x, y) denotes a generic lattice Dirac operator which couples fields at sites x and y, then a sufficient condition for locality of D is the exponential suppression of non-local interactions, i.e.

$$\displaystyle \begin{aligned} \|D(x,y)\| \leq \mathrm{e}^{-\gamma|x-y|/a}, {} \end{aligned} $$

where |x − y| is the distance between sites and ∥⋅∥ denotes a suitably defined matrix norm. In Ref. [25] it was shown that the Neuberger-Dirac operator \(D_{\mathrm{N}}\) is local in the sense of Eq. (5.49), provided that the lattice spacing in physical units (see footnote 2) is not larger than about 0.13 fm. As far as the issue of numerical efficiency is concerned, we note that the most widely used approximations of sign(Q) with good convergence properties include Chebysheff or Zolotarev polynomials, as well as rational fractions.
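To illustrate the idea behind a polynomial approximation of the sign function, the following sketch treats the scalar case: it approximates \(1/\sqrt{y}\) on the interval [ε, 1] by a Chebyshev interpolant P(y) and uses sign(x) ≈ x P(x²). In an actual simulation the same polynomial would be applied to the matrix Q after rescaling its spectrum. The interval bound `EPS` and the degree are illustrative choices, not values taken from the text.

```python
import math


def chebyshev_coefficients(f, lo, hi, degree):
    """Coefficients c_k of the Chebyshev interpolant of f on [lo, hi]."""
    n = degree + 1
    nodes = [math.cos(math.pi * (j + 0.5) / n) for j in range(n)]
    values = [f(0.5 * (hi - lo) * t + 0.5 * (hi + lo)) for t in nodes]
    return [
        2.0 / n * sum(v * math.cos(math.pi * k * (j + 0.5) / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def chebyshev_eval(coeffs, lo, hi, y):
    """Clenshaw recurrence for sum_k c_k T_k(t) - c_0/2, with t the rescaled y."""
    t = (2.0 * y - lo - hi) / (hi - lo)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + 0.5 * coeffs[0]


EPS = 0.01  # assumed lower bound on the spectrum of Q^2, rescaled to lie in [EPS, 1]
COEFFS = chebyshev_coefficients(lambda y: 1.0 / math.sqrt(y), EPS, 1.0, degree=80)


def approx_sign(x):
    """sign(x) ~ x * P(x^2), valid for sqrt(EPS) <= |x| <= 1."""
    return x * chebyshev_eval(COEFFS, EPS, 1.0, x * x)


# Largest deviation from +/-1 on a grid covering 0.1 <= |x| <= 1.
max_error = max(
    abs(approx_sign(s * k / 100.0) - s) for k in range(10, 101) for s in (-1, 1)
)
```

The required polynomial degree grows as ε decreases, which is why small eigenvalues of Q make the operator expensive to apply.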

The last fermionic discretization we wish to mention here was originally constructed to address another problem of Wilson’s discretization, namely the fact that Wilson fermions are not protected against the occurrence of zero modes for any non-zero value of the bare quark mass. These unphysical zero modes manifest themselves as “exceptional” configurations, which occur with a certain frequency in numerical simulations with Wilson quarks and which can lead to strong statistical fluctuations. The problem can be cured by introducing a so-called “chirally twisted” mass term, after which the fermionic part of the QCD action in the continuum assumes the form

$$\displaystyle \begin{aligned} S_{\mathrm{F}}^{\text{tm;cont}} = \int \mathrm{d}^4{x}\, \bar{\psi}(x)(\gamma_\mu D_\mu + m + {\mathrm{i}}\mu_{\mathrm{q}}\gamma_5\tau^3)\psi(x). \end{aligned} $$

Here, μ q is the twisted mass parameter, and τ 3 is a Pauli matrix. The standard action in the continuum can be recovered via a global chiral field rotation:

$$\displaystyle \begin{aligned} \psi^\prime(x) = \mathrm{e}^{{\mathrm{i}}\alpha\gamma_5\tau^3/2}\psi(x),\qquad \bar{\psi}^\prime(x) = \bar{\psi}(x)\mathrm{e}^{{\mathrm{i}}\alpha\gamma_5\tau^3/2}. {} \end{aligned} $$

Fixing the twist angle α by requiring that \(\tan \alpha =\mu _{\mathrm {q}}/m\) one finds

$$\displaystyle \begin{aligned} S_{\mathrm{F}}^\prime = \int \mathrm{d}^4{x}\, \bar{\psi}^\prime(x)(\gamma_\mu D_\mu +M)\psi^\prime(x), \qquad M=\sqrt{m^2+\mu_{\mathrm{q}}^2}, \end{aligned} $$

which demonstrates the complete equivalence of the twisted formulation with “ordinary” QCD. The lattice action of twisted mass QCD for N f = 2 flavours is defined as [26]

$$\displaystyle \begin{aligned} S_{\mathrm{F}}^{\mathrm{tm}}[U,\bar{\psi},\psi] = a^4\sum_{x\in\Lambda_{\text{E}}}\bar{\psi}(x)(D_{\mathrm{w}}+m_0+{\mathrm{i}}\mu_{\mathrm{q}}\gamma_5\tau^3)\psi(x). {} \end{aligned} $$
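The continuum relation between the pair \((m,\mu_{\mathrm{q}})\) and the pair (M, α) quoted above amounts to writing a complex number in polar form. Since \(\gamma_5\tau^3\) squares to one, its eigenvalues are ±1, and on each eigenspace the twisted mass term \(m+{\mathrm{i}}\mu_{\mathrm{q}}\gamma_5\tau^3\) reduces to \(m\pm{\mathrm{i}}\mu_{\mathrm{q}}=M\mathrm{e}^{\pm{\mathrm{i}}\alpha}\). A minimal sketch, with illustrative function names:

```python
import cmath
import math


def polar_mass_and_twist(m: float, mu_q: float):
    """Return (M, alpha) with M = sqrt(m^2 + mu_q^2) and tan(alpha) = mu_q / m."""
    return math.hypot(m, mu_q), math.atan2(mu_q, m)


def mass_term_eigenvalue(m: float, mu_q: float, sign: int) -> complex:
    """Eigenvalue of m + i*mu_q*gamma5*tau3 where gamma5*tau3 has eigenvalue sign = +/-1."""
    return complex(m, sign * mu_q)


# Arbitrary example values in the same (arbitrary) mass units.
M, alpha = polar_mass_and_twist(3.0, 4.0)
```

The phase \(\mathrm{e}^{\pm{\mathrm{i}}\alpha}\) is what the chiral rotation by α∕2 of both ψ and \(\bar{\psi}\) absorbs, leaving the mass term \(M\bar{\psi}^\prime\psi^\prime\).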

Although this formulation breaks physical parity and flavour symmetries, it has a number of advantages over standard Wilson fermions. In particular, the presence of the twisted mass parameter \(\mu_{\mathrm{q}}\) protects the discretized theory against unphysical zero modes. Another attractive feature of twisted mass lattice QCD is the fact that the leading lattice artefacts are of order \(a^2\) without the need to add the Sheikholeslami-Wohlert term [27], even though the Wilson-Dirac operator is used in Eq. (5.53). Although the problem of explicit chiral symmetry breaking remains, the twisted formulation is particularly useful to circumvent some of the problems that are encountered in connection with the renormalization of local operators on the lattice. Recent reviews of twisted mass lattice QCD can be found in [28, 29].

We wish to end this part with a few general remarks. Although we have discussed discretizations of the QCD action in some detail, including the most recent developments, many more variants of the basic types of action—including several different combinations of fermionic and pure gauge parts—can be found in the literature. This reflects the fact that the discretization is not unique. The actual choice of lattice action in a particular simulation will influence the convergence rate to the continuum limit, the algorithmic efficiency, the renormalization properties of local operators, or—in the case of domain wall fermions—the extent to which chiral symmetry is realized. Depending on the properties of a particular discretization, the choice of lattice action can be optimized for the physics one wishes to study.

5.2.3 Functional Integral and Observables

The lattice formulation provides a regularization of non-Abelian gauge theories whilst preserving the gauge invariance at all stages of the calculation. This comes at a price, since all continuous space-time symmetries are broken explicitly and must be recovered in the continuum limit. Nevertheless, the lattice regularized theory inherits all consequences of gauge invariance, including renormalizability. Moreover, the lattice regularizes the theory without any reference to perturbation theory. By contrast, in continuum schemes like the \({\overline {{\mathrm {MS}}}}\) scheme of dimensional regularization the cutoff is only defined after fixing the order of the perturbative expansion. As we shall see below, observables in lattice QCD are directly given in terms of functional integrals, which can be evaluated stochastically using Monte Carlo integration. In this way, any use of perturbation theory is completely avoided.

For concreteness, let us assume that we have made a particular choice for the Yang–Mills part \(S_{\mathrm{G}}[U]\) and the fermionic part \(S_{\text{F}}[U,\bar {\psi },\psi ]\), for instance, the Wilson plaquette action and Wilson fermions. Let Ω denote an observable, which is represented by a polynomial in the quark and antiquark fields and the link variables. The expectation value, 〈Ω〉, is defined through the Euclidean functional integral (see footnote 3)

$$\displaystyle \begin{aligned} \left\langle\Omega\right\rangle = \frac{1}{Z}\int D[U]D[\bar{\psi},\psi] \,\Omega\, \mathrm{e}^{-S_{\mathrm{G}}[U]-S_{\mathrm{F}}[U,\bar{\psi},\psi]}, \end{aligned} $$

where Z is fixed by the condition 〈1〉 = 1. The functional integral involves an integration over the gauge group and over all fermionic degrees of freedom, the latter being represented by anti-commuting (Grassmann) variables. Since the fermionic action \(S_{\mathrm {F}}[U,\bar {\psi },\psi ]\) is bilinear in the quark and antiquark fields, the integration over the Grassmann variables is Gaussian and can be performed analytically. This yields

$$\displaystyle \begin{aligned} \left\langle\Omega\right\rangle = \frac{1}{Z}\int \prod_{x\in\Lambda_{\mathrm{E}}}\prod_{\mu=0}^3 {\mathrm{d}}U_\mu(x)\, \widetilde{\Omega}\,\left\{\det D_{\text{lat}}\right\}^{N_{\mathrm{f}}} \mathrm{e}^{-S_{\mathrm{G}}[U]}. {} \end{aligned} $$

Equation (5.55) requires some further explanation:

  • \(\widetilde {\Omega }\) denotes the representation of Ω in the (effective) theory, where the quark fields have been integrated out and only the link variables remain in the functional integral measure;

  • \(D_{\text{lat}}\) denotes a generic, massive lattice Dirac operator. For instance, for Wilson quarks one has \(D_{\text{lat}}=D_{\mathrm{w}}+m_0\). For simplicity we have displayed the expression for QCD with \(N_{\mathrm{f}}\) flavours of equal mass \(m_0\), which accounts for the power \(N_{\mathrm{f}}\). In the case of non-degenerate quarks \(\{\det {D_{\text{lat}}}\}^{N_{\mathrm {f}}}\) must be replaced by a product of determinants, \(\prod_{f=1}^{N_{\mathrm{f}}}\det D_{\text{lat}}^{(f)}\), in which each factor represents the contribution from a single flavour f with its own bare mass;

  • The lattice formulation has given a well-defined meaning to the measure D[U]. The integration over the gauge degrees of freedom reduces to a finite-dimensional integration over the gauge group, based on the invariant group (Haar) measure.

The numerical evaluation of 〈 Ω〉 via Monte Carlo integration proceeds as follows. One starts by generating a set of gauge configurations using a computer program. One configuration in the set represents the collection of all link variables on a given lattice, i.e.

$$\displaystyle \begin{aligned} \left\{ U_\mu(x)\left| x\in\Lambda_{\mathrm{E}}, \,\mu=0,\ldots,3\right.\right\}, \end{aligned} $$

for which we shall use the shorthand {U μ(x)} below. A collection of an infinite number of configurations is called an ensemble. The statistical weight, W, of an individual configuration is given by

$$\displaystyle \begin{aligned} W=\left\{\det D_{\text{lat}}\right\}^{N_{\mathrm{f}}}\,\mathrm{e}^{-S_{\mathrm{G}}[U]}. {} \end{aligned} $$

In other words, the composition of the ensemble is determined by a probability distribution, which is given by the negative exponentiated classical action in the integrand of the Euclidean functional integral. Owing to the weight factor, the integrand of the functional integral will be strongly peaked around those configurations for which W is large. This particular feature makes the expectation value amenable to a Monte Carlo treatment. The key idea is to replace the ensemble by a finite sample of N cfg gauge configurations, which is dominated by those configurations for which W is large. Provided that one can construct a suitable algorithm, the sample will then consist predominantly of those configurations which give a large contribution to the Euclidean functional integral and thus 〈 Ω〉. Such a procedure is called importance sampling.

Technically, the sample is produced by generating a sequence of configurations via a Markov process:

$$\displaystyle \begin{aligned} \left\{U_\mu(x)\right\}_1 \longrightarrow \left\{U_\mu(x)\right\}_2 \longrightarrow\ldots\longrightarrow \left\{U_\mu(x)\right\}_{N_{\mathrm{cfg}}}. \end{aligned} $$

One assigns a probability for the transition from \(\left \{U_\mu (x)\right \}_i\) to \(\left \{U_\mu (x)\right \}_{i+1}\), which is usually a function of the statistical weights of the two configurations, \(W_i\) and \(W_{i+1}\), respectively. For each individual configuration in the sequence one then evaluates the observable, which yields the estimates \(\Omega_i\), \(i=1,\ldots,N_{\mathrm{cfg}}\). The expectation value 〈Ω〉 is related to the mean value \(\overline \Omega \) via

$$\displaystyle \begin{aligned} \langle\Omega\rangle = \lim_{N_{\mathrm{cfg}}\to\infty} \overline\Omega,\qquad \overline\Omega=\frac{1}{N_{\mathrm{cfg}}} \sum_{i=1}^{N_{\mathrm{cfg}}}\,\Omega_i. \end{aligned} $$

In other words, in the limit of infinite statistics the mean value converges to the ensemble average which is identical to the expectation value. An important consequence of approximating the ensemble average by the sample average is a non-zero value of the variance. Hence, in order to specify the results from a Monte Carlo integration completely, one must also quote the statistical error which is given by the square root of the variance.
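The chain of steps described above (Markov process, measurement on each configuration, sample mean and statistical error) can be illustrated with a deliberately simple toy model. In the sketch below a single real variable x replaces the gauge field, the weight is W = exp(−x²∕2), and the observable is Ω = x², whose exact expectation value is 1. The acceptance rule is that of the Metropolis algorithm. The error is estimated from the scatter of bin averages, a common way of accounting for correlations between successive configurations, which is an addition made here for the sake of the example. All function names and parameter values are illustrative.

```python
import math
import random


def metropolis_chain(log_weight, x0, step, n_cfg, rng):
    """Generate n_cfg samples distributed proportionally to exp(log_weight(x))."""
    x, chain = x0, []
    for _ in range(n_cfg):
        trial = x + rng.uniform(-step, step)
        # Accept the trial with probability min(1, W_trial / W_current).
        if math.log(rng.random() + 1e-300) < log_weight(trial) - log_weight(x):
            x = trial
        chain.append(x)
    return chain


def mean_and_error(values, bin_size):
    """Sample mean and its statistical error from the scatter of bin averages."""
    bins = [
        sum(values[i:i + bin_size]) / bin_size
        for i in range(0, len(values) - bin_size + 1, bin_size)
    ]
    mean = sum(bins) / len(bins)
    var = sum((b - mean) ** 2 for b in bins) / (len(bins) - 1)
    return mean, math.sqrt(var / len(bins))


rng = random.Random(1234)  # fixed seed so that the run is reproducible
chain = metropolis_chain(lambda x: -0.5 * x * x, 0.0, 1.5, 100_000, rng)
omega_bar, omega_err = mean_and_error([x * x for x in chain], bin_size=1000)
```

The estimate `omega_bar` agrees with the exact value within a few multiples of `omega_err`, and the error shrinks like the inverse square root of the number of configurations.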

In the standard algorithms that implement Markov processes (such as the Metropolis algorithm [30]), the transition probabilities for going from one configuration to another are determined by comparing the statistical weights for local variations in the field variables. This guarantees computational efficiency, since the variation of individual link variables does not involve global information from the entire lattice. In Eq. (5.55) the dynamical effects of the quark fields are incorporated via the determinant of the lattice Dirac operator. The determinant, however, is a non-local object, which is expensive to compute. When the first efforts were made to compute observables in QCD in the 1980s, the available computer power did not allow for the inclusion of the quark determinant. Instead, lattice physicists resorted to what is known as the “quenched approximation”, which is based on the assumption that the bulk of non-perturbative contributions is carried by the gauge field, so that the determinant is set to a constant:

$$\displaystyle \begin{aligned} \mbox{Quenched approximation:}\qquad \det{D_{\mathrm{lat}}}=1 \quad \Leftrightarrow\quad N_{\mathrm{f}}=0. \end{aligned} $$

The resulting gain in computer time amounts to several orders of magnitude. In the quenched approximation the effects of virtual quark loops are entirely suppressed. As a consequence, results for observables are afflicted with an unknown systematic error. As we shall see later, there are several quantities (for instance, the masses of the lightest hadrons) for which the quenching error amounts to just 10–15%. Although this justifies the use of the quenched approximation to some extent, it is clear that dynamical quark effects must be taken into account, in order to arrive at reliable, non-perturbative predictions with a total accuracy at the percent level.

Modern algorithms for dynamical quarks, such as the Hybrid Monte Carlo algorithm [31], do not evaluate the quark determinant directly. Rather, one exploits the property that the determinant can be rewritten as a functional integral over bosonic fields, which is then evaluated stochastically. Thereby one avoids computing a global object, but the computational effort involved in the stochastic estimation of the quark determinant is still large compared with the quenched approximation. More details can be found in Sect. 5.2.6 below.

Correlation functions, i.e. the expectation values of polynomials in the quark and gluon fields, are the most important quantities, since they determine implicitly the particle spectrum of the theory. As was discussed already in Sect. 5.2.1, the link between correlation functions and the particle spectrum is provided by the transfer matrix T. For lattice QCD with Wilson fermions, the existence of a positive transfer matrix was rigorously established [32].

As a concrete example we shall discuss the two-point correlation function of a charged kaon. A polynomial of quark fields with the quantum numbers of the kaon is given by

$$\displaystyle \begin{aligned} \phi_{\mathrm{K}}(x) = \left(\bar{u}{\gamma_5}s\right)(x), \end{aligned} $$

where the parentheses indicate summation over spinor and colour components of the fields. Mostly one is interested in correlation functions in which all spatial points have been summed over and which therefore only depend on the Euclidean time separation. We define

$$\displaystyle \begin{aligned} C_{\mathrm{K}}(x_0;\vec{p})=\sum_{\vec{x}}\,\mathrm{e}^{{\mathrm{i}}\vec{p}\cdot\vec{x}} \left\langle\phi_{\mathrm{K}}(x)\phi_{\mathrm{K}}^\dagger(0) \right\rangle. \end{aligned} $$

The inclusion of the phase factor in conjunction with the summation over \(\vec {x}\) amounts to a projection onto spatial momentum \(\vec {p}\). On a finite lattice with periodic boundary conditions \(C_{\mathrm {K}}(x_0;\vec {p})\) must be symmetric under x 0 ↔ T − x 0. Therefore, the spectral decomposition of \(C_{\mathrm {K}}(x_0;\vec {p})\) reads

$$\displaystyle \begin{aligned} C_{\mathrm{K}}(x_0;\vec{p})=\sum_\alpha \frac{\big|\big\langle0\big| \phi_{\mathrm{K}}(0) \big|\alpha\big\rangle\big|{}^2}{2\epsilon_\alpha(\vec{p})} \left\{ \mathrm{e}^{-\epsilon_\alpha(\vec{p})x_0} +\mathrm{e}^{-\epsilon_\alpha(\vec{p})(T-x_0)} \right\}, \end{aligned} $$

where the sum runs over all states in the kaon channel with fixed momentum \(\vec {p}\), and \(\epsilon _\alpha (\vec {p})\) is the mass gap (see Sect. 5.2.1 and footnote 4). For large Euclidean times \(x_0\) the ground state dominates. If we further set \(\vec {p}=0\), then the asymptotic form of the two-point function reads

$$\displaystyle \begin{aligned} \lim_{x_0\to\infty}C_{\mathrm{K}}(x_0;\vec{p}) = \frac{\big|\big\langle0\big| \phi_{\mathrm{K}}(0) \big|K\big\rangle\big|{}^2}{m_{\mathrm{K}}} \mathrm{e}^{-m_{\mathrm{K}}T/2}\,\cosh\left( m_{\mathrm{K}}(T/2-x_0) \right), {} \end{aligned} $$

where \(m_{\mathrm {K}}=\epsilon _0(\vec {p})|{ }_{\vec {p}=0}\) is the mass of the kaon, and the sum of the two exponentials has been re-expressed using the cosh function. Owing to the ordering \(\epsilon _0(\vec {p})<\epsilon _1(\vec {p})<\ldots \), the higher excited states are exponentially suppressed. The functional form of Eq. (5.64) is nicely illustrated by the plot in Fig. 5.3, where simulation data for \(C_{\mathrm {K}}(x_0;\vec {p}=0)\) are compared to its asymptotic form. The data indeed show the expected cosh-behaviour. Furthermore, one observes how the contributions from higher excited states, which are clearly visible at small values of \(x_0/a\), quickly die out as the time separation increases. From the two-point function we can extract two important quantities. The fall-off of \(C_{\mathrm {K}}(x_0;\vec {p}=0)\) is characteristic of the kaon mass, i.e. the energy of the ground state. Moreover, the pre-factor of the \(\cosh \)-function yields the transition amplitude between a kaon state and the vacuum, and thus contains information on the kaon’s decay properties.

Fig. 5.3

Two-point correlation function for a pseudoscalar meson. The curve denotes a fit to Eq. (5.64) in the interval \(6\leq x_0/a\leq 26\)
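One common way of reading off the mass from the cosh form of Eq. (5.64) is to compute an “effective mass” from the ratio of the correlator at neighbouring time slices, in which the pre-factor cancels. The sketch below does this for synthetic, noise-free data generated from the asymptotic form itself, so it only demonstrates the inversion and says nothing about excited-state contamination or statistical errors. The function names, the time extent and the input mass are all made up for illustration, and times are in lattice units.

```python
import math


def cosh_correlator(t, mass, amplitude, T):
    """Asymptotic two-point function A * (exp(-m t) + exp(-m (T - t)))."""
    return amplitude * (math.exp(-mass * t) + math.exp(-mass * (T - t)))


def effective_mass(c_t, c_next, t, T, lo=1e-8, hi=5.0, tol=1e-12):
    """Solve C(t)/C(t+1) = cosh(m (T/2 - t)) / cosh(m (T/2 - t - 1)) for m by bisection.

    Valid for t < T/2 - 1, where the right-hand side increases monotonically with m.
    """
    ratio = c_t / c_next

    def g(m):
        return math.cosh(m * (T / 2 - t)) / math.cosh(m * (T / 2 - t - 1)) - ratio

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


T, m_true = 48, 0.25  # illustrative time extent and mass in lattice units
data = [cosh_correlator(t, m_true, 1.0, T) for t in range(T)]
m_eff = [effective_mass(data[t], data[t + 1], t, T) for t in range(6, 20)]
```

With real simulation data the effective mass would approach a plateau only at larger time separations, once the excited states have died out.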

5.2.4 Continuum Limit, Scale Setting and Renormalization

In Sect. 5.2.2 we have discussed how to discretize the QCD action. The main principle for the construction of lattice actions was the condition that the corresponding expressions reproduce the continuum action in the formal limit a → 0, regardless of the values of the bare parameters (such as β and the hopping parameter κ in the case of QCD with Wilson fermions). If one goes beyond the classical theory this is not possible anymore: it is a general property of quantum field theory that the parameters of the regularized theory (masses and couplings) must be adjusted as the regulator is removed. In the context of lattice QCD this implies that the continuum limit, a → 0, is reached by a suitable tuning of the bare parameters.

To make this statement more precise, we shall invoke the close connection between Euclidean lattice field theory and a system in statistical mechanics. Models in statistical physics (think of the Ising model as an example) usually have a phase structure. Depending on the choice of parameters, the different phases may exhibit entirely different physical properties. The analogy with lattice field theory then implies that a particular discretization of QCD also possesses a phase structure in the space of bare parameters (β and κ, for example; see footnote 5). We shall now explain that the continuum limit of QCD is associated with a critical point in the phase diagram, which corresponds to a second-order phase transition. In the previous section we have considered hadronic two-point correlation functions, and how the mass in a given channel can be extracted from the asymptotic behaviour at large Euclidean times. Actually, this procedure yields the dimensionless combination (aM), i.e. the hadron mass in lattice units. In order to take the continuum limit, one must take a → 0, while the physical mass M must remain constant. This implies

$$\displaystyle \begin{aligned} \frac{1}{(aM)}\equiv\xi\to\infty. \end{aligned} $$

In other words, the correlation length ξ diverges in the continuum limit. In the language of statistical physics, a divergent correlation length signals a second-order phase transition. The existence of the continuum limit in lattice QCD is therefore equivalent to the existence of a second-order transition in the space of bare parameters.

For simplicity we shall now consider Yang–Mills theory on the lattice, which we choose to describe by Wilson’s plaquette action and the bare coupling parameter \(\beta \equiv 6/g_0^2\). The existence of a second-order phase transition corresponds to a critical value of the bare gauge coupling, \(g_{0,c}\). Furthermore, it implies that the bare coupling \(g_0\) and the lattice spacing a (or, equivalently, the correlation length ξ) cannot be varied independently when the continuum limit is approached (see footnote 6). In this way we may regard the bare coupling as a function of the lattice spacing, \(g_0(a)\), such that

$$\displaystyle \begin{aligned} \lim_{a\to0}g_0(a)=g_{0,c}. \end{aligned} $$

Let P be an observable, computed for a particular value of g 0, i.e. P = P(g 0, a). Since P is a physical quantity it must stay constant as the continuum limit is taken, i.e.

$$\displaystyle \begin{aligned} a\frac{\mathrm{d}}{{\mathrm{d}}a}P(g_0,a) = 0. \end{aligned} $$

This leads to the Callan–Symanzik equation

$$\displaystyle \begin{aligned} \left\{ a\frac{\partial}{\partial{a}} +a\frac{\partial{g_0}}{\partial{a}} \frac{\partial}{\partial{g_0}} \right\}P(g_0,a) = 0. {} \end{aligned} $$

We can define the renormalization group β-function β lat as

$$\displaystyle \begin{aligned} \beta_{\mathrm{lat}}(g_0) := -a\frac{\partial{g_0}}{\partial{a}}, \end{aligned} $$

which describes the change in g 0 when a is varied. Note that β lat depends on the choice of discretization. In perturbation theory, however, one recovers the familiar universal coefficients at one- and two-loop order. For gauge group SU(N) one has

$$\displaystyle \begin{aligned} \beta_{\mathrm{lat}}(g_0) = -b_0 g_0^3 -b_1 g_0^5 +{\mathrm{O}}(g_0^7), \end{aligned} $$


$$\displaystyle \begin{aligned} b_0 = \frac{1}{(4\pi)^2} \left\{\frac{11}{3}N-\frac{2}{3}N_{\mathrm{f}}\right\}, \quad b_1 = \frac{1}{(4\pi)^4} \left\{\frac{34}{3}N^2-N_{\mathrm{f}}\left(\frac{13}{3}N-\frac{1}{N} \right)\right\}, {} \end{aligned} $$

and N f = 0 in pure Yang–Mills theory. Starting from the perturbative expansion of β lat one can integrate the Callan–Symanzik equation, which gives

$$\displaystyle \begin{aligned} a\Lambda_{\mathrm{lat}} = (b_0 g_0^2)^{-b_1/(2b_0^2)} \mathrm{e}^{-1/(2b_0 g_0^2)} \left\{1+{\mathrm{O}}(g_0^2)\right\}, {} \end{aligned} $$

where the integration constant Λlat represents a characteristic scale of the theory. The above expression establishes the connection between the lattice spacing and the bare coupling in perturbation theory. One reads off that

$$\displaystyle \begin{aligned} a\to0 \quad \Leftrightarrow\quad g_0\to0, \end{aligned} $$

and hence the critical point occurs at \(g_{0,c}=0\). These findings are a consequence of asymptotic freedom. Taking Eq. (5.72) at face value one would conclude that the relation between \(P(a,g_0)\) and \(P(a^\prime ,g_0^\prime )\), computed for two different values of the bare coupling \(g_0\) and \(g_0^\prime \) near the critical point, is simply given by the ratio of Eq. (5.72) evaluated for \(g_0\) and \(g_0^\prime \). However, actual simulations do not confirm this expectation. The reason for the failure to observe “asymptotic scaling”, i.e. a variation of \(P(a,g_0)\) with \(g_0\) which is consistent with Eq. (5.72), is that the values of \(g_0\) accessible in simulations are not nearly close enough to the critical point for perturbation theory to be a good approximation.
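For orientation, the perturbative relation between the lattice spacing and the bare coupling can be evaluated numerically. The sketch below implements the coefficients \(b_0\), \(b_1\) and the two-loop expression for \(a\Lambda_{\mathrm{lat}}\), dropping the \(\mathrm{O}(g_0^2)\) correction, with the bare coupling entering through \(g_0^2\) as obtained by integrating \(a\,\partial g_0/\partial a=b_0g_0^3+b_1g_0^5\). It uses \(\beta=6/g_0^2\), so the conversion applies to gauge group SU(3) only. The two values of β are arbitrary examples, and the function names are hypothetical.

```python
import math


def beta_coefficients(n_colors: int, n_flavors: int = 0):
    """Universal one- and two-loop coefficients b0, b1 for gauge group SU(N)."""
    b0 = (11.0 / 3.0 * n_colors - 2.0 / 3.0 * n_flavors) / (4.0 * math.pi) ** 2
    b1 = (
        34.0 / 3.0 * n_colors ** 2
        - n_flavors * (13.0 / 3.0 * n_colors - 1.0 / n_colors)
    ) / (4.0 * math.pi) ** 4
    return b0, b1


def a_lambda(g0_squared: float, n_colors: int = 3, n_flavors: int = 0) -> float:
    """Two-loop a*Lambda_lat, without the O(g0^2) correction factor."""
    b0, b1 = beta_coefficients(n_colors, n_flavors)
    return (b0 * g0_squared) ** (-b1 / (2.0 * b0 ** 2)) * math.exp(
        -1.0 / (2.0 * b0 * g0_squared)
    )


def g0_squared_from_beta(beta: float) -> float:
    """Invert beta = 6 / g0^2 (gauge group SU(3))."""
    return 6.0 / beta


# Ratio of lattice spacings predicted by asymptotic scaling for two example couplings.
ratio = a_lambda(g0_squared_from_beta(6.0)) / a_lambda(g0_squared_from_beta(6.2))
```

As discussed above, ratios obtained in this way are not expected to agree with simulation results at the couplings that are accessible in practice.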

Let P and P′ be two different observables that both satisfy Eq. (5.68). Then, regardless of whether or not asymptotic scaling holds, one would expect the ratio \(aP(a,g_0)/aP^\prime(a,g_0)\) to be equal to the physical ratio \(P/P^\prime\) for all values of \(g_0\). However, even this weaker scaling criterion is usually not observed, the reason being that the right-hand side of Eq. (5.68) is not strictly zero. Rather one has

$$\displaystyle \begin{aligned} \left\{ a\frac{\partial}{\partial{a}} -\beta_{\mathrm{lat}}(g_0)\frac{\partial}{\partial{g_0}} \right\}P(g_0,a) = {\mathrm{O}}(a^p), \end{aligned} $$

where p is a positive integer. These so-called scaling violations on the right-hand side depend both on the lattice action and the observable in question. As a consequence, the ratio considered above behaves like

$$\displaystyle \begin{aligned} \frac{aP(a,g_0)}{aP^\prime(a,g_0)} = \frac{P}{P^\prime} + {\mathrm{O}}(a^p). \end{aligned} $$

In other words, as g 0 is tuned towards zero, dimensionless ratios of observables converge to the continuum limit with a rate proportional to a p, where the power p is characteristic of the particular discretization employed in the lattice calculation. In Table 5.1 we have already listed the leading scaling violations (lattice artefacts) for several widely used fermionic lattice actions. Discretizations of the Yang–Mills part, such as the plaquette action, have leading lattice artefacts of O(a 2). The Symanzik improvement programme makes it possible to construct lattice actions with an accelerated rate of convergence to the continuum limit.

Table 5.1 Most widely used discretizations of the Dirac operator and some of their properties

In actual lattice calculations, the continuum limit must be taken by performing simulations at several different values of the lattice spacing and extrapolating the results to a = 0. The functional form of the extrapolation is chosen such that it is consistent with the leading discretization errors for a given lattice action. Such a procedure is only viable if the relation between the lattice spacing in physical units and the dimensionless coupling parameter g 0 (which is an input parameter in the simulation) is known with good accuracy. Since the perturbative formula Eq. (5.72) is not of any practical use, the relation between the scale and the coupling must be mapped out non-perturbatively. To this end one picks a value for g 0 and computes in a Monte Carlo simulation a dimensionful quantity Q, whose value is known from experiment. Common choices for Q in the pure gauge theory are the string tension or the hadronic radius r 0 [33, 34], while in full QCD one may choose the mass of the nucleon. The Monte Carlo procedure yields Q in lattice units, (aQ), and the calibration of the lattice spacing is achieved via

$$\displaystyle \begin{aligned} a^{-1}\,[\text{MeV}] = \frac{Q|{}_{\text{exp}}\,[\text{MeV}]}{(aQ)|{}_{g_0}}. \end{aligned} $$
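The calibration formula amounts to a single division, followed by a unit conversion if the spacing is wanted in fm. A minimal sketch, in which the function names and the numerical inputs in the usage example are hypothetical and only the conversion constant ħc ≈ 197.327 MeV fm is a physical input:

```python
HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm

def inverse_spacing_mev(q_exp_mev, aq_lattice):
    """Inverse lattice spacing in MeV from the experimental value of Q and the measured a*Q."""
    return q_exp_mev / aq_lattice

def spacing_fm(q_exp_mev, aq_lattice):
    """Lattice spacing in fm corresponding to the same calibration."""
    return HBARC_MEV_FM / inverse_spacing_mev(q_exp_mev, aq_lattice)

if __name__ == "__main__":
    # Hypothetical example: Q is a 1000 MeV quantity measured as a*Q = 0.5
    print(inverse_spacing_mev(1000.0, 0.5), spacing_fm(1000.0, 0.5))
```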

Knowledge of (aQ) over a range of bare couplings is a prerequisite for performing the continuum extrapolation. In Fig. 5.4 we show a particular example, namely the continuum extrapolation of the combination \(M_s+\frac{1}{2}(M_u+M_d)\) of quark masses, normalized by the kaon decay constant, computed using O(a) improved Wilson fermions in the quenched approximation [35]. The expected linear convergence in a 2 is clearly exhibited by the lattice data.

Fig. 5.4 Continuum extrapolation of the dimensionless ratio of quark masses and the kaon decay constant [35]
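A continuum extrapolation of the kind described above can be sketched as a straight-line fit in a p. The example below is an illustration only: it uses an unweighted least-squares fit and synthetic data generated from an assumed form c0 + c1·a 2, whereas a real analysis would weight the points by their statistical errors.

```python
def fit_linear(xs, ys):
    """Unweighted least-squares fit of y = c0 + c1*x; returns (c0, c1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1

def continuum_limit(spacings_fm, values, power=2):
    """Extrapolate to a = 0, assuming leading lattice artefacts of order a**power."""
    xs = [a ** power for a in spacings_fm]
    c0, _ = fit_linear(xs, values)
    return c0

if __name__ == "__main__":
    # Synthetic data: continuum value 1.5 with pure a^2 artefacts (illustrative numbers)
    spacings = [0.10, 0.08, 0.07, 0.05]
    data = [1.5 + 4.0 * a ** 2 for a in spacings]
    print(continuum_limit(spacings, data, power=2))
```

The argument `power` should be chosen according to the leading lattice artefacts of the action in use, e.g. 1 for unimproved Wilson fermions and 2 for O(a) improved ones.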

So far we have restricted the discussion to the pure gauge theory which contains only one bare parameter, the gauge coupling g 0 (sometimes expressed in terms of \(\beta =6/g_0^2\)). When quarks are incorporated, the set of parameters must be enlarged by the values of the bare masses, one for each flavour. Lattice QCD is thus parameterized by the set of bare parameters

$$\displaystyle \begin{aligned} \{g_0; m_u, m_d, m_s, m_c, m_b, m_t\}. \end{aligned}$$

In order to be predictive, the theory must be renormalized, by expressing the bare parameters in terms of renormalized ones.

A convenient and practical method for lattice QCD is based on so-called hadronic renormalization schemes. Here the bare coupling and quark masses are eliminated in favour of renormalized quantities such as hadron masses or decay constants. An example of how this works in the pure gauge theory was already given in the preceding discussion on scale setting, where the bare coupling was eliminated by assigning a value in physical units to the lattice spacing. In the process one has to choose a quantity that sets the scale and which can then no longer be predicted.

The bare quark masses m u, m d, … are eliminated in favour of hadronic quantities as follows. Like the bare coupling, the bare quark mass is an input parameter for the simulation and thus freely adjustable. Therefore, simulations yield hadron masses (in lattice units) as a function of the input quark masses. For instance, am PS(m 1, m 2) denotes the mass in lattice units of a generic pseudoscalar meson composed of a quark and antiquark with bare masses m 1 and m 2, respectively. Let us assume that the lattice spacing a has been calibrated using some input quantity Q. If we further assume exact isospin symmetry we can then determine the value of the bare isospin-symmetrized light quark mass \({\hat {m}}=\textstyle {1\over 2}(m_u+m_d)\), by requiring that

$$\displaystyle \begin{aligned} \frac{m_{\mathrm{PS}}(m_1,m_2)}{Q} = \left.\frac{m_\pi}{Q}\right|{}_{\text{exp}}, \qquad m_1=m_2, \end{aligned} $$

i.e. the value of \({\hat {m}}\) is fixed by adjusting the input mass m 1 until m PS(m 1, m 2)∕Q coincides with the experimental result. We can extend this procedure to include more massive flavours. The bare strange quark mass is found by tuning m 2 such that

$$\displaystyle \begin{aligned} \frac{m_{\mathrm{PS}}({\hat{m}},m_2)}{Q} = \left.\frac{m_{\mathrm{K}}}{Q}\right|{}_{\text{exp}}. \end{aligned} $$

Alternatively one can fix m s via the condition \( {m_{\mathrm {V}}({\hat {m}},m_2)}/{Q} = {m_{\mathrm {K}}^\ast }/{Q}|{ }_{\text{exp}}, \) where m V denotes the mass in the vector channel. An example of a particular hadronic renormalization scheme is shown below:



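$$\displaystyle \begin{array}{ll} \text{Bare parameter} & \text{Hadronic input} \\ \hline g_0 & f_\pi \\ \frac{1}{2}(m_u+m_d) & m_\pi \\ m_s & m_{\mathrm{K}} \\ m_c & m_{\mathrm{D}_{\mathrm{s}}} \\ m_b & m_{\mathrm{B}_{\mathrm{s}}} \end{array} $$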

All quantities in a lattice calculation are genuine predictions, except for those that are listed in the right-hand column of the table, which are used to eliminate the bare parameters.
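The tuning of the bare quark masses described above is a one-dimensional root-finding problem for each flavour. The following sketch illustrates the logic with a bisection search. It assumes that the lattice spacing has already been calibrated, so that masses are in MeV, and it replaces the Monte Carlo measurement by a hypothetical stand-in `toy_mps` based on the lowest-order relation m PS 2 = B 0(m 1 + m 2) with an arbitrary value of B 0. In a real calculation each evaluation would be a full simulation.

```python
import math

def toy_mps(m1, m2, b0=2500.0):
    """Stand-in for a simulation: m_PS = sqrt(B0*(m1+m2)), all quantities in MeV."""
    return math.sqrt(b0 * (m1 + m2))

def tune_mass(target, measure, lo, hi, tol=1e-10):
    """Bisection for the bare mass m with measure(m) == target.

    Assumes measure(m) increases monotonically with m on [lo, hi].
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if measure(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

M_PI, M_K = 135.0, 495.0  # approximate experimental inputs in MeV

# Step 1: fix the light quark mass from the pion (degenerate quarks, m1 = m2)
m_hat = tune_mass(M_PI, lambda m: toy_mps(m, m), 0.0, 500.0)
# Step 2: fix the strange quark mass from the kaon, keeping m_hat fixed
m_s = tune_mass(M_K, lambda m: toy_mps(m_hat, m), 0.0, 500.0)

if __name__ == "__main__":
    print(m_hat, m_s)
```

The order of the two steps mirrors the text: the light quark mass is fixed first, and the strange quark mass is then tuned with the light mass held at its tuned value.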

Given the multitude of hadronic states, it is obvious that there is considerable freedom in choosing hadronic renormalization schemes. Usually, masses or mass splittings of hadrons that are stable in QCD are suitable to define a scheme. Resonances, such as the ρ, should be avoided, since they do not have a sharply defined energy, owing to their large width.

5.2.5 Limitations and Systematic Effects

The lattice formulation is the basis for an exact non-perturbative treatment of QCD. The accuracy of lattice results is chiefly limited by the algorithmic performance and the available computer power. In particular, the set of bare parameters that can be simulated efficiently for a given number of lattice sites is restricted. This has the important consequence that the quark masses at the very extremes of the physical mass scale (i.e. the up/down quarks and the b-quark) cannot be simulated directly with currently available methods and machines. These technical limitations are usually translated into a systematic error, which is quoted alongside the statistical one. The most important systematic effects are due to

  • lattice artefacts (cutoff effects),

  • finite volume effects, and

  • extrapolations in the quark mass.

In order to have sufficient control over these effects, the simulation parameters must be chosen such that the following inequalities are satisfied:

$$\displaystyle \begin{aligned} \frac{1}{am_{\text{had}}} \ll \frac{L}{a},\qquad m_{\text{had}}\ll a^{-1}, {} \end{aligned} $$

where m had is the mass of a generic hadron in physical units computed in the simulation. The inequality on the left of (5.79) states that the hadron’s correlation length must be much smaller than the linear extent of the spatial box (in lattice units), as otherwise its value will be strongly distorted by finite volume effects. The inequality on the right states that the hadron mass must be significantly smaller than the inverse lattice spacing. If this is not the case, lattice artefacts will be uncontrollably large, meaning that the presence of higher-order cutoff effects cannot be excluded, so that a reliable extrapolation to the continuum limit as a linear function of the leading power of lattice artefacts cannot be performed. With current algorithms and machines, lattice sizes of up to L∕a = 48 and lattice spacings down to 0.05 fm are affordable, even if dynamical quarks are included. Since a = 0.05 fm corresponds to a −1 ≈ 4 GeV, it is obvious that the b-quark mass is too large to be simulated directly. Several techniques have been devised to address this problem, and a brief account can be found in Sect. 5.7.2.
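The two inequalities can be checked quickly for a given set of simulation parameters. The helper below is a sketch with hypothetical names. It converts the lattice spacing to an inverse spacing in MeV using ħc ≈ 197.327 MeV fm and returns the dimensionless combinations a·m had and m had·L, which should be much smaller and much larger than one, respectively.

```python
HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm

def lattice_window(l_over_a, a_fm, m_had_mev):
    """Dimensionless quantities entering the two inequalities for one hadron mass."""
    a_inv_mev = HBARC_MEV_FM / a_fm
    am_had = m_had_mev / a_inv_mev          # hadron mass in lattice units
    return {
        "a_inv_mev": a_inv_mev,             # inverse lattice spacing in MeV
        "L_fm": l_over_a * a_fm,            # spatial box length in fm
        "am_had": am_had,                   # should be << 1 (cutoff effects)
        "m_had_L": am_had * l_over_a,       # should be >> 1 (finite volume effects)
    }

if __name__ == "__main__":
    # Parameters quoted in the text: L/a = 48, a = 0.05 fm
    print(lattice_window(48, 0.05, 140.0))   # roughly the physical pion mass
    print(lattice_window(48, 0.05, 5279.0))  # roughly the B-meson mass
```

For these parameters the box length is 2.4 fm, and a hadron as heavy as the B meson has a mass above one in lattice units, in line with the statement that the b quark cannot be simulated directly.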

In the light quark sector, the primary limitation that forbids making direct contact with the physical values of the up and down quark masses is mostly due to algorithmic performance, rather than finite size effects. A detailed discussion of the algorithmic difficulties associated with the simulation of light dynamical quarks is presented separately in the following section. Moreover, it is difficult even in the quenched approximation to reach quark masses significantly smaller than half the physical strange quark mass, in particular with Wilson fermions. This is related to the occurrence of arbitrarily small eigenvalues in the spectrum of the Wilson-Dirac operator, even for small but non-vanishing values of the bare mass. As a result, observables computed on individual, so-called “exceptional” configurations may differ from the Monte Carlo average by orders of magnitude, and thus a reliable determination of the result and its error is virtually impossible. As already mentioned in Sect. 5.2.2, the problem of exceptional configurations can be cured by employing alternative discretizations such as twisted mass QCD or the overlap operator. A related problem arising from the particular spectral properties of the Wilson-Dirac operator is the poor performance of standard algorithms for dynamical quarks, discussed in the next section.

For these reasons, many simulations (quenched and unquenched) were restricted to quark masses not much smaller than m s∕2. This value translates into a minimum mass of about 490 MeV in the pseudoscalar meson channel, so that in these simulations the pion is as heavy as the physical kaon. In this region of parameter space it is known empirically that a spatial lattice length of 2–3 fm is sufficient to satisfy the first inequality in (5.79) and to rule out significant finite volume effects. Moreover, an important analytic result derived by Lüscher [36] implies that the asymptotic convergence to the result in infinite volume is exponential.

In order to make contact with the chiral regime, lattice results must be extrapolated to the physical values of the up and down quark masses. The functional form for the dependence of observables on the quark mass is usually provided by Chiral Perturbation Theory (ChPT). For instance, at lowest order the mass of a pseudoscalar meson composed of quarks with masses m 1 and m 2 is related to the quark masses by

$$\displaystyle \begin{aligned} m_{\mathrm{PS}}^2 = B_0(m_1+m_2)+\ldots, {} \end{aligned} $$

where the ellipses represent higher orders in the chiral expansion. Similar expressions are derived for vector meson and baryon masses, e.g.

$$\displaystyle \begin{aligned} m_{\mathrm{V}}=m_{\mathrm{V}}^0+C\,M^2+\ldots,\qquad m_{\mathrm{N}}=m_{\mathrm{N}}^0+k\,M^2+\ldots, \end{aligned} $$

and also for other quantities such as pseudoscalar decay constants. In the above formulae, M 2 ≡ B 0(m 1 + m 2), and \(m_{\mathrm {V}}^0\) and \(m_{\mathrm {N}}^0\) denote the (non-vanishing) masses in the chiral limit. A more formal introduction to the basic concepts of ChPT is presented in Sect. 5.6.1.
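As a short worked example, the lowest-order formula (5.80) makes the earlier statement plausible that degenerate quarks with mass m s∕2 produce a pseudoscalar meson about as heavy as the physical kaon. This is only an estimate at leading order, with higher orders in the chiral expansion neglected:

$$\displaystyle \begin{aligned} m_{\mathrm{PS}}^2\left({\textstyle{1\over2}}m_s,{\textstyle{1\over2}}m_s\right) = B_0\,m_s, \qquad m_{\mathrm{K}}^2 = B_0\,({\hat{m}}+m_s) \quad\Rightarrow\quad \frac{m_{\mathrm{PS}}^2}{m_{\mathrm{K}}^2} = \frac{m_s}{m_s+{\hat{m}}} \approx 1 \quad\text{for}\quad {\hat{m}}\ll m_s. \end{aligned} $$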

It remains largely unknown whether or not the expressions of ChPT considered at a given order in the expansion can be applied in the quark mass range that is accessible in current simulations. Therefore, chiral extrapolations can lead to substantial systematic uncertainties. For instance, lattice predictions for the ratio of decay constants of the B and B s mesons, \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}\), may differ by 10%, depending on whether the LO or NLO expressions are used as an ansatz for the extrapolation from quark masses around m s∕2. Currently it is estimated that pseudoscalar meson masses of 300 MeV and below must be reached in simulations, in order that ChPT at one- or even two-loop order provides an accurate prediction for the quark mass dependence of hadron masses and matrix elements.

In the quenched approximation, chiral extrapolations are particularly problematic, since the chiral limit is intrinsically pathological, due to the appearance of singularities in the quark mass dependence. This is illustrated by the NLO expression for the ratio \(m_{\mathrm {PS}}^2/(m_1+m_2)\), i.e.

$$\displaystyle \begin{aligned} \frac{m_{\mathrm{PS}}^2}{m_1+m_2} = B_0\big\{ 1-\left({\delta}-\textstyle\frac{2}{3}{\alpha_\Phi}y\right) \left(\ln{y}+1\right) +\left[\left(2\alpha_8-\alpha_5\right) -\textstyle\frac{1}{3}{\alpha_\Phi}\right]y \big\}, {} \end{aligned} $$

where B 0, α 5, α 8, δ and α Φ are low-energy constants. For notational convenience we have introduced

$$\displaystyle \begin{aligned} y=\frac{M^2}{(4\pi F_0)^2},\qquad M^2=B_0(m_1+m_2), {} \end{aligned} $$

where F 0 denotes the pion decay constant in the chiral limit. The low-energy constants δ and α Φ, which multiply the so-called “quenched chiral logarithms”, have no counterpart in the unquenched case. Since δ has a non-zero value [37], the quenched chiral logarithm in Eq. (5.82) gives rise to a singularity in the chiral limit. For many applications, the singularity can be ignored, since its effect is numerically small even at the physical pion mass. However, it signals that the quenched approximation suffers from fundamental conceptual problems.
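The logarithmic growth of Eq. (5.82) towards the chiral limit is easy to see numerically. The sketch below evaluates the right-hand side for a given y. The values of the low-energy constants in the usage example are purely illustrative and are not taken from the text or from Ref. [37].

```python
import math

def quenched_ratio(y, delta, alpha_phi, alpha5, alpha8, b0=1.0):
    """Right-hand side of the quenched NLO formula for m_PS^2/(m1+m2), in units set by b0."""
    log_term = (delta - (2.0 / 3.0) * alpha_phi * y) * (math.log(y) + 1.0)
    analytic = ((2.0 * alpha8 - alpha5) - alpha_phi / 3.0) * y
    return b0 * (1.0 - log_term + analytic)

if __name__ == "__main__":
    # Illustrative constants only: delta > 0, all other constants set to zero
    for y in (1e-1, 1e-2, 1e-3, 1e-4):
        print(y, quenched_ratio(y, delta=0.1, alpha_phi=0.0, alpha5=0.0, alpha8=0.0))
```

For positive δ the ratio grows without bound as y → 0, whereas for δ = 0 and vanishing α Φ it approaches the constant B 0.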

5.2.6 Simulations with Dynamical Quarks

Although one may argue that the quenched approximation describes hadronic properties fairly well, it is clearly unsatisfactory, both from a conceptual point of view, and also because it introduces an unknown systematic error. Below we shall discuss some general issues relating to simulations with dynamical quarks. It must be stressed that several different techniques for treating the quark determinant of Eq. (5.57) efficiently are currently being explored. A preferred or clearly superior method has not emerged so far, and it is likely that some of the approaches presented below may become obsolete in the years to come.

In order to illustrate the main difficulties, we start by introducing the Hybrid Monte Carlo (HMC) algorithm [31], which has been the standard algorithm for simulations with dynamical quarks for many years. In order to produce one step in the Markov chain, the algorithm evolves the link variables according to the equations of motion of a classical Hamiltonian system. To this end one introduces a conjugate momentum variable Πμ(x) for every link U μ(x). The Hamiltonian is defined as

$$\displaystyle \begin{aligned} H[U,\Pi] = {\textstyle{1\over2}}\sum_{x\in\Lambda_{\mathrm{E}}}\sum_{\mu=0}^3 \Pi_\mu(x)\Pi_\mu(x) +S_{\mathrm{G}}[U] +S_{\mathrm{F}}^{\text{eff}}[U,\phi^*,\phi], \end{aligned} $$

where S G[U] is the lattice gauge action, and \(S_{\mathrm {F}}^{\mathrm {eff} }[U,\phi ^*,\phi ]\) denotes an effective lattice fermion action, which is obtained by rewriting the quark determinant as a functional integral over complex bosonic fields ϕ(x) and ϕ*(x). Explicitly, for N f = 2 one has

$$\displaystyle \begin{aligned} \left(\det D_{\text{lat}}\right)^{2} = \int D[\phi^*,\phi]\, \exp\left\{-\sum_{x\in\Lambda_{\mathrm{E}}} \phi^*(x)\left[(D_{\text{lat}}^\dagger D_{\text{lat}})^{-1}\phi\right](x) \right\}. \end{aligned} $$
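As a plausibility check of this identity (a sketch, not a full derivation), consider a single complex variable ϕ and replace D lat by a complex number d. The Gaussian integral is then elementary:

$$\displaystyle \begin{aligned} \int \mathrm{d}(\mathrm{Re}\,\phi)\,\mathrm{d}(\mathrm{Im}\,\phi)\, \exp\left\{-\phi^*\,(d^*d)^{-1}\,\phi\right\} = \pi\,|d|^2. \end{aligned} $$

For an n × n matrix the same steps give \(\pi^n\det (D_{\mathrm{lat}}^\dagger D_{\mathrm{lat}}) = \pi^n|\det D_{\mathrm{lat}}|^2\). The constant factor is absorbed into the integration measure, and \(|\det D_{\mathrm{lat}}|^2\) coincides with \((\det D_{\mathrm{lat}})^2\) provided the determinant is real, which holds, for instance, for a γ 5-Hermitian lattice Dirac operator.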

For each step in the Markov chain, the conjugate momenta are drawn randomly from a Gaussian distribution (“momentum refreshment”). The Hamiltonian H[U,  Π] governs the dynamics of the variables U μ(x) and Πμ(x) with respect to “simulation time” τ, which parameterizes the evolution of U μ(x) and Πμ(x) as the simulation algorithm progresses. The evolution is described by Hamilton’s equations, which read

$$\displaystyle \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\tau}U_\mu(x) = \Pi_\mu(x)U_\mu(x),\qquad \frac{\mathrm{d}}{\mathrm{d}\tau}\Pi_\mu(x) = -F_{\mathrm{G},\mu}(x) -F_{\mathrm{F},\mu}(x), \end{aligned} $$


where

$$\displaystyle \begin{aligned} F_{\mathrm{G},\mu}(x)=\frac{\partial S_{\mathrm{G}}[U]}{\partial U_\mu(x)}, \qquad F_{\mathrm{F},\mu}(x)=\frac{\partial}{\partial U_\mu(x)} \sum_{y\in\Lambda_{\mathrm{E}}} \phi^*(y)\left[(D_{\mathrm{lat}}^\dagger D_{\mathrm{lat}})^{-1}\phi\right](y) \end{aligned} $$

are the forces associated with the gluon and quark fields, respectively. The algorithm then proceeds by integrating the equations of motion numerically. As in any numerical integration scheme, the total time interval is divided into a number of sub-intervals of finite length Δτ, which is called the step size. Starting from an initial gauge configuration {U μ(x)} and a set of conjugate momenta { Πμ(x)}, one obtains new sets \(\{U_\mu ^\prime (x)\}\), \(\{\Pi _\mu ^\prime (x)\}\) after the integration. In the language of classical mechanics, the variables U μ(x) and Πμ(x) evolve along a trajectory in phase space which connects the initial and final configurations. However, since numerical integration is not exact, owing to the finite step size, the energy is not conserved. In the HMC algorithm this is rectified by introducing a global accept/reject step: if ΔH denotes the energy difference between the initial and final configurations, i.e.

$$\displaystyle \begin{aligned} \Delta{H}\equiv H[U^\prime,\Pi^\prime] -H[U,\Pi], \end{aligned} $$

then the new configuration \(\{U_\mu ^\prime (x)\}\) is accepted with probability (footnote 7)

$$\displaystyle \begin{aligned} P\{U\to U^\prime\} = {\mathrm{min}}(1,\mathrm{e}^{-\Delta{H}}). \end{aligned} $$

In other words, a configuration \(\{U_\mu ^\prime (x)\}\) associated with a large value for the energy violation ΔH is less likely to be accepted. This final step completes the Monte Carlo update. The name “Hybrid Monte Carlo” reflects the fact that one combines a deterministic classical dynamics procedure with a pseudo-random accept/reject step.
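The structure of the HMC update (momentum refreshment, molecular-dynamics trajectory, global accept/reject step) can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the SU(3) link variables are replaced by real numbers q i, the action S G + S F eff by a Gaussian action, and the equations of motion are integrated with the leapfrog scheme, which is one common choice of integrator. All function names are hypothetical.

```python
import math
import random

def action(q):
    """Toy action S(q) = sum_i q_i^2 / 2, standing in for S_G + S_F^eff."""
    return 0.5 * sum(x * x for x in q)

def force(q):
    """Force dS/dq for the toy action."""
    return list(q)

def hamiltonian(q, p):
    """H = sum_i p_i^2 / 2 + S(q)."""
    return 0.5 * sum(x * x for x in p) + action(q)

def leapfrog(q, p, step, n_steps):
    """Integrate Hamilton's equations over n_steps steps of size step."""
    q, p = list(q), list(p)
    p = [pi - 0.5 * step * fi for pi, fi in zip(p, force(q))]  # initial half step in p
    for i in range(n_steps):
        q = [qi + step * pi for qi, pi in zip(q, p)]           # full step in q
        scale = step if i < n_steps - 1 else 0.5 * step        # last p update is a half step
        p = [pi - scale * fi for pi, fi in zip(p, force(q))]
    return q, p

def hmc_update(q, step, n_steps, rng):
    """One HMC step; returns (configuration, accepted flag, energy violation)."""
    p = [rng.gauss(0.0, 1.0) for _ in q]                       # momentum refreshment
    h_old = hamiltonian(q, p)
    q_new, p_new = leapfrog(q, p, step, n_steps)
    delta_h = hamiltonian(q_new, p_new) - h_old
    # Accept with probability min(1, exp(-delta_h))
    if delta_h <= 0.0 or rng.random() < math.exp(-delta_h):
        return q_new, True, delta_h
    return q, False, delta_h

def run(n_traj=2000, n_sites=10, step=0.2, n_steps=5, seed=1):
    """Run a short chain; returns (acceptance rate, average of q^2 per site)."""
    rng = random.Random(seed)
    q = [0.0] * n_sites
    accepted, q2_sum = 0, 0.0
    for _ in range(n_traj):
        q, acc, _ = hmc_update(q, step, n_steps, rng)
        accepted += acc
        q2_sum += sum(x * x for x in q) / n_sites
    return accepted / n_traj, q2_sum / n_traj

if __name__ == "__main__":
    print(run())
```

For the Gaussian toy action the exact expectation value of q 2 is one, which provides a simple check that the accept/reject step removes the bias of the finite step size. Increasing `step` at fixed trajectory length enlarges the typical ΔH and lowers the acceptance rate.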

One major problem which has plagued simulations with dynamical quarks over many years is the fact that the efficiency of the conventional HMC algorithm deteriorates sharply when the lattice spacing is decreased and the masses of the light (up and down) quarks are tuned to their physical values. The poor scaling behaviour is driven by the condition number of the lattice Dirac operator D lat, i.e. the ratio of the largest to the smallest eigenvalue. This quantity is known to grow in inverse proportion to the lattice spacing and the quark mass. In particular, the cost of the HMC algorithm grows with the second, perhaps the third, power of the inverse light quark mass. Thus, simulations based on the Wilson-Dirac operator were found to be impractical for lattice spacings below 0.1 fm and quark masses significantly smaller than half of the strange quark mass (footnote 8). This is related to the afore-mentioned fact that even the massive Wilson-Dirac operator is not protected against arbitrarily small eigenvalues. Its condition number may thus fluctuate strongly in the course of the simulation, leading not only to numerical instabilities, but also to large fluctuations in the quark force term F F,μ(x), and, in turn, ΔH. In order to keep a reasonably large acceptance rate of well over 75%, one must reduce the step size Δτ accordingly, and thus the numerical effort required to integrate the equations of motion over an interval τ of fixed length increases.

Two basic strategies to address this problem have been followed: the first is based on using fermionic discretizations that avoid the problem of arbitrarily small eigenvalues, while the aim of the second approach is to improve the simulation algorithms.

Staggered fermions have been advocated as a numerically more efficient alternative to the Wilson-Dirac formulation: since the staggered Dirac operator couples one-component Grassmann fields rather than four-component spinors, fewer floating point operations are required for one application of the operator. Moreover, the residual U(1) ⊗U(1) symmetry protects the quark mass against additive renormalization and thus prevents the occurrence of very small eigenvalues. However, the fact that the staggered formulation describes four “tastes” per quark flavour makes a physical interpretation difficult. Technically, the degeneracy implies that the statistical weight of the quark determinant is too large compared with that of one physical flavour. An ad hoc method to compensate for this is to take fractional powers of the staggered quark determinant. For instance, to simulate QCD with a doublet of degenerate up and down quarks with mass \(\hat {m}\), and a single heavier (strange) quark with mass m s, the probability measure is taken as

$$\displaystyle \begin{aligned} P=\frac{1}{Z}\left\{ \det\left(D_{\text{stagg}}+\hat{m}\right) \right\}^{1/2} \left\{ \det\left(D_{\text{stagg}}+m_s\right) \right\}^{1/4} \mathrm{e}^{-S_{\mathrm{G}}[U]}, {} \end{aligned} $$

where D stagg is the massless staggered Dirac operator. This procedure is known as the “fourth root trick”. The main question, which has been hotly debated, is whether or not the rooted staggered operator corresponds to a local field theory, or whether it induces spurious interactions among the fermionic degrees of freedom, which might lead to a violation of the universality of the continuum limit. A thorough analysis of this problem was given in [39], but so far no firm conclusion has been reached. Nevertheless, the probability measure Eq. (5.90) and the “rooting trick” on which it is based have been employed in large-scale simulations (see, e.g. Ref. [40]).
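The effect of the fractional powers in Eq. (5.90) is most transparent at the level of the logarithm of the weight. The sketch below assumes that the massless staggered operator is anti-Hermitian, so that its eigenvalues come in pairs ±iλ and det(D stagg + m) is a product of factors λ 2 + m 2. The eigenvalue list passed to the functions is a hypothetical input, since production codes never compute the full spectrum.

```python
import math

def log_det_staggered(positive_eigs, mass):
    """log det(D_stagg + m), using eigenvalue pairs +/- i*lambda of D_stagg."""
    return sum(math.log(lam * lam + mass * mass) for lam in positive_eigs)

def log_weight_rooted(positive_eigs, m_light, m_strange, gauge_action):
    """Logarithm of the unnormalized weight for 2+1 flavours with rooted determinants."""
    return (0.5 * log_det_staggered(positive_eigs, m_light)      # two light flavours: power 2/4
            + 0.25 * log_det_staggered(positive_eigs, m_strange)  # one strange flavour: power 1/4
            - gauge_action)

if __name__ == "__main__":
    # Hypothetical spectrum and masses in lattice units
    print(log_weight_rooted([0.1, 0.5, 1.3], m_light=0.01, m_strange=0.05, gauge_action=2.0))
```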

Discretizations based on twisted mass QCD have also been proposed as a numerically more efficient quark action. Here, the twisted mass parameter μ q protects the operator against arbitrarily small eigenvalues. The smallest mass in the pion channel that has been reached with this formulation was as low as 300 MeV [41]. This corresponds to a physical quark mass of about m s∕5, which may be sufficient to enter the regime where the quark mass behaviour of observables can be described analytically using Chiral Perturbation Theory.
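The quoted correspondence between a 300 MeV pion and a quark mass of about m s∕5 can be made plausible with the lowest-order relation (5.80), taking as reference point the statement of Sect. 5.2.5 that degenerate quarks of mass m s∕2 yield m PS ≈ 490 MeV. This is a leading-order estimate only:

$$\displaystyle \begin{aligned} m_{\mathrm{PS}}^2 = 2B_0\,m_q \quad\Rightarrow\quad m_q \approx \frac{m_s}{2}\left(\frac{300\,\text{MeV}}{490\,\text{MeV}}\right)^2 \approx 0.19\,m_s \approx \frac{m_s}{5}. \end{aligned} $$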

Owing to several major algorithmic improvements, simulations based on the Wilson-Dirac operator can now be performed much more efficiently. Without going into much detail, we simply state that most of the gain is due to the use of suitably chosen factorizations of the Wilson-Dirac operator into its low- and high-frequency parts. The various factors are then “better conditioned”. In particular, fluctuations in the condition number can be controlled via a separate and optimized treatment of the low-energy part. In this way the step size Δτ can be increased whilst keeping a reasonably high acceptance rate for fixed total trajectory length τ. Algorithmic implementations of factorization range from Hasenbusch’s “mass preconditioning” [42, 43], Lüscher’s domain decomposition technique based on the Schwarz Alternating Procedure (DD-HMC algorithm) [44], to factorizations based on mass preconditioning combined with rational approximations of the contributions from multiple pseudo-fermion fields [45]. Thanks to these developments, it appears that the spectral properties of the Wilson-Dirac operator are no longer an obstacle to the efficient simulation of lattice QCD with light dynamical quarks. At the same time, large-scale simulations employing the recent algorithmic improvements are only just starting.

5.3 Hadron Spectroscopy

The determination of the spectrum of hadrons, i.e. mesons, baryons, glueballs, and possibly “exotic” hadronic states, starting from the underlying gauge theory of quarks and gluons has traditionally been one of the main applications of lattice QCD. The rôle of lattice calculations in this context is twofold: first, the determination of the experimentally known values of hadron masses from first principles represents a stringent test of QCD. Second, lattice calculations can make predictions for the masses of undiscovered or poorly established states. For instance, lattice results have been instrumental in the search for glueball candidates, and have also contributed significantly to the debate on the existence of pentaquarks.

The principles of hadronic mass calculations have already been outlined at the end of Sect. 5.2.3: After defining a suitable interpolating operator with the quantum numbers of the desired hadronic channel, one computes its Euclidean two-point function. The mass (energy) of the ground state in that channel is then extracted from the exponential fall-off of the correlation function at large Euclidean times. The detailed functional form of the asymptotic behaviour depends on the choice of boundary conditions. Thus, it is not always described by a \(\cosh \) function, as in the example of a pseudoscalar meson on a lattice with periodic boundary conditions in time, cf. Eq. (5.64). For sufficiently large temporal lattice size T, the effect of the boundary conditions is weak, so that one may approximate the functional form of the correlation function for a generic interpolating operator ϕ had(x) by

$$\displaystyle \begin{aligned} C_{\text{had}}(x_0;\vec{p})=\sum_{\vec{x}}\mathrm{e}^{{\mathrm{i}}\vec{p}\cdot\vec{x}} \left\langle \phi_{\text{had}}(x)\phi_{\text{had}}^\dagger(0) \right\rangle \stackrel{T\to\infty}{=} \sum_{\alpha} w_\alpha(\vec{p})\mathrm{e}^{-\epsilon_\alpha(\vec{p})x_0}. {} \end{aligned} $$

Here, the quantity \(w_\alpha (\vec {p})\) is referred to as the spectral weight of the state |α〉. A large value for the spectral weight of the ground state, \(w_1(\vec {p})\), will lead to an early domination of the correlation function by the ground state energy. The choice of ϕ had in a given channel can be optimized such that

$$\displaystyle \begin{aligned} w_1(\vec{p})\;\gg\;w_i(\vec{p}),\qquad i=2,3,\ldots. \end{aligned} $$

An optimal choice of interpolating operator is not only important to ensure a reliable determination of the ground state energy: In order to determine the energies in the excitation spectrum, the associated spectral weights must be maximized by specifying appropriate operators.
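The role of the spectral weights can be illustrated with the effective mass, defined here as the logarithm of the ratio of the correlation function at neighbouring time slices. The sketch below uses synthetic two-state correlators with hypothetical energies and weights (in lattice units), built from the sum of exponentials in Eq. (5.91). It is meant only to show how a larger relative ground-state weight leads to an earlier plateau.

```python
import math

def correlator(t, energies, weights):
    """Sum of exponentials sum_alpha w_alpha * exp(-E_alpha * t)."""
    return sum(w * math.exp(-e * t) for e, w in zip(energies, weights))

def effective_mass(corr):
    """m_eff(t) = ln[C(t)/C(t+1)] for a list of correlator values C(0), C(1), ..."""
    return [math.log(corr[t] / corr[t + 1]) for t in range(len(corr) - 1)]

def synthetic_effective_mass(weights, energies=(0.5, 1.2), n_t=21):
    """Effective mass of a synthetic two-state correlator on n_t time slices."""
    corr = [correlator(t, energies, weights) for t in range(n_t)]
    return effective_mass(corr)

if __name__ == "__main__":
    poor = synthetic_effective_mass(weights=(1.0, 0.8))   # sizeable excited-state weight
    good = synthetic_effective_mass(weights=(1.0, 0.1))   # operator dominated by ground state
    for t in range(0, 10, 3):
        print(t, poor[t], good[t])
```

In both cases the effective mass approaches the ground-state energy from above, but the correlator with the smaller excited-state weight is closer to the plateau at every time slice.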

Below we provide examples for interpolating operators in several mesonic and baryonic channels:


Here, parentheses indicate summation over spinor and colour indices, while curly brackets denote that only spinor indices are summed over.

5.3.1 Light Hadron Spectrum

The determination of the spectrum of light hadrons was historically one of the first attempts to compute hadronic properties on the lattice. Since the masses of the low-lying hadrons are known from experiment, such calculations serve as benchmarks to test the intrinsic accuracy of the lattice approach.

The quenched approximation has been widely used to compute a number of quantities that are of great phenomenological interest. However, these results are of limited value, since the inherent quenching error is left undetermined. A precise calculation of the masses of the lowest lying hadrons in quenched QCD will expose the typical magnitude of the systematic error incurred by neglecting dynamical quark effects. To this end, several calculations of the quenched light hadron spectrum, using different lattice actions, have been performed [46,47,48,49,50,51].

In Ref. [47], the CP-PACS Collaboration presented a comprehensive study of the masses of the lowest pseudoscalar and vector mesons, as well as octet and decuplet baryons. The Wilson fermion action without O(a) improvement was used at four different values of the lattice spacing, and a continuum extrapolation linear in a has been performed for all quantities. CP-PACS adopted a hadronic renormalization scheme in which the lattice scale was fixed using the mass of the ρ-meson. The average up and down quark mass was set using m π. In order to fix m s, either the kaon mass (“K”-input) or the mass of the ϕ-meson (“ϕ”-input) was used. Chiral extrapolations were either based on the form expected from quenched Chiral Perturbation Theory at NLO (see Eq. (5.82)), or on the leading-order formula supplemented by a quadratic term in the quark mass. The resulting (small) differences in the extrapolated values were added as systematic errors in the final results, which are summarily displayed in Fig. 5.5. Although the lattice results are in remarkable overall agreement with the experimentally observed spectrum, one finds significant deviations. For instance, the ratio of the nucleon and the ρ-meson masses is determined as

$$\displaystyle \begin{aligned} \frac{m_{\mathrm{N}}}{m_\rho}=1.143\pm0.033\pm0.018, \end{aligned} $$

where the first error is statistical, and the second is an estimate of systematic uncertainties other than quenching. The above value is 6.7% (2.5 standard deviations) below the experimental value of 1.218. Similarly, vector-pseudoscalar mass splittings, such as \(m_{\mathrm {K}^*}-m_{\mathrm {K}}\), are underestimated by 10–15% (4–6σ), depending on whether m K or m ϕ was used to fix the strange quark mass.

Fig. 5.5 Quenched light hadron spectrum computed in [47], compared with experiment. The statistical error and the sum of the statistical and systematic errors are indicated

The findings reported by CP-PACS, which were based on unimproved Wilson fermions, have been broadly confirmed by other collaborations employing different lattice actions [48,49,50,51]. Thereby, the universality of the continuum limit of quenched QCD has been established: although different discretizations may yield statistically inconsistent results at non-zero lattice spacing, they converge to a common continuum limit, provided that the same hadronic renormalization scheme has been employed. The latter requirement is important, as there is considerable freedom in choosing a particular scheme. This leads to ambiguities in the quenched approximation, since different quantities are affected in different ways by quark loops. In Ref. [51] it was found that, by using only stable or narrow states to define the hadronic renormalization scheme, the discrepancies between the quenched and experimental spectra could be shifted to the broad resonances, ρ, Δ, N*, while the agreement for states like K, ϕ, N, Ω could be improved. Yet this observation does not alter the conclusion that the quenched approximation is unable to reproduce the spectrum of light hadrons with an accuracy better than 10%.

The obvious question is whether sea quark effects can account for the observed deviation between the quenched and experimental spectra. Owing to the larger numerical effort required to simulate QCD with dynamical quarks, unquenched studies have not yet reached the same level of control over systematic effects (notably lattice artefacts and chiral extrapolations) as the quenched benchmark [47]. A “definitive” unquenched calculation of the light hadron spectrum is therefore still lacking, and we refrain from presenting an overview of recent results.

Nevertheless, the observed tendency in all simulations performed to date is that dynamical quarks “do the right thing”, i.e. the deviation from experiment is decreased. An example is shown in Fig. 5.6, where continuum extrapolations of meson masses in the quenched and unquenched theories are compared. The plot shows that the data obtained for N f = 2 are closer to the experimental results in the continuum limit in comparison with their quenched counterparts. However, the figure also shows that the extrapolation of unquenched data is not well constrained, since only three data points are available. Clearly, additional simulations at smaller lattice spacings and quark masses are required for a solid determination of the total error in unquenched calculations of the light hadron spectrum.

Fig. 5.6

Continuum extrapolations of the masses of the K and ϕ mesons in full (N f = 2, full symbols) and quenched QCD (open symbols), compared with experiment (diamonds) [52]

It should also be noted that the various discretizations of the quark action have complementary advantages and shortcomings. While simulations with Wilson quarks have in the past been restricted to quark masses not much smaller than half the strange quark mass for algorithmic reasons, the use of staggered fermions in conjunction with the rooting procedure may be afflicted with conceptual problems (see the discussion in Sect. 5.2.6). Domain wall and overlap fermions are per se more expensive to simulate. In simulations based on tmQCD the incorporation of a third, heavier quark flavour is quite complicated. Thus, progress in this area is likely to be made through the combined information from different discretizations.

5.3.2 Glueballs

In addition to bound states composed of a quark-antiquark pair or, alternatively, three quarks, QCD is also widely believed to support the existence of glueballs, i.e. bound states consisting mainly of gluonic degrees of freedom. Although several candidates for such states have been proposed (e.g. the f 0(1370), f 0(1500) and f 0(1710)), the experimental difficulty consists in their unambiguous identification as glueballs. To this end, they need to be distinguished from “conventional” flavour-singlet meson resonances in the scalar channel. Predictions for the masses and widths of glueballs from lattice QCD provide crucial input for this task.

The basic principles of mass calculations for glueballs in lattice QCD are the same as for bound states composed of quark degrees of freedom: first one must define an interpolating operator with the appropriate quantum numbers of the glueball state in question. That is, the operator must transform correctly under spin, parity and charge conjugation. At this point a complication arises: the lattice breaks all continuous space-time symmetries, such that Lorentz-invariance or—in the language of Euclidean field theory—rotational invariance is only recovered in the continuum limit. At non-zero lattice spacing the spin assignment is therefore ambiguous. Since the gluon field is represented by link variables, any glueball operator must be constructed from particular combinations of Wilson loops, i.e. products of link variables along closed paths on a hypercubic lattice (see Fig. 5.7).

Fig. 5.7

Wilson loops used in the construction of glueball operators (from Ref. [53])

Operators constructed in this way transform under irreducible representations (IRs) of the octahedral group O h, which are conventionally labelled A 1, A 2, E, T 1 and T 2. By computing the relations between the IRs of O h and SU(2) one finds that each IR in the set {A 1, A 2, E, T 1, T 2} corresponds to infinitely many spins in the continuum. For instance, A 1 transforms not only like a scalar (spin 0) state, but also contributes to spin 4 and yet higher spin states. Similarly, the lowest states to which T 1 makes a contribution are spin 1 and spin 3, while E corresponds to spins 2, 4, 5,…. In order to fully classify lattice glueball operators, the representations of O h are supplemented by the transformation properties under parity and charge conjugation, in full analogy with the usual J PC-assignment in the continuum. For example, an operator labelled \(A_1^{++}\) corresponds to the scalar channel 0++ in the continuum.
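The correspondence between lattice representations and continuum spins quoted above can be checked with the standard character formula. The following is a minimal sketch, assuming that parity is treated separately, so that only the rotation subgroup O (24 elements) needs to be considered, and that the character table entered below (taken from standard group-theory tables, not from this text) is used. The function names are illustrative.

```python
import math

# Conjugacy classes of the rotation group O: (number of elements, rotation angle).
CLASSES = [
    (1, 0.0),                  # identity
    (8, 2.0 * math.pi / 3.0),  # C3 rotations about body diagonals
    (3, math.pi),              # C2 rotations about coordinate axes
    (6, math.pi / 2.0),        # C4 rotations about coordinate axes
    (6, math.pi),              # C2' rotations about face diagonals
]
# Characters of the five irreducible representations, in the class order above.
CHARACTERS = {
    "A1": [1, 1, 1, 1, 1],
    "A2": [1, 1, 1, -1, -1],
    "E": [2, -1, 2, 0, 0],
    "T1": [3, 0, -1, 1, -1],
    "T2": [3, 0, -1, -1, 1],
}

def spin_character(j, theta):
    """Character of the integer-spin-j representation for a rotation by theta."""
    return sum(math.cos(m * theta) for m in range(-j, j + 1))

def decompose(j):
    """Multiplicity of each irrep of O in the continuum spin-j representation."""
    result = {}
    for name, chars in CHARACTERS.items():
        total = sum(size * chi * spin_character(j, theta)
                    for (size, theta), chi in zip(CLASSES, chars))
        result[name] = round(total / 24.0)
    return result

for j in range(6):
    print(j, {name: n for name, n in decompose(j).items() if n})
```

Reading the output column-wise reproduces the statements in the text: A 1 first appears for spins 0 and 4, T 1 for spins 1 and 3, and E for spins 2, 4 and 5.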

The above discussion implies that the two-point correlation function of an operator transforming under \(A_1^{++}\), which is used to describe the scalar glueball, will be contaminated by contributions from a spin 4 state. However, in accordance with Regge theory one may expect that the latter dies out quickly, since higher spin states are more massive.

Another technical complication arises from the empirical observation that the spectral weight, \(w_1(\vec {p})\), of the ground state in Eq. (5.91) is usually quite small. This implies that the asymptotic behaviour of the two-point correlation function is only isolated at large Euclidean times. However, the statistical accuracy deteriorates quickly as x 0 is increased, and in the asymptotic regime the correlation function is numerically comparable to the statistical noise. This precludes a precise determination of the mass of the ground state. A heuristic explanation for the small spectral weight can be given by noting that the operators constructed from the usual link variables are point-like and thus have little projection onto an extended object such as a glueball. The situation can be much improved if the links in the Wilson loops of Fig. 5.7 are replaced by so-called “smeared” or “fuzzed” links [54, 55]. For instance, the approach of [54] replaces the spatial link U j(x) by the combination


where α is a real, tunable parameter, and the symbol \({\mathcal {P}}\) denotes the projection back into the group manifold of SU(3). The procedure can be iterated, so that links at smearing level s, i.e. \(U_j^s(x)\), are constructed from those at level s − 1 via Eq. (5.95). One may say that smearing reduces the UV fluctuations of the gauge field, so that the smeared, extended link variables are better suited to project onto the IR regime, i.e. the long-distance properties. It should be stressed that the links in the temporal direction do not undergo the fuzzing procedure: Fuzzed temporal links will alter the transfer matrix and the spectral information it contains.
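For orientation, smearing of the type introduced in [54] is commonly written as the replacement of a spatial link by itself plus a weighted sum of the neighbouring spatial "staples". The form below is a sketch under stated assumptions: the notation (unit vectors \(\hat {k}\), \(\hat {\jmath }\), the restriction of the sum to spatial directions k ≠ j, and the normalization of α) is chosen here for illustration and need not coincide term by term with Eq. (5.95).

$$\displaystyle \begin{aligned} U_j(x) \;\longrightarrow\; {\mathcal{P}}\Big\{ U_j(x) + \alpha\sum_{k\neq j}\Big[ U_k(x)\,U_j(x+a\hat{k})\,U_k^\dagger(x+a\hat{\jmath}) + U_k^\dagger(x-a\hat{k})\,U_j(x-a\hat{k})\,U_k(x-a\hat{k}+a\hat{\jmath}) \Big] \Big\} \end{aligned} $$

Each term in the square brackets connects the same two lattice sites as the original link, so that the smeared link has the same gauge transformation properties as U j(x).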

In order to obtain detailed information on the glueball spectrum one also seeks to determine the masses of the excited states in a given channel. This requires another level of refinement, since one normally hopes that excited state contributions die out quickly, while they now become the very focus of interest. A widely used method to gain information on the higher excitations is to construct a whole set of interpolating operators {O 1, …, O r} in a given channel, say, \(A_1^{++}\). This is achieved either by considering different shapes of Wilson loops that share the same transformation properties, or by applying several different smearing levels to one particular Wilson loop. Thus, each individual member of the set {O 1, …, O r} is a perfectly valid operator in a given channel, but the projection properties, i.e. the associated spectral weights \(w_{\alpha }^{(i)}\) for a particular state α in the spectral sum will in general be different for each member i = 1, …, r. One then computes the matrix

$$\displaystyle \begin{aligned} C_{ij}(x_0) := \sum_{\vec{x}} \left\langle O_i(x) O_j^\dagger(0) \right\rangle,\qquad i,j=1,\ldots,r, \end{aligned} $$

whose elements consist of the correlations of all combinations of operators in the set. The diagonalization of the matrix correlator then yields the appropriate linear combinations of operators that correspond to the states α = 1, 2, … in the spectral decomposition. Diagonalization is achieved by solving the generalized eigenvalue problem

$$\displaystyle \begin{aligned} C_{ij}(x_0)\phi_j = \lambda(x_0,x_0^\prime)\, C_{ij}(x_0^\prime)\phi_j, \quad x_0^\prime<x_0, \end{aligned} $$

where ϕ denotes a vector, \(x_0^\prime \) is fixed, and C(x 0), \(C(x_0^\prime )\) denote the matrix correlators taken at Euclidean times x 0 and \(x_0^\prime \), respectively. As shown in [56], the set of eigenvalues \(\lambda (x_0,x_0^\prime )\) converges rapidly towards

$$\displaystyle \begin{aligned} \lambda_\alpha(x_0,x_0^\prime) = \mathrm{e}^{-(x_0-x_0^\prime)\epsilon_\alpha},\qquad \alpha=1,\ldots,r, \end{aligned} $$

where 𝜖 α is the mass (energy) of the state α in the spectral sum.
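The variational procedure can be illustrated with synthetic data. The sketch below assumes a noise-free correlator matrix built from exactly two states and two operators, with real overlaps and lattice units a = 1. In this idealized case the generalized eigenvalues reduce to the roots of a quadratic equation and reproduce the input energies exactly. All numerical inputs and function names are hypothetical. In an actual analysis the matrix is obtained from Monte Carlo data and a numerical linear-algebra routine is used instead.

```python
import math

def correlator_matrix(t, energies, overlaps):
    """C_ij(t) = sum_alpha w_i^alpha w_j^alpha exp(-E_alpha t), real overlaps."""
    r = len(overlaps)
    return [[sum(overlaps[i][a] * overlaps[j][a] * math.exp(-e * t)
                 for a, e in enumerate(energies))
             for j in range(r)] for i in range(r)]

def gevp_2x2(a, b):
    """Generalized eigenvalues of A phi = lambda B phi for 2x2 matrices.

    They are the roots of det(A - lambda B) = 0, a quadratic in lambda.
    Returned in descending order.
    """
    quad = b[0][0] * b[1][1] - b[0][1] * b[1][0]   # det(B)
    lin = -(a[0][0] * b[1][1] + a[1][1] * b[0][0]
            - a[0][1] * b[1][0] - a[1][0] * b[0][1])
    const = a[0][0] * a[1][1] - a[0][1] * a[1][0]  # det(A)
    disc = math.sqrt(lin * lin - 4.0 * quad * const)
    roots = [(-lin + disc) / (2.0 * quad), (-lin - disc) / (2.0 * quad)]
    return sorted(roots, reverse=True)

def gevp_energies(t, t_ref, energies, overlaps):
    """Energies from lambda_alpha = exp(-(t - t_ref) E_alpha), lowest first."""
    c_t = correlator_matrix(t, energies, overlaps)
    c_ref = correlator_matrix(t_ref, energies, overlaps)
    return [-math.log(lam) / (t - t_ref) for lam in gevp_2x2(c_t, c_ref)]

TRUE_ENERGIES = [0.5, 0.9]            # hypothetical spectrum in lattice units
OVERLAPS = [[1.0, 0.6], [0.4, 1.0]]   # hypothetical overlaps w_i^alpha
recovered = gevp_energies(5, 2, TRUE_ENERGIES, OVERLAPS)
print(recovered)
```

With more states than operators the eigenvalues only approach the exponential form at large Euclidean times, which is why the reference time and the size of the operator basis matter in practice.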

After all these technicalities, we now report on the status of glueball calculations. Recent results obtained in the quenched approximation were published in [53, 57,58,59,60]. In Fig. 5.8 we show the results from Ref. [57]. The three lowest-lying states are the scalar (0++), tensor (2++) and the 0−+ glueballs, whose masses are determined as

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle m_{0^{++}}=1710(50)(80) \,{\text{MeV}},\quad m_{2^{++}}=2390(30)(120)\,{\text{MeV}},\\ &\displaystyle &\displaystyle m_{0^{-+}}=2560(35)(120)\,{\text{MeV}}. \end{array} \end{aligned} $$

Here, the first error is statistical, while the second is an estimate of systematic uncertainties, which is dominated by the ambiguity in the scale setting in the quenched approximation.

Fig. 5.8

Glueball spectrum in quenched QCD (from Ref. [57])

While it is tempting to identify the experimentally established resonance f 0(1710) as a scalar glueball in the light of the above results, the situation is more complicated. Since lattice predictions for the mass of the lightest glueballs fall into the mass range of conventional scalar mesons, mixing of glueballs with conventional \(q\bar {q}\) states in conjunction with the observed decay patterns must be considered before drawing any definite conclusions. More details on the current phenomenological and experimental situation can be found in [61, 62]. So far, there have been only exploratory attempts to study glueball-meson mixing directly on the lattice. Any meaningful investigation must inevitably include dynamical quark effects, whose influence on the glueball spectrum has so far been only poorly understood.

5.4 Confinement and String Breaking

The empirical fact that quarks and gluons are not observed as free particles is commonly referred to as confinement. Since all experimentally observed states are singlets under SU(3)colour, confinement is tantamount to saying that isolated colour charges are not allowed. A theoretical understanding of this phenomenon must inevitably go beyond the perturbative level, since QCD is a strongly coupled theory.

In Ref. [6], Wilson formulated a criterion for the confinement of colour charges known as the “area law”. Let \(U({\mathcal {C}})\) denote the product of link variables around a closed loop \({\mathcal {C}}\) on a hyper-cubic lattice. The trace over colour indices is called the “Wilson loop”, i.e.

$$\displaystyle \begin{aligned} W({\mathcal{C}})={\mathrm{tr}}\,\{U({\mathcal{C}})\}. \end{aligned} $$

The area law then states that colour charges are confined if the expectation value of \(W({\mathcal {C}})\) decays exponentially with a rate proportional to the area \(A({\mathcal {C}})\) enclosed by the curve \({\mathcal {C}}\), i.e.

$$\displaystyle \begin{aligned} \left\langle W({\mathcal{C}}) \right\rangle \equiv \left\langle {\mathrm{tr}}\,\{U({\mathcal{C}})\} \right\rangle \propto \mathrm{e}^{-{\sigma}A({\mathcal{C}})}, \end{aligned} $$

where σ is a constant. An example of a rectangular Wilson loop is shown in Fig. 5.9.

Fig. 5.9

Oriented product of link variables around a rectangle of area r ⋅ t

The interpretation of the area law rests on the observation that a Wilson loop of area rt is equal to the Euclidean correlator which describes the propagation of a static, i.e. infinitely heavy, quark-antiquark pair separated by a distance r over a Euclidean time interval t. If t is taken to infinity at fixed r, the correlator yields the energy of the quark-antiquark pair:

$$\displaystyle \begin{aligned} \left\langle W({\mathcal{C}}) \right\rangle \stackrel{t\gg0}{\sim} \mathrm{e}^{-V(r)t}. \end{aligned} $$

The area law then implies \({\sigma }A({\mathcal {C}})=V(r)t\), and for a rectangular loop one obtains

$$\displaystyle \begin{aligned} V(r)\sim {\sigma}r. \end{aligned} $$

Hence the energy of a static quark-antiquark pair increases linearly with the distance r. To achieve a full separation of static colour sources would therefore require an infinite amount of energy.
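The relation between Wilson loops and the static potential can be made concrete with a small numerical sketch. It assumes synthetic loop expectation values that obey an area law supplemented by a perimeter term. The perimeter term and all numerical values are assumptions made here for illustration, in lattice units. The potential is read off from the ratio of loops at neighbouring time extents, and its slope in r gives σ.

```python
import math

SIGMA = 0.05   # hypothetical string tension in lattice units
MU = 0.2       # hypothetical coefficient of the perimeter term

def wilson_loop(r, t):
    """Synthetic <W> for an r x t rectangle: area law plus perimeter term."""
    return math.exp(-SIGMA * r * t - MU * 2 * (r + t))

def effective_potential(r, t):
    """V_eff(r, t) = ln[W(r, t) / W(r, t + 1)], which tends to V(r) at large t."""
    return math.log(wilson_loop(r, t) / wilson_loop(r, t + 1))

# Potential at fixed time extent and its discrete slope in r.
potential = {r: effective_potential(r, 6) for r in range(1, 9)}
slopes = [potential[r + 1] - potential[r] for r in range(1, 8)]
print(potential)
print(slopes)
```

For these synthetic data the effective potential is independent of t, and the perimeter term only contributes an r-independent constant. With real data the ratio must be monitored as a function of t until a plateau is reached.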

It has long been believed that SU(3) gauge theory is related to some kind of string theory. Heuristically, confinement may be viewed as due to the formation of a narrow tube of chromo-electric and -magnetic flux between static colour charges, the dynamics of which can be described by a string theory. The bosonic string model yields an asymptotic expansion for the static quark potential

$$\displaystyle \begin{aligned} V(r) = {\sigma}r+V_0+\frac{c}{r}+{\mathrm{O}}(1/r^2), {} \end{aligned} $$

where V 0 = const, and the universal coefficient c has been computed as [63]

$$\displaystyle \begin{aligned} c=-\frac{\pi}{12} \end{aligned} $$

in the four-dimensional theory. The proportionality factor σ is called the “string tension”. Instead of the potential one often considers the force, F(r) ≡ dV(r)∕dr. The ansatz Eq. (5.104) yields

$$\displaystyle \begin{aligned} F(r) = \sigma-\frac{c}{r^2}+{\mathrm{O}}(1/r^3), \end{aligned} $$

so that the string tension is obtained as the limiting value of the force, as r → ∞,

$$\displaystyle \begin{aligned} \sigma = \lim_{r\to\infty}F(r). \end{aligned} $$

String models of hadrons have been known since the late 1960s, and a phenomenological value for σ has been determined from Regge theory, \(\sqrt {\sigma }=440\,{\text{MeV}}\).

In QCD with light sea quarks the linear rise of the potential cannot persist for arbitrarily large distances. Instead, the creation of a light quark-antiquark pair from the vacuum will cause the hadronization of the static colour charges, leading to the formation of two static-light mesonic states. Thus, the string or flux-tube is expected to “break” when the two-meson state is energetically favoured over the linearly rising potential. The breaking of the string should set in at a characteristic value for the separation distance, r b, causing the potential to flatten off for r > r b, since the energy of a state of two mesons is independent of their separation.

Lattice simulations have been instrumental for establishing that the area law, the string picture of confinement, as well as string breaking (i.e. hadronization) are indeed properties of SU(3) gauge theory and/or QCD. However, computations of large Wilson loops in lattice simulations suffer from the same problem encountered in glueball mass calculations: due to the strong exponential fall-off, the correlator in the asymptotic region, r, t → ∞, is of the same order of magnitude as the statistical noise. Consequently, the same techniques have been applied, namely the smearing of link variables and the variational approach, which is based on the diagonalization of a matrix correlator. By combining these techniques with procedures designed to reduce statistical fluctuations [64] in the computation of large Wilson loops, one could verify the linear rise of the potential up to distances of r ≲ 1.5 fm [65, 66] (see Fig. 5.10).

Fig. 5.10

Left panel: static quark potential in SU(3) gauge theory (from Ref. [66]). Right panel: force (from Ref. [65]) compared to the bosonic string model (dashed curve) and perturbation theory (solid curve). To compare results at different lattice spacings, all dimensionful quantities have been expressed in units of the hadronic radius r 0 = 0.5 fm (see text)

Since a phenomenological value for \(\sqrt {\sigma }\) could be inferred from Regge theory, the string tension used to be a popular quantity to set the lattice scale. However, as lattice calculations became increasingly precise, it was realized that the extrapolation r → ∞ is not easy to perform on the basis of lattice data restricted to r ≲ 1.5 fm. An alternative, conceptually much more reliable scale is obtained from the force between static colour charges [33]. The hadronic radius r 0 is defined by requiring that the force F(r) evaluated at r = r 0 assumes a given reference value. The latter is fixed by matching F(r) to phenomenological, non-relativistic potential models for heavy quarkonia. The scale r 0 is defined as the solution of

$$\displaystyle \begin{aligned} \left. F(r)r^2\right|{}_{r=r_0}=1.65, \end{aligned} $$

where the constant on the right-hand side is chosen such that r 0 has a value of 0.5 fm in QCD. Choosing r 0 to set the scale avoids the systematic uncertainty associated with the extrapolation of the force to infinite distance. Furthermore, r 0 remains well-defined in QCD with dynamical quarks, where string breaking must occur and the concept of a string tension as the limiting value of the force is intrinsically flawed. The quantity r 0∕a has been determined numerically with good statistical accuracy over a wide range of bare couplings, corresponding to lattice spacings between 0.026 and 0.17 fm [34, 65].
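The defining condition for r 0 can be illustrated numerically. The sketch below is not a lattice determination: it assumes, purely for illustration, that the force is given by the leading terms of the string-model ansatz, F(r) = σ − c∕r², with the phenomenological value \(\sqrt {\sigma }=440\,{\text{MeV}}\) and c = −π∕12, and it uses the conversion constant ħc = 197.327 MeV fm. In practice F(r) is taken from lattice data. The condition r²F(r) = 1.65 is then solved by bisection.

```python
import math

HBARC_MEV_FM = 197.327        # conversion constant hbar*c in MeV fm
SQRT_SIGMA_MEV = 440.0        # phenomenological value quoted in the text
C_STRING = -math.pi / 12.0    # universal coefficient of the bosonic string model

def force(r_fm):
    """Assumed model force F(r) = sigma - c / r^2, in MeV per fm."""
    sigma = SQRT_SIGMA_MEV ** 2 / HBARC_MEV_FM   # string tension in MeV / fm
    return sigma - C_STRING * HBARC_MEV_FM / r_fm ** 2

def dimensionless_force(r_fm):
    """r^2 F(r) as a pure number (natural units restored via hbar*c)."""
    return r_fm ** 2 * force(r_fm) / HBARC_MEV_FM

def solve_r0(target=1.65, lo=0.1, hi=2.0, tol=1e-12):
    """Bisection for r^2 F(r) = target; r^2 F(r) increases monotonically here."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dimensionless_force(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r0 = solve_r0()
print(r0)
```

Under these assumptions the solution lies slightly above 0.5 fm, in line with the statement that the reference value 1.65 corresponds to r 0 ≈ 0.5 fm.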

To test whether the bosonic string model for confinement is consistent with lattice data, one must confront the value of the Coulombic coefficient c in Eq. (5.104) with the predicted value of c = −π∕12. As in the case of the string tension, such a comparison is difficult to perform reliably, since − π∕12 represents the asymptotic value at infinite distance, which must be determined from data computed over a narrow range of accessible distances. Using highly accurate data for the potential V (r), generated by an algorithm which allows for an exponential suppression of statistical fluctuations at large r and t, it could be shown [67] that the quantity

$$\displaystyle \begin{aligned} c_{\mathsf{eff}}(r) =\frac{1}{2} r^3\frac{{\mathrm{d}}^2 V(r)}{{\mathrm{d}}r^2} \end{aligned} $$

indeed converges towards the predicted value of − π∕12. This result confirms the string picture of confinement and suggests that string-like behaviour already sets in at rather small distances.
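That the effective coefficient isolates c can be seen by applying the definition to the ansatz of Eq. (5.104): the linear and constant terms drop out of the second derivative, and only the 1∕r term survives. The following sketch checks this numerically with a central second difference. The values of σ and V 0 and the step size are arbitrary choices made for illustration.

```python
import math

SIGMA = 1.0                   # string tension in arbitrary units (assumed)
V0 = 0.3                      # constant term (assumed)
C_STRING = -math.pi / 12.0    # coefficient of the 1/r term

def potential(r):
    """Leading terms of the string-model ansatz for V(r)."""
    return SIGMA * r + V0 + C_STRING / r

def c_eff(r, h=1e-2):
    """c_eff(r) = (1/2) r^3 V''(r), using a central second difference."""
    second = (potential(r + h) - 2.0 * potential(r) + potential(r - h)) / h ** 2
    return 0.5 * r ** 3 * second

for r in (0.5, 1.0, 2.0):
    print(r, c_eff(r), C_STRING)
```

For lattice data the second derivative must itself be defined through finite differences at spacing a, so that the choice of discretization becomes part of the definition of the effective coefficient.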

The incorporation of dynamical quarks should drastically change the string picture beyond a characteristic scale r b, where due to \(q\bar {q}\) pair creation string breaking occurs, since a two-meson state is energetically favoured over the flux-tube. However, the static quark potential determined from Wilson loops on dynamical configurations typically does not show any clear signs of flattening off, even at distances as large as 1 fm, where one expects hadronization to set in. This is attributed to the Wilson loop having little overlap onto the state of a broken string, such that the spectral weight associated with the broken string is extremely small. Therefore, extracting its energy reliably would require large Euclidean time separations, for which the statistical signal is usually lost.

It was thus proposed to address this problem by constructing a matrix correlator of Wilson loops supplemented by operators that directly project onto a two-meson state, and to consider their cross-correlations with the unbroken flux-tube. This strategy was first applied to Higgs models, i.e. non-Abelian gauge theory coupled to bosonic matter fields (“scalar QCD”), which are computationally much more efficient, whilst preserving the mechanism for string breaking to occur [68, 69]. The method was later extended to QCD with two flavours of dynamical quarks [70]. The plots in Fig. 5.11 clearly show that the ground state energy at short distances is linearly rising, while the first excited state (i.e. the two-meson state) is constant in r. At a certain separation r b one observes a crossing of energy levels and a continuing flat behaviour of the ground state energy. Near the crossing point one actually observes a repulsion of the energy levels, which is characteristic for the breaking phenomenon. The diagonalization of the matrix correlator also yields information on the composition of the states in the spectral decomposition. Indeed, for distances r < r b the combination of operators describing the ground state is dominated by Wilson loops, whereas for r > r b, two-meson operators are the most relevant.
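The qualitative features described above (level repulsion near r b and the change in the composition of the ground state) already appear in a two-level toy model. The sketch below is not the lattice calculation: it assumes a 2 × 2 energy matrix with a linearly rising string energy σr, a constant two-meson energy and a constant mixing element g, all of which are hypothetical inputs.

```python
import math

SIGMA = 1.0         # string tension (hypothetical units)
E_TWO_MESON = 3.0   # energy of the two static-light mesons (hypothetical)
MIXING = 0.1        # off-diagonal mixing element g (hypothetical)

def levels(r):
    """Eigenvalues of [[sigma*r, g], [g, E_mm]] as (ground, excited)."""
    string_energy = SIGMA * r
    mean = 0.5 * (string_energy + E_TWO_MESON)
    half_gap = math.hypot(0.5 * (string_energy - E_TWO_MESON), MIXING)
    return mean - half_gap, mean + half_gap

def string_content(r):
    """Squared string-like component of the ground-state eigenvector."""
    delta = 0.5 * (SIGMA * r - E_TWO_MESON)
    return 0.5 * (1.0 - delta / math.hypot(delta, MIXING))

for r in (1.0, 2.5, 3.0, 3.5, 5.0):
    ground, excited = levels(r)
    print(r, ground, excited, string_content(r))
```

In this model the levels never cross: at r b = E∕σ they are separated by 2g, and the string-like component of the ground state drops from nearly one to nearly zero as r passes through r b.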

Fig. 5.11

Ground state and first excited state of the static quark potential computed using matrix correlators in the SU(2) Higgs model [68] (left panel) and QCD with N f = 2 flavours of dynamical quarks [70] (right panel)

5.5 Fundamental Parameters of QCD

We have noted already that QCD is parameterized in terms of the gauge coupling and the masses of the quarks. In order to make predictions for cross sections, decay rates and other observables, their values must be fixed from experiment. As was discussed in detail in Sect. 4.3, the renormalization of QCD leads to the concept of a “running” coupling constant, which depends on some momentum (energy) scale μ, and the same applies to the quark masses (Footnote 9):

$$\displaystyle \begin{aligned} \alpha_s(\mu)\equiv \frac{\bar{g}^2(\mu)}{4\pi},\; \bar{m}_u(\mu),\,\bar{m}_d(\mu),\,\bar{m}_s(\mu),\, \bar{m}_c(\mu),\,\bar{m}_b(\mu),\,\bar{m}_t(\mu). {} \end{aligned} $$

The property of asymptotic freedom implies that the coupling becomes weaker as the energy scale μ is increased. This explains why the perturbative expansion of cross sections in the high-energy domain allows for an accurate determination of α s from experimental data.

The scale dependence of the coupling and the quark masses is encoded in the renormalization group (RG) equations, which are formulated in terms of the β-function and the anomalous dimension τ,

$$\displaystyle \begin{aligned} \mu\frac{\partial\bar{g}(\mu)}{\partial\mu}=\beta(\bar{g}),\qquad \mu\frac{\partial\bar{m}(\mu)}{\partial\mu}=\bar{m}\tau(\bar{g}). \end{aligned} $$

At high enough energy the RG functions β and τ admit perturbative expansions according to

$$\displaystyle \begin{aligned} \beta(\bar{g})=-b_0\bar{g}^3-b_1\bar{g}^5+\ldots,\qquad \tau(\bar{g}) =-d_0\bar{g}^2-d_1\bar{g}^4+\ldots. \end{aligned} $$

Here, b 0, b 1 and d 0 = 8∕(4π)² are universal, while the higher coefficients depend on the adopted renormalization scheme.

From the asymptotic scaling behaviour at high energies one can extract the fundamental scale parameter of QCD via

$$\displaystyle \begin{aligned} \Lambda= \lim_{\mu\to\infty}\left\{\mu (b_0\bar{g}^2)^{-b_1/2b_0^2}\, \mathrm{e}^{-1/2b_0\bar{g}^2}\right\},\qquad \bar{g}\equiv\bar{g}(\mu). {} \end{aligned} $$
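The approach to the limit in Eq. (5.113) can be illustrated by integrating the RG equation numerically. The sketch below assumes the β-function truncated at two loops, the standard universal coefficients b 0 = (11 − 2N f∕3)∕(4π)² and b 1 = (102 − 38N f∕3)∕(4π)⁴, which are not spelled out in the text, N f = 0, and a hypothetical starting value α s = 0.12 at a scale μ 0. The expression in braces is evaluated at increasingly large μ.

```python
import math

N_F = 0  # number of quark flavours (choice made for this illustration)
B0 = (11.0 - 2.0 * N_F / 3.0) / (4.0 * math.pi) ** 2
B1 = (102.0 - 38.0 * N_F / 3.0) / (4.0 * math.pi) ** 4

def d_gsq_dlogmu(gsq):
    """d g^2 / d ln(mu) = 2 g beta(g), with the two-loop beta-function."""
    return -2.0 * B0 * gsq ** 2 - 2.0 * B1 * gsq ** 3

def run_gsq(gsq, log_ratio, steps=2000):
    """Evolve g^2 over ln(mu / mu_0) = log_ratio with fourth-order Runge-Kutta."""
    h = log_ratio / steps
    for _ in range(steps):
        k1 = d_gsq_dlogmu(gsq)
        k2 = d_gsq_dlogmu(gsq + 0.5 * h * k1)
        k3 = d_gsq_dlogmu(gsq + 0.5 * h * k2)
        k4 = d_gsq_dlogmu(gsq + h * k3)
        gsq += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return gsq

def lambda_over_mu(gsq):
    """The expression in braces divided by mu, evaluated at finite mu."""
    return (B0 * gsq) ** (-B1 / (2.0 * B0 ** 2)) * math.exp(-1.0 / (2.0 * B0 * gsq))

GSQ_START = 4.0 * math.pi * 0.12   # hypothetical coupling at the scale mu_0
estimates = []
for log_ratio in (0.0, 5.0, 10.0, 20.0):
    gsq = run_gsq(GSQ_START, log_ratio)
    # Estimate of Lambda in units of mu_0, evaluated at mu = mu_0 * exp(log_ratio).
    estimates.append(math.exp(log_ratio) * lambda_over_mu(gsq))
print(estimates)
```

The successive estimates change by only a few per cent over many orders of magnitude in μ, but the remaining corrections decrease only like \(\bar {g}^2(\mu )\), i.e. logarithmically in μ.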

Like the running coupling itself, the Λ-parameter depends on the chosen renormalization scheme (Footnote 10). A related, but less commonly used variable is the renormalization group invariant (RGI) quark mass

$$\displaystyle \begin{aligned} M_{\mathrm{f}}=\lim_{\mu\to\infty}\left\{\bar{m}_{\mathrm{f}} (2b_0\bar{g}^2)^{-d_0/2b_0}\right\}, \quad f=u,d,s,\ldots, \qquad \bar{m}\equiv\bar{m}(\mu). {} \end{aligned} $$

Unlike Λ, the RGI quark masses are scheme-independent quantities. Instead of using the running coupling and quark masses of Eq. (5.110), one can parameterize QCD in an entirely equivalent way through the set

$$\displaystyle \begin{aligned} \Lambda,\,M_u,\,M_d,\,M_s,\,M_c,\,M_b,\,M_t. {} \end{aligned} $$

At the non-perturbative level these quantities represent the most appropriate parameterization of QCD, since their values are defined without any truncation of perturbation theory.

The perturbative renormalization of QCD is accomplished by replacing the bare parameters with renormalized ones, whose values are fixed by considering the high-energy behaviour of Green’s functions, usually computed in the \({\overline {{\mathrm {MS}}}}\)-scheme of dimensional regularization. However, at low energies it is convenient to adopt a hadronic renormalization scheme, in which the bare parameters are eliminated in favour of quantities such as hadron masses and decay constants (see Sect. 5.2.4). Since QCD is expected to describe both the low- and high-energy regimes of the strong interaction, one should be able to express the quantities of Eq. (5.115), which are determined from the high-energy behaviour, in terms of hadronic quantities. In other words, by matching a hadronic renormalization scheme to a perturbative scheme like \({\overline {{\mathrm {MS}}}}\) one achieves the non-perturbative renormalization of QCD at all scales. In particular, one can express the fundamental parameters of QCD (running coupling and masses, or, equivalently, the Λ-parameter and RGI quark masses) in terms of low-energy, hadronic quantities. This amounts to predicting the values of these fundamental parameters from first principles.

5.5.1 Non-perturbative Renormalization

To illustrate the problem of matching hadronic and perturbative schemes like \({\overline {{\mathrm {MS}}}}\), it is instructive to discuss the determination of the light quark masses. A convenient starting point is the PCAC relation, which for a charged kaon can be written as

$$\displaystyle \begin{aligned} f_{\mathrm{K}}m_{\mathrm{K}}^2 = (\bar{m}_u+\bar{m}_s)\left\langle 0| (\bar{u}\gamma_5{s})|{K^+}\right\rangle. {} \end{aligned} $$

In order to determine the sum of quark masses \((\bar {m}_u+\bar {m}_s)\), using the experimentally determined values of f K and m K, it suffices to compute the matrix element \(\left \langle 0| \bar {u}\gamma _5{s}|{K^+}\right \rangle \) in a lattice simulation, as outlined in Sect. 5.2.3 (see Eq. (5.64)). The dependence on the renormalization scale and scheme cancels in Eq. (5.116), since the quantities on the left hand side are physical observables. Thus, in order to determine the combination \((\bar {m}_u+\bar {m}_s)\) in the \({\overline {{\mathrm {MS}}}}\)-scheme, one must compute the relation between the bare matrix element of the pseudoscalar density evaluated on the lattice and its counterpart in the \({\overline {{\mathrm {MS}}}}\)-scheme:

$$\displaystyle \begin{aligned} (\bar{u}\gamma_5{s})_{{\overline{{\mathrm{MS}}}}} = Z_{\mathrm{P}}(g_0,a\mu)(\bar{u}\gamma_5{s})_{\text{lat}}. \end{aligned} $$

Here, μ is the subtraction point (renormalization scale) in the \({\overline {{\mathrm {MS}}}}\)-scheme. Provided that Z P and the matrix element of \((\bar {u}\gamma _5{s})_{\text{lat}}\) are known, one can use Eq. (5.116) to compute \((\bar {m}_u+\bar {m}_s)/f_{\mathrm {K}}\), which is just the ratio of a renormalized fundamental parameter expressed in terms of a hadronic quantity, up to lattice artefacts. In Fig. 5.4 we have already shown the continuum extrapolation of this ratio (Footnote 11).

The factor Z P is obtained by imposing a suitable renormalization condition involving Green’s functions of the pseudoscalar densities in the \({\overline {{\mathrm {MS}}}}\) as well as the hadronic scheme. Since the \({\overline {{\mathrm {MS}}}}\)-scheme is intrinsically perturbative, in the sense that masses and couplings are only defined at a given order in the perturbative expansion, it is actually impossible to formulate such a condition at the non-perturbative level. In perturbation theory at one loop one finds

$$\displaystyle \begin{aligned} Z_{\mathrm{P}}(g_0,a\mu) = 1 + \frac{g_0^2}{4\pi}\left\{ \frac{2}{\pi}\ln(a\mu) + C\right\} + O(g_0^4), {} \end{aligned} $$

where C is a constant that depends on the chosen discretization of the QCD action. Expressions like these are actually not very useful, since perturbation theory formulated in terms of the bare coupling g 0 converges rather slowly, so that reliable estimates of renormalization factors at one- or even two-loop order in the expansion cannot be obtained. Thus it seems that the problem of non-perturbative renormalization is severely hampered by the intrinsically perturbative nature of the \({\overline {{\mathrm {MS}}}}\) scheme in conjunction with the bad convergence properties of lattice perturbation theory.

This problem can, in fact, be resolved by introducing an intermediate renormalization scheme. Schematically, the matching procedure for the pseudoscalar density (or, equivalently, the quark mass) via such a scheme is sketched in Fig. 5.12. At low energies, corresponding to typical hadronic scales, it involves computing a non-perturbative matching relation between the hadronic and the intermediate scheme X at some scale μ 0. This matching step can be performed reliably if μ 0 is much smaller than the regularization scale a −1. In the following step one computes the scale dependence within the intermediate scheme non-perturbatively from μ 0 up to a scale \(\bar \mu \gg \mu _0\), which is large enough so that perturbation theory can be safely applied. At that point one may then determine the matching relation to the \({\overline {{\mathrm {MS}}}}\)-scheme perturbatively. Alternatively, one can continue to compute the scale dependence within the intermediate scheme to infinite energy via a numerical integration of the perturbative RG functions. According to Eq. (5.114) this yields the relation to the RGI quark mass. Since the latter is scale- and scheme-independent, one can use directly the perturbative RG functions, which in the \({\overline {{\mathrm {MS}}}}\)-scheme are known to four-loop order [71], to compute the relation to \(\bar {m}_{{\overline {{\mathrm {MS}}}}}\) at some chosen reference scale. By applying this procedure, the direct perturbative matching between the hadronic and \({\overline {{\mathrm {MS}}}}\)-schemes (upper two boxes in Fig. 5.12), using the expression in Eq. (5.118), is thus completely avoided.

Fig. 5.12

Sketch of the matching of quark masses computed in lattice regularization and the \({\overline {{\mathrm {MS}}}}\)-scheme, via an intermediate renormalization scheme X

Decay constants of pseudoscalar mesons provide another example for which the renormalization of local operators is a relevant issue. For instance, the kaon decay constant is defined by the matrix element of the axial current, i.e.

$$\displaystyle \begin{aligned} f_{\mathrm{K}}m_{\mathrm{K}} = \left\langle 0\left| (\bar{u}\gamma_0\gamma_5 s)(0) \right|K^{+}\right\rangle. \end{aligned} $$

If the matrix element on the right hand side is evaluated in a lattice simulation, then the axial current in the discretized theory must be related to its counterpart in the continuum via a renormalization factor Z A:

$$\displaystyle \begin{aligned} (\bar{u}\gamma_0\gamma_5 s) = Z_{\mathrm{A}}(g_0) (\bar{u}\gamma_0\gamma_5 s)_{\text{lat}}. \end{aligned} $$

Normally one would expect that the chiral Ward identities ensure that the axial current does not get renormalized. However, this no longer applies if the discretization conflicts with the symmetries of the classical action. This is clearly the case for Wilson fermions, which break chiral symmetry, such that the resulting short-distance corrections must be absorbed into a renormalization factor Z A. Similar considerations apply to the vector current: if the discretization does not preserve chiral symmetry, current conservation is only guaranteed if the vector current is suitably renormalized by a factor Z V, which must be considered even in the massless theory. Unlike the case of the renormalization factor of the pseudoscalar density, Z A and Z V are scale-independent, i.e. they only depend on the bare coupling g 0. From the above discussion it is obvious that perturbative estimates of Z A and Z V are inadequate for computing hadronic matrix elements of the axial and vector currents with controlled errors. A non-perturbative determination of Z A and Z V can be achieved by imposing the chiral Ward identities as a renormalization condition.

Two widely used intermediate schemes, namely the Schrödinger functional (SF) and the regularization-independent momentum subtraction (RI/MOM) schemes, are briefly reviewed in the following. We strongly recommend that the reader consult the original articles (Refs. [72,73,74,75] for the SF, and [76] for RI/MOM) for further details.

5.5.2 Finite Volume Scheme: The Schrödinger Functional

The Schrödinger functional is based on the formulation of QCD in a finite volume of size L³ ⋅ T, regardless of whether space-time is discretized or not, with suitable boundary conditions. Assuming that lattice regularization is employed, one imposes periodic boundary conditions on the fields in all spatial directions, while Dirichlet boundary conditions are imposed at Euclidean times x 0 = 0 and x 0 = T. In order to make this more precise, let C and C′ denote classical configurations of the gauge potential. For the link variables at the temporal boundaries one then imposes

$$\displaystyle \begin{aligned} \left.U_k(x)\right|{}_{x_0=0}=\mathrm{e}^{aC},\qquad \left.U_k(x)\right|{}_{x_0=T}=\mathrm{e}^{aC^\prime}. \end{aligned} $$

In other words, the links assume prescribed values at the temporal boundaries, but remain unconstrained in the bulk (see Fig. 5.13).

Fig. 5.13. Left panel: sketch of the SF geometry, indicating the classical gauge potentials at the temporal boundaries. Middle panel: correlation function of boundary quark fields \(\zeta , \bar \zeta \) with a fermionic bilinear operator in the bulk. Right panel: boundary-to-boundary correlation function

Quark fields are easily incorporated into the formalism. Since the Dirac equation is first order, only two components of a full Dirac spinor can be fixed at the boundaries. By defining the projection operator \(P_{\pm }={\textstyle {1\over 2}}(1\pm \gamma _0)\), one requires that the quark fields at the boundaries satisfy

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \left.P_{+}\psi(x)\right|{}_{x_0=0}=\rho(\vec{x}),\quad \left.P_{-}\psi(x)\right|{}_{x_0=T}=\rho^\prime(\vec{x}), \\ &\displaystyle &\displaystyle \left.\bar{\psi}(x)P_{-}\right|{}_{x_0=0}=\bar\rho(\vec{x}),\quad \left.\bar{\psi}(x)P_{+}\right|{}_{x_0=T}=\bar\rho^\prime(\vec{x}), \end{array} \end{aligned} $$

where \(\rho ,\ldots ,\bar \rho ^\prime \) denote prescribed values of the fields. The functional integral over all dynamical fields in a finite volume with the above boundary conditions is called the Schrödinger functional of QCD:

$$\displaystyle \begin{aligned} {\mathcal{Z}}[C^\prime,\rho^\prime,\bar\rho^\prime;C,\rho,\bar\rho] =\int D[U]D[\bar{\psi},\psi]\,\mathrm{e}^{-S}. {} \end{aligned} $$

The classical field configurations at the boundaries are not integrated over. Using the transfer matrix formalism, one can show that this expression is the quantum mechanical amplitude for going from the classical field configuration \(\{C,\rho ,\bar \rho \}\) at x 0 = 0 to \(\{C^\prime ,\rho ^\prime ,\bar \rho ^\prime \}\) at x 0 = T.

Functional derivatives with respect to \(\rho ,\ldots ,\bar \rho ^\prime \) behave like quark fields located at the temporal boundaries, and hence one may identify

$$\displaystyle \begin{aligned} \zeta(\vec{x}) = \frac{\delta}{\delta\bar\rho(\vec{x})},\quad \bar\zeta(\vec{x}) = -\frac{\delta}{\delta\rho(\vec{x})},\quad \zeta^\prime(\vec{x}) = \frac{\delta}{\delta\bar\rho^\prime(\vec{x})},\quad \bar\zeta^\prime(\vec{x}) = -\frac{\delta}{\delta\rho^\prime(\vec{x})}. \end{aligned} $$

The boundary fields \(\zeta , \bar \zeta ,\ldots \) can be combined with local composite operators (such as the axial current or the pseudoscalar density) of fields in the bulk to define correlation functions. Particular examples are the correlation function of the pseudoscalar density, f P and the boundary-to-boundary correlation f 1

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle f_{\mathrm{P}}(x_0) = -\frac{a^6}{3}\sum_{\vec{y},\vec{z}} \left\langle \bar{\psi}(x)\gamma_5{\textstyle{1\over2}}\tau^a\psi(x) \bar\zeta(\vec{y})\gamma_5{\textstyle{1\over2}}\tau^a\zeta(\vec{z}) \right\rangle, \\ &\displaystyle &\displaystyle f_1 = -\frac{a^{12}}{3L^6}\sum_{\vec{u},\vec{v},\vec{y},\vec{z}} \left\langle \bar\zeta'(\vec{u})\gamma_5{\textstyle{1\over2}}\tau^a\zeta'(\vec{v}) \bar\zeta(\vec{y})\gamma_5{\textstyle{1\over2}}\tau^a\zeta(\vec{z}) \right\rangle, \end{array} \end{aligned} $$

which are shown schematically in the middle and right panels of Fig. 5.13. In the above expressions, the Pauli matrices act on the first two flavour components of the fields.

The specific boundary conditions of the Schrödinger functional ensure that the Dirac operator has a minimum eigenvalue proportional to 1∕T in the massless case [73]. As a consequence, renormalization conditions can be imposed at vanishing quark mass. If the aspect ratio T∕L is set to some fixed value, the spatial length L is the only scale in the theory, and thus the masses and couplings in the SF scheme run with the box size. The recursive finite-size scaling study described below can then be used to map out the scale dependence of running quantities non-perturbatively from low to high energies. It is important to realize that in this way the relevant scale for the RG running (the box size L) is decoupled from the regularization scale (the lattice cutoff a). It is this feature which ensures that the running of masses and couplings can be obtained in the continuum limit.

Let us now return to our earlier example of the renormalization of quark masses. The transition from lattice regularization and the associated hadronic scheme to the SF scheme is achieved by computing the scale-dependent renormalization factor which links the pseudoscalar density in the intermediate scheme to the bare one, i.e.

$$\displaystyle \begin{aligned} (\bar{s}{\gamma_5}u)_{\mathrm{SF}}(\mu_0) = Z_{\mathrm{P}}(g_0,a\mu_0)\,(\bar{s}{\gamma_5}u)_{\text{lat}}(a). \end{aligned} $$

A renormalization condition that defines Z P can be formulated in terms of SF correlation functions:

$$\displaystyle \begin{aligned} Z_{\mathrm{P}}(g_0,a\mu_0) = c\left.\frac{\sqrt{f_1}}{f_{\mathrm{P}}(x_0)} \right|{}_{x_0=T/2},\qquad \mu_0=1/L_{\max}, \end{aligned} $$

where the constant c must be chosen such that Z P = 1 in the free theory. In order to determine the RG running of the quark mass non-perturbatively one can perform a sequence of finite-size scaling steps, as illustrated in Fig. 5.14. To this end one simulates pairs of lattices with box lengths L and 2L, at fixed lattice spacing a. The ratio of Z P evaluated for each box size yields the ratio \(\bar {m}_{\mathrm {SF}}(L)/\bar {m}_{\mathrm {SF}}(2L)\) (upper horizontal step in Fig. 5.14), which amounts to the change in the quark mass when the volume is scaled by a factor 2. In a subsequent step, the physical volume can be doubled once more, which gives \(\bar {m}_{\mathrm {SF}}(2L)/\bar {m}_{\mathrm {SF}}(4L)\). The important point to realize is that the lattice spacing can be adjusted for a given physical box size. In this way the number of lattice sites can be kept at a manageable level, while the physical volume is gradually scaled over several orders of magnitude, as indicated by the zig-zag pattern in Fig. 5.14. Furthermore, each horizontal step can be performed for several lattice resolutions, so that the continuum limit can be taken. By contrast, if one attempted to scale the physical volume for fixed lattice spacing, one would, after only a few iterations, end up with systems so large that they would not fit into any computer’s memory.
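The two ingredients of a single scaling step can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code of any actual calculation: the correlator values are invented placeholders, the function names are hypothetical, and the sketch assumes that the renormalized mass is inversely proportional to Z P at fixed bare parameters, so that the ratio of the masses in boxes of size L and 2L equals Z P(2L)∕Z P(L).

```python
import math


def z_p(f1, f_p_mid, f1_free, f_p_mid_free):
    """Z_P = c * sqrt(f1) / f_P(T/2).

    The constant c is fixed from free-theory values of the same
    correlators, so that Z_P = 1 in the free theory.
    """
    c = f_p_mid_free / math.sqrt(f1_free)
    return c * math.sqrt(f1) / f_p_mid


def mass_step_ratio(z_p_L, z_p_2L):
    """m(L)/m(2L) at fixed bare parameters (assumes m proportional to 1/Z_P)."""
    return z_p_2L / z_p_L


# Hypothetical correlator values, for illustration only.
free = dict(f1_free=4.0, f_p_mid_free=3.0)
zp_L = z_p(f1=3.2, f_p_mid=4.1, **free)
zp_2L = z_p(f1=2.5, f_p_mid=4.4, **free)
ratio = mass_step_ratio(zp_L, zp_2L)
print(f"Z_P(L) = {zp_L:.4f}, Z_P(2L) = {zp_2L:.4f}, m(L)/m(2L) = {ratio:.4f}")
```

In a real study each such ratio would be computed at several lattice spacings and extrapolated to the continuum before the steps are chained together.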

Fig. 5.14. Illustration of the recursive finite-size scaling procedure to determine the running of \(\bar {m}(L)\) for L → 2L → 4L → 8L. In any horizontal step L is scaled by a factor 2 for fixed lattice spacing a. In every diagonal shift one keeps the physical box size L fixed and increases a by an appropriate tuning of the bare coupling g 0

In an entirely analogous fashion one can set up the finite-size scaling procedure for the running coupling constant in the SF scheme, \(\bar {g}_{\mathrm {SF}}(L)\).Footnote 12 Setting a value for the coupling actually corresponds to fixing the box size L, since the renormalization scale and the coupling in a particular scheme are in one-to-one correspondence. The sequence of scaling steps begins at the matching scale \(\mu _0=1/L_{\max }\) between the hadronic and SF schemes, and in order to express the scale evolution in physical units, the maximum box size \(L_{\max }\) must be determined in terms of some hadronic quantity, such as f π or r 0. In typical applications of the method, \(L_{\max }\) corresponds to an energy scale of about 250 MeV. After n steps, the box size has decreased by a factor \(2^n\) (typically n = 7–9), and at this point one is surely in the regime where the perturbative approximations to the RG functions are reliable enough to extract the Λ-parameter (in the SF scheme) and the RGI quark masses according to Eqs. (5.113) and (5.114). The transition to the \({\overline {{\mathrm {MS}}}}\)-scheme is easily performed, since the ratios \(\Lambda _{\mathrm {SF}}/\Lambda _{{\overline {{\mathrm {MS}}}}}\), as well as \(\bar {m}_{{\overline {{\mathrm {MS}}}}}/M\) are computable in perturbation theory. At that point one has completed the steps in Fig. 5.12, and all reference to the intermediate SF scheme has dropped out in the final result.
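The structure of the recursion can be illustrated with a short Python sketch. Since no simulation data are available here, each step from L to L∕2 is emulated by integrating the two-loop perturbative RG equation; in an actual calculation every step would instead be taken from non-perturbative lattice data extrapolated to the continuum limit. The starting value of the coupling, the function names and the truncation to the universal coefficients b 0 and b 1 are assumptions of this sketch.

```python
import math


def beta_coefficients(nf):
    """Universal one- and two-loop coefficients b0, b1 for SU(3) with nf flavours."""
    b0 = (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2
    b1 = (102.0 - 38.0 * nf / 3.0) / (4.0 * math.pi) ** 4
    return b0, b1


def rhs(u, b0, b1):
    """d u / d ln(L) for u = gbar^2, with beta(g) = -g^3 (b0 + b1 g^2)."""
    return 2.0 * u * u * (b0 + b1 * u)


def evolve(u, dlnL, b0, b1, nsteps=400):
    """Integrate u over a change dlnL in ln(L) with fourth-order Runge-Kutta."""
    h = dlnL / nsteps
    for _ in range(nsteps):
        k1 = rhs(u, b0, b1)
        k2 = rhs(u + 0.5 * h * k1, b0, b1)
        k3 = rhs(u + 0.5 * h * k2, b0, b1)
        k4 = rhs(u + h * k3, b0, b1)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


def halve_box_n_times(u_start, n, nf):
    """Return [u(L_max), u(L_max/2), ..., u(L_max/2^n)] (perturbative stand-in)."""
    b0, b1 = beta_coefficients(nf)
    couplings = [u_start]
    for _ in range(n):
        couplings.append(evolve(couplings[-1], -math.log(2.0), b0, b1))
    return couplings


# Hypothetical starting value gbar^2(L_max) = 3.0, eight halvings, nf = 2.
couplings = halve_box_n_times(u_start=3.0, n=8, nf=2)
for k, u in enumerate(couplings):
    print(f"L = L_max / 2^{k}:  gbar^2 = {u:.4f}")
```

The coupling decreases monotonically as the box shrinks, which is the qualitative behaviour that the non-perturbative procedure maps out without relying on the perturbative β-function.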

As examples we show the running coupling and quark mass in the SF scheme from actual simulations of lattice QCD for N f = 2 flavours of dynamical quarks in Fig. 5.15. The numerical data points in these plots originate from simulations with two flavours of O(a)-improved Wilson fermions and have been extrapolated to the continuum limit.

Fig. 5.15. Running of α s (left panel) [77] and quark mass in units of the RGI mass M (right panel) [78] in the SF scheme. The results from simulations (full circles) are compared to the integration of the perturbative RG equations

5.5.3 Regularization-Independent Momentum Subtraction Scheme

An alternative choice of intermediate renormalization scheme is based on imposing renormalization conditions in terms of Green’s functions of external quark states in momentum space, evaluated in a fixed gauge (e.g. Landau gauge) [76]. The external quark fields are off-shell, and their virtualities are identified with the momentum scale. Here we summarize the basic steps in this procedure by considering a quark bilinear non-singlet operator \(O_\Gamma =\bar {\psi }_1\Gamma \psi _2\), where Γ denotes a generic Dirac structure, e.g. Γ = γ 5 in the case of the pseudoscalar density. The corresponding renormalization factor Z Γ is fixed by requiring that a suitably chosen renormalized vertex function \(\Lambda _{\Gamma ,\mathrm {R}}(p)\) be equal to its tree-level counterpart:

$$\displaystyle \begin{aligned} \left.\Lambda_{\Gamma,\mathrm{R}}(p)\right|{}_{p^2=\mu^2} = \left.Z_{\Gamma}Z_\psi^{-1}\Lambda_\Gamma(p)\right|{}_{p^2=\mu^2} = \Lambda_{\Gamma,0}(p). {} \end{aligned} $$

This condition defines Z Γ up to quark field renormalization. Such a prescription can be formulated in any chosen regularization, which is why the method is said to define a regularization-independent momentum subtraction (RI/MOM) scheme. However, Z Γ does depend on the external states and the gauge.

In order to connect to our previous example of the renormalization of quark masses, we consider the pseudoscalar density for concreteness: Γ = γ 5 = “P”. In this case, \(\Lambda _{\mathrm {P},0}(p)=\gamma _5\), and Eq. (5.128) can be cast into the form

$$\displaystyle \begin{aligned} \left.Z_{\mathrm{P}}^{\text{MOM}}(g_0,a\mu)\,Z_{\psi}^{-1}(g_0,ap) \frac{1}{12}{\mathrm{Tr}}\,\left\{ \Lambda_{\mathrm{P}}(p)\gamma_5\right\} \right|{}_{p^2=\mu^2} = 1, \end{aligned} $$

where the trace is taken over Dirac and colour indices.
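The normalization factor 1∕12 can be checked explicitly with a small Python sketch. It assumes that the tree-level vertex is γ 5 times the unit matrix in colour space and uses one particular diagonal representation of γ 5; the helper names and the numerical values of the "measured" vertex and of Z ψ are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


# gamma_5 in a representation where it is diagonal (convention assumed here).
GAMMA5 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
GAMMA5_FULL = kron(GAMMA5, IDENTITY3)  # 12 x 12: Dirac (x) colour


def projected_vertex(vertex):
    """(1/12) Tr{Lambda_P gamma_5}, trace over Dirac and colour indices."""
    return trace(matmul(vertex, GAMMA5_FULL)) / 12.0


def z_p_mom(vertex, z_psi):
    """Solve Z_P * Z_psi^{-1} * (1/12) Tr{Lambda_P gamma_5} = 1 for Z_P."""
    return z_psi / projected_vertex(vertex)


# Tree level: the projection is normalized to one.
print(projected_vertex(GAMMA5_FULL))

# Hypothetical interacting vertex, proportional to the tree-level one.
example_vertex = [[1.25 * x for x in row] for row in GAMMA5_FULL]
print(z_p_mom(example_vertex, z_psi=1.1))
```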

In practice, the unrenormalized vertex function \(\Lambda _{\mathrm {P}}(p)\) is obtained by computing the quark propagator in a fixed gauge in momentum space and using it to amputate the external legs of the Green’s function of the operator in question, evaluated between quark states, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} \Lambda_{\mathrm{P}}(p) &\displaystyle =&\displaystyle S(p)^{-1}\,G_{\mathrm{P}}(p)\,S(p)^{-1},\quad S(p) = \int \mathrm{d}^4{x}\,\mathrm{e}^{-{\mathrm{i}}{p}x} \left\langle S(x,0) \right\rangle , \\ G_{\mathrm{P}}(p) &\displaystyle =&\displaystyle \int \mathrm{d}^4{x}\,\mathrm{d}^4{y} \mathrm{e}^{-{\mathrm{i}}{p}(x-y)} \left\langle \psi_1(x) \left(\bar{\psi}_1(0)\gamma_5\psi_2(0)\right) \bar{\psi}_2(y) \right\rangle. \end{array} \end{aligned} $$

The quark field renormalization constant \(Z_\psi ^{1/2}\) can be fixed, e.g. via the vertex function of the vector currentFootnote 13:

$$\displaystyle \begin{aligned} \left.Z_\psi = \frac{1}{48}{\mathrm{Tr}}\,\left\{ \Lambda_{\mathrm{V}^{\mathrm{C}}_\mu}(p) \gamma_\mu\right\}\right|{}_{p^2=\mu^2}. \end{aligned} $$

The numerical evaluation of the Green’s function and quark propagators in momentum space is performed on a finite lattice with periodic boundary conditions. Unlike the situation encountered in the Schrödinger functional, there is thus no additional infrared scale, so that the renormalization conditions cannot be evaluated directly at vanishing bare quark mass. A chiral extrapolation is then required to determine mass-independent renormalization factors.

Equation (5.128) is also imposed to define the subsequent matching of the RI/MOM and \({\overline {{\mathrm {MS}}}}\) schemes. In this case, the unrenormalized vertex function on the left-hand side is evaluated to a given order in perturbation theory, using the \({\overline {{\mathrm {MS}}}}\)-scheme of dimensional regularization. For a generic quark bilinear this yields the factor \(Z_{\Gamma }^{{\overline {{\mathrm {MS}}}}}(\bar {g}_{{\overline {{\mathrm {MS}}}}}(\mu ))\). In our specific example of the pseudoscalar density operator in the PCAC relation, Eq. (5.116), the transition between the RI/MOM and \({\overline {{\mathrm {MS}}}}\) schemes is provided by

$$\displaystyle \begin{aligned} (\bar{u}\gamma_5{s})_{{\overline{{\mathrm{MS}}}}}(\bar\mu) = R_{\mathrm{P}}(\bar\mu/\mu) {Z_{\mathrm{P}}^{\text{MOM}}(g_0,a\mu)}(\bar{u}\gamma_5{s})_{\text{lat}}(a). \end{aligned} $$

The ratio R P admits a perturbative expansion in terms of the coupling in the \({\overline {{\mathrm {MS}}}}\)-scheme, i.e.

$$\displaystyle \begin{aligned} R_{\mathrm{P}}(\bar\mu/\mu)\equiv \frac{Z_{\mathrm{P}}^{{\overline{{\mathrm{MS}}}}}(\bar{g}_{{\overline{{\mathrm{MS}}}}}(\bar\mu))} {Z_{\mathrm{P}}^{\text{MOM}}(g_0,a\mu)} = 1 + R_{\mathrm{P}}^{(1)}\bar{g}_{{\overline{{\mathrm{MS}}}}}^2 + {\mathrm{O}}(\bar{g}_{{\overline{{\mathrm{MS}}}}}^4), \end{aligned} $$

which is not afflicted with the bad convergence properties encountered in the direct matching of hadronic and \({\overline {{\mathrm {MS}}}}\)-schemes. Finally, for the whole method to work, one must be able to fix the virtualities μ of the external fields such that

$$\displaystyle \begin{aligned} \Lambda_{\text{QCD}} \ll \mu \ll 1/a. \end{aligned} $$

In other words, the method relies on the existence of a “window” of scales in which lattice artefacts in the numerical evaluation are controlled, μ ≪ 1∕a, and where μ is also large enough such that the perturbative matching to the \({\overline {{\mathrm {MS}}}}\) scheme can be performed reliably. In the ideal situation one expects that the dependence of \(Z_{\Gamma }^{\text{MOM}}(g_0,a\mu )\) on the virtuality μ inside the “window” is well described by the perturbative RG function.

The RI/MOM prescription is a flexible method to introduce an intermediate renormalization scheme and can easily be adapted to a range of operators and lattice actions. In particular, the extension to discretizations of the quark action based on the Ginsparg-Wilson relation is straightforward. This contrasts with the situation encountered in the Schrödinger functional, where extra care must be taken to ensure that imposing Schrödinger functional boundary conditions is compatible with the Ginsparg-Wilson relation [79,80,81]. On the other hand, the non-perturbative scale evolution, for which the Schrödinger functional is tailored, is not so easy to incorporate into the RI/MOM framework. Hence, the matching between RI/MOM and \({\overline {{\mathrm {MS}}}}\) schemes is usually performed at fairly low scales, i.e. \(\bar \mu =\mu _0\) in the notation of Fig. 5.12. Furthermore, the accessible momentum scales in the matching of hadronic and RI/MOM schemes are typically quite narrow, i.e. 0 ≈ 1. Special care must also be taken when one considers operators that couple to the pion, such as the pseudoscalar density. In this case the vertex function receives a contribution from the Goldstone pole, which for p ≡ μ = 0 diverges in the limit of vanishing quark mass. The fact that the chiral limit is ill-defined may spoil a reliable determination of the renormalization factor, in particular when the accessible “window” is narrow such that μ cannot be set to large values.

5.5.4 Mean-Field Improved Perturbation Theory

Another widely used strategy is to avoid the introduction of an intermediate renormalization scheme altogether and attempt the direct, perturbative matching between hadronic and \({\overline {{\mathrm {MS}}}}\) schemes via an effective resummation of higher orders in the expansion. In this sense one regards the bare coupling and masses as parameters that run with the cutoff scale a −1.

The bad convergence properties of perturbative expansions such as Eq. (5.118) have been attributed to the presence of large gluonic tadpole contributions in the relation between the link variable U μ(x) and the continuum gauge potential A μ(x). It was already suggested by Parisi [82] that the convergence of lattice perturbation theory could be accelerated by replacing the bare coupling \(g_0^2\) by an “improved” coupling \(\tilde {g}^2\equiv g_0^2/u_0^4\), where \(u_0^4\) denotes the average plaquette:

$$\displaystyle \begin{aligned} u_0^4={\textstyle\frac{1}{3}}{\mathrm{Re}}\, \langle{\mathrm{tr}}\,{P}\rangle,\qquad P\equiv \frac{1}{6} \sum_{\mu,\nu,\nu<\mu}P_{\mu\nu}. \end{aligned} $$

A more systematic extension of the idea of setting up such a “tadpole” or “mean-field” improved version of lattice perturbation theory was presented in Ref. [83]. The main strategy is to factor out tadpole contributions through a redefinition of the link variable:

$$\displaystyle \begin{aligned} U_\mu(x) \to \tilde{U}_\mu(x)\equiv U_\mu(x)/u_0, \end{aligned} $$

where u 0 is the average link, defined e.g. via the average plaquette. A factor of u 0 is then absorbed into the normalization of the quark fields. According to [83], the mismatch between non-perturbative estimates for u 0 and its expression in lattice perturbation theory can be used to improve the convergence properties of lattice perturbation theory via a relative rescaling of quark fields in the continuum and lattice formulations. To make this more explicit, we consider Wilson fermions (see Sect. 5.2.2). Factoring out the average link u 0 modifies the quark field normalization of Eq. (5.36) according to

$$\displaystyle \begin{aligned} \psi^{\text{cont}}(x) = \sqrt{2{\kappa}u_0}\,\psi(x),\qquad \bar{\psi}^{\text{cont}}(x) = \bar{\psi}(x)\,\sqrt{{2\kappa}u_0}. \end{aligned} $$

The general expression for the perturbative expansion of Z P in powers of the bare coupling reads

$$\displaystyle \begin{aligned} Z_{\mathrm{P}}(g_0,a\mu) = 1+g_0^2Z_{\mathrm{P}}^{(1)}(a\mu) + {\mathrm{O}}(g_0^4), {} \end{aligned} $$

where \(Z_{\mathrm {P}}^{(1)}(a\mu )\) denotes the one-loop expansion coefficient. The convergence of Eq. (5.138) can be accelerated by dividing out u 0 in the rescaling factors of the quark and antiquark fields using its perturbative expansion and replacing it by its non-perturbative estimate computed in simulations. In other words, the rescaling of the quark fields is exploited to divide out the relative mismatch between the perturbative and non-perturbative estimates for the average link in expressions like Eq. (5.138):

$$\displaystyle \begin{aligned} 1=u_0(u_0)^{-1} \simeq u_0\left\{ 1-u_0^{(1)}g_0^2 +{\mathrm{O}}(g_0^4) \right\}, \end{aligned} $$

where the one-loop coefficient is \(u_0^{(1)}=-1/12\) if u 0 is defined via the average plaquette. In this way, i.e. by combining non-perturbatively determined values for u 0 with its perturbative expansion, and after replacing the bare coupling by \(\tilde {g}^2\), one arrives at the mean-field improved version of Eq. (5.138), viz.

$$\displaystyle \begin{aligned} Z_{\mathrm{P}}^{\mathrm{mf}} = u_0\left\{ 1+\left[ Z_{\mathrm{P}}^{(1)}(a\mu) -u_0^{(1)}\right] \tilde{g}^2 \right\}. \end{aligned} $$

Instead of Parisi’s “boosted” coupling \(\tilde {g}\) other expansion parameters have been suggested, which are expected to accelerate the convergence of the perturbative series [83]. While mean-field improvement is a general procedure, which is easily adapted to a wide range of actions and operators, it is difficult to estimate the effectiveness of the resummation and, in turn, the size of higher-order corrections. Also, a principal problem is the identification of the running scale with the cutoff, since it is difficult to separate renormalization effects from lattice artefacts.
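The arithmetic of mean-field improvement is simple enough to spell out in a short Python sketch. The numerical inputs (bare coupling, plaquette value and one-loop coefficient of Z P) are hypothetical placeholders chosen for illustration, and the function names are not taken from any library; only the one-loop coefficient −1∕12 of the average link is taken from the text.

```python
def boosted_coupling(g0_sq, plaquette):
    """Parisi's coupling g~^2 = g0^2 / u0^4, with u0^4 the average plaquette."""
    return g0_sq / plaquette


def z_p_bare(g0_sq, z_p_one_loop):
    """One-loop expansion in the bare coupling: 1 + g0^2 Z_P^(1)."""
    return 1.0 + g0_sq * z_p_one_loop


def z_p_mean_field(g0_sq, plaquette, z_p_one_loop, u0_one_loop=-1.0 / 12.0):
    """Mean-field improved Z_P = u0 {1 + [Z_P^(1) - u0^(1)] g~^2}."""
    u0 = plaquette ** 0.25  # average link from the average plaquette
    g_tilde_sq = boosted_coupling(g0_sq, plaquette)
    return u0 * (1.0 + (z_p_one_loop - u0_one_loop) * g_tilde_sq)


# Hypothetical inputs: g0^2 = 1, measured plaquette 0.6, Z_P^(1) = -0.1.
g0_sq, plaquette, z1 = 1.0, 0.6, -0.1
print("boosted coupling g~^2 :", boosted_coupling(g0_sq, plaquette))
print("bare one-loop Z_P     :", z_p_bare(g0_sq, z1))
print("mean-field improved   :", z_p_mean_field(g0_sq, plaquette, z1))
```

Because the measured plaquette is smaller than one, the boosted coupling is always larger than the bare one, which is the sense in which part of the higher-order corrections is resummed.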

5.5.5 The Running Coupling from the Lattice

Having discussed the non-perturbative renormalization of QCD in detail, we shall now present results for the running coupling constant, α s, from two different approaches. This complements the discussion in Sect. 4.6, where the determination of α s from experimental data has been described in detail. Any lattice calculation of α s proceeds along the following steps:

  1. A non-perturbative definition of the coupling must be provided in terms of some quantity which can be evaluated in lattice simulations with high precision. This amounts to specifying the running coupling in a particular renormalization scheme, α X(μ 0), which can be related to the \({\overline {{\mathrm {MS}}}}\) scheme of dimensional regularization.

  2. Scale setting: the matching to a hadronic scheme is performed via the calibration of the lattice spacing, which yields the scale μ 0 at which α X is evaluated in units of some physical quantity Q:

    $$\displaystyle \begin{aligned} \mu_0\,[\text{MeV}] = (a\mu_0)\cdot a^{-1}\,[\text{MeV}] = (a\mu_0)\cdot \frac{Q\,[\text{MeV}]}{(aQ)}. \end{aligned} $$

  3. Running and matching: provided that the energy scale at which α X has been determined is large enough, one can use perturbation theory to relate α X to the coupling in the \({\overline {{\mathrm {MS}}}}\) scheme, e.g.

    $$\displaystyle \begin{aligned} \alpha_{{\overline{{\mathrm{MS}}}}}(\bar\mu) = \alpha_{\mathrm{X}}(\mu) +c_{\mathrm{X}}^{(1)}(\bar\mu/\mu)\alpha_{\mathrm{X}}(\mu)^2 +\ldots. \end{aligned} $$

  4. The Λ-parameter can be determined from the asymptotic behaviour of α X via Eq. (5.113).

The attentive reader has surely noticed that the above steps follow closely the general strategy for non-perturbative renormalization via an intermediate renormalization scheme outlined in Sect. 5.5.1 and Fig. 5.12.
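The scale-setting step is pure arithmetic and can be sketched as follows. All lattice numbers in this Python example are hypothetical: it assumes that the physical quantity is Q = 1∕r 0 with r 0 = 0.5 fm, that a simulation gave a∕r 0 = 0.2, and that the coupling was evaluated at aμ 0 = 1∕8.

```python
HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm


def inverse_lattice_spacing_mev(q_mev, a_q):
    """a^{-1} in MeV from the physical value Q [MeV] and the lattice value aQ."""
    return q_mev / a_q


def scale_in_mev(a_mu0, q_mev, a_q):
    """mu_0 [MeV] = (a mu_0) * Q [MeV] / (aQ)."""
    return a_mu0 * inverse_lattice_spacing_mev(q_mev, a_q)


# Hypothetical inputs, for illustration only.
q_mev = HBARC_MEV_FM / 0.5   # Q = 1/r0 with r0 = 0.5 fm
a_q = 0.2                    # assumed lattice result for a/r0
a_mu0 = 1.0 / 8.0            # assumed scale in lattice units

mu0_mev = scale_in_mev(a_mu0, q_mev, a_q)
print(f"a^-1 = {inverse_lattice_spacing_mev(q_mev, a_q):.1f} MeV, mu_0 = {mu0_mev:.1f} MeV")
```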

First we discuss the determination of α s from the Schrödinger functional. The definition of the running coupling is somewhat technical in this case. The starting point is the effective action of Eq. (5.123); the classical field configurations at the boundaries at x 0 = 0, T can be parameterized in terms of a real variable η:

$$\displaystyle \begin{aligned} C=C(\eta),\qquad C^\prime=C^\prime(\eta). \end{aligned} $$

For explicit expressions we refer the reader to the original article [84]. The associated effective action is defined by

$$\displaystyle \begin{aligned} \Gamma(\eta) = -\ln{\mathcal{Z}}[C^\prime(\eta),0,0;\,C(\eta),0,0] \end{aligned} $$

and admits a perturbative expansion in terms of the bare coupling g 0, viz.

$$\displaystyle \begin{aligned} \Gamma(\eta) = \frac{1}{g_0^2}\Gamma_0 +\Gamma_1 +g_0^2\Gamma_2 +\ldots. \end{aligned} $$

A renormalized coupling can then be defined in terms of the effective action via

$$\displaystyle \begin{aligned} \frac{1}{\bar{g}_{\mathrm{SF}}^2(L)} = \left\{ \frac{\partial}{\partial\eta}\Gamma(\eta)\right. \left/ \frac{\partial}{\partial\eta}\Gamma_0(\eta)\right\}_{\eta=0,\,m=0}. \end{aligned} $$

This definition is imposed at vanishing quark mass, m = 0, and provided that the aspect ratio T∕L has been fixed, the spatial dimension is the only scale in the theory, such that \(\bar {g}_{\text{SF}}(L)\) runs with the box size L. From the perturbative expansion of Γ(η) one easily infers that \(\bar {g}_{\mathrm {SF}}^2(L)=g_0^2\) at tree level. The quantity on the right-hand side is given in terms of plaquettes attached to the SF boundaries and can be computed with good statistical precision.

If \(L_{\max }\) denotes the largest box size for which \(\bar {g}_{\mathrm {SF}}\) is computed, then the scale is set by expressing \(L_{\max }\) in terms of some known dimensionful quantity, for instance, by computing the combination \(L_{\max }/r_0\) in the continuum limit and using r 0 = 0.5 fm.

The finite-size scaling procedure described earlier in Sect. 5.5.1 allows one to compute the scale evolution of \(\bar {g}_{\mathrm {SF}}\) over several orders of magnitude. In particular, each of the horizontal steps in Fig. 5.14 can be repeated for several values of the lattice spacing, so that the continuum limit is reached by taking a∕L → 0 for fixed physical box size L. The resulting scale evolution of \(\alpha _{\mathrm {SF}}\equiv \bar {g}_{\mathrm {SF}}^2/4\pi \) is shown in Fig. 5.15 and compared to the perturbative evolution. Although the non-perturbatively determined points are described very well by perturbation theory, using the three-loop expression for the RG function, one should realize that this behaviour may be specific to the SF scheme and should not be generalized to other schemes.

Starting from \(\mu _0=1/L_{\max }\) one obtains the coupling at \(\mu =2^9/L_{\max }\) after nine steps in the scaling procedure. At that point one can extract the Λ-parameter by evaluating the exact expression

$$\displaystyle \begin{aligned} \Lambda_{\mathrm{SF}} = \mu\left(b_0\bar{g}^2(\mu)\right)^{-b_1/(2b_0^2)} \mathrm{e}^{\,-1/(2b_0\bar{g}^2(\mu))} \exp\left\lbrace -\int_0^{\bar{g}(\mu)} \! dx \left[ \frac{1}{\beta(x)} + \frac{1}{b_0x^3} - \frac{b_1}{b_0^2x} \right]\right\rbrace, {} \end{aligned} $$

where \(\mu =2^9/L_{\max }\). The integral can be computed using the three-loop approximation to the RG β-function in the SF scheme. Equation (5.147) yields the combination \(\Lambda _{\mathrm {SF}}L_{\max }\), and knowledge of \(L_{\max }\) in physical units allows one to express the Λ-parameter in MeV. Conversion to the \({\overline {{\mathrm {MS}}}}\) scheme is easily achieved, since the ratio of Λ-parameters in two different schemes is computable via a one-loop calculation in which \(\bar {g}_{{\overline {{\mathrm {MS}}}}}^2\) is expanded in powers of \(\bar {g}_{\mathrm {SF}}^2\). This gives

$$\displaystyle \begin{aligned} \Lambda_{{\overline{{\mathrm{MS}}}}} = \Lambda_{\mathrm{SF}}\cdot c_{\Lambda}. \end{aligned} $$
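The evaluation of Eq. (5.147) can be illustrated with a short Python sketch. As a simplifying assumption, the β-function is truncated at two loops, \(\beta (x)=-x^3(b_0+b_1x^2)\), rather than the three-loop SF expression used in the actual analysis; with this truncation the integral can be done in closed form, which gives \((1+b_1\bar {g}^2/b_0)^{b_1/(2b_0^2)}\) for the last exponential and serves as a check of the numerical quadrature. The value of the coupling and all function names are hypothetical.

```python
import math


def beta_coefficients(nf):
    """Universal one- and two-loop coefficients b0, b1 for SU(3) with nf flavours."""
    b0 = (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2
    b1 = (102.0 - 38.0 * nf / 3.0) / (4.0 * math.pi) ** 4
    return b0, b1


def lambda_over_mu(g, beta, b0, b1, npoints=2000):
    """Lambda/mu from the exact expression, using a midpoint-rule quadrature.

    The midpoint rule never evaluates the integrand at x = 0, where the
    individual terms diverge although their sum stays finite.
    """
    h = g / npoints
    integral = 0.0
    for i in range(npoints):
        x = (i + 0.5) * h
        integral += (1.0 / beta(x) + 1.0 / (b0 * x ** 3) - b1 / (b0 * b0 * x)) * h
    g2 = g * g
    prefactor = (b0 * g2) ** (-b1 / (2.0 * b0 * b0)) * math.exp(-1.0 / (2.0 * b0 * g2))
    return prefactor * math.exp(-integral)


def lambda_over_mu_two_loop(g, b0, b1):
    """Closed form of the same expression for the two-loop beta-function."""
    g2 = g * g
    exponent = b1 / (2.0 * b0 * b0)
    return ((b0 * g2) ** (-exponent) * math.exp(-1.0 / (2.0 * b0 * g2))
            * (1.0 + b1 * g2 / b0) ** exponent)


b0, b1 = beta_coefficients(nf=2)
g = 1.2  # hypothetical value of gbar(mu) at the smallest box size
numeric = lambda_over_mu(g, lambda x: -x ** 3 * (b0 + b1 * x * x), b0, b1)
closed = lambda_over_mu_two_loop(g, b0, b1)
print(f"Lambda/mu: numerical {numeric:.6e}, closed form {closed:.6e}")
```

Multiplying Λ∕μ by \(\mu L_{\max }=2^9\) then gives the combination \(\Lambda L_{\max }\) in this truncated approximation.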

The entire procedure of determining the Λ-parameter via the Schrödinger functional has so far been carried out for the pure SU(3) gauge theory (N f = 0) and for QCD with two flavours of dynamical quarks. The values of the coefficient c Λ are 2.04872(4) for N f = 0 [84] and c Λ = 2.382035(3) for N f = 2 [85], and the resulting values for \(\Lambda _{{\overline {{\mathrm {MS}}}}}\) are [75, 77]

$$\displaystyle \begin{aligned} \begin{array}{l c l} \Lambda_{{\overline{{\mathrm{MS}}}}}^{(0)}r_0 = 0.602\pm0.048 & \quad {\Leftrightarrow}\quad & \Lambda_{{\overline{{\mathrm{MS}}}}}^{(0)} = 238\pm 19\,{\mathrm{MeV}} \\[0.3cm] \Lambda_{{\overline{{\mathrm{MS}}}}}^{(2)}r_0 = 0.62\pm0.04\pm0.04 & \quad {\Leftrightarrow}\quad & \Lambda_{{\overline{{\mathrm{MS}}}}}^{(2)} = 245\pm 16\pm16\,{\mathrm{MeV}}, \end{array} \end{aligned} $$

where r 0 = 0.5 fm is used to convert into physical units. There is room for improvement in several respects: for N f = 2 the extrapolation to the continuum limit can be made more reliable by including simulations at smaller lattice spacings, which should reduce the first of the two quoted errors. Also, the conversion into physical units should be performed in terms of a quantity such as f π, which is directly accessible in experiment. Finally, the calculation must be repeated with more dynamical quark flavours, in order to allow for a direct comparison with phenomenology, since all experimental determinations yield the Λ-parameter for N f = 4 or 5 quark flavours.
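The conversion between the dimensionless combinations and the values in MeV quoted above only requires the constant ħc ≈ 197.327 MeV fm. A minimal Python check, with a hypothetical function name, reads:

```python
HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm


def lambda_in_mev(lambda_r0, r0_fm=0.5):
    """Convert the dimensionless product Lambda * r0 into MeV."""
    return lambda_r0 * HBARC_MEV_FM / r0_fm


# Central values and errors quoted in the text for Nf = 0 and Nf = 2.
for label, value in [("Nf=0 central", 0.602), ("Nf=0 error", 0.048),
                     ("Nf=2 central", 0.62), ("Nf=2 error", 0.04)]:
    print(f"{label}: {lambda_in_mev(value):.1f} MeV")
```

The same factor applies to central values and errors, which is why the relative uncertainties are unchanged by the conversion.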

The determination of α s and \(\Lambda _{{\overline {{\mathrm {MS}}}}}\) via the Schrödinger functional is quite involved. However, it is so far the only method that allows one to map out the running of α s in a completely non-perturbative manner, including the systematic elimination of lattice artefacts. In particular, perturbation theory is used only for energy scales well above 50 GeV.

The second method that we will discuss here in some detail is the determination of α s via heavy quarkonia. Below we present an account of the calculation published in [86]. Here, the dynamical quark effects of the light (u, d, s) quarks have been accounted for in simulations with improved staggered quarks employing the fourth-root trick (see Sect. 5.2.6). In this approach, the coupling constant is defined in the so-called “V -scheme” via the heavy quark potential in momentum space:

$$\displaystyle \begin{aligned} V(q) = -C_{\mathrm{F}}\frac{4\pi}{q^2}\,\alpha_{\mathrm{V}}(q). \end{aligned} $$

Small Wilson loops such as the plaquette can be expanded in powers of α V

$$\displaystyle \begin{aligned} -\ln{\textstyle\frac{1}{3}}\left\langle{\text{Re tr}}\,P\right\rangle =c_{\mathrm{P}}^{(1)}\alpha_{\mathrm{V}}(s_{\mathrm{P}}/a) +c_{\mathrm{P}}^{(2)}\left[\alpha_{\mathrm{V}}(s_{\mathrm{P}}/a)\right]^2+\ldots, {} \end{aligned} $$

where s P is a real dimensionless variable which can be chosen to optimize the convergence properties of the expansion [83]. Equation (5.151) thus provides the link between the coupling and a quantity that is easily computed in lattice simulations. The above expression can be generalized to (small) rectangular Wilson loops W rt with area r ⋅ t:

$$\displaystyle \begin{aligned} -\ln{\textstyle\frac{1}{3}}\left\langle{W_{rt}}\right\rangle = \sum_{k=0}^{\infty}\,c_{rt}^{(k)} \left[\alpha_{\mathrm{V}}(s_{\mathrm{rt}}/a)\right]^k. \end{aligned} $$

Knowledge of the expansion coefficients in conjunction with lattice data for the quantity on the left hand side allows for the determination of α V.

The second step, namely the calibration of the momentum scale which appears in the argument of α V, is done by determining the lattice spacing from mass splittings in the bottomonium system. Here one typically considers the mass differences between the Υ′ and Υ, or alternatively, between the χ b and Υ states. Of course, any other low-energy quantity like f π or r 0 could be used. It can be argued, however, that mass splittings in heavy quarkonia are a natural choice for setting the scale in this particular approach, chiefly because of their relative insensitivity to the exact value of the heavy quark mass. Since the b-quark mass of m b ≈ 4 GeV is greater than typical values of the inverse lattice spacing, a −1, one must employ special techniques to deal with heavy quarks on the lattice. In [86] this is done via an approach based on non-relativistic QCD. A detailed discussion of the specific treatment of heavy quarks in lattice simulations is deferred to Sect. 5.7.2.

After setting the scale, the Wilson loops \(\left \langle {W_{rt}}\right \rangle \) computed on ensembles with N f = 3 flavours of rooted staggered quarks are used to determine α V via a global fit involving data at three different values of the lattice spacing. This yields

$$\displaystyle \begin{aligned} \alpha_{\mathrm{V}}^{(3)}(7.5\,{\mathrm{GeV}}) = 0.2082\pm 0.0040, \end{aligned} $$

where the superscript on the coupling reminds us that the result is valid in the three-flavour theory. The relation to the coupling in the \({\overline {{\mathrm {MS}}}}\)-scheme at the Z-pole is determined in perturbation theory, by employing the third-order expansion of \(\alpha _{{\overline {{\mathrm {MS}}}}}\) in terms of α V [87]:

$$\displaystyle \begin{aligned} \alpha_{{\overline{{\mathrm{MS}}}}}^{(3)}(\mathrm{e}^{-5/6}q) = \alpha_{\mathrm{V}}^{(3)}(q) +\frac{2}{\pi}\left[\alpha_{\mathrm{V}}^{(3)}(q)\right]^2 -(0.3111\ldots) \left[\alpha_{\mathrm{V}}^{(3)}(q)\right]^3, \end{aligned} $$

which yields \(\alpha _{{\overline {{\mathrm {MS}}}}}^{(3)}(3.26\,{\mathrm {GeV}})\). This coupling is then translated to \(\alpha _{{\overline {{\mathrm {MS}}}}}^{(5)}(M_Z)\) via the numerical integration of the four-loop RG β-function, including the effects from quark mass thresholds at m c and m b, which finally yields

$$\displaystyle \begin{aligned} \alpha_{{\overline{{\mathrm{MS}}}}}^{(5)}(M_Z) = 0.1170\pm0.0012. {} \end{aligned} $$

This result is included in the world average of \(\alpha _{{\overline {{\mathrm {MS}}}}}^{(5)}(M_Z) = 0.1176\pm 0.002\) in Ref. [61]. It is also in very good agreement with the non-lattice global estimate of \(\alpha _{{\overline {{\mathrm {MS}}}}}^{(5)}(M_Z) = 0.1182\pm 0.0027\) [88].

The running and matching in this approach is done perturbatively, involving energy scales from M Z down to m c. In this sense the method may be regarded as similar in spirit to, say, the determination of α s from the semi-leptonic branching ratio of τ decays, as in both cases the coupling is extracted from the perturbative expansion of a particular observable. While for τ-lepton decays an experimentally measured quantity is considered, it is the non-perturbatively computed data for the Wilson loops in the lattice approach which are expressed in terms of the running coupling. This contrasts with the Schrödinger functional approach, where also the running is computed non-perturbatively, albeit with considerable numerical effort.

The error on the result in Eq. (5.155) is rather small. It is left for future studies to confirm this level of precision, which must entail further investigations into the influence of lattice artefacts, as well as the validity of the fourth root trick.

5.5.6 Light Quark Masses

We shall now apply the general framework of non-perturbative renormalization to the determination of quark masses. Typically one distinguishes the “light” u, d, s quarks from the “heavy” c, b, t quarks. At first, this distinction may seem rather arbitrary. It is actually based on the relative magnitude of the quark masses compared with the chiral symmetry breaking scale Λχ, which separates “soft” from “hard” momentum scales. Masses and momenta well below Λχ break chiral symmetry only softly, so that spontaneous chiral symmetry breaking still dominates over the explicit breaking generated by non-zero values of the quark masses. Gasser and Leutwyler [89, 90] have demonstrated that QCD with u, d, s flavours can be studied via an “effective” theory of Goldstone boson fields. This approach, called Chiral Perturbation Theory (ChPT), has an SU(3)L ⊗ SU(3)R chiral symmetry, which is spontaneously broken to the SU(3) vector subgroup. The associated Goldstone bosons are then identified with the pions, kaons and η-mesons, whose masses are indeed small compared to typical hadronic scales, such as the mass of the nucleon, for instance. Thus, the magnitude of Λχ is identified with a value close to 1 GeV. In ChPT, quantities like hadron masses, decay rates or cross sections are computed through an expansion in powers of quark masses (and 4-momenta) about the chiral limit. The inclusion of the charm quark into the formalism is of little use, since the masses of the lightest charmed pseudoscalar mesons are far greater than Λχ ≈ 1 GeV.

The top quark can be safely ignored in this context, since its lifetime is an order of magnitude shorter than the time scale of typical QCD processes. As a consequence, the top quark does not undergo any hadronization effects (for instance, “toponium”, i.e. \(t\bar {t}\) bound states have never been observed), but rather decays weakly into a W-boson and a b-quark.

The mass of the b-quark is rather large (and to some extent this is also true for the charm quark), so that one may attempt to determine their values from perturbative expansions in α s of some mass-dependent quantity. By contrast, in the light quark sector non-perturbative effects such as spontaneous chiral symmetry breaking dominate. As far as the determination of the masses of the u, d, s quarks is concerned, ChPT is of limited value, since only ratios of quark masses can be predicted, but not their absolute values. The reason is that although the light quark masses appear as parameters of ChPT, their values cannot be fixed by chiral symmetry (see Sect. 5.6.1 for more details). The absolute normalization must therefore be provided by non-perturbative methods such as lattice simulations or QCD sum rules.

Below we will focus on attempts to compute the values of the light quark masses in units of some hadronic quantity. As indicated in Sect. 5.5.1, this entails the knowledge of the renormalization factor that links lattice regularization to the chosen continuum scheme. Lattice simulations have maximum impact in the light quark sector, owing to the dominance of non-perturbative effects, which is in fact signified by the large uncertainties quoted for the values of the u, d and s quark masses in the particle data book [61].

The general procedure for the determination of light quark masses in lattice QCD starts from the PCAC relation, Eq. (5.116). Assuming exact isospin symmetry, m u = m d, one can consider a generic light flavour with mass \(m_\ell \equiv \hat {m}=\textstyle {1\over 2}(m_u+m_d)\). In order to determine, say, the combination \(\hat {m}+m_s\), one must define a particular hadronic renormalization scheme, by specifying the lattice scale and the hadronic quantity that fixes the value of \(\hat {m}+m_s\). Furthermore, the renormalization factor which connects hadronic and continuum schemes must be known. Equation (5.116) can then be rewritten such that it yields the sum of RG-invariant quark masses \(\hat {M}+M_s\) in units of the quantity Q which sets the lattice spacing:

$$\displaystyle \begin{aligned} \frac{\hat{M}+M_s}{Q} = Z_{\mathrm{M}}\times \left.\left(\frac{f_{\mathrm{PS}}^{\text{bare}}Q}{G_{\text{PS}}^{\text{bare}}}\right) \right|{}_{m_{\mathrm{PS}}=m_{\mathrm{K}}} \times \left.\left(\frac{m_{\mathrm{K}}^2}{Q^2}\right)\right|{}_{\text{exp}} +{\mathrm{O}}(a^p). {} \end{aligned} $$

In this expression, the subscript “exp” denotes the experimental values for the respective quantities, while the matrix element \(G_{\mathrm {PS}}^{\text{bare}}\) is given by

$$\displaystyle \begin{aligned} \left.G_{\mathrm{PS}}^{\text{bare}}\right|{}_{m_{\mathrm{PS}}=m_{\mathrm{K}}}\equiv G_{\mathrm{K}}^{\text{bare}}=\left\langle 0\left|(\bar\ell\gamma_5 s)_{\text{lat}}\right|{K}\right\rangle. {} \end{aligned} $$

The pseudoscalar decay constant \(f_{\mathrm {PS}}^{\text{bare}}\) parameterizes the matrix element of the unrenormalized axial current, i.e.

$$\displaystyle \begin{aligned} \left.f_{\mathrm{PS}}^{\text{bare}}m_{\mathrm{PS}}\right|{}_{m_{\mathrm{PS}}=m_{\mathrm{K}}}\equiv f_{\mathrm{K}}^{\text{bare}}m_{\mathrm{K}}=\left\langle 0\left|(\bar\ell\gamma_0\gamma_5 s)_{\text{lat}} \right|{K}\right\rangle. \end{aligned} $$

The renormalization factor Z M relates the bare current quark mass to the RG-invariant mass. Thus, the task for lattice calculations is to compute the ratio \({f_{\mathrm {PS}}^{\text{bare}}}Q/G_{\mathrm {PS}}^{\text{bare}}\) for a generic pseudoscalar state and tune the bare quark mass such that m PS = m K. By combining the result with the renormalization factor Z M and the experimental value of \(m_{\mathrm {K}}^2/Q^2\), the RGI quark masses in units of Q are obtained up to lattice artefacts of order a p, where p is characteristic of the details of the discretization. Since the RGI quark masses are scale- and scheme-independent quantities, the factor Z M depends only on the bare coupling g 0. Using the Schrödinger functional as the intermediate renormalization scheme, non-perturbative estimates of Z M computed for O(a) improved Wilson fermions within a wide range of bare couplings, have been published in Refs. [75] and [78]. In this case, Z M is given by

$$\displaystyle \begin{aligned} Z_{\mathrm{M}}(g_0) = \frac{M}{\bar{m}_{\mathrm{SF}}(\mu_0)} \frac{Z_{\mathrm{A}}(g_0)}{Z_{\mathrm{P}}(g_0,a\mu_0)}, \end{aligned} $$

where the ratio \({M}/{\bar {m}_{\mathrm {SF}}(\mu _0)}\) is computed via the finite-size scaling procedure. The transition between lattice regularization and the SF-scheme is accomplished by determining Z P and the renormalization factor Z A of the axial current (see Footnote 14). Note that the dependence on the intermediate matching scale μ 0 drops out completely in this expression. Finally, the conversion to the \({\overline {{\mathrm {MS}}}}\)-scheme is performed by considering

$$\displaystyle \begin{aligned} Z_{\mathrm{m}}(g_0,a\mu) \equiv \frac{\bar{m}_{{\overline{{\mathrm{MS}}}}}(\mu)}{M}\,Z_{\mathrm{M}}(g_0), \end{aligned} $$

where the ratio \({\bar {m}_{{\overline {{\mathrm {MS}}}}}(\mu )}/{M}\) can be computed through the numerical integration of the perturbative approximation of the anomalous dimension τ and the β-function at four loops. This yields [35, 78]

$$\displaystyle \begin{aligned} \frac{\bar{m}_{{\overline{{\mathrm{MS}}}}}(2\,{\text{GeV}})}{M}= \left\{\begin{array}{ll} 0.7208, & \quad N_{\mathrm{f}}=0 \\ 0.7013, & \quad N_{\mathrm{f}}=2\,. \end{array}\right. \end{aligned} $$

Estimates for the strange quark mass itself can be obtained in two ways: first, one combines \(\hat {M}+M_s\) with the ratio \(M_s/\hat {M}=24.4\pm 1.5\) estimated in ChPT [38]. Alternatively, one might attempt to compute \(\hat {M}\) directly from lattice data, by considering Eq. (5.116) for a pion. In this case, however, one relies on chiral extrapolations, because of the difficulties involved when tuning the masses of the light quarks towards the values of the physical up- and down-quark masses.
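The first of these two routes amounts to simple arithmetic, which the following sketch makes explicit. The value used for the sum of RGI masses is a placeholder chosen purely for illustration and is not a lattice result; the mass ratio and the N f = 0 conversion factor are those quoted in the text, and the function names are invented for this example.

```python
def split_mass_sum(sum_rgi, ratio):
    """Given M_hat + M_s and ratio = M_s / M_hat, return (M_hat, M_s)."""
    m_hat = sum_rgi / (1.0 + ratio)
    return m_hat, ratio * m_hat

def to_msbar_2gev(m_rgi, factor=0.7208):
    """Convert an RGI mass to the MS-bar mass at 2 GeV.

    The default factor is the N_f = 0 value quoted in the text.
    """
    return factor * m_rgi

SUM_RGI_MEV = 140.0  # hypothetical value of M_hat + M_s, illustration only
RATIO = 24.4         # M_s / M_hat from ChPT, central value quoted in the text

m_hat, m_s = split_mass_sum(SUM_RGI_MEV, RATIO)
print(f"M_s = {m_s:.1f} MeV, m_s(MS-bar, 2 GeV) = {to_msbar_2gev(m_s):.1f} MeV")
```

Because the ratio is large, the uncertainty of the ChPT estimate affects M s only weakly, whereas it feeds almost fully into \(\hat {M}\).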

In Table 5.2 we present a selection of results for the mass of the strange quark in the quenched approximation, normalized in the \({\overline {{\mathrm {MS}}}}\)-scheme at μ = 2 GeV, as well as the ratio \(M_s/\hat {M}\). Two observations are worth mentioning: first, direct determinations of \(M_s/\hat {M}\) via chiral extrapolations agree well with the estimate from ChPT, even though the chiral limit is ill-defined in the quenched approximation. Second, the different systematics in the simulations (lattice actions, renormalization of local operators) generate a spread of seemingly incompatible results for the mass of the strange quark. However, the spread can be traced to the particular choice of hadronic renormalization scheme. To see this, one can work out the relation between quark masses computed for two different lattice scales, Q and Q′. From Eq. (5.156) one easily infers that the strange quark mass \(m_s^{(Q^\prime )}\), estimated using Q′, is related to its counterpart \(m_s^{(Q)}\) via [37]

$$\displaystyle \begin{aligned} m_s^{(Q^\prime)}\,[\text{MeV}] = \left(Q^\prime\over Q\right)_{\text{lat}} \left(Q\over Q^\prime\right)_{\text{exp}} m_s^{(Q)}\,[\text{MeV}]. \end{aligned} $$

Here, the subscripts “lat” and “exp” refer to lattice and experimental estimates of the scale ratios. The ratio (Q′/Q)lat can be determined in the continuum limit using published lattice data, and the deviation of the proportionality factor from unity is a measure of the relative quenching effects, when either Q or Q′ is chosen to set the scale. Once the results have been converted to the common scale r 0, the estimates for m s in the continuum limit show remarkable consistency, despite the very different systematic effects among the simulations included in this analysis (cf. Table 5.2). This demonstrates that lattice artefacts and renormalization effects can be controlled at the level of a few percent with the available techniques.
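The conversion between two scale-setting quantities can be written as a one-line function. The sketch below is only meant to make the bookkeeping of the two ratios explicit; the function name and the numbers are hypothetical and do not correspond to any published data.

```python
def convert_scale(m_s_q, ratio_lat, ratio_exp):
    """Return m_s^(Q') from m_s^(Q).

    ratio_lat: (Q'/Q) determined on the lattice (continuum limit)
    ratio_exp: (Q'/Q) taken from experiment
    Implements m_s^(Q') = (Q'/Q)_lat * (Q/Q')_exp * m_s^(Q).
    """
    return ratio_lat / ratio_exp * m_s_q

# Hypothetical example: the lattice ratio exceeds the experimental one by 10%.
m_s_q = 100.0  # MeV, placeholder
m_s_qprime = convert_scale(m_s_q, ratio_lat=2.2, ratio_exp=2.0)
print(f"m_s^(Q') = {m_s_qprime:.1f} MeV")
```

If lattice and experimental ratios coincide, the two determinations agree, which is why the deviation of the prefactor from unity measures the relative quenching effect.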

Table 5.2 Results for the strange quark mass in the \({\overline {{\mathrm {MS}}}}\)-scheme at μ = 2 GeV and for the ratio \(m_s/\hat {m}\), in the continuum limit of the quenched approximation

The challenge for current and future simulations is to eliminate the remaining uncertainty due to quenching. Several simulations with N f = 2 or 2 + 1 flavours of dynamical quarks (see Footnote 15) based on different fermionic discretizations have produced results for the light quark masses, which are shown in Table 5.3. Despite the enormous progress that has been made in simulating light dynamical quarks, it is important to realize that systematic effects such as lattice artefacts and/or renormalization effects are currently not as well controlled as in the quenched theory. The fact that affordable lattice spacings are still relatively large implies that extrapolations to the continuum limit are in general longer than in the quenched approximation, thereby leading to larger errors. In some cases it is not even clear whether the leading lattice artefacts in dynamical simulations have been isolated. Also, the quantity Q that sets the scale must be known at least as accurately as the quark mass itself, and hence the determination of these observables may prove just as costly. Finally, dynamical quark masses are still fairly large, especially in many simulations using Wilson fermions, and thus the long and potentially uncontrolled chiral extrapolations significantly affect estimates for the isospin-averaged light quark mass \(\hat {m}\).

Table 5.3 Selection of recent unquenched results for the light quark masses

5.6 Spontaneous Chiral Symmetry Breaking

Chiral symmetry has already been mentioned in connection with the masses of the light quarks. Here we will extend the general framework and elaborate on effective descriptions of QCD at low energies, which can be treated analytically. As we shall see, much can be learnt via the interplay of such effective theories and lattice simulations of QCD.

Massless QCD with N f flavours is invariant under independent rotations of the left- and right-handed components of the quark fields. If one defines the field Ψ as the vector of N f Dirac spinors ψ i via

$$\displaystyle \begin{aligned} \Psi=\left(\psi_1,\ldots,\psi_{N_{\mathrm{f}}}\right)^T, \end{aligned} $$

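its left- and right-handed components are given by

$$\displaystyle \begin{aligned} \Psi_{L} = P_{-}\Psi,\qquad \Psi_{R} = P_{+}\Psi,\qquad P_{\pm} = \textstyle{1\over2}\left(1\pm\gamma_5\right). \end{aligned} $$
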
The action of the massless theory is then invariant under transformations like

$$\displaystyle \begin{aligned} \Psi\to\Psi^\prime = \exp\left\{ {{\mathrm{i}}}P_{-}(\boldsymbol{\omega}_{\boldsymbol{L}}{\cdot}\boldsymbol{T})+{{\mathrm{i}}}P_{+}(\boldsymbol{\omega}_{\boldsymbol{R}}{\cdot}\boldsymbol{T}) \right\} \Psi, \end{aligned} $$

where ω L, ω R are real vectors, and T denotes the generators of SU(N f), which satisfy

$$\displaystyle \begin{aligned} \left[T^a,T^b\right] = {{\mathrm{i}}}f^{abc}T^c,\qquad {\mathrm{Tr}}\,(T^a T^b)=\textstyle{1\over2}\delta^{ab}. \end{aligned} $$

The above transformation can be rewritten in terms of vector and axial rotations, i.e.

$$\displaystyle \begin{aligned} \Psi\to\Psi^\prime = \exp\left\{ {{\mathrm{i}}}{\boldsymbol{\alpha}_{\boldsymbol{V}}{\cdot}\boldsymbol{T}} +{{\mathrm{i}}}{\boldsymbol{\alpha}_{\boldsymbol{A}}{\cdot}\boldsymbol{T}}\gamma_5 \right\} \Psi, \end{aligned} $$

where \({\boldsymbol {\alpha }_{\boldsymbol {V}}} \equiv \textstyle {1\over 2}\left ( {\boldsymbol {\omega }_{\boldsymbol {R}}}+{\boldsymbol {\omega }_{\boldsymbol {L}}}\right )\) and \({\boldsymbol {\alpha }_{\boldsymbol {A}}} \equiv \textstyle {1\over 2}\left ( {\boldsymbol {\omega }_{\boldsymbol {R}}}-{\boldsymbol {\omega }_{\boldsymbol {L}}}\right )\). Invariance under these transformation laws is what one usually means when one says that (massless) QCD is invariant under a global SU(N f)L ⊗SU(N f)R symmetry.
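The equivalence of the two forms of the transformation follows in one step, assuming the standard chiral projectors \(P_{\pm }=\textstyle {1\over 2}(1\pm \gamma _5)\) and using the fact that γ 5 commutes with the flavour generators:

$$\displaystyle \begin{aligned} P_{-}(\boldsymbol{\omega}_{\boldsymbol{L}}{\cdot}\boldsymbol{T})+P_{+}(\boldsymbol{\omega}_{\boldsymbol{R}}{\cdot}\boldsymbol{T}) = \textstyle{1\over2}\left(\boldsymbol{\omega}_{\boldsymbol{R}}+\boldsymbol{\omega}_{\boldsymbol{L}}\right){\cdot}\boldsymbol{T} +\textstyle{1\over2}\left(\boldsymbol{\omega}_{\boldsymbol{R}}-\boldsymbol{\omega}_{\boldsymbol{L}}\right){\cdot}\boldsymbol{T}\,\gamma_5 = \boldsymbol{\alpha}_{\boldsymbol{V}}{\cdot}\boldsymbol{T}+\boldsymbol{\alpha}_{\boldsymbol{A}}{\cdot}\boldsymbol{T}\,\gamma_5. \end{aligned} $$

Setting ω L = ω R thus gives a pure vector rotation, while ω L = −ω R gives a pure axial rotation.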

Actually, QCD has even more global symmetries, namely a U(1)V symmetry, which corresponds to a common rotation of all quark flavours. The conserved charge derived from the Noether current, which is associated with this unbroken symmetry, is the quark number. The conservation of the axial current associated with the remaining axial U(1) symmetry is, however, severely broken by an anomalous term, which gives rise to strong non-perturbative effects generated by instantons. Without going into further detail here, we refer to common textbooks.

Returning now to SU(N f)L ⊗SU(N f)R, we note that symmetries in sub-nuclear physics are usually deduced from the particle spectrum. That is, symmetries manifest themselves through the occurrence of mass-degenerate (or nearly degenerate) particle multiplets that can be grouped according to the irreducible representations of the symmetry group. Indeed, for N f = 3 one finds that the light pseudoscalar mesons, i.e. the pions, kaons and η-mesons form an octet. The mass splittings among the members of the octet are small when viewed on typical hadronic scales, and arise due to the unequal, non-zero masses of the light quarks. However, if the pseudoscalar octet were interpreted as a manifestation of an (approximate) SU(3)L ⊗ SU(3)R chiral symmetry, one would expect that each member of the octet is accompanied by a parity partner, i.e. a scalar meson, whose mass is of the same order of magnitude. This is not observed in experiment, where the lightest scalar mesons are found to lie 600–700 MeV above the pseudoscalar octet. One therefore concludes that the symmetry must be spontaneously broken. The term “spontaneous breaking” refers to the fact that theories like QCD possess more internal symmetries than those that can be inferred from the particle spectrum. In general, spontaneously broken symmetries are not realized as symmetry transformations involving the physical states of the theory. In particular, the ground state, i.e. the vacuum, is not invariant under the transformation. As discussed in many textbooks, it is precisely the invariance of the vacuum under the symmetry transformation that is required to ensure the degeneracy of the particle spectrum. If the vacuum is not invariant, certain operators may acquire a non-vanishing expectation value. In fact, a sufficient condition for the spontaneous breaking of the physical SU(3)L ⊗ SU(3)R chiral symmetry is fulfilled if the expectation value of the scalar density, \(\bar \Psi \Psi \), is non-zero, i.e.

$$\displaystyle \begin{aligned} \left\langle \bar\Psi\Psi \right\rangle \equiv \left\langle \bar{u}u+\bar{d}d+\bar{s}s \right\rangle \neq 0. \end{aligned} $$

Furthermore, according to Goldstone’s theorem [101], the generator of each broken symmetry is associated with a massless particle. Since the masses of the members of the pseudoscalar octet are rather small in comparison with the proton mass, they are identified as the Goldstone bosons of the spontaneously broken chiral symmetry.

Spontaneous chiral symmetry breaking is an entirely non-perturbative phenomenon. The task is then to explore the breaking mechanism and compute the value of the quark condensate \(\left \langle \bar \Psi \Psi \right \rangle \). As shall be outlined below, this can be achieved through the interplay of lattice simulations and effective low-energy descriptions of QCD.

5.6.1 Chiral Perturbation Theory

Chiral Perturbation Theory (ChPT) has already been mentioned in connection with extrapolations of lattice data to the physical values of the up- and down-quark masses, and also in the context of lattice determinations of the strange quark mass. Here we present a brief introduction to the general formalism. More thorough reviews can be found in Refs. [102, 103].

Chiral Perturbation Theory is an effective theory, based on a systematic expansion of the low-energy dynamics of QCD in powers of the 4-momentum and the quark mass about the chiral limit [89, 90], i.e.

$$\displaystyle \begin{aligned} {\mathcal{L}}_{\mathsf{eff}} = {\mathcal{L}}_{\mathsf{eff}}^{(2)} +{\mathcal{L}}_{\mathsf{eff}}^{(4)}+\ldots, \end{aligned} $$

where the superscripts label the order of the expansion in powers of p. In contrast to QCD, the basic degrees of freedom which appear in \({\mathcal {L}}_{\mathsf {eff}}\) are the Goldstone bosons, rather than the fundamental quarks and gluons. ChPT is parameterized in terms of a set of empirical couplings, usually called “low-energy constants” (LECs). At lowest order, the effective chiral Lagrangian (in Euclidean space-time) reads

$$\displaystyle \begin{aligned} {\mathcal{L}}_{\mathsf{eff}}^{(2)}=\textstyle{1\over2}{F_0^2}\Big\{ \textstyle{1\over2} {\mathrm{Tr}}\,\left(\partial_\mu U^\dag\partial_\mu U\right) -B_0{\mathrm{Tr}}\,\left(\mathcal{M}(U+U^\dag) \right)\Big\}, {} \end{aligned} $$

where \({\mathcal {M}}=\text{diag}(m_u,\,m_d,\,m_s)\) is the quark mass matrix, and U(x) collects the Goldstone boson fields, i.e.

$$\displaystyle \begin{aligned} U(x) = \exp\left(\frac{{\mathrm{i}}}{F_0}\boldsymbol{\lambda}\cdot\boldsymbol{\phi}(x)\right); \quad \boldsymbol{\lambda}\cdot\boldsymbol{\phi}\equiv\sum_{a=1}^8\lambda^a\phi_a= \left(\begin{array}{c c c} \pi^0+{\textstyle\frac{1}{\sqrt{3}}}\eta & \sqrt{2}\pi^+ & \sqrt{2}K^+ \\ \sqrt{2}\pi^- & -\pi^0+{\textstyle\frac{1}{\sqrt{3}}}\eta & \sqrt{2}K^0 \\ \sqrt{2}K^- & \sqrt{2}\bar{K}^0 & {\textstyle-\frac{2}{\sqrt{3}}}\eta \end{array} \right). \end{aligned} $$

The λ a’s denote the Gell-Mann matrices which are normalized as Tr (λ aλ b) = 2δ ab. The LECs at leading order are B 0 and F 0, where the latter corresponds to the pion decay constant in the chiral limit (see Footnote 16). The expression for \({\mathcal {L}}_{\mathsf {eff}}^{(4)}\), i.e. the interaction terms at next-to-leading order in the chiral expansion, contains 12 additional interaction terms, multiplied by the LECs L 1, …, L 10, H 1, H 2. The values of the LECs are usually determined by matching the expressions of ChPT for physical observables to experimental data. However, it turns out that the complete set of LECs cannot be obtained in this way. Rather, in order to fix the values of some LECs, one must resort to additional theoretical assumptions. One particular example is the value of B 0, which appears in the chiral expansion of the pion mass at lowest order (see also Eq. (5.80)):

$$\displaystyle \begin{aligned} m_\pi^2 = B_0(m_u+m_d). \end{aligned} $$

From this expression it is clear that B 0 can only be determined using m π as input if the physical values of the quark masses are known in the first place. By the same token, the value of \(\hat {m}=\textstyle {1\over 2}(m_u+m_d)\) can only be inferred if an estimate for B 0 is available. However, the a priori unknown parameter B 0 drops out in suitably chosen ratios of \(m_\pi ^2, m_{\mathrm {K}}^2,\ldots \). This explains why ChPT can be used to predict the ratios of the light quark masses but fails to provide an absolute mass scale. Another reason why the complete set of LECs cannot be determined from chiral symmetry considerations alone is the fact that the effective Lagrangian beyond leading order is invariant under a symmetry transformation which involves the LECs and the mass matrix \({\mathcal {M}}\), but which is absent in QCD. This is the so-called “Kaplan-Manohar ambiguity” [104]. At this point it is clear that lattice simulations of QCD can provide valuable input for the determination of LECs. For instance, since the values of the quark masses are input parameters in the simulations, lattice QCD allows one to map out the quark mass dependence of the masses of Goldstone bosons and thus determine the LEC B 0. We shall see below that B 0 is related to the quark condensate \(\langle \bar \Psi \Psi \rangle \), which can be considered as the order parameter for spontaneous chiral symmetry breaking. Furthermore, as we have already discussed in Sect. 5.5.6, absolute values of quark masses are accessible via lattice QCD.

We end our brief introduction to ChPT with the derivation of a few relations which will be useful for our discussion of chiral symmetry breaking below. In particular, we shall derive the leading-order mass formulae such as Eq. (5.80) and establish a link between the quark condensate and B 0. To this end we expand the field U in the chiral Lagrangian \({\mathcal {L}}_{\mathsf {eff}}^{(2)}\) in powers of the Goldstone boson fields. Assuming exact isospin symmetry, m u = m d, one finds at lowest order in ϕ a:

$$\displaystyle \begin{aligned} \begin{array}{rcl} {\mathcal{L}}_{\mathsf{eff}}^{(2)} &\displaystyle =&\displaystyle \frac{1}{2} \sum_{a=1}^8 \partial_\mu\phi_a \partial_\mu\phi_a+\ldots \\ &\displaystyle &\displaystyle +\frac{1}{2}(m_u+m_d)B_0 \sum_{a=1}^3 \phi_a^2 +\frac{1}{2}(\hat{m}+m_s)B_0 \sum_{a=4}^7 \phi_a^2\\ &\displaystyle &\displaystyle +\frac{1}{3}(\hat{m}+2m_s)B_0 \phi_8^2 +\ldots.\quad \end{array} \end{aligned} $$

After identifying ϕ 1, ϕ 2, ϕ 3 with the pions, ϕ 4, …, ϕ 7 with the kaons, and ϕ 8 ≡ η, one derives the leading-order relations between the quark masses and the masses of the Goldstone bosons, viz.

$$\displaystyle \begin{aligned} m_\pi^2=2B_0\hat{m},\quad m_{\mathrm{K}}^2=B_0(\hat{m}+m_s),\quad m_\eta^2={\textstyle\frac{2}{3}}B_0(\hat{m}+2m_s). {} \end{aligned} $$

Thus, the relation for a generic pseudoscalar Goldstone boson made up of quarks with masses m 1 and m 2 is precisely what was already shown in Eq. (5.80). We note that from Eq. (5.174) one easily derives the Gell-Mann–Okubo mass relation, i.e.

$$\displaystyle \begin{aligned} 3m_\eta^2+m_\pi^2-4m_{\mathrm{K}}^2 = 0, \end{aligned} $$

which is satisfied experimentally within a few percent. Furthermore, Eq. (5.174) yields the ratio \(m_s/\hat {m}\) at lowest order, viz.

$$\displaystyle \begin{aligned} \frac{m_s}{\hat{m}} = \frac{2m_{\mathrm{K}}^2-m_\pi^2}{m_\pi^2} \simeq 24, \end{aligned} $$

which is already close to the estimate at next-to-leading order of \(m_s/\hat {m}= 24.4\pm 1.5\) [38], quoted in Sect. 5.5.6.
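Both leading-order statements are easy to check numerically. The sketch below inserts rounded experimental masses of the charged pion, the charged kaon and the η; the choice of charged rather than neutral states is an assumption made here for illustration, and the function names are invented for this example.

```python
import math

# Rounded experimental masses in MeV (charged pion, charged kaon, eta);
# used for illustration only.
M_PI, M_K, M_ETA = 139.57, 493.68, 547.86

def eta_mass_gmo(m_pi, m_k):
    """Eta mass implied by 3 m_eta^2 + m_pi^2 - 4 m_K^2 = 0."""
    return math.sqrt((4.0 * m_k**2 - m_pi**2) / 3.0)

def strange_to_light_ratio(m_pi, m_k):
    """Leading-order ratio m_s / m_hat = (2 m_K^2 - m_pi^2) / m_pi^2."""
    return (2.0 * m_k**2 - m_pi**2) / m_pi**2

eta_pred = eta_mass_gmo(M_PI, M_K)
ratio = strange_to_light_ratio(M_PI, M_K)
print(f"m_eta from Gell-Mann-Okubo: {eta_pred:.0f} MeV (measured: {M_ETA:.0f} MeV)")
print(f"m_s / m_hat at leading order: {ratio:.1f}")
```

The numbers shift by several percent if neutral meson masses are inserted instead, an isospin-breaking effect that lies beyond the accuracy of the leading-order formulae.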

For the discussion of spontaneous symmetry breaking, it is useful to establish a connection between the quark condensate in QCD, \(\left \langle \bar \Psi \Psi \right \rangle \), and the LECs which parameterize the effective chiral Lagrangian. This link is provided by the so-called Gell-Mann–Oakes–Renner relation [105], which we are going to derive below. To this end we consider the QCD Lagrangian in the continuum:

$$\displaystyle \begin{aligned} {\mathcal{L}}_{\text{QCD}} = -\frac{1}{4}F^a_{\mu\nu}(x)F^a_{\mu\nu}(x) +\sum_f\bar{\psi}_f(x)\left(\gamma_\mu D_\mu + m_f\right)\psi_f(x). \end{aligned} $$

The path integral is defined as

$$\displaystyle \begin{aligned} Z_{\text{QCD}} = \int D[A_\mu]D[\bar{\psi},\psi]\,{\exp}\left\{-\int{\mathrm{d}}^4x \,{\mathcal{L}}_{\text{QCD}}\right\}, \end{aligned} $$

and the expression for the quark condensate can be formally derived by taking derivatives with respect to the light quark masses, i.e.

$$\displaystyle \begin{aligned} \sum_{f=u,d,s}\left. \frac{\partial\ln Z_{\text{QCD}}}{\partial m_f} \right|{}_{m_f=0} = -\left.\left\langle \bar{u}u+\bar{d}d+\bar{s}s \right\rangle\right|{}_{m_f=0} \equiv -\left\langle\bar\Psi\Psi\right\rangle. {} \end{aligned} $$

What is the analogue of this expression in the effective chiral theory? To answer this question one takes the lowest-order chiral Lagrangian of Eq. (5.170) and defines the corresponding path integral (see Footnote 17)

$$\displaystyle \begin{aligned} Z_{\text{ChPT}} = \int D[U] \exp\left\{-\int {\mathrm{d}}^4x\, {\mathcal{L}}_{\mathsf{eff}}^{(2)}\right\}. \end{aligned} $$

Since \({\mathcal {L}}_{\mathsf {eff}}^{(2)}\) contains the quark mass matrix one can consider similar derivatives, i.e.

$$\displaystyle \begin{aligned} \sum_{f=u,d,s}\left. \frac{\partial\ln Z_{\text{ChPT}}}{\partial m_f} \right|{}_{m_f=0} = \frac{F_0^2 B_0}{2} \sum_{f=u,d,s} \frac{\partial}{\partial m_f} \left.\left\langle {\mathrm{Tr}}\, {\mathcal{M}}(U+U^\dagger) \right\rangle\right|{}_{m_f=0} =3\cdot F_0^2 B_0 +\ldots, \end{aligned} $$

and comparison with Eq. (5.179) yields

$$\displaystyle \begin{aligned} -\frac{1}{3}\left\langle \bar{u}u+\bar{d}d+\bar{s}s \right\rangle \equiv \varSigma = F_0^2 B_0. \end{aligned} $$

In other words, the quark condensate is related to the slope parameter in the lowest-order mass formulae and the pion decay constant in the chiral limit, F 0. This result is known as the Gell-Mann–Oakes–Renner relation.
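The factor of three in the ChPT derivative above can be made explicit at tree level. Assuming that the fields are replaced by the vacuum configuration U = 1, with the fluctuations around it accounting for the omitted terms, one has

$$\displaystyle \begin{aligned} \left.{\mathrm{Tr}}\,{\mathcal{M}}(U+U^\dagger)\right|{}_{U=1} = 2\,(m_u+m_d+m_s) \quad\Longrightarrow\quad \frac{F_0^2 B_0}{2}\sum_{f=u,d,s}\frac{\partial}{\partial m_f}\,2\,(m_u+m_d+m_s) = 3\,F_0^2 B_0, \end{aligned} $$

so that each of the three light flavours contributes \(F_0^2 B_0\) to the sum.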

5.6.2 Lattice Calculations of the Quark Condensate

The Gell-Mann–Oakes–Renner relation is the starting point for many lattice determinations of the quark condensate. For a generic pseudoscalar meson consisting of a mass-degenerate quark and antiquark, i.e. m 1 = m 2 ≡ m, the LEC Σ is given by

$$\displaystyle \begin{aligned} \varSigma = \lim_{m\to0} \left(\frac{m_{\mathrm{PS}}^2 F_{\mathrm{PS}}^2}{2m}\right). \end{aligned} $$

The technical drawback of this straightforward approach is that the chiral limit in the above expression is difficult to take in practice, as we have mentioned several times already. In the quenched approximation the situation is even worse: due to the appearance of quenched chiral logarithms (cf. Eq. (5.82)) the ratio \(m_{\mathrm {PS}}^2/m\) becomes singular at vanishing quark mass, and hence the chiral limit does not exist. Since the quenched approximation is being abandoned, this issue will gradually become irrelevant.

However, a more serious obstacle remains in the case of dynamical simulations with Wilson fermions: since this particular type of regularization breaks chiral symmetry explicitly, the matching of simulation data at non-zero lattice spacing to the expressions of ChPT is—strictly speaking—not permitted. Matching is certainly justified if a fermionic discretization is employed which preserves chiral symmetry, such as overlap or domain wall fermions, or if results obtained using Wilson fermions are extrapolated to the continuum limit before a comparison to ChPT is performed.
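To illustrate how the chiral-limit expression for Σ is used in practice, the sketch below extrapolates the combination \(m_{\mathrm {PS}}^2F_{\mathrm {PS}}^2/(2m)\) linearly to vanishing quark mass. It is a deliberately simplified picture: the data are synthetic, generated from made-up values of B 0 and F 0, the fit is unweighted, and chiral logarithms as well as lattice artefacts are ignored. All names are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Unweighted least-squares fit y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def condensate_estimate(quark_masses, m_ps_sq, f_ps):
    """Extrapolate m_PS^2 F_PS^2 / (2 m) linearly to m -> 0."""
    ratios = [mp2 * f * f / (2.0 * m)
              for m, mp2, f in zip(quark_masses, m_ps_sq, f_ps)]
    intercept, _ = linear_fit(quark_masses, ratios)
    return intercept

# Synthetic input (GeV units); B0, F0 and the slope c are invented numbers.
B0, F0, c = 2.5, 0.09, 3.0
masses = [0.02, 0.04, 0.06, 0.08]
m_ps_sq = [2.0 * B0 * m for m in masses]            # leading-order mass formula
f_ps = [F0 * math.sqrt(1.0 + c * m) for m in masses]  # toy mass dependence

sigma = condensate_estimate(masses, m_ps_sq, f_ps)
print(f"Sigma = {sigma:.5f} GeV^3, Sigma^(1/3) = {sigma ** (1 / 3):.3f} GeV")
```

By construction the extrapolation returns \(F_0^2B_0\) for these toy data; with real simulation results the functional form of the extrapolation is itself a source of systematic uncertainty.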

A complementary approach for determining the condensate on the lattice is based on the Banks–Casher relation [106]. It provides a link between the LEC Σ and the spectral properties of the Dirac operator, viz.

$$\displaystyle \begin{aligned} \varSigma = \lim_{\lambda\to0}\lim_{m\to0}\lim_{V\to\infty} \frac{\pi}{V}\rho(\lambda), {} \end{aligned} $$

where V  is the space-time volume. The spectral density ρ(λ) is defined as follows: Let \({\mathcal {D}}\) denote the massless Dirac operator in the continuum, satisfying \(\left \{\gamma _5,{\mathcal {D}}\right \}=0\). Its eigenvalue equation reads

$$\displaystyle \begin{aligned} {\mathcal{D}}\psi_n = {\mathrm{i}}\lambda_n\psi_n,\qquad \lambda_n\in\mathbb{R}, \end{aligned} $$

where the eigenvalues and eigenfunctions depend on the gauge field. A suitable definition of the spectral density is then represented by

$$\displaystyle \begin{aligned} \rho(\lambda) := \sum_n \big\langle \delta(\lambda-\lambda_n) \big\rangle, \end{aligned} $$

where the expectation value is taken with respect to the QCD functional integral (see Footnote 18). Note that in Eq. (5.184) the ordering of limits must be obeyed. In particular, since the spontaneous breaking of a continuous symmetry cannot occur in finite volume, the limit \(V\to \infty \) must be taken before the chiral limit and the spectrum in the deep infrared are considered.

The Banks–Casher relation not only provides a method to determine the condensate, but also suggests a mechanism by which spontaneous chiral symmetry breaking comes about. Indeed, Eq. (5.184) implies that a non-zero value of the quark condensate is generated through a non-vanishing value of the spectral density in the deep infrared. In other words, spontaneous chiral symmetry breaking is driven by an accumulation of small eigenvalues. An immediate consequence of the Banks–Casher relation is that the level spacing Δλ between the small eigenvalues is given by

$$\displaystyle \begin{aligned} \Delta\lambda \equiv \frac{1}{\rho(\lambda)} = \frac{\pi}{V\varSigma}. {} \end{aligned} $$

Hence, as \(V\to \infty \) the level spacing becomes arbitrarily small. In the free theory, i.e. in the absence of a non-trivial gauge field, one finds that \(\rho (\lambda )\propto \lambda ^3\), which vanishes as λ → 0. The accumulation of eigenvalues near zero with a rate predicted by Eq. (5.187) must therefore arise through the interaction with the gauge field.
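To get a feeling for the size of the level spacing, the sketch below evaluates Δλ = π∕(V Σ) for a hypercubic box of volume V = L 4. The box sizes and the value Σ 1∕3 = 250 MeV are assumptions chosen for illustration and are not taken from the text; the function name is likewise invented.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar * c in MeV fm

def level_spacing_mev(box_length_fm, sigma_cuberoot_mev):
    """Level spacing pi / (V * Sigma) for a hypercubic box V = L^4."""
    length = box_length_fm / HBARC_MEV_FM  # L in MeV^-1
    volume = length**4                      # V in MeV^-4
    sigma = sigma_cuberoot_mev**3           # Sigma in MeV^3
    return math.pi / (volume * sigma)

for box in (2.0, 4.0):  # illustrative box sizes in fm
    spacing = level_spacing_mev(box, 250.0)
    print(f"L = {box:.0f} fm: level spacing ~ {spacing:.2f} MeV")
```

Doubling the linear extent reduces the spacing by a factor of 16, which shows how quickly the low-lying spectrum becomes dense as the volume grows.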

In order to test the Banks–Casher scenario, a possible strategy is to compute the spectral density and check whether it actually produces an arbitrarily dense spectrum near the origin. Analytic predictions for ρ(λ) can be derived in the framework of effective theories of QCD at low energies, namely ChPT, as well as chiral Random Matrix Theory (RMT). The latter also yields predictions for the distributions of individual eigenvalues, in addition to the spectral density.

Chiral Random Matrix Theory goes back to an idea of Wigner, who tried to utilize statistical properties for the theoretical description of systems with many degrees of freedom and complicated dynamics, such as nuclear resonances. Rather than trying to model the local interactions within such a system explicitly, one assumes that all possible interactions that are consistent with the symmetries of the theory are equally likely. The Hamiltonian is then approximated by a matrix whose elements are uncorrelated but obey a particular probability distribution. The main guiding principle for the RMT description of QCD is the requirement that all global symmetries must be respected. The massless Dirac operator can then be represented by an N × N matrix \(\hat {D}\) with an off-diagonal block structure which is characteristic for systems with chiral symmetry:

$$\displaystyle \begin{aligned} \hat{D}=\left(\begin{array}{cc} 0 & W \\ -W^\dagger & 0 \end{array}\right) \begin{array}{c} \} N_{+} \\ \} N_{-} \end{array}. {} \end{aligned} $$

As illustrated by the above expression, the matrix W is, in general, rectangular with \(N_+\) rows and \(N_-\) columns, such that \(N = N_+ + N_-\). For \(N_+ \neq N_-\) the matrix \(\hat {D}\) has \(|N_+ - N_-|\) zero modes, and the index \(\nu \equiv N_+ - N_-\) may be identified with the topological charge in QCD. With this definition, \(\hat {D}\) is anti-hermitian and has purely imaginary eigenvalues which come in complex conjugate pairs:

$$\displaystyle \begin{aligned} \hat{D}\phi_n = {\mathrm{i}}\mu_n\phi_n,\qquad \mu_n \in \mathbb{R}. \end{aligned} $$
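The pairing of the eigenvalues and the counting of zero modes can be made explicit with a short calculation. The diagonal matrix Γ and the vector u below are auxiliary notation introduced here for illustration only. Since Γ anticommutes with \(\hat {D}\), every eigenvector with eigenvalue iμ has a partner with eigenvalue −iμ:

$$\displaystyle \begin{aligned} \Gamma \equiv \left(\begin{array}{cc} 1_{N_+} & 0 \\ 0 & -1_{N_-} \end{array}\right), \qquad \Gamma\hat{D}\Gamma = -\hat{D} \quad\Longrightarrow\quad \hat{D}\,(\Gamma\phi_n) = -\Gamma\hat{D}\phi_n = -{\mathrm{i}}\mu_n\,(\Gamma\phi_n). \end{aligned} $$

For \(N_+ > N_-\), the matrix \(W^\dagger\) has \(N_-\) rows and \(N_+\) columns, so that \(W^\dagger u = 0\) has at least \(N_+ - N_-\) linearly independent solutions. Each of them yields a zero mode,

$$\displaystyle \begin{aligned} \hat{D}\left(\begin{array}{c} u \\ 0 \end{array}\right) = \left(\begin{array}{c} 0 \\ -W^\dagger u \end{array}\right) = 0 \qquad \mbox{if}\quad W^\dagger u = 0, \end{aligned} $$

and the case \(N_+ < N_-\) follows in the same way with the roles of W and \(W^\dagger\) interchanged.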

One can define the system’s partition function in a sector of fixed topological charge ν via

$$\displaystyle \begin{aligned} {\mathcal{Z}}_\nu = \int D[W] \det\big(\hat{D}+m\big)^{N_{\mathrm{f}}} \mathrm{e}^{-{\textstyle{1\over2}}N{\mathrm{Tr}}\,(W^\dagger W)}, \end{aligned} $$

where N f is—as usual—the number of dynamical quark flavours. It makes sense to identify the matrix size N with the physical volume V  of the theory (up to some proportionality constant).

In order to study the spectral properties of \(\hat {D}\) in the deep infrared, it is useful to rescale the eigenvalues by the system size

$$\displaystyle \begin{aligned} z \equiv \mu_n N, \qquad N\propto V \end{aligned} $$

since, according to Eq. (5.187), the level spacing of the scaled eigenvalues z is of order one. The so-called microscopic spectral density in the sector of topological charge ν is then defined as

$$\displaystyle \begin{aligned} \rho_s^{(\nu)}(z) := \lim_{N\to\infty} \sum_n\left\langle \delta(z-\mu_n N) \right\rangle_{\nu}, \end{aligned} $$

where the expectation value 〈⋯ 〉ν is taken with respect to the partition function \({\mathcal {Z}}_\nu \). An explicit expression for \(\rho _s^{(\nu )}(z)\) in terms of Bessel functions has been worked out by Verbaarschot and Zahed [107]

$$\displaystyle \begin{aligned} \rho_s^{(\nu)}(z) = \frac{z}{2} \left\{\left[ J_{\nu+N_{\mathrm{f}}}(z)\right]^2 -J_{N_{\mathrm{f}}+\nu+1}(z)\,J_{N_{\mathrm{f}}+\nu-1}(z) \right\}. \end{aligned} $$
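The Bessel-function expression above is straightforward to evaluate numerically. The sketch below uses only the Python standard library and computes integer-order Bessel functions from Bessel's integral representation with the trapezoidal rule; the function names and the number of integration steps are choices made here for illustration, not part of the original analysis.

```python
import math


def bessel_j(order, z, steps=4000):
    """Integer-order Bessel function J_order(z) via Bessel's integral,
    J_n(z) = (1/pi) * integral_0^pi cos(n*t - z*sin(t)) dt (trapezoidal rule)."""
    def integrand(t):
        return math.cos(order * t - z * math.sin(t))

    h = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h / math.pi


def microscopic_density(z, nu=0, n_f=0):
    """rho_s^(nu)(z) = z/2 * (J_a^2 - J_{a+1} * J_{a-1}) with a = nu + N_f."""
    a = nu + n_f
    return 0.5 * z * (bessel_j(a, z) ** 2 - bessel_j(a + 1, z) * bessel_j(a - 1, z))


# Tabulate the quenched density (N_f = 0) in the sector nu = 0
for z in (0.5, 2.0, 5.0, 20.0):
    print(f"z = {z:5.1f}   rho_s = {microscopic_density(z):.4f}")
```

For small z the density in the sector ν = 0 with N f = 0 rises like z∕2, while for large z it approaches the constant 1∕π, up to oscillations that decay with z. A non-zero plateau at large z is what one expects if the scaled eigenvalues have a level spacing of order one.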

The microscopic spectral density is the sum of the distribution functions \(p_k^{(\nu )}\) of the individual scaled eigenvalues, i.e.

$$\displaystyle \begin{aligned} \rho_s^{(\nu)}(z) = \sum_{k=1}^\infty p_k^{(\nu)}(z),\qquad \int_0^\infty {\mathrm{d}}z\, p_k^{(\nu)}(z) = 1. \end{aligned} $$

Chiral RMT yields predictions for these distributions. For instance, for the lowest eigenvalue in the sector with ν = 0 one obtains for N f = 0

$$\displaystyle \begin{aligned} p_1^{(0)}(z) = \frac{1}{2}z\,\mathrm{e}^{-z^2/4}. \end{aligned} $$
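As a quick consistency check, added here as a worked example, the distribution of the lowest eigenvalue is correctly normalized, and its first moment, which is needed when scaled eigenvalues are compared between QCD and RMT, can be obtained in closed form:

$$\displaystyle \begin{aligned} \int_0^\infty {\mathrm{d}}z\,\frac{z}{2}\,\mathrm{e}^{-z^2/4} = \Big[-\mathrm{e}^{-z^2/4}\Big]_0^\infty = 1, \qquad \int_0^\infty {\mathrm{d}}z\,z\,p_1^{(0)}(z) = \frac{1}{2}\int_0^\infty {\mathrm{d}}z\,z^2\,\mathrm{e}^{-z^2/4} = \sqrt{\pi} \approx 1.77. \end{aligned} $$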

For further illustration the microscopic spectral density and the distribution functions for a few of the lowest eigenvalues are plotted in Fig. 5.16. The result for \(\rho _s^{(\nu )}(z)\) indicates that an accumulation of small eigenvalues does indeed take place. Since one considers the simultaneous limits μ → 0 and N → ∞ for fixed z, a non-zero value of \(\rho _s^{(\nu )}(z)\) for finite z signals that the spectrum is packed more and more densely near the origin.

Fig. 5.16 RMT predictions for the microscopic spectral density and distributions for individual eigenvalues in the sector with topological charge ν = 0

Can the predictions of RMT be verified from first principles in simulations of lattice QCD? The answer is ‘yes’, provided one considers a particular kinematical situation, commonly referred to as the “𝜖-regime” of QCD. It is based on the formulation of QCD in a large but finite volume of spatial size L and for arbitrarily small quark mass. The Compton wavelength of the pion then exceeds the spatial size, and thus the 𝜖-regime is characterized by

$$\displaystyle \begin{aligned} m_{\pi}L \ll 1,\qquad F_{\pi}L \gg 1. {} \end{aligned} $$

In this particular situation the path integral of the theory is dominated by zero momentum modes. In a symmetric finite box with volume V = L⁴, the minimum non-zero momentum is given by \(p_{\text{min}} \propto 1/L\). Let us recall the expression for the lowest-order effective chiral Lagrangian, i.e.

$$\displaystyle \begin{aligned} {\mathcal{L}}_{\mathsf{eff}}^{(2)}=\textstyle{1\over2}{F_0^2}\Big\{ \textstyle{1\over2} {\mathrm{Tr}}\,\left(\partial_\mu U^\dag\partial_\mu U\right) -m\varSigma\,{\mathrm{Tr}}\,\left(\mathrm{e}^{{\mathrm{i}}\theta/N_{\mathrm{f}}}U +\mbox{h.c.}\right)\Big\}, \end{aligned} $$

where we have included the vacuum angle θ and assumed that the quark masses are degenerate. If the quark mass m is tuned so that

$$\displaystyle \begin{aligned} m\varSigma \ll F_0^2 p_{\text{min}}^2 \sim F_0^2/L^2, \end{aligned} $$

the statistical weight of fields with \(\partial_\mu U \neq 0\) will be strongly suppressed in the path integral. In other words, the kinetic term will dominate over the mass term, except for fields U with \(\partial_\mu U = 0\). Since \(2m\varSigma /F_0^2 = m_{\mathrm {PS}}^2\), the conditions in Eq. (5.196), which define the kinematical situation of the 𝜖-regime, are equivalent to

$$\displaystyle \begin{aligned} m{\varSigma}V \ll 1. \end{aligned} $$

The zero-momentum part can be represented by a constant SU(3) matrix U 0 such that

$$\displaystyle \begin{aligned} U(x) = U_0\,\mathrm{e}^{2{\mathrm{i}}\xi(x)/F_0},\qquad U_0 \in \mathrm{SU}(3), \end{aligned} $$

where the field ξ incorporates the fluctuations about the zero momentum mode. According to Leutwyler and Smilga [108], the path integral of the theory in topological sector ν can be written in the form

$$\displaystyle \begin{aligned} Z_\nu^{(0)} = \int D[U_0] \left(\det{U_0}\right)^{\nu} \exp\left(m{\varSigma}V\,{\mathrm{Re}}{\mathrm{Tr}}\, U_0\right). \end{aligned} $$

After this somewhat lengthy preparatory discussion, the connection between QCD in the 𝜖-regime and chiral RMT can finally be established. An important result derived by Shuryak and Verbaarschot [109] states that the path integral \(Z_\nu ^{(0)}\) can be mapped exactly onto the partition function \({\mathcal {Z}}_\nu \) of RMT. One therefore expects that the low-lying eigenvalues of QCD in the 𝜖-regime are distributed in the same way as those in RMT. By computing the former in a lattice simulation and performing a comparison to the analytically known distributions in RMT, one may verify the Banks–Casher scenario of spontaneous chiral symmetry breaking.

The Neuberger-Dirac operator D N of Eq. (5.47) is ideally suited for this task. Since it satisfies the Ginsparg-Wilson relation, chiral symmetry is preserved at the level of the discretized theory. Furthermore, D N can be shown to satisfy an exact index theorem, so that it sustains |ν| exact zero modes on gauge configurations with topological charge ν. This allows for an unambiguous identification of topological sectors to which the path integral \(Z_\nu ^{(0)}\) is restricted [110]. Therefore, the investigation of spontaneous chiral symmetry breaking is a prime example where it is absolutely vital that the lattice-regularized theory obeys the same symmetries that are present in the continuum.

Before we proceed we must elucidate the relation between the spectra of the random matrix \(\hat {D}\) and the Neuberger-Dirac operator. While the eigenvalues of \(\hat {D}\) are purely imaginary, the combination \(1-\overline {a}D_{\mathrm {N}}\) is unitary, and hence the eigenvalues of D N lie on a circle with radius \(1/\overline {a}\) in the complex plane, centered around the point \(1/\overline {a}\) on the real axis. Thus, if γ denotes an eigenvalue of D N, it can be parameterized as

$$\displaystyle \begin{aligned} \gamma=\frac{1}{\overline{a}} \big(1-\mathrm{e}^{{\mathrm{i}}\phi}\big),\qquad \overline{a}=\frac{a}{1+s}. \end{aligned} $$

Since the radius of the circle diverges in the continuum limit, the low-lying part of the spectrum satisfies \(|\gamma |\ll 1/\overline {a}\), and hence Reγ ≃ 0. One can then identify an eigenvalue μ of \(\hat {D}\) with Imγ, i.e.

$$\displaystyle \begin{aligned} \mu \quad \leftrightarrow\quad {\mathrm{Im}}\gamma\simeq |\gamma|=\frac{1}{\overline{a}}\left[2(1-\cos\phi)\right]^{1/2}. \end{aligned} $$
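The statement that the real part is negligible for the low-lying modes can be quantified directly from the parameterization of γ on the circle. The following short calculation is added for illustration:

$$\displaystyle \begin{aligned} |\gamma|^2 = \frac{1}{\overline{a}^2}\big|1-\mathrm{e}^{{\mathrm{i}}\phi}\big|^2 = \frac{2}{\overline{a}^2}(1-\cos\phi), \qquad {\mathrm{Re}}\,\gamma = \frac{1}{\overline{a}}(1-\cos\phi) = \frac{\overline{a}}{2}\,|\gamma|^2, \qquad |{\mathrm{Im}}\,\gamma| = |\gamma|\Big[1-\frac{\overline{a}^2|\gamma|^2}{4}\Big]^{1/2}. \end{aligned} $$

Hence the real part is suppressed relative to |γ| by a factor \(\overline {a}|\gamma |/2\), and the modulus of the imaginary part differs from |γ| only by a relative correction of order \(\overline {a}^2|\gamma |^2\).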

A simple but effective check of the RMT description of the low-lying spectrum can be performed by comparing ratios of scaled eigenvalues. The combination |γ k|ΣV  of the kth eigenvalue in QCD corresponds to μ kN in RMT. If the low-lying spectra in the two theories indeed coincide one expects the following equalities in a given topological sector ν

$$\displaystyle \begin{aligned} \frac{\langle|\gamma_k|\rangle_\nu}{\langle|\gamma_j|\rangle_\nu} \stackrel{!}{=} \frac{\langle\mu_k\rangle_\nu}{\langle\mu_j\rangle_\nu} \equiv \left.\int_0^\infty {\mathrm{d}}z\,z\,p_{k}^{(\nu)}(z)\right/ \int_0^\infty {\mathrm{d}}z\,z\,p_{j}^{(\nu)}(z). \end{aligned} $$

While the ratio 〈|γ k|〉ν∕〈|γ j|〉ν is determined in the simulation, the two integrals on the right-hand side can be evaluated analytically for the first few eigenvalues (see footnote 19).

In Refs. [111, 112] ratios for some of the lowest eigenvalues have been computed in the quenched approximation. The results from [111] are shown in Fig. 5.17 for a box size L = 1.49 fm. The agreement between lattice results and RMT is excellent. By contrast, a smaller box size of about 1 fm yields significant discrepancies between QCD and RMT, which can be as large as 10 standard deviations. This is a reflection of the fact that the large volume limit must be taken before the RMT behaviour sets in. Similar findings have been reported for QCD with N f = 2 flavours of dynamical overlap quarks [113].

Fig. 5.17 Comparison of simulation results for ratios of eigenvalues with Random Matrix Theory (horizontal bars) in the sectors with topological charge ν = 0, 1, 2 [111]

The confirmation of the RMT prediction for the distribution of the low-lying eigenvalues supports the Banks–Casher scenario of spontaneous chiral symmetry breaking. In a subsequent step one may therefore extract the LEC Σ via the relation

$$\displaystyle \begin{aligned} \langle|\gamma_k|\rangle_{\nu}{\varSigma}V = \langle{\mu_k}N\rangle_\nu \equiv \int_0^{\infty} {\mathrm{d}}z\,z\,p_{k}^{(\nu)}(z). \end{aligned} $$
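The relation above can be turned into a small numerical recipe. The sketch below is restricted to the lowest eigenvalue in the sector ν = 0 with N f = 0, for which the RMT integral equals √π. The eigenvalue average of 35 MeV is a hypothetical input chosen purely for illustration and is not a result taken from the cited simulations; the function names are likewise hypothetical, and the output is the bare condensate, before any renormalization.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm


def box_volume(box_length_fm):
    """Volume V = L^4 of a symmetric box, in units of MeV^-4."""
    return (box_length_fm / HBARC_MEV_FM) ** 4


def rmt_mean_lowest_eigenvalue():
    """<z> for k = 1, nu = 0, N_f = 0: integral of z * (z/2) * exp(-z^2/4) = sqrt(pi)."""
    return math.sqrt(math.pi)


def condensate_cuberoot_mev(mean_gamma1_mev, box_length_fm):
    """Bare Sigma^(1/3) in MeV from <|gamma_1|> * Sigma * V = <z>."""
    sigma = rmt_mean_lowest_eigenvalue() / (mean_gamma1_mev * box_volume(box_length_fm))
    return sigma ** (1.0 / 3.0)


# Hypothetical input: <|gamma_1|> = 35 MeV in a box of size L = 1.49 fm
print(f"Sigma^(1/3) = {condensate_cuberoot_mev(35.0, 1.49):.0f} MeV (bare)")
```

In an actual analysis one would repeat this for several eigenvalues k and topological sectors ν, and check that the extracted values of Σ agree with each other.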

If Σ is identified with the expectation value of the scalar density, as suggested by the effective low-energy description of QCD, it must be related to a particular continuum scheme, like the \({\overline {{\mathrm {MS}}}}\)-scheme of dimensional regularization. If the regularization prescription obeys chiral symmetry, the corresponding renormalization factor, Z S, satisfies

$$\displaystyle \begin{aligned} Z_{\mathrm{S}}=Z_{\mathrm{P}}=1/Z_{\mathrm{m}}, \end{aligned} $$

where Z m relates the bare quark mass to the chosen continuum scheme (for instance, \({\overline {{\mathrm {MS}}}}\)). Provided that Z S, or equivalently, Z m has been computed for a range of bare couplings, the lattice estimates for Σ can be used to determine the renormalized condensate in units of some scale, e.g.

$$\displaystyle \begin{aligned} r_0^3 \varSigma_{{\overline{{\mathrm{MS}}}}}(\mu) = Z_{\mathrm{S}}(g_0,a\mu)r_0^3\varSigma +{\mathrm{O}}(a^2). \end{aligned} $$

For the Neuberger-Dirac operator, Z S has been computed non-perturbatively in the quenched approximation [114], employing the technique outlined in Ref. [115]. The resulting values for Z S could then be combined with the results for Σ extracted from the matching to RMT from [111]. A subsequent extrapolation to vanishing lattice spacing yields the results for the renormalized condensate in the continuum limit:

$$\displaystyle \begin{aligned} \varSigma_{{\overline{{\mathrm{MS}}}}}(2\,\text{GeV}) = (285\pm9\,\text{MeV})^3,\qquad \mbox{(scale set by {$f_{\mathrm{K}}$})}. \end{aligned} $$

The quoted error represents the total uncertainty arising from statistics, the uncertainty in the renormalization factor, and the continuum extrapolation. If the nucleon mass is used to set the scale the central value drops to 261 MeV, as a consequence of the scale ambiguity encountered in the quenched approximation. We stress once more that the chiral condensate is ill-defined in the quenched theory, and thus great care must be taken when the results are interpreted in the context of the full theory. Nevertheless, it is encouraging that for N f = 2 flavours of dynamical quarks, a similar calculation [113] finds \(\varSigma _{{\overline {{\mathrm {MS}}}}}(2\,\text{GeV}) = (251\pm 7\pm 11\,\text{MeV})^3\) at a ≃ 0.11 fm, in good agreement with the quenched result, given the inherent ambiguities and inconsistencies of the latter.

Lattice results for the condensate have been reported by many other authors (e.g. [116,117,118,119,120,121,122,123,124,125]), employing a variety of approaches. Although the various calculations are subject to different systematics, the overall picture is rather consistent, with values for the condensate centering around (250 MeV)³. As for many other quantities, the influence of lattice artefacts and renormalization effects must be studied in more detail, especially in the case of fully dynamical calculations. It is also important to mention that analytic non-perturbative approaches to the strong interaction, such as QCD sum rules, give results that are broadly consistent with lattice simulations within the quoted uncertainties (see e.g. [126,127,128] and references therein). This completes the consistent picture of chiral symmetry and its spontaneous breaking in QCD.

5.7 Hadronic Weak Matrix Elements

The experimental programme at the B-factories BaBar and Belle, as well as many other experiments at high-energy colliders, such as the Tevatron and LEP, has greatly enhanced the accuracy of many observables related to flavour physics and the Cabibbo–Kobayashi–Maskawa (CKM) matrix. The main motivation for studying flavour physics is to gain a proper understanding of CP violation and, in turn, the matter-antimatter asymmetry which is apparently manifest in the universe. CP violation is incorporated into the Standard Model via a complex phase in the CKM matrix, and therefore a precise knowledge of its elements is required to decide whether or not additional sources of CP violation must be considered.

In order to make these statements more precise we recall some basic definitions. As is well known, the CKM matrix V CKM relates flavour to mass eigenstates. For flavour-changing charged current transitions between up- and down-type quarks this implies that, in addition to the dominant transitions like u ↔ d, c ↔ s and t ↔ b, there are further transitions of lesser strength. The CKM matrix is therefore expected to possess a hierarchical structure, with the diagonal elements V ud, V cs and V tb being of order one. An approximate parameterization that takes this into account is due to Wolfenstein [129]. By expanding V CKM in powers of the Cabibbo-angle |V us|≡ λ ≃ 0.22 one obtains

$$\displaystyle \begin{aligned} V_{\text{CKM}} \equiv \left(\begin{array}{ccc} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \\ \end{array}\right) \simeq \left(\begin{array}{ccc} 1-{\lambda^2}/{2} & \lambda & A\lambda^3(\rho-i\eta) \\[0.2cm] -\lambda-iA^2\lambda^5\eta & 1-{\lambda^2}/{2} & A\lambda^2 \\[0.2cm] A\lambda^3(1-\bar{\rho}-i\bar{\eta}) & -A\lambda^2-iA\lambda^4\eta & 1 \\ \end{array}\right), \end{aligned} $$

with the remaining parameters \(A,\bar \rho \) and \(\bar \eta \) of order one (see footnote 20). In the Standard Model, V CKM is unitary, and, provided that one can determine its elements with sufficient precision, any deviation from unitarity would be a signature of “new physics”. Unitarity gives rise to relations such as

$$\displaystyle \begin{aligned} V_{ud}V_{ub}^*+V_{cd}V_{cb}^*+V_{td}V_{tb}^*=0, {} \end{aligned} $$

which can be represented by a triangle. The strategy that has been adopted in order to search for hints of new physics is to use experimental and theoretical input to over-constrain unitarity relations such as Eq. (5.210). The current status is depicted in Fig. 5.18, where the unitarity triangle is plotted in the \((\bar \rho ,\bar \eta )\)-plane [130].
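The extent to which the approximate Wolfenstein form satisfies the unitarity relation can be checked numerically. In the sketch below the value λ = 0.22 is taken from the text, whereas A, ρ and η are illustrative inputs of order one and not fitted values. The code also assumes the commonly used definitions ρ̄ = ρ(1 − λ²∕2) and η̄ = η(1 − λ²∕2), which are not spelled out in this section.

```python
def wolfenstein_ckm(lam, A, rho, eta):
    """CKM elements in the approximate Wolfenstein form quoted above."""
    rho_bar = rho * (1 - lam ** 2 / 2)  # assumed definition of rho-bar
    eta_bar = eta * (1 - lam ** 2 / 2)  # assumed definition of eta-bar
    return {
        "ud": complex(1 - lam ** 2 / 2, 0.0),
        "us": complex(lam, 0.0),
        "ub": A * lam ** 3 * complex(rho, -eta),
        "cd": complex(-lam, -(A ** 2) * lam ** 5 * eta),
        "cs": complex(1 - lam ** 2 / 2, 0.0),
        "cb": complex(A * lam ** 2, 0.0),
        "td": A * lam ** 3 * complex(1 - rho_bar, -eta_bar),
        "ts": complex(-A * lam ** 2, -A * lam ** 4 * eta),
        "tb": complex(1.0, 0.0),
    }


# lambda from the text; A, rho, eta are illustrative assumptions
lam, A, rho, eta = 0.22, 0.8, 0.15, 0.35
V = wolfenstein_ckm(lam, A, rho, eta)

# The three sides of the unitarity triangle
side_u = V["ud"] * V["ub"].conjugate()
side_c = V["cd"] * V["cb"].conjugate()
side_t = V["td"] * V["tb"].conjugate()

residual = side_u + side_c + side_t
apex = -side_u / side_c  # approximately rho_bar + i*eta_bar

print(f"|sum of sides| = {abs(residual):.2e}")
print(f"|V_cd V_cb^*|  = {abs(side_c):.2e}")
print(f"apex = ({apex.real:.3f}, {apex.imag:.3f})")
```

With these inputs the three sides cancel up to a remainder of order λ⁷, far below the length of each side, which is of order λ³. Dividing by V cd V cb* normalizes the base of the triangle to unit length, so that the apex lies at approximately (ρ̄, η̄).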

Fig. 5.18 Constraints on the apex of the unitarity triangle [130]

The experimentally measured quantities, i.e. the mass differences ΔM s,  ΔM d and 𝜖 K, the latter of which parameterizes indirect CP violation in the kaon system, serve to constrain the apex of the unitarity triangle. They are proportional to the relevant CKM matrix elements, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \Delta M_d = {{G_{\mathrm{F}}^2 M_{\mathrm{W}}^2}\over{6\pi^2}} \eta_{\mathrm{B}}S\big(\frac{m_t}{M_{\mathrm{W}}}\big)\, {f_{\mathrm{B}}^2\hat{B}_{\mathrm{B}}}{\left|V_{td}V_{tb}^*\right|{}^2}, \qquad \frac{\Delta M_s}{\Delta M_d} = {{f_{\mathrm{B}_{\mathrm{s}}}^2{\hat{B}_{\mathrm{B}_{\mathrm{s}}}}}\over{f_{\mathrm{B}}^2{\hat{B}_{\mathrm{B}}}}} \frac{m_{\mathrm{B}_{\mathrm{s}}}}{m_{\mathrm{B}}} \frac{|V_{ts}|{}^2}{|V_{td}|{}^2}, \\ &\displaystyle &\displaystyle \epsilon_{\mathrm{K}} \propto \hat{B}_{\mathrm{K}} \,\mbox{Im}({V_{td}V_{ts}^*}), {} \end{array} \end{aligned} $$

where G F is the Fermi constant, and M W, m t denote the masses of the W-boson and top quark, respectively. The proportionality factors in the above expressions involve the leptonic B-meson decay constants f B and \(f_{\mathrm {B}_{\mathrm {s}}}\), as well as the B-parameters \(\hat {B}_{\mathrm {B}}\), \(\hat {B}_{\mathrm {B}_{\mathrm {s}}}\) and \(\hat {B}_{\mathrm {K}}\), which in turn parameterize the transition amplitudes for \(B^0-\bar {B}^0\), \(B_s^0-\bar {B}_s^0\), and \(K^0-\bar {K}^0\) mixing. While the decay constants are difficult to measure with sufficient accuracy, due to the fact that the leptonic decay rates are suppressed, the B-parameters are not at all accessible in experiment. One must therefore resort to theoretical estimates of these quantities. Since non-perturbative effects must inevitably be included, lattice simulations of QCD are ideally suited for this task.

Lattice calculations of weak hadronic matrix elements are a major activity within the lattice community, and a thorough coverage of all aspects would easily fill an entire chapter. We shall therefore concentrate on some of the most important quantities, and point out the main conceptual issues. It is strongly recommended that the reader consult the regular reviews of the topic at the annual lattice conferences, e.g. [131,132,133,134,135].

5.7.1 Weak Matrix Elements in the Kaon Sector

In the kaon sector, \(K^0-\bar {K}^0\) mixing is one of the most important processes. The B-parameter B K parameterizes the non-perturbative contribution to indirect CP violation. It is defined by the ratio of the relevant operator matrix element to its value in the so-called “vacuum saturation approximation”:

$$\displaystyle \begin{aligned} B_{\mathrm{K}}(\mu)= {{\left\langle\bar{K}^0\left| Q^{\Delta S=2}(\mu)\right|K^0\right\rangle} \over {\frac{8}{3}f_{\mathrm{K}}^2m_{\mathrm{K}}^2}}. \end{aligned} $$

Here, μ denotes the renormalization scale at which the ΔS = 2 four-quark operator Q ΔS=2, defined by

$$\displaystyle \begin{aligned} Q^{\Delta S=2} = \left[\bar{s}\gamma_\mu(1-\gamma_5)d\right] \left[\bar{s}\gamma_\mu(1-\gamma_5)d\right] \equiv O_{\mathrm{VV+AA}}-O_{\mathrm{VA+AV}}, \end{aligned} $$

is considered. The relation between 𝜖 K and the CKM matrix elements is provided by the RG-invariant B-parameter \(\hat {B}_{\mathrm {K}}\). In NLO perturbation theory \(\hat {B}_{\mathrm {K}}\) is related to B K(μ) via

$$\displaystyle \begin{aligned} \hat{B}_{\mathrm{K}} = \left(\frac{\bar{g}(\mu)^2}{4\pi}\right)^{\gamma_0/2b_0} \left\{ 1+\bar{g}(\mu)^2\left[ \frac{b_0\gamma_1-b_1\gamma_0}{2b_0^2} \right]\right\}\, B_{\mathrm{K}}(\mu), \end{aligned} $$

where γ 0, γ 1 denote the coefficients in the perturbative expansion of the anomalous dimension of Q ΔS=2. Since QCD is parity-conserving, the physically relevant operator in the above expression is the parity-even combination O VV+AA. The typical left-handed chiral structure of this operator, which is characteristic for weak transitions, poses a problem for lattice calculations if Wilson fermions are employed. In this case the discretization breaks chiral symmetry explicitly, and thus O VV+AA mixes under renormalization with operators involving the opposite chirality. Therefore, the general renormalization pattern is

$$\displaystyle \begin{aligned} O^{\mathrm{R}}_{\mathrm{VV+AA}}(\mu)=Z(g_0,a\mu)\left\{ O^{\mathrm{bare}}_{\mathrm{VV+AA}}+ \sum_{i=1}^4 \Delta_i(g_0) O_i^{\text{bare}} \right\}. {} \end{aligned} $$

Thus, in order to determine the physical matrix element, one must not only determine the overall renormalization factor Z, but also the mixing coefficients Δi. Several techniques have been developed [136,137,138] to address this problem, which is merely an inconvenience rather than a serious obstacle. In a formulation based on staggered fermions the problem is absent, since the remnant U(1) ⊗ U(1) symmetry protects the operator from mixing with other chiralities. However, a drawback of the staggered formulation is the broken flavour (“taste”) symmetry, which may lead to significant complications [139]. Fermionic discretizations based on the Ginsparg-Wilson relation, such as domain wall or overlap fermions, do not suffer from the mixing problem, whilst preserving all flavour symmetries. Finally, the mixing problem can also be circumvented for Wilson-like discretizations in the context of twisted-mass QCD [140, 141]. With the help of a suitably chosen flavour rotation (see Eq. (5.51)), the matrix element of O VV+AA in QCD can be mapped exactly onto that of the parity-odd operator O VA+AV in the chirally twisted theory, viz.

$$\displaystyle \begin{aligned} \left\langle\bar{K}^0\left|O^{\text{bare}}_{\mathrm{VA+AV}} \right|K^0\right\rangle_{\text{tmQCD}} = {{\mathrm{i}}}\left\langle\bar{K}^0\left|O^{\text{bare}}_{\mathrm{VV+AA}} \right|K^0\right\rangle_{\text{QCD}}. \end{aligned} $$

It has been shown that O VA+AV renormalizes purely multiplicatively [142], i.e. all mixing coefficients vanish. The overall multiplicative, scale-dependent renormalization factor of O VA+AV which yields the physical matrix element has been determined non-perturbatively [143], using the finite-size scaling procedure based on the Schrödinger functional formalism described in Sect. 5.5.2.

We now give a summary of the current status of B K. Here, the calculation by the JLQCD Collaboration [154], based on staggered quarks in the quenched approximation, has served as a benchmark result for a long time. Their result, for which the perturbatively renormalized matrix element was extrapolated to the continuum limit, has since been confirmed by many other calculations employing different fermionic discretizations and different renormalization techniques. These include domain wall [148, 149] and overlap quarks [150, 151], as well as the Wilson formulation [153, 155]. Moreover, a calculation employing twisted mass QCD has been completed [152], which includes non-perturbative renormalization and a thorough investigation of the continuum limit.

Recently, results for B K from simulations with dynamical quarks have become available, both for N f = 2 [146, 147] and N f = 2 + 1 flavours [144, 145]. A compilation of quenched and unquenched results is shown in Fig. 5.19. Although the figure suggests a trend in the data which points to slightly lower estimates for \(\hat {B}_{\mathrm {K}}\) if dynamical quarks are switched on, the quoted uncertainties are still too large to point to a significant deviation. In particular, a systematic study of the continuum limit in the unquenched case is not yet available. It is interesting to compare the results for \(\hat {B}_{\mathrm {K}}\) to the non-lattice determination in Ref. [130]. Here, the determinations of the angles of the unitarity triangle from experimental data, in conjunction with direct measurements of ΔM d,  ΔM s and 𝜖 K, allow one to fit the values of several of the quantities in Eq. (5.211), which incorporate the hadronic uncertainties. In this way one obtains \(\hat {B}_{\mathrm {K}}^{\text{non-lattice}} = 0.94\pm 0.17\), which is shown as the vertical band in Fig. 5.19. Clearly, within the rather large error margins, this result is compatible with all lattice determinations, quenched or unquenched.

Fig. 5.19 Recent lattice results for the RGI kaon B-parameter \(\hat {B}_{\mathrm {K}}\). From top to bottom, the plotted values are taken from Refs. [144,145,146,147,148,149,150,151,152,153,154]. Dotted error bars (where shown) indicate the quoted systematic error. The labels include information on the fermionic discretization and the intermediate renormalization scheme, if non-perturbative renormalization was used. We also indicate whether or not the results have been extrapolated to the continuum limit. The vertical lines represent the (non-lattice) result from [130], with the quoted uncertainty (see text)

First Row Unitarity and the Value of |V us|

In addition to Eq. (5.210), the unitarity of the CKM matrix implies many other constraints on its elements, such as those which appear in the first row:

$$\displaystyle \begin{aligned} |V_{ud}|{}^2+|V_{us}|{}^2+|V_{ub}|{}^2=1. {} \end{aligned} $$

Owing to the smallness of |V ub|, i.e. |V ub|² ≃ 2 ⋅ 10⁻⁵, the direct verification of first row unitarity with the current experimental and theoretical accuracy rests on the precise knowledge of |V ud| and |V us|. The value of |V ud| can be determined with high accuracy from super-allowed nuclear β-decays (0⁺ → 0⁺ transitions), and in the current edition of the particle data book the best estimate is quoted as [61]

$$\displaystyle \begin{aligned} |V_{ud}| = 0.97377\pm 0.00027. \end{aligned} $$
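To see what first row unitarity implies numerically, one can solve the unitarity constraint for |V us| using the quoted value of |V ud| and the estimate |V ub|² ≃ 2 ⋅ 10⁻⁵. This is a minimal sketch: the error is propagated linearly from |V ud| only, the uncertainty on |V ub| is neglected, and the function name is hypothetical.

```python
import math

V_UD = 0.97377      # central value quoted in the text
V_UD_ERR = 0.00027  # quoted uncertainty
V_UB_SQ = 2e-5      # |V_ub|^2 as estimated in the text


def vus_from_unitarity(v_ud, v_ud_err, v_ub_sq):
    """|V_us| implied by |V_ud|^2 + |V_us|^2 + |V_ub|^2 = 1, with the
    uncertainty propagated linearly from |V_ud| alone."""
    v_us = math.sqrt(1.0 - v_ud ** 2 - v_ub_sq)
    v_us_err = v_ud * v_ud_err / v_us
    return v_us, v_us_err


v_us, v_us_err = vus_from_unitarity(V_UD, V_UD_ERR, V_UB_SQ)
print(f"|V_us| from first row unitarity = {v_us:.4f} +/- {v_us_err:.4f}")
```

Exact unitarity would thus require |V us| ≈ 0.2275 with an uncertainty of about 0.0012. A direct determination of |V us| must reach a comparable precision in order to provide a meaningful test.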

The value of |V us| can be extracted from the decay rate of \(K_{\ell 3}\) transitions, i.e.

$$\displaystyle \begin{aligned} \Gamma(K\to\pi{\ell}\nu_\ell) \propto \frac{G_{\mathrm{F}}^2 m_{\mathrm{K}}^5}{192\pi^3} |V_{us}|{}^2 \left|f_{+}^{K\pi}(0)\right|{}^2, \end{aligned} $$

where \(f_{+}^{K\pi }\) is one of the two form factors which parameterize the hadronic matrix element for semi-leptonic K → πℓν transitions, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} \left\langle\pi(\vec{p}_{\pi})\left| (\bar{s}\gamma_\mu u)(0)\right| K(\vec{p}_{\mathrm{K}})\right\rangle &\displaystyle =&\displaystyle f_{+}^{K\pi}(q^2)(p_{\mathrm{K}}+p_\pi)_\mu + f_{-}^{K\pi}(q^2)(p_{\mathrm{K}}-p_\pi)_\mu,\\ q_\mu &\displaystyle =&\displaystyle (p_{\mathrm{K}}-p_\pi)_\mu. {} \end{array} \end{aligned} $$

In order to arrive at a precise estimate for |V us|, \(f_{+}^{K\pi }(q^2)\) must be determined with an accuracy at the level of 1%, since the decay rate and hence the combination \(|V_{us}|{ }^2\,[f_{+}^{K\pi }]^2 \) can be measured rather precisely. The form factor \(f_{+}^{K\pi }\) admits a chiral expansion; at zero momentum transfer it reads

$$\displaystyle \begin{aligned} f_{+}^{K\pi}(0) = 1+f_2+f_4+\ldots. \end{aligned} $$

While the leading chiral correction, f 2 = −0.023, was computed long ago [156], knowledge of f 4 and the higher corrections is still fairly limited. The strategy pursued in lattice calculations [157] is based on computing the quantity

$$\displaystyle \begin{aligned} {\Delta}f \equiv f_{+}^{K\pi}(0)- (1+f_2), \end{aligned} $$

which is a measure of the contributions beyond leading order. An old phenomenological estimate by Leutwyler and Roos [158] yields the value Δf = −0.016(8). It is clearly desirable to check this result and ultimately replace it by a model-independent estimate based on QCD.
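For orientation, combining the leading chiral correction with the Leutwyler-Roos estimate quoted above gives the form factor at zero momentum transfer. This is a simple worked sum, in which the uncertainty is that of Δf alone:

$$\displaystyle \begin{aligned} f_{+}^{K\pi}(0) = 1+f_2+{\Delta}f = 1-0.023-0.016(8) = 0.961(8). \end{aligned} $$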

Semi-leptonic form factors can be determined in lattice simulations by computing suitable three-point correlation functions, in which the initial and final hadronic states are projected onto non-vanishing momentum. The main issues that must be addressed in order to judge the accuracy of the form factor determination are listed in the following:

  • The dependence of the form factors on the momentum transfer q 2 must be modelled, in order to interpolate their values to q 2 = 0. Typical ansätze for the interpolation include linear or quadratic functions of q 2, as well as formulae based on pole dominance [159]. The freedom of choosing a particular ansatz introduces a certain ambiguity, since different model functions yield slightly different results. Via the introduction of so-called twisted boundary conditions [159,160,161,162,163,164], the q 2 resolution of form factors can be significantly improved;

  • As for all quantities involving pions, a chiral extrapolation of lattice results must be performed. Clearly, in order to obtain \(f_{+}^{K\pi }(0)\) and hence Δf with small controlled errors, a reliable chiral extrapolation is perhaps the single most important issue. Thus, the ability to simulate as deeply as possible in the chiral regime will be decisive for the final accuracy;

  • Other systematic uncertainties include control over lattice artefacts, which is closely related to the renormalization of local operators, such as the vector current, which appears in Eq. (5.220). If chiral symmetry is broken explicitly, the (local) vector current is not conserved, and in order to guarantee a smooth approach to the continuum limit, its renormalization factor, Z V, must be included. However, in all recent simulations the form factor has been extracted from suitably chosen ratios in which Z V drops out.
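The benefit of twisted boundary conditions mentioned in the first item can be illustrated with a simple kinematical sketch. It assumes a kaon at rest and a pion carrying momentum θ∕L generated by a twist angle θ in one spatial direction, and it uses continuum dispersion relations with Minkowski-signature q². The meson masses and the box size are illustrative inputs only, since simulated masses generally differ from the physical ones, and the function names are hypothetical.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm


def q_squared(m_k, m_pi, p_k, p_pi):
    """q^2 = (E_K - E_pi)^2 - (p_K - p_pi)^2 for collinear momenta, in MeV^2."""
    e_k = math.hypot(m_k, p_k)
    e_pi = math.hypot(m_pi, p_pi)
    return (e_k - e_pi) ** 2 - (p_k - p_pi) ** 2


def twist_for_zero_q2(m_k, m_pi, box_length_fm):
    """Pion momentum (MeV) and twist angle (radians) giving q^2 = 0 for a kaon at rest."""
    p_pi = (m_k ** 2 - m_pi ** 2) / (2.0 * m_k)
    theta = p_pi * box_length_fm / HBARC_MEV_FM
    return p_pi, theta


# Illustrative inputs (assumptions): m_K = 496 MeV, m_pi = 138 MeV, L = 2.5 fm
m_k, m_pi, box = 496.0, 138.0, 2.5
p_fourier = 2.0 * math.pi * HBARC_MEV_FM / box  # smallest momentum with periodic b.c.
p_pi, theta = twist_for_zero_q2(m_k, m_pi, box)

print(f"q^2 at rest               = {q_squared(m_k, m_pi, 0.0, 0.0):12.1f} MeV^2")
print(f"q^2 at p_pi = 2*pi/L      = {q_squared(m_k, m_pi, 0.0, p_fourier):12.1f} MeV^2")
print(f"twist angle for q^2 = 0   = {theta:.3f} rad (p_pi = {p_pi:.1f} MeV)")
```

With periodic boundary conditions the accessible values of q² jump from a positive value at rest to a large negative value at the first Fourier momentum, so that q² = 0 is reached only by interpolation. A suitably tuned twist angle allows the point q² = 0 to be simulated directly.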

A compilation of recent results for the form factor \(f_{+}^{K\pi }(0)\) and the quantity Δf is presented in Table 5.4, where they are compared to analytical estimates. The agreement with the old result by Leutwyler and Roos is quite striking. Despite a tendency among the more recent analytical calculations to produce slightly larger estimates for Δf, all results are in good agreement within the quoted uncertainties.

Table 5.4 Recently published lattice results for the quantity Δf

An alternative method to determine |V us| from experimental data was proposed by Marciano [170]. Instead of considering semi-leptonic decays, it is based on the leptonic decay rates, i.e.

$$\displaystyle \begin{aligned} \frac{\Gamma(K\to\mu\bar\nu_\mu(\gamma))} {\Gamma(\pi{\to}e\bar\nu_e(\gamma))} \propto \frac{|V_{us}|{}^2}{|V_{ud}|{}^2}\,\frac{f_{\mathrm{K}}^2m_{\mathrm{K}}}{f_\pi^2 m_\pi}. \end{aligned} $$

Hence, the task is to provide an input value for the ratio of decay constants, \(f_{\mathrm{K}}/f_\pi\). This quantity is well-suited for lattice calculations in several respects: first, ratios of quantities can be computed with high statistical accuracy, owing to the fact that the fluctuations in the numerator and denominator are correlated. Second, the renormalization factor of the axial current, Z A, drops out in the ratio \(f_{\mathrm{K}}/f_\pi\). However, since the quantity of interest involves a chiral extrapolation, the same caveats as in the case of the form factor \(f_{+}^{K\pi }\) apply here. In particular, it is mandatory to go as close as possible to the physical mass of the pion. The quenched approximation is clearly of very limited value in this context, since the chiral behaviour and hence the actual value of \(f_{\mathrm{K}}/f_\pi\) may strongly depend on the number of active sea quarks. Furthermore, it is known that in the continuum limit of the quenched approximation the value of \(f_{\mathrm{K}}/f_\pi\) is underestimated by about 10% [171].

Recent results for \(f_{\mathrm{K}}/f_\pi\) in lattice QCD with dynamical quarks are listed in Table 5.5. A caveat that applies to all such compilations is that systematic errors are not estimated in a uniform manner. For instance, none of the listed results (with the exception of [52]) is based on a systematic scaling study aimed at separating cutoff effects from the actual mass dependence, although the influence of lattice artefacts has been included in some error estimates by including cutoff effects into a generalized chiral fit. Moreover, not all of the listed values of \(f_{\mathrm{K}}/f_\pi\) include finite-volume corrections, which can be computed in ChPT and incorporated into the ansatz for the chiral fit [177, 178]. Despite these caveats it appears, though, that the estimates for \(f_{\mathrm{K}}/f_\pi\) based on fits including pion masses well below 500 MeV are compatible with each other.

Table 5.5 Recently published results for f Kf π in lattice QCD with dynamical quarks

5.7.2 Weak Matrix Elements in the Heavy Quark Sector

The main obstacle for calculations of weak matrix elements involving heavy quarks, and in particular the b-quark, is that one is faced with a multi-scale problem. In Sect. 5.2.5 we have already discussed systematic effects in lattice calculations that arise from finite-size effects and lattice artefacts. Translating the relations in (5.79) directly to the b-quark sector, one finds that the following inequalities cannot be satisfied simultaneously, at least not with the currently available computer power:

$$\displaystyle \begin{aligned} am_b \ll 1,\quad {m_\pi}L \gg 1,\quad L/a\;\lesssim\;50. {} \end{aligned} $$

Violation of the first relation implies the presence of large lattice artefacts, the second inequality must be satisfied if one wants to avoid uncontrolled finite-volume effects, and the third is dictated by the memory capacities of current computers. With a b-quark mass of m b ≈ 4 GeV and typical inverse lattice spacings of \(a^{-1}\;\lesssim \;4.5\,\text{GeV}\), it is evident that the b-quark cannot be studied directly, since its Compton wavelength is smaller than, or of the same order of magnitude as, the lattice spacing itself.
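The conflict between the three conditions can be made explicit with a few lines of arithmetic. The Python sketch below is a simple illustration, assuming m b = 4 GeV as quoted above, the physical pion mass of about 0.14 GeV, and the largest lattice size L∕a = 50 allowed by the third condition; the function name and the chosen values of the inverse lattice spacing are hypothetical.

```python
HBARC_GEV_FM = 0.1973  # conversion constant hbar*c in GeV fm

def lattice_scales(a_inv_gev, l_over_a, m_b_gev=4.0, m_pi_gev=0.14):
    """Return (a*m_b, m_pi*L, L in fm) for a given inverse lattice spacing and L/a."""
    a = 1.0 / a_inv_gev        # lattice spacing in GeV^-1
    length = l_over_a * a      # box length in GeV^-1
    return a * m_b_gev, m_pi_gev * length, length * HBARC_GEV_FM

for a_inv in (2.0, 4.0, 8.0):
    am_b, m_pi_l, l_fm = lattice_scales(a_inv, 50)
    print(f"1/a = {a_inv:.0f} GeV: a*m_b = {am_b:.2f}, "
          f"m_pi*L = {m_pi_l:.2f}, L = {l_fm:.2f} fm")
```

Reducing the lattice spacing at fixed L∕a lowers am b only at the price of a correspondingly smaller value of m πL, which is the multi-scale problem described above.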

Several strategies to deal with this problem have been applied over many years, among them the “static approximation” [179], the non-relativistic formulation (NRQCD) [180], the so-called “Fermilab-approach” [181] and finite-size scaling techniques [182, 183].

Since the charm quark is lighter than the b-quark by roughly a factor of three, one may attempt to treat charm as a fully relativistic, propagating quark in simulations. Still, one can incur large lattice artefacts in this way, and a careful extrapolation to the continuum limit is then required. However, such an extrapolation may be spoilt if the leading lattice artefacts cannot be isolated in the results, due to the relatively large mass of the charm quark. Nevertheless, if one has reason to trust the results obtained for relativistic charm quarks, one may extrapolate them to the mass of the b-quark, which is yet another way of circumventing the problem that the b-quark is too heavy to be treated relativistically. Typically, the ansatz for the extrapolation of a particular quantity to the mass of the b-quark is motivated by its expected quark mass dependence in Heavy Quark Effective Theory (HQET).

In the static approximation the b-quark is assumed to be infinitely heavy [179]. In this formalism it is convenient to represent the b-quark by a pair of spinors, \((\psi _h,\psi _{\bar h})\), which propagate forward and backward in time, respectively, and which satisfy

$$\displaystyle \begin{aligned} P_{+}\psi_h = \psi_h,\quad P_{-}\psi_{\bar h} = \psi_{\bar h}, \qquad P_{\pm} = \textstyle{1\over2}(1\pm\gamma_0). \end{aligned} $$

While the field ψ h annihilates a heavy quark, \(\psi _{\bar h}\) creates a heavy antiquark. The dynamics of these fields in the discretized version of the theory is described by the Eichten-Hill action [184]

$$\displaystyle \begin{aligned} S^{\text{stat}} = a^4\sum_x \left\{ {\mathcal{L}}_h^{\text{stat}} +{\mathcal{L}}_{\bar h}^{\text{stat}} \right\},\quad {\mathcal{L}}_h^{\text{stat}} = \bar\psi_h(x)\nabla_0^*\psi_h(x),\quad {\mathcal{L}}_{\bar h}^{\text{stat}} = -\bar\psi_{\bar h}(x)\nabla_0\psi_{\bar h}(x), {} \end{aligned} $$

where \(\nabla _0,\,\nabla _0^*\) denote the forward and backward covariant lattice derivatives in the temporal direction. Although the numerical computation of the quark propagator based on the Eichten-Hill action is relatively “cheap”, simulation results in the static approximation typically suffer from relatively large statistical noise. Without going into detail we note that the signal-to-noise ratio can be significantly improved if one replaces the temporal link variables in ∇0 and \(\nabla _0^*\) by suitably chosen generalized parallel transporters. A full account can be found in Ref. [185].

Obviously, the static approximation represents only the leading term in an expansion of the quark action in inverse powers of the heavy quark mass, and thus one expects corrections in powers of 1∕m h. As described in Ref. [182], one can set up a formalism in which the leading corrections to physical observables can be systematically computed as operator insertions in correlation functions defined with respect to the static action S stat. Again, we refrain from describing any further details and refer the reader to the original literature [182, 186].

Higher-order corrections to the static approximation can also be incorporated into the theory by adding the appropriate 1∕m h terms to the action itself. In this way one obtains a non-relativistic version of QCD (NRQCD) [180], in which the mass of the heavy quark is imposed as a cutoff on relativistic momentum modes, i.e.

$$\displaystyle \begin{aligned} p\sim m_h v\;\ll\;m_h, \end{aligned} $$

where v denotes the velocity of the heavy quark. Heuristically, the introduction of the cutoff is justified since the typical internal momenta of hadrons containing a heavy quark are much smaller than the mass of the latter. The loss of relativistic states can be compensated by adding new local interaction terms order by order in p∕m h ∼ v to \({\mathcal {L}}_h^{\text{stat}}\) and \({\mathcal {L}}_{\bar h}^{\text{stat}}\). In general, these additional interaction terms will generate mixing between quark and antiquark. However, by applying a Foldy-Wouthuysen transformation, the fields can be decoupled. At the level of the classical theory, the 1∕m h correction to the NRQCD Lagrangian for the forward propagating field reads

$$\displaystyle \begin{aligned} {\mathcal{L}}_h^{(1);\,\text{class}} = -\frac{1}{2m_h}\left\{ \bar\psi_h {\boldsymbol{D}{\cdot}\boldsymbol{D}} \psi_h +\bar\psi_h {\boldsymbol{\sigma}{\cdot}\boldsymbol{B}} \psi_h \right\}, \end{aligned} $$

where B denotes the (chromo-)magnetic field and D is the vector of the covariant derivatives in the spatial directions.

In the quantized version of the theory, the coefficients which multiply the fields in the above expression become dependent on the gauge coupling and must be appropriately tuned to guarantee the correct matching of the non-relativistic theory to standard QCD at the given order in 1∕m h. Thus, the lattice-regularized version of the 1∕m h correction reads

$$\displaystyle \begin{aligned} {\mathcal{L}}_h^{(1)} = -\left\{ \omega_1 \bar\psi_h {\boldsymbol{\nabla\cdot\nabla}} \psi_h +\omega_2 \bar\psi_h {\boldsymbol{\sigma\cdot\hat{B}}} \psi_h \right\}, \end{aligned} $$

where \({\boldsymbol {\hat B}}\) denotes a lattice representation of the magnetic field. The coefficients ω 1 and ω 2 are formally of order 1∕m h and are found to be linearly divergent in the lattice spacing a. Therefore, at a given order in the non-relativistic expansion of the action, a finite cutoff must be kept, and in this sense the effective theory is non-renormalizable. All this implies that in NRQCD the continuum limit, a → 0, cannot be taken. Instead, one must argue that lattice artefacts are small in the range of lattice spacings where the calculations are performed.

Another approach is based on the idea that the Wilson fermion action can be suitably adapted for heavy quarks, such that the Wilson quark propagator does not deviate from the continuum behaviour even for quark masses near or above the cutoff [181]. According to Ref. [181] this can be achieved by modifying the normalization of the quark fields (see Eq. (5.36)) in the discretized lattice theory, i.e.

$$\displaystyle \begin{aligned} \psi(x)\rightarrow\sqrt{2\kappa}\,{\mathrm{e}}^{am_{\mathrm{P}}/2}\psi(x), \qquad \bar{\psi}(x)\rightarrow\bar{\psi}(x)\,{\mathrm{e}}^{am_{\mathrm{P}}/2}\sqrt{2\kappa}, {} \end{aligned} $$

where the “pole mass” am P of the Wilson propagator is given by

$$\displaystyle \begin{aligned} am_{\mathrm{P}} = \ln(1+am), \end{aligned} $$

and am denotes the bare subtracted quark mass in the Wilson theory (see Eq. (5.39)). The factor \(\sqrt {2\kappa }\,{\mathrm {e}}^{am_{\mathrm {P}}/2}\) is designed to interpolate smoothly between the relativistic and non-relativistic regimes. As a consequence, in order to cancel the effects of large quark masses in hadronic matrix elements involving b-quarks, the normalization of quark fields is modified according to the above prescription. The so-called “Fermilab approach” to heavy quark physics on the lattice is based on the normalization in Eq. (5.230). Essentially it amounts to formulating an effective theory for quarks, whose spatial momenta are small, \(|a\vec {p}|\ll 1\), with mass-dependent coefficients. As in the case of the static approximation, the formalism allows the continuum limit to be taken. Approaches related to the Fermilab method have been presented in Refs. [187, 188].
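To see how the pole mass departs from the bare subtracted mass as am grows, one can simply tabulate the relation am P = ln(1 + am). The following Python sketch does this for a few arbitrarily chosen values of am and also evaluates the mass-dependent part e^{am P∕2} of the field normalization. The factor \(\sqrt {2\kappa }\) is omitted, since it depends on the definitions in Eqs. (5.36) and (5.39), which are not reproduced here; the function name is hypothetical.

```python
import math

def pole_mass(am):
    """Pole mass a*m_P = ln(1 + a*m) of the Wilson propagator."""
    return math.log1p(am)

for am in (0.01, 0.1, 0.5, 1.0, 2.0):
    am_p = pole_mass(am)
    # exp(a*m_P/2) equals sqrt(1 + a*m): the mass-dependent part of the
    # field normalization (the factor sqrt(2*kappa) is left out here)
    norm = math.exp(am_p / 2.0)
    print(f"am = {am:4.2f}   am_P = {am_p:.4f}   exp(am_P/2) = {norm:.4f}")
```

For am ≪ 1 the two masses agree up to corrections of order (am)², whereas for am of order one the difference is substantial.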

Finally, we briefly introduce another strategy to deal with heavy quarks on the lattice and the related multi-scale problem [182, 183]. Here the condition m πL ≫ 1 in Eq. (5.224) is sacrificed in favour of am b ≪ 1. In this way one is able to accommodate a fully relativistic b-quark at the expense of having to deal with strong finite-volume effects. The key observation is that the “distortion” due to unphysically small volumes can be computed in a series of finite-size scaling steps, which relate the results obtained on a sequence of lattice sizes L 0, L 1, …. Like in the case of the non-perturbative determination of the RG running of the coupling and the quark mass discussed in Sect. 5.5.2, one can set up a recursive finite-size scaling procedure, which traces the volume dependence of observables. Here it is mostly sufficient to apply two or three steps in the scaling sequence.
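The bookkeeping of such a recursion can be sketched in a few lines of Python. The sketch assumes that each scaling step is expressed as a ratio σ k = O(L k+1)∕O(L k) of the observable on two successive volumes, which is one possible way of organizing the steps; the function name and the numerical inputs are placeholders and do not represent simulation results.

```python
def finite_size_chain(value_small_volume, step_factors):
    """Propagate an observable from the smallest volume L_0 through the steps L_k -> L_{k+1}."""
    value = value_small_volume
    history = [value]
    for sigma in step_factors:
        # sigma_k = O(L_{k+1}) / O(L_k), assumed to be known for each step
        value *= sigma
        history.append(value)
    return history

# Placeholder inputs: the observable on L_0 (arbitrary units) and two step factors
history = finite_size_chain(1.0, [0.90, 0.98])
print(history)
```

The fully relativistic b-quark enters only in the determination on the smallest volume L 0, while the subsequent steps trace the volume dependence.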

In the remainder of this section we shall discuss some selected results. Given the vast number of individual results, we do not attempt to provide a complete review of the current status of lattice calculations of weak matrix elements in the heavy quark sector. Regular appraisals of the progress made in studying these systems can be found in the rapporteur talks on the subject at the annual conferences on lattice field theory [132, 133, 135]. Instead we shall discuss the relation between CKM matrix elements and the quantities that must be computed in order to extract the former from experimental data without resorting to model assumptions.

Heavy-Light Decay Constants

From Eq. (5.211) and Fig. 5.18 one infers that the ratio \(\xi \equiv f_{\mathrm {B}_{\mathrm {s}}}\sqrt {\hat {B}_{\mathrm {B}_{\mathrm {s}}}}/f_{\mathrm {B}}\sqrt {\hat {B}_{\mathrm {B}}}\) of decay constants and B-parameters is a key quantity, since it links ΔM s∕ ΔM d to the ratio \(|V_{ts}|^2/|V_{td}|^2\) of CKM matrix elements. Typically, one determines decay constants and B-parameters separately, since the former can be easily extracted from hadronic two-point functions, while the latter may undergo complicated mixing patterns, depending on the fermionic discretization. The decay constant of, say, a B + meson, is defined via the matrix element of the heavy-light axial current, i.e.

$$\displaystyle \begin{aligned} f_{\mathrm{B}} m_{\mathrm{B}} = \left\langle 0 \left|(\bar{u}\gamma_0\gamma_5 b) \right|B^{+}\right\rangle. \end{aligned} $$

If the matrix element on the right-hand side is computed in a lattice simulation, then the axial current defined in the discretized theory must be matched to its counterpart in the continuum formulation. The details of the matching procedure depend on the type of fermionic discretization and the chosen treatment to represent the heavy-light axial current on the lattice (e.g. static approximation, NRQCD, etc.). If the b-quark is treated in the static approximation, the axial current has a non-vanishing anomalous dimension, and hence its running must be determined as well. Therefore, the various techniques which have been developed to compute the renormalization factors of local operators non-perturbatively, are of particular relevance also in the study of heavy-light decay constants [189]. In particular, non-perturbative estimates for the renormalization factor of the axial current, Z A, are required to ensure a smooth convergence towards the continuum limit.

We now present results for f B and \(f_{\mathrm {B}_{\mathrm {s}}}\). From Chiral Perturbation Theory one expects that the bulk of the SU(3)-flavour breaking effect in ξ (i.e. the deviation of ξ from unity) is carried by the decay constants. The full expression at NLO for \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}\) reads [190]

$$\displaystyle \begin{aligned} \frac{f_{\mathrm{B}_{\mathrm{s}}}}{f_{\mathrm{B}}}-1 = (m_{\mathrm{K}}^2-m_\pi^2){f_2(\mu)} -{{1+3{g^2}}\over{(4\pi f_\pi)^2}} \left[ \frac{1}{2}I_{\mathrm{P}}(m_{\mathrm{K}})+\frac{1}{4}I_{\mathrm{P}}(m_\eta) -\frac{3}{4}I_{\mathrm{P}}(m_\pi) \right], \end{aligned} $$

where \(I_{\mathrm {P}}(m_{\mathrm {PS}})=m_{\mathrm {PS}}^2\ln (m_{\mathrm {PS}}^2/\mu ^2)\), f 2 is a low-energy constant, and g 2 is the strength of the B ∗Bπ vertex. As was pointed out by Kronfeld and Ryan [191], the contribution from the chiral logarithms can be sizeable, so that a naïve linear extrapolation of lattice data from the region of the strange quark mass tends to underestimate \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}\). By contrast, the corresponding ratio \(B_{\mathrm {B}_{\mathrm {s}}}/B_{\mathrm {B}}\) is expected to be close to one, since the coefficient of the chiral logarithm nearly vanishes. Since \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}\) enters directly into fits to the CKM parameters, many attempts were made to pin down its value precisely. As in the case of f K∕f π discussed earlier, the main issue for lattice calculations is whether the quark masses employed in simulations are small enough to allow for a controlled chiral extrapolation based on the NLO formulae. The influence of the chiral logarithms has so far been detected only in simulations based on N f = 2 + 1 flavours of rooted staggered quarks. Using NRQCD to treat the b-quark, the authors of [192] find
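To get a feeling for the size of the logarithmic term in the NLO expression above, one may evaluate it for physical meson masses. The Python sketch below does this under several assumptions that are not taken from the text: the value g 2 = 0.35, the normalization f π ≈ 0.13 GeV, and the renormalization scale μ = 1 GeV are illustrative choices, and the function names are hypothetical. The analytic term proportional to f 2(μ) is not included, so the number obtained depends on μ and is not a prediction for \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}-1\); only the sum of both terms is independent of the scale.

```python
import math

def i_p(m_ps, mu):
    """Chiral logarithm I_P(m) = m^2 ln(m^2 / mu^2); masses in GeV."""
    return m_ps ** 2 * math.log(m_ps ** 2 / mu ** 2)

def chiral_log_term(g2, f_pi, mu, m_pi=0.138, m_k=0.495, m_eta=0.548):
    """Logarithmic contribution to f_Bs/f_B - 1 (the f_2 term is NOT included)."""
    bracket = (0.5 * i_p(m_k, mu) + 0.25 * i_p(m_eta, mu)
               - 0.75 * i_p(m_pi, mu))
    return -(1.0 + 3.0 * g2) / (4.0 * math.pi * f_pi) ** 2 * bracket

# g2, f_pi and mu are assumed inputs chosen for illustration only
term = chiral_log_term(g2=0.35, f_pi=0.13, mu=1.0)
print(f"chiral-log contribution at mu = 1 GeV: {term:.3f}")
```

With these inputs the logarithmic term is positive, in line with the observation that neglecting the chiral logarithms tends to underestimate the ratio.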

$$\displaystyle \begin{aligned} \frac{f_{\mathrm{B}_{\mathrm{s}}}}{f_{\mathrm{B}}} = 1.20\pm0.03\pm0.01, \qquad N_{\mathrm{f}}=2+1, {} \end{aligned} $$

where the first error is statistical, while the second is an estimate of the systematic uncertainty. This result awaits confirmation from simulations with sea quark masses as small as those used in [192], but employing different fermionic discretizations, both in the sea and valence quark sectors. This is of particular relevance, since the typical spread among the recently published results is of the same order as, or even larger than, the uncertainty quoted above. Further discussions and compilations of lattice data for \({f_{\mathrm {B}_{\mathrm {s}}}}/{f_{\mathrm {B}}}\) can be found in [133, 193].

Estimates for absolute values of heavy-light decay constants are also highly desirable, especially because f B is hard to determine experimentally, even at the B-factories, as the B → τν τ decay rate is suppressed. For \(f_{\mathrm {B}_{\mathrm {s}}}\) the suppression is even stronger, and thus the prospects for an experimental determination of this quantity are extremely uncertain. The main issues facing lattice calculations are the influence of lattice artefacts in conjunction with the renormalization of the axial current, and the dependence of results on the number of dynamical quark flavours.

As an example of one of the most advanced quenched calculations for \(f_{\mathrm {B}_{\mathrm {s}}}\) we shall briefly discuss the result by the ALPHA collaboration [198], which also illustrates the interplay between various methods to treat the b-quark. In Ref. [198] the results obtained in the static approximation were combined with data computed around the mass of the charm quark. Provided that estimates for the decay constants in both datasets have been extrapolated to the continuum limit, a subsequent interpolation in the heavy quark mass yields the desired result for \(f_{\mathrm {B}_{\mathrm {s}}}\). The ansatz for the interpolation is based on the expression

$$\displaystyle \begin{aligned} f_{\mathrm{PS}}\sqrt{m_{\mathrm{PS}}} = C_{\mathrm{PS}}(M/\Lambda_{{\overline{{\mathrm{MS}}}}}) \,\gamma\,\left(1+ \frac{\delta}{m_{\mathrm{PS}}}\right), \end{aligned} $$

where f PS is a generic heavy-light decay constant, γ, δ are real constants, and the factor C PS arises from the matching between the static approximation and QCD with fully relativistic quarks. Thus, using the static approximation as the limiting case removes the systematic error due to the uncontrolled extrapolation to the mass of the b-quark. The resulting estimate for \(f_{\mathrm {B}_{\mathrm {s}}}\) is [198]
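Since the ansatz is linear in 1∕m PS once the matching factor C PS has been divided out, the interpolation amounts to a straight-line fit in which the static result fixes the point at 1∕m PS = 0. The following Python sketch illustrates this with synthetic inputs generated from arbitrarily chosen values of γ and δ. The numbers are placeholders and not lattice data, the fit is unweighted for simplicity, and the function and variable names are hypothetical.

```python
def fit_linear(xs, ys):
    """Unweighted least-squares fit y = c0 + c1*x; returns (c0, c1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    c1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - c1 * mx, c1

# Synthetic inputs (placeholders): y = f_PS * sqrt(m_PS) / C_PS
gamma_true, delta_true = 0.5, -0.4       # arbitrary, in GeV^(3/2) and GeV
masses = [1.8, 2.0, 2.3, 2.6]            # pseudoscalar masses near charm, GeV
xs = [0.0] + [1.0 / m for m in masses]   # x = 1/m_PS; x = 0 is the static limit
ys = [gamma_true * (1.0 + delta_true * x) for x in xs]

c0, c1 = fit_linear(xs, ys)
gamma, delta = c0, c1 / c0               # y = gamma * (1 + delta * x)

m_bs = 5.37                              # B_s meson mass in GeV
y_bs = gamma * (1.0 + delta / m_bs)      # interpolated value at the B_s mass
print(f"gamma = {gamma:.4f}, delta = {delta:.4f}, y(B_s) = {y_bs:.4f}")
```

The point 1∕m PS ≈ 0.19 GeV⁻¹ corresponding to the B s meson lies between the static limit and the charm region, so that the result is obtained by interpolation rather than extrapolation.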

$$\displaystyle \begin{aligned} f_{\mathrm{B}_{\mathrm{s}}}= 193\pm6\,\text{MeV},\qquad N_{\mathrm{f}}=0. \end{aligned} $$

Non-perturbative renormalization has been employed in both the static approximation and the relativistic formulation. Except for the unknown systematic error due to quenching, the quoted error contains all uncertainties. The above result has been confirmed by the approach based on the finite-size scaling method [199].

Turning now to unquenched simulations, we compare the above value to the result by the HPQCD Collaboration [192], which was obtained using NRQCD for the b-quark, while N f = 2 + 1 rooted staggered quarks were used as sea quarks. Here, the estimate for \(f_{\mathrm {B}_{\mathrm {s}}}\) results from a combination of the value for f B and the ratio \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}}\) already quoted in Eq. (5.234). In this way one obtains

$$\displaystyle \begin{aligned} f_{\mathrm{B}_{\mathrm{s}}}=259\pm32\,\text{MeV}. \end{aligned} $$

Thus, in spite of the large error, it appears that the inclusion of dynamical quark effects increases the estimate for heavy-light decay constants. This is also supported by other simulations. For instance, using their simulation results in quenched QCD and with N f = 2 flavours of dynamical Wilson quarks, the CP-PACS Collaboration find [203]

$$\displaystyle \begin{aligned} \frac{f_{\mathrm{B}_{\mathrm{s}}}^{N_{\mathrm{f}}=2}}{f_{\mathrm{B}_{\mathrm{s}}}^{N_{\mathrm{f}}=0}}=1.14\pm0.05. \end{aligned} $$

The “non-lattice” determination of \(f_{\mathrm {B}_{\mathrm {s}}}\) via fits using the experimental results for the angles of the unitarity triangle as input [130] also points to a larger value compared to the quenched theory, as can be seen from the horizontal band in the compilation in Fig. 5.20.

Fig. 5.20 Recent lattice results for \(f_{\mathrm {B}_{\mathrm {s}}}\). From left to right the results are taken from the following papers. N f = 0: [194–207]; N f = 2: [203, 204, 207, 208], [209]; N f = 2 + 1: [192, 210]. The labels indicate the method used to treat the b-quark in the simulation (“ext” and “int” stand for extrapolations and interpolations to the mass of the b-quark, respectively). The horizontal lines represent the (non-lattice) result from [130], with the quoted uncertainty

In current unquenched simulations, systematic effects such as lattice artefacts and the renormalization of local operators are not yet controlled at the level achieved in the quenched approximation. Thus, despite the fact that these calculations are much more “realistic” in that they include sea quarks, the quoted overall uncertainties are still relatively large.

B-Parameters \(\hat {\boldsymbol {B}}_{\mathrm {B}_{\mathrm {d}}}\) and \(\hat {\boldsymbol {B}}_{\mathrm {B}_{\mathrm {s}}}\)

Following the recent experimental determination of the mass difference ΔM s at the Tevatron [211, 212], lattice determinations of the B-parameters \(\hat {B}_{\mathrm {B}_{\mathrm {d}}}\) and \(\hat {B}_{\mathrm {B}_{\mathrm {s}}}\) have received much attention. Although the first calculations date back to the 1990s, relatively few results are available, due to several specific technical difficulties. First, the complicated renormalization and mixing patterns of four-quark operators which afflict lattice calculations of the kaon B-parameter \(\hat {B}_{\mathrm {K}}\) are also encountered in the b-quark sector. Second, there is the added complication which arises from the fact that the b-quark cannot be simulated directly.

In Table 5.6 we list published results for \(B_{\mathrm {B}_{\mathrm {d}}}(m_b)\) and the ratio \(B_{\mathrm {B}_{\mathrm {s}}}/B_{\mathrm {B}_{\mathrm {d}}}\) from a variety of methods to treat the heavy quark. The table shows that all results are broadly consistent with each other at the level of 10%, despite the different systematics. It should be noted, however, that none of the listed estimates is based on non-perturbative renormalization factors, and that all entries have been computed for a fixed value of the lattice spacing, i.e. a systematic study of the continuum limit is lacking even in the quenched approximation. As for the ratio \(B_{\mathrm {B}_{\mathrm {s}}}/B_{\mathrm {B}_{\mathrm {d}}}\), it should be mentioned that the quark masses in the simulations correspond to pion masses not much smaller than 500 MeV. Nevertheless, in view of the fact that the bulk of the relevant SU(3)-flavour breaking effect in ΔM s∕ ΔM d is expected to come from the ratio of decay constants, \(f_{\mathrm {B}_{\mathrm {s}}}/f_{\mathrm {B}_{\mathrm {d}}}\), this may not be such a serious limitation. Results for \(B_{\mathrm {B}_{\mathrm {d}}}\) and \(B_{\mathrm {B}_{\mathrm {s}}}\) computed on dynamical gauge configurations with rooted staggered quarks should be published soon.

Table 5.6 Published lattice results for the B-parameter \(B_{\mathrm {B}_{\mathrm {d}}}(m_b)\) in the \({\overline {{\mathrm {MS}}}}\)-scheme and the ratio \(B_{\mathrm {B}_{\mathrm {s}}}/B_{\mathrm {B}_{\mathrm {d}}}\)

Another recent development is the implementation of non-perturbative renormalization for heavy-light four-quark operators in the static approximation [220, 221]. If the b-quark is treated in the static approximation, the ΔB = 2 four-quark operator must be matched to its counterpart in the static theory, i.e.

$$\displaystyle \begin{aligned} Q^{\Delta{B}=2}(m_b) = C_{\mathrm{L}}(m_b,\mu) \widetilde{Q}_1(\mu) +C_{\mathrm{S}}(m_b,\mu) \widetilde{Q}_2(\mu), \end{aligned} $$


$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \widetilde{Q}_1 = \left(\bar\psi_h\gamma_\mu(1-\gamma_5)\ell\right) \left(\bar\psi_{\bar h}\gamma_\mu(1-\gamma_5)\ell\right) \equiv \widetilde{O}_{\mathrm{VV+AA}}+\widetilde{O}_{\mathrm{VA+AV}} \\ &\displaystyle &\displaystyle \widetilde{Q}_2 = \left(\bar\psi_h(1-\gamma_5)\ell\right) \left(\bar\psi_{\bar h}(1-\gamma_5)\ell\right) \equiv \widetilde{O}_{\mathrm{SS+PP}}+\widetilde{O}_{\mathrm{SP+PS}}, \end{array} \end{aligned} $$

with ℓ denoting the light (d or s) flavour. For the physical matrix element only the parity-even operators \(\widetilde {O}_{\mathrm {VV+AA}}\) and \(\widetilde {O}_{\mathrm {SS+PP}}\) are relevant. If chiral symmetry is not preserved by the discretization, four-quark operators such as \(\widetilde {O}_{\mathrm {VV+AA}}\) undergo complicated mixing patterns under renormalization, which necessitate finite subtractions similar to those required for the operator O VV+AA in Eq. (5.215). However, just as in the case of \(K^0-\bar K^0\) mixing, the parity-even operators \(\widetilde {O}_{\mathrm {VV+AA}}\) and \(\widetilde {O}_{\mathrm {SS+PP}}\) can be mapped onto their parity-odd counterparts \(\widetilde {O}_{\mathrm {VA+AV}}\) and \(\widetilde {O}_{\mathrm {SP+PS}}\) by a flavour rotation, which realizes the transition to tmQCD at maximal twist angle. Moreover, it can be shown [220] that the combinations

$$\displaystyle \begin{aligned} \widetilde{O}_1^\prime\equiv\widetilde{O}_{\mathrm{VA+AV}},\qquad \widetilde{O}_2^\prime\equiv \widetilde{O}_{\mathrm{VA+AV}}+4\widetilde{O}_{\mathrm{SP+PS}} \end{aligned} $$

renormalize purely multiplicatively. The RG running of these operators, as well as the matching to hadronic schemes based on tmQCD have been determined non-perturbatively in the SF scheme for N f = 0 [221] and N f = 2 [222], which will eventually allow for a determination of \(\hat {B}_{\mathrm {B}_{\mathrm {s}}}\) and \(\hat {B}_{\mathrm {B}}\) with full control over renormalization and discretization effects. Corrections of order 1∕m b can be taken into account through an interpolation between the results obtained in the static approximation and for relativistic heavy quarks with masses in the region of that of the charm quark.

Semi-Leptonic B-Decays

The CKM elements |V ub| and |V cb|, which appear in the unitarity triangle relation, Eq. (5.210), can be extracted from both inclusive and exclusive B-meson decays. However, |V ub| is still one of the most poorly constrained CKM elements. Its value can be determined by combining lattice calculations of semi-leptonic form factors for exclusive decays such as \(\bar {B}^0\to \pi ^{+}\ell ^{-}\bar \nu _\ell \) with the experimentally measured decay rate. If the leptons are assumed to be massless, the latter yields the combination \([|V_{ub}|\,f_{+}(q^2)]^2\), while the form factor f +(q 2) can be extracted from the matrix element

$$\displaystyle \begin{aligned} \left\langle\pi(\vec{p}_{\pi})\left| (\bar{b}\gamma_\mu u)(0)\right| B(\vec{p}_{\mathrm{B}})\right\rangle = \left[(p_{\mathrm{B}}+p_\pi)_\mu -q_\mu\frac{m_{\mathrm{B}}^2-m_\pi^2}{q^2}\right]f_{+}(q^2) +q_\mu\frac{m_{\mathrm{B}}^2-m_\pi^2}{q^2} f_{0}(q^2). {} \end{aligned} $$

Here, \(q_\mu \equiv (p_{\mathrm {B}}-p_\pi )_\mu \) denotes the momentum transfer. For a B-meson at rest one has

$$\displaystyle \begin{aligned} q^2 = m_{\mathrm{B}}^2+m_\pi^2 - 2m_{\mathrm{B}}\sqrt{m_\pi^2+\vec{p}_\pi^2}. \end{aligned} $$

In order to avoid large lattice artefacts, typical values of the pion momentum in simulations are restricted to

$$\displaystyle \begin{aligned} \left|\vec{p}_\pi\right|\;\lesssim\;1\,\text{GeV}. \end{aligned} $$

Therefore, lattice calculations typically yield the form factors f + and f 0 near \(q^2=q^2_{\max }\). By contrast, the bulk of the experimental data is recorded in bins with small values of q 2, since the decay rate is suppressed near \(q_{\max }^2\). Therefore, an extrapolation to small values of q 2 must be performed, which requires an ansatz for the shape of the form factor. Although a parameterization of the q 2-dependence which goes beyond vector pole dominance and is also consistent with the expected heavy-quark scaling laws has been proposed [223], the extrapolation to small momentum transfers typically introduces some model dependence in the result for |V ub|.
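The kinematic range that is accessible in simulations follows directly from the expression for q 2 given above for a B-meson at rest. The Python sketch below evaluates it at zero pion momentum, which yields \(q^2_{\max }=(m_{\mathrm {B}}-m_\pi )^2\), and at the upper end of the momentum range, \(|\vec {p}_\pi |=1\,\text{GeV}\). The meson masses are inserted by hand as approximate physical values, and the function name is hypothetical.

```python
import math

M_B, M_PI = 5.279, 0.1396   # approximate meson masses in GeV (assumed inputs)

def q_squared(p_pi, m_b=M_B, m_pi=M_PI):
    """q^2 in GeV^2 for a B meson at rest and pion three-momentum |p_pi| in GeV."""
    return m_b ** 2 + m_pi ** 2 - 2.0 * m_b * math.sqrt(m_pi ** 2 + p_pi ** 2)

q2_max = q_squared(0.0)       # zero recoil: equals (m_B - m_pi)^2
q2_at_1gev = q_squared(1.0)   # lowest q^2 reached with |p_pi| = 1 GeV

print(f"q2_max           = {q2_max:.2f} GeV^2")
print(f"q2(|p| = 1 GeV)  = {q2_at_1gev:.2f} GeV^2")
```

With these inputs the restriction on the pion momentum confines the lattice data to roughly the upper third of the physical range 0 ≤ q 2 ≤ \(q^2_{\max }\).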

Figure 5.21 shows a compilation of lattice data for the form factors as a function of q 2 together with the curves which represent the extrapolations to q 2 = 0. The problem of the model dependence introduced by the extrapolation to small momentum transfer can be avoided by combining form factors from lattice simulations with the decay rate measured in restricted intervals of q 2, which overlap with the range of momentum transfers that are directly accessible in simulations. Such a procedure has been performed by the CLEO Collaboration [231]. The result for |V ub| obtained in this way is somewhat smaller than that of the standard method based on form factor extrapolations, but the uncertainties are still quite large. For the actual estimates of |V ub|, the reader may consult the original papers.

Fig. 5.21 Form factors f + (upper data set) and f 0 for B → πℓν decays (taken from Ref. [224]). The data are taken from Refs. [225] (UKQCD), [226] (Abada et al.), [227] (El-Khadra et al.), [228] (JLQCD) and [229] (FNAL04). The unquenched results by HPQCD have been updated [230]

Semi-leptonic heavy-to-heavy decays such as \(\bar {B}\to (D,D^*)\ell \bar \nu _\ell \) offer a way to determine |V cb|. In this case it is convenient to use the four-velocities of the two mesons as the kinematical variables instead of the four-momenta. The decay amplitudes are then parameterized in terms of six form factors, i.e.

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle \frac{\left\langle D(v^\prime)\left| (\bar{c}\gamma^\mu b) \right|B(v)\right\rangle}{\sqrt{m_{\mathrm{B}}m_{\mathrm{D}}}} = (v+v^\prime)^\mu h_{+}(\omega)+(v-v^\prime)^\mu h_{-}(\omega), \\ &\displaystyle &\displaystyle \frac{\left\langle D^*(v^\prime,\epsilon)\left| (\bar{c}\gamma^\mu b) \right|B(v)\right\rangle}{\sqrt{m_{\mathrm{B}}m_{\mathrm{D}^*}}} = {{\mathrm{i}} }\epsilon^{\mu\nu\alpha\beta}\epsilon_\nu^* v_\alpha^\prime v_\beta h_{\mathrm{V}}(\omega), \\ &\displaystyle &\displaystyle \frac{\left\langle D^*(v^\prime,\epsilon)\left| (\bar{c}\gamma^\mu\gamma_5 b) \right|B(v)\right\rangle}{\sqrt{m_{\mathrm{B}}m_{\mathrm{D}^*}}} = (\omega+1)\epsilon^{\ast\mu}h_{\mathrm{A}_1}(\omega) -\epsilon^{\ast}\cdot{v} \left[v^\mu h_{\mathrm{A}_2}(\omega) +v^{\prime\mu}h_{\mathrm{A}_3}(\omega) \right], \end{array} \end{aligned} $$

where \(\omega =v\cdot v^\prime \). In the limit of infinite heavy quark mass, four out of these six form factors can be replaced by a single, universal form factor, ξ(ω), which is called the Isgur-Wise function [232]

$$\displaystyle \begin{aligned} m_b,\,m_c\to\infty \quad \Rightarrow\quad h_{+}(\omega)=h_{\mathrm{A}_1}(\omega)=h_{\mathrm{A}_3}(\omega)= h_{\mathrm{V}}(\omega)=\xi(\omega), \end{aligned} $$

while \(h_{-}(\omega )\) and \(h_{\mathrm {A}_2}(\omega )\) vanish as m b, m c become infinitely heavy. Outside the exact heavy-quark limit, the relation between the Isgur-Wise function and the form factors is modified. For instance,

$$\displaystyle \begin{aligned} h_{+}(\omega) = \left(1+\beta_{+}(\omega)+\gamma_{+}(\omega)\right) \xi(\omega), \end{aligned} $$

where β +, γ + parameterize radiative corrections and corrections arising from operators of higher dimension, which are suppressed by additional inverse powers of the heavy quark mass. Similar relations hold for \(h_{\mathrm {A}_1}, h_{\mathrm {A}_3}\) and h V. Another important result, known as Luke’s Theorem [233], states that at zero recoil, \(v=v^\prime \), i.e. ω = 1, the leading corrections to the form factors h + and \(h_{\mathrm {A}_1}\) are quadratic in the inverse heavy quark mass.

With this setup one may devise a strategy to determine |V cb| by combining the experimentally determined decay rate with lattice calculations of the form factors. The differential decay rate for \(\bar {B}\to D^*\ell \bar \nu _\ell \) in the limit of zero recoil reads

$$\displaystyle \begin{aligned} \lim_{\omega\to1} \frac{1}{\sqrt{\omega^2-1}} \frac{{\mathrm{d}}\Gamma(B\to D^*\ell\nu)}{{\mathrm{d}}\omega}= |V_{cb}|{}^2\frac{G_{\mathrm{F}}^2}{4\pi^3} (m_{\mathrm{B}}-m_{\mathrm{D}^*})^2 m_{\mathrm{D}^*}^3 [h_{\mathrm{A}_1}(1)]^2, \end{aligned} $$

which, owing to Luke’s Theorem, receives corrections of order \(1/m_c^2\) only. For ω > 1 the single axial form factor \(h_{\mathrm {A}_1}\) must be replaced by a linear combination of several form factors. Thus, the theoretical uncertainties appear to be controlled best at zero recoil. Since the rate is suppressed near ω = 1, the measured decay rate must be extrapolated to that value to determine |V cb|. Most of the published lattice calculations of the form factors and the Isgur-Wise function [234,235,236,237,238] are therefore focused on the determination of the slope of ξ(ω) at ω = 1. The measured decay rate can then be extrapolated to zero recoil using a particular parameterization of ξ(ω), with its slope constrained via the lattice calculation. After taking radiative and power corrections into account, a value for |V cb| can be extracted.

A different but related strategy is to compute the form factors h +(1) and \(h_{\mathrm {A}_1}(1)\) directly via suitably chosen double ratios of hadronic matrix elements in which many systematic effects can be expected to cancel [239, 240]. Using the “Fermilab approach” for the heavy quarks in the quenched approximation, the authors of Ref. [240] find

$$\displaystyle \begin{aligned} h_{\mathrm{A}_1}(1) = 0.913^{+0.024\,+0.017}_{-0.017\,-0.030}, \end{aligned} $$

where the first error is statistical, while the second represents an estimate of various systematic uncertainties added in quadrature. Again, this result can be combined with the experimental decay rate to determine |V cb|. More details can be found in [240].
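The arithmetic behind such a determination can be sketched in a few lines of Python. The sketch combines the quoted statistical and systematic errors in quadrature and divides a measured product |V cb| h A1(1) by the lattice form factor. The numerical value of that product is a hypothetical placeholder, not an experimental result, and radiative corrections are ignored.

```python
import math

def combine_in_quadrature(*errors):
    """Quadrature sum of error components that are treated as independent."""
    return math.sqrt(sum(e * e for e in errors))

# Central value and asymmetric errors of h_A1(1) as quoted in the text.
h_a1 = 0.913
h_a1_err_up = combine_in_quadrature(0.024, 0.017)    # upper stat. and syst. errors
h_a1_err_down = combine_in_quadrature(0.017, 0.030)  # lower stat. and syst. errors

# HYPOTHETICAL input: the product |V_cb| * h_A1(1) that an extrapolation of the
# measured rate to zero recoil would provide. Not taken from any experiment.
measured_product = 0.0360

v_cb = measured_product / h_a1
# A larger form factor implies a smaller |V_cb|, so the upper error on h_A1
# maps onto the lower error on |V_cb| and vice versa (first-order propagation).
v_cb_err_down = v_cb * h_a1_err_up / h_a1
v_cb_err_up = v_cb * h_a1_err_down / h_a1

print(f"h_A1(1) = {h_a1:.3f} +{h_a1_err_up:.3f} -{h_a1_err_down:.3f}")
print(f"|V_cb|  = {v_cb:.4f} +{v_cb_err_up:.4f} -{v_cb_err_down:.4f} (lattice error only)")
```

Only the uncertainty of the form factor is propagated here; a realistic analysis would also include the experimental error on the product.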

Most lattice studies of heavy-to-heavy semi-leptonic B-decays have been restricted to the quenched approximation. However, results for the form factors from dynamical simulations can be expected in the near future. Clearly, in order to have maximum impact on the determination of |V cb|, systematic effects arising from lattice artefacts and the formulation used to treat the heavy quark must be controlled to a high degree.

5.8 Concluding Remarks

In this article we have introduced the lattice approach to QCD and discussed a variety of applications, which range from hadron spectroscopy, confinement, quark masses and the running coupling, to spontaneous chiral symmetry breaking and hadronic matrix elements for flavour physics. This illustrates not only the versatility of the lattice method, but also indicates that lattice calculations have become ever more important for making quantitative predictions in the notoriously difficult sector of non-perturbative QCD. Still, a great number of other applications have not even been covered here, including nucleon structure functions and form factors, calculations at finite temperature and/or chemical potential, or detailed investigations of the QCD vacuum structure.

That lattice calculations have reached this standing is owed to the enormous progress which has been made in developing more efficient algorithms for dynamical fermions, better discretizations, as well as a number of new theoretical concepts such as non-perturbative renormalization. These developments, in conjunction with the availability of ever more powerful computers, shall allow for precise computations of many phenomenologically relevant quantities, which previously seemed virtually intractable.

5.9 Addendum: QCD on the Lattice

5.9.1 Introduction

Since the first edition of this article [241] the field of lattice QCD has undergone a huge transformation. While the actual methodology was well established at the time of writing (2007), few simulations employing dynamical quarks had produced results with controlled errors, having a direct impact on phenomenology and experiment. During the past ten years or so this has changed dramatically. Simulations with light dynamical quarks, whose masses correspond to the physical value of the pion mass, have become the state of the art, and the effects of dynamical strange and charm quarks are now routinely included as well. In fact, lattice calculations of certain observables have reached (or are aiming for) a level of precision where the effects of the breaking of isospin symmetry can no longer be ignored. This necessitates that lattice QCD must account not only for the effects of unequal u and d quark masses but also for corrections due to electromagnetism, owing to the different electric charges of up- and down-type quarks.

In this context it is interesting to quote a remark by Ken Wilson, made at the 1989 International Conference on Lattice Field Theory [242]: “I still believe that an extraordinary increase in computing power (\(10^{8}\) is I think not enough) and equally powerful algorithmic advances will be necessary before a full interaction with experiment takes place.” Given that, in 1989, the most powerful supercomputers could sustain 10 GFlops (i.e. \(10^{10}\) floating point operations per second), Wilson’s estimate was tantamount to requiring ExaFlops capabilities (\(10^{18}\) Flops) for lattice QCD to make an impact, a performance figure that has only been reached very recently by fewer than a handful of machines. The enormous progress that the field of lattice QCD has already seen over the past decade proves that Wilson’s view was far too pessimistic (see Footnote 21). For instance, results from lattice calculations for the decay constants and form factors of mesons and baryons containing heavy quarks are vital input for global analyses of observables in flavour physics, designed to constrain the elements of the Cabibbo–Kobayashi–Maskawa matrix. Furthermore, lattice QCD yields precise values for the masses of the light (u, d, s) quarks [244].

An impressive testimony to the importance of lattice QCD for the entire field of particle physics is the regular report provided by the Flavour Lattice Averaging Group (FLAG). Since its inception in 2007, FLAG has been charting the progress in lattice QCD, by collecting results for a range of phenomenologically relevant quantities. Taking inspiration from the Particle Data Group, FLAG assesses the quality of individual calculations and produces world averages by combining those results that satisfy a defined set of requirements regarding the overall control over systematic effects. Four editions of the FLAG report have appeared so far, published in 2010 [245], 2013 [246], 2016 [247] and 2019 [248]. In fact, the current status of lattice calculations of many observables that were reviewed in the first edition of this article can be found in these comprehensive reports.

This short review is organized as follows. In Sects. 5.9.2 and 5.9.3 we give an update of lattice calculations applied to hadron spectroscopy, weak hadronic matrix elements and the determination of Standard Model parameters such as quark masses and the strong coupling constant. These quantities were covered extensively in the original edition of [241]. Then, in Sect. 5.9.4 we extend the discussion to the determination of quantities that describe structural and other properties of the nucleon, such as form factors and the axial charge. Finally, in Sect. 5.9.5 we discuss lattice calculations of the hadronic contributions to the muon anomalous magnetic moment, which is a key quantity to study possible deviations from the Standard Model. The review concludes with a few remarks on the progress achieved over the past decade and an outlook for future calculations.

5.9.2 Hadron Spectroscopy

The calculation of the light hadron spectrum, i.e. the masses of the lowest-lying mesons and baryons, has long been regarded as a benchmark for lattice QCD. In the quenched approximation, i.e. in the absence of dynamical quarks, a significant deviation between the calculated spectrum and experiment at the level of 10–15% was observed. When the light hadron spectrum could eventually be accurately reproduced within the overall uncertainty after the inclusion of light dynamical quarks [249,250,251,252] (see Fig. 5.22), this was hailed as a major success of lattice QCD. Thanks to these milestone results, the credibility of lattice calculations was firmly established throughout the particle and hadron physics communities.

Fig. 5.22. The spectrum of the lowest-lying hadrons as computed in Ref. [250], to be compared to Fig. 5.5 of the original review [241]

Calculations of the light hadron spectrum have since been further refined, by taking the effects of isospin breaking into account. Strong isospin breaking arises from the mass splitting between the u and d quarks, m u ≠ m d. Since the electric charges of u and d quarks differ as well, electromagnetic interactions are another source of isospin breaking. The formulation of QED on a lattice of finite volume poses considerable technical challenges since the photon is massless. There are several strategies to address the problem of the associated zero mode, and we refer the reader to recent reviews of the subject [253,254,255], which also serve as a guide to the literature.

After the inclusion of strong and electromagnetic isospin breaking effects, it became possible to perform another benchmark calculation, namely the accurate determination of the neutron-proton mass difference, as well as the mass splittings of other baryonic iso-multiplets [256,257,258,259]. The ability to determine isospin breaking effects arising from QED was also instrumental for calculations of the electromagnetic mass splittings of pions and kaons [260,261,262,263,264,265], which can be used to study violations of Dashen’s theorem [266]. The latter states that the electromagnetic self-energies of the charged pions and kaons are identical, while those of their neutral counterparts vanish. More details are found in section 3.1.1 of the FLAG report [247].

Another recent focus of lattice spectroscopy has been the determination of the excitation spectrum and the properties of hadronic resonances. This is a major refinement of previous calculations in which the masses of resonances (the simplest being the ρ-meson) were extracted naively from the exponential decay of the vector correlation function, thereby ignoring the fact that resonances are characterized both by a mass and a width. The general framework for the study of resonance properties in lattice QCD was developed by Lüscher already in the 1980s and 1990s [267,268,269,270], and it is only now that the potential of this elegant and powerful formalism can be fully exploited. The key idea that underlies the Lüscher method is the realization that computing the energy levels of multi-particle states in a finite volume gives access to the scattering phase shifts in infinite volume, provided that the spectrum (including excited states) can be determined sufficiently well for a range of kinematical situations. The latter are typically determined by the lattice volume and/or the total momentum of the multi-particle system in question.

To be more specific, let us consider the simplest resonance, the ρ-meson, whose properties can be accessed in p-wave ππ scattering. For energies below the inelastic threshold, the Lüscher condition reads

$$\displaystyle \begin{aligned} \phi(q) +\delta_1(k) = 0\quad \mbox{mod}\;\pi,\qquad q=\frac{kL}{2\pi}, \end{aligned} $$

where ϕ(q) is a known kinematic function of the scattering momentum scaled in units of the box size, q = kL∕2π, and δ 1 is the scattering phase shift. The scattering momentum k is determined from the nth energy level ω n of the two-pion system in a finite volume, according to

$$\displaystyle \begin{aligned} \omega_n=2\sqrt{m_\pi^2+k^2}, \end{aligned} $$

where m π is the pion mass. Figure 5.23 shows an example of a calculation of the p-wave scattering phase shift as a function of the centre-of-mass energy [271].
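The kinematic step can be illustrated with a minimal Python sketch. It assumes that the finite-volume level is the total energy of the two-pion state in the centre-of-mass frame, so that it equals 2√(m π² + k²). The energy level and box size below are hypothetical inputs; only the pion mass of 280 MeV is taken from the calculation shown in Fig. 5.23.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm

def scattering_momentum(energy, m_pi):
    """Pion momentum k from the two-pion centre-of-mass energy.

    Assumes energy = 2 * sqrt(m_pi**2 + k**2). This sketch only handles
    levels at or above the two-pion threshold (k**2 >= 0).
    """
    k_squared = 0.25 * energy**2 - m_pi**2
    if k_squared < 0.0:
        raise ValueError("energy lies below the two-pion threshold")
    return math.sqrt(k_squared)

def scaled_momentum(k_mev, box_length_fm):
    """Dimensionless q = k L / (2 pi), with k in MeV and L in fm."""
    return k_mev * box_length_fm / (2.0 * math.pi * HBARC_MEV_FM)

m_pi = 280.0    # MeV, pion mass of the calculation shown in Fig. 5.23
energy = 800.0  # MeV, HYPOTHETICAL finite-volume energy level
box = 3.0       # fm, HYPOTHETICAL spatial box size

k = scattering_momentum(energy, m_pi)
q = scaled_momentum(k, box)
print(f"k = {k:.1f} MeV, q = {q:.3f}")
```

Given q, the phase shift follows from the Lüscher condition once the kinematic function ϕ(q) has been evaluated; that function is not implemented here.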

Fig. 5.23. The p-wave scattering phase shift of the ρ-meson, computed for m π = 280 MeV as a function of the centre-of-mass energy [271]. Data obtained for two different values of the lattice spacing (open and filled grey symbols) are shown. The solid line is obtained from a fit to a Breit-Wigner ansatz for the resonance
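A Breit-Wigner ansatz of the kind mentioned in the caption can be sketched as follows. The parameterization used here, tan δ 1 = g² k³ ∕ [6π √s (m ρ² − s)], is one common choice in lattice studies of the ρ and is assumed for illustration; it is not necessarily the exact form fitted in Ref. [271]. The resonance mass and coupling are hypothetical values.

```python
import math

def breit_wigner_phase_shift(sqrt_s, m_pi, m_rho, g):
    """P-wave phase shift in degrees for a Breit-Wigner resonance.

    Uses tan(delta_1) = g^2 k^3 / (6 pi sqrt(s) (m_rho^2 - s)), where k is
    the pion momentum in the centre-of-mass frame.
    """
    k = math.sqrt(0.25 * sqrt_s**2 - m_pi**2)
    numerator = g**2 * k**3
    denominator = 6.0 * math.pi * sqrt_s * (m_rho**2 - sqrt_s**2)
    # atan2 keeps the phase continuous as it passes through 90 degrees
    # at sqrt(s) = m_rho, where the denominator changes sign.
    return math.degrees(math.atan2(numerator, denominator))

M_PI = 280.0   # MeV, pion mass quoted in the caption
M_RHO = 800.0  # MeV, HYPOTHETICAL resonance mass
G = 6.0        # HYPOTHETICAL rho-pi-pi coupling

for sqrt_s in (650.0, 750.0, 800.0, 850.0, 950.0):
    delta = breit_wigner_phase_shift(sqrt_s, M_PI, M_RHO, G)
    print(f"sqrt(s) = {sqrt_s:6.1f} MeV  delta_1 = {delta:6.1f} deg")
```

The phase shift rises through 90 degrees at the resonance mass, and the steepness of the rise is controlled by the coupling, which in turn fixes the width.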

A crucial ingredient for the reliable determination of not just the energy level of the ground state but also the excitation spectrum is the use of correlator matrices computed using a suitable basis of interpolating operators (see Section 5.3 in Ref. [241]). The diagonalization of the correlator matrix can be achieved by solving a generalized eigenvalue problem from which the energy levels in a given channel can be determined [272,273,274]. The sometimes arduous task of constructing efficient interpolators for multi-particle states has been helped enormously by practical methods to compute “all-to-all” quark propagators [275] and, in particular, the so-called “distillation” technique [276, 277]. With these new developments it has been possible to perform lattice investigations of ππ scattering and the ρ resonance [278,279,280,281,282,283,284,285,286,287,288,289,290,291], as well as determinations of Kπ [292, 293] and KK scattering lengths [294, 295]. The formalism has also been used to study meson-baryon [296,297,298,299,300] and baryon-baryon [301, 302] interactions.
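The generalized eigenvalue problem can be made concrete with a toy example. The Python sketch below builds a synthetic 2×2 correlator matrix from two states with hypothetical energies and overlaps (in lattice units), solves C(t) v = λ C(t 0) v via the characteristic polynomial, and reads off the energies from λ = exp[−E (t − t 0)]. Real analyses use larger operator bases, noisy data and numerical eigensolvers; the closed-form 2×2 solution is chosen here only to keep the example dependency-free.

```python
import math

def correlator_matrix(t, energies, overlaps):
    """Return C_ij(t) = sum_n Z_n[i] * Z_n[j] * exp(-E_n * t)."""
    size = len(overlaps[0])
    return [
        [
            sum(z[i] * z[j] * math.exp(-e * t) for e, z in zip(energies, overlaps))
            for j in range(size)
        ]
        for i in range(size)
    ]

def gevp_eigenvalues_2x2(c_t, c_t0):
    """Solve det(C(t) - lam * C(t0)) = 0 for 2x2 matrices, largest root first."""
    a = c_t0[0][0] * c_t0[1][1] - c_t0[0][1] * c_t0[1][0]
    b = -(
        c_t[0][0] * c_t0[1][1] + c_t[1][1] * c_t0[0][0]
        - c_t[0][1] * c_t0[1][0] - c_t[1][0] * c_t0[0][1]
    )
    c = c_t[0][0] * c_t[1][1] - c_t[0][1] * c_t[1][0]
    disc = math.sqrt(b * b - 4.0 * a * c)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    return sorted(roots, reverse=True)

# Synthetic input (HYPOTHETICAL numbers): two states seen by two operators.
true_energies = [0.5, 0.9]
overlaps = [[1.0, 0.6], [0.4, -0.8]]  # overlaps[n][i]: state n with operator i
t0, t = 2, 3

lams = gevp_eigenvalues_2x2(
    correlator_matrix(t, true_energies, overlaps),
    correlator_matrix(t0, true_energies, overlaps),
)
# The largest eigenvalue belongs to the lowest energy level.
energies = [-math.log(lam) / (t - t0) for lam in lams]
print(energies)
```

Because the synthetic correlator contains exactly as many states as operators, the input energies are recovered up to rounding. With real data, higher states contaminate the eigenvalues and the extracted levels must be monitored as t and t 0 are varied.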

While the original Lüscher formalism was derived for the case of elastic two-particle scattering, it has now been generalized to coupled-channel systems [303,304,305,306,307], including the treatment of three-particle thresholds [308,309,310,311,312,313,314,315]. It also opens the possibility to study weak non-leptonic kaon decays [316] and compute form factors for timelike momentum transfers [317,318,319,320].

Moreover, the experimental discovery of new charmonium-like resonances, commonly referred to as the X, Y  and Z states, has kindled a new interest in hadron spectroscopy. A distinctive feature of the new resonances is their closeness to particle thresholds, and efforts are underway to gain a detailed understanding of the resonance structure in the charm sector. Using the formalism described above, there have been many calculations of a variety of charmonium-like resonances in lattice QCD. In view of the vast literature, we refer the reader to several recent reviews of the subject [321,322,323].

5.9.3 Parameters of the Standard Model

The Standard Model (SM) contains 19 parameters (excluding the neutrino sector) whose values are not predicted by the theory itself but must instead be fixed using experimental input. In many cases the relations between experimentally accessible observables and SM parameters involve quantities that encode the effects of the strong interactions. A well-known example is the kaon B-parameter B K that enters the relation between the quantity 𝜖 K, which is a measure of indirect CP violation, and a particular combination of Cabibbo–Kobayashi–Maskawa (CKM) matrix elements V td, V ts, i.e.

$$\displaystyle \begin{aligned} \epsilon_K\propto \hat{B}_K\,{\mathrm{Im}}\,(V_{td}\, V_{ts}^\ast). \end{aligned} $$

While 𝜖 K can be determined experimentally from a ratio of decay amplitudes of long- and short-lived K-mesons, \(K_{L,S}\to (\pi\pi)_{I=0}\), the parameter \(\hat {B}_K\) must be extracted from the hadronic matrix element of a four-quark operator between \(K^0\) and \(\bar {K}^0\) states. Obviously, such a calculation must be performed in the non-perturbative regime of QCD since it involves typical hadronic scales.

Other CKM matrix elements, such as V us, V ub and V cb are related to weak processes involving kaons, D- and B-mesons, which are described by a variety of leptonic decay constants (and their ratios), form factors of semi-leptonic meson and baryon decays, as well as the B-parameters that encode strong interaction contributions to \(B^0-\bar {B}^0\) and \(B_s^0-\bar {B}_s^0\) mixing. All these quantities have been studied in lattice QCD for many years, and increasingly precise estimates with controlled systematic errors have appeared over the past decade. They have been instrumental for recent analyses of the unitarity of the CKM matrix [324,325,326,327].

Similar considerations apply to SM parameters such as the strong coupling constant α s and the masses of the quarks. While the asymptotic scaling behaviour of α s gives rise to the dimensionful Λ-parameter that encodes the intrinsic scale of QCD, the quark masses are external parameters. Providing the link between experimentally accessible quantities and quark masses, as well as expressing the Λ-parameter in units of some measurable low-energy quantity has been a primary task for lattice QCD. Lattice calculations have also been instrumental for determining the coupling constants of effective descriptions of QCD, such as the low-energy constants of Chiral Perturbation Theory.

The importance of accurate, model-independent determinations of SM parameters and input quantities for flavour physics has led to the foundation of the Flavour Lattice Averaging Group (FLAG). Updates of the FLAG report have appeared at regular intervals since the publication of its first edition in 2010 [245]. As part of its mission, FLAG issues global estimates and averages of lattice results, provided that they satisfy a set of defined quality criteria. FLAG estimates are quoted separately according to the sea quark content of the calculations that enter the global analyses, i.e. whether they have been obtained with a degenerate doublet of u, d quarks (N f = 2) or with an additional dynamical strange (N f = 2 + 1) and charm quark (N f = 2 + 1 + 1). The current status of lattice QCD calculations of quark masses, the strong coupling, decay constants, form factors, mixing parameters and low-energy constants is summarized in Tables 1 and 2 of the 2016 FLAG report [247]. The FLAG webpage (see Footnote 22) contains additional updates. Below we comment on the current status of a few selected quantities.

Quark Masses

According to FLAG, the strange quark mass is known to 1% precision, while the accuracy in the determination of the average u and d quark mass, \(\hat {m}\equiv \frac {1}{2}(m_u+m_d)\), varies between 1 and 5%, depending on the sea quark content [328,329,330,331,332,333,334,335,336,337,338,339,340]. Thanks to the recent progress in including the effects of isospin breaking in lattice QCD calculations, estimates for the masses of the individual u and d quarks could also be obtained, typically with 2–5% precision [261, 262, 264, 330]. Furthermore, the masses of the heavy quarks have been determined with excellent precision [328, 330,331,332, 335, 337, 341,342,343,344,345,346,347,348].

Running Coupling

A milestone was achieved by the ALPHA collaboration, who published [349] an estimate for \(\alpha _s(M_Z^2)\) obtained by tracing the scale evolution of the strong coupling non-perturbatively over several orders of magnitude into an energy range where the application of perturbation theory can be considered safe (at least as far as the quoted precision is concerned). Their main result is the determination of the Λ-parameter in three-flavour QCD, i.e. \(\Lambda _{{\overline {{\mathrm {MS}}}}}^{(3)}=341(12)\,\text{MeV}\), which can be matched to the Λ-parameter in the five-flavour theory using perturbation theory, giving \(\Lambda _{{\overline {{\mathrm {MS}}}}}^{(5)}=215(10)(3)\,\text{MeV}\). Finally, this is translated into the result for the strong coupling [349]:

$$\displaystyle \begin{aligned} \alpha_s^{{\overline{{\mathrm{MS}}}}}(M_Z^2) = 0.11852(84). \end{aligned} $$

The quoted error of 0.00084 is about 24% smaller than that of the 2016 PDG estimate of α s = 0.1181(11) [244]. The latter includes lattice results from Refs. [331, 335, 350,351,352,353,354].
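To illustrate how a value of the Λ-parameter translates into the coupling at the Z mass, the following Python sketch evaluates the standard two-loop asymptotic expansion of the running coupling in the five-flavour theory. This is a deliberately truncated approximation and not the higher-order procedure used in Ref. [349], so it should only land within a few percent of the quoted result. The value of M Z is the usual PDG number and is not taken from the text.

```python
import math

def alpha_s_two_loop(mu, lam, n_f):
    """Two-loop asymptotic formula for the MS-bar coupling at the scale mu.

    mu and lam (the Lambda-parameter for n_f flavours) must share the same
    units. Higher-loop terms are dropped, so this is only an approximation.
    """
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    b1 = (153.0 - 19.0 * n_f) / (24.0 * math.pi**2)
    t = math.log(mu**2 / lam**2)
    return (1.0 / (b0 * t)) * (1.0 - (b1 / b0**2) * math.log(t) / t)

M_Z = 91187.6        # MeV, Z boson mass (assumed input, not from the text)
LAMBDA_5 = 215.0     # MeV, five-flavour Lambda-parameter quoted in the text
LAMBDA_5_ERR = 10.0  # MeV, first quoted error

alpha = alpha_s_two_loop(M_Z, LAMBDA_5, 5)
alpha_hi = alpha_s_two_loop(M_Z, LAMBDA_5 + LAMBDA_5_ERR, 5)
alpha_lo = alpha_s_two_loop(M_Z, LAMBDA_5 - LAMBDA_5_ERR, 5)
print(f"alpha_s(M_Z) at two loops: {alpha:.4f} (+{alpha_hi - alpha:.4f} -{alpha - alpha_lo:.4f})")
```

The sketch also shows why a few-percent uncertainty on Λ corresponds to a sub-percent uncertainty on α s at the Z mass: the coupling depends on Λ only logarithmically.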

Kaon Weak Matrix Elements

The kaon B-parameter B K is now known with an overall accuracy of 1.3% [336, 355,356,357,358,359]. Moreover, the calculations of matrix elements relevant for \(K^0-\bar {K}^0\) mixing have been extended to include operators that arise in extensions of the Standard Model [355, 358,359,360,361,362,363].

Lattice QCD results for kaon leptonic decay constants (more precisely: the ratio \(f_{K^+}/f_{\pi ^+}\)) and the form factor f +(0) describing semi-leptonic K → πℓν decays have now reached a level of precision that enables a competitive and model-independent determination of V us (see Sect. 5.7.1 of the original review article). Moreover, it is possible to test the unitarity of the first row in the CKM matrix, i.e. the relation

$$\displaystyle \begin{aligned} |V_{ud}|{}^2+|V_{us}|{}^2+|V_{ub}|{}^2 = 1, \end{aligned} $$

by combining experimental information with lattice results for f +(0) and \(f_{K^+}/f_{\pi ^+}\). Neglecting the contribution from \(|V_{ub}|^2 \approx 1.7\cdot 10^{-5}\), one finds that \(|V_{ud}|^2+|V_{us}|^2\) can be determined with a total precision at the percent level, by combining the FLAG estimates (see Footnote 23) for f +(0) and \(f_{K^+}/f_{\pi ^+}\) with the experimentally accessible combinations |V us|f +(0) = 0.2165(4) and \(|V_{us}/V_{ud}|f_{K^+}/f_{\pi ^+}=0.2760(4)\) [244, 364]. In QCD with dynamical light, strange and charm quarks (N f = 2 + 1 + 1) the result is \(|V_{ud}|^2+|V_{us}|^2 = 0.9797(74)\), which signals a slight tension of 2.7 standard deviations with the Standard Model. The precision of the unitarity test can be sharpened considerably by replacing |V ud| with the value extracted from superallowed nuclear β-decays, i.e. |V ud| = 0.97417(21) [365]. It is then sufficient to provide one additional constraint from lattice QCD, either in the form of f +(0) or the ratio \(f_{K^+}/f_{\pi ^+}\). Inserting the lattice result for f +(0) yields \(|V_{ud}|^2+|V_{us}|^2 = 0.99884(53)\), which again differs from unitarity by about 2σ. Using instead the lattice result for \(f_{K^+}/f_{\pi ^+}\) implies \(|V_{ud}|^2+|V_{us}|^2 = 0.99986(46)\). Thus, first-row unitarity can be probed with permil-level precision [247].
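The logic of the unitarity test can be traced with a short Python sketch. The two experimental combinations and the quoted sums with their errors are taken from the text above. The lattice inputs for f +(0) and the decay constant ratio are assumed, illustrative values of the size typical for N f = 2 + 1 + 1 averages; they are not quoted in this section and should be replaced by the current FLAG numbers in any real application.

```python
def first_row_sum(vus_fplus, vus_vud_ratio_f, f_plus, fk_over_fpi):
    """|V_ud|^2 + |V_us|^2 from two experimental combinations plus lattice input."""
    v_us = vus_fplus / f_plus                     # from K -> pi l nu decays
    v_us_over_v_ud = vus_vud_ratio_f / fk_over_fpi  # from leptonic K and pi decays
    v_ud = v_us / v_us_over_v_ud
    return v_ud**2 + v_us**2

def tension_in_sigma(value, error, expected=1.0):
    """Deviation from the expected value in units of the quoted error."""
    return abs(value - expected) / error

# Experimental combinations quoted in the text.
VUS_FPLUS = 0.2165
VUS_VUD_FK_FPI = 0.2760

# ASSUMED lattice inputs (illustrative only, not taken from the text).
F_PLUS = 0.9704
FK_OVER_FPI = 1.193

total = first_row_sum(VUS_FPLUS, VUS_VUD_FK_FPI, F_PLUS, FK_OVER_FPI)
print(f"|V_ud|^2 + |V_us|^2 = {total:.4f} (illustrative lattice inputs)")

# Tension of the sums quoted in the text with first-row unitarity.
for label, value, error in [
    ("lattice only", 0.9797, 0.0074),
    ("beta decay + f_+(0)", 0.99884, 0.00053),
    ("beta decay + f_K/f_pi", 0.99986, 0.00046),
]:
    print(f"{label:>22}: {tension_in_sigma(value, error):.1f} sigma")
```

The neglected |V ub|² term is far smaller than any of the quoted errors, which is why it can be dropped at this level of precision.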

Heavy-Light Decay Constants and Form Factors

The treatment of heavy quarks on the lattice presents additional significant challenges: since the mass of the charm quark is close to typical values of the inverse lattice spacing, which acts as the ultraviolet cutoff, lattice results are prone to suffering from large discretisation errors. Moreover, the mass of the bottom quark exceeds currently accessible values of \(a^{-1}\), and specially designed methods are required for a consistent treatment. This has been discussed extensively in Sect. 5.7.2 of the original review.

The overall precision of lattice estimates for weak hadronic matrix elements involving charm and bottom quarks has vastly improved over the past decade. As shown in Table 2 of FLAG 2016 [247], the leptonic decay constants of the B and B s mesons are now known at the level of 2%, while ratios such as \(f_{B_s}/f_B\) have been determined with even better accuracy [347, 366,367,368,369,370,371,372,373]. Since the 2016 edition of the FLAG report, new results obtained with N f = 2 + 1 + 1 flavours of dynamical quarks [343, 374, 375] have pushed the overall precision to the sub-percent level, which is an impressive achievement. The individual B-parameters \(\hat {B}_B\) and \(\hat {B}_{B_s}\), their ratios and their combinations with the leptonic decay constants are likewise known with overall errors at the percent level [347, 370, 376, 377].

Results for form factors describing semi-leptonic decays of hadrons containing b-quarks, such as \(B\to (D, D^{*})\ell \nu \), or even \(\Lambda _b\to p\ell \nu \), have reached a level of precision that is sufficient for competitive determinations of the CKM matrix elements V cb and V ub from exclusive processes. An extensive discussion is presented in the web update of the FLAG report.

5.9.4 Nucleon Matrix Elements

The understanding of the internal structure of the nucleon in terms of the fundamental interactions between its constituents, the quarks and gluons, has become a major activity within the field of lattice QCD. Structural information is encoded in quantities such as form factors, structure functions and (generalized) parton distribution functions (PDFs). An open problem in this context is the decomposition of the proton’s spin in terms of the spins of quarks and gluons, as well as their angular momentum [378, 379]. Another important issue is the so-called “proton radius puzzle” [380], which arises from the observed discrepancy between the proton radius extracted from the Lamb shift in muonic hydrogen [381, 382] and the more traditional determinations from electron-proton scattering [383] or the Lamb shift in electronic hydrogen [384]. Accurate knowledge of the electromagnetic form factors of the proton is indispensable in order to resolve, or corroborate, this puzzle.

The determination of quantities such as nucleon form factors in lattice QCD proceeds by calculating the corresponding hadronic matrix elements between nucleon initial and final states. A strong motivation for computing such quantities is provided by the fact that fundamental interactions are often probed in scattering experiments involving nuclear targets. For instance, probing the neutrino sector requires accurate knowledge of the scattering cross sections of neutrinos with nuclear targets. Similar considerations apply to the search for dark matter candidates. Therefore, precise determinations of the corresponding nucleon matrix elements are indispensable for exploring the limits of the SM.

The past decade has seen a huge rise in the number of publications describing lattice calculations of nucleon matrix elements. Quantities that have been studied include

Recent reviews, presented at the annual conference on lattice field theory, can be found in Refs. [445,446,447]. Some results on nucleon form factors and other matrix elements are reviewed in section 3.2.5 of [448], and a dedicated chapter has been prepared for the 2019 edition of the FLAG report. In addition, there has been a community effort in the form of a white paper [449] in which lattice results are used to reduce the overall uncertainties in polarized and unpolarized proton PDFs and their moments.

The relevant nucleon hadronic matrix elements are extracted from suitable three-point correlation functions of quark bilinears between interpolating operators representing the initial and final-state nucleons. Examples of the corresponding diagrams, with the initial-state nucleon placed at Euclidean time t = 0 (the source), the final-state nucleon at time t s (the sink) and the operator insertion at time t, are shown in Fig. 5.24. In addition to the quark-connected diagram, in which the operator is inserted on a valence quark line, there are also quark-disconnected diagrams in which the operator probes the quark sea. The latter class of diagrams must be computed to determine, for instance, iso-scalar quantities, the strangeness form factors and the σ-terms.

Fig. 5.24. Quark-connected (left) and disconnected (right) diagrams representing the interaction of the vector current with the nucleon

Precise determinations of nucleon matrix elements with controlled statistical and systematic errors are particularly challenging. This is a consequence of the fact that the noise-to-signal ratio in three-point correlation functions corresponding to the diagrams in Fig. 5.24 grows exponentially with the source-sink separation, in proportion to \(\exp \{(m_N-\frac {3}{2}m_\pi )t_s\}\), where m N and m π denote the nucleon and pion masses, respectively, and t s is the source-sink separation. Techniques designed to enhance the statistical signal at affordable numerical cost have been developed and applied, including the truncated solver method [450] and “all-mode-averaging” [451]. Furthermore, a technique to achieve an exponential error reduction via domain decomposition and multi-level integration has been proposed and tested in [452, 453]. So far, it has not been employed in actual calculations of nucleon matrix elements with dynamical quarks.
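The severity of this noise problem is easily quantified. The Python sketch below evaluates the growth factor exp{(m N − 3m π∕2) t s} for approximate physical masses, which are assumed round values and not taken from the text, with the source-sink separation given in fm.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm

def noise_to_signal_growth(t_s_fm, m_nucleon=939.0, m_pion=138.0):
    """Growth factor exp{(m_N - 1.5 * m_pi) * t_s} of the noise-to-signal ratio.

    Masses are in MeV (approximate physical values by default, an assumption
    of this sketch) and the source-sink separation t_s is in fm.
    """
    exponent = (m_nucleon - 1.5 * m_pion) * t_s_fm / HBARC_MEV_FM
    return math.exp(exponent)

for t_s in (0.5, 1.0, 1.5, 2.0):
    print(f"t_s = {t_s:.1f} fm  growth factor = {noise_to_signal_growth(t_s):10.1f}")

# Each additional 0.5 fm costs a fixed multiplicative factor in noise-to-signal,
# i.e. the square of that factor in statistics to keep the error constant.
step = noise_to_signal_growth(1.5) / noise_to_signal_growth(1.0)
print(f"factor per 0.5 fm: {step:.1f}, required increase in statistics: {step**2:.0f}")
```

At heavier pion masses the exponent is smaller, which is one reason why early calculations away from the physical point were statistically less demanding.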

Quark-disconnected diagrams of the type shown on the right of Fig. 5.24 are intrinsically even noisier than their quark-connected counterparts and require special techniques that balance statistical accuracy against numerical cost. Commonly applied variance reduction techniques for quark-disconnected diagrams include hierarchical probing [454, 455], the coherent source sequential propagator method [389, 456], low-mode averaging [457, 458], the hopping parameter expansion [450, 459,460,461] and partitioning/dilution [275, 462].

Despite these improvements, typical values of the source-sink separation t s for which the signal has not yet disappeared into the noise are limited to t s ≲ 1.5 fm. Since the correlation function is dominated by the ground state only for t, (t s − t) → ∞, it is then not guaranteed that the matrix element of interest can be extracted without incurring a bias from unsuppressed excited state contributions, as long as one cannot probe the region t s > 1.5 fm. Hence, in addition to “standard” systematic effects such as lattice artefacts or finite-volume effects, one must also ensure that the asymptotic regime of nucleon correlation functions has been correctly isolated. Indeed, controlling excited state effects has become perhaps the most important issue in current lattice calculations of nucleon matrix elements. The commonly used strategies include

  • fits to three-point correlation functions or suitably defined ratios of correlators including sub-leading contributions from excited states [393, 394];

  • calculations of three-point correlators summed over the operator insertion time t [463,464,465,466,467]. Contributions from excited states can be shown to be parametrically more strongly suppressed than in the standard case [468];

  • increasing the projection of nucleon interpolators onto the ground state [404, 469], as well as the construction of an operator basis for the variational method, which allows for the projection onto the approximate ground state [456, 469, 470].

The first two approaches proceed by fitting data obtained in a finite interval of source-sink separations t s to a function that describes the approach to the asymptotic behaviour. To be able to resolve the sub-leading contributions from excited states in such a fit obviously requires sufficiently precise input data.
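The stronger suppression of excited states in the summation method can be demonstrated with a toy model. The Python sketch below assumes a ratio of correlators with a single excited state, R(t, t s) = g + c [exp(−Δt) + exp(−Δ(t s − t))] + d exp(−Δ t s), with hypothetical parameters in lattice units. It compares the midpoint value of the ratio with the slope of the summed ratio in t s.

```python
import math

def ratio(t, t_s, g, c, d, gap):
    """Toy ratio R(t, t_s) with one excited state of energy gap `gap`."""
    return (
        g
        + c * (math.exp(-gap * t) + math.exp(-gap * (t_s - t)))
        + d * math.exp(-gap * t_s)
    )

def summed_ratio(t_s, **params):
    """Sum R(t, t_s) over the operator insertion times 1 <= t <= t_s - 1."""
    return sum(ratio(t, t_s, **params) for t in range(1, t_s))

# HYPOTHETICAL model parameters in lattice units, chosen for illustration only.
params = dict(g=1.27, c=-0.10, d=0.05, gap=0.5)
t_s = 10

# "Plateau" estimate: ratio at the midpoint, contaminated at O(exp(-gap * t_s / 2)).
plateau = ratio(t_s // 2, t_s, **params)

# Summation estimate: slope of the summed ratio in t_s, contaminated only at
# O(exp(-gap * t_s)) up to a factor linear in t_s.
slope = summed_ratio(t_s + 1, **params) - summed_ratio(t_s, **params)

print(f"true value      : {params['g']:.4f}")
print(f"midpoint ratio  : {plateau:.4f}")
print(f"summation slope : {slope:.4f}")
```

In this noise-free model the slope lies much closer to the true matrix element than the midpoint value at the same source-sink separation. With real data the gain is partly offset by the larger statistical error of the summed correlator.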

Another challenge for lattice calculations of nucleon matrix elements is the accurate description of the pion mass dependence. Although simulations at or near the physical pion mass are now routinely performed, the result at the physical point is often obtained via an extrapolation in the pion mass. The fit ansatz for the pion mass dependence is usually derived from chiral effective theory. However, the convergence properties of baryonic chiral perturbation theory are not as well understood as in the mesonic sector, and it is still unclear whether the predicted functional form provides a good description in the pion mass range over which it is applied. It is thus mandatory to gather sufficiently precise results at small enough pion mass, in order to control the systematic uncertainty associated with the chiral extrapolation.

Instead of performing a detailed survey of a variety of nucleon observables, we single out one particular quantity—the iso-vector axial charge of the nucleon, g A, which is perhaps the most widely studied of nucleon matrix elements in lattice QCD and serves to illustrate the current state of the art. The axial charge describes the coupling of the W boson to the nucleon. In Minkowski space notation it is defined by

$$\displaystyle \begin{aligned} \left\langle {\mathrm{p}}(k,s^\prime)\right| \overline{u}\gamma^\mu\gamma_5 d\left|{\mathrm{n}}(k,s)\right\rangle = g_A\,\overline{u}_p(k,s^\prime)\,\gamma^\mu\gamma_5\,u_n(k,s), \end{aligned} $$

where u n(k, s) and u p(k, s′) denote the Dirac spinors of the neutron and proton with four-momentum k and spins s and s′, respectively. The axial charge has been measured experimentally in neutron β-decay, and the current world average quoted in the PDG is g A = 1.2724 ± 0.0023 [471]. Provided that the experimental sensitivity is sufficient, it may be possible to probe for scalar and tensor interactions that are generated by loop effects or arise due to new forces in extensions of the SM. The definitions of the associated scalar and tensor charges, g S and g T, are derived from Eq. (5.255) by replacing the axial current \(\overline {u}\gamma ^\mu \gamma _5 d\) by the scalar density \(\overline {u} d\) and the tensor current \(\overline {u}\sigma ^{\mu \nu } d\), respectively.

The calculation of g A is facilitated by two facts: it is derived from a forward matrix element without any momentum transfer, and the contributions from quark-disconnected diagrams cancel in the iso-vector combination for mass-degenerate up and down quarks. Coupled with the fact that a precise experimental value is known, this makes the iso-vector axial charge a benchmark quantity for lattice calculations of nucleon matrix elements. Obviously, the ability of state-of-the-art lattice calculations to reproduce the experimental result will enhance the credibility of lattice predictions for the unmeasured charges g S and g T. Figure 5.25 shows a compilation of recent results for g A, obtained in lattice QCD with N f = 2, 2 + 1 and 2 + 1 + 1 flavours of dynamical quarks. While most estimates agree with the experimental result within errors, it is clear that the overall precision of current lattice calculations does not match that of the experiments. To state this observation more precisely, we note that the typical total error of current lattice results is at the level of 1–3% while experiment is an order of magnitude more precise. It should also be mentioned that, more often than not, lattice results tend to be slightly lower than the PDG average. Whether this is due to a remnant bias from excited state contributions or indeed to any other systematic effect must be investigated in future calculations able to realize larger source-sink separations.

Fig. 5.25

Compilation of recent results for the isovector axial charge. The vertical red band indicates the PDG average [471]. Lattice results are labelled by PNDME 18 [411], CalLat 18 [410], PNDME 16 [406], Mainz/CLS 19 [414], PACS 18 [397], χQCD 18 [413], JLQCD 18 [412], LHPC 12 [392], LHPC 10 [389], Mainz/CLS 17 [409], ETMC 17 [407], ETMC 15 [405], RQCD 14 [404], QCDSF 13 [403] and Mainz/CLS 12 [402]

The tendency of early lattice calculations to underestimate \(g_A\) has been attributed to unsuppressed excited state effects. In this context it is interesting to note that recent analyses of the contributions from excited states to nucleon matrix elements based on chiral effective theory [472, 473] suggest that the asymptotic (physical) value of \(g_A\) is approached from above. The different conclusions drawn from numerical and analytic studies can only be reconciled if one succeeds in simulating significantly larger source-sink separations at affordable cost.

Given that lattice QCD calculations reproduce the experimental value of benchmark quantities such as the axial charge at the level of a few percent, it is interesting to look at quantities that have not been measured so far. Results for the (iso-vector) scalar and tensor charges have been reported in [386, 393, 404–406, 411, 412, 414–419]. For both quantities one obtains \(g_S, g_T \approx 1\), and while the typical overall uncertainty in \(g_S\) is at the level of 10%, the tensor charge is determined with 3% precision, similar to that of \(g_A\). The 2019 edition of the FLAG report contains a detailed compilation and comparison of results for the axial, scalar and tensor charges, as well as flavour-singlet charges and σ-terms. Calculations of these quantities have matured to a level which allows for global averages to be determined.

The lattice calculation of nucleon matrix elements is a rich subject. Since a comprehensive discussion of other quantities such as form factors and moments of PDFs is beyond the scope of this short review, we refer the reader to recent reviews [445–447], specific sections of [448] and the white paper on PDFs [449].

5.9.5 Hadronic Contributions to the Muon Anomalous Magnetic Moment

The SM describes with great accuracy and precision the properties of the constituents of the visible matter in the universe but leaves several profound questions unanswered. For instance, it cannot account for the matter-antimatter asymmetry and does not explain the vast hierarchy between the electroweak scale and the Planck mass. Most prominently, the SM cannot account for the presence of dark matter in the universe for which there is overwhelming observational evidence. Against this backdrop, the exploration of the limits of the SM and the search for “new physics” has become a major activity in particle physics. Traditionally, high-energy particle colliders have had the highest discovery potential. However, despite the fact that the LHC is the most powerful accelerator in the world, new particles that can, for instance, explain the dark matter puzzle have not been observed in the expected region. Therefore, additional search strategies must be pursued to detect evidence for physics beyond the SM.

Observables that can be measured with very high precision, and for which similarly accurate theoretical predictions exist, play an increasingly important rôle in exploring the limits of the SM. One such quantity is the anomalous magnetic moment of the muon, \(a_\mu \equiv \frac {1}{2}(g_\mu -2)\), where \(g_\mu\) denotes the muon’s gyromagnetic ratio. There has been a persistent tension of about 3.5 standard deviations between the measured value and the SM prediction [244]:

$$\displaystyle \begin{aligned} a_\mu^{\text{exp}} - a_\mu^{\mathrm{SM}}=(266\pm76)\cdot10^{-11}. \end{aligned} $$
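Expressed in units of the quoted uncertainty, which we take here to be the combined experimental and theoretical error, the size of the tension follows from simple arithmetic:

$$\displaystyle \begin{aligned} \frac{a_\mu^{\text{exp}} - a_\mu^{\mathrm{SM}}}{\delta\left(a_\mu^{\text{exp}} - a_\mu^{\mathrm{SM}}\right)}=\frac{266}{76}\approx 3.5. \end{aligned} $$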

As described in detail in the extensive reviews in Refs. [474] and [475], the SM estimate of the anomalous magnetic moment receives contributions from QED, the weak and the strong interactions, i.e.

$$\displaystyle \begin{aligned} a_\mu^{\mathrm{SM}}=a_\mu^{\text{QED}} +a_\mu^{\text{weak}} +a_\mu^{\text{strong}}. \end{aligned} $$

While QED effects account for about 99.994% of the absolute value of \(a_\mu ^{\mathrm {SM}}\), its total uncertainty is completely dominated by the contribution from \(a_\mu ^{\text{strong}}\). Since the latter is mostly due to hadronic effects that are intrinsically non-perturbative, it is clear that special attention must be paid to their reliable evaluation.

The most important quantum corrections to \(a_\mu ^{\mathrm {SM}}\) arising from strong interaction physics are the leading hadronic vacuum polarization (HVP) and hadronic light-by-light scattering (HLbL) contributions. The HVP contribution, \(a_\mu ^{\text{hvp}}\), which arises at order \(\alpha^2\) (where \(\alpha\) is the fine structure constant), can be expressed in terms of a dispersion integral of the cross section ratio \(R(s) = \sigma(e^+e^-\to\text{hadrons})/\sigma(e^+e^-\to\mu^+\mu^-)\), multiplied by a known kernel function. At small values of the centre-of-mass energy \(\sqrt{s}\), the dispersion integral is evaluated using experimental data for the R-ratio \(R(s)\) as input [476–480]. For instance, the recent analysis of Ref. [479], which is based on the available data for \(e^+e^-\to\text{hadrons}\), produced an estimate of \(a_\mu ^{\text{hvp}}=(693.1\pm 3.4)\cdot 10^{-10}\). While the total error is at the level of 0.5%, it is clear that experimental uncertainties enter the SM prediction for \(a_\mu\) in this approach.
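For orientation, the dispersion integral is commonly written in the literature in the form below. The symbol \(\hat K(s)\) for the kernel and the lower limit \(s_{\text{thr}}\) (the hadronic threshold) are notational assumptions made here and are not fixed by the text above:

$$\displaystyle \begin{aligned} a_\mu^{\text{hvp}}=\left(\frac{\alpha\, m_\mu}{3\pi}\right)^2\int_{s_{\text{thr}}}^{\infty} ds\,\frac{R(s)\,\hat K(s)}{s^2}. \end{aligned} $$

Since \(\hat K(s)\) varies only slowly, the factor \(1/s^2\) gives the largest weight to the low-energy region, which is where experimental data for \(R(s)\) are used. With the numbers quoted above, the relative uncertainty is \(3.4/693.1\approx 0.49\%\), consistent with the stated level of 0.5%.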

The HLbL contribution has been quantified mostly using hadronic models, although efforts are under way to formulate and apply a dispersive or data-driven framework to treat some of the dominant sub-processes [481–491]. The current SM estimate \(a_\mu ^{\mathrm {SM}}\) is based on model calculations such as the “Glasgow consensus”, i.e. \(a_\mu ^{\text{hlbl}}=(105\pm 26)\cdot 10^{-11}\) [492]. Other studies, which have produced consistent results, can be found in Refs. [474, 478, 493].

Given the importance of \(a_\mu\) for testing the limits of the SM, it is crucial to verify the current estimates of \(a_\mu ^{\text{hvp}}\) and \(a_\mu ^{\text{hlbl}}\) and possibly reduce their overall errors using an ab initio approach such as lattice QCD. Since two new experiments (E989 at Fermilab and E34 at J-PARC) are set to improve the precision of the measurement of \(a_\mu\) by a factor of four, a reliable estimate of the hadronic contributions has become even more important. In order to make an impact, lattice QCD must be able to constrain \(a_\mu ^{\text{hvp}}\) with sub-percent accuracy, while an estimate of \(a_\mu ^{\text{hlbl}}\) at the level of 10% would already be a major step forward. Both tasks, however, present a considerable challenge to lattice QCD. The current status of lattice calculations of \(a_\mu ^{\text{hvp}}\) and \(a_\mu ^{\text{hlbl}}\) was reviewed extensively in Ref. [494], which can be consulted for details. Here we present merely an overview of the main issues and a guide to the literature.

The hadronic vacuum polarization contribution, \(a_\mu ^{\text{hvp}}\), is accessible in lattice QCD via different integral representations involving the correlator of the electromagnetic current. The first possibility is to consider a convolution integral over Euclidean momenta \(Q^2\) of the subtracted vacuum polarization function [500, 501]. The second possibility is the so-called time-momentum representation defined in Ref. [502], in which the product of the spatially summed vector correlator \(G(x_0)\) and a kernel function is integrated over the Euclidean time \(x_0\). A variant of the time-momentum representation uses the time moments of \(G(x_0)\) [503]. Finally, there also exists a Lorentz-covariant formulation in coordinate space [504] involving the point-to-point vector correlator \(G(x, y)\).
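Schematically, the time-momentum representation described above can be written as follows. The symbol \(\widetilde K(x_0;m_\mu)\) for the kernel function and the overall normalisation are assumptions following a convention commonly used in the literature, and the precise prefactor depends on how \(G(x_0)\) is normalised:

$$\displaystyle \begin{aligned} a_\mu^{\text{hvp}}=\left(\frac{\alpha}{\pi}\right)^2\int_0^{\infty} dx_0\,\widetilde K(x_0;m_\mu)\,G(x_0). \end{aligned} $$

On a lattice with finite temporal extent, the integral is replaced by a sum over time slices, so that the behaviour of \(G(x_0)\) at large \(x_0\) must be controlled separately.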

In order to meet the precision goal of sub-percent uncertainty, it is mandatory to have good control over the infrared regime, which makes a sizeable contribution to \(a_\mu ^{\text{hvp}}\). In the formulation of Refs. [500, 501] this implies that momenta corresponding to \(Q^2\lesssim m_\mu ^2\) must be included, since this is where the convolution integral receives its dominant contribution. In the time-momentum representation or the Lorentz-covariant formulation, by contrast, one must constrain the long-distance regime of the correlator sufficiently well. The statistical accuracy that one can attain for \(a_\mu ^{\text{hvp}}\) is affected by the well-known noise problem encountered for the vector correlator, i.e. the fact that the signal-to-noise ratio decreases exponentially at large distances (see Footnote 24). Another limiting factor for the overall precision of \(a_\mu ^{\text{hvp}}\) in lattice QCD is the knowledge of the lattice scale [499, 505]. At first sight this may seem surprising, given that \(a_\mu ^{\text{hvp}}\) is a dimensionless quantity. However, employing the time-momentum representation, one easily sees that the lattice scale enters through the combination \((x_0 m_\mu)^2\) in the kernel function. Similar arguments exist for the other representations of \(a_\mu ^{\text{hvp}}\). Furthermore, at the level of sub-percent precision, it is necessary to include the contributions from quark-disconnected diagrams and the effects from isospin breaking (see Sect. 5.9.2). All of this is explained in great detail in Ref. [494].
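The noise problem of the vector correlator can be illustrated with a minimal sketch. It assumes the standard Lepage-type argument: the signal is dominated by a single vector state and decays with its mass, whereas the statistical error decays only with the pion mass, so that the signal-to-noise ratio falls off exponentially with the mass gap. The function name, the masses (775 MeV and 140 MeV) and the number of configurations are illustrative assumptions, not values taken from the calculations cited above.

```python
import math

HBARC_MEV_FM = 197.327  # conversion constant hbar*c in MeV fm


def signal_to_noise(t_fm, m_vector_mev=775.0, m_pion_mev=140.0, n_configs=1000):
    """Toy estimate of the signal-to-noise ratio of the vector correlator.

    Assumes signal ~ exp(-m_V t) and noise ~ exp(-m_pi t) / sqrt(N),
    with the ratio normalised to sqrt(N) at t = 0 (illustrative only).
    """
    gap_inv_fm = (m_vector_mev - m_pion_mev) / HBARC_MEV_FM  # mass gap in fm^-1
    return math.sqrt(n_configs) * math.exp(-gap_inv_fm * t_fm)


# Tabulate the toy estimate for a few Euclidean time separations
for t in (0.5, 1.0, 2.0, 3.0):
    print(f"t = {t:.1f} fm: signal/noise ~ {signal_to_noise(t):.3g}")
```

In this toy model, compensating the loss of signal at larger separations by statistics alone requires the number of configurations to grow exponentially with the distance, which is why the long-distance regime is usually constrained by additional means.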

First exploratory calculations of \(a_\mu ^{\text{hvp}}\) in full QCD were published in 2008 [506], and in the following years several studies appeared [497, 507–509], employing a range of different discretisations of the quark action, which were mostly aimed at investigating systematic effects. The most recent calculations are focussed on reducing the overall uncertainties [495, 496, 498, 499, 510–515, 530]. A comparison of recent estimates for \(a_\mu ^{\text{hvp}}\) from lattice QCD to results obtained via the dispersive approach is shown in Fig. 5.26. At present, lattice calculations cannot match the accuracy of the dispersive approach, but efforts are under way to reduce the uncertainties to a level that makes the lattice approach competitive with data-driven methods [494, 516].

Fig. 5.26

Compilation of recent results for the hadronic vacuum polarisation contribution in units of \(10^{-10}\). The three panels represent calculations with different numbers of sea quarks. Lattice results are labelled by ETMC 18 [515], BMW 17 [495], HPQCD 16 [496], ETMC 13 [497], Mainz/CLS 19 [530], RBC/UKQCD 18 [498], and Mainz/CLS 17 [499]. The phenomenological determinations based on the R-ratio are labelled as HLMNT 11 [477], DHMZ 11 [476], Jegerlehner 17 [478] and KNT 18 [480]. The red vertical band denotes the estimate from dispersion theory quoted in KNT 18 [480]

In order to determine the hadronic light-by-light scattering contribution, it is necessary to formulate the problem in such a way that \(a_\mu ^{\text{hlbl}}\) is expressed in terms of quantities that can be computed on the lattice with affordable effort. Several different strategies have been proposed and are currently being pursued:

In a first method, the matrix element of the electromagnetic current between explicit muon initial and final states is computed in QCD+QED [517]. In order to isolate the desired light-by-light scattering contribution, one has to perform a non-perturbative subtraction. While the method has produced estimates in the expected range, statistical errors are large, as a result of the cancellation between two large numbers [518].

In another method proposed by the RBC/UKQCD Collaboration [519, 520], the light-by-light scattering diagram is evaluated by inserting three explicit photon propagators. The positions of the insertion of these propagators are then sampled stochastically. In this way, results for the quark-connected and the leading quark-disconnected contributions have been obtained, i.e.

$$\displaystyle \begin{aligned} (a_\mu^{\text{hlbl}})^{\text{conn}} = (116.0\pm9.6)\cdot 10^{-11},\quad (a_\mu^{\text{hlbl}})^{\text{disc}} = (-62.5\pm8.0)\cdot 10^{-11}. \end{aligned} $$

The sum of the two contributions gives \(a_\mu ^{\text{hlbl}}=(53.5\pm 13.5)\cdot 10^{-11}\), which differs from the Glasgow consensus by a factor of two. However, before jumping to conclusions one must take into account that systematic effects have not yet been fully quantified in these calculations.
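The arithmetic behind this comparison can be checked with a short sketch that uses only the numbers quoted above, in units of \(10^{-11}\). The helper function and the assumption of uncorrelated errors are ours, for illustration; they are not the procedure of the cited calculations.

```python
import math


def add_in_quadrature(*terms):
    """Sum central values and combine errors in quadrature.

    Assumes the individual errors are uncorrelated (illustrative assumption).
    """
    total = sum(value for value, _ in terms)
    error = math.sqrt(sum(err ** 2 for _, err in terms))
    return total, error


# Numbers quoted in the text, in units of 1e-11
connected = (116.0, 9.6)
disconnected = (-62.5, 8.0)
quoted_lattice = (53.5, 13.5)
glasgow = (105.0, 26.0)

total, naive_error = add_in_quadrature(connected, disconnected)

# Compare the lattice sum with the Glasgow consensus, using the
# lattice uncertainty as quoted in the text
ratio = glasgow[0] / quoted_lattice[0]
tension = (glasgow[0] - quoted_lattice[0]) / math.hypot(glasgow[1], quoted_lattice[1])

print(f"sum = {total:.1f}, naive quadrature error = {naive_error:.1f}")
print(f"ratio = {ratio:.2f}, difference = {tension:.2f} combined standard deviations")
```

Two observations follow. Naive quadrature yields an error of about 12.5, slightly below the quoted 13.5, so the quoted uncertainty is evidently not a plain quadrature sum of the two entries. Moreover, although the central values differ by nearly a factor of two, the difference amounts to less than two combined standard deviations, which supports the caution expressed above.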

The Mainz group has proposed a method in which the QED kernel function is computed semi-analytically in infinite volume [521–524]. This has the advantage that large finite-volume effects arising from the massless photon mode are absent. The method has yet to produce explicit estimates for \(a_\mu ^{\text{hlbl}}\). A variant was proposed by RBC/UKQCD in Ref. [525]. Another project of the Mainz group has focussed on the forward light-by-light scattering amplitude, which can be linked via the optical theorem and dispersive sum rules to models of the cross section for the process \(\gamma^*\gamma^*\to\text{hadrons}\) [526, 527]. The results provide an important test for model estimates of \(a_\mu ^{\text{hlbl}}\).

Finally, lattice QCD calculations can also be used to directly test model estimates of the expected dominant contribution to \(a_\mu ^{\text{hlbl}}\) from the pion pole, which requires knowledge of the transition form factor for \(\pi^0\to\gamma^*\gamma^*\). The calculation of Ref. [528], which was performed in two-flavour QCD, gives

$$\displaystyle \begin{aligned} (a_\mu^{\text{hlbl}})^{\pi^0} = (65.0\pm8.3)\cdot 10^{-11} \end{aligned} $$

which is in very good agreement with model estimates [491]. It will be interesting to extend this calculation by including the corresponding contributions of the η and η′ mesons.

This brief survey demonstrates that lattice QCD contributes in many different and complementary ways to constraining the hadronic contributions to the muon \(g-2\) more precisely.

5.9.6 Concluding Remarks

In this short review we have charted the progress of lattice QCD calculations over more than a decade, i.e. since the publication of the original review article. Back in 2007, lattice QCD was on the verge of providing first-principles estimates of hadronic observables that were of immediate phenomenological relevance. In the meantime, lattice QCD has become an indispensable tool in particle and hadron physics: in addition to providing accurate estimates of SM parameters and input quantities for analyses in flavour physics, lattice QCD is now also making inroads into fields such as nucleon structure and precision observables. This underlines the important role of lattice calculations for exploring the limits of the SM and for searches for new physics.

Furthermore, studying hadronic interactions, i.e. the physics of resonances and multi-hadron systems, has become a major activity in lattice QCD and also serves as a basis for the understanding of light nuclei from first principles. Other important applications of the lattice formulation that have not been covered in this article are studies of matter under extreme conditions. Indeed, many features of the QCD phase diagram and properties of the quark-gluon plasma that are otherwise inaccessible can nowadays be obtained reliably from lattice calculations. Perhaps the most significant development since Ken Wilson’s 1989 remark, quoted in the introduction, is the fact that there is now a vigorous interaction between lattice QCD and experiment.