1 Introduction

Flavour physics provides an important opportunity for exploring the limits of the Standard Model of particle physics and for constraining possible extensions that go beyond it. As the LHC explores a new energy frontier and as experiments continue to extend the precision frontier, the importance of flavour physics will grow, both in terms of searches for signatures of new physics through precision measurements and in terms of attempts to construct the theoretical framework behind direct discoveries of new particles. Crucial to such searches for new physics is the ability to quantify strong-interaction effects. Large-scale numerical simulations of lattice QCD allow for the computation of these effects from first principles. The scope of the Flavour Lattice Averaging Group (FLAG) is to review the current status of lattice results for a variety of physical quantities that are important for flavour physics. Set up in November 2007, it comprises experts in Lattice Field Theory, Chiral Perturbation Theory and Standard Model phenomenology. Our aim is to provide an answer to the frequently posed question “What is currently the best lattice value for a particular quantity?” in a way that is readily accessible to those who are not expert in lattice methods. This is generally not an easy question to answer; different collaborations use different lattice actions (discretizations of QCD) with a variety of lattice spacings and volumes, and with a range of masses for the u- and d-quarks. Not only are the systematic errors different, but also the methodology used to estimate these uncertainties varies between collaborations. In the present work, we summarize the main features of each of the calculations and provide a framework for judging and combining the different results. Sometimes it is a single result that provides the “best” value; more often it is a combination of results from different collaborations. Indeed, the consistency of values obtained using different formulations adds significantly to our confidence in the results.

The first three editions of the FLAG review were made public in 2010 [1], 2013 [2], and 2016 [3] (and will be referred to as FLAG 10, FLAG 13 and FLAG 16, respectively). The third edition reviewed results related to both light (u-, d- and s-), and heavy (c- and b-) flavours. The quantities related to pion and kaon physics were light-quark masses, the form factor \(f_+(0)\) arising in semileptonic \(K \rightarrow \pi \) transitions (evaluated at zero momentum transfer), the decay constants \(f_K\) and \(f_\pi \), the \(B_K\) parameter from neutral kaon mixing, and the kaon mixing matrix elements of new operators that arise in theories of physics beyond the Standard Model. Their implications for the CKM matrix elements \(V_{us}\) and \(V_{ud}\) were also discussed. Furthermore, results were reported for some of the low-energy constants of \(SU(2)_L \times SU(2)_R\) and \(SU(3)_L \times SU(3)_R\) Chiral Perturbation Theory. The quantities related to D- and B-meson physics that were reviewed were the masses of the charm and bottom quarks together with the decay constants, form factors, and mixing parameters of B- and D-mesons. These are the heavy–light quantities most relevant to the determination of CKM matrix elements and the global CKM unitarity-triangle fit. Last but not least, the current status of lattice results on the QCD coupling \(\alpha _s\) was reviewed.

Table 1 Summary of the main results of this review concerning quark masses, light-meson decay constants, LECs, and kaon mixing parameters. These are grouped in terms of \(N_{ f}\), the number of dynamical quark flavours in lattice simulations. Quark masses and the quark condensate are given in the \({\overline{\mathrm {MS}}}\) scheme at running scale \(\mu =2\,\mathrm {GeV}\) or as indicated. BSM bag parameters \(B_{2,3,4,5}\) are given in the \({\overline{\mathrm {MS}}}\) scheme at scale \(\mu =3\,\mathrm {GeV}\). Further specifications of the quantities are given in the quoted sections. Results for \(N_f=2\) quark masses are unchanged since FLAG 16 [3]. For each result we list the references that enter the FLAG average or estimate, and we stress again the importance of quoting these original works when referring to FLAG results. From the entries in this column one can also read off the number of results that enter our averages for each quantity. We emphasize that these numbers only give a very rough indication of how thoroughly the quantity in question has been explored on the lattice and recommend consulting the detailed tables and figures in the relevant section for more significant information and for explanations on the source of the quoted errors.

In the present paper we provide updated results for all the above-mentioned quantities, but also extend the scope of the review by adding a section on nucleon matrix elements. This presents results for matrix elements of flavour nonsinglet and singlet bilinear operators, including the nucleon axial charge \(g_A\) and the nucleon sigma terms. These results are relevant for constraining \(V_{ud}\), for searches for new physics in neutron decays and other processes, and for dark matter searches. In addition, the section on up and down quark masses has been largely rewritten, replacing previous estimates for \(m_u\), \(m_d\), and the mass ratios R and Q that were largely phenomenological with those from lattice QED+QCD calculations. We have also replaced the discussion of the phenomenology of isospin-breaking effects in the light meson sector, and their relation to quark masses, with a lattice-centric discussion. A short review of QED in lattice-QCD simulations is also provided, including a discussion of ambiguities arising when attempting to define “physical” quantities in pure QCD.

Our main results are collected in Tables 1, 2 and 3. As is clear from the tables, for most quantities there are results from ensembles with different values for \(N_f\). In most cases, there is reasonable agreement among results with \(N_f=2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\). As precision increases, we may some day be able to distinguish among the different values of \(N_f\), in which case, presumably \(2\,+\,1\,+\,1\) would be the most realistic. (If isospin violation is critical, then \(1\,+\,1\,+\,1\) or \(1\,+\,1\,+\,1\,+\,1\) might be desired.) At present, for some quantities the errors in the \(N_f=2\,+\,1\) results are smaller than those with \(N_f=2\,+\,1\,+\,1\) (e.g., for \(m_c\)), while for others the relative size of the errors is reversed. Our suggestion to those using the averages is to take whichever of the \(N_f=2\,+\,1\) or \(N_f=2\,+\,1\,+\,1\) results has the smaller error. We do not recommend using the \(N_f=2\) results, except for studies of the \(N_f\)-dependence of condensates and \(\alpha _s\), as these have an uncontrolled systematic error coming from quenching the strange quark.

Our plan is to continue providing FLAG updates, in the form of a peer-reviewed paper, roughly on a triennial basis. This effort is supplemented by our more frequently updated website http://flag.unibe.ch [4], where figures as well as PDF files for the individual sections can be downloaded. The papers reviewed in the present edition appeared before the closing date of 30 September 2018.

Table 2 Summary of the main results of this review concerning heavy–light mesons and the strong coupling constant. These are grouped in terms of \(N_{ f}\), the number of dynamical quark flavours in lattice simulations. The quantities listed are specified in the quoted sections. For each result we list the references that enter the FLAG average or estimate, and we stress again the importance of quoting these original works when referring to FLAG results. From the entries in this column one can also read off the number of results that enter our averages for each quantity. We emphasize that these numbers only give a very rough indication of how thoroughly the quantity in question has been explored on the lattice and recommend consulting the detailed tables and figures in the relevant section for more significant information and for explanations on the source of the quoted errors.

Table 3 Summary of the main results of this review concerning nuclear matrix elements, grouped in terms of \(N_{ f}\), the number of dynamical quark flavours in lattice simulations. The quantities listed are specified in the quoted sections. For each result we list the references that enter the FLAG average or estimate, and we stress again the importance of quoting these original works when referring to FLAG results. From the entries in this column one can also read off the number of results that enter our averages for each quantity. We emphasize that these numbers only give a very rough indication of how thoroughly the quantity in question has been explored on the lattice and recommend consulting the detailed tables and figures in the relevant section for more significant information and for explanations on the source of the quoted errors.

This review is organized as follows. In the remainder of Sect. 1 we summarize the composition and rules of FLAG and discuss general issues that arise in modern lattice calculations. In Sect. 2, we explain our general methodology for evaluating the robustness of lattice results. We also describe the procedures followed for combining results from different collaborations in a single average or estimate (see Sect. 2.2 for our definition of these terms). The rest of the paper consists of sections, each dedicated to a set of closely connected physical quantities. Each of these sections is accompanied by an Appendix with explicatory notes. Finally, in Appendix A we provide a glossary in which we introduce some standard lattice terminology (e.g., concerning the gauge, light-quark and heavy-quark actions), and in addition we summarize and describe the most commonly used lattice techniques and methodologies (e.g., related to renormalization, chiral extrapolations, scale setting).

1.1 FLAG composition, guidelines and rules

FLAG strives to be representative of the lattice community, both in terms of the geographical location of its members and the lattice collaborations to which they belong. We aspire to provide the nuclear- and particle-physics communities with a single source of reliable information on lattice results.

In order to work reliably and efficiently, we have adopted a formal structure and a set of rules by which all FLAG members abide. The collaboration presently consists of an Advisory Board (AB), an Editorial Board (EB), and eight Working Groups (WG). The rôle of the Advisory Board is to provide oversight of the content, procedures, schedule and membership of FLAG, to help resolve disputes, to serve as a source of advice to the EB and to FLAG as a whole, and to provide a critical assessment of drafts. They also give their approval of the final version of the preprint before it is rendered public. The Editorial Board coordinates the activities of FLAG, sets priorities and intermediate deadlines, organizes votes on FLAG procedures, writes the introductory sections, and takes care of the editorial work needed to amalgamate the sections written by the individual working groups into a uniform and coherent review. The working groups concentrate on writing the review of the physical quantities for which they are responsible, which is subsequently circulated to the whole collaboration for critical evaluation.

The current list of FLAG members and their Working Group assignments is:

  • Advisory Board (AB): S. Aoki, M. Golterman, R. Van De Water, and A. Vladikas

  • Editorial Board (EB): G. Colangelo, A. Jüttner, S. Hashimoto, S.R. Sharpe, and U. Wenger

  • Working Groups (coordinator listed first):

    • Quark masses: T. Blum, A. Portelli, and A. Ramos;

    • \(V_{us},V_{ud}\): S. Simula, T. Kaneko, and J. N. Simone;

    • LEC: S. Dürr, H. Fukaya, and U.M. Heller;

    • \(B_K\): P. Dimopoulos, G. Herdoiza, and R. Mawhinney;

    • \(f_{B_{(s)}}\), \(f_{D_{(s)}}\), \(B_B\): D. Lin, Y. Aoki, and M. Della Morte;

    • \(B_{(s)}\), D semileptonic and radiative decays: E. Lunghi, D. Becirevic, S. Gottlieb, and C. Pena;

    • \(\alpha _s\): R. Sommer, R. Horsley, and T. Onogi;

    • NME: R. Gupta, S. Collins, A. Nicholson, and H. Wittig.

The most important FLAG guidelines and rules are the following:

  • the composition of the AB reflects the main geographical areas in which lattice collaborations are active, with members from America, Asia/Oceania, and Europe;

  • the mandate of regular members is not limited in time, but we expect that a certain turnover will occur naturally;

  • whenever a replacement becomes necessary this has to keep, and possibly improve, the balance in FLAG, so that different collaborations, from different geographical areas are represented;

  • in all working groups the three members must belong to three different lattice collaborations;

  • a paper is in general not reviewed (nor colour-coded, as described in the next section) by any of its authors;

  • lattice collaborations will be consulted on the colour coding of their calculation;

  • there are also internal rules regulating our work, such as voting procedures.

For this edition of the FLAG review, we sought the advice of external reviewers once a complete draft of the review was available. For each review section, we have asked one lattice expert (who could be a FLAG alumnus/alumna) and one nonlattice phenomenologist for a critical assessment. This is similar to the procedure followed by the Particle Data Group in the creation of the Review of Particle Physics. The reviewers provide comments and feedback on scientific and stylistic matters. They are not anonymous, and enter into a discussion with the authors of the WG. Our aim with this additional step is to make sure that a wider array of viewpoints enter into the discussions, so as to make this review more useful for its intended audience.

1.2 Citation policy

We draw attention to this particularly important point. As stated above, our aim is to make lattice-QCD results easily accessible to those without lattice expertise, and we are well aware that it is likely that some readers will only consult the present paper and not the original lattice literature. It is very important that this paper not be the only one cited when our results are quoted. We strongly suggest that readers also cite the original sources. In order to facilitate this, in Tables 1, 2, and 3, besides summarizing the main results of the present review, we also cite the original references from which they have been obtained. In addition, for each figure we make a BibTeX file available on our webpage [4], which contains the BibTeX entries of all the calculations contributing to the FLAG average or estimate. The bibliography at the end of this paper should also make it easy to cite additional papers. Indeed, we hope that the bibliography will be one of the most widely used elements of the whole paper.

1.3 General issues

Several general issues concerning the present review are thoroughly discussed in Sect. 1.1 of our initial 2010 paper [1], and we encourage the reader to consult the relevant pages. In the remainder of the present section, we focus on a few important points. Though the discussion has been duly updated, it is similar to that of Sect. 1.2 in the previous two reviews [2, 3], with the addition of comments on the contributions from excited states that are particularly relevant for the new section on NMEs.

The present review aims to achieve two distinct goals: first, to provide a description of the relevant work done on the lattice; and, second, to draw conclusions on the basis of that work, summarizing the results obtained for the various quantities of physical interest.

The core of the information about the work done on the lattice is presented in the form of tables, which not only list the various results, but also describe the quality of the data that underlie them. We consider it important that this part of the review represents a generally accepted description of the work done. For this reason, we explicitly specify the quality requirements used and provide sufficient details in appendices so that the reader can verify the information given in the tables.

On the other hand, the conclusions drawn on the basis of the available lattice results are the responsibility of FLAG alone. Preferring to err on the side of caution, in several cases we draw conclusions that are more conservative than those resulting from a plain weighted average of the available lattice results. This cautious approach is usually adopted when the average is dominated by a single lattice result, or when only one lattice result is available for a given quantity. In such cases, one does not have the same degree of confidence in results and errors as when there is agreement among several different calculations using different approaches. The reader should keep in mind that the degree of confidence cannot be quantified, and it is not reflected in the quoted errors.
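To make the notion of a plain weighted average concrete, the following minimal sketch computes an inverse-variance weighted mean of uncorrelated inputs together with the fraction of the total weight carried by each input. The function name and the numerical inputs are hypothetical and serve only as an illustration; the sketch assumes uncorrelated Gaussian errors and is not the averaging procedure adopted by FLAG, which is described in Sect. 2.

```python
import math


def plain_weighted_average(values, errors):
    """Inverse-variance weighted mean, its error, and the normalized weights.

    Assumes the inputs are uncorrelated and have Gaussian errors.
    """
    weights = [1.0 / (err * err) for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total), [w / total for w in weights]


# Hypothetical inputs: two results, one twice as precise as the other.
mean, error, fractions = plain_weighted_average([1.0, 1.2], [0.1, 0.2])
```

With these inputs the more precise result carries 80% of the weight, which is the kind of situation in which an average is dominated by a single result.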

Each discretization has its merits, but also its shortcomings. For most topics covered in this review we have an increasingly broad database, and for most quantities lattice calculations based on totally different discretizations are now available. This is illustrated by the dense population of the tables and figures in most parts of this review. Those calculations that do satisfy our quality criteria indeed lead, in almost all cases, to consistent results, confirming universality within the accuracy reached. In our opinion, the consistency between independent lattice results, obtained with different discretizations, methods, and simulation parameters, is an important test of lattice QCD, and observing such consistency also provides further evidence that systematic errors are fully under control.

In the sections dealing with heavy quarks and with \(\alpha _s\), the situation is not the same. Since the b-quark mass can barely be resolved with current lattice spacings, most lattice methods for treating b quarks use effective field theory at some level. This introduces additional complications not present in the light-quark sector. An overview of the issues specific to heavy-quark quantities is given in the introduction of Sect. 8. For B- and D-meson leptonic decay constants, there already exists a good number of different independent calculations that use different heavy-quark methods, but there are only one or two independent calculations of semileptonic B and D meson form factors and B meson mixing parameters. For \(\alpha _s\), most lattice methods involve a range of scales that need to be resolved and controlling the systematic error over a large range of scales is more demanding. The issues specific to determinations of the strong coupling are summarized in Sect. 9.

Number of sea quarks in lattice simulations

Lattice-QCD simulations currently involve two, three or four flavours of dynamical quarks. Most simulations set the masses of the two lightest quarks to be equal, while the strange and charm quarks, if present, are heavier (and tuned to lie close to their respective physical values). Our notation for these simulations indicates which quarks are nondegenerate, e.g., \(N_{ f}=2\,+\,1\) if \(m_u=m_d < m_s\) and \(N_{ f}=2\,+\,1\,+\,1\) if \(m_u=m_d< m_s < m_c\). Calculations with \(N_{ f}=2\), i.e., two degenerate dynamical flavours, often include strange valence quarks interacting with gluons, so that bound states with the quantum numbers of the kaons can be studied, albeit neglecting strange sea-quark fluctuations. The quenched approximation (\(N_f=0\)), in which all sea-quark contributions are omitted, has uncontrolled systematic errors and is no longer used in modern lattice simulations with relevance to phenomenology. Accordingly, we will review results obtained with \(N_f=2\), \(N_f=2\,+\,1\), and \(N_f = 2\,+\,1\,+\,1\), but omit earlier results with \(N_f=0\). The only exception concerns the QCD coupling constant \(\alpha _s\). Since this observable does not require valence light quarks, it is theoretically well defined also in the \(N_f=0\) theory, which is simply pure gluodynamics. The \(N_f\)-dependence of \(\alpha _s\), or more precisely of the related quantity \(r_0 \Lambda _{\overline{\mathrm {MS}}}\), is a theoretical issue of considerable interest; here \(r_0\) is a quantity with the dimension of length that sets the physical scale, as discussed in Appendix A.2. We stress, however, that only results with \(N_f \ge 3\) are used to determine the physical value of \(\alpha _s\) at a high scale.

Lattice actions, simulation parameters, and scale setting

The remarkable progress in the precision of lattice calculations is due to improved algorithms, better computing resources, and, last but not least, conceptual developments. Examples of the latter are improved actions that reduce lattice artifacts and actions that preserve chiral symmetry to very good approximation. A concise characterization of the various discretizations that underlie the results reported in the present review is given in Appendix A.1.

Physical quantities are computed in lattice simulations in units of the lattice spacing so that they are dimensionless. For example, the pion decay constant that is obtained from a simulation is \(f_\pi a\), where a is the spacing between two neighboring lattice sites. (All simulations with results quoted in this review use hypercubic lattices, i.e., with the same spacing in all four Euclidean directions.) To convert these results to physical units requires knowledge of the lattice spacing a at the fixed values of the bare QCD parameters (quark masses and gauge coupling) used in the simulation. This is achieved by requiring agreement between the lattice calculation and experimental measurement of a known quantity, which thus “sets the scale” of a given simulation. A few details on this procedure are provided in Appendix A.2.
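As a minimal sketch of this conversion, the code below fixes the lattice spacing from one dimensionless lattice quantity and its experimental value, and then converts a second dimensionless result to MeV. The function names and all numerical inputs are hypothetical and chosen for round numbers; only the value of \(\hbar c\) is a physical constant. Actual scale-setting procedures involve statistical and systematic errors that are ignored here.

```python
HBARC_MEV_FM = 197.3269804  # hbar*c in MeV fm


def lattice_spacing_fm(a_times_x, x_phys_mev):
    """Lattice spacing a in fm from the dimensionless product a*X and X in MeV."""
    return a_times_x * HBARC_MEV_FM / x_phys_mev


def to_mev(a_times_q, a_fm):
    """Convert a dimensionless lattice result a*Q to MeV for a given a in fm."""
    return a_times_q * HBARC_MEV_FM / a_fm


# Hypothetical scale-setting quantity X: a*X = 0.75 on the lattice,
# with an assumed experimental value X = 1500 MeV.
a_fm = lattice_spacing_fm(0.75, 1500.0)

# Hypothetical dimensionless decay constant f*a = 0.065 on the same ensemble.
f_mev = to_mev(0.065, a_fm)
```

The result for `f_mev` depends on the choice of the scale-setting quantity only through the ratio of the two dimensionless lattice numbers, which is why the uncertainty of the scale propagates directly into every dimensionful prediction.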

Renormalization and scheme dependence

Several of the results covered by this review, such as quark masses, the gauge coupling, and B-parameters, are for quantities defined in a given renormalization scheme and at a specific renormalization scale. The schemes employed (e.g., regularization-independent MOM schemes) are often chosen because of their specific merits when combined with the lattice regularization. For a brief discussion of their properties, see Appendix A.3. The conversion of the results obtained in these so-called intermediate schemes to more familiar regularization schemes, such as the \({\overline{\mathrm {MS}}}\)-scheme, is done with the aid of perturbation theory. It must be stressed that the renormalization scales accessible in simulations are limited, because of the presence of an ultraviolet (UV) cutoff of \(\sim \pi /a\). To safely match to \({\overline{\mathrm {MS}}}\), a scheme defined in perturbation theory, Renormalization Group (RG) running to higher scales is performed, either perturbatively or nonperturbatively (the latter using finite-size scaling techniques).

Extrapolations

Because of limited computing resources, lattice simulations are often performed at unphysically heavy pion masses, although results at the physical point have become increasingly common. Further, numerical simulations must be done at nonzero lattice spacing, and in a finite (four-dimensional) volume. In order to obtain physical results, lattice data are obtained at a sequence of pion masses and a sequence of lattice spacings, and then extrapolated to the physical pion mass and to the continuum limit. In principle, an extrapolation to infinite volume is also required. However, for most quantities discussed in this review, finite-volume effects are exponentially small in the linear extent of the lattice in units of the inverse pion mass, and, in practice, one often verifies volume independence by comparing results obtained on a few different physical volumes, holding other parameters fixed. To control the associated systematic uncertainties, these extrapolations are guided by effective theories. For light-quark actions, the lattice-spacing dependence is described by Symanzik’s effective theory [93, 94]; for heavy quarks, this can be extended and/or supplemented by other effective theories such as Heavy-Quark Effective Theory (HQET). The pion-mass dependence can be parameterized with Chiral Perturbation Theory (\(\chi \)PT), which takes into account the Nambu-Goldstone nature of the lowest excitations that occur in the presence of light quarks. Similarly, one can use Heavy–Light Meson Chiral Perturbation Theory (HM\(\chi \)PT) to extrapolate quantities involving mesons composed of one heavy (b or c) and one light quark. One can combine Symanzik’s effective theory with \(\chi \)PT to simultaneously extrapolate to the physical pion mass and the continuum; in this case, the form of the effective theory depends on the discretization. See Appendix A.4 for a brief description of the different variants in use and some useful references. Finally, \(\chi \)PT can also be used to estimate the size of finite-volume effects measured in units of the inverse pion mass, thus providing information on the systematic error due to finite-volume effects in addition to that obtained by comparing simulations at different volumes.
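The idea of a simultaneous extrapolation in the pion mass and the lattice spacing can be illustrated with a deliberately simplified sketch. It assumes a fit ansatz that is linear in \(m_\pi ^2\) and \(a^2\), uses noise-free synthetic data generated from hypothetical coefficients, and ignores statistical errors, chiral logarithms and discretization-specific terms, all of which matter in actual analyses.

```python
import numpy as np


def fit_linear_ansatz(mpi2, a2, y):
    """Least-squares fit of y = c0 + c1 * mpi^2 + c2 * a^2."""
    design = np.column_stack([np.ones_like(mpi2), mpi2, a2])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


# Hypothetical ensembles: three pion masses (GeV) at each of two spacings (fm).
mpi = np.array([0.20, 0.30, 0.40, 0.20, 0.30, 0.40])
a = np.array([0.12, 0.12, 0.12, 0.08, 0.08, 0.08])

# Synthetic, noise-free data generated from assumed coefficients.
c_true = (0.130, 0.05, -0.5)
y = c_true[0] + c_true[1] * mpi**2 + c_true[2] * a**2

c_fit = fit_linear_ansatz(mpi**2, a**2, y)

# Evaluate at an assumed physical pion mass of 0.135 GeV and at a = 0.
mpi_phys = 0.135
y_phys = c_fit[0] + c_fit[1] * mpi_phys**2
```

Because the synthetic data follow the ansatz exactly, the fit recovers the input coefficients; with real data the spread between different plausible ansätze is one ingredient of the systematic error estimate.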

Excited-state contamination

In all the hadronic matrix elements discussed in this review, the hadron in question is the lightest state with the chosen quantum numbers. This implies that it dominates the required correlation functions as their extent in Euclidean time is increased. Excited-state contributions are suppressed by \(e^{-\Delta E \Delta \tau }\), where \(\Delta E\) is the gap between the ground and excited states, and \(\Delta \tau \) the relevant separation in Euclidean time. The size of \(\Delta E\) depends on the hadron in question, and in general is a multiple of the pion mass. In practice, as discussed at length in Sect. 10, the contamination of signals due to excited-state contributions is a much more challenging problem for baryons than for the other particles discussed here. This is in part due to the fact that the signal-to-noise ratio drops exponentially for baryons, which reduces the values of \(\Delta \tau \) that can be used.
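The exponential suppression can be made explicit with a two-state toy model. The sketch below assumes a correlation function of the form \(C(\tau ) \propto e^{-E_0 \tau }(1 + r\,e^{-\Delta E\,\tau })\) in lattice units and computes the effective mass \(\ln [C(\tau )/C(\tau +1)]\), which approaches \(E_0\) as \(\tau \) grows. The function names and the values of \(E_0\), \(\Delta E\) and \(r\) are hypothetical; real correlation functions contain further states and statistical noise.

```python
import math


def two_state_correlator(tau, e0, delta_e, ratio):
    """Toy correlator exp(-e0*tau) * (1 + ratio * exp(-delta_e*tau))."""
    return math.exp(-e0 * tau) * (1.0 + ratio * math.exp(-delta_e * tau))


def effective_mass(tau, e0, delta_e, ratio):
    """Effective mass log(C(tau) / C(tau + 1)) in lattice units."""
    numerator = two_state_correlator(tau, e0, delta_e, ratio)
    denominator = two_state_correlator(tau + 1, e0, delta_e, ratio)
    return math.log(numerator / denominator)


# Hypothetical parameters in lattice units.
E0, DELTA_E, RATIO = 0.5, 0.3, 1.0

# Deviation of the effective mass from E0 at increasing time separations.
excess = {tau: effective_mass(tau, E0, DELTA_E, RATIO) - E0 for tau in (2, 10, 20)}
```

In this model the deviation from \(E_0\) shrinks roughly like \(e^{-\Delta E\,\tau }\), so a small gap or a restricted range of usable \(\tau \) leaves a visible contamination.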

Critical slowing down

The lattice spacings reached in recent simulations go down to 0.05 fm or even smaller. In this regime, long autocorrelation times slow down the sampling of the configurations [95–104]. Many groups check for autocorrelations in a number of observables, including the topological charge, for which a rapid growth of the autocorrelation time is observed with decreasing lattice spacing. This is often referred to as topological freezing. A solution to the problem consists in using open boundary conditions in time [105], instead of the more common antiperiodic ones. More recently two other approaches have been proposed, one based on a multiscale thermalization algorithm [106, 107] and another based on defining QCD on a nonorientable manifold [108]. The problem is also touched upon in Sect. 9.2.1, where it is stressed that attention must be paid to this issue. While large-scale simulations with open boundary conditions are already far advanced [109], only one result reviewed here has been obtained with any of the above methods (results for \(\alpha _s\) from Ref. [79] which use open boundary conditions). It is usually assumed that the continuum limit can be reached by extrapolation from the existing simulations, and that potential systematic errors due to the long autocorrelation times have been adequately controlled. Partially or completely frozen topology would produce a mixture of different \(\theta \) vacua, and the difference from the desired \(\theta =0\) result may be estimated in some cases using chiral perturbation theory, which gives predictions for the \(\theta \)-dependence of the physical quantity of interest [110, 111]. These ideas have been systematically and successfully tested in various models in Refs. [112, 113], and a numerical test on MILC ensembles indicates that the topology dependence for some of the physical quantities reviewed here is small, consistent with theoretical expectations [114].
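A check for autocorrelations of the kind mentioned above can be sketched as follows. The code estimates the integrated autocorrelation time of a time series as \(1/2\) plus the sum of the normalized autocorrelation function up to a fixed window. The function name, the fixed window and the synthetic AR(1) test series are assumptions made for illustration; practical analyses usually choose the summation window adaptively and apply the estimator to Monte Carlo histories of observables such as the topological charge.

```python
import numpy as np


def integrated_autocorrelation_time(series, window):
    """Estimate tau_int = 1/2 + sum_{t=1}^{window} rho(t) for a 1D series."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    c0 = np.dot(x, x) / n
    rho = [np.dot(x[: n - t], x[t:]) / ((n - t) * c0) for t in range(1, window + 1)]
    return 0.5 + float(np.sum(rho))


# Synthetic test data: white noise and an AR(1) process with coefficient phi.
rng = np.random.default_rng(0)
phi = 0.8
noise = rng.standard_normal(100_000)
series = np.empty_like(noise)
series[0] = noise[0]
for i in range(1, len(noise)):
    series[i] = phi * series[i - 1] + noise[i]

tau_white = integrated_autocorrelation_time(noise, window=40)
tau_correlated = integrated_autocorrelation_time(series, window=40)
```

For an infinitely long AR(1) series the exact value is \((1+\phi )/(2(1-\phi ))\), i.e., 4.5 for \(\phi =0.8\), while uncorrelated data give 0.5; the estimates fluctuate around these values. The effective number of independent measurements is reduced by a factor \(2\tau _{\mathrm {int}}\).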

Simulation algorithms and numerical errors

Most of the modern lattice-QCD simulations use exact algorithms such as those of Refs. [115, 116], which do not produce any systematic errors when exact arithmetic is available. In reality, one uses numerical calculations at double (or in some cases even single) precision, and some errors are unavoidable. More importantly, the inversion of the Dirac operator is carried out iteratively and it is truncated once some accuracy is reached, which is another source of potential systematic error. In most cases, these errors have been confirmed to be much less than the statistical errors. In the following we assume that this source of error is negligible. Some of the most recent simulations use an inexact algorithm in order to speed up the computation, though it may produce systematic effects. Currently available tests indicate that errors from the use of inexact algorithms are under control [117].
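The effect of truncating an iterative solver at a finite accuracy can be illustrated with a conjugate-gradient sketch. The code below assumes a small random symmetric positive-definite matrix as a stand-in for the (much larger) operator inverted in actual simulations; the function name, the matrix and the tolerances are hypothetical and serve only to show how the stopping criterion controls the residual error.

```python
import numpy as np


def conjugate_gradient(matrix, rhs, tol=1e-10, max_iter=1000):
    """Solve matrix @ x = rhs for a symmetric positive-definite matrix.

    Iteration stops once the relative residual norm drops below tol.
    Returns the approximate solution and the number of iterations used.
    """
    x = np.zeros_like(rhs)
    r = rhs - matrix @ x
    p = r.copy()
    rr = r @ r
    rhs_norm = np.sqrt(rhs @ rhs)
    for iteration in range(max_iter):
        if np.sqrt(rr) <= tol * rhs_norm:
            return x, iteration
        ap = matrix @ p
        alpha = rr / (p @ ap)
        x = x + alpha * p
        r = r - alpha * ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x, max_iter


# Stand-in problem: a well-conditioned random SPD matrix and a random source.
rng = np.random.default_rng(1)
m = rng.standard_normal((20, 20))
spd = m.T @ m + 20.0 * np.eye(20)
source = rng.standard_normal(20)

exact = np.linalg.solve(spd, source)
x_loose, n_loose = conjugate_gradient(spd, source, tol=1e-2)
x_tight, n_tight = conjugate_gradient(spd, source, tol=1e-10)
err_loose = np.linalg.norm(x_loose - exact)
err_tight = np.linalg.norm(x_tight - exact)
```

Comparing `err_loose` with `err_tight` shows the trade-off between cost and truncation error; in practice the tolerance is chosen so that this error stays well below the statistical one.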

2 Quality criteria, averaging and error estimation

The essential characteristics of our approach to the problem of rating and averaging lattice quantities have been outlined in our first publication [1]. Our aim is to help the reader assess the reliability of a particular lattice result without necessarily studying the original article in depth. This is a delicate issue, since the ratings may make things appear simpler than they are. Nevertheless, it safeguards against the common practice of using lattice results, and drawing physics conclusions from them, without a critical assessment of the quality of the various calculations. We believe that, despite the risks, it is important to provide some compact information about the quality of a calculation. We stress, however, the importance of the accompanying detailed discussion of the results presented in the various sections of the present review.

2.1 Systematic errors and colour code

The major sources of systematic error are common to most lattice calculations. These include, as discussed in detail below, the chiral, continuum, and infinite-volume extrapolations. To each such source of error for which systematic improvement is possible we assign one of three coloured symbols: green star, unfilled green circle (which replaced in Ref. [2] the amber disk used in the original FLAG review [1]) or red square. These correspond to the following ratings:

  • green star: the parameter values and ranges used to generate the data sets allow for a satisfactory control of the systematic uncertainties;

  • unfilled green circle: the parameter values and ranges used to generate the data sets allow for a reasonable attempt at estimating systematic uncertainties, which however could be improved;

  • red square: the parameter values and ranges used to generate the data sets are unlikely to allow for a reasonable control of systematic uncertainties.

The appearance of a red tag, even in a single source of systematic error of a given lattice result, disqualifies it from inclusion in the global average.
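The red-tag rule can be expressed as a simple filter. The sketch below uses a hypothetical data layout (a dictionary of ratings per result) and hypothetical result names; it encodes only the rule that a single red tag excludes a result. Passing the filter is a necessary condition for entering an average, not a sufficient one.

```python
GREEN_STAR = "green star"
GREEN_CIRCLE = "unfilled green circle"
RED_SQUARE = "red square"


def has_red_tag(ratings):
    """True if any source of systematic error is rated with a red square."""
    return any(symbol == RED_SQUARE for symbol in ratings.values())


def without_red_tags(results):
    """Names of the results that carry no red tag, in input order."""
    return [name for name, ratings in results.items() if not has_red_tag(ratings)]


# Hypothetical results and ratings, for illustration only.
results = {
    "Result A": {
        "chiral extrapolation": GREEN_STAR,
        "continuum extrapolation": GREEN_CIRCLE,
        "finite volume": GREEN_STAR,
    },
    "Result B": {
        "chiral extrapolation": GREEN_STAR,
        "continuum extrapolation": RED_SQUARE,
        "finite volume": GREEN_STAR,
    },
}

candidates = without_red_tags(results)
```

Here only "Result A" remains a candidate, since "Result B" carries a red tag in one source of systematic error.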

Note that in the first two editions [1, 2], FLAG used the three symbols in order to rate the reliability of the systematic errors attributed to a given result by the paper’s authors. Starting with the previous edition [3] the meaning of the symbols has changed slightly – they now rate the quality of a particular simulation, based on the values and range of the chosen parameters, and its aptness to obtain well-controlled systematic uncertainties. They do not rate the quality of the analysis performed by the authors of the publication. The latter question is deferred to the relevant sections of the present review, which contain detailed discussions of the results contributing (or not) to each FLAG average or estimate.

For most quantities the colour-coding system refers to the following sources of systematic errors: (i) chiral extrapolation; (ii) continuum extrapolation; (iii) finite volume. As we will see below, renormalization is another source of systematic uncertainties in several quantities. This we also classify using the three coloured symbols listed above, but now with a different rationale: they express how reliably these quantities are renormalized, from a field-theoretic point of view (namely, nonperturbatively, or with 2-loop or 1-loop perturbation theory).

Given the sophisticated status that the field has attained, several aspects, besides those rated by the coloured symbols, need to be evaluated before one can conclude whether a particular analysis leads to results that should be included in an average or estimate. Some of these aspects are not so easily expressible in terms of an adjustable parameter such as the lattice spacing, the pion mass or the volume. As a result of such considerations, it sometimes occurs, albeit rarely, that a given result does not contribute to the FLAG average or estimate, despite not carrying any red tags. This happens, for instance, whenever aspects of the analysis appear to be incomplete (e.g., an incomplete error budget), so that the presence of inadequately controlled systematic effects cannot be excluded. This mostly refers to results with a statistical error only, or results in which the quoted error budget obviously fails to account for an important contribution.

Of course, any colour coding has to be treated with caution; we emphasize that the criteria are subjective and evolving. Sometimes, a single source of systematic error dominates the systematic uncertainty and it is more important to reduce this uncertainty than to aim for green stars for other sources of error. In spite of these caveats, we hope that our attempt to introduce quality measures for lattice simulations will prove to be a useful guide. In addition, we would like to stress that the agreement of lattice results obtained using different actions and procedures provides further validation.

2.1.1 Systematic effects and rating criteria

The precise criteria used in determining the colour coding are unavoidably time-dependent; as lattice calculations become more accurate, the standards against which they are measured become tighter. For this reason FLAG reassesses criteria with each edition and as a result some of the quality criteria (the one on chiral extrapolation for instance) have been tightened up over time [1,2,3].

In the following, we present the rating criteria used in the current report. While these criteria apply to most quantities without modification, there are cases where they need to be amended or additional criteria need to be defined. For instance, when discussing results obtained in the \(\epsilon \)-regime of chiral perturbation theory in Sect. 5, the finite-volume criterion listed below for the p-regime is no longer appropriate (see footnote 6). Similarly, the discussion of the strong coupling constant in Sect. 9 requires tailored criteria for renormalization, perturbative behaviour, and continuum extrapolation. In such cases, the modified criteria are discussed in the respective sections. Apart from a few exceptions, the following colour code applies in the tables:

  • Chiral extrapolation:

    • green star: \(M_{\pi ,\mathrm {min}}< 200\) MeV, with three or more pion masses used in the extrapolation

      or two values of \(M_\pi \) with one lying within 10 MeV of 135 MeV (the physical neutral pion mass) and the other one below 200 MeV

    • unfilled green circle: 200 MeV \(\le M_{\pi ,{\mathrm {min}}} \le \) 400 MeV, with three or more pion masses used in the extrapolation

      or two values of \(M_\pi \) with \(M_{\pi ,{\mathrm {min}}}<\) 200 MeV

      or a single value of \(M_\pi \), lying within 10 MeV of 135 MeV (the physical neutral pion mass)

    • red square: otherwise

    This criterion has changed with respect to the previous edition [3].

  • Continuum extrapolation:

    • green star: at least three lattice spacings and at least two points below 0.1 fm and a range of lattice spacings satisfying \([a_{\mathrm {max}}/a_{\mathrm {min}}]^2 \ge 2\)

    • unfilled green circle: at least two lattice spacings and at least one point below 0.1 fm and a range of lattice spacings satisfying \([a_{\mathrm {max}}/a_{\mathrm {min}}]^2 \ge 1.4\)

    • red square: otherwise

    It is assumed that the lattice action is \({\mathcal {O}}(a)\)-improved (i.e., the discretization errors vanish quadratically with the lattice spacing); otherwise this will be explicitly mentioned. For unimproved actions an additional lattice spacing is required. This condition is unchanged from Ref. [3].

  • Finite-volume effects:

    The finite-volume colour code used for a result is chosen to be the worse of the QCD and the QED codes, as described below. If only QCD is used the QED colour code is ignored.

    – For QCD:

    • green star: \([M_{\pi ,\mathrm {min}} / M_{\pi ,\mathrm {fid}}]^2 \exp \{4-M_{\pi ,\mathrm {min}}[L(M_{\pi ,\mathrm {min}})]_{\mathrm {max}}\} < 1\), or at least three volumes

    • unfilled green circle: \([M_{\pi ,\mathrm {min}} / M_{\pi ,\mathrm {fid}}]^2 \exp \{3-M_{\pi ,\mathrm {min}}[L(M_{\pi ,\mathrm {min}})]_{\mathrm {max}}\} < 1\), or at least two volumes

    • red square: otherwise

    where we have introduced \([L(M_{\pi ,\mathrm {min}})]_{\mathrm {max}}\), which is the maximum box size used in the simulations performed at the smallest pion mass \(M_{\pi ,\mathrm{min}}\), as well as a fiducial pion mass \(M_{\pi ,\mathrm{fid}}\), which we set to 200 MeV (the cutoff value for a green star in the chiral extrapolation). It is assumed here that calculations are in the p-regime of chiral perturbation theory, and that all volumes used exceed 2 fm. This condition was improved between the second [2] and the third [3] editions of the FLAG review and has remained unchanged since. The rationale for this condition is as follows. Finite-volume effects contain the universal factor \(\exp \{- L~M_\pi \}\), and if this were the only contribution a criterion based on the values of \(M_{\pi ,\text {min}} L\) would be appropriate. This is what we used in Ref. [2] (with \(M_{\pi ,\text {min}} L>4\) for a green star and \(M_{\pi ,\text {min}} L>3\) for an unfilled green circle). However, as pion masses decrease, one must also account for the weakening of the pion couplings. In particular, 1-loop chiral perturbation theory [118] reveals a behaviour proportional to \(M_\pi ^2 \exp \{- L~M_\pi \}\). Our new condition includes this weakening of the coupling, and ensures, for example, that simulations with \(M_{\pi ,\mathrm {min}} = 135~\mathrm{MeV}\) and \(L~M_{\pi ,\mathrm {min}} = 3.2\) are rated equivalently to those with \(M_{\pi ,\mathrm {min}} = 200~\mathrm{MeV}\) and \(L~M_{\pi ,\mathrm {min}} = 4\).

    – For QED (where applicable):

    • green star: \(1/([M_{\pi ,\mathrm {min}}L(M_{\pi ,\mathrm {min}})]_{\mathrm {max}})^{n_{\mathrm {min}}}<0.02\), or at least four volumes

    • unfilled green circle: \(1/([M_{\pi ,\mathrm {min}}L(M_{\pi ,\mathrm {min}})]_{\mathrm {max}})^{n_{\mathrm {min}}}<0.04\), or at least three volumes

    • red square: otherwise

    Because of the infrared-singular structure of QED, electromagnetic finite-volume effects decay only like a power of the inverse spatial extent. In several cases, such as mass splittings [119, 120] or leptonic decays [121], the leading corrections are known to be universal, i.e., independent of the structure of the hadrons involved. In such cases, the leading universal effects can be subtracted exactly from the lattice data. We denote by \(n_{\mathrm {min}}\) the smallest power of \(\frac{1}{L}\) at which such a subtraction cannot be done. In the widely used finite-volume formulation \(\mathrm {QED}_L\), one always has \(n_{\mathrm {min}}\le 3\) due to the nonlocality of the theory [122]. While the QCD criteria have not changed with respect to Ref. [3], the QED criteria are new. They are used here only in Sect. 3.

  • Isospin breaking effects (where applicable):

    • green star: all leading isospin breaking effects are included in the lattice calculation

    • unfilled green circle: isospin breaking effects are included using the electro-quenched approximation

    • red square: otherwise

    This criterion is used for quantities that break isospin symmetry or that can be determined at sub-percent accuracy, where isospin breaking effects, if not included, are expected to be the dominant source of uncertainty. In the current edition, this criterion is only used for the up and down quark masses, and related quantities (\(\epsilon \), \(Q^2\) and \(R^2\)). The criteria for isospin breaking effects feature for the first time in the FLAG review.

  • Renormalization (where applicable):

    • green star: nonperturbative

    • unfilled green circle: 1-loop perturbation theory or higher with a reasonable estimate of truncation errors

    • red square: otherwise

    In Ref. [1], we assigned a red square to all results which were renormalized at 1-loop in perturbation theory. In Ref. [2], we decided that this was too restrictive, since the error arising from renormalization constants, calculated in perturbation theory at 1-loop, is often estimated conservatively and reliably. We have not changed these criteria since.

  • Renormalization Group (RG) running (where applicable):

    For scale-dependent quantities, such as quark masses or \(B_K\), it is essential that contact with continuum perturbation theory can be established. Various different methods are used for this purpose (cf. Appendix A.3): Regularization-independent Momentum Subtraction (RI/MOM), the Schrödinger functional, and direct comparison with (resummed) perturbation theory. Irrespective of the particular method used, the uncertainty associated with the choice of intermediate renormalization scales in the construction of physical observables must be brought under control. This is best achieved by performing comparisons between nonperturbative and perturbative running over a reasonably broad range of scales. These comparisons were initially only made in the Schrödinger functional approach, but are now also being performed in RI/MOM schemes. We mark the data for which information about nonperturbative running checks is available and give some details, but do not attempt to translate this into a colour code.
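As an illustration, the chiral- and continuum-extrapolation criteria listed above can be written as simple decision functions. The following is only a minimal sketch and not a tool used in this review: the names `Rating`, `rate_chiral`, `rate_continuum` and `passes_colour_code` are hypothetical, "within 10 MeV" is read as an inclusive bound, and the extra lattice spacing required for unimproved actions is assumed to apply to both the green-star and the green-circle thresholds.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Colour code, ordered so that a larger value is a worse rating."""
    STAR = 0    # green star
    CIRCLE = 1  # unfilled green circle
    SQUARE = 2  # red square


M_PI0_PHYS = 135.0  # MeV, physical neutral pion mass quoted in the text


def rate_chiral(pion_masses_mev):
    """Rate the chiral extrapolation from the simulated pion masses (MeV)."""
    masses = sorted(pion_masses_mev)
    if not masses:
        raise ValueError("at least one pion mass is required")
    n, m_min = len(masses), masses[0]
    # Assumption: "within 10 MeV" means |M - 135 MeV| <= 10 MeV.
    near_phys = any(abs(m - M_PI0_PHYS) <= 10.0 for m in masses)
    if m_min < 200.0 and n >= 3:
        return Rating.STAR
    if n == 2 and near_phys and masses[1] < 200.0:
        return Rating.STAR
    if 200.0 <= m_min <= 400.0 and n >= 3:
        return Rating.CIRCLE
    if n == 2 and m_min < 200.0:
        return Rating.CIRCLE
    if n == 1 and near_phys:
        return Rating.CIRCLE
    return Rating.SQUARE


def rate_continuum(spacings_fm, improved=True):
    """Rate the continuum extrapolation from the lattice spacings (fm)."""
    a = sorted(spacings_fm)
    if not a:
        raise ValueError("at least one lattice spacing is required")
    n_fine = sum(1 for x in a if x < 0.1)
    ratio_sq = (a[-1] / a[0]) ** 2
    # Assumption: an unimproved action needs one more spacing in each tier.
    extra = 0 if improved else 1
    if len(a) >= 3 + extra and n_fine >= 2 and ratio_sq >= 2.0:
        return Rating.STAR
    if len(a) >= 2 + extra and n_fine >= 1 and ratio_sq >= 1.4:
        return Rating.CIRCLE
    return Rating.SQUARE


def passes_colour_code(ratings):
    """A single red square excludes a result from the global average."""
    return all(r != Rating.SQUARE for r in ratings)
```

Passing the colour code is necessary but not sufficient for inclusion in an average, since aspects such as the completeness of the error budget are assessed separately.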

The pion mass plays an important role in the criteria relevant for chiral extrapolation and finite volume. For some of the regularizations used, however, it is not a trivial matter to identify this mass. In the case of twisted-mass fermions, discretization effects give rise to a mass difference between charged and neutral pions even when the up- and down-quark masses are equal: the charged pion is found to be the heavier of the two for twisted-mass Wilson fermions (cf. Ref. [123]). In early works, typically referring to \(N_f=2\) simulations (e.g., Refs. [123] and [48]), chiral extrapolations are based on chiral perturbation theory formulae which do not take these regularization effects into account. After the importance of accounting for isospin breaking when doing chiral fits was shown in Ref. [124], later works, typically referring to \(N_f=2\,+\,1\,+\,1\) simulations, have taken these effects into account [9]. We use \(M_{\pi ^\pm }\) for \(M_{\pi ,\mathrm {min}}\) in the chiral-extrapolation rating criterion. On the other hand, we identify \(M_{\pi ,\mathrm {min}}\) with the root mean square (RMS) of \(M_{\pi ^+}\), \(M_{\pi ^-}\) and \(M_{\pi ^0}\) in the finite-volume rating criterion (see footnote 7).

In the case of staggered fermions, discretization effects give rise to several light states with the quantum numbers of the pion (see footnote 8). The mass splitting among these “taste” partners represents a discretization effect of \({\mathcal {O}}(a^2)\), which can be significant at large lattice spacings but shrinks as the spacing is reduced. In the discussion of the results obtained with staggered quarks given in the following sections, we assume that these artifacts are under control. We conservatively identify \(M_{\pi ,\mathrm {min}}\) with the root mean square (RMS) average of the masses of all the taste partners, both for chiral-extrapolation and finite-volume criteria (see footnote 9).
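To make the role of the RMS pion mass in the finite-volume criteria concrete, the sketch below evaluates the QCD and QED conditions of Sect. 2.1.1 for a given \(M_{\pi ,\mathrm {min}}\) and box size. It is a hedged illustration only: the function names are hypothetical, the RMS is taken as an unweighted mean over the masses supplied (multiplicities would have to be encoded by repeating entries), and the conversion constant \(\hbar c = 197.327\) MeV fm is used to form the dimensionless product \(M_\pi L\).

```python
import math

HBARC = 197.327   # MeV fm, converts M[MeV] * L[fm] to the dimensionless M L
M_PI_FID = 200.0  # MeV, fiducial pion mass from the text
ORDER = ("star", "circle", "square")  # best to worst


def rms_mass(masses_mev):
    """Unweighted root mean square of a list of pion masses (MeV)."""
    return math.sqrt(sum(m * m for m in masses_mev) / len(masses_mev))


def fv_measure(m_pi_min, l_max_fm, offset):
    """[M_min / M_fid]^2 exp{offset - M_min L_max} for the QCD criterion."""
    ml = m_pi_min * l_max_fm / HBARC
    return (m_pi_min / M_PI_FID) ** 2 * math.exp(offset - ml)


def rate_fv_qcd(m_pi_min, l_max_fm, n_volumes=1):
    """QCD finite-volume rating; assumes p-regime and all L > 2 fm."""
    if fv_measure(m_pi_min, l_max_fm, 4.0) < 1.0 or n_volumes >= 3:
        return "star"
    if fv_measure(m_pi_min, l_max_fm, 3.0) < 1.0 or n_volumes >= 2:
        return "circle"
    return "square"


def rate_fv_qed(ml_max, n_min, n_volumes=1):
    """QED finite-volume rating from the largest M L and the power n_min."""
    x = 1.0 / ml_max ** n_min
    if x < 0.02 or n_volumes >= 4:
        return "star"
    if x < 0.04 or n_volumes >= 3:
        return "circle"
    return "square"


def rate_fv(qcd, qed=None):
    """Overall code: the worse of the QCD and QED codes (QED optional)."""
    if qed is None:
        return qcd
    return max(qcd, qed, key=ORDER.index)


# Hypothetical taste-partner masses (MeV), for illustration only.
m_rms = rms_mass([220.0, 260.0, 290.0, 310.0])
code = rate_fv(rate_fv_qcd(m_rms, l_max_fm=4.0), rate_fv_qed(4.0, n_min=3))
```

With these definitions, `fv_measure` with offset 4 evaluates to approximately 1 both for \(M_{\pi ,\mathrm {min}} = 135\) MeV with \(L M_{\pi ,\mathrm {min}} = 3.2\) and for \(M_{\pi ,\mathrm {min}} = 200\) MeV with \(L M_{\pi ,\mathrm {min}} = 4\), which is the equivalence referred to in the rationale for the QCD criterion.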

The strong coupling \(\alpha _s\) is computed in lattice QCD with methods differing substantially from those used in the calculations of the other quantities discussed in this review. Therefore, we have established separate criteria for \(\alpha _s\) results, which will be discussed in Sect. 9.2.1.

In the new section on nuclear matrix elements, Sect. 10, an additional criterion has been introduced. This concerns the level of control over contamination from excited states, which is a more challenging issue for nucleons than for mesons. In addition, the chiral-extrapolation criterion in this section is somewhat stricter than that given above.

2.1.2 Heavy-quark actions

For the b quark, the discretization of the heavy-quark action follows a very different approach from that used for light flavours. There are several different methods for treating heavy quarks on the lattice, each with its own issues and considerations. Most of these methods use Effective Field Theory (EFT) at some point in the computation, either via direct simulation of the EFT, or by using EFT as a tool to estimate the size of cutoff errors, or by using EFT to extrapolate from the simulated lattice quark masses up to the physical b-quark mass. Because of the use of an EFT, truncation errors must be considered together with discretization errors.

The charm quark lies at an intermediate point between the heavy and light quarks. In our earlier reviews, the calculations involving charm quarks often treated it using one of the approaches adopted for the b quark. Since the last report [3], however, more recent calculations have simulated the charm quark using light-quark actions. This has become possible thanks to the increasing availability of dynamical gauge field ensembles with fine lattice spacings. But clearly, when charm quarks are treated relativistically, discretization errors are more severe than those of the corresponding light-quark quantities.

In order to address these complications, we add a new heavy-quark treatment category to the rating system. The purpose of this criterion is to provide a guideline for the level of action and operator improvement needed in each approach to make reliable calculations possible, in principle.

A description of the different approaches to treating heavy quarks on the lattice is given in Appendix A.1.3, including a discussion of the associated discretization, truncation, and matching errors. For truncation errors we use HQET power counting throughout, since this review is focused on heavy-quark quantities involving B and D mesons rather than bottomonium or charmonium quantities. Here we describe the criteria for how each approach must be implemented in order to receive an acceptable rating for both the heavy-quark actions and the weak operators. Heavy-quark implementations without the level of improvement described below are rated not acceptable. The matching is evaluated together with renormalization, using the renormalization criteria described in Sect. 2.1.1. We emphasize that the heavy-quark implementations rated as acceptable and described below have been validated in a variety of ways, such as via phenomenological agreement with experimental measurements, consistency between independent lattice calculations, and numerical studies of truncation errors. These tests are summarized in Sect. 8.

Relativistic heavy-quark actions

   at least tree-level \({\mathcal {O}}(a)\) improved action and weak operators

This is similar to the requirements for light-quark actions. All current implementations of relativistic heavy-quark actions satisfy this criterion.

NRQCD

   tree-level matched through \({\mathcal {O}}(1/m_h)\) and improved through \({\mathcal {O}}(a^2)\)

The current implementations of NRQCD satisfy this criterion, and also include tree-level corrections of \({\mathcal {O}}(1/m_h^2)\) in the action.

HQET

  tree-level matched through \({\mathcal {O}}(1/m_h)\) with discretization errors starting at \({\mathcal {O}}(a^2)\)

The current implementation of HQET by the ALPHA collaboration satisfies this criterion, since both action and weak operators are matched nonperturbatively through \({\mathcal {O}}(1/m_h)\). Calculations that exclusively use a static-limit action do not satisfy this criterion, since the static-limit action, by definition, does not include \(1/m_h\) terms. We therefore include static computations in our final estimates only if truncation errors (in \(1/m_h\)) are discussed and included in the systematic uncertainties.

Light-quark actions for heavy quarks

  discretization errors starting at \({\mathcal {O}}(a^2)\) or higher

This applies to calculations that use the tmWilson action, a nonperturbatively improved Wilson action, domain wall fermions or the HISQ action for charm-quark quantities. It also applies to calculations that use these light quark actions in the charm region and above together with either the static limit or with an HQET-inspired extrapolation to obtain results at the physical b-quark mass. In these cases, the continuum-extrapolation criteria described earlier must be applied to the entire range of heavy-quark masses used in the calculation.

2.1.3 Conventions for the figures

For a coherent assessment of the present situation, the quality of the data plays a key role, but the colour coding cannot be carried over to the figures. On the other hand, simply showing all data on equal footing might give the misleading impression that the overall consistency of the information available on the lattice is questionable. Therefore, in the figures we indicate the quality of the data in a rudimentary way, using the following symbols:

  • corresponds to results included in the average or estimate (i.e., results that contribute to the black square below);

  • corresponds to results that are not included in the average but pass all quality criteria;

  • corresponds to all other results;

  • corresponds to FLAG averages or estimates; they are also highlighted by a gray vertical band.

The reason for not including a given result in the average is not always the same: the result may fail one of the quality criteria; the paper may be unpublished; it may be superseded by newer results; or it may not offer a complete error budget.

Symbols other than squares are used to distinguish results with specific properties and are always explained in the caption (see footnote 10).

Often, nonlattice data are also shown in the figures for comparison. For these we use the following symbols:

  • corresponds to nonlattice results;

  • corresponds to Particle Data Group (PDG) results.

2.2 Averages and estimates

FLAG results of a given quantity are denoted either as averages or as estimates. Here we clarify this distinction. To start with, both averages and estimates are based on results without any red tags in their colour coding. For many observables there are enough independent lattice calculations of good quality, with all sources of error (not merely those related to the colour-coded criteria), as analyzed in the original papers, appearing to be under control. In such cases, it makes sense to average these results and propose such an average as the best current lattice number. The averaging procedure applied to these data and the way the error is obtained are explained in detail in Sect. 2.3. In those cases where only a sole result passes our rating criteria (colour coding), we refer to it as our FLAG average, provided it also displays adequate control of all other sources of systematic uncertainty.

On the other hand, there are some cases in which this procedure leads to a result that, in our opinion, does not cover all uncertainties. Systematic errors are by their nature often subjective and difficult to estimate, and may thus end up being underestimated in one or more results that receive green symbols for all explicitly tabulated criteria. Adopting a conservative policy, in these cases we opt for an estimate (or a range), which we consider as a fair assessment of the knowledge acquired on the lattice at present. This estimate is not obtained with a prescribed mathematical procedure, but reflects what we consider the best possible analysis of the available information. The hope is that this will encourage more detailed investigations by the lattice community.

There are two other important criteria that also play a role in this respect, but that cannot be colour coded, because a systematic improvement is not possible. These are: i) the publication status, and ii) the number of sea-quark flavours \(N_{ f}\). As far as the former criterion is concerned, we adopt the following policy: we average only results that have been published in peer-reviewed journals, i.e., they have been endorsed by referee(s). The only exception to this rule consists in straightforward updates of previously published results, typically presented in conference proceedings. Such updates, which supersede the corresponding results in the published papers, are included in the averages. Note that updates of earlier results rely, at least partially, on the same gauge-field-configuration ensembles. For this reason, we do not average updates with earlier results. Nevertheless, all results are listed in the tables (see footnote 11), and their publication status is identified by the following symbols:

  • Publication status:

    A   published or plain update of published results

    P   preprint

    C   conference contribution

In the present edition, the publication status on the 30th of September 2018 is relevant. If the paper appeared in print after that date, this is accounted for in the bibliography, but does not affect the averages (see footnote 12).

As noted above, in this review we present results from simulations with \(N_f=2\), \(N_f=2\,+\,1\) and \(N_f=2\,+\,1\,+\,1\) (except for \( r_0 \Lambda _{\overline{\mathrm {MS}}}\) where we also give the \(N_f=0\) result). We are not aware of an a priori way to quantitatively estimate the difference between results produced in simulations with a different number of dynamical quarks. We therefore average results at fixed \(N_{ f}\) separately; averages of calculations with different \(N_{ f}\) are not provided.

To date, no significant differences between results with different values of \(N_f\) have been observed in the quantities listed in Tables 1, 2, and 3. In the future, as the accuracy and the control over systematic effects in lattice calculations increases, it will hopefully be possible to see a difference between results from simulations with \(N_{ f}= 2\) and \(N_{ f}= 2 + 1\), and thus determine the size of the Zweig-rule violations related to strange-quark loops. This is a very interesting issue per se, and one which can be quantitatively addressed only with lattice calculations.

The question of differences between results with \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\) is more subtle. The dominant effect of including the charm sea quark is to shift the lattice scale, an effect that is accounted for by fixing this scale nonperturbatively using physical quantities. For most of the quantities discussed in this review, it is expected that residual effects are small in the continuum limit, suppressed by \(\alpha _s(m_c)\) and powers of \(\Lambda ^2/m_c^2\). Here \(\Lambda \) is a hadronic scale that can only be roughly estimated and depends on the process under consideration. Note that the \(\Lambda ^2/m_c^2\) effects have been addressed in Refs. [130, 131]. Assuming that such effects are small, it might be reasonable to average the results from \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\) simulations.

2.3 Averaging procedure and error analysis

In the present report, we repeatedly average results obtained by different collaborations, and estimate the error on the resulting averages. Here we provide details on how averages are obtained.

2.3.1 Averaging: generic case

We follow the procedure of the previous two editions [2, 3], which we describe here in full detail.

One of the problems arising when forming averages is that not all of the data sets are independent. In particular, the same gauge-field configurations, produced with a given fermion discretization, are often used by different research teams with different valence-quark lattice actions, obtaining results that are not really independent. Our averaging procedure takes such correlations into account.

Consider a given measurable quantity Q, measured by M distinct, not necessarily uncorrelated, numerical experiments (simulations). The result of each of these measurements is expressed as

$$\begin{aligned} Q_i = x_i \, \pm \, \sigma ^{(1)}_i \pm \, \sigma ^{(2)}_i \pm \cdots \pm \, \sigma ^{(E)}_i , \end{aligned}$$
(1)

where \(x_i\) is the value obtained by the \(i{\mathrm{th}}\) experiment (\(i = 1, \ldots , M\)) and \(\sigma ^{(k)}_i\) (for \(k = 1, \ldots , E\)) are the various errors. Typically \(\sigma ^{(1)}_i\) stands for the statistical error and \(\sigma ^{(\alpha )}_i\) (\(\alpha \ge 2\)) are the different systematic errors from various sources. For each individual result, we estimate the total error \(\sigma _i \) by adding statistical and systematic errors in quadrature:

$$\begin{aligned} Q_i= & {} x_i \, \pm \, \sigma _i , \nonumber \\ \sigma _i\equiv & {} \sqrt{\sum _{\alpha =1}^E \Big [\sigma ^{(\alpha )}_i \Big ]^2} . \end{aligned}$$
(2)

With the weight factor of each total error estimated in standard fashion,

$$\begin{aligned} \omega _i = \dfrac{\sigma _i^{-2}}{\sum _{i=1}^M \sigma _i^{-2}} , \end{aligned}$$
(3)

the central value of the average over all simulations is given by

$$\begin{aligned} x_{\mathrm{av}}= & {} \sum _{i=1}^M x_i\, \omega _i . \end{aligned}$$
(4)

The above central value corresponds to a \(\chi _{\mathrm{min}}^2\) weighted average, evaluated by adding statistical and systematic errors in quadrature. If the fit is not of good quality (\(\chi _\mathrm{min}^2/\mathrm{dof} > 1\)), the statistical and systematic error bars are stretched by a factor \(S = \sqrt{\chi ^2/\mathrm{dof}}\).
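Equations (2)–(4) and the stretching factor can be illustrated with a short numerical sketch. The function name `weighted_average` is hypothetical, and the \(\chi ^2\) used here is an assumption: it is computed without correlations, as \(\sum _i (x_i - x_{\mathrm{av}})^2/\sigma _i^2\), with \(M-1\) degrees of freedom.

```python
import math


def weighted_average(values, error_budgets):
    """Inverse-variance weighted average of M results.

    values        : central values x_i
    error_budgets : for each result, the list of errors sigma_i^(alpha)
    Returns (x_av, total errors sigma_i, weights omega_i, stretch factor S).
    """
    # Eq. (2): total error from all sources added in quadrature
    sigmas = [math.sqrt(sum(e * e for e in b)) for b in error_budgets]
    # Eq. (3): normalized inverse-variance weights
    norm = sum(s ** -2 for s in sigmas)
    weights = [s ** -2 / norm for s in sigmas]
    # Eq. (4): central value of the average
    x_av = sum(w * x for w, x in zip(weights, values))
    # Assumption: uncorrelated chi^2 with M - 1 degrees of freedom
    chi2 = sum(((x - x_av) / s) ** 2 for x, s in zip(values, sigmas))
    dof = len(values) - 1
    stretch = math.sqrt(chi2 / dof) if dof > 0 and chi2 > dof else 1.0
    return x_av, sigmas, weights, stretch
```

A stretch factor of 1 is returned when the fit quality is acceptable, so that multiplying the error bars by it leaves them unchanged in that case.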

Next, we examine error budgets for individual calculations and look for potentially correlated uncertainties. Specific problems encountered in connection with correlations between different data sets are described in the text that accompanies the averaging. If there is reason to believe that a source of error is correlated between two calculations, a \(100\%\) correlation is assumed. The correlation matrix \(C_{ij}\) for the set of correlated lattice results is estimated by a prescription due to Schmelling [132]. This consists in defining

$$\begin{aligned} \sigma _{i;j} = \sqrt{{\sum _{\alpha }}^\prime \Big [ \sigma _i^{(\alpha )} \Big ]^2 } , \end{aligned}$$
(5)

with \(\sum _{\alpha }^\prime \) running only over those errors of \(x_i\) that are correlated with the corresponding errors of the measurement \(x_j\). This expresses the part of the uncertainty in \(x_i\) that is correlated with the uncertainty in \(x_j\). If no such correlations are known to exist, then we take \(\sigma _{i;j} =0\). The diagonal and off-diagonal elements of the correlation matrix are then taken to be

$$\begin{aligned} C_{ii}= & {} \sigma _i^2 \quad (i = 1, \ldots , M) , \nonumber \\ C_{ij}= & {} \sigma _{i;j} \, \sigma _{j;i} \quad (i \ne j) . \end{aligned}$$
(6)

Finally, the error of the average is estimated by

$$\begin{aligned} \sigma ^2_{\mathrm{av}} = \sum _{i=1}^M \sum _{j=1}^M \omega _i \,\omega _j \,C_{ij}\,\,, \end{aligned}$$
(7)

and the FLAG average is

$$\begin{aligned} Q_{\mathrm{av}} = x_{\mathrm{av}} \, \pm \, \sigma _{\mathrm{av}} . \end{aligned}$$
(8)
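The prescription of Eqs. (5)–(8) can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code used for this review: each error budget is represented as a dictionary from a source label to its size, and two errors are treated as \(100\%\) correlated exactly when they carry the same label (the labels and the function name `flag_average` are hypothetical).

```python
import math


def flag_average(results):
    """Average with Schmelling-type correlations, Eqs. (2)-(8).

    results : list of (x_i, budget_i), where budget_i maps an error-source
              label to sigma_i^(alpha). Sources sharing a label across two
              results are assumed to be 100% correlated.
    Returns (x_av, sigma_av).
    """
    xs = [x for x, _ in results]
    budgets = [b for _, b in results]
    m = len(xs)

    # Eqs. (2)-(4): total variances, weights and central value
    var = [sum(s * s for s in b.values()) for b in budgets]
    norm = sum(1.0 / v for v in var)
    w = [1.0 / (v * norm) for v in var]
    x_av = sum(wi * xi for wi, xi in zip(w, xs))

    def sigma_corr(i, j):
        """Eq. (5): part of the error of x_i correlated with x_j."""
        shared = budgets[i].keys() & budgets[j].keys()
        return math.sqrt(sum(budgets[i][a] ** 2 for a in shared))

    # Eq. (6): diagonal and off-diagonal elements of C_ij
    c = [[var[i] if i == j else sigma_corr(i, j) * sigma_corr(j, i)
          for j in range(m)] for i in range(m)]

    # Eq. (7): error of the average
    var_av = sum(w[i] * w[j] * c[i][j] for i in range(m) for j in range(m))
    return x_av, math.sqrt(var_av)
```

For two results with equal unit errors, a fully shared error source gives \(\sigma _{\mathrm{av}} = 1\), whereas independent sources give \(\sigma _{\mathrm{av}} = 1/\sqrt{2}\), which shows how the correlations prevent an artificial reduction of the error.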

2.3.2 Nested averaging

We have encountered one case where the correlations between results are more involved, and a nested averaging scheme is required. This concerns the B-meson bag parameters discussed in Sect. 8.2. In the following, we describe the details of the nested averaging scheme. This is an updated version of the section added in the web update of the FLAG 16 report.

The issue arises for a quantity Q that is given by a ratio, \(Q=Y/Z\). In most simulations, both Y and Z are calculated, and the error in Q can be obtained in each simulation in the standard way. However, in other simulations only Y is calculated, with Z taken from a global average of some type. The issue to be addressed is that this average value \({\overline{Z}}\) has errors that are correlated with those in Q.

In the example that arises in Sect. 8.2, \(Q=B_B\), \(Y=B_B f_B^2\) and \(Z=f_B^2\). In one of the simulations that contribute to the average, Z is replaced by \({\overline{Z}}\), the PDG average for \(f_B^2\) [133] (obtained with an averaging procedure similar to that used by FLAG). This simulation is labeled with \(i=1\), so that

$$\begin{aligned} Q_1 = \frac{Y_1}{{\overline{Z}}}. \end{aligned}$$
(9)

The other simulations have results labeled \(Q_j\), with \(j\ge 2\). In this setup, the issue is that \({\overline{Z}}\) is correlated with the \(Q_j\), \(j\ge 2\) (see footnote 13).

We begin by decomposing the error in \(Q_1\) in the same schematic form as above,

$$\begin{aligned} Q_1 = x_1 \pm \frac{\sigma _{Y_1}^{(1)}}{{\overline{Z}}} \pm \frac{\sigma _{Y_1}^{(2)}}{{\overline{Z}}} \pm \cdots \pm \frac{\sigma _{Y_1}^{(E)}}{{\overline{Z}}} \pm \frac{Y_1 \sigma _{{\overline{Z}}}}{{\overline{Z}}^2}. \end{aligned}$$
(10)

Here the last term represents the error propagating from that in \({\overline{Z}}\), while the others arise from errors in \(Y_1\). For the remaining \(Q_j\) (\(j\ge 2\)) the decomposition is as in Eq. (1). The total error of \(Q_1\) then reads

$$\begin{aligned} \sigma _1^2= & {} \left( \frac{\sigma _{Y_1}^{(1)}}{{\overline{Z}}}\right) ^2 + \left( \frac{\sigma _{Y_1}^{(2)}}{{\overline{Z}}}\right) ^2 +\cdots + \left( \frac{\sigma _{Y_1}^{(E)}}{{\overline{Z}}}\right) ^2\nonumber \\&+ \left( \frac{Y_1}{{\overline{Z}}^2}\right) ^2 \sigma _{{\overline{Z}}}^2, \end{aligned}$$
(11)

while that for the \(Q_j\) (\(j\ge 2\)) is

$$\begin{aligned} \sigma _j^2 = \left( \sigma _j^{(1)}\right) ^2 + \left( \sigma _j^{(2)}\right) ^2 +\cdots + \left( \sigma _j^{(E)}\right) ^2. \end{aligned}$$
(12)
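The decomposition in Eqs. (10) and (11) is ordinary linear error propagation for a ratio, which the following minimal sketch makes explicit. The function name `ratio_error` and its argument names are hypothetical.

```python
import math


def ratio_error(y1, y1_errors, z_bar, sigma_z_bar):
    """Error budget of Q_1 = Y_1 / Zbar, following Eqs. (10)-(11).

    y1_errors   : the individual errors sigma_{Y_1}^(alpha) of Y_1
    sigma_z_bar : the error of the external average Zbar
    Returns (x_1, list of error terms of Q_1, total error sigma_1).
    """
    x1 = y1 / z_bar
    # Errors of Y_1, each rescaled by 1 / Zbar
    terms = [s / z_bar for s in y1_errors]
    # Error propagated from Zbar: Y_1 * sigma_Zbar / Zbar^2
    terms.append(y1 * sigma_z_bar / z_bar ** 2)
    # Eq. (11): all terms added in quadrature
    sigma1 = math.sqrt(sum(t * t for t in terms))
    return x1, terms, sigma1
```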

Correlations between \(Q_j\) and \(Q_k\) (\(j,k\ge 2\)) are taken care of by Schmelling’s prescription, as explained above. What is new here is how the correlations between \(Q_1\) and \(Q_j\) (\(j\ge 2\)) are taken into account.

To proceed, we recall from Eq. (7) that \(\sigma _{{\overline{Z}}}\) is given by

$$\begin{aligned} \sigma _{{\overline{Z}}}^2 = \sum _{{i'},{j'}=1}^{M'} \omega [Z]_{i'} \omega [Z]_{j'} C[Z]_{i'j'}. \end{aligned}$$
(13)

Here the indices \(i'\) and \(j'\) run over the \(M'\) simulations that contribute to \({\overline{Z}}\), which, in general, are different from those contributing to the results for Q. The weights \(\omega [Z]\) and correlation matrix C[Z] are given an explicit argument Z to emphasize that they refer to the calculation of this quantity and not to that of Q. C[Z] is calculated using the Schmelling prescription [Eqs. (5)–(7)] in terms of the errors, \(\sigma [Z]_{i'}^{(\alpha )}\), taking into account the correlations between the different calculations of Z.

We now generalize Schmelling’s prescription for \(\sigma _{i;j}\), Eq. (5), to that for \(\sigma _{1;k}\) (\(k\ge 2\)), i.e., the part of the error in \(Q_1\) that is correlated with \(Q_k\). We take

$$\begin{aligned} \sigma _{1;k} = \sqrt{ \frac{1}{{\overline{Z}}^2} \sum ^\prime _{(\alpha )\leftrightarrow k} \Big [\sigma _{Y_1}^{(\alpha )} \Big ]^2 + \frac{Y_1^2}{{\overline{Z}}^4} \sum _{i',j'}^{M'} \omega [Z]_{i'} \omega [Z]_{j'} C[Z]_{i'j'\leftrightarrow k} }. \nonumber \\ \end{aligned}$$
(14)

The first term under the square root sums those sources of error in \(Y_1\) that are correlated with \(Q_k\). Here we are using a more explicit notation than that in Eq. (5), with \((\alpha ) \leftrightarrow k\) indicating that the sum is restricted to the values of \(\alpha \) for which the error \(\sigma _{Y_1}^{(\alpha )}\) is correlated with \(Q_k\). The second term accounts for the correlations of the results entering \({\overline{Z}}\) with \(Q_k\), and is the nested part of the present scheme. The new matrix \(C[Z]_{i'j'\leftrightarrow k}\) is a restriction of the full correlation matrix C[Z], and is defined as follows. Its diagonal elements are given by

$$\begin{aligned} C[Z]_{i'i'\leftrightarrow k}= & {} (\sigma [Z]_{i'\leftrightarrow k})^2 \quad (i' = 1, \ldots , M') , \end{aligned}$$
(15)
$$\begin{aligned} (\sigma [Z]_{i'\leftrightarrow k})^2= & {} \sum ^\prime _{(\alpha )\leftrightarrow k} (\sigma [Z]_{i'}^{(\alpha )})^2, \end{aligned}$$
(16)

where the summation \(\sum ^\prime _{(\alpha )\leftrightarrow k}\) over \((\alpha )\) is restricted to those \(\sigma [Z]_{i'}^{(\alpha )}\) that are correlated with \(Q_k\). The off-diagonal elements are

$$\begin{aligned} C[Z]_{i'j'\leftrightarrow k}= & {} \sigma [Z]_{i';j'\leftrightarrow k} \, \sigma [Z]_{j';i'\leftrightarrow k} \quad (i' \ne j') , \end{aligned}$$
(17)
$$\begin{aligned} \sigma [Z]_{i';j'\leftrightarrow k}= & {} \sqrt{ \sum ^\prime _{(\alpha )\leftrightarrow j'k} \left( \sigma [Z]_{i'}^{(\alpha )}\right) ^2}, \end{aligned}$$
(18)

where the summation \(\sum ^\prime _{(\alpha )\leftrightarrow j'k}\) over \((\alpha )\) is restricted to \(\sigma [Z]_{i'}^{(\alpha )}\) that are correlated with both \(Z_{j'}\) and \(Q_k\).

The last quantity that we need to define is \(\sigma _{k;1}\).

$$\begin{aligned} \sigma _{k;1} = \sqrt{\sum ^\prime _{(\alpha )\leftrightarrow 1} \Big [ \sigma _k^{(\alpha )} \Big ]^2 } , \end{aligned}$$
(19)

where the summation \(\sum ^\prime _{(\alpha )\leftrightarrow 1}\) is restricted to those \(\sigma _k^{(\alpha )}\) that are correlated with one of the terms in Eq. (11).

In summary, we construct the correlation matrix \(C_{ij}\) using Eq. (6), as in the generic case, except that the expressions for \(\sigma _{1;k}\) and \(\sigma _{k;1}\) are now given by Eqs. (14) and (19), respectively. All other \(\sigma _{i;j}\) are given by the original Schmelling prescription, Eq. (5). In this way we extend the philosophy of Schmelling’s approach while accounting for the more involved correlations.
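The construction of Eqs. (14)–(18) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the list-based layout of the error budgets and the boolean correlation flags are assumptions made for the example, not part of the FLAG procedure as implemented.

```python
import math

def restricted_matrix(errors, with_k, with_z):
    """Restricted matrix C[Z]_{i'j'<->k} of Eqs. (15)-(18).

    errors[i][a]    : error component a of the i-th calculation of Z
    with_k[i][a]    : True if that component is correlated with Q_k
    with_z[i][j][a] : True if it is correlated with the j-th calculation of Z
                      (the entries with i == j are never read)
    """
    m = len(errors)

    def sigma(i, j):
        # Eq. (18): components of Z_i correlated with both Z_j and Q_k
        return math.sqrt(sum(e * e for a, e in enumerate(errors[i])
                             if with_k[i][a] and with_z[i][j][a]))

    c = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # Eqs. (15)-(16): components of Z_i correlated with Q_k
        c[i][i] = sum(e * e for a, e in enumerate(errors[i]) if with_k[i][a])
        for j in range(m):
            if i != j:
                c[i][j] = sigma(i, j) * sigma(j, i)  # Eq. (17)
    return c

def sigma_1k(y1, zbar, y1_errors_with_k, weights, c_k):
    """Eq. (14): part of the error of Q_1 that is correlated with Q_k.

    y1_errors_with_k : the components of the error of Y_1 correlated with Q_k
    weights          : the weights omega[Z] of the average of Z
    c_k              : output of restricted_matrix
    """
    m = len(weights)
    term_y = sum(e * e for e in y1_errors_with_k) / zbar ** 2
    quad = sum(weights[i] * weights[j] * c_k[i][j]
               for i in range(m) for j in range(m))
    return math.sqrt(term_y + y1 ** 2 / zbar ** 4 * quad)
```

With a single calculation of Z (\(M'=1\), unit weight) the second term reduces to \(Y_1^2/{\overline{Z}}^4\) times the squared sum of the errors of Z that are correlated with \(Q_k\), which is a quick sanity check of the sketch.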

3 Quark masses

Authors: T. Blum, A. Portelli, A. Ramos

Quark masses are fundamental parameters of the Standard Model. An accurate determination of these parameters is important for both phenomenological and theoretical applications. The bottom- and charm-quark masses, for instance, are important sources of parametric uncertainties in several Higgs decay modes. The up-, down- and strange-quark masses govern the amount of explicit chiral symmetry breaking in QCD. From a theoretical point of view, the values of quark masses provide information about the flavour structure of physics beyond the Standard Model. The Review of Particle Physics of the Particle Data Group contains a review of quark masses [134], which covers light as well as heavy flavours. Here we also consider light- and heavy-quark masses, but focus on lattice results and discuss them in more detail. We do not discuss the top quark, however, because it decays weakly before it can hadronize, and the nonperturbative QCD dynamics described by present day lattice simulations is not relevant. The lattice determination of light- (up, down, strange), charm- and bottom-quark masses is considered below in Sects. 3.1, 3.2, and 3.3, respectively.

Quark masses cannot be measured directly in experiment because quarks cannot be isolated, as they are confined inside hadrons. From a theoretical point of view, in QCD with \(N_f\) flavours, a precise definition of quark masses requires one to choose a particular renormalization scheme. This renormalization procedure introduces a renormalization scale \(\mu \), and quark masses depend on this renormalization scale according to the Renormalization Group (RG) equations. In mass-independent renormalization schemes the RG equations read

$$\begin{aligned} \mu \frac{\mathrm{d} {{\bar{m}}}_i(\mu )}{\mathrm{d}{\mu }} = {{\bar{m}}}_i(\mu ) \tau ({{\bar{g}}})\,, \end{aligned}$$
(20)

where the function \(\tau ({{\bar{g}}})\) is the anomalous dimension, which depends only on the value of the strong coupling \(\alpha _s=\bar{g}^2/(4\pi )\). Note that in QCD \(\tau ({{\bar{g}}})\) is the same for all quark flavours. The anomalous dimension is scheme dependent, but its perturbative expansion

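$$\begin{aligned} \tau ({{\bar{g}}}) \underset{{{\bar{g}}}\rightarrow 0}{\sim } -{{\bar{g}}}^2 \left( d_0 + d_1 {{\bar{g}}}^2 + \cdots \right) \end{aligned}$$
(21)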

has a leading coefficient \(d_0 = 8/(4\pi )^2\), which is scheme independent (see footnote 14). Equation (20), being a first order differential equation, can be solved exactly by using Eq. (21) as boundary condition. The formal solution of the RG equation reads

$$\begin{aligned} M_i= & {} {{\bar{m}}}_i(\mu )[2b_0{{\bar{g}}}^2(\mu )]^{-d_0/(2b_0)} \nonumber \\&\times \exp \left\{ - \int _0^{{{\bar{g}}}(\mu )}\mathrm{d} x\, \left[ \frac{\tau (x)}{\beta (x)} - \frac{d_0}{b_0x} \right] \right\} \,, \end{aligned}$$
(22)

where \(b_0 = (11-2N_f/3) / (4\pi )^2\) is the universal leading perturbative coefficient in the expansion of the \(\beta \)-function \(\beta ({{\bar{g}}})\). The renormalization group invariant (RGI) quark masses \(M_i\) are formally integration constants of the RG Eq. (20). They are scale independent, and due to the universality of the coefficient \(d_0\), they are also scheme independent. Moreover, they are nonperturbatively defined by Eq. (22). They only depend on the number of flavours \(N_f\), making them a natural candidate to quote quark masses and compare determinations from different lattice collaborations. Nevertheless, it is customary in the phenomenology community to use the \(\overline{\mathrm{MS}}\) scheme at a scale \(\mu = 2\) GeV to compare different results for light-quark masses, and use a scale equal to its own mass for the charm and bottom quarks. In this review, we will quote the final averages of both quantities.
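As a rough numerical illustration of Eq. (22), the sketch below evaluates the ratio \(M_i/{{\bar{m}}}_i(\mu )\) at leading order only, i.e., keeping just \(b_0\) and \(d_0\), in which case the integrand in the exponential vanishes identically and the exponential is 1. The function names and the input value of \(\alpha _s\) are assumptions for the example; the averages in this review are not obtained at this order.

```python
import math

# Universal leading coefficient of the mass anomalous dimension, d_0 = 8/(4 pi)^2
D0 = 8.0 / (4.0 * math.pi) ** 2

def b0(nf):
    """Universal leading coefficient of the beta function for nf flavours."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2

def rgi_over_running_lo(alpha_s, nf):
    """Leading-order M_i / m_i(mu): Eq. (22) with the exponential set to 1.

    alpha_s is the strong coupling at the scale mu (illustrative input).
    """
    g2 = 4.0 * math.pi * alpha_s  # gbar^2 = 4 pi alpha_s
    return (2.0 * b0(nf) * g2) ** (-D0 / (2.0 * b0(nf)))

# Illustrative usage with an assumed coupling alpha_s = 0.3 in the four-flavour theory
ratio = rgi_over_running_lo(0.3, 4)
```

For \(N_f=4\) the exponent is \(d_0/(2b_0)=12/25\), and the ratio is the same for every quark flavour, as expected from the flavour independence of \(\tau \).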

Results for quark masses are always quoted in the four-flavour theory. \(N_{\mathrm{f}}=2\,+\,1\) results have to be converted to the four-flavour theory. Fortunately, the charm quark is heavy, \((\Lambda _\mathrm{QCD}/m_c)^2<1\), and this conversion can be performed in perturbation theory with negligible (\(\sim 0.2\%\)) perturbative uncertainties. Nonperturbative corrections in this matching are more difficult to estimate. Since these effects are suppressed by a factor of \(1/N_{\mathrm{c}}\), and a factor of the strong coupling at the scale of the charm mass, naive power counting arguments would suggest that the effects are \(\sim 1\%\). In practice, numerical nonperturbative studies [130, 131] have found this power counting argument to be an overestimate by one order of magnitude in the determination of simple hadronic quantities or the \(\Lambda \)-parameter. Moreover, lattice determinations do not show any significant deviation between the \(N_{\mathrm{f}}=2\,+\,1\) and \(N_\mathrm{f}=2\,+\,1\,+\,1\) simulations. For example, the difference in the final averages for the mass of the strange quark \(m_s\) between \(N_f=2\,+\,1\) and \(N_f=2\,+\,1\,+\,1\) determinations is about 0.8%, and negligible from a statistical point of view.

We quote all final averages at 2 GeV in the \(\overline{\mathrm{MS}}\) scheme and also the RGI values (in the four-flavour theory). We use the exact RG Eq. (22). Note that to use this equation we need the value of the strong coupling in the \(\overline{\mathrm{MS}}\) scheme at a scale \(\mu = 2\) GeV. All our results are obtained from the RG equation in the \(\overline{\mathrm{MS}}\) scheme and the 5-loop beta function together with the value of the \(\Lambda \)-parameter in the four-flavour theory \(\Lambda ^{(4)}_{\overline{\mathrm{MS}}} = 294(12)\, \mathrm{MeV}\) obtained in this review (see Sect. 9). In the uncertainties of the RGI masses we separate the contributions from the determination of the quark masses and the propagation of the uncertainty of \(\Lambda ^{(4)}_{\overline{\mathrm{MS}}}\). These are identified with the subscripts m and \(\Lambda \), respectively.

Conceptually, all lattice determinations of quark masses contain three basic ingredients:

  1.

    Tuning the lattice bare-quark masses to match the experimental values of some quantities. Pseudo-scalar meson masses provide the most common choice, since they have a strong dependence on the values of quark masses. In pure QCD with \(N_f\) quark flavours these values are not known, since the electromagnetic interactions affect the experimental values of meson masses. Therefore, pure QCD determinations use model/lattice information to determine the location of the physical point. This is discussed at length in Sect. 3.1.1.

  2.

    Renormalization of the bare-quark masses. Bare-quark masses determined with the above-mentioned criteria have to be renormalized. Many of the latest determinations use some nonperturbatively defined scheme. One can also use perturbation theory to connect directly the values of the bare-quark masses to the values in the \(\overline{\mathrm{MS}}\) scheme at 2 GeV. Experience shows that 1-loop calculations are unreliable for the renormalization of quark masses: usually at least two loops are required to have trustworthy results.

  3.

    If quark masses have been nonperturbatively renormalized, for example, to some MOM/SF scheme, the values in this scheme must be converted to the phenomenologically useful values in the \(\overline{\mathrm{MS}}\) scheme (or to the scheme/scale independent RGI masses). Either option requires the use of perturbation theory. The larger the energy scale of this matching with perturbation theory, the better, and many recent computations in MOM schemes do a nonperturbative running up to \(3{-}4\) GeV. Computations in the SF scheme allow us to perform this running nonperturbatively over large energy scales and match with perturbation theory directly at the electro-weak scale \(\sim 100\) GeV.

Note that quark masses are different from other quantities determined on the lattice since perturbation theory is unavoidable when matching to schemes in the continuum.

We mention that lattice-QCD calculations of the b-quark mass have an additional complication which is not present in the case of the charm and light quarks. At the lattice spacings currently used in numerical simulations the direct treatment of the b quark with the fermionic actions commonly used for light quarks is very challenging. Only one determination of the b-quark mass uses this approach, reaching the physical b-quark mass region at two lattice spacings with \(am\sim 0.9\) and 0.64, respectively (see Sect. 3.3). There are a few widely used approaches to treat the b quark on the lattice, which have been already discussed in the FLAG 13 review (see Sect. 8 of Ref. [2]). Those relevant for the determination of the b-quark mass will be briefly described in Sect. 3.3.

3.1 Masses of the light quarks

Light-quark masses are particularly difficult to determine because they are very small (for the up and down quarks) or small (for the strange quark) compared to typical hadronic scales. Thus, their impact on typical hadronic observables is minute, and it is difficult to isolate their contribution accurately.

Fortunately, the spontaneous breaking of \(SU(3)_L\times SU(3)_R\) chiral symmetry provides observables which are particularly sensitive to the light-quark masses: the masses of the resulting Nambu-Goldstone bosons (NGB), i.e., pions, kaons, and eta. Indeed, the Gell-Mann-Oakes-Renner relation [136] predicts that the squared mass of a NGB is directly proportional to the sum of the masses of the quark and antiquark which compose it, up to higher-order mass corrections. Moreover, because these NGBs are light, and are composed of only two valence particles, their masses have a particularly clean statistical signal in lattice-QCD calculations. In addition, the experimental uncertainties on these meson masses are negligible. Thus, in lattice calculations, light-quark masses are typically obtained by renormalizing the input quark masses and tuning them to reproduce NGB masses, as described above.

3.1.1 The physical point and isospin symmetry

As mentioned in Sect. 2.1, the present review relies on the hypothesis that, at low energies, the Lagrangian \(\mathcal{L}_{ \text{ QCD }}+{{{\mathcal {L}}}}_{ \text{ QED }}\) describes nature to a high degree of precision. However, most of the results presented below are obtained in pure QCD calculations, which do not include QED. Quite generally, when comparing QCD calculations with experiment, radiative corrections need to be applied. In pure QCD simulations, where the parameters are fixed in terms of the masses of some of the hadrons, the electromagnetic contributions to these masses must be discussed. How the matching is done is generally ambiguous because it relies on the unphysical separation of QCD and QED contributions. In this section, and in the following, we discuss this issue in detail. Of course, once QED is included in lattice calculations, the subtraction of electromagnetic contributions is no longer necessary.

Let us start from the unambiguous case of QCD+QED. As explained in the introduction of this section, the physical quark masses are the parameters of the Lagrangian such that a given set of experimentally measured, dimensionful hadronic quantities are reproduced by the theory. Many choices are possible for these quantities, but in practice many lattice groups use pseudoscalar meson masses, as they are easily and precisely obtained both by experiment, and through lattice simulations. For example, in the four-flavour case, one can solve the system

$$\begin{aligned} M_{\pi ^+}(m_u,m_d,m_s,m_c,\alpha )= & {} M_{\pi ^+}^{\mathrm {exp.}}\,, \end{aligned}$$
(23)
$$\begin{aligned} M_{K^+}(m_u,m_d,m_s,m_c,\alpha )= & {} M_{K^+}^{\mathrm {exp.}}\,, \end{aligned}$$
(24)
$$\begin{aligned} M_{K^0}(m_u,m_d,m_s,m_c,\alpha )= & {} M_{K^0}^{\mathrm {exp.}}\,, \end{aligned}$$
(25)
$$\begin{aligned} M_{D^0}(m_u,m_d,m_s,m_c,\alpha )= & {} M_{D^0}^{\mathrm {exp.}}\,, \end{aligned}$$
(26)

where we assumed that

  • all the equations are in the continuum and infinite-volume limits;

  • the overall scale has been set to its physical value, generally through some lattice-scale setting procedure involving a fifth dimensionful input;

  • the quark masses \(m_q\) are assumed to be renormalized from the bare, lattice ones in some given continuum renormalization scheme;

  • \(\alpha =\frac{e^2}{4\pi }\) is the fine-structure constant expressed as function of the positron charge e, generally set to the Thomson limit \(\alpha =0.007297352\dots \) [137];

  • the mass \(M_{h}(m_u,m_d,m_s,m_c,\alpha )\) of the meson h is a function of the quark masses and \(\alpha \). The functional dependence is generally obtained by choosing an appropriate parameterization and performing a global fit to the lattice data;

  • the superscript exp. indicates that the mass is an experimental input; lattice groups generally use the values in the Particle Data Group review [137].
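To make concrete what solving a system such as Eqs. (23)–(26) involves, the following sketch uses a deliberately crude toy model: three flavours, no QED, and squared meson masses taken to be exactly proportional to the sum of the valence quark masses, in the spirit of the Gell-Mann-Oakes-Renner relation. The constant `b`, the function names and the numerical inputs are hypothetical; actual determinations rely on global fits with far richer parameterizations.

```python
def meson_masses_sq(m_u, m_d, m_s, b):
    """Toy model: squared meson mass = b * (sum of valence quark masses)."""
    return {
        "pi+": b * (m_u + m_d),
        "K+": b * (m_u + m_s),
        "K0": b * (m_d + m_s),
    }

def solve_quark_masses(msq, b):
    """Invert the toy model: find (m_u, m_d, m_s) reproducing the inputs msq."""
    m_u = (msq["pi+"] + msq["K+"] - msq["K0"]) / (2.0 * b)
    m_d = (msq["pi+"] - msq["K+"] + msq["K0"]) / (2.0 * b)
    m_s = (msq["K+"] + msq["K0"] - msq["pi+"]) / (2.0 * b)
    return m_u, m_d, m_s

# Round trip with made-up inputs in arbitrary units
inputs = meson_masses_sq(2.0, 5.0, 90.0, b=2500.0)
recovered = solve_quark_masses(inputs, b=2500.0)
```

In this toy setting the system is linear and can be inverted in closed form; with realistic fit functions, and with \(\alpha \) and \(m_c\) included, the solution is obtained numerically.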

However, ambiguities arise with simulations of QCD only. In that case, there is no experimentally measurable quantity that emerges from the strong interaction only. The missing QED contribution is tightly related to isospin-symmetry breaking effects. Isospin symmetry is explicitly broken by the differences between the up- and down-quark masses \(\delta m=m_u-m_d\), and electric charges \(\delta Q=Q_u-Q_d\). Both these effects are, respectively, of order \({\mathcal {O}}(\delta m/\Lambda _{\mathrm {QCD}})\) and \({\mathcal {O}}(\alpha )\), and are expected to be \({\mathcal {O}}(1\%)\) of a typical isospin-symmetric hadronic quantity. Strong and electromagnetic isospin-breaking effects are of the same order and therefore cannot, in principle, be evaluated separately without introducing strong ambiguities. Because these effects are small, they can be treated as a perturbation:

$$\begin{aligned}&X(m_u,m_d,m_s,m_c,\alpha )\nonumber \\&\quad ={\bar{X}}(m_{ud}, m_s, m_c) +\delta mA_X(m_{ud}, m_s, m_c) \nonumber \\&\qquad +\,\alpha B_X(m_{ud}, m_s, m_c)\,, \end{aligned}$$
(27)

for a given hadronic quantity X, where \(m_{ud}=\frac{1}{2}(m_u+m_d)\) is the average light-quark mass. There are several things to notice here. Firstly, the neglected higher-order \({\mathcal {O}}(\delta m^2,\alpha \delta m,\alpha ^2)\) corrections are expected to be \({\mathcal {O}}(10^{-4})\) relative to X, which at the moment is well beyond the relative statistical accuracy that can be delivered by a lattice calculation. Secondly, this is not, strictly speaking, an expansion around the isospin-symmetric point, since the electromagnetic interaction also has isospin-symmetric contributions. From this last expression the previous statements about ambiguities become clearer. Indeed, the only unambiguous prediction one can perform is to solve Eqs. (23)–(26) and use the resulting parameters to obtain a prediction for X, which is represented by the left-hand side of Eq. (27). This prediction will be the sum of the QCD isospin-symmetric part \({\bar{X}}\), the strong isospin-breaking effects \( X^{SU(2)}=\delta mA_X\), and the electromagnetic effects \(X^{\gamma }=\alpha B_X\). Obtaining any of these terms individually requires extra, unphysical conditions to perform the separation. To be consistent with previous editions of FLAG, we also define \({\hat{X}}={\bar{X}}+X^{SU(2)}\) to be the \(\alpha \rightarrow 0\) limit of X.

With pure QCD simulations, one typically solves Eqs. (23)–(26) by equating the QCD, isospin-symmetric part of a hadron mass \({\bar{M}}_h\), as obtained from the simulations, with its experimental value \(M_h^{\mathrm {exp.}}\). This will result in an \({\mathcal {O}}(\delta m,\alpha )\) mis-tuning of the theory parameters which will propagate as an error on predicted quantities. Because of this, in principle, one cannot predict hadronic quantities with a relative accuracy higher than \({\mathcal {O}}(1\%)\) from pure QCD simulations, independently of how sensitive the target X is to isospin-breaking effects. If one performs a complete lattice prediction of the physical value of X, it can be of phenomenological interest to define in some way \({\bar{X}}\), \(X^{SU(2)}\), and \(X^{\gamma }\). If we keep \(m_{ud}\), \(m_s\) and \(m_c\) at their physical values in physical units, for a given renormalization scheme and scale, then these three quantities can be extracted by setting successively and simultaneously \(\alpha \) and \(\delta m\) to 0. This is where the ambiguity lies: in general the \(\delta m=0\) point will depend on the renormalization scheme used for the quark masses. In the next section, we give more details on that particular aspect and discuss the order of scheme ambiguities.
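The extraction of \({\bar{X}}\), \(X^{SU(2)}\) and \(X^{\gamma }\) by switching off \(\alpha \) and \(\delta m\) can be illustrated with a toy function. In the sketch below, `x_func` is a hypothetical stand-in for a lattice prediction at fixed \(m_{ud}\), \(m_s\), \(m_c\) in one fixed scheme; the scheme dependence of the \(\delta m=0\) point is not modelled.

```python
def decompose(x_func, delta_m, alpha):
    """Split X(delta_m, alpha) following the first-order expansion of Eq. (27).

    Returns (x_bar, x_su2, x_gamma, residual), where residual collects the
    higher-order terms that the first-order expansion neglects.
    """
    x_bar = x_func(0.0, 0.0)                 # alpha = 0 and delta_m = 0
    x_su2 = x_func(delta_m, 0.0) - x_bar     # alpha = 0 only
    x_gamma = x_func(0.0, alpha) - x_bar     # delta_m = 0 only
    residual = x_func(delta_m, alpha) - (x_bar + x_su2 + x_gamma)
    return x_bar, x_su2, x_gamma, residual

def toy_x(delta_m, alpha):
    """Made-up observable with a small mixed second-order term."""
    return 1.0 + 2.0 * delta_m + 3.0 * alpha + 5.0 * delta_m * alpha

parts = decompose(toy_x, delta_m=-0.01, alpha=0.0073)
```

For the toy observable the residual equals the mixed term \(5\,\delta m\,\alpha \), which is of the size of the neglected second-order corrections.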

3.1.2 Ambiguities in the separation of isospin-breaking contributions

In this section, we discuss the ambiguities that arise in the individual determination of the QED contribution \(X^{\gamma }\) and the strong-isospin correction \(X^{SU(2)}\) defined in the previous section. Throughout this section, we assume that the isospin-symmetric quark masses \(m_{ud}\), \(m_s\) and \(m_c\) are always kept fixed in physical units to the values they take at the QCD+QED physical point in some given renormalization scheme. Let us assume that both up and down masses have been renormalized in an identical mass-independent scheme which depends on some energy scale \(\mu \). We also assume that the renormalization procedure respects chiral symmetry so that quark masses renormalize multiplicatively. The renormalization constants of the quark masses are identical for \(\alpha =0\) and therefore the renormalized mass of a quark has the general form

$$\begin{aligned} m_q(\mu )= & {} Z_m(\mu )\left[ 1+\alpha Q_{\mathrm {tot.}}^2\delta _{Z}^{(0)}(\mu ) +\alpha Q_{\mathrm {tot.}}Q_q\delta _{Z}^{(1)}(\mu )\right. \nonumber \\&\left. +\,\alpha Q_q^2\delta _{Z}^{(2)}(\mu ) \right] m_{q,0} \,, \end{aligned}$$
(28)

up to \({\mathcal {O}}(\alpha ^2)\) corrections, where \(m_{q,0}\) is the bare quark mass, and where \(Q_{\mathrm {tot.}}\) and \(Q_{\mathrm {tot.}}^2\) are the sum of all quark charges and squared charges, respectively. Throughout this section, a subscript ud generally denotes the average between up and down quantities and \(\delta \) the difference between the up and the down quantities. The source of the ambiguities described in the previous section is the mixing of the isospin-symmetric mass \(m_{ud}\) and the difference \(\delta m\) through renormalization. Using Eq. (28) one can make this mixing explicit at leading order in \(\alpha \):

$$\begin{aligned} \begin{pmatrix}m_{ud}(\mu )\\ \delta m(\mu )\end{pmatrix}= & {} Z_m(\mu )\Bigg [1+\alpha Q_{\mathrm {tot.}}^2\delta _{Z}^{(0)}(\mu )+\alpha M^{(1)}(\mu )\nonumber \\&+\,\alpha M^{(2)}(\mu )\Bigg ] \begin{pmatrix}m_{ud,0}\\ \delta m_0\end{pmatrix} \end{aligned}$$
(29)

with the mixing matrices

$$\begin{aligned} M^{(1)}(\mu )= & {} \delta _Z^{(1)}(\mu )Q_{\mathrm {tot.}}\begin{pmatrix} Q_{ud} &{}\quad \frac{1}{4}\delta Q\\ \delta Q &{}\quad Q_{ud} \end{pmatrix}\qquad \text {and}\qquad \nonumber \\ M^{(2)}(\mu )= & {} \delta _Z^{(2)}(\mu )\begin{pmatrix} Q_{ud}^2 &{}\quad \frac{1}{4}\delta Q^2\\ \delta Q^2 &{}\quad Q_{ud}^2 \end{pmatrix}\,. \end{aligned}$$
(30)
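As a worked check of these matrices, one can insert \(m_{u,0}=m_{ud,0}+\frac{1}{2}\delta m_0\) and \(m_{d,0}=m_{ud,0}-\frac{1}{2}\delta m_0\) into the \(\delta _Z^{(2)}\) term of Eq. (28). Reading \(Q_{ud}^2=\frac{1}{2}(Q_u^2+Q_d^2)\) and \(\delta Q^2=Q_u^2-Q_d^2\), which is our interpretation of the ud and \(\delta \) notation introduced after Eq. (28), the average and the difference of the up and down contributions become

$$\begin{aligned} \tfrac{1}{2}\left( Q_u^2\, m_{u,0}+Q_d^2\, m_{d,0}\right)&= Q_{ud}^2\, m_{ud,0}+\tfrac{1}{4}\,\delta Q^2\,\delta m_0\,, \\ Q_u^2\, m_{u,0}-Q_d^2\, m_{d,0}&= \delta Q^2\, m_{ud,0}+Q_{ud}^2\,\delta m_0\,, \end{aligned}$$

which reproduces the two rows of \(M^{(2)}(\mu )\). The same algebra with a single power of the charge gives the rows of \(M^{(1)}(\mu )\).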

Now let us assume that for the purpose of determining the different components in Eq. (27), one starts by tuning the bare masses to obtain equal up and down masses, for some small coupling \(\alpha _0\) at some scale \(\mu _0\), i.e., \(\delta m(\mu _0)=0\). At this specific point, one can extract the pure QCD, and the QED corrections to a given quantity X by studying the slope of \(\alpha \) in Eq. (27). From these quantities the strong isospin contribution can then readily be extracted using a nonzero value of \(\delta m(\mu _0)\). However, if now the procedure is repeated at another coupling \(\alpha \) and scale \(\mu \) with the same bare masses, it appears from Eq. (29) that \(\delta m(\mu )\ne 0\). More explicitly,

$$\begin{aligned} \delta m(\mu )=m_{ud}(\mu _0)\frac{Z_m(\mu )}{Z_m(\mu _0)} [\alpha \Delta _Z(\mu ) -\alpha _0\Delta _Z(\mu _0)]\,, \end{aligned}$$
(31)

with

$$\begin{aligned} \Delta _Z(\mu )=Q_{\mathrm {tot.}}\delta Q\delta _Z^{(1)}(\mu )+\delta Q^2\delta _Z^{(2)}(\mu )\,,\end{aligned}$$
(32)

up to higher-order corrections in \(\alpha \) and \(\alpha _0\). In other words, the definitions of \({\bar{X}}\), \(X^{SU(2)}\), and \(X^{\gamma }\) depend on the renormalization scale at which the separation was made. This dependence, of course, has to cancel in the physical sum X. One can notice that at no point did we mention the renormalization of \(\alpha \) itself, which, in principle, introduces similar ambiguities. However, the corrections coming from the running of \(\alpha \) are \({\mathcal {O}}(\alpha ^2)\) relative to X, which, as justified above, can be safely neglected. Finally, important information is provided by Eq. (31): the scale ambiguities are \({\mathcal {O}}(\alpha m_{ud})\). For physical quark masses, one generally has \(m_{ud}\simeq \delta m\). So by using this approximation in the first-order expansion Eq. (27), it is actually possible to define unambiguously the components of X up to second-order isospin-breaking corrections. Therefore, in the rest of this review, we will not keep track of the ambiguities in determining pure QCD or QED quantities. However, in the context of lattice simulations, it is crucial to notice that \(m_{ud}\simeq \delta m\) is only accurate at the physical point. In simulations at larger-than-physical pion masses, scheme ambiguities in the separation of QCD and QED contributions are generally large. Once more, the argument made here assumes that the isospin-symmetric quark masses \(m_{ud}\), \(m_s\), and \(m_c\) are kept fixed to their physical value in a given scheme while varying \(\alpha \). Outside of this assumption there is an additional isospin-symmetric \(O(\alpha m_q)\) ambiguity between \({\bar{X}}\) and \(X^{\gamma }\).
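Equations (31) and (32) are simple enough to evaluate directly. The sketch below does so in Python; the charge factors follow from the up and down charges, while the values passed for \(\delta _Z^{(1)}\), \(\delta _Z^{(2)}\), \(Z_m(\mu )/Z_m(\mu _0)\) and \(m_{ud}(\mu _0)\) are placeholders that would have to come from an actual renormalization calculation. All function and variable names are assumptions for the example.

```python
from fractions import Fraction

Q_U, Q_D = Fraction(2, 3), Fraction(-1, 3)
DELTA_Q = Q_U - Q_D              # difference of up and down charges
DELTA_Q2 = Q_U ** 2 - Q_D ** 2   # difference of up and down squared charges

def delta_z(q_tot, dz1, dz2):
    """Eq. (32): Delta_Z = Q_tot * deltaQ * dz1 + deltaQ^2 * dz2."""
    return float(q_tot * DELTA_Q) * dz1 + float(DELTA_Q2) * dz2

def induced_delta_m(m_ud_mu0, zm_ratio, alpha, dz_mu, alpha0, dz_mu0):
    """Eq. (31): delta m(mu) induced when delta m(mu0) = 0 was imposed.

    zm_ratio is Z_m(mu) / Z_m(mu0); dz_mu and dz_mu0 are Delta_Z at mu and mu0.
    """
    return m_ud_mu0 * zm_ratio * (alpha * dz_mu - alpha0 * dz_mu0)

# Sum of the quark charges for u, d, s, c
Q_TOT = Fraction(2, 3) + Fraction(-1, 3) + Fraction(-1, 3) + Fraction(2, 3)
```

If the coupling and scale are left unchanged, the bracket in Eq. (31) vanishes and no mass difference is induced, as it should be.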

Such separation on lattice-QCD+QED simulation results appeared for the first time in RBC 07 [138] and Blum 10 [139], where the scheme was implicitly defined around the \(\chi \)PT expansion. In that setup, the \(\delta m(\mu _0)=0\) point is defined in pure QCD, i.e., \(\alpha _0=0\) in the previous discussion. The QCD part of the kaon-mass splitting from the first FLAG review [1] is used as an input in RM123 11 [140], which focuses on QCD isospin corrections only. It therefore inherits from the convention that was chosen there, which is also to set \(\delta m(\mu _0)=0\) at zero QED coupling. The same convention was used in the follow-up works RM123 13 [141] and RM123 17 [19]. The BMW collaboration was the first to introduce a purely hadronic scheme in its electro-quenched study of the baryon octet mass splittings [142]. In this work, the quark mass difference \(\delta m(\mu )\) is swapped with the mass splitting \(\Delta M^2\) between the connected \({\bar{u}}u\) and \({\bar{d}}d\) pseudoscalar masses. Although unphysical, this quantity is proportional [143] to \(\delta m(\mu )\) up to \({\mathcal {O}}(\alpha m_{ud})\) chiral corrections. In this scheme, the quark masses are assumed to be equal at \(\Delta M^2=0\), and the \({\mathcal {O}}(\alpha m_{ud})\) corrections to this statement are analogous to the scale ambiguities mentioned previously. The same scheme was used with the same data set for the determination of light-quark masses BMW 16 [20]. The BMW collaboration used a different hadronic scheme for its determination of the nucleon-mass splitting BMW 14 [119] using full QCD+QED simulations. In this work, the \(\delta m=0\) point was fixed by imposing the baryon splitting \(M_{\Sigma ^+}-M_{\Sigma ^-}\) to cancel. This scheme is quite different from the other ones presented here, in the sense that its intrinsic ambiguity is not \({\mathcal {O}}(\alpha m_{ud})\). 
What motivates this choice here is that \(M_{\Sigma ^+}-M_{\Sigma ^-}=0\) in the limit where these baryons are point particles, so the scheme ambiguity is suppressed by the compositeness of the \(\Sigma \) baryons. This may sound like a more difficult ambiguity to quantify, but this scheme has the advantage of being defined purely by measurable quantities. Moreover, it has been demonstrated numerically in BMW 14 [119] that, within the uncertainties of this study, the \(M_{\Sigma ^+}-M_{\Sigma ^-}=0\) scheme is equivalent to the \(\Delta M^2=0\) one, explicitly \(M_{\Sigma ^+}-M_{\Sigma ^-}=-0.18(12)(6)\,\mathrm {MeV}\) at \(\Delta M^2=0\). The calculation QCDSF/UKQCD 15 [144] uses a “Dashen scheme,” where quark masses are tuned such that flavour-diagonal mesons have equal masses in QCD and QCD+QED. Although not explicitly mentioned by the authors of the paper, this scheme is simply a reformulation of the \(\Delta M^2=0\) scheme mentioned previously. Finally, the recent preprint MILC 18 [145] also used the \(\Delta M^2=0\) scheme and noticed its connection to the “Dashen scheme” from QCDSF/UKQCD 15.

In the previous edition of this review, the contributions \({\bar{X}}\), \(X^{SU(2)}\), and \(X^{\gamma }\) were given for pion and kaon masses based on phenomenological information. Considerable progress has been achieved by the lattice community to include isospin-breaking effects in calculations, and it is now possible to determine these quantities precisely directly from a lattice calculation. However, these quantities generally appear as intermediate products of a lattice analysis, and are rarely directly communicated in publications. These quantities, although unphysical, have a phenomenological interest, and we encourage the authors of future calculations to quote them explicitly.

3.1.3 Inclusion of electromagnetic effects in lattice-QCD simulations

Electromagnetism on a lattice can be formulated using a naive discretization of the Maxwell action \(S[A_{\mu }]=\frac{1}{4}\int d^4 x\,\sum _{\mu ,\nu }[\partial _{\mu }A_{\nu }(x)-\partial _{\nu }A_{\mu }(x)]^2\). Even in its noncompact form, the action remains gauge-invariant. This is not the case for non-Abelian theories for which one uses the traditional compact Wilson gauge action (or an improved version of it). Compact actions for QED feature spurious photon-photon interactions which vanish only in the continuum limit. This is one of the main reasons why the noncompact action is the most popular so far. It was used in all the calculations presented in this review. Gauge-fixing is necessary for noncompact actions. It was shown [146, 147] that gauge fixing is not necessary with compact actions, including in the construction of interpolating operators for charged states.

Although discretization is straightforward, simulating QED in a finite volume is more challenging. Indeed, the long-range nature of the interaction suggests that important finite-size effects have to be expected. In the case of periodic boundary conditions, the situation is even more critical: a naive implementation of the theory features an isolated zero-mode singularity in the photon propagator. It was first proposed in [148] to fix the global zero-mode of the photon field \(A_{\mu }(x)\) in order to remove it from the dynamics. This modified theory is generally named \(\mathrm {QED}_{\mathrm {TL}}\). Although this procedure regularizes the theory and has the right classical infinite-volume limit, it is nonlocal because of the zero-mode fixing. As first discussed in [119], the nonlocality in time of \(\mathrm {QED}_{\mathrm {TL}}\) prevents the existence of a transfer matrix, and therefore a quantum-mechanical interpretation of the theory. Another prescription named \(\mathrm {QED}_{\mathrm {L}}\), proposed in [149], is to remove the zero-mode of \(A_{\mu }(x)\) independently for each time slice. This theory, although still nonlocal in space, is local in time and has a well-defined transfer matrix. Whether these nonlocalities constitute an issue to extract infinite-volume physics from lattice-QCD+\(\mathrm {QED}_{\mathrm {L}}\) simulations is, at the time of this review, still an open question. However, it is known through analytical calculations of electromagnetic finite-size effects at \(O(\alpha )\) in hadron masses [119, 120, 122, 141, 149,150,151], meson leptonic decays [151], and the hadronic vacuum polarization [152] that \(\mathrm {QED}_{\mathrm {L}}\) does not suffer from a problematic (e.g., UV divergent) coupling of short and long-distance physics due to its nonlocality. Another strategy, first proposed in [153] and used by the QCDSF collaboration, is to bound the zero-mode fluctuations to a finite range. 
Although more minimal, it is still a nonlocal modification of the theory and so far finite-size effects for this scheme have not been investigated. More recently, two proposals for local formulations of finite-volume QED emerged. The first one described in [154] proposes to use massive photons to regulate zero-mode singularities, at the price of (softly) breaking gauge invariance. The second one presented in [147] avoids the zero-mode issue by using anti-periodic boundary conditions for \(A_{\mu }(x)\). In this approach, gauge invariance requires the fermion field to undergo a charge conjugation transformation over a period, breaking electric charge conservation. These local approaches have the potential to constitute cleaner approaches to finite-volume QED. All the calculations presented in this review used \(\mathrm {QED}_{\mathrm {L}}\) or \(\mathrm {QED}_{\mathrm {TL}}\), with the exception of QCDSF.
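The difference between the \(\mathrm {QED}_{\mathrm {TL}}\) and \(\mathrm {QED}_{\mathrm {L}}\) prescriptions can be made explicit on a toy configuration. The sketch below treats a single real field on a small lattice with one time and one space direction, stored as `field[t][x]`, as a stand-in for one component of \(A_{\mu }(x)\); the function names are hypothetical, and production codes typically impose the constraint in momentum space rather than by subtracting averages.

```python
def remove_zero_mode_tl(field):
    """QED_TL-like constraint: subtract the global space-time average."""
    n_sites = sum(len(time_slice) for time_slice in field)
    mean = sum(sum(time_slice) for time_slice in field) / n_sites
    return [[a - mean for a in time_slice] for time_slice in field]

def remove_zero_mode_l(field):
    """QED_L-like constraint: subtract the spatial average on each time slice."""
    result = []
    for time_slice in field:
        slice_mean = sum(time_slice) / len(time_slice)
        result.append([a - slice_mean for a in time_slice])
    return result

# Two time slices with two spatial sites each (made-up numbers)
toy_field = [[1.0, 3.0], [5.0, 7.0]]
field_tl = remove_zero_mode_tl(toy_field)
field_l = remove_zero_mode_l(toy_field)
```

After the \(\mathrm {QED}_{\mathrm {TL}}\)-like step only the global average vanishes, while the individual time slices keep nonzero averages; after the \(\mathrm {QED}_{\mathrm {L}}\)-like step the average vanishes on every time slice separately.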

Once a finite-volume theory for QED is specified, there are various ways to compute QED effects themselves on a given hadronic quantity. The most direct approach, first used in [148], is to include QED directly in the lattice simulations and assemble correlation functions from charged quark propagators. Another approach, proposed in [141], is to exploit the perturbative nature of QED, and compute the leading-order corrections directly in pure QCD as matrix elements of the electromagnetic current. Both approaches have their advantages and disadvantages and, as shown in [19], are not mutually exclusive. A critical comparative study can be found in [155].

Finally, most of the calculations presented here made the choice of computing electromagnetic corrections in the electro-quenched approximation. In this limit, one assumes that only valence quarks are charged, which is equivalent to neglecting QED corrections to the fermionic determinant. This approximation reduces dramatically the cost of lattice-QCD + QED calculations since it allows the reuse of previously generated QCD configurations. It also avoids computing disconnected contributions coming from the electromagnetic current in the vacuum, which are generally challenging to determine precisely. The electromagnetic contributions from sea quarks are known to be flavour-SU(3) and large-\(N_c\) suppressed, thus electro-quenched simulations are expected to have an \(O(10\%)\) accuracy for the leading electromagnetic effects. This suppression is in principle rather weak and results obtained from electro-quenched simulations might feature uncontrolled systematic errors. For this reason, the use of the electro-quenched approximation constitutes the difference between ★ and ○ in the FLAG criterion for the inclusion of isospin breaking effects.

3.1.4 Lattice determination of \(m_s\) and \(m_{ud}\)

We now turn to a review of the lattice calculations of the light-quark masses and begin with \(m_s\), the isospin-averaged up- and down-quark mass \(m_{ud}\), and their ratio. Most groups quote only \(m_{ud}\), not the individual up- and down-quark masses. We then discuss the ratio \(m_u/m_d\) and the individual determinations of \(m_u\) and \(m_d\).

Quark masses have been calculated on the lattice since the mid-nineties. However, early calculations were performed in the quenched approximation, leading to unquantifiable systematics. Thus, in the following, we only review modern, unquenched calculations, which include the effects of light sea quarks.

Tables 4 and 5 list the results of \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\) lattice calculations of \(m_s\) and \(m_{ud}\). These results are given in the \({\overline{\mathrm {MS}}}\) scheme at \(2\,\mathrm {GeV}\), which is standard nowadays, though some groups are starting to quote results at higher scales (e.g., Ref. [156]). The tables also show the colour coding of the calculations leading to these results. As indicated earlier in this review, we treat calculations with different numbers, \(N_f\), of dynamical quarks separately.

\(N_{ f}=2\,+\,1\) lattice calculations

We turn now to \(N_{ f}=2\,+\,1\) calculations. These and the corresponding results for \(m_{ud}\) and \(m_s\) are summarized in Table 4. Given the very high precision of a number of the results, with total errors on the order of 1%, it is important to consider the effects neglected in these calculations. Isospin-breaking and electromagnetic effects are small on \(m_{ud}\) and \(m_s\), and have been approximately accounted for in the calculations that will be retained for our averages. We have already commented that the effect of the omission of the charm quark in the sea is expected to be small, below our current precision. In contrast with previous editions of the FLAG report, we do not add any additional uncertainty due to these effects in the final averages.

Table 4 \(N_{ f}=2\,+\,1\) lattice results for the masses \(m_{ud}\) and \(m_s\)
Table 5 \(N_{ f}=2\,+\,1\,+\,1\) lattice results for the masses \(m_{ud}\) and \(m_s\)

The only new calculation since FLAG 16 is the \(m_s\) determination of Maezawa 16 [157]. This new result agrees well with other determinations; however, because it is computed with a single pion mass of about 160 MeV, it does not meet our criteria for entering the average. RBC/UKQCD 14 [10] significantly improves on their RBC/UKQCD 12 [156] work by adding three new domain wall fermion simulations to three used previously. Two of the new simulations are performed at essentially physical pion masses (\(M_\pi \simeq 139\,\mathrm {MeV}\)) on lattices of about \(5.4\,\mathrm{fm}\) in size and with lattice spacings of \(0.114\,\mathrm{fm}\) and \(0.084\,\mathrm{fm}\). They are complemented by a third simulation with \(M_\pi \simeq 371\,\mathrm {MeV}\), \(a\simeq 0.063\,\mathrm{fm}\) and a rather small \(L\simeq 2.0\,\mathrm{fm}\). Altogether, this gives them six simulations with six unitary (\(m_{\mathrm{sea}}=m_{\mathrm{val}}\)) \(M_\pi \)’s in the range of 139 to \(371\,\mathrm {MeV}\), and effectively three lattice spacings from 0.063 to \(0.114\,\mathrm{fm}\). They perform a combined global continuum and chiral fit to all of their results for the \(\pi \) and K masses and decay constants, the \(\Omega \) baryon mass and two Wilson-flow parameters. Quark masses in these fits are renormalized and run nonperturbatively in the RI-SMOM scheme. This is done by computing the relevant renormalization constant for a reference ensemble, and determining those for other simulations relative to it by adding appropriate parameters in the global fit. This new calculation passes all of our selection criteria. Its results will replace the older RBC/UKQCD 12 results in our averages.

\(N_{ f}=2\,+\,1\) MILC results for light-quark masses go back to 2004 [166, 167]. They use rooted staggered fermions. By 2009 their simulations covered an impressive range of parameter space, with lattice spacings going down to 0.045 fm, and valence-pion masses down to approximately 180 MeV [17]. The most recent MILC \(N_{ f}=2\,+\,1\) results, i.e., MILC 10A [14] and MILC 09A [17], feature large statistics and 2-loop renormalization. Since these data sets subsume those of their previous calculations, these latest results are the only ones that must be kept in any world average.

The PACS-CS 12 [158] calculation represents an important extension of the collaboration’s earlier 2010 computation [159], which already probed pion masses down to \(M_\pi \simeq 135\,\mathrm {MeV}\), i.e., down to the physical-mass point. This was achieved by reweighting the simulations performed in PACS-CS 08 [162] at \(M_\pi \simeq 160\,\mathrm {MeV}\). If adequately controlled, this procedure eliminates the need to extrapolate to the physical-mass point and, hence, the corresponding systematic error. The new calculation now applies similar reweighting techniques to include electromagnetic and \(m_u\ne m_d\) isospin-breaking effects directly at the physical pion mass. Further, as in PACS-CS 10 [159], renormalization of quark masses is implemented nonperturbatively, through the Schrödinger functional method [172]. As it stands, the main drawback of the calculation, which makes the inclusion of its results in a world average of lattice results inappropriate at this stage, is that for the lightest quark mass the volume is very small, corresponding to \(LM_\pi \simeq 2.0\), a value for which finite-volume effects will be difficult to control. Another problem is that the calculation was performed at a single lattice spacing, forbidding a continuum extrapolation. Further, it is unclear at this point what might be the systematic errors associated with the reweighting procedure.

The BMW 10A, 10B [11, 12] calculation still satisfies our stricter selection criteria. They reach the physical up- and down-quark mass by interpolation instead of by extrapolation. Moreover, their calculation was performed at five lattice spacings ranging from 0.054 to 0.116 fm, with full nonperturbative renormalization and running and in volumes of up to (6 fm)\(^3\), guaranteeing that the continuum limit, renormalization, and infinite-volume extrapolation are controlled. It does neglect, however, isospin-breaking effects, which are small on the scale of their error bars.

Finally, we come to another calculation which satisfies our selection criteria, HPQCD 10 [13]. It updates the staggered-fermions calculation of HPQCD 09A [24]. In these papers, the renormalized mass of the strange quark is obtained by combining the result of a precise calculation of the renormalized charm-quark mass, \(m_c\), with the result of a calculation of the quark-mass ratio, \(m_c/m_s\). As described in Ref. [171] and in Sect. 3.2, HPQCD determines \(m_c\) by fitting Euclidean-time moments of the \({{\bar{c}}}c\) pseudoscalar density two-point functions, obtained numerically in lattice-QCD, to fourth-order, continuum perturbative expressions. These moments are normalized and chosen so as to require no renormalization with staggered fermions. Since \(m_c/m_s\) requires no renormalization either, HPQCD’s approach displaces the problem of lattice renormalization in the computation of \(m_s\) to one of computing continuum perturbative expressions for the moments. To calculate \(m_{ud}\) HPQCD 10 [13] use the MILC 09 determination of the quark-mass ratio \(m_s/m_{ud}\) [129].

HPQCD 09A [24] obtains \(m_c/m_s=11.85(16)\) fully nonperturbatively, with a relative uncertainty slightly larger than 1%. HPQCD 10’s determination of the charm-quark mass, \(m_c(m_c)=1.268(6)\,\mathrm {GeV}\), is even more precise, achieving an accuracy better than 0.5%.

This discussion leaves us with four results for our final average for \(m_s\): MILC 09A [17], BMW 10A, 10B [11, 12], HPQCD 10 [13] and RBC/UKQCD 14 [10]. Assuming that the result from HPQCD 10 is 100% correlated with that of MILC 09A, as it is based on a subset of the MILC 09A configurations, we find \(m_s=92.03(88)\,\mathrm {MeV}\) with a \(\chi ^2/\)dof = 1.2.

For the light-quark mass \(m_{ud}\), the results satisfying our criteria are RBC/UKQCD 14B, BMW 10A, 10B, HPQCD 10, and MILC 10A. For the error, we include the same 100% correlation between statistical errors for the latter two as for the strange case, resulting in \(m_{ud}=3.364(41)\,\mathrm {MeV}\) at 2 GeV in the \(\overline{\mathrm{MS}}\) scheme (\(\chi ^2/\)dof = 1.1). Our final estimates for the light-quark masses are

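$$\begin{aligned} N_{ f}= 2\,+\,1:&\quad m_{ud} = 3.364(41)\,\mathrm {MeV}\quad \,\mathrm {Refs.}~{[10{-}14]}\,,\nonumber \\&\quad m_s = 92.03(88)\,\mathrm {MeV}\quad \,\mathrm {Refs.}~{[10{-}13,17]}\,. \end{aligned}$$
(33)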

And the RGI values

(34)

\(N_{ f}=2\,+\,1\,+\,1\) lattice calculations

Since the previous FLAG review, two new results for the strange-quark mass have appeared, HPQCD 18 [15] and FNAL/MILC/TUMQCD 18 [8]. In the former, quark masses are renormalized nonperturbatively in the RI-SMOM scheme. The mass of the (fictitious) \({{\bar{s}}} s\) meson is used to tune the bare strange mass. The “physical” \({{\bar{s}}} s\) mass is given in QCD from the pion and kaon masses. In addition, they use the same HISQ ensembles and valence quarks as those in HPQCD 14A, where the quark masses were computed from time moments of vector-vector correlation functions. The new results are consistent with the old, with errors of roughly the same size, but of course with different systematics. In particular, the new results avoid the use of high-order perturbation theory in the matching between lattice and continuum schemes. It is reassuring that the two methods, applied to the same ensembles, agree well.

The \(N_{ f}=2\,+\,1\,+\,1\) results are summarized in Table 5. Note that the results of Ref. [16] are reported as \(m_s(2\,\mathrm {GeV};N_f=3)\) and those of Ref. [9] as \(m_{ud(s)}(2\,\mathrm {GeV};N_f=4)\). We convert the former to \(N_f=4\) and obtain \(m_s(2\,\mathrm {GeV}; N_f=4)=93.12(69)\,\mathrm {MeV}\). The average of FNAL/MILC/TUMQCD 18, HPQCD 18, ETM 14 and HPQCD 14A is \(93.44(68)\,\mathrm {MeV}\) with \(\chi ^2/\text{ dof }=1.7\). For the light-quark average we use ETM 14A and FNAL/MILC/TUMQCD 18, with an average of \(3.410(43)\,\mathrm {MeV}\) and a \(\chi ^2/\text{ dof }=3\). We note that these \(\chi ^2\) values are large. For the light-quark masses this is mostly due to the ETM 14(A) masses lying significantly above the rest, but in the case of \(m_s\) there is also some tension between the recent and very precise results of HPQCD 18 and FNAL/MILC/TUMQCD 18. Also note that the 2 + 1-flavour values are consistent with the four-flavour ones, so in all cases we have decided to simply quote averages according to FLAG rules, including stretching factors for the errors based on the \(\chi ^2\) values of our fits.

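$$\begin{aligned} N_{ f}= 2\,+\,1\,+\,1:&\quad m_{ud} = 3.410(43)\,\mathrm {MeV}\,,\nonumber \\&\quad m_s = 93.44(68)\,\mathrm {MeV}\,, \end{aligned}$$
(35)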

and the RGI values

(36)

In Figs. 1 and 2 the lattice results listed in Tables 4 and 5 and the FLAG averages obtained at each value of \(N_f\) are presented and compared with various phenomenological results.

Fig. 1

\({\overline{\mathrm {MS}}}\) mass of the strange quark (at 2 GeV scale) in MeV. The upper two panels show the lattice results listed in Tables 4 and 5, while the bottom panel collects sum rule results [173,174,175,176,177]. Diamonds and squares represent results based on perturbative and nonperturbative renormalization, respectively. The black squares and the grey bands represent our estimates (33) and (35). The significance of the colours is explained in Sect. 2

3.1.5 Lattice determinations of \(m_s/m_{ud}\)

The lattice results for \(m_s/m_{ud}\) are summarized in Table 6. In the ratio \(m_s/m_{ud}\), one of the sources of systematic error – the uncertainties in the renormalization factors – drops out.

\(N_{ f}=2\,+\,1\) lattice calculations

For \(N_f = 2\,+\,1\) our average has not changed since the last version of the review and is based on the result RBC/UKQCD 14B, which replaces RBC/UKQCD 12 (see Sect. 3.1.4), and on the results MILC 09A and BMW 10A, 10B. The value quoted by HPQCD 10 does not represent independent information as it relies on the result for \(m_s/m_{ud}\) obtained by the MILC collaboration. Averaging these results according to the prescriptions of Sect. 2.3 gives \(m_s / m_{ud} = 27.42(12)\) with \(\chi ^2/\text{ dof } \simeq 0.2\). Since the errors associated with renormalization drop out in the ratio, the uncertainties are even smaller than in the case of the quark masses themselves: the above number for \(m_s/m_{ud}\) amounts to an accuracy of 0.5%.

Fig. 2

Mean mass of the two lightest quarks, \(m_{ud}=\frac{1}{2}(m_u+m_d)\). The bottom panel shows results based on sum rules [173, 176, 178] (for more details see Fig. 1)

Table 6 Lattice results for the ratio \(m_s/m_{ud}\)

At this level of precision, the uncertainties in the electromagnetic and strong isospin-breaking corrections might not be completely negligible. Nevertheless, we decide not to add any uncertainty associated with this effect. The main reason is that most recent determinations estimate this uncertainty themselves and find an effect smaller than naive power-counting estimates (see the \(N_{ f}=2\,+\,1\,+\,1\) section).

$$\begin{aligned} N_f = 2+1: \quad {m_s}/{m_{ud}} = 27.42 ~ (12) \quad \,\mathrm {Refs.}~{[10{-}12,17]}.\nonumber \\ \end{aligned}$$
(37)

\(N_{ f}=2\,+\,1\,+\,1\) lattice calculations

For \(N_f = 2\,+\,1\,+\,1\) there are three results, MILC 17 [5], ETM 14 [9] and FNAL/MILC 14A [18], all of which satisfy our selection criteria.

MILC 17 uses 24 HISQ staggered-fermion ensembles at six values of the lattice spacing in the range \(0.15\,\mathrm{fm}\)–\(0.03\,\mathrm{fm}\).

ETM 14 uses 15 twisted mass gauge ensembles at three lattice spacings ranging from 0.062 to 0.089 fm (using \(f_\pi \) as input), in boxes of size ranging from 2.0 to 3.0 fm, and pion masses from 210 to 440 MeV (explaining the tag in the chiral extrapolation and the tag for the continuum extrapolation). The value of \(M_\pi L\) at their smallest pion mass is 3.2 with more than two volumes (explaining the tag in the finite-volume effects). They fix the strange mass with the kaon mass.

FNAL/MILC 14A employs HISQ staggered fermions. Their result is based on 21 ensembles at four values of the coupling \(\beta \) corresponding to lattice spacings in the range from 0.057 to 0.153 fm, in boxes of sizes up to 5.8 fm, and with taste-Goldstone pion masses down to 130 MeV, and RMS pion masses down to 143 MeV. They fix the strange mass with \(M_{{{\bar{s}}}s}\), corrected for electromagnetic effects with \(\epsilon = 0.84(20)\) [179]. All of our selection criteria are satisfied with the tag . Thus our average is given by \(m_s / m_{ud} = 27.23 ~ (10)\), where the error includes a large stretching factor equal to \(\sqrt{\chi ^2/\text{ dof }} \simeq 1.6\), coming from our rules for the averages discussed in Sect. 2.2. As mentioned already this is mainly due to ETM 14(A) values lying significantly above the averages for the individual masses.

$$\begin{aligned} {N_f = 2\,+\,1\,+\,1:}\quad m_s / m_{ud} = 27.23 ~ (10)\quad \,\mathrm {Refs.}~\text{[5,9,18] },\!\!\nonumber \\ \end{aligned}$$
(38)

which corresponds to an overall uncertainty equal to 0.4%. It is worth noting that [5] estimates the EM effects in this quantity to be \(\sim 0.18\%\).
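
The role of the stretching factor \(\sqrt{\chi ^2/\text{dof}}\) mentioned above can be illustrated with a minimal sketch. It assumes uncorrelated inputs and a plain inverse-variance weighted mean, so it is a simplification rather than the full FLAG averaging procedure, which also treats correlations. The function name `stretched_average` and the input numbers are hypothetical and are not taken from Table 6.

```python
import math

def stretched_average(values, errors):
    """Inverse-variance weighted mean of uncorrelated results.

    The error is inflated by sqrt(chi^2/dof) when chi^2/dof > 1.
    Simplified illustration only; correlations are ignored.
    """
    weights = [1.0 / e**2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    err = math.sqrt(1.0 / wsum)
    dof = len(values) - 1
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    chi2_dof = chi2 / dof if dof > 0 else 0.0
    stretch = math.sqrt(chi2_dof) if chi2_dof > 1.0 else 1.0
    return mean, err * stretch, chi2_dof

# Hypothetical inputs, chosen only to show a case with chi^2/dof > 1
values = [27.0, 27.5, 27.2]
errors = [0.1, 0.2, 0.1]
mean, err, chi2_dof = stretched_average(values, errors)
print(f"average = {mean:.2f} +/- {err:.2f}, chi2/dof = {chi2_dof:.2f}")
```

When the inputs scatter more than their quoted errors suggest, the quoted uncertainty grows accordingly; for mutually consistent inputs the factor is simply one.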

All the lattice results listed in Table 6 as well as the FLAG averages for each value of \(N_f\) are reported in Fig. 3 and compared with \(\chi \)PT and sum rules.

Fig. 3

Results for the ratio \(m_s/m_{ud}\). The upper part indicates the lattice results listed in Table 6 together with the FLAG averages for each value of \(N_f\). The lower part shows results obtained from \(\chi \)PT and sum rules [176, 180,181,182,183]

3.1.6 Lattice determination of \(m_u\) and \(m_d\)

In addition to reviewing computations of individual \(m_u\) and \(m_d\) quark masses, we will also determine FLAG averages for the parameter \(\epsilon \) related to the violations of Dashen’s theorem

$$\begin{aligned} \epsilon =\frac{\left( \Delta M_{K}^{2}-\Delta M_{\pi }^{2}\right) ^{\gamma }}{\Delta M_{\pi }^{2}}\,, \end{aligned}$$
(39)

where \(\Delta M_{\pi }^{2}=M_{\pi ^+}^{2}-M_{\pi ^0}^{2}\) and \(\Delta M_{K}^{2}=M_{K^+}^{2}-M_{K^0}^{2}\) are the pion and kaon squared mass splittings, respectively. The superscript \(\gamma \), here and in the following, denotes corrections that arise from electromagnetic effects only. This parameter is often a crucial intermediate quantity in the extraction of the individual light-quark masses. Indeed, it can be shown using the G-parity symmetry of the pion triplet that \(\Delta M_{\pi }^{2}\) does not receive \(O(\delta m)\) isospin-breaking corrections. In other words

$$\begin{aligned} \Delta M_{\pi }^{2}=\left( \Delta M_{\pi }^{2}\right) ^{\gamma } \quad \text {and} \quad \epsilon =\frac{\left( \Delta M_{K}^{2}\right) ^{\gamma }}{\Delta M_{\pi }^{2}}-1\,, \end{aligned}$$
(40)

at leading order in the isospin-breaking expansion. The difference \((\Delta M_{\pi }^{2})^{SU(2)}\) was estimated in previous editions of FLAG through the \(\epsilon _m\) parameter. However, consistent with our leading-order truncation of the isospin-breaking expansion, it is simpler to ignore this term. Once known, \(\epsilon \) allows one to consistently subtract the electromagnetic part of the kaon splitting to obtain the QCD splitting \((\Delta M_{K}^{2})^{SU(2)}\). In contrast with the pion, the kaon QCD splitting is sensitive to \(\delta m\), and, in particular, proportional to it at leading order in \(\chi \)PT. Therefore, the knowledge of \(\epsilon \) allows for the determination of \(\delta m\) from a chiral fit to lattice-QCD data. Originally introduced in another form in [184], \(\epsilon \) vanishes in the SU(3) chiral limit, a result known as Dashen’s theorem. However, in the 1990s numerous phenomenological papers pointed out that \(\epsilon \) might be an O(1) number, indicating a significant failure of SU(3) \(\chi \)PT in the description of electromagnetic effects on light meson masses. These phenomenological determinations of \(\epsilon \) are themselves somewhat controversial, leading to the rather imprecise estimate \(\epsilon =0.7(5)\) given in the first edition of FLAG. In this edition of the review, we quote below more precise averages for \(\epsilon \), directly obtained from lattice-QCD+QED simulations. We refer the reader to the previous editions of FLAG, and to the review [185], for discussions of the phenomenological determinations of \(\epsilon \).

Regarding finite-volume effects for calculations including QED, this edition of FLAG uses a new quality criterion presented in Sect. 2.1.1. Indeed, due to the long-distance nature of the electromagnetic interaction, these effects are dominated by a power law in the lattice spatial size. The coefficients of this expansion depend on the chosen finite-volume formulation of QED. For \(\mathrm {QED}_{\mathrm {L}}\), these effects on the squared mass \(M^2\) of a charged meson are given by [119, 120, 122]

$$\begin{aligned} \Delta _{\mathrm {FV}}M^2= \alpha M^2\left\{ \frac{c_{1}}{ML}+\frac{2c_1}{(ML)^2}+ {\mathcal {O}}\left[ \frac{1}{(ML)^3}\right] \right\} \,,\end{aligned}$$
(41)

with \(c_1\simeq -2.83730\). It has been shown in [119] that the first two orders in this expansion are exactly known for hadrons, and are equal to the pointlike case. However, the \({\mathcal {O}}[1/(ML)^{3}]\) term and higher orders depend on the structure of the hadron. The universal corrections for \(\mathrm {QED}_{\mathrm {TL}}\) can also be found in [119]. Throughout this section, the QED finite-volume quality criterion has been applied with \(n_{\mathrm {min}}=3\) for all computations using such universal formulae; otherwise \(n_{\mathrm {min}}=1\) was used.
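
To give a feeling for the size of the universal terms in Eq. (41), the following sketch evaluates the relative shift \(\Delta _{\mathrm {FV}}M^2/M^2\) truncated at \({\mathcal {O}}[1/(ML)^2]\). The value used for the fine-structure constant and the function name `qedl_fv_relative_shift` are assumptions introduced for illustration; the structure-dependent \({\mathcal {O}}[1/(ML)^3]\) terms are not included.

```python
ALPHA = 1.0 / 137.035999  # fine-structure constant (assumed input, not from the text)
C1 = -2.83730             # coefficient c_1 quoted with Eq. (41)

def qedl_fv_relative_shift(ml, alpha=ALPHA, c1=C1):
    """Universal QED_L finite-volume shift Delta_FV M^2 / M^2.

    `ml` is the dimensionless product M*L of the charged-meson mass
    and the spatial extent. Only the 1/(ML) and 1/(ML)^2 terms of
    Eq. (41) are kept.
    """
    return alpha * (c1 / ml + 2.0 * c1 / ml**2)

for ml in (2.0, 4.0, 6.0):
    shift = qedl_fv_relative_shift(ml)
    print(f"ML = {ml:.1f}: Delta_FV M^2 / M^2 = {shift:+.4%}")
```

The shift is negative and decays only as a power of \(1/(ML)\), in contrast with the exponentially suppressed finite-volume effects of pure QCD.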

Since FLAG 16, six new results have been reported for nondegenerate light-quark masses. In the \(N_f=2\,+\,1\,+\,1\) sector, MILC 18 [145] computed \(\epsilon \) using \(N_f=2\,+\,1\) asqtad electro-quenched QCD+\(\mathrm {QED}_{\mathrm {TL}}\) simulations and extracted the ratio \(m_u/m_d\) from a new set of \(N_f=2\,+\,1\,+\,1\) HISQ QCD simulations. Although \(\epsilon \) comes from \(N_f=2\,+\,1\) simulations, \((\Delta M_{K}^{2})^{SU(2)}\), which is about three times larger than \((\Delta M_{K}^{2})^{\gamma }\), has been determined in the \(N_f=2\,+\,1\,+\,1\) theory. We therefore chose to classify this result as a four-flavour one. This result is explicitly described by the authors as an update of MILC 17 [5]. In MILC 17 [5], \(m_u/m_d\) is determined as a side-product of a global analysis of heavy-meson decay constants, using a preliminary version of \(\epsilon \) from MILC 18 [145]. In FNAL/MILC/TUMQCD 18 [8] the ratio \(m_u/m_d\) from MILC 17 [5] is used to determine the individual masses \(m_u\) and \(m_d\) from a new calculation of \(m_{ud}\). The work RM123 17 [19] is the continuation of the \(N_f=2\) result named RM123 13 [141] in the previous edition of FLAG. This group now uses \(N_f=2\,+\,1\,+\,1\) ensembles from ETM 10 [186], however still with a rather large minimum pion mass of \(270~\mathrm {MeV}\), leading to the  rating for chiral extrapolations. In the \(N_f=2\,+\,1\) sector, BMW 16 [20] reuses the data set produced from their determination of the light baryon octet mass splittings [142] using electro-quenched QCD+\(\mathrm {QED}_{\mathrm {TL}}\) smeared clover fermion simulations. Finally, MILC 16 [187], which is a preliminary result for the value of \(\epsilon \) published in MILC 18 [145], also provides an \(N_f=2\,+\,1\) computation of the ratio \(m_u/m_d\).

MILC 09A [17] uses the mass difference between \(K^0\) and \(K^+\), from which they subtract electromagnetic effects using Dashen’s theorem with corrections, as discussed in the introduction of this section. The up and down sea quarks remain degenerate in their calculation, fixed to the value of \(m_{ud}\) obtained from \(M_{\pi ^0}\). To determine \(m_u/m_d\), BMW 10A, 10B [11, 12] follow a slightly different strategy. They obtain this ratio from their result for \(m_s/m_{ud}\) combined with a phenomenological determination of the isospin-breaking quark-mass ratio \(Q=22.3(8)\), from \(\eta \rightarrow 3\pi \) decays [188] (the decay \(\eta \rightarrow 3\pi \) is very sensitive to QCD isospin breaking but fairly insensitive to QED isospin breaking). Instead of subtracting electromagnetic effects using phenomenology, RBC 07 [138] and Blum 10 [139] actually include a quenched electromagnetic field in their calculation. This means that their results include corrections to Dashen’s theorem, albeit only in the presence of quenched electromagnetism. Since the up and down quarks in the sea are treated as degenerate, very small isospin corrections are neglected, as in MILC’s calculation. PACS-CS 12 [158] takes the inclusion of isospin-breaking effects one step further. Using reweighting techniques, it also includes electromagnetic and \(m_u-m_d\) effects in the sea. However, they do not correct for the large finite-volume effects coming from electromagnetism in their \(M_{\pi }L\sim 2\) simulations, but provide rough estimates for their size, based on Ref. [149]. QCDSF/UKQCD 15 [189] uses QCD+QED dynamical simulations performed at the SU(3)-flavour-symmetric point, but at a single lattice spacing, so they do not enter our average. The smallest partially quenched (\(m_{\mathrm{sea}}\ne m_{\mathrm{val}}\)) pion mass is greater than 200 MeV, so our chiral-extrapolation criteria require a rating. 
Concerning finite-volume effects, this work uses three spatial extents L of \(1.6~\mathrm {fm}\), \(2.2~\mathrm {fm}\), and \(3.3~\mathrm {fm}\). QCDSF/UKQCD 15 claims that the volume dependence is not visible on the two largest volumes, leading them to assume that finite-size effects are under control. As a consequence of that, the final result for quark masses does not feature a finite-volume extrapolation or an estimation of the finite-volume uncertainty. However, in their work on the QED corrections to the hadron spectrum [189] based on the same ensembles, a volume study shows some level of compatibility with the \(\mathrm {QED}_{\mathrm {L}}\) finite-volume effects derived in [120]. We see two issues here. Firstly, the analytical result quoted from [120] predicts large, \(O(10\%)\) finite-size effects from QED on the meson masses at the values of \(M_{\pi }L\) considered in QCDSF/UKQCD 15, which is inconsistent with the statement made in the paper. Secondly, it is not known that the zero-mode regularization scheme used here has the same volume scaling as \(\mathrm {QED}_{\mathrm {L}}\). We therefore chose to assign the  rating for finite volume to QCDSF/UKQCD 15. Finally, for \(N_f=2\,+\,1\,+\,1\), ETM 14 [9] uses simulations in pure QCD, but determines \(m_u-m_d\) from the slope \(\partial M_K^2/\partial m_{ud}\) and the physical value for the QCD kaon-mass splitting taken from the phenomenological estimate in FLAG 13 (Fig. 4).

Fig. 4

Lattice results and FLAG averages at \(N_f = 2\,+\,1\) and \(2\,+\,1\,+\,1\) for the up–down quark masses ratio \(m_u/m_d\), together with the current PDG estimate

Table 7 Lattice results for \(m_u\), \(m_d\) (MeV) and for the ratio \(m_u/m_d\). The values refer to the \({\overline{\mathrm {MS}}}\) scheme at scale 2 GeV. The top part of the table lists the result obtained with \(N_{ f}=2\,+\,1\,+\,1\), while the lower part presents calculations with \(N_f = 2\,+\,1\)

Lattice results for \(m_u\), \(m_d\) and \(m_u/m_d\) are summarized in Table 7. It is important to notice two major changes in the grading of these results: the introduction of an “isospin breaking” criterion and the modification of the “finite volume” criterion in the presence of QED. The colour coding is specified in detail in Sect. 2.1. Considering the important progress in the last years on including isospin-breaking effects in lattice simulations, we are now in a position where averages for \(m_u\) and \(m_d\) can be made without the need of phenomenological inputs. Therefore, lattice calculations of the individual quark masses using phenomenological inputs for isospin-breaking effects will be coded .

We start by recalling the \(N_f=2\) FLAG estimate for the light-quark masses, entirely coming from RM123 13 [141],

$$\begin{aligned}&\quad m_u =2.40(23) \,\mathrm {MeV}\quad \,\mathrm {Ref.}~\text{[141] },\nonumber \\ N_{ f}= 2:&\quad m_d = 4.80(23) \,\mathrm {MeV}\quad \,\mathrm {Ref.}~\text{[141] },\nonumber \\&\quad {m_u}/{m_d} = 0.50(4) \quad \,\mathrm {Ref.}~\text{[141] }, \end{aligned}$$
(42)

with errors of roughly 10%, 5% and 8%, respectively. In these results, the errors are obtained by combining the lattice statistical and systematic errors in quadrature. For \(N_{ f}=2\,+\,1\), the only result, which qualifies for entering the FLAG average for quark masses, is BMW 16 [20],

$$\begin{aligned}&\quad m_u =2.27(9)\,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[20] }\,, \nonumber \\ N_{ f}= 2\,+\,1:&\quad m_d = 4.67(9) \,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[20] }\,,\nonumber \\&\quad {m_u}/{m_d} = 0.485(19)\quad \,\mathrm {Ref.}~ \text{[20] }\,, \end{aligned}$$
(43)

with errors of roughly 4%, 2% and 4%, respectively. This estimate is slightly more precise than in the previous edition of FLAG. More importantly, it now comes entirely from a lattice-QCD+QED calculation, whereas phenomenological input was used in previous editions. These numbers result in the following RGI averages

$$\begin{aligned}&\quad M_u^{\mathrm{RGI}} =3.16(13)_m(4)_\Lambda \,\mathrm {MeV}= 3.16(13)\,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[20] }\,, \nonumber \\ N_{ f}= 2\,+\,1:&\nonumber \\&\quad M_d^{\mathrm{RGI}} = 6.50(13)_m(8)_\Lambda \,\mathrm {MeV}= 6.50(15)\,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[20] }\,. \end{aligned}$$
(44)

Finally, for \(N_{ f}=2\,+\,1\,+\,1\), only RM123 17 [19] enters the average, giving

$$\begin{aligned}&\quad m_u =2.50(17)\,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[19] }\,,\nonumber \\ N_{ f}= 2\,+\,1\,+\,1:&\quad m_d = 4.88(20)\,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[19] }\,,\nonumber \\&\quad {m_u}/{m_d} = 0.513(31)\quad \,\mathrm {Ref.}~ \text{[19] }\,. \end{aligned}$$
(45)

with errors of roughly 7%, 4% and 6%, respectively. In the previous edition of FLAG, ETM 14 [9] was used for the average. The RM123 17 result used here is slightly more precise and is free of phenomenological input. The value of \(m_u/m_d\) in MILC 17 [5] depends critically on the value of \(\epsilon \) given in MILC 18 [145], which was unpublished at the time of the review deadline. As a consequence we did not include the result MILC 17 [5] in the average. The value will appear in the average of the online version of the review. It is, however, important to point out that both the MILC 17 and MILC 18 results show a marginal discrepancy with RM123 17 [19] of 1.7 standard deviations. The RGI averages are

$$\begin{aligned}&\quad M_u^{\mathrm{RGI}} =3.48(24)_m(4)_\Lambda \,\mathrm {MeV}= 3.48(24) \,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[19] }\,,\nonumber \\ N_{ f}= 2\,+\,1\,+\,1:&\nonumber \\&\quad M_d^{\mathrm{RGI}} = 6.80(28)_m(8)_\Lambda \,\mathrm {MeV}= 6.80(29) \,\mathrm {MeV}\quad \,\mathrm {Ref.}~ \text{[19] }\,. \end{aligned}$$
(46)

Every result for \(m_u\) and \(m_d\) used here to produce the FLAG averages relies on electro-quenched calculations, so it is of some interest to comment on the size of quenching effects. Considering phenomenology and the lattice results presented here, it is reasonable for a rough estimate to use the value \((\Delta M_{K}^{2})^{\gamma }\sim 2000~\mathrm {MeV}^2\) for the QED part of the kaon splitting. Using the arguments presented in Sect. 3.1.3, one can assume that the QED sea contribution represents \(O(10\%)\) of \((\Delta M_{K}^{2})^{\gamma }\). Using SU(3) PQ\(\chi \)PT+QED [143, 191] gives a \(\sim 5\%\) effect. Keeping the more conservative \(10\%\) estimate and using the experimental value of the kaon splitting, one finds that the QCD kaon splitting \((\Delta M_{K}^{2})^{SU(2)}\) suffers from a reduced \(3\%\) quenching uncertainty. Considering that this splitting is proportional to \(m_u-m_d\) at leading order in SU(3) \(\chi \)PT, we can estimate that a similar error will propagate to the quark masses. So the individual up and down masses look mildly affected by QED quenching. However, one notices that \(\sim 3\%\) is the level of error in the new FLAG averages, and significantly improving this accuracy will require using fully unquenched calculations.
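
The arithmetic behind the \(\sim 3\%\) figure can be reproduced with a short sketch. The charged and neutral kaon masses below are approximate experimental values supplied here as an assumption (they are not quoted in the text), and the function name is hypothetical; the \(2000~\mathrm {MeV}^2\) QED splitting and the \(10\%\) sea fraction are the rough estimates used above.

```python
# Approximate experimental kaon masses in MeV (assumed inputs, not from the text)
M_K_PLUS = 493.677
M_K_ZERO = 497.611

def qcd_kaon_splitting_quenching(dmk2_exp, dmk2_qed, sea_fraction):
    """Return the QCD kaon splitting and its relative quenching uncertainty.

    dmk2_exp     : experimental M_{K+}^2 - M_{K0}^2 in MeV^2
    dmk2_qed     : QED part of the splitting, (Delta M_K^2)^gamma, in MeV^2
    sea_fraction : assumed relative size of the neglected QED sea contribution
    """
    dmk2_qcd = dmk2_exp - dmk2_qed            # (Delta M_K^2)^{SU(2)}
    abs_uncertainty = sea_fraction * abs(dmk2_qed)
    return dmk2_qcd, abs_uncertainty / abs(dmk2_qcd)

dmk2_exp = M_K_PLUS**2 - M_K_ZERO**2
dmk2_qcd, rel_quenching = qcd_kaon_splitting_quenching(dmk2_exp, 2000.0, 0.10)
print(f"QCD splitting = {dmk2_qcd:.0f} MeV^2, "
      f"relative quenching uncertainty = {rel_quenching:.1%}")
```

Because the QCD splitting is roughly three times larger in magnitude than the QED one, a \(10\%\) uncertainty on the latter is diluted to a few percent on the former.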

In view of the fact that a massless up-quark would solve the strong CP-problem, many authors have considered this an attractive possibility, but the results presented above exclude this possibility: the value of \(m_u\) in Eq. (43) differs from zero by 25 standard deviations. We conclude that nature solves the strong CP-problem differently.

Finally, we conclude this section by giving the FLAG averages for \(\epsilon \) defined in Eq. (39). For \(N_{ f}=2\,+\,1\,+\,1\), we average the RM123 17 [19] result with the value of \((\Delta M_{K}^{2})^{\gamma }\) from BMW 14 [119] combined with Eq. (40), giving

$$\begin{aligned} \epsilon =0.79(7)\,. \end{aligned}$$
(47)

Although BMW 14 [119] focuses on hadron masses and did not extract the light-quark masses, they are the only fully unquenched QCD+QED calculation to date that qualifies to enter a FLAG average. With the exception of renormalization which is not discussed in the paper, this work has a  rating for every FLAG criterion considered for the \(m_u\) and \(m_d\) quark masses. For \(N_{ f}=2\,+\,1\) we use the results from BMW 16 [20]

$$\begin{aligned} \epsilon =0.73(17)\,. \end{aligned}$$
(48)

These results are entirely determined from lattice-QCD+QED and represent an improvement of the error by a factor of two to three on the FLAG 16 phenomenological estimate.

It is important to notice that the \(\epsilon \) uncertainties from BMW 16 and RM123 17 are dominated by estimates of the QED quenching effects. Indeed, in contrast with the quark masses, \(\epsilon \) is expected to be rather sensitive to the sea-quark QED contributions. Using the arguments presented in Sect. 3.1.3, if one conservatively assumes that the QED sea contributions represent \(O(10\%)\) of \((\Delta M_{K}^{2})^{\gamma }\), then Eq. (40) implies that \(\epsilon \) will have a quenching error of \(\sim 0.15\) for \((\Delta M_{K}^{2})^{\gamma }\sim 2000~\mathrm {MeV}^2\), representing a large \(\sim 20\%\) relative error. It is interesting to observe that such a discrepancy does not appear between BMW 15 and RM123 17, although the \(\sim 10\%\) accuracy of both results might not be sufficient to resolve these effects. To conclude, although the controversy around the value of \(\epsilon \) has been significantly reduced by lattice-QCD+QED determinations, computing this quantity precisely requires fully unquenched simulations.
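
The estimate of the quenching error on \(\epsilon \) follows from propagating an uncertainty on \((\Delta M_{K}^{2})^{\gamma }\) through Eq. (40). A minimal sketch is given below; the pion masses are approximate experimental values supplied as an assumption, the function name is hypothetical, and the relative error is taken with respect to the \(N_f=2\,+\,1\,+\,1\) average of Eq. (47).

```python
# Approximate experimental pion masses in MeV (assumed inputs, not from the text)
M_PI_PLUS = 139.570
M_PI_ZERO = 134.977

def epsilon_quenching_error(dmk2_qed, dmpi2, sea_fraction):
    """Error on epsilon from Eq. (40), epsilon = (Delta M_K^2)^gamma / Delta M_pi^2 - 1,
    when (Delta M_K^2)^gamma carries a relative uncertainty `sea_fraction`."""
    return sea_fraction * dmk2_qed / dmpi2

dmpi2 = M_PI_PLUS**2 - M_PI_ZERO**2          # pion squared-mass splitting in MeV^2
eps_error = epsilon_quenching_error(2000.0, dmpi2, 0.10)
rel_error = eps_error / 0.79                 # relative to the average in Eq. (47)
print(f"Delta M_pi^2 = {dmpi2:.0f} MeV^2, "
      f"quenching error on epsilon = {eps_error:.2f} ({rel_error:.0%})")
```

The pion splitting in the denominator is smaller than the kaon QED splitting, which is why the same \(10\%\) assumption translates into a much larger relative effect on \(\epsilon \) than on the quark masses.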

3.1.7 Estimates for R and Q

The quark-mass ratios

$$\begin{aligned} R\equiv \frac{m_s-m_{ud}}{m_d-m_u}\quad \text{ and }\quad Q^2\equiv \frac{m_s^2-m_{ud}^2}{m_d^2-m_u^2} \end{aligned}$$
(49)

compare SU(3) breaking with isospin breaking. Both numbers only depend on the ratios \(m_s/m_{ud}\) and \(m_u/m_d\),

$$\begin{aligned} R=\frac{1}{2}\left( \frac{m_s}{m_{ud}}-1\right) \frac{1+\frac{m_u}{m_d}}{1-\frac{m_u}{m_d}} \quad \text {and}\quad Q^2=\frac{1}{2}\left( \frac{m_s}{m_{ud}}+1\right) R.\nonumber \\ \end{aligned}$$
(50)
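As a quick numerical check of Eq. (50), the following minimal Python sketch evaluates R and Q from the two ratios. The function name and the input values (\(m_s/m_{ud}=27.42\), \(m_u/m_d=0.485\)) are illustrative assumptions that are not taken from this section, and no uncertainties are propagated.

```python
import math


def r_and_q(ms_over_mud, mu_over_md):
    """Evaluate R and Q from Eq. (50) given m_s/m_ud and m_u/m_d."""
    r = 0.5 * (ms_over_mud - 1.0) * (1.0 + mu_over_md) / (1.0 - mu_over_md)
    q_squared = 0.5 * (ms_over_mud + 1.0) * r
    return r, math.sqrt(q_squared)


# Illustrative inputs (assumed for this example only).
R, Q = r_and_q(ms_over_mud=27.42, mu_over_md=0.485)
print(f"R = {R:.1f}, Q = {Q:.1f}")
```

Because both quantities depend on \(m_u/m_d\) through the factor \((1+m_u/m_d)/(1-m_u/m_d)\), a small shift in that ratio moves R and Q noticeably, whereas the dependence on \(m_s/m_{ud}\) is close to linear.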

The quantity Q is of particular interest because of a low-energy theorem [192], which relates it to a ratio of meson masses,

$$\begin{aligned} Q^2_M\equiv & {} \frac{{\hat{M}}_K^2}{{\hat{M}}_\pi ^2}\frac{{\hat{M}}_K^2-{\hat{M}}_\pi ^2}{{\hat{M}}_{K^0}^2- {\hat{M}}_{K^+}^2}\,,\quad {\hat{M}}^2_\pi \equiv {\frac{1}{2}}( {\hat{M}}^2_{\pi ^+}+ {\hat{M}}^2_{\pi ^0}) \,,\nonumber \\ {\hat{M}}^2_K\equiv & {} {\frac{1}{2}}( {\hat{M}}^2_{K^+}+ {\hat{M}}^2_{K^0})\,.\end{aligned}$$
(51)
Table 8 Our estimates for the strange-quark and the average up-down-quark masses in the \({\overline{\mathrm {MS}}}\) scheme at running scale \(\mu =2\,\mathrm {GeV}\). Mass values are given in MeV. In the results presented here, the error is the one which we obtain by applying the averaging procedure of Sect. 2.3 to the relevant lattice results. We have added an uncertainty to the \(N_f=2\,+\,1\) results, associated with the neglect of the charm sea-quark and isospin-breaking effects, as discussed around Eqs. (33) and (37)
Table 9 Our estimates for the masses of the two lightest quarks and related, strong isospin-breaking ratios. Again, the masses refer to the \({\overline{\mathrm {MS}}}\) scheme at running scale \(\mu =2\,\mathrm {GeV}\). Mass values are given in MeV

Chiral symmetry implies that the expansion of \(Q_M^2\) in powers of the quark masses (i) starts with \(Q^2\) and (ii) does not receive any contributions at NLO:

$$\begin{aligned} Q_M\overset{\mathrm{NLO}}{=}Q\,. \end{aligned}$$
(52)

We recall here the \(N_f=2\) estimates for Q and R from FLAG 16,

$$\begin{aligned} R=40.7(3.7)(2.2)\,,\quad Q=24.3(1.4)(0.6)\ , \end{aligned}$$
(53)

where the second error comes from the phenomenological inputs that were used. For \(N_{ f}=2\,+\,1\), we use Eqs. (37) and (43) and obtain

$$\begin{aligned} R=38.1(1.5)\,,\quad Q=23.3(0.5)\ , \end{aligned}$$
(54)

where now only lattice results have been used. For \(N_{ f}=2\,+\,1\,+\,1\) we obtain

$$\begin{aligned} R=40.7(2.7)\,,\quad Q=24.0(0.8)\ , \end{aligned}$$
(55)

which are quite compatible with two- and three-flavour results. It is interesting to notice that the most recent phenomenological determination of R and Q from \(\eta \rightarrow 3\pi \) decay [193] gives the values \(R=34.4(2.1)\) and \(Q=22.1(7)\), which are marginally discrepant with the averages presented here. For \(N_{ f}=2\,+\,1\), the discrepancy is 1.4 standard deviations for both R and Q. For \(N_{ f}=2\,+\,1\,+\,1\) it is 1.8 standard deviations. The authors of [193] point out that this discrepancy is due to surprisingly large corrections to the approximation (52) used in the phenomenological analysis.
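The quoted discrepancies can be reproduced with a short Python sketch, assuming that the lattice and phenomenological uncertainties are uncorrelated and Gaussian, so that the errors combine in quadrature. The helper name `tension_in_sigma` is hypothetical; the numbers are those of Eqs. (54), (55) and of Ref. [193] as quoted above.

```python
import math


def tension_in_sigma(value_a, err_a, value_b, err_b):
    """Difference of two independent results in units of the combined error."""
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


# Phenomenological values from eta -> 3 pi quoted in the text.
R_PHENO, Q_PHENO = (34.4, 2.1), (22.1, 0.7)

# Lattice averages from Eqs. (54) and (55).
lattice = {
    "Nf=2+1": {"R": (38.1, 1.5), "Q": (23.3, 0.5)},
    "Nf=2+1+1": {"R": (40.7, 2.7), "Q": (24.0, 0.8)},
}

for label, vals in lattice.items():
    t_r = tension_in_sigma(*vals["R"], *R_PHENO)
    t_q = tension_in_sigma(*vals["Q"], *Q_PHENO)
    print(f"{label}: R {t_r:.1f} sigma, Q {t_q:.1f} sigma")
```

Under these assumptions the sketch gives 1.4 standard deviations for both quantities at \(N_f=2\,+\,1\) and 1.8 at \(N_f=2\,+\,1\,+\,1\).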

Our final results for the masses \(m_u\), \(m_d\), \(m_{ud}\), \(m_s\) and the mass ratios \(m_u/m_d\), \(m_s/m_{ud}\), R, Q are collected in Tables 8 and 9. We separate \(m_u\), \(m_d\), \(m_u/m_d\), R and Q from \(m_{ud}\), \(m_s\) and \(m_s/m_{ud}\), because the latter are completely dominated by lattice results while the former still include some phenomenological input.

3.2 Charm quark mass

In the following, we collect and discuss the lattice determinations of the \(\overline{\mathrm{MS}}\) charm-quark mass \({\overline{m}}_c\). Most of the results have been obtained by analyzing lattice-QCD simulations of two-point heavy–light- or heavy–heavy-meson correlation functions, using as input the experimental values of the D, \(D_s\), and charmonium meson masses. Other groups use the moments method. The latter is based on the lattice calculation of the Euclidean time moments of pseudoscalar-pseudoscalar correlators for heavy-quark currents, followed by an operator product expansion (OPE) dominated by perturbative QCD effects, which provides the determination of both the heavy-quark mass and the strong-coupling constant \(\alpha _s\).

The heavy-quark actions adopted by various lattice collaborations have been discussed in previous FLAG reviews [2, 3], and their descriptions can be found in Sect. A.1.3. While the charm mass determined with the moments method does not need any lattice evaluation of the mass-renormalization constant \(Z_m\), the extraction of \({\overline{m}}_c\) from two-point heavy-meson correlators does require the nonperturbative calculation of \(Z_m\). The lattice scale at which \(Z_m\) is obtained is usually at least of the order 2–3 GeV, and therefore it is natural in this review to provide the values of \({\overline{m}}_c(\mu )\) at the renormalization scale \(\mu = 3~\mathrm {GeV}\). Since the choice of a renormalization scale equal to \({\overline{m}}_c\) is still commonly adopted (as by PDG [170]), we have collected in Table 10 the lattice results for both \({\overline{m}}_c({\overline{m}}_c)\) and \({\overline{m}}_c(\text{3 } \text{ GeV })\), obtained for \(N_f =2\,+\,1\) and \(2\,+\,1\,+\,1\). This year’s review does not contain results for \(N_f=2\), and interested readers are referred to previous reviews [2, 3].

When not directly available in the published work, we apply a conversion factor equal either to 0.900 between the scales \(\mu = 2\) GeV and \(\mu = 3\) GeV or to 0.766 between the scales \(\mu = {\overline{m}}_c\) and \(\mu = 3\) GeV, obtained using perturbative QCD evolution at four loops assuming \(\Lambda _{QCD} = 300\) MeV for \(N_f = 4\).
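The arithmetic of this conversion can be sketched in a few lines of Python. The dictionary keys, the function name and the input value are hypothetical. The sketch also assumes that each factor multiplies the mass quoted at the lower scale to give the mass at 3 GeV, as expected for a running mass that decreases with increasing scale.

```python
# Conversion factors to mu = 3 GeV quoted in the text (four-loop running,
# Lambda_QCD = 300 MeV, N_f = 4).
FACTOR_TO_3GEV = {
    "2GeV": 0.900,  # assumed direction: m_c(3 GeV) = 0.900 * m_c(2 GeV)
    "mc": 0.766,    # assumed direction: m_c(3 GeV) = 0.766 * m_c(m_c)
}


def to_3gev(mass_gev, from_scale):
    """Convert an MS-bar charm mass quoted at `from_scale` to mu = 3 GeV."""
    return FACTOR_TO_3GEV[from_scale] * mass_gev


# Hypothetical input in GeV, chosen only to illustrate the arithmetic.
print(round(to_3gev(1.10, "2GeV"), 3))
```

A fixed factor of this kind carries no uncertainty of its own, so any error from the perturbative running or from the choice of \(\Lambda _{QCD}\) would have to be assessed separately.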

Table 10 Lattice results for the \({\overline{\mathrm {MS}}}\)-charm-quark mass \({\overline{m}}_c({\overline{m}}_c)\) and \({\overline{m}}_c(\text{3 } \text{ GeV })\) in GeV, together with the colour coding of the calculations used to obtain these. When not directly available in a publication, we employ a conversion factor equal to 0.900 between the scales \(\mu = 2\) GeV and \(\mu = 3\) GeV (or, 0.766 between \(\mu = {\overline{m}}_c\) and \(\mu = 3\) GeV)

In the next sections, we review separately the results of \({\overline{m}}_c({\overline{m}}_c)\) for the various values of \(N_f\).

3.2.1 \(N_f = 2\,+\,1\) results

The HPQCD 10 [13] result is computed from moments, using a subset of \(N_f = 2\,+\,1\) Asqtad-staggered-fermion ensembles from MILC [129] and HISQ valence fermions. The charm mass is fixed from the \(\eta _c\) meson, \(M_{\eta _c} = 2.9852 (34) ~ \mathrm {GeV}\), corrected for \({{\bar{c}}}c\) annihilation and electromagnetic effects. HPQCD 10 supersedes the HPQCD 08B [171] result using valence-Asqtad-staggered fermions.

\(\chi \)QCD 14 [22] uses a mixed-action approach based on overlap fermions for the valence quarks and domain-wall fermions for the sea quarks. They adopt six of the gauge ensembles generated by the RBC/UKQCD collaboration [160] at two values of the lattice spacing (0.087 and 0.11 fm) with unitary pion masses in the range from 290 to 420 MeV. For the valence quarks no light-quark masses are simulated. At the lightest pion mass \(M_\pi \simeq \) 290 MeV, \(M_\pi L=4.1\), which satisfies the tag for finite-volume effects. The strange- and charm-quark masses are fixed together with the lattice scale by using the experimental values of the \(D_s\), \(D_s^*\) and \(J/\psi \) meson masses.

JLQCD 15B [194] determines the charm mass by using the moments method and Möbius domain-wall fermions at three values of the lattice spacing, ranging from 0.044 to 0.083 fm. They employ 15 ensembles in all, including several different pion masses and volumes. The lightest pion mass is \(\simeq 230\) MeV with \(M_\pi L \simeq 4.4\). The linear size of their lattices is in the range 2.6–3.8 fm.

Since FLAG 16 there have been two new results, JLQCD 16 [23] and Maezawa 16 [157]. The former supersedes JLQCD 15B as it is a published update of their previous preliminary result. The latter employs the moments method using pseudoscalar correlation functions computed with HISQ fermions on a set of 11 ensembles with lattice spacings in the range 0.04 to 0.14 fm. Only a single pion mass of 160 MeV is studied. The linear sizes of the lattices take values between 2.5 and 5.2 fm.

Thus, according to our rules on the publication status, the FLAG average for the charm-quark mass at \(N_f = 2\,+\,1\) is obtained by combining the results HPQCD 10, \(\chi \)QCD 14, and JLQCD 16,

$$\begin{aligned} N_f = 2\,+\,1:\quad {\overline{m}}_c({\overline{m}}_c) = 1.275 ~ (5) ~ \mathrm {GeV}\quad \,\mathrm {Refs.}~ \text{[13,22,23] }\,, \end{aligned}$$
(56)
$$\begin{aligned} N_f = 2\,+\,1:\quad {\overline{m}}_c(\text{3 } \text{ GeV }) = 0.992 ~ (6)~ \mathrm {GeV}\quad \,\mathrm {Refs.}~ \text{[13,22,23] }, \end{aligned}$$
(57)

where the error on \( {\overline{m}}_c(\text{3 } \text{ GeV })\) includes a stretching factor \(\sqrt{\chi ^2/\text{ dof }} \simeq 1.18\) as discussed in Sect. 2.2. This result corresponds to the following RGI average

$$\begin{aligned} M_c^{\mathrm{RGI}}&= 1.529(9)_m(14)_\Lambda ~ \mathrm {GeV}= 1.529(17) ~ \mathrm {GeV}\nonumber \\&\quad \ \ \,\mathrm {Refs.}~ \text{[13,22,23] }. \end{aligned}$$
(58)
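The role of the stretching factor can be illustrated with a minimal Python sketch of an inverse-variance weighted average. This is a simplified version that treats all inputs as uncorrelated, and it is not the full averaging procedure of this review. The function name and the input numbers are hypothetical and are not the results entering Eqs. (56) and (57).

```python
import math


def weighted_average(values, errors):
    """Inverse-variance weighted mean with a chi^2-based stretching factor.

    Simplification: the inputs are treated as uncorrelated.
    """
    if len(values) < 2:
        raise ValueError("need at least two results to compute chi^2/dof")
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    error = math.sqrt(1.0 / total)
    dof = len(values) - 1
    chi2_dof = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / dof
    # Stretch the error only when the fit is poor (chi^2/dof > 1).
    stretch = math.sqrt(chi2_dof) if chi2_dof > 1.0 else 1.0
    return mean, error * stretch, chi2_dof


# Hypothetical inputs in GeV, for illustration only.
mean, err, chi2_dof = weighted_average([0.980, 1.000, 0.995], [0.010, 0.010, 0.020])
print(f"{mean:.3f} +- {err:.3f}  (chi2/dof = {chi2_dof:.2f})")
```

The error is enlarged only when \(\chi ^2/\text{ dof }>1\), so mutually consistent inputs keep the plain weighted-average error.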

3.2.2 \(N_f = 2\,+\,1\,+\,1\) results

In FLAG 16 three results employing four dynamical quarks in the sea were discussed. ETM 14 [9] uses 15 twisted-mass gauge ensembles at three lattice spacings ranging from 0.062 to 0.089 fm, in boxes of size ranging from 2.0 to 3.0 fm and pion masses from 210 to 440 MeV (explaining the tag in the chiral extrapolation and the tag for the continuum extrapolation). The value of \(M_\pi L\) at their smallest pion mass is 3.2 with more than two volumes (explaining the tag in the finite-volume effects). They fix the strange mass with the kaon mass and the charm one with that of the \(D_s\) and D mesons.

ETM 14A [21] uses 10 out of the 15 gauge ensembles adopted in ETM 14 spanning the same range of values for the pion mass and the lattice spacing, but the latter is fixed using the nucleon mass. Two lattice volumes with size larger than 2.0 fm are employed. The physical strange and the charm mass are obtained using the masses of the \(\Omega ^-\) and \(\Lambda _c^+\) baryons, respectively.

HPQCD 14A [16] employs the moments method with HISQ fermions. Their results are based on 9 out of the 21 ensembles produced by the MILC collaboration [18]. Lattice spacings range from 0.057 to 0.153 fm, with box sizes up to 5.8 fm and taste-Goldstone-pion masses down to 130 MeV. The RMS-pion masses go down to 173 MeV. The strange- and charm-quark masses are fixed using \(M_{{{\bar{s}}}s} = 688.5 (2.2)~\mathrm {MeV}\), calculated without including \({{\bar{s}}}s\) annihilation effects, and \(M_{\eta _c} = 2.9863(27)~\mathrm {GeV}\), obtained from the experimental \(\eta _c\) mass after correcting for \({{\bar{c}}}c\) annihilation and electromagnetic effects. All of the selection criteria of Sect. 2.1.1 are satisfied with the tag ★.\(^{16}\)

Since FLAG 16 two groups, FNAL/MILC/TUMQCD and HPQCD, have produced new values for the charm-quark mass [8, 15]. The latter use nonperturbative renormalization in the RI-SMOM scheme as described in the strange quark section and the same HISQ ensembles and valence quarks as those described in HPQCD 14A [16].

The FNAL/MILC/TUMQCD groups use a new minimal-renormalon-subtraction scheme (MRS) [195] and a sophisticated, but complex, fit strategy incorporating three effective field theories: heavy quark effective theory (HQET), heavy-meson rooted all-staggered chiral perturbation theory (HMrAS\(\chi \)PT), and Symanzik effective theory for cutoff effects. Heavy–light meson masses are computed from fits to lattice-QCD correlation functions. They employ HISQ quarks on 20 MILC \(2\,+\,1\,+\,1\) flavour ensembles with six lattice spacings between 0.03 and 0.15 fm (the largest is used only in the estimation of the systematic error in the continuum-limit extrapolation). The pion mass is physical on several ensembles except the finest, and \(M_\pi L=3.7\)–3.9 on the physical mass ensembles. The light-quark masses are fixed from meson masses in pure QCD, which have been shifted from their physical values using \(O(\alpha )\) electromagnetic effects recently computed by the MILC collaboration [145], see Sect. 3.1.6 for details. The heavy–light mesons are shifted using a phenomenological formula. Using chiral perturbation theory at NLO and NNLO, the results are corrected for exponentially small finite-volume effects. They find that nonexponential finite-volume effects due to nonequilibration of topological charge are negligible compared to other quoted errors. These allow for a combined continuum, chiral, and infinite-volume limit from a global fit including 77 free parameters to 324 data points which satisfies all of the FLAG criteria.

All four results enter the FLAG average for \(N_f = 2\,+\,1\,+\,1\) quark flavours. We note however that while the determinations of \({\overline{m}}_c\) by ETM 14 and 14A agree well with each other, they are incompatible with HPQCD 14A, HPQCD 18, and FNAL/MILC/TUMQCD 18 by several standard deviations. While the latter use the same configurations, the analyses are quite different and independent. As mentioned earlier, the ETM values of \(m_{ud}\) and \(m_s\) are also systematically high compared to their respective averages. In addition, the other four-flavour values are consistent with the three-flavour average. Combining all four results yields

Table 11 Lattice results for the quark-mass ratio \(m_c/m_s\), together with the colour coding of the calculations used to obtain these
(59)
(60)

where the errors include large stretching factors \(\sqrt{\chi ^2/\text{ dof }}\approx 2.0\) and 1.7, respectively. We have assumed 100% correlation for statistical errors between ETM results. For HPQCD 14A, HPQCD 18, and FNAL/MILC/TUMQCD 18 we use the correlations given in Ref. [15]. Our fits have \(\chi ^2/\text{ dof }=3.9\) and 2.8, respectively. The RGI average reads as follows

$$\begin{aligned} M_c^{\mathrm{RGI}}&= 1.523(11)_m(14)_\Lambda ~ \mathrm {GeV}= 1.523(18) ~ \mathrm {GeV}\nonumber \\&\quad \,\mathrm {Refs.}~ \text{[8,9,15,16,21] }. \end{aligned}$$
(61)
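One simple way to account for 100% correlated statistical errors, sketched here in Python under stated assumptions, is to keep inverse-variance weights based on the total errors and to evaluate the variance of the average with a covariance matrix whose off-diagonal entries are products of the correlated statistical errors. The function name, its arguments and the example numbers are hypothetical, and the sketch is not claimed to reproduce the fits behind Eqs. (59)–(61).

```python
import math


def correlated_average(values, stat, syst, correlated_pairs):
    """Weighted mean whose error includes 100% correlated statistical errors.

    values, stat, syst: per-result central values and error components.
    correlated_pairs: set of index pairs (i, j) sharing statistical errors.
    """
    n = len(values)
    total_var = [stat[i] ** 2 + syst[i] ** 2 for i in range(n)]
    norm = sum(1.0 / v for v in total_var)
    weights = [(1.0 / v) / norm for v in total_var]
    mean = sum(w * x for w, x in zip(weights, values))

    variance = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                cov = total_var[i]
            elif (i, j) in correlated_pairs or (j, i) in correlated_pairs:
                cov = stat[i] * stat[j]  # 100% correlated statistical part
            else:
                cov = 0.0
            variance += weights[i] * weights[j] * cov
    return mean, math.sqrt(variance)


# Hypothetical inputs: results 0 and 1 share their statistical errors.
mean, err = correlated_average(
    values=[1.00, 1.02, 0.98],
    stat=[0.010, 0.012, 0.005],
    syst=[0.010, 0.010, 0.004],
    correlated_pairs={(0, 1)},
)
print(f"{mean:.4f} +- {err:.4f}")
```

With positive correlations the error of the average is larger than the uncorrelated estimate, which is why the treatment of shared ensembles matters for the quoted uncertainties.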

Figure 5 presents the results given in Table 10 along with the FLAG averages obtained for \(2\,+\,1\) and \(2\,+\,1\,+\,1\) flavours.

Fig. 5 The charm quark mass for \(2\,+\,1\) and \(2\,+\,1\,+\,1\) flavours. For the latter a large stretching factor is used for the FLAG average due to poor \(\chi ^2\) from our fit

3.2.3 Lattice determinations of the ratio \(m_c/m_s\)

Because some of the results for the light-quark masses given in this review are obtained via the quark-mass ratio \(m_c/m_s\), we review these lattice calculations, which are listed in Table 11.

The \(N_f = 2\,+\,1\) results from \(\chi \)QCD 14 and HPQCD 09A [24] are the same as described for the charm-quark mass, and in addition the latter fixes the strange mass using \(M_{{{\bar{s}}}s} = 685.8(4.0)\,\mathrm {MeV}\). Since FLAG 16 another result has appeared, Maezawa 16, which does not pass our chiral-limit test (as described in the previous section), though we note that it is quite consistent with the other values. Combining \(\chi \)QCD 14 and HPQCD 09A, we obtain the same result reported in FLAG 16,

$$\begin{aligned} N_f = 2\,+\,1: \quad m_c / m_s = 11.82 ~ (16)\quad \,\mathrm {Refs.}~ \text{[22,24] },\nonumber \\ \end{aligned}$$
(62)

with a \(\chi ^2/\text{ dof } \simeq 0.85\).

Fig. 6 Lattice results for the ratio \(m_c / m_s\) listed in Table 11 and the FLAG averages corresponding to \(2\,+\,1\) and \(2\,+\,1\,+\,1\) quark flavours. The latter average includes a large stretching factor on the error due to a poor \(\chi ^2\) from our fit

Table 12 Lattice results for the \({\overline{\mathrm {MS}}}\)-bottom-quark mass \({\overline{m}}_b({\overline{m}}_b)\) in GeV, together with the systematic error ratings for each. Available results for the quark mass ratio \(m_b / m_c\) are also reported

Turning to \(N_f = 2\,+\,1\,+\,1\), in addition to the HPQCD 14A and ETM 14 calculations, already described in Sect. 3.2.2, we consider the recent FNAL/MILC/TUMQCD 18 value [8] (which updates and replaces [18]), where HISQ fermions are employed as described in the previous section. As for the HPQCD 14A result, all of our selection criteria are satisfied with the tag ★. However, some tension exists between the HPQCD and FNAL/MILC/TUMQCD results. Combining all three yields

$$\begin{aligned} {N_f = 2\,+\,1\,+\,1:} \quad m_c / m_s = 11.768~ (33)\quad \,\mathrm {Refs.}~ \text{[8,9,16] },\nonumber \\ \end{aligned}$$
(63)

where the error includes the stretching factor \(\sqrt{\chi ^2/\text{ dof }} \simeq 1.5\), and \(\chi ^2/\text{ dof }=2.28\). We have assumed a 100% correlation of statistical errors for FNAL/MILC/TUMQCD 18 and HPQCD 14A.

Results for \(m_c/m_s\) are shown in Fig. 6 together with the FLAG averages for \(2\,+\,1\) and \(2\,+\,1\,+\,1\) flavours.

3.3 Bottom quark mass

Now we review the lattice results for the \(\overline{\mathrm{MS}}\)-bottom-quark mass \({\overline{m}}_b\). Related heavy-quark actions and observables have been discussed in the FLAG 13 and 17 reviews [2, 3], and descriptions can be found in Sect. A.1.3. In Table 12 we collect results for \({\overline{m}}_b({\overline{m}}_b)\) obtained with \(N_f =2\,+\,1\) and \(2\,+\,1\,+\,1\) quark flavours in the sea. Available results for the quark-mass ratio \(m_b / m_c\) are also reported. After discussing the various results we evaluate the corresponding FLAG averages.

3.3.1 \(N_f=2\,+\,1\)

HPQCD 13B [197] extracts \({\overline{m}}_b\) from a lattice determination of the \(\Upsilon \) energy in NRQCD and the experimental value of the meson mass. The latter quantities yield the pole mass which is related to the \(\overline{\mathrm{MS}}\) mass in 3-loop perturbation theory. The MILC coarse (0.12 fm) and fine (0.09 fm) Asqtad-2 + 1-flavour ensembles are employed in the calculation. The bare light-(sea)-quark masses correspond to a single, relatively heavy, pion mass of about 300 MeV. No estimate of the finite-volume error is given. This result is not used in our average.

The value of \({\overline{m}}_b({\overline{m}}_b)\) reported in HPQCD 10 [13] is computed in a very similar fashion to the one in HPQCD 14A described in the following section on \(2\,+\,1\,+\,1\) flavour results, except that MILC \(2\,+\,1\)-flavour-Asqtad ensembles are used with HISQ valence quarks. The lattice spacings of the ensembles range from 0.18 to 0.045 fm, with pion masses down to about 165 MeV. In all, 22 ensembles were fit simultaneously. An estimate of the finite-volume error based on leading-order perturbation theory for the moment ratio is also provided. Details of perturbation theory and renormalization systematics are given in Sect. 9.7.

Maezawa 16 reports a new result for the b-quark mass since the last FLAG review. However, as discussed in the charm-quark section, this calculation does not satisfy the criteria to be used in the FLAG average. As in the previous review, we take the HPQCD 10 result as our average,

$$\begin{aligned}&N_f= 2\,+\,1 : \quad \overline{m}_b(\overline{m}_b) = 4.164 (23) ~ \mathrm {GeV}\nonumber \\&\quad \,\mathrm {Ref.}~ \text{[13] }\,, \end{aligned}$$
(64)

Since HPQCD quotes \({\overline{m}}_b({\overline{m}}_b)\) using \(N_f = 5\) running, we used that value in the average. The corresponding 4-flavour RGI average is

$$\begin{aligned}&N_f= 2\,+\,1 : M_b^\mathrm{RGI} = 6.874(38)_m(54)_\Lambda \nonumber \\&\quad \mathrm {GeV}= 6.874(66) ~ \mathrm {GeV}\quad \ \ \,\mathrm {Ref.}~ \text{[13] }. \end{aligned}$$
(65)

3.3.2 \(N_f=2\,+\,1\,+\,1\)

Results have been published by HPQCD using NRQCD and HISQ-quark actions (HPQCD 14B [25] and HPQCD 14A [16], respectively). In both works the b-quark mass is computed with the moments method, that is, from Euclidean-time moments of two-point, heavy–heavy-meson correlation functions (see also Sect. 9.7 for a description of the method).

In HPQCD 14B the b-quark mass is computed from ratios of the moments \(R_n\) of heavy current-current correlation functions, namely,

$$\begin{aligned} \left[ \frac{R_n r_{n-2}}{R_{n-2}r_n}\right] ^{1/2} \frac{{\bar{M}}_\mathrm{kin}}{2 m_b} = \frac{{\bar{M}}_{\Upsilon ,\eta _b}}{2 {{\bar{m}}}_b(\mu )} ~ , \end{aligned}$$
(66)

where \(r_n\) are the perturbative moments calculated at \(\hbox {N}^3\)LO, \({\bar{M}}_{\mathrm{kin}}\) is the spin-averaged kinetic mass of the heavy–heavy vector and pseudoscalar mesons and \({\bar{M}}_{\Upsilon ,\eta _b}\) is the experimental spin average of the \(\Upsilon \) and \(\eta _b\) masses. The average kinetic mass \({\bar{M}}_{\mathrm{kin}}\) is chosen since in the lattice calculation the splitting of the \(\Upsilon \) and \(\eta _b\) states is inverted. In Eq. (66), the bare mass \(m_b\) appearing on the left-hand side is tuned so that the spin-averaged mass agrees with experiment, while the mass \({\overline{m}}_b\) at the fixed scale \(\mu = 4.18\) GeV is extrapolated to the continuum limit using three HISQ (MILC) ensembles with \(a \approx \) 0.15, 0.12 and 0.09 fm and two pion masses, one of which is the physical one. Their final result is \({\overline{m}}_b(\mu = 4.18\, \mathrm {GeV}) = 4.207(26)\) GeV, where the error is from adding systematic uncertainties in quadrature only (statistical errors are smaller than \(0.1 \%\) and ignored). The errors arise from renormalization, perturbation theory, lattice spacing, and NRQCD systematics. The finite-volume uncertainty is not estimated, but at the lowest pion mass they have \( m_\pi L \simeq 4\), which leads to the tag .

In HPQCD 14A the quark mass is computed using a similar strategy as above but with HISQ heavy quarks instead of NRQCD. The gauge field ensembles are the same as in HPQCD 14B above plus the one with \(a = 0.06\) fm (four lattice spacings in all). Since the physical b-quark mass in units of the lattice spacing is always greater than one in these calculations, fits to correlation functions are restricted to \(am_h \le 0.8\), and a high-degree polynomial in \(a m_{\eta _{h}}\), the corresponding pseudoscalar mass, is used in the fits to remove the lattice-spacing errors. Finally, to obtain the physical b-quark mass, the moments are extrapolated to \(m_{\eta _b}\). Bare heavy-quark masses are tuned to their physical values using the \(\eta _h\) mesons, and ratios of ratios yield \(m_h/m_c\). The \(\overline{\mathrm{MS}}\)-charm-quark mass determined as described in Sect. 3.2 then gives \(m_b\). The moment ratios are expanded using the OPE, and the quark masses and \(\alpha _S\) are determined from fits of the lattice ratios to this expansion. The fits are complicated: HPQCD uses cubic splines for valence- and sea-mass dependence, with several knots, and many priors for 21 ratios to fit 29 data points. Taking this fit at face value results in a rating for the continuum limit since they use four lattice spacings down to 0.06 fm. See however the detailed discussion of the continuum limit given in Sect. 9.7 on \(\alpha _S\).

The third four-flavour result [26] is from the ETM collaboration and updates their preliminary result appearing in conference proceedings [196]. The calculation is performed on a set of configurations generated with twisted-Wilson fermions with three lattice spacings in the range 0.06–0.09 fm and with pion masses in the range 210–440 MeV. The b-quark mass is determined from a ratio of heavy–light pseudoscalar meson masses designed to yield the quark pole mass in the static limit. The pole mass is related to the \(\overline{\mathrm{MS}}\) mass through perturbation theory at \(\hbox {N}^3\)LO. The key idea is that by taking ratios of ratios, the b-quark mass is accessible through fits to heavy–light(strange)-meson correlation functions computed on the lattice in the range \(\sim 1\)–\(2\times m_c\) and the static limit, the latter being exactly 1. By simulating below \({\overline{m}}_b\), taking the continuum limit is easier. They find \({\overline{m}}_b({\overline{m}}_b) = 4.26(3)(10)\) GeV, where the first error is statistical and the second systematic. The dominant errors come from setting the lattice scale and fit systematics.

The next new result since FLAG 16 is from Gambino et al. [27]. The authors use twisted-mass-fermion ensembles from the ETM collaboration and the ETM ratio method as in ETM 16. Three values of the lattice spacing are used, ranging from 0.062 to 0.089 fm. Several volumes are also used. The light-quark masses produce pions with masses from 210 to 450 MeV. The main difference with ETM 16 is that the authors use the kinetic mass defined in the heavy-quark expansion (HQE) to extract the b-quark mass instead of the pole mass.

The final b-quark mass result is FNAL/MILC/TUM 18 [8]. The mass is extracted from the same fit and analysis that is described in the charm quark mass section. Note that relativistic HISQ quarks are used (almost) all the way up to the b-quark mass (0.9 \(am_b\)) on the finest two lattices, \(a=0.03\) and 0.042 fm. The authors investigated the effect of leaving out the heaviest points from the fit, and the result did not noticeably change.

All of the above results enter our average. We note that here the updated ETM result is consistent with the average and a stretching factor on the error is not used. The average and its error are dominated by the very precise FNAL/MILC/TUM 18 value:

$$\begin{aligned}&N_f = 2\,+\,1\,+\,1:\quad {\overline{m}}_b({\overline{m}}_b) = 4.198 (12) \quad \mathrm {GeV}\nonumber \\&\quad \,\mathrm {Refs.}~{ [8,16,25{-}27]}. \end{aligned}$$
(67)

Since HPQCD quotes \({\overline{m}}_b({\overline{m}}_b)\) using \(N_f= 5\) running, we used that value in the average. We have included a 100% correlation on the statistical errors of ETM 16 and Gambino 17 since the same ensembles are used in both. This translates to the following RGI average

$$\begin{aligned}&N_f= 2\,+\,1\,+\,1:\quad M_b^{\mathrm{RGI}} = 6.936(20)_m(54)_\Lambda ~ \nonumber \\&\quad \mathrm {GeV}= 6.936(57) ~ \mathrm {GeV}\quad \,\mathrm {Refs.}{ [8,16,25{-}27]}. \end{aligned}$$
(68)

All the results for \({\overline{m}}_b({\overline{m}}_b)\) discussed above are shown in Fig. 7 together with the FLAG averages corresponding to \(N_f=2\,+\,1\) and \(2\,+\,1\,+\,1\) quark flavours.

Fig. 7 The b-quark mass, \(N_f =2\,+\,1\) and \(2\,+\,1\,+\,1\). The updated PDG value from Ref. [137] is reported for comparison

4 Leptonic and semileptonic kaon and pion decay and \(|V_{ud}|\) and \(|V_{us}|\)

Authors: T. Kaneko, J. N. Simone, S. Simula

This section summarizes state-of-the-art lattice calculations of the leptonic kaon and pion decay constants and the kaon semileptonic-decay form factor and provides an analysis in view of the Standard Model. With respect to the previous edition of the FLAG review [3] the data in this section has been updated. As in Ref. [3], when combining lattice data with experimental results, we take into account the strong SU(2) isospin correction, either obtained in lattice calculations or estimated by using chiral perturbation theory (\(\chi \)PT), both for the kaon leptonic decay constant \(f_{K^\pm }\) and for the ratio \(f_{K^\pm } / f_{\pi ^\pm }\).

4.1 Experimental information concerning \(|V_{ud}|\), \(|V_{us}|\), \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\)

The following review relies on the fact that precision experimental data on kaon decays very accurately determine the product \(|V_{us}|f_+(0)\) [200] and the ratio \(|V_{us}/V_{ud}|f_{K^\pm }/f_{\pi ^\pm }\) [200, 201]:

$$\begin{aligned} |V_{us}| f_+(0) = 0.2165(4)\,,\quad \left| \frac{V_{us}}{V_{ud}}\right| \frac{ f_{K^\pm }}{ f_{\pi ^\pm }} \; =0.2760(4)\,.\nonumber \\ \end{aligned}$$
(69)

Here and in the following, \(f_{K^\pm }\) and \(f_{\pi ^\pm }\) are the isospin-broken decay constants, respectively, in QCD. We will refer to the decay constants in the SU(2) isospin-symmetric limit as \(f_K\) and \(f_\pi \) (the latter at leading order in the mass difference (\(m_u - m_d\)) coincides with \(f_{\pi ^\pm }\)). The parameters \(|V_{ud}|\) and \(|V_{us}|\) are elements of the Cabibbo-Kobayashi-Maskawa matrix and \(f_+(q^2)\) represents one of the form factors relevant for the semileptonic decay \(K^0\rightarrow \pi ^-\ell \,\nu \), which depends on the momentum transfer q between the two mesons. What matters here is the value at \(q^2 = 0\), \(f_+(0)\). The pion and kaon decay constants are defined by\(^{17}\)

In this normalization, \(f_{\pi ^\pm } \simeq 130\) MeV, \(f_{K^\pm }\simeq 155\) MeV.

In Eq. (69), the electromagnetic effects have already been subtracted in the experimental analysis using \(\chi \)PT. Recently, a new method [206] has been proposed for calculating the leptonic decay rates of hadrons including both QCD and QED on the lattice, and successfully applied to the case of the ratio of the leptonic decay rates of kaons and pions [207]. The correction to the tree-level \(K_{\mu 2} / \pi _{\mu 2}\) decay rate, including both electromagnetic and strong isospin-breaking effects, is found to be equal to \(-1.22 (16) \%\), to be compared to the estimate \(-1.12 (21) \%\) based on \(\chi \)PT [133, 208]. Using the experimental values of the \(K_{\mu 2} \) and \(\pi _{\mu 2}\) decay rates, the result of Ref. [207] implies

$$\begin{aligned} \left| \frac{V_{us}}{V_{ud}}\right| \frac{f_K}{f_\pi } = 0.27673 \, (29)_{\mathrm{exp}} \, (23)_{\mathrm{th}} \, [37] ~ , \end{aligned}$$
(70)

where the last error in brackets is the sum in quadrature of the experimental and theoretical uncertainties, and the ratio of the decay constants is the one corresponding to isosymmetric QCD. The single calculation of Ref. [207] is clearly not ready for averaging, but it demonstrates that the determination of \(V_{us} / V_{ud}\) using only lattice-QCD+QED and the ratio of the experimental values of the \(K_{\mu 2} \) and \(\pi _{\mu 2}\) decay rates is feasible with good accuracy.
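The quadrature sum quoted in Eq. (70) can be checked with a minimal Python sketch. The helper name is hypothetical, and the errors are entered in units of the last quoted digit.

```python
import math


def combine_in_quadrature(*errors):
    """Total uncertainty from independent error components."""
    return math.sqrt(sum(e**2 for e in errors))


# Eq. (70): experimental and theoretical errors in units of 1e-5.
total = combine_in_quadrature(29, 23)
print(round(total))  # 37
```

The same arithmetic applied to the two error components of Eq. (58), 9 and 14 in units of \(10^{-3}\) GeV, gives the combined error of 17 quoted there.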

The measurement of \(|V_{ud}|\) based on superallowed nuclear \(\beta \) transitions has now become remarkably precise. The result of the update of Hardy and Towner [209], which is based on 20 different superallowed transitions, reads\(^{18}\)

$$\begin{aligned} |V_{ud}| = 0.97420(21)\,.\end{aligned}$$
(71)

The matrix element \(|V_{us}|\) can be determined from semi-inclusive \(\tau \) decays [217,218,219,220]. By separating the inclusive decay \(\tau \rightarrow \text{ hadrons }+\nu \) into nonstrange and strange final states, e.g., HFLAV 16 [221] obtains \(|V_{us}|=0.2186(21)\) and both Maltman et al. [219, 222, 223] and Gamiz et al. [224, 225] arrive at very similar values. Inclusive hadronic \(\tau \) decay offers an interesting way to measure \(|V_{us}|\), but the above value of \(|V_{us}|\) differs from the result one obtains from assuming three-flavour SM-unitarity by more than three standard deviations [221]. This apparent tension has recently been resolved in Ref. [226] thanks to the use of a different experimental input and to a new treatment of higher orders in the operator product expansion and of violations of quark-hadron duality. A much larger value of \(|V_{us}|\) is obtained, namely,

$$\begin{aligned} |V_{us}| = 0.2231 (27)_{\mathrm{exp}} (4)_{\mathrm{th}} ~ , \end{aligned}$$
(72)

which is in much better agreement with CKM unitarity. Recently, in Ref. [227], a new method, which includes also the lattice calculation of the hadronic vacuum polarization function, has been proposed for the determination of \(|V_{us}|\) from inclusive strange \(\tau \) decays.
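The comparison with unitarity can be made concrete with a short Python sketch that evaluates \(|V_{ud}|^2+|V_{us}|^2\) for the two values of \(|V_{us}|\) quoted above, using \(|V_{ud}|\) from Eq. (71). The sketch neglects \(|V_{ub}|^2\), treats the errors as uncorrelated and Gaussian, and uses hypothetical names. It is only an illustration and not the unitarity analysis of Ref. [221].

```python
import math


def first_row_sum(v_ud, dv_ud, v_us, dv_us):
    """|V_ud|^2 + |V_us|^2 and its error, neglecting |V_ub|^2 (assumption)."""
    total = v_ud**2 + v_us**2
    error = math.hypot(2.0 * v_ud * dv_ud, 2.0 * v_us * dv_us)
    return total, error


V_UD, DV_UD = 0.97420, 0.00021  # Eq. (71)
inputs = {
    "HFLAV 16": (0.2186, 0.0021),
    "Eq. (72)": (0.2231, math.hypot(0.0027, 0.0004)),
}

for label, (v_us, dv_us) in inputs.items():
    total, error = first_row_sum(V_UD, DV_UD, v_us, dv_us)
    print(f"{label}: sum = {total:.4f} +- {error:.4f}, "
          f"deviation from 1 = {(total - 1.0) / error:.1f} sigma")
```

Under these assumptions the HFLAV 16 input leaves the sum about three standard deviations below unity, while the value of Eq. (72) is compatible with unity within one standard deviation.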

Table 13 Colour code for the data on \(f_+(0)\). With respect to the previous edition [3] old results with two red tags have been dropped

The experimental results in Eq. (69) are for the semileptonic decay of a neutral kaon into a negatively charged pion and the charged pion and kaon leptonic decays, respectively, in QCD. In the case of the semileptonic decays the corrections for strong and electromagnetic isospin breaking in chiral perturbation theory at NLO have allowed for averaging the different experimentally measured isospin channels [228]. This is quite a convenient procedure as long as lattice-QCD simulations do not include strong or QED isospin-breaking effects. Several lattice results for \(f_K/f_\pi \) are quoted for QCD with (squared) pion and kaon masses of \(M_\pi ^2=M_{\pi ^0}^2\) and \(M_K^2=\frac{1}{2} \left( M_{K^\pm }^2\,+\,M_{K^0}^2-M_{\pi ^\pm }^2\,+\,M_{\pi ^0}^2\right) \) for which the leading strong and electromagnetic isospin violations cancel. While the modern trend is to include strong and electromagnetic isospin breaking in the lattice simulations (e.g., Refs. [140, 141, 162, 185, 206, 207, 229,230,231]), in this section contact with experimental results is made by correcting leading SU(2) isospin breaking guided either by chiral perturbation theory or by lattice calculations.

Table 14 Colour code for the data on the ratio of decay constants: \(f_K/f_\pi \) is the pure QCD SU(2)-symmetric ratio, while \(f_{K^\pm }/f_{\pi ^\pm }\) is in pure QCD including the SU(2) isospin-breaking correction. With respect to the previous edition [3] old results with two red tags have been dropped

4.2 Lattice results for \(f_+(0)\) and \(f_{K^\pm }/f_{\pi ^\pm }\)

The traditional way of determining \(|V_{us}|\) relies on using estimates for the value of \(f_+(0)\), invoking the Ademollo-Gatto theorem [241]. Since this theorem only holds to leading order of the expansion in powers of \(m_u\), \(m_d\), and \(m_s\), theoretical models are used to estimate the corrections. Lattice methods have now reached the stage where quantities like \(f_+(0)\) or \(f_K/f_\pi \) can be determined to good accuracy. As a consequence, the uncertainties inherent in the theoretical estimates for the higher order effects in the value of \(f_+(0)\) do not represent a limiting factor any more and we shall therefore not invoke those estimates. Also, we will use the experimental results based on nuclear \(\beta \) decay and \(\tau \) decay exclusively for comparison – the main aim of the present review is to assess the information gathered with lattice methods and to use it for testing the consistency of the SM and its potential to provide constraints for its extensions.

The database underlying the present review of the semileptonic form factor and the ratio of decay constants is listed in Tables 13 and 14. The properties of the lattice data play a crucial role for the conclusions to be drawn from these results: range of \(M_\pi \), size of \(L M_\pi \), continuum extrapolation, extrapolation in the quark masses, finite-size effects, etc. The key features of the various data sets are characterized by means of the colour code specified in Sect. 2.1. Note that with respect to the previous edition [3] we have dropped old results with two red tags. More detailed information on individual computations is compiled in Appendix B.2, which in this edition is limited to new results and to those entering the FLAG averages. For other calculations the reader should refer to Appendix B.2 of Ref. [3].

The quantity \(f_+(0)\) represents a matrix element of a strangeness-changing null-plane charge, \(f_+(0)\,{=}\,\langle K|Q^{{\bar{u}}s}|\pi \rangle \) (see Ref. [242]). The vector charges obey the commutation relations of the Lie algebra of SU(3), in particular \([Q^{{\bar{u}}s},Q^{{\bar{s}}u}]=Q^{{\bar{u}}u-{\bar{s}}s}\). This relation implies the sum rule \(\sum _n |\langle K|Q^{{\bar{u}}s}|n \rangle |^2-\sum _n |\langle K|Q^{{\bar{s}}u}|n \rangle |^2=1\). Since the contribution from the one-pion intermediate state to the first sum is given by \(f_+(0)^2\), the relation amounts to an exact representation for this quantity [243]:

$$\begin{aligned} f_+(0)^2=1-\sum _{n\ne \pi } |\langle K|Q^{{\bar{u}}s}|n \rangle |^2\,+\,\sum _n |\langle K |Q^{{\bar{s}}u}|n \rangle |^2\,. \end{aligned}$$
(73)

While the first sum on the right extends over nonstrange intermediate states, the second runs over exotic states with strangeness \(\pm 2\) and is expected to be small compared to the first.

The expansion of \(f_+(0)\) in SU(3) chiral perturbation theory in powers of \(m_u\), \(m_d\), and \(m_s\) starts with \(f_+(0)=1+f_2\,+\,f_4+\ldots \,\) [244]. Since all of the low-energy constants occurring in \(f_2\) can be expressed in terms of \(M_\pi \), \(M_K\), \(M_\eta \) and \(f_\pi \) [242], the NLO correction is known. In the language of the sum rule (73), \(f_2\) stems from nonstrange intermediate states with three mesons. Like all other nonexotic intermediate states, it lowers the value of \(f_+(0)\): \(f_2=-0.023\) when using the experimental value of \(f_\pi \) as input. The corresponding expressions have also been derived in quenched or partially quenched (staggered) chiral perturbation theory [30, 245]. At the same order in the SU(2) expansion [246], \(f_+(0)\) is parameterized in terms of \(M_\pi \) and two a priori unknown parameters. The latter can be determined from the dependence of the lattice results on the masses of the quarks. Note that any calculation that relies on the \(\chi \)PT formula for \(f_2\) is subject to the uncertainties inherent in NLO results: instead of using the physical value of the pion decay constant \(f_\pi \), one may, for instance, work with the constant \(f_0\) that occurs in the effective Lagrangian and represents the value of \(f_\pi \) in the chiral limit. Although trading \(f_\pi \) for \(f_0\) in the expression for the NLO term affects the result only at NNLO, it may make a significant numerical difference in calculations where the latter are not explicitly accounted for. (Lattice results concerning the value of the ratio \(f_\pi /f_0\) are reviewed in Sect. 5.3.)

Fig. 8

Comparison of lattice results (squares) for \(f_+(0)\) and \(f_{K^\pm }/ f_{\pi ^\pm }\) with various model estimates based on \(\chi \)PT (blue circles). The ratio \(f_{K^\pm }/f_{\pi ^\pm }\) is obtained in pure QCD including the SU(2) isospin-breaking correction (see Sect. 4.3). The black squares and grey bands indicate our estimates. The significance of the colours is explained in Sect. 2

The lattice results shown in the left panel of Fig. 8 indicate that the higher-order contributions \(\Delta f\equiv f_+(0)-1-f_2\) are negative and thus amplify the effect generated by \(f_2\). This confirms the expectation that the exotic contributions are small. The entries in the lower part of the left panel represent various model estimates for \(f_4\). In Ref. [251], the symmetry-breaking effects are estimated in the framework of the quark model. The more recent calculations are more sophisticated, as they make use of the known explicit expression for the \(K_{\ell 3}\) form factors to NNLO in \(\chi \)PT [250, 252]. The corresponding formula for \(f_4\) accounts for the chiral logarithms occurring at NNLO and is not subject to the ambiguity mentioned above (see footnote 19). The numerical result, however, depends on the model used to estimate the low-energy constants occurring in \(f_4\) [247,248,249,250]. The figure indicates that the most recent numbers obtained in this way correspond to a positive or an almost vanishing rather than a negative value for \(\Delta f\). We note that FNAL/MILC 12I [30] and Ref. [253] have made an attempt at determining a combination of some of the low-energy constants appearing in \(f_4\) from lattice data.
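As a quick numerical illustration of the sign of \(\Delta f\), the following sketch evaluates \(\Delta f = f_+(0)-1-f_2\) for the three direct averages quoted in Eqs. (74), (75) and (76), using the NLO value \(f_2=-0.023\) given above. The function name `delta_f` is introduced here purely for illustration, and only central values are used, so the uncertainties of the averages are ignored.

```python
# Illustrative sketch: central values only, uncertainties are ignored.
F2_NLO = -0.023  # NLO chiral correction f_2 quoted in the text


def delta_f(f_plus_0, f2=F2_NLO):
    """Higher-order contribution Delta f = f_+(0) - 1 - f_2."""
    return f_plus_0 - 1.0 - f2


# Direct lattice averages for f_+(0), Eqs. (74)-(76)
averages = {"Nf=2+1+1": 0.9706, "Nf=2+1": 0.9677, "Nf=2": 0.9560}

for label, value in averages.items():
    print(f"{label}: Delta f = {delta_f(value):+.4f}")
```

All three values come out negative, in line with the statement that the higher-order contributions amplify the effect of \(f_2\).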

4.3 Direct determination of \(f_+(0)\) and \(f_{K^\pm }/f_{\pi ^\pm }\)

Many lattice results for the form factor \(f_+(0)\) and for the ratio of decay constants, which we summarize here in Tables 13 and 14, respectively, have been computed in isospin-symmetric QCD. The reason for this unphysical parameter choice is that there are only a few simulations of isospin-breaking effects in lattice QCD, which is ultimately the cleanest way for predicting these effects [139,140,141, 148, 185, 206, 207, 231, 254, 255]. In the meantime, one relies either on chiral perturbation theory [166, 244] to estimate the correction to the isospin limit or one calculates the breaking at leading order in \((m_u-m_d)\) in the valence quark sector by extrapolating the lattice data for the charged kaons to the physical value of the up(down)-quark mass (the result for the pion decay constant is always extrapolated to the value of the average light-quark mass \({{\hat{m}}}\)). This defines the prediction for \(f_{K^\pm }/f_{\pi ^\pm }\).

Since the majority of results that qualify for inclusion into the FLAG average include the strong SU(2) isospin-breaking correction, we confirm the choice made in the previous edition of the FLAG review [3] and we provide in Fig. 8 the overview of the world data of \(f_{K^\pm }/f_{\pi ^\pm }\). For all the results of Table 14 provided only in the isospin-symmetric limit we apply individually an isospin correction that will be described later on (see Eqs. (78)–(79)).

The plots in Fig. 8 illustrate our compilation of data for \(f_+(0)\) and \(f_{K^\pm }/f_{\pi ^\pm }\). The lattice data for the latter quantity is largely consistent even when comparing simulations with different \(N_f\), while in the case of \(f_+(0)\) a slight tendency to get higher values for increasing \(N_f\) seems to be visible, even if it does not exceed one standard deviation. We now proceed to form the corresponding averages, separately for the data with \(N_{ f}=2\,+\,1\,+\,1\), \(N_{ f}=2\,+\,1\), and \(N_{ f}=2\) dynamical flavours, and in the following we will refer to these averages as the “direct” determinations.

4.3.1 Results for \(f_+(0)\)

For \(f_+(0)\) there are currently two computational strategies: FNAL/MILC uses the Ward identity to relate the \(K\rightarrow \pi \) form factor at zero momentum transfer to the matrix element \(\langle \pi |S|K\rangle \) of the flavour-changing scalar current \(S = {\bar{s}} u\). Peculiarities of the staggered fermion discretization used by FNAL/MILC (see Ref. [30]) make this the favoured choice. The other collaborations instead compute the vector current matrix element \(\langle \pi | {\bar{s}} \gamma _\mu u |K\rangle \). Apart from FNAL/MILC 13C, FNAL/MILC 13E and RBC/UKQCD 15A all simulations in Table 13 involve unphysically heavy quarks and, therefore, the lattice data needs to be extrapolated to the physical pion and kaon masses corresponding to the \(K^0\rightarrow \pi ^-\) channel. We also note that the recent computations of \(f_+(0)\) obtained by the FNAL/MILC and RBC/UKQCD collaborations make use of partially-twisted boundary conditions to determine the form-factor results directly at the relevant kinematical point \(q^2=0\) [266, 267], avoiding in this way any uncertainty due to the momentum dependence of the vector and/or scalar form factors. The ETM collaboration uses partially-twisted boundary conditions to compare the momentum dependence of the scalar and vector form factors with that of the experimental data [29, 240], while keeping at the same time the advantage of the high-precision determination of the scalar form factor at the kinematical end-point \(q_{max}^2 = (M_K - M_\pi )^2\) [32, 268] for the interpolation to \(q^2 = 0\).

According to the colour codes reported in Table 13 and to the FLAG rules of Sect. 2.2, only the result ETM 09A with \(N_{ f}=2\), the results FNAL/MILC 12I and RBC/UKQCD 15A with \(N_{ f}=2\,+\,1\) and the results FNAL/MILC 13E and ETM 16 with \(N_{ f}=2\,+\,1\,+\,1\) dynamical flavours of fermions, respectively, can enter the FLAG averages.

At \(N_{ f}=2\,+\,1\,+\,1\) the result from the FNAL/MILC collaboration, \(f_+(0) = 0.9704 (24) (22)\) (FNAL/MILC 13E), is based on the use of the Highly Improved Staggered Quark (HISQ) action (for both valence and sea quarks), which has been tailored to reduce staggered taste-breaking effects, and includes simulations with three lattice spacings and physical light-quark masses. These features make it possible to keep the uncertainties due to the chiral extrapolation and to the discretization artifacts well below the statistical error. The largest remaining systematic uncertainty comes from finite-size effects, which have been investigated in Ref. [269] using 1-loop \(\chi \)PT (with and without taste-violating effects). Recently [232] the FNAL/MILC collaboration presented a more precise determination of \(f_+(0)\), \(f_+(0) = 0.9696 (15) (11)\) (see the entry FNAL/MILC 18 in Table 13), in which the improvement of the precision with respect to FNAL/MILC 13E is obtained mainly by using an estimate of finite-size effects based on \(\chi \)PT only. We do not consider FNAL/MILC 18 as a plain update of FNAL/MILC 13E.

The new result from the ETM collaboration, \(f_+(0) = 0.9709 (45) (9)\) (ETM 16), makes use of the twisted-mass discretization adopting three values of the lattice spacing in the range \(0.06{-}0.09\) fm and pion masses simulated in the range \(210{-}450\) MeV. The chiral and continuum extrapolations are performed in a combined fit together with the momentum dependence, using both a SU(2)-\(\chi \)PT inspired ansatz (following Ref. [240]) and a modified z-expansion fit. The uncertainties coming from the chiral extrapolation, the continuum extrapolation and the finite-volume effects turn out to be well below the dominant statistical error, which includes also the error due to the fitting procedure. A set of synthetic data points, representing both the vector and the scalar semileptonic form factors at the physical point for several selected values of \(q^2\), is provided together with the corresponding correlation matrix.

At \(N_{ f}=2\,+\,1\) there is a new result from the JLQCD collaboration [234], which however does not satisfy all FLAG criteria for entering the average. The two results eligible to enter the FLAG average at \(N_{ f}=2\,+\,1\) are the one from RBC/UKQCD 15A, \(f_+(0) = 0.9685 (34) (14)\) [31], and the one from FNAL/MILC 12I, \(f_+(0)=0.9667(23)(33)\) [30]. These results, based on different fermion discretizations (staggered fermions in the case of FNAL/MILC and domain wall fermions in the case of RBC/UKQCD) are in nice agreement. Moreover, in the case of FNAL/MILC the form factor has been determined from the scalar current matrix element, while in the case of RBC/UKQCD it has been determined including also the matrix element of the vector current. To a certain extent both simulations are expected to be affected by different systematic effects.

RBC/UKQCD 15A has analyzed results on ensembles with pion masses down to 140 MeV, mapping out the complete range from the SU(3)-symmetric limit to the physical point. No significant cut-off effects (results for two lattice spacings) were observed in the simulation results. Ensembles with unphysical light-quark masses are weighted to work as a guide for small corrections toward the physical point, reducing in this way the model dependence in the fitting ansatz. The systematic uncertainty turns out to be dominated by finite-volume effects, for which an estimate based on effective theory arguments is provided.

The result FNAL/MILC 12I is from simulations reaching down to a lightest RMS pion mass of about 380 MeV (the lightest valence pion mass for one of their ensembles is about 260 MeV). Their combined chiral and continuum extrapolation (results for two lattice spacings) is based on NLO staggered chiral perturbation theory supplemented by the continuum NNLO expression [250] and a phenomenological parameterization of the breaking of the Ademollo-Gatto theorem at finite lattice spacing inherent in their approach. The \(p^4\) low-energy constants entering the NNLO expression have been fixed in terms of external input [270].

The ETM collaboration uses the twisted-mass discretization and provides at \(N_{ f}=2\) a comprehensive study of the systematics [32, 240], by presenting results for four lattice spacings and by simulating at light pion masses (down to \(M_\pi = 260\) MeV). This makes it possible to constrain the chiral extrapolation, using both SU(3) [242] and SU(2) [246] chiral perturbation theory. Moreover, a rough estimate for the size of the effects due to quenching the strange quark is given, based on the comparison of the result for \(N_{ f}=2\) dynamical quark flavours [40] with the one in the quenched approximation, obtained earlier by the SPQcdR collaboration [268].

We now compute the \(N_f = 2\,+\,1\,+\,1\) FLAG average for \(f_+(0)\) using the FNAL/MILC 13E and ETM 16 results, which we treat as uncorrelated, and the \(N_f =2\,+\,1\) FLAG average based on FNAL/MILC 12I and RBC/UKQCD 15A, which we also consider uncorrelated. For \(N_f = 2\) we adopt the ETM 09A result directly:

$$\begin{aligned}&\text{ direct },\,N_{ f}=2\,+\,1\,+\,1:\quad f_+(0) = 0.9706(27)\quad \,\mathrm {Refs.}~ \text{[28,29] }, \end{aligned}$$
(74)
$$\begin{aligned}&\text{ direct },\,N_{ f}=2\,+\,1: \quad f_+(0) = 0.9677(27) \quad \,\mathrm {Refs.}~\text{[30,31] }, \end{aligned}$$
(75)
$$\begin{aligned}&\text{ direct },\,N_{ f}=2: \quad f_+(0) = 0.9560(57)(62)\quad \,\mathrm {Ref.}~\text{[32] }, \end{aligned}$$
(76)

where the brackets in the third line indicate the statistical and systematic errors, respectively. We stress that the results (74) and (75), corresponding to \(N_f = 2\,+\,1\,+\,1\) and \(N_f = 2\,+\,1\), respectively, include already simulations with physical light-quark masses.
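The averages in Eqs. (74) and (75) can be cross-checked with a short sketch. It assumes a plain inverse-variance weighted mean of uncorrelated results, with the statistical and systematic errors of each input first combined in quadrature. The full FLAG averaging prescription (Sect. 2) may contain further steps, such as a rescaling of the error for poor fits, which are not modelled here. The helper names `total_error` and `weighted_average` are illustrative.

```python
import math


def total_error(*errors):
    """Combine independent error components in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


def weighted_average(results):
    """Inverse-variance weighted mean of uncorrelated (value, error) pairs."""
    weights = [1.0 / err ** 2 for _, err in results]
    mean = sum(w * val for w, (val, _) in zip(weights, results)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


# N_f = 2+1+1: FNAL/MILC 13E and ETM 16
avg_2111 = weighted_average([
    (0.9704, total_error(0.0024, 0.0022)),
    (0.9709, total_error(0.0045, 0.0009)),
])

# N_f = 2+1: RBC/UKQCD 15A and FNAL/MILC 12I
avg_21 = weighted_average([
    (0.9685, total_error(0.0034, 0.0014)),
    (0.9667, total_error(0.0023, 0.0033)),
])

print("Nf=2+1+1: f_+(0) = %.4f +/- %.4f" % avg_2111)
print("Nf=2+1:   f_+(0) = %.4f +/- %.4f" % avg_21)
```

Under these assumptions the rounded output agrees with the quoted values 0.9706(27) and 0.9677(27).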

4.3.2 Results for \(f_{K^\pm }/f_{\pi ^\pm }\)

In the case of the ratio of decay constants the data sets that meet the criteria formulated in the introduction are HPQCD 13A [33], ETM 14E [34] and FNAL/MILC 17 [5] (which updates FNAL/MILC 14A [18]) with \(N_f=2\,+\,1\,+\,1\), HPQCD/UKQCD 07 [35], MILC 10 [36], BMW 10 [37], RBC/UKQCD 14B [10], Dürr 16 [38, 260] and QCDSF/UKQCD 16 [39] with \(N_{ f}=2\,+\,1\) and ETM 09 [40] with \(N_{ f}=2\) dynamical flavours.

ETM 14E uses the twisted-mass discretization and provides a comprehensive study of the systematics by presenting results for three lattice spacings in the range \(0.06 - 0.09\) fm and for pion masses in the range \(210 - 450\) MeV. This makes it possible to constrain the chiral extrapolation, using both SU(2) [246] chiral perturbation theory and polynomial fits. The ETM collaboration always includes the spread in the central values obtained from different ansätze into the systematic errors. The final result of their analysis is \({f_{K^\pm }}/{f_{\pi ^\pm }}= 1.184(12)_{\mathrm{stat+fit}}(3)_{\mathrm{Chiral}}(9)_{\mathrm{a}^2}(1)_{Z_P}(3)_{FV}(3)_{IB}\) where the errors are (statistical + the error due to the fitting procedure), due to the chiral extrapolation, the continuum extrapolation, the mass-renormalization constant, the finite-volume and (strong) isospin-breaking effects.

FNAL/MILC 17 [5] has determined the ratio of the decay constants from a comprehensive set of HISQ ensembles with \(N_f = 2\,+\,1\,+\,1\) dynamical flavours. They have generated 24 ensembles for six values of the lattice spacing (\(0.03 - 0.15\) fm, scale set with \(f_{\pi ^+}\)) and with both physical and unphysical values of the light sea-quark masses, controlling in this way the systematic uncertainties due to chiral and continuum extrapolations. With respect to FNAL/MILC 14A they have increased the statistics and added three ensembles at very fine lattice spacings, \(a \simeq 0.03\) and 0.042 fm, including for the latter case also a simulation at the physical value of the light-quark mass. The final result of their analysis is \({f_{K^\pm }}/{f_{\pi ^\pm }}=1.1950(14)_{\mathrm{stat}}(_{-17}^{+0})_{\mathrm{a}^2} (2)_{FV} (3)_{f_\pi , PDG} (3)_{EM} (2)_{Q^2}\), where the errors are statistical, due to the continuum extrapolation, finite-volume, pion decay constant from PDG, electromagnetic effects and sampling of the topological charge distribution.

HPQCD 13A has analyzed ensembles generated by MILC and therefore its study of \({f_{K^\pm }}/{f_{\pi ^\pm }}\) is based on the same set of ensembles bar the ones at the finest lattice spacings (namely, only \(a = 0.09 - 0.15\) fm, scale set with \(f_{\pi ^+}\) and relative scale set with the Wilson flow [271, 272]) supplemented by some simulation points with heavier quark masses. HPQCD employs a global fit based on continuum NLO SU(3) chiral perturbation theory for the decay constants supplemented by a model for higher-order terms including discretization and finite-volume effects (61 parameters for 39 data points supplemented by Bayesian priors). Their final result is \(f_{K^\pm }/f_{\pi ^\pm }=1.1916(15)_{\mathrm{stat}}(12)_{\mathrm{a}^2}(1)_{FV}(10)\), where the errors are statistical, due to the continuum extrapolation, due to finite-volume effects and the last error contains the combined uncertainties from the chiral extrapolation, the scale-setting uncertainty, the experimental input in terms of \(f_{\pi ^+}\) and from the uncertainty in \(m_u/m_d\).

In the two previous editions of the FLAG review [2, 3] the error budget of HPQCD 13A was compared with the ones of MILC 13A and FNAL/MILC 14A and discussed in detail. It was pointed out that, despite the overlap in primary lattice data, both collaborations arrive at surprisingly different error budgets, particularly in the cases of the cutoff dependence and of the finite volume effects. The error budget of the latest update FNAL/MILC 17, which has a richer lattice setup with respect to HPQCD 13A, is consistent with the one of HPQCD 13A.

Adding all the uncertainties in quadrature one gets \(f_{K^\pm }/f_{\pi ^\pm } = 1.1916(22)\) (HPQCD 13A) and \({f_{K^\pm }}/{f_{\pi ^\pm }}=1.1944(18)\) (FNAL/MILC 17; see footnote 20). The total errors are very similar and the central values are consistent within approximately one standard deviation. Thus, the HPQCD 13A and FNAL/MILC 17 results are averaged, assuming \(100 \%\) statistical and systematic correlation between them, together with the (uncorrelated) ETM 14E result, obtaining

$$\begin{aligned}&\text{ direct },\,N_{ f}=2\,+\,1\,+\,1: \quad {f_{K^\pm }}/{f_{\pi ^\pm }}=1.1932(19)\nonumber \\&\quad \,\mathrm {Refs.}~\text{[5,33,34] }. \end{aligned}$$
(77)
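The arithmetic behind Eq. (77) can be sketched as follows. The quadrature sums use the error budgets quoted above for HPQCD 13A and ETM 14E, and the symmetrized FNAL/MILC 17 total is taken directly from the text. For the average we assume a prescription in which the central value is an inverse-variance weighted mean and the error is propagated through a covariance matrix with \(100 \%\) correlation between HPQCD 13A and FNAL/MILC 17. This is our reading of the FLAG procedure described in Sect. 2, not a statement of the exact implementation used for the review.

```python
import math


def quadrature(*errors):
    """Combine independent error components in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


# Error budgets quoted in the text, converted to absolute units
hpqcd_err = quadrature(15, 12, 1, 10) * 1e-4     # HPQCD 13A
etm_err = quadrature(12, 3, 9, 1, 3, 3) * 1e-3   # ETM 14E
fnal_err = 0.0018                                # FNAL/MILC 17 (symmetrized total from the text)

values = [1.1916, 1.1944, 1.184]                 # HPQCD 13A, FNAL/MILC 17, ETM 14E
errors = [hpqcd_err, fnal_err, etm_err]

# Assumed correlations: first two results fully correlated, ETM 14E independent
corr = [[1.0, 1.0, 0.0],
        [1.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]

weights = [1.0 / e ** 2 for e in errors]
norm = sum(weights)
mean = sum(w * v for w, v in zip(weights, values)) / norm

# Variance of the weighted mean, including the off-diagonal covariance terms
variance = sum(
    weights[i] * weights[j] * corr[i][j] * errors[i] * errors[j]
    for i in range(3) for j in range(3)
) / norm ** 2
sigma = math.sqrt(variance)

print(f"HPQCD 13A total error: {hpqcd_err:.4f}")
print(f"ETM 14E total error:   {etm_err:.3f}")
print(f"average: {mean:.4f} +/- {sigma:.4f}")
```

The correlation increases the error relative to a naive uncorrelated average. With the inputs above the result rounds to the value given in Eq. (77).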

For \(N_f=2\,+\,1\) the result Dürr 16 [38, 260] is now eligible to enter the FLAG average as well as the new result [39] from the QCDSF collaboration. Dürr 16 [38, 260] has analyzed the decay constants evaluated for 47 gauge ensembles generated using tree-level clover-improved fermions with two HEX-smearings and the tree-level Symanzik-improved gauge action. The ensembles correspond to five values of the lattice spacing (\(0.05{-}0.12\) fm, scale set by \(\Omega \) mass), to pion masses in the range \(130{-}680\) MeV and to values of the lattice size from 1.7 to 5.6 fm, obtaining a good control over the interpolation to the physical mass point and the extrapolation to the continuum and infinite volume limits.

QCDSF/UKQCD 16 [39] has used the nonperturbatively \(\mathcal{{O}}(a)\)-improved clover action for the fermions (mildly stout-smeared) and the tree-level Symanzik action for the gluons. Four values of the lattice spacing (\(0.06{-}0.08\) fm) have been simulated with pion masses down to \(\sim 220\) MeV and values of the lattice size in the range \(2.0{-}2.8\) fm. The decay constants are evaluated using an expansion around the symmetric SU(3) point \(m_u = m_d = m_s = (m_u + m_d + m_s)^{phys}/3\).

Note that for \(N_f=2\,+\,1\) MILC 10 and HPQCD/UKQCD 07 are based on staggered fermions, BMW 10, Dürr 16 and QCDSF/UKQCD 16 have used improved Wilson fermions and RBC/UKQCD 14B’s result is based on the domain-wall formulation. In contrast to RBC/UKQCD 14B and Dürr 16 the other simulations are for unphysical values of the light-quark masses (corresponding to smallest pion masses in the range \(220 - 260\) MeV in the case of MILC 10, HPQCD/UKQCD 07 and QCDSF/UKQCD 16) and therefore slightly more sophisticated extrapolations needed to be controlled. Various ansätze for the mass and cutoff dependence comprising SU(2) and SU(3) chiral perturbation theory or simply polynomials were used and compared in order to estimate the model dependence. While BMW 10, RBC/UKQCD 14B and QCDSF/UKQCD 16 are entirely independent computations, subsets of the MILC gauge ensembles used by MILC 10 and HPQCD/UKQCD 07 are the same. MILC 10 is certainly based on a larger and more advanced set of gauge configurations than HPQCD/UKQCD 07. This allows for a more reliable estimation of systematic effects. In this situation we consider both the statistical and systematic uncertainties of these two results to be correlated.

For \(N_f=2\) no new result enters the FLAG average with respect to the previous edition of the FLAG review [3]. The average therefore remains the ETM 09 result, obtained from simulations with twisted-mass fermions down to (charged) pion masses of 260 MeV.

We note that the overall uncertainties quoted by ETM 14E at \(N_{ f}=2\,+\,1\,+\,1\) and by Dürr 16 and QCDSF/UKQCD 16 at \(N_{ f}=2\,+\,1\) are much larger than the overall uncertainties obtained with staggered (HPQCD 13A, FNAL/MILC 17 at \(N_{ f}=2\,+\,1\,+\,1\) and MILC 10, HPQCD/UKQCD 07 at \(N_{ f}=2\,+\,1\)) and domain-wall fermions (RBC/UKQCD 14B at \(N_{ f}=2\,+\,1\)).

Before determining the average for \(f_{K^\pm }/f_{\pi ^\pm }\), which should be used for applications to Standard Model phenomenology, we apply the strong isospin correction individually to all those results that have been published only in the isospin-symmetric limit, i.e., BMW 10, HPQCD/UKQCD 07 and RBC/UKQCD 14B at \(N_f = 2\,+\,1\) and ETM 09 at \(N_f = 2\). To this end, as in the previous edition of the FLAG reviews [2, 3], we make use of NLO SU(3) chiral perturbation theory [208, 244], which predicts

$$\begin{aligned} \frac{f_{K^\pm }}{f_{\pi ^\pm }}= \frac{f_K}{f_\pi } ~ \sqrt{1 + \delta _{SU(2)}} ~ , \end{aligned}$$
(78)

where [208]

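$$\begin{aligned} \delta _{SU(2)} \approx \sqrt{3} ~ \epsilon _{SU(2)} \left[ -\frac{4}{3} \left( f_K/f_\pi - 1 \right) + \frac{2}{3 (4 \pi )^2 f_0^2} \left( M_K^2 - M_\pi ^2 - M_\pi ^2 \ln \frac{M_K^2}{M_\pi ^2} \right) \right] ~ . \end{aligned}$$
(79)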

We use as input \(\epsilon _{SU(2)} = \sqrt{3} / (4 R)\) with the FLAG result for R of Eq. (54), \(F_0 = f_0 / \sqrt{2} = 80\,(20)\) MeV, \(M_\pi = 135\) MeV and \(M_K = 495\) MeV (we decided to choose a conservative uncertainty on \(f_0\) in order to reflect the magnitude of potential higher-order corrections). The results are reported in Table 15, where in the last column the last error is due to the isospin correction (the remaining errors are quoted in the same order as in the original data).
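To give a feeling for the size of this correction, the sketch below evaluates an NLO SU(3) estimate of \(\delta _{SU(2)}\) with the inputs listed above. Several ingredients are assumptions rather than material from this section: the explicit form of the NLO expression coded in `delta_su2` is the one we take from Ref. [208], the value \(R = 38.1\) is only an illustrative stand-in for the FLAG result of Eq. (54), and \(f_K/f_\pi = 1.19\) is a representative placeholder for the individual lattice inputs of Table 15.

```python
import math


def delta_su2(fk_over_fpi, R, F0=80.0, m_pi=135.0, m_k=495.0):
    """NLO SU(3) ChPT estimate of delta_SU(2) (assumed form of the expression).

    delta = sqrt(3) * eps * [ -4/3 (fK/fpi - 1)
            + 2 / (3 (4 pi)^2 f0^2) * (MK^2 - Mpi^2 - Mpi^2 ln(MK^2 / Mpi^2)) ]
    Masses and F0 are in MeV.
    """
    eps = math.sqrt(3.0) / (4.0 * R)   # epsilon_SU(2) = sqrt(3) / (4 R)
    f0_sq = 2.0 * F0 ** 2              # f_0 = sqrt(2) * F_0
    mass_term = m_k ** 2 - m_pi ** 2 - m_pi ** 2 * math.log(m_k ** 2 / m_pi ** 2)
    bracket = (-4.0 / 3.0) * (fk_over_fpi - 1.0) \
        + 2.0 / (3.0 * (4.0 * math.pi) ** 2 * f0_sq) * mass_term
    return math.sqrt(3.0) * eps * bracket


R_ASSUMED = 38.1       # illustrative placeholder for the result of Eq. (54)
RATIO_ASSUMED = 1.19   # illustrative placeholder for f_K/f_pi

central = delta_su2(RATIO_ASSUMED, R_ASSUMED)
# Vary F0 within its conservative uncertainty of 20 MeV
low_f0 = delta_su2(RATIO_ASSUMED, R_ASSUMED, F0=60.0)
high_f0 = delta_su2(RATIO_ASSUMED, R_ASSUMED, F0=100.0)

print(f"delta_SU(2) = {central:.4f} (F0 = 80 MeV)")
print(f"range from F0 = 60..100 MeV: {low_f0:.4f} .. {high_f0:.4f}")
```

With these placeholder inputs the estimate is of order \(-0.004\), with a spread of roughly \(0.001\) from the variation of \(F_0\).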

Table 15 Values of the SU(2) isospin-breaking correction \(\delta _{SU(2)}\) applied to the lattice data for \(f_K/f_\pi \), entering the FLAG average at \(N_f=2\,+\,1\), for obtaining the corrected charged ratio \(f_{K^\pm }/f_{\pi ^\pm }\)

Fig. 9

The plot compares the information for \(|V_{ud}|\), \(|V_{us}|\) obtained on the lattice for \(N_f = 2\,+\,1\) and \(N_f = 2\,+\,1\,+\,1\) with the experimental result extracted from nuclear \(\beta \) transitions. The dotted line indicates the correlation between \(|V_{ud}|\) and \(|V_{us}|\) that follows if the CKM-matrix is unitary. For the \(N_f = 2\) results see the previous FLAG edition [3]

For \(N_f=2\) and \(N_f=2\,+\,1\,+\,1\) dedicated studies of the strong-isospin correction in lattice QCD do exist. The updated \(N_f=2\) result of the RM123 collaboration [141] amounts to \(\delta _{SU(2)}=-0.0080(4)\) and we use this result for the isospin correction of the ETM 09 result. Note that the above RM123 value for the strong-isospin correction is incompatible with the results based on SU(3) chiral perturbation theory, \(\delta _{SU(2)}=-0.004(1)\) (see Table 15). Moreover, for \(N_f=2\,+\,1\,+\,1\) HPQCD [33], FNAL/MILC [5] and ETM [273] estimate a value for \(\delta _{SU(2)}\) equal to \(-0.0054(14)\), \(-0.0052(9)\) and \(-0.0073(6)\), respectively. Note that the RM123 and ETM results are obtained using the insertion of the isovector scalar current according to the expansion method of Ref. [140], while the HPQCD and FNAL/MILC results correspond to the difference between the values of the decay constant ratio extrapolated to the physical u-quark mass \(m_u\) and to the average \((m_u + m_d) / 2\) light-quark mass.

One would not expect the strange and heavier sea-quark contributions to be responsible for such a large effect. Whether higher-order effects in chiral perturbation theory or other sources are responsible still needs to be understood. More lattice-QCD simulations of SU(2) isospin-breaking effects are therefore required. To remain on the conservative side we add a \(100 \%\) error to the correction based on SU(3) chiral perturbation theory. For further analyses we add (in quadrature) such an uncertainty to the systematic error.
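The effect of the different values of \(\delta _{SU(2)}\) on the charged ratio can be illustrated with Eq. (78). In the sketch below the isospin-symmetric input `RATIO_ISO` is a hypothetical number chosen only for illustration. The corrections are those quoted above, with the \(\chi \)PT value carrying the conservative \(100 \%\) error. The uncertainty on \(\delta _{SU(2)}\) is propagated linearly, which is our simplification.

```python
import math


def charged_ratio(ratio_iso, delta, delta_err):
    """Apply Eq. (78): f_K+/f_pi+ = (f_K/f_pi) * sqrt(1 + delta_SU(2))."""
    factor = math.sqrt(1.0 + delta)
    corrected = ratio_iso * factor
    # Linear error propagation: d/d(delta) sqrt(1 + delta) = 1 / (2 sqrt(1 + delta))
    err_from_delta = ratio_iso * 0.5 * delta_err / factor
    return corrected, err_from_delta


RATIO_ISO = 1.200  # hypothetical isospin-symmetric f_K/f_pi, for illustration only

corrections = {
    "SU(3) ChPT with 100% error": (-0.004, 0.004),
    "RM123 (Nf=2)": (-0.0080, 0.0004),
    "HPQCD (Nf=2+1+1)": (-0.0054, 0.0014),
    "FNAL/MILC (Nf=2+1+1)": (-0.0052, 0.0009),
    "ETM (Nf=2+1+1)": (-0.0073, 0.0006),
}

for label, (delta, delta_err) in corrections.items():
    value, err = charged_ratio(RATIO_ISO, delta, delta_err)
    print(f"{label}: {value:.4f} +/- {err:.4f} (isospin correction only)")
```

The spread between the \(\chi \)PT-based and the lattice-based corrections shifts the ratio at the level of a few times \(10^{-3}\), which is comparable to the total errors of the most precise results.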

Using the results of Table 15 for \(N_f = 2\,+\,1\) we obtain

$$\begin{aligned}&\text{ direct },\,N_{ f}=2\,+\,1\,+\,1:\quad f_{K^\pm } / f_{\pi ^\pm } = 1.1932(19)\nonumber \\&\quad \,\mathrm {Refs.}~\text{[5,33,34] }, \end{aligned}$$
(80)
$$\begin{aligned}&\text{ direct },\,N_{ f}=2\,+\,1: \quad f_{K^\pm } / f_{\pi ^\pm } = 1.1917(37)\nonumber \\&\quad \,\mathrm {Refs.}~{ [10,35{-}39]}, \end{aligned}$$
(81)
$$\begin{aligned}&\text{ direct },\,N_{ f}=2: \quad f_{K^\pm } / f_{\pi ^\pm } = 1.205(18)\nonumber \\&\quad \,\mathrm {Ref.}~\text{[40] }, \end{aligned}$$
(82)

for QCD with broken isospin.

The averages obtained for \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) at \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\) [see Eqs. (74), (75), (80) and (81)] exhibit a precision of about \(0.3 \%\) or better. At such a level of precision QED effects cannot be ignored and a consistent lattice treatment of both QED and QCD effects in leptonic and semileptonic decays becomes mandatory.

4.3.3 Extraction of \(|V_{ud}|\) and \(|V_{us}|\)

It is instructive to convert the averages for \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) into a corresponding range for the CKM matrix elements \(|V_{ud}|\) and \(|V_{us}|\), using the relations (69). Consider first the results for \(N_{ f}=2\,+\,1\,+\,1\). The range for \(f_+(0)\) in Eq. (74) is mapped into the interval \(|V_{us}|=0.2231(7)\), depicted as a horizontal red band in Fig. 9, while the one for \({f_{K^\pm }}/{f_{\pi ^\pm }}\) in Eq. (80) is converted into \(|V_{us}|/|V_{ud}|= 0.2313(5)\), shown as a tilted red band. The red ellipse is the intersection of these two bands and represents the 68% likelihood contour (see footnote 21), obtained by treating the above two results as independent measurements. Repeating the exercise for \(N_{ f}=2\,+\,1\) leads to the green ellipse. The plot indicates a slight tension of both the \(N_f=2\,+\,1\,+\,1\) and \(N_f=2\,+\,1\) results with the one from nuclear \(\beta \) decay.
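The conversion for \(N_f=2\,+\,1\,+\,1\) can be sketched as a simple ratio with linear error propagation. The experimental products \(|V_{us}| f_+(0) = 0.2165(4)\) and \(|V_{us}/V_{ud}|\, f_{K^\pm }/f_{\pi ^\pm } = 0.2760(4)\) used below are assumed inputs: they are meant to stand for the relations (69), which are not reproduced in this section, and should be checked against that equation. The helper name `ratio_with_error` is illustrative.

```python
import math


def ratio_with_error(num, num_err, den, den_err):
    """Return num/den with relative errors added in quadrature."""
    value = num / den
    rel_err = math.hypot(num_err / num, den_err / den)
    return value, value * rel_err


# Assumed experimental inputs standing in for Eq. (69)
VUS_FPLUS = (0.2165, 0.0004)        # |V_us| f_+(0)
VUS_VUD_FRATIO = (0.2760, 0.0004)   # |V_us/V_ud| f_K+/f_pi+

# Lattice averages for N_f = 2+1+1
f_plus = (0.9706, 0.0027)           # Eq. (74)
f_ratio = (1.1932, 0.0019)          # Eq. (80)

v_us = ratio_with_error(*VUS_FPLUS, *f_plus)
v_us_over_v_ud = ratio_with_error(*VUS_VUD_FRATIO, *f_ratio)

print("|V_us|        = %.4f +/- %.4f" % v_us)
print("|V_us|/|V_ud| = %.4f +/- %.4f" % v_us_over_v_ud)
```

With these inputs the two results round to the intervals quoted in the text, and in both cases the lattice uncertainty is the larger contribution to the error.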

4.4 Tests of the Standard Model

In the Standard Model, the CKM matrix is unitary. In particular, the elements of the first row obey

$$\begin{aligned} |V_u|^2\equiv |V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = 1\,.\end{aligned}$$
(83)

The tiny contribution from \(|V_{ub}|\) is known much better than needed in the present context: \(|V_{ub}|= 3.94 (36) \cdot 10^{-3}\) [201]. In the following, we first discuss the evidence for the validity of the relation (83) and only then use it to analyse the lattice data within the Standard Model.

Fig. 10

Results for \(|V_{us}|\) and \(|V_{ud}|\) that follow from the lattice data for \(f_+(0)\) (triangles) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) (squares), on the basis of the assumption that the CKM matrix is unitary. The black squares and the grey bands represent our estimates, obtained by combining these two different ways of measuring \(|V_{us}|\) and \(|V_{ud}|\) on a lattice. For comparison, the figure also indicates the results obtained if the data on nuclear \(\beta \) decay and \(\tau \) decay is analysed within the Standard Model

In Fig. 9, the correlation between \(|V_{ud}|\) and \(|V_{us}|\) imposed by the unitarity of the CKM matrix is indicated by a dotted line (more precisely, in view of the uncertainty in \(|V_{ub}|\), the correlation corresponds to a band of finite width, but the effect is too small to be seen here). The plot shows that there is a slight tension with unitarity in the data for \(N_f = 2 + 1 + 1\): numerically, the outcome for the sum of the squares of the first row of the CKM matrix reads \(|V_u|^2 = 0.9797(74)\), which deviates from unity at the level of \(\simeq 2.7\) standard deviations. Still, it is fair to say that at this level the Standard Model passes a nontrivial test that exclusively involves lattice data and well-established kaon decay branching ratios. Combining the lattice results for \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) in Eqs. (74) and (80) with the \(\beta \) decay value of \(|V_{ud}|\) quoted in Eq. (71), the test sharpens considerably: the lattice result for \(f_+(0)\) leads to \(|V_u|^2 = 0.99884(53)\), which highlights again a \(\simeq 2.2\sigma \)-tension with unitarity, while the one for \({f_{K^\pm }}/{f_{\pi ^\pm }}\) implies \(|V_u|^2 = 0.99986(46)\), confirming the first-row CKM unitarity below the permille level (see footnote 22). Note that the largest contribution to the uncertainty on \(|V_u|^2\) comes from the error on \(|V_{ud}|\) given in Eq. (71).

Table 16 Values of \(|V_{us}|\) and \(|V_{ud}|\) obtained from the lattice determinations of either \(f_+(0)\) or \({f_{K^\pm }}/{f_{\pi ^\pm }}\) assuming CKM unitarity. The first (second) number in brackets represents the statistical (systematic) error

The situation is similar for \(N_{ f}=2\,+\,1\): with the lattice data alone one has \(|V_u|^2 = 0.9832(89)\), which deviates from unity at the level of \(\simeq 1.9\) standard deviations. Combining the lattice results for \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) in Eqs. (75) and (81) with the \(\beta \) decay value of \(|V_{ud}|\), the test sharpens again considerably: the lattice result for \(f_+(0)\) leads to \(|V_u|^2 = 0.99914(53)\), implying only a \(\simeq 1.6\sigma \)-tension with unitarity, while the one for \({f_{K^\pm }}/{f_{\pi ^\pm }}\) implies \(|V_u|^2 = 0.99999(54)\), thus confirming again CKM unitarity below the permille level.
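The tensions quoted above follow directly from the central values and uncertainties of \(|V_u|^2\). A minimal sketch, assuming Gaussian, symmetric errors (the helper name `tension_sigma` is ours and purely illustrative):

```python
def tension_sigma(value, error, expected=1.0):
    """Deviation of `value` from `expected` in units of `error`."""
    return abs(expected - value) / error

# (central value, uncertainty) of |V_u|^2 as quoted in the text
cases = {
    "Nf=2+1+1, lattice only":            (0.9797, 0.0074),
    "Nf=2+1+1, f+(0) and beta decay":    (0.99884, 0.00053),
    "Nf=2+1+1, fK/fpi and beta decay":   (0.99986, 0.00046),
    "Nf=2+1, lattice only":              (0.9832, 0.0089),
    "Nf=2+1, f+(0) and beta decay":      (0.99914, 0.00053),
    "Nf=2+1, fK/fpi and beta decay":     (0.99999, 0.00054),
}

for label, (value, error) in cases.items():
    print(f"{label}: {tension_sigma(value, error):.1f} sigma")
```

The two \({f_{K^\pm }}/{f_{\pi ^\pm }}\) cases come out well below one standard deviation, in line with the statement that unitarity is confirmed below the permille level.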

For the analysis corresponding to \(N_f = 2\) the reader should refer to the previous FLAG edition [3].

Note that the above tests also offer a check of the basic hypothesis that underlies our analysis: we are assuming that the weak interaction between the quarks and the leptons is governed by the same Fermi constant as the one that determines the strength of the weak interaction among the leptons and the lifetime of the muon. In certain modifications of the Standard Model, this is not the case and it need not be true that the rates of the decays \(\pi \rightarrow \ell \nu \), \(K\rightarrow \ell \nu \) and \(K\rightarrow \pi \ell \nu \) can be used to determine the matrix elements \(|V_{ud}f_\pi |\), \(|V_{us}f_K|\) and \(|V_{us}f_+(0)|\), respectively, and that \(|V_{ud}|\) can be measured in nuclear \(\beta \) decay. The fact that the lattice data is consistent with unitarity and with the value of \(|V_{ud}|\) found in nuclear \(\beta \) decay indirectly also checks the equality of the Fermi constants.

4.5 Analysis within the Standard Model

The Standard Model implies that the CKM matrix is unitary. The precise experimental constraints quoted in (69) and the unitarity condition (83) then reduce the four quantities \(|V_{ud}|,|V_{us}|,f_+(0),{f_{K^\pm }}/{f_{\pi ^\pm }}\) to a single unknown: any one of these determines the other three within narrow uncertainties.

As Fig. 10 shows, the results obtained for \(|V_{us}|\) and \(|V_{ud}|\) from the data on \({f_{K^\pm }}/{f_{\pi ^\pm }}\) (squares) are quite consistent with the determinations via \(f_+(0)\) (triangles). In order to calculate the corresponding average values, we restrict ourselves to those determinations that we have considered best in Sect. 4.3. The corresponding results for \(|V_{us}|\) are listed in Table 16 (the error in the experimental numbers used to convert the values of \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) into values for \(|V_{us}|\) is included in the statistical error).

Table 17 The upper half of the table shows our final results for \(|V_{us}|\), \(|V_{ud}|\), \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) that are obtained by analysing the lattice data within the Standard Model (see text). For comparison, the lower half lists the values that follow if the lattice results are replaced by the experimental results on nuclear \(\beta \) decay and \(\tau \) decay, respectively

For \(N_{ f}=2\,+\,1\,+\,1\) we consider the data both for \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\), treating ETM 16 and ETM 14E on the one hand and FNAL/MILC 13E, FNAL/MILC 17 and HPQCD 13A on the other hand, as statistically correlated according to the prescription of Sect. 2.3. We obtain \(|V_{us}|=0.2249(7)\), where the error includes the inflation factor due to the value of \(\chi ^2/\mathrm{dof} \simeq 2.5\). This result is indicated on the left hand side of Fig. 10 by the narrow vertical band. In the case \(N_f = 2\,+\,1\) we consider MILC 10, FNAL/MILC 12I and HPQCD/UKQCD 07 on the one hand and RBC/UKQCD 14B and RBC/UKQCD 15A on the other hand, as mutually statistically correlated, since the analysis in the two cases starts from partly the same set of gauge ensembles. In this way we arrive at \(|V_{us}| = 0.2249(5)\) with \(\chi ^2/\mathrm{dof} \simeq 0.8\). For \(N_{ f}=2\) we consider ETM 09A and ETM 09 as statistically correlated, obtaining \(|V_{us}|=0.2256(19)\) with \(\chi ^2/\mathrm{dof} \simeq 0.7\). The figure shows that the results obtained for the data with \(N_{ f}=2\), \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\) are consistent with each other.
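The prescription of Sect. 2.3 is not reproduced in this section, so the following is only a sketch of the simplest case: an inverse-variance weighted average of uncorrelated inputs, with the error stretched by \(\sqrt{\chi ^2/\mathrm{dof}}\) when \(\chi ^2/\mathrm{dof}>1\). The treatment of statistical correlations described above is deliberately left out, the stretching rule is an assumption, and the input numbers are illustrative rather than actual lattice results.

```python
import math

def weighted_average(values, errors):
    """Inverse-variance weighted mean of uncorrelated inputs.

    Returns (mean, error, chi2_per_dof). The error is stretched by
    sqrt(chi2/dof) whenever chi2/dof exceeds 1.
    """
    weights = [1.0 / e**2 for e in errors]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    error = 1.0 / math.sqrt(sum(weights))
    dof = len(values) - 1
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    chi2_dof = chi2 / dof if dof > 0 else 0.0
    if chi2_dof > 1.0:
        error *= math.sqrt(chi2_dof)  # inflate the error for poor consistency
    return mean, error, chi2_dof

# Illustrative inputs only (not taken from any collaboration)
mean, error, chi2_dof = weighted_average([0.2245, 0.2255], [0.0005, 0.0005])
print(f"mean = {mean:.4f}, error = {error:.4f}, chi2/dof = {chi2_dof:.1f}")
```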

Alternatively, we can solve the relations for \(|V_{ud}|\) instead of \(|V_{us}|\). Again, the result \(|V_{ud}|=0.97437(16)\), which follows from the lattice data with \(N_{ f}=2\,+\,1\,+\,1\), is perfectly consistent with the values \(|V_{ud}|=0.97438(12)\) and \(|V_{ud}|=0.97423(44)\) obtained from the data with \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\), respectively. The reduction of the uncertainties in the result for \(|V_{ud}|\) due to CKM unitarity is to be expected from Fig. 9: the unitarity condition reduces the region allowed by the lattice results to a nearly vertical interval.
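As a rough cross-check, the conversion from \(|V_{us}|\) to \(|V_{ud}|\) can be sketched as follows. We assume here that the unitarity condition (83) is the first-row relation \(|V_{ud}|^2+|V_{us}|^2+|V_{ub}|^2=1\), and we propagate only the uncertainty of \(|V_{us}|\), linearly. The function name is ours.

```python
import math

V_UB = 3.94e-3  # |V_ub| as quoted in the text

def vud_from_vus(v_us, dv_us, v_ub=V_UB):
    """Solve |V_ud|^2 + |V_us|^2 + |V_ub|^2 = 1 for |V_ud|.

    Only the error of |V_us| is propagated: d|V_ud| = (|V_us|/|V_ud|) d|V_us|.
    """
    v_ud = math.sqrt(1.0 - v_us**2 - v_ub**2)
    dv_ud = v_us / v_ud * dv_us
    return v_ud, dv_ud

inputs = {"2+1+1": (0.2249, 0.0007), "2+1": (0.2249, 0.0005), "2": (0.2256, 0.0019)}
for nf, (v_us, dv_us) in inputs.items():
    v_ud, dv_ud = vud_from_vus(v_us, dv_us)
    print(f"Nf = {nf}: |V_ud| = {v_ud:.5f} +/- {dv_ud:.5f}")
```

Because the inputs are the rounded values of \(|V_{us}|\), the output agrees with the numbers quoted above only to within a few units in the fifth decimal place. The factor \(|V_{us}|/|V_{ud}|\simeq 0.23\) in the error propagation is the reason why the uncertainty of \(|V_{ud}|\) is so much smaller than that of \(|V_{us}|\).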

Next, we determine the values of \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) that follow from our determinations of \(|V_{us}|\) and \(|V_{ud}|\) obtained from the lattice data within the Standard Model. We find \(f_+(0) = 0.9627(35)\) for \(N_{ f}=2\,+\,1\,+\,1\), \(f_+(0) = 0.9627(28)\) for \(N_{ f}=2\,+\,1\), \(f_+(0) = 0.9597(83)\) for \(N_{ f}=2\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}= 1.196(3)\) for \(N_{ f}=2\,+\,1\,+\,1\), \({f_{K^\pm }}/{f_{\pi ^\pm }}= 1.196(3)\) for \(N_{ f}=2\,+\,1\), \({f_{K^\pm }}/{f_{\pi ^\pm }}= 1.192(9) \) for \(N_{ f}=2\), respectively. These results are collected in the upper half of Table 17. In the lower half of the table, we list the analogous results found by working out the consequences of the CKM unitarity using the values of \(|V_{ud}|\) and \(|V_{us}|\) obtained from nuclear \(\beta \) decay and \(\tau \) decay, respectively. The comparison shows that the lattice result for \(|V_{ud}|\) not only agrees very well with the totally independent determination based on nuclear \(\beta \) transitions, but is also remarkably precise. On the other hand, the values of \(|V_{ud}|\), \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) that follow from the \(\tau \)-decay data if the Standard Model is assumed to be valid were initially not all in agreement with the lattice results for these quantities. The disagreement is reduced considerably if the analysis of the \(\tau \) data is supplemented with experimental results on electroproduction [223]: the discrepancy then amounts to little more than one standard deviation. The disagreement disappears when recent implementations of the relevant sum rules and a different experimental input are considered [226].
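The step from \(|V_{us}|\) to \(f_+(0)\) and \({f_{K^\pm }}/{f_{\pi ^\pm }}\) can be illustrated with a short sketch. Eq. (69) is not reproduced in this section; the two experimental products used below, \(|V_{us}|f_+(0)=0.2165\) and \(|V_{us}/V_{ud}|\,{f_{K^\pm }}/{f_{\pi ^\pm }}=0.2760\), are therefore assumptions of this example, as is the form of the unitarity relation. Uncertainties are not propagated.

```python
import math

# Assumed stand-ins for the experimental constraints of Eq. (69)
VUS_FPLUS = 0.2165        # |V_us| * f_+(0)
VUS_VUD_FK_FPI = 0.2760   # |V_us / V_ud| * f_K / f_pi
V_UB = 3.94e-3            # |V_ub| as quoted in the text

def form_factor_and_ratio(v_us):
    """Return (f_+(0), f_K/f_pi) implied by |V_us| and first-row unitarity."""
    v_ud = math.sqrt(1.0 - v_us**2 - V_UB**2)
    f_plus = VUS_FPLUS / v_us
    fk_fpi = VUS_VUD_FK_FPI * v_ud / v_us
    return f_plus, fk_fpi

f_plus, fk_fpi = form_factor_and_ratio(0.2249)
print(f"f_+(0) = {f_plus:.4f}, f_K/f_pi = {fk_fpi:.3f}")
```

With \(|V_{us}|=0.2249\) the central values land on those quoted for \(N_{ f}=2\,+\,1\,+\,1\) and \(N_{ f}=2\,+\,1\), which illustrates that any one of the four quantities fixes the other three.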

Table 18 Colour code for the lattice data on \(f_{\pi ^\pm }\) and \(f_{K^\pm }\) together with information on the way the lattice spacing was converted to physical units and on whether or not an isospin-breaking correction has been applied to the quoted result (see Sect. 4.3). The numerical values are listed in MeV units. With respect to the previous edition [3] old results with two red tags have been dropped

4.6 Direct determination of \(f_{K^\pm }\) and \(f_{\pi ^\pm }\)

It is useful for flavour physics studies to provide not only the lattice average of \(f_{K^\pm } / f_{\pi ^\pm }\), but also the average of the decay constant \(f_{K^\pm }\). The case of the decay constant \(f_{\pi ^\pm }\) is different, since the PDG value [201] of this quantity, based on the use of the value of \(|V_{ud}|\) obtained from superallowed nuclear \(\beta \) decays [209], is often used for setting the scale in lattice QCD (see Appendix A.2). However, the physical scale can be set in different ways, namely, by using as input the mass of the \(\Omega \)-baryon (\(m_\Omega \)) or the \(\Upsilon \)-meson spectrum (\(\Delta M_\Upsilon \)), which are less sensitive than \(f_{\pi ^\pm }\) to the uncertainties of the chiral extrapolation in the light-quark mass. In such cases the value of the decay constant \(f_{\pi ^\pm }\) becomes a direct prediction of the lattice-QCD simulations. It is therefore interesting to provide also the average of the decay constant \(f_{\pi ^\pm }\), obtained when the physical scale is set through another hadron observable, in order to check the consistency of different scale setting procedures.

Our compilation of the values of \(f_{\pi ^\pm }\) and \(f_{K^\pm }\) with the corresponding colour code is presented in Table 18 and it is unchanged from the corresponding one in the previous FLAG review [3].

In comparison to the case of \(f_{K^\pm } / f_{\pi ^\pm }\) we have added two columns indicating which quantity is used to set the physical scale and the possible use of a renormalization constant for the axial current. For several lattice formulations the nonsinglet axial-vector Ward identity makes it possible to avoid any renormalization constant.

One can see that the determinations of \(f_{\pi ^\pm }\) and \(f_{K^\pm }\) suffer from larger uncertainties than those of the ratio \(f_{K^\pm } / f_{\pi ^\pm }\), which is less sensitive to various systematic effects (including the uncertainty of a possible renormalization constant) and, moreover, is not exposed to the uncertainties of the procedure used to set the physical scale.

According to the FLAG rules, for \(N_f = 2 + 1 + 1\) three data sets can form an average, and only for \(f_{K^\pm }\): ETM 14E [34], FNAL/MILC 14A [18] and HPQCD 13A [33]. Following the same procedure already adopted in Sect. 4.3 in the case of the ratio of the decay constants, we treat FNAL/MILC 14A and HPQCD 13A as statistically correlated. For \(N_f = 2 + 1\) three data sets can form the average of \(f_{\pi ^\pm }\) and \(f_{K^\pm }\): RBC/UKQCD 14B [10] (update of RBC/UKQCD 12), HPQCD/UKQCD 07 [35] and MILC 10 [36], which is the latest update of the MILC program. We consider HPQCD/UKQCD 07 and MILC 10 as statistically correlated and use the prescription of Sect. 2.3 to form an average. For \(N_f = 2\) the average cannot be formed for \(f_{\pi ^\pm }\), and only one data set (ETM 09) satisfies the FLAG rules in the case of \(f_{K^\pm }\).

Thus, our estimates read

$$\begin{aligned}&N_f = 2 + 1: \quad f_{\pi ^\pm }= 130.2 ~ (0.8)~ \text{ MeV } \quad \,\mathrm {Refs.}~ \text{[10,35,36] }, \end{aligned}$$
(84)
$$\begin{aligned}&N_f = 2 + 1 + 1: \quad f_{K^\pm } = 155.7 ~ (0.3)~ \text{ MeV } \quad \,\mathrm {Refs.}~ \text{[18,33,34] } ,\nonumber \\&N_f = 2 + 1: \quad f_{K^\pm } = 155.7 ~ (0.7)~ \text{ MeV } \quad \,\mathrm {Refs.}~\text{[10,35,36] }, \nonumber \\&N_f = 2: \quad f_{K^\pm } = 157.5 ~ (2.4)~ \text{ MeV } \quad \,\mathrm {Ref.}~ \text{[40] }. \end{aligned}$$
(85)

The lattice results of Table 18 and our estimates (84)–(85) are reported in Fig. 11. Note that the FLAG estimates of \(f_{K^\pm }\) for \(N_f = 2\) and \(N_f = 2 + 1 + 1\) are based on calculations in which \(f_{\pi ^\pm }\) is used to set the lattice scale, while the \(N_f = 2 + 1\) estimate does not rely on that.

5 Low-energy constants

Authors: S. Dürr, H. Fukaya, U. M. Heller

In the study of the quark-mass dependence of QCD observables calculated on the lattice, it is common practice to invoke chiral perturbation theory (\(\chi \)PT). For a given quantity this framework predicts the nonanalytic quark-mass dependence and it provides symmetry relations among different observables. These relations are best expressed with the help of a set of linearly independent and universal (i.e., process-independent) low-energy constants (LECs), which first appear as coefficients of the polynomial terms (in \(m_q\) or \(M_{\pi }^2\)) in different observables. When numerical simulations are done at heavier than physical (light) quark masses, \(\chi \)PT is usually invoked in the extrapolation to physical quark masses.

5.1 Chiral perturbation theory

\(\chi \)PT is an effective field theory approach to the low-energy properties of QCD based on the spontaneous breaking of chiral symmetry, \(SU(N_{ f})_L \times SU(N_{ f})_R \rightarrow SU(N_{ f})_{L+R}\), and its soft explicit breaking by quark-mass terms. In its original implementation, in infinite volume, it is an expansion in \(m_q\) and \(p^2\) with power counting \(M_{\pi }^2 \sim m_q \sim p^2\).

If one expands around the SU(2) chiral limit, there appear two LECs at order \(p^2\) in the chiral effective Lagrangian,

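$$\begin{aligned} F\equiv F_\pi \,\Big |_{m_u,m_d\rightarrow 0},\quad \Sigma \equiv -\langle {\bar{u}}u\rangle \,\Big |_{m_u,m_d\rightarrow 0}, \end{aligned}$$
(86)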

and seven at order \(p^4\), indicated by \({{\bar{\ell }}}_i\) with \(i=1,\ldots ,7\). In the analysis of the SU(3) chiral limit there are also just two LECs at order \(p^2\),

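$$\begin{aligned} F_0\equiv F_\pi \,\Big |_{m_u,m_d,m_s\rightarrow 0},\quad \Sigma _0\equiv -\langle {\bar{u}}u\rangle \,\Big |_{m_u,m_d,m_s\rightarrow 0}, \end{aligned}$$
(87)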

but ten at order \(p^4\), indicated by the capital letter \(L_i(\mu )\) with \(i=1,\ldots ,10\). These constants are independent of the quark masses (see footnote 23), but they become scale dependent after renormalization (sometimes a superscript r is added). The SU(2) constants \({\bar{\ell }}_i\) are scale independent, since they are defined at scale \(\mu =M_{\pi ,\mathrm {phys}}\) (as indicated by the bar). For the precise definition of these constants and their scale dependence we refer the reader to Refs. [244, 277].

Fig. 11

Values of \(f_\pi \) and \(f_K\). The black squares and grey bands indicate our estimates (84) and (85)

5.1.1 Patterns of chiral symmetry breaking

If the box size is finite but large compared to the Compton wavelength of the pion, \(L\gg 1/M_{\pi }\), the power counting generalizes to \(m_q \sim p^2 \sim 1/L^2\), as one would assume based on the fact that \(p_\mathrm {min}=2\pi /L\) is the minimum momentum in a finite box with periodic boundary conditions in the spatial directions. This is the so-called p-regime of \(\chi \)PT. It coincides with the setting that is used for standard phenomenologically oriented lattice-QCD computations, and we shall consider the p-regime the default in the following. However, if the pion mass is so small that the box-length L is no longer large compared to the Compton wavelength that the pion would have, at the given \(m_q\), in infinite volume, then the chiral series must be reordered. Such finite-volume versions of \(\chi \)PT with correspondingly adjusted power counting schemes, referred to as \(\epsilon \)- and \(\delta \)-regime, are described in Sects. 5.1.6 and 5.1.7, respectively.
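Whether a given ensemble sits in the p-regime can be judged from the dimensionless product \(M_{\pi }L\) and from the minimum momentum \(p_\mathrm {min}=2\pi /L\). A minimal sketch of the unit conversion, with illustrative input values and a helper name of our own:

```python
import math

HBARC_MEV_FM = 197.3269804  # hbar * c in MeV fm

def box_diagnostics(m_pi_mev, box_fm):
    """Return (M_pi * L, p_min in MeV) for a periodic spatial box of length L."""
    m_pi_l = m_pi_mev * box_fm / HBARC_MEV_FM
    p_min = 2.0 * math.pi * HBARC_MEV_FM / box_fm
    return m_pi_l, p_min

# Illustrative inputs: a near-physical pion mass in boxes of two sizes
for box_fm in (3.0, 6.0):
    m_pi_l, p_min = box_diagnostics(135.0, box_fm)
    print(f"L = {box_fm} fm: M_pi L = {m_pi_l:.2f}, p_min = {p_min:.0f} MeV")
```

The condition \(L\gg 1/M_{\pi }\) corresponds to \(M_{\pi }L\gg 1\); how large the product must be in practice is a matter for the finite-volume analysis itself and is not fixed by this sketch.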

Lattice calculations can be used to test if chiral symmetry is indeed spontaneously broken along the path \(SU(N_{ f})_L \times SU(N_{ f})_R \rightarrow SU(N_{ f})_{L+R}\) by measuring nonzero chiral condensates and by verifying the validity of the GMOR relation \(M_{\pi }^2\propto m_q\) close to the chiral limit. If the chiral extrapolation of quantities calculated on the lattice is made with the help of fits to their \(\chi \)PT forms, apart from determining the observable at the physical value of the quark masses, one also obtains the relevant LECs. This is an important by-product for two reasons:

  1. All LECs up to order \(p^4\) (with the exception of B and \(B_0\), since only the product of these times the quark masses can be estimated from phenomenology) have either been determined by comparison to experiment or estimated theoretically, e.g., in large-\(N_c\) QCD. A lattice determination of the better known LECs thus provides a test of the \(\chi \)PT approach.

  2. The less well-known LECs are those which describe the quark-mass dependence of observables; these cannot be determined from experiment, and therefore the lattice, where quark masses can be varied, provides unique quantitative information. This information is essential for improving phenomenological \(\chi \)PT predictions in which these LECs play a role.

We stress that this program is based on the nonobvious assumption that \(\chi \)PT is valid in the region of masses and momenta used in the lattice simulations under consideration, something that can and should be checked. With the ability to create data at multiple values of the light-quark masses, lattice QCD offers the possibility to check the convergence of \(\chi \)PT. Lattice data may be used to verify that higher order contributions, for small enough quark masses, become increasingly unimportant. In the end one wants to compare lattice and phenomenological determinations of LECs, much in the spirit of Ref. [278]. An overview of many of the conceptual issues involved in matching lattice data to an effective field theory framework like \(\chi \)PT is given in Refs. [279,280,281].

The fact that, at large volume, the finite-size effects, which occur if a system undergoes spontaneous symmetry breakdown, are controlled by the Nambu-Goldstone modes, was first noted in solid state physics, in connection with magnetic systems [282, 283]. As pointed out in Ref. [284] in the context of QCD, the thermal properties of such systems can be studied in a systematic and model-independent manner by means of the corresponding effective field theory, provided the temperature is low enough. While finite volumes are not of physical interest in particle physics, lattice simulations are necessarily carried out in a finite box. As shown in Refs. [285,286,287], the ensuing finite-size effects can be studied on the basis of the effective theory – \(\chi \)PT in the case of QCD – provided the simulation is close enough to the continuum limit, the volume is sufficiently large and the explicit breaking of chiral symmetry generated by the quark masses is sufficiently small. Indeed, \(\chi \)PT represents a useful tool for the analysis of the finite-size effects in lattice simulations.

In the remainder of this section we collect the relevant \(\chi \)PT formulae that will be used in the two following sections to extract SU(2) and SU(3) LECs from lattice data.

5.1.2 Quark-mass dependence of pseudoscalar masses and decay constants

A. SU(2) formulae

The expansions (see footnote 24) of \(M_{\pi }^2\) and \(F_\pi \) in powers of the quark mass are known to next-to-next-to-leading order (NNLO) in the SU(2) chiral effective theory. In the isospin limit, \(m_u=m_d=m\), the explicit expressions may be written in the form [288]

$$\begin{aligned} M_{\pi }^2= & {} M^2\left\{ 1-\frac{1}{2}x\ln \frac{\Lambda _3^2}{M^2} +\frac{17}{8}x^2 \left( \ln \frac{\Lambda _M^2}{M^2} \right) ^2 \right. \nonumber \\&\quad \quad \quad \left. +x^2 k_M +{\mathcal {O}}(x^3) \right\} , \nonumber \\ F_\pi= & {} F\left\{ 1+x\ln \frac{\Lambda _4^2}{M^2} -\frac{5}{4}x^2 \left( \ln \frac{\Lambda _F^2}{M^2} \right) ^2 \right. \nonumber \\&\quad \quad \quad \left. +x^2 k_F +{\mathcal {O}}(x^3) \right\} . \end{aligned}$$
(88)

Here the expansion parameter is given by

$$\begin{aligned} x=\frac{M^2}{(4\pi F)^2},\quad M^2=2Bm=\frac{2\Sigma m}{F^2}, \end{aligned}$$
(89)

but there is another option as discussed below. The scales \(\Lambda _3,\Lambda _4\) are related to the effective coupling constants \({\bar{\ell }}_3,{\bar{\ell }}_4\) of the chiral Lagrangian at scale \(\mu =M_{\pi ,\mathrm {phys}}\) by

$$\begin{aligned} {\bar{\ell }}_n=\ln \frac{\Lambda _n^2}{M_{\pi ,\mathrm {phys}}^2},\quad n=1,...,7. \end{aligned}$$
(90)

Note that in Eq. (88) the logarithms are evaluated at \(M^2\), not at \(M_{\pi }^2\). The coupling constants \(k_M,k_F\) in Eq. (88) are mass-independent. The scales of the squared logarithms can be expressed in terms of the \({\mathcal {O}}(p^4)\) coupling constants as

$$\begin{aligned} \ln \frac{\Lambda _M^2}{M^2}= & {} \frac{1}{51}\left( 28\ln \frac{\Lambda _1^2}{M^2} +32\ln \frac{\Lambda _2^2}{M^2} -9\ln \frac{\Lambda _3^2}{M^2} +49 \right) , \nonumber \\ \ln \frac{\Lambda _F^2}{M^2}= & {} \frac{1}{30}\left( 14\ln \frac{\Lambda _1^2}{M^2} +16\ln \frac{\Lambda _2^2}{M^2}\nonumber \right. \\&\quad \quad \quad \left. +6\ln \frac{\Lambda _3^2}{M^2} - 6 \ln \frac{\Lambda _4^2}{M^2} +23 \right) . \end{aligned}$$
(91)

Hence by analysing the quark-mass dependence of \(M_{\pi }^2\) and \(F_\pi \) with Eq. (88), possibly truncated at NLO, one can determine (see footnote 25) the \({\mathcal {O}}(p^2)\) LECs B and F, as well as the \({\mathcal {O}}(p^4)\) LECs \({{\bar{\ell }}}_3\) and \({{\bar{\ell }}}_4\). The quark condensate in the chiral limit is given by \(\Sigma =F^2B\). With precise enough data at several small enough pion masses, one could in principle also determine \(\Lambda _M\), \(\Lambda _F\) and \(k_M\), \(k_F\). To date this is not yet possible. The results for the LO and NLO constants will be presented in Sect. 5.2.
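To make the NLO truncation of Eq. (88) concrete, the following sketch evaluates \(M_{\pi }^2\) and \(F_\pi \) as functions of the LO mass squared \(M^2=2Bm\), with the scales \(\Lambda _3,\Lambda _4\) obtained from \({\bar{\ell }}_3,{\bar{\ell }}_4\) via Eq. (90). All numerical inputs (F, \({\bar{\ell }}_3\), \({\bar{\ell }}_4\) and the reference pion mass) are hypothetical and serve only as an illustration; they are not the results of Sect. 5.2.

```python
import math

M_PI_PHYS = 135.0  # MeV, reference pion mass entering Eq. (90) (assumed value)

def lambda_sq(lbar):
    """Lambda_n^2 in MeV^2 from lbar_n, using Eq. (90)."""
    return M_PI_PHYS**2 * math.exp(lbar)

def nlo_pion(m2_lo, f_lo, lbar3, lbar4):
    """NLO truncation of Eq. (88).

    m2_lo: LO pion mass squared M^2 = 2 B m (MeV^2); f_lo: F (MeV).
    Returns (M_pi^2, F_pi).
    """
    x = m2_lo / (4.0 * math.pi * f_lo) ** 2          # expansion parameter, Eq. (89)
    m_pi_sq = m2_lo * (1.0 - 0.5 * x * math.log(lambda_sq(lbar3) / m2_lo))
    f_pi = f_lo * (1.0 + x * math.log(lambda_sq(lbar4) / m2_lo))
    return m_pi_sq, f_pi

# Hypothetical inputs, for illustration only
F_LO, LBAR3, LBAR4 = 86.0, 3.0, 4.0
for m_lo in (135.0, 250.0, 400.0):
    m_pi_sq, f_pi = nlo_pion(m_lo**2, F_LO, LBAR3, LBAR4)
    print(f"M = {m_lo:5.1f} MeV: M_pi = {math.sqrt(m_pi_sq):6.1f} MeV, F_pi = {f_pi:5.1f} MeV")
```

In a fit, F, B, \({\bar{\ell }}_3\) and \({\bar{\ell }}_4\) would be the free parameters, and the function above would play the role of the model evaluated at each simulated quark mass.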

Alternatively, one can invert Eq. (88) and express \(M^2\) and F as an expansion in

$$\begin{aligned} \xi \equiv \frac{M_{\pi }^2}{16 \pi ^2 F_\pi ^2} \; \; , \end{aligned}$$
(92)

and the corresponding expressions then take the form

$$\begin{aligned} M^2= & {} M_{\pi }^2\,\left\{ 1+\frac{1}{2}\,\xi \,\ln \frac{\Lambda _3^2}{M_\pi ^2}- \frac{5}{8}\,\xi ^2 \left( \!\ln \frac{\Omega _M^2}{M_\pi ^2}\!\right) ^2\right. \nonumber \\&\quad \quad \quad \left. + \xi ^2 c_{\scriptscriptstyle M}+{\mathcal {O}}(\xi ^3)\right\} \,,\nonumber \\ F= & {} F_\pi \,\left\{ 1-\xi \,\ln \frac{\Lambda _4^2}{M_\pi ^2}-\frac{1}{4}\,\xi ^2 \left( \!\ln \frac{\Omega _F^2}{M_\pi ^2}\!\right) ^2 \right. \nonumber \\&\quad \quad \quad \left. +\xi ^2 c_{\scriptscriptstyle F}+{\mathcal {O}}(\xi ^3)\right\} \,.\end{aligned}$$
(93)

The scales of the quadratic logarithms are determined by \(\Lambda _1,\ldots ,\Lambda _4\) through

$$\begin{aligned} \ln \frac{\Omega _M^2}{M_\pi ^2}= & {} \frac{1}{15}\left( 28\,\ln \frac{\Lambda _1^2}{M_\pi ^2}+32\,\ln \frac{\Lambda _2^2}{M_\pi ^2}- 33\,\ln \frac{\Lambda _3^2}{M_\pi ^2}\nonumber \right. \\&\quad \quad \quad \left. -12\,\ln \frac{\Lambda _4^2}{M_\pi ^2}+52\right) \,,\nonumber \\ \ln \frac{\Omega _F^2}{M_\pi ^2}= & {} \frac{1}{3}\,\left( -7\,\ln \frac{\Lambda _1^2}{M_\pi ^2}-8\,\ln \frac{\Lambda _2^2}{M_\pi ^2}+ 18\,\ln \frac{\Lambda _4^2}{M_\pi ^2}- \frac{29}{2}\right) \,.\nonumber \\ \end{aligned}$$
(94)

In practice, many results are expressed in terms of the LO constants F and \(\Sigma \) and the NLO constants \({{\bar{\ell }}}_i\). The condensate \(\Sigma \) is related to the LO constant B used above through \(B=\Sigma /F^2\). At NLO the relation is a bit more involved, since the \({{\bar{\ell }}}_i\) are defined with reference to the physical pion mass, see Eq. (90). For instance, Eq. (93) may be rewritten as

$$\begin{aligned} M^2= & {} M_{\pi }^2\,\left\{ 1+\frac{1}{2}\,\xi \,{{\bar{\ell }}}_3+\frac{1}{2}\,\xi \ln \frac{M_{\pi ,\mathrm {phys}}^2}{M_{\pi }^2}\nonumber \right. \\&\quad \quad \quad \left. - \frac{5}{8}\,\xi ^2 \left( \!\ln \frac{\Omega _M^2}{M_\pi ^2}\!\right) ^2 + \xi ^2 c_{\scriptscriptstyle M}+{\mathcal {O}}(\xi ^3)\right\} \,,\nonumber \\ F= & {} F_\pi \,\left\{ 1-\xi \,{{\bar{\ell }}}_4-\xi \,\ln \frac{M_{\pi ,\mathrm {phys}}^2}{M_{\pi }^2}\nonumber \right. \\&\quad \quad \quad \left. -\frac{1}{4}\,\xi ^2 \left( \!\ln \frac{\Omega _F^2}{M_\pi ^2}\!\right) ^2 +\xi ^2 c_{\scriptscriptstyle F}+{\mathcal {O}}(\xi ^3)\right\} \,,\end{aligned}$$
(95)

and this implies that fitting some lattice data (say at a single lattice spacing a) with Eq. (95) requires some a-priori knowledge of the lattice spacing. On the other hand, doing the same job with Eq. (93) yields the scales \(a\Lambda _3, a\Lambda _4\) in lattice units (which may be converted to \({{\bar{\ell }}}_3,{{\bar{\ell }}}_4\) at a later stage of the analysis when the scale is known more precisely).
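The difference between the two parametrizations can be made explicit in a short sketch, restricted to the NLO terms of the \(M^2\) expansion. The form of Eq. (93) involves only ratios of quantities in lattice units, whereas the form of Eq. (95) needs \(aM_{\pi ,\mathrm {phys}}\), i.e., the lattice spacing. The inverse lattice spacing and all other numbers below are hypothetical.

```python
import math

def m2_lo_eq93(m_pi, f_pi, lam3):
    """NLO truncation of Eq. (93); all arguments in the same (e.g. lattice) units."""
    xi = m_pi**2 / (16.0 * math.pi**2 * f_pi**2)
    return m_pi**2 * (1.0 + 0.5 * xi * math.log(lam3**2 / m_pi**2))

def m2_lo_eq95(m_pi, f_pi, lbar3, m_pi_phys):
    """NLO truncation of Eq. (95); needs the physical pion mass in the same units."""
    xi = m_pi**2 / (16.0 * math.pi**2 * f_pi**2)
    return m_pi**2 * (1.0 + 0.5 * xi * lbar3
                      + 0.5 * xi * math.log(m_pi_phys**2 / m_pi**2))

# Hypothetical numbers for illustration
A_INV_MEV = 2000.0                          # inverse lattice spacing
M_PI_PHYS = 135.0                           # MeV
LBAR3 = 3.0
LAM3 = M_PI_PHYS * math.exp(LBAR3 / 2.0)    # Eq. (90) solved for Lambda_3

am_pi, af_pi = 300.0 / A_INV_MEV, 100.0 / A_INV_MEV
via_93 = m2_lo_eq93(am_pi, af_pi, LAM3 / A_INV_MEV)
via_95 = m2_lo_eq95(am_pi, af_pi, LBAR3, M_PI_PHYS / A_INV_MEV)
print(via_93, via_95)  # the two forms agree once the scale is known
```

A fit with the first function returns \(a\Lambda _3\) directly, and the conversion to \({{\bar{\ell }}}_3\) through Eq. (90) can be postponed until the scale is known.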

B. SU(3) formulae

While the formulae for the pseudoscalar masses and decay constants are known to NNLO for SU(3) as well [289], they are rather complicated and we restrict ourselves here to next-to-leading order (NLO). In the isospin limit, the relevant SU(3) formulae take the form [244]

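$$\begin{aligned} M_{\pi }^2&= 2B_0m_{ud} \left\{ 1+\mu _\pi -\frac{1}{3}\mu _\eta +\frac{B_0}{F_0^2} \Big [ 16m_{ud}(2L_8-L_5)+16(m_s+2m_{ud})(2L_6-L_4) \Big ] \right\} , \nonumber \\ M_{K}^2&= B_0(m_s+m_{ud}) \left\{ 1+\frac{2}{3}\mu _\eta +\frac{B_0}{F_0^2} \Big [ 8(m_s+m_{ud})(2L_8-L_5)+16(m_s+2m_{ud})(2L_6-L_4) \Big ] \right\} , \nonumber \\ F_\pi&= F_0 \left\{ 1-2\mu _\pi -\mu _K+\frac{B_0}{F_0^2} \Big [ 8m_{ud}L_5+8(m_s+2m_{ud})L_4 \Big ] \right\} , \nonumber \\ F_K&= F_0 \left\{ 1-\frac{3}{4}\mu _\pi -\frac{3}{2}\mu _K-\frac{3}{4}\mu _\eta +\frac{B_0}{F_0^2} \Big [ 4(m_s+m_{ud})L_5+8(m_s+2m_{ud})L_4 \Big ] \right\} , \end{aligned}$$
(96)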

where \(m_{ud}\) is the common up/down quark mass in the simulation [which may be chosen different from the average light-quark mass \(\frac{1}{2}(m_u^\mathrm {phys}+m_d^\mathrm {phys})\) in the real world], and \(B_0=\Sigma _0/F_0^2\) and \(F_0\) denote the condensate parameter and the pseudoscalar decay constant in the SU(3) chiral limit, respectively. In addition, we use the notation

$$\begin{aligned} \mu _P=\frac{M_P^2}{32\pi ^2F_0^2} \ln \!\left( \frac{M_P^2}{\mu ^2}\right) \;. \end{aligned}$$
(97)

At the order of the chiral expansion used in these formulae, the quantities \(\mu _\pi \), \(\mu _K\), \(\mu _\eta \) can equally well be evaluated with the leading-order expressions for the masses,

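$$\begin{aligned} M_{\pi }^2=2B_0m_{ud},\quad M_{K}^2=B_0(m_s+m_{ud}),\quad M_{\eta }^2=\frac{2}{3}B_0(2m_s+m_{ud}). \end{aligned}$$
(98)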

Throughout, \(L_i\) denotes the renormalized low-energy constant/coupling (LEC) at scale \(\mu \), and we adopt the convention that is standard in phenomenology, \(\mu =M_\rho =770\,\mathrm {MeV}\). The normalization used for the decay constants is specified in footnote 24.
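The chiral logarithms of Eq. (97) are straightforward to evaluate numerically. The sketch below uses \(\mu =770\,\mathrm {MeV}\) as stated above; the value of \(F_0\) is hypothetical, and the \(\eta \) mass is taken from the leading-order (Gell-Mann-Okubo) relation \(M_\eta ^2=(4M_K^2-M_\pi ^2)/3\), which we assume here as the standard consequence of the leading-order mass formulae.

```python
import math

def mu_p(m_p, f0, mu=770.0):
    """Chiral logarithm of Eq. (97); m_p, f0 and mu in MeV."""
    return m_p**2 / (32.0 * math.pi**2 * f0**2) * math.log(m_p**2 / mu**2)

def eta_mass_lo(m_pi, m_k):
    """Leading-order eta mass from M_eta^2 = (4 M_K^2 - M_pi^2) / 3."""
    return math.sqrt((4.0 * m_k**2 - m_pi**2) / 3.0)

F0 = 80.0                   # MeV, hypothetical value for illustration
M_PI, M_K = 135.0, 495.0    # MeV, approximate isospin-symmetric masses
M_ETA = eta_mass_lo(M_PI, M_K)

for name, mass in (("pi", M_PI), ("K", M_K), ("eta", M_ETA)):
    print(f"mu_{name} = {mu_p(mass, F0):+.4f}")
```

All three logarithms are negative for masses below the renormalization scale, and \(\mu _P\) vanishes at \(M_P=\mu \), which makes the scale dependence that is compensated by the \(L_i(\mu )\) explicit.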

5.1.3 Pion form factors and charge radii

The scalar and vector form factors of the pion are defined by the matrix elements

$$\begin{aligned} \begin{aligned}&\langle \pi ^i(p_2) |\, {\bar{q}}\, q \, | \pi ^k(p_1) \rangle = \delta ^{ik} F_S^\pi (t) \,,\\&\langle \pi ^i(p_2) | \,{\bar{q}}\, {\frac{1}{2}}\tau ^j \gamma ^\mu q\,| \pi ^k(p_1) \rangle = \mathrm {i} \,\epsilon ^{ijk} (p_1^\mu + p_2^\mu ) F_V^\pi (t) \,,\end{aligned} \end{aligned}$$
(99)

where the operators contain only the lightest two quark flavours, i.e., \(\tau ^1\), \(\tau ^2\), \(\tau ^3\) are the Pauli matrices, and \(t\equiv (p_1-p_2)^2\) denotes the momentum transfer.

The vector form factor has been measured by several experiments for time-like as well as for space-like values of t. The scalar form factor is not directly measurable, but it can be evaluated theoretically from data on the \(\pi \pi \) and \(\pi K\) phase shifts [290] by means of analyticity and unitarity, i.e., in a model-independent way. Lattice calculations can be compared with data or model-independent theoretical evaluations at any given value of t. At present, however, most lattice studies concentrate on the region close to \(t=0\) and on the evaluation of the slope and curvature, which are defined as

$$\begin{aligned} \begin{aligned} F^\pi _V(t)&= 1+{\frac{1}{6}}\langle r^2 \rangle ^\pi _V t + c_V t^2\,+\,\cdots , \\ F^\pi _S(t)&= F^\pi _S(0) \left[ 1+{\frac{1}{6}}\langle r^2 \rangle ^\pi _S t + c_S\, t^2\,+\, \cdots \right] . \end{aligned} \end{aligned}$$
(100)

The slopes are related to the mean-square vector and scalar radii, which are the quantities on which most experiments and lattice calculations concentrate.
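According to Eq. (100), the mean-square radius is six times the normalized slope of the form factor at \(t=0\). The sketch below extracts it with a central finite difference from an arbitrary callable; the monopole form with a pole mass of 770 MeV is a hypothetical model chosen only because its slope is known in closed form, \(\langle r^2\rangle =6/M_\mathrm {pole}^2\).

```python
HBARC_MEV_FM = 197.3269804  # hbar * c in MeV fm

def radius_sq_fm2(form_factor, h=100.0):
    """<r^2> = 6 F'(0) / F(0) in fm^2.

    form_factor: callable of t in MeV^2; h: finite-difference step in MeV^2.
    """
    slope = (form_factor(h) - form_factor(-h)) / (2.0 * h)
    return 6.0 * slope / form_factor(0.0) * HBARC_MEV_FM**2

M_POLE = 770.0  # MeV, hypothetical pole mass

def monopole(t):
    """Illustrative monopole form factor, normalized to 1 at t = 0."""
    return 1.0 / (1.0 - t / M_POLE**2)

print(f"<r^2> = {radius_sq_fm2(monopole):.4f} fm^2")
```

In an actual analysis the callable would be replaced by a fit or interpolation of the lattice data in t, and the systematic uncertainty of that parametrization would have to be assessed separately.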

In \(\chi \)PT, the form factors are known at NNLO for SU(2) [291]. The corresponding formulae are available in fully analytical form and are compact enough that they can be used for the chiral extrapolation of the data (as done, for example, in Refs. [53, 292]). The expressions for the scalar and vector radii and for the \(c_{S,V}\) coefficients at 2-loop level in SU(2) terminology read

$$\begin{aligned} \langle r^2 \rangle ^\pi _S= & {} \frac{1}{(4\pi F_\pi )^2} \left\{ 6 \ln \frac{\Lambda _4^2}{M_\pi ^2}-\frac{13}{2} -\frac{29}{3}\,\xi \left( \!\ln \frac{\Omega _{r_S}^2}{M_{\pi }^2} \!\right) ^2\right. \nonumber \\&\left. \quad \quad \quad \quad \quad + 6 \xi \, k_{r_S}+{\mathcal {O}}(\xi ^2)\right\} \,,\nonumber \\ \langle r^2 \rangle ^\pi _V= & {} \frac{1}{(4\pi F_\pi )^2} \left\{ \ln \frac{\Lambda _6^2}{M_\pi ^2}-1 +2\,\xi \left( \!\ln \frac{\Omega _{r_V}^2}{M_{\pi }^2} \!\right) ^2\right. \nonumber \\&\left. \quad \quad \quad \quad \quad +6 \xi \,k_{r_V}+{\mathcal {O}}(\xi ^2)\right\} \,,\nonumber \\ c_S= & {} \frac{1}{(4\pi F_\pi M_{\pi })^2} \left\{ \frac{19}{120} + \xi \left[ \frac{43}{36} \left( \! \ln \frac{\Omega _{c_S}^2}{M_{\pi }^2} \!\right) ^2 + k_{c_S} \right] \right\} \,,\nonumber \\ c_V= & {} \frac{1}{(4\pi F_\pi M_{\pi })^2} \left\{ \frac{1}{60}+\xi \left[ \frac{1}{72} \left( \! \ln \frac{\Omega _{c_V}^2}{M_{\pi }^2} \!\right) ^2 + k_{c_V} \right] \right\} \,,\nonumber \\ \end{aligned}$$
(101)

where

$$\begin{aligned} \ln \frac{\Omega _{r_S}^2}{M_{\pi }^2}= & {} \frac{1}{29}\,\left( 31\,\ln \frac{\Lambda _1^2}{M_\pi ^2}+34\,\ln \frac{\Lambda _2^2}{M_\pi ^2}-36\,\ln \frac{\Lambda _4^2}{M_\pi ^2}+\frac{145}{24}\right) \,,\nonumber \\ \ln \frac{\Omega _{r_V}^2}{M_{\pi }^2}= & {} \frac{1}{2}\,\left( \ln \frac{\Lambda _1^2}{M_\pi ^2}-\ln \frac{\Lambda _2^2}{M_\pi ^2}+\ln \frac{\Lambda _4^2}{M_\pi ^2}+\ln \frac{\Lambda _6^2}{M_\pi ^2}-\frac{31}{12}\right) \,,\nonumber \\ \ln \frac{\Omega _{c_S}^2}{M_{\pi }^2}= & {} \frac{43}{63}\,\left( 11\,\ln \frac{\Lambda _1^2}{M_\pi ^2}+14\,\ln \frac{\Lambda _2^2}{M_\pi ^2}+18\,\ln \frac{\Lambda _4^2}{M_\pi ^2}-\frac{6041}{120}\right) \,,\nonumber \\ \ln \frac{\Omega _{c_V}^2}{M_{\pi }^2}= & {} \frac{1}{72}\,\left( 2\ln \frac{\Lambda _1^2}{M_\pi ^2}-2\ln \frac{\Lambda _2^2}{M_\pi ^2}-\ln \frac{\Lambda _6^2}{M_\pi ^2}-\frac{26}{30}\right) \,,\nonumber \\ \end{aligned}$$
(102)

and \(k_{r_S},k_{r_V}\) and \(k_{c_S},k_{c_V}\) are independent of the quark masses. Their expression in terms of the \(\ell _i\) and of the \({\mathcal {O}}(p^6)\) constants \(c_M,c_F\) is known but will not be reproduced here.
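For orientation, the NLO truncation of the radii in Eq. (101) can be evaluated as follows, using \(\ln (\Lambda _n^2/M_\pi ^2)={\bar{\ell }}_n+\ln (M_{\pi ,\mathrm {phys}}^2/M_\pi ^2)\) from Eq. (90). The values of \({\bar{\ell }}_4\), \({\bar{\ell }}_6\), \(F_\pi \) and the reference pion mass are hypothetical inputs, and the \({\mathcal {O}}(\xi )\) terms are dropped.

```python
import math

HBARC_MEV_FM = 197.3269804  # hbar * c in MeV fm

def nlo_radii_fm2(m_pi, f_pi, lbar4, lbar6, m_pi_phys=135.0):
    """NLO truncation of Eq. (101): returns (<r^2>_S, <r^2>_V) in fm^2.

    m_pi, f_pi, m_pi_phys in MeV; lbar4, lbar6 are the SU(2) LECs of Eq. (90).
    """
    log_phys = math.log(m_pi_phys**2 / m_pi**2)
    norm = HBARC_MEV_FM**2 / (4.0 * math.pi * f_pi) ** 2
    r2_s = norm * (6.0 * (lbar4 + log_phys) - 6.5)
    r2_v = norm * (lbar6 + log_phys - 1.0)
    return r2_s, r2_v

# Hypothetical inputs, for illustration only
for m_pi in (135.0, 250.0, 400.0):
    r2_s, r2_v = nlo_radii_fm2(m_pi, 92.2, lbar4=4.0, lbar6=15.0)
    print(f"M_pi = {m_pi:5.1f} MeV: <r^2>_S = {r2_s:.3f} fm^2, <r^2>_V = {r2_v:.3f} fm^2")
```

Both radii grow logarithmically as the pion mass is lowered at fixed \(F_\pi \), which is the chiral logarithm that a chiral extrapolation of lattice data has to capture.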

The SU(3) formula for the slope of the pion vector form factor reads, to NLO [242],

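$$\begin{aligned} \langle r^2 \rangle ^\pi _V = -\frac{1}{32\pi ^2F_0^2} \left\{ 3+2\ln \frac{M_{\pi }^2}{\mu ^2}+\ln \frac{M_K^2}{\mu ^2} \right\} +\frac{12L_9}{F_0^2}, \end{aligned}$$
(103)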

while the expression \(\langle r^2\rangle _S^\mathrm {oct}\) for the octet part of the scalar radius does not contain any NLO low-energy constant at 1-loop order [242] – contrary to the situation in SU(2), see Eq. (101).

The difference between the quark-line connected and the full (i.e., containing the connected and the disconnected pieces) scalar pion form factor has been investigated by means of \(\chi \)PT in Ref. [293]. It is expected that the technique used can be applied to a large class of observables relevant in QCD phenomenology.

As a point of practical interest let us remark that there are no finite-volume correction formulae for the mean-square radii \(\langle r^2\rangle _{V,S}\) and the curvatures \(c_{V,S}\). The lattice data for \(F_{V,S}(t)\) need to be corrected, point by point in t, for finite-volume effects. In fact, if a given \(\sqrt{t}\) is realized through several inequivalent \(p_1\!-\!p_2\) combinations, the level of agreement after the correction has been applied is indicative of how well higher-order and finite-volume effects are under control.

5.1.4 Goldstone boson scattering in a finite volume

The scattering of pseudoscalar octet mesons off each other (mostly \(\pi \)\(\pi \) and \(\pi \)K scattering) is a useful approach to determine \(\chi \)PT low-energy constants [288, 294,295,296,297]. This statement holds true both in experiment and on the lattice. We would like to point out that the main difference between these approaches is not so much the discretization of space-time, but rather the Minkowskian versus Euclidean setup.

In infinite-volume Minkowski space-time, 4-point Green’s functions can be evaluated (e.g., in experiment) for a continuous range of (on-shell) momenta, as captured, for instance, by the Mandelstam variable s. For a given isospin channel \(I=0\) or \(I=2\) the \(\pi \)\(\pi \) scattering phase shift \(\delta ^{I}(s)\) can be determined for a variety of s values, and by matching to \(\chi \)PT some low-energy constants can be determined (see below). In infinite-volume Euclidean space-time, such 4-point Green’s functions can only be evaluated at kinematic thresholds; this is the content of the so-called Maiani-Testa theorem [298]. However, in the Euclidean case, the finite volume comes to our rescue, as first pointed out by Lüscher [299,300,301,302]. By comparing the energy of the (interacting) two-pion system in a box with finite spatial extent L to twice the energy of a pion (with identical bare parameters) in infinite volume, information on the scattering length can be obtained. In particular, in the (somewhat idealized) situation where one can “scan” through a narrowly spaced set of box-sizes L, such information can be reconstructed in an efficient way.
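The relation between the energy shift and the scattering length is not written out in this section. As an illustration we assume the commonly quoted large-volume threshold expansion, \(\Delta E=-\frac{4\pi a}{ML^3}\left[ 1+c_1\frac{a}{L}+c_2\frac{a^2}{L^2}\right] +{\mathcal {O}}(L^{-6})\), with \(c_1=-2.837297\) and \(c_2=6.375183\), in a sign convention where \(a>0\) corresponds to attraction. Both the formula and the input numbers below are assumptions of this sketch.

```python
import math

C1, C2 = -2.837297, 6.375183  # geometric constants of the assumed expansion

def energy_shift(a, m, box):
    """Two-particle ground-state energy minus 2m, up to O(box^-6).

    a: S-wave scattering length (a > 0 attraction, a < 0 repulsion),
    m: single-particle mass, box: spatial box length;
    all in consistent units with hbar = c = 1.
    """
    ratio = a / box
    return -4.0 * math.pi * a / (m * box**3) * (1.0 + C1 * ratio + C2 * ratio**2)

# Hypothetical inputs in units of the inverse pion mass
M = 1.0
for box in (3.0, 4.0, 6.0):
    print(f"L = {box}: dE(I=2-like) = {energy_shift(-0.045, M, box):+.6f}, "
          f"dE(I=0-like) = {energy_shift(0.22, M, box):+.6f}")
```

Scanning several box sizes, as described above, amounts to fitting this function of L to the measured shifts with a as the free parameter.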

We begin with a brief summary of the relevant formulae from \(\chi \)PT in SU(2) terminology. In the x-expansion the formulae for \(a_\ell ^I\) with \(\ell =0\) and \(I=0,2\) are found in Ref. [277]

$$\begin{aligned} a_0^0M_{\pi }= & {} +\frac{7M^2}{32\pi F^2} \bigg \{ 1+\frac{5M^2}{84\pi ^2 F^2}\nonumber \\&\times \left[ {{\bar{\ell }}}_1+2{{\bar{\ell }}}_2-\frac{9}{10}{{\bar{\ell }}}_3 +\frac{21}{8}\right] +{\mathcal {O}}(x^2) \bigg \} \;, \end{aligned}$$
(104)
$$\begin{aligned} a_0^2M_{\pi }= & {} -\frac{ M^2}{16\pi F^2} \bigg \{ 1-\frac{ M^2}{12\pi ^2 F^2}\left[ {{\bar{\ell }}}_1+2{{\bar{\ell }}}_2\,+\,\frac{ 3}{8}\right] \nonumber \\&\quad \quad \quad \quad \quad +{\mathcal {O}}(x^2) \bigg \} \;, \end{aligned}$$
(105)

where we deviate from the \(\chi \)PT habit of absorbing a factor \(-M_{\pi }\) into the scattering length (relative to the convention used in quantum mechanics), since we include just a minus sign but not the factor \(M_{\pi }\). Hence, our \(a_\ell ^I\) have the dimension of a length so that all quark- or pion-mass dependence is explicit (as is most convenient for the lattice community). But the sign convention is the one of the chiral community (where \(a_\ell ^IM_{\pi }>0\) means attraction and \(a_\ell ^IM_{\pi }<0\) means repulsion).

An important difference between the two scattering lengths is evident already at tree-level. The isospin-0 S-wave scattering length (104) is large and positive, while the isospin-2 counterpart (105) is smaller by a factor \(\sim 3.5\) (in absolute magnitude) and negative. Hence, in the channel with \(I=0\) the interaction is attractive, while in the channel with \(I=2\) the interaction is repulsive and significantly weaker. In this convention experimental results, evaluated with the unitarity constraint genuine to any local quantum field theory, read \(a_0^0M_{\pi }=0.2198(46)_\mathrm {stat}(16)_\mathrm {syst}(64)_\mathrm {theo}\) and \(a_0^2M_{\pi }=-0.0445(11)_\mathrm {stat}(4)_\mathrm {syst}(8)_\mathrm {theo}\) [288, 303,304,305]. The ratio between the two (absolute) central values is about 4.9, i.e., larger than the tree-level value of 3.5, and this suggests that NLO contributions to \(a_0^0\) might be more relevant than NLO contributions to \(a_0^2\).
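
As a simple numerical illustration of the tree-level statement, the prefactors of Eqs. (104, 105) can be evaluated with M and F replaced by \(M_{\pi }\) and \(F_\pi \), which is legitimate at this order. The following Python sketch assumes \(M_{\pi }=139.57\,\mathrm {MeV}\) (our choice of input, not specified in the text) and \(\sqrt{2}F_\pi =130.41\,\mathrm {MeV}\); the function names are purely illustrative.

```python
import math

# Assumed inputs (illustrative): charged-pion mass and sqrt(2)*F_pi = 130.41 MeV.
M_PI = 139.57                  # MeV
F_PI = 130.41 / math.sqrt(2)   # MeV


def a00_tree(m_pi: float = M_PI, f_pi: float = F_PI) -> float:
    """Tree-level a_0^0 * M_pi = +7 M_pi^2 / (32 pi F_pi^2)."""
    return 7.0 * m_pi**2 / (32.0 * math.pi * f_pi**2)


def a02_tree(m_pi: float = M_PI, f_pi: float = F_PI) -> float:
    """Tree-level a_0^2 * M_pi = -M_pi^2 / (16 pi F_pi^2)."""
    return -m_pi**2 / (16.0 * math.pi * f_pi**2)


if __name__ == "__main__":
    print(f"a_0^0 M_pi (tree) = {a00_tree():+.4f}")
    print(f"a_0^2 M_pi (tree) = {a02_tree():+.4f}")
    print(f"ratio             = {a00_tree() / a02_tree():+.2f}")
```

With these inputs the tree-level values come out near 0.16 and \(-0.046\), and their ratio is \(-3.5\) independently of the inputs. Comparing with the experimental numbers quoted above indicates that most of the higher-order correction sits in the \(I=0\) channel.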

By means of \(M^2/(4\pi F)^2=M_{\pi }^2/(4\pi F_\pi )^2\{1+\frac{1}{2}\xi \ln (\Lambda _3^2/M_{\pi }^2)+2\xi \ln (\Lambda _4^2/M_{\pi }^2)+{\mathcal {O}}(\xi ^2)\}\) or equivalently through \(M^2/(4\pi F)^2=M_{\pi }^2/(4\pi F_\pi )^2\{1+\frac{1}{2}\xi {{\bar{\ell }}}_3+2\xi {{\bar{\ell }}}_4+{\mathcal {O}}(\xi ^2)\}\) Eqs. (104, 105) may be brought into the form

$$\begin{aligned} a_0^0M_{\pi }= & {} +\frac{7M_{\pi }^2}{32\pi F_\pi ^2} \bigg \{ 1 +\xi \frac{1}{2}{{\bar{\ell }}}_3 +\xi 2{{\bar{\ell }}}_4 \nonumber \\&+\,\xi \left[ \frac{20}{21}{{\bar{\ell }}}_1+\frac{40}{21}{{\bar{\ell }}}_2-\frac{18}{21}{{\bar{\ell }}}_3 +\frac{ 5}{ 2}\right] +{\mathcal {O}}(\xi ^2) \bigg \} \;, \end{aligned}$$
(106)
$$\begin{aligned} a_0^2M_{\pi }= & {} -\frac{ M_{\pi }^2}{16\pi F_\pi ^2} \bigg \{ 1 +\xi \frac{1}{2}{{\bar{\ell }}}_3 +\xi 2{{\bar{\ell }}}_4 \nonumber \\&-\,\xi \left[ \frac{ 4}{ 3}{{\bar{\ell }}}_1+\frac{ 8}{ 3}{{\bar{\ell }}}_2 +\frac{ 1}{ 2}\right] +{\mathcal {O}}(\xi ^2) \bigg \} \;. \end{aligned}$$
(107)

Finally, these expressions can be summarized as

$$\begin{aligned} a_0^0M_{\pi }= & {} +\frac{7M_{\pi }^2}{32\pi F_\pi ^2} \bigg \{ 1+\frac{9M_{\pi }^2}{32\pi ^2F_\pi ^2}\ln \frac{(\lambda _0^0)^2}{M_{\pi }^2}+{\mathcal {O}}(\xi ^2) \bigg \} \;, \nonumber \\ \end{aligned}$$
(108)
$$\begin{aligned} a_0^2M_{\pi }= & {} -\frac{ M_{\pi }^2}{16\pi F_\pi ^2} \bigg \{ 1-\frac{3M_{\pi }^2}{32\pi ^2F_\pi ^2}\ln \frac{(\lambda _0^2)^2}{M_{\pi }^2}+{\mathcal {O}}(\xi ^2) \bigg \} \;, \nonumber \\ \end{aligned}$$
(109)

with the abbreviations

$$\begin{aligned} \frac{9}{2}\ln \frac{\left( \lambda _0^0\right) ^2}{M_{\pi ,\mathrm {phys}}^2}= & {} \frac{20}{21}{{\bar{\ell }}}_1 +\frac{40}{21}{{\bar{\ell }}}_2 -\frac{5}{14}{{\bar{\ell }}}_3 +2{{\bar{\ell }}}_4 +\frac{5}{2} \;, \end{aligned}$$
(110)
$$\begin{aligned} \frac{3}{2}\ln \frac{\left( \lambda _0^2\right) ^2}{M_{\pi ,\mathrm {phys}}^2}= & {} \frac{ 4}{ 3}{{\bar{\ell }}}_1 +\frac{ 8}{ 3}{{\bar{\ell }}}_2 -\frac{1}{ 2}{{\bar{\ell }}}_3 -2{{\bar{\ell }}}_4 +\frac{1}{2} \;, \end{aligned}$$
(111)

where \(\lambda _\ell ^I\) with \(\ell =0\) and \(I=0,2\) are scales like the \(\Lambda _i\) in \({{\bar{\ell }}}_i=\ln (\Lambda _i^2/M_{\pi ,\mathrm {phys}}^2)\) for \(i\in \{1,2,3,4\}\) (albeit they are not independent from the latter). Here we made use of the fact that \(M_{\pi }^2/M_{\pi ,\mathrm {phys}}^2=1+{\mathcal {O}}(\xi )\) and thus \(\xi \ln (M_{\pi }^2/M_{\pi ,\mathrm {phys}}^2)={\mathcal {O}}(\xi ^2)\). In the absence of any knowledge on the \({{\bar{\ell }}}_i\) one would assume \(\lambda _0^0\simeq \lambda _0^2\), and with this input Eqs. (108, 109) suggest that the NLO contribution to \(|a_0^0|\) is by a factor \(\sim 9\) larger than the NLO contribution to \(|a_0^2|\). The experimental numbers quoted before clearly support this view.
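
The bookkeeping that leads from Eqs. (106, 107) to Eqs. (108)–(111) is easily checked with exact rational arithmetic. The sketch below is such a cross-check and nothing more; the value \({{\bar{\ell }}}_3=2.9\) is the central value quoted in this section, while the inputs for \({{\bar{\ell }}}_1\), \({{\bar{\ell }}}_2\), \({{\bar{\ell }}}_4\) are placeholders chosen by us for illustration.

```python
import math
from fractions import Fraction as Fr


def bracket_106(l1, l2, l3, l4):
    """Coefficient of xi inside the braces of Eq. (106)."""
    return Fr(1, 2) * l3 + 2 * l4 + (
        Fr(20, 21) * l1 + Fr(40, 21) * l2 - Fr(18, 21) * l3 + Fr(5, 2)
    )


def bracket_107(l1, l2, l3, l4):
    """Coefficient of xi inside the braces of Eq. (107)."""
    return Fr(1, 2) * l3 + 2 * l4 - (Fr(4, 3) * l1 + Fr(8, 3) * l2 + Fr(1, 2))


def rhs_110(l1, l2, l3, l4):
    """Right-hand side of Eq. (110), i.e. (9/2) ln[(lambda_0^0)^2 / M_phys^2]."""
    return Fr(20, 21) * l1 + Fr(40, 21) * l2 - Fr(5, 14) * l3 + 2 * l4 + Fr(5, 2)


def rhs_111(l1, l2, l3, l4):
    """Right-hand side of Eq. (111), i.e. (3/2) ln[(lambda_0^2)^2 / M_phys^2]."""
    return Fr(4, 3) * l1 + Fr(8, 3) * l2 - Fr(1, 2) * l3 - 2 * l4 + Fr(1, 2)


# Placeholder inputs: only l3 = 2.9 is taken from the text, the rest is illustrative.
LBARS = (Fr(-2, 5), Fr(43, 10), Fr(29, 10), Fr(22, 5))

if __name__ == "__main__":
    # Eq. (108) has +(9/2) xi ln(...), Eq. (109) has -(3/2) xi ln(...).
    assert bracket_106(*LBARS) == rhs_110(*LBARS)
    assert bracket_107(*LBARS) == -rhs_111(*LBARS)
    # Scales in units of M_phys: lambda/M = exp(rhs/9) and exp(rhs/3), respectively.
    print("lambda_0^0 / M_phys =", math.exp(float(rhs_110(*LBARS)) / 9))
    print("lambda_0^2 / M_phys =", math.exp(float(rhs_111(*LBARS)) / 3))
```

Because the identities hold for arbitrary rational inputs, the check does not depend on the placeholder values.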

Given that all of this sounds like a complete success story for the determination of the scattering lengths \(a_0^0\) and \(a_0^2\), one may wonder whether lattice QCD is helpful at all. It is, because the “experimental” evaluation of these scattering lengths builds on a constraint between these two quantities that, in turn, is based on a (rather nontrivial) dispersive evaluation of scattering phase shifts [288, 303,304,305]. Hence, to overcome this possible loophole, an independent lattice determination of \(a_0^0\) and/or \(a_0^2\) is highly welcome.

On the lattice \(a_0^2\) is much easier to determine than \(a_0^0\), since the former quantity does not involve quark-line disconnected contributions. The main upshot of such activities (to be reviewed below) is that the lattice determination of \(a_0^2M_{\pi }\) at the physical mass point is in perfect agreement with the experimental numbers quoted before, thus supporting the view that the scalar condensate is – at least in the SU(2) case – the dominant order parameter, and the original estimate \({{\bar{\ell }}}_3=2.9\pm 2.4\) is correct (see below). Still, from a lattice perspective it is natural to see a determination of \(a_0^0M_{\pi }\) and/or \(a_0^2M_{\pi }\) as a means to access the specific linear combinations of \({{\bar{\ell }}}_i\) with \(i\in \{1,2,3,4\}\) defined in Eqs. (110, 111).

In passing we note that an alternative version of Eqs. (108, 109) is used in the literature, too. For instance Refs. [306,307,308,309,310] give their results in the form

$$\begin{aligned} a_0^0M_{\pi }= & {} +\frac{7M_{\pi }^2}{32\pi F_\pi ^2} \bigg \{ 1+\frac{M_{\pi }^2}{32\pi ^2F_\pi ^2}\left[ \ell ^{I=0}_{\pi \pi }+5-9\ln \frac{M_{\pi }^2}{2F_\pi ^2}\right] \nonumber \\&\quad \quad \quad \quad \quad +\,{\mathcal {O}}(\xi ^2) \bigg \} \;, \end{aligned}$$
(112)
$$\begin{aligned} a_0^2M_{\pi }= & {} -\frac{M_{\pi }^2}{16\pi F_\pi ^2} \bigg \{ 1-\frac{M_{\pi }^2}{32\pi ^2F_\pi ^2}\left[ \ell ^{I=2}_{\pi \pi }+1-3\ln \frac{M_{\pi }^2}{2F_\pi ^2}\right] \nonumber \\&\quad \quad \quad \quad \quad +\,{\mathcal {O}}(\xi ^2) \bigg \} \;, \end{aligned}$$
(113)

where the quantities (used to quote the results of the lattice calculation)

$$\begin{aligned} \ell ^{I=0}_{\pi \pi }= & {} \frac{40}{21}{{\bar{\ell }}}_1+\frac{80}{21}{{\bar{\ell }}}_2-\frac{5}{7}{{\bar{\ell }}}_3+4{{\bar{\ell }}}_4+9\ln \frac{M_{\pi ,\mathrm {phys}}^2}{2F^2_{\pi ,\mathrm {phys}}} \;, \end{aligned}$$
(114)
$$\begin{aligned} \ell ^{I=2}_{\pi \pi }= & {} \frac{ 8}{ 3}{{\bar{\ell }}}_1+\frac{16}{ 3}{{\bar{\ell }}}_2- {{\bar{\ell }}}_3-4{{\bar{\ell }}}_4+3\ln \frac{M_{\pi ,\mathrm {phys}}^2}{2F^2_{\pi ,\mathrm {phys}}} \;, \end{aligned}$$
(115)

amount to linear combinations of the \(\ell _i^\mathrm {ren}(\mu _\mathrm {ren})\) that, due to the explicit logarithms in Eqs. (114, 115), are effectively renormalized at the scale \(\mu _\mathrm {ren}=f_{\pi ,\mathrm {phys}}=\sqrt{2}F_{\pi ,\mathrm {phys}}\). Note that in these equations the dependence on the physical pion mass in the logarithms cancels the one that comes from the \({{\bar{\ell }}}_i\), so that the left-hand sides bear no knowledge of \(M_{\pi ,\mathrm {phys}}\). This alternative form is slightly different from Eqs. (108, 109). Exact equality would be reached upon substituting \(F_\pi ^2 \rightarrow F_{\pi ,\mathrm {phys}}^2\) in the logarithms of Eqs. (112, 113). Upon expanding \(F_\pi ^2/F_{\pi ,\mathrm {phys}}^2\) and subsequently the logarithm, one realizes that this difference amounts to a term \({\mathcal {O}}(\xi )\) within the square bracket. It thus makes up for a difference at the NNLO, which is beyond the scope of these formulae.
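
The relation between the two parameterizations can be made explicit. Inserting Eqs. (110, 111) into Eqs. (114, 115), the logarithms of \(M_{\pi ,\mathrm {phys}}\) cancel and one finds (this is our own rearrangement, given here only as a consistency check and not taken from the cited references)

$$\begin{aligned} \ell ^{I=0}_{\pi \pi }+5&= 9\ln \frac{\left( \lambda _0^0\right) ^2}{2F_{\pi ,\mathrm {phys}}^2} \;, \\ \ell ^{I=2}_{\pi \pi }+1&= 3\ln \frac{\left( \lambda _0^2\right) ^2}{2F_{\pi ,\mathrm {phys}}^2} \;. \end{aligned}$$

Accordingly, the square brackets in Eqs. (112, 113) equal \(9\ln [(\lambda _0^0)^2/M_{\pi }^2]+9\ln (F_\pi ^2/F_{\pi ,\mathrm {phys}}^2)\) and \(3\ln [(\lambda _0^2)^2/M_{\pi }^2]+3\ln (F_\pi ^2/F_{\pi ,\mathrm {phys}}^2)\), respectively. The first term in each case reproduces Eqs. (108, 109), while the second term vanishes at the physical point and is of order \(\xi \) away from it.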

We close by mentioning a few works that elaborate on specific issues in \(\pi \)\(\pi \) scattering relevant to the lattice. Ref. [311] develops mixed-action \(\chi \)PT for 2 and 2 + 1 flavours of staggered sea quarks and Ginsparg-Wilson valence quarks, Refs. [312, 313] work out scattering formulae in Wilson fermion \(\chi \)PT, and Ref. [314] lists connected and disconnected contractions in \(\pi \)\(\pi \) scattering.

5.1.5 Partially quenched and mixed action formulations

The term “partially quenched QCD” is used in two ways. For heavy quarks (c, b and sometimes s) it usually means that these flavours are included in the valence sector, but not in the functional determinant, i.e., the sea sector. For the light quarks (u, d and sometimes s) it means that they are present in both the valence and the sea sector of the theory, but with different masses (e.g., a series of valence quark masses is evaluated on an ensemble with fixed sea-quark masses).

The program of extending the standard (unitary) SU(3) theory to the (second version of) “partially quenched QCD” has been completed at the 2-loop (NNLO) level for masses and decay constants [315]. These formulae tend to be complicated, with the consequence that a state-of-the-art analysis with \({\mathcal {O}}(2000)\) bootstrap samples on \({\mathcal {O}}(20)\) ensembles with \({\mathcal {O}}(5)\) masses each [and hence \({\mathcal {O}}(200\,000)\) different fits] will require significant computational resources. For a summary of recent developments in \(\chi \)PT relevant to lattice QCD we refer to Ref. [316]. The SU(2) partially quenched formulae can be obtained from the SU(3) ones by “integrating out the strange quark”; this involves a matching of the two theories. At NLO, they can be found in Ref. [317] by setting the lattice artifact terms from the staggered \(\chi \)PT form to zero.

The theoretical underpinning of how “partial quenching” is to be understood in the (properly extended) chiral framework is given in Ref. [318]. Specifically, for partially quenched QCD with staggered quarks it is shown that a transfer matrix can be constructed that is not Hermitian but bounded, and can thus be used to construct correlation functions in the usual way. The program of calculating all observables in the p-regime in finite-volume to two loops, first completed in the unitary theory [319, 320], has been carried out for the partially quenched case, too [321].

A further extension of the \(\chi \)PT framework concerns the lattice effects that arise in partially quenched simulations where sea and valence quarks are implemented with different lattice fermion actions [245, 322,323,324,325,326,327,328]. This extension is usually referred to as “mixed-action \(\chi \)PT” or “mixed-action partially-quenched \(\chi \)PT”.

5.1.6 Correlation functions in the \(\epsilon \)-regime

The finite-size effects encountered in lattice calculations can be used to determine some of the LECs of QCD. In order to illustrate this point, we focus on the two lightest quarks, take the isospin limit \(m_u=m_d=m\) and consider a box of size \(L_s\) in the three space directions and size \(L_t\) in the time direction. If m is sent to zero at fixed box size, chiral symmetry is restored, and the zero-momentum mode of the pion field becomes nonperturbative. An intuitive way to understand the regime with \(ML<1\) starts from considering the pion propagator \(G(p)=1/(p^2\,+\,M^2)\) in finite volume. For \(ML\gtrsim 1\) and \(p\sim 1/L\), \(G(p)\sim L^2\) for small momenta, including \(p=0\). But when M becomes of order \(1/L^2\), \(G(0)\propto L^4\gg G(p\ne 0)\sim L^2\). The \(p=0\) mode of the pion field becomes nonperturbative, and the integration over this mode restores chiral symmetry in the limit \(m\rightarrow 0\).
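
The dominance of the zero mode can be illustrated with a few lines of Python. The sketch below assumes periodic boundary conditions, so that the smallest nonzero momentum is \(p_\mathrm {min}=2\pi /L\), and sets \(M=c/L^2\) with a constant c that we choose arbitrarily; all quantities are in arbitrary but consistent units.

```python
import math


def propagator(p: float, mass: float) -> float:
    """Free pion propagator G(p) = 1 / (p^2 + M^2)."""
    return 1.0 / (p**2 + mass**2)


def zero_mode_dominance(box_size: float, c: float = 1.0) -> float:
    """Ratio G(0) / G(p_min) for M = c / L^2 and p_min = 2*pi / L.

    Both the constant c and the choice p_min = 2*pi/L are assumptions
    made for illustration.
    """
    mass = c / box_size**2
    p_min = 2.0 * math.pi / box_size
    return propagator(0.0, mass) / propagator(p_min, mass)


if __name__ == "__main__":
    for box_size in (5.0, 10.0, 20.0):
        print(f"L = {box_size:5.1f}   G(0)/G(p_min) = {zero_mode_dominance(box_size):.1f}")
```

The ratio equals \(1+(2\pi L/c)^2\) and thus grows like \(L^2\), in line with \(G(0)\propto L^4\) versus \(G(p\ne 0)\sim L^2\).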

The pion effective action for the zero-momentum field depends only on the combination \(\mu =m\Sigma V\), the symmetry-restoration parameter, where \(V=L_s^3 L_t\) [329]. In the \(\epsilon \)-regime, where \(ML\ll 1\) with \(L\equiv V^{1/4}\) and hence \(m \ll 1/(2BL^2)\), all other terms in the effective action are sub-dominant in powers of \(\epsilon \sim 1/L\). This amounts to a reordering of the chiral expansion, based on \(m\sim \epsilon ^4\) in the \(\epsilon \)-regime [329]. In the p-regime, with \(m\sim \epsilon ^2\) or equivalently \(ML\gtrsim 1\), finite-volume corrections are of order \(\int d^4p\,e^{ipx}\,G(p)|_{x\sim L}\sim e^{-ML}\). In the \(\epsilon \)-regime the chiral expansion is an expansion in powers of \(1/(\Lambda _\mathrm {QCD}L)\sim 1/(FL)\).

As an example, we consider the correlator of the axial charge carried by the two lightest quarks, \(q(x)=\{u(x),d(x)\}\). The axial current and the pseudoscalar density are given by

$$\begin{aligned} \begin{aligned} A_\mu ^i(x)&= {\bar{q}}(x){\frac{1}{2}} \tau ^i\,\gamma _\mu \gamma _5\,q(x)\,, \\ P^i(x)&= {\bar{q}}(x){\frac{1}{2}} \tau ^i\,\mathrm {i} \gamma _5\,q(x)\,, \end{aligned} \end{aligned}$$
(116)

where \(\tau ^1, \tau ^2,\tau ^3\) are the Pauli matrices in flavour space. In Euclidean space, the correlators (at zero spatial momentum) of the axial charge and the pseudoscalar density are given by

$$\begin{aligned} \begin{aligned} \delta ^{ik}C_{AA}(t)&= L_s^3\int d^3\vec {x}\;\langle A_4^i(\vec {x},t) A_4^k(0)\rangle \,, \\ \delta ^{ik}C_{PP}(t)&= L_s^3\int d^3\vec {x}\;\langle P^i(\vec {x},t) P^k(0)\rangle \,. \end{aligned} \end{aligned}$$
(117)

\(\chi \)PT yields explicit finite-size scaling formulae for these quantities [287, 330, 331]. In the \(\epsilon \)-regime, the expansion starts with

$$\begin{aligned} C_{AA}(t)= & {} \frac{F^2L_s^3}{L_t}\left[ a_A+ \frac{L_t}{F^2L_s^3}\,b_A\,h_1\left( \frac{t}{L_t} \right) +{\mathcal {O}}(\epsilon ^4)\right] , \nonumber \\ C_{PP}(t)= & {} \Sigma ^2L_s^6\left[ a_P+\frac{L_t}{F^2L_s^3}\,b_P\,h_1\left( \frac{t}{L_t} \right) +{\mathcal {O}}(\epsilon ^4)\right] ,\nonumber \\ \end{aligned}$$
(118)

where the coefficients \(a_A\), \(b_A\), \(a_P\), \(b_P\) stand for quantities of \({\mathcal {O}}(\epsilon ^0)\). They can be expressed in terms of the variables \(L_s\), \(L_t\) and m and involve only the two leading low-energy constants F and \(\Sigma \). In fact, at leading order only the combination \(\mu =m\,\Sigma \,L_s^3 L_t\) matters, the correlators are t-independent and the dependence on \(\mu \) is fully determined by the structure of the groups involved in the pattern of spontaneous symmetry breaking. In the case of \(SU(2)\times SU(2)\) \(\rightarrow \) SU(2), relevant for QCD in the symmetry restoration region with two light quarks, the coefficients can be expressed in terms of Bessel functions. The t-dependence of the correlators starts showing up at \({\mathcal {O}}(\epsilon ^2)\), in the form of a parabola, viz., \(h_1(\tau )=\frac{1}{2}\left[ \left( \tau -\frac{1}{2} \right) ^2-\frac{1}{12}\right] \). Explicit expressions for \(a_A\), \(b_A\), \(a_P\), \(b_P\) can be found in Refs. [287, 330, 331], where some of the correlation functions are worked out to NNLO. By matching the finite-size scaling of correlators computed on the lattice with these predictions one can extract F and \(\Sigma \). A way to deal with the numerical challenges germane to the \(\epsilon \)-regime has been described [332].
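
The shape function \(h_1\) and the structure of the first line of Eq. (118) can be sketched in Python as follows. The coefficients \(a_A\) and \(b_A\) are treated as external inputs (their explicit form is given in the references cited above), and the function names are ours.

```python
from fractions import Fraction


def h1(tau):
    """Parabola h_1(tau) = (1/2) * [(tau - 1/2)^2 - 1/12]."""
    return Fraction(1, 2) * ((tau - Fraction(1, 2)) ** 2 - Fraction(1, 12))


def c_aa(t, a_A, b_A, F, L_s, L_t):
    """First line of Eq. (118), truncated after the b_A term.

    a_A and b_A are supplied by the caller; all dimensionful quantities
    must be given in consistent units.
    """
    prefactor = F**2 * L_s**3 / L_t
    return prefactor * (a_A + L_t / (F**2 * L_s**3) * b_A * h1(t / L_t))


if __name__ == "__main__":
    for tau in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
        print(f"h1({tau}) = {h1(tau)}")
```

The parabola is symmetric about \(\tau =1/2\), takes the value 1/12 at \(\tau =0,1\) and \(-1/24\) at the midpoint, and averages to zero over one period. A fit of lattice data to this form therefore determines \(a_A\) from the mean of the correlator and \(b_A\) from its curvature in t.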

The fact that the representation of the correlators to NLO is not “contaminated” by higher-order unknown LECs, makes the \(\epsilon \)-regime potentially convenient for a clean extraction of the LO couplings. The determination of these LECs is then affected by different systematic uncertainties with respect to the standard case; simulations in this regime yield complementary information that can serve as a valuable cross-check to get a comprehensive picture of the low-energy properties of QCD.

The effective theory can also be used to study the distribution of the topological charge in QCD [329] and the various quantities of interest may be defined for a fixed value of this charge. The expectation values and correlation functions then not only depend on the symmetry restoration parameter \(\mu \), but also on the topological charge \(\nu \). The dependence on these two variables can explicitly be calculated. It turns out that the two-point correlation functions considered above retain the form (118), but the coefficients \(a_A\), \(b_A\), \(a_P\), \(b_P\) now depend on the topological charge as well as on the symmetry restoration parameter (see Refs. [333,334,335] for explicit expressions).

A specific issue with \(\epsilon \)-regime calculations is the scale setting. Ideally one would perform a p-regime study with the same bare parameters to measure a hadronic scale (e.g., the proton mass). In the literature, sometimes a gluonic scale, like the static force scale \(r_0\) [336] or the gradient flow scales \(t_0\) [271] or \(w_0\) [272], is used to avoid such expenses. However, it seems not entirely obvious to us that it is legitimate to identify such a gluonic scale with the length determined in the p-regime (e.g., by using \(r_0\simeq 0.48\,\mathrm{fm}\)).

It is important to stress that in the \(\epsilon \)-expansion higher-order finite-volume corrections might be significant, and the physical box size (in fm) should still be large in order to keep these distortions under control. The criteria for the chiral extrapolation and finite-volume effects are obviously different with respect to the p-regime. For these reasons we have to adjust the colour coding defined in Sect. 2.1 (see Sect. 5.2 for more details).

Recently, the effective theory has been extended to the “mixed regime” where some quarks are in the p-regime and some in the \(\epsilon \)-regime [337, 338]. In Ref. [339] a technique is proposed to smoothly connect the p- and \(\epsilon \)-regimes. In Ref. [340] the issue is reconsidered with a counting rule that is essentially the same as in the p-regime. In this new scheme, one can treat the IR fluctuations of the zero-mode nonperturbatively, while keeping the logarithmic quark-mass dependence of the p-regime.

Also first steps towards calculating higher n-point functions in the \(\epsilon \)-regime have been taken. For instance the electromagnetic pion form factor in QCD has been calculated to NLO in the \(\epsilon \)-expansion, and a way to get rid of the pion zero-momentum part has been proposed [341].

5.1.7 Energy levels of the QCD Hamiltonian in a box and \(\delta \)-regime

At low temperature, the properties of the partition function are governed by the lowest eigenvalues of the Hamiltonian. In the case of QCD, the lowest levels are due to the Nambu-Goldstone bosons and can be worked out with \(\chi \)PT [342]. In the chiral limit the level pattern follows the one of a quantum-mechanical rotator, i.e., \(E_\ell =\ell (\ell +1)/(2\,\Theta )\) with \(\ell =0,1,2,\ldots \). For a cubic spatial box and to leading order in the expansion in inverse powers of the box size \(L_s\), the moment of inertia is fixed by the value of the pion decay constant in the chiral limit, i.e., \(\Theta =F^2L_s^3\).
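
To get a feeling for the size of the level spacing, the leading-order rotator formula can be evaluated numerically. The sketch below uses \(\hbar c=197.327\,\mathrm {MeV\,fm}\) to convert the box size; the values of F and \(L_s\) in the example are illustrative inputs chosen by us.

```python
HBARC = 197.327  # MeV fm, conversion constant


def moment_of_inertia(F_mev: float, L_s_fm: float) -> float:
    """Leading-order Theta = F^2 L_s^3, returned in units of MeV^-1."""
    L_s = L_s_fm / HBARC  # box size in MeV^-1
    return F_mev**2 * L_s**3


def rotator_level(ell: int, F_mev: float, L_s_fm: float) -> float:
    """E_ell = ell (ell + 1) / (2 Theta) in MeV, valid in the chiral limit."""
    return ell * (ell + 1) / (2.0 * moment_of_inertia(F_mev, L_s_fm))


if __name__ == "__main__":
    F, L_s = 90.0, 2.0  # MeV and fm, illustrative inputs only
    for ell in range(4):
        print(f"E_{ell} = {rotator_level(ell, F, L_s):8.2f} MeV")
```

Since \(\Theta \propto L_s^3\), doubling the spatial box size reduces all excitation energies by a factor of eight, while the ratios \(E_2/E_1=3\), \(E_3/E_1=6\) are independent of F and \(L_s\).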

In order to analyse the dependence of the levels on the quark masses and on the parameters that specify the size of the box, a reordering of the chiral series is required, the so-called \(\delta \)-expansion. Regarding the spatial box-size, this regime is similar to the \(\epsilon \)-regime, i.e., \(ML_s\ll 1\), where \(M=\sqrt{2Bm}\) is the mass the pion would have in infinite volume. But the temporal box size is effectively infinite, since \(1\ll ML_t\) (and \(ML_t\ll 4\pi FL_t\) to enable the chiral approach at all), whereupon \(L_s\ll L_t\). The region where the properties of the system are controlled by this expansion is referred to as the \(\delta \)-regime [342]. Evaluating the chiral series in this regime, one finds that the expansion of the partition function goes in even inverse powers of \(FL_s\), that the rotator formula for the energy levels holds up to NNLO and the expression for the moment of inertia is now also known up to and including terms of order \((FL_s)^{-4}\) [343,344,345]. Since the level spectrum is governed by the value of the pion decay constant in the chiral limit, an evaluation of this spectrum on the lattice can be used to measure F. More generally, the evaluation of various observables in the \(\delta \)-regime offers an alternative method for a determination of some of the low-energy constants occurring in the effective Lagrangian. At present, however, the numerical results obtained in this way [346, 347] are not yet competitive with those found in the p- or \(\epsilon \)-regime. For recent theoretical investigations concerning the \(\delta \)-regime and how it matches onto the \(\epsilon \)-regime see Refs. [348, 349].

5.1.8 Other methods for the extraction of the low-energy constants

An observable that can be used to extract LECs is the topological susceptibility

$$\begin{aligned} \chi _t=\int d^4\!x\; \langle \omega (x) \omega (0)\rangle , \end{aligned}$$
(119)

where \(\omega (x)\) is the topological charge density,

$$\begin{aligned} \omega (x)=\frac{1}{32\pi ^2} \epsilon ^{\mu \nu \rho \sigma }{\mathrm{Tr}}\left[ F_{\mu \nu }(x)F_{\rho \sigma }(x)\right] . \end{aligned}$$
(120)

At infinite volume, the expansion of \(\chi _t\) in powers of the quark masses starts with [350]

$$\begin{aligned} \chi _t= & {} {\overline{m}}\,\Sigma \,\{1+{\mathcal {O}}(m)\}\,, \nonumber \\ {\overline{m}}\equiv & {} \left( \frac{1}{m_u}+\frac{1}{m_d}+\frac{1}{m_s}+\cdots \right) ^{-1}. \end{aligned}$$
(121)

The condensate \(\Sigma \) can thus be extracted from the properties of the topological susceptibility close to the chiral limit. The behaviour at finite volume, in particular in the region where the symmetry is restored, is discussed in Ref. [331]. The dependence on the vacuum angle \(\theta \) and the projection on sectors of fixed \(\nu \) have been studied in Ref. [329]. For a discussion of the finite-size effects at NLO, including the dependence on \(\theta \), we refer to Refs. [335, 351].
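
The leading term of Eq. (121) is straightforward to evaluate. The following sketch computes the reduced mass \({\overline{m}}\) and the leading-order susceptibility; the quark masses and the value of \(\Sigma ^{1/3}\) in the example are illustrative inputs chosen by us and are not results of this review.

```python
def reduced_mass(quark_masses):
    """m_bar = (1/m_u + 1/m_d + 1/m_s + ...)^(-1) for nonzero quark masses."""
    return 1.0 / sum(1.0 / m for m in quark_masses)


def chi_t_leading(quark_masses, sigma):
    """Leading-order chi_t = m_bar * Sigma, dropping the O(m) corrections."""
    return reduced_mass(quark_masses) * sigma


if __name__ == "__main__":
    m_ud, m_s = 3.4, 93.0   # MeV, illustrative inputs
    sigma = 270.0**3        # MeV^3, illustrative input
    chi = chi_t_leading([m_ud, m_ud, m_s], sigma)
    print(f"m_bar       = {reduced_mass([m_ud, m_ud, m_s]):.3f} MeV")
    print(f"chi_t^(1/4) = {chi ** 0.25:.1f} MeV")
```

For two degenerate light flavours \({\overline{m}}=m/2\), and a heavy additional flavour changes \({\overline{m}}\) only by a relative amount of order \(m/m_s\); the susceptibility is thus dominated by the lightest quarks.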

The role that the topological susceptibility plays in attempts to determine whether there is a large paramagnetic suppression when going from the \(N_{ f}=2\) to the \(N_{ f}=2\,+\,1\) theory has been highlighted in Ref. [352]. And the potential usefulness of higher moments of the topological charge distribution to determine LECs has been investigated in Ref. [353].

Another method for computing the quark condensate has been proposed in Ref. [354], where it is shown that starting from the Banks–Casher relation [355] one may extract the condensate from suitable (renormalizable) spectral observables, for instance the number of Dirac operator modes in a given interval. For those spectral observables higher-order corrections can be systematically computed in terms of the chiral effective theory. For recent implementations of this strategy, see Refs. [41, 50, 356]. As an aside let us remark that corrections to the Banks–Casher relation that come from a finite quark mass, a finite four-dimensional volume and (with Wilson-type fermions) a finite lattice spacing can be parameterized in a properly extended version of the chiral framework [357, 358].

An alternative strategy is based on the fact that at LO in the \(\epsilon \)-expansion the partition function in a given topological sector \(\nu \) is equivalent to the one of a chiral Random Matrix Theory (RMT) [359,360,361,362]. In RMT it is possible to extract the probability distributions of individual eigenvalues [363,364,365] in terms of two dimensionless variables \(\zeta =\lambda \Sigma V\) and \(\mu =m\Sigma V\), where \(\lambda \) represents the eigenvalue of the massless Dirac operator and m is the sea quark mass. More recently this approach has been extended to the Hermitian (Wilson) Dirac operator [366], which is easier to study in numerical simulations. Hence, if it is possible to match the QCD low-lying spectrum of the Dirac operator to the RMT predictions, then one may extract\(^{26}\) the chiral condensate \(\Sigma \). One issue with this method is that for the distributions of individual eigenvalues higher-order corrections are still not known in the effective theory, and this may introduce systematic effects that are hard\(^{27}\) to control. Another open question is that, while it is clear how the spectral density is renormalized [370], this is not the case for the individual eigenvalues, and one relies on assumptions. There have been many lattice studies [371,372,373,374,375] that investigate the matching of the low-lying Dirac spectrum with RMT. In this review the results of the LECs obtained in this way\(^{28}\) are not included.

5.2 Extraction of SU(2) low-energy constants

In this and the following sections we summarize the lattice results for the SU(2) and SU(3) LECs, respectively. In either case we first discuss the \({\mathcal {O}}(p^2)\) constants and then proceed to their \({\mathcal {O}}(p^4)\) counterparts. The \({\mathcal {O}}(p^2)\) LECs are determined from the chiral extrapolation of masses and decay constants or, alternatively, from a finite-size study of correlators in the \(\epsilon \)-regime. At order \(p^4\) some LECs affect two-point functions while others appear only in three- or four-point functions; the latter need to be determined from form factors or scattering amplitudes. The \(\chi \)PT analysis of the (nonlattice) phenomenological quantities is nowadays\(^{29}\) based on \({\mathcal {O}}(p^6)\) formulae. At this level the number of LECs explodes and we will not discuss any of these. We will, however, discuss how comparing different orders and different expansions (in particular the x versus \(\xi \)-expansion) can help to assess the theoretical uncertainties of the LECs determined on the lattice.

5.2.1 General remarks on the extraction of low-energy constants

The lattice results for the SU(2) LECs are summarized in Tables 19, 20, 21, 22 and Figs. 12, 13 and 14. The tables present our usual colour coding, which summarizes the main aspects related to the treatment of the systematic errors of the various calculations.

Table 19 Cubic root of the SU(2) quark condensate in \(\,\mathrm {MeV}\) units, in the \(\overline{\mathrm{MS}}\)-scheme, at the renormalization scale \(\mu =2\,\mathrm {GeV}\). All ETM values that were available only in \(r_0\) units were converted on the basis of \(r_0=0.48(2)\,\mathrm{fm}\) [386, 400, 401], with this error being added in quadrature to any existing systematic error
Table 20 Results for the SU(2) low-energy constant F (in MeV) and for the ratio \(F_\pi /F\). All ETM values that were available only in \(r_0\) units were converted on the basis of \(r_0=0.48(2)\,\mathrm{fm}\) [386, 400, 401], with this error being added in quadrature to any existing systematic error. Numbers in slanted fonts have been calculated by us, based on \(\sqrt{2}F_\pi ^\mathrm {phys}=130.41(20)\,\mathrm {MeV}\) [170], with this error being added in quadrature to any existing systematic error (otherwise to the statistical error). The systematic error in ETM 11 has been carried over from ETM 10
Table 21 Results for the SU(2) NLO low-energy constants \({\bar{\ell }}_3\) and \({\bar{\ell }}_4\). For comparison, the last two lines show results from phenomenological analyses. The systematic error in ETM 11 has been carried over from ETM 10
Table 22 Top (vector form factor of the pion): Lattice results for the charge radius \(\langle r^2\rangle _V^\pi \) (in \(\mathrm {fm}^2\)), the curvature \(c_V\) (in \(\mathrm {GeV}^{-4}\)) and the effective coupling constant \({\bar{\ell }}_6\) are compared with the experimental value, as obtained by NA7, and some phenomenological estimates. Bottom (scalar form factor of the pion): Lattice results for the scalar radius \(\langle r^2 \rangle _S^\pi \) (in \(\mathrm {fm}^2\)) and the combination \({\bar{\ell }}_1-{\bar{\ell }}_2\) are compared with a dispersive calculation of these quantities

A delicate issue in the lattice determination of chiral LECs (in particular at NLO), which cannot be reflected by our colour coding, is a reliable assessment of the theoretical error that comes from the chiral expansion. We add a few remarks on this point:

  1.

    Using both the x and the \(\xi \) expansion is a good way to test how the ambiguity of the chiral expansion (at a given order) affects the numerical values of the LECs that are determined from a particular set of data [44, 376]. For instance, to determine \({{\bar{\ell }}}_4\) (or \(\Lambda _4\)) from lattice data for \(F_\pi \) as a function of the quark mass, one may compare the fits based on the parameterization \(F_\pi =F\{1+x\ln (\Lambda _4^2/M^2)\}\) [see Eq. (88)] with those obtained from \(F_\pi =F/\{1-\xi \ln (\Lambda _4^2/M_{\pi }^2)\}\) [see Eq. (93)]. The difference between the two results provides an estimate of the uncertainty due to the truncation of the chiral series. Which central value one chooses is in principle arbitrary, but we find it advisable to use the one obtained with the \(\xi \) expansion,\(^{30}\) in particular because it makes the comparison with phenomenological determinations (where it is standard practice to use the \(\xi \) expansion) more meaningful.

  2.

    Alternatively one could try to estimate the influence of higher chiral orders by reshuffling irrelevant higher-order terms. For instance, in the example mentioned above one might use \(F_\pi =F/\{1-x\ln (\Lambda _4^2/M^2)\}\) as a different functional form at NLO. Another way to establish such an estimate is through introducing by hand “analytical” higher-order terms (e.g., “analytical NNLO” as done, in the past, by MILC [129]). In principle it would be preferable to include all NNLO terms or none, such that the structure of the chiral expansion is preserved at any order (this is what ETM [48] and JLQCD/TWQCD [376] have done for SU(2) \(\chi \)PT and MILC for both SU(2) and SU(3) \(\chi \)PT [14, 17, 36]). There are different opinions in the field as to whether it is advisable to include terms to which the data is not sensitive. In case one is willing to include external (typically: nonlattice) information, the use of priors is a theoretically well founded option (e.g., priors for NNLO LECs if one is interested exclusively in LECs at LO/NLO).

  3.

    Another issue concerns the s-quark mass dependence of the LECs \({{\bar{\ell }}}_i\) or \(\Lambda _i\) of the SU(2) framework. As far as variations of \(m_s\) around \(m_s^\mathrm {phys}\) are concerned (say for \(0<m_s<1.5m_s^\mathrm {phys}\) at best) the issue can be studied in SU(3) \(\chi \)PT, and this has been done in a series of papers [244, 377, 378]. However, the effect of sending \(m_s\) to infinity, as is the case in \(N_{ f}=2\) lattice studies of SU(2) LECs, cannot be addressed in this way. A way to analyse this difference is to compare the numerical values of LECs determined in \(N_{ f}=2\) lattice simulations to those determined in \(N_{ f}=2\,+\,1\) lattice simulations (see, e.g., Ref. [379] for a discussion).

  4.

    Last but not least let us recall that the determination of the LECs is affected by discretization effects, and it is important that these are removed by means of a continuum extrapolation. In this step invoking an extended version of the chiral Lagrangian [323, 380,381,382,383,384] may be useful\(^{31}\) in case one aims for a global fit of lattice data involving several \(M_{\pi }\) and a values and several chiral observables.
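
The comparison described in the first remark above can be sketched in a few lines of Python. The code evaluates \(F_\pi \) from the x-form and from the \(\xi \)-form for the same values of F and \(\Lambda _4\); since \(\xi \) itself depends on \(F_\pi \), the second form is solved by fixed-point iteration. We use \(x=M^2/(4\pi F)^2\) and \(\xi =M_{\pi }^2/(4\pi F_\pi )^2\), feed the same number for M and \(M_{\pi }\) (an approximation valid at this order), and take \(F=86\,\mathrm {MeV}\) and \({{\bar{\ell }}}_4=4.4\) as placeholder inputs of our own choosing.

```python
import math

M_PI = 139.57                          # MeV, assumed pion mass
L4BAR = 4.4                            # placeholder value for lbar_4
LAMBDA4 = M_PI * math.exp(0.5 * L4BAR)  # from lbar_4 = ln(Lambda_4^2 / M_phys^2)


def f_pi_x(F: float, lam4: float, M: float) -> float:
    """x-expansion at NLO: F_pi = F * (1 + x * ln(Lambda_4^2 / M^2))."""
    x = M**2 / (4.0 * math.pi * F) ** 2
    return F * (1.0 + x * math.log(lam4**2 / M**2))


def f_pi_xi(F: float, lam4: float, m_pi: float,
            tol: float = 1e-10, max_iter: int = 200) -> float:
    """xi-expansion at NLO: F_pi = F / (1 - xi * ln(Lambda_4^2 / M_pi^2)).

    xi depends on F_pi, so the relation is solved iteratively.
    """
    f_pi = F
    for _ in range(max_iter):
        xi = m_pi**2 / (4.0 * math.pi * f_pi) ** 2
        updated = F / (1.0 - xi * math.log(lam4**2 / m_pi**2))
        if abs(updated - f_pi) < tol:
            return updated
        f_pi = updated
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    fx = f_pi_x(86.0, LAMBDA4, M_PI)
    fxi = f_pi_xi(86.0, LAMBDA4, M_PI)
    print(f"x-form : F_pi = {fx:.2f} MeV")
    print(f"xi-form: F_pi = {fxi:.2f} MeV")
    print(f"spread : {fx - fxi:.2f} MeV")
```

In an actual analysis the logic runs in the opposite direction: F and \(\Lambda _4\) are fitted to lattice data with either form, and the spread of the fitted parameters serves as the estimate of the truncation uncertainty. The sketch only shows that, for fixed parameters, the two forms differ by a higher-order amount.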

In the tables and figures we summarize the results of various lattice collaborations for the SU(2) LECs at LO (F or \(F_\pi /F\), B or \(\Sigma \)) and at NLO (\({\bar{\ell }}_1-{\bar{\ell }}_2\), \({\bar{\ell }}_3\), \({\bar{\ell }}_4\), \({\bar{\ell }}_6\)). Throughout we group the results into those which stem from \(N_{ f}=2\,+\,1\,+\,1\) calculations, those which come from \(N_{ f}=2\,+\,1\) calculations and those which stem from \(N_{ f}=2\) calculations (since, as mentioned above, the LECs are logically distinct even if the current precision of the data is not sufficient to resolve the differences). Furthermore, we make a distinction whether the results are obtained from simulations in the p-regime or whether alternative methods (\(\epsilon \)-regime, spectral densities, topological susceptibility, etc.) have been used (this should not affect the result). For comparison we add, in each case, a few representative phenomenological determinations.

Fig. 12 Cubic root of the SU(2) quark condensate in the \(\overline{\mathrm{MS}}\)-scheme, at the renormalization scale \(\mu =2\,\mathrm {GeV}\). Green and red squares indicate determinations from correlators in the p-regime. Up triangles refer to extractions from the topological susceptibility, diamonds to determinations from the pion form factor, and star symbols refer to the spectral density method

Fig. 13 Comparison of the results for the ratio of the physical pion decay constant \(F_\pi \) and the leading-order SU(2) low-energy constant F. The meaning of the symbols is the same as in Fig. 12

Fig. 14 Effective coupling constants \({\bar{\ell }}_3\), \({\bar{\ell }}_4\) and \({\bar{\ell }}_6\). Squares indicate determinations from correlators in the p-regime, diamonds refer to determinations from the pion form factor

A generic comment applies to the issue of scale setting. In the past, none of the lattice studies with \(N_{ f}\ge 2\) involved simulations in the p-regime at the physical value of \(m_{ud}\). Accordingly, setting the scale \(a^{-1}\) via an experimentally measurable quantity necessarily involved a chiral extrapolation. As a result, dimensionful quantities used to be particularly sensitive to this extrapolation uncertainty, while in dimensionless ratios such as \(F_\pi /F\), \(F/F_0\), \(B/B_0\), \(\Sigma /\Sigma _0\) this particular problem is much reduced (and finite lattice-to-continuum renormalization factors often drop out). Now there is a new generation of lattice studies with \(N_{ f}=2\) [386], \(N_{ f}=2\,+\,1\) [10,11,12, 30, 43,44,45, 117, 156, 161], and \(N_{ f}=2\,+\,1\,+\,1\) [33, 387], which does involve simulations at physical pion masses. In such studies the uncertainty that the scale setting induces in dimensionful quantities is much reduced.

It is worth repeating here that the standard colour-coding scheme of our tables is necessarily schematic and cannot do justice to every calculation. In particular there is some difficulty in coming up with a fair adjustment of the rating criteria to finite-volume regimes of QCD. For instance, in the \(\epsilon \)-regime (see footnote 32) we re-express the “chiral extrapolation” criterion in terms of \(\sqrt{2m_\mathrm {min}\Sigma }/F\), with the same threshold values (in MeV) between the three categories as in the p-regime. Also the “infinite volume” assessment is adapted to the \(\epsilon \)-regime, since the \(M_{\pi }L\) criterion does not make sense here; we assign a green star if at least 2 volumes with \(L>2.5\,\mathrm{fm}\) are included, an open symbol if at least 1 volume with \(L>2\,\mathrm{fm}\) is invoked and a red square if all boxes are smaller than \(2\,\mathrm{fm}\). Similarly, in the calculation of form factors and charge radii the tables do not reflect whether an interpolation to the desired \(q^2\) has been performed or whether the relevant \(q^2\) has been engineered by means of “twisted boundary conditions” [390]. In spite of these limitations we feel that these tables give an adequate overview of the qualities of the various calculations.
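The adapted “infinite volume” rule for the \(\epsilon \)-regime quoted above can be written as a small decision function. The following is a minimal sketch, not part of the FLAG tooling; the function name and the string labels are hypothetical, and the treatment of a box with exactly \(L=2\,\mathrm{fm}\) is an assumption, since the text does not specify it.

```python
def epsilon_regime_volume_rating(box_sizes_fm):
    """Rate the 'infinite volume' criterion in the epsilon-regime.

    box_sizes_fm: list of spatial box lengths L in fm (hypothetical input format).
    Returns one of the three colour-coding categories named in the text.
    """
    # Count boxes that pass each threshold (strict inequalities, as in the text).
    n_above_2p5 = sum(1 for length in box_sizes_fm if length > 2.5)
    n_above_2p0 = sum(1 for length in box_sizes_fm if length > 2.0)

    if n_above_2p5 >= 2:
        return "green star"
    if n_above_2p0 >= 1:
        return "open symbol"
    # Assumption: a box with exactly L = 2 fm is treated like a smaller box.
    return "red square"


# Illustrative (made-up) sets of box sizes.
print(epsilon_regime_volume_rating([2.6, 3.0]))
print(epsilon_regime_volume_rating([2.6, 1.5]))
print(epsilon_regime_volume_rating([1.8, 1.9]))
```

A single box above \(2.5\,\mathrm{fm}\) is not enough for the top category in this reading; it falls into the middle one because it still exceeds \(2\,\mathrm{fm}\).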

5.2.2 Results for the LO SU(2) LECs

We begin with a discussion of the lattice results for the SU(2) LEC \(\Sigma \). We present the results in Table 19 and Fig. 12. We remind the reader that results which include only a statistical error are listed in the table but omitted from the plot. Regarding the \(N_{ f}=2\) computations there are six entries without a red tag. We form the average based on ETM 09C, ETM 13 (here we deviate from our “superseded” rule, since the two works use different methods), Brandt 13, and Engel 14. Here and in the following we take into account that ETM 09C and ETM 13 share configurations, and the same statement holds true for Brandt 13 and Engel 14. Regarding the \(N_{ f}=2\,+\,1\) computations there are six published or updated papers (MILC 10A, Borsanyi 12, BMW 13, RBC/UKQCD 15E, JLQCD 16B and JLQCD 17A) that qualify for the \(N_{ f}=2\,+\,1\) average. Here we deviate again from the “superseded” rule, since JLQCD 17A [47] uses a completely different methodology from JLQCD 16B [46]. Unfortunately, the new error-bar (from an indirect determination, via the topological susceptibility) is about an order of magnitude larger than the old one, hence it barely affects our average. Finally, the single complete \(N_{ f}=2\,+\,1\,+\,1\) calculation available so far, ETM 13 [41], was recently complemented by ETM 17E [42]. Again we deviate from the “superseded” rule, since both authors and methodologies differ.

In slight deviation from the general recipe outlined in Sect. 2.2 we use these values as a basis for our estimates (as opposed to averages) of the \(N_{ f}=2\), \(N_{ f}=2\,+\,1\), and \(N_{ f}=2\,+\,1\,+\,1\) condensates. In each case the central value is obtained from our standard averaging procedure, but the (symmetrical) error is just the median of the overall uncertainties of all contributing results (see the comment below for details). This leads to the values

$$\begin{aligned} N_f&=2 :\quad \Sigma ^{1/3}= 266(10) \,\mathrm {MeV}\quad \,\mathrm {Refs.}~[41,48{-}50] ,\nonumber \\ N_f&=2\,+\,1 :\quad \Sigma ^{1/3}= 272( 5) \,\mathrm {MeV}\quad \,\mathrm {Refs.}~[14,43{-}47],\nonumber \\ N_f&=2\,+\,1\,+\,1:\quad \Sigma ^{1/3}= 286(23) \,\mathrm {MeV}\quad \,\mathrm {Refs.}~ \text{[41,42] }, \end{aligned}$$
(122)

in the \({\overline{\mathrm {MS}}}\) scheme at the renormalization scale \(2\,\mathrm {GeV}\), where the errors include both statistical and systematic uncertainties. In accordance with our guidelines we ask the reader to cite the appropriate set of references as indicated in Eq. (122) when using these numbers.

As a rationale for using estimates (as opposed to averages) for \(N_{ f}=2\), \(N_{ f}=2\,+\,1\), and \(N_{ f}=2\,+\,1\,+\,1\), we add that for \(\Sigma ^{1/3}|_{N_{ f}=2}\), \(\Sigma ^{1/3}|_{N_{ f}=2\,+\,1}\), and \(\Sigma ^{1/3}|_{N_{ f}=2\,+\,1\,+\,1}\) the standard averaging method would yield central values as quoted in Eq. (122), but with (overall) uncertainties of \(4\,\mathrm {MeV}\), \(1\,\mathrm {MeV}\), and \(16\,\mathrm {MeV}\), respectively. It is not entirely clear to us that the scale is sufficiently well known in all contributing works to warrant a precision of up to 0.37% on our \(\Sigma ^{1/3}\), and a similar statement can be made about the level of control over the convergence of the chiral expansion. The aforementioned uncertainties would tend to suggest an \(N_{ f}\)-dependence of the SU(2) chiral condensate, which (especially in view of similar issues with other LECs, see below) seems premature to us. Therefore we choose to form the central value of our estimate with the standard averaging procedure, but its uncertainty is taken as the median of the uncertainties of the participating results. We hope that future high-quality determinations (with any of \(N_f=2\), \(N_f=2\,+\,1\), or \(N_f=2\,+\,1\,+\,1\)) will help determine whether there is a noticeable \(N_f\)-dependence of the SU(2) chiral condensate or not.
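The difference between an “average” and an “estimate” in the sense used here can be illustrated with a short sketch. It rests on simplifying assumptions: the central value is computed as a plain inverse-variance weighted mean, which ignores the correlations (e.g., from shared configurations) that the standard averaging procedure takes into account, and the input numbers are invented for illustration and are not the entries of Table 19. The function name is hypothetical.

```python
import statistics


def estimate_from_results(values, errors):
    """Return (central value, naive averaged error, median-based error).

    values, errors: central values and total (stat + syst) uncertainties.
    Assumption: results are treated as uncorrelated, unlike in the full
    averaging procedure, which accounts for correlations between works.
    """
    weights = [1.0 / err**2 for err in errors]
    central = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    # Uncertainty of the uncorrelated weighted mean.
    averaged_error = (1.0 / sum(weights)) ** 0.5
    # Conservative choice described in the text: median of the input uncertainties.
    median_error = statistics.median(errors)
    return central, averaged_error, median_error


# Made-up inputs in MeV, for illustration only.
central, averaged_error, median_error = estimate_from_results(
    [270.0, 274.0, 272.0], [5.0, 3.0, 8.0]
)
print(f"central = {central:.1f} MeV")
print(f"averaged error = {averaged_error:.1f} MeV, median error = {median_error:.1f} MeV")
```

In this toy example the averaged error is smaller than every individual input error, whereas the median-based error stays at the scale of a typical single determination. That is the behaviour the estimates in Eq. (122) are designed to have.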

The next quantity considered is F, i.e., the pion decay constant in the SU(2) chiral limit (\(m_{ud}\rightarrow 0\), at fixed physical \(m_s\) for \(N_f > 2\) simulations). As argued on previous occasions we tend to give preference to \(F_\pi /F\) (here the numerator is meant to refer to the physical-pion-mass point) wherever it is available, since often some of the systematic uncertainties are mitigated. We collect the results in Table 20 and Fig. 13. In those cases where the collaboration provides only F, the ratio is computed on the basis of the phenomenological value of \(F_\pi \), and the respective entries in Table 20 are in slanted fonts. We encourage authors to provide both F and \(F_\pi /F\) from their analysis, since the ratio is less dependent on the scale setting, and errors tend to partially cancel. Among the \(N_{ f}=2\) determinations five (ETM 08, ETM 09C, QCDSF 13, Brandt 13 and Engel 14) are without red tags. Since the third one is without systematic error, only four of them enter the average. Among the \(N_{ f}=2\,+\,1\) determinations five values (MILC 10 as an update of MILC 09, NPLQCD 11, Borsanyi 12, BMW 13, and RBC/UKQCD 15E) contribute to the average. Here and in the following we take into account that MILC 10 and NPLQCD 11 share configurations. Finally, there is a single \(N_{ f}=2\,+\,1\,+\,1\) determination (ETM 11) which forms the current best estimate in this category.
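When a collaboration quotes only F, the conversion to \(F_\pi /F\) amounts to a division with error propagation. The sketch below assumes uncorrelated Gaussian errors and uses illustrative inputs: the value \(F_\pi = 92.2(1)\,\mathrm{MeV}\) and the lattice-like value \(F = 86.0(1.0)\,\mathrm{MeV}\) are assumptions made for this example and are not taken from Table 20. The function name is hypothetical.

```python
import math


def ratio_with_error(f_pi, df_pi, f, df):
    """Return F_pi/F and its uncertainty, assuming uncorrelated errors."""
    ratio = f_pi / f
    # Relative errors add in quadrature for a quotient.
    relative_error = math.hypot(df_pi / f_pi, df / f)
    return ratio, ratio * relative_error


# Illustrative inputs in MeV (assumptions, see the lead-in).
ratio, error = ratio_with_error(92.2, 0.1, 86.0, 1.0)
print(f"F_pi/F = {ratio:.3f} +/- {error:.3f}")
```

The error of the ratio is dominated by the relative error of F, which is why a collaboration that determines the ratio directly, with correlated scale-setting uncertainties cancelling, can typically quote a smaller uncertainty than this conversion yields.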

In analogy to the condensates discussed above, we use these values as a basis for our estimates (as opposed to averages) of the decay constant ratios

$$\begin{aligned} N_f&=2 :\quad {F_\pi }/{F}=1.073(15) \quad \,\mathrm {Refs.}~{ [48{-}50,53]} ,\nonumber \\ N_f&=2\,+\,1 :\quad {F_\pi }/{F}=1.062( 7) \quad \,\mathrm {Refs.}~[36,43{-}45,52],\nonumber \\ N_f&=2\,+\,1\,+\,1:\quad {F_\pi }/{F}=1.077( 3) \quad \,\mathrm {Ref.}~ \text{[51] }, \end{aligned}$$
(123)

where the errors include both statistical and systematic uncertainties. We ask the reader to cite the appropriate set of references as indicated in Eq. (123) when using these numbers. For \(N_{ f}=2\) and \(N_{ f}=2\,+\,1\) these estimates are obtained through the well-defined procedure described next to Eq. (122). For \(N_{ f}=2\,+\,1\,+\,1\) the result of ETM 11 (as an update to ETM 10) is the only one available (see footnote 33).

For \(N_{ f}=2\) and \(N_{ f}=2\,+\,1\) the standard averaging method would yield the central values as quoted in Eq. (123), but with (overall) uncertainties of 6 and 1, respectively, on the last digit quoted. In this particular case the single \(N_{ f}=2\,+\,1\,+\,1\) determination lies significantly higher than the \(N_{ f}=2\,+\,1\) average (with the small error-bar), basically on par with the \(N_{ f}=2\) average (ditto), and this makes such a standard average look even more suspicious to us. At the very least, one should wait for one more qualifying \(N_f=2\,+\,1\,+\,1\) determination before attempting any conclusions about the \(N_f\)-dependence of \(F_\pi /F\). While we are not aware of any theorem that excludes a nonmonotonic behavior in \(N_f\) of a LEC, standard physics reasoning would suggest that quark-loop effects become smaller with increasing quark mass, hence a dynamical charm quark will influence LECs less significantly than a dynamical strange quark, and even the latter one seems to bring rather small shifts. As a result, we feel that a nonmonotonic behavior of \(F_\pi /F\) with \(N_{ f}\), once established, would represent a noteworthy finding. We hope this reasoning explains why we prefer to stay in Eq. (123) with estimates that obviously are on the conservative side.
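To make the size of the apparent tension explicit, one may compare the \(N_{ f}=2\,+\,1\,+\,1\) value with the \(N_{ f}=2\,+\,1\) value, once with the uncertainty of the standard average and once with the uncertainty of the estimate in Eq. (123). The following arithmetic is only a rough guide, since it assumes uncorrelated Gaussian errors:

$$\begin{aligned} \frac{|1.077-1.062|}{\sqrt{0.003^2+0.001^2}}\simeq 4.7 ,\qquad \frac{|1.077-1.062|}{\sqrt{0.003^2+0.007^2}}\simeq 2.0 . \end{aligned}$$

With the standard-average uncertainty the difference would thus look like a deviation of more than four standard deviations, whereas with the conservative estimate it is at the level of two.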

5.2.3 Results for the NLO SU(2) LECs

We move on to a discussion of the lattice results for the NLO LECs \({\bar{\ell }}_3\) and \({\bar{\ell }}_4\). We remind the reader that on the lattice the former LEC is obtained as a result of the tiny deviation from linearity seen in \(M_{\pi }^2\) versus \(Bm_{ud}\), whereas the latter LEC is extracted from the curvature in \(F_\pi \) versus \(Bm_{ud}\). The available determinations are presented in Table 21 and Fig. 14. Among the \(N_{ f}=2\) determinations ETM 08, ETM 09C, Brandt 13, and Gülpers 15 come with a systematic uncertainty and without red tags. Given that the former two use different approaches, all four determinations enter our average. The colour coding of the \(N_{ f}=2\,+\,1\) results looks very promising; there is a significant number of lattice determinations without any red tag. Applying our superseding rule, MILC 10 (as an update to MILC 09; see footnote 34), NPLQCD 11, Borsanyi 12, BMW 13, and RBC/UKQCD 15E contribute to the average. For \(N_{ f}=2\,+\,1\,+\,1\) there is only the single work ETM 11 (as an update to ETM 10).

In analogy to our processing of the LECs at LO, we use these determinations as the basis of our estimate (as opposed to average) of the NLO quantities

$$\begin{aligned} N_f&=2 :\quad {\bar{\ell }}_3=3.41(82) \quad \,\mathrm {Refs.}~{ [48,49,53]}, \nonumber \\ N_f&=2\,+\,1 :\quad {\bar{\ell }}_3=3.07(64) \quad \,\mathrm {Refs.}~[36,43{-}45,52],\nonumber \\ N_f&=2\,+\,1\,+\,1:\quad {\bar{\ell }}_3=3.53(26) \quad \,\mathrm {Ref.}~ \text{[51] } \end{aligned}$$
(124)
$$\begin{aligned} N_f&=2:\quad {\bar{\ell }}_4=4.40(28) \quad \,\mathrm {Refs.}~{ [48, 49, 53, 54]}, \nonumber \\ N_f&=2\,+\,1:\quad {\bar{\ell }}_4=4.02(45) \quad \,\mathrm {Refs.}~[36,43{-}45,52],\nonumber \\ N_f&=2\,+\,1\,+\,1:\quad {\bar{\ell }}_4=4.73(10) \quad \,\mathrm {Ref.}~ \text{[51] } \end{aligned}$$
(125)

where the errors include both statistical and systematic uncertainties. Again we ask the reader to cite the appropriate set of references as indicated in Eq. (124) or Eq. (125) when using these numbers. For \(N_{ f}=2\) and \(N_{ f}=2\,+\,1\) these estimates are obtained through the well-defined procedure described next to Eq. (122). For \(N_{ f}=2\,+\,1\,+\,1\) once again ETM 11 (as an update to ETM 10) is the single reference available.

We remark that our preprocessing procedure (see footnote 35) symmetrizes the asymmetric error of ETM 09C with a slight adjustment of the central value. Regarding the difference between the estimates as given in Eqs. (124, 125) and the result of the standard averaging procedure we add that the latter would yield the overall uncertainties 25 and 12 for \({{\bar{\ell }}}_3\), and the overall uncertainties 17 and 5 for \({{\bar{\ell }}}_4\). In all cases the central value would be unchanged. Especially for \({{\bar{\ell }}}_4\) such numbers would suggest a clear difference between the value with \(N_{ f}=2\) dynamical flavours and the one at \(N_{ f}=2\,+\,1\). Similarly to what happened with \(F_\pi /F\), the single determination with \(N_{ f}=2\,+\,1\,+\,1\) is more on the \(N_{ f}=2\) side, which, if confirmed, would suggest a nonmonotonicity of a \(\chi \)PT LEC with \(N_{ f}\). Again we think that currently such a conclusion would be premature, and this is why we give preference to the estimates quoted in Eqs. (124, 125).

From a more phenomenological point of view there is a notable difference between \({\bar{\ell }}_3\) and \({\bar{\ell }}_4\) in Fig. 14. For \({\bar{\ell }}_4\) the precision of the phenomenological determination achieved in Colangelo 01 [288] represents a significant improvement compared to Gasser 84 [277]. Picking any \(N_{ f}\), the lattice estimate of \({\bar{\ell }}_4\) is consistent with both of the phenomenological values and comes with an error-bar that is roughly comparable to or somewhat larger than the one in Colangelo 01 [288]. By contrast, for \({\bar{\ell }}_3\) the error of an individual lattice computation is usually much smaller than the error of the estimate given in Gasser 84 [277], and even our conservative estimates (124) have uncertainties that represent a significant improvement on the error-bar of Gasser 84 [277]. Evidently, our hope is that future determinations of \({{\bar{\ell }}}_3,{{\bar{\ell }}}_4\), with \(N_{ f}=2\), \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\), will allow us to further shrink our error-bars in a future edition of FLAG.

Let us add that Ref. [414] determines \(\ell _1,\ell _2,\ell _3,\ell _4\) (or equivalently \({{\bar{\ell }}}_1,{{\bar{\ell }}}_2,{{\bar{\ell }}}_3,{{\bar{\ell }}}_4\)) individually, with some assumptions and various fits from lattice data at a single lattice spacing and two (heavier than physical) pion masses.

We continue with a discussion of the lattice results for \({\bar{\ell }}_6\) and \({\bar{\ell }}_1-{\bar{\ell }}_2\). The LEC \({\bar{\ell }}_6\) determines the leading contribution in the chiral expansion of the pion vector charge radius, see Eq. (101). Hence from a lattice study of the vector form factor of the pion with several \(M_{\pi }\) one may extract the radius \(\langle r^2\rangle _V^\pi \), the curvature \(c_V\) (both at the physical pion-mass point) and the LEC \({\bar{\ell }}_6\) in one go. Similarly, the leading contribution in the chiral expansion of the scalar radius of the pion determines \({\bar{\ell }}_4\), see Eq. (101). This LEC is also present in the pion-mass dependence of \(F_\pi \), as we have seen. The difference \({\bar{\ell }}_1-{\bar{\ell }}_2\), finally, may be obtained from the momentum dependence of the vector and scalar pion form factors, based on the 2-loop formulae of Ref. [291]. The top part of Table 22 collects the results obtained from the vector form factor of the pion (charge radius, curvature and \({\bar{\ell }}_6\)). Regarding this low-energy constant two \(N_{ f}=2\) calculations are published works without a red tag; we thus arrive at the average (actually the first one in the LEC section)

$$\begin{aligned} N_f=2:\quad {\bar{\ell }}_6=15.1(1.2) \quad \,\mathrm {Refs.}~ \text{[49,53] }, \end{aligned}$$
(126)

which is represented as a grey band in the last panel of Fig. 14. Here we ask the reader to cite Refs. [49, 53] when using this number.

The experimental information concerning the charge radius is excellent and the curvature is also known very accurately, based on \(e^+e^-\) data and dispersion theory. The vector form factor calculations thus present an excellent testing ground for the lattice methodology. The first data column of Table 22 shows that most of the available lattice results pass the test. There is, however, one worrisome point. For \({\bar{\ell }}_6\) the agreement seems less convincing than for the charge radius, even though the two quantities are closely related. In particular the \({\bar{\ell }}_6\) value of JLQCD 14 [408] seems inconsistent with the phenomenological determinations of Refs. [277, 291], even though its value for \(\langle r^2\rangle _V^\pi \) is consistent. So far we have no explanation (other than observing that lattice computations which disagree with the phenomenological determination of \({\bar{\ell }}_6\) tend to have red tags), but we urge the groups to pay special attention to this point. Similarly, the bottom part of Table 22 collects the results obtained for the scalar form factor of the pion and the combination \({\bar{\ell }}_1-{\bar{\ell }}_2\) that is extracted from it. A new feature is that Ref. [387] gives both the (flavour) octet and singlet part in SU(3), finding \(\langle r^2\rangle _{S,\mathrm {octet}}^\pi =0.431(38)(46)\) and \(\langle r^2\rangle _{S,\mathrm {singlet}}^\pi =0.506(38)(53)\). For reasons of backward compatibility they also give \(\langle r^2\rangle _{S,ud}^\pi \) defined with a \({\bar{u}}u+{\bar{d}}d\) density, and this number is shown in Table 22. Another notable feature is that they find the ordering \(\langle r^2\rangle _{S,\mathrm {conn}}^\pi< \langle r^2\rangle _{S,\mathrm {octet}}^\pi< \langle r^2\rangle _{S,ud}^\pi < \langle r^2\rangle _{S,\mathrm {singlet}}^\pi \) [387].

Fig. 15 Summary of the pion form factors \(\langle r^2\rangle _V^\pi \) (top) and \(\langle r^2\rangle _S^\pi \) (bottom)

Those data of Table 22 that come with a systematic error are shown in Fig. 15. The overall impression is that the majority of lattice results come with a fair assessment of the respective systematic uncertainties. Yet it is clear that it is a nontrivial endeavor to match the precision obtained in experiment and subsequent phenomenological analysis.

The last set of observables we wish to discuss includes the \(\pi \pi \) scattering lengths \(a_0^0\) and \(a_0^2\) in the isospin channels \(I=0\) and \(I=2\), respectively. As can be seen from Eqs. (108, 110), the \(I=0\) scattering length carries information about \(\frac{20}{21}{{\bar{\ell }}}_1+\frac{40}{21}{{\bar{\ell }}}_2-\frac{5}{14}{{\bar{\ell }}}_3+2{{\bar{\ell }}}_4\). And from Eqs. (109, 111) it follows that the \(I=2\) counterpart carries information about the linear combination \(\frac{4}{3}{{\bar{\ell }}}_1+\frac{8}{3}{{\bar{\ell }}}_2-\frac{1}{2}{{\bar{\ell }}}_3-2{{\bar{\ell }}}_4\). We prefer quoting the dimensionless products \(a_0^{I}M_{\pi }\) (at the physical mass point) over the aforementioned linear combinations to ease comparison with phenomenology. In Table 23 we summarize the lattice information on \(a_0^{I=0}M_{\pi }\) and \(a_0^{I=2}M_{\pi }\) at the physical mass point. We are aware of at least one additional work, Ref. [422], which has a technical focus and determines a scattering length away from the physical point, and which, for this reason, is not included in Table 23. We remind the reader that a lattice computation of \(a_0^{I=0}M_{\pi }\) involves quark-loop disconnected contributions, which tend to be very noisy and hence require large statistics. To date there are three pioneering calculations, but none of them is free of red tags. The situation is slightly better for \(a_0^{I=2}M_{\pi }\); there is one computation at \(N_{ f}=2\) and one at \(N_{ f}=2\,+\,1\,+\,1\) that would qualify for a FLAG average. Still, since in the much better populated category of \(N_{ f}=2\,+\,1\) studies there is currently no computation without a red tag, we feel it is appropriate to postpone any form of averaging to the next edition of FLAG, when hopefully qualifying computations (at least for \(a_0^{I=2}M_{\pi }\)) are available at each \(N_{ f}\) considered.
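For readers who wish to translate a set of \({{\bar{\ell }}}_i\) values into the two linear combinations quoted above, a minimal sketch follows. The function name is hypothetical and the inputs in the usage lines are simple test numbers, not lattice results; exact fractions are used so that the coefficients match the text digit for digit.

```python
from fractions import Fraction


def scattering_length_combinations(l1, l2, l3, l4):
    """Return the I=0 and I=2 linear combinations of the LECs quoted in the text.

    l1..l4 stand for lbar_1..lbar_4 (ints, Fractions or floats).
    """
    comb_i0 = (Fraction(20, 21) * l1 + Fraction(40, 21) * l2
               - Fraction(5, 14) * l3 + 2 * l4)
    comb_i2 = (Fraction(4, 3) * l1 + Fraction(8, 3) * l2
               - Fraction(1, 2) * l3 - 2 * l4)
    return comb_i0, comb_i2


# Switching on one LEC at a time exposes the individual coefficients.
print(scattering_length_combinations(0, 0, 0, 1))
print(scattering_length_combinations(0, 0, 14, 0))
```

The opposite sign of the \({{\bar{\ell }}}_4\) term in the two channels is visible directly in the first usage line.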

Table 23 Summary of \(\pi \pi \) scattering data in the \(I=0\) (top) and \(I=2\) (bottom) channels. In our view the paper Fu 17 contains one pion mass at \(a\simeq 0.09\,\mathrm{fm}\) and another one at \(a\simeq 0.06\,\mathrm{fm}\). The results of ETM 15E and NPLQCD 11A have been adapted to our sign convention. The results of Refs. [288, 305] allow for a cross-check with phenomenology
Table 24 Lattice results for the low-energy constants \(F_0\), \(B_0\) (in MeV) and \(\Sigma _0\!\equiv \!F_0^2B_0\), which specify the effective SU(3) Lagrangian at leading order. The ratios \(F/F_0\), \(B/B_0\), \(\Sigma /\Sigma _0\), which compare these with their SU(2) counterparts, indicate the strength of the Zweig-rule violations in these quantities (in the large-\(N_c\) limit, they tend to unity). Numbers in slanted fonts are calculated by us, from the information given in the references
Table 25 Low-energy constants of the SU(3) Lagrangian at NLO with running scale \(\mu \!=\!770\,\mathrm {MeV}\) (the values in Refs. [17, 33, 36, 129, 244] are evolved accordingly). The MILC 10 entry for \(L_6\) is obtained from their results for \(2L_6\!-\!L_4\) and \(L_4\) (similarly for other entries in slanted fonts)
Table 26 Low-energy constants of the SU(3) Lagrangian at NLO with running scale \(\mu = 770\,\mathrm {MeV}\) (the values in Ref. [244] are evolved accordingly). The JLQCD 08A result for \(\ell _5(770\,\mathrm {MeV})\) [despite the paper saying \(L_{10}(770\,\mathrm {MeV})\)] was converted to \(L_{10}\) with the GL 1-loop formula, assuming that the difference between \({\bar{\ell }}_5(m_s\!=\!m_s^\mathrm {phys})\) (needed in the formula) and \({\bar{\ell }}_5(m_s = \infty )\) (computed by JLQCD) is small. Note that for the “hybrid” papers Boyle 14 and Boito 15 the ratings, referring to the lattice data only (cf. footnote 36), are incomplete and the reader may be well advised to prefer the latter result over the former

5.2.4 Epilogue

In this section there are several quantities for which only one qualifying (“all-green”) determination is available for a given SU(2) LEC. Obviously the phenomenologically oriented reader is encouraged to use such a value (as provided in our tables) and to cite the original work. We hope that the lattice community will come up with further computations, in particular for \(N_{ f}=2\,+\,1\,+\,1\), such that a fair comparison of different works is possible at any \(N_{ f}\), and eventually a statement can be made about the presence or absence of an \(N_{ f}\)-dependence of SU(2) LECs.

What can be learned about the convergence pattern of SU(2) \(\chi \)PT from varying the fit ranges (in \(m_{ud}\)) of the pion mass and decay constant (i.e., the quantities from which \({\bar{\ell }}_3,{\bar{\ell }}_4\) are derived) is discussed in Ref. [423], where also the usefulness of comparing results from the x and the \(\xi \) expansion (with material taken from Ref. [44]) is emphasized.

Perhaps the most important physics result of this section is that the lattice simulations confirm the approximate validity of the Gell-Mann–Oakes–Renner formula and show that the square of the pion mass indeed grows in proportion to \(m_{ud}\). The formula represents the leading term of the chiral series and necessarily receives corrections from higher orders. At first nonleading order, the correction is determined by the effective coupling constant \({\bar{\ell }}_3\). The results collected in Table 21 and in the top panel of Fig. 14 show that \({\bar{\ell }}_3\) is now known quite well. They corroborate the conclusion drawn already in Ref. [424]: the lattice confirms the estimate of \({\bar{\ell }}_3\) derived in Ref. [277]. In the graph of \(M_{\pi }^2\) versus \(m_{ud}\), the values found on the lattice for \({\bar{\ell }}_3\) correspond to remarkably little curvature. In other words, the Gell-Mann–Oakes–Renner formula represents a reasonable first approximation out to values of \(m_{ud}\) that exceed the physical value by an order of magnitude.
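The statement about the small curvature can be checked with a back-of-the-envelope evaluation of the standard NLO expression \(M_{\pi }^2=M^2\{1-\frac{1}{2}x\ln (\Lambda _3^2/M^2)+{\mathcal {O}}(x^2)\}\), with \(M^2=2Bm_{ud}\), \(x=M^2/(16\pi ^2F^2)\) and \({\bar{\ell }}_3=\ln (\Lambda _3^2/M_{\pi ,\mathrm {phys}}^2)\). The sketch below is an illustration under stated assumptions: it uses \({\bar{\ell }}_3=3.07\) from Eq. (124), takes \(M_{\pi ,\mathrm {phys}}\simeq 135\,\mathrm {MeV}\) and \(F\simeq 86.8\,\mathrm {MeV}\) as rounded inputs chosen for this example, identifies the leading-order mass at the physical point with the physical pion mass, and drops all terms beyond NLO. The function name is hypothetical.

```python
import math

# Rounded illustrative inputs (assumptions, see the lead-in).
M_PI_PHYS_MEV = 135.0
F_CHIRAL_MEV = 86.8
L3_BAR = 3.07  # N_f = 2+1 estimate from Eq. (124)


def nlo_relative_correction(mud_ratio, l3_bar=L3_BAR):
    """Return M_pi^2 / M^2 - 1 at NLO for m_ud = mud_ratio * m_ud^phys.

    Since M^2 is proportional to m_ud, M^2 = mud_ratio * M_phys^2 and
    ln(Lambda_3^2 / M^2) = l3_bar - ln(mud_ratio).
    """
    m_squared = mud_ratio * M_PI_PHYS_MEV**2
    x = m_squared / (16.0 * math.pi**2 * F_CHIRAL_MEV**2)
    chiral_log = l3_bar - math.log(mud_ratio)
    return -0.5 * x * chiral_log


for ratio in (1.0, 5.0, 10.0):
    print(f"m_ud/m_ud^phys = {ratio:4.1f}: NLO correction = "
          f"{100.0 * nlo_relative_correction(ratio):.1f}%")
```

Under these assumptions the NLO correction stays at the level of a few percent even at ten times the physical \(m_{ud}\), although at such masses the neglected higher-order terms are no longer guaranteed to be small.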

As emphasized by Stern and collaborators [425,426,427], the analysis in the framework of \(\chi \)PT is coherent only if (i) the leading term in the chiral expansion of \(M_{\pi }^2\) dominates over the remainder and (ii) the ratio \(m_s/m_{ud}\) is close to the value 25.6 that follows from Weinberg’s leading-order formulae. In order to investigate the possibility that one or both of these conditions might fail, the authors proposed a more general framework, referred to as “generalized \(\chi \)PT”, which includes (standard) \(\chi \)PT as a special case. The results found on the lattice demonstrate that QCD does satisfy both of the above conditions. Hence, in the context of QCD, the proposed generalization of the effective theory does not appear to be needed. There is a modified version, however, referred to as “re-summed \(\chi \)PT” [428], which is motivated by the possibility that the Zweig-rule violating couplings \(L_4\) and \(L_6\) might be larger than expected. The available lattice data does not support this possibility, but does not rule it out either (see Sect. 5.3 for details).

5.3 Extraction of SU(3) low-energy constants

To date, there are three comprehensive SU(3) papers with results based on lattice QCD with \(N_{ f}\!=\!2\,+\,1\) dynamical flavours [129, 162, 163], and one more with results based on \(N_{ f}\!=\!2\,+\,1\,+\,1\) dynamical flavours [33]. It is an open issue whether the data collected at \(m_s \simeq m_s^\mathrm {phys}\) allows for an unambiguous determination of SU(3) low-energy constants (cf. the discussion in Ref. [163]). To make definite statements one needs data at considerably smaller \(m_s\), and so far only MILC has some [129]. We are aware of a few papers with a result on one SU(3) low-energy constant each, which we list for completeness. Some particulars of the computations are listed in Table 24.

5.3.1 Results for the LO and NLO SU(3) LECs

Results for the SU(3) low-energy constants of leading order are found in Table 24 and analogous results for some of the effective coupling constants that enter the chiral SU(3) Lagrangian at NLO are collected in Tables 25 and 26. From PACS-CS [162] only those results are quoted that have been corrected for finite-size effects (misleadingly labelled “w/FSE” in their tables). For staggered data our colour-coding rule states that \(M_{\pi }\) is to be understood as \(M_{\pi }^\mathrm {RMS}\). The rating of Refs. [36, 129] is based on the information regarding the RMS masses given in Ref. [17]. Finally, Boyle 14 [431] and Boito 15 [430] are “hybrids” in the sense that they combine lattice data and experimental information (see footnote 36).

A graphical summary of the lattice results for the coupling constants \(L_4\), \(L_5\), \(L_6\) and \(L_8\), which determine the masses and the decay constants of the pions and kaons at NLO of the chiral SU(3) expansion, is displayed in Fig. 17, along with the two phenomenological determinations quoted in the above tables. The overall consistency seems fairly convincing. In spite of this apparent consistency, there is a point that needs to be clarified as soon as possible. Some collaborations (RBC/UKQCD and PACS-CS) find that they are having difficulties in fitting their partially quenched data to the respective formulae for pion masses above \(\simeq \) 400 MeV. Evidently, this indicates that the data is stretching the regime of validity of these formulae. To date it is, however, not clear which subset of the data causes the troubles, whether it is the unitary part extending to too large values of the quark masses or whether it is due to \(m^\mathrm {val}/m^\mathrm {sea}\) differing too much from one. In fact, little is known, in the framework of partially quenched \(\chi \)PT, about the shape of the region of applicability in the \(m^\mathrm {val}\) versus \(m^\mathrm {sea}\) plane for fixed \(N_{ f}\). This point has also been emphasized in Ref. [379].

To date only the computations MILC 10 [36] (as an update of MILC 09 and MILC 09A) and HPQCD 13A [33] are free of red tags. Since they use different \(N_{ f}\) (in the former case \(N_{ f}=2\,+\,1\), in the latter case \(N_{ f}=2\,+\,1\,+\,1\)) we stay away from averaging them. Hence the situation remains unsatisfactory in the sense that for each \(N_{ f}\) only a single determination of high standing is available. Accordingly, for the phenomenologically oriented reader there is no alternative to using the results of MILC 10 [36] for \(N_{ f}=2\,+\,1\) and HPQCD 13A [33] for \(N_{ f}=2\,+\,1\,+\,1\), as given in Table 25.

Fig. 16 Summary of the \(\pi \pi \) scattering lengths \(a_0^0M_{\pi }\) (top) and \(a_0^2M_{\pi }\) (bottom)

5.3.2 Epilogue

In this subsection we find ourselves again in the unpleasant situation that only one qualifying (“all-green”) determination is available (at a given \(N_{ f}\)) for several LECs in the SU(3) framework, both at LO and at NLO. Obviously the phenomenologically oriented reader is encouraged to use such a value (as provided in our tables) and to cite the original work. Again our hope is that further computations will become available in the coming years, such that a fair comparison of different works will become possible both at \(N_{ f}=2\,+\,1\) and \(N_{ f}=2\,+\,1\,+\,1\).

In the large-\(N_c\) limit, the Zweig rule becomes exact, but the quarks have \(N_c=3\). The work done on the lattice is ideally suited to confirm or disprove the approximate validity of this rule for QCD. Two of the coupling constants entering the effective SU(3) Lagrangian at NLO disappear when \(N_c\) is sent to infinity: \(L_4\) and \(L_6\). The upper part of Table 25 and the left panels of Fig. 17 show that the lattice results for these quantities are in good agreement. At the scale \(\mu =M_\rho \), \(L_4\) and \(L_6\) are consistent with zero, indicating that these constants do approximately obey the Zweig rule. As mentioned above, the ratios \(F/F_0\), \(B/B_0\) and \(\Sigma /\Sigma _0\) also test the validity of this rule. Their expansion in powers of \(m_s\) starts with unity and the contributions of first order in \(m_s\) are determined by the constants \(L_4\) and \(L_6\), but they also contain terms of higher order. Apart from measuring the Zweig-rule violations, an accurate determination of these ratios will thus also allow us to determine the range of \(m_s\) where the first few terms of the expansion represent an adequate approximation. Unfortunately, at present, the uncertainties in the lattice data on these ratios are too large to draw conclusions, both concerning the relative size of the subsequent terms in the chiral series and concerning the magnitude of the Zweig-rule violations. The data seems to confirm the paramagnetic inequalities [427], which require \(F/F_0>1\), \(\Sigma /\Sigma _0>1\), and it appears that the ratio \(B/B_0\) is also larger than unity, but the numerical results need to be improved before further conclusions can be drawn.

The matching formulae in Ref. [244] can be used to calculate the SU(2) couplings \({{\bar{\ell }}}_i\) from the SU(3) couplings \(L_j\). Results obtained in this way are included in Table 21, namely, the entries explicitly labelled “SU(3)-fit” as well as MILC 10. Within the still rather large errors, the converted LECs from the SU(3) fits agree with those directly determined within SU(2) \(\chi \)PT. We plead with every collaboration performing \(N_{ f}=2\,+\,1\) simulations to also directly analyse their data in the SU(2) framework. In practice, lattice simulations are performed at values of \(m_s\) close to the physical value and the results are then corrected for the difference of \(m_s\) from its physical value. If simulations with more than one value of \(m_s\) have been performed, this can be done by interpolation. Alternatively one can use the technique of re-weighting (for a review see, e.g., Ref. [436]) to shift \(m_s\) to its physical value. From a conceptual view, the most pressing issue is the question about the convergence of the SU(3) framework for \(m_s\simeq m_s^\mathrm {phys}\). In line with what has been said in the very first paragraph of this section, we plead with every collaboration involved in \(N_{ f}=2\,+\,1\) (or \(2\,+\,1\,+\,1\)) simulations, to add ensembles with \(m_s \ll m_s^\mathrm {phys}\) to their database, as this allows them to address the issue properly.

5.3.3 Outlook

A relatively new development is that several lattice groups have started extracting low-energy constants from \(\pi \)\(\pi \) scattering data. In the isospin \(I=0\) and \(I=2\) channels the results of these studies are typically expressed in SU(2) terminology [i.e., through the linear combinations of \({{\bar{\ell }}}_i\) that appear in Eqs. (110, 111)], even if the studies are performed with \(N_{ f}=2\,+\,1\) or \(N_{ f}=2\,+\,1\,+\,1\) lattices. This is why the respective compilation, in the form of Table 23, is found in Sect. 5.2. Still, we remind the reader that the most generic way of presenting the results is through the scattering lengths \(a_0^0,a_0^2\), as featured in Fig. 16.

In the isospin \(I=1\) channel the situation is different. The most obvious difference is that this channel is dominated by a low-lying (and fairly broad) resonance, the well-known \(\rho (770)\). Lattice data would naturally include contributions where this resonance features in internal propagators. In the chiral SU(2) and SU(3) frameworks, on the other hand, there is no degree of freedom with the quantum numbers of a vector meson [244, 277]. Its contributions are subsumed in the low-energy constants, and an important insight is that the theory is built in such a way that it would correctly describe the low-energy tail of such contributions [437, 438]. Of course, one may extend the theory so as to include vector mesons as explicit degrees of freedom, but this raises the issue of how to avoid double counting. Another way of phrasing this is to say that the low-energy constants in such an extended theory are logically different from those of \(\chi \)PT, since they should not include the vector meson contributions. Moreover, such extensions of \(\chi \)PT seem to lack a clear-cut power-counting scheme. In any case, since the literature on this topic is mostly in terms of the SU(3) chiral Lagrangian (and its extensions), it is natural to expect that lattice results concerning the \(I=1\) channel will be expressed in terms of SU(3) LECs.

Fig. 17 Low-energy constants that enter the effective SU(3) Lagrangian at NLO, with scale \(\mu =770\,\mathrm {MeV}\). The grey bands labelled as “FLAG average” coincide with the results of MILC 10 [36] for \(N_{ f}=2\,+\,1\) and with HPQCD 13A [33] for \(N_{ f}=2\,+\,1\,+\,1\), respectively

In this spirit we would like to mention that there are considerable efforts, on the lattice, to get a better handle on the \(I=1\) channel of \(\pi \)\(\pi \) scattering; we are aware of Refs. [422, 439–452]. Some of these try to extract the NLO LEC combinations \(2L_4+L_5\) and \(2L_1-L_2\,+\,L_3\), sometimes with a single lattice spacing and with little or limited variation in the pion mass. We feel confident that these calculations will mature quickly, and eventually yield results on LECs (or linear combinations thereof) that might appear here, in the SU(3) section of a future edition.

We should add that there are claims that low-order calculations in extended chiral frameworks might allow for a simpler description of lattice data with an extended range of light (\(m_{ud}\)) and strange (\(m_s\)) quark masses than high-order calculations in the standard (vector-meson free) SU(3) \(\chi \)PT framework, see, e.g., Ref. [453]. While it is too early to jump to conclusions, we see nothing wrong in testing such frameworks as an effective (or model) description of lattice data on masses and decay constants of pseudoscalar mesons. But we caution that whenever LECs are extracted, it is worth scrutinizing the details of how this is done, for reasons that are intricately linked to the “double counting” issue mentioned above.

Last but not least we should mention that also baryon \(\chi \)PT results can be used to learn something about the chiral LECs in the meson sector. For instance Refs. [454, 455] give values for \(2L_6-L_4\), \(2L_8-L_5\), and \(L_8+3L_7\) from three different fits to lattice-QCD baryon masses by other groups. The quoted LECs enter via the pion- and kaon-mass dependence on quark masses. In our view checking whether the indirect determination of SU(3) meson LECs, via baryonic properties, agrees with the direct determination in the meson sector is a promising direction for forthcoming years.

6 Kaon mixing

Authors: P. Dimopoulos, G. Herdoíza, R. Mawhinney

The mixing of neutral pseudoscalar mesons plays an important role in the understanding of the physics of CP violation. In this section we discuss \(K^0 - {{\bar{K}}}^0\) oscillations, which probe the physics of indirect CP violation. Extensive reviews on the subject can be found in Refs. [456–460]. For the most part, we shall focus on kaon mixing in the SM. The case of Beyond-the-Standard-Model (BSM) contributions is discussed in Sect. 6.3.

6.1 Indirect CP violation and \(\epsilon _{K}\) in the SM

Indirect CP violation arises in \(K_L \rightarrow \pi \pi \) transitions through the decay of the \(\mathrm{CP}=+1\) component of \(K_L\) into two pions (which are also in a \(\mathrm{CP}=+1\) state). Its measure is defined as

$$\begin{aligned} \epsilon _{K} = \dfrac{{{{\mathcal {A}}}} [ K_L \rightarrow (\pi \pi )_{I=0}]}{{{{\mathcal {A}}}} [ K_S \rightarrow (\pi \pi )_{I=0}]} \,\, , \end{aligned}$$
(127)

with the final state having total isospin zero. The parameter \(\epsilon _{K}\) may also be expressed in terms of \(K^0 - {{\bar{K}}}^0\) oscillations. In the Standard Model, \(\epsilon _{K}\) receives contributions from: (i) short-distance (SD) physics given by \(\Delta S = 2\) “box diagrams” involving \(W^\pm \) bosons and \(u\), \(c\) and \(t\) quarks; (ii) the long-distance (LD) physics from light hadrons contributing to the imaginary part of the dispersive amplitude \(M_{12}\) used in the two-component description of \(K^0-{\bar{K}}^0\) mixing; (iii) the imaginary part of the absorptive amplitude \(\Gamma _{12}\) from \(K^0-{\bar{K}}^0\) mixing; and (iv) \(\text {Im}(A_0)/\text {Re}(A_0)\), where \(A_0\) is the \(K \rightarrow (\pi \pi )_{I=0}\) decay amplitude. The various factors in this decomposition can vary with phase conventions. In terms of the \(\Delta S = 2\) effective Hamiltonian, \({{{\mathcal {H}}}}_\text {eff}^{\Delta S = 2}\), it is common to represent contribution (i) by

$$\begin{aligned} \text {Im}(M_{12}^\text {SD}) \equiv \frac{1}{2m_K}\text {Im} \left[ \langle {\bar{K}}^0 | {{{\mathcal {H}}}}_\text {eff}^{\Delta S = 2} | K^0 \rangle \right] , \end{aligned}$$
(128)

and contribution (ii) by \(\text {Im}\,M_{12}^\text {LD}\). Contribution (iii) can be related to \(\text {Im}(A_0)/\text {Re}(A_0)\) since \((\pi \pi )_{I=0}\) states provide the dominant contribution to the absorptive part of the integral in \(\Gamma _{12}\). Collecting the various pieces yields the following expression for the \(\epsilon _{K}\) factor [459, 461–464]

$$\begin{aligned} \epsilon _K= & {} \exp (i \phi _\epsilon ) \, \sin (\phi _\epsilon ) \nonumber \\&\times \left[ \frac{\text {Im}(M_{12}^\text {SD})}{\Delta M_K} + \frac{\text {Im}(M_{12}^\text {LD})}{\Delta M_K} + \frac{\text {Im}(A_0)}{\text {Re}(A_0)} \right] , \end{aligned}$$
(129)

where the phase of \(\epsilon _{K}\) is given by

$$\begin{aligned} \phi _\epsilon = \arctan \frac{\Delta M_{K}}{\Delta \Gamma _{K}/2} . \end{aligned}$$
(130)

The quantities \(\Delta M_K\) and \(\Delta \Gamma _K\) are the mass and decay width differences between long- and short-lived neutral kaons. The experimentally known values of the above quantities read [137]:

$$\begin{aligned}&\vert \epsilon _{K} \vert = 2.228(11) \times 10^{-3}, \end{aligned}$$
(131)
$$\begin{aligned}&\phi _\epsilon = 43.52(5)^\circ , \end{aligned}$$
(132)
$$\begin{aligned}&\Delta M_{K} \equiv M_{K_{L}} - M_{K_{S}} = 3.484(6) \times 10^{-12}\, \mathrm{MeV},\quad \end{aligned}$$
(133)
$$\begin{aligned}&\Delta \Gamma _{K} \equiv \Gamma _{K_{S}} - \Gamma _{K_{L}} ~\, = 7.3382(33) \times 10^{-12} \,\mathrm{MeV}, \end{aligned}$$
(134)

where the latter three measurements have been obtained by imposing CPT symmetry.
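As a quick consistency check of Eq. (130) against the quoted inputs, the following minimal sketch evaluates \(\phi _\epsilon \) from the central values of \(\Delta M_K\) and \(\Delta \Gamma _K\) in Eqs. (133) and (134). The variable names are ours and the uncertainties are ignored; the result should land close to the value quoted in Eq. (132).

```python
import math

# Central values from Eqs. (133) and (134), in units of 1e-12 MeV.
# Only the ratio enters, so the common factor 1e-12 MeV drops out.
delta_m_k = 3.484        # K_L - K_S mass difference
delta_gamma_k = 7.3382   # K_S - K_L decay-width difference

# Eq. (130): phi_eps = arctan( Delta M_K / (Delta Gamma_K / 2) )
phi_eps_deg = math.degrees(math.atan(delta_m_k / (delta_gamma_k / 2.0)))

print(f"phi_eps = {phi_eps_deg:.2f} degrees")
```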

We will start by discussing the short-distance effects (i) since they provide the dominant contribution to \(\epsilon _K\). To lowest order in the electroweak theory, the contribution to the \(K^0 - \bar{K}^0\) oscillations arises from so-called box diagrams, in which two W bosons and two “up-type” quarks (i.e., up, charm, top) are exchanged between the constituent down and strange quarks of the K mesons. The loop integration of the box diagrams can be performed exactly. In the limit of vanishing external momenta and external quark masses, the result can be identified with an effective four-fermion interaction, expressed in terms of the effective Hamiltonian

$$\begin{aligned} {{{\mathcal {H}}}}_{\mathrm{eff}}^{\Delta S = 2} = \frac{G_F^2 M_{\mathrm{W}}^2}{16\pi ^2} {{{\mathcal {F}}}}^0 Q^{\Delta S=2} \,\, + \,\, \mathrm{h.c.} \,\,. \end{aligned}$$
(135)

In this expression, \(G_F\) is the Fermi coupling, \(M_{\mathrm{W}}\) the W-boson mass, and

$$\begin{aligned} Q^{\Delta S=2}= & {} \left[ {\bar{s}}\gamma _\mu (1-\gamma _5)d\right] \left[ {\bar{s}}\gamma _\mu (1-\gamma _5)d\right] \nonumber \\\equiv & {} O_\mathrm{VV+AA}-O_{\mathrm{VA+AV}}, \end{aligned}$$
(136)

is a dimension-six, four-fermion operator. The function \({{{\mathcal {F}}}}^0\) is given by

$$\begin{aligned} {{{\mathcal {F}}}}^0 = \lambda _c^2 S_0(x_c) \, + \, \lambda _t^2 S_0(x_t) \, + \, 2 \lambda _c \lambda _t S_0(x_c,x_t), \end{aligned}$$
(137)

where \(\lambda _a = V^*_{as} V_{ad}\), and \(a=c\,,t\) denotes a flavour index. The quantities \(S_0(x_c),\,S_0(x_t)\) and \(S_0(x_c,x_t)\) with \(x_c=m_c^2/M_{\mathrm{W}}^2\), \(x_t=m_t^2/M_{\mathrm{W}}^2\) are the Inami-Lim functions [465], which express the basic electroweak loop contributions without QCD corrections. The contribution of the up quark, which is taken to be massless in this approach, has been taken into account by imposing the unitarity constraint \(\lambda _u + \lambda _c + \lambda _t = 0\).
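To give a feeling for the sizes involved, the sketch below evaluates the single-argument Inami-Lim function in its standard closed form, \(S_0(x) = \frac{4x-11x^2+x^3}{4(1-x)^2} - \frac{3x^3\ln x}{2(1-x)^3}\). This expression is not written out in the text above and is quoted here from the standard literature; the quark and W-boson masses are illustrative assumptions, not values used in this review. The closed form has a removable singularity at \(x=1\), which the sketch does not treat.

```python
import math

def inami_lim_s0(x: float) -> float:
    """Single-argument Inami-Lim function S_0(x), standard closed form (x != 1)."""
    rational = (4.0 * x - 11.0 * x**2 + x**3) / (4.0 * (1.0 - x) ** 2)
    logarithmic = 3.0 * x**3 * math.log(x) / (2.0 * (1.0 - x) ** 3)
    return rational - logarithmic

# Illustrative (assumed) masses in GeV; they are not taken from the text.
M_W = 80.4
m_c = 1.27
m_t = 163.0

x_c = (m_c / M_W) ** 2
x_t = (m_t / M_W) ** 2

s0_c = inami_lim_s0(x_c)
s0_t = inami_lim_s0(x_t)

print(f"x_c = {x_c:.3e}, S0(x_c) = {s0_c:.3e}")
print(f"x_t = {x_t:.3f}, S0(x_t) = {s0_t:.3f}")
```

For \(x \ll 1\) the function behaves like \(S_0(x)\simeq x\), which is why the charm term in Eq. (137) is numerically small compared with the top term despite the larger CKM factor.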

When strong interactions are included, \(\Delta {S}=2\) transitions can no longer be discussed at the quark level. Instead, the effective Hamiltonian must be considered between mesonic initial and final states. Since the strong coupling is large at typical hadronic scales, the resulting weak matrix element cannot be calculated in perturbation theory. The operator product expansion (OPE) does, however, factorize long- and short-distance effects. For energy scales below the charm threshold, the \(K^0-{{\bar{K}}}^0\) transition amplitude of the effective Hamiltonian can be expressed as

$$\begin{aligned}&\langle {{\bar{K}}}^0 \vert {{{\mathcal {H}}}}_{\mathrm{eff}}^{\Delta S = 2} \vert K^0 \rangle \,\, \nonumber \\&\quad = \,\, \frac{G_F^2 M_{\mathrm{W}}^2}{16 \pi ^2} \Big [ \lambda _c^2 S_0(x_c) \eta _1 + \, \lambda _t^2 S_0(x_t) \eta _2 \, + \, 2 \lambda _c \lambda _t S_0(x_c,x_t) \eta _3 \Big ] \nonumber \\&\qquad \times \left( \frac{{\bar{g}}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)} \exp \bigg \{ \int _0^{{\bar{g}}(\mu )} \, dg \, \bigg ( \frac{\gamma (g)}{\beta (g)} \, + \, \frac{\gamma _0}{\beta _0g} \bigg ) \bigg \} \nonumber \\&\qquad \times \langle {{\bar{K}}}^0 \vert Q^{\Delta S=2}_{\mathrm{R}} (\mu ) \vert K^0 \rangle \,\, + \,\, \mathrm{h.c.} \,\, , \end{aligned}$$
(138)

where \({\bar{g}}(\mu )\) and \(Q^{\Delta S=2}_{\mathrm{R}}(\mu )\) are the renormalized gauge coupling and four-fermion operator in some renormalization scheme. The factors \(\eta _1, \eta _2\) and \(\eta _3\) depend on the renormalized coupling \({\bar{g}}\), evaluated at the various flavour thresholds \(m_t, m_b, m_c\) and \( M_{\mathrm{W}}\), as required by the OPE and RG-running procedure that separate high- and low-energy contributions. Explicit expressions can be found in Ref. [458] and references therein, except that \(\eta _1\) and \(\eta _3\) have been calculated to NNLO in Refs. [466, 467], respectively. We follow the same conventions for the RG equations as in Ref. [458]. Thus the Callan-Symanzik function and the anomalous dimension \(\gamma ({\bar{g}})\) of \(Q^{\Delta S=2}\) are defined by

$$\begin{aligned} \dfrac{d {\bar{g}}}{d \ln \mu } = \beta ({\bar{g}})\,,\quad \dfrac{d Q^{\Delta S=2}_{\mathrm{R}}}{d \ln \mu } = -\gamma ({\bar{g}})\,Q^{\Delta S=2}_{\mathrm{R}} \,\,, \end{aligned}$$
(139)

with perturbative expansions

$$\begin{aligned} \beta (g)= & {} -\beta _0 \dfrac{g^3}{(4\pi )^2} \,\, - \,\, \beta _1 \dfrac{g^5}{(4\pi )^4} \,\, - \,\, \cdots , \nonumber \\ \gamma (g)= & {} \gamma _0 \dfrac{g^2}{(4\pi )^2} \,\, + \,\, \gamma _1 \dfrac{g^4}{(4\pi )^4} \,\, + \,\, \cdots \,. \end{aligned}$$
(140)

We stress that \(\beta _0, \beta _1\) and \(\gamma _0\) are universal, i.e., scheme independent. As for \(K^0-{{\bar{K}}}^0\) mixing, this is usually considered in the naive dimensional regularization (NDR) scheme of \({\overline{\mathrm {MS}}}\), and below we specify the perturbative coefficient \(\gamma _1\) in that scheme:

$$\begin{aligned} \begin{aligned} \beta _0&= \left\{ \frac{11}{3}N-\frac{2}{3}N_{ f}\right\} ,\\ \beta _1&= \left\{ \frac{34}{3}N^2-N_{ f}\left( \frac{13}{3}N-\frac{1}{N} \right) \right\} , \\ \gamma _0&= \frac{6(N-1)}{N}, \qquad \\ \gamma _1&= \frac{N-1}{2N} \left\{ -21 + \frac{57}{N} - \frac{19}{3}N + \frac{4}{3}N_{ f}\right\} . \end{aligned} \end{aligned}$$
(141)

Note that for QCD the above expressions must be evaluated for \(N=3\) colours, while \(N_{ f}\) denotes the number of active quark flavours. As already stated, Eq. (138) is valid at scales below the charm threshold, after all heavier flavours have been integrated out, i.e., \(N_{ f}= 3\).
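For orientation, the coefficients of Eq. (141) can be evaluated exactly for the case relevant to Eq. (138), \(N=3\) and \(N_{ f}=3\). The sketch below is a direct transcription of Eq. (141) using exact rational arithmetic; the function name is ours.

```python
from fractions import Fraction

def rg_coefficients(n_colours: int, n_flavours: int):
    """Return (beta_0, beta_1, gamma_0, gamma_1) of Eq. (141) as exact fractions."""
    N = Fraction(n_colours)
    Nf = Fraction(n_flavours)
    beta_0 = Fraction(11, 3) * N - Fraction(2, 3) * Nf
    beta_1 = Fraction(34, 3) * N**2 - Nf * (Fraction(13, 3) * N - 1 / N)
    gamma_0 = 6 * (N - 1) / N
    gamma_1 = (N - 1) / (2 * N) * (
        -21 + 57 / N - Fraction(19, 3) * N + Fraction(4, 3) * Nf
    )
    return beta_0, beta_1, gamma_0, gamma_1

beta_0, beta_1, gamma_0, gamma_1 = rg_coefficients(3, 3)
print(beta_0, beta_1, gamma_0, gamma_1)  # prints: 9 64 4 -17/3
```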

In Eq. (138), the terms proportional to \(\eta _1,\,\eta _2\) and \(\eta _3\), multiplied by the contributions containing \({\bar{g}}(\mu )^2\), correspond to the Wilson coefficient of the OPE, computed in perturbation theory. Its dependence on the renormalization scheme and scale \(\mu \) is canceled by that of the weak matrix element \(\langle {{\bar{K}}}^0 \vert Q^{\Delta S=2}_{\mathrm{R}} (\mu ) \vert K^0 \rangle \). The latter corresponds to the long-distance effects of the effective Hamiltonian and must be computed nonperturbatively. For historical, as well as technical reasons, it is convenient to express it in terms of the B-parameter \(B_{K}\), defined as

$$\begin{aligned} B_{K}(\mu )= \frac{{\left\langle {\bar{K}}^0\left| Q^{\Delta S=2}_\mathrm{R}(\mu )\right| K^0\right\rangle } }{ {\frac{8}{3}f_{K}^2m_{K}^2}} \,\, . \end{aligned}$$
(142)

The four-quark operator \(Q^{\Delta S=2}(\mu )\) is renormalized at scale \(\mu \) in some regularization scheme, for instance, NDR-\({\overline{\mathrm {MS}}}\). Assuming that \(B_{K}(\mu )\) and the anomalous dimension \(\gamma (g)\) are both known in that scheme, the renormalization group independent (RGI) B-parameter \({\hat{B}}_{K}\) is related to \(B_{K}(\mu )\) by the exact formula

$$\begin{aligned} {\hat{B}}_{K}= & {} \left( \frac{{\bar{g}}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)} \nonumber \\&\times \exp \bigg \{ \int _0^{{\bar{g}}(\mu )} dg \, \bigg ( \frac{\gamma (g)}{\beta (g)} + \frac{\gamma _0}{\beta _0g} \bigg ) \bigg \} \, B_{K}(\mu ) . \end{aligned}$$
(143)

At NLO in perturbation theory the above reduces to

$$\begin{aligned} {\hat{B}}_{K}= & {} \left( \frac{{\bar{g}}(\mu )^2}{4\pi }\right) ^{- \gamma _0/(2\beta _0)} \nonumber \\&\times \left\{ 1+\dfrac{{\bar{g}}(\mu )^2}{(4\pi )^2}\left[ \frac{\beta _1\gamma _0-\beta _0\gamma _1}{2\beta _0^2} \right] \right\} \, B_{K}(\mu ) . \end{aligned}$$
(144)

To this order, this is the scale-independent product of all \(\mu \)-dependent quantities in Eq. (138).
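The size of the NLO conversion factor in Eq. (144) can be illustrated numerically. The sketch below uses the coefficients of Eq. (141) for \(N=3\), \(N_{ f}=3\) and writes \({\bar{g}}(\mu )^2/(4\pi )=\alpha _s(\mu )\). The value of \(\alpha _s\) is a hypothetical input chosen for illustration only; it is not the coupling used to obtain the conversion factors quoted later in this section, so only rough agreement with those numbers should be expected.

```python
import math

# Coefficients of Eq. (141) evaluated for N = 3 colours and N_f = 3 flavours.
BETA_0, BETA_1 = 9.0, 64.0
GAMMA_0, GAMMA_1 = 4.0, -17.0 / 3.0

def rgi_conversion_factor_nlo(alpha_s: float) -> float:
    """B-hat_K / B_K(mu) from Eq. (144), with alpha_s = g-bar(mu)^2 / (4 pi)."""
    leading = alpha_s ** (-GAMMA_0 / (2.0 * BETA_0))
    nlo_coeff = (BETA_1 * GAMMA_0 - BETA_0 * GAMMA_1) / (2.0 * BETA_0**2)
    # g-bar^2 / (4 pi)^2 equals alpha_s / (4 pi)
    return leading * (1.0 + alpha_s / (4.0 * math.pi) * nlo_coeff)

alpha_s_assumed = 0.30  # hypothetical three-flavour coupling near mu = 2 GeV
factor = rgi_conversion_factor_nlo(alpha_s_assumed)
print(f"B-hat_K / B_K = {factor:.3f}")
```

The leading factor \(\alpha _s^{-2/9}\) accounts for most of the enhancement; the NLO bracket adds a correction at the level of a few percent for couplings of this size.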

Lattice-QCD calculations provide results for \(B_K(\mu )\). These results are, however, usually obtained in intermediate schemes other than the continuum \({\overline{\mathrm {MS}}}\) scheme used to calculate the Wilson coefficients appearing in Eq. (138). Examples of intermediate schemes are the RI/MOM scheme [468] (also dubbed the “Rome-Southampton method”) and the Schrödinger functional (SF) scheme [172]. These schemes are used as they allow a nonperturbative renormalization of the four-fermion operator, using an auxiliary lattice simulation. This allows \(B_K(\mu )\) to be calculated with percent-level accuracy, as described below.

In order to make contact with phenomenology, however, and in particular to use the results presented above, one must convert from the intermediate scheme to the \({\overline{\mathrm {MS}}}\) scheme or to the RGI quantity \({\hat{B}}_{K}\). This conversion relies on 1- or 2-loop perturbative matching calculations, the truncation errors in which are, for many recent calculations, the dominant source of error in \({\hat{B}}_{K}\) (see, for instance, Refs. [10, 57, 58, 156, 469]). While this scheme-conversion error is not, strictly speaking, an error of the lattice calculation itself, it must be included in results for the quantities of phenomenological interest, namely, \(B_K({\overline{\mathrm {MS}}},2\,\mathrm{GeV})\) and \({\hat{B}}_{K}\). Incidentally, we remark that this truncation error is estimated in different ways and that its relative contribution to the total error can differ considerably among the various lattice calculations. We note that this error can be minimized by matching between the intermediate scheme and \({\overline{\mathrm {MS}}}\) at as large a scale \(\mu \) as possible (so that the coupling which determines the rate of convergence is minimized). Recent calculations have pushed the matching \(\mu \) up to the range \(3-3.5\,\)GeV. This is possible because of the use of nonperturbative RG running determined on the lattice [10, 56, 156]. The Schrödinger functional offers the possibility to run nonperturbatively to scales \(\mu \sim M_{\mathrm{W}}\) where the truncation error can be safely neglected. However, so far this has been applied only for two flavours for \(B_K\) in Ref. [470] and for the case of the BSM bag parameters in Ref. [471], see more details in Sect. 6.3.

Perturbative truncation errors in Eq. (138) also affect the Wilson coefficients \(\eta _1\), \(\eta _2\) and \(\eta _3\). It turns out that the largest uncertainty arises from the charm quark contribution \(\eta _1=1.87(76)\) [466]. Although it is now calculated at NNLO, the series shows poor convergence. The net effect from the uncertainty on \(\eta _1\) on the amplitude in Eq. (138) is larger than that of present lattice calculations of \(B_K\).

We will now proceed to discuss the remaining contributions to \(\epsilon _K\) in Eq. (129). An analytical estimate of the leading contribution from \(\mathrm{Im} M_{12}^\text {LD}\) based on \(\chi \)PT shows that it is approximately proportional to \(\xi \equiv \mathrm{{Im}}(A_0)/\mathrm{{Re}}(A_0)\), so that Eq. (129) can be written as follows [463, 464]

$$\begin{aligned} \epsilon _{K} \, = \, \exp (i \phi _\epsilon ) \,\, \sin (\phi _\epsilon ) \,\, \left[ \frac{\text {Im}(M_{12}^{\mathrm{SD}})}{\Delta M_K } \,\,\, + \,\,\, \rho \,\xi \,\, \right] \, , \end{aligned}$$
(145)

where the deviation of \(\rho \) from one parameterizes the long-distance effects in \(\mathrm{Im} M_{12}\).

An estimate of \(\xi \) has been obtained from a direct evaluation of the ratio of amplitudes \(\mathrm{Im}(A_0)/\mathrm{{Re}}(A_0)\) where \(\mathrm{{Im}}(A_0)\) is determined from a lattice-QCD computation [472] at one value of the lattice spacing, while \(\mathrm{Re}(A_0) \simeq |A_0|\) and the value \(|A_0| = 3.320(2) \times 10^{-7}\) GeV are used based on the relevant experimental input [137] from the decay to two pions. This leads to a result for \(\xi \) with a rather large relative error,

$$\begin{aligned} \xi = -0.6(5)\cdot 10^{-4}. \end{aligned}$$
(146)

A more precise estimate has been obtained through a lattice-QCD computation of the ratio of amplitudes \(\mathrm{Im}(A_2)/\mathrm{Re}(A_2)\) [473] where the continuum limit result is based on data at two values of the lattice spacing; \(A_2\) denotes the \(\Delta {I}=3/2\) \(K\rightarrow \pi \pi \) decay amplitude. For the computation of \(\xi \), the experimental values of \(\mathrm{Re}(\epsilon ^{\prime }/\epsilon )\), \(|\epsilon _K|\) and \(\omega = \mathrm{Re}(A_2)/\mathrm{Re}(A_0)\) have been used. The result for \(\xi \) reads

$$\begin{aligned} \xi = -1.6 (2)\cdot 10^{-4}. \end{aligned}$$
(147)

A phenomenological estimate can also be obtained from the relationship of \(\xi \) to \(\mathrm{Re} (\epsilon ^\prime /\epsilon )\), using the experimental value of the latter and further assumptions concerning the estimate of hadronic contributions. The corresponding value of \(\xi \) reads [463, 464]

$$\begin{aligned} \xi = -6.0(1.5)\cdot 10^{-2}\sqrt{2}\,|\epsilon _K| = -1.9(5)\cdot 10^{-4}. \end{aligned}$$
(148)

We note that the use of the experimental value for \(\mathrm{Re}(\epsilon ^\prime /\epsilon )\) is based on the assumption that it is free from New Physics contributions. The value of \(\xi \) can then be combined with a \(\chi \)PT-based estimate for the long-range contribution, \(\rho =0.6(3)\) [464]. Overall, the combination \(\rho \xi \) appearing in Eq. (145) leads to a suppression of the SM prediction of \(|\epsilon _K|\) by about \(3(2)\%\) relative to the experimental measurement of \(|\epsilon _K|\) given in Eq. (131), regardless of whether the phenomenological estimate of \(\xi \) [see Eq. (148)] or the most precise lattice result [see Eq. (147)] is used. The uncertainty in the suppression factor is dominated by the error on \(\rho \). Although this is a small correction, we note that its contribution to the error of \(\epsilon _K\) is larger than that arising from the value of \(B_{K}\) reported below.
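The arithmetic behind Eq. (148) and the quoted suppression of about 3% can be retraced with central values only. The sketch below assumes that the suppression is measured as the size of the \(\rho \xi \) term in Eq. (145), multiplied by \(\sin (\phi _\epsilon )\), relative to the experimental \(|\epsilon _K|\); this reading and the helper name `relative_shift` are ours, and no error propagation is attempted.

```python
import math

eps_k_abs = 2.228e-3   # |epsilon_K|, Eq. (131)
phi_eps_deg = 43.52    # phi_epsilon in degrees, Eq. (132)
rho = 0.6              # chiPT-based central value quoted in the text

# Eq. (148): phenomenological estimate of xi (central value)
xi_pheno = -6.0e-2 * math.sqrt(2.0) * eps_k_abs
# Eq. (147): most precise lattice estimate of xi (central value)
xi_lattice = -1.6e-4

def relative_shift(xi: float) -> float:
    """rho*xi term of Eq. (145), times sin(phi_eps), relative to |epsilon_K|."""
    return math.sin(math.radians(phi_eps_deg)) * rho * xi / eps_k_abs

for label, xi in (("pheno", xi_pheno), ("lattice", xi_lattice)):
    print(f"{label}: xi = {xi:.2e}, shift = {100 * relative_shift(xi):.1f}%")
```

With \(\rho =0.6(3)\), a 50% relative uncertainty on \(\rho \) carries over directly to the shift, which is consistent with the error on the suppression being dominated by \(\rho \).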

Efforts are under way to compute the long-distance contributions to \(\epsilon _{K}\) [474] and to the \(K_L-K_S\) mass difference in lattice QCD [475–478]. However, the results are not yet precise enough to improve the accuracy in the determination of the parameter \(\rho \).

The lattice-QCD study of \(K \rightarrow \pi \pi \) decays provides crucial input to the SM prediction of \(\epsilon _{K}\). Besides the RBC-UKQCD collaboration programme [472, 473] using domain-wall fermions, an approach based on improved Wilson fermions [479, 480] has presented a determination of the \(K \rightarrow \pi \pi \) decay amplitudes, \(A_0\) and \(A_2\), at unphysical quark masses. A first proposal aiming at the inclusion of electromagnetism in lattice-QCD calculations of these decays was reported in Ref. [481]. For an ongoing analysis of the scaling with the number of colours of \(K \rightarrow \pi \pi \) decay amplitudes using lattice-QCD computations, we refer to Refs. [482, 483].

Finally, we note that \(\epsilon _{K}\) receives a contribution from \(|V_{cb}|\) through the \(\lambda _t\) parameter in Eq. (137). The present uncertainty on \(|V_{cb}|\) has a significant impact on the error of \(\epsilon _{K}\) (see, e.g., Ref. [484] and a recent update [485]).

6.2 Lattice computation of \(B_{K}\)

Lattice calculations of \(B_{K}\) are affected by the same systematic effects discussed in previous sections. However, the issue of renormalization merits special attention. The reason is that the multiplicative renormalizability of the relevant operator \(Q^{\Delta S=2}\) is lost once the regularized QCD action ceases to be invariant under chiral transformations. For Wilson fermions, \(Q^{\Delta S=2}\) mixes with four additional dimension-six operators, which belong to different representations of the chiral group, with mixing coefficients that are finite functions of the gauge coupling. This complicated renormalization pattern was identified as the main source of systematic error in earlier, mostly quenched calculations of \(B_{K}\) with Wilson quarks. It can be bypassed via the implementation of specifically designed methods, which are either based on Ward identities [486] or on a modification of the Wilson quark action, known as twisted mass QCD [487,488,489].

An advantage of staggered fermions is the presence of a remnant U(1) chiral symmetry. However, at nonvanishing lattice spacing, the symmetry among the extra unphysical degrees of freedom (tastes) is broken. As a result, mixing with other dimension-six operators cannot be avoided in the staggered formulation, which complicates the determination of the B-parameter. In general, taste conserving mixings are implemented directly in the lattice computation of the matrix element. The effects of the broken taste symmetry are usually treated through an effective field theory, staggered Chiral Perturbation Theory (S\(\chi \)PT) [327, 490], parameterizing the quark-mass and lattice-spacing dependences.

Fermionic lattice actions based on the Ginsparg-Wilson relation [491] are invariant under the chiral group, and hence four-quark operators such as \(Q^{\Delta S=2}\) renormalize multiplicatively. However, depending on the particular formulation of Ginsparg-Wilson fermions, residual chiral symmetry breaking effects may be present in actual calculations. For instance, in the case of domain-wall fermions, the finiteness of the extra 5th dimension implies that the decoupling of modes with different chirality is not exact, which produces a residual nonzero quark mass in the chiral limit. Furthermore, whether a significant mixing with dimension-six operators of different chirality is induced must be investigated on a case-by-case basis.

The only existing lattice-QCD calculation of \(B_K\) with \(N_{ f}=2\,+\,1\,+\,1\) dynamical quarks [55] was reviewed in the FLAG 16 report. Considering that no direct evaluation of the size of the excess of charm quark effects included in \(N_{ f}=2\,+\,1\,+\,1\) computations of \(B_K\) has appeared since then, we wish to reiterate a discussion about a few related conceptual issues.

As described in Sect. 6.1, kaon mixing is expressed in terms of an effective four-quark interaction \(Q^{{\Delta }S=2}\), considered below the charm threshold. When the matrix element of \(Q^{{\Delta }S=2}\) is evaluated in a theory that contains a dynamical charm quark, the resulting estimate for \(B_K\) must then be matched to the three-flavour theory that underlies the effective four-quark interaction (see Footnote 37). In general, the matching of \(2\,+\,1\)-flavour QCD with the theory containing \(2\,+\,1\,+\,1\) flavours of sea quarks below the charm threshold can be accomplished by adjusting the coupling and quark masses of the \(N_{ f}=2\,+\,1\) theory so that the two theories match at energies \(E<m_c\). The corrections associated with this matching are of order \((E/m_c)^2\), since the subleading operators have dimension eight [492].

When the kaon mixing amplitude is considered, the matching also involves the relation between the relevant box graphs and the effective four-quark operator. In this case, corrections of order \((E/m_c)^2\) arise not only from the charm quarks in the sea, but also from the valence sector, since the charm quark propagates in the box diagrams. We note that the original derivation of the effective four-quark interaction is valid up to corrections of order \((E/m_c)^2\). The kaon mixing amplitudes evaluated in the \(N_{ f}=2\,+\,1\) and \(2\,+\,1\,+\,1\) theories are thus subject to corrections of the same order in \(E/m_c\) as the derivation of the conventional four-quark interaction.

Regarding perturbative QCD corrections at the scale of the charm quark mass on the amplitude in Eq. (138), the uncertainty on the \(\eta _1\) and \(\eta _3\) factors is of \({\mathcal {O}}(\alpha _s(m_c)^3)\) [466, 467], while that on \(\eta _2\) is of \({\mathcal {O}}(\alpha _s(m_c)^2)\) [493]. On the other hand, a naive power counting argument suggests that the corrections of order \((E/m_c)^2\) due to dynamical charm-quark effects in the matching of the amplitudes are suppressed by powers of \(\alpha _s(m_c)\) and by a factor of \(1/N_c\). It is therefore essential that any forthcoming calculation of \(B_K\) with \(N_{ f}=2\,+\,1\,+\,1\) flavours addresses properly the size of these residual dynamical charm effects in a quantitative way.

Table 27 Results for the kaon B-parameter in QCD with \(N_{ f}=2\,+\,1\,+\,1\) and \(N_{ f}=2\,+\,1\) dynamical flavours, together with a summary of systematic errors. Any available information about nonperturbative running is indicated in the column “running”, with details given at the bottom of the table
Table 28 Results for the kaon B-parameter in QCD with \(N_{ f}=2\) dynamical flavours, together with a summary of systematic errors. Any available information about nonperturbative running is indicated in the column “running”, with details given at the bottom of the table

Another issue in this context is how the lattice scale and the physical values of the quark masses are determined in the \(2\,+\,1\) and \(2\,+\,1\,+\,1\) flavour theories. Here it is important to consider in which way the quantities used to fix the bare parameters are affected by a dynamical charm quark. Apart from a brief discussion in Ref. [55], these issues have not yet been directly addressed in the literature (see Footnote 38). Given the hierarchy of scales between the charm quark mass and that of \(B_K\), we expect these errors to be modest, but a more quantitative understanding is needed as statistical errors on \(B_K\) are reduced. Within this review we will not discuss this issue further. However, we wish to point out that the present discussion also applies to \(N_{ f}=2\,+\,1\,+\,1\) computations of the kaon BSM B-parameters discussed in Sect. 6.3.

A compilation of results with \(N_{ f}=2, 2\,+\,1\) and \(2\,+\,1\,+\,1\) flavours of dynamical quarks is shown in Tables 27 and 28, as well as Fig. 18. An overview of the quality of systematic error studies is represented by the colour coded entries in Tables 27 and 28. In Appendix B.4 we gather the simulation details and results that have appeared since the previous FLAG review [3]. The values of the most relevant lattice parameters, and comparative tables on the various estimates of systematic errors are also collected.

Some of the groups whose results are listed in Tables 27 and 28 do not quote results for both \(B_{K}(\overline{\mathrm{MS}},2\,\mathrm{GeV})\) – which we denote by the shorthand \(B_{K}\) from now on – and \({\hat{B}}_{K}\). This concerns Refs. [59, 494] for \(N_{ f}=2\), Refs. [10, 57, 58, 156] for \(2\,+\,1\) and Ref. [55] for \(2\,+\,1\,+\,1\) flavours. In these cases we perform the conversion ourselves by evaluating the proportionality factor in Eq. (144) using perturbation theory at NLO at a renormalization scale \(\mu =2\,\mathrm {GeV}\). For \(N_{ f}=2\,+\,1\), by using the world average value \(\Lambda _{{\overline{\mathrm {MS}}}}^{(3)}=332\) MeV from PDG [137] and the 4-loop \(\beta \)-function we obtain \({\hat{B}}_{K}/B_{K}=1.369\) in the three-flavour theory. Had we used the 5-loop \(\beta \)-function we would get \({\hat{B}}_{K}/B_{K}=1.373\). If we use instead the average lattice result from Sect. 9 of the present FLAG report, \(\Lambda _{{\overline{\mathrm {MS}}}}^{(3)}=343\) MeV, together with the 4- and 5-loop \(\beta \)-function, we obtain \({\hat{B}}_{K}/B_{K}=1.365\) and \({\hat{B}}_{K}/B_{K}=1.369\), respectively. In FLAG 16, we used \({\hat{B}}_{K}/B_{K}=1.369\) based on the 2014 edition of the PDG [170]. The relative deviation among these various estimates is below the 3 permille level and amounts to a tiny fraction of the uncertainty on the average value of the B-parameter. We have therefore used in this edition the value \({\hat{B}}_{K}/B_{K}=1.369\), which was also used in FLAG 16. The same value for the conversion factor has also been applied to the result computed in QCD with \(N_{ f}=2\,+\,1\,+\,1\) flavours of dynamical quarks [55].
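The statement about the spread of the conversion factors can be checked directly from the four numbers quoted above. The sketch below assumes that the deviation is measured relative to the adopted value 1.369; the dictionary labels are ours.

```python
# Conversion factors B-hat_K / B_K quoted in the text for the three-flavour theory.
estimates = {
    "PDG Lambda, 4-loop": 1.369,
    "PDG Lambda, 5-loop": 1.373,
    "FLAG Lambda, 4-loop": 1.365,
    "FLAG Lambda, 5-loop": 1.369,
}
adopted = 1.369

# Largest deviation of any estimate from the adopted value, as a fraction.
max_rel_dev = max(abs(value - adopted) / adopted for value in estimates.values())
print(f"largest relative deviation: {1000 * max_rel_dev:.1f} permille")
```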

Fig. 18 Recent unquenched lattice results for the RGI B-parameter \({\hat{B}}_{K}\). The grey bands indicate our global averages described in the text. For \(N_{ f}=2\,+\,1\,+\,1\) and \(N_{ f}=2\) the global estimates coincide with the results by ETM 12D and ETM 10A, respectively

In two-flavour QCD one can insert into the NLO expressions for \(\alpha _s\) the estimate \(\Lambda _{{\overline{\mathrm {MS}}}}^{(2)}=330\) MeV, which is the average value for \(N_{ f}=2\) obtained in Sect. 9, and get \({\hat{B}}_K/B_K = 1.365\) and \({\hat{B}}_K/B_K = 1.368\) for running with the 4- and 5-loop \(\beta \)-functions, respectively. We again note that the difference between the conversion factors for \(N_{ f}=2\) and \(N_{ f}=2\,+\,1\) produces a negligible ambiguity, which, in any case, is well below the overall uncertainties in Refs. [59, 494]. We have therefore chosen to apply the conversion factor of 1.369 not only to results obtained for \(N_{ f}=2\,+\,1\) flavours but also to the two-flavour theory (in cases where only one of \({\hat{B}}_K\) and \(B_K\) is quoted). We have indicated explicitly in Table 28 how the conversion factor 1.369 has been applied to the results of Refs. [59, 494]. We wish to encourage authors to provide both \({\hat{B}}_{K}\) and \(B_{K}\) together with the values of the parameters appearing in the perturbative running.

We discuss here one recent result for the kaon B-parameter reported by the RBC/UKQCD collaboration, RBC/UKQCD 16 [60], where \(N_f=2\,+\,1\) dynamical quarks have been employed. For a detailed description of previous calculations – and in particular those considered in the computation of the average values – we refer the reader to the FLAG 16 [3] and FLAG 13 [2] reports.

In Ref. [60], RBC/UKQCD presented a determination of \(B_K\) obtained as part of their study of kaon mixing in extensions of the SM. In this calculation two values of the lattice spacing, \(a \simeq 0.11\) and 0.08 fm, are used, employing ensembles generated using the Iwasaki gauge action and the Shamir domain-wall fermionic action. The lattice volumes are \(24^3 \times 64 \times 16\) for the coarse and \(32^3 \times 64 \times 16\) for the fine lattice spacing. The lowest simulated values for the pseudoscalar mass are about 340 MeV and 300 MeV, respectively. The renormalization of four-quark operators was performed nonperturbatively in two RI-SMOM schemes, namely, \(({/}\!\!\!{q},{/}\!\!\!{q})\) and \((\gamma _{\mu }, \gamma _{\mu })\), where the latter was used for the final estimate of \(B_K\). While the procedure to determine \(B_K\) is very similar to RBC/UKQCD 14B, the calculation in RBC/UKQCD 16  [60] is based only on a subset of the ensembles studied in Ref. [10]. Therefore, the result for \(B_K\) reported in Ref. [60] can neither be considered an update of RBC/UKQCD 14B, nor an independent new result.

We now describe our procedure for obtaining global averages. The rules of Sect. 2.1 stipulate that results free of red tags and published in a refereed journal may enter an average. Papers that at the time of writing are still unpublished but are obvious updates of earlier published results can also be taken into account.

There is only one result for \(N_f=2\,+\,1\,+\,1\), computed by the ETM collaboration  [55]. Since it is free of red tags, it qualifies as the currently best global estimate, i.e.,

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:{\hat{B}}_{K} = 0.717(18)(16)\, , \nonumber \\&\quad B_{K}^{\overline{\mathrm {MS}}}(2\,\mathrm{GeV}) = 0.524(13)(12) \quad \,\mathrm {Ref.}~ \text{[55] }. \end{aligned}$$
(149)

The bulk of results for the kaon B-parameter has been obtained for \(N_{ f}=2\,+\,1\). As in the previous editions of the FLAG review  [2, 3] we include the results from SWME  [58, 469, 495], despite the fact that nonperturbative information on the renormalization factors is not available. Instead, the matching factor has been determined in perturbation theory at one loop, but with a sufficiently conservative error of 4.4%. As described above, the result in RBC/UKQCD 16 [60] cannot be considered an update of the earlier estimate in RBC/UKQCD 14B, and hence it is not included in the FLAG average.

Thus, for \(N_{ f}=2\,+\,1\) our global average is based on the results of BMW 11 [56], Laiho 11 [57], RBC/UKQCD 14B [10] and SWME 15A [58]. The last three are the latest updates from a series of calculations by the same collaborations. Our procedure is as follows: in a first step statistical and systematic errors of each individual result for the RGI B-parameter, \({\hat{B}}_{K}\), are combined in quadrature. Next, a weighted average is computed from the set of results. For the final error estimate we take correlations between different collaborations into account. To this end we note that we consider the statistical and finite-volume errors of SWME 15A and Laiho 11 to be correlated, since both groups use the Asqtad ensembles generated by the MILC collaboration. Laiho 11 and RBC/UKQCD 14B both use domain-wall quarks in the valence sector and also employ similar procedures for the nonperturbative determination of matching factors. Hence, we treat the quoted renormalization and matching uncertainties by the two groups as correlated. After constructing the global covariance matrix according to Schmelling [132], we arrive at

$$\begin{aligned} N_{ f}=2\,+\,1:\quad {\hat{B}}_{K} = 0.7625(97)\quad \,\mathrm {Refs.}~{ [10,56{-}58]},\nonumber \\ \end{aligned}$$
(150)

with \(\chi ^2/\mathrm{dof}=0.675\). After applying the NLO conversion factor \({\hat{B}}_{K}/B_{K}^{\overline{\mathrm {MS}}}(2\,\mathrm{GeV})=1.369\), this translates into

$$\begin{aligned}&N_{ f}=2\,+\,1:\quad B_{K}^{\overline{\mathrm {MS}}}(2\,{\mathrm{GeV}})=0.5570(71)\nonumber \\&\quad \,\mathrm {Refs.}~{ [10,56{-}58]}. \end{aligned}$$
(151)

Note that the statistical errors of each calculation entering the global average are small enough to make their results statistically incompatible. It is only because of the relatively large systematic errors that the weighted average produces a value of \({\mathcal {O}}(1)\) for the reduced \(\chi ^2\).
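The averaging steps described above (quadrature combination of errors, inverse-variance weights, and a final error that accounts for correlated error components) can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used for the FLAG averages: the function names are hypothetical, the inputs are toy numbers rather than the results entering Eq. (150), the correlated components are taken to be 100% correlated, and the reduced \(\chi ^2\) is computed without correlations.

```python
import math


def quadrature(*components):
    """Combine independent error components in quadrature."""
    return math.sqrt(sum(c * c for c in components))


def weighted_average(values, sigmas):
    """Return the inverse-variance weighted mean and the normalized weights."""
    inverse_variances = [1.0 / s**2 for s in sigmas]
    norm = sum(inverse_variances)
    weights = [w / norm for w in inverse_variances]
    mean = sum(w * x for w, x in zip(weights, values))
    return mean, weights


def covariance_matrix(sigmas, correlated_parts):
    """Build C with C_ii = sigma_i^2 and C_ij = s_i * s_j for correlated pairs.

    `correlated_parts[(i, j)] = (s_i, s_j)` holds the parts of the errors of
    results i and j that are treated as 100% correlated (an assumption).
    """
    n = len(sigmas)
    cov = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(sigmas):
        cov[i][i] = s**2
    for (i, j), (s_i, s_j) in correlated_parts.items():
        cov[i][j] = cov[j][i] = s_i * s_j
    return cov


def error_of_average(weights, cov):
    """Error of the weighted mean: sqrt(sum_ij w_i w_j C_ij)."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n)
    )
    return math.sqrt(variance)


def reduced_chi2(values, sigmas, mean):
    """Uncorrelated chi^2 per degree of freedom of the average."""
    chi2 = sum(((x - mean) / s) ** 2 for x, s in zip(values, sigmas))
    return chi2 / (len(values) - 1)


# Toy inputs (value, statistical error, systematic error); NOT lattice results.
toy = [(1.00, 0.03, 0.04), (1.04, 0.04, 0.03), (0.98, 0.05, 0.00)]
values = [r[0] for r in toy]
sigmas = [quadrature(r[1], r[2]) for r in toy]

mean, weights = weighted_average(values, sigmas)
err_uncorrelated = error_of_average(weights, covariance_matrix(sigmas, {}))
# Treat the statistical errors of the first two toy results as correlated.
err_correlated = error_of_average(
    weights, covariance_matrix(sigmas, {(0, 1): (0.03, 0.04)})
)
chi2_dof = reduced_chi2(values, sigmas, mean)
```

In the toy example the positive correlation between the first two results enlarges the error of the average, while the central value is unchanged because the weights are fixed by the total errors alone.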

Turning to the results computed for \(N_{ f}=2\) flavours, we note that only the results published in ETM 12D [59] and ETM 10A [494] allow for an extensive investigation of systematic uncertainties. We identify the result from ETM 12D [59], which is an update of ETM 10A, with the currently best global estimate for two-flavour QCD, i.e.,

$$\begin{aligned}&N_{ f}=2:\quad {\hat{B}}_{K} = 0.727(22)(12), \quad \nonumber \\&\quad B_{K}^{\overline{\mathrm {MS}}}(2\,\mathrm{GeV}) = 0.531(16)(9)\quad \,\mathrm {Ref.}~\hbox {[59]}. \end{aligned}$$
(152)

The result in the \({\overline{\mathrm {MS}}}\) scheme has been obtained by applying the same conversion factor of 1.369 as in the three-flavour theory.
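The conversion between the RGI and \({\overline{\mathrm {MS}}}\) values is a simple rescaling of the central value and of each error. The sketch below reproduces the arithmetic for the \(N_{ f}=2\,+\,1\,+\,1\) estimate of Eq. (149) and the \(N_{ f}=2\,+\,1\) average of Eqs. (150) and (151). It assumes, as the text does implicitly, that no uncertainty is assigned to the conversion factor itself; the function name is hypothetical.

```python
CONVERSION = 1.369  # B_K-hat / B_K^MSbar(2 GeV), NLO value adopted in the text


def rgi_to_msbar(central, *errors, factor=CONVERSION):
    """Divide the central value and every quoted error by the conversion factor.

    The factor is treated as exact (assumption: no error is propagated from it).
    """
    return tuple(x / factor for x in (central, *errors))


# N_f = 2+1+1, Eq. (149): B_K-hat = 0.717(18)(16)
nf211 = rgi_to_msbar(0.717, 0.018, 0.016)
# N_f = 2+1, Eq. (150): B_K-hat = 0.7625(97)
nf21 = rgi_to_msbar(0.7625, 0.0097)

print("N_f=2+1+1:", [round(x, 3) for x in nf211])
print("N_f=2+1:  ", [round(x, 4) for x in nf21])
```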

6.3 Kaon BSM B-parameters

We now report on lattice results concerning the matrix elements of operators that encode the effects of physics beyond the Standard Model (BSM) on the mixing of neutral kaons. In this theoretical framework both the SM and BSM contributions add up to reproduce the experimentally observed value of \(\epsilon _K\). Since BSM contributions involve heavy but unobserved particles they are short-distance dominated. The effective Hamiltonian for generic \({\Delta }S=2\) processes including BSM contributions reads

$$\begin{aligned} {{{\mathcal {H}}}}_{\mathrm{eff,BSM}}^{\Delta S=2} = \sum _{i=1}^5 C_i(\mu )Q_i(\mu ), \end{aligned}$$
(153)

where \(Q_1\) is the four-quark operator of Eq. (136) that gives rise to the SM contribution to \(\epsilon _K\). In the so-called SUSY basis introduced by Gabbiani et al. [501] the operators \(Q_2,\ldots ,Q_5\) read (see footnote 39)

$$\begin{aligned} Q_2= & {} \big ({\bar{s}}^a(1-\gamma _5)d^a\big ) \big ({\bar{s}}^b(1-\gamma _5)d^b\big ), \nonumber \\ Q_3= & {} \big ({\bar{s}}^a(1-\gamma _5)d^b\big ) \big ({\bar{s}}^b(1-\gamma _5)d^a\big ), \nonumber \\ Q_4= & {} \big ({\bar{s}}^a(1-\gamma _5)d^a\big ) \big ({\bar{s}}^b(1+\gamma _5)d^b\big ), \nonumber \\ Q_5= & {} \big ({\bar{s}}^a(1-\gamma _5)d^b\big ) \big ({\bar{s}}^b(1+\gamma _5)d^a\big ), \end{aligned}$$
(154)

where a and b denote colour indices. In analogy to the case of \(B_{K}\) one then defines the B-parameters of \(Q_2,\ldots ,Q_5\) according to

$$\begin{aligned} B_i(\mu ) = \frac{\left\langle {\bar{K}}^0\left| Q_i(\mu )\right| K^0 \right\rangle }{N_i\left\langle {\bar{K}}^0\left| {\bar{s}}\gamma _5 d\right| 0\right\rangle \left\langle 0\left| {\bar{s}}\gamma _5 d\right| K^0\right\rangle }, \quad i=2,\ldots ,5.\nonumber \\ \end{aligned}$$
(155)

The factors \(\{N_2,\ldots ,N_5\}\) are given by \(\{-5/3, 1/3, 2, 2/3\}\), and it is understood that \(B_i(\mu )\) is specified in some renormalization scheme, such as \({\overline{\mathrm {MS}}}\) or a variant of the regularization-independent momentum subtraction (RI-MOM) scheme.
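As a small illustration of Eq. (155), the normalization factors can be stored exactly and applied to a set of matrix elements. This is only a sketch: the function name and argument names are hypothetical, and the numbers used in the usage lines are placeholders chosen to make the normalization visible, not lattice data.

```python
from fractions import Fraction

# Normalization factors N_2, ..., N_5 of Eq. (155).
N_FACTORS = {
    2: Fraction(-5, 3),
    3: Fraction(1, 3),
    4: Fraction(2),
    5: Fraction(2, 3),
}


def b_parameter(i, four_quark_me, ps_me_out, ps_me_in):
    """B_i = <Kbar|Q_i|K> / (N_i <Kbar|sbar g5 d|0> <0|sbar g5 d|K>).

    All matrix elements must be renormalized at the same scale and in the
    same scheme; this function only applies the normalization.
    """
    return four_quark_me / (float(N_FACTORS[i]) * ps_me_out * ps_me_in)


# Placeholder inputs: a four-quark matrix element equal to N_i times the
# product of the two-point matrix elements gives B_i = 1 by construction.
b4_example = b_parameter(4, 2.0, 1.0, 1.0)
b2_example = b_parameter(2, -5.0 / 3.0, 1.0, 1.0)
```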

Table 29 Results for the BSM B-parameters \(B_2,\ldots ,B_5\) in the \({\overline{\mathrm {MS}}}\) scheme at a reference scale of 3 GeV. Any available information on nonperturbative running is indicated in the column “running”, with details given at the bottom of the table

The SUSY basis has been adopted in Refs.  [55, 59, 60, 502]. Alternatively, one can employ the chiral basis of Buras, Misiak and Urban  [503]. The SWME collaboration prefers the latter since the anomalous dimension that enters the RG running has been calculated to two loops in perturbation theory  [503]. Results obtained in the chiral basis can be easily converted to the SUSY basis via

$$\begin{aligned} B_3^{\mathrm{SUSY}}={\textstyle \frac{1}{2}}\left( 5B_2^{\mathrm{chiral}} - 3B_3^{\mathrm{chiral}} \right) . \end{aligned}$$
(156)

The remaining B-parameters are the same in both bases. In the following we adopt the SUSY basis and drop the superscript.
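The basis change of Eq. (156) is easy to apply in practice. The following minimal sketch converts a set of chiral-basis B-parameters to the SUSY basis and back; the function names and the input numbers are hypothetical and serve only to show the bookkeeping (error propagation, which would need the correlation between \(B_2\) and \(B_3\), is not attempted).

```python
def chiral_to_susy(b_chiral):
    """Apply Eq. (156): B3_SUSY = (5 * B2_chiral - 3 * B3_chiral) / 2.

    `b_chiral` maps the index i = 2..5 to B_i in the chiral basis.
    B2, B4 and B5 are the same in both bases.
    """
    b_susy = dict(b_chiral)
    b_susy[3] = 0.5 * (5.0 * b_chiral[2] - 3.0 * b_chiral[3])
    return b_susy


def susy_to_chiral(b_susy):
    """Invert Eq. (156): B3_chiral = (5 * B2 - 2 * B3_SUSY) / 3."""
    b_chiral = dict(b_susy)
    b_chiral[3] = (5.0 * b_susy[2] - 2.0 * b_susy[3]) / 3.0
    return b_chiral


# Placeholder central values, not lattice results.
toy_chiral = {2: 0.5, 3: 0.3, 4: 0.9, 5: 0.7}
toy_susy = chiral_to_susy(toy_chiral)
round_trip = susy_to_chiral(toy_susy)
```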

Older quenched results for the BSM B-parameters can be found in Refs. [504,505,506]. For a nonlattice approach to get estimates for the BSM B-parameters see Ref. [507].

Estimates for \(B_2,\ldots ,B_5\) have been reported for QCD with \(N_{ f}=2\) (ETM 12D [59]), \(N_{ f}=2+1\) (RBC/UKQCD 12E  [502], SWME 13A  [495], SWME 14C  [508], SWME 15A  [58], RBC/UKQCD 16 [60, 509]) and \(N_{ f}=2\,+\,1\,+\,1\) (ETM 15  [55]) flavours of dynamical quarks. They are listed and compared in Table 29 and Fig. 19. In general one finds that the BSM B-parameters computed by different collaborations do not show the same level of consistency as the SM kaon mixing parameter \(B_K\) discussed previously. Control over systematic uncertainties (chiral and continuum extrapolations, finite-volume effects) in \(B_2,\ldots ,B_5\) is expected to be at the same level as for \(B_{K}\), as far as the results by ETM 12D, ETM 15 and SWME 15A are concerned. The calculation by RBC/UKQCD 12E has been performed at a single value of the lattice spacing and a minimum pion mass of 290 MeV. Thus, the results do not benefit from the same improvements regarding control over the chiral and continuum extrapolations as in the case of \(B_{K}\)  [10].

Fig. 19 Lattice results for the BSM B-parameters defined in the \({\overline{\mathrm {MS}}}\) scheme at a reference scale of 3 GeV, see Table 29

The RBC/UKQCD collaboration has recently extended its calculation of BSM B-parameters [60, 509] for \(N_f=2\,+\,1\), by considering two values of the lattice spacing, \(a \simeq 0.11\) and 0.08 fm, employing ensembles generated using the Iwasaki gauge action and the Shamir domain-wall fermionic action. The lattice volumes in the RBC/UKQCD 16 calculation are \(24^3 \times 64 \times 16\) for the coarse and \(32^3 \times 64 \times 16\) for the fine lattice spacing, while the lowest simulated values for the pseudoscalar mass are about 340 MeV and 300 MeV, respectively. As in the related calculation of \(B_K\) (RBC/UKQCD 14B [10]) the renormalization of four-quark operators was performed nonperturbatively in two RI-SMOM schemes, namely, \(({/}\!\!\!{q},{/}\!\!\!{q})\) and \((\gamma _{\mu }, \gamma _{\mu })\), where the latter was used for the final estimates of \(B_2,\ldots ,B_5\) quoted in Ref. [60]. By comparing the results obtained in the conventional RI-MOM and the two RI-SMOM schemes, RBC/UKQCD 16 report significant discrepancies for \(B_4\) and \(B_5\) in the \({\overline{\mathrm{MS}}}\) scheme at the scale of 3 GeV, which amount to up to 2.8\(\sigma \) in the case of \(B_5\). By contrast, the agreement for \(B_2\) and \(B_3\) determined with different intermediate schemes is much better. Based on these findings they claim that these discrepancies are due to uncontrolled systematics coming from the Goldstone boson pole subtraction procedure that is needed in the RI-MOM scheme, while pole subtraction effects are much suppressed in RI-SMOM thanks to the fact that the latter is based on nonexceptional momenta. The RBC/UKQCD collaboration has presented an ongoing study [510] in which simulations with two values of the lattice spacing at the physical point and with a third finer lattice spacing at \(M_\pi = 234\) MeV are employed in order to obtain the BSM matrix elements in the continuum limit. Results are still preliminary.

The findings by RBC/UKQCD 16 [60, 509] provide evidence that the nonperturbative determination of the matching factors depends strongly on the details of the implementation of the Rome-Southampton method. The use of nonexceptional momentum configurations in the calculation of the vertex functions produces a significant modification of the renormalization factors, which affects the matching between \({\overline{\mathrm{MS}}}\) and the intermediate momentum subtraction scheme. This effect is most pronounced in \(B_4\) and \(B_5\). Furthermore, it can be noticed that the estimates for \(B_4\) and \(B_5\) from RBC/UKQCD 16 are much closer to those of SWME 15A. At the same time, the results for \(B_2\) and \(B_3\) obtained by ETM 15, SWME 15A and RBC/UKQCD 16 are in good agreement within errors.

A nonperturbative computation of the running of the four-fermion operators contributing to the \(B_2,\ldots ,B_5\) parameters has been carried out with two dynamical flavours using the Schrödinger functional renormalization scheme [471]. Renormalization matrices of the operator basis are used to build step-scaling functions governing the continuum-limit running between hadronic and electroweak scales. A comparison to perturbative results using NLO (two-loop) four-fermion operator anomalous dimensions indicates that, at scales of about 3 GeV, nonperturbative effects can induce a sizeable contribution to the running.

A detailed look at the most recent calculations reported in ETM 15 [55], SWME 15A [58] and RBC/UKQCD 16 [60] reveals that cutoff effects appear to be larger for the BSM B-parameters compared to \(B_K\). Depending on the details of the renormalization procedure and/or the fit ansatz for the combined chiral and continuum extrapolation, the results obtained at the coarsest lattice spacing differ by 15–30%. At the same time the available range of lattice spacings is typically much reduced compared to the corresponding calculations of \(B_K\), as can be seen by comparing the quality criteria in Tables 27 and 29. Hence, the impact of the renormalization procedure and the continuum limit on the BSM B-parameters certainly requires further investigation.

Finally we present our estimates for the BSM B-parameters, quoted in the \({\overline{\mathrm {MS}}}\)-scheme at scale 3 GeV. For \(N_f=2\,+\,1\) our estimate is given by the average between the results from SWME 15A and RBC/UKQCD 16, i.e.,

$$\begin{aligned} N_{ f}= & {} 2\,+\,1: \nonumber \\ B_2= & {} 0.502(14),\quad B_3=0.766(32),\quad B_4=0.926(19),\nonumber \\&B_5=0.720(38), \quad \,\mathrm {Refs.}~ \text{[58,60] }. \end{aligned}$$
(157)

For \(N_f=2\,+\,1\,+\,1\) and \(N_f=2\), our estimates coincide with the ones by ETM 15 and ETM 12D, respectively, since there is only one computation for each case. Thus we quote

$$\begin{aligned} N_{ f}= & {} 2\,+\,1\,+\,1{:} \nonumber \\ B_2= & {} 0.46(1)(3),\quad B_3=0.79(2)(4),\quad B_4=0.78(2)(4),\nonumber \\&\quad B_5=0.49(3)(3), \quad \,\mathrm {Ref.}~ \text{[55] }, \end{aligned}$$
(158)
$$\begin{aligned} N_{ f}= & {} 2{:} \nonumber \\ B_2= & {} 0.47(2)(1),\quad B_3=0.78(4)(2),\quad B_4=0.76(2)(2),\nonumber \\&\quad B_5=0.58(2)(2), \quad \,\mathrm {Ref.}~\hbox {[59]}. \end{aligned}$$
(159)

Based on the above discussion on the effects of employing different intermediate momentum subtraction schemes in the nonperturbative renormalization of the operators, the discrepancy for \(B_4\) and \(B_5\) results between \(N_f=2, 2\,+\,1\,+\,1\) and \(N_f=2\,+\,1\) computations should not be considered an effect associated with the number of dynamical flavours. As a closing remark, we encourage authors to provide the correlation matrix of the \(B_i\) parameters.

7 D-meson decay constants and form factors

Authors: Y. Aoki, D. Bečirević, M. Della Morte, S. Gottlieb, D. Lin, E. Lunghi, C. Pena

Leptonic and semileptonic decays of charmed D and \(D_s\) mesons occur via charged W-boson exchange, and are sensitive probes of \(c \rightarrow d\) and \(c \rightarrow s\) quark flavour-changing transitions. Given experimental measurements of the branching fractions combined with sufficiently precise theoretical calculations of the hadronic matrix elements, they enable the determination of the CKM matrix elements \(|V_{cd}|\) and \(|V_{cs}|\) (within the Standard Model) and a precise test of the unitarity of the second row of the CKM matrix. Here we summarize the status of lattice-QCD calculations of the charmed leptonic decay constants. Significant progress has been made in charm physics on the lattice in recent years, largely due to the availability of gauge configurations produced using highly-improved lattice-fermion actions that enable treating the c quark with the same action as for the u, d, and s quarks.

This section updates the corresponding one in the last FLAG review [3] for results that appeared after November 30, 2015. As already done in Ref. [3], we limit our review to results based on modern simulations with reasonably light pion masses (below approximately 500 MeV). This excludes results obtained from the earliest unquenched simulations, which typically had two flavours in the sea, and which were limited to heavier pion masses because of the constraints imposed by the computational resources and methods available at that time.

Following our review of lattice-QCD calculations of \(D_{(s)}\)-meson leptonic decay constants and semileptonic form factors, we then interpret our results within the context of the Standard Model. We combine our best-determined values of the hadronic matrix elements with the most recent experimentally-measured branching fractions to obtain \(|V_{cd(s)}|\) and test the unitarity of the second row of the CKM matrix.

7.1 Leptonic decay constants \(f_D\) and \(f_{D_s}\)

In the Standard Model, and up to electromagnetic corrections, the decay constant \(f_{D_{(s)}}\) of a pseudoscalar D or \(D_s\) meson is related to the branching ratio for leptonic decays mediated by a W boson through the formula

$$\begin{aligned}&{{\mathcal {B}}}(D_{(s)} \rightarrow \ell \nu _\ell )\nonumber \\&\quad = {{G_F^2|V_{cq}|^2 \tau _{D_{(s)}}}\over {8 \pi }} f_{D_{(s)}}^2 m_\ell ^2 m_{D_{(s)}} \left( 1-{{m_\ell ^2}\over {m_{D_{(s)}}^2}}\right) ^2\;, \end{aligned}$$
(160)

where q is d or s and \(V_{cd}\) (\(V_{cs}\)) is the appropriate CKM matrix element for a D (\(D_s\)) meson. The branching fractions have been experimentally measured by CLEO, Belle, Babar and BES with a precision around 4–5\(\%\) for both the D and the \(D_s\)-meson decay modes [133]. When combined with lattice results for the decay constants, they allow for determinations of \(|V_{cs}|\) and \(|V_{cd}|\).
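To make the use of Eq. (160) concrete, the sketch below evaluates the branching fraction for given inputs and inverts the relation to extract \(|V_{cq}|\). It is an illustration only: electromagnetic corrections are neglected, the function names are hypothetical, and the numerical inputs in the usage lines (meson and lepton masses, lifetime, and the trial value of \(|V_{cd}|\)) are approximate values chosen by us for illustration rather than quantities taken from this review. The decay constant is the \(N_f=2\,+\,1\,+\,1\) value of \(f_D\) quoted in Eq. (168).

```python
import math

G_F = 1.1663787e-5      # Fermi constant in GeV^-2
HBAR = 6.582119569e-25  # reduced Planck constant in GeV s


def leptonic_branching_fraction(f_decay, v_ckm, m_meson, m_lepton, lifetime_s):
    """Eq. (160) with masses and decay constant in GeV and lifetime in seconds."""
    tau = lifetime_s / HBAR  # lifetime converted to GeV^-1
    helicity = (1.0 - m_lepton**2 / m_meson**2) ** 2
    return (
        G_F**2 * v_ckm**2 * tau / (8.0 * math.pi)
        * f_decay**2 * m_lepton**2 * m_meson * helicity
    )


def ckm_from_branching_fraction(branching, f_decay, m_meson, m_lepton, lifetime_s):
    """Invert Eq. (160) for |V_cq|, given a measured branching fraction."""
    unit = leptonic_branching_fraction(f_decay, 1.0, m_meson, m_lepton, lifetime_s)
    return math.sqrt(branching / unit)


# Illustrative, approximate inputs for D+ -> mu nu (assumptions, see lead-in).
F_D = 0.2120          # GeV
M_D = 1.8697          # GeV
M_MU = 0.10566        # GeV
TAU_D = 1.04e-12      # s
V_CD_TRIAL = 0.22     # trial value, not a fit result

branching = leptonic_branching_fraction(F_D, V_CD_TRIAL, M_D, M_MU, TAU_D)
v_cd_back = ckm_from_branching_fraction(branching, F_D, M_D, M_MU, TAU_D)
print(f"B(D+ -> mu nu) ~ {branching:.2e}")
```

Because the rate scales as \(f_{D_{(s)}}^2|V_{cq}|^2\), a relative uncertainty on the branching fraction translates into half that relative uncertainty on the product \(f_{D_{(s)}}|V_{cq}|\).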

In lattice-QCD calculations the decay constants \(f_{D_{(s)}}\) are extracted from Euclidean matrix elements of the axial current

$$\begin{aligned} \langle 0| A^{\mu }_{cq} | D_q(p) \rangle = if_{D_q}\;p_{D_q}^\mu \;, \end{aligned}$$
(161)

with \(q=d,s\) and \( A^{\mu }_{cq} ={\bar{c}}\gamma _\mu \gamma _5 q\). Results for \(N_f=2,\; 2\,+\,1\) and \(2\,+\,1\,+\,1\) dynamical flavours are summarized in Table 30 and Fig. 20. Since the publication of the last FLAG review, a handful of results for \(f_D\) and \(f_{D_s}\) have appeared, as described below. We consider isospin-averaged quantities, although in a few cases results for \(f_{D^+}\) are quoted (see, for example, the FNAL/MILC 11,14A and 17 computations, where the difference between \(f_D\) and \(f_{D^+}\) has been estimated to be around 0.5 MeV).

One new result has appeared for \(N_f=2\). Blossier 18 [66] employs a subset of the gauge-field ensembles that entered the earlier ALPHA collaboration study ALPHA 13B [511], but is independent of it: in Ref. [66] the raw data are analysed with a different strategy, based on matrices of correlation functions and the solution of a Generalized Eigenvalue Problem. The paper describes a determination of the \(D_s\) and \(D_s^*\) decay constants computed on six \(N_f=2\) ensembles of nonperturbatively O(a) improved Wilson fermions at lattice spacings of 0.065 and 0.048 fm. Pion masses range between 440 and 194 MeV and the condition \(Lm_\pi \ge 4\) is always met. Chiral/continuum extrapolations are performed adopting a fit ansatz linear in \(m_\pi ^2\) and \(a^2\). The systematic errors are dominated by the uncertainty on the absolute lattice scale, which is fixed through \(f_K\). Cutoff effects on \(f_{D_s}\), by contrast, appear to be small and are at the 1% level at the coarsest lattice spacing.

Table 30 Decay constants of the D and \(D_{s}\) mesons (in MeV) and their ratio
Fig. 20 Decay constants of the D and \(D_s\) mesons [values in Table 30 and Eqs. (162)–(170)]. As usual, full green squares are used in the averaging procedure, pale green squares have been superseded by later determinations, while pale red squares do not satisfy the criteria. The black squares and grey bands indicate our averages

The \(N_f=2\) averages for \(f_D\) and \({{f_{D_s}}/{f_D}}\) coincide with those in the previous FLAG review and are given by the values in ETM 13B [64], while the estimate for \(f_{D_s}\) is the result of the weighted average of the numbers in ETM 13B [64] and Blossier 18 [66]. They read

$$\begin{aligned} N_{ f}=2:&\;\;f_D = 208(7) \;\mathrm{MeV} \quad \,\mathrm {Ref.}~ \text{[64] }, \end{aligned}$$
(162)
$$\begin{aligned} N_{ f}=2:&\;\;f_{D_s} = 242.5(5.8) \; \mathrm{MeV} \quad \,\mathrm {Refs.}~ \text{[64, } \text{66] }, \end{aligned}$$
(163)
$$\begin{aligned} N_{ f}=2:&\;\;{f_{D_s}\over {f_D}} = 1.20(0.02)\quad \,\mathrm {Ref.}~ \text{[64] }, \end{aligned}$$
(164)

where the error on the average of \(f_{D_s}\) has been rescaled by the factor \(\sqrt{\chi ^2/\text{ dof }}=1.34\) (see Sect. 2).
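The rescaling of the error by \(\sqrt{\chi ^2/\text{dof}}\) can be sketched as follows. This is a minimal illustration assuming uncorrelated inputs and assuming that the error is stretched only when \(\chi ^2/\text{dof}>1\); the function name is hypothetical and the inputs are toy numbers, not the results entering Eq. (163).

```python
import math


def weighted_average_with_rescaling(values, sigmas):
    """Inverse-variance weighted mean with the error stretched by sqrt(chi2/dof).

    Assumes uncorrelated inputs. Returns (mean, error, scale), where `scale`
    is sqrt(chi2/dof) and is applied to the error only if it exceeds 1.
    """
    inverse_variances = [1.0 / s**2 for s in sigmas]
    norm = sum(inverse_variances)
    mean = sum(w * x for w, x in zip(inverse_variances, values)) / norm
    error = 1.0 / math.sqrt(norm)

    dof = len(values) - 1
    chi2 = sum(((x - mean) / s) ** 2 for x, s in zip(values, sigmas))
    scale = math.sqrt(chi2 / dof) if dof > 0 else 1.0
    if scale > 1.0:
        error *= scale
    return mean, error, scale


# Toy inputs: two results that differ by two of their (equal) standard errors.
mean_a, error_a, scale_a = weighted_average_with_rescaling([10.0, 12.0], [1.0, 1.0])
# Toy inputs in good agreement: no rescaling is applied.
mean_b, error_b, scale_b = weighted_average_with_rescaling([10.0, 10.5], [1.0, 1.0])
```

In the first toy case the naive error of the average, \(1/\sqrt{2}\), is inflated by \(\sqrt{2}\), while in the second case the error is left untouched.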

In RBC/UKQCD 17 [63] the RBC/UKQCD collaboration presented its final results for the D- and \(D_s\)-meson decay constants, based on \(N_f=2\,+\,1\) dynamical ensembles generated using Domain Wall Fermions (DWF). Three lattice spacings have been considered, with pion masses ranging between the physical value (reached at the two coarsest lattice spacings) and 430 MeV. Two different Domain Wall discretizations (Möbius and Shamir) have been used for both (light) valence and sea quarks. They correspond to two different choices for the DWF kernel. The Möbius DWF are loosely equivalent to Shamir DWF at twice the extension in the fifth dimension [10]. For the actual implementation by the RBC/UKQCD collaboration, O\((a^2)\) cutoff effects in the two formulations are expected to agree, and results are therefore extrapolated jointly to the continuum limit. For the quenched charm quark Möbius DWF are always used, with a domain-wall height slightly different from the one adopted for light valence quarks. This choice helps to keep cutoff effects under control, according to the study in Ref. [517]. The continuum/physical-mass extrapolations are performed using a Taylor expansion in \(a^2\) and \(m_\pi ^2-m_\pi ^{2 \, phys}\), and the associated systematic error is estimated essentially by applying cuts in the pion mass. This error dominates the uncertainties on the final results.

The updated FLAG estimates then read

(165)
(166)
(167)

where the error on the \(N_{ f}=2\,+\,1\) average of \(f_{D_s}\) has been rescaled by the factor \(\sqrt{\chi ^2/\text{ dof }}=1.1\). The average for \(f_D\) is obtained from the results in HPQCD 12A [61], FNAL/MILC 11 [62] and RBC/UKQCD 17 [63]. For \(f_{D_s}\) the \(\chi \)QCD 14 [22] result also contributes, and the value from HPQCD 10A [65] is used instead of the one in HPQCD 12A [61]. In addition, the statistical errors of the FNAL/MILC and HPQCD results have been treated throughout as 100% correlated, since the two collaborations use overlapping sets of configurations. The same procedure was used in the previous reviews.

For \(N_f=2\,+\,1\,+\,1\) one new determination (FNAL/MILC 17) appeared in [5], which is actually an extension of FNAL/MILC 14A [18] (described in detail in the previous FLAG review). While in FNAL/MILC 14A the finest lattice spacing considered was 0.06 fm, in FNAL/MILC 17 three new ensembles have been employed; two with resolution 0.042 fm and light-quark masses equal to either one fifth of the strange-quark mass (\(m_s/5\)) or to the physical up-down average mass, and one at \(a\approx \) 0.03 fm with light-quark masses equal to \(m_s/5\). In addition, the statistics on the \(a\approx \) 0.06 fm ensemble have been increased. As in FNAL/MILC 14A, the HISQ fermionic regularization and the 1-loop tadpole improved Symanzik gauge action have been used for the generation of configurations, produced by a combination of the RHMC and the RHMD algorithms. The analysis, absolute and relative scale setting, and the chiral/continuum extrapolations are performed in essentially the same way as in FNAL/MILC 14A and the latter rely on the use of heavy-meson rooted all-staggered chiral perturbation theory (HMrAS\(\chi \)PT) [518] at NNLO with the inclusion of \(\hbox {N}^3\)LO mass-dependent analytic terms. A novel aspect is represented by the inclusion of corrections due to the nonequilibration of the topological charge. Such freezing of the topology is particularly severe at the two new fine lattice spacings. Following [114] such corrections are computed in the context of heavy-meson \(\chi \)PT, through an expansion in \(1/\chi _TV\), with \(\chi _T\) being the topological susceptibility in a fully-sampled, large-volume ensemble. The resulting systematic error turns out to be of the same size as other systematic uncertainties such as the scale setting. The final total errors (below 0.5%) however are dominated by statistics and the systematic due to chiral/continuum extrapolations. 
As in FNAL/MILC 14A the results for the decay constants are used in combination with the experimental decay rates for \(D_{(s)}^+ \rightarrow \ell \nu _\ell \) in order to perform a unitarity test of the second row of the CKM matrix. After correcting the experimental decay rates from PDG by the known long- and short-distance electroweak contributions and including a 0.6% uncertainty to account for unknown electromagnetic corrections (as done in FNAL/MILC 14A and discussed in the previous FLAG review), the FNAL/MILC collaboration obtains \(1- |V_{cd}|^2 - |V_{cs}|^2 - |V_{cb}|^2 =-0.049(32)\), which is compatible with CKM unitarity within 1.5 standard deviations.
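The significance quoted for the unitarity test follows directly from the central value and its uncertainty. The sketch below evaluates it and also shows how the residual and its error could be formed from individual CKM elements. It is an illustration under our own assumptions: the function names are hypothetical, the errors on the individual elements are taken to be uncorrelated and propagated linearly, and the CKM values in the usage lines are placeholders.

```python
import math


def unitarity_residual(v_cd, v_cs, v_cb):
    """Second-row unitarity residual 1 - |V_cd|^2 - |V_cs|^2 - |V_cb|^2."""
    return 1.0 - v_cd**2 - v_cs**2 - v_cb**2


def residual_error(elements, errors):
    """Linear propagation for uncorrelated errors: d(|V|^2) = 2 |V| d|V|."""
    return math.sqrt(sum((2.0 * v * dv) ** 2 for v, dv in zip(elements, errors)))


def significance(residual, error):
    """Deviation from zero in units of the standard deviation."""
    return abs(residual) / error


# Value quoted in the text: -0.049(32).
quoted_pull = significance(-0.049, 0.032)
print(f"deviation from unitarity: {quoted_pull:.2f} sigma")

# Placeholder CKM inputs (not fit results) to illustrate the propagation.
toy_elements = (0.2, 1.0, 0.04)
toy_errors = (0.01, 0.02, 0.001)
toy_residual = unitarity_residual(*toy_elements)
toy_error = residual_error(toy_elements, toy_errors)
```

The propagation makes explicit that the error on the residual is dominated by \(|V_{cs}|\), since each contribution is weighted by the size of the element itself.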

The results in FNAL/MILC 17 [5] replace those from FNAL/MILC 14A [18] in our \(N_f=2\,+\,1\,+\,1\) final estimates, which are therefore obtained by performing a weighted average with ETM 14E [34] and read

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:\quad f_D = 212.0(0.7) \;\mathrm{MeV}\quad \,\mathrm {Refs.}~ \text{[5,34] }, \end{aligned}$$
(168)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1: \quad f_{D_s} = 249.9(0.5) \; \mathrm{MeV} \quad \,\mathrm {Refs.}~ \text{[5,34] }, \end{aligned}$$
(169)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1: \quad {f_{D_s}\over {f_D}} = 1.1783(0.0016)\quad \,\mathrm {Refs.}~ \text{[5,34] }, \end{aligned}$$
(170)

where the error on the average of \(f_{D}\) has been rescaled by the factor \(\sqrt{\chi ^2/\text{ dof }}=1.22\).

7.2 Form factors for \(D\rightarrow \pi \ell \nu \) and \(D\rightarrow K \ell \nu \) semileptonic decays

The SM prediction for the differential decay rate of the semileptonic processes \(D\rightarrow \pi \ell \nu \) and \(D\rightarrow K \ell \nu \) can be written as

$$\begin{aligned}&\frac{d\Gamma (D\rightarrow P\ell \nu )}{dq^2} \nonumber \\&\quad = \frac{G_{\mathrm{F}}^2 |V_{cx}|^2}{24 \pi ^3} \,\frac{(q^2-m_\ell ^2)^2\sqrt{E_P^2-m_P^2}}{q^4m_{D}^2} \,\nonumber \\&\qquad \times \left[ \left( 1+\frac{m_\ell ^2}{2q^2}\right) m_{D}^2\left( E_P^2-m_P^2\right) |f_+(q^2)|^2 \right. \nonumber \\&\qquad \left. + \frac{3m_\ell ^2}{8q^2}\left( m_{D}^2-m_P^2\right) ^2|f_0(q^2)|^2\right] \,, \end{aligned}$$
(171)

where \(x = d, s\) is the daughter light quark, \(P= \pi , K\) is the daughter light-pseudoscalar meson, \(q = (p_D - p_P)\) is the momentum of the outgoing lepton pair, and \(E_P\) is the light-pseudoscalar meson energy in the rest frame of the decaying D. The vector and scalar form factors \(f_+(q^2)\) and \(f_0(q^2)\) parameterize the hadronic matrix element of the heavy-to-light quark flavour-changing vector current \(V_\mu = {\overline{x}} \gamma _\mu c\),

$$\begin{aligned} \langle P| V_\mu | D \rangle= & {} f_+(q^2) \left( {p_D}_\mu + {p_P}_\mu - \frac{m_D^2 - m_P^2}{q^2}\,q_\mu \right) \nonumber \\&+ f_0(q^2) \frac{m_D^2 - m_P^2}{q^2}\,q_\mu \,, \end{aligned}$$
(172)

and satisfy the kinematic constraint \(f_+(0) = f_0(0)\). Because the contribution to the decay width from the scalar form factor is proportional to \(m_\ell ^2\), within current precision standards it can be neglected for \(\ell = e, \mu \), and Eq. (171) simplifies to

$$\begin{aligned} \frac{d\Gamma \!\left( D \rightarrow P \ell \nu \right) }{d q^2} = \frac{G_{\mathrm{F}}^2}{24 \pi ^3} |\vec {p}_{P}|^3 {|V_{cx}|^2 |f_+ (q^2)|^2} \,. \end{aligned}$$
(173)
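The reduction of Eq. (171) to Eq. (173) can be verified numerically. The following minimal sketch implements both expressions, using the rest-frame relation \(E_P=(m_D^2+m_P^2-q^2)/(2m_D)\), and compares them for vanishing and for small lepton mass. The function names are hypothetical, and the form-factor values and masses in the usage lines are placeholders chosen only to exercise the formulae.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2


def recoil_energy(q2, m_d, m_p):
    """Energy of the light meson in the D rest frame."""
    return (m_d**2 + m_p**2 - q2) / (2.0 * m_d)


def dgamma_dq2_full(q2, f_plus, f_zero, v_ckm, m_d, m_p, m_l):
    """Eq. (171): differential rate including lepton-mass effects (q2 > 0)."""
    e_p = recoil_energy(q2, m_d, m_p)
    p2 = e_p**2 - m_p**2  # squared three-momentum of the light meson
    prefactor = (
        G_F**2 * v_ckm**2 / (24.0 * math.pi**3)
        * (q2 - m_l**2) ** 2 * math.sqrt(p2) / (q2**2 * m_d**2)
    )
    vector = (1.0 + m_l**2 / (2.0 * q2)) * m_d**2 * p2 * f_plus**2
    scalar = 3.0 * m_l**2 / (8.0 * q2) * (m_d**2 - m_p**2) ** 2 * f_zero**2
    return prefactor * (vector + scalar)


def dgamma_dq2_massless(q2, f_plus, v_ckm, m_d, m_p):
    """Eq. (173): differential rate for negligible lepton mass."""
    e_p = recoil_energy(q2, m_d, m_p)
    p = math.sqrt(e_p**2 - m_p**2)
    return G_F**2 / (24.0 * math.pi**3) * p**3 * v_ckm**2 * f_plus**2


# Placeholder inputs (GeV units); form-factor values are NOT lattice results.
M_D, M_PI, M_E = 1.8697, 0.1396, 0.000511
Q2, F_PLUS, F_ZERO, V_CD = 1.0, 0.9, 0.8, 0.22

simple = dgamma_dq2_massless(Q2, F_PLUS, V_CD, M_D, M_PI)
ratio_massless = dgamma_dq2_full(Q2, F_PLUS, F_ZERO, V_CD, M_D, M_PI, 0.0) / simple
ratio_electron = dgamma_dq2_full(Q2, F_PLUS, F_ZERO, V_CD, M_D, M_PI, M_E) / simple
```

For these placeholder inputs the electron-mass correction to the rate is far below the percent level, in line with the statement that the scalar form factor can be neglected for light leptons.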

In models of new physics, decay rates may also receive contributions from matrix elements of other parity-even currents. In the case of the scalar density, partial vector current conservation allows one to write matrix elements of the latter in terms of \(f_+\) and \(f_0\), while for tensor currents \(T_{\mu \nu }={{\bar{x}}}\sigma _{\mu \nu }c\) a new form factor has to be introduced, viz.,

$$\begin{aligned} \langle P| T_{\mu \nu } | D \rangle = \frac{2}{m_D+m_P}\left[ p_{P\mu }p_{D\nu }-p_{P\nu }p_{D\mu }\right] f_T(q^2)\,.\nonumber \\ \end{aligned}$$
(174)

Recall that, unlike the Noether current \(V_\mu \), the operator \(T_{\mu \nu }\) requires a scale-dependent renormalization.

Lattice-QCD computations of \(f_{+,0}\) allow for comparisons to experiment to ascertain whether the SM provides the correct prediction for the \(q^2\)-dependence of \(d\Gamma (D\rightarrow P\ell \nu )/dq^2\); and, subsequently, to determine the CKM matrix elements \(|V_{cd}|\) and \(|V_{cs}|\) from Eq. (171). The inclusion of \(f_T\) allows for analyses to constrain new physics. Currently, state-of-the-art experimental results by CLEO-c [519] and BESIII [520, 521] provide data for the differential rates in the whole \(q^2\) range available, with a precision of order 2–3% for the total branching fractions in both the electron and muon final channels.

Calculations of the \(D\rightarrow \pi \ell \nu \) and \(D\rightarrow K \ell \nu \) form factors typically use the same light-quark and charm-quark actions as those of the leptonic decay constants \(f_D\) and \(f_{D_s}\). Therefore many of the same issues arise; in particular, considerations about cutoff effects coming from the large charm-quark mass, or the normalization of weak currents, apply. Additional complications arise, however, due to the necessity of covering a sizeable range of values in \(q^2\):

  • Lattice kinematics imposes restrictions on the values of the hadron momenta. Because lattice calculations are performed in a finite spatial volume, the pion or kaon three-momentum can only take discrete values in units of \(2\pi /L\) when periodic boundary conditions are used. For typical box sizes in recent lattice D- and B-meson form-factor calculations, \(L \sim 2.5\)–3 fm; thus the smallest nonzero momentum in most of these analyses lies in the range \(|\vec {p}_P| \sim 400\)–500 MeV. The largest momentum in lattice heavy–light form-factor calculations is typically restricted to \( |\vec {p}_P| \le 4\pi /L\). For \(D \rightarrow \pi \ell \nu \) and \(D \rightarrow K \ell \nu \), \(q^2=0\) corresponds to \(|\vec {p}_\pi | \sim 940\) MeV and \(|\vec {p}_K| \sim 1\) GeV, respectively, and the full recoil-momentum region is within the range of accessible lattice momenta. This has implications for both the accuracy of the study of the \(q^2\)-dependence, and the precision of the computation, since statistical errors and cutoff effects tend to increase at larger meson momenta. As a consequence, many recent studies have incorporated the use of nonperiodic (“twisted”) boundary conditions [522, 523] as a means to circumvent these difficulties and study other values of momentum including, perhaps, that for which \(q^2=0\) [67, 524,525,526,527,528].

  • Final-state pions and kaons can have energies \(\gtrsim 1~\mathrm{GeV}\), given the available kinematical range \(0 \lesssim q^2 \le q_{\mathrm{max}}^2=(m_D-m_P)^2\). This makes the use of (heavy-meson) chiral perturbation theory to extrapolate to physical light-quark masses potentially problematic.

  • Accurate comparisons to experiment, including the determination of CKM parameters, require good control of systematic uncertainties in the parameterization of the \(q^2\)-dependence of form factors. While this issue is far more important for semileptonic B decays, where existing lattice computations cover just a fraction of the kinematic range, the increase in experimental precision requires accurate work in the charm sector as well. The parameterization of semileptonic form factors is discussed in detail in Appendix A.5.
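The kinematic statements in the first item above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are ours, the conversion constant \(\hbar c \approx 197.327\) MeV fm and the meson masses are approximate inputs assumed here rather than taken from this review, and the D meson is assumed to be at rest.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm (assumed input)

def lattice_momentum(n, box_fm):
    """Momentum n * 2*pi/L in MeV for a periodic box of side L (in fm)."""
    return n * 2.0 * math.pi * HBARC_MEV_FM / box_fm

def recoil_at_q2_zero(m_parent, m_daughter):
    """Energy and three-momentum (MeV) of the daughter meson at q^2 = 0.

    In the parent rest frame q^2 = m_D^2 + m_P^2 - 2 m_D E_P, so setting
    q^2 = 0 fixes E_P; the momentum follows from E_P^2 = |p|^2 + m_P^2.
    """
    energy = (m_parent**2 + m_daughter**2) / (2.0 * m_parent)
    momentum = (m_parent**2 - m_daughter**2) / (2.0 * m_parent)
    return energy, momentum

# Approximate masses in MeV (assumed, not quoted in the text).
M_D, M_PI, M_K = 1864.8, 139.6, 493.7

for box in (2.5, 3.0):
    print(f"L = {box} fm: 2*pi/L = {lattice_momentum(1, box):.0f} MeV, "
          f"4*pi/L = {lattice_momentum(2, box):.0f} MeV")

for name, mass in (("pi", M_PI), ("K", M_K)):
    e, p = recoil_at_q2_zero(M_D, mass)
    print(f"D -> {name}: E = {e:.0f} MeV, |p| = {p:.0f} MeV at q^2 = 0")
```

With these inputs the smallest nonzero momentum lies between roughly 410 and 500 MeV for \(L = 3\) and 2.5 fm. At \(q^2=0\) the pion energy and momentum nearly coincide because the pion is light, whereas for the kaon the three-momentum (about 0.87 GeV) is noticeably below the energy (about 1.0 GeV).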

The most advanced \(N_f = 2\) lattice-QCD calculation of the \(D \rightarrow \pi \ell \nu \) and \(D \rightarrow K \ell \nu \) form factors is by the ETM collaboration [524]. This work, for which published results are still at the preliminary stage, uses the twisted-mass Wilson action for both the light and charm quarks, with three lattice spacings down to \(a \approx 0.068\) fm and (charged) pion masses down to \(m_\pi \approx 270\) MeV. The calculation employs the method of Ref. [529] to avoid the need to renormalize the vector current, by introducing double ratios of lattice three-point correlation functions in which the vector current renormalization cancels. Discretization errors in the double ratio are of \({{\mathcal {O}}}((am_c)^2)\), due to the automatic \({{\mathcal {O}}}(a)\) improvement at maximal twist. The vector and scalar form factors \(f_+(q^2)\) and \(f_0(q^2)\) are obtained by taking suitable linear combinations of these double ratios. Extrapolation to physical light-quark masses is performed using SU(2) heavy–light meson \(\chi \)PT. The ETM collaboration simulates with twisted boundary conditions for the valence quarks to access arbitrary momentum values over the full physical \(q^2\) range, and interpolates to \(q^2=0\) using the Bečirević–Kaidalov ansatz [530]. The statistical errors in \(f_+^{D\pi }(0)\) and \(f_+^{DK}(0)\) are 9% and 7%, respectively, and lead to rather large systematic uncertainties in the fits to the light-quark mass and energy dependence (7% and 5%, respectively). Another significant source of uncertainty is from discretization errors (5% and 3%, respectively). On the finest lattice spacing used in this analysis \(am_c \sim 0.17\), so \({\mathcal {O}}((am_c)^2)\) cutoff errors are expected to be about 5%. This can be reduced by including the existing \(N_f = 2\) twisted-mass ensembles with \(a \approx 0.051\) fm discussed in Ref. [48].

The first published \(N_f = 2\,+\,1\) lattice-QCD calculation of the \(D \rightarrow \pi \ell \nu \) and \(D \rightarrow K \ell \nu \) form factors came from the Fermilab Lattice, MILC, and HPQCD collaborations [531] (footnote 40). This work uses asqtad-improved staggered sea quarks and light (uds) valence quarks and the Fermilab action for the charm quarks, with a single lattice spacing of \(a \approx 0.12\) fm and a minimum RMS pion mass of \(\approx 510\) MeV, dictated by the presence of fairly large staggered taste splittings. The vector current is normalized using a mostly nonperturbative approach, such that the perturbative truncation error is expected to be negligible compared to other systematics. Results for the form factors are provided over the full kinematic range, rather than focusing just on \(q^2=0\) as was customary in previous work, and fitted to a Bečirević–Kaidalov ansatz. In fact, the publication of this result predated the precise measurements of the \(D\rightarrow K \ell \nu \) decay width by the FOCUS [532] and Belle experiments [533], and showed good agreement with the experimental determination of the shape of \(f_+^{DK}(q^2)\). Progress on extending this work was reported in [534]; efforts are aimed at reducing both the statistical and systematic errors in \(f_+^{D\pi }(q^2)\) and \(f_+^{DK}(q^2)\) by increasing the number of configurations analyzed, simulating with lighter pions, and adding lattice spacings as fine as \(a \approx 0.045\) fm.

The most precise published calculations of the \(D \rightarrow \pi \ell \nu \) [68] and \(D \rightarrow K \ell \nu \) [69] form factors in \(N_f=2\,+\,1\) QCD are by the HPQCD collaboration. They are also based on \(N_f = 2\,+\,1\) asqtad-improved staggered MILC configurations, but use two lattice spacings \(a \approx 0.09\) and 0.12 fm, and a HISQ action for the valence u, d, s, and c quarks. In these mixed-action calculations, the HISQ valence light-quark masses are tuned so that the ratio \(m_l/m_s\) is approximately the same as for the sea quarks; the minimum RMS sea-pion mass is \(\approx 390\) MeV. Form factors are determined only at \(q^2=0\), by using a Ward identity to relate matrix elements of vector currents to matrix elements of the absolutely normalized quantity \((m_{c} - m_{x} ) \langle P | {\bar{x}}c | D \rangle \), and exploiting the kinematic identity \(f_+(0) = f_0(0)\) to yield \(f_+(q^2=0) = (m_{c} - m_{x} ) \langle P | {\bar{x}}c | D \rangle / (m^2_D - m^2_P)\). A modified z-expansion (cf. Appendix A.5) is employed to simultaneously extrapolate to the physical light-quark masses and continuum and interpolate to \(q^2 = 0\), allowing the coefficients of the series expansion to vary with the light- and charm-quark masses. The form of the light-quark dependence is inspired by \(\chi \)PT, and includes logarithms of the form \(m_\pi ^2 \mathrm{log} (m_\pi ^2)\) as well as polynomials in the valence-, sea-, and charm-quark masses. Polynomials in \(E_{\pi (K)}\) are also included to parameterize momentum-dependent discretization errors. The number of terms is increased until the result for \(f_+(0)\) stabilizes, such that the quoted fit error for \(f_+(0)\) not only contains statistical uncertainties, but also reflects relevant systematics. The largest quoted uncertainties in these calculations are from statistics and charm-quark discretization errors.
Progress towards extending the computation to the full \(q^2\) range has been reported in [525, 526]; however, the information contained in these conference proceedings is not enough to establish an updated value of \(f_+(0)\) with respect to the previous journal publications.

The most recent \(N_f=2\,+\,1\) computation of D semileptonic form factors has been carried out by the JLQCD collaboration, and so far published in conference proceedings only; the most recent update is Ref. [535]. They use their own Möbius domain-wall configurations at three values of the lattice spacing \(a=0.080, 0.055, 0.044~\mathrm{fm}\), with several pion masses ranging from 226 to 501 MeV (though there is so far only one ensemble, with \(m_\pi =284~\mathrm{MeV}\), at the finest lattice spacing). The vector and scalar form factors are computed at four values of the momentum transfer for each ensemble. The computed form factors are observed to depend mildly on both the lattice spacing and the pion mass. The momentum dependence of the form factors is fitted to a BCL z-parameterization with a Blaschke factor that contains the measured value of the \(D_{(s)}^*\) mass in the vector channel, and a trivial Blaschke factor in the scalar channel. The systematics of this latter fit is assessed by a BCL fit with the experimental value of the scalar resonance mass in the Blaschke factor. Continuum and chiral extrapolations are carried out through a linear fit in the squared lattice spacing and the squared pion and \(\eta _c\) masses. A global fit that uses hard-pion HM\(\chi \)PT to model the mass dependence is furthermore used for a comparison of the form factor shapes with experimental data (footnote 41). Since the computation is only published in proceedings so far, it will not enter our \(N_f=2\,+\,1\) average (footnote 42).
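To make the z-parameterization mentioned above concrete, the following sketch evaluates a BCL-type series with a single-pole prefactor for \(D\rightarrow \pi \). It is not the JLQCD analysis: the commonly used definitions of z and of the BCL series are assumed here (the conventions of this review are given in Appendix A.5), the masses are approximate, the choice of \(t_0\) is one common option, and any coefficients \(a_n\) passed to the function are hypothetical placeholders.

```python
import math

# Approximate masses in GeV (assumed inputs, not quoted in the text).
M_D, M_PI, M_DSTAR = 1.8648, 0.1396, 2.0103

T_PLUS = (M_D + M_PI) ** 2   # start of the D-pi production cut
T_ZERO = (M_D + M_PI) * (math.sqrt(M_D) - math.sqrt(M_PI)) ** 2  # one common choice
Q2_MAX = (M_D - M_PI) ** 2   # end point of the semileptonic region

def z_variable(q2, t_plus=T_PLUS, t_zero=T_ZERO):
    """Conformal map of q^2 onto the unit disc; z vanishes at q^2 = t_zero."""
    a = math.sqrt(t_plus - q2)
    b = math.sqrt(t_plus - t_zero)
    return (a - b) / (a + b)

def f_plus_bcl(q2, coeffs, pole_mass=M_DSTAR):
    """BCL-type series for the vector form factor with a single pole.

    f_+(q^2) = 1/(1 - q^2/M^2) * sum_n a_n [z^n - (-1)^(n-N) (n/N) z^N],
    where N = len(coeffs) and the coefficients a_n are supplied by the caller.
    """
    big_n = len(coeffs)
    z = z_variable(q2)
    series = 0.0
    for n, a_n in enumerate(coeffs):
        sign = -1.0 if (n - big_n) % 2 else 1.0
        series += a_n * (z**n - sign * (n / big_n) * z**big_n)
    return series / (1.0 - q2 / pole_mass**2)

# The semileptonic region maps onto a small interval in z.
print(z_variable(0.0), z_variable(Q2_MAX))
```

With this choice of \(t_0\) the whole range \(0 \le q^2 \le q^2_{\mathrm{max}}\) maps to \(|z| < 0.2\), which is why a short series is usually sufficient in the charm sector.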

The first full computation of both the vector and scalar form factors in \(N_f=2\,+\,1\,+\,1\) QCD has been achieved by the ETM collaboration [67]. They have furthermore provided a separate determination of the tensor form factor, relevant for new physics analyses [528]. Both works use the available \(N_f = 2\,+\,1\,+\,1\) twisted-mass Wilson lattices [186], totaling three lattice spacings down to \(a\approx 0.06\) fm, and a minimal pion mass of 220 MeV. Matrix elements are extracted from suitable double ratios of correlation functions that avoid the need of nontrivial current normalizations. The use of twisted boundary conditions allows both for imposing several kinematical conditions, and considering arbitrary frames that include moving initial mesons. After interpolation to the physical strange- and charm-quark masses, the results for form factors are fitted to a modified z-expansion that takes into account both the light-quark mass dependence through hard-pion SU(2) \(\chi \)PT [537], and the lattice-spacing dependence. In the case of the latter, a detailed study of Lorentz-breaking effects due to the breaking of rotational invariance down to the hypercubic subgroup is performed, leading to a nontrivial momentum-dependent parameterization of cutoff effects. The z-parameterization itself includes a single-pole Blaschke factor (save for the scalar channel in \(D\rightarrow K\), where the Blaschke factor is trivial), with pole masses treated as free parameters. The final quoted uncertainty on the form factors is about 5–6% for \(D\rightarrow \pi \), and 4% for \(D\rightarrow K\). The dominant source of uncertainty is quoted as statistical+fitting procedure+input parameters – the latter referring to the values of quark masses, the lattice spacing (i.e., scale setting), and the LO SU(2) LECs.

The FNAL/MILC collaboration has also reported ongoing work on extending their computation to \(N_f=2\,+\,1\,+\,1\), using MILC HISQ ensembles at four values of the lattice spacing down to \(a=0.042~\mathrm{fm}\) and pion masses down to the physical point. The latest updates on this computation, focusing on the form factors at \(q^2=0\), but without explicit values of the latter yet, can be found in Refs. [538, 539]. A similar update of the HPQCD collaboration is ongoing, for which results for the \(D \rightarrow K\) vector and scalar form factors are being determined for the full \(q^2\) range based on MILC \(N_f=2\,+\,1\,+\,1\) ensembles [540]. This supersedes previously reported progress by HPQCD in extending their \(N_f=2\,+\,1\) computation to nonvanishing \(q^2\), see Refs. [525, 526].

Table 31 Summary of computations of charmed-hadrons semileptonic form factors. Note that Meinel 16 addresses only \(\Lambda _c\rightarrow \Lambda \) transitions (hence the absence of quoted values for \(f_+^{D\pi }(0)\) and \(f_+^{DK}(0)\)), while ETM 18 provides a computation of tensor form factors

Table 31 contains our summary of the existing calculations of the \(D \rightarrow \pi \ell \nu \) and \(D \rightarrow K \ell \nu \) semileptonic form factors. Additional tables in Appendix B.5.1 provide further details on the simulation parameters and comparisons of the error estimates. Recall that only calculations without red tags that are published in a refereed journal are included in the FLAG average. We will quote no FLAG estimate for \(N_f=2\), since the results by ETM have only appeared in conference proceedings. For \(N_f=2\,+\,1\), only HPQCD 10B,11 qualify, and they provide our estimate for \(f_+(q^2=0)=f_0(q^2=0)\). For \(N_f=2\,+\,1\,+\,1\), we quote as FLAG estimate the only available result by ETM 17D:

(175)
(176)

In Fig. 21 we display the existing \(N_f =2\), \(N_f = 2\,+\,1\), and \(N_f=2\,+\,1\,+\,1\) results for \(f_+^{D\pi }(0)\) and \(f_+^{DK}(0)\); the grey bands show our estimates of these quantities. Sect. 7.4 discusses the implications of these results for determinations of the CKM matrix elements \(|V_{cd}|\) and \(|V_{cs}|\) and tests of unitarity of the second row of the CKM matrix.

Fig. 21 \(D\rightarrow \pi \ell \nu \) and \(D\rightarrow K\ell \nu \) semileptonic form factors at \(q^2=0\). The HPQCD result for \(f_+^{D\pi }(0)\) is from HPQCD 11, the one for \(f_+^{DK}(0)\) represents HPQCD 10B (see Table 31)

7.3 Form factors for \(\Lambda _c\rightarrow \Lambda \ell \nu \) semileptonic decays

In recent years, Meinel and collaborators have pioneered the computation of form factors for semileptonic heavy-baryon decays (see also Sect. 8.6). In particular, Ref. [541] deals with \(\Lambda _c\rightarrow \Lambda \ell \nu \) transitions. The motivation for this study is twofold: apart from allowing for a new determination of \(|V_{cs}|\) in combination with the recent pioneering experimental measurement of the decay rates in Refs. [542, 543], it allows one to test the techniques previously employed for b baryons in the better-controlled (from the point of view of systematics) charm environment.

The amplitudes of the decays \(\Lambda _c\rightarrow \Lambda \ell \nu \) receive contributions from both the vector and the axial components of the current in the matrix element \(\langle \Lambda |\bar{s}\gamma ^\mu ({\mathbf {1}}-\gamma _5)c|\Lambda _c\rangle \), and can be parameterized in terms of six different form factors – see, e.g., Ref. [544] for a complete description. They split into three form factors \(f_+\), \(f_0\), \(f_\perp \) in the parity-even sector, mediated by the vector component of the current, and another three form factors \(g_+,g_0,g_\perp \) in the parity-odd sector, mediated by the axial component. All of them provide contributions that are parametrically comparable.

The computation in Meinel 16 [541] uses RBC/UKQCD \(N_f=2\,+\,1\) DWF ensembles, and treats the c quarks within the Columbia RHQ approach. Two values of the lattice spacing (\(a\sim 0.11,~0.085~\mathrm{fm}\)) are considered, with the absolute scale set from the \(\Upsilon (2S)\)–\(\Upsilon (1S)\) splitting. In one ensemble the pion mass \(m_\pi =139~\mathrm{MeV}\) is at the physical point, while for the other ensembles the pion masses lie roughly in the 300–350 MeV interval. Results for the form factors are obtained from suitable three-point functions, and fitted to a modified z-expansion ansatz that combines the \(q^2\)-dependence with the chiral and continuum extrapolations. The paper goes on to quote the predictions for the total rates in the e and \(\mu \) channels (where errors are statistical and systematic, respectively)

$$\begin{aligned} \begin{aligned} \frac{\Gamma (\Lambda _c\rightarrow \Lambda e^+\nu _e)}{|V_{cs}|^2}= & {} 0.2007(71)(74)~\mathrm{ps}^{-1}\,, \\ \frac{\Gamma (\Lambda _c\rightarrow \Lambda \mu ^+\nu _\mu )}{|V_{cs}|^2}= & {} 0.1945(69)(72)~\mathrm{ps}^{-1}\,. \end{aligned} \end{aligned}$$
(177)

The combination with the recent experimental determination of the total branching fractions by BESIII in Refs. [542, 543] to extract \(|V_{cs}|\) is discussed in Sect. 7.4 below.

7.4 Determinations of \(|V_{cd}|\) and \(|V_{cs}|\) and test of second-row CKM unitarity

We now interpret the lattice-QCD results for the \(D_{(s)}\) meson decays as determinations of the CKM matrix elements \(|V_{cd}|\) and \(|V_{cs}|\) in the Standard Model.

Table 32 Determinations of \(|V_{cd}|\) and \(|V_{cs}|\) obtained from lattice calculations of D-meson leptonic decay constants and semileptonic form factors. The errors shown are from the lattice calculation and experiment (plus nonlattice theory), respectively, save for ETM 17D/Riggio 17, where the joint fit to lattice and experimental data does not provide a separation of the two sources of error (although the latter is still largely theory-dominated)

For the leptonic decays, we use the latest experimental averages from Rosner, Stone and Van de Water for the Particle Data Group [137]

$$\begin{aligned} f_D |V_{cd}| = 45.91(1.05)~\mathrm{MeV} \,, \quad f_{D_s} |V_{cs}| = 250.9(4.0)~\mathrm{MeV}.\nonumber \\ \end{aligned}$$
(178)

By combining these with the average values of \(f_D\) and \(f_{D_s}\) from the individual \(N_f = 2\), \(N_f = 2\,+\,1\) and \(N_f=2\,+\,1\,+\,1\) lattice-QCD calculations that satisfy the FLAG criteria, we obtain the results for the CKM matrix elements \(|V_{cd}|\) and \(|V_{cs}|\) in Table 32. For our preferred values we use the averaged \(N_f=2\) and \(N_f = 2\,+\,1\) results for \(f_D\) and \(f_{D_s}\) in Eqs. (162–170). We obtain

$$\begin{aligned}&{\mathrm{leptonic~decays}}, N_f=2\,+\,1\,+\,1:\quad |V_{cd}| = 0.2166(7)(50),\nonumber \\&\quad |V_{cs}| = 1.004 (2)(16) , \end{aligned}$$
(179)
$$\begin{aligned}&{\mathrm{leptonic~decays}}, N_f=2\,+\,1: \quad |V_{cd}| = 0.2197(25)(50)\,, \nonumber \\&\quad |V_{cs}| = 1.012 (7)(16) \,, \end{aligned}$$
(180)
$$\begin{aligned}&{\mathrm{leptonic~decays}}, N_f=2: \quad |V_{cd}| = 0.2207(74)(50)\,, \nonumber \\&\quad |V_{cs}| = 1.035 (25)(16) \,, \end{aligned}$$
(181)

where the errors shown are from the lattice calculation and experiment (plus nonlattice theory), respectively. For the \(N_f = 2\,+\,1\) and the \(N_f=2\,+\,1\,+\,1\) determinations, the uncertainties from the lattice-QCD calculations of the decay constants are smaller than the experimental uncertainties in the branching fractions. Although the results for \(|V_{cs}|\) are slightly larger than one, they are consistent with unity within at most 1.5 standard deviations.
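The separation into a lattice and an experimental error follows from simple error propagation on the ratio of the measured product and the decay constant. A minimal sketch is given below; the function name is ours, and the decay-constant inputs \(f_D = 212.0(0.7)\) MeV and \(f_{D_s} = 249.9(0.5)\) MeV are illustrative values assumed here for the \(N_f=2\,+\,1\,+\,1\) case, since the averages of Eqs. (162–170) are not reproduced in this section.

```python
def ckm_from_product(product, product_err, hadronic, hadronic_err):
    """Return (|V|, lattice error, experimental error) for |V| = product / hadronic.

    'product' is the experimentally determined combination (e.g. f_D |V_cd|),
    'hadronic' the lattice input (e.g. f_D). Relative errors are propagated
    linearly and kept separate rather than added in quadrature.
    """
    v = product / hadronic
    lattice_err = v * hadronic_err / hadronic
    experimental_err = v * product_err / product
    return v, lattice_err, experimental_err

# Experimental products from Eq. (178), in MeV.
FD_VCD, FD_VCD_ERR = 45.91, 1.05
FDS_VCS, FDS_VCS_ERR = 250.9, 4.0

# Assumed illustrative decay constants in MeV (not quoted in this section).
F_D, F_D_ERR = 212.0, 0.7
F_DS, F_DS_ERR = 249.9, 0.5

v_cd = ckm_from_product(FD_VCD, FD_VCD_ERR, F_D, F_D_ERR)
v_cs = ckm_from_product(FDS_VCS, FDS_VCS_ERR, F_DS, F_DS_ERR)
print("|V_cd| = %.4f (lat %.4f) (exp %.4f)" % v_cd)
print("|V_cs| = %.3f (lat %.3f) (exp %.3f)" % v_cs)
```

With these assumed inputs the sketch reproduces the pattern of Eq. (179): the experimental error, fixed entirely by Eq. (178), is several times larger than the lattice one.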

The leptonic determinations of these CKM matrix elements have uncertainties that are reaching the few-percent level. However, higher-order electroweak and hadronic-structure dependent corrections to the rate have not been computed for the case of \(D_{(s)}\) mesons, whereas they have been estimated to be around 1–2% for pion and kaon decays [545]. It is therefore important that such theoretical calculations are tackled soon, perhaps directly on the lattice, as proposed in Ref. [206].

For D-meson semileptonic decays, there is no update on the lattice side from the previous version of our review for \(N_f=2\,+\,1\), where the only works entering the FLAG averages are HPQCD 10B/11 [68, 69], which provide values for \(f_+^{DK}(0)\) and \(f_+^{D\pi }(0)\), respectively, cf. Eq. (175). The latter can be combined with the latest experimental averages from the HFLAV collaboration [221]:

$$\begin{aligned} f_+^{D\pi }(0) |V_{cd}|= & {} 0.1426(19) \,, \qquad \nonumber \\ f_+^{DK}(0) |V_{cs}|= & {} 0.7226(34) \,, \end{aligned}$$
(182)

where we have combined the experimental statistical and systematic errors in quadrature, to determine the CKM parameters.

The new \(N_f=2\,+\,1\,+\,1\) result for form factors in ETM 17D [67] has a broader scope, in that a companion paper [546] provides a determination of \(|V_{cd}|\) and \(|V_{cs}|\) from a joint fit to lattice and experimental data. This procedure is a priori preferable to the matching at \(q^2=0\), and we will therefore use the values in Ref. [546] for our CKM averages. It has to be stressed that this entails a measure of bias in the comparison with the above \(N_f=2\,+\,1\) result; to quantify the effect, we also show in Fig. 22 the values of \(|V_{cd}|\) and \(|V_{cs}|\) obtained by using the values for \(f_+(0)\) quoted in [67], cf. Eq. (176), together with Eq. (182).

Table 33 Comparison of determinations of \(|V_{cd}|\) and \(|V_{cs}|\) obtained from lattice methods with nonlattice determinations and the Standard Model prediction assuming CKM unitarity

Finally, Meinel 16 has determined the form factors for \(\Lambda _c\rightarrow \Lambda \ell \nu \) decays for \(N_f=2\,+\,1\), which results in a determination of \(|V_{cs}|\) in combination with the experimental measurement of the branching fractions for the \(e^+\) and \(\mu ^+\) channels in Refs. [542, 543]. In Ref. [541] the value \(|V_{cs}|=0.949(24)(14)(49)\) is quoted, where the first error comes from the lattice computation, the second from the \(\Lambda _c\) lifetime, and the third from the branching fraction of the decay. While the lattice uncertainty is competitive with meson channels, the experimental uncertainty is far larger.

Fig. 22 Comparison of determinations of \(|V_{cd}|\) and \(|V_{cs}|\) obtained from lattice methods with nonlattice determinations and the Standard Model prediction based on CKM unitarity. When two references are listed on a single row, the first corresponds to the lattice input for \(|V_{cd}|\) and the second to that for \(|V_{cs}|\). The results denoted by squares are from leptonic decays, while those denoted by triangles are from semileptonic decays. The points indicated as ETM 17D (\(q^2=0\)) do not contribute to the average, and are shown for comparison purposes (see text)

We thus proceed to quote our estimates from semileptonic decays as

$$\begin{aligned}&\text{ SL } \text{ averages } \text{ for }~N_f=2\,+\,1:\nonumber \\&\quad |V_{cd}| = 0.2141(93)(29) \quad \,\mathrm {Ref.}~ \text{[68] },\nonumber \\&\quad |V_{cs}|(D) = 0.967(25)(5) \quad \,\mathrm {Ref.}~ \text{[69] },\nonumber \\&\quad |V_{cs}|(\Lambda _c) = 0.949(24)(51) \quad \,\mathrm {Ref.}~ \text{[541] }, \end{aligned}$$
(183)
$$\begin{aligned}&\text{ SL } \text{ averages } \text{ for }~N_f=2\,+\,1\,+\,1:\nonumber \\&\quad |V_{cd}| = 0.2341(74) \quad \,\mathrm {Refs.}~ \text{[67,546] },\nonumber \\&\quad |V_{cs}| = 0.970(33) \quad \,\mathrm {Refs.}~ \text{[67,546] }, \end{aligned}$$
(184)

where the errors for \(N_f=2\,+\,1\) are lattice and experimental (plus nonlattice theory), respectively. It has to be stressed that all errors are largely theory-dominated. The above values are compared with individual leptonic determinations in Table 32.
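The second error quoted for \(|V_{cs}|(\Lambda _c)\) in Eq. (183) can be recovered from the three-way breakdown \(0.949(24)(14)(49)\) of Ref. [541] by adding the lifetime and branching-fraction errors in quadrature. The helper below is a minimal sketch of that combination; the function name is ours and the errors are assumed to be uncorrelated.

```python
import math

def combine_in_quadrature(*errors):
    """Quadrature sum of errors that are assumed to be uncorrelated."""
    return math.sqrt(sum(e * e for e in errors))

# Lifetime and branching-fraction errors on |V_cs| from Lambda_c decays.
nonlattice = combine_in_quadrature(0.014, 0.049)
print(f"combined nonlattice error: {nonlattice:.3f}")

# Total error on the N_f = 2+1 semileptonic |V_cd| of Eq. (183).
total_vcd = combine_in_quadrature(0.0093, 0.0029)
print(f"total error on |V_cd|: {total_vcd:.4f}")
```

The first combination rounds to 0.051, the value appearing in Eq. (183).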

In Table 33 we summarize the results for \(|V_{cd}|\) and \(|V_{cs}|\) from leptonic and semileptonic decays, and compare them to determinations from neutrino scattering (for \(|V_{cd}|\) only) and CKM unitarity. These results are also plotted in Fig. 22. For both \(|V_{cd}|\) and \(|V_{cs}|\), the errors in the direct determinations from leptonic and semileptonic decays are approximately one order of magnitude larger than the indirect determination from CKM unitarity. The direct and indirect determinations are still always compatible within at most \(1.2\sigma \), save for the leptonic determinations of \(|V_{cs}|\) – that show a \(\sim 2\sigma \) deviation for all values of \(N_f\) – and \(|V_{cd}|\) using the \(N_f=2\,+\,1\,+\,1\) lattice result, where the difference is \(1.8\sigma \).

In order to provide final estimates, we average all the available results separately for each value of \(N_f\). In all cases, we assume that results that share a significant fraction of the underlying gauge ensembles have statistical errors that are 100% correlated; the same applies to the heavy-quark discretization and scale setting errors in HPQCD calculations of leptonic and semileptonic decays. Finally, we include a 100% correlation in the fraction of the error of \(|V_{cd(s)}|\) leptonic determinations that comes from the experimental input, to avoid an artificial reduction of the experimental uncertainty in the averages. We thus quote

$$\begin{aligned}&{\mathrm{our~average}}, N_f=2\,+\,1\,+\,1:\quad |V_{cd}| = 0.2219(43) \,,\quad \nonumber \\&\quad |V_{cs}| = 1.002(14) \,, \end{aligned}$$
(185)
$$\begin{aligned}&{\mathrm{our~average}}, N_f=2\,+\,1: \quad |V_{cd}| = 0.2182(50) \,,\nonumber \\&\quad |V_{cs}| = 0.999(14) \,, \end{aligned}$$
(186)
$$\begin{aligned}&{\mathrm{our~average}}, N_f=2: \quad |V_{cd}| = 0.2207(89) \,,\nonumber \\&\quad |V_{cs}| = 1.031(30) \,, \end{aligned}$$
(187)

where the errors include both theoretical and experimental uncertainties. These averages also appear in Fig. 22. The mutual consistency between the various lattice results is always good, save for the case of \(|V_{cd}|\) with \(N_f=2\,+\,1\,+\,1\), where a \(\sim 2\sigma \) tension between the leptonic and semileptonic determinations shows up. Currently, the leptonic and semileptonic determinations of \(|V_{cd}|\) are controlled by experimental and lattice uncertainties, respectively. The leptonic error will be reduced by Belle II and BESIII. It would be valuable to have other lattice calculations of the semileptonic form factors.
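The effect of correlations on such an average can be illustrated with a minimum-variance combination of two results. The sketch below is not the averaging procedure of this review: it handles only two inputs, the function name and the way the shared error is passed are our assumptions, and the example treats the leptonic and semileptonic \(N_f=2\,+\,1\,+\,1\) values of \(|V_{cd}|\) as uncorrelated.

```python
import math

def average_two(x1, s1, x2, s2, shared1=0.0, shared2=0.0):
    """Minimum-variance average of two results with total errors s1 and s2.

    shared1 and shared2 are the parts of each error treated as 100%
    correlated, so the covariance is shared1 * shared2. The formula is
    undefined when s1^2 + s2^2 equals twice the covariance.
    """
    cov = shared1 * shared2
    denom = s1**2 + s2**2 - 2.0 * cov
    w1 = (s2**2 - cov) / denom
    w2 = (s1**2 - cov) / denom
    mean = w1 * x1 + w2 * x2
    variance = (s1**2 * s2**2 - cov**2) / denom
    return mean, math.sqrt(variance)

# Leptonic value from Eq. (179) (errors combined) and semileptonic value from Eq. (184).
leptonic_err = math.hypot(0.0007, 0.0050)
mean, err = average_two(0.2166, leptonic_err, 0.2341, 0.0074)
print(f"uncorrelated average: {mean:.4f} +/- {err:.4f}")
```

Without correlations this gives a value near 0.222 with an error near 0.004, close to but not identical with Eq. (185), which relies on the correlation assumptions described above. Declaring part of the errors as shared increases the error of the average, which is the purpose of the conservative 100% correlation.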

Using the lattice determinations of \(|V_{cd}|\) and \(|V_{cs}|\) in Table 33, we can test the unitarity of the second row of the CKM matrix. We obtain

$$\begin{aligned} N_f&=2\,+\,1\,+\,1: \quad |V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 - 1 = 0.05(3) \,, \end{aligned}$$
(188)
$$\begin{aligned} N_f&=2\,+\,1: \quad |V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 - 1 = 0.05(3) \,, \end{aligned}$$
(189)
$$\begin{aligned} N_f&=2: \quad |V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 - 1 = 0.11(6) \,. \end{aligned}$$
(190)

Again, tensions at the 2\(\sigma \) level with CKM unitarity are visible, as also reported in the PDG review [133], where the value 0.063(34) is quoted for the quantity in the equations above. Given the current level of precision, this result does not depend on \(|V_{cb}|\), which is of \({\mathcal {O}}(10^{-2})\).
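The numbers in Eqs. (188)–(190) can be reproduced to the quoted precision from the averages in Eqs. (185)–(187). The following sketch makes two simplifying assumptions that are ours: \(|V_{cb}| = 0.04\) is inserted as a representative value with no error, consistent with the statement that it is of \({\mathcal {O}}(10^{-2})\), and the errors on \(|V_{cd}|\) and \(|V_{cs}|\) are treated as uncorrelated.

```python
import math

V_CB = 0.04  # assumed representative size of |V_cb|; the text only gives its order of magnitude

def second_row_deficit(v_cd, dv_cd, v_cs, dv_cs, v_cb=V_CB):
    """Return |V_cd|^2 + |V_cs|^2 + |V_cb|^2 - 1 and its linearly propagated error."""
    value = v_cd**2 + v_cs**2 + v_cb**2 - 1.0
    error = math.hypot(2.0 * v_cd * dv_cd, 2.0 * v_cs * dv_cs)
    return value, error

# Averages from Eqs. (185)-(187): (|V_cd|, error, |V_cs|, error).
AVERAGES = {
    "2+1+1": (0.2219, 0.0043, 1.002, 0.014),
    "2+1": (0.2182, 0.0050, 0.999, 0.014),
    "2": (0.2207, 0.0089, 1.031, 0.030),
}

for nf, inputs in AVERAGES.items():
    value, error = second_row_deficit(*inputs)
    print(f"N_f = {nf}: {value:.2f} +/- {error:.2f}")
```

The error is dominated by the \(|V_{cs}|\) term, and changing \(|V_{cb}|\) within any reasonable range shifts the result by far less than the quoted uncertainty.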

8 B-meson decay constants, mixing parameters and form factors

Authors: Y. Aoki, D. Bečirević, M. Della Morte, S. Gottlieb, D. Lin, E. Lunghi, C. Pena

The (semi)leptonic decay and mixing processes of \(B_{(s)}\) mesons have been playing a crucial role in flavour physics. In particular, they contain important information for the investigation of the \(b{-}d\) unitarity triangle in the Cabibbo-Kobayashi-Maskawa (CKM) matrix, and can be ideal probes of physics beyond the Standard Model. The charged-current decay channels \(B^{+} \rightarrow l^{+} \nu _{l}\) and \(B^{0} \rightarrow \pi ^{-} l^{+} \nu _{l}\), where \(l^{+}\) is a charged lepton with \(\nu _{l}\) being the corresponding neutrino, are essential in extracting the CKM matrix element \(|V_{ub}|\). Similarly, the B to \(D^{(*)}\) semileptonic transitions can be used to determine \(|V_{cb}|\). The flavour-changing neutral current (FCNC) processes, such as \(B\rightarrow K^{(*)} \ell ^+ \ell ^-\) and \(B_{d(s)} \rightarrow \ell ^+ \ell ^-\), occur only beyond the tree level in weak interactions and are suppressed in the Standard Model. Therefore, these processes can be sensitive to new physics, since heavy particles can contribute to the loop diagrams. They are also suitable channels for the extraction of the CKM matrix elements involving the top quark which can appear in the loop. The decays \(B\rightarrow D^{(*)}\ell \nu \) and \(B\rightarrow K^{(*)} \ell \ell \) can also be used to test lepton flavour universality by comparing results for \(\ell = e\), \(\mu \) and \(\tau \). In particular, anomalies have been seen in the ratios \(R(D^{(*)}) = {{{\mathcal {B}}}} (B\rightarrow D^{(*)}\tau \nu ) /{{{\mathcal {B}}}} (B\rightarrow D^{(*)}\ell \nu )_{\ell =e,\mu }\) and \({{{{R}}}}(K^{(*)}) = {{{\mathcal {B}}}} (B\rightarrow K^{(*)}\mu \mu ) /{{{\mathcal {B}}}} (B\rightarrow K^{(*)}ee)\). In addition, the neutral \(B_{d(s)}\)-meson mixings are FCNC processes and are dominated by the 1-loop “box” diagrams containing the top quark and the W bosons. 
Thus, using the experimentally measured neutral \(B^0_{d(s)}\)-meson oscillation frequencies, \(\Delta M_{d(s)}\), and the theoretical calculations for the relevant hadronic mixing matrix elements, one can obtain \(|V_{td}|\) and \(|V_{ts}|\) in the Standard Model (footnote 43).

Accommodating the light quarks and the b quark simultaneously in lattice-QCD computations is a challenging endeavour. To incorporate the pion and the b hadrons with their physical masses, the simulations have to be performed using the lattice size \({\hat{L}} = L/a \sim {\mathcal {O}}(10^{2})\), where a is the lattice spacing and L is the physical (dimensionful) box size. The most ambitious calculations are now using such volumes; however, many ensembles are smaller. Therefore, in addition to employing Chiral Perturbation Theory for the extrapolations in the light-quark mass, current lattice calculations for quantities involving b hadrons often make use of effective theories that allow one to expand in inverse powers of \(m_{b}\). In this regard, two general approaches are widely adopted. On the one hand, effective field theories such as Heavy-Quark Effective Theory (HQET) and Nonrelativistic QCD (NRQCD) can be directly implemented in numerical computations. On the other hand, a relativistic quark action can be improved à la Symanzik to suppress cutoff errors, and then re-interpreted in a manner that is suitable for heavy-quark physics calculations. This latter strategy is often referred to as the method of the Relativistic Heavy-Quark Action (RHQA). The utilization of such effective theories inevitably introduces systematic uncertainties that are not present in light-quark calculations. These uncertainties can arise from the truncation of the expansion in constructing the effective theories (as in HQET and NRQCD), or from more intricate cutoff effects (as in NRQCD and RHQA). They can also be introduced through more complicated renormalization procedures which often lead to significant systematic effects in matching the lattice operators to their continuum counterparts. For instance, due to the use of different actions for the heavy and the light quarks, it is more difficult to construct absolutely normalized bottom-light currents.

Complementary to the above “effective theory approaches”, another popular method is to simulate the heavy and the light quarks using the same (normally improved) lattice action at several values of the heavy-quark mass \(m_{h}\) with \(a m_{h} < 1\) and \(m_{h} < m_{b}\). This enables one to employ HQET-inspired relations to extrapolate the computed quantities to the physical b mass. When combined with results obtained in the static heavy-quark limit, this approach can be rendered into an interpolation, instead of extrapolation, in \(m_{h}\). The discretization errors are the main source of the systematic effects in this method, and very small lattice spacings are needed to keep such errors under control.

In recent years, it has also been possible to perform lattice simulations at very fine lattice spacings and treat heavy quarks as fully relativistic fermions without resorting to effective field theories. Such simulations are of course very demanding in computing resources.

Because of the challenges described above, the efforts that have been made to obtain reliable, accurate lattice-QCD results for physics of the b quark have been enormous. These efforts include significant theoretical progress in formulating QCD with heavy quarks on the lattice. This aspect is briefly reviewed in Appendix A.1.3.

In this section, we summarize the results of the B-meson leptonic decay constants, the neutral B-mixing parameters, and the semileptonic form factors, from lattice QCD. To focus on calculations with strong phenomenological impact, we limit the review to results based on modern simulations containing dynamical fermions with reasonably light pion masses (below approximately 500 MeV). There has been significant progress for b-quark physics since the previous review. There are also a number of calculations that are still in a preliminary stage. We have made note of some of these in anticipation of later publications, whose results will contribute to future averages.

Following our review of \(B_{(s)}\)-meson leptonic decay constants, the neutral B-meson mixing parameters, and semileptonic form factors, we then interpret our results within the context of the Standard Model. We combine our best-determined values of the hadronic matrix elements with the most recent experimentally-measured branching fractions to obtain \(|V_{ub}|\) and \(|V_{cb}|\), and compare these results to those obtained from inclusive semileptonic B decays.

Recent lattice-QCD averages for \(B^+\)- and \(B_s\)-meson decay constants were also presented by the Particle Data Group (PDG) in Ref. [133]. The PDG three- and four-flavour averages for these quantities differ from those quoted here because the PDG provides the charged-meson decay constant \(f_{B^+}\), while we present the isospin-averaged meson-decay constant \(f_B\).

8.1 Leptonic decay constants \(f_B\) and \(f_{B_s}\)

The B- and \(B_s\)-meson decay constants are crucial inputs for extracting information from leptonic B decays. Charged B mesons can decay to the lepton-neutrino final state through the charged-current weak interaction. On the other hand, neutral \(B_{d(s)}\) mesons can decay to a charged-lepton pair via a flavour-changing neutral current (FCNC) process.

In the Standard Model the decay rate for \(B^+ \rightarrow \ell ^+ \nu _{\ell }\) is described by a formula identical to Eq. (160), with \(D_{(s)}\) replaced by B, and the relevant CKM matrix element \(V_{cq}\) replaced by \(V_{ub}\),

$$\begin{aligned} \Gamma ( B \rightarrow \ell \nu _{\ell } ) = \frac{ m_B}{8 \pi } G_F^2 f_B^2 |V_{ub}|^2 m_{\ell }^2 \left( 1-\frac{ m_{\ell }^2}{m_B^2} \right) ^2 \;. \end{aligned}$$
(191)

The only charged-current B-meson decay that has been observed so far is \(B^{+} \rightarrow \tau ^{+} \nu _{\tau }\), which has been measured by the Belle and Babar collaborations [549, 550]. Both collaborations have reported results with errors around \(20\%\). These measurements can be used to determine \(|V_{ub}|\) when combined with lattice-QCD predictions of the corresponding decay constant.
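As a minimal numerical sketch of how Eq. (191) is used in practice, the following Python snippet evaluates the decay rate and inverts it for \(|V_{ub}|\) given a branching fraction. The function names, the rounded physical constants and the example branching fraction are illustrative assumptions, not inputs used in this review.

```python
import math

# Illustrative inputs (assumptions, rounded; not the values used in this review)
G_F = 1.1663787e-5      # Fermi constant in GeV^-2
HBAR = 6.582119569e-25  # hbar in GeV s
M_B = 5.27934           # B^+ mass in GeV
M_TAU = 1.77686         # tau mass in GeV
TAU_B = 1.638e-12       # B^+ lifetime in s


def leptonic_width(f_B, V_ub, m_lep, m_B=M_B):
    """Eq. (191): Gamma(B -> l nu) in GeV, with f_B in GeV."""
    helicity = m_lep**2 * (1.0 - m_lep**2 / m_B**2) ** 2
    return m_B / (8.0 * math.pi) * G_F**2 * f_B**2 * V_ub**2 * helicity


def branching_fraction(f_B, V_ub, m_lep):
    """Branching fraction = Gamma * tau_B / hbar."""
    return leptonic_width(f_B, V_ub, m_lep) * TAU_B / HBAR


def vub_from_branching_fraction(br, f_B, m_lep):
    """Invert Eq. (191) for |V_ub|, using Gamma proportional to |V_ub|^2."""
    return math.sqrt(br / branching_fraction(f_B, 1.0, m_lep))


if __name__ == "__main__":
    f_B = 0.190      # GeV, close to the N_f = 2+1+1 average of Eq. (202)
    br_exp = 1.0e-4  # hypothetical B -> tau nu branching fraction
    print(vub_from_branching_fraction(br_exp, f_B, M_TAU))
```

The factor \(m_{\ell }^2\) in Eq. (191) suppresses the rate for light leptons, which is why the \(\tau \) channel is the most accessible one.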

The decay of a neutral \(B_{d(s)}\) meson to a charged-lepton pair, \(B_{d(s)} \rightarrow \ell ^{+} \ell ^{-}\), is an FCNC process, and can only occur at one loop in the Standard Model. Hence these processes are expected to be rare, and are sensitive to physics beyond the Standard Model. The corresponding expression for the branching fraction has the form

$$\begin{aligned} B ( B_q \rightarrow \ell ^+ \ell ^-)= & {} \tau _{B_q} \frac{G_F^2}{\pi } \, Y \, \left( \frac{\alpha }{4 \pi \sin ^2 \Theta _W} \right) ^2 \nonumber \\&\times m_{B_q} f_{B_q}^2 |V_{tb}^*V_{tq}|^2 m_{\ell }^2 \sqrt{1- 4 \frac{ m_{\ell }^2}{m_B^2} }\;,\nonumber \\ \end{aligned}$$
(192)

where the light quark \(q=s\) or d, and the function Y includes NLO QCD and electro-weak corrections [465, 551]. Evidence for both \(B_s \rightarrow \mu ^+ \mu ^-\) and \(B_d \rightarrow \mu ^+ \mu ^-\) decays was first reported by the CMS and the LHCb collaborations, and a combined analysis was presented in 2014 in Ref. [547]. In 2017, the LHCb collaboration reported their latest measurements as [548]

$$\begin{aligned} \begin{aligned} B(B_d \rightarrow \mu ^+ \mu ^-)&= \left( 1.5^{+1.2\,+\,0.2}_{-1.0\,-\,0.1}\right) \times 10^{-10} , \\ B(B_s \rightarrow \mu ^+ \mu ^-)&= \left( 3.0\pm 0.6^{+0.3}_{-0.2}\right) \times 10^{-9} , \end{aligned} \end{aligned}$$
(193)

which are compatible with the Standard Model predictions [552].

The decay constants \(f_{B_q}\) (with \(q=u,d,s\)) parameterize the matrix elements of the corresponding axial-vector currents \(A^{\mu }_{bq} = {\bar{b}}\gamma ^{\mu }\gamma ^5q\) analogously to the definition of \(f_{D_q}\) in Sect. 7.1:

$$\begin{aligned} \langle 0| A^{\mu } | B_q(p) \rangle = i p_B^{\mu } f_{B_q} \;. \end{aligned}$$
(194)

For heavy–light mesons, it is convenient to define and analyse the quantity

$$\begin{aligned} \Phi _{B_q} \equiv f_{B_q} \sqrt{m_{B_q}} \;, \end{aligned}$$
(195)

which approaches a constant (up to logarithmic corrections) in the \(m_B \rightarrow \infty \) limit because of heavy-quark symmetry. In the following discussion we denote lattice data for \(\Phi \) (f) obtained at a heavy-quark mass \(m_h\) and light valence-quark mass \(m_{\ell }\) as \(\Phi _{h\ell }\) (\(f_{h\ell }\)), to differentiate them from the corresponding quantities at the physical b- and light-quark masses.

Table 34 Decay constants of the B, \(B^+\), \(B^0\) and \(B_{s}\) mesons (in MeV). Here \(f_B\) stands for the mean value of \(f_{B^+}\) and \(f_{B^0}\), extrapolated (or interpolated) in the mass of the light valence-quark to the physical value of \(m_{ud}\)
Table 35 Ratios of decay constants of the B and \(B_s\) mesons (for details see Table 34)

The SU(3)-breaking ratio \(f_{B_s}/f_B\) is of phenomenological interest because many systematic effects partially cancel in lattice-QCD calculations of this quantity. These include discretization errors, heavy-quark mass tuning effects, and renormalization/matching errors, amongst others. On the other hand, this SU(3)-breaking ratio is still sensitive to the chiral extrapolation. Provided that the chiral extrapolation is under control, one can then adopt \(f_{B_s}/f_B\) as input in extracting phenomenologically-interesting quantities. In addition, it is often easier to obtain lattice results for \(f_{B_{s}}\) with smaller errors. Therefore, one can combine the \(B_{s}\)-meson decay constant with the SU(3)-breaking ratio to calculate \(f_{B}\). Such a strategy can lead to better precision in the computation of the B-meson decay constant, and has been adopted by the ETM [26, 64] and the HPQCD collaborations [73].
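The arithmetic behind this strategy can be sketched as follows. The snippet divides \(f_{B_s}\) by the SU(3)-breaking ratio and adds relative errors in quadrature. Treating the two inputs as uncorrelated is a simplifying assumption made here for illustration, as is the use of the \(N_f=2\,+\,1\,+\,1\) averages of Eqs. (203) and (204) as inputs; the collaborations apply the idea to their own data with their own error budgets.

```python
import math


def divide_with_errors(num, num_err, den, den_err):
    """Return (num/den, error), adding relative errors in quadrature.

    Assumes the two inputs are uncorrelated, which is a simplification.
    """
    value = num / den
    rel = math.hypot(num_err / num, den_err / den)
    return value, value * rel


if __name__ == "__main__":
    # N_f = 2+1+1 averages quoted in Eqs. (203) and (204), f_Bs in MeV
    f_Bs, f_Bs_err = 230.3, 1.3
    ratio, ratio_err = 1.209, 0.005
    f_B, f_B_err = divide_with_errors(f_Bs, f_Bs_err, ratio, ratio_err)
    print(f"f_B = {f_B:.1f} +/- {f_B_err:.1f} MeV")
```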

It is clear that the decay constants for charged and neutral B mesons play different roles in flavour-physics phenomenology. As already mentioned above, the knowledge of the \(B^{+}\)-meson decay constant \(f_{B^{+}}\) is essential for extracting \(|V_{ub}|\) from leptonic \(B^{+}\) decays. The neutral B-meson decay constants \(f_{B^{0}}\) and \(f_{B_{s}}\) are inputs for the search for new physics in rare leptonic \(B^{0}\) decays. In view of this, it is desirable to include isospin-breaking effects in lattice computations for these quantities, and have results for \(f_{B^{+}}\) and \(f_{B^{0}}\). With the increasing precision of recent lattice calculations, isospin splittings for B-meson decay constants are significant, and will play an important role in the foreseeable future. A few collaborations reported \(f_{B^{+}}\) and \(f_{B^{0}}\) separately by taking into account strong isospin effects in the valence sector, and estimated the corrections from electromagnetism. To properly use these results for extracting phenomenologically relevant information, one would have to take into account QED effects in the B-meson leptonic decay rates (see footnote 44). Currently, errors on the experimental measurements of these decay rates are still very large. In this review, we therefore concentrate on the isospin-averaged result \(f_{B}\) and the \(B_{s}\)-meson decay constant, as well as the SU(3)-breaking ratio \(f_{B_{s}}/f_{B}\). For the world average for lattice determinations of \(f_{B^{+}}\) and \(f_{B_{s}}/f_{B^{+}}\), we refer the reader to the latest work from the Particle Data Group (PDG) [133]. Notice that the \(N_{f} = 2\,+\,1\) lattice results used in Ref. [133] and the current review are identical. We will discuss this in further detail at the end of this section.

The status of lattice-QCD computations for B-meson decay constants and the SU(3)-breaking ratio, using gauge-field ensembles with light dynamical fermions, is summarized in Tables 34 and 35, while Figs. 23 and 24 contain the graphical presentation of the collected results and our averages. Many results in these tables and plots were already reviewed in detail in the previous FLAG report. Below we will describe the new results that appeared after January 2016.

Fig. 23

Decay constants of the B and \(B_s\) mesons. The values are taken from Table 34 (the \(f_B\) entry for FNAL/MILC 11 represents \(f_{B^+}\)). The significance of the colours is explained in Sect. 2. The black squares and grey bands indicate our averages in Eqs. (196), (199), (202), (197), (200) and (203)

Fig. 24

Ratio of the decay constants of the B and \(B_s\) mesons. The values are taken from Table 35. Results labelled as FNAL/MILC 17 1 and FNAL/MILC 17 2 correspond to those for \(f_{B_{s}}/f_{B^{0}}\) and \(f_{B_{s}}/f_{B^{+}}\) reported in FNAL/MILC 17. The significance of the colours is explained in Sect. 2. The black squares and grey bands indicate our averages in Eqs. (198), (201) and (204)

No new \(N_{f}=2\) or \(N_{f}=2\,+\,1\) projects for computing \(f_{B}\), \(f_{B_{s}}\) and \(f_{B_{s}}/f_{B}\) have been completed since the publication of the previous FLAG review [3]. Therefore, our averages for these cases stay the same as those in Ref. [3],

$$\begin{aligned} N_{ f}&=2:\quad f_{B} = 188(7) \;\mathrm{MeV}\quad \,\mathrm {Refs.}~ \text{[64,76] }, \end{aligned}$$
(196)
$$\begin{aligned} N_{ f}&=2: \quad f_{B_{s}} = 227(7)\; \mathrm{MeV} \quad \,\mathrm {Refs.}~ \text{[64,76] }, \end{aligned}$$
(197)
$$\begin{aligned} N_{ f}&=2: \quad {f_{B_{s}}\over {f_B}} = 1.206(0.023)\quad \,\mathrm {Refs.}~ \text{[64,76] }, \end{aligned}$$
(198)
$$\begin{aligned} N_{ f}&=2\,+\,1:\quad f_{B} = 192.0(4.3) \;\mathrm{MeV} \quad \,\mathrm {Refs.}~{ [62, 72{-}75]}, \end{aligned}$$
(199)
$$\begin{aligned} N_{ f}&=2\,+\,1: \quad f_{B_{s}} = 228.4(3.7)\; \mathrm{MeV} \quad \,\mathrm {Refs.}~{ [62, 72{-}75]}, \end{aligned}$$
(200)
$$\begin{aligned} N_{ f}&=2\,+\,1: \quad {f_{B_{s}}\over {f_B}} = 1.201(0.016)\quad \,\mathrm {Refs.}~{ [62,73{-}75]} . \end{aligned}$$
(201)

There have been results for \(f_{B_{(s)}}\) and \(f_{B_{s}}/f_{B}\) from three collaborations, ETMC, HPQCD and FNAL/MILC since the last FLAG report. In Tables 34 and 35, these results are labelled ETM 16B [26], HPQCD 17A [71] and FNAL/MILC 17 [5].

In ETM 16B [26], simulations at three values of the lattice spacing, \(a=0.0885\), 0.0815 and 0.0619 fm, are performed with twisted-mass Wilson fermions and the Iwasaki gauge action. The three lattice spacings correspond to the bare couplings \(\beta =1.90\), 1.95 and 2.10. The pion masses in this work range from 210 to 450 MeV, and the lattice sizes are between 1.97 and 2.98 fm. An essential feature in ETM 16B [26] is the use of the ratio method [560]. In the application of this approach to the B-decay constants, one first computes the quantity \({{\mathcal {F}}}_{hq} \equiv f_{hq}/M_{hq}\), where \(f_{hq}\) and \(M_{hq}\) are the decay constant and mass of the pseudoscalar meson composed of a valence (relativistic) heavy quark h and a light (or strange) quark q. The matching between the lattice and the continuum heavy–light currents for extracting the above \(f_{hq}\) is straightforward because the valence heavy quark is also described by twisted-mass fermions. In the second step, the ratio \(z_{q} ({\bar{\mu }}^{(h)}, \lambda ) \equiv [{{\mathcal {F}}}_{hq}C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h^{\prime })})(\mu _{{\mathrm {pole}}}^{(h)})^{3/2}]/[{{\mathcal {F}}}_{h^{\prime }q}C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h)})(\mu _{{\mathrm {pole}}}^{(h^{\prime })})^{3/2}]\) is calculated, where \(\mu _{{\mathrm {pole}}}^{(h)}\) is the pole mass of the heavy quark h with \({\bar{\mu }}^{(h)}\) being the corresponding renormalized mass in a scheme (chosen to be the \({\overline{\mathrm{MS}}}\) scheme in ETM 16B [26]), \(C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h)})\) is the matching coefficient for the (hq)-meson decay constant in QCD and its counterpart in HQET, and \({\bar{\mu }}^{(h)} = \lambda {\bar{\mu }}^{(h^{\prime })}\) with \(\lambda \) being larger than, but close to, one. The authors of ETM 16B [26] use the NNLO perturbative result of \(C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h)})\) in their work.
Notice that in practice one never has to determine the heavy-quark pole mass in this strategy, since it can be matched to the \({\overline{\mathrm{MS}}}\) mass, and the matching coefficient is known to NNNLO [169, 563, 564]. By starting from a “triggering” point with the heavy-quark mass around that of the charm, one can proceed with the calculations in steps, such that \({\bar{\mu }}^{(h)}\) is increased by a factor of \(\lambda \) at each step. In ETM 16B [26], the authors went up to a heavy-quark mass of around 3.3 GeV, and observed that all systematics were under control. In this approach, it is also crucial to employ the information that \(z_{q} ({\bar{\mu }}^{(h)}, \lambda ) \rightarrow 1\) in the limit \({\bar{\mu }}^{(h)} \rightarrow \infty \). Designing the computations in such a way that in the last step, \({\bar{\mu }}^{(h)}\) is equal to the pre-determined bottom-quark mass in the same renormalization scheme, one obtains \(f_{B_{(s)}}/M_{B_{(s)}}\). Employing experimental results for \(M_{B_{(s)}}\), the decay constants can be extracted. In ETM 16B [26], this strategy was implemented to compute \(f_{B_{s}}\). It was also applied to a double ratio to determine \((f_{B_{s}}/f_{B})/(f_{K}/f_{\pi })\), and hence \(f_{B_{s}}/f_{B}\), using the value of \(f_{K}/f_{\pi }\) from Ref. [34]. This double ratio has the advantage that it contains small coefficients for chiral logarithms. The B-meson decay constant is then computed with \(f_{B} = f_{B_{s}}/(f_{B_{s}}/f_{B})\). The authors estimated various kinds of systematic errors in their work. These include discretization errors, the effects of perturbative matching between QCD and HQET, those of chiral extrapolation, and errors associated with the value of \(f_{K}/f_{\pi }\).
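Solving the definition of \(z_{q}\) for \({{\mathcal {F}}}_{hq}\) makes the step-by-step structure of the ratio method explicit. The relation below is a sketch that follows only from the definition given above; the labels \(h_{0}\) (triggering point), \(h_{K}\) (last step) and the number of steps K are notation introduced here for illustration.

$$\begin{aligned} {{\mathcal {F}}}_{h_{K}q} = {{\mathcal {F}}}_{h_{0}q} \, \frac{C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h_{K})})}{C_{A}^{{\mathrm {stat}}}({\bar{\mu }}^{(h_{0})})} \left( \frac{\mu _{{\mathrm {pole}}}^{(h_{0})}}{\mu _{{\mathrm {pole}}}^{(h_{K})}}\right) ^{3/2} \prod _{k=1}^{K} z_{q}({\bar{\mu }}^{(h_{k})}, \lambda ) , \qquad {\bar{\mu }}^{(h_{k})} = \lambda ^{k}\, {\bar{\mu }}^{(h_{0})} . \end{aligned}$$

The matching coefficients and pole masses of the intermediate steps cancel in the product, so only their values at the first and last step remain. With \({\bar{\mu }}^{(h_{K})}\) set to the bottom-quark mass, the left-hand side is \(f_{B_{(s)}}/M_{B_{(s)}}\).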

The authors of HPQCD 17A [71] reanalysed the data in Ref. [70] (HPQCD 13 in Tables 34 and 35) employing a different method for computing B-decay constants with NRQCD heavy quarks and HISQ light quarks on the lattice. The NRQCD action used in this work is improved to \(O(\alpha _{s} v_{b}^{4})\), where \(v_{b}\) is the velocity of the b quark. In Ref. [70], the determination of the decay constants is carried out through studying matrix elements of axial currents. On the other hand, the same decay constants can be obtained by investigating \((m_{b} + m_{l}) \langle 0 | P | B \rangle \), where \(m_{b}\) (\(m_{l}\)) is the b- (light-)quark mass and P stands for the pseudoscalar current. The matching of this pseudoscalar current between QCD and NRQCD is performed at the precision of \({{\mathcal {O}}}(\alpha _{s})\), \({{\mathcal {O}}}(\alpha _{s}\Lambda _{{\mathrm {QCD}}}/m_{b})\) and \({{\mathcal {O}}}(\alpha _{s} a \Lambda _{{\mathrm {QCD}}})\), using lattice perturbation theory. This requires the inclusion of three operators in the NRQCD-HISQ theory. The gauge configurations used in this computation were part of those generated by the MILC collaboration, with details given in Ref. [117]. They are the ensembles obtained at three values of bare gauge coupling (\(\beta = 6.3, 6.0\) and 5.8), corresponding to lattice spacings, determined using the \(\Upsilon (2S-1S)\) splitting, between 0.09 and 0.15 fm. Pion masses are between 128 and 315 MeV. For each lattice spacing, the MILC collaboration performed simulations at several lattice volumes, such that \(M_{\pi } L \) lies between 3.3 and 4.5 for all the data points. This ensures that the finite-size effects are under control [565]. On each ensemble, the bare NRQCD quark mass is tuned to the b-quark mass using the spin average of the \(\Upsilon \) and \(\eta _{b}\) masses. In this work, a combined chiral-continuum extrapolation is performed, with the strategy of using Bayesian priors as explained in Ref. [566].
Systematic effects estimated in HPQCD 17A [71] include those from lattice-spacing dependence, the chiral fits, the \(B{-}B^{*}{-}\pi \) axial coupling, the operator matching, and the relativistic corrections to the NRQCD formalism. Although these errors are estimated in the same fashion as in Ref. [70] (HPQCD 13), most of them involve the handling of fits to the actual data. This means that most of the systematic effects from HPQCD 13 [70] and HPQCD 17A [71] are not correlated, although the two calculations are performed on exactly the same gauge field ensembles. The only exception is the error in the relativistic corrections. For this, the authors simply take \((\Lambda _{{\mathrm {QCD}}}/m_{b})^{2} \approx 1\%\) as the estimate in both computations. Therefore, we will correlate this part of the systematic effects in our average (see footnote 45).

The third new calculation for the B-meson decay constants since the last FLAG report was performed by the FNAL/MILC collaboration (FNAL/MILC 17 [5] in Tables 34 and 35). In this work, the simulations are performed for six lattice spacings, ranging from 0.03 to 0.15 fm. For the two finest lattices (\(a = 0.042\) and 0.03 fm), it is found that the topological charge is not well equilibrated. The effects of this nonequilibration are estimated using results of chiral perturbation theory in Ref. [114]. Another feature of the simulations is that both RHMC and RHMD algorithms are used. The authors investigated the effects of omitting the Metropolis test in the RHMD simulations by examining changes of the plaquette values, and found that they do not result in any variation with statistical significance. The light-quark masses used in this computation correspond to pion masses between 130 and 314 MeV. The values of the valence heavy-quark mass \(m_{h}\) range from about 0.9 to 5 times the charm-quark mass. Notice that on the two coarsest lattices, the authors only implement calculations at \(m_{h} \sim m_{c}\) in order to avoid uncontrolled discretization errors, while \(m_{h}\) is chosen to be as high as \(\sim 5 m_{c}\) only on the two finest lattices. For setting the scale and the quark masses, the approach described in Ref. [18] is employed, with the special feature of using the decay constant of the “fictitious” meson that is composed of degenerate quarks with mass \(m_{p4s} = 0.4 m_{s}\). The overall scale is determined by comparing \(f_{\pi }\) to its PDG average as reported in Ref. [133]. In the analysis procedure of extrapolating/interpolating to the physical quark masses and the continuum limit, the key point is the use of heavy-meson rooted all-staggered chiral perturbation theory (HMrA\(\chi \)PT) [518].
In order to account for lattice artifacts and the effects of the heavy-quark mass in this chiral expansion, appropriate polynomial terms in a and \(1/m_{h}\) are included in the fit formulae. The full NLO terms in the chiral effective theory are incorporated in the analysis, while only the analytic contributions from the NNLO are considered. Furthermore, data obtained at the coarsest lattice spacing, \(a \approx 0.15\) fm, are discarded for the central-value fits, and are subsequently used only for the estimation of systematic errors. In this analysis strategy, there are 60 free parameters to be determined by about 500 data points. Systematic errors estimated in FNAL/MILC 17 include excited-state contamination, choices of fit models with different sizes of the priors, scale setting, quark-mass tuning, finite-size correction, electromagnetic (EM) contribution, and topological nonequilibration. The dominant effects are from the first two in this list. For the EM effects, the authors also include an error associated with choosing a specific scheme to estimate their contributions.

In our current work, the averages for \(f_{B_{s}}\), \(f_{B^{0}}\) and \(f_{B_{s}}/f_{B^{0}}\) with \(N_{f}=2\,+\,1\,+\,1\) lattice simulations are updated, because of the three published papers (FNAL/MILC 17 [5], HPQCD 17A [71] and ETM 16B [26] in Tables 34 and 35) that appeared after the release of the last FLAG review [3]. In the updated FLAG averages, we include results from these three new computations, as well as those in HPQCD 13 [70]. Since the decay constants presented in HPQCD 13 [70], HPQCD 17A [71] and FNAL/MILC 17 have been extracted with a significant overlap of gauge-field configurations, we correlate statistical errors from these works. Furthermore, as explained above, the systematic effects arising from relativistic corrections in HPQCD 13 [70] and HPQCD 17A [71] are correlated. Notice that the authors of FNAL/MILC 17 [5] computed \(f_{B_{s}}/f_{B^{+}}\) and \(f_{B_{s}}/f_{B^{0}}\) without performing an isospin average to obtain \(f_{B_{s}}/f_{B}\). This is the reason why in Fig. 24 we show two results, FNAL/MILC 17 1 (\(f_{B_{s}}/f_{B^{0}}\)) and FNAL/MILC 17 2 (\(f_{B_{s}}/f_{B^{+}}\)), from this reference. To determine the global average for \(f_{B_{s}}/f_{B}\), we first perform the average of \(f_{B_{s}}/f_{B^{+}}\) and \(f_{B_{s}}/f_{B^{0}}\) in FNAL/MILC 17 [5] by following the procedure in Sect. 2, with all errors correlated. This gives us the estimate of \(f_{B_{s}}/f_{B}\) from this work by the FNAL/MILC collaboration.
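The effect of correlating errors in such an average can be illustrated with a short sketch. It assumes inverse-variance weights and a single common correlation coefficient between all inputs; the function name and the numerical inputs are hypothetical, and the procedure of Sect. 2 treats statistical and systematic errors in more detail than is shown here.

```python
import math


def correlated_average(values, errors, rho):
    """Inverse-variance weighted mean with a common correlation rho.

    The covariance is C_ij = rho * s_i * s_j for i != j and s_i**2 on
    the diagonal; rho = 1 corresponds to fully correlated errors.
    """
    inv_var = [1.0 / e**2 for e in errors]
    norm = sum(inv_var)
    weights = [w / norm for w in inv_var]
    mean = sum(w * v for w, v in zip(weights, values))
    var = 0.0
    for i, (wi, ei) in enumerate(zip(weights, errors)):
        for j, (wj, ej) in enumerate(zip(weights, errors)):
            corr = 1.0 if i == j else rho
            var += wi * wj * corr * ei * ej
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the effect of correlations
    values, errors = [1.20, 1.22], [0.01, 0.02]
    print(correlated_average(values, errors, rho=1.0))  # fully correlated
    print(correlated_average(values, errors, rho=0.0))  # uncorrelated
```

With fully correlated inputs the error of the average is the weighted mean of the individual errors, so it does not shrink the way it would for independent results.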

Following the above strategy, and the procedure explained in Sect. 2, we compute the average of \(N_f = 2\,+\,1\,+\,1\) results for \(f_{B_{(s)}}\) and \(f_{B_{s}}/f_{B}\),

$$\begin{aligned} N_{ f}&=2\,+\,1\,+\,1:\quad f_{B} = 190.0(1.3) \;\mathrm{MeV} \quad \,\mathrm {Refs.}~ \text{[5,26,70,71] }, \end{aligned}$$
(202)
$$\begin{aligned} N_{ f}&=2\,+\,1\,+\,1:\quad f_{B_{s}} = 230.3(1.3) \; \mathrm{MeV}\quad \,\mathrm {Refs.}~ \text{[5,26,70,71] }, \end{aligned}$$
(203)
$$\begin{aligned} N_{ f}&=2\,+\,1\,+\,1: \quad {f_{B_{s}}\over {f_B}} = 1.209(0.005)\quad \,\mathrm {Refs.}~ \text{[5,26,70,71] }. \end{aligned}$$
(204)

The PDG presented their averages for the \(N_{f}=2\,+\,1\) and \(N_{f}=2\,+\,1\,+\,1\) lattice-QCD determinations of \(f_{B^{+}}\), \(f_{B_{s}}\) and \(f_{B_{s}}/f_{B^{+}}\) in 2015 [133]. The \(N_{f}=2\,+\,1\) lattice-computation results used in Ref. [133] are identical to those included in our current work. If we regard our isospin-averaged \(f_{B}\) as the representative for \(f_{B^{+}}\), the current FLAG and PDG estimates for these quantities are compatible, although the errors of the \(N_{f}=2\,+\,1\,+\,1\) results in this report are significantly smaller. In the PDG work, the isospin-averaged \(f_{B}\), as reported by various lattice collaborations, was “corrected” using the \(N_{f}=2\,+\,1\,+\,1\) strong isospin-breaking effect computed in HPQCD 13 [70] (see Table 34 in this section). However, since only unitary points (with equal sea- and valence-quark masses) were considered in HPQCD 13 [70], this procedure only correctly accounts for the effect from the valence-quark masses, while introducing a spurious sea-quark contribution. We notice that \(f_{B^{+}}\) and \(f_{B^{0}}\) are also separately reported in FNAL/MILC 17 [5] by taking into account the strong-isospin effect, and it is found that these two decay constants are well compatible. Notice that the new FNAL/MILC results were obtained by properly keeping the averaged light sea-quark mass fixed when varying the quark masses in their analysis procedure. Their finding indicates that the strong isospin-breaking effects could be smaller than what was suggested by previous computations.

8.2 Neutral B-meson mixing matrix elements

Neutral B-meson mixing is induced in the Standard Model through 1-loop box diagrams to lowest order in the electroweak theory, similar to those for short-distance effects in neutral kaon mixing. The effective Hamiltonian is given by

$$\begin{aligned} {{{\mathcal {H}}}}_{\mathrm{eff}}^{\Delta B = 2, \mathrm{SM}} = \frac{G_F^2 M_{\mathrm{W}}^2}{16\pi ^2} \left( {{{\mathcal {F}}}}^0_d {{{\mathcal {Q}}}}^d_1 + {{{\mathcal {F}}}}^0_s {{{\mathcal {Q}}}}^s_1\right) \,\, + \,\, \mathrm{h.c.}, \end{aligned}$$
(205)

with

$$\begin{aligned} {{{\mathcal {Q}}}}^q_1 = \left[ {\bar{b}}\gamma _\mu (1-\gamma _5)q\right] \left[ {\bar{b}}\gamma _\mu (1-\gamma _5)q\right] , \end{aligned}$$
(206)

where \(q=d\) or s. The short-distance function \({{{\mathcal {F}}}}^0_q\) in Eq. (205) is much simpler compared to the kaon mixing case due to the hierarchy in the CKM matrix elements. Here, only one term is relevant,

$$\begin{aligned} {{{\mathcal {F}}}}^0_q = \lambda _{tq}^2 S_0(x_t) \end{aligned}$$
(207)

where

$$\begin{aligned} \lambda _{tq} = V^*_{tq}V_{tb}, \end{aligned}$$
(208)

and where \(S_0(x_t)\) is an Inami-Lim function with \(x_t=m_t^2/M_W^2\), which describes the basic electroweak loop contributions without QCD [465]. The transition amplitude for \(B_q^0\) with \(q=d\) or s can be written as

$$\begin{aligned}&\langle {{\bar{B}}}^0_q \vert {{{\mathcal {H}}}}_{\mathrm{eff}}^{\Delta B = 2} \vert B^0_q\rangle \,\, \nonumber \\&\quad = \,\, \frac{G_F^2 M_{\mathrm{W}}^2}{16 \pi ^2} \Big [ \lambda _{tq}^2 S_0(x_t) \eta _{2B} \Big ] \left( \frac{{\bar{g}}(\mu )^2}{4\pi }\right) ^{-\gamma _0/(2\beta _0)} \nonumber \\&\qquad \times \exp \bigg \{ \int _0^{{\bar{g}}(\mu )} \, dg \, \bigg ( \frac{\gamma (g)}{\beta (g)} \, + \, \frac{\gamma _0}{\beta _0g} \bigg ) \bigg \} \nonumber \\&\qquad \times \langle {{\bar{B}}}^0_q \vert Q^q_{\mathrm{R}} (\mu ) \vert B^0_q \rangle \,\, + \,\, \mathrm{h.c.} , \end{aligned}$$
(209)

where \(Q^q_{\mathrm{R}} (\mu )\) is the renormalized four-fermion operator (usually in the NDR scheme of \({\overline{\mathrm {MS}}}\)). The running coupling \({\bar{g}}\), the \(\beta \)-function \(\beta (g)\), and the anomalous dimension of the four-quark operator \(\gamma (g)\) are defined in Eqs. (139) and (140). The product of \(\mu \)-dependent terms on the second line of Eq. (209) is, of course, \(\mu \)-independent (up to truncation errors arising from the use of perturbation theory). The explicit expression for the short-distance QCD correction factor \(\eta _{2B}\) (calculated to NLO) can be found in Ref. [458].

For historical reasons the B-meson mixing matrix elements are often parameterized in terms of bag parameters defined as

$$\begin{aligned} B_{B_q}(\mu )= \frac{{\left\langle {\bar{B}}^0_q\left| Q^q_\mathrm{R}(\mu )\right| B^0_q\right\rangle } }{ {\frac{8}{3}f_{B_q}^2 m_{\mathrm{B}}^2}} \,\, . \end{aligned}$$
(210)

The RGI B parameter \({\hat{B}}\) is defined as in the case of the kaon, and expressed to 2-loop order as

$$\begin{aligned} {\hat{B}}_{B_q}= & {} \left( \frac{{\bar{g}}(\mu )^2}{4\pi }\right) ^{- \gamma _0/(2\beta _0)} \nonumber \\&\times \left\{ 1+\dfrac{{\bar{g}}(\mu )^2}{(4\pi )^2}\left[ \frac{\beta _1\gamma _0-\beta _0\gamma _1}{2\beta _0^2} \right] \right\} \, B_{B_q}(\mu ), \end{aligned}$$
(211)

with \(\beta _0\), \(\beta _1\), \(\gamma _0\), and \(\gamma _1\) defined in Eq. (141). Note, as Eq. (209) is evaluated above the bottom threshold (\(m_b<\mu <m_t\)), the active number of flavours here is \(N_f=5\).
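The size of the conversion in Eq. (211) can be sketched numerically. The snippet below assumes the standard \({\overline{\mathrm{MS}}}\) coefficients \(\beta _0 = 11 - 2N_f/3\), \(\beta _1 = 102 - 38N_f/3\), \(\gamma _0 = 4\) and \(\gamma _1 = -7 + 4N_f/9\); since Eq. (141) is not reproduced in this section, this normalization is an assumption. The function name and the input value of \(\alpha _s\) are likewise illustrative.

```python
import math


def rgi_conversion_factor(alpha_s, n_f=5):
    """Two-loop factor B_hat / B(mu) from Eq. (211).

    alpha_s is g_bar(mu)^2 / (4 pi) at the scale mu.  The coefficients
    below are the standard MS-bar ones and are an assumption about the
    normalization of Eq. (141), which is not reproduced in this section.
    """
    beta0 = 11.0 - 2.0 * n_f / 3.0
    beta1 = 102.0 - 38.0 * n_f / 3.0
    gamma0 = 4.0
    gamma1 = -7.0 + 4.0 * n_f / 9.0
    # Coefficient of g_bar^2 / (4 pi)^2 = alpha_s / (4 pi) in Eq. (211)
    j = (beta1 * gamma0 - beta0 * gamma1) / (2.0 * beta0**2)
    leading = alpha_s ** (-gamma0 / (2.0 * beta0))
    return leading * (1.0 + alpha_s / (4.0 * math.pi) * j)


if __name__ == "__main__":
    # Hypothetical value of alpha_s at a scale near the b-quark mass
    print(rgi_conversion_factor(0.22))
```

For \(N_f=5\) the leading exponent is \(-\gamma _0/(2\beta _0) = -6/23\), and the factor comes out at roughly 1.5 for couplings of this size.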

Nonzero transition amplitudes result in a mass difference between the CP eigenstates of the neutral B-meson system. Writing the mass difference for a \(B_q^0\) meson as \(\Delta m_q\), its Standard Model prediction is

$$\begin{aligned} \Delta m_q = \frac{G^2_Fm^2_W m_{B_q}}{6\pi ^2} \, |\lambda _{tq}|^2 S_0(x_t) \eta _{2B} f_{B_q}^2 {\hat{B}}_{B_q}. \end{aligned}$$
(212)
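A minimal numerical sketch of Eq. (212) is given below. The explicit form of the Inami-Lim function is the standard textbook expression, and the inputs (top-quark mass, \(\eta _{2B}\), \(|\lambda _{ts}|\) and the rounded constants) are illustrative assumptions rather than values adopted in this review; only \(f_{B_s}\sqrt{{\hat{B}}_{B_s}}\) is taken from the \(N_f=2\) result quoted in Eq. (216).

```python
import math

G_F = 1.1663787e-5      # Fermi constant in GeV^-2
HBAR = 6.582119569e-25  # hbar in GeV s
M_W = 80.379            # W mass in GeV


def inami_lim_s0(x):
    """Inami-Lim function S_0(x) for the box diagram with two top quarks."""
    rational = (4.0 * x - 11.0 * x**2 + x**3) / (4.0 * (1.0 - x) ** 2)
    logarithm = 3.0 * x**3 * math.log(x) / (2.0 * (1.0 - x) ** 3)
    return rational - logarithm


def delta_m(m_Bq, lambda_tq, f_sqrtB, m_top, eta_2B):
    """Eq. (212): mass difference in ps^-1; masses and f*sqrt(B_hat) in GeV."""
    x_t = (m_top / M_W) ** 2
    energy = (G_F**2 * M_W**2 * m_Bq / (6.0 * math.pi**2)
              * abs(lambda_tq) ** 2 * inami_lim_s0(x_t) * eta_2B * f_sqrtB**2)
    return energy / HBAR * 1.0e-12  # GeV -> s^-1 -> ps^-1


if __name__ == "__main__":
    # Assumed inputs: m_Bs, |V_ts* V_tb|, MS-bar top mass, eta_2B
    print(delta_m(m_Bq=5.36688, lambda_tq=0.040, f_sqrtB=0.262,
                  m_top=163.0, eta_2B=0.551))
```

In practice the relation is used in the opposite direction: the measured \(\Delta m_q\) and the lattice value of \(f_{B_q}^2{\hat{B}}_{B_q}\) determine \(|\lambda _{tq}|\).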

Experimentally the mass difference is measured as the oscillation frequency of the neutral B-meson system. The frequencies are measured precisely, with an error of less than a percent. Many different experiments have measured \(\Delta m_d\), but the current average [170] is based on measurements from the B-factory experiments Belle and Babar, and from the LHC experiment LHCb. For \(\Delta m_s\) the experimental average is dominated by results from LHCb [170]. With these experimental results and lattice-QCD calculations of \(f_{B_q}^2{\hat{B}}_{B_q}\) at hand, \(\lambda _{tq}\) can be determined. In lattice-QCD calculations the flavour SU(3)-breaking ratio

$$\begin{aligned} \xi ^2 = \frac{f_{B_s}^2B_{B_s}}{f_{B_d}^2B_{B_d}} \end{aligned}$$
(213)

can be obtained more precisely than the individual \(B_q\)-mixing matrix elements because statistical and systematic errors cancel in part. With this the ratio \(|V_{td}/V_{ts}|\) can be determined, which can be used to constrain the apex of the CKM triangle.
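The step from \(\xi \) to \(|V_{td}/V_{ts}|\) follows from taking the ratio of Eq. (212) for \(q=d\) and \(q=s\). In this ratio the common factors \(S_0(x_t)\), \(\eta _{2B}\) and \(V_{tb}\) drop out, and the conversion factor of Eq. (211) cancels between \({\hat{B}}_{B_s}\) and \({\hat{B}}_{B_d}\):

$$\begin{aligned} \frac{\Delta m_d}{\Delta m_s} = \frac{m_{B_d}}{m_{B_s}} \, \frac{|\lambda _{td}|^2}{|\lambda _{ts}|^2} \, \frac{f_{B_d}^2 {\hat{B}}_{B_d}}{f_{B_s}^2 {\hat{B}}_{B_s}} \quad \Longrightarrow \quad \left| \frac{V_{td}}{V_{ts}}\right| = \xi \, \sqrt{\frac{\Delta m_d}{\Delta m_s}\, \frac{m_{B_s}}{m_{B_d}}} \;. \end{aligned}$$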

Neutral B-meson mixing, being loop-induced in the Standard Model, is also a sensitive probe of new physics. The most general \(\Delta B=2\) effective Hamiltonian that describes contributions to B-meson mixing in the Standard Model and beyond is given in terms of five local four-fermion operators:

$$\begin{aligned} {{{\mathcal {H}}}}_{\mathrm{eff, BSM}}^{\Delta B = 2} = \sum _{q=d,s}\sum _{i=1}^5 {{{\mathcal {C}}}}_i {{{\mathcal {Q}}}}^q_i \;, \end{aligned}$$
(214)

where \({{{\mathcal {Q}}}}_1\) is defined in Eq. (206) and where

$$\begin{aligned} \begin{aligned} {{{\mathcal {Q}}}}^q_2&= \left[ {\bar{b}}(1-\gamma _5)q\right] \left[ {\bar{b}}(1-\gamma _5)q\right] , \qquad \\ {{{\mathcal {Q}}}}^q_3&= \left[ {\bar{b}}^{\alpha }(1-\gamma _5)q^{\beta }\right] \left[ {\bar{b}}^{\beta }(1-\gamma _5)q^{\alpha }\right] ,\\ {{{\mathcal {Q}}}}^q_4&= \left[ {\bar{b}}(1-\gamma _5)q\right] \left[ {\bar{b}}(1+\gamma _5)q\right] , \qquad \\ {{{\mathcal {Q}}}}^q_5&= \left[ {\bar{b}}^{\alpha }(1-\gamma _5)q^{\beta }\right] \left[ {\bar{b}}^{\beta }(1+\gamma _5)q^{\alpha }\right] , \end{aligned} \end{aligned}$$
(215)

with the superscripts \(\alpha ,\beta \) denoting colour indices, which are shown only when they are contracted across the two bilinears. There are three other basis operators in the \(\Delta B=2\) effective Hamiltonian. When evaluated in QCD, however, they give identical matrix elements to the ones already listed due to parity invariance in QCD. The short-distance Wilson coefficients \({{{\mathcal {C}}}}_i\) depend on the underlying theory and can be calculated perturbatively. In the Standard Model only matrix elements of \({{{\mathcal {Q}}}}^q_1\) contribute to \(\Delta m_q\), while all operators do, for example, for general SUSY extensions of the Standard Model [501]. The matrix elements or bag parameters for the non-SM operators are also useful to estimate the width difference in the Standard Model, where combinations of matrix elements of \({{{\mathcal {Q}}}}^q_1\), \({{{\mathcal {Q}}}}^q_2\), and \({{{\mathcal {Q}}}}^q_3\) contribute to \(\Delta \Gamma _q\) at \({\mathcal {O}}(1/m_b)\) [567, 568].

Table 36 Neutral B- and \(B_{\mathrm{s}}\)-meson mixing matrix elements (in MeV) and bag parameters
Table 37 Results for SU(3)-breaking ratios of neutral \(B_{d}\)- and \(B_{s}\)-meson mixing matrix elements and bag parameters

In this section, we report on results from lattice-QCD calculations for the neutral B-meson mixing parameters \({\hat{B}}_{B_d}\), \({\hat{B}}_{B_s}\), \(f_{B_d}\sqrt{{\hat{B}}_{B_d}}\), \(f_{B_s}\sqrt{{\hat{B}}_{B_s}}\) and the SU(3)-breaking ratios \(B_{B_s}/B_{B_d}\) and \(\xi \) defined in Eqs. (210), (211), and (213). The results are summarized in Tables 36 and 37 and in Figs. 25 and 26. Additional details about the underlying simulations and systematic error estimates are given in Appendix B.6.2. Some collaborations do not provide the RGI quantities \({\hat{B}}_{B_q}\), but quote instead \(B_B(\mu )^{{\overline{\mathrm {MS}}},\mathrm {NDR}}\). In such cases we convert the results to the RGI quantities quoted in Table 36 using Eq. (211). More details on the conversion factors are provided below in the descriptions of the individual results. We do not provide the B-meson matrix elements of the other operators \({{{\mathcal {Q}}}}_{2-5}\) in this report. They have been calculated in Ref. [64] for the \(N_f=2\) case and in Refs. [78, 569] for \(N_f=2\,+\,1\).

Fig. 25

Neutral B- and \(B_{\mathrm{s}}\)-meson mixing matrix elements and bag parameters [values in Table 36 and Eqs. (216), (219), (217), (220)]

Fig. 26

The SU(3)-breaking quantities \(\xi \) and \(B_{B_s}/B_{B_d}\) [values in Table 37 and Eqs. (218), (221)]

There are no new results for \(N_f=2\) reported after the previous FLAG review. In this category one work (ETM 13B) [64] passes the quality criteria. A description of this work can be found in the FLAG 13 review [2] where it did not enter the average as it had not appeared in a journal. Because this is the only result available for \(N_f=2\), we quote their values as our averages in this version:

$$\begin{aligned}&f_{B_d}\sqrt{\hat{B}_{B_d}}&= 216(10)\;\; \mathrm{MeV}&f_{B_s}\sqrt{\hat{B}_{B_s}}&= 262(10)\;\; \mathrm{MeV}&\,\mathrm {Ref.}~\text{[64] }, \end{aligned}$$
(216)
$$\begin{aligned} N_f=2:&\hat{B}_{B_d}&= 1.30(6)&\hat{B}_{B_s}&= 1.32(5)&\,\mathrm {Ref.}~\text{[64] }, \end{aligned}$$
(217)
$$\begin{aligned}&\xi&= 1.225(31)&B_{B_s}/B_{B_d}&= 1.007(21)&\,\mathrm {Ref.}~\text{[64] }. \end{aligned}$$
(218)
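As a quick cross-check of Eqs. (216) and (218), one can form the ratio of the two mixing matrix elements by hand. The sketch below assumes that \(\xi \) is the ratio \(f_{B_s}\sqrt{B_{B_s}}/(f_{B_d}\sqrt{B_{B_d}})\) and that the RGI conversion factor cancels in this ratio. The helper `naive_ratio` is an illustrative name, and it deliberately treats the two errors as uncorrelated, which is not what a lattice analysis does.

```python
import math

# Values quoted in Eqs. (216) and (218) for N_f = 2 (Ref. [64]): (central, error).
F_SQRT_B_D = (216.0, 10.0)  # f_Bd * sqrt(B_hat_Bd) in MeV
F_SQRT_B_S = (262.0, 10.0)  # f_Bs * sqrt(B_hat_Bs) in MeV
XI_QUOTED = (1.225, 0.031)  # xi as quoted directly in Eq. (218)


def naive_ratio(numerator, denominator):
    """Ratio of two (value, error) pairs, ASSUMING uncorrelated errors."""
    value = numerator[0] / denominator[0]
    relative_error = math.hypot(numerator[1] / numerator[0],
                                denominator[1] / denominator[0])
    return value, value * relative_error


xi_naive = naive_ratio(F_SQRT_B_S, F_SQRT_B_D)
print(f"naive ratio : {xi_naive[0]:.3f} +/- {xi_naive[1]:.3f}")
print(f"quoted xi   : {XI_QUOTED[0]:.3f} +/- {XI_QUOTED[1]:.3f}")
```

The central values agree within the quoted error, while the naive error is considerably larger than the quoted one. This is the expected pattern when statistical and systematic errors partially cancel in a ratio computed directly on the same ensembles.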

For the \(N_f=2\,+\,1\) case the FNAL/MILC collaboration reported new results on the neutral B-meson mixing parameters in 2016. As the paper [78] appeared after the closing date of FLAG 16 [3], the results were not included in our averages at that time. They were, however, included in the subsequent web update of FLAG, which was made public in November 2017.

Their estimates of the \(B^0{-}\overline{B^0}\) mixing matrix elements are much improved compared with their older ones and with all prior \(N_f=2\,+\,1\) results. Hence, including the new FNAL/MILC results makes our averages much more precise. The study uses the asqtad action for light quarks and the Fermilab action for the b quark. They use MILC asqtad ensembles spanning four lattice spacings in the range \(a\approx 0.045{-}0.12\) fm, with a lightest RMS pion mass of 257 MeV. The lightest Goldstone pion of 177 MeV, at which the RMS mass is 280 MeV, helps constrain the combined chiral and continuum limit analysis, which uses HMrS\(\chi \)PT (heavy-meson rooted-staggered chiral perturbation theory) to NLO with NNLO analytic terms using a Bayesian prior. The extension to finer lattice spacings and closer-to-physical pion masses, together with the quadrupled statistics of the ensembles compared with those used in the earlier studies, as well as the inclusion of the wrong-spin contribution [573], which is a staggered fermion artifact, make it possible to achieve the large improvement of the overall precision. Although for each parameter only one lattice volume is available, the finite-volume effects are well controlled by using a large enough lattice for all the ensembles. The operator renormalization is done by 1-loop lattice perturbation theory with the help of the mostly nonperturbative renormalization method, in which a perturbative computation of the ratio of the four-quark operator and the square of the vector-current renormalization factors is combined with the nonperturbative estimate of the latter. Let us note that the report [78] includes not only the SM \(B^0{-}\overline{B^0}\) mixing matrix element, but also those of all possible four-quark operators. 
The correlations among the different matrix elements are given, which helps to properly assess the error propagation to phenomenological analyses where combinations of the different matrix elements enter. The authors estimate the effect of omitting the charm-quark dynamics, which we have not propagated to our \(N_f=2\,+\,1\) averages. It should also be noted that their main new results are for the \(B^0{-}\overline{B^0}\) mixing matrix elements, namely \(f_{B_d}\sqrt{B_{B_d}}\), \(f_{B_s}\sqrt{B_{B_s}}\) and the ratio \(\xi \). They also report on \(B_{B_d}\), \(B_{B_s}\) and \(B_{B_s}/B_{B_d}\). However, the B-meson decay constants needed to isolate the bag parameters from the four-fermion matrix elements are taken from the PDG [133] averages, which are obtained using a procedure similar to that used by FLAG. They plan to compute the decay constants on the same gauge field ensembles and then complete the bag parameter calculation on their own in the future. As of now, for the bag parameters we need to use the nested averaging scheme, described in Sect. 2.3.2, to take into account the possible correlations of this new result with the other ones through the averaged decay constants. The detailed procedure for applying the scheme to this particular case is provided in Sect. 8.2.1.

The other results for \(N_f=2\,+\,1\) are RBC/UKQCD 14A [74], which had been included in the averages at FLAG 16 [3], and HPQCD 09 [77], for which a description is available in FLAG 13 [2]. Our averages for \(N_f=2\,+\,1\) are now:

$$\begin{aligned}&f_{B_d}\sqrt{\hat{B}_{B_d}}&= 225(9)\, \mathrm{MeV}&f_{B_s}\sqrt{\hat{B}_{B_s}}&= 274(8)\, \mathrm{MeV}&\,\mathrm {Refs.}~\text{[74,77,78] }, \end{aligned}$$
(219)
$$\begin{aligned}&N_f=2+1:&\hat{B}_{B_d}&= 1.30(10)&\hat{B}_{B_s}&= 1.35(6)&\,\mathrm {Refs.}~\text{[74,77,78] }, \end{aligned}$$
(220)
$$\begin{aligned}&\xi&= 1.206(17)&B_{B_s}/B_{B_d}&= 1.032(38)&\,\mathrm {Refs.}~\text{[74,78] }. \end{aligned}$$
(221)

Here all the above equations have been updated from the paper version of FLAG 16. The new results from FNAL/MILC 16 [78] entered the average for Eqs. (219), (220), and replaced the earlier FNAL/MILC 12 [572] for Eq. (221).

As discussed in detail in the FLAG 13 review [2] HPQCD 09 does not include wrong-spin contributions [573], which are staggered fermion artifacts, to the chiral extrapolation analysis. It is possible that the effect is significant for \(\xi \) and \(B_{B_s}/B_{B_d}\), since the chiral extrapolation error is a dominant one for these flavour SU(3)-breaking ratios. Indeed, a test done by FNAL/MILC 12 [572] indicates that the omission of the wrong spin contribution in the chiral analysis may be a significant source of error. We therefore took the conservative choice to exclude \(\xi \) and \(B_{B_s}/B_{B_d}\) by HPQCD 09 from our average and we follow the same strategy in this report as well.

Table 38 Correlated elements of error composition in the summation over \((\alpha )\) for \(\sigma [Z]_{i';j'\leftrightarrow k}\) [Eq. (18)] for \(Z=f_B^2, f_{B_s}^2, f_{B_s}^2/f_B^2\). The \(i'=j'\) elements express \(\sigma [Z]_{i'\leftrightarrow k}\) [Eq. (16)]. The elements not listed here are all null

We note that the above results within the same \(N_f\) are all correlated with each other, due to the use of the same gauge field ensembles for different quantities. The results are also correlated with the averages obtained in Sect. 8.1 and shown in Eqs. (196)–(198) for \(N_f=2\) and Eqs. (199)–(201) for \(N_f=2\,+\,1\), because the calculations of B-meson decay constants and mixing quantities are performed on the same (or on similar) sets of ensembles, and results obtained by a given collaboration use the same actions and setups. These correlations must be considered when using our averages as inputs to unitarity triangle (UT) fits. For example, if one were to estimate \(f_{B_s}\sqrt{{\hat{B}}_s}\) from the separate averages of \(f_{B_s}\) and \({\hat{B}}_s\), one would obtain a value about one standard deviation below the one quoted above. While these two estimates lead to compatible results, giving us confidence that all uncertainties have been properly addressed, we do not recommend combining averages this way, as many correlations would have to be taken into account to properly assess the errors. We recommend instead using the numbers quoted above. In the future, as more independent calculations enter the averages, correlations between the lattice-QCD inputs to UT fits will become less significant.
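To illustrate why the combination of separate averages is sensitive to correlations, the following minimal sketch propagates errors linearly for the product \(f\sqrt{B}\) with a correlation coefficient `rho` between \(f\) and \(B\). The bag parameter input is the \(N_f=2\,+\,1\) value of Eq. (220); the decay constant `230.0 +/- 4.0` MeV is a placeholder chosen for illustration and is not a FLAG average. The function name and the scan over `rho` are likewise assumptions of this sketch.

```python
import math


def f_sqrt_b(f, df, b, db, rho=0.0):
    """Return (value, error) of f*sqrt(B) by linear error propagation.

    rho is the assumed correlation coefficient between f and B.
    Relative variance: (df/f)^2 + (db/(2b))^2 + rho*(df/f)*(db/b).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    value = f * math.sqrt(b)
    rel_f, rel_b = df / f, db / b
    rel_var = rel_f**2 + 0.25 * rel_b**2 + rho * rel_f * rel_b
    return value, value * math.sqrt(max(rel_var, 0.0))


B_HAT_BS = (1.35, 0.06)      # Eq. (220)
F_BS_PLACEHOLDER = (230.0, 4.0)  # MeV, hypothetical input for illustration only

for rho in (-0.5, 0.0, 0.5, 1.0):
    value, error = f_sqrt_b(*F_BS_PLACEHOLDER, *B_HAT_BS, rho=rho)
    print(f"rho = {rho:+.1f}: {value:.1f} +/- {error:.1f} MeV")
```

The central value does not depend on `rho`, but the error does, which is one reason the directly averaged matrix elements quoted above are preferable to a recombination of separate averages.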

8.2.1 Error treatment for B-meson bag parameters

The latest FNAL/MILC computation (FNAL/MILC 16) uses the B-meson decay constants averaged by the PDG [133] to isolate the bag parameters from the mixing matrix elements. The bag parameters so obtained are correlated with those from the other computations in two ways: through the mixing matrix elements of FNAL/MILC 16 and through the PDG average. Since the PDG average is obtained in a similar way to the FLAG average, estimating the bag-parameter average with FNAL/MILC 16 requires a nested scheme. The nested scheme discussed in Sect. 2.3.2 is applied as follows.

Three computations contribute to the \(N_f=2\,+\,1\) average of the \(B_d\) meson bag parameter \(B_{B_d}\): FNAL/MILC 16 [78], RBC/UKQCD 14A [74], and HPQCD 09 [77]. FNAL/MILC 16 uses \(f_{B^0}\) from the PDG [133], which is an average of RBC/UKQCD 14, RBC/UKQCD 14A, HPQCD 12/11A, and FNAL/MILC 11 in Table 34.Footnote 46 \(B_{B_d}\) (RBC/UKQCD 14A) is correlated with that of FNAL/MILC 16 through \(f_B\) (RBC/UKQCD 14A). Some correlation also exists through \(f_B\) (RBC/UKQCD 14), which uses the same set of gauge field configurations as \(B_{B_d}\) (RBC/UKQCD 14A).

Table 39 Correlated elements of error composition in the summation over \((\alpha )\) for \(\sigma _{i;j}\) of \(B_{B_d}\), \(B_{B_s}\), \(B_{B_s}/B_{B_d}\). The \(i=\) [FNAL/MILC 16] row expresses the correlations in the first term in the square root in Eq. (14). The \(j=\) [FNAL/MILC 16] column represents the correlations for Eq. (19). For \(B_{B_s}/B_{B_d}\) only upper \(2\times 2\) block is relevant

In Eq. (9) for this particular case, \(Q_1\) is \(B_{B_d}\) (FNAL/MILC 16), \(Y_1\) is \(f_{B^0}^2 B_{B_d}\) (FNAL/MILC 16), and \({\overline{Z}}\) is the PDG average of \(f_{B^0}^2\). The most nontrivial part of the nested averaging is the construction of the restricted errors \(\sigma [f_B^2]_{i'\leftrightarrow k}\) [Eq. (16)] and \(\sigma [f_B^2]_{i';j'\leftrightarrow k}\) [Eq. (18)], which go into the final correlation matrix \(C_{ij}\) of \(B_{B_d}\) through \(\sigma _{1;k}\) [Eq. (14)]. In this analysis, the restricted summation over \((\alpha )\), which labels the origin of the errors, turns out to run over either the whole error or the statistical error only.

For the correlation between \(f_B\) and \(B_{B_d}\), both from RBC/UKQCD 14A, we have no information on the correlation and therefore take the total errors to be 100% correlated. For example, the heavy-quark error, which is \(O(1/m_b)\) and dominant, is common to both. For the correlation between \(f_B\) (RBC/UKQCD 14) and \(B_{B_d}\) (RBC/UKQCD 14A), which use different heavy-quark formulations but are based on the same set of gauge field configurations, only the statistical error is taken as correlated. The correlations among the other computations are determined in a similar way. As a general rule, we take the whole error as correlated between \(f_B\) and \(B_{B_d}\) if both results are based on exactly the same lattice actions for light and heavy quarks and share (at least a part of) the gauge field ensembles. Otherwise, only the statistical error is taken as correlated if the two computations share gauge field ensembles, and no correlation is assumed in the remaining cases. This is summarized in Table 38. The correlations between \(f_{B_s}\) and \(B_{B_s}\), and between \(f_{B_s}/f_{B}\) and \(B_{B_s}/B_{B_d}\), are determined in the same way and are also summarized in Table 38.

The necessary information for constructing the second term in the square root of Eq. (14) has already been provided. For completeness, let us also summarize the correlation pattern needed to construct the other part of \(\sigma _{i;j}\) for the bag parameters, which is shown in Table 39.
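The correlation rule described in this subsection can be written as a small decision function. The sketch below is only an illustration of that rule: the `Computation` record, the function name, and the action and ensemble labels are hypothetical placeholders, not identifiers taken from the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Computation:
    name: str
    light_action: str
    heavy_action: str
    ensembles: frozenset  # labels of the gauge field ensembles used


def correlated_error_component(a, b):
    """Which part of the error is treated as correlated between a and b.

    "total"       : same light and heavy actions, overlapping ensembles
    "statistical" : overlapping ensembles, but different actions
    "none"        : no shared ensembles
    """
    if not (a.ensembles & b.ensembles):
        return "none"
    same_actions = (a.light_action == b.light_action
                    and a.heavy_action == b.heavy_action)
    return "total" if same_actions else "statistical"


# Placeholder labels standing in for the cases discussed in the text.
f_b_14a = Computation("f_B (RBC/UKQCD 14A)", "light_X", "heavy_1", frozenset({"E1", "E2"}))
bag_14a = Computation("B_Bd (RBC/UKQCD 14A)", "light_X", "heavy_1", frozenset({"E1", "E2"}))
f_b_14 = Computation("f_B (RBC/UKQCD 14)", "light_X", "heavy_2", frozenset({"E1", "E2"}))
other = Computation("unrelated computation", "light_Y", "heavy_3", frozenset({"E9"}))

print(correlated_error_component(f_b_14a, bag_14a))  # same setup
print(correlated_error_component(f_b_14, bag_14a))   # shared ensembles only
print(correlated_error_component(other, bag_14a))    # nothing shared
```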

8.3 Semileptonic form factors for B decays to light flavours

The Standard Model differential rate for the decay \(B_{(s)}\rightarrow P\ell \nu \) involving a quark-level \(b\rightarrow u\) transition is given, at leading order in the weak interaction, by a formula analogous to the one for D decays in Eq. (171), but with \(D \rightarrow B_{(s)}\) and the relevant CKM matrix element \(|V_{cq}| \rightarrow |V_{ub}|\):

$$\begin{aligned}&\frac{d\Gamma (B_{(s)}\rightarrow P\ell \nu )}{dq^2} \nonumber \\&\quad = \frac{G_F^2 |V_{ub}|^2}{24 \pi ^3} \,\frac{(q^2-m_\ell ^2)^2\sqrt{E_P^2-m_P^2}}{q^4m_{B_{(s)}}^2}\nonumber \\&\qquad \times \left[ \left( 1+\frac{m_\ell ^2}{2q^2}\right) m_{B_{(s)}}^2(E_P^2-m_P^2)|f_+(q^2)|^2 \right. \nonumber \\&\qquad \left. ~\,+\,\frac{3m_\ell ^2}{8q^2}(m_{B_{(s)}}^2-m_P^2)^2|f_0(q^2)|^2 \right] \,. \end{aligned}$$
(222)

Again, for \(\ell =e,\mu \) the contribution from the scalar form factor \(f_0\) can be neglected, and one has a similar expression to Eq. (173), which, in principle, allows for a direct extraction of \(|V_{ub}|\) by matching theoretical predictions to experimental data. However, while for D (or K) decays the entire physical range \(0 \le q^2 \le q^2_{\mathrm{max}}\) can be covered with moderate momenta accessible to lattice simulations, in \(B \rightarrow \pi \ell \nu \) decays one has \(q^2_{\mathrm{max}} \sim 26~\mathrm{GeV}^2\) and only part of the full kinematic range is reachable. As a consequence, obtaining \(|V_{ub}|\) from \(B\rightarrow \pi \ell \nu \) is more complicated than obtaining \(|V_{cd(s)}|\) from semileptonic D-meson decays.
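For readers who want to evaluate Eq. (222) numerically, a minimal sketch follows. It returns the differential rate divided by \(|V_{ub}|^2\) in natural units (GeV\(^{-1}\)), with the form factors passed in as callables. The function names, the value of \(G_F\), and the meson masses used in the usage note are assumptions of this sketch; the meson energy is taken in the rest frame of the decaying meson, \(E_P=(m_{B_{(s)}}^2+m_P^2-q^2)/(2m_{B_{(s)}})\).

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2 (assumed input)


def meson_energy(q2, m_parent, m_p):
    """Energy of the final-state meson P in the parent rest frame (GeV)."""
    return (m_parent**2 + m_p**2 - q2) / (2.0 * m_parent)


def dgamma_dq2_over_vub2(q2, f_plus, f_zero, m_parent, m_p, m_lepton=0.0):
    """Right-hand side of Eq. (222) divided by |V_ub|^2, in GeV^-1.

    f_plus and f_zero are callables of q2 (in GeV^2).
    """
    q2_max = (m_parent - m_p) ** 2
    if not m_lepton**2 < q2 <= q2_max:
        raise ValueError("q2 outside the physical range")
    e_p = meson_energy(q2, m_parent, m_p)
    p2 = max(e_p**2 - m_p**2, 0.0)  # guard against rounding at q2_max
    prefactor = (G_F**2 / (24.0 * math.pi**3)
                 * (q2 - m_lepton**2) ** 2 * math.sqrt(p2)
                 / (q2**2 * m_parent**2))
    vector = ((1.0 + m_lepton**2 / (2.0 * q2))
              * m_parent**2 * p2 * abs(f_plus(q2)) ** 2)
    scalar = (3.0 * m_lepton**2 / (8.0 * q2)
              * (m_parent**2 - m_p**2) ** 2 * abs(f_zero(q2)) ** 2)
    return prefactor * (vector + scalar)
```

With assumed masses \(m_B=5.27965\) GeV and \(m_\pi =0.13957\) GeV, the endpoint `(m_parent - m_p)**2` reproduces \(q^2_{\mathrm{max}}\sim 26~\mathrm{GeV}^2\). For a massless lepton the expression reduces to \(G_F^2\,|\mathbf{p}_P|^3\,|f_+|^2/(24\pi ^3)\).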

In practice, lattice computations are restricted to large values of the momentum transfer \(q^2\) (see Sect. 7.2) where statistical and momentum-dependent discretization errors can be controlled,Footnote 47 which in existing calculations roughly cover the upper third of the kinematically allowed \(q^2\) range. Since, on the other hand, the decay rate is suppressed by phase space at large \(q^2\), most of the semileptonic \(B\rightarrow \pi \) events are selected in experiment at lower values of \(q^2\), leading to more accurate experimental results for the binned differential rate in that region.Footnote 48 It is therefore a challenge to find a window of intermediate values of \(q^2\) at which both the experimental and lattice results can be reliably evaluated.

In current practice, the extraction of CKM matrix elements requires that both experimental and lattice data for the \(q^2\)-dependence be parameterized by fitting data to a specific ansatz. Before the generalization of the sophisticated ansätze that will be discussed below, the most common procedure to overcome this difficulty involved matching the theoretical prediction and the experimental result for the integrated decay rate over some finite interval in \(q^2\),

$$\begin{aligned} \Delta \zeta = \frac{1}{|V_{ub}|^2} \int _{q^2_{1}}^{q^2_{2}} \left( \frac{d \Gamma }{d q^2} \right) dq^2\,. \end{aligned}$$
(223)

In the most recent literature, it has become customary to perform a joint fit to lattice and experimental results, keeping the relative normalization \(|V_{ub}|^2\) as a free parameter. In either case, good control of the systematic uncertainty induced by the choice of parameterization is crucial to obtain a precise determination of \(|V_{ub}|\). A detailed discussion of the parameterization of form factors as a function of \(q^2\) can be found in Appendix A.5.
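The integrated quantity \(\Delta \zeta \) of Eq. (223) can be sketched numerically as follows, for a massless lepton and \(B\rightarrow \pi \). Everything about the form factor here is a toy assumption: `toy_f_plus` is a single-pole shape with an arbitrary normalization, not a lattice result. The masses, \(G_F\), the value of \(\hbar \) used to convert to \(\mathrm{ps}^{-1}\), and the default lower limit \(q_1^2=16~\mathrm{GeV}^2\) are also inputs assumed for this illustration.

```python
import math

G_F = 1.1663787e-5             # GeV^-2 (assumed input)
HBAR_GEV_PS = 6.582119569e-13  # hbar in GeV * ps (assumed input)
M_B, M_PI = 5.27965, 0.13957   # GeV (assumed inputs)
M_BSTAR = 5.325                # GeV, pole mass for the toy shape


def toy_f_plus(q2, f_at_zero=0.25):
    """Hypothetical single-pole form factor, for illustration only."""
    return f_at_zero / (1.0 - q2 / M_BSTAR**2)


def integrand(q2, f_plus):
    """(1/|V_ub|^2) dGamma/dq2 for a massless lepton, in GeV^-1."""
    e_pi = (M_B**2 + M_PI**2 - q2) / (2.0 * M_B)
    p2 = max(e_pi**2 - M_PI**2, 0.0)
    return G_F**2 / (24.0 * math.pi**3) * p2**1.5 * f_plus(q2) ** 2


def simpson(func, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def delta_zeta(f_plus, q2_min=16.0, q2_max=None, n=1000):
    """Delta zeta of Eq. (223) in ps^-1 for the given form factor."""
    if q2_max is None:
        q2_max = (M_B - M_PI) ** 2
    rate_gev = simpson(lambda q2: integrand(q2, f_plus), q2_min, q2_max, n)
    return rate_gev / HBAR_GEV_PS


print(f"toy Delta zeta = {delta_zeta(toy_f_plus):.3f} ps^-1")
```

Because the number printed depends entirely on the toy form factor, it should not be compared with the entries of Table 40.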

8.3.1 Form factors for \(B\rightarrow \pi \ell \nu \)

The semileptonic decay processes \(B\rightarrow \pi \ell \nu \) enable determinations of the CKM matrix element \(|V_{ub}|\) within the Standard Model via Eq. (222). Early results for \(B\rightarrow \pi \ell \nu \) form factors came from the HPQCD [575] and FNAL/MILC [576] collaborations. Only HPQCD provided results for the scalar form factor \(f_0\). Our previous review featured a significantly extended calculation of \(B\rightarrow \pi \ell \nu \) from FNAL/MILC [577] and a new computation from RBC/UKQCD [578]. All the above computations employ \(N_f=2\,+\,1\) dynamical configurations, and provide values for both form factors \(f_+\) and \(f_0\). In addition, HPQCD using MILC ensembles had published the first \(N_f=2\,+\,1\,+\,1\) results for the \(B\rightarrow \pi \ell \nu \) scalar form factor, working at zero recoil and pion masses down to the physical value [579]; this adds to previous reports on ongoing work to upgrade their 2006 computation [580, 581]. Since this latter result has no immediate impact on current \(|V_{ub}|\) determinations, which come from the vector-form-factor-dominated decay channels into light leptons, we will from now on concentrate on the \(N_f=2\,+\,1\) determinations of the \(q^2\)-dependence of \(B\rightarrow \pi \) form factors.

Table 40 Results for the \(B \rightarrow \pi \ell \nu \) semileptonic form factor. The quantity \(\Delta \zeta \) is defined in Eq. (223); the quoted values correspond to \(q_1=4\) GeV, \(q_2=q_{max}\), and are given in \(\text{ ps }^{-1}\)

Results presented at Lattice 2017 are preliminary or blinded, so not yet ready for inclusion in this review. However, the reader will be interested to know that the JLQCD collaboration is using Möbius Domain Wall fermions with \(a\approx 0.08\), 0.055, and 0.044 fm and pion masses down to 300 MeV to study this process [582]. FNAL/MILC is using \(N_f=2\,+\,1\,+\,1\) HISQ ensembles with \(a\approx 0.15\), 0.12, and 0.088 fm, with Goldstone pion mass down to its physical value [583]. Both groups updated their results for Lattice 2018, but do not have final values for the form factors.

Returning to the works that contribute to our averages, both the HPQCD and the FNAL/MILC computations of \(B\rightarrow \pi \ell \nu \) amplitudes use ensembles of gauge configurations with \(N_f=2\,+\,1\) flavours of rooted staggered quarks produced by the MILC collaboration; however, the latest FNAL/MILC work makes a much more extensive use of the currently available ensembles, both in terms of lattice spacings and light-quark masses. HPQCD have results at two values of the lattice spacing (\(a\sim 0.12,~0.09~\mathrm{fm}\)), while FNAL/MILC employs four values (\(a\sim 0.12,~0.09,~0.06,~0.045~\mathrm{fm}\)). Lattice-discretization effects are estimated within HMrS\(\chi \)PT in the FNAL/MILC computation, while HPQCD quotes the results at \(a\sim 0.12~\mathrm{fm}\) as central values and uses the \(a\sim 0.09~\mathrm{fm}\) results to quote an uncertainty. The relative scale is fixed in both cases through \(r_1/a\). HPQCD set the absolute scale through the \(\Upsilon \) 2S–1S splitting, while FNAL/MILC uses a combination of \(f_\pi \) and the same \(\Upsilon \) splitting, as described in Ref. [62]. The spatial extent of the lattices employed by HPQCD is \(L\simeq 2.4~\mathrm{fm}\), save for the lightest mass point (at \(a\sim 0.09~\mathrm{fm}\)) for which \(L\simeq 2.9~\mathrm{fm}\). FNAL/MILC, on the other hand, uses extents up to \(L \simeq 5.8~\mathrm{fm}\), in order to allow for light-pion masses while keeping finite-volume effects under control. Indeed, while in the 2006 HPQCD work the lightest RMS pion mass is \(400~\mathrm{MeV}\), the latest FNAL/MILC work includes pions as light as \(165~\mathrm{MeV}\) – in both cases the bound \(m_\pi L \gtrsim 3.8\) is kept. Other than the qualitatively different range of MILC ensembles used in the two computations, the main difference between HPQCD and FNAL/MILC lies in the treatment of heavy quarks. HPQCD uses the NRQCD formalism, with a 1-loop matching of the relevant currents to the ones in the relativistic theory. 
FNAL/MILC employs the clover action with the Fermilab interpretation, with a mostly nonperturbative renormalization of the relevant currents, within which light-light and heavy–heavy currents are renormalized nonperturbatively and 1-loop perturbation theory is used for the relative normalization. (See Table 40; full details about the computations are provided in tables in Appendix B.6.3.)

The RBC/UKQCD computation is based on \(N_f=2\,+\,1\) DWF ensembles at two values of the lattice spacing (\(a\sim 0.12,~0.09~\mathrm{fm}\)), and pion masses in a narrow interval ranging from slightly above \(400~\mathrm{MeV}\) to slightly below \(300~\mathrm{MeV}\), keeping \(m_\pi L \gtrsim 4\). The scale is set using the \(\Omega ^-\) baryon mass. Discretization effects coming from the light sector are estimated in the \(1\%\) ballpark using HM\(\chi \)PT supplemented with effective higher-order interactions to describe cutoff effects. The b quark is treated using the Columbia RHQ action, with a mostly nonperturbative renormalization of the relevant currents. Discretization effects coming from the heavy sector are estimated with power-counting arguments to be below \(2\%\).

Given the large kinematical range available in the \(B\rightarrow \pi \) transition, chiral extrapolations are an important source of systematic uncertainty: apart from the eventual need to reach physical pion masses in the extrapolation, the applicability of \(\chi \)PT is not guaranteed for large values of the pion energy \(E_\pi \). Indeed, in all computations \(E_\pi \) reaches values in the \(1~\mathrm{GeV}\) ballpark, and chiral extrapolation systematics is the dominant source of errors. FNAL/MILC uses SU(2) NLO HMrS\(\chi \)PT for the continuum-chiral extrapolation, supplemented by NNLO analytic terms and hard-pion \(\chi \)PT terms [537];Footnote 49 systematic uncertainties are estimated through an extensive study of the effects of varying the specific fit ansatz and/or data range. RBC/UKQCD uses SU(2) hard-pion HM\(\chi \)PT to perform its combined continuum-chiral extrapolation, and obtains sizeable estimates for systematic uncertainties by varying the ansätze and ranges used in fits. HPQCD performs chiral extrapolations using HMrS\(\chi \)PT formulae, and estimates systematic uncertainties by comparing the result with the ones from fits to a linear behaviour in the light-quark mass, continuum HM\(\chi \)PT, and partially quenched HMrS\(\chi \)PT formulae (including also data with different sea and valence light-quark masses).

FNAL/MILC and RBC/UKQCD describe the \(q^2\)-dependence of \(f_+\) and \(f_0\) by applying a BCL parameterization to the form factors extrapolated to the continuum limit, within the range of values of \(q^2\) covered by data. RBC/UKQCD generate synthetic data for the form factors at some values of \(q^2\) (evenly spaced in z) from the continuous function of \(q^2\) obtained from the joint chiral-continuum extrapolation, which are then used as input for the fits. After having checked that the kinematical constraint \(f_+(0)=f_0(0)\) is satisfied within errors by the extrapolation to \(q^2=0\) of the results of separate fits, this constraint is imposed to improve fit quality. In the case of FNAL/MILC, rather than producing synthetic data a functional method is used to extract the z-parameterization directly from the fit functions employed in the continuum-chiral extrapolation. In the case of HPQCD, the parameterization of the \(q^2\)-dependence of form factors is somewhat intertwined with chiral extrapolations: a set of fiducial values \(\{E_\pi ^{(n)}\}\) is fixed for each value of the light-quark mass, and \(f_{+,0}\) are interpolated to each of the \(E_\pi ^{(n)}\); chiral extrapolations are then performed at fixed \(E_\pi \) (i.e., \(m_\pi \) and \(q^2\) are varied subject to \(E_\pi \)=constant). The interpolation is performed using a BZ ansatz. The \(q^2\)-dependence of the resulting form factors in the chiral limit is then described by means of a BZ ansatz, which is cross-checked against BK, RH, and BGL parameterizations. Unfortunately, the correlation matrix for the values of the form factors at different \(q^2\) is not provided, which severely limits the possibilities of combining them with other computations into a global z-parameterization.

Based on the parameterized form factors, HPQCD and RBC/UKQCD provide values for integrated decay rates \(\Delta \zeta ^{B\pi }\), as defined in Eq. (223); they are quoted in Table 40. The latest FNAL/MILC work, on the other hand, does not quote a value for the integrated rate. Furthermore, as mentioned above, the field has recently moved towards determining CKM matrix elements from direct joint fits of experimental results and theoretical form factors, rather than from a matching through \(\Delta \zeta ^{B\pi }\). Thus, we will not provide here a FLAG average for the integrated rate, and focus on averaging lattice results for the form factors themselves.

The different ways in which the current results are presented do not allow a straightforward averaging procedure. RBC/UKQCD only provides synthetic values of \(f_+\) and \(f_0\) at a few values of \(q^2\) as an illustration of their results, and FNAL/MILC does not quote synthetic values at all. In both cases, full results for BCL z-parameterizations defined by Eq. (448) are quoted. In the case of HPQCD 06, unfortunately, a fit to a BCL z-parameterization is not possible, as discussed above.

In order to combine these form factor calculations we start from sets of synthetic data for several \(q^2\) values. HPQCD and RBC/UKQCD provide directly this information; FNAL/MILC presents only fits to a BCL z-parameterization from which we can easily generate an equivalent set of form factor values. It is important to note that in both the RBC/UKQCD synthetic data and the FNAL/MILC z-parameterization fits the kinematic constraint at \(q^2=0\) is automatically included (in the FNAL/MILC case the constraint is manifest in an exact degeneracy of the \((a_n^+ ,a_n^0)\) covariance matrix). Due to these considerations, in our opinion the most accurate procedure is to perform a simultaneous fit to all synthetic data for the vector and scalar form factors. Unfortunately, the absence of information on the correlation in the HPQCD result between the vector and scalar form factors even at a single \(q^2\) point makes it impossible to include consistently this calculation in the overall fit. In fact, the HPQCD and FNAL/MILC statistical uncertainties are highly correlated (because they are based on overlapping subsets of MILC \(N_f=2\,+\,1\) ensembles) and, without knowledge of the \(f_+ - f_0\) correlation we are unable to construct the HPQCD-FNAL/MILC off-diagonal entries of the overall covariance matrix.

In conclusion, we will present as our best result a combined vector and scalar form factor fit to the FNAL/MILC and RBC/UKQCD results, which we treat as completely uncorrelated. For the sake of completeness we will also show the results of a vector form factor fit alone in which we include one HPQCD datum at \(q^2=17.34~\,\mathrm {GeV}^2\), conservatively assuming a 100% correlation between the statistical error of this point and those of all FNAL/MILC synthetic data. In spite of contributing just one point, the HPQCD datum has a significant weight in the fit due to its small overall uncertainty. We stress again that this procedure is slightly inconsistent because FNAL/MILC and RBC/UKQCD include information on the kinematic constraint at \(q^2=0\) in their \(f_+\) results.
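The covariance matrix for such a combined fit can be assembled in blocks. The sketch below shows one way to do this under the stated 100% correlation of statistical errors between two data sets. The function name and all numbers in the usage example are hypothetical and do not correspond to any published synthetic data.

```python
def combined_covariance(cov_a, cov_b, stat_a, stat_b, rho_stat=1.0):
    """Covariance of the concatenated data vector (A, B).

    cov_a, cov_b   : full covariance matrices within each data set
    stat_a, stat_b : statistical errors of the individual points
    rho_stat       : assumed correlation of statistical errors across sets
    """
    n_a, n_b = len(cov_a), len(cov_b)
    size = n_a + n_b
    cov = [[0.0] * size for _ in range(size)]
    for i in range(n_a):
        for j in range(n_a):
            cov[i][j] = cov_a[i][j]
    for i in range(n_b):
        for j in range(n_b):
            cov[n_a + i][n_a + j] = cov_b[i][j]
    # Off-diagonal blocks: only the statistical errors are correlated.
    for i in range(n_a):
        for j in range(n_b):
            cross = rho_stat * stat_a[i] * stat_b[j]
            cov[i][n_a + j] = cross
            cov[n_a + j][i] = cross
    return cov


# Hypothetical inputs: two points from set A, one point from set B.
cov_a = [[0.0025, 0.0010], [0.0010, 0.0016]]
stat_a = [0.03, 0.02]
cov_b = [[0.0009]]
stat_b = [0.02]

for row in combined_covariance(cov_a, cov_b, stat_a, stat_b):
    print(["%.4f" % entry for entry in row])
```

Setting `rho_stat=0.0` reproduces the case in which the two data sets are treated as completely uncorrelated.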

Fig. 27 The form factors \((1 - q^2/m_{B^*}^2) f_+(q^2)\) and \(f_0 (q^2)\) for \(B \rightarrow \pi \ell \nu \) plotted versus z. (See text for a discussion of the data set.) The grey and salmon bands display our preferred \(N^+=N^0=3\) BCL fit (five parameters) to the plotted data with errors

The resulting data set is then fitted to the BCL parameterization in Eqs. (448) and (449). We assess the systematic uncertainty due to truncating the series expansion by considering fits to different orders in z. In Fig. 27, we show the FNAL/MILC, RBC/UKQCD, and HPQCD data points for \((1-q^2/m_{B^*}^2) f_+(q^2)\) and \(f_0 (q^2)\) versus z. The data are highly linear and we get a good \(\chi ^2/\mathrm{dof}\) with \(N^+ = N^0 = 3\). Note that this implies three independent parameters for \(f_+\) corresponding to a polynomial through \({\mathcal {O}}(z^3)\) and two independent parameters for \(f_0\) corresponding to a polynomial through \({\mathcal {O}}(z^2)\) (the coefficient \(a_2^0\) is fixed using the \(q^2=0\) kinematic constraint). We cannot constrain the coefficients of the z-expansion beyond this order; for instance, including a fourth parameter in \(f_+\) results in 100% uncertainties on \(a_2^+\) and \(a_3^+\). The outcome of the five-parameter \(N^+ =N^0=3\) BCL fit to the FNAL/MILC and RBC/UKQCD calculations is shown in Table 41. The uncertainties on \(a_0^{+,0}\), \(a_1^{+,0}\) and \(a_2^+\) encompass the central values obtained from \(N^+=2,4\) and \(N^0=2,4,5\) fits and thus adequately reflect the systematic uncertainty on those series coefficients. This can be used as the averaged FLAG result for the lattice-computed form factor \(f_+(q^2)\). The coefficient \(a_3^+\) can be obtained from the values for \(a_0^+\)–\(a_2^+\) using Eq. (447). The coefficient \(a_2^0\) can be obtained from all other coefficients by imposing the \(f_+(q^2=0) = f_0(q^2=0)\) constraint. The fit is illustrated in Fig. 27. It is worth stressing that, with respect to our average in the 2015 edition of the FLAG report, the relative error on \(a_0^+\), which dominates the theory contribution to the determination of \(|V_{ub}|\), has decreased from \(7.3\%\) to \(3.2\%\). The dominant factor in this remarkable improvement is the new FNAL/MILC determination of \(f_+\). 
We emphasize that future lattice-QCD calculations of semileptonic form factors should publish their full statistical and systematic correlation matrices to enable others to use the data. It is also preferable to present a set of synthetic form-factor data equivalent to the z-fit results, since this allows for an independent analysis that avoids further assumptions about the compatibility of the procedures to arrive at a given z-parameterization.Footnote 50 Likewise, covariance/correlation matrices should be presented with enough significant digits to allow all their eigenvalues to be calculated correctly.
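The remark on significant digits can be made concrete with a small example. The sketch below uses a hypothetical, strongly correlated \(3\times 3\) correlation matrix (not taken from any of the cited works) and tests positive definiteness by attempting a Cholesky factorization. Rounding the entries to two decimal places is enough to destroy positive definiteness in this constructed case.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


def rounded(matrix, digits):
    """Copy of the matrix with every entry rounded to `digits` decimals."""
    return [[round(entry, digits) for entry in row] for row in matrix]


# Hypothetical correlation matrix with three significant digits.
EXACT = [[1.000, 0.986, 0.954],
         [0.986, 1.000, 0.986],
         [0.954, 0.986, 1.000]]

print("three digits:", is_positive_definite(EXACT))
print("two digits  :", is_positive_definite(rounded(EXACT, 2)))
```

A matrix that fails this test has at least one non-positive eigenvalue and cannot be inverted into a meaningful \(\chi ^2\).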

For the sake of completeness, we present also a standalone z-fit to the vector form factor. In this fit we are able to include the single \(f_+\) point at \(q^2 = 17.34\; \mathrm{GeV}^2\) that we mentioned above. This fit uses the FNAL/MILC and RBC/UKQCD results that do make use of the kinematic constraint at \(q^2=0\), but is otherwise unbiased. The results of the three-parameter BCL fit to the HPQCD, FNAL/MILC and RBC/UKQCD calculations of the vector form factor are:

$$\begin{aligned}&N_f=2\,+\,1: \quad a_0^+ = 0.421(13), \nonumber \\&a_1^+ = -0.35(10),\quad a_2^+ = -0.41(64); \nonumber \\&{\mathrm{corr}}(a_i,a_j)=\left( \begin{array}{rrr} 1.000 &{}\quad 0.306 &{} \quad 0.084 \\ 0.306 &{}\quad 1.000 &{}\quad 0.856 \\ 0.084 &{} \quad 0.856 &{}\quad 1.000 \end{array}\right) . \end{aligned}$$
(224)

Note that the \(a_0^+\) coefficient, which is the most relevant input to the extraction of \(|V_{ub}|\) from semileptonic \(B\rightarrow \pi \ell \nu _\ell (\ell =e,\mu )\) decays, shifts by about a standard deviation relative to the combined vector and scalar fit.
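The coefficients and correlation matrix of Eq. (224) are sufficient to evaluate \(f_+(q^2)\) with a propagated error. The sketch below assumes the BCL form \(f_+(q^2)=\frac{1}{1-q^2/m_{B^*}^2}\sum _{n=0}^{N-1}a_n\left[ z^n-(-1)^{n-N}\frac{n}{N}z^N\right] \) with \(N=3\), \(t_+=(m_B+m_\pi )^2\) and \(t_0=(m_B+m_\pi )(\sqrt{m_B}-\sqrt{m_\pi })^2\). These conventions and the meson masses are assumptions of this illustration and should be checked against Eq. (448) and Appendix A.5 before use. All function names are illustrative.

```python
import math

M_B, M_PI, M_BSTAR = 5.27965, 0.13957, 5.325  # GeV (assumed inputs)
T_PLUS = (M_B + M_PI) ** 2
T_ZERO = (M_B + M_PI) * (math.sqrt(M_B) - math.sqrt(M_PI)) ** 2

# Eq. (224): central values, errors and correlation matrix.
A = [0.421, -0.35, -0.41]
DA = [0.013, 0.10, 0.64]
CORR = [[1.000, 0.306, 0.084],
        [0.306, 1.000, 0.856],
        [0.084, 0.856, 1.000]]


def z(q2):
    """Conformal variable z(q2, t0)."""
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - T_ZERO)
    return (a - b) / (a + b)


def sign(n, n_terms):
    """(-1)^(n - n_terms) without negative integer powers."""
    return -1.0 if (n - n_terms) % 2 else 1.0


def bcl_basis(q2, n_terms=3):
    """Functions multiplying a_0 ... a_{N-1} in the assumed BCL form."""
    zz = z(q2)
    pole = 1.0 - q2 / M_BSTAR**2
    return [(zz**n - sign(n, n_terms) * n / n_terms * zz**n_terms) / pole
            for n in range(n_terms)]


def f_plus_with_error(q2):
    """Central value and linearly propagated error of f_+(q2)."""
    g = bcl_basis(q2, len(A))
    value = sum(a * gi for a, gi in zip(A, g))
    variance = sum(g[i] * CORR[i][j] * DA[i] * DA[j] * g[j]
                   for i in range(len(A)) for j in range(len(A)))
    return value, math.sqrt(max(variance, 0.0))


def implied_highest_coefficient(coeffs):
    """Coefficient of z^N implied by the assumed BCL form."""
    n_terms = len(coeffs)
    return -sum(a * sign(n, n_terms) * n / n_terms
                for n, a in enumerate(coeffs))


for q2 in (17.34, 20.0, 24.0):
    value, error = f_plus_with_error(q2)
    print(f"q2 = {q2:5.2f} GeV^2: f_+ = {value:.3f} +/- {error:.3f}")
```

Because \(f_+\) is linear in the coefficients, the error propagation is exact given the covariance matrix; no linearization in \(q^2\) or z is involved.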

Table 41 Coefficients and correlation matrix for the \(N^+ =N^0=3\) z-expansion of the \(B\rightarrow \pi \) form factors \(f_+\) and \(f_0\)
Table 42 Results for the \(B_s \rightarrow K\ell \nu \) semileptonic form factor

8.3.2 Form factors for \(B_s\rightarrow K\ell \nu \)

Similar to \(B\rightarrow \pi \ell \nu \), measurements of \(B_s\rightarrow K\ell \nu \) enable determinations of the CKM matrix element \(|V_{ub}|\) within the Standard Model via Eq. (222). From the lattice point of view the two channels are very similar. As a matter of fact, \(B_s\rightarrow K\ell \nu \) is somewhat simpler: the kaon mass region is easily accessed by all simulations, which makes the systematic uncertainties related to the chiral extrapolation smaller. On the other hand, \(B_s\rightarrow K\ell \nu \) channels have not been measured experimentally yet, and therefore lattice results provide SM predictions for the relevant rates.

At the time of our previous review, results for \(B_s\rightarrow K\ell \nu \) form factors were provided by HPQCD [584] and RBC/UKQCD [577] for both form factors \(f_+\) and \(f_0\), in both cases using \(N_f=2\,+\,1\) dynamical configurations. The ALPHA collaboration determination of \(B_s\rightarrow K\ell \nu \) form factors with \(N_f=2\) was also well underway [585]; however, we have not seen final results. HPQCD has recently emphasized the value of form factor ratios for the processes \(B_s\rightarrow K\ell \nu \) and \(B_s\rightarrow D_s\ell \nu \) for determination of \(|V_{ub}/V_{cb}|\) [586]. Preliminary results from FNAL/MILC have been reported for \(N_f=2\,+\,1\) [587] and \(N_f=2\,+\,1\,+\,1\) [583]. Archival papers are expected soon.

The RBC/UKQCD computation has been published together with the \(B\rightarrow \pi \ell \nu \) computation discussed in Sect. 8.3.1, all technical details being practically identical. The main difference is that errors are significantly smaller, mostly due to the reduction of systematic uncertainties due to the chiral extrapolation; detailed information is provided in tables in Appendix B.6.3. The HPQCD computation uses ensembles of gauge configurations with \(N_f=2\,+\,1\) flavours of asqtad rooted staggered quarks produced by the MILC collaboration at two values of the lattice spacing (\(a\sim 0.12,~0.09~\mathrm{fm}\)), for three and two different sea-pion masses, respectively, down to a value of \(260~\mathrm{MeV}\). The b quark is treated within the NRQCD formalism, with a 1-loop matching of the relevant currents to the ones in the relativistic theory, omitting terms of \({\mathcal {O}}(\alpha _s\Lambda _{\mathrm{QCD}}/m_b)\). The HISQ action is used for the valence s quark. The continuum-chiral extrapolation is combined with the description of the \(q^2\)-dependence of the form factors into a modified z-expansion (cf. Appendix A.5) that formally coincides in the continuum with the BCL ansatz. The dependence of form factors on the pion energy and quark masses is fitted to a 1-loop ansatz inspired by hard-pion \(\chi \)PT [537], that factorizes out the chiral logarithms describing soft physics. See Table 42 and the tables in Appendix B.6.3 for full details.

Both RBC/UKQCD and HPQCD quote values for integrated differential decay rates over the full kinematically available region. However, since the absence of experiment makes the relevant integration interval subject to change, we will not discuss them here, and focus on averages of form factors. In order to combine the results from the two collaborations, we will follow a similar approach to the one adopted above for \(B\rightarrow \pi \ell \nu \): we will take as direct input the synthetic values of the form factors provided by RBC/UKQCD, use the preferred HPQCD parameterization to produce synthetic values, and perform a joint fit to the two data sets.

Note that the kinematic constraint at \(q^2=0\) is included explicitly in the results presented by HPQCD (the coefficient \(b_0^0\) is expressed analytically in terms of all others) and implicitly in the synthetic data provided by RBC/UKQCD. Therefore, following the procedure we adopted for the \(B\rightarrow \pi \) case, we present a joint fit to the vector and scalar form factors and implement explicitly the \(q^2=0\) constraint by expressing the coefficient \(b^0_{N^0-1}\) in terms of all others.
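As an illustration of how the \(q^2=0\) constraint removes one parameter, the following minimal Python sketch solves for the last scalar coefficient. It assumes the standard BCL series (the vector series carries the usual \(z^N\) threshold term, the scalar series is a plain polynomial in z) and uses the fact that the pole factors equal unity at \(q^2=0\). The coefficient values and the value of z at \(q^2=0\) are hypothetical and serve only to exercise the algebra.

```python
def bcl_vector_sum(a, z):
    """Series part of the BCL vector form factor with N = len(a) free coefficients."""
    n_terms = len(a)
    return sum(
        a_n * (z ** n - (-1) ** (n + n_terms) * n / n_terms * z ** n_terms)
        for n, a_n in enumerate(a)
    )


def bcl_scalar_sum(b, z):
    """Series part of the BCL scalar form factor (plain polynomial in z)."""
    return sum(b_n * z ** n for n, b_n in enumerate(b))


def solve_last_scalar_coefficient(a, b_free, z_at_zero):
    """Return the highest scalar coefficient such that f_+(0) = f_0(0).

    The pole factors are equal to 1 at q^2 = 0, so only the series enter.
    """
    target = bcl_vector_sum(a, z_at_zero)
    partial = bcl_scalar_sum(b_free, z_at_zero)
    return (target - partial) / z_at_zero ** len(b_free)


# Hypothetical inputs, not fit results from any collaboration.
A_PLUS = [0.37, -0.75, 2.6]   # vector coefficients a_0, a_1, a_2
B_FREE = [0.22, 0.15]         # scalar coefficients b_0, b_1
Z0 = 0.146                    # assumed value of z at q^2 = 0

B_LAST = solve_last_scalar_coefficient(A_PLUS, B_FREE, Z0)
```

With \(N^+=N^0=3\) this leaves three vector and two scalar coefficients free, which is the five-parameter counting used for the fit.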

For the fits we employ a BCL ansatz with \(t_+=(M_{B_s}+M_{K^\pm })^2 \simeq 34.35~\,\mathrm {GeV}^2\) and \(t_0=(M_{B_s}+M_{K^\pm })(\sqrt{M_{B_s}}-\sqrt{M_{K^\pm }})^2 \simeq 15.27~\,\mathrm {GeV}^2\). Our pole factors will contain a single pole in both the vector and scalar channels, for which we take the mass values \(M_{B^*}=5.325~\,\mathrm {GeV}\) and \(M_{B^*(0^+)}=5.65~\,\mathrm {GeV}\).Footnote 51
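The quoted values of \(t_+\) and \(t_0\) can be cross-checked with a short Python sketch. The meson masses below are illustrative PDG-like inputs and are an assumption, since the text does not list them; the map from \(q^2\) to z is the standard conformal transformation used in z-expansions.

```python
import math

# Meson masses in GeV; illustrative inputs, not taken from the text.
M_BS = 5.3669
M_K = 0.49368


def bcl_thresholds(m_parent, m_daughter):
    """Return (t_plus, t_0) in GeV^2 for the choice of t_0 quoted in the text."""
    mass_sum = m_parent + m_daughter
    t_plus = mass_sum ** 2
    t_zero = mass_sum * (math.sqrt(m_parent) - math.sqrt(m_daughter)) ** 2
    return t_plus, t_zero


def z_variable(q2, t_plus, t_zero):
    """Conformal map from q^2 to z; z vanishes at q^2 = t_0."""
    a = math.sqrt(t_plus - q2)
    b = math.sqrt(t_plus - t_zero)
    return (a - b) / (a + b)


T_PLUS, T_ZERO = bcl_thresholds(M_BS, M_K)
Q2_MAX = (M_BS - M_K) ** 2                        # zero-recoil end point
Z_AT_Q2_ZERO = z_variable(0.0, T_PLUS, T_ZERO)
Z_AT_Q2_MAX = z_variable(Q2_MAX, T_PLUS, T_ZERO)
```

This choice of \(t_0\) makes the semileptonic range symmetric in z, so that the values of z at \(q^2=0\) and at \(q^2_{\mathrm{max}}\) are equal in magnitude and opposite in sign.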

Table 43 Coefficients and correlation matrix for the \(N^+ =N^0=3\) z-expansion of the \(B_s\rightarrow K\) form factors \(f_+\) and \(f_0\)

The outcome of the five-parameter \(N^+ = N^0 = 3\) BCL fit, which we quote as our preferred result, is shown in Table 43. The uncertainties on \(a_0\) and \(a_1\) encompass the central values obtained from \({\mathcal {O}}(z^2)\) fits, and thus adequately reflect the systematic uncertainty on those series coefficients.Footnote 52 These can be used as the averaged FLAG results for the lattice-computed form factors \(f_+(q^2)\) and \(f_0(q^2)\). The coefficient \(a_3^+\) can be obtained from the values for \(a_0^+\)–\(a_2^+\) using Eq. (447). The fit is illustrated in Fig. 28.

Fig. 28

The form factors \((1 - q^2/m_{B^*}^2) f_+(q^2)\) and \((1 - q^2/m_{B^*(0+)}^2) f_0(q^2)\) for \(B_s \rightarrow K\ell \nu \) plotted versus z. (See text for a discussion of the data sets.) The grey and salmon bands display our preferred \(N^+=N^0=3\) BCL fit (five parameters) to the plotted data with errors

8.3.3 Form factors for rare and radiative B-semileptonic decays to light flavours

Lattice-QCD input is also available for some exclusive semileptonic decay channels involving neutral-current \(b\rightarrow q\) transitions at the quark level, where \(q=d,s\). Being forbidden at tree level in the SM, these processes allow for stringent tests of potential new physics; simple examples are \(B\rightarrow K^*\gamma \), \(B\rightarrow K^{(*)}\ell ^+\ell ^-\), or \(B\rightarrow \pi \ell ^+\ell ^-\) where the B meson (and therefore the light meson in the final state) can be either neutral or charged.

The corresponding SM effective weak Hamiltonian is considerably more complicated than the one for the tree-level processes discussed above: after integrating out the top and the W boson, as many as ten dimension-six operators formed by the product of two hadronic currents or one hadronic and one leptonic current appear.Footnote 53 Three of the latter, coming from penguin and box diagrams, dominate at short distances and have matrix elements that, up to small QED corrections, are given entirely in terms of \(B\rightarrow (\pi ,K,K^*)\) form factors. The matrix elements of the remaining seven operators can be expressed, up to power corrections whose size is still unclear, in terms of form factors, decay constants and light-cone distribution amplitudes (for the \(\pi \), K, \(K^*\) and B mesons) by employing OPE arguments (at large di-lepton invariant mass) and results from Soft Collinear Effective Theory (at small di-lepton invariant mass). In conclusion, the most important contributions to all of these decays are expected to come from matrix elements of current operators (vector, tensor, and axial-vector) between one-hadron states, which in turn can be parameterized in terms of a number of form factors (see Ref. [589] for a complete description).

Table 44 Results for the \(B \rightarrow K\) semileptonic form factors
Table 45 Coefficients and correlation matrix for the \(N^+ =N^0=3\) z-expansion of the \(B\rightarrow \pi \) form factor \(f_T\)

In channels with pseudoscalar mesons in the final state, the level of sophistication of lattice calculations is similar to the \(B\rightarrow \pi \) case and there are results for the vector, scalar, and tensor form factors for \(B\rightarrow K\ell ^+\ell ^-\) decays by HPQCD [590], and more recent results for both \(B\rightarrow \pi \ell ^+\ell ^-\) [592] and \(B\rightarrow K\ell ^+\ell ^-\) [591] from FNAL/MILC. Full details about these two calculations are provided in Table 44 and in the tables in Appendix B.6.4. Both computations employ MILC \(N_f=2\,+\,1\) asqtad ensembles. HPQCD [593] and FNAL/MILC [594] also have companion papers in which they calculate the Standard Model predictions for the differential branching fractions and other observables and compare to experiment. The HPQCD computation employs NRQCD b quarks and HISQ valence light quarks, and parameterizes the form factors over the full kinematic range using a model-independent z-expansion as in Appendix A.5, including the covariance matrix of the fit coefficients. The two (separate) FNAL/MILC computations both use Fermilab b quarks and asqtad light quarks, and a BCL z-parameterization of the form factors.

Reference [592] includes results for the tensor form factor for \(B\rightarrow \pi \ell ^+\ell ^-\) not included in previous publications on the vector and scalar form factors [577]. Nineteen ensembles from four lattice spacings are used to control continuum and chiral extrapolations. The results for the \(N_z=4\) z-expansion of the tensor form factor and its correlations with the expansions for the vector and scalar form factors, which we consider the FLAG estimate, are shown in Table 45. Partial decay widths for decay into light leptons or \(\tau ^+\tau ^-\) are presented as a function of \(q^2\). The former is compared with results from LHCb [595], while the latter is a prediction.

The averaging of the HPQCD and FNAL/MILC results for the \(B\rightarrow K\) form factors is similar to our treatment of the \(B\rightarrow \pi \) and \(B_s\rightarrow K\) form factors. In this case, even though the statistical uncertainties are partially correlated because of some overlap between the adopted sets of MILC ensembles, we choose to treat the two calculations as independent. The reason is that, in \(B\rightarrow K\), statistical uncertainties are subdominant and cannot be easily extracted from the results presented by HPQCD and FNAL/MILC. Both collaborations provide only the outcome of a simultaneous z-fit to the vector, scalar and tensor form factors, which we use to generate appropriate synthetic data. We then impose the kinematic constraint \(f_+(q^2=0) = f_0(q^2=0)\) and fit to an \(N^+ = N^0 = N^T = 3\) BCL parameterization. The functional forms of the form factors that we use are identical to those adopted in Ref. [594].Footnote 54 The results of the fit are presented in Table 46. The fit is illustrated in Fig. 29. Note that the average for the \(f_T\) form factor appears to prefer the FNAL/MILC synthetic data. This happens because we perform a correlated fit of the three form factors simultaneously (both FNAL/MILC and HPQCD present covariance matrices that include correlations between all form factors). We checked that the average for the \(f_T\) form factor, obtained neglecting correlations with \(f_0\) and \(f_+\), is a little lower and lies in between the two data sets. There is still a noticeable tension between the FNAL/MILC and HPQCD data for the tensor form factor; indeed, a standalone fit to these data results in \(\chi ^2_{\mathrm{red}}=7.2/3\), while a similar standalone joint fit to \(f_+\) and \(f_0\) has \(\chi ^2_{\mathrm{red}}=7.3/7\). Finally, the global fit that is shown in the figure has \(\chi ^2_{\mathrm{red}}=16.4/10\).

Table 46 Coefficients and correlation matrix for the \(N^+ =N^0=N^T=3\) z-expansion of the \(B\rightarrow K\) form factors \(f_+\), \(f_0\) and \(f_T\)
Fig. 29

The \(B\rightarrow K\) form factors \((1 - q^2/m_{B^*}^2) f_+(q^2)\), \((1 - q^2/m_{B^*(0+)}^2) f_0(q^2)\) and \((1 - q^2/m_{B^*}^2) f_T(q^2)\) plotted versus z. (See text for a discussion of the data sets.) The grey, salmon and blue bands display our preferred \(N^+=N^0=N^T=3\) BCL fit (eight parameters) to the plotted data with errors

Lattice computations of form factors in channels with a vector meson in the final state face extra challenges with respect to the case of a pseudoscalar meson: the state is unstable, and the extraction of the relevant matrix element from correlation functions is significantly more complicated; \(\chi \)PT cannot be used as a guide to extrapolate results at unphysically heavy pion masses to the chiral limit. While field-theory procedures to take resonance effects into account are available [300,301,302, 597,598,599,600,601,602], they have not yet been implemented in the existing preliminary computations, which therefore suffer from uncontrolled systematic errors in calculations of weak decay form factors into unstable vector meson final states, such as the \(K^*\) or \(\rho \) mesons.Footnote 55

As a consequence of the complexity of the problem, the level of maturity of these computations is significantly below the one present for pseudoscalar form factors. Therefore, we will only provide below a short guide to the existing results.

Concerning channels with vector mesons in the final state, Horgan et al. have obtained the seven form factors governing \(B \rightarrow K^* \ell ^+ \ell ^-\) (as well as those for \(B_s \rightarrow \phi \, \ell ^+ \ell ^-\)) in Ref. [603] using NRQCD b quarks and asqtad staggered light quarks. In this work, they use a modified z-expansion to simultaneously extrapolate to the physical light-quark masses and continuum and extrapolate in \(q^2\) to the full kinematic range. As discussed in Sect. 7.2, the modified z-expansion is not based on an underlying effective theory, and the associated uncertainties have yet to be fully studied. Horgan et al. use their form-factor results to calculate the differential branching fractions and angular distributions and discuss the implications for phenomenology in a companion paper [604]. Finally, preliminary results on \(B\rightarrow K^*\ell ^+\ell ^-\) and \(B_s\rightarrow \phi \ell ^+\ell ^-\) by RBC/UKQCD have been reported in Refs. [605,606,607].

8.4 Semileptonic form factors for \(B_{(s)} \rightarrow D_{(s)} \ell \nu \) and \(B \rightarrow D^* \ell \nu \)

The semileptonic processes \( B_{(s)} \rightarrow D_{(s)} \ell \nu \) and \(B \rightarrow D^* \ell \nu \) have been studied extensively by experimentalists and theorists over the years. They allow for the determination of the CKM matrix element \(|V_{cb}|\), an extremely important parameter of the Standard Model. The matrix element \(V_{cb}\) appears in many quantities that serve as inputs to CKM unitarity triangle analyses and reducing its uncertainties is of paramount importance. For example, when \(\epsilon _K\), the measure of indirect CP violation in the neutral kaon system, is written in terms of the parameters \(\rho \) and \(\eta \) that specify the apex of the unitarity triangle, a factor of \(|V_{cb}|^4\) multiplies the dominant term. As a result, the errors coming from \(|V_{cb}|\) (and not those from \(B_K\)) are now the dominant uncertainty in the Standard Model (SM) prediction for this quantity.

The decay rate for \(B \rightarrow D\ell \nu \) can be parameterized in terms of vector and scalar form factors in the same way as, e.g., \(B\rightarrow \pi \ell \nu \), see Sect. 8.3. The decay rate for \(B \rightarrow D^*\ell \nu \) is different because the final-state hadron is spin-1. There are four form factors used to describe the vector and axial-vector current matrix elements that are needed to calculate this decay. We define the 4-velocity of the meson P as \(v_P= p_P/m_P\) and the polarization vector of the \(D^*\) as \(\epsilon \). When the lepton is light, \(\ell =e\) or \(\mu \), it is traditional to use \( w=v_B\cdot v_{D^{(*)}}\) rather than \(q^2\) as the variable upon which the form factors depend. Then the form factors \(h_V\) and \(h_{A_i}\), with \(i=1\), 2 or 3, are defined by

$$\begin{aligned} \langle D^* | V_\mu | B \rangle&= \sqrt{m_B m_{D^*}} h_V(w) \varepsilon _{\mu \nu \alpha \beta } \epsilon ^{*\nu } v_{D^*}^\alpha v_B^\beta \,, \end{aligned}$$
(225)
$$\begin{aligned} \langle D^* | A_\mu | B \rangle&= i \sqrt{m_B m_{D^*}} \left[ h_{A_1}(w) (1+w) \epsilon ^{*}_{\mu } \right. \nonumber \\&\quad \left. - h_{A_2}(w)\, \epsilon ^*\cdot v_B\, {v_B}_\mu - h_{A_3}(w)\, \epsilon ^*\cdot v_B\, {v_{D^*}}_\mu \right] . \end{aligned}$$
(226)

The differential decay rates can then be written as

$$\begin{aligned} \frac{d\Gamma _{B^-\rightarrow D^{0} \ell ^-{\bar{\nu }}}}{dw}= & {} \frac{G^2_{\mathrm{F}} m^3_{D}}{48\pi ^3}(m_B+m_{D})^2(w^2-1)^{3/2}\nonumber \\&\quad \times |\eta _\mathrm {EW}|^2|V_{cb}|^2 |{\mathcal {G}}(w)|^2, \end{aligned}$$
(227)
$$\begin{aligned} \frac{d\Gamma _{B^-\rightarrow D^{0*}\ell ^-{\bar{\nu }}}}{dw}= & {} \frac{G^2_{\mathrm{F}} m^3_{D^*}}{4\pi ^3}(m_B-m_{D^*})^2(w^2-1)^{1/2}\nonumber \\&\quad \times |\eta _\mathrm {EW}|^2|V_{cb}|^2\chi (w)|{\mathcal {F}}(w)|^2 , \end{aligned}$$
(228)

where \(w = v_B \cdot v_{D^{(*)}}\) (depending on whether the final-state meson is D or \(D^*\)) and \(\eta _\mathrm {EW}=1.0066\) is the 1-loop electroweak correction [608]. The function \(\chi (w)\) in Eq. (228) depends on the recoil w and the meson masses, and reduces to unity at zero recoil [588]. These formulas do not include terms that are proportional to the lepton mass squared, which can be neglected for \(\ell = e, \mu \). Further details of the definitions of \({{{\mathcal {F}}}}\) and \({{{\mathcal {G}}}}\) may be found, e.g., in Ref. [588]. Until recently, most unquenched lattice calculations for \(B \rightarrow D^* \ell \nu \) and \(B \rightarrow D \ell \nu \) decays focused on the form factors at zero recoil \({{{\mathcal {F}}}}^{B \rightarrow D^*}(1)\) and \({{{\mathcal {G}}}}^{B \rightarrow D}(1)\); these can then be combined with experimental input to extract \(|V_{cb}|\). The main reasons for concentrating on the zero recoil point are that (i) the decay rate then depends on a single form factor, and (ii) for \(B \rightarrow D^*\ell \nu \), there are no \({\mathcal {O}}(\Lambda _{QCD}/m_Q)\) contributions due to Luke’s theorem [609]. Further, the zero recoil form factor can be computed via a double ratio in which most of the current renormalization cancels and heavy-quark discretization errors are suppressed by an additional power of \(\Lambda _{QCD}/m_Q\). Recent work on \(B \rightarrow D^{(*)}\ell \nu \) transitions has started to explore the dependence of the relevant form factors on the momentum transfer, using a similar methodology to the one employed in \(B\rightarrow \pi \ell \nu \) transitions; we refer the reader to Sect. 8.3 for a detailed discussion.

Early computations of the form factors for \(B \rightarrow D\ell \nu \) decays include \(N_f=2\,+\,1\) results by FNAL/MILC [610, 611] for \({{{\mathcal {G}}}}^{B \rightarrow D}(1)\) and the \(N_f=2\) study by Atoui et al. [612], which, in addition to providing \(\mathcal{G}^{B \rightarrow D}(1)\), explored the \(w>1\) region. This latter work also provided the first results for \(B_s \rightarrow D_s\ell \nu \) amplitudes, again including information about the momentum-transfer dependence; this will allow for an independent determination of \(|V_{cb}|\) as soon as experimental data are available for these transitions. The first published unquenched results for \({{{\mathcal {F}}}}^{B \rightarrow D^*}(1)\), obtained by FNAL/MILC, date from 2008 [613]. In 2014 and 2015, significant progress was achieved in \(N_f=2\,+\,1\) computations: the FNAL/MILC value for \({{{\mathcal {F}}}}^{B \rightarrow D^*}(1)\) was updated in Ref. [614], and full results for \(B \rightarrow D\ell \nu \) at \(w \ge 1\) were published by FNAL/MILC [615] and HPQCD [616]. These works also provided full results for the scalar form factor, allowing analysis of the decay with a final-state \(\tau \). Since the previous version of this review, there are new results for \(B_s \rightarrow D_s\ell \nu \) form factors over the full kinematic range for \(N_f=2\,+\,1\) from HPQCD [617, 618], and for \({{B}_{(s)}\rightarrow D_{(s)}^{*}\ell {\nu }}\) form factors at zero recoil with \(N_f=2\,+\,1\,+\,1\) also from HPQCD [619, 620]. There has also been significant progress on heavy-baryon decay. Reference [621] calculates the tensor form factors for the decay \({\Lambda }_b\rightarrow {\Lambda }_c\tau {{\overline{\nu }}}_{\tau }\) and considers the phenomenological implications.

In the discussion below, we mainly concentrate on the latest generation of results, which supersedes previous \(N_f=2\,+\,1\) determinations and allows for an extraction of \(|V_{cb}|\) that incorporates information about the \(q^2\)-dependence of the decay rate (cf. Sect. 8.8).

8.4.1 \(B_{(s)} \rightarrow D_{(s)}\) decays

We will first discuss the \(N_f=2\,+\,1\) computations of \(B \rightarrow D \ell \nu \) by FNAL/MILC and HPQCD mentioned above, both based on MILC asqtad ensembles. Full details about all the computations are provided in Table 48 and in the tables in Appendix B.6.5.

The FNAL/MILC study [615] employs ensembles at four values of the lattice spacing ranging between approximately 0.045 and \(0.12~\mathrm{fm}\), and several values of the light-quark mass corresponding to pions with RMS masses ranging between 260 and \(670~\mathrm{MeV}\) (with just one ensemble with \(M_\pi ^{\mathrm{RMS}} \simeq 330~\mathrm{MeV}\) at the finest lattice spacing). The b and c quarks are treated using the Fermilab approach. The quantities directly studied are the form factors \(h_\pm \) defined by

$$\begin{aligned} \frac{\langle D(p_D)| i{{\bar{c}}} \gamma _\mu b| B(p_B)\rangle }{\sqrt{m_D m_B}}= & {} h_+(w)(v_B+v_D)_\mu \,\nonumber \\&+\,h_-(w)(v_B-v_D)_\mu \,, \end{aligned}$$
(229)

which are related to the standard vector and scalar form factors by

$$\begin{aligned} f_+(q^2)= & {} \frac{1}{2\sqrt{r}}\,\left[ (1+r)h_+(w)-(1-r)h_-(w)\right] \,, \end{aligned}$$
(230)
$$\begin{aligned} f_0(q^2)= & {} \sqrt{r}\left[ \frac{1+w}{1+r}\,h_+(w)\,+\,\frac{1-w}{1-r}\,h_-(w)\right] \,, \end{aligned}$$
(231)

with \(r=m_D/m_B\). (Recall that \(q^2=(p_B-p_D)^2=m_B^2\,+\,m_D^2-2 w m_B m_D\).) The hadronic form factor relevant for experiment, \({\mathcal {G}}(w)\), is then obtained from the relation \({\mathcal {G}}(w)=\sqrt{4r}\,f_+(q^2)/(1+r)\). The form factors are obtained from double ratios of three-point functions in which the flavour-conserving current renormalization factors cancel. The remaining matching factor \(\rho _{V^\mu _{cb}}\) is estimated with 1-loop lattice perturbation theory. In order to obtain \(h_\pm (w)\), a joint continuum-chiral fit is performed to an ansatz that contains the light-quark mass and lattice-spacing dependence predicted by next-to-leading order HMrS\(\chi \)PT, and the leading dependence on \(m_c\) predicted by the heavy-quark expansion (\(1/m_c^2\) for \(h_+\) and \(1/m_c\) for \(h_-\)). The w-dependence, which allows for an interpolation in w, is given by analytic terms up to \((1-w)^2\), as well as a contribution from the logarithm proportional to \(g^2_{D^*D\pi }\). The total resulting systematic error is \(1.2\%\) for \(f_+\) and \(1.1\%\) for \(f_0\). This dominates the final error budget for the form factors. After \(f_+\) and \(f_0\) have been determined as functions of w within the interval of values of \(q^2\) covered by the computation, synthetic data points are generated to be subsequently fitted to a z-expansion of the BGL form, cf. Sect. 8.3, with pole factors set to unity. This in turn enables one to determine \(|V_{cb}|\) from a joint fit of this z-expansion and experimental data. The value of the zero-recoil form factor resulting from the z-expansion is
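The conversion from \(h_\pm (w)\) to \(f_+\) and \(f_0\) in Eqs. (230) and (231) can be sketched in Python as follows. The meson masses and the values of \(h_\pm \) are hypothetical inputs chosen only for illustration; the sketch checks that the two form factors coincide at the recoil value for which \(q^2=0\), as the kinematic constraint requires.

```python
import math


def q2_from_w(w, m_b, m_d):
    """Momentum transfer squared as a function of the recoil w."""
    return m_b ** 2 + m_d ** 2 - 2.0 * w * m_b * m_d


def f_plus(h_plus, h_minus, r):
    """Vector form factor from h_+ and h_-, following Eq. (230)."""
    return ((1 + r) * h_plus - (1 - r) * h_minus) / (2 * math.sqrt(r))


def f_zero(h_plus, h_minus, r, w):
    """Scalar form factor from h_+ and h_-, following Eq. (231)."""
    return math.sqrt(r) * (
        (1 + w) / (1 + r) * h_plus + (1 - w) / (1 - r) * h_minus
    )


# Illustrative masses in GeV and hypothetical form-factor values.
M_B, M_D = 5.2796, 1.8696
R = M_D / M_B
W_MAX = (1 + R ** 2) / (2 * R)      # recoil at which q^2 = 0
H_PLUS, H_MINUS = 0.60, 0.02

F_PLUS_AT_Q2_ZERO = f_plus(H_PLUS, H_MINUS, R)
F_ZERO_AT_Q2_ZERO = f_zero(H_PLUS, H_MINUS, R, W_MAX)
```

The equality of the two values holds for any \(h_\pm \), because \(1\pm w\) evaluated at \(q^2=0\) reduces to \(\pm (1\pm r)^2/(2r)\).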

$$\begin{aligned} {{{\mathcal {G}}}}^{B \rightarrow D}(1)= 1.054(4)_{\mathrm{stat}}(8)_{\mathrm{sys}}\,. \end{aligned}$$
(232)

The HPQCD computations [616, 618] use ensembles at two values of the lattice spacing, \(a=0.09,~0.12~\mathrm{fm}\), and two and three values of light-quark masses, respectively. The b quark is treated using NRQCD, while for the c quark the HISQ action is used. The form factors studied, extracted from suitable three-point functions, are

$$\begin{aligned} \langle D_{(s)}(p_{D_{(s)}})| V^0 | B_{(s)}\rangle= & {} \sqrt{2M_{B_{(s)}}}f^{(s)}_\parallel \,,~~~~~~~~ \nonumber \\ \langle D_{(s)}(p_{D_{(s)}})| V^k | B_{(s)}\rangle= & {} \sqrt{2M_{B_{(s)}}}p^k_{D_{(s)}} f^{(s)}_\perp \,, \end{aligned}$$
(233)

where \(V_\mu \) is the relevant vector current and the \(B_{(s)}\) rest frame is assumed. The standard vector and scalar form factors are retrieved as

$$\begin{aligned} f^{(s)}_+= & {} \frac{1}{\sqrt{2M_{B_{(s)}}}}f^{(s)}_\parallel \,\nonumber \\&+\, \frac{1}{\sqrt{2M_{B_{(s)}}}}\left( M_{B_{(s)}}-E_{D_{(s)}}\right) f^{(s)}_\perp \,, \end{aligned}$$
(234)
$$\begin{aligned} f^{(s)}_0= & {} \frac{\sqrt{2M_{B_{(s)}}}}{M_{B_{(s)}}^2-M_{D_{(s)}}^2}\left[ \left( M_{B_{(s)}}-E_{D_{(s)}}\right) f^{(s)}_\parallel \nonumber \right. \\&\left. +\left( E_{D_{(s)}}^2-M_{D_{(s)}}^2\right) f^{(s)}_\perp \right] \,. \end{aligned}$$
(235)

The currents in the effective theory are matched at 1-loop to their continuum counterparts. Results for the form factors are then fitted to a modified BCL z-expansion ansatz, that takes into account simultaneously the lattice spacing, light-quark masses, and \(q^2\)-dependence. For the mass dependence NLO chiral logarithms are included, in the form obtained in hard-pion \(\chi \)PT. As in the case of the FNAL/MILC computation, once \(f_+\) and \(f_0\) have been determined as functions of \(q^2\), \(|V_{cb}|\) can be determined from a joint fit of this z-expansion and experimental data. The works quote for the zero-recoil vector form factor the result

$$\begin{aligned} {{{\mathcal {G}}}}^{B \rightarrow D}(1)=1.035(40)\,,\quad {{{\mathcal {G}}}}^{B_s \rightarrow D_s}(1)=1.068(4)\,. \end{aligned}$$
(236)

The HPQCD and FNAL/MILC results for \(B\rightarrow D\) differ by less than half a standard deviation (assuming they are uncorrelated, which they are not as some of the ensembles are common) primarily because of the lower precision of the former result. The HPQCD central value is smaller than the FNAL/MILC value by 1.8 FNAL/MILC standard deviations. The dominant sources of error in the \(|V_{cb}|\) determination by HPQCD are discretization effects and the systematic uncertainty associated with the perturbative matching.

Table 47 Coefficients and correlation matrix for the \(N^+ =N^0=3\) z-expansion of the \(B\rightarrow D\) form factors \(f_+\) and \(f_0\)

In order to combine the form-factor determinations of HPQCD and FNAL/MILC into a lattice average, we proceed in a similar way as with \(B\rightarrow \pi \ell \nu \) and \(B_s\rightarrow K\ell \nu \) above. FNAL/MILC quotes synthetic values for the form factors at three values of w (or, alternatively, \(q^2\)) with a full correlation matrix, which we take directly as input. In the case of HPQCD, we use their preferred modified z-expansion parameterization to produce synthetic values of the form factors at two different values of \(q^2\). This leaves us with a total of five data points in the kinematical range \(w\in [1.00,1.11]\). As in the case of \(B\rightarrow \pi \ell \nu \), we conservatively assume a 100% correlation of statistical uncertainties between HPQCD and FNAL/MILC. We then fit this data set to a BCL ansatz, using \(t_+=(M_{B^0}+M_{D^\pm })^2 \simeq 51.12~\,\mathrm {GeV}^2\) and \(t_0=(M_{B^0}+M_{D^\pm })(\sqrt{M_{B^0}}-\sqrt{M_{D^\pm }})^2 \simeq 6.19~\,\mathrm {GeV}^2\). In our fits, pole factors have been set to unity – i.e., we do not take into account the effect of sub-threshold poles, which is then implicitly absorbed into the series coefficients. The reason for this is our imperfect knowledge of the relevant resonance spectrum in this channel, which does not allow us to decide the precise number of poles needed.Footnote 56 This in turn implies that unitarity bounds do not rigorously apply, which has to be taken into account when interpreting the results (cf. Appendix A.5).

With a procedure similar to what we adopted for the \(B\rightarrow \pi \) and \(B_s\rightarrow K\) cases, we impose the kinematic constraint at \(q^2=0\) by expressing the \(a^0_{N^0-1}\) coefficient in the z-expansion of \(f_0\) in terms of all other coefficients. As mentioned above, FNAL/MILC provides synthetic data for \(f_+\) and \(f_0\) including correlations; HPQCD presents the result of simultaneous z-fits to the two form factors including all correlations, thus enabling us to generate a complete set of synthetic data for \(f_+\) and \(f_0\). Since both calculations are based on MILC ensembles, we then reconstruct the off-diagonal HPQCD-FNAL/MILC entries of the covariance matrix by conservatively assuming that statistical uncertainties are 100% correlated. The Fermilab/MILC (HPQCD) statistical error is 58% (31%) of the total error for every \(f_+\) value, and 64% (49%) for every \(f_0\) one. Using this information we can easily build the off-diagonal block of the overall covariance matrix (e.g., the covariance between \([f_+(q_1^2)]_{\mathrm{FNAL}}\) and \([f_0(q_2^2)]_\mathrm{HPQCD}\) is \((\delta [f_+(q_1^2)]_{\mathrm{FNAL}} \times 0.58)\; (\delta [f_0(q_2^2)]_{\mathrm{HPQCD}} \times 0.49)\), where \(\delta f\) is the total error).
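The construction of the off-diagonal block can be sketched in Python. The statistical fractions are those quoted above; the total errors assigned to the synthetic data points are hypothetical placeholders, since the actual values are not listed in the text.

```python
# Fraction of the total error that is statistical, as quoted in the text.
STAT_FRACTION = {
    "FNAL/MILC": {"f_plus": 0.58, "f_zero": 0.64},
    "HPQCD": {"f_plus": 0.31, "f_zero": 0.49},
}


def statistical_errors(points, collaboration):
    """Scale each total error by the statistical fraction of its form factor."""
    fractions = STAT_FRACTION[collaboration]
    return [total_error * fractions[kind] for kind, total_error in points]


def cross_covariance(points_a, collab_a, points_b, collab_b):
    """Off-diagonal covariance block for 100% correlated statistical errors."""
    stat_a = statistical_errors(points_a, collab_a)
    stat_b = statistical_errors(points_b, collab_b)
    return [[sa * sb for sb in stat_b] for sa in stat_a]


# Hypothetical total errors (form-factor label, delta f); not from any paper.
FNAL_POINTS = [
    ("f_plus", 0.012), ("f_plus", 0.014), ("f_plus", 0.017),
    ("f_zero", 0.009), ("f_zero", 0.010), ("f_zero", 0.011),
]
HPQCD_POINTS = [
    ("f_plus", 0.040), ("f_plus", 0.045),
    ("f_zero", 0.030), ("f_zero", 0.033),
]

BLOCK = cross_covariance(FNAL_POINTS, "FNAL/MILC", HPQCD_POINTS, "HPQCD")
```

Here `BLOCK[0][2]` pairs the first FNAL/MILC \(f_+\) point with the first HPQCD \(f_0\) point, reproducing the product \((\delta f_+ \times 0.58)(\delta f_0 \times 0.49)\) given as an example in the text. The diagonal blocks would come from the covariance matrices published by each collaboration.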

For our central value, we choose an \(N^+ =N^0=3\) BCL fit, shown in Table 47. The coefficient \(a_3^+\) can be obtained from the values for \(a_0^+\)–\(a_2^+\) using Eq. (447). The fit is illustrated in Fig. 30.

Fig. 30

The form factors \(f_+(q^2)\) and \(f_0(q^2)\) for \(B \rightarrow D\ell \nu \) plotted versus z. (See text for a discussion of the data sets.) The grey and salmon bands display our preferred \(N^+=N^0=3\) BCL fit (five parameters) to the plotted data with errors

Reference [612] is the only existing \(N_f=2\) work on \(B \rightarrow D\ell \nu \) transitions, and it furthermore provided the first available results for \(B_s \rightarrow D_s\ell \nu \). This computation uses the publicly available ETM configurations obtained with the twisted-mass QCD action at maximal twist. Four values of the lattice spacing, ranging between 0.054 and \(0.098~\mathrm{fm}\), are considered, with physical box lengths ranging between 1.7 and \(2.7~\mathrm{fm}\). At two values of the lattice spacing two different physical volumes are available. Charged-pion masses range between \(\approx 270\) and \(\approx 490~\mathrm{MeV}\), with two or three masses available per lattice spacing and volume, save for the \(a \approx 0.054~\mathrm{fm}\) point at which only one light mass is available for each of the two volumes. The strange- and heavy-valence quarks are also treated with maximally twisted-mass QCD.

Table 48 Lattice results for the \(B_{(s)} \rightarrow D_{(s)}^{(*)} \ell \nu \) semileptonic form factors and \(R(D_{(s)})\)

The quantities of interest are again the form factors \(h_\pm \) defined above. In order to control discretization effects from the heavy quarks, a strategy similar to the one employed by the ETM collaboration in their studies of B-meson decay constants (cf. Sect. 8.1) is employed: the value of \({{{\mathcal {G}}}}(w)\) is computed at a fixed value of \(m_c\) and several values of a heavier quark mass \(m_h^{(k)}=\lambda ^k m_c\), where \(\lambda \) is a fixed scaling parameter, and step-scaling functions are built as

$$\begin{aligned} \Sigma _k(w) = \frac{{{{\mathcal {G}}}}(w,\lambda ^{k+1} m_c,m_c,a^2)}{\mathcal{G}(w,\lambda ^k m_c,m_c,a^2)}\,. \end{aligned}$$
(237)

Each ratio is extrapolated to the continuum limit, \(\sigma _k(w)=\lim _{a \rightarrow 0}\Sigma _k(w)\). One then exploits the fact that the \(m_h \rightarrow \infty \) limit of the step-scaling is fixed – in particular, it is easy to find from the heavy-quark expansion that \(\lim _{m_h\rightarrow \infty }\sigma (1)=1\). In this way, the physical result at the b-quark mass can be reached by interpolating \(\sigma (w)\) between the charm region (where the computation can be carried out with controlled systematics) and the known static limit value.

In practice, the values of \(m_c\) and \(m_s\) are fixed at each value of the lattice spacing such that the experimental kaon and \(D_s\) masses are reached at the physical point, as determined in Ref. [622]. For the scaling parameter, \(\lambda =1.176\) is chosen, and eight scaling steps are performed, reaching \(m_h/m_c=1.176^9\simeq 4.30\), approximately corresponding to the ratio of the physical b- and c-masses in the \(\overline{\mathrm{MS}}\) scheme at \(2~\mathrm{GeV}\). All observables are obtained from ratios that do not require (re)normalization. The ansatz for the continuum and chiral extrapolation of \(\Sigma _k\) contains a constant and linear terms in \(m_{\mathrm{sea}}\) and \(a^2\). Twisted boundary conditions in space are used for valence-quark fields for better momentum resolution. Applying this strategy the form factors are finally obtained at four reference values of w between 1.004 and 1.062, and, after a slight extrapolation to \(w=1\), the result is quoted

$$\begin{aligned} {{{\mathcal {G}}}}^{B_s \rightarrow D_s}(1) = 1.052(46)\,. \end{aligned}$$
(238)
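The logic of the step-scaling chain described above can be illustrated with a short Python sketch. The mass ladder uses the quoted scaling parameter and is taken up to \(\lambda ^9\) so as to reproduce the quoted ratio \(m_h/m_c\simeq 4.30\); how the individual steps are counted in the original analysis is not reproduced here. The function `toy_form_factor` is a hypothetical smooth mass dependence, introduced only to show that the product of successive ratios telescopes to the value at the heaviest mass.

```python
LAMBDA = 1.176     # scaling parameter quoted in the text
HIGHEST_POWER = 9  # chosen to reproduce m_h/m_c = 1.176^9


def heavy_mass_ladder(lam, highest_power):
    """Mass ratios m_h/m_c = lam^k for k = 0 .. highest_power."""
    return [lam ** k for k in range(highest_power + 1)]


def toy_form_factor(mass_ratio):
    """Hypothetical G(1) as a function of m_h/m_c; not a lattice result."""
    return 1.0 + 0.05 * (1.0 - 1.0 / mass_ratio)


def step_scaling_ratios(form_factor, ladder):
    """Ratios of the form factor at successive heavy-quark masses."""
    return [
        form_factor(ladder[k + 1]) / form_factor(ladder[k])
        for k in range(len(ladder) - 1)
    ]


LADDER = heavy_mass_ladder(LAMBDA, HIGHEST_POWER)
SIGMAS = step_scaling_ratios(toy_form_factor, LADDER)

# Chain the ratios, starting from the value at the charm mass.
G_AT_HEAVIEST = toy_form_factor(LADDER[0])
for sigma in SIGMAS:
    G_AT_HEAVIEST *= sigma
```

In the actual computation each ratio is first extrapolated to the continuum and then interpolated in the heavy-quark mass using the known static limit, steps that this toy example omits.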

The authors also provide values for the form factor relevant for the meson states with light-valence quarks, obtained from a similar analysis to the one described above for the \(B_s\rightarrow D_s\) case. Values are quoted from fits with and without a linear \(m_\mathrm{sea}/m_s\) term in the chiral extrapolation. The result in the former case, which safely covers systematic uncertainties, is

$$\begin{aligned} {{{\mathcal {G}}}}^{B \rightarrow D}(1)=1.033(95)\,. \end{aligned}$$
(239)

Given the identical strategy, and the small sensitivity of the ratios used in their method to the light valence- and sea-quark masses, we assign this result the same ratings in Table 48 as those for their calculation of \({{{\mathcal {G}}}}^{B_s \rightarrow D_s}(1)\). Currently the precision of this calculation is not competitive with that of \(N_f=2\,+\,1\) works, but this is due largely to the small number of configurations analysed by Atoui et al. The viability of their method has been clearly demonstrated, however, which leaves significant room for improvement on the errors of both the \(B \rightarrow D\) and \(B_s \rightarrow D_s\) form factors with this approach, either by including additional two-flavour data or by analysing more recent ensembles with \(N_f>2\).

Finally, Atoui et al. also study the scalar and tensor form factors, as well as the momentum-transfer dependence of \(f_{+,0}\). The value of the ratio \(f_0(q^2)/f_+(q^2)\) is provided at a reference value of \(q^2\) as a proxy for the slope of \({{{\mathcal {G}}}}(w)\) around the zero-recoil limit.

8.4.2 Ratios of \(B\rightarrow D\ell \nu \) form factors

The availability of results for the scalar form factor \(f_0\) for \(B\rightarrow D\ell \nu \) amplitudes allows us to study interesting observables that involve the decay in the \(\tau \) channel. One such quantity is the ratio

$$\begin{aligned} R(D) = {{{\mathcal {B}}}}(B \rightarrow D \tau \nu ) / {{{\mathcal {B}}}}(B \rightarrow D \ell \nu )\quad \text{ with }\;\ell =e,\mu \,,\nonumber \\ \end{aligned}$$
(240)

which is sensitive to \(f_0\), and can be accurately determined by experiment.Footnote 57 Indeed, the recent availability of experimental results for R(D) has made this quantity particularly relevant in the search for possible physics beyond the Standard Model. Both FNAL/MILC and HPQCD provide values for R(D) from their recent form factor computations, discussed above. The values quoted by FNAL/MILC and HPQCD are

$$\begin{aligned} R(D)= & {} 0.299(11)\,\,\mathrm {Ref.}~[615]\nonumber \\ R(D)= & {} 0.300(8)\,\,\mathrm {Ref.}~\text{[616] }. \end{aligned}$$
(241)

These results are in excellent agreement, and can be averaged (using the same considerations for the correlation between the two computations as we did in the averaging of form factors) into

$$\begin{aligned} R(D) = 0.300(8)\,,\quad \text{ our } \text{ average. } \end{aligned}$$
(242)

This result is about \(2.3\sigma \) lower than the current experimental average for this quantity. It has to be stressed that achieving this level of precision critically depends on the reliability with which the low-\(q^2\) region is controlled by the parameterizations of the form factors. It is also worth mentioning that if experimental data for \(B \rightarrow D\ell \nu \) are used to further constrain R(D) as part of a global fit, it is possible to decrease the error substantially, cf. the value \(R(D)=0.299(3)\) quoted in [623].

HPQCD also computes a new value for \(R(D_s)\), the analog of R(D), with both heavy–light mesons containing a strange quark [618]:

$$\begin{aligned} R(D_s) = 0.301(6). \end{aligned}$$
(243)

Another area of immediate interest in searches for physics beyond the Standard Model is the measurement of \(B_s \rightarrow \mu ^+ \mu ^-\) decays, recently studied at the LHC.Footnote 58 In addition to the \(B_s\) decay constant (see Sect. 8.1), one of the hadronic inputs required by the LHCb analysis is the ratio of \(B_q\) meson (\(q = d,s\)) fragmentation fractions \(f_s / f_d\). A dedicated \(N_f=2\,+\,1\) study by FNAL/MILCFootnote 59 [624] addresses the ratios of scalar form factors \(f_0^{(q)}(q^2)\), and quotes:

$$\begin{aligned} f_0^{(s)}\left( M_\pi ^2\right) / f_0^{(d)}\left( M_K^2\right)= & {} 1.046(44)(15), \qquad \nonumber \\ f_0^{(s)}\left( M_\pi ^2\right) / f_0^{(d)}\left( M_\pi ^2\right)= & {} 1.054(47)(17), \end{aligned}$$
(244)

where the first error is statistical and the second systematic. The more recent results from HPQCD [618] are:

$$\begin{aligned} f_0^{(s)}\left( M_\pi ^2\right) / f_0^{(d)}\left( M_K^2\right)= & {} 1.000(62), \qquad \nonumber \\ f_0^{(s)}\left( M_\pi ^2\right) / f_0^{(d)}\left( M_\pi ^2\right)= & {} 1.006(62). \end{aligned}$$
(245)

Results from both groups lead to fragmentation fraction ratios \(f_s/f_d\) that are consistent with LHCb’s measurements via other methods [625].

8.4.3 \(B \rightarrow D^*\) decays

The most precise computation of the zero-recoil form factors needed for the determination of \(|V_{cb}|\) from exclusive B semileptonic decays comes from the \(B \rightarrow D^* \ell \nu \) form factor at zero recoil \({{{\mathcal {F}}}}^{B \rightarrow D^*}(1)\), calculated by the FNAL/MILC collaboration. The original computation, published in Ref. [613], has now been updated [614] by employing a much more extensive set of gauge ensembles and increasing the statistics of the ensembles originally considered, while preserving the analysis strategy. There is currently no unquenched computation of the relevant form factors at nonzero recoil.

This work uses the MILC \(N_f = 2 + 1\) ensembles. The bottom and charm quarks are simulated using the clover action with the Fermilab interpretation and light quarks are treated via the asqtad staggered fermion action. Recalling the definition of the form factors in Eq. (226), at zero recoil \({{{\mathcal {F}}}}^{B \rightarrow D^*}(1)\) reduces to a single form factor \(h_{A_1}(1)\) coming from the axial-vector current

$$\begin{aligned} \langle D^*(v,\epsilon ^\prime )| {{{\mathcal {A}}}}_\mu | {\overline{B}}(v) \rangle = i \sqrt{2m_B 2 m_{D^*}} \; {\epsilon ^\prime _\mu }^*h_{A_1}(1), \end{aligned}$$
(246)

where \(\epsilon ^\prime \) is the polarization of the \(D^*\). The form factor is accessed through a ratio of three-point correlators, viz.,

$$\begin{aligned} {{{\mathcal {R}}}}_{A_1} = \frac{\langle D^*|{\bar{c}} \gamma _j \gamma _5 b | {\overline{B}} \rangle \; \langle {\overline{B}}| {\bar{b}} \gamma _j \gamma _5 c | D^* \rangle }{\langle D^*|{\bar{c}} \gamma _4 c | D^* \rangle \; \langle {\overline{B}}| {\bar{b}} \gamma _4 b | {\overline{B}} \rangle } = |h_{A_1}(1)|^2.\nonumber \\ \end{aligned}$$
(247)

Simulation data is obtained on MILC ensembles with five lattice spacings, ranging from \(a \approx 0.15~\mathrm{fm}\) to \(a \approx 0.045~\mathrm{fm}\), and as many as five values of the light-quark masses per ensemble (though just one at the finest lattice spacing). Results are then extrapolated to the physical, continuum/chiral, limit employing staggered \(\chi \)PT.

The \(D^*\) meson is not a stable particle in QCD and decays predominantly into a D plus a pion. Nevertheless, heavy–light meson \(\chi \)PT can be applied to extrapolate lattice simulation results for the \(B\rightarrow D^*\ell \nu \) form factor to the physical light-quark mass. The \(D^*\) width is quite narrow, 0.096 MeV for the \(D^{*\pm }(2010)\) and less than 2.1 MeV for the \(D^{*0}(2007)\), making this system much more stable and long lived than the \(\rho \) or the \(K^*\) systems. The fact that the \(D^* - D\) mass difference is close to the pion mass leads to the well-known “cusp” in \(\mathcal{R}_{A_1}\) just above the physical pion mass [626,627,628]. This cusp makes the chiral extrapolation sensitive to values used in the \(\chi \)PT formulas for the \(D^*D \pi \) coupling \(g_{D^*D\pi }\). The error budget in Ref. [614] includes a separate error of 0.3% coming from the uncertainty in \(g_{D^*D \pi }\) in addition to general chiral extrapolation errors in order to take this sensitivity into account.

The final updated value presented in Ref. [614] is

$$\begin{aligned} N_{ f}=2\,+\,1: \;\; {{{\mathcal {F}}}}^{B \rightarrow D^*}(1) = 0.906(4)(12)\,, \end{aligned}$$
(248)

where the first error is statistical, and the second the sum of systematic errors added in quadrature, making up a total error of 1.4% (down from the original 2.6% of Ref. [613]). The largest systematic uncertainty comes from discretization errors followed by effects of higher-order corrections in the chiral perturbation theory ansatz.
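As a quick check of the quoted total, the two errors in Eq. (248) can be added in quadrature and divided by the central value (a worked-arithmetic sketch only, using the numbers quoted above):

$$\begin{aligned} \frac{\sqrt{0.004^2 + 0.012^2}}{0.906} \approx \frac{0.0126}{0.906} \approx 1.4\%\,. \end{aligned}$$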

Since the previous version of this review, the HPQCD collaboration has published the first study of \({{B}_{(s)}\rightarrow D_{(s)}^{*}\ell {\nu }}\) form factors at zero recoil for \(N_f=2\,+\,1\,+\,1\) using eight MILC ensembles with lattice spacings \(a\approx 0.15\), 0.12, and 0.09 fm [618]. There are three ensembles with varying light-quark masses for the two coarser lattice spacings and two choices of light-quark mass for the finest lattice spacing. In each case, there is one ensemble for which the light-quark mass is very close to the physical value. The b quark is treated using NRQCD and the light quarks are treated using the HISQ action. The resulting zero-recoil form factors are:

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1: \;\; {{\mathcal {F}}}^{B \rightarrow D^*}(1) = 0.895(10)(24)\,,~~~~ \nonumber \\&{{\mathcal {F}}}^{B_s\rightarrow D_s^*}(1) = 0.883(12)(28)\,. \end{aligned}$$
(249)

At Lattice 2018, two groups presented preliminary results for the \(B\rightarrow D^* \ell \nu \) semileptonic decay. From JLQCD, there was a poster describing their calculations for zero and nonzero recoil using \(N_f=2\,+\,1\) Möbius domain-wall ensembles. Two lattice spacings of roughly 0.079 and 0.055 fm were used with bottom-quark mass limited to 2.4 times the charm-quark mass to control the heavy-quark discretization effects. In addition, JLQCD is studying \(B\rightarrow D\ell \nu \). From FNAL/MILC there was a presentation of preliminary results using 15 \(N_f=2\,+\,1\) asqtad sea-quark ensembles with lattice spacing between approximately 0.15 and 0.045 fm. The heavy quarks are treated using the clover action with the Fermilab interpretation. In addition, HPQCD presented preliminary results for \(B_s \rightarrow D_s^{(*)}\ell \nu \) using HISQ quarks for all valence and sea quarks. The calculation uses three of MILC’s \(N_f=2\,+\,1\,+\,1\) HISQ ensembles with \(a\approx 0.09\), 0.06 and 0.045 fm. An advantage of the all-HISQ approach is that there is no need for perturbative renormalization of the axial-vector or vector current.

8.5 Semileptonic form factors for \(B_c\rightarrow \eta _c\ell \nu \) and \(B_c\rightarrow J/\psi \ell \nu \)

In 2016, preliminary results for the decays \(B_c\rightarrow \eta _c\ell \nu \) and \(B_c\rightarrow J/\psi \ell \nu \) were presented at two conferences by the HPQCD collaboration [629, 630]. The calculations use both NRQCD and HISQ actions for the valence b quark, and the HISQ action for the c quark (both valence and sea). The calculations were done using five ensembles from the MILC collaboration with \(N_f=2\,+\,1\,+\,1\) and lattice spacings between approximately 0.15 and 0.045 fm. Only ensembles with \(m_l/m_s=0.2\) are used although ones with a physical light-quark mass are available. For the HISQ formalism, a range of heavy-quark masses obeying \(am_h<0.8\) is used and an extrapolation \(m_h\rightarrow m_b\) is made. Comparison of results using NRQCD and HISQ allows an improved normalization for the NRQCD currents as the HISQ currents do not require renormalization.

8.6 Semileptonic form factors for \(\Lambda _b\rightarrow p\ell \nu \) and \(\Lambda _b\rightarrow \Lambda _c\ell \nu \)

Lattice-QCD computations for heavy-quark physics have been extended to the study of semileptonic decays of the \(\Lambda _b\) baryon, with first unquenched results away from the static limit provided in a work by Detmold, Lehner, and Meinel [631].Footnote 60 The importance of this result is that, together with a recent analysis by LHCb of the ratio of decay rates \(\Gamma (\Lambda _b\rightarrow p\ell \nu )/\Gamma (\Lambda _b\rightarrow \Lambda _c\ell \nu )\) [634], it allows for an exclusive determination of the ratio \(|V_{ub}|/|V_{cb}|\) largely independent from the outcome of different exclusive channels, thus contributing a very interesting piece of information to the existing tensions in the determination of third-column CKM matrix elements (cf. Sects. 8.7 and 8.8).

The amplitudes of the decays \(\Lambda _b\rightarrow p\ell \nu \) and \(\Lambda _b\rightarrow \Lambda _c\ell \nu \) receive contributions from both the vector and the axial components of the current in the matrix elements \(\langle p|\bar{q}\gamma ^\mu ({\mathbf {1}}-\gamma _5)b|\Lambda _b\rangle \) and \(\langle \Lambda _c|{{\bar{q}}}\gamma ^\mu ({\mathbf {1}}-\gamma _5)b|\Lambda _b\rangle \), and can be parameterized in terms of six different form factors – see, e.g., Ref. [544] for a complete description. They split into three form factors \(f_+\), \(f_0\), \(f_\perp \) in the parity-even sector, mediated by the vector component of the current, and another three form factors \(g_+,g_0,g_\perp \) in the parity-odd sector, mediated by the axial component. All of them provide contributions that are parametrically comparable.

The computation of Detmold et al. uses RBC/UKQCD \(N_f=2\,+\,1\) DWF ensembles, and treats the b and c quarks within the Columbia RHQ approach. Two values of the lattice spacing (\(a\sim 0.112,~0.085~\mathrm{fm}\)) are considered, with the absolute scale set from the \(\Upsilon (2S)\)–\(\Upsilon (1S)\) splitting. Sea pion masses lie in a narrow interval ranging from slightly above \(400~\mathrm{MeV}\) to slightly below \(300~\mathrm{MeV}\), keeping \(m_\pi L \gtrsim 4\); however, lighter pion masses are considered in the valence DWF action for the ud quarks, leading to partial quenching effects in the chiral extrapolation. More importantly, this also leads to values of \(M_{\pi ,\mathrm{min}}L\) close to 3.0 (cf. Appendix B.6.3 for details); compounded with the fact that there is only one lattice volume in the computation, an application of the FLAG criteria would lead to a ■ rating for finite-volume effects. It has to be stressed, however, that our criteria have been developed in the context of meson physics, and their application to the baryon sector is not straightforward; as a consequence, we will refrain from providing a conclusive rating of this computation for the time being.

Results for the form factors are obtained from suitable three-point functions, and fitted to a modified z-expansion ansatz that combines the \(q^2\)-dependence with the chiral and continuum extrapolations. The main results of the paper are the predictions (errors are statistical and systematic, respectively)

$$\begin{aligned} \zeta _{p\mu {{\bar{\nu }}}}(15\mathrm{GeV}^2)&\equiv \frac{1}{|V_{ub}|^2}\int _{15~\mathrm{GeV}^2}^{q^2_{\mathrm{max}}}\frac{\mathrm{d}\Gamma (\Lambda _b\rightarrow p\mu ^-{{\bar{\nu }}}_\mu )}{\mathrm{d}q^2}\,\mathrm{d}q^2 \nonumber \\&= 12.31(76)(77)~\mathrm{ps}^{-1}\,, \end{aligned}$$
(250)
$$\begin{aligned} \zeta _{\Lambda _c \mu {{\bar{\nu }}}}(7\mathrm{GeV}^2)&\equiv \frac{1}{|V_{cb}|^2}\int _{7~\mathrm{GeV}^2}^{q^2_\mathrm{max}}\frac{\mathrm{d}\Gamma (\Lambda _b\rightarrow \Lambda _c\mu ^-{{\bar{\nu }}}_\mu )}{\mathrm{d}q^2}\,\mathrm{d}q^2 \nonumber \\&= 8.37(16)(34)~\mathrm{ps}^{-1}\,, \end{aligned}$$
(251)
$$\begin{aligned} \displaystyle \frac{\zeta _{p\mu {{\bar{\nu }}}}(15\mathrm{GeV}^2)}{\zeta _{\Lambda _c \mu {{\bar{\nu }}}}(7\mathrm{GeV}^2)}&= 1.471(95)(109)\,, \end{aligned}$$
(252)

which are the input for the LHCb analysis. Predictions for the total rates in all possible lepton channels, as well as for ratios similar to R(D) (cf. Sect. 8.4) between the \(\tau \) and light-lepton channels are also available.

Since the previous version of this review, there have been three papers [621, 635, 636] extending the study of the \(\Lambda _b\) and two [541, 637] studying the \(\Lambda _c\). Reference [635] studies the rare decay \(\Lambda _b \rightarrow \Lambda \ell ^+ \ell ^-\). The lattice setup is identical, and considerations similar to those above thus apply. Furthermore, the renormalization of the tensor current is carried out adopting a mostly nonperturbative renormalization strategy, without however computing the residual renormalization factor \(\rho _{T^{\mu \nu }}\), which is set to its tree-level value. While the matching systematic uncertainty is augmented to take this fact into account, the procedure implies that the current retains an uncanceled logarithmic divergence at \({\mathcal {O}}(\alpha _s)\).

Reference [636] is an exploratory study of the decay \(\Lambda _b \rightarrow \Lambda (1520) \ell ^+ \ell ^-\) using a single gauge ensemble presented at Lattice 2016. Reference [621] includes new results for \({\Lambda }_b\rightarrow {\Lambda }_c\) for the tensor form factors. The main focus of this paper is the phenomenology of the \( {\Lambda }_b\rightarrow {\Lambda }_c\tau {{\overline{\nu }}}_{\tau } \) decay and how it can be used to limit contributions from physics beyond the Standard Model.

8.7 Determination of \(|V_{ub}|\)

We now use the lattice-determined Standard Model transition amplitudes for leptonic (Sect. 8.1) and semileptonic (Sect. 8.3) B-meson decays to obtain exclusive determinations of the CKM matrix element \(|V_{ub}|\). In this section, we describe the aspect of our work that involves experimental input for the relevant charged-current exclusive decay processes. The relevant formulae are Eqs. (191) and (222). Among leptonic channels the only input comes from \(B\rightarrow \tau \nu _\tau \), since the rates for decays to e and \(\mu \) have not yet been measured. In the semileptonic case we only consider \(B\rightarrow \pi \ell \nu \) transitions (experimentally measured for \(\ell =e,\mu \)). As discussed in Sects. 8.3 and 8.6, there are now lattice predictions for the rates of the decays \(B_s\rightarrow K\ell \nu \) and \(\Lambda _b\rightarrow p\ell \nu \); however, in the former case the process has not been experimentally measured yet, while in the latter case the only existing lattice computation does not meet FLAG requirements for controlled systematics.

We first investigate the determination of \(|V_{ub}|\) through the \(B\rightarrow \tau \nu _\tau \) transition. This is the only experimentally measured leptonic decay channel of the charged B meson. The experimental measurements of the branching fraction of this channel, \(B(B^{-} \rightarrow \tau ^{-} {\bar{\nu }})\), have not been updated since the publication of the previous FLAG report [3]. In Table 49 we summarize the current status of experimental results for this branching fraction.

Table 49 Experimental measurements for \(B(B^{-}\rightarrow \tau ^{-}{\bar{\nu }})\). The first error on each result is statistical, while the second error is systematic
Table 50 \(|V_{ub}|\), coefficients for the \(N^+ =N^0=N^T=3\) z-expansion of the \(B\rightarrow \pi \) form factors \(f_+\) and \(f_0\), and their correlation matrix

It is obvious that all the measurements listed in Table 49 have significance smaller than \(5\sigma \), and the large uncertainties are dominated by statistical errors. These measurements lead to the averages of experimental measurements for \(B(B^{-}\rightarrow \tau {\bar{\nu }})\) [549, 550],

$$\begin{aligned} B(B^{-}\rightarrow \tau {\bar{\nu }} )\times 10^4= & {} 0.91 \pm 0.22 \hbox { }{\hbox { from Belle,}} \end{aligned}$$
(253)
$$\begin{aligned}= & {} 1.79 \pm 0.48 \hbox { from } \hbox {BaBar,} \end{aligned}$$
(254)
$$\begin{aligned}= & {} 1.06 \pm 0.33 \hbox { average,} \end{aligned}$$
(255)

where, following our standard procedure we perform a weighted average and rescale the uncertainty by the square root of the reduced chi-squared. Note that the Particle Data Group [133] did not inflate the uncertainty in the calculation of the averaged branching ratio.
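The averaging step just described can be sketched in a few lines of Python. This is a minimal illustration, assuming uncorrelated inputs and that the error is inflated only when the reduced chi-squared exceeds one; the function name `weighted_average` is ours and not part of any analysis code.

```python
import math

def weighted_average(values, errors):
    """Inverse-variance weighted mean of uncorrelated inputs.

    The naive error is rescaled by sqrt(chi2/dof) when that factor
    exceeds 1 (assumption: no rescaling is applied otherwise).
    """
    weights = [1.0 / e**2 for e in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    naive_err = 1.0 / math.sqrt(wsum)
    dof = len(values) - 1
    chi2 = sum(((v - mean) / e) ** 2 for v, e in zip(values, errors))
    scale = math.sqrt(chi2 / dof) if dof > 0 else 1.0
    return mean, naive_err * max(scale, 1.0), scale

# Belle and BaBar branching fractions (x 10^4) from Eqs. (253) and (254)
mean, err, scale = weighted_average([0.91, 1.79], [0.22, 0.48])
print(f"{mean:.2f} +/- {err:.2f} (scale factor {scale:.2f})")
```

With the two inputs above, the unscaled error of about 0.20 is inflated by a factor of roughly 1.7, which reproduces the average in Eq. (255).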

Combining the results in Eqs. (253)–(255) with the experimental measurements of the mass of the \(\tau \)-lepton and the B-meson lifetime and mass we get

$$\begin{aligned} |V_{ub}| f_{B}= & {} 0.72 \pm 0.09 \hbox { MeV}\hbox { from} \hbox { Belle,} \end{aligned}$$
(256)
$$\begin{aligned}= & {} 1.01 \pm 0.14 \hbox { MeV}\hbox { from }\hbox {BaBar,} \end{aligned}$$
(257)
$$\begin{aligned}= & {} 0.77 \pm 0.12 \hbox { MeV}\hbox { average,} \end{aligned}$$
(258)

which can be used to extract \(|V_{ub}|\), viz.,

$$\begin{aligned}&N_f=2&\text{ Belle }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 3.83(14)(48) \times 10^{-3} , \end{aligned}$$
(259)
$$\begin{aligned}&N_f=2\,+\,1&\text{ Belle }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 3.75(8)(47) \times 10^{-3} , \end{aligned}$$
(260)
$$\begin{aligned}&N_f=2\,+\,1\,+\,1&\text{ Belle }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 3.79(3)(47) \times 10^{-3} ; \end{aligned}$$
(261)
$$\begin{aligned}&N_f=2&\text{ BaBar }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 5.37(20)(74) \times 10^{-3} , \end{aligned}$$
(262)
$$\begin{aligned}&N_f=2\,+\,1&\text{ BaBar }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 5.26(12)(73) \times 10^{-3} , \end{aligned}$$
(263)
$$\begin{aligned}&N_f=2\,+\,1\,+\,1&\text{ BaBar }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 5.32(4)(74) \times 10^{-3} , \end{aligned}$$
(264)
$$\begin{aligned}&N_f=2&\text{ average }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 4.10(15)(64) \times 10^{-3} , \end{aligned}$$
(265)
$$\begin{aligned}&N_f=2\,+\,1&\text{ average }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 4.01(9)(63) \times 10^{-3} , \end{aligned}$$
(266)
$$\begin{aligned}&N_f=2\,+\,1\,+\,1&\text{ average }~B\rightarrow \tau \nu _\tau :&|V_{ub}|&= 4.05(3)(64) \times 10^{-3} , \end{aligned}$$
(267)

where the first error comes from the uncertainty in \(f_B\) and the second comes from experiment.
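The chain from branching fraction to \(|V_{ub}|\) can be illustrated with the short sketch below. It assumes the standard tree-level expression for the leptonic rate, \(\Gamma = G_F^2 m_B m_\tau ^2 (1-m_\tau ^2/m_B^2)^2 f_B^2 |V_{ub}|^2/(8\pi )\), approximate PDG-like values for the physical constants, and, purely for illustration, a decay constant \(f_B = 190.0(1.3)~\mathrm{MeV}\); none of these inputs is quoted in this section, and the averages actually used are those of Sect. 8.1. The function names are hypothetical.

```python
import math

# Illustrative inputs (assumptions; approximate PDG-like values)
G_F = 1.1663787e-5      # Fermi constant, GeV^-2
M_B = 5.27934           # charged B-meson mass, GeV
M_TAU = 1.77686         # tau-lepton mass, GeV
TAU_B = 1.638e-12       # charged B-meson lifetime, s
HBAR = 6.582119569e-25  # reduced Planck constant, GeV s

def vub_times_fb(branching, d_branching):
    """Return |V_ub| f_B and its error in MeV from B(B -> tau nu)."""
    gamma = branching * HBAR / TAU_B  # partial width in GeV
    prefactor = (G_F**2 * M_B * M_TAU**2
                 * (1.0 - M_TAU**2 / M_B**2) ** 2 / (8.0 * math.pi))
    x = 1.0e3 * math.sqrt(gamma / prefactor)  # GeV -> MeV
    dx = 0.5 * x * d_branching / branching    # since x scales as sqrt(B)
    return x, dx

def vub_from_product(x, dx, f_b, d_f_b):
    """Split the error on |V_ub| = x / f_B into lattice and experimental parts."""
    central = x / f_b
    return central, central * d_f_b / f_b, dx / f_b

# Step 1: Belle branching fraction of Eq. (253)
x_belle, dx_belle = vub_times_fb(0.91e-4, 0.22e-4)
print(f"|V_ub| f_B = {x_belle:.2f} +/- {dx_belle:.2f} MeV")

# Step 2: quoted product of Eq. (256) with the assumed f_B = 190.0(1.3) MeV
vub, err_lat, err_exp = vub_from_product(0.72, 0.09, 190.0, 1.3)
print(f"|V_ub| = {vub*1e3:.2f}({err_lat*1e3:.2f})({err_exp*1e3:.2f}) x 10^-3")
```

Under these assumptions the first step comes out close to the value in Eq. (256), and the second step is consistent with Eq. (261), which suggests that the experimental error dominates because it enters through the product \(|V_{ub}|f_B\).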

Let us now turn our attention to semileptonic decays. The experimental value of \(|V_{ub}|f_+(q^2)\) can be extracted from the measured branching fractions for \(B^0\rightarrow \pi ^\pm \ell \nu \) and/or \(B^\pm \rightarrow \pi ^0\ell \nu \) applying Eq. (222);Footnote 61 \(|V_{ub}|\) can then be determined by performing fits to the constrained BCL z-parameterization of the form factor \(f_+(q^2)\) given in Eq. (448). This can be done in two ways: one option is to perform separate fits to lattice and experimental results, and extract the value of \(|V_{ub}|\) from the ratio of the respective \(a_0\) coefficients; a second option is to perform a simultaneous fit to lattice and experimental data, leaving their relative normalization \(|V_{ub}|\) as a free parameter. We adopt the second strategy, because it combines the lattice and experimental input in a more efficient way, leading to a smaller uncertainty on \(|V_{ub}|\).
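For orientation, the ingredients of such a fit can be sketched as follows. The code assumes the commonly used form of the conformal variable \(z(q^2,t_0)\) with \(t_+=(m_B+m_\pi )^2\) and \(t_0=(m_B+m_\pi )(\sqrt{m_B}-\sqrt{m_\pi })^2\), and the usual constrained BCL series for \(f_+\) with a \(B^*\) pole; the masses are approximate and the coefficients are placeholders chosen for illustration, not the fit results of Table 50.

```python
import math

# Approximate masses in GeV (assumed inputs)
M_B, M_PI, M_BSTAR = 5.280, 0.1396, 5.325
T_PLUS = (M_B + M_PI) ** 2
T_0 = (M_B + M_PI) * (math.sqrt(M_B) - math.sqrt(M_PI)) ** 2
Q2_MAX = (M_B - M_PI) ** 2

def z(q2, t0=T_0):
    """Conformal variable mapping the cut q2-plane onto the unit disc."""
    a = math.sqrt(T_PLUS - q2)
    b = math.sqrt(T_PLUS - t0)
    return (a - b) / (a + b)

def f_plus(q2, coeffs):
    """Constrained BCL form for f_+ with N = len(coeffs) free coefficients."""
    n_max = len(coeffs)
    zq = z(q2)
    series = sum(
        a_n * (zq**n - (-1) ** (n - n_max) * n / n_max * zq**n_max)
        for n, a_n in enumerate(coeffs)
    )
    return series / (1.0 - q2 / M_BSTAR**2)

# Placeholder coefficients a_0, a_1, a_2 (hypothetical, for illustration only)
coeffs = [0.4, -0.5, -0.3]
for q2 in (0.0, 10.0, 20.0, Q2_MAX):
    print(f"q2 = {q2:6.2f} GeV^2  z = {z(q2):+.3f}  f_+ = {f_plus(q2, coeffs):.3f}")
```

With this choice of \(t_0\) the semileptonic region is mapped onto a symmetric interval of small \(|z|\), which is why a handful of coefficients is typically sufficient. In a simultaneous fit, the experimental rates would be described by the same series multiplied by \(|V_{ub}|^2\) and the appropriate kinematic factors.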

The available state-of-the-art experimental input consists of five data sets: three untagged measurements by BaBar (6-bin [640] and 12-bin [641]) and Belle [642], all of which assume isospin symmetry and provide combined \(B^0\rightarrow \pi ^-\) and \(B^+\rightarrow \pi ^0\) data; and the two tagged Belle measurements of \({{\bar{B}}}^0\rightarrow \pi ^+\) (13-bin) and \(B^-\rightarrow \pi ^0\) (7-bin) [643]. Including all of them, along with the available information about cross-correlations, will allow us to obtain a meaningful final error estimate.Footnote 62 The lattice input data set will be the same as that discussed in Sect. 8.3.

We perform a constrained BCL fit of the vector and scalar form factors (this is necessary in order to take into account the \(f_+(q^2=0) = f_0 (q^2=0)\) constraint) together with the combined experimental data sets. We find that the error on \(|V_{ub}|\) stabilizes for \((N^+ = N^0 = 3)\). The result of the combined fit is presented in Table 50.

In Fig. 31 we show both the lattice and experimental data for \((1-q^2/m_{B^*}^2)f_+(q^2)\) as a function of \(z(q^2)\), together with our preferred fit; experimental data has been rescaled by the resulting value for \(|V_{ub}|^2\). It is worth noting the good consistency between the form factor shapes from lattice and experimental data. This can be quantified, e.g., by computing the ratio of the two leading coefficients in the constrained BCL parameterization: the fit to lattice form factors yields \(a_1^+/a_0^+=-1.67(35)\) (cf. the results presented in Sect. 8.3.1), while the above lattice+experiment fit yields \(a_1^+/a_0^+=-1.19(13)\).

Fig. 31
Lattice and experimental data for \((1-q^2/m_{B^*}^2)f_+^{B\rightarrow \pi }(q^2)\) and \(f_0^{B\rightarrow \pi } (q^2)\) versus z. Green symbols denote lattice-QCD points included in the fit, while blue and indigo points show experimental data divided by the value of \(|V_{ub}|\) obtained from the fit. The grey and orange bands display the preferred \(N^+ = N^0 = 3\) BCL fit (six parameters) to the lattice-QCD and experimental data with errors

We plot the values of \(|V_{ub}|\) we have obtained in Fig. 33, where the (GGOU) determination through inclusive decays by the Heavy Flavour Averaging Group (HFLAV) [221], yielding \(|V_{ub}| = 4.52(15)({}^{+11}_{-14}) \times 10^{-3}\), is also shown for comparison. In this plot the tension between the BaBar and the Belle measurements of \(B(B^{-} \rightarrow \tau ^{-} {\bar{\nu }})\) is manifest. As discussed above, it is for this reason that we do not extract \(|V_{ub}|\) through the average of results for this branching fraction from these two collaborations. In fact this means that a reliable determination of \(|V_{ub}|\) using information from leptonic B-meson decays is still absent; the situation will only clearly improve with the more precise experimental data expected from Belle II [644, 645]. The value for \(|V_{ub}|\) obtained from semileptonic B decays for \(N_f=2\,+\,1\), on the other hand, is significantly more precise than both the leptonic and the inclusive determinations, and exhibits the well-known \(\sim 3\sigma \) tension with the latter.
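As a rough cross-check of the size of this tension, one can combine the uncertainties in quadrature. This sketch assumes, as a simplification, that the two determinations are uncorrelated, takes the lower error of the inclusive value, and uses the semileptonic result \(|V_{ub}| = 3.73(14)\times 10^{-3}\) quoted in Eq. (276) as a stand-in for the exclusive determination:

$$\begin{aligned} \frac{4.52 - 3.73}{\sqrt{0.15^2 + 0.14^2 + 0.14^2}} \approx \frac{0.79}{0.25} \approx 3.2\,, \end{aligned}$$

which is in line with the \(\sim 3\sigma \) tension quoted above.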

8.8 Determination of \(|V_{cb}|\)

We will now use the lattice-QCD results for the \(B \rightarrow D^{(*)}\ell \nu \) form factors in order to obtain determinations of the CKM matrix element \(|V_{cb}|\) in the Standard Model. The relevant formulae are given in Eq. (228).

Let us summarize the lattice input that satisfies FLAG requirements for the control of systematic uncertainties, discussed in Sect. 8.4. In the (experimentally more precise) \(B\rightarrow D^*\ell \nu \) channel, there is only one \(N_f=2\,+\,1\) lattice computation of the relevant form factor \({\mathcal {F}}^{B\rightarrow D^*}\) at zero recoil. Concerning the \(B \rightarrow D\ell \nu \) channel, for \(N_f=2\) there is one determination of the relevant form factor \({\mathcal {G}}^{B\rightarrow D}\) at zero recoil;Footnote 63 while for \(N_f=2\,+\,1\) there are two determinations of the \(B \rightarrow D\) form factor as a function of the recoil parameter in roughly the lowest third of the kinematically allowed region. In this latter case, it is possible to replicate the analysis carried out for \(|V_{ub}|\) in Sect. 8.7, and perform a joint fit to lattice and experimental data; in the former, the value of \(|V_{cb}|\) has to be extracted by matching to the experimental value for \({\mathcal {F}}^{B\rightarrow D^*}(1)\eta _{\mathrm{EW}}|V_{cb}|\) and \({\mathcal {G}}^{B\rightarrow D}(1)\eta _{\mathrm{EW}}|V_{cb}|\).

Table 51 \(|V_{cb}|\), coefficients for the \(N^+ =N^0=N^T=3\) z-expansion of the \(B\rightarrow D\) form factors \(f_+\) and \(f_0\), and their correlation matrix

The latest experimental average by HFLAV [221] for the \(B\rightarrow D^*\) form factor at zero recoil makes use of the CLN [646] parameterization of the \(B\rightarrow D^*\) form factor and is

$$\begin{aligned} \left[ {\mathcal {F}}^{B\rightarrow D^*}(1)\eta _{\mathrm{EW}}|V_{cb}|\right] _{\mathrm{CLN,HFLAV}} = 35.61(43)\times 10^{-3}\,. \nonumber \\ \end{aligned}$$
(268)

Recently the Belle collaboration presented an updated measurement of the \(B\rightarrow D^* \ell \nu \) branching ratio [647] in which, as suggested in Refs. [648,649,650], the impact of the form factor parameterization has been studied by comparing the CLN [646] and BGL [651, 652] ansätze. The fit results using the two parameterizations are perfectly compatible. In light of the fact that the BGL parameterization imposes much less stringent constraints on the shape of the form factor than the CLN one we choose to focus on the BGL fit:

$$\begin{aligned} \left[ {\mathcal {F}}^{B\rightarrow D^*}(1)\eta _{\mathrm{EW}}|V_{cb}|\right] _\mathrm{BGL, \; Belle} = 34.93(23)(59)\times 10^{-3}\,, \end{aligned}$$
(269)

where the first error is statistical and the second is systematic. In the following we present determinations of \(|V_{cb}|\) obtained from Eqs. (268) and (269). By using \(\eta _{\mathrm{EW}}=1.00662\) Footnote 64 and the \(N_f = 2 +1\) lattice value for \({\mathcal {F}}^{B\rightarrow D^*}(1)\) in Eq. (248),Footnote 65 we thus extract the averages

$$\begin{aligned}&N_f=2\,+\,1 \;\;[B\rightarrow D^*\ell \nu ]_\mathrm{CLN ,HFLAV}: \nonumber \\&\quad |V_{cb}| = 39.05(55)(47) \times 10^{-3} \,, \end{aligned}$$
(270)
$$\begin{aligned}&N_f=2\,+\,1 \;\;[B\rightarrow D^*\ell \nu ]_\mathrm{BGL, Belle}: \nonumber \\&\quad |V_{cb}| = 38.30(53)(69) \times 10^{-3} \,, \end{aligned}$$
(271)

where the first uncertainty comes from the lattice computation and the second from the experimental input.
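The extraction above amounts to dividing the experimental product by \(\eta _{\mathrm{EW}}\) and the lattice form factor, and propagating the two sources of error separately. A minimal sketch, assuming first-order error propagation and quadrature sums for the statistical and systematic errors (the function name is ours):

```python
import math

ETA_EW = 1.00662  # electroweak correction factor quoted in the text

def vcb_zero_recoil(product, d_product, ff, d_ff, eta_ew=ETA_EW):
    """Return (|V_cb|, lattice error, experimental error) in units of 10^-3.

    product: experimental value of F(1) * eta_EW * |V_cb| (x 10^3)
    ff:      lattice form factor at zero recoil
    """
    central = product / (eta_ew * ff)
    err_lat = central * d_ff / ff
    err_exp = d_product / (eta_ew * ff)
    return central, err_lat, err_exp

# Lattice input of Eq. (248), errors combined in quadrature
d_ff = math.hypot(0.004, 0.012)

# Experimental inputs of Eqs. (268) and (269)
cln = vcb_zero_recoil(35.61, 0.43, 0.906, d_ff)
bgl = vcb_zero_recoil(34.93, math.hypot(0.23, 0.59), 0.906, d_ff)

for label, (v, e_lat, e_exp) in (("CLN, HFLAV", cln), ("BGL, Belle", bgl)):
    print(f"{label}: |V_cb| = {v:.2f}({e_lat:.2f})({e_exp:.2f}) x 10^-3")
```

With these inputs the sketch reproduces the central values and errors of Eqs. (270) and (271).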

For the zero-recoil \(B \rightarrow D\) form factor, HFLAV [221] quotes

$$\begin{aligned} \text{ HFLAV: } \qquad {\mathcal {G}}^{B\rightarrow D}(1)\eta _{\mathrm{EW}}|V_{cb}| = 41.57(45)(89) \times 10^{-3}, \nonumber \\ \end{aligned}$$
(272)

yielding the following average for \(N_f=2\):

$$\begin{aligned}&N_f=2&B\rightarrow D\ell \nu :&|V_{cb}|&= 40.0(3.7)(1.0) \times 10^{-3} \,, \end{aligned}$$
(273)

where the first uncertainty comes from the lattice computation and the second from the experimental input.
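As a consistency check, and assuming that the lattice input is the \(N_f=2\) value \({{{\mathcal {G}}}}^{B \rightarrow D}(1)=1.033(95)\) of Eq. (239) together with \(\eta _{\mathrm{EW}}=1.00662\), the numbers in Eq. (273) follow from

$$\begin{aligned} |V_{cb}|&= \frac{41.57\times 10^{-3}}{1.00662\times 1.033} \approx 39.98\times 10^{-3}\,, \nonumber \\ \delta _{\mathrm{lat}}&= 39.98\times \frac{0.095}{1.033}\times 10^{-3} \approx 3.7\times 10^{-3}\,, \nonumber \\ \delta _{\mathrm{exp}}&= \frac{\sqrt{0.45^2+0.89^2}}{1.00662\times 1.033}\times 10^{-3} \approx 1.0\times 10^{-3}\,. \end{aligned}$$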

Finally, for \(N_f=2\,+\,1\) we perform, as discussed above, a joint fit to the available lattice data, discussed in Sect. 8.4, and state-of-the-art experimental determinations. In this case, we will combine the aforementioned recent Belle measurement [653], which provides partial integrated decay rates in 10 bins in the recoil parameter w, with the 2010 BaBar data set in Ref. [654], which quotes the value of \({\mathcal {G}}^{B\rightarrow D}(w)\eta _{\mathrm{EW}}|V_{cb}|\) for ten values of w.Footnote 66 The fit is dominated by the more precise Belle data; given this, and the fact that only partial correlations among systematic uncertainties are to be expected, we will treat both data sets as uncorrelated.Footnote 67

A constrained \((N^+ = N^0 = 3)\) BCL fit using the same ansatz as for lattice-only data in Sect. 8.4 yields our average, which we present in Table 51. The fit is illustrated in Fig. 32. In passing, we note that, if correlations between the FNAL/MILC and HPQCD calculations are neglected, the \(|V_{cb}|\) central value rises to \(40.3 \times 10^{-3}\), in good agreement with the results presented in Ref. [623].

Fig. 32
Lattice and experimental data for \(f_+^{B\rightarrow D}(q^2)\) and \(f_0^{B\rightarrow D}(q^2)\) versus z. Green symbols denote lattice-QCD points included in the fit, while blue and indigo points show experimental data divided by the value of \(|V_{cb}|\) obtained from the fit. The grey and orange bands display the preferred \(N^+=N^0=3\) BCL fit (six parameters) to the lattice-QCD and experimental data with errors

In order to combine the determinations of \(|V_{cb}|\) from exclusive \(B\rightarrow D\) and \(B\rightarrow D^*\) semileptonic decays, we need to estimate the correlation between the lattice uncertainties in the two modes. We assume conservatively that the statistical components of the lattice errors in the two determinations are 100% correlated because they are based on the same MILC configurations (albeit on different subsets). Considering separately the BGL and CLN determination of \(|V_{cb}|\) from \(B\rightarrow D^*\) semileptonic decays, we obtain:

$$\begin{aligned} |V_{cb}^{}|\times 10^3&= 39.08 (91) \;\quad \mathrm{BGL,Belle} \end{aligned}$$
(274)
$$\begin{aligned} |V_{cb}^{}|\times 10^3&= 39.41 (60) \;\quad \mathrm{CLN,HFLAV} \end{aligned}$$
(275)

where we applied a rescaling factor 1.35 to the BGL case.

Table 52 Results for \(|V_{cb}|\). When two errors are quoted in our averages, the first one comes from the lattice form factor, and the second from the experimental measurement. The HFLAV inclusive average obtained in the kinetic scheme from Ref. [221] is shown for comparison

Our results are summarized in Table 52, which also shows the HFLAV inclusive determination of \(|V_{cb}|=42.00(65) \times 10^{-3}\) [655] for comparison, and illustrated in Fig. 33.

Fig. 33
Left: Summary of \(|V_{ub}|\) determined using: i) the B-meson leptonic decay branching fraction, \(B(B^{-} \rightarrow \tau ^{-} {\bar{\nu }})\), measured at the Belle and BaBar experiments, and our averages for \(f_{B}\) from lattice QCD; and ii) the various measurements of the \(B\rightarrow \pi \ell \nu \) decay rates by Belle and BaBar, and our averages for lattice determinations of the relevant vector form factor \(f_+(q^2)\). Right: Same for determinations of \(|V_{cb}|\) using semileptonic decays. The HFLAV inclusive results are from Refs. [221, 655]

In Fig. 34 we present a summary of determinations of \(|V_{ub}|\) and \(|V_{cb}|\) from \(B\rightarrow (\pi ,D^{(*)})\ell \nu \) and \(B\rightarrow \tau \nu \). For comparison purposes, we also add the determination of \(|V_{ub}/V_{cb}|\) obtained from \(\Lambda _b\rightarrow (p,\Lambda _c)\ell \nu \) decays in Refs. [631, 634] – which, as discussed in the text, does not meet the FLAG criteria to enter our averages – as well as the results from inclusive \(B\rightarrow X_{u,c} \ell \nu \) decays. Currently, the determinations of \(V_{cb}\) from \(B\rightarrow D^*\) and \(B\rightarrow D\) decays are quite compatible; however, a sizeable tension involving the extraction of \(V_{cb}\) from inclusive decays remains. In the determination of the \(1\sigma \) and \(2\sigma \) contours for our average we have included an estimate of the correlation between \(|V_{ub}|\) and \(|V_{cb}|\) from semileptonic B decays: the lattice inputs to these quantities are dominated by results from the Fermilab/MILC and HPQCD collaborations which are both based on MILC \(N_f=2\,+\,1\) ensembles, leading to our conservatively introducing a 100% correlation between the lattice statistical uncertainties of the three computations involved. The results of the fit are

$$\begin{aligned}&{\left\{ \begin{array}{ll} |V_{cb}^{}|\times 10^3 = 39.09 (68) &{} \\ |V_{ub}^{}| \times 10^3 = 3.73 (14) &{} \mathrm{BGL}\\ p\mathrm{{-}value} = 0.32 &{} \end{array}\right. }\nonumber \\&\quad \;\;\;\; \mathrm{and} \;\;\;\; {\left\{ \begin{array}{ll} |V_{cb}^{}|\times 10^3 = 39.41 (61) &{} \\ |V_{ub}^{}| \times 10^3 = 3.74 (14) &{} \mathrm{CLN}\\ p\mathrm{{-}value} = 0.55 &{} \end{array}\right. } \end{aligned}$$
(276)

for the BGL and CLN \(B\rightarrow D^*\) parameterizations, respectively.

Fig. 34

Summary of \(|V_{ub}|\) and \(|V_{cb}|\) determinations. Left and right panels correspond to using the BGL and CLN parameterization for the \(B\rightarrow D^*\) form factor, respectively. The solid and dashed lines correspond to 68% and 95% C.L. contours. As discussed in the text, baryonic modes are not included in our averages. The results of the fit in the two cases are \((|V_{cb}^{}|,|V_{ub}^{}|) \times 10^3 = (39.09\pm 0.68, 3.73 \pm 0.14)\) with a p-value of 0.32 and \((|V_{cb}^{}|,|V_{ub}^{}|) \times 10^3 = (39.41\pm 0.61, 3.74 \pm 0.14)\) with a p-value of 0.55, for the BGL and CLN \(B\rightarrow D^*\) parameterizations, respectively

References [623, 648, 650], published in 2016 and 2017, presented evidence that there can be a considerable difference in the CKM matrix elements when choosing between the CLN and BGL parameterizations of form factors. In mid-2018, it appeared that switching to BGL might resolve the difference between the inclusive and exclusive determinations of \(|V_{cb}|\); however, it did not seem to shed light on \(|V_{ub}|\). In September 2018, a new analysis of Belle [647] appeared to find a 10% difference between the CLN and BGL parametrizations for \({\mathcal {F}}^{B\rightarrow D^*}(1)\eta _\mathrm{EW}|V_{cb}|\), supporting previous findings. However, in April 2019, a new version of that preprint found the two parametrizations completely compatible. Further, in March 2019, a BaBar preprint [656] presented an angular analysis of the full dataset from that experiment. This unbinned fit, using the BGL parametrization of the form factors and the FNAL/MILC result for \({\mathcal {F}}^{B\rightarrow D^*}(1)\), finds \(|V_{cb}| = (38.36\pm 0.90)\times 10^{-3},\) quite compatible with previous exclusive determinations and not indicating a resolution of the difference from the inclusive value. A recent paper by Gambino et al. [657] reviews the history, presents numerous fits of the Belle tagged and untagged data, and finds about a \(2 \sigma \) difference between exclusive and inclusive values for \(|V_{cb}|\).

It will be interesting to see what happens when both experimental and theoretical precisions are improved. At least four groups are working to improve the form factor calculations: FNAL/MILC, HPQCD, JLQCD, and LANL/SWME. It would also be good to have additional results on the \(\Lambda _b\) form factors. We can expect new measurements from Belle II and LHCb.

9 The strong coupling \(\alpha _{\mathrm{s}}\)

Authors: R. Horsley, T. Onogi, R. Sommer

9.1 Introduction

The strong coupling \({\bar{g}}_s(\mu )\), defined at scale \(\mu \), plays a key role in the understanding of QCD and in its application to collider physics. For example, the parametric uncertainty from \(\alpha _s\) is one of the dominant sources of uncertainty in the Standard Model prediction for the \(H \rightarrow b{\bar{b}}\) partial width, and the largest source of uncertainty for \(H \rightarrow gg\). Thus higher precision determinations of \(\alpha _s\) are needed to maximize the potential of experimental measurements at the LHC, and for high-precision Higgs studies at future colliders and the study of the stability of the vacuum [658,659,660,661,662,663,664,665]. The value of \(\alpha _s\) also yields one of the essential boundary conditions for completions of the Standard Model at high energies.

In order to determine the running coupling at scale \(\mu \)

$$\begin{aligned} \alpha _s(\mu ) = { {\bar{g}}^2_{s}(\mu ) \over 4\pi } \,, \end{aligned}$$
(277)

we should first “measure” a short-distance quantity \({{\mathcal {Q}}}\) at scale \(\mu \) either experimentally or by lattice calculations, and then match it to a perturbative expansion in terms of a running coupling, conventionally taken as \(\alpha _{\overline{\mathrm{MS}}}(\mu )\),

$$\begin{aligned} {{\mathcal {Q}}}(\mu ) = c_1 \alpha _{\overline{\mathrm{MS}}}(\mu ) + c_2 \alpha _{\overline{\mathrm{MS}}}(\mu )^2 + \cdots \,. \end{aligned}$$
(278)

The essential difference between continuum determinations of \(\alpha _s\) and lattice determinations is the origin of the values of \({\mathcal {Q}}\) in Eq. (278).

Continuum determinations are based on experimentally measurable cross sections or decay widths, from which \({\mathcal {Q}}\) is defined. These cross sections have to be sufficiently inclusive and at sufficiently high scales such that perturbation theory can be applied. Often hadronization corrections have to be used to connect the observed hadronic cross sections to the perturbative ones. Experimental data at high \(\mu \), where perturbation theory is progressively more precise, usually have increasing experimental errors, and it is not easy to find processes that allow one to follow the \(\mu \)-dependence of a single \({\mathcal {Q}}(\mu )\) over a range where \(\alpha _s(\mu )\) changes significantly and precision is maintained.

In contrast, in lattice gauge theory, one can design \({\mathcal {Q}}(\mu )\) as Euclidean short-distance quantities that are not directly related to experimental observables. This allows us to follow the \(\mu \)-dependence until the perturbative regime is reached and nonperturbative “corrections” are negligible. The only experimental input for lattice computations of \(\alpha _s\) is the hadron spectrum which fixes the overall energy scale of the theory and the quark masses. Therefore experimental errors are completely negligible and issues such as hadronization do not occur. We can construct many short-distance quantities that are easy to calculate nonperturbatively in lattice simulations with small statistical uncertainties. We can also simulate at parameter values that do not exist in nature (for example, with unphysical quark masses between bottom and charm) to help control systematic uncertainties. These features mean that precise results for \(\alpha _s\) can be achieved with lattice gauge theory computations. Further, as in the continuum, the different methods available to determine \(\alpha _s\) in lattice calculations with different associated systematic uncertainties enable valuable cross-checks. Practical limitations are discussed in the next section, but a simple one is worth mentioning here. Experimental results (and therefore the continuum determinations) of course have all quarks present, while in lattice gauge theories in practice only the lighter ones are included and one is then forced to use the matching at thresholds, as discussed in the following section.

It is important to keep in mind that the dominant sources of uncertainty in most present-day lattice-QCD calculations of \(\alpha _s\) are the truncation of continuum/lattice perturbation theory and discretization errors. Perturbative truncation errors are of particular concern because they often cannot easily be estimated from studying the data themselves. Further, the size of higher-order coefficients in the perturbative series can sometimes turn out to be larger than naive expectations based on power counting from the behaviour of lower-order terms. We note that perturbative truncation errors are also the dominant source of uncertainty in several of the phenomenological determinations of \(\alpha _s\).

The various phenomenological approaches to determining the running coupling, \(\alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)\), are summarized by the Particle Data Group [137]. The PDG review lists five categories of phenomenological results used to obtain the running coupling: using hadronic \(\tau \) decays, hadronic final states of \(e^+e^-\) annihilation, deep inelastic lepton–nucleon scattering, electroweak precision data, and high energy hadron collider data. Excluding lattice results, the PDG quotes the weighted average as

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1174(16) \,, \quad \text{ PDG } \text{2018 } [137] \end{aligned}$$
(279)

compared to \( \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z) = 0.1183(12) \) of the older review [170]. For a general overview of the various phenomenological and lattice approaches see, e.g., Ref. [666]. The extraction of \(\alpha _s\) from \(\tau \) data, which is the most precise and has the largest impact on the nonlattice average in Eq. (279), is especially sensitive to the treatment of higher-order perturbative terms as well as the treatment of nonperturbative effects. This is important to keep in mind when comparing our chosen range for \(\alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)\) from lattice determinations in Eq. (344) with the nonlattice average from the PDG.

9.1.1 Scheme and scale dependence of \(\alpha _s\) and \(\Lambda _{\mathrm{QCD}}\)

Despite the fact that the notion of the QCD coupling is initially a perturbative concept, the associated \(\Lambda \) parameter is nonperturbatively defined

$$\begin{aligned} \Lambda\equiv & {} \mu \,\left( b_0{\bar{g}}_s^2(\mu )\right) ^{-b_1/(2b_0^2)} e^{-1/(2b_0{\bar{g}}_s^2(\mu ))} \nonumber \\&\times \exp \left[ -\int _0^{{\bar{g}}_s(\mu )}\,dx \left( {1\over \beta (x)} + {1 \over b_0x^3} - {b_1 \over b_0^2x} \right) \right] , \nonumber \\ \end{aligned}$$
(280)

where \(\beta \) is the full renormalization group function in the scheme which defines \({\bar{g}}_s\), and \(b_0\) and \(b_1\) are the first two scheme-independent coefficients of the perturbative expansion

$$\begin{aligned} \beta (x) \sim -b_0 x^3 -b_1 x^5 + \ldots \,, \end{aligned}$$
(281)

with

$$\begin{aligned} \begin{aligned} b_0&= {1\over (4\pi )^2} \left( 11 - {2\over 3}N_f \right) \,, \qquad \\ b_1&= {1\over (4\pi )^4} \left( 102 - {38 \over 3} N_f \right) \,. \end{aligned} \end{aligned}$$
(282)

Thus the \(\Lambda \) parameter is renormalization-scheme-dependent but in an exactly computable way, and lattice gauge theory is an ideal method to relate it to the low-energy properties of QCD. In the \(\overline{\mathrm{MS}}\) scheme presently \(b_{n_l}\) with \(n_l = 4\) is known.
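As a rough numerical illustration of Eqs. (280)–(282), the sketch below evaluates \(b_0\), \(b_1\) and the ratio \(\Lambda /\mu \) for a given \({\bar{g}}_s^2(\mu )\). It assumes that the \(\beta \)-function is truncated after its two universal terms, in which case the integral in Eq. (280) can be done in closed form. This truncation, the function names and the sample input \(\alpha = 0.2\) are assumptions made for illustration only; they are not the procedure of any computation reviewed here.

```python
import math


def b0(nf):
    """First universal beta-function coefficient, Eq. (282)."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2


def b1(nf):
    """Second universal beta-function coefficient, Eq. (282)."""
    return (102.0 - 38.0 * nf / 3.0) / (4.0 * math.pi) ** 4


def lambda_over_mu_2loop(g2, nf):
    """Lambda/mu from Eq. (280), ASSUMING beta(x) = -b0 x^3 - b1 x^5 exactly.

    With this truncated beta-function the integrand of Eq. (280) reduces to
    -(b1/b0)^2 x / (b0 (1 + (b1/b0) x^2)), so the exponential of minus the
    integral equals (1 + b1 g^2 / b0)^(b1 / (2 b0^2)).
    """
    c0, c1 = b0(nf), b1(nf)
    expo = c1 / (2.0 * c0 ** 2)
    prefactor = (c0 * g2) ** (-expo) * math.exp(-1.0 / (2.0 * c0 * g2))
    integral_factor = (1.0 + c1 * g2 / c0) ** expo
    return prefactor * integral_factor


# Hypothetical input: alpha = 0.2 in the 3-flavour theory.
alpha = 0.2
ratio = lambda_over_mu_2loop(4.0 * math.pi * alpha, nf=3)
print(f"Lambda/mu (2-loop truncation, Nf=3, alpha=0.2) = {ratio:.4f}")
```

Because \(\Lambda \) is a renormalization-group invariant, evaluating this expression at two scales connected by the same truncated \(\beta \)-function should return the same \(\Lambda \), which provides a simple consistency check of the formula.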

The change in the coupling from one scheme S to another (taken here to be the \(\overline{\mathrm{MS}}\) scheme) is perturbative,

$$\begin{aligned} g_{\overline{\mathrm{MS}}}^2(\mu ) = g_{\mathrm{S}}^2(\mu ) (1 + c^{(1)}_g g_{\mathrm{S}}^2(\mu ) + \cdots ) \,, \end{aligned}$$
(283)

where \(c^{(i)}_g, \, i\ge 1\) are finite renormalization coefficients. The scale \(\mu \) must be taken high enough for the error in keeping only the first few terms in the expansion to be small. On the other hand, the conversion to the \(\Lambda \) parameter in the \(\overline{\mathrm{MS}}\) scheme is given exactly by

$$\begin{aligned} \Lambda _{\overline{\mathrm{MS}}} = \Lambda _{\mathrm{S}} \exp \left[ c_g^{(1)}/(2b_0)\right] \,. \end{aligned}$$
(284)

The fact that \(\Lambda _{\overline{\mathrm {MS}}}\) can be obtained exactly from \(\Lambda _S\) in any scheme S where \(c^{(1)}_g\) is known together with the high order knowledge (5-loop by now) of \(\beta _{\overline{\mathrm {MS}}}\) means that the errors in \(\alpha _{\overline{\mathrm {MS}}}(m_\mathrm {Z})\) are dominantly due to the errors of \(\Lambda _S\). We will therefore mostly discuss them in that way. Starting from Eq. (280), we have to consider (i) the error of \({\bar{g}}_S^2(\mu )\) (denoted as \(\left( \frac{\Delta \Lambda }{\Lambda }\right) _{\Delta \alpha _S}\) ) and (ii) the truncation error in \(\beta _S\) (denoted as \(\left( \frac{\Delta \Lambda }{\Lambda }\right) _{\mathrm{trunc}}\)). Concerning (ii), note that knowledge of \(c_g^{(n_l)}\) for the scheme S means that \(\beta _S\) is known to \(n_l+1\) loop order; \(b_{n_l}\) is known. We thus see that in the region where perturbation theory can be applied, the following errors of \(\Lambda _S\) (or consequently \(\Lambda _{\overline{\mathrm{MS}}}\)) have to be considered

$$\begin{aligned} \left( \frac{\Delta \Lambda }{\Lambda }\right) _{\Delta \alpha _S}= & {} \frac{\Delta \alpha _{S}(\mu )}{ 8\pi b_0 \alpha _{S}^2(\mu )} \times \left[ 1 + \mathrm{O}(\alpha _S(\mu ))\right] , \end{aligned}$$
(285)
$$\begin{aligned} \left( \frac{\Delta \Lambda }{\Lambda }\right) _{\mathrm{trunc}}= & {} k \alpha _{S}^{n_\mathrm {l}}(\mu ) + \mathrm{O}\left( \alpha _S^{n_\mathrm {l}+1}(\mu )\right) \,, \end{aligned}$$
(286)

where k is proportional to \(b_{n_\mathrm {l}+1}\) and in typical good schemes such as \({\overline{\mathrm {MS}}}\) it is numerically of order one. Statistical and systematic errors such as discretization effects contribute to \(\Delta \alpha _{S}(\mu )\). In the above we dropped a scheme subscript for the \(\Lambda \)-parameters because of Eq. (284).
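To get a feeling for the relative size of the two error sources in Eqs. (285) and (286), the following sketch evaluates their leading terms. The inputs (\(\alpha _S = 0.2\), a 1% error on \(\alpha _S\), \(N_f=3\), and \(k=1\)) are hypothetical values chosen only for illustration, and the higher-order corrections indicated in the equations are dropped.

```python
import math


def b0(nf):
    """First universal beta-function coefficient, Eq. (282)."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2


def rel_lambda_error_from_alpha(alpha, dalpha, nf):
    """Leading term of Eq. (285): error of Lambda induced by the error of alpha."""
    return dalpha / (8.0 * math.pi * b0(nf) * alpha ** 2)


def rel_lambda_error_truncation(alpha, n_l, k=1.0):
    """Leading term of Eq. (286); k = 1 is an ASSUMED order-one coefficient."""
    return k * alpha ** n_l


# Hypothetical example: alpha = 0.2 known to 1%, three flavours.
err_alpha = rel_lambda_error_from_alpha(0.2, 0.002, nf=3)
err_trunc_2 = rel_lambda_error_truncation(0.2, n_l=2)
err_trunc_3 = rel_lambda_error_truncation(0.2, n_l=3)
print(f"from alpha: {err_alpha:.4f}, truncation n_l=2: {err_trunc_2:.4f}, "
      f"n_l=3: {err_trunc_3:.4f}")
```

In this example a 1% error on \(\alpha _S\) translates into an error of roughly 3.5% on \(\Lambda \), which lies between the truncation estimates for \(n_\mathrm {l}=2\) and \(n_\mathrm {l}=3\).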

By convention \(\alpha _{\overline{\mathrm {MS}}}\) is usually quoted at a scale \(\mu =M_Z\) where the appropriate effective coupling is the one in the 5-flavour theory: \(\alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)\). In order to obtain it from a result with fewer flavours, one connects effective theories with different number of flavours as discussed by Bernreuther and Wetzel [667]. For example, one considers the \({\overline{\mathrm {MS}}}\) scheme, matches the 3-flavour theory to the 4-flavour theory at a scale given by the charm-quark mass [668,669,670], runs with the 5-loop \(\beta \)-function [168, 671,672,673,674] of the 4-flavour theory to a scale given by the b-quark mass, and there matches to the 5-flavour theory, after which one runs up to \(\mu =M_Z\) with the 5-loop \(\beta \) function. For the matching relation at a given quark threshold we use the mass \(m_\star \) which satisfies \(m_\star = {\overline{m}}_{\overline{\mathrm {MS}}}(m_\star )\), where \({\overline{m}}\) is the running mass (analogous to the running coupling). Then

$$\begin{aligned}&{\bar{g}}^2_{N_f-1}(m_\star )\nonumber \\&= {\bar{g}}^2_{N_f}(m_\star )\left[ 1+ 0\times {\bar{g}}^{2}_{N_f}(m_\star ) + \sum _{n\ge 2}t_n\,{\bar{g}}^{2n}_{N_f}(m_\star )\right] \end{aligned}$$
(287)

with [668, 670, 675]

$$\begin{aligned} t_2= & {} {1 \over (4\pi ^2)^2} {11\over 72}, \end{aligned}$$
(288)
$$\begin{aligned} t_3= & {} {1 \over (4\pi ^2)^3} \left[ - {82043\over 27648}\zeta _3 + {564731\over 124416}-{2633\over 31104}(N_f-1)\right] ,\nonumber \\ \end{aligned}$$
(289)
$$\begin{aligned} t_4= & {} {1 \over (4\pi ^2)^4} \big [5.170347 - 1.009932 (N_f-1)\nonumber \\&\quad \quad \ \quad \quad - 0.021978 \,(N_f-1)^2\big ], \end{aligned}$$
(290)

(where \(\zeta _3\) is the value of the Riemann zeta-function at 3) provides the matching at the thresholds in the \({\overline{\mathrm {MS}}}\) scheme. Often the package RunDec is used for quark-threshold matching and running in the \({\overline{\mathrm {MS}}}\)-scheme [676, 677].
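The matching relation of Eqs. (287)–(290) is simple enough to evaluate directly. The sketch below does so for a single threshold; it is an illustration with hypothetical function names and a hypothetical input coupling, not a substitute for a dedicated package, and it covers only the matching step, not the running between thresholds.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta-function at 3


def matching_coefficients(nf):
    """Coefficients t_2, t_3, t_4 of Eqs. (288)-(290) for the step N_f -> N_f - 1."""
    nl = nf - 1
    norm = 4.0 * math.pi ** 2
    t2 = (11.0 / 72.0) / norm ** 2
    t3 = (-82043.0 / 27648.0 * ZETA3 + 564731.0 / 124416.0
          - 2633.0 / 31104.0 * nl) / norm ** 3
    t4 = (5.170347 - 1.009932 * nl - 0.021978 * nl ** 2) / norm ** 4
    return t2, t3, t4


def decouple_g2(g2_nf, nf):
    """Eq. (287): coupling squared of the (N_f - 1)-flavour theory at mu = m_star."""
    series = 1.0  # the O(g^2) term vanishes at mu = m_star
    for n, t_n in zip((2, 3, 4), matching_coefficients(nf)):
        series += t_n * g2_nf ** n
    return g2_nf * series


# Hypothetical input: alpha = 0.2 in the 5-flavour theory at the b threshold.
alpha_5 = 0.2
alpha_4 = decouple_g2(4.0 * math.pi * alpha_5, nf=5) / (4.0 * math.pi)
print(f"alpha^(4)/alpha^(5) at threshold = {alpha_4 / alpha_5:.6f}")
```

For couplings of this size the correction is below the per-mille level, which illustrates the statement that the \(t_n\) are numerically small.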

While \(t_2,\,t_3,\,t_4\) are numerically small coefficients, the charm threshold scale is also relatively low and so there are nonperturbative uncertainties in the matching procedure, which are difficult to estimate but which we assume here to be negligible. Obviously there is no perturbative matching formula across the strange “threshold”; here matching is entirely nonperturbative. Model dependent extrapolations of \({\bar{g}}^2_{N_f}\) from \(N_f=0,2\) to \(N_f=3\) were done in the early days of lattice gauge theory. We will include these in our listings of results but not in our estimates, since such extrapolations are based on untestable assumptions.

9.1.2 Overview of the review of \(\alpha _s\)

We begin by explaining lattice-specific difficulties in Sect. 9.2.1 and the FLAG criteria designed to assess whether the associated systematic uncertainties can be controlled and estimated in a reasonable manner. We then discuss, in Sects. 9.3–9.8, the various lattice approaches. For completeness, we present results from calculations with \(N_f = 0, 2, 3\), and 4 flavours. Finally, in Sect. 9.10, we present averages together with our best estimates for \(\alpha _{\overline{\mathrm{MS}}}^{(5)}\). These are determined from 3- and 4-flavour QCD simulations. The earlier \(N_f = 0, 2\) works obtained results for \(N_f = 3\) by extrapolation in \(N_f\). Because this is not a theoretically controlled procedure, we do not include these results in our averages. For the \(\Lambda \) parameter, we also give results for other numbers of flavours, including \(N_f=0\). Even though the latter numbers should not be used for phenomenology, they represent valuable nonperturbative information concerning field theories with variable numbers of quarks.

9.1.3 Additions with respect to the FLAG 13 report

Computations added in FLAG 16 were

  • Karbstein 14 [678] and Bazavov 14 [80] based on the static-quark potential (Sect. 9.4),

  • FlowQCD 15 [679] based on a tadpole-improved bare coupling (Sect. 9.6),

  • HPQCD 14A [16] based on heavy-quark current two-point functions (Sect. 9.7).

They influenced the final ranges marginally.

9.1.4 Additions with respect to the FLAG 16 report

For the benefit of the readers who are familiar with our previous report, we list here where changes and additions can be found which go beyond slight improvements of the presentation.

Our criteria are slightly updated, keeping up-to-date with the cited precisions of computations. In particular, in the criterion for perturbative behaviour we specify that the requirement may be less stringent if a larger uncertainty is quoted.

The FLAG 19 additions are

  • ALPHA 17 [79] and Ishikawa 17 [680] from step-scaling methods (Sect. 9.3).

  • Husung 17 [681], Karbstein 18 [682] and Takaura 18 [683, 684] from the static-quark potential (Sect. 9.4).

  • Hudspith 18 [685] based on the vacuum polarization (Sect. 9.5).

  • Kitazawa 16 [686] based on a tadpole-improved bare coupling (Sect. 9.6).

  • JLQCD 16 [23] and Maezawa 16 [157] based on heavy-quark current two-point functions (Sect. 9.7).

  • Nakayama 18 [687] from the eigenvalue spectrum of the Dirac operator (Sect. 9.9).

9.2 General issues

9.2.1 Discussion of criteria for computations entering the averages

As in the PDG review, we only use calculations of \(\alpha _s\) published in peer-reviewed journals, and that use NNLO or higher-order perturbative expansions, to obtain our final range in Sect. 9.10. We also, however, introduce further criteria designed to assess the ability to control important systematics, which we describe here. Some of these criteria, e.g., that for the continuum extrapolation, are associated with lattice-specific systematics and have no continuum analogue. Other criteria, e.g., that for the renormalization scale, could in principle be applied to nonlattice determinations. Expecting that lattice calculations will continue to improve significantly in the near future, our goal in reviewing the state of the art here is to be conservative and avoid prematurely choosing an overly small range.

In lattice calculations, we generally take \({{\mathcal {Q}}}\) to be some combination of physical amplitudes or Euclidean correlation functions which are free from UV and IR divergences and have a well-defined continuum limit. Examples include the force between static quarks and two-point functions of quark bilinear currents.

In comparison to values of observables \({{\mathcal {Q}}}\) determined experimentally, those from lattice calculations require two more steps. The first step concerns setting the scale \(\mu \) in GeV, where one needs to use some experimentally measurable low-energy scale as input. Ideally one employs a hadron mass. Alternatively, convenient intermediate scales such as \(\sqrt{t_0}\), \(w_0\), \(r_0\), \(r_1\) [271, 272, 336, 688] can be used if their relation to an experimental dimensionful observable is established. The low-energy scale needs to be computed at the same bare parameters where \({{\mathcal {Q}}}\) is determined, at least as long as one does not use the step-scaling method (see below). This induces a practical difficulty given present computing resources. In the determination of the low-energy reference scale the volume needs to be large enough to avoid finite-size effects. On the other hand, in order for the perturbative expansion of Eq. (278) to be reliable, one has to reach sufficiently high values of \(\mu \), i.e., short enough distances. To avoid uncontrollable discretization effects the lattice spacing a has to be accordingly small. This means

$$\begin{aligned} L \gg \text{ hadron } \text{ size }\sim \Lambda _{\mathrm{QCD}}^{-1}\quad \text{ and } \quad 1/a \gg \mu \,, \end{aligned}$$
(291)

(where L is the box size) and therefore

$$\begin{aligned} L/a \ggg \mu /\Lambda _{\mathrm{QCD}} \,. \end{aligned}$$
(292)

The currently available computer power, however, limits L/a, typically to \(L/a = 32-96\). Unless one accepts compromises in controlling discretization errors or finite-size effects, this means one needs to set the scale \(\mu \) according to

$$\begin{aligned} \mu \lll L/a \times \Lambda _{\mathrm{QCD}}&\sim 10-30\, \text{ GeV } \,. \end{aligned}$$
(293)

(Here \(\lll \) or \(\ggg \) means at least one order of magnitude smaller or larger.) Therefore, \(\mu \) can be \(1-3\, \text{ GeV }\) at most. This raises the concern whether the asymptotic perturbative expansion truncated at 1-loop, 2-loop, or 3-loop in Eq. (278) is sufficiently accurate. There is a finite-size scaling method, usually called step-scaling method, which solves this problem by identifying \(\mu =1/L\) in the definition of \({{\mathcal {Q}}}(\mu )\), see Sect. 9.3.
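The arithmetic behind Eq. (293) and the resulting bound on \(\mu \) can be spelled out in a few lines. The sketch below assumes \(\Lambda _{\mathrm{QCD}} \approx 0.3\) GeV, a rough illustrative value that is not quoted in the text but reproduces the stated range, and interprets \(\lll \) as a factor of ten.

```python
LAMBDA_QCD_GEV = 0.3  # ASSUMPTION: rough illustrative value, not taken from the text


def scale_window(l_over_a, lambda_qcd=LAMBDA_QCD_GEV, margin=10.0):
    """Return (L/a * Lambda_QCD, largest mu allowed by Eq. (293)) in GeV.

    'margin' encodes the 'at least one order of magnitude' meaning of the
    triple inequality sign.
    """
    upper = l_over_a * lambda_qcd
    return upper, upper / margin


for n in (32, 96):
    upper, mu_max = scale_window(n)
    print(f"L/a = {n}: L/a * Lambda = {upper:.1f} GeV, mu up to about {mu_max:.1f} GeV")
```

With these assumptions, \(L/a = 32\) and \(L/a = 96\) bracket the range of roughly 10 to 30 GeV in Eq. (293), and dividing by ten gives the quoted 1 to 3 GeV for \(\mu \).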

For the second step after setting the scale \(\mu \) in physical units (\(\text{ GeV }\)), one should compute \({{\mathcal {Q}}}\) on the lattice, \({{\mathcal {Q}}}_{\mathrm{lat}}(a,\mu )\) for several lattice spacings and take the continuum limit to obtain the left hand side of Eq. (278) as

$$\begin{aligned} {{\mathcal {Q}}}(\mu ) \equiv \lim _{a\rightarrow 0} {{\mathcal {Q}}}_{\mathrm{lat}}(a,\mu ) \text{ with } \mu \text{ fixed }\,. \end{aligned}$$
(294)

This is necessary to remove the discretization error.

Here it is assumed that the quantity \({{\mathcal {Q}}}\) has a continuum limit, which is regularization-independent. The method discussed in Sect. 9.6, which is based on the perturbative expansion of a lattice-regulated, divergent short-distance quantity \(W_\mathrm{lat}(a)\) differs in this respect and must be treated separately.

In summary, a controlled determination of \(\alpha _s\) needs to satisfy the following:

  1. The determination of \(\alpha _s\) is based on a comparison of a short-distance quantity \({{\mathcal {Q}}}\) at scale \(\mu \) with a well-defined continuum limit without UV and IR divergences to a perturbative expansion formula in Eq. (278).

  2. The scale \(\mu \) is large enough so that the perturbative expansion in Eq. (278) is precise to the order at which it is truncated, i.e., it has good asymptotic convergence.

  3. If \({{\mathcal {Q}}}\) is defined by physical quantities in infinite volume, one needs to satisfy Eq. (292).

    Nonuniversal quantities need a separate discussion, see Sect. 9.6.

Conditions (2) and (3) give approximate lower and upper bounds for \(\mu \) respectively. It is important to see whether there is a window to satisfy (2) and (3) at the same time. If it exists, it remains to examine whether a particular lattice calculation is done inside the window or not.

Obviously, an important issue for the reliability of a calculation is whether the scale \(\mu \) that can be reached lies in a regime where perturbation theory can be applied with confidence. However, the value of \(\mu \) does not provide an unambiguous criterion. For instance, the Schrödinger Functional, or SF-coupling (Sect. 9.3) is conventionally taken at the scale \(\mu =1/L\), but one could also choose \(\mu =2/L\). Instead of \(\mu \) we therefore define an effective \(\alpha _{\mathrm{eff}}\). For schemes such as SF (see Sect. 9.3) or qq (see Sect. 9.4) this is directly the coupling of the scheme. For other schemes such as the vacuum polarization we use the perturbative expansion Eq. (278) for the observable \({{\mathcal {Q}}}\) to define

$$\begin{aligned} \alpha _{\mathrm{eff}} = {{\mathcal {Q}}}/c_1 \,. \end{aligned}$$
(295)

If there is an \(\alpha _s\)-independent term it should first be subtracted. Note that this is nothing but defining an effective, regularization-independent coupling, a physical renormalization scheme.

Let us now comment further on the use of the perturbative series. Since it is only an asymptotic expansion, the remainder \(R_n({{\mathcal {Q}}})={{\mathcal {Q}}}-\sum _{i\le n}c_i \alpha _s^i\) of a truncated perturbative expression \({{\mathcal {Q}}}\sim \sum _{i\le n}c_i \alpha _s^i\) cannot just be estimated as a perturbative error \(k\,\alpha _s^{n+1}\). The error is nonperturbative. Often one speaks of “nonperturbative contributions”, but nonperturbative and perturbative cannot be strictly separated due to the asymptotic nature of the series (see, e.g., Ref. [689]).

Still, we do have some general ideas concerning the size of nonperturbative effects. The known ones such as instantons or renormalons decay for large \(\mu \) like inverse powers of \(\mu \) and are thus roughly of the form

$$\begin{aligned} \exp (-\gamma /\alpha _s) \,, \end{aligned}$$
(296)

with some positive constant \(\gamma \). Thus we have, loosely speaking,

$$\begin{aligned} {{\mathcal {Q}}}= & {} c_1 \alpha _s + c_2 \alpha _s^2 + \cdots + c_n\alpha _s^n + {\mathcal {O}}\left( \alpha _s^{n+1}\right) \nonumber \\&+\, {\mathcal {O}}(\exp (-\gamma /\alpha _s)) \,. \end{aligned}$$
(297)

For small \(\alpha _s\), the \(\exp (-\gamma /\alpha _s)\) is negligible. Similarly the perturbative estimate for the magnitude of relative errors in Eq. (297) is small; as an illustration for \(n=3\) and \(\alpha _s = 0.2\) the relative error is \(\sim 0.8\%\) (assuming coefficients \(|c_{n+1} /c_1 | \sim 1\)).
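The illustration above amounts to the following one-line estimate: the first neglected term, \(c_{n+1}\alpha _s^{n+1}\), divided by the leading term, \(c_1\alpha _s\). The sketch uses the same assumption \(|c_{n+1}/c_1| \sim 1\); the second input value, \(\alpha _s = 0.3\), is added only for comparison.

```python
def relative_truncation_error(alpha_s, n, coeff_ratio=1.0):
    """Size of c_{n+1} alpha^{n+1} relative to c_1 alpha, with |c_{n+1}/c_1| = coeff_ratio."""
    return coeff_ratio * alpha_s ** n


err_small = relative_truncation_error(0.2, n=3)  # the case quoted in the text
err_large = relative_truncation_error(0.3, n=3)  # a larger coupling, for comparison
print(f"alpha_s = 0.2: {100 * err_small:.1f}%   alpha_s = 0.3: {100 * err_large:.1f}%")
```

The estimate grows quickly with the coupling, from 0.8% at \(\alpha _s = 0.2\) to 2.7% at \(\alpha _s = 0.3\), and it says nothing about the nonperturbative term in Eq. (297).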

For larger values of \(\alpha _s\) nonperturbative effects can become significant in Eq. (297). An instructive example comes from the values obtained from \(\tau \) decays, for which \(\alpha _s\approx 0.3\). Here, different applications of perturbation theory (fixed order and contour improved) each look reasonably asymptotically convergent but the difference does not seem to decrease much with the order (see, e.g., the contribution of Pich in Ref. [690]). In addition nonperturbative terms in the spectral function may be nonnegligible even after the integration up to \(m_\tau \) (see, e.g., Refs. [691, 692]). All of this is because \(\alpha _s\) is not really small.

Since the size of the nonperturbative effects is very hard to estimate one should try to avoid such regions of the coupling. In a fully controlled computation one would like to verify the perturbative behaviour by changing \(\alpha _s\) over a significant range instead of estimating the errors as \(\sim \alpha _s^{n+1}\) . Some computations try to take nonperturbative power ‘corrections’ to the perturbative series into account by including such terms in a fit to the \(\mu \)-dependence. We note that this is a delicate procedure, both because the separation of nonperturbative and perturbative is theoretically not well defined and because in practice a term like, e.g., \(\alpha _s(\mu )^3\) is hard to distinguish from a \(1/\mu ^2\) term when the \(\mu \)-range is restricted and statistical and systematic errors are present. We consider it safer to restrict the fit range to the region where the power corrections are negligible compared to the estimated perturbative error.

The above considerations lead us to the following special criteria for the determination of \(\alpha _s\):

  • Renormalization scale

    • ★ all points relevant in the analysis have \(\alpha _\mathrm {eff} < 0.2\)

    • ○ all points have \(\alpha _\mathrm {eff} < 0.4\) and at least one \(\alpha _\mathrm {eff} \le 0.25\)

    • ■ otherwise

  • Perturbative behaviour

    • ★ verified over a range of a factor 4 change in \(\alpha _\mathrm {eff}^{n_\mathrm {l}}\) without power corrections or alternatively \(\alpha _\mathrm {eff}^{n_\mathrm {l}} \le \frac{1}{2} \Delta \alpha _\mathrm {eff} / (8\pi b_0 \alpha _\mathrm {eff}^2) \) is reached

    • ○ agreement with perturbation theory over a range of a factor \((3/2)^2\) in \(\alpha _\mathrm {eff}^{n_\mathrm {l}}\) possibly fitting with power corrections or alternatively \(\alpha _\mathrm {eff}^{n_\mathrm {l}} \le \Delta \alpha _\mathrm {eff} / (8\pi b_0 \alpha _\mathrm {eff}^2)\) is reached

    • ■ otherwise

    Here \(\Delta \alpha _\mathrm {eff}\) is the accuracy cited for the determination of \(\alpha _\mathrm {eff}\) and \(n_\mathrm {l}\) is the loop order to which the connection of \(\alpha _\mathrm {eff}\) to the \({\overline{\mathrm {MS}}}\) scheme is known. Recall the discussion around Eqs. (285) and (286). The \(\beta \)-function of \(\alpha _\mathrm {eff}\) is then known to \(n_\mathrm {l}+1\) loop order (see footnote 68).

  • Continuum extrapolation

    At a reference point of \(\alpha _{\mathrm{eff}} = 0.3\) (or less) we require

    • ★ three lattice spacings with \(\mu a < 1/2\) and full \({\mathcal {O}}(a)\) improvement,

      or three lattice spacings with \(\mu a \le 1/4\) and 2-loop \({\mathcal {O}}(a)\) improvement,

      or \(\mu a \le 1/8\) and 1-loop \({\mathcal {O}}(a)\) improvement

    • ○ three lattice spacings with \(\mu a < 3/2\) reaching down to \(\mu a =1\) and full \({\mathcal {O}}(a)\) improvement,

      or three lattice spacings with \(\mu a \le 1/4\) and 1-loop \({\mathcal {O}}(a)\) improvement

    • ■ otherwise

We also need to specify what is meant by \(\mu \). Here are our choices:

$$\begin{aligned} \text {step-scaling}&:&\mu =1/L\,, \nonumber \\ \text {heavy quark-antiquark potential}&:&\mu =2/r\,, \nonumber \\ \text {observables in momentum space}&:&\mu =q \,, \nonumber \\ \text {moments of heavy-quark currents}&:&\mu =2{\bar{m}}_\mathrm {c} \,, \nonumber \\ \text {eigenvalues of the Dirac operator}&:&\mu = \lambda _{\overline{\mathrm {MS}}} \end{aligned}$$
(298)

where q is the magnitude of the momentum, \({\bar{m}}_\mathrm {c}\) is the heavy-quark mass, usually taken around the charm-quark mass, and \(\lambda _{\overline{\mathrm {MS}}}\) is the eigenvalue of the Dirac operator, see Sect. 9.9. We note again that the above criteria cannot be applied when regularization-dependent quantities \(W_\mathrm {lat}(a)\) are used instead of \({{\mathcal {Q}}}(\mu )\). These cases are specifically discussed in Sect. 9.6.
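The numerical parts of the renormalization-scale and perturbative-behaviour criteria listed above can be written as simple checks. The sketch below is an illustration only: the function names and the tier labels "top", "middle" and "bottom" (for the three levels of each criterion, in the order listed) are hypothetical, and the range-based part of the perturbative-behaviour criterion, which requires inspecting the actual data, is not covered.

```python
import math


def b0(nf):
    """First universal beta-function coefficient, Eq. (282)."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2


def renormalization_scale_tier(alpha_eff_values):
    """Tier of the renormalization-scale criterion for a set of alpha_eff points."""
    if all(a < 0.2 for a in alpha_eff_values):
        return "top"
    if all(a < 0.4 for a in alpha_eff_values) and min(alpha_eff_values) <= 0.25:
        return "middle"
    return "bottom"


def perturbative_alternative_tier(alpha_eff, dalpha_eff, n_l, nf):
    """Tier reached through the 'alternatively' clause only, or None if it is not met.

    Compares alpha_eff^{n_l} with dalpha_eff / (8 pi b0 alpha_eff^2), i.e. the
    leading terms of Eqs. (286) and (285).
    """
    bound = dalpha_eff / (8.0 * math.pi * b0(nf) * alpha_eff ** 2)
    if alpha_eff ** n_l <= 0.5 * bound:
        return "top"
    if alpha_eff ** n_l <= bound:
        return "middle"
    return None


# Hypothetical inputs, for illustration only.
print(renormalization_scale_tier([0.22, 0.35]))
print(perturbative_alternative_tier(0.2, 0.003, n_l=2, nf=3))
```

A return value of `None` from the second function does not imply the lowest tier; it only means that the rating has to be decided from the range over which perturbative behaviour was verified.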

In principle one should also account for electro-weak radiative corrections. However, both in the determination of \(\alpha _{s}\) at intermediate scales \(\mu \) and in the running to high scales, we expect electro-weak effects to be much smaller than the presently reached precision. Such effects are therefore not further discussed.

The attentive reader will have noticed that bounds such as \(\mu a < 3/2\) or at least one value of \(\alpha _\mathrm {eff}\le 0.25\), which we require for a ○, are not very stringent. There is a considerable difference between ○ and ★. We have chosen the above bounds, unchanged as compared to FLAG 16, since not too many computations would satisfy more stringent ones at present. Nevertheless, we believe that the criteria already give reasonable bases for estimates of systematic errors. In the future, we expect that we will be able to tighten our criteria for inclusion in the average, and that many more computations will reach the present ★ rating in one or more categories.

In addition to our explicit criteria, the following effects may influence the precision of results:

Topology sampling  In principle a good way to improve the quality of determinations of \(\alpha _s\) is to push to very small lattice spacings thus enabling large \(\mu \). It is known that the sampling of field space becomes very difficult for the HMC algorithm when the lattice spacing is small and one has the standard periodic boundary conditions. In practice, for all known discretizations the sampling of the topological charge slows down dramatically for \(a\approx 0.05\,\mathrm{fm}\) and smaller [97, 100,101,102,103,104, 401]. Open boundary conditions solve the problem [105] but are not frequently used. Since the effect of the freezing on short-distance observables is not known, we do need to pay attention to this issue. Remarks are added in the text when appropriate.

Quark-mass effects  We assume that effects of the finite masses of the light quarks (including strange) are negligible in the effective coupling itself where large, perturbative, \(\mu \) is considered.

Scale determination  The scale does not need to be very precise, since using the lowest-order \(\beta \)-function shows that a 3% error in the scale determination corresponds to a \(\sim 0.5\%\) error in \(\alpha _s(M_Z)\). So as long as systematic errors from chiral extrapolation and finite-volume effects are well below 3% we do not need to be concerned about those at the present level of precision in \(\alpha _s(M_Z)\). This may change in the future.
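The estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the lowest-order running \(\alpha _s(\mu ) = 1/(b_0 \ln (\mu ^2/\Lambda ^2))\) with \(b_0 = (33-2N_f)/(12\pi )\), so that a relative scale error \(\delta \) translates into \(\delta \alpha _s/\alpha _s \approx 2 b_0 \alpha _s \delta \). The inputs \(\alpha _s(M_Z)=0.118\) and \(N_f=5\) and the function names are illustrative choices, not taken from the computations reviewed here.

```python
import math


def one_loop_b0(n_f: int) -> float:
    """Lowest-order coefficient b0 in d(alpha)/d(ln mu^2) = -b0 * alpha^2."""
    return (33 - 2 * n_f) / (12 * math.pi)


def relative_alpha_error(alpha: float, rel_scale_error: float, n_f: int = 5) -> float:
    """Linearised relative error on alpha_s caused by a relative scale error.

    Assumption: only the lowest-order beta-function is used.
    """
    return 2 * one_loop_b0(n_f) * alpha * rel_scale_error


# Illustrative inputs (assumptions): alpha_s(M_Z) = 0.118, 3% scale error.
rel_err = relative_alpha_error(alpha=0.118, rel_scale_error=0.03)
print(f"relative error on alpha_s(M_Z): {100 * rel_err:.2f}%")
```

Within this approximation the result is a few per mille, of the same size as the \(\sim 0.5\%\) quoted above.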

9.2.2 Physical scale

A popular scale choice has been the intermediate \(r_0\) scale. One should bear in mind that its determination from physical observables also has to be taken into account. The phenomenological value of \(r_0\) was originally determined as \(r_0 \approx 0.49\,\text{ fm }\) through potential models describing quarkonia [336]. Of course the quantity is precisely defined, independent of such model considerations. But a lattice computation with the correct sea-quark content is needed to determine a completely sharp value. When the quark content is not quite realistic, the value of \(r_0\) may depend to some extent on which experimental input is used to determine (actually define) it.

The latest determinations from two-flavour QCD are \(r_0\) = 0.420(14)–0.450(14) fm by the ETM collaboration [40, 48], using as input \(f_\pi \) and \(f_K\) and carrying out various continuum extrapolations. On the other hand, the ALPHA collaboration [693] determined \(r_0\) = 0.503(10) fm with input from \(f_K\), and the QCDSF collaboration [90] cites 0.501(10)(11) fm from the mass of the nucleon (no continuum limit). Recent determinations from three-flavour QCD are consistent with \(r_1\) = 0.313(3) fm and \(r_0\) = 0.472(5) fm [36, 276, 694]. Due to the uncertainty in these estimates, and as many results are based directly on \(r_0\) to set the scale, we shall often give both the dimensionless number \(r_0 \Lambda _{\overline{\mathrm{MS}}}\) and \(\Lambda _{\overline{\mathrm{MS}}}\). In the cases where no physical \(r_0\) scale is given in the original papers or we convert to the \(r_0\) scale, we use the value \(r_0\) = 0.472 fm. In case \(r_1 \Lambda _{\overline{\mathrm{MS}}}\) is given in the publications, we use \(r_0 /r_1 = 1.508\) [694] to convert, neglecting the error on this ratio; the value remains well consistent with the update [401]. In some, mostly early, computations the string tension, \(\sqrt{\sigma }\), was used. We convert to \(r_0\) using \(r_0^2\sigma = 1.65-\pi /12\), which has been shown to be an excellent approximation in the relevant pure gauge theory [695, 696].
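The conversions described in this paragraph amount to simple rescalings. The sketch below collects them, assuming the standard conversion constant \(\hbar c \approx 197.327\,\mathrm{MeV\,fm}\) (not quoted in the text); the function names are hypothetical and errors are not propagated.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm (assumption: standard value, not from the text)
R0_FM = 0.472           # reference value of r_0 used in the text
R0_OVER_R1 = 1.508      # ratio r_0/r_1 used in the text, its error is neglected


def r0_lambda_from_r1_lambda(r1_lambda: float) -> float:
    """Convert r_1*Lambda to r_0*Lambda with the fixed ratio r_0/r_1."""
    return r1_lambda * R0_OVER_R1


def r0_lambda_from_sqrt_sigma(lambda_over_sqrt_sigma: float) -> float:
    """Convert Lambda/sqrt(sigma) to r_0*Lambda using r_0^2 sigma = 1.65 - pi/12."""
    return lambda_over_sqrt_sigma * math.sqrt(1.65 - math.pi / 12)


def lambda_mev(r0_lambda: float, r0_fm: float = R0_FM) -> float:
    """Convert the dimensionless r_0*Lambda to Lambda in MeV."""
    return r0_lambda * HBARC_MEV_FM / r0_fm


# Consistency of the quoted three-flavour values: r_1 * (r_0/r_1) should give r_0.
r0_from_r1 = 0.313 * R0_OVER_R1
print(f"r_0 from r_1: {r0_from_r1:.4f} fm")
```

The last two lines check that the quoted \(r_1 = 0.313\) fm and the ratio 1.508 reproduce \(r_0 = 0.472\) fm to the digits given.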

The new scales \(t_0,w_0\) based on the gradient flow are very attractive alternatives to \(r_0\) but their discretization errors are still under discussion [697,698,699,700] and their values at the physical point are not yet determined with great precision. We remain with \(r_0\) as our main reference scale for now. A general discussion of the various scales is given in [701].

9.2.3 Studies of truncation errors of perturbation theory

As discussed previously, we have to determine \(\alpha _s\) in a region where the perturbative expansion for the \(\beta \)-function, Eq. (281) in the integral Eq. (280), is reliable. In principle this must be checked; however, this is difficult to achieve, as we need to reach up to a sufficiently high scale. A frequently used recipe to estimate the size of truncation errors of the perturbative series is to vary the renormalization scale of an observable evaluated at a fixed order in the coupling around the chosen ‘optimal’ scale \(\mu _*\), from \(\mu =\mu _*/2\) to \(2\mu _*\). For an example see Fig. 35. Alternatively, or in addition, the renormalization scheme chosen can be varied, which investigates the perturbative conversion of the chosen scheme to the perturbatively defined \(\overline{\mathrm{MS}}\) scheme and in particular ‘fastest apparent convergence’ when the ‘optimal’ scale is chosen so that the \(O(\alpha _s^2)\) coefficient vanishes.

The ALPHA collaboration in Ref. [702] and ALPHA 17  [703], within the SF approach defined a set of \(\nu \) schemes where the third scheme-dependent coefficient of the \(\beta \)-function for \(N_f = 2\,+\,1\) flavours was computed to be \(b_2^\nu = -(0.064(27)+1.259(1)\nu )/(4\pi )^3\). The standard SF scheme has \(\nu = 0\). For comparison, \(b_2^{\overline{\mathrm {MS}}}= 0.324/(4\pi )^3\). A range of scales from about \(4\,\text{ GeV }\) to \(128\,\text{ GeV }\) was investigated. It was found that while the procedure of varying the scale by a factor 2 up and down gave a correct estimate of the residual perturbative error for \(\nu \approx 0 \ldots 0.3\), for negative values, e.g., \(\nu = -0.5\), the estimated perturbative error is much too small to account for the mismatch in the \(\Lambda \)-parameter of \(\approx 8\%\) at \(\alpha _s=0.15\). This mismatch, however, did, as expected, still scale with \(\alpha _s^{n_l}\) with \(n_l=2\). In the schemes with negative \(\nu \), the coupling \(\alpha _s\) has to be quite small for scale-variations of a factor 2 to correctly signal the perturbative errors.

A similar \(\approx 8\%\) deviation in the \(\Lambda \)-parameter extracted from the qq-scheme (cf. Sect. 9.4) is found by Husung 17 [681], but at \(\alpha _s\approx 0.2\) and with \(n_l=3\).
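To get a feeling for the numbers quoted in this subsection, the following sketch evaluates the coefficient \(b_2^\nu \) for a few values of \(\nu \) (central values only) and rescales a given mismatch in \(\Lambda \) under the assumption that it is dominated by its leading power \(\alpha _s^{n_l}\). This pure power-law extrapolation is an illustrative simplification, not a statement about the analyses of Refs. [702, 703] or [681]; the function names are hypothetical.

```python
import math

FOUR_PI_CUBED = (4 * math.pi) ** 3
B2_MSBAR = 0.324 / FOUR_PI_CUBED  # value quoted in the text for comparison


def b2_nu(nu: float) -> float:
    """Central value of the scheme-dependent coefficient b_2^nu quoted in the text."""
    return -(0.064 + 1.259 * nu) / FOUR_PI_CUBED


def scaled_mismatch(mismatch_ref: float, alpha_ref: float, alpha: float, n_l: int) -> float:
    """Rescale a relative mismatch assuming it behaves as alpha^n_l (assumption)."""
    return mismatch_ref * (alpha / alpha_ref) ** n_l


for nu in (-0.5, 0.0, 0.3):
    print(f"nu = {nu:+.1f}: b2 * (4 pi)^3 = {b2_nu(nu) * FOUR_PI_CUBED:+.3f}")

# 8% at alpha = 0.15 with n_l = 2, rescaled to alpha = 0.10.
mismatch_at_010 = scaled_mismatch(0.08, 0.15, 0.10, n_l=2)
print(f"mismatch at alpha = 0.10: {100 * mismatch_at_010:.1f}%")
```

Under this assumption an 8% mismatch at \(\alpha _s=0.15\) shrinks only to a few percent at \(\alpha _s=0.10\), which illustrates why small couplings are needed before scale variation becomes a trustworthy error estimate.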

9.3 \(\alpha _s\) from Step-Scaling Methods

9.3.1 General considerations

The method of step-scaling functions avoids the scale problem, Eq. (291). It is in principle independent of the particular boundary conditions used and was first developed with periodic boundary conditions in a two-dimensional model [704].

The essential idea of the step-scaling strategy is to split the determination of the running coupling at large \(\mu \) and of a hadronic scale into two lattice calculations and connect them by ‘step-scaling’. In the former part, we determine the running coupling constant in a finite-volume scheme in which the renormalization scale is set by the inverse lattice size \(\mu = 1/L\). In this calculation, one takes a high renormalization scale while keeping the lattice spacing sufficiently small as

$$\begin{aligned} \mu \equiv 1/L \sim 10\,\ldots \, 100\,\text{ GeV }\,, \qquad a/L \ll 1 \,. \end{aligned}$$
(299)

In the latter part, one chooses a certain \({\bar{g}}^2_\mathrm {max}={\bar{g}}^2(1/L_\mathrm {max})\), typically such that \(L_\mathrm {max}\) is around 0.5–1 fm. With a common discretization, one then determines \(L_\mathrm {max}/a\) and (in a large volume \(L \ge \) 2–3 fm) a hadronic scale such as a hadron mass, \(\sqrt{t_0}/a\) or \(r_0/a\) at the same bare parameters. In this way one gets numbers for, e.g., \(L_\mathrm {max}/r_0\) and by changing the lattice spacing a carries out a continuum limit extrapolation of that ratio.

In order to connect \({\bar{g}}^2(1/L_\mathrm {max})\) to \({\bar{g}}^2(\mu )\) at high \(\mu \), one determines the change of the coupling in the continuum limit when the scale changes from L to L/s, starting from \(L=L_{\mathrm{max}}\) and arriving at \(\mu = s^k /L_{\mathrm{max}}\). This part of the strategy is called step-scaling. Combining these results yields \({\bar{g}}^2(\mu )\) at \(\mu = s^k \,(r_0 / L_\mathrm {max})\, r_0^{-1}\), where \(r_0\) stands for the particular chosen hadronic scale. Most applications use a scale factor \(s=2\).
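The recursion behind step-scaling can be sketched as follows. In an actual computation the step-scaling function is determined nonperturbatively on the lattice and extrapolated to the continuum; here it is replaced by its lowest-order perturbative form, \(1/{\bar{g}}^2(s\mu ) = 1/{\bar{g}}^2(\mu ) + 2 b_0 \ln s\) with \(b_0 = (11 - 2N_f/3)/(16\pi ^2)\), purely for illustration. The starting value and the number of steps are illustrative inputs and the function names are hypothetical.

```python
import math


def b0(n_f: int) -> float:
    """Lowest-order beta-function coefficient for SU(3) with n_f flavours."""
    return (11 - 2 * n_f / 3) / (16 * math.pi ** 2)


def one_loop_step(g2: float, s: float, n_f: int) -> float:
    """Coupling at scale s*mu from the coupling at mu (lowest order only)."""
    return 1.0 / (1.0 / g2 + 2 * b0(n_f) * math.log(s))


def run_steps(g2_start: float, k: int, s: float = 2.0, n_f: int = 3):
    """Apply k steps; return a list of (mu / mu_start, g2) pairs."""
    g2 = g2_start
    history = [(1.0, g2)]
    for i in range(1, k + 1):
        g2 = one_loop_step(g2, s, n_f)
        history.append((s ** i, g2))
    return history


# Illustrative inputs (assumptions): start at g2 = 2.012, four steps with s = 2.
history = run_steps(2.012, k=4)
alpha_final = history[-1][1] / (4 * math.pi)
for scale_ratio, g2 in history:
    print(f"mu/mu_start = {scale_ratio:5.1f}   g2 = {g2:.4f}")
```

Each step multiplies the scale by s, so k steps cover a factor \(s^k\) in \(\mu \) while each individual lattice calculation only has to accommodate a single scale ratio.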

At present most applications in QCD use Schrödinger functional boundary conditions [172, 705] and we discuss this below in a little more detail. (However, other boundary conditions are also possible, such as twisted boundary conditions and the discussion also applies to them.) An important reason is that these boundary conditions avoid zero modes for the quark fields and quartic modes [706] in the perturbative expansion in the gauge fields. Furthermore the corresponding renormalization scheme is well studied in perturbation theory [707,708,709] with the 3-loop \(\beta \)-function and 2-loop cutoff effects (for the standard Wilson regularization) known.

In order to have a perturbatively well-defined scheme, the SF scheme uses Dirichlet boundary conditions at time \(t = 0\) and \(t = T\). These break translation invariance and permit \({{\mathcal {O}}}(a)\) counter terms at the boundary through quantum corrections. Therefore, the leading discretization error is \({{\mathcal {O}}}(a)\). Improving the lattice action is achieved by adding counter terms at the boundaries whose coefficients are denoted as \(c_t,{{\tilde{c}}}_t\). In practice, these coefficients are computed with 1-loop or 2-loop perturbative accuracy. A better precision in this step yields a better control over discretization errors, which is important, as can be seen, e.g., in Refs. [695, 710].

Computations with Dirichlet boundary conditions also suffer, in principle, from the insufficient change of topology in the HMC algorithm at small lattice spacing. However, in a small volume the weight of nonzero charge sectors in the path integral is exponentially suppressed [711] (Footnote 69), and in a Monte Carlo run of typical length very few configurations with nontrivial topology should appear. Considering the issue quantitatively, Ref. [712] finds a strong suppression below \(L\approx 0.8\,\mathrm{fm}\). Therefore the lack of topology change of the HMC is not a serious issue. Still, Ref. [713] includes a projection to zero topology into the definition of the coupling. We note also that a mix of Dirichlet and open boundary conditions is expected to remove the topology issue entirely [714] and may be considered in the future.

Apart from the boundary conditions, the very definition of the coupling needs to be chosen. We briefly discuss in turn, the two schemes used at present, namely, the ‘Schrödinger Functional’ (SF) and ‘Gradient Flow’ (GF) schemes.

The SF scheme was the first one to be used in step-scaling studies in gauge theories [172]. Inhomogeneous Dirichlet boundary conditions are imposed in time,

$$\begin{aligned} A_k(x)|_{x_0=0} = C_k\,, \quad A_k(x)|_{x_0=L} = C_k'\,, \end{aligned}$$
(300)

for \(k=1,2,3\). Periodic boundary conditions (up to a phase for the fermion fields) with period L are imposed in space. The matrices

$$\begin{aligned} LC_k&= i \,\mathrm{diag}\big ( \eta - \pi /3, -\eta /2 , -\eta /2 + \pi /3 \big ) \,, \\ LC^\prime _k&= i \,\mathrm{diag}\big ( -(\eta +\pi ), \eta /2 + \pi /3,\eta /2 + 2\pi /3 \big )\,, \end{aligned}$$

just depend on the dimensionless parameter \(\eta \). The coupling \({\bar{g}}_\mathrm {SF}\) is obtained from the \(\eta \)-derivative of the effective action,

$$\begin{aligned} \langle \partial _\eta S|_{\eta =0} \rangle = \frac{12\pi }{{\bar{g}}^2_\mathrm {SF}}\,. \end{aligned}$$
(301)

For this scheme, the finite \(c^{(i)}_g\), Eq. (283), are known for \(i=1,2\) [708, 709].
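As a small consistency check of the boundary values \(C_k\) and \(C^\prime _k\) given above, one can verify that the diagonal entries sum to zero for any \(\eta \), as required for elements of the SU(3) Lie algebra. The sketch below does this numerically; the function name is hypothetical and the common factor \(i/L\) is dropped.

```python
import math


def boundary_angles(eta: float):
    """Diagonal entries of L*C_k/i and L*C'_k/i as functions of eta."""
    c = (eta - math.pi / 3, -eta / 2, -eta / 2 + math.pi / 3)
    c_prime = (-(eta + math.pi), eta / 2 + math.pi / 3, eta / 2 + 2 * math.pi / 3)
    return c, c_prime


for eta in (0.0, 0.3, -1.2):
    c, c_prime = boundary_angles(eta)
    print(f"eta = {eta:+.1f}: trace C = {sum(c):+.1e}, trace C' = {sum(c_prime):+.1e}")
```

Both traces vanish up to rounding, independent of \(\eta \).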

Table 53 Results for the \(\Lambda \) parameter from computations using step-scaling of the SF-coupling. Entries without values for \(\Lambda \) computed the running and established perturbative behaviour at large \(\mu \)

More recently, gradient flow couplings have been used frequently because of their small statistical errors at large couplings (in contrast to \({\bar{g}}_\mathrm {SF}\), which has small statistical errors at small couplings). The gradient flow is introduced as follows [271, 715]. Consider the flow gauge field \(B_\mu (t,x)\) with the flow time t, which is a one parameter deformation of the bare gauge field \(A_\mu (x)\), where \(B_\mu (t,x)\) is the solution to the gradient flow equation

$$\begin{aligned} \partial _t B_\mu (t,x)= & {} D_\nu G_{\nu \mu }(t,x)\,, \nonumber \\ G_{\mu \nu }= & {} \partial _\mu B_\nu - \partial _\nu B_\mu + [B_\mu ,B_\nu ] \,, \end{aligned}$$
(302)

with initial condition \(B_\mu (0,x) = A_\mu (x)\). The renormalized coupling is defined by [271]

$$\begin{aligned} {\bar{g}}^2_{\mathrm{GF}}(\mu ) = \left. {{{\mathcal {N}}}} t^2 \langle E(t,x)\rangle \right| _{\mu =1/\sqrt{8t}} \,, \end{aligned}$$
(303)

with \({{{\mathcal {N}}}} = 16\pi ^2/3 + O((a/L)^2)\) and where \(E(t,x)\) is the action density given by

$$\begin{aligned} E(t,x) = \frac{1}{4} G^a_{\mu \nu }(t,x) G^a_{\mu \nu }(t,x). \end{aligned}$$
(304)

In a finite volume, one needs to specify additional conditions. In order not to introduce two independent scales one sets

$$\begin{aligned} \sqrt{8t} = cL \,, \end{aligned}$$
(305)

for some fixed number c [716]. Schrödinger functional boundary conditions [717] or twisted boundary conditions [680, 718] have been employed. Matching of the GF coupling to the \(\overline{\mathrm{MS}}\) scheme coupling is known to 1-loop for twisted boundary conditions with zero quark flavours and SU(3) group [680] and to 2-loop with SF boundary conditions with zero quark flavours [719]. The former is based on a MC evaluation at small couplings and the latter on numerical stochastic perturbation theory.
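The relations between the flow time, the box size and the renormalization scale in Eqs. (303) and (305) can be made explicit in a short sketch. It assumes the leading normalization \({{\mathcal {N}}} = 16\pi ^2/3\) and ignores the \(O((a/L)^2)\) corrections; the function names and the numerical inputs are hypothetical.

```python
import math

NORM = 16 * math.pi ** 2 / 3  # leading normalization, lattice corrections neglected


def flow_time(c: float, box_size: float) -> float:
    """Flow time t fixed by sqrt(8 t) = c L."""
    return (c * box_size) ** 2 / 8


def gf_scale(c: float, box_size: float) -> float:
    """Renormalization scale mu = 1/sqrt(8 t) = 1/(c L)."""
    return 1.0 / (c * box_size)


def gf_coupling(t: float, energy_density: float) -> float:
    """Gradient flow coupling g^2 = N t^2 <E(t, x)>."""
    return NORM * t * t * energy_density


# Illustrative inputs (assumptions): c = 0.3 and L = 1 in arbitrary units.
t = flow_time(0.3, 1.0)
print(f"t/L^2 = {t:.5f}, mu*L = {gf_scale(0.3, 1.0):.3f}")
```

With this normalization a leading-order action density \(t^2\langle E\rangle = 3g^2/(16\pi ^2)\) is mapped back onto \(g^2\), which is the purpose of the factor \({{\mathcal {N}}}\).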

9.3.2 Discussion of computations

In Table 53 we give results from various determinations of the \(\Lambda \) parameter. For a clear assessment of the \(N_f\)-dependence, the last column also shows results that refer to a common hadronic scale, \(r_0\). As discussed above, the renormalization scale can be chosen large enough such that \(\alpha _s < 0.2\) and the perturbative behaviour can be verified. Consequently only is present for these criteria except for early work where the \(n_l=2\) loop connection to \({\overline{\mathrm {MS}}}\) was not yet known and we assigned a concerning the renormalization scale. With dynamical fermions, results for the step-scaling functions are always available for at least \(a/L = \mu a =1/4,1/6, 1/8\). All calculations have a nonperturbatively \({\mathcal {O}}(a)\) improved action in the bulk. For the discussed boundary \({\mathcal {O}}(a)\) terms this is not so. In most recent calculations 2-loop \({\mathcal {O}}(a)\) improvement is employed together with at least three lattice spacings.Footnote 70 This means a for the continuum extrapolation. In other computations only 1-loop \(c_t\) was available and we arrive at . We note that the discretization errors in the step-scaling functions of the SF coupling are usually found to be very small, at the percent level or below. However, the overall desired precision is very high as well, and the results in CP-PACS 04 [710] show that discretization errors at the below percent level cannot be taken for granted. In particular with staggered fermions (unimproved except for boundary terms) few percent effects are seen in Perez 10 [721].

In the work by PACS-CS 09A [81], the continuum extrapolation in the scale setting is performed using both a constant function in a and a linear function. Potentially the former leaves a considerable residual discretization error. We here use, as discussed with the collaboration, the continuum extrapolation linear in a, as given in the second line of the PACS-CS 09A [81] results in Table 53. After perturbative conversion from a three-flavour result to five flavours (see Sect. 9.2.1), they obtain

$$\begin{aligned} \alpha _{\overline{\mathrm {MS}}}^{(5)}(M_Z)=0.118(3)\,. \end{aligned}$$
(306)

In Ref. [79], the ALPHA collaboration determined \(\Lambda ^{(3)}_{{\overline{\mathrm {MS}}}}\) combining step-scaling in \({\bar{g}}^2_\mathrm{GF}\) in the lower scale region \(\mu _{\mathrm{had}} \le \mu \le \mu _0\), and step-scaling in \({\bar{g}}^2_{\mathrm{SF}}\) for higher scales \(\mu _0 \le \mu \le \mu _{\mathrm{PT}}\). Both schemes are defined with SF boundary conditions. For \({\bar{g}}^2_{\mathrm{GF}}\) a projection to the sector of zero topological charge is included, Eq. (304) is restricted to the magnetic components, and \(c=0.3\). The scales \(\mu _{\mathrm{had}}\), \(\mu _0\), and \(\mu _{\mathrm{PT}}\) are defined by \({\bar{g}}^2_{\mathrm{GF}} (\mu _{\mathrm{had}})= 11.3\), \({\bar{g}}^2_{\mathrm{SF}}(\mu _0) = 2.012\), and \(\mu _{\mathrm{PT}} = 16 \mu _0\) which are roughly estimated as

$$\begin{aligned} 1/L_\mathrm {max}\equiv \mu _{\mathrm{had}} \approx 0.2 \text{ GeV },&\mu _0 \approx 4 \text{ GeV } \,,&\mu _{\mathrm{PT}}\approx 70 \text{ GeV }.\nonumber \\ \end{aligned}$$
(307)

Step-scaling is carried out with an O(a)-improved Wilson quark action [725] and Lüscher-Weisz gauge action [726] in the low-scale region and an O(a)-improved Wilson quark action [727] and Wilson gauge action in the high-energy part. For the step-scaling using steps of \(L/a \,\rightarrow \,2L/a\), three lattice sizes \(L/a=8,12,16\) were simulated for \({\bar{g}}^2_{\mathrm{GF}}\) and four lattice sizes \(L/a=(4,)\, 6, 8, 12\) for \({\bar{g}}^2_{\mathrm{SF}}\). The final results do not use the small lattices given in parenthesis. The parameter \(\Lambda ^{(3)}_{{\overline{\mathrm {MS}}}}\) is then obtained via

$$\begin{aligned} \Lambda ^{(3)}_{{\overline{\mathrm {MS}}}} = \underbrace{\frac{\Lambda ^{(3)}_{{\overline{\mathrm {MS}}}}}{\mu _{\mathrm{PT}}}}_\mathrm{perturbation ~ theory} \times \underbrace{\frac{\mu _\mathrm{PT}}{\mu _{\mathrm{had}}}}_{\mathrm{step-scaling}} \times \underbrace{\frac{\mu _{\mathrm{had}}}{f_{\pi K}}}_{\mathrm{large ~volume ~simulation}} \times \underbrace{f_{\pi K}}_{\mathrm{experimental ~data}} \,, \nonumber \\ \end{aligned}$$
(308)

where the hadronic scale \(f_{\pi K}\) is \(f_{\pi K}= \frac{1}{3}(2 f_K + f_\pi ) = 147.6 (5) \text{ MeV }\). The first term on the right hand side of Eq. (308) is obtained from \(\alpha _\mathrm{SF}(\mu _{\mathrm{PT}})\) which is the output from SF step-scaling using Eq. (280) with \(\alpha _{\mathrm{SF}}(\mu _{\mathrm{PT}})\approx 0.1\) and the 3-loop \(\beta \)-function and the exact conversion to the \({\overline{\mathrm {MS}}}\)-scheme. The second term is essentially obtained from step-scaling in the GF scheme and the measurement of \({\bar{g}}^2_\mathrm{SF}(\mu _0)\) except for the trivial scaling factor of 16 in the SF running. The third term is obtained from a measurement of the hadronic quantity at large volume.
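The factorization in Eq. (308) is a plain product of dimensionless ratios and one dimensionful input. The sketch below evaluates the combination \(f_{\pi K}\) and the product; the individual decay constants \(f_\pi = 130.4\) MeV and \(f_K = 156.2\) MeV are assumed here for illustration (they are not quoted in the text, but they reproduce the stated 147.6 MeV), and the function names are hypothetical.

```python
F_PI_MEV = 130.4  # assumption: illustrative value, not quoted in the text
F_K_MEV = 156.2   # assumption: illustrative value, not quoted in the text


def f_pi_k(f_pi: float, f_k: float) -> float:
    """Hadronic scale f_{pi K} = (2 f_K + f_pi) / 3."""
    return (2 * f_k + f_pi) / 3


def lambda_from_factors(lambda_over_mu_pt: float,
                        mu_pt_over_mu_had: float,
                        mu_had_over_f: float,
                        f_scale: float) -> float:
    """Product of the four factors in Eq. (308); the result has the units of f_scale."""
    return lambda_over_mu_pt * mu_pt_over_mu_had * mu_had_over_f * f_scale


f_scale_mev = f_pi_k(F_PI_MEV, F_K_MEV)
print(f"f_piK = {f_scale_mev:.1f} MeV")
```

Because the factors multiply, their relative errors add in quadrature when they are uncorrelated, so the least precise factor dominates the final uncertainty.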

A large volume simulation is done for three lattice spacings with sufficiently large volume and reasonable control over the chiral extrapolation so that the scale determination is precise enough. The step-scaling results in both schemes satisfy renormalization criteria, perturbation theory criteria, and continuum limit criteria just as previous studies using step-scaling. So we assign green stars for these criteria.

The dependence of \(\Lambda \), Eq. (280) with 3-loop \(\beta \)-function, on \(\alpha _s\) and on the chosen scheme is discussed in [702]. This investigation provides a warning on estimating the truncation error of perturbative series. Details are explained in Sect. 9.2.3.

The result for the \(\Lambda \) parameter is \(\Lambda ^{(3)}_{\overline{\mathrm{MS}}} = 341(12)~\text{ MeV }\), where the dominant error comes from the error of \(\alpha _{\mathrm{SF}}(\mu _\mathrm{PT})\) after step-scaling in the SF scheme. Using 4-loop matching at the charm and bottom thresholds and 5-loop running one finally obtains

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z) = 0.11852(84)\,. \end{aligned}$$
(309)

Several other results do not have a sufficient number of quark flavours or do not yet contain the conversion of the scale to physical units (ALPHA 10A [720], Perez 10 [721]). Thus no value for \(\alpha _{\overline{\mathrm {MS}}}^{(5)}(M_Z)\) is quoted.

The computation of Ishikawa et al. [680] is based on the gradient flow coupling with twisted boundary conditions [718] (TGF coupling) in the pure gauge theory. Again they use \(c=0.3\). Step-scaling with a scale factor \(s=3/2\) is employed, covering a large range of couplings from \(\alpha _s\approx 0.5\) to \(\alpha _s\approx 0.1\) and taking the continuum limit through global fits to the step-scaling function on \(L/a=12,16,18\) lattices with between 6 and 8 parameters. Systematic errors due to variations of the fit functions are estimated. Two physical scales are considered: \(r_0/a\) is taken from [695] and \(\sigma a^2\) from [163] and [728]. As the ratio \(\Lambda _\mathrm {TGF}/\Lambda _{{\overline{\mathrm {MS}}}}\) has not yet been computed analytically, Ref. [680] determines the 1-loop relation between \({\bar{g}}_\mathrm {SF}\) and \({\bar{g}}_\mathrm {TGF}\) from MC simulations performed in the weak coupling region and then uses the known \(\Lambda _\mathrm {SF}/\Lambda _{{\overline{\mathrm {MS}}}}\). Systematic errors due to variations of the fit functions dominate the overall uncertainty.

9.4 \(\alpha _s\) from the potential at short distances

9.4.1 General considerations

The basic method was introduced in Ref. [729] and developed in Ref. [730]. The force or potential between an infinitely massive quark and antiquark pair defines an effective coupling constant via

$$\begin{aligned} F(r) = {d V(r) \over dr} = C_F {\alpha _\mathrm {qq}(r) \over r^2} \,. \end{aligned}$$
(310)

The coupling can be evaluated nonperturbatively from the potential through a numerical differentiation, see below. In perturbation theory one also defines couplings in different schemes \(\alpha _{{\bar{V}}}\), \(\alpha _V\) via

$$\begin{aligned} V(r) = - C_F {\alpha _{{\bar{V}}}(r) \over r} \,, \quad \text{ or } \quad {\tilde{V}}(Q) = - C_F {\alpha _V(Q) \over Q^2} \,, \end{aligned}$$
(311)

where one fixes the unphysical constant in the potential by \(\lim _{r\rightarrow \infty }V(r)=0\) and \({\tilde{V}}(Q)\) is the Fourier transform of V(r). Nonperturbatively, the subtraction of a constant in the potential introduces an additional renormalization constant, the value of \(V(r_\mathrm {ref})\) at some distance \(r_\mathrm {ref}\). Perturbatively, it is believed to entail a renormalon ambiguity. In perturbation theory, the different definitions are all simply related to each other, and their perturbative expansions are known including the \(\alpha _s^4,\,\alpha _s^4 \log \alpha _s\) and \(\alpha _s^5 \log \alpha _s ,\,\alpha _s^5 (\log \alpha _s)^2\) terms [731,732,733,734,735,736,737,738].

The potential V(r) is determined from ratios of Wilson loops, W(r, t), which behave as

$$\begin{aligned} \langle W(r, t) \rangle = |c_0|^2 e^{-V(r)t} + \sum _{n\not = 0} |c_n|^2 e^{-V_n(r)t} \,, \end{aligned}$$
(312)

where t is taken as the temporal extension of the loop, r is the spatial one and \(V_n\) are excited-state potentials. To improve the overlap with the ground state, and to suppress the effects of excited states, t is taken large. Also various additional techniques are used, such as a variational basis of operators (spatial paths) to help in projecting out the ground state. Furthermore some lattice-discretization effects can be reduced by averaging over Wilson loops related by rotational symmetry in the continuum.
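The role of large t in Eq. (312) can be illustrated with synthetic data. The sketch below builds a two-state model of \(\langle W(r,t)\rangle \) at fixed r and extracts an effective potential from the ratio of loops at neighbouring t, in lattice units \(a=1\). All numerical values are made up for illustration and the function names are hypothetical; variational techniques are not included.

```python
import math


def wilson_loop(t: int, v0: float, v1: float, c0_sq: float, c1_sq: float) -> float:
    """Two-state model of <W(r, t)> at fixed r (ground state plus one excitation)."""
    return c0_sq * math.exp(-v0 * t) + c1_sq * math.exp(-v1 * t)


def effective_potential(w_t: float, w_t_plus_1: float) -> float:
    """Effective potential ln[W(t)/W(t+1)], which tends to V(r) at large t."""
    return math.log(w_t / w_t_plus_1)


# Synthetic parameters (assumptions): V = 0.5, V_1 = 1.0 in lattice units.
V0, V1, C0_SQ, C1_SQ = 0.5, 1.0, 0.8, 0.2


def v_eff(t: int) -> float:
    return effective_potential(wilson_loop(t, V0, V1, C0_SQ, C1_SQ),
                               wilson_loop(t + 1, V0, V1, C0_SQ, C1_SQ))


for t in (1, 5, 10, 20):
    print(f"t = {t:2d}: V_eff = {v_eff(t):.6f}")
```

The effective potential approaches the ground-state value from above, and the contamination decays like \(e^{-(V_1-V)t}\); a larger overlap \(|c_0|^2\) reduces it at fixed t.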

In order to reduce discretization errors it is advantageous to define the numerical derivative giving the force as

$$\begin{aligned} F(r_\mathrm {I}) = { V(r) - V(r-a) \over a } \,, \end{aligned}$$
(313)

where \(r_\mathrm {I}\) is chosen so that at tree level the force is the continuum force. \(F(r_\mathrm {I})\) is then a ‘tree-level improved’ quantity and similarly the tree-level improved potential can be defined [739].
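The idea behind \(r_\mathrm {I}\) can be illustrated in a continuum toy model. For a pure Coulomb potential \(V(r) = -C_F\alpha /r\) the finite difference in Eq. (313) equals \(C_F\alpha /(r(r-a))\), so choosing \(r_\mathrm {I} = \sqrt{r(r-a)}\) reproduces the continuum force exactly, whereas the naive midpoint \(r-a/2\) does not. On the lattice \(r_\mathrm {I}\) is instead fixed from the lattice tree-level potential; the sketch below, with hypothetical function names and inputs, only demonstrates the principle.

```python
import math

C_F = 4.0 / 3.0  # quadratic Casimir of the fundamental representation of SU(3)


def coulomb_potential(r: float, alpha: float) -> float:
    """Toy continuum potential V(r) = -C_F * alpha / r."""
    return -C_F * alpha / r


def finite_difference_force(v_r: float, v_r_minus_a: float, a: float) -> float:
    """Backward difference [V(r) - V(r - a)] / a as in Eq. (313)."""
    return (v_r - v_r_minus_a) / a


def alpha_qq(force: float, r: float) -> float:
    """Coupling defined through F(r) = C_F * alpha_qq(r) / r^2."""
    return r * r * force / C_F


# Illustrative inputs (assumptions): alpha = 0.3, r = 3a, a = 1.
ALPHA_IN, R, A = 0.3, 3.0, 1.0
force = finite_difference_force(coulomb_potential(R, ALPHA_IN),
                                coulomb_potential(R - A, ALPHA_IN), A)
r_improved = math.sqrt(R * (R - A))   # toy analogue of r_I
alpha_improved = alpha_qq(force, r_improved)
alpha_naive = alpha_qq(force, R - A / 2)
print(f"improved: {alpha_improved:.6f}, naive midpoint: {alpha_naive:.6f}")
```

In this toy model the improved assignment returns the input coupling, while the naive midpoint overestimates it by a relative amount \(a^2/(4r(r-a))\).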

Lattice potential results are in position space, while perturbation theory is naturally computed in momentum space at large momentum. Usually, the Fourier transform of the perturbative expansion is then matched to lattice data.

Finally, as was noted in Sect. 9.2.1, a determination of the force can also be used to determine the scales \(r_0,\,r_1\), by defining them from the static force by

$$\begin{aligned} r_0^2 F(r_0) = {1.65} \,, \quad r_1^2 F(r_1) = 1\,. \end{aligned}$$
(314)
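Given a parameterization of the force, the conditions in Eq. (314) can be solved numerically. The sketch below does so by bisection for a Cornell-type toy force \(F(r) = A/r^2 + \sigma \) with \(A = \pi /12\), in units where \(\sigma = 1\). This model and the function names are assumptions for illustration; the model reproduces the relation \(r_0^2\sigma = 1.65 - \pi /12\) by construction and is not expected to reproduce the ratio \(r_0/r_1\) found in QCD.

```python
import math


def cornell_force(r: float, a_coeff: float, sigma: float) -> float:
    """Toy force F(r) = A / r^2 + sigma."""
    return a_coeff / r ** 2 + sigma


def solve_scale(target: float, force, r_lo: float = 1e-3, r_hi: float = 10.0,
                iterations: int = 200) -> float:
    """Solve r^2 F(r) = target by bisection, assuming r^2 F(r) increases with r."""
    def g(r: float) -> float:
        return r * r * force(r) - target

    if g(r_lo) > 0 or g(r_hi) < 0:
        raise ValueError("target is not bracketed by [r_lo, r_hi]")
    for _ in range(iterations):
        mid = 0.5 * (r_lo + r_hi)
        if g(mid) < 0:
            r_lo = mid
        else:
            r_hi = mid
    return 0.5 * (r_lo + r_hi)


A_COEFF, SIGMA = math.pi / 12, 1.0


def toy_force(r: float) -> float:
    return cornell_force(r, A_COEFF, SIGMA)


r0 = solve_scale(1.65, toy_force)
r1 = solve_scale(1.0, toy_force)
print(f"r_0 sqrt(sigma) = {r0:.5f}, r_1 sqrt(sigma) = {r1:.5f}")
```

Since \(r_1 < r_0\), the scale \(r_1\) probes shorter distances and is therefore more sensitive to discretization effects but less sensitive to the light-quark masses.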

9.4.2 Discussion of computations

In Table 54, we list results of determinations of \(r_0\Lambda _{{\overline{\mathrm {MS}}}}\) (together with \(\Lambda _{{\overline{\mathrm {MS}}}}\) using the scale determination of the authors). Since the last review, FLAG 16, there have been three new computations, Husung 17 [681], Karbstein 18 [682] and Takaura 18 [683, 684].

The first determinations in the three-colour Yang Mills theory are by UKQCD 92 [730] and Bali 92 [743] who used \(\alpha _\mathrm {qq}\) as explained above, but not in the tree-level improved form. Rather a phenomenologically determined lattice artifact correction was subtracted from the lattice potentials. The comparison with perturbation theory was on a more qualitative level on the basis of a 2-loop \(\beta \)-function (\(n_l=1\)) and a continuum extrapolation could not be performed as yet. A much more precise computation of \(\alpha _\mathrm {qq}\) with continuum extrapolation was performed in Refs. [695, 739]. Satisfactory agreement with perturbation theory was found [739] but the stability of the perturbative prediction was not considered sufficient to be able to extract a \(\Lambda \) parameter.

Table 54 Short-distance potential results

In Brambilla 10 [742] the same quenched lattice results of Ref. [739] were used and a fit was performed to the continuum potential, instead of the force. Perturbation theory to \(n_l=3\) loop was used including a resummation of terms \(\alpha _s^3 (\alpha _s \ln \alpha _s)^n \) and \(\alpha _s^4 (\alpha _s \ln \alpha _s)^n \). Close agreement with perturbation theory was found when a renormalon subtraction was performed. Note that the renormalon subtraction introduces a second scale into the perturbative formula which is absent when the force is considered.

Bazavov 14 [80] updates Bazavov 12 [740] and modifies this procedure somewhat. They consider the perturbative expansion for the force. They set \(\mu = 1/r\) to eliminate logarithms and then integrate the force to obtain an expression for the potential. The resulting integration constant is fixed by requiring the perturbative potential to be equal to the nonperturbative one exactly at a reference distance \(r_{\mathrm{ref}}\) and the two are then compared at other values of r. As a further check, the force is also used directly.

For the quenched calculation Brambilla 10 [742] very small lattice spacings, \(a \sim 0.025\,\text{ fm }\), were available from Ref. [739]. For ETM 11C [741], Bazavov 12 [740], Karbstein 14 [678] and Bazavov 14 [80] using dynamical fermions such small lattice spacings are not yet realized (Bazavov 14 reaches down to \(a \sim 0.041\,\text{ fm }\)). They all use the tree-level improved potential as described above. We note that the value of \(\Lambda _{\overline{\mathrm {MS}}}\) in physical units by ETM 11C [741] is based on a value of \(r_0=0.42\) fm. This is at least 10% smaller than the large majority of other values of \(r_0\). Also the values of \(r_0/a\) on the finest lattices in ETM 11C [741] and \(r_1/a\) for Bazavov 14 [80] come from rather small lattices with \(m_\pi L \approx 2.4\), 2.2 respectively.

Instead of the procedure discussed previously, Karbstein 14 [678] reanalyzes the data of ETM 11C [741] by first estimating the Fourier transform \({{\tilde{V}}}(p)\) of V(r) and then fitting the perturbative expansion of \({{\tilde{V}}}(p)\) in terms of \(\alpha _{\overline{\mathrm {MS}}}(p)\). Of course, the Fourier transform requires some modelling of the r-dependence of V(r) at short and at large distances. The authors fit a linearly rising potential at large distances together with string-like corrections of order \(r^{-n}\) and define the potential at large distances by this fit (Footnote 71). Recall that for observables in momentum space we take the renormalization scale entering our criteria as \(\mu =q\), Eq. (298). The analysis (as in ETM 11C [741]) is dominated by the data at the smallest lattice spacing, where a controlled determination of the overall scale is difficult due to possible finite-size effects. Karbstein 18 [682] is a reanalysis of Karbstein 14 and supersedes it. Some data with a different discretization of the static quark are added (on the same configurations) and the discrete lattice results for the static potential in position space are first parameterized by a continuous function, which then allows for an analytical Fourier transformation to momentum space.

Similarly, for Takaura 18 [683, 684] the momentum-space potential \({\tilde{V}}(Q)\) is the central object. Namely, they assume that renormalon / power law effects are absent in \({\tilde{V}}(Q)\) and only come in through the Fourier transformation. They provide evidence that renormalon effects (both \(u=1/2\) and \(u=3/2\)) can be subtracted and arrive at a nonperturbative term \(k\,\Lambda _{\overline{\mathrm {MS}}}^3 r^2\). Two different analyses are carried out, with the final result taken from “Analysis II”. Our numbers, including the evaluation of the criteria, refer to it. Together with the perturbative 3-loop (including the \( \alpha _s^4\log \alpha _s\) term) expression, this term is fitted to the nonperturbative results for the potential in the region \(0.04\,\mathrm{fm}\, \le \, r \,\le 0.35\,\mathrm{fm}\), where \(0.04\,\mathrm{fm}\) is \(r=a\) on the finest lattice. The NP potential data originate from JLQCD ensembles (Symanzik improved gauge action and Möbius domain-wall quarks) at three lattice spacings with a pion mass around \(300\,\,\mathrm {MeV}\). Since at the maximal distance in the analysis we find \(\alpha _{\overline{\mathrm {MS}}}(2/r) = 0.43\), the renormalization scale criterion yields a . The perturbative behavior is because of the high order in PT known. The continuum limit criterion yields a .

One of the main issues for all these computations is whether the perturbative running of the coupling constant has been reached. While for \(N_f=0\) fermions Brambilla 10 [742] reports agreement with perturbative behavior at the smallest distances, Husung 17 [681] (which goes to shorter distances) finds relatively large corrections beyond the 3-loop \(\alpha _\mathrm {qq}\). For dynamical fermions, Bazavov 12 [740] and Bazavov 14 [80] report good agreement with perturbation theory after the renormalon is subtracted or eliminated.

A second issue is the coverage of configuration space in some of the simulations, which use very small lattice spacings with periodic boundary conditions. Affected are the smallest two lattice spacings of Bazavov 14 [80] where very few tunnelings of the topological charge occur [401]. With present knowledge, it also seems possible that the older data by Refs. [695, 739] used by Brambilla 10 [742] are partially obtained with (close to) frozen topology.

The recent computation Husung 17 [681], for \(N_f = 0\) flavours first determines the coupling \({\bar{g}}_{\mathrm{qq}}^2(r,a)\) from the force and then performs a continuum extrapolation on lattices down to \(a \approx 0.015\,\text{ fm }\), using a step-scaling method at short distances, . Using the 4-loop \(\beta ^{\mathrm{qq}}\) function this allows \(r_0\Lambda _{\mathrm{qq}}\) to be estimated, which is then converted to the \(\overline{\mathrm{MS}}\) scheme. \(\alpha _{\mathrm{eff}} = \alpha _{\mathrm{qq}}\) ranges from \(\sim 0.17\) to large values; we give for renormalization scale and for perturbative behaviour. The range \(a\mu = 2a/r \approx 0.37\) - 0.14 leads to a in the continuum extrapolation.

We note that the \(N_{ f}=3\) determinations of \(r_0 \Lambda _{\overline{\mathrm{MS}}}\) agree within their errors of 4–6%.

Table 55 Vacuum polarization results

9.5 \(\alpha _s\) from the vacuum polarization at short distances

9.5.1 General considerations

The vacuum polarization function for the flavour nonsinglet currents \(J^a_\mu \) (\(a=1,2,3\)) in the momentum representation is parameterized as

$$\begin{aligned} \langle J^a_\mu J^b_\nu \rangle= & {} \delta ^{ab} [(\delta _{\mu \nu }Q^2 - Q_\mu Q_\nu ) \Pi ^{(1)}(Q) \nonumber \\&- Q_\mu Q_\nu \Pi ^{(0)}(Q)], \end{aligned}$$
(315)

where \(Q_\mu \) is a space-like momentum and \(J_\mu \equiv V_\mu \) for a vector current and \(J_\mu \equiv A_\mu \) for an axial-vector current. Defining \(\Pi _J(Q)\equiv \Pi _J^{(0)}(Q)+\Pi _J^{(1)}(Q)\), the operator product expansion (OPE) of the vacuum polarization function \(\Pi _{V+A}(Q)=\Pi _V(Q)+\Pi _A(Q)\) is given by

$$\begin{aligned}&\Pi _{V+A}|_{\mathrm{OPE}}(Q^2,\alpha _s) \nonumber \\&\quad = c + C_1(Q^2) + C_m^{V+A}(Q^2) \frac{{\bar{m}}^2(Q)}{Q^2} \nonumber \\&\qquad + \sum _{q=u,d,s}C_{{\bar{q}}q}^{V+A}(Q^2) \frac{\langle m_q{\bar{q}}q \rangle }{Q^4}\nonumber \\&\qquad + C_{GG}(Q^2) \frac{\langle \alpha _s GG\rangle }{Q^4}+{{\mathcal {O}}}(Q^{-6}) \,, \end{aligned}$$
(316)

for large \(Q^2\). The perturbative coefficient functions \(C_X^{V+A}(Q^2)\) for the operators X (\(X=1\), \({\bar{q}}q\), GG) are given as \(C_X^{V+A}(Q^2)=\sum _{i\ge 0}\left( C_X^{V+A}\right) ^{(i)}\alpha _s^i(Q^2)\) and \({{\bar{m}}}\) is the running mass of the mass-degenerate up and down quarks. \(C_1\) is known up to and including order \(\alpha _s^4\) in a continuum renormalization scheme such as the \(\overline{\mathrm{MS}}\) scheme [744,745,746,747]. Nonperturbatively, there are terms in \(C_X\) that do not have a series expansion in \(\alpha _s\). For an example involving the unit operator, see Ref. [748]. The term c is Q-independent and divergent in the limit of infinite ultraviolet cutoff. However, the Adler function, defined as

$$\begin{aligned} D(Q^2) \equiv - Q^2 { d\Pi (Q^2) \over dQ^2} \,, \end{aligned}$$
(317)

is a scheme-independent finite quantity. Therefore one can determine the running coupling constant in the \(\overline{\mathrm{MS}}\) scheme from the vacuum polarization function computed by a lattice-QCD simulation. In more detail, the lattice data of the vacuum polarization is fitted with the perturbative formula Eq. (316) with fit parameter \(\Lambda _{\overline{\mathrm{MS}}}\) parameterizing the running coupling \(\alpha _{\overline{\mathrm{MS}}}(Q^2)\).
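The cancellation of the divergent constant c in Eq. (317) can be illustrated numerically. The following is a minimal sketch, assuming a toy leading-logarithm form \(\Pi (Q^2) = c - k\ln Q^2\) (the function `toy_pi`, the value of `k` and the step size are illustrative choices, not lattice data): the Adler function obtained by finite differences does not depend on c.

```python
import math

def adler(pi, q2, rel_step=1e-4):
    """Adler function D(Q^2) = -Q^2 dPi/dQ^2 via a central finite difference."""
    h = rel_step * q2
    return -q2 * (pi(q2 + h) - pi(q2 - h)) / (2.0 * h)

def toy_pi(c, k=1.0 / (4.0 * math.pi**2)):
    """Toy vacuum polarization c - k*ln(Q^2); c and k are illustrative values."""
    return lambda q2: c - k * math.log(q2)

# Same Q^2, two very different additive constants c.
d_small_c = adler(toy_pi(c=0.0), q2=4.0)
d_large_c = adler(toy_pi(c=1.0e3), q2=4.0)
```

For this toy form both evaluations return k up to discretization and rounding errors, whatever value of c is chosen.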

While there is no problem in discussing the OPE at the nonperturbative level, the ‘condensates’ such as \({\langle \alpha _s GG\rangle }\) are ambiguous, since they mix with lower-dimensional operators including the unit operator. Therefore one should work in the high-\(Q^2\) regime where power corrections are negligible within the given accuracy. Thus, setting the renormalization scale as \(\mu \equiv \sqrt{Q^2}\), one should seek, as always, the window \(\Lambda _{\mathrm{QCD}} \ll \mu \ll a^{-1}\).

9.5.2 Discussion of computations

Results using this method are, to date, only available using overlap fermions or domain wall fermions. Since the last review, FLAG 16, there has been one new computation, Hudspith 18 [685]. These are collected in Table 55 for \(N_f=2\), JLQCD/TWQCD 08C [751] and for \(N_f = 2\,+\,1\), JLQCD 10 [750] and Hudspith 18 [685].

We first discuss the results of JLQCD/TWQCD 08C [751] and JLQCD 10 [750]. The fit to Eq. (316) is done with the 4-loop relation between the running coupling and \(\Lambda _{{\overline{\mathrm {MS}}}}\). It is found that without introducing condensate contributions, the momentum scale where the perturbative formula gives good agreement with the lattice results is very narrow, \(aQ \simeq 0.8{-}1.0\). When a condensate contribution is included the perturbative formula gives good agreement with the lattice results for the extended range \(aQ \simeq 0.6-1.0\). Since there is only a single lattice spacing \(a \approx 0.11\,\text{ fm }\) there is a for the continuum limit. The renormalization scale \(\mu \) is in the range of \(Q=1.6-2\,\text{ GeV }\). Approximating \(\alpha _{\mathrm{eff}}\approx \alpha _{\overline{\mathrm{MS}}}(Q)\), we estimate that \(\alpha _{\mathrm{eff}}=0.25-0.30\) for \(N_f=2\) and \(\alpha _\mathrm{eff}=0.29-0.33\) for \(N_f=2\,+\,1\). Thus we give a and for \(N_{ f}=2\) and \(N_{ f}=2\,+\,1\), respectively, for the renormalization scale and a for the perturbative behaviour.

A further investigation of this method was initiated in Hudspith 15 [749] and completed by Hudspith 18 [685] (see also [752]) based on domain wall fermion configurations at three lattice spacings, \(a^{-1} = 1.78,\, 2.38,\, 3.15 \,\,\mathrm {GeV}\), with three different light quark masses on the two coarser lattices and one on the fine lattice. An extensive discussion of condensates, using continuum finite-energy sum rules, was employed to estimate where their contributions might be negligible. It was found that even up to terms of \(O((1/Q^2)^8)\) (a higher order than depicted in Eq. (316) but with constant coefficients) no single condensate dominates and apparent convergence was poor for low \(Q^2\) due to cancellations between contributions of similar size with alternating signs. (See, e.g., the list given by Hudspith 15 [749].) Choosing \(Q^2\) to be at least \(\sim 3.8\,\text{ GeV }^2\) mitigated the problem, but then the coarsest lattice had to be discarded, due to large lattice artifacts. So this gives a for continuum extrapolation. With the higher \(Q^2\) the quark-mass dependence of the results was negligible, so ensembles with different quark masses were averaged over. A range of \(Q^2\) from 3.8 – \(16\,\text{ GeV }^2\) gives \(\alpha _{\mathrm{eff}} = 0.31\) – 0.22, so there is a for the renormalization scale. The value of \(\alpha _{\mathrm{eff}}^3\) reaches \(\Delta \alpha _\mathrm{eff}/(8\pi b_0 \alpha _{\mathrm{eff}})\) and thus gives a for perturbative behaviour. In Hudspith 15 [749] (superseded by Hudspith 18 [685]) about a 20% difference in \(\Pi (Q^2)\) was seen between the two lattice spacings and a result is quoted only for the smaller a.
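The size of \(a\mu \) at the lower edge of the \(Q^2\) window can be checked with a short calculation. The sketch below uses the three inverse lattice spacings and the minimum \(Q^2\) quoted above; the helper names are hypothetical and the value of \(\hbar c\) is the standard conversion constant, not taken from the text.

```python
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV fm (standard conversion constant)

def a_times_mu(mu_gev, inv_a_gev):
    """Dimensionless product a*mu for a scale mu and a cutoff 1/a, both in GeV."""
    return mu_gev / inv_a_gev

def spacing_fm(inv_a_gev):
    """Lattice spacing in fm from the inverse spacing in GeV."""
    return HBARC_GEV_FM / inv_a_gev

inv_a = [1.78, 2.38, 3.15]      # GeV, as quoted for the three ensembles
q_min = math.sqrt(3.8)          # GeV, lower edge of the Q^2 window
a_q = [round(a_times_mu(q_min, x), 2) for x in inv_a]
# a_q == [1.1, 0.82, 0.62]
```

At the lower edge of the window the coarsest lattice already has \(aQ > 1\), which is consistent with the large lattice artifacts reported for it.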

9.6 \(\alpha _s\) from observables at the lattice spacing scale

9.6.1 General considerations

The general method is to evaluate a short-distance quantity \({{\mathcal {Q}}}\) at the scale of the lattice spacing \(\sim 1/a\) and then determine its relationship to \(\alpha _{\overline{\mathrm{MS}}}\) via a perturbative expansion.

This is epitomized by the strategy of the HPQCD collaboration [753, 754], discussed here for illustration, which computes and then fits to a variety of short-distance quantities, Y,

$$\begin{aligned} Y = \sum _{n=1}^{n_{\mathrm{max}}} c_n \alpha _{\mathrm {V'}}^n(q^*) \,. \end{aligned}$$
(318)

The quantity Y is taken as the logarithm of small Wilson loops (including some nonplanar ones), Creutz ratios, ‘tadpole-improved’ Wilson loops and the tadpole-improved or ‘boosted’ bare coupling (\({\mathcal {O}}(20)\) quantities in total). The perturbative coefficients \(c_n\) (each depending on the choice of Y) are known to \(n = 3\) with additional coefficients up to \(n_{\mathrm{max}}\) being fitted numerically. The running coupling \(\alpha _{\mathrm {V'}}\) is related to \(\alpha _{\mathrm {V}}\) from the static-quark potential (see Sect. 9.4).Footnote 72

The coupling constant is fixed at a scale \(q^* = d/a\). The latter is chosen as the mean value of \(\ln q\) with the one gluon loop as measure [755, 756]. (Thus a different result for d is found for every short-distance quantity.) A rough estimate yields \(d \approx \pi \), and in general the renormalization scale is always found to lie in this region.

For example, for the Wilson loop \(W_{mn} \equiv \langle W(ma,na) \rangle \) we have

$$\begin{aligned} \ln \left( \frac{W_{mn}}{u_0^{2(m+n)}}\right) = c_1 \alpha _{\mathrm {V'}}(q^*) + c_2 \alpha _{\mathrm {V'}}^2(q^*) + c_3 \alpha _{\mathrm {V'}}^3(q^*) + \cdots \,,\nonumber \\ \end{aligned}$$
(319)

for the tadpole-improved version, where \(c_1\), \(c_2\,, \ldots \) are the appropriate perturbative coefficients and \(u_0 = W_{11}^{1/4}\). Substituting the nonperturbative simulation value in the left hand side, we can determine \(\alpha _{\mathrm {V'}}(q^*)\), at the scale \(q^*\). Note that one finds empirically that perturbation theory for these tadpole-improved quantities has smaller \(c_n\) coefficients and so the series has a faster apparent convergence compared to the case without tadpole improvement.
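Extracting the coupling from a truncated series such as Eq. (318) or Eq. (319) amounts to a one-dimensional root search. The following is a minimal sketch using bisection; the coefficients in `toy_coeffs` are hypothetical placeholders (not the published \(c_n\)), and the actual HPQCD analysis is a simultaneous Bayesian fit rather than a single-quantity inversion.

```python
def series(alpha, coeffs):
    """Evaluate sum_n c_n alpha^n for coeffs = [c_1, c_2, ...]."""
    return sum(c * alpha ** (n + 1) for n, c in enumerate(coeffs))

def solve_alpha(y, coeffs, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for alpha in [lo, hi] with series(alpha) = y.

    Assumes the truncated series is monotonically increasing on [lo, hi].
    """
    if not (series(lo, coeffs) <= y <= series(hi, coeffs)):
        raise ValueError("y is not bracketed on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series(mid, coeffs) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

toy_coeffs = [3.0, -1.0, 2.0]       # hypothetical c_1, c_2, c_3
y_toy = series(0.2, toy_coeffs)     # synthetic "measured" value of Y
alpha_toy = solve_alpha(y_toy, toy_coeffs)
```

Here the synthetic input was generated at a coupling of 0.2, so the search recovers that value; with real data the left-hand side of Eq. (319) would be inserted for `y_toy`.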

Using the \(\beta \)-function in the \(\mathrm V'\) scheme, results can be run to a reference value, chosen as \(\alpha _0 \equiv \alpha _{\mathrm {V'}}(q_0)\), \(q_0 = 7.5\,\text{ GeV }\). This is then converted perturbatively to the continuum \({\overline{\mathrm {MS}}}\) scheme

$$\begin{aligned} \alpha _{\overline{\mathrm{MS}}}(q_0) = \alpha _0 + d_1 \alpha _0^2 + d_2 \alpha _0^3 + \cdots \,, \end{aligned}$$
(320)

where \(d_1, d_2\) are known 1- and 2-loop coefficients.

Other collaborations have focused more on the bare ‘boosted’ coupling constant and directly determined its relationship to \(\alpha _{\overline{\mathrm{MS}}}\). Specifically, the boosted coupling is defined by

$$\begin{aligned} \alpha _{\mathrm {P}}(1/a) = {1\over 4\pi } {g_0^2 \over u_0^4} \,, \end{aligned}$$
(321)

again determined at a scale \(\sim 1/a\). As discussed previously, since the plaquette expectation value in the boosted coupling contains the tadpole diagram contributions to all orders, which are dominant contributions in perturbation theory, there is an expectation that the perturbation theory using the boosted coupling has smaller perturbative coefficients [755], and hence smaller perturbative errors.
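As a small illustration of Eq. (321), the sketch below computes the boosted coupling from the bare lattice coupling and the plaquette, using \(u_0^4 = W_{11}\). It assumes the common SU(3) convention \(\beta = 6/g_0^2\), which is not stated in the text, and the plaquette value is a hypothetical input.

```python
import math

def bare_coupling(beta):
    """Bare alpha_0 = g0^2 / (4 pi), assuming beta = 6 / g0^2 for SU(3)."""
    return 6.0 / beta / (4.0 * math.pi)

def boosted_coupling(beta, plaquette):
    """alpha_P(1/a) = g0^2 / (4 pi u0^4) with u0^4 = W_11 (the plaquette)."""
    g0_sq = 6.0 / beta
    return g0_sq / (4.0 * math.pi * plaquette)

alpha_bare = bare_coupling(6.0)
alpha_p = boosted_coupling(6.0, plaquette=0.6)   # hypothetical plaquette value
```

Since the plaquette expectation value is below one, the boosted coupling is larger than the bare one by the factor \(1/W_{11}\).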

9.6.2 Continuum limit

Lattice results always come with discretization errors, which one needs to remove by a continuum extrapolation. As mentioned previously, in this respect the present method differs in principle from those in which \(\alpha _s\) is determined from physical observables. In the general case, the numerical results of the lattice simulations at a value of \(\mu \) fixed in physical units can be extrapolated to the continuum limit, and the result can be analyzed as to whether it shows perturbative running as a function of \(\mu \) in the continuum. For observables at the cutoff-scale (\(q^*=d/a\)), discretization effects cannot easily be separated out from perturbation theory, as the scale for the coupling comes from the lattice spacing. Therefore the restriction \(a\mu \ll 1\) (the ‘continuum extrapolation’ criterion) is not applicable here. Discretization errors of order \(a^2\) are, however, present. Since \(a\sim \exp (-1/(2b_0 g_0^2)) \sim \exp (-1/(8\pi b_0 \alpha (q^*)))\), these errors now appear as power corrections to the perturbative running, and have to be taken into account in the study of the perturbative behaviour, which is to be verified by changing a. One thus usually fits with power corrections in this method.

In order to keep a symmetry with the ‘continuum extrapolation’ criterion for physical observables and to remember that discretization errors are, of course, relevant, we replace it here by one for the lattice spacings used:

  • Lattice spacings

    • 3 or more lattice spacings, at least 2 points below \(a = 0.1\,\text{ fm }\)

    • 2 lattice spacings, at least 1 point below \(a = 0.1\,\text{ fm }\)

    • otherwise
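The three-level criterion above can be written as a simple classification. This is a sketch under stated assumptions: the function name and the integer tier labels (1 for the first bullet, 2 for the second, 3 for "otherwise") are hypothetical, and "2 lattice spacings" is read as "at least 2", so that a set with three spacings of which only one lies below 0.1 fm falls into the second tier.

```python
def lattice_spacing_tier(spacings_fm, threshold_fm=0.1):
    """Classify a set of lattice spacings (in fm) according to the list above.

    Returns 1, 2 or 3 for the first, second and 'otherwise' entries.
    """
    n_total = len(spacings_fm)
    n_fine = sum(1 for a in spacings_fm if a < threshold_fm)
    if n_total >= 3 and n_fine >= 2:
        return 1
    if n_total >= 2 and n_fine >= 1:
        return 2
    return 3

# Hypothetical inputs, not taken from any of the computations discussed.
tier_a = lattice_spacing_tier([0.12, 0.09, 0.06])   # three spacings, two fine
tier_b = lattice_spacing_tier([0.12, 0.09])         # two spacings, one fine
tier_c = lattice_spacing_tier([0.15, 0.12])         # no spacing below 0.1 fm
```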

9.6.3 Discussion of computations

Note that due to \(\mu \sim 1/a\) being relatively large the results easily have a or in the rating on renormalization scale.

The work of El-Khadra 92 [762] employs a 1-loop formula to relate \(\alpha ^{(0)}_{\overline{\mathrm{MS}}}(\pi /a)\) to the boosted coupling for three lattice spacings \(a^{-1} = 1.15\), 1.78, \(2.43\,\text{ GeV }\). (The lattice spacing is determined from the charmonium 1S-1P splitting.) They obtain \(\Lambda ^{(0)}_{\overline{\mathrm{MS}}}=234\,\text{ MeV }\), corresponding to \(\alpha _{\mathrm{eff}} = \alpha ^{(0)}_{\overline{\mathrm{MS}}}(\pi /a) \approx \) 0.15–0.2. The work of Aoki 94 [761] calculates \(\alpha ^{(2)}_V\) and \(\alpha ^{(2)}_{\overline{\mathrm{MS}}}\) for a single lattice spacing \(a^{-1}\sim 2\,\text{ GeV }\), again determined from charmonium 1S-1P splitting in two-flavour QCD. Using 1-loop perturbation theory with boosted coupling, they obtain \(\alpha ^{(2)}_V=0.169\) and \(\alpha ^{(2)}_{\overline{\mathrm{MS}}}=0.142\). Davies 94 [760] gives a determination of \(\alpha _{\mathrm {V}}\) from the expansion

$$\begin{aligned} -\ln W_{11}\equiv & {} \frac{4\pi }{3}\alpha _{\mathrm {V}}^{(N_f)}(3.41/a) \nonumber \\&\times \left[ 1 - (1.185+0.070N_f)\alpha _{\mathrm {V}}^{(N_f)}\right] \,, \end{aligned}$$
(322)

neglecting higher-order terms. They compute the \(\Upsilon \) spectrum in \(N_f=0\), 2 QCD for single lattice spacings at \(a^{-1} = 2.57\), \(2.47\,\text{ GeV }\) and obtain \(\alpha _{\mathrm {V}}(3.41/a)\simeq \) 0.15, 0.18, respectively. Extrapolating the inverse coupling linearly in \(N_f\), a value of \(\alpha _{\mathrm {V}}^{(3)}(8.3\,\text{ GeV }) = 0.196(3)\) is obtained. SESAM 99 [758] follows a similar strategy, again for a single lattice spacing. They linearly extrapolated results for \(1/\alpha _{\mathrm {V}}^{(0)}\), \(1/\alpha _{\mathrm {V}}^{(2)}\) at a fixed scale of \(9\,\text{ GeV }\) to give \(\alpha _{\mathrm {V}}^{(3)}\), which is then perturbatively converted to \(\alpha _{\overline{\mathrm{MS}}}^{(3)}\). This finally gave \(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z) = 0.1118(17)\). Wingate 95 [759] also follows this method. With the scale determined from the charmonium 1S-1P splitting for single lattice spacings in \(N_f = 0\), 2 giving \(a^{-1}\simeq 1.80\,\text{ GeV }\) for \(N_f=0\) and \(a^{-1}\simeq 1.66\,\text{ GeV }\) for \(N_f=2\), they obtain \(\alpha _{\mathrm {V}}^{(0)}(3.41/a)\simeq 0.15\) and \(\alpha _{\mathrm {V}}^{(2)}\simeq 0.18\), respectively. Extrapolating the coupling linearly in \(N_f\), they obtain \(\alpha _{\mathrm {V}}^{(3)}(6.48\,\text{ GeV })=0.194(17)\).
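Equation (322) is a quadratic in \(\alpha _{\mathrm {V}}\), and the flavour extrapolations described above are straight lines through the \(N_f=0\) and \(N_f=2\) points. The sketch below shows both steps; the function names are hypothetical. It uses the rounded couplings 0.15 and 0.18 quoted in the text and ignores the fact that they refer to slightly different scales, so it is not expected to reproduce the published values exactly.

```python
import math

def alpha_v_from_plaquette(minus_ln_w11, nf):
    """Solve -ln W11 = (4 pi/3) alpha [1 - (1.185 + 0.070 nf) alpha].

    Returns the smaller (perturbative) root of the quadratic.
    """
    k = 1.185 + 0.070 * nf
    c = 3.0 * minus_ln_w11 / (4.0 * math.pi)
    disc = 1.0 - 4.0 * k * c
    if disc < 0.0:
        raise ValueError("no real solution")
    return (1.0 - math.sqrt(disc)) / (2.0 * k)

def extrapolate_in_nf(x0, x2, nf=3):
    """Straight line through (0, x0) and (2, x2), evaluated at nf."""
    return x0 + (x2 - x0) * nf / 2.0

# Linear in the coupling versus linear in the inverse coupling.
alpha_lin = extrapolate_in_nf(0.15, 0.18)                   # 0.195
alpha_inv = 1.0 / extrapolate_in_nf(1 / 0.15, 1 / 0.18)     # 0.200
```

With these rounded inputs the two extrapolation prescriptions differ by about 0.005 in \(\alpha _{\mathrm {V}}^{(3)}\), which gives a feeling for the model dependence of a two-point extrapolation in \(N_f\).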

The QCDSF/UKQCD collaboration, QCDSF/UKQCD 05 [757, 763,764,765], use the 2-loop relation (re-written here in terms of \(\alpha \))

$$\begin{aligned} {1 \over \alpha _{\overline{\mathrm{MS}}}(\mu )}= & {} {1 \over \alpha _{\mathrm {P}}(1/a)} + 4\pi \left( 2b_0\ln a\mu - t_1^{\mathrm{P}}\right) \nonumber \\&+ (4\pi )^2\left( 2b_1\ln a\mu - t_2^\mathrm{P}\right) \alpha _{\mathrm {P}}(1/a) \,, \end{aligned}$$
(323)

where \(t_1^{\mathrm{P}}\) and \(t_2^{\mathrm{P}}\) are known. (A 2-loop relation corresponds to a 3-loop lattice \(\beta \)-function.) This was used to directly compute \(\alpha _{{\overline{\mathrm{MS}}}}\), and the scale was chosen so that the \({\mathcal {O}}(\alpha _{\mathrm {P}}^0)\) term vanishes, i.e.,

$$\begin{aligned} \mu ^* = {1 \over a} \exp {[t_1^{\mathrm{P}}/(2b_0)] } \approx \left\{ \begin{array}{ll} 2.63/a &{}\quad N_f = 0 \\ 1.4/a &{}\quad N_f = 2 \\ \end{array} \right. \,. \end{aligned}$$
(324)

The method is to first compute \(\alpha _{\mathrm {P}}(1/a)\) and from this, using Eq. (323) to find \(\alpha _{\overline{\mathrm{MS}}}(\mu ^*)\). The RG equation, Eq. (280), then determines \(\mu ^*/\Lambda _{\overline{\mathrm{MS}}}\) and hence using Eq. (324) leads to the result for \(r_0\Lambda _{\overline{\mathrm{MS}}}\). This avoids giving the scale in \(\text{ MeV }\) until the end. In the \(N_{ f}=0\) case seven lattice spacings were used [695], giving a range \(\mu ^*/\Lambda _{\overline{\mathrm{MS}}} \approx \) 24–72 (or \(a^{-1} \approx \) 2–7 GeV) and \(\alpha _{\mathrm{eff}} = \alpha _{\overline{\mathrm{MS}}}(\mu ^*) \approx \) 0.15–0.10. Neglecting higher-order perturbative terms (see discussion after Eq. (325) below) in Eq. (323) this is sufficient to allow a continuum extrapolation of \(r_0\Lambda _{\overline{\mathrm{MS}}}\). A similar computation for \(N_f = 2\) by QCDSF/UKQCD 05 [757] gave \(\mu ^*/\Lambda _{\overline{\mathrm{MS}}} \approx \) 12–17 (or roughly \(a^{-1} \approx \) 2–3 GeV) and \(\alpha _{\mathrm{eff}} = \alpha _{\overline{\mathrm{MS}}}(\mu ^*) \approx \) 0.20–0.18. The \(N_f=2\) results of QCDSF/UKQCD 05 [757] are affected by an uncertainty which was not known at the time of publication: It has been realized that the values of \(r_0/a\) of Ref. [757] were significantly too low [693]. As this effect is expected to depend on a, it influences the perturbative behaviour leading us to assign a for that criterion.
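The conversion in Eqs. (323) and (324) can be sketched in a few lines. The code below assumes the standard 1- and 2-loop coefficients \(b_0 = (11 - 2N_f/3)/(4\pi )^2\) and \(b_1 = (102 - 38N_f/3)/(4\pi )^4\); the function names are hypothetical, the value of `t1_nf0` is inferred here from the quoted \(\mu ^* \approx 2.63/a\) rather than taken from the literature, and \(t_2^{\mathrm{P}}\) is left as a free input.

```python
import math

def b0(nf):
    """1-loop beta-function coefficient (assumed standard convention)."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2

def b1(nf):
    """2-loop beta-function coefficient (assumed standard convention)."""
    return (102.0 - 38.0 * nf / 3.0) / (4.0 * math.pi) ** 4

def mu_star_times_a(t1, nf):
    """a*mu^* of Eq. (324): the scale where the O(alpha_P^0) term vanishes."""
    return math.exp(t1 / (2.0 * b0(nf)))

def inv_alpha_msbar(alpha_p, a_mu, t1, t2, nf):
    """Right-hand side of Eq. (323) for given alpha_P(1/a) and a*mu."""
    log = math.log(a_mu)
    return (1.0 / alpha_p
            + 4.0 * math.pi * (2.0 * b0(nf) * log - t1)
            + (4.0 * math.pi) ** 2 * (2.0 * b1(nf) * log - t2) * alpha_p)

# t1 for N_f = 0 inferred from the quoted a*mu^* = 2.63 (illustrative only).
t1_nf0 = 2.0 * b0(0) * math.log(2.63)
```

At \(\mu = \mu ^*\) the difference \(1/\alpha _{\overline{\mathrm{MS}}} - 1/\alpha _{\mathrm {P}}\) is then purely of order \(\alpha _{\mathrm {P}}\), which is the property used to choose the scale.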

Since FLAG 13, there has been one new result for \(N_f = 0\) by FlowQCD 15 [679], later updated and published in Kitazawa 16 [686]. They also use the techniques described in Eqs. (323), (324), but together with the gradient flow scale \(w_0\) (rather than the \(r_0\) scale) leading to a determination of \(w_0\Lambda _{\overline{\mathrm{MS}}}\). The continuum limit is estimated by extrapolating the data at 6 lattice spacings linearly in \(a^2\). The data range used is \(\mu ^*/\Lambda _{\overline{\mathrm{MS}}} \approx \) 50–120 (or \(a^{-1} \approx \) 5–11 GeV) and \(\alpha _{\overline{\mathrm{MS}}}(\mu ^*) \approx \) 0.12–0.095. Since a very small value of \(\alpha _{\overline{\mathrm {MS}}}\) is reached, there is a in the perturbative behaviour. Note that our conversion to the common \(r_0\) scale unfortunately leads to a significant increase of the error of the \(\Lambda \) parameter compared to using \(w_0\) directly [701]. Again we note that the results of QCDSF/UKQCD 05 [757] (\(N_f = 0\)) and Kitazawa 16 [686] may be affected by frozen topology as they have lattice spacings significantly below \(a = 0.05\,\text{ fm }\). Kitazawa 16 [686] investigate this by evaluating \(w_0/a\) in a fixed topology and estimate any effect at about \(1\%\).

The work of HPQCD 05A [753] (which supersedes the original work [766]) uses three lattice spacings \(a^{-1} \approx 1.2\), 1.6, \(2.3\,\text{ GeV }\) for \(2\,+\,1\) flavour QCD. Typically the renormalization scale \(q \approx \pi /a \approx \) 3.50–7.10 GeV, corresponding to \(\alpha _\mathrm {V'} \approx \) 0.22–0.28.

In the later update HPQCD 08A [754] twelve data sets (with six lattice spacings) are now used reaching up to \(a^{-1} \approx 4.4\,\text{ GeV }\), corresponding to \(\alpha _\mathrm {V'}\approx 0.18\). The values used for the scale \(r_1\) were further updated in HPQCD 10 [13]. Maltman 08 [82] uses most of the same lattice ensembles as HPQCD 08A [754], but not the one at the smallest lattice spacing, \(a\approx 0.045\) fm. Maltman 08 [82] also considers a much smaller set of quantities (three versus 22) that are less sensitive to condensates. They also use different strategies for evaluating the condensates and for the perturbative expansion, and a slightly different value for the scale \(r_1\). The central values of the final results from Maltman 08 [82] and HPQCD 08A [754] differ by 0.0009 (which would be decreased to 0.0007 taking into account a reduction of 0.0002 in the value of the \(r_1\) scale used by Maltman 08 [82]).

As mentioned before, the perturbative coefficients are computed through 3-loop order [767], while the higher-order perturbative coefficients \(c_n\) with \( n_{\mathrm{max}} \ge n > 3\) (with \(n_{\mathrm{max}} = 10\)) are numerically fitted using the lattice-simulation data for the lattice spacings with the help of Bayesian methods. It turns out that corrections in Eq. (319) are of order \(|c_i/c_1|\alpha ^i=\) 5–15% and 3–10% for i = 2, 3, respectively. The inclusion of a fourth-order term is necessary to obtain a good fit to the data, and leads to a shift of the result by 1 – 2 sigma. For all but one of the 22 quantities, central values of \(|c_4/c_1|\approx \) 2–4 were found, with errors from the fits of \(\approx 2\).

An important source of uncertainty is the truncation of perturbation theory. In HPQCD 08A [754], 10 [13] it is estimated to be about 0.4% of \(\alpha _{\overline{\mathrm {MS}}}(M_Z)\). In FLAG 13 we included a rather detailed discussion of the issue with the result that we prefer for the time being a more conservative error based on the above estimate \(|c_4/c_1| = 2\). From Eq. (318) this gives an estimate of the uncertainty in \(\alpha _{\mathrm{eff}}\) of

$$\begin{aligned} \Delta \alpha _{\mathrm{eff}}(\mu _1) = \left| {c_4 \over c_1}\right| \alpha _{\mathrm{eff}}^4(\mu _1) \,, \end{aligned}$$
(325)

at the scale \(\mu _1\) where \(\alpha _{\mathrm{eff}}\) is computed from the Wilson loops. This can be used with a variation in \(\Lambda \) at lowest order of perturbation theory and also applied to \(\alpha _s\) evolved to a different scale \(\mu _2\),Footnote 73

$$\begin{aligned} {\Delta \Lambda \over \Lambda } = {1\over 8\pi b_0 \alpha _s} {\Delta \alpha _s \over \alpha _s} \,, \qquad {\Delta \alpha _s(\mu _2) \over \Delta \alpha _s(\mu _1)} = {\alpha _s^2(\mu _2) \over \alpha _s^2(\mu _1)} \,. \end{aligned}$$
(326)

With \(\mu _2 = M_Z\) and \(\alpha _s(\mu _1)=0.2\) (a typical value extracted from Wilson loops in HPQCD 10 [13], HPQCD 08A [754] at \(\mu = 5\,\text{ GeV }\)) we have

$$\begin{aligned} \Delta \alpha _{\overline{\mathrm {MS}}}(m_Z) = 0.0012 \,, \end{aligned}$$
(327)

which we shall later use as the typical perturbative uncertainty of the method with \(2\,+\,1\) fermions.
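The arithmetic behind Eqs. (325) and (326) can be reproduced as follows. This is a sketch with hypothetical function names; the value \(\alpha _s(M_Z) \approx 0.118\) and the choice \(N_f = 3\) in \(b_0\) are assumed inputs that are not specified in the text, so the output should be compared with Eq. (327) only in order of magnitude.

```python
import math

def b0(nf):
    """1-loop beta-function coefficient (assumed standard convention)."""
    return (11.0 - 2.0 * nf / 3.0) / (4.0 * math.pi) ** 2

def delta_alpha_eff(alpha, c4_over_c1=2.0):
    """Eq. (325): truncation uncertainty |c4/c1| * alpha^4."""
    return c4_over_c1 * alpha ** 4

def propagate(delta_alpha_1, alpha_1, alpha_2):
    """Second relation of Eq. (326): rescale the uncertainty to another scale."""
    return delta_alpha_1 * alpha_2 ** 2 / alpha_1 ** 2

def rel_delta_lambda(delta_alpha, alpha, nf):
    """First relation of Eq. (326): Delta Lambda / Lambda at lowest order."""
    return delta_alpha / (8.0 * math.pi * b0(nf) * alpha ** 2)

d1 = delta_alpha_eff(0.2)            # 0.0032 at the scale mu_1
d_mz = propagate(d1, 0.2, 0.118)     # alpha_s(M_Z) = 0.118 is an assumed input
```

With these inputs the propagated uncertainty comes out at about 0.0011, the same size as the value quoted in Eq. (327); the exact figure depends on the \(\alpha _s(M_Z)\) that is inserted. By construction, `rel_delta_lambda` gives the same \(\Delta \Lambda /\Lambda \) at both scales.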

Table 56 Wilson loop results. Some early results for \(N_{ f}=0, 2\) did not determine \(\Lambda _{\overline{\mathrm {MS}}}\)

Table 56 summarizes the results. The \(N_f=3\) determinations of \(r_0 \Lambda \) agree within their errors of 3–5%.

9.7 \(\alpha _s\) from heavy-quark current two-point functions

9.7.1 General considerations

The method was introduced in HPQCD 08, Ref. [171], and updated in HPQCD 10, Ref. [13]; see also Ref. [768]. In addition, there is a 2 + 1 + 1 flavour result, HPQCD 14A [16]. Since FLAG 16 two new results have appeared: JLQCD 16 [23] and Maezawa 16 [157].

The basic observable is constructed from a current

$$\begin{aligned} J(x) = i m_c{{\overline{\psi }}}_c(x)\gamma _5\psi _{c'}(x) \end{aligned}$$
(328)

of two mass-degenerate heavy-valence quarks, c, \(c^\prime \), usually taken to be at or around the charm quark mass. The pre-factor \(m_c\) denotes the bare mass of the quark. When the lattice discretization respects chiral symmetry, J(x) is a renormalization group invariant local field, i.e., it requires no renormalization. Staggered fermions and twisted mass fermions have such a residual chiral symmetry. The (Euclidean) time-slice correlation function

$$\begin{aligned} G(x_0) = a^3 \sum _{\vec {x}} \langle J^\dagger (x) J(0) \rangle \,, \end{aligned}$$
(329)

(\(J^\dagger (x) = im_c{{\overline{\psi }}}_{c'}(x)\gamma _5\psi _c(x)\)) has a \(\sim x_0^{-3}\) singularity at short distances and moments

$$\begin{aligned} G_n = a \sum _{x_0=-(T/2-a)}^{T/2-a} x_0^n \,G(x_0) \, \end{aligned}$$
(330)

are nonvanishing for even n and furthermore finite for \(n \ge 4\). Here T is the time extent of the lattice. The moments are dominated by contributions at t of order \(1/m_c\). For large mass \(m_c\) these are short distances and the moments become increasingly perturbative for decreasing n. Denoting the lowest-order perturbation theory moments by \(G_n^{(0)}\), one defines the normalized moments

$$\begin{aligned} R_n = \left\{ \begin{array}{ll} G_4/G_4^{(0)} &{} \quad \text{ for } n=4 \,, \\ {am_{\eta _c}\over 2am_c} \left( { G_n \over G_n^{(0)}} \right) ^{1/(n-4)} &{}\quad \text{ for } n \ge 6 \,, \\ \end{array} \right. \end{aligned}$$
(331)

of even order n. Note that Eq. (328) contains the variable (bare) heavy-quark mass \(m_c\). The normalization \(G_n^{(0)}\) is introduced to help in reducing lattice artifacts. In addition, one can also define moments with different normalizations,

$$\begin{aligned} {{\tilde{R}}}_n = 2 R_n / m_{\eta _c} \quad \text{ for } n \ge 6\,. \end{aligned}$$
(332)

While \({{\tilde{R}}}_n\) also remains renormalization group invariant, it now also has a scale which might introduce an additional ambiguity [23].
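The definitions in Eqs. (330) and (331) translate directly into a sum over time slices. The following is a minimal sketch with hypothetical function names; the correlator `toy` is a symmetric exponential used only to exercise the code and has neither the short-distance singularity nor the physics content of a real charm correlator.

```python
import math

def moment(corr, n, a=1.0):
    """G_n = a * sum_{x0} x0^n G(x0); corr maps integer t = x0/a to G(x0)."""
    return a * sum((a * t) ** n * g for t, g in corr.items())

def reduced_moment(g_n, g_n_tree, n, am_eta_c=None, am_c=None):
    """R_n of Eq. (331); the mass arguments are only needed for n >= 6."""
    if n == 4:
        return g_n / g_n_tree
    return am_eta_c / (2.0 * am_c) * (g_n / g_n_tree) ** (1.0 / (n - 4))

T = 16                                            # time extent in lattice units
# Time slices run from -(T/2 - 1) to T/2 - 1, as in Eq. (330).
toy = {t: math.exp(-0.8 * abs(t)) for t in range(-(T // 2 - 1), T // 2)}

g4 = moment(toy, 4)      # even moment, nonvanishing
g5 = moment(toy, 5)      # odd moment, vanishes for a symmetric correlator
```

In an actual analysis `g_n_tree` would be the lowest-order lattice perturbation theory moment \(G_n^{(0)}\) computed with the same action and bare mass.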

The normalized moments can then be parameterized in terms of functions

$$\begin{aligned} R_n \equiv \left\{ \begin{array}{ll} r_4(\alpha _s(\mu )) &{}\quad \text{ for } n=4 \,, \\ {r_n(\alpha _s(\mu )) \over {\bar{m}}_c(\mu )} &{}\quad \text{ for } n \ge 6 \,, \\ \end{array} \right. \end{aligned}$$
(333)

with \({\bar{m}}_c(\mu )\) being the renormalized charm-quark mass. The reduced moments \(r_n\) have a perturbative expansion

$$\begin{aligned} r_n = 1 + r_{n,1}\alpha _s + r_{n,2}\alpha _s^2 + r_{n,3}\alpha _s^3 + \cdots \,, \end{aligned}$$
(334)

where the written terms \(r_{n,i}(\mu /{\bar{m}}_c(\mu ))\), \(i \le 3\) are known for low n from Refs. [769,770,771,772,773]. In practice, the expansion is performed in the \(\overline{\mathrm{MS}}\) scheme. Matching nonperturbative lattice results for the moments to the perturbative expansion, one determines an approximation to \(\alpha _{\overline{\mathrm{MS}}}(\mu )\) as well as \({{\bar{m}}}_c(\mu )\). With the lattice spacing (scale) determined from some extra physical input, this calibrates \(\mu \). As usual suitable pseudoscalar masses determine the bare quark masses, here in particular the charm mass, and then through Eq. (333) the renormalized charm-quark mass.

A difficulty with this approach is that large masses are needed to enter the perturbative domain. Lattice artifacts can then be sizeable and have a complicated form. The ratios in Eq. (331) use the tree-level lattice results in the usual way for normalization. This results in unity as the leading term in Eq. (334), suppressing some of the kinematical lattice artifacts. We note that in contrast to, e.g., the definition of \(\alpha _\mathrm {qq}\), here the cutoff effects are of order \(a^k\alpha _s\), while there the tree-level term defines \(\alpha _s\) and therefore the cutoff effects after tree-level improvement are of order \(a^k\alpha _s^2\).

Finite-size effects (FSE) due to the omission of \(|t| > T /2\) in Eq. (330) grow with n as \((m_{\eta _c}T/2)^n\, \exp {(-m_{\eta _c} T/2)}\). In practice, however, since the (lower) moments are short-distance dominated, the FSE are expected to be irrelevant at the present level of precision.
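The growth of this factor with n can be made concrete with a short calculation. The sketch below evaluates only the scaling form given above, so that ratios between moments are meaningful but absolute values are not; the function name and the value \(m_{\eta _c}T/2 = 20\) are hypothetical.

```python
import math

def fse_scale(n, m_t_half):
    """Scaling factor (m T/2)^n * exp(-m T/2) of the finite-size effect."""
    return m_t_half ** n * math.exp(-m_t_half)

x = 20.0                                        # hypothetical m_eta_c * T / 2
ratio_10_to_4 = fse_scale(10, x) / fse_scale(4, x)   # equals x**6
```

For this input the factor for \(n=10\) exceeds that for \(n=4\) by \(20^6\), which illustrates why the lower moments are the ones expected to be safe.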

Moments of correlation functions of the quark’s electromagnetic current can also be obtained from experimental data for \(e^+e^-\) annihilation [774, 775]. This enables a nonlattice determination of \(\alpha _s\) using a similar analysis method. In particular, the same continuum perturbation theory computation enters both the lattice and the phenomenological determinations.

9.7.2 Discussion of computations

The method was originally applied in HPQCD 08B [171] and in HPQCD 10 [13], based on the MILC ensembles with \(2 + 1\) flavours of Asqtad staggered quarks and HISQ valence quarks. Both use \(R_n\) while the latter also used a range of quark masses \(m_c\) in addition to the physical charm mass.

The scale was set using \(r_1 = 0.321(5)\,\text{ fm }\) in HPQCD 08B [171] and the updated value \(r_1 = 0.3133(23)\,\text{ fm }\) in HPQCD 10 [13]. The effective range of couplings used is here given for \(n = 4\), which is the moment most dominated by short (perturbative) distances and important in the determination of \(\alpha _s\). The range is similar for other ratios. With \(r_{4,1} = 0.7427\) and \(R_4 = 1.28\) determined in the continuum limit at the charm mass in Ref. [171], we have \(\alpha _{\mathrm{eff}} = 0.38\) at the charm-quark mass, which is the mass value where HPQCD 08B [171] carries out the analysis. In HPQCD 10 [13] a set of masses is used, with \({\tilde{R}}_4 \in [1.09, 1.29]\), which corresponds to \(\alpha _{\mathrm{eff}} \in [0.12, 0.40]\). The available data of HPQCD 10 [13] is reviewed in FLAG 13. For the continuum limit criterion, we choose the scale \(\mu = 2{{\bar{m}}}_c \approx m_{\eta _c}/1.1\), where we have taken \({{\bar{m}}}_c\) in the \({\overline{\mathrm {MS}}}\) scheme at scale \({{\bar{m}}}_c\) and the numerical value 1.1 was determined in HPQCD 10B [69]. With these choices for \(\mu \), the continuum limit criterion is satisfied for three lattice spacings when \(\alpha _\mathrm {eff} \le 0.3\) and \(n=4\).
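The effective coupling quoted above follows from keeping only the leading term of Eq. (334) for \(n=4\). The sketch below assumes that \(\alpha _{\mathrm{eff}}\) is defined through \(R_4 = 1 + r_{4,1}\alpha _{\mathrm{eff}}\) (an assumption consistent with the quoted numbers); the function name is hypothetical.

```python
def alpha_eff_from_r4(r4, r41=0.7427):
    """Leading-order effective coupling from R_4 = 1 + r_{4,1} * alpha_eff."""
    return (r4 - 1.0) / r41

alpha_charm = alpha_eff_from_r4(1.28)    # continuum R_4 at the charm mass
alpha_low = alpha_eff_from_r4(1.09)      # lower end of the HPQCD 10 range
```

With \(R_4 = 1.28\) this gives \(\alpha _{\mathrm{eff}} \approx 0.38\), and the lower end of the HPQCD 10 range, 1.09, gives approximately 0.12, in line with the values in the text.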

Larger-n moments are more influenced by nonperturbative effects. For the n values considered, adding a gluon condensate term only changed error bars slightly in HPQCD’s analysis. We note that HPQCD in their papers perform a global fit to all data using a joint expansion in powers of \(\alpha _s^n\), \((\Lambda /(m_{\eta _c}/2))^j\) to parameterize the heavy-quark mass dependence, and \(( am_{\eta _c}/2)^{2i}\) to parameterize the lattice-spacing dependence. To obtain a good fit, they must exclude data with \(am_{\eta _c} > 1.95\) and include lattice-spacing terms \(a^{2i}\) with i greater than 10. Because these fits include many more fit parameters than data points, HPQCD uses their expectations for the sizes of coefficients as Bayesian priors. The fits include data with masses as large as \(am_{\eta _c}/2 \sim 0.86\), so there is only minimal suppression of the many high-order contributions for the heavier masses. It is not clear, however, how sensitive the final results are to the larger \(am_{\eta _c}/2\) values in the data. The continuum limit of the fit is in agreement with a perturbative scale dependence (a 5-loop running \(\alpha _{\overline{\mathrm{MS}}}\) with a fitted 5-loop coefficient in the \(\beta \)-function is used). Indeed, Fig. 2 of Ref. [13] suggests that HPQCD’s fit describes the data well.

A more recent computation, HPQCD 14A [16], uses \({\tilde{R}}_n\) and is based on MILC’s 2 + 1 + 1 HISQ staggered ensembles. Compared to HPQCD 10 [13], valence and sea quarks now use the same discretization and the scale is set through the gradient flow scale \(w_0\), determined to be \(w_0=0.1715(9)\,\mathrm{fm}\) in Ref. [566]. A number of data points satisfy our continuum limit criterion \(a\mu < 1.5\), at two different lattice spacings. This does not by itself lead to a but the next-larger lattice spacing does not miss the criterion by much. We therefore assign a in that criterion.

Two new computations have appeared since the last FLAG report. Maezawa and Petreczky [157] computed the two-point functions of the \(c{\bar{c}}\) pseudoscalar operator and obtained \(R_4\), \(R_6/R_8\) and \(R_8/R_{10}\) based on the HotQCD collaboration HISQ staggered ensembles [401]. The scale is set by measuring \(r_1=0.3106(18)\) fm. Continuum limits are taken by fitting the lattice-spacing dependence, with the form \(a^2\,+\,a^4\) giving the best fit. For \(R_4\), they also employ other forms for fit functions such as \(a^2\), \(\alpha _s^{\mathrm{boosted}} a^2\,+\,a^4\), etc., the results agreeing within errors. Matching \(R_4\) with the 3-loop formula Eq. (334) through order \(\alpha _{\overline{\mathrm {MS}}}^3\) [769], where \(\mu \) is fixed to \(m_c\), they obtain \(\alpha ^{(3)}_{\overline{\mathrm{MS}}}(\mu =m_c) = 0.3697(54)(64)(15)\). The first error is statistical, the second is the uncertainty in the continuum extrapolation, and the third is the truncation error in the perturbative approximation of \(r_4\). This last error is estimated by the “typical size” of the missing 4-loop contribution, which they assume to be \(\alpha ^4_{\overline{\mathrm{MS}}}(\mu )\) multiplied by 2 times the 3-loop coefficient \(2 \times r_{4,3} \times \alpha ^4_{{\overline{\mathrm {MS}}}}(\mu ) = 0.2364 \times \alpha ^4_{{\overline{\mathrm {MS}}}}(\mu )\). The result is converted to

$$\begin{aligned} \alpha ^{(5)}_{{\overline{\mathrm {MS}}}}(M_Z) = 0.11622(84) \,. \end{aligned}$$
(335)

Since \(\alpha _{\mathrm{eff}}(2m_c)\) reaches 0.25, we assign for the criterion of the renormalization scale. As \(\Delta \Lambda / \Lambda \sim \alpha _{\mathrm{eff}}^2\), we assign for the criterion of perturbative behaviour. The lattice cutoff ranges as \(a^{-1} \) = 1.42–4.89 GeV with \(\mu =2m_c\sim 2.6\) GeV so that we assign for continuum extrapolation.

JLQCD 16 [23] also computed the two-point functions of the \(c{\bar{c}}\) pseudoscalar operator and obtained \(R_6\), \(R_8\), \(R_{10}\) and their ratios based on 2 + 1 flavour QCD with Möbius domain-wall quarks for three lattice cutoffs \(a^{-1}\) = 2.5, 3.6, 4.5 GeV. The scale is set by \(\sqrt{t_0}=0.1465(21)(13)\,\text{ fm }\). The continuum limit is taken assuming linear dependence on \(a^2\). They find a sizeable lattice-spacing dependence of \(R_4\), which is therefore not used in their analysis, but for \(R_6,R_8, R_{10}\) the dependence is mild giving reasonable control over the continuum limit. They use the perturbative formulae for the vacuum polarization in the pseudoscalar channel \(\Pi _{PS}\) through order \(\alpha _{\overline{\mathrm {MS}}}^3\) in the \(\overline{\mathrm{MS}}\) scheme [771, 772] to obtain \(\alpha ^{(4)}_{\overline{\mathrm{MS}}}\). Combining the matching of lattice results with continuum perturbation theory for \(R_6\), \(R_6/R_8\) and \(R_{10}\), they obtain \(\alpha ^{(4)}_{\overline{\mathrm{MS}}}(\mu =3\,\,\mathrm {GeV})=0.2528(127)\), where the error is dominated by the perturbative truncation error. To estimate the truncation error they study the dependence of the final result on the choice of the renormalization scales \(\mu _\alpha , \;\mu _\mathrm {m}\) which are used as renormalization scales for \(\alpha \) and the quark mass. The two scales are varied independently [776] in the range of 2 GeV to 4 GeV. The above result is converted to \(\alpha ^{(5)}_{\overline{\mathrm {MS}}}(M_Z)\) as

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z) = 0.1177(26) \,. \end{aligned}$$
(336)

Since \(\alpha _{\mathrm{eff}}\) roughly reaches 0.25, they have for the renormalization scale criterion. Since \(\Delta \Lambda / \Lambda > \alpha _{\mathrm{eff}}^2\), we also assign for the criterion of perturbative behaviour. The lattice cutoff ranges over \(a^{-1}\) = 2.5–4.5 GeV with \(\mu =3\) GeV so we also give them a for continuum extrapolation.

Fig. 35

Renormalization-scale (\(\mu \)) dependence of \(\alpha (m_*)\) extracted from \(R_4\). We have evaluated this dependence for the case where the same renormalization scale is used for the quark mass and for \(\alpha _s\)

There is a significant difference in the perturbative error estimate of JLQCD 16 [23] and Maezawa 16 [157], both of which use the moments at the charm mass. JLQCD 16 uses the scale dependence (see also Sect. 9.2.3) but Maezawa 16 looks at the perturbative coefficients at \(\mu =m_*\), with \({{{\bar{m}}}_c}(m_*)=m_*\). While the Maezawa 16 result derives from \(R_4\), JLQCD 16 did not use that moment and therefore did not show its renormalization-scale dependence. We provide it here and show \(\alpha (m_*)\) extracted from \(R_4\) expanded in \(\alpha (\mu )\) for \(\mu =s\,m_*\) (and evolved to \(\mu =m_*\)) in Fig. 35. Note that the perturbative error estimated by Maezawa 16 is a small contribution to the total error, while the scale dependence in Fig. 35 is significant between, e.g., \(s=1\) and \(s=4\). This is a confirmation of our in the perturbative error criterion which is linked to the cited overall error as spelled out in Sect. 9.2.1.

Aside from the final results for \(\alpha _s(m_Z)\) obtained by matching with perturbation theory, it is interesting to make a comparison of the short distance quantities in the continuum limit \(R_n\) which are available from HPQCD 08 [171], JLQCD 16 [23] and Maezawa 16 [157] (all using \(2\,+\,1\) flavours). In Fig. 36 we plot the various results based on the numbers collated in Table 58. These results are in quite good agreement with each other. For future studies it is of course interesting to check agreement of these numbers before turning to the more involved determination of \(\alpha _s\).

Fig. 36

Ratios from Table 58. Note that constants have been subtracted from \(R_4\), \(R_6\) and \(R_{10}\), to be able to plot all results in a similar range

Table 57 Heavy-quark current two-point function results. Note that all analyses using \(2\,+\,1\) flavour simulations perturbatively add a dynamical charm quark. Some of them then quote results in \(N_{ f}=4\)-flavour QCD, which we converted back to \(N_{ f}=3\), corresponding to the nonperturbative sea quark content

In Table 57 we summarize the results for the latter.

9.8 \(\alpha _s\) from QCD vertices

9.8.1 General considerations

The most intuitive and in principle direct way to determine the coupling constant in QCD is to compute the appropriate three- or four-point gluon vertices or alternatively the quark-quark-gluon vertex or ghost-ghost-gluon vertex (i.e., \( q{\overline{q}}A\) or \(c{\overline{c}}A\) vertex, respectively). A suitable combination of renormalization constants then leads to the relation between the bare (lattice) and renormalized coupling constant. This procedure requires the implementation of a nonperturbative renormalization condition and the fixing of the gauge. For the study of nonperturbative gauge fixing and the associated Gribov ambiguity, we refer to Refs. [777,778,779] and references therein. In practice the Landau gauge is used and the renormalization constants are defined by requiring that the vertex is equal to the tree level value at a certain momentum configuration. The resulting renormalization schemes are called ‘MOM’ scheme (symmetric momentum configuration) or ‘\(\mathrm {\widetilde{MOM}}\)’ (one momentum vanishes), which are then converted perturbatively to the \(\overline{\mathrm{MS}}\) scheme.

A pioneering work to determine the three-gluon vertex in the \(N_f = 0\) theory is Alles 96 [780] (which was followed by Ref. [781] for two flavour QCD); a more recent \(N_f = 0\) computation was Ref. [782] in which the three-gluon vertex as well as the ghost-ghost-gluon vertex was considered. (This requires a computation of the propagator of the Faddeev–Popov ghost on the lattice.) The latter paper concluded that the resulting \(\Lambda _{\overline{\mathrm{MS}}}\) depended strongly on the scheme used, the order of perturbation theory used in the matching and also on nonperturbative corrections [783].

Subsequently in Refs. [784, 785] a specific \(\widetilde{\mathrm{MOM}}\) scheme with zero ghost momentum for the ghost-ghost-gluon vertex was used. In this scheme, dubbed the ‘MM’ (Minimal MOM) or ‘Taylor’ (T) scheme, the vertex is not renormalized, and so the renormalized coupling reduces to

$$\begin{aligned} \alpha _{\mathrm{T}}(\mu ) = D^{\mathrm{gluon}}_{\mathrm{lat}}(\mu , a) D^\mathrm{ghost}_{\mathrm{lat}}(\mu , a)^2 \, {g_0^2 \over 4\pi } \,, \end{aligned}$$
(337)

where \(D^{\mathrm{ghost}}_{\mathrm{lat}}\) and \(D^{\mathrm{gluon}}_{\mathrm{lat}}\) are the (bare lattice) dressed ghost and gluon ‘form factors’ of these propagator functions in the Landau gauge,

$$\begin{aligned} \begin{aligned} D^{ab}(p)&= - \delta ^{ab}\, {D^{\mathrm{ghost}}(p) \over p^2}\,, \qquad \\ D_{\mu \nu }^{ab}(p)&= \delta ^{ab} \left( \delta _{\mu \nu } - {p_\mu p_\nu \over p^2} \right) \, {D^{\mathrm{gluon}}(p) \over p^2 } \,, \end{aligned} \end{aligned}$$
(338)

and we have written the formula in the continuum with \(D^\mathrm{ghost/gluon}(p)=D^{\mathrm{ghost/gluon}}_{\mathrm{lat}}(p, 0)\). Thus there is now no need to compute the ghost-ghost-gluon vertex, just the ghost and gluon propagators.
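As a small illustration of Eq. (337), the following Python sketch evaluates the Taylor-scheme coupling from the two bare dressing functions at a fixed momentum. The function names and the numerical inputs are hypothetical and serve only to exercise the formula; the relation \(\beta = 2N_c/g_0^2\) is the usual Wilson-action convention and is an assumption here, since the text does not specify how \(g_0\) is parameterized.

```python
import math


def alpha_taylor(gluon_dressing: float, ghost_dressing: float, g0_squared: float) -> float:
    """Eq. (337): alpha_T = D_gluon * D_ghost**2 * g0**2 / (4*pi)."""
    return gluon_dressing * ghost_dressing**2 * g0_squared / (4.0 * math.pi)


def g0_squared_from_beta(beta: float, n_colours: int = 3) -> float:
    """Bare coupling squared from beta = 2*N_c / g0**2 (assumed convention)."""
    return 2.0 * n_colours / beta


# Hypothetical dressing-function values at one momentum, not lattice data.
example = alpha_taylor(
    gluon_dressing=1.6,
    ghost_dressing=1.3,
    g0_squared=g0_squared_from_beta(6.0),
)
print(f"alpha_T = {example:.4f}")
```

In an actual determination the dressing functions would be measured over a range of momenta and extrapolated to the continuum before any matching to perturbation theory.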

Table 58 Moments from \(N_f=3\) simulations at the charm mass. The moments have been corrected perturbatively to include the effect of a charm sea quark

9.8.2 Discussion of computations

For the calculations considered here, to match to perturbative scaling, it was first necessary to reduce lattice artifacts by an H(4) extrapolation procedure (addressing O(4) rotational invariance), e.g., ETM 10F [791] or by lattice perturbation theory, e.g., Sternbeck 12 [789]. To match to perturbation theory, collaborations vary in their approach. In ETM 10F [791], it was necessary to include the operator \(A^2\) in the OPE of the ghost and gluon propagators, while in Sternbeck 12 [789] very large momenta are used and \(a^2p^2\) and \(a^4p^4\) terms are included in their fit to the momentum dependence. A later refinement was the introduction of higher nonperturbative OPE power corrections in ETM 11D [788] and ETM 12C [787]. Although the expected leading power correction, \(1/p^4\), was tried, ETM finds good agreement with their data only when they fit with the next-to-leading-order term, \(1/p^6\). The update ETM 13D [786] investigates this point in more detail, using better data with reduced statistical errors. They find that after again including the \(1/p^6\) term they can describe their data over a large momentum range from about 1.75 GeV to 7 GeV.

In all calculations except for Sternbeck 10 [790] and Sternbeck 12 [789], the matching with the perturbative formula is performed including power corrections in the form of condensates, in particular \(\langle A^2 \rangle \). Three lattice spacings are present in almost all calculations with \(N_f=0\), 2, but the scales ap are rather large. This mostly results in a on the continuum extrapolation (Sternbeck 10 [790], Boucaud 01B [781] for \(N_f=2\); Ilgenfritz 10 [792], Boucaud 08 [785], Boucaud 05 [782], Becirevic 99B [797], Becirevic 99A [798], Boucaud 98B [799], Boucaud 98A [800], Alles 96 [780] for \(N_f=0\)). A is reached in the \(N_{ f}=0\) computations Boucaud 00A [796], 00B [795], 01A [794], Soto 01 [793] due to a rather small lattice spacing, but this is done on a lattice of a small physical size. The \(N_f=2\,+\,1\,+\,1\) calculation, fitting with condensates, is carried out for two lattice spacings and with \(ap>1.5\), giving for the continuum extrapolation as well. In ETM 10F [791] we have \(0.25< \alpha _{\mathrm{eff}} < 0.4\), while in ETM 11D [788], ETM 12C [787] (and ETM 13 [41]) we find \(0.24< \alpha _{\mathrm{eff}} < 0.38\), which gives a green circle in these cases for the renormalization scale. In ETM 10F [791] the values of ap violate our criterion for a continuum limit only slightly, and we give a .

Table 59 Results for the gluon–ghost vertex

In Sternbeck 10 [790], the coupling ranges over \(0.07 \le \alpha _{\mathrm{eff}} \le 0.32\) for \(N_f=0\) and \(0.19 \le \alpha _{\mathrm{eff}} \le 0.38\) for \(N_f=2\) giving and for the renormalization scale, respectively. The fit with the perturbative formula is carried out without condensates, giving a satisfactory description of the data. In Boucaud 01A [794], depending on a, a large range of \(\alpha _{\mathrm{eff}}\) is used which goes down to 0.2 giving a for the renormalization scale and perturbative behaviour, and several lattice spacings are used leading to in the continuum extrapolation. The \(N_{ f}=2\) computation Boucaud 01B [781] fails the continuum limit criterion because \(a\mu \) is too large and an unimproved Wilson fermion action is used. Finally, in the conference proceedings Sternbeck 12 [789], the \(N_f\) = 0, 2, 3 coupling \(\alpha _\mathrm {T}\) is studied. Subtracting 1-loop lattice artifacts and subsequently fitting with \(a^2p^2\) and \(a^4p^4\) additional lattice artifacts, agreement with the perturbative running is found for large momenta (\(r_0^2p^2 > 600\)) without the need for power corrections. In these comparisons, the values of \(r_0\Lambda _{\overline{\mathrm {MS}}}\) from other collaborations are used. As no numbers are given, we have not introduced ratings for this study.

Table 60 Dirac eigenvalue result

In Table 59 we summarize the results. Presently there are no \(N_f \ge 3\) calculations of \(\alpha _s\) from QCD vertices that satisfy the FLAG criteria to be included in the range.

9.9 \(\alpha _s\) from the eigenvalue spectrum of the Dirac operator

9.9.1 General considerations

Consider the spectral density of the continuum Dirac operator

$$\begin{aligned} \rho (\lambda ) = \frac{1}{V} \left\langle \sum _k (\delta (\lambda -i\lambda _k) + \delta (\lambda +i\lambda _k)) \right\rangle , \end{aligned}$$
(339)

where V is the volume and \(\lambda _k\) are the eigenvalues of the Dirac operator in a gauge background.

Its perturbative expansion

$$\begin{aligned} \rho (\lambda ) = {3 \over 4\pi ^2} \,\lambda ^3 (1-\rho _1{\bar{g}}^2 -\rho _2{\bar{g}}^4 -\rho _3{\bar{g}}^6 \,{+}\, \mathrm{O}({\bar{g}}^8) ), \end{aligned}$$
(340)

is known including \(\rho _3\) in the \(\overline{\mathrm {MS}}\) scheme [801, 802]. In renormalization group improved form one sets the renormalization scale \(\mu \) to \(\mu =s\lambda \) with \(s=\mathrm{O}(1)\) and the \(\rho _i\) are pure numbers. Nakayama 18 [687] initiated a study of \(\rho (\lambda )\) in the perturbative regime. They prefer to consider \(\mu \) independent from \(\lambda \). Then \(\rho _i\) are polynomials in \(\log (\lambda /\mu )\) of degree i. One may consider

$$\begin{aligned} F(\lambda )\equiv & {} {\partial \log (\rho (\lambda )) \over \partial \log (\lambda )} \nonumber \\= & {} 3 -F_1{\bar{g}}^2 -F_2{\bar{g}}^4 -F_3{\bar{g}}^6 - F_4{\bar{g}}^8 + \mathrm{O}({\bar{g}}^{10}),\quad \end{aligned}$$
(341)

where the four coefficients \(F_i\), again polynomials of degree i in \(\log (\lambda /\mu )\), are known. Choosing instead the renormalization group improved form with \(\mu =s\lambda \) in Eq. (340) would have led to

$$\begin{aligned} F(\lambda ) = 3 -{{\bar{F}}}_2{\bar{g}}^4(\lambda ) -{{\bar{F}}}_3{\bar{g}}^6(\lambda ) - {{\bar{F}}}_4{\bar{g}}^8(\lambda ) + \mathrm{O}({\bar{g}}^{10}) \,,\nonumber \\ \end{aligned}$$
(342)

with pure numbers \({{\bar{F}}}_i\) and \({{\bar{F}}}_1=0\). Determinations of \(\alpha _s\) can be carried out by a computation and continuum extrapolation of \(\rho (\lambda )\) and/or \(F(\lambda )\) at large \(\lambda \). Such computations are made possible by the techniques of [46, 354, 687].
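The definition of \(F(\lambda )\) in Eq. (341) can be checked numerically with a few lines of Python. The sketch below is only an illustration under simple assumptions: it uses the leading (free) term of Eq. (340) as a stand-in for measured data and a central finite difference in \(\log \lambda \); the function names and the step size are hypothetical choices, not part of any of the cited analyses.

```python
import math


def free_spectral_density(lam: float) -> float:
    """Leading term of Eq. (340): rho(lambda) = 3 / (4 pi^2) * lambda^3."""
    return 3.0 / (4.0 * math.pi**2) * lam**3


def log_derivative(rho, lam: float, rel_step: float = 1e-4) -> float:
    """Central finite difference for F = d log(rho) / d log(lambda), Eq. (341)."""
    up = lam * math.exp(rel_step)
    down = lam * math.exp(-rel_step)
    return (math.log(rho(up)) - math.log(rho(down))) / (2.0 * rel_step)


# For the free density the exponent is 3 up to rounding; the coupling-dependent
# terms in Eq. (341) describe the deviation from this value.
exponent = log_derivative(free_spectral_density, lam=1.0)
print(f"F(lambda) = {exponent:.6f}")
```

With lattice data, \(\rho (\lambda )\) would be available only at discrete \(\lambda \) with statistical errors, so the derivative would have to be taken from a fit or a binned estimate instead.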

We note that according to our general discussions in terms of an effective coupling, we have \(n_\mathrm {l}=2\); the 3-loop \(\beta \) function of a coupling defined from Eq. (340) or Eq. (342) is known (footnote 74).

9.9.2 Discussion of computations

There is one pioneering result to date using this method by Nakayama 18 [687]. They computed the eigenmode distributions of the Hermitian operator \(a^2 D^{\dagger }_\mathrm{ov}(m_f=0,am_{\mathrm{PV}}) D_{\mathrm{ov}}(m_f=0,am_{\mathrm{PV}})\) where \(D_\mathrm{ov}\) is the overlap operator and \(m_{\mathrm{PV}}\) is the Pauli–Villars regulator on ensembles with 2 + 1 flavours using Möbius domain-wall quarks for three lattice cutoffs \(a^{-1}= 2.5, 3.6, 4.5 \) GeV, where \(am_{\mathrm{PV}} = 3\) or \(\infty \). The bare eigenvalues are converted to the \({\overline{\mathrm {MS}}}\) scheme at \(\mu = 2\,\text{ GeV }\) by multiplying with the renormalization constant \(Z_m(2\,\text{ GeV })\), and are then transformed to those renormalized at \(\mu =6\,\text{ GeV }\) using the renormalization group equation. The scale is set by \(\sqrt{t_0}=0.1465(21)(13)\,\text{ fm }\). The continuum limit is taken assuming a linear dependence on \(a^2\), while the volume size is kept about constant: 2.6–2.8 fm.

Choosing the renormalization scale \(\mu = 6\,\text{ GeV }\), Nakayama 18 [687] extracted \(\alpha _{\overline{\mathrm{MS}}}^{(3)}(6\,\text{ GeV })=0.204(10)\). The result is converted to

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z) = 0.1226(36) \,. \end{aligned}$$
(343)

The lattice cutoff ranges over \(a^{-1} = 2.5-4.5\,\text{ GeV }\) with \(\mu =\lambda =0.8-1.25\,\text{ GeV }\) yielding quite small values \(a\mu \). However, our continuum limit criterion does not apply as it requires us to consider \(\alpha _s=0.3\). We thus deviate from the general rule and give a   which would result at the smallest value \(\alpha _{\overline{\mathrm {MS}}}(\mu )=0.4\) considered by Nakayama 18  [687]. The values of \(\alpha _{\overline{\mathrm {MS}}}\) lead to a for the renormalization scale, while perturbative behavior is rated .

In Table 60 we list this result.

9.10 Summary

After reviewing the individual computations, we are now in a position to discuss the overall result. We first present the current status and for that briefly consider \(r_0\Lambda \) with its flavour dependence from \(N_f = 0\) to 4 flavours. Then we discuss the central \(\alpha _{\overline{\mathrm{MS}}}(M_Z)\) results, which just use \(N_f \ge 3\), give ranges for each sub-group discussed previously, and give the final FLAG average as well as an overall average together with the current PDG nonlattice numbers. Finally we return to \(r_0\Lambda \), presenting our estimates for the various \(N_f\).

Table 61 Results for \(\alpha _{\overline{\mathrm {MS}}}(M_\mathrm {Z})\). Different methods are listed separately and they are combined to a pre-range when computations are available without any . A weighted average of the pre-ranges gives 0.11824(58), using the smallest pre-range uncertainty gives 0.11824(81) while the average uncertainty of the ranges used as an error gives 0.11824(131). We note that Bazavov 12 is superseded by Bazavov 14

9.10.1 The present situation

We first summarize the status of lattice-QCD calculations of the QCD scale \(\Lambda _{\overline{\mathrm {MS}}}\). Figure 37 shows all the results for \(r_0\Lambda _{\overline{\mathrm{MS}}}\) discussed in the previous sections.

Fig. 37

\(r_0\Lambda _{\overline{\mathrm{MS}}}\) estimates for \(N_f = 0\), 2, 3, 4 flavours. Full green squares are used in our final ranges, pale green squares also indicate that there are no red squares in the colour coding but the computations were superseded by later more complete ones or not published, while red open squares mean that there is at least one red square in the colour coding

Many of the numbers are the ones given directly in the papers. However, when only \(\Lambda _{\overline{\mathrm{MS}}}\) in physical units (\(\text{ MeV }\)) is available, we have converted them by multiplying with the value of \(r_0\) in physical units. The notation used is full green squares for results used in our final average, while a lightly shaded green square indicates that there are no red squares in the previous colour coding but the computation does not enter the ranges because either it has been superseded by an update or it is not published. Red open squares mean that there is at least one red square in the colour coding.

For \(N_f=0\) there is relatively little spread in the more recent numbers.

When two flavours of quarks are included, the numbers extracted by the various groups show a considerable spread, as in particular older computations did not yet control the systematics sufficiently. This illustrates the difficulty of the problem and emphasizes the need for strict criteria. The agreement among the more modern calculations with three or more flavours, however, is quite good.

We now turn to the status of the essential result for phenomenology, \(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z)\). In Table 61 and the upper plot in Fig. 38 we show all the results for \(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z)\) (i.e., \(\alpha _{\overline{\mathrm{MS}}}\) at the Z mass) obtained from \(N_f=2\,+\,1\) and \(N_f = 2\,+\,1\,+\,1\) simulations. The conversion from \(N_{ f}= 3\) or \(N_{ f}= 4\) to \(N_{ f}= 5\) is made by matching the coupling constant at the charm and bottom quark thresholds and using the scale as determined or used by the authors.
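The structure of such a flavour-number conversion can be sketched in Python at leading order. This is deliberately a one-loop illustration only: the coupling is taken to be continuous at each threshold, and the threshold masses and \(M_Z\) below are approximate values assumed for the example. It is not expected to reproduce the numbers quoted in this review, which rely on higher-order running and decoupling relations. The input \(\alpha ^{(3)}_{\overline{\mathrm{MS}}}(6\,\mathrm{GeV})=0.204\) is the Nakayama 18 central value from Sect. 9.9.2.

```python
import math


def beta0(n_f: int) -> float:
    """One-loop beta-function coefficient, beta0 = 11 - 2 n_f / 3."""
    return 11.0 - 2.0 * n_f / 3.0


def run_one_loop(alpha: float, mu_from: float, mu_to: float, n_f: int) -> float:
    """One-loop running: 1/alpha(mu_to) = 1/alpha(mu_from) + beta0/(2 pi) log(mu_to/mu_from)."""
    inverse = 1.0 / alpha + beta0(n_f) / (2.0 * math.pi) * math.log(mu_to / mu_from)
    return 1.0 / inverse


def alpha5_at_mz_one_loop(alpha3: float, mu: float,
                          m_c: float = 1.27, m_b: float = 4.18,
                          m_z: float = 91.19) -> float:
    """Run a 3-flavour coupling to the charm threshold, then up to M_Z (all in GeV).

    Threshold masses are assumed, approximate values; matching is continuous (LO).
    """
    alpha = run_one_loop(alpha3, mu, m_c, n_f=3)    # 3 flavours down to m_c
    alpha = run_one_loop(alpha, m_c, m_b, n_f=4)    # 4 flavours between m_c and m_b
    return run_one_loop(alpha, m_b, m_z, n_f=5)     # 5 flavours up to M_Z


result = alpha5_at_mz_one_loop(0.204, 6.0)
print(f"one-loop alpha^(5)(M_Z) = {result:.4f}")
```

The point of the sketch is the sequence of steps (run within a fixed-flavour theory, match at a threshold, continue with one more flavour), not the numerical value.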

Fig. 38

\(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z)\), the coupling constant in the \(\overline{\mathrm{MS}}\) scheme at the Z mass. Left: Lattice results, pre-ranges from different calculation methods, and final average. Right: Comparison of the lattice pre-ranges and average with the nonlattice ranges and average. The first PDG 18 entry gives the outcome of their analysis excluding lattice results (see Sect. 9.10.4)

As can be seen from the tables and figures, at present there are several computations satisfying the criteria to be included in the FLAG average. Since FLAG 16 two new computations of \(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z)\) pass all our criteria with at least a and one computation with all . The results agree quite well within the stated uncertainties. The uncertainties vary significantly.

9.10.2 Our range for \(\alpha _{\overline{\mathrm{MS}}}^{(5)}\)

We now explain the determination of our range. We only include those results without a red tag and that are published in a refereed journal. We also do not include any numbers that were obtained by extrapolating from theories with fewer than three flavours. They are not controlled and can be looked up in the previous FLAG reviews.

A general issue with most determinations of \(\alpha _{\overline{\mathrm {MS}}}\), both lattice and nonlattice, is that they are dominated by perturbative truncation errors, which are difficult to estimate. Further, all results discussed here except for those of Sects. 9.3, 9.6 are based on extractions of \(\alpha _{\overline{\mathrm {MS}}}\) that are largely influenced by data with \(\alpha _\mathrm {eff}\ge 0.3\). At smaller \(\alpha _s\) the momentum scale \(\mu \) quickly is at or above \(a^{-1}\). We have included computations using \(a\mu \) up to 1.5 and \(\alpha _\mathrm {eff}\) up to 0.4, but one would ideally like to be significantly below that. Accordingly we choose to not simply perform weighted averages with the individual errors estimated by each group. Rather, we use our own more conservative estimates of the perturbative truncation errors in the weighted average.

In the following we repeat aspects of the methods and calculations that inform our estimates of the perturbative truncation errors. We also provide separate estimates for \(\alpha _s\) obtained from step-scaling, the heavy-quark potential, Wilson loops, and heavy-quark current two-point functions to enable a comparison of the different lattice approaches; these are summarized in Table 61.

  • Step-scaling

    The step-scaling computations of PACS-CS 09A [81] and ALPHA 17 [79] reach energies around the Z-mass where perturbative uncertainties in the three-flavour theory are negligible. Perturbative errors do enter in the conversion of the \(\Lambda \)-parameters from three to five flavours, but successive order contributions decrease rapidly and can be neglected. We form a weighted average of the two results and obtain \(\alpha _{{\overline{\mathrm {MS}}}}=0.11848(81)\).

  • Potential computations

    Brambilla 10 [742], ETM 11C [741] and Bazavov 12 [740] give evidence that they have reached distances where perturbation theory can be used. However, in addition to \(\Lambda \), a scale is introduced into the perturbative prediction by the process of subtracting the renormalon contribution. This subtraction is avoided in Bazavov 14 [80] by using the force and again agreement with perturbative running is reported. Husung 17 [681] (unpublished) studies the reliability of perturbation theory in the pure gauge theory with lattice spacings down to \(0.015\,\mathrm{fm}\) and finds that at weak coupling there is a downwards trend in the \(\Lambda \)-parameter with a slope \(\Delta \Lambda / \Lambda \approx 9 \alpha _s^3\). While it is not very satisfactory to use just Husung 17 to estimate the perturbative error, we do not have additional information at present. Further studies are needed to better understand the errors of \(\alpha _s\) determinations from the potential.

    Only Bazavov 14 [80] satisfies all of the criteria to enter the FLAG average for \(\alpha _s\). Given the findings of [681] we estimate a perturbative error of \(\Delta \Lambda / \Lambda = 9(\alpha _s^{\mathrm{min}})^3 \) with \(\alpha _s^{\mathrm{min}} \approx 0.19\) the smallest value reached in [80]. This translates into \(\Delta \alpha _{\overline{\mathrm {MS}}}(M_Z)=0.0014\). A different way to estimate the effect is to take the actual difference of the \(\Lambda \)-parameters estimated in \(N_{ f}=0\) by Brambilla 10 [742] and Husung 17 [681]: \(\Delta \Lambda / \Lambda \approx (0.637-0.590)/0.637 = 0.074\) or \(\Delta \alpha _{\overline{\mathrm {MS}}}(M_Z)=0.0018\). We use the mean of these two error estimates together with the central value of Bazavov 14 and obtain \(\alpha _{{\overline{\mathrm {MS}}}}= 0.1166(16)\).

  • Small Wilson loops

    Here the situation is unchanged as compared to FLAG 16. In the determination of \(\alpha _s\) from observables at the lattice spacing scale, there is an interplay of higher-order perturbative terms and lattice artifacts. In HPQCD 05A [753], HPQCD 08A [754] and Maltman 08 [82] both lattice artifacts (which are power corrections in this approach) and higher-order perturbative terms are fitted. We note that Maltman 08 [82] and HPQCD 08A [754] analyze largely the same data set but use different versions of the perturbative expansion and treatments of nonperturbative terms. After adjusting for the slightly different lattice scales used, the values of \(\alpha _{\overline{\mathrm {MS}}}(M_Z)\) differ by 0.0004 to 0.0008 for the three quantities considered. In fact the largest of these differences (0.0008) comes from a tadpole-improved loop, which is expected to be best behaved perturbatively. We therefore replace the perturbative-truncation errors from [13, 82] with our estimate of the perturbative uncertainty Eq. (327). Taking the perturbative errors to be 100% correlated between the results, we obtain for the weighted average \(\alpha _{{\overline{\mathrm {MS}}}}=0.11871(128)\).

  • Heavy quark current two-point functions

    Other computations with small errors are HPQCD 10 [13] and HPQCD 14A [16], where correlation functions of heavy valence quarks are used to construct short-distance quantities. Due to the large quark masses needed to reach the region of small coupling, considerable discretization errors are present, see Fig. 30 of FLAG 16. These are treated by fits to the perturbative running (a 5-loop running \(\alpha _{\overline{\mathrm{MS}}}\) with a fitted 5-loop coefficient in the \(\beta \)-function is used) with high-order terms in a double expansion in \(a^2\Lambda ^2\) and \(a^2 m_\mathrm {c}^2\) supplemented by priors which limit the size of the coefficients. The priors play an especially important role in these fits given the much larger number of fit parameters than data points. We note, however, that the size of the coefficients does not prevent high-order terms from contributing significantly, since the data includes values of \(am_{\text {c}}\) that are rather close to 1.

    More recent calculations use the same method but just at the charm quark mass, where discretization errors are considerably smaller. Here the dominating uncertainty is the perturbative error. JLQCD 16 [23] estimates it at \(\Delta \alpha _s= 0.0011\) from independent changes of the renormalization scales of coupling and mass, \(\mu _\alpha ,\mu _\mathrm {m}\). Figure 35 for the residual scale dependence of \(\alpha _s\) from \(R_4\) yields 0.0017 from scale change \(1\le s\le 3\) and 0.0025 for \(2\le s\le 4\). For the figure we set \(\mu _\alpha = \mu _\mathrm {m}\). Independent changes of \(\mu _\alpha ,\,\mu _\mathrm {m}\) would yield a larger estimate of the uncertainty [776]. We note also that there are small differences in the continuum-extrapolated results in the moments themselves, cf. Table 58. The relative difference in \(R_6/R_8 -1 \sim k \alpha _s\) between Maezawa 16 [157] and JLQCD 16 [23] is about 4.5(2.5)%, which translates into a difference of 0.0023(13) in \(\alpha _s\) at the Z-mass, close to the total cited uncertainty of JLQCD 16 [23]. A further estimate of the uncertainty is the difference of the JLQCD 16 [23] and Maezawa 16 [157] final numbers (footnote 75), which is \(\Delta \alpha _{\overline{\mathrm {MS}}}(M_Z)= 0.0015\).

    We settle for an intermediate value of \(\Delta \alpha _s=0.0015\). Replacing the perturbative truncation errors from HPQCD 10 [13], HPQCD 14A [16], and JLQCD 16 [23] with this value, and including a 100% correlation between the perturbative errors, we obtain for the weighted average \(\alpha _{{\overline{\mathrm {MS}}}}= 0.11818(156)\).

  • Other methods

    Computations using other methods do not qualify for an average yet, predominantly due to a lacking in the continuum extrapolation.
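For the potential computations in the list above, the arithmetic behind the two perturbative error estimates can be retraced with a short Python sketch. All input numbers are taken from the text; the variable names are illustrative. The translations from \(\Delta \Lambda / \Lambda \) to \(\Delta \alpha _{\overline{\mathrm {MS}}}(M_Z)\) (0.0014 and 0.0018) are copied from the text and not recomputed, since they require the full perturbative running.

```python
# Estimate 1: slope reported by Husung 17, Delta Lambda / Lambda ~ 9 * alpha_s**3,
# evaluated at the smallest coupling reached in Bazavov 14.
alpha_min = 0.19
slope_estimate = 9.0 * alpha_min**3

# Estimate 2: relative difference of the N_f = 0 Lambda-parameters of
# Brambilla 10 and Husung 17 (values as quoted in the text).
lambda_brambilla = 0.637
lambda_husung = 0.590
difference_estimate = (lambda_brambilla - lambda_husung) / lambda_brambilla

# The corresponding shifts in alpha(M_Z), as quoted in the text, and their mean,
# which is the error assigned to the potential pre-range.
delta_alpha_quoted = (0.0014, 0.0018)
delta_alpha_mean = sum(delta_alpha_quoted) / len(delta_alpha_quoted)

print(f"9 * alpha_min^3            = {slope_estimate:.3f}")
print(f"(0.637 - 0.590) / 0.637    = {difference_estimate:.3f}")
print(f"mean of quoted Delta alpha = {delta_alpha_mean:.4f}")
```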

We obtain the central value for our range of \(\alpha _s\) from the weighted average of the four pre-ranges listed in Table 61. The error of this weighted average is 0.0006, which is quite a bit smaller than the most precise entry. Because, however, the errors on almost all of the \(\alpha _s\) calculations that enter the average are dominated by perturbative truncation errors, which are especially difficult to estimate, we choose instead to take a larger range for \(\alpha _s\) of 0.0008. This is the error on the pre-range for \(\alpha _s\) from step-scaling, because perturbative-truncation errors are sub-dominant in this method. Our final range is then given by

$$\begin{aligned} \alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z) = 0.1182(8) \,. \end{aligned}$$
(344)
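The combination of the four pre-ranges can be retraced with a minimal Python sketch. It assumes a standard inverse-variance weighted average that treats the pre-ranges as uncorrelated; the input values are the pre-ranges quoted in the bullet list above, and the function and variable names are illustrative.

```python
import math

# Pre-ranges for alpha_MSbar^(5)(M_Z) as (central value, uncertainty).
pre_ranges = {
    "step-scaling": (0.11848, 0.00081),
    "potential": (0.1166, 0.0016),
    "small Wilson loops": (0.11871, 0.00128),
    "heavy-quark current two-point functions": (0.11818, 0.00156),
}


def weighted_average(entries):
    """Inverse-variance weighted mean and its error, assuming no correlations."""
    weights = [1.0 / sigma**2 for _, sigma in entries]
    mean = sum(w * value for w, (value, _) in zip(weights, entries)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


entries = list(pre_ranges.values())
mean, error = weighted_average(entries)
smallest_error = min(sigma for _, sigma in entries)
average_error = sum(sigma for _, sigma in entries) / len(entries)

print(f"weighted average: {mean:.5f} +- {error:.5f}")
print(f"smallest pre-range error: {smallest_error:.5f}")
print(f"average pre-range error:  {average_error:.5f}")
```

Under these assumptions the sketch gives the central value 0.11824 with a weighted-average error of 0.00058, a smallest pre-range error of 0.00081 and an average pre-range error of 0.00131, matching the three variants listed in the caption of Table 61.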

Almost all of the eight calculations that are included are within 1\(\sigma \) of this range. Further, the range for \(\alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z)\) presented here is based on results with rather different systematics (apart from the matching across the charm threshold). We therefore believe that the true value is very likely to lie within this range.

All computations which enter this range, with the exception of HPQCD 14A [16], rely on a perturbative inclusion of the charm and bottom quarks. Perturbation theory for the matching of \({\bar{g}}^2_{N_f}\) and \({\bar{g}}^2_{N_f-1}\) looks very well behaved even at the mass of the charm. Worries that there may still be purely nonperturbative effects at this rather low scale have been removed by nonperturbative studies of the accuracy of perturbation theory. While the original study in Ref. [130] was not precise enough, the extended one in Ref. [131] estimates effects in the \(\Lambda \)-parameter to be significantly below 1% and thus negligible for the present and near future accuracy.

9.10.3 Ranges for \([r_0 \Lambda ]^{(N_{ f})}\) and \(\Lambda _{{\overline{\mathrm {MS}}}}\)

In the present situation, we give ranges for \([r_0 \Lambda ]^{(N_{ f})}\) and \(\Lambda _{{\overline{\mathrm {MS}}}}\), discussing their determination case by case. We include results with \(N_{ f}<3\) because it is interesting to see the \(N_{ f}\)-dependence of the connection of low- and high-energy QCD. This aids our understanding of the field theory and helps in finding possible ways to tackle it beyond the lattice approach. It is also of interest in providing an impression on the size of the vacuum polarization effects of quarks, in particular with an eye on the still difficult-to-treat heavier charm and bottom quarks. Even if this information is rather qualitative, it may be valuable, given that it is of a completely nonperturbative nature. We emphasize that results for \([r_0 \Lambda ]^{(0)}\) and \([r_0 \Lambda ]^{(2)}\) are not meant to be used in phenomenology.

For \(N_{ f}=2\,+\,1\,+\,1\), we presently do not quote a range as there is a single result: HPQCD 14A [16] found \([r_0 \Lambda ]^{(4)} = 0.70(3)\).

For \(N_{ f}=2\,+\,1\), we take as a central value the weighted average of ALPHA 17 [79], JLQCD 16 [23], Bazavov 14 [80], HPQCD 10 [13] (Wilson loops and current two-point correlators), PACS-CS 09A [81] and Maltman 08 [82]. Since the uncertainty in \(r_0\) is small compared to that of \(\Lambda \), we can directly propagate the error from the analog of Eq. (344) with the 2 + 1 + 1 number removed and arrive at

$$\begin{aligned} \left[ r_0 \Lambda _{{\overline{\mathrm {MS}}}}\right] ^{(3)} = 0.806(29) \,. \end{aligned}$$
(345)

(The error of the straight weighted average is 0.012.) It is in good agreement with all 2 + 1 results without red tags. In physical units, using \(r_0=0.472\) fm and neglecting its error, this means

$$\begin{aligned} \Lambda _{{\overline{\mathrm {MS}}}}^{(3)} = 343(12)\,\text{ MeV }\,, \end{aligned}$$
(346)

where the error of the straight weighted average is \(5\,\mathrm {MeV}\).

For \(N_f=2\), at present there is one computation with a rating for all criteria, ALPHA 12 [693]. We adopt it as our central value and enlarge the error to cover the central values of the other three results with filled green boxes. This results in an asymmetric error. Our range is unchanged as compared to FLAG 13,

$$\begin{aligned}{}[r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(2)} = 0.79\left( ^{+~5}_{-{15}}\right) \,, \quad \end{aligned}$$
(347)

and in physical units, using \(r_0=0.472\) fm,

$$\begin{aligned} \Lambda _{{\overline{\mathrm {MS}}}}^{(2)} = 330\left( ^{+21}_{-{63}}\right) \text{ MeV }\,. \quad \end{aligned}$$
(348)

A weighted average of the four eligible numbers would yield \([r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(2)} = 0.689(23)\), not covering the best result and in particular leading to a smaller error than we feel is justified, given the issues discussed previously in Sect. 9.4.2 (Karbstein 18 [682], ETM 11C [741]) and Sect. 9.8.2 (ETM 10F [791]). Thus we believe that our estimate is a conservative choice; the low values of ETM 11C [741] and Karbstein 18 [682] lead to a large downward error. We note that this can largely be explained by different values of \(r_0\) between ETM 11C [741] and ALPHA 12 [693]. We still hope that future work will improve the situation.

For \(N_f=0\) we take into account ALPHA 98 [724], QCDSF/UKQCD 05 [757], Brambilla 10 [742], Kitazawa 16 [686] and Ishikawa 17 [680] for forming a range (footnote 76). Taking a weighted average of the five numbers, we obtain \([r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(0)} = 0.615(5)\), dominated by the QCDSF/UKQCD 05 [757] result.

Since the errors are dominantly systematic, due to missing higher orders of PT, we prefer to presently take a range which encompasses all five central values and whose uncertainty comes close to our estimate of the perturbative error in QCDSF/UKQCD 05 [757]: based on \(|c_4/c_1| \approx 2\) as before, we find \(\Delta [r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(0)} = 0.018\). We then have

$$\begin{aligned}{}[r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(0)} = 0.615(18) \,. \quad \end{aligned}$$
(349)

Converting to physical units, again using \(r_0=0.472\,\text{ fm }\) yields

$$\begin{aligned} \Lambda _{{\overline{\mathrm {MS}}}}^{(0)} = 257(7)\,\text{ MeV }\,. \quad \end{aligned}$$
(350)

While the conversion of the \(\Lambda \) parameter to physical units is quite unambiguous for \(N_{ f}=2\,+\,1\), our choice of \(r_0=0.472\) fm also for smaller numbers of flavour amounts to a convention, in particular for \(N_{ f}=0\). Indeed, in the Tables 53, 54, 55, 56, 57, 58 and 59 somewhat different numbers in MeV are found.
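As a cross-check of the conversions in Eqs. (348) and (350), the following minimal sketch multiplies the dimensionless products \(r_0 \Lambda \) by \(\hbar c/r_0\). The numerical value of \(\hbar c\) and the helper name `to_mev` do not appear in the text above and are introduced here for illustration only; the error on \(r_0\) is neglected, as in the text.

```python
# Convert the dimensionless product r0*Lambda to Lambda in MeV via hbar*c / r0.
HBAR_C_MEV_FM = 197.3269804  # hbar*c in MeV fm (assumed input, not quoted in the text)
R0_FM = 0.472                # r0 in fm, as used in the text (error neglected)


def to_mev(r0_lambda: float) -> float:
    """Return r0_lambda * hbar*c / r0, i.e. the corresponding value in MeV."""
    return r0_lambda * HBAR_C_MEV_FM / R0_FM


# N_f = 2, Eq. (347): central value 0.79 with errors +0.05 / -0.15
nf2 = tuple(round(to_mev(x)) for x in (0.79, 0.05, 0.15))

# N_f = 0, Eq. (349): central value 0.615 with error 0.018
nf0_central = to_mev(0.615)
nf0_error = to_mev(0.018)

print(nf2)                      # central value, upper error, lower error in MeV
print(nf0_central, nf0_error)   # central value and error in MeV
```

With these inputs the \(N_f=2\) numbers reproduce Eq. (348), and the \(N_f=0\) central value rounds to the 257 MeV of Eq. (350), with an error between 7 and 8 MeV.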

How sure are we about our ranges for \([r_0 \Lambda _{{\overline{\mathrm {MS}}}}]^{(N_f)}\)? In one case we have a result [Eq. (347)] that easily passes our criteria; in another [Eq. (349)] we have four compatible results that are close to that quality and agree. For \(N_{ f}=2\,+\,1\) the range [Eq. (345)] takes account of results with rather different systematics. We therefore find it difficult to imagine that the ranges could be violated by much.

9.10.4 Conclusions

With the present results our range for the strong coupling is (repeating Eq. (344))

$$\begin{aligned} \alpha _{\overline{\mathrm{MS}}}^{(5)}(M_Z) = 0.1182(8)\qquad \,\mathrm {Refs.}~{[13,16,23,79{-}82]}, \end{aligned}$$

and the associated \(\Lambda \) parameters

$$\begin{aligned} \Lambda _{\overline{\mathrm{MS}}}^{(5)}= & {} 211(10)\,\,\mathrm {MeV}\quad \,\mathrm {Refs.}~{[13,16,23,79{-}82]}, \end{aligned}$$
(351)
$$\begin{aligned} \Lambda _{\overline{\mathrm{MS}}}^{(4)}= & {} 294(12)\,\,\mathrm {MeV}\quad \,\mathrm {Refs.}~{[13,16,23,79{-}82]}, \end{aligned}$$
(352)
$$\begin{aligned} \Lambda _{\overline{\mathrm{MS}}}^{(3)}= & {} 343(12)\,\,\mathrm {MeV}\quad \,\mathrm {Refs.}~{[13,16,23,79{-}82]}. \end{aligned}$$
(353)

Compared with FLAG 16, the errors have been reduced by about 30% due to new computations. As can be seen from Fig. 38, when surveying the green data points, the individual lattice results agree within their quoted errors. Furthermore, those points are based on different methods for determining \(\alpha _s\), each with its own difficulties and limitations. Thus the overall consistency of the lattice \(\alpha _s\) results, together with the large number of ★ ratings in Table 38, engenders confidence in our range.

It is interesting to compare with the Particle Data Group average of nonlattice determinations of recent years,

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1174(16) \,, \quad \text{ PDG } \text{18, } \text{ nonlattice } [137] \quad \qquad (279) \nonumber \\ \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1174(16) \,, \quad \text{ PDG } \text{16, } \text{ nonlattice } [201] \end{aligned}$$
(354)
$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1175(17) \,, \quad \text{ PDG } \text{14, } \text{ nonlattice } [170] \end{aligned}$$
(355)
$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1183(12) \,, \quad \text{ PDG } \text{12, } \text{ nonlattice } [803]\qquad \qquad \end{aligned}$$
(356)

(there was no update in [137]). There is good agreement with Eq. (344). Owing to recent new determinations, the lattice average is by now a factor of two more precise than the nonlattice world average, and an average of the two [Eq. (344) and Eq. (279)] yields

$$\begin{aligned} \alpha ^{(5)}_{\overline{\mathrm{MS}}}(M_Z)= & {} 0.1180(7) \,, \quad \text{ FLAG } \text{19 } \text{+ } \text{ PDG } \text{18 }. \end{aligned}$$
(357)
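The combination in Eq. (357) can be checked with a short sketch. It assumes a standard inverse-variance weighted average of two independent inputs, which is our reading and is not spelled out in the text; the function name `weighted_average` is illustrative.

```python
import math


def weighted_average(values, errors):
    """Inverse-variance weighted mean and its error for independent inputs."""
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


# Lattice range, Eq. (344), and PDG 18 nonlattice average, Eq. (279)
mean, err = weighted_average([0.1182, 0.1174], [0.0008, 0.0016])
print(f"{mean:.4f} +/- {err:.4f}")  # prints "0.1180 +/- 0.0007"
```

Since the lattice error is half the nonlattice one, the lattice range carries four times the weight, and the combined value lies close to Eq. (344).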

In the lower plot in Fig. 38 we show as blue circles the various PDG pre-averages which lead to the PDG 2018/2016 nonlattice average. They are on a similar level as our pre-ranges (green squares): each one corresponds to an estimate (by the PDG) of \(\alpha _s\) determined from one set of input quantities. Within each pre-average multiple groups did the analysis and published their results as displayed in Ref. [137]. The PDG performed an average within each group (see footnote 77); we only display the latter in Fig. 38.

The fact that our range for the lattice determination of \(\alpha _{\overline{\mathrm{MS}}}(M_Z)\) in Eq. (344) is in excellent agreement with the PDG nonlattice average Eq. (279) is an excellent check for the subtle interplay of theory, phenomenology and experiments in the nonlattice determinations. The work done on the lattice provides an entirely independent determination, with negligible experimental uncertainty, which reaches a better precision even with our quite conservative estimate of its uncertainty.

Given that the PDG has not updated their number, Eq. (357) is presently the up-to-date world average.

We finish by commenting on perspectives for the future. The step-scaling methods have been shown to yield a very precise result and to satisfy all criteria easily. A downside is that dedicated simulations have to be done and the method is thus hardly used. It would be desirable to have at least one more such computation by an independent collaboration, as also requested in the review [666]. For now, we have seen a decrease of the error by 30% compared to FLAG 16. There is potential for a further reduction. Likely there will be more lattice calculations of \(\alpha _s\) from different quantities and by different collaborations. This will enable increasingly precise determinations, coupled with stringent cross-checks.

Fig. 39

The two- and three-point correlation functions (illustrated by Feynman diagrams) that need to be calculated to extract the ground state nucleon matrix elements. (Left) the nucleon two-point function. (Middle) the connected three-point function with source-sink separation \(\tau \) and operator insertion time slice t. (Right) the disconnected three-point function with operator insertion at time t

10 Nucleon matrix elements

Authors: S. Collins, R. Gupta, A. Nicholson, H. Wittig

A large number of experiments testing the Standard Model (SM) and searching for physics Beyond the Standard Model (BSM) involve either free nucleons (proton and neutron beams) or the scattering of electrons, protons, neutrinos and dark matter off nuclear targets. Necessary ingredients in the analysis of the experimental results are the matrix elements of various probes (fundamental currents or operators in a low energy effective theory) between nucleon or nuclear states. The goal of lattice-QCD calculations in this context is to provide high precision predictions of these matrix elements, the simplest of which give the nucleon charges and form factors. Determinations of the charges are the most mature and in this review we summarize the results for six quantities, the isovector and flavour diagonal axial vector, scalar and tensor charges. Other quantities that are not being reviewed but for which significant progress has been made in the last five years are the nucleon axial vector and electromagnetic form factors [804,805,806,807,808,809,810,811,812] and parton distribution functions [813]. The more challenging calculations of nuclear matrix elements that are needed, for example, to calculate the cross-sections of neutrinos or dark matter scattering off nuclear targets, are proceeding along three paths. The first is the direct evaluation of matrix elements calculated with initial and final states consisting of multiple nucleons [814, 815]. The second is the convolution of nucleon matrix elements with nuclear effects [816], and the third is the determination of two- and higher-body terms in the nuclear potential via the direct or the HAL QCD methods [817, 818]. We expect future FLAG reviews to include results on these quantities once a sufficient level of control over all the systematics is reached.

10.1 Isovector and flavour diagonal charges of the nucleon

The simplest nucleon matrix elements are composed of local quark bilinear operators, \(\overline{q_i} \Gamma _\alpha q_j\), where \(\Gamma _\alpha \) can be any of the sixteen Dirac matrices. In this report, we consider two types of flavour structures: (a) when \(i = u\) and \(j = d\). These \({\overline{u}} \Gamma _\alpha d\) operators arise in \(W^\pm \) mediated weak interactions such as in neutron or pion decay. We restrict the discussion to the matrix elements of the axial vector (A), scalar (S) and tensor (T) currents, which give the isovector charges, \(g_{A,S,T}^{u-d}\) (see footnote 78). (b) When \(i = j \) for \(j \in \{u, d, s, c\}\), there is no change of flavour, e.g., in processes mediated via the electromagnetic or weak neutral interaction or dark matter. These \(\gamma \) or \(Z^0\) or dark matter mediated processes couple to all flavours with their corresponding charges. Since these probes interact with nucleons within nuclear targets, one has to include the effects of QCD (to go from the couplings defined at the quark and gluon level to those for nucleons) and nuclear forces in order to make contact with experiments. The isovector and flavour diagonal charges, given by the matrix elements of the corresponding operators calculated between nucleon states, are these nucleon level couplings. Here we review results for the light and strange flavours, \(g_{A,S,T}^{u}\), \(g_{A,S,T}^{d}\), and \(g_{A,S,T}^{s}\).

The isovector and flavour diagonal operators also arise in BSM theories due to the exchange of novel force carriers or as effective interactions due to loop effects. The associated couplings are defined at the energy scale \(\Lambda _{\mathrm{BSM}}\), while lattice-QCD calculations of matrix elements are carried out at a hadronic scale, \(\mu \), of a few GeV. The tool for connecting the couplings at the two scales is the renormalization group. Since the operators of interest are composed of quark fields (and more generally also of gluon fields), the predominant change in the corresponding couplings under a scale transformation is due to QCD. To define the operators and their couplings at the hadronic scale \(\mu \), one constructs renormalized operators, whose matrix elements are finite in the continuum limit. This requires calculating both multiplicative renormalization factors, including the anomalous dimensions and finite terms, and the mixing with other operators. We discuss the details of the renormalization factors needed for each of the six operators reviewed in this report in Sect. 10.1.3.

Once renormalized operators are defined, the matrix elements of interest are extracted using expectation values of two-point and three-point correlation functions illustrated in Fig. 39, where the latter can have both quark line connected and disconnected contributions. In order to isolate the ground state matrix element, these correlation functions are analyzed using their spectral decomposition. The current practice is to fit the n-point correlation functions (or ratios involving three- and two-point functions) including contributions from one or two excited states.

The ideal situation occurs if the time separation \(\tau \) between the nucleon source and sink positions, and the distance of the operator insertion time from the source and the sink, t and \(\tau - t\), respectively, are large enough such that the contribution of all excited states is negligible. In the limit of large \(\tau \), the ratio of noise to signal in the nucleon two and three-point correlation functions grows exponentially as \(e^{(M_N - \frac{3}{2}M_\pi )\tau }\) [819, 820], where \(M_N\) and \(M_\pi \) are the masses of the nucleon and the pion, respectively. Therefore, in particular at small pion masses, maintaining reasonable errors for large \(\tau \) is challenging, with current calculations limited to \(\tau \lesssim 1.5\) fm. In addition, the mass gap between the ground and excited (including multi-particle) states is smaller than in the meson sector and at these separations, excited-state effects can be significant. The approach commonly taken is to first obtain results with high statistics at multiple values of \(\tau \), using the methods described in Sect. 10.1.1. Then, as mentioned above, excited-state contamination is removed by fitting the data using a fit form involving one or two excited states. The different strategies that have been employed to minimize excited-state contamination are discussed in Sect. 10.1.2.
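To give a feeling for the exponential growth of the noise-to-signal ratio quoted above, the following sketch evaluates \(e^{(M_N - \frac{3}{2}M_\pi )\tau }\) at a few source-sink separations. The nucleon mass of 939 MeV and the value of \(\hbar c\) are assumed inputs that are not quoted in the text, and only the asymptotic scaling is modelled; the overall prefactor is not fixed by this argument.

```python
import math

HBAR_C_MEV_FM = 197.3269804  # hbar*c in MeV fm (assumed input)
M_N = 939.0                  # nucleon mass in MeV (assumed illustrative value)
M_PI = 135.0                 # pion mass in MeV, the physical point used in the text


def noise_to_signal_growth(tau_fm: float) -> float:
    """Growth factor exp[(M_N - 3/2 M_pi) * tau] of the noise relative to the signal."""
    return math.exp((M_N - 1.5 * M_PI) * tau_fm / HBAR_C_MEV_FM)


for tau in (0.5, 1.0, 1.5):
    print(tau, noise_to_signal_growth(tau))
```

Under these assumptions the relative noise at \(\tau = 1.5\) fm is larger by a factor of a few hundred than at \(\tau = 0\), which illustrates why larger separations are so costly at the physical pion mass.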

Usually, the quark-connected part of the three-point function (corresponding to the plot in the centre of Fig. 39) is computed via the so-called “sequential propagator method”, which uses the product of two quark propagators between the positions of the initial and the final nucleons as a source term for another inversion of the lattice Dirac operator. This implies that the position of the sink timeslice is fixed at some chosen value. Varying the value of the source-sink separation \(\tau \) then requires the calculation of another sequential propagator.

The evaluation of quark-disconnected contributions is computationally more challenging as the disconnected loop (which contains the operator insertion, as illustrated in Fig. 39 right) is needed at all points on a particular timeslice or, in general, over the whole lattice. The quark loop is computed stochastically and then correlated with the nucleon two-point function before averaging this three-point function over the ensemble of gauge configurations. The associated statistical error, therefore, is a combination of that due to the stochastic evaluation (on each configuration) and that from the gauge average. The number of stochastic sources employed on each configuration is, typically, optimized to reduce the overall error for a given computational cost. The statistical errors of the connected contributions, in contrast, usually come only from the ensemble average since they are often evaluated exactly on each configuration, for a small number of source positions. If these positions are well-separated in space and time, then each measurement is statistically independent. The methodology applied for these calculations and the variance reduction techniques are summarized in Sect. 10.1.1. By construction, arbitrary values of \(\tau \) across the entire temporal extent of the lattice can be realized when computing the quark-disconnected contribution, since the source-sink separation is determined by the part of the diagram that corresponds to the two-point nucleon correlator. However, in practice statistical fluctuations of both the connected and disconnected contributions increase sharply, so that the signal is lost in the statistical noise for \(\tau \gtrsim 1.5\) fm.

The lattice calculation is performed for a given number of quark flavours and at a number of values of the lattice spacing a, the pion mass \(M_\pi \), and the lattice size represented by \(M_\pi L\). The results need to be extrapolated to the physical point defined by \(a=0\), \(M_\pi = 135\) MeV and \(M_\pi L \rightarrow \infty \). This is done by fitting the data simultaneously in these three variables using a theoretically motivated ansatz. The ansätze used and the fitting strategy are described in Sect. 10.1.4.

The procedure for rating the various calculations and the criteria specific to this chapter are discussed in Sect. 10.2, which also includes a brief description of how the final averages are constructed. The physics motivation for computing the isovector charges, \(g_{A,S,T}^{u-d}\), and the review of the lattice results are presented in Sect. 10.3. This is followed by a discussion of the relevance of the flavour diagonal charges, \(g_{A,S,T}^{u,d,s}\), and a presentation of the lattice results in Sect. 10.4.

10.1.1 Technical aspects of the calculations of nucleon matrix elements

The calculation of n-point functions needed to extract nucleon matrix elements requires making four essential choices. The first involves choosing between the suite of background gauge field ensembles one has access to. The range of lattice parameters should be large enough to facilitate the extrapolation to the continuum and infinite volume limits, and the evaluation at the physical pion mass taken to be \(M_\pi =135\) MeV. Such ensembles have been generated with a variety of discretization schemes for the gauge and fermion actions that have different levels of improvement and preservation of continuum symmetries. The actions employed at present include (i) Wilson gauge with nonperturbatively improved Sheikholeslami–Wohlert fermions (nonperturbatively improved clover fermions) [85, 90, 402, 821,822,823,824], (ii) Iwasaki gauge with nonperturbatively improved clover fermions [812, 825], (iii) Iwasaki gauge with twisted mass fermions with a clover term [826,827,828,829,830], (iv) tadpole Symanzik improved gauge with highly improved staggered quarks (HISQ) [7, 83, 84, 86, 831,832,833,834,835], (v) Iwasaki gauge with domain wall fermions (DW) [6, 89, 836,837,838,839,840] and (vi) Iwasaki gauge with overlap fermions [841,842,843]. For details of the lattice actions, see Glossary A.1.

The second choice is of the valence quark action. Here there are two options: to maintain a unitary formulation by choosing exactly the same action as is used in the generation of gauge configurations, or to choose a different action and tune the quark masses to match the pseudoscalar meson spectrum in the two theories. Such mixed action formulations are nonunitary but are expected to have the same continuum limit as QCD. The reason for choosing a mixed action approach is expediency. For example, the generation of 2 + 1 + 1 flavour HISQ and 2 + 1 flavour DW ensembles with physical quark masses has been possible even at the coarse lattice spacing of \(a=0.15\) fm and there are indications that cut-off effects are reasonably small. These ensembles have been analyzed using clover-improved Wilson fermions, DW and overlap fermions since the construction of baryon correlation functions with definite spin and parity is much simpler than with staggered fermions.

The third choice is the combination of the algorithm for inverting the Dirac matrix and variance reduction techniques. Efficient inversion and variance reduction techniques are needed for the calculation of nucleon correlation functions with high precision because the signal to noise degrades exponentially as \(e^{({\frac{3}{2}M_\pi -M_N}) \tau }\) with the source-sink separation \(\tau \). Thus, the number of measurements needed for high precision is much larger than in the meson sector. Commonly used inversion algorithms include the multigrid [844] and the deflation-accelerated Krylov solvers [845], which can handle linear systems with large condition numbers very efficiently, thereby enabling calculations of correlation functions at the physical pion mass.

The sampling of the path integral is limited by the number \(N_\mathrm{conf}\) of gauge configurations generated. One requires sufficiently large \(N_{\mathrm{conf}}\) such that the phase space (for example, different topological sectors) has been adequately sampled and all the correlation functions satisfy the expected lattice symmetries such as C, P, T, momentum and translation invariance. Thus, one needs gauge field generation algorithms that give decorrelated large volume configurations cost-effectively. On such large lattices, to reduce errors one can exploit the fact that the volume is large enough to allow multiple measurements of nucleon correlation functions that are essentially statistically independent. Two other common variance reduction techniques that reduce the cost of multiple measurements on each configuration are the truncated solver with bias correction method [846] and deflation of the Dirac matrix for the low-lying modes followed by a sloppy solution with bias correction for the residual matrix consisting predominantly of the high-frequency modes [846, 847].

A number of other variance reduction methods are also being used and developed. These include deflation with hierarchical probing for disconnected diagrams [848, 849], the coherent source sequential propagator method [850, 851], low mode averaging [332, 852], the hopping parameter expansion [853, 854] and partitioning [855] (also known as dilution [856]).

The final choice is of the interpolating operator used to create and annihilate the nucleon state, and of the operator used to calculate the matrix element. Along with the choice of the interpolating operator (or operators if a variational method is used) one also chooses a “smearing” of the source used to construct the quark propagator. By tuning the width of the smearing, one can optimize the spatial extent of the nucleon interpolating operator to reduce the overlap with the excited states. Two common smearing algorithms are Gaussian (Wuppertal) [857] and Jacobi [858] smearing.

Having made all the above choices, for which a reasonable recipe exists, one calculates a statistical sample of correlation functions from which the desired ground state nucleon matrix element is extracted. Excited states, unfortunately, contribute significantly to nucleon correlation functions in present studies. To remove their contributions, calculations are performed with multiple source-sink separations \(\tau \) and fits are made to the correlation functions using their spectral decomposition as discussed in the next section.

10.1.2 Controlling excited-state contamination

Nucleon matrix elements are determined from a combination of two- and three-point correlation functions. To be more specific, let \(B^\alpha (\vec {x},t)\) denote an interpolating operator for the nucleon. Placing the initial state at timeslice \(t=0\), the two-point correlation function of a nucleon with momentum \(\vec {p}\) reads

$$\begin{aligned} C_2(\vec {p};\tau ) = \sum _{\vec {x},\vec {y}}\,e^{i\vec {p}\cdot (\vec {x}-\vec {y})}\, {\mathbb {P}}_{\beta \alpha }\,\left\langle B^\alpha (\vec {x},\tau )\,{\overline{B}}^\beta (\vec {y},0) \right\rangle , \end{aligned}$$
(358)

where the projector \({\mathbb {P}}\) selects the polarization, and \(\alpha , \beta \) denote Dirac indices. The three-point function of two nucleons and a quark bilinear operator \(O_\Gamma \) is defined as

$$\begin{aligned} C_3^\Gamma (\vec {q};t,\tau )= & {} \sum _{\vec {x},\vec {y},\vec {z}}\, e^{ i\vec {p\,}^\prime \cdot (\vec {x}-\vec {z})}\, e^{-i\vec {p}\cdot (\vec {y}-\vec {z})}\, {\mathbb {P}}_{\beta \alpha }\,\nonumber \\&\times \left\langle B^\alpha (\vec {x},\tau )\,O_\Gamma (\vec {z},t)\, {\overline{B}}^\beta (\vec {y},0) \right\rangle , \end{aligned}$$
(359)

where \(\vec {p},\ \vec {p\,}^\prime \) denote the momenta of the nucleons at the source and sink, respectively, and \(\vec {q}\equiv \vec {p\,}^\prime -\vec {p}\) is the momentum transfer. The bilinear operator is inserted at timeslice t, and \(\tau \) denotes the source-sink separation. Both \(C_2\) and \(C_3^\Gamma \) are constructed using the nonperturbative quark propagators, \(D^{-1}(y,x)\), where D is the lattice Dirac operator.

The framework for the analysis of excited-state contamination is based on spectral decomposition. After inserting complete sets of eigenstates of the transfer matrix, the expressions for the correlators \(C_2\) and \(C_3^\Gamma \) read

$$\begin{aligned} C_2(\vec {p};\tau )= & {} \frac{1}{L^3} \sum _{n}\,{\mathbb {P}}_{\beta \alpha }\,\langle \Omega |B^\alpha |n\rangle \langle n|{\overline{B}}^\beta |\Omega \rangle \, e^{-E_n\tau }, \end{aligned}$$
(360)
$$\begin{aligned} C_3^\Gamma (\vec {q};t,\tau )= & {} \frac{1}{L^3}\sum _{n,m}\, {\mathbb {P}}_{\beta \alpha }\, \langle \Omega |B^\alpha |n\rangle \,\nonumber \\&\times \langle n|O_\Gamma |m\rangle \, \langle m|{\overline{B}}^\beta |\Omega \rangle \, e^{-E_n(\tau -t)}\,e^{-E_m t},\nonumber \\ \end{aligned}$$
(361)

where \(|\Omega \rangle \) denotes the vacuum state, and \(E_n\) represents the energy of the \(n^{\mathrm{th}}\) eigenstate \(|n\rangle \) in the nucleon channel. Here we restrict the discussion to vanishing momentum transfer, \(\vec {q}=0\) and label the ground state by \(n=0\). The matrix element of interest, \(g_\Gamma \equiv \langle 0|O_\Gamma |0\rangle \) can, for instance, be obtained from the asymptotic behaviour of the ratio

$$\begin{aligned} R_\Gamma (t,\tau )\equiv & {} \frac{C_3^\Gamma (\vec {q}=0;t,\tau )}{C_2(\vec {p}=0;\tau )} {\mathop {\longrightarrow }\limits ^{t,(\tau -t)\rightarrow \infty }} g_{\Gamma } \nonumber \\&+ \,\mathrm{O}(e^{-\Delta t},\,e^{-\Delta (\tau -t)},\,e^{-\Delta \tau }), \end{aligned}$$
(362)

where \(\Delta \equiv E_1-E_0\) denotes the energy gap between the ground state and the first excitation. Here we assume that the bilinear operator \(O_\Gamma \) is appropriately renormalized (see Sect. 10.1.3).
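The approach of the ratio in Eq. (362) to \(g_\Gamma \) can be illustrated with a toy model in which the sums in Eqs. (360) and (361) are truncated to two states at zero momentum. All numbers below (energies, overlaps and matrix elements in lattice units) are hypothetical and chosen only for illustration; the common factor \(1/L^3\) and the projector cancel in the ratio and are omitted.

```python
import math

# Hypothetical two-state toy model in lattice units (not taken from any calculation)
E = [0.5, 0.9]            # energies E_0, E_1, so that Delta = 0.4
A = [1.0, 0.8]            # overlaps <Omega|B|n>, taken real
O = [[1.27, 0.10],        # matrix elements <n|O_Gamma|m>; O[0][0] plays the role of g_Gamma
     [0.10, 1.00]]


def c2(tau):
    """Two-point function, Eq. (360), truncated to two states."""
    return sum(A[n] ** 2 * math.exp(-E[n] * tau) for n in range(2))


def c3(t, tau):
    """Three-point function, Eq. (361), truncated to two states."""
    return sum(
        A[n] * O[n][m] * A[m] * math.exp(-E[n] * (tau - t)) * math.exp(-E[m] * t)
        for n in range(2) for m in range(2)
    )


def ratio(t, tau):
    """Ratio R_Gamma(t, tau) of Eq. (362)."""
    return c3(t, tau) / c2(tau)


for tau in (8, 16, 32):
    print(tau, ratio(tau // 2, tau))   # approaches O[0][0] as tau grows
```

In this sketch the midpoint value of the ratio deviates from \(g_\Gamma \) by a term of order \(e^{-\Delta \tau /2}\), so the deviation shrinks as the source-sink separation is increased.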

Excited states with the same quantum numbers as the nucleon include resonances such as a Roper-like state with a mass of about 1.5 GeV, or multi-particle states consisting of a nucleon and one or more pions [859, 860]. The latter are expected to be responsible for the most relevant sub-leading contributions to two- and three-point correlators in Eqs. (358) and (359) or their ratios (362) as the pion mass approaches its physical value. Ignoring the interactions between the individual hadrons, one can easily identify the lowest-lying multi-particle states: they include the \(N\pi \pi \) state with all three particles at rest at \(\sim 1.2\) GeV, as well as \(N\pi \) states with both hadrons having nonzero and opposite momentum. Depending on the spatial box size L in physical units (with the smallest nonzero momentum equal to \(2\pi /L\)), there may be a dense spectrum of \(N\pi \) states before the first nucleon resonance is encountered. Corrections to nucleon correlation functions due to the pion continuum have been studied using chiral effective theory [859,860,861,862] and Lüscher’s finite-volume quantization condition [863].
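The statement about the density of multi-particle states can be made concrete with a small sketch that lists the energies of noninteracting \(N\pi \) states with back-to-back momenta \(|\vec {p}|^2 = n^2 (2\pi /L)^2\). The nucleon mass of 939 MeV and the box sizes \(M_\pi L = 4\) and 6 are assumptions made for illustration, and interactions between the hadrons are ignored, as in the text.

```python
import math

M_N = 939.0    # nucleon mass in MeV (assumed illustrative value)
M_PI = 135.0   # pion mass in MeV, the physical point used in the text


def n_pi_energy(mpi_L: float, n_squared: int = 1) -> float:
    """Energy in MeV of a noninteracting N(p) pi(-p) state with |p|^2 = n_squared * (2 pi / L)^2."""
    p = 2.0 * math.pi * math.sqrt(n_squared) * M_PI / mpi_L  # |p| in MeV
    return math.sqrt(M_N**2 + p**2) + math.sqrt(M_PI**2 + p**2)


n_pi_pi_at_rest = M_N + 2.0 * M_PI   # about 1.2 GeV, as quoted in the text

for mpi_L in (4.0, 6.0):
    print(mpi_L, [round(n_pi_energy(mpi_L, n2)) for n2 in (1, 2, 3)])
```

For these box sizes several \(N\pi \) levels lie below the Roper-like state at about 1.5 GeV, and they move down in energy as the box is enlarged.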

The well-known noise problem of baryonic correlation functions implies that the long-distance regime, \(t, (\tau -t)\rightarrow \infty \), where the correlators are dominated by the ground state, is difficult to reach. Current lattice calculations of baryonic three-point functions are typically confined to source-sink separations of \(\tau \lesssim 1.5\) fm, despite the availability of efficient noise reduction methods. In view of the dense excitation spectrum encountered in the nucleon channel, one has to demonstrate that the contributions from excited states are sufficiently suppressed to guarantee an unbiased determination of nucleon matrix elements. There are several strategies to address this problem:

  • Multi-state fits to correlator ratios or individual two- and three-point functions;

  • Three-point correlation functions summed over the operator insertion time t;

  • Increasing the projection of the interpolator \(B^\alpha \) onto the ground state.

The first of the above methods includes excited state contributions explicitly when fitting to the spectral decomposition of the correlation functions, Eqs. (360, 361) or, alternatively, their ratio (see Eq. (362)). In its simplest form, the resulting expression for \(R_\Gamma \) includes the contributions from the first excited state, i.e.,

$$\begin{aligned} R_\Gamma (t,\tau )= & {} g_\Gamma +c_{01}\,e^{-{\Delta }t} +c_{10}\,e^{-{\Delta }(\tau -t)}\nonumber \\&+\,c_{11}\,e^{-{\Delta }\tau }+\cdots , \end{aligned}$$
(363)

where \(c_{01}, c_{10}, c_{11}\) and \(\Delta \) are treated as additional parameters when fitting \(R_\Gamma (t,\tau )\) simultaneously over intervals in the source-sink separation \(\tau \) and the operator insertion timeslice t. Multi-exponential fits become more difficult to stabilize for a growing number of excited states, since an increasing number of free parameters must be sufficiently constrained by the data. Therefore, a high level of statistical precision at several source-sink separations is required. One common way to address this issue is to introduce Bayesian constraints, as described in [864]. Alternatively, one may try to reduce the number of free parameters by fixing the energy gap \(\Delta \) (see, for instance, Ref. [805]), by assuming that the lowest-lying excitations are described by noninteracting multi-particle states consisting of the nucleon and at least one pion.

Ignoring the explicit contributions from excited states and fitting \(R_\Gamma (t,\tau )\) to a constant in t for fixed \(\tau \) amounts to applying what is called the “plateau method”. The name derives from the ideal situation that sufficiently large source-sink separations \(\tau \) can be realized, which would cause \(R_\Gamma (t,\tau )\) to exhibit a plateau in t independent of \(\tau \). The ability to control excited-state contamination is rather limited in this approach, since the only option is to check for consistency in the estimate of the plateau as \(\tau \) is varied. In view of the exponential degradation of the statistical signal for increasing \(\tau \), such stability checks are difficult to perform reliably.

Summed operator insertions, originally proposed in Ref. [865], have also emerged as a widely used method to address the problem of excited state contamination. One way to implement this method [866, 867] proceeds by summing \(R_\Gamma (t,\tau )\) over the insertion time t, resulting in the correlator ratio \(S_\Gamma (\tau )\),

$$\begin{aligned} S_\Gamma (\tau ) \equiv \sum _{t=a}^{\tau -a}\,R_\Gamma (t,\tau ). \end{aligned}$$
(364)

The asymptotic behaviour of \(S_\Gamma (\tau )\), including sub-leading terms, for large source-sink separations \(\tau \) can be easily derived from the spectral decomposition of the correlators and is given by [868]

$$\begin{aligned} S_\Gamma (\tau )\;&{\mathop {\longrightarrow }\limits ^{\tau \gg 1/\Delta }}\; K_\Gamma +(\tau -a)\,g_\Gamma \nonumber \\&\quad \quad \;+(\tau -a)\,e^{-\Delta \tau }d_\Gamma +e^{-\Delta \tau }f_\Gamma +\cdots ,\quad \end{aligned}$$
(365)

where \(K_\Gamma \) is a constant, and the coefficients \(d_\Gamma \) and \(f_\Gamma \) contain linear combinations of transition matrix elements involving the ground and first excited states. Thus, the matrix element of interest, \(g_\Gamma \), is obtained from the linear slope of \(S_\Gamma (\tau )\) with respect to the source-sink separation \(\tau \). While the leading corrections from excited states are parametrically smaller than those of the original ratio \(R_\Gamma (t,\tau )\) (see Eq. (362)), extracting the slope from a linear fit to \(S_\Gamma (\tau )\) typically results in relatively large statistical errors. In principle, one could include the contributions from excited states explicitly in the expression for \(S_\Gamma (\tau )\). However, in practice it is often difficult to constrain an enlarged set of parameters reliably, in particular if one cannot afford to determine \(S_\Gamma (\tau )\) except for a handful of source-sink separations.
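The summation method of Eqs. (364) and (365) can be sketched with a two-state toy model. The energies, overlaps and matrix elements below are hypothetical values in lattice units (with \(a=1\)), and the slope is estimated from a simple finite difference between two source-sink separations rather than from a fit to statistical data.

```python
import math

# Hypothetical two-state toy model in lattice units (a = 1); values are illustrative only
E = [0.5, 0.9]                     # energies E_0, E_1
A = [1.0, 0.8]                     # overlaps <Omega|B|n>, taken real
O = [[1.27, 0.10], [0.10, 1.00]]   # O[0][0] plays the role of g_Gamma


def ratio(t, tau):
    """R_Gamma(t, tau) from the two-state spectral decomposition at zero momentum."""
    c2 = sum(A[n] ** 2 * math.exp(-E[n] * tau) for n in range(2))
    c3 = sum(A[n] * O[n][m] * A[m] * math.exp(-E[n] * (tau - t) - E[m] * t)
             for n in range(2) for m in range(2))
    return c3 / c2


def summed_ratio(tau):
    """S_Gamma(tau) of Eq. (364): sum over insertion times t = a, ..., tau - a."""
    return sum(ratio(t, tau) for t in range(1, tau))


def slope(tau1, tau2):
    """Finite-difference slope of S_Gamma between two source-sink separations."""
    return (summed_ratio(tau2) - summed_ratio(tau1)) / (tau2 - tau1)


for tau in (6, 10, 14, 20):
    print(tau, ratio(tau // 2, tau), slope(tau, tau + 2))
```

In this toy model the slope at large \(\tau \) lies closer to \(g_\Gamma \) than the midpoint value of \(R_\Gamma \) at the same separation, in line with the stronger exponential suppression in Eq. (365). With real data the gain has to be weighed against the larger statistical error of the slope.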

The original summed operator insertion technique described in Refs. [857, 865, 869, 870] avoids the explicit summation over the operator insertion time t at every fixed value of \(\tau \). Instead, one replaces one of the quark propagators that appear in the representation of the two-point correlation function \(C_2(t)\) by a “sequential” propagator, according to

$$\begin{aligned} D^{-1}(y,x) \rightarrow D_\Gamma ^{-1}(y,x) = \sum _z D^{-1}(y,z)\Gamma D^{-1}(z,x).\nonumber \\ \end{aligned}$$
(366)

In this expression, the position \(z\equiv (\vec {z},t)\) of the insertion of the quark bilinear operator is implicitly summed over, by inverting the lattice Dirac operator D on the source field \(\Gamma D^{-1}(z,x)\). While this gives access to all source-sink separations \(0\le \tau \le T\), where T is the temporal extent of the lattice, the resulting correlator also contains contact terms, as well as contributions from \(\tau<t<T\) that must be controlled. This method (see footnote 79) has been adopted recently by the CalLat collaboration in their calculation of the isovector axial charge [84, 835].
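The linear-algebra identity behind Eq. (366) can be illustrated with a deliberately tiny example: a second solve of the Dirac equation on the source \(\Gamma D^{-1}(\cdot ,x)\) reproduces the explicit sum over the insertion point. The \(2\times 2\) matrices below are hypothetical stand-ins for the lattice Dirac operator and the insertion, with site and spin indices combined into one index, and the helper names are illustrative.

```python
def solve2(m, b):
    """Solve the 2x2 linear system m @ psi = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - b[0] * m[1][0]) / det]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


D = [[2.0, 1.0], [1.0, 3.0]]        # toy stand-in for the lattice Dirac operator
GAMMA = [[1.0, 0.0], [0.0, -1.0]]   # toy stand-in for the operator insertion

x = 0
source = [1.0 if i == x else 0.0 for i in range(2)]

# First inversion: ordinary propagator column D^{-1}(., x)
prop = solve2(D, source)

# Second inversion on the source Gamma D^{-1}(., x): sequential propagator column
seq = solve2(D, matvec(GAMMA, prop))

# Explicit sum over the insertion point, using the full inverse built column by column
inv_cols = [solve2(D, [1.0, 0.0]), solve2(D, [0.0, 1.0])]   # inv_cols[j][i] = D^{-1}(i, j)
explicit = [sum(inv_cols[z][y] * GAMMA[z][w] * inv_cols[x][w]
                for z in range(2) for w in range(2)) for y in range(2)]

print(seq, explicit)   # the two columns agree up to rounding
```

The point of the sketch is that the summed insertion costs one additional inversion per source, without ever forming the full inverse; the explicit sum is computed here only as a check.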

As in the case of explicitly summing over the operator insertion time, the matrix element of interest is determined from the slope of the summed correlator. For instance, in Ref. [84], the axial charge was determined from the summed three-point correlation function, by fitting to its asymptotic behaviour [871] including sub-leading terms.

In practice, one often uses several methods simultaneously [e.g., multi-state fits and the summation method based on Eq. (365)], in order to check whether the results converge towards a common value. All of the approaches for controlling excited-state contributions proceed by fitting data obtained in a finite interval in \(\tau \) to a function that describes the approach to the asymptotic behaviour derived from the spectral decomposition. Obviously, the accessible values of \(\tau \) must be large enough so that the model function provides a good representation of the data that enter such a fit. It is then reasonable to impose a lower threshold on \(\tau \) above which the fit model is deemed reliable. We will return to this issue when explaining our quality criteria in Sect. 10.2.

The third method for controlling excited-state contamination aims at optimizing the projection onto the ground state in the two-point and three-point correlation functions [823, 851, 874]. The RQCD collaboration has chosen to optimize the parameters in the Gaussian smearing procedure, so that the overlap of the nucleon interpolating operator onto the ground state is maximized [823]. In this way it may be possible to use shorter source-sink separations without incurring a bias due to excited states.

The variational method, originally designed to provide detailed information on energy levels of the ground and excited states in a given channel [875,876,877,878], has also been adapted to the determination of hadron-to-hadron transition elements [868]. In the case of nucleon matrix elements, the authors of Ref.  [874] have employed a basis of operators to construct interpolators that couple to individual eigenstates in the nucleon channel. The method has produced promising results when applied to calculations of the axial and other forward matrix elements at a fixed value of the pion mass [851, 874, 879]. However, a more comprehensive study aimed at providing an estimate at the physical point has, until now, not been performed.

10.1.3 Renormalization and Symanzik improvement of local currents

In this section we discuss the matching of the normalization of lattice operators to a continuum reference scheme such as \({\overline{\mathrm {MS}}}\), and the application of Symanzik improvement to remove O(a) contributions. The relevant operators for this review are the axial (\(A_\mu \)), tensor (\(T_{\mu \nu }\)) and scalar (S) local operators of the form \({{{\mathcal {O}}}}_\Gamma ={\overline{q}}\Gamma q\), with \(\Gamma =\gamma _\mu \gamma _5\), \(i\sigma _{\mu \nu }\) and \({\mathbf {1}}\), respectively, whose matrix elements are evaluated in the forward limit. The general form for renormalized operators in the isovector flavour combination, at a scale \(\mu \), reads

$$\begin{aligned} {{{\mathcal {O}}}}_\Gamma ^{{\overline{\mathrm {MS}}}}(\mu )= & {} Z_{{{\mathcal {O}}}}^{{\overline{\mathrm {MS}}},\mathrm{Latt}}(\mu a,g^2)\left[ {{{\mathcal {O}}}}_\Gamma (a) +ab_{{{\mathcal {O}}}}m\mathcal{O}_\Gamma (a)\nonumber \right. \\&\left. +\,ac_{{{\mathcal {O}}}}{{{\mathcal {O}}}}_\Gamma ^{\mathrm{imp}}(a)\right] +O(a^2), \end{aligned}$$
(367)

where \(Z_{{{\mathcal {O}}}}^{{\overline{\mathrm {MS}}},\mathrm{Latt}}(\mu a,g^2)\) denotes the multiplicative renormalization factor determined in the chiral limit and the second and third terms represent all possible mass-dependent and mass-independent Symanzik improvement terms, respectively (see footnote 80). The chiral properties of overlap fermions, domain-wall fermions (with improvement up to \(O(m_{\mathrm{res}}^n)\), where \(m_{\mathrm{res}}\) is the residual mass) and twisted mass fermions (at maximal twist [884, 885]) mean that the O(a) improvement terms are absent, while for nonperturbatively improved Sheikholeslami–Wohlert–Wilson (nonperturbatively-improved clover) fermions all terms appear in principle. For the operators of interest here there are several mass-dependent terms but at most one (higher dimensional) \({{{\mathcal {O}}}}_\Gamma ^{\mathrm{imp}}\), see, e.g., Refs. [886, 887]. However, the latter involves external derivatives whose corresponding matrix elements vanish in the forward limit. Note that no mention is made of staggered fermions as they are not, currently, widely employed as valence quarks in nucleon matrix element calculations.

In order to illustrate the above remarks we consider the renormalization and improvement of the isovector axial current. This current has no anomalous dimension and hence the renormalization factor, \(Z_A=Z_A^{{\overline{\mathrm {MS}}},\mathrm{Latt}}(g^2)\), is independent of the scale. The factor is usually computed nonperturbatively via the axial Ward identity [888] or the Rome-Southampton method [468] (see Sect. A.3 for details). In some studies, the ratio with the corresponding vector renormalization factor, \(Z_A/Z_V\), is determined for which some of the systematics cancel. In this case, one constructs the combination \(Z_A g_A/(Z_V g_V)\), where \(Z_V g_V=1\) and \(g_A\) and \(g_V\) are the lattice forward matrix elements, to arrive at the renormalized axial charge [834]. For domain wall fermions the ratio is employed in order to remove \(O(am_{\mathrm{res}})\) terms and achieve leading discretisation effects starting at \(O(a^2)\) [10]. Thus, as mentioned above, O(a) improvement terms are only present for nonperturbatively-improved clover fermions. For the axial current, Eq. (367) takes the explicit form,

$$\begin{aligned} A_\mu ^{{\overline{\mathrm {MS}}}}(\mu )= & {} Z_A^{{\overline{\mathrm {MS}}},\mathrm{Latt}}(g^2)\left[ \left( 1+ ab_A m_{\mathrm{val}}+ 3a{\tilde{b}}_A m_{\mathrm{sea}}\right) A_\mu (a)\right. \nonumber \\&\left. +ac_A \partial _\mu P(a)\right] +O(a^2), \end{aligned}$$
(368)

where \(m_{\mathrm{val}}\) and \(m_{\mathrm{sea}}\) are the average valence- and sea-quark masses derived from the vector Ward identity [881, 887, 888], and P is the pseudoscalar operator \({\overline{q}}\gamma _5 q\). The matrix element of the derivative term is equivalent to \(q_\mu \langle N(p^\prime )|P|N(p)\rangle \) and hence vanishes in the forward limit when the momentum transfer \(q_\mu =0\). The improvement coefficients \(b_A\) and \({\tilde{b}}_A\) are known perturbatively for a variety of gauge actions [886, 889, 890] and nonperturbatively for the tree-level Symanzik-improved gauge action for \(N_{ f}=2\,+\,1\) [891].
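
A short sketch of how Eq. (368) is applied in the forward limit, where the \(c_A\) term drops out, together with the ratio construction based on \(Z_V g_V=1\). The function names and all numerical inputs are hypothetical and chosen only to show the arithmetic.

```python
def renormalized_axial_charge(g_lat, z_a, a, b_a, b_a_tilde, m_val, m_sea):
    """Forward-limit version of Eq. (368): the c_A term vanishes at q = 0."""
    mass_factor = 1.0 + a * b_a * m_val + 3.0 * a * b_a_tilde * m_sea
    return z_a * mass_factor * g_lat


def axial_charge_from_ratio(g_a_lat, g_v_lat, z_a_over_z_v):
    """Uses Z_V g_V = 1, so that Z_A g_A = (Z_A / Z_V) * (g_A / g_V)."""
    return z_a_over_z_v * g_a_lat / g_v_lat


# Hypothetical inputs in lattice units (a = 1), for illustration only.
g_ren = renormalized_axial_charge(
    g_lat=1.60, z_a=0.78, a=1.0, b_a=1.2, b_a_tilde=0.1,
    m_val=0.004, m_sea=0.005)
g_ratio = axial_charge_from_ratio(g_a_lat=1.60, g_v_lat=1.25, z_a_over_z_v=1.0)
```

Setting `b_a` and `b_a_tilde` to zero recovers the purely multiplicative renormalization that applies to formulations without O(a) terms.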

Turning to operators for individual quark flavours, these can mix under renormalization and the singlet and nonsinglet renormalization factors can differ. For the axial current, such mixing occurs for all fermion formulations just like in the continuum, where the singlet combination acquires an anomalous dimension due to the \(\hbox {U}_A\)(1) anomaly. The ratio of singlet to nonsinglet renormalization factors, \(r_{{{\mathcal {O}}}}=Z^{\mathrm{s.}}_{{{\mathcal {O}}}}/Z^{\mathrm{n.s.}}_{{{\mathcal {O}}}}\) for \({{{\mathcal {O}}}}=A\) differs from 1 at \(O(\alpha _s^2)\) in perturbation theory (due to quark loops), suggesting that the mixing is a small effect. The nonperturbative determinations performed so far find \(r_A\approx 1\) [808, 828], supporting this. For the tensor current the disconnected diagram vanishes in the continuum due to chirality and consequently on the lattice \(r_T=1\) holds for overlap and DW fermions (assuming \(m_{\mathrm{res}}=0\) for the latter). For twisted-mass and clover fermions the mixing is expected to be small with \(r_T=1+O(\alpha _s^3)\) [892] and this is confirmed by the nonperturbative studies of Refs. [830, 893].

The scalar operators for the individual quark flavours, \({\overline{q}}q\), are relevant not only for the corresponding scalar charges, but also for the sigma terms, \(\sigma _q=m_q\langle N|{\overline{q}}q|N\rangle \), when combined with the quark masses (\(m_q\)). For overlap and DW fermions \(r_S=1\), as in the continuum, and all \({\overline{q}}q\) renormalize multiplicatively with the isovector \(Z_S\). The latter is equal to the inverse of the mass renormalization factor and hence \(m_q{\overline{q}}q\) is renormalization group (RG) invariant. For twisted mass fermions, through the use of Osterwalder–Seiler valence fermions, the operators \(m_{ud}({\overline{u}}u+{\overline{d}}d)\) and \(m_s{\overline{s}}s\) are also invariant [894] (see footnote 81). In contrast, the lack of good chiral properties leads to significant mixing between quark flavours for clover fermions. Nonperturbative determinations via the axial Ward identity [693, 824] have found the ratio \(r_S\) to be much larger than the perturbative expectation \(1+O(\alpha _s^2)\) [892] might suggest. While the sum over the quark flavours which appear in the action, \(\sum ^{N_{ f}}_q m_q {\overline{q}}q\), is RG invariant, large cancellations between the contributions from individual flavours can occur when evaluating, e.g., the strange sigma term. Note that for twisted mass and clover fermions there is also an additive contribution \(\propto a^{-3}{\mathbf {1}}\) (or \(\propto \mu a^{-2}{\mathbf {1}}\)) to the scalar operator. This contribution is removed from the nucleon scalar matrix elements by working with the subtracted current, \({\overline{q}}q - \langle {\overline{q}}q\rangle \), where \(\langle {\overline{q}}q\rangle \) is the vacuum expectation value of the current [887].
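
The RG invariance stated above for overlap and DW fermions can be made explicit in one line. Here \(Z_m\) denotes the mass renormalization factor (a symbol introduced only for this illustration), and the relation \(Z_S=Z_m^{-1}\) quoted in the text is used in the last step:

```latex
$$\begin{aligned} \left[ m_q\,{\overline{q}}q\right] ^{\mathrm{ren}} = \left( Z_m m_q\right) \left( Z_S\,{\overline{q}}q\right) = Z_m Z_S\, m_q\,{\overline{q}}q = m_q\,{\overline{q}}q . \end{aligned}$$
```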

Symanzik improvement for the singlet currents follows the same pattern as in the isovector case with O(a) terms only appearing for nonperturbatively-improved clover fermions. For the axial and tensor operators only mass dependent terms are relevant in the forward limit while for the scalar there is an additional gluonic operator \({{{\mathcal {O}}}}_S^{\mathrm{imp}}=\text {Tr}(F_{\mu \nu }F_{\mu \nu })\) with a coefficient of \(O(\alpha _s)\) in perturbation theory. When constructing the sigma terms from the quark masses and the scalar operator, the improvement terms remain and they must be included to remove all O(a) effects for nonperturbatively-improved clover fermions, see Ref. [887] for a discussion.

10.1.4 Extrapolations in a, \(M_\pi \) and \(M_\pi L\)

To obtain physical results which can be used to compare to or make predictions for experiment, all quantities must be extrapolated to the continuum and infinite volume limits. In general, either a chiral extrapolation or interpolation must also be made to the physical pion mass. These extrapolations need to be performed simultaneously since discretization and finite volume effects are themselves dependent upon the pion mass. Furthermore, in practice it is not possible to hold the pion mass fixed while the lattice spacing is varied, as some variation in a occurs when tuning the quark masses at fixed gauge coupling. Thus, one performs a simultaneous extrapolation in all three variables using a theoretically motivated formula of the form,

$$\begin{aligned} g(M_{\pi },a,L) = g_{\mathrm {phys}} + \delta _{M_{\pi }} + \delta _a + \delta _L \ , \end{aligned}$$
(369)

where \(g_{\mathrm {phys}}\) is the desired extrapolated result, and \(\delta _{M_{\pi }}\), \(\delta _a\), \(\delta _L\) are the deviations due to the pion mass, the lattice spacing, and the volume, respectively. Below we outline the forms for each of these terms.

All observables discussed in this section are dimensionless, therefore the extrapolation formulae may be parameterized by a set of dimensionless variables:

$$\begin{aligned} \epsilon _{\pi } = \frac{M_{\pi }}{\Lambda _{\chi }} \ , \quad M_{\pi } L \ , \quad \epsilon _a = \Lambda _a a. \end{aligned}$$
(370)

Here, \(\Lambda _{\chi } \sim 1\) GeV is a chiral symmetry breaking scale, which, for example, can be set to \(\Lambda _{\chi } = 4 \pi F_{\pi }\), where \(F_{\pi } = 92.2\) MeV is the pion decay constant, and \(\Lambda _a\) is a discretization scale, e.g., \(\Lambda _a = \frac{1}{4\pi w_0}\), where \(w_0\) is a gradient-flow scale [272].
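
For orientation, the variables of Eq. (370) can be evaluated as follows, using \(\Lambda _{\chi } = 4 \pi F_{\pi }\) and \(\Lambda _a = 1/(4\pi w_0)\) as suggested in the text. The ensemble parameters and the value of \(w_0\) in the example are hypothetical placeholders.

```python
import math

F_PI_MEV = 92.2                             # pion decay constant quoted in the text
LAMBDA_CHI_MEV = 4.0 * math.pi * F_PI_MEV   # chiral scale, roughly 1.16 GeV
HBARC_MEV_FM = 197.3269804                  # conversion constant hbar*c in MeV fm


def expansion_parameters(m_pi_mev, a_fm, l_fm, w0_fm):
    """Return (epsilon_pi, M_pi * L, epsilon_a) as defined in Eq. (370)."""
    eps_pi = m_pi_mev / LAMBDA_CHI_MEV
    m_pi_l = m_pi_mev * l_fm / HBARC_MEV_FM
    eps_a = a_fm / (4.0 * math.pi * w0_fm)  # Lambda_a = 1 / (4 pi w0)
    return eps_pi, m_pi_l, eps_a


# Hypothetical ensemble: M_pi = 135 MeV, a = 0.09 fm, L = 5.8 fm, w0 = 0.17 fm.
eps_pi, m_pi_l, eps_a = expansion_parameters(135.0, 0.09, 5.8, 0.17)
```

For these inputs \(\epsilon _\pi \) is of order 0.1 and \(\epsilon _a\) is a few percent, so both are small expansion parameters, while \(M_\pi L\) is close to 4.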

Effective field theory methods may be used to determine the form of each of these extrapolations. For the single nucleon charges, Heavy-Baryon \(\chi \)PT (HB\(\chi \)PT) is a common choice [895]; however, other formulations, such as unitarized \(\chi \)PT [896], are also employed. Various formulations of HB\(\chi \)PT exist, including those for two- and three-flavours, as well as with and without explicit \(\Delta \) degrees of freedom. Two-flavour HB\(\chi \)PT is typically used due to issues with convergence of the three-flavour theory [825, 897,898,899,900]. The convergence properties of all known formulations for baryon \(\chi \)PT, even at the physical pion mass, have not been well-established, and are generally believed to be poor compared to purely mesonic \(\chi \)PT.

To \({\mathcal {O}}\left( \epsilon _{\pi }^2\right) \), the two-flavour chiral expansion for the nucleon charges is known to be of the form [901],

$$\begin{aligned} g = g_0 + g_1 \epsilon _{\pi } + g_2 \epsilon _{\pi }^2 + {\tilde{g}}_2 \epsilon _{\pi }^2 \ln \left( \epsilon _{\pi }^2\right) \ , \end{aligned}$$
(371)

where \(g_1=0\) for all charges g except \(g_S^{u,d}\). The dimensionless coefficients \(g_{0,1,2}, {\tilde{g}}_2\) are assumed to be different for each of the different charges. The coefficients in front of the logarithms, \({\tilde{g}}_2\), are known functions of the lower-order low-energy constants (LECs), and do not represent new, independent LECs. Mixed action calculations will have further dependence upon the mixed valence-sea pion mass, \(m_{vs}\).

Given the potential difficulties with convergence of the chiral expansion, known values of the \({\tilde{g}}_2\) in terms of LECs are not typically used, but are left as free fit parameters. Furthermore, many quantities have been found to display mild pion mass dependence, such that Taylor expansions, i.e., neglecting logarithms in the above expressions, are also often employed. The lack of a rigorously established theoretical basis for the extrapolation in the pion mass thus requires data close to the physical pion mass for obtaining high precision extrapolated/interpolated results.
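
The two fit forms discussed above, Eq. (371) with a free logarithmic coefficient and the Taylor expansion without it, can be written as a single function. This is a sketch only; the coefficient values are hypothetical and were chosen merely to show the relative size of the terms near the physical pion mass.

```python
import math


def chiral_ansatz(eps_pi, g0, g2, g2_tilde=0.0, g1=0.0):
    """Eq. (371); g2_tilde = 0 gives the Taylor form without the logarithm."""
    e2 = eps_pi ** 2
    return g0 + g1 * eps_pi + g2 * e2 + g2_tilde * e2 * math.log(e2)


# Hypothetical coefficients, evaluated at M_pi = 135 MeV with Lambda_chi = 4 pi F_pi.
eps_phys = 135.0 / (4.0 * math.pi * 92.2)
with_log = chiral_ansatz(eps_phys, g0=1.20, g2=-1.0, g2_tilde=-0.5)
taylor = chiral_ansatz(eps_phys, g0=1.20, g2=-1.0)
```

Because \(\ln (\epsilon _\pi ^2)\) is large and negative at small pion mass, the logarithmic term can compete with the analytic \(\epsilon _\pi ^2\) term, which is one reason why data close to the physical point are needed to discriminate between the two forms.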

Discretization effects depend upon the lattice action used in a particular calculation, and their form may be determined using the standard Symanzik power counting. In general, for an unimproved action, the corrections due to discretization effects, \(\delta _a\), include terms of the form,

$$\begin{aligned} \delta _a = c_1 \epsilon _a + c_2 \epsilon _a^2 + \cdots \ , \end{aligned}$$
(372)

where \(c_{1,2}\) are dimensionless coefficients. Additional terms of the form \({\tilde{c}}_n \left( \epsilon _{\pi } \epsilon _a\right) ^n\), where n is an integer whose lowest value depends on the combined discretization and chiral properties, will also appear. Improved actions systematically remove correction terms, e.g., an \({\mathcal {O}}\left( a\right) \) improved action, combined with a similarly improved operator, will contain terms in the extrapolation ansatz beginning at \(\epsilon _a^2\) (see Sect. 10.1.3).
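
The effect of O(a) improvement on the ansatz of Eq. (372) can be sketched as follows. The flag `o_a_improved` and the coefficient values are assumptions for illustration; mixed terms in \(\epsilon _{\pi } \epsilon _a\) are omitted.

```python
def discretization_correction(eps_a, c1, c2, o_a_improved=False):
    """Leading terms of Eq. (372).

    The linear term is dropped when both action and operator are O(a) improved.
    """
    linear = 0.0 if o_a_improved else c1 * eps_a
    return linear + c2 * eps_a ** 2


# Hypothetical coefficients, evaluated for three values of eps_a.
table = []
for eps_a in (0.02, 0.04, 0.06):
    unimproved = discretization_correction(eps_a, c1=0.5, c2=1.0)
    improved = discretization_correction(eps_a, c1=0.5, c2=1.0, o_a_improved=True)
    table.append((eps_a, unimproved, improved))
```

In the improved case the correction falls by a factor of four when \(\epsilon _a\) is halved, whereas the unimproved correction falls only by about a factor of two for coefficients of similar size.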

Finite volume corrections, \(\delta _L\), may be determined in the usual way from effective field theory, by replacing loop integrals over continuous momenta with discrete sums. Finite volume effects therefore introduce no new undetermined parameters to the extrapolation. For example, at next-to-leading order, and neglecting contributions from intermediate delta baryons, the finite volume corrections for the axial charge in two-flavour HB\(\chi \)PT take the form [902],

$$\begin{aligned} \delta _L\equiv & {} g_{A}(L) - g_{A}(\infty ) \nonumber \\= & {} \frac{8}{3} \epsilon _{\pi }^2 \left[ g_0^3 F_1\left( M_{\pi }L\right) + g_0 F_3\left( M_{\pi } L\right) \right] \ , \end{aligned}$$
(373)

where

$$\begin{aligned} F_1\left( mL\right)= & {} \sum _{\mathbf {n\ne 0}}\left[ K_0\left( mL|{\mathbf {n}}|\right) - \frac{K_1\left( mL|{\mathbf {n}}|\right) }{mL|{\mathbf {n}}|}\right] \nonumber \\ F_3\left( mL\right)= & {} -\frac{3}{2} \sum _{{\mathbf {n}} \ne 0} \frac{K_1\left( mL|{\mathbf {n}}|\right) }{mL|{\mathbf {n}}|} \ , \end{aligned}$$
(374)

and \(K_{\nu }(z)\) are the modified Bessel functions of the second kind. Some extrapolations are performed using the form for asymptotically large \(M_{\pi } L\),

$$\begin{aligned} K_0(z) \rightarrow \frac{e^{-z}}{\sqrt{z}} \ , \end{aligned}$$
(375)

and neglecting contributions due to \(K_1\). Care must, however, be taken to establish that these corrections are negligible for all included values of \(M_{\pi } L\). The numerical coefficients, for example, 8/3 in Eq. (373), are often taken to be additional free fit parameters, due to the question of convergence of the theory discussed above.
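
A minimal numerical sketch of Eqs. (373) and (374), using only the Python standard library. The modified Bessel functions are obtained from the integral representation \(K_\nu (z)=\int _0^\infty e^{-z\cosh t}\cosh (\nu t)\,dt\) with a simple trapezoid rule, and the lattice sums are truncated to integer vectors with components up to `n_max`. The function names, the truncation and the integration settings are choices made for this illustration and are not taken from Ref. [902].

```python
import math
from collections import Counter


def bessel_k(nu, z, t_max=10.0, steps=2000):
    """K_nu(z) from the integral of exp(-z cosh t) cosh(nu t) over t >= 0,
    evaluated with the trapezoid rule (adequate for z of order 0.1 or larger)."""
    h = t_max / steps
    total = 0.5 * math.exp(-z)          # integrand at t = 0, half weight
    for k in range(1, steps + 1):
        t = k * h
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return h * total


def shell_multiplicities(n_max):
    """Count integer vectors n != 0 with components in [-n_max, n_max] by |n|^2."""
    rng = range(-n_max, n_max + 1)
    counts = Counter(x * x + y * y + z * z for x in rng for y in rng for z in rng)
    del counts[0]                       # remove the n = 0 term
    return counts


def f1_f3(m_l, n_max=4):
    """Truncated sums F_1(mL) and F_3(mL) of Eq. (374)."""
    f1 = f3 = 0.0
    for n_sq, mult in shell_multiplicities(n_max).items():
        z = m_l * math.sqrt(n_sq)
        k1_over_z = bessel_k(1, z) / z
        f1 += mult * (bessel_k(0, z) - k1_over_z)
        f3 += -1.5 * mult * k1_over_z
    return f1, f3


def delta_l(eps_pi, m_pi_l, g0):
    """Finite-volume shift of Eq. (373) for the axial charge."""
    f1, f3 = f1_f3(m_pi_l)
    return (8.0 / 3.0) * eps_pi ** 2 * (g0 ** 3 * f1 + g0 * f3)
```

Evaluating `f1_f3` at several values of \(M_\pi L\) gives a direct way to check whether the \(K_1\) contributions are in fact negligible for a given data set before they are dropped from a fit.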

Given the lack of knowledge about the convergence of the expansions and the resulting plethora of possibilities for extrapolation models at differing orders, it is important to include statistical tests of model selection for a given set of data. Bayesian model averaging [903] or use of the Akaike Information Criterion [904] are common choices which penalize over-parameterized models.
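
One common way of turning the Akaike Information Criterion into relative model weights is sketched below, assuming Gaussian errors so that \(\mathrm{AIC}=\chi ^2+2k\) for a model with k fit parameters. The weighting scheme, the function names and the three example models are illustrative assumptions; individual collaborations use different variants.

```python
import math


def akaike_weights(chi2_values, n_params):
    """Relative model weights from AIC = chi^2 + 2k (Gaussian likelihood assumed)."""
    aic = [c + 2 * k for c, k in zip(chi2_values, n_params)]
    best = min(aic)
    raw = [math.exp(-0.5 * (v - best)) for v in aic]
    norm = sum(raw)
    return [r / norm for r in raw]


def model_average(values, weights):
    """Weighted mean and the weighted spread between models."""
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(var)


# Three hypothetical extrapolation models fitted to the same data set.
chi2 = [12.0, 10.5, 10.4]
n_par = [3, 4, 5]
g_phys = [1.26, 1.25, 1.27]
weights = akaike_weights(chi2, n_par)
mean, spread = model_average(g_phys, weights)
```

In this example the five-parameter model lowers \(\chi ^2\) only marginally and therefore receives the smallest weight, which is the penalty on over-parameterization referred to above.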

10.2 Quality criteria for nucleon matrix elements and averaging procedure

There are two specific issues which call for a modification and extension of the FLAG quality criteria listed in Sect. 2. The first concerns the rating of the chiral extrapolation: The FLAG criteria reflect the ability of \(\chi \)PT to provide accurate descriptions of the pion mass dependence of observables. Clearly, this ability is linked to the convergence properties of \(\chi \)PT in a particular mass range. Quantities extracted from nucleon matrix elements are extrapolated to the physical pion mass using some variant of baryonic \(\chi \)PT, whose convergence is not as well established compared to the mesonic sector. Therefore, we have opted for stricter quality criteria concerning the chiral extrapolation of nucleon matrix elements, i.e.,

  • ★ \(M_{\pi ,\mathrm {min}}< 200\) MeV with three or more pion masses used in the extrapolation; or two values of \(M_\pi \) with one lying within 10 MeV of 135 MeV (the physical neutral pion mass) and the other one below 200 MeV

  • ○ 200 MeV \(\le M_{\pi ,\mathrm {min}} \le 300\) MeV with three or more pion masses used in the extrapolation; or two values of \(M_\pi \) with \(M_{\pi ,\mathrm {min}}< 200\) MeV; or a single value of \(M_\pi \) lying within 10 MeV of 135 MeV (the physical neutral pion mass)

  • ■ Otherwise
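
The three-level rating above can be expressed as a small decision function. This is a sketch of one reading of the criteria: the labels `"best"`, `"intermediate"` and `"fail"` are placeholders chosen here, and "within 10 MeV" is taken to include the boundary.

```python
def rate_chiral_extrapolation(pion_masses_mev):
    """Rate a set of pion masses (in MeV) according to the three criteria above."""
    if not pion_masses_mev:
        raise ValueError("at least one pion mass is required")
    masses = sorted(pion_masses_mev)
    n = len(masses)
    m_min = masses[0]
    near_physical = any(abs(m - 135.0) <= 10.0 for m in masses)

    # First criterion.
    if m_min < 200.0 and n >= 3:
        return "best"
    if n == 2 and near_physical and masses[1] < 200.0:
        return "best"
    # Second criterion.
    if 200.0 <= m_min <= 300.0 and n >= 3:
        return "intermediate"
    if n == 2 and m_min < 200.0:
        return "intermediate"
    if n == 1 and near_physical:
        return "intermediate"
    # Third criterion: everything else.
    return "fail"
```

For example, a calculation with pion masses of 140, 220 and 300 MeV satisfies the first criterion, while one with 250, 300 and 350 MeV satisfies only the second.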

In Sect. 10.1.2 we have discussed that insufficient control over excited state contributions, arising from the noise problem in baryonic correlation functions, may lead to a systematic bias in the determination of nucleon matrix elements. We therefore introduce an additional criterion that rates the efforts to suppress excited state contamination in the final result. As described in Sect. 10.1.2, the source-sink separation \(\tau \), i.e., the Euclidean distance between the initial and final nucleons, is the crucial variable. The rating scale concerning control over excited state contributions is thus

  • ★ Three or more source-sink separations \(\tau \), at least two of which must be above 1.0 fm.

  • ○ Two or more source-sink separations, \(\tau \), with at least one value above 1.0 fm.

  • ■ Otherwise
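
The rating scale for excited-state control translates directly into a short function of the list of source-sink separations. The labels are again placeholders chosen for this sketch, and "above 1.0 fm" is read as strictly greater than 1.0 fm.

```python
def rate_excited_states(separations_fm):
    """Rate control over excited states from the source-sink separations (in fm)."""
    n_total = len(separations_fm)
    n_above = sum(1 for tau in separations_fm if tau > 1.0)
    if n_total >= 3 and n_above >= 2:
        return "best"
    if n_total >= 2 and n_above >= 1:
        return "intermediate"
    return "fail"
```

A calculation using a single source-sink separation fails this criterion regardless of how large that separation is.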

Despite the enormous progress achieved in reducing excited state contamination, we emphasize that more stringent quality criteria may have to be adopted in future editions of the FLAG report to control this important systematic effect at the stated level of precision.

As explained in Sect. 2, FLAG averages are distinguished by the sea-quark content. Hence, for a given configuration of the quark sea (i.e., for \(N_f=2\), \(2\,+\,1\) or \(2\,+\,1\,+\,1\)), we first identify those calculations that pass the FLAG and the additional quality criteria defined in this section, i.e., excluding any calculation that has a red tag in one or more of the categories. We then add statistical and systematic errors in quadrature and perform a weighted average. If the fit is of bad quality (i.e., if \(\chi ^2_{\mathrm{min}}/\mathrm{dof}>1\)), the errors of the input quantities are scaled by \(\sqrt{\chi ^2/\mathrm{dof}}\). In the following step, correlations among different calculations are taken into account in the error estimate by applying Schmelling’s procedure [132].
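
The first two steps of this procedure can be sketched as follows for uncorrelated inputs. Scaling all input errors by \(\sqrt{\chi ^2/\mathrm{dof}}\) is equivalent to scaling the error of the average by the same factor, which is what the code does. The treatment of correlations via Schmelling's procedure is not included, and the input numbers in the example are hypothetical.

```python
import math


def weighted_average(values, stat_errors, syst_errors):
    """Inverse-variance average with errors added in quadrature.

    Returns (mean, error, chi2_per_dof); the error is inflated by
    sqrt(chi2/dof) when chi2/dof > 1. Correlations are ignored here.
    """
    dof = len(values) - 1
    if dof < 1:
        raise ValueError("at least two results are needed for an average")
    variances = [s ** 2 + y ** 2 for s, y in zip(stat_errors, syst_errors)]
    weights = [1.0 / v for v in variances]
    norm = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / norm
    error = math.sqrt(1.0 / norm)
    chi2_dof = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / dof
    if chi2_dof > 1.0:
        error *= math.sqrt(chi2_dof)
    return mean, error, chi2_dof


# Two hypothetical results, each with a total error of 0.05.
mean, error, chi2_dof = weighted_average(
    values=[1.20, 1.30], stat_errors=[0.03, 0.04], syst_errors=[0.04, 0.03])
```

In this example the two inputs differ by two combined standard deviations, so \(\chi ^2/\mathrm{dof}=2\) and the error of the average is inflated from about 0.035 to 0.05.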

10.3 Isovector charges

The axial, scalar and tensor isovector charges are needed to interpret the results of many experiments and phenomena mediated by weak interactions, including probes of new physics. The most natural process from which isovector charges can be measured is neutron beta decay (\(n \rightarrow p^{+} e^{-} {\overline{\nu }}_e\)). At the quark level, this process occurs when a down quark in a neutron transforms into an up quark due to weak interactions, in particular due to the axial current interaction. While scalar and tensor currents have not been observed in nature, effective scalar and tensor interactions arise in the SM due to loop effects. At the TeV and higher scales, contributions to these three currents could arise due to new interactions and/or loop effects in BSM theories. These super-weak corrections to standard weak decays can be probed through high precision measurements of the neutron decay distribution by examining deviations from SM predictions as described in Ref. [905]. The lattice-QCD methodology for the calculation of isovector charges is well-established, and the control over statistical and systematic uncertainties is becoming robust.

Table 62 Overview of results for \( g^{u-d}_A\)

The axial charge \(g_A^{u-d}\) is an important parameter that encapsulates the strength of weak interactions of nucleons. It enters in many analyses of nucleon structure and of SM and BSM physics. For example, it enters in (i) the extraction of \(V_{ud}\) and tests of the unitarity of the Cabibbo–Kobayashi–Maskawa (CKM) matrix; (ii) the analysis of neutrinoless double-beta decay; (iii) neutrino-nucleus quasi-elastic scattering cross-sections; (iv) the rate of proton–proton fusion, the first step in the thermonuclear reaction chains that power low-mass hydrogen-burning stars like the Sun; (v) solar and reactor neutrino fluxes; (vi) muon capture rates, etc. The current best determination of the ratio of the axial to the vector charge, \(g_A/g_V\), comes from measurements of neutron beta decay using polarized ultracold neutrons by the UCNA collaboration, 1.2772(20) [906, 907], and by PERKEO II, \(1.2761{}^{+14}_{-17}\) [908]. Note that, in the SM, \(g_V=1\) up to second order corrections in isospin breaking [909, 910] as a result of the conservation of the vector current. Given the accuracy with which \(g_A^{u-d}\) has been measured in experiments, the goal of lattice-QCD calculations is to calculate it directly with \(O(1\%)\) accuracy.

Isovector scalar or tensor interactions contribute to the helicity-flip parameters, called b and B, in the neutron decay distribution. By combining the calculation of the scalar and tensor charges with the measurements of b and B, one can put constraints on novel scalar and tensor interactions at the TeV scale as described in Ref. [905]. To optimally bound such scalar and tensor interactions using measurements of b and B parameters in planned experiments targeting \(10^{-3}\) precision [911,912,913], we need to determine \(g_S^{u-d}\) and \(g_T^{u-d}\) at the \(10\%\) level as explained in Refs. [834, 905]. Future higher-precision measurements of b and B would require correspondingly higher-precision calculations of the matrix elements to place even more stringent bounds on these couplings at the TeV-scale.

One can estimate \(g_S^{u-d}\) using the conserved vector current (CVC) relation, \(g_S/g_V = (M_N-M_P)^{\mathrm{QCD}}/ (m_d-m_u)^\mathrm{QCD}\), as done by Gonzalez-Alonso et al. [914]. In their analysis, they took estimates of the two mass differences on the right-hand side from the global lattice-QCD data [2] and obtained \(g_S^{u-d}=1.02(8)(7)\).

The tensor charge \(g_T^{u-d}\) can be extracted experimentally from semi-inclusive deep-inelastic scattering (SIDIS) data [915,916,917,918]. A sample of these phenomenological estimates is shown in Fig. 42, and the noteworthy feature is that the current uncertainty in these estimates is large.

10.3.1 Results for \(g_A^{u-d}\)

Calculations of the isovector axial charge have a long history, as can be seen from the compilation given in Table 62 and plotted in Fig. 40. There are results in two-flavour QCD, as well as for QCD with \(N_f=2\,+\,1\) and \(2\,+\,1\,+\,1\) dynamical flavours. All calculations discussed below use renormalization factors that were determined nonperturbatively, either via Ward identities or the Rome-Southampton method.

Fig. 40 Lattice results and FLAG averages for the isovector axial charge \(g_A^{u-d}\) for \(N_f = 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations

The issue of excited state contamination received little if any attention before 2010. As a consequence, the range of source-sink separations employed in many of the early calculations was rather limited, offering little control over this important systematic effect. This concerns the calculations by LHPC 05 [921], LHPC 10 [850], RBC 08 [922], RBC/UKQCD 08B [836], RBC/UKQCD 09B [837] and QCDSF 06 [821].

The Mainz group has performed calculations in two-flavour QCD, based on the ensembles generated by the Coordinated Lattice Simulations (CLS) effort, using nonperturbatively improved Wilson fermions and the Wilson gauge action. In their first calculation (Mainz 12 [822]) they computed three-point correlators over several source-sink separations up to \(\tau \approx 1.3\) fm. By comparing the technique of summed operator insertions (the “summation method”) to the more traditional plateau method, they found that the former gave consistently larger estimates for \(g_A^{u-d}\), which were in better agreement with the experimental value. In a follow-up paper (Mainz 17 [85]) they added more statistics, extended the range of pion masses towards lower values and used two-state fits in addition to the summation method.

Two flavours of O(a) improved Wilson quarks were also used in the calculations performed by QCDSF 06 [821], QCDSF 13 [402] and RQCD 14 [823]. QCDSF 13 [402] is an extension of the earlier study QCDSF 06 [821], including ensembles at smaller lattice spacing. Control over excited-state effects is still limited, since a range of several source-sink separations was studied only on one ensemble, and the main result was derived from the plateau method at a single source-sink separation of about 1 fm. The calculation by the Regensburg group (RQCD 14 [823]) was performed on a large part of the same ensembles used by QCDSF 13, supplemented by a larger volume at the smallest pion mass of 150 MeV and by an additional ensemble at coarser lattice spacing with \(M_\pi =290\) MeV. The strategy employed in RQCD 14 to control excited-state contamination was focused on optimizing the overlap of the nucleon interpolator onto the ground state, by choosing appropriate parameters in the smearing procedure. The efficacy of this approach was studied on a subset of ensembles for \(\tau \sim 0.5{-}1.2\) fm. In both QCDSF 13 and RQCD 14, the axial charge was determined from the ratio \(g_A/f_\pi \) in which finite-volume effects and other systematic errors are expected to cancel approximately.

The ETM collaboration has published results for the axial charge [826, 828], obtained using \(N_f=2\) flavours of twisted-mass Wilson fermions. In ETM 15D [826], three different source-sink separations were studied, and the range of pion masses was extended down to the physical values. The quoted result for \(g_A^{u-d}\) originates from a single lattice spacing and was obtained using the plateau method at the largest value of the source-sink separation \(\tau \) where agreement with the summation method was found. A further extension of the analysis (ETM 17B [828]) was performed at a single (but almost physical) pion mass value and single lattice spacing. ETMC quote the result at the smallest source-sink separation \(\tau \) for which the plateau value agrees with the two-state fit as their main estimate. Agreement with the summation method is also observed, albeit within the larger statistical errors of the latter.

Estimates for the axial charge with \(N_f=2\,+\,1\) have been published by the LHPC [850, 920, 921] and RBC/UKQCD collaborations [836, 837] and, more recently, by JLQCD 18 [843], \(\chi \)QCD 18 [6], PACS 18 [812], and Mainz 18 [919].

The calculations in LHPC 05 [921] and LHPC 10 [850] were performed employing a mixed-action setup, combining domain wall fermions in the valence sector with staggered (Asqtad) gauge ensembles generated by MILC. Although the dependence of the results on the source-sink separation was studied to some extent in LHPC 10, excited state effects are not sufficiently controlled according to our quality criteria described in Sect. 10.2. A different discretization of the quark action was used in their later study (LHPC 12A [920]), based on tree-level improved Wilson fermions with smeared gauge links, both in the sea and valence sectors. While this setup does not realize full O(a) improvement, it was found that smeared gauge links reduce the leading discretization effects of O(a) substantially. Three source-sink separations were studied in LHPC 12A on each ensemble down to nearly the physical quark mass at a single value of the lattice spacing. The quoted estimate for the axial charge is uncharacteristically low. While other quantities determined in the same study agreed well with experiment or other groups, the reasons for such a low value of \(g_A^{u-d}\) could not be established.

The RBC/UKQCD collaboration has employed \(N_f=2\,+\,1\) flavours of domain wall fermions in their calculations. The results quoted in RBC/UKQCD 08B [836] and RBC/UKQCD 09B [837] were obtained at relatively heavy pion masses at a single value of the lattice spacing, with only limited control over excited state effects. A systematic investigation of different source-sink separations has been performed only more recently [923], though without quoting an estimate for \(g_A^{u-d}\).

The JLQCD collaboration (JLQCD 18 [843]) have performed a calculation using \(N_f=2\,+\,1\) flavours of overlap fermions and the Iwasaki gauge action. Owing to the large numerical cost of overlap fermions, which preserve exact chiral symmetry at nonzero lattice spacing, they have only simulated four light quark masses with \(290< M_\pi < 540\) MeV and at a single lattice spacing so far. Their simultaneous fit of the data for the correlator ratio \(R_A(t,\tau )\), computed at six values of \(\tau \), to a constant gives a low value for \(g_A^{u-d}\) at the physical point. Overlap valence quarks were also used by the \(\chi \)QCD collaboration in their study of various nucleon matrix elements (\(\chi \)QCD 18 [6]), utilizing the gauge ensembles generated by RBC/UKQCD with domain wall fermions. The quoted estimate for the axial charge was obtained from a combination of two-state fits and the summation method, applied over a range of source-sink separations.

Two recent calculations with \(N_f=2\,+\,1\) have used O(a) improved Wilson fermions. The focus of the study by the PACS collaboration (PACS 18 [812]) was on the use of very large volumes at the physical pion mass. The calculation comprises only one lattice spacing and a single source-sink separation. Therefore, at the current stage, the study does not offer sufficient control over several systematic effects. The Mainz group (Mainz 18 [919]) has presented preliminary results for the axial charge, obtained by performing two-state fits to six different nucleon matrix elements (including the scalar and tensor charges), assuming that the mass gap to the excited state can be more reliably constrained in this way. Up to six source-sink separations per ensemble have been studied.

Two groups, PNDME and CalLat, have published results for \(N_f=2\,+\,1\,+\,1\), i.e., PNDME 16 [834], PNDME 18 [83], CalLat 17 [835], and CalLat 18 [84]. While both groups share the staggered (HISQ) gauge ensembles generated by the MILC collaboration, they employ different discretizations in the valence quark sector: PNDME use O(a) improved Wilson fermions with the improvement coefficient \(c_{\mathrm{sw}}\) set to its tadpole-improved tree-level value. By contrast, CalLat use the Möbius variant of domain wall fermions, which are fully O(a) improved. The CalLat set of ensembles includes three values of the lattice spacing, i.e., \(a=0.09\), 0.12, and 0.15 fm, while PNDME added another set of ensembles at the finer lattice spacing of 0.06 fm to this collection. Both groups have included physical pion mass ensembles in their calculations. The operator matrix elements are renormalized nonperturbatively, using the Rome–Southampton method.

Table 63 Overview of results for \( g^{u-d}_S\)

In order to control excited state contamination, PNDME perform multi-state fits, including up to four (three) energy levels in the two-point (three-point) correlation functions. By contrast, CalLat have employed the Feynman-Hellmann-inspired implementation of summed operator insertions described in Sect. 10.1.2. Plotting the summed correlator \(S_A(\tau )\) as a function of the source-sink separation, they find that excited-state effects cannot be detected for \(\tau \gtrsim 1.0\) fm at their level of statistics. After subtracting the leading contributions from excited states determined from two-state fits, they argue that the data for \(S_A(\tau )\) can be described consistently down to \(\tau \simeq 0.3\) fm.

We now proceed to discuss global averages for the axial charge, in accordance with the procedures in Sect. 10.2. For QCD with \(N_f=2\,+\,1\,+\,1\), the calculations of PNDME and CalLat pass all our quality criteria, and hence the latest results, i.e., PNDME 18 [83] and CalLat 18 [84] qualify for being included in a global average. Since both PNDME and CalLat use the gauge ensembles produced by MILC, we assume that the quoted statistical errors are 100% correlated, even though the range of pion masses and lattice spacings explored in Refs. [83] and [84] is not exactly identical. Since the two calculations differ by the valence quark action, and since systematic errors have been estimated independently, we restrict the correlations between PNDME 18 and CalLat 18 to the statistical error only. Performing a weighted average yields \(g_A^{u-d} = 1.266(18)\) with \(\chi ^2/\mathrm{dof}=1.68\), where the error has been scaled by about 30% because of the large \(\chi ^2/\mathrm{dof}\). Given that the calculations of PNDME 18 and CalLat 18 are correlated, the large value of \(\chi ^2/\mathrm{dof}\) indicates a tension between the two results. In this situation it is appropriate to adopt a more conservative approach: We estimate the axial charge to be represented by the interval \(1.218\le g_A^{u-d}\le 1.284\), where the lower bound is identified with the result of PNDME 18, while the upper bound is the weighted average plus the scaled 1\(\sigma \) uncertainty. Hence, for \(N_f=2\,+\,1\,+\,1\) we quote \(g_A^{u-d}=1.251(33)\) as the FLAG estimate, where the central value marks the mid-point of the interval, and half the width is taken to be the error.
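The arithmetic behind the interval-based estimate can be checked with a few lines of Python. This is only an illustrative sketch: the function name `flag_interval_estimate` is introduced here and is not part of any FLAG software, and the three input numbers are the ones quoted in the paragraph above.

```python
def flag_interval_estimate(lower_bound, average, scaled_error):
    """Turn the interval [lower_bound, average + scaled_error] into
    a central value (mid-point) and a symmetric error (half-width)."""
    upper_bound = average + scaled_error
    central = 0.5 * (lower_bound + upper_bound)
    half_width = 0.5 * (upper_bound - lower_bound)
    return central, half_width


# Inputs quoted in the text: the PNDME 18 central value as the lower bound,
# and the weighted average with its scaled 1-sigma uncertainty.
central, error = flag_interval_estimate(1.218, 1.266, 0.018)
print(f"g_A^(u-d) = {central:.3f} +/- {error:.3f}")
```

With these inputs the mid-point and half-width reproduce the quoted \(g_A^{u-d}=1.251(33)\).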

For QCD with \(N_f=2\,+\,1\) dynamical quarks, the calculations of \(\chi \)QCD 18 [6] and Mainz 18 [919] are free of red tags. However, since the result from the latter is preliminary and published only as a proceedings article, it does not qualify for being included in a global average. Hence, for \(N_f=2\,+\,1\) we identify the FLAG average with the result quoted in \(\chi \)QCD 18 [6], i.e., \(g_A^{u-d}=1.254(16)(30)\).

In the two-flavour case, the results by the Mainz group [85, 822] qualify for an average, since other recent calculations employed only a single source-sink separation on most ensembles (RQCD 14 [823], QCDSF 13 [402]) or because only a single lattice spacing was used (ETM 15D [826], ETM 17B [828]). For \(N_f=2\) we quote the latest estimate of \(g_A^{u-d}\) from Mainz 17 [85], adding statistical and systematic errors in quadrature and symmetrizing the error. To summarize, the FLAG averages for the axial charge read

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:\;\;g_A^{u-d} = 1.251(33) \quad \,\mathrm {Refs.}~ \text{[83,84] }, \end{aligned}$$
(376)
$$\begin{aligned}&N_{ f}=2\,+\,1:\;\;g_A^{u-d} = 1.254(16)(30) \quad \,\mathrm {Ref.}~ \text{[6] }, \end{aligned}$$
(377)
$$\begin{aligned}&N_{ f}=2:\;\;g_A^{u-d} = 1.278(86) \quad \,\mathrm {Ref.}~ \text{[85] }. \end{aligned}$$
(378)

Within errors, these averages are all compatible with the result of \(g_A^{u-d}=1.2724(23)\) quoted by the PDG. While the most recent lattice calculations reproduce the axial charge at the level of a few percent or even better, the experimental result is more precise by an order of magnitude.
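The compatibility statement can be made quantitative with a simple pull calculation. The sketch below is not part of the FLAG procedure: it assumes that the two errors of the \(N_f=2\,+\,1\) result may be combined in quadrature and that the lattice and PDG uncertainties are uncorrelated, and the helper names `total_error` and `pull` are introduced here for illustration only.

```python
import math

# Experimental value quoted in the text (PDG).
PDG_VALUE, PDG_ERROR = 1.2724, 0.0023


def total_error(*errors):
    """Combine independent error components in quadrature (an assumption)."""
    return math.sqrt(sum(e * e for e in errors))


def pull(value, error, ref_value=PDG_VALUE, ref_error=PDG_ERROR):
    """Difference to the reference in units of the combined uncertainty."""
    return (value - ref_value) / math.hypot(error, ref_error)


# FLAG averages from Eqs. (376)-(378): (central value, total error).
flag_averages = {
    "Nf=2+1+1": (1.251, total_error(0.033)),
    "Nf=2+1": (1.254, total_error(0.016, 0.030)),
    "Nf=2": (1.278, total_error(0.086)),
}

pulls = {label: pull(v, e) for label, (v, e) in flag_averages.items()}
for label, p in pulls.items():
    print(f"{label}: pull = {p:+.2f} sigma")

# Relative precision of the Nf=2+1+1 lattice estimate versus experiment.
lattice_rel = flag_averages["Nf=2+1+1"][1] / flag_averages["Nf=2+1+1"][0]
pdg_rel = PDG_ERROR / PDG_VALUE
precision_ratio = lattice_rel / pdg_rel
print(f"lattice: {100 * lattice_rel:.1f}%, experiment: {100 * pdg_rel:.2f}%")
```

Under these assumptions all three pulls are below one standard deviation in magnitude, and the relative uncertainty of the \(N_f=2\,+\,1\,+\,1\) estimate exceeds the experimental one by more than a factor of ten.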

10.3.2 Results for \(g_S^{u-d}\)

Calculations of the isovector scalar charge have, in general, larger errors than the isovector axial charge as can be seen from the compilation given in Table 63 and plotted in Fig. 41. For comparison, Fig. 41 also shows a phenomenological result produced using the conserved vector current (CVC) relation [914].

Fig. 41 Lattice results and FLAG averages for the isovector scalar charge \(g^{u-d}_S\) for \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations. Also shown is a phenomenological result obtained using the conserved vector current (CVC) relation [914] (circle)

Only a single calculation, PNDME 18 [83], which supersedes PNDME 16 [834] and PNDME 13 [831], meets all the criteria for inclusion in the average.

This \(2\,+\,1\,+\,1\) flavour mixed-action calculation was performed using the MILC HISQ ensembles, with a clover valence action. The 11 ensembles used include three pion mass values, \(M_{\pi } \sim \) 135, 225, 320 MeV, and four lattice spacings, \(a \sim \) 0.06, 0.09, 0.12, 0.15 fm. Note that four lattice spacings are required to meet the green star criteria, as this calculation is not fully O(a) improved. The lattice size covers the range \(3.3 \lesssim M_{\pi } L \lesssim 5.5\), and the set of ensembles includes three different volumes at a fixed pion mass \(M_{\pi } \sim 225\) MeV and lattice spacing \(a\sim 0.12\) fm. Physical point extrapolations were performed simultaneously, keeping only the leading order terms in the various expansion parameters. For the chiral extrapolation, these are the terms proportional to \(M_{\pi }^2\), while the continuum extrapolation is performed using the term proportional to \(a\), because the action and operators are not fully O(a) improved. For the finite volume extrapolation, the asymptotic limit of the \(\chi \)PT prediction, Eq. (375), is used. The Akaike Information Criterion is used to conclude that including more fit parameters is not justified based on the data.
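As a schematic summary of the fit just described, the simultaneous leading-order ansatz can be written as follows. The coefficient names \(c_0\), \(c_a\), \(c_\chi \), \(c_L\) and the symbol \(f_{\mathrm{FV}}\) are introduced here purely for illustration and are not taken from Ref. [83]:

$$\begin{aligned} g_S^{u-d}(a, M_\pi , L) = c_0 + c_a\, a + c_\chi \, M_\pi ^2 + c_L\, f_{\mathrm{FV}}(M_\pi L) \,, \end{aligned}$$

where \(f_{\mathrm{FV}}\) stands for the asymptotic \(\chi \)PT finite-volume form of Eq. (375), which is not reproduced here, and \(c_0\) is the value at the physical point only after the physical \(M_\pi ^2\) term has been added back. For Gaussian errors the Akaike Information Criterion reduces, up to a constant, to \(\mathrm{AIC}=\chi ^2+2k\) for a fit with \(k\) free parameters, so an additional parameter is favoured only if it lowers \(\chi ^2\) by more than two units.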

Excited state contamination is controlled using two-state fits to between three and five source-sink time separations. Time separations range between \(0.72 \lesssim \tau \lesssim 1.68\) fm, with all ensembles having at least two time separations greater than 1 fm. Renormalization was performed nonperturbatively using the RI-SMOM scheme and converted to \({\overline{\mathrm {MS}}}\) at 2 GeV using 2-loop perturbation theory.

Table 64 Overview of results for \( g^{u-d}_T\)

Regarding \(2\,+\,1\)-flavour calculations, the Mainz 18 calculation meets all criteria for averaging; however, as it is only a preliminary result published in proceedings, it is not considered. The calculation was performed on the Wilson CLS ensembles, using four lattice spacings down to 0.05 fm and several pion masses down to \(\sim 200\) MeV. Excited states were controlled using multi-state fits to several source-sink separations. The JLQCD 18 calculation, performed using overlap fermions on the Iwasaki gauge action, covered four pion masses down to 290 MeV. The lattice size was adjusted to keep \(M_\pi L \ge 4\) in all four cases. However, the single lattice spacing of \(a=0.11\) fm does not meet the criteria for continuum extrapolation. The calculations presented in LHPC 12A used three different lattice actions: Wilson-clover, domain wall, and mixed action. Pion masses ranged down to near the physical pion mass. Data at two lattice spacings were produced with the domain wall and Wilson actions; however, the final result utilized only the single lattice spacing of \(a=0.116\) fm from the Wilson action. Because the action is not fully O(a) improved, two lattice spacings would in any case not be sufficient for meeting the quality criteria for the continuum extrapolation.

The two-flavour calculations in Table 63 include ETM 17, which employed twisted mass fermions on the Iwasaki gauge action (Footnote 82). This work utilized a single physical pion mass ensemble with lattice spacing \(a\sim 0.09\) fm, and therefore does not meet the criteria for continuum extrapolation. The RQCD 14 calculation included three lattice spacings down to 0.06 fm and several pion masses down to near the physical point. While a study of excited state contamination was performed on some ensembles using multiple source-sink separations, many ensembles included only a single time separation, so it does not meet the criteria for excited states.

The final FLAG average for \(g_S^{u-d}\) is

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_S^{u-d}&= 1.022(80)(60)&\,\mathrm {Ref.}~ \text{[83] }. \end{aligned}$$
(379)

10.3.3 Results for \(g_T^{u-d}\)

Estimates of the isovector tensor charge are currently the most precise of the isovector charges with values that are stable over time, as can be seen from the compilation given in Table 64 and plotted in Fig. 42. This is a consequence of the smaller statistical fluctuations in the raw data and the very mild dependence on a, \(M_\pi \), and the lattice size \(M_\pi L\). As a result, the uncertainty due to the various extrapolations is small. Also shown for comparison in Fig. 42 are phenomenological results using measures of transversity [925,926,927,928,929].

Only the PNDME 18 [83] calculation, which supersedes PNDME 16 [834], PNDME 15 [832, 833] and PNDME 13 [831], meets all the criteria for inclusion in the average. The details for this calculation are the same as those for \(g_S^{u-d}\) described in the previous section (Sect. 10.3.2), except that three-state fits were used to remove excited-state effects.

Fig. 42 Lattice results and FLAG averages for the isovector tensor charge \(g^{u-d}_T\) for \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations. Also shown are phenomenological results using measures of transversity [925,926,927,928,929] (circles)

For \(2\,+\,1\)-flavour calculations, details for the Mainz 18, JLQCD 18, and LHPC 12A calculations are identical to those presented previously in Sect. 10.3.2. The earlier RBC/UKQCD 10 calculation was performed using domain wall fermions on the Iwasaki gauge action, with two volumes and several pion masses. The lowest pion mass used was \(M_{\pi }\sim 330\) MeV, which does not meet the criteria for chiral extrapolation. In addition, the single lattice spacing and single source-sink separation do not meet the criteria for continuum extrapolation and excited states.

Two-flavour calculations include RQCD 14, with details identical to those described in Sect. 10.3.2. There are two calculations, ETM 15D and ETM 17, which employed twisted mass fermions on the Iwasaki gauge action. The earlier work utilized three ensembles, with three volumes and two pion masses down to the physical point. The more recent work used only the physical pion mass ensemble. Both works used only a single lattice spacing \(a\sim 0.09\) fm, and therefore do not meet the criteria for continuum extrapolation. The early work by RBC 08 with domain wall fermions used three heavy values for the pion mass, and a single value for the lattice spacing, volume, and source-sink separation, and therefore does not meet many of the criteria.

The final FLAG average for \(g_T^{u-d}\) is

$$\begin{aligned} N_{ f}=2\,+\,1\,+\,1: \quad g_T^{u-d}&= 0.989(32)(10)&\,\mathrm {Ref.}~ \text{[83] }. \end{aligned}$$
(380)

10.4 Flavour diagonal charges

Three examples of interactions for which matrix elements of flavour-diagonal operators (\({\overline{q}} \Gamma q\), where \(\Gamma \) defines the Lorentz structure of the bilinear quark operator) are needed are the neutral current interactions of neutrinos, elastic scattering of electrons off nuclei, and the scattering of dark matter off nuclei. In addition, these matrix elements also probe intrinsic properties of nucleons (the spin, the strangeness contribution and the electric dipole moment of the quarks) as explained below. For brevity, all operators are assumed to be appropriately renormalized as discussed in Sect. 10.1.3.

The matrix elements of the scalar operator, \({\overline{q}} q\) with flavour q, give the rate of change in the nucleon mass due to nonzero values of the corresponding quark mass. This relationship is given by the Feynman–Hellmann theorem. The quantities of interest are the nucleon \(\sigma \)-term, \(\sigma _{\pi N}\), and the strange and charm content of the nucleon, \(\sigma _{s}\) and \(\sigma _{c}\),

$$\begin{aligned} \sigma _{\pi N}&= m_{ud} \langle N| {\overline{u}} u + {\overline{d}} d | N \rangle \,, \end{aligned}$$
(381)
$$\begin{aligned} \sigma _{s}&= m_s \langle N| {\overline{s}} s | N \rangle \,, \end{aligned}$$
(382)
$$\begin{aligned} \sigma _{c}&= m_c \langle N| {\overline{c}} c | N \rangle \,. \end{aligned}$$
(383)

Here \(m_{ud}\) is the average of the up and down quark masses and \(m_s\) (\(m_c\)) is the strange (charm) quark mass. The \(\sigma _{\pi N, s, c}\) give the shift in \(M_N\) due to nonzero light-, strange- and charm-quark masses. The same matrix elements are also needed to quantify the spin independent interaction of dark matter with nucleons. Note that, while \(\sigma _b\) and \(\sigma _t\) are also phenomenologically interesting, they are unlikely to be calculated on the lattice. In principle, the heavy sigma terms can be estimated using \(\sigma _{u,d,s}\) by exploiting the heavy-quark limit [930,931,932].

The matrix elements of the axial operator, \({\overline{q}} \gamma _\mu \gamma _5 q\), give the contribution, \(\Delta q\), of quarks of flavour q to the spin of the nucleon:

$$\begin{aligned} \begin{aligned}&\langle N| {\overline{q}} \gamma _\mu \gamma _5 q | N \rangle = g_A^q {\overline{u}}_N \gamma _\mu \gamma _5 u_N, \\&g_A^q \equiv \Delta q = \int _0^1 dx (\Delta q(x) + \Delta {\overline{q}} (x) ) \,. \end{aligned} \end{aligned}$$
(384)

The charge \(g_A^q\) is thus the contribution of the spin of a quark of flavour q to the spin of the nucleon. It is also related to the first Mellin moment of the polarized parton distribution function (PDF), \(\Delta q\), as shown in the second line in Eq. (384). Measurements by the European Muon collaboration in 1987 of the spin asymmetry in polarized deep inelastic scattering showed that the sum of the spins of the quarks contributes less than half of the total spin of the proton [933]. To understand this unexpected result, called the “proton spin crisis”, it is common to start with Ji’s sum rule [934] that provides a gauge invariant decomposition of the nucleon’s total spin as

$$\begin{aligned} \frac{1}{2} = \sum _{q=u,d,s,c,\ldots } \left( \frac{1}{2} \Delta q + L_q\right) + J_g \,, \end{aligned}$$
(385)

where \(\Delta q /2 \equiv g_A^q /2 \) is the contribution of the intrinsic spin of a quark with flavour q; \(L_q\) is the orbital angular momentum of that quark; and \(J_g\) is the total angular momentum of the gluons. Thus, obtaining the spin of the proton starting from QCD requires calculating the contributions of the three terms: the spin and orbital angular momentum of the quarks, and the angular momentum of the gluons. Lattice-QCD calculations of the various matrix elements needed to extract the three contributions are underway. An alternate decomposition of the spin of the proton has been provided by Jaffe and Manohar [935]. The two formulations differ in the decomposition of the contributions of the quark orbital angular momentum and of the gluons. The contribution of the quark spin, which is the subject of this review and given in Eq. (384), is the same in both formulations.

The tensor charges are defined as the matrix elements of the tensor operator, \({\overline{q}} \sigma _{\mu \nu } q\) with \(\sigma _{\mu \nu } = [\gamma _\mu ,\gamma _\nu ]/2\):

$$\begin{aligned} g_T^q {\overline{u}}_N \sigma _{\mu \nu } u_N&= \langle N| {\overline{q}} \sigma _{\mu \nu } q | N \rangle \,. \end{aligned}$$
(386)

These flavour-diagonal tensor charges \(g_T^{u,d,s,c}\) quantify the contributions of the u, d, s, c quark electric dipole moments (EDM) to the neutron electric dipole moment (nEDM) [832, 936]. Since particles can have an EDM only due to P and T (or CP assuming CPT is a good symmetry) violating interactions, the nEDM is a very sensitive probe of new sources of CP violation that arise in most extensions of the SM designed to explain nature at the TeV scale. The current experimental bound on the nEDM is \(d_n < 2.9 \times 10^{-26}\ e\) cm [937], while the known CP violation in the SM implies \(d_n < 10^{-31}\ e\) cm [938]. A nonzero result over the intervening five orders of magnitude would signal new physics. Planned experiments aim to reduce the bound to around \( 10^{-28}\ e\) cm. A discovery or reduction in the bound from these experiments will put stringent constraints on many BSM theories, provided the matrix elements of novel CP-violating interactions, of which the quark EDM is one, are calculated with the required precision.

One can also extract these tensor charges from the zeroth moment of the transversity distributions that are measured in many experiments including Drell–Yan and semi-inclusive deep inelastic scattering (SIDIS). Of particular importance is the active program at Jefferson Lab (JLab) to measure them [915, 916]. Transversity distributions describe the net transverse polarization of quarks in a transversely polarized nucleon. Their extraction from the data taken over a limited range of \(Q^2\) and Bjorken x is, however, not straightforward and requires additional phenomenological modeling. At present, lattice-QCD estimates of \(g_T^{u,d,s}\) are the most accurate [832, 917, 918] as can be deduced from Fig. 42. Future experiments will significantly improve the extraction of the transversity distributions. Thus, accurate calculations of the tensor charges using lattice QCD will continue to help elucidate the structure of the nucleon in terms of quarks and gluons and provide a benchmark against which phenomenological estimates utilizing measurements at JLab and other experimental facilities worldwide can be compared.

Table 65 Overview of results for \(g^q_A\)

Fig. 43 Lattice results and FLAG averages for \(g_A^{u,d,s}\) for the \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations

The methodology for the calculation of flavour-diagonal charges is also well-established. The major challenges are the much larger statistical errors in the disconnected contributions for the same computational cost and the need for the additional calculations of the isosinglet renormalization factors.

10.4.1 Results for \(g_A^{u,d,s}\)

A compilation of recent results for the flavour-diagonal axial charges for the proton is given in Table 65 and plotted in Fig. 43. Results for the neutron can be obtained by interchanging the u and d flavour indices. Only two calculations qualify for global averages: PNDME 18A [86] for 2 + 1 + 1 flavours and \(\chi \)QCD 18 [6] for 2 + 1 flavours. The global averages given below are, therefore, the same as the corresponding results given in Table 65.

The 2 + 1 + 1 flavour FLAG results for the axial charges \(g_A^{u,d,s}\) of the proton are

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_A^u&= 0.777(25)(30)&\,\mathrm {Ref.}~[86], \end{aligned}$$
(387)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_A^d&= -0.438(18)(30)&\,\mathrm {Ref.}~[86], \end{aligned}$$
(388)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_A^s&= -0.053(8)&\,\mathrm {Ref.}~[86]. \end{aligned}$$
(389)

These PNDME 18A [86] results were obtained using the \(2\,+\,1\,+\,1\) flavour clover-on-HISQ formulation. The connected contributions were obtained on 11 HISQ ensembles generated by the MILC collaboration with \(a \approx 0.057\), 0.087, 0.12 and 0.15 fm, \( M_\pi \approx 135\), 220 and 320 MeV, and \(3.3< M_\pi L < 5.5\). The light disconnected contributions were obtained on six of these ensembles with the lowest pion mass \(M_\pi \approx 220\) MeV, while the strange disconnected contributions were obtained on seven ensembles, i.e., including an additional one at \(a \approx 0.087\) fm and \(M_\pi \approx 135\) MeV. The excited state and the chiral-continuum fits were done separately for the connected and disconnected contributions, which introduces a systematic that is hypothesized to be small as explained in Ref. [86]. The analysis of the excited-state contamination, discussed in Sect. 10.1.2, was done using three-state fits for the connected contribution and two-state fits for the disconnected contributions. The chiral-continuum extrapolation was done keeping the leading correction terms proportional to \(M_\pi ^2\) and \(a\) in both cases, and the leading finite volume correction in \(M_\pi L\) was included in the analysis of the connected contributions. The isovector renormalization factor, used for all three flavour diagonal operators, was calculated on the lattice in the RI-SMOM scheme and converted to \({\overline{\mathrm {MS}}}\). The difference due to flavour mixing for the singlet case is small as discussed in Sect. 10.1.3.

The \(2\,+\,1\) flavour FLAG results from \(\chi \)QCD 18 were obtained using the overlap-on-domain-wall formalism [6]. Three domain-wall ensembles with lattice spacings 0.143, 0.11 and 0.083 fm and sea-quark pion masses \(M_\pi = 171\), 337 and 302 MeV, respectively, were analyzed. In addition to the three approximately unitary points, the paper presents data for an additional 4–5 valence quark masses on each ensemble, i.e., partially quenched data. Separate excited-state fits were done for the connected and disconnected contributions. The continuum, chiral and volume extrapolation to the combined unitary and nonunitary data is made including terms proportional to both \(M_{\pi ,\mathrm{valence}}^2\) and \(M_{\pi ,\mathrm{sea}}^2\), and two \(O(a^2)\) discretization terms for the two different domain wall actions. With just three unitary points, not all the coefficients are well constrained. The \(M_{\pi ,\mathrm{sea}}\) dependence is omitted and considered as a systematic, and a prior is used for the coefficients of the \(a^2\) terms to stabilize the fit. These \(\chi \)QCD 18 2 + 1 flavour results for the proton, which supersede the \(\chi \)QCD 15 [840] analysis, are

$$\begin{aligned}&N_{ f}=2\,+\,1:&g_A^u&= 0.847(18)(32)&\,\mathrm {Ref.}~ \text{[6] }, \end{aligned}$$
(390)
$$\begin{aligned}&N_{ f}=2\,+\,1:&g_A^d&= -0.407(16)(18)&\,\mathrm {Ref.}~ \text{[6] }, \end{aligned}$$
(391)
$$\begin{aligned}&N_{ f}=2\,+\,1:&g_A^s&= -0.035(6)(7)&\,\mathrm {Ref.}~ \text{[6] }. \end{aligned}$$
(392)
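Two simple consistency checks follow directly from the central values in Eqs. (387)–(392): the quark-spin term \(\sum _q \Delta q/2\) of Eq. (385), restricted here to \(q=u,d,s\), and the isovector combination \(g_A^u-g_A^d\). The Python sketch below uses central values only; uncertainties and their correlations are deliberately ignored, and the function names are introduced here for illustration.

```python
# Central values quoted in Eqs. (387)-(392); errors are intentionally omitted.
charges = {
    "PNDME 18A (Nf=2+1+1)": {"u": 0.777, "d": -0.438, "s": -0.053},
    "chiQCD 18 (Nf=2+1)": {"u": 0.847, "d": -0.407, "s": -0.035},
}


def quark_spin_contribution(g):
    """Sum over q = u, d, s of Delta q / 2, cf. Eq. (385)."""
    return 0.5 * (g["u"] + g["d"] + g["s"])


def isovector_charge(g):
    """g_A^u - g_A^d from the flavour-diagonal charges."""
    return g["u"] - g["d"]


results = {
    label: (quark_spin_contribution(g), isovector_charge(g))
    for label, g in charges.items()
}
for label, (spin, iso) in results.items():
    # The proton spin is 1/2, so spin / 0.5 is the fraction carried by quark spin.
    print(f"{label}: sum_q Delta q / 2 = {spin:.4f} "
          f"({100 * spin / 0.5:.1f}% of 1/2), g_A^u - g_A^d = {iso:.3f}")
```

In both cases the quark spins account for less than half of the proton spin, in line with the experimental observation recalled in Sect. 10.4. The isovector combination from \(\chi \)QCD 18 coincides with the value in Eq. (377), while the PNDME 18A combination lies close to, but is not identical with, the PNDME 18 isovector result, which comes from a separate analysis.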

The JLQCD 18 [843], ETM 17C [829] and Engelhardt 12 [939] calculations were not considered for the averages as they did not satisfy the criteria for the continuum extrapolation. All three calculations were done at a single lattice spacing. The JLQCD 18 calculation used overlap fermions and the Iwasaki gauge action. They perform a chiral fit using data at four pion masses in the range 290–540 MeV. Finite volume corrections are assumed to be negligible since each of the two pairs of points on different lattice volumes satisfy \(M_\pi L \ge 4\). The ETM 17C calculation is based on a single twisted mass ensemble with \(M_\pi =130\) MeV, \(a=0.094\) fm and a relatively small \(M_\pi L = 2.98\). The Engelhardt 12 calculation was done on three asqtad ensembles with \(M_\pi = 293\), 356 and 495 MeV, but all at a single lattice spacing \(a=0.124\) fm.

Results for \(g_A^s\) were also presented recently by LHPC in Ref. [808]. However, this calculation is not reviewed as it has been performed on a single ensemble with \(a=0.114\) fm and a heavy pion mass value of \(M_\pi \approx 317\) MeV.

Table 66 Overview of results for \(\sigma _{\pi N}\) and \(\sigma _s\) from the direct approach (above) and \(\sigma _s\) from the hybrid approach (below)

10.4.2 Results for \(g_S^{u,d,s}\) from direct and hybrid calculations of the matrix elements

The sigma terms \(\sigma _q=m_q\langle N|{\bar{q}}q|N\rangle =m_q g_S^q\) or the quark mass fractions \(f_{T_q}=\sigma _q/M_N\) are normally computed rather than \(g_S^q\). These combinations have the advantage of being renormalization group invariant in the continuum, and this holds on the lattice for actions with good chiral properties; see Sect. 10.1.3 for a discussion. In order to aid comparison with phenomenological estimates, e.g., from \(\pi N\) scattering [940,941,942], the light quark sigma terms are usually added to give the \(\pi N\) sigma term, \(\sigma _{\pi N}=\sigma _u+\sigma _d\). The direct evaluation of the sigma terms involves the calculation of the corresponding three-point correlation functions for different source-sink separations \(\tau \). For \(\sigma _{\pi N}\) there are both connected and disconnected contributions, while for most lattice fermion formulations only disconnected contributions are needed for \(\sigma _s\). The techniques typically employed make a wider range of \(\tau \) available for the disconnected contributions than for the connected ones (both, however, suffer from signal-to-noise problems at large \(\tau \), as discussed in Sect. 10.1). In the following, we comment only on the range of \(\tau \) computed for the connected contributions.

Recent results for \(\sigma _{\pi N}\) and for \(\sigma _s\) from the direct approach are compiled in Table 66. For both quantities, only the results from \(\chi \)QCD 15A [89] qualify for global averaging. In this mixed action study, three RBC/UKQCD \(N_{ f}=2\,+\,1\) domain wall ensembles are analysed comprising two lattice spacings, \(a=0.08\) fm with \(M_{\pi ,\mathrm{sea}}=300\) MeV and \(a=0.11\) fm with \(M_{\pi ,\mathrm{sea}}=330\) MeV and 139 MeV. Overlap fermions are employed with a number of nonunitary valence quark masses. The connected three-point functions are measured with three values of \(\tau \) in the range 0.9–1.4 fm. A combined chiral, continuum and volume extrapolation is performed for all data with \(M_\pi <350\) MeV. The leading order expressions are taken for the lattice-spacing and volume dependence while partially quenched SU(2) HB\(\chi \)PT up to \(M_\pi ^3\) terms models the chiral behaviour for \(\sigma _{\pi N}\). The strange quark sigma term has a milder dependence on the pion mass and only the leading order quadratic terms are included in this case.

The lack of other qualifying studies is an indication of the difficulty and computational expense of performing these calculations. Nonetheless, this situation is likely to improve in the future. We note that although the recent analyses, ETM 16A [827] and JLQCD 18 [843], are both performed at a single lattice spacing (\(a=0.09\) fm and 0.11 fm, respectively), they satisfy the criteria for chiral extrapolation, finite volume and excited states. ETM 16A is a single ensemble study with \(N_{ f}=2\) twisted mass fermions with a pion mass close to the physical point and \(M_\pi L=3.0\). Excited states are investigated utilizing \(\tau =0.9\) fm up to \(\tau =1.7\) fm for the connected three-point functions. JLQCD utilize \(N_{ f}=2\,+\,1\) overlap fermion ensembles with pion masses reaching down to 293 MeV (\(M_\pi L=4.0\)) and apply techniques which give a wide range of \(\tau \) for the connected contribution, with the final results extracted from \(\tau \ge 1.2\) fm.

RQCD 16 [824] investigates the continuum, physical quark mass and infinite volume limits: the lattice spacing spans the range 0.06–0.08 fm, the minimum \(M_\pi \) is 150 MeV and \(M_\pi L\) is varied between 3.4 and 6.7 at \(M_\pi =290\) MeV. This \(N_{ f}=2\) study has a red tag for the excited state criterion as multiple source-sink separations for the connected three-point functions are only computed on a subset of the ensembles. Clover fermions are employed and the lack of good chiral properties for this action means that there is mixing between quark flavours under renormalization when determining \(\sigma _s\), and a gluonic term needs to be considered for full O(a) improvement (this term has not been included; see Sect. 10.1.3 for a discussion).

Earlier work focuses only on \(\sigma _s\). The analysis of JLQCD 12A [842] is performed on the same set of ensembles as the JLQCD 18 study discussed above and in addition includes smaller volumes for the lightest two pion masses (Footnote 83). No significant finite volume effects are observed. Engelhardt 12 [939] and \(\chi \)QCD 13A [839] have less control over the systematics. The former is a single lattice spacing analysis restricted to small spatial volumes while the latter is a partially quenched study on a single ensemble with unitary \(M_\pi >300\) MeV.

Table 67 Overview of results for \(\sigma _{\pi N}\) and \(\sigma _s\) from the Feynman-Hellmann approach

MILC have also computed \(\sigma _s\) using a hybrid method [943] which makes use of the Feynman–Hellmann (FH) theorem and involves evaluating the matrix element \(\langle N|\int \! d^4\!x\, {\bar{s}}s|N\rangle \) (Footnote 84). This method is applied in MILC 09D [943] to the \(N_{ f}=2\,+\,1\) Asqtad ensembles with lattice spacings \(a=0.06\), 0.09, 0.12 fm and values of \(M_\pi \) ranging down to 224 MeV. A continuum and chiral extrapolation is performed including terms linear in the light-quark mass and quadratic in \(a\). As the coefficient of the discretisation term is poorly determined, a Bayesian prior is used, with a width corresponding to a 10% discretisation effect between the continuum limit and the coarsest lattice spacing (Footnote 85). A similar updated analysis is presented in MILC 12C [91], with an improved evaluation of \(\langle N|\int \! d^4\!x\, {\bar{s}}s|N\rangle \) on a subset of the \(N_{ f}=2\,+\,1\) Asqtad ensembles. The study is also extended to HISQ \(N_{ f}=2\,+\,1\,+\,1\) ensembles comprising four lattice spacings with \(a=0.06-0.15\) fm and a minimum pion mass of 131 MeV. Results are presented for \(g_S^s=\langle N|{\bar{s}}s|N\rangle \) (in the \({\overline{\mathrm {MS}}}\) scheme at 2 GeV) rather than for \(\sigma _s\). The scalar matrix element is renormalized for both three and four flavours using the 2-loop factor for the Asqtad action [165]. The error incurred by applying the same factor to the HISQ results is expected to be small (Footnote 86).

Both MILC 09D and MILC 12C achieve green tags for all the criteria, see Table 66. As the same set of Asqtad ensembles is utilized in both studies we take MILC 12C as superseding MILC 09D for the three flavour case. The global averaging is discussed in Sect. 10.4.4.

10.4.3 Results for \(g_S^{u,d,s}\) using the Feynman–Hellmann theorem

An alternative approach for accessing the sigma terms is to determine the slope of the nucleon mass as a function of the quark masses, or equivalently, the squared pseudoscalar meson masses. The Feynman–Hellmann (FH) theorem gives

$$\begin{aligned} \sigma _{\pi N}= & {} m_u\frac{\partial M_N}{\partial m_u}+ m_d\frac{\partial M_N}{\partial m_d}\approx M_\pi ^2 \frac{\partial M_N}{\partial M_\pi ^2}, \nonumber \\ \sigma _s= & {} m_s \frac{\partial M_N}{\partial m_s}\approx \frac{1}{2} M_{{\bar{s}}s}^2 \frac{\partial M_N}{\partial M_{{\bar{s}}s}^2}, \end{aligned}$$
(393)

where the fictitious \({\bar{s}}s\) meson has a mass squared \(M^2_{{\bar{s}}s}=2M_K^2-M_\pi ^2\). In principle this is a straightforward method as the nucleon mass can be extracted from fits to two-point correlation functions, and a further fit to \(M_N\) as a function of \(M_\pi \) (and also \(M_K\) for \(\sigma _s\)) provides the slope. Nonetheless, this approach presents its own challenges: a functional form for the chiral behaviour of the nucleon mass is needed, and while baryonic \(\chi \)PT (B\(\chi \)PT) is the natural choice, the convergence properties of the different formulations are not well established. Results are sensitive to the formulation chosen and the order of the expansion employed. If there is an insufficient number of data points when implementing higher order terms, the coefficients are sometimes fixed using additional input, e.g., from analyses of experimental data. This may influence the slope extracted. Simulations with pion masses close to or bracketing the physical point can alleviate these difficulties. In some studies the nucleon mass is used to set the lattice spacing. This naturally forces the fit to reproduce the physical nucleon mass at the physical point and may affect the extracted slope.
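
The fit-and-differentiate step described above can be sketched as follows. This is a minimal illustration only: it assumes a simple quadratic Taylor ansatz in \(M_\pi ^2\) rather than a B\(\chi \)PT form, the function names are hypothetical, and all input numbers are synthetic placeholders generated from a known curve, not lattice results.

```python
# Feynman-Hellmann sketch: sigma_piN ~ M_pi^2 * dM_N/dM_pi^2 from a fit.
# ASSUMPTIONS: quadratic Taylor ansatz in x = M_pi^2; synthetic data in GeV.


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_taylor(x, y, err, order=2):
    """Weighted least-squares fit of y = sum_k c_k x^k; returns [c_0, ..., c_order]."""
    w = [1.0 / e**2 for e in err]
    n = order + 1
    a = [[sum(wi * xi ** (j + k) for wi, xi in zip(w, x)) for k in range(n)]
         for j in range(n)]
    b = [sum(wi * yi * xi**j for wi, xi, yi in zip(w, x, y)) for j in range(n)]
    return solve_linear(a, b)


def sigma_from_fit(coeffs, x_phys):
    """Return x * dM_N/dx evaluated at x = x_phys (the physical M_pi^2)."""
    slope = sum(k * c * x_phys ** (k - 1) for k, c in enumerate(coeffs) if k > 0)
    return x_phys * slope


# Synthetic "ensembles": pion masses in GeV and nucleon masses generated from
# the made-up curve M_N = 0.88 + 3.0 x - 2.0 x^2 (x in GeV^2).
pion_masses = [0.15, 0.20, 0.25, 0.30, 0.40, 0.50]
x_data = [m**2 for m in pion_masses]
y_data = [0.88 + 3.0 * x - 2.0 * x**2 for x in x_data]
y_err = [0.005] * len(x_data)

coeffs = fit_taylor(x_data, y_data, y_err)
sigma_gev = sigma_from_fit(coeffs, 0.135**2)  # slope taken at M_pi = 135 MeV
print(f"fit coefficients: {coeffs}")
print(f"sigma (synthetic) = {1000 * sigma_gev:.1f} MeV")
```

In a realistic analysis the ansatz, the treatment of finite-volume and discretisation effects, and the propagation of statistical errors (e.g., via bootstrap) would all need to be added; the sketch only shows how the slope at the physical point translates into a sigma term.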

An overview of recent determinations of \(\sigma _{\pi N}\) and \(\sigma _s\) is given in Table 67. Note that the renormalization and excited state criteria are not applied.Footnote 87 We do not impose the latter since a wide range of source-sink separations are available for nucleon two-point functions and ground state dominance is normally achieved.

There are several results for \(\sigma _{\pi N}\) that can be included in a global average. For \(N_{ f}=2\), one study meets the selection criteria.Footnote 88 The analysis of QCDSF 12 [90] employs nonperturbatively improved clover fermions over three lattice spacings (\(a=0.06-0.08\) fm) with pion masses reaching down to around 160 MeV. Finite volume corrected nucleon masses are extrapolated via \(O(p^4)\) covariant B\(\chi \)PT with three free parameters. The other coefficients are taken from experiment, phenomenology or FLAG, with the corresponding uncertainties accounted for in the fit for those coefficients that are not well known. The nucleon mass is used to set the scale. A novel feature of this study is that a direct determination of \(\sigma _{\pi N}\) at around \(M_\pi =290\) MeV was used as an additional constraint on the slope.

Turning to \(N_{ f}=2\,+\,1\), two studies performed by the BMW collaboration are relevant. In BMW 11A [87], stout smeared tree-level clover fermions are employed on 15 ensembles with simulation parameters encompassing a = 0.06–0.12 fm, \(M_\pi \sim \) 190–550 MeV and . Taylor, Padé and covariant SU(3) B\(\chi \)PT fit forms are considered. Due to the use of smeared gauge links, discretisation effects are found to be mild even though the fermion action is not fully O(a) improved. Fits are performed including an O(a) or \(O(a^2)\) term and also without a lattice-spacing dependent term. Finite volume effects were assessed to be small in an earlier work [952]. The final results are computed considering all combinations of the fit ansatz weighted by the quality of the fit. In BMW 15 [88], a more extensive analysis on 47 ensembles is presented for HEX-smeared clover fermions involving five lattice spacings and pion masses reaching down to 120 MeV. Bracketing the physical point reduces the reliance on a chiral extrapolation. Joint continuum, chiral and infinite volume extrapolations are carried out for a number of fit parameterisations with the final results determined via the Akaike information criterion procedure [904]. Although only \(\sigma _{\pi N}\) is accessible in the FH approach in the isospin limit, the individual quark fractions \(f_{T_q}=\sigma _q/M_N\) for \(q=u,d\) for the proton and the neutron are also quoted in BMW 15, using isospin relations.Footnote 89
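
The weighting of fit variations by an information criterion can be sketched as follows. This is a minimal illustration assuming the textbook form \(\mathrm {AIC}=\chi ^2+2k\) (with \(k\) the number of fit parameters) and normalised weights proportional to \(\exp (-\mathrm {AIC}/2)\); the function names and all input numbers are hypothetical, and the procedures actually used in BMW 11A and BMW 15 may differ in detail.

```python
import math


def aic_weights(chi2, n_params):
    """Akaike weights w_i ~ exp(-AIC_i / 2) with AIC_i = chi2_i + 2 * k_i."""
    aic = [c + 2 * k for c, k in zip(chi2, n_params)]
    best = min(aic)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (a - best)) for a in aic]
    norm = sum(raw)
    return [r / norm for r in raw]


def model_average(values, weights):
    """Weighted mean over fit variations and the weighted spread around it."""
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(var)


# Hypothetical fit variations (NOT taken from any of the cited studies):
chi2 = [10.2, 9.8, 14.5]       # chi^2 of each fit ansatz
n_params = [3, 4, 3]           # number of free parameters in each ansatz
sigma_fits = [38.0, 41.0, 36.0]  # sigma term from each ansatz, in MeV

weights = aic_weights(chi2, n_params)
mean, spread = model_average(sigma_fits, weights)
print(f"weights = {weights}")
print(f"model average = {mean:.1f} MeV, spread = {spread:.1f} MeV")
```

The spread over fit variations is one possible estimate of the systematic uncertainty associated with the choice of ansatz.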

Regarding \(N_{ f}=2\,+\,1\,+\,1\), there is only one recent study. In ETM 14A [21], fits are performed to the nucleon mass utilizing SU(2) \(\chi \)PT for data with \(M_\pi \ge 213\) MeV as part of an analysis to set the lattice spacing. The expansion is considered to \(O(p^3)\) and \(O(p^4)\), with two and three of the coefficients as free parameters, respectively. The difference between the two fits is taken as the systematic error. No discernable discretisation or finite volume effects are observed where the lattice spacing is varied over the range a = 0.06–0.09 fm and the spatial volumes cover \(M_\pi L=3.4\) up to \(M_\pi L>5\). The results are unchanged when a near physical point \(N_{ f}=2\) ensemble is added to the analysis in Ref. [949].

Fig. 44

Lattice results and FLAG averages for the nucleon sigma term, \(\sigma _{\pi N}\), for the \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations. Determinations via the direct approach are indicated by squares and the Feynman–Hellmann method by triangles. Results from calculations which analyse more than one lattice data set within the Feynman–Hellmann approach [454, 455, 949, 954,955,956,957,958,959,960] are shown for comparison (pentagons) along with those from recent analyses of \(\pi \)-N scattering [940,941,942, 961] (circles)

Other determinations of \(\sigma _{\pi N}\) in Table 67 receive one or more red tags. JLQCD 08B [841], PACS-CS 09 [825] and QCDSF 11 [947] are single lattice spacing studies. In addition, the volume for the minimum pion mass is rather small for JLQCD 08B and PACS-CS 09, while QCDSF 11 is restricted to heavier pion masses.

We also consider publications that are based on results for baryon masses found in the literature. As different lattice setups (in terms of \(N_{ f}\), lattice actions, etc.) will lead to different systematics, we only include works in Table 67 which utilize a single setup. These correspond to Shanahan 12 [946] and Martin Camalich 10 [948], which fit PACS-CS data [162] (the PACS-CS 09 study is also based on these results). Note that Shanahan 12 avoids a red tag for the volume criterion as the lightest pion mass ensemble is omitted. Recent studies which combine data from different setups/collaborations are displayed for comparison in Figs. 44 and 45 in the next section.

Several of the above studies have also determined the strange quark sigma term. This quantity is difficult to access via the Feynman–Hellmann method since in most simulations the physical point is approached by varying the light-quark mass, keeping \(m_s\) approximately constant. While additional ensembles can be generated, it is hard to resolve a small slope with respect to \(m_s\). Such problems are illustrated by the large uncertainties in the results from BMW 11A and BMW 15. Alternative approaches have been pursued in QCDSF 11, where the physical point is approached along a trajectory keeping the average of the light- and strange-quark masses fixed, and JLQCD 12A [842], where quark mass reweighting is applied. The latter is a single lattice spacing study. One can also fit to the whole baryon octet and apply SU(3) flavour symmetry constraints as investigated in, e.g., Martin Camalich 10, Shanahan 12, QCDSF 11 and BMW 11A.

Fig. 45

Lattice results and FLAG averages for \(\sigma _s\) for the \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations. Determinations via the direct approach are indicated by squares, the Feynman–Hellmann method by triangles and the hybrid approach by circles. Results from calculations which analyse more than one lattice data set within the Feynman–Hellmann approach [454, 455, 955, 956, 958] are shown for comparison (pentagons)

The determinations of \(\sigma _s\) in BMW 11A and BMW 15 qualify for averaging. The mixed action study of Junnarkar 13 [92] with domain wall valence fermions on MILC \(N_{ f}=2\,+\,1\) Asqtad ensembles also passes the FLAG criteria. The derivative \(\partial M_N/\partial m_s\) is determined from simulations above and below the physical strange quark mass for \(M_\pi \) around 240–675 MeV. The resulting values of \(\sigma _s\) are extrapolated quadratically in \(M_\pi \). The quark fraction \(f_{T_s}=\sigma _s/M_N\) exhibits a milder pion-mass dependence and extrapolations of this quantity were also performed using ansätze linear and quadratic in \(M_\pi \). A weighted average of all three fits was used to form the final result. Two lattice spacings were analysed, with \(a\) around 0.09 fm and 0.12 fm; however, discretisation effects could not be resolved. The global averaging of all calculations that qualify is discussed in the next section.

10.4.4 Summary of results for \(g_S^{u,d,s}\)

We consider computing global averages of results determined via the direct, hybrid and Feynman–Hellmann (FH) methods. Beginning with \(\sigma _{\pi N}\), Tables 66 and 67 show that for \(N_{ f}=2\,+\,1\,+\,1\) only ETM 14A (FH) satisfies the selection criteria. We take this result as our average for the four flavour case.

$$\begin{aligned} N_{ f}=2\,+\,1\,+\,1: \quad \sigma _{\pi N} = 64.9(1.5)(13.2)~\text{ MeV }\quad \,\mathrm {Ref.}~[21]. \end{aligned}$$
(394)

For \(N_{ f}=2\,+\,1\) we form an average from the BMW 11A (FH), BMW 15 (FH) and \(\chi \)QCD 15A (direct) results, yielding

$$\begin{aligned}&N_{ f}=2\,+\,1:&\sigma _{\pi N}&= 39.7(3.6) ~\text{ MeV }&\,\mathrm {Refs.}~ [87{-}89]. \end{aligned}$$
(395)

Note that both BMW results are included as they were obtained on independent sets of ensembles (employing different fermion actions). The average is dominated by the BMW 15 calculation, which has much smaller overall errors compared to the other two studies.
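
For orientation, if the individual results \(x_i\) with total errors \(\sigma _i\) are treated as uncorrelated, a standard inverse-variance weighted average reads as below. This is an illustrative sketch only; the symbols \(x_i\), \(\sigma _i\) and \(\omega _i\) are introduced here for this purpose, and the procedure used for the averages in this review may differ in detail.

$$\begin{aligned} {\bar{x}} = \sum _i \omega _i\, x_i, \qquad \omega _i = \frac{\sigma _i^{-2}}{\sum _j \sigma _j^{-2}}, \qquad \sigma _{{\bar{x}}}^2 = \frac{1}{\sum _j \sigma _j^{-2}} . \end{aligned}$$

With such weights, a result whose total error is half that of another carries four times its weight, which is how a single precise calculation comes to dominate an average.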

Turning to the results for \(N_{ f}=2\), only QCDSF 12 (FH) qualifies. This result forms our average:

$$\begin{aligned}&N_{ f}=2:&\sigma _{\pi N}&= 37(8)(6)~\text{ MeV }&\,\mathrm {Ref.}~ \text{[90] }. \end{aligned}$$
(396)

Considering \(\sigma _s\) and the calculations detailed in Table 66, there is again only a single 2 + 1 + 1 flavour study, MILC 12C (hybrid), which satisfies the quality criteria. In order to convert the result for \(\langle N|{\bar{s}}s|N\rangle \) given in this work to a value for \(\sigma _s\), we multiply by the appropriate FLAG average for \(m_s\) given in Eq. (35). This gives our average for four flavours.

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&\sigma _{s}&= 41.0(8.8)~\text{ MeV }&\,\mathrm {Ref.}~ \text{[91] }. \end{aligned}$$
(397)
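
The conversion from the scalar matrix element to the sigma term can be written out explicitly. The error propagation shown is a sketch that assumes the uncertainties on \(m_s\) and \(g_S^s\) are uncorrelated, an assumption made here for illustration; the symbol \(\delta \) denotes a total uncertainty.

$$\begin{aligned} \sigma _s = m_s\, g_S^s, \qquad \left( \frac{\delta \sigma _s}{\sigma _s}\right) ^2 = \left( \frac{\delta m_s}{m_s}\right) ^2 + \left( \frac{\delta g_S^s}{g_S^s}\right) ^2 . \end{aligned}$$

Both factors must be taken in the same scheme and at the same scale (here \({\overline{\mathrm {MS}}}\) at 2 GeV), so that the scheme and scale dependence cancels in the product.
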
Table 68 Overview of results for \(g^q_T\)

For \(N_{ f}=2\,+\,1\) we perform a weighted average of BMW 11A (FH), MILC 12C (hybrid), Junnarkar 13 (FH), BMW 15 (FH) and \(\chi \)QCD 15A (direct). MILC 09D [943] also passes the FLAG selection rules; however, this calculation is superseded by MILC 12C. As for Eq. (397), the strangeness scalar matrix element determined in the latter study is multiplied by the three flavour FLAG average for \(m_s\) given in Eq. (33). There are correlations between the MILC 12C and Junnarkar 13 results as there is some overlap between the sets of Asqtad ensembles used in both cases. To be conservative we take the statistical errors for these two studies to be 100% correlated. The global average is

$$\begin{aligned} N_{ f}=2\,+\,1: \quad \sigma _{s} = 52.9(7.0) ~\text{ MeV }\quad \,\mathrm {Refs.}~{ [87{-}89,91,92]}. \end{aligned}$$
(398)
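
The effect of treating the statistical errors of two results as fully correlated can be sketched as follows. This is a minimal illustration, assuming weights built from the total (statistical plus systematic, added in quadrature) errors and a covariance \(C_{ij}=\sigma ^{\mathrm {stat}}_i\sigma ^{\mathrm {stat}}_j\) for the correlated pair; the function name and all input numbers are hypothetical and are not the values entering Eq. (398).

```python
import math


def average_with_correlations(values, stat, syst, stat_correlated_pairs=()):
    """Weighted average with weights from total errors.

    Index pairs in stat_correlated_pairs are treated as having 100%
    correlated statistical errors, which enlarges the final uncertainty
    but leaves the central value unchanged.
    """
    total_sq = [s * s + y * y for s, y in zip(stat, syst)]
    norm = sum(1.0 / t for t in total_sq)
    weights = [1.0 / (t * norm) for t in total_sq]
    mean = sum(w * v for w, v in zip(weights, values))
    # Diagonal part of sum_ij w_i w_j C_ij.
    var = sum(w * w * t for w, t in zip(weights, total_sq))
    # Off-diagonal part: C_ij = stat_i * stat_j for fully correlated pairs.
    for i, j in stat_correlated_pairs:
        var += 2.0 * weights[i] * weights[j] * stat[i] * stat[j]
    return mean, math.sqrt(var)


# Hypothetical inputs in MeV (NOT the values from Tables 66 and 67).
values = [50.0, 40.0, 45.0]
stat = [5.0, 6.0, 8.0]
syst = [3.0, 4.0, 6.0]

mean_unc, err_unc = average_with_correlations(values, stat, syst)
mean_cor, err_cor = average_with_correlations(values, stat, syst, [(0, 1)])
print(f"uncorrelated:          {mean_unc:.1f} +- {err_unc:.1f} MeV")
print(f"stat. correlated (0,1): {mean_cor:.1f} +- {err_cor:.1f} MeV")
```

Positive correlations always increase the error of the average in this construction, which is why assuming 100% correlation is the conservative choice.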

Given that all of the \(N_{ f}=2\) studies have at least one red tag we are not able to give an average in this case.

Fig. 46

Lattice results and FLAG averages for \(g_T^{u,d,s}\) for the \(N_{ f}= 2\), \(2\,+\,1\), and \(2\,+\,1\,+\,1\) flavour calculations

All the results for \(\sigma _{\pi N}\) and \(\sigma _s\) are displayed in Figs. 44 and 45 along with the averages given above. Note that where \(f_{T_s}\) is quoted in Tables 66 and 67, we multiply by the experimental proton mass in order to include the results in the figures. Those results which pass the FLAG criteria, shown in green, are consistent within one standard deviation with the averages for each \(N_{ f}\), and considering the size of the uncertainties in the averages no significant \(N_{ f}\)-dependence is observed. However, there is some fluctuation in the central values, in particular, when taking the lattice results as a whole into account, and we caution the reader that the averages may change as new results become available.

Also shown for comparison in the figures are determinations from the FH method which utilize more than one lattice data set [454, 455, 949, 954,955,956,957,958,959,960] as well as results for \(\sigma _{\pi N}\) obtained from recent analyses of \(\pi \)-N scattering [940,941,942, 961]. There is some tension, at the level of three to four standard deviations, between the lattice average for \(N_{ f}=2\,+\,1\) and Hoferichter et al. [942], who quote a precision similar to that of the average.
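
The size of this tension can be estimated by dividing the difference of the two central values by their errors combined in quadrature. As an illustration, we take the average of Eq. (395) and assume the value \(\sigma _{\pi N}=59.1(3.5)\) MeV for Ref. [942]; this number is not quoted in the text above and should be checked against that reference.

$$\begin{aligned} \frac{59.1-39.7}{\sqrt{3.5^2+3.6^2}} \approx \frac{19.4}{5.0} \approx 3.9 . \end{aligned}$$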

Finally we remark that, by exploiting the heavy-quark limit, the light- and strange-quark sigma terms can be used to estimate \(\sigma _q\) for the charm, bottom and top quarks [930,931,932]. The resulting estimate for the charm quark, see, e.g., the RQCD 16 \(N_{ f}=2\) analysis of Ref. [824] that reports \(f_{T_c}=0.075(4)\) or \(\sigma _c=70(4)\) MeV, is consistent with the direct determinations of ETM 16A [827] for \(N_{ f}=2\) of \(\sigma _c=79(21)(^{12}_{8})\) MeV and \(\chi \)QCD 13A [839] for \(N_{ f}=2\,+\,1\) of \(\sigma _c=94(31)\) MeV. In MILC 12C [91], MILC find \(\langle N|{\bar{c}}c|N\rangle =0.056(27)\) in the \({\overline{\mathrm {MS}}}\) scheme at a scale of 2 GeV for \(N_{ f}=2\,+\,1\,+\,1\) via the hybrid method. Considering the large uncertainty, this is consistent with the other results once multiplied by the charm quark mass.
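
At leading order in the heavy-quark expansion and in \(\alpha _s\), the relation underlying such estimates takes the form given below. It is quoted here as a sketch of the standard leading-order result for three light flavours; the analyses cited above may include higher-order corrections.

$$\begin{aligned} f_{T_Q} = \frac{2}{27}\left( 1-\sum _{q=u,d,s} f_{T_q}\right) , \qquad \sigma _Q = f_{T_Q}\, M_N, \qquad Q=c,b,t . \end{aligned}$$

The two numbers quoted for RQCD 16 are related by the second expression: \(0.075\times 938\) MeV \(\approx 70\) MeV.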

10.4.5 Results for \(g_T^{u,d,s}\)

A compilation of recent results for the flavour-diagonal tensor charges \(g_T^{u,d,s}\) for the proton in the \({\overline{\mathrm {MS}}}\) scheme at 2 GeV is given in Table 68 and plotted in Fig. 46. Results for the neutron can be obtained by interchanging the u and d flavour indices. Only the PNDME 2 + 1 + 1 flavour calculations qualify for the global average.

The FLAG averages are the same as the PNDME 18B [7] results, which supersede the PNDME 16 [834] and the PNDME 15A [832] values:

$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_T^u&= 0.784(28)(10)&\,\mathrm {Ref.}~[7], \end{aligned}$$
(399)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_T^d&= -0.204(11)(10)&\,\mathrm {Ref.}~[7], \end{aligned}$$
(400)
$$\begin{aligned}&N_{ f}=2\,+\,1\,+\,1:&g_T^s&= -0.027(16)&\,\mathrm {Ref.}~[7]. \end{aligned}$$
(401)

The ensembles and the analysis strategy used in PNDME 18B are the same as those described in Sect. 10.4.1 for \(g_A^{u,d,s}\). The only difference for the tensor charges was that a one-state (constant) fit was used for the disconnected contributions as the data did not show significant excited-state contamination. The isovector renormalization constant, used for all three flavour-diagonal tensor operators, was calculated on the lattice in the RI-SMOM scheme and converted to \({\overline{\mathrm {MS}}}\) at 2 GeV using 2-loop perturbation theory. As discussed in Sect. 10.1.3, the difference between the singlet and isovector factors is expected to be small.

The JLQCD 18 [843] and ETM 17 [830] calculations were not considered for the final averages because they did not satisfy the criteria for the continuum extrapolation, as already discussed in Sect. 10.4.1.