1 Introduction

Electrons and muons carry a magnetic moment aligned with their spin. The proportionality factor between the two axial vectors is parameterized by the gyromagnetic ratio g. In Dirac’s theory, \(g=2\), and for a lepton family \(\ell \) one characterizes the deviation of g from this reference value by \(a_\ell =(g-2)_\ell /2\). Historically, the ability of quantum electrodynamics (QED) to quantitatively predict this observable played a crucial role in establishing quantum field theory as the framework in which particle physics theories are formulated.

Presently, the achieved experimental precision on the measurement of the anomalous magnetic moment of the muon [1], \(a_\mu \), is 540 ppb. At this level of precision, such a measurement tests not only QED, but also the effects of the weak and the strong interaction of the Standard Model (SM) of particle physics. Currently there exists a tension of about 3.7 standard deviations between the SM prediction and the experimental measurement. The status of this test of the SM is reviewed in [2,3,4,5]. At the time of writing, the E989 experiment at Fermilab is performing a new direct measurement of \(a_\mu \) [6], and a further experiment using a different experimental technique is planned at J-PARC [7]. The final goal of these experiments is to reduce the uncertainty on \(a_\mu \) by a factor of four. A commensurate reduction of the theory error is thus of paramount importance.

The precision of the SM prediction for \(a_\mu \) is completely dominated by hadronic uncertainties. The leading hadronic contribution enters at second order in the fine-structure constant \(\alpha \) via the vacuum polarization and must be determined at the few-permille level in order to match the upcoming precision of the direct measurements of \(a_\mu \). The most accurate determination comes from the use of \(e^+e^-\rightarrow \mathrm{hadrons}\) data via a dispersion relation, although lattice QCD calculations have made significant progress in computing this quantity from first principles [5, 8]. The hadronic light-by-light (HLbL) scattering contribution \(a_\mu ^{\mathrm{hlbl}}\), which is of third order in \(\alpha \), currently contributes at a comparable level to the theory uncertainty budget and is being addressed both by dispersive and lattice methods; see [9,10,11] and references therein.

Our approach for determining \(a_\mu ^{\mathrm{hlbl}}\) is based on coordinate-space perturbation theory where the QED elements of the Feynman diagrams yielding \(a_\mu ^{\mathrm{hlbl}}\) are precomputed in infinite volume, and only the four-point amplitude of the electromagnetic current is actually computed on the lattice. Here we compute the lattice contribution at a point in the space of light quark masses corresponding to QCD with exact SU(3)-flavor (denoted \(\hbox {SU(3)}_{\mathrm{f}}\) in the rest of this work) symmetry. Furthermore, the sum of the three light quark masses is approximately the same as in nature. These two conditions lead to a degenerate mass of pions, kaons and the eta meson of about 420 MeV.

Our motivation for calculating \(a_\mu ^{\mathrm{hlbl}}\) at the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point is twofold. First, the lattice calculation itself is simplified in that only two out of five classes of Wick contractions contribute, due to the vanishing trace of the quark electric charge matrix. In addition, the overall lattice calculation is computationally far cheaper than for physical quark masses, so that more tests of systematic errors can be performed. Second, the interpretation of the results is simplified: the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetry reduces the number of unknown parameters in model estimates based on the exchange of the lightest mesons. In particular, the transition form factor (TFF) of the pion, which describes the coupling of the neutral pion to two virtual photons, has been calculated [12] on the lattice ensembles that we use. The TFF of the eta meson coincides with the TFF of the pion up to a simple overall charge factor. Of the pseudoscalar mesons, only the TFF of the \(\eta '\) remains independent and is largely unknown at the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point. However, experimental information is available for the two-photon decay width (which provides the coupling strength to two real photons), and some experimental results are available for the singly- as well as the doubly-virtual form factor [13,14,15], although only for relatively large virtualities above 1.5 \(\hbox {GeV}^2\).

The simplified connection to model estimates enables our work to provide a valuable cross-check for the predictions of hadronic models and dispersive methods; this work is thereby complementary to lattice calculations directly aiming at \(a_\mu ^{\mathrm{hlbl}}\) for physical quark masses [16]. At the same time, this study allows us to learn about the size of various sources of systematic error, particularly the finite-size effects, and how well we are able to correct for them semi-analytically.

The rest of this paper is organized as follows: we begin by presenting our methodology in Sect. 2, including the two methods we will investigate for computing the quark-connected contribution to \(a_\mu ^{\mathrm{hlbl}}\). Section 3 begins with a description of the lattice ensembles used in this work, as well as an example of the lowest-lying relevant meson spectrum for one of our ensembles. We then discuss the lattice determination of the \(\pi ^0\) and \(\eta \) transition form factors used for the modelling and finite-size correction of our data. Section 4 discusses the various model predictions for the integrand at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point and confronts these with a selection of our lattice data. Results for the fully-connected class of Wick contractions are presented in Sect. 5, and those for the non-vanishing disconnected class in Sect. 6. In both cases, the lattice results are compared to the prediction for the exchange of pseudoscalar mesons at the integrand level. The main result of this paper – \(a_\mu ^{\mathrm{hlbl}}\) at the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point, Eq. (31) – is obtained in Sect. 7, which also contains a discussion of how this result will change for physical values of the quark masses. We summarize our findings and conclude in Sect. 8, and various technical aspects of the calculation are described in more detail in the appendices (A, B, C).

Fig. 1

Five Wick-contraction topologies that are necessary for the calculation of \(a_\mu ^{\mathrm{hlbl}}\) as listed in Table 1. For each diagram, nontrivial permutations of the four quark-photon vertices yield the number of contractions listed in that table. The two diagrams in the first line are the dominant ones and the other three vanish in the \(\hbox {SU}(3)_{\mathrm{f}}\) limit

2 Methodology

One can compute the light-by-light scattering contribution to the \(g-2\) of the muon in position space by performing the integrals

$$\begin{aligned} a_\mu ^{\mathrm{hlbl}}&= \frac{m_\mu e^6}{3}\nonumber \\&\quad \times \int d^4y \int d^4x \; \mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y)\;i{\widehat{\Pi }}_{\rho ;\mu \nu \lambda \sigma }(x,y),\nonumber \\ \end{aligned}$$
(1)

where \(\mathcal {{\bar{L}}}\) is a QED kernel and \(i{\widehat{\Pi }}\) is a spatial moment of the connected Euclidean four-point function in QCD,

$$\begin{aligned} i{\widehat{\Pi }}_{\rho ;\mu \nu \lambda \sigma }( x, y)= & {} -\int d^4z\, z_\rho \, {{\widetilde{\Pi }}}_{\mu \nu \sigma \lambda }(x,y,z), \end{aligned}$$
(2)
$$\begin{aligned} {\widetilde{\Pi }}_{\mu \nu \sigma \lambda }(x,y,z)\equiv & {} \Big \langle \, j_\mu (x)\,j_\nu (y)\,j_\sigma (z)\, j_\lambda (0)\Big \rangle _{\mathrm{QCD}}\,, \end{aligned}$$
(3)

with \(j_\mu (x)\) the hadronic component of the electromagnetic current

$$\begin{aligned} j_\mu (x) = \frac{2}{3} ({\overline{u}} \gamma _{\mu } u)(x) - \frac{1}{3} ({\overline{d}} \gamma _{\mu } d)(x) - \frac{1}{3} ({\overline{s}} \gamma _{\mu } s)(x) \,. \end{aligned}$$
(4)

The QCD correlation function consists of all the various ways one can contract four vector currents, as shown in Fig. 1, all of which are “disconnected” except for the fully-connected contribution. At the flavor-symmetric point, only the upper two topologies contribute. Away from this point it is expected, from large-\(N_c\) arguments as well as from numerical evidence by the RBC/UKQCD collaboration [9], that the remaining topologies are suppressed. The number of contractions for each topology is given in Table 1.

More information on the infinite-volume QED kernel \(\mathcal {{\bar{L}}}(x,y)\) can be found in [17]. We make use of O(4) symmetry to simplify the integral further,

$$\begin{aligned} a_\mu ^{\mathrm{hlbl}}= \int _0^\infty d|y|\, f(|y|), \end{aligned}$$
(5)

where in the starting representation (1),

$$\begin{aligned} f(|y|)= & {} \frac{m_\mu e^6}{3} 2\pi ^2 |y|^3 \int d^4x \; \mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y)\;\nonumber \\&i{\widehat{\Pi }}_{\rho ;\mu \nu \lambda \sigma }(x,y). \end{aligned}$$
(6)
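As a brief reminder of where the factor \(2\pi ^2 |y|^3\) in Eq. (6) comes from, the following worked step assumes only that the x-integrated integrand depends on y through |y| alone. The symbol \(F(|y|)\) is a label introduced for this remark and does not appear elsewhere in the text.

$$\begin{aligned} \int d^4y \, F(|y|) = \int _0^\infty d|y|\, |y|^3 \int d\Omega _3\, F(|y|) = 2\pi ^2 \int _0^\infty d|y|\, |y|^3\, F(|y|), \end{aligned}$$

where \(\int d\Omega _3 = 2\pi ^2\) is the surface area of the unit three-sphere in four Euclidean dimensions.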

We will often display this integrand and always denote it by f(|y|), even though in practice we employ modified representations of \(a_\mu ^{\mathrm{hlbl}}\). One type of modification concerns the kernel \(\mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y)\). Various subtraction terms to the kernel have been proposed to beneficially change the shape of the integrand [18, 19] without changing the resulting integral. The importance of performing such subtractions cannot be overstated, as the unsubtracted kernel is poorly suited for practical lattice simulations because it is too strongly peaked at short distances.

Table 1 Number of contractions needed for each type of diagram in Fig. 1

Here we make extensive use of a new subtraction scheme for the QED kernel [19],

$$\begin{aligned} \mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y)= & {} \mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y) \nonumber \\&-\partial _\mu ^{(x)} (x_\alpha e^{-\Lambda m_\mu ^2 x^2/2}) \mathcal {{\bar{L}}}_{[\rho ,\sigma ];\alpha \nu \lambda }(0,y) \nonumber \\&- \partial _\nu ^{(y)} (y_\alpha e^{-\Lambda m_\mu ^2 y^2/2})\mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \alpha \lambda }(x,0),\nonumber \\ \end{aligned}$$
(7)

where \(\Lambda \) is an arbitrary, tuneable, dimensionless free parameter. When \(\Lambda =0\) we have the \(\mathcal {{\bar{L}}}^{(2)}\) kernel of [17] and as \(\Lambda \rightarrow \infty \) we recover the unsubtracted \(\mathcal {{\bar{L}}}^{(0)}\). The benefit of such a choice of kernel is that we are able to tune the shape of the integrand to reduce the long-distance effects while still preserving the beneficial properties of short-distance subtractions. An investigation of this kernel with the infinite-volume lepton loop is presented in Appendix B.
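For orientation, the derivative that appears in the subtraction terms of Eq. (7) can be carried out explicitly. This is a short check added for illustration, not a step taken from the original derivation:

$$\begin{aligned} \partial _\mu ^{(x)} \left( x_\alpha e^{-\Lambda m_\mu ^2 x^2/2}\right) = \left( \delta _{\mu \alpha } - \Lambda m_\mu ^2\, x_\mu x_\alpha \right) e^{-\Lambda m_\mu ^2 x^2/2}. \end{aligned}$$

At \(\Lambda =0\) the right-hand side reduces to \(\delta _{\mu \alpha }\), so the subtracted terms become \(\mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(0,y)\) and \(\mathcal {{\bar{L}}}_{[\rho ,\sigma ];\mu \nu \lambda }(x,0)\). For \(\Lambda \rightarrow \infty \) the Gaussian suppresses both subtraction terms at any fixed non-zero x or y.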

Fig. 2

Wick contractions for the connected contribution. Each diagram represents two contractions with quark flow in opposite directions

In this section we will write the formulae for obtaining \(a_\mu ^{\mathrm{hlbl}}\) in terms of continuous integrals. The lattice, however, is discrete so we can only approximate these integrals with finite sums,Footnote 1

$$\begin{aligned} \int d^4x \!\approx \! a^4\sum _{t/a=-N_t/2}^{N_t/2-1}\,\,\sum _{z/a=-N_z/2}^{N_z/2-1}\,\, \sum _{y/a=-N_y/2}^{N_y/2-1}\,\,\sum _{x/a=-N_x/2}^{N_x/2-1},\nonumber \\ \end{aligned}$$
(8)

where a is the lattice spacing and \(N_\mu =L_\mu /a\). In Eq. (2), we set \(z_\rho =L/2\rightarrow 0\) to accommodate the discontinuity when it changes sign. The integral of f(|y|) with respect to |y| can be performed with the trapezoid rule and in practice we will average over equivalent values of f(|y|) to both increase statistical precision and reduce computational cost. Later on in the analysis we will show results for the partially-integrated value of \(a_\mu \),

$$\begin{aligned} a_\mu (|y|_\text {max}) = \int _0^{|y|_\text {max}} d|y| f(|y|) \end{aligned}$$
(9)

with the hope that we will see a plateau at large-enough \(|y|_\text {max}\), indicating that our integral has saturated.
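The partial integral of Eq. (9) with the trapezoid rule can be sketched as follows. This is a minimal illustration with a toy integrand, not lattice data; the function name `partial_integrals` and the choice \(f(y)=y^3 e^{-y}\) (whose full integral is 6) are assumptions made only for the example.

```python
import math

def partial_integrals(y, f):
    """Cumulative trapezoid-rule estimate of a_mu(|y|_max) at each sample point."""
    if len(y) != len(f):
        raise ValueError("y and f must have the same length")
    total = 0.0
    running = [0.0]  # the integral from y[0] to y[0] is zero
    for i in range(1, len(y)):
        total += 0.5 * (f[i] + f[i - 1]) * (y[i] - y[i - 1])
        running.append(total)
    return running

# Toy integrand (NOT lattice data): f(y) = y^3 exp(-y), full integral 3! = 6.
ys = [0.05 * i for i in range(601)]  # |y| from 0 to 30 in arbitrary units
fs = [y**3 * math.exp(-y) for y in ys]
a_mu_partial = partial_integrals(ys, fs)

# A plateau means that extending |y|_max no longer changes the partial integral.
plateau_shift = abs(a_mu_partial[-1] - a_mu_partial[400])
print(a_mu_partial[-1], plateau_shift)
```

With real data the integrand is noisy at large |y|, so in practice a plateau would be judged within statistical errors rather than by a fixed tolerance.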

In the following three subsections we will discuss two methods to compute the connected contribution: the first, method 1, is a direct calculation of the three pairs of connected Wick contractions; the second, method 2, uses rearrangements of the integrand expressed in terms of only one pair of Wick contractions to make the calculation cheaper. We will then give the methodology used for the calculation of the disconnected contribution, which also uses some rearrangements in the integrand.

2.1 Connected contribution (method 1)

The Wick contractions for the connected contribution are shown in Fig. 2. With four local vector currents, the four-point correlation function can be written as

$$\begin{aligned} {\widetilde{\Pi }}^{\mathrm{conn}}_{\mu \nu \sigma \lambda }(x,y,z) \!= & {} \! \frac{18}{81} Z_V^4 \left( {\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(1)}(x,y,z) \!+\! {\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(2)}(x,y,z) \right. \nonumber \\&\left. + {\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(3)}(x,y,z) \right) , \end{aligned}$$
(10)

where \(Z_V\) is the renormalization factor of the vector currentFootnote 2 and each term represents one diagram in Fig. 2, i.e. a pair of Wick contractions. We will use the \(Z_V\) values from [20] without the O(a)-improvement terms proportional to quark mass combinations. The value 18/81 is the necessary charge factor for the degenerate light and strange quarks. Writing the \({\widetilde{\Pi }}^{(1,2,3)}\)s in terms of propagators yields

$$\begin{aligned}&{\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(1)}(x,y,z) \nonumber \\&\,\, =\! -2\text {Re}\left\langle \,\mathrm{Tr}\left[ S(0,x) \gamma _{\mu } S(x,y) \gamma _{\nu } S(y,z) \gamma _{\sigma } S(z,0) \gamma _{\lambda } \right] \right\rangle _U, \nonumber \\&{\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(2)}(x,y,z) \nonumber \\&\,\, =\! -2\text {Re}\left\langle \,\mathrm{Tr}\left[ S(0,y) \gamma _{\nu } S(y,z) \gamma _{\sigma } S(z,x) \gamma _{\mu } S(x,0) \gamma _{\lambda } \right] \right\rangle _U, \nonumber \\&{\widetilde{\Pi }}_{\mu \nu \sigma \lambda }^{(3)}(x,y,z) \nonumber \\&\,\, =\! -2\text {Re}\left\langle \,\mathrm{Tr}\left[ S(0,y) \gamma _{\nu } S(y,x) \gamma _{\mu } S(x,z) \gamma _{\sigma } S(z,0) \gamma _{\lambda } \right] \right\rangle _U,\nonumber \\ \end{aligned}$$
(11)

where \(S(x,y)\) is the quark propagator with sink at x and source at y and \(\langle \cdots \rangle _U\) is the expectation value over gauge configurations. The trace is over both Dirac and color indices. In method 1, the six contractions are computed explicitly and we choose the direction \(y/a \propto (1,1,1,1)\).Footnote 3 In this method, we first compute the point-to-all propagator \(S(\cdot ,0)\) and the six sequential propagators using the fields \(z_{[\rho } \gamma _{\sigma ]} S(z,0)\) as sources (the anti-symmetrization of the indices \(\rho \) and \(\sigma \) is imposed by the symmetries of the QED kernel). Then, for each value of |y| used to sample the integrand in Eq. (5), we compute the point-to-all propagator \(S(\cdot , y)\) and the six sequential propagators using the fields \(z_{[\rho } \gamma _{\sigma ]} S(z,y)\) as sources. Therefore, for N evaluations of the integrand, we need \(7(N+1)\) propagator inversions. In our set-up all currents are local except for the one at x, which will be either local or point-split, the latter implying a suitable modification of Eq. (11).

In practice, since we are (mostly) using open-boundary conditions in the time direction, the origin is located somewhere near the middle time-slice and randomly distributed in the spatial volume. We also average over the 16 combinations \(y/a = (\pm n, \pm n, \pm n, \pm n)\) to increase statistics.
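The charge factor 18/81 of Eq. (10) can be checked with exact rational arithmetic. The sketch below assumes, as stated in the text, that the three quark flavors are degenerate, so that each topology is weighted only by a product of charge sums. The same counting gives the factor 36/81 that appears in the disconnected contribution of Eq. (15), and shows why any topology containing a quark loop with a single photon vertex drops out. Variable names are illustrative.

```python
from fractions import Fraction

# Electric charges of the u, d, s quarks in units of e, as in Eq. (4).
charges = [Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)]

# One quark loop with four photon vertices: sum over flavors of Q_f^4.
connected_factor = sum(q**4 for q in charges)

# Two quark loops with two vertices each: (sum over flavors of Q_f^2)^2.
disconnected_factor = sum(q**2 for q in charges) ** 2

# A loop with a single vertex is weighted by the trace of the charge matrix.
single_vertex_factor = sum(charges)

print(connected_factor, disconnected_factor, single_vertex_factor)
```

Note that `Fraction` reduces 18/81 and 36/81 to lowest terms, so the printed values are 2/9 and 4/9.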

2.2 Connected contribution (method 2)

The idea for method 2 is simple: we pick a reference diagram that is easiest to compute and use a change of variables in the integrals to relate the other diagrams to this reference. Here, we pick the diagram that does not have a propagator from the origin to y, or from z to x (the leftmost of Fig. 2). Such a choice avoids the extra inversions required for the sequential sources as we discussed in method 1. Simplistically, for two samples of a single |y| (i.e. \(+y\) and \(-y\)) we only need two point-to-all propagators. In fact, we can do much better than this if we keep propagators in memory so that they can be reused.

The diagrams \({\widetilde{\Pi }}^{(2)}\) and \({\widetilde{\Pi }}^{(3)}\) are related to \({\widetilde{\Pi }}^{(1)}\) by applying symmetries to Eq. (11): Euclidean-space translations and inversions, along with cyclicity of the trace [17, 19]. This leads to our master formula for method 2:

$$\begin{aligned} a_\mu ^{\mathrm{conn}}= & {} -\frac{18}{81}Z_{\mathrm{V}}^4 \frac{m_\mu e^6}{3}2\pi ^2\int d|y| |y|^3 \;\int d^4x \nonumber \\&\bigg ((\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y) +\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\nu \mu \lambda }(y,x) \nonumber \\&-\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\lambda \nu \mu }(x,x -y))\int d^4z \,z_\rho {\widetilde{\Pi }}^{(1)}_{\mu \nu \sigma \lambda }(x,y,z) \nonumber \\&+\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\lambda \nu \mu }(x,x-y)x_\rho \int d^4z \,{\widetilde{\Pi }}^{(1)}_{\mu \nu \sigma \lambda }(x,y,z)\bigg ).\nonumber \\ \end{aligned}$$
(12)

In infinite volume, the integral is equal to the result of method 1, but the integrand f(|y|) will generally be different. As a result, the systematic effects in method 2 will be different from method 1.

We invert point-source propagators along the line \((n,n,n,3n+t_\text {min}/a)\), and what we call |y| will be the difference between these points; here the integer n ranges from 0 to \(N_i/2\). The closest source position in time (\(t_\text {min}\)) is chosen to be suitably far away from our (usually) open temporal boundary, ideally \(m_\pi t_{\text {min}}>4\) and \(m_\pi (L_t-t_\text {max})>4\). This line of propagator sources was chosen as we typically have a large anisotropy in the temporal direction, allowing us to achieve sizeable values of |y|.

In our implementation, we keep all \(N_i/2\) of the propagators in memory and perform the integrals for every possible y and origin, so for N propagator inversions we have \(N(N-1)\) samples distributed among the different values of non-zero |y|. To further boost statistics we will also average other directions that give the same |y|, e.g. \(y/a=(\pm n,\pm n,\pm n,t_\text {max}/a-3n)\).
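The counting of samples in method 2 can be illustrated with a short sketch. It places sources on the line \((n,n,n,3n+t_\text {min}/a)\), forms every ordered pair of origin and y, and groups the pairs by \((|y|/a)^2\). The number of sources and the value of \(t_\text {min}/a\) are illustrative assumptions, the function names are hypothetical, and the additional averaging over other directions mentioned above is not included.

```python
from collections import Counter
from itertools import permutations

def line_sources(n_sources, t_min_over_a):
    """Source sites (n, n, n, 3n + t_min/a) in lattice units."""
    return [(n, n, n, 3 * n + t_min_over_a) for n in range(n_sources)]

def count_samples(sources):
    """Count ordered (origin, y) pairs, grouped by the squared separation (|y|/a)^2."""
    counts = Counter()
    for origin, y in permutations(sources, 2):
        y_sq = sum((yi - oi) ** 2 for yi, oi in zip(y, origin))
        counts[y_sq] += 1
    return counts

sources = line_sources(n_sources=8, t_min_over_a=16)  # illustrative values only
counts = count_samples(sources)
n = len(sources)
print(sum(counts.values()), n * (n - 1))  # total number of samples
print(sorted(counts.items()))             # samples per (|y|/a)^2 = 12 * k^2
```

Short separations are sampled most often: a separation of k steps along the line occurs for 2(N-k) ordered pairs, so the largest |y| has only two samples.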

2.3 Disconnected contribution

The disconnected contribution can be computed from the two-point contraction

$$\begin{aligned} \Pi _{\mu \nu }(x,y)= -\text {Re}\left( \text {Tr}[S(y,x)\gamma _\mu S(x,y)\gamma _\nu ]\right) . \end{aligned}$$
(13)

An important point to note is that \(\Pi _{\mu \nu }\) has a vacuum expectation value (VeV) that must be subtracted to ensure that the two “disconnected” quark loops are still connected by gluons. To this end, we define

$$\begin{aligned} {\hat{\Pi }}_{\mu \nu }(x,y) = \Pi _{\mu \nu }(x,y) - \langle \Pi _{\mu \nu }(x,y) \rangle _U, \end{aligned}$$
(14)

and use this to compute the \(2+2\) quark-disconnected contribution to \(a_\mu ^{\mathrm{hlbl}}\),

$$\begin{aligned} a_\mu ^{\mathrm{disc}}= & {} -\frac{36}{81}Z_{\mathrm{V}}^4\frac{m_\mu e^6}{3}2\pi ^2\int d|y| |y|^3 \int d^4x \nonumber \\&\biggl \langle (\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y) +\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\nu \mu \lambda }(y,x)) {\hat{\Pi }}_{\mu \lambda }(x,0) \nonumber \\&\times \int d^4z \,z_\rho {\hat{\Pi }}_{\sigma \nu }(z,y) \nonumber \\&+\mathcal {{\bar{L}}}^{(\Lambda )}_{[\rho ,\sigma ];\mu \nu \lambda }(x,y){\hat{\Pi }}_{\mu \nu } (x,y) \int d^4z \, z_\rho {\hat{\Pi }}_{\sigma \lambda }(z,0) \biggr \rangle _U.\nonumber \\ \end{aligned}$$
(15)
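The effect of the VeV subtraction in Eq. (14) can be illustrated with a toy example. The numbers below are invented stand-ins for two quark-loop observables measured on five gauge configurations, and `subtract_vev` is a hypothetical helper. The sketch only demonstrates the identity \(\langle {\hat{A}}{\hat{B}}\rangle = \langle AB\rangle - \langle A\rangle \langle B\rangle \), which is why only correlated fluctuations of the two loops survive.

```python
from statistics import fmean

def subtract_vev(samples):
    """Return per-configuration values with the ensemble average removed, as in Eq. (14)."""
    mean = fmean(samples)
    return [s - mean for s in samples]

# Made-up numbers standing in for two loop observables on five gauge configurations.
loop_a = [1.02, 0.97, 1.05, 0.99, 1.01]
loop_b = [2.10, 1.95, 2.20, 2.00, 2.05]

hat_a = subtract_vev(loop_a)
hat_b = subtract_vev(loop_b)

# <hat_a * hat_b> equals <a b> - <a><b>: only the correlated fluctuation survives.
lhs = fmean([x * y for x, y in zip(hat_a, hat_b)])
rhs = fmean([x * y for x, y in zip(loop_a, loop_b)]) - fmean(loop_a) * fmean(loop_b)
print(lhs, rhs)
```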

In our implementation we compute a grid of point-source propagators from some \(t_{\text {min}}\) to \(t_{\text {max}}\), wrapping completely around the spatial directions and alternating between \((2n,2n,2n,6m+t_\text {min}/a)\) and \((2n+1,2n+1,2n+1,6m+3+t_\text {min}/a)\), where n and m take values between 0 and \(N_i/2-1\), and between 0 and \((t_\text {max}-t_{\text {min}})/(6a)\), respectively.

Fig. 3

Illustration of the grid of propagators used in the toy example given in the text. Filled circles indicate sites that lie on the lattice, gray circles are ones accessible through periodicity and the dashed lines indicate lines along our preferred directions (1, 1, 1, 3) and (2, 2, 2, 0)

As a toy example, for a \(4^3\times 12\) periodic lattice we would invert point-source propagators at \((0,0,0,0),(2,2,2,0),(1,1,1,3),(3,3,3,3),(0,0,0,6),(2,2,2,6), (1,1,1,9),\) and (3, 3, 3, 9). This is illustrated in Fig. 3. We then compute the two-point contraction of Eq. (13) for each of these sources and keep it in memory. We then perform a doubly-nested loop over all of the source positions, where we only integrate combinations of two-point contractions where our y lies in a direction we like, e.g. along the directions (1, 1, 1, 3) and (2, 2, 2, 0). The results of the z and x integrations are saved to disk, the VeV is subtracted, and the final contraction of indices is performed offline in the analysis.
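The source grid of the toy example can be reproduced with a few lines. This is a sketch under stated assumptions: the function `source_grid` is hypothetical, all arguments are in lattice units, and \(t_\text {min}/a=0\), \(t_\text {max}/a=6\) are chosen so that m runs over 0 and 1 as the listed sources require.

```python
def source_grid(n_spatial, t_min, t_max):
    """Sources alternating between (2n,2n,2n,6m+t_min) and (2n+1,2n+1,2n+1,6m+3+t_min).

    All arguments are in lattice units; n runs over 0..n_spatial/2-1 and
    m over 0..(t_max-t_min)/6, following the description in the text.
    """
    sources = []
    for m in range((t_max - t_min) // 6 + 1):
        for n in range(n_spatial // 2):
            sources.append((2 * n, 2 * n, 2 * n, 6 * m + t_min))
        for n in range(n_spatial // 2):
            sources.append((2 * n + 1, 2 * n + 1, 2 * n + 1, 6 * m + 3 + t_min))
    return sources

# Toy example from the text: 4^3 x 12 lattice. t_min = 0 and t_max = 6 are
# assumed here so that the last time slice used is 6*1 + 3 = 9.
grid = source_grid(n_spatial=4, t_min=0, t_max=6)
print(grid)
```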

Table 2 Lattice ensembles with \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetry used in this work. Each ensemble is parametrized by the gauge coupling parameter \(\beta \), the quark hopping parameter \(\kappa \), the lattice size, and the temporal boundary condition. The lattice spacing a was determined in Ref. [22]

The benefit of such a method over calculating each separate |y| individually is the quadratic growth of equivalent combinations of |y| that are available as the number of source fields grows. For our \(4^3\times 12\) example we have 6 values of equivalent |y| per source position. It would be impractical to perform this calculation without exploiting this fact. We find it is beneficial to truncate the x-integral in our set-up to avoid negligible, noisy contributions at large distances. As our kernels are computed on-the-fly, this also saves computer time, since it reduces the number of QED kernel calls. We truncate the integral over x to points that lie within a maximum distance [\((r/a)^2=81,81,121,169\) for our coarsest to finest ensembles respectively] from the origin or from y, while we do not truncate the z-integral. Although these truncations differ in physical volume, we tested smaller and larger values of \((r/a)^2\) on a subset of our data for the ensembles H101 and H200; even smaller values of \((r/a)^2\) than used here, for example 49 and 64 for the ensemble H101, were found to give consistent results.

To reiterate, in our full calculation we do not perform the integrals for all possible y-vectors, as there are many that we expect to have bad finite-volume or discretization effects. For instance, in the toy example we could calculate f(|y|) for the y-vector \(y/a=(0,0,0,6)\), but this would have significant cut-off effects. Typically, we filter out about \(80\%\) of the possible y-vectors and keep only (the modulus of) those parallel to (1, 1, 1, 3), (2, 2, 2, 0), and occasionally (1, 1, 1, 1).

3 Lattice parameters and properties of \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric QCD

Calculations have been performed on lattice ensembles provided by the Coordinated Lattice Simulations (CLS) initiative [21], which have been generated using three flavors of non-perturbatively O(a)-improved Wilson-clover fermions and with the tree-level \(O(a^2)\)-improved Lüscher–Weisz (Symanzik) gauge action. In particular, we consider only those with \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetry. On these ensembles, the mass of the octet of light pseudoscalar mesons is approximately 420 MeV. These ensembles are summarized in Table 2 and in Fig. 4: there are four lattice spacings, as well as two pairs of ensembles that differ only by their volume.

Fig. 4

Spatial extent L and lattice spacing a for the ensembles with \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetry used in this work

3.1 The \(\text {SU}(3)_{\mathrm{f}}\) meson spectrum

At long distances, the leading contributions to the four-point correlator emanate from the lowest-lying mesons. Clearly, the degenerate \(\pi ^0\) and \(\eta \)-pole contributions dominate at the longest distances. However, at intermediate distances the contributions from heavier states, including resonances in the two-pseudoscalar channel, may also play a significant role, especially since in our calculation at the \(\hbox {SU}(3)_{\mathrm{f}}\)-point the pseudoscalar mesons are not appreciably light compared to the other mesons. Therefore, we present results from a limited spectroscopic study of the low-lying meson spectrum in all relevant channels with total angular momentum \(J\le 1\).

Fig. 5

The meson spectrum of ensemble H101. Effective masses are shown for each channel in the flavor octet (left) and singlet (center) sectors, along with bands indicating the plateau fits. Note that in many cases (in particular the octet pseudoscalar and vector channels), the error bar is smaller than the plotted symbol. Fitted masses are shown in the right panel; the outer error bars include an estimate of systematic uncertainty obtained by shifting the plateau fit range. Horizontal dashed lines indicate thresholds at twice and three times the octet pseudoscalar mass

Only charge-conjugation-even (C-even) mesons can couple to two electromagnetic currents and thus contribute in exchange diagrams to the four-point function. However, from the Wick-contraction structure, one can view method 2 as using charged vector currents (see Appendix A) and it is possible to couple a C-odd state to two of them. Therefore, the integrand for method 2 can receive contributions from C-odd mesons such as the rho [19]. For that reason, we also inspect the spectrum of C-odd mesons, even if their contributions to \(a_\mu ^{\mathrm{hlbl}}\) must vanish in the infinite-volume limit once all integrals are performed.

Our dedicated spectroscopy calculation is performed on the ensemble H101. This made use of the distillation framework [23] (for the connected hadron two-point functions) and its stochastic formulation [24], which was essential for efficiently computing the disconnected hadron two-point functions needed in the singlet sector. However, we have only used smeared quark bilinears as interpolating operators, and the lack of non-local multimeson-like interpolators means that this calculation should not generally be considered as a robust determination of the spectrum. This analysis precludes the use of finite-volume quantization conditions; therefore, only the approximate locations of resonances can be found, provided that they are narrow. Nevertheless, since we compute diagonal correlators, the effective masses taken at any Euclidean time provide an upper bound on the ground-state energy in a given channel. This observation is particularly useful in the flavor-singlet \(0^{++}\) channel, which turns out to admit a stable ground-state meson. Such a stable scalar meson has been found previously [25] in a lattice calculation at a similar pion mass, though not at an \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point.

Table 3 Estimated meson masses in MeV from ensemble H101. The first uncertainty is statistical and the second was determined by varying the plateau fit range. The scale-setting uncertainty has not been included

Results are shown in Fig. 5 and summarized in Table 3. Note that since the signal in the flavor-singlet sector is much worse, the plateau fits had to be done at relatively short time separations, so that the shown uncertainty on the mass may be an underestimate. In addition, the plateau for the flavor-octet scalar is relatively poor, which might be due to its coupling to two octet pseudoscalars in S-wave and the plateau’s proximity to the corresponding threshold. In the case of a mis-identified plateau, the true ground-state energy would be lower, since the effective masses shown here are expected to approach the ground-state energy from above.
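The statement that effective masses of a diagonal correlator approach the ground-state energy from above can be illustrated with a toy two-state correlator. The sketch assumes the common logarithmic definition \(a\,m_{\mathrm{eff}}(t)=\ln [C(t)/C(t+a)]\), which is not spelled out in the text, and uses invented energies and positive amplitudes in lattice units.

```python
import math

def effective_mass(corr):
    """Log effective mass a*m_eff(t) = ln[C(t) / C(t+a)] for a list of correlator values."""
    return [math.log(corr[t] / corr[t + 1]) for t in range(len(corr) - 1)]

# Toy two-state correlator with positive weights, as for a diagonal correlator.
# Energies and amplitudes are invented, in lattice units.
E0, E1 = 0.20, 0.45
A0, A1 = 1.0, 0.6
corr = [A0 * math.exp(-E0 * t) + A1 * math.exp(-E1 * t) for t in range(25)]

m_eff = effective_mass(corr)
print(m_eff[0], m_eff[-1], E0)
```

In this toy case every effective mass lies above `E0` and the sequence decreases monotonically, so a plateau fitted too early overestimates the ground-state energy.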

For the flavor-octet sector, after the \(\pi ^0\), the next-longest-distance meson-exchange contributions in the integrand for \(a_\mu ^{\mathrm{hlbl}}\) come from the \(a_0\) and (for the method 2 integrand) the \(\rho \), both of which sit near \(2m_\pi \). For the flavor-singlet sector, the three corresponding mesons are also the lightest, although the ordering is different, with the \(f_0\) (or \(\sigma \)) being a bound state with mass near 680 MeV and the \(\eta '\) mass sitting higher, somewhere near \(2m_\pi \). For the disconnected diagrams, which receive the difference between contributions from the exchanges of singlet and octet mesons (see Sect. 4.1), the pseudoscalars will provide the longest-distance contribution but we should also expect a significant contribution from scalars. The difference between the octet and singlet vector meson masses is very small; if the same holds for their form factors, then their combined contribution to the integrand will be negligible.

3.2 Pion-pole and \(\eta \)-pole contributions to \(a_\mu ^{\mathrm{hlbl}}\)

In Ref. [12] a model-independent parametrization of the pion TFF has been obtained on the same set of lattice ensembles as used in this work. These results can be used to compute the pion-pole contribution, \(a_\mu ^{\mathrm {hlbl};\pi ^0} \), at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point in the continuum limit. This result will be used in Sect. 7.3 to obtain a rough estimate of \(a_\mu ^{\mathrm{hlbl}}\) at the physical pion mass. More importantly, in Sects. 5.1, 5.2, and 6, a vector meson dominance (VMD) parametrization of the TFF will be used to estimate the finite-size corrections to our lattice data at the symmetric point. For this purpose, we have performed a new fit to the data from [12] assuming a VMD parametrization, where we have restricted the fit to the singly-virtual case where the model provides a good description of our data. The fit parameters used in this work are collected in Table 4.

In [12], two strategies have been used to extract the pion TFF, and the corresponding results for the pion-pole contribution are displayed in Fig. 6 for two different discretizations of the 3-point correlation function (see [12] for details). In the first strategy, the TFF was computed on each lattice ensemble separately. This allowed us to determine the pion-pole contribution for different values of the pion mass and lattice spacing. The physical value was then obtained by a combined chiral-continuum extrapolation of \(a_\mu ^{\mathrm {hlbl};\pi ^0} \). We have repeated this analysis but now restrict the fit to only the ensembles included in this study. Here we use the pion mass of the given ensemble (instead of the physical one as done in [12]) in the weight functions that appear in Eq. (74) of [12]. This leads to the result (1) of Fig. 6.

In the second strategy, the pion TFF was directly extrapolated to the physical point using a global fit that includes several ensembles both at and away from the \(\hbox {SU}(3)_{\mathrm{f}}\)-point. From the resulting fit parameters, we can extract the pion TFF in the continuum limit at a pion mass of 420 MeV. Using this result in Eq. (74) of [12], we obtain the grey point of Fig. 6, which we find to be in very good agreement with the first estimate.

Table 4 The VMD fit parameters of the \(\pi ^0\) transition form factor

The second strategy has the advantage of using an expanded set of ensembles (15 in total) to determine the TFF. It yields

$$\begin{aligned} a_\mu ^{\mathrm {hlbl};\pi ^0} =21.0(1.2) \times 10^{-11} \quad (\hbox {SU}(3)_{\mathrm{f}}\, \text {point}) \end{aligned}$$
(16)

at the \(\hbox {SU}(3)_{\mathrm{f}}\)-point which is to be compared to the physical-pion value

$$\begin{aligned} a_\mu ^{\mathrm {hlbl};\pi ^0} = 59.7(3.6)\times 10^{-11} \quad (\text {physical point}). \end{aligned}$$
(17)

Unsurprisingly, we observe a strong dependence on the pion mass. The smaller pion contribution in Eq. (16) compared with (17) is due roughly in equal parts to the heavier pion mass and to the reduced coupling to photons, as can be seen by comparing the entries in Table 4 to the physical value, \({{\mathcal {F}}}_{\pi ^0\gamma \gamma }(0,0)\approx 0.274\,\mathrm{GeV}^{-1}\).

Fig. 6

The pion-pole contribution to \(a_\mu ^{\mathrm{hlbl}}\) at the \(\hbox {SU(3)}_{\mathrm{f}}\) symmetric point with a pion mass of 420 MeV. Blue and red points correspond to two different discretizations of the 3-pt correlation function. The results (1) and (2) are explained in the main text

At the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point, the \(\eta \) is mass-degenerate with the pion, \(m_\eta \simeq 420\,\)MeV, and contributes exactly 1/3 of the \(\pi ^0\) exchange to \(a_\mu ^{\mathrm{hlbl}}\), i.e.

$$\begin{aligned} a_\mu ^{\mathrm{hlbl;\eta }}=(7.0\pm 0.4)\times 10^{-11} \quad (\hbox {SU}(3)_{\mathrm{f}}\text {-point}). \end{aligned}$$
(18)

A lattice calculation at the physical point for this contribution is not yet available, but the most recent estimate (see footnote 4), which comes from Canterbury approximants [5, 27], is \(a_\mu ^{\mathrm{hlbl};\eta ,\mathrm{phys}}=16.3\times 10^{-11}\). This comparison shows that at the physical point the \(\eta \) gives a much larger contribution to \(a_\mu ^{\mathrm{hlbl}}\) than at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point, in spite of being heavier. This can be traced back to its much larger coupling to two photons,

$$\begin{aligned} {{\mathcal {F}}}_{\eta \gamma \gamma }(0,0)[\mathrm{GeV}^{-1}] \simeq \left\{ \begin{array}{ll} 0.12 &{} \hbox {SU}(3)_{\mathrm{f}}\, \text {point}, \\ 0.27 &{} \text {physical point} . \end{array}\right. \end{aligned}$$
(19)
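The numbers quoted in Eqs. (16)–(19) can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that the pole contribution scales with the square of the two-photon normalization and that, at the symmetric point, the \(\pi ^0\) normalization is \(\sqrt{3}\) times that of the \(\eta \) (which is what the factor 1/3 implies for degenerate masses); the label `other_factor` for the remaining suppression is ours, not the paper's.

```python
import math

# Values quoted in the text (a_mu in units of 1e-11, couplings in GeV^-1).
a_pi_su3, err_pi_su3 = 21.0, 1.2   # Eq. (16)
a_pi_phys = 59.7                   # Eq. (17)
F_eta_su3 = 0.12                   # Eq. (19), SU(3)_f point
F_pi_phys = 0.274                  # physical pi0 normalization

# Eta pole at the symmetric point: one third of the pi0 pole, cf. Eq. (18).
a_eta_su3 = a_pi_su3 / 3.0
err_eta_su3 = err_pi_su3 / 3.0

# Assumption: F_pi = sqrt(3) * F_eta at the symmetric point.
F_pi_su3 = math.sqrt(3.0) * F_eta_su3

# Split the total suppression into a coupling part and "everything else"
# (heavier pion mass, change in the form-factor shape).
total_ratio = a_pi_su3 / a_pi_phys
coupling_factor = (F_pi_su3 / F_pi_phys) ** 2
other_factor = total_ratio / coupling_factor

print(f"a_eta(SU3) = {a_eta_su3:.1f} +/- {err_eta_su3:.1f}")
print(f"total ratio = {total_ratio:.2f}")
print(f"coupling factor = {coupling_factor:.2f}, remainder = {other_factor:.2f}")
```

Under these assumptions the two factors come out comparable in size, in line with the statement that the suppression is due roughly in equal parts to the reduced coupling and to the heavier pion mass.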

4 Hadronic models vs. the lattice integrand f(|y|)

In Sect. 4.1, we state the theoretical predictions for the integrand f(|y|) corresponding to the quark-connected diagrams in method 1 and method 2, as well as for the disconnected diagram. Then, in Sect. 4.2, we compare the lattice |y|-integrand obtained by method 1 for the quark-connected contribution to hadronic model predictions. For this purpose, we will focus on the ensembles N202 and H200. N202 has the largest physical volume (\(L=3.08\) fm) of all ensembles considered in this work, and H200 (with \(L=2.06\) fm) differs from it only by its smaller volume, so that the pair allows us to test our understanding of finite-volume effects. Finally, a comparison of the theory predictions for the integrand of the quark-disconnected contributions to the corresponding lattice data is made in Sect. 4.3.

4.1 Predictions for the integrand

In order to gain some insight into the various contributions to the quantity \(a_\mu ^{\mathrm{hlbl}}\), we will compare predictions for the pseudoscalar exchanges as well as the contribution from a constituent-quark loop, and a charged-pseudoscalar loop, to the lattice data. Here, we provide some details as to how these predictions are obtained. Expressions for the amplitudes \(i{\widehat{\Pi }}_{\rho ;\mu \nu \lambda \sigma }\) calculated with a fermion loop, a charged-pion loop, or with the pseudoscalar exchange can be found in [17, 28]. In QCD with exact \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetry, the \(\eta \) contribution to \(a_\mu ^{\mathrm{hlbl}}\) is always one third of the contribution of the \(\pi ^0\).

The flavor structure of single-meson exchanges in the different Wick-contraction topologies of four-point functions was discussed in detail in Ref. [29], including the case of \(N_{\mathrm{f}}=3\) QCD. The quark-connected diagrams receive contributions only from flavor-octet mesons, enhanced by a factor of three (see footnote 5). This is compensated by the quark-disconnected diagrams, which contain the differences between the flavor-singlet and twice the flavor-octet meson contributions. In the large-\(N_c\) limit, the singlet and octet contributions to the disconnected diagrams will cancel as their spectra become the same. For QCD, this degeneracy is most strongly broken in the pseudoscalar sector, where the octet’s mass is far below the singlet’s.

To interpret the integrand in method 2, one needs a mapping of individual quark-level Wick contractions onto meson-exchange diagrams [19] (see Appendix A for a derivation based on partially quenched chiral perturbation theory). It turns out that \(\widetilde{\Pi }^{(1)}_{\mu \nu \sigma \lambda }(x,y,z)\) does not contain the meson-exchange diagram in which the \(\pi ^0\) and \(\eta \) propagate between the pair of vertices (0, y) and (x, z). Also, the normalization of the two other \((\pi ^0,\eta )\)-exchange diagrams is such that \({\widetilde{\Pi }}^{\mathrm{conn}}_{\mu \nu \sigma \lambda }\) contains the same \((\pi ^0,\eta )\) contribution as \({\widetilde{\Pi }}_{\mu \nu \sigma \lambda }\), enhanced (in the present \(\hbox {SU(3)}_{\mathrm{f}}\) case) by the charge factor three.

In addition, we need the mapping of individual quark-level disconnected diagrams onto meson-exchange diagrams. Here it turns out that there is a one-to-one mapping between a given quark-level diagram and a meson-exchange diagram. For instance, take a quark-level diagram, consisting of two quark loops each containing two vectorial vertices; each quark loop thus defines a pair of vertices. Such a diagram is in one-to-one correspondence with the diagram in which the meson is exchanged between the two pairs of vertices. For octet mesons, the latter diagram has a weight of \(-2\) relative to its normalization in the full HLbL amplitude. For singlet mesons, this relative weight factor is simply unity.

The short-distance contribution to \(a_\mu ^{\mathrm{hlbl}}\) is sometimes modelled by a constituent-quark loop. This corresponds to an effective degree of freedom, and the mass assigned to the ‘constituent quark’ is typically on the order of 300 MeV [32]. In the following sub-section, we will address the question “to what extent does such a contribution describe the short-distance part of the |y| integrand?”. In this case, the Wick contractions and weight factors of the constituent quark simply correspond to those of the fundamental quark degrees of freedom.

We have computed the charged pion and kaon loop contributions in the framework of scalar QED [17, 19]. The contribution of these loops to the set of quark-connected contractions and their contribution to the set of quark-disconnected contractions add up coherently, the latter being twice as large as the former. This result can again be derived in partially quenched chiral perturbation theory. While the \((\pi ^0,\eta )\) exchange contribution to the full \(a_\mu ^{\mathrm{hlbl}}\) is three times smaller than its contribution to \(a_\mu ^{\mathrm{conn}}\), the charged-pseudoscalar loop contribution to the full \(a_\mu ^{\mathrm{hlbl}}\) is three times larger than its contribution to \(a_\mu ^{\mathrm{conn}}\). The charged-pseudoscalar loop’s contribution might therefore seem negligible in comparison to the quark-connected integrand, but it need not be negligible in the full integrand.
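As a bookkeeping aid, the weight factors stated in this subsection can be collected in one place. This is only our restatement of the relations given in the text for the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric case; the vertical-bar notation, which restricts each quantity to the contribution of one hadronic mechanism, is introduced here for illustration:

$$\begin{aligned} a_\mu ^{\mathrm{conn}}\big |_{(\pi ^0,\eta )}&= 3\, a_\mu ^{\mathrm{hlbl}}\big |_{(\pi ^0,\eta )},&a_\mu ^{\mathrm{disc}}\big |_{(\pi ^0,\eta )}&= -2\, a_\mu ^{\mathrm{hlbl}}\big |_{(\pi ^0,\eta )}, \\ a_\mu ^{\mathrm{conn}}\big |_{\mathrm{loop}}&= \tfrac{1}{3}\, a_\mu ^{\mathrm{hlbl}}\big |_{\mathrm{loop}},&a_\mu ^{\mathrm{disc}}\big |_{\mathrm{loop}}&= \tfrac{2}{3}\, a_\mu ^{\mathrm{hlbl}}\big |_{\mathrm{loop}}. \end{aligned}$$

In both rows the connected and disconnected pieces add up to the full contribution, \(3-2=1\) and \(\tfrac{1}{3}+\tfrac{2}{3}=1\). Relative to the pseudoscalar exchange, the charged-pseudoscalar loop is therefore suppressed by a factor of nine in the connected integrand compared with its relative size in the full integrand.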

4.2 Lattice connected contribution

Figure 7 displays the integrand obtained with method 1 and \(\Lambda =0.16\). It is compared to the integrand for the exchange of the \((\pi ^0,\eta )\) mesons with a VMD TFF and using the parameters of Table 4. Beyond 1.5 fm, the prediction is consistent with the lattice data, albeit within large relative errors. Between 0.8 fm and 1.4 fm, the lattice data lie noticeably below the pseudoscalar-octet exchange prediction.

Fig. 7

Integrand for the connected contribution on ensemble N202 using method 1 with \(\Lambda = 0.16\). The lattice data use a point-split current at x. The integrand is compared to the prediction for the exchange of the \(\pi ^0\) and \(\eta \) mesons with a VMD transition form factor, which is expected to provide a good approximation to the tail. In addition, an attempt to describe the short-distance contribution with a constituent-quark loop with a quark mass of 350 MeV is made

Fig. 8

Results for the ensemble N202 using method 1 with \(\Lambda = 0.16\). Black points use a local current at the site x in Eq. (6) (which is our default approach) and blue points use a conserved current at x. Left: integrand f(|y|) as a function of |y|. Right: integrated value of \(a_{\mu }^{\mathrm{conn}}\) as a function of \(|y|_{\mathrm{max}}\)

At distances up to 0.8 fm, one would certainly not expect the pseudoscalar-octet exchange to saturate the integrand. We have attempted to model the higher-energy contributions using a constituent-quark loop, displayed in Fig. 7. For a constituent-quark mass of 350 MeV, the sum of this contribution and the pseudoscalar-octet exchange provides a good description of the maximum height of the integrand. At distances \(|y|\lesssim 0.4\) fm, one must expect large cutoff effects on the lattice integrand, as the separation of sources becomes comparable to our lattice spacing. This interpretation is confirmed by comparing the integrand to the one obtained using exclusively the local current, as in the left panel of Fig. 8, where a sizeable difference is visible up to about 0.4 fm. An interesting observation is that adding the constituent-quark loop to the pseudoscalar-octet exchange does not improve the agreement with the lattice data in the region \(0.8\mathrm{\,fm}<|y|<1.4\mathrm{\,fm}\). A clear excess remains in the prediction; we currently do not have a compelling explanation for this excess. The exchange of the lightest scalar-octet mesons (\(a_0\) type) would have the right sign to explain the effect, since scalar-meson exchanges are known to contribute negatively to \(a_\mu ^{\mathrm{hlbl}}\). In the future, a calculation of the scalar contribution along the lines of [33] would be worthwhile to find out whether it accounts for this missing effect. The charged-pion loop is also expected to contribute negatively and we have studied this contribution in the scalar-QED framework in [19]. We found the integrand to be negligible compared with the \(\pi ^0\) contribution beyond \(|y|=0.6\,\)fm, and introducing a vector form factor for the charged pion would likely only reduce the integrand further. It thus remains an open problem to understand the physics underlying the integrand for |y| around 1 fm.

The effect of the finite volume on the lattice integrand is illustrated in Fig. 9. A clear effect is seen between the comparatively ‘small’ ensemble H200 and the ‘large’ N202, the former integrand lying below the latter. This finite-volume dependence matches in sign and typical size the volume dependence of the pseudoscalar-octet exchange contribution, as seen in the figure.

4.3 Lattice disconnected contribution

Figure 12 (later in the text) displays the disconnected integrand for several gauge ensembles. It is compared to the prediction of the \((\pi ^0,\eta )\) exchange, including its appropriate weight factor of \(-2\) for the disconnected contribution, both in finite and infinite volume. The finite-volume effect predicted from the \((\pi ^0,\eta )\) exchange calculation is very small, and the lattice data from ensembles N202 and H200 confirm this expectation, at least at distances \(|y|<0.9\,\)fm, where the statistical errors allow for a meaningful comparison.

Fig. 9

Study of FSEs for method 1. Left: integrand for the ensembles H200 and N202 with \(m_{\pi } L = 4.4\) and 6.4 respectively. Right: value of \(a_{\mu }^{\mathrm{conn}}\) for the ensemble N202 using the FSE correction prescription described in the text, as a function of \(|y|_{\mathrm{cut}}\)

The main difference between the disconnected and the connected contributions is that for the disconnected contribution the \((\pi ^0,\eta )\) exchange already provides a decent description of the lattice data for |y| below 1 fm; in other words, we do not observe a large short-distance contribution. The \(\eta '\) and the \(\sigma \) mesons being singlets, they contribute to the disconnected diagrams with the same weight factor as they would to the full HLbL amplitude [29, 31]. In order to estimate the typical size of these heavy-meson contributions, we have computed the \(\eta '\) contribution under the following assumptions. The \(\eta '\) mass was set to 982 MeV, a value close to the result of our calculation on ensemble H101. Its coupling to two photons, \(\mathcal{F}_{\eta '\gamma \gamma }\), was assumed to be equal to its value at physical quark masses. The virtuality dependence of the transition form factor was modelled with a VMD ansatz, with vector mass 952 MeV. Under these assumptions, the contribution is positive and sizeable (compared to the \((\pi ^0,\eta )\) exchange) up to \(|y|=1.5\,\)fm. In addition, we expect a significant contribution from the stable \(\sigma \) meson, whose mass we have found to lie well below the \(\pi \pi \) threshold. As a scalar, the \(\sigma \) would contribute negatively and thus compensate to some extent the \(\eta '\) contribution. Again, a dedicated calculation in the framework of [33] would certainly be worthwhile. The estimate for the \(\eta '\) contribution displayed in Fig. 12 is only meant to be representative of one particular meson-exchange contribution, with other cancellations being expected.

5 Results for the quark-connected contribution

5.1 Results from method 1

The lattice results for the quark-connected contribution to \(a_\mu ^{\mathrm{hlbl}}\) using method 1 have been generated along the direction \(y/a \propto (1,1,1,1)\). We have used two different discretizations of the four-point correlation function: the vector current located at the site x is either local or point-split (conserved), while the three other currents are always local. Our results are summarized in Table 5 and the integrand for the ensemble N202 (also presented in Sect. 4.2, Fig. 7) is shown in Fig. 8. The signal-to-noise ratio clearly deteriorates rapidly at large distances and we observe slightly better statistical precision when using a conserved vector current at x. Both discretizations give similar results, suggesting that discretization effects are small. In the end we will quote a final result with the fully-local discretization for a direct comparison with the results obtained using method 2.

Table 5 Results for the connected contribution using method 1 with four local vector currents using the decomposition given by Eq. (20). A \(25\%\) systematic uncertainty is assigned to the total correction

Although all the ensembles used in this work satisfy \(m_{\pi } L > 4\), we still expect significant finite-size effects (FSEs) from the pseudoscalar-pole contribution: it is the dominant contribution in model estimates of hadronic light-by-light scattering [34] and is a long-range phenomenon. Even with heavy pions, \(m_{\pi } \approx 400~\)MeV, the tail extends beyond \(|y|=2.5~\text {fm}\) [17]. A comparison of the integrand for the ensembles H200 and N202, which only differ by their physical volumes, is depicted in the left panel of Fig. 9. Assuming a VMD model for the TFF, the pseudoscalar contribution can be computed in both finite and infinite volume. For more information on the calculation of the TFF we refer the reader back to Sect. 3.2; the parameters we use for modelling our data are summarized in Table 4 in that section.

To obtain our final estimate, the lattice data are integrated up to \(|y|_{\mathrm{cut}}\) where the integrand is compatible with zero. For \(|y|>|y|_{\mathrm{cut}}\), the tail is approximated by the pseudoscalar-pole contribution in infinite volume. Finally, for \(|y|<|y|_{\mathrm{cut}}\), the FSEs are estimated as the difference between the pseudoscalar-pole’s contribution computed in finite and infinite volume:

$$\begin{aligned} a_{\mu }^{\mathrm{conn}} = a_{\mu }^{\mathrm{data}} + a_{\mu }^{\mathrm{tail}} + a_{\mu }^{\mathrm{FSE}}. \end{aligned}$$
(20)

A systematic error of 25% is attributed to both the tail extension and the FSE correction. Since the same VMD model is used in both cases, we treat these corrections as being fully correlated.
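The decomposition of Eq. (20) and the treatment of the model systematic can be sketched as follows. This is a minimal illustration, not the analysis code. The function name and the input numbers are hypothetical placeholders (they are not entries of Table 5), and adding the statistical and systematic errors in quadrature is our assumption for the sake of the example.

```python
import math

def corrected_connected(a_data, stat_err, a_tail, a_fse, rel_sys=0.25):
    """Combine lattice data, tail extension and FSE correction as in Eq. (20).

    The tail and FSE corrections come from the same VMD model, so they are
    treated as fully correlated: the relative systematic is applied to their
    sum rather than to each term separately.
    """
    central = a_data + a_tail + a_fse
    syst_err = rel_sys * abs(a_tail + a_fse)
    return central, stat_err, syst_err

# Placeholder inputs in units of 1e-11 (hypothetical, for illustration only).
central, stat_err, syst_err = corrected_connected(
    a_data=80.0, stat_err=5.0, a_tail=4.0, a_fse=12.0
)

# Assumption: statistical and systematic errors are combined in quadrature.
total_err = math.hypot(stat_err, syst_err)
print(f"a_conn = {central:.1f} (stat {stat_err:.1f}) (syst {syst_err:.1f}), total {total_err:.1f}")
```

Because the 25% is applied to the sum of the two corrections, the systematic error grows linearly with the size of the total model correction, which is why it dominates on the smaller volumes.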

The value of \(a_{\mu }^{\mathrm{conn}}\) as a function of \(|y|_{\mathrm{cut}}\) is shown in the right panel of Fig. 9; we observe a nice plateau for values \(|y|_{\mathrm{cut}}> 1.2~\)fm, suggesting that our systematic error estimate is quite conservative. Estimates of the finite-size corrections for each ensemble are summarized in Table 5 with their corresponding values of \(|y|_{\mathrm{cut}}\). In particular, we note that the systematic error on the FSEs is always larger than the statistical precision, except for our largest ensemble N202.

For the ensembles U103 and H200, with \(m_{\pi } L < 5\), we see very large FSE corrections, of the order of 50%. After these corrections the values of \(a_{\mu }^{\mathrm{conn}}\) do become compatible with the results at \(m_\pi L>5\) within about 1.5 \(\sigma \). We also observe a systematic over-estimate in comparison to the larger-volume results, and when it comes to our final continuum extrapolation we will omit these results.

O(a)-improvement is not implemented for the vector currents used in this work, but our experience with other observables involving electromagnetic currents, such as the LO HVP [35] and the pion TFF [12], suggests the remaining O(a) terms are small compared to the quadratic contribution. A linear fit in \(a^2\) leads to \(a_{\mu }^{\mathrm{conn, M1}} = 104.1(6.9) \times 10^{-11}\) with \(\chi ^2/\mathrm {d.o.f.} = 0.4\). To estimate the systematic error associated with this continuum extrapolation, we perform a constant fit which excludes the coarsest lattice spacing and obtain the slightly smaller value \(a_{\mu }^{\mathrm{conn, M1}} = 96.7(4.7) \times 10^{-11}\) with \(\chi ^2/\mathrm {d.o.f.} = 0.3\). Finally, we also tried a linear fit in the lattice spacing which leads to \(a_{\mu }^{\mathrm{conn, M1}} = 113.8(10.5) \times 10^{-11}\). The results of all these fits are shown in Fig. 10.
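A linear fit in \(a^2\) of the kind described above can be sketched with a small weighted least-squares routine. The example is only meant to illustrate the fit form \(c_0 + c_1 a^2\). The lattice spacings, central values and errors are synthetic (generated exactly on a straight line in \(a^2\)) and the function name is hypothetical; uncorrelated errors are assumed.

```python
import numpy as np

def fit_linear_in_a2(a, y, dy):
    """Weighted least-squares fit of y = c0 + c1 * a^2 with uncorrelated errors.

    Returns the parameters (c0, c1), their errors, and chi^2 per d.o.f.
    """
    x = np.asarray(a, dtype=float) ** 2
    y = np.asarray(y, dtype=float)
    dy = np.asarray(dy, dtype=float)
    # Rows of the design matrix are scaled by 1/sigma_i.
    design = np.column_stack([np.ones_like(x), x]) / dy[:, None]
    target = y / dy
    cov = np.linalg.inv(design.T @ design)
    params = cov @ design.T @ target
    chi2 = float(np.sum((design @ params - target) ** 2))
    dof = len(y) - 2
    return params, np.sqrt(np.diag(cov)), chi2 / dof

# Synthetic data lying exactly on 100 + 500 * a^2 (NOT the values of Table 5).
a_fm = [0.050, 0.065, 0.075, 0.085]
y_syn = [100.0 + 500.0 * a**2 for a in a_fm]
dy_syn = [5.0, 4.0, 4.0, 3.0]

params, errors, chi2_dof = fit_linear_in_a2(a_fm, y_syn, dy_syn)
print(f"continuum value c0 = {params[0]:.2f} +/- {errors[0]:.2f}")
print(f"slope c1 = {params[1]:.1f}, chi2/dof = {chi2_dof:.2e}")
```

The constant fit and the fit linear in a used as cross-checks correspond to dropping the second column of the design matrix or replacing \(a^2\) by a, respectively.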

Fig. 10

Continuum extrapolation of the connected contribution using method 1. We perform a linear fit in \(a^2\) (red), a linear fit in a (green) or a constant fit (blue). The coarsest lattice spacing is excluded from the constant fit. Ensembles in grey have \(m_{\pi }L<5\) and are not included in the fits; they have been shifted for clarity

We quote our continuum-extrapolated value for the quark-connected contribution to \(a_\mu ^{\mathrm{hlbl}}\) using method 1 at the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point as

$$\begin{aligned} a_{\mu }^{\mathrm{conn, M1}} = 104.1(6.9)(3.7) \times 10^{-11} , \end{aligned}$$
(21)

where the first error includes both the statistical error and the systematic from the finite-size correction. The second error is an estimate of the continuum-limit extrapolation systematic error, taken as half the difference between the linear in \(a^2\) and constant fit ansätze.
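The second error in Eq. (21) follows directly from the two fit results quoted above, as the short check below shows. Only numbers from the text are used; the variable names are ours.

```python
# Continuum-limit fit results for method 1, in units of 1e-11 (from the text).
fit_linear_a2 = 104.1   # linear fit in a^2, central value of Eq. (21)
fit_constant = 96.7     # constant fit without the coarsest lattice spacing

# Systematic error: half the difference between the two fit ansaetze.
syst_continuum = 0.5 * abs(fit_linear_a2 - fit_constant)
print(f"continuum-extrapolation systematic = {syst_continuum:.1f}e-11")
```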

5.2 Results from method 2

In our measurement of the quark-connected contribution to \(a_\mu ^{\mathrm{hlbl}}\) using method 2 we focus on \(\Lambda =0.4\); this value was already indicated as being beneficial for the lepton loop as discussed in Appendix B. We performed measurements on all ensembles with \(\Lambda =0.0,0.4,0.8,\) and 1.0 and found that \(\Lambda =0.0\) approached its plateau too slowly, while \(\Lambda =1.0\) had a significant peak in the integrand at short distances but a more pronounced negative-valued tail. It appears that \(\Lambda =0.4\) is a near-optimal choice for our calculation.

Although not presented here, we also performed the contractions with conserved currents at x and/or z. We found that putting a conserved current at z yields a result roughly comparable (point-by-point) to the determination with just 4 local currents. Having a conserved current at x appears to introduce large, unwanted discretization effects. From a computational standpoint the calculation with four local currents is simpler and has no apparent downside, so that is what we will present from here on.

Table 6 Results for the connected contribution using method 2. Here \(a_{\mu }^{\mathrm{data}}\) corresponds to the value using lattice data up to some linearly-interpolated value of \(y = |y|_{\mathrm{cut}}\), chosen to minimize the total error of \(a_\mu ^{\mathrm{conn}}\). Again, a \(25\%\) systematic uncertainty is assigned to the total correction. In the last column we give the infinite-volume result

Fig. 11

Study of FSEs for method 2. Left: integrand for the ensembles H200 and N202 with \(m_\pi L=4.4\) and 6.4 respectively. Right: value of \(a_{\mu }^{\mathrm{conn}}\) for the ensemble N202 using the FSE correction prescription described in the text, as a function of \(|y|_{\mathrm{cut}}\)

The left plot of Fig. 11 illustrates the finite-size effect between ensembles N202 and H200; the discrepancy between these ensembles, which differ only by volume, is significant. The pion-pole prediction describes the tails of both data sets reasonably well at large enough values of |y|, although it fails to reproduce the position and height of the short-distance peak of the integrand. It is also worth noting how much less statistically precise the result of H200 is compared to N202 for a comparable number of measurements; this gives some indication that the statistical precision is linked to either \(m_\pi L\) or the physical volume. It is clear that, much like for method 1, there is a significant signal-to-noise problem for large values of |y|.

We perform the same finite-size correction procedure for method 2 as we did for method 1 above. On the right of Fig. 11 we show the stability of the FSE-corrected result under variation of the matching point \(|y|_{\mathrm{cut}}\) on ensemble N202. We find excellent stability for many different values of \(|y|_{\mathrm{cut}}\), and the matching point of 2.6 fm was chosen in an attempt to minimize the total error.

If we consider the results of Table 6 we see good agreement after finite-size correction between ensembles that only differ by volume (compare U103 with H101, and H200 with N202), which suggests that our finite-size correction procedure is sensible. Unlike for method 1 we see no reason to exclude these smaller volumes from our final extrapolation. An unusual result in our determination in method 2 comes from the ensemble N300, which lies below the trend of all our other data points. Since it is the finest ensemble, we have no reason to exclude it, even though this point will reduce the quality of our final extrapolations.

We hold off on presenting the continuum-limit extrapolation here as we will employ a combined extrapolation after the next section (see Sect. 7.1, Fig. 14). However, we will quote the result of the connected continuum extrapolation,

$$\begin{aligned} a_\mu ^{\text {conn,M2}} = 98.9(2.5)\times 10^{-11}. \end{aligned}$$
(22)

The quoted error is a combination of the statistical and 100%-correlated finite-size systematic. We observe that within errors this value is in complete agreement with the determination of method 1.
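The level of agreement between Eqs. (21) and (22) can be quantified with a simple pull. The sketch below treats the errors of the two determinations as uncorrelated and adds them in quadrature; this is our simplifying assumption, since both methods use the same gauge ensembles and the same finite-size model, so the true correlation is not zero.

```python
import math

# Continuum results in units of 1e-11, taken from Eqs. (21) and (22).
m1, m1_err_a, m1_err_b = 104.1, 6.9, 3.7   # method 1: value, first and second error
m2, m2_err = 98.9, 2.5                     # method 2: value and error

diff = m1 - m2
# Assumption: all errors are uncorrelated and combined in quadrature.
sigma = math.sqrt(m1_err_a**2 + m1_err_b**2 + m2_err**2)
pull = diff / sigma
print(f"difference = {diff:.1f}e-11, combined error = {sigma:.1f}e-11, pull = {pull:.2f}")
```

Under this assumption the two results differ by well under one combined standard deviation.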

5.3 Comparison of the two methods

Table 7 Resources used for the calculations on ensemble N202. \(N_\text {Conf}\) gives the number of gauge configurations used, \(N_\text {Src}\) the number of source positions per gauge configuration, and \(N_\text {Prop}\) the total number of propagator solves

The appeal of method 2 for computing the connected contribution is mostly practical: it is computationally far less expensive than method 1. The saving between the two is roughly an order of magnitude; see Table 7 for an exemplary comparison of computational cost for one particular ensemble, N202. This is because method 2 effectively replaces sequential propagator solves by additional, much cheaper, QED kernel evaluations. The downside of using these additional kernels is that their combination tends to broaden the integrand f(|y|). This behavior was seen in the lepton loop study of Appendix B and is also clearly the case with all the lattice QCD data in this work. We can use the parameter \(\Lambda \) of Eq. (7) to partially ameliorate this broadening; the use of such a subtraction kernel appears to be very important specifically for method 2, as this regulator offers little to no advantage for method 1.

Fig. 12

Integrand for the disconnected contribution on ensembles H101 and U103 (left), as well as N202 and H200 (right). The lattice data are shown as black points. The black dashed line shows the fully-connected model prediction, the blue line the \(\pi ^0 + \eta \) contribution for the disconnected and the magenta line gives the \(\pi ^0+\eta + \eta ^\prime \). The blue points show the \(\pi ^0+\eta \) contribution in finite volume

Fig. 13

Value of \(a_\mu ^{\mathrm{disc}}\) as a function of \(|y|_{\mathrm{cut}}\) for ensembles H101 (left) and N202 (right)

If we compare the results for the integrand of H200 and N202 using method 1 with those of method 2 (Figs. 8 and 11, respectively) we see that the integrand for method 2 is in general less peaked at short distances and extends further in |y|. For example, the integrand for N202 using method 1 is effectively zero around 2 fm, whereas for method 2 it becomes zero closer to 3 fm. This behavior is reflected in Tables 5 and 6 by larger choices of \(|y|_\text {cut}\) for method 2 compared to method 1.

Again comparing Tables 5 and 6, we see that for both methods the smaller boxes (those with \(m_\pi L<5\)) require a significant finite-volume correction. As the data for method 2 use the direction (1, 1, 1, 3), we approach the boundary of the lattice (L/2) slowly for increasing |y|, and so the finite-volume effect is smaller in comparison to the direction used in method 1. We do, however, see somewhat larger discretization effects for method 2 compared to those found in method 1, perhaps \(O(15\%)\) as opposed to \(O(10\%)\) at our coarsest lattice spacing. It is quite possible this is due to the \(\Lambda \)-regulator enhancing the integrand at shorter distances. Nevertheless, this is not a significant problem as we have several fine lattice spacings to help determine the continuum limit.
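The geometric statement about the two directions can be made concrete with a short calculation. The sketch below asks at which |y| the largest spatial component of y reaches L/2 for a given direction. It assumes that the last component of the direction vector is the temporal one and that the temporal extent is long enough not to be the limiting factor; the function name is hypothetical, and L is the N202 box size quoted in Sect. 4.

```python
import math

def y_at_spatial_half_box(direction, box_length):
    """Return |y| at which the largest spatial component of y equals L/2.

    `direction` is a 4-vector whose first three entries are taken to be spatial
    (an assumption of this sketch); y is a multiple of this vector.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    max_spatial = max(abs(c) for c in direction[:3])
    return 0.5 * box_length * norm / max_spatial

L_n202 = 3.08  # fm, spatial extent of ensemble N202

y_method1 = y_at_spatial_half_box((1, 1, 1, 1), L_n202)
y_method2 = y_at_spatial_half_box((1, 1, 1, 3), L_n202)
print(f"(1,1,1,1): spatial L/2 reached at |y| = {y_method1:.2f} fm")
print(f"(1,1,1,3): spatial L/2 reached at |y| = {y_method2:.2f} fm")
```

Along (1, 1, 1, 1) the spatial half-box is reached at |y| = L, whereas along (1, 1, 1, 3) it is reached only at \(|y|=\sqrt{3}\,L\), since most of the separation is carried by the long direction.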

If we were to use \(\Lambda =0.0\) with method 2 the tail would extend even further into the region where finite-volume effects become significant. This is likely still controllable for the symmetric-point ensembles used here as they have large volumes and \(m_\pi L\), but this would become much more problematic for lighter-pion-mass ensembles where the signal is expected to degrade quickly at large distances and the integrand is expected to be even broader.

In the following sections, when we combine the results for the connected and disconnected contributions, we will use the results from method 2 as our connected contribution. This is because they are statistically more precise while still being consistent with those of method 1.

6 Results for the quark-disconnected contribution

Table 8 lists the ensembles and statistics used for the computation of the quark-disconnected contribution. As the smaller ensembles (U103, H101, B450, H200) were considerably cheaper to perform inversions on, their statistics are greatly enhanced. As the lattice volume increases, the cost of propagator inversions increases with some power \(V^n\), with \(n>1\), and this quickly becomes the dominant cost of the computation. The column \(N_\text {Src}\) indicates the number of point-source propagators inverted per gauge configuration to build the grid, and the final column indicates the maximum and minimum number of equivalent values of |y| available from the set-up. For the ensembles with open boundary conditions, the number of self-averages, \(N_{\mathrm{{Equiv}}}\), for a given |y| decreases as |y|/a increases. Therefore open temporal boundaries make this calculation much more difficult, as the signal degrades rapidly with large |y|/a.

Table 8 The setup used for each ensemble in the computation of the quark-disconnected contribution. \(N_\text {Src}\) gives the number of propagator inversions per configuration, \(N_{\mathrm{{Conf}}}\) gives the number of gauge configurations used and \(N_{\mathrm{{Equiv}}}\) gives the maximum and minimum number of equivalent values of |y| averaged per configuration. Shorter separations have larger numbers of self-averages, whereas larger values of |y| have smaller \(N_{\mathrm{{Equiv}}}\). The ensemble B450 has a periodic temporal boundary and a temporal length of \(2\times \) the spatial one, so values of \(y/a=n(2,2,2,4)\) could be used, as well as the full periodicity in time

The ensemble B450 has lattice volume \(32^3\times 64\) and periodic boundary conditions in time, so a fully-periodic grid built of the two basis vectors (4, 4, 4, 0) and (2, 2, 2, 4) was used. All of the other determinations used |y| directions that are multiples of (1, 1, 1, 3) and (2, 2, 2, 0), so it is possible that B450 might have noticeably different discretization and finite-volume effects, as this direction lies in a different lattice irreducible representation of the broken rotation group O(4). However, as we see in Figs. 12 and 13 the short-distance contribution to the integral is small, and so we can assume the same is true of the discretization effects. As we appropriately correct for FSEs with this choice of direction, we do not expect a significant discrepancy compared to the open-boundary data. Upon continuum extrapolation (Fig. 14) it does seem that this ensemble is consistent with the others, indicating that rotation-breaking artifacts are not the main source of discretization effects.

Fig. 14

Combined continuum-extrapolation analysis for the connected and disconnected data. The fits were constrained to have the same slope and the final sum of the two was performed on the constant fit parameters. Also shown is the individual sum of the connected and disconnected pieces. The purple cross represents the addition of the continuum-extrapolated results for the connected and disconnected contributions

The integrand for the disconnected contribution is displayed in Fig. 12 for the value \(\Lambda =0.4\); much like for method 2 we find this value to be preferable. The integrand is compared to the prediction for the exchange of the \(\pi ^0\) and \(\eta \) mesons with a VMD TFF. In addition, the same prediction including an estimate of the \(\eta '\) contribution, based on the assumptions in Sect. 4.3, is indicated.

We do not see any statistically significant finite-volume effects in the integrands between the ensembles U103 and H101, and H200 and N202, for \(|y|<1\,\)fm. This observation is consistent with the predictions for the \((\pi ^0,\eta )\) exchange in finite volume. The central values of the integrand obtained on ensembles with different volumes differ substantially at some larger values of |y|; however, this is also in the regime where the signal is rapidly deteriorating, if not lost already. There is a trend in the tail for the larger-volume results to enhance the magnitude of the disconnected contribution, much like what we saw in the connected contribution. We see some enhancement of the integrand compared to the \(\pi ^0+\eta +\eta ^\prime \) prediction at short distances; the likely cause of this is the contribution from scalar mesons.

Clearly, the \(\pi ^0\) and \(\eta \) exchanges already provide a rather good description of the shape of the integrand, unlike in the case of the connected contribution (e.g. Figs. 8 and 11). At the same time, the loss of the signal beyond 1.2 fm means that we cannot, at this point, confirm the validity of the \((\pi ^0,\eta )\) exchange at long distances. Given the rapid degradation in signal of the disconnected data, probing large distances of the disconnected contribution will be a very challenging undertaking. There are however good reasons to believe that this description should apply in that regime, and in the following we assume this to be the case.

Table 9 summarizes our results. We perform the FSE matching at a single linearly-interpolated point of \(|y|_{\mathrm{cut}}=1.2\,\text {fm}\). This point appears to be where we start losing signal for most of our ensembles, and so a significant proportion of the tail of the integrand has to be modelled. We take solace in the fact that the model appears to describe our data well even at distances far shorter than 1.2 fm, as can be seen in Figs. 12 and 13. By far the largest part of the correction comes from modelling the tail with the \((\pi ^0,\eta )\) exchange. This correction is of the order of \(100\%\) of the lattice-determined contribution. We have also computed an estimate of the \(\eta '\) exchange contribution as described in Sect. 4.3. The values in Table 9 show that its contribution to the tail, \(|y|>1.2\) fm, is much smaller. Its magnitude is covered by the systematic uncertainty we assign to the modelling of the tail. We have chosen to include the estimated contribution of the \(\eta '\) to the tail in the central value of \(a_\mu ^{\mathrm{disc}}\).
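The matching step described above can be sketched in code. The following is a minimal illustration and not the analysis code of this work: the helper names `trapezoid` and `integrate_with_tail`, the sample points, and the exponential stand-in for the tail are all hypothetical. In the actual analysis the tail is given by the \((\pi ^0,\eta )\)-exchange prediction.

```python
import numpy as np

def trapezoid(y, f):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(y)))

def integrate_with_tail(y_lat, f_lat, y_cut, tail_model, y_max=6.0, n_tail=2001):
    """Return (head, tail): the lattice integral up to y_cut and the
    model integral from y_cut to y_max (y_max is an assumed cutoff)."""
    y_lat = np.asarray(y_lat, dtype=float)
    f_lat = np.asarray(f_lat, dtype=float)
    # Linearly interpolate the lattice integrand to the matching point.
    f_cut = np.interp(y_cut, y_lat, f_lat)
    below = y_lat < y_cut
    y_head = np.append(y_lat[below], y_cut)
    f_head = np.append(f_lat[below], f_cut)
    head = trapezoid(y_head, f_head)
    # Beyond the cut, the lattice data are replaced by the model.
    y_tail = np.linspace(y_cut, y_max, n_tail)
    tail = trapezoid(y_tail, tail_model(y_tail))
    return head, tail

# Hypothetical integrand samples (|y| in fm, arbitrary units for f).
y_lat = np.array([0.0, 0.3, 0.6, 0.9, 1.1, 1.3])
f_lat = np.array([0.0, -30.0, -40.0, -30.0, -22.0, -16.0])

def toy_tail(y):
    # Stand-in for the meson-exchange prediction; purely illustrative.
    return -19.0 * np.exp(-(y - 1.2) / 0.8)

head, tail = integrate_with_tail(y_lat, f_lat, 1.2, toy_tail)
print(f"head = {head:.2f}, tail = {tail:.2f}")
```

In this toy setup the tail is comparable in size to the head, which mirrors the statement that the tail correction is of the order of 100% of the lattice-determined part.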

Anticipating the combined analysis presented in Sect. 7.1, the continuum-extrapolated disconnected value we obtain from a constrained-slope fit to both the connected method 2 data and the disconnected data is

$$\begin{aligned} a_\mu ^{\mathrm{disc}} = -33.5(4.2)\times 10^{-11}. \end{aligned}$$
(23)

We observe that this result amounts to about \(-1/3\) of the connected contribution.

Table 9 Finite-volume corrected disconnected contributions to \(a_\mu ^{\text {hlbl}}\) at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point. A breakdown of the FSE contributions to the total results is shown; again, a 25% systematic uncertainty on the total finite-size correction is used. The value of \(|y|_\text {cut}\) was \(1.2\text { fm}\) for each ensemble

7 The total \(a_\mu ^{\mathrm{hlbl}}\)

The main purpose of this section is to describe how we arrive at our final result for the total \(a_\mu ^{\mathrm{hlbl}}\) (Sect. 7.1), in which the systematics of the continuum extrapolation are discussed in detail. The following Sect. 7.2 presents a study of the dependence of our result on the muon mass, and finally, Sect. 7.3 discusses what outcome one may expect for \(a_\mu ^{\mathrm{hlbl}}\) at physical quark masses based on our findings.

7.1 Final result for \(a_\mu ^{\mathrm{hlbl}}\) in \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric QCD

7.1.1 Combining the connected and disconnected contributions

In order to obtain the final result for \(a_\mu ^{\mathrm{hlbl}}=a_\mu ^\mathrm{conn}+a_\mu ^{\mathrm{disc}}\), we combine the disconnected data with the connected data of method 2, as it is consistent with, yet statistically more precise than, that of method 1. We then explore two types of analyses on this combined data set, which differ mainly by whether the data are combined after or before the continuum extrapolation. Having chosen the first option, the systematics of the extrapolation are further investigated and quantified in Sect. 7.1.2.

The full set of data for the two contributions and their sum is shown in Fig. 14, along with fits and results from the first analysis whose description follows. Both the connected and disconnected data show a negative slope in \(a^2\) despite the two contributions having opposite signs. This indicates that the leading discretization effects do not arise from a common multiplicative effect such as the renormalization factor.
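To make this argument explicit, consider the hypothetical case in which the leading cutoff effect were a common multiplicative factor \(1+c\,a^2\) (the coefficient c is introduced here for illustration only). Then, for \(X\in \{\mathrm{conn},\mathrm{disc}\}\),

$$\begin{aligned} a_\mu ^{X}(a) = (1+c\,a^2)\,a_\mu ^{X}(0) \quad \Rightarrow \quad \frac{\partial a_\mu ^{X}(a)}{\partial a^2} = c\,a_\mu ^{X}(0). \end{aligned}$$

Since \(a_\mu ^{\mathrm{conn}}(0)>0\) and \(a_\mu ^{\mathrm{disc}}(0)<0\), the two slopes would then have opposite signs, contrary to what is observed.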

The first analysis method consists in extrapolating both the connected and the disconnected contributions to the continuum, before summing the two extrapolated values to obtain \(a_\mu ^{\mathrm{hlbl}}\). In this procedure, the results for the connected contribution and the disconnected contribution are both corrected for finite-size effects and for the extension of the |y|-integrand. The statistical errors are obtained under the bootstrap, and a correlated Gaussian sample with appropriate width is propagated for the systematic error due to the correction. The systematic error of the correction, which is set to 25% of its size, is treated as being fully anti-correlated between the connected and the disconnected contribution (recall that the two contributions have opposite sign). In the following subsection, we explore ten different variants of this type of analysis. As a representative outcome, we give the result of simultaneously extrapolating the connected and the disconnected contributions to the continuum with a slope in \(a^2\) constrained to be equal for the two contributions,

$$\begin{aligned} a_\mu ^{\mathrm{hlbl}}= 65.4(4.9)\times 10^{-11}, \end{aligned}$$
(24)

where the quoted uncertainty is composed of the statistical error as well as the systematic error of the finite-size correction and tail extension. The systematic error of the continuum extrapolation is determined in Sect. 7.1.2.
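The treatment of the correction systematic in this first analysis can be illustrated with a small Monte Carlo sketch. It is an assumption-laden toy, not the analysis code: the function name `combine_anticorrelated` and all input numbers are hypothetical, and only the systematic part is modelled (the statistical bootstrap is omitted). A single shared Gaussian variable rescales both corrections by the same relative amount; because the two corrections have opposite signs, the resulting shifts are fully anti-correlated.

```python
import numpy as np

def combine_anticorrelated(conn, disc, corr_conn, corr_disc,
                           frac=0.25, n=200_000, seed=0):
    """Propagate a `frac` relative uncertainty on both corrections using
    one shared Gaussian variable. Returns (mean, std, correlation)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)  # shared between the two contributions
    conn_s = conn + corr_conn * (1.0 + frac * z)
    disc_s = disc + corr_disc * (1.0 + frac * z)
    total = conn_s + disc_s
    rho = np.corrcoef(conn_s, disc_s)[0, 1]
    return total.mean(), total.std(ddof=1), rho

# Hypothetical uncorrected values and corrections (units of 1e-11).
mean, sys_err, rho = combine_anticorrelated(
    conn=80.0, disc=-17.0, corr_conn=20.0, corr_disc=-16.0)
print(f"sum = {mean:.2f}, systematic = {sys_err:.2f}, correlation = {rho:.3f}")
```

In this picture the systematic error on the sum is `frac * |corr_conn + corr_disc|`, so the opposite signs of the two corrections lead to a partial cancellation of the uncertainty.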

In the second analysis method, the connected and disconnected contributions are summed first, and a single continuum extrapolation is performed. In this analysis, the correction applied to the lattice data is split into two parts, (a) \(a_{\mu }^{\mathrm{FSE}}\), the pure finite-size correction on the integrand, and (b) \(a_{\mu }^{\mathrm{tail}}\), the extension of the integrand beyond some \(|y|_{\mathrm{cut}}\). In combining the connected and disconnected contributions, each part is treated as being fully anti-correlated between the connected and the disconnected contribution; however, in contrast with the first analysis, the correlation between the systematic errors of the two parts is considered to be zero. The point of view adopted here is that the uncertainty of the correction has a somewhat different origin in each case: the systematic uncertainty of the tail extension mainly arises from neglected non-pseudoscalar exchange contributions around \(|y|=|y|_{\mathrm{cut}}\), while the applied finite-size effect correction neglects higher exponentials such as \(e^{-m_\pi L}\). The result of this procedure, for a continuum extrapolation linear in \(a^2\), is

$$\begin{aligned} a_\mu ^{\mathrm{hlbl}}= 64.5(6.7)\times 10^{-11}. \end{aligned}$$
(25)

The two analyses produce compatible values for \(a_\mu ^{\mathrm{hlbl}}\), but the first allows more flexibility since the two contributions are treated separately. Therefore, we will use the first analysis for the final result and study variations on it to estimate the systematic uncertainty.

7.1.2 Continuum extrapolation systematics

Our data are noisy and it is difficult to find a fit that describes them perfectly, so we consider several different choices of continuum extrapolation, investigate their spread, and assign an associated continuum-extrapolation systematic. As already noted in Sect. 5.1, since we are not using O(a)-improved vector currents, a term linear in the lattice spacing is possible. We consider the two fit forms:

$$\begin{aligned} a_\mu ^{\text {conn}}(a)=a_\mu ^{\text {conn}}(0)+A a^n, \,\, a_\mu ^{\text {disc}}(a)=a_\mu ^{\text {disc}}(0)+B a^m,\nonumber \\ \end{aligned}$$
(26)

with various cuts to the data, constraints on A and B, and choices of n and m as listed in Table 10. We then add the distributions \(a_\mu ^{\mathrm{hlbl}}= a_\mu ^{\text {conn}}(0) + a_\mu ^{\text {disc}}(0)\) to obtain the results shown in Fig. 15.
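As an illustration of one entry of this family of fits, the sketch below performs a simultaneous weighted least-squares fit with \(n=m=2\) and the constraint \(A=B\). It assumes uncorrelated Gaussian errors and uses invented data points; the function name `fit_shared_slope` is hypothetical, and the actual analysis propagates errors with the bootstrap rather than with the covariance matrix used here.

```python
import numpy as np

def fit_shared_slope(a2, conn, conn_err, disc, disc_err):
    """Fit conn = c0 + A*a2 and disc = d0 + A*a2 with a common slope A.
    Returns (params, total, total_err, chi2_per_dof), params = [c0, d0, A]."""
    a2 = np.asarray(a2, dtype=float)
    n = a2.size
    # Design matrix: rows are conn points then disc points;
    # columns are conn(0), disc(0) and the shared slope.
    X = np.zeros((2 * n, 3))
    X[:n, 0] = 1.0
    X[n:, 1] = 1.0
    X[:, 2] = np.concatenate([a2, a2])
    y = np.concatenate([conn, disc]).astype(float)
    w = 1.0 / np.concatenate([conn_err, disc_err]).astype(float)
    Xw, yw = X * w[:, None], y * w
    params, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    cov = np.linalg.inv(Xw.T @ Xw)
    chi2 = float(np.sum((yw - Xw @ params) ** 2))
    total = params[0] + params[1]
    # Error on the sum includes the covariance between the two intercepts.
    total_err = float(np.sqrt(cov[0, 0] + cov[1, 1] + 2.0 * cov[0, 1]))
    return params, total, total_err, chi2 / (2 * n - 3)

# Invented data: a^2 in fm^2, values in units of 1e-11.
a2 = [0.0025, 0.0041, 0.0058, 0.0074]
conn = [94.0, 92.5, 88.0, 85.5]
conn_err = [3.0, 2.5, 2.5, 2.0]
disc = [-38.0, -40.5, -44.0, -48.0]
disc_err = [4.0, 3.5, 3.0, 3.0]
params, total, total_err, chi2_dof = fit_shared_slope(a2, conn, conn_err, disc, disc_err)
print(f"conn(0) + disc(0) = {total:.1f} +/- {total_err:.1f}, chi2/dof = {chi2_dof:.2f}")
```

Sharing the slope removes one free parameter, which is why the constrained fit yields a smaller error on the sum than two independent fits would.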

Table 10 Different fit forms used to estimate the continuum extrapolation systematic

Whenever we omit the coarse data, the fit tends to flatten the slope. This is due to the anomalously low result of N300. When we do not include the coarse ensembles the error increases substantially, which indicates that the fit is struggling to model the data accurately.

Fig. 15 An estimate of our continuum extrapolation systematic on the full \(a_\mu ^{\mathrm{hlbl}}\) for the combination of method 2 and the disconnected data

If we perform a fit linear in a, the central value moves up, as was also the case for the method 1 continuum extrapolation. We choose to quote the fit linear in \(a^2\) to all the data with a constrained slope as our final result, as it has the best \(\chi ^2/\mathrm {d.o.f.}=2\). For the continuum-extrapolation systematic, we use the lower error bound of the largest fit result and the upper bound of our lowest fit result. Constraining the slope rather than letting it vary does little to the position of the central value apart from reducing the error; this suggests that the fit has difficulty determining the slope accurately with the quality of the data we have at present.

7.2 The dependence of our results on the muon mass

Since the mesons at the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetric point can be viewed as ‘heavy’ degrees of freedom relative to the muon, we would expect \(a_\mu ^{\mathrm{hlbl}}\) to be roughly proportional to \(m_\mu ^2\). Here, we will study what happens if we re-scale the muon mass on one of our ensembles. We are motivated to do so by our experience on related projects [35] whereby adjusting the muon mass by some dimensionless ratio can flatten the approach to the chiral limit. Here we investigate this idea by defining two quantities,

$$\begin{aligned} {\overline{a}}^{\text {hlbl}}_\mu = \left( \frac{f_\pi ^{\text {Latt.}}}{f_\pi ^{\text {Phys.}}}\right) ^2 a_\mu ^{\mathrm{hlbl}}(m_\mu ^{\text {Phys.}}), \end{aligned}$$
(27)

and,

$$\begin{aligned} \quad {\tilde{a}}^{\text {hlbl}}_\mu = a_\mu ^{\text {hlbl}}\left( \frac{f_\pi ^{\text {Latt.}}}{f_\pi ^{\text {Phys.}}} m_\mu ^{\text {Phys.}}\right) . \end{aligned}$$
(28)

The first quantity rescales the integrated result by the square of the ratio of the lattice-determined pion decay constant to its physical value. The second quantity re-scales the muon mass used as input in our determination by this ratio. These two definitions would be comparable if \(a_\mu ^{\mathrm{hlbl}}\) scales as \(m_\mu ^2\), which is to be expected in the heavy-quark limit.
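To spell out why the two definitions coincide under quadratic scaling, write \(r=f_\pi ^{\text {Latt.}}/f_\pi ^{\text {Phys.}}\) (a shorthand introduced here) and assume \(a_\mu ^{\mathrm{hlbl}}(m)=C\,m^2\) for some constant C. Then

$$\begin{aligned} {\tilde{a}}^{\text {hlbl}}_\mu = C\,\bigl (r\,m_\mu ^{\text {Phys.}}\bigr )^2 = r^2\,C\,\bigl (m_\mu ^{\text {Phys.}}\bigr )^2 = r^2\,a_\mu ^{\mathrm{hlbl}}(m_\mu ^{\text {Phys.}}) = {\overline{a}}^{\text {hlbl}}_\mu . \end{aligned}$$

Any difference between the two quantities is therefore a measure of the departure from quadratic scaling in the muon mass.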

Figure 16 illustrates the effect of these re-scaling procedures on the connected contribution to \(a_\mu ^{\mathrm{hlbl}}\) on one of our coarsest and largest boxes, H101 (results are from Method 2 with \(\Lambda =0.4\)). It is clear that these two prescriptions are equivalent within error, which suggests that any change in the muon mass leads to a quadratic change in the integrated result. We can also use this analysis to very naively infer how much we expect the result to grow as we approach the chiral limit, and it appears that for the connected contribution this could be of the order of a \(25\%\) increase.

Fig. 16 Partially-integrated lattice results for the ensemble H101 with and without re-scaling of the integral

7.3 Expectations for \(a_\mu ^{\mathrm{hlbl}}\) in QCD with physical quark masses

In this subsection, we first quantify the contributions to \(a_\mu ^{\mathrm{hlbl}}\) at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point not coming from the light pseudoscalars. These contributions are expected to have only a modest quark-mass dependence, and they are also the hardest to determine quantitatively by (experimental-)data-driven methods. Therefore any information on these contributions is worthwhile collecting. By using our determination of this contribution together with our previous calculation of the \(\pi ^0\) exchange, we can arrive at an estimate for \(a_\mu ^{\mathrm{hlbl}}\) at physical quark masses.

We begin by noting that, subtracting the \(\pi ^0\) and \(\eta \) contributions (respectively Eqs. (16) and (18)) from our final SU(3)-point result, Eq. (31), the contribution of heavier intermediate states amounts to

$$\begin{aligned} a_\mu ^{\mathrm{hlbl,SU(3)_{\mathrm{f}}}} - a_\mu ^{\mathrm{hlbl,\pi ^0+\eta ,SU(3)_{\mathrm{f}}}} = (37.4 \pm 8.3)\times 10^{-11}.\nonumber \\ \end{aligned}$$
(29)

In particular, this contribution accounts for 57% of the total \(a_\mu ^{\mathrm{hlbl}}\), and we have added all statistical and systematic errors in quadrature.

Next, we may try to roughly estimate \(a_\mu ^{\mathrm{hlbl}}\) at the physical point. As the (ud) quark masses are lowered to their physical values at fixed trace of the quark mass matrix, it is the pion whose mass changes by the largest factor: it becomes a factor of three lighter. Since we have an evaluation of the \(\pi ^0\) exchange contribution at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point and at the physical point (see Eqs. (16) and (17)), we can correct for this effect,

$$\begin{aligned}&a_\mu ^{\mathrm{hlbl,SU(3)_{\mathrm{f}}}} - a_\mu ^{\mathrm{hlbl,\pi ^0,SU(3)_{\mathrm{f}}}} + a_\mu ^{\mathrm{hlbl,\pi ^0,\mathrm{phys}}} \nonumber \\&\quad = (104.1\pm 9.1)\times 10^{-11}. \end{aligned}$$
(30)

One can think of Eq. (30) as a rough estimate of \(a_\mu ^{\mathrm{hlbl}}\) at the physical point, purely based on our lattice QCD results and the assumption of a negligible quark-mass dependence of the non-\(\pi ^0\)-exchange contributions. As argued in the next paragraph, we expect such an estimate to hold at the 20% level. It is well in line with the most recent evaluations [5].
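As a cross-check of the arithmetic, the numbers quoted in this section can be combined as follows (all in units of \(10^{-11}\)). The individual exchange contributions below are inferred from the rounded results quoted here, together with the \(\hbox {SU}(3)_{\mathrm{f}}\) relation that the \(\eta \) exchange is one third of the \(\pi ^0\) exchange; they are not read off the earlier equations and may differ from them in the last digit.

$$\begin{aligned} a_\mu ^{\mathrm{hlbl,\pi ^0+\eta ,SU(3)_{\mathrm{f}}}}&\approx 65.4 - 37.4 = 28.0, \qquad \frac{37.4}{65.4}\approx 0.57, \\ a_\mu ^{\mathrm{hlbl,\pi ^0,SU(3)_{\mathrm{f}}}}&\approx \tfrac{3}{4}\times 28.0 = 21.0, \\ a_\mu ^{\mathrm{hlbl,\pi ^0,\mathrm{phys}}}&\approx 104.1 - 65.4 + 21.0 = 59.7. \end{aligned}$$

A 20% uncertainty on Eq. (30) corresponds to about \(21\times 10^{-11}\).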

In order to assess the systematic uncertainty of such an estimate, we apply a slightly more sophisticated method to correct for the quark-mass dependence of the \(\eta \) contribution and the charged pion loop. Within the scalar QED framework, we find \(-6.3\times 10^{-11}\) for the pion loop, to be doubled to include the kaon loop, and we expect a factor of two to three reduction if one includes an electromagnetic form factor for the pseudoscalar. Therefore, further subtracting \(a_\mu ^{\mathrm{hlbl},(\pi ^\pm ,K^\pm ),\mathrm{SU}(3)_{\mathrm{f}}}\approx 2\times (-3.0)\times 10^{-11}\) from Eq. (29), we obtain \((43.4\pm 9.3)\times 10^{-11}\). This number represents our estimate of the non-pseudo-Goldstone contributions at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point (see footnote 6). We observe that by neglecting the quark-mass dependence of this contribution, and using the dispersive \(\pi ^0\)-exchange [5, 36] and the Canterbury-approximant \(\eta \)-exchange [5, 27] results for \(a_\mu ^{\mathrm{hlbl,\pi ^0+\eta ,\mathrm{phys}}}=79.3\times 10^{-11}\) and the box contribution [5, 37] to \( a_\mu ^{\mathrm{hlbl},(\pi ^\pm ,K^\pm ),\mathrm{phys}}=-16.4\times 10^{-11}\), we arrive at the estimate \(10^{11}a_\mu ^{\mathrm{hlbl,phys}}=79.3(3.0)-16.4(2)+43.4(9.3)=106.3(9.8)\). It is only slightly different from the more naive estimate of Eq. (30). Therefore we consider it safe to assign a systematic error of 20% to Eq. (30) as an estimate of \(a_\mu ^{\mathrm{hlbl}}\) at the physical point. This uncertainty estimate also generously covers the \(a_\mu ^{\mathrm{hlbl,phys}}\) value obtained by assuming that the non-pseudo-Goldstone contributions increase from the \(\hbox {SU(3)}_{\mathrm{f}}\) to the physical point by a factor \((f_\pi ^{\mathrm{SU(3)_{\mathrm{f}}}}/f_\pi ^{\mathrm{phys}})^2\) to account approximately for the quark-mass dependence of the QCD resonances; see the previous subsection concerning this point.
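The arithmetic of this estimate can be reproduced in a few lines. The sketch below uses only numbers quoted in the text (in units of \(10^{-11}\)); the variable names and the helper `quadrature` are ours, and the three uncertainties are assumed to be uncorrelated, which is how the quoted total error is recovered.

```python
import math

def quadrature(*errs):
    """Combine uncorrelated uncertainties in quadrature."""
    return math.sqrt(sum(e * e for e in errs))

# Non-pseudo-Goldstone part at the SU(3)_f point: Eq. (29) minus the
# charged pion and kaon loops, estimated as 2 x (-3.0).
heavy_su3 = 37.4
charged_loops_su3 = 2 * (-3.0)
non_pg = heavy_su3 - charged_loops_su3  # 43.4
non_pg_err = 9.3                        # as quoted in the text

# Physical-point inputs quoted in the text.
pi0_eta_phys, pi0_eta_err = 79.3, 3.0
box_phys, box_err = -16.4, 0.2

total = pi0_eta_phys + box_phys + non_pg
total_err = quadrature(pi0_eta_err, box_err, non_pg_err)
print(f"{total:.1f} +/- {total_err:.1f}")  # prints 106.3 +/- 9.8

# Relative difference to the simpler estimate of Eq. (30).
naive = 104.1
print(f"relative shift: {100 * (total - naive) / naive:.1f}%")
```

The shift relative to Eq. (30) is about 2%, well inside the 20% systematic assigned to that estimate.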

8 Conclusions

In this work we have computed the hadronic light-by-light contribution to the \(g-2\) of the muon using lattice QCD at the \(\hbox {SU}(3)_{\mathrm{f}}\)-symmetric point with \(m_{\pi } = m_{K} \approx 420\) MeV. We chose to initially work at the symmetric point for several reasons. First, due to the significantly reduced computational cost (as compared to simulations at physical quark masses), we are able to control all known sources of systematic error, in particular the finite-size and finite lattice spacing effects. Second, the \(\hbox {SU(3)}_{\mathrm{f}}\)-symmetry implies that only two out of five quark-contraction topologies contribute, and it simplifies the hadronic models with which the integrand can be compared and interpreted. For instance, the \(\eta \)-exchange contribution simply amounts to one third of the pion-exchange. Third, the overall contribution from states beyond the light pseudoscalars is not expected to be strongly quark-mass dependent, so that the present calculation already constrains its size.

In order to help interpret our results for \(a_\mu ^{\mathrm{hlbl}}\), we have performed an exploratory study of the low-lying meson spectrum at the \(\hbox {SU(3)}_{\mathrm{f}}\) point. The most remarkable feature is the existence of a stable singlet \(J^{PC}=0^{++}\) meson with a mass of about 680 MeV. Also, our previous calculation of the pion transition form factor [12] allows us to quantify the contributions of the \(\pi ^0\) and \(\eta \) exchanges.

Our strategy to calculate \(a_\mu ^{\mathrm{hlbl}}\) relies on coordinate-space perturbation theory, for which muon and photon propagators are computed in infinite volume. We have presented the integrand for the final, one-dimensional integral over the distance |y| of a quark-photon vertex from one of the other two internal vertices, since it contains more information than the final \(a_\mu ^{\mathrm{hlbl}}\) value. For the quark-connected contribution we have identified two methods of calculation that we call method 1 and method 2. The former amounts to a direct computation of the three connected diagrams, but it is a numerically costly approach, as it involves many sequential-propagator calculations. To make this computation much cheaper, we have utilized several changes of variables and translational invariance to rewrite the integral in terms of an easy-to-calculate single diagram and a combination of different kernels; this we call method 2.

For a single |y|-value, method 2 requires only two propagator inversions, at the modest cost of a more complicated QED-kernel calculation. Method 2 has another computational advantage over method 1 in that it allows one to store a set of propagators in memory and perform their integrals, redefining the origin to be each propagator source; thereby, for N propagators we can compute \(N(N-1)\) non-zero samples of f(|y|). If the source points are evenly spaced and the volume is periodic, this amounts to N self-averages per |y|. Such self-averaging is crucial in reducing the cost of the calculation.
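The counting of samples can be checked with a toy enumeration. The sketch below is an assumption-based illustration: the function name `displacement_counts`, the grid size, and the spacing are hypothetical, and samples are grouped by displacement vector between sources on a periodic grid, which is how the sketch interprets "per |y|". Several displacement vectors can share the same length |y|, in which case the number of averages at that distance is larger.

```python
from collections import Counter
from itertools import product

def displacement_counts(n_per_dim, spacing, dims=1):
    """Count ordered source pairs per periodic displacement vector for an
    evenly spaced grid of n_per_dim**dims sources."""
    extent = n_per_dim * spacing
    sources = list(product(range(0, extent, spacing), repeat=dims))
    counts = Counter()
    for origin in sources:          # each source is used as origin in turn
        for other in sources:
            if other == origin:
                continue
            disp = tuple((o - x) % extent for x, o in zip(origin, other))
            counts[disp] += 1
    return len(sources), counts

n_src, counts = displacement_counts(n_per_dim=4, spacing=8, dims=2)
print("sources N:", n_src)
print("ordered pairs N(N-1):", sum(counts.values()))
print("distinct displacements:", len(counts))
print("samples per displacement:", set(counts.values()))
```

For an evenly spaced periodic grid every non-zero displacement occurs exactly N times, once for each choice of origin, which is the origin of the N-fold self-averaging.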

We note that the combination of kernels needed for method 2 broadens the integrand in |y| significantly. To counteract this effect, we have introduced a new class of subtracted kernels with a Gaussian regulator \(\Lambda \). This parameter \(\Lambda \) effectively allows us to tune the shape of the integrand to peak at shorter distances. We find that a value of \(\Lambda =0.4\) suits our purposes quite well and allows for the integral of the connected lattice data to saturate at reasonably short distances of about \(2\text { fm}\).

We have handled the disconnected contribution by introducing a sparse sub-grid of equidistant point sources, the idea being that we obtain a large number of self-averages from treating each point on the grid in turn as our origin. Such a technique is necessary as this contribution is extremely noisy and suffers from a significant signal-to-noise problem at large distances. We note that potentially millions of averages are needed in order to get good control over errors at distances of order \(2\text { fm}\).

In our calculation, we have seen that finite-size effects are significant, and having a good theoretical understanding of the tail of the integrand is very important. For the quark-connected contribution, we have approximated the \(\pi ^0\) and \(\eta \) meson exchange contribution to the integrand using a vector-dominance transition form factor and compared it to the lattice data; only at fairly large |y| does the prediction quantitatively represent the integrand. We therefore attempt to conservatively incorporate as much lattice data as possible before making contact with the model. For the disconnected contribution, the model does a satisfactory job of describing the data over the entire range of |y| where we have signal, and we therefore match on to the prediction at shorter distances, where we still have control over the statistical errors of the lattice data.

We have shown that both the disconnected and connected contributions have a non-negligible discretization effect within our measured precision. It appears to be of the same sign and comparable magnitude for both contributions, but ultimately this extrapolation appears well under control. We decide to quote a result from a fit linear in \(a^2\) with a constrained slope to the data of method 2 and the disconnected data,

$$\begin{aligned} a_\mu ^{\mathrm{hlbl}}= (65.4\pm 4.9\pm 6.6)\times 10^{-11}, \end{aligned}$$
(31)

at the SU(3) flavor-symmetric point, where the first error results from the uncertainties on the individual gauge ensembles, and the second is the systematic error of the continuum extrapolation.

We have discussed in Sect. 7.3 how we expect our result for \(a_\mu ^{\mathrm{hlbl}}\) to evolve as the up and down quark masses are lowered towards their physical values at fixed trace of the quark mass matrix. Correcting for the increase in the \(\pi ^0\) exchange contribution (see footnote 7) using our previous lattice calculation [12], we arrive at a value (Eq. (30)) which is very well in line with the most recent phenomenological [5] and lattice QCD results [16]. This value is quite stable under varying the assumptions about the quark-mass dependence of heavier-state contributions.

In order to reduce the systematic uncertainty of \(a_\mu ^{\mathrm{hlbl}}\) at physical quark masses using lattice QCD, simulations at lighter quark masses are clearly needed. The methods we have developed to correct for finite-size effects and to extend the tail of the |y|-integrand based on the \(\pi ^0\) exchange will be extremely valuable in this endeavor. While we have reached a semi-quantitative understanding of the integrand in terms of hadronic models, further work is needed on the theory side to bring this description to a fully quantitative level.