Hadronic light-by-light contribution to $(g-2)_\mu$ from lattice QCD with SU(3) flavor symmetry

We perform a lattice QCD calculation of the hadronic light-by-light contribution to $(g-2)_\mu$ at the SU(3) flavor-symmetric point $m_\pi=m_K\simeq 420\,$MeV. The representation used is based on coordinate-space perturbation theory, with all QED elements of the relevant Feynman diagrams implemented in continuum, infinite Euclidean space. As a consequence, the effect of using finite lattices to evaluate the QCD four-point function of the electromagnetic current is exponentially suppressed. Thanks to the SU(3)-flavor symmetry, only two topologies of diagrams contribute, the fully connected and the leading disconnected. We show the equivalence in the continuum limit of two methods of computing the connected contribution, and introduce a sparse-grid technique for computing the disconnected contribution. Thanks to our previous calculation of the pion transition form factor, we are able to correct for the residual finite-size effects and extend the tail of the integrand. We test our understanding of finite-size effects by using gauge ensembles differing only by their volume. After a continuum extrapolation based on four lattice spacings, we obtain $a_\mu^{\rm hlbl} = (65.4\pm 4.9 \pm 6.6)\times 10^{-11}$, where the first error results from the uncertainties on the individual gauge ensembles and the second is the systematic error of the continuum extrapolation. Finally, we estimate how this value will change as the light-quark masses are lowered to their physical values.

We perform a lattice QCD calculation of the hadronic light-by-light contribution to (g − 2) µ at the SU(3) flavor-symmetric point m π = m K 420 MeV. The representation used is based on coordinate-space perturbation theory, with all QED elements of the relevant Feynman diagrams implemented in continuum, infinite Euclidean space. As a consequence, the effect of using finite lattices to evaluate the QCD four-point function of the electromagnetic current is exponentially suppressed. Thanks to the SU(3)-flavor symmetry, only two topologies of diagrams contribute, the fully connected and the leading disconnected. We show the equivalence in the continuum limit of two methods of computing the connected contribution, and introduce a sparse-grid technique for computing the disconnected contribution. Thanks to our previous calculation of the pion transition form factor, we are able to correct for the residual finite-size effects and extend the tail of the integrand. We test our understanding of finite-size effects by using gauge ensembles differing only by their volume. After a continuum extrapolation based on four lattice spacings, we obtain a hlbl µ = (65.4 ± 4.9 ± 6.6) × 10 −11 , where the first error results from the uncertainties on the individual gauge ensembles and the second is the systematic error of the continuum extrapolation. Finally, we estimate how this value will change as the light-quark masses are lowered to their physical values. Electrons and muons carry a magnetic moment aligned with their spin. The proportionality factor between the two axial vectors is parameterized by the gyromagnetic ratio g. In Dirac's theory, g = 2, and for a lepton family one characterizes the deviation of g from this reference value by a = (g − 2) /2. Historically, the ability of Quantum Electrodynamics (QED) to quantitatively predict this observable played a crucial role in establishing quantum field theory as the framework in which particle physics theories are formulated.
Presently, the achieved experimental precision on the measurement of the anomalous magnetic moment of the muon [1], a µ , is 540 ppb. At this level of precision, such a measurement tests not only QED, but also the effects of the weak and the strong interaction of the Standard Model (SM) of particle physics. Currently there exists a tension of about 3.7 standard deviations between the SM prediction and the experimental measurement. The status of this test of the SM is reviewed in [2][3][4][5]. At the time of writing, the E989 experiment at Fermilab is performing a new direct measurement of a µ [6], and a further experiment using a different experimental technique is planned at J-PARC [7]. The final goal of these experiments is to reduce the uncertainty on a µ by a factor of four. A commensurate reduction of the theory error is thus of paramount importance.
The precision of the SM prediction for a µ is completely dominated by hadronic uncertainties. The leading hadronic contribution enters at second order in the fine-structure constant α via the vacuum polarization and must be determined at the few-permille level in order to match the upcoming precision of the direct measurements of a µ . The most accurate determination comes from the use of e + e − → hadrons data via a dispersion relation, although lattice QCD calculations have made significant progress in computing this quantity from first principles [5,8]. The hadronic light-by-light (HLbL) scattering contribution a hlbl µ , which is of third order in α, currently contributes at a comparable level to the theory uncertainty budget and is being addressed both by dispersive and lattice methods; see [9][10][11] and references therein.
Our approach for determining a hlbl µ is based on coordinate-space perturbation theory where the QED elements of the Feynman diagrams yielding a hlbl µ are precomputed in infinite volume, and only the four-point amplitude of the electromagnetic current is actually computed on the lattice. Here we compute the lattice contribution at a point in the space of light quark masses corresponding to QCD with exact SU(3)-flavor (denoted SU(3) f in the rest of this work) symmetry. Furthermore, the sum of the three light quark masses is approximately the same as in nature. These two conditions leads to a degenerate mass of pions, kaons and the eta meson of about 420 MeV.
Our motivation for calculating a hlbl µ at the SU(3) f -symmetric point is twofold. First, the lattice calculation itself is simplified in that only two out of five classes of Wick contractions contribute, due to the vanishing trace of the quark electric charge matrix. In addition, the overall lattice calculation is computationally far cheaper than for physical quark masses, so that more tests of systematic errors can be performed. Second, the interpretation of the results is simplified: the SU(3) f -symmetry reduces the number of unknown parameters in model estimates based on the exchange of the lightest mesons. In particular, the transition form factor (TFF) of the pion, which describes the coupling of the neutral pion to two virtual photons, has been calculated [12] on the lattice ensembles that we use. The TFF of the eta meson coincides with the TFF of the pion up to a simple overall charge factor. Of the pseudoscalar mesons, only the TFF of the η remains independent and is largely unknown at the SU(3) f -symmetric point, however experimental information is available for the two-photon decay width (which provides the coupling strength to two real photons) and some experimental results are available for the singly as well as the doubly-virtual form factor [13][14][15], although only for relatively large virtualities above 1.5 GeV 2 .
The simplified connection to model estimates enables our work to provide a valuable cross-check for the predictions of hadronic models and dispersive methods; this work is thereby complementary to lattice calculations directly aiming at a hlbl µ for physical quark masses [16]. At the same time, this study allows us to learn about the size of various sources of systematic error, particularly the finite-size effects, and how well we are able to correct for them semi-analytically.
The rest of this paper is organized as follows: We begin by presenting our methodology in Section II, including the two methods we will investigate for computing the quark-connected contribution to a hlbl µ . Section III begins with a description of the lattice ensembles used in this work, as well as an example of the lowest-lying relevant meson spectrum for one of our ensembles. We then discuss the lattice determination of the π 0 and η transition form factors used for the modelling and finite-size correction of our data. Section IV discusses the various model predictions for the integrand at the SU(3) f -symmetric point and confronts these with a selection of our lattice data. Results for the fully-connected class of Wick contractions are presented in Section V, and those for the non-vanishing disconnected class in Section VI. In both cases, the lattice results are compared to the prediction for the exchange of pseudoscalar mesons at the integrand level. The main result of this paper -a hlbl µ at the SU(3) f -symmetric point, Eq. (31) -is obtained in Section VII, which also contains a discussion of how this result will change for physical values of the quark masses. We summarize our findings and conclude in Section VIII, and various technical aspects of the calculation are described in more detail in the appendices (A, B, C).

II. METHODOLOGY
One can compute the light-by-light scattering contribution to the g − 2 of the muon in position space by performing the integrals whereL is a QED kernel and i Π is a spatial moment of the connected Euclidean four-point function in QCD, with j µ (x) the hadronic component of the electromagnetic current The QCD correlation function consists of all the various ways one can contract four vector currents, as shown in Fig. 1, all of which are "disconnected" except for the fully-connected contribution. At the flavor-symmetric point, only the upper two topologies contribute. Away from this point it is expected, from large-N c arguments as well as from numerical evidence by the RBC/UKQCD collaboration [9], that the remaining topologies are suppressed. The number of contractions for each topology is given in Table I. More information on the infinite-volume QED kernelL(x, y) can be found in [17]. We make use of O(4) symmetry to simplify the integral further, where in the starting representation (1), We will often display this integrand and always denote it by f (|y|), even though in practice we employ modified representations of a hlbl µ . One type of modification concerns the kernel L [ρ,σ];µνλ (x, y). Various subtraction terms to the kernel have been proposed to beneficially change the shape of the integrand [9,18] without changing the resulting integral. The importance of performing such subtractions cannot be understated as the unsubtracted kernel is poorly suited for practical lattice simulations due to being too peaked at short distances.
Here we make extensive use of a new subtraction scheme for the QED kernel [18], where Λ is an arbitrary, tuneable, dimensionless free parameter. When Λ = 0 we have theL (2) kernel of [17] and as Λ → ∞ we recover the unsubtractedL (0) . The benefit of such a choice of kernel is that we are able to tune the shape of the integrand to reduce the long-distance effects while still preserving the beneficial properties of short-distance subtractions. An investigation of this kernel with the infinite-volume lepton loop is presented in Appendix B. In this section we will write the formulae for obtaining a hlbl µ in terms of continuous integrals. The lattice, however, is discrete so we can only approximate these integrals with finite sums 1 , where a is the lattice spacing and N µ = L µ /a. In Eq. (2), we set z ρ = L/2 → 0 to accommodate the discontinuity when it changes sign. The integral of f (|y|) with respect to |y| can be performed with the trapezoid rule and in practice we will average over equivalent values of f (|y|) to both increase statistical precision and reduce computational cost. Later on in the analysis we will show results for the partially-integrated value of a µ , with the hope that we will see a plateau at large-enough |y| max , indicating that our integral has saturated. In the following three subsections we will discuss two methods to compute the connected contribution: the first, Method 1, is a direct calculation of the three pairs of connected Wick contractions; the second, Method 2, uses rearrangements of the integrand expressed in terms of only one pair of Wick contraction to make the calculation cheaper. We will then give the methodology used for the calculation of the disconnected contribution, which also uses some rearrangements in the integrand.

A. Connected contribution (Method 1)
The Wick contractions for the connected contribution are shown in Fig. 2. With four local vector currents, the four-point correlation function can be written as µνσλ (x, y, z) + Π (2) µνσλ (x, y, z) + Π µνσλ (x, y, z) , where Z V is the renormalization factor of the vector current 2 and each term represents one diagram in Fig. 2, i.e. a pair of Wick contractions. We will use the Z V values from [19] without the O(a)-improvement terms proportional to quark mass combinations. The value 18/81 is the necessary charge factor for the degenerate light and strange quarks. Writing the Π (1,2,3) s in terms of propagators yields where S(x, y) is the quark propagator with sink at x and source at y and · · · U is the expectation value over gauge configurations. The trace is over both Dirac and color indices. In Method 1, the six contractions are computed explicitly and we choose the direction y/a ∝ (1, 1, 1, 1) 3 . In this method, we first compute the point-to-all propagator S(·, 0) and the six sequential propagators using the fields z [ρ γ σ] S(z, 0) as sources (the anti-symmetrization of the indices ρ and σ is imposed by the symmetries of the QED kernel). Then, for each value of |y| used to sample the integrand in Eq. (5), we compute the point-to-all propagator S(·, y) and the six sequential propagators using the fields z [ρ γ σ] S(z, y) as sources. Therefore, for N evaluations of the integrand, we need 7(N + 1) propagator inversions. In our set-up all currents are local except for the one at x, which will be either local or the point-split, the latter implying a suitable modification of Eq. (11).
In practice, since we are (mostly) using open-boundary conditions in the time direction, the origin is located somewhere near the middle time-slice and randomly distributed in the spatial volume. We also average over the 16 combinations y/a = (±n, ±n, ±n, ±n) to increase statistics.

B. Connected contribution (Method 2)
The idea for Method 2 is simple: we pick a reference diagram that is easiest to compute and use a change of variables in the integrals to relate the other diagrams to this reference. Here, we pick the diagram that does not have a propagator from the origin to y, or from z to x (the leftmost of Fig. 2). Such a choice avoids the extra inversions required for the sequential sources as we discussed in Method 1. Simplistically, for two samples of a single |y| (i.e. +y and −y) we only need two point-to-all propagators. In fact, we can do much better than this if we keep propagators in memory so that they can be reused.
In infinite volume, the integral is equal to the result of Method 1, but the integrand f (|y|) will generally be different. As a result, the systematic effects in Method 2 will be different from Method 1.
We invert point-source propagators along the line (n, n, n, 3n + t min /a), and what we call |y| will be the difference between these points; here the integer n ranges from 0 to N i /2. The closest source position in time (t min ) is chosen to be suitably away from our (usually) open temporal boundary, ideally m π t min > 4 and m π (L t − t max ) > 4. This line of propagator sources was chosen as we typically have a large anisotropy in the temporal direction, allowing us to achieve sizeable values of |y|.
In our implementation, we keep all N i /2 of the propagators in memory and perform the integrals for every possible y and origin, so for N propagator inversions we have N (N − 1) samples distributed among the different values of non-zero |y|. To further boost statistics we will also average other directions that give the same |y|, e.g. y/a = (±n, ±n, ±n, t max /a−3n).

C. Disconnected contribution
The disconnected contribution can be computed from the two-point contraction An important point to note is that Π µν has a vacuum expectation value (VeV) that must be subtracted to ensure that the two "disconnected" quark loops are still connected by gluons.
The benefit of such a method over calculating each separate |y| individually is the quadratic growth of equivalent combinations of |y| that are available as the number of source fields grows. For our 4 3 × 12 example we have 6 values of equivalent |y| per source position. It would be impractical to perform this calculation without exploiting this fact. We find it is beneficial to truncate the x-integral in our set-up to avoid negligible, noisy contributions at large distances. As our kernels are computed on-the-fly this is also useful as a computer-time saving measure as it reduces the number of QED kernel calls. We truncate the integral over x to points that lie within a maximum distance [(r/a) 2 = 81, 81, 121, 169 for our coarsest to finest ensembles respectively] from the origin or from y, while we do not truncate the z-integral. Although these truncations differ in physical volume, the ensembles H101 and H200 were tested for smaller and larger values of (r/a) 2 on a subset of our data and even smaller values of (r/a) 2 than used here, for example 49 and 64 for the ensemble H101, were found to be consistent.
To reiterate, in our full calculation we do not perform the integrals for all possible yvectors, as there are many that we expect to have bad finite-volume or discretization effects.  For instance, in the toy example we could calculate f (|y|) for the y-vector y/a = (0, 0, 0, 6), but this would have significant cut-off effects. Typically, we filter-out about ≈ 80% of the possible y-vectors and keep only (the modulus of) those parallel to (1, 1, 1, 3), (2, 2, 2, 0), and occasionally (1, 1, 1, 1).

III. LATTICE PARAMETERS AND PROPERTIES OF SU(3) f -SYMMETRIC QCD
Calculations have been performed on lattice ensembles provided by the Coordinated Lattice Simulations (CLS) initiative [20], which have been generated using three flavors of non-perturbatively O(a)-improved Wilson-clover fermions and with the tree-level O(a 2 )improved Lüscher-Weisz (Symanzik) gauge action. In particular, we consider only those with SU(3) f -symmetry. On these ensembles, the mass of the octet of light pseudoscalar mesons is approximately 420 MeV. These ensembles are summarized in Table II and in Fig. 4: there are four lattice spacings, as well as two pairs of ensembles that differ only by their volume.  At long distances, the leading contributions to the four-point correlator emanate from the lowest-lying mesons. Clearly, the degenerate π 0 and η-pole contributions dominate at the longest distances. However, at intermediate distances the contributions from heavier states, including resonances in the two-pseudoscalar channel, may also play a significant rôle. Especially since in our calculation at the SU(3) f -point the pseudoscalar mesons are not appreciably light compared to the other mesons. Therefore, we present results from a limited spectroscopic study of the low-lying meson spectrum in all relevant channels with total angular momentum J ≤ 1.
Only charge-conjugation-even (C-even) mesons can couple to two electromagnetic currents and thus contribute in exchange diagrams to the four-point function. However, from the Wick-contraction structure, one can view Method 2 as using charged vector currents (see Appendix A) and it is possible to couple a C-odd state to two of them. Therefore, the integrand for Method 2 can receive contributions from C-odd mesons such as the rho [18]. For that reason, we also inspect the spectrum of C-odd mesons, even if their contributions to a hlbl µ must vanish in the infinite-volume limit once all integrals are performed. Our dedicated spectroscopy calculation is performed on the ensemble H101. This made use of the distillation framework [22] (for the connected hadron two-point functions) and its stochastic formulation [23], which was essential for efficiently computing the disconnected hadron two-point functions needed in the singlet sector. However, we have only used smeared quark bilinears as interpolating operators, and the lack of non-local multimesonlike interpolators means that this calculation should not generally be considered as a robust determination of the spectrum. This analysis precludes the use of finite-volume quantization conditions; therefore, only the approximate locations of resonances can be found, provided that they are narrow. Nevertheless, since we compute diagonal correlators, the effective masses taken at any Euclidean time provide an upper bound on the ground-state energy in a given channel. This observation is particularly useful in the flavor-singlet 0 ++ channel, which turns out to admit a stable ground-state meson. Such a stable scalar meson has been found previously [24] in a lattice calculation at a similar pion mass, though not at an SU(3) f -symmetric point.
Results are shown in Fig. 5 and summarized in Table III. Note that since the signal in the flavor-singlet sector is much worse, the plateau fits had to be done at relatively short time separations, so that the shown uncertainty on the mass may be an underestimate. In addition, the plateau for the flavor-octet scalar is relatively poor, which might be due  Note that in many cases (in particular the octet pseudoscalar and vector channels), the error bar is smaller than the plotted symbol. Fitted masses are shown in the right panel; the outer error bars include an estimate of systematic uncertainty obtained by shifting the plateau fit range. Horizontal dashed lines indicate thresholds at twice and three times the octet pseudoscalar mass.
to its coupling to two octet pseudoscalars in S-wave and the plateau's proximity to the corresponding threshold. In the case of a mis-identified plateau, the true ground-state energy would be lower, since the effective masses shown here are expected to approach the ground-state energy from above.
For the flavor-octet sector, after the π 0 , the next-longest-distance meson-exchange contributions in the integrand for a hlbl µ come from the a 0 and (for the Method 2 integrand) the ρ, both of which sit near 2m π . For the flavor-singlet sector, the three corresponding mesons are also the lightest, although the ordering is different, with the f 0 (or σ) being a bound state with mass near 680 MeV and the η mass sitting higher, somewhere near 2m π . For the disconnected diagrams, which receive the difference between contributions from the exchanges of singlet and octet mesons (see section IV A), the pseudoscalars will provide the longest-distance contribution but we should also expect a significant contribution from scalars. The difference between the octet and singlet vector meson masses is very small; if the same holds for their form factors, then their combined contribution to the integrand will be negligible.
B. Pion-pole and η-pole contributions to a hlbl µ In Ref. [12] a model independent parametrization of the pion TFF has been obtained on the same set of lattice ensembles as used in this work. These results can be used to compute the pion-pole contribution, a hlbl;π 0 µ , at the SU(3) f -symmetric point in the continuum limit. This result will be used in Section VII C to obtain a rough estimate of a hlbl µ at the physical pion mass. More importantly, in Sections V A, V B, and VI, a vector meson dominance (VMD) parametrization of the TFF will be used to estimate the finite-size corrections to our lattice data at the symmetric point. For this purpose, we have performed a new fit to the data from [12] assuming a VMD parametrization, where we have restricted the fit to the singly-virtual case where the model provides a good description of our data. The fit parameters used in this work are collected in Table IV.
In [12], two strategies have been used to extract the pion TFF, and the corresponding results for the pion-pole contribution are displayed in Fig. 6 for two different discretizations of the 3-point correlation function (see [12] for details). In the first strategy, the TFF was computed on each lattice ensemble separately. This allowed us to determine the pion-pole contribution for different values of the pion mass and lattice spacing. The physical value was then obtained by a combined chiral-continuum extrapolation of a hlbl;π 0 µ . We have repeated this analysis but now restrict the fit to only the ensembles included in this study. Here we use the pion mass of the given ensemble (instead of the physical one as done in [12]) in the weight functions that appear in Eq. (74) of [12]. This leads to the result (1) of Fig. 6.
In the second strategy, the pion TFF was directly extrapolated to the physical point using a global fit that includes several ensembles including, and away from, the SU(3) fpoint. From the resulting fit parameters, we can extract the pion TFF in the continuum limit at a pion mass of 420 MeV. Using this result in Eq. (74) of [12], we obtain the grey point of Fig. 6, which we find to be in very good agreement with the first estimate.
The second strategy has the advantage of using an expanded set of ensembles (15 in total) to determine the TFF, at the SU(3) f -point which is to be compared to the physical-pion value Unsurprisingly, we observe a strong dependence on the pion mass. The smaller pion contribution in Eq. (16) compared with (17) is due roughly in equal parts to the heavier pion mass and to the reduced coupling to photons, as can be seen by comparing the entries in Table IV to the physical value, F π 0 γγ (0, 0) ≈ 0.274 GeV −1 . At the SU(3) f -symmetric point, the η is mass-degenerate with the pion, m η 420 MeV, and contributes exactly 1/3 of the π 0 exchange to a hlbl µ , i.e.
A lattice calculation at the physical point for this contribution is not yet available, but the most recent estimate 4 , which comes from Canterbury approximants [5,26], is a hlbl,η,phys µ = 4 An older estimate is a hlbl,η,phys µ = 14.5 × 10 −11 using the VMD model [25].  (1) and (2) are explained in the main text.
16.3 × 10 −11 . This comparison shows that at the physical point the η gives a much larger contribution to a hlbl µ than at the SU(3) f -symmetric point, in spite of being heavier. This can be traced back to its much larger coupling to two photons,

IV. HADRONIC MODELS VS. THE LATTICE INTEGRAND f (|y|)
In subsection IV A, we state the theoretical predictions for the integrand f (|y|) corresponding to the quark-connected diagrams in Method 1 and Method 2, as well as for the disconnected diagram. Then, in subsection IV B, we compare the lattice |y|-integrand obtained by Method 1 for the quark-connected contribution to hadronic model predictions. For this purpose, we will focus on the ensembles N202 and H200. N202 has the largest physical volume (L = 3.08 fm) of all ensembles considered in this work, and H200 (with L = 2.06 fm) differs only by its comparatively smaller volume. Since these only differ by their volume, they allow us to test our understanding of finite-volume effects. Finally, a comparison of the theory predictions for the integrand of the quark-disconnected contributions to the corresponding lattice data is made in subsection IV C.

A. Predictions for the integrand
In order to gain some insight into the various contributions to the quantity a hlbl µ , we will compare predictions for the pseudoscalar exchanges as well as the contribution from a constituent-quark loop, and a charged-pseudoscalar loop, to the lattice data. Here, we provide some details as to how these predictions are obtained. Expressions for the amplitudes i Π ρ;µνλσ calculated with a fermion loop, a charged-pion loop, or with the pseudoscalar exchange can be found in [17,27]. In QCD with exact SU(3) f -symmetry, the η contribution to a hlbl µ is always one third of the contribution of the π 0 . The flavor structure of single-meson exchanges in the different Wick-contraction topologies of four-point functions was discussed in detail in Ref. [28], including the case of N f = 3 QCD. The quark-connected diagrams receive contributions only from flavor-octet mesons, enhanced 5 by a factor of three. This is compensated by the quark-disconnected diagrams, which contain the differences between the flavor-singlet and twice the flavor-octet meson contributions. In the large-N c limit, the singlet and octet contributions to the disconnected diagrams will cancel as their spectra becomes the same. For QCD, this degeneracy is most strongly broken in the pseudoscalar sector, where the octet's mass is far below the singlet's.
To interpret the integrand in Method 2, one needs a mapping of individual quark-level Wick contractions onto meson-exchange diagrams [18] (see Appendix A for a derivation based on partially quenched chiral perturbation theory). It turns out that Π (1) µνσλ (x, y, z) does not contain the meson-exchange diagram in which the π 0 and η propagate between the pair of vertices (0, y) and (x, z). Also, the normalization of the two other (π 0 , η)-exchange diagrams is such that Π conn µνσλ contains the same (π 0 , η) contribution as Π µνσλ , enhanced (in the present SU(3) f case) by the charge factor three.
In addition, we need the mapping of individual quark-level disconnected diagrams onto meson-exchange diagrams. Here it turns out that there is a one-to-one mapping between a given quark-level diagram and a meson-exchange diagram. For instance, take a quark-level diagram, consisting of two quark loops each containing two vectorial vertices; each quark loop thus defines a pair of vertices. Such a diagram is in one-to-one correspondence with the diagram in which the meson is exchanged between the two pairs of vertices. For octet mesons, the latter diagram has a weight of −2 relative to its normalization in the full HLbL amplitude. For singlet mesons, this relative weight factor is simply unity.
The short-distance contribution to a hlbl µ is sometimes modelled by a constituent-quark loop. This corresponds to an effective degree of freedom, and the mass assigned to the 'constituent quark' is typically on the order of 300 MeV [31]. In the following sub-section, we will address the question "to what extent does such a contribution describe the shortdistance part of the |y| integrand?". In this case, the Wick contractions and weight factors of the constituent quark simply correspond to those of the fundamental quark degrees of freedom.
We have computed the charged pion and kaon loop contributions in the framework of scalar QED [17,18]. The contribution of these loops to the set of quark-connected contractions and their contribution to the set of quark-disconnected contractions add up coherently, the latter being twice as large as the former. This result can again be derived in partiallyquenched perturbation theory. While the (π 0 , η) exchange contribution to the full a hlbl µ is three times smaller than its contribution to a conn µ , the charged-pseudoscalar loop contribution is three times larger. The charged-pseudoscalar loop's contribution might seem negligible in comparison to the integrand of the quark-connected, but it need not be negligible in the full integrand. The lattice data use a point-split current at x. The integrand is compared to the prediction for the exchange of the π 0 and η mesons with a VMD transition form factor, which is expected to provide a good approximation to the tail. In addition, an attempt to describe the short-distance contribution with a constituent-quark loop with a quark mass of 350 MeV is made. It is compared to the integrand for the exchange of the (π 0 , η) mesons with a VMD TFF and using the parameters of Table IV. Beyond 1.5 fm, the prediction is consistent with the lattice data, albeit within large relative errors. Between 0.8 fm and 1.4 fm, the lattice data lie noticeably below the pseudoscalar-octet exchange prediction.

B. Lattice connected contribution
At distances up to 0.8 fm, one would certainly not expect the pseudoscalar-octet exchange to saturate the integrand. We have attempted to model the higher-energy contributions using a constituent-quark loop, displayed in Fig. 7. For a constituent-quark mass of 350 MeV, the sum of this contribution and the pseudoscalar-octet exchange provides a good description of the maximum height of the integrand. At distances |y| 0.4 fm, one must expect large cutoff effects on the lattice integrand, as the separation of sources becomes comparable to our lattice spacing. This interpretation is confirmed by comparing the integrand to the one obtained using exclusively the local current, as in the left panel of Fig. 8, where a sizeable difference is visible up to about 0.4 fm. An interesting observation is that upon adding the constituent-quark loop to the pseudoscalar-octet exchange the result does not improve the agreement with the lattice data in the region 0.8 fm < |y| < 1.4 fm. A clear excess remains in the prediction; we currently do not have a compelling explanation for this excess. The exchange of the lightest scalar-octet mesons (a 0 type) would have the right sign to explain the effect, since scalar-meson exchanges are known to contribute negatively to a hlbl µ . In the future, a calculation of the scalar contribution along the lines of [32] would be worthwhile to find out whether it accounts for this missing effect. The charged-pion loop is also expected to contribute negatively and we have studied this contribution in the scalar-QED framework in [18]. We found the integrand to be negligible compared with the π 0 contribution beyond |y| = 0.6 fm, and introducing a vector form factor for the charged pion would likely only reduce the integrand further. It thus remains an open problem to understand the physics underlying the integrand for |y| around 1 fm.
The effect of the finite volume on the lattice integrand is illustrated in Fig. 9. A clear effect is seen between the comparatively 'small' ensemble H200 and the 'large' N202, the former integrand lying below the latter. This finite-volume dependence matches in sign and typical size the volume dependence of the pseudoscalar-octet exchange contribution, as seen in the figure.
C. Lattice disconnected contribution Fig. 12 (later in the text) displays the disconnected integrand for several gauge ensembles. It is compared to the prediction of the (π 0 , η) exchange, including its appropriate weight factor of −2 for the disconnected contribution, both in finite and infinite volume. The finite-volume effect predicted from the (π 0 , η) exchange calculation is very small, and the lattice data from ensembles N202 and H200 confirms this expectation, at least at distances |y| < 0.9 fm, where the statistical errors allow for a meaningful comparison.
The main difference between the disconnected and the connected contributions is that for the disconnected the (π 0 , η) exchange already provides a decent description of the lattice data for |y| below 1 fm; in other words, we do not observe a large short-distance contribution. The η and the σ mesons being singlets, they contribute to the disconnected diagrams with the same weight factor as they would to the full HLbL amplitude [28,30]. In order to estimate the typical size of these heavy-meson contributions, we have computed the η contribution under the following assumptions: The η mass was set to 982 MeV, a value close to the result of our calculation on ensemble H101. Its coupling to two photons, F η γγ , was assumed to be equal to its value at physical quark masses. The virtuality dependence of the transition form factor can be modelled with a VMD ansatz, with vector mass 952 MeV. Under these assumptions, the contribution is positive and sizeable (compared to the (π 0 , η) exchange) up to |y| = 1.5 fm. In addition, we expect a significant contribution from the stable σ meson, whose mass we have found to lie well below the ππ threshold. As a scalar, the σ would contribute negatively and thus compensate to some extent the η contribution. Again, a dedicated calculation in the framework of [32] would certainly be worthwhile. The estimate for the η contribution displayed in Fig. 12 is only meant to be representative of one particular meson-exchange contribution, with other cancellations being expected. The lattice results for the quark-connected contribution to a hlbl µ using Method 1 have been generated along the direction y/a ∝ (1, 1, 1, 1). We have used two different discretizations of the four-point correlation function: the vector current located at the site x is either local or point-split (conserved), while the three other currents are always local. Our results are summarized in Table V and the integrand for the ensemble N202 (also presented in Sec. IV B, Fig. 7) is shown on Fig. 8. The signal-to-noise ratio clearly deteriorates rapidly at large distances and we observe slightly better statistical precision when using a conserved vector current at x. Both discretizations give similar results, suggesting that there are small discretization effects present. In the end we will quote a final result with the fully-local discretization for a direct comparison with the results obtained using Method 2.
Although all the ensembles used in this work satisfy m π L > 4, we still expect significant finite size effects (FSEs) due to the pseudoscalar-pole contribution, which is the dominant contribution in model estimates of hadronic light-by-light scattering [33], being a longrange phenomenon. Even with heavy pions m π ≈ 400 MeV, the tail extends beyond |y| = 2.5 fm [17]. A comparison of the integrand for the ensembles H200 and N202, which only differ by their physical volumes, is depicted on the left panel of Fig. 9. Assuming a VMD model for the TFF, the pseudoscalar contribution can be computed in both finite and infinite volume. For more information on the calculation of the TFF we refer the reader back to Sec. III B; the parameters we use for modelling our data are summarized in Table IV in that section.
To obtain our final estimate, the lattice data are integrated up to |y| cut where the integrand is compatible with zero. For |y| > |y| cut , the tail is aproximated by the pseudoscalarpole contribution in infinite volume. Finally, for |y| < |y| cut , the FSEs are estimated as the difference between the pseudoscalar-pole's contribution computed in finite and infinite volume: A systematic error of 25% is attributed to both the tail extension and the FSE correction.
Since the same VMD model is used in both cases, we treat these corrections as being fully correlated.
The value of a conn µ as a function of |y| cut is shown in the right panel of Fig. 9; we observe a nice plateau for values |y| cut > 1.2 fm, suggesting that our systematic error estimate is quite conservative. Estimates of the finite-size corrections for each ensemble are summarized in Table V with their corresponding values of |y| cut . In particular, we note that the systematic error on the FSEs is always larger than the statistical precision, except for our largest ensemble N202.
For the ensembles U103 and H200, with m π L < 5, we see very large FSE corrections, of We also observe a systematic over-estimate in comparison to the larger-volume results, and when it comes to our final continuum extrapolation we will omit these results.
O(a)-improvement is not implemented for the vector currents used in this work, but our experience with other observables involving electromagnetic currents, such as the LO HVP [34] and the pion TFF [12], suggests the remaining O(a) terms are small compared to the quadratic contribution. A linear fit in a 2 leads to a conn,M1 The results of all these fits are shown in Fig. 10.
We quote our continuum-extrapolated value for the quark-connected contribution to a hlbl where the first error includes both the statistical error and the systematic from the finite-size  correction. The second error is an estimate of the continuum-limit extrapolation systematic error, taken as half the difference between the linear in a 2 and constant fit ansätze.

B. Results from Method 2
In our measurement of the quark-connected contribution to a hlbl µ using Method 2 we focus on Λ = 0.4; this value was already indicated as being beneficial for the lepton loop as discussed in Appendix B. We performed measurements on all ensembles with Λ = 0.0, 0.4, 0.8, and 1.0 and found that Λ = 0.0 approached plateau too slowly and Λ = 1.0 had a significant peak in the integrand at short distances but a more-pronounced negative-valued tail. It appears that Λ = 0.4 is a near-optimal choice for our calculation.
Although not presented here, we also performed the contractions with conserved currents at x and/or z. We found that putting a conserved current at z yields a result roughly comparable (point-by-point) to the determination with just 4 local currents. Having a conserved current at x appears to introduce large, unwanted discretization effects. From a computational standpoint the calculation with four local currents is simpler and has no apparent downside, so that is what we will present from here on.
The left plot of Fig. 11 illustrates the finite-size effect between ensembles N202 and H200, and the discrepancy between these ensembles that differ only by volume is significant. The pion-pole prediction describes the tails of both data-sets reasonably well at large-enough values of |y|, although it completely under-estimates the position and height of the shortdistance peak of the integrand. It is also worth noting how less statistically precise the result of H200 is compared to N202 for a comparable number of measurements; this gives some indication that the statistical precision is linked to either m π L or the physical volume. It is clear that much like for Method 1 there is a significant signal-to-noise problem for large values of |y|. corresponds to the value using lattice data up to some linearly-interpolated value of y = |y| cut , chosen to minimize the total error of a conn µ . Again, a 25% systematic to the total correction is used. In the last column we give the infinite-volume result. We perform the same finite-size correction procedure for Method 2 as we did for Method 1 above. On the right of Fig. 11 we show the stability of performing the FSE correction with varied |y| cut matching point on ensemble N202. We find excellent stability for many different values of |y| cut and the matching point of 2.6 fm was chosen in an attempt to minimize the total error.
If we consider the results of Table VI we see good agreement after finite-size correction between ensembles that only differ by volume (compare U103 with H101, and H200 with N202), which suggests that our finite-size correction procedure is sensible. Unlike for Method 1 we see no reason to exclude these smaller volumes from our final extrapolation. An unusual result in our determination in Method 2 comes from the ensemble N300, which lies below the trend of all our other data points. Since it is the finest ensemble, we have no reason to exclude it, even though this point will reduce the quality of our final extrapolations.
We hold off on presenting the continuum-limit extrapolation here as we will employ a combined extrapolation after the next section (See Sec. VII A, Fig. 14). However, we will quote the result of the connected continuum extrapolation, a conn,M2 µ = 98.9(2.5) × 10 −11 .   The quoted error is a combination of the statistical and 100%-correlated finite-size systematic. We observe that within error this value is in complete agreement with the determination of Method 1.

C. Comparison of the two methods
The appeal of Method 2 for computing the connected contribution is mostly practical: it is computationally far less expensive than Method 1. The saving between the two is roughly an order of magnitude, see Tab. VII for a exemplary comparison of computational cost for one particular ensemble, N202. This is because Method 2 effectively replaces sequential propagator solves by additional, much cheaper, QED kernel evaluations. The downside of using these additional kernels is that their combination tends to broaden the integrand f (|y|). This behavior was seen in the lepton loop study of Appendix B and is also clearly the case with all the lattice QCD data in this work. We can use the parameter Λ of Eq. (7) to partially ameliorate this broadening; the use of such a subtraction kernel appears to be very important specifically for Method 2, as this regulator offers little to no advantage for Method 1.
If we compare the results for the integrand of H200 and N202 using Method 1 with those of Method 2 (Figs. 8 and 11 respectively) we see that the integrand for Method 2 is in general less-peaked at short distances and extends further in |y|. For example, the integrand for N202 using Method 1 is effectively zero around 2 fm, whereas for Method 2 it becomes zero closer to 3 fm. This behavior is reflected in Tables V and VI by larger choices of |y| cut for Method 2 compared to Method 1.
Again comparing Tables V and VI, we see that for both methods the smaller boxes (those with m π L < 5) require a significant finite volume correction. As the data for Method 2 uses the direction (1, 1, 1, 3) we approach the boundary of the lattice (L/2) slowly for increasing |y|, and so the finite volume effect is smaller in comparison to the direction used in Method 1. We do, however, see somewhat larger discretization effects for Method 2 compared to those found in Method 1, perhaps O(15%) opposed to O(10%) at our coarsest lattice spacing respectively. It is quite possible this is due to the Λ-regulator enhancing the integrand at shorter distances. Nevertheless, this is not a significant problem as we have several fine lattice spacings to help determine the continuum limit.
If we were to use Λ = 0.0 with Method 2 the tail would extend even further into the region where finite-volume effects become significant. This is likely still controllable for the symmetric-point ensembles used here as they have large volumes and m π L, but this would become much more problematic for lighter-pion-mass ensembles where the signal is expected to degrade quickly at large distances and the integrand is expected to be even broader.
In the following sections, when we combine the results for the connected and disconnected contributions, we will use the results from Method 2 as our connected contribution. This is because they are statistically more precise while still being consistent with those of Method 1. N202 H200 dir= (1,1,1,3) Conn: π 0 +η inf.vol Disc: π 0 +η inf.vol Disc: π 0 +η L=L H200 dir=(2,2,2,0) Disc: π 0 +η L=L H200 dir= (1,1,1,3) Disc: π 0 +η+η' FIG. 12. Integrand for the disconnected contribution on ensembles H101 and U103 (left), as well as N202 and H200 (right). The lattice data are shown as black points. The black dashed line shows the fully-connected model prediction, the blue line the π 0 + η contribution for the disconnected and the magenta line gives the π 0 + η + η . The blue points show the π 0 + η contribution in finite volume. Table VIII lists the ensembles and statistics used for the computation of the quarkdisconnected contribution. As the smaller ensembles (U103, H101, B450, H200) were considerably cheaper to perform inversions on, their statistics is greatly enhanced. As the lattice volume increases, the cost of propagator inversions increases with some power V n , with n > 1, and this quickly becomes the dominant cost of the computation. The column N Src indicates the number of point-source propagators inverted per gauge configuration to build the grid and the final column indicates the maximum and minimum number of equivalent values of |y| available from the set-up. For the ensembles with open boundary conditions, the number of self-averages, N Equiv , for a given |y| decreases as |y|/a increases. Therefore open temporal boundaries make this calculation much more difficult as the signal degrades rapidly with large |y|/a.

VI. RESULTS FOR THE QUARK-DISCONNECTED CONTRIBUTION
The ensemble B450 has lattice volume 32 3 × 64 and periodic boundary condition in time, so a fully-periodic grid built of the two basis vectors (4, 4, 4, 0) and (2, 2, 2, 4) was used. All of the other determinations had multiples of (1, 1, 1, 3) and (2, 2, 2, 0) |y| directions, so it is possible that B450 might have noticeably different discretization and finite-volume effects as this direction lies in a different lattice irreducible representation of the broken rotation group O(4). However, as we see in Figs. 12 and 13 the short-distance contribution to the integral is small, and so we can assume the same is true of the discretization effects. As we appropriately correct for FSEs with this choice of direction, we do not expect a significant discrepancy compared to the open-boundary data. Upon continuum extrapolation (Fig. 14)  that this ensemble is consistent with the others, indicating that rotation-breaking artifacts are not the main source of discretization effects. The integrand for the disconnected contribution is displayed in Fig. 12 for the value Λ = 0.4; much like for Method 2 we find this value to be preferable. The integrand is compared to the prediction for the exchange of the π 0 and η mesons with a VMD TFF. In addition, the same prediction including an estimate of the η contribution, based on the assumptions in Section IV C, is indicated.
We do not see any statistically significant finite-volume effects in the integrands between the ensembles U103 and H101, and H200 and N202, for |y| < 1 fm. This observation is consistent with the predictions for the (π 0 , η) exchange in finite volume. The central values of the integrand obtained on ensembles with different volumes differ substantially at some larger values of |y|, however this is also in the regime where the signal is rapidly deteriorating, if not lost already. There is a trend in the tail for the larger-volume results to enhance the magnitude of the disconnected contribution, much like what we saw in the connected contribution. We see some enhancement of the integrand compared to the π 0 + η + η prediction at short distances; the likely cause of this is the contribution from scalar mesons.
Clearly, the π 0 and η exchanges already provide a rather good description of the shape of the integrand, unlike in the case of the connected contribution (e.g. Figs. 8 and 11). At the same time, the loss of the signal beyond 1.2 fm means that we cannot, at this point, confirm the validity of the (π 0 , η) exchange at long distances. Given the rapid degradation in signal of the disconnected data, probing large distances of the disconnected contribution will be a very challenging undertaking. There are however good reasons to believe that this description should apply in that regime, and in the following we assume this to be the case.
Table IX summarizes our results. We perform the FSE matching at a single linearlyinterpolated point of |y| cut = 1.2 fm. This point appears to be where we start losing signal for most of our ensembles, and so a significant proportion of the tail of the integrand has to be modelled. We take solace in the fact that the model appears to describe our data well even at distances far shorter than 1.2 fm, as can be seen in Figs. 12 and 13. By far the largest part of the correction comes from modelling the tail with the (π 0 , η) exchange. This correction is of the order of 100% of the lattice-determined contribution. We have also computed an estimate of the η exchange contribution as described in section IV C. The values in Table IX show that its contribution to the tail, |y| > 1.2 fm, is much smaller. Its magnitude is covered by the systematic uncertainty we assign to the modelling of the tail. We have chosen to include the estimated contribution of the η to the tail in the central value of a disc µ . Anticipating the combined analysis presented in section VII A, the continuum-extrapolated disconnected value we obtain from a constrained-slope fit to both the connected Method 2 data and the disconnected data is We observe that this result amounts to be about (−1/3) of the connected contribution.

VII. THE TOTAL a hlbl µ
The main purpose of this section is to describe how we arrive at our final result for the total a hlbl µ (subsection VII A), in which the systematics of the continuum extrapolation are discussed in detail. The following sub-section VII B presents a study of the dependence of our result on the muon mass, and finally, subsection VII C discusses what outcome one may expect for a hlbl µ at physical quark masses based on our findings.

Combining the connected and disconnected contributions
In order to obtain the final result for a hlbl µ = a conn µ + a disc µ , we combine the disconnected data with the connected data of Method 2, as it is consistent with, yet statistically more precise than that of Method 1. We then perform two analyses on this combined data set. The full set of data for the two contributions and their sum is shown in Fig. 14, along with fits and results from the first analysis whose description follows. Both the connected and disconnected data show a negative slope in a 2 despite the two contributions having opposite signs. This indicates that the leading discretization effects do not arise from a common multiplicative effect such as the renormalization factor.
The first analysis consists in simultaneously extrapolating both the connected and the disconnected contributions to the continuum with a slope in a 2 constrained to be equal for the two contributions, before summing the two constant parameters of the fit to obtain a hlbl µ . In this procedure, the results for the connected contribution and the disconnected contribution are both corrected for finite-size effects and for the extension of the |y|-integrand. The statistical errors are obtained under the bootstrap, and a correlated gaussian sample with appropriate width is propagated for the systematic error due to the correction. The systematic error of the correction, which is set to 25% of its size, is treated as being fully anti-correlated between the connected and the disconnected contribution (recall that the two contributions have opposite sign). The fit ansatz for the two continuum extrapolations is a polynomial of degree one in a 2 . For this analysis we obtain the result, a hlbl µ = 65.4(4.9) × 10 −11 .
In the second analysis method, the connected and disconnected contributions are summed prior to performing the continuum extrapolation. This extrapolation is then performed linearly in a 2 . In this analysis, the correction applied to the lattice data is split into two parts, (a) a FSE µ , the pure finite-size correction on the integrand, and (b) a tail µ , the extension of the integrand beyond some |y| cut . In combining the connected and disconnected contributions, each part is treated as being fully anti-correlated between the connected and the disconnected contribution; however, in contrast with the first analysis, the correlation between the systematic errors of the two parts is considered to be zero. The alternative point of view adopted here is that predicting the tail of the infinite-volume integrand is a rather different exercise from predicting the finite-size effect on the finite-volume integrand; one of the two predictions could be successful while the other is not. The result of this procedure is a hlbl µ = 64.5(6.7) × 10 −11 .
The two analyses produce compatible values for a hlbl µ but the first is more precise and allows more flexibility since the two contributions are treated separately. Therefore, we will use the first analysis for the final result and study variations on it to estimate the systematic uncertainty.

Continuum extrapolation systematics
Our data are noisy and it is difficult to find a fit that describes them perfectly, so we identify several different choices of continuum extrapolation to investigate the spread and provide an associated continuum-extrapolation systematic. We note that (also expressed in Sec. V A) as we are not using the O(a)-improved vector currents it is possible that we have a term linear in the lattice spacing. We consider the two fit forms: with various cuts to the data, constraints on A and B, and choices of n and m as listed in Table X. We then add the distributions a hlbl µ = a conn µ (0) + a disc µ (0) to obtain the results shown in Fig. 15.  It is clear that any time we omit the coarse data the fit wants to flatten the slope. This is due to the anomalously low result of N300. When we do not include the coarse ensembles the error increases substantially, which is fairly indicative that the fit is struggling to accurately model the data. If we perform a fit linear in a the central value moves up, as was also the case for the Method 1 continuum extrapolation. We choose to quote the linear in a 2 fit to all the data with a constrained slope as our final result as it has the best χ 2 /d.o.f. = 2. For the continuum-extrapolation systematic, we use the lower error bound of the largest fit result and the upper bound of our lowest fit result. It is clear that constraining the slope or letting it vary does little to the position of the central value apart from reducing the error, this suggests that the fit is having a hard time accurately determining the slope with the quality of the data we have at present.

B. The dependence of our results on the muon mass
Since the mesons at the SU(3) f -symmetric point can be viewed as 'heavy' degrees of freedom relative to the muon, we would expect a hlbl µ to be roughly proportional to m 2 µ . Here, we will study what happens if we re-scale the muon mass on one of our ensembles. We are motivated to do so by our experience on related projects [34] whereby adjusting the muon mass by some dimensionless ratio can flatten the approach to the chiral limit. Here we investigate this idea by defining two quantities, and,ã The first quantity rescales the integrated result by the lattice-determined pion decay constant divided by the value in continuum, squared. The second quantity re-scales the muon mass used as input in our determination by this ratio. These two definitions would be comparable if a hlbl µ scales as m 2 µ , which is to be expected in the heavy-quark limit. Fig. 16 illustrates the effect of these re-scaling procedures on the connected contribution to a hlbl µ on one of our coarsest and largest boxes, H101 (results are from Method 2 with Λ = 0.4). It is clear that these two prescriptions are equivalent within error, which suggests that any change in the muon mass leads to a quadratic change in the integrated result. We can also use this analysis to very naively infer how much we expect the result to grow as we approach the chiral limit, and it appears that for the connected contribution this could be of the order of a 25% increase. In this subsection, we first quantify the contributions to a hlbl µ at the SU(3) f -symmetric point not coming from the light pseudoscalars. These contributions are expected to have only a modest quark-mass dependence, and they are also the hardest to determine quantitatively by (experimental-)data-driven methods. Therefore any information on these contributions is worthwhile collecting. By using our determination of this contribution together with our previous calculation of the π 0 exchange, we can arrive at an estimate for a hlbl µ at physical quark masses.
We begin by noting that, subtracting the π 0 and η contributions (respectively Eqs. (16) and (18) In particular, this contribution accounts for 57% of the total a hlbl µ , and we have added all statistical and systematic errors in quadrature. Next, we may try to roughly estimate a hlbl µ at the physical point. As the (u, d) quark masses are lowered to their physical values at fixed trace of the quark mass matrix, it is the pion whose mass changes by the largest factor: it becomes a factor of three lighter. Since we have an evaluation of the π 0 exchange contribution at the SU(3) f -symmetric point and at the physical point (see Eqs. (16)(17)), we can correct for this effect, One can think of Eq. (30) as a rough estimate of a hlbl µ at the physical point, purely based on our lattice QCD results and the assumption of a negligible quark-mass dependence of the non-π 0 -exchange contributions. As argued in the next paragraph, we expect such an estimate to hold at the 20% level. It is well in line with the most recent evaluations [5].
In order to assess the systematic uncertainty of such an estimate, we perform a slightly more sophisticated method to correct for the quark-mass dependence of the η contribution and the charged pion loop. Within the scalar QED framework, we find −6.3 × 10 −11 for the pion loop, to be doubled to include the kaon loop, and we expect a factor of two to three reduction if one includes an electromagnetic form factor for the pseudoscalar. Therefore, further subtracting a hlbl,(π ± ,K ± ),SU(3) f µ ≈ 2 × (−3.0) × 10 −11 from Eq. (29), we obtain (43.4 ± 9.3) × 10 −11 . This number represents our estimate of the non-pseudo-Goldstone contributions at the SU(3) f -symmetric point 6 . We observe that by neglecting the quarkmass dependence of this contribution, and using the dispersive π 0 -exchange [5,35] and the Canberbury-approximant η-exchange [5,26] results for a hlbl,π 0 +η,phys µ = 79.3 × 10 −11 and the box contribution [5,36]

VIII. CONCLUSIONS
In this work we have computed the hadronic light-by-light contribution to the g − 2 of the muon using lattice QCD at the SU(3) f -symmetric point with m π = m K ≈ 420 MeV. We chose to initially work at the symmetric point for several reasons: Due to the significantly reduced computational cost (as compared to simulations at physical quark masses), we are able to control all known sources of systematic error, in particular the finite-size and finite lattice spacing effects. Second, the SU(3) f -symmetry implies that only two out of five quark-contraction topologies contribute, and it simplifies the hadronic models with which the integrand can be compared and interpreted. For instance, the η-exchange contribution simply amounts to one third of the pion-exchange. Third, the overall contribution from states beyond the light pseudoscalars is not expected to be strongly quark-mass dependent, so that the present calculation already constrains its size.
In order to help interpret our results for a hlbl µ , we have performed an exploratory study of the low-lying meson spectrum at the SU(3) f point. The most remarkable feature is the existence of a stable singlet J P C = 0 ++ meson with a mass of about 680 MeV. Also, our previous calculation of the pion transition form factor [12] allows us to quantify the contributions of the π 0 and η exchanges.
Our strategy to calculate a hlbl µ relies on coordinate-space perturbation theory, for which muon and photon propagators are computed in infinite volume. We have presented the integrand for the final, one-dimensional integral over the distance |y| of a quark-photon vertex from one of the other two internal vertices, since it contains more information than the final a hlbl µ value. For the quark-connected contribution we have identified two methods of calculation that we call Method 1 and Method 2. The former amounts to a direct computation of the three connected diagrams, but it is a numerically costly approach, as it involves many sequential-propagator calculations. To make this computation much cheaper, we have utilized several changes of variables and translational invariance to rewrite the integral in terms of an easy-to-calculate single diagram and a combination of different kernels; this we call Method 2.
For a single |y|-value, Method 2 requires only two propagator inversions, at the meagre cost of a more-complicated QED-kernel calculation. Method 2 has another computational advantage over Method 1 in that it allows one to store a set of propagators in memory and perform their integrals, redefining the origin to be each propagator source; thereby, for N propagators we can compute N (N − 1) non-zero samples of f (|y|). If the source points are evenly spaced and the volume is periodic this amounts to N self-averages per |y|. Such self-averaging is crucial in reducing the cost of the calculation.
We note that the combination of kernels needed for Method 2 broadens the integrand in |y| significantly. To counteract this effect, we have introduced a new class of subtracted kernels with a gaussian-regulator Λ. This parameter Λ effectively allows us to tune the shape of the integrand to peak at shorter-distances. We find that a value of Λ = 0.4 suits our purposes quite well and allows for the integral of the connected lattice data to saturate at reasonably short distances of about 2 fm.
We have handled the disconnected contribution by introducing a sparse sub-grid of equidistant point sources, the idea being that we obtain a large number of self-averages from treating our origin as each point on the grid. Such a technique is necessary as this contribution is extremely noisy and suffers from a significant signal-to-noise problem at large distances. We note that potentially millions of averages are needed in order to get good control over errors at distances of order 2 fm.
In our calculation, we have seen that finite-size effects are significant, and having a good theoretical understanding of the tail of the integrand is very important. For the quarkconnected contribution, we have approximated the π 0 and η meson exchange contribution to the integrand using a vector-dominance transition form factor and compared it to the lattice data; only at fairly large |y| does the prediction quantitatively represent the integrand. We therefore attempt to conservatively incorporate as much lattice data as possible before making contact with the model. For the disconnected contribution, the model does a satisfactory job of describing the data over the entire range of |y| where we have signal, and we therefore match on to the prediction at shorter distances, where we still have control over the statistical errors of the lattice data.
We have shown that both the disconnected and connected contributions have a nonnegligible discretization effect within our measured precision. It appears to be of the same sign and comparable magnitude for both contributions, but ultimately this extrapolation appears well under control. We decide to quote a result from a fit linear in a 2 with a constrained slope to the data of Method 2 and the disconnected as, a hlbl µ = (65.4 ± 4.9 ± 6.6) × 10 −11 , at the SU (3) flavor-symmetric point, where the first error results from the uncertainties on the individual gauge ensembles, and the second is the systematic error of the continuum extrapolation.
We have discussed in subsection VII C how we expect our result for a hlbl µ to evolve as the up and down quark masses are lowered towards their physical values at fixed trace of the quark mass matrix. Correcting for the increase in the π 0 exchange contribution 7 using our previous lattice calculation [12], we arrive at a value (Eq. (30)) which is very well in line with the most recent phenomenological [5] and lattice QCD results [16]. This value is quite stable under varying the assumptions about the quark-mass dependence of heavier-state contributions.
In order to reduce the systematic uncertainty of a hlbl µ at physical quark masses using lattice QCD, obviously simulations at lighter quark masses are needed. The methods we have developed to correct for finite-size effects and to extend the tail of the |y|-integrand based on the π 0 exchange will be extremely valuable in this endeavor. While we have reached a semi-quantitative understanding of the integrand in terms of hadronic models, further work is needed on the theory side to bring this description to a fully quantitative level.
Appendix A: Pseudoscalar-meson-exchange-channel matching in Method 2 and the quark-disconnected contribution In this work, when we compute the quark-connected diagrams using Method 2 and the quark-disconnected diagrams, only a subset of Feynman diagrams are considered due to the contributing Wick-contractions at QCD level. In order to study the FSE in an effective field theory framework, one has to map these contractions to the corresponding Feynman diagrams in the effective theory. Our study of the FSE is based on pseudoscalar-meson (PS) exchange from Chiral Perturbation Theory. The matching between the QCD Feynman diagrams and the PS-meson-exchange channels can be done in various ways. Here, we present a determination of the matching using Partially-Quenched QCD (PQQCD) and Partially-Quenched Chiral Perturbation Theory (PQChPT). There have been many applications of PQQCD in the Lattice QCD community (see e.g. [41][42][43]). In particular, it can serve as a tool to give an estimate of the size of the contribution coming from quark-disconnected diagrams compared to the connected ones for some observables [44]. Here we do not intend to detail the formalism, but only to give some arguments to explain how we reach the mappings explained in our methodology.
PQQCD is a theory with a graded Lie-group SU(N |M ) as symmetry group. In this theory there are N − M sea quarks and M valence (quenched) quarks with their ghost counterparts. The presence of these quenched quarks does not change the partition function from the un-quenched theory with N − M dynamical quarks. Therefore, one can formulate each of the different Wick contractions needed for computing an n-point function (in an un-quenched theory) in a partially-quenched theory by adding certain number of quenched quarks. Then, one can study the long-distance behavior of the observable using PQChPT, which is the corresponding effective field theory to PQQCD.
With only three flavors (as is the case in this work), one cannot build a four-point function that requires only quark-connected diagrams, because each quark line would require a different flavor in order to avoid disconnected diagrams. However, such a four-point function can be constructed with an additional quenched quark flavor, i.e. SU(4|1) PQQCD. For instance, if we introduce a quenched quark r and its ghostr with the same mass as the ones originally present in the SU(3) f theory, namely u, d and s, in the computation of the four-point function ūγ µ d(x)dγ ν s(y)sγ σ r(z)rγ σ u(0) + h.c. , only quark-connected diagrams are involved. Eq. (A1) is exactly the four-point function Π (1) µνσλ (x, y, z, 0) that is used for the connected contribution in Method 2. Moreover, one can study the quark-disconnected diagrams that we consider in this work within the same partially quenched theory. For instance, ūγ µ d(x)dγ ν u(y)sγ σ r(z)rγ σ s(0) + h.c. (A2) only contains the disconnected diagram where (x, y) and (z, 0) are connected by quark lines.
To determine the relevent PS-meson-exchange channel, one can consider the Wess-Zumino-Witten term [45,46] in PQChPT at leading order in the decay constant. Eq. (A1) can be understood as a four-point vector current correlation function. A general procedure is to include all the possible super traces of operators [47]. However, in the case of the coupling to external charged vector currents, which are the Noether currents of the symmetry group SU(N |M ) whose generators are super-traceless, the only allowed non-vanishing term is proportional to where φ is the Goldstone boson/ghost field, F µν is the field strength of the external vector current and the T a 's are the generators of the group. As a result, in momentum-space, the PS-meson exchange in the s-channel of the two-vector-currents-to-two-vector-currents where G is the scalar field propagator , and g ab = 2str(T a T b ).
The matching between different contractions and the PS-exchange channels based on this computation is shown in Fig. 17. With the charge factors included, one will recover the ratio between the total quark-disconnected contribution and the total quark-connected given in Ref. [28], based on charge factor arguments.

Appendix B: The new subtraction with the lepton-loop
In this section we investigate the new subtraction for both Methods 1 and 2 using the infinite volume lepton-loop. A comparison of previous results with our kernel to the leptonloop can be found in [17,27]. The lepton-loop result can be determined easily by standard numerical integration techniques, the results of which can be used to inform our choices of kernel parameters in the full QCD calculation.
For both figures (18 and 19) we set the lepton mass equal to that of the muon. Fig. 18 shows the results for Method 1. There is little difference between the choices of Λ for this calculation and they all saturate the integral reasonably quickly. The result when Λ = ∞ is too peaked at short distances to make this a useful kernel for the lattice computation. All of the peak positions for the integrands are roughly at the same (small) value of |y|.
In Fig. 19 we show the results for Method 2. We see that the usual kernel (Λ = ∞) is very peaked at short distances, this will make it unsuitable for the lattice calculation. We also note that Λ = 0 gives a broad integrand that very slowly approaches the exact result, this also makes it somewhat unsuitable for a lattice calculation unless a large physical volume and large values of |y| are available.
The choice of Λ = 0.4 has a slightly more pronounced peak at small |y| than Λ = 0 and it goes to zero at large distances rapidly. There is a small negative contribution to the integral at large |y|, but in the full QCD case can be corrected for by modelling the tail, as Λ increases this negative correction grows. If our lepton-loop result translates to full QCD this choice of Λ should allow for most of our integral to be encompassed by the lattice volume. seem usable for Method 1. It is interesting to note that Method 2 actively shifts the peak of the integrand to typically intermediate distances. So, with a careful choice of the parameter Λ we can tune the kernel to peak in a region where the calculation is not so sensitive to either of the lattice cut-offs of lattice spacing and volume.
Appendix C: Kernel independence of a conn µ and a disc µ The validity of the subtractions applied to the QED kernel originates from the Ward identity (current conservation) satisfied by the electromagnetic current. Thus, one will expect a kernel-independent result at the level of the total result of a hlbl µ . In this paper, we have treated the contribution from the connected diagrams (a conn µ ) and the contribution from the disconnected diagrams (a disc) µ ) separately. One natural question would be whether these two quantities themselves are also kernel-independent. The answer to this question is yes, due to the Ward identity satisfied by the vector current. We will give a proof of this statement in the SU(3) f case. In Euclidean space-time, if an infinitesimal transformation δ of the fields does not change the path integral measure, we have where S is the QCD action and O is any operator. Without changing the action, we can consider a partially-quenched theory by introducing a quenched quark r and its ghost partner r (see Appendix A). If we consider an infinitesimal transformation generated by the local SU(2) f symmetry of the quark pair (u, d) and their conjugates : we have the following expression for the variation of the action related to the corresponding Noether current δ S = i (x 0 )∂ λ (ūγ λ d +dγ λ u)(x 0 ).
Plugging in everything in Eq. (C1), we have where Π µν (x, x 0 )Π νσ (y, z) gives one of the contractions needed for the disconnected computation. One can then easily show that all the disconnected diagrams have similar currentconservation property. Consequently, the subtractions that we apply to the QED-kernel do not change the result on the contribution from the disconnected diagrams. It follows directly that the connected contribution is also kernel-independent, because a conn µ = a hlbl µ − a disc µ in our SU(3) f case. One can further show that a conn µ is also itself kernel-independent in general flavor cases.