Frequency-splitting estimators of single-propagator traces

Single-propagator traces are the most elementary fermion Wick contractions which occur in numerical lattice QCD, and are usually computed by introducing random-noise estimators to profit from volume averaging. The additional contribution to the variance induced by the random noise is typically orders of magnitude larger than the one due to the gauge field. We propose a new family of stochastic estimators of single-propagator traces built upon a frequency splitting combined with a hopping expansion of the quark propagator, and test their efficiency in two-flavour QCD with pions as light as 190 MeV. Depending on the fermion bilinear considered, the cost of computing these diagrams is reduced by one to two orders of magnitude or more with respect to standard random-noise estimators. As two concrete examples of physics applications, we compute the disconnected contributions to correlation functions of two vector currents in the isosinglet omega channel and to the hadronic vacuum polarization relevant for the muon anomalous magnetic moment. In both cases, estimators with variances dominated by the gauge noise are computed with a modest numerical effort. Theory suggests large gains for disconnected three and higher point correlation functions as well. The frequency-splitting estimators and their split-even components are directly applicable to the newly proposed multi-level integration in the presence of fermions.


Introduction
Disconnected fermion Wick contractions contribute to many physics processes at the forefront of research in particle and nuclear physics: the hadronic contribution to the muon anomalous magnetic moment, K → ππ decays, nucleon form factors, quantum electrodynamics and strong isospin-breaking contributions to hadronic matrix elements, η propagator to name a few. When computed numerically in lattice Quantum Chromodynamics (QCD) and if the distances between the disconnected pieces are large, their variances are dominated by the vacuum contribution. The latter are well approximated by the product of variances of the connected sub-diagrams the contractions are made of. The recently-proposed multi-level Monte Carlo integration in the presence of fermions [1,2] is particularly appealing for computing disconnected contractions, since the various sub-diagrams can be computed (essentially) independently from each other, thus making the scaling of the statistical error with the cost of the simulation much more favorable with respect to the standard Monte Carlo integration.
The simplest examples of this kind are the disconnected Wick contractions of fermion bilinear two-point correlation functions, where each single-propagator trace is usually computed by introducing random-noise estimators [3][4][5]. As the action of the auxiliary fields is already factorized, the multi-level integration in the gauge field becomes highly profitable once the variance of each connected sub-diagram is driven by its intrinsic gauge noise. The random-noise contribution, however, is typically orders of magnitude larger than the one due to the gauge field, a fact which calls for more efficient estimators in order to avoid the need of averaging over many random-noise fields with large computational cost.
The aim of this paper is to fill this gap by introducing a new family of stochastic estimators of single-propagator traces which combine the newly introduced split-even estimators with a frequency splitting and a hopping expansion of the quark propagator. We test their efficiency by simulating two-flavour QCD with pions as light as 190 MeV. As a result, depending on the fermion bilinear considered, the cost of computing singlepropagator traces is reduced by one to two orders of magnitude or more with respect to the computational needs for standard random-noise estimators. The frequency-splitting estimators can be straightforwardly implemented in any standard Monte Carlo computation of disconnected Wick contractions, as well as directly combined with the newly proposed multi-level integration in the presence of fermions.
In the next section we summarize basic facts about variances of generic disconnected Wick contractions, while those of single-propagator traces are discussed in section 3. The following section is dedicated to introduce stochastic estimators of single-propagator traces of heavy quarks based on a hopping expansion of the propagator, while in section 5 we introduce the split-even estimators for the difference of two single-propagator traces also relevant for the muon anomalous magnetic moment. The frequency-splitting estimators are introduced in section 6, where also the outcomes of their numerical tests are reported. In section 7 we discuss the impact of these findings on two concrete ex-amples of physics applications: the disconnected contributions to the correlator of two electromagnetic currents in the isospin limit relevant for the hadronic contribution to the muon anomalous magnetic moment, and the propagator of the ω vector meson. The paper ends with a short section of conclusions and outlook, followed by some appendices where some useful notation and formulas are collected.

Variances of disconnected Wick contractions
The connected correlation function of a generic disconnected Wick contraction, made of two sub-diagrams 1 W 0 (0) and W 1 (x) centered at the origin and in x respectively, can be written as with its variance given by For large distances |x|, and analogously for σ 2 C W 1 , and the dots stand for exponentially sub-leading effects. If the gauge fields in the regions centered at the origin and in x are updated independently in the course of a multi-level Monte Carlo, e.g. Ref. [1,2], the statistical error of each of the two sub-diagrams C W 0 and C W 1 decreases (essentially) proportionally to the inverse of the square root of the cost of the simulation. The overall statistical error on C W 1 W 0 thus scales with the inverse of the cost rather than with its square root. The above argument can be iterated straightforwardly to multi-disconnected contractions. Maybe the simplest example of this kind is a disconnected Wick contraction of the correlator of two bilinear operators for which, following Eq. (2.3), the variance is well approximated by the product of variances of two single-propagator trace estimators.

Single-propagator traces
The single traces we are interested in are where D mr is the massive Dirac operator with bare quark mass m r (for definiteness we adopt the O(a)-improved Wilson-Dirac operator, see Appendix A), a is the lattice spacing, the factor is chosen so that t Γ,r (x) is real, and σ µν = i 2 [γ µ , γ ν ]. We are interested in the zero three-momentum field 2t whose expectation value is where ψ r is a quark flavour of mass m r , and L 3 is the three-dimensional lattice volume. The variance oft Γ,r (x 0 ), can be written as the subscript c stands as usual for connected, and ψ r is a second flavour 3 of mass m r = m r . The operator product expansion would predict generically that σ 2 t Γ,r diverges as a −3 . There are exceptions, however, depending on the symmetries preserved by the regularization and on the operator implemented 4 . Moreover σ 2 t Γ,r vanishes in the freetheory limit g 0 → 0, and the first non-zero contribution appears at O(g 4 0 ) or higher in perturbation theory.

Random-noise estimator
We introduce random auxiliary fields (random sources) [3,5] defined so that all their cumulants are null with the exception of the two-point functions which satisfy  Table 1: Overview of the ensembles and statistics used in this study. We give the label, the spatial extent of the lattice, the hopping parameter κ, the number of MDUs simulated after thermalization, the number of independent configurations selected N cfg , the pion mass M π , and the product M π L. For F7, N cfg = 100 configurations have been used for estimating the variances while the final results for the two-point functions have been obtained with N cfg = 1200.
where a, b and γ, δ are colour and spin indices respectively, and x, y are lattice coordinates. By using Eq. (3.8), it is straightforward to prove that a random-noise estimator of s Γ,r is given by where η i are N s independent sources (colour and spin indices omitted from now on).
The variance of the zero-momentum estimator where again ψ r and ψ r are two degenerate flavours of mass m r , and to simplify the notation we have introduced the usual definition P rs = O γ 5 ,rs for the pseudoscalar density (no time-dilution is used since we are interested in the estimator at all times). The random-noise contribution to the variances in Eq. (3.11) diverges proportionally to a −3 like the gauge one. Both integrated correlators on the r.h.s. of Eq. (3.11), however, are colour enhanced with respect to the gauge noise and they are of O(1) in the free theory, see Appendix B. The Γ-dependent contribution is indeed the flavour-connected counterpart of the disconnected contraction appearing in Eq. (3.6). The Γ-independent term P P , which is also integrated over the time-coordinate, diverges proportionally to m −1 r when m r → 0 due to the pion pole, giving large contributions to the stochastic variances of all bilinears indistinctly. It is interesting to notice that if we would not take the real part in Eq. (3.9), the variances would be larger and Γ-independent since the O Γ,rr O Γ,r r contributions are dropped, and the prefactor 1/(2N s ) goes into 1/N s .
The random-noise contributions to the variances of the standard stochastic estimators in Eq. (3.9) are thus expected to be much larger than the gauge-noise with large ultraviolet and infrared divergent terms.

Numerical tests
To test the efficiency of the various stochastic trace estimators considered in this paper, we have simulated QCD with two dynamical flavours discretized by the Wilson gluonic action and the O(a)-improved Wilson-Dirac operator as defined in Appendix A. The details of the ensembles of configurations considered, all generated by the CLS community [7][8][9], are listed in Table 1. The bare coupling constant is always fixed so that β = 6/g 2 0 = 5.3, corresponding to a spacing of a = 0.065 fm. All lattices have a size of 2L × L 3 , periodic boundary conditions for gluons, (anti-) periodic boundary conditions in (time) space directions for fermions, and spatial dimensions always large enough so that M π L ≥ 4. The pion mass ranges from 190 MeV to 440 MeV. We have always skipped an enough number of molecular dynamics units (MDU) between two consecutive measurements so that gauge-field configurations can be considered as independent in the statistical analysis, see Ref. [9,10] and references therein for more details.
The first primary observables that we have computed are the estimators in Eq. (3.10) with Gaussian random noise. Their variances are shown in Fig. 1 as a function of the number of random-noise sources N s for the ensemble F7. Data for the E5 and the G8 lattices show the same qualitative behaviour. Variances go down linearly in 1/N s until the random-noise contribution becomes negligible, see Eq. (3.11), after which a plateau corresponds to the gauge noise (dashed lines). The first clear message from the data is that the random-noise contribution to the variances is comparable for the various bilinears, as suggested by Eq. (3.11), and it is orders of magnitude larger than the gauge noise. Moreover, the latter can vary by orders of magnitude among the various bilinears, see section 6, with the densities having the largest gauge noise while the currents the smallest one.

Hopping expansion of single-propagator traces
To investigate the contribution to trace variances from high-frequency modes of the quark propagator, we first consider single-propagator traces of heavy quarks. In this kinematic regime the hopping expansion (HPE) is known to lead to a significant reduction of the random-noise contribution to trace variances [11][12][13]. For the O(a)-improved Wilson-Dirac operator, it is natural to exploit the even-odd decomposition to generalize the hopping parameter expansion to and the subscript m has been omitted in the block matrices of the even-odd decomposition of the Dirac operator, see Appendix A for further details. The zero three-momentum single-propagator traces in Eq. (3.3) can thus be decomposed as collects the first 2n contributions of the HPE whilē is the remainder. Notice that convergence of the expansion is not required for Eq. (4.3) to be valid. For small n,t M Γ,r can be computed exactly and efficiently with 24 n 4 applications of M 2n,mr , see Appendix C for more details. The second contributiont R Γ,r can be replaced by the noisy estimator stochastic variances of the remainderτ R Γ,r are between one and two orders of magnitude smaller than those of the standard estimatorsτ Γ,r . For n = 4 a further reduction of approximately 4 to 8, depending on the bilinear, is obtained. If we had defined the estimator of the remainder by applying H 2n mr to one source only, the variance in the free case would increase approximately by a factor 2 or so. The ultraviolet filtering of H n mr on both random sources is thus beneficial with respect to applying H 2n mr to one source only.

Numerical tests
We have computed the single-propagator trace estimatorsτ Γ,r ,t M Γ,r andτ R Γ,r for n = 2 on all ensembles listed in Table 1 for several valence quark masses. For F7 and for the subtracted bare quark mass am q,r = 0.3, the variances are shown in Fig. 2 for the pseudoscalar density and for a spatial component of the vector current respectively. Similar results are obtained for other bilinears and/or for the E5 and the G8 lattices. The variances are in the same ballpark as the free-theory values. A clear picture emerges: the bulk of the random-noise contribution to σ 2 τ Γ,r is due to M 2n,mr for all bilinears. Once the latter is subtracted from the propagator and its contribution tot Γ,r is computed exactly, the random noise is reduced by approximately one order of magnitude or more. Notice that σ 2  function of the valence bare subtracted quark mass am q,r for the pseudoscalar density and the spatial component of the vector current. As expected the variance reduction due to the subtraction of M 2n,mr gets larger and larger at heavier quark masses. In particular at am q,r = 0.3 the variance of the remainder is approximately one order of magnitude smaller than at the sea quark mass value of am q,r = 0.00207. It is worth noting that even at this light mass, the random-noise contribution to σ 2 τ Γ,r from M 2n,mr is still significant for all bilinears. The variance reduction due to HPE, however, is only a factor 2 or so.
All in all data suggest that at heavy masses an efficient estimator of s Γ,r is obtained by computingt M Γ,r exactly and the remainder via the stochastic estimatorτ R Γ,r . Which is the optimal order n and how many random sources N s are required for the remainder depend on the bilinear considered and on the final target observable of interest, see section 6.

Differences of single-propagator traces
To analyse the contribution to trace variances from low-frequency modes of the quark propagator, we consider the difference of two single-propagator traces with different masses. It is worth noting, however, that often the difference itself is a sub-diagram of the correlator of interest, e.g. the disconnected contribution to the hadronic vacuum polarization from the up, down and strange quarks in the exact isospin limit. The estimator of the difference of two single-propagator traces reads where m r = m s . Its expectation value can be written as where, to simplify the notation, we have introduced the usual notation S rs = O I,rs for the scalar density. If we define the zero three-momentum field as its variance is given by where two extra valence fermions ψ r and ψ s , with masses m r = m r and m s = m s respectively, are introduced if not already present in the theory. This time the operator product expansion generically predicts that σ 2 t Γ,rs diverges as a −1 , i.e. two powers less than in Eq. (3.6) thanks to the presence of the squared-mass difference in the prefactor. Analogously to section 3, there are exceptions depending on the symmetries preserved by the regularization and on the discretization chosen for the operator, and σ 2 t Γ,rs vanishes in the free-theory limit with the first non-zero contribution appearing at O(g 4 0 ) or higher in perturbation theory.

Standard random-noise estimator
Maybe the simplest random-noise estimator of s Γ,rs is where the variance of its zero three-momentum counterpart a y 1 ,y 2 ,y 3 a 12 S rs (y 1 )P ss (y 2 )S s r (y 3 )P r r (0) . (5.7) Generically the random-noise contribution on the r.h.s of (5.7) diverges proportionally to a −1 like the gauge variance. The Γ-independent contribution SP SP is one of the spectral sums introduced in Ref. [14]. It is integrated over one time-coordinate more with respect to the first term, and it gives large contributions to the stochastic variances of all bilinears indistinctly. If we would not take the real part in Eq. (5.5), the variances would be larger and Γ-independent since the SOSO contributions are dropped, and the prefactor 1/(2N s ) goes into 1/N s .

Split-even random-noise estimator
An alternative random-noise estimator of the difference of two traces is The corresponding zero three-momentum field is and its variance reads  The random-noise contributions to the variances of the split-even estimators in Eq. (5.9) are thus expected to be significantly smaller than for the standard estimators 5 of differences of single-propagator traces. This is not surprising since in this case both sources, η i and η † i , are ultraviolet filtered by a quark propagator and the variance has one integral less in the time-coordinate analogously to the case of time-diluted sources 6 . With respect to the gauge variance, however, the random-noise contribution is still expected to be larger.

Numerical tests
We have computed the two random-noise estimators in Eqs. (5.6) and (5.9) on all ensembles listed in Table 1 and for several pairs of quark masses. For F7 and for the bare valence masses am q,r = 0.00207 and am q,s = 0.0189, corresponding to the sea and approximately the strange quark masses [9,15], the variances are shown in Fig. 4 for the pseudoscalar density and for one spatial component of the vector current. Similar results are obtained for other bilinears and/or other lattices. The variance of the standard estimators σ 2 θ Γ,rs (red filled symbols) turns out to be essentially Γ-independent as suggested by Eq. (5.7), and it is dominated by the spectral sum SP SP . The split-even 5 The split-even is an estimator for all times at once, as well as the standard estimator in Eq. (5.6) we compare with. If time-dilution was used in (5.6), the computation of the estimator for all times would have been singificantly more expensive. 6 By the same argument, if a split-line estimator localized in a given region of space is chosen, the sum over y2 in Eq. (5.10) is restricted to that region. estimatorsτ Γ,rs (x 0 ) have much smaller variances 7 . The reduction factor ranges from approximately one order of magnitude for the scalar and pseudoscalar densities up to around two orders of magnitude or more for the axial and vector currents as well as for the tensor bilinear. The gauge noise is still smaller than the random noise, but with the split-even estimator the number N s of random sources needed to approach the gauge noise is moderate. It ranges from a few for the pseudoscalar density up to O(100) for the vector current.
Overall, the data show that the split-even random-noise estimator is much more efficient than the standard one for computing differences of single-propagator traces, and it allows one to approach the gauge noise for all bilinears with a moderate number of noisy sources.

Frequency-splitting of single-propagator traces
The results of the last two sections suggest to introduce a family of frequency-splitting random-noise estimators of single-propagator zero three-momentum traces defined as wheret M Γ,rm ,τ R Γ,rm , andτ Γ,r k r k+1 are defined in Eqs. (4.4), (4.6), and (5.9) respectively. The corresponding variances are given by where the various terms on the r.h.s are defined in sections 4 and 5. At high momenta (heavy masses) the contribution fromt M Γ,rm , responsible for the bulk of the variance of the standard random-noise estimator, is computed exactly with a limited number of probing vectors, and only the remainderτ R Γ,rm is estimated by a random-noise estimator. The low-frequency contributions τ Γ,r k r k+1 can then be estimated by the very efficient spliteven estimator. It is rather clear that splitting the single-propagator traces in several parts whose contributions come from different frequency regions is beneficial. It allows us to design a customized estimator for each contribution which profits from its own peculiarities. An important ingredient in this analysis is the fact that solvers invert the Dirac operator with heavier quark masses at a lower numerical cost.

Numerical tests
The best choice of the number of mass differences, the values of the masses, and the order of the HPE for defining the frequency-splitting estimators in Eq. (6.1) depends on many factors: the bilinear of interest, the target mass, the solver chosen for inverting the Dirac operators and its particular implementation, etc. It is not the aim of this paper to optimize with respect to all these factors 8 but, provided a reasonable choice is made, our goal is to give a numerical proof that the frequency-splitting estimators are efficient and allow to reduce significantly the numerical cost for computing single-propagator traces.
To this aim we have implemented two such estimators: • FS1 is the simplest frequency-splitting estimator with one mass difference only. The masses are am q = 0.00207 and 0.1,τ Γ,r 0 r 1 andτ R Γ,r 1 are defined with N s = 1 and 4 respectively. For the lattice F7, inverting the Dirac operator at the heavier mass costs approximately 1/3 than at the target lighter mass. Each evaluation of this estimator therefore costs approximately 2.5 times more than processing one random source for the standard estimator 9 .
• FS2 is defined by 4 mass splittings corresponding to the masses am q = 0.00207, 0.02, 0.06, 0.15, 0.3, and the corresponding random-noise estimators are defined with N s = 1, 1, 2, 3, and 10 random sources respectively. For the lattice F7, the cost of inverting the Dirac operator for the second up to the fifth mass is approximately 1/2, 1/3, 1/4, and 1/6 with respect to the lightest quark mass respectively. Each application of this estimator thus costs approximately 6.5 times with respect to processing one random source for the standard estimator.
In both cases the solver used is the generalized conjugate residual (GCR) algorithm preconditioned by a Schwarz alternating procedure (SAP) and local deflation as implemented in openQCD-1.6 [18].
In Fig. 5 we show the variances of FS1, FS2 and of the standard estimator as a function of N s , the number of evaluations of each of them per gauge configuration. Similar plots are obtained for the other bilinears and the other two lattices. A clear message emerges: a large gain is obtained for both frequency-splitting estimators with mild differences in efficiency between them. The FS1 is slightly better for the scalar and pseudoscalar densities, while FS2 is more efficient for the vector, axial-vector and tensor bilinears. In particular, the variance of FS1 is approximately 20 and 15 times smaller than the one of the standard estimators for the scalar and pseudoscalar densities respectively. Taking into account that one application of FS1 costs approximately 2.5 more, the gain in computation cost is 8 and 6 for the scalar and pseudoscalar 10 densities respectively. For the vector and the axial-vector, the variance of FS2 is approximately 2 orders of magnitude smaller than the one of the standard estimators. As the FS2 is 6.5 times more expensive, the gain in computational cost is approximately a factor 15. For the tensor the factor gained reaches approximately 20.
It is worth noting that for the scalar, pseudoscalar and the tensor bilinears just one or a few evaluations of the frequency-splitting estimators are needed for the variance to be comparable to the gauge noise. For the axial-vector and vector currents O (10) and O(100) evaluations of the FS2 estimators are required to reach the same goal. As a result, in all cases the gauge noise is reached with a limited and affordable number of evaluations of the frequency-splitting estimators. If necessary the frequency-splitting estimator can be easily combined with low-mode averaging [19,20] and its variants [21].

Numerical tests for two-point functions
In this section we discuss the numerical results for two representative examples of disconnected contributions to two-point functions, which are the simplest correlation functions with a non-trivial time dependence composed only of single-propagator traces. We use the estimators proposed in sections 5 and 6 to confirm the expected improvement over x 0 /a a 3 C rs V V × 10 6 split-even Figure 6: Left: variance of the disconnected contribution in Eq. (7.1) with x 0 /a = 10 using the standard (red filled squares) and split-even estimator (blue open squares). The stochastic noise of the split-even estimator is comparable with the gauge noise after N s ∼ 256. Right: the disconnected contribution using the split-even estimator from N cfg = 1200 gauge configurations.
the standard estimator, and check the factorization formula for the variance given in section 2.

Split-even estimator for electromagnetic current
As alluded to in section 5, an important application of the split-even estimator is the determination of the disconnected contribution to the correlation function of two electromagnetic currents with three light flavours. In the isospin limit, this gives rise to a difference of single-propagator traces as in Eq. (5.1) with r and s corresponding to the up/down and strange quark flavours respectively. In particular the correlator determines the light disconnected contribution, via the time-momentum representation [22], of the leading-order hadronic vacuum polarization, once each current is renormalized by Z V = 0.74636(70) [23] and the correct electric charge factor of 1/9 is included.
In the left-hand panel of Fig. 6, we show the variance of this correlation function for x 0 /a = 10 computed by using the standard (red filled squares) and split-even estimators (blue open squares) in Eqs. (5.6) and (5.9) respectively. A reduction of the variance of up to four orders of magnitude is obtained with the split-even estimator (two orders of magnitude in the cost), which starts to be comparable to the gauge noise for N s ∼ 256. As expected, the variance is practically constant in x 0 and well-described by the factorization formula in Eq. (2.3) when the averaging over time and the polarizations of the current are taken into account. In the right-hand panel of Fig. 6 our best estimate of the correlation function using the split-even estimator is shown using an increased number of gauge configurations, with respect to those used for estimating the variances, of N cfg = 1200. This in turn corresponds to a relative statistical precision of approximately 10% to the disconnected light-quark part of the muon anomalous magnetic moment coming from contributions to the integral up to time-distances of 1.5 fm. If the integral is computed up to 3.0 fm or so, the relative statistical error grows up to 70%, calling for the multi-level integration to determine the contribution from the long distance part of the integrand. To properly renormalize the correlator each current has to be multiplied by the factor Z V which brings a negligible error with respect to the statistical error of the bare correlator 11 .

Frequency-splitting estimator for isoscalar vector currents
In spectroscopic applications, disconnected diagrams arise generically in isoscalar channels. The vector channel, for instance, contains the contribution To evaluate this correlation function, we use the FS2 estimator introduced in section 6 for both single-propagator traces. In the left plot of Fig. 7 we show the variances of the standard estimator (filled symbols) against the number of sources, and the improved FS2 estimator (open symbols) against the number of its evaluations per gauge configuration. The gauge variance is approached with about N s ∼ 256 evaluations of the FS2 estimator, similarly to the case of the one-point function of section 6. In this case, while the disconnected piece gives only a small contribution to the isoscalar channel at intermediate hadronic distances, its variance quickly dominates the statistical error at large distances. The improved estimator thus allows the full correlation function to be resolved at much larger distances.

Conclusions
The numerical computation of disconnected Wick contractions is challenging in lattice QCD because (a) their variances are dominated by the vacuum contribution, which in turn implies that statistical errors remain constant with the distance of the disconnected pieces while the signal typically decreases exponentially, and (b) averaging each disconnected sub-diagram over the volume tends to be numerically expensive because the quark propagators must be re-computed at each lattice point. A milestone for solving the second problem was the introduction of random-noise estimators [3][4][5] which allow one to sum over many or all source points stochastically. However for single-propagator traces, the simplest among the disconnected sub-diagrams, such estimators tend to have variances which are typically orders of magnitude larger than the intrinsic gauge noise. An a priori theoretical analysis of the variances is thus mandatory for deciding how to define exactly the stochastic observables.
Luckily the random-noise contribution to the variances can be re-expressed in the form of simple integrated correlation functions of local composite operators, a fact which allows us to use the quantum field theory machinery for analyzing the origin of the statistical errors and eventually to reduce them.
As a result, we have introduced new stochastic observables for single-propagator traces: the split-even and the frequency-splitting estimators for difference of two traces and for single traces respectively. The former needs from a few random sources for the pseudoscalar density up to O(100) for the vector current to approach the gauge noise. The reduction in numerical cost with respect to the standard estimator ranges from one order of magnitude for the scalar and pseudoscalar densities up to around two orders of magnitude or more for the axial and vector currents as well as for the tensor bilinear. Just one or a few evaluations of the frequency-splitting estimators are needed for the variances of the scalar, pseudoscalar and tensor bilinears to be comparable to the gauge noise, while for the axial-vector and vector currents O(10) and O(100) evaluations are required to reach the same goal. In this case the reduction of the computational cost with respect to the standard estimator is of one order of magnitude or so depending on the bilinear. In all cases considered the variances of the stochastic estimators reach the level of the intrinsic gauge noise with a moderate number of evaluations per gauge configuration.
The use of these new estimators significantly speeds up the computation of disconnected fermion Wick contractions which contribute to many physics processes at the forefront of research in particle and nuclear physics: the hadronic contribution to the muon anomalous magnetic moment, K → ππ decays, nucleon form factors, quantum electrodynamics and strong isospin-breaking contributions to hadronic matrix elements, η propagator, etc. As an example we have shown their potential for computing the disconnected contribution to the light-quark contribution to the muon anomalous magnetic moment and to the correlator of two singlet vector currents. Theory suggests large gains for disconnected three and higher point correlation functions as well. To solve or mitigate the problem (a) alluded to at the beginning of this section, the next step is to combine these estimators with the newly proposed multi-level integration in the presence of fermions [1,2].

Acknowledgments
Simulations have been performed on the PC clusters Marconi at CINECA (CINECA-INFN and CINECA-Bicocca agreements) and Wilson at Milano-Bicocca. We thank these institutions for the computer resources and the technical support. We are grateful to our colleagues within the CLS initiative for sharing the ensembles of gauge configurations with two dynamical flavours. L.G. and T. H. acknowledge partial support by the INFN project "High performance data network".

A O(a)-improved Wilson-Dirac operator
The massive O(a)-improved Wilson-Dirac operator is defined as [26,27] where m is the bare quark mass, D w is the massless Wilson-Dirac operator γ µ are the Dirac matrices, and the summation over repeated indices is understood. The covariant forward and backward derivatives ∇ µ and ∇ * µ are defined to be where U µ (x) are the link fields. The clover term is defined as where the field strength of the gauge field is

A.1 Hopping expansion
By applying the standard even-odd decomposition of the Wilson-Dirac operator see Ref. [28] for unexplained notation, it is straightforward to verify that

B Bilinear chains in the free case
The propagator of a free Wilson fermion is and as usualp 2 =p µpµ andp 2 =p µpµ .

B.1 Two-point correlators
The integrated two-point correlation functions of non-singlet bilinears are The non-zero elements of M 2n,m are those that connect two lattice sites x and y with x − y 1 < na, while the matrix is dense in the spin and colour indices. For a lattice which can be decomposed in hypercubic blocks of size (2na) 4 , an obvious scheme to define the set of probing vectors which satisfies the condition (C.1) is v k (x) = 1 k = i cs + 12 l 2n (x) 0 otherwise (C.3) where i cs = 1, . . . , 12 indicates the spin-colour index and l 2n (x) = (x 0 /a) mod 2n + 2n · [(x 1 /a) mod 2n] + . . . is the lexicographical index labeling the sites in any given block. This scheme, illustrated in Fig. 8 for n = 2, requires K = 192 n 4 probing vectors because one vector is required for each of the spin-colour components for every site in the block. A more efficient scheme, already outlined in Ref. [31], is depicted in the right-hand panel of Fig. 8 where as before i cs indicates the spin-colour index, p = 0, 1 is the parity of the block, and again l n (x) is the lexicographical index labeling the sites in any given block. This scheme requires just K = 24 n 4 vectors, which is a factor 8 fewer than the first one.