Exploiting stochastic locality in lattice QCD: hadronic observables and their uncertainties

Because of the mass gap, lattice QCD simulations exhibit stochastic locality: distant regions of the lattice fluctuate independently. There is a long history of exploiting this to increase statistics by obtaining multiple spatially-separated samples from each gauge field; in the extreme case, we arrive at the master-field approach in which a single gauge field is used. Here we develop techniques for studying hadronic observables using position-space correlators, which are more localized, and compare with the standard time-momentum representation. We also adapt methods for estimating the variance of an observable from autocorrelated Monte Carlo samples to the case of correlated spatially-separated samples.

Numerical lattice QCD is, by now, a well established tool for extracting systematic and precise non-perturbative predictions of strong-force observables.The field has long sustained a positive feedback loop, with cutting edge calculations motivating continued technical advances, which in turn support the next generation of calculations.The nature of the progress is varied; here we give three examples.First, the desire to push more observables into the sub-percent regime has led to recent developments in the use of multi-level algorithms [1][2][3][4], which promise to exponentially improve the signal in importance-sampling-determined correlators, at fixed computational cost.A second example stems from the fact that many lattice calculations are already at percent or subpercent precision: this has led to major developments in the inclusion of electromagneticand isospin-breaking effects, see e.g.[5][6][7][8][9], to ensure that meaningful quantities are being calculated at the reported precision.As a third example, going beyond the improvement of established lattice quantities, the field continues to develop strategies for new classes of observables.For example, the spectral methods described in refs.[10][11][12][13][14][15][16] have received major attention recently as the basis of a new strategy for overcoming limitations of the Euclidean signature in calculations.
In this work we are concerned with another aspect of high-precision calculations: the need to design optimal estimators for both the central values and covariance matrices of lattice data.This is particularly relevant in the case of a limited number of gauge-field configurations.
The key concept that we exploit in this work is stochastic locality, the notion that quantum fields belonging to regions sufficiently separated in space-time are exponentially decorrelated and fluctuate (almost) independently, thanks to the mass gap of QCD.In the limit of a single field configuration, generated with a finite but large volume capable to accommodate enough independent fluctuations of the fields, it is possible to exploit stochastic locality to define estimators of correlation functions and of variances from translation invariance [17].This paradigm of performing so-called master-field simulations [17,18], i.e. the generation of a few large-volume fields, is currently being investigated in the pure gauge theory [17,19] and in full QCD [18,20].This idea was proposed in view of calculations at very fine lattice spacings, which suffer from frozen topological charge, leading to biases in the estimation of QCD observables.Working in a large volume suppresses this effect in addition to reducing systematic uncertainties from the periodicity of fields that occur independent of topological charge freezing.
The need to design optimal estimators can arise for multiple reasons.Such master-field calculations represent a fairly dramatic example, but in general, many modern lattice QCD calculations may face the practical need to rely on a small number of gauge-field configurations.For example, in practice, calculations using expensive actions such as domain-wall fermions (e.g.used extensively by the RBC-UKQCD and JLQCD collaborations), often end up generating fewer gauge-field configurations as compared to calculations with other actions.
This work represents a collection of methods and results that can be used to exploit their advantages in different lattice setups, i.e. their use is not limited to the master-field scenario.More specifically, we examine the following three aspects.First, we study how far stochastic locality can be pushed in traditional simulations with volumes between 4 and 9 in units of the pion Compton length, corresponding to 3 to 6 fm.Second, we examine strategies to design improved estimators of fermionic observables, with a specific attention to their volume-scaling properties, which are particularly relevant for the master-field program.Finally, we investigate whether typical low energy properties of simple hadrons, such as masses and transition matrix elements, can be extracted more efficiently from correlators defined in position space rather than the usual time-momentum representation (TMR).This paper is organised as follows.In section 2 we present the details of the simulations used in this work and define the observables that we examine, both in time-momentum and coordinate-space representations.In section 3 we introduce several estimators for such correlation functions (using stochastic locality as a guiding principle), review the formalism to calculate variances from spatial translation invariance and we present a detailed numerical study on its applicability.In section 4 we collect and work out the necessary formalism required to study correlators in position space, with a particular attention to boundary effects.For the extraction of the pion mass we show that the latter can be controlled analytically.In section 5 we perform a detailed comparison of time-momentum representation and position-space data for the extraction of spectra and matrix elements.Finally, in section 6 we summarize and discuss our findings.The appendices contain several clarifications and collect additional results interesting for large volume simulations.Examples include truncated sums for momentum projections and an alternative implementation of the master-field idea using a large temporal extent.

Lattice setup
In this section we introduce the basic expectation values that we examine in our study and define the notation that we will use throughout the manuscript.Further we describe the numerical setup used in the presented calculations.

Hadronic correlation functions
In the following we focus on simple pseudoscalar, axial, vector, and nucleon two-point correlation functions in Euclidean space: C AP,µ (x) ≡ ⟨A µ (x)P † (0)⟩ = ⟨P (x)A † µ (0)⟩, (2.2) C N N (x) ≡ ⟨χ(x) χ(0)⟩, (2.5) where A µ (x) ≡ ū(x)γ µ γ 5 d(x), (2.7) χ(x) ≡ ϵ abc u T a (x)Cγ 5 d b (x) u c (x) . (2.9) Here u(x) and d(x) are quark fields with Dirac and color indices left implicit, except for the color indices shown in the definition of χ(x).Both smeared and unsmeared quark fields are considered, as discussed further below.The conventional approach in extracting physical information from such correlators is to use the time-momentum representation of the two-point correlators: (2.10) In our numerical studies, we will focus on p = 0 and simply denote that case C(t) ≡ C(t, 0).Keeping generality for now, we are interested in the masses for the pseudoscalar and nucleon correlators obtained at large t: where the → indicates that we only keep the leading term on the right-hand side.Here p = (iE, p) and c P , c N are overlap factors that parameterize the coupling of the interpolating operator to the ground state. 1 Up to exponentially suppressed volume effects, the masses can be extracted from E π (p) = m 2 π + p 2 and E N (p) = m 2 N + p 2 .From these correlation functions the pion decay constant f π is a further standard observable that can be determined.It is defined as the axial current matrix element with a properly normalised pion state, ⟨0|A µ (0)|π(p)⟩ ≡ ip µ f π . 2 If the quarks are local, i.e. not smeared, the C AP,µ correlator can be used to extract f π .A conventional method to extract f π consists of taking the asymptotic part of the correlator in the time-momentum representation, with p = 0 and the axial current in the time direction A 0 .Then the decay constant is given by the ratio c A (0)/m π up to renormalization and possible improvement.Since in this paper we are interested only in computing and comparing the noise of the matrix element, here we neglect such complications and simply define where the c A = c A (0) amplitude (as well as c P = c P (0)) can be chosen to be real and positive.At the practical level, one can compute f bare π from a combined fit of C P P (t, 0) and C AP,0 (t, 0) with m π , c A and c P as parameters.
For the vector correlator, we decompose the spatial components at zero momentum as ; if the quark fields are unsmeared, this can be used to obtain the isovector part of the subtracted hadronic vacuum polarization (HVP) function [23]: where Z V is the renormalization factor for the vector current.Also in this case we neglect possible improvement terms.

Numerical setup
To verify our analytical findings and to effectively study the presented methods, we generate a number of ensembles that enable like-by-like comparisons.These ensembles span a range of volumes at fixed pion mass and a range of pion masses at fixed volume so that the corresponding scaling behaviors can be studied in detail.
The ensembles that we used or generated for this study are reported in table 1 and span pion masses ranging from (approximately) 410 MeV down to 215 MeV and volumes from m π L ≃ 3.3 up to 9. The largest spatial extent we reach has a physical length L ≃ 6 fm.For recent progress in generating larger lattices with physical volumes of L ≃ 9 fm and L ≃ 18 fm, following the master-field paradigm, see refs.[18,20].
The chiral trajectory is set by fixing the sum of quark masses at the so-called flavor symmetric point [29][30][31].This implies that the strange quark mass is lighter than the physical value.We do not consider observables containing an explicit strange quark here and focus on purely light-quark (isovector) correlators.We further draw attention to the 32BT ensemble: here the same space-time volume of the 64B ensemble is reached as from our smallest volume by elongating the T direction while keeping the spatial volume fixed at L = 32a.We discuss results on this long-T ensemble in appendix E; see also ref. [32] as initial reference for the long-T approach.
We note that quark field smearing is a technique commonly used when studying hadrons.It serves two purposes: First and foremost, it suppresses contributions from excited states relative to the ground state, so that the latter can be isolated at shorter distances.Second, it extends the footprint of point-source propagators, which tends to improve the statistical signal.The standard smearing methods used with the time-momentum representation extend only in spatial directions.As such they are O(4)-covariant.To study how these common methods behave in our approach, we adopt the smearing from ref. [33], which uses a fermion propagator in three dimensions.In order to study hadrons in position space, we also use quark fields with the gradient flow [34] applied, which is an O(4)-covariant smearing.Details on smearing parameters are given below in table 2 and the surrounding text in section 5.2.

Estimators of correlators and variances
In this section we continue with the description of the numerical strategies adopted to estimate both central values and errors, while numerical results are deferred to the next sections.
In Lattice QCD calculations, Wick's theorem is employed to express fermionic correlation functions in terms of quark propagators S(x, y).Point-source propagators are defined from sources with support on a single lattice site, and may be used to calculate two-point isovector correlators on a fixed gauge-field background as C point (x; y) ≡ − Re Tr Γ ′ S(x + y, y)ΓS(y, x + y) . (3.1) We omit from the notation spin, color and flavor indices, since we are only interested in observables depending on the two light degenerate quark fields, and we also do not indicate whether the underlying quark fields have been smeared with a specific label, but specify it in the text.Taking the expectation value3 over fluctuations of the gauge field, we obtain where C may be taken from eqs. displace the location y of the source and to average the corresponding different estimators of the correlation function.Stochastic locality suggests that to obtain efficient sampling with largely uncorrelated samples, the minimal distance among the source locations should be a function of the typical correlation length of the system, and of the observable.Taken together with translation invariance, one naturally arrives at a regular displacement of the point sources on a grid, which we discuss below.

Stochastic estimators
Solving for propagators from many different point sources y ∈ G has a cost that scales proportionally to |G|, the number of such points.To avoid such cost scaling, one could let the source have support on all points in G and perform a single propagator solve.For a given sink location x, the correlator would be dominated by the closest source y, and in general the contributions from the other points of the grid would be exponentially suppressed: see figure 1.However, when x − y grows, disentangling their effect from the main signal would become increasingly difficult; therefore, it is desirable to explicitly eliminate such contaminations.To do so, we introduce N η noise fields η i (x) with support on the sparse set of points G that define a stochastic grid, which satisfy where I sc = I s ⊗ I c is the identity matrix in spin and color space.The neglected terms of ) represent the stochastic noise.Letting ψ i (x) be the propagator with η i as its source, we obtain ⟨ψ i (x)η † j (y)⟩ = δ ij S(x, y) , for y ∈ G .
In practice, we employ U(1) noise with color and spin dilution [35,36], which can be represented as having η i (x) be a diagonal spin-color matrix containing random phases.In this work we consider the estimator which satisfies ⟨C sgrid (x; y)⟩ = C(x) for all y ∈ G. Therefore, on a single configuration the correlator averaged from all |G| points in G, defines an estimator for C(x) with stochastic error that scales as ) stochastic error.Here we have used ⟨⟨•⟩⟩ to denote a volume-averaged estimator; this notation will be especially used in the next subsection.
For a baryon, the correlator has the form, C N N (x − y) = ⟨B[S(x, y), S(x, y), S(x, y)]⟩, where B is a trilinear map containing the color and spin contractions, which may be implicitly defined from eqs. (2.5) and (2.9).Similarly to eq. (3.6) a stochastic estimator is defined from For our choice of color-and spin-diluted U (1) noise, the diagonal nature of η i (y) for each y ∈ G allows us to use the simpler (and still unbiased) estimator with The equations derived above work for any distribution of the source locations and (depending on the specific problem) different displacements from a regular grid might turn out to be more efficient.The (potential) advantage of the estimator in eq.(3.6) is that fewer than |G| explicit evaluations of propagators may be sufficient.For short-distance observables, with a small footprint, denser grids lead to a substantial reduction of the variance (see for instance ref. [37]).On the other hand, for long-distance observables sparser grids may be preferable; in this case, field configurations with large volumes profit much more from such a strategy.Specifically, if one keeps the density of points in the grid constant, the cost grows only linearly in the volume.
In practice, with master fields the calculation might be significantly accelerated by considering a block decomposition of the Dirac operator 4 , along the lines of refs.[2,39], for example by centering the domain around the location of a point of the grid [17].We do not investigate this approach here and defer its study to future work.
The estimators defined above are designed to obtain an optimized sampling for correlators that depend generically on the four-vector x.However physical information is commonly extracted from the time-momentum representation.In this case it is more efficient to let the noise field have support on all points at fixed Euclidean time, which we call a stochastic wall, and to use the one-end trick [40] to obtain an estimator that requires just one noise field.In this case, we let η i (x) be a scalar noise field with support on the wall and ψ i (x) be the matrix field containing the propagator with η i ⊗ I sc as its source [41].Denoting the Euclidean time location of the wall with y 0 , a simple estimator for the isovector zero-momentum correlator of two bilinear operators is found as: ) . (3.11) Since η i (x) has support on an entire time slice, zero-momentum projection is achieved at the source, and C swall (x 0 − y 0 ; x) satisfies ⟨ C swall (x 0 − y 0 ; x)⟩ = C(x 0 − y 0 ) for all x.
Therefore the freedom at the sink location, i.e. x, can be used to define the estimator which after the gauge-field average has an improved variance compared to the same timemomentum correlator calculated from point sources, for fixed moderate computational cost.The fine sampling of (L/a) 3 points will be particularly useful for a detailed study in section 3.3 of the saturation of our estimate of the variance of ⟨⟨ C swall ⟩⟩.

Variances
The above definitions of the correlation functions induce a situation where the observables (indexed by α, β, . . . ) have localized estimators 5 O α (x), and data are available for N = |Λ| regularly-spaced points x ∈ Λ.Here N could be as large as (L/a) 4 if all lattice points are available, but our analysis is generic and also applies to situations where Λ is a Ddimensional subspace or a coarse sub-grid of spacing b, in which case N = (L/b) D .As such it covers the three mentioned implementations using points, grids and walls.
In our case, we restrict ourselves to the situation where O α (x) are known on a single large-volume field configuration.The best estimator of the true expectation value ⟨O α ⟩ is given by the volume average where the sum is to be understood as running over the N points in Λ for which data are available.Using invariance under space-time translations, the covariance of our volumeaverage estimators [17], x,y is given in terms of the correlation function Γ αβ , in analogy to the autocorrelation function for Monte Carlo time [42].Likewise, we define C αβ as the sum of Γ αβ (y) without the factor of 1/N , where the finite-volume sum can be approximated as an infinite-volume sum, up to exponentially suppressed corrections.The function Γ αβ is expected to fall off exponentially as a function of the distance |y|, with a mass m that is dictated by the details of the system.In particular, because O α has a non-zero vacuum expectation value, states with vacuum quantum numbers will appear in the spectrum and the asymptotic behavior will typically be determined by the lightest 0 ++ state, 6 i.e. m = 2m π .In principle, this approach leads to improved error estimators, i.e. a reduction of the error of the error, compared to the traditional Monte Carlo analysis where the space-time information is blocked and not exploited in the error estimates.In a practical situation, where the volume is large but finite, we have to substitute ⟨O α ⟩ with ⟨⟨O α ⟩⟩, thus obtaining a (biased) estimator for Γ αβ given by with bias (see appendix C) 7⟨⟨Γ αβ (y Moreover, it is necessary to truncate the sum defining the covariance matrix: we introduce the finite summation radius R and define The truncation introduces an additional bias: where m is the mass governing the falloff of Γ αβ (y).Where appropriate (such as for α = β), in analogy to the integrated autocorrelation time τ int [42], we also introduce the integrated correlation volume such that the variance of ⟨⟨O α ⟩⟩ becomes with generically var(X) ≡ ⟨(X − ⟨X⟩) 2 ⟩.Note that τ α (R) implicitly depends on the number of dimensions D of the subspace where the master-field analysis is considered, and for D = 1, this has the same form as τ int .
Extending the derivation of ref. [42] to D dimensions (see appendix C), we obtain for the error of the error Above N (R) is the number of available points satisfying |y| ≤ R and can be approximated as is the volume of a D-ball of radius r and Γ(x) is Euler's gamma function.This result can be used to formulate an automatic windowing procedure for a master-field type analysis where an optimal summation radius R is found by balancing statistical and systematic errors; see appendix D. In summary: for the chosen R, our estimate for the covariance of ⟨⟨O α ⟩⟩ and ⟨⟨O β ⟩⟩ is ⟨⟨C αβ (R)⟩⟩/N , which for α = β is also given by eq.(3.23).
Finally to conclude our considerations on master-field error estimators, we note that a blocking procedure can be defined in analogy to binning of adjacent gauge configurations in a Monte Carlo chain.That is, one may use a method to absorb the effect of autocorrelations so to treat the blocked data with standard statistical tools.Specifically starting from a N = (L/a) D lattice, a coarse blocked lattice of size with u µ the coordinate of the block.At this point the analysis can proceed as before with the replacement x µ → u µ and N → N B .As expected, by increasing the block size ⟨⟨C αβ (R)⟩⟩ saturates at smaller N (R): this corresponds to smaller R/b, but the physical scale R governing saturation remains the same.Thus ⟨⟨Γ αβ (0)⟩⟩ becomes an increasingly better approximation of the error, at the expense of a larger error of the error (see also ref. [42]).For sufficiently large block sizes, the blocks become statistically independent and standard analysis techniques such as bootstrap or jackknife can be used without the D-dimensional formalism.On the other hand, the downside of blocking is that the errorof-the-error due to correlations is suppressed by a power law rather than exponentially.In this study we used minimal blocking, i.e. b/a = 2 or 4, only as a practical strategy to reduce storage costs for the observables and to speed up intermediate stages of the analysis; this enables the calculation of ⟨⟨Γ αβ ⟩⟩ without the need for a computing cluster.

Error saturation in master-field estimates
To study the scaling of the summation radius R with our ensembles with traditional volumes and several field configurations, we estimate the modified correlation function8 on every Monte Carlo configuration i, where Here Λ denotes the subset of points considered in the specific analysis and N MC the total number of independent measurements.When autocorrelations are sizeable, the index i in eq.(3.26) runs over bins of sufficiently long length and N MC is redefined as the number of bins.Exploiting the additional dimension offered by Monte Carlo time, we estimate the error of Γ αβ itself using the variance of ⟨⟨Γ i αβ ⟩⟩ among the N MC available (decorrelated) samples in Monte Carlo time, which returns more precise estimates than the analytic formula in eq.(3.24).In master-field calculations, the latter may be more useful.
The first observable that we examine is the gradient-flowed energy density E t at positive flow time t ≃ t 0 [43].We consider the symmetric definition based on the clover discretization of the field strength tensor.Being a pure-gauge one-point function, it can be evaluated for all space-time points, thus giving us the opportunity to study the saturation of the variance with R for different estimators based on different D-dimensional partitions Λ of the lattice.Specifically, we examine the three cases and we block the data on the corresponding orthogonal dimensions: for example, in the case of Λ T , E t (x) is pre-averaged over the three spatial directions.In the left panel of figure 2, we show how these lower dimensional partitions of the lattice can be used to estimate the error based on stochastic locality 9 .We observe, when we increase the number of dimensions the variance starts to saturate at larger values of R.This effect is understood as a consequence of the D-dimensional integration measure d D x ∝ dR R D−1 .The presence 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Variance of the estimator ⟨⟨E t ⟩⟩ for a flow time t ≈ t 0 .Left: variance estimated using stochastic locality on our 48B ensemble for different choices of Λ given in eq.(3.28).Right: scaling of the variance with the volume, estimated on the 32B, 48B and 64B ensembles using Λ L 3 .of clear plateaus suggests that error estimators based on stochastic locality are already accessible (for some quantities) in several present calculations with moderate volumes, for example for correlation functions known on several or all source time locations.
For the energy density this statement is further supported by the right panel of figure 2 where we compare our three volumes and observe excellent agreement of the plateaus obtained for the variance.In fact, even though the study of E t 0 in a real master-field pure-gauge simulation was already presented in ref. [17], here we numerically demonstrate, in dynamical QCD, that for our precision the plateau is reached at m π R ≃ 2 implying that volumes as small as m π L ≃ 4 are sufficiently large to saturate its master-field error.
In figure 3, left panel, we demonstrate how the blocking procedure outlined in section 3.2 effectively resums spatial correlations, leading to a faster fall-off of Γ αβ .In addition, the error of the error at the saturation R is comparable between different block sizes and no blocking, except for the largest b = 16a, where we observe a larger error of the error.This makes modest blocking a cost-efficient solution to storage and memory constraints.At the same time, the fact that the error is less well determined for the largest blocking indicates that error estimates based on stochastic locality should provide an improvement as compared to estimates based on fluctuations across distinct gauge fields.
On the fermionic side, we will be considering simple mesonic two-point correlators projected to zero momentum, with pseudoscalar and vector operators, cf.eqs.(2.1) and (2.4).One strategy to estimate the corresponding correlation function Γ αβ (x) could be in principle based on several point sources located on a regular grid, as discussed in section 3.1.Taking into account the cost to calculate quark propagators, interesting questions to answer are how many point sources should be considered, and at which point should we stop.In the right panel of figure 3 we examine this scenario using the energy density as our probe.Despite knowing it for all locations, we perform a sub-sampling over three-dimensional regular grids of various sizes to imitate the case of an estimator based on stochastic grids (in the limit of large N η ).As expected, by making the grid denser, we reduce the variance and eventually start to sample the short-distance structure of Γ αβ (x).This is similar to traditional Monte Carlo averages where we resolve autocorrelations by measuring more frequently in Molecular-Dynamics units.At this point we saturate the amount of independent statistical information and can stop increasing the number of point sources.For this example, the error based on N = 12 3 samples is nearly as small as that based taking all N = 48 3 points while using 64 times fewer samples.
To study the saturation window from fermionic observables, we find it more practical to adopt standard approaches based on stochastic wall sources.From the identification O α=x 0 (x) with C swall (x 0 ; x) taken from eq. (3.11), at fixed x 0 we use the sink location x to estimate the master-field error in an L 3 subvolume [i.e.Λ L 3 in eq.(3.28)].
The two-point function with zero-momentum projection at the source is a non-local observable.Heuristically we expect that, as one increases the source-sink separation, the "footprint" of this quantity in the finite spacetime volume grows.To illustrate this, figure 4 shows the integrated correlation volume introduced in eq.(3.22) for the pseudoscalar and vector-vector correlators and a range of source-sink separations x 0 .In general we observe a growth of τ with larger values of x 0 , and correspondingly a larger summation radius needed to reach the plateau in τ (R).For the vector-vector correlator, the expected scaling of the correlation function Γ(x) is with e −2mπ|x| .We checked that our data are compatible with such scaling by fitting τ (R) with the fit function cΓ(s, M R), with c, s, M as free parameters, and Γ(s, b) defined in eq.(D.3).Note that the apparent saturation in x 0 reached for x 0 ≥ 0.38 fm should not be taken as a fact: this observable is affected by the signal-to-noise problem which means that at fixed statistics (and volume) for sufficiently large x 0 the noise will be large enough to hide the effect of increasingly larger correlation volumes, in the same manner as autocorrelations along Monte Carlo time emerge from less noisy observables.The pseudoscalar correlator, on the contrary, does not suffer from the signal-to-noise problem and is much more likely to show a poor convergence in R within our volumes.We demonstrate this in the left panel of figure 4. By increasing the source-sink separation the footprint of the observable grows up to the point where very short plateaus are observed, eventually preventing the usage of this error estimation technique for the study of the asymptotic behavior of the correlator.This suggests that additional statistical information is injected only by new field configurations or larger volumes, i.e. master fields.
The pion mass plays an important role in the applicability of this error estimation technique since it governs the exponential convergence of τ (R) at large values of R. By lowering the light quark masses we expect a larger footprint of the master-field variance estimator, or in other words, a larger τ (R).This phenomenon is observed in the left panel of figure 5, where we plot the scaling with the pion mass of the integrated correlation volume τ (R) for the vector-vector two-point function C V V (x 0 ) at short source-sink separation, using our three 32 3  behavior is apparently not observed in the right panel, where we examine the correlator at a longer source-sink separation.To understand why this is the case, we must first consider the fact that at longer distances and fixed statistics, we expect larger statistical errors as we approach the chiral limit.With noisier data, resolving spatial correlations at longer distances becomes more challenging and as a consequence our three (asymptotic) estimators of τ become compatible.We conclude that the effect seen in the right panel of figure 5 is only an artefact of the lower relative statistical accuracy on lighter ensembles and does not imply a different hierarchy.
We conclude this study of the variance based on stochastic locality with a few considerations.First we note that it is a powerful tool also for traditional volumes with m π L ≃ 4. Specifically for simulations with a low number of (independent) gauge field configurations, reliable error estimators may be obtained from master-field analysis exploiting invariance under one or more directions, depending on the data available.Clearly one has to examine every observable (e.g.every source-sink separation) independently and check that long reliable plateaus can be identified.First explorations along these lines can be found in ref. [44].Since the complexity of this strategy quickly grows with the number of observables, in appendix D we discuss an extension of the automatic window procedure proposed by Wolff [42] to this context.
Second, the correlation function Γ αβ (x) is an observable of the theory with a welldefined infinite-volume limit and exponentially suppressed volume corrections.As such, its x dependence may be studied (at fixed lattice spacing) on intermediate volumes and used in the planning of the number of fields and volumes required for a given master-field simulation and precision.For example, by fitting the variance of the pion correlator on the 64B ensemble for x 0 = 4a ≃ 0.38 fm, we can approximately estimate its asymptotic value.From the latter we deduce that a master-field analysis on a three-dimensional space with geometry (L/a) 3 = 192 3 would return a pion correlator with similar half-percent accuracy from a single configuration.

Position-space correlators
It is reasonable to expect that a local-in-space-time approach to correlators might be beneficial in exploiting the stochastic locality properties of fields to the fullest extent possible.Here, we consider fully position-space correlation functions as one possible strategy to achieve this.By not including a momentum projection, we avoid explicitly introducing long-distance contributions to the correlators.If the quark field smearing is O(4) covariant, then the overlap factors are independent of p. Neglecting discretization and finite-volume effects, this allows us to take the inverse of the three-dimensional Fourier transform and obtain for |x| → ∞: Above, K n (x) denotes a modified Bessel function of the second kind.We make use of the (Euclidean) Lorentz invariance of the theory in the continuum 10 to introduce correlators that are functions of the 4d radial direction r ≡ |x| only and transform as scalars.For each of C P P (x) and C AP,µ (x) there is only one option, and we use a ˚label to indicate the corresponding function of r The asymptotic behavior for r → ∞ is then For C AA,µν (x), by contrast, there are two possible contractions, leading to C( 2) Similarly, the spinor-valued correlator C N N (x) can give rise to two contractions Finally, it was shown in ref. [45] how to determine the (I = 1 component of the) subtracted HVP function Π(−Q 2 ) = Π(−Q 2 ) − Π(0) at space-like momenta Q 2 > 0 from the (light quarks) vector correlator in position space: for known kernels f 1,2 and At this stage it is instructive to incorporate the leading wrap around finite-volume effects that occur as r approaches L/2 (while continuing to neglect discretization effects).For the pseudoscalar correlator in the time-momentum representation, the leading finitetemporal-extent effect can be accounted for by modifying the effective mass to include the expected cosh-like behavior.A similar approach can be applied to position-space correlators that have a pseudoscalar as the leading long-distance contribution; we will see that this involves integrals of Bessel functions.
In the case of the pseudoscalar correlator, the leading position-space boundary effects are given by summing over images of the scalar propagator C P P (x), defined in eq.(4.1), in the four periodic directions.To make this concrete we define where L is a diagonal matrix encoding the lattice geometry and n ∈ Z 4 is a four-vector of integers.The introduction of a finite periodic space-time breaks continuous rotational symmetry, so C L P P (x) is not automatically a function of r ≡ |x| as it is in eq.(4.5).One possible strategy to address this is to explicitly perform an average over points with fixed r.To this end we introduce where dΩ 4 = π 0 dφ 1 sin 2 φ 1 π 0 dφ 2 sin φ 2 2π 0 dφ 3 .To simplify, one can perform a change of variables for each fixed n such that n is aligned with the z-axis.Evaluating the dφ 2 and dφ 3 integrals, we reach a one-dimensional integral in φ ≡ φ 1 , where we have used that the infinite-volume correlator C P P (x) is in fact only a function of |x|, and have also made use of CP P (r) as defined in eq.(4.5).We have additionally introduced the shorthand It follows that the angular averaged finite-volume correlator, CL P P (r), is equal to Taking now the specific case of a L 3 × T geometry, we drop terms scaling as e − √ 2mπL , e −mπT or faster and substitute the asymptotic form for CP P (r), eq.(4.1), to reach a concrete approximation for CL P P (r).At large distances 0 ≪ r ≤ L/2, this takes the form |c P | 2 F P P (r, m π , L), where defining a special case of eq.(4.16).This concrete functional form can be used to define a position-space analogue of the cosh effective mass, by numerically determining the unique function M satisfying M F P P (r, m π , L) In figure 7, below, we show the numerical results of implementing this strategy; see also section 5.2.1.
In the case of the C AP,µ (x) correlator, one can follow a similar procedure.The radial correlator in infinite volume is defined in eq.(4.5), with a factor of x µ to contract the µ index of the axial current A µ .The corresponding finite-volume expression summed over images is CL where it is important to note that the contracted x µ is introduced by hand and is therefore not summed over the periodic images.Following exactly the same recipe as above, one finds that the asymptotic behavior of CL AP (r) on a L 3 × T geometry can be written as c A c P F AP (r, m π , L) where which is a special case of . The final step is to perform the same derivation for C AA,µν (x).In eqs.(4.7) and (4.8) above we have defined the functions C(1) AA (r) and C(2) AA (r), resulting from two scalar contractions of the indices µ and ν, together with angular averaging.The finite-volume quantities with the same definitions are then given by CL( 1) CL( 2) The asymptotic volume effects for each finite-volume correlator CL(i) AA (r) can be expressed in terms of a known function multiplying the axial-current overlap factor.This function is given by AA (r, m π , L).For the case of i = 1, in which the indices are contracted, one finds the same result as for the pseudoscalar (PP) correlator, F AA (r, m π , L) = F P P (r, m π , L).By contrast, for the case of i = 2, the result is more complicated where is a special case of To reveal the exponential scaling of these effects, we expand the special functions introduced above about L = ∞, and obtain for eq.(4.18) where we have expanded using L ≫ r ≫ 1/m π and have dropped all subdominant terms.This implies that the volume corrections can be significantly enhanced if r is taken too large at fixed L. For typical volume sizes of m π L ≃ 4, such effects must be included in the analysis.For the ensembles used in this work, the effect is still quite significant (as shown in figure 7, below) and is larger than in the case of zero-momentum projection, for which the backward-propagating pion is suppressed.The latter is likely due to the fact that the temporal extent is larger than the spatial extent on all ensembles included.Nonetheless, the importance of wrap around effects in the position-space correlator highlights how such methods benefit significantly more (from the point of view of systematic errors) from larger volumes.
The corresponding corrections for the nucleon mass are significantly more complicated.The dominant volume effects arise from the nucleon emitting and re-absorbing a pion, not from the mirror images, and therefore depend on the nucleon-pion-nucleon coupling.Quantifying these effects goes beyond the scope of the present work and this issue arises in general for all states that couple to pions, e.g. for multi-pion states in C P P (x).

Numerical results
In the first subsection we investigate the stochastic grids as a potential efficient estimator of position-space correlators on large volumes.In the second subsection we instead examine differences between position-space correlators and more traditional momentum-projected ones computed on the same point sources for several standard mesonic and baryonic observables.

Stochastic grids
Let us divide the lattice into equal domains centered around the regularly displaced source points of our grid.The stochastic estimator in eq.(3.6) is in principle valid for any sink point x, but is particularly efficient for source-sink separations that do not exceed such domains.To test the efficiency of the stochastic grids introduced above, we use our larger ensemble with lattice volume of 96 × 64 3 sites, and define G as a grid of 3 × 2 3 = 24 points with spacing b = 32a.The |G| domains centered around each y ∈ G are hypercubes of 32 4  sites and we study several radial correlators for distances with norm r ≤ 16a.
In the left panel of figure 6, we study the efficiency of the stochastic grid estimator by comparing it with the average of up to 24 point-source estimators using exactly the points in G and the same gauge field configurations.The leading contribution to the stochastically vanishing difference between the two estimators will come from the neighboring points in G.For this comparison, we examine the pseudoscalar, vector, and nucleon radial correlators.In the left panel, we use the most precise estimator in each case (6 stochastic grid sources or 24 point sources) and show the ratio of the two estimators as a function of the radial distance.Within its uncertainty (and within ±2 %), it is consistent with one as should be the case for our unbiased estimators.The noisy deviation from one is very small for small r and grows (particularly for the nucleon and vector correlators) at larger r; this is to be expected, since as r increases, the correlator decays while the sink becomes closer to the neighboring sources responsible for this deviation.
In the right panel of figure 6, we compare the product of computational cost and variance of the grid estimator of C(r) at a fixed r = 12a with the one with the point estimator, for the three correlators CP P , CV V and C(1) N N .This is plotted as a function of the number of either noise sources N η or point sources N src for the two estimators respectively.Both are equal to the number of 12-component solutions of the Dirac equation.As expected, the product of cost and variance is approximately constant with increasing N src for the point estimator.Note that they have been arbitrarily normalized to 24 units In this figure we compare different estimators for several correlators, cf.eqs.(4.5), (4.9) and (4.12), defined in position space as a function of the radial distance.Specifically we consider up to 6 random sources with support on a 3 × 2 3 grid, with fixed offset, and up to 24 point sources gradually placed to fill the same point locations of the grid.Left: ratio of the estimator based on N η = 6 stochastic grids over the estimator based on 24 point sources.Correlated errors are represented by the error bands.Right: product of computational cost and variance for the stochastic grid estimator (empty markers) as a function of N η compared to the point sources estimator (solid markers) as a function of N src .For each choice of pseudoscalar, vector or nucleon correlator (in the same colors as in the plot on the left), the cost × variance product is relative to 1/24th of the cost × variance of N src = 24 point sources.
at N src = 24.In the same units, the product of cost and variance of the the grid estimator is significantly lower and grows approximately linearly with N η .At N η = 6 the variance of the grid estimator is between 3 and 5.5 times smaller than the one of the point estimator at the same cost, with the gain being larger for the vector and nucleon correlators than for the pion one.On larger volumes we expect stochastic grids to be even more efficient, which makes them a promising strategy for master-field calculations.
Even though the volumes studied in this work are fairly large, if we employ stochastic grids with our minimal setup L/b = 2, we are only able to examine position-space correlators up to radial distances of 16 lattice units.For this reason, we turn to estimators based on single point sources to test position-space methods against correlators in the time-momentum representation.This allows us to instead reach r ≤ 32a, a distance that is sufficient to reliably extract -once boundary effects are taken into account -ground-state energies like the pion and nucleon masses, as shown in sections 5.2.1 and 5.2.2 respectively.Further tests of stochastic grids on much larger volumes up to m π L ≃ 25 are presented in ref. [46] Table 2. Pion and nucleon masses from position-space correlators on point sources, compared to zero-momentum projected correlators on the same point sources, with or without smearing.

Position-space and time-momentum representations
Below we present results obtained from the 64B ensemble.Its spatial volume in units of the inverse pion mass is approximately nine, and from our previous studies on the saturation of the error we already know that it is not sufficient for the pion correlator.Therefore, in the following we calculate errors along Monte Carlo time, using traditional analysis strategies.
We compare results from position space versus TMR.We also show the effect of smearing, using 3d fermions [33] for TMR and gradient flow for position space; parameters are given in table 2.

Pion mass
The simplest observable that we consider is the pion mass m π , extracted from the positionspace pseudoscalar correlator whose large-|x| behavior is given in eq.(4.1).The rotational symmetry of the position-space correlator in the continuum and infinite volume is broken by the finite lattice spacing, with directions that are not equivalent under hypercubic symmetry contributing different cut-off effects.In principle, the many different hypercubicinequivalent directions provide us with considerable freedom in defining our hadronic observables; however, to the purpose of this initial study, we limit ourselves to the correlator CP P (r) as function of the radial distance only as introduced in eq.(4.5).This is obtained by averaging uniformly over hyperspheres of fixed r in a way similar to the one discussed in section 4. At finite lattice spacing, the integral over the angles dΩ 4 is replaced by the sum CP P (r) = 1 r 4 (r 2 /a 2 ) |x|=r C P P (x), (5.1)where r 4 (n) = 8 d|n,4∤d d is the number integer 4-vectors z ∈ Z 4 satisfying z 2 = n. 11omparing this description of the lattice data with the long-r behavior of eq.(4.1), following eq.(4.20) we numerically solve for m π as a function of r.It is worth noting that since there exists an x such that |x| = r for all integer values of r 2 /a 2 , the density in r of the available lattice data increases with r.We apply eq.(4.20) choosing ∆ as the closest  Effective mass corresponding to the radial pion correlator CP P (r).The purple points are obtained applying the infinite-volume formula for the long-distance behavior in eq. ( 4.1), whereas the blue points are obtained by accounting for finite-volume effects, eq.(4.18).The green line shows the mass obtained from the best "one-state" fit to the correlator and the red curve shows the effective mass corresponding to the best "two-state" fit to the correlator as described in the main text.
value to one lattice unit that is available, that is, such that (r + ∆) 2 /a 2 ∈ N, as we observe that this produces a much smoother effective mass than using, for instance, the smallest possible value of ∆.
The result for the effective mass as a function of r without correcting for boundary effects are shown in figure 7 in purple color, where it is evident that the determination is affected by significant boundary effects.We therefore apply the correction described in section 4 and employ eq.(4.18) to solve for the effective mass against our lattice data using eq.(4.20).The resulting effective mass shown in blue in figure 7 is flat at large r, confirming that we are able to successfully account for boundary effects.In the same plot we also show the results of a direct "one-state" fit to lattice data with the m π and c P parameters left free, restricted to the region in r where there is an effective mass plateau, and of another "two-state" fit with an added "excited state" term a 1 In both cases, the fitted value of m π is compatible with the effective mass plateau value.
The results of the position-space determination of m π both with or without gradientflow smearing are given in table 2 and shown in the left panel of figure 8.The result obtained without applying smearing to the sources is m π = 0.1376(4)/a ≈ 288.9(8)MeV. (5.2) We observe that, in the case of the pseudoscalar correlator, smearing does not lead to a better determination.However, the position-space results have a better statistical precision  than the results (also given in table 2) obtained on the same point sources using the standard momentum-space techniques, including when 3d fermion smearing is used.

Nucleon mass
The next hadronic observable that we considered is the nucleon mass m N extracted from the correlator of two uud spinor fields in eq.(2.9), which in position space has the large-|x| behavior given in eq.(4.4).As in the case of the pseudoscalar correlator, in this work we consider only the correlator data averaged over all angles as an estimator of the radial correlator, which in this case takes the form of two contractions C(1) N N (r) and C(2) N N (r), defined as in eqs.(4.9) and (4.10) respectively with the dΩ 4 → r −1 4 (r 2 /a 2 ) |x|=r replacement as in eq.(5.1).We determine the effective masses from both correlators solving eq.(4.20), again with ∆ ≈ 1a, and taking into account that, in principle, the results of the two contractions can be different due to different discretization effects.The results are shown in figure 9 in blue and orange for C(1) N N and C(2) N N respectively, where we observe a boundary effect at large r that leads to an increase of the effective mass.In contrast with the pseudoscalar correlator case, the main boundary contribution to the nucleon correlator does not come from the mirror images considered in section 4, which fall off much faster than in the case of the pion, but from the propagation of intermediate N π states.
We also perform a direct fit to the radial correlators using the appropriate large-r description and with free m N and c N parameters, restricting the fit range in r to the plateau region up to r max = 24a to avoid uncontrolled boundary effects.We note however that we expect any master-field calculation will have a sufficiently large volume that boundary effects become irrelevant.We also consider a fit that includes an extra factor of [  x CNN contractions in eqs.(4.9) and (4.10), respectively.The horizontal lines show the mass obtained from the best "one-state" fit to the tr CNN (green) and tr / x CNN (brown) correlators.The shaded curves show the effective mass corresponding to the best "two-state" fit to the tr CNN (purple) and tr / x CNN (red) correlators described in the main text.(5. 3) The nucleon mass extracted from C(2) N N has a similar error but is systematically larger both with or without smearing, which is compatible with the fact that C(1) N N and C(2) N N can have different discretization effects.Using the same sources we obtain more precise result from position space compared with TMR; for discussion of the error scaling based on the Parisi-Lepage argument see appendix A. Contrary to the pion case, smearing has a visible impact on the determination of the nucleon mass.Applying 3d-fermion smearing reduces the error on m N from the momentum-projected correlator by one third.Gradient-flow smearing also improves the already-smaller error of the position-space estimator by a factor of two.However, this comes at the cost of a distortion of the position-space correlator at short distances, within a range proportional to √ t flow , that is visible in figure 9 at r ≲ 8a.
Including the extra factor of [1 + a 1 mπ r K 1 (m π r)] in the fit model yields compatible results with or without smearing as long as t min ≥ 8a.Empirically, the fitted parameter a 1 is AA (bottom right) and C( 2) AA (bottom left) correlator data to the fitted asymptotic behavior based on eqs.(4.6), (4.7) and (4.8) respectively, with the fitted prefactors of c A and c P set to unity in the ratio.The purple points show the ratio obtained when omitting the boundary effects in the denominator.The blue points show instead the result when boundary effects are included, with the light blue error band representing the statistical error on the correlator data.The orange band shows the result of the fit for the normalization of each correlator, that determines the modulus of the c A and c P amplitude parameters and in turn the bare pion decay constant. of the same order irrespective of smearing, suggesting that the excited states modelled by the extra factor are not suppressed by gradient-flow smearing.This is different from 3d-fermion or other forms of smearing usually employed in the momentum-projected case, which generally contribute to suppressing the amplitude of excited states.

Pion decay constant
To compute the pion decay constant, we perform a combined fit of the pseudoscalar correlator in the radial direction CP P (r) defined on the lattice in eq.(5.1), together with the lattice version of CAP (r) in eq.(4.6) and the two contractions C(1) AA (r) and C(2) AA (r) in eqs.(4.7) and (4.7), C( 1) C( 2) ) 3 GeV 2 ratio position-space 0.0895(7) 0.653(6) 0.0568(4) 0.0869(5) 1.5288 (28) zero-momentum 0.0874(8) 0.633(6) 0.0584(7) 0.0902(7) 1.544 (6) Table 3. Pion decay constant, its ratio with the pion mass, and the HVP from position-space correlators on point sources, compared to zero-momentum projected correlator on the same point sources.The discrepancies between the two rows may be due to cutoff effects or, in the case of the HVP, due to failing to saturate the estimator in both cases.
Using only the correlator data without any smearing of the local A µ currents and P densities, the combined fit allows us to simultaneously extract the mass m π and the amplitudes c P and c A .As in the case of the pion mass discussed in section 5.2.1, agreement with the data at the longest distances is obtained only if the boundary effects are accounted for in each of the four correlators in the combined fit model, which is obtained modifying the r → ∞ asymptotic behavior of eqs.(5.1), (5.4), (5.5) and (5.6) according to the discussion in section 4. At short distances, the data for C(1) AA (r) has the opposite sign, and agrees with eq.(5.5)only at relatively large r.Thus, we fit C( 1) AA (r) data starting from r min = 24a, and CP P (r), CAP (r) and C( 2) AA (r) starting from 19a, 16a and 19a respectively.The results of the fit are am π = 0.1371(5), a 2 c P = 0.1477(13), a 2 c A = 0.012 27 (11). ( We observe that the pion mass obtained with this combined fit is smaller but compatible with the value given in eq.(5.2), obtained from the effective mass plateau of the CP P (r) correlator only and gradient-flow smearing.
Turning to the main focus of this section, these fit results yield a value of the decay constant This position-space result for f bare π has a comparable statistical precision to the result obtained on the same point sources using the standard momentum space techniques, which are also given in table 3. The significant difference between the mean values of the highlycorrelated position-space and zero-momentum results in the table can be attributed to the different discretization effects of the two methods.A similarly significant difference is observed between the values of f bare π /m π computed with position-space or zero-momentum methods.In figure 10 the lattice data for each of the four correlators are compared with the respective asymptotic behaviors both with and without the inclusion of boundary effects, reconstructed from the fit results in eq.(5.7).

Hadronic vacuum polarization
Among the observables that can be extracted from the vector current, the hadronic vacuum polarization (HVP) contribution to the muon anomalous magnetic moment (g − 2) µ plays up to an Euclidean time t cut (for the TMR method) or a radial distance r cut (for the CCS method) at two values of Q 2 , in different color.The statistical errors are smaller than the TMR point markers, and than the CCS line thickness.
a prominent role.Due to the long-standing (g − 2) µ puzzle [47], the HVP contribution is of major phenomenological interest and its determination at the sub-percent level is the objective of many large scale lattice efforts.Here, we limit ourselves to a computation of the related HVP function Π(−Q 2 ) defined in eq. ( 2.15) at space-like momentum transfers of Q 2 = 1 GeV 2 and 3 GeV 2 ; see also ref. [48] for a much higher precision computation.The covariant coordinate-space (CCS) method to extract Π(−Q 2 ) from position-space correlators has been introduced in refs.[45,49] and amounts to integrating the sum of two contributions, each being the product of one of two contractions of the vector current, C(1) V V (r) and C(2) V V (r) from eq. (4.12), and its corresponding kernel, over the radial coordinate r, as shown in eq.(4.11).In spirit, this is very similar to the traditional time-momentum representation (TMR) method where one computes the Euclidean-time integral of the product of the zero-momentum-projected vector correlator and a kernel.The choice of comparatively large Q 2 values helps in lowering the relative weight of the large r distances in the kernel, which have large statistical uncertainties.In spite of this, we find that the extent in r of the correlators that we have available is not enough to saturate the CCS integral, as shown in figure 11 from the fact that the partial integral up to r does not saturate at the largest r.This contrasts with the saturation of the Euclidean-time integral in the TMR method, also shown in figure 11, and confirms a major drawback of CCS methods already observed in ref. [49]: the need for very large volumes and long-range position-space correlators for the CCS integral to saturate.We also note that, even when the estimators are saturated, the resulting values are expected to differ by cutoff effects and finite-volume effects.
With this caveat, we perform a numerical test computing ΠI=1 (−Q 2 ), the I = 1 part of the HVP function.The computation of the HVP function relies on a genuine vector current, thus we use only the unsmeared meson operators.We quote the results of ΠI=1 (− where the bare subscript indicates that we are not including the renormalization factor Z V of the local vector current.The error on ΠI=1 bare is 0.7 % and 0.6 % at the smaller and larger value of Q 2 , respectively.This can be compared with the result obtained from the same correlator data using the TMR method: ΠI=1 bare (−1 GeV 2 ) = 0.0584 (7) and ΠI=1 bare (−1 GeV 2 ) = 0.0902 (7), with an error of 1.2 % and 0.8 % respectively.These results are given in table 3.While in both cases the statistical error of the traditional method is roughly 50 % larger, this analysis does not take into account the systematics associated with the truncation of the integral, which is most likely larger in the CCS case.
To eliminate the need for the renormalization factor Z V of the local vector current, in table 3 we also quote the ratio between ΠI=1 (−Q 2 ) at the two values of Q 2 , ΠI=1 (−3 GeV 2 ) ΠI=1 (−1 GeV 2 ) = 1.5288 (28). (5.10) Taking the ratio also has an effect in reducing the relative statistical error to just 0.18 %.This can be compared with the result obtained from the same correlator data using the TMR method where one finds 1.544 (6).For reference, in ref. [48] the ratio in eq.(5.10) has been computed to high precision using an O(a)-improved vector current on a CLS ensemble with a similar mass and a ≈ 0.064 fm (N200), giving 1.5493 (19), while in the continuum limit the same ratio evaluates to 1.484 (14).

Conclusions
In this work, we have described and tested methods for using stochastic locality to maximize the information extracted from a given set of numerical gauge fields.As is stressed throughout, the strategies used to optimally profit from locality must be examined on an observable-specific basis and should be considered both for the central values and the uncertainties of a given correlation function.
After providing standard correlator definitions and a specification of our lattice setup in section 2, in section 3 we have described our approach to estimate both central values and uncertainties.An important variation in our study is in the choice between correlators projected to definite spatial momentum, i.e. in the time-momentum representation, and position-space correlators.We have introduced two classes of estimators for each of these: point sources and stochastic wall sources for the time-momentum representation, and point sources and stochastic grids for the position-space construction.We have additionally considered the effect of smearing in each case, so that our comparisons are more relevant to state-of-the art calculations that often employ the latter.
While our correlator estimators are well established in the literature, our approach for estimating the (co)variances is less explored.This is explained in sections 3.2 and 3.3.In a nutshell, the strategy is to view an estimator for the quantity of interest (e.g. a two-point function with some fixed separation) as an observable with a footprint localized near some reference position x (e.g. the source position).One then subtracts the central value and defines a modified correlation function as the product of two instances of the subtracted observable with reference positions x and x + y.A key consequence of stochastic locality is that, as the magnitude of y is taken large, this function decays exponentially.The variance of the spatially averaged estimator of the target quantity is itself estimated from the integral over y of the modified correlation function: we cut this off at long distance, leaving exponentially suppressed corrections.
In practice, the success of this method hinges on whether the integral value is saturated before finite-volume effects distort the estimator of the variance.In section 3.3 we have demonstrated the feasibility of this across various observables.While the energy density at positive flow time, for example, has a variance that saturates very quickly, the situation is much more challenging for the zero-momentum pseudoscalar two-point function, especially at large source-sink separations.That the latter behaves poorly is intuitive: the quantum numbers, the spatial sum, and the source-sink separation all contribute to the observable having a large footprint, so spatially decorrelating a second instance of the quantity is challenging.
These results naturally led to the position-space approach presented in section 4. While position-space correlation functions have more desirable locality properties, one expects excited-state and wrap-around effects to be more challenging without momentum projection.It is thus not obvious how the costs and benefits will balance in the final result.To explore this, we have compared the extraction of four different observables (the pion and nucleon mass, the pion decay constant, and the hadronic vacuum polarization function) using position-space and time-momentum representation correlators.Our results, summarized in section 5, show that position-space methods can give competitive determinations.While further investigation is needed, one generally finds lower statistical uncertainties from the position-space determinations, especially for the nucleon mass.In a full analysis, this may be offset by additional systematic uncertainties because our analysis of position-space correlators uses the approximation of continuum-like behavior at large distances, which is not required for analyzing momentum-projected correlators.
To summarize, we identify three take-home messages of this study: First, whenever multiple measurements are performed on a given gauge field, it is valuable to construct a modified correlator in order to study the spatial decorrelation of the observable.In the case of immediate decorrelation on the sampled set, one can treat the samples independently as with decorrelated gauge fields.Second, position-space correlation functions can be competitive in the determination of masses, matrix elements and other lattice observables based on two-point functions.Third, and finally, there is no need for a strict distinction between traditional calculations with some requisite number of configurations and masterfield calculations with a requisite space-time volume.Instead, for any combined volume in the five-dimensional combination of space-time and Monte Carlo time, one can use datadriven studies of (auto)correlations to empirically examine the statistical precision for a given observable.
Future work in this vein includes the application of these strategies to larger spacetime volumes on which the saturation of the variance correlator can be achieved for more observables.Additional work is also needed to rigorously treat the discretization and finite-volume effects in the fit functions used to extract observables from position-space quantities.Furthermore, while this work has considered the two extremes of standard momentum projection and position-space calculations, it would be instructive to consider a compromise between the two scenarios.In particular, momentum projection with a Gaussian wave packet or similar might well give an optimal balance between reducing the footprint and defining a useful estimation of the physical quantity of interest.An initial exploration of this is presented in appendix B.
Despite the clear need for further exploration, it seems likely that careful use of stochastic locality, along the lines summarized in this article, will play a useful role in the next generation of lattice QCD calculations.1. zero momentum (finite volume): e −(m N −3mπ/2)t , 2. zero momentum (infinite volume): t 1.5 e −(m N −3mπ/2)t , 3. radially averaged: r 2.25 e −(m N −3mπ/2)r .
This somewhat improved asymptotic behavior may partially explain the reduced uncertainty we obtained for the nucleon mass in section 5.2.2.

B Truncating three-dimensional sums for correlator estimators in large volumes
A generic strategy for reducing the noise in correlation functions that involve a sum over some of the lattice dimensions is to approximate the sum by restricting it to a sub-volume.
If one can include the dominant contributions to the signal while excluding noisy longdistance contributions, this offers the prospect of trading a small bias for a reduction in statistical uncertainty.This idea has been explored in ref. [54] but is also widely used e.g. when restricting temporal sums in calculations of the hadronic vacuum polarization contribution to the muon anomalous magnetic moment.
To this end, we define the truncated zero-momentum correlator Pion and nucleon correlators with gradient-flow smearing on the 64 3 ensemble are shown in figure 14.In general, to reach the same fraction C cut / C, larger source-sink separations t require larger cutoff radii r max .The pion correlator is slow to saturate: even r max = L/2, i.e. summing over the largest ball that fits inside the lattice volume, is not consistent with the full sum.For the nucleon, the signal does saturate by r max = 20a.However, the noise also saturates, so that there is no statistical benefit from truncating the sum.This finding is consistent with the result of ref. [54] for connected diagrams: both the signal and the noise of the position-space nucleon correlator decay exponentially at large distance.

C Estimating the error of the error and other biases
First we examine the bias in our estimator ⟨⟨Γ αβ ⟩⟩ by considering for the remaining terms we obtain By adding and subtracting ⟨O α ⟩⟨O β ⟩ we get the straightforward extension of the corresponding result in ref. [42] ⟨⟨Γ The second bias that we consider originates from the truncation of the integral over Γ αβ .In eq.(3.14) periodic boundary conditions in all directions have been assumed, implying that we can average N = V /a 4 estimators of Γ αβ (x − y) (for all distances x − y).In such cases only an exponentially small bias is introduced.The situation is different if instead we consider the case with open or Dirichlet boundary conditions along at least one dimension.This situation is realized when the observables are known in a sub-domain of the lattice or if open boundaries are employed along the time direction for example.Taking this second simpler case for a lattice geometry of L 3 × T , with T denoting the temporal extent, we have with m the mass scale dominating the behavior of Γ αβ at large distances.A similar correction is present in Markov chains [55].
We now analyze the statistical error of the covariance matrix ⟨⟨C αβ (R)⟩⟩.To simplify the notation, we denote sums |x|≤R with x and we examine the covariance of ⟨⟨C αβ (R)⟩⟩ with ⟨⟨C γδ (R)⟩⟩: where Γ ′ αβ (x) ≡ ⟨⟨⟨Γ αβ (x)⟩⟩⟩ is the biased expectation value of our estimator ⟨⟨Γ αβ ⟩⟩.Setting α = γ and β = δ returns precisely var(⟨⟨C αβ (R)⟩⟩).Focusing on the four-point function, we neglect the fully connected part, relevant only when x, y, z, z ′ are all close to each other, and like in ref. [42] consider solely the factorized expectation values12 In the second line we have explicitly neglected the bias correction derived in eq.(C.4), since it amounts to an effect of higher order in 1/N , while we kept it in the first term to cancel the corresponding one in cov (⟨⟨C αβ (R)⟩⟩, ⟨⟨C γδ (R)⟩⟩).Noting that the equation above depends only on the difference z − z ′ , we introduce t → z − z ′ and t → z − z ′ − y in the last two terms respectively, simplify the sum over z ′ , thus obtaining where N (R) = y = y θ(R − |y|).

D An automatic windowing procedure for the determination of the masterfield error
In this appendix we focus on the errors, namely C αα (R), and we drop the corresponding subscripts (also from Γ αα ).Starting from the following Ansatz for the asymptotic behavior of the correlation function Γ, with integer k ≥ 0, and assuming to use our error estimators in D dimensions, the relative systematic error due to the truncation of the sum in eq.(3.20) is approximated by A good balance with the error of the error derived in eq.(C.9) is achieved for the minimal value of R where the derivative becomes positive.Note that above we used eq.(C.9) with the continuum version of N (R).
Following Ulli Wolff's suggestion in ref. [42], we replace τ with the calculated τ (R) ≤ τ and m with the effective mass (from eq.(D.2)) We introduced the parameter S τ which plays a similar role as in ref. [42], while the parameter k is a new tunable parameter.From our experiments S τ ≃ 2 and k = 0 return acceptable windows.This procedure is implemented in the pyobs library [51].
E Long-T simulations, a master-field variation using very cold lattices The long-T approach was put forward in ref. [32] as a variation of the master-field idea.
To summarize, the master-field approach presents a strategy to manage contaminations originating from the topology freezing in simulations, aside from proposing a new statistical method for evaluating observables and their uncertainties as is studied in the main body of this work.To this end, note that the effect of freezing scales as 1/V for sufficiently large spacetime volume V , while statistical uncertainties are only suppressed by 1/ √ V .For a fixed number of gauge fields they are therefore guaranteed to dominate over freezing effects as the space-time volume increases.An effective suppression of topological effects can therefore be reached by increasing all four dimensions but also by increasing only one dimension while keeping the others fixed.Choosing to elongate the T dimension while keeping the spatial volume fixed at a "traditional" size, e.g. at the small volume of L = 32a in our simulations here, we arrive at the long-T approach.
In our previous study, we were in particular interested in the suppression of topological freezing with T , dubbed defrosting, in simulations with a lattice spacing where topology freezing becomes a problem, i.e. for a approximately 0.055 fm for our choice of action.The two ensembles examined here differ only in geometry while having the same four-volume: the 32BT follows the long-T approach and has an elongated direction T = 768a with fixed "traditional" spatial size L = 32a, whereas the 64B has an increased spatial size L = 64a with T = 96a.Eq. (E.2) produces a larger statistical error for ensemble 64B than for 32BT because its spatial sum includes more noisy contributions from large spatial separations of the topological charge density; the smaller footprint of the estimator in eq.(E.4) produces a smaller error approaching that of ensemble 32BT.
However, in that study, resource limitations put a comparison of results in the large spacetime volumes achieved by the long-T method and corresponding spatial volumes out of reach.Furthermore, due to the onset of topology freezing, including small volumes in a comparison of topological observables became more difficult.The situation in our case here is different: With the selected lattice spacing of a = 0.094 fm, topology freezing is not problematic, and we are able to generate and compare lattices in a consistent manner.
Our main observable is the topological susceptibility χ top .We estimate it from the truncated sum (see e.g.ref [56]) ⟨⟨χ top (t)⟩⟩ = a T x 0 χ top (t; x 0 ) , (E.1) with χ top (t; x 0 ) = a 7 L 3 −t≤y 0 ≤t x,y∈L 3 q(y, x 0 + y 0 )q(x, x 0 ) , ⟨χ top (t; x 0 )⟩ t≫0 ≈ χ top , (E.2) and q(x) = − 1 32π 2 ϵ µνρσ Tr[F µν (x)F ρσ (x)] . (E.3) F µν (x) represents the field-strength tensor which we determine at positive gradient-flow time [43], specifically at t flow /a 2 = 3.125 equivalent to √ 8t flow ≃ 0.47 fm, using the clover discretization.In this appendix we examine two ensembles with equal space-time volumes and same bare parameters, namely the 32BT and 64B lattices (cf.table 1), and we use for both the same number of measurements (equally spaced in MD units).Therefore, we expect similar statistical uncertainties, which we estimate using correlations along Euclidean and Monte Carlo time for the 32BT and 64B ensemble respectively 13 .However, as we can see from the results in figure 15, the 64B ensemble has in general larger errors.This effect is understood as originating from the zero-momentum projection, which for the 64B implies including pairs of points further apart (w.r.t.32BT) which contribute only to the noise of the observable.To clarify this issue, on the 64B lattice, we estimate the topological susceptibility using a truncated spatial sum of the temporally-summed correlator, χ top (r; x) = a 5 T x 0 ,y 0 |y|≤r q(x + y, y 0 )q(x, x 0 ) , ⟨χ top (r; x)⟩ r≫0 ≈ χ top . (E.4) The corresponding results are labelled in figure 15 as "64B, L 3 ".Their statistical error is closer to the 32BT zero-momentum estimator.A nice agreement of the asymptotic plateau values for all estimators of χ top is found, showing the effectiveness of the long-T approach as a variation of the more general master-field paradigm.

F Observing and diagnosing "exceptional" configurations
While performing measurements on the 64B ensemble listed in table 1, we observed significant fluctuations in the Monte Carlo history of two observables: the topological susceptibility and the variance of the pseudoscalar correlator.These fluctuations resulted in spikes that showed an impact on the observable.For instance, when measured on all the available configurations, the topological susceptibility on 64B as shown in figure 15 was significantly different than the value on the ensemble 32BT, which has the same four-dimensional volume.Moreover, the error on the hadronic quantities measured in section 5 was larger than expected.
To investigate the issue, we studied the spectrum of the light quark Dirac operator D by computing σ 0 , the square root of the smallest eigenvalue of the Hermitian operator D † D using a subspace iteration algorithm.The Monte Carlo history of the two observables and of 1/σ 0 is plotted in the three panels of figure 16.In the central panel of figure 16 we plot the fluctuations of the integrated correlation volume τ (R = 10a), defined in eq.(3.22), of the zero-momentum-projected pseudoscalar correlator C P P (x 0 ) at x 0 = 8a.Given the absence of a signal-to-noise problem, and an approximate behavior with four inverse powers of D, it is not surprising to see spikes in its Monte Carlo history coinciding with 1/σ 0 .This shows how the variance defined from stochastic locality may be used to monitor the presence of exceptional configurations.
In the bottom panel of figure 16 we plot the fluctuations of the topological susceptibility defined using the global topological charge, which is equivalent to eq. (E.1) with t = T /2, .Monte Carlo history plots of three quantities on the extended 64B ensemble.In the top row, we graph the inverse of the smallest singular value σ 0 of the Dirac operator with light quark mass.The central row shows the integrated correlation volume τ (R) of the pseudoscalar correlator C P P (x 0 ) as studied in section 3.3, evaluated at R = 10a and x 0 = 8a.In the bottom row, we plot the Monte Carlo history of the topological charge susceptibility at flow-time t flow /a 2 = 3.125 (see also appendix E).For both observables in the central and bottom line, the ensemble average is subtracted and only the fluctuations around the mean are shown.The orange lines are MC histories smeared with a Gaussian filter with a width of three configurations.
as measured on each configuration around the ensemble mean.The correlation of this gluonic observable with the low-lying spectrum of the Dirac operator is weaker and to highlight the presence of larger fluctuations in the first part of the Monte Carlo chain, we smear the Monte Carlo histories with a Gaussian kernel with width equal to 3 (in units of configuration numbers), represented by solid lines in figure 16.
Empirically, if we use only the second part of the Monte Carlo chain, avoiding the main spikes visible in figure 16, we observe that the topological susceptibility agrees between 64B and 32BT as shown in figure 15 and the error on hadronic quantities is reduced or stays approximately constant in spite of the lower statistics.Owing to these observations, we decided to limit all measurements used in the paper for this ensemble to the last 50 configurations (as reported in table 1), from 79 up to 128, of the Monte Carlo history in figure 16.

AF
Extension of the Parisi-Lepage argument: Error volume scaling B Truncating three-dimensional sums for correlator estimators in large volumes C Estimating the error of the error and other biases D An automatic windowing procedure for the determination of the masterfield error Observing and diagnosing "exceptional" configurations 1 Introduction

( 2 . 1 - 2 . 4 )Figure 1 .
Figure1.Sketch of the estimator C sgrid with a grid of point sources over a two-dimensional window of the lattice.The set G of source points ∈ G is a regular grid with spacing b.A mesonic two-point function is evaluated at sink point x that is in the domain defined by y ∈ G and within a distance r max from y. Two of the spurious contributions from the "wrong" source are shown in light gray.

Figure 3 .
Figure 3. Variance of ⟨⟨E t0 ⟩⟩ calculated from a three-dimensional (spatial) slice, i.e. using Λ L 3 in eq.(3.28).Left: scaling of the variance with the blocking size b/a for the 64B ensemble.The case b/a = 1 amounts to considering all (L/a) 3 points.Right: scaling of the error with the sparsity of the three-dimensional grid.The red curve corresponds to taking all N = 48 3 points of a time slice of the 48B ensemble.

Figure 4 .
Figure 4. Integrated correlation volume τ (R) from eq. (3.22) as a function of the source-sink separation x 0 and summation radius R, for zero-momentum projected correlators measured on the 64B ensemble using stochastic walls.Left: pseudoscalar two-point function, C P P (x 0 ).Right: vector-vector correlator C V V (x 0 ).

3 Figure 5 .
Figure 5. Integrated correlation volume τ (R), cf.eq.(3.22), of the vector-vector correlator at short (left panel) and long distances (right panel), as a function of the pion mass and summation radius R. The three ensembles 32A, 32B, and 32C have the same volume but differ in the simulated values of the quark masses, with m π L ranging from 3.3 up to 6.3.

Figure 6 .
Figure 6.In this figure we compare different estimators for several correlators, cf.eqs.(4.5), (4.9) and (4.12), defined in position space as a function of the radial distance.Specifically we consider up to 6 random sources with support on a 3 × 2 3 grid, with fixed offset, and up to 24 point sources gradually placed to fill the same point locations of the grid.Left: ratio of the estimator based on N η = 6 stochastic grids over the estimator based on 24 point sources.Correlated errors are represented by the error bands.Right: product of computational cost and variance for the stochastic grid estimator (empty markers) as a function of N η compared to the point sources estimator (solid markers) as a function of N src .For each choice of pseudoscalar, vector or nucleon correlator (in the same colors as in the plot on the left), the cost × variance product is relative to 1/24th of the cost × variance of N src = 24 point sources.

Figure 7 .
Figure 7. Effective mass corresponding to the radial pion correlator CP P (r).The purple points are obtained applying the infinite-volume formula for the long-distance behavior in eq.(4.1), whereas the blue points are obtained by accounting for finite-volume effects, eq.(4.18).The green line shows the mass obtained from the best "one-state" fit to the correlator and the red curve shows the effective mass corresponding to the best "two-state" fit to the correlator as described in the main text.

Figure 8 .
Figure 8. Left: comparison of the value of m π obtained from the "one-state" fit for various choices of correlators and smearing.Right: comparison of the value of m N obtained from the "one-state" fit for various choices of correlators and smearing.

Figure 9 .
Figure 9. Effective mass corresponding to the radial nucleon correlator CNN (r).The blue and orange points are obtained by applying the long-distance behavior to the tr CNN and tr /x CNN contractions in eqs.(4.9) and (4.10), respectively.The horizontal lines show the mass obtained from the best "one-state" fit to the tr CNN (green) and tr /x CNN (brown) correlators.The shaded curves show the effective mass corresponding to the best "two-state" fit to the tr CNN (purple) and tr /x CNN (red) correlators described in the main text.

a 1 mπ r K 1
(m π r)], with a free amplitude parameter a 1 and a mass parameter fixed to m π , and we observe that this model effectively describes the correlator data for both C(1) N N and C(2) N N to smaller values of r.The results of the position-space determination of m N from both C(1) N N and C(2) N N , with or without gradient-flow smearing, are given in table 2 and shown in the right panel of figure8.From C(1) N N using gradient-flow smearing, we obtain the value m N = 0.494(6)/a ≈ 1037(13) MeV.

Figure 10 .
Figure 10.Ratio of CP P (top left), CAP (top right), C(1) AA (bottom right) and C(2) AA (bottom left) correlator data to the fitted asymptotic behavior based on eqs.(4.6), (4.7) and (4.8) respectively, with the fitted prefactors of c A and c P set to unity in the ratio.The purple points show the ratio obtained when omitting the boundary effects in the denominator.The blue points show instead the result when boundary effects are included, with the light blue error band representing the statistical error on the correlator data.The orange band shows the result of the fit for the normalization of each correlator, that determines the modulus of the c A and c P amplitude parameters and in turn the bare pion decay constant.

m π t = 1 .1 m π t = 0. 7 Figure 12 .
Figure 12.Estimate of the three-meson contribution to the variance of the zero-momentum nucleon correlator versus box size, normalized to its value in infinite volume.Curves are shown for three values of m π t: 0.7 and 1.5 correspond to t ≈ 1 fm for m π = 135 and 290 MeV, respectively.

Figure 13 .
Figure13.Variance of zero-momentum nucleon correlator versus source-sink separation, for three lattice volumes and fixed pion mass: ensembles 32B, 48B, and 64B.Curves show the estimate f var , multiplied by a single overall normalization factor to line up with the lattice data and with a modification e −Et → e −Et + e −E(Lt−t) to model the effect of the finite time extent.The dotted curve shows the estimate in infinite spatial volume and the dashed curves show the contribution to f var from three mesons at rest.

= 5 t/a = 10 t/a = 15 t/a = 20 Figure 14 .
Figure 14.Truncated zero-momentum pion (left) and nucleon (right) correlators C cut (t, r max ) versus r max , normalized by the central value of C(t).Data are shown for four values of t.

Figure 15 .
Figure 15.Truncated sum for topological susceptibility versus Euclidean time t [eq.(E.2)] or threedimensional spatial distance r with label "L 3 " [eq.(E.4)]; the asymptotic plateau values yield χ top .The two ensembles examined here differ only in geometry while having the same four-volume: the 32BT follows the long-T approach and has an elongated direction T = 768a with fixed "traditional" spatial size L = 32a, whereas the 64B has an increased spatial size L = 64a with T = 96a.Eq. (E.2) produces a larger statistical error for ensemble 64B than for 32BT because its spatial sum includes more noisy contributions from large spatial separations of the topological charge density; the smaller footprint of the estimator in eq.(E.4) produces a smaller error approaching that of ensemble 32BT.
Figure 16.Monte Carlo history plots of three quantities on the extended 64B ensemble.In the top row, we graph the inverse of the smallest singular value σ 0 of the Dirac operator with light quark mass.The central row shows the integrated correlation volume τ (R) of the pseudoscalar correlator C P P (x 0 ) as studied in section 3.3, evaluated at R = 10a and x 0 = 8a.In the bottom row, we plot the Monte Carlo history of the topological charge susceptibility at flow-time t flow /a 2 = 3.125 (see also appendix E).For both observables in the central and bottom line, the ensemble average is subtracted and only the fluctuations around the mean are shown.The orange lines are MC histories smeared with a Gaussian filter with a width of three configurations.

Table 1 .
N f = 2 + 1 gauge ensembles used in this work.All simulations use the stabilised Wilson fermion framework.The coefficient of the exponentiated clover is set to c SW = 1.955242.V /V 0 denotes the ratio of the global lattice volume w.r.t. to 32 3 × 96.For ensemble 64B we discuss in appendix F the identification of an exceptional configuration earlier than these 50.
.22) In addition to |x L ẑ |, defined in eq.(4.19) above, this estimator depends on