An Improved Single-Plaquette Gauge Action

We describe and test a nonperturbatively improved single-plaquette lattice action for 4-d SU(2) and SU(3) pure gauge theory, which suppresses large fluctuations of the plaquette, without requiring the naive continuum limit for smooth fields. We tune the action parameters based on torelon masses in moderate cubic physical volumes, and investigate the size of cut-off effects in other physical quantities, including torelon masses in asymmetric spatial volumes, the static quark potential, and gradient flow observables. In 2-d O(N) models similarly constructed nearest-neighbor actions have led to a drastic reduction of cut-off effects, down to the permille level, in a wide variety of physical quantities. In the gauge theories, we find significant reduction of lattice artifacts, and for some observables, the coarsest lattice result is very close to the continuum value. We estimate an improvement factor of 40 compared to using the Wilson gauge action to achieve the same statistical accuracy and suppression of cut-off effects. The simplicity of the gauge action makes it amenable for dynamical fermion simulations.


Introduction
Cut-off effects are a major source of systematic errors in lattice QCD calculations. Improved lattice actions are valuable for obtaining reliable continuum results, but usually imply an increased numerical effort. The Symanzik improvement program is a systematic method to eliminate cut-off effects, order by order in the lattice spacing a, by including additional operators in the lattice action, beyond the standard plaquette term [1][2][3][4]. These operators, which have a larger space-time extent than the standard term, lead to greater numerical cost in Monte Carlo simulations. The coefficients of the additional operators can be fixed either perturbatively, by expanding the lattice operators in continuum operators of increasing dimension, or nonperturbatively by adjusting them to satisfy continuum physics constraints. A more radical improvement strategy underlies the perfect action approach, which attempts to eliminate cut-off effects to all orders of a, at least at the classical level [5][6][7][8][9]. The classically perfect fixed point action, which is located on the critical surface at the end of a renormalized trajectory, is very complicated. Still, it can be parametrized to high accuracy with a relatively large number of terms, which is thus costly. The parametrized fixed point action is then used for numerical simulations of the quantum theory away from the critical surface. This has led to a substantial reduction of cut-off effects in a variety of asymptotically free field theories, ranging from the 2-d O(3) model to QCD. A different approach uses mixed fundamental-adjoint actions to reduce cut-off effects without extending the space-time extent of the operators beyond a single plaquette, and thus with only a moderate increase of the computational cost [10][11][12][13]. In [14,15] the cut-off effects were studied using a modified single-plaquette gauge action proposed in [16] for theoretical purposes.
2-d O(N) models share several features, including asymptotic freedom and a nonperturbatively generated mass gap, with 4-d non-Abelian gauge theories. Hence, they serve as a good testing ground for lattice improvement studies. Interestingly, in contrast to what Symanzik improvement suggests, the cut-off effects in the 2-d O(3) model appeared to be O(a) instead of O(a 2 ) [17,18]. A careful analysis of this apparent contradiction verified the Symanzik O(a 2 ) prediction, but showed that easily accessible lattice spacings are affected by large logarithmic corrections, which mimic O(a) behavior [19,20]. Recently, different lattice actions have been studied with the goal to improve the cut-off effects [21]. In addition to the standard action, this study used a topological action [22], which constrains the maximal angle between neighboring spins and is therefore invariant under small field deformations. Although it does not have the correct naive continuum limit and it violates the Schwarz inequality between the action and the topological charge, it still yields the correct quantum continuum limit [21]. Combining this action with the standard action one gets an improved constrained action, which eliminates the lattice spacing effects almost entirely. Using this improved action, it was possible to study the θ-vacuum angle in the 2-d O(3) model, which turned out to be a relevant parameter of the theory that does not get renormalized non-perturbatively. For the first time, this numerically confirmed the conjectured exact S-matrix results at θ = π [23] beyond any reasonable doubt. This also confirmed the existence of a conformal fixed point at θ = π, where the model reduces to the WZNW model at low energies. This study has also been a basis for further investigations to demonstrate walking near the conformal fixed point close to θ ≈ π [24]. The essential features of walking technicolor models are shared by this toy model and can be accurately investigated by numerical simulations. Optimized lattice actions have also been studied intensively in [25] for 2-d O(N ) models, where it has been shown that cut-off effects can be reduced to the per mille level. A topological lattice action has also been used in a recent study of 4-d U(1) gauge theory, to demonstrate that the correct continuum limit is obtained, to examine the effect of the lattice action on monopole condensation in the confined phase, and to test a method to measure the free energy [26].
In this work, we apply a similar strategy to gauge theories. Our approach is different from Symanzik's improvement program [1], where one adds operators with higher dimensions to the standard action to eliminate the leading cut-off effects. The Symanzik improved action is perturbative, even if the coefficients are determined non-perturbatively. The experience with the O(N ) non-linear sigma model suggests that at moderate lattice spacing used in the numerical simulations the main cause of the cut-off effects are the large local fluctuations of the action density. A truly non-perturbative action, like the topological action or the constrained action with negative β performs surprisingly well in that case. With one extra parameter (and a "cheap" action) one can reach a strong suppression of cut-off effects.
For the SU(2) and SU(3) pure gauge theory here we study a slight modification of the constrained action. We found that the improved action decreases the cut-off effects of many quantities including torelon masses on asymmetric lattices, the static potential, and observables related to the gradient flow of the gauge fields. The simplicity of the gauge action makes it easy to embed it in simulations of gauge theories with dynamical fermions at little extra cost, with the possible gain of reduced lattice artifacts. Our approach differs from those used in [13] and [14,15] in two aspects. Our set of single-plaquette actions considered is broader -it includes the possibility β ≤ 0 (see below). In other words, we do not restrict ourselves to actions having a naive continuum limit, hence we can optimize the action on coarser lattices as well. Secondly, we use the torelon masses in small boxes to optimize the action -these are much easier to measure than the string tension σ or the deconfining temperature T c . This paper is organized as follows. In section 2 we discuss the parametrization of the improved action, the procedure for optimization of the action parameters, and some basic properties of torelon states. Section 3 shows our simulation results for SU(2) gauge theory, where we describe the tuning of the action parameters and the reduction of lattice artifacts for torelons on asymmetric spatial volumes and for the static quark potential. In section 4 we present similar findings for SU(3) gauge theory, as well as a study of the cut-off dependence of observables obtained from the gradient flow of the gauge fields. Section 5 details what algorithms we use and the numerical cost of Monte Carlo simulations with this novel action. We finish with our conclusions in section 6.

Determination of the parameters of the action
Consider the constrained action for pure Yang-Mills theory with the action density associated with the plaquette S p = βw , for w < δ , ∞ , otherwise . (2.1) Here w = 1− 1 N TrU p , where U p is the standard plaquette matrix, and plaquette values larger than the constraint δ are prohibited. Keeping in mind that the gauge action could be used in Hybrid Monte Carlo simulations, we have chosen a smooth version of the constrained action with For large power q this has the same effect as the constrained action with δ ≈ γ −1/q . In our simulations we used a fixed value of the power, q = 10. 1 To reduce the cut-off effects one can choose two appropriate physical quantities. One of them is used to set the lattice spacing a, the other to estimate the size of the cut-off effects at the given resolution. For the 2-d O(N ) spin model these were [25] the mass gap measured on a long strip with spatial sizes L and 2L (cf. step scaling function, [27]). For the gauge theory we considered the energy gap between the vacuum state and states with given electric flux wrapping around the periodic spatial directions [31], in short the "torelon masses". The two quantities used for optimizing the action were the torelon masses m 100 (L) and m 110 (L) in an L 3 spatial box, corresponding to fluxes wrapping around along an axis and along a diagonal. Our procedure was the following. Taking first a lattice of size L 3 × L t we determined a line β = β(γ) along which u 100 (L) ≡ m 100 (L)L = u is fixed. (For SU(2) we took u = 1.375, while for SU(3) u = 1.0.) Note that β(γ) is a decreasing function, and at some γ it becomes negative. It is important that we do not restrict ourselves to the β > 0 region. 2 The distribution of the plaquette variable w for the standard action and for the improved action at aT c ∼ 1/4 is shown in Fig. 1. The optimization of the action is done by moving along the β(γ) line to minimize the deviation of u 110 (L) ≡ m 110 (L)L from its continuum limit. The latter was obtained by measuring the corresponding torelon masses at the same physical point u 100 (L) = u and finer resolutions with the Wilson action and extrapolating to a/L = 0. This way one obtains the pair of couplings (β, γ) which are optimal for this resolution (and the given choice of observables). One could proceed further on lattices with larger L/a, similarly to the case of the O(N ) spin model. Instead, we have chosen a less ambitious optimization: for finer resolutions we kept the same γ as obtained on the coarse lattice L/a = 4, and tuned only β to obtain u 100 (L) = u for L/a = 6 and 8. Then using these pairs of couplings (β(a), γ) we measured different quantities, like torelon masses on spatial lattices of different shapes, the static qq potential, and observables related to the gradient flow of the gauge fields.
It is worth to discuss briefly our choice of the basic physical quantity used for optimization. The torelon is an excitation characterized by an electric flux wrapping through the torus L 1 × L 2 × L 3 × ∞ in one (or several) periodic spatial directions. The Hilbert space of the transfer matrix is split into sectors characterized by quantum numbers k = (k 1 , k 2 , k 3 ), where k i = 0, . . . , N −1 for SU(N ). In particular, the value k 1 describes the transformation property of the corresponding state (the wave function in the strong-coupling basis) with respect to the multiplication by z = exp(2πi/N ) of links U 1 (x 1 , x 2 , x 3 , t) at some plane with a given fixed x 1 . Such a state can be created from a state in the vacuum sector k = (0, 0, 0) by multiplying it by a product of traces of Polyakov loops, or by the trace of a single Polyakov type loop wrapping around (k 1 , k 2 , k 3 ) times the spatial volume. We denote the trace of the Polyakov loop Φ i in the direction i on time slice t by 3 and With this the torelon mass m k = m k (L 1 , L 2 , L 3 ) is obtained from the exponential fall-off of the correlation function The torelon mass (more precisely the energy difference between the lowest state in the sector characterized by electric flux k and the vacuum state) has a special dependence on the size and shape of the 3-volume. For small volumes L 1/T c it is extremely small (it is given by a tunneling through a high barrier [32]). In this case the flux is completely spread in the transverse direction. In a cubic 3-volume L 3 with increasing L the flux assumes a finite width ("flux tube") while its energy increases as m 100 (L) ∼ σL, where σ is the string tension. There is a relatively sharp transition between these two regimes, and we have chosen our physical lattice sizes (i.e. the value of u ) to be roughly in this region.
For asymmetric volumes m 100 (L 1 , L 2 , L 3 ) increases with L 1 , and decreases with increasing transverse sizes L 2 , L 3 . For L 1 = 1/T c , L 2 = L 3 = ∞ the system undergoes a phase transition. We work here, however, in volumes where all spatial sizes L i are of O (1/T c ), hence the observables are smooth functions of β.

The SU(2) case
On a cubic spatial volume L 3 we define the fixed physical volume via the dimensionless combination m 100 L ≡ u 100 (L) = u = 1.375 i.e. the lattice size is measured in torelon mass units. At the chosen u value we measured the diagonal torelon state u 110 (L) on cubic spatial lattices of size L/a = 4, 6, 8, 10 using the Wilson action. The temporal extent L t was chosen to be either 10L or 20L, the former corresponding to free boundary conditions in the time direction and the latter to periodic boundary conditions in the time direction. The advantage of free boundary conditions is that one can use a smaller lattice volume, with the drawback that time-like correlators can only be measured sufficiently far way from the ends. We used both setups to cross-check that they give consistent determinations of the torelon mass. The extrapolation to the continuum limit using a linear fit in a 2 gave u 110 (L) = 2.888(5) (cf. figure 2).
To optimize the couplings in eq. (2.2) we tried first the choice q = 2 on a 4 3 ×80 lattice, but a more significant improvement of the cut-off effect has been found by increasing the power, and for the rest of our simulations we took q = 10. Increasing γ from the standard action case (γ = 0) helps to decrease the cut-off effect for u 110 (L). We went up with this until γ = 52, corresponding to an effective cut δ = γ −1/q = 0.67 for w. At L/a = 4 and γ = 52 the condition table 1). Increasing γ further -i.e. decreasing the effective constraint δ -lowers β, and the action density becomes restricted practically to a narrow region for w closely below δ. This would slow down significantly the effectiveness of the Monte Carlo simulations. We repeated the procedure for the improved action on L/a = 6 and 8, holding q = 10 and γ = 52 fixed. The u 110 torelon masses measured using the standard Wilson action and separately the improved action are plotted in figure 2, together with the extrapolations to the continuum limit. The couplings of the actions are given in table 1, where we include an extrapolation from simulations to the chosen value of the physical point u 100 (L) = u = 1.375. For the error propagation we used du 110 (L)/du 100 (L) ≈ 1.8, measured by repeating the simulation 4 at slightly different β.
In contrast to the situation with the O(N ) non-linear sigma model, for the SU(N ) case we could not completely eliminate the cut-off effects for the chosen pair of physical quantities, u 100 (L) and u 110 (L), on the coarsest L/a = 4 lattice using the single-plaquette improved action with only one tunable parameter. However, the improved action significantly reduces the cut-off effect at L/a = 4 down to 1%, compared to 6% for the Wilson action, and by L/a = 6 the improved action result is essentially compatible with the continuum value. For both actions, the lattice artifacts appear to be O(a 2 ) and we make our continuum extrapolations assuming quadratic dependence on the lattice spacing. We see very good agreement between the extrapolated values for the two lattice actions. We could make a more accurate determination for the continuum value of u 110 (L) with a constrained fit of Wilson and improved action data which demands that they have a common continuum limit, but that would not serve our purpose here to check for consistency between the two independent sets of simulations.

Scaling Tests
Once the parameters of the action have been set as described above, we can proceed to examine the cut-off dependence of other physical quantities, to see what improvement the new action delivers. We start with a discussion of torelon masses measured on asymmetric spatial volumes.
We have measured the torelon masses on asymmetric spatial lattices both for the improved action and the standard Wilson action. Note that this is a completely independent set of lattice simulations from the ones which were used to tune the action parameters.    We have considered shapes of type (L, L, 3L/2), (L, 3L/2, 3L/2), (L, L, 2L) and (L, 2L, 2L), with the shorthand notation (LLr), (Lrr), (LLR) and (LRR) in the plots and text. The corresponding results are shown in Figs. 3-5 for torelons which wrap around either one or two of the short spatial directions. One could also try to detect lattice artifacts in the heavier states such as the u 111 torelon in symmetric and asymmetric spatial volumes, but we found these masses could not be extracted with sufficient accuracy to be useful for comparison of cut-off dependence. Let us first examine the u 100 states. As discussed earlier, the torelon wrapping around the shortest distance gets lighter as the transverse spatial directions increase in size. For example the u 100 (LRR) state is 2.5 times lighter than u 100 (L) = 1.375. This makes their determination from the exponential decay of Polyakov loop correlators somewhat easier, as the signal persists for larger time separation. We see that both for the improved and the Wilson actions the lattice artifacts again appear to be O(a 2 ). Once extrapolated to the continuum, there is very good agreement between the two sets of simulations, except for some possible tension for u 100 (LRR). Note that there is no tuning done at this stage: the bare couplings β and γ have been fixed by requiring u 100 (L) = u = 1.375. The improved action has consistently smaller lattice artifacts than the Wilson action, and on the coarsest lattice L/a = 4, the cut-off dependence in e.g. the u 100 (LRR) is reduced from ∼ 25% with the Wilson action to ∼ 4% for the improved action. We next discuss the u 110 states. An interesting empirical observation is that, while the u 100 masses approach the continuum limit from above, the u 110 masses approach it from below. Unfortunately, the statistical precision is not good enough to make a strong statement about reduced artifacts with the new action at finite lattice spacing. It is important however that we find consistency in the continuum results between the Wilson and improved action simulations, which are both determined with better than 1% precision.
Given that our tuning of the action parameters used u 100 (LLL) and u 110 (LLL), both measured on cubic L 3 spatial volumes, one of the limitations was the reduced statistical accuracy of the heavier u 110 state. An alternative strategy would be to fix the optimal couplings by choosing the pair u 100 (LLL) and u 100 (Lrr), the second state having a lighter mass and hence being easier to measure. However there is the drawback that one needs two full sets of simulations on both symmetric and asymmetric spatial lattices to complete the tuning. 5 We pursued this approach and found the results were similar: with the new lattice action we could not completely eliminate the cut-off dependence on the coarsest L/a = 4 lattice. Hence we do not show these results here.
At this point it is interesting to note that the SU(2) case is special. Considering a large spatial volume L 3 , the flux tubes going along the diagonal directions, say along (1, 1, 0) and (1, −1, 0) are obviously two different states. However, they belong to the same sector k = (1, 1, 0) in the SU(2) case. As a consequence, these states are mixed, and the eigenstates of the transfer matrix are even/odd w.r.t. 90 • rotation in the 1-2 plane. The operators producing these mass eigenstates from the vacuum sector can be constructed by One expects that the energy difference between the odd and even lowest states is relatively large for small L (where the width of the flux spreads over the available volume, and there    is a large overlap between fluxes going along the two diagonals) and tiny for large L. Note that the operator given in (2.4) is an even one. Similar eigenstates also appear in the 111 sector. Although for some cases we measured the odd torelon masses like m (−) 110 (L), we did not use them in this work.

Static Potential
In this section we present the scaling behavior of quantities related to the static potential. Usually the force between quarks is used as a practical and simple way to fix the scale, relating the bare coupling and the lattice spacing in physical units [29]. Here we consider the reverse approach. We investigate the approach to the continuum limit of the force between static quarks after fixing the scale with the torelon mass.
We have performed numerical simulations of SU(2) Yang-Mills theory comparing the standard Wilson action and the improved one. We have considered the bare couplings tuned with the torelon mass m 100 (L) for L = 4, 6 and 8 and listed in table 1. The actual sizes of the simulated lattices were 24 4 , 36 4 and 36 3 × 48, respectively, for the three above parameter sets. The static potential has been measured on axis from the two-point correlation function of Polyakov loops at distance r where L 0 is the lattice size along the temporal direction. The Monte Carlo simulations have been carried out using the multi-level algorithm [30]. The force F is obtained from the static potential by where r is a properly chosen point between r and r − 1. The simplest case is the midpoint r = r − 1/2. The cut-off effects can be somewhat reduced by choosing r = r I (r), the tree-level improved midpoint distance between r and (r − 1) [29]. We present here the naive choice r = r − 1/2, but the qualitative behavior of the cut-off effect is the same with the other choice as well.
The force is a dimensionful physical observable and it is useful to investigate its scaling behavior by considering the dimensionless quantity H(r) = F (r)r 2 . (3.5) We have studied the approach to the continuum limit of two scales, r 1 and r 2 defined as the distances where H(r) has the values 1.0 and 1.65, respectively. Note that -as we mentioned above -one usually fixes the scale by defining the Sommer parameter, r 0 = 0.5 fm, as the distance where H(r 0 ) = 1.65. In figure 6 we show the scaling behavior of r 1 /L 0 and r 2 /L 0 as a function of (a/L 0 ) 2 where the scale L 0 is defined by the torelon mass through L 0 m 100 (L 0 ) = 1.375. With the improved action, the reduction of lattice artifacts in the static force is less pronounced than in the torelon masses. For both actions the artifacts appear to be linear in a 2 and there is very good agreement between the continuum extrapolation results. It is important to note that the lattice artifacts of the static potential can be only partially attributed to the properties of the lattice action. The choice of the operator (say smeared Polyakov loops versus naive ones) also contributes, and it is not easy to separate these effects. Even in continuum electrostatics, the force between two charges of cubic shape is not exactly proportional to 1/r 2 , it also depends on the relative orientation of the cubes.

The SU(3) case
Due to its obvious relevance as part of QCD, and as a possible prelude for future work, we also tuned and tested the improved action in SU(3) gauge theory, and compared it to the standard Wilson action to see how much the cut-off effects could be suppressed. Because much of the procedure is similar to the SU(2) work described above, we avoid a detailed description of the common aspects.
We use the same parametrization of the gauge action as in eq. (2.2) and the same value q = 10 to curb large fluctuations of the plaquette. For the tuning procedure, we again choose to keep the physical volume fixed in units of the torelon mass, with the precise value being m 100 (L)L ≡ u 100 (L) = u 100 = 1.0. Note that for the same value of L/a this corresponds to a somewhat finer lattice spacing than in our SU(2) study. Starting on the coarsest lattice L/a = 4, we tested a range of values of γ, for each one finding the tuned coupling β(γ) where the above condition was satisfied. We simultaneously measured the u 110 torelon mass at the same γ and β(γ) values. We found that for γ = 200 the cut-off effects in the u 110 state were largely removed at this coarse lattice spacing. This corresponds to an effective cut on the plaquette values at δ = γ −1/q = 0.59. Going to larger values of γ significantly reduces the efficiency of the Monte Carlo simulations, with decreasing improvement in reducing lattice artifacts. For this reason, we used the fixed value γ = 200 for the remainder of the SU (3)    We repeated the tuning exercise for the improved action at L/a = 6 and 8, as well as finding the corresponding bare couplings for the Wilson action over a range of lattice sizes from L/a = 4 to 12. Our procedure to tune the action parameters was technically slightly different for the SU(3) case. We measured simultaneously the torelon masses at ∼ 40 nearby β values, and fitted the β-dependence by a 5-th order polynomial. Then from these fits we determined the appropriate β value from u 100 (L) = u and used this to determine u 110 (L). The errors were determined by a bootstrap procedure. To present the data in a similar way as in table 1, in table 2 we give the value u 110 (L) corresponding to "fixed β" and u 110 (L) corresponding to "fixed u 100 (L) = u ". We show the results for the u 110 (L) torelon mass for both improved and Wilson actions in figure 7. At the coarsest lattice spacing, the cut-off effects for the improved action are ∼ 1% compared to ∼ 5% for the Wilson action. Although both are small effects, at the level of accuracy we could reach they are clearly observable. Similar to SU(2), by L/a = 6 the improved action measurement lies almost on top of the continuum extrapolated value. The overall smaller size of the cut-off effects compared to the SU(2) case is possibly due to the choice of a smaller value of u 100 . One contrast with respect to the SU(2) findings is that at this level of accuracy, the continuum extrapolation must be quadratic in a 2 for the Wilson action if one wishes to include the L/a = 4 data point. For the improved action, an extrapolation linear in a 2 describes the data perfectly well.

Scaling Tests
As before for SU(2), once we complete the tuning procedure for γ and β(γ) on symmetric spatial volumes, we next move to asymmetric spatial volumes to examine what improvement the new action brings. The torelon mass results are shown in Figs. 8-10. The first question is whether or not there is good consistency between the continuum extrapolated results using the two lattice actions, to which the answer is yes. For the u 100 states the cut-off effects are largely suppressed on the coarsest L/a = 4 lattice with the new action. For the Wilson action on L × L × 2L and L × 2L × 2L volumes, again the continuum extrapolation of the u 100 mass is quadratic in a 2 for the Wilson action, whereas the improved action extrapolation appears to be linear in a 2 . The relative cut-off effect for the Wilson action increases as we go towards the largest spatial volume L × 2L × 2L, at which point the torelon is very light with u 100 (LRR) ≈ 0.035, given that the same state on a symmetric volume corresponds to u 100 = 1. The u 110 masses are determined with 1% accuracy or better, but as for SU (2), there is no clear reduction of lattice artifacts with the newly proposed action.
To compare these lattice artifacts to those for the mixed fundamental-adjoint action we also measured the torelon masses for the action used in [13]. Fixing the adjoint coupling to β a = −4.0 as in [13], with the fundamental coupling β f = 9.398 for L/a = 4 we obtained u 100 (L) = 1.0036 (8) and u 110 (L) = 2.1502 (27). This yields the extrapolated value u 110 (L) = 2.1435 (31), which is halfway between values for the Wilson action and the newly proposed action (cf. table 2 and figure 7). Hence the mixed fundamental-adjoint action brings some improvement, but not as much as the new action we present.

Gradient flow observables
The gradient flow is a recently developed method which smooths out lattice fields in a controlled fashion and from which renormalized observables can be measured with very high accuracy [35]. In the following we make use of the fact that the gauge field obtained at flow time t/a 2 > 0 is a smooth renormalized field [36]. Hence, the expectation values of local gauge invariant expressions in this field are well-defined physical quantities that probe the theory at length scales on the order of √ t. In particular, we will consider observables related to the action density E(t) at flow time t, and the set of its derivatives One possible discretization of the action density on the lattice makes use of the sum of unoriented plaquettes with a common lower-left corner [35] and is denoted by E plaq (t), but we also used a more symmetric clover-type discretization E sym (t).
One way to exploit the gradient flow is to extract the lattice spacing. For example, one can define the lattice scale √ t 0 via the requirement that where in the original investigation [35] the value c = 0.3 was chosen. The corresponding value of t 0 /a 2 can be measured with very good precision and be used as a reference scale when taking the continuum limit a → 0. Similarly one can define a scale w 0 set by the requirement that W where again there is freedom in the choice of the parameter c , which in the literature has first been tested for the choice c = 0.3 [37]. Note that for a given choice of c and c , the dimensionless ratio t 0 /w 2 0 is physical and has a well-defined value in the continuum limit. Further derivatives of the action density renormalized at flow time t 0 can be determined by evaluating W and the continuum limit of these quantities is taken by lim a→0 W (n) 0 . Unlike the mass spectrum of the theory, gradient flow observables are not spectral quantities, therefore their cut-off dependence follows from (a) the lattice action used in the Monte Carlo generation of the ensembles, (b) the action used in implementing the flow, and (c) the lattice discretization of the action density. We generated separate ensembles using both the standard Wilson and improved lattice gauge actions over a range of lattice volumes and lattice spacings. For both sets, we use the Wilson action in the flow and we consider both E plaq and E sym to facilitate the continuum limit. It turns out that the gradient flow observables in lattice units are essentially independent of the volume above L/ √ t 0 8.0−8.5. As a consequence, we restricted our analysis of the results to simulations for which L/ √ t 0 > 10.0. The quantification of the lattice artifacts due to the specific choice of the gradient flow action is beyond the scope of the present work, but the ones due to the discretization of the action density operator can be estimated by comparing the observables evaluated with E plaq and E sym . In figure 11 we show the continuum limit of W (0) (t 0 ) using the symmetrized definition for the action density while t 0 is determined from the plaquette definition. By construction W (0) (t 0 ) takes the value 0.3 in the continuum, independent of the discretization employed for the action density. The continuum limits displayed in figure 11 show that within two standard deviations this is indeed the case for our simulation results, both for the Wilson and the improved gauge action. In order to enhance the differences between the two actions, we removed the average O(a 2 ) correction which one can assume stems from the discretization of the action density. We fit the results of the Wilson and improved action separately with an ansatz c 0 + c 1 a 2 + c 2 a 4 , then average c 1 from both fits, and subtract c 1,ave a 2 from the original data. This is a convenient way to magnify the deviations. The result is displayed in the right plot of figure 11 and illustrates that the continuum extrapolations can be achieved by employing an O(a 2 ) correction for the improved action, while O(a 2 ) and O(a 4 ) corrections are necessary for the Wilson action. However, it is clear that the differences between the results from the Wilson and the improved gauge action are very small. We assign this to the fact that the lattice artifacts introduced through the discretization of the operator and/or the choice of the flow procedure are so large that they dominate the ones due to the choice of the action for Monte Carlo simulation.
Yet another part of the challenge in quantifying the magnitude of cut-off effects is the ambiguity in the choice of lattice scale. To illustrate this, in figure 12 we consider the dimensionless ratio t 0 /w 2 0 as a function of the lattice spacing expressed in units of t 0 (left plot) and in units of the torelon mass m 100 (right plot) determined earlier in the spectroscopy study. On the coarsest lattices, corresponding to am 100 = 1/4 for the new action, cut-off effects are now significantly smaller with the improved action. Still, the dominant effect at finite lattice spacing is the choice of the action density operator. We tried to separate out the effect of the operator by considering a linear combination of E plaq and E sym , so as to reduce the lattice artifacts stemming from the discretization of the action density operator. However, even in the improved combination the difference between the standard and new lattice actions is not dramatic.  Given the high precision of the gradient flow method, it is interesting to investigate how many derivatives of the action density can be accurately measured. In figure 13 we show the continuum limit of the first and second derivatives of the action density calculated from both the plaquette and symmetrized definition, and for both gauge actions, with t 0 always being determined from the corresponding definition. We see that lattice artifacts remain large and are again dominated by the choice of operator, not the lattice action used to generate the ensemble. Most important, we see consistent continuum results across the various discretizations, and cut-off effects which are always well described by O(a 2 ) and O(a 4 ) corrections.

Plaquette distribution
To investigate further the origin of cut-off effects in observables related to the gradient flow, we studied the distribution of the plaquette and its evolution along the flow. This is shown in figures 14-16. The plaquette distribution on the original gauge configurations generated during the Monte Carlo simulation are very different, by design: the new action essentially suppresses large fluctuations, with the peak occurring further from the limiting value where the plaquette is unity. As we tune to finer lattice spacing, the distributions move smoothly. If we inspect the plaquette distributions at flow time t 0 in figures 15 and 16, a different picture emerges. The Wilson and improved action ensembles have very similar distributions once the coarse fluctuations are smoothened out, with a rapidly decreasing tail of plaquette values away from 1. Using higher resolution to probe the region closest to  (2) plaq, Wilson plaq, improved action sym, Wilson sym, improved action Figure 13. Continuum limit of the dimensionless derivatives W (1) (t 0 ) and W (2) (t 0 ) using the plaquette and the symmetrized definition for the action density.
1, we see again the strong similarity between the ensembles. The gradient flow washes out lattice artifacts contained in the original lattice action, which are replaced by discretization effects of the flow scheme instead. Obviously, the matching of the distributions between the two actions is in line with matching the lattice spacings from the gradient flow, and this is the reason why the gradient flow observables from the two actions show rather similar lattice artifacts.

Algorithms and Cost Estimates
The obvious drawback to using an improved action in lattice simulations is the increased numerical cost, both in construction and in the Monte Carlo simulation, where smaller updates are needed for e.g. the Metropolis algorithm to have a reasonable acceptance rate. Hence an increased time is necessary in reaching a given numerical accuracy for a fixed computational resource. This has to be balanced against the expectation of being able to determine continuum results using coarser lattices than for the standard action. The motivation to examine a new action as parametrized in eq. (2.2) is that using only powers of the basic plaquette minimizes the numerical cost, while the suppression of large plaquette fluctuations is achieved in a way very different than for lattice actions constructed along the Symanzik improvement program.
Let us consider the numerical slowdown. For one sweep of a lattice volume, the CPU time taken for the improved action relative to the Wilson action is t imp /t Wil = 1.9, both for SU(2) and SU(3) gauge theory. In order to estimate the autocorrelation time, we compare the squared relative error of the Polyakov loop correlator at distance t = L, normalized to the same number of sweeps, (∆C(t)/C(t)) 2 × n sweep , which we found for SU(3) is 1.4 times larger with the improved action than for the Wilson action (for an acceptence rate ∼ 0.5 for both actions). For a different determination of the autocorrelation time, we also examined the torelon mass error, also normalized to the same number of sweeps, (∆m) 2 × n sweep .
For this quantity we find an increase by a factor of 1.5 going from the standard to the new action, essentially the same value. Hence to achieve a given statistical accuracy our new action is about 3 times more expensive in computer time than the Wilson action. However, the reduction in cut-off effects more than compensates for the increased cost. As a rough estimate, taking into account that the a 2 cut-off effect is reduced by a factor ∼ 5, to reach the same cut-off error and statistical error one gains in computer time a factor 5 3 /3 ≈ 40, assuming the CPU time needed for an independent configuration grows like (L/a) 6 (lattice volume × critical slowing down). We can only speculate that for full QCD the gain could be even larger.
We use two different algorithms for Monte Carlo simulations with the improved action. One is a standard Metropolis update, where a trial gauge link is generated by a rotation of the original gauge link, followed by an accept/reject decision. The rotation has an adjustable parameter, which allows one to tune to whatever desired acceptance rate. A different algorithm is where a trial gauge link is generated using the standard heathbath algorithm for the Wilson action, but with a bare coupling β = β. The trial gauge link is then accepted or rejected with a Metropolis step based on the change of the action (β − β )w + γw q . The adjustable parameter here is β which can be tuned to improve the acceptance rate, however the rate cannot be made arbitrarily close to 100%. We find similar efficiency between the two algorithms in our Monte Carlo results.

Conclusions
The type of lattice action we propose and study in this paper is somewhat unusual, given that it does not have the usual naive continuum limit. On the basis of universality, the essential elements are the dimensionality of the system and that the lattice action has the correct internal symmetries. Our numerical results fully support this view and show beyond any doubt that our chosen discretization gives the correct continuum theory.
The findings regarding suppression of lattice artifacts depend on the observable in question. In the case of spectral quantities such as the torelon masses it was possible to almost fully remove cut-off dependence on the coarsest lattice we simulate. For nonspectral quantities such as the static quark potential and force, and observables given by the gradient flow such as the lattice scales t 0 and w 0 , improvement in the operators is necessary beyond just improvement of the action, as can be seen from the reduction but not the removal of artifacts. We did not measure other spectral quantities such as the glueball spectrum, the critical temperature T c and the string tension σ, since those studies would require larger numerical simulations. The tuning strategy we choose is based on the torelon spectrum. The parametrization is general -one could extend it without increasing the numerical cost by adding an adjoint plaquette term to the action, which opens up the possibility that additional tuning of the extra parameters would give even further reduction of lattice artifacts.
Our study of the gradient flow and its use in scale setting, as well as observables given by higher order derivatives of the renormalized action density, is a useful test of the accuracy of the scheme and of the systematics due to lattice artifacts, as well as a test for universality.
The newly proposed lattice action is cheap and would be straightforward to incorporate into simulations with dynamical fermions. The crucial question is how much of the improvement carries over to such simulations, which remains to be investigated.