Space-time symmetries and the Yang-Mills gradient flow

The recent introduction of the gradient flow has provided a new tool to probe the dynamics of quantum field theories. The latest developments have shown how to use the gradient flow for the exploration of symmetries, and the definition of the corresponding renormalized Noether currents. In this paper we introduce infinitesimal translations along the gradient flow for gauge theories, and study the corresponding Ward identities. This approach is readily generalized to the case of gauge theories defined on a lattice, where the regulator breaks translation invariance. The Ward identities in this case lead to a nonperturbative renormalization of the energy-momentum tensor. We discuss an application of this method to the study of dilatations and scale invariance on the lattice.


Introduction
The lattice regulator provides a unique framework to investigate non-perturbative properties of non-Abelian gauge theories. However this formulation explicitly breaks the Poincaré group at finite lattice spacing and the exact restoration of the related invariances can be recovered only in the continuum limit.
As space-time symmetries are explicitly broken, the Ward identities associated to translations are violated, and the construction of a renormalized energy-momentum tensor that generates the transformations requires special care. A nonperturbative renormalization of the energy-momentum tensor is necessary in order to guarantee that numerical studies of physical quantities related to the Noether currents are not obscured by lattice artefacts. For instance, the study of scale invariance in quantum field theories requires the knowledge of a properly defined energy-momentum tensor.
The lattice energy-momentum tensor can be obtained as a linear combination of all operators with dimension not greater than four allowed by the lattice symmetries. The coefficients have to be tuned in such a way that the Ward identities of the continuum are satisfied up to cutoff effects. This condition makes sure that the defined operator is the generator of the Poincaré transformations in the continuum. This program was articulated in great detail in refs. [1,2].
The approach of [1] is based on the idea that one can probe the lattice energy-momentum tensor with a certain number of local observables. However, this approach can be used only if the energy-momentum tensor is separated from the probe observables; otherwise extra contact terms due to mixing with higher-dimensional operators might be generated. This problem has occasionally been seen as an intrinsic limitation of the strategy proposed by the authors of [1] (see for instance the introduction of [3], or the works [4,5] in the context of supersymmetry). On the contrary, we argue that the limitation originates entirely from the choice of local observables to probe translation symmetry (or indeed any other symmetry).
In this paper we review this program in the light of the recently developed Yang-Mills gradient flow [6][7][8][9]. More specifically, we use the gradient flow to define more appropriate probes for the translation Ward identities. Thanks to its remarkable renormalization properties, the gradient flow offers a systematic way to define renormalization-independent observables and finite composite operators. The gradient flow essentially smears the elementary fields over a typical range of order $\sqrt{8t}$, where $t$ is the flow time. Observables constructed from the fields at some positive flow time are non-local in the elementary fields, and they represent more natural probes for the translation Ward identities. The main goal of this paper is to analyse all possible divergences that can arise from the translation Ward identities on the lattice when observables at positive flow time are used as probes. We shall see that contact terms are completely absent from the Ward identities, which are therefore regular at any space-time point. In section 6 a strategy to renormalize the energy-momentum tensor is proposed. The basic idea of using observables at positive flow time as probes for Ward identities is not new, and has already been applied in ref. [9] to chiral symmetry.
The analysis of divergences and the regularization of Ward identities passes through a complete analysis of the space-time symmetries of the flow equation (section 4), which can be implemented as the equation of motion of a five dimensional theory [8]. Beyond the technical aspects, this analysis also generates new insight.
The Noether current associated to a symmetry is obtained by considering some local version of the symmetry transformation. The Ward identities describe the response to the transformation applied to an ultra-local region of space-time (a single point, in a distributional sense). In the case of translations, the Ward identities describe what happens if a single point of space-time gets translated by a certain infinitesimal displacement. If a lattice regulator is being used, this is certainly not the most natural choice. As we shall see in section 4, the gradient flow provides a natural way to probe symmetries at any intermediate length scale, by defining quasi-local transformations, i.e. transformations that modify the fields smoothly within a region with typical linear size of order $\sqrt{8t}$. These quasi-local transformations do not generate the artificial divergences arising in the ultra-local approach, not even on the lattice.
We extend our analysis to dilatations as well (section 5). We will be able to prove an operatorial version of the Callan-Symanzik equation [10][11][12], in which the flow time is interpreted as an inverse (squared) energy scale, and regular expectation values with the insertion of the trace of the energy-momentum tensor source the violation of scale invariance. Our analysis provides a tool to test scale invariance at all energy scales, which is directly related to the trace anomaly.
In a recent paper [3], the gradient flow is also used to regularize the energy-momentum tensor. This very interesting approach is orthogonal to ours: an operator is defined at each positive flow time t (and it is therefore finite in any regularization scheme), which coincides with the energy-momentum tensor in the t → 0 + limit. This quantity is defined in terms of two coefficients which are calculated in a perturbative expansion. In section 7 we connect our general analysis with the small flow-time expansion, and we outline a possible non-perturbative definition of the coefficients appearing in ref. [3].
We also want to point out that completely different strategies have been recently explored to renormalize the energy-momentum tensor on the lattice [13,14].

Gradient flow - an essential toolkit
In this section we review the definition of the gradient flow and some of its salient properties that will be relevant for this paper. Throughout this paper we use the notation of ref. [8]. In sections 3, 4, 5 and 7, we focus on the theory in the continuum, regulated using dimensional regularization. The dimension of space-time is taken to be $D = 4 - 2\epsilon$, but we will not need to use the cutoff explicitly. In section 6 we use an explicit lattice discretization. In the context of lattice gauge theories, the Yang-Mills gradient flow is referred to as the Wilson flow (see e.g. [7]), and it has been used in a number of applications [7,9,15-19]. However, we do not give a review of the Wilson flow here, and we refer the reader to the relevant literature.

Flow equations
The flow of the gauge field $\bar B_{t,\mu}(x)$ is defined through the set of equations where the Greek indices run only in the $D$-dimensional space, we refer to $t$ as the flow time, and $A_\mu(x) = A^A_\mu(x) T^A$ is the fundamental gauge field of the $D$-dimensional theory. Field correlators involving the gauge field at flow time $t$ can be calculated in a local field theory in $D+1$ dimensions, in which a Lagrange multiplier $L_\mu(t,x) = L^A_\mu(t,x) T^A$ is introduced to enforce the constraint in eq. (2.1). The bulk action is given by: Integrating out the Lagrange multiplier $L_\mu$ yields a delta function in the path integral, which guarantees that the field $B_\mu(t,x)$ at flow time $t$ is the solution $\bar B_{t,\mu}(x)$ of the flow equation. The generators $T^A$ are anti-Hermitian and are normalized as: A perturbative analysis of the properties of this theory has been discussed in ref. [8]. For our purposes, it is interesting to emphasize that: (1) propagators involving fields in the bulk have an exponential suppression for large momenta, and (2) the flow propagator $\langle B^A_\mu(t,p) L^B_\nu(s,q) \rangle$ vanishes unless $t > s$: where $\alpha_0$ and $\lambda_0$ are gauge-fixing parameters. Both properties are useful in order to understand the structure of the divergences in correlators involving $B$ and $L$.
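For orientation, the objects described in this paragraph can be written out in the form that is standard in the gradient-flow literature (refs. [6,8]). The block below is only a sketch: it assumes that the conventions of those references carry over unchanged, it omits the gauge-fixing term proportional to $\alpha_0$ in the flow equation, and it does not reproduce the equation tags of this paper.

```latex
\begin{align}
% Flow equation with its initial condition at t = 0
\partial_t \bar B_{t,\mu}(x) &= \bar D_{t,\nu} \bar G_{t,\nu\mu}(x) ,
\qquad \bar B_{0,\mu}(x) = A_\mu(x) , \\
% Field strength and covariant derivative built from the flowed field
\bar G_{t,\mu\nu} &= \partial_\mu \bar B_{t,\nu} - \partial_\nu \bar B_{t,\mu}
  + [\bar B_{t,\mu}, \bar B_{t,\nu}] ,
\qquad \bar D_{t,\mu} = \partial_\mu + [\bar B_{t,\mu}, \,\cdot\,] , \\
% Bulk action: the Lagrange multiplier L_mu enters linearly
S_{\rm fl} &= -2 \int_0^\infty dt \int d^Dx \;
  \mathrm{tr}\left\{ L_\mu(t,x)
  \left[ \partial_t B_\mu - D_\nu G_{\nu\mu} \right](t,x) \right\} , \\
% Assumed normalization of the anti-Hermitian generators
\mathrm{tr}\left( T^A T^B \right) &= -\tfrac{1}{2} \delta^{AB}
\quad \Longrightarrow \quad
-2\, \mathrm{tr}\{ L_\mu X_\mu \} = L^A_\mu X^A_\mu .
\end{align}
```

With this normalization the integrand of $S_{\rm fl}$ is simply $L^A_\mu$ times the components of the flow equation, so integrating $L^A_\mu$ along the imaginary axis produces the delta function that enforces the flow equation.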

Jacobian matrix of the trivializing map
The gradient flow is reversible, which means that the map between field configurations at two different flow times, $\bar B_s \to \bar B_t$, is invertible. The Jacobian matrix associated with this map (only forward propagation, $t \geq s$, is considered) is: $J(t,x;s,y)^{AB}_{\mu\nu} = \delta \bar B^A_{t,\mu}(x) / \delta \bar B^B_{s,\nu}(y)$. (2.10) The Jacobian matrix $J$ was already introduced in ref. [6] in the context of trivializing maps. At leading order in perturbation theory, $J$ coincides with the flow propagator: This Jacobian matrix has many regularity properties, one of which is of particular interest for the discussions in later sections of this paper. For $t > s$ the Jacobian is a regular function that decays exponentially in $|x - y|$, as discussed e.g. in ref. [8].
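The exponential decay of the Jacobian and the smearing range $\sqrt{8t}$ can be made explicit at leading order. The following sketch assumes $D = 4$ and the choice of gauge-fixing parameter for which the linearized flow equation reduces to a heat equation, $\partial_t B_\mu = \partial_\nu \partial_\nu B_\mu$ (as in ref. [6]); the symbol $K_t$ is introduced here for illustration only.

```latex
\begin{align}
% Leading-order solution: the boundary field smeared with the heat kernel
\bar B_{t,\mu}(x) &= \int d^4y \; K_t(x-y) \, A_\mu(y) + O(A^2) ,
\qquad K_t(z) = \frac{e^{-z^2/(4t)}}{(4\pi t)^2} , \\
% Leading-order Jacobian for t > s (identity in colour and Lorentz indices)
J(t,x;s,y) &\simeq K_{t-s}(x-y) \, \mathbb{1} , \\
% Each of the four coordinates has variance 2t under K_t
\int d^4z \; z^2 \, K_t(z) &= 4 \times 2t = 8t .
\end{align}
```

The root-mean-square radius of the kernel is therefore $\sqrt{8t}$, and the Gaussian factor gives the exponential decay in $|x - y|$ for $t > s$.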
The original action in eq. (2.3) is recovered in the $B_0 = 0$ gauge. As the measure in the path integral is invariant under the change of variables that leads to the $B_0 = 0$ gauge, the actions (2.3) and (2.13) describe the same quantum field theory.

Translations
The action of space-time translations on gauge fields can be defined in a gauge-covariant way [20,21]: The associated global transformations (i.e. with a uniform $\alpha_\rho$) reduce to infinitesimal translations up to a field-dependent gauge transformation, and are therefore bona fide translations for any gauge-invariant observable. The four Noether currents associated with these transformations are gathered in an energy-momentum tensor that is symmetric and gauge-invariant. If the theory does not contain scalars, this energy-momentum tensor is uniquely determined up to the cosmological constant, which we will assume to be set equal to zero throughout this paper. For pure Yang-Mills the energy-momentum tensor defined from the gauge-covariant transformations above is: (3.2) and the variation of the action under (3.1) is given by: The fields are normalized in such a way that the action takes the form: In particular the action is invariant under the transformation (3.1) when $\alpha_\rho$ is chosen to be uniform, i.e. independent of the space-time coordinates. It is important to stress that any explicit breaking of the symmetry generates an extra contribution to $\delta_\alpha S$. Such explicit breaking can originate from terms in the action, or from the regularization used to define the theory. For instance, the lattice regularization breaks translation symmetry, leading to a non-trivial renormalization of the energy-momentum tensor. We defer the discussion of the broken Ward identities to section 6. The variation of a generic observable $P$ under the transformation (3.1) can be written as: The corresponding translation Ward identity (TWI) can be written as: A more familiar form of the TWI is obtained by choosing for $P$ a product of gauge-invariant local observables; the l.h.s. of the equation above can be rewritten as the variation of the product of observables, leading to: Note that the TWI (3.6) and (3.7) hold for the regulated correlators in the bare theory, because dimensional regularization preserves translation invariance.
A clarification is in order here. When the theory is defined using dimensional regularization and a perturbative expansion, we must address the issue of gauge fixing. Gauge-fixing terms and ghost terms are added to the action, and consequently to the energy-momentum tensor. However, when gauge-invariant observables are considered in the TWI, these extra pieces in the energy-momentum tensor do not contribute to the expectation values, so we can safely omit them.
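The statement that a uniform gauge-covariant transformation is a translation up to a gauge transformation can be checked in one line. The sketch below assumes that the transformation (3.1) has the standard form $\delta_\alpha A_\mu = \alpha_\rho F_{\rho\mu}$ of refs. [20,21], with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu]$ and $D_\mu \omega = \partial_\mu \omega + [A_\mu, \omega]$; signs may differ if other conventions are used.

```latex
\begin{align}
% Assumed form of the gauge-covariant local translation
\delta_\alpha A_\mu(x) &= \alpha_\rho(x) \, F_{\rho\mu}(x) , \\
% For uniform alpha: expand the field strength and regroup
\alpha_\rho F_{\rho\mu}
 &= \alpha_\rho \left( \partial_\rho A_\mu - \partial_\mu A_\rho
    + [A_\rho, A_\mu] \right) \nonumber \\
 &= \alpha_\rho \partial_\rho A_\mu
    - \left( \partial_\mu (\alpha_\rho A_\rho)
    + [A_\mu, \alpha_\rho A_\rho] \right) \nonumber \\
 &= \alpha_\rho \partial_\rho A_\mu - D_\mu ( \alpha_\rho A_\rho ) .
\end{align}
```

The first term is the canonical infinitesimal translation, and the second is an infinitesimal gauge transformation with the field-dependent parameter $\omega = -\alpha_\rho A_\rho$, which drops out of any gauge-invariant observable.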
In order to remove the cutoff in equation (3.7), the bare parameters and fields have to be replaced with renormalized ones: (3.8) and the observables $\phi_j$ with their renormalized counterparts $(\phi_j)_R$: Ward identities are a powerful tool to analyse the divergences of Noether currents and related operators. Indeed, the finiteness (in a distributional sense) of the l.h.s. of eq. (3.9) in the $\epsilon \to 0$ limit implies the finiteness of the gauge-invariant part of the operator $\partial_\mu T_{\mu\rho}(x)$. In gauge theories with no scalars this is shown to be equivalent to the finiteness of the energy-momentum tensor itself in the $\epsilon \to 0$ limit [22][23][24][25][26]. In other words, the gauge-invariant part of the energy-momentum tensor does not require renormalization in dimensional regularization; in order to avoid overloading the notation we will not introduce the symbol $(T_{\mu\nu})_R$. For a generic non-local observable $P$, the $\epsilon \to 0$ limit of both sides of the TWI (3.6) is trickier because contact terms will arise in general, and we will not pursue this direction further. However, in the next subsection we will show that, if $P$ is chosen to be an observable that depends on the fields at positive flow time only, such contact terms do not arise and the corresponding TWI is regular in the $\epsilon \to 0$ limit.
Let us conclude this introductory discussion by stressing that Ward identities have been used routinely in the context of renormalization. A prominent (and familiar) example of their usage is the renormalization of quark bilinears from chiral Ward identities [27]. Further progress has been made recently in the case of the chiral Ward identities by using probe fields at positive flow time [9]. Following this idea, we will investigate in the following sections the possibility of extending the discussion of TWI at positive flow time.

Probe observables at positive flow time
We want to specialize the TWI (3.6) to the case of a probe observable $P_T$ that depends only on the field $\bar B_{T,\mu}$ at flow time $T > 0$. The variation of $P_T$ under the transformation (3.1) can be written using the chain rule: where the Jacobian matrix defined in eq. (2.10) has been used. Let us emphasise that we are considering here the variation of a probe observable $P_T$ induced by an infinitesimal translation of the gauge fields at flow time $t = 0$. The expression above is purely algebraic, and it is exact for the regulated theory, i.e. for any value of $\epsilon > 0$. In order to discuss the renormalization of the TWI, the divergence structure of $\delta_{x,\rho} P_T$ has to be understood. At first sight, this task seems to be difficult because the Jacobian matrix $J$ is a non-local operator, and has a quite complicated expansion in terms of the elementary fields of the $D$-dimensional theory. However, this problem can be completely circumvented by looking at the extended theory in $D+1$ dimensions. Indeed, let us consider the composite operator: which is defined in the higher-dimensional bulk theory in terms of the Lagrange multiplier $L_\mu$ and of the bulk field $G_{\mu\nu}$. Since the Lagrange multiplier appears linearly in the action, any polynomial in $L_\mu$ can be explicitly integrated out in the path integral. In particular, if the probe observable $P_T$ depends only on the field $B_\mu(T,x)$ at flow time $T > 0$ (and does not depend on the Lagrange multiplier), it is possible to show that: The calculation is rather technical and is reported in appendix A. As for the case discussed above, this equation holds for any value of $\epsilon > 0$. Using eq. (3.12), the problem of identifying the divergences of $\delta_{x,\rho} P_T$ is reduced to the standard task of identifying the divergences of the product of two operators in the $(D+1)$-dimensional theory. We know already that no renormalization is required for the fields in $P_T$.
Also, no divergences are generated from Wick contractions of fields in $P_T$, as the propagators of the bulk fields are exponentially suppressed at large momenta. The same conclusion applies to Wick contractions of fields in $\tilde T_{0\rho}(0,x)$ with fields in $P_T$. Divergences can only arise from the fact that $\tilde T_{0\rho}(0,x)$ is a composite operator of fields on the boundary. In principle $\tilde T_{0\rho}(0,x)$ could mix with any other gauge-invariant operator of dimension 5 that transforms as a vector under Lorentz transformations. However, the Wick contractions involving the Lagrange multiplier $L_\mu$ are such that the two-point correlation functions of $\tilde T_{0\rho}(0,x)$ and any local operator composed from the gauge field at flow time zero vanish up to contact terms. Divergent additive renormalizations of $\tilde T_{0\rho}(0,x)$ by such operators are therefore excluded. Divergences could arise from the mixing with operators involving the Lagrange multiplier, but $\tilde T_{0\rho}(0,x)$ itself is the only one with dimension not greater than 5 and the required symmetry properties. Therefore the operator $\tilde T_{0\rho}(0,x)$ can renormalize only multiplicatively. We anticipate here that this argument does not rely on using dimensional regularization, and holds also on the lattice. In dimensional regularization, since translation invariance is preserved, one can combine eq. (3.12) and the TWI (3.6) into: which shows that $\tilde T_{0\rho}$ stays finite in the $\epsilon \to 0$ limit, and does not need to be renormalized. Thanks to eq. (3.12) the same conclusion holds for the expectation value $\langle \delta_{x,\rho} P_T \rangle$. This essentially means that the differential operator $\delta_{x,\rho}$ can at most generate a multiplicative renormalization when applied to an observable $P_T$ which is a function of fields at positive flow time only, but no contact terms are generated. However, the multiplicative renormalization factor is constrained to be equal to one in dimensional regularization thanks to translation invariance.
In section 6 we will see how this discussion generalizes to the case of a regularization that breaks translation invariance, such as the lattice.

Translations at positive flow time
The flow equations are invariant under global translations. This means that one is free to translate the fields at any flow time $t$: the result on any observable will be exactly the same as one would obtain by first translating the boundary fields and then evolving them up to flow time $t$. This argument can be taken one step further, by generalizing the local transformation (3.1) as: This equation defines a family of transformations parametrized by the flow time $t$. Clearly for $t = 0$ the usual translation defined in the previous section is recovered. The differential operator $\bar\delta_{t,x,\rho}$ depends only on the fields $\bar B_{t,\mu}$ at the space-time point $x$, but is not local in the fundamental field $A_\mu$. As a consequence, the finite transformation generated by $\bar\delta_{t,x,\rho}$ modifies the fundamental field $A_\mu$ not only at $x$, but in a neighborhood of it. This neighborhood has a typical linear size of order $\sqrt{8t}$. In close analogy to eq. (3.12), it is possible to show that: Note that in this case the tensor $\tilde T$ is evaluated at flow time $t$, while it was computed on the boundary in eq. (3.12). If $\alpha_\rho$ is uniform, the transformation generated by $\bar\delta_{t,x,\rho}$ reduces to the composition of a canonical infinitesimal translation of the field $\bar B_{t,\mu}$ and a field-dependent gauge transformation, which is immaterial when acting on gauge-invariant observables. Since the flow equations are invariant under global translations, $\bar\delta_{t,\alpha}$ reduces to a canonical infinitesimal translation of the fields at any flow time when acting on gauge-invariant observables: It is interesting to consider some special instances of eq. (4.3), e.g. by choosing an observable $\phi_T(x)$ that only depends on the field $\bar B_\mu$ at flow time $T$ and space-time position $x$. If $T = t$ then a local version of eq. (4.3) holds: If $T > t$ then the delta function gets regularized and a milder result holds. If $V$ is a sphere centered at $x$ with radius $r$ then, roughly speaking: (4.5) We refer to the end of this section for the proof of a precise version of this equation.
The nice feature of the differential operator $\bar\delta_{t,\alpha}$ for $t > 0$ is that it depends only on fields at positive flow time, and therefore it does not require renormalization in any regularization scheme. Associated with it, for each flow time $t$, there is a new energy-momentum tensor and a new TWI. As the transformation (4.1) is non-local in the original field $A_\mu$, this new energy-momentum tensor is not local in the $D$-dimensional theory. However, it is possible to write it in terms of local operators in the $(D+1)$-dimensional theory by exploiting the space-time symmetries of the $(D+1)$-dimensional theory.
The bulk action in eq. (2.13) is clearly invariant under $(D+1)$-dimensional canonical translations (the translation in the flow time is broken only by boundary effects). Following the procedure described on the boundary, infinitesimal local translations can be upgraded to the following gauge-covariant transformations acting on the bulk fields: with the constraint that $\alpha_0(0,x) = 0$. Capital indices run from 0 to $D$, and the index 0 denotes the flow time. We will always consider here observables that do not depend on the Lagrange multiplier $L_\mu$. The variation of one of these observables $P$ is: The variation of the bulk action under the transformation (4.6) defines a $(D+1)$-dimensional energy-momentum tensor: up to terms that are proportional to the constraint and therefore vanish in expectation values. Notice that the operator $\tilde T_{0R}$ for $R \neq 0$ is the same one that appears in eqs. (3.12) and (3.13). As the number of differential operators is proliferating, we find it convenient to review at this point the meaning of all of them. The differential operator $\bar\delta_{t,x,\rho}$ acts on fields that already satisfy the flow equation. The fields are deformed at flow time $t$, and the flow equation propagates this deformation to all other flow times. To make sense of this picture, we use the fact that the flow is invertible, at least at finite cutoff. In particular the operator $\delta_{x,\rho} = \bar\delta_{0,x,\rho}$ deforms the fields on the boundary, i.e. the initial condition for the flow equation, and therefore the deformation is propagated to any positive flow time. The operator $\delta_{t,x,\rho}$ that we have just defined is completely different, as it acts on the $(D+1)$-dimensional fields before the flow equation is imposed. It deforms the fields locally in the $(D+1)$-dimensional space, and such a deformation is not propagated in flow time. Of course, if one starts with a field configuration that satisfies the flow equation, its deformation will in general not satisfy the same equation.
The variation in the equation is reabsorbed by the deformation of the Lagrange multiplier. For any t > 0, the Ward identities associated with the transformation (4.6) are: For a probe observable P T that depends only on the field B µ at flow time T > t, the l.h.s. of the previous equation vanishes. In this particular case, eq. (4.11) can be written as: We will not need to consider the case R = 0 in this section, and we will therefore develop the arguments below for the case where R = ρ spans the usual space-time directions. We will see now how eq. (4.12) leads to the Ward identities for the family of transformations defined in eq. (4.1), and how one can use this equation to prove eq. (4.5). Note that all fields that are computed at flow time t > 0 have finite correlators, and do not require renormalization as the regularization is removed.
We would like to integrate eq. (4.12) in flow time over an interval $(0, t)$. However, this equation is valid only at positive flow time. The problem is that for $t = 0$ the Ward identity (4.11) gets an extra contribution from the fact that the boundary fields are transformed along with the bulk ones. Moreover, eq. (4.12) is valid for bare fields at finite cutoff. At positive $t$, since only fields in the bulk are involved, this equation does not have any divergences and its $\epsilon \to 0$ limit can be safely taken. Therefore, after the cutoff is removed, we first integrate eq. (4.12) over an interval $(t_0, t)$ with $0 < t_0 < t < T$: and then we take the $t_0 \to 0^+$ limit. We have already proven eq. (4.2): and eq. (3.13): By using these results, eq. (4.13) can be repackaged into the TWI associated with the differential operator $\bar\delta_{t,x,\rho}$, which defines the corresponding energy-momentum tensor $\bar T_{\mu\rho}$: Clearly this TWI reduces to eq. (3.6) for $t = 0$; however, these manipulations are meaningful only if the integral appearing in the energy-momentum tensor (4.17) is finite. The possible divergences of $\tilde T_{\mu\rho}(s,x)$ at $s \to 0^+$ are classified in terms of all operators of dimension up to 6 that can mix with $\tilde T_{\mu\rho}(s,x)$. Such operators must contain at least one Lagrange multiplier, i.e. an operator of dimension 3. Therefore, by taking into account the Lorentz structure, $\tilde T_{\mu\rho}(s,x)$ can mix with operators of dimension 6 and 4. However, it is easy to see that gauge invariance excludes operators of dimension 4. This means that $\tilde T_{\mu\rho}(s,x)$ has at most a logarithmic divergence for $s \to 0^+$, which is integrable. This concludes our discussion, as the singularity in the energy-momentum tensor (4.17) is integrable.
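The power counting behind this argument can be spelled out. The sketch below works in $D = 4$ and in units of mass; the operator names $O^{(6)}$, $O^{(4)}$, the coefficients $c_6$, $c_4$ and the scale $\mu$ are placeholders introduced for illustration, and the small-$s$ behaviour is only schematic.

```latex
\begin{align}
% Mass dimensions of coordinates and bulk fields
[x_\mu] &= -1 , \quad [t] = -2 , \quad [B_\mu] = 1 , \quad [G_{\mu\nu}] = 2 , \\
% The bulk action is dimensionless: [dt\, d^4x] = -6 and [\partial_t B_\mu] = 3
[L_\mu] &= 6 - 3 = 3 , \\
% Dimensions of the bulk tensor components
[\tilde T_{0\rho}] &= [L_\mu G_{\rho\mu}] = 5 ,
\qquad [\tilde T_{\mu\rho}] = [\partial_t \tilde T_{0\rho}] - 1 = 6 , \\
% Schematic small-s behaviour if mixing with dimension-6 and dimension-4 operators occurred
\tilde T_{\mu\rho}(s,x) &\sim c_6 \, \ln(\mu^2 s) \, O^{(6)}(x)
  + \frac{c_4}{s} \, O^{(4)}(x) , \\
% A logarithm is integrable at s = 0, a 1/s pole is not
\int_0^t ds \, \ln(\mu^2 s) &= t \left[ \ln(\mu^2 t) - 1 \right] ,
\qquad \int_0^t \frac{ds}{s} = \infty .
\end{align}
```

Once gauge invariance removes the dimension-4 candidates ($c_4 = 0$), only the integrable logarithm can survive, which is why the flow-time integral in the energy-momentum tensor is finite.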
In order to understand the action of the operator $\bar\delta_{t,x,\rho}$ on fields defined at $T > t$, let us now integrate eq. (4.12) in flow time over the interval $(t, T)$. Using eq. (4.2) again: $\langle \bar\delta_{t,x,\rho} P_T \rangle = \langle \bar\delta_{T,x,\rho} P_T \rangle + \langle P_T \, \partial_\mu \int_t^T ds \, \tilde T_{\mu\rho}(s,x) \rangle$. Assuming that all the local observables in $X_T$ lie outside of the sphere $V$, and by using eq. (4.4), one gets: The operator $\tilde T_{\mu\rho}$ contains only terms that are linear in the Lagrange multiplier $L_\mu$. Since the propagator $\langle L B \rangle$ is exponentially suppressed with the space-time separation of the two fields, the contribution of the last term of the previous equation is exponentially suppressed if all the fields are far enough from the boundary $\partial V$ of the sphere. If $\bar r$ is the distance from $\partial V$ of the closest operator (clearly $\bar r \leq r$), then: (4.20) which is the precise form of eq. (4.5).

Dilatations
In order to discuss dilatations, we are going to extend the definition of the differential operator $\bar\delta_{t,x,\rho}$ in eq. (4.1) to include the flow-time direction: where $R$ runs over all $D+1$ dimensions. This differential operator can be related to the $\tilde T_{0R}$ operator at generic flow time: by integrating out the Lagrange multiplier explicitly, as shown in appendix A. Local dilatations are a special case of local translations. On the boundary a local dilatation is generated by the transformation (3.1) with $\alpha_\rho(x) = x_\rho \beta(x)$. A global dilatation corresponds to a uniform $\beta$. The flow equation is also invariant under dilatations, provided that the flow time is rescaled too according to its classical dimension. Local dilatations in the bulk are generated by the transformation (4.6) with $\alpha_\rho(t,x) = x_\rho \beta(t,x)$ and $\alpha_0(t,x) = 2t\beta(t,x)$.
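The invariance of the flow equation under dilatations, and the factor 2 in $\alpha_0 = 2t\beta$, can be verified directly. The sketch below uses the flow equation in the form $\partial_t B_\mu = D_\nu G_{\nu\mu}$ (gauge-fixing term omitted, which is an assumption about conventions); the rescaled field $B^{(\lambda)}$ is notation introduced here for illustration.

```latex
\begin{align}
% Rescaled field: B has classical dimension 1, t has length dimension 2
B^{(\lambda)}_\mu(t,x) &= \lambda \, B_\mu(\lambda^2 t, \lambda x) , \\
% Left-hand side of the flow equation picks up lambda^3
\partial_t B^{(\lambda)}_\mu(t,x)
  &= \lambda^3 \, (\partial_t B_\mu)(\lambda^2 t, \lambda x) , \\
% Right-hand side: G scales with lambda^2, one more derivative gives lambda^3
G^{(\lambda)}_{\mu\nu}(t,x) &= \lambda^2 \, G_{\mu\nu}(\lambda^2 t, \lambda x) ,
\qquad
D^{(\lambda)}_\nu G^{(\lambda)}_{\nu\mu}(t,x)
  = \lambda^3 \, (D_\nu G_{\nu\mu})(\lambda^2 t, \lambda x) , \\
% Infinitesimal form with lambda = 1 + beta
\delta x_\rho &= \beta \, x_\rho , \qquad \delta t = 2 \beta \, t .
\end{align}
```

Both sides of the flow equation scale with the same power $\lambda^3$, so $B^{(\lambda)}$ solves the flow equation whenever $B$ does, and the infinitesimal displacements reproduce $\alpha_\rho = x_\rho \beta$ and $\alpha_0 = 2t\beta$ for uniform $\beta$.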
In practice we consider the equation: which follows trivially from eq. (4.12) and stays finite in the $\epsilon \to 0$ limit. It is interesting to notice that the operator $2\tilde T_{00} + \tilde T_{\mu\mu}$ (which is almost the trace of the bulk energy-momentum tensor, except that different components are weighted with the dimension of the corresponding coordinate) might break dilatation invariance in the bulk. However, some trivial algebra shows that this generalized trace is a divergence: Now we can use eqs. (5.2) and (3.13), together with the observation that: as $\tilde T_{00}(t_0,x)$ diverges at most logarithmically, and we can repackage eq. (5.5) into the dilatation Ward identity (DWI): where $2t\,\bar\delta_{t,x,0} + x_\rho \bar\delta_{t,x,\rho}$ is the differential operator that generates dilatations at flow time $t$. Usual power-counting arguments show that the integral in the dilatation current $\bar D_\mu$ is finite. As usual, $P_T$ is an observable that depends on the field $B_\mu$ at flow time $T$ only, and $T > t$. Of course dilatations are not symmetries of pure Yang-Mills. The trace of the energy-momentum tensor that appears in the r.h.s. of the DWI (5.7) is the source of the anomaly.
If $\phi_T(x)$ is an observable that depends only on the field $\bar B_T$ at flow time $T$ and space-time point $x$, then the global dilatation is simply: where $d_\phi$ is the dimension of the operator $\phi_T$. The DWI for $\phi_T$ reduces to a very simple form: This equation is the operatorial form of the Callan-Symanzik equation [10][11][12], in which $(8T)^{-1/2}$ is the energy scale, and contact terms are absent (which is the same as saying that the operator $\phi_T$ does not renormalize). Equation (5.11) is extremely interesting, as it allows the trace of the energy-momentum tensor to be probed just by looking at the evolution of observables under the gradient flow.
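As an illustration of how the flow-time dependence of an observable encodes the breaking of scale invariance, one can take the energy density $E(T) = \tfrac{1}{4} G^a_{\mu\nu} G^a_{\mu\nu}$ at flow time $T$, which has $d_\phi = 4$. The sketch below assumes $D = 4$, gauge group SU($N$), and the definition of a flow-time-dependent coupling $\bar g(q)$ that is common in the Wilson-flow literature (see e.g. ref. [7]); the combination $2T \, d/dT + d_\phi$ is what a global dilatation is expected to produce, but signs and normalizations of eq. (5.11) are not reproduced here.

```latex
\begin{align}
% Definition of the coupling at the scale q = (8T)^{-1/2} (assumed convention)
T^2 \langle E(T) \rangle &\equiv \frac{3 (N^2 - 1)}{128 \pi^2} \, \bar g^2(q) ,
\qquad q = (8T)^{-1/2} , \\
% Pure algebra: the dilatation operator acting on a dimension-4 observable
T^2 \left( 2T \frac{d}{dT} + 4 \right) \langle E(T) \rangle
  &= 2T \frac{d}{dT} \left[ T^2 \langle E(T) \rangle \right] , \\
% Change of variable: 2T d/dT = - q d/dq
2T \frac{d}{dT} \left[ T^2 \langle E(T) \rangle \right]
  &= - \frac{3 (N^2 - 1)}{128 \pi^2} \, q \frac{d \bar g^2}{dq}
   = - \frac{3 (N^2 - 1)}{64 \pi^2} \, \bar g \, \beta(\bar g) ,
\qquad \beta(\bar g) \equiv q \frac{d \bar g}{dq} .
\end{align}
```

In a scale-invariant theory the left-hand side would vanish; in pure Yang-Mills it is proportional to the beta function of the coupling defined through the flow, in line with the interpretation of the trace of the energy-momentum tensor as the source of the anomaly.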

Space-time symmetries on the lattice
If the lattice regulator is used, then the explicit breaking of translation symmetry generates an extra term in the TWI (3.6), which implies that the energy-momentum tensor will require renormalization. Even after subtracting the divergences, the TWI (3.6) is valid in this case only up to terms that vanish in the infinite-cutoff limit. We will review how this happens, following the presentation in ref. [1]. At finite lattice spacing $a$, a regularized version of the transformation (3.1) can be defined by choosing, for example, a particular discretization of the field strength $F_{\mu\nu}$ (for definiteness one can adopt the clover plaquette definition) and by replacing the fundamental field $A_\mu(x)$ with the link variable $U_\mu(x)$. The discretized transformation will be denoted by $\hat\delta$: where $\partial^A_{U_\mu(x)}$ is the left Lie derivative on the gauge group with respect to $U_\mu(x)$. This transformation leaves the measure of the path integral unchanged; however, it is not a symmetry, as the lattice action $\hat S$ is not invariant when the parameter $\alpha_\rho$ is chosen to be uniform: where $\hat T^{(1)}_{\mu\rho}$ is any naively discretized energy-momentum tensor. The $R_\rho$ operator, which depends on the choice of discretization for $\hat T^{(1)}_{\mu\rho}$, is the residual term in the Ward identity, and comes from the explicit breaking of the symmetry. It is a higher-dimensional operator, and vanishes in the formal $a \to 0$ limit (i.e. on fixed field configurations that have a smooth continuum limit). However, formally subleading corrections cannot be neglected in field correlators, as subleading coefficients can combine with divergent expectation values, giving rise to finite contributions.
By standard dimensional analysis arguments one can isolate the possible divergences in $R_\rho$: where $\bar R_\rho$ is a finite operator, and the $\hat T^{(2,3)}_{\mu\rho}$ operators are: If the renormalized energy-momentum tensor on the lattice is defined as: the Ward identity associated with the transformation (6.2) becomes: As for the case of dimensional regularization, we will discuss the continuum limit of this equation for two possible choices of the observable $P$: a product of local observables, or a generic observable that depends on the fields at positive flow time only.
Let us choose for $P$ a product of properly renormalized local observables at separate points. The assumption that translation symmetry has to be recovered in the continuum limit implies that the coefficients $c_i$ and $Z_\delta$ can be tuned in such a way that: (1) the energy-momentum tensor is finite in the continuum limit, i.e. the following limit: is finite up to contact terms; (2) the l.h.s. of eq. (6.8) is finite in the continuum limit in a distributional sense, and is equal to: In particular this implies that the term $\langle \bar R_\rho(x) \hat\phi_1(x_1)_R \cdots \hat\phi_k(x_k)_R \rangle$ is zero in the continuum limit up to contact terms. These contact terms cancel analogous contact terms arising in $Z_\delta \langle \hat\delta_{x,\rho} \hat\phi_1(x_1)_R \cdots \hat\phi_k(x_k)_R \rangle$. However, some divergences in the cutoff have to survive in eq. (6.10) in order to reproduce the delta function. These divergences have a purely algebraic origin in the continuum limit, and show that local operators do not necessarily represent the most natural choice to probe the translation Ward identity.
Let us now consider a probe observable $\hat P_T$ which is a function of the fields at positive flow time $T$ only. As in dimensional regularization, the lattice differential operator $\hat\delta_{x,\rho}$ can also be represented by a local operator in the $(D+1)$-dimensional theory. As discussed in appendix B, the following equality holds at any lattice spacing: which is the discretized version of eq. (3.12). As discussed in section 4, the operator $\mathrm{tr}\, L_\mu(0,x) F_{\rho\mu}(x)$ renormalizes multiplicatively. One can therefore introduce the renormalized operator: such that the limits that appear in the following chain of equations are finite: As the operator $\tilde T_{0\rho}(0,x)$ is renormalization group invariant in the continuum, the renormalization of the corresponding lattice-discretized operator is finite, i.e. $Z_\delta$ depends on the lattice spacing only through the bare coupling. This finite normalization must be fixed by requiring that the continuum differential operator $\delta_{x,\rho}$ defined through eq. (6.13) generates translations, or in other words satisfies eq. (4.5) for $t = 0$. It is important to notice also that no contact term is generated in eq. (6.13); therefore its continuum limit is regular as a function of the space-time position. Roughly speaking, through eq. (6.13) the use of observables at positive flow time allows the renormalization of the differential operator $\hat\delta_{x,\rho}$ without using the assumption that translation invariance must be recovered in the continuum limit. Under this supplementary assumption one concludes that the coefficients $c_i$ can be tuned in such a way that: (1) the energy-momentum tensor is finite in the continuum limit, i.e. the following limit: $\lim_{a \to 0} \langle \hat P_T \, \hat T_{\mu\rho}(x)_R \rangle = \langle P_T \, T_{\mu\rho}(x) \rangle$ (6.14) is finite and regular at the space-time point $x$ (as no contact terms are generated); (2) the contribution of the operator $\bar R$ vanishes in the continuum limit at any space-time point $x$: Putting everything together, up to subleading corrections in the lattice spacing: Probe observables defined in terms of the fields at some positive flow time generate neither delta functions in the Ward identity nor contact terms, and seem to represent a more natural choice to probe translation symmetry (or any other symmetry).

Strategies to renormalize the energy-momentum tensor
Let us consider a local observable φ t,ρ (x) in the fields at positive flow time t. For reasons that will be clear soon, we choose it to transform like a vector with respect to the hypercubic symmetry. Up to subleading corrections in the lattice spacing, eq. (6.16) implies: In the continuum limit this equation is valid for any space-time position x (even x = 0), any flow time t, and any probe observable. The ratios c i /Z δ are therefore highly constrained by this equation. We have chosen a vector probe so that the expectation values in eq. (6.17) do not vanish at x = 0.
We now need to fix the multiplicative renormalization Z δ . This can be done in several ways. For instance one can enforce eq. (4.20) within a two-point function. One can consider the wall average of a local observable φ t (x): (6.18) and choose the integration volume in eq. (4.20) to be the space-time slice −d < x 4 < d: The operator Φ t (z 4 ) must lie outside of the integration slice. The distance r that controls the exponential is the minimum between d and |z 4 − d|. In order to suppress the exponential correction, r has to be larger than the smearing range √(8t). If the exponential is negligible, then there is a range of values of d for which the l.h.s. of eq. (6.19) is constant, and this can be easily checked in a numerical calculation.
An alternative method, based on the same idea, consists in using a Schrödinger functional setup, with boundaries at x 4 = ±L 4 . One needs to engineer boundary conditions such that the background field F µν depends on the coordinate x 4 . In this case eq. (4.20) becomes: (6.20) The distance r that controls the exponential is the minimum between d and L 4 − d. The advantage of this approach is that only 1-point functions need to be considered.
Finally one can decide to fix the multiplicative renormalization by means of the DWI (5.11). However, one has to take into account corrections coming from the compact geometry of the space-time in lattice simulations: (6.21) The distance r that controls the exponential is the minimum between d and L − d.
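As a rough numerical illustration of the conditions quoted above, the following Python sketch computes the distance r for the three setups and selects the slice half-widths d for which r exceeds the smearing range √(8t). The function names, the `setup` labels and the `safety` factor are hypothetical choices made for this example only; the formulas for r are simply those stated in the text.

```python
import math


def smearing_range(t):
    """Smearing range sqrt(8 t) of the gradient flow at flow time t."""
    return math.sqrt(8.0 * t)


def control_distance(setup, d, *, z4=None, L4=None, L=None):
    """Distance r that controls the exponential correction (as quoted in the text).

    setup = "two_point":    wall-averaged operator at z4, slice -d < x4 < d
    setup = "schroedinger": Schroedinger functional with boundaries at x4 = +/- L4
    setup = "periodic":     compact direction of extent L
    """
    if setup == "two_point":
        return min(d, abs(z4 - d))
    if setup == "schroedinger":
        return min(d, L4 - d)
    if setup == "periodic":
        return min(d, L - d)
    raise ValueError(f"unknown setup: {setup}")


def usable_slices(setup, t, d_values, safety=2.0, **geometry):
    """Return the d for which r > safety * sqrt(8 t).

    The safety factor is an assumption of this sketch, not a value from the paper.
    """
    cut = safety * smearing_range(t)
    return [d for d in d_values if control_distance(setup, d, **geometry) > cut]


if __name__ == "__main__":
    # Periodic direction of extent L = 16 (lattice units), flow time t = 0.5.
    print(usable_slices("periodic", 0.5, range(1, 16), L=16))
```

In an actual calculation one would then look for a plateau of the measured l.h.s. within the selected range of d; the window above only encodes the necessary condition on r.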

Remarks on small flow-time expansion
As already shown in ref. [3], one can obtain the energy-momentum tensor from the small flow-time expansion of the following two operators: (7.2) The small flow-time expansions of the operators E(t, x) and Y µρ (t, x) are organized in terms of the dimension d k of the possible mixing renormalized boundary operators Θ The coefficients α Y (t) and α E (t) are renormalization group invariant, and have a perturbative expansion in terms of the running coupling g(q) at the energy scale q = (8t) −1/2 . As calculated in ref. [3]: The calculation is done in the MS renormalization scheme. b 0 and b 1 are the coefficients of the expansion of the beta function: The small flow-time behaviour of the coefficients c k (t; µ) is dictated by the renormalization group equation. These coefficients are at most logarithmically divergent (see footnote 1).
We propose here a strategy to determine the coefficients α Y (t) and α E (t) nonperturbatively up to O(t) corrections. Let us focus on the trace of the energy-momentum tensor for now. Given a probe observable φ T at positive flow time T , one can define an effective coefficient α eff E (t) by imposing that only the leading term of the OPE contributes to E(t, x): Notice that α eff E (t) defined in this way depends on the probe observable. By using eq. (7.4), one easily sees that: therefore the nonuniversal terms are at least O(t). Here O(t) has to be interpreted up to logarithmic corrections. By using the dilatation Ward identity in the form of eq. (5.11), one obtains the explicit representation: In a region in which the nonuniversal O(t) contributions are small, the following operator relation holds: (7.14) It is worth stressing that this nonperturbative definition of α eff E (t) provides a definition of the trace of the energy-momentum tensor that is correct up to order t (times logarithms), while a perturbative definition of α eff E (t), as pursued in ref. [3], would give rise to errors that are of order ln −2n t.
Footnote 1: The coefficients c k are dimensionless and depend on the running coupling constant g(µ) and on the ratio q/µ. They satisfy a renormalization-group equation: where the anomalous dimension matrix accounts for mixing of operators under the renormalization group. The leading behaviour of c k (t; µ) at large q (i.e. small t) is governed by the leading term of the anomalous dimension at small g: for some vector v that depends on the initial condition of the renormalization group equation. This shows that c k (t; µ) can have at most a logarithmic divergence in t.
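To give a feeling for how slowly a perturbative expansion in g(q) converges as t → 0, the following Python sketch evaluates the one-loop running coupling at the scale q = (8t)^(−1/2). It assumes the convention β(g) = −b0 g³ − b1 g⁵ − ... with the standard pure SU(N) value b0 = 11N/(48π²); this convention, the Λ parameter `lam` and the function names are assumptions of the example and need not coincide with the normalization used in this paper or in ref. [3].

```python
import math


def b0_pure_gauge(n_colors):
    """One-loop coefficient for pure SU(N), assuming beta(g) = -b0 g^3 - ..."""
    return 11.0 * n_colors / (48.0 * math.pi ** 2)


def flow_scale(t):
    """Energy scale q = (8 t)^(-1/2) associated with flow time t."""
    return 1.0 / math.sqrt(8.0 * t)


def g2_one_loop(t, lam, n_colors=3):
    """One-loop g^2(q) = 1 / (2 b0 ln(q / lam)) at q = (8 t)^(-1/2).

    `lam` is a hypothetical Lambda parameter in the same units as q.
    """
    q = flow_scale(t)
    if q <= lam:
        raise ValueError("one-loop formula requires q > lam")
    return 1.0 / (2.0 * b0_pure_gauge(n_colors) * math.log(q / lam))


if __name__ == "__main__":
    lam = 1.0
    for t in (1e-2, 1e-4, 1e-6, 1e-8):
        # t shrinks by powers of ten, g^2 only by inverse logarithms
        print(f"t = {t:.0e}   q/lam = {flow_scale(t) / lam:10.1f}   g^2 = {g2_one_loop(t, lam):.3f}")
```

Reducing t by several orders of magnitude lowers g² only by a modest factor, which is the reason why corrections that are powers of the coupling decay logarithmically in t, whereas the nonuniversal terms discussed above are O(t) up to logarithms.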
Analogously one can define an effective coefficient α eff Y (t) by imposing the integrated TWI, for instance in the form: In this formula we have separated space and time coordinates x = (x, x D ). The local TWI (4.16) has been integrated in a space-time slice −d < x D < d. The index k runs from 1 to D − 1, the point x lies outside of the integration volume, and r is the minimum between d, |x D + d| and |x D − d|. Also in this case, by using eq. (7.3), one sees that:

Conclusions
Since the renormalization properties of the gradient flow have been clarified, the latter provides a theoretically robust way to investigate the dynamics of gauge theories, and several interesting applications have already appeared since it was originally introduced.
In this paper we focus on the possibility of using the gradient flow for studying space-time symmetries like translations and dilatations. An important corollary of our study is that the gradient flow can be used to define a properly-normalized energy-momentum tensor for pure Yang-Mills theories defined on a lattice. The main idea used in this work, inspired by the study in ref. [9], is that the variations under infinitesimal local translations of correlators of fields along the flow can be used to generate translation Ward identities, which encode the symmetry properties of the quantum field theory. We have explored two applications of this idea.
First we studied the case where the transformation is defined on the fields at flow time t = 0, and we obtained the Ward identities using probe operators at positive flow time. The divergences of the correlators appearing in these identities have been analysed using a representation of the gradient flow in terms of a (D + 1)-dimensional local field theory. When a lattice regulator is used, translation symmetry is broken by the regulator, and the energy-momentum tensor undergoes renormalization. A finite energy-momentum tensor can only be defined after the subtraction of divergent mixings with other operators. The Ward identities for the renormalized lattice energy-momentum tensor using probe operators at time T > 0 are shown in eq. (6.16); the key feature is that these identities can be used to fix the renormalization coefficients in a nonperturbative way. These results extend the programme that was first laid out in refs. [1,2] to the case of probe operators smeared using the gradient flow. Numerical simulations are needed to verify that this is a viable method in practice; they are deferred to future investigations.
Because the gradient flow commutes with uniform translations, we can also study the Ward identities obtained by transforming the fields at nonvanishing flow time t. Once again a (D + 1)-dimensional representation of the gradient flow allows us to analyse the structure of the field correlators in terms of local fields in the bulk. We have obtained the renormalized Ward identities that are generated by these transformations. They are universal properties of the field correlators, reflecting the translation invariance of the physical world, and do not depend on the regulator used to define the bare theory. The Noether currents appearing in these Ward identities are related to the energy-momentum tensor of the original D-dimensional theory in eq. (4.16).
Our analysis includes the case of dilatations. Indeed local dilatations are a special case of local translations. Studying local dilatations in the bulk, we were able to write dilatation Ward identities for operators at generic flow time T . These Ward identities show explicitly the anomalous breaking of scale invariance, and thereby provide a new tool to study the trace of the energy-momentum tensor. The variation of the probe fields along the gradient flow is directly related to the correlator of the trace of the energy-momentum tensor with the probe fields, as shown in eq. (5.11). This is a remarkable result that allows the scale invariance of the theory to be probed using the gradient flow.
An interesting extension of the results in ref. [3] emerges naturally in the framework used here to discuss the transformation properties under dilatations. In ref. [3] the D-dimensional energy-momentum tensor was defined using a perturbative determination of the small flow-time expansion of operators defined in the bulk. It is possible to introduce a nonperturbative definition of the leading coefficients in this expansion, and to compute them making use of probe observables along the gradient flow, see e.g. eq. (7.13).
Using the gradient flow to study space-time symmetries is a fertile research direction. The recent extension of the gradient flow to theories with fermions [9] should enable a straightforward generalization of our arguments to gauge theories coupled to matter. We plan to return to these topics in future studies.

A Integration of the Lagrange multiplier
When observables depend linearly on the Lagrange multiplier L µ , as in the case of the T̃ M R defined in eqs. (4.9) and (4.10), the field L µ can be explicitly integrated out. We consider functional integrals of the particular form: where both P and X A are functions of the field B only, and moreover X A is a local function of the field B and its spatial derivatives at the point (t, x) only.
For the calculation it is convenient to discretize the flow-time direction with a time step ∆t, which we will send to zero at the end of the calculation. We recall that in the notation of ref. [8] the field L A µ is imaginary. The discretized bulk action is: The integration measure DL is normalized in such a way that: where the field B t (x) is the solution of the discretized gradient flow equation with initial condition B 0 = A. The second equality in the previous equation is obtained by changing variables from [F(t)] t=0,...,T to [B(t)] t=∆t,...,T +∆t (the shift in the indices is important!). The Jacobian matrix of this map is (proportional to): where the matrix R is defined as: Notice that ∆[B] is an upper triangular matrix, whose diagonal is at s = t + ∆t. Its determinant (which requires regularization) is just the product of all the diagonal entries of the matrix ∆[B] and does not depend on the fields [8,28].
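The triangular structure of this Jacobian, and the field independence of its determinant, can be checked in a toy model. The sketch below replaces the gauge field by a single real variable and the Yang-Mills gradient by the hypothetical force f(B) = −B³; it discretizes the flow with a forward step, builds F(t) = [B(t + ∆t) − B(t)]/∆t − f(B(t)), and differentiates numerically with respect to B(s) for s = ∆t, ..., T + ∆t. With rows labelled by t and columns by s the matrix comes out lower triangular; which triangle is filled depends only on this ordering convention. No regularization issue arises in this finite-dimensional toy.

```python
import numpy as np


def force(b):
    """Toy flow force f(B) = -B^3, a stand-in for the Yang-Mills gradient."""
    return -b ** 3


def residuals(b_bulk, a, dt):
    """F(t) = [B(t+dt) - B(t)]/dt - f(B(t)) for t = 0, dt, ..., with B(0) = a fixed."""
    b = np.concatenate(([a], b_bulk))
    return (b[1:] - b[:-1]) / dt - force(b[:-1])


def jacobian(b_bulk, a, dt, h=1e-6):
    """Central-difference Jacobian dF(t)/dB(s); rows t = 0, dt, ..., columns s = dt, 2dt, ..."""
    n = len(b_bulk)
    jac = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        jac[:, j] = (residuals(b_bulk + step, a, dt)
                     - residuals(b_bulk - step, a, dt)) / (2.0 * h)
    return jac


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dt, n = 0.1, 5
    for trial in range(2):
        jac = jacobian(rng.normal(size=n), a=rng.normal(), dt=dt)
        # The determinant should equal dt**(-n) whatever the field configuration is.
        print(trial, np.linalg.det(jac), dt ** (-n))
```

The diagonal entries, which sit at s = t + ∆t because of the index shift, are all equal to 1/∆t, so the determinant is ∆t^(−n) for any field configuration.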
Let us isolate the functional integral in L µ from eq. (A.1):
When this result is plugged back into the original integral (A.1), one can integrate by parts in B and get: We will show that the terms (A.10) and (A.11) vanish.
The matrix δB/δF appearing in the previous equation is the inverse of the ∆[B] matrix defined in eq. (A.5). Moreover, as ∆[B] is an upper triangular matrix, δB/δF is also an upper triangular matrix. The equations that define δB/δF are: = 0 , s > t + ∆t , (A.14) where ∂ + s is the discrete forward derivative that appears in eq. (A.5). Notice that eq. (A.12) implies that the term (A.10) vanishes.

(A.15)
From here it is clear that δB B ν (s, y)/δF A µ (t, x) does not depend on the field B at flow time s, therefore the term (A.11) vanishes.
Finally we notice that: (A.16) for s ≥ t + ∆t, where the r.h.s. is the Jacobian matrix of the map B t+∆t → B s . In fact the Jacobian matrix δB B s,ν (y)/δB A t+∆t,µ (x) satisfies a (discretized) differential equation that is obtained by differentiating the discretized flow equation. A rapid inspection shows that this differential equation coincides with eq. (A.14) after the substitution B =B. Moreover, both matrices in eq. (A.16) satisfy the same initial condition (A.13). Eq. (A.16) then follows from the uniqueness of the solution of eq. (A.14).
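The statement that the Jacobian of the flow map obeys the equation obtained by differentiating the discretized flow equation can be illustrated in a one-variable toy model. In the sketch below the force f(B) = −B³, the step size and all function names are hypothetical choices for this example. The Jacobian is obtained in two ways: by propagating the linearized recursion J → (1 + ∆t f′(B)) J from the initial condition J = 1, and by differentiating the map from the starting field to the final field numerically.

```python
def force(b):
    """Toy flow force f(B) = -B^3 (hypothetical stand-in for the flow equation)."""
    return -b ** 3


def dforce(b):
    """Derivative f'(B) of the toy force."""
    return -3.0 * b ** 2


def flow(b_start, n_steps, dt):
    """Discretized flow B -> B + dt f(B), applied n_steps times."""
    b = b_start
    for _ in range(n_steps):
        b = b + dt * force(b)
    return b


def jacobian_linearized(b_start, n_steps, dt):
    """Solve the linearized recursion J -> (1 + dt f'(B)) J with J = 1 initially."""
    b, jac = b_start, 1.0
    for _ in range(n_steps):
        jac = (1.0 + dt * dforce(b)) * jac  # uses B before the update
        b = b + dt * force(b)
    return jac


def jacobian_numeric(b_start, n_steps, dt, h=1e-5):
    """Central-difference derivative of the map b_start -> flow(b_start)."""
    return (flow(b_start + h, n_steps, dt) - flow(b_start - h, n_steps, dt)) / (2.0 * h)


if __name__ == "__main__":
    print(jacobian_linearized(0.7, 50, 0.01), jacobian_numeric(0.7, 50, 0.01))
```

Both determinations agree up to the finite-difference error, in line with the uniqueness argument: two quantities that satisfy the same recursion with the same initial condition coincide.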
By taking the ∆t → 0 limit, eq. (A.8) then becomes: which is the main formula of this appendix. Notice that in the ∆t → 0 limit one has to replace: , as s becomes a continuous parameter. If P T is a function of the field B at some positive flow time T > t only, then the functional derivative with respect to B(s, y) contains a delta function that can be extracted: Plugging this into eq. (A.17) yields: where the chain rule has been used in the last step. This equation has been used several times in this paper. Equation (3.12) is a particular instance of the previous equation.
Another interesting class of functional integrals is represented by where again P is a function of the field B only. We have used many times in this paper the fact that this integral vanishes. Moreover this integral is useful to extend, by subtraction, the result in eq. (A.20) to observables X A that include a flow-time derivative of the field B. We can follow the previous derivation and use (  where P is the projector on the Lie algebra: When restricted to a neighborhood of the identity in the gauge group, the projector P is invertible and its Jacobian is: Throughout this paper, the Lie derivative on the gauge group is defined as: The functional integral over the bulk field V is restricted to a neighborhood of the solution of the flow equation in which the map V → F is invertible. As we do not need to know this domain explicitly, we will omit it in the next formulae. By inspecting eq. (B.2) one sees that F µ (t, x) depends on the field V ν (s, y) at flow times s ≤ t only. The Jacobian matrix of the map V → F is block-triangular, and its determinant is given by the product of the determinants of the diagonal blocks: Integrals of the form: can be easily calculated by using the change of variable V → F: We notice now that, if X A (t, x) depends on bulk fields at flow time t only, then it does not depend on F at flow time t + ∆t. We will also assume that P T depends on the fields at flow time T only. The constraint F = 0 is equivalent to requiring that the bulk field satisfies the flow equation V =V : We want to argue now that the partial derivative in the previous equation is related to the Jacobian matrix of the trivializing map V t+∆t → V T . In order to do so it is convenient to introduce the following differential: For instance, if t = 0 then one can use the chain rule to show that: where U is the boundary field. Eq. (6.11) is just a particular application of this formula.