Energy-momentum tensor in QCD: nucleon mass decomposition and mechanical equilibrium

We review and examine in detail recent developments regarding the question of the nucleon mass decomposition. We discuss in particular the virial theorem in quantum field theory and its implications for the nucleon mass decomposition and mechanical equilibrium. We reconsider the renormalization of the QCD energy-momentum tensor in minimal-subtraction-type schemes and the physical interpretation of its components, as well as the role played by the trace anomaly and Poincaré symmetry. We also study the concept of "quantum anomalous energy" proposed in some works as a new contribution to the nucleon mass. Examining the various arguments, we conclude that the quantum anomalous energy is not a genuine contribution to the mass sum rule, as a consequence of translation symmetry.


I. INTRODUCTION
In QCD the phenomenon of confinement prevents quarks and gluons from appearing in the physical spectrum. Instead, one finds exclusively bound states of these elementary constituents. A hadron mass can therefore differ substantially from the sum of its constituent masses. Understanding how hadron masses arise is thus of utmost importance.
Long ago, the nucleon mass was decomposed in a frame-independent way into a quark and a gluon contribution using the trace of the energy-momentum tensor (EMT) operator $g_{\mu\nu}T^{\mu\nu}$ [1,2]. Later, a decomposition into four contributions based on the component $T^{00}$ in the rest frame was proposed in Refs. [3,4]. Leaving aside the precise form of the underlying renormalized operators, both decompositions are mathematically correct but provide quite different pictures of the nucleon mass, triggering debates within the hadronic physics community about their physical meaning.
In order to clarify the situation, a general Poincaré-covariant and scheme-independent analysis was recently presented in Ref. [5], which concluded that the above two decompositions actually mix information about mass with the constraint of mechanical equilibrium. Keeping these two aspects of hadronic bound-state physics separate, one obtains in fact a natural decomposition of the hadron mass into a quark contribution and a gluon contribution. Quarks being massive particles, the quark contribution can be refined by separating the rest energy (i.e. quark mass) from the kinetic and potential energies. A three-term decomposition of $T^{00}$ of this form has been discussed recently, along with the corresponding renormalized operators in dimensional regularization (DR) in minimal-subtraction-type (MS) schemes [6,7].
While agreeing with the mathematical aspects of Refs. [5-7], Ref. [8] claims that the two-term and three-term energy decompositions miss "some fundamental insight on the origin of the nucleon mass", namely the role played by the trace anomaly. A key step in obtaining a four-term energy decomposition is to separate the EMT into traceless and trace parts, motivated by the fact that these two parts do not mix under Lorentz transformations and hence under renormalization. Focusing on $T^{00}$, it is found that the traceless part provides three quarters of the nucleon mass and the trace part provides the remaining quarter, a result referred to as a "virial theorem" in Refs. [3,4,8-10].
Here we take a fresh look at the virial theorem in the context of quantum field theory (QFT); see also Refs. [11-13]. In particular, we show that, for a closed system, the virial theorem coincides with the constraint of mechanical equilibrium put forward in Refs. [5,14-16].
The present work is organized as follows. In Section II we study in detail the QFT version of the virial theorem. After elaborating in Section III on the physical interpretation of the EMT components, we analyze in Section IV the consequences for the problem of the nucleon mass decomposition in the light of the arguments presented in Refs. [8-10]. We review in Section V the renormalization of the EMT operators in DR in MS-type schemes and the operator structure of the different mass terms in the four-term energy decomposition, which has been under some discussion recently [6-10, 17, 18]. We also comment on the role played by the trace anomaly and the related concept of "quantum anomalous energy". We show in Section VI the importance of preserving translation symmetry, and we argue that one should be particularly careful when providing a physical interpretation of operators appearing in a lattice-regulated theory. Our findings are then summarized in Section VII. We present further details on the virial theorem and its physical meaning in various contexts in Appendix A, and give a brief account of the DR approach in Appendix B.

II. VIRIAL THEOREM IN QUANTUM FIELD THEORY
The virial theorem is essentially a statement about mechanical equilibrium in a bound state, expressed as a stationarity condition on the energy under spatial dilatations. It has been widely discussed in the context of classical and quantum mechanics (see Appendix A for a short review), but its proper transposition to field theories is less well known. In this section, we first present an original derivation of the virial theorem for stationary states in QFT, and then obtain a stronger version based on the divergence of the EMT. We then discuss the relation with the plane-wave approach.

A. Dilatations
In a field theory, dilatations are associated with the current¹ [19,20]
$$j^\mu_D = T^{\mu\nu} x_\nu. \qquad (1)$$
Note that we will not assume a priori that the system is closed, and hence that the EMT is conserved. The corresponding charge
$$D = \int d^3x\, j^0_D = tH - G,$$
with $H = \int d^3x\, T^{00}$ and $G = \int d^3x\, T^{0i} x^i$, generates spacetime dilatations or, in infinitesimal form, where $\phi(x)$ is a generic dynamical field appearing in the EMT and $d_\phi$ is its scale dimension. In the following, we will drop all surface terms, assuming as usual that their contributions vanish for the physical states that we consider.
Using the Heisenberg equation of motion and the standard commutation relations with the momentum operator $P^i = \int d^3x\, T^{0i}$, we can write²
$$\frac{1}{i}\,[P^\mu, D] = P^\mu - g^{\mu 0}\,\frac{dD}{dt}.$$
For $\mu = i$, this relation indicates that dilatations simply rescale the momentum. For $\mu = 0$, the rescaling of the Hamiltonian is accompanied by another contribution arising from the breaking of dilatation symmetry. The latter is measured by $dD/dt$ and can be expressed as
$$\frac{dD}{dt} = \int d^3x\, \partial_\mu j^\mu_D = \int d^3x \left( T^\mu{}_\mu + F^\mu x_\mu \right),$$
where the density of four-force is defined as $F^\mu = \partial_\lambda T^{\lambda\mu}$. Under an infinitesimal dilatation $x^\mu \to (1 + \delta\kappa)\, x^\mu$, the variation of the Hamiltonian is given by
¹ In general there can be an additional term $V^\mu$ called the virial current. It is however often possible to redefine the EMT so that the virial current does not appear.
² Note that if the theory is invariant under dilatations, then $dD/dt = 0$, which implies $\frac{1}{i}[H, D] = H$. If a stationary state exists, then $\langle H \rangle = 0$. Eigenstates of the Hamiltonian with nonzero energy are therefore not stationary. The only possibility is that they move at the speed of light, meaning that they must be massless.
Note that only spatial dilatations (i.e. those generated by G) matter since [H, D] = −[H, G]. We could therefore have directly started with G instead of D, like in the case of point particles treated in Appendix A, but the spirit of relativistic field theories makes it a priori more natural to consider spacetime dilatations rather than pure spatial dilatations.
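As a short worked step behind the statement that only spatial dilatations matter, the charge can be split explicitly. The sketch below assumes the metric signature $(+,-,-,-)$, so that $x_i = -x^i$, and uses only the definitions of $H$ and $G$ given above.

```latex
\begin{align}
D &= \int d^3x\, j^0_D = \int d^3x\, T^{0\nu} x_\nu
   = t \int d^3x\, T^{00} - \int d^3x\, T^{0i} x^i = tH - G, \\
[H, D] &= t\,[H, H] - [H, G] = -[H, G].
\end{align}
```

The term proportional to $t$ drops out of the commutator because $H$ commutes with itself, which is why the temporal part of the dilatation plays no role here.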
Using the operator equation (7), we can write the mean variation of the energy as where $\langle O \rangle$ is the expectation value of the operator $O$ in some properly normalized state, $V$ is the volume and $\ell$ is the radius of a sphere containing the system, $\bar p$ is the average isotropic stress or pressure³, and $\bar f$ is the average radial force. The combination $\delta W = \bar p\, \delta V + \bar f\, \delta\ell$ represents the mean work exerted by the system under the infinitesimal spatial dilatation.

B. Virial theorem for stationary states
Assuming that the Hamiltonian is time-independent, the QFT version of the virial theorem follows directly from the expectation value of Eq. (7) in a (normalized) stationary state $H|\Psi\rangle = E|\Psi\rangle$ It is a balance equation stating that in a stationary state the virtual work exerted by the system under a spatial dilatation vanishes. In other words, the system is in mechanical equilibrium.
For a system of massive point particles, one can write , where $\mathbf{v}_k = d\mathbf{r}_k/dt$ is the velocity of particle $k$, $\mathbf{p}_k$ is its momentum, and $\mathbf{F}_k$ is the force acting on it; see Appendix A C.
Noting that for a non-relativistic system the total kinetic energy is given by $T = \sum_k \frac{1}{2}\, \mathbf{v}_k \cdot \mathbf{p}_k$, it is easy to see that Eq. (11) reduces after integration to
$$2\, \langle\Psi|T|\Psi\rangle = -\langle\Psi| \sum_k \mathbf{r}_k \cdot \mathbf{F}_k |\Psi\rangle,$$
which is the familiar form of the virial theorem in non-relativistic quantum mechanics. For a closed system, the total EMT is conserved and the virial theorem reduces to
$$\langle\Psi| \int d^3x\, \sum_i T^{ii}(x) |\Psi\rangle = 0. \qquad (14)$$
This relation has been widely discussed in the QED context for an electron state [21-26], and is a key aspect of the hadron mechanical structure [14-16] which impacts the hadron mass decomposition [5]. It is however usually obtained from a different approach, and therefore often not recognized as the virial theorem. The situation is different, e.g., in plasma physics [27-29] where Eq. (14) is well known as the virial theorem. Defining the isotropic stress or pressure distribution as $p(x) \equiv \frac{1}{3} \sum_i \langle\Psi|T^{ii}(x)|\Psi\rangle$, we see that the virial theorem for a closed system amounts simply to the von Laue condition for mechanical equilibrium [30]
$$\int d^3x\, p(x) = 0,$$
derived long ago in the context of classical field theory.
We observe that there is actually some confusion in the field theory literature about the notion of virial theorem. For example, in the seminal paper [31] introducing the MIT bag model, two so-called "virial theorems" are derived using naive transpositions of the point mechanics quantity $G = \sum_k \mathbf{r}_k \cdot \mathbf{p}_k$ to continuum mechanics. The first one is based on where $\phi(x)$ is a massless scalar field describing quarks inside the bag and $\dot\phi(x)$ is its time derivative. The second one is based on . In a subsequent paper [32], it was observed that the key results derived from the combination of $d\langle\Psi|\Omega|\Psi\rangle/dt = 0$ and its counterpart for the second quantity can in fact be obtained from the stationarity of the system rest energy under spatial dilatations. This variational principle expresses mechanical equilibrium and was used later in the context of soliton models [15,33,34], where it is commonly referred to as the virial theorem.
We agree with the latter naming since requiring stationarity under spatial dilatations amounts to using the correct form for the generator of spatial dilatations, $G \to \int d^3x\, T^{0i} x^i$, from which one usually derives the virial theorem $d\langle\Psi|G|\Psi\rangle/dt = 0$; see Appendix A.
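The non-relativistic form of the virial theorem recalled in this subsection can be checked numerically in a simple case. The sketch below is only an illustration and is not taken from the text: it assumes the hydrogen 1s state in atomic units, $\psi(r) = e^{-r}/\sqrt{\pi}$, for which $\mathbf{r}\cdot\mathbf{F} = V$ with the Coulomb potential $V(r) = -1/r$, so the theorem reads $2\langle T\rangle = -\langle V\rangle$. The helper name `radial_integral` is hypothetical.

```python
import math

def radial_integral(f, r_max=40.0, n=20_000):
    """Composite midpoint rule for the integral of f(r) over [0, r_max]."""
    h = r_max / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

# Assumed test case: hydrogen 1s state in atomic units, |psi(r)|^2 = exp(-2r) / pi
def density(r):
    return math.exp(-2.0 * r) / math.pi

# <V> for the Coulomb potential V(r) = -1/r
potential = radial_integral(lambda r: 4.0 * math.pi * r**2 * density(r) * (-1.0 / r))

# <T> = (1/2) * integral of |grad psi|^2; for this state |grad psi|^2 = |psi|^2
kinetic = radial_integral(lambda r: 0.5 * 4.0 * math.pi * r**2 * density(r))

print(f"<T> = {kinetic:.6f}, <V> = {potential:.6f}")
print(f"2<T> + <V> = {2.0 * kinetic + potential:.2e}")
```

The combination $2\langle T\rangle + \langle V\rangle$ vanishes up to quadrature error, which is the point-mechanics counterpart of the balance between kinetic stress and force terms discussed above.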
Let us now consider a stronger version of the virial theorem which is most easily derived without explicit reference to dilatations. To this end, let us generalize the approach followed in Refs. [11,12,15,35,36] and write the identity For a stationary state, the left-hand side vanishes using the Heisenberg equation of motion and we can write Multiplying by $x^i$ and integrating over space gives for $\mu = j$ This is the QFT version of the so-called tensor virial theorem [37], whose spatial trace reduces to the usual (scalar) virial theorem (11). It expresses the fact that a stationary state is in mechanical equilibrium not only under isotropic dilatations, but more generally under any (infinitesimal) spatial deformation. For a closed system, the tensor virial theorem reduces to
$$\langle\Psi| \int d^3x\, T^{ij}(x) |\Psi\rangle = 0. \qquad (19)$$
Since the virial theorem concerns only the stress tensor, one may wonder in what frame it applies. Clearly it cannot be a generic frame, for it would imply that the expectation value of the total EMT must identically vanish. A stationary state is usually understood to be a normalizable state, which excludes momentum eigenstates. It is easy to see that the expectation value of the total momentum $\mathbf{P}$ in a stationary state vanishes, meaning that the system is on average at rest. One can indeed use e.g. the center of energy $\mathbf{R} = \frac{1}{H} \int d^3x\, \mathbf{x}\, T^{00}$ to define the position of a closed system. The velocity operator is then given by $d\mathbf{R}/dt = \mathbf{P}/H$, whose expectation value in a stationary state vanishes using again the Heisenberg equation of motion. So, in conclusion, the virial theorem simply expresses the condition of mechanical equilibrium of a massive system in the system rest frame.

C. Virial theorem for momentum eigenstates
In particle physics, it is customary to work with four-momentum eigenstates instead of normalizable stationary states. It is actually possible to obtain in a simple way the content of the virial theorem for a closed system for such states, but the derivation turns out to involve additional information that should be distinguished from the virial theorem.
Poincaré symmetry implies that the forward matrix elements of the total EMT must have the form [38-40]
$$\langle p|T^{\mu\nu}(x)|p\rangle = 2 p^\mu p^\nu, \qquad (20)$$
where $|p\rangle$ is covariantly normalized, i.e. $\langle p'|p\rangle = (2\pi)^3\, 2p^0\, \delta^{(3)}(\mathbf{p}' - \mathbf{p})$. (For simplicity, we suppress the spin labels for the nucleon states throughout this work.) This ensures that the total four-momentum is given by
$$\frac{\langle p| \int d^3x\, T^{0\mu}(x) |p\rangle}{\langle p|p\rangle} = p^\mu. \qquad (21)$$
The average total stress tensor then reads
$$\frac{\langle p| \int d^3x\, T^{ij}(x) |p\rangle}{\langle p|p\rangle} = \frac{p^i p^j}{p^0}. \qquad (22)$$
For a state at rest defined by $p^\mu_{\rm rest} = (M, \mathbf{0})$, we recover directly the tensor virial theorem for a closed system
$$\frac{\langle p_{\rm rest}| \int d^3x\, T^{ij}(x) |p_{\rm rest}\rangle}{\langle p_{\rm rest}|p_{\rm rest}\rangle} = 0. \qquad (23)$$
In the context of classical field theory, von Laue [30] showed that Eq. (23) is a necessary and sufficient condition for the total four-momentum to transform as a Lorentz four-vector. The same condition must also hold in QFT [21,23,24] since it is just based on Lorentz symmetry. Clearly the tensor analysis approach, which is based on Eq. (20), is very powerful and arrives at Eq. (19) in a very simple (though somewhat formal) way, with the advantage of extending its expression to any Lorentz frame. The drawback is that it keeps the physical meaning obscure, and in particular the fact that it automatically includes the virial theorem which is associated with spatial dilatations.
To clarify the physical meaning of Eq. (20), we first note that the tensor virial theorem for a closed system in Eq. (23) can be expressed in an arbitrary frame as where $u^\mu = p^\mu/M$ is the system four-velocity and $dV = u^0\, d^3x$ is the Lorentz-invariant proper volume element. It implies that where we used translation invariance and the fact that $\langle p|p\rangle = (2\pi)^3\, 2p^0\, \delta^{(3)}(\mathbf{0})$. Note that one arrives at the same conclusion using tensor analysis and the conservation of the total EMT, which excludes other possible Lorentz structures involving $g^{\mu\nu}$ or polarization tensors [41]. We stress that the virial theorem does not require knowledge of the coefficient $A$.
The latter is fixed by the additional requirement that the proper energy is the mass of the system or, equivalently, by the requirement of four-momentum conservation (21). This implies that A = M , leading us back to Eq. (20). This analysis shows clearly that the expression (20) combines in fact two distinct physical aspects of bound systems: one is the virial theorem expressing mechanical equilibrium (24) and the other is that the mass of the system is M (26).
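The two pieces of information combined in the forward matrix element can be made concrete with a small numerical sketch. It assumes that the normalized, volume-integrated expectation value of the total EMT in a momentum eigenstate is $p^\mu p^\nu / p^0$, as follows from $\langle p|T^{\mu\nu}(x)|p\rangle = 2p^\mu p^\nu$ and the covariant normalization; the function name `averaged_emt` and the numerical values are illustrative choices, not part of the text.

```python
import math

def averaged_emt(p3, mass):
    """Return <p| int d^3x T^{mu nu} |p> / <p|p> = p^mu p^nu / p^0 as a 4x4 nested list."""
    p0 = math.sqrt(mass**2 + sum(c * c for c in p3))
    p = [p0, *p3]
    return [[p[m] * p[n] / p0 for n in range(4)] for m in range(4)]

M = 0.938  # nucleon mass in GeV (illustrative value)

rest = averaged_emt((0.0, 0.0, 0.0), M)
moving = averaged_emt((0.0, 0.0, 1.5), M)

# The mu = 0 row reproduces the four-momentum of the state.
print("rest four-momentum:", rest[0])
print("moving four-momentum:", moving[0])

# The spatial block vanishes at rest (tensor virial theorem for a closed system)
# and equals p^i p^j / p^0 in a moving frame.
print("rest stress block:", [row[1:] for row in rest[1:]])
print("moving T^{zz}:", moving[3][3])
```

In this sketch the vanishing of the rest-frame stress block holds for any overall coefficient multiplying $p^\mu p^\nu$, whereas the identification of the $\mu = 0$ row with $p^\mu$ is what fixes that coefficient.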

III. EMT MATRIX ELEMENTS AND THEIR INTERPRETATION
Now that the QFT version of the virial theorem is well identified, we would like to add further discussion about the physical interpretation of the EMT matrix elements. For our purpose, it will be sufficient to work with the symmetric (or Belinfante) form of the EMT. We will also assume that the total EMT can be written as the sum of partial EMTs associated with the individual species of constituents in the system. In QCD, we will typically separate the system into quark and gluon contributions. The quark contribution can further be decomposed into flavor contributions. Note that vacuum expectation values are always implicitly subtracted from these operators.

A. Parametrization
For a spin-1/2 target, the matrix elements of the symmetric (or Belinfante) EMT can be parametrized in general as [38,39,42] with Γ µν a (P, ∆) = where is the average four-momentum, ∆ = p ′ − p is the four-momentum transfer, and a is just a generic label specifying the EMT contribution. The gravitational form factors depend on ∆ 2 only and therefore are frame-independent. In the forward limit, this parametrization reduces to

B. Spatial distributions
The expectation value of the EMT in some physical nucleon state $|\Psi\rangle$ (not necessarily stationary) at time $t = 0$ is given by where the momentum-space wave packet is defined as $\tilde\Psi(p) = \langle p|\Psi\rangle / \sqrt{2p^0}$. Applying a Wigner transform, the expectation value $\langle\Psi|T^{\mu\nu}_a(0, \mathbf{x})|\Psi\rangle$ can be rewritten in a phase-space representation [16,43,44] in terms of the nucleon Wigner distribution and the internal EMT distribution for a nucleon localized in the Wigner sense in phase space. The average rest frame $\mathbf{P} = \mathbf{0}$ is known as the Breit frame, where Eq. (34) reduces to the 3D EMT distributions introduced in Ref. [14] and reviewed in Ref. [15]. When $\mathbf{P} \neq \mathbf{0}$ one can choose without loss of generality the $z$ axis along $\mathbf{P}$. Integrating over $r_z$, one obtains the 2D EMT distributions, which become genuine probabilistic distributions in the infinite-momentum limit [16,45,46]. In these two cases there is no energy transfer, $p'^0 = p^0$, so that the spatial distributions are in fact static, i.e., time-independent. Relativistic spatial charge distributions are constructed in a similar way using the charge current $j^\mu(x)$ [47].

C. Four-momentum sum rules
Integrating Eq. (31) over all space and using the parametrization (30), we arrive at For a state with well-defined momentum $p$, this reduces to which is also valid for $t \neq 0$ because $|p\rangle$ is an energy eigenstate. With the four-momentum operator being defined as we recover from Eq. (36) the expression [48] for $\langle p|P^\mu_a(t)|p\rangle$. Summing over all the contributions we should recover the four-momentum of the state, leading therefore to a momentum sum rule for $\mu = i$
$$\sum_a A_a(0) = 1 \qquad (39)$$
and an energy sum rule for $\mu = 0$ Equation (40) must be true in any frame and we therefore conclude that $\sum_a \bar C_a(0) = 0$.
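The logic of these sum rules can be spelled out in a few lines. The sketch below assumes the commonly used forward parametrization of the partial EMT in terms of $A_a(0)$ and $\bar C_a(0)$, together with the covariant normalization of the states; the normalization conventions are an assumption and may differ from those of the parametrization referred to above.

```latex
\begin{align}
\langle p|T^{\mu\nu}_a(0)|p\rangle
  &= 2 p^\mu p^\nu A_a(0) + 2 M^2 g^{\mu\nu}\, \bar C_a(0), \\
\frac{\langle p| \int d^3x\, T^{0\mu}_a(x) |p\rangle}{\langle p|p\rangle}
  &= p^\mu A_a(0) + \frac{M^2}{p^0}\, g^{0\mu}\, \bar C_a(0), \\
\mu = i: &\quad \sum_a p^i A_a(0) = p^i
  \;\Rightarrow\; \sum_a A_a(0) = 1, \\
\mu = 0: &\quad p^0 \sum_a A_a(0) + \frac{M^2}{p^0} \sum_a \bar C_a(0) = p^0
  \;\Rightarrow\; \sum_a \bar C_a(0) = 0 .
\end{align}
```

The last step uses the momentum sum rule; since the energy constraint must hold for every $p^0$, the two sums are fixed separately.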

D. Lorentz symmetry and physical interpretation
From the point of view of Lorentz symmetry, it is interesting to decompose the EMT into a symmetric traceless contribution and a trace contribution [3,4], since these belong to different representations of the Lorentz group and hence do not mix under Lorentz transformations. One can then rewrite Eq. (36) as Alternatively, we may note that the four-momentum of the system is a timelike four-vector that can be used to provide a natural foliation of spacetime into spacelike hypersurfaces. From the physical point of view, this means that $p^\mu$ specifies in a covariant way the rest frame of the system. We can then decompose the EMT into parallel and orthogonal contributions to $p^\mu$ [5,49-51]. In the rest frame, we have in particular which shows that the combination $A_a(0) + \bar C_a(0)$ represents the fraction of the system rest energy carried by the subsystem $a$, and that $-\bar C_a(0)\, M$ represents the rest isotropic stress of this subsystem integrated over the volume. Dividing by the proper volume $V$ of the system, one can interpret $A_a(0) + \bar C_a(0)$ as the average energy density and $-\bar C_a(0)$ as the average isotropic pressure, both defined in the rest frame of the system and expressed in units of $M/V$ [5]. Thanks to Eq. (44), we can easily understand the physical meaning of the sum rule (41). It is simply the virial theorem derived in Section II, expressing the mechanical equilibrium of the system [5]. The fact that we have in the forward limit two gravitational form factors $A_a(0)$ and $\bar C_a(0)$ satisfying two independent sum rules (40) and (41) strongly suggests that they correspond to two distinct aspects of the physics of bound states. This is further motivated by the structure of the EMT, which adopts its simplest form in the nucleon rest frame (44).
Although the habit of interpreting expectation values of the EMT using the language of continuum mechanics has a long history [14, 21-24, 26-29, 52-55], some concerns about this picture have recently been expressed in Ref. [8]. It is claimed that the interpretation in terms of energy density and pressure makes sense only under some conditions. One of them is that the particles' mean free paths must be much smaller than the volume elements. It is then concluded that "in QCD, only at high-temperature and density, a fluid description of the combined quark and gluon plasma might make sense" [8]. We recall, however, that the conditions alluded to indicate in fact when a macroscopic continuum description can be used when the microscopic degrees of freedom are discrete. They tell us, e.g., under what circumstances one can describe a gas composed of a large number of particles as a classical continuous medium from an effective macroscopic point of view.
In QFT the fundamental degrees of freedom are fields, and particles emerge as somewhat localized excitations of the latter. Unlike classical field theory, which is usually seen as an effective macroscopic description, QFT is a fundamental continuous description. Following quantum mechanics, the expectation value $\langle\Psi|T^{\mu\nu}(x)|\Psi\rangle$ simply represents the quantum mean value of the EMT at some spacetime point $x$. There is no coarse graining⁴ involved and one can safely apply the language of continuum mechanics. We also point out that the average energy density and pressure are here defined by dividing the total rest energy and work by the whole proper volume of the nucleon, which is necessarily larger than the quark and gluon mean free paths. The conditions of applicability of the effective macroscopic description are therefore also satisfied.
Let us illustrate this with an example. If we consider an ideal gas from a macroscopic point of view, the stress tensor will contain both a convective contribution and an internal pressure contribution, but from the microscopic point-particle perspective both arise from the motion of the particles and hence are purely convective. The macroscopic distinction arises simply because of the coarse-graining procedure, which defines an effective local pressure by averaging over distances larger than the particle mean free path. In QFT, the situation is reversed since the microscopic degrees of freedom are not particles but quantum fields. So on top of the convective contribution (i.e., kinetic energy), the microscopic stress tensor will in general also receive some internal contribution. For a stationary state there cannot be friction, so the internal stress is akin to a conservative potential energy and can accordingly be interpreted as pressure. More precisely, average pressure⁵ is simply understood in the sense of $\bar p = -\delta E/\delta V$, where $\delta E = \langle\Psi|\delta H|\Psi\rangle$ is the variation of energy associated with an infinitesimal change of volume $\delta V$; see Section II A.

IV. TENSOR ANALYSIS OF NUCLEON MASS
Having discussed in detail the physical interpretation of EMT matrix elements, we now address specifically the question of the nucleon mass decomposition. We will adopt here the approach of Ref. [5], which extends the work of Polyakov [14] to the non-conserved (partial) EMTs of the partons. It is very general in the sense that it is based only on the components of the EMT, and not on the particular form assumed by the latter in a given theory and renormalization scheme. Note that this does not mean, of course, that the magnitudes of the individual contributions are renormalization-scheme independent. Again, we will consider that the total EMT can be written as a sum of partial EMTs as in Eq. (27).

A. Proper energy decomposition
In special relativity, mass is defined by the equation $p^\mu p_\mu = M^2$, where $p^\mu$ is the total four-momentum of the system. In QFT, this becomes an operator identity with $P^\mu$ the total four-momentum operator. Since we can write for a massive momentum eigenstate, we conclude that (invariant) mass is fundamentally the proper energy of the system [5,63], i.e. the Lorentz-invariant expression of the rest-frame energy⁶. A mass decomposition is therefore a proper energy decomposition. Using Eq. (38), it is given by the Lorentz-invariant relation In the rest frame, the nucleon four-velocity reduces to $u^\mu = (1, \mathbf{0})$ and the proper energy $P^\mu u_\mu$ coincides with the energy $P^0 = \int d^3x\, T^{00}(x)$, which was the starting point of Refs. [3,4]. Nucleons being composed of quarks and gluons, it has been argued in Ref. [5] that a natural decomposition of the nucleon mass will consist of two terms, where is interpreted as the internal proper energy associated with parton species $a$. It is possible to refine this decomposition. An obvious and theoretically trivial refinement would be to separate $U_q$ into contributions from the different quark flavors. We do not elaborate on this point, which only matters when studying numerical values for the contributions to the nucleon mass. Let us rather focus on another refinement. Since quarks are massive particles we can write [5-7] where can be interpreted as the quark mass or rest-energy contribution, and therefore $(U_q - U_m)$ as the quark proper kinetic and potential energies. This is motivated by the familiar decomposition of a free-particle energy into kinetic and rest energy contributions, The three-term energy decomposition (50) has recently been obtained in DR [6,7], and will be discussed in more detail in Section V. The pioneering four-term energy decomposition proposed in Refs. [3,4] has recently been slightly reorganized in Refs. [8,64-66].
To obtain its modern form within the present tensor analysis approach, we need to write using the refinement with $c_{m,a}$ two renormalization-scheme-dependent coefficients, $\gamma_m$ the quark mass anomalous dimension, $\beta$ the QCD beta function, and $F^{\mu\nu}$ the gluon field strength tensor. From the tensor analysis perspective, we are unable to find any motivation for interpreting $M_{q,g}$ as the quark/gluon kinetic and potential energies, as was proposed in Refs. [3,4]. Without a clear physical interpretation of $M_q$ and $M_g$, the refinement (53) appears somewhat ad hoc, leading to the conclusion in Ref. [5] that the introduction of $M_m$ and $M_a$ in the nucleon mass decomposition is in some sense arbitrary.
Contrary to the analysis of Ref. [5], the derivation of the four-term energy decomposition of Refs. [3,4] did not actually start from a decomposition of the total EMT into quark and gluon contributions. Instead, the total EMT is first decomposed into where the traceless and trace parts are defined as Working for convenience in the rest frame, the proper energy density is simply given by the $\mu = \nu = 0$ component so that following the decomposition in Eq. (55). Using now the virial theorem (23) and the definition of mass (47), one concludes that the so-called "tensor" and "scalar" energies are given by In Refs. [3,4] the result (58) was obtained using Eq. (20) and was interpreted as analogous to the virial theorem. In recent papers [8-10], the relation is now referred to as the "relativistic virial theorem", motivated by the observation that one can deduce from it the familiar non-relativistic expression $2\langle\Psi|T|\Psi\rangle = -\langle\Psi|V|\Psi\rangle$ in the case of the positronium system in Coulomb gauge. Strictly speaking, this argument does not prove that Eq. (59) is the actual relativistic virial theorem. It only indicates that Eq. (59) holds true because of the relativistic virial theorem. As stressed earlier, there is some confusion in the literature about what the virial theorem is in field theory. In order to clarify this point, we reviewed in Appendix A its derivation in both classical and quantum mechanics, and we extended explicitly the derivation to QFT in Section II. The result for a closed system is given in Eq. (23), and expresses the fact that the system is in mechanical equilibrium [5,14-16]. As shown in Section II C, the virial theorem is already contained in Eq. (20) used to derive the relation (59). The latter should therefore be considered as a corollary of the relativistic virial theorem (23) rather than the virial theorem per se.
Indeed, using the decomposition (55) and the virial theorem we have Now, from the definition of the traceless and trace parts (56) it follows that leading then to Eq. (59). While the decomposition (55) can certainly be motivated by the fact that the traceless and trace parts do not mix under renormalization and Lorentz transformations, it comes at the price of mixing $T^{00}$ and $T^{ii}$ components, as clearly indicated by Eq. (57). We have seen that the Lorentz-invariant definition of mass (47) requires only the four-momentum density operator $T^{0\mu}$. The stress tensor $T^{ij}$ has nothing to do with mass, so introducing it in the mass decomposition means that one is mixing information about the proper energy content with the requirement of mechanical equilibrium [5]. This can be seen directly from the parametrization of the modern⁷ four-term energy decomposition There are only two unknown numbers $a$ and $b$ (known as Ji's parameters) for four terms. There must therefore be two independent linear relations. One is obviously the mass sum rule (53). The other independent relation is which follows from the virial theorem (23) and the definition of the $a$ and $b$ parameters⁸ given in Refs. [3,4]. To sum up, there are two independent pieces of information encoded in the forward matrix elements of the EMT. One is the mass of the system, encoded in the rest frame by the $T^{00}$ component. The other is the virial theorem, which expresses the mechanical equilibrium of the system, encoded in the rest frame by the $T^{ij}$ components. A genuine mass decomposition should not mix these two aspects. This is the case for the two-term energy decomposition (48) proposed in Ref. [5] and its three-term refinement (50) found in [6,7].
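The three-quarters/one-quarter split discussed in this subsection can be reproduced with elementary arithmetic. The sketch below is an illustration under stated assumptions: the traceless and trace parts are taken to be $T^{\mu\nu} - \frac{1}{4} g^{\mu\nu} T^\alpha{}_\alpha$ and $\frac{1}{4} g^{\mu\nu} T^\alpha{}_\alpha$, the metric signature is $(+,-,-,-)$, and the input is the rest-frame, volume-integrated total EMT of a closed system, i.e. energy $M$ and vanishing stress. The function name `split_trace` is hypothetical.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def split_trace(t_diag):
    """Split the diagonal entries T^{mu mu} into traceless and trace parts."""
    trace = sum(g * t for g, t in zip(METRIC, t_diag))   # g_{mu nu} T^{mu nu}
    trace_part = [0.25 * g * trace for g in METRIC]      # (1/4) g^{mu nu} T^alpha_alpha
    traceless_part = [t - s for t, s in zip(t_diag, trace_part)]
    return traceless_part, trace_part

M = 1.0  # mass in arbitrary units (illustrative)

# Rest-frame averaged EMT of a closed system: energy M, zero stress (virial theorem).
traceless, trace = split_trace([M, 0.0, 0.0, 0.0])

print("00 components:", traceless[0], trace[0])   # 3M/4 and M/4
print("ii components:", traceless[1:], trace[1:])
```

Each part separately carries a nonzero stress ($+M/4$ and $-M/4$ per spatial direction in this sketch), and only their sum vanishes. This is one way to see that the split of $T^{00}$ into $3M/4$ and $M/4$ relies on the vanishing of the total stress.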

B. The role of the trace anomaly in the origin of the nucleon mass
In a response to criticisms of the four-term energy decomposition, in Ref. [8] it was argued that "it is unclear what new insight can be brought to the understanding of the proton mass through the process of regrouping if any. To the contrary, this rearrangement stands to lose much". It was concluded that the two-term and three-term energy decompositions miss "the fundamental insight on the origin of the proton mass". By "fundamental insight", we believe the role played by the trace anomaly in the nucleon mass was meant. In this subsection we address this important point.
The trace of the EMT measures the breaking of dilatation symmetry due to the presence of mass scales in the theory. Mass scales are generally provided at the classical level by the constituent masses. At the quantum level, another scale appears through the process of renormalization leading to anomalous contributions to the trace of the EMT.
At the operator level, there is no connection between the trace of the EMT T µ µ and the mass of a physical state M 2 = P µ P µ with P µ = d 3 x T 0µ (x). The connection can however be made at the level of expectation values in a stationary state |Ψ thanks to the virial theorem Ψ| d 3 x T ij (x)|Ψ = 0 which implies [25,26] In the literature the relation is in fact usually derived in terms of four-momentum eigenstates directly from the trace of Eq. (20), leading to [1,2] p|T µ This derivation is manifestly Lorentz invariant but hides the fact that it implicitly makes use of the virial theorem, since Eq. (20) can be obtained solely from Poincaré symmetry arguments. As a result of this relation, there seems to be a deeply-rooted idea that the trace anomaly must fundamentally be connected to the nucleon mass, and so one might expect it to appear explicitly in the mass budget. We stress however that this appearance can only be made through the use of the virial theorem, since matrix elements of i T ii are necessarily involved. The virial theorem is useful because it allows one to relate the average value of different quantities, like e.g. the average kinetic and potential energies in point mechanics, and therefore to reexpress the total energy in a different way, see Appendix A B for an example in Dirac theory. However, it does not provide any clue about the actual origin of mass. One can also understand this from the point of view of dilatations. Note that the EMT can generally be interpreted as the response of the system to infinitesimal spacetime distortions x µ → x µ + εξ µ (x) [67]. Indeed, the corresponding variation of the action is given by When the EMT is symmetric, we can write because an infinitesimal spacetime distortion can be seen as a diffeomorphism under which the variation of the (symmetric) metric is given by δg µν = −ε(∂ µ ξ ν + ∂ ν ξ µ ). 
It follows that temporal dilatations tell us something about the Hamiltonian and hence the mass of the system, while spatial dilatations lead to the virial theorem and tell us something about mechanical equilibrium. The trace anomaly, which is associated with isotropic spacetime dilatations, then necessarily combines these two independent aspects of a bound system. So, we do not consider that writing the nucleon mass in terms of the EMT trace brings any fundamental insight into the question of the origin of the nucleon mass [5]. It is just a particular case of the more general relation

$$\langle\Psi|\int d^3x\, T^{00}|\Psi\rangle = \langle\Psi|\int d^3x\,\big(T^{00} - \alpha\, T^\mu_{\ \mu}\big)|\Psi\rangle + \alpha\,\langle\Psi|\int d^3x\, T^\mu_{\ \mu}|\Psi\rangle, \qquad (70)$$

which is obviously true for any value of $\alpha$, independently of the virial theorem. We repeat that for $\alpha \neq 0$ the two terms in Eq. (70) correspond to different combinations of EMT components, which is not a natural thing to do from the tensor analysis perspective since the individual terms would then correspond to different physical quantities rather than different contributions to the same physical quantity [5]. This is similar to what happens, e.g., in thermodynamics. One can define the notion of enthalpy $H$ as the sum of internal energy $U$ and pressure-volume work $W = pV$. However, one does not usually consider that the relation $U = H - pV = (U + pV) - pV$ represents an actual decomposition of internal energy.
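The separation between temporal and spatial dilatations can be made explicit with a short worked contraction. The following is only a sketch: it assumes a symmetric EMT and the mostly-minus metric implied by $T^\mu_{\ \mu} = T^{00} - \sum_i T^{ii}$, the three generators $\xi^\mu$ are illustrative choices, and the overall sign and normalization of $\delta S$ are deliberately left aside.

```latex
% Sketch (requires amsmath). Assumptions: g_{\mu\nu} = diag(+,-,-,-), T^{\mu\nu} = T^{\nu\mu}.
% Ingredient from the text: \delta g_{\mu\nu} = -\varepsilon(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu).
\begin{align*}
\text{temporal: } \xi^\mu = (x^0, \mathbf{0})
  &\;\Rightarrow\; T^{\mu\nu}\,(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu) = 2\,T^{00}, \\
\text{spatial: } \xi^\mu = (0, \mathbf{x})
  &\;\Rightarrow\; T^{\mu\nu}\,(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu) = -2\sum_i T^{ii}, \\
\text{isotropic: } \xi^\mu = x^\mu
  &\;\Rightarrow\; T^{\mu\nu}\,(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu) = 2\,T^{\mu}_{\ \mu}
   = 2\,T^{00} - 2\sum_i T^{ii}.
\end{align*}
% The isotropic case is the sum of the temporal one (energy) and the
% spatial one (stresses, constrained by the virial theorem).
```

In this form the isotropic dilatation is visibly the sum of an energy piece and a stress piece, which is the sense in which the trace mixes mass with mechanical equilibrium.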
As a final remark, we find it somewhat surprising that in Section IV of Ref. [8] it is argued that using the relation (22) for the first term of Eq. (70) in the case α = 1 does not make a lot of sense physically, while in Section II of the same paper the relation (22) is used in the case α = 1/4 to provide some alleged fundamental insight, namely the fact that 1/4 of the nucleon mass comes from the EMT trace. Once again, the main motivation for the choice α = 1/4 is that the two terms do not mix with each other under Lorentz transformations and renormalization, but without the virial theorem these terms have separately no clear relation to mass. Only the two-term (48) and three-term (50) energy decompositions make no use of the virial theorem, and can therefore be considered as genuine mass decompositions.
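The role played by $\alpha$ can be illustrated with a short worked example. This is a sketch under stated assumptions: $\langle O\rangle$ is shorthand, introduced here only for illustration, for the expectation value of $O$ in a normalized nucleon state at rest, so that $\langle\int d^3x\,T^{00}\rangle = M$, and the mostly-minus metric is assumed.

```latex
% Sketch (requires amsmath). <O> = expectation value in a normalized state at rest (assumption).
\begin{align*}
M = \Big\langle\!\int\! d^3x\,T^{00}\Big\rangle
 &= \underbrace{\Big\langle\!\int\! d^3x\,\big(T^{00}-\alpha\,T^\mu_{\ \mu}\big)\Big\rangle}_{\text{first term}}
  + \underbrace{\alpha\,\Big\langle\!\int\! d^3x\,T^\mu_{\ \mu}\Big\rangle}_{\text{second term}}
 && \text{(identity for any } \alpha\text{)} \\
T^{00}-\alpha\,T^\mu_{\ \mu} &= (1-\alpha)\,T^{00} + \alpha\sum_i T^{ii}
 && \text{(since } T^\mu_{\ \mu} = T^{00} - \textstyle\sum_i T^{ii}\text{)} \\
\text{virial theorem: } \Big\langle\!\int\! d^3x \sum_i T^{ii}\Big\rangle = 0
 &\;\Rightarrow\; \text{first term} = (1-\alpha)\,M, \quad \text{second term} = \alpha\,M .
\end{align*}
% alpha = 1/4 : 3M/4 + M/4
% alpha = 1   : 0    + M
% alpha = 0   : M    + 0   (pure energy decomposition, no virial theorem needed)
```

The numerical share attributed to the trace is thus fixed entirely by the choice of $\alpha$ once the virial theorem is used, which is why no particular value singles out a physical origin of the mass.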

C. Generalized mass decomposition
So far we insisted on the fact that each term appearing in a genuine mass decomposition should have the meaning of a contribution to proper energy. If we relax this requirement and simply demand that 1) the various terms have the dimension of energy and 2) the sum of the corresponding expectation values gives the mass of the system, then we are led to the concept of generalized mass decomposition, which allows one to treat within the same framework various decompositions proposed in the literature. For ease of presentation we will consider in this section only a decomposition of the EMT into quark and gluon contributions, so that each term of the corresponding generalized mass decomposition has the same expression in terms of form factors. It should however be kept in mind that further refinements similar to the ones leading to the three and four-term energy decompositions discussed in Section IV A can naturally be considered.
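For convenience, let us work in the nucleon rest frame where the total energy coincides with the mass. We have already seen that once we have defined a decomposition of the EMT, $T^{\mu\nu} = \sum_a T^{\mu\nu}_a$, a natural mass decomposition follows automatically by considering the matrix elements of $T^{00}$ in the rest frame,

$$M = \sum_a \langle\Psi|\int d^3x\, T^{00}_a(x)|\Psi\rangle,$$

for a normalized state $|\Psi\rangle$. Combining this rest-frame energy decomposition with the virial theorem and the fact that discrete spacetime symmetries imply $\langle\Psi|\int d^3x\, T^{0i}_a(x)|\Psi\rangle = \langle\Psi|\int d^3x\, T^{i0}_a(x)|\Psi\rangle = 0$, we can define a generalized mass decomposition as follows,

$$M = \sum_a c_{\mu\nu}\,\langle\Psi|\int d^3x\, T^{\mu\nu}_a(x)|\Psi\rangle, \qquad (74)$$

where the $c_{\mu\nu}$ are arbitrary coefficients with the constraint $c_{00} = 1$. Some notable examples are:
• Energy decomposition: $c_{\mu\nu} = g_{0\mu}\, g_{0\nu}$;
• Trace decomposition: $c_{\mu\nu} = g_{\mu\nu}$;
• Enthalpy decomposition: $c_{\mu\nu} = (4 g_{0\mu}\, g_{0\nu} - g_{\mu\nu})/3$;
• Tolman mass decomposition: $c_{\mu\nu} = 2 g_{0\mu}\, g_{0\nu} - g_{\mu\nu}$;
• Light-front momentum decomposition: $c_{\mu\nu} = 2 g_{-\mu}\, g_{-\nu}$;
• Light-front energy decomposition: $c_{\mu\nu} = 2 g_{-\mu}\, g_{+\nu}$.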
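To make the examples above concrete, the contraction $c_{\mu\nu}T^{\mu\nu}$ and the trace $c^\mu_{\ \mu}$ can be worked out case by case. The table below is a sketch, assuming the mostly-minus metric and the light-front convention $a^\pm = (a^0 \pm a^3)/\sqrt{2}$, for which $g_{-0} = g_{-3} = g_{+0} = -g_{+3} = 1/\sqrt{2}$; no symmetry of $T^{\mu\nu}$ is assumed in the last two rows.

```latex
% Sketch (requires amsmath). Assumptions: g_{\mu\nu} = diag(+,-,-,-),
% a^\pm = (a^0 \pm a^3)/\sqrt{2}, hence a_\mp = a^\pm and g_{+-} = 1.
\begin{align*}
&\text{decomposition} && c_{\mu\nu}T^{\mu\nu} && c^{\mu}_{\ \mu} \\
&\text{energy}        && T^{00} && 1 \\
&\text{trace}         && T^{00} - \textstyle\sum_i T^{ii} && 4 \\
&\text{enthalpy}      && T^{00} + \tfrac13\textstyle\sum_i T^{ii} && 0 \\
&\text{Tolman}        && T^{00} + \textstyle\sum_i T^{ii} && -2 \\
&\text{light-front momentum} && T^{00} + T^{03} + T^{30} + T^{33} && 0 \\
&\text{light-front energy}   && T^{00} - T^{03} + T^{30} - T^{33} && 2
\end{align*}
% Every row has c_{00} = 1. After summing over a, the T^{ii} terms drop out by the
% virial theorem and the T^{0i}, T^{i0} terms by discrete symmetries, leaving M.
```

Only the energy decomposition has $c^\mu_{\ \mu} = 1$. In all other rows the individual contributions contain stress components, which cancel only in the sum over subsystems.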
Defining mass as the rest-frame energy naturally leads to the energy decompositions discussed in Refs. [3-7]. Mass being by definition a Lorentz scalar, some authors prefer to relate it to the trace of the EMT, see e.g. Refs. [1,2,18,68-70]. Since it is the enthalpy that forms together with the three-momentum a Lorentz four-vector in relativistic thermodynamics [71-73], one could also argue that mass is the proportionality factor between four-momentum and four-velocity and hence consider an enthalpy decomposition instead of an energy decomposition. In the context of general relativity, defining the total mass of a system becomes an even more delicate problem owing to contributions associated with the gravitational field. Tolman mass is one of the standard notions of quasi-local mass commonly used because it has "the great advantage that it can be evaluated by integrating over the region occupied by matter or electromagnetic energy" [35,74,75]. In the context of high-energy scattering, it is particularly convenient to switch to light-front components defined as $a^\pm = (a^0 \pm a^3)/\sqrt{2}$, where the $z$-direction is the collision axis. In this formulation of relativistic dynamics, the little group is Galilean and the longitudinal light-front momentum plays in the $(x, y)$-plane the same role as mass does in the non-relativistic context [16,46,76,77]. Finally, coming back to the fact that mass can be seen as the rest-frame energy and noting that light-front boosts are kinematical transformations, one may consider alternatively the light-front version of the energy decomposition discussed e.g. in Refs. [16,78].
We note that the generalized mass decomposition introduced in Eq. (74) can easily be expressed in a Lorentz-invariant way by integrating over $dV = u^0\, d^3x$, which is again the invariant proper volume element, the condition on the coefficients now being $c_{\mu\nu}\, u^\mu u^\nu = 1$. For the notable examples presented above, it suffices to apply the substitutions $g_{0\mu} \to u_\mu$ and $g_{\pm\mu} \to n_{\pm\mu}/[\sqrt{2}\,(n_\pm\cdot u)]$ with $n_\pm = (1, \pm\mathbf{u}/|\mathbf{u}|)$. This shows that Poincaré symmetry alone does not lead to a unique generalized mass decomposition.
To sum up, while the genuine mass decomposition relies only on four-momentum conservation, a generalized mass decomposition requires a more general relation which includes the additional information of mechanical equilibrium expressed by the virial theorem (24). The proper physical interpretation of the contribution associated with subsystem $a$ should therefore account for pressure effects when $c^\mu_{\ \mu} \neq 1$ [5].

V. OPERATOR STRUCTURE OF THE ENERGY DECOMPOSITION
Mass decompositions are often motivated by their operator structures. Some controversy arose recently concerning the form of the renormalized EMT operator in QCD. We discuss in this section this important point, revisiting the structure of the operators in the MS and $\overline{\rm MS}$ schemes (hereafter referred to as MS-like schemes) in the framework of (conventional) DR. For a concise description of conventional DR as well as of other DR procedures we refer to Appendix B, which is based on the original work of Refs. [79-81]. We then review the operator structure of the different mass terms in the four-term energy decomposition proposed in Ref. [4] and recently revised in Refs. [6,7,10]. Finally, we discuss the status of the so-called "quantum anomalous energy" put forward in the recent works [8-10].

A. Renormalized QCD energy-momentum tensor
The EMT is a sum of composite operators, i.e., products of fields and their derivatives evaluated at a single spacetime point. Composite operators are usually divergent in perturbation theory even after Lagrangian renormalization, and therefore require additional renormalization. MS-like renormalization schemes with DR appear to be particularly convenient, since it has been shown that almost all usual algebraic manipulations done at the level of unrenormalized operators remain valid in terms of the renormalized ones [81,82]. Moreover, both Poincaré and gauge symmetries remain exact$^{9}$ in the intermediate steps. In the following, renormalized operators will be distinguished from unrenormalized ones by a label $R$.
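Nielsen [83] showed long ago that the total EMT in QCD is finite and does not require additional renormalization, so that $(T^{\mu\nu})_R = T^{\mu\nu}$. Since the individual terms appearing in $T^{\mu\nu}$ do not depend on the spacetime dimension $d$, it follows from the linearity property of MS-like renormalization schemes that the total EMT can be written as the sum of its individual terms, each replaced by the corresponding renormalized operator, Eq. (78) [17,18,84,85]. To keep the presentation simple, we omitted terms proportional to the EOM and the gauge non-invariant ones since they do not contribute to the physical matrix elements. Note also that the vacuum expectation values are always implicitly subtracted. The trace of the QCD EMT is also finite and takes the form [83,86,87]

$$g_{\mu\nu}T^{\mu\nu} = (1+\gamma_m)\,(\bar\psi m\psi)_R + \frac{\beta}{2g}\,(F^2)_R. \qquad (79)$$

It differs from the classical trace $g_{\mu\nu}T^{\mu\nu}_{\rm class} = \bar\psi m\psi = (\bar\psi m\psi)_R$ by a term which is renormalization-group invariant and called the trace anomaly,

$$\frac{\beta}{2g}\,(F^2)_R + \gamma_m\,(\bar\psi m\psi)_R. \qquad (80)$$

Looking at the structure of the renormalized EMT (78), it is natural to define the renormalized quark and gluon contributions $(T^{\mu\nu}_q)_R$ and $(T^{\mu\nu}_g)_R$ as the corresponding quark and gluon terms of Eq. (78), see Eq. (81). Since the trace of the QCD EMT is given by the sum of the traces of these two contributions, we can write$^{10}$, following the notation of Refs. [17,18],

$$g_{\mu\nu}(T^{\mu\nu}_q)_R = (1+y)\,(\bar\psi m\psi)_R + x\,(F^2)_R,$$
$$g_{\mu\nu}(T^{\mu\nu}_g)_R = (\gamma_m - y)\,(\bar\psi m\psi)_R + \Big(\frac{\beta}{2g} - x\Big)(F^2)_R, \qquad (83)$$

where $x$ and $y$ are finite numbers of order $O(\alpha_s)$ which parametrize how the anomalous contributions to the trace are shared between the quark and gluon parts of the EMT.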

B. Operator mixing
We sketch here the construction of the renormalized EMT operators in the MS-like schemes with DR as defined in Appendix B, and we refer to Refs. [7,17,18,84,85] for more details. Renormalization through normal products (defining the finite part of composite operators) and the emergence of the anomaly in DR have both been studied long ago in Refs. [81-83,86-89]. An explicit application to the $O(N)$ nonlinear sigma model in MS scheme is given in Ref. [90].
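The renormalized operators are obtained from a basis of bare composite operators $O_i$ as follows,

$$(O_i)_R = \sum_j Z_{ij}\, O_j. \qquad (84)$$

Footnote 10: One can in principle add a term $\propto (\bar\psi\, \tfrac{i}{2}\overset{\leftrightarrow}{\slashed{D}}\psi)_R$ on the right-hand side of both equations. It is however irrelevant because of the EOM.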
As discussed in Refs. [6,7,17,18,69,85], the renormalization of the QCD EMT involves four independent operators $O_i$ that mix through Eq. (84) via 10 renormalization constants. Explicitly, one has the system of equations (85), where the operators with vanishing contribution to the physical matrix elements have consistently been omitted. Thanks to Lorentz symmetry, one can alternatively regroup the operators to form scalar and symmetric traceless tensor representations of the Lorentz group which do not mix under renormalization. This amounts to changing the operator basis in such a way that the renormalization matrix in Eq. (85) turns into the block-diagonal form (86). Equations (85) and (86) are perfectly equivalent. There is therefore in practice no distinction between what the authors of Ref. [10] call the "standard" and "non-standard" way of renormalizing operators. The only crucial point is that one has to be careful to write the renormalized traceless operators properly, an aspect that will be discussed in more detail in Section V C.
The EMT renormalization constants in the MS-like schemes have been derived up to two and three loops in Refs. [18,69], and further discussed in Refs. [6,7,17] in the context of various mass sum rules. In these schemes, using DR and $d = 4 - 2\epsilon$, the renormalization constants contain, beyond tree level, only pole terms in $\epsilon$, accompanied by a finite quantity $S_\epsilon$ that can follow different conventions [91,92], with an expansion in powers of $\epsilon$ that differs at $O(\epsilon^2)$ and higher. We refrain from providing here the explicit form of the renormalization factors, but they can be found in Refs. [17,18], and also in Ref. [7] in different MS schemes. We notice that, formally, the MS renormalization factors can be obtained from the $\overline{\rm MS}$ ones by simply setting $S_\epsilon = 1$. When dealing with tensor operators, one has to pay particular attention to the manipulation of the trace and renormalization operations since, in general, they do not commute [81,89],

$$g_{\mu\nu}\,(O^{\mu\nu})_R \neq (g_{\mu\nu}\,O^{\mu\nu})_R. \qquad (89)$$

In DR, this arises from the fact that the trace operation may change the pole structure and hence the result of the normal product. The non-commutativity of these operations is a reflection of the trace anomaly. In some other renormalization schemes, like for instance BPHZ, these operations do commute but linearity is lost [81,93]. The general message is that because of the trace anomaly it is impossible to preserve all the algebraic manipulations under renormalization. To clarify this point, we consider an explicit example. As outlined in Appendix B, any DR scheme is well-defined only in perturbation theory and consists of replacing the four-dimensional loop integration with the map $I_d$, and mapping all vectors from the (possibly Wick-rotated) four-dimensional Minkowski space into the infinite-dimensional $QS_d$ space. We recall that relations among operators are usually understood as relations valid for the corresponding Green functions. Indeed, the relation

$$g_{\mu\nu}\,\bar\psi\, iD^\mu\gamma^\nu\psi = \bar\psi\, i\slashed{D}\psi \qquad (90)$$

has to be understood at the level of the matrix elements, $\langle g_{\mu\nu}\,\bar\psi\, iD^\mu\gamma^\nu\psi\rangle = \langle\bar\psi\, i\slashed{D}\psi\rangle$.
For the bare operator, one has Eq. (92). For the renormalized operator, instead, one obtains Eq. (93) using Eq. (85), which can be written in the form (94), where both $c_1$ and $c_2$ start at $O(\alpha_s)$ in perturbation theory and are defined in Eq. (95). The total contribution $c_1\,\bar\psi m\psi + c_2\,F^2$ can be interpreted as a finite correction to the trace of the bare operator$^{11}$.
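The mechanism by which tracing can change the pole structure is perhaps easiest to see on the gluon side. The following is a sketch, assuming the standard classical form of the gluon tensor, $T_g^{\mu\nu} = -F^{\mu\lambda}F^{\nu}{}_{\lambda} + \tfrac14 g^{\mu\nu}F^2$ with $F^2 = F^{\alpha\beta}F_{\alpha\beta}$; the coefficients $a$ and $b$ in the last line are purely schematic placeholders and not results taken from the references.

```latex
% Sketch (requires amsmath). Assumed classical gluon tensor in d = 4 - 2\epsilon dimensions.
\begin{align*}
g_{\mu\nu}\,T_g^{\mu\nu}
  &= -F^2 + \frac{d}{4}\,F^2 = \frac{d-4}{4}\,F^2 = -\frac{\epsilon}{2}\,F^2
  && (g_{\mu\nu}g^{\mu\nu} = d), \\
\epsilon \times \Big(\frac{a}{\epsilon} + b + O(\epsilon)\Big)
  &= a + O(\epsilon)
  && \text{(schematic: an explicit } \epsilon \text{ meeting a } 1/\epsilon \text{ pole)} .
\end{align*}
% The trace vanishes classically at d = 4, but the explicit factor of \epsilon can
% multiply a 1/\epsilon pole of the bare F^2 insertion and leave a finite remainder.
% Subtracting poles before or after taking the trace therefore gives different results.
```

This is the same pattern as in the quark example above: the finite terms $c_1$ and $c_2$ measure what is gained or lost when the trace is taken on the other side of the normal product.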

C. Symmetric traceless operators
It appears that there is some confusion in the literature about the form of the symmetric traceless operators. One reason is that many textbooks and papers do not spell out explicitly the trace terms and simply write

$$\bar O^{\mu\nu} = \tfrac12\big(O^{\mu\nu} + O^{\nu\mu}\big) - \text{traces}. \qquad (96)$$

This lack of explicitness can be understood from the observation that, in the context of high-energy scattering, one is often only interested in the specific light-front component $T^{++}_a$, representing the light-front density of longitudinal momentum. Since $g^{++} = 0$, one does not need to worry in this case about the explicit form of the trace terms. The problem arises however as soon as one considers components with a non-vanishing contribution from the metric, e.g. the energy density $T^{00}_a$. The unambiguous definition of the symmetric traceless part of a generic rank-two tensor operator in $d$-dimensional spacetime is

$$\bar O^{\mu\nu} = \tfrac12\big(O^{\mu\nu} + O^{\nu\mu}\big) - \tfrac1d\, g^{\mu\nu}\, g_{\alpha\beta}\,O^{\alpha\beta}. \qquad (97)$$

In particular, for a renormalized operator the explicit expression is [94]

$$(\bar O^{\mu\nu})_R \equiv \tfrac12\big[(O^{\mu\nu})_R + (O^{\nu\mu})_R\big] - \tfrac1d\, g^{\mu\nu}\, g_{\alpha\beta}\,(O^{\alpha\beta})_R. \qquad (98)$$

This tensor is manifestly traceless irrespective of whether the trace and renormalization operations commute or not. Also, since the explicit $d$-dependence appears outside of the normal product, we can safely replace $d$ by 4. Renormalization should preserve Lorentz symmetry, and so operators belonging to different representations of the Lorentz group should not mix with each other. We can therefore write the renormalized traceless operators as linear combinations of the bare traceless ones, Eq. (99) [95], which is obviously compatible by linearity with Eqs. (84), (97) and (98). The standard shorthand notation $(\bar O^{\mu\nu}_i)_R$ is however somewhat misleading, because it gives an incentive to write

$$(\bar O^{\mu\nu})_R = \Big(\tfrac12\big[O^{\mu\nu} + O^{\nu\mu}\big] - \tfrac1d\, g^{\mu\nu}\, g_{\alpha\beta}\,O^{\alpha\beta}\Big)_R, \qquad (100)$$

an expression which must be treated with great care. For example, while renormalized operators are by construction finite in the limit $d \to 4$, one should in general refrain from replacing directly $d$ by 4 in Eq. (100). Indeed, in MS-like schemes with DR the notation $(O)_R$ means that one removes the contributions of the operator $O$ which diverge as $d \to 4$.
The latter limit must then be considered at the very end of a calculation and cannot be applied directly to the expression inside the brackets in Eq. (100). For the same reason, one must also keep in mind the inequality (101), which holds in general because of Eq. (89). As a result of the above discussion, the renormalized symmetric traceless quark and gluon operators are unambiguously given by [17,18,84,85]

$$(\bar T^{\mu\nu}_a)_R = (T^{\mu\nu}_a)_R - \tfrac14\, g^{\mu\nu}\, g_{\alpha\beta}\,(T^{\alpha\beta}_a)_R, \qquad a = q, g, \qquad (102)$$

and can alternatively be expressed in terms of $x$ and $y$, Eq. (103), using Eq. (83). One could also formally write them with the trace inside the normal product, Eq. (104). Once again, it is essential that the limit $d \to 4$ is taken after minimal subtraction.
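The difference between taking the trace outside or inside the normal product can be checked in two lines. This is a sketch of the algebra only: it uses the definition of the renormalized symmetric traceless part with the trace term kept outside the normal product, and contrasts it with the form in which the trace is taken inside.

```latex
% Sketch (requires amsmath). Trace term outside the normal product:
\begin{align*}
(\bar O^{\mu\nu})_R &\equiv \tfrac12\big[(O^{\mu\nu})_R + (O^{\nu\mu})_R\big]
   - \tfrac1d\, g^{\mu\nu}\, g_{\alpha\beta}\,(O^{\alpha\beta})_R, \\
g_{\mu\nu}\,(\bar O^{\mu\nu})_R
  &= g_{\alpha\beta}\,(O^{\alpha\beta})_R - \tfrac{d}{d}\, g_{\alpha\beta}\,(O^{\alpha\beta})_R = 0
  && (g_{\mu\nu}g^{\mu\nu} = d).
\end{align*}
% Trace term inside the normal product:
\begin{align*}
g_{\mu\nu}\Big\{\tfrac12\big[(O^{\mu\nu})_R + (O^{\nu\mu})_R\big]
   - \tfrac1d\, g^{\mu\nu}\,\big(g_{\alpha\beta}\,O^{\alpha\beta}\big)_R\Big\}
  = g_{\alpha\beta}\,(O^{\alpha\beta})_R - \big(g_{\alpha\beta}\,O^{\alpha\beta}\big)_R ,
\end{align*}
% which vanishes only if trace and renormalization commute.
```

The first form is traceless by construction in any scheme, whereas the second inherits the finite mismatch between the two orderings in MS-like schemes.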
Based on their explicit operator expressions in MS-like scheme with DR, we conclude that there is in general no simple physical interpretation for (T 00 q ) R or (T 00 g ) R . We discuss in the following the consequences for the energy decomposition.

D. Energy decomposition
Following the approach of the original works on the nucleon mass decomposition [3,4], the total renormalized EMT can be obtained by adding the renormalized traceless and trace parts, Eq. (105). Using the formal expression for the renormalized traceless operators given in Eq. (104), we get explicitly Eq. (106). In order to obtain a decomposition of energy at the operator level, we consider $(T^{00})_R$ and integrate over space.
Following the original derivation [3,4], the QCD Hamiltonian has been decomposed into a traceless (tensor) and trace (scalar) part as in Eq. (107) [8], where the three contributions given in Eq. (108) are separately renormalization-group invariant. The expansion of the last contribution in powers of $\epsilon$ is obtained using the linearity property of MS-like schemes and the trace anomaly relation [83,86,87]. In conclusion, $(H_q + H_g)$ contains an anomalous contribution which exactly compensates $H_a$ in Eq. (108). No anomalous contribution therefore survives in the energy budget, which is then composed of three terms instead of four [6,7]. This structure also follows directly from Eq. (78) without the need to decompose first the EMT into traceless and trace parts. Moreover, one can safely interpret $H_q$ and $H_g$ as the quark and gluon kinetic+potential energies, in agreement with the tensor analysis approach.
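The cancellation described above can be summarized schematically. The following sketch uses shorthand introduced only for this illustration: $C^{\mu\nu}$ stands for the renormalized total EMT written in its classical form, $\Theta$ for its trace, and $A$ for the trace anomaly; it assumes $g^{00} = 1$ and that the traceless part is defined with the trace term outside the normal product.

```latex
% Sketch (requires amsmath). Shorthand (assumptions of this example):
%   C^{\mu\nu} = (T^{\mu\nu})_R in classical form,   \Theta = g_{\mu\nu} C^{\mu\nu},
%   A = (\beta/2g)(F^2)_R + \gamma_m (\bar\psi m \psi)_R,   \Theta = (\bar\psi m\psi)_R + A.
\begin{align*}
(T^{00})_R
  &= \underbrace{\Big[C^{00} - \tfrac14\,\Theta\Big]}_{\text{tensor (traceless) part}}
   + \underbrace{\tfrac14\,\Theta}_{\text{scalar (trace) part}} \\
  &= \Big[C^{00} - \tfrac14(\bar\psi m\psi)_R - \tfrac14 A\Big]
   + \Big[\tfrac14(\bar\psi m\psi)_R + \tfrac14 A\Big]
   = C^{00}.
\end{align*}
% The -A/4 hidden in the tensor part cancels the +A/4 of the scalar part,
% so the integrated energy density carries no anomalous term.
```

Dropping the $-\tfrac14 A$ from the tensor part, i.e. writing the traceless operators in their classical form, is what makes a quarter of the anomaly appear to survive in the Hamiltonian.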

E. Diagonal schemes
While agreeing formally with the results of the previous section, the authors of Ref. [10] complained that $(T^{00}_g)_R = \tfrac12 (E^2 + B^2)_R$ mixes the tensor and scalar representations of the Lorentz group, and claimed that the notation $\tfrac12 (E^2 + B^2)_R$ has commonly been reserved for $(\bar T^{00}_g)_R$ and not $(T^{00}_g)_R$, referring to the works [68,96]. Our opinion is that the latter statement is a misrepresentation of what can be found in the literature. In both papers [68,96] the renormalized traceless gluon operator indeed appears in its classical form, but we observe that neither the corresponding quark operator nor the renormalization scheme is specified. These works are in fact inspired by an old seminal paper of Voloshin and Zakharov [97], where it is suggested that one can measure the gluonic part of the trace anomaly using quarkonia. It appears that Voloshin and Zakharov used the relation $g_{\mu\nu}\, g_{\alpha\beta}\,(F^{\mu\alpha}F^{\nu\beta})_R = (F^2)_R$ to derive a low-energy theorem. This indicates that they are not working in an MS-like renormalization scheme but in another one where $(T^{\mu\nu}_g)_R = (\bar T^{\mu\nu}_g)_R$, so that the EMT trace (including the anomalous contributions) arises solely from the quark sector, $g_{\mu\nu}(T^{\mu\nu})_R = g_{\mu\nu}(T^{\mu\nu}_q)_R$. Other choices have also been made in the literature. For example, in a comment on Ref. [97], Novikov and Shifman [98] wrote the total renormalized gluon EMT $(T^{\mu\nu}_g)_R$ in the classical form, in agreement with Eq. (81) and Refs. [99,100], but they required that the trace is given by $g_{\mu\nu}(T^{\mu\nu}_g)_R = \frac{\beta_F}{2g}(F^2)_R$, where $\beta_F$ is the contribution to the beta function arising from gluon loops only. This indicates that yet a different renormalization scheme has been chosen$^{12}$. This example shows the importance of clearly specifying the renormalization scheme, for otherwise a comparison between different works may lead to apparent contradictions.
When discussing the form of the gluon operators and their properties, one must not forget the quark sector. We have seen that in MS-like schemes the traces of the quark and gluon contributions to the EMT (83) involve some mixing parametrized by two finite numbers x and y. It follows from the unambiguous definition (98) that the renormalized traceless quark and gluon operators can also be expressed in a way that involves explicitly x and y, see Eq. (103). The values of these parameters are directly determined by the renormalization factors. Their explicit expressions in MS-like schemes with DR can be found in Refs. [7,17,18,69]. They are quite cumbersome and indicate that due to operator mixing the trace anomaly is shared in a nontrivial way between the quark and gluon contributions.
Simpler expressions for the renormalized operators can however be obtained by applying a finite renormalization to the MS-like operators. The only effect of this finite renormalization will be to reshuffle the anomalous contributions between g µν (T µν q ) R and g µν (T µν g ) R , i.e., changing the values of x and y. The total anomaly remains however unchanged. It is through such a finite renormalization that one can connect in principle the operators in MS-like scheme with DR to the ones discussed in Refs. [68,[96][97][98].
The so-called diagonal schemes [6,7] keep the mixing between quark and gluon operators under the trace operation as simple as possible. We present here three of the most meaningful choices:
• D1 scheme: One may choose a scheme where the quark and gluon operators do not mix under the trace operation [6]. It corresponds to the choice $x = 0$ and $y = \gamma_m$, so that

$$g_{\mu\nu}(T^{\mu\nu}_q)_R = (1+\gamma_m)\,(\bar\psi m\psi)_R, \qquad g_{\mu\nu}(T^{\mu\nu}_g)_R = \frac{\beta}{2g}\,(F^2)_R,$$

which was the situation considered in Ref. [5], allowing one to identify the quark and gluon contributions to the EMT trace used in Refs. [3,4] with the corresponding traces of the quark and gluon contributions to the EMT.
• D2 scheme: Since the whole anomaly $\frac{\beta}{2g}(F^2)_R + \gamma_m(\bar\psi m\psi)_R$ and $(\bar\psi m\psi)_R$ are separately renormalization-group invariant, one may prefer to work with $x = y\,\frac{\beta}{2g\,\gamma_m}$. The D2 scheme introduced in Ref. [7] corresponds to the choice $x = y = 0$ and attributes all the anomalous terms to the renormalized gluon contribution to the EMT,

$$g_{\mu\nu}(T^{\mu\nu}_q)_R = (\bar\psi m\psi)_R, \qquad g_{\mu\nu}(T^{\mu\nu}_g)_R = \frac{\beta}{2g}\,(F^2)_R + \gamma_m\,(\bar\psi m\psi)_R.$$

• D3 scheme: For completeness, we mention a third possibility corresponding to the choice $x = \frac{\beta}{2g}$ and $y = \gamma_m$, which attributes all the anomalous terms to the renormalized quark contribution to the EMT.
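The condition $x = y\,\beta/(2g\gamma_m)$ can be unpacked with a short worked example. This is a sketch, assuming the parametrization of the quark and gluon traces in terms of $x$ and $y$ in the notation of Refs. [17,18] as written in the first two lines below; the symbol $A$ for the total anomaly is shorthand introduced here.

```latex
% Sketch (requires amsmath). Assumed parametrization (notation of Refs. [17,18]):
\begin{align*}
g_{\mu\nu}(T_q^{\mu\nu})_R &= (1+y)\,(\bar\psi m\psi)_R + x\,(F^2)_R, \\
g_{\mu\nu}(T_g^{\mu\nu})_R &= (\gamma_m-y)\,(\bar\psi m\psi)_R + \Big(\frac{\beta}{2g}-x\Big)(F^2)_R, \\
A &\equiv \frac{\beta}{2g}\,(F^2)_R + \gamma_m\,(\bar\psi m\psi)_R .
\end{align*}
% Imposing x = y \beta/(2g\gamma_m):
\begin{align*}
g_{\mu\nu}(T_q^{\mu\nu})_R &= (\bar\psi m\psi)_R + \frac{y}{\gamma_m}\,A, &
g_{\mu\nu}(T_g^{\mu\nu})_R &= \Big(1-\frac{y}{\gamma_m}\Big)A .
\end{align*}
% D2: y = 0        -> quark trace = (\bar\psi m\psi)_R,      gluon trace = A
% D3: y = \gamma_m -> quark trace = (\bar\psi m\psi)_R + A,  gluon trace = 0
% D1: x = 0, y = \gamma_m does not satisfy the condition unless \beta = 0.
```

With this condition each trace is a combination of the two separately renormalization-group-invariant structures $(\bar\psi m\psi)_R$ and $A$, and the single number $y/\gamma_m$ fixes the share of the anomaly carried by the quarks.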

F. "Quantum anomalous energy"
In a series of recent papers [8-10], the concept of "quantum anomalous energy" (QAE) has been emphasized as a key aspect of the nucleon mass structure. QAE finds its origin in the four-term energy decomposition proposed in Refs. [3,4], where it has been argued that in MS-like renormalization with DR the QCD Hamiltonian takes a four-term form that includes an anomalous contribution.

Footnote 12: Adding to the confusion, in a recent paper [101] the temporal component of the gluon part of the QCD EMT is denoted $T^{00}_g = \tfrac12 (E^2 + B^2)$ with a reference to the work of Novikov and Shifman [98], while at the same time it is presented as a $2^{++}$ gluon operator, i.e. a symmetric traceless tensor, like in the work of Voloshin and Zakharov [97].
It seems therefore that the renormalized QCD Hamiltonian receives on top of its classical form a new contribution equal to a quarter of the trace anomaly. This contribution is unexpected and referred to as QAE.
The analysis of Refs. [3,4] has been revisited in Refs. [6,7] with the new conclusion that no anomalous contributions actually survive in the Hamiltonian; see our discussion in Section V D. The difference between these two analyses can be traced back to the renormalized traceless operators. While those operators did not appear explicitly in the original works [3,4], their precise form used in the context of the four-term decomposition has been specified in a recent work [8]. Contrary to Eq. (104), it appears that the d → 4 limit is taken before the normal product, giving the impression that the traceless operators can be written in the same way as in the classical case. As stressed in Section V C, this is an incorrect notation since it assumes that trace and renormalization are commuting operations, a property that is not satisfied in general in MS-like schemes with DR.
Writing the renormalized QCD EMT as a sum of a traceless part with classical form and a trace part with anomalous contributions like in Refs. [3,4,8-10] may seem a priori attractive, since it gives the impression that the anomaly is distributed equally between the diagonal components of the EMT. As a result, the Hamiltonian $H = \int d^3x\,(T^{00})_R$ is expected to provide a quarter of the trace anomaly, considered in these papers as a "new" form of energy. This picture is however inconsistent with Poincaré symmetry. Indeed, time translation is an exact symmetry of the theory. The form of the corresponding generator, i.e. the Hamiltonian, must then be the same as in the classical case as a consequence of the quantum action principle [81,82].
As indicated by its name, the trace anomaly is a pure quantum contribution associated with the trace of the EMT, and not with its individual diagonal components. It expresses the breaking of spacetime dilatations, and not a breaking of spacetime translations. At the operator level, the trace anomaly has nothing to do with the Hamiltonian. Motivated by Lorentz symmetry and deep-inelastic scattering experiments$^{13}$, one may of course decompose the Hamiltonian into tensor and scalar parts as in Eq. (107), but this does not bring much fundamental insight since it just amounts to writing the Hamiltonian as $H = (H - H_S) + H_S$, where $H_S$ contains anomalous contributions (see Eq. (108)) while $H$ does not. As already stressed in Section IV B, the only way to relate non-trivially the Hamiltonian, and hence the mass of a system, to the trace anomaly is at the level of matrix elements. Indeed, the virial theorem (23) tells us that the expectation value of the stress tensor must vanish at rest. As a result, the expectation value of the Hamiltonian can be equated with that of the spatial integral of the EMT trace, which leads to a relation between the parton energies and the anomaly [6,7]. It is then clear that one can have a mass sum rule with contributions from either the parton energies or the anomaly, but a sum rule with both contributions at the same time does not appear naturally. In summary, the renormalized Hamiltonian in QCD does not contain anomalous contributions since it is protected by translation symmetry. The so-called QAE given by the expectation value of $H_a$, defined as a quarter of the trace anomaly and appearing in the "scalar" part of the Hamiltonian, does not provide clear fundamental insight since it is exactly compensated by the same contribution with opposite sign from the "tensor" part of the Hamiltonian. The latter does not appear in Refs. [3,4,8-10] due to an unjustified notation for the renormalized traceless operators in MS-like schemes with DR.

VI. EMT DECOMPOSITION ON THE LATTICE
Recent papers [8][9][10] try to justify the appearance of QAE in the nucleon mass budget based on some works by Rothe [102,103] in the context of lattice QCD (LQCD). It appears that the question of the Hamiltonian in LQCD is an old and difficult problem. Contrary to DR, lattice regularization allows one to renormalize the theory in a non-perturbative way. On the other hand, Poincaré symmetry is broken by the introduction of a finite lattice spacing. One must therefore pay particular attention that in the limit of vanishing lattice spacing the Poincaré symmetry is correctly recovered. It also means that one has to be careful with the physical interpretation of lattice expressions, since the breaking of Poincaré symmetry by the regulator generates artifacts, especially in currents associated with spacetime symmetries like the EMT. Unfortunately, this essential aspect of the problem has not been considered in Refs. [8][9][10]. We show in the following that taking it into account sheds light on the results presented in these papers, and leads to the conclusion that the relevant LQCD papers actually do not provide concrete support for the concept of QAE.

A. Lattice sum rules
Using the Wilson action [104], Michael found that the glueball mass can be expressed through the sum rule (122) [105,106], where $a$ is the symmetric lattice spacing, $\hat\beta = 2N/g_0^2$ is the bare lattice coupling parameter for the $SU(N)$ gauge sector of the theory, and the plaquette action is evaluated in a one-glueball state (with the vacuum value implicitly subtracted) and summed over all plaquettes at one time slice. In the naive continuum limit, one can basically identify the plaquette action with its continuum counterpart, so that Eq. (122) can be interpreted as the lattice version of Eq. (65); we will denote this correspondence as Eq. (124). As clearly shown by a recent variation of Michael's derivation [10], the fact that one gets an expression for the mass in terms of the EMT trace follows from an isotropic lattice scaling transformation. Further relations can be obtained by considering asymmetric lattices. Following the formalism of Ref. [107], a different lattice coupling parameter $\hat\beta_{\mu\nu}$ must be attributed to the different plaquette orientations $\mu\nu$. In the case where one distinguishes the temporal spacing $a_0 = a_t$ from the isotropic spatial spacing $a_1 = a_2 = a_3 = a_s$, Michael arrived at the two sum rules (125) and (126) [105,106], where $t = 0i$ and $s = ij$ with $i, j \neq 0$. The coefficients $S$ and $U$ are defined at the symmetric point $a_t = a_s = a$. Since Eq. (125) is associated with temporal dilatations, it can be interpreted as the lattice version of the energy sum rule [8-10]. Similarly, we observe that Eq. (126) is associated with spatial dilatations and can therefore be interpreted as the lattice version of the virial theorem. Combining Eqs. (125) and (126), Michael found two alternative expressions for the glueball mass, Eqs. (128) and (129). Using the relation obtained by Karsch [107],

$$2(S + U) = \frac{d\hat\beta}{d\ln a}, \qquad (130)$$

and the decomposition of the total plaquette action into three temporal and three spatial contributions, we see that Eq. (128) derived from asymmetric lattices and evaluated at the symmetric point is consistent with Eq. (124) obtained directly from symmetric lattices.
In the weak-coupling limit $\hat\beta \to \infty$, one finds [107]

$$2(S - U) \approx -4\hat\beta, \qquad (131)$$

so that one can write the approximate mass formula (132) [106]. In the naive continuum limit, the lattice plaquettes are identified with the chromoelectric and chromomagnetic contributions to the field energy, Eq. (133). We recall that the sign change in the chromoelectric contribution comes from the transition from Euclidean space to Minkowski space. Using this identification, it appears that the classical form of the field energy provides only 3/4 of the glueball mass [103,106]. Slightly rearranging the energy sum rule (125), one obtains Eq. (134). In the naive continuum limit, Rothe concluded that the missing 1/4 of the glueball mass comes from the trace anomaly, in apparent agreement with the analysis of Refs. [3,4]. The recent works [8-10] present a variation of this discussion and use it as support for the concept of QAE. This result, however, has to be taken with a grain of salt, since it relies on both the weak-coupling limit (131) and the naive continuum limit (133). Non-perturbative evaluations at finite temperature of the combination $2(S - U)$ show in fact significant deviations from the weak-coupling value [108-110]. This suggests that in general the classical form of the field energy on the lattice does not provide exactly 3/4 of the glueball mass, and indicates that the lattice operators must be renormalized. This is to be expected since the combination of temporal and spatial plaquette actions that defines the classical lattice field energy is not a Noether charge due to the breaking of Poincaré symmetry on the lattice, where the hypercubic symmetry typically leads to more complicated mixing patterns than in the continuum [56,110-113]. In particular, the chromoelectric and chromomagnetic contributions to the field energy mix under renormalization [114,115], thereby invalidating the naive continuum limit interpretation (133).
In conclusion, contrary to the suggestion of Refs. [8][9][10] a careful inspection reveals that the lattice energy sum rule does not provide a clear support to the concept of QAE. In particular, it is essential to consider the renormalization of the lattice operators before providing any physical interpretation. (Note that E 2 and B 2 in Eq. (133) do not have the same meaning as the corresponding renormalized operators in the continuum.) This is crucial for the components of the EMT since the breaking of Poincaré symmetry on the lattice is a source of artifacts, as we will see in the following.

B. Translation symmetry
How to construct the EMT on the lattice is a difficult question that has been studied for over 30 years [60-62, 84, 85, 90, 111, 116-126]. For recent investigations of the hadron mass structure on the lattice see also Refs. [65,66,127-129]. A major difficulty is that lattice regularization breaks translation symmetry and makes the construction and renormalization of the EMT non-trivial. In particular, it turns out that any discretization of the classical EMT, denoted $T^{\mu\nu}_{\rm tree}$, is not conserved in the quantum theory [111,116-119],

$$\partial_\mu T^{\mu\nu}_{\rm tree} = R^\nu + X^\nu. \qquad (135)$$

Here $R^\nu$ is proportional to the lattice EOM and $X^\nu$ is an operator that formally vanishes when the lattice spacing $a$ tends to zero. Because of radiative corrections, $X^\nu$ provides however finite contributions when inserted into Green functions. It vanishes for zero external momentum transfer and can thus be rewritten as $X^\nu = -\partial_\mu T^{\mu\nu}_{\rm corr}$. The conserved EMT on the lattice is therefore defined as

$$T^{\mu\nu} = T^{\mu\nu}_{\rm tree} + T^{\mu\nu}_{\rm corr}. \qquad (136)$$

The correction term $T^{\mu\nu}_{\rm corr}$ ensures that the translational Ward identities are satisfied. At the same time it ensures that the trace anomaly is correctly reproduced$^{14}$. The fact that the classical expression for the EMT must be corrected is consistent with the general observation that once a genuine symmetry is broken by the regulator, one should expect to see the appearance of additional symmetry-restoring counterterms [130-133].
In the recent work [10], the mass structure of a (1 + 1)-dimensional non-linear sigma model in the large-$N$ limit is studied at the one-loop level. It is found that the operator form of the total Hamiltonian $H = \int d^3x\, T^{00}(x)$ depends on the choice of regularization scheme. In particular, the classical Hamiltonian $H_c = \int d^3x\, T^{00}_{\text{tree}}(x)$ is regulator-dependent and has no universal physical meaning. The authors observe that in symmetric regularization schemes, where all directions are treated equally, the classical Hamiltonian coincides with the traceless part $H_T = \int d^3x\, \bar{T}^{00}(x)$, while in regularization schemes where the energy integral can be rescaled back and forth (e.g. dimensional regularization) the classical Hamiltonian coincides with the total Hamiltonian.
These results can easily be understood from the point of view of translation symmetry. Regularization schemes where the energy integral can be rescaled back and forth are precisely those preserving translation symmetry in the temporal direction. It should therefore not be surprising that the total Hamiltonian takes the same form as in the classical theory, since this is a mere consequence of the quantum action principle [81,82]. More generally, it has been argued that translational Ward identities obtained with a cutoff procedure preserving Poincaré symmetry cannot contain anomalies [20,134]. On the other hand, when translation invariance in the temporal direction is broken by the regulator, the total Hamiltonian must necessarily involve additional contributions like in Eq. (136) to restore translation symmetry. These correction terms should be regarded as mere artifacts arising due to a poor choice of symmetry-breaking regulator, and not as genuine physical contributions 15 . They must disappear in the process of renormalization to comply with Poincaré symmetry and the quantum action principle.
The authors of Ref. [10] observed that the one-loop contribution to the classical Hamiltonian vanishes only in symmetric regularization schemes. In other schemes, the naively vanishing integral gives a finite contribution because of the asymmetric regulator, which has been interpreted as a sign of its anomalous nature. In particular, in dimensional regularization the one-loop integral takes the form given in Eq. (137). The symmetric integral generates a $1/(4\pi\epsilon)$ pole that cancels the explicit $\epsilon$ factor in front of it, leading to a finite result. Since a similar mechanism is responsible for the trace anomaly in dimensional regularization, the authors of Ref. [10] interpreted the non-vanishing of the one-loop contribution to the classical Hamiltonian as being of anomalous nature.
We disagree with this interpretation. The trace anomaly arises in dimensional regularization from an evanescent term of the form $\epsilon\, O$, with $O$ some operator. At the classical level, the operator $O$ is finite, so that the evanescent term vanishes in the limit $\epsilon \to 0$. At the quantum level, the operator contains a $1/\epsilon$ pole, leading to the trace anomaly. By contrast, $H_c$ is not an evanescent operator and does not vanish at the classical level in the limit $\epsilon \to 0$. The explicit $\epsilon$ factor is not part of the definition of $H_c$ but appears in Eq. (137) only after a change of variables. Despite the similarity with the trace anomaly mechanism, the one-loop contribution to $H_c$ is actually not anomalous.
Anomalies are usually associated with a pair of symmetries [135]. In the present case, the pair consists of translation and dilatation symmetries. In the regularized theory, there is no way of preserving at the same time both translation symmetry and the standard definition of trace. In dimensional regularization, Poincaré symmetry is preserved by distorting the classical $d = 4$ spacetime into a $d = 4 - 2\epsilon$ one, thereby affecting the definition of trace; see also Appendix B. Pauli-Villars regularization also preserves Poincaré symmetry but adds regulator fields which provide new contributions to the trace. Lattice regularization, on the other hand, preserves the classical $d = 4$ spacetime but breaks Poincaré symmetry by making it discrete. In this case the definition of trace is unaffected by the regularization and the trace anomaly must appear in the form of correction terms to the EMT ensuring that the translational Ward identities are satisfied.
To sum up, the appearance of trace anomaly contributions to the energy on the lattice is a pure artifact associated with the breaking of translation symmetry by the discretization. Since translation symmetry is an exact symmetry at the quantum level, this artifact should disappear in the process of renormalization, in agreement with the results obtained using symmetry-preserving regularization schemes. In conclusion, we do not find any concrete support for the concept of QAE from LQCD.

VII. CONCLUSIONS
Understanding the decomposition of the nucleon mass in QCD in terms of contributions from quarks and gluons is a topic of high interest and fundamental importance. Presently, different opinions exist in this area, reflected by different mass decompositions in the literature. Here we have concentrated on the mass decompositions which are based on the component $T^{00}$ of the QCD EMT: a four-term decomposition originally proposed in Refs. [3,4] and recently slightly modified in Refs. [8-10, 64-66], a two-term decomposition put forward in Ref. [5], and a three-term decomposition arrived at in Refs. [6,7]. The latter two are very closely related: by separating the total quark contribution to the nucleon mass into quark kinetic plus potential energies and a quark mass term, one obtains the three-term decomposition starting from the two-term decomposition.
One controversy concerns the proper expressions for the renormalized operators of the mass decomposition. We have elaborated on this important point using DR and MS-type schemes, and we re-confirm the findings of Refs. [6,7] in that regard, which to some extent are based on the renormalization of the full EMT discussed in Refs. [17,18]. This implies, in particular, that in DR the operator $\frac{1}{2}(E^2 + B^2)_R$ has a unique meaning in terms of components of the EMT. This operator corresponds to the total gluon contribution to the nucleon mass. Furthermore, different points of view exist with regard to the physical interpretation of the terms in the mass decompositions. We reiterate the concern raised in Ref. [5] that the four-term decomposition contains mixtures of genuine energy terms and pressure-volume terms. This feature is closely related to the fact that, in order to derive the four-term decomposition, one must make use of the condition for mechanical equilibrium of the nucleon. As we have shown, this condition actually coincides with the virial theorem, which we have discussed at length. Neither the two-term nor the three-term decomposition makes use of the virial theorem, and their contributions have a clean physical interpretation. One argument that was put forth in favor of the four-term decomposition is that it contains the so-called "quantum anomalous energy", which has been suggested as a unique contribution to the nucleon mass [8-10]. We have explained why, in our view, this term is not a genuine contribution to the mass decomposition.
Even though the two-term and three-term mass decompositions do not contain the operator of the trace anomaly, it remains important to pursue attempts to measure (the gluon contribution to) the trace anomaly [68, 69, 136-147]. Such measurements can help to obtain a more robust phenomenology of the quark and gluon contributions to the EMT trace. This in turn allows one to better pin down the quark mass term and, as such, the numerical values of all the terms of the nucleon mass decomposition. Finally, we would like to emphasize that all the nucleon mass decompositions require the same phenomenological input, namely two independent gravitational form factors.

A. VIRIAL THEOREM
In this Appendix, we review the virial theorem in various contexts and discuss its physical meaning.

A. Classical point mechanics
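Originally, the virial theorem comes from classical point mechanics [148], where one considers a system of discrete pointlike particles bound by potential forces. Denoting by $\mathbf{r}_k$ and $\mathbf{p}_k$ the position and momentum of the $k$th particle, one introduces the quantity
$$G = \sum_k \mathbf{r}_k \cdot \mathbf{p}_k, \tag{138}$$
whose time derivative can be expressed as
$$\frac{dG}{dt} = \sum_k \left( \mathbf{v}_k \cdot \mathbf{p}_k + \mathbf{r}_k \cdot \mathbf{F}_k \right), \tag{139}$$
where the velocity of the $k$th particle is defined as $\mathbf{v}_k = d\mathbf{r}_k/dt$ and the net force acting on it as $\mathbf{F}_k = d\mathbf{p}_k/dt$. In both relativistic and non-relativistic cases, the velocity of a particle can be expressed as the derivative of the kinetic energy with respect to momentum. Moreover, if the forces derive from a potential that depends only on the coordinates, we can finally write
$$\frac{dG}{dt} = \sum_k \left( \mathbf{p}_k \cdot \frac{\partial T}{\partial \mathbf{p}_k} - \mathbf{r}_k \cdot \frac{\partial V}{\partial \mathbf{r}_k} \right), \tag{140}$$
where $T(\{\mathbf{p}_i\})$ and $V(\{\mathbf{r}_i\})$ are the total kinetic and potential energies depending on all the momentum and position variables, respectively. For convenience, we introduce a double square bracket notation to indicate that some quantity $O$ is averaged over a long time,
$$[[O]] = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^\tau dt\, O(t). \tag{141}$$
One can then write
$$\left[\left[\frac{dG}{dt}\right]\right] = \left[\left[\sum_k \mathbf{p}_k \cdot \frac{\partial T}{\partial \mathbf{p}_k}\right]\right] - \left[\left[\sum_k \mathbf{r}_k \cdot \frac{\partial V}{\partial \mathbf{r}_k}\right]\right]. \tag{142}$$
Now, for a bound system in the center-of-mass frame 16, particle coordinates and momenta are expected to be bounded, so that $G_{\min} \le G(t) \le G_{\max}$ for all $t$ with both $G_{\min}$ and $G_{\max}$ finite. In that case, one expects $[[dG/dt]] = 0$ and hence
$$\left[\left[\sum_k \mathbf{p}_k \cdot \frac{\partial T}{\partial \mathbf{p}_k}\right]\right] = \left[\left[\sum_k \mathbf{r}_k \cdot \frac{\partial V}{\partial \mathbf{r}_k}\right]\right]. \tag{143}$$
This is a generic form of the virial theorem in classical point mechanics, valid for both relativistic and non-relativistic theories [149,150]. In particular, for a non-relativistic theory with a potential between any two particles $i$ and $j$ of the form $V(\mathbf{r}_i, \mathbf{r}_j) = C r^n$, where $r$ is the relative distance and $C$ is some constant, the virial theorem reduces to
$$2\,[[T]] = n\,[[V]], \tag{144}$$
a simple relation between the time-averaged total kinetic and potential internal energies, which allows one to express e.g. the total center-of-mass energy purely in terms of $[[V]]$. Since the kinetic energy is always positive, the sign of the constant $C$ must be the same as the sign of $n$ for the bound system to exist. In other words, the net forces must be attractive.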
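As a numerical illustration of the relation $2[[T]] = n[[V]]$, the following minimal sketch integrates a bound Kepler orbit ($V = -1/r$, i.e. $n = -1$) and compares the time averages. The units (unit mass, unit coupling), the initial conditions, the velocity-Verlet integrator and the function name `kepler_virial` are illustrative assumptions and not part of the original discussion.

```python
import math

def kepler_virial(steps_per_period=20000, n_periods=3):
    """Time averages of T and V on a bound Kepler orbit (illustrative sketch).

    Assumed units: particle mass = 1 and V(r) = -1/r, so n = -1 in V = C r^n.
    Returns ([[T]], [[V]], average of dG/dt) over an integer number of periods.
    """
    x, y = 1.0, 0.0           # start at apoapsis
    vx, vy = 0.0, 0.8         # slower than the circular speed 1: an ellipse
    energy = 0.5 * (vx**2 + vy**2) - 1.0 / math.hypot(x, y)
    semi_major = -1.0 / (2.0 * energy)
    period = 2.0 * math.pi * semi_major**1.5   # Kepler's third law
    dt = period / steps_per_period

    def accel(px, py):
        r3 = math.hypot(px, py) ** 3
        return -px / r3, -py / r3

    ax, ay = accel(x, y)
    g_start = x * vx + y * vy                  # G = r . p
    t_sum = v_sum = 0.0
    n_steps = steps_per_period * n_periods
    for _ in range(n_steps):
        t_sum += 0.5 * (vx**2 + vy**2)
        v_sum += -1.0 / math.hypot(x, y)
        # velocity-Verlet step
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    g_end = x * vx + y * vy
    return t_sum / n_steps, v_sum / n_steps, (g_end - g_start) / (n_steps * dt)

if __name__ == "__main__":
    t_avg, v_avg, dg_avg = kepler_virial()
    print("2[[T]] + [[V]] =", 2.0 * t_avg + v_avg)
    print("[[dG/dt]]      =", dg_avg)
```

Averaging over an integer number of periods makes $[[dG/dt]]$ vanish up to integration error, which is the boundedness argument used above in a finite-time form.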
More generally, one can write the virial theorem in non-relativistic point mechanics as
$$[[T]] = -\frac{1}{2} \left[\left[\sum_k \mathbf{r}_k \cdot \mathbf{F}_k\right]\right]. \tag{145}$$
The quantity on the right-hand side is called the virial, derived from the Latin word vis meaning "force", "energy" or "power". This form of the virial theorem is very useful e.g. for the description of gases. Indeed, for a non-relativistic gas contained in a box of volume $V$ at rest with a constant pressure $p$, the virial theorem tells us that [151]
$$[[T]] = \frac{3}{2}\, pV - \frac{1}{2} \left[\left[\sum_k \mathbf{r}_k \cdot \mathbf{F}^{\text{int}}_k\right]\right], \tag{146}$$
where the second term on the r.h.s. is the virial associated with internal forces only, and the first term corresponds to the contribution arising from the external forces exerted by the walls of the box.

If the potential between two particles is of the form $V(\mathbf{r}_i, \mathbf{r}_j) = C r^n$, Eq. (152) becomes $2\langle\Psi|T|\Psi\rangle = n\langle\Psi|V|\Psi\rangle$, which is the quantum-mechanical counterpart of Eq. (144). Denoting the center-of-mass position and momentum operators by $\mathbf{r}_{CM}$ and $\mathbf{p}_{CM}$, we also have $0 = d\langle\Psi|\mathbf{r}_{CM}|\Psi\rangle/dt = \langle\Psi|\mathbf{p}_{CM}|\Psi\rangle/m$, indicating that a stationary state is on average at rest 18. This is consistent with the observation that Eq. (152) cannot be valid in all frames, since for a closed bound system the total kinetic energy increases with the total momentum of the system, whereas the potential energy does not [155]. Note also that if we work with a superposition of stationary states, the expectation value $\langle\Psi|G|\Psi\rangle$ will generally depend on time. However, as in the classical case, we may still expect it to be bounded, so that $[[d\langle\Psi|G|\Psi\rangle/dt]] = 0$ [156].
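It was later realized that $G$ is simply the generator of spatial dilatations $U_D = e^{-i\kappa G}$, as one can see from
$$U_D^\dagger\, \mathbf{r}_k\, U_D = \lambda\, \mathbf{r}_k, \qquad U_D^\dagger\, \mathbf{p}_k\, U_D = \lambda^{-1}\, \mathbf{p}_k, \tag{153}$$
with $\lambda = e^\kappa$. This is also clear from
$$i[G, H] = \sum_k \left( \mathbf{r}_k \cdot \frac{\partial V}{\partial \mathbf{r}_k} - \mathbf{p}_k \cdot \frac{\partial T}{\partial \mathbf{p}_k} \right), \tag{154}$$
since $\mathbf{p}_k \cdot \frac{\partial}{\partial \mathbf{p}_k}$ and $\mathbf{r}_k \cdot \frac{\partial}{\partial \mathbf{r}_k}$ measure the degree of homogeneity in momentum and position space, respectively. This observation leads to an interesting alternative derivation of the virial theorem from a variational approach [154, 155, 157-159], which also applies to relativistic quantum mechanics. Let us introduce the function
$$E(\kappa) = \langle\Psi_\kappa|H|\Psi_\kappa\rangle, \tag{155}$$
where $|\Psi\rangle$ is some state and $|\Psi_\kappa\rangle = U_D|\Psi\rangle$ is its dilated counterpart, which depends on the parameter $\kappa$. In a variational approach, we require that the eigenvalue $E$ of a stationary state $|\Psi\rangle$ must be an extremum of $E(\kappa)$ at $\kappa = 0$. In other words, we demand that
$$\left.\frac{dE(\kappa)}{d\kappa}\right|_{\kappa=0} = 0. \tag{156}$$
Using Eq. (154), the stationarity condition under spatial rescaling gives directly the virial theorem
$$\langle\Psi|\sum_k \mathbf{p}_k \cdot \frac{\partial T}{\partial \mathbf{p}_k}|\Psi\rangle = \langle\Psi|\sum_k \mathbf{r}_k \cdot \frac{\partial V}{\partial \mathbf{r}_k}|\Psi\rangle, \tag{157}$$
recognized as the quantum-mechanical counterpart of Eq. (143). The connection between this derivation and the former one simply follows from the identity
$$\frac{dG}{dt} = i[H, G] = -\left.\frac{d}{d\kappa}\left(U_D^\dagger H U_D\right)\right|_{\kappa=0}, \tag{158}$$
which relates the breaking of dilatation symmetry to the behavior of the Hamiltonian under dilatations. This shows that the virial theorem is fundamentally a statement about mechanical equilibrium expressed by the stationarity of the system under spatial dilatations. Note however that it does not say anything about stability, since the latter is determined by the second derivative w.r.t. $\kappa$.

The virial theorem is often used to simplify the calculation of the total energy of a bound system. Let us consider for example a relativistic spin-1/2 particle in a static external potential [160]. In the Dirac theory, we can write the particle energy as
$$E = \int d^3r\, \psi^\dagger \left[ \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m + V(r) \right] \psi = \int d^3r\, \psi^\dagger \left[ \beta m + V(r) + \mathbf{r}\cdot\nabla V(r) \right] \psi, \tag{159}$$
since the virial theorem (157) tells us that $\int d^3r\, \psi^\dagger\, \boldsymbol{\alpha}\cdot\mathbf{p}\, \psi = \int d^3r\, \psi^\dagger\, \mathbf{r}\cdot\nabla V(r)\, \psi$. For a Coulomb potential $V_C(r) \propto 1/r$, we have $\mathbf{r}\cdot\nabla V_C(r) = -V_C(r)$, so that we get the remarkably simple expression
$$E = m \int d^3r\, \psi^\dagger \beta \psi. \tag{160}$$
Evaluating the r.h.s. of Eq. (160) with a trial wave function provides, in a simple manner, an upper bound for the energy of the system.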
For a free particle, the rest energy is $E = m$ and we recover from Eq. (160) the expected normalization of the free Dirac wave function, $\int d^3r\, \psi^\dagger \beta \psi = 1$. For a bound particle, we expect $E < m$ and hence $\int d^3r\, \psi^\dagger \beta \psi < 1$.
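The variational derivation of this subsection can be illustrated with a minimal sketch for a non-relativistic particle in a power-law potential $V = C r^n$. We assume here the convention that a dilatation with parameter $\kappa$ rescales positions by $\lambda = e^\kappa$ and momenta by $1/\lambda$, so that $E(\kappa) = e^{-2\kappa}\langle T\rangle + e^{n\kappa}\langle V\rangle$. The hydrogen-like trial function $\psi_b \propto e^{-r/b}$ in atomic units, with $\langle T\rangle = 1/(2b^2)$ and $\langle V\rangle = -1/b$, and all function names are illustrative choices.

```python
import math

def dilated_energy(kappa, t_mean, v_mean, n):
    """E(kappa) for T quadratic in momenta and V homogeneous of degree n.

    Assumed convention: positions scale as e^kappa, momenta as e^(-kappa).
    """
    return math.exp(-2.0 * kappa) * t_mean + math.exp(n * kappa) * v_mean

def stationary_kappa(t_mean, v_mean, n):
    """Solve dE/dkappa = 0, i.e. exp((n + 2) kappa) = 2 T / (n V).

    Requires n * v_mean > 0 (attractive net force) and n != -2.
    """
    return math.log(2.0 * t_mean / (n * v_mean)) / (n + 2.0)

def hydrogen_trial(b):
    """<T> and <V> for psi ~ exp(-r/b), atomic units, V = -1/r (so n = -1)."""
    return 1.0 / (2.0 * b * b), -1.0 / b

if __name__ == "__main__":
    n = -1
    t0, v0 = hydrogen_trial(1.5)          # deliberately non-optimal width
    kappa = stationary_kappa(t0, v0, n)
    t_opt = math.exp(-2.0 * kappa) * t0   # kinetic energy after dilatation
    v_opt = math.exp(n * kappa) * v0      # potential energy after dilatation
    print("stationary energy:", dilated_energy(kappa, t0, v0, n))
    print("2<T> - n<V> at the stationary point:", 2.0 * t_opt - n * v_opt)
```

At the stationary point the dilated expectation values satisfy $2\langle T\rangle = n\langle V\rangle$ by construction, which is the content of the stationarity condition; whether the point is a minimum is a separate question decided by the second derivative.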

C. Link between field theory and point mechanics
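As discussed in Sec. II B, the virial theorem can be extended to a field-theoretical framework. We show here how the continuum treatment reduces to the usual one in point mechanics. For a point particle we can write the momentum density as $T^{0i}(x) = p^i(x)\, \delta^{(3)}(\mathbf{x} - \mathbf{r}(t))$. Its time derivative therefore receives two contributions,
$$\partial_0 T^{0i}(x) = \frac{\partial p^i(x)}{\partial t}\, \delta^{(3)}(\mathbf{x} - \mathbf{r}(t)) - p^i(x)\, \mathbf{v}(t) \cdot \nabla \delta^{(3)}(\mathbf{x} - \mathbf{r}(t)), \tag{161}$$
where $\mathbf{v} = d\mathbf{r}/dt$ is the particle velocity. The net force acting on the particle is obtained by integrating this equation over space,
$$F^i(t, \mathbf{r}(t)) = \int d^3x\, \partial_0 T^{0i}(x) = \frac{\partial}{\partial t} p^i(t, \mathbf{r}(t)) + \mathbf{v}(t) \cdot \nabla p^i(t, \mathbf{r}(t)). \tag{162}$$
We recognize as expected the expression for the material derivative. We can then rewrite Eq. (161) as
$$\partial_0 T^{0i}(x) = F^i(t, \mathbf{r}(t))\, \delta^{(3)}(\mathbf{x} - \mathbf{r}(t)) - \mathbf{v}(t) \cdot \nabla T^{0i}(x). \tag{163}$$
So the rate of momentum change $\partial_0 T^{0i}(x)$ at a fixed spacetime location (Eulerian description) is related to the rate of momentum change $F^i(t, \mathbf{r}(t))\, \delta^{(3)}(\mathbf{x} - \mathbf{r}(t)) = \frac{d}{dt} T^{0i}(x)$ of the fixed material point (Lagrangian description) via the convective rate of momentum change $\mathbf{v}(t) \cdot \nabla T^{0i}(x)$. Remembering now that $G = \sum_i \int d^3x\, T^{0i} x^i$, we get
$$G(t, \mathbf{r}(t)) = \mathbf{r}(t) \cdot \mathbf{p}(t, \mathbf{r}(t)) \tag{164}$$
and
$$\frac{d}{dt} G(t, \mathbf{r}(t)) = \mathbf{r}(t) \cdot \mathbf{F}(t, \mathbf{r}(t)) + \mathbf{v}(t) \cdot \mathbf{p}(t, \mathbf{r}(t)), \tag{165}$$
which agree with the expressions found in point mechanics, where the explicit and implicit time dependences of $G$, $\mathbf{F}$ and $\mathbf{p}$ are merged into a single total time dependence. Comparing now Eq. (163) with Eq. (16) for $\mu = i$, we find that the force density is $F^i(t, \mathbf{r}(t))\, \delta^{(3)}(\mathbf{x} - \mathbf{r}(t))$, as expected, but also that
$$T^{ji}(x) = v^j(t)\, T^{0i}(x),$$
which indicates that the stress tensor associated with a point particle is simply given by the tensor product of the momentum density with the velocity. Put differently, the stress tensor arises solely from the motion of the particle and therefore has purely convective contributions. Since a point particle has no extension, it has no internal, that is, rest-frame pressure.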
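The step from the Eulerian to the Lagrangian form can be checked explicitly. The short computation below is a sketch that assumes only the point-particle ansatz $T^{0i}(x) = p^i(x)\,\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))$ and the material-derivative form of the force; it uses $\partial_t \delta^{(3)}(\mathbf{x}-\mathbf{r}(t)) = -\mathbf{v}\cdot\nabla\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))$.

```latex
\begin{align}
F^i\,\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))
  &= \left(\frac{\partial p^i}{\partial t} + \mathbf{v}\cdot\nabla p^i\right)
     \delta^{(3)}(\mathbf{x}-\mathbf{r}(t)), \\
-\,\mathbf{v}\cdot\nabla T^{0i}(x)
  &= -\left(\mathbf{v}\cdot\nabla p^i\right)\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))
     - p^i\,\mathbf{v}\cdot\nabla\delta^{(3)}(\mathbf{x}-\mathbf{r}(t)), \\
F^i\,\delta^{(3)}(\mathbf{x}-\mathbf{r}(t)) - \mathbf{v}\cdot\nabla T^{0i}(x)
  &= \frac{\partial p^i}{\partial t}\,\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))
     - p^i\,\mathbf{v}\cdot\nabla\delta^{(3)}(\mathbf{x}-\mathbf{r}(t))
   = \partial_0 T^{0i}(x).
\end{align}
```

The terms proportional to $\mathbf{v}\cdot\nabla p^i$ cancel between the first two lines, so the force term and the convective term together reproduce the two contributions to the time derivative of the momentum density.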

B. DIMENSIONAL REGULARIZATION
In this section, we summarize the main working principles of dimensional regularization (DR), as described in the works of Refs. [82,84]. In particular, we will outline the properties of the (infinite-dimensional) domain space in the DR approach and how the physical space of a system can consistently be incorporated as a subspace of this domain.
DR is defined only in perturbation theory through the following general procedure: for any given Green's function, one inserts an expansion (of order $n$) of the exponential of the action and replaces the usual momentum integration in 4 dimensions with a map $I_d$, which is usually called 'integration in $d$ dimensions'. Since the momentum integration has changed, it is natural that the 'momenta' have also changed their nature, from four-vectors to objects with different dimensionality. These objects actually take different definitions depending on the type of DR that one adopts. We can distinguish the following DR types (see, e.g., Ref. [79] for a comprehensive overview): the conventional DR (CDR), where one treats all the vectors and tensors in $d$ dimensions (see below for the proper definition of $d$-dimensional vectors); the 't Hooft-Veltman (HV) regularization, where one attributes $d$ dimensions only to the 'singular' or 'internal' vector (or tensor) fields and 4 dimensions to all other fields; the four-dimensional helicity scheme (FDH) and dimensional reduction (DRED), where one treats the momenta in $d$ dimensions and enlarges the space to $d_s = d + n_\epsilon$ dimensions to treat the singular vector fields in $d_s$ dimensions, working with the regular vector fields in 4 (within FDH) or $d_s$ (within DRED) dimensions.
A common misconception is that any DR scheme extends or contracts the Minkowski momentum space $S_4$ into a vector space of non-integer dimension. This is not how DR actually works at a consistent theoretical level. DR extends $S_4$ into an infinite-dimensional vector space, indicated as a quasi-$d$-dimensional space 19 $QS_d$, which may be further enlarged to a quasi-$d_s$-dimensional space, denoted by $QS_{d_s}$, via a direct (orthogonal) sum with $QS_{n_\epsilon}$,
$$QS_{d_s} = QS_d \oplus QS_{n_\epsilon}.$$
The space $QS_d$ is the natural domain of CDR and of momentum integration in all the DR schemes, and the claim that one works with $d$ dimensions comes from the scaling property of the map $I_d$, which resembles the scaling property of a finite-dimensional space with $d$ dimensions. We can briefly summarize the properties of the map $I_d$, which is at the core of all the DR variants, following the outline in Refs. [80,81]. Formally, $I_d$ is defined for scalar 'integrands' as the map from the direct product of the space of functions on $QS_d$ and $QS_d$ into complex numbers, i.e.,
$$I_d : (f, p) \mapsto I_d(f, p) \equiv \int d^dp\, f(p) \in \mathbb{C}.$$
In the following, we will assume that $QS_d$ is a Euclidean space. To translate the theory from Minkowski space into Euclidean space one can either apply a Wick rotation from the very beginning or single out the time dimension by writing $p = (p^0, \mathbf{q})$,
$$I_d(f, p) = \int dp^0\, I_{d-1}(f(p^0, \mathbf{q}), \mathbf{q}).$$
The separation of $I_d$ into a standard one-dimensional integral and a residual $(d-1)$-dimensional map is allowed and consistent, as will become evident from the discussion below. This is what is meant when working in $1 + (3 - 2\epsilon)$ dimensions 20. The map $I_d$ is defined through a set of fundamental axioms. To prove its existence, one first assumes that all the external physical vectors $q_i$ in $QS_d$ belong to a vector subspace $V \subset QS_d$ with $\dim(V) = J < +\infty$. This assumption is not restrictive, since any external vector should live in the physical Minkowski space of the theory (for standard QED or QCD the physical space is 4-dimensional). Any vector $p \in QS_d$ can then be written as $p = p_\parallel + p_\perp$, where $p_\parallel \in V$ and $p_\parallel \cdot p_\perp = 0$. Within such a decomposition, a generic scalar function is given by
$$f(p^2, p \cdot q_i) = f(p_\parallel^2 + p_\perp^2,\; p_\parallel \cdot q_i) \equiv f(p_\parallel, p_\perp),$$
and one defines the map $I_d$ to be the ordinary $J$-dimensional integral over $p_\parallel$ performed after the integration in one dimension over $p_\perp = |p_\perp|$ with a weight $p_\perp^{d-J-1}$, i.e.,
$$I_d(f, p) = \int d^Jp_\parallel\, \frac{2\pi^{(d-J)/2}}{\Gamma\!\left(\frac{d-J}{2}\right)} \int_0^\infty dp_\perp\, p_\perp^{d-J-1}\, f(p_\parallel, p_\perp).$$
Such a definition is consistent with the fundamental axioms of the map $I_d$ and justifies the commonly adopted nomenclature of 'integration in $d$ dimensions'. Furthermore, it works independently of the dimension of $V$ so long as $\dim(V) < +\infty$. One can easily extend the definition to the case of tensor integrals. For a generic tensor function, one can always write down the following expansion into scalar functions $f_i(p^2, q^2, p \cdot q)$,
$$f^{ij}(p, q) = p^i q^j f_1 + q^i p^j f_2 + p^i p^j f_3 + q^i q^j f_4 + g^{ij} f_5, \qquad f_n \equiv f_n(p^2, q^2, p \cdot q), \tag{176}$$
and then proceed with the integration term by term. If the index of the tensor function is carried by one of the external momenta $q_i$, then there is no ambiguity in the meaning of the index, since the external momenta live in the finite subspace $V$. Conversely, if the index is carried by $p$, one enlarges the space $V$ for the integration of the corresponding term in Eq. (176) in such a way as to include that explicit component of $p$ in the parallel integration. As a final result, in CDR or HV the open indices either belong to the minimal subspace $V$ which contains all the external vectors (in which case we usually identify $V$ with the Wick-rotated Minkowski space) or to the metric tensor $g^{\mu\nu}$.

The final step is to introduce a proper definition of the covariant tensor $g_{\mu\nu}$ in an infinite-dimensional space. The naive construction as the inverse of $g^{\mu\nu}$ would lead to $g_{\mu\nu} g^{\mu\nu} = +\infty$, which is not very useful. Instead, one can define $g_{\mu\nu}$, and hence the dual space of covariant tensor operators, through the map $I_d$, by specifying its action on a generic tensor function $T^{\mu\nu}$. For the special case $T^{\mu\nu} = g^{\mu\nu}$ one obtains $g_{\mu\nu} g^{\mu\nu} = d$, as one would expect in a '$d$-dimensional space'. Within the framework of any DR scheme discussed above, the standard meaning of a component for any vector or tensor fails, notably if the component index is carried by the metric tensor, which inherently lives in the infinite-dimensional space $QS_d$. However, this does not cause any trouble, since all physical observables are Lorentz scalars. The component of a physical quantity represented by a vector or a tensor comes into play only through scalar products and assumes a particular value once a reference frame has been specified. With this understanding, we have a clear interpretation also in the DR schemes.
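The construction of $I_d$ as a $J$-dimensional parallel integral followed by a weighted one-dimensional transverse integral can be tested numerically. The sketch below is only an illustration under stated assumptions: the transverse integral is normalized with the solid-angle factor $2\pi^{(d-J)/2}/\Gamma((d-J)/2)$, the test function is a Gaussian $f(p) = e^{-s^2p^2}$ (for which the parallel integral factorizes into one-dimensional integrals), and the function names are hypothetical. It checks that the result does not depend on the choice of $J$ and that it obeys the scaling property $I_d(f(sp)) = s^{-d} I_d(f(p))$, one of the properties (together with linearity and translation invariance) usually demanded of such a map in the literature.

```python
import math

def simpson(func, a, b, n=40000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def dim_reg_gaussian(d, J, s=1.0):
    """I_d of f(p) = exp(-s^2 p^2), split into J parallel dimensions
    and one weighted transverse integral (illustrative sketch).

    This naive quadrature needs d - J >= 1 so that the weight is regular at 0.
    """
    cutoff = 12.0 / s
    # The Gaussian factorizes, so the J-dimensional parallel integral
    # is the J-th power of a one-dimensional integral.
    one_dim = simpson(lambda x: math.exp(-(s * x) ** 2), -cutoff, cutoff)
    parallel = one_dim ** J
    k = d - J
    # Assumed normalization: surface of the unit sphere in k dimensions.
    sphere = 2.0 * math.pi ** (k / 2.0) / math.gamma(k / 2.0)
    radial = simpson(
        lambda x: x ** (k - 1.0) * math.exp(-(s * x) ** 2), 0.0, cutoff
    )
    return parallel * sphere * radial

if __name__ == "__main__":
    d = 3.7
    for J in (0, 1, 2):
        print("J =", J, " I_d =", dim_reg_gaussian(d, J))
    print("pi^(d/2) =", math.pi ** (d / 2.0))
    print("scaling ratio:", dim_reg_gaussian(d, 1, s=2.0) / dim_reg_gaussian(d, 1))
```

For the Gaussian the closed-form value is $\pi^{d/2}$ for any $d$, which makes it a convenient test function; other integrands would require a genuine multi-dimensional quadrature for the parallel part.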