Non-renormalizability of the HMC algorithm

In lattice field theory, renormalizable simulation algorithms are attractive, because their scaling behaviour as a function of the lattice spacing is predictable. Algorithms implementing the Langevin equation, for example, are known to be renormalizable if the simulated theory is. In this paper we show that the situation is different in the case of the molecular-dynamics evolution on which the HMC algorithm is based. More precisely, studying the phi^4 theory, we find that the hyperbolic character of the molecular-dynamics equations leads to non-local (and thus non-removable) ultraviolet singularities already at one-loop order of perturbation theory.


Introduction
Numerical simulations in lattice field theory are based on stochastic processes that produce random sequences of representative field configurations. It is often useful to interpret the simulation time in these calculations as a further space-time coordinate. The n-point autocorrelation functions of the local fields then formally look like the correlation functions in a field theory with an extra dimension and they are, in fact, sometimes representable in this way. Depending on the simulation algorithm, and if the simulated theory is renormalizable, the autocorrelation functions may conceivably be renormalizable as well. The scaling properties of such algorithms (which, for brevity, will be referred to as renormalizable) are encoded in the continuum theory and thus become predictable to some extent.
In the pure SU(N) gauge theory, for example, simulation algorithms that integrate the Langevin equation are known to be renormalizable [1,2]. The integrated autocorrelation times τ_int of physical observables have dimension [length]^2 in this case. Moreover, the standard renormalization group analysis and a one-loop calculation in perturbation theory [3,4] imply that they scale according to [5] τ_int = C g_0^p r_0^2 at small lattice spacings a, where C is an observable-dependent constant, p a positive exponent computable in perturbation theory, g_0 the bare gauge coupling and r_0 the Sommer radius [9]. In lattice units, the autocorrelation times thus increase like 1/a^2 as a → 0 up to a logarithmically decreasing factor †.
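A Langevin update of this kind can be sketched for a scalar lattice model (the φ^4-type theory considered later in this paper rather than the gauge theory). The Euler discretization, the 1-d periodic lattice and all parameter names below are our illustrative choices, not taken from refs. [1-5]. For a free field (g_0 = 0) the measured stationary variance can be compared with the exact lattice propagator:

```python
import numpy as np

rng = np.random.default_rng(1)
L, m2, g0 = 32, 1.0, 0.0        # 1-d lattice; free field (g0 = 0) for the check
eps, nsteps = 0.02, 40000       # Langevin step size and number of steps

def dS(phi):
    # variation of the lattice action S = sum 1/2 (dphi)^2 + 1/2 m2 phi^2 + g0/24 phi^4
    lap = np.roll(phi, 1) + np.roll(phi, -1) - 2.0 * phi
    return -lap + m2 * phi + g0 / 6.0 * phi**3

phi = np.zeros(L)
obs = []
for n in range(nsteps):
    # Euler discretization of the Langevin equation d_t phi = -dS/dphi + noise
    phi = phi - eps * dS(phi) + np.sqrt(2.0 * eps) * rng.standard_normal(L)
    if n > nsteps // 2:
        obs.append(np.mean(phi**2))

meas = np.mean(obs)
k = np.arange(L)
exact = np.mean(1.0 / (4.0 * np.sin(np.pi * k / L)**2 + m2))  # free lattice propagator
print(meas, exact)
```

The O(eps) step-size bias and the statistical error are both at the few-percent level for these parameters.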
Most simulations of lattice QCD performed today are based on some variant of the HMC algorithm [10]. The form of the underlying molecular-dynamics equations and free-field studies [11] suggest that the simulation time has physical dimension [length] in this case and that the autocorrelation times consequently scale essentially like 1/a. As far as we know, the renormalizability of the algorithm has however never been studied and its scaling properties in the presence of interactions thus remain unknown.
In this paper, the issue is addressed in the framework of perturbation theory. For simplicity the φ 4 theory is considered, but our main result (the non-renormalizability of the molecular-dynamics equations) no doubt extends to most theories of interest. A slightly generalized version of the HMC algorithm is studied, which was introduced many years ago by Horowitz [12] (see sects. 2 and 3). The non-renormalizability of the associated stochastic equation is then established by showing that the four-point autocorrelation function of the fundamental field has a non-removable ultraviolet singularity at second order in the coupling (sects. 4 and 5).

Stochastic molecular dynamics
In order to simplify the discussion as much as possible, we consider the φ^4 theory with a single scalar field φ and dimensional instead of a lattice regularization. The action of the field in D = 4 − 2ǫ Euclidean dimensions is given by

S(φ) = ∫ d^D x { ½ ∂_μφ ∂_μφ + ½ m_0^2 φ^2 + (g_0/4!) φ^4 },   (2.1)

where m_0 denotes the bare mass parameter and g_0 the bare coupling constant. The generalized HMC algorithm [12] integrates a stochastic version of the molecular-dynamics equations that derive from the action (2.1). In the following subsections, we briefly discuss these equations and solve them in powers of the coupling g_0.

Evolution equations
As usual, the molecular dynamics evolves the field φ(t, x) together with its momentum π(t, x) as a function of a fictitious time t. The stochastic evolution equations [12]

∂_t φ = π,   (2.2)
∂_t π = −δS/δφ − 2μ_0 π + η,   (2.3)

involve another mass parameter, μ_0 > 0, and a Gaussian noise η(t, x) with vanishing expectation value and variance

⟨η(t, x) η(s, y)⟩ = 4μ_0 δ(t − s) δ^{(D)}(x − y).   (2.4)

Evidently, the ordinary molecular dynamics is recovered in the limit μ_0 → 0. Moreover, in the second-order form

{∂_t^2 + 2μ_0 ∂_t} φ + δS/δφ = η,   (2.5)

and after substituting t → 2μ_0 t, the evolution equations are seen to coincide with the Langevin equation up to a term that goes to zero at large μ_0. Since its introduction by Horowitz [12], the generalized HMC algorithm has been occasionally studied in the literature, where it is referred to as the Kramers equation or the L2MC algorithm (see refs. [13,14,11], for example). In practice, one starts from the first-order equations (2.2),(2.3) and implements the algorithm using symplectic integrators and acceptance-rejection steps. For the theoretical analysis in this paper, we however prefer to proceed with the second-order equation (2.5).
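The first-order evolution equations can be integrated with a naive Euler-Maruyama step to illustrate the role of μ_0. The damping term −2μ_0π with noise variance 4μ_0 δ(t−s)δ(x−y) used below is the standard Kramers choice, and the 1-d lattice toy model and all parameters are our assumptions. In equilibrium the momenta must be unit Gaussians, which serves as a check:

```python
import numpy as np

rng = np.random.default_rng(2)
L, m2, g0, mu0 = 16, 1.0, 0.5, 0.5
eps, nsteps = 0.01, 80000

def dS(phi):
    lap = np.roll(phi, 1) + np.roll(phi, -1) - 2.0 * phi
    return -lap + m2 * phi + g0 / 6.0 * phi**3

phi = np.zeros(L)
pi = rng.standard_normal(L)
phi2, pi2 = [], []
for n in range(nsteps):
    # d_t pi = -dS/dphi - 2 mu0 pi + eta,  with <eta eta> = 4 mu0 delta
    pi += -eps * (dS(phi) + 2.0 * mu0 * pi) \
          + np.sqrt(4.0 * mu0 * eps) * rng.standard_normal(L)
    phi += eps * pi                        # d_t phi = pi
    if n > nsteps // 2:
        phi2.append(np.mean(phi**2))
        pi2.append(np.mean(pi**2))

print(np.mean(phi2), np.mean(pi2))
```

Setting mu0 very small or very large in this sketch makes the ordinary molecular-dynamics and Langevin regimes visible in the measured autocorrelations.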
2.2 Solution of eq. (2.5) to leading order in g_0

The leading-order equation,

D φ = η,   D = ∂_t^2 + 2μ_0 ∂_t − ∂_μ∂_μ + m_0^2,   (2.6),(2.7)

coincides with the Klein-Gordon equation in D + 1 dimensions except for the term proportional to μ_0, which tends to damp the time evolution of the field. At large μ_0 and after a rescaling of t, the equation actually turns into the heat equation. The Green function K(t, x) of the differential operator D is discussed in some detail in appendix A. Here and below, the notational convention

∫_{ω,p} ≡ ∫ dω d^D p/(2π)^{D+1}

(and similarly for integrals over frequencies or momenta alone) is used. It is then straightforward to show that the solution of the wave equation (2.6) at time t ≥ t_0 with prescribed initial data at time t_0 is given by eq. (2.11). Note that the dependence on the initial data dies away exponentially with increasing time (see appendix A). The stochastic molecular-dynamics evolution thus thermalizes and eventually loses all memory of the initial values of the field. In the following, we shall only be interested in the behaviour of the autocorrelation functions after thermalization. We therefore move the thermalization phase to time t_0 → −∞ and work with the particular solution of eq. (2.6) that describes the stationary situation.

Equation (2.5) may be written in the form

D φ = η − (g_0/3!) φ^3

or, equivalently, as an integral equation,

φ(t, x) = φ_0(t, x) − (g_0/3!) ∫ ds d^D y K(t − s, x − y) φ(s, y)^3,   (2.14)

where φ_0 denotes the stationary solution of the leading-order equation. Iteration of the latter then yields the solution φ(t, x) in powers of g_0. Each term in this expansion may be represented by a tree diagram with directed lines, four-point and one-point vertices (see fig. 1). In frequency-momentum space, the directed lines represent the Green function K̃(ω, p) and the one-point vertices stand for the insertion of the noise field. As in ordinary Feynman diagrams, there is a frequency-momentum conservation δ-function at each four-point vertex with ingoing frequency-momenta (ω_1, p_1), . . . , (ω_4, p_4). The lines in these diagrams are directed, because K̃(ω, p) is not invariant under a change of sign of ω. In position space, the arrows point in the direction of increasing simulation time.

Autocorrelation functions
The n-point autocorrelation functions of the field φ(t, x) are usually defined by taking the time average of the product φ(t_1, x_1) . . . φ(t_n, x_n) at fixed time lags t_i − t_j. In the present setup, the translation symmetry in time allows the time average to be replaced by the average ⟨. . .⟩ over the noise field η(s, y). We are thus led to consider the correlation functions in frequency-momentum space, which may be computed in perturbation theory by expanding the fields φ̃(ω_k, p_k) in powers of the coupling g_0, following the lines of the previous section, and by contracting the noise fields using Wick's rule. As a result one obtains a sum of Feynman diagrams for the autocorrelation functions similar to the ones for the ordinary (field-theoretical) correlation functions.
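For a single momentum mode of the free theory, the stochastic molecular dynamics reduces to a damped harmonic oscillator, so the two-point autocorrelation function defined by this average can be estimated directly and compared with the standard stationary result for such an oscillator. The mode equation, the noise normalization 4μ_0 δ(t−s) and all parameters below are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
M2, mu0 = 2.0, 0.4                 # M^2 = p^2 + m0^2 for a single momentum mode
dt, nsteps, R = 0.01, 80000, 64    # step size, steps, independent replicas

x = np.zeros(R)
v = rng.standard_normal(R)
xs = np.empty((nsteps, R))
for n in range(nsteps):
    # damped oscillator: d_t v = -M^2 x - 2 mu0 v + noise
    v += -dt * (M2 * x + 2 * mu0 * v) + np.sqrt(4 * mu0 * dt) * rng.standard_normal(R)
    x += dt * v
    xs[n] = x
xs = xs[nsteps // 2:]              # discard the thermalization phase

lag = int(round(1.0 / dt))         # time lag t = 1
meas = np.mean(xs[:-lag] * xs[lag:])
# stationary autocovariance of the underdamped oscillator at lag t = 1
eps_p = np.sqrt(M2 - mu0**2)
exact = np.exp(-mu0) * (np.cos(eps_p) + mu0 / eps_p * np.sin(eps_p)) / M2
print(meas, exact)
```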

Feynman rules
The one-point vertices in the tree diagrams that represent the terms in the expansion of φ̃(ω_k, p_k) are connected to the rest of the tree through a directed line. When the noise fields at any two such vertices are contracted, an undirected line is obtained, which represents the propagator (3.3) of the basic field. In the case of the two-point autocorrelation function, for example, the contraction of the lowest-order diagrams yields the tree-level expression for the autocorrelation function. The Feynman diagrams for the autocorrelation functions thus involve directed lines (2.16), undirected lines (3.2) and the vertices (2.19). Given these lines and vertices, the Feynman rules are the usual ones except for the following special features: (a) Both kinds of lines (directed and undirected) can connect to the vertices. The value of the latter is the same in all cases.
(b) Each vertex has exactly one outward-directed line attached to it. This line may be an internal or an external line.
(c) There are no diagrams with loops of directed lines.
(d) External lines may be outward-directed lines or undirected lines. There are no inward-directed external lines.
The structure of the Feynman diagrams is otherwise the same as in any field theory.
In particular, the diagrams can be decomposed into one-particle irreducible parts and the lines connecting them. Note that the external legs of the irreducible parts may be directed or undirected (see fig. 2). The two types of legs must be distinguished and there are thus many more irreducible diagrams than in the case of the ordinary correlation functions.

Computation of the two-point function to one-loop order
The decomposition of the two-point autocorrelation function into one-particle irreducible parts involves the self-energy Σ, which is given to one-loop order by the diagram 1 in fig. 2. A short calculation, using eq. (A.7), then shows that, to this order, the self-energy coincides with the familiar tadpole diagram that contributes to the ordinary two-point correlation function. In particular, the poles at ǫ = 0 will later be seen to cancel when the mass parameter m_0 is renormalized.

Computation of the diagrams 2 and 3
Apart from diagram 1 (which may be inserted in the external lines), the diagrams 2 and 3 in fig. 2 are the only one-particle irreducible diagrams that contribute to the four-point autocorrelation function at one-loop order. Up to a factor g_0^2 and the statistical factor, they are given by the loop integrals I_2 and I_3, where (ω, p) is the external frequency-momentum that flows through the diagrams from left to right. The integrals I_2 and I_3 can be transformed to a useful alternative form by inserting the time-momentum representation of the propagators (see appendix A). Taking the support properties of the Green function into account and performing a partial integration, one finds that I_2 is a linear combination of two integrals, J_0 and J_1 [eq. (3.14)]. The integral I_3 is similarly expressed through J_1 alone, so that it suffices to work out the integrals J_0 and J_1. The integral J_0 coincides with the familiar one-loop diagram contributing to the ordinary four-point correlation function in the φ^4 theory. It is logarithmically divergent and thus has a pole at ǫ = 0. In four dimensions, the integral J_1 has negative mass dimension and may therefore be expected to be absolutely convergent. The following explicit calculation however shows that this is not so.
The integral over t in eq. (3.14) can be performed analytically starting from the time-momentum representation (A.7) of the field propagator. After some algebra and the substitution q → q − p/2, the integration leads to an expression whose integrand is a singularity-free function of ω, p and q but which is logarithmically divergent at large q in four dimensions. A somewhat lengthy calculation then shows that the divergent part of J_1 is given by eq. (3.18), in which the principal branch of the square root is to be taken.

Relation to the ordinary correlation functions
Since the stochastic molecular dynamics simulates the field theory with action (2.1), the equal-time autocorrelation functions

C_n(p_1, . . . , p_n) = ∫_{ω_1,...,ω_n} Ã_n(ω_1, p_1; . . . ; ω_n, p_n)   (4.1)

must coincide with the ordinary correlation functions of the fundamental field in momentum space [12]. In this section, we show that the two- and the four-point autocorrelation functions do have this property at one-loop order of perturbation theory. Partly the calculation serves as a consistency check, but some of the intermediate results will be helpful in sect. 5 as well, where we discuss the non-renormalizability of the stochastic molecular dynamics.

Two-point function
Recalling the results obtained in subsect. 3.2, the two-point autocorrelation function is given by eq. (4.2). Using the time-momentum representation (A.7) of the field propagator, the integrals over the frequencies are easily worked out and one recovers the familiar expression for the two-point correlation function in the φ^4 theory.
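At tree level, this mechanism can be checked numerically. With the conventions reconstructed here, the free two-point autocorrelation function in frequency space is G̃(ω, p) = 4μ_0/((p^2 + m_0^2 − ω^2)^2 + 4μ_0^2 ω^2) (this normalization is our assumption), and its frequency integral must reproduce the ordinary propagator 1/(p^2 + m_0^2):

```python
import numpy as np
from scipy.integrate import quad

m2, p2, mu0 = 1.0, 2.0, 0.7
M2 = p2 + m2

def G(w):
    # assumed free-field autocorrelation in frequency space:
    # G(w, p) = 4 mu0 |Ktilde(w, p)|^2
    return 4 * mu0 / ((M2 - w**2)**2 + 4 * mu0**2 * w**2)

val, err = quad(G, -np.inf, np.inf)   # integral over the frequency
val /= 2 * np.pi
print(val, 1 / M2)
```

The agreement reflects equipartition: the stationary distribution of the Kramers process is exp{−S − ½∫π^2}, so the equal-time field variance is the ordinary propagator.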

Four-point function at leading order
At leading order, and up to the frequency-conservation δ-function, the integrals over the frequencies that lead from the autocorrelation to the ordinary correlation functions factorize, and after a few further steps one obtains eq. (4.6) for the leading-order four-point correlation function. Use has here been made of the identity (A.6) and of the fact that the Green function K(t, p) vanishes at negative times t. Performing the time integration in eq. (4.6), the correlation function is then seen to coincide with the expected expression.

Four-point function at one-loop order
The second-order contribution remains to be worked out. As discussed in subsect. 3.3, diagram 2 is a linear combination of the integrals J_0 and J_1, while diagram 3 is expressed through J_1 alone. Collecting all terms proportional to J_0, their sum is found to be given by eqs. (4.8) and (4.9). Both expressions are easily integrated over the frequencies, because the only factor that depends on ω_1, . . . , ω_4 is the tree-level four-point function. One then discovers that the expected result for C_4^{(1)}(p_1, . . . , p_4) is obtained already from these two contributions to the four-point function.
In order to show that the remaining terms (i.e. those proportional to J_1) vanish when integrated over the frequencies, first consider the channel where the frequency-momentum combination

(ω, p) = (ω_1 + ω_2, p_1 + p_2)   (4.10)

flows through the diagrams 2 and 3. Dropping the terms proportional to J_0 and using the identity (A.5), one obtains eq. (4.11) for the contribution of diagram 2 in this channel. The contribution of the other diagram is similarly given by eq. (4.12), where the prefactor includes the symmetry factor of the diagram. In total there are six further diagrams that differ from the diagrams 2 and 3 by a different distribution of the arrows to the external lines, each of them being given by the corresponding expression (4.11) or (4.12) with the proper assignment of arrows. The sum of all these contributions to the four-point function is then equal to eq. (4.13), i.e. all terms where the arrows are both ingoing or both outgoing cancel in the sum. It is not difficult to show that the terms in eq. (4.13) vanish when integrated over the frequencies. The integral over the first term, for example, is proportional to the expression (4.14). Eliminating ω_4, the integral becomes

∫_{ω_1,ω_2,ω_3} J_1(ω_1 + ω_2, p) K̃(ω_1, p_1) G̃(ω_2, p_2) G̃(ω_3, p_3) K̃(ω_1 + ω_2 + ω_3, p_4),   (4.15)

and if one first integrates over ω_1, the term is seen to vanish, because the integrand has no singularities in the half-plane Im ω_1 ≥ 0 and falls off at least like ω_1^{−4} at large ω_1. Exactly the same happens for all other terms in eq. (4.13) and also for those in the other two frequency-momentum channels.

Non-renormalizability of the stochastic molecular dynamics
We now address the question whether the ultraviolet singularities of the autocorrelation functions can be canceled by the addition of local counterterms to the evolution equation (2.5).

Parameter renormalization
Evidently, the list of counterterms must include those corresponding to the usual parameter and field renormalization that is required for the renormalization of the ordinary correlation functions. In the minimal subtraction scheme, the bare coupling and mass are related to the renormalized parameters g and m through eqs. (5.1) and (5.2), where M denotes the normalization mass. To one-loop order, the fundamental field does not need to be renormalized in this theory. Recalling eqs. (4.2), (4.8) and (4.9), it is then immediately clear that the parameter renormalization cancels the singularities of the two- and four-point autocorrelation functions which derive from the poles (3.8) and (3.16) of the integrals I_1 and J_0.

Non-renormalizability of the four-point function
The four-point autocorrelation function has further singularities proportional to the divergent part (3.18) of the integral J_1. As explained in sect. 4, the ordinary four-point correlation function does not receive any contributions from this integral (and is therefore finite after the parameter renormalization), but the terms proportional to J_1 do contribute to the autocorrelation function at non-zero time separations.
The residue of the pole in eq. (3.18) is the Fourier transform of a distribution supported on the light cone t = |x|. Both diagrams 2 and 3 thus have a non-local singularity that cannot be canceled by including local counterterms in the stochastic molecular dynamics. The latter is therefore not renormalizable.
The presence of the singularity (5.3) can be understood by noting that the integrand of the integral has a non-integrable singularity in D = 4 dimensions proportional to (t − |x|)^{−1} (see subsect. A.3). Such light-cone singularities are a characteristic feature of Green functions of hyperbolic wave equations, and the non-renormalizability of the stochastic molecular dynamics is thus seen to be related to its hyperbolic nature.

Fig. 3. The HMC algorithm moves the fundamental field φ through field space along a piecewise smooth curve. In the smooth segments of the curve, the field is evolved from time t = 0 to some time t = τ according to the molecular-dynamics equations, starting from the current field φ and Gaussian random values for its momentum π.
In ordinary field theory, one-loop integrals do not have non-local ultraviolet singularities, because they can be Wick rotated to Euclidean space where the propagators are singular at coinciding points only. The spectral condition and the locality of the theory guarantee that no singularities stand in the way of the Wick rotation [15]. In the case of the diagrams 2 and 3, however, the integrands have poles in all quadrants of the complex frequency plane and the integrals (3.9) and (3.10) consequently cannot be Wick rotated without generating additional terms.

Implications for the HMC algorithm
In practice, the HMC algorithm involves a numerical integration of the (ordinary) molecular-dynamics equations and acceptance-rejection steps to correct for the integration errors. For simplicity the integration is assumed to be exact in this section. No acceptance-rejection steps are then required and whether one uses the first-or the second-order form of the molecular-dynamics equations makes no difference.
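For reference, the practical setup described above (numerical leapfrog integration of the molecular-dynamics equations followed by an acceptance-rejection step) can be sketched for a 1-d lattice φ^4 toy model; the model and all parameters are our illustrative choices. The identity ⟨exp(−ΔH)⟩ = 1, which holds for any reversible, area-preserving integrator, serves as a correctness check:

```python
import numpy as np

rng = np.random.default_rng(5)
L, m2, g0 = 32, 1.0, 0.5
tau, nmd, ntraj, ntherm = 1.0, 10, 2000, 500
dt = tau / nmd

def S(phi):
    return np.sum(0.5 * (np.roll(phi, -1) - phi)**2
                  + 0.5 * m2 * phi**2 + g0 / 24.0 * phi**4)

def dS(phi):
    lap = np.roll(phi, 1) + np.roll(phi, -1) - 2.0 * phi
    return -lap + m2 * phi + g0 / 6.0 * phi**3

phi = np.zeros(L)
acc, expdH = 0, []
for n in range(ntraj):
    pi = rng.standard_normal(L)                    # Gaussian random momentum
    H0 = 0.5 * np.sum(pi**2) + S(phi)
    q, p = phi.copy(), pi - 0.5 * dt * dS(phi)     # leapfrog: initial half-step
    for k in range(nmd):
        q += dt * p
        p -= (dt if k < nmd - 1 else 0.5 * dt) * dS(q)
    dH = 0.5 * np.sum(p**2) + S(q) - H0
    if n >= ntherm:
        expdH.append(np.exp(-dH))
    if rng.random() < np.exp(-dH):                 # acceptance-rejection step
        phi = q
        acc += 1

rate = acc / ntraj
print(rate, np.mean(expdH))
```

With exact integration, as assumed in this section, ΔH would vanish identically and every trajectory would be accepted.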
The molecular-dynamics trajectories generated by the algorithm are smooth segments of a continuous curve in field space (see fig. 3). Along the trajectories, the n-point autocorrelation functions in the time-momentum representation,

Â_n(t_1, p_1; . . . ; t_n, p_n) = ⟨φ̃(t_1, p_1) . . . φ̃(t_n, p_n)⟩,   0 ≤ t_1, . . . , t_n ≤ τ,   (5.5)

may be defined, where the bracket ⟨. . .⟩ stands for the average over all trajectories in an infinitely long simulation. The autocorrelation functions (5.5) only describe the dynamical properties of the algorithm in the specified range of times, but the discussion in the following paragraphs shows that already these correlation functions are not renormalizable.
The average over trajectories in eq. (5.5) amounts to taking the average over the initial values of the field φ and its momentum π = ∂_t φ. Since these are distributed according to the equilibrium distribution (a Gaussian in the case of the momentum), the average coincides with the ordinary expectation value. In perturbation theory, the correlation functions can therefore be calculated by solving the (non-stochastic) molecular-dynamics equations in the range 0 ≤ t ≤ τ with prescribed initial data at t = 0 and by computing the expectation value of the product φ̃(t_1, p_1) . . . φ̃(t_n, p_n) using the standard Feynman rules for the correlation functions of the initial data.
In the case of the stochastic molecular dynamics, the computation of the autocorrelation functions in the time-momentum representation can be organized in the same way. A notable difference is that the contractions of the noise field give rise to additional diagrams, but since all these diagrams disappear in the limit μ_0 → 0, it is clear that the autocorrelation functions (5.5) are given by

Â_n(t_1, p_1; . . . ; t_n, p_n) = lim_{μ_0 ↓ 0} ∫_{ω_1,...,ω_n} e^{−i(ω_1 t_1 + . . . + ω_n t_n)} Ã_n(ω_1, p_1; . . . ; ω_n, p_n),   (5.6)

where the autocorrelation functions on the right are those discussed in the previous sections. Note that the frequency integrals must be performed before μ_0 is taken to zero, as otherwise one may run into infrared-singular intermediate expressions.
In view of its relation to the stochastic molecular dynamics, as expressed through eq. (5.6), and since the distribution (5.3) remains non-local at µ 0 = 0, we are thus led to conclude that also the HMC algorithm is not renormalizable.

The Langevin limit
As already mentioned in subsect. 2.1, the stochastic molecular-dynamics equation (2.5) is equivalent to the Langevin equation in the limit μ_0 → ∞ up to a rescaling of the simulation time †. The associated n-point autocorrelation functions are known to be renormalizable to all orders of perturbation theory [1]. It may be instructive to see how exactly the renormalizability of the autocorrelation functions gets restored at one-loop order when μ_0 is sent to infinity. To this end, first note that the Green function assumes the expected form (5.8), which is smooth in position space except for a singularity at the origin. At one-loop order, the limit of the two-point function and of the renormalizable parts of the four-point function is then easily determined starting from the identities (4.2), (4.8) and (4.9). These remain valid in the limit and only the tree-level autocorrelation functions are replaced by the corresponding expressions involving the propagator and the Green function (5.8).

† In lattice field theory, the Langevin limit can also be reached together with the continuum limit by setting μ_0 to some fixed value in units of the lattice spacing.
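The approach to the Langevin limit can be illustrated numerically. Assuming the damped-oscillator form K(t, p) = θ(t) e^{−μ_0 t} sin(ǫ_p t)/ǫ_p of the Green function in the time-momentum representation (our reconstruction, consistent with the properties quoted in appendix A), one checks that 2μ_0 K(2μ_0 s, p) approaches the heat-kernel Green function exp{−(p^2 + m_0^2) s} when μ_0 is large:

```python
import numpy as np

def K(t, p2, m2, mu0):
    # assumed time-momentum Green function of D = d_t^2 + 2 mu0 d_t - Lap + m0^2
    M2 = p2 + m2
    if mu0**2 > M2:                      # overdamped branch (eps_p imaginary)
        w = np.sqrt(mu0**2 - M2)
        return np.exp(-mu0 * t) * np.sinh(w * t) / w
    w = np.sqrt(M2 - mu0**2)
    return np.exp(-mu0 * t) * np.sin(w * t) / w

mu0, p2, m2 = 10.0, 1.0, 1.0
for s in (0.3, 0.5, 1.0):
    langevin = np.exp(-(p2 + m2) * s)    # heat-kernel Green function
    kramers = 2.0 * mu0 * K(2.0 * mu0 * s, p2, m2, mu0)
    print(s, kramers, langevin)
```

The residual deviation is of order (p^2 + m_0^2)/μ_0^2, in line with the term that goes to zero at large μ_0 in the second-order equation.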
The contribution to the four-point function proportional to the integral J 1 (which contains the non-removable ultraviolet singularity in the case of the stochastic molecular dynamics) however changes its character, because the integral turns out to be absolutely convergent. In the Langevin limit, all ultraviolet singularities at one-loop order are thus canceled by the parameter renormalization, as is expected to be the case in this theory [1].

Concluding remarks
The HMC algorithm is currently the preferred simulation algorithm in lattice QCD.
In the past two decades, various improvements were included in this algorithm, many of them with the aim of reducing the computational effort required at small sea-quark masses (see ref. [16] for a recent review). Its scaling behaviour with respect to the lattice spacing has not received as much attention so far, but rapidly becomes an important issue when the continuum limit is approached.
While the dynamical properties of the HMC algorithm are well understood in free field theory [11], the situation in the presence of interactions tends to be rather more complicated. In particular, certain lattice artifacts (topology-changing tunneling transitions, for example, or unphysical critical points in the space of bare couplings) can cause large autocorrelations. The results obtained in this paper show that even in the absence of such effects there is no reason to expect that the HMC algorithm scales essentially as in a theory of free fields. Evidently, the non-renormalizability of the algorithm does not imply that it is invalid or unusable close to the continuum limit, but without further insight its scaling behaviour is unpredictable in interacting theories.
The HMC algorithm and the stochastic molecular dynamics may conceivably fall into the universality class of the Langevin equation. Independently of whether this is the case or not, it may be worth looking for renormalizable algorithms where the simulation time has scaling dimension less than 2. Eventually such algorithms might turn out to be faster than the HMC algorithm and they would have the advantage that their efficiency at small lattice spacings is predictable.
Appendix A. Properties of the Green function K(t, x)

A.1 Definition
The Fourier transform (2.9) of the Green function is a smooth function of ω and p that satisfies the bound (A.1) for some (mass-dependent) constant C. K̃(ω, p) is therefore a tempered distribution and so is the Green function in position space. However, the Fourier integral (2.8) is not absolutely convergent and is to be understood in the sense of distributions. All these comments also apply to the propagator (3.3) of the basic field. As a distribution, K(t, x) satisfies the wave equation

D K(t, x) = δ(t) δ^{(D)}(x),   (A.2)

where D is given by eq. (2.7). Since the polynomial representing D in frequency-momentum space is nowhere equal to zero, the Green function is the unique tempered distribution that solves eq. (A.2). In particular, one does not have the freedom of specifying retarded or advanced boundary conditions.

A.2 Time-momentum representation
Using the residue theorem, it is possible to work out the Green function in the time-momentum representation, with the result

K(t, p) = θ(t) e^{−μ_0 t} sin(ǫ_p t)/ǫ_p,   ǫ_p = (p^2 + m_0^2 − μ_0^2)^{1/2}.

Note that ǫ_p is imaginary at small momenta if μ_0 > m_0, but the Green function always decays exponentially in the time direction.
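The stated properties are consistent with the damped-oscillator form K(t, p) = θ(t) e^{−μ_0 t} sin(ǫ_p t)/ǫ_p, which we assume here. The following check confirms numerically that its Fourier transform in t reproduces 1/(p^2 + m_0^2 − ω^2 − 2iμ_0 ω), the frequency-momentum inverse of the operator D in our sign conventions (also an assumption):

```python
import numpy as np

m0sq, mu0, p2 = 1.0, 0.5, 1.0
eps_p = np.sqrt(p2 + m0sq - mu0**2)

h = 1.0e-3
t = np.arange(0.0, 60.0, h)
K = np.exp(-mu0 * t) * np.sin(eps_p * t) / eps_p   # assumed form of K(t, p), t >= 0

omega = 0.7
num = np.sum(np.exp(1j * omega * t) * K) * h        # Fourier transform over t >= 0
exact = 1.0 / (p2 + m0sq - omega**2 - 2j * mu0 * omega)
print(num, exact)
```

The exponential decay e^{−μ_0 t} makes the truncation of the t integral at t = 60 entirely negligible here.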
In the case of the field propagator, the identities (A.5) and (A.6) may be used to derive its time-momentum representation (A.7). An immediate consequence of these results is that K(t, x) and G(t, x) become smooth functions of t at all t ≠ 0 when smeared with a test function in x. Moreover, their behaviour as t ↓ 0 can be worked out explicitly.

A.3 Explicit expression for K(t, x) in four dimensions
In dimension D = 4, the Green function in position space, K(t, x), can be calculated analytically. While the expression is of some interest in the context of our discussion of the non-renormalizability of the stochastic molecular dynamics in sect. 5, its exact form is not needed and the proof of the results quoted below is therefore omitted.
First note that the product e^{tμ_0} K(t, x) is invariant under Lorentz transformations in 5 dimensions. As a consequence, and since the Green function vanishes at negative times, its support must be contained in the forward light cone t ≥ |x|. Inside the light cone, i.e. at all t > |x|, an explicit formula for the Green function can be given.

Appendix B. Calculation of the integral J 1

The integral Ĵ_1 has the same divergent part as J_1, because the integrand of the difference Ĵ_1 − J_1 falls off like (q^2)^{−3} at large q and is therefore absolutely integrable in four dimensions. In view of the reality properties of the integrand, it suffices to evaluate the expression (B.5). In eq. (B.5), the phases of the expressions in the curly brackets range from −π/2 to +π/2 and it is understood that the corresponding branch of their powers is taken. The integral (B.5) is absolutely convergent for D < 4 and a holomorphic function of z in the half-plane Re z > 0. Its values along the real axis z > 0 can be computed by first substituting u → u/z and subsequently working out the integral; the result quoted in the text is thus obtained, where the branch of the square root with positive real part is to be taken. This formula holds for both positive and negative ω.